
Service Bus Batching and Pre-Fetch

Azure Service Bus is often used as a way to link services (as the name suggests). This typically means that the throughput is large, but not so large that anyone worries about the speed. However, in this post, I’m going to cover some techniques to speed up service bus messaging.

A quick disclaimer here, though: it might be that if you’re reading this then you’re not using the best tool for the job, and perhaps something more lightweight might be a better fit.

Essentially, there are two inbuilt methods in Service Bus to do this: Batching, and Prefetch; let’s look at Batching first.

Batching

The principle behind batching is very simple: instead of making multiple calls to Azure Service Bus with multiple messages, you make one call with all the messages that you wish to send.

Let’s see some code that sends 1000 messages:

var stopwatch = new Stopwatch();
stopwatch.Start();

var queueClient = new QueueClient(connectionString, QUEUE_NAME);

for (int i = 0; i < 1000; i++)
{
    string messageBody = $"{DateTime.Now}: {messageText} ({Guid.NewGuid()})";
    var message = new Message(Encoding.UTF8.GetBytes(messageBody));

    await queueClient.SendAsync(message);
}
await queueClient.CloseAsync();

stopwatch.Stop();
Console.WriteLine($"Send messages took {stopwatch.ElapsedMilliseconds}");

The timings in this post are not a benchmark, just whatever my tests recorded at the time; your mileage may vary a lot, although I would say they are a fair rough estimate. As you’ll see later on, the differences are very stark.
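Since every snippet in this post wraps its work in the same Stopwatch calls, that pattern could be pulled into a small helper. This is only a sketch: `Timing` and `MeasureAsync` are hypothetical names of my own, and `Task.Delay` stands in for the real send or receive.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Timing
{
    // Runs the supplied asynchronous operation and returns how long it took, in milliseconds.
    public static async Task<long> MeasureAsync(Func<Task> operation)
    {
        var stopwatch = Stopwatch.StartNew();
        await operation();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        // Task.Delay is a placeholder for the Service Bus call being timed.
        long elapsed = await MeasureAsync(() => Task.Delay(50));
        Console.WriteLine($"Operation took {elapsed}ms");
    }
}
```

As with the figures quoted here, a single run like this gives a rough indication rather than a benchmark.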

In my tests, sending 1000 messages took around 30 seconds (which is good enough for most scenarios, as they are unlikely to be sent in a batch such as this). Let’s see how we can batch the messages to speed things up:

var stopwatch = new Stopwatch();
stopwatch.Start();

var queueClient = new QueueClient(connectionString, QUEUE_NAME);
var messages = new List<Message>();

for (int i = 0; i < 1000; i++)
{
    string messageBody = $"{DateTime.Now}: {messageText} ({Guid.NewGuid()})";
    var message = new Message(Encoding.UTF8.GetBytes(messageBody));                

    messages.Add(message);
}
await queueClient.SendAsync(messages);
await queueClient.CloseAsync();

stopwatch.Stop();
Console.WriteLine($"Send messages took {stopwatch.ElapsedMilliseconds}");

As you can see, there’s not a huge amount of change to the code, yet this came in at under a second. It’s worth noting that in the first example, if the code crashes halfway through the send, half the messages would be sent, whereas with the batch, it’s all or nothing.
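Because a batched send goes out as a single call, Service Bus applies its maximum message size to the batch as a whole, so a very large list may need to be split across several calls. Here is a minimal sketch of just that splitting step. `Batcher` and `Chunk` are hypothetical names, the chunk size of 100 is an arbitrary assumption rather than a recommended value, and plain strings stand in for `Message` objects so that it runs without the SDK.

```csharp
using System;
using System.Collections.Generic;

public static class Batcher
{
    // Splits 'items' into consecutive chunks of at most 'chunkSize' elements.
    public static List<List<T>> Chunk<T>(IReadOnlyList<T> items, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunks = new List<List<T>>();
        for (int start = 0; start < items.Count; start += chunkSize)
        {
            int length = Math.Min(chunkSize, items.Count - start);
            var chunk = new List<T>(length);
            for (int i = start; i < start + length; i++)
            {
                chunk.Add(items[i]);
            }
            chunks.Add(chunk);
        }
        return chunks;
    }

    public static void Main()
    {
        var bodies = new List<string>();
        for (int i = 0; i < 1000; i++) bodies.Add($"Message {i}");

        // With the real client, each chunk would be handed to a single SendAsync call.
        foreach (var chunk in Chunk(bodies, 100))
        {
            Console.WriteLine($"Would send {chunk.Count} messages in one call");
        }
    }
}
```

Bear in mind that each chunk is then all or nothing on its own, so a crash part-way through leaves the earlier chunks sent.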

On the receive side, we have a similar thing.

Batch Receive

Here, we can receive messages in chunks, instead of one at a time. Again, let’s see how we might receive 1000 messages (there are multiple ways to do this):

var stopwatch = new Stopwatch();
stopwatch.Start();

var messageReceiver = new MessageReceiver(connectionString, QUEUE_NAME);
for (int i = 0; i < 1000; i++)
{
    var message = await messageReceiver.ReceiveAsync();
    string messageBody = Encoding.UTF8.GetString(message.Body);
    Console.WriteLine($"Message received: {messageBody}");

    await messageReceiver.CompleteAsync(message.SystemProperties.LockToken);
}

stopwatch.Stop();
Console.WriteLine($"Receive messages took {stopwatch.ElapsedMilliseconds}");

The execution here takes around 127 seconds (over 2 minutes) in my tests.

The same is true for batch receipt as for batch send; with a slight caveat:

var stopwatch = new Stopwatch();
stopwatch.Start();
int count = 1000;
int remainingCount = count;

// Create the receiver once and reuse it for every batch
var messageReceiver = new MessageReceiver(connectionString, QUEUE_NAME);

while (remainingCount > 0)
{
    var messages = await messageReceiver.ReceiveAsync(remainingCount);
    if (messages == null) break; // nothing arrived before the receive timed out

    foreach (var message in messages)
    {
        string messageBody = Encoding.UTF8.GetString(message.Body);
        Console.WriteLine($"Message received: {messageBody}");
        remainingCount--;
    }

    // Complete the whole batch with a single call
    await messageReceiver.CompleteAsync(messages.Select(a => a.SystemProperties.LockToken));
}

stopwatch.Stop();
Console.WriteLine($"Receive messages took {stopwatch.ElapsedMilliseconds}");
Console.WriteLine($"Remaining count: {remainingCount}");

Note that the CompleteAsync can also be called in batch.

You may be wondering what the while loop is for: batch receive isn’t guaranteed to return the exact number of messages that you request, so we keep asking for the remainder until we have them all. Even so, we still brought the receive time down to around 10 seconds.
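The behaviour of that loop is easier to see against a stand-in receiver than against a real queue. In this sketch, `FakeReceiver` is entirely hypothetical: it returns at most 250 items per call, however many are requested, and null once it is empty. The 250 is an arbitrary figure chosen to force several round trips, not a Service Bus limit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a receiver: hands back at most 'maxPerCall'
// items per call, however many are requested, and null once it is empty.
public class FakeReceiver
{
    private readonly Queue<string> _queue = new Queue<string>();
    private readonly int _maxPerCall;

    public FakeReceiver(int messageCount, int maxPerCall)
    {
        _maxPerCall = maxPerCall;
        for (int i = 0; i < messageCount; i++) _queue.Enqueue($"Message {i}");
    }

    public List<string> Receive(int maxMessageCount)
    {
        if (_queue.Count == 0) return null;

        int take = Math.Min(Math.Min(maxMessageCount, _maxPerCall), _queue.Count);
        var batch = new List<string>(take);
        for (int i = 0; i < take; i++) batch.Add(_queue.Dequeue());
        return batch;
    }
}

public static class DrainDemo
{
    // Keeps asking for whatever is still outstanding until 'count' items
    // have arrived or the receiver has nothing left; returns the number of
    // calls that actually returned messages.
    public static int Drain(FakeReceiver receiver, int count)
    {
        int remaining = count;
        int calls = 0;
        while (remaining > 0)
        {
            var batch = receiver.Receive(remaining);
            if (batch == null) break;
            calls++;
            remaining -= batch.Count;
        }
        return calls;
    }

    public static void Main()
    {
        int calls = Drain(new FakeReceiver(1000, 250), 1000);
        Console.WriteLine($"Received 1000 messages in {calls} calls"); // 4 calls
    }
}
```

The null check matters as much as the loop: without it, asking for more messages than the queue holds would never terminate cleanly.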

A Note on Batching and Timeouts

It’s worth bearing in mind that when you retrieve a batch of messages, you’re doing just that: retrieving them. In a PeekLock scenario, they are now locked and, if you don’t complete or abandon them, the locks will time out like those on any other message. If you have a large number of messages, you may also need to extend the timeout on the receive call itself, which controls how long the client waits for messages to arrive; for example:

var messages = await messageReceiver.ReceiveAsync(remainingCount, TimeSpan.FromSeconds(20));

In the next section, we’ll discuss the second technique, of allowing the service bus to “run ahead” and get messages before you actually request them.

Prefetch

Prefetch speeds up the retrieval of messages by getting Azure Service Bus to return messages ahead of them being needed. This presents a problem (similar to receiving in batch), which is that the system is actually retrieving messages on your behalf before you ask for them. In this example, we’ve been using PeekLock; that is, the message is left on the queue until we explicitly complete it. However, once a message has been received, it’s locked. That means that with code like the following, we can easily trip ourselves up.

int count = 1000;
var stopwatch = new Stopwatch();
stopwatch.Start();

var messageReceiver = new MessageReceiver(connectionString, QUEUE_NAME);
messageReceiver.PrefetchCount = prefetchCount;
for (int i = 0; i < count; i++)
{
    var message = await messageReceiver.ReceiveAsync(TimeSpan.FromSeconds(60));
    string messageBody = Encoding.UTF8.GetString(message.Body);
    Console.WriteLine($"Message received: {messageBody}");

    await messageReceiver.CompleteAsync(message.SystemProperties.LockToken);
}

stopwatch.Stop();
Console.WriteLine($"Receive messages took {stopwatch.ElapsedMilliseconds}");

Note the extended timeout on the Receive allows for the prefetched messages to complete.

In my tests, Prefetch was slightly quicker than processing the messages one at a time, but much slower than a batch. The main reason is that completing each message individually takes the bulk of the time.

Remember that with Prefetch, if you’re using PeekLock, once you’ve prefetched a message, the timeout on the lock starts. This means that if your lock is for 5 seconds, and you’ve prefetched 500 records, you need to be sure that you’ll get around to them in time.
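A rough sanity check on that arithmetic can be sketched as follows. `PrefetchBudget` and `MaxSafePrefetch` are hypothetical names, and the calculation makes two simplifying assumptions: messages are processed one after another at a steady rate, and every lock starts at the moment of prefetch. The 100ms per message is an illustrative figure, not a measurement.

```csharp
using System;

public static class PrefetchBudget
{
    // Largest number of messages that can be processed one after another
    // before the lock on the last of them expires, assuming every lock
    // starts when the messages are prefetched.
    public static int MaxSafePrefetch(TimeSpan lockDuration, TimeSpan perMessage)
    {
        if (perMessage <= TimeSpan.Zero) throw new ArgumentOutOfRangeException(nameof(perMessage));
        return (int)(lockDuration.Ticks / perMessage.Ticks);
    }

    public static void Main()
    {
        var lockDuration = TimeSpan.FromSeconds(5);
        var perMessage = TimeSpan.FromMilliseconds(100);

        int safe = MaxSafePrefetch(lockDuration, perMessage);
        Console.WriteLine($"Safe prefetch count: {safe}"); // 50, well short of 500
    }
}
```

With those numbers, a prefetch of 500 would leave most of the messages with expired locks by the time the code reached them.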

ReceiveAndDelete

Whilst prefetched messages timing out under PeekLock may be bad, ReceiveAndDelete is worse: the messages are taken off the queue as soon as they are prefetched, which means that you can consume messages without ever actually seeing them!

Prefetch with Batch

Here, we can try to use the prefetch and batch combined:

int count = 1000;
var stopwatch = new Stopwatch();
stopwatch.Start();
int remainingCount = count;

// Create the receiver once, so that prefetched messages stay with the receiver that completes them
var messageReceiver = new MessageReceiver(connectionString, QUEUE_NAME);
messageReceiver.PrefetchCount = prefetchCount;

while (remainingCount > 0)
{
    var messages = await messageReceiver.ReceiveAsync(remainingCount);
    if (messages == null) break; // nothing arrived before the receive timed out

    foreach (var message in messages)
    {
        string messageBody = Encoding.UTF8.GetString(message.Body);
        Console.WriteLine($"Message received: {messageBody}");
        remainingCount--;
    }

    await messageReceiver.CompleteAsync(messages.Select(a => a.SystemProperties.LockToken));
}

stopwatch.Stop();
Console.WriteLine($"Receive messages took {stopwatch.ElapsedMilliseconds}");
Console.WriteLine($"Remaining count: {remainingCount}");

In fact, in my tests, the timing for this was around the same as a batch receipt.

There may be some advantages with much higher numbers, but generally, combining the two in this manner doesn’t seem to provide much benefit.


WPF Performance – TextBlock

WPF typically doesn’t have many performance issues and, where it does, this can usually be fixed by virtualisation. Having said that, even if you never need to use this, it’s useful to know that you can eke that last ounce of performance out of the system.

Here’s a basic program that contains a TextBlock:

<Window x:Class="TextBlockTest.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:local="clr-namespace:TextBlockTest"
        mc:Ignorable="d"
        Title="MainWindow" Height="350" Width="525"
        x:Name="MainWindowView">
    <Grid>
        <ScrollViewer>
            <ItemsControl ItemsSource="{Binding BigList, ElementName=MainWindowView}" Margin="0,-1,0,1">
                <ItemsControl.ItemTemplate>
                    <DataTemplate>
                        <TextBlock Text="{Binding}"/>
                    </DataTemplate>
                </ItemsControl.ItemTemplate>
            </ItemsControl>
        </ScrollViewer>
    </Grid>
</Window>

Code behind:

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Data;
using System.Windows.Documents;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Media.Imaging;
using System.Windows.Navigation;
using System.Windows.Shapes;

namespace TextBlockTest
{
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window
    {
        public ObservableCollection<string> BigList { get; set; }

        public MainWindow()
        {
            BigList = new ObservableCollection<string>();
            for (int i = 0; i <= 10000; i++)
            {
                BigList.Add($"Item {i}");
            }

            InitializeComponent();
        }
    }
}

Let’s, for a minute, imagine this is slow, and profile it.

The layout is taking most of the time. Remember that each control needs to be created, and remember that the TextBlock does slightly more than just display text.

The whole panel took 3.46s. Not terrible performance, but can it be improved? The answer is: yes, it can! Very, very slightly.

Let’s create a Custom Control:

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Data;
using System.Windows.Documents;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Media.Imaging;
using System.Windows.Navigation;
using System.Windows.Shapes;

namespace FastTextBlock
{
   
    public class MyTextBlockTest : Control
    {
        private FormattedText _formattedText;

        static MyTextBlockTest()
        {
            //DefaultStyleKeyProperty.OverrideMetadata(typeof(MyTextBlockTest), new FrameworkPropertyMetadata(typeof(MyTextBlockTest)));
        }

        public static readonly DependencyProperty TextProperty =
             DependencyProperty.Register(
                 "Text", 
                 typeof(string),
                 typeof(MyTextBlockTest), 
                 new FrameworkPropertyMetadata(string.Empty, FrameworkPropertyMetadataOptions.AffectsMeasure,
                    (o, e) => ((MyTextBlockTest)o).TextPropertyChanged((string)e.NewValue)));

        private void TextPropertyChanged(string text)
        {
            var typeface = new Typeface(
                new FontFamily("Times New Roman"),
                FontStyles.Normal, FontWeights.Normal, FontStretches.Normal);

            _formattedText = new FormattedText(
                text, CultureInfo.CurrentCulture,
                FlowDirection.LeftToRight, typeface, 15, Brushes.Black);
        }


        public string Text
        {
            get { return (string)GetValue(TextProperty); }
            set { SetValue(TextProperty, value); }
        }

        protected override void OnRender(DrawingContext drawingContext)
        {
            if (_formattedText != null)
            {
                drawingContext.DrawText(_formattedText, new Point());
            }
        }

        protected override Size MeasureOverride(Size constraint)
        {
            //return base.MeasureOverride(constraint);

            return _formattedText != null
            ? new Size(_formattedText.Width, _formattedText.Height)
            : new Size();
        }
    }
}

Here’s the new XAML:

    <Grid>
        <ScrollViewer>
            <ItemsControl ItemsSource="{Binding BigList, ElementName=MainWindowView}" Margin="0,-1,0,1">
                <ItemsControl.ItemTemplate>
                    <DataTemplate>
                        <!--<TextBlock Text="{Binding}"/>-->
                        <controls:MyTextBlockTest Text="{Binding}" />
                    </DataTemplate>
                </ItemsControl.ItemTemplate>
            </ItemsControl>
        </ScrollViewer>
    </Grid>

This shaves around 10ms off the time.

Even more time can be shaved by moving up an element (that is, inheriting from a class further up the hierarchy than `Control`). In fact, `Control` inherits from `FrameworkElement`:

public class MyTextBlockTest : FrameworkElement

This shaves another 10ms off.

Conclusion

Clearly, this isn’t a huge performance boost, and in 99% of use cases, this would be massively premature optimisation. However, the time that this really comes into its own is where you have a compound control (a control that consists of other controls), and you have lots of them (hundreds). See my next post for details.


WPF Performance Debugging

WPF is an interesting (and currently still active) framework. How long that will continue depends, IMHO, largely on how well MS can bring UWP XAML to a state where people are happy to switch.

I recently investigated a performance problem in one of our WPF screens. After running a few analysis tools, including Prefix (which I’m finding increasingly my first port of call for this kind of thing), I came to the conclusion that the performance problem was with the screen itself.

Performance Profiler

You can reach this via:

Analyze -> Performance Profiler

You can actually run this against a compiled exe, a store app, or even a website. For my purposes, I ran it against the screen that I’d identified as being slow.

The bar graph in the profiler output clearly marks out the points at which the app suddenly spikes, and the legend tells me that it’s caused by the layout. With this information, you can highlight the relevant area.

Once I did this, I could instantly see that a very large number of controls were being created.

So, the problem here was that the client was going to the service and bringing back a huge volume of data, and as soon as this was bound to the screen, WPF was attempting to render the layout for thousands of controls immediately.

The Solution

So, the solution to this issue is to virtualise the ItemsControl. Whilst the standard ItemsControl will attempt to render the layout for every possible control bound to the underlying data, virtualising it allows it to render only those that are actually displayed on the screen. Here’s how you might achieve that:

<ItemsControl Grid.Row="1" ItemsSource="{Binding Path=MyObject.Data}"
              Margin="10" BorderBrush="Black" BorderThickness="2"
              VirtualizingPanel.VirtualizationMode="Recycling"
              VirtualizingPanel.IsVirtualizing="True"
              ScrollViewer.CanContentScroll="True">
    <ItemsControl.Template>
        <ControlTemplate>
            <ScrollViewer HorizontalScrollBarVisibility="Disabled" VerticalScrollBarVisibility="Auto">
                <ItemsPresenter/>
            </ScrollViewer>
        </ControlTemplate>
    </ItemsControl.Template>
    <ItemsControl.ItemsPanel>
        <ItemsPanelTemplate>
            <VirtualizingStackPanel Orientation="Vertical" Margin="5" IsItemsHost="True" />
        </ItemsPanelTemplate>
    </ItemsControl.ItemsPanel>
</ItemsControl>

Re-running the screen with the analyser reveals that we have now alleviated the spike in activity.
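The saving comes from how few items are on screen at any one time. The sketch below is not WPF’s actual implementation, just the kind of arithmetic a virtualising panel effectively performs. `VirtualisationSketch` and `VisibleRange` are hypothetical names, it assumes every row has the same fixed height, and the figures (10,000 rows of 20 pixels in a 350-pixel viewport) are purely illustrative.

```csharp
using System;

public static class VirtualisationSketch
{
    // Returns the first and last item indices that intersect the viewport,
    // assuming every item has the same fixed height.
    public static (int First, int Last) VisibleRange(
        int itemCount, double itemHeight, double scrollOffset, double viewportHeight)
    {
        if (itemCount == 0) return (0, -1);

        int first = (int)Math.Floor(scrollOffset / itemHeight);
        int last = (int)Math.Ceiling((scrollOffset + viewportHeight) / itemHeight) - 1;

        // Clamp both ends to the valid index range.
        first = Math.Max(0, Math.Min(first, itemCount - 1));
        last = Math.Max(first, Math.Min(last, itemCount - 1));
        return (first, last);
    }

    public static void Main()
    {
        var (first, last) = VisibleRange(10000, 20, 400, 350);
        Console.WriteLine($"Realise items {first} to {last} ({last - first + 1} of 10000)");
        // Realise items 20 to 37 (18 of 10000)
    }
}
```

Only a handful of controls need to exist at once; the rest are created (or recycled) as the user scrolls.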

Summary

Obviously, there is a trade-off here; if you’re dealing with a screen that will be used extensively and change very infrequently, then you might decide it’s better to have the upfront hit (as the work still needs to be done). However, if you’re loading so much data that you’re in this situation, I would have thought it very unlikely that the end-user is ever going to want to actually see it all!

It’s also worth acknowledging here that this solution doesn’t actually speed anything up, just defers it. I’m not saying that’s a good or bad thing, but it is definitely a thing.
