Sending messages at scale to Service Bus from Durable Functions - C#

I have a scenario where one activity function retrieves a set of records, anywhere from 1,000 to a million, and stores them in an object. The next activity function then uses this object to send messages in parallel to Service Bus.
Currently I use a for loop over this object to send each record to Service Bus. Is there a better pattern in which the object's content (wherever it is stored) is drained to Service Bus and the function scales out automatically, without restricting the processing to a for loop?
I have used a for loop in the orchestrator function to call activity functions for the records in the object.
I looked at the scaling of the activity function: for a set of 18,000 records it scaled out to 15 instances and processed the whole set in 4 minutes.
The function currently runs on the Consumption plan. I checked that only this function app uses the plan and that it is not shared.
Another service listens to the topic the messages are sent to and reads them.
The instance counts for both the orchestrator and activity functions are the defaults.
for(int i=0;i<number_messages;i++)
taskList[i] =
await Task.WhenAll(taskList);
catch (AggregateException ae)
The messages should be sent to Service Bus quickly by scaling out the activity functions appropriately.

I would suggest sending the messages in batches.
The Azure Service Bus client supports sending messages in batches (the SendBatch and SendBatchAsync methods of QueueClient and TopicClient). However, the size of a single batch must stay below 256 KB; otherwise the whole batch is rejected.
We will start with a simple use case: the size of each message is known to us and is given by a hypothetical Func<T, long> getSize function. Here is a helpful extension method that splits an arbitrary collection into chunks based on a metric function and a maximum chunk size:
public static List<List<T>> ChunkBy<T>(this IEnumerable<T> source, Func<T, long> metric, long maxChunkSize)
{
    return source
        .Aggregate(
            new { Sum = 0L, Current = (List<T>)null, Result = new List<List<T>>() },
            (agg, item) =>
            {
                var value = metric(item);
                if (agg.Current == null || agg.Sum + value > maxChunkSize)
                {
                    // Item does not fit: start a new chunk with it
                    var current = new List<T> { item };
                    agg.Result.Add(current);
                    return new { Sum = value, Current = current, agg.Result };
                }
                // Item fits: append it to the current chunk
                agg.Current.Add(item);
                return new { Sum = agg.Sum + value, agg.Current, agg.Result };
            })
        .Result;
}
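To check the splitting rule on concrete data, here is a minimal self-contained sketch. It uses a plain loop instead of Aggregate, and the names Chunking and ChunkBySize are hypothetical; string length stands in for message size.

```csharp
using System;
using System.Collections.Generic;

public static class Chunking
{
    // Greedy split: start a new chunk whenever adding the next item
    // would push the running total above maxChunkSize.
    public static List<List<T>> ChunkBySize<T>(IEnumerable<T> source, Func<T, long> metric, long maxChunkSize)
    {
        var result = new List<List<T>>();
        List<T> current = null;
        long sum = 0;

        foreach (var item in source)
        {
            long value = metric(item);
            if (current == null || sum + value > maxChunkSize)
            {
                current = new List<T>();
                result.Add(current);
                sum = 0;
            }
            current.Add(item);
            sum += value;
        }
        return result;
    }
}
```

With a limit of 6, the input { "aaaa", "bbb", "cc", "ddddd" } splits into three chunks: [aaaa], [bbb, cc] and [ddddd]. Note that a single item larger than the limit still ends up in a chunk of its own, so the caller has to handle oversized items separately.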
Now, the implementation of SendBigBatchAsync is simple:
public async Task SendBigBatchAsync<T>(IEnumerable<T> messages, Func<T, long> getSize)
{
    var chunks = messages.ChunkBy(getSize, MaxServiceBusMessage);
    foreach (var chunk in chunks)
    {
        var brokeredMessages = chunk.Select(m => new BrokeredMessage(m));
        await client.SendBatchAsync(brokeredMessages);
    }
}

private const long MaxServiceBusMessage = 256000;
private readonly QueueClient client;
How do we determine the size of each message? How do we implement the getSize function?
The BrokeredMessage class exposes a Size property, so it might be tempting to rewrite our method the following way:
public async Task SendBigBatchAsync<T>(IEnumerable<T> messages)
{
    var brokeredMessages = messages.Select(m => new BrokeredMessage(m));
    var chunks = brokeredMessages.ChunkBy(bm => bm.Size, MaxServiceBusMessage);
    foreach (var chunk in chunks)
    {
        await client.SendBatchAsync(chunk);
    }
}
The last possibility that I want to consider is to allow yourself to violate the maximum batch size, then handle the exception, retry the send operation, and adjust future calculations based on the actual measured size of the failed messages. The size is known after calling SendBatch, even if the operation failed, so we can use this information.
// Sender is reused across requests
public class BatchSender
{
    private readonly QueueClient queueClient;
    private long batchSizeLimit = 262000;
    private long headerSizeEstimate = 54; // start with the smallest header possible

    public BatchSender(QueueClient queueClient)
    {
        this.queueClient = queueClient;
    }

    public async Task SendBigBatchAsync<T>(IEnumerable<T> messages)
    {
        var packets = (from m in messages
                       let bm = new BrokeredMessage(m)
                       select new { Source = m, Brokered = bm, BodySize = bm.Size }).ToList();
        var chunks = packets.ChunkBy(p => this.headerSizeEstimate + p.Brokered.Size, this.batchSizeLimit);
        foreach (var chunk in chunks)
        {
            try
            {
                await this.queueClient.SendBatchAsync(chunk.Select(p => p.Brokered));
            }
            catch (MessageSizeExceededException)
            {
                var maxHeader = packets.Max(p => p.Brokered.Size - p.BodySize);
                if (maxHeader > this.headerSizeEstimate)
                {
                    // If failed messages had bigger headers, remember this header size
                    // as max observed and use it in future calculations
                    this.headerSizeEstimate = maxHeader;
                }
                else
                {
                    // Reduce max batch size to 95% of current value
                    this.batchSizeLimit = (long)(this.batchSizeLimit * .95);
                }

                // Re-send only the failed chunk (re-sending all packets would
                // duplicate chunks that were already sent successfully)
                await this.SendBigBatchAsync(chunk.Select(p => p.Source));
            }
        }
    }
}
You can use this blog for further reference. Hope it helps.


Kafka: Consume partition with manual batching - Messages are being skipped

I am using Confluent Kafka .NET to create a consumer for a partitioned topic.
As Confluent Kafka .NET does not support consuming in batches, I built a function that consumes messages until the batch size is reached. The idea of this function is to build batches with messages from the same partition only, that is why I stop building the batch once I consume a result that has a different partition and return whatever number of messages I was able to consume up to that point.
Goal or Objective: I want to be able to process the messages I returned in the batch, and commit the offsets for those messages only. i.e:
Message Consumed From Partition | Offset | Stored in Batch
0 | 0 | Yes
0 | 1 | Yes
2 | 0 | No
From the table above I would like to process both messages I got from partition 0. Message from partition 2 would be ignored and (hopefully) PICKED UP LATER in another call to ConsumeBatch.
To commit I simply call the synchronous Commit function passing the offset of the latest message I processed as parameter. In this case I would pass the offset of the second message of the batch shown in the table above (Partition 0 - Offset 1).
The problem is that for some reason, when I build a batch like the one shown above, the messages I decide not to process because of validations are being ignored forever. i.e: Message 0 of partition 2 will never be picked up by the consumer again.
As you can see in the consumer configuration below, I have set both EnableAutoCommit and EnableAutoOffsetStore as false. I think this would be enough for the consumer to not do anything with the offsets and be able to pick up ignored messages in another Consume call, but it isn't. The offset is somehow increasing up to the latest consumed message for each partition, regardless of my configuration.
Can anybody shed some light on what I am missing here to achieve the desired behavior, if it is possible?
Simplified version of the function to build the batch:
public IEnumerable<ConsumeResult<string, string>> ConsumeBatch(int batchSize)
{
    List<ConsumeResult<string, string>> consumedMessages = new List<ConsumeResult<string, string>>();
    int latestPartition = -1; // The partition from where we consumed the last message
    for (int i = 0; i < batchSize; i++)
    {
        var result = _consumer.Consume(100);
        if (result == null)
            continue; // nothing arrived within the timeout
        if (latestPartition != -1 && result.Partition.Value != latestPartition)
            break; // different partition: stop building the batch
        consumedMessages.Add(result);
        latestPartition = result.Partition.Value;
    }
    return consumedMessages;
}
ConsumerConfig used to instantiate my consumer client:
_consumerConfig = new ConsumerConfig
{
    BootstrapServers = _bootstrapServers,
    EnableAutoCommit = false,
    AutoCommitIntervalMs = 0,
    GroupId = "WorkerConsumers",
    AutoOffsetReset = AutoOffsetReset.Earliest,
    EnableAutoOffsetStore = false,
};
Additional Information:
This is being tested with:
1 topic with 6 partitions and replication factor of 2
3 brokers
1 single-threaded consumer client that belongs to a consumer group
Local environment with wsl2 on Windows 10
The key was to use the Seek function to reset the partition's offset to a specific position so that the ignored message could be picked up again as part of another batch.
In the same function above:
public IEnumerable<ConsumeResult<string, string>> ConsumeBatch(int batchSize)
{
    List<ConsumeResult<string, string>> consumedMessages = new List<ConsumeResult<string, string>>();
    int latestPartition = -1; // The partition from where we consumed the last message
    for (int i = 0; i < batchSize; i++)
    {
        var result = _consumer.Consume(100);
        if (result == null)
            continue; // nothing arrived within the timeout
        if (latestPartition != -1 && result.Partition.Value != latestPartition)
        {
            // This call guarantees that the message left out of the current batch will be included in a later batch
            _consumer.Seek(result.TopicPartitionOffset); // IMPORTANT LINE!!!!!!!
            break;
        }
        consumedMessages.Add(result);
        latestPartition = result.Partition.Value;
    }
    return consumedMessages;
}
I think in general, if you want to consume a message without altering the offsets in any way (kinda peeking the topic partition), you can call Consume and then use Seek(result.TopicPartitionOffset) to set the offset of that topic partition back to where it was before consuming the message.
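The batching rule can also be looked at separately from the consumer. The sketch below is only an illustration: Polled is a hypothetical stand-in for ConsumeResult<string, string>, and BatchRule is a made-up helper. It returns the single-partition batch plus the first record from another partition, which is the record the caller would pass to Seek. It also shows the offset to commit when committing explicit offsets: Kafka stores the position of the next record to read, so it is the last processed offset plus one.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for ConsumeResult<string, string>: only what the rule needs.
public sealed class Polled
{
    public int Partition { get; }
    public long Offset { get; }
    public Polled(int partition, long offset) { Partition = partition; Offset = offset; }
}

public static class BatchRule
{
    // Builds a batch from one partition only (batchSize is assumed to be >= 1).
    // SeekBackTo is the first record of a different partition, or null if none was seen;
    // with a real consumer the caller would Seek back to that record's offset.
    public static (List<Polled> Batch, Polled SeekBackTo) Split(IEnumerable<Polled> polled, int batchSize)
    {
        var batch = new List<Polled>();
        foreach (var record in polled)
        {
            if (batch.Count > 0 && record.Partition != batch[0].Partition)
                return (batch, record);
            batch.Add(record);
            if (batch.Count == batchSize)
                break;
        }
        return (batch, null);
    }

    // The committed offset is the position of the NEXT record to read.
    public static long OffsetToCommit(List<Polled> batch) => batch[batch.Count - 1].Offset + 1;
}
```

For the example in the question (partition 0 offsets 0 and 1, then partition 2 offset 0), the batch holds the two partition 0 records, the record to seek back to is partition 2 offset 0, and the offset to commit for partition 0 is 2.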

Directing messages to consumers

My client sends messages to the receiver. However, I noticed that the receiver sometimes does not receive all of them and misses a few (I am not sure whether the problem is in the client or the receiver).
Any suggestions on why that might be happening? This is what I am currently doing.
On the receiver side, this is what I am doing.
This is the event processor:
async Task IEventProcessor.ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
foreach (var eventData in messages)
var data = Encoding.UTF8.GetString(eventData.Body.Array, eventData.Body.Offset, eventData.Body.Count);
This is how the client connects to the event hub:
var StrBuilder = new EventHubsConnectionStringBuilder(eventHubConnectionString)
{
    EntityPath = eventHubName,
};
this.eventHubClient = EventHubClient.CreateFromConnectionString(StrBuilder.ToString());
How do I direct my messages to specific consumers?
I'm using the sample code from the official Event Hubs documentation for sending and receiving.
I have 2 consumer groups: $Default and newcg. Suppose you have 2 clients: client_1 uses the default consumer group ($Default), and client_2 uses the other consumer group (newcg).
First, after creating the send client, add a property to each event in the SendMessagesToEventHub method. Its value should be the name of the target consumer group. Sample code:
private static async Task SendMessagesToEventHub(int numMessagesToSend)
{
    for (var i = 0; i < numMessagesToSend; i++)
    {
        try
        {
            var message = "444 Message";
            Console.WriteLine($"Sending message: {message}");
            EventData mydata = new EventData(Encoding.UTF8.GetBytes(message));
            // Add a property named "cg" whose value is the target consumer group.
            // Every consumer group still receives the event; receivers use this property to filter.
            mydata.Properties.Add("cg", "newcg");
            await eventHubClient.SendAsync(mydata);
        }
        catch (Exception exception)
        {
            Console.WriteLine($"{DateTime.Now} > Exception: {exception.Message}");
        }

        await Task.Delay(10);
    }

    Console.WriteLine($"{numMessagesToSend} messages sent.");
}
Then, in client_1, after creating the receiver project that uses the default consumer group ($Default), filter out the unneeded event data in the ProcessEventsAsync method of the SimpleEventProcessor class. Sample code for ProcessEventsAsync:
public Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages)
    {
        //filter the data here
        if (eventData.Properties["cg"].ToString() == "$Default")
        {
            var data = Encoding.UTF8.GetString(eventData.Body.Array, eventData.Body.Offset, eventData.Body.Count);
            Console.WriteLine($"Message received. Partition: '{context.PartitionId}', Data: '{data}'");
        }
    }

    return context.CheckpointAsync();
}
In the other client, client_2, which uses the consumer group newcg, follow the same steps as for client_1 with a small change in the ProcessEventsAsync method:
public Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages)
    {
        //filter the data here, using another consumer group name
        if (eventData.Properties["cg"].ToString() == "newcg")
        {
            //other code
        }
    }

    return context.CheckpointAsync();
}
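One caveat with the filter above: indexing Properties["cg"] throws a KeyNotFoundException for any event that was sent without that property. A minimal sketch of a safer check follows, assuming the properties are exposed as an IDictionary<string, object>; the helper name ConsumerGroupFilter is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ConsumerGroupFilter
{
    // Property key assumed to match the "cg" key set by the sender.
    public const string Key = "cg";

    // Returns true only if the event carries the "cg" property and it names this consumer group.
    public static bool IsFor(IDictionary<string, object> properties, string consumerGroup)
    {
        return properties != null
            && properties.TryGetValue(Key, out var value)
            && string.Equals(value?.ToString(), consumerGroup, StringComparison.Ordinal);
    }
}
```

Inside ProcessEventsAsync the condition would then read ConsumerGroupFilter.IsFor(eventData.Properties, "newcg"), and events without the property are simply skipped.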
This happens only when 2 or more Event Processor Hosts read from the same consumer group.
If you have an event hub with 32 partitions and 2 Event Processor Hosts reading from the same consumer group, each host reads from 16 partitions.
Similarly, if 4 Event Processor Hosts read from the same consumer group in parallel, each reads from 8 partitions.
Check whether you have 2 or more Event Processor Hosts running on the same consumer group.
I tested your code with slight modifications (a different overload of the EventProcessorHost constructor, and CheckpointAsync added after consuming the messages).
Using the default implementation and the default EventProcessorOptions (EventProcessorOptions.DefaultOptions), I did experience some latency when consuming messages, but all messages were processed successfully.
So it sometimes seems as if messages from a certain partition are not arriving, but after a while all messages arrive.
Here you can find the actual modified code that worked for me. It is a simple console app that prints to the console if something arrives.
string processorHostName = Guid.NewGuid().ToString();
var Options = new EventProcessorOptions()
{
    MaxBatchSize = 1, //not required to make it working, just for testing
};
Options.SetExceptionHandler((ex) =>
{
    System.Diagnostics.Debug.WriteLine($"Exception : {ex}");
});
var eventHubCS = "event hub connection string";
var storageCS = "storage connection string";
var containerName = "test";
var eventHubname = "test2";
EventProcessorHost eventProcessorHost = new EventProcessorHost(eventHubname, "$Default", eventHubCS, storageCS, containerName);
For sending the messages to the event hub and testing I used this message publisher app.

JIT SignalR Hub Sending and Receiving

For the past 3 months, I have had no clue how SignalR works at the JIT (just-in-time) level. I'm trying to build a hub that sends data to the client just in time; the client then receives the data and works with it.
EDIT: In case you have no idea what I mean by JIT sending and receiving: I mean the server being able to send data to connected socket clients whenever new data is available. The socket connection is closed only when the server shuts down or has an issue, or when the client disconnects from the socket. In short, whenever new data arises on the server, it always sends that data ONE BY ONE to the connected clients.
So here's what I'm missing or confused about:
Is the SubscribeToAll method (see TickerHub.cs below) the place I call when I have new data to push to the clients, or is it somewhere else?
I know how the asynchronous WriteToChannel works. Basically it sends a collection, item by item to the client. Key issue is, how do I convert this entire function to JIT? And where do I handle the list of clients subscribed to this hub?
Currently, TickerHub.cs keeps retrieving a dataset (named CurrencyPairs) and then broadcasts it to the clients indefinitely. I have a background service that syncs and updates the CurrencyPairs 24/7. I just need a SignalR expert's help to explain/show how I can invoke the Hub from the background service and then allow the hub to broadcast that new data to the connected clients.
public class TickerHub : Hub, ITickerHubClient
{
    private IEnumerable<CurrencyPair> _currencyPairs;
    private readonly ICurrencyPairService _cpService;

    public TickerHub(ICurrencyPairService cpService)
    {
        _cpService = cpService;
    }

    public async Task<NozomiResult<CurrencyPair>> Tickers(IEnumerable<CurrencyPair> currencyPairs = null)
    {
        var nozRes = new NozomiResult<CurrencyPair>()
        {
            Success = true,
            ResultType = NozomiResultType.Success,
            Data = currencyPairs
        };

        return nozRes;
    }

    // We can use this to return a payload
    public async Task<ChannelReader<NozomiResult<CurrencyPair>>> SubscribeToAll()
    {
        // Initialize an unbounded channel
        // Unbounded Channels have no boundaries, allowing the server/client to transmit
        // limitless amounts of payload. Bounded channels have limits and will tend to
        // drop the clients after awhile.
        var channel = Channel.CreateUnbounded<NozomiResult<CurrencyPair>>();
        _ = WriteToChannel(channel.Writer); // Write all Currency Pairs to the channel
        // Return the reader
        return channel.Reader;

        // This is a nested method, allowing us to write repeated methods
        // with the same semantic conventions while maintaining conformity.
        async Task WriteToChannel(ChannelWriter<NozomiResult<CurrencyPair>> writer)
        {
            // Pull in the latest data
            _currencyPairs = _cpService.GetAllActive();
            // Iterate them currency pairs
            foreach (var cPair in _currencyPairs)
            {
                // Write one by one, and the client receives them one by one as well
                await writer.WriteAsync(new NozomiResult<CurrencyPair>()
                {
                    Success = (cPair != null),
                    ResultType = (cPair != null) ? NozomiResultType.Success : NozomiResultType.Failed,
                    Data = new[] {cPair}
                });
            }

            // Beep the client, telling them you're done
            writer.Complete();
        }
    }
}
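Stripped of SignalR, the streaming part of the hub is a producer that writes items into a channel and completes the writer when it has nothing more to send, while a reader drains the channel until that completion. The sketch below shows only this mechanism with System.Threading.Channels; TickerStream and the string payload are hypothetical placeholders, not part of the hub.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class TickerStream
{
    // Producer: writes each update as it becomes available, then completes
    // the writer so readers know the stream has ended.
    public static async Task ProduceAsync(ChannelWriter<string> writer, IEnumerable<string> updates)
    {
        Exception error = null;
        try
        {
            foreach (var update in updates)
                await writer.WriteAsync(update);
        }
        catch (Exception ex)
        {
            error = ex;
        }
        finally
        {
            // Passing the exception (if any) surfaces the failure to the reader.
            writer.TryComplete(error);
        }
    }

    // Consumer: drains the channel until the writer completes.
    public static async Task<List<string>> ConsumeAsync(ChannelReader<string> reader)
    {
        var received = new List<string>();
        while (await reader.WaitToReadAsync())
        {
            while (reader.TryRead(out var item))
                received.Add(item);
        }
        return received;
    }
}
```

A stream that should stay open for as long as the client is connected would keep the producer loop running and complete the writer only on shutdown or cancellation, instead of after a single pass over the data.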
In case you want to check whether my client-side code is at fault, here it is:
using Microsoft.AspNetCore.SignalR.Client;
using Newtonsoft.Json;
using Nozomi.Client.Data.Interfaces;
using Nozomi.Data;
using Nozomi.Data.CurrencyModels;
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
namespace Nozomi.Client
public class NozomiClient
private CancellationToken _tickerStreamCancellationToken;
private string ServerPath;
private HubConnection _hubConnection;
public NozomiClient(string serverPath)
ServerPath = serverPath;
_hubConnection = new HubConnectionBuilder()
public async Task InitializeAsync()
await _hubConnection.StartAsync();
public async Task StreamTickers()
// Setup the channel for streaming
var streamTickerChannel = await _hubConnection.StreamAsChannelAsync<NozomiResult<CurrencyPair>>("SubscribeToAll", CancellationToken.None);
// Setup the asynchronous data stream
//while (await streamTickerChannel.WaitToReadAsync())
// while (streamTickerChannel.TryRead(out var cp))
// {
// Console.WriteLine(JsonConvert.SerializeObject(cp));
// }
_hubConnection.On<CurrencyPair>("SubscribeToAll", cp =>
while (!_tickerStreamCancellationToken.IsCancellationRequested)
if (await streamTickerChannel.WaitToReadAsync())
while (streamTickerChannel.TryRead(out var cp))
public ICurrencyPair CurrencyPairs { get; }
public ISource Sources { get; }

How to sort a ConcurrentBag?

I am working on a client/server application. The server sends messages to the client, but the order cannot be guaranteed. I am using TCP... I don't want to get into why the order cannot be guaranteed (it is to do with threads on the server).
Anyway, on the client, I am processing messages like this:
private Queue<byte[]> rawMessagesIn = new Queue<byte[]>();
public ConcurrentBag<ServerToClient> messages = new ConcurrentBag<ServerToClient>();
public void Start()
var processTask = Task.Factory.StartNew(() =>
while (run)
void process(){
if(rawMessagesIn.Count > 0){
var raw_message = rawMessagesIn.Dequeue();
var message = (ServerToClient)Utils.Deserialize(raw_message);
private void OnDataReceived(object sender, byte[] data)
Now, it is important that when I call messages.TryTake() or messages.TryPeek() that the message out is the next in the sequence. Every message has a number/integer representing its order. For example, message.number = 1
I need to use TryPeek because the message at index 0 might be the correct message or it might be the wrong message, in which case we remove the message from the bag. However, there is a possibility that the message is a future required message, and so it should not be removed.
I have tried using messages.OrderBy(x => x.number).ToList(), but I cannot see how it would work. If I use OrderBy to get a sorted list SL and the item at index 0 is the correct one, I cannot simply remove it or modify messages, because I do not know its position in the ConcurrentBag!
Does anyone have a suggestion for me?
My suggestion is to switch from manually managing queues to a TransformBlock<TInput,TOutput> from the TPL Dataflow library. This component is a combination of an input queue, an output queue, and a processor that transforms the TInput to TOutput. The EnsureOrdered functionality is built in, and it is the default. Example:
private readonly TransformBlock<byte[], ServerToClient> _transformer;

public Client() // Constructor
{
    _transformer = new((byte[] raw_message) =>
    {
        ServerToClient message = (ServerToClient)Utils.Deserialize(raw_message);
        return message;
    }, new ExecutionDataflowBlockOptions()
    {
        EnsureOrdered = true, // Just for clarity. true is the default.
        MaxDegreeOfParallelism = 1, // the default is 1
    });
}

private void OnDataReceived(object sender, byte[] data)
{
    bool accepted = _transformer.Post(data);
    // The accepted will be false in case the _transformer has failed.
}

public bool TryReceiveAll(out IList<ServerToClient> messages)
{
    return _transformer.TryReceiveAll(out messages);
}
There are many ways to consume the ServerToClient messages that are stored in the output queue of the block. The example above demonstrates the TryReceiveAll method. There are also the TryReceive, Receive, ReceiveAsync and ReceiveAllAsync (some of them are extension methods). You can also use the lower level method OutputAvailableAsync as shown here. Linking it to another dataflow block is also an option.
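Note that EnsureOrdered preserves the order in which items were posted to the block; it does not sort by message.number. If messages can arrive out of sequence, a small resequencing buffer keyed by that number is one option. The sketch below is only an illustration under that assumption: Resequencer is a hypothetical class, and numbers are assumed to be consecutive integers starting at a known value.

```csharp
using System.Collections.Generic;

public sealed class Resequencer<T>
{
    private readonly Dictionary<int, T> _pending = new Dictionary<int, T>();
    private readonly object _gate = new object();
    private int _next;

    public Resequencer(int firstNumber)
    {
        _next = firstNumber;
    }

    // Stores a message until its turn comes. Numbers that were already
    // delivered, and duplicates of pending numbers, are ignored.
    public void Add(int number, T message)
    {
        lock (_gate)
        {
            if (number >= _next && !_pending.ContainsKey(number))
                _pending.Add(number, message);
        }
    }

    // Succeeds only when the next message in the sequence has arrived;
    // future messages stay buffered.
    public bool TryTakeNext(out T message)
    {
        lock (_gate)
        {
            if (_pending.TryGetValue(_next, out message))
            {
                _pending.Remove(_next);
                _next++;
                return true;
            }
            return false;
        }
    }
}
```

For example, after Add(2, "b") a call to TryTakeNext returns false when the sequence starts at 1; once Add(1, "a") arrives, two successive calls return "a" and then "b".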

How to read from multiple EventHub partitions simultaneously with high throughput?

My one role instance needs to read data from 20-40 EventHub partitions at the same time (context: this is our internal virtual partitioning scheme - 20-40 partitions represent scale out unit).
In my prototype I use the code below, but I get a maximum throughput of 8 MBPS. If I run the same console app multiple times, the throughput (perfmon counter) multiplies accordingly, so I think this is neither a VM network limit nor an Event Hubs service-side limit.
I wonder whether I am creating the clients correctly here...
Thank you!
const string EventHubName = "...";
const string ConsumerGroupName = "...";
var connectionStringBuilder = new ServiceBusConnectionStringBuilder();
connectionStringBuilder.SharedAccessKeyName = "...";
connectionStringBuilder.SharedAccessKey = "...";
connectionStringBuilder.Endpoints.Add(new Uri("sb://"));
connectionStringBuilder.TransportType = TransportType.Amqp;
var clientConnectionString = connectionStringBuilder.ToString();
var eventHubClient = EventHubClient.CreateFromConnectionString(clientConnectionString, EventHubName);
var runtimeInformation = await eventHubClient.GetRuntimeInformationAsync().ConfigureAwait(false);
var consumerGroup = eventHubClient.GetConsumerGroup(ConsumerGroupName);
var offStart = DateTime.UtcNow.AddMinutes(-10);
var offEnd = DateTime.UtcNow.AddMinutes(-8);
var workUnitManager = new WorkUnitManager(runtimeInformation.PartitionCount);
var readers = new List<PartitionReader>();
for (int i = 0; i < runtimeInformation.PartitionCount; i++)
var reader = new PartitionReader(
internal async Task Read()
Console.WriteLine("Creating a receiver for '{0}' with offset {1}", this.partitionId, this.startOffset);
EventHubReceiver receiver = await this.consumerGroup.CreateReceiverAsync(this.partitionId, this.startOffset).ConfigureAwait(false);
Console.WriteLine("Receiver for '{0}' has been created.", this.partitionId);
var stopWatch = new Stopwatch();
while (true)
var message =
(await receiver.ReceiveAsync(1, TimeSpan.FromSeconds(10)).ConfigureAwait(false)).FirstOrDefault();
if (message == null)
if (message.EnqueuedTimeUtc >= this.endOffset)
this.processor.Push(this.partitionIndex, message);
this.Duration = TimeSpan.FromMilliseconds(stopWatch.ElapsedMilliseconds);
catch (Exception ex)
The code snippet you provided effectively creates 1 connection to the Service Bus service and then runs all receivers on that single connection (at the protocol level, it creates multiple AMQP links on the same connection).
To achieve high throughput for receive operations, you need to create multiple connections and tune the ratio of receivers to connections. That is what happens when you run the above code in multiple processes.
Here's how:
You need to go one layer down in the .NET client SDK API and code at the MessagingFactory level. You can start with 1 MessagingFactory per EventHubClient; a MessagingFactory represents 1 connection to the Event Hubs service. Code to create a dedicated connection per EventHubClient:
var connStr = new ServiceBusConnectionStringBuilder("Endpoint=sb://;SharedAccessKeyName=saskeyname;SharedAccessKey=sakKey");
connStr.TransportType = TransportType.Amqp;
var msgFactory = MessagingFactory.CreateFromConnectionString(connStr.ToString());
var ehClient = msgFactory.CreateEventHubClient("teststream");
I only added connStr in my sample to emphasize setting TransportType to Amqp.
You will end up with multiple connections on outgoing port 5671.
If you rewrite your code with 1 MessagingFactory per EventHubClient (or a reasonable ratio), you are all set. In your code, you will need to move the EventHubClient creation into the reader.
The only additional consideration when creating multiple connections is the bill: only 100 connections (senders and receivers combined) are included in the Basic SKU. I guess you are already on Standard (as you have more than 1 TU), which includes 1000 connections, so there is no need to worry, but I mention it just in case.
A good option is to create a Task for each partition.
This is a copy of my implementation, which is able to process 2.5k messages per second per partition. The rate also depends on the speed of your downstream processing.
static void EventReceiver()
{
    // Partition IDs run from 0 to EventHubPartitionCount - 1
    for (int i = 0; i < EventHubPartitionCount; i++)
    {
        Task.Factory.StartNew((state) =>
        {
            Console.WriteLine("Starting worker to process partition: {0}", state);
            var factory = MessagingFactory.Create(ServiceBusEnvironment.CreateServiceUri("sb", "tests-eventhub", ""), new MessagingFactorySettings()
            {
                TokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider("Listen", "PGSVA7L="),
                TransportType = TransportType.Amqp
            });
            var client = factory.CreateEventHubClient("eventHubName");
            var group = client.GetConsumerGroup("customConsumer");
            Console.WriteLine("Group: {0}", group.GroupName);
            var receiver = group.CreateReceiver(state.ToString(), DateTime.Now);
            while (true)
            {
                if (cts.IsCancellationRequested)
                {
                    break;
                }
                var messages = receiver.Receive(20);
                messages.ToList().ForEach(aMessage =>
                {
                    // Process your event
                });
            }
        }, i);
    }
}
