How to read from multiple EventHub partitions simultaneously with high throughput? - c#

A single role instance needs to read data from 20-40 EventHub partitions at the same time (context: this is our internal virtual partitioning scheme, where 20-40 partitions represent one scale-out unit).
In my prototype I use the code below, but I get at most 8 MBps of throughput. Since running the same console app multiple times multiplies the throughput (per the perfmon counter) accordingly, I don't think the bottleneck is either the VM's network limit or an EventHub service-side limit.
I wonder whether I am creating the clients correctly here...
Thank you!
Zaki
const string EventHubName = "...";
const string ConsumerGroupName = "...";

var connectionStringBuilder = new ServiceBusConnectionStringBuilder();
connectionStringBuilder.SharedAccessKeyName = "...";
connectionStringBuilder.SharedAccessKey = "...";
connectionStringBuilder.Endpoints.Add(new Uri("sb://....servicebus.windows.net/"));
connectionStringBuilder.TransportType = TransportType.Amqp;
var clientConnectionString = connectionStringBuilder.ToString();

var eventHubClient = EventHubClient.CreateFromConnectionString(clientConnectionString, EventHubName);
var runtimeInformation = await eventHubClient.GetRuntimeInformationAsync().ConfigureAwait(false);
var consumerGroup = eventHubClient.GetConsumerGroup(ConsumerGroupName);

var offStart = DateTime.UtcNow.AddMinutes(-10);
var offEnd = DateTime.UtcNow.AddMinutes(-8);

var workUnitManager = new WorkUnitManager(runtimeInformation.PartitionCount);
var readers = new List<PartitionReader>();
for (int i = 0; i < runtimeInformation.PartitionCount; i++)
{
    var reader = new PartitionReader(
        consumerGroup,
        runtimeInformation.PartitionIds[i],
        i,
        offStart,
        offEnd,
        workUnitManager);
    readers.Add(reader);
}
internal async Task Read()
{
    try
    {
        Console.WriteLine("Creating a receiver for '{0}' with offset {1}", this.partitionId, this.startOffset);
        EventHubReceiver receiver = await this.consumerGroup.CreateReceiverAsync(this.partitionId, this.startOffset).ConfigureAwait(false);
        Console.WriteLine("Receiver for '{0}' has been created.", this.partitionId);

        var stopWatch = new Stopwatch();
        stopWatch.Start();

        while (true)
        {
            var message =
                (await receiver.ReceiveAsync(1, TimeSpan.FromSeconds(10)).ConfigureAwait(false)).FirstOrDefault();
            if (message == null)
            {
                continue;
            }
            if (message.EnqueuedTimeUtc >= this.endOffset)
            {
                break;
            }
            this.processor.Push(this.partitionIndex, message);
        }

        this.Duration = TimeSpan.FromMilliseconds(stopWatch.ElapsedMilliseconds);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex);
        throw;
    }
}

The code snippet above effectively creates one connection to the Service Bus service and then runs all receivers on that single connection (at the protocol level, it creates multiple AMQP links on the same connection).
To achieve high throughput for receive operations, you will instead need to create multiple connections and tune your receiver-to-connection ratio. That is exactly what happens when you run the above code in multiple processes.
Here's how:
You need to go one layer down in the .NET client SDK and code at the MessagingFactory level; a MessagingFactory is what represents one connection to the EventHubs service. You can start with one MessagingFactory per EventHubClient. Code to create a dedicated connection per EventHubClient:
var connStr = new ServiceBusConnectionStringBuilder("Endpoint=sb://servicebusnamespacename.servicebus.windows.net/;SharedAccessKeyName=saskeyname;SharedAccessKey=sakKey");
connStr.TransportType = TransportType.Amqp;
var msgFactory = MessagingFactory.CreateFromConnectionString(connStr.ToString());
var ehClient = msgFactory.CreateEventHubClient("teststream");
I included connStr in the sample only to emphasize setting TransportType to Amqp.
You will end up with multiple connections, each using outgoing port 5671.
If you rewrite your code with one MessagingFactory per EventHubClient (or at some reasonable ratio), you are all set; in your code, you will need to move the EventHubClient creation into the Reader, as sketched below.
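For illustration, a minimal sketch of that change, assuming each PartitionReader from the question owns its own factory and client (the constructor parameters and wiring here are my assumptions, not the asker's actual class):
// Sketch only: each reader creates its own MessagingFactory, i.e. its own
// AMQP connection, instead of all readers sharing one EventHubClient.
internal sealed class PartitionReader
{
    private readonly EventHubConsumerGroup consumerGroup;

    public PartitionReader(string connectionString, string eventHubName,
        string consumerGroupName, string partitionId /* ... */)
    {
        // One MessagingFactory == one connection to the EventHubs service.
        var factory = MessagingFactory.CreateFromConnectionString(connectionString);
        var client = factory.CreateEventHubClient(eventHubName);
        this.consumerGroup = client.GetConsumerGroup(consumerGroupName);
        // ... store partitionId and offsets, then create the receiver in Read()
    }
}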
The only extra criterion to consider when creating multiple connections is the bill: only 100 connections (senders and receivers included) are part of the Basic SKU. I guess you are already on Standard (as you have more than 1 TU), which includes 1000 connections, so no need to worry, but mentioning it just in case.
~Sree

A good option is to create a Task for each partition.
This is a copy of my implementation, which is able to process a rate of 2.5k messages per second per partition. The achievable rate will also depend on your downstream speed.
// Assumes EventHubPartitionCount, a CancellationTokenSource 'cts' and a
// message counter 'counter' are defined elsewhere in the class.
static void EventReceiver()
{
    for (int i = 0; i < EventHubPartitionCount; i++) // partition ids run 0..count-1
    {
        Task.Factory.StartNew((state) =>
        {
            Console.WriteLine("Starting worker to process partition: {0}", state);

            // One MessagingFactory (i.e. one connection) per partition worker.
            var factory = MessagingFactory.Create(ServiceBusEnvironment.CreateServiceUri("sb", "tests-eventhub", ""), new MessagingFactorySettings()
            {
                TokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider("Listen", "PGSVA7L="),
                TransportType = TransportType.Amqp
            });
            var client = factory.CreateEventHubClient("eventHubName");
            var group = client.GetConsumerGroup("customConsumer");
            Console.WriteLine("Group: {0}", group.GroupName);

            var receiver = group.CreateReceiver(state.ToString(), DateTime.Now);
            while (true)
            {
                if (cts.IsCancellationRequested)
                {
                    receiver.Close();
                    break;
                }
                var messages = receiver.Receive(20);
                messages.ToList().ForEach(aMessage =>
                {
                    // Process your event
                });
                Console.WriteLine(counter);
            }
        }, i);
    }
}
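One design note on this pattern: Task.Factory.StartNew schedules each worker on the thread pool, and each worker here blocks indefinitely in its Receive loop, so with many partitions it may be worth passing TaskCreationOptions.LongRunning (or using dedicated threads) so the blocking loops don't starve the pool.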

Related

GRPC performance vs WCF performance

We have a legacy app that runs on top of WCF, and we are trying to move off of it and find another technology. One of the requirements is wire performance, so part of evaluating GRPC is measuring how quickly it works, but also how many simultaneous clients we can run.
To that end, we're simulating many calls that each pass a relatively small amount of data, i.e. a high number of small calls. In that respect, WCF has turned out to be significantly better than GRPC, which is very unexpected. Is there possibly something wrong with the way the tests were conceived and implemented?
The server code:
public override Task<TestReply> Test(TestRequest request, ServerCallContext context)
{
    var ret = new char[request.Size];
    var a = (int)'a';
    for (var i = 0; i < request.Size; i++)
    {
        ret[i] = (char)(a + (i % 26));
    }
    return Task.FromResult(new TestReply { Message = new string(ret) });
}
The client code:
static void Main(string[] args)
{
    AppContext.SetSwitch("System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport", true);
    using var channel = GrpcChannel.ForAddress("http://remote_server:8001", new GrpcChannelOptions { Credentials = ChannelCredentials.Insecure });
    var client = new Greeter.GreeterClient(channel);

    string TestMethod(int i)
    {
        var request = new TestRequest { Size = i };
        return client.Test(request).Message;
    }

    var start = DateTime.Now;
    for (var i = 0; i < 15625; i++)
    {
        var val = TestMethod(10);
    }
    var end = DateTime.Now;
}
If we run a single instance of the client, it takes just under 7 seconds. If we run 64 instances simultaneously, each takes an average of 23 seconds. Part of the problem is that running 64 instances is also CPU-intensive on both client and server: with 64 clients, the client machine sees 85-95% CPU utilization, and the server sees 70-80%.
By comparison, WCF will run a single instance of that code in 2.4 seconds, and 64 in an average of 9 seconds, and never experience significant CPU utilization on either.
Are we using GRPC wrongly? Is there something wrong with the test? What can we do to make GRPC run a little faster/leaner?
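One avenue that might be worth testing (a sketch under the assumption that the generated Greeter client is the one shown above, not a verified fix for this benchmark): gRPC multiplexes concurrent calls over a single HTTP/2 channel, so a single process can share one GrpcChannel and issue the calls asynchronously, instead of spawning 64 blocking client processes:
// Hypothetical async variant of the benchmark client; same Greeter/TestRequest
// contract as above, all calls multiplexed over one shared channel.
static async Task RunAsync(GrpcChannel channel)
{
    var client = new Greeter.GreeterClient(channel);
    var tasks = new List<Task<TestReply>>();
    for (var i = 0; i < 15625; i++)
    {
        tasks.Add(client.TestAsync(new TestRequest { Size = 10 }).ResponseAsync);
    }
    await Task.WhenAll(tasks); // completion order doesn't matter for the timing test
}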

Blocking connection to Azure storage account

I have an application developed in C# whose first piece of functionality is a method that connects to a storage account in order to manage blobs.
My problem is that I want to block the connection after 3 failed attempts to connect.
This is the method that represents the connection to the storage account:
public bool Connect(out String strerror)
{
    strerror = "";
    try
    {
        storageAccount = new CloudStorageAccount(new StorageCredentials(AccountName, AccountConnectionString), true);
        MSAzureBlobStorageGUILogger.TraceLog(MessageType.Control, CommonMessages.ConnectionSuccessful);
        return true;
    }
    catch (Exception ex01)
    {
        Console.WriteLine(CommonMessages.ConnectionFailed + ex01.Message);
        strerror = CommonMessages.ConnectionFailed + ex01.Message;
        return false;
    }
}
At the moment you create the CloudStorageAccount instance, no connection to the Storage Account has been made yet, which you can easily verify by passing in random credentials. In the background, all the library does is fire REST calls to the Storage API, so it doesn't make any connection until you actually retrieve or send data.
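If you want Connect to really validate the credentials, you can force a cheap round-trip; a minimal sketch, assuming the same WindowsAzure.Storage client library as above (the specific call is just one inexpensive option):
// Force an actual request so bad credentials fail here instead of later.
var cloudBlobClient = storageAccount.CreateCloudBlobClient();
cloudBlobClient.ListContainersSegmented(null); // throws a StorageException on auth failure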
The library also already has its own mechanism to retry requests in case of failures, which defaults to 3 retries, but you can change it manually like this:
var options = new BlobRequestOptions()
{
    RetryPolicy = new ExponentialRetry(deltaBackoff, maxAttempts),
};
cloudBlobClient.DefaultRequestOptions = options;
What about wrapping it in a while loop and retrying until either success or hitting the 3-attempt maximum?
string strError;
const int maxConnectionAttempts = 3;
var connectionAttempts = 0;
var connected = false;
while (!connected && connectionAttempts < maxConnectionAttempts)
{
    connected = Connect(out strError);
    connectionAttempts++;
}

Sending messages at scale to Service Bus from durable functions

I have a scenario where one activity function has retrieved a set of records, which can be anywhere from 1,000 to a million, and stored them in an object. This object is then used by the next activity function to send messages in parallel to Service Bus.
Currently I am using a for loop over this object to send each record in it to Service Bus. Please let me know if there is a better pattern where the object's contents (wherever they are stored) are drained and sent to Service Bus, with the function scaling out automatically instead of the processing being restricted to a for loop.
I have used a for loop in the orchestrator function to call the activity function for each record in the object.
I have looked at the scaling of the activity function: for a set of 18,000 records it scaled up to 15 instances and processed the whole set in 4 minutes.
Currently the function app is on the Consumption plan. I have checked that only this function app uses the plan and it is not shared.
The topic the messages are sent to has another service listening to it to read the messages.
The instance counts for both the orchestrator and the activity function are the defaults.
for (int i = 0; i < number_messages; i++)
{
    taskList[i] = context.CallActivityAsync<string>("Sendtoservicebus",
        (messages[i], runId, CorrelationId, Code));
}
try
{
    await Task.WhenAll(taskList);
}
catch (AggregateException ae)
{
    ae.Flatten();
}
The messages should be sent to Service Bus quickly, with the activity functions scaling out appropriately.
I would suggest you use batching for sending messages.
The Azure Service Bus client supports sending messages in batches (the SendBatch and SendBatchAsync methods of QueueClient and TopicClient). However, the size of a single batch must stay below 256 KB, otherwise the whole batch will be rejected.
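As a baseline, a batch send is a single call; a minimal sketch, assuming a QueueClient named client and a payload collection small enough to fit under the limit:
// Naive batching: fine until the combined batch exceeds the 256 KB limit.
var batch = messages.Select(m => new BrokeredMessage(m));
await client.SendBatchAsync(batch);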
We will start with a simple use case: the size of each message is known to us, defined by a hypothetical Func<T, long> getSize function. Here is a helpful extension method that splits an arbitrary collection based on a metric function and a maximum chunk size:
public static List<List<T>> ChunkBy<T>(this IEnumerable<T> source, Func<T, long> metric, long maxChunkSize)
{
    return source
        .Aggregate(
            new
            {
                Sum = 0L,
                Current = (List<T>)null,
                Result = new List<List<T>>()
            },
            (agg, item) =>
            {
                var value = metric(item);
                if (agg.Current == null || agg.Sum + value > maxChunkSize)
                {
                    var current = new List<T> { item };
                    agg.Result.Add(current);
                    return new { Sum = value, Current = current, agg.Result };
                }
                agg.Current.Add(item);
                return new { Sum = agg.Sum + value, agg.Current, agg.Result };
            })
        .Result;
}
Now, the implementation of SendBigBatchAsync is simple:
public async Task SendBigBatchAsync<T>(IEnumerable<T> messages, Func<T, long> getSize)
{
    var chunks = messages.ChunkBy(getSize, MaxServiceBusMessage);
    foreach (var chunk in chunks)
    {
        var brokeredMessages = chunk.Select(m => new BrokeredMessage(m));
        await client.SendBatchAsync(brokeredMessages);
    }
}

private const long MaxServiceBusMessage = 256000;
private readonly QueueClient client;
But how do we determine the size of each message? How do we implement the getSize function?
The BrokeredMessage class exposes a Size property, so it might be tempting to rewrite our method the following way:
public async Task SendBigBatchAsync<T>(IEnumerable<T> messages)
{
    var brokeredMessages = messages.Select(m => new BrokeredMessage(m));
    var chunks = brokeredMessages.ChunkBy(bm => bm.Size, MaxServiceBusMessage);
    foreach (var chunk in chunks)
    {
        await client.SendBatchAsync(chunk);
    }
}
The last possibility I want to consider is to actually allow yourself to violate the maximum batch size, but then handle the exception, retry the send operation, and adjust future calculations based on the actual measured size of the failed messages. The size is known after attempting SendBatch, even if the operation failed, so we can use this information.
// Sender is reused across requests
public class BatchSender
{
    private readonly QueueClient queueClient;
    private long batchSizeLimit = 262000;
    private long headerSizeEstimate = 54; // start with the smallest header possible

    public BatchSender(QueueClient queueClient)
    {
        this.queueClient = queueClient;
    }

    public async Task SendBigBatchAsync<T>(IEnumerable<T> messages)
    {
        var packets = (from m in messages
                       let bm = new BrokeredMessage(m)
                       select new { Source = m, Brokered = bm, BodySize = bm.Size }).ToList();
        var chunks = packets.ChunkBy(p => this.headerSizeEstimate + p.Brokered.Size, this.batchSizeLimit);
        foreach (var chunk in chunks)
        {
            try
            {
                await this.queueClient.SendBatchAsync(chunk.Select(p => p.Brokered));
            }
            catch (MessageSizeExceededException)
            {
                var maxHeader = packets.Max(p => p.Brokered.Size - p.BodySize);
                if (maxHeader > this.headerSizeEstimate)
                {
                    // If failed messages had bigger headers, remember this header size
                    // as max observed and use it in future calculations
                    this.headerSizeEstimate = maxHeader;
                }
                else
                {
                    // Reduce max batch size to 95% of current value
                    this.batchSizeLimit = (long)(this.batchSizeLimit * .95);
                }
                // Re-send only the failed chunk, with the adjusted estimates
                await this.SendBigBatchAsync(chunk.Select(p => p.Source));
            }
        }
    }
}
You can use this blog for further reference. Hope it helps.

Starting multiple threads in a for loop has no effect

I'm trying to read messages off a WebSphere MQ queue and dump them into another queue.
Below is the code I have to do it:
private void transferMessages()
{
    MQQueueManager sqmgr = connectToQueueManager(S_SERVER_NAME, S_QMGR_NAME, S_PORT_NUMBER, S_CHANNEL_NAME);
    MQQueueManager dqmgr = connectToQueueManager(D_SERVER_NAME, D_QMGR_NAME, D_PORT_NUMBER, D_CHANNEL_NAME);
    if (sqmgr != null && dqmgr != null)
    {
        MQQueue sq = openSourceQueueToGet(sqmgr, S_QUEUE_NAME);
        MQQueue dq = openDestQueueToPut(dqmgr, D_QUEUE_NAME);
        if (sq != null && dq != null)
        {
            setPutMessageOptions();
            setGetMessageOptions();
            processMessages(sqmgr, sq, dqmgr, dq);
        }
    }
}
And I'm calling the above method in a for loop, creating a separate thread per iteration, as below:
int NO_OF_THREADS = 5;
Thread[] ts = new Thread[NO_OF_THREADS];
for (int i = 0; i < NO_OF_THREADS; i++)
{
    ts[i] = new Thread(() => transferMessages());
    ts[i].Start();
}
As you can see, I'm making a fresh connection to the queue manager in the transferMessages method. Yet for some reason the program makes only one connection to MQ.
The custom method to connect to the queue manager is below:
private MQQueueManager connectToQueueManager(string MQServerName, string MQQueueManagerName, string MQPortNumber, string MQChannel)
{
    try
    {
        mqErrorString = "";
        MQQueueManager qmgr;
        Hashtable mqProps = new Hashtable();
        mqProps.Add(MQC.HOST_NAME_PROPERTY, MQServerName);
        mqProps.Add(MQC.CHANNEL_PROPERTY, MQChannel);
        mqProps.Add(MQC.PORT_PROPERTY, Convert.ToInt32(MQPortNumber));
        mqProps.Add(MQC.TRANSPORT_PROPERTY, MQC.TRANSPORT_MQSERIES_CLIENT);
        qmgr = new MQQueueManager(MQQueueManagerName, mqProps);
        return qmgr;
    }
    catch (MQException mqex)
    {
        // catch and log MQException here
        return null;
    }
}
Any advice on what I am missing?
That is because of the shared conversations (SHARECNV) feature of MQ, where multiple connections to a queue manager from one application share the same socket. The value is negotiated between the client and the queue manager while establishing a connection; by default, 10 conversations are shared over one socket.
If you increase the number of threads in your application to 11, you will see a second connection being opened. More details on SHARECNV are here.
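If you really want one socket per connection (for example, to spread traffic across network connections), conversation sharing can be turned off on the server-connection channel; as a sketch, in runmqsc that would be something like ALTER CHANNEL(MY.SVRCONN) CHLTYPE(SVRCONN) SHARECNV(1), using the channel name from the status output below.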
UPDATE
Channel status when running 6 threads each for put and get. Note I am connecting to the same queue manager (test purpose only). SHARECNV is set to 10.
2 : dis chstatus(MY.SVRCONN)
AMQ8417: Display Channel Status details.
CHANNEL(MY.SVRCONN) CHLTYPE(SVRCONN)
CONNAME(127.0.0.1) CURRENT
STATUS(RUNNING) SUBSTATE(RECEIVE)
AMQ8417: Display Channel Status details.
CHANNEL(MY.SVRCONN) CHLTYPE(SVRCONN)
CONNAME(127.0.0.1) CURRENT
STATUS(RUNNING) SUBSTATE(RECEIVE)
When running 5 threads each.
3 : dis chstatus(MY.SVRCONN)
AMQ8417: Display Channel Status details.
CHANNEL(MY.SVRCONN) CHLTYPE(SVRCONN)
CONNAME(127.0.0.1) CURRENT
STATUS(RUNNING) SUBSTATE(RECEIVE)

How to optimize SOA requests in HPC

I want to use HPC to do some simulations, and I'm going to use SOA. I have the following code from some sample materials, which I modified (I added the outer for loop). Currently I have stumbled upon a problem of optimization / poor performance. This basic sample does nothing except query the service method, which returns the value it gets as a parameter. Even so, my example is slow. I have 60 computers with 4-core processors and a 1 Gb network. The first phase of sending messages takes about 2 seconds, and then I have to wait another 7 seconds for the return values; all values arrive more or less at the same time. Another problem is that I cannot reuse the session object, which is why the outer for loop is outside the using; when I put it inside the using, I get a timeout or an error that the BrokerClient has ended.
Can I reuse the BrokerClient or DurableSession object?
How can I speed up this whole message-passing process?
static void Main(string[] args)
{
    const string headnode = "Head-Node.hpcCluster.edu.edu";
    const string serviceName = "EchoService";
    const int numRequests = 1000;
    SessionStartInfo info = new SessionStartInfo(headnode, serviceName);
    for (int j = 0; j < 100; j++)
    {
        using (DurableSession session = DurableSession.CreateSession(info))
        {
            Console.WriteLine("done session id = {0}", session.Id);
            NetTcpBinding binding = new NetTcpBinding(SecurityMode.Transport);
            using (BrokerClient<IService1> client = new BrokerClient<IService1>(session, binding))
            {
                for (int i = 0; i < numRequests; i++)
                {
                    EchoRequest request = new EchoRequest("hello world!");
                    client.SendRequest<EchoRequest>(request, i);
                }
                client.EndRequests();
                foreach (var response in client.GetResponses<EchoResponse>())
                {
                    try
                    {
                        string reply = response.Result.EchoResult;
                        Console.WriteLine("\tReceived response for request {0}: {1}", response.GetUserData<int>(), reply);
                    }
                    catch (Exception ex)
                    {
                    }
                }
            }
            session.Close();
        }
    }
}
Second version, with Session instead of DurableSession, which works better, but where I have a problem with Session reuse:
// count, done (a ManualResetEvent) and numRequests are declared outside this snippet.
using (Session session = Session.CreateSession(info))
{
    for (int i = 0; i < 100; i++)
    {
        count = 0;
        Console.WriteLine("done session id = {0}", session.Id);
        NetTcpBinding binding = new NetTcpBinding(SecurityMode.Transport);
        using (BrokerClient<IService1> client = new BrokerClient<IService1>(session, binding))
        {
            // Set the response handler.
            client.SetResponseHandler<EchoResponse>((item) =>
            {
                try
                {
                    Console.WriteLine("\tReceived response for request {0}: {1}",
                        item.GetUserData<int>(), item.Result.EchoResult);
                }
                catch (SessionException ex)
                {
                    Console.WriteLine("SessionException while getting responses in callback: {0}", ex.Message);
                }
                catch (Exception ex)
                {
                    Console.WriteLine("Exception while getting responses in callback: {0}", ex.Message);
                }
                if (Interlocked.Increment(ref count) == numRequests)
                    done.Set();
            });

            // Start sending requests; tag each request with its index as user data.
            Console.Write("Sending {0} requests...", numRequests);
            for (int j = 0; j < numRequests; j++)
            {
                EchoRequest request = new EchoRequest("hello world!");
                client.SendRequest<EchoRequest>(request, j);
            }
            client.EndRequests();
            Console.WriteLine("done");
            Console.WriteLine("Retrieving responses...");

            // The main thread blocks here until retrieval completes: the handler
            // that receives the numRequests-th response calls Set() on the event,
            // so done.WaitOne() returns.
            done.WaitOne();
            Console.WriteLine("Done retrieving {0} responses", numRequests);
        }
    }
    // Close connections and delete messages stored in the system.
    session.Close();
}
I get an exception during the second run of EndRequests: "The server did not provide a meaningful reply; this might be caused by a contract mismatch, a premature session shutdown or an internal server error."
Don't use DurableSession for computations where the individual requests are shorter than about 30 seconds. A DurableSession is backed by an MSMQ queue in the broker, so your requests and responses may be round-tripped to disk; this causes performance problems when the amount of computation per request is small. You should use Session instead.
In general, for performance reasons, don't use DurableSession unless you absolutely need the durable behavior in the broker. In this case, since you are calling GetResponses immediately after SendRequests, Session will work fine for you.
You can reuse a Session or DurableSession object to create any number of BrokerClient objects, as long as you haven't called Session.Close; see the sketch below.
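A minimal sketch of that reuse pattern, using the names from the question's code (info, binding and numRequests are assumed to be set up as above):
// One Session, many BrokerClient instances; Session.Close only at the very end.
using (Session session = Session.CreateSession(info))
{
    for (int batch = 0; batch < 100; batch++)
    {
        using (var client = new BrokerClient<IService1>(session, binding))
        {
            for (int i = 0; i < numRequests; i++)
                client.SendRequest<EchoRequest>(new EchoRequest("hello world!"), i);
            client.EndRequests();

            foreach (var response in client.GetResponses<EchoResponse>())
                Console.WriteLine(response.Result.EchoResult);
        }
    }
    session.Close();
}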
If it's important to process the responses in parallel on the client side, use BrokerClient.SetResponseHandler to set a callback function which will handle responses asynchronously (rather than use client.GetResponses, which handles them synchronously). Look at the HelloWorldR2 sample code for details.
