I have a queue processor that is retrieving all messages from a ServiceBus Queue. I am wondering how I should determine the MessageReceiver PrefetchCount and the ReceiveBatch messageCount to optimize performance. I am currently setting these to the arbitrary number 500, as seen below:
var receiverFactory = MessagingFactory.CreateFromConnectionString("ConnectionString");
var receiver = await receiverFactory.CreateMessageReceiverAsync("QueueName", ReceiveMode.PeekLock);
receiver.PrefetchCount = 500;
bool loopBatch = true;
while (loopBatch)
{
    var tempMessages = await receiver.ReceiveBatchAsync(500, TimeSpan.FromSeconds(1));
    // Do some message processing...
    loopBatch = tempMessages.Any();
}
When running, I see that my batches often take time to "warm up," retrieving counts such as "1, 1, 1, 1, 1, 1, 1, 1, 125, 125, 125, 125..." where the batch retrieval number suddenly jumps much higher.
From the Prefetching optimization docs:
When using the default lock expiration of 60 seconds, a good value for SubscriptionClient.PrefetchCount is 20 times the maximum processing rates of all receivers of the factory. For example, a factory creates 3 receivers, and each receiver can process up to 10 messages per second. The prefetch count should not exceed 20 X 3 X 10 = 600. By default, QueueClient.PrefetchCount is set to 0, which means that no additional messages are fetched from the service.
I don't really understand how to determine the receiver's "messages per second" when the batch retrieval seems to retrieve widely-varying numbers of messages at a time. Any assistance would be greatly appreciated.
I don't really understand how to determine the receiver's "messages per second" when the batch retrieval seems to retrieve widely-varying numbers of messages at a time.
Prefetch makes more sense when the OnMessage API is used. In that scenario, a callback is registered that takes a single message for processing, so you can estimate the average processing time per message. The OnMessage API also lets you define how many callbacks run concurrently. It would be extremely inefficient to retrieve messages one by one when there is a constant flow of incoming messages. Hence, PrefetchCount specifies how many messages the client should retrieve in a "batch" in the background, saving round trips to the server.
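One rough way to get a "messages per second" figure despite uneven batch sizes is to time the processing of whole batches and divide, then apply the rule of thumb quoted from the docs. The following is only a sketch under that assumption: PrefetchSizing and its method names are hypothetical helpers, not part of the Service Bus SDK, and the factor of 20 is the one quoted above for the default 60-second lock.

```csharp
using System;

// Hypothetical helper, not part of the Service Bus SDK.
public static class PrefetchSizing
{
    // Average processing rate over a measured interval, in messages per second.
    public static double MessagesPerSecond(int messagesProcessed, TimeSpan elapsed)
    {
        if (elapsed <= TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(elapsed));
        }
        return messagesProcessed / elapsed.TotalSeconds;
    }

    // Upper bound from the quoted rule of thumb:
    // 20 x (receivers created by the factory) x (max messages per second per receiver).
    public static int MaxPrefetchCount(int receiverCount, double maxMessagesPerSecondPerReceiver)
    {
        return (int)Math.Floor(20 * receiverCount * maxMessagesPerSecondPerReceiver);
    }
}
```

With the numbers from the docs example, PrefetchSizing.MaxPrefetchCount(3, 10) gives 600. The rate would be measured over several batches (for instance with a Stopwatch around the processing code) rather than taken from a single ReceiveBatchAsync call.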
Related
Is there a way using C# to identify whether a private MSMQ has exceeded its storage limit (KB)?
In the following example I created a private MSMQ using the Computer Management console and I set the storage limit to 100 KB.
I send messages to the queue using a simple C# program, which works fine. I would like to be able to figure out when the limit has been reached in order to stop sending messages.
MessageQueue msgQ = new MessageQueue(".\\Private$\\name_of_queue");
msgQ.Send(msg);
Maximum Size of Queue
Use the MessageQueue.MaximumQueueSize Property to get the queue's maximum size.
The maximum size, in kilobytes, of the queue. The Message Queuing
default specifies that no limit exists.
So, something like this should work:
var msgQ = new MessageQueue(".\\Private$\\name_of_queue");
long size = msgQ.MaximumQueueSize;
Size of Queue
Use the PerformanceCounter to get the current size of the queue:
var bytesCounter = new PerformanceCounter(
"MSMQ Queue",
"Bytes in Queue",
Environment.MachineName + "\\private$\\queue-name");
Looks like there are two different queries to get the size of the current queue:
Query | Description
Bytes in Queue | Shows the total number of bytes that currently reside in the queue. For the computer queue instance, this represents the dead letter queue.
Bytes in Journal Queue | Shows the total number of bytes that reside in the journal queue. For the computer queue instance, this represents the computer journal queue.
The above queries were found on MSDN in a now deprecated section of MSMQ Queue Object. However, I believe the queries are still valid.
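The two values above can be combined into a simple pre-send check. This is a minimal sketch, assuming that the limit in kilobytes converts at 1024 bytes per KB and that the caller can estimate the size of the message about to be sent. QueueLimit and WouldExceedLimit are hypothetical names, not part of System.Messaging.

```csharp
using System;

// Hypothetical helper; the inputs would come from MessageQueue.MaximumQueueSize
// (kilobytes) and the "Bytes in Queue" performance counter (bytes).
public static class QueueLimit
{
    // True when the current queue content plus the pending message would exceed the limit.
    // Assumption: 1 KB = 1024 bytes for the purpose of the storage limit.
    public static bool WouldExceedLimit(long bytesInQueue, long pendingMessageBytes, long maximumQueueSizeKb)
    {
        long limitBytes = maximumQueueSizeKb * 1024;
        return bytesInQueue + pendingMessageBytes > limitBytes;
    }
}

// Usage sketch (Windows only; msgQ and bytesCounter are the objects created above,
// estimatedSize is a caller-supplied guess at the message size in bytes):
//   long bytes = (long)bytesCounter.NextValue();
//   if (!QueueLimit.WouldExceedLimit(bytes, estimatedSize, msgQ.MaximumQueueSize))
//   {
//       msgQ.Send(msg);
//   }
```

Performance counters are sampled values, so the check is best treated as an early warning rather than an exact guard.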
I'm playing around with sending messages through the Azure Service Bus library Azure.Messaging.ServiceBus. I'm following this tutorial and sending/processing a single message.
When calling processor.StopProcessingAsync(), the call takes about a minute (every single time). The Azure portal shows that all messages have been processed. I have no clue why it takes so long for the processor to stop even though there are no messages on the queue.
It seems like it takes the (exact) same amount of time each time. If anyone could point me to why it takes such a long time and how to reduce it (configuration/setup?), I would be more than thankful. Thanks in advance!
OK. While going through the source code, I found that there's a default "wait time" of 60 seconds after a receiver is started. This can be lowered by setting TryTimeout on ServiceBusClientOptions.RetryOptions (a ServiceBusRetryOptions instance).
See:
ServiceBusRetryOptions
AmqpReceiver.ReceiveMessagesAsyncInternal
I tested this, it works as expected:
var clientOptions = new ServiceBusClientOptions();
clientOptions.RetryOptions.TryTimeout = new TimeSpan(0, 0, 5);
await using var client = new ServiceBusClient(connectionString, clientOptions);
This sets the timeout to 5 seconds.
var msgs = new List<string> { "msg1", "msg2", "msg3" };
var tasks = new List<Task>();
foreach (var msg in msgs)
{
    tasks.Add(_producer.ProduceAsync(...));
}
var deliveryReports = Task.WhenAll(tasks).Result;
My Kafka producer config:
Batch size: 10
Linger: 100 ms
My question is: do the tasks get completed in the order they were created? Can I guarantee that the task representing msg1 completes before the task representing msg2 or msg3?
Thanks.
OK, I think I now understand how the producer and the broker work together to achieve ordering.
When ProduceAsync is called, it adds the message to the send buffer, creates a promise that will later complete a future, and returns that future. In .NET terms, it creates a TaskCompletionSource and returns its Task.
The client library (librdkafka) waits until it has received the configured number of messages or the timeout period has elapsed, then batches the messages. A batch is created containing the messages in the same order as in the send buffer. The batch is partitioned (randomly if the default partitioner is used) based on destination partitions/topics, i.e. split into smaller batches. Each post-split batch is sent to its respective leader broker/ISR (the individual send()s happen sequentially), and each is acked by its respective leader broker according to request.required.acks. The client library invokes a callback on each ack it receives, and the callback completes the corresponding future, i.e. it sets the result on the TaskCompletionSource.
There are a couple of things here.
First, librdkafka can manage retries for you, and by default it does ('retries' is set to 2), so this can cause reordering of message delivery and delivery reports. To ensure this doesn't happen, you can set 'max.in.flight' to 1 (or set 'retries' to 0 and manage retries yourself).
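As a configuration fragment, the setting described above might look like the following. This assumes the dictionary-style configuration used by pre-1.0 versions of the .NET client; the broker address is a placeholder, and how the dictionary is passed to the producer depends on the client version in use.

```csharp
using System.Collections.Generic;

// Assumption: dictionary-style configuration (pre-1.0 .NET client).
// "localhost:9092" is a placeholder broker address.
var config = new Dictionary<string, object>
{
    { "bootstrap.servers", "localhost:9092" },
    // Allow only one in-flight request per broker connection,
    // so a retried batch cannot overtake a later one.
    { "max.in.flight", 1 },
};
```

Limiting in-flight requests to 1 trades throughput for ordering, so it is worth measuring before adopting it.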
With librdkafka configured to supply delivery-reports back to .net in the order the messages are sent, the question becomes one of Task completion ordering guarantees. I need to think about this for more than 5 minutes to give a good answer, but for now assume ordering is not guaranteed (I will write more later). You can get guaranteed ordering by using the variants of ProduceAsync that accept an IDeliveryReport handler. Note that in version 1.0, these methods will be changed somewhat and will be called BeginProduce.
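One thing that can be stated with certainty is independent of Kafka: Task.WhenAll returns its results in the order the tasks were supplied, not in the order they completed. The sketch below shows only that property. TaskCompletionSource objects stand in for pending delivery reports, and WhenAllOrderingDemo is a hypothetical name.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllOrderingDemo
{
    public static async Task<string[]> RunAsync()
    {
        // Three TaskCompletionSources stand in for three pending delivery reports.
        var first = new TaskCompletionSource<string>();
        var second = new TaskCompletionSource<string>();
        var third = new TaskCompletionSource<string>();

        Task<string[]> all = Task.WhenAll(first.Task, second.Task, third.Task);

        // Complete them in reverse order.
        third.SetResult("msg3");
        second.SetResult("msg2");
        first.SetResult("msg1");

        // The result array follows the order in which the tasks were passed to WhenAll.
        return await all;
    }
}
```

The returned array is msg1, msg2, msg3 even though the tasks completed in reverse. This says nothing about which task completes first, so code that depends on completion order still needs the measures described above.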
I use the Google Translate API from C# via "Google.Apis.Translate.v2" version 1.9.2.410, with the paid service.
The code looks something like this:
var GoogleService = new Google.Apis.Translate.v2.TranslateService(
    new BaseClientService.Initializer
    {
        ApiKey = Context.ConfigData.GoogleApiKey,
        ApplicationName = "Translator"
    });
...
var rqr = GoogleService.Translations.List(item, "de");
rqr.Source = "cs";
var result = await rqr.ExecuteAsync();
This code throws an exception:
User Rate Limit Exceeded [403] Errors [ Message[User Rate Limit
Exceeded] Location[ - ] Reason[userRateLimitExceeded]
Domain[usageLimits] ]
This never happened before. My limits are:
Total quota: 50 000 000 characters/day
Remaining: 49 344 849 characters/day (98,69 % of total)
Per-user limit: 100 requests/second/user
The number of requests is certainly less than 100 per second.
What is wrong?
There is an existing undocumented quota for Translate API. This quota limits the number of characters per 100 seconds per user to 10,000 (aka 10,000 chars/100seconds/user).
This means that, even if you split large texts into separate requests, you won't be able to exceed 10,000 characters within a 100-second interval.
Brief examples:
If you exceed 10k characters within the first 5 seconds, you will need to wait 95 seconds before continuing.
If you hit this quota after 50 seconds, you will need to wait another 50.
If you hit it at the 99th second, you will need to wait 1 second to continue the work.
What I would recommend is to always catch exceptions, and retry a number of times doing an exponential backoff. The idea is that if the server is down temporarily due to hitting the 100-seconds interval quota, it is not overwhelmed with requests hitting at the same time until it comes back up (and therefore returning 403 errors continuously). You can see a brief explanation of this practice here (the sample is focused on Drive API, but the same concepts apply to every cloud-based service).
Alternatively, you could catch exceptions and, whenever you get a 403 error, wait 100 seconds and retry. This won't be the most time-efficient solution, as the 100-second intervals are continuous (they do not start when the quota is reached), but it will ensure that you don't hit the limit twice with the same request.
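The retry-with-exponential-backoff approach recommended above can be sketched as follows. Backoff, DelayFor and RetryAsync are hypothetical names, not part of the Google client library, and the predicate that recognizes a rate-limit error is left to the caller. Production code would usually add random jitter to the delay as well.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Google client library.
public static class Backoff
{
    // Delay before retry number `attempt` (0-based): baseDelay * 2^attempt, capped at maxDelay.
    public static TimeSpan DelayFor(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        double ms = baseDelay.TotalMilliseconds * Math.Pow(2, attempt);
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }

    // Runs `operation`, retrying up to maxAttempts times in total while
    // `isRetryable` accepts the exception; the last failure is rethrown.
    public static async Task<T> RetryAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> isRetryable,
        int maxAttempts,
        TimeSpan baseDelay,
        TimeSpan maxDelay)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts - 1 && isRetryable(ex))
            {
                await Task.Delay(DelayFor(attempt, baseDelay, maxDelay));
            }
        }
    }
}
```

For the quota described above, a maxDelay of 100 seconds would be a natural cap, since waiting longer than one quota interval gains nothing. The operation passed in would wrap the translation call, for example () => rqr.ExecuteAsync().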
I have a console application to read all the brokered messages present in the subscription on the Azure Service Bus. I have around 3500 messages in there. This is my code to read the messages:
SubscriptionClient client = messagingFactory.CreateSubscriptionClient(topic, subscription);
long count = namespaceManager.GetSubscription(topic, subscription).MessageCountDetails.ActiveMessageCount;
Console.WriteLine("Total messages to process : {0}", count.ToString()); //Here the number is showing correctly
IEnumerable<BrokeredMessage> dlIE = null;
dlIE = client.ReceiveBatch(Convert.ToInt32(count));
When I execute the code, dlIE contains only 256 messages. I have also tried setting the prefetch count via client.PrefetchCount, but it still returns only 256 messages.
I think there is some limit to the number of messages that can be retrieved at a time. However, no such limit is mentioned on the MSDN page for the ReceiveBatch method. What can I do to retrieve all messages at once?
Note:
I only want to read the message and then let it exist on the service bus. Therefore I do not use message.complete method.
I cannot remove and re-create the topic/subscription from the Service Bus.
Edit:
I used PeekBatch instead of ReceiveBatch like this:
IEnumerable<BrokeredMessage> dlIE = null;
List<BrokeredMessage> bmList = new List<BrokeredMessage>();
long i = 0;
dlIE = subsciptionClient.PeekBatch(Convert.ToInt32(count)); // count is the total number of messages in the subscription.
bmList.AddRange(dlIE);
i = dlIE.Count();
if (i < count)
{
    while (i < count)
    {
        IEnumerable<BrokeredMessage> dlTemp = null;
        dlTemp = subsciptionClient.PeekBatch(i, Convert.ToInt32(count));
        bmList.AddRange(dlTemp);
        i = i + dlTemp.Count();
    }
}
I have 3255 messages in the subscription. The first time PeekBatch is called, it gets 250 messages, so the code goes into the while loop with PeekBatch(250,3225). Every call returns only 250 messages. The output list ends up with 3500 messages in total, including duplicates. I am not able to understand how this is happening.
I have figured it out. The subscription client remembers the last batch it retrieved and when called again, retrieves the next batch.
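So the code would be:
List<BrokeredMessage> bmList = new List<BrokeredMessage>();
long i = 0;
while (i < count)
{
    // Each call continues after the last message peeked by this client.
    List<BrokeredMessage> batch = subsciptionClient.PeekBatch(Convert.ToInt32(count)).ToList();
    if (batch.Count == 0)
    {
        break; // nothing left to peek; avoids looping forever if count is stale
    }
    bmList.AddRange(batch);
    i = i + batch.Count;
}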
Thanks to MikeWo for guidance
Note: There seems to be some kind of size limit on the number of messages you can peek at a time. I tried with different subscriptions, and the number of messages fetched was different for each.
Is the topic you are writing to partitioned by chance? When you receive messages from a partitioned entity it will only fetch from one of the partitions at a time. From MSDN:
"When a client wants to receive a message from a partitioned queue, or from a subscription of a partitioned topic, Service Bus queries all fragments for messages, then returns the first message that is returned from any of the messaging stores to the receiver. Service Bus caches the other messages and returns them when it receives additional receive requests. A receiving client is not aware of the partitioning; the client-facing behavior of a partitioned queue or topic (for example, read, complete, defer, deadletter, prefetching) is identical to the behavior of a regular entity."
It's probably not a good idea to assume, even with a non-partitioned entity, that you'd get all messages in one go with either the Receive or Peek methods. It would be much more efficient to loop through the messages in much smaller batches, especially if your messages are of any decent size or are indeterminate in size.
Since you don't actually want to remove the messages from the queue, I'd suggest using PeekBatch instead of ReceiveBatch. This gets you a copy of each message and doesn't lock it. I'd highly suggest a loop that uses the same SubscriptionClient in conjunction with PeekBatch. Under the hood, the client keeps the last pulled sequence number, so as you loop it keeps track of its position and works through the whole queue. This would essentially let you read through the entire queue.
I came across a similar issue where client.ReceiveBatchAsync(....) would not retrieve any data from the subscription in the Azure Service Bus.
After some digging around, I found out that each subscription has a flag that enables batch operations. This can only be enabled through PowerShell. Below is the command I used:
$subObject = Get-AzureRmServiceBusSubscription -ResourceGroup '#resourceName' -NamespaceName '#namespaceName' -Topic '#topicName' -SubscriptionName '#subscriptionName'
$subObject.EnableBatchedOperations = $True
Set-AzureRmServiceBusSubscription -ResourceGroup '#resourceName' -NamespaceName '#namespaceName' -Topic '#topicName' -SubscriptionObj $subObject
More details can be found here. While it still didn't load all the messages at least it started to clear the queue. As far as I'm aware, the batch size parameter is only there as a suggestion to the service bus but not a rule.
Hope it helps!