Translate API User Rate Limit Exceeded [403] without reason - c#

I use the Google Translate API from C# via "Google.Apis.Translate.v2" version 1.9.2.410, with the paid service.
The code is something like:
var GoogleService = new Google.Apis.Translate.v2.TranslateService(
    new BaseClientService.Initializer
    {
        ApiKey = Context.ConfigData.GoogleApiKey,
        ApplicationName = "Translator"
    });
...
var rqr = GoogleService.Translations.List(item, "de");
rqr.Source = "cs";
var result = await rqr.ExecuteAsync();
This code throws an exception:
User Rate Limit Exceeded [403] Errors [ Message[User Rate Limit
Exceeded] Location[ - ] Reason[userRateLimitExceeded]
Domain[usageLimits] ]
This never happened before. My limits are:
Total quota: 50 000 000 characters/day
Remaining: 49 344 849 characters/day (98,69 % of total)
Per-user limit: 100 requests/second/user
The number of requests is certainly lower than 100 requests per second.
What is wrong?

There is an undocumented quota for the Translate API. It limits the number of characters per user to 10,000 per 100 seconds (10,000 chars/100 seconds/user).
This means that, even if you split large texts into several requests, you won't be able to exceed 10,000 characters within a 100-second interval.
Brief examples:
If you reach 10,000 characters within the first 5 seconds, you will need to wait 95 seconds before translating more characters.
If you hit this quota after 50 seconds, you will need to wait another 50.
If you hit it at the 99th second, you will need to wait 1 second to continue the work.
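The three examples follow from simple arithmetic on a fixed 100-second window. A minimal sketch, assuming the window really is fixed and 100 seconds long as described in this answer; the function name is hypothetical and the code uses C# 9 top-level statements:

```csharp
using System;

// Seconds left until the current fixed window ends.
// windowSeconds = 100 is the (undocumented) interval described in the answer.
static int SecondsUntilWindowReset(int secondsIntoWindow, int windowSeconds = 100)
{
    return windowSeconds - (secondsIntoWindow % windowSeconds);
}

Console.WriteLine(SecondsUntilWindowReset(5));  // prints 95
Console.WriteLine(SecondsUntilWindowReset(50)); // prints 50
Console.WriteLine(SecondsUntilWindowReset(99)); // prints 1
```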
What I would recommend is to always catch exceptions and retry a limited number of times with exponential backoff. The idea is that, while the service is rejecting requests because the 100-second quota has been reached, it is not hit by all the retries at the same time until the window resets (which would just keep returning 403 errors). You can see a brief explanation of this practice here (the sample focuses on the Drive API, but the same concepts apply to every cloud-based service).
Alternatively, you could catch exceptions and, whenever you get a 403 error, wait 100 seconds and retry. This won't be the most time-efficient solution, as the 100-second intervals are continuous (they do not start when the quota is reached), but it will ensure that you don't hit the limit twice with the same request.
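A minimal sketch of the retry-with-exponential-backoff idea, not tied to the Google client library. The names RetryAsync, BackoffDelay and isRetryable are hypothetical, and the caller decides which exceptions count as a rate-limit error. It uses C# 9 top-level statements:

```csharp
using System;
using System.Threading.Tasks;

// Delay before retry number `attempt` (0-based): baseDelay * 2^attempt, capped at maxDelay.
static TimeSpan BackoffDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
{
    double ms = baseDelay.TotalMilliseconds * Math.Pow(2, attempt);
    return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
}

// Runs `action` up to maxAttempts times. Only exceptions accepted by
// `isRetryable` trigger a retry; the last failure is rethrown to the caller.
static async Task<T> RetryAsync<T>(
    Func<Task<T>> action,
    Func<Exception, bool> isRetryable,
    int maxAttempts,
    TimeSpan baseDelay,
    TimeSpan maxDelay)
{
    for (int attempt = 0; ; attempt++)
    {
        try
        {
            return await action();
        }
        catch (Exception ex) when (attempt < maxAttempts - 1 && isRetryable(ex))
        {
            // Wait 1x, 2x, 4x, ... the base delay before trying again.
            await Task.Delay(BackoffDelay(attempt, baseDelay, maxDelay));
        }
    }
}
```

For the code in the question, the action would be () => rqr.ExecuteAsync(), and the predicate would test for the 403 userRateLimitExceeded error; how that error is exposed depends on the client library version, so that check is left to the caller. A 100-second cap on the delay matches the quota interval described above.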

How to crawl XML(s) very fast — considering the below networking limitations?

I have a .Net crawler that's running when the user makes a request (so, it needs to be fast). It crawls 400+ links in real time. (This is the business ask.)
The problem: I need to detect if a link is xml (think of rss or atom feeds) or html. If the link is xml then I continue with processing, but if the link is html I can skip it. Usually, I have 2 xml(s) and 398+ html(s). Currently, I have multiple threads going but the processing is still slow, usually 75 seconds running with 10 threads for 400+ links, or 280 seconds running with 1 thread. (I want to add more threads but see below..)
The challenge that I am facing is that I read the streams as follows:
var request = WebRequest.Create(requestUriString: uri.AbsoluteUri);
// ....
var response = await request.GetResponseAsync();
//....
using (var reader = new StreamReader(stream: response.GetResponseStream(), encoding: encoding)) {
    char[] buffer = new char[1024];
    await reader.ReadAsync(buffer: buffer, index: 0, count: 1024);
    responseText = new string(value: buffer);
}
// parse first bytes of responseText to check if xml
The problem is that my optimization of reading only 1024 characters is quite useless, because GetResponseAsync downloads the entire stream anyway, as far as I can see.
(The other option is to look at the Content-Type header, but AFAIK that is quite similar because I get the content anyway, unless you recommend using OPTIONS, which I have not used so far. In addition, XML might be served with an incorrect content type (?), so I would miss some content.)
If there is any optimization that I am missing, please help, as I am running out of ideas.
(I am considering spreading the load over multiple servers to balance network use against parallelism, but that is a bigger change to the current architecture than I can afford at this point in time.)
Using HEAD requests could speed up the requests significantly, if you can rely on the Content-Type.
For example:
HttpClient client = new HttpClient();
HttpResponseMessage response = await client.SendAsync(new HttpRequestMessage() { Method = HttpMethod.Head });
This just shows basic usage. Obviously you need to add the URI and anything else required to the request.
Also note that even with 10 threads, 400 requests will likely always take quite a while. 400/10 means each thread makes 40 requests sequentially. Unless the servers are close by, 200 ms would be a good response time, meaning a minimum of 8 seconds (40 x 200 ms). Slow overseas servers could easily push this out to 30-40 seconds of unavoidable delay, unless you increase the number of threads to run more of the requests in parallel.
Dataflow (Task Parallel Library) can be very helpful for writing parallel pipelines, with a convenient MaxDegreeOfParallelism property for easily adjusting how many parallel instances can run.
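A slightly fuller sketch of the HEAD approach, assuming the servers report a trustworthy Content-Type. The helper names LooksLikeXml and IsXmlAsync and the list of accepted media types are illustrative choices, not part of any library. It uses C# 9 top-level statements:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// True for media types that usually carry XML feeds, e.g. application/rss+xml.
// XHTML is excluded because it is an HTML page, not a feed.
static bool LooksLikeXml(string mediaType)
{
    if (string.IsNullOrEmpty(mediaType)) return false;
    string t = mediaType.Trim().ToLowerInvariant();
    if (t == "application/xhtml+xml") return false;
    return t == "text/xml" || t == "application/xml" || t.EndsWith("+xml");
}

// Sends a HEAD request, so only the headers are transferred, then inspects Content-Type.
static async Task<bool> IsXmlAsync(HttpClient client, Uri uri)
{
    using (var request = new HttpRequestMessage(HttpMethod.Head, uri))
    using (var response = await client.SendAsync(request))
    {
        return LooksLikeXml(response.Content?.Headers.ContentType?.MediaType);
    }
}
```

Links for which IsXmlAsync returns true can then be fetched with a normal GET. Servers that mislabel feeds as text/html would still be missed, which is the risk raised in the question.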

How to Determine Azure ServiceBus PrefetchCount and ReceiveBatch Size

I have a queue processor that is retrieving all messages from a ServiceBus Queue. I am wondering how I should determine the MessageReceiver PrefetchCount and the ReceiveBatch messageCount to optimize performance. I am currently setting these to the arbitrary number 500, as seen below:
var receiverFactory = MessagingFactory.CreateFromConnectionString("ConnectionString");
var receiver = await receiverFactory.CreateMessageReceiverAsync("QueueName", ReceiveMode.PeekLock);
receiver.PrefetchCount = 500;
bool loopBatch = true;
while (loopBatch)
{
    var tempMessages = await receiver.ReceiveBatchAsync(500, TimeSpan.FromSeconds(1));
    // Do some message processing...
    loopBatch = tempMessages.Any();
}
When running, I see that my batches often take time to "warm up," retrieving counts such as "1, 1, 1, 1, 1, 1, 1, 1, 125, 125, 125, 125..." where the batch retrieval number suddenly jumps much higher.
From the Prefetching optimization docs:
When using the default lock expiration of 60 seconds, a good value for SubscriptionClient.PrefetchCount is 20 times the maximum processing rates of all receivers of the factory. For example, a factory creates 3 receivers, and each receiver can process up to 10 messages per second. The prefetch count should not exceed 20 X 3 X 10 = 600. By default, QueueClient.PrefetchCount is set to 0, which means that no additional messages are fetched from the service.
I don't really understand how to determine the receiver's "messages per second" when the batch retrieval seems to retrieve widely-varying numbers of messages at a time. Any assistance would be greatly appreciated.
"I don't really understand how to determine the receiver's "messages per second" when the batch retrieval seems to retrieve widely-varying numbers of messages at a time."
Prefetch makes more sense when the OnMessage API is used. In that scenario, a callback is registered that takes a single message for processing, and you can estimate the average processing time of a message. The OnMessage API lets you define how many concurrent callbacks will run. It would be extremely inefficient to retrieve messages one by one when there is a constant flow of incoming messages. Hence, PrefetchCount specifies how many messages the client should retrieve in a "batch" in the background, to save round trips to the server.
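The rule quoted from the docs is plain multiplication, and the processing rate can be estimated from measured throughput rather than from batch sizes. A minimal sketch; the function names are hypothetical and the factor of 20 is the one given in the quoted documentation for the default 60-second lock. It uses C# 9 top-level statements:

```csharp
using System;

// Upper bound from the quoted docs: 20 x (number of receivers) x (messages per second per receiver).
static int MaxPrefetchCount(int receivers, int messagesPerSecondPerReceiver, int factor = 20)
{
    return checked(factor * receivers * messagesPerSecondPerReceiver);
}

// Processing rate estimated from how many messages were fully processed in a measured interval.
static double MessagesPerSecond(int processedMessages, TimeSpan elapsed)
{
    return processedMessages / elapsed.TotalSeconds;
}

Console.WriteLine(MaxPrefetchCount(3, 10)); // prints 600, the example from the docs
```

The elapsed time would typically come from a Stopwatch wrapped around the processing loop, so the rate reflects how fast messages are handled, not how many arrive per batch.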

Why is this eating memory?

I wrote an application whose purpose is to read logs from a large table (90 million) and process them into easily understandable stats, how many, how long etc.
The first run took 7.5 hours and only had to process 27 of the 90 million. I would like to speed this up. So I am trying to run the queries in parallel. But when I run the below code, within a couple minutes I crash with an Out of Memory exception.
Environments:
Sync
Test : 26 Applications, 15 million logs, 5 million retrieved, < 20mb, takes 20 seconds
Production: 56 Applications, 90 million logs, 27 million retrieved, < 30mb, takes 7.5 hours
Async
Test : 26 Applications, 15 million logs, 5 million retrieved, < 20mb, takes 3 seconds
Production: 56 Applications, 90 million logs, 27 million retrieved, Memory Exception
public void Run()
{
    List<Application> apps;
    //Query for apps
    using (var ctx = new MyContext())
    {
        apps = ctx.Applications.Where(x => x.Type == "TypeIWant").ToList();
    }
    var tasks = new Task[apps.Count];
    for (int i = 0; i < apps.Count; i++)
    {
        var app = apps[i];
        tasks[i] = Task.Run(() => Process(app));
    }
    //try catch
    Task.WaitAll(tasks);
}
public void Process(Application app)
{
    //Query for logs for time period
    using (var ctx = new MyContext())
    {
        var logs = ctx.Logs.Where(l => l.Id == app.Id).AsNoTracking();
        foreach (var log in logs)
        {
            Interlocked.Increment(ref _totalLogsRead);
            var l = log;
            Task.Run(() => ProcessLog(l, app.Id));
        }
    }
}
Is it ill advised to create 56 contexts?
Do I need to dispose and re-create contexts after a certain number of logs retrieved?
Perhaps I'm misunderstanding how the IQueryable is working? <-- My Guess
My understanding is that it will retrieve logs as needed; I guess that means that, for the loop, it behaves like a yield? Or is my issue that 56 'threads' call the database and I am storing 27 million logs in memory?
Side question
The results don't really scale together. Based on the Test environment results, I would expect Production to take only a few minutes. I assume the increase is directly related to the number of records in the table.
With 27 million rows, the problem is one of stream processing, not parallel execution. You need to approach the problem as you would with SQL Server's SSIS or any other ETL tool: each processing step is a transformation that processes its input and sends its output to the next step.
Parallel processing is achieved by using a separate thread to run each step. Some steps could also use multiple threads to process multiple inputs, up to a limit. Setting limits on each step's thread count and input buffer ensures you can achieve maximum throughput without flooding your machine with waiting tasks.
.NET's TPL Dataflow addresses exactly this scenario. It provides blocks to transform inputs to outputs (TransformBlock), split collections into individual messages (TransformManyBlock), execute actions without transformations (ActionBlock), combine data in batches (BatchBlock), etc.
You can also specify the maximum degree of parallelism for each step so that, e.g., only one log query executes at a time while 10 tasks are used for log processing.
In your case, you could:
1. Start with a TransformManyBlock that receives an application type and returns a list of app IDs.
2. Use a TransformBlock that reads the logs for a specific ID and sends them downstream.
3. Use an ActionBlock that processes the batch.
Step #3 could be broken into many other steps. E.g., if you don't need to process all of an app's log entries together, you can use a step to process individual entries. Or you could first group them by date.
Another option is to create a custom block that reads data from the database using a DbDataReader and posts each entry to the next step immediately, instead of waiting for all rows to return. This would allow you to process each entry as it arrives.
If each app log contains many entries, this could be a huge memory and time saver.
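A minimal sketch of such a pipeline, in the variant that processes individual log entries rather than a batch. LoadAppIds and LoadLogs are hypothetical stand-ins for the Entity Framework queries in the question, and the parallelism and capacity values are illustrative. On .NET Framework, Dataflow comes from the System.Threading.Tasks.Dataflow NuGet package. The code uses C# 9 top-level statements:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-ins for the database queries: 5 apps with 100 log entries each.
static IEnumerable<int> LoadAppIds(string appType) => Enumerable.Range(1, 5);
static IEnumerable<string> LoadLogs(int appId) =>
    Enumerable.Range(1, 100).Select(n => $"app {appId} log {n}");

static async Task<int> RunPipelineAsync(string appType)
{
    int processed = 0;
    var link = new DataflowLinkOptions { PropagateCompletion = true };

    // Step 1: application type -> individual app IDs.
    var findApps = new TransformManyBlock<string, int>(type => LoadAppIds(type));

    // Step 2: app ID -> individual log entries, one query at a time.
    var readLogs = new TransformManyBlock<int, string>(
        id => LoadLogs(id),
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 });

    // Step 3: process entries with up to 10 parallel workers. BoundedCapacity
    // caps how many entries this block buffers at once.
    var processLog = new ActionBlock<string>(
        log => { Interlocked.Increment(ref processed); },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10, BoundedCapacity = 100 });

    findApps.LinkTo(readLogs, link);
    readLogs.LinkTo(processLog, link);

    await findApps.SendAsync(appType);
    findApps.Complete();            // no more input; completion flows down the links
    await processLog.Completion;    // wait until every entry has been processed
    return processed;
}

Console.WriteLine(await RunPipelineAsync("TypeIWant")); // prints 500
```

Unlike the fire-and-forget Task.Run calls in the question, the pipeline is awaited to completion and the number of in-flight work items is bounded by the block options.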

Boosting performance on async web calls

Background: I must call a web service 1500 times, and each call takes roughly 1.3 seconds to complete. (I have no control over this 3rd party API.) Total time = 1500 * 1.3 = 1950 seconds / 60 seconds = 32 minutes roughly.
I came up with what I thought was a good solution; however, it did not pan out that well.
I changed the calls to async web calls, thinking this would dramatically help my results. It did not.
Example Code:
Pre-Optimizations:
foreach (var elmKeyDataElementNamed in findResponse.Keys)
{
    var getRequest = new ElementMasterGetRequest
    {
        Key = new elmFullKey
        {
            CmpCode = CodaServiceSettings.CompanyCode,
            Code = elmKeyDataElementNamed.Code,
            Level = filterLevel
        }
    };
    ElementMasterGetResponse getResponse;
    _elementMasterServiceClient.Get(new MasterOptions(), getRequest, out getResponse);
    elementList.Add(new CodaElement { Element = getResponse.Element, SearchCode = filterCode });
}
With Optimizations:
var tasks = findResponse.Keys.Select(elmKeyDataElementNamed => new ElementMasterGetRequest
{
    Key = new elmFullKey
    {
        CmpCode = CodaServiceSettings.CompanyCode,
        Code = elmKeyDataElementNamed.Code,
        Level = filterLevel
    }
}).Select(getRequest => _elementMasterServiceClient.GetAsync(new MasterOptions(), getRequest)).ToList();
Task.WaitAll(tasks.ToArray());
elementList.AddRange(tasks.Select(p => new CodaElement
{
    Element = p.Result.GetResponse.Element,
    SearchCode = filterCode
}));
Smaller Sampling Example:
To test this easily, I took a smaller sample of 40 records. It took 60 seconds with no optimizations; with the optimizations it took only 50 seconds. I would have thought it would be closer to 30 or better.
I used Wireshark to watch the transactions come through and realized the async version was not sending as fast as I assumed it would.
[Image: async requests captured]
[Image: normal, no optimization]
You can see that the async version pushes a few requests very fast and then drops off...
Also note that nearly 3 seconds passed between requests 10 and 11.
Is the overhead of creating threads for the tasks so high that it takes seconds?
Note: The tasks I am referring to are from the .NET 4.5 TAP task library.
Why wouldn't the requests come faster than that?
I was told the Apache web server I was hitting could handle 200 threads at most, so I don't see an issue there.
Am I not thinking about this clearly?
Is there little advantage to async requests when calling web services?
Do I have a mistake in my code?
Any ideas would be great.
After many days of searching I found this post that solved my problem:
Trying to run multiple HTTP requests in parallel, but being limited by Windows (registry)
The requests were not hitting the server faster because of my client-side code; it had nothing to do with the server. By default, .NET allows only 2 concurrent connections per host.
See here: http://msdn.microsoft.com/en-us/library/system.net.servicepointmanager.defaultconnectionlimit.aspx
I simply added this line of code and then all request came through in milliseconds.
System.Net.ServicePointManager.DefaultConnectionLimit = 50;

Timeout for Web Request

What is a reasonable amount of time to wait for a web request to return? I know this is maybe a little loaded as a question, but all I am trying to do is verify if a web page is available.
Maybe there is a better way?
try
{
    // Create the web request
    HttpWebRequest request = WebRequest.Create(this.getUri()) as HttpWebRequest;
    if (request != null)
    {
        request.Credentials = System.Net.CredentialCache.DefaultCredentials;
        // 2 minutes for timeout
        request.Timeout = 120 * 1000;
        // Get response
        response = request.GetResponse() as HttpWebResponse;
        connectedToUrl = processResponseCode(response);
    }
    else
    {
        logger.Fatal(getFatalMessage());
        string error = string.Empty;
    }
}
catch (WebException we)
{
    ...
}
catch (Exception e)
{
    ...
}
You need to consider how long the consumer of the web service is going to take. E.g., if you are connecting to a DB web server and you run a lengthy query, you need to make the web service timeout longer than the time the query will take. Otherwise, the web service will (erroneously) time out.
I also use something like (consumer time) + 10 seconds.
Offhand I'd allow 10 seconds, but it really depends on what kind of network connection the code will be running with. Try running some test pings over a period of a few days/weeks to see what the typical response time is.
I would measure how long it takes for pages that do exist to respond. If they all respond in about the same amount of time, then I would set the timeout period to approximately double that amount.
Just wanted to add that a lot of the time I'll use an adaptive timeout. Could be a simple metric like:
period += (numTimeouts/numRequests > .01 ? someConstant: 0);
checked whenever you hit a timeout to try and keep timeouts under 1% (for example). Just be careful about decrementing it too low :)
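The one-line metric above can be written out as a small function. This is a sketch under assumptions: the name NextTimeout, the step size and the upper bound are illustrative, and the division is done in floating point because dividing two integer counters would truncate to 0. It uses C# 9 top-level statements:

```csharp
using System;

// Returns the timeout to use next: grows by `step` while the observed
// timeout rate is above `targetRate` (1% by default), never beyond `max`.
static TimeSpan NextTimeout(
    TimeSpan current, int numTimeouts, int numRequests,
    TimeSpan step, TimeSpan max, double targetRate = 0.01)
{
    if (numRequests <= 0) return current;
    double rate = (double)numTimeouts / numRequests; // floating-point, not integer, division
    if (rate <= targetRate) return current;
    TimeSpan next = current + step;
    return next > max ? max : next;
}

// 5 timeouts in 100 requests is above 1%, so a 10 s timeout grows to 12 s.
Console.WriteLine(NextTimeout(TimeSpan.FromSeconds(10), 5, 100,
    TimeSpan.FromSeconds(2), TimeSpan.FromSeconds(30)).TotalSeconds); // prints 12
```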
The reasonable amount of time to wait for a web request may differ from one server to the next. If a server is at the far end of a high-delay link then clearly it will take longer to respond than when it is in the next room. But two minutes seems like it's more than ample time for a server to respond. The default timeout value for the PING command is expressed in seconds, not minutes. I suggest you look into the timeout values that are used by networking utilities like PING or TRACERT for inspiration.
I guess this depends on two things:
network speed/load (as others wrote, using ping might give you an idea about this)
the kind of page you are calling: e.g. is it a static HTML page or is it a page which might do some time-consuming operations (DB access, etc.)
Anyway, I think 2 minutes is a lot of time. I would definitely reduce the timeout to less than 30 seconds.
I realize this doesn't directly answer your question, but then an "answer" to this question is a little tough. Anyway, a tool I've used in the past to measure page load times from various parts of the world is Gomez. It's free, and if you haven't done this kind of testing before, it might be helpful in giving you a firm idea of what typical page load times are for a given page from a given location.
I would only wait 30 seconds at most, probably closer to 15. It really depends on what you are doing and what the consequence of an unsuccessful connection is. As I am sure you know, there are lots of reasons why you could get a timeout...
