Bloomberglp.Blpapi.RequestQueueOverflowException: Queue Size: 128 - c#

I can't figure out what the problem is with the Bloomberg API.
Every time I try to download historical financial data, that is, create a DataRequest for 5000 instruments over 3 days, once in euros and once in local currency, I get this queue exception.
What's really confusing is that the program still completes the first request, which contains the euro prices for the instruments, but not the second.
Thanks for the help.

Well, since you are not getting a slow consumer warning, I'd bet you are just requesting too much data in one request.
Try splitting your request into several chunks.
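As a rough sketch of the chunking suggested above, the helper below splits a list of instrument identifiers into batches. It deliberately contains no Bloomberg API calls; the class name RequestBatcher and the batch size are assumptions for illustration, and each batch would be turned into its own request by your existing code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: not part of the Bloomberg API.
public static class RequestBatcher
{
    // Splits items into consecutive batches of at most batchSize entries.
    public static List<List<string>> Split(IReadOnlyList<string> items, int batchSize)
    {
        if (items == null) throw new ArgumentNullException(nameof(items));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batches = new List<List<string>>();
        for (int start = 0; start < items.Count; start += batchSize)
        {
            int count = Math.Min(batchSize, items.Count - start);
            var batch = new List<string>(count);
            for (int i = 0; i < count; i++)
            {
                batch.Add(items[start + i]);
            }
            batches.Add(batch);
        }
        return batches;
    }
}
```

With 5000 instruments and an assumed batch size of 500, this yields 10 batches; the right batch size depends on how many fields and dates each request carries, so it is worth experimenting.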

The queue size for each request is 1024; if the size exceeds 1024, it will throw a RequestQueueOverflowException.

Related

Bigquery internalError when streaming data

I'm getting the following error while streaming data:
Google.ApisGoogle.Apis.Requests.RequestError
Internal Error [500]
Errors [
Message[Internal Error] Location[ - ] Reason[internalError] Domain[global]
]
My code:
public bool InsertAll(BigqueryService s, String datasetId, String tableId, List<TableDataInsertAllRequest.RowsData> data)
{
    try
    {
        TabledataResource t = s.Tabledata;
        TableDataInsertAllRequest req = new TableDataInsertAllRequest()
        {
            Kind = "bigquery#tableDataInsertAllRequest",
            Rows = data
        };
        TableDataInsertAllResponse response = t.InsertAll(req, projectId, datasetId, tableId).Execute();
        if (response.InsertErrors != null)
        {
            return true;
        }
    }
    catch (Exception e)
    {
        throw e;
    }
    return false;
}
I'm streaming data constantly and many times a day I have this error. How can I fix this?
We have seen several problems:
the request randomly fails with type 'Backend error'
the request randomly fails with type 'Connection error'
the request randomly fails with type 'timeout' (watch out here, as only some rows fail and not the whole payload)
some other error messages are non-descriptive and so vague that they don't help you; just retry.
we see hundreds of such failures each day, so they are pretty much constant and not related to Cloud health.
For all of these we opened cases with paid Google Enterprise Support, but unfortunately they didn't resolve them. It seems the recommended option is exponential backoff with retry; even support told us to do so. Also, the failure rate fits the 99.9% uptime we have in the SLA, so there are no grounds for objection.
There's something to keep in mind with regard to the SLA: it's a very strictly defined structure; the details are here. The 99.9% figure is uptime, which does not translate directly into a failure rate. This means that if BQ has 30 minutes of downtime in one month, and you do 10,000 inserts within that period but no inserts at other times of the month, the numbers will be skewed. This is why we suggest an exponential backoff algorithm. The SLA is explicitly based on uptime and not error rate, but logically the two correlate closely if you do streaming inserts throughout the month at different times with a backoff-retry setup. Technically, you should see on average about 1 failed insert in 1,000 if you insert throughout the month and have set up a proper retry mechanism.
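A minimal sketch of exponential backoff with retry follows. The class name Retry, the jitter factor, and the delay values are assumptions for illustration, not something prescribed by BigQuery or by Google support; the wrapped action would be your own call, for example the Execute() on the InsertAll request.

```csharp
using System;
using System.Threading;

// Hypothetical helper: a generic retry wrapper, not part of the Google client library.
public static class Retry
{
    // Delay before retry number `attempt` (0-based): baseDelay * 2^attempt, capped at maxDelay.
    public static TimeSpan BackoffDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        double ms = baseDelay.TotalMilliseconds * Math.Pow(2, attempt);
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }

    // Runs action up to maxAttempts times in total; the last failure is rethrown to the caller.
    public static T Execute<T>(Func<T> action, int maxAttempts, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        var random = new Random();
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception) when (attempt < maxAttempts - 1)
            {
                TimeSpan delay = BackoffDelay(attempt, baseDelay, maxDelay);
                // Assumed jitter: sleep between 50% and 100% of the computed delay
                // so that many clients do not retry in lockstep.
                double jitter = 0.5 + random.NextDouble() * 0.5;
                Thread.Sleep(TimeSpan.FromMilliseconds(delay.TotalMilliseconds * jitter));
            }
        }
    }
}
```

Since the timeout case above can fail only some rows, a real caller would probably also inspect response.InsertErrors and resend just the failed rows instead of the whole payload.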
You can check out this chart about your project health:
https://console.developers.google.com/project/YOUR-APP-ID/apiui/apiview/bigquery?tabId=usage&duration=P1D
About times. Since streaming has a limited payload size (see the Quota policy), it's easier to talk about times, as the payload is limited in the same way for both of us, but I will mention other side effects too.
We measure between 1200 and 2500 ms for each streaming request, and this was consistent over the last month, as you can see in the chart.
If the approach you've chosen takes hours, it does not scale and won't scale. You need to rethink the approach with async processes that can retry.
Processing I/O-bound or CPU-bound tasks in the background is now common practice in most web applications. There's plenty of software to help build background jobs, some of it based on a messaging system like Beanstalkd.
Basically, you need to distribute insert jobs across a closed network, prioritize them, and consume (run) them. Well, that's exactly what Beanstalkd provides.
Beanstalkd gives the possibility to organize jobs in tubes, each tube corresponding to a job type.
You need an API/producer which can put jobs on a tube, let's say a JSON representation of the row. This was a killer feature for our use case. So we have an API which gets the rows and places them on a tube; this takes just a few milliseconds, so you can achieve a fast response time.
On the other side, you now have a bunch of jobs on some tubes. You need an agent. An agent/consumer can reserve a job.
It helps you also with job management and retries: When a job is successfully processed, a consumer can delete the job from the tube. In the case of failure, the consumer can bury the job. This job will not be pushed back to the tube, but will be available for further inspection.
A consumer can release a job, Beanstalkd will push this job back in the tube, and make it available for another client.
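To make the job lifecycle described above concrete, here is a tiny in-memory model of a single tube. It is only an illustration of the put/reserve/delete/bury/release states; it is not a Beanstalkd client, it does not talk to a server, and the class name InMemoryTube is made up for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for one tube; NOT a Beanstalkd client.
// Simplification: job payloads are assumed to be unique strings.
public sealed class InMemoryTube
{
    private readonly Queue<string> ready = new Queue<string>();
    private readonly HashSet<string> reserved = new HashSet<string>();
    private readonly List<string> buried = new List<string>();

    public int ReadyCount { get { return ready.Count; } }
    public int BuriedCount { get { return buried.Count; } }

    // Producer side: place a job payload (for example a JSON row) on the tube.
    public void Put(string job) { ready.Enqueue(job); }

    // Consumer side: take the next ready job, or null when the tube is empty.
    public string Reserve()
    {
        if (ready.Count == 0) return null;
        string job = ready.Dequeue();
        reserved.Add(job);
        return job;
    }

    // Job succeeded: remove it for good.
    public void Delete(string job) { reserved.Remove(job); }

    // Job failed permanently: keep it aside for inspection, do not requeue it.
    public void Bury(string job) { if (reserved.Remove(job)) buried.Add(job); }

    // Job failed transiently: put it back so another consumer can reserve it.
    public void Release(string job) { if (reserved.Remove(job)) ready.Enqueue(job); }
}
```

A consumer loop would reserve a job, attempt the streaming insert, then call Delete on success, Release on a retryable error, and Bury when the row itself is bad.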
Beanstalkd clients can be found in most common languages, a web interface can be useful for debugging.

Socket and ports setup for high-speed audio/video streaming

I have a one-on-one connection between a server and a client. The server is streaming real-time audio/video data.
My question may sound weird, but should I use multiple ports/sockets or only one? Is it faster to use multiple ports, or does a single one offer better performance? Should I have one port only for messages, one for video and one for audio, or is it simpler to package the whole thing in a single port?
One of my current problems is that I first need to send the size of the current frame, as the size in bytes may change from one frame to the next. I'm fairly new to networking, but I haven't found any mechanism that would automatically detect the correct range for a specific object being transmitted. For example, if I send a 2934-byte packet, do I really need to tell the receiver the size of that packet?
I first tried to package the frames as fast as they were coming in, but I found that the receiving end would sometimes not get the appropriate number of bytes. Most of the time, it would read faster than I send, getting only a partial frame. What's the best way to get only the appropriate number of bytes as quickly as possible?
Or am I looking too low and there's a higher-level class/framework used to handle object transmission?
I think it is better to use an object mechanism and send data in an interleaved fashion. This mechanism may work faster than the multiple-port mechanism.
e.g.:
class Data
{
    public byte DataType;  // audio or video
    public int Size;       // size of the data buffer in bytes
    public byte[] Buffer;  // payload; its content depends on DataType
}
'DataType' and 'Size' are always of constant size. On the client side, read 'DataType' and 'Size' first, and then read the specified number of bytes of the corresponding data (audio/video).
Just making something up off the top of my head. Shove "packets" like this down the wire:
1 byte - packet type (audio or video)
2 bytes - data length
(whatever else you need)
|
| (raw data)
|
So whenever you get one of these packets on the other end, you know exactly how much data to read, and where the beginning of the next packet should start.
[430 byte audio L packet]
[430 byte audio R packet]
[1000 byte video packet]
[20 byte control packet]
[2000 byte video packet]
...
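The packet layout above can be sketched roughly as follows. This assumes a 1-byte type, a 2-byte big-endian length and then the raw data, with no extra header fields; the class name FrameCodec and the byte order are choices made for this example. The important part is the read loop: Stream.Read may return fewer bytes than requested, which is the likely cause of the partial frames described in the question.

```csharp
using System;
using System.IO;

// Hypothetical framing helper for the layout: [1 byte type][2 bytes length][payload].
public static class FrameCodec
{
    // Writes one frame; the length is stored big-endian (an assumption of this sketch).
    public static void WriteFrame(Stream stream, byte type, byte[] payload)
    {
        if (payload.Length > ushort.MaxValue)
            throw new ArgumentException("Payload does not fit in a 2-byte length field.");
        stream.WriteByte(type);
        stream.WriteByte((byte)(payload.Length >> 8));
        stream.WriteByte((byte)(payload.Length & 0xFF));
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one frame; returns false if the stream ended cleanly before a new frame started.
    public static bool TryReadFrame(Stream stream, out byte type, out byte[] payload)
    {
        type = 0;
        payload = null;
        int first = stream.ReadByte();
        if (first < 0) return false;
        type = (byte)first;
        byte[] header = ReadExactly(stream, 2);
        int length = (header[0] << 8) | header[1];
        payload = ReadExactly(stream, length);
        return true;
    }

    // Read may deliver fewer bytes than requested, so loop until count bytes have arrived.
    private static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0) throw new EndOfStreamException("Stream ended inside a frame.");
            offset += read;
        }
        return buffer;
    }
}
```

A 2-byte length caps a frame at 65535 bytes, so larger video frames would need either a wider length field or splitting across several packets.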
But why re-invent the wheel? There are protocols to do these things already.

Fragmented length prefix causes next data read from buffer use incorrect message length

I'm one of those guys who come here to find answers to questions that others have asked, and I think I have never asked anything myself, but after two days of searching unsuccessfully I decided it's time to ask something. So here it is...
I have a TCP server and client written in C#, .NET 4, asynchronous sockets using SocketAsyncEventArgs. I have a length-prefixed message framing protocol. Overall everything works just fine, but one issue keeps bugging me.
Situation is like this (I will use small numbers just as an example):
Let's say the server has a send buffer length of 16 bytes.
It sends a message which is 6 bytes long, and prefixes it with 4 bytes long length prefix. Total message length is 6+4=10.
Client reads the data and receives a buffer of 16 bytes length (yes 10 bytes of data and 6 bytes equal to zero).
Received buffer looks like this: 6 0 0 0 56 21 33 1 5 7 0 0 0 0 0 0
So I read the first 4 bytes, which are my length prefix, determine that my message is 6 bytes long, and read it as well; everything is fine so far. Then I have 16-10=6 bytes left to read. All of them are zeroes. I read 4 of them, since that is my length prefix. So it's a zero-length message, which is allowed as a keep-alive packet.
Remaining data to read: 0 0
Now the issue "kicks in". I have only 2 remaining bytes to read, which is not enough to complete a 4-byte length prefix buffer. So I read those 2 bytes and wait for more incoming data. The server is not aware that I'm still reading a length prefix (I'm just reading all those zeroes in the buffer) and sends another message, correctly prefixed with 4 bytes. The client assumes the server is sending the missing 2 bytes. I receive the data on the client side and read the first two bytes to form a complete 4-byte length buffer. The result is something like this:
lengthBuffer = new byte[4]{0, 0, 42, 0}
Which then translates into 2752512 message length. So my code will continue to read next 2752512 bytes to complete the message...
So in every single message framing example I have seen zero length messages are supported as keep-alive's. And every example I've seen doesn't do anything more than I do. The problem is that I do not know how much data I have to read when I receive it from the server. Since I have partially-filled buffer with zeroes, I have to read it all as those zeroes could be keep-alive's I sent from the other end of connection.
I could drop zero-length messages and stop reading the buffer after first empty message and it should fix this issue, and use custom messages for my keep-alive mechanism. But I want to know if I am missing something, or doing something wrong, since every code example I've seen seems to have same issue (?)
UPDATE
Marc Gravell, you sir pulled words out of my mouth. Was about to update that the issue is with sending the data. The problem is that initially when exploring .NET Sockets and SocketAsyncEventArgs I came across this sample: http://archive.msdn.microsoft.com/nclsamples/Wiki/View.aspx?title=socket%20performance
It uses reusable pool of buffers. Simply takes predefined number of maximum client connections allowed, for example 10, takes maximum single buffer size, for example 512, and creates one large buffer for all of them. So 512 * 10 * 2 (for send and receive) = 10240
So we have byte[] buff = new byte[10240];
Then for each client that connects it assigns a piece of this large buffer. First connected client gets first 512 bytes for Data Reading operations, and gets next 512 bytes (offset 512) for Data Sending operations. Therefore the code ended up having already allocated Send buffer which size is 512 (exactly the number the client later receives as BytesTransferred). This buffer is populated with data, and all remaining space out of these 512 bytes is sent as zeroes.
Strangely enough, this example is from MSDN. The reason there is a single huge buffer is to avoid fragmented heap memory, when a buffer gets pinned and the GC can't collect it, or something like that.
Comment from BufferManager.cs in the provided example (see link above):
This class creates a single large buffer which can be divided up and
assigned to SocketAsyncEventArgs objects for use with each socket I/O
operation. This enables bufffers to be easily reused and gaurds
against fragmenting heap memory.
So the issue is pretty much clear. Any suggestions on how I should resolve this are welcome :) Is it true what they say about fragmented heap memory, is it OK to create a data buffer "on the fly"? If so, will I have memory issues when the server scales to a few hundred or even thousands of clients?
I guess the problem is that you are treating the trailing zeros in the buffer you read as data. This is not data. It is garbage. No one ever sent it to you.
The Stream.Read call returns you the number of bytes actually read. You should not interpret the rest of the buffer in any way.
"The problem is that I do not know how much data I have to read when I receive it from the server."
Yes, you do: Use the return value from Stream.Read.
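One way to act on this is sketched below: an incremental parser that is fed only the bytes that actually arrived (the return value of Stream.Read, or BytesTransferred with SocketAsyncEventArgs) and never looks at the rest of the buffer. It assumes the 4-byte little-endian length prefix from the question; the class name LengthPrefixParser is invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical incremental parser for [4-byte little-endian length][body] messages.
public sealed class LengthPrefixParser
{
    private readonly byte[] prefix = new byte[4];
    private int prefixFilled;
    private byte[] body;      // null while the length prefix is still being read
    private int bodyFilled;

    // Consumes exactly `count` bytes starting at `offset`; anything beyond that
    // in `buffer` is never touched. Returns the messages completed by this call.
    public List<byte[]> Feed(byte[] buffer, int offset, int count)
    {
        var messages = new List<byte[]>();
        int end = offset + count;
        while (true)
        {
            if (body == null)
            {
                if (offset == end) break;
                int take = Math.Min(4 - prefixFilled, end - offset);
                Buffer.BlockCopy(buffer, offset, prefix, prefixFilled, take);
                prefixFilled += take;
                offset += take;
                if (prefixFilled < 4) break; // prefix is split across reads; wait for more
                int length = prefix[0] | (prefix[1] << 8) | (prefix[2] << 16) | (prefix[3] << 24);
                if (length < 0) throw new InvalidOperationException("Negative message length.");
                prefixFilled = 0;
                body = new byte[length];
                bodyFilled = 0;
            }
            int n = Math.Min(body.Length - bodyFilled, end - offset);
            if (n > 0)
            {
                Buffer.BlockCopy(buffer, offset, body, bodyFilled, n);
                bodyFilled += n;
                offset += n;
            }
            if (bodyFilled < body.Length) break; // body is split across reads; wait for more
            messages.Add(body); // zero-length bodies (keep-alives) are reported too
            body = null;
        }
        return messages;
    }
}
```

In a SocketAsyncEventArgs receive handler this would be called as parser.Feed(e.Buffer, e.Offset, e.BytesTransferred). Note that it only helps if the sender also transmits just the populated part of its buffer.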
That sounds simply like a bug in either your send or receive code. You should only get BytesTransferred as the data that was actually sent, or some number smaller than that if arriving in fragments. The first thing I would wonder is: did you set up the send correctly? i.e. if you have an oversized buffer, a correct implementation might look like:
args.SetBuffer(buffer, 0, actualBytesToSend);
if (!socket.SendAsync(args)) { /* whatever */ }
where actualBytesToSend can be much less than buffer.Length. My initial suspicion is that
you are doing something like:
args.SetBuffer(buffer, 0, buffer.Length);
and therefore sending more data than you have actually populated.
I should emphasize: there is something wrong in either your send or receive; I do not believe, at least without an example, that there is some fundamental underlying bug in the BCL here - I use the async API extensively, and it works fine - but you do need to accurately track the data you are sending and receiving at all points.
"Now server is not aware that I'm still reading length prefix (I'm just reading all those zeroes in the buffer) and sends another message correctly prefixed with 4 bytes.".
Why? How does the server know what you are and aren't reading? If the server retransmits any part of a message it is in error. TCP already does that for you.
There seems to be something radically wrong with your server.

How to download the data from the server discontinuously?

I need to download a large amount of data from the server. Because the data is so big, I am not able to download it all at once. Do you have any ideas? Thank you very much.
If the server supports it, you can use HTTP byte ranges to request specific parts of the file.
This page describes HTTP byte range requests:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35.1
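The following code creates a request which asks the server to skip the first 100 bytes and return the rest of the file. Note that Range is a restricted header on HttpWebRequest, so it is set through AddRange rather than through the Headers collection:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(@"http://example.com/somelargefile");
request.AddRange(100); // sends "Range: bytes=100-"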
The only logical way I can think about doing it is to pre-arrange the data into chunks for download with index. The index increments with the number of chunks received, so when the server sends down the file, it knows it can skip (chunkCount * chunkSize) from the byte stream and begin sending down the next chunkSize bytes.
Of course, this would mean a rather excessive number of requests, so YMMV.
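As a sketch of the chunk arithmetic described above, the helper below turns a total size and a chunk size into inclusive byte offsets, one pair per request. The class name ByteRanges is made up for this example, and it assumes the total size is known in advance (for instance from a Content-Length header, if the server provides one).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: computes which bytes each chunk request should ask for.
public static class ByteRanges
{
    // Returns inclusive (Start, End) offsets that together cover totalSize bytes.
    public static List<(long Start, long End)> Split(long totalSize, long chunkSize)
    {
        if (totalSize < 0) throw new ArgumentOutOfRangeException(nameof(totalSize));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var ranges = new List<(long Start, long End)>();
        for (long start = 0; start < totalSize; start += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            long end = Math.Min(start + chunkSize, totalSize) - 1;
            ranges.Add((start, end));
        }
        return ranges;
    }
}
```

Chunk number i starts at i * chunkSize, matching the (chunkCount * chunkSize) skip described above. On the client side each pair could be passed to HttpWebRequest.AddRange(start, end) if the server supports range requests.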
There is a Background Transfer Service code sample on MSDN that might help. I've never used it, but the sample might give you a place to start from.

real time stock quotes, StreamReader performance optimization

I am working on a program that extracts real time quote for 900+ stocks from a website. I use HttpWebRequest to send HTTP request to the site and
store the response to a stream and open a stream using the following code:
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Stream stream = response.GetResponseStream();
StreamReader reader = new StreamReader(stream);
The received HTML is large (5000+ lines), so it takes a long time to parse it and extract the price. For 900 files,
it takes about 6 minutes to parse and extract, which my boss isn't happy with; he told me he wants the whole process done in TWO minutes.
I've identified that the part of the program that takes the most time is the parsing and extracting. I've tried to optimize the code to make it faster; the following is
what I have now after some optimization:
// skip lines at the top
for (int i = 0; i < 1500; ++i)
    reader.ReadLine();
// read the line that contains the price
string theLine = reader.ReadLine();
// ... extract the price from the line
Now it takes about 4 minutes to process all the files, which is still a significant gap from what my boss expects. So I am wondering, is there another way I
can further speed up the parsing and extracting and have everything done within 2 minutes?
I was doing HTML screen scraping for a while with stock quotes, but I found that Yahoo offers a great, simple web service that is much better than loading websites.
http://www.gummy-stuff.org/Yahoo-data.htm
With this service you can request up to 100 stock quotes in a single request and it returns a csv formatted response with one line for every symbol. You can set what columns you want returned in the query string of the request. I built a small program that would query the service once a day for every stock in the stock market to get prices. It seemed to work well for me and was way faster than hitting websites for the data.
An example querystring would be
http://finance.yahoo.com/d/quotes.csv?s=GE&f=nkqwxyr1l9t5p4
Which returns text of
"GENERAL ELEC CO",32.98,"Jun 26","21.30 - 32.98","NYSE",2.66,"Jul 25",28.55,"Jul 3","-0.21%"
for(int i=0;i<1500;++i)
reader.ReadLine();
This in particular is not good. ReadLine reads each whole line and allocates a string for it, but nothing uses it: extra work for the GC. Read byte by byte and look for the line terminators 0x0D and 0x0A (\r\n) instead.
Better still, don't use StreamReader at all; it adds a lot of overhead. Read directly from the stream.
Hard to see how this is possible, StreamReader is blindingly fast compared to HttpWebRequest. Some basic assumptions: say you are downloading 900 files with 5000 lines, 100 chars each in 6 minutes. That means you need to download 900 x 5000 x 100 = 450 Megabytes. In 6 minutes, that requires a bandwidth of 450E6 / 6 / 60 * 8 = 10 Mbps.
What do you have? 10 Mbps is about typical for high-speed Internet service, although you need a server that can sustain this. To get it down to 2 minutes, you'll need to upgrade your service to 30 Mbps. Your boss can fix that.
About the speed improvement you saw: watch out for the cache.
If you really need to have real-time data fast then you should subscribe to the data feeds rather than scrape them off a site.
Alternatively, isn't there some token that you can search for to find the field/data pair(s) you need?
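A minimal sketch of the token idea: instead of skipping a fixed number of lines, search the downloaded text for a marker that precedes the price and cut out the value that follows. The class name QuoteExtractor and the marker strings in the usage note are hypothetical; the real page will have its own markup.

```csharp
using System;

// Hypothetical helper: extracts the text between two known markers.
public static class QuoteExtractor
{
    // Returns the text between the first startToken and the next endToken,
    // or null if either marker is missing.
    public static string Between(string html, string startToken, string endToken)
    {
        int start = html.IndexOf(startToken, StringComparison.Ordinal);
        if (start < 0) return null;
        start += startToken.Length;
        int end = html.IndexOf(endToken, start, StringComparison.Ordinal);
        if (end < 0) return null;
        return html.Substring(start, end - start);
    }
}
```

For example, if the price happened to be wrapped as <span id="price">32.98</span>, calling Between(html, "<span id=\"price\">", "</span>") would return "32.98". This does not depend on the price staying on line 1501, so it should survive small layout changes on the page.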
4 minutes sounds ridiculously long for reading in 900 files.
