I have a small application that receives data from thousands of agents and uploads that data to another server. The data from each agent is pretty small, usually about 10 KB, but the agents write very fast, so my application has an internal 4 MB buffer. Once the buffer is full, it creates a new 4 MB buffer and passes the old one to a Task that performs the HTTP upload. The code looks like this:
lock (Locker)
{
if (input.Length > this.buffer.Length - this.bufferDataLen)
{
// Save the current buffer and create a new one.
// We should release the lock as fast as we can.
byte[] tempBuffer = this.buffer;
int oldBufferDataLen = this.bufferDataLen;
this.buffer = new byte[tempBuffer.Length]; // same size buffer, 4 MB
this.bufferDataLen = 0;
Task.Factory.StartNew(
() =>
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(this._uploadUrl);
request.Method = "POST";
request.ContentType = this._contentType;
request.ContentLength = oldBufferDataLen;
request.KeepAlive = false;
request.Proxy = null;
UploaderState state = new UploaderState(tempBuffer, oldBufferDataLen, request);
IAsyncResult result = request.BeginGetRequestStream(this.OnGetRequestStreamComplete, state);
ThreadPool.RegisterWaitForSingleObject(result.AsyncWaitHandle, this.TimeoutCallback, state, this._timeoutMs, true);
});
}
// Copy incoming data to either old buffer or new buffer
Buffer.BlockCopy(input.Buffer, 0, this.buffer, this.bufferDataLen, input.Length);
this.bufferDataLen += input.Length;
}
I expected that tempBuffer would be reclaimed by the GC. However, when I ran my application, I noticed that its memory usage increased very fast. From a memory dump, there are 305 byte[] objects on the managed heap with a total size of 469,339,928 B (inclusive size 469,339,928 B), and that dump was captured after the application had run for only a few minutes.
My question is: why didn't the GC free those byte[] buffers? Should I explicitly call GC.Collect()? In my case, should I manage a buffer pool myself?
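(If a pool does turn out to be the right answer here, a minimal sketch using System.Buffers.ArrayPool<byte> follows; the Rent/Return placement is an assumption about where buffers begin and end their life in this code, not part of the original:)
using System.Buffers;
// Rent a pooled buffer instead of allocating a fresh 4 MB array each time.
// Note: Rent may return an array larger than requested.
this.buffer = ArrayPool<byte>.Shared.Rent(4 * 1024 * 1024);
// ... and once the upload of tempBuffer has completed (e.g., at the end of
// the request callback chain), hand the buffer back to the pool:
ArrayPool<byte>.Shared.Return(tempBuffer);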
I am using Blazor WebAssembly and .NET 5.0. I need to be able to upload very large files (2-5 GB) to Azure Blob Storage by uploading the file data in staged chunks and then firing a final commit message on the blob once all blocks have been staged.
I was able to achieve this using SharedAccessSignatures and the Azure JavaScript Libraries (there are many examples available online).
However, I would like to handle this using pure C#. Where I run into an issue is that the IBrowserFile reference seems to try to load the entire file into memory rather than read just the chunk it needs for each stage of the loop.
For simplicity's sake, my example code below does not include any Azure Blob Storage code; I simply write the chunking and commit messages to the console:
@page "/"
<InputFile OnChange="OnInputFileChange" />
@code {
async Task OnInputFileChange(InputFileChangeEventArgs e)
{
try
{
var file = e.File;
int blockSize = 1 * 1024 * 1024;//1 MB Block
int counter = 0;
List<string> blockIds = new List<string>();
using (var fs = file.OpenReadStream(5000000000)) //<-- Need to go up to 5GB
{
var bytesRemaining = fs.Length;
do
{
var dataToRead = Math.Min(bytesRemaining, blockSize);
byte[] data = new byte[dataToRead];
var dataRead = fs.Read(data, 0, (int)dataToRead); // the offset is into 'data', not the file
bytesRemaining -= dataRead;
if (dataRead > 0)
{
var blockId = Convert.ToBase64String(System.Text.Encoding.UTF8.GetBytes(counter.ToString("d6")));
Console.WriteLine($"blockId:{blockId}");
Console.WriteLine(string.Format("Block {0} uploaded successfully.", counter.ToString("d6")));
blockIds.Add(blockId);
counter++;
}
}
while (bytesRemaining > 0);
Console.WriteLine("All blocks uploaded. Now committing block list.");
Console.WriteLine("Blob uploaded successfully!");
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
}
}
The first issue is that:
Synchronous reads are not supported.
So I tried:
var fs = new System.IO.MemoryStream();
await file.OpenReadStream(5000000000).CopyToAsync(fs);
using (fs)
{
...
}
But obviously I am now going to run into memory issues! And I do. The error on even a 200 KB file is:
Out of memory
And anything over 1MB:
Garbage collector could not allocate 16384u bytes of memory for major heap section.
Is there a way to read in smaller chunks of data at a time from the IBrowserFile so this can be achieved natively in client side Blazor without having to resort to JavaScript?
.NET has a nice Stream.CopyToAsync() implementation; a readable version can be found in the reference source here:
https://github.com/microsoft/referencesource/blob/master/mscorlib/system/io/stream.cs
This will copy the data from one stream to another asynchronously.
The gist of it is this:
private async Task CopyToAsyncInternal(Stream source, Stream destination, Int32 bufferSize, CancellationToken cancellationToken)
{
byte[] buffer = new byte[bufferSize];
int bytesRead;
while ((bytesRead = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken).ConfigureAwait(false)) != 0)
{
await destination.WriteAsync(buffer, 0, bytesRead, cancellationToken).ConfigureAwait(false);
}
}
(copied from link above)
Set the bufferSize to something like 4096 or a multiple of it and it should work. Other values are also possible, but blocks are usually taken as a multiple of 4 KB.
The assumption here is that you have a writable stream to which you can write the bytes asynchronously. You can modify this loop to count blocks and do other per-block work. In any case, don't use a MemoryStream, client side or server side, with large files.
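Applied to the question's code, a minimal sketch might look like this (the Azure staging and commit calls are still omitted, as in the original example, and the browser file stream only supports asynchronous reads, which is why ReadAsync is used):
async Task OnInputFileChange(InputFileChangeEventArgs e)
{
    var file = e.File;
    const int blockSize = 1 * 1024 * 1024; // 1 MB blocks
    int counter = 0;
    List<string> blockIds = new List<string>();
    byte[] buffer = new byte[blockSize];
    using (var fs = file.OpenReadStream(maxAllowedSize: 5_000_000_000))
    {
        int bytesRead;
        // ReadAsync may return fewer bytes than requested; stage exactly bytesRead bytes.
        while ((bytesRead = await fs.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            var blockId = Convert.ToBase64String(System.Text.Encoding.UTF8.GetBytes(counter.ToString("d6")));
            Console.WriteLine($"blockId:{blockId} ({bytesRead} bytes)");
            blockIds.Add(blockId);
            counter++;
        }
    }
    Console.WriteLine("All blocks read. Now commit the block list.");
}
Only the one fixed-size buffer is alive at a time, so memory stays flat regardless of file size.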
I have a Web API, hosted on IIS, that returns a 4 MB memory buffer through the StreamContent class.
public class RestPerfController : ApiController
{
byte[] data = new byte[4 * 1024 * 1024]; // 4MB
public HttpResponseMessage Get()
{
return new HttpResponseMessage(HttpStatusCode.OK)
{
Content = new StreamContent(new MemoryStream(data))
};
}
}
I also have a .NET client running on another machine that GETs the data 128 times in a loop and calculates the average latency.
static void Main(string[] args)
{
Stopwatch timer = new Stopwatch();
const int FourMB = 4 * 1024 * 1024;
byte[] buffer = new byte[FourMB];
for (int i = 0; i < 128; ++i)
{
// Create the request
var request = WebRequest.Create("https://<IpAddress>/RestPerfController/") as HttpWebRequest;
request.Method = "GET";
request.ContentLength = 0;
// Start the timer
timer.Restart();
// Download the response
WebResponse response = request.GetResponse();
var responseStream = response.GetResponseStream();
long bytesRead = 0;
do
{
bytesRead = responseStream.Read(buffer, 0, FourMB);
}
while (bytesRead > 0);
Console.WriteLine(timer.ElapsedMilliseconds);
}
}
The client and server are connected through a 10Gbps LAN.
Using default settings, the client sees an average latency of 90ms.
Then I changed the server code to use PushStreamContent instead of StreamContent:
return new HttpResponseMessage(HttpStatusCode.OK)
{
Content = //new StreamContent(new MemoryStream(data))
new PushStreamContent(async (s, hc, tc) =>
{
await s.WriteAsync(data, 0, data.Length);
s.Close();
},
"application/octet-stream")
};
This caused the average latency on the client to drop from 90 ms to 50 ms.
Why is PushStreamContent almost twice as fast as StreamContent?
Is there a way of reducing the latency even further on the client? 50 ms also seems pretty high for a 4 MB transfer on a 10 Gigabit LAN.
EDIT: When I used http instead of https, the latency dropped from 50ms to 18ms. So it appears a large part of the latency was coming from the use of https.
Next, I did another experiment using ntttcp
Server: ntttcp.exe -r -m 1,*,<ipaddress> -rb 2M -a 2 -t 15
Client: ntttcp.exe -s -m 1,*,<ipaddress> -l 4M -a 2 -t 15
This showed an average latency of 11.4 ms for 4 MB transfers, which I believe is the fastest I can get from TCP.
Since I am constrained to use https, I am interested in knowing if there are ways to bring down the 50ms latency.
Did you try working with a buffer smaller than 4 MB? I think it's too large and may cause a system bottleneck. Remember that, at this rate, virtual-memory/paging operations may occur if RAM is not available. Try something like 32 KB-256 KB.
The problem may be not in the LAN itself but in how Windows manages data at this rate.
PushStreamContent forces the system to transmit the buffer, pausing some other activities - a kind of high priority for streams. The risk is errors that can occur if the stream is not well aligned/complete (the data itself).
Another difference is that some network checks are performed internally by StreamContent and not by PushStreamContent. As the name says, you're forcing the communication (a kind of "transmit anyway" order).
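In the client's read loop above, that just means shrinking the buffer; a sketch with 64 KB (an arbitrary value within the suggested range):
byte[] buffer = new byte[64 * 1024]; // 32 KB-256 KB instead of 4 MB
long bytesRead;
do
{
    bytesRead = responseStream.Read(buffer, 0, buffer.Length);
}
while (bytesRead > 0);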
I have a program where I send data over a TCP link. I am using asynchronous reads and writes to both the disk and the network. If I put a DeflateStream in the middle (so I compress before writing to the network link, and decompress when I receive the data and write it out to the disk), I am CPU bound on the compressing side. This caps my maximum transfer rate at about 300 KB/s. However, if I remove the compression step, I am I/O bound to the disk and get transfer rates of 40,000 KB/s.
Under strictly LAN conditions my upper I/O limit will always be more than 300 KB/s; however, if my program is run over the internet I may well have a network I/O limit below 300 KB/s.
I would like to detect whether I am I/O bound, with the network/disk link as the limiting factor, or CPU bound, with the act of compressing slowing me down most. How can I detect at runtime whether my program is being limited by CPU or by I/O, so I can switch protocols and get the best possible transfer rate?
private static void SendFile(string filename, NetworkStream stream, int sendBufferSize)
{
using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 4096, FileOptions.Asynchronous | FileOptions.SequentialScan))
using (var ds = new DeflateStream(stream, CompressionMode.Compress))
{
StreamUtilities.CopyAsync(fs, ds, sendBufferSize);
}
}
public static void CopyAsync(Stream sourceStream, Stream destStream, int bufferSize = 4096)
{
Byte[] bufferA = new Byte[bufferSize];
Byte[] bufferB = new Byte[bufferSize];
IAsyncResult writeResult = null;
IAsyncResult readResult = null;
bool readBufferA = false;
int read;
readResult = sourceStream.BeginRead(bufferA, 0, bufferA.Length, null, null);
//Complete last read
while ((read = sourceStream.EndRead(readResult)) > 0)
{
if (readBufferA)
{
PerformOperations(sourceStream, destStream, bufferA, bufferB, ref readResult, ref writeResult, read);
}
else
{
PerformOperations(sourceStream, destStream, bufferB, bufferA, ref readResult, ref writeResult, read);
}
//Flip the bit on the next buffer
readBufferA = !readBufferA;
}
if (writeResult != null)
destStream.EndWrite(writeResult);
}
private static void PerformOperations(Stream sourceStream, Stream destStream, Byte[] readBuffer, Byte[] writeBuffer, ref IAsyncResult readResult, ref IAsyncResult writeResult, int bytesToWrite)
{
//Start next read
readResult = sourceStream.BeginRead(readBuffer, 0, readBuffer.Length, null, null);
//End previous write
if (writeResult != null)
destStream.EndWrite(writeResult);
writeResult = destStream.BeginWrite(writeBuffer, 0, bytesToWrite, null, null);
}
One option is to separate the two aspects out into a producer/consumer queue: your compressor writes blocks into a queue, which is then consumed by a thread that just performs IO.
That way:
You can compress while the IO is occurring, without going into asynchronous IO
You can detect whether you're CPU bound (the queue is normally empty, or briefly has one block on it) or IO bound (the queue gradually gets bigger as you compress faster than the data can be sent)
With a bit of work, you could multi-thread the compression; you'd need to keep track of block order, but that should be feasible.
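A minimal sketch of that idea using BlockingCollection<T>; note that, for simplicity, each queued block is deflated independently here, which differs from the single continuous DeflateStream in the question, and the block size and queue capacity are arbitrary:
using System;
using System.Collections.Concurrent;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

static void CompressAndSend(Stream source, Stream network, int blockSize = 64 * 1024)
{
    // Bounded queue: the producer blocks when the consumer (IO) falls behind.
    var queue = new BlockingCollection<byte[]>(boundedCapacity: 16);

    // Producer: read and compress blocks onto the queue.
    var producer = Task.Run(() =>
    {
        var buffer = new byte[blockSize];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            using (var ms = new MemoryStream())
            {
                using (var ds = new DeflateStream(ms, CompressionMode.Compress, leaveOpen: true))
                {
                    ds.Write(buffer, 0, read);
                }
                queue.Add(ms.ToArray());
            }
        }
        queue.CompleteAdding();
    });

    // Consumer: pure IO. If queue.Count stays near 0, you are CPU bound;
    // if it keeps hovering at capacity, you are IO bound.
    foreach (var block in queue.GetConsumingEnumerable())
    {
        network.Write(block, 0, block.Length);
    }

    producer.Wait();
}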
I have the following code:
const int bufferSize = 1024 * 1024;
var buffer = new byte[bufferSize];
for (int i = 0; i < 10; i++)
{
const int writesCount = 400;
using (var stream = new MemoryStream(writesCount * bufferSize))
{
for (int j = 0; j < writesCount; j++)
{
stream.Write(buffer, 0, buffer.Length);
}
stream.Close();
}
}
which I run on a 32-bit machine.
The first iteration finishes just fine and then on the next iteration I get a System.OutOfMemoryException exception on the line that news the MemoryStream.
Why isn't the previous MemoryStream memory reclaimed despite using statement? How do I force release of memory used by the MemoryStream?
I don't think the problem is the garbage collector not doing its job. If the GC is under memory pressure, it should run and reclaim the 400 MB you've just allocated.
This is more likely down to the GC not finding a contiguous 400 MB block.
Rather, an “out of memory” error happens because the process is unable
to find a large enough section of contiguous unused pages in its
virtual address space to do the requested mapping.
You should read Eric Lippert's blog entry "Out Of Memory" Does Not Refer to Physical Memory
You're far better off doing both of the below.
Reusing the memory block you've allocated (why are you creating another one of the exact same size?)
Allocating much smaller chunks (less than 85 KB each); see the sketch below
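As an illustration of the second point, a sketch that keeps every allocation below the 85,000-byte LOH threshold by storing the data as a list of small arrays (the total is the same 400 MB, so this doesn't reduce memory use; it only removes the need for one huge contiguous block):
const int ChunkSize = 64 * 1024; // < 85,000 bytes, so each array stays on the SOH
const int TotalChunks = 400 * 1024 * 1024 / ChunkSize;
var chunks = new List<byte[]>(TotalChunks);
for (int i = 0; i < TotalChunks; i++)
{
    chunks.Add(new byte[ChunkSize]); // many small SOH allocations instead of one LOH block
}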
.NET constructs two heaps, the Small Object Heap (SOH) and the Large Object Heap (LOH); see Large Object Heap Improvements in .NET 4.5 by Brandon Bray. Your MemoryStream's buffer is being allocated on the LOH, which is not compacted (defragmented) for the duration of the process, making it much more likely that multiple calls to allocate this large amount of memory will throw an OutOfMemoryException.
The CLR manages two different heaps for allocation, the small object
heap (SOH) and the large object heap (LOH). Any allocation greater
than or equal to 85,000 bytes goes on the LOH. Copying large objects
has a performance penalty, so the LOH is not compacted unlike the SOH.
Another defining characteristic is that the LOH is only collected
during a generation 2 collection. Together, these have the built-in
assumption that large object allocations are infrequent.
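A quick sketch to observe that threshold in practice (the LOH is collected with generation 2, so GC.GetGeneration reports 2 for LOH objects):
byte[] small = new byte[84_000]; // below the threshold: small object heap
byte[] large = new byte[85_000]; // at the threshold: large object heap
Console.WriteLine(GC.GetGeneration(small)); // typically 0 (fresh SOH allocation)
Console.WriteLine(GC.GetGeneration(large)); // 2 (LOH objects report generation 2)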
Looks like you're allocating more than your system can handle. Your code runs fine on my machine, but if I change it like this:
const int bufferSize = 1024 * 1024 * 2;
I get the same error as you.
But if I change the target processor to x64, then the code runs, which seems logical as you can address a lot more memory.
Detailed explanation in this article: http://www.guylangston.net/blog/Article/MaxMemory
And some information on this question : Maximum Memory a .NET process can allocate
First of all, Dispose() does not guarantee that memory will be released (it does not mark objects for GC collection, and in the case of MemoryStream it releases nothing, as MemoryStream has no unmanaged resources). The only reliable way to free the memory used by a MemoryStream is to lose all references to it and wait for garbage collection to occur (and if you get an OutOfMemoryException, the garbage collector has already tried and failed to free enough memory). Also, allocating such large objects (anything > 85,000 bytes) has consequences: these objects go to the large object heap (LOH), which can get fragmented (and is not compacted). As a .NET object must occupy a contiguous sequence of bytes, this can lead to a situation where you have enough memory overall but no room for a large object. The garbage collector won't help in this case.
It seems the main problem here is that the reference to the stream object is kept on the stack, preventing garbage collection of the stream object (even forcing garbage collection won't help, as the GC considers the object still alive; you can check this by creating a WeakReference to it). Refactoring this sample fixes it:
static void Main(string[] args)
{
const int bufferSize = 1024 * 1024 * 2;
var buffer = new byte[bufferSize];
for(int i = 0; i < 10; i++)
{
const int writesCount = 400;
Write(buffer, writesCount, bufferSize);
}
}
static void Write(byte[] buffer, int writesCount, int bufferSize)
{
using(var stream = new MemoryStream(writesCount * bufferSize))
{
for(int j = 0; j < writesCount; j++)
{
stream.Write(buffer, 0, buffer.Length);
}
}
}
Here is a sample which proves that the object can't be garbage collected:
static void Main(string[] args)
{
const int bufferSize = 1024 * 1024 * 2;
var buffer = new byte[bufferSize];
WeakReference wref = null;
for(int i = 0; i < 10; i++)
{
if(wref != null)
{
// force garbage collection
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
// check if object is still alive
Console.WriteLine(wref.IsAlive); // true
}
const int writesCount = 400;
using(var stream = new MemoryStream(writesCount * bufferSize))
{
for(int j = 0; j < writesCount; j++)
{
stream.Write(buffer, 0, buffer.Length);
}
// weak reference won't prevent garbage collection
wref = new WeakReference(stream);
}
}
}
Try forcing garbage collection only when you are sure it is necessary to clean up unreferenced objects:
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
Another alternative is to use a Stream with external storage: a FileStream, for example.
But in the general case, it would be better to use one small enough buffer (an array, allocated once) and reuse it for read/write calls. Avoid having many large objects in .NET (see CLR Inside Out: Large Object Heap Uncovered).
Update
Assuming that writesCount is a constant, why not allocate one buffer and reuse it?
const int bufferSize = 1024 * 1024;
const int writesCount = 400;
byte[] streamBuffer = new byte[writesCount * bufferSize];
byte[] buffer = new byte[bufferSize];
for (int i = 0; i < 10; i++)
{
using (var stream = new MemoryStream(streamBuffer))
{
for (int j = 0; j < writesCount; j++)
{
stream.Write(buffer, 0, buffer.Length);
}
}
}
When using a blocking TCP socket, I don't have to specify a buffer size. For example:
using (var client = new TcpClient())
{
client.Connect(ServerIp, ServerPort);
using (var reader = new BinaryReader(client.GetStream()))
using (var writer = new BinaryWriter(client.GetStream()))
{
var byteCount = reader.ReadInt32();
reader.ReadBytes(byteCount);
}
}
Notice how the remote host could have sent any number of bytes.
However, when using async TCP sockets, I need to create a buffer and thus hardcode a maximum size:
var buffer = new byte[BufferSize];
socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, callback, null);
I could simply set the buffer size to, say, 1024 bytes. That'll work if I only need to receive small chunks of data. But what if I need to receive a 10 MB serialized object? I could set the buffer size to 10*1024*1024... but that would waste a constant 10 MB of RAM for as long as the application is running. This is silly.
So, my question is: How can I efficiently receive big chunks of data using async TCP sockets?
The two examples are not equivalent: your blocking code assumes the remote end sends the 32-bit length of the data to follow. If the same protocol is valid for the async case, just read that length (blocking or not), then allocate the buffer and initiate the asynchronous IO.
Edit 0:
Let me also add that allocating buffers of user-entered, and especially of network-supplied, size is a recipe for disaster. An obvious problem is a denial-of-service attack where a client requests a huge buffer and holds on to it - say, by sending data very slowly - preventing other allocations and/or slowing the whole system.
Common wisdom here is to accept a fixed amount of data at a time and parse as you go. That of course affects your application-level protocol design.
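For completeness, a minimal sketch of the suggested approach, assuming the same 32-bit length-prefix protocol as the blocking example; MaxMessageSize is a hypothetical sanity cap guarding against the denial-of-service issue described above:
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

const int MaxMessageSize = 10 * 1024 * 1024; // hypothetical upper bound

static async Task<byte[]> ReceiveMessageAsync(NetworkStream stream)
{
    // Read the 4-byte little-endian length prefix first.
    byte[] lengthBytes = new byte[4];
    await ReadExactAsync(stream, lengthBytes, 4);
    int byteCount = BitConverter.ToInt32(lengthBytes, 0);
    if (byteCount < 0 || byteCount > MaxMessageSize)
        throw new InvalidDataException("Unreasonable message size: " + byteCount);

    // Only now allocate a buffer of exactly the announced size.
    byte[] payload = new byte[byteCount];
    await ReadExactAsync(stream, payload, byteCount);
    return payload;
}

static async Task ReadExactAsync(NetworkStream stream, byte[] buffer, int count)
{
    int offset = 0;
    while (offset < count)
    {
        int read = await stream.ReadAsync(buffer, offset, count - offset);
        if (read == 0) throw new EndOfStreamException();
        offset += read;
    }
}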
EDITED
After a long analysis, the best approach I found for this problem was the following:
First, you need to set the buffer size in order to receive data from the server/client.
Second, you need to find the upload/download speed for that connection.
Third, you need to calculate how many seconds the connection timeout should last, in accordance with the size of the package to be sent or received.
Set the buffer size
The buffer size can be set in two ways, arbitrarily or objectively. If the information to be received is text based, is not large, and does not require character comparison, then an arbitrary pre-set buffer size is optimal. If the information to be received needs to be processed character by character, and/or is large, an objective buffer size is the optimal choice.
// In this example I used a Socket wrapped inside a NetworkStream for simplicity,
// stability, and asynchronous operability purposes.
// This can be done like this:
//
// For the server:
//
// Socket server = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
// server.ReceiveBufferSize = 18000;
// IPEndPoint iPEndPoint = new IPEndPoint(IPAddress.Any, port);
// server.Bind(iPEndPoint);
// server.Listen(3000);
// Socket connection = server.Accept(); // wrap the accepted socket, not the listener
//
// NetworkStream ns = new NetworkStream(connection);
//
// For the client:
//
// Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
// client.Connect("127.0.0.1", 80);
//
// NetworkStream ns = new NetworkStream(client);
// To set an objective buffer size based on a file's size - so that you neither
// receive extra null characters because the buffer is bigger than the file,
// nor a truncated file because the buffer is smaller than the file:
// The TCP protocol follows the Syn, Ack and Syn-Ack paradigm,
// so within a TCP connection, if the client or server began the
// conversation by sending a message, its next operation within
// that connection must be a read, and if it began by receiving
// a message, its next operation must be a write.
// [SENDER]
byte[] file = new byte[18032];
byte[] file_length = Encoding.UTF8.GetBytes(file.Length.ToString());
await Sender.WriteAsync(file_length, 0, file_length.Length);
byte[] receiver_response = new byte[1800];
await Sender.ReadAsync(receiver_response, 0, receiver_response.Length);
await Sender.WriteAsync(file, 0, file.Length);
// [SENDER]
// [RECEIVER]
byte[] file_length = new byte[1800];
await Receiver.ReadAsync(file_length, 0, file_length.Length);
byte[] encoded_response = Encoding.UTF8.GetBytes("OK");
await Receiver.WriteAsync(encoded_response, 0, encoded_response.Length);
byte[] file = new byte[Convert.ToInt32(Encoding.UTF8.GetString(file_length).TrimEnd('\0'))]; // trim padding nulls before parsing
await Receiver.ReadAsync(file, 0, file.Length);
// [RECEIVER]
The buffers used to transfer the payload length have an arbitrary size. The length of the payload to be sent is converted to a string, and the string is then converted to a UTF-8 encoded byte array. The received payload length is converted back into a string and then into an integer, which sets the length of the buffer that will receive the payload. The length goes int -> string -> byte[] and back so that data corruption is avoided even though the length information is sent in a buffer larger than the information itself: when the receiver converts the byte[] content to a string, the padding (null) characters are trimmed away and the value remains the same.
Get the upload/download speed of the connection and calculate the Socket receive and send buffer size
First, make a class that is responsible for calculating the buffer size for each connection.
// (Socket/NetworkStream setup is the same as in the previous example.)
class Internet_Speed_Checker
{
    // payload_length: the size of the package to be transferred. The original
    // snippet referenced this value without defining it, so it is taken as a
    // parameter here.
    public async Task<(int bufferSize, int timeoutMs)> Optimum_Buffer_Size(System.Net.Sockets.NetworkStream socket, long payload_length)
    {
        System.Diagnostics.Stopwatch latency_counter = new System.Diagnostics.Stopwatch();
        byte[] test_payload = new byte[2048];
        // The TCP protocol follows the Syn, Ack and Syn-Ack paradigm,
        // so within a TCP connection, if this side began the conversation
        // by sending a message, its next operation must be a read, and if
        // it began by receiving a message, its next operation must be a
        // write.
        //
        // In order to test the connection, the client and server must
        // send and receive a package of the same size. If this side began
        // the conversation by sending a message, do this connection test
        // as a write-read sequence; otherwise do it as a read-write
        // sequence (as below).
        latency_counter.Start();
        await socket.ReadAsync(test_payload, 0, test_payload.Length);
        await socket.WriteAsync(test_payload, 0, test_payload.Length);
        latency_counter.Stop();

        int bytes_per_second = (int)(test_payload.Length * (1000 / latency_counter.Elapsed.TotalMilliseconds));
        int optimal_connection_timeout = (int)(payload_length / bytes_per_second) * 1000 + 1000;
        double optimal_buffer_size_double = ((bytes_per_second / 125000.0) * (latency_counter.Elapsed.TotalMilliseconds / 1000)) * 1048576;
        int optimal_buffer_size = (int)optimal_buffer_size_double + 1024;

        // If you want to upload data to the client/server:
        //   client.SendBufferSize = optimal_buffer_size;
        //   client.SendTimeout = optimal_connection_timeout;
        // If you want to download data from the client/server:
        //   client.ReceiveBufferSize = optimal_buffer_size;
        //   client.ReceiveTimeout = optimal_connection_timeout;
        return (optimal_buffer_size, optimal_connection_timeout);
    }
}
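Hypothetical usage, matching the adjusted signature above (ns and client are the NetworkStream and Socket from the earlier setup comments, and 18032 is the example payload size):
(int bufferSize, int timeoutMs) = await new Internet_Speed_Checker().Optimum_Buffer_Size(ns, payload_length: 18032);
client.ReceiveBufferSize = bufferSize;  // for downloads
client.ReceiveTimeout = timeoutMs;      // matching timeout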
The aforementioned method ensures that the data transmitted between the client buffer and server buffer uses an appropriate socket buffer size and socket connection timeout, in order to avoid data corruption and fragmentation. When data is sent through a socket with an async read/write operation, the information to be sent is segmented into packets. The packet size has a default value, but it does not account for the fact that the upload/download speed of the connection varies. To avoid data corruption and to get an optimal download/upload speed on the connection, the packet size must be set in accordance with the speed of the connection. In the example above I also showcased how to calculate the timeout in relation to the connection speed. The packet size for upload/download can be set via socket.SendBufferSize = ... and socket.ReceiveBufferSize = ... respectively.
For more information related to the equations and principles used, check:
https://www.baeldung.com/cs/calculate-internet-speed-ping
https://docs.oracle.com/cd/E36784_01/html/E37476/gnkor.html#:~:text=You%20can%20calculate%20the%20correct,value%20of%20the%20connection%20latency.