Closing MemoryStream in async task

I am writing code that loads multiple instances of the same task at once and waits for them all to finish. Each task reads from a file and uploads a byte array of a portion of that file.
var requests = new Task[parts.Count];
foreach (var part in parts)
{
var partNumber = part.Item1;
var partSize = part.Item2;
var ms = new MemoryStream(partSize);
var bw = new BinaryWriter(ms);
var offset = (partNumber - 1) * partMaxSize;
var count = partSize;
bw.Write(assetContentBytes, offset, count);
ms.Position = 0;
Console.WriteLine("beginning upload of part " + partNumber);
requests[partNumber - 1] = uploadClient.UploadPart(uploadResult.AssetId, partNumber, ms);
}
await Task.WhenAll(requests);
I would like to close these MemoryStreams after the related task is complete, but if I write stream.Close() into the loop, the streams close before the task is complete. Is it possible to close each stream after the task is complete? Thanks.

Just extract the part that uses the stream to another async method:
var requests = new Task[parts.Count];
foreach (var part in parts)
{
var partNumber = part.Item1;
var partSize = part.Item2;
requests[partNumber - 1] = UploadPartAsync(partNumber, partSize);
}
await Task.WhenAll(requests);
...
async Task UploadPartAsync(int partNumber, int partSize)
{
using (var ms = new MemoryStream(partSize))
using (var bw = new BinaryWriter(ms))
{
var offset = (partNumber - 1) * partMaxSize;
var count = partSize;
bw.Write(assetContentBytes, offset, count);
ms.Position = 0;
Console.WriteLine("beginning upload of part " + partNumber);
await uploadClient.UploadPart(uploadResult.AssetId, partNumber, ms);
}
}
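The reason this works is that a using block inside an async method is not exited until the awaited call completes, so each stream is disposed only after its own upload finishes. Below is a minimal sketch of that behavior. FakeUploadAsync is a hypothetical stand-in for uploadClient.UploadPart (which the question does not show), and the sketch wraps the source array directly instead of copying it through a BinaryWriter; it assumes a compiler that supports static local functions and top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for uploadClient.UploadPart: it yields first and only
// then reads the stream, so a stream disposed too early would throw.
static async Task<int> FakeUploadAsync(Stream body)
{
    await Task.Delay(10);
    var sink = new MemoryStream();
    await body.CopyToAsync(sink);
    return (int)sink.Length;
}

// The stream lives exactly as long as this method's awaited upload.
static async Task<int> UploadPartAsync(byte[] content, int offset, int count)
{
    // Read-only view over the existing array; no extra copy is made.
    using (var ms = new MemoryStream(content, offset, count, writable: false))
    {
        return await FakeUploadAsync(ms);
    }
}

// Starts one upload per part and waits for all of them to finish.
static async Task<int[]> UploadAllAsync(byte[] content, int partMaxSize)
{
    var tasks = new List<Task<int>>();
    for (int offset = 0; offset < content.Length; offset += partMaxSize)
    {
        int count = Math.Min(partMaxSize, content.Length - offset);
        tasks.Add(UploadPartAsync(content, offset, count));
    }
    return await Task.WhenAll(tasks);
}
```

Calling `await UploadAllAsync(assetContentBytes, partMaxSize)` would return the number of bytes each simulated upload received, in part order.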

Related

C# gRPC file streaming, original file smaller than the streamed one

I am having some problems with setting up a request-stream type gRPC architecture. The code below is just for testing purposes and it is missing various validation checks, but the main issue is that the original file is always smaller than the received one.
Could the cause here be encoding? It doesn't matter what the file type is, the end result is always that the file sizes are different.
Protobuf interface:
syntax = "proto3";
package FileTransfer;
option csharp_namespace = "FileTransferProto";
service FileTransferService {
rpc DownloadFile(FileRequest) returns (stream ChunkMsg);
}
message ChunkMsg {
string FileName = 1;
int64 FileSize = 2;
bytes Chunk = 3;
}
message FileRequest {
string FilePath = 1;
}
Server side (sending):
public override async Task DownloadFile(FileRequest request, IServerStreamWriter<ChunkMsg> responseStream, ServerCallContext context)
{
string filePath = request.FilePath;
if (!File.Exists(filePath)) { return; }
FileInfo fileInfo = new FileInfo(filePath);
ChunkMsg chunk = new ChunkMsg();
chunk.FileName = Path.GetFileName(filePath);
chunk.FileSize = fileInfo.Length;
int fileChunkSize = 64 * 1024;
byte[] fileByteArray = File.ReadAllBytes(filePath);
byte[] fileChunk = new byte[fileChunkSize];
int fileOffset = 0;
while (fileOffset < fileByteArray.Length && !context.CancellationToken.IsCancellationRequested)
{
int length = Math.Min(fileChunkSize, fileByteArray.Length - fileOffset);
Buffer.BlockCopy(fileByteArray, fileOffset, fileChunk, 0, length);
fileOffset += length;
ByteString byteString = ByteString.CopyFrom(fileChunk);
chunk.Chunk = byteString;
await responseStream.WriteAsync(chunk).ConfigureAwait(false);
}
}
Client side (receiving):
public static async Task GetFile(string filePath)
{
var channel = Grpc.Net.Client.GrpcChannel.ForAddress("https://localhost:5001/", new GrpcChannelOptions
{
MaxReceiveMessageSize = 5 * 1024 * 1024, // 5 MB
MaxSendMessageSize = 5 * 1024 * 1024, // 5 MB
});
var client = new FileTransferProto.FileTransferService.FileTransferServiceClient(channel);
var request = new FileRequest { FilePath = filePath };
string tempFileName = $"temp_{DateTime.UtcNow.ToString("yyyyMMdd_HHmmss")}.tmp";
string finalFileName = tempFileName;
using (var call = client.DownloadFile(request))
{
await using (Stream fs = File.OpenWrite(tempFileName))
{
await foreach (ChunkMsg chunkMsg in call.ResponseStream.ReadAllAsync().ConfigureAwait(false))
{
Int64 totalSize = chunkMsg.FileSize;
string tempFinalFilePath = chunkMsg.FileName;
if (!string.IsNullOrEmpty(tempFinalFilePath))
{
finalFileName = chunkMsg.FileName;
}
fs.Write(chunkMsg.Chunk.ToByteArray());
}
}
}
if (finalFileName != tempFileName)
{
File.Move(tempFileName, finalFileName);
}
}

To add to Marc's answer, I feel like you can simplify your code a little bit.
using var fs = File.Open(filePath, System.IO.FileMode.Open);
int bytesRead;
var buffer = new byte[fileChunkSize];
while ((bytesRead = await fs.ReadAsync(buffer)) > 0)
{
await call.RequestStream.WriteAsync(new ChunkMsg
{
// Here the correct number of bytes must be sent which is starting from
// index 0 up to the number of read bytes from the file stream.
// If you solely pass 'buffer' here, the same bug would be present.
Chunk = ByteString.CopyFrom(buffer[0..bytesRead]),
});
}
I've used the range operator from C# 8.0, which makes this cleaner. Alternatively, you can use the overload of ByteString.CopyFrom that takes an offset and a count of bytes to include, which also avoids the intermediate array copy that the range expression creates.

In your write loop, the chunk you actually send is for the oversized buffer, not accounting for length. This means that the last segment includes some garbage and is oversized. The received payload will be oversized by this same amount. So: make sure you account for length when constructing the chunk to send.
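The effect is easy to reproduce without gRPC. The following is a minimal sketch that uses plain byte arrays in place of ByteString; SplitIntoChunks is a hypothetical helper, not part of any library. Because each chunk is sized to the bytes actually copied, the chunk lengths add up to the source size.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Splits data into chunks of at most chunkSize bytes. Each chunk is allocated
// with the length actually copied, so the last one carries no stale padding.
static List<byte[]> SplitIntoChunks(byte[] data, int chunkSize)
{
    var chunks = new List<byte[]>();
    for (int offset = 0; offset < data.Length; offset += chunkSize)
    {
        int length = Math.Min(chunkSize, data.Length - offset);
        var chunk = new byte[length];
        Buffer.BlockCopy(data, offset, chunk, 0, length);
        chunks.Add(chunk);
    }
    return chunks;
}
```

In the server loop from the question, the equivalent change would be to build the message with ByteString.CopyFrom(fileChunk, 0, length) rather than ByteString.CopyFrom(fileChunk).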

I tested the code and modified it to transfer the correct size.
The complete code is available at the following URL: https://github.com/lisa3907/grpc.fileTransfer
Server-side code:
while (_offset < _file_bytes.Length)
{
if (context.CancellationToken.IsCancellationRequested)
break;
var _length = Math.Min(_chunk_size, _file_bytes.Length - _offset);
Buffer.BlockCopy(_file_bytes, _offset, _file_chunk, 0, _length);
_offset += _length;
_chunk.ChunkSize = _length;
_chunk.Chunk = ByteString.CopyFrom(_file_chunk);
await responseStream.WriteAsync(_chunk).ConfigureAwait(false);
}
Client-side code:
await foreach (var _chunk in _call.ResponseStream.ReadAllAsync().ConfigureAwait(false))
{
var _total_size = _chunk.FileSize;
if (!String.IsNullOrEmpty(_chunk.FileName))
{
_final_file = _chunk.FileName;
}
if (_chunk.Chunk.Length == _chunk.ChunkSize)
_fs.Write(_chunk.Chunk.ToByteArray());
else
{
_fs.Write(_chunk.Chunk.ToByteArray(), 0, _chunk.ChunkSize);
Console.WriteLine($"final chunk size: {_chunk.ChunkSize}");
}
}

c# web request html response doesn't show well

I'm trying to retrieve a website using TCP and HTTP requests. I added a textbox and a Go button; I type an address in the textbox and then press the button to load the website. It works well, except that it doesn't show the complete page. For example, when I try to reach www.google.com, it doesn't show the Google logo, and it also keeps giving me warnings about JS files.
Here is the main part of my code. Any help is deeply appreciated.
private async void button1_ClickAsync(object sender, EventArgs e)
{
string result = string.Empty;
using (var tcp = new TcpClient(textBox1.Text, 80))
using (var stream = tcp.GetStream())
{
tcp.SendTimeout = 500;
tcp.ReceiveTimeout = 1000;
var builder = new StringBuilder();
builder.AppendLine("GET /?scope=images&nr=1 HTTP/1.1");
builder.AppendLine("Host: " + textBox1.Text);
//builder.AppendLine("Content-Length: " + data.Length); // only for POST request
builder.AppendLine("Connection: close");
builder.AppendLine();
var header = Encoding.ASCII.GetBytes(builder.ToString());
await stream.WriteAsync(header, 0, header.Length);
//await stream.WriteAsync(data, 0, data.Length);
using (var memory = new MemoryStream())
{
await stream.CopyToAsync(memory);
memory.Position = 0;
var data = memory.ToArray();
var index = BinaryMatch(data, Encoding.ASCII.GetBytes("\r\n\r\n")) + 4;
var headers = Encoding.ASCII.GetString(data, 0, index);
memory.Position = index;
if (headers.IndexOf("Content-Encoding: gzip") > 0)
{
using (GZipStream decompressionStream = new GZipStream(memory, CompressionMode.Decompress))
using (var decompressedMemory = new MemoryStream())
{
decompressionStream.CopyTo(decompressedMemory);
decompressedMemory.Position = 0;
result = Encoding.UTF8.GetString(decompressedMemory.ToArray());
webBrowser2.DocumentText = result;
}
}
else
{
result = Encoding.UTF8.GetString(data, index, data.Length - index);
webBrowser2.DocumentText = result;
//result = Encoding.GetEncoding("gbk").GetString(data, index, data.Length - index);
}
}
//Debug.WriteLine(result);
//return result;
}
}

Using threads (tasks) for work that contains I/O

I need to read data from a file, process it, and write the result to another file. I use a BackgroundWorker to show the progress. I wrote something like this to use in the BackgroundWorker's DoWork event:
private void ProcData(string fileToRead,string fileToWrite)
{
byte[] buffer = new byte[4 * 1024];
//fileToRead & fileToWrite have same size
FileInfo fileInfo = new FileInfo(fileToRead);
using (FileStream streamReader = new FileStream(fileToRead, FileMode.Open))
using (BinaryReader binaryReader = new BinaryReader(streamReader))
using (FileStream streamWriter = new FileStream(fileToWrite, FileMode.Open))
using (BinaryWriter binaryWriter = new BinaryWriter(streamWriter))
{
while (streamWriter.Position < fileInfo.Length)
{
if (streamWriter.Position + buffer.Length > fileInfo.Length)
{
buffer = new byte[fileInfo.Length - streamWriter.Position];
}
//read
buffer = binaryReader.ReadBytes(buffer.Length);
//proccess
Proc(buffer);
//write
binaryWriter.Write(buffer);
//report if procentage changed
//...
}//while
}//using
}
But it is 5 times slower than just reading from fileToRead and writing to fileToWrite, so I thought about threading. I read some questions on this site and tried something like this, based on this question:
private void ProcData2(string fileToRead, string fileToWrite)
{
int threadNumber = 4; //for example
Task[] tasks = new Task[threadNumber];
long[] startByte = new long[threadNumber];
long[] length = new long[threadNumber];
//divide file to threadNumber(4) part
//and update startByte & length
var parentTask = Task.Run(() =>
{
for (int i = 0; i < threadNumber; i++)
{
tasks[i] = Task.Factory.StartNew(() =>
{
Proc2(fileToRead, fileToWrite, startByte[i], length[i]);
});
}
});
parentTask.Wait();
Task.WaitAll(tasks);
}
//
private void Proc2(string fileToRead,string fileToWrite,long fileStartByte,long partLength)
{
byte[] buffer = new byte[4 * 1024];
using (FileStream streamReader = new FileStream(fileToRead, FileMode.Open,FileAccess.Read,FileShare.Read))
using (BinaryReader binaryReader = new BinaryReader(streamReader))
using (FileStream streamWriter = new FileStream(fileToWrite, FileMode.Open,FileAccess.Write,FileShare.Write))
using (BinaryWriter binaryWriter = new BinaryWriter(streamWriter))
{
streamReader.Seek(fileStartByte, SeekOrigin.Begin);
streamWriter.Seek(fileStartByte, SeekOrigin.Begin);
while (streamWriter.Position < fileStartByte+partLength)
{
if (streamWriter.Position + buffer.Length > fileStartByte+partLength)
{
buffer = new byte[fileStartByte+partLength - streamWriter.Position];
}
//read
buffer = binaryReader.ReadBytes(buffer.Length);
//proccess
Proc(buffer);
//write
binaryWriter.Write(buffer);
//report if procentage changed
//...
}//while
}//using
}
But I think it has some problems: each time the tasks switch, the disk has to seek again. I thought about reading the file, using threads for Proc(), and then writing the result, but that seems wrong. How can I do this properly (read a buffer from a file, process it, and write it to another file using tasks)?
//===================================================================
Based on Pete Kirkham's post, I modified my method. I do not know why, but it did not work for me. I am adding the new method for anyone it may help. Thanks, everybody.
private void ProcData3(string fileToRead, string fileToWrite)
{
int bufferSize = 4 * 1024;
int threadNumber = 4;//example
List<byte[]> bufferPool = new List<byte[]>();
Task[] tasks = new Task[threadNumber];
//fileToRead & fileToWrite have same size
FileInfo fileInfo = new FileInfo(fileToRead);
using (FileStream streamReader = new FileStream(fileToRead, FileMode.Open))
using (BinaryReader binaryReader = new BinaryReader(streamReader))
using (FileStream streamWriter = new FileStream(fileToWrite, FileMode.Open))
using (BinaryWriter binaryWriter = new BinaryWriter(streamWriter))
{
while (streamWriter.Position < fileInfo.Length)
{
//read
for (int g = 0; g < threadNumber; g++)
{
if (streamWriter.Position + bufferSize <= fileInfo.Length)
{
bufferPool.Add(binaryReader.ReadBytes(bufferSize));
}
else
{
bufferPool.Add(binaryReader.ReadBytes((int)(fileInfo.Length - streamWriter.Position)));
break;
}
}
//do
var parentTask = Task.Run(() =>
{
for (int th = 0; th < bufferPool.Count; th++)
{
int index = th;
//threads
tasks[index] = Task.Factory.StartNew(() =>
{
Proc(bufferPool[index]);
});
}//for th
});
//stop parent task(run childs)
parentTask.Wait();
//wait till all task be done
Task.WaitAll(tasks);
//write
for (int g = 0; g < bufferPool.Count; g++)
{
binaryWriter.Write(bufferPool[g]);
}
//report if procentage changed
//...
}//while
}//using
}

Essentially, you want to split the processing of the data up into parallel tasks, but you don't want to split the I/O up.
How this happens depends on the size of your data. If it is small enough to fit into memory, then you can read it all into an input array and create an output array, then create tasks to process some of the input array and populate some of the output array, then write the whole output array to file.
If the data is too large for this, then you need to put a limit on the amount of data read and written at a time. So you have your main flow, which starts off by reading N blocks of data and creating N tasks to process them. You then wait for the tasks to complete in order, and each time one completes you write its block of output, read a new block of input, and create another task. Some experimentation will be required to find good values for N and the block size, so that tasks tend to complete at about the same rate as the I/O.
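The bounded approach can be sketched as follows. This is only an illustration under stated assumptions: the per-block transform is passed in as a delegate (the question's Proc is not shown), ProcessStream and ReadBlock are hypothetical names, and the code works on any pair of streams. Reads and writes stay sequential on the calling thread; only the processing runs in tasks, and at most maxInFlight blocks are pending at once.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Fills the buffer from the stream until it is full or the stream ends.
static int ReadBlock(Stream input, byte[] buffer)
{
    int total = 0;
    while (total < buffer.Length)
    {
        int n = input.Read(buffer, total, buffer.Length - total);
        if (n == 0) break;
        total += n;
    }
    return total;
}

// Sequential read, parallel processing, sequential in-order write.
static void ProcessStream(Stream input, Stream output, Func<byte[], byte[]> process,
                          int blockSize, int maxInFlight)
{
    var pending = new Queue<Task<byte[]>>(); // oldest task first, preserving order
    var buffer = new byte[blockSize];
    while (true)
    {
        int read = ReadBlock(input, buffer);
        if (read > 0)
        {
            // Each task gets its own copy, so the read buffer can be reused.
            var block = new byte[read];
            Buffer.BlockCopy(buffer, 0, block, 0, read);
            pending.Enqueue(Task.Run(() => process(block)));
        }
        // Write the oldest result when the window is full; drain everything at end of input.
        while (pending.Count > 0 && (pending.Count >= maxInFlight || read == 0))
        {
            byte[] result = pending.Dequeue().Result;
            output.Write(result, 0, result.Length);
        }
        if (read == 0) break;
    }
}
```

Using a queue means results are written in the order the blocks were read, even if a later block finishes processing first. The values of blockSize and maxInFlight are the tuning knobs the answer refers to.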

.NET 4.5 file read performance sync vs async

We're trying to compare the performance of reading a series of files using sync methods versus async methods. We expected roughly the same time for both, but the async version turns out to be about 5.5x slower.
This might be due to the overhead of managing the threads but just wanted to know your opinion. Maybe we're just measuring the timings wrong.
These are the methods being tested:
static void ReadAllFile(string filename)
{
var content = File.ReadAllBytes(filename);
}
static async Task ReadAllFileAsync(string filename)
{
using (var file = File.OpenRead(filename))
{
using (var ms = new MemoryStream())
{
byte[] buff = new byte[file.Length];
await file.ReadAsync(buff, 0, (int)file.Length);
}
}
}
And this is the method that runs them and starts the stopwatch:
static void Test(string name, Func<string, Task> gettask, int count)
{
Stopwatch sw = new Stopwatch();
Task[] tasks = new Task[count];
sw.Start();
for (int i = 0; i < count; i++)
{
string filename = "file" + i + ".bin";
tasks[i] = gettask(filename);
}
Task.WaitAll(tasks);
sw.Stop();
Console.WriteLine(name + " {0} ms", sw.ElapsedMilliseconds);
}
Which is all run from here:
static void Main(string[] args)
{
int count = 10000;
for (int i = 0; i < count; i++)
{
Write("file" + i + ".bin");
}
Console.WriteLine("Testing read...!");
Test("Read Contents", (filename) => Task.Run(() => ReadAllFile(filename)), count);
Test("Read Contents Async", (filename) => ReadAllFileAsync(filename), count);
Console.ReadKey();
}
And the helper write method:
static void Write(string filename)
{
Data obj = new Data()
{
Header = "random string size here"
};
int size = 1024 * 20; // 1024 * 256;
obj.Body = new byte[size];
for (var i = 0; i < size; i++)
{
obj.Body[i] = (byte)(i % 256);
}
Stopwatch sw = new Stopwatch();
sw.Start();
MemoryStream ms = new MemoryStream();
Serializer.Serialize(ms, obj);
ms.Position = 0;
using (var file = File.Create(filename))
{
ms.CopyToAsync(file).Wait();
}
sw.Stop();
//Console.WriteLine("Writing file {0}", sw.ElapsedMilliseconds);
}
The results:
-Read Contents 574 ms
-Read Contents Async 3160 ms
We would really appreciate it if anyone could shed some light on this, as we searched Stack Overflow and the web but couldn't find a proper explanation.

There are lots of things wrong with the testing code. Most notably, your "async" test does not use async I/O; with file streams, you have to explicitly open them as asynchronous or else you're just doing synchronous operations on a background thread. Also, your file sizes are very small and can be easily cached.
I modified the test code to write out much larger files, to have comparable sync vs async code, and to make the async code asynchronous:
static void Main(string[] args)
{
Write("0.bin");
Write("1.bin");
Write("2.bin");
ReadAllFile("2.bin"); // warmup
var sw = new Stopwatch();
sw.Start();
ReadAllFile("0.bin");
ReadAllFile("1.bin");
ReadAllFile("2.bin");
sw.Stop();
Console.WriteLine("Sync: " + sw.Elapsed);
ReadAllFileAsync("2.bin").Wait(); // warmup
sw.Restart();
ReadAllFileAsync("0.bin").Wait();
ReadAllFileAsync("1.bin").Wait();
ReadAllFileAsync("2.bin").Wait();
sw.Stop();
Console.WriteLine("Async: " + sw.Elapsed);
Console.ReadKey();
}
static void ReadAllFile(string filename)
{
using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, false))
{
byte[] buff = new byte[file.Length];
file.Read(buff, 0, (int)file.Length);
}
}
static async Task ReadAllFileAsync(string filename)
{
using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, true))
{
byte[] buff = new byte[file.Length];
await file.ReadAsync(buff, 0, (int)file.Length);
}
}
static void Write(string filename)
{
int size = 1024 * 1024 * 256;
var data = new byte[size];
var random = new Random();
random.NextBytes(data);
File.WriteAllBytes(filename, data);
}
On my machine, this test (built in Release, run outside the debugger) yields these numbers:
Sync: 00:00:00.4461936
Async: 00:00:00.4429566
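One caveat applies to both Read and ReadAsync in the test code above: a single call may return fewer bytes than requested, so robust code loops until the buffer is full. A minimal sketch, assuming the file fits in a single array; ReadWholeFileAsync is a hypothetical helper name:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Reads an entire file using a FileStream opened for asynchronous I/O.
static async Task<byte[]> ReadWholeFileAsync(string path)
{
    // useAsync: true asks for asynchronous I/O at the OS level where supported;
    // without it, ReadAsync runs a synchronous read on a thread-pool thread.
    using (var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read,
                                     bufferSize: 4096, useAsync: true))
    {
        var buffer = new byte[file.Length];
        int total = 0;
        while (total < buffer.Length)
        {
            int n = await file.ReadAsync(buffer, total, buffer.Length - total);
            if (n == 0) throw new EndOfStreamException("File shrank while being read.");
            total += n;
        }
        return buffer;
    }
}
```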

At the OS level, all I/O operations are asynchronous. With a synchronous call, the thread simply waits (it is suspended) for the I/O operation to finish. That is why Jeffrey Richter always recommends doing I/O asynchronously, so that your thread is not wasted waiting around.
From Jeffrey Richter:
Also, creating a thread is not cheap. Each thread gets 1 MB of address space reserved for user mode and another 12 KB for kernel mode. After this, the OS has to notify all the DLLs loaded in the process that a new thread has been spawned. The same happens when you destroy a thread. Also think about the complexities of context switching.
Found a great SO answer here

Updating two progress bars at the same time

I have an application that automatically downloads files from an FTP server, with two progress bars that show the overall and the current download state.
The first one works fine (update% = currentFile as int / allFilesToDownload * 100%). But I'd also like to update the progress of the file that is currently downloading.
My code:
Uri url = new Uri(sUrlToDnldFile);
int inde = files.ToList().IndexOf(file);
string subPath = ...
bool IsExists = System.IO.Directory.Exists(subPath);
if (!IsExists)
System.IO.Directory.CreateDirectory(subPath);
sFileSavePath = ...
System.Net.FtpWebRequest request = (FtpWebRequest)FtpWebRequest.Create(new Uri(file));
System.Net.FtpWebResponse response = (System.Net.FtpWebResponse)request.GetResponse();
response.Close();
long iSize = response.ContentLength;
long iRunningByteTotal = 0;
WebClient client = new WebClient();
Stream strRemote = client.OpenRead(url);
FileStream strLocal = new FileStream(sFileSavePath, FileMode.Create, FileAccess.Write, FileShare.None);
int iByteSize = 0;
byte[] byteBuffer = new byte[1024];
while ((iByteSize = strRemote.Read(byteBuffer, 0, byteBuffer.Length)) > 0)
{
strLocal.Write(byteBuffer, 0, iByteSize);
iRunningByteTotal += iByteSize;
//THERE I'D LIKE TO UPLOAD CURRENT FILE DOWNLOAD STATUS
string a = iByteSize.ToString();
double b = double.Parse(a.ToString()) / 100;
string[] c = b.ToString().Split(',');
int d = int.Parse(c[0].ToString());
update(d);
//update(int prog) { bgWorker2.ReportProgress(prog); }
}
double dIndex = (double)(iRunningByteTotal);
double dTotal = (double)iSize;
// THIS CODE COUNTING OVERAL PROGRESS - WORKS FINE
double iProgressPercentage1 = double.Parse(ind.ToString()) / double.Parse(files.Count().ToString()) * 100;
ind++;
string[] tab = iProgressPercentage1.ToString().Split(',');
int iProgressPercentage = int.Parse(tab[0]);
currentFile = file;
bgWorker1.ReportProgress(iProgressPercentage);
strRemote.Close();
Unfortunately, I still get an error saying that I can't update progressBar2 because another process is using it.
Is there any way to do it?
Thanks

Update the value through Dispatcher.BeginInvoke, something like this:
Dispatcher.BeginInvoke(DispatcherPriority.Background, new Action(() =>
{
progressbar2.Value = newvalue;
}));
The Dispatcher pushes your work to the main (UI) thread, which owns progressbar2.
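Independently of the threading issue, the per-file percentage in the question can be computed with integer arithmetic instead of converting to strings and splitting them. A minimal sketch; PercentComplete is a hypothetical helper, and it assumes the total size is known (an FTP response does not always report a usable ContentLength):

```csharp
using System;

// Returns the completed share as a whole percentage, clamped to 0..100.
static int PercentComplete(long bytesDone, long bytesTotal)
{
    // Unknown or zero size: report no progress rather than divide by zero.
    if (bytesTotal <= 0) return 0;
    long percent = bytesDone * 100 / bytesTotal;
    return (int)Math.Max(0, Math.Min(100, percent));
}
```

Inside the download loop from the question, this could be used as bgWorker2.ReportProgress(PercentComplete(iRunningByteTotal, iSize)).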
