.Net web client fails downloading after 2 GB from S3 - c#

I'm using the web client object to download a file like so:
strm = Client.OpenRead(url);
strm.ReadTimeout = 30000;
bool bFirst = true;
while ((read = strm.Read(buf, 0, 2000)) > 0)
{
fout.Write(buf, 0, read);
}
Where the url points to an S3 bucket. In some cases the download fails with a timeout at exactly 2 GB. Is this a network issue, or is there something I could change in the code?
Any ideas appreciated.

I believe WebClient will read the file into memory, and you're probably running into process size limitations.
What you'll want to use is WebClient.DownloadFile
I believe this will work better for you!

Related

s3 File Uploading is so slow C#

I'm using the transferUtility class to upload files into S3 using input stream in .NET. The problem is, uploading a file of 4MB takes around a minute. I tried both transferUtility.Upload and S3Client.PutObject and the upload time didn't seem to change. The code is below:
WebRequest.DefaultWebProxy = null;
this.S3Client = new Amazon.S3.AmazonS3Client(accessKeyId, secretAccessKey, endPoint);
this.transferUtility = new TransferUtility(this.S3Client);
Amazon.S3.Transfer.TransferUtilityUploadRequest transferRequest = new TransferUtilityUploadRequest();
transferRequest.BucketName = s3Bucket;
transferRequest.CannedACL = Amazon.S3.S3CannedACL.PublicRead;
transferRequest.InputStream = (Stream)fileStream;
transferRequest.Key = newFileName;
transferRequest.Headers.Expires = new DateTime(2020, 1, 1);
this.transferUtility.Upload(transferRequest);
Any help would be appreciated.
Thank you,
I found that this is happening because of an uploading the file stream to the application server takes a very long time.
This is likely also due to the C# SDK MD5sum'ing every chunk it uploads (CPU performance hit and thus slows down the upload while an MD5Sum is computed).
See https://github.com/aws/aws-sdk-net/issues/832#issuecomment-383756520 for workaround.

100s of concurrent users trying to download files (asp.net C# application)

I am trying to implement file download feature in asp.net application. The application would be used by say around 200 users concurrently to download various files.
It would be hosted on IIS 7. I do not want the application server to crash because of multiple requests coming concurrently.
I am assuming that by calling Context.Response.Flush() in a loop, I am flushing out all the file data that I would have read till then, so application memory usage would be kept uniform. What other optimizations can I make to the current code or what other approach should be used in a scenario like this?
The requests would be for various files and the file sizes can be anywhere between 100 KB to 10 MB.
My current code is like this:
FileStream inStr = null;
byte[] buffer = new byte[1024];
String fileName = #"C:\DwnldTest\test.doc";
long byteCount; inStr = File.OpenRead(fileName);
Response.AddHeader("content-disposition", "attachment;filename=test.doc");
while ((byteCount = inStr.Read(buffer, 0, buffer.Length)) > 0)
{
if (Context.Response.IsClientConnected)
{
Context.Response.ContentType = "application/msword";
//Context.Response.BufferOutput = true;
Context.Response.OutputStream.Write(buffer, 0, buffer.Length);
Context.Response.Flush();
}
}
You can use Response.TransmitFile to save server memory when sending files.
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment; filename=testdoc.pdf");
Response.TransmitFile(#"e:\inet\www\docs\testdoc.pdf");
Response.End();
In your code example, you're not closing / disposing inStr. That could affect performance.
Another more simple way to do this would be to use the built in method:
WriteFile
It should already be optimized and will take care of opening / closing files for you.
Maybe you want to use FileSystemWatcher class to check if the file was modified, and read it into memory only while such change was detected. For rest of the time just return the byte array that is already stored in memory. I don't know if HttpResponse.WriteFile method is sensitive for such file modification changes, or if always reads a file from given path, but this also seems to be a good option to use, as it is served by framework out of the box.
Since you are sending an existing file to the client, consider using HttpResponse.TransmitFile (http://msdn.microsoft.com/en-us/library/12s31dhy.aspx).
Looking at the .NET code it seems that this will forward the file writing to IIS instead of reading/writing it in ASP.NET process. HttpResponse.WriteFile(string, false) and HttpResponse.Write(string) seems to do the same thing.
In order to verify that the file sending is relayed to IIS, at HttpResponse.Output property - it should be of type HttpWriter. The HttpWriter._buffers array should now contain a new element HttpFileResponseElement).
Of course, you should always investigate if caching is appropriate in your scenario and test if it is being used.

Find Length of Stream object in WCF Client?

I have a WCF Service, which uploads the document using Stream class.
Now after this, i want to get the Size of the document(Length of Stream), to update the fileAttribute for FileSize.
But doing this, the WCF throws an exception saying
Document Upload Exception: System.NotSupportedException: Specified method is not supported.
at System.ServiceModel.Dispatcher.StreamFormatter.MessageBodyStream.get_Length()
at eDMRMService.DocumentHandling.UploadDocument(UploadDocumentRequest request)
Can anyone help me in solving this.
Now after this, i want to get the Size of the document(Length of Stream), to update the fileAttribute for FileSize.
No, don't do that. If you are writing a file, then just write the file. At the simplest:
using(var file = File.Create(path)) {
source.CopyTo(file);
}
or before 4.0:
using(var file = File.Create(path)) {
byte[] buffer = new byte[8192];
int read;
while((read = source.Read(buffer, 0, buffer.Length)) > 0) {
file.Write(buffer, 0, read);
}
}
(which does not need to know the length in advance)
Note that some WCF options (full message security etc) require the entire message to be validated before processing, so can never truly stream, so: if the size is huge, I suggest you instead use an API where the client splits it and sends it in pieces (which you then reassemble at the server).
If the stream doesn't support seeking you cannot find its length using Stream.Length
The alternative is to copy the stream to a byte array and find its cumulative length. This involves processing the whole stream first , if you don't want this, you should add a stream length parameter to your WCF service interface

SOAP , getting progress of the uploaded Request while its uploading c#

Im trying to upload a file through a SOAP request , and it worked perfectly , but I couldnt get a progress for the uploaded amount of the request .
You could try sending the file up in "chunks", like 1MB at a time rather than sending it all up at once? That way when each chunk completes, you'll be able to update the progress.
I can answer my question now,
Im not using SOAP anymore to upload my files in my solution, Im using HTTPWebRequest now,
1) yes im uploading my large files in chunks (each chuck is 1MB),
2) each chunk(1 MB) can give me progress each BufferSize (4 KB in my case);
so there is a big loop, foreach(Chunk in File) {} .
and inside the big loop there is another loop, as Im using HTTPWebRequest:
long buffer = 4096;
Stream stm = request.GetRequestStream();
while (remainingOfChunkWithReq != 0)
{
stm.Write(buffer, 0, bytesRead);
remainingOfChunkWithReq = remainingOfChunkWithReq - bytesRead;
bytesRead = memoryStream.Read(buffer, 0, bytesSize);
//Send Progress
}
then continue to send the request. and receive the response.

Joining Binary files that have been split via download

I am trying to join a number of binary files that were split during download. The requirement stemmed from the project http://asproxy.sourceforge.net/. In this project author allows you to download files by providing a url.
The problem comes through where my server does not have enough memory to keep a file that is larger than 20 meg in memory.So to solve this problem i modified the code to not download files larger than 10 meg's , if the file is larger it would then allow the user to download the first 10 megs. The user must then continue the download and hopefully get the second 10 megs. Now i have got all this working , except when the user needs to join the files they downloaded i end up with corrupt files , as far as i can tell something is either being added or removed via the download.
I am currently join the files together by reading all the files then writing them to one file.This should work since i am reading and writing in bytes. The code i used to join the files is listed here http://www.geekpedia.com/tutorial201_Splitting-and-joining-files-using-C.html
I do not have the exact code with me atm , as soon as i am home i will post the exact code if anyone is willing to help out.
Please let me know if i am missing out anything or if there is a better way to do this , i.e what could i use as an alternative to a memory stream. The source code for the original project which i made changes to can be found here http://asproxy.sourceforge.net/download.html , it should be noted i am using version 5.0. The file i modified is called WebDataCore.cs and i modified line 606 to only too till 10 megs of data had been loaded the continue execution.
Let me know if there is anything i missed.
Thanks
You shouldn't split for memory reasons... the reason to split is usually to avoid having to re-download everything in case of failure. If memory is an issue, you are doing it wrong... you shouldn't be buffering in memory, for example.
The easiest way to download a file is simply:
using(WebClient client = new WebClient()) {
client.DownloadFile(remoteUrl, localPath);
}
Re your split/join code - again, the problem is that you are buffering everything in memory; File.ReadAllBytes is a bad thing unless you know you have small files. What you should have is something like:
byte[] buffer = new byte[8192]; // why not...
int read;
while((read = inStream.Read(buffer, 0, buffer.Length)) > 0)
{
outStream.Write(buffer, 0, read);
}
This uses a moderate buffer to pump data between the two as a stream. A lot more efficient. The loop says:
try to read some data (at most, the buffer-size)
(this will read at least 1 byte, or we have reached the end of the stream)
if we read something, write this many bytes from the buffer to the output
In the end i have found that by using a FTP request i was able to get arround the memory issue and the file is saved correctly.
Thanks for all the help
That example is loading each entire chunk into memory, instead you could do something like this:
int bufSize = 1024 * 32;
byte[] buffer = new byte[bufSize];
using (FileStream outputFile = new FileStream(OutputFileName, FileMode.OpenOrCreate,
FileAccess.Write, FileShare.None, bufSize))
{
foreach (string inputFileName in inputFiles)
{
using (FileStream inputFile = new FileStream(inputFileName, FileMode.Append,
FileAccess.Write, FileShare.None, buffer.Length))
{
int bytesRead = 0;
while ((bytesRead = inputFile.Read(buffer, 0, buffer.Length)) != 0)
{
outputFile.Write(buffer, 0, bytesRead);
}
}

Categories