Right now I am using this code to download files (with a Range header). Most of the files are large, and CPU usage sits at 99% while a file downloads. Is there any way the file can be written out periodically so that it does not stay in RAM for the whole download?
private byte[] GetWebPageContent(string url, long start, long finish)
{
    byte[] result = new byte[finish];
    HttpWebRequest request;
    request = WebRequest.Create(url) as HttpWebRequest;
    //request.Headers.Add("Range", "bytes=" + start + "-" + finish);
    request.AddRange((int)start, (int)finish);
    using (WebResponse response = request.GetResponse())
    {
        return ReadFully(response.GetResponseStream());
    }
}
public static byte[] ReadFully(Stream stream)
{
    byte[] buffer = new byte[32768];
    using (MemoryStream ms = new MemoryStream())
    {
        while (true)
        {
            int read = stream.Read(buffer, 0, buffer.Length);
            if (read <= 0)
                return ms.ToArray();
            ms.Write(buffer, 0, read);
        }
    }
}
Instead of writing the data to a MemoryStream (which stores the data in memory), write the data to a FileStream (which stores the data in a file on disk).
byte[] buffer = new byte[32768];
using (FileStream fileStream = File.Create(path))
{
    while (true)
    {
        int read = stream.Read(buffer, 0, buffer.Length);
        if (read <= 0)
            break;
        fileStream.Write(buffer, 0, read);
    }
}
With .NET 4.0 or later, you can use Stream.CopyTo:
using (FileStream fileStream = File.Create(path))
{
    stream.CopyTo(fileStream);
}
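Putting the two together, here is a minimal sketch (not the poster's original signature; the path parameter naming the output file is an assumption) of a range download that streams straight to disk:

private void DownloadRangeToFile(string url, long start, long finish, string path)
{
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.AddRange(start, finish); // the long overload is available from .NET 4.0 on
    using (WebResponse response = request.GetResponse())
    using (Stream responseStream = response.GetResponseStream())
    using (FileStream fileStream = File.Create(path))
    {
        responseStream.CopyTo(fileStream); // copies chunk by chunk; nothing large accumulates in RAM
    }
}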
request = MakeConnection(uri, WebRequestMethods.Ftp.DownloadFile, username, password);
response = (FtpWebResponse)request.GetResponse();
Stream responseStream = response.GetResponseStream();

//This part of the code is used to write the read content from the server
using (StreamReader responseReader = new StreamReader(responseStream))
{
    using (var destinationStream = new FileStream(toFilenameToWrite, FileMode.Create))
    {
        byte[] fileContents = Encoding.UTF8.GetBytes(responseReader.ReadToEnd());
        destinationStream.Write(fileContents, 0, fileContents.Length);
    }
}
//This part of the code is used to write the read content from the server
using (var destinationStream = new FileStream(toFilenameToWrite, FileMode.Create))
{
    long length = response.ContentLength;
    int bufferSize = 2048;
    int readCount;
    byte[] buffer = new byte[2048];
    readCount = responseStream.Read(buffer, 0, bufferSize);
    while (readCount > 0)
    {
        destinationStream.Write(buffer, 0, readCount);
        readCount = responseStream.Read(buffer, 0, bufferSize);
    }
}
The former writes the content to the file, but when I try to open the file it says it is corrupted. The latter does the job perfectly when downloading zip files. Is there any specific reason why the former code doesn't work for zip files when it works perfectly for text files?
byte[] fileContents = Encoding.UTF8.GetBytes(responseReader.ReadToEnd());
You are trying to interpret a binary file (here, a ZIP) as UTF-8 text. That just cannot work.
For correct code, see Upload and download a binary file to/from FTP server in C#/.NET.
Use a BinaryWriter and pass it the FileStream.
//This part of the code is used to write the read content from the server
using (var destinationStream = new BinaryWriter(new FileStream(toFilenameToWrite, FileMode.Create)))
{
    long length = response.ContentLength;
    int bufferSize = 2048;
    int readCount;
    byte[] buffer = new byte[2048];
    readCount = responseStream.Read(buffer, 0, bufferSize);
    while (readCount > 0)
    {
        destinationStream.Write(buffer, 0, readCount);
        readCount = responseStream.Read(buffer, 0, bufferSize);
    }
}
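Note that BinaryWriter.Write(byte[], int, int) simply forwards the bytes to the underlying stream, so writing to the FileStream directly works as well. A shorter sketch of the same loop using Stream.CopyTo (available since .NET 4.0), assuming the same responseStream and toFilenameToWrite as above:

using (var destinationStream = new FileStream(toFilenameToWrite, FileMode.Create))
{
    responseStream.CopyTo(destinationStream);
}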
Here is my solution that worked for me:
public IActionResult GetZip([FromBody] List<DocumentAndSourceDto> documents)
{
    List<Document> listOfDocuments = new List<Document>();
    foreach (DocumentAndSourceDto doc in documents)
        listOfDocuments.Add(_documentService.GetDocumentWithServerPath(doc.Id));

    using (var ms = new MemoryStream())
    {
        using (var zipArchive = new ZipArchive(ms, ZipArchiveMode.Create, true))
        {
            foreach (var attachment in listOfDocuments)
            {
                var entry = zipArchive.CreateEntry(attachment.FileName);
                using (var fileStream = new FileStream(attachment.FilePath, FileMode.Open))
                using (var entryStream = entry.Open())
                {
                    fileStream.CopyTo(entryStream);
                }
            }
        }
        ms.Position = 0;
        return File(ms.ToArray(), "application/zip");
    }

    throw new ErrorException("Can't zip files");
}
Don't miss the ms.Position = 0; here.
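As a variant (a sketch assuming the same controller and services), you can return the MemoryStream itself instead of copying it into a byte[] with ToArray(). FileStreamResult disposes the stream after the response is written, so the MemoryStream must not be wrapped in a using block in that case, and here resetting Position really is required:

var ms = new MemoryStream();
using (var zipArchive = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
{
    // ... create the entries exactly as above ...
}
ms.Position = 0; // rewind so the response is written from the first byte
return File(ms, "application/zip");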
I'm using the following function to compress (thanks to http://www.dotnetperls.com/):
public static void CompressStringToFile(string fileName, string value)
{
    // A.
    // Write string to temporary file.
    string temp = Path.GetTempFileName();
    File.WriteAllText(temp, value);

    // B.
    // Read file into byte array buffer.
    byte[] b;
    using (FileStream f = new FileStream(temp, FileMode.Open))
    {
        b = new byte[f.Length];
        f.Read(b, 0, (int)f.Length);
    }

    // C.
    // Use GZipStream to write compressed bytes to target file.
    using (FileStream f2 = new FileStream(fileName, FileMode.Create))
    using (GZipStream gz = new GZipStream(f2, CompressionMode.Compress, false))
    {
        gz.Write(b, 0, b.Length);
    }
}
And for decompression:
static byte[] Decompress(byte[] gzip)
{
    // Create a GZIP stream with decompression mode.
    // ... Then create a buffer and write into it while reading from the GZIP stream.
    using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
    {
        const int size = 4096;
        byte[] buffer = new byte[size];
        using (MemoryStream memory = new MemoryStream())
        {
            int count = 0;
            do
            {
                count = stream.Read(buffer, 0, size);
                if (count > 0)
                {
                    memory.Write(buffer, 0, count);
                }
            }
            while (count > 0);
            return memory.ToArray();
        }
    }
}
So my goal is to compress log files and then decompress them in memory, comparing the decompressed data to the original file to check that the compression succeeded and that I can open the compressed file successfully.
The problem is that the decompressed file is most of the time bigger than the original file, so my comparison check fails even though the compression probably succeeded.
Any idea why?
By the way, here is how I compare the decompressed file to the original file:
static bool FileEquals(byte[] file1, byte[] file2)
{
    if (file1.Length == file2.Length)
    {
        for (int i = 0; i < file1.Length; i++)
        {
            if (file1[i] != file2[i])
            {
                return false;
            }
        }
        return true;
    }
    return false;
}
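As an aside, with System.Linq the same byte-for-byte comparison can be written as a one-liner, which is equivalent for byte arrays:

static bool FileEquals(byte[] file1, byte[] file2) => file1.SequenceEqual(file2);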
Try this method to compress a file:
public static byte[] Compress(byte[] raw)
{
    using (MemoryStream memory = new MemoryStream())
    {
        using (GZipStream gzip = new GZipStream(memory,
            CompressionMode.Compress, true))
        {
            gzip.Write(raw, 0, raw.Length);
        }
        return memory.ToArray();
    }
}
And this to decompress:
static byte[] Decompress(byte[] gzip)
{
    // Create a GZIP stream with decompression mode.
    // ... Then create a buffer and write into it while reading from the GZIP stream.
    using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
    {
        const int size = 4096;
        byte[] buffer = new byte[size];
        using (MemoryStream memory = new MemoryStream())
        {
            int count = 0;
            do
            {
                count = stream.Read(buffer, 0, size);
                if (count > 0)
                {
                    memory.Write(buffer, 0, count);
                }
            }
            while (count > 0);
            return memory.ToArray();
        }
    }
}
Tell me if it worked. Good luck!
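A quick round-trip sketch of how these two methods can back the comparison described in the question (logPath is an assumed placeholder for whatever log file you compressed):

byte[] original = File.ReadAllBytes(logPath);
byte[] roundTripped = Decompress(Compress(original));
bool ok = FileEquals(original, roundTripped); // should be true for an in-memory round trip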
I think you'd be better off with the simplest API call; try Stream.CopyTo(). I can't find the error in your code. If I were working on it, I'd probably make sure everything is getting flushed properly; I can't recall whether GZipStream flushes its output to the FileStream when the using block closes. But then you are also saying that the final file is larger, not smaller.
Anyhow, best policy in my experience: don't rewrite gotcha-prone code when you don't need to. At least you tested it ;)
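To make the Stream.CopyTo suggestion concrete, here is a minimal sketch of a file-to-file compressor built on it; the source and destination paths are assumptions, and disposing the GZipStream is what flushes the final compressed block to disk:

public static void CompressFile(string sourceFile, string destinationFile)
{
    using (FileStream input = File.OpenRead(sourceFile))
    using (FileStream output = File.Create(destinationFile))
    using (var gzip = new GZipStream(output, CompressionMode.Compress))
    {
        input.CopyTo(gzip); // CopyTo handles the buffering loop
    }
}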
I have a method for downloading a file over FTP, but I do not save the file locally; rather, I parse it in memory through the FTP response. My question is: is returning a StreamReader after getting the FTP response stream good practice? I do not want to do the parsing and other work in the same method.
var uri = new Uri(string.Format("ftp://{0}/{1}/{2}", "somevalue", remotefolderpath, remotefilename));
var request = (FtpWebRequest)FtpWebRequest.Create(uri);
request.Credentials = new NetworkCredential(userName, password);
request.Method = WebRequestMethods.Ftp.DownloadFile;
var ftpResponse = (FtpWebResponse)request.GetResponse();

/* Get the FTP Server's Response Stream */
ftpStream = ftpResponse.GetResponseStream();
return responseStream = new StreamReader(ftpStream);
For me there are two disadvantages to using the stream directly; if you can live with them, you shouldn't waste memory or disk space:
1. In this stream you cannot seek to a specific position; you can only read the contents as they come in.
2. Your internet connection could suddenly drop, and you would get an exception while parsing and processing your file; either split the parsing from the processing, or make sure your processing routine can handle a file being processed a second time (after a failure halfway through the first attempt).
To work around these issues, you could copy the stream to a MemoryStream:
var buffer = new byte[4096]; // buffer and bytesRead were left undeclared in the original snippet
int bytesRead;
using (var ftpStream = ftpResponse.GetResponseStream())
{
    var memoryStream = new MemoryStream();
    while ((bytesRead = ftpStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        memoryStream.Write(buffer, 0, bytesRead);
    }
    memoryStream.Flush();
    memoryStream.Position = 0;
    return memoryStream;
}
If you are working with larger files, I prefer writing to a temporary file; this way you minimize the memory footprint of your application:
var buffer = new byte[4096];
int bytesRead;
using (var ftpStream = ftpResponse.GetResponseStream())
{
    var fileStream = new FileStream(Path.GetTempFileName(), FileMode.CreateNew);
    while ((bytesRead = ftpStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        fileStream.Write(buffer, 0, bytesRead);
    }
    fileStream.Flush();
    fileStream.Position = 0;
    return fileStream;
}
I find it more practical to return a responseStream when you are performing an HttpWebRequest. If you are using FtpWebRequest, it means you are working with files. I would read the responseStream into a byte[] and return the byte content of the downloaded file, so you can easily work with the System.IO.File classes to handle the file.
Thanks Carlos, it was really helpful. I just return the byte[]:
byte[] buffer = new byte[16 * 1024];
using (MemoryStream ms = new MemoryStream())
{
    int read;
    while ((read = ftpStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        ms.Write(buffer, 0, read);
    }
    return ms.ToArray();
}
and used the byte[] in the method like this:
public async Task ParseReport(byte[] bytesRead)
{
    using (Stream stream = new MemoryStream(bytesRead))
    using (StreamReader reader = new StreamReader(stream))
    {
        string line = null;
        while (null != (line = reader.ReadLine()))
        {
            string[] values = line.Split(';');
        }
    }
}
I am trying to download a file to my computer and at the same time save it to a byte array:
try
{
    var req = (HttpWebRequest)HttpWebRequest.Create(url);
    var fileStream = new FileStream(filePath,
        FileMode.Create, FileAccess.Write, FileShare.Write);
    using (var resp = req.GetResponse())
    {
        using (var stream = resp.GetResponseStream())
        {
            byte[] buffer = new byte[0x10000];
            int len;
            while ((len = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                //Do with the content whatever you want
                // ***YOUR CODE***
                MemoryStream memoryStream = new MemoryStream();
                if (len > 0)
                {
                    memoryStream.Write(buffer, 0, len);
                    len = stream.Read(buffer, 0, buffer.Length);
                }
                file = memoryStream.ToArray();
                fileStream.Write(buffer, 0, len);
            }
        }
    }
    fileStream.Close();
}
catch (Exception exc) { }
And I noticed that this doesn't download the whole file.
I want to do this because I want to download a file and work with it at the same time.
Any idea why this problem happens?
There is a much easier way to get the file bytes, using the System.Net.WebClient class:
private static byte[] DownloadFile(string absoluteUrl)
{
    using (var client = new System.Net.WebClient())
    {
        return client.DownloadData(absoluteUrl);
    }
}
Usage:
var bytes = DownloadFile(absoluteUrl);
The problem looks to be double-reading: you are putting different things into the memory stream and the file stream. It should be more like:
// declare file/memory stream here
while ((len = stream.Read(buffer, 0, buffer.Length)) > 0)
{
    memoryStream.Write(buffer, 0, len);
    fileStream.Write(buffer, 0, len);
    // if you need to process "len" bytes, do it here
}
You might be able to drop "memoryStream" completely if you process the "len" bytes immediately. If the data fits in memory, it may be easier to just use WebClient.DownloadData and then File.WriteAllBytes.
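A minimal sketch of that "fits in memory" path, assuming url and filePath are defined by the caller:

byte[] data;
using (var client = new WebClient())
{
    data = client.DownloadData(url); // one read into memory
}
File.WriteAllBytes(filePath, data);  // persist the same bytes to disk
// "data" is now available for in-memory processing while the file copy exists on disk.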
Why am I missing bytes when reading from a WebClient stream as follows?
const int chuckDim = 80;
System.Net.WebClient client = new System.Net.WebClient();
Stream stream = client.OpenRead("http://media-cdn.tripadvisor.com/media/photo-s/01/70/3e/a9/needed-backup-lol.jpg");
//Stream stream = client.OpenRead("file:///C:/Users/Tanganello/Downloads/needed-backup-lol.jpg");

//searching file length
WebHeaderCollection whc = client.ResponseHeaders;
int totalLength = (Int32.Parse(whc["Content-Length"]));
byte[] buffer = new byte[totalLength];

//reading and writing
FileStream filestream = new FileStream("C:\\Users\\Tanganello\\Downloads\\clone1.jpg", FileMode.Create, FileAccess.ReadWrite);
int accumulator = 0;
while (accumulator + chuckDim < totalLength)
{
    stream.Read(buffer, accumulator, chuckDim);
    filestream.Write(buffer, accumulator, chuckDim);
    accumulator += chuckDim;
}
stream.Read(buffer, accumulator, totalLength - accumulator);
filestream.Write(buffer, accumulator, totalLength - accumulator);
stream.Close();
filestream.Flush();
filestream.Close();
This is what I get with the first stream:
http://img839.imageshack.us/img839/830/clone1h.jpg
The problem is that you are ignoring the return value of the Stream.Read method. From its documentation:

count: The maximum number of bytes to be read from the current stream.

Return value: The total number of bytes read into the buffer. This can be less than the number of bytes requested.
You can avoid the whole business of reading and writing streams by simply using the WebClient.DownloadFile method:
using (var client = new WebClient())
{
    client.DownloadFile(
        "http://media-cdn.tripadvisor.com/media/photo-s/01/70/3e/a9/needed-backup-lol.jpg",
        "C:\\Users\\Tanganello\\Downloads\\clone1.jpg");
}
Alternatively, if you really want to use streams, you can simply use the Stream.CopyTo method:
using (var client = new WebClient())
using (var stream = client.OpenRead("http://..."))
using (var file = File.OpenWrite("C:\\..."))
{
    stream.CopyTo(file);
}
If you insist on really copying the bytes yourself, the correct way to do this would be as follows:
using (var client = new WebClient())
using (var stream = client.OpenRead("http://..."))
using (var file = File.OpenWrite("C:\\..."))
{
    var buffer = new byte[512];
    int bytesReceived;
    while ((bytesReceived = stream.Read(buffer, 0, buffer.Length)) != 0)
    {
        file.Write(buffer, 0, bytesReceived);
    }
}