Resume downloading while using System.Net.ConnectStream - c#

I am downloading a file using System.Net.ConnectStream. To support pausing the download, I create a new connection when the user clicks start, as follows:
this.InputStream = CreateLink(this.URL);
In CreateLink I check whether the file has been updated on the server and return the corresponding stream for downloading.
I download the file in chunks of bytes:
InputStream.Read(buffer, offset, bytesToRead);
The problem is that it starts reading from the beginning and not from where it was paused. I am also unable to use this.InputStream.Position = CurrentPosition; to set the position of InputStream, since the stream is not seekable. The response does report 'Accept-Ranges' as 'bytes'.
So, how can I resume downloading from the paused position?
Update:
'this' refers to the instance of downloader as:
Downloader downloader = new Downloader();
HttpWebRequest request = (HttpWebRequest)GetRequest(path);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
downloader.InputStream = response.GetResponseStream();
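Since the server reports 'Accept-Ranges: bytes', one possible approach is to open the new connection with a Range header, so that the response stream itself starts at the paused offset and no seeking is needed. The following is only a sketch under stated assumptions: ResumeSketch, CreateResumeRequest and Resume are hypothetical names that are not part of the Downloader class above, the resume offset is assumed to be the length of the partially written local file, and the AddRange(long) overload assumes .NET Framework 4 or later.

```csharp
using System;
using System.IO;
using System.Net;

static class ResumeSketch
{
    // Hypothetical helper: builds a request for everything from resumeFrom to the end.
    public static HttpWebRequest CreateResumeRequest(string url, long resumeFrom)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AddRange(resumeFrom); // sends "Range: bytes=<resumeFrom>-"
        return request;
    }

    // Hypothetical helper: continues a download into localPath.
    public static void Resume(string url, string localPath)
    {
        // Assumption: the bytes already on disk are exactly the bytes downloaded so far.
        long resumeFrom = File.Exists(localPath) ? new FileInfo(localPath).Length : 0;
        var request = CreateResumeRequest(url, resumeFrom);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var input = response.GetResponseStream())
        {
            // 206 means the server honoured the range; 200 means it sent the whole file,
            // so the local file has to be rewritten from the start.
            bool partial = response.StatusCode == HttpStatusCode.PartialContent;
            var mode = partial ? FileMode.Append : FileMode.Create;

            using (var output = new FileStream(localPath, mode, FileAccess.Write))
            {
                var buffer = new byte[8192];
                int read;
                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    output.Write(buffer, 0, read);
                }
            }
        }
    }
}
```

If the file may have changed on the server between pause and resume (the check CreateLink performs), the partial file should be discarded rather than appended to.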

Related

downloading with web request

I want to get a link from a TextBox and download the file it points to.
Before downloading the file, I want to know its size in advance and create an empty file of that size, but I can't.
Another question: I want to show the download progress as a percentage. How can I know when data has been downloaded, so that I can update the percentage?
WebRequest request = WebRequest.Create(URL);
WebResponse response = request.GetResponse();
totalSize = request.ContentLength;//always is -1
using (FileStream f = new FileStream(savePath, FileMode.Create))
{
f.SetLength(totalSize);
}
System.IO.StreamReader reader = new
System.IO.StreamReader(response.GetResponseStream());
WebClient client = new WebClient();
client.DownloadFile (URL, savePath);
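The best way would be to use WebClient with its DownloadFileAsync method, which raises the DownloadProgressChanged and DownloadFileCompleted events.
Getting the size of the file in advance is a step harder, though. Note that request.ContentLength is the length of the request body, which is why it is always -1 here; the size of the file, when the server reports it, is in response.ContentLength.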
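As a rough illustration of the event-based approach, the sketch below subscribes to both events before starting the download. ProgressSketch, Percent and Download are hypothetical names introduced here, and the blocking wait at the end is only suitable for a console demo, not for a UI thread.

```csharp
using System;
using System.Net;
using System.Threading;

static class ProgressSketch
{
    // Percentage complete, or -1 when the server did not report a total length.
    public static int Percent(long received, long total)
    {
        if (total <= 0) return -1;
        return (int)(received * 100 / total);
    }

    // Hypothetical helper: downloads url to savePath and prints progress.
    public static void Download(string url, string savePath)
    {
        using (var client = new WebClient())
        using (var done = new ManualResetEvent(false))
        {
            // Raised repeatedly while data arrives.
            client.DownloadProgressChanged += (s, e) =>
                Console.WriteLine("{0}% ({1:N0} of {2:N0} bytes)",
                    Percent(e.BytesReceived, e.TotalBytesToReceive),
                    e.BytesReceived, e.TotalBytesToReceive);

            // Raised once, on success, failure or cancellation.
            client.DownloadFileCompleted += (s, e) =>
            {
                if (e.Error != null) Console.WriteLine("Failed: " + e.Error.Message);
                done.Set();
            };

            client.DownloadFileAsync(new Uri(url), savePath);

            // Console demo only: a UI application would return here instead of
            // blocking, and update a progress bar from the event handler.
            done.WaitOne();
        }
    }
}
```

TotalBytesToReceive is -1 when the server sends no Content-Length, which is why Percent returns -1 in that case instead of dividing.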

Asynchronous streaming of large files using ASP.Net Framework 2.0

I am working on an ASP.NET Framework 2.0 application. On a particular page I provide a link to the user. Clicking this link opens a window with another aspx page. That page sends an HTTP request to a third-party URL which points to a file (for example, a mirror URL for downloading a file from the cloud). The HTTP response is sent back to the user with Response.Write on the first page, where the user clicked the link.
The problem I am facing is that this works fine for small files, but if the file is large (more than 1 GB), my application waits until the whole file has been downloaded from the URL. I have tried using Response.Flush() to send the data to the user chunk by chunk, but the user still cannot use the application, because the worker process is busy reading the stream of data from the third-party URL.
Is there any way to download large files asynchronously, so that my pop-up window finishes executing while the download is still in progress and the user can do other things in the application in parallel?
Thanks,
Suvodeep
Use WebClient to read the remote file. Instead of downloading it to disk, take the Stream from the WebClient and, in a while loop, push the bytes from that stream into the Response stream. This way each chunk is uploaded to the client as soon as it has been downloaded, so downloading and uploading happen at the same time.
HttpRequest example:
private void WriteFileInDownloadDirectly()
{
    // Controls how many bytes to read at a time and send to the client
    int bytesToRead = 10000;
    // Buffer to read bytes in the chunk size specified above
    byte[] buffer = new byte[bytesToRead];
    HttpWebResponse fileResp = null;
    Stream stream = null;
    try
    {
        // Create a WebRequest to get the file
        HttpWebRequest fileReq = (HttpWebRequest)WebRequest.Create("Remote File URL");
        // Get the response for this request
        fileResp = (HttpWebResponse)fileReq.GetResponse();
        // Get the stream returned from the response
        stream = fileResp.GetResponseStream();
        // Prepare the response to the client. resp is the client Response
        var resp = HttpContext.Current.Response;
        // Send each chunk as it is written instead of buffering the whole file
        resp.BufferOutput = false;
        // Indicate the type of data being sent
        resp.ContentType = "application/octet-stream";
        // Name the file
        resp.AddHeader("Content-Disposition", $"attachment; filename=\"{ Path.GetFileName("Local File Path - can be fake") }\"");
        // The remote length is -1 when the server did not report one
        if (fileResp.ContentLength >= 0)
        {
            resp.AddHeader("Content-Length", fileResp.ContentLength.ToString());
        }
        int length;
        do
        {
            // Verify that the client is connected.
            if (resp.IsClientConnected)
            {
                // Read data into the buffer.
                length = stream.Read(buffer, 0, bytesToRead);
                // Write only the bytes actually read to the response's output stream
                resp.OutputStream.Write(buffer, 0, length);
                // Flush the data
                resp.Flush();
            }
            else
            {
                // Cancel the download if the client has disconnected
                length = -1;
            }
        } while (length > 0); // Repeat until no data is read
    }
    finally
    {
        // Close the input stream and the remote response
        if (stream != null) stream.Close();
        if (fileResp != null) fileResp.Close();
    }
}
WebClient Stream reading:
using (WebClient client = new WebClient())
{
Stream largeFileStream = client.OpenRead("My Address");
}

Download a PDF from a third party using ASP.NET HttpWebRequest/HttpWebResponse

I want to send a URL as a query string, e.g.
localhost/abc.aspx?url=http://www.site.com/report.pdf
and detect whether that URL returns a PDF file. If it returns a PDF, the file should be saved automatically; otherwise an error should be reported.
Some pages use a handler to serve their files, and I want to detect and download the PDF in that case as well:
localhost/abc.aspx?url=http://www.site.com/page.aspx?fileId=223344
The above may return a PDF file.
What is the best way to capture this?
Thanks
You can download a PDF like this
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(uri);
using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
{
    // Check the media type returned, ignoring parameters such as "; charset=..."
    string fileType = null;
    string contentType = response.ContentType;
    if (contentType != null)
    {
        fileType = contentType.Split(';')[0].Trim();
    }
    // See if it's a PDF
    if (string.Equals(fileType, "application/pdf", StringComparison.OrdinalIgnoreCase))
    {
        using (Stream stream = response.GetResponseStream())
        using (FileStream fileStream = File.Create(fileFullPath))
        {
            // The response stream is not seekable, so stream.Length is not available
            // and a single Read may return only part of the data.
            // Copy in chunks until Read returns 0.
            byte[] buffer = new byte[8192];
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                fileStream.Write(buffer, 0, read);
            }
        }
    }
}
The fact that it may come from an .aspx handler doesn't actually matter; it's the MIME type returned in the server response that is used.
If you are getting a generic MIME type, such as application/octet-stream, then you must use a more heuristic approach.
Assuming you cannot simply use the file extension (e.g. for .aspx), you can copy the file to a MemoryStream first (see How to get a MemoryStream from a Stream in .NET?). Once you have a memory stream of the file, you can take a 'cheeky' peek at it (I say cheeky because it's not the correct way to parse a PDF file).
I'm not an expert on the PDF format, but I believe reading the first 5 chars as ASCII will yield "%PDF-", so you can identify that with
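// Read the first five bytes directly. Wrapping memoryStream in a StreamReader
// inside a using block would close the stream when the reader is disposed.
byte[] header = new byte[5];
memoryStream.Position = 0;
int headerLength = memoryStream.Read(header, 0, header.Length);
bool isPDF = headerLength == 5 &&
    System.Text.Encoding.ASCII.GetString(header) == "%PDF-";
//set the memory stream back to the start so you can save the file
memoryStream.Position = 0;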

How to cancel large file download yet still get page source in C#?

I'm working in C# on a program to list all course resources for a MOOC (e.g. Coursera). I don't want to download the content, just get a listing of all the resources (e.g. pdf, videos, text files, sample files, etc...) which are made available to the course.
My problem lies in parsing the html source (currently using HtmlAgilityPack) without downloading all the content.
For example, if you go to this intro video for a banking course on Coursera and check the source (F12 in Chrome for Developer Tools), you can see the page source. I can stop the video download which autoplays, but still see the source.
How can I get the source in C# without downloading all the content?
I've looked in the HttpWebRequest headers (problem: time out), and DownloadDataAsync with Cancel (problem: the Completed Result object is invalid when cancelling the async request). I've also tried various Loads from HtmlAgilityPack but with no success.
Time out:
HttpWebRequest postRequest = (HttpWebRequest)WebRequest.Create(url);
postRequest.Timeout = TIMEOUT * 1000000; //Really long
postRequest.Referer = "https://www.coursera.org";
if (headers != null)
{
    // headers here
}
// Deal with cookies
if (cookie != null)
{
    cookieJar.Add(cookie);
}
postRequest.CookieContainer = cookieJar;
postRequest.Method = "GET";
postRequest.AllowAutoRedirect = allowRedirect;
postRequest.ServicePoint.Expect100Continue = true;
HttpWebResponse postResponse = (HttpWebResponse)postRequest.GetResponse();
Any tips on how to proceed?
There are at least two ways to do what you're asking. The first is to use a range GET. That is, specify the range of the file you want to read. You do that by calling AddRange on the HttpWebRequest. So if you want, say, the first 10 kilobytes of the file, you'd write:
request.AddRange(0, 10239);
Read carefully what the documentation says about the single-argument overload. A negative value such as AddRange(-10240) produces the header "Range: bytes=-10240", which HTTP defines as a suffix range, meaning the last 10,240 bytes of the file rather than the first. There are also other overloads of AddRange that you might be interested in.
Not all servers support range gets, though. If that doesn't work, you'll have to do it another way.
What you can do is call GetResponse and then start reading data. Once you've read as much data as you want, you can stop reading and close the stream. I've modified your sample slightly to show what I mean.
string url = "https://www.coursera.org/course/money";
HttpWebRequest postRequest = (HttpWebRequest)WebRequest.Create(url);
postRequest.Method = "GET";
postRequest.AllowAutoRedirect = true; //allowRedirect;
postRequest.ServicePoint.Expect100Continue = true;
HttpWebResponse postResponse = (HttpWebResponse) postRequest.GetResponse();
int maxBytes = 1024*1024;
int totalBytesRead = 0;
var buffer = new byte[maxBytes];
using (var s = postResponse.GetResponseStream())
{
    int bytesRead;
    // Read up to `maxBytes` bytes from the response. Each read appends after the
    // bytes already in the buffer and asks only for the space that is left,
    // so the total can never exceed `maxBytes`.
    while (totalBytesRead < maxBytes &&
           (bytesRead = s.Read(buffer, totalBytesRead, maxBytes - totalBytesRead)) != 0)
    {
        // Here you can copy the bytes read so far to a persistent buffer,
        // or write them to a file.
        Console.WriteLine("{0:N0} bytes read", bytesRead);
        totalBytesRead += bytesRead;
    }
}
Console.WriteLine("total bytes read = {0:N0}", totalBytesRead);
That said, I ran this sample and it downloaded about 6 kilobytes and stopped. I don't know why you're having trouble with timeouts or too much data.
Note that sometimes trying to close the stream before the entire response is read will cause the program to hang. I'm not sure why that happens at all, and I can't explain why it only happens sometimes. But you can solve it by calling request.Abort before closing the stream. That is:
using (var s = postResponse.GetResponseStream())
{
// do stuff here
// abort the request before continuing
postRequest.Abort();
}

Creating a Download Accelerator

I am referring to this article to understand file downloads using C#.
The code uses the traditional method of reading a Stream, like
while ((bytesSize = strResponse.Read(downBuffer, 0, downBuffer.Length)) > 0)
How can I divide a file to be downloaded into multiple segments, so that I can download separate segments in parallel and merge them?
using (WebClient wcDownload = new WebClient())
{
try
{
// Create a request to the file we are downloading
webRequest = (HttpWebRequest)WebRequest.Create(txtUrl.Text);
// Set default authentication for retrieving the file
webRequest.Credentials = CredentialCache.DefaultCredentials;
// Retrieve the response from the server
webResponse = (HttpWebResponse)webRequest.GetResponse();
// Ask the server for the file size and store it
Int64 fileSize = webResponse.ContentLength;
// Open the URL for download
strResponse = wcDownload.OpenRead(txtUrl.Text);
// Create a new file stream where we will be saving the data (local drive)
strLocal = new FileStream(txtPath.Text, FileMode.Create, FileAccess.Write, FileShare.None);
// It will store the current number of bytes we retrieved from the server
int bytesSize = 0;
// A buffer for storing and writing the data retrieved from the server
byte[] downBuffer = new byte[2048];
// Loop through the buffer until the buffer is empty
while ((bytesSize = strResponse.Read(downBuffer, 0, downBuffer.Length)) > 0)
{
// Write the data from the buffer to the local hard drive
strLocal.Write(downBuffer, 0, bytesSize);
// Invoke the method that updates the form's label and progress bar
this.Invoke(new UpdateProgessCallback(this.UpdateProgress), new object[] { strLocal.Length, fileSize });
}
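}
finally
{
// Close the streams and the response whether or not the download succeeded
if (strResponse != null) strResponse.Close();
if (strLocal != null) strLocal.Close();
if (webResponse != null) webResponse.Close();
}
}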
You need several threads to accomplish that.
First you start the first download thread, creating a WebClient and getting the file size. Then you can start several new threads, each of which adds a download range header to its request.
You need logic that keeps track of the downloaded parts and creates new download parts when one has finished.
http://msdn.microsoft.com/de-de/library/system.net.httpwebrequest.addrange.aspx
I noticed that the WebClient implementation sometimes behaves strangely, so I still recommend implementing your own HTTP client if you really want to write a "big" download program.
ps: thanks to user svick
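To make the segmenting step concrete, here is a minimal sketch of how the byte ranges might be computed and how one segment might be fetched. SegmentSketch, Split and DownloadSegment are hypothetical names, the file size is assumed to be known already (for example from webResponse.ContentLength), and the AddRange(long, long) overload assumes .NET Framework 4 or later. Thread management and retry logic are left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

static class SegmentSketch
{
    // Splits [0, fileSize) into at most `parts` contiguous ranges.
    // Key is the first byte of a range, Value is its last byte (inclusive).
    public static List<KeyValuePair<long, long>> Split(long fileSize, int parts)
    {
        if (fileSize <= 0) throw new ArgumentOutOfRangeException("fileSize");
        if (parts <= 0) throw new ArgumentOutOfRangeException("parts");

        var ranges = new List<KeyValuePair<long, long>>();
        long chunk = (fileSize + parts - 1) / parts; // ceiling division
        for (long start = 0; start < fileSize; start += chunk)
        {
            long end = Math.Min(start + chunk, fileSize) - 1;
            ranges.Add(new KeyValuePair<long, long>(start, end));
        }
        return ranges;
    }

    // Downloads one range and writes it at the matching offset of the output file.
    public static void DownloadSegment(string url, string path, long from, long to)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AddRange(from, to); // sends "Range: bytes=<from>-<to>"

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            if (response.StatusCode != HttpStatusCode.PartialContent)
                throw new InvalidOperationException("Server ignored the Range header.");

            using (var input = response.GetResponseStream())
            // FileShare.Write lets the other segment threads open the same file.
            using (var output = new FileStream(path, FileMode.OpenOrCreate,
                                               FileAccess.Write, FileShare.Write))
            {
                output.Seek(from, SeekOrigin.Begin);
                var buffer = new byte[8192];
                int read;
                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    output.Write(buffer, 0, read);
                }
            }
        }
    }
}
```

Each range returned by Split can be handed to its own thread, which calls DownloadSegment with range.Key and range.Value. Because every segment is written at its own offset in the same file, no separate merge step is needed in this variant.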
