I want to download a .torrent file for a Linux distro, but for some reason the file downloaded by my app is different from the one downloaded manually. The one my app downloads is 31KB and is an invalid .torrent file, while the right one (when I download it manually) is 41KB and is valid.
The URL of the file I want to download is http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent
Why is this happening, and how can I download the same file (the valid one, at 41KB)?
Thanks.
C# code for the method that downloads the file above:
string sLinkTorCache = @"http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent";
using (System.Net.WebClient wc = new System.Net.WebClient())
{
var path = @"D:\Baixar automaticamente"; // HACK: get this from settings in the final version
var data = Helper.Retry(() => wc.DownloadData(sLinkTorCache), TimeSpan.FromSeconds(3), 5);
string fileName = null;
// Try to extract the filename from the Content-Disposition header
if (!string.IsNullOrEmpty(wc.ResponseHeaders["Content-Disposition"]))
{
fileName = wc.ResponseHeaders["Content-Disposition"].Substring(wc.ResponseHeaders["Content-Disposition"].IndexOf("filename=") + "filename=".Length).Replace("\"", "");
}
var torrentPath = Path.Combine(path, fileName ?? "Arch Linux Distro");
if (File.Exists(torrentPath))
{
File.Delete(torrentPath);
}
Helper.Retry(() => wc.DownloadFile(new Uri(sLinkTorCache), torrentPath), TimeSpan.FromSeconds(3), 5);
}
Helper.Retry (tries to execute the method again in case of HTTP exceptions):
public static void Retry(Action action, TimeSpan retryInterval, int retryCount = 3)
{
Retry<object>(() =>
{
action();
return null;
}, retryInterval, retryCount);
}
public static T Retry<T>(Func<T> action, TimeSpan retryInterval, int retryCount = 3)
{
var exceptions = new List<Exception>();
for (int retry = 0; retry < retryCount; retry++)
{
try
{
if (retry > 0)
System.Threading.Thread.Sleep(retryInterval); // TODO: add the using for the thread
return action();
}
catch (Exception ex)
{
exceptions.Add(ex);
}
}
throw new AggregateException(exceptions);
}
I initially thought the site was responding with junk if it thought the request came from a bot (that is, it was checking some of the headers). After having a look with Fiddler, it appears the data returned is exactly the same for both a web browser and the code. Which means we're not properly decompressing the response. It's very common for web servers to compress the data (using something like gzip), and WebClient does not automatically decompress it.
Using the answer from Automatically decompress gzip response via WebClient.DownloadData - I managed to get it to work properly.
Also note that you're downloading the file twice. You don't need to do that.
Working code:
//Taken from above linked question
class MyWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = base.GetWebRequest(address) as HttpWebRequest;
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
return request;
}
}
And using it:
string sLinkTorCache = @"http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent";
using (var wc = new MyWebClient())
{
var path = @"C:\Junk";
var data = Helper.Retry(() => wc.DownloadData(sLinkTorCache), TimeSpan.FromSeconds(3), 5);
string fileName = null; // could be extracted from the Content-Disposition header as in the question
var torrentPath = Path.Combine(path, fileName ?? "Arch Linux Distro.torrent");
if (File.Exists(torrentPath))
File.Delete(torrentPath);
File.WriteAllBytes(torrentPath, data);
}
Related
I am trying to use HttpClient with PutAsync to send a file to a server. The function looks like:
public async Task SendCsvFile(string path,string apiKey)
{
try
{
string clientKey = "";
LoggerService.Logger.CreateLog("Log");
LoggerService.Logger.Info("Start:SendCsvFile");
FileStream fileStream = null;
HttpClientHandler clientHandler = new HttpClientHandler();
clientHandler.ServerCertificateCustomValidationCallback = (sender, cert, chain, sslPolicyErrors) => { return true; };
HttpClient httpClient = new HttpClient(clientHandler);
httpClient.DefaultRequestHeaders.Accept.Clear();
httpClient.DefaultRequestHeaders.Add("x-api-key", clientKey);
string url = "https://test.csv";
var content = new MultipartFormDataContent();
var fileName = Path.GetFileName(path);
fileStream = File.OpenRead(path);
StreamContent streamContent = new StreamContent(fileStream);
content.Add(streamContent, fileName, fileName);
var response = await httpClient.PutAsync(url, content);
if (response.IsSuccessStatusCode == true)
{
LoggerService.Logger.Info("File sent correctly.");
}
else
{
LoggerService.Logger.Error("Error during sending." + response.StatusCode + ";" + response.ReasonPhrase + ";");
}
fileStream.Close();
LoggerService.Logger.Info("End:SendCsvFile");
}
catch (Exception ex)
{
LoggerService.Logger.Error(ex.ToString());
//return 0;
}
//return 1;
}
The file is sent fine and it works; however, a Content-Disposition header is added as the first line of the file, and the client doesn't want that. It's the first time I'm doing anything with services, and I've read through a lot, but I still don't know what I can change so the content of the CSV file isn't altered.
EDIT:
After I send the file, the header is added to the content, so the file looks like this:
(screenshot from the client's server)
All the data is fine, but the client's server processes the data expecting it to start with the column names. So my question really is: what can I change to omit that first line, and is it even possible? Maybe it's something obvious, but I'm just a newbie at this.
Changing to MultipartContent and clearing the headers almost worked, but it still left the boundary visible in the file. Eventually I switched to RestSharp, and adding the content this way got rid of the problem.
public async Task SendCsvFile(string path, string apiKey)
{
try
{
string clientKey = "";
string url = "";
LoggerService.Logger.CreateLog("CreationAndDispatchStatesWithPrices");
LoggerService.Logger.Info("Start:SendCsvFile");
var client = new RestClient(url);
// client.Timeout = -1;
var request = new RestRequest();
request.AddHeader("x-api-key", clientKey);
request.AddHeader("Content-Type", "text/csv");
request.AddParameter("text/csv", File.ReadAllBytes(path), ParameterType.RequestBody);
RestResponse response = client.Put(request);
if (response.IsSuccessful == true)
{
LoggerService.Logger.Info("File sent correctly.");
}
else
{
LoggerService.Logger.Error("Error sending file." + response.StatusCode + ";" + response.ErrorMessage + ";");
}
LoggerService.Logger.Info("End:SendCsvFile");
}
catch (Exception ex)
{
LoggerService.Logger.Error(ex.ToString());
//return 0;
}
//return 1;
}
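For reference, the same upload can also be done with HttpClient alone by sending the file bytes as the raw request body instead of a multipart form part, so no boundary or Content-Disposition lines end up inside the CSV. This is only a sketch, assuming the endpoint accepts a plain text/csv body (url and clientKey are placeholders, as above):
public async Task SendCsvFileRaw(string path, string apiKey)
{
    string clientKey = ""; // placeholder, as in the original code
    string url = "";       // placeholder endpoint
    using (var httpClient = new HttpClient())
    {
        httpClient.DefaultRequestHeaders.Add("x-api-key", clientKey);

        // The whole file becomes the request body; nothing is prepended to it.
        var content = new ByteArrayContent(File.ReadAllBytes(path));
        content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("text/csv");

        var response = await httpClient.PutAsync(url, content);
        if (!response.IsSuccessStatusCode)
        {
            LoggerService.Logger.Error("Error sending file." + response.StatusCode + ";" + response.ReasonPhrase + ";");
        }
    }
}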
I'm trying to get the media file from an incoming WhatsApp message. For that, I tried the example shared by Twilio on GitHub.
Here is my code snippet:
//-----------------------------------------------------------------
[HttpPost]
public TwiMLResult Index(SmsRequest incomingMessage, int numMedia)
{
MessagingResponse messagingResponse = new MessagingResponse();
if (numMedia>0)
{
GetMediaFilesAsync(numMedia,incomingMessage).GetAwaiter().GetResult();
messagingResponse.Append(new Twilio.TwiML.Messaging.Message().Body("Media received"));
return TwiML(messagingResponse);
}
// first authorize incoming message
TwilioClient.Init(accountSid, authToken);
messagingResponse = GetResponseMsg(incomingMessage);
return TwiML(messagingResponse);
}
private async Task GetMediaFilesAsync(int numMedia, SmsRequest incomingMessage)
{
try
{
for (var i = 0; i < numMedia; i++)
{
var mediaUrl = Request.Form[$"MediaUrl{i}"];
Trace.WriteLine(mediaUrl);
var contentType = Request.Form[$"MediaContentType{i}"];
var filePath = GetMediaFileName(mediaUrl, contentType);
await DownloadUrlToFileAsync(mediaUrl, filePath);
}
}
catch (Exception ex)
{
}
}
private string GetMediaFileName(string mediaUrl,string contentType)
{
string SavePath = "~/App_Data/";
return Server.MapPath(
// e.g. ~/App_Data/MExxxx.jpg
SavePath +
System.IO.Path.GetFileName(mediaUrl) +
GetDefaultExtension(contentType)
);
}
private static async Task DownloadUrlToFileAsync(string mediaUrl,string filePath)
{
using (var client = new HttpClient())
{
var response = await client.GetAsync(mediaUrl);
var httpStream = await response.Content.ReadAsStreamAsync();
using (var fileStream = System.IO.File.Create(filePath))
{
await httpStream.CopyToAsync(fileStream);
await fileStream.FlushAsync();
}
}
}
public static string GetDefaultExtension(string mimeType)
{
// NOTE: This implementation is Windows specific (uses Registry)
var key = Registry.ClassesRoot.OpenSubKey(
#"MIME\Database\Content Type\" + mimeType, false);
var ext = key?.GetValue("Extension", null)?.ToString();
return ext ?? "application/octet-stream";
}
//----------------------------------------------------------------------
But it's not working.
For a normal text message it works well, but not for media; I tried it by sending a .jpg file.
I checked the Twilio debugger, but I'm unable to understand what I missed.
This is what I receive:
sourceComponent "14100"
httpResponse "502"
url "https://myUrl.com/WAResponse/index"
ErrorCode "11200"
LogLevel "ERROR"
Msg "Bad Gateway"
EmailNotification "false"
Please let me know if I need to make changes in my code to receive the media.
Thank you!
After getting details from Twilio support, I found that the current code is fine. I made one small change and made the action method async, and now it works:
public async Task<TwiMLResult> Index(SmsRequest incomingMessage, int numMedia)
You may also need to grant access permission to the directory, if required.
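A minimal sketch of how the async action method can look, based on the original code above (only the signature and the await change):
[HttpPost]
public async Task<TwiMLResult> Index(SmsRequest incomingMessage, int numMedia)
{
    MessagingResponse messagingResponse = new MessagingResponse();
    if (numMedia > 0)
    {
        // Await the download instead of blocking on GetAwaiter().GetResult()
        await GetMediaFilesAsync(numMedia, incomingMessage);
        messagingResponse.Append(new Twilio.TwiML.Messaging.Message().Body("Media received"));
        return TwiML(messagingResponse);
    }
    // first authorize incoming message
    TwilioClient.Init(accountSid, authToken);
    messagingResponse = GetResponseMsg(incomingMessage);
    return TwiML(messagingResponse);
}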
I'm downloading a zip file from a URL, and when I try to extract it manually just to check it has come through correctly, it shows as empty and won't let me.
try {
using (var client = new WebClient())
{
client.DownloadFile("url", "C:/1.zip");
}
} catch(Exception e) {
Debug.WriteLine(e + "DDDD");
}
Also, how would I programmatically extract it so I can go into the contents of the file and extract more things? What is the simplest way?
You can extract your zip file using the code below.
System.IO.Compression.ZipFile.ExtractToDirectory(@"C:/1.zip", @"c:\example\extract");
Do not forget to add a reference to the System.IO.Compression.FileSystem assembly.
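If you also want to open the archive and pull out individual entries programmatically, ZipArchive (same System.IO.Compression namespace) works; a minimal sketch, reusing the paths from above (the .txt filter is just an example):
using (ZipArchive archive = ZipFile.OpenRead(@"C:/1.zip"))
{
    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        // Pick out only the entries you care about, e.g. text files
        if (entry.FullName.EndsWith(".txt", StringComparison.OrdinalIgnoreCase))
        {
            entry.ExtractToFile(Path.Combine(@"c:\example\extract", entry.Name), overwrite: true);
        }
    }
}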
Try using the Async method and the events that go with it.
Something like this:
void Foo() {
var webClient = new WebClient();
var totalBytes = 0L;
var destFile = new FileInfo(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "platform-tools-latest-windows.zip"));
webClient.DownloadProgressChanged += (s, e) => Debug.WriteLine($"Download progress changed: { e.ProgressPercentage }% ({ e.BytesReceived } / { (totalBytes = e.TotalBytesToReceive) })");
webClient.DownloadFileCompleted += (s, e) => {
destFile.Refresh();
if (destFile.Length != totalBytes) {
// Handle error
} else {
// Do nothing?
}
};
webClient.DownloadFileAsync(new Uri("https://dl.google.com/android/repository/platform-tools-latest-windows.zip"), destFile.FullName);
}
Give that a try and see if it works with your zip
EDIT:
If the above code doesn't work, there are a few more possibilities worth trying.
1: Try appending while (webClient.IsBusy); to the end of the above method, to force the running thread to wait until the WebClient has finished downloading
2: Try downloading the raw data (byte[]) first, then flushing the buffer to the file.
NOTE: ONLY DO THIS FOR SMALL(er) FILES!
public void DownloadFoo() {
var webClient = new WebClient();
var totalBytes = 0L;
var destFile = new FileInfo(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "platform-tools-latest-windows.zip"));
webClient.DownloadProgressChanged += (s, e) => Debug.WriteLine($"Download progress changed: { e.ProgressPercentage }% ({ e.BytesReceived } / { (totalBytes = e.TotalBytesToReceive) })");
using (webClient) {
var buffer = webClient.DownloadData(new Uri("https://dl.google.com/android/repository/platform-tools-latest-windows.zip"));
using (var oStream = destFile.Open(FileMode.Create)) { // Create overwrites any existing file; Truncate would fail if it doesn't exist
oStream.Write(buffer, 0, buffer.Length);
oStream.Flush(true); // true => flushToDisk
}
}
// webClient is automatically disposed of; method will return cleanly
}
I checked the URL through HTTP Headers in Firefox and found this URL was going through some sort of API before giving the zip file. The URL had parameters passed through as arguments.
I installed RestSharp and then did this:
var client = new RestClient(URL);
var request = new RestRequest("&user=bam&pass=boom", Method.GET);
var queryResult = client.Execute(request);
string zipPath = @"C:/Temp/zippy.zip";
File.WriteAllBytes(zipPath, client.DownloadData(request));
I need to download about 2 million files from the SEC website. Each file has a unique URL and is on average 10 kB. This is my current implementation:
List<string> urls = new List<string>();
// ... initialize urls ...
WebBrowser browser = new WebBrowser();
foreach (string url in urls)
{
browser.Navigate(url);
while (browser.ReadyState != WebBrowserReadyState.Complete) Application.DoEvents();
StreamReader sr = new StreamReader(browser.DocumentStream);
StreamWriter sw = new StreamWriter(url.Substring(url.LastIndexOf('/') + 1));
sw.Write(sr.ReadToEnd());
sr.Close();
sw.Close();
}
The projected time is about 12 days... is there a faster way?
Edit: BTW, the local file handling takes only 7% of the time.
Edit: this is my final implementation:
void Main(void)
{
ServicePointManager.DefaultConnectionLimit = 10000;
List<string> urls = new List<string>();
// ... initialize urls ...
int retries = urls.AsParallel().WithDegreeOfParallelism(8).Sum(arg => downloadFile(arg));
}
public int downloadFile(string url)
{
int retries = 0;
retry:
try
{
HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(url);
webrequest.Timeout = 10000;
webrequest.ReadWriteTimeout = 10000;
webrequest.Proxy = null;
webrequest.KeepAlive = false;
using (var webresponse = (HttpWebResponse)webrequest.GetResponse())
using (Stream sr = webresponse.GetResponseStream())
using (FileStream sw = File.Create(url.Substring(url.LastIndexOf('/') + 1)))
{
sr.CopyTo(sw);
}
}
catch (Exception ee)
{
if (ee.Message != "The remote server returned an error: (404) Not Found." && ee.Message != "The remote server returned an error: (403) Forbidden.")
{
if (ee.Message.StartsWith("The operation has timed out") || ee.Message == "Unable to connect to the remote server" || ee.Message.StartsWith("The request was aborted: ") || ee.Message.StartsWith("Unable to read data from the transport connection: ") || ee.Message == "The remote server returned an error: (408) Request Timeout.") retries++;
else MessageBox.Show(ee.Message, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
goto retry;
}
}
return retries;
}
Execute the downloads concurrently instead of sequentially, and set a sensible MaxDegreeOfParallelism; otherwise you will try to make too many simultaneous requests, which will look like a DoS attack:
public static void Main(string[] args)
{
var urls = new List<string>();
Parallel.ForEach(
urls,
new ParallelOptions{MaxDegreeOfParallelism = 10},
DownloadFile);
}
public static void DownloadFile(string url)
{
using(var sr = new StreamReader(HttpWebRequest.Create(url)
.GetResponse().GetResponseStream()))
using(var sw = new StreamWriter(url.Substring(url.LastIndexOf('/'))))
{
sw.Write(sr.ReadToEnd());
}
}
Download the files in several threads. The number of threads depends on your throughput. Also, look at the WebClient and HttpWebRequest classes. A simple sample:
var list = new[]
{
"http://google.com",
"http://yahoo.com",
"http://stackoverflow.com"
};
Parallel.ForEach(list,
s =>
{
using (var client = new WebClient())
{
Console.WriteLine($"starting to download {s}");
string result = client.DownloadString((string)s);
Console.WriteLine($"finished downloading {s}");
}
});
I'd use several threads in parallel, with a WebClient. I recommend setting the max degree of parallelism to the number of threads you want, since an unspecified degree of parallelism doesn't work well for long-running tasks. I've used 50 parallel downloads in one of my projects without a problem, but depending on the speed of an individual download, a much lower number might be sufficient.
If you download multiple files in parallel from the same server, you're by default limited to a small number (2 or 4) of parallel downloads. While the HTTP standard specifies such a low limit, many servers don't enforce it. Use ServicePointManager.DefaultConnectionLimit = 10000; to increase the limit.
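A minimal sketch of that approach (the limit of 50 parallel downloads and the target folder are assumptions; tune them to your bandwidth):
ServicePointManager.DefaultConnectionLimit = 10000; // raise the per-host connection limit

var urls = new List<string>(); // ... initialize urls ...
Parallel.ForEach(
    urls,
    new ParallelOptions { MaxDegreeOfParallelism = 50 },
    url =>
    {
        using (var client = new WebClient())
        {
            // Save each file under its own name in the target folder
            var fileName = url.Substring(url.LastIndexOf('/') + 1);
            client.DownloadFile(url, Path.Combine(@"D:\sec-files", fileName));
        }
    });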
I think the code from o17t H1H' S'k is right, but to perform I/O-bound tasks an async method should be used.
Like this:
public static async Task DownloadFileAsync(HttpClient httpClient, string url, string fileToWriteTo)
{
using HttpResponseMessage response = await httpClient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
using Stream streamToReadFrom = await response.Content.ReadAsStreamAsync();
using Stream streamToWriteTo = File.Open(fileToWriteTo, FileMode.Create);
await streamToReadFrom.CopyToAsync(streamToWriteTo);
}
Parallel.ForEach is also available as Parallel.ForEachAsync. Parallel.ForEach has a lot of features that the async version doesn't have, but most of them don't matter here. You can also implement a producer/consumer system with Channel or BlockingCollection to handle the 2 million files, but only if you don't know all the URLs at the start (see the sketch after the example below).
private static async Task StartDownloadAsync()
{
(string, string)[] urls = new ValueTuple<string, string>[]{
new ("https://dotnet.microsoft.com", "C:/YoureFile.html"),
new ( "https://www.microsoft.com", "C:/YoureFile1.html"),
new ( "https://stackoverflow.com", "C:/YoureFile2.html")};
var httpClient = new HttpClient();
ParallelOptions options = new() { MaxDegreeOfParallelism = 2 };
await Parallel.ForEachAsync(urls, options, async (url, token) =>
{
await DownloadFileAsync(httpClient, url.Item1, url.Item2);
});
}
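A rough sketch of the Channel-based producer/consumer mentioned above, assuming the URL/file pairs arrive from some discoveredUrls source while downloads are already running (DownloadFileAsync is the helper shown earlier; the worker count of 8 is an arbitrary choice; requires System.Threading.Channels and System.Linq):
private static async Task DownloadAllAsync(IAsyncEnumerable<(string Url, string File)> discoveredUrls)
{
    var channel = Channel.CreateBounded<(string Url, string File)>(1000);
    var httpClient = new HttpClient();

    // Producer: push work items into the channel as they are discovered.
    var producer = Task.Run(async () =>
    {
        await foreach (var item in discoveredUrls)
            await channel.Writer.WriteAsync(item);
        channel.Writer.Complete();
    });

    // Consumers: a fixed number of workers drain the channel concurrently.
    var consumers = Enumerable.Range(0, 8).Select(_ => Task.Run(async () =>
    {
        await foreach (var item in channel.Reader.ReadAllAsync())
            await DownloadFileAsync(httpClient, item.Url, item.File);
    })).ToArray();

    await producer;
    await Task.WhenAll(consumers);
}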
Also look into this NuGet package. The GitHub wiki gives examples of how to use it. For downloading 2 million files this is a good library, and it also has a retry function. To download a file, you only have to create an instance of LoadRequest and it downloads the file, under its own name, into the Downloads directory.
private static void StartDownload()
{
string[] urls = new string[]{
"https://dotnet.microsoft.com",
"https://www.microsoft.com",
" https://stackoverflow.com"};
foreach (string url in urls)
new LoadRequest(url).Start();
}
I hope this helps to improve the code.
I'm trying to download a file from a server with FileWebRequest, but I get an error.
The download method is here:
public string HttpFileGetReq(Uri uri, int reqTimeout, Encoding encoding)
{
try
{
string stringResponse;
var req = (FileWebRequest)WebRequest.Create(uri);
req.Timeout = reqTimeout;
req.Method = WebRequestMethods.File.DownloadFile;
var res = (FileWebResponse)req.GetResponse();
//using (var receiveStream = res.GetResponseStream())
//using (var readStream = new StreamReader(receiveStream,encoding))
//{
// stringResponse = readStream.ReadToEnd();
//}
return stringResponse="0K";
}
catch (WebException webException)
{
throw webException;
}
}
Usage is here:
public dynamic LoadRoomMsg(IAccount account, string roomId)
{
try
{
string uri = string.Format("http://www-pokec.azet.sk/_s/chat/nacitajPrispevky.php?{0}&lok={1}&lastMsg=0&pub=0&prv=0&r=1295633087203&changeroom=1" , account.SessionId, roomId);
var htmlStringResult = HttpFileGetReq(new Uri(uri), ReqTimeout, EncodingType);
//var htmlStringResult = _httpReq.HttpGetReq(new Uri(string.Format("{0}{1}?{2}&lok=", PokecUrl.RoomMsg,account.SessionId,roomId)),
// ReqTimeout, account.Cookies, EncodingType);
if (!string.IsNullOrEmpty(htmlStringResult))
{
return true;
}
return false;
}
catch (Exception exception)
{
throw exception;
}
}
The URL of the file is here.
I would just like to read this file into a string variable, that's all. If anyone has some time and can help me, I would be very grateful.
Your URL (http://...) will produce an HttpWebRequest. You can check this with the debugger.
From MSDN:
The FileWebRequest class implements the WebRequest abstract base class for Uniform Resource Identifiers (URIs) that use the file:// scheme to request local files.
Note the file:// and local files in there.
Tip: Just use the WebClient class.
Rather than implementing your own web streams, let the .NET Framework do it all for you with WebClient, for example:
string uri = string.Format(
"http://www-pokec.azet.sk/_s/chat/nacitajPrispevky.php?{0}&lok={1}&lastMsg=0&pub=0&prv=0&r=1295633087203&changeroom=1",
account.SessionId,
roomId);
System.Net.WebClient wc = new System.Net.WebClient();
string webData = wc.DownloadString(uri);
// ...parse the webData response here...
Looking at the response from the URL you posted:
{"reason":0}
Parsing that should be a simple task with a little string manipulation.
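For example, assuming the response always keeps that exact {"reason":0} shape, the webData string from the snippet above can be parsed with:
int reason = int.Parse(webData.Split(':')[1].TrimEnd('}')); // {"reason":0} -> 0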
Change FileWebRequest and FileWebResponse to HttpWebRequest and HttpWebResponse.
It doesn't matter that what you're downloading may be a file; as far as the .NET Framework is concerned, you're just retrieving a page from a website.
FileWebRequest is for file:// protocols. Since you're using an http:// url, you want to use HttpWebRequest.
public string HttpFileGetReq(Uri uri, int reqTimeout, Encoding encoding)
{
string stringResponse;
var req = (HttpWebRequest)WebRequest.Create(uri);
req.Timeout = reqTimeout;
var res = (HttpWebResponse)req.GetResponse();
using (var receiveStream = res.GetResponseStream())
{
using (var readStream = new StreamReader(receiveStream,encoding))
{
return readStream.ReadToEnd();
}
}
}