I have been working on this for a week. I have done a lot of searching and a lot of tests for different methods.
When I use HttpClient to download a file, no errors are generated but the files do not show up in the folder until after the program exits. I have a synchronous method (before someone asks - no I cannot change it to asynchronous) that calls an asynchronous method to download language files for my Tesseract OCR (I test with language == "eng"):
if (!Directory.Exists(folderName))
Directory.CreateDirectory(folderName);
Task.Run(async () => await HelperMethods.LoadLanguage(folderName, language));
Task.Run(async () => await HelperMethods.LoadLanguage(folderName, "osd"));
And here is the method that is being awaited:
public static async Task LoadLanguage(string folderName, string language)
{
string dest = Path.GetFullPath(Path.Combine(folderName, $"{language}.traineddata"));
if (!File.Exists(dest))
{
// Now we know that we need network - start it up if it isn't already.
if (httpClient == null)
httpClient = new HttpClient();
Uri uri = new Uri($"https://github.com/tesseract-ocr/tessdata/raw/main/{language}.traineddata");
HttpResponseMessage response = await httpClient.GetAsync(uri);
using (FileStream fs = new FileStream(dest, FileMode.Create, FileAccess.Write))
{
await response.Content.CopyToAsync(fs);
await fs.FlushAsync();
fs.Close();
}
}
}
I added the Flush and Close as part of the testing, but it did not make a difference.
This is supposed to download the language files and allow the next lines to perform an OCR using those languages. The files are downloaded and are written to the folder (only show up) after the program exits.
If I run the program a second time, it works - because it does not need to download new files.
How do I get this to download the files and save them immediately so the files can be used in subsequent operations?
I also tried this method:
var request = new HttpRequestMessage(HttpMethod.Get, uri);
var sendTask = httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
var response = sendTask.Result.EnsureSuccessStatusCode();
var httpStream = await response.Content.ReadAsStreamAsync();
using (var fileStream = File.Create(dest))
{
using (var reader = new StreamReader(httpStream))
{
httpStream.CopyTo(fileStream);
fileStream.Flush();
}
}
No difference. And quite a few other methods of working with the stream.
This is .NET framework 4.8 (not able to update to .NET6).
Your post states
I have a synchronous method [...] that calls an asynchronous method to download language files
If the caller is synchronous anyway, why not make the downloader synchronous, too?
public void LoadLanguage(string folderName, string language)
{
Enabled = false;
try
{
Uri uri = new Uri($"https://github.com/tesseract-ocr/tessdata/raw/main/{language}.traineddata");
using (var client = new HttpClient())
{
using (var response =
client
.GetAsync(uri)
.GetAwaiter()
.GetResult())
{
var bytes =
response
.Content
.ReadAsByteArrayAsync()
.GetAwaiter()
.GetResult();
File.WriteAllBytes(
Path.Combine(
folderName,
$"{language}.traineddata"),
bytes
);
}
}
}
finally
{
Enabled = true;
}
}
I tested this and it seems to work. Does this get you any closer?
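If LoadLanguage has to stay asynchronous, another option is to block on the tasks in the synchronous caller. In the question, the two Task.Run calls are never waited on, so the caller moves on to the OCR step while the downloads are still running. The sketch below only illustrates the waiting pattern: LoadLanguageAsync is a hypothetical stand-in that writes a placeholder file after a delay, and EnsureLanguages is an assumed name for the synchronous caller.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class BlockingCallDemo
{
    // Hypothetical stand-in for HelperMethods.LoadLanguage: it only simulates
    // a slow download and then writes a placeholder file.
    public static async Task LoadLanguageAsync(string folderName, string language)
    {
        await Task.Delay(200); // pretend network latency
        string dest = Path.Combine(folderName, language + ".traineddata");
        File.WriteAllText(dest, "placeholder");
    }

    // Synchronous caller (assumed name) that does not return until both files exist.
    public static void EnsureLanguages(string folderName, string language)
    {
        Directory.CreateDirectory(folderName); // no-op if the folder already exists

        // Task.Run starts the work on the thread pool; GetResult() blocks the
        // calling thread until both tasks finish and rethrows the first failure.
        Task.WhenAll(
            Task.Run(() => LoadLanguageAsync(folderName, language)),
            Task.Run(() => LoadLanguageAsync(folderName, "osd"))
        ).GetAwaiter().GetResult();
    }
}
```

Calling BlockingCallDemo.EnsureLanguages(folderName, "eng") returns only after both files have been written. The calling thread is blocked for the duration of the downloads, which is the price of keeping the caller synchronous.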
Related
I'm working on a Windows client for uploading a lot of small files over HTTP POST requests.
I’m using .NET 4.5.2
public async void Upload3(HttpClient client, string url, string[] files)
{
foreach (var file in files)
{
using (var stream = new FileStream(file, FileMode.Open))
{
FileInfo info = new FileInfo(file);
HttpContent fileStreamContent = new StreamContent(stream);
using (var content = new MultipartFormDataContent())
{
content.Add(fileStreamContent);
var response = await client.PostAsync(url, content);
response.EnsureSuccessStatusCode();
//code is stopping at the following line:
string finalresults = await response.Content.ReadAsStringAsync();
Console.WriteLine(finalresults);
Console.WriteLine(" > Uploaded file " + info.Name);
}
stream.Close();
}
}
Console.WriteLine("> Uploaded all files");
}
The code works fine for the very first file, but none of the other files are uploaded. When I debug the code step by step, execution stops (in the second iteration of the loop) on this line:
string finalresults = await response.Content.ReadAsStringAsync();
Since the server log shows only a single request, I think the error already occurs on this line:
var response = await client.PostAsync(url, content);
Even if I use different HttpClient objects and different FileStream objects, the upload is only working for the first file.
What is wrong with this code?
For your requirement, you can use a third-party library such as RestSharp. It has plenty of examples and good documentation, and it is easy to use.
My program uses HttpClient to send a GET request to a Web API, and this returns a file.
I now use this code (simplified) to store the file to disc:
public async Task<bool> DownloadFile()
{
var client = new HttpClient();
var uri = new Uri("http://somedomain.com/path");
var response = await client.GetAsync(uri);
if (response.IsSuccessStatusCode)
{
var fileName = response.Content.Headers.ContentDisposition.FileName;
using (var fs = new FileStream(@"C:\test\" + fileName, FileMode.Create, FileAccess.Write, FileShare.None))
{
await response.Content.CopyToAsync(fs);
return true;
}
}
return false;
}
Now, when this code runs, the process loads all of the file into memory. I actually would rather expect the stream gets streamed from the HttpResponseMessage.Content to the FileStream, so that only a small portion of it is held in memory.
We are planning to use that on large files (> 1GB), so is there a way to achieve that without having all of the file in memory?
Ideally without manually looping through reading a portion to a byte[] and writing that portion to the file stream until all of the content is written?
It looks like this is by-design - if you check the documentation for HttpClient.GetAsync() you'll see it says:
The returned task object will complete after the whole response
(including content) is read
You can instead use HttpClient.GetStreamAsync() which specifically states:
This method does not buffer the stream.
However, as far as I can see, you then don't get access to the headers in the response. Since that's presumably a requirement (you're getting the file name from the headers), you may want to use HttpWebRequest instead, which lets you get the response details (headers etc.) without reading the whole response into memory. Something like:
public async Task<bool> DownloadFile()
{
var uri = new Uri("http://somedomain.com/path");
var request = WebRequest.CreateHttp(uri);
var response = await request.GetResponseAsync();
ContentDispositionHeaderValue contentDisposition;
var fileName = ContentDispositionHeaderValue.TryParse(response.Headers["Content-Disposition"], out contentDisposition)
? contentDisposition.FileName
: "noname.dat";
using (var fs = new FileStream(@"C:\test\" + fileName, FileMode.Create, FileAccess.Write, FileShare.None))
{
await response.GetResponseStream().CopyToAsync(fs);
}
return true;
}
Note that if the request returns an unsuccessful response code an exception will be thrown, so you may wish to wrap in a try..catch and return false in this case as in your original example.
Instead of GetAsync(Uri), use the GetAsync(Uri, HttpCompletionOption) overload with the HttpCompletionOption.ResponseHeadersRead value.
The same applies to SendAsync and other methods of HttpClient.
Sources:
docs (see remarks)
https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httpclient.getasync?view=netcore-1.1#System_Net_Http_HttpClient_GetAsync_System_Uri_System_Net_Http_HttpCompletionOption_
The returned Task object will complete based on the completionOption parameter after the part or all of the response (including content) is read.
.NET Core implementation of GetStreamAsync that uses HttpCompletionOption.ResponseHeadersRead https://github.com/dotnet/corefx/blob/release/1.1.0/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L163-L168
HttpClient spike in memory usage with large response
HttpClient.GetStreamAsync() with custom request? (don't mind the comment on response, the ResponseHeadersRead is what does the trick)
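To make this concrete, here is a minimal sketch of the download from the question using ResponseHeadersRead, so the headers are available while the body is streamed to disk rather than buffered in full. The class name, the shared Client field, the folder parameter and the SafeFileName helper are my own assumptions; the "noname.dat" fallback is borrowed from the answer above.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class StreamingDownload
{
    // One shared client for the whole application (an assumption of this sketch).
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: strips the quotes that often surround the header
    // value and drops any directory part, falling back to a default name.
    public static string SafeFileName(string headerFileName)
    {
        if (string.IsNullOrWhiteSpace(headerFileName))
            return "noname.dat";
        return Path.GetFileName(headerFileName.Trim('"'));
    }

    public static async Task<bool> DownloadFileAsync(Uri uri, string folder)
    {
        // ResponseHeadersRead completes the task as soon as the headers arrive.
        using (var response = await Client.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead))
        {
            if (!response.IsSuccessStatusCode)
                return false;

            var disposition = response.Content.Headers.ContentDisposition;
            string path = Path.Combine(folder, SafeFileName(disposition?.FileName));

            // The body is read from the network stream in chunks while copying.
            using (var source = await response.Content.ReadAsStreamAsync())
            using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                await source.CopyToAsync(fs);
            }
            return true;
        }
    }
}
```

The response is disposed only after the copy has finished, because the content stream is tied to the underlying connection.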
Another simple and quick way to do it is:
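public async Task<bool> DownloadFile(string url)
{
    using (var client = new HttpClient())
    using (var ms = new MemoryStream())
    using (var stream = await client.GetStreamAsync(url)) // await instead of blocking on .Result
    {
        await stream.CopyToAsync(ms);
        ms.Position = 0; // rewind before reading
        ... // use ms in what you want
        return true;
    }
}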
Now the downloaded file is available as a stream in ms (held entirely in memory).
I have these two methods for writing to and reading from the file.
public static async Task WriteDataToFileAsync(string fileName, string content)
{
byte[] data = Encoding.Unicode.GetBytes(content);
var folder = ApplicationData.Current.LocalFolder;
var file = await folder.CreateFileAsync(fileName, CreationCollisionOption.OpenIfExists);
using (var s = await file.OpenStreamForWriteAsync())
{
await s.WriteAsync(data, 0, data.Length);
}
}
public async static Task<string> ReadFileContentsAsync()
{
var folder = ApplicationData.Current.LocalFolder;
try
{
var file = await folder.OpenStreamForReadAsync("MenuData.json");
using (var streamReader = new StreamReader(file))
{
Debug.WriteLine(streamReader.ReadToEnd());
return streamReader.ReadToEnd();
}
}
catch (Exception)
{
return string.Empty;
}
}
which are then used in these two methods:
public static async void ApiToFileRestaurants()
{
HttpClient client = new HttpClient();
HttpResponseMessage response = client.GetAsync("http://bonar.si/api/restaurants").Result;
response.EnsureSuccessStatusCode();
string responseBody = response.Content.ReadAsStringAsync().Result;
await Restaurant.WriteDataToFileAsync("MenuData.json", responseBody);
}
public async static Task<List<Restaurant>> FileToRestaurantList()
{
var responseBody = await Restaurant.ReadFileContentsAsync();
List<Restaurant> parsedRestaurants = (List<Restaurant>)Newtonsoft.Json.JsonConvert.DeserializeObject(responseBody, typeof(List<Restaurant>));
return parsedRestaurants;
}
Now my problem is that ReadFileContentsAsync doesn't return the contents that I know are saved in the MenuData.json file, but returns an empty string instead.
I mostly took the source code for this from the MSDN
documentation.
The location of the file in my WP Power Tools looks like that.
I'm a novice programmer, so I might have overlooked something else.
Can you try reading the data from the file asynchronously with ReadToEndAsync, which reads the complete content and returns it as one string?
var file = await folder.OpenStreamForReadAsync("MenuData.json");
using (var streamReader = new StreamReader(file))
{
return await streamReader.ReadToEndAsync();
}
Hope this helps!
I got the solution from another forum:
You're calling streamReader.ReadToEnd() twice. The first time you log
it to the Debug stream, the second is what you actually use as a
result. The method moves the file pointer to the end every time it's
called and by the second time there's nothing to read.
So removing that debug line almost fixed my problem. I did get the string I wanted, but there was an error somewhere in it, so Newtonsoft.Json had a hard time parsing it. So I tried @asitis's solution and changed .json to .text, and it worked.
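The double-read behaviour described in the quote can be reproduced without any file. This is a small sketch with made-up names (ReadOnceDemo, ReadTwice, ReadOnce) that reads from an in-memory stream instead of MenuData.json:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

public static class ReadOnceDemo
{
    // Mirrors the original bug: the first ReadToEnd consumes the stream,
    // so the second call has nothing left and returns an empty string.
    public static string ReadTwice(Stream stream)
    {
        using (var reader = new StreamReader(stream))
        {
            Debug.WriteLine(reader.ReadToEnd());
            return reader.ReadToEnd();
        }
    }

    // Reads once into a local variable, then logs and returns that value.
    public static string ReadOnce(Stream stream)
    {
        using (var reader = new StreamReader(stream))
        {
            string text = reader.ReadToEnd();
            Debug.WriteLine(text);
            return text;
        }
    }

    public static void Main()
    {
        byte[] bytes = Encoding.UTF8.GetBytes("[1,2,3]");
        Console.WriteLine(ReadTwice(new MemoryStream(bytes)).Length); // prints 0
        Console.WriteLine(ReadOnce(new MemoryStream(bytes)));         // prints [1,2,3]
    }
}
```

Keeping the text in a local variable lets the debug output stay in place without consuming the stream a second time.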
I'm trying to create an app that downloads a file and then edits it.
The problem I'm having is that once the file is downloaded, the app doesn't seem to let go of it. I can download the file to local storage; I have retrieved the file manually from isolated storage and it is fine. If I use the app to proceed after downloading the file, I get a System.UnauthorizedAccessException, but if I close and reopen the app and then just edit the file saved in isolated storage, it works. As I said, it's as if something is still using the downloaded file.
public async void DownloadTrack(Uri SongUri)
{
var httpClient = new HttpClient();
var data = await httpClient.GetByteArrayAsync(SongUri);
var file = await ApplicationData.Current.LocalFolder.CreateFileAsync("Test.mp3", CreationCollisionOption.ReplaceExisting);
var targetStream = await file.OpenAsync(FileAccessMode.ReadWrite);
await targetStream.AsStreamForWrite().WriteAsync(data, 0, data.Length);
await targetStream.FlushAsync();
}
This code works fine for downloading the MP3; I've tested the downloaded file. I have seen examples where the code ends with
targetStream.Close();
but that method isn't available to me. Is there another way to close the download?
Thanks.
Instead of calling Close() or Dispose(), I really like to use using, which does the job automatically. So your method could look like this:
public async void DownloadTrack(Uri SongUri)
{
using (HttpClient httpClient = new HttpClient())
{
var data = await httpClient.GetByteArrayAsync(SongUri);
var file = await ApplicationData.Current.LocalFolder.CreateFileAsync("Test.mp3", CreationCollisionOption.ReplaceExisting);
using (var targetStream = await file.OpenAsync(FileAccessMode.ReadWrite))
{
await targetStream.AsStreamForWrite().WriteAsync(data, 0, data.Length);
await targetStream.FlushAsync();
}
}
}
I'm having a problem with incomplete blobs being downloaded from Azure storage. The stored files are images. Almost every file that's downloaded ends up missing several lines at the bottom. I've checked the blobs and they were uploaded correctly.
I'm using the following code for downloading a blob from the Azure service:
private async Task Download(CloudBlobClient client)
{
try
{
_media = await _directory.CreateFileAsync(ResourceName, CreationCollisionOption.FailIfExists);
}
catch (Exception)
{
return;
}
using (var stream = await _media.OpenAsync(FileAccessMode.ReadWrite))
{
var blob = await GetBlob(client);
await blob.DownloadToStreamAsync(stream);
_category.NotifyAzureProgress();
await stream.FlushAsync();
}
}
The method GetBlob() looks like this:
private async Task<CloudBlockBlob> GetBlob(CloudBlobClient client)
{
CloudBlobContainer container = client.GetContainerReference(ContainerName);
await container.CreateIfNotExistsAsync();
var blob = container.GetBlockBlobReference(ResourceName);
return blob;
}
Upload code:
private async Task UploadAsync(CloudBlobClient client)
{
_media = await _directory.GetFileAsync(ResourceName);
using (var stream = await _media.OpenAsync(FileAccessMode.Read))
{
var blob = await GetBlob(client);
await blob.UploadFromStreamAsync(stream);
_category.NotifyAzureProgress();
}
}
Thanks for any help!
Edit: I've realized I missed one detail: the downloaded image has the correct dimensions, but several lines at the bottom are black, so it doesn't have the same pixels as the source image. I've checked the MD5 hashes: they match when I download the image through an external app, but they don't match when I download it with the code above.
Edit 2: After inspecting the properties of CloudBlob and the output stream, I've noticed that even though the blob reports the correct length after download, the stream usually reports something a little lower. I've tried downloading by range, but to no avail.
OK, so I've managed to download the images after all, by partially using the WinRT Azure library combined with a standard .NET HttpClient.
I used the Azure library to establish the initial connection and then to get only the blob reference, because the BlockBlobReference has a method to create a Shared Access Signature (and I really didn't want to try to construct it myself). Then I created the HttpClient, built a download URL using the SAS and issued a GET request to that URL, which finally worked and downloaded all the images intact.
I think there might be some weird bug in the official library, since using my download method instead of theirs solved everything.
Code sample:
internal async Task Download(CloudBlobClient client)
{
try
{
_media = await _directory.CreateFileAsync(ResourceName, CreationCollisionOption.FailIfExists);
}
catch (Exception)
{
return;
}
try
{
var blob = await GetBlob(client);
HttpClient httpClient = new HttpClient();
var date = DateTime.UtcNow;
var policy = new SharedAccessBlobPolicy();
policy.Permissions = SharedAccessBlobPermissions.Read;
policy.SharedAccessStartTime = new DateTimeOffset(date);
policy.SharedAccessExpiryTime = new DateTimeOffset(date.AddDays(1));
var signature = blob.GetSharedAccessSignature(policy);
var uriString = string.Format("{0}{1}", blob.Uri.ToString(), signature);
var data = await httpClient.GetByteArrayAsync(uriString);
await FileIO.WriteBytesAsync(_media, data);
_category.NotifyAzureProgress();
}
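catch (Exception)
{
// Best-effort cleanup of the partial file; the delete is not awaited here.
_media.DeleteAsync();
throw; // rethrow without resetting the stack trace
}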
}