I have a ListBox that contains a list of DirectAdmin user backups. The list is populated using WebRequestMethods.Ftp.ListDirectory.
I can download an archive using the button at the bottom right. When I click on the button, another form appears and downloads the archive.
My download code is this:
public static void DownloadFile(string server, string username, ...)
{
    Uri URI = new Uri($"ftp://{server}/{targetFilePath}");
    using (WebClient client = new WebClient())
    {
        client.Credentials = new NetworkCredential(username, password);
        if (progress != null)
        {
            client.DownloadProgressChanged += new DownloadProgressChangedEventHandler(progress);
        }
        if (complete != null)
        {
            client.DownloadFileCompleted += new AsyncCompletedEventHandler(complete);
        }
        before?.Invoke();
        client.DownloadFileAsync(URI, localFilePath);
    }
}
and this is what I pass to the DownloadFile() method for the DownloadProgressChanged event:
delegate (object s2, DownloadProgressChangedEventArgs e2)
{
    TransferLabel.Invoke((MethodInvoker)delegate
    {
        TransferLabel.Text = $"{(e2.BytesReceived / 1024).ToString()} KB / {(e2.TotalBytesToReceive / 1024).ToString()} KB";
    });
    TransferProgressBar.Invoke((MethodInvoker)delegate
    {
        TransferProgressBar.Value = (int)(e2.BytesReceived / (float)e2.TotalBytesToReceive * 100);
    });
}
I'm using this same approach to upload a file and it works fine, but with download e2.TotalBytesToReceive returns -1 throughout the process, and only when it's done do I get the correct value.
Why is that?
I've found a workaround to solve the problem. I'll change the ListBox to ListView and also store the filesize of the archives using ListDirectoryDetails. This way I can compare the e.BytesReceived to stored total bytes instead of e.TotalBytesToReceive. This would solve my problem, but I'm still curious about the problem. Why do I get -1? Am I doing something wrong, or is this a server related problem? Also is there anything I can do to fix it (get the correct value)?
With the FTP protocol, WebClient generally does not know the total download size, so you commonly get -1 with FTP.
See also Download file from FTP with Progress - TotalBytesToReceive is always -1?
Note that the behavior actually contradicts the .NET documentation, which says for FtpWebResponse.ContentLength (where the value of TotalBytesToReceive comes from):
For requests that use the DownloadFile method, the property is greater than zero if the downloaded file contained data and is zero if it was empty.
But you will easily find many questions about this (like the one I've linked above), effectively showing that the behavior is not always as documented. FtpWebResponse.ContentLength has a meaningful value for the GetFileSize method only.
FtpWebRequest/WebClient makes no explicit attempt to find out the size of the file that it is downloading. All it does is look for a "(xxx bytes)." string in the 125/150 responses to the RETR command. No FTP RFC mandates that the server include such information. ProFTPD (see data_pasv_open in src/data.c) and vsftpd (see handle_retr in postlogin.c) seem to include this information. Other common FTP servers (IIS, FileZilla) do not.
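As a rough illustration of the kind of parsing described above (not the actual .NET implementation), the size hint can be pulled out of a 150 reply with a regular expression. The class and method names here are hypothetical, and the sample reply text is only an assumption about what such a server might send:

```csharp
using System;
using System.Text.RegularExpressions;

public static class FtpReplyParser
{
    // Matches a "(12345 bytes)" hint, which some servers append to 125/150 replies.
    private static readonly Regex SizeHint =
        new Regex(@"\((\d+) bytes\)", RegexOptions.IgnoreCase);

    // Returns the advertised size, or -1 when the reply carries no size hint.
    public static long TryGetSize(string reply)
    {
        Match m = SizeHint.Match(reply ?? string.Empty);
        return m.Success && long.TryParse(m.Groups[1].Value, out long size) ? size : -1;
    }
}
```

A server that omits the hint simply yields -1 here, which matches the -1 seen in TotalBytesToReceive.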
Certainly for HTTP downloads it's possible for the server not to supply size information when performing a file download and you're left with no sensible information until the server signals that it's done.
I'm not sure about FTP (I'd note that there's a separate SIZE command defined in the FTP command set, so including such information during a Retrieve may be considered redundant).
I'm slightly surprised that the documentation for TotalBytesToReceive isn't more explicit about the possibility that the information will not be available, and about what will be returned in such circumstances.
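One way to get a usable total is to ask for the size before starting the download. The following is a minimal sketch, assuming the server supports the SIZE command; FtpProgress, GetRemoteFileSize and Percent are hypothetical names, not part of WebClient:

```csharp
using System;
using System.Net;

public static class FtpProgress
{
    // Asks the server for the file size up front via the GetFileSize method.
    public static long GetRemoteFileSize(Uri uri, ICredentials credentials)
    {
        var request = (FtpWebRequest)WebRequest.Create(uri);
        request.Method = WebRequestMethods.Ftp.GetFileSize;
        request.Credentials = credentials;
        using (var response = (FtpWebResponse)request.GetResponse())
        {
            return response.ContentLength;
        }
    }

    // Percentage for a progress bar; returns 0 while the total is unknown (-1 or 0).
    public static int Percent(long bytesReceived, long totalBytes)
    {
        if (totalBytes <= 0) return 0;
        long percent = bytesReceived * 100 / totalBytes;
        return (int)Math.Max(0, Math.Min(100, percent));
    }
}
```

In the DownloadProgressChanged handler, the stored size would then replace e2.TotalBytesToReceive, for example FtpProgress.Percent(e2.BytesReceived, knownSize).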
Does anyone know how to use the C# OneDrive SDK to perform a resumable upload?
When I use IDriveItemRequestBuilder.CreateUploadSession I always get a new session with the NextExpectedRanges reset.
If I use the .UploadUrl and manually send an HTTP POST, I get the correct next ranges back; however, I don't know how to then resume the upload session using the SDK. There doesn't seem to be a means in the API to 'OpenUploadSession', or at least none that I can find.
Nor can I find a working example.
I suspect this must be a common use case.
Please note the keyword in the question: resumable.
I was looking for the same thing and just stumbled on an example in the official docs:
https://learn.microsoft.com/en-us/graph/sdks/large-file-upload?tabs=csharp.
I tried the code and it worked.
In case it helps, here is my sample implementation: https://github.com/xiaomi7732/onedrive-sample-apibrowser-dotnet/blob/6639444d6298492c38f841e411066635760930c2/OneDriveApiBrowser/FormBrowser.cs#L565
The method of resumption depends on how much state you have. The absolute minimum that is required is UploadSession.UploadUrl (think of it as a unique identifier for the session). If you don't have that URL, you'd need to create a new upload session and start from the beginning; if you do have it, you can do something like the following to resume:
var uploadSession = new UploadSession
{
    NextExpectedRanges = Enumerable.Empty<string>(),
    UploadUrl = persistedUploadUrl,
};
var maxChunkSize = 320 * 1024; // 320 KB - Change this to your chunk size. 5MB is the default.
var provider = new ChunkedUploadProvider(uploadSession, graphClient, ms, maxChunkSize);
// This will query the service and make sure the remaining ranges are accurate.
uploadSession = await provider.UpdateSessionStatusAsync();
// Since the remaining ranges are now accurate, this will return the requests required to
// complete the upload.
var chunkRequests = provider.GetUploadChunkRequests();
...
If you have more state you'd be able to skip some of the above. For example, if you already had a ChunkedUploadProvider but don't know that it's accurate (maybe it was serialized to disk or something) then you can just start the process with the call to UpdateSessionStatusAsync.
FYI, you can see the code for ChunkedUploadProvider here in case that'll be helpful to see what's going on under the covers.
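If you want to inspect the refreshed session yourself, the remaining ranges come back as strings. The sketch below assumes each entry has the form "start-" or "start-end", as the Graph upload session documentation describes; UploadRanges and NextStartOffset are hypothetical helper names and not part of the SDK:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class UploadRanges
{
    // Returns the lowest missing byte offset, or null when no ranges remain.
    public static long? NextStartOffset(IEnumerable<string> nextExpectedRanges)
    {
        long? best = null;
        foreach (string range in nextExpectedRanges)
        {
            // Only the part before the dash matters for finding where to resume.
            long start = long.Parse(range.Split('-')[0], CultureInfo.InvariantCulture);
            if (best == null || start < best) best = start;
        }
        return best;
    }
}
```

For example, it could be called with uploadSession.NextExpectedRanges after UpdateSessionStatusAsync has returned, to log how far the upload had progressed.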
In our Azure Data Lake, we have daily files recording events and coordinates for those events. We need to take these coordinates and lookup what State, County, Township, and Section these coordinates fall into. I've attempted several versions of this code.
I attempted to do this in U-SQL. I even uploaded a custom assembly that implemented Microsoft.SqlServer.Types.SqlGeography methods, only to find ADLA isn't set up to perform row-by-row operations like geocoding.
I pulled all the rows into SQL Server, converted the coordinates into a SqlGeography, and built T-SQL code that would perform the State, County, etc. lookups. After much optimization, I got this process down to ~700 ms / row. (With 133M rows in the backlog and ~16k rows added daily, we're looking at nearly 3 years to catch up.) So I parallelized the T-SQL; things got better, but not enough.
I took the T-SQL code and built the process as a console application, since the SqlGeography library is actually a .NET library, not a native SQL Server product. I was able to get single-threaded processing down to ~500 ms. Adding in .NET's parallelism (Parallel.ForEach) and throwing 10/20 of the cores of my machine at it does a lot, but still isn't enough.
I attempted to rewrite this code as an Azure Function, processing files in the data lake file-by-file. Most of the files timed out, since they took longer than 10 minutes to process. So I've updated the code to read in the files and shred the rows into Azure Queue storage. Then I have a second Azure Function that fires for each row in the queue. The idea is that Azure Functions can scale out far more than any single machine can.
And this is where I'm stuck. I can't reliably write rows to files in ADLS. Here is the code as I have it now.
public static void WriteGeocodedOutput(string Contents, String outputFileName, ILogger log) {
    AdlsClient client = AdlsClient.CreateClient(ADlSAccountName, adlCreds);
    //if the file doesn't exist write the header first
    try {
        if (!client.CheckExists(outputFileName)) {
            using (var stream = client.CreateFile(outputFileName, IfExists.Fail)) {
                byte[] headerByteArray = Encoding.UTF8.GetBytes("EventDate, Longitude, Latitude, RadarSiteID, CellID, RangeNauticalMiles, Azimuth, SevereProbability, Probability, MaxSizeinInchesInUS, StateCode, CountyCode, TownshipCode, RangeCode\r\n");
                //stream.Write(headerByteArray, 0, headerByteArray.Length);
                client.ConcurrentAppend(outputFileName, true, headerByteArray, 0, headerByteArray.Length);
            }
        }
    } catch (Exception e) {
        log.LogInformation("multiple attempts to create the file. Ignoring this error, since the file was created.");
    }
    //then write the data
    byte[] textByteArray = Encoding.UTF8.GetBytes(Contents);
    for (int attempt = 0; attempt < 5; attempt++) {
        try {
            log.LogInformation("prior to write, the outputfile size is: " + client.GetDirectoryEntry(outputFileName).Length);
            var offset = client.GetDirectoryEntry(outputFileName).Length;
            client.ConcurrentAppend(outputFileName, false, textByteArray, 0, textByteArray.Length);
            log.LogInformation("AFTER write, the outputfile size is: " + client.GetDirectoryEntry(outputFileName).Length);
            //if successful, stop trying to write this row
            attempt = 6;
        }
        catch (Exception e){
            log.LogInformation($"exception on adls write: {e}");
        }
        Random rnd = new Random();
        Thread.Sleep(rnd.Next(attempt * 60));
    }
}
The file will be created when it needs to be, but I do get several messages in my log that several threads tried to create it. I'm not always getting the header row written.
I also no longer get any data rows written, only this error:
"BadRequest ( IllegalArgumentException concurrentappend failed with error 0xffffffff83090a6f
(Bad request. The target file does not support this particular type of append operation.
If the concurrent append operation has been used with this file in the past, you need to append to this file using the concurrent append operation.
If the append operation with offset has been used in the past, you need to append to this file using the append operation with offset.
On the same file, it is not possible to use both of these operations.). []
I feel like I'm missing some fundamental design idea here. The code should try to write a row into a file. If the file doesn't yet exist, create it and put the header row in. Then, put in the row.
What's the best-practice way to accomplish this kind of write scenario?
Any other suggestions of how to handle this kind of parallel-write workload in ADLS?
I am a bit late to this but I guess one of the problems could be due to the use of "Create" and "ConcurrentAppend" on the same file stream?
ADLS documentation mentions that they can't be used on the same file. Maybe, try changing the "Create" command to "ConcurrentAppend" as the latter can be used to create a file if it doesn't exist.
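A minimal sketch of that suggestion follows, assuming ConcurrentAppend keeps the (path, autoCreate, data, offset, length) shape used in the question. AdlsRowWriter and ConcurrentAppendFn are hypothetical names; the delegate stands in for the client call so the retry logic can be tried without an ADLS account:

```csharp
using System;
using System.Text;
using System.Threading;

public static class AdlsRowWriter
{
    // Same parameter order as the ConcurrentAppend call in the question.
    public delegate void ConcurrentAppendFn(string path, bool autoCreate, byte[] data, int offset, int length);

    // Appends one row, letting ConcurrentAppend create the file if it is missing.
    // Returns true on success and false once all attempts have failed.
    public static bool AppendRow(ConcurrentAppendFn append, string path, string row, int maxAttempts = 5)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(row);
        var random = new Random();
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                // autoCreate is always true, so CreateFile is never mixed in.
                append(path, true, bytes, 0, bytes.Length);
                return true;
            }
            catch (Exception)
            {
                if (attempt == maxAttempts) break;
                // Small random back-off, in the spirit of the question's loop.
                Thread.Sleep(random.Next(attempt * 60));
            }
        }
        return false;
    }
}
```

It could be called as AdlsRowWriter.AppendRow(client.ConcurrentAppend, outputFileName, Contents). This sketch does not write the header row: with many writers racing, a one-time header is hard to guarantee, so it may be simpler to add it in a later step. A file that was already created with CreateFile would presumably have to be removed first, since the error message says the two append styles cannot be mixed on one file.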
Also, if you found a better way to do it, please do post your solution here.
Basically, I'm building a website that allows users to upload a file.
On the front end (JavaScript), the user browses for a file, and I can get the site to send POST data (the parameter "UploadInput" and its value, which is the file).
In the backend (C#), I want to make a copy of the file and save it in a specific path.
Below is the way I did it.
var files = Request.Files;
file[0].SaveAs("\temp\\" + file[0].FileName);
The problem I ran into is that I got the error message saying index out of range. I tried Response.Write(files.Count) and it gives me 0 instead of 1.
I'm wondering what I did wrong and how to fix it, or whether there's a better way of doing it.
Thanks!
Edit:
I am using HttpFox to debug. From HttpFox, I can see that under POST data, parameter is "UploadInput" and the value is "test.txt"
Edit 2:
So I tried the way Marc provides, and I have a different problem.
I am able to create a new file, however, the content is not copied over. I tried opening the new created file in notepad and all it says is "UploadInput = test.txt"
If they simply posted the file as the body content, then there will be zero "files" involved here, so file[0] will fail. Instead, you need to look at the input-stream, and simply read from that stream. For example:
using (var file = File.Create(somePath)) {
    Request.InputStream.CopyTo(file);
}
I'm currently building an application that is, among other things, going to download large files from an FTP server. Everything works fine for small files (< 50 MB), but the files I'm downloading are way bigger, mainly over 2 GB.
I've been trying with a WebClient using DownloadFileAsync() and a list system, as I'm downloading these files one after the other due to their sizes.
DownloadClient.DownloadProgressChanged += new DownloadProgressChangedEventHandler(DownloadProgress);
DownloadClient.DownloadFileCompleted += new AsyncCompletedEventHandler(DownloadCompleted);
private void FileDownload()
{
    DownloadClient.DownloadFileAsync(new Uri(@"ftp://" + RemoteAddress + FilesToDownload[0]), LocalDirectory + FilesToDownload[0]);
}
private void DownloadProgress(object sender, DownloadProgressChangedEventArgs e)
{
    // Handle progress
}
private void DownloadCompleted(object sender, AsyncCompletedEventArgs e)
{
    FilesToDownload.RemoveAt(0);
    FileDownload();
}
It works absolutely fine this way on small files: they are all downloaded one by one, the progress is reported, and DownloadCompleted fires after each file. The issue I'm facing with big files is that it launches the first download without any issue but doesn't do anything after that. The DownloadCompleted event never fires for some reason. It looks like the WebClient doesn't know that the file has finished downloading, which is an issue as I'm using this event to launch the next download in the FilesToDownload list.
I've also tried to do that synchronously using WebClient.DownloadFile and a for loop to cycle through my FilesToDownload list. It downloads the first file correctly and I get an exception when the second download should start: "The underlying connection was closed: An unexpected error occurred on a receive".
Finally, I've tried to go through this via FTP using edtFTPnet but I'm facing download speed issues (i.e. My download goes full speed with the WebClient and I just get 1/3 of the full speed with edtFTPnet library).
Any thoughts? I have to admit that I'm running out of ideas here.
public string GetRequest(Uri uri, int timeoutMilliseconds)
{
    var request = System.Net.WebRequest.Create(uri);
    request.Timeout = timeoutMilliseconds;
    using (var response = request.GetResponse())
    using (var stream = response.GetResponseStream())
    using (var reader = new System.IO.StreamReader(stream))
    {
        return reader.ReadToEnd();
    }
}
Forgot to update this thread but I figured how to sort this out a while ago.
The issue was that the data connection that is opened for a file transfer randomly times out for some reason, or is closed by the server before the transfer ends. I haven't been able to figure out why, however, as there are a lot of local and external network interfaces between my computer and the remote server. As it's totally random (i.e. the transfer works fine for five files in a row, times out for one file, works fine for the following files, etc.), the issue may be server or network related.
I'm now catching any FTP exception raised by the FTP client object during the download and issuing a REST command with an offset equal to the position in the data stream where the transfer stopped (i.e. the number of bytes already downloaded). Doing so fetches the bytes that are still missing from the local file (the total size of the remote file minus the bytes already downloaded).
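The answer does not say which FTP client was used, so the following sketch swaps in FtpWebRequest, whose ContentOffset property causes a REST command to be sent before the download. FtpResume, GetResumeOffset and DownloadRemaining are hypothetical names chosen for illustration:

```csharp
using System;
using System.IO;
using System.Net;

public static class FtpResume
{
    // Offset to resume from: the number of bytes already saved locally (0 if none).
    public static long GetResumeOffset(string localPath)
    {
        return File.Exists(localPath) ? new FileInfo(localPath).Length : 0;
    }

    // Downloads the rest of the file, appending to whatever is already on disk.
    public static void DownloadRemaining(Uri uri, ICredentials credentials, string localPath)
    {
        var request = (FtpWebRequest)WebRequest.Create(uri);
        request.Method = WebRequestMethods.Ftp.DownloadFile;
        request.Credentials = credentials;
        request.UseBinary = true;
        // A non-zero ContentOffset asks the server to restart at that position.
        request.ContentOffset = GetResumeOffset(localPath);

        using (var response = (FtpWebResponse)request.GetResponse())
        using (Stream remote = response.GetResponseStream())
        using (var local = new FileStream(localPath, FileMode.Append, FileAccess.Write))
        {
            remote.CopyTo(local);
        }
    }
}
```

A caller would typically wrap DownloadRemaining in a loop that catches WebException and calls it again until the local length matches the remote size.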
I have implemented something similar to this. The only real difference is
string filename = context.Request.RawUrl.Replace("/", "\\").Remove(0,1);
string path = Uri.UnescapeDataString(Path.Combine(_baseFolder, filename));
so that I can traverse to subdirectories. This works great for webpages and other text file types but when trying to serve up media content I get the exception
HttpListenerException: The I/O operation has been aborted because of either a thread exit or an application request
Followed by
InvalidOperationException: Cannot close stream until all bytes are written.
In the using statement.
Any suggestions on how to handle this or stop these exceptions?
Thanks
I should mention that I am using Google Chrome as my browser (Chrome doesn't seem to care about MIME types; when it sees audio it will try to play it as if it were in an HTML5 player), but this also applies if you are trying to host media content in a page.
Anyway, I was inspecting my headers with Fiddler and noticed that Chrome sends 3 requests to the server. I started playing with other browsers and noticed they did not do this, but depending on the browser and what I had hard-coded as the MIME type, I would get either a page of garbled text or a download of the file.
On further inspection I noticed that Chrome would first request the file, then request the file again with a few different headers, most notably the Range header: the first with bytes=0-, then the next with a different range depending on how large the file was (more than 3 requests can be made, depending on the file size).
So there was the problem. Chrome first asks for the file. Once it sees the type, it sends another request, which seems to me to be checking how large the file is (bytes=0-), then another one asking for the second half of the file or something similar, to allow for the sort of streaming experience you get with HTML5. I quickly coded something up to handle MIME types, threw together an HTML5 page with the audio component, and found that other browsers also do this (except IE).
So here is a quick solution, and I no longer get these errors:
string range = context.Request.Headers["Range"];
int rangeBegin = 0;
int rangeEnd = msg.Length; // exclusive end; defaults to the whole file
if (range != null)
{
    // Handles a single range of the form "bytes=first-last" or "bytes=first-"
    string[] byteRange = range.Replace("bytes=", "").Split('-');
    Int32.TryParse(byteRange[0], out rangeBegin);
    if (byteRange.Length > 1 && !string.IsNullOrEmpty(byteRange[1]))
    {
        // The last-byte position in a Range header is inclusive
        if (Int32.TryParse(byteRange[1], out int lastByte))
        {
            rangeEnd = Math.Min(lastByte + 1, msg.Length);
        }
    }
}
context.Response.ContentLength64 = rangeEnd - rangeBegin;
using (Stream s = context.Response.OutputStream)
{
    s.Write(msg, rangeBegin, rangeEnd - rangeBegin);
}
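The Range handling can also be factored into a small helper that validates its input. This is a sketch under stated assumptions: it supports only a single "bytes=first-last" or "bytes=first-" range, treats the last-byte position as inclusive (as HTTP defines it), and uses the hypothetical names RangeHeader and TryParse:

```csharp
using System;

public static class RangeHeader
{
    // Parses one byte range against a body of totalLength bytes.
    // Returns false for a missing, malformed, multi-range, or suffix ("bytes=-500") header.
    public static bool TryParse(string header, long totalLength, out long start, out long length)
    {
        start = 0;
        length = totalLength;
        if (string.IsNullOrEmpty(header) ||
            !header.StartsWith("bytes=", StringComparison.OrdinalIgnoreCase)) return false;

        string[] parts = header.Substring(6).Split('-');
        if (parts.Length != 2) return false;
        if (!long.TryParse(parts[0], out long first) || first < 0 || first >= totalLength) return false;

        long last = totalLength - 1; // an open-ended range runs to the end
        if (parts[1].Length > 0 && !long.TryParse(parts[1], out last)) return false;
        if (last < first) return false;
        last = Math.Min(last, totalLength - 1); // clamp to the body; last is inclusive

        start = first;
        length = last - first + 1;
        return true;
    }
}
```

When TryParse succeeds, a server would normally also set context.Response.StatusCode to 206 and add a Content-Range header of the form "bytes start-end/total" before writing length bytes from offset start; when it fails, sending the whole body with a 200 is a reasonable fallback.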
Try:
using (Stream s = context.Response.OutputStream)
{
    s.Write(msg, 0, msg.Length);
    s.Flush();
}