Note: I use a translation app, so sorry if it's not always very clear.
I'm developing a UWP application, and I'm having a problem managing one file type, the CBZ extension.
Some files open without a problem; with others, the file never opens and the Task blocks.
Here's the code I use:
Task loadEbookTask = Task.Factory.StartNew(() =>
{
    Stream streamEbook = WindowsRuntimeStorageExtensions.OpenStreamForReadAsync(ebookFile).Result;
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    Content = new ZipArchive(streamEbook, ZipArchiveMode.Read, false);
    // For each archive, keep only entries with valid image extensions.
    foreach (var file in Content.Entries)
    {
        string extension = Path.GetExtension(file.Name).ToLower();
        bool isFileExtensionOk = EbooksManager.AvailableExtensionsImage.Contains(extension);
        if (isFileExtensionOk)
        {
            ArchivesExploitable.Add(file);
        }
    }
    TotalPage = Convert.ToUInt32(ArchivesExploitable.Count());
});
if (loadEbookTask.Wait(4000))
{
    EbookCbz.LoadEbook = EbookLoad.Ok;
}
else
{
    EbookCbz.LoadEbook = EbookLoad.Timeout;
}
It hangs on:
Stream streamEbook = WindowsRuntimeStorageExtensions.OpenStreamForReadAsync(ebookFile).Result;
In Visual Studio, memory no longer goes up, but the garbage collector keeps being called.
Task.Wait(4000) does not stop the Task, so it keeps running in the background.
And if I open another file, a new Task is created, which also keeps running in the background.
My question is:
- Is there a way to open a file that can be cancelled if it takes longer than a certain time?
It is this method that is problematic:
Stream streamEbook = WindowsRuntimeStorageExtensions.OpenStreamForReadAsync(ebookFile).Result;
I changed my code to:
byte[] buffer = await ebookFile.ReadBytesAsync();
Stream stream = new MemoryStream(buffer);
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Content = new ZipArchive(stream, ZipArchiveMode.Read, false);
It's fast, and if the file is corrupted there is an exception instead of a hang. Nothing stays stuck in memory anymore.
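For reference, a minimal sketch of how the revised load could also be given a timeout, by racing the read against a delay (the fields used are the ones from the code above; note this abandons a hung read rather than truly cancelling the underlying I/O):
var readTask = ebookFile.ReadBytesAsync();
var finished = await Task.WhenAny(readTask, Task.Delay(4000));
if (finished != readTask)
{
    // The read is still hung; give up waiting for it.
    EbookCbz.LoadEbook = EbookLoad.Timeout;
    return;
}
byte[] buffer = readTask.Result; // readTask has completed, so Result does not block
Stream stream = new MemoryStream(buffer);
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Content = new ZipArchive(stream, ZipArchiveMode.Read, false);
EbookCbz.LoadEbook = EbookLoad.Ok;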
Thanks for your help, I learned a new concept.
Related
I'm using C# to download different files via URL and then run them from a physical file path on the system. However, I need to wait until a file is completely written, and only then run it. The reason is that some files need more time than others to be saved, and I can't really use Thread.Sleep() for this purpose.
I tried the code below, but it is not very flexible, as I can't really tell how many tries or how much time it should take until the file is saved; that always depends on the internet connection as well as on the file size.
WebClient client = new WebClient();
var downloadTask = client.DownloadFileTaskAsync(new Uri(url), filepath);
var checkFile = Task.Run(async () => await downloadTask);
WaitForFile(filepath, FileMode.CreateNew);
—
FileStream WaitForFile(string fullPath, FileMode mode)
{
for (int numTries = 0; numTries < 15; numTries++)
{
FileStream fs = null;
try
{
fs = new FileStream(fullPath, mode);
return fs;
}
catch (IOException)
{
if (fs != null)
{
fs.Dispose();
}
Thread.Sleep(50);
}
}
return null;
}
Is there a way to keep waiting until File.Length > 0?
I would appreciate any help.
Thanks.
You're not awaiting the download's completion. Or better said, you are awaiting it, but on a different thread, and then throwing that result away. Just await in the very same method and you no longer need a separate way to know when the file is downloaded:
WebClient client = new WebClient();
await client.DownloadFileTaskAsync(new Uri(url), filepath);
You can use FileSystemWatcher to get an event when a file changes its size or a new file appears: https://learn.microsoft.com/en-us/dotnet/api/system.io.filesystemwatcher.onchanged?view=netcore-3.1
But you can't really tell whether the file is fully downloaded this way unless you already know its size.
You should instead change the code that downloads the file so that it notifies you when the download is complete.
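For illustration, a minimal sketch of the FileSystemWatcher approach (the directory path and handlers here are made up for the example):
using System;
using System.IO;

var watcher = new FileSystemWatcher(@"C:\Downloads")
{
    NotifyFilter = NotifyFilters.Size | NotifyFilters.FileName
};
// Raised when an existing file grows or shrinks.
watcher.Changed += (s, e) => Console.WriteLine($"Changed: {e.FullPath}");
// Raised when a new file appears in the directory.
watcher.Created += (s, e) => Console.WriteLine($"Created: {e.FullPath}");
watcher.EnableRaisingEvents = true;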
I think this is all you have to do:
WebClient client = new WebClient();
await client.DownloadFileTaskAsync(new Uri(url), filepath);
// code continues when the file finishes downloading
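Since the await only resumes once the file is fully written to disk, the "run it afterwards" step can follow on the very next line; for example (Process.Start is an assumption about how the file is run):
using System.Diagnostics;
// ...directly after the await above:
Process.Start(filepath);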
I have a text box where a user can submit a list of document IDs to download those files, zipped up, from an Azure blob.
The code currently works by building a zip memory stream, and then, for each document ID submitted, building a memory stream, getting the file from that stream, and adding it to the zip file. The issue is that when we are building the memory stream and getting a file that is larger than 180 MB, the program throws an out-of-memory exception.
Here is the code:
public async Task<byte[]> BuildZipStream(string valueDataUploadContainerName, IEnumerable<Document> docs)
{
var zipMemStream = new MemoryStream();
using (Ionic.Zip.ZipFile zip = new Ionic.Zip.ZipFile())
{
zip.Name = System.IO.Path.GetTempFileName();
var insertedEntries = new List<string>();
foreach (var doc in docs)
{
var EntryName = $"{doc.Name}{Path.GetExtension(doc.DocumentPath)}";
if (insertedEntries.Contains(EntryName))
{
EntryName = $"{doc.Name} (1){Path.GetExtension(doc.DocumentPath)}";
var i = 1;
while (insertedEntries.Contains(EntryName))
{
EntryName = $"{doc.Name} ({i.ToString()}){Path.GetExtension(doc.DocumentPath)}";
i++;
}
}
insertedEntries.Add(EntryName);
var file = await GetFileStream(blobFolderName, doc.DocumentPath);
if (file != null)
zip.AddEntry($"{EntryName}", file);
}
zip.Save(zipMemStream);
}
    zipMemStream.Seek(0, SeekOrigin.Begin);
    return zipMemStream.ToArray();
}
And here is the code for actually getting the file from blob storage:
public async Task<byte[]> GetFileStream(string container, string filename)
{
var blobStorageAccount = _keyVaultService.GetSecret(new KeyVaultModel { Key = storageLocation });
var storageAccount = CloudStorageAccount.Parse(blobStorageAccount ?? _config.Value.StorageConnection);
var blobClient = storageAccount.CreateCloudBlobClient();
var blobContainer = blobClient.GetContainerReference(container);
await blobContainer.CreateIfNotExistsAsync();
var blockBlob = blobContainer.GetBlockBlobReference(filename);
if (blockBlob.Exists())
{
using (var mStream = new MemoryStream())
{
await blockBlob.DownloadToStreamAsync(mStream);
mStream.Seek(0, 0);
return mStream.ToArray();
}
    }
    return null; // no blob with that name exists
}
The problem occurs when the program hits await blockBlob.DownloadToStreamAsync(mStream); it sits and spins for a while and then throws an out-of-memory exception.
I have read a few different solutions, which have not been working for me. The most common is to change the Platform target under project properties to at least x64, but I am running at x86. Another option would be to move the GetFileStream logic into BuildZipStream, but then I feel that method would be doing too much.
Any suggestions?
EDIT:
The problem actually occurs when the program hits zip.Save(zipMemStream).
Your methodology here is flawed, because you do not know:
- the number of files, or
- the size of each file.
You cannot accurately determine whether the server will have enough RAM to house all the files in memory. What you are doing here is collecting every Azure blob file the user lists and putting it into a zip file in memory, while also downloading each file in memory. It's no wonder you're getting an out-of-memory exception: even with 128 GB of RAM, if the user requests a large enough set of files, you will run out of memory.
Your best solution, and the most common practice for downloading and zipping multiple Azure blob files, is to use a temporary blob file.
Instead of writing to a MemoryStream, write to a FileStream and place the zipped file onto Azure Blob Storage, then serve that zipped blob file. Once the file has been served, remove it from the blob.
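For illustration, a minimal sketch of that approach, reusing the question's GetFileStream and Document types (the method name, container parameter, and temp-blob naming are assumptions; each source file is still buffered briefly by GetFileStream, but the zip itself never lives in RAM):
public async Task<string> BuildZipBlobAsync(CloudBlobContainer blobContainer, string blobFolderName, IEnumerable<Document> docs)
{
    var tempPath = Path.GetTempFileName();
    using (var zipFileStream = new FileStream(tempPath, FileMode.Create))
    using (var zip = new Ionic.Zip.ZipFile())
    {
        foreach (var doc in docs)
        {
            var bytes = await GetFileStream(blobFolderName, doc.DocumentPath);
            if (bytes != null)
                zip.AddEntry($"{doc.Name}{Path.GetExtension(doc.DocumentPath)}", bytes);
        }
        // The zip is written to disk, not to memory.
        zip.Save(zipFileStream);
    }
    // Upload the finished zip as a temporary blob; the caller serves it and deletes it afterwards.
    var zipBlobName = $"temp/{Guid.NewGuid()}.zip";
    var zipBlob = blobContainer.GetBlockBlobReference(zipBlobName);
    await zipBlob.UploadFromFileAsync(tempPath);
    File.Delete(tempPath);
    return zipBlobName;
}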
Hope this helps.
Hello, I was able to upload multiple files on multiple threads when I was using WindowsAzure.Storage 2.0.4.0, but I have recently upgraded my library to 9.3.3.
Now I am getting an error when setting the number of threads used to upload my file. Please have a look at my code and tell me where I am going wrong. I have searched for how to set the parallel thread count, but it can no longer be set on the blob the way it was before.
public void UploadBlobAsync(Microsoft.WindowsAzure.StorageClient.CloudBlob blob, string LocalFile)
{
    Microsoft.WindowsAzure.StorageCredentialsAccountAndKey account = blob.ServiceClient.Credentials as Microsoft.WindowsAzure.StorageCredentialsAccountAndKey;
    ICloudBlob blob2 = new CloudBlockBlob(blob.Attributes.Uri, new Microsoft.WindowsAzure.Storage.Auth.StorageCredentials(blob.ServiceClient.Credentials.AccountName, account.Credentials.ExportBase64EncodedKey()));
    UploadBlobAsync(blob2, LocalFile);
}
public void UploadBlobAsync(ICloudBlob blob, string LocalFile)
{
// The class currently stores state in class level variables so calling UploadBlobAsync or DownloadBlobAsync a second time will cause problems.
// A better long term solution would be to better encapsulate the state, but the current solution works for the needs of my primary client.
// Throw an exception if UploadBlobAsync or DownloadBlobAsync has already been called.
lock (WorkingLock)
{
if (!Working)
Working = true;
else
throw new Exception("BlobTransfer already initiated. Create new BlobTransfer object to initiate a new file transfer.");
}
// Attempt to open the file first so that we throw an exception before getting into the async work
using (FileStream fstemp = new FileStream(LocalFile, FileMode.Open, FileAccess.Read)) { }
// Create an async op in order to raise the events back to the client on the correct thread.
asyncOp = AsyncOperationManager.CreateOperation(blob);
TransferType = TransferTypeEnum.Upload;
m_Blob = blob;
m_FileName = LocalFile;
var file = new FileInfo(m_FileName);
long fileSize = file.Length;
FileStream fs = new FileStream(m_FileName, FileMode.Open, FileAccess.Read, FileShare.Read);
ProgressStream pstream = new ProgressStream(fs);
pstream.ProgressChanged += pstream_ProgressChanged;
pstream.SetLength(fileSize);
m_Blob.ServiceClient.ParallelOperationThreadCount = 10; // This line gives an error: ServiceClient does not contain this definition.
m_Blob.StreamWriteSizeInBytes = GetBlockSize(fileSize);
asyncresult = m_Blob.BeginUploadFromStream(pstream, BlobTransferCompletedCallback, new BlobTransferAsyncState(m_Blob, pstream));
}
m_Blob.ServiceClient.ParallelOperationThreadCount = 10; gives the error that ServiceClient does not contain that definition. I tried to find a workaround but couldn't. I found some code on a Microsoft forum, but it didn't help much.
Updated code for uploading multiple files to Azure Blob Storage in a multi-threaded way. Here is the snippet of updated code, which can be integrated into my previous code:
//Replace
m_Blob.ServiceClient.ParallelOperationThreadCount = 10
//with
BlobRequestOptions options = new BlobRequestOptions
{
ParallelOperationThreadCount = 8,
DisableContentMD5Validation = true,
StoreBlobContentMD5 = false
};
//Replace
asyncresult = m_Blob.BeginUploadFromStream(pstream, BlobTransferCompletedCallback, new BlobTransferAsyncState(m_Blob, pstream));
//with
asyncresult = m_Blob.BeginUploadFromStream(pstream,null,options,null,BlobTransferCompletedCallback, new BlobTransferAsyncState(m_Blob, pstream));
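For reference, a minimal sketch of the same options applied through the Task-based API that WindowsAzure.Storage 9.x also offers (the blockBlob and localFile names are placeholders, not from the code above):
BlobRequestOptions options = new BlobRequestOptions
{
    ParallelOperationThreadCount = 8,
    DisableContentMD5Validation = true,
    StoreBlobContentMD5 = false
};
using (var fs = File.OpenRead(localFile))
{
    // Task-based counterpart of BeginUploadFromStream: pass the options explicitly.
    await blockBlob.UploadFromStreamAsync(fs, null, options, null);
}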
I'm trying to build a small program to monitor my pfirewall.log, but I can't seem to open it.
I found quite a few (simple) answers that all roughly say:
// use FileSystemWatcher
// open FileStream
// read from last position to end
// output new lines
The problem here is: the file always seems to be opened by another process already. I guess that's the Windows process writing to the file, since it's being written to all the time, as Notepad++ shows me.
Which means Notepad++ can, for some reason, do what I cannot: read the file despite it already being open.
I initialize my monitor in the constructor:
public FirewallLogMonitor(string path)
{
if (!File.Exists(path))
throw new FileNotFoundException("Logfile not found");
this.file = path;
this.lastPosition = 0;
this.monitor = new FileSystemWatcher(Path.GetDirectoryName(path), Path.GetFileName(path));
this.monitor.NotifyFilter = NotifyFilters.Size;
}
And I try to read the file in the monitor.Changed event handler:
private void LogFileChanged(object sender, FileSystemEventArgs e)
{
using (FileStream stream = new FileStream(e.FullPath, FileMode.Open, FileAccess.Read, FileShare.Read))
using (StreamReader reader = new StreamReader(stream))
{
stream.Seek(this.lastPosition, SeekOrigin.Begin);
var newLines = reader.ReadToEnd();
this.lastPosition = stream.Length;
var filteredLines = filterLines(newLines);
if (filteredLines.Count > 0)
NewLinesAvailable(this, filteredLines);
}
}
It always throws an IOException at new FileStream(...), telling me the file is already in use.
Since Notepad++ manages it, there has to be a way I can do it too, right?
**Edit:** A button does this:
public void StartLogging()
{
this.IsRunning = true;
this.monitor.Changed += LogFileChanged;
this.monitor.EnableRaisingEvents = true;
}
**Edit2:** This is not a duplicate of FileMode and FileAccess and IOException: The process cannot access the file 'filename' because it is being used by another process, since that one assumes I have control over the writing process. I will try the other suggestions and report back with results.
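For reference, the common suggestion boils down to opening the stream with more permissive sharing: the writer holds the file with write access, so a reader declaring FileShare.Read conflicts with it, while FileShare.ReadWrite does not. A minimal sketch of the handler body (assuming the writing process itself permits shared reads, as Notepad++'s success suggests):
using (FileStream stream = new FileStream(e.FullPath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (StreamReader reader = new StreamReader(stream))
{
    // FileShare.ReadWrite lets the existing writer keep its write handle;
    // FileShare.Read would demand that no one else has the file open for writing.
    stream.Seek(this.lastPosition, SeekOrigin.Begin);
    var newLines = reader.ReadToEnd();
    this.lastPosition = stream.Length;
}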
If I understand your question, you can use Notepad++ itself with a plugin to monitor the file. Go to:
Plugins -> Document Monitor -> Start to monitor
If you don't have this plugin, you can download it here:
http://sourceforge.net/projects/npp-plugins/files/DocMonitor/
I have a working solution to load and render a PDF document from a byte array in a Windows Store app. Lately, though, some users have reported out-of-memory errors. As you can see in the code below, there is one stream I am not disposing of; I've commented out that line. If I do dispose of that stream, the PDF document no longer renders and just shows a completely white image. Could anybody explain why, and how I could load and render the PDF document while still disposing of all disposables?
private static async Task<PdfDocument> LoadDocumentAsync(byte[] bytes)
{
using (var stream = new InMemoryRandomAccessStream())
{
await stream.WriteAsync(bytes.AsBuffer());
stream.Seek(0);
var fileStream = RandomAccessStreamReference.CreateFromStream(stream);
var inputStream = await fileStream.OpenReadAsync();
try
{
return await PdfDocument.LoadFromStreamAsync(inputStream);
}
finally
{
// do not dispose otherwise pdf does not load / render correctly. Not disposing though may cause memory issues.
// inputStream.Dispose();
}
}
}
and the code to render the PDF:
private static async Task<ObservableCollection<BitmapImage>> RenderPagesAsync(
PdfDocument document,
PdfPageRenderOptions options)
{
var items = new ObservableCollection<BitmapImage>();
if (document != null && document.PageCount > 0)
{
for (var pageIndex = 0; pageIndex < document.PageCount; pageIndex++)
{
using (var page = document.GetPage((uint)pageIndex))
{
using (var imageStream = new InMemoryRandomAccessStream())
{
await page.RenderToStreamAsync(imageStream, options);
await imageStream.FlushAsync();
var renderStream = RandomAccessStreamReference.CreateFromStream(imageStream);
using (var stream = await renderStream.OpenReadAsync())
{
var bitmapImage = new BitmapImage();
await bitmapImage.SetSourceAsync(stream);
items.Add(bitmapImage);
}
}
}
}
}
return items;
}
As you can see, I am using the RandomAccessStreamReference.CreateFromStream method in both of my methods. I've seen other examples that skip that step and use the InMemoryRandomAccessStream directly to load the PDF document or the bitmap image, but I've not managed to get the PDF to render correctly that way; the images were just completely white again. As mentioned above, this code does render the PDF correctly, but it does not dispose of all disposables.
Why
I assume LoadFromStreamAsync(IRandomAccessStream) does not parse the whole stream into the PdfDocument object but instead parses only the main PDF dictionaries and keeps a reference to the IRandomAccessStream.
This actually is the sane thing to do: why parse the whole PDF into your own objects (a potentially very expensive operation, resource-wise) if the user may only want to render one page, or merely query the number of pages?
Later on, when other methods of the returned PdfDocument are called, e.g. GetPage, those methods read from the stream whatever additional data they need for their task, e.g. for rendering. Unfortunately, in your case those reads happen after the finally { inputStream.Dispose(); } has already closed the stream.
How else
You have to postpone the inputStream.Dispose() until all operations on the PdfDocument have finished. That means some, hopefully minor, architectural changes to your code. Probably moving the LoadDocumentAsync code as a frame into the RenderPagesAsync method, or into its caller, would suffice, as in the sketch below.
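A minimal sketch of that restructuring, assuming the RenderPagesAsync method from the question stays as-is (the combined method name is made up):
private static async Task<ObservableCollection<BitmapImage>> LoadAndRenderAsync(byte[] bytes, PdfPageRenderOptions options)
{
    using (var stream = new InMemoryRandomAccessStream())
    {
        await stream.WriteAsync(bytes.AsBuffer());
        stream.Seek(0);
        var fileStream = RandomAccessStreamReference.CreateFromStream(stream);
        using (var inputStream = await fileStream.OpenReadAsync())
        {
            // All lazy page reads happen inside RenderPagesAsync, while
            // inputStream is still open; both streams are disposed only
            // after the last page has been rendered.
            var document = await PdfDocument.LoadFromStreamAsync(inputStream);
            return await RenderPagesAsync(document, options);
        }
    }
}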