I've found tons of information about how to create and upload a zip file to Azure storage, but I'm trying to see if there's a good way to add content to an existing zip file in blob storage that doesn't involve downloading the whole blob, rebuilding in memory, then uploading again.
My use case involves zipping several hundred, up to several million, items into an archive to be stored in Azure blob storage for later download. I have code that will handle splitting so I don't wind up with a single several-GB size file, but I still run into memory management issues when dealing with large quantities of files. I'm trying to address this by creating the zip file in blob storage, and adding subsequent files to it one by one. I recognize this will incur cost for the additional writes, and that's fine.
I know how to use Append Blobs and Block Blobs, and I have working code to create the zip file and upload, but I can't seem to find out if there's a way to do this. Anyone managed to accomplish this, or able to confirm that this is not possible?
Since you're dealing with zip files, the only way to add new files to an existing zip file is to download the blob, add the new file to the zip, and then re-upload the whole blob. A zip's central directory sits at the very end of the archive, so you can't simply append bytes to an Append Blob without corrupting the file.
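If it helps, here's a minimal sketch of that round trip using Azure.Storage.Blobs and System.IO.Compression. The connection string, container/blob names and local file path are placeholders, so treat this as illustrative rather than production code:

```csharp
using System.IO;
using System.IO.Compression;
using Azure.Storage.Blobs;

// Hypothetical names -- substitute your own connection string, container and blob.
var connectionString = "<storage-connection-string>";
var blobClient = new BlobClient(connectionString, "archives", "bundle.zip");

// 1. Download the existing zip into a seekable in-memory stream.
using var zipStream = new MemoryStream();
blobClient.DownloadTo(zipStream);

// 2. Open it in Update mode and append the new entry.
using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Update, leaveOpen: true))
{
    var entry = archive.CreateEntry("newfile.txt");
    using var entryStream = entry.Open();
    using var source = File.OpenRead(@"C:\data\newfile.txt");
    source.CopyTo(entryStream);
}

// 3. Re-upload the whole archive, overwriting the old blob.
zipStream.Position = 0;
blobClient.Upload(zipStream, overwrite: true);
```

Keeping the stream in memory avoids a temp file, but for multi-GB archives you'd likely spill to a FileStream instead.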
Related
I want to download a file from a direct link. The files are between 900 MB and 30 GB. That's pretty large, so I don't want to download them to a temp folder and then upload them. I want to use something like Azure Functions to do this every x hours, and the temp storage there is pretty limited.
Is there a way to download / stream the download and upload simultaneously to blob storage? I don't want to save it first.
Hope you can help me out.
Is there a way to download / stream the download and upload simultaneously to blob storage? I don't want to save it first.
You don't really need to do it. Azure Storage can do this for you.
You will need to use Copy Blob functionality and provide the URL of the file you wish to transfer and Azure Storage will asynchronously copy the file into blob storage. Please do note that this is an asynchronous operation and you do not have control over when the blob gets copied.
If you want a synchronous copy operation, you can take a look at the Put Block From URL operation. This is where you control how many bytes of data you want to transfer from the source to blob storage.
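For illustration, here's roughly what both options look like with the Azure.Storage.Blobs .NET SDK. The URL, container and blob names are placeholders, and the source must be publicly readable (or carry a SAS token):

```csharp
using System;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;

// Hypothetical values -- replace with your own.
var connectionString = "<storage-connection-string>";
var sourceUri = new Uri("https://example.com/files/big-download.bin");

// Option 1 -- asynchronous, service-side copy (Copy Blob): the storage
// service pulls from the source itself, so nothing flows through your
// function.
var blobClient = new BlobClient(connectionString, "downloads", "big-download.bin");
var copy = await blobClient.StartCopyFromUriAsync(sourceUri);
await copy.WaitForCompletionAsync(); // optional: poll until the copy finishes

// Option 2 -- synchronous copy (Put Block From URL): you control the
// transfer by staging blocks yourself; for very large sources you would
// stage several blocks, each covering a byte range of the source.
var blockClient = new BlockBlobClient(connectionString, "downloads", "big-download-blocks.bin");
var blockId = Convert.ToBase64String(BitConverter.GetBytes(0L));
await blockClient.StageBlockFromUriAsync(sourceUri, blockId);
await blockClient.CommitBlockListAsync(new[] { blockId });
```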
I am using the Azure Storage File Shares client library for .NET to save files in the cloud, read them, and so on. I have a file saved in the storage which is supposed to be updated every time I perform a specific action in my code.
The way I'm doing it now is by downloading the file from the storage using
ShareFileDownloadInfo download = file.Download();
I then edit the file locally and upload it back to the storage.
The problem is that the file is updated frequently, which means lots of downloads and uploads of a file that keeps growing in size.
Is there a better way of editing a file on Azure storage? Maybe some way to edit the file directly in the storage without the need to download it before editing?
Downloading and uploading the file is the correct way to make edits given how you are currently handling the data. If you find yourself doing this often, there are some strategies you could use to reduce traffic:
1. If you are the only one editing the file, you could cache a copy of it locally, apply updates to that copy, and upload it, instead of downloading the file each time.
2. Cache pending updates and only update the file at regular intervals instead of with each change.
3. Break the single file up into multiple time-boxed files, say one per hour. This wouldn't help with frequency, but it can help with size.
FYI, when pushing logs to storage, many Azure services use a combination of #2 and #3 to minimize traffic.
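As a rough illustration of strategy #2, something like the following buffers changes and flushes them on an interval with the same ShareFileClient you're already using. It's untested, all names are hypothetical, and it assumes the file already exists in the share:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using Azure.Storage.Files.Shares;

// Buffers edits in memory and writes the file back on an interval instead
// of on every change. Names and the flush policy are placeholders.
class BufferedFileWriter
{
    private readonly ShareFileClient _file;
    private readonly List<string> _pending = new();
    private DateTime _lastFlush = DateTime.UtcNow;
    private static readonly TimeSpan FlushInterval = TimeSpan.FromMinutes(5);

    public BufferedFileWriter(ShareFileClient file) => _file = file;

    public void Append(string line)
    {
        _pending.Add(line);
        if (DateTime.UtcNow - _lastFlush >= FlushInterval)
            Flush();
    }

    public void Flush()
    {
        if (_pending.Count == 0) return;

        // Download the current content once, append everything buffered,
        // and upload the result in a single round trip.
        using var current = new MemoryStream();
        _file.Download().Value.Content.CopyTo(current);
        var merged = Encoding.UTF8.GetBytes(string.Join("\n", _pending) + "\n");
        current.Write(merged, 0, merged.Length);

        current.Position = 0;
        _file.Create(current.Length); // (re)size the file to the new length
        _file.Upload(current);

        _pending.Clear();
        _lastFlush = DateTime.UtcNow;
    }
}
```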
I want to download a single file from a remote zip file in the cloud. The zip file is too large for me to download as a whole, so I have decided to look for a way to download only the single file (XML) that I need within the archive. I have tried and tested WebClient and WebRequest, but they download the whole zip file (and usually fail on files this large). I'm eyeing SharpZipLib, but I don't know how to use it. Is it the right library to use, or are there other available ones I can get and test? Thank you so much.
I was wondering if it is possible to get a specific file in a ZIP from Azure File Storage without downloading and unzipping the whole ZIP.
The problem is that the zip file can be large (>1 gb), while the file in the zip which I need is just a few MB tops.
If it is possible, could you provide an example or link(s)?
Thank you
I was wondering if it is possible to get a specific file in a ZIP from Azure File Storage without downloading and unzipping the whole ZIP.
No. You would need to download the entire file from storage, unzip it, and then extract the desired file. It is possible to keep the entire downloaded file in memory (in the form of a stream) and have a zipping library work on that stream instead of saving the whole file to disk, but you still have to read the entire file.
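To illustrate the in-memory approach, here's a minimal sketch using Azure.Storage.Files.Shares and System.IO.Compression; the connection string, share, path and entry names are placeholders:

```csharp
using System.IO;
using System.IO.Compression;
using Azure.Storage.Files.Shares;

// Placeholder connection string and share/file names -- adjust to your layout.
var connectionString = "<storage-connection-string>";
var fileClient = new ShareFileClient(connectionString, "myshare", "archives/big.zip");

// Download the whole zip into memory; there is no way around reading it all.
using var zipStream = new MemoryStream();
fileClient.Download().Value.Content.CopyTo(zipStream);
zipStream.Position = 0;

// Open the stream as a zip and pull out just the entry we care about.
using var archive = new ZipArchive(zipStream, ZipArchiveMode.Read);
var entry = archive.GetEntry("data/needed.xml"); // hypothetical entry name
if (entry != null)
{
    using var output = File.Create("needed.xml");
    using var entryStream = entry.Open();
    entryStream.CopyTo(output);
}
```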
I've got a project which requires a fairly complicated process and I want to make sure I know the best way to do this. I'm using ASP.net C# with Adobe Flex 3. The app server is Mosso (cloud server) and the file storage server is Amazon S3. The existing site can be viewed at NoiseTrade.com
I need to do this:
1. Allow users to upload MP3 files to an album "widget"
2. After the user has uploaded their album/widget, automatically zip the mp3s (for other users to download) and upload the zip along with the mp3 tracks to Amazon S3
I actually have this working already (using client side processing in Flex) but this no longer works because of Adobe's flash 10 "security" update. So now I need to implement this server-side.
The way I am thinking of doing this is:
1. Store the mp3s in a temporary folder on the app server
2. When the artist "publishes", create a zip of the files in that folder using a C# library
3. Start the Amazon S3 upload process (zip and mp3s) and email the user when it is finished (as well as deleting the temporary folder)
A rough sketch of steps 2 and 3 is below.
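Something along these lines, assuming System.IO.Compression for the zipping and the AWS SDK's TransferUtility for the upload. The bucket name and paths are made up, and the email step is left out:

```csharp
using System.IO;
using System.IO.Compression;
using Amazon.S3;
using Amazon.S3.Transfer;

// Rough sketch of steps 2 and 3: zip the temp folder, push everything to
// S3, then clean up. Bucket and paths are placeholders.
var tempFolder = @"C:\temp\album-123";
var zipPath = @"C:\temp\album-123.zip";

// Step 2: zip the uploaded tracks with a built-in C# library.
ZipFile.CreateFromDirectory(tempFolder, zipPath);

// Step 3: upload the zip and the individual mp3 tracks to S3.
var transfer = new TransferUtility(new AmazonS3Client());
transfer.Upload(zipPath, "my-album-bucket");
foreach (var mp3 in Directory.GetFiles(tempFolder, "*.mp3"))
    transfer.Upload(mp3, "my-album-bucket");

// Clean up the temp files once the upload succeeds.
Directory.Delete(tempFolder, recursive: true);
File.Delete(zipPath);
```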
The major problem I see with this approach is that if a user deletes or adds a track later on, I'll have to update the zip file, but the temporary files will no longer exist.
I'm at a loss at the best way to do this and would appreciate any advice you might have.
Thanks!
The bit about updating the zip but not having the temporary files if the user adds or removes a track leads me to suspect that you want to build zips containing multiple tracks, possibly complete albums. If this is incorrect and you're just putting a single mp3 into each zip, then StingyJack is right and you'll probably end up making the file (slightly) larger rather than smaller by zipping it.
If my interpretation is correct, then you're in luck. Command-line zip tools frequently have flags which can be used to add files to or delete files from an existing zip archive. You have not stated which library or other method you're using to do the zipping, but I expect that it probably has this capability as well.
MP3's are compressed. Why bother zipping them?
I would say it is not necessary to zip an already-compressed file format; you are only going to get about a five percent reduction in file size, give or take. MP3s don't really zip up well: by their nature they have already compressed most of the data.
DotNetZip can zip up files from C#/ASP.NET. I concur with the prior posters regarding the compressibility of MP3s: DotNetZip will automatically skip compression on MP3 and just store the file, for exactly this reason. It may still be worthwhile to use a zip as a packaging/archive container, aside from the compression.
If you change the zip file later (user adds a track), you could grab the .zip file from S3 and just update it; DotNetZip can update zip files, too. But in this case you would have to pay for the transfer cost into and out of S3.
DotNetZip can do all of this with in-memory handling of the zips - though that may not be feasible for large archives with lots of MP3s and lots of concurrent users.
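For example, the update flow might look like this with DotNetZip; the paths and entry names are made up:

```csharp
using Ionic.Zip; // DotNetZip

// Update an existing archive in place: add or refresh one track, drop
// another. Paths and entry names are placeholders.
using (ZipFile zip = ZipFile.Read(@"C:\albums\album.zip"))
{
    zip.UpdateFile(@"C:\uploads\new-track.mp3", ""); // adds, or replaces if already present
    if (zip.ContainsEntry("old-track.mp3"))
        zip.RemoveEntry("old-track.mp3");
    zip.Save(); // rewrites the archive with the changes
}
```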