HttpPostedFileBase ContentLength becomes 0 when C# iterates the zip file

I have a web interface where users can choose one of many files from their local computer and upload them to a central location, in this case Azure Blob Storage. My C# code validates that each file name ends with .bin. The receiving method takes an array of HttpPostedFileBase.
I want to allow users to choose a zip file instead. In my C# code, I iterate through the contents of the zip file and check each file name to verify that it ends with .bin.
However, when I iterate through the zip file, the ContentLength of the HttpPostedFileBase object becomes 0 (zero), and when I later upload the zip file to Azure, it is empty.
How can I check the file name endings without manipulating the zip file?
I have tried to deep-copy a single HttpPostedFileBase object, but it is not serializable.
I have also tried to make a copy of the array, but nothing works; everything seems to be copied by reference, not by value. An example of my code follows. (Yes, I tried the lines individually.)
private static bool CanUploadBatchOfFiles(HttpPostedFileBase[] files)
{
    var filesCopy = new HttpPostedFileBase[files.Length];
    // None of these lines helps; each copies references, not new objects
    Array.Copy(files, 0, filesCopy, 0, files.Length);
    Array.Copy(files, filesCopy, files.Length);
    files.CopyTo(filesCopy, 0);
    // ... rest of method left out
}
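That behavior is expected: HttpPostedFileBase is a reference type, so Array.Copy and CopyTo duplicate the references, not the underlying objects or their streams. A minimal sketch of what happens (names are illustrative):
var filesCopy = new HttpPostedFileBase[files.Length];
Array.Copy(files, filesCopy, files.Length);

// Both arrays point at the same objects, so reading
// filesCopy[0].InputStream advances files[0].InputStream too.
bool sameObject = ReferenceEquals(files[0], filesCopy[0]); // true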
This is how I iterate through the zip file:
foreach (var file in filesCopy)
{
    if (file.FileName.EndsWith(".zip"))
    {
        using (ZipArchive zipFile = new ZipArchive(file.InputStream))
        {
            foreach (ZipArchiveEntry entry in zipFile.Entries)
            {
                if (entry.Name.EndsWith(".bin"))
                {
                    // Some code left out
                }
            }
        }
    }
}

I solved my problem. I had to do two separate things:
First, I no longer copy the array. Instead, for each zip file, I copy the stream. This keeps ContentLength at whatever length it was.
Second, I reset the stream position after looking inside the zip file. Without this, the zip file I upload to Azure Blob Storage is empty.
private static bool CanUploadBatchOfFiles(HttpPostedFileBase[] files)
{
    foreach (var file in files)
    {
        if (file.FileName.EndsWith(".zip"))
        {
            // Part one of the solution: inspect a copy of the stream,
            // not the original InputStream
            Stream fileCopy = new MemoryStream();
            file.InputStream.CopyTo(fileCopy);
            using (ZipArchive zipFile = new ZipArchive(fileCopy))
            {
                foreach (ZipArchiveEntry entry in zipFile.Entries)
                {
                    // Code left out
                }
            }
            // Part two of the solution: rewind the original stream
            // so the later upload reads it from the start
            file.InputStream.Position = 0;
        }
    }
    return true;
}
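For completeness, the later upload step can then read the rewound stream directly. A minimal sketch, assuming the classic WindowsAzure.Storage client and an existing CloudBlobContainer named container (both assumptions, not shown in the original post):
// Assumes: using Microsoft.WindowsAzure.Storage.Blob;
CloudBlockBlob blob = container.GetBlockBlobReference(file.FileName);

// Works only because file.InputStream.Position was reset to 0 above
blob.UploadFromStream(file.InputStream);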

Related

Why is my zip file not showing any contents?

I've created a zip file method in my Web API which returns a zip file to the front end (Angular/TypeScript) that should download the zip file in the browser. The issue is that the file appears to contain data, judging by its size in KB, but trying to extract the files reports that it is empty. From a bit of research, this is most likely down to the file being corrupt, but I want to know where it is going wrong. Here's my code:
WebApi:
I won't show the controller, as it basically just takes the inputs and passes them to the method. Each DownloadFileResult passed in has a byte[] in its File property.
public FileContentResult CreateZipFile(IEnumerable<DownloadFileResult> files)
{
    using (var compressedFileStream = new MemoryStream())
    {
        using (var zipArchive = new ZipArchive(compressedFileStream, ZipArchiveMode.Update))
        {
            foreach (var file in files)
            {
                var zipEntry = zipArchive.CreateEntry(file.FileName);
                using (var entryStream = zipEntry.Open())
                {
                    entryStream.Write(file.File, 0, file.File.Length);
                }
            }
        }
        // The archive must be disposed before ToArray() so the
        // central directory is flushed into the stream
        return new FileContentResult(compressedFileStream.ToArray(), "application/zip");
    }
}
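As an aside, a common variant when building a zip in memory is ZipArchiveMode.Create with leaveOpen: true, which makes the stream's lifetime explicit. This is a generic sketch, not the poster's code:
using (var ms = new MemoryStream())
{
    // leaveOpen: true keeps ms usable after the archive is disposed
    using (var zip = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
    {
        var entry = zip.CreateEntry("example.txt");
        using (var entryStream = entry.Open())
        {
            byte[] payload = System.Text.Encoding.UTF8.GetBytes("hello");
            entryStream.Write(payload, 0, payload.Length);
        }
    }
    byte[] zipBytes = ms.ToArray(); // a complete, valid zip
}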
The method above appears to work, in that it generates a result with data. Here's my front end code:
let fileData = this._filePaths;
this._fileStorageProxy.downloadFile(Object.entries(fileData).map(([key, val]) => val), this._pId).subscribe(result => {
    let data = result.data.fileContents;
    const blob = new Blob([data], { type: 'application/zip' });
    const url = window.URL.createObjectURL(blob);
    window.open(url);
});
The front end code then shows me a zip file being downloaded, which, as I say, appears to have data given its size, but I can't extract it.
Update
I tried writing compressedFileStream to a file on my local machine, and I can see that it creates a zip file whose contents I can extract. This leads me to believe something is wrong with the front end, or at least with what the front end code is receiving.
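For reference, a one-line way to do that kind of check (a debugging sketch; the path is illustrative and assumes the stream from CreateZipFile):
// Dump the in-memory zip to disk to verify it is a valid archive
File.WriteAllBytes(@"C:\temp\debug.zip", compressedFileStream.ToArray());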
2nd Update
OK, it turns out this is specific to how we do things here. The request goes through our platform, but for downloads it can only handle a BinaryTransferObject, and I needed to hit a different endpoint. With a tweak to no longer return a FileContentResult, hitting the right endpoint, and making the URL a simple href, it's now working.

How to recursively explore zip file contents without extraction

I want to write a function that explores a ZIP file and determines whether it contains a .png file. The catch: it must also explore zip files nested within the parent zip (including zips inside other zips and inside folders).
As if that were not painful enough, the task must be done without extracting any of the zip files, parent or children.
I would like to write something like this (semi-pseudo):
public bool findPng(string zipPath)
{
    bool flag = false;
    using (ZipArchive archive = ZipFile.OpenRead(zipPath))
    {
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            string s = entry.FullName;
            if (s.EndsWith(".zip"))
            {
                /* recursively calling findPng */
                flag = findPng(s);
                if (flag == true)
                    return true;
            }
            /* same as above with folders within the zip */
            if (s.EndsWith(".png"))
                return true;
        }
        return false;
    }
}
The problem is that I can't find a way to explore inner zip files without extracting them, and not extracting is a hard prerequisite.
Thanks in advance!
As I pointed out in the question I marked yours as basically a duplicate of, you need to open the inner zip file.
I'd change your "open from file" method to be like this:
// Open a ZipArchive from a file on disk
public bool findPng(string zipPath)
{
    using (ZipArchive archive = ZipFile.OpenRead(zipPath))
    {
        return findPng(archive);
    }
}
And then have a separate method that takes a ZipArchive, so that you can call it recursively by opening each entry as a Stream, as demonstrated here:
// Search a ZipArchive (outer or nested) for a PNG
public bool findPng(ZipArchive archive)
{
    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        string s = entry.FullName;
        if (s.EndsWith(".zip"))
        {
            // Open the inner zip and pass it to this same method
            using (ZipArchive innerArchive = new ZipArchive(entry.Open()))
            {
                if (findPng(innerArchive))
                    return true;
            }
        }
        if (s.EndsWith(".png"))
            return true;
    }
    return false;
}
As an optimisation, I would recommend checking all of the filenames before handling nested zip files.
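A sketch of that optimisation (my names, not the answerer's; it buffers each nested entry into a MemoryStream because nested entry streams are not seekable):
public bool FindPngCheapFirst(ZipArchive archive)
{
    // Pass 1: check the cheap case, file names at this level
    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        if (entry.FullName.EndsWith(".png", StringComparison.OrdinalIgnoreCase))
            return true;
    }

    // Pass 2: only now pay the cost of opening nested archives
    foreach (ZipArchiveEntry entry in archive.Entries)
    {
        if (!entry.FullName.EndsWith(".zip", StringComparison.OrdinalIgnoreCase))
            continue;

        using (var entryStream = entry.Open())
        using (var buffer = new MemoryStream())
        {
            entryStream.CopyTo(buffer);
            buffer.Position = 0;
            using (var inner = new ZipArchive(buffer))
            {
                if (FindPngCheapFirst(inner))
                    return true;
            }
        }
    }
    return false;
}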

How to extract multi-volume archive within Azure Blob Storage?

I have a multi-volume archive stored in Azure Blob Storage that is split into a series of zips titled like this: Archive-Name.zip.001, Archive-Name.zip.002, etc., up to Archive-Name.zip.010. Each file is 250 MB and contains hundreds of PDFs.
Currently we iterate through each archive part and extract the PDFs. This works except when the last PDF in an archive part has been split across two parts; ZipFile in C# cannot process the split file and throws an exception.
We tried reading all the archive parts into a single MemoryStream and then extracting the files, but the stream exceeds the 2 GB MemoryStream limit, so this method does not work either.
It is not feasible to download the archive into a machine's memory, extract it, and then upload the PDFs to a new file. The extraction needs to be done in Azure, where the program will run.
This is the code we are currently using; it cannot handle PDFs split between two archive parts.
public static void UnzipTaxForms(TextWriter log, string type, string fiscalYear)
{
    var folderName = "folderName";
    var outPutContainer = GetContainer("containerName");
    CreateIfNotExists(outPutContainer);

    var fileItems = ListFileItems(folderName);
    fileItems = fileItems.Where(i => i.Name.Contains(".zip")).ToList();

    foreach (var file in fileItems)
    {
        // ZipFile.Read is the DotNetZip API; each part is read on its
        // own, which is why entries split across parts fail
        using (var ziped = ZipFile.Read(GetMemoryStreamFromFile(folderName, file.Name)))
        {
            foreach (var zipEntry in ziped)
            {
                using (var outPutStream = new MemoryStream())
                {
                    zipEntry.Extract(outPutStream);
                    var blockblob = outPutContainer.GetBlockBlobReference(zipEntry.FileName);
                    outPutStream.Seek(0, SeekOrigin.Begin);
                    blockblob.UploadFromStream(outPutStream);
                }
            }
        }
    }
}
Another note: we are unable to change the way the multi-volume archive is generated. Any help would be appreciated.
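One direction worth sketching (an assumption, not a verified fix): if the parts are a raw byte split, as tools like 7-Zip produce for .zip.001/.002 volumes, concatenating them in order reproduces the original zip. Streaming the parts into a temp file avoids the 2 GB MemoryStream ceiling; GetStreamFromFile is a hypothetical helper standing in for the poster's blob access.
var tempPath = Path.GetTempFileName();
using (var joined = File.Open(tempPath, FileMode.Create, FileAccess.ReadWrite))
{
    // Stream every part, in order, into one file on local disk
    foreach (var part in fileItems.OrderBy(i => i.Name))
    {
        using (var partStream = GetStreamFromFile(folderName, part.Name))
        {
            partStream.CopyTo(joined);
        }
    }

    joined.Position = 0;
    using (var zip = ZipFile.Read(joined)) // DotNetZip, as in the question
    {
        // Extract entries and upload to blobs as before
    }
}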

c# zip file - Extract file last

Quick question: I need to extract a zip file and have a certain file extracted last.
More info: I know how to extract a zip file with C# (.NET Framework 4.5).
The problem I'm having now is that the zip file always contains a file named (for example) "myFlag.xml" plus a few more files.
Since I need to support some old applications that listen to the folder I'm extracting to, I want to make sure that the XML file is always extracted last.
Is there something like "exclude" for the zip function, so I can extract everything except a certain file and then extract that file on its own?
Thanks.
You could probably loop over the ZipArchive with foreach, extract every entry that doesn't match the flagged file name, and then, after the loop is done, extract that last file on its own.
Something like this:
private void TestUnzip_Foreach()
{
    using (ZipArchive z = ZipFile.Open("zipfile.zip", ZipArchiveMode.Read))
    {
        string LastFile = "lastFileName.ext";
        int curPos = 0;
        int lastFilePosition = 0;

        foreach (ZipArchiveEntry entry in z.Entries)
        {
            if (entry.Name != LastFile)
            {
                entry.ExtractToFile(@"C:\somewhere\" + entry.FullName);
            }
            else
            {
                // Remember where the special file sits; extract it later
                lastFilePosition = curPos;
            }
            curPos++;
        }

        z.Entries[lastFilePosition].ExtractToFile(@"C:\somewhere_else\" + LastFile);
    }
}
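An equivalent shape that avoids the index bookkeeping is two passes over z.Entries with LINQ (a sketch using the same names; requires using System.Linq):
foreach (var entry in z.Entries.Where(e => e.Name != LastFile))
    entry.ExtractToFile(@"C:\somewhere\" + entry.FullName);

// Only now touch the flag file, so watchers see it last
foreach (var entry in z.Entries.Where(e => e.Name == LastFile))
    entry.ExtractToFile(@"C:\somewhere_else\" + entry.FullName);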

Manipulating freshly uploaded file causes IOException

My ASP.NET MVC app requires a file upload. In the course of the upload, I'd like to manipulate the freshly uploaded file.
public ActionResult Edit(int id, FormCollection collection)
{
    Block block = userrep.GetBlock(id);
    foreach (string tag in Request.Files)
    {
        var file = Request.Files[tag] as HttpPostedFileBase;
        if (file.ContentLength == 0)
            continue;

        string tempfile = Path.GetTempFileName();
        file.SaveAs(tempfile);

        // This doesn't seem to make any difference!!
        // file.InputStream.Close();

        if (FileIsSmallEnough(file))
        {
            // Will throw an exception!!
            File.Move(tempfile, permanentfile);
        }
        else
        {
            GenerateResizedFile(tempfile, permanentfile);
            // Will throw an exception!!
            File.Delete(tempfile);
        }
        block.Image = permanentfile;
    }
    userrep.Save();
    // ... rest of the action left out
The problem with this snippet is that any attempt to manipulate the freshly uploaded temp file throws an IOException ("The process cannot access the file because it is being used by another process"). Of course, I can bypass the problem by copying rather than moving the uploaded file, but I still can't delete the temp file once I have a better alternative.
Any advice?
Duffy
As you have mentioned in your comments, you load an Image from the file. The MSDN documentation states that the file remains locked until the image is disposed:
http://msdn.microsoft.com/en-us/library/stf701f5.aspx
To dispose your image, you can either call the Dispose method on the instance, or use the preferred mechanism of the using statement:
private bool FileIsSmallEnough(string filename)
{
    using (Image i = Image.FromFile(filename))
    {
        // ... size check on i goes here ...
        return true; // placeholder
    } // the file lock is released when i is disposed
}
This should solve the problem.
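A related hedged sketch: loading the image from an in-memory copy of the bytes releases the file handle immediately, so the temp file can be moved or deleted afterwards (names and thresholds are illustrative):
private static bool FileIsSmallEnough(string path, int maxWidth, int maxHeight)
{
    // Read everything up front; the OS file handle closes right away
    byte[] bytes = File.ReadAllBytes(path);
    using (var ms = new MemoryStream(bytes))
    using (var img = Image.FromStream(ms))
    {
        return img.Width <= maxWidth && img.Height <= maxHeight;
    }
}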
