I am reading files into an array; here is the relevant code. A new DiskReader is created for each file, and the path is determined using an OpenFileDialog.
class DiskReader{
// from variables section:
long MAX_STREAM_SIZE = 300 * 1024 * 1024; //300 MB
FileStream fs;
public Byte[] fileData;
...
// Get the file size, check it is within the allowed size (MAX_STREAM_SIZE), then start the process including the progress bar.
using (fs = File.OpenRead(path))
{
if (fs.Length < MAX_STREAM_SIZE)
{
long NumBytes = (fs.Length < MAX_STREAM_SIZE ? fs.Length : MAX_STREAM_SIZE);
updateValues[0] = (NumBytes / 1024 / 1024).ToString("#,###.0");
result = LoadData(NumBytes);
}
else
{
// Need for something to handle big files
}
if (result)
{
mainForm.ShowProgress(true);
bw.RunWorkerAsync();
}
}
...
bool LoadData(long NumBytes)
{
try
{
fileData = new Byte[NumBytes];
fs.Read(fileData, 0, fileData.Length);
return true;
}
catch (Exception e)
{
return false;
}
}
The first time I run this, it works fine. The second time I run it, sometimes it works fine, but most times it throws a System.OutOfMemoryException at
[Edit:
"first time I run this" was a bad choice of words; I meant that when I start the programme and open a file, it is fine. I get the problem when I try to open a different file without exiting the programme. When I open the second file, I am setting the DiskReader to a new instance, which means the fileData array is also a new instance. I hope that makes it clearer.]
fileData = new Byte[NumBytes];
There is no obvious pattern to when it works and when it throws the exception.
I don't think it's relevant, but although the maximum file size is set to 300 MB, the files I am using to test this are between 49 and 64 MB.
Any suggestions on what is going wrong here and how I can correct it?
If the exception is being thrown at that line only, then my guess is that you've got a problem somewhere else in your code, as the comments suggest. Reading the documentation of that exception here, I'd bet you call this function one too many times somewhere and simply go over the limit on object length in memory, since there don't seem to be any problem spots in the code that you posted.
The fs.Length property requires the whole stream to be evaluated, which means reading the file anyway. Try doing something like:
byte[] result;
if (new FileInfo(path).Length < MAX_STREAM_SIZE)
{
result = File.ReadAllBytes(path);
}
Also, depending on your needs, you might avoid using a byte array and read the data directly from the file stream. This should have a much lower memory footprint.
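A minimal sketch of that idea, assuming the data can be processed in chunks rather than needing the whole file in memory at once (ProcessChunk is just a placeholder for whatever you do with the bytes):
// Requires System.IO. Only one small buffer is ever allocated.
using (FileStream fs = File.OpenRead(path))
{
    byte[] buffer = new byte[81920];
    int bytesRead;
    while ((bytesRead = fs.Read(buffer, 0, buffer.Length)) > 0)
    {
        ProcessChunk(buffer, bytesRead); // placeholder for your own processing
    }
}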
If I understand what you want to do correctly, I have this proposal: the best option is to allocate one array of the defined MAX size at the beginning, and then keep that array and only fill it with the new data from another file. This way your memory should be absolutely fine. You just need to store the file size in a separate variable, because the array will always have the same MAX size.
This is a common approach in systems with automatic memory management: it makes the program faster when you allocate a constant amount of memory at the start and never allocate anything during the computation, because the garbage collector does not run as often.
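A minimal sketch of that approach, assuming a single reader instance is reused for every file (the names are illustrative, not your actual class):
// Requires System.IO. The buffer is allocated once and reused; fileSize tracks the valid bytes.
class DiskReader
{
    const long MAX_STREAM_SIZE = 300 * 1024 * 1024; // 300 MB
    readonly byte[] fileData = new byte[MAX_STREAM_SIZE]; // allocated once, reused for every file
    long fileSize; // number of valid bytes currently held in fileData

    public bool LoadData(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            if (fs.Length > MAX_STREAM_SIZE)
                return false; // big files still need separate handling

            fileSize = fs.Length;
            int offset = 0;
            int read;
            // Read may return fewer bytes than requested, so loop until the file is fully read.
            while (offset < fileSize &&
                   (read = fs.Read(fileData, offset, (int)(fileSize - offset))) > 0)
            {
                offset += read;
            }
            return true;
        }
    }
}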
I am facing an OutOfMemoryException when I add a large number of files to a ZipFile. The sample code is as below:
ZipFile file = new ZipFile("E:\\test1.zip");
file.UseZip64WhenSaving = Zip64Option.AsNecessary;
file.ParallelDeflateThreshold = -1;
for (Int64 i = 0; i < 1000000; i++)
{
file.CompressionLevel = Ionic.Zlib.CompressionLevel.None;
byte[] data = Encoding.ASCII.GetBytes("rama");
ZipEntry entry = file.AddEntry(@"myFolder1/test1/myhtml111.html" + i.ToString(), data);
}
file.Save();
I have downloaded the source code of the Ionic.Zip library, and I see that every Add*() function, like AddEntry(), AddFile() etc., adds an item to a Dictionary called _entry.
This dictionary does not get cleared when we call the Save() or Dispose() methods on the ZipFile object.
I feel this is the root cause of OutOfMemoryException.
How do I overcome this issue? Is there any other way to achieve the same result without running into OutOfMemoryException? Am I missing something?
I am open to using other open Source libraries too.
The dictionary holding the internal structure of the archive shouldn't be a problem.
Assuming your entry 'path' is a string of about 50 bytes, even 1,000,000 entries should amount to about 50 MB (a lot, but nowhere near the 2 GB limit). While I haven't bothered checking the size of a ZipEntry, I also doubt it is large enough to matter (each would need to be around 2 kB).
I also think your expectation that this entry dictionary should be cleared is wrong. Since it is the informational structure describing the contents of the zip file, it needs to hold all the entries.
From this point on I am going to assume that the posted code:
byte[] data = Encoding.ASCII.GetBytes("rama");
is a placeholder for actual file data in bytes (since 1M x 4 bytes should be under 4 MB).
The most likely issue here is that the declared byte[] data remains in memory until the entire ZipFile is disposed.
It makes sense to keep this array until the data is saved.
The simplest way to work around this is to wrap the ZipFile in a using block, re-opening and closing the archive for every file you want to add:
var zipFileName = "E:\\test1.zip";
for (int i = 0; i < 1000000; ++i)
{
using (ZipFile zf = new ZipFile(zipFileName))
{
byte[] data = File.ReadAllBytes(file2Zip);
ZipEntry entry = zf.AddEntry(@"myFolder1/test1/myhtml111.html" + i.ToString(), data);
zf.Save();
}
}
This approach might seem wasteful if you are saving a lot of small files; since you are using byte[] directly, it would be quite simple to implement a buffering mechanism.
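A minimal sketch of one such buffering scheme, interpreting "buffering" as adding a fixed-size batch of entries per open/save cycle (the batch size is arbitrary, and the placeholder data is the same as in the question):
var zipFileName = "E:\\test1.zip";
const int batchSize = 1000; // arbitrary: how many entries to add per open/save cycle

for (int i = 0; i < 1000000; i += batchSize)
{
    using (ZipFile zf = new ZipFile(zipFileName))
    {
        for (int j = i; j < i + batchSize && j < 1000000; ++j)
        {
            byte[] data = Encoding.ASCII.GetBytes("rama"); // placeholder data, as in the question
            zf.AddEntry(@"myFolder1/test1/myhtml111.html" + j.ToString(), data);
        }
        zf.Save(); // the byte[] buffers are released when this ZipFile is disposed
    }
}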
While it is true that it's possible to sidestep this issue by compiling to 64-bit, unless you are really just barely going over the 2 GB limit, this would create a very memory-hungry app.
Within a tool copying big files between disks, I replaced the System.IO.FileInfo.CopyTo method with System.IO.Stream.CopyToAsync.
This allows a faster copy and better control during the copy, e.g. I can stop the copy.
But this creates even more fragmentation of the copied files. It is especially annoying when I copy files of many hundreds of megabytes.
How can I avoid disk fragmentation during copy?
With the xcopy command, the /j switch copies files without buffering, and it is recommended for very large files on TechNet.
It does indeed seem to avoid file fragmentation (while a simple file copy within Windows 10 Explorer DOES fragment my file!).
A copy without buffering seems to be the opposite approach to this async copy. Or is there any way to do an async copy without buffering?
Here is my current code for the async copy. I kept the default buffer size of 81920 bytes, i.e. 10 * 1024 * sizeof(Int64).
I am working with NTFS file systems, thus 4096-byte clusters.
EDIT: I updated the code with SetLength as suggested, added FileOptions.Asynchronous while creating the destinationStream, and fixed setting the attributes AFTER setting the times (otherwise, an exception is thrown for ReadOnly files).
int bufferSize = 81920;
try
{
using (FileStream sourceStream = source.OpenRead())
{
// Remove existing file first
if (File.Exists(destinationFullPath))
File.Delete(destinationFullPath);
using (FileStream destinationStream = File.Create(destinationFullPath, bufferSize, FileOptions.Asynchronous))
{
try
{
destinationStream.SetLength(sourceStream.Length); // avoid file fragmentation!
await sourceStream.CopyToAsync(destinationStream, bufferSize, cancellationToken);
}
catch (OperationCanceledException)
{
operationCanceled = true;
}
} // properly disposed after the catch
}
}
catch (IOException e)
{
actionOnException(e, "error copying " + source.FullName);
}
if (operationCanceled)
{
// Remove the partially written file
if (File.Exists(destinationFullPath))
File.Delete(destinationFullPath);
}
else
{
// Copy meta data (attributes and time) from source once the copy is finished
File.SetCreationTimeUtc(destinationFullPath, source.CreationTimeUtc);
File.SetLastWriteTimeUtc(destinationFullPath, source.LastWriteTimeUtc);
File.SetAttributes(destinationFullPath, source.Attributes); // after set time if ReadOnly!
}
I also fear that the File.SetAttributes and time calls at the end of my code could increase file fragmentation.
Is there a proper way to create a 1:1 asynchronous file copy without any file fragmentation, i.e. asking the HDD to give the file stream only contiguous sectors?
Other topics regarding file fragmentation, like How can I limit file fragmentation while working with .NET, suggest incrementing the file size in larger chunks, but that does not seem to be a direct answer to my question.
"but the SetLength method does the job"
It does not do the job. It only updates the file size in the directory entry, it does not allocate any clusters. The easiest way to see this for yourself is by doing this on a very large file, say 100 gigabytes. Note how the call completes instantly. Only way it can be instant is when the file system does not also do the job of allocating and writing the clusters. Reading from the file is actually possible, even though the file contains no actual data, the file system simply returns binary zeros.
This will also mislead any utility that reports fragmentation. Since the file has no clusters, there can be no fragmentation. So it only looks like you solved your problem.
The only thing you can do to force the clusters to be allocated is to actually write to the file. It is in fact possible to allocate 100 gigabytes worth of clusters with a single write. You must use Seek() to position to Length-1, then write a single byte with Write(). This will take a while on a very large file; it is in effect no longer async.
The odds that it will reduce fragmentation are not great. You merely reduced the risk somewhat that the writes will be interleaved by writes from other processes. Only somewhat, since the actual writing is done lazily by the file system cache. The core issue is that the volume was fragmented before you began writing, and it will never be less fragmented after you're done.
Best thing to do is to just not fret about it. Defragging is automatic on Windows these days, has been since Vista. Maybe you want to play with the scheduling, maybe you want to ask more about it at superuser.com
I think FileStream.SetLength is what you need.
Considering Hans Passant's answer,
in my code above, an alternative to
destinationStream.SetLength(sourceStream.Length);
would be, if I understood it properly:
byte[] writeOneZero = {0};
destinationStream.Seek(sourceStream.Length - 1, SeekOrigin.Begin);
destinationStream.Write(writeOneZero, 0, 1);
destinationStream.Seek(0, SeekOrigin.Begin);
It seems indeed to consolidate the copy.
But a look at the source code of FileStream.SetLengthCore suggests it does almost the same thing, seeking to the end but without writing one byte:
private void SetLengthCore(long value)
{
Contract.Assert(value >= 0, "value >= 0");
long origPos = _pos;
if (_exposedHandle)
VerifyOSHandlePosition();
if (_pos != value)
SeekCore(value, SeekOrigin.Begin);
if (!Win32Native.SetEndOfFile(_handle)) {
int hr = Marshal.GetLastWin32Error();
if (hr==__Error.ERROR_INVALID_PARAMETER)
throw new ArgumentOutOfRangeException("value", Environment.GetResourceString("ArgumentOutOfRange_FileLengthTooBig"));
__Error.WinIOError(hr, String.Empty);
}
// Return file pointer to where it was before setting length
if (origPos != value) {
if (origPos < value)
SeekCore(origPos, SeekOrigin.Begin);
else
SeekCore(0, SeekOrigin.End);
}
}
Anyway, I am not sure these methods guarantee no fragmentation, but they at least avoid it in most cases. The automatic defragmentation tool can then finish the job at a low performance cost.
My initial code, without these Seek calls, created hundreds of thousands of fragments for a 1 GB file, slowing down my machine when the defragmentation tool went active.
I work with a program that takes large amounts of data, turns the data into xml files, then takes those xml files and zips them for use in another program. Occasionally, during the zipping process, one or two xml files get left out. It is fairly rare, once or twice a month, but when it does happen it's a big mess. I am looking for help figuring out why the files don't get zipped and how to prevent it. The code is straightforward:
public string AddToZip(string outfile, string toCompress)
{
if (!File.Exists(toCompress)) throw new FileNotFoundException("Could not find the file to compress", toCompress);
string dir = Path.GetDirectoryName(outfile);
if(!Directory.Exists(dir))
{
Directory.CreateDirectory(dir);
}
// The program that gets this data can't handle files over
// 20 MB, so it splits it up into two or more files if it hits the
// limit.
if (File.Exists(outfile))
{
FileInfo tooBig = new FileInfo(outfile);
int converter = 1024;
float fileSize = tooBig.Length / converter; //bytes to KB
fileSize = fileSize / converter; //KB to MB
int limit = CommonTypes.Helpers.ConfigHelper.GetConfigEntryInt("zipLimit", "19");
if (fileSize >= limit)
{
outfile = MakeNewName(outfile);
}
}
using (ZipFile zf = new ZipFile(outfile))
{
zf.AddFile(toCompress,"");
zf.Save();
}
return outfile;
}
Ultimately, what I want is a check that sees whether any xml files weren't added to the zip after the zip file is created, but stopping the problem in its tracks would be best overall. Thanks for the help.
Make sure you have that code inside a try...catch statement. Also make sure that, if you have done that, you do something with the exception. It would not be the first codebase with this type of exception handling:
try
{
//...
}
catch { }
Given the code above, if any exception occurs in your process, you will never notice it.
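At a minimum, something along these lines makes failures visible instead of silently swallowing them (the logger call and xmlPath variable are placeholders for whatever you have):
try
{
    outfile = AddToZip(outfile, xmlPath);
}
catch (Exception ex)
{
    // Record the failure so a skipped xml file is noticed immediately.
    logger.Error("Failed to add " + xmlPath + " to " + outfile, ex); // hypothetical logger
    throw; // or collect the failed file for a retry pass
}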
It's hard to judge from this function alone; here's a list of things that can go wrong:
- The toCompress file can be gone by the time zf.AddFile is called (but after the Exists test). Test the return value or add exception handling to detect this.
- The zip outfile can be just below the size limit; adding a new file can push it over the limit.
- AddToZip() may be called concurrently, which may cause adding to fail.
How is removal of the toCompress file handled? I think adding locking to AddToZip() at function scope might also be a good idea, as sketched below.
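A minimal sketch of that locking, assuming a single shared lock object is acceptable for your throughput (the existing method body stays unchanged inside the lock):
private static readonly object zipLock = new object(); // serializes all zip writes

public string AddToZip(string outfile, string toCompress)
{
    lock (zipLock)
    {
        // ... existing body of AddToZip unchanged ...
        return outfile;
    }
}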
This could be a timing issue. You are checking to see if outfile is too big before trying to add the toCompress file. What you should be doing is:
Add toCompress to outfile
Check to see if adding the file made outfile too big
If outfile is now too big, remove toCompress from it, create a new outfile, and add toCompress to the new outfile (see the sketch after this answer).
I suspect that you occasionally have an outfile that is just under the limit, but adding toCompress puts it over. Then the receiving program does not process outfile because it is too big.
I could be completely off base, but it is something to check.
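A rough sketch of that ordering, reusing the DotNetZip calls, the limit value, and the MakeNewName helper from your method; RemoveEntry does exist in that library, but treat the details as an untested outline rather than a drop-in replacement:
using (ZipFile zf = new ZipFile(outfile))
{
    ZipEntry entry = zf.AddFile(toCompress, "");
    zf.Save();

    long limitBytes = (long)limit * 1024 * 1024;
    if (new FileInfo(outfile).Length >= limitBytes)
    {
        // The new entry pushed the archive over the limit: move it to a fresh zip.
        zf.RemoveEntry(entry);
        zf.Save();

        outfile = MakeNewName(outfile);
        using (ZipFile newZf = new ZipFile(outfile))
        {
            newZf.AddFile(toCompress, "");
            newZf.Save();
        }
    }
}
return outfile;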
I am working on a simple program that grabs images from a remote IP camera. After days of research, I was able to extract JPEG images from an MJPEG live stream with sample code I found.
I did a prototype using Windows Forms. With Windows Forms, I receive approximately 80 images every 10 seconds from the IP camera.
Now I have ported the code to Unity3D and I get about 2 frames every 10 seconds.
So basically about 78 images are not received.
The thing looks like a medieval PowerPoint slide show.
I am running the function in a new Thread, just like I did in the Windows Forms app. I first thought the problem in Unity was because I was displaying the image, but it wasn't.
I removed the code that displays the image as a texture and used an integer to count the number of images received. Still, I get about 2 to 4 images every 10 seconds in Unity, whereas in the Windows Forms app I get about 80 to 100 images every 10 seconds.
Receiving 2 images in 10 seconds in Unity is unacceptable for what I am doing. The code I wrote doesn't seem to be the problem, because it works great in Windows Forms.
Things I've Tried:
I thought the problem was the Unity3D Editor run-time, so I built it for Windows 10 64-bit and ran that, but it didn't solve the problem.
Changed the Scripting Backend from Mono2x to IL2CPP, but the problem remains.
Changed the Api Compatibility Level from .NET 2.0 to .NET 2.0 Subset, and nothing changed.
Below is the simple function that is having the problem. It runs too slowly in Unity even though I call it from another thread.
bool keepRunning = true;
private void Decode_MJPEG_Images(string streamTestURL = null)
{
keepRunning = true;
streamTestURL = "http://64.122.208.241:8000/axis-cgi/mjpg/video.cgi?resolution=320x240"; //For Testing purposes only
// create HTTP request
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(streamTestURL);
// get response
WebResponse resp = req.GetResponse();
System.IO.Stream imagestream = resp.GetResponseStream();
const int BufferSize = 5000000;
byte[] imagebuffer = new byte[BufferSize];
int a = 2;
int framecounter = 0;
int startreading = 0;
byte[] start_checker = new byte[2];
byte[] end_checker = new byte[2];
while (keepRunning)
{
start_checker[1] = (byte)imagestream.ReadByte();
end_checker[1] = start_checker[1];
//This if statement searches for the JPEG header, and performs the relevant operations
if (start_checker[0] == 0xff && start_checker[1] == 0xd8)// && Reset ==0)
{
Array.Clear(imagebuffer, 0, imagebuffer.Length);
//Rebuild jpeg header into imagebuffer
imagebuffer[0] = 0xff;
imagebuffer[1] = 0xd8;
a = 2;
framecounter++;
startreading = 1;
}
//This if statement searches for the JPEG footer, and performs the relevant operations
if (end_checker[0] == 0xff && end_checker[1] == 0xd9)
{
startreading = 0;
//Write the final byte of the JPEG footer into imagebuffer
imagebuffer[a] = start_checker[1];
System.IO.MemoryStream jpegstream = new System.IO.MemoryStream(imagebuffer);
Debug.Log("Received Full Image");
Debug.Log(framecounter.ToString());
//Display Image
}
//This if statement fills the imagebuffer, if the relevant flags are set
if (startreading == 1 && a < BufferSize)
{
imagebuffer[a] = start_checker[1];
a++;
}
//Catches error condition where a = buffer size - this should not happen in normal operation
if (a == BufferSize)
{
a = 2;
startreading = 0;
}
start_checker[0] = start_checker[1];
end_checker[0] = end_checker[1];
}
resp.Close();
}
Now I am blaming HttpWebRequest for this problem. Maybe it was poorly implemented in Unity. Not sure....
What's going on? Why is this happening? How can I fix it?
Is it perhaps the case that one has to use Read, pulling in a lot of bytes at a time, instead of reading byte by byte?
Read: https://msdn.microsoft.com/en-us/library/system.io.stream.read(v=vs.110).aspx
Return value (System.Int32): the total number of bytes read into the buffer. This can be less than the number of bytes requested if that many bytes are not currently available.
Conceivably, ReadAsync could help (see the manual), although it results in wildly different code.
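A minimal sketch of that idea, assuming the frame-parsing state machine from the question is adapted to consume whole blocks (the buffer size is arbitrary):
// Read the response stream in blocks instead of one byte at a time,
// then scan each block for the JPEG start (0xFF 0xD8) and end (0xFF 0xD9) markers.
byte[] readBuffer = new byte[32 * 1024];
int bytesRead;
while (keepRunning && (bytesRead = imagestream.Read(readBuffer, 0, readBuffer.Length)) > 0)
{
    for (int i = 0; i < bytesRead; i++)
    {
        byte current = readBuffer[i];
        // ... feed 'current' into the same start/end marker state machine as before ...
    }
}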
I'm a bit puzzled as to which part of your code you are saying has the performance problem: is it displaying the MJPEG, or is it the snippet of code you've published here? Assuming the HttpWebRequest isn't your problem (which you can easily test in Fiddler to see how long the call and fetch actually take), then I'm guessing your problem is in the display of the MJPEG, not in the code you've posted (which won't be different between WinForms and Unity).
My guess is that, if the problem is in Unity, you are passing the created MemoryStream to Unity to create a graphics resource? Your code looks like it reads the stream and, when it hits the end-of-image marker, creates a new MemoryStream containing the entire data buffer. This may be a problem for Unity that isn't a problem in WinForms: the memory stream contains the whole buffer you created, which is bigger than the actual content you read. Does Unity perhaps see this as a corrupted JPEG?
Try using the MemoryStream constructor that takes a byte range from your byte[] and pass through just the data you know makes up your image.
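For example, something like this, using the index 'a' from the posted code as the count of valid bytes (treat the exact offset arithmetic as an assumption to verify against your parsing):
// Wrap only the bytes that were actually written for this frame.
var jpegstream = new System.IO.MemoryStream(imagebuffer, 0, a + 1);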
Other issues in the code might be (though unlikely to be performance related): Large Object Heap fragmentation from the creation and discarding of the large byte[]; non-dynamic storage of the incoming stream (a fixed destination buffer size); no checking of the incoming stream size or end-of-stream indicators (if the response stream does not contain the whole image, there doesn't seem to be a strategy to deal with it).
In my project, users can upload files up to 1 GB. I want to copy that uploaded file's stream data to a second stream.
If I use something like this:
int i;
while ( ( i = fuVideo.FileContent.ReadByte() ) != -1 )
{
strm.WriteByte((byte)i);
}
then it takes a very long time.
If I try to do this with a byte array, I would need to specify the array size as a long, which is not valid.
If someone has a better idea for doing this, please let me know.
--
Hi Khepri, thanks for your response. I tried Stream.CopyTo, but it takes a very long time to copy one stream object to the second.
I tried with an 8.02 MB file and it took 3 to 4 minutes.
The code I have added is:
Stream fs = fuVideo.FileContent; //fileInf.OpenRead();
Stream strm = ftp.GetRequestStream();
fs.CopyTo(strm);
If I am doing something wrong, please let me know.
Is this .NET 4.0?
If so Stream.CopyTo is probably your best bet.
If not, and to give credit where credit is due, see the answer in this SO thread. If you're not on .NET 4.0, make sure to read the comments in that thread, as there are some alternative solutions (async stream reading/writing) that may be worth investigating if performance is at an absolute premium, which may be your case.
EDIT:
Based on the update, are you trying to copy the file to a remote destination? (Just guessing based on GetRequestStream().) The time is going to be the actual transfer of the file content to the destination. So in this case, when you do fs.CopyTo(strm), it has to move those bytes from the source stream to the remote server. That's where the time is coming from. You're literally doing a file upload of a huge file. CopyTo will block your processing until it completes.
I'd recommend looking at spinning this kind of processing off to another task (a sketch follows below), or at the least looking at the asynchronous option I listed. You can't really avoid this taking a long time; you're constrained by the file size and the available upload bandwidth.
I verified that when working locally, CopyTo is sub-second. I tested with a half-gig file, and a quick Stopwatch returned a processing time of 800 milliseconds.
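A minimal sketch of that offloading, assuming .NET 4.0's Task Parallel Library and the same fuVideo/ftp objects from the question; the rest of the FTP request handling is assumed to stay as you have it:
// Requires System.Threading.Tasks. Kick the transfer off on a worker thread
// so the caller is not blocked for the duration of the upload.
Task uploadTask = Task.Factory.StartNew(() =>
{
    using (Stream fs = fuVideo.FileContent)
    using (Stream strm = ftp.GetRequestStream())
    {
        fs.CopyTo(strm);
    }
});
// ... do other work, then wait (or poll uploadTask.IsCompleted) when you need the upload finished.
uploadTask.Wait();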
If you are not on .NET 4.0, use this:
static void CopyTo(Stream fromStream, Stream destination, int bufferSize)
{
int num;
byte[] buffer = new byte[bufferSize];
while ((num = fromStream.Read(buffer, 0, buffer.Length)) != 0)
{
destination.Write(buffer, 0, num);
}
}
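For example, called with the streams from the question (the buffer size is an arbitrary choice):
Stream fs = fuVideo.FileContent;
Stream strm = ftp.GetRequestStream();
CopyTo(fs, strm, 81920); // copy in 80 KB chunks instead of byte by byte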