I'm writing an UnmanagedRewindBuffer and I want to implement dynamic resizing of the buffer. I've tried several different things, but I can't seem to be able to get it right. The basic idea is that:
I allocate a new block of unmanaged memory.
Create a new UnmanagedMemoryStream (UMS).
Copy the contents from the old UMS to the new UMS.
Dispose of the old UMS and free the old allocated block.
Replace the old UMS and memory block with the new ones.
Here is my resize function:
private void DynamicallyResizeBuffer(long spaceNeeded)
{
while (_ums.Length < spaceNeeded)
{
// Allocate a new buffer
int length = (int)((double)spaceNeeded * RESIZE_FACTOR);
IntPtr tempMemoryPointer = Marshal.AllocHGlobal(length);
// Set the temporary pointer to null
//MemSet(tempMemoryPointer, length, 0);
byte* bytePointer = (byte*)tempMemoryPointer.ToPointer();
for (int i = 0; i < length; i++)
{
*(bytePointer + i) = 0;
}
// Copy the data
// MoveMemory(bytePointer, _memoryPointer.ToPointer(), _length);
// Create a new UnmanagedMemoryStream
UnmanagedMemoryStream tempUms = new UnmanagedMemoryStream(bytePointer, length, length, FileAccess.ReadWrite);
// Set up the reader and writers
BinaryReader tempReader = new BinaryReader(tempUms);
BinaryWriter tempWriter = new BinaryWriter(tempUms);
// Copy the data
_ums.Position = 0;
tempWriter.Write(ReadBytes(_length));
// I had deleted this line while I was using the writers and
// I forgot to copy it over, but the line was here when I used
// the MoveMemory function
tempUms.Position = _ums.Position;
// Free the old resources
Free(true);
_ums = tempUms;
_reader = tempReader;
_writer = tempWriter;
_length = length;
}
}
And here is my test for resizing:
public void DynamicResizeTest()
{
Int32 expected32 = 32;
Int32 actual32 = 0;
UInt64 expected64 = 64;
UInt64 actual64 = 0;
string expected = "expected";
string actual = string.Empty;
string actualFromBytes = string.Empty;
byte[] expectedBytes = Encoding.UTF8.GetBytes(expected);
// Create a 4-byte buffer
UnmanagedRewindBuffer ubs = null;
try
{
ubs = new UnmanagedRewindBuffer(4, 1);
ubs.WriteInt32(expected32);
// should dynamically resize for the 64 bit integer
ubs.WriteUInt64(expected64);
ubs.WriteString(expected);
// should dynamically resize for the bytes
ubs.WriteByte(expectedBytes);
ubs.Rewind();
actual32 = ubs.ReadInt32();
actual64 = ubs.ReadUInt64();
actual = ubs.ReadString();
actualFromBytes = Encoding.UTF8.GetString(ubs.ReadBytes(expected.Length));
}
finally
{
if (ubs != null)
{
ubs.Clear();
ubs.Dispose();
}
ubs = null;
}
Assert.AreEqual(expected32, actual32);
Assert.AreEqual(expected64, actual64);
Assert.AreEqual(expected, actual);
Assert.AreEqual(expected, actualFromBytes);
}
I've tried calling MoveMemory, which is just an unsafe extern to the kernel32 RtlMoveMemory, but when I run the test I get the following results:
actual32 is 32, expected 32
actual64 is 0, expected 64
actual is "", expected "expected"
actualFromBytes is some gibberish, expected "expected"
When I use the reader/writer to directly read from the old UMS to the new UMS, I get the following results:
actual32 is 32, expected 32
actual64 is 64, expected 64
actual is "", expected "expected"
actualFromBytes is "\b\0expect", expected "expected"
If I allocate enough space right from the start, then I have no issues with reading the values and I get the correct expected results.
What's the right way to copy the data?
Update:
Per Alexi's comment, here is the Free method which disposes of the reader/writer and the UnmanagedMemoryStream:
private void Free(bool disposeManagedResources)
{
// Dispose unmanaged resources
Marshal.FreeHGlobal(_memoryPointer);
// Dispose managed resources. Should not be called from destructor.
if (disposeManagedResources)
{
_reader.Close();
_writer.Close();
_reader = null;
_writer = null;
_ums.Dispose();
_ums = null;
}
}
You forgot this assignment:
_memoryPointer = tempMemoryPointer;
That can go unnoticed for a while: _memoryPointer keeps pointing to a released memory block that still contains the old bytes, until the Windows heap manager reuses the block or your code overwrites memory owned by another allocation. Exactly when that happens is unpredictable. You can take "unsafe" quite literally here.
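A minimal sketch of the copy step with the pointer handed back to the caller, so the stale-pointer mistake described above cannot happen silently. The helper name Grow and the class UnmanagedResizeSketch are hypothetical and not part of the asker's class; the copy goes through a managed staging array via Marshal.Copy, so no unsafe code or P/Invoke is assumed.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not the asker's class: grows an AllocHGlobal block.
public static class UnmanagedResizeSketch
{
    // Allocates a larger block, copies the old bytes across, frees the old
    // block, and returns the new pointer so the caller can store it.
    public static IntPtr Grow(IntPtr oldBlock, int oldLength, int newLength)
    {
        if (newLength < oldLength)
            throw new ArgumentOutOfRangeException("newLength");

        IntPtr newBlock = Marshal.AllocHGlobal(newLength);

        // Copy through a managed array; no unsafe code required.
        byte[] staging = new byte[oldLength];
        Marshal.Copy(oldBlock, staging, 0, oldLength);
        Marshal.Copy(staging, 0, newBlock, oldLength);

        // Zero the tail so the newly added region has defined contents.
        for (int i = oldLength; i < newLength; i++)
            Marshal.WriteByte(newBlock, i, 0);

        Marshal.FreeHGlobal(oldBlock);
        return newBlock;
    }
}
```

In the resize function this would be used as _memoryPointer = UnmanagedResizeSketch.Grow(_memoryPointer, _length, length); before the new UnmanagedMemoryStream is created over the returned pointer. Marshal.ReAllocHGlobal is another option worth checking if copying by hand is not required.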
First guess: you are not flushing or disposing the BinaryWriter, so data may not be committed to the underlying stream.
You also may be missing code that updates position in your UnmanagedRewindBuffer...
Second guess: reader created on wrong stream.
Note: Consider using Stream.CopyTo (.NET 4 - http://msdn.microsoft.com/en-us/library/system.io.stream.copyto.aspx) to copy the stream. For 3.5, see "How do I copy the contents of one stream to another?".
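For frameworks without Stream.CopyTo, the copy can be written as a plain read/write loop. This is a sketch only; the class and method names (StreamCopySketch, CopyStream) and the buffer size are assumptions chosen for illustration.

```csharp
using System.IO;

public static class StreamCopySketch
{
    // Manual equivalent of Stream.CopyTo for frameworks before .NET 4.
    // Copies from the source's current position to its end.
    public static void CopyStream(Stream source, Stream destination)
    {
        byte[] buffer = new byte[81920];
        int read;
        // Read returns 0 only at end of stream; it may return fewer bytes
        // than requested, so write exactly what was read.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
        }
    }
}
```

Remember to set the source position to 0 first if the whole stream should be copied, and to restore the destination position afterwards if later reads and writes depend on it.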
I am trying to write some functions which allow encrypting/decrypting a chunk of data. For compatibility reasons, it needs to be symmetrical encryption which can be saved on one machine and read back on another, so that rules out the use of ProtectedData since that doesn't scope any wider than the local machine. (Too bad, just adding that option would have saved a lot of hassle.)
I've done a lot of web searching and so far I haven't come across any good examples that really attempt to make a clean transition from encrypted value on disk to SecureString in memory with no traces in between. I'd really like to see how (if?) it can be done.
I have an Encrypt function which seems to work reasonably well. You give it a SecureString, and it returns back a regular string containing an encrypted copy of the contents of the SecureString. If I save a memory dump after calling the function, I am unable to find any copies of my unencrypted data anywhere in the dump. THIS IS GOOD.
However I am not having as much luck with the Decrypt counterpart. While I can successfully decrypt my data and get it back in to a SecureString, I end up with multiple copies of the decrypted value in the memory dump by the time it is done. Of course having copies scattered about in clear text nullifies the whole point of using in memory cryptography. THIS IS BAD.
Can anyone make any suggestions on how to make this code "clean" so that no copies of the unencrypted value are left in memory after it completes?
Here is my code:
public SecureString Decrypt(string base64EncryptedText)
{
using (SymmetricAlgorithm sa = GetAlgorithm())
{
ICryptoTransform transform = sa.CreateDecryptor();
var result = new SecureString();
using (var memstream = new MemoryStream())
{
using (var cs = new CryptoStream(memstream, transform, CryptoStreamMode.Write))
{
byte[] base64EncryptedTextByteArray = Convert.FromBase64String(base64EncryptedText);
cs.Write(base64EncryptedTextByteArray, 0, base64EncryptedTextByteArray.Length);
cs.FlushFinalBlock(); // If you don't do this, results are inconsistent from the ToArray method.
byte[] decodedBytes = memstream.ToArray();
// Clear the contents of the memory stream back to nulls
memstream.Seek(0, 0);
for (int i = 0; i < memstream.Length; i++)
{ memstream.WriteByte(0); }
char[] decodedChars = Encoding.UTF8.GetChars(decodedBytes);
// Null out the bytes we copied from the memory stream
for (int i = 0; i < decodedBytes.Length; i++)
{ decodedBytes[i] = 0; }
// Put the characters back in to the SecureString for safe keeping and null out the array as we go...
for (int i = 0; i < decodedChars.Length; i++)
{
result.AppendChar(decodedChars[i]);
decodedChars[i] = '\0';
}
}
}
return result;
}
}
private SymmetricAlgorithm GetAlgorithm()
{
string password = "DummyPassword";
string salt = "salty";
DeriveBytes rgb = new Rfc2898DeriveBytes(password, Encoding.Unicode.GetBytes(salt));
SymmetricAlgorithm sa = new RijndaelManaged();
sa.Key = rgb.GetBytes(sa.KeySize >> 3);
sa.IV = rgb.GetBytes(sa.BlockSize >> 3);
return sa;
}
If I trace through the code and make a memory dump (Debug | Save Dump As...), then search through the resulting file for my unencrypted value, I am clean all the way up to the cs.FlushFinalBlock() statement. As soon as that executes, I have 3 copies of my unencrypted value in the dump file.
I am guessing one of them is the memory stream buffer itself. I can eliminate that one with the Seek and WriteByte loop. A couple more copies pop up later with the decodedBytes and decodedChars arrays, but those also go away with the loops that null them out. That still leaves behind two of the copies created by the flush though.
I'm not aiming for a guarantee that no copies ever exist in memory at any time (even though that would be ideal); I figure that is impossible without writing a WHOLE lot of code. My goal is simply to have no traces remain in memory after the function exits, to keep the attack surface to a minimum.
For the sake of completeness, might as well include the Encrypt function too. Like I said, this one seems to work ok as it is. I don't find any remnants when it's done.
unsafe public string Encrypt(SecureString textToEncrypt)
{
using (SymmetricAlgorithm sa = GetAlgorithm())
{
ICryptoTransform ict = sa.CreateEncryptor();
using (MemoryStream memStream = new MemoryStream())
{
using (CryptoStream cs = new CryptoStream(memStream, ict, CryptoStreamMode.Write))
{
IntPtr unmanagedBytes = Marshal.SecureStringToGlobalAllocAnsi(textToEncrypt);
try
{
byte* bytePointer = (byte*)unmanagedBytes.ToPointer();
byte[] singleByte = new byte[1];
while (*bytePointer != 0) // This is a null terminated value, so copy until we get a null
{
singleByte[0] = *bytePointer;
cs.Write(singleByte, 0, 1);
singleByte[0] = 0;
bytePointer++;
}
}
finally
{
Marshal.ZeroFreeGlobalAllocAnsi(unmanagedBytes);
}
}
string base64EncryptedText = Convert.ToBase64String(memStream.ToArray());
return base64EncryptedText;
}
}
}
The other thing I haven't explored that much yet is what happens if the garbage collector runs in the middle of all this... All bets are off if that happens, so would it be a good idea to run GC at the start of these functions to reduce the likelihood of it happening in the middle?
Thanks.
Edit: Solution is at bottom of post
I am trying my luck with reading binary files. Since I don't want to rely on byte[] AllBytes = File.ReadAllBytes(myPath), because the binary file might be rather big, I want to read small portions of the same size (which fits nicely with the file format to read) in a loop, using what I would call a "buffer".
public void ReadStream(MemoryStream ContentStream)
{
byte[] buffer = new byte[sizePerHour];
for (int hours = 0; hours < NumberHours; hours++)
{
int t = ContentStream.Read(buffer, 0, sizePerHour);
SecondsToAdd = BitConverter.ToUInt32(buffer, 0);
// further processing of my byte[] buffer
}
}
My stream contains all the bytes I want, which is a good thing. When I enter the loop several things cease to work.
My int t is 0, although I would presume that ContentStream.Read() would copy data from the stream into my byte array; that isn't the case.
I tried buffer = ContentStream.GetBuffer(), but that results in my buffer containing all of my stream, a behaviour I wanted to avoid by reading into a buffer.
Resetting the stream to position 0 before reading did not help either, nor did specifying an offset for my Stream.Read(), which means I am lost.
Can anyone point me to reading small portions of a stream to a byte[]? Maybe with some code?
Thanks in advance
Edit:
The answer that pointed me in the right direction was that .Read() returns 0 if the end of the stream is reached. I modified my code to the following:
public void ReadStream(MemoryStream ContentStream)
{
byte[] buffer = new byte[sizePerHour];
ContentStream.Seek(0, SeekOrigin.Begin); //Added this line
for (int hours = 0; hours < NumberHours; hours++)
{
int t = ContentStream.Read(buffer, 0, sizePerHour);
SecondsToAdd = BitConverter.ToUInt32(buffer, 0);
// further processing of my byte[] buffer
}
}
And everything works like a charm. I initially reset the stream to its origin on every iteration over the hours and gave an offset. Moving the "set to beginning" part outside my loop and leaving the offset at 0 did the trick.
Read returns zero if the end of the stream is reached. Are you sure that your memory stream has the content you expect? I've tried the following and it works as expected:
// Create the source of the memory stream.
UInt32[] source = {42, 4711};
List<byte> sourceBuffer = new List<byte>();
Array.ForEach(source, v => sourceBuffer.AddRange(BitConverter.GetBytes(v)));
// Read the stream.
using (MemoryStream contentStream = new MemoryStream(sourceBuffer.ToArray()))
{
byte[] buffer = new byte[sizeof (UInt32)];
int t;
do
{
t = contentStream.Read(buffer, 0, buffer.Length);
if (t > 0)
{
UInt32 value = BitConverter.ToUInt32(buffer, 0);
}
} while (t > 0);
}
As part of my thesis, I need to load, modify and save .dds texture files. Therefore I'm using the DevIL.NET-Wrapper library (but the problem isn't specific to this library I guess, it's more of a general problem).
I managed (by using the visual studio memory analysis tools) to figure out the memory leaking function inside the DevIL.NET-Wrapper:
public static byte[] ReadStreamFully(Stream stream, int initialLength) {
if(initialLength < 1) {
initialLength = 32768; //Init to 32K if not a valid initial length
}
byte[] buffer = new byte[initialLength];
int position = 0;
int chunk;
while((chunk = stream.Read(buffer, position, buffer.Length - position)) > 0) {
position += chunk;
//If we reached the end of the buffer check to see if there's more info
if(position == buffer.Length) {
int nextByte = stream.ReadByte();
//If -1 we reached the end of the stream
if(nextByte == -1) {
return buffer;
}
//Not at the end, need to resize the buffer
byte[] newBuffer = new byte[buffer.Length * 2];
Array.Copy(buffer, newBuffer, buffer.Length);
newBuffer[position] = (byte) nextByte;
buffer = newBuffer;
position++;
}
}
//Trim the buffer before returning
byte[] toReturn = new byte[position];
Array.Copy(buffer, toReturn, position);
return toReturn;
}
I did a test program to figure out where the memory leak actually comes from:
private static void testMemoryOverflow(string[] args)
{
DevIL.ImageImporter im;
DevIL.ImageExporter ie;
...
foreach (String file in ddsPaths)
{
using (FileStream fs = File.Open(file, FileMode.Open))
{
/* v memory leak v */
DevIL.Image img = im.LoadImageFromStream(fs);
/* ^ memory leak ^ */
ie.SaveImage(img, fileSavePath);
img = null;
}
}
}
The LoadImageFromStream() function is also part of the DevIL.NET-Wrapper, and it in fact calls the function shown above. This is where the leak occurs.
What I already tried:
Using GC.Collect()
Disposing the FileStream object manually instead of using the using{} directive
Disposing the stream inside the DevIL.NET ReadStreamFully() function from above
Does anyone have a solution for this?
I'm new to C#, so maybe it's kind of a basic mistake.
Your issue is the buffer size.
byte[] newBuffer = new byte[buffer.Length * 2];
After two iterations you're already very close to the 85,000-byte threshold at which objects are allocated on the Large Object Heap (LOH). At three iterations you've hit the threshold. Once there, they won't be collected until a full garbage collection occurs across all generations. Even then, the LOH isn't compacted, so you'll still see high memory usage.
I'm not sure why the library you're using does this. I'm not sure why you're using it either, given that you can use:
Image img = Image.FromStream(fs); // built into .NET.
The way that library is written looks like it was from an earlier version of .NET. It doesn't appear to have memory usage as any sort of concern.
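The Large Object Heap point can be checked with a small sketch: the runtime reports large-object-heap allocations as belonging to the oldest generation. The class and method names (LohSketch, IsOnLargeObjectHeap) are hypothetical, and the check is a heuristic for freshly allocated arrays only, since any long-lived object eventually reaches the oldest generation too.

```csharp
using System;

public static class LohSketch
{
    // Heuristic: a freshly allocated array that is already reported in the
    // oldest generation was placed on the Large Object Heap.
    public static bool IsOnLargeObjectHeap(byte[] freshlyAllocated)
    {
        return GC.GetGeneration(freshlyAllocated) == GC.MaxGeneration;
    }

    // The buffer sizes ReadStreamFully goes through when it keeps doubling
    // from its 32K default: 32768, 65536, 131072, ...
    public static int BufferSizeAfterDoublings(int doublings)
    {
        return 32768 << doublings;
    }
}
```

With the default initial length, the second doubling already produces a 131072-byte buffer, which is above the threshold.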
I am trying to let users upload large files. Before I upload a file, I want to chunk it up. Each chunk needs to be a C# object, for logging purposes. It's a long story, but I need to create actual C# objects that represent each file chunk. Regardless, I'm trying the following approach:
public static List<FileChunk> GetAllForFile(byte[] fileBytes)
{
List<FileChunk> chunks = new List<FileChunk>();
if (fileBytes.Length > 0)
{
FileChunk chunk = new FileChunk();
for (int i = 0; i < (fileBytes.Length / 512); i++)
{
chunk.Number = (i + 1);
chunk.Offset = (i * 512);
chunk.Bytes = fileBytes.Skip(chunk.Offset).Take(512).ToArray();
chunks.Add(chunk);
chunk = new FileChunk();
}
}
return chunks;
}
Unfortunately, this approach seems to be incredibly slow. Does anyone know how I can improve the performance while still creating objects for each chunk?
thank you
I suspect this is going to hurt a little:
chunk.Bytes = fileBytes.Skip(chunk.Offset).Take(512).ToArray();
Try this instead:
byte[] buffer = new byte[512];
Buffer.BlockCopy(fileBytes, chunk.Offset, buffer, 0, 512);
chunk.Bytes = buffer;
(Code not tested)
And the reason why this code would likely be slow is because Skip doesn't do anything special for arrays (though it could). This means that every pass through your loop is iterating the first 512*n items in the array, which results in O(n^2) performance, where you should just be seeing O(n).
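A sketch of the O(n) version, assuming a FileChunk type with just the three members used in the question (Number, Offset, Bytes). The chunkSize parameter and the class name ChunkingSketch are additions for illustration. Unlike the original loop, which runs fileBytes.Length / 512 times and so drops a trailing partial chunk, this version keeps it.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the asker's type; only these members appear in the question.
public class FileChunk
{
    public int Number { get; set; }
    public int Offset { get; set; }
    public byte[] Bytes { get; set; }
}

public static class ChunkingSketch
{
    public static List<FileChunk> GetAllForFile(byte[] fileBytes, int chunkSize)
    {
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException("chunkSize");

        var chunks = new List<FileChunk>();
        int number = 1;
        for (int offset = 0; offset < fileBytes.Length; offset += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            int count = Math.Min(chunkSize, fileBytes.Length - offset);
            byte[] bytes = new byte[count];

            // Copies directly from the source offset; each byte is touched once.
            Buffer.BlockCopy(fileBytes, offset, bytes, 0, count);

            chunks.Add(new FileChunk { Number = number, Offset = offset, Bytes = bytes });
            number++;
        }
        return chunks;
    }
}
```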
Try something like this (untested code):
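public static List<FileChunk> GetAllForFile(string fileName)
{
var chunks = new List<FileChunk>();
using (FileStream stream = new FileStream(fileName, FileMode.Open, FileAccess.Read))
{
int i = 0;
// Stop once the position reaches the end of the file
while (stream.Position < stream.Length)
{
var chunk = new FileChunk();
chunk.Number = i;
chunk.Offset = i * 512;
// Read into a fresh array; trim it if fewer than 512 bytes were read
byte[] bytes = new byte[512];
int read = stream.Read(bytes, 0, bytes.Length);
if (read < bytes.Length) { Array.Resize(ref bytes, read); }
chunk.Bytes = bytes;
chunks.Add(chunk);
i++;
}
}
return chunks;
}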
The above code skips several steps in your process, preferring to read the bytes from the file directly.
Note that, if the file is not an even multiple of 512, the last chunk will contain less than 512 bytes.
Same as Robert Harvey's answer, but using a BinaryReader, that way I don't need to specify an offset. If you use a BinaryWriter on the other end to reassemble the file, you won't need the Offset member of FileChunk.
public static List<FileChunk> GetAllForFile(string fileName) {
var chunks = new List<FileChunk>();
using (FileStream stream = new FileStream(fileName, FileMode.Open, FileAccess.Read)) {
BinaryReader reader = new BinaryReader(stream);
int i = 0;
bool eof = false;
while (!eof) {
var chunk = new FileChunk();
chunk.Number = i;
chunk.Offset = (i * 512);
chunk.Bytes = reader.ReadBytes(512);
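// Skip the empty chunk produced when the file length is an exact multiple of 512
if (chunk.Bytes.Length > 0) { chunks.Add(chunk); }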
i++;
if (chunk.Bytes.Length < 512) { eof = true; }
}
}
return chunks;
}
Have you thought about what you're going to do to compensate for packet loss and data corruption?
Since you mentioned that the load is taking a long time then I would use asynchronous file reading in order to speed up the loading process. The hard disk is the slowest component of a computer. Google does asynchronous reads and writes on Google Chrome to improve their load times. I had to do something like this in C# in a previous job.
The idea would be to spawn several asynchronous requests over different parts of the file. Then when a request comes in, take the byte array and create your FileChunk objects taking 512 bytes at a time. There are several benefits to this:
If you have this run in a separate thread, then you won't have the whole program waiting to load the large file you have.
You can process a byte array, creating FileChunk objects, while the hard disk is still trying to fulfill read requests on other parts of the file.
You will save on RAM space if you limit the amount of pending read requests you can have. This allows less page faulting to the hard disk and use the RAM and CPU cache more efficiently, which speeds up processing further.
You would want to use the following methods in the FileStream class.
[HostProtectionAttribute(SecurityAction.LinkDemand, ExternalThreading = true)]
public virtual IAsyncResult BeginRead(
byte[] buffer,
int offset,
int count,
AsyncCallback callback,
Object state
)
public virtual int EndRead(
IAsyncResult asyncResult
)
Also this is what you will get in the asyncResult:
// Extract the FileStream (state) out of the IAsyncResult object
FileStream fs = (FileStream) ar.AsyncState;
// Get the result
Int32 bytesRead = fs.EndRead(ar);
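Put together, a single BeginRead/EndRead round trip might look like the sketch below. The class and method names (AsyncReadSketch, ReadFirstBlock) are hypothetical, and the method blocks on an event only to keep the example short; a real caller would carry on working and issue further reads from the callback.

```csharp
using System;
using System.IO;
using System.Threading;

public static class AsyncReadSketch
{
    // Reads up to buffer.Length bytes from the start of the file using
    // BeginRead/EndRead and returns the number of bytes actually read.
    public static int ReadFirstBlock(string path, byte[] buffer)
    {
        using (var done = new ManualResetEvent(false))
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, 4096, true)) // true = useAsync
        {
            int bytesRead = 0;
            fs.BeginRead(buffer, 0, buffer.Length, ar =>
            {
                try
                {
                    // The state object passed to BeginRead comes back as AsyncState.
                    FileStream stream = (FileStream)ar.AsyncState;
                    bytesRead = stream.EndRead(ar);
                }
                finally
                {
                    // Always signal, so the waiting thread cannot hang.
                    done.Set();
                }
            }, fs);

            // Blocking here is for illustration only.
            done.WaitOne();
            return bytesRead;
        }
    }
}
```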
Here is some reference material for you to read.
This is a code sample of working with Asynchronous File I/O Models.
This is a MS documentation reference for Asynchronous File I/O.
I am new to the C# world. I am using it for fast deployment of a solution to capture a live feed which comes in this form (curly brackets for clarity): {abcde}{CompressedMessage}, where {abcde} is 5 characters indicating the length of the compressed message. The CompressedMessage is compressed using XCeedZip.dll and needs to be uncompressed using the DLL's uncompress method. The uncompress method returns an integer value indicating success or failure (of various sorts, e.g. no-license failure, uncompression failure, etc.). I am receiving failure 1003 (see http://doc.xceedsoft.com/products/XceedZip/ for the return values of the uncompress method).
while(true){
byte[] receiveByte = new byte[1000];
sock.Receive(receiveByte);
string strData =System.Text.Encoding.ASCII.GetString(receiveByte,0,receiveByte.Length);
string cMesLen = strData.Substring(0,5); // length of compressed message;
string compressedMessageStr = strData.Substring(5,strData.Length-5);
byte[] compressedBytes = System.Text.Encoding.ASCII.GetBytes(compressedMessageStr);
//instantiating xceedcompression object
XceedZipLib.XceedCompression obXCC = new XceedZipLib.XceedCompression();
obXCC.License("blah");
// uncompress method reference http://doc.xceedsoft.com/products/XceedZip/
// visual studio displays Uncompress method signature as Uncompress(ref object vaSource, out object vaUncompressed, bool bEndOfData)
object oDest;
object oSource = (object)compressedBytes;
int status = (int) obXCC.Uncompress(ref oSource, out oDest, true);
Console.WriteLine(status); /// prints 1003 http://doc.xceedsoft.com/products/XceedZip/
}
So basically my question boils down to how to invoke the uncompress method and the correct way of passing the parameters. I am in unfamiliar territory in the .NET world, so I won't be surprised if the question is really simplistic.
Thanks for the replies.
Update:
I am now doing the following:
int iter = 1;
int bufSize = 1024;
byte[] receiveByte = new byte[bufSize];
while (true){
sock.Receive(receiveByte);
//fetch compressed message length;
int cMesLen = Convert.ToInt32(System.Text.Encoding.ASCII.GetString(receiveByte,0,5));
byte[] cMessageByte = new byte[cMesLen];
if (iter==1){
if (cMesLen < bufSize){
for (int i = 5; i < 5+cMesLen; ++i){
cMessageByte[i-5] = receiveByte[i];
}
}
}
XceedZipLib.XceedCompression obXCC = new XceedZipLib.XceedCompression();
obXCC.License("blah");
object oDest;
object oSource = (object) cMessageByte;
int status = (int) obXCC.Uncompress(ref oSource, out oDest, true);
if (iter==1){
byte[] testByte = objectToByteArray(oDest);
Console.WriteLine(System.Text.Encoding.ASCII.GetString(testByte,0,testByte.Length));
}
}
private byte[] objectToByteArray(Object obj){
if (obj==null){
return null;
}
BinaryFormatter bf = new BinaryFormatter();
MemoryStream ms = new MemoryStream();
bf.Serialize(ms,obj);
return ms.ToArray();
}
The problem is that the testByte WriteLine call prints gibberish. Any suggestions on how to move forward on this? The status returned by Uncompress is now 0, which indicates success.
The first mistake, always, is not looking at the return value of Receive; you have no idea how much data you just read, nor whether it constitutes an entire message.
It seems likely to me that you have corrupted the message payload by treating the entire data as ASCII. Rather than calling GetString on the entire buffer, you should call GetString on the first 5 bytes only.
Correct process:
keep calling Receive (buffering the data, or increasing the offset and decreasing the count) until you have at least 5 bytes
process these 5 bytes to get the payload length
keep calling Receive (buffering the data, or increasing the offset and decreasing the count) until you have at least the payload length
process the payload without ever converting to/from ASCII
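The steps above can be sketched over a Stream, for example a NetworkStream wrapped around the socket; Socket.Receive with an offset and count can be looped in the same way. The names FramingSketch, ReadExactly and ReadFrame are hypothetical, and the 5-character ASCII decimal length prefix is taken from the question.

```csharp
using System;
using System.IO;
using System.Text;

public static class FramingSketch
{
    // Loops until exactly count bytes have been read; a single Read call
    // may return fewer bytes than requested.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended in the middle of a message.");
            offset += read;
        }
        return buffer;
    }

    // Reads one {5-character length}{payload} frame and returns the raw
    // payload bytes. Only the 5-byte header is ever decoded as ASCII.
    public static byte[] ReadFrame(Stream stream)
    {
        byte[] header = ReadExactly(stream, 5);
        int length = int.Parse(Encoding.ASCII.GetString(header, 0, 5));
        return ReadExactly(stream, length);
    }
}
```

The returned byte array can then be passed to Uncompress as the source object without any string conversion in between.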