Hashing a SecureString in .NET - c#

In .NET, we have the SecureString class, which is all very well until you come to try and use it, as to (for example) hash the string, you need the plaintext. I've had a go here at writing a function that will hash a SecureString, given a hash function that takes a byte array and outputs a byte array.
private static byte[] HashSecureString(SecureString ss, Func<byte[], byte[]> hash)
{
// Convert the SecureString to a BSTR
IntPtr bstr = Marshal.SecureStringToBSTR(ss);
// BSTR contains the length of the string in bytes in an
// Int32 stored in the 4 bytes prior to the BSTR pointer
int length = Marshal.ReadInt32(bstr, -4);
// Allocate a byte array to copy the string into
byte[] bytes = new byte[length];
// Copy the BSTR to the byte array
Marshal.Copy(bstr, bytes, 0, length);
// Immediately destroy the BSTR as we don't need it any more
Marshal.ZeroFreeBSTR(bstr);
// Hash the byte array
byte[] hashed = hash(bytes);
// Destroy the plaintext copy in the byte array
for (int i = 0; i < length; i++) { bytes[i] = 0; }
// Return the hash
return hashed;
}
I believe this will correctly hash the string, and will correctly scrub any copies of the plaintext from memory by the time the function returns, assuming the provided hash function is well behaved and doesn't make any copies of the input that it doesn't scrub itself. Have I missed anything here?

Have I missed anything here?
Yes, you have, a rather fundamental one at that. You cannot scrub the copy of the array left behind when the garbage collector compacts the heap. Marshal.SecureStringToBSTR(ss) is okay because a BSTR is allocated in unmanaged memory so will have a reliable pointer that won't change. In other words, no problem scrubbing that one.
Your byte[] bytes array however contains the copy of the string and is allocated on the GC heap. You make it likely to induce a garbage collection with the hashed[] array. Easily avoided but of course you have little control over other threads in your process allocating memory and inducing a collection. Or for that matter a background GC that was already in progress when your code started running.
The point of SecureString is to never have a cleartext copy of the string in garbage collected memory. Copying it into a managed array violated that guarantee. If you want to make this code secure then you are going to have to write a hash() method that takes the IntPtr and only reads through that pointer.
Beware that if your hash needs to match a hash computed on another machine then you cannot ignore the Encoding that machine would use to turn the string into bytes.

There's always the possibility of using the unmanaged CryptoApi or CNG functions.
Bear in mind that SecureString was designed with an unmanaged consumer which has full control over memory management in mind.
If you want to stick to C#, you should pin the temporary array to prevent the GC from moving it around before you get a chance to scrub it:
private static byte[] HashSecureString(SecureString input, Func<byte[], byte[]> hash)
{
var bstr = Marshal.SecureStringToBSTR(input);
var length = Marshal.ReadInt32(bstr, -4);
var bytes = new byte[length];
var bytesPin = GCHandle.Alloc(bytes, GCHandleType.Pinned);
try {
Marshal.Copy(bstr, bytes, 0, length);
Marshal.ZeroFreeBSTR(bstr);
return hash(bytes);
} finally {
for (var i = 0; i < bytes.Length; i++) {
bytes[i] = 0;
}
bytesPin.Free();
}
}

As a complement to Hans’ answer here’s a suggestion how to implement the hasher. Hans suggests passing the pointer to the unmanaged string to the hash function but that means that client code (= the hash function) needs to deal with unmanaged memory. That’s not ideal.
On the other hand, you can replace the callback by an instance of the following interface:
interface Hasher {
void Reinitialize();
void AddByte(byte b);
byte[] Result { get; }
}
That way the hasher (although it becomes slightly more complex) can be implemented wholly in managed land without leaking secure information. Your HashSecureString would then look as follows:
private static byte[] HashSecureString(SecureString ss, Hasher hasher) {
IntPtr bstr = Marshal.SecureStringToBSTR(ss);
try {
int length = Marshal.ReadInt32(bstr, -4);
hasher.Reinitialize();
for (int i = 0; i < length; i++)
hasher.AddByte(Marshal.ReadByte(bstr, i));
return hasher.Result;
}
finally {
Marshal.ZeroFreeBSTR(bstr);
}
}
Note the finally block to make sure that the unmanaged memory is zeroed, no matter what shenanigans the hasher instance does.
Here’s a simple (and not very useful) Hasher implementation to illustrate the interface:
sealed class SingleByteXor : Hasher {
private readonly byte[] data = new byte[1];
public void Reinitialize() {
data[0] = 0;
}
public void AddByte(byte b) {
data[0] ^= b;
}
public byte[] Result {
get { return data; }
}
}

As a further complement, could you not wrap the logic #KonradRudolph and #HansPassant supplied into a custom Stream implementation?
This would allow you to use the HashAlgorithm.ComputeHash(Stream) method, which would keep the interface managed (although it would be down to you to dispose the stream in good time).
Of course, you are at the mercy of the HashAlgorithm implementation as to how much data ends up in memory at a time (but, of course, that's what the reference source is for!)
Just an idea...
public class SecureStringStream : Stream
{
public override bool CanRead { get { return true; } }
public override bool CanWrite { get { return false; } }
public override bool CanSeek { get { return false; } }
public override long Position
{
get { return _pos; }
set { throw new NotSupportedException(); }
}
public override void Flush() { throw new NotSupportedException(); }
public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
public override void SetLength(long value) { throw new NotSupportedException(); }
public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
private readonly IntPtr _bstr = IntPtr.Zero;
private readonly int _length;
private int _pos;
public SecureStringStream(SecureString str)
{
if (str == null) throw new ArgumentNullException("str");
_bstr = Marshal.SecureStringToBSTR(str);
try
{
_length = Marshal.ReadInt32(_bstr, -4);
_pos = 0;
}
catch
{
if (_bstr != IntPtr.Zero) Marshal.ZeroFreeBSTR(_bstr);
throw;
}
}
public override long Length { get { return _length; } }
public override int Read(byte[] buffer, int offset, int count)
{
if (buffer == null) throw new ArgumentNullException("buffer");
if (offset < 0) throw new ArgumentOutOfRangeException("offset");
if (count < 0) throw new ArgumentOutOfRangeException("count");
if (offset + count > buffer.Length) throw new ArgumentException("offset + count > buffer");
if (count > 0 && _pos++ < _length)
{
buffer[offset] = Marshal.ReadByte(_bstr, _pos++);
return 1;
}
else return 0;
}
protected override void Dispose(bool disposing)
{
try { if (_bstr != IntPtr.Zero) Marshal.ZeroFreeBSTR(_bstr); }
finally { base.Dispose(disposing); }
}
}
void RunMe()
{
using (SecureString s = new SecureString())
{
foreach (char c in "jimbobmcgee") s.AppendChar(c);
s.MakeReadOnly();
using (SecureStringStream ss = new SecureStringStream(s))
using (HashAlgorithm h = MD5.Create())
{
Console.WriteLine(Convert.ToBase64String(h.ComputeHash(ss)));
}
}
}

Related

ByteStream source in SharpDX.MediaFoundation hanging

I have been trying to get a custom audio stream to work with SharpDX.MediaFoundation.
To this end I have wrapped my audio object in a class that implements System.IO.Stream as follows:
public class AudioReaderWaveStream : System.IO.Stream
{
byte[] waveHeader = new byte[44];
AudioCore.IAudioReader reader = null;
ulong readHandle = 0xffffffff;
long readPosition = 0;
public AudioReaderWaveStream(AudioCore.CEditedAudio content)
{
reader = content as AudioCore.IAudioReader;
readHandle = reader.OpenDevice();
int sampleRate = 0;
short channels = 0;
content.GetFormat(out sampleRate, out channels);
System.IO.MemoryStream memStream = new System.IO.MemoryStream(waveHeader);
using (System.IO.BinaryWriter bw = new System.IO.BinaryWriter(memStream))
{
bw.Write("RIFF".ToCharArray());
bw.Write((Int32)Length - 8);
bw.Write("WAVE".ToCharArray());
bw.Write("fmt ".ToCharArray());
bw.Write((Int32)16);
bw.Write((Int16)3);
bw.Write((Int16)1);
bw.Write((Int32)sampleRate);
bw.Write((Int32)sampleRate * 4);
bw.Write((Int16)4);
bw.Write((Int16)32);
bw.Write("data".ToCharArray());
bw.Write((Int32)reader.GetSampleCount() * 4);
}
}
protected override void Dispose(bool disposing)
{
if (readHandle != 0xffffffff)
{
reader.CloseDevice(readHandle);
readHandle = 0xfffffffff;
}
base.Dispose(disposing);
}
~AudioReaderWaveStream()
{
Dispose();
}
public override bool CanRead
{
get
{
return true;
}
}
public override bool CanSeek
{
get
{
return true;
}
}
public override bool CanWrite
{
get
{
return false;
}
}
public override long Length
{
get
{
// Number of float samples + header of 44 bytes.
return (reader.GetSampleCount() * 4) + 44;
}
}
public override long Position
{
get
{
return readPosition;
}
set
{
readPosition = value;
}
}
public override void Flush()
{
//throw new NotImplementedException();
}
public override int Read(byte[] buffer, int offset, int count)
{
if (count <= 0)
return 0;
int retCount = count;
if (Position < 44)
{
int headerCount = count;
if ( Position + count >= 44 )
{
headerCount = 44 - (int)Position;
}
Array.Copy(waveHeader, Position, buffer, offset, headerCount);
offset += headerCount;
Position += headerCount;
count -= headerCount;
}
if (count > 0)
{
float[] readBuffer = new float[count/4];
reader.Seek(readHandle, Position - 44);
reader.ReadAudio(readHandle, readBuffer);
Array.Copy(readBuffer, 0, buffer, offset, count);
}
return retCount;
}
public override long Seek(long offset, System.IO.SeekOrigin origin)
{
if (origin == System.IO.SeekOrigin.Begin)
{
readPosition = offset;
}
else if (origin == System.IO.SeekOrigin.Current)
{
readPosition += offset;
}
else
{
readPosition = Length - offset;
}
return readPosition;
}
public override void SetLength(long value)
{
throw new NotImplementedException();
}
public override void Write(byte[] buffer, int offset, int count)
{
throw new NotImplementedException();
}
}
I then take this object and create a source resolver using it as follows:
// Create a source resolver.
SharpDX.MediaFoundation.ByteStream sdxByteStream = new ByteStream( ARWS );
SharpDX.MediaFoundation.SourceResolver resolver = new SharpDX.MediaFoundation.SourceResolver();
ComObject source = (ComObject)resolver.CreateObjectFromStream( sdxByteStream, "File.wav", SourceResolverFlags.MediaSource );
However every time I'm doing this it hangs on the CreateObjectFromStream call. I've had a look inside SharpDX to see whats going on and it seems the actual hang occurs when it makes the call to the underlying interface through CreateObjectFromByteStream. I've also looked to see what data is actually read from the byte stream. It reads the first 16 bytes which includes the 'RIFF', the RIFF size, the 'WAVE' and the 'fmt '. Then nothing else.
Has anyone got any ideas what I could be doing wrong. I've tried all sorts of combinations of the SourceResolverFlags but nothing seems to make any difference. It just hangs.
It does remind me somewhat of interthread marshalling but all the media foundation calls are made from the same thread so I don't think its that. I'm also fairly sure that MediaFoundation uses free threading so this shouldn't be a problem anyway.
Has anyone any idea what I could possibly be doing wrong?
Thanks!
Ok I have come up with a solution to this. It looks like I may be having a COM threading issue. The read happens in a thread and that thread was calling back to the main thread which the function was called from.
So I used the async version of the call and perform an Application.DoEvents() to hand across control where necessary.
Callback cb = new Callback( resolver );
IUnknown cancel = null;
resolver.BeginCreateObjectFromByteStream( sdxByteStream, "File.wav", (int)(SourceResolverFlags.MediaSource | SourceResolverFlags.ByteStream), null, out cancel, cb, null );
if ( cancel != null )
{
cancel.Dispose();
}
while( cb.MediaSource == null )
{
System.Windows.Forms.Application.DoEvents();
}
SharpDX.MediaFoundation.MediaSource mediaSource = cb.MediaSource;
I really hate COM's threading model ...

Read binary objects from a file in C# written out by a C++ program

I am trying to read objects from very large files containing padded structs that were written into it by a C++ process. I was using an example to memory map the large file and try to deserialize the data into an object but I now can see that it won't work this way.
How can I extract all the objects from the files to use in C#? I'm probably way off but I've provided the code. The objects have a 8 byte milliseconds member followed by 21 16bit integers, which needs 6bytes of padding to align to a 8byte boundary.
[Serializable]
unsafe public struct DataStruct
{
public UInt64 milliseconds;
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 21)]
public fixed Int16 data[21];
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
public fixed Int16 padding[3];
};
[Serializable]
public class DataArray
{
public DataStruct[] samples;
}
public static class Helper
{
public static Int16[] GetData(this DataStruct data)
{
unsafe
{
Int16[] output = new Int16[21];
for (int index = 0; index < 21; ++index)
output[index] = data.data[index];
return output;
}
}
}
class FileThreadSupport
{
struct DataFileInfo
{
public string path;
public UInt64 start;
public UInt64 stop;
public UInt64 elements;
};
// Create our epoch timestamp
private static readonly DateTime epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
// Output TCP client
private Support.AsyncTcpClient output;
// Directory which contains our data
private string replay_directory;
// Files to be read from
private DataFileInfo[] file_infos;
// Current timestamp of when the process was started
UInt64 process_start = 0;
// Object from current file
DataArray current_file_data;
// Offset into current files
UInt64 current_file_index = 0;
// Offset into current files
UInt64 current_file_offset = 0;
// Run flag
bool run = true;
public FileThreadSupport(ref Support.AsyncTcpClient output, ref Engine.A.Information info, ref Support.Configuration configuration)
{
// Set our output directory
replay_directory = configuration.getString("replay_directory");
if (replay_directory.Length == 0)
{
Console.WriteLine("Configuration does not provide a replay directory");
return;
}
// Check the directory for playable files
if(!loadDataDirectory(replay_directory))
{
Console.WriteLine("Replay directory {} did not have any valid files", replay_directory);
}
// Set the output TCP client
this.output = output;
}
private bool loadDataDirectory(string directory)
{
string[] files = Directory.GetFiles(directory, "*.*", SearchOption.TopDirectoryOnly);
file_infos = new DataFileInfo[files.Length];
int index = 0;
foreach (string file in files)
{
string[] parts = file.Split('\\');
string name = parts.Last();
parts = name.Split('.');
if (parts.Length != 2)
continue;
UInt64 start, stop = 0;
if (!UInt64.TryParse(parts[0], out start) || !UInt64.TryParse(parts[1], out stop))
continue;
long size = new System.IO.FileInfo(file).Length;
// Add to our file info array
file_infos[index] = new DataFileInfo
{
path = file,
start = start,
stop = stop,
elements = (ulong)(new System.IO.FileInfo(file).Length / 56
/*System.Runtime.InteropServices.Marshal.SizeOf(typeof(DataStruct))*/)
};
++index;
}
// Sort the array
Array.Sort(file_infos, delegate (DataFileInfo x, DataFileInfo y) { return x.start.CompareTo(y.start); });
// Return whether or not there were files found
return (files.Length > 0);
}
public void start()
{
process_start = (ulong)DateTime.Now.ToUniversalTime().Subtract(epoch).TotalMilliseconds;
UInt64 num_samples = 0;
while(run)
{
// Get our samples and add it to the sample
DataStruct[] result = getData(100);
Engine.A.A message = new Engine.A.A();
for (int i = 0; i < result.Length; ++i)
{
Engine.A.Data sample = new Engine.A.Data();
sample.Time = process_start + num_samples * 4;
Int16[] signal_data = Helper.GetData(result[i]);
for(int e = 0; e < signal_data.Length; ++e)
sample.Value[e] = signal_data[e];
message.Signal.Add(sample);
++num_samples;
}
// Send out the websocket
this.output.SendAsync(message.ToByteArray());
// Sleep 100 milliseconds
Thread.Sleep(100);
}
}
public void stop()
{
run = false;
}
private DataStruct[] getData(UInt64 milliseconds)
{
if (file_infos.Length == 0)
return new DataStruct[0];
if (current_file_data == null)
{
current_file_data = ReadObjectFromMMF(file_infos[current_file_index].path) as DataArray;
if(current_file_data.samples.Length == 0)
return new DataStruct[0];
}
UInt64 elements_to_read = (UInt64) milliseconds / 4;
DataStruct[] result = new DataStruct[elements_to_read];
Array.Copy(current_file_data.samples, (int)current_file_offset, result, 0, (int) Math.Min(elements_to_read, file_infos[current_file_index].elements - current_file_offset));
while((UInt64) result.Length != elements_to_read)
{
current_file_index = (current_file_index + 1) % (ulong) file_infos.Length;
current_file_data = ReadObjectFromMMF(file_infos[current_file_index].path) as DataArray;
if (current_file_data.samples.Length == 0)
return new DataStruct[0];
current_file_offset = 0;
Array.Copy(current_file_data.samples, (int)current_file_offset, result, result.Length, (int)Math.Min(elements_to_read, file_infos[current_file_index].elements - current_file_offset));
}
return result;
}
private object ByteArrayToObject(byte[] buffer)
{
BinaryFormatter binaryFormatter = new BinaryFormatter(); // Create new BinaryFormatter
MemoryStream memoryStream = new MemoryStream(buffer); // Convert buffer to memorystream
return binaryFormatter.Deserialize(memoryStream); // Deserialize stream to an object
}
private object ReadObjectFromMMF(string file)
{
// Get a handle to an existing memory mapped file
using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(file, FileMode.Open))
{
// Create a view accessor from which to read the data
using (MemoryMappedViewAccessor mmfReader = mmf.CreateViewAccessor())
{
// Create a data buffer and read entire MMF view into buffer
byte[] buffer = new byte[mmfReader.Capacity];
mmfReader.ReadArray<byte>(0, buffer, 0, buffer.Length);
// Convert the buffer to a .NET object
return ByteArrayToObject(buffer);
}
}
}
Well for one thing you're not using that memory mapped file well at all, you're just sequentially reading it all in a buffer, which is both needlessly inefficient and much slower than if you simply opened the file to read normally. The selling point of memory mapped files is repeated random access and random updates backed by the OS's virtual memory paging.
And you definitely don't need to read the entire file in memory, since your data is so strongly structured. You know exactly how many bytes to read for a record: Marshal.SizeOf<DataStruct>().
Then you need to get rid of all that serialization noise. Again your data is strongly typed, just read it. Get rid of those fixed arrays and use regular arrays, you're already instructing the marshaller how to read them with MarshalAs attributes (good). That also gets rid of that helper function that just copies an array for some unknown reason.
Your reading loop is very simple: read the correct number of bytes for one entry, use Marshal.PtrToStructure to convert it to a readable structure and add it to a list to return at the end. Bonus points if you can use .Net Core and Unsafe.As or Unsafe.Cast.
Edit: and don't use object returns, you know exactly what you're returning, write it down.

Storing byte arrays in a SecureString fails sporadically

Before somebody comes up with the "why would you do that" be contempt with the fact that I need that. If not, then I simply want a secure byte array, think of it as a binary passphrase. Now on to the stuff!
I have the following extension method to get a SecureString instance from data in a byte array:
public static SecureString FromByteArray(this SecureString secure, byte[] data)
{
secure.Clear(); // throw exception if IsReadOnly
char[] chars = new char[data.Length];
chars = data.Select(c => (char)c).ToArray();
foreach (char c in chars)
{
secure.AppendChar(c);
}
return secure;
}
Then somewhere in StackOverflow I found the following that dumps a SecureString to a byte[]:
public static byte[] ToInsecureBytes(this SecureString securePassword)
{
byte[] bytes = null;
if (null != securePassword)
{
GCHandle? gc = null;
var handle = IntPtr.Zero;
var length = securePassword.Length;
RuntimeHelpers.PrepareConstrainedRegions();
try
{
handle = Marshal.SecureStringToGlobalAllocAnsi(securePassword);
bytes = new byte[length];
gc = GCHandle.Alloc(bytes, GCHandleType.Pinned);
for (var i = 0; i < length; i++)
{
bytes[i] = Marshal.ReadByte(handle, i);
}
}
finally
{
if (handle != IntPtr.Zero)
{
Marshal.ZeroFreeGlobalAllocAnsi(handle);
}
if (null != gc &&
gc.HasValue)
{
gc.Value.Free();
}
}
}
return bytes;
}
Do notice that the code is independent of string encoding because we are considering the data as a mere binary but using the SecureString features.
Then I have this test code in LinqPad to prove the concept:
byte[] byteData = Generate256BitsOfRandomEntropy(); // byte[32]
BitConverter.ToString(byteData).Dump();
SecureString pwd4 = pass.FromByteArray(byteData);
byte[] byteOut = pwd4.ToInsecureBytes();
BitConverter.ToString(byteOut).Dump();
Console.WriteLine("Conversion successful: {0}", byteData.SequenceEqual(byteOut));
But when run as round trip conversion I notice that they are not the same, always there is a mismatch in the sequence and it is always a character with hex code 0x3F which appears mysteriously in random places of the byteOut array.
07-F7-90-67-A7-46-1E-F9-CE-44-91-46-44-9D-B9-81-75-7B-27-43-34-C3-F9-38-AC-E7-F9-E6-F1-7F-29-84
07-F7-90-67-A7-46-1E-F9-CE-44-3F-46-44-9D-B9-81-75-7B-27-43-34-C3-F9-38-AC-E7-F9-E6-F1-7F-29-3F
Conversion successful: False

Encrypt file but can decrypt from any point

Basically i need to encrypt a file and then be able to decrypt the file from almost any point in the file. The reason i need this is i would like to use this for files like Video etc and still be able to jump though the file or video. Also the file would be served over the web so not needing to download the whole file is important. Where i am storing the file supports partial downloads so i can request any part of the file i need and this works for an un encrypted file. The question is how could i make this work for an encrypted file. I need to encrypt and decrypt the file in C# but don't really have any other restrictions than that. Symmetric keys are preferred but if that wont work it is not a deal breaker.
Another example of where i only want to download part of a file and decrypt is where i have joined multiple files together but just need to retrieve one of them. This would generally be used for files smaller than 50MB like pictures and info files.
--- EDIT ---
To be clear i am looking for a working implementation or library that does not increase the size of the source file. Stream cipher seems ideal but i have not seen one in c# that works for any point in the stream or anything apart from the start of the stream. Would consider block based implementation if it works form set blocks in stream. Basically i want to pass a raw stream though this and have unencrypted come out other side of the stream. Happy to set the starting offset it represents in the whole file/stream. Looking for something than works as i am not encryption expert. At the minute i get the parts of the file from a data source in 512kb to 5mb blocks depending on client config and i use streams CopyTo method to write it out to a file on disk. I don't get these parts in order. I am looking for a stream wrapper that i could use to pass into the CopyTo method on stream.
Your best best is probably to treat the file as a list of chunks (of whatever size is convenient for your application; let's say 50 kB) and encrypt each separately. This would allow you to decrypt each chunk independently of the others.
For each chunk, derive new keys from your master key, generate a new IV, and encrypt-then-MAC the chunk.
This method has higher storage overhead than encrypting the entire file at once and takes a bit more computation as well due to the key regeneration that it requires.
If you use a stream cipher instead of a block cipher, you'd be able to start decrypting at any byte offset, as long as the decryptor was able to get the current IV from somewhere.
For those interested i managed to work it out based on a number of examples i found plus some of my own code. It uses bouncycastle but should also work with dotnet AES with a few tweaks. This allows the decryption/encryption from any point in the stream.
using System;
using System.IO;
using Org.BouncyCastle.Crypto;
using Org.BouncyCastle.Crypto.Parameters;
namespace StreamHelpers
{
public class StreamEncryptDecrypt : Stream
{
private readonly Stream _streamToWrap;
private readonly IBlockCipher _cipher;
private readonly ICipherParameters _key;
private readonly byte[] _iv;
private readonly byte[] _counter;
private readonly byte[] _counterOut;
private readonly byte[] _output;
private long currentBlockCount;
public StreamEncryptDecrypt(Stream streamToWrap, IBlockCipher cipher, ParametersWithIV keyAndIv)
{
_streamToWrap = streamToWrap;
_cipher = cipher;
_key = keyAndIv.Parameters;
_cipher.Init(true, _key);
_iv = keyAndIv.GetIV();
_counter = new byte[_cipher.GetBlockSize()];
_counterOut = new byte[_cipher.GetBlockSize()];
_output = new byte[_cipher.GetBlockSize()];
if (_iv.Length != _cipher.GetBlockSize())
{
throw new Exception("IV must be the same size as the cipher block size");
}
InitCipher();
}
private void InitCipher()
{
long position = _streamToWrap.Position;
Array.Copy(_iv, 0, _counter, 0, _counter.Length);
currentBlockCount = 0;
var targetBlock = position/_cipher.GetBlockSize();
while (currentBlockCount < targetBlock)
{
IncrementCounter(false);
}
_cipher.ProcessBlock(_counter, 0, _counterOut, 0);
}
private void IncrementCounter(bool updateCounterOut = true)
{
currentBlockCount++;
// Increment the counter
int j = _counter.Length;
while (--j >= 0 && ++_counter[j] == 0)
{
}
_cipher.ProcessBlock(_counter, 0, _counterOut, 0);
}
public override long Position
{
get { return _streamToWrap.Position; }
set
{
_streamToWrap.Position = value;
InitCipher();
}
}
public override long Seek(long offset, SeekOrigin origin)
{
var result = _streamToWrap.Seek(offset, origin);
InitCipher();
return result;
}
public void ProcessBlock(
byte[] input,
int offset,
int length, long streamPosition)
{
if (input.Length < offset+length)
throw new ArgumentException("input does not match offset and length");
var blockSize = _cipher.GetBlockSize();
var startingBlock = streamPosition / blockSize;
var blockOffset = (int)(streamPosition - (startingBlock * blockSize));
while (currentBlockCount < streamPosition / blockSize)
{
IncrementCounter();
}
//process the left over from current block
if (blockOffset !=0)
{
var blockLength = blockSize - blockOffset;
blockLength = blockLength > length ? length : blockLength;
//
// XOR the counterOut with the plaintext producing the cipher text
//
for (int i = 0; i < blockLength; i++)
{
input[offset + i] = (byte)(_counterOut[blockOffset + i] ^ input[offset + i]);
}
offset += blockLength;
length -= blockLength;
blockOffset = 0;
if (length > 0)
{
IncrementCounter();
}
}
//need to loop though the rest of the data and increament counter when needed
while (length > 0)
{
var blockLength = blockSize > length ? length : blockSize;
//
// XOR the counterOut with the plaintext producing the cipher text
//
for (int i = 0; i < blockLength; i++)
{
input[offset + i] = (byte)(_counterOut[i] ^ input[offset + i]);
}
offset += blockLength;
length -= blockLength;
if (length > 0)
{
IncrementCounter();
}
}
}
public override int Read(byte[] buffer, int offset, int count)
{
var pos = _streamToWrap.Position;
var result = _streamToWrap.Read(buffer, offset, count);
ProcessBlock(buffer, offset, result, pos);
return result;
}
public override void Write(byte[] buffer, int offset, int count)
{
var input = new byte[count];
Array.Copy(buffer, offset, input, 0, count);
ProcessBlock(input, 0, count, _streamToWrap.Position);
_streamToWrap.Write(input, offset, count);
}
public override void Flush()
{
_streamToWrap.Flush();
}
public override void SetLength(long value)
{
_streamToWrap.SetLength(value);
}
public override bool CanRead
{
get { return _streamToWrap.CanRead; }
}
public override bool CanSeek
{
get { return true; }
}
public override bool CanWrite
{
get { return _streamToWrap.CanWrite; }
}
public override long Length
{
get { return _streamToWrap.Length; }
}
protected override void Dispose(bool disposing)
{
if (_streamToWrap != null)
{
_streamToWrap.Dispose();
}
base.Dispose(disposing);
}
}
}

How do I hash first N bytes of a file?

Using .net, I would like to be able to hash the first N bytes of potentially large files, but I can't seem to find a way of doing it.
The ComputeHash function (I'm using SHA1) takes a byte array or a stream, but a stream seems like the best way of doing it, since I would prefer not to load a potentially large file into memory.
To be clear: I don't want to load a potentially large piece of data into memory if I can help it. If the file is 2GB and I want to hash the first 1GB, that's a lot of RAM!
You can hash large volumes of data using a CryptoStream - something like this should work:
var sha1 = SHA1Managed.Create();
FileStream fs = \\whatever
using (var cs = new CryptoStream(fs, sha1, CryptoStreamMode.Read))
{
byte[] buf = new byte[16];
int bytesRead = cs.Read(buf, 0, buf.Length);
long totalBytesRead = bytesRead;
while (bytesRead > 0 && totalBytesRead <= maxBytesToHash)
{
bytesRead = cs.Read(buf, 0, buf.Length);
totalBytesRead += bytesRead;
}
}
byte[] hash = sha1.Hash;
fileStream.Read(array, 0, N);
http://msdn.microsoft.com/en-us/library/system.io.filestream.read.aspx
Open the file as a FileStream, copy the first n bytes into a MemoryStream, then hash the MemoryStream.
As others have pointed out, you should read the first few bytes into an array.
What should also be noted that you don't want to make a direct call to Read and assume that the bytes have been read.
Rather, you want to make sure that the number of bytes that are returned are the number of bytes that you requested, and make another call to Read in the event that the number of bytes returned doesn't equal the initial number requested.
Also, if you have rather large streams, you will want to create a proxy for the Stream class where you pass it the underlying stream (the FileStream in this case) and override the Read method to forward the call to the underlying stream until you read the number of bytes that you need to read. Then, when that number of bytes is returned, you would return -1 to indicate that there are no more bytes to be read.
If you are concerned about keeping too much data in memory, you can create a stream wrapper that throttles the maximum number of bytes read.
Without doing all the work, here's a sample boiler plate you could use to get started.
Edit: Please review comments for recommendations to improve this implementation. End edit
public class LimitedStream : Stream
{
private int current = 0;
private int limit;
private Stream stream;
public LimitedStream(Stream stream, int n)
{
this.limit = n;
this.stream = stream;
}
public override int ReadByte()
{
if (current >= limit)
return -1;
var numread = base.ReadByte();
if (numread >= 0)
current++;
return numread;
}
public override int Read(byte[] buffer, int offset, int count)
{
count = Math.Min(count, limit - current);
var numread = this.stream.Read(buffer, offset, count);
current += numread;
return numread;
}
public override long Seek(long offset, SeekOrigin origin)
{
throw new NotImplementedException();
}
public override void SetLength(long value)
{
throw new NotImplementedException();
}
public override void Write(byte[] buffer, int offset, int count)
{
throw new NotImplementedException();
}
public override bool CanRead
{
get { return true; }
}
public override bool CanSeek
{
get { return false; }
}
public override bool CanWrite
{
get { return false; }
}
public override void Flush()
{
throw new NotImplementedException();
}
public override long Length
{
get { throw new NotImplementedException(); }
}
public override long Position
{
get { throw new NotImplementedException(); }
set { throw new NotImplementedException(); }
}
protected override void Dispose(bool disposing)
{
base.Dispose(disposing);
if (this.stream != null)
{
this.stream.Dispose();
}
}
}
Here is an example of the stream in use, wrapping a file stream, but throttling the number of bytes read to the specified limit:
using (var stream = new LimitedStream(File.OpenRead(#".\test.xml"), 100))
{
var bytes = new byte[1024];
stream.Read(bytes, 0, bytes.Length);
}

Categories