Hash a file as it's being received - C#

End goal:
Users are uploading a large number of files of different sizes to my web site, and I don't want duplicate files on the disk.
The solution I have been using is a simple SHA-1 hash of the file when it is uploaded, with code like this:
public static string HashFile(string FileName)
{
    using (FileStream stream = File.OpenRead(FileName))
    {
        SHA1Managed sha = new SHA1Managed();
        byte[] checksum = sha.ComputeHash(stream);
        string sendCheckSum = BitConverter.ToString(checksum).Replace("-", string.Empty);
        return sendCheckSum;
    }
}
This "works" fine for smaller files, but it's a big pain when the file is 30 GB. So I would like to hash the file as I'm receiving it from the client. I get the file from the client in chunks, and the size of a chunk is not always the same.
Code that receives the file:
int chunk = context.Request["chunk"] != null ? int.Parse(context.Request["chunk"]) : 0;
int chunks = context.Request["chunks"] != null ? int.Parse(context.Request["chunks"]) : 0;
string fileName = context.Request["name"] != null ? context.Request["name"] : string.Empty;
HttpPostedFile fileUpload = context.Request.Files[0];
string fullFilePath = Path.Combine(SiteSettings.UploadTempFolder, fileName);
using (var fs = new FileStream(fullFilePath, chunk == 0 ? FileMode.Create : FileMode.Append))
{
    var buffer = new byte[fileUpload.InputStream.Length];
    fileUpload.InputStream.Read(buffer, 0, buffer.Length);
    fs.Write(buffer, 0, buffer.Length);
    // Here I want the hash, while I have the file data in memory.
}

You can always create your own stream :)
public class ActionStream : Stream
{
    private readonly Stream _innerStream;
    private readonly Action<byte[], int, int> _readAction;

    public ActionStream(Stream innerStream, Action<byte[], int, int> readAction)
    {
        _innerStream = innerStream;
        _readAction = readAction;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => _innerStream.Length;
    public override long Position
    {
        get { return _innerStream.Position; }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { }

    // Reads from the inner stream and hands the bytes just read to the callback.
    public override int Read(byte[] buffer, int offset, int count)
    {
        var bytesRead = _innerStream.Read(buffer, offset, count);
        _readAction(buffer, offset, bytesRead);
        return bytesRead;
    }

    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _innerStream.Dispose();
        base.Dispose(disposing);
    }
}
This allows you to bind together the two stream operations you're doing:
using (var fs = new FileStream(path, chunk == 0 ? FileMode.Create : FileMode.Append))
{
    // "as" is a reserved word in C#, so the variable needs another name
    var actionStream = new ActionStream(fileUpload.InputStream,
        (buffer, offset, bytesRead) =>
        {
            fs.Write(buffer, offset, bytesRead);
        });
    var sha = new SHA1Managed();
    var checksum = sha.ComputeHash(actionStream);
}
This assumes that SHA1Managed reads through every single byte of the input stream in order; you should check that. I'm pretty sure that is how it works, though :)

This is a cut and paste from:
Compute a hash from a stream of unknown length in C#
MD5, like other hash functions, does not require two passes.
To start:
HashAlgorithm hasher = ..;
As each block of data arrives:
byte[] buffer = ..;
int bytesReceived = ..;
hasher.TransformBlock(buffer, 0, bytesReceived, null, 0);
To finish and retrieve the hash:
hasher.TransformFinalBlock(new byte[0], 0, 0);
byte[] hash = hasher.Hash;
This pattern works for any type derived from HashAlgorithm, including MD5CryptoServiceProvider and SHA1Managed.
HashAlgorithm also defines a method ComputeHash which takes a Stream object; however, this method will block the thread until the stream is consumed. Using the TransformBlock approach allows an "asynchronous hash" that is computed as data arrives without using up a thread.
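As a rough sketch of how the TransformBlock pattern above could be wrapped for chunked uploads: the class name ChunkHasher and its Append/Finish methods are made up for this example, not part of the framework. It assumes the same instance can be kept alive between the chunk requests of one upload (how to do that, for example keyed by upload name, is left open here).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: feeds chunks to a HashAlgorithm as they arrive.
public sealed class ChunkHasher : IDisposable
{
    private readonly HashAlgorithm _hasher;

    public ChunkHasher(HashAlgorithm hasher)
    {
        _hasher = hasher ?? throw new ArgumentNullException(nameof(hasher));
    }

    // Call once per received chunk; only the first 'count' bytes are hashed.
    public void Append(byte[] buffer, int count)
    {
        _hasher.TransformBlock(buffer, 0, count, null, 0);
    }

    // Call after the last chunk; returns the hash as an uppercase hex string.
    public string Finish()
    {
        _hasher.TransformFinalBlock(new byte[0], 0, 0);
        return BitConverter.ToString(_hasher.Hash).Replace("-", string.Empty);
    }

    public void Dispose()
    {
        _hasher.Dispose();
    }
}
```

Usage would be one `new ChunkHasher(SHA1.Create())` per upload, `Append(buffer, bytesRead)` for each chunk right after writing it to disk, and `Finish()` once the last chunk has arrived. The result should match what ComputeHash returns for the complete file.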


Downloading an image of unknown type to the temp directory and sending it to the browser, finding the content type using magic numbers

I'm coming from Node.js, so I'm using some Node.js examples to get the concept across.
Temp directory: I'm using .NET Core so the app can run on Mac, Windows or Linux, and I'm unsure how to find the temp directory where this image will be downloaded on each operating system. (In Node.js, os.tmpDir() finds the temp directory on any OS.)
Unknown file format: the format does not need to be known to download and save the image, but it is needed for the headers when sending it to the browser, and it can be determined using magic numbers. Reference
Download Image from Web
using (WebClient webClient = new WebClient())
{
    byte[] data = webClient.DownloadData("https://fbcdn-sphotos-h-a.akamaihd.net/hphotos-ak-xpf1/v/t34.0-12/10555140_10201501435212873_1318258071_n.jpg?oh=97ebc03895b7acee9aebbde7d6b002bf&oe=53C9ABB0&__gda__=1405685729_110e04e71d9");
    using (MemoryStream mem = new MemoryStream(data))
    using (var yourImage = Image.FromStream(mem))
    {
        // how to save in the temp directory for all operating systems
    }
}
Send the image to browser
byte[] ar;
using (FileStream fstream = new FileStream(tempPathForImage, FileMode.Open, FileAccess.Read))
{
    ar = new byte[fstream.Length];
    fstream.Read(ar, 0, ar.Length);
}
sw.WriteLine("Content-Type: "); // image/jpeg unknown, check first 4 bytes
sw.WriteLine("Content-Length: {0}", ar.Length); //Let's
sw.BaseStream.Write(ar, 0, ar.Length);
Magic numbers:
Magic numbers are basically the first 4 bytes of a file, which can help to find out the file type/extension. I've done something similar in Node.js: right before the image gets streamed, I get a chance to read the first four bytes, set the header (content type) and then continue streaming. It's important to set the header before the image starts streaming, which is why the following code checks for writeStream == null.
response.on('data', function(chunk){
    if(writeStream == null) {
        url += '.' + getExtension(chunk.toString('hex', 0, 4));
        writeStream = fs.createWriteStream(url);
        writeStream.on('error', reject);
        writeStream.on('finish', function(){
            data.file = url;
        });
    }
});
Some file formats and their magic numbers.
"ffd8ffDB": "jpg",
"ffd8ffe0": "jpg",
"ffd8ffe1": "jpg",
"ffd8ffe2": "jpg",
"ffd8ffe3": "jpg",
"ffd8ffe8": "jpg",
"ffd8ffdb": "jpg",
"89504e47": "png",
"47494638": "gif",
How do I find the temp directory on any operating system?
How do I read the first four bytes while streaming, just before sending, set the content-type header once and then continue streaming?
You can create a specialized stream that wraps another stream, then override Read() to handle that detection. Here's a starting point. Most of this is boilerplate that defers to the wrapped stream.
class ImageDetectionStream : Stream
{
    public string Mime { get; private set; }

    private readonly Stream _stream;
    private readonly byte[] _consideredBytes = new byte[MaxMagicNumberSize];
    private int _consideredPosition;

    private static readonly IDictionary<byte[], string> Magics = new Dictionary<byte[], string>
    {
        [new byte[] { 0xff, 0xd8, 0xff, 0xdb }] = "image/jpeg",
        [new byte[] { 0xff, 0xd8, 0xff, 0xe0 }] = "image/jpeg",
        [new byte[] { 0xff, 0xd8, 0xff, 0xe1 }] = "image/jpeg",
        // and so on...
    };

    private static readonly int MaxMagicNumberSize = Magics.Keys.Max(x => x.Length);

    public ImageDetectionStream(Stream stream)
    {
        _stream = stream ?? throw new ArgumentNullException(nameof(stream));
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        var value = _stream.Read(buffer, offset, count);
        if (Mime != null || _consideredPosition >= MaxMagicNumberSize) return value;

        // Collect the leading bytes of the stream across reads until enough are available.
        var needed = Math.Min(value, MaxMagicNumberSize - _consideredPosition);
        Array.Copy(buffer, offset, _consideredBytes, _consideredPosition, needed);
        _consideredPosition += needed;
        if (_consideredPosition < MaxMagicNumberSize) return value;

        foreach (var magic in Magics)
        {
            if (_consideredBytes.Take(magic.Key.Length).SequenceEqual(magic.Key))
            {
                Mime = magic.Value;
                break;
            }
        }
        return value;
    }

    // boilerplate
    public override void Flush() { _stream.Flush(); }
    public override long Seek(long offset, SeekOrigin origin) { return _stream.Seek(offset, origin); }
    public override void SetLength(long value) { _stream.SetLength(value); }
    public override void Write(byte[] buffer, int offset, int count) { _stream.Write(buffer, offset, count); }
    public override bool CanRead => _stream.CanRead;
    public override bool CanSeek => _stream.CanSeek;
    public override bool CanWrite => _stream.CanWrite;
    public override long Length => _stream.Length;
    public override long Position
    {
        get => _stream.Position;
        set => _stream.Position = value;
    }
}
Example use:
using (var fs = File.OpenRead("\\path\\to\\image\\file"))
using (var imageStream = new ImageDetectionStream(fs))
{
    var bytes = new byte[128];
    var bytesRead = imageStream.Read(bytes, 0, bytes.Length);
    Console.WriteLine($"Image has {imageStream.Mime} type.");
}
Image has image/jpeg type.
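For the temp directory part of the question, Path.GetTempPath() returns the temp folder of the current user on whatever OS the app runs on. Below is a small sketch that combines it with a standalone magic-number lookup. ImageSniffer, DetectMime and NewTempFilePath are names invented for this example, and matching JPEG on only the first three bytes (ff d8 ff) is a simplification of the list in the question, not something the question states.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ImageSniffer
{
    // Leading bytes based on the list in the question; extend as needed.
    private static readonly KeyValuePair<byte[], string>[] Magics =
    {
        new KeyValuePair<byte[], string>(new byte[] { 0xff, 0xd8, 0xff }, "image/jpeg"),
        new KeyValuePair<byte[], string>(new byte[] { 0x89, 0x50, 0x4e, 0x47 }, "image/png"),
        new KeyValuePair<byte[], string>(new byte[] { 0x47, 0x49, 0x46, 0x38 }, "image/gif"),
    };

    // Returns the MIME type for the given leading bytes, or null if none matches.
    public static string DetectMime(byte[] header)
    {
        foreach (var magic in Magics)
        {
            if (header.Length >= magic.Key.Length &&
                header.Take(magic.Key.Length).SequenceEqual(magic.Key))
            {
                return magic.Value;
            }
        }
        return null;
    }

    // Builds a unique file path inside the OS-specific temp directory.
    public static string NewTempFilePath(string extension)
    {
        return Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + extension);
    }
}
```

With the downloaded bytes in hand, `ImageSniffer.DetectMime(data)` gives the value for the Content-Type header, and `File.WriteAllBytes(ImageSniffer.NewTempFilePath(".jpg"), data)` would store the image in the temp directory.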

Stream.CopyTo - How do I get the sent Bytes?

I'm trying to get the transfer speed of an FTP upload, but I don't know where I should get it from:
FtpWebRequest request = (FtpWebRequest)WebRequest.Create(job.GetDestinationFolder() + "\\" + fileOnlyName);
request.Method = WebRequestMethods.Ftp.UploadFile;
request.Credentials = new NetworkCredential(Manager._user, Manager._password);
using (var requestStream = request.GetRequestStream())
{
    using (var input = File.OpenRead(file))
    {
        input.CopyTo(requestStream);
    }
}
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
Console.WriteLine("Upload File Complete, status {0}", response.StatusDescription);
I already read that this code
public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[32768];
    int read;
    while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write (buffer, 0, read);
    }
}
isn't really efficient, according to the comment that was left:
Note that this is not the fastest way to do it. In the provided code snippet, you have to wait for the Write to complete before a new block is read. When doing the Read and Write asynchronously this waiting will disappear. In some situation this will make the copy twice as fast. However it will make the code a lot more complicated so if speed is not an issue, keep it simple and use this simple loop.
How can I show the transfer speed, like a download in Chrome or Firefox?
This is what I tried before you (Tien Dinh) answered:
FtpWebRequest request = (FtpWebRequest)WebRequest.Create(job.GetDestinationFolder() + "\\" + fileOnlyName);
request.Method = WebRequestMethods.Ftp.UploadFile;
request.Credentials = new NetworkCredential(Manager._user, Manager._password);
using (var requestStream = request.GetRequestStream())
{
    using (var input = File.OpenRead(file))
    {
        input.CopyToAsync(requestStream);
        while (input.Position != input.Length)
        {
            //bGroundWorker.ReportProgress( (int) input.Position);
            Console.WriteLine(input.Length + "(length)");
            Console.WriteLine(input.Position + "(sent)");
        }
        //e.Result = input.Position;
    }
}
FtpWebResponse response = (FtpWebResponse)request.GetResponse();
Console.WriteLine("Upload File Complete, status {0}", response.StatusDescription);
As you can see, there is a BackgroundWorker, which is why I use CopyToAsync.
You could build your own stream wrapper class that reports the number of bytes written in a defined interval:
public class StreamWithProgress : Stream
{
    private readonly TimeSpan interval;
    private readonly long sourceLength;
    private readonly Stopwatch stopwatch = Stopwatch.StartNew();
    private readonly BackgroundWorker worker;
    private int bytesInLastInterval;
    private long bytesTotal;
    private Stream innerStream;

    public override bool CanRead { get { return this.innerStream.CanRead; } }
    public override bool CanSeek { get { return this.innerStream.CanSeek; } }
    public override bool CanWrite { get { return this.innerStream.CanWrite; } }
    public override long Length { get { return this.innerStream.Length; } }
    public override long Position
    {
        get { return this.innerStream.Position; }
        set { this.innerStream.Position = value; }
    }

    public StreamWithProgress(Stream stream, BackgroundWorker worker, long sourceLength, TimeSpan? interval = null)
    {
        if (stream == null)
            throw new ArgumentNullException("stream");
        if (worker == null)
            throw new ArgumentNullException("worker");
        this.interval = interval ?? TimeSpan.FromSeconds(1);
        this.innerStream = stream;
        this.worker = worker;
        this.sourceLength = sourceLength;
    }

    public override void Flush() { this.innerStream.Flush(); }
    public override int Read(byte[] buffer, int offset, int count) { return this.innerStream.Read(buffer, offset, count); }
    public override int ReadByte() { return this.innerStream.ReadByte(); }
    public override long Seek(long offset, SeekOrigin origin) { return this.innerStream.Seek(offset, origin); }
    public override void SetLength(long value) { this.innerStream.SetLength(value); }

    public override void Write(byte[] buffer, int offset, int count)
    {
        this.innerStream.Write(buffer, offset, count);
        this.ReportProgress(count);
    }

    public override void WriteByte(byte value)
    {
        this.innerStream.WriteByte(value);
        this.ReportProgress(1);
    }

    protected override void Dispose(bool disposing)
    {
        if (this.innerStream != null)
        {
            this.innerStream.Dispose();
            this.innerStream = null;
        }
    }

    // Counts written bytes and reports speed (bytes per interval) and progress once per interval.
    private void ReportProgress(int count)
    {
        this.bytesInLastInterval += count;
        this.bytesTotal += count;
        if (this.stopwatch.Elapsed > this.interval)
        {
            double speed = this.bytesInLastInterval / (this.stopwatch.Elapsed.Ticks / (double) this.interval.Ticks);
            double progress = this.bytesTotal / (double) this.sourceLength;
            var progressPercentage = (int) (progress * 100);
            this.worker.ReportProgress(progressPercentage, speed);
            this.bytesInLastInterval = 0;
            this.stopwatch.Restart();
        }
    }
}
You would use it like this:
BackgroundWorker worker = (BackgroundWorker)sender;
WebRequest request = WebRequest.Create("SOME URL");
WebResponse response = request.GetResponse();
using (Stream stream = response.GetResponseStream())
using (var dest = new StreamWithProgress(File.OpenWrite("PATH"), worker, response.ContentLength))
{
    stream.CopyTo(dest);
}
The BackgroundWorker will be called repeatedly with the current progress and speed. You could refine that example using a queue that stores the last n speeds and reports a mean value.
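The queue refinement mentioned above could look roughly like this. SpeedAverager is a name made up for this sketch; it only smooths the numbers and knows nothing about the stream or the BackgroundWorker.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: keeps the last 'capacity' speed samples and reports their mean.
public sealed class SpeedAverager
{
    private readonly Queue<double> _samples = new Queue<double>();
    private readonly int _capacity;

    public SpeedAverager(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    // Adds a sample (for example bytes per interval) and returns the current mean.
    public double Add(double speed)
    {
        _samples.Enqueue(speed);
        if (_samples.Count > _capacity) _samples.Dequeue();
        return _samples.Average();
    }
}
```

In the ProgressChanged handler you would pass the reported speed (e.UserState) through `Add` and display the returned value instead of the raw one, which should make the displayed speed jump around less.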
You already have a CopyStream method; it just needs better performance. BufferedStream is great for this. See below.
I believe you can also improve it further by using the async methods in .NET 4.
public static void CopyStream(Stream input, Stream output, Action<int> totalSent)
{
    BufferedStream inputBuffer = new BufferedStream(input);
    BufferedStream outputBuffer = new BufferedStream(output);
    byte[] buffer = new byte[32768];
    int read;
    int total = 0;
    while ((read = inputBuffer.Read(buffer, 0, buffer.Length)) > 0)
    {
        outputBuffer.Write (buffer, 0, read);
        total += read;
        // report the running total to the caller
        totalSent(total);
    }
    // push any bytes still held by the BufferedStream to the output
    outputBuffer.Flush();
}
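A possible shape for the async variant hinted at above, as a sketch only. It assumes .NET 4.5 or later (ReadAsync/WriteAsync and IProgress<T> are not in plain .NET 4.0), and StreamCopy / CopyWithProgressAsync are names invented here. It uses a long for the total so files over 2 GB do not overflow the counter.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamCopy
{
    // Copies input to output asynchronously and reports the running byte total.
    public static async Task<long> CopyWithProgressAsync(
        Stream input, Stream output, IProgress<long> progress, int bufferSize = 32768)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = await input.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            await output.WriteAsync(buffer, 0, read).ConfigureAwait(false);
            total += read;
            // progress may be null when the caller is not interested in updates
            progress?.Report(total);
        }
        return total;
    }
}
```

To get a speed out of it, start a Stopwatch before the call and divide the reported total by the elapsed seconds inside the progress callback, for example `new Progress<long>(sent => Console.WriteLine(sent / watch.Elapsed.TotalSeconds))`.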

Load File from SDCard

I integrated this EPUB Reader into my project and it is working fine. Now I want to load the file from the SD card instead of the device's isolated storage.
To open a file from isolated storage we have IsolatedStorageFileStream, like this:
IsolatedStorageFileStream isfs;
using (IsolatedStorageFile isf = IsolatedStorageFile.GetUserStoreForApplication())
{
    isfs = isf.OpenFile([Path to file], FileMode.Open);
    ePubView.Source = isfs;
}
For a file on the SD card I tried this:
ExternalStorageDevice sdCard = (await ExternalStorage.GetExternalStorageDevicesAsync()).FirstOrDefault();
// If the SD card is present, get the file from the SD card.
if (sdCard != null)
{
    // _sdFilePath is a string holding the path of the file on the SD card
    ExternalStorageFile file = await sdCard.GetFileAsync(_sdFilePath);
    // Create a stream for the file.
    Stream stream = await file.OpenForReadAsync();
    // Hand the stream to the reader control.
    ePubView.Source = stream;
}
Here I am getting a System.IO.EndOfStreamException.
If you want to try it, here is a link to my sample project.
Question: how can I give my file as the source to the ePubView control?
Is this the proper way? Please give a suggestion regarding this.
I've not tried your approach and cannot say exactly where the error is. Maybe the file from the SD card is read asynchronously and thus you get the EndOfStreamException; please also keep in mind that, as the EPUB Reader site says, it is under heavy development. Check whether you are able to use the file after copying it to isolated storage. In this case I would first try copying from the SD card to a MemoryStream, like this:
ExternalStorageDevice sdCard = (await ExternalStorage.GetExternalStorageDevicesAsync()).FirstOrDefault();
if (sdCard != null)
{
    MemoryStream newStream = new MemoryStream();
    using (ExternalStorageFile file = await sdCard.GetFileAsync(_sdFilePath))
    using (Stream SDfile = await file.OpenForReadAsync())
    {
        newStream = await ReadToMemory(SDfile);
    }
    ePubView.Source = newStream;
}
And ReadToMemory:
private async Task<MemoryStream> ReadToMemory(Stream streamToRead)
{
    MemoryStream targetStream = new MemoryStream();
    const int BUFFER_SIZE = 1024;
    byte[] buf = new byte[BUFFER_SIZE];
    int bytesread = 0;
    while ((bytesread = await streamToRead.ReadAsync(buf, 0, BUFFER_SIZE)) > 0)
    {
        targetStream.Write(buf, 0, bytesread);
    }
    // Rewind so the consumer starts reading from the beginning.
    targetStream.Position = 0;
    return targetStream;
}
Maybe it will help.
There's a bug in the stream returned from ExternalStorageFile. There are two options to get around it.
If the file is small, you can simply copy the stream to a MemoryStream:
Stream s = await file.OpenForReadAsync();
MemoryStream ms = new MemoryStream();
s.CopyTo(ms);
However, if the file is too large you'll run into memory issues, so the following stream wrapper class can be used to work around Microsoft's bug (though in future versions of Windows Phone you'll need to disable this fix once the bug has been fixed):
using System;
using System.IO;

namespace WindowsPhoneBugFix
{
    /// <summary>
    /// Stream wrapper to circumvent buggy reading of the stream returned by ExternalStorageFile.OpenForReadAsync()
    /// </summary>
    public sealed class ExternalStorageFileWrapper : Stream
    {
        private Stream _stream; // Underlying stream

        public ExternalStorageFileWrapper(Stream stream)
        {
            if (stream == null)
                throw new ArgumentNullException("stream");
            _stream = stream;
        }

        // Workaround described here - http://stackoverflow.com/a/21538189/250254
        public override long Seek(long offset, SeekOrigin origin)
        {
            ulong uoffset = (ulong)offset;
            ulong fix = ((uoffset & 0xffffffffL) << 32) | ((uoffset & 0xffffffff00000000L) >> 32);
            return _stream.Seek((long)fix, origin);
        }

        public override bool CanRead { get { return _stream.CanRead; } }
        public override bool CanSeek { get { return _stream.CanSeek; } }
        public override bool CanWrite { get { return _stream.CanWrite; } }
        public override long Length { get { return _stream.Length; } }

        public override long Position
        {
            get { return _stream.Position; }
            set { _stream.Position = value; }
        }

        public override void Flush() { _stream.Flush(); }
        public override int Read(byte[] buffer, int offset, int count) { return _stream.Read(buffer, offset, count); }
        public override void SetLength(long value) { _stream.SetLength(value); }
        public override void Write(byte[] buffer, int offset, int count) { _stream.Write(buffer, offset, count); }
    }
}
Code is available here to drop in to your project:
Example of use:
ExternalStorageFile file = await device.GetFileAsync(filename); // device is an instance of ExternalStorageDevice
Stream streamOriginal = await file.OpenForReadAsync();
ExternalStorageFileWrapper streamToUse = new ExternalStorageFileWrapper(streamOriginal);
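To make the Seek override in the wrapper easier to follow: the expression there swaps the upper and lower 32 bits of the 64-bit offset before passing it on. The snippet below isolates just that bit manipulation so it can be tried on its own. SeekOffsetFix and SwapHalves are illustrative names, not part of the wrapper or of any Windows Phone API.

```csharp
using System;

public static class SeekOffsetFix
{
    // Swaps the high and low 32-bit halves of a 64-bit offset,
    // mirroring the expression used in the wrapper's Seek override.
    public static long SwapHalves(long offset)
    {
        ulong u = (ulong)offset;
        ulong swapped = ((u & 0xffffffffUL) << 32) | ((u & 0xffffffff00000000UL) >> 32);
        return (long)swapped;
    }
}
```

For example, an offset of 1 becomes 0x100000000 after the swap, and applying the swap twice gives back the original offset.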

Lazy, stream driven object serialization with protobuf-net

We are developing a WCF service for streaming a large amount of data, so we have chosen to use WCF streaming combined with protobuf-net serialization.
The general idea is to serialize objects in the service, write them into a stream and send it.
On the other end, the caller receives a Stream object from which it can read all the data.
So currently the service method code looks somewhat like this:
public Result TestMethod(Parameter parameter)
{
    // Create response
    var responseObject = new BusinessResponse { Value = "some very large data"};
    // The response has to be serialized in advance to an intermediate MemoryStream
    var stream = new MemoryStream();
    serializer.Serialize(stream, responseObject);
    stream.Position = 0;
    // ResultBody is a stream, Result is a MessageContract
    return new Result {ResultBody = stream};
}
The BusinessResponse object is serialized to a MemoryStream and that is returned from a method.
On the client side the calling code looks like that:
var parameter = new Parameter();
// Call the service method
var methodResult = channel.TestMethod(parameter);
// protobuf-net deserializer reads from a stream received from a service.
// while reading is performed by protobuf-net,
// on the service side WCF is actually reading from a
// memory stream where serialized message is stored
var result = serializer.Deserialize<BusinessResponse>(methodResult.ResultBody);
return result;
So when serializer.Deserialize() is called, it reads from the stream methodResult.ResultBody; at the same time, on the service side, WCF is reading the MemoryStream that was returned from TestMethod.
What we would like to achieve is to get rid of the MemoryStream and the up-front serialization of the whole object on the service side.
Since we use streaming, we would like to avoid keeping a serialized object in memory before sending.
The perfect solution would be to return an empty, custom-made Stream object (from TestMethod()) with a reference to the object that is to be serialized (the BusinessResponse object in my example).
Then, when WCF calls the Read() method of my stream, I would internally serialize a piece of the object using protobuf-net and return it to the caller without storing it in memory.
And here is the problem: what we actually need is a way to serialize an object piece by piece at the moment the stream is read.
I understand that this is a totally different way of serializing: instead of pushing an object to a serializer, I'd like to request the serialized content piece by piece.
Is that kind of serialization somehow possible using protobuf-net?
I cooked up some code that is probably along the lines of the gate idea of Marc.
public class PullStream : Stream
{
    private byte[] internalBuffer;
    private bool ended;
    private static ManualResetEvent dataAvailable = new ManualResetEvent(false);
    private static ManualResetEvent dataEmpty = new ManualResetEvent(true);

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override void Flush() { throw new NotImplementedException(); }
    public override long Length { get { throw new NotImplementedException(); } }
    public override long Position
    {
        get { throw new NotImplementedException(); }
        set { throw new NotImplementedException(); }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // block until the writer has provided data (or End was called)
        dataAvailable.WaitOne();
        if ( count >= internalBuffer.Length)
        {
            var retVal = internalBuffer.Length;
            Array.Copy(internalBuffer, 0, buffer, offset, retVal);
            internalBuffer = null;
            // buffer fully consumed: let the writer continue
            dataAvailable.Reset();
            dataEmpty.Set();
            return retVal;
        }
        else
        {
            Array.Copy(internalBuffer, 0, buffer, offset, count);
            internalBuffer = internalBuffer.Skip(count).ToArray(); // i know
            return count;
        }
    }

    public override long Seek(long offset, SeekOrigin origin) { throw new NotImplementedException(); }
    public override void SetLength(long value) { throw new NotImplementedException(); }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // block until the reader has consumed the previous data
        dataEmpty.WaitOne();
        dataEmpty.Reset();
        internalBuffer = new byte[count];
        Array.Copy(buffer, offset, internalBuffer, 0, count);
        Debug.WriteLine("Writing some data");
        dataAvailable.Set();
    }

    public void End()
    {
        dataEmpty.WaitOne();
        dataEmpty.Reset();
        // an empty buffer makes the next Read return 0, which signals end of stream
        internalBuffer = new byte[0];
        Debug.WriteLine("Ending writes");
        dataAvailable.Set();
    }
}
This is a simple Stream descendant that only implements Read and Write (and End). Read blocks while no data is available and Write blocks while data is available, so there is only one byte buffer involved. The LINQ copying of the remainder is open for optimization ;-) The End method is there so that Read does not block forever when no data is available and no more data will be written.
You have to write to this stream from a separate thread. I show this below:
// create a large object
var obj = new List<ToSerialize>();
for(int i = 0; i <= 1000; i ++)
    obj.Add(new ToSerialize { Test = "This is my very loooong message" });
// create my special stream to read from
var ms = new PullStream();
new Thread(x =>
{
    ProtoBuf.Serializer.Serialize(ms, obj);
    ms.End();
}).Start();
var buffer = new byte[100];
// stream to write back to (just to show deserialization is working too)
var ws = new MemoryStream();
int read;
while ((read = ms.Read(buffer, 0, 100)) != 0)
{
    ws.Write(buffer, 0, read);
    Debug.WriteLine("read some data");
}
ws.Position = 0;
var back = ProtoBuf.Serializer.Deserialize<List<ToSerialize>>(ws);
I hope this solves your problem :-) It was fun to code this anyway.
Regards, Jacco

StartsWith extension method for stream

I want to write a bool StartsWith(string message) extension method for a stream. What is the most efficient way?
Start with something like this ...
public static bool StartsWith(this Stream stream, string value)
{
    using (var reader = new StreamReader(stream))
    {
        string str = reader.ReadToEnd();
        return str.StartsWith(value);
    }
}
Then optimise ... I'll leave this as an exercise for you, StreamReader has various Read methods which will allow you to read the stream in smaller 'chunks' for a more efficient result.
static bool StartsWith(this Stream stream, string value, Encoding encoding, out string actualValue)
{
    if (stream == null) { throw new ArgumentNullException("stream"); }
    if (value == null) { throw new ArgumentNullException("value"); }
    if (encoding == null) { throw new ArgumentNullException("encoding"); }
    stream.Seek(0L, SeekOrigin.Begin);
    int count = encoding.GetByteCount(value);
    byte[] buffer = new byte[count];
    int read = stream.Read(buffer, 0, count);
    actualValue = encoding.GetString(buffer, 0, read);
    return value == actualValue;
}
Of course, a Stream by itself does not imply that its data can be decoded to a string representation. If you're sure your stream can, you can use the extension above.
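One more variation, as a sketch rather than a drop-in replacement: compare the raw bytes instead of decoding, and loop on Read, since Stream.Read is allowed to return fewer bytes than requested (network streams often do). StreamPrefixExtensions and StartsWithBytes are names chosen for this example. It reads from the current position and does not seek, so it also works on non-seekable streams, but it consumes the bytes it reads and it does not skip a byte order mark if the stream starts with one.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamPrefixExtensions
{
    // Returns true if the stream, read from its current position, begins with
    // the encoded bytes of 'value'. Consumes up to that many bytes.
    public static bool StartsWithBytes(this Stream stream, string value, Encoding encoding)
    {
        if (stream == null) throw new ArgumentNullException(nameof(stream));
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (encoding == null) throw new ArgumentNullException(nameof(encoding));

        byte[] expected = encoding.GetBytes(value);
        byte[] actual = new byte[expected.Length];

        // Stream.Read may return fewer bytes than requested, so loop until full or end of stream.
        int filled = 0;
        while (filled < actual.Length)
        {
            int read = stream.Read(actual, filled, actual.Length - filled);
            if (read == 0) return false; // stream ended before the prefix was complete
            filled += read;
        }

        for (int i = 0; i < expected.Length; i++)
        {
            if (actual[i] != expected[i]) return false;
        }
        return true;
    }
}
```

Called as `stream.StartsWithBytes("hello", Encoding.UTF8)`, it reads at most as many bytes as the encoded prefix needs, no matter how large the stream is.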
