I'm building a feature for an app that stores a file on a web server while keeping metadata about the file in SQL Server. I generate a SHA-256 hash, store it as BINARY(32), and then upload the file to a WebDAV server using HttpClient. Later, when I want to view the file in the app, I do a GET request, download the file, and compare its SHA-256 hash with the stored hash. It doesn't match :( Why?
I've compared the hash on the server against the one on the local machine, and those don't match either. I've done a ton of research and made sure I wasn't hashing the filename (you can see the code below).
public static byte[] GetSHA256(string path) {
    using (var stream = File.OpenRead(path)) {
        using (var sha256 = SHA256.Create()) {
            return sha256.ComputeHash(stream);
        }
    }
}
To upload a file:
public async Task<bool> Upload(string path, string name) {
    var storedHash = GetSHA256(Path.Combine(path, name));
    // Store this hash in a database, omitted for brevity
    using (var file = File.OpenRead(Path.Combine(path, name))) {
        var content = new MultipartFormDataContent();
        content.Headers.ContentType.MediaType = "multipart/form-data";
        content.Add(new StreamContent(file));
        var result = await HttpClient.PutAsync(uri, content);
        return result.IsSuccessStatusCode;
    }
}
To download:
var result = await HttpClient.GetAsync(uri);
using (var stream = await result.Content.ReadAsStreamAsync()) {
    var fileInfo = new FileInfo("TestFile");
    using (var fileStream = fileInfo.Open(FileMode.CreateNew, FileAccess.ReadWrite, FileShare.Delete)) {
        await stream.CopyToAsync(fileStream);
    }
}
var downloadedFileHash = GetSHA256("TestFile");
// Check whether downloadedFileHash matches storedHash by comparing byte[] length and contents with a for loop.
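(For what it's worth, storedHash.SequenceEqual(downloadedFileHash) from System.Linq does the same length-and-content comparison in one call.)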
I expected the hashes to match. I know I'm missing a few using statements and other code, but I omitted a bunch for brevity.
EDIT: The hash of a downloaded file stays the same across downloads, so the problem isn't the download but the upload. I uploaded the same file multiple times and got back a different hash for each upload, but each of those hashes stays constant.
Sorry y'all, you can delete this question because I found the problem/answer, but I'm still confused about why this occurs.
Turns out WebDAV was storing extra headers inside my file for some reason, see: Header info being written into file when PUT-ing to a Webdav server
Strangest thing. Then I encountered this post: https://blogs.msdn.microsoft.com/robert_mcmurray/2011/10/18/sending-webdav-requests-in-net-revisited/
Rewrote my code to be:
public static async Task<HttpResponseMessage> Upload(string path, string name, FileStream file) {
    var method = new HttpMethod(@"PUT");
    var message = new HttpRequestMessage(method, $"{path}/{name}") {
        Content = new StreamContent(file)
    };
    return await HttpClient.SendAsync(message);
}
And it works... But I wonder how the two methods of uploading differ.
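For anyone else wondering: MultipartFormDataContent wraps the payload in a MIME envelope - a boundary line plus per-part headers before the raw bytes - and a plain WebDAV PUT stores that envelope verbatim, which is why the stored bytes (and hash) differ from the original file. StreamContent alone sends exactly the file's bytes. Here's a minimal sketch that prints what actually goes over the wire in each case (the payload and class name are made up for illustration):
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class MultipartVsRawDemo
{
    static async Task Main()
    {
        var payload = Encoding.ASCII.GetBytes("file bytes");

        // Multipart: the payload is preceded by a boundary line and part headers,
        // so the body a server stores is longer than (and different from) the original file.
        var multipart = new MultipartFormDataContent();
        multipart.Add(new ByteArrayContent(payload));
        Console.WriteLine(await multipart.ReadAsStringAsync());

        // Raw content: the body is exactly the original bytes.
        var raw = new ByteArrayContent(payload);
        Console.WriteLine((await raw.ReadAsByteArrayAsync()).Length); // 10
    }
}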
I am using CouchDB as a content management store, uploading files as binary data. There is no GridFS support like MongoDB has for uploading large files, so I need to upload files as chunks and then retrieve them as one file.
Here is my code:
public string InsertDataToCouchDb(string dbName, string id, string filename, byte[] image)
{
    var connection = System.Configuration.ConfigurationManager.ConnectionStrings["CouchDb"].ConnectionString;
    using (var db = new MyCouchClient(connection, dbName))
    {
        // HERE I NEED TO UPLOAD MY IMAGE BYTE[] AS CHUNKS
        var artist = new couchdb
        {
            _id = id,
            filename = filename,
            Image = image
        };
        var response = db.Entities.PutAsync(artist);
        return response.Result.Content._id;
    }
}
public byte[] FetchDataFromCouchDb(string dbName, string id)
{
    var connection = System.Configuration.ConfigurationManager.ConnectionStrings["CouchDb"].ConnectionString;
    using (var db = new MyCouchClient(connection, dbName))
    {
        // HERE I NEED TO RETRIEVE MY FULL IMAGE BYTE[] FROM CHUNKS
        var test = db.Documents.GetAsync(id, null);
        var doc = db.Serializer.Deserialize<couchdb>(test.Result.Content);
        return doc.Image;
    }
}
THANK YOU
Putting image data in a CouchDB document is a terrible idea. Just don't. This is exactly what CouchDB attachments are for.
The potential for bloating the database with redundant blob data through document updates alone will surely have major negative consequences for anything other than a toy database.
Further, there seems to be a lack of understanding of how async/await works: the code in the OP invokes async methods, e.g. db.Entities.PutAsync(artist), without an await and then blocks on .Result, which at best hides failures and at worst deadlocks. I highly recommend grokking the Microsoft document Asynchronous programming with async and await.
Now as for "chunking": if the image data is so large that it has to be streamed, passing it around as a byte array is itself a bad sign. If the images are relatively small, just use Attachment.PutAsync as it stands.
Although Attachment.PutAsync at MyCouch v7.6 does not support streams (effectively chunking), there exists the Support Streams for attachments #177 PR, which does, and it looks pretty good.
Here's a one-page C# .NET Core console app that uploads a given file as an attachment to a specific document using the very efficient streaming provided by PR 177. Although the code uses PR 177, what matters most is that it uses Attachments for blob data. Replacing the stream with a byte array is rather straightforward.
MyCouch + PR 177
In a console, get the MyCouch sources and then apply PR 177:
$ git clone https://github.com/danielwertheim/mycouch.git
$ cd mycouch
$ git pull origin 15a1079502a1728acfbfea89a7e255d0c8725e07
(I don't know git so there's probably a far better way to get a PR)
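(For reference, GitHub exposes every pull request at a ref named pull/<number>/head, so fetching PR 177 directly should also work:)
$ git fetch origin pull/177/head:pr-177
$ git checkout pr-177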
MyCouchUploader
With VS2019:
Create a new .NET Core console app project and solution named "MyCouchUploader"
Add the MyCouch project pulled with PR 177 to the solution
Add the MyCouch project as a MyCouchUploader dependency
Add the NuGet package "Microsoft.AspNetCore.StaticFiles" as a MyCouchUploader dependency
Replace the content of Program.cs with the following code:
using Microsoft.AspNetCore.StaticFiles;
using MyCouch;
using MyCouch.Requests;
using MyCouch.Responses;
using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Security.Cryptography;
using System.Threading.Tasks;

namespace MyCouchUploader
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // args: scheme, database, file path of asset to upload.
            if (args.Length < 3)
            {
                Console.WriteLine("\nUsage: MyCouchUploader scheme dbname filepath\n");
                return;
            }

            var opts = new
            {
                scheme = args[0],
                dbName = args[1],
                filePath = args[2]
            };

            Action<Response> check = (response) =>
            {
                if (!response.IsSuccess) throw new Exception(response.Reason);
            };

            try
            {
                // canned doc id for this app
                const string docId = "SO-68998781";
                const string attachmentName = "Image";

                DbConnectionInfo cnxn = new DbConnectionInfo(opts.scheme, opts.dbName)
                { // timely fail if scheme is bad
                    Timeout = TimeSpan.FromMilliseconds(3000)
                };
                MyCouchClient client = new MyCouchClient(cnxn);

                // ensure db is there
                GetDatabaseResponse info = await client.Database.GetAsync();
                check(info);

                // delete doc for successive program runs
                DocumentResponse doc = await client.Documents.GetAsync(docId);
                if (doc.StatusCode == HttpStatusCode.OK)
                {
                    DocumentHeaderResponse del = await client.Documents.DeleteAsync(docId, doc.Rev);
                    check(del);
                }

                // sniff file for content type
                FileExtensionContentTypeProvider provider = new FileExtensionContentTypeProvider();
                if (!provider.TryGetContentType(opts.filePath, out string contentType))
                {
                    contentType = "application/octet-stream";
                }

                // create a hash for silly verification
                using var md5 = MD5.Create();
                using Stream stream = File.OpenRead(opts.filePath);
                byte[] fileHash = md5.ComputeHash(stream);
                stream.Position = 0;

                // Use PR 177, sea-locks:stream-attachments.
                DocumentHeaderResponse put = await client.Attachments.PutAsync(new PutAttachmentStreamRequest(
                    docId,
                    attachmentName,
                    contentType,
                    stream // :-D
                ));
                check(put);

                // verify
                AttachmentResponse verify = await client.Attachments.GetAsync(docId, attachmentName);
                check(verify);
                if (fileHash.SequenceEqual(md5.ComputeHash(verify.Content)))
                {
                    Console.WriteLine("Attachment verified.");
                }
                else
                {
                    throw new Exception(String.Format("Attachment failed verification with status code {0}", verify.StatusCode));
                }
            }
            catch (Exception e)
            {
                Console.WriteLine("Fail! {0}", e.Message);
            }
        }
    }
}
To run:
$ MyCouchUploader http://name:password@localhost:5984 dbname path-to-local-image-file
Use Fauxton to visually verify the attachment for the doc.
I have a Web API controller method that gets passed document IDs and should return the document files individually for the requested IDs. I tried the accepted answer from the following link to achieve this, but it's not working, and I don't know where I went wrong.
What's the best way to serve up multiple binary files from a single WebApi method?
My Web API method:
public async Task<HttpResponseMessage> DownloadMultiDocumentAsync(
    IClaimedUser user, string documentId)
{
    List<long> docIds = documentId.Split(',').Select(long.Parse).ToList();
    List<Document> documentList = coreDataContext.Documents
        .Where(d => docIds.Contains(d.DocumentId) && d.IsActive)
        .ToList();

    var content = new MultipartContent();
    CloudBlockBlob blob = null;
    var container = GetBlobClient(tenantInfo);
    var directory = container.GetDirectoryReference(
        string.Format(DirectoryNameConfigValue, tenantInfo.TenantId.ToString(), documentList[0].ProjectId));

    for (int i = 0; i < documentList.Count; i++)
    {
        blob = directory.GetBlockBlobReference(DocumentNameConfigValue + documentList[i].DocumentId);
        if (!blob.Exists()) continue;

        MemoryStream memStream = new MemoryStream();
        await blob.DownloadToStreamAsync(memStream);
        memStream.Seek(0, SeekOrigin.Begin);
        var streamContent = new StreamContent(memStream);
        content.Add(streamContent);
    }

    HttpResponseMessage httpResponseMessage = new HttpResponseMessage();
    httpResponseMessage.Content = content;
    httpResponseMessage.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    httpResponseMessage.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment");
    httpResponseMessage.StatusCode = HttpStatusCode.OK;
    return httpResponseMessage;
}
I tried with 2 or more document IDs, but only one file was downloaded, and it wasn't in the correct format (it had no extension).
Zipping is the only option that will give consistent results across all browsers. MIME/multipart content is for email messages (https://en.wikipedia.org/wiki/MIME#Multipart_messages) and was never intended to be received and parsed on the client side of an HTTP transaction. Some browsers implement it, some don't.
Alternatively, you can change your API to take a single docId and call it once per docId from your client.
I think the only way is to zip all the files and then download the single zip file. I suggest the DotNetZip package because it is easy to use.
One way is to first save your files to disk and then stream the zip for download. Another way is to zip them in memory and then download the file as a stream:
public ActionResult Download()
{
    using (ZipFile zip = new ZipFile())
    {
        zip.AddDirectory(Server.MapPath("~/Directories/hello"));
        MemoryStream output = new MemoryStream();
        zip.Save(output);
        output.Position = 0; // rewind, or the returned file will be empty
        return File(output, "application/zip", "sample.zip");
    }
}
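If you'd rather not take a package dependency, the built-in System.IO.Compression.ZipArchive can do the in-memory variant as well. A minimal sketch under the same assumptions (an ASP.NET MVC controller action zipping a server folder; needs System.IO.Compression plus a reference to System.IO.Compression.FileSystem for CreateEntryFromFile):
public ActionResult Download()
{
    var output = new MemoryStream();
    // leaveOpen: true so disposing the archive flushes the zip directory
    // but keeps the MemoryStream usable for the response
    using (var zip = new ZipArchive(output, ZipArchiveMode.Create, leaveOpen: true))
    {
        foreach (var path in Directory.GetFiles(Server.MapPath("~/Directories/hello")))
        {
            zip.CreateEntryFromFile(path, Path.GetFileName(path));
        }
    }
    output.Position = 0; // rewind before streaming to the client
    return File(output, "application/zip", "sample.zip");
}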
I need to work with huge files in Amazon S3. How can I get part of a huge file from S3? The best way would be to get a stream with seek support.
Unfortunately, the CanSeek property of response.ResponseStream is false:
GetObjectRequest request = new GetObjectRequest();
request.BucketName = BUCKET_NAME;
request.Key = NumIdToAmazonKey(numID);
GetObjectResponse response = client.GetObject(request);
You could do the following to read a certain part of your file:
GetObjectRequest request = new GetObjectRequest
{
    BucketName = bucketName,
    Key = keyName,
    ByteRange = new ByteRange(0, 10)
};
See the documentation
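To complete the picture, here's a sketch of issuing that request and reading just the returned range. I'm using the async GetObjectAsync from the v3 SDK (the synchronous GetObject from the question works the same way); it needs the Amazon.S3 and Amazon.S3.Model namespaces:
private static async Task<byte[]> ReadRangeAsync(IAmazonS3 client, GetObjectRequest request)
{
    using (GetObjectResponse response = await client.GetObjectAsync(request))
    using (var buffer = new MemoryStream())
    {
        // Only the requested bytes (0 through 10, inclusive) come over the wire
        await response.ResponseStream.CopyToAsync(buffer);
        return buffer.ToArray();
    }
}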
I know this isn't exactly what the OP is asking for, but I needed a seekable S3 stream so I could read Parquet files without downloading them, so I gave it a shot here: https://github.com/mukunku/RandomHelpers/blob/master/SeekableS3Stream.cs
Performance wasn't as bad as I expected. You can use the TimeWastedSeeking property to see how much time is wasted by allowing Seek() on an S3 stream.
Here's an example of how to use it:
using (var client = new AmazonS3Client(credentials, Amazon.RegionEndpoint.USEast1))
{
    using (var stream = SeekableS3Stream.OpenFile(client, "myBucket", "path/to/myfile.txt", true))
    {
        // stream is seekable!
    }
}
After a frustrating afternoon with the same problem, I found the static class AmazonS3Util:
https://docs.aws.amazon.com/sdkfornet/v3/apidocs/items/S3/TS3Util.html
which has a MakeStreamSeekable method.
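A small usage sketch; my understanding is that it buffers the object into memory to make it seekable, so it's best suited to modestly sized objects:
using Amazon.S3.Util;

// response is the GetObjectResponse from the question's client.GetObject(request)
using (Stream seekable = AmazonS3Util.MakeStreamSeekable(response.ResponseStream))
{
    seekable.Seek(1024, SeekOrigin.Begin); // CanSeek is now true
    // read from anywhere in the object
}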
Way late for the OP, but I've just posted an article and code demonstrating a SeekableS3Stream that performs reasonably well in real-world use cases.
https://github.com/mlhpdx/seekable-s3-stream
Specifically, I demonstrate reading a single small file from a much larger ISO disk image using the DiscUtils library, unmodified, by implementing a random-access stream that uses Range requests to pull sections of the file as needed and keeps them in an MRU list to avoid re-downloading ranges for hot data structures in the file (e.g. zip central directory records).
The usage is similarly simple:
using System;
using System.IO;
using System.Threading.Tasks;
using Amazon.S3;
using DiscUtils.Iso9660;

namespace Seekable_S3_Stream
{
    class Program
    {
        const string BUCKET = "rds.nsrl.nist.gov";
        const string KEY = "RDS/current/RDS_ios.iso"; // "RDS/current/RDS_modern.iso";
        const string FILENAME = "READ_ME.TXT";

        static async Task Main(string[] args)
        {
            var s3 = new AmazonS3Client();
            using var stream = new Cppl.Utilities.AWS.SeekableS3Stream(s3, BUCKET, KEY, 1 * 1024 * 1024, 4);
            using var iso = new CDReader(stream, true);
            using var file = iso.OpenFile(FILENAME, FileMode.Open, FileAccess.Read);
            using var reader = new StreamReader(file);
            var content = await reader.ReadToEndAsync();
            await Console.Out.WriteLineAsync($"{stream.TotalRead / (float)stream.Length * 100}% read, {stream.TotalLoaded / (float)stream.Length * 100}% loaded");
        }
    }
}
I need to upload a file using a Stream (Azure blob storage), and I just cannot figure out how to get the stream from the object itself. See the code below.
I'm new to Web API and have used some examples. I'm getting the files and file data, but it's not the correct type for my upload methods. Therefore, I need to get or convert it into a normal Stream, which seems a bit hard at the moment :)
I know I need to use ReadAsStreamAsync().Result in some way, but it crashes in the foreach loop since I'm getting two provider.Contents (the first one seems right, the second one does not).
[System.Web.Http.HttpPost]
public async Task<HttpResponseMessage> Upload()
{
    if (!Request.Content.IsMimeMultipartContent())
    {
        this.Request.CreateResponse(HttpStatusCode.UnsupportedMediaType);
    }

    var provider = GetMultipartProvider();
    var result = await Request.Content.ReadAsMultipartAsync(provider);

    // On upload, files are given a generic name like "BodyPart_26d6abe1-3ae1-416a-9429-b35f15e6e5d5",
    // so this is how you can get the original file name
    var originalFileName = GetDeserializedFileName(result.FileData.First());

    // uploadedFileInfo object will give you some additional stuff like file length,
    // creation time, directory name, a few filesystem methods etc..
    var uploadedFileInfo = new FileInfo(result.FileData.First().LocalFileName);

    // Remove this line as well as GetFormData method if you're not
    // sending any form data with your upload request
    var fileUploadObj = GetFormData<UploadDataModel>(result);

    Stream filestream = null;
    using (Stream stream = new MemoryStream())
    {
        foreach (HttpContent content in provider.Contents)
        {
            BinaryFormatter bFormatter = new BinaryFormatter();
            bFormatter.Serialize(stream, content.ReadAsStreamAsync().Result);
            stream.Position = 0;
            filestream = stream;
        }
    }

    var storage = new StorageServices();
    storage.UploadBlob(filestream, originalFileName);
    // ... (rest of the method omitted)
}

private MultipartFormDataStreamProvider GetMultipartProvider()
{
    var uploadFolder = "~/App_Data/Tmp/FileUploads"; // you could put this in web.config
    var root = HttpContext.Current.Server.MapPath(uploadFolder);
    Directory.CreateDirectory(root);
    return new MultipartFormDataStreamProvider(root);
}
This is identical to a dilemma I had a few months ago (capturing the upload stream before the MultipartStreamProvider took over and auto-magically saved the stream to a file). The recommendation was to inherit that class and override the methods... but that didn't work in my case. :( (I wanted the functionality of both MultipartFileStreamProvider and MultipartFormDataStreamProvider rolled into one MultipartStreamProvider, without the autosave part.)
This might help: here's one written by one of the Web API developers, and this from the same developer.
Hi, just wanted to post my answer so that anybody who encounters the same issue can find a solution here:
MultipartMemoryStreamProvider provider = await this.Request.Content.ReadAsMultipartAsync();
foreach (var st in provider.Contents)
{
    var fileBytes = await st.ReadAsByteArrayAsync();
    string base64 = Convert.ToBase64String(fileBytes);
    var contentHeader = st.Headers;
    string filename = contentHeader.ContentDisposition.FileName.Replace("\"", "");
    string filetype = contentHeader.ContentType.MediaType;
}
I used MultipartMemoryStreamProvider and got all the details, like filename and file type, from the headers of each content part.
Hope this helps someone.
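If, like the OP, you need a Stream rather than a byte array (e.g. for the Azure blob upload), each part exposes one directly. A minimal sketch (storage.UploadBlob is the OP's own method, assumed here to accept a Stream and a name):
MultipartMemoryStreamProvider provider = await Request.Content.ReadAsMultipartAsync();
foreach (HttpContent part in provider.Contents)
{
    string filename = part.Headers.ContentDisposition.FileName.Replace("\"", "");
    using (Stream partStream = await part.ReadAsStreamAsync())
    {
        // Hand the raw part stream to blob storage; no BinaryFormatter involved
        storage.UploadBlob(partStream, filename);
    }
}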
I'm trying to download files from my FTP server - multiple files at the same time. When I use my DownloadFileAsync method below, random files are returned with a byte[] length of 0. I can 100% confirm the files exist on the server and have content, AND the FTP server (running FileZilla Server) isn't erroring and says the file has been transferred.
private async Task<IList<FtpDataResult>> DownloadFileAsync(FtpFileName ftpFileName)
{
    var address = new Uri(string.Format("ftp://{0}{1}", _server, ftpFileName.FullName));
    var webClient = new WebClient
    {
        Credentials = new NetworkCredential(_username, _password)
    };

    var bytes = await webClient.DownloadDataTaskAsync(address);
    using (var stream = new MemoryStream(bytes))
    {
        // extract the stream data (either files in a zip OR a file);
        return result;
    }
}
When I try this code, it's slower (of course), but all the files have content:
private async Task<IList<FtpDataResult>> DownloadFileAsync(FtpFileName ftpFileName)
{
    var address = new Uri(string.Format("ftp://{0}{1}", _server, ftpFileName.FullName));
    var webClient = new WebClient
    {
        Credentials = new NetworkCredential(_username, _password)
    };

    // NOTICE: I've removed the AWAIT and used a different method.
    var bytes = webClient.DownloadData(address);
    using (var stream = new MemoryStream(bytes))
    {
        // extract the stream data (either files in a zip OR a file);
        return result;
    }
}
Can anyone see what I'm doing wrong, please? Why would DownloadFileAsync randomly return zero bytes?
Try out the FtpWebRequest/FtpWebResponse classes. You have more available to you for debugging purposes.
FtpWebRequest - http://msdn.microsoft.com/en-us/library/system.net.ftpwebrequest(v=vs.110).aspx
FtpWebResponse - http://msdn.microsoft.com/en-us/library/system.net.ftpwebresponse(v=vs.110).aspx
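A minimal async sketch along those lines (server address and credentials are placeholders; needs System, System.IO, System.Net, and System.Threading.Tasks). FtpWebResponse.StatusDescription is what gives you the extra debugging visibility:
private static async Task<byte[]> DownloadViaFtpAsync(Uri address, string username, string password)
{
    var request = (FtpWebRequest)WebRequest.Create(address);
    request.Method = WebRequestMethods.Ftp.DownloadFile;
    request.Credentials = new NetworkCredential(username, password);

    using (var response = (FtpWebResponse)await request.GetResponseAsync())
    using (var responseStream = response.GetResponseStream())
    using (var buffer = new MemoryStream())
    {
        await responseStream.CopyToAsync(buffer);
        Console.WriteLine(response.StatusDescription); // e.g. "226 Transfer complete."
        return buffer.ToArray();
    }
}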
Take a look at http://netftp.codeplex.com/. It appears that almost all of its methods implement IAsyncResult. There isn't much documentation on how to get started, but I would assume it is similar to the synchronous FTP classes in the .NET Framework. You can install the NuGet package here: https://www.nuget.org/packages/System.Net.FtpClient/