We have an embedded resource and need to get the MD5 hash of the file before extracting it, in order to know whether it differs from an already existing file (because if we had to extract it just to compare the two, it would be better to replace the file directly).
Any suggestion is appreciated.
What sort of embedded resource is it? If it's one you get hold of using Assembly.GetManifestResourceStream(), then the simplest approach is:
using (Stream stream = Assembly.GetManifestResourceStream(...))
{
using (MD5 md5 = MD5.Create())
{
byte[] hash = md5.ComputeHash(stream);
}
}
If that doesn't help, please give more information as to how you normally access/extract your resource.
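To connect this back to the question (deciding whether the existing file needs replacing), here is a minimal sketch that hashes the resource stream and the file on disk and compares the two. The class and method names (ResourceComparer, HashOf, MatchesFile) are made up for illustration and are not part of any library.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class ResourceComparer
{
    // Returns the MD5 hash of everything remaining in the stream.
    public static byte[] HashOf(Stream stream)
    {
        using (MD5 md5 = MD5.Create())
        {
            return md5.ComputeHash(stream);
        }
    }

    // True when the file at 'path' exists and has the same MD5 hash as 'resource'.
    public static bool MatchesFile(Stream resource, string path)
    {
        if (!File.Exists(path))
        {
            return false;
        }

        byte[] resourceHash = HashOf(resource);
        using (FileStream file = File.OpenRead(path))
        {
            // Arrays do not override Equals, so compare element by element.
            return resourceHash.SequenceEqual(HashOf(file));
        }
    }
}
```

You would pass in the stream returned by GetManifestResourceStream() together with the target path, and extract the resource only when MatchesFile returns false. Note that hashing consumes the stream, so the resource stream has to be reopened (or rewound, if it is seekable) before extracting.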
You can use MemoryStream
using (MemoryStream ms = new MemoryStream(Properties.Resources.MyZipFile))
{
using (System.Security.Cryptography.MD5 md5 = System.Security.Cryptography.MD5.Create())
{
byte[] hash = md5.ComputeHash(ms);
string str = Convert.ToBase64String(hash);
// result for example: WgWKWcyl2YwlF/C8yLU9XQ==
}
}
I am writing a zip file generator which will be consumed by a third party using a specific encryption algorithm
I have found the enumeration of algorithms here:
ICSharpCode.SharpZipLib.Zip.EncryptionAlgorithm
But I don't know how to apply the algorithm to a given zip archive.
Here's my code.
using (FileStream fsOut = File.Create(fullPath + ".zip"))
using (var zipStream = new ZipOutputStream(fsOut))
{
zipStream.SetLevel(3); //0-9, 9 being the highest level of compression
zipStream.Password = "password";
using (MemoryStream memoryStream = new MemoryStream())
using (TextWriter writer = new StreamWriter(memoryStream))
{
// redacted: write data to memoryStream...
var dataEntry = new ZipEntry(fullPath.Split('\\').Last()+".txt");
dataEntry.DateTime = DateTime.Now;
zipStream.PutNextEntry(dataEntry);
memoryStream.WriteTo(zipStream);
zipStream.CloseEntry();
}
}
Edit
DotNetZip allows you to also choose Zip 2.0 PKWARE encryption algorithm.
As I understand from reading the code and forum posts, EncryptionAlgorithm exists to document the values available in the Zip standard and not as an option for the end user.
The encryption algorithms that are actually available to you are AES128 and AES256. You apply the algorithm on each entry by assigning the AESKeySize property.
So in your case:
// Specifying the AESKeySize triggers AES encryption. Allowable values are 0 (off), 128 or 256.
// A password on the ZipOutputStream is required if using AES.
dataEntry.AESKeySize = 256;
(the comments come from this page https://github.com/icsharpcode/SharpZipLib/wiki/Zip-Samples/6dc300804f36f981e516fa477219b0e40c192861)
I'm working on downloading a file and then running an MD5 check to ensure the download was successful. I have the following code, which should work but isn't the most efficient, especially for large files.
using (var client = new System.Net.WebClient())
{
client.DownloadFile(url, destinationFile);
}
var fileHash = GetMD5HashAsStringFromFile(destinationFile);
var successful = expectedHash.Equals(fileHash, StringComparison.OrdinalIgnoreCase);
My concern is that the bytes are all streamed through to disk, and then the MD5 ComputeHash() has to open the file and read all the bytes again. Is there a good, clean way of computing the MD5 as part of the download stream? Ideally, the MD5 should just fall out of the DownloadFile() function as a side effect of sorts. A function with a signature like this:
string DownloadFileAndComputeHash(string url, string filename, HashTypeEnum hashType);
Edit: Adds code for GetMD5HashAsStringFromFile()
public string GetMD5HashAsStringFromFile(string filename)
{
    using (FileStream file = File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (var md5er = System.Security.Cryptography.MD5.Create()) // dispose the hasher as well
    {
        var md5HashBytes = md5er.ComputeHash(file);
        return BitConverter
            .ToString(md5HashBytes)
            .Replace("-", string.Empty)
            .ToLower();
    }
}
Is there a good, clean way of computing the MD5 as part of the download stream? Ideally, the MD5 should just fall out of the DownloadFile() function as a side effect of sorts.
You could follow this strategy, to do "chunked" calculation and minimize memory pressure (and duplication):
1. Open the response stream on the web client.
2. Open the destination file stream.
3. Repeat while there is data available:
   - Read a chunk from the response stream into a byte buffer.
   - Write it to the destination file stream.
   - Use the TransformBlock method to add the bytes to the hash calculation.
4. Use TransformFinalBlock to get the calculated hash code.
The sample code below shows how this could be achieved.
public static byte[] DownloadAndGetHash(Uri file, string destFilePath, int bufferSize)
{
    using (var md5 = MD5.Create())
    using (var client = new System.Net.WebClient())
    {
        using (var src = client.OpenRead(file))
        using (var dest = File.Create(destFilePath, bufferSize))
        {
            md5.Initialize();
            var buffer = new byte[bufferSize];
            while (true)
            {
                var read = src.Read(buffer, 0, buffer.Length);
                if (read > 0)
                {
                    dest.Write(buffer, 0, read);
                    md5.TransformBlock(buffer, 0, read, null, 0);
                }
                else // reached the end.
                {
                    md5.TransformFinalBlock(buffer, 0, 0);
                    return md5.Hash;
                }
            }
        }
    }
}
If you're talking about large files (I'm assuming over 1GB), you'll want to read the data in chunks, then process each chunk through the MD5 algorithm, and then store it to the disk. It's doable, but I don't know how much of the default .NET classes will help you with that.
One approach might be with a custom stream wrapper. First you get a Stream from WebClient (via GetWebResponse() and then GetResponseStream()), then you wrap it, and then pass it to ComputeHash(stream). When MD5 calls Read() on your wrapper, the wrapper would call Read on the network stream, write the data out when it's received, and then pass it back to MD5.
I don't know what problems would await you if you try and do this.
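For what it's worth, the wrapper idea could be sketched roughly like this. TeeReadStream and HashWhileCopying are hypothetical names invented for this example; the wrapper is read-only and forward-only, and it does not close the streams it is given.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical wrapper: every byte read from 'source' is also written to 'sink'.
// It does not own either stream, so the caller remains responsible for disposing them.
public sealed class TeeReadStream : Stream
{
    private readonly Stream _source;
    private readonly Stream _sink;

    public TeeReadStream(Stream source, Stream sink)
    {
        _source = source;
        _sink = sink;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _source.Read(buffer, offset, count);
        if (read > 0)
        {
            _sink.Write(buffer, offset, read); // copy what was just read to the sink
        }
        return read;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() => _sink.Flush();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class TeeHashing
{
    // Hashes 'source' with MD5 while copying it to 'sink' in a single pass.
    public static byte[] HashWhileCopying(Stream source, Stream sink)
    {
        using (var md5 = MD5.Create())
        using (var tee = new TeeReadStream(source, sink))
        {
            return md5.ComputeHash(tee);
        }
    }
}
```

Here 'source' would be the network response stream and 'sink' the destination FileStream. ComputeHash drives the reads, so the file is written as a side effect of hashing.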
Something like this.
byte[] result;
using (var webClient = new System.Net.WebClient())
{
result = webClient.DownloadData("http://some.url");
}
byte[] hash = ((HashAlgorithm)CryptoConfig.CreateFromName("MD5")).ComputeHash(result);
I'm working on an encryptor application based on the RSA asymmetric algorithm.
It generates a key pair, and the user has to keep it.
As key pairs are long random strings, I want to create a function that lets me compress the generated strings based on a pattern.
(For example, the function takes a string of 100 characters and returns a string of 30 characters.)
Then, when the user enters the compressed string, I can regenerate the key pair based on the pattern I compressed it with.
But someone told me that it is impossible to compress random things, precisely because they are random.
What do you think?
Is there any way to do this?
Thanks
It's impossible to compress (nearly any) random data. Learning a bit about information theory, entropy, how compression works, and the pigeonhole principle will make this abundantly clear.
One exception to this rule is if by "random string", you mean, "random data represented in a compressible form, like hexadecimal". In this sort of scenario, you could compress the string or (the better option) simply encode the bytes as base 64 instead to make it shorter. E.g.
// base 16, 50 random bytes (length 100)
be01a140ac0e6f560b1f0e4a9e5ab00ef73397a1fe25c7ea0026b47c213c863f88256a0c2b545463116276583401598a0c36
// base 64, same 50 random bytes (length 68)
vgGhQKwOb1YLHw5KnlqwDvczl6H+JcfqACa0fCE8hj+IJWoMK1RUYxFidlg0AVmKDDY=
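The re-encoding shown above (hex to base 64) could be done along these lines. This is only a sketch, and KeyEncoding, HexToBytes and HexToBase64 are names invented for the example.

```csharp
using System;

public static class KeyEncoding
{
    // Parses a hex string (two characters per byte) into raw bytes.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.Length % 2 != 0)
        {
            throw new ArgumentException("Hex string must have an even length.", nameof(hex));
        }

        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    // Re-encodes the same bytes as base 64: 4 characters per 3 bytes instead of 2 per byte.
    public static string HexToBase64(string hex)
    {
        return Convert.ToBase64String(HexToBytes(hex));
    }
}
```

This is a change of representation, not compression: the underlying 50 bytes are untouched, and Convert.FromBase64String gives them back exactly.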
You might instead give the user a shorter hash or fingerprint of the value (e.g. the last x bytes). Then by storing the full key and hash somewhere, you could give them the key when they give you the hash. You'd have to have this hash be long enough that security is not compromised. Depending on your application, this might defeat the purpose because the hash would have to be as long as the key, or it might not be a problem.
public static string ZipStr(String str)
{
using (MemoryStream output = new MemoryStream())
{
using (DeflateStream gzip =
new DeflateStream(output, CompressionMode.Compress))
{
using (StreamWriter writer =
new StreamWriter(gzip, System.Text.Encoding.UTF8))
{
writer.Write(str);
}
}
return Convert.ToBase64String(output.ToArray());
}
}
public static string UnZipStr(string base64)
{
byte[] input = Convert.FromBase64String(base64);
using (MemoryStream inputStream = new MemoryStream(input))
{
using (DeflateStream gzip =
new DeflateStream(inputStream, CompressionMode.Decompress))
{
using (StreamReader reader =
new StreamReader(gzip, System.Text.Encoding.UTF8))
{
return reader.ReadToEnd();
}
}
}
}
Keep in mind that the result is not guaranteed to be shorter at all; it depends on the contents of the string.
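To make that caveat concrete, here is a small sketch that deflates a byte array so the output sizes can be compared. CompressionDemo and Deflate are names made up for this example; the code works on raw bytes rather than strings to keep the comparison simple.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressionDemo
{
    // Returns the Deflate-compressed form of 'data'.
    public static byte[] Deflate(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            {
                deflate.Write(data, 0, data.Length);
            } // disposing the DeflateStream flushes the final compressed block

            return output.ToArray();
        }
    }
}
```

Feeding it 1000 identical bytes should give an output of only a few dozen bytes, whereas 1000 bytes from RandomNumberGenerator will typically come out slightly larger than the input, because there is no redundancy to remove and the format adds a little overhead.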
Try to use gzip compression and see if it helps you
I'm using iTextSharp to read the text from a PDF file. However, there are times when I cannot extract any text, because the PDF file contains only images. I download the same PDF files every day, and I want to see whether a PDF has been modified. If the text and modification date cannot be obtained, is an MD5 checksum the most reliable way to tell if the file has changed?
If it is, some code samples would be appreciated, because I don't have much experience with cryptography.
It's very simple using System.Security.Cryptography.MD5:
using (var md5 = MD5.Create())
{
using (var stream = File.OpenRead(filename))
{
return md5.ComputeHash(stream);
}
}
(I believe that actually the MD5 implementation used doesn't need to be disposed, but I'd probably still do so anyway.)
How you compare the results afterwards is up to you; you can convert the byte array to base64 for example, or compare the bytes directly. (Just be aware that arrays don't override Equals. Using base64 is simpler to get right, but slightly less efficient if you're really only interested in comparing the hashes.)
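As a rough illustration of the two comparison options just mentioned, the helper below compares hashes byte by byte and via base64. HashComparison, SameHash and SameHashViaBase64 are names invented for this sketch.

```csharp
using System;
using System.Linq;

public static class HashComparison
{
    // Compares the contents of two hash arrays. a.Equals(b) would only
    // compare references, because arrays do not override Equals.
    public static bool SameHash(byte[] a, byte[] b)
    {
        return a != null && b != null && a.SequenceEqual(b);
    }

    // Alternative: compare the base64 text forms, which is convenient
    // when the previous hash was stored as a string.
    public static bool SameHashViaBase64(byte[] a, byte[] b)
    {
        return Convert.ToBase64String(a) == Convert.ToBase64String(b);
    }
}
```

Both methods return true for two separate arrays holding the same bytes, which is exactly the case where a plain Equals call would return false.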
If you need to represent the hash as a string, you could convert it to hex using BitConverter:
static string CalculateMD5(string filename)
{
using (var md5 = MD5.Create())
{
using (var stream = File.OpenRead(filename))
{
var hash = md5.ComputeHash(stream);
return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
}
}
}
This is how I do it:
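using System.IO;
using System.Security.Cryptography;
using System.Text; // needed for Encoding

public string checkMD5(string filename)
{
    using (var md5 = MD5.Create())
    {
        using (var stream = File.OpenRead(filename))
        {
            // Caution: this decodes the raw hash bytes as text. Arbitrary bytes are not
            // guaranteed to round-trip through a text encoding, so prefer hex or base64
            // if the string is going to be stored or compared.
            return Encoding.Default.GetString(md5.ComputeHash(stream));
        }
    }
}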
I know this question was already answered, but this is what I use:
using (FileStream fStream = File.OpenRead(filename)) {
    return GetHash<MD5>(fStream);
}
Where GetHash:
public static String GetHash<T>(Stream stream) where T : HashAlgorithm {
StringBuilder sb = new StringBuilder();
MethodInfo create = typeof(T).GetMethod("Create", new Type[] {});
using (T crypt = (T) create.Invoke(null, null)) {
byte[] hashBytes = crypt.ComputeHash(stream);
foreach (byte bt in hashBytes) {
sb.Append(bt.ToString("x2"));
}
}
return sb.ToString();
}
Probably not the best way, but it can be handy.
Here is a slightly simpler version that I found. It reads the entire file into memory in one go and only requires a single using statement.
byte[] ComputeHash(string filePath)
{
using (var md5 = MD5.Create())
{
return md5.ComputeHash(File.ReadAllBytes(filePath));
}
}
I know I'm late to the party, but I ran a test before actually implementing the solution.
I tested the built-in MD5 class against md5sum.exe. In my case the built-in class took 13 seconds, whereas md5sum.exe took around 16-18 seconds on every run.
DateTime current = DateTime.Now;
string file = @"C:\text.iso"; // a 2.5 GB file
string output;
using (var md5 = MD5.Create())
{
using (var stream = File.OpenRead(file))
{
byte[] checksum = md5.ComputeHash(stream);
output = BitConverter.ToString(checksum).Replace("-", String.Empty).ToLower();
Console.WriteLine("Total seconds : " + (DateTime.Now - current).TotalSeconds.ToString() + " " + output);
}
}
For dynamically generated PDFs, the creation and modification dates will always be different.
You have to remove them or set them to a constant value, and only then generate the MD5 hashes to compare.
You can use PdfStamper to remove or update the dates.
In addition to the methods answered above, if you're comparing PDFs you need to amend the creation and modified dates, or the hashes won't match.
For PDFs generated with QuestPDF, you'll need to override the CreationDate and ModifiedDate in the document metadata.
public class PdfDocument : IDocument
{
...
DocumentMetadata GetMetadata()
{
return new()
{
CreationDate = DateTime.MinValue,
ModifiedDate = DateTime.MinValue,
};
}
...
}
https://www.questpdf.com/concepts/document-metadata.html
I am trying to use BouncyCastle to encrypt a file using the PKCS 7 file standard. Here is the code I have which outputs a p7m file. When I go to decrypt the file (using Entrust) I am prompted for my key store password, so it knows the file was encrypted for me using AES 128, but it cannot decrypt the body of the file. Something has to be going wrong on the encrypt.
byte[] fileContent = readFile(filename);
FileStream outStream = null;
Stream cryptoStream = null;
BinaryWriter binWriter = null;
try
{
CmsEnvelopedDataStreamGenerator dataGenerator = new CmsEnvelopedDataStreamGenerator();
dataGenerator.AddKeyTransRecipient(cert); //cert is the user's x509cert that i am encrypting for
outStream = new FileStream(filename + ".p7m", FileMode.Create);
cryptoStream = dataGenerator.Open(outStream, CmsEnvelopedGenerator.Aes128Cbc);
binWriter = new BinaryWriter(cryptoStream);
binWriter.Write(fileContent);
}
And when I try to decrypt the file using BouncyCastle, I get this error when I pass the file contents to a CMSEnveloped object:
IOException converting stream to byte array: Attempted to read past the end of the stream.
Any ideas what's going on here?
I used the EnvelopedCms class to accomplish this.
http://msdn.microsoft.com/en-us/library/bb924575(VS.90).aspx