Parallel hash computing via multiple TransformBlocks results in a disarray - c#

I'm trying to compute hashes for a whole directory, in order to monitor changes later. It's relatively easy. However, if there are big files, the computing takes too much time, so I wound up using some multithreading.
Because of I/O bottlenecks, I should read a file with one thread, but I thought I could calculate the hash for that file in multiple threads by calling TransformBlock from all of them at once. The problem is that the result of each calculation is different: because all the threads update one instance of a HashAlgorithm, the blocks are fed to it in an unpredictable order.
public delegate void CalculateHashDelegate(byte[] buffer);
private MD5 md5;
private long completed_threads_hash;
private object lock_for_hash = new object();
private object completed_threads_lock = new object();
private string getMd5Hash(string file_path)
{
string file_to_be_hashed = file_path;
byte[] hash;
try
{
CalculateHashDelegate CalculateHash = AsyncCalculateHash;
md5 = MD5.Create();
using (Stream input = File.OpenRead(file_to_be_hashed))
{
int buffer_size = 0x4096;
byte[] buffer = new byte[buffer_size];
long part_count = 0;
completed_threads_hash = 0;
int bytes_read;
while ((bytes_read = input.Read(buffer, 0, buffer.Length)) == buffer_size)
{
part_count++;
IAsyncResult ar_hash = CalculateHash.BeginInvoke(buffer, CalculateHashCallback, CalculateHash);
}
// Wait for completing all the threads
while (true)
{
lock (completed_threads_lock)
{
if (completed_threads_hash == part_count)
{
md5.TransformFinalBlock(buffer, 0, bytes_read);
break;
}
}
}
hash = md5.Hash;
}
StringBuilder sb = new StringBuilder();
for (int i = 0; i < hash.Length; i++)
{
sb.Append(hash[i].ToString("x2"));
}
md5.Clear();
return sb.ToString();
}
catch (Exception ex)
{
Console.WriteLine("An exception was encountered during hashing file {0}. {1}.", file_to_be_hashed, ex.Message);
return ex.Message;
}
}
public void AsyncCalculateHash(byte[] buffer)
{
lock (lock_for_hash)
{
md5.TransformBlock(buffer, 0, buffer.Length, null, 0);
}
}
private void CalculateHashCallback(IAsyncResult ar_hash)
{
try
{
CalculateHashDelegate CalculateHash = ar_hash.AsyncState as CalculateHashDelegate;
CalculateHash.EndInvoke(ar_hash);
}
catch (Exception ex)
{
Console.WriteLine("Callback exception: {0}", ex.Message);
}
finally
{
lock (completed_threads_lock)
{
completed_threads_hash++;
}
}
}
Is there a way to organize the hashing process? I can't use a .NET version newer than 3.5, or classes such as BackgroundWorker and ThreadPool. Or maybe there is another method for calculating hashes in parallel?

Generally you cannot share a single cryptographic object between threads. The problem with hash methods is that they are fully linear: each block of hashing depends on the current state, and the state is calculated from all the previous blocks. So basically, you cannot do this for MD5.
There is another process that can be used, and it is called a hash tree or Merkle tree. Basically you decide on a block size and calculate the hashes for the blocks. These hashes are put together and hashed again. If you have a very large number of hashes you may actually create a tree, as described in the Wikipedia article on Merkle trees. Of course the resulting hash is different from plain MD5 and depends on the configuration parameters of the hash tree.
Note that MD5 has been broken. You should be using SHA-256 or SHA-512/xxx (faster on 64-bit processors) instead. Also note that IO speed is often more of a bottleneck than the speed of the hash algorithm, negating any speed advantage of hash trees. If you have many files, you could also parallelize the hashing at the file level.
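To make the hash tree idea concrete, here is a minimal two-level sketch. The class and method names (HashTreeSketch, ComputeRootHash) and the choice of SHA-256 are assumptions for illustration only. The leaves are hashed sequentially here to keep it short; the point is that each leaf uses its own hash object, so nothing is shared and the leaf work could be split across threads.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class HashTreeSketch
{
    // Hypothetical helper: hashes each fixed-size block on its own, then
    // hashes the concatenated block digests to get a single root digest.
    public static byte[] ComputeRootHash(Stream input, int blockSize)
    {
        using (var leaves = new MemoryStream())
        {
            byte[] block = new byte[blockSize];
            int filled;
            while ((filled = FillBlock(input, block)) > 0)
            {
                // Each leaf gets its own SHA256 instance, so leaf hashing has
                // no shared state and could run on separate worker threads.
                using (SHA256 sha = SHA256.Create())
                {
                    byte[] leaf = sha.ComputeHash(block, 0, filled);
                    leaves.Write(leaf, 0, leaf.Length);
                }
            }
            // Root digest: a hash over all leaf digests in block order.
            using (SHA256 sha = SHA256.Create())
            {
                return sha.ComputeHash(leaves.ToArray());
            }
        }
    }

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the block is full or the stream ends.
    static int FillBlock(Stream input, byte[] block)
    {
        int total = 0;
        while (total < block.Length)
        {
            int n = input.Read(block, total, block.Length - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

The result only matches another run that uses the same block size, and it is not the plain SHA-256 (or MD5) of the file, so the block size has to be stored or fixed alongside the hashes you keep for later comparison.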

Related

SerialPort.ReadLine() slow compared to manual method?

I've recently implemented a small program which reads data coming from a sensor and plots it as a diagram.
The data comes in as chunks of 5 bytes, roughly every 500 µs (baudrate: 500000). Around 3000 chunks make up a complete line. So the total transmission time is around 1.5 s.
As I was looking at the live diagram I noticed a severe lag between what is shown and what is currently measured. Investigating, it all boiled down to:
SerialPort.ReadLine();
It takes around 0.5 s longer than the line takes to be transmitted, so each line read takes around 2 s. Interestingly, no data is lost; the display just lags further behind with each new line read. This is very irritating for the user, so I couldn't leave it like that.
I've implemented my own variant and it shows a consistent time of around 1.5 s, and no lag occurs. I'm not really proud of my implementation (more or less polling the BaseStream) and I'm wondering if there is a way to speed up the ReadLine function of the SerialPort class. With my implementation I'm also getting some corrupted lines, and haven't found the exact issue yet.
I've tried changing the ReadTimeout to 1600, but that just produced a TimeoutException, although the data arrived.
Any explanation as to why it is slow, or a way to fix it, is appreciated.
As a side-note: I've tried this on a Console application with only SerialPort.ReadLine() as well and the result is the same, so I'm ruling out my own application affecting the SerialPort.
I'm not sure this is relevant, but my implementation looks like this:
LineSplitter lineSplitter = new LineSplitter();
async Task<string> SerialReadLineAsync(SerialPort serialPort)
{
byte[] buffer = new byte[5];
string ret = string.Empty;
while (true)
{
try
{
int bytesRead = await serialPort.BaseStream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false);
byte[] line = lineSplitter.OnIncomingBinaryBlock(this, buffer, bytesRead);
if (null != line)
{
return Encoding.ASCII.GetString(line).TrimEnd('\r', '\n');
}
}
catch
{
return string.Empty;
}
}
}
With LineSplitter being the following:
class LineSplitter
{
// based on: http://www.sparxeng.com/blog/software/reading-lines-serial-port
public byte Delimiter = (byte)'\n';
byte[] leftover;
public byte[] OnIncomingBinaryBlock(object sender, byte[] buffer, int bytesInBuffer)
{
leftover = ConcatArray(leftover, buffer, 0, bytesInBuffer);
int newLineIndex = Array.IndexOf(leftover, Delimiter);
if (newLineIndex >= 0)
{
byte[] result = new byte[newLineIndex+1];
Array.Copy(leftover, result, result.Length);
byte[] newLeftover = new byte[leftover.Length - result.Length];
Array.Copy(leftover, newLineIndex + 1, newLeftover, 0, newLeftover.Length);
leftover = newLeftover;
return result;
}
return null;
}
static byte[] ConcatArray(byte[] head, byte[] tail, int tailOffset, int tailCount)
{
byte[] result;
if (head == null)
{
result = new byte[tailCount];
Array.Copy(tail, tailOffset, result, 0, tailCount);
}
else
{
result = new byte[head.Length + tailCount];
head.CopyTo(result, 0);
Array.Copy(tail, tailOffset, result, head.Length, tailCount);
}
return result;
}
}
I ran into this issue in 2008 talking to GPS modules. Essentially the blocking functions are flaky and the solution is to use APM (the Asynchronous Programming Model, i.e. BeginRead/EndRead on the port's BaseStream).
Here are the gory details in another Stack Overflow answer: How to do robust SerialPort programming with .NET / C#?
You may also find this of interest: How to kill off a pending APM operation
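For reference, an APM read loop looks roughly like the sketch below. It is written against a plain Stream, so it would be handed serialPort.BaseStream in the scenario above; the ApmReader class, its 4096-byte buffer and the onData callback are hypothetical names and values chosen for illustration, not something taken from the linked answers.

```csharp
using System;
using System.IO;

class ApmReader
{
    readonly Stream stream;
    readonly byte[] buffer = new byte[4096];
    readonly Action<byte[]> onData;

    public ApmReader(Stream stream, Action<byte[]> onData)
    {
        this.stream = stream;
        this.onData = onData;
    }

    // Queues one asynchronous read; OnRead runs when bytes are available.
    public void Start()
    {
        stream.BeginRead(buffer, 0, buffer.Length, OnRead, null);
    }

    void OnRead(IAsyncResult ar)
    {
        int n;
        try
        {
            n = stream.EndRead(ar);
        }
        catch (IOException) { return; }              // I/O error: stop reading
        catch (ObjectDisposedException) { return; }  // stream was closed: stop reading

        if (n == 0) return;                          // end of stream

        // Hand the caller a copy, because 'buffer' is reused for the next read.
        byte[] chunk = new byte[n];
        Buffer.BlockCopy(buffer, 0, chunk, 0, n);
        onData(chunk);

        Start();                                     // queue the next read
    }
}
```

The callback receives whatever number of bytes happened to arrive, so a line splitter such as the one in the question is still needed on top of it. The callback generally runs on a worker thread, not the UI thread.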

Most Efficient Way To Consolidate Multiple Byte Arrays Received Over Async Network?

Basically, what I'm looking to do is find an effective means of consolidating large amounts of data that would be too large for a suitable buffer size.
For something like an instant messenger, setting a fixed buffer size is a fine solution, as most people accept that instant messages tend to have a limit.
However if I were to want to send a whole text document multiple pages long, you would not want to have to send it 2048 characters at a time. Or whatever you define as the limit.
I've represented my current solution in some pseudo code:
public class Pseudo
{
public const int m_BufferSize = 255;
public void SendAllData(Socket socket, byte[] data)
{
int byteCount = data.Length;
if (byteCount <= m_BufferSize)
{
Socket.Send(data);
}
else if (byteCount > m_BufferSize)
{
int wholeSegments = Math.Floor(byteCount / m_BufferSize);
int remainingBytes = byteCount % m_BufferSize;
int bytesSent = 0;
int currentSegment = 1;
string id = Guid.NewGuid();
byte[] tempData = data;
//Send initial packet creating the data handler object.
Socket.SendInitial(id);
while (bytesSent < byteCount)
{
if (currentSegment <= wholeSegments)
{
Socket.Send(tempData[m_BufferSize]);
tempData.CopyTo(tempData, m_BufferSize);
bytesSent += m_BufferSize;
}
else
{
Socket.Send(tempData[remainingBytes]);
bytesSent += remainingBytes;
}
currentSegment++;
}
//Let The Server Know Send Is Over And To Consolidate;
Socket.SendClose(id);
}
}
internal class DataHandler
{
string m_Identity;
List<byte[]> m_DataSegments = new List<byte[]>();
static Dictionary<string, DataHandler>
m_HandlerPool = new Dictionary<string, DataHandler>();
public DataHandler(string id)
{
m_Identity = id;
if (!m_HandlerPool.ContainsKey(id))
{
m_HandlerPool.Add(this);
}
}
public void AddDataSegment(byte[] segment)
{
m_DataSegments.Add(segment);
}
public byte[] Consolidate(string id)
{
var handler = m_HandlerPool(id);
List<byte> temp = new List<byte[]>();
for (int i = handler.m_DataSegments.Length; i >= 0; i--)
{
temp.Add(handler.m_DataSegments[i]);
}
Dispose();
return temp;
}
void Dispose()
{
m_DataSegments = null;
m_HandlerPool.Remove(this);
}
}
}
Basically what this is doing is assigning an identifier to individual packets so that they can be matched up when using AsyncEventArgs, as they may not necessarily all be received without being interrupted, so I can't really rely on index.
These are then stored in the object 'DataHandler' and consolidated into a single byte array.
The problem is, as you can tell, it's going to add a lot of overhead in what I had hoped to be a high-performance socket server. Even if I were to pool the handler objects, the whole thing feels crufty.
Edit: It's also going to require a delimiter which I really don't want to use.
So, what would be the most efficient way of accomplishing this?
Edit: Example code for the method of processing data, this came from one of the async code projects.
internal void ProcessData(SocketAsyncEventArgs args)
{
// Get the message received from the client.
String received = this.sb.ToString();
Console.WriteLine("Received: \"{0}\". The server has read {1} bytes.", received, received.Length);
Byte[] sendBuffer = Encoding.ASCII.GetBytes(received);
args.SetBuffer(sendBuffer, 0, sendBuffer.Length);
// Clear StringBuffer, so it can receive more data from a keep-alive connection client.
sb.Length = 0;
this.currentIndex = 0;
}
This is populating the user token data which is what is referenced by this.
// Set return buffer.
token.ProcessData(e);
I assume you're using TCP/IP - and if so, I don't understand the problem you're trying to fix. You can keep sending data as long as the connection is stable. Under the hood, TCP/IP will automatically create numbered packets for you, and ensure they arrive in the same order they were sent.
On the receiving end, you will have to read to a buffer of a certain size, but you can immediately write the received data to a MemoryStream (or FileStream if you intend to store the data on disk).
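A minimal sketch of that receiving side, assuming the socket is wrapped in a NetworkStream and the sender closes (or shuts down) the connection when the message is complete. ReceiveSketch and CopyAll are hypothetical names. If several messages share one connection you would additionally need some framing, for example a length prefix, which is an assumption beyond what is described here.

```csharp
using System.IO;

static class ReceiveSketch
{
    // Reads from 'source' through one reusable fixed-size buffer and appends
    // every chunk to 'destination' until the sender closes the connection.
    public static long CopyAll(Stream source, Stream destination, int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        // Read returns 0 only when the other side has finished sending.
        while ((n = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```

Usage would be along the lines of CopyAll(new NetworkStream(socket), memoryStream, 2048). The buffer size only bounds how much is read per call; it places no limit on the total message size.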
A high-performance server usually:
- uses async send/receive methods
- doesn't keep all received data in memory (if you need to store a lot of data, use some blob storage service)
- if needed, keeps out-of-order pieces in a reusable context and forwards them to storage as soon as possible
- reuses buffers instead of allocating a new buffer on each receive/send
- etc.

.NET Native incredibly slower than Debug build with ReadAsync calls

I just found a really weird issue in my app, and it turns out it was caused by the .NET Native compiler for some reason.
I have a method that compares the content of two files, and it works fine. With two 400 KB files, it takes about 0.4 seconds to run on my Lumia 930 in Debug mode. But in Release mode it takes up to 17 seconds for no apparent reason. Here's the code:
// Compares the content of the two streams
private static async Task<bool> ContentEquals(ulong size, [NotNull] Stream fileStream, [NotNull] Stream testStream)
{
// Initialization
const int bytes = 8;
int iterations = (int)Math.Ceiling((double)size / bytes);
byte[] one = new byte[bytes];
byte[] two = new byte[bytes];
// Read all the bytes and compare them 8 at a time
for (int i = 0; i < iterations; i++)
{
await fileStream.ReadAsync(one, 0, bytes);
await testStream.ReadAsync(two, 0, bytes);
if (BitConverter.ToUInt64(one, 0) != BitConverter.ToUInt64(two, 0)) return false;
}
return true;
}
/// <summary>
/// Checks if the content of two files is the same
/// </summary>
/// <param name="file">The source file</param>
/// <param name="test">The file to test</param>
public static async Task<bool> ContentEquals([NotNull] this StorageFile file, [NotNull] StorageFile test)
{
// If the two files have a different size, just stop here
ulong size = await file.GetFileSizeAsync();
if (size != await test.GetFileSizeAsync()) return false;
// Open the two files to read them
try
{
// Direct streams
using (Stream fileStream = await file.OpenStreamForReadAsync())
using (Stream testStream = await test.OpenStreamForReadAsync())
{
return await ContentEquals(size, fileStream, testStream);
}
}
catch (UnauthorizedAccessException)
{
// Copy streams
StorageFile fileCopy = await file.CreateCopyAsync(ApplicationData.Current.TemporaryFolder);
StorageFile testCopy = await test.CreateCopyAsync(ApplicationData.Current.TemporaryFolder);
using (Stream fileStream = await fileCopy.OpenStreamForReadAsync())
using (Stream testStream = await testCopy.OpenStreamForReadAsync())
{
// Compare the files
bool result = await ContentEquals(size, fileStream, testStream);
// Delete the temp files at the end of the operation
Task.Run(() =>
{
fileCopy.DeleteAsync(StorageDeleteOption.PermanentDelete).Forget();
testCopy.DeleteAsync(StorageDeleteOption.PermanentDelete).Forget();
}).Forget();
return result;
}
}
}
Now, I have absolutely no idea why this exact same method goes from 0.4 seconds all the way up to more than 15 seconds when compiled with the .NET Native toolchain.
I fixed this issue using a single ReadAsync call to read the entire files, then I generated two MD5 hashes from the results and compared the two. This approach worked in around 0.4 seconds on my Lumia 930 even in Release mode.
Still, I'm curious about this issue and I'd like to know why it was happening.
Thank you in advance for your help!
EDIT: so I've tweaked my method in order to reduce the number of actual IO operations, this is the result and it looks like it's working fine so far.
private static async Task<bool> ContentEquals(ulong size, [NotNull] Stream fileStream, [NotNull] Stream testStream)
{
// Initialization
const int bytes = 102400;
int iterations = (int)Math.Ceiling((double)size / bytes);
byte[] first = new byte[bytes], second = new byte[bytes];
// Read the streams in large chunks and compare each chunk
for (int i = 0; i < iterations; i++)
{
// Read the next data chunk
int[] counts = await Task.WhenAll(fileStream.ReadAsync(first, 0, bytes), testStream.ReadAsync(second, 0, bytes));
if (counts[0] != counts[1]) return false;
int target = counts[0];
// Compare the first bytes 8 at a time
int j;
for (j = 0; j + 8 <= target; j += 8)
{
if (BitConverter.ToUInt64(first, j) != BitConverter.ToUInt64(second, j)) return false;
}
// Compare the bytes in the last chunk if necessary
while (j < target)
{
if (first[j] != second[j]) return false;
j++;
}
}
return true;
}
Reading eight bytes at a time from an I/O device is a performance disaster. That's why we are using buffered reading (and writing) in the first place. It takes time for an I/O request to be submitted, processed, executed and finally returned.
OpenStreamForReadAsync appears to not be using a buffered stream. So your 8-byte requests are actually requesting 8 bytes at a time. Even with a solid-state drive, this is very slow.
You don't need to read the whole file at once, though. The usual approach is to find a reasonable buffer size to pre-read; something like reading 1 KiB at a time should fix your whole issue without requiring you to load the whole file in memory at once. You can use BufferedStream between the file and your reading to handle this for you. And if you're feeling adventurous, you could issue the next read request before the CPU processing is done, though it's very likely that this isn't going to help your performance much, given how much of the work is just I/O.
It also seems that .NET Native has a lot bigger overhead than managed .NET for asynchronous I/O in the first place, which would make those tiny asynchronous calls all the more of a problem. Fewer requests of larger data will help.
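As a rough sketch of the BufferedStream suggestion (synchronous for brevity, unlike the async code in the question; StreamCompareSketch and the bufferSize parameter are illustrative names, not part of the original code):

```csharp
using System.IO;

static class StreamCompareSketch
{
    // Compares two streams byte by byte. The BufferedStream wrappers turn the
    // many tiny reads below into a few large reads on the underlying streams.
    // Note: disposing a BufferedStream also disposes the stream it wraps.
    public static bool ContentEquals(Stream a, Stream b, int bufferSize)
    {
        using (var bufferedA = new BufferedStream(a, bufferSize))
        using (var bufferedB = new BufferedStream(b, bufferSize))
        {
            while (true)
            {
                int x = bufferedA.ReadByte();
                int y = bufferedB.ReadByte();
                if (x != y) return false;   // differing byte, or one stream ended early
                if (x == -1) return true;   // both streams ended at the same point
            }
        }
    }
}
```

With this shape the comparison logic can stay as fine-grained as you like, because the granularity of the actual I/O requests is set by bufferSize and not by the loop.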

C# Which is faster: SerialPort.Write(byte[], int, int) or SerialPort.BaseStream.WriteByte()?

I am working on some code that controls a device via serial port (over USB). This application will need to be able to run in the background, yet push a fair amount of data continuously over the port, so I need the data writing code to be reasonably fast. My code pushes its data in large batches ("frames") multiple times per second. Currently, I have a Queue that is used to generate the commands that need to be sent for the current frame. Would it be faster to just iterate through the Queue and push my commands one at a time using SerialPort.BaseStream.WriteByte(byte), or use the Queue to build up a byte array and send it all at once using SerialPort.Write(byte[], int, int)?
Some example code, if my description is confusing:
Which is faster, this:
public void PushData(List<KeyValuePair<Address, byte>> data) {
Queue<DataPushAction> actions = BuildActionQueue(data);
foreach(var item in actions) {
port.BaseStream.WriteByte(item.Value);
}
}
or this:
public void PushData(List<KeyValuePair<Address, byte>> data) {
Queue<DataPushAction> actions = BuildActionQueue(data);
byte[] buffer = actions.Select(x => x.Value).ToArray();
port.Write(buffer, 0, buffer.Length);
}
Update
Upon closer inspection of the source code, it seems that both methods are the same (WriteByte just uses a temporary array with one element and is otherwise the same as Write). However, this doesn't actually answer the question, just rephrases it: Is it faster to write many small arrays of bytes or one large one?
By modifying the code in Rion's answer below, I was able to test this and got some surprising results. I used the following code (modified from Rion, thanks):
class Program {
static void Main(string[] args) {
// Create a stopwatch for performance testing
var stopwatch = new Stopwatch();
// Test content
var data = GetTestingBytes();
var ports = SerialPort.GetPortNames();
using (SerialPort port = new SerialPort(ports[0], 115200, Parity.None, 8, StopBits.One)) {
port.Open();
// Looping Test
stopwatch.Start();
foreach (var item in data) {
port.BaseStream.WriteByte(item);
}
stopwatch.Stop();
Console.WriteLine($"Loop Test: {stopwatch.Elapsed}");
stopwatch.Start();
port.Write(data, 0, data.Length);
stopwatch.Stop();
Console.WriteLine($"All At Once Test: {stopwatch.Elapsed}");
}
Console.Read();
}
static byte[] GetTestingBytes() {
var str = String.Join(",", Enumerable.Range(0, 1000).Select(x => Guid.NewGuid()).ToArray());
byte[] bytes = new byte[str.Length * sizeof(char)];
System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
return bytes;
}
}
The results were extremely surprising: the method that takes an array of bytes took almost exactly twice as long, at 12.5818728 seconds, compared to 6.2935748 seconds when just calling WriteByte repeatedly. This was the opposite of what I was expecting. Either way, I wasn't expecting one method to be twice as fast as the other!
If anyone can figure out why this is the case, I would love to know!
Based on a quick glance at the source, it looks like the SerialPort.Write() method actually just points to the Write() method of the underlying stream anyways :
public void Write(byte[] buffer, int offset, int count)
{
if (!IsOpen)
throw new InvalidOperationException(SR.GetString(SR.Port_not_open));
if (buffer==null)
throw new ArgumentNullException("buffer", SR.GetString(SR.ArgumentNull_Buffer));
if (offset < 0)
throw new ArgumentOutOfRangeException("offset", SR.GetString(SR.ArgumentOutOfRange_NeedNonNegNumRequired));
if (count < 0)
throw new ArgumentOutOfRangeException("count", SR.GetString(SR.ArgumentOutOfRange_NeedNonNegNumRequired));
if (buffer.Length - offset < count)
throw new ArgumentException(SR.GetString(SR.Argument_InvalidOffLen));
if (buffer.Length == 0) return;
internalSerialStream.Write(buffer, offset, count, writeTimeout);
}
If I had to wager a guess, I would assume that the difference between the two performance-wise might be negligible. I suppose if you had some test data, you could create a Stopwatch to actually time the differences between both approaches.
Update with Performance Tests
I didn't test this using SerialPort objects, but instead opted for basic MemoryStream ones as the real heart of the question seemed to be if writing bytes in a loop is more performant than writing them using a byte[].
For test data, I simply generated 1000 random Guid objects and concatenated them into a string :
static byte[] GetTestingBytes()
{
var str = String.Join(",", Enumerable.Range(0, 1000).Select(x => Guid.NewGuid()).ToArray());
byte[] bytes = new byte[str.Length * sizeof(char)];
System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
return bytes;
}
As far as the tests themselves go :
// Create a stopwatch for performance testing
var stopwatch = new Stopwatch();
// Test content
var data = GetTestingBytes();
// Looping Test
using (var loop = new MemoryStream())
{
stopwatch.Start();
foreach (var item in data)
{
loop.WriteByte(item);
}
stopwatch.Stop();
Console.WriteLine($"Loop Test: {stopwatch.Elapsed}");
}
// Buffered Test
using (var buffer = new MemoryStream())
{
stopwatch.Start();
buffer.Write(data, 0, data.Length);
stopwatch.Stop();
Console.WriteLine($"Buffer Test: {stopwatch.Elapsed}");
}
After running the tests a few times, the averages after 1000 tests broke down as follows :
LOOP: 00:00:00.0976584
BUFFER: 00:00:00.0976629
So the looping approach, at least in the context of using MemoryStream objects, appears to be the winner.
You can see the entire testing code here if you want to run it yourself.
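One detail worth checking in both test listings above: Stopwatch.Start() resumes a stopped stopwatch without clearing Elapsed, so the second figure printed also contains the first interval. A sketch of the same MemoryStream comparison with a fresh stopwatch per measurement follows; WriteBenchmarkSketch and its method names are made up for illustration, and the 64 KiB test buffer is an arbitrary choice.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class WriteBenchmarkSketch
{
    // Times writing the data one byte at a time.
    public static TimeSpan TimeLoop(byte[] data)
    {
        using (var stream = new MemoryStream())
        {
            Stopwatch stopwatch = Stopwatch.StartNew(); // starts from zero
            foreach (byte b in data)
            {
                stream.WriteByte(b);
            }
            stopwatch.Stop();
            return stopwatch.Elapsed;
        }
    }

    // Times writing the same data with a single Write call.
    public static TimeSpan TimeBlock(byte[] data)
    {
        using (var stream = new MemoryStream())
        {
            Stopwatch stopwatch = Stopwatch.StartNew(); // independent of TimeLoop
            stream.Write(data, 0, data.Length);
            stopwatch.Stop();
            return stopwatch.Elapsed;
        }
    }

    static void Main()
    {
        byte[] data = new byte[64 * 1024];
        new Random(1).NextBytes(data);

        Console.WriteLine("Loop Test:   {0}", TimeLoop(data));
        Console.WriteLine("Buffer Test: {0}", TimeBlock(data));
    }
}
```

Calling stopwatch.Restart() (or Reset() followed by Start()) before the second measurement in the original listings would have the same effect. For a real serial port the figures will be dominated by the baud rate, so the MemoryStream numbers say little about the device case.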

ProtectedMemory.Unprotect outputs garbage

I've got this code to store and recover an authorization token (which is alphanumeric):
public static void Store (string token)
{
byte[] buffer = Encoding.UTF8.GetBytes (token.PadRight (32));
ProtectedMemory.Protect (buffer, MemoryProtectionScope.SameLogon);
Settings.Default.UserToken = buffer.ToHexString ();
Settings.Default.Save ();
}
public static string Retrieve ()
{
byte[] buffer = Settings.Default.UserToken.FromHexString ();
if (buffer.Length == 0)
return String.Empty;
ProtectedMemory.Unprotect (buffer, MemoryProtectionScope.SameLogon);
return Encoding.UTF8.GetString (buffer).Trim ();
}
And it mostly works fine, although sometimes I get garbage out (many FD bytes, and some readable ones). I suspect this happens only when I reboot, but I've had some difficulties reproducing it.
Is this the intended behaviour? That is, does MemoryProtectionScope.SameLogon mean that the data will always be unreadable upon reboot? Am I doing something wrong?
The FromHexString and ToHexString methods do exactly what you would expect from them.
Yes, ProtectedMemory will always fail after you reboot (or for the different MemoryProtectionScopes, restart the process etc.). It's only meant to work to protect memory, not data for storage.
You want to use ProtectedData instead:
ProtectedData.Protect(buffer, null, DataProtectionScope.CurrentUser);
Both of those are managed wrappers over the DPAPI (introduced with Windows 2000). There's a bunch of posts with more details on the .NET security blog - http://blogs.msdn.com/b/shawnfa/archive/2004/05/05/126825.aspx
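A minimal sketch of the store/retrieve pair rewritten around ProtectedData. It is Windows-only, and Base64 stands in for the question's ToHexString/FromHexString helpers since their code isn't shown; TokenStoreSketch and its method names are hypothetical. Unlike ProtectedMemory, ProtectedData returns a new array instead of encrypting in place, and the input does not need to be padded to a multiple of 16 bytes.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class TokenStoreSketch
{
    // Encrypts the token for the current Windows user and returns text that
    // can be written to a settings file.
    public static string Protect(string token)
    {
        byte[] plain = Encoding.UTF8.GetBytes(token);
        byte[] cipher = ProtectedData.Protect(plain, null, DataProtectionScope.CurrentUser);
        return Convert.ToBase64String(cipher);
    }

    // Reverses Protect; throws CryptographicException if the data was
    // protected by a different user or has been tampered with.
    public static string Unprotect(string stored)
    {
        byte[] cipher = Convert.FromBase64String(stored);
        byte[] plain = ProtectedData.Unprotect(cipher, null, DataProtectionScope.CurrentUser);
        return Encoding.UTF8.GetString(plain);
    }
}
```

In the question's code, Store would save the result of Protect(token) to Settings.Default.UserToken, and Retrieve would pass the saved string to Unprotect.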
