Need help making a CRC calculation class thread-safe - C#

When we added parallel processing to our application (a .NET service), we found some unexpected behavior in the CRC calculation over text documents.
To isolate the issue I created a test case. The CRC calculation fails when invoked from a parallel loop; in this test case, replacing Parallel.ForEach with a standard foreach always works fine. I think I have to make some change in the CRC32 class implementation, but I need some help to understand the right way. Thanks.
This is the test method.
[TestMethod()]
public void Test_Crc_TestoDoc()
{
string query = @"select top 100 docId from sometable";
///key is document's id
///value is a couple, crc and text
Dictionary<int, Tuple<int, string>> docs = new Dictionary<int, Tuple<int, string>>();
using (SqlDataReader oSqlDataReader = Utility.ExecuteSP_Reader(query))
{
while (oSqlDataReader.Read())
{
int docId = oSqlDataReader.GetInt32(0);
///retrieve the text by docId
string docText = Utility.GetDocText(docId);
///calculate and add crc in dic
int CRC = CRC32.Compute(docText);
docs.Add(docId, new Tuple<int, string>(CRC, docText));
}
oSqlDataReader.Close();
}
///calculate crc 100 times to check if the value
///is always the same for same text
for (int i = 0; i < 100; i++)
{
Parallel.ForEach(docs.Keys,(int docId) =>
{
///crc saved in dictionary
int CRC1 = docs[docId].Item1;
///text saved in dictionary
string docText = docs[docId].Item2;
///calculate crc again, crc2 must be equal to crc1 stored in dictionary
int CRC2 = CRC32.Compute(docText);
Assert.AreEqual(CRC1, CRC2, $"crc not equal, why? docId->{docId} CRC1->{CRC1} CRC2->{CRC2}");
});
}
}
The CRC32 class:
public class CRC32 : HashAlgorithm
{
#region CONSTRUCTORS
/// <summary>Creates a CRC32 object using the <see cref="DefaultPolynomial"/>.</summary>
public CRC32()
: this(DefaultPolynomial)
{
}
/// <summary>Creates a CRC32 object using the specified polynomial.</summary>
/// <remarks>The polynomial should be supplied in its bit-reflected form. <see cref="DefaultPolynomial"/>.</remarks>
[CLSCompliant(false)]
public CRC32(uint polynomial)
{
HashSizeValue = 32;
_crc32Table = (uint[])_crc32TablesCache[polynomial];
if (_crc32Table == null)
{
_crc32Table = CRC32._buildCRC32Table(polynomial);
_crc32TablesCache.Add(polynomial, _crc32Table);
}
Initialize();
}
// static constructor
static CRC32()
{
_crc32TablesCache = Hashtable.Synchronized(new Hashtable());
_defaultCRC = new CRC32();
}
#endregion
#region PROPERTIES
/// <summary>Gets the default polynomial (used in WinZip, Ethernet, etc.)</summary>
/// <remarks>The default polynomial is a bit-reflected version of the standard polynomial 0x04C11DB7 used by WinZip, Ethernet, etc.</remarks>
[CLSCompliant(false)]
public static readonly uint DefaultPolynomial = 0xEDB88320; // Bitwise reflection of 0x04C11DB7;
#endregion
#region METHODS
/// <summary>Initializes an implementation of HashAlgorithm.</summary>
public override void Initialize()
{
_crc = _allOnes;
}
/// <summary>Routes data written to the object into the hash algorithm for computing the hash.</summary>
protected override void HashCore(byte[] buffer, int offset, int count)
{
// Process exactly count bytes starting at offset.
for (int i = offset; i < offset + count; i++)
{
ulong ptr = (_crc & 0xFF) ^ buffer[i];
_crc >>= 8;
_crc ^= _crc32Table[ptr];
}
}
/// <summary>Finalizes the hash computation after the last data is processed by the cryptographic stream object.</summary>
protected override byte[] HashFinal()
{
byte[] finalHash = new byte[4];
ulong finalCRC = _crc ^ _allOnes;
finalHash[0] = (byte)((finalCRC >> 0) & 0xFF);
finalHash[1] = (byte)((finalCRC >> 8) & 0xFF);
finalHash[2] = (byte)((finalCRC >> 16) & 0xFF);
finalHash[3] = (byte)((finalCRC >> 24) & 0xFF);
return finalHash;
}
/// <summary>Computes the CRC32 value for the given ASCII string using the <see cref="DefaultPolynomial"/>.</summary>
public static int Compute(string asciiString)
{
_defaultCRC.Initialize();
return ToInt32(_defaultCRC.ComputeHash(asciiString));
}
/// <summary>Computes the CRC32 value for the given input stream using the <see cref="DefaultPolynomial"/>.</summary>
public static int Compute(Stream inputStream)
{
_defaultCRC.Initialize();
return ToInt32(_defaultCRC.ComputeHash(inputStream));
}
/// <summary>Computes the CRC32 value for the input data using the <see cref="DefaultPolynomial"/>.</summary>
public static int Compute(byte[] buffer)
{
_defaultCRC.Initialize();
return ToInt32(_defaultCRC.ComputeHash(buffer));
}
/// <summary>Computes the hash value for the input data using the <see cref="DefaultPolynomial"/>.</summary>
public static int Compute(byte[] buffer, int offset, int count)
{
_defaultCRC.Initialize();
return ToInt32(_defaultCRC.ComputeHash(buffer, offset, count));
}
/// <summary>Computes the hash value for the given ASCII string.</summary>
/// <remarks>The computation preserves the internal state between the calls, so it can be used for computation of a stream data.</remarks>
public byte[] ComputeHash(string asciiString)
{
byte[] rawBytes = ASCIIEncoding.ASCII.GetBytes(asciiString);
return ComputeHash(rawBytes);
}
/// <summary>Computes the hash value for the given input stream.</summary>
/// <remarks>The computation preserves the internal state between the calls, so it can be used for computation of a stream data.</remarks>
new public byte[] ComputeHash(Stream inputStream)
{
byte[] buffer = new byte[4096];
int bytesRead;
while ((bytesRead = inputStream.Read(buffer, 0, 4096)) > 0)
{
HashCore(buffer, 0, bytesRead);
}
return HashFinal();
}
/// <summary>Computes the hash value for the input data.</summary>
/// <remarks>The computation preserves the internal state between the calls, so it can be used for computation of a stream data.</remarks>
new public byte[] ComputeHash(byte[] buffer)
{
return ComputeHash(buffer, 0, buffer.Length);
}
/// <summary>Computes the hash value for the input data.</summary>
/// <remarks>The computation preserves the internal state between the calls, so it can be used for computation of a stream data.</remarks>
new public byte[] ComputeHash(byte[] buffer, int offset, int count)
{
HashCore(buffer, offset, count);
return HashFinal();
}
#endregion
#region PRIVATE SECTION
private static uint _allOnes = 0xffffffff;
private static CRC32 _defaultCRC;
private static Hashtable _crc32TablesCache;
private uint[] _crc32Table;
private uint _crc;
// Builds a crc32 table given a polynomial
private static uint[] _buildCRC32Table(uint polynomial)
{
uint crc;
uint[] table = new uint[256];
// 256 values representing ASCII character codes.
for (int i = 0; i < 256; i++)
{
crc = (uint)i;
for (int j = 8; j > 0; j--)
{
if ((crc & 1) == 1)
crc = (crc >> 1) ^ polynomial;
else
crc >>= 1;
}
table[i] = crc;
}
return table;
}
private static int ToInt32(byte[] buffer)
{
return BitConverter.ToInt32(buffer, 0);
}
#endregion
}

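The problem is most likely the static members.
The static Compute methods all share a single static instance, _defaultCRC, and that instance keeps its running value in the _crc field.
A static member is shared by every caller, so while one thread is in the middle of a computation, another thread can call Initialize() or HashCore() and overwrite _crc with its own value. Computing with a separate CRC32 instance per call (or per thread), or locking around the shared instance, avoids this.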
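One way to sidestep the shared state is to keep the running CRC in a local variable, so that concurrent calls have nothing to overwrite. The following is a minimal sketch under that assumption, not a drop-in replacement for the class in the question: the name SafeCrc32 is hypothetical, it does not derive from HashAlgorithm, and it only covers the string overload with the default polynomial.

```csharp
using System;
using System.Text;

// Hypothetical helper: a CRC-32 whose only shared state is a read-only lookup table.
public static class SafeCrc32
{
    private const uint Polynomial = 0xEDB88320; // bit-reflected 0x04C11DB7, as in the question

    // Built once by the static initializer and never modified afterwards,
    // so reading it from many threads is safe.
    private static readonly uint[] Table = BuildTable();

    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint crc = i;
            for (int j = 0; j < 8; j++)
            {
                crc = (crc & 1) == 1 ? (crc >> 1) ^ Polynomial : crc >> 1;
            }
            table[i] = crc;
        }
        return table;
    }

    public static int Compute(string asciiString)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(asciiString);
        uint crc = 0xFFFFFFFF; // local running value: each call has its own copy
        foreach (byte b in bytes)
        {
            crc = (crc >> 8) ^ Table[(crc & 0xFF) ^ b];
        }
        return unchecked((int)(crc ^ 0xFFFFFFFF));
    }
}
```

If the HashAlgorithm-based class has to stay, the same effect can be had by creating a new CRC32 instance inside each static Compute call, or by taking a lock around the use of _defaultCRC, at the cost of serializing the callers.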


How to Call to HashPassword and to store User password to database?

I'm setting up a HashPassword function using SHA1CryptoServiceProvider(). My requirement includes two methods that I need help with: one that generates the salt, and the initializer.
The salt is the IV used to salt the password before it is hashed, and the password is the one to validate. The initializer takes the string made by the salt generator, mixes the password and salt into one string, adds any extra characters to the end, then hashes the blended password and returns the value.
Essentially, I need to check whether the value sent from the view differs from the original, and if it does, I need to regenerate the hash and initializer when creating a new record.
This controller action calls the HashPassword functions in the USERController.Helper file.
public ActionResult HashPassword(USERSModel UsersModel)
{
USERDto dto = new USERDto();
if (ModelState.IsValid)
{
string hashedPassword = UsersModel.PASSWORD;
UsersModel.PASSWORD = hashedPassword;
dto.Updated.Add(hashedPassword);
dto.Updated.Add("NAME");
dto.Updated.Add("ID");
dto.Updated.Add("PASSWORD");
UsersModel.Updated.SaveChanges();
ViewBag.Message = "User was added successfully!";
UsersModel = new USERSModel();
}
else
ViewBag.message = "Error in adding User!";
return View("USERSSettingsPartial", UsersModel);
}
/// <summary>
/// Called to hash a user password to be stored in the DB.
/// </summary>
/// <param name="password">The password to validate.</param>
/// <param name="salt">The IV used to salt the password before it is hashed.</param>
/// <param name="errorDesc">Returns an error description if an error occurs.</param>
/// <returns>Returns the hashed password as a HEX string on success, otherwise returns null.</returns>
private string HashPassword(string password, byte[] salt, ref string errorDesc)
{
try
{
byte[] newPassword = Encoding.ASCII.GetBytes(password.ToUpper());
if (salt != null && salt.Length > 0)
{
int count = (salt.Length < newPassword.Length) ? salt.Length : newPassword.Length;
byte[] temp = new byte[salt.Length + newPassword.Length];
for (int index = 0; index < count; index++)
{
temp[index * 2] = newPassword[index];
temp[index * 2 + 1] = salt[index];
}
if (count == salt.Length && count < newPassword.Length)
Buffer.BlockCopy(newPassword, count, temp, count * 2, newPassword.Length - count);
else if (count == newPassword.Length && count < salt.Length)
Buffer.BlockCopy(salt, count, temp, count * 2, salt.Length - count);
newPassword = temp;
}
using (var hash = new System.Security.Cryptography.SHA1CryptoServiceProvider())
{
hash.ComputeHash(newPassword);
return this.GetHexStringFromBytes(hash.Hash);
}
}
catch (Exception Ex)
{
errorDesc = Ex.Message;
if (Ex.InnerException != null) errorDesc = string.Format("{0}\r\n{1}", errorDesc, Ex.InnerException.Message);
}
return null;
}
/// <summary>
/// Called to convert byte data into a hexadecimal string where each byte is represented as two hexadecimal characters.
/// </summary>
/// <param name="data">Byte data to convert.</param>
/// <returns>A hexadecimal string version of the data.</returns>
private string GetHexStringFromBytes(byte[] data)
{
if (data == null || data.Length == 0) return string.Empty;
StringBuilder sbHex = new StringBuilder();
for (int index = 0; index < data.Length; index++) sbHex.AppendFormat(null, "{0:X2}", data[index]);
return sbHex.ToString();
}
/// <summary>
/// Called to convert a hexadecimal string into byte data where two hexadecimal characters are converted into a byte.
/// </summary>
/// <param name="hexString">A hexadecimal string to convert</param>
/// <returns>The converted byte data.</returns>
private byte[] GetBytesFromHexString(string hexString)
{
if (string.IsNullOrEmpty(hexString)) return null;
byte[] data = new byte[hexString.Length / 2];
for (int index = 0; index < data.Length; index++)
{
data[index] = byte.Parse(hexString.Substring(index * 2, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
}
return data;
}
This is my first go around with a project like this, therefore I don't have any output. Just need examples to understand better.
Basically, I needed a controller in a service where this class creates the salt and is called from a controller helper class. What I did was set the initializer on the server side: I added code to the Create IHttpActionResult of the USERS service controller, which sets the salt and the password. You never want to store plaintext passwords in your database; always hash them.
I created a request in the USERS service controller that takes a DTO used to create a new record and returns an object containing the query results, if any; otherwise it returns a not-found or internal-server-error message. Within this method the salt is generated:
public IHttpActionResult Create([FromBody]USERDto dto)
{
if (!ModelState.IsValid)
{
return BadRequest(ModelState);
}
try
{
byte[] saltValue;
string error = string.Empty;
saltValue = GenerateSalt();
dto.INITIALIZER = GetHexStringFromBytes(saltValue);
dto.PASSWORD = HashPassword(dto.PASSWORD, saltValue, ref error);
USERDto created = USERSProcessor.Create(dto);
if (created == null)
{
return NotFound();
}
return Ok(created);
}
catch (Exception ex)
{
LogUtility.LogError(ex);
return InternalServerError(ex);
}
}
Then (to avoid cluttering the controller) I created a controller helper class and added this code to implement the salt and hashing methods. The USERSController calls HashPassword to produce the value stored in the database, and dto.INITIALIZER is set by converting the salt bytes into a hexadecimal string where each byte is represented as two hexadecimal characters:
partial class USERSController
{
/// <summary>
/// Called to generate salt byte array.
/// </summary>
/// <returns>The generated salt byte array.</returns>
public static byte[] GenerateSalt()
{
byte[] iv;
using (var alg = new AesCryptoServiceProvider())
{
alg.BlockSize = 128; //block size is 128 bits (16 bytes), which is the size of the IV generated.
alg.KeySize = 256; //key size is 256 bits (32 bytes)
alg.GenerateIV();
iv = alg.IV;
}
return iv;
}
/// <summary>
/// Called to hash a user password to be stored in DB.
/// </summary>
/// <param name="password">The password to validate.</param>
/// <param name="salt">The IV used to salt the password before it is hashed.</param>
/// <param name="errorDesc">Returns an error description if an error occurs.</param>
/// <returns>Returns the hashed password as a HEX string on success, otherwise returns null.</returns>
private static string HashPassword(string password, byte[] salt, ref string errorDesc)
{
try
{
byte[] newPassword = Encoding.ASCII.GetBytes(password.ToUpper());
if (salt != null && salt.Length > 0)
{
int count = (salt.Length < newPassword.Length) ? salt.Length : newPassword.Length;
byte[] temp = new byte[salt.Length + newPassword.Length];
for (int index = 0; index < count; index++)
{
temp[index * 2] = newPassword[index];
temp[index * 2 + 1] = salt[index];
}
if (count == salt.Length && count < newPassword.Length)
Buffer.BlockCopy(newPassword, count, temp, count * 2, newPassword.Length - count);
else if (count == newPassword.Length && count < salt.Length)
Buffer.BlockCopy(salt, count, temp, count * 2, salt.Length - count);
newPassword = temp;
}
using (var hash = new System.Security.Cryptography.SHA1CryptoServiceProvider())
{
hash.ComputeHash(newPassword);
return GetHexStringFromBytes(hash.Hash);
}
}
catch (Exception Ex)
{
errorDesc = Ex.Message;
if (Ex.InnerException != null) errorDesc = string.Format("{0}\r\n{1}", errorDesc, Ex.InnerException.Message);
}
return null;
}
/// <summary>
/// Called to convert byte data into a hexadecimal string where each byte is represented as two hexadecimal characters.
/// </summary>
/// <param name="data">Byte data to convert.</param>
/// <returns>A hexadecimal string version of the data.</returns>
private static string GetHexStringFromBytes(byte[] data)
{
if (data == null || data.Length == 0) return string.Empty;
StringBuilder sbHex = new StringBuilder();
for (int index = 0; index < data.Length; index++) sbHex.AppendFormat(null, "{0:X2}", data[index]);
return sbHex.ToString();
}
/// <summary>
/// Called to convert a hexadecimal string into byte data where two hexadecimal characters are converted into a byte.
/// </summary>
/// <param name="hexString">A hexadecimal string to convert</param>
/// <returns>The converted byte data.</returns>
private static byte[] GetBytesFromHexString(string hexString)
{
if (string.IsNullOrEmpty(hexString)) return null;
byte[] data = new byte[hexString.Length / 2];
for (int index = 0; index < data.Length; index++)
{
data[index] = byte.Parse(hexString.Substring(index * 2, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
}
return data;
}
}
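The helper above hashes the upper-cased password with a single SHA-1 pass, which makes passwords case-insensitive and is cheap to brute-force. A commonly recommended alternative is a key-derivation function such as PBKDF2. The following is a minimal sketch of that swapped-in technique, not the method used in this answer: the class name PasswordHasher, the sizes and the iteration count are all assumptions, and the Rfc2898DeriveBytes constructor taking a HashAlgorithmName needs .NET Framework 4.7.2 or .NET Core 2.0 and later.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper: PBKDF2-based hashing with a random salt.
public static class PasswordHasher
{
    private const int SaltSize = 16;        // bytes
    private const int HashSize = 32;        // bytes
    private const int Iterations = 100000;  // assumed work factor; tune for your hardware

    // Generates a random salt from the cryptographic random number generator.
    public static byte[] GenerateSalt()
    {
        var salt = new byte[SaltSize];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        return salt;
    }

    // Derives the value to store from the password and its salt.
    public static byte[] Hash(string password, byte[] salt)
    {
        using (var kdf = new Rfc2898DeriveBytes(password, salt, Iterations, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(HashSize);
        }
    }

    // Recomputes the hash with the stored salt and compares without exiting early.
    public static bool Verify(string password, byte[] salt, byte[] expectedHash)
    {
        byte[] actual = Hash(password, salt);
        if (actual.Length != expectedHash.Length) return false;
        int diff = 0;
        for (int i = 0; i < actual.Length; i++)
        {
            diff |= actual[i] ^ expectedHash[i];
        }
        return diff == 0;
    }
}
```

On a login attempt, the stored salt (the INITIALIZER column here, decoded from hex) and the stored hash would be passed to Verify together with the submitted password.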

Time based OTP generation generating wrong key C#

I've implemented a number of TOTP classes now and they all generate the wrong output. Below I've posted the code I used for the simplest one.
I'd like it to behave just like Google Authenticator, for example like the code at https://gauth.apps.gbraad.nl/#main.
So what I want to happen is that in the front end of the application a user enters his secret "BANANAKEY123", which translates to the Base32 string "IJAU4QKOIFFUKWJRGIZQ====".
Now in the constructor below, key would be "BANANAKEY123". Yet for some reason it's not generating the same OTP codes with this code as the GAuth OTP tool does.
The only two plausible mistakes would be that
var secretKeyBytes = Base32Encode(secretKey);
is wrong or that my timing function is wrong. I checked both and couldn't find the fault in either of them. So could someone please point me in the right direction? Thank you!
public class Totp
{
private readonly int digits = 6;
private readonly HMACSHA1 hmac;
private readonly HMACSHA256 hmac256;
private readonly Int32 t1 = 30;
internal int mode;
private string secret;
private const string allowedCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
public Totp(string key, int mode)
{
secret = key;
this.mode = mode;
}
// defaults to SHA-1
public Totp(string key)
{
secret = key;
this.mode = 1;
}
public Totp(string base32string, Int32 t1, int digits) : this(base32string)
{
this.t1 = t1;
this.digits = digits;
}
public Totp(string base32string, Int32 t1, int digits, int mode) : this(base32string, mode)
{
this.t1 = t1;
this.digits = digits;
}
public String getCodeString()
{
return GetCode(this.secret, GetInterval(DateTime.UtcNow));
}
private static long GetInterval(DateTime dateTime)
{
TimeSpan elapsedTime = dateTime.ToUniversalTime() - new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
return (long)elapsedTime.TotalSeconds / 30;
}
private static string GetCode(string secretKey, long timeIndex)
{
var secretKeyBytes = Base32Encode(secretKey);
HMACSHA1 hmac = new HMACSHA1(secretKeyBytes);
byte[] challenge = BitConverter.GetBytes(timeIndex);
if (BitConverter.IsLittleEndian) Array.Reverse(challenge);
byte[] hash = hmac.ComputeHash(challenge);
int offset = hash[19] & 0xf;
int truncatedHash = hash[offset] & 0x7f;
for (int i = 1; i < 4; i++)
{
truncatedHash <<= 8;
truncatedHash |= hash[offset + i] & 0xff;
}
truncatedHash %= 1000000;
return truncatedHash.ToString("D6");
}
private static byte[] Base32Encode(string source)
{
var bits = source.ToUpper().ToCharArray().Select(c =>
Convert.ToString(allowedCharacters.IndexOf(c), 2).PadLeft(5, '0')).Aggregate((a, b) => a + b);
return Enumerable.Range(0, bits.Length / 8).Select(i => Convert.ToByte(bits.Substring(i * 8, 8), 2)).ToArray();
}
}
I have been using this code for quite some time to generate time-based OTPs; I hope it helps.
TotpAuthenticationService.cs
using System;
using System.Net;
using System.Security.Cryptography;
using System.Text;
namespace Wteen.Infrastructure.Services
{
/// <summary>
/// A time-based implementation of RFC 6238, a variation of the OTP (One Time Password), with a default code lifetime of 30 seconds.
/// </summary>
public sealed class TotpAuthenticationService
{
private readonly Encoding _encoding;
private readonly int _length;
private readonly TimeSpan _timestep;
private readonly DateTime _unixEpoch;
/// <summary>
/// Create a new Instance of <see cref="TotpAuthenticationService"/>
/// </summary>
/// <param name="length">The length of the OTP</param>
/// <param name="duration">The period of time, in seconds, during which generating an OTP yields the same value</param>
public TotpAuthenticationService(int length, int duration = 30)
{
_length = length;
_encoding = new UTF8Encoding(false, true);
_timestep = TimeSpan.FromSeconds(duration);
_unixEpoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
}
/// <summary>
/// The current time step number
/// </summary>
private ulong CurrentTimeStepNumber => (ulong)(TimeElapsed.Ticks / _timestep.Ticks);
/// <summary>
/// The time elapsed since midnight UTC of January 1, 1970.
/// </summary>
private TimeSpan TimeElapsed => DateTime.UtcNow - _unixEpoch;
/// <summary>
///
/// </summary>
/// <param name="securityToken"></param>
/// <param name="modifier"></param>
/// <returns></returns>
public int GenerateCode(byte[] securityToken, string modifier = null)
{
if (securityToken == null)
throw new ArgumentNullException(nameof(securityToken));
using (var hmacshA1 = new HMACSHA1(securityToken))
{
return ComputeTotp(hmacshA1, CurrentTimeStepNumber, modifier);
}
}
/// <summary>
/// Validates codes generated during the current time step and up to <see cref="timeSteps"/> steps before or after it.
/// </summary>
/// <param name="securityToken">User's secret</param>
/// <param name="code">The code to validate</param>
/// <param name="timeSteps">The number of time steps the <see cref="code"/> could be validated for.</param>
/// <param name="channel">Possible channels could be user's email or mobile number where the code will be sent to</param>
/// <returns></returns>
public bool ValidateCode(byte[] securityToken, int code, int timeSteps, string channel = null)
{
if (securityToken == null)
throw new ArgumentNullException(nameof(securityToken));
using (var hmacshA1 = new HMACSHA1(securityToken))
{
for (var index = -timeSteps; index <= timeSteps; ++index)
if (ComputeTotp(hmacshA1, CurrentTimeStepNumber + (ulong)index, channel) == code)
return true;
}
return false;
}
private byte[] ApplyModifier(byte[] input, string modifier)
{
if (string.IsNullOrEmpty(modifier))
return input;
var bytes = _encoding.GetBytes(modifier);
var numArray = new byte[checked(input.Length + bytes.Length)];
Buffer.BlockCopy(input, 0, numArray, 0, input.Length);
Buffer.BlockCopy(bytes, 0, numArray, input.Length, bytes.Length);
return numArray;
}
private int ComputeTotp(HashAlgorithm algorithm, ulong timestepNumber, string modifier)
{
var bytes = BitConverter.GetBytes(IPAddress.HostToNetworkOrder((long)timestepNumber));
var hash = algorithm.ComputeHash(ApplyModifier(bytes, modifier));
var index = hash[hash.Length - 1] & 15;
return (((hash[index] & sbyte.MaxValue) << 24) | ((hash[index + 1] & byte.MaxValue) << 16) | ((hash[index + 2] & byte.MaxValue) << 8) | (hash[index + 3] & byte.MaxValue)) % (int)Math.Pow(10, _length);
}
}
}
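The service above takes the secret as raw bytes, while authenticator apps usually exchange it as Base32 text, so a decoding step is needed in between. In the question's code the method named Base32Encode actually decodes, and it is handed "BANANAKEY123", whose '1' is not in the Base32 alphabet, so IndexOf returns -1 for that character. A minimal decoding sketch, assuming the RFC 4648 alphabet and optional '=' padding (the class name Base32 is hypothetical):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: decodes RFC 4648 Base32 text into the raw key bytes.
public static class Base32
{
    private const string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    public static byte[] Decode(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        string trimmed = input.TrimEnd('=').ToUpperInvariant();
        var output = new List<byte>(trimmed.Length * 5 / 8);
        int buffer = 0;        // holds bits that do not yet form a whole byte
        int bitsInBuffer = 0;

        foreach (char c in trimmed)
        {
            int value = Alphabet.IndexOf(c);
            if (value < 0) throw new FormatException("Invalid Base32 character: " + c);

            buffer = (buffer << 5) | value;
            bitsInBuffer += 5;

            if (bitsInBuffer >= 8)
            {
                bitsInBuffer -= 8;
                output.Add((byte)(buffer >> bitsInBuffer));
                buffer &= (1 << bitsInBuffer) - 1; // keep only the leftover low bits
            }
        }
        return output.ToArray(); // trailing bits that do not fill a byte are dropped
    }
}
```

With this, the Base32 string from the question decodes back to the ASCII bytes of "BANANAKEY123", which could then be passed as securityToken to GenerateCode.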

Fast read C structure when it contains char array

I have the following C structure
struct MyStruct {
char chArray[96];
__int64 offset;
unsigned count;
};
I now have a bunch of files created in C with thousands of those structures. I need to read them using C# and speed is an issue.
I have done the following in C#
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Size = 108)]
public struct PreIndexStruct {
[MarshalAs(UnmanagedType.ByValTStr, SizeConst = 96)]
public string Key;
public long Offset;
public int Count;
}
And then I read the data from the file using
using (BinaryReader br = new BinaryReader(
new FileStream(pathToFile, FileMode.Open, FileAccess.Read,
FileShare.Read, bufferSize)))
{
long length = br.BaseStream.Length;
long position = 0;
byte[] buff = new byte[structSize];
GCHandle buffHandle = GCHandle.Alloc(buff, GCHandleType.Pinned);
while (position < length) {
br.Read(buff, 0, structSize);
PreIndexStruct pis = (PreIndexStruct)Marshal.PtrToStructure(
buffHandle.AddrOfPinnedObject(), typeof(PreIndexStruct));
structures.Add(pis);
position += structSize;
}
buffHandle.Free();
}
This works perfectly and I can retrieve the data just fine from the files.
I've read that I can speed things up if, instead of using GCHandle.Alloc/Marshal.PtrToStructure, I use C++/CLI or C# unsafe code. I found some examples, but they only refer to structures without fixed-size arrays.
My question is, for my particular case, is there a faster way of doing things with C++/CLI or C# unsafe code?
EDIT
Additional performance info (I've used ANTS Performance Profiler 7.4):
66% of my CPU time is used by calls to Marshal.PtrToStructure.
Regarding I/O, only 6 out of 105ms are used to read from the file.
In this case, you don't explicitly need to use P/Invoke since you don't have to pass the struct back and forth between managed and native code. So you could do this instead. It would avoid this useless GC handle allocation, and allocate only what's needed.
public struct PreIndexStruct {
public string Key;
public long Offset;
public int Count;
}
while (...) {
...
PreIndexStruct pis = new PreIndexStruct();
pis.Key = Encoding.Default.GetString(reader.ReadBytes(96));
pis.Offset = reader.ReadInt64();
pis.Count = reader.ReadInt32();
structures.Add(pis);
}
I'm not sure you can be much faster than this.
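A variation on the same idea is to read the whole file into one byte array and pick the fields out with BitConverter, which avoids both Marshal.PtrToStructure and a separate ReadBytes allocation per record. This is only a sketch under stated assumptions: records are packed to 108 bytes as in the question's StructLayout, the file was written on a little-endian machine, and the key is NUL-terminated ASCII. The names PreIndexRecord and PreIndexReader are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public struct PreIndexRecord
{
    public string Key;
    public long Offset;
    public int Count;
}

// Hypothetical helper: parses fixed-size records out of a single buffer.
public static class PreIndexReader
{
    public const int KeySize = 96;
    public const int RecordSize = 108; // 96 + 8 + 4, assumed to match the file layout

    public static List<PreIndexRecord> Parse(byte[] data)
    {
        var result = new List<PreIndexRecord>(data.Length / RecordSize);
        for (int pos = 0; pos + RecordSize <= data.Length; pos += RecordSize)
        {
            // Keep only the key bytes before the first NUL terminator, if there is one.
            int terminator = Array.IndexOf(data, (byte)0, pos, KeySize);
            int keyLength = terminator < 0 ? KeySize : terminator - pos;

            result.Add(new PreIndexRecord
            {
                Key = Encoding.ASCII.GetString(data, pos, keyLength),
                Offset = BitConverter.ToInt64(data, pos + KeySize),
                Count = BitConverter.ToInt32(data, pos + KeySize + 8)
            });
        }
        return result;
    }

    public static List<PreIndexRecord> ReadFile(string path)
    {
        return Parse(File.ReadAllBytes(path));
    }
}
```

Whether this beats the BinaryReader loop depends on file size and key length, so it is worth profiling both against real data.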
Probably more correctly you want to use unmanaged code. This is what I would do:
1. Create a C++/CLI project and get your existing C# code ported and running there.
2. Determine where your bottleneck is (use the profiler).
3. Rewrite that section of the code in straight C++, call it from the C++/CLI code and make sure it works, then profile it again.
4. Surround your new code with "#pragma unmanaged".
5. Profile it again.
You will probably get some level of speed increase, but it may not be what you are expecting.
It is possible, with much fiddliness, to do a pretty quick read of arrays of structs, but because this technique requires blittable types, the only way to do it is to make a fixed buffer of bytes for the Key instead of using a string.
If you do that, you have to use unsafe code so it's probably not really worth it.
However, just for the curious, this is how you can do a super-duper fast read and write of those structs, at the expense of having to allow unsafe code and a lot of fiddle:
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Diagnostics.CodeAnalysis;
using System.IO;
using System.Runtime.InteropServices;
namespace Demo
{
public static class Program
{
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Size = 108)]
public struct PreIndexStruct
{
public unsafe fixed byte Key[96];
public long Offset;
public int Count;
}
private static void Main(string[] args)
{
PreIndexStruct[] a = new PreIndexStruct[100];
for (int i = 0; i < a.Length; ++i)
{
a[i].Count = i;
unsafe
{
fixed (byte* key = a[i].Key)
{
for (int j = 0; j < 10; ++j)
{
key[j] = (byte)i;
}
}
}
}
using (var output = File.Create(@"C:\TEST\TEST.BIN"))
{
FastWrite(output, a, 0, a.Length);
}
using (var input = File.OpenRead(@"C:\TEST\TEST.BIN"))
{
var b = FastRead<PreIndexStruct>(input, a.Length);
for (int i = 0; i < b.Length; ++i)
{
Console.Write("Count = " + b[i].Count + ", Key =");
unsafe
{
fixed (byte* key = b[i].Key)
{
// Here you would access the bytes in Key[], which would presumably be ANSI chars.
for (int j = 0; j < 10; ++j)
{
Console.Write(" " + key[j]);
}
}
}
Console.WriteLine();
}
}
}
/// <summary>
/// Writes a part of an array to a file stream as quickly as possible,
/// without making any additional copies of the data.
/// </summary>
/// <typeparam name="T">The type of the array elements.</typeparam>
/// <param name="fs">The file stream to which to write.</param>
/// <param name="array">The array containing the data to write.</param>
/// <param name="offset">The offset of the start of the data in the array to write.</param>
/// <param name="count">The number of array elements to write.</param>
/// <exception cref="IOException">Thrown on error. See inner exception for <see cref="Win32Exception"/></exception>
[SuppressMessage("Microsoft.Reliability", "CA2001:AvoidCallingProblematicMethods", MessageId="System.Runtime.InteropServices.SafeHandle.DangerousGetHandle")]
public static void FastWrite<T>(FileStream fs, T[] array, int offset, int count) where T: struct
{
int sizeOfT = Marshal.SizeOf(typeof(T));
GCHandle gcHandle = GCHandle.Alloc(array, GCHandleType.Pinned);
try
{
uint bytesWritten;
uint bytesToWrite = (uint)(count * sizeOfT);
if
(
!WriteFile
(
fs.SafeFileHandle.DangerousGetHandle(),
new IntPtr(gcHandle.AddrOfPinnedObject().ToInt64() + (offset*sizeOfT)),
bytesToWrite,
out bytesWritten,
IntPtr.Zero
)
)
{
throw new IOException("Unable to write file.", new Win32Exception(Marshal.GetLastWin32Error()));
}
Debug.Assert(bytesWritten == bytesToWrite);
}
finally
{
gcHandle.Free();
}
}
/// <summary>
/// Reads array data from a file stream as quickly as possible,
/// without making any additional copies of the data.
/// </summary>
/// <typeparam name="T">The type of the array elements.</typeparam>
/// <param name="fs">The file stream from which to read.</param>
/// <param name="count">The number of elements to read.</param>
/// <returns>
/// The array of elements that was read. This may be less than the number that was
/// requested if the end of the file was reached. It may even be empty.
/// NOTE: There may still be data left in the file, even if not all the requested
/// elements were returned - this happens if the number of bytes remaining in the
/// file is less than the size of the array elements.
/// </returns>
/// <exception cref="IOException">Thrown on error. See inner exception for <see cref="Win32Exception"/></exception>
[SuppressMessage("Microsoft.Reliability", "CA2001:AvoidCallingProblematicMethods", MessageId="System.Runtime.InteropServices.SafeHandle.DangerousGetHandle")]
public static T[] FastRead<T>(FileStream fs, int count) where T: struct
{
int sizeOfT = Marshal.SizeOf(typeof(T));
long bytesRemaining = fs.Length - fs.Position;
long wantedBytes = (long)count * sizeOfT; // widen before multiplying to avoid int overflow
long bytesAvailable = Math.Min(bytesRemaining, wantedBytes);
long availableValues = bytesAvailable / sizeOfT;
long bytesToRead = (availableValues * sizeOfT);
if ((bytesRemaining < wantedBytes) && ((bytesRemaining - bytesToRead) > 0))
{
Debug.WriteLine("Requested data exceeds available data and partial data remains in the file.", "Dmr.Common.IO.Arrays.FastRead(fs,count)");
}
T[] result = new T[availableValues];
GCHandle gcHandle = GCHandle.Alloc(result, GCHandleType.Pinned);
try
{
uint bytesRead = 0;
if
(
!ReadFile
(
fs.SafeFileHandle.DangerousGetHandle(),
gcHandle.AddrOfPinnedObject(),
(uint)bytesToRead,
out bytesRead,
IntPtr.Zero
)
)
{
throw new IOException("Unable to read file.", new Win32Exception(Marshal.GetLastWin32Error()));
}
Debug.Assert(bytesRead == bytesToRead);
}
finally
{
gcHandle.Free();
}
return result;
}
/// <summary>See the Windows API documentation for details.</summary>
[SuppressMessage("Microsoft.Interoperability", "CA1415:DeclarePInvokesCorrectly")]
[DllImport("kernel32.dll", SetLastError=true)]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool ReadFile
(
IntPtr hFile,
IntPtr lpBuffer,
uint nNumberOfBytesToRead,
out uint lpNumberOfBytesRead,
IntPtr lpOverlapped
);
/// <summary>See the Windows API documentation for details.</summary>
[SuppressMessage("Microsoft.Interoperability", "CA1415:DeclarePInvokesCorrectly")]
[DllImport("kernel32.dll", SetLastError=true)]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool WriteFile
(
IntPtr hFile,
IntPtr lpBuffer,
uint nNumberOfBytesToWrite,
out uint lpNumberOfBytesWritten,
IntPtr lpOverlapped
);
}
}

Gray code in .NET

Is there a built in Gray code datatype anywhere in the .NET framework? Or conversion utility between Gray and binary? I could do it myself, but if the wheel has already been invented...
Use this trick.
/*
The purpose of this function is to convert an unsigned
binary number to reflected binary Gray code.
*/
unsigned short binaryToGray(unsigned short num)
{
return (num>>1) ^ num;
}
A tricky trick: for up to 2^n bits, you can convert Gray to binary by
performing (2^n) - 1 binary-to-Gray conversions. All you need is the
function above and a 'for' loop.
/*
The purpose of this function is to convert a reflected binary
Gray code number to a binary number.
*/
unsigned short grayToBinary(unsigned short num)
{
unsigned short temp = num ^ (num>>8);
temp ^= (temp>>4);
temp ^= (temp>>2);
temp ^= (temp>>1);
return temp;
}
Here is a C# implementation that assumes you only want to do this on non-negative 32-bit integers:
static uint BinaryToGray(uint num)
{
return (num>>1) ^ num;
}
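The reverse direction for the same 32-bit case can be sketched in C# as well, by extending the shift-and-XOR cascade from the 16-bit C function above with one more step for the upper half. The class name GrayCodeUInt32 is hypothetical.

```csharp
using System;

// Hypothetical helper: Gray code conversions on 32-bit unsigned integers.
public static class GrayCodeUInt32
{
    public static uint BinaryToGray(uint num)
    {
        return (num >> 1) ^ num;
    }

    public static uint GrayToBinary(uint gray)
    {
        // Each binary bit is the XOR of all Gray bits at or above its position;
        // halving the shift each time accumulates that prefix XOR in five steps.
        gray ^= gray >> 16;
        gray ^= gray >> 8;
        gray ^= gray >> 4;
        gray ^= gray >> 2;
        gray ^= gray >> 1;
        return gray;
    }
}
```

A quick sanity check is that GrayToBinary(BinaryToGray(x)) returns x for any uint.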
You might also like to read this blog post which provides methods for conversions in both directions, though the author chose to represent the code as an array of int containing either one or zero at each position. Personally I would think a BitArray might be a better choice.
Perhaps this collection of methods is useful
based on BitArray
both directions
int or just n Bits
just enjoy.
public static class GrayCode
{
public static byte BinaryToByte(BitArray binary)
{
if (binary.Length > 8)
throw new ArgumentException("bitarray too long for byte");
var array = new byte[1];
binary.CopyTo(array, 0);
return array[0];
}
public static int BinaryToInt(BitArray binary)
{
if (binary.Length > 32)
throw new ArgumentException("bitarray too long for int");
var array = new int[1];
binary.CopyTo(array, 0);
return array[0];
}
public static BitArray BinaryToGray(BitArray binary)
{
var len = binary.Length;
var gray = new BitArray(len);
gray[len - 1] = binary[len - 1]; // copy high-order bit
for (var i = len - 2; i >= 0; --i)
{
// remaining bits
gray[i] = binary[i] ^ binary[i + 1];
}
return gray;
}
public static BitArray GrayToBinary(BitArray gray)
{
var len = gray.Length;
var binary = new BitArray(len);
binary[len - 1] = gray[len - 1]; // copy high-order bit
for (var i = len - 2; i >= 0; --i)
{
// remaining bits
binary[i] = gray[i] ^ binary[i + 1];
}
return binary;
}
public static BitArray ByteToGray(byte value)
{
var bits = new BitArray(new[] { value });
return BinaryToGray(bits);
}
public static BitArray IntToGray(int value)
{
var bits = new BitArray(new[] { value });
return BinaryToGray(bits);
}
public static byte GrayToByte(BitArray gray)
{
var binary = GrayToBinary(gray);
return BinaryToByte(binary);
}
public static int GrayToInt(BitArray gray)
{
var binary = GrayToBinary(gray);
return BinaryToInt(binary);
}
/// <summary>
/// Returns the bits as string of '0' and '1' (LSB is right)
/// </summary>
/// <param name="bits"></param>
/// <returns></returns>
public static string AsString(this BitArray bits)
{
var sb = new StringBuilder(bits.Length);
for (var i = bits.Length - 1; i >= 0; i--)
{
sb.Append(bits[i] ? "1" : "0");
}
return sb.ToString();
}
public static IEnumerable<bool> Bits(this BitArray bits)
{
return bits.Cast<bool>();
}
public static bool[] AsBoolArr(this BitArray bits, int count)
{
return bits.Bits().Take(count).ToArray();
}
}
There is nothing built-in as far as Gray code in .NET.
Graphical Explanation about Gray code conversion - this can help a little

"Chunked" MemoryStream

I'm looking for an implementation of MemoryStream that does not allocate memory as one big block, but rather as a collection of chunks. I want to store a few GB of data in memory (64-bit) and avoid the limitations caused by memory fragmentation.
Something like this:
class ChunkedMemoryStream : Stream
{
private readonly List<byte[]> _chunks = new List<byte[]>();
private int _positionChunk;
private int _positionOffset;
private long _position;
public override bool CanRead
{
get { return true; }
}
public override bool CanSeek
{
get { return true; }
}
public override bool CanWrite
{
get { return true; }
}
public override void Flush() { }
public override long Length
{
// Sum as long so that streams larger than 2 GB do not overflow.
get { return _chunks.Sum(c => (long)c.Length); }
}
public override long Position
{
get
{
return _position;
}
set
{
_position = value;
_positionChunk = 0;
// Walk the chunks to locate the one that contains the new position.
long remaining = value;
while (remaining != 0)
{
if (_positionChunk >= _chunks.Count)
throw new OverflowException();
if (remaining < _chunks[_positionChunk].Length)
break;
remaining -= _chunks[_positionChunk].Length;
_positionChunk++;
}
_positionOffset = (int)remaining;
}
}
public override int Read(byte[] buffer, int offset, int count)
{
int result = 0;
while ((count != 0) && (_positionChunk != _chunks.Count))
{
byte[] chunk = _chunks[_positionChunk];
int fromChunk = Math.Min(count, chunk.Length - _positionOffset);
if (fromChunk != 0)
{
Array.Copy(chunk, _positionOffset, buffer, offset, fromChunk);
offset += fromChunk;
count -= fromChunk;
result += fromChunk;
_position += fromChunk;
_positionOffset += fromChunk;
}
// Move to the next chunk only once the current one is exhausted.
if (_positionOffset == chunk.Length)
{
_positionOffset = 0;
_positionChunk++;
}
}
return result;
}
public override long Seek(long offset, SeekOrigin origin)
{
long newPos = 0;
switch (origin)
{
case SeekOrigin.Begin:
newPos = offset;
break;
case SeekOrigin.Current:
newPos = Position + offset;
break;
case SeekOrigin.End:
// Offsets relative to the end are added (they are normally negative).
newPos = Length + offset;
break;
}
// Clamp to the valid range and report the position actually set.
newPos = Math.Max(0, Math.Min(newPos, Length));
Position = newPos;
return newPos;
}
public override void SetLength(long value)
{
throw new NotImplementedException();
}
public override void Write(byte[] buffer, int offset, int count)
{
// First overwrite any existing data at the current position.
while ((count != 0) && (_positionChunk != _chunks.Count))
{
byte[] chunk = _chunks[_positionChunk];
int toChunk = Math.Min(count, chunk.Length - _positionOffset);
if (toChunk != 0)
{
Array.Copy(buffer, offset, chunk, _positionOffset, toChunk);
offset += toChunk;
count -= toChunk;
_position += toChunk;
_positionOffset += toChunk;
}
// Move to the next chunk only once the current one is exhausted.
if (_positionOffset == chunk.Length)
{
_positionOffset = 0;
_positionChunk++;
}
}
// Whatever is left goes into a new chunk appended at the end.
if (count != 0)
{
byte[] chunk = new byte[count];
Array.Copy(buffer, offset, chunk, 0, count);
_chunks.Add(chunk);
_positionChunk = _chunks.Count;
_position += count;
}
}
}
class Program
{
static void Main(string[] args)
{
ChunkedMemoryStream cms = new ChunkedMemoryStream();
Debug.Assert(cms.Length == 0);
Debug.Assert(cms.Position == 0);
cms.Position = 0;
byte[] helloworld = Encoding.UTF8.GetBytes("hello world");
cms.Write(helloworld, 0, 3);
cms.Write(helloworld, 3, 3);
cms.Write(helloworld, 6, 5);
Debug.Assert(cms.Length == 11);
Debug.Assert(cms.Position == 11);
cms.Position = 0;
byte[] b = new byte[20];
cms.Read(b, 3, (int)cms.Length);
Debug.Assert(b.Skip(3).Take(11).SequenceEqual(helloworld));
cms.Position = 0;
cms.Write(Encoding.UTF8.GetBytes("seeya"), 0, 5);
Debug.Assert(cms.Length == 11);
Debug.Assert(cms.Position == 5);
cms.Position = 0;
cms.Read(b, 0, (byte) cms.Length);
Debug.Assert(b.Take(11).SequenceEqual(Encoding.UTF8.GetBytes("seeya world")));
Debug.Assert(cms.Length == 11);
Debug.Assert(cms.Position == 11);
cms.Write(Encoding.UTF8.GetBytes(" again"), 0, 6);
Debug.Assert(cms.Length == 17);
Debug.Assert(cms.Position == 17);
cms.Position = 0;
cms.Read(b, 0, (byte)cms.Length);
Debug.Assert(b.Take(17).SequenceEqual(Encoding.UTF8.GetBytes("seeya world again")));
}
}
You first need to determine whether virtual address fragmentation is actually the problem.
If you are on a 64-bit machine (which you seem to indicate you are), I seriously doubt it is. Each 64-bit process has almost the entire 64-bit virtual address space available, and your only worry is virtual address space fragmentation, not physical memory fragmentation (which is the operating system's concern). The OS memory manager already pages memory under the covers. For the foreseeable future you will not run out of virtual address space before you run out of physical memory. This is unlikely to change before we both retire.
If you have a 32-bit address space, then allocating contiguous blocks of memory in the GB range will run into a fragmentation problem quite quickly. There is no stock chunk-allocating memory stream in the CLR. There is one under the covers in ASP.NET (for other reasons), but it is not accessible. If you must travel this path, you are probably better off writing one yourself anyway, because the usage pattern of your application is unlikely to be similar to many others, and trying to fit your data into a 32-bit address space will likely be your perf bottleneck.
I highly recommend requiring a 64-bit process if you are manipulating GBs of data. It will do a much better job than hand-rolled solutions to 32-bit address space fragmentation, regardless of how clever you are.
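Whether the process is really running as 64-bit can be checked at startup. The sketch below uses Environment.Is64BitProcess and Environment.Is64BitOperatingSystem (available since .NET Framework 4) together with IntPtr.Size; the class name BitnessCheck and the message format are made up for this example.

```csharp
using System;

public static class BitnessCheck
{
    // Summarizes the bitness of the current process and operating system.
    public static string Describe()
    {
        return string.Format(
            "64-bit process: {0}, 64-bit OS: {1}, pointer size: {2} bytes",
            Environment.Is64BitProcess,
            Environment.Is64BitOperatingSystem,
            IntPtr.Size);
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
        if (!Environment.Is64BitProcess)
            Console.WriteLine("Warning: running as 32-bit; large allocations may fail.");
    }
}
```

A process can still run as 32-bit on a 64-bit OS (for example when the project is built with a "prefer 32-bit" setting), which is why the process flag is checked here as well as the OS flag.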
The Bing team has released RecyclableMemoryStream and wrote about it here. The benefits they cite are:
Eliminate Large Object Heap allocations by using pooled buffers
Incur far fewer gen 2 GCs, and spend far less time paused due to GC
Avoid memory leaks by having a bounded pool size
Avoid memory fragmentation
Provide excellent debuggability
Provide metrics for performance tracking
I ran into a similar problem in my application. I was reading a large amount of compressed data and got an OutOfMemoryException when using MemoryStream. I wrote my own implementation of a "chunked" memory stream based on a collection of byte arrays. If you have any ideas on how to make this memory stream more efficient, please let me know.
public sealed class ChunkedMemoryStream : Stream
{
#region Constants
private const int BUFFER_LENGTH = 65536;
private const byte ONE = 1;
private const byte ZERO = 0;
#endregion
#region Readonly & Static Fields
private readonly Collection<byte[]> _chunks;
#endregion
#region Fields
private long _length;
private long _position;
private const byte TWO = 2;
#endregion
#region C'tors
public ChunkedMemoryStream()
{
_chunks = new Collection<byte[]> { new byte[BUFFER_LENGTH], new byte[BUFFER_LENGTH] };
_position = ZERO;
_length = ZERO;
}
#endregion
#region Instance Properties
public override bool CanRead
{
get { return true; }
}
public override bool CanSeek
{
get { return true; }
}
public override bool CanWrite
{
get { return true; }
}
public override long Length
{
get { return _length; }
}
public override long Position
{
get { return _position; }
set
{
if (!CanSeek)
throw new NotSupportedException();
_position = value;
if (_position > _length)
_position = _length - ONE;
}
}
private byte[] CurrentChunk
{
get
{
long positionDividedByBufferLength = _position / BUFFER_LENGTH;
var chunkIndex = Convert.ToInt32(positionDividedByBufferLength);
byte[] chunk = _chunks[chunkIndex];
return chunk;
}
}
private int PositionInChunk
{
get
{
int positionInChunk = Convert.ToInt32(_position % BUFFER_LENGTH);
return positionInChunk;
}
}
private int RemainingBytesInCurrentChunk
{
get
{
Contract.Ensures(Contract.Result<int>() > ZERO);
int remainingBytesInCurrentChunk = CurrentChunk.Length - PositionInChunk;
return remainingBytesInCurrentChunk;
}
}
#endregion
#region Instance Methods
public override void Flush()
{
}
public override int Read(byte[] buffer, int offset, int count)
{
// Check for null first; otherwise buffer.Length throws a NullReferenceException.
if (buffer == null)
throw new ArgumentNullException();
if (offset + count > buffer.Length)
throw new ArgumentException();
if (offset < ZERO || count < ZERO)
throw new ArgumentOutOfRangeException();
if (!CanRead)
throw new NotSupportedException();
int bytesToRead = count;
if (_length - _position < bytesToRead)
bytesToRead = Convert.ToInt32(_length - _position);
int bytesreaded = 0;
while (bytesToRead > ZERO)
{
// get remaining bytes in current chunk
// read bytes in current chunk
// advance to next position
int remainingBytesInCurrentChunk = RemainingBytesInCurrentChunk;
if (remainingBytesInCurrentChunk > bytesToRead)
remainingBytesInCurrentChunk = bytesToRead;
Array.Copy(CurrentChunk, PositionInChunk, buffer, offset, remainingBytesInCurrentChunk);
//move position in source
_position += remainingBytesInCurrentChunk;
//move position in target
offset += remainingBytesInCurrentChunk;
//bytesToRead is smaller
bytesToRead -= remainingBytesInCurrentChunk;
//count readed bytes;
bytesreaded += remainingBytesInCurrentChunk;
}
return bytesreaded;
}
public override long Seek(long offset, SeekOrigin origin)
{
switch (origin)
{
case SeekOrigin.Begin:
Position = offset;
break;
case SeekOrigin.Current:
Position += offset;
break;
case SeekOrigin.End:
Position = Length + offset;
break;
}
return Position;
}
private long Capacity
{
get
{
int numberOfChunks = _chunks.Count;
// Multiply as long: int arithmetic would overflow once the stream reaches 2 GB.
long capacity = (long)numberOfChunks * BUFFER_LENGTH;
return capacity;
}
}
public override void SetLength(long value)
{
if (value > _length)
{
while (value > Capacity)
{
var item = new byte[BUFFER_LENGTH];
_chunks.Add(item);
}
}
else if (value < _length)
{
var decimalValue = Convert.ToDecimal(value);
var valueToBeCompared = decimalValue % BUFFER_LENGTH == ZERO ? Capacity : Capacity - BUFFER_LENGTH;
//remove data chunks, but leave at least two chunks
while (value < valueToBeCompared && _chunks.Count > TWO)
{
byte[] lastChunk = _chunks.Last();
_chunks.Remove(lastChunk);
}
}
_length = value;
if (_position > _length - ONE)
_position = _length == 0 ? ZERO : _length - ONE;
}
public override void Write(byte[] buffer, int offset, int count)
{
if (!CanWrite)
throw new NotSupportedException();
int bytesToWrite = count;
while (bytesToWrite > ZERO)
{
//get remaining space in current chunk
int remainingBytesInCurrentChunk = RemainingBytesInCurrentChunk;
//if count of bytes to be written is fewer than remaining
if (remainingBytesInCurrentChunk > bytesToWrite)
remainingBytesInCurrentChunk = bytesToWrite;
//if remaining bytes is still greater than zero
if (remainingBytesInCurrentChunk > ZERO)
{
//write remaining bytes to current Chunk
Array.Copy(buffer, offset, CurrentChunk, PositionInChunk, remainingBytesInCurrentChunk);
//change offset of source array
offset += remainingBytesInCurrentChunk;
//change bytes to write
bytesToWrite -= remainingBytesInCurrentChunk;
//advance position; grow length only when writing past the current end
_position += remainingBytesInCurrentChunk;
if (_position > _length)
_length = _position;
}
if (Capacity == _position)
_chunks.Add(new byte[BUFFER_LENGTH]);
}
}
/// <summary>
/// Gets entire content of stream regardless of Position value and return output as byte array
/// </summary>
/// <returns>byte array</returns>
public byte[] ToArray()
{
var outputArray = new byte[Length];
if (outputArray.Length != ZERO)
{
long outputPosition = ZERO;
foreach (byte[] chunk in _chunks)
{
var remainingLength = (Length - outputPosition) > chunk.Length
? chunk.Length
: Length - outputPosition;
Array.Copy(chunk, ZERO, outputArray, outputPosition, remainingLength);
outputPosition = outputPosition + remainingLength;
}
}
return outputArray;
}
/// <summary>
/// Method set Position to first element and write entire stream to another
/// </summary>
/// <param name="stream">Target stream</param>
public void WriteTo(Stream stream)
{
Contract.Requires(stream != null);
Position = ZERO;
var buffer = new byte[BUFFER_LENGTH];
int bytesReaded;
do
{
bytesReaded = Read(buffer, ZERO, BUFFER_LENGTH);
stream.Write(buffer, ZERO, bytesReaded);
} while (bytesReaded > ZERO);
}
#endregion
}
Here is a full implementation:
/// <summary>
/// Defines a MemoryStream that does not sit on the Large Object Heap, thus avoiding memory fragmentation.
/// </summary>
/// <seealso cref="Stream" />
public sealed class ChunkedMemoryStream : Stream
{
/// <summary>
/// Defines the default chunk size. Currently defined as 0x10000.
/// </summary>
public const int DefaultChunkSize = 0x10000; // needs to be < 85000
private const int _lohSize = 85000;
private List<byte[]> _chunks = new List<byte[]>();
private long _position;
private int _chunkSize;
private int _lastChunkPos;
private int _lastChunkPosIndex;
/// <summary>
/// Initializes a new instance of the <see cref="ChunkedMemoryStream" /> class based on the specified byte array.
/// </summary>
/// <param name="chunkSize">Size of the underlying chunks.</param>
/// <param name="buffer">The array of unsigned bytes from which to create the current stream.</param>
public ChunkedMemoryStream(int chunkSize = DefaultChunkSize, byte[] buffer = null)
{
FreeOnDispose = true;
ChunkSize = chunkSize;
_chunks.Add(new byte[chunkSize]);
if (buffer != null)
{
Write(buffer, 0, buffer.Length);
Position = 0;
}
}
/// <summary>
/// Gets or sets a value indicating whether to free the underlying chunks on dispose.
/// </summary>
/// <value>
/// <c>true</c> if the underlying chunks must be freed on disposal; otherwise, <c>false</c>.
/// </value>
public bool FreeOnDispose { get; set; }
/// <summary>
/// Releases the unmanaged resources used by the <see cref="Stream" /> and optionally releases the managed resources.
/// </summary>
/// <param name="disposing">true to release both managed and unmanaged resources; false to release only unmanaged resources.</param>
protected override void Dispose(bool disposing)
{
if (FreeOnDispose)
{
if (_chunks != null)
{
_chunks = null;
_chunkSize = 0;
_position = 0;
}
}
base.Dispose(disposing);
}
/// <summary>
/// When overridden in a derived class, clears all buffers for this stream and causes any buffered data to be written to the underlying device.
/// This implementation does nothing.
/// </summary>
public override void Flush()
{
// do nothing
}
/// <summary>
/// When overridden in a derived class, reads a sequence of bytes from the current stream and advances the position within the stream by the number of bytes read.
/// </summary>
/// <param name="buffer">An array of bytes. When this method returns, the buffer contains the specified byte array with the values between <paramref name="offset" /> and (<paramref name="offset" /> + <paramref name="count" /> - 1) replaced by the bytes read from the current source.</param>
/// <param name="offset">The zero-based byte offset in <paramref name="buffer" /> at which to begin storing the data read from the current stream.</param>
/// <param name="count">The maximum number of bytes to be read from the current stream.</param>
/// <returns>
/// The total number of bytes read into the buffer. This can be less than the number of bytes requested if that many bytes are not currently available, or zero (0) if the end of the stream has been reached.
/// </returns>
/// <exception cref="ArgumentNullException"><paramref name="buffer" /> is null.</exception>
/// <exception cref="ArgumentOutOfRangeException"><paramref name="offset" /> or <paramref name="count" /> is negative.</exception>
/// <exception cref="ArgumentException">The sum of <paramref name="offset" /> and <paramref name="count" /> is larger than the buffer length.</exception>
/// <exception cref="ObjectDisposedException">Methods were called after the stream was closed.</exception>
public override int Read(byte[] buffer, int offset, int count)
{
if (buffer == null)
throw new ArgumentNullException(nameof(buffer));
if (offset < 0)
throw new ArgumentOutOfRangeException(nameof(offset));
if (count < 0)
throw new ArgumentOutOfRangeException(nameof(count));
if ((buffer.Length - offset) < count)
throw new ArgumentException(null, nameof(count));
CheckDisposed();
var chunkIndex = (int)(_position / ChunkSize);
if (chunkIndex == _chunks.Count)
return 0;
var chunkPos = (int)(_position % ChunkSize);
count = (int)Math.Min(count, Length - _position);
if (count == 0)
return 0;
var left = count;
var inOffset = offset;
var total = 0;
do
{
var toCopy = Math.Min(left, ChunkSize - chunkPos);
Buffer.BlockCopy(_chunks[chunkIndex], chunkPos, buffer, inOffset, toCopy);
inOffset += toCopy;
left -= toCopy;
total += toCopy;
if ((chunkPos + toCopy) == ChunkSize)
{
if (chunkIndex == (_chunks.Count - 1))
{
// last chunk
break;
}
chunkPos = 0;
chunkIndex++;
}
else
{
chunkPos += toCopy;
}
}
while (left > 0);
_position += total;
return total;
}
/// <summary>
/// Reads a byte from the stream and advances the position within the stream by one byte, or returns -1 if at the end of the stream.
/// </summary>
/// <returns>
/// The unsigned byte cast to an Int32, or -1 if at the end of the stream.
/// </returns>
/// <exception cref="ObjectDisposedException">Methods were called after the stream was closed.</exception>
public override int ReadByte()
{
CheckDisposed();
if (_position >= Length)
return -1;
var ret = _chunks[(int)(_position / ChunkSize)][_position % ChunkSize];
_position++;
return ret;
}
/// <summary>
/// When overridden in a derived class, sets the position within the current stream.
/// </summary>
/// <param name="offset">A byte offset relative to the <paramref name="origin" /> parameter.</param>
/// <param name="origin">A value of type <see cref="SeekOrigin" /> indicating the reference point used to obtain the new position.</param>
/// <returns>The new position within the current stream.</returns>
/// <exception cref="ObjectDisposedException">Methods were called after the stream was closed.</exception>
public override long Seek(long offset, SeekOrigin origin)
{
CheckDisposed();
switch (origin)
{
case SeekOrigin.Begin:
Position = offset;
break;
case SeekOrigin.Current:
Position += offset;
break;
case SeekOrigin.End:
Position = Length + offset;
break;
}
return Position;
}
private void CheckDisposed()
{
if (_chunks == null)
throw new ObjectDisposedException(null, "Cannot access a disposed stream.");
}
/// <summary>
/// When overridden in a derived class, sets the length of the current stream.
/// </summary>
/// <param name="value">The desired length of the <paramref name="value" /> stream in bytes.</param>
/// <exception cref="ArgumentOutOfRangeException"><paramref name="value" /> is out of range.</exception>
/// <exception cref="ObjectDisposedException">Methods were called after the stream was closed.</exception>
public override void SetLength(long value)
{
CheckDisposed();
if (value < 0)
throw new ArgumentOutOfRangeException(nameof(value));
if (value > Length)
throw new ArgumentOutOfRangeException(nameof(value));
var needed = value / ChunkSize;
if ((value % ChunkSize) != 0)
{
needed++;
}
if (needed > int.MaxValue)
throw new ArgumentOutOfRangeException(nameof(value));
if (needed < _chunks.Count)
{
var remove = (int)(_chunks.Count - needed);
for (var i = 0; i < remove; i++)
{
_chunks.RemoveAt(_chunks.Count - 1);
}
}
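// Keep the end-of-data marker consistent with the remaining chunks: a length that is
// an exact multiple of ChunkSize means the last chunk is full, not empty.
_lastChunkPosIndex = Math.Max(0, (int)needed - 1);
_lastChunkPos = (value > 0 && (value % ChunkSize) == 0) ? ChunkSize : (int)(value % ChunkSize);
if (_position > value)
_position = value;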
}
/// <summary>
/// Converts the current stream to a byte array.
/// </summary>
/// <returns>
/// An array of bytes
/// </returns>
public byte[] ToArray()
{
CheckDisposed();
var bytes = new byte[Length];
var offset = 0;
for (var i = 0; i < _chunks.Count; i++)
{
var count = (i == (_chunks.Count - 1)) ? _lastChunkPos : _chunks[i].Length;
if (count > 0)
{
Buffer.BlockCopy(_chunks[i], 0, bytes, offset, count);
offset += count;
}
}
return bytes;
}
/// <summary>
/// When overridden in a derived class, writes a sequence of bytes to the current stream and advances the current position within this stream by the number of bytes written.
/// </summary>
/// <param name="buffer">An array of bytes. This method copies <paramref name="count" /> bytes from <paramref name="buffer" /> to the current stream.</param>
/// <param name="offset">The zero-based byte offset in <paramref name="buffer" /> at which to begin copying bytes to the current stream.</param>
/// <param name="count">The number of bytes to be written to the current stream.</param>
/// <exception cref="ArgumentException">The sum of <paramref name="offset" /> and <paramref name="count" /> is greater than the buffer length.</exception>
/// <exception cref="ArgumentNullException"><paramref name="buffer" /> is null.</exception>
/// <exception cref="ArgumentOutOfRangeException"><paramref name="offset" /> or <paramref name="count" /> is negative.</exception>
/// <exception cref="ObjectDisposedException">Methods were called after the stream was closed.</exception>
public override void Write(byte[] buffer, int offset, int count)
{
if (buffer == null)
throw new ArgumentNullException(nameof(buffer));
if (offset < 0)
throw new ArgumentOutOfRangeException(nameof(offset));
if (count < 0)
throw new ArgumentOutOfRangeException(nameof(count));
if ((buffer.Length - offset) < count)
throw new ArgumentException(null, nameof(count));
CheckDisposed();
var chunkPos = (int)(_position % ChunkSize);
var chunkIndex = (int)(_position / ChunkSize);
if (chunkIndex == _chunks.Count)
{
_chunks.Add(new byte[ChunkSize]);
}
var left = count;
var inOffset = offset;
do
{
var copied = Math.Min(left, ChunkSize - chunkPos);
Buffer.BlockCopy(buffer, inOffset, _chunks[chunkIndex], chunkPos, copied);
inOffset += copied;
left -= copied;
if ((chunkPos + copied) == ChunkSize)
{
chunkIndex++;
chunkPos = 0;
if (chunkIndex == _chunks.Count)
{
_chunks.Add(new byte[ChunkSize]);
}
}
else
{
chunkPos += copied;
}
}
while (left > 0);
_position += count;
if (chunkIndex == (_chunks.Count - 1))
{
if (chunkIndex > _lastChunkPosIndex || (chunkIndex == _lastChunkPosIndex && chunkPos > _lastChunkPos))
{
_lastChunkPos = chunkPos;
_lastChunkPosIndex = chunkIndex;
}
}
}
/// <summary>
/// Writes a byte to the current position in the stream and advances the position within the stream by one byte.
/// </summary>
/// <param name="value">The byte to write to the stream.</param>
/// <exception cref="ObjectDisposedException">Methods were called after the stream was closed.</exception>
public override void WriteByte(byte value)
{
CheckDisposed();
var chunkIndex = (int)(_position / ChunkSize);
var chunkPos = (int)(_position % ChunkSize);
// chunkPos is always below ChunkSize here, so the only case to handle is a position
// that sits exactly at the end of the last allocated chunk.
if (chunkIndex == _chunks.Count)
{
_chunks.Add(new byte[ChunkSize]);
}
_chunks[chunkIndex][chunkPos++] = value;
_position++;
if (chunkIndex == (_chunks.Count - 1))
{
if (chunkIndex > _lastChunkPosIndex || (chunkIndex == _lastChunkPosIndex && chunkPos > _lastChunkPos))
{
_lastChunkPos = chunkPos;
_lastChunkPosIndex = chunkIndex;
}
}
}
/// <summary>
/// Writes to the specified stream.
/// </summary>
/// <param name="stream">The stream.</param>
/// <exception cref="ArgumentNullException"><paramref name="stream" /> is null.</exception>
public void WriteTo(Stream stream)
{
if (stream == null)
throw new ArgumentNullException(nameof(stream));
CheckDisposed();
for (var i = 0; i < _chunks.Count; i++)
{
var count = i == (_chunks.Count - 1) ? _lastChunkPos : _chunks[i].Length;
stream.Write(_chunks[i], 0, count);
}
}
/// <summary>
/// When overridden in a derived class, gets a value indicating whether the current stream supports reading.
/// </summary>
public override bool CanRead => true;
/// <summary>
/// When overridden in a derived class, gets a value indicating whether the current stream supports seeking.
/// </summary>
public override bool CanSeek => true;
/// <summary>
/// When overridden in a derived class, gets a value indicating whether the current stream supports writing.
/// </summary>
public override bool CanWrite => true;
/// <summary>
/// When overridden in a derived class, gets the length in bytes of the stream.
/// </summary>
/// <exception cref="ObjectDisposedException">Methods were called after the stream was closed.</exception>
public override long Length
{
get
{
CheckDisposed();
if (_chunks.Count == 0)
return 0;
return (long)(_chunks.Count - 1) * ChunkSize + _lastChunkPos;
}
}
/// <summary>
/// Gets or sets the size of the underlying chunks. Cannot be greater than or equal to 85000.
/// </summary>
/// <value>
/// The chunks size.
/// </value>
/// <exception cref="ArgumentOutOfRangeException"><paramref name="value" /> is out of range.</exception>
public int ChunkSize
{
get => _chunkSize;
set
{
if (value <= 0 || value >= _lohSize)
throw new ArgumentOutOfRangeException(nameof(value));
_chunkSize = value;
}
}
/// <summary>
/// When overridden in a derived class, gets or sets the position within the current stream.
/// </summary>
/// <exception cref="ArgumentOutOfRangeException"><paramref name="value" /> is out of range.</exception>
/// <exception cref="ObjectDisposedException">Methods were called after the stream was closed.</exception>
public override long Position
{
get
{
CheckDisposed();
return _position;
}
set
{
CheckDisposed();
if (value < 0)
throw new ArgumentOutOfRangeException(nameof(value));
if (value > Length)
throw new ArgumentOutOfRangeException(nameof(value));
_position = value;
}
}
}
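The arithmetic that maps a stream position to a chunk, used throughout the class above, can be looked at in isolation. This is a minimal sketch: the ChunkMath class and its Locate and ChunksNeeded helpers are hypothetical names for this example, and the chunk size of 0x10000 is simply the default from the class above.

```csharp
using System;

public static class ChunkMath
{
    // Same value as DefaultChunkSize above; below the 85000-byte LOH threshold.
    public const int ChunkSize = 0x10000;

    // Splits an absolute stream position into (chunk index, offset inside that chunk).
    public static void Locate(long position, out int chunkIndex, out int chunkOffset)
    {
        chunkIndex = (int)(position / ChunkSize);
        chunkOffset = (int)(position % ChunkSize);
    }

    // Number of chunks required to hold 'length' bytes (rounds up).
    public static long ChunksNeeded(long length)
    {
        return (length + ChunkSize - 1) / ChunkSize;
    }

    public static void Main()
    {
        long threeGb = 3L * 1024 * 1024 * 1024;
        int index, offset;
        Locate(threeGb, out index, out offset);
        Console.WriteLine("3 GB position -> chunk " + index + ", offset " + offset);
        Console.WriteLine("Chunks needed for 3 GB: " + ChunksNeeded(threeGb));
    }
}
```

Because the chunk index is an int and each chunk is 64 KB, the addressable range far exceeds the 2 GB limit of a single byte array, which is the point of chunking in the first place.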
You should use the UnmanagedMemoryStream when dealing with over 2GB chunks of memory, as MemoryStream is limited to 2GB, and the UnmanagedMemoryStream was made to deal with this problem.
SparseMemoryStream does this in .NET. It is buried deep down in an internal class library, though. The source code is available, of course, since Microsoft put it all out there as open source.
You can grab the code for it here: http://www.dotnetframework.org/default.aspx/4@0/4@0/DEVDIV_TFS/Dev10/Releases/RTMRel/wpf/src/Base/MS/Internal/IO/Packaging/SparseMemoryStream@cs/1305600/SparseMemoryStream@cs
That being said, I highly recommend not using it as is. At the very least, remove all the calls to IsolatedStorage for starters, as this seems to be the cause of no end of bugs* in the framework's packaging API.
(*: In addition to spreading the data around in streams, if it gets too big, it basically reinvents swap files for some reason -- in the user's Isolated Storage, no less -- and coincidentally, most MS products that allow .NET-based add-ins do not have their app domains set up in such a way that you can access Isolated Storage. VSTO add-ins are notorious for suffering from this issue, for example.)
Another implementation of a chunked stream could be considered as a stock MemoryStream replacement. Additionally, it allows allocating a single large byte array on the LOH to be used as a "chunk" pool shared between all ChunkedStream instances...
https://github.com/ImmortalGAD/ChunkedStream
