Get Byte array from very large integer represented as string - c#

I have a very large prime number (for RSA purposes) that needs to be converted to a byte array. The number however is currently stored as a string. I'm OK with storing it as a byte[] but either way the number is a string and I have to get it into a byte array.
Now to be clear I have used the RSA encryption and decryption sample data provided on MSDN and everything works so I have a high degree of confidence that the encryption portion is fine. Further the samples provided by MSDN provide prime numbers that have already been turned into byte[]. Thus I have a high degree of confidence that the breakdown is in MY conversion of the string representation of the number to a byte[].
I currently do this:
private static string _publicKeyExponent = "12345...310 digits......9876";
private static string _publicKeyModulus = "654782....620 digits.....4576";
_rsaPublicKey.Exponent = CoreHelpers.GetBytes(_publicKeyExponent);
And here is my GetBytes method that I suspect is causing the issue as it is getting the bytes of STRING characters NOT digits.
public static byte[] GetBytes(string str)
{
byte[] bytes = new byte[str.Length * sizeof(char)];
System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
return bytes;
}
Now, if I have already identified the problem, fixing it should be straightforward, no? Well, for me, yes and no. I don't know of any type in C# that I can parse a number of this size into. The best idea I can come up with is to break the string into smaller chunks of, say, 10 chars, parse each chunk to an Int32, and get the bytes of that. Add them to some master byte array and repeat.

You could use the BigInteger struct.
It contains numerous Parse static methods and the ToByteArray method.
Sample code:
public static byte[] GetBytes(string str)
{
BigInteger number;
return BigInteger.TryParse(str, out number) ? number.ToByteArray() : null;
}
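One caveat worth checking before passing the result to RSA: BigInteger.ToByteArray returns little-endian two's-complement bytes, sometimes with an extra 0x00 sign byte at the end, whereas RSAParameters fields such as Modulus and Exponent are expected in big-endian order. A minimal sketch of that conversion, assuming the string holds a non-negative decimal number (the class and method names are hypothetical, not from the question):

```csharp
using System;
using System.Globalization;
using System.Numerics;

public static class BigIntBytes
{
    // Hypothetical helper: decimal digit string -> unsigned big-endian bytes.
    public static byte[] ToBigEndianUnsigned(string decimalDigits)
    {
        // NumberStyles.None accepts digits only, so signs and whitespace are rejected.
        BigInteger number = BigInteger.Parse(
            decimalDigits, NumberStyles.None, CultureInfo.InvariantCulture);

        // Little-endian, two's complement; may end with a 0x00 sign byte.
        byte[] littleEndian = number.ToByteArray();

        int length = littleEndian.Length;
        if (length > 1 && littleEndian[length - 1] == 0)
        {
            length--; // drop the sign byte
        }

        // Reverse into big-endian order.
        byte[] bigEndian = new byte[length];
        for (int i = 0; i < length; i++)
        {
            bigEndian[i] = littleEndian[length - 1 - i];
        }
        return bigEndian;
    }
}
```

With this, the common public exponent "65537" becomes the three bytes 01 00 01, which matches the form the MSDN samples use.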

Related

How to encode a string into an existing byte buffer without allocating a new array each time?

I currently have this simple code:
class Thing {
public string Message { get; set; }
public byte[] GetBytes() => Encoding.Default.GetBytes(Message);
}
In my case I want to add these bytes into a larger, pre-allocated buffer. I can't see any way to implement GetBytes that avoids allocating an array each time, since Encoding.GetBytes seems quite limited. Am I missing something? If I am dealing with a lot of Thing instances, do I just have to accept this performance hit?
I would have preferred to be able to write something like GetBytes(byte[] buffer) or, even better, GetBytes(ArraySegment<byte> buffer).
You can use the overload int GetBytes (string s, int charIndex, int charCount, byte[] bytes, int byteIndex);:
public int GetBytes(byte[] array) =>
encoding.GetBytes(Message, 0, Message.Length, array, 0);
Or:
public int GetBytes(ArraySegment<byte> segment) =>
encoding.GetBytes(Message, 0, Message.Length, segment.Array, segment.Offset);
Note that there's no checking whether you've exceeded the portion of the array described by the ArraySegment: if you've got an ArraySegment which ends before the underlying array, this will happily write up to the end of the array. You might want to add a check, using the return value of encoding.GetBytes.
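The check mentioned above could be sketched as follows. This is only one way to do it: it measures the encoded size up front with GetByteCount instead of inspecting the return value afterwards, so nothing is written when the segment is too small. The encoding field is assumed to be UTF-8 here purely for illustration.

```csharp
using System;
using System.Text;

public class Thing
{
    // Assumption: the original answer does not say which encoding is used.
    private static readonly Encoding encoding = Encoding.UTF8;

    public string Message { get; set; }

    public int GetBytes(ArraySegment<byte> segment)
    {
        // Refuse to write past the end of the segment, even if the
        // underlying array has room beyond it.
        int needed = encoding.GetByteCount(Message);
        if (needed > segment.Count)
        {
            throw new ArgumentException("Segment is too small for the encoded message.", nameof(segment));
        }

        return encoding.GetBytes(Message, 0, Message.Length, segment.Array, segment.Offset);
    }
}
```

The cost is a second pass over the string to count bytes, which still avoids any allocation.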
Alternatively, if you're using .NET Core 2.1 or higher, you've got access to int GetBytes (ReadOnlySpan<char> chars, Span<byte> bytes) (allowing you to use a Span<byte> instead of an ArraySegment<byte>):
public int GetBytes(Span<byte> span) =>
encoding.GetBytes(Message, span);

Encode unicode string as byte array C++ and C#

I have C++ code which I want to rewrite to C#. This part
case ID_TYPE_UNICODE_STRING :
if(items[i].GetUString().length() > 0xFFFF)
throw dppError("error");
//GetUstring returns std::wstring type object
DataSize = (WORD) (sizeof(WCHAR)*(items[i].GetUString().length()));
blob.AppendData((const BYTE *) &DataSize, sizeof(WORD)); //blob is byte array
//GetUstring returns std::wstring type object
blob.AppendData((const BYTE *) items[i].GetUString().c_str(), DataSize);
break ;
basically serializes length in bytes of unicode string and string itself to byte array.
Here comes my problem (this code then sends this data to server). I don't know which encoding is used in above lines of code(UTF16, UTF8, etc.).
So I don't know what is the best way to reimplement it in C#.
How can I guess what encoding is used in this C++ project?
And if I can't find encoding used in C++ project, given endianness is same as stated in accepted answer of this question, do you think the two methods (GetBytes and GetString) in accepted answer will work for me (for serializing the unicode string as in C++ project and retrieving it back)? e.g.
these two:
static byte[] GetBytes(string str)
{
byte[] bytes = new byte[str.Length * sizeof(char)];
System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
return bytes;
}
static string GetString(byte[] bytes)
{
char[] chars = new char[bytes.Length / sizeof(char)];
System.Buffer.BlockCopy(bytes, 0, chars, 0, bytes.Length);
return new string(chars);
}
Or am I better off learning which encoding is used in the C++ project?
I will then need to reconstruct the string in the same way from the byte array too. And if I am better off learning which encoding was used in C++, how do I get the length of the string in bytes in C#, using System.Text.ASCII.WhateverEncodingWasUsedinC++.GetByteCount(string); ??
PS. Do you think the C++ code is working in encoding agnostic way? If yes, how can I repeat that also in C#?
UPDATE: I am guessing the encoding used is UTF16 because I saw that being mentioned in several variables names, so I think I will assume UTF16 is used, and if something doesn't work out during testing, look for alternative solutions. In that case, what is the best way to get the number of bytes of the UTF16 string? Is following method OK: System.Text.ASCII.Unicode.GetByteCount(string); ??
feedback and comments welcome. Am I wrong somewhere in my reasoning? Thanks
Change the method bodies as follows to get the byte[] equivalent of the input string. UnicodeEncoding is UTF-16, little-endian by default.
static byte[] GetBytes(string str)
{
UnicodeEncoding uEncoding = new UnicodeEncoding();
return uEncoding.GetBytes(str);
}
For the reverse:
static string GetString(byte[] bytes)
{
UnicodeEncoding uEncoding = new UnicodeEncoding();
return uEncoding.GetString(bytes);
}
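To mirror the C++ snippet from the question, a sketch might write a 2-byte length (in bytes) followed by the UTF-16 payload. This assumes the C++ side runs on Windows, where WCHAR is 2 bytes and the WORD is stored little-endian; the class and method names are hypothetical. Unlike the C++ cast to WORD, this version throws instead of silently truncating when the byte length does not fit in 16 bits.

```csharp
using System;
using System.Text;

public static class UStringBlob
{
    // Hypothetical helper: [WORD byteLength][UTF-16LE bytes].
    public static byte[] Serialize(string value)
    {
        // Encoding.Unicode is UTF-16 little-endian; no BOM is emitted by GetBytes.
        int byteCount = Encoding.Unicode.GetByteCount(value);
        if (byteCount > ushort.MaxValue)
        {
            throw new ArgumentException("String is too long for a 16-bit byte length.", nameof(value));
        }

        byte[] blob = new byte[2 + byteCount];
        blob[0] = (byte)(byteCount & 0xFF);        // low byte of the length
        blob[1] = (byte)((byteCount >> 8) & 0xFF); // high byte of the length
        Encoding.Unicode.GetBytes(value, 0, value.Length, blob, 2);
        return blob;
    }

    public static string Deserialize(byte[] blob)
    {
        int byteCount = blob[0] | (blob[1] << 8);
        return Encoding.Unicode.GetString(blob, 2, byteCount);
    }
}
```

Comparing the output for a short known string against a blob captured from the C++ program would confirm or refute the UTF-16 assumption.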

Does RNGCryptoServiceProvider.GetBytes write a sequential stream of bits to the byte[] result

I've been trying to figure out the implementation of RNGCryptoServiceProvider.GetBytes to answer the title question.
To rephrase, does the GetBytes method generate a stream/array of bits, and then just sequentially write them to the byte array, such that I could recreate the original stream/array by looping over the bytes in order and looking at the bits in each byte?
I checked the source as found here: rngcryptoserviceprovider.cs, but it calls out to the CLR apparently, and I don't know how to get the source for that.
[DllImport(JitHelpers.QCall, CharSet = CharSet.Unicode), SuppressUnmanagedCodeSecurity]
[ResourceExposure(ResourceScope.None)]
private static extern void GetBytes(SafeProvHandle hProv, byte[] randomBytes, int count);
After reading the comments for more information, I think I know what you're asking...
So actually, you should use the RNGCryptoServiceProvider's GetNonZeroBytes(byte[] data) method. This will prevent you from receiving 8 zero bits in a row, which would throw off your statistical methods. Then, as blorgbeard-is-out has pointed out, "The bytes are random and evenly distributed, each bit should also be random and evenly distributed."
Also, the RNGCryptoServiceProvider can be slightly tricky to use, so I have provided a short example, along with how to convert the byte array into bits:
/* A static member inside a class somewhere. Only instantiate this once per execution of your application. */
static RNGCryptoServiceProvider _rng = new RNGCryptoServiceProvider();
/* Inside a method */
byte[] rngBytes4 = new byte[4];
_rng.GetNonZeroBytes(rngBytes4); // This populates the byte array
BitArray randomBits = new BitArray(rngBytes4);
foreach(bool bit in randomBits)
{
// Do something with bit here
}
// And, because it's not obvious how to get BitArray to return you a list:
List<bool> randomBitsAsList = randomBits.Cast<bool>().ToList();
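If the order of the bits matters, note that BitArray(byte[]) enumerates each byte starting from its least significant bit. A small sketch with fixed bytes instead of random ones makes the order visible (the class and method names are hypothetical):

```csharp
using System;
using System.Collections;

public static class BitOrderDemo
{
    // Renders the bits in the order BitArray enumerates them.
    public static string ToBitString(byte[] bytes)
    {
        var bits = new BitArray(bytes);
        var chars = new char[bits.Length];
        for (int i = 0; i < bits.Length; i++)
        {
            chars[i] = bits[i] ? '1' : '0';
        }
        return new string(chars);
    }
}
```

For the bytes 0x01, 0x80 this gives "1000000000000001": bytes appear in array order, but within each byte the lowest bit comes first.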

Converting byte array to string with correct encoding

I have this bit of C# code that I have translated to VB using http://www.developerfusion.com/tools/convert/csharp-to-vb/
private string DecodeToken (string token, string key)
{
byte [] buffer = new byte[0];
string decoded = "";
int i;
if (Scramble (Convert.FromBase64String(token), key, ref buffer))
{
for (i=0;i<buffer.Length;i++)
{
decoded += Convert.ToString((char)buffer[i]);
}
}
return(decoded);
}
Which, after a little modification, gives this:
Private Function DecodeToken(token As String, key As String) As String
Dim buffer As Byte()
Dim decoded As String = ""
Dim index As Integer
If Scramble(Convert.FromBase64String(token), key, buffer) Then
For index = 0 To buffer.Length - 1
decoded += Convert.ToString(ChrW(buffer(index)))
Next
'decoded = UTF8Encoding.ASCII.GetString(pbyBuffer)
'decoded = UnicodeEncoding.ASCII.GetString(pbyBuffer)
'decoded = ASCIIEncoding.ASCII.GetString(pbyBuffer)
End If
Return decoded
End Function
Scramble just rearranges the array in a specific way and I've checked the VB and C# outputs against each other so it can be ignored. Its inputs and outputs are byte arrays so it shouldn't affect the encoding.
The problem lies in that the result of this function is fed into a hashing algorithm which is then compared against the hashing signature. The result of the VB version, when hashed, does not match to the signature.
You can see from the comments that I've attempted to use different encodings to get the byte buffer out as a string but none of these have worked.
The problem appears to lie in the translation of decoded += Convert.ToString((char)buffer[i]); to decoded += Convert.ToString(ChrW(buffer(index))).
Does ChrW produce the same result as casting as a char and which encoding will correctly duplicate the reading of the byte array?
Edit: I always have Option Strict On but it's possible that the original C# doesn't so it may be affected by implicit conversion. What does the compiler do in that situation?
Quick answer
decoded += Convert.ToString((char)buffer[i]);
is equivalent to
decoded &= Convert.ToString(Chr(buffer(index)))
VB.NET stops you from taking the hacky approach used in the C# code; a Char is Unicode, so it consists of two bytes.
This looks like a better implementation of what you have.
Private Function DecodeToken(encodedToken As String, key As String) As String
Dim scrambled = Convert.FromBase64String(encodedToken)
Dim buffer As Byte()
Dim index As Integer
If Not Scramble(scrambled, key, buffer) Then
Return Nothing
End If
Dim descrambled = New StringBuilder(buffer.Length)
For index = 0 To buffer.Length - 1
descrambled.Append(Chr(buffer(index)))
Next
Return descrambled.ToString()
End Function
Have you tried the most direct code translation:
decoded += Convert.ToString(CType(buffer(index), Char))
When converting a byte array to a string you should really make sure you know the encoding first though. If this is set in whatever is providing the byte array then you should use that to decode the string.
For more details on the ChrW (and Chr) functions look at http://msdn.microsoft.com/en-us/library/613dxh46%28v=vs.80%29.aspx . In essence ChrW assumes that the passed int is a unicode codepoint which may not be a valid assumption (I believe from 0 to 127 this wouldn't matter but the upper half of the byte might be different). if this is the problem then it will likely be accented and other such "special" characters that are causing the problem.
Give the following a go:
decoded += Convert.ToChar(foo)
It will work (unlike my last attempt that made assumptions about implicit conversions being framework specific and not language specific) but I can't guarantee that it will be the same as the .NET.
Given you say in comments you expected to use Encoding.xxx.GetString then why don't you use that? Do you know what the encoding was in the original string to byte array? If so then just use that. It is the correct way to convert a byte array to a string anyway since doing it byte by byte will definitely break for any multi-byte characters (clearly).
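For reference, casting each byte to char as the original C# does maps byte value N to code point U+00NN, which is what ISO-8859-1 (Latin-1) decoding does. So if the goal is to reproduce the C# behavior exactly, one candidate is that encoding. A sketch that checks the equivalence over all 256 byte values (the class and method names are hypothetical):

```csharp
using System;
using System.Text;

public static class Latin1Demo
{
    // Same as the loop in the original C# DecodeToken.
    public static string DecodeByCast(byte[] buffer)
    {
        var sb = new StringBuilder(buffer.Length);
        foreach (byte b in buffer)
        {
            sb.Append((char)b);
        }
        return sb.ToString();
    }

    // Decodes the whole buffer in one call using ISO-8859-1.
    public static string DecodeLatin1(byte[] buffer)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(buffer);
    }
}
```

The same Encoding.GetEncoding("ISO-8859-1").GetString call is available from VB, which avoids the per-byte loop entirely.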
A small improvement
Private Function DecodeToken(encodedToken As String, key As String) As String
Dim scrambled = Convert.FromBase64String(encodedToken)
Dim buffer As Byte()
Dim index As Integer
If Not Scramble(scrambled, key, buffer) Then
Return Nothing
End If
Dim descrambled = System.Text.Encoding.Unicode.GetString(buffer, 0, buffer.Length)
Return descrambled
End Function

C# - RSACryptoServiceProvider Decrypt into a SecureString instead of byte array

I have a method that currently returns a string converted from a byte array:
public static readonly UnicodeEncoding ByteConverter = new UnicodeEncoding();
public static string Decrypt(string textToDecrypt, string privateKeyXml)
{
if (string.IsNullOrEmpty(textToDecrypt))
{
throw new ArgumentException(
"Cannot decrypt null or blank string"
);
}
if (string.IsNullOrEmpty(privateKeyXml))
{
throw new ArgumentException("Invalid private key XML given");
}
byte[] bytesToDecrypt = Convert.FromBase64String(textToDecrypt);
byte[] decryptedBytes;
using (var rsa = new RSACryptoServiceProvider())
{
rsa.FromXmlString(privateKeyXml);
decryptedBytes = rsa.Decrypt(bytesToDecrypt, FOAEP);
}
return ByteConverter.GetString(decryptedBytes);
}
I'm trying to update this method to instead return a SecureString, but I'm having trouble converting the return value of RSACryptoServiceProvider.Decrypt from byte[] to SecureString. I tried the following:
var secStr = new SecureString();
foreach (byte b in decryptedBytes)
{
char[] chars = ByteConverter.GetChars(new[] { b });
if (chars.Length != 1)
{
throw new Exception(
"Could not convert a single byte into a single char"
);
}
secStr.AppendChar(chars[0]);
}
return secStr;
However, using this SecureString equality tester, the resulting SecureString was not equal to the SecureString constructed from the original, unencrypted text. My Encrypt and Decrypt methods worked before, when I was just using string everywhere, and I've also tested the SecureString equality code, so I'm pretty sure the problem here is how I'm trying to convert byte[] into SecureString. Is there another route I should take for using RSA encryption that would allow me to get back a SecureString when I decrypt?
Edit: I didn't want to convert the byte array to a regular string and then stuff that string into a SecureString, because that seems to defeat the point of using a SecureString in the first place. However, is it also bad that Decrypt returns byte[] and I'm then trying to stuff that byte array into a SecureString? It's my guess that if Decrypt returns a byte[], then that's a safe way to pass around sensitive information, so converting one secure representation of the data to another secure representation seems okay.
A char and a byte can be used interchangeably with casting, so modify your second chunk of code as such:
var secStr = new SecureString();
foreach (byte b in decryptedBytes)
{
secStr.AppendChar((char)b);
}
return secStr;
This should work properly, but keep in mind that you're still bringing the unencrypted information into the "clear" in memory, so there's a point at which it could be compromised (which sort of defeats the purpose of a SecureString).
** Update **
A byte[] of your sensitive information is not secure. You can look at it in memory and see the information (especially if it's just a string). The individual bytes will be in the exact order of the string, so 'read'ing it is pretty straight-forward.
I was (actually about an hour ago) just struggling with this same issue myself, and as far as I know there is no good way to go straight from the decrypter to the SecureString unless the decrypter is specifically programmed to support this strategy.
I think the problem might be your ByteConvert.GetChars method. I can't find that class or method in the MSDN docs. I'm not sure if that is a typo, or a homegrown function. Regardless, it is most likely not interpreting the encoding of the bytes correctly. Instead, use the UTF8Encoding's GetChars method. It will properly convert the bytes back into a .NET string, assuming they were encrypted from a .NET string object originally. (If not, you'll want to use the GetChars method on the encoding that matches the original string.)
You're right that using arrays is the most secure approach. Because the decrypted representations of your secret are stored in byte or char arrays, you can easily clear them out when done, so your plaintext secret isn't left in memory. This isn't perfectly secure, but more secure than converting to a string. Strings can't be changed and they stay in memory until they are garbage collected at some indeterminate future time.
var secStr = new SecureString();
var chars = System.Text.Encoding.UTF8.GetChars(decryptedBytes);
for (int idx = 0; idx < chars.Length; ++idx)
{
secStr.AppendChar(chars[idx]);
// Clear out the chars as you go.
chars[idx] = '\0';
}
// Clear the decrypted bytes from memory, too.
Array.Clear(decryptedBytes, 0, decryptedBytes.Length);
return secStr;
Based on Coding Gorilla's answer, I tried the following in my Decrypt method:
string decryptedString1 = string.Empty;
foreach (byte b in decryptedBytes)
{
decryptedString1 += (char)b;
}
string decryptedString2 = ByteConverter.GetString(decryptedBytes);
When debugging, decryptedString1 and decryptedString2 were not equal:
decryptedString1 "m\0y\0V\0e\0r\0y\0L\0o\0n\0g\0V\03\0r\0y\05\03\0c\0r\03\07\0p\04\0s\0s\0w\00\0r\0d\0!\0!\0!\0"
decryptedString2 "myVeryLongV3ry53cr37p4ssw0rd!!!"
So it looks like I can just go through the byte[] array, do a direct cast to char, and skip \0 characters. Like Coding Gorilla said, though, this does seem to again in part defeat the point of SecureString, because the sensitive data is floating about in memory in little byte-size chunks. Any suggestions for getting RSACryptoServiceProvider.Decrypt to return a SecureString directly?
Edit: yep, this works:
var secStr = new SecureString();
foreach (byte b in decryptedBytes)
{
var c = (char)b;
if ('\0' == c)
{
continue;
}
secStr.AppendChar(c);
}
return secStr;
Edit: correction: this works with plain old English strings. Encrypting and then attempting to decrypt the string "標準語 明治維新 english やった" doesn't work as expected because the resulting decrypted string, using this foreach (byte b in decryptedBytes) technique, does not match the original unencrypted string.
Edit: using the following works for both:
var secStr = new SecureString();
foreach (char c in ByteConverter.GetChars(decryptedBytes))
{
secStr.AppendChar(c);
}
return secStr;
This still leaves a byte array and a char array of the password in memory, which sucks. Maybe I should find another RSA class that returns a SecureString. :/
What if you stuck to UTF-16?
Internally, .NET (and therefore, SecureString) uses UTF-16 (double byte) to store string contents. You could take advantage of this and translate your protected data two bytes (i.e. 1 char) at a time...
When you encrypt, peel off a char, and use Encoding.Unicode.GetBytes() to get your two bytes, and push those two bytes into your encryption stream. In reverse, when you are reading from your encrypted stream, read two bytes at a time, and use Encoding.Unicode.GetString() to get your char.
It probably sounds awful, but it keeps all the characters of your secret string from being all in one place, AND it gives you the reliability of character "size" (you won't have to guess if the next single byte is a char, or a UTF marker for a double-wide char). There's no way for an observer to know which characters go with which, nor in which order, so guessing the secret should be near impossible.
Honestly, this is just a suggested idea... I'm about to try it myself, and see how viable it is. My goal is to produce extension methods (SecureString.Encrypt and ICrypto.ToSecureString, or something like that).
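The decrypt side of this idea could be sketched roughly as below. It assumes the plaintext was encoded as UTF-16 little-endian (which is what UnicodeEncoding produces by default) and that the decrypted bytes are already in a byte[]; the helper name is hypothetical. It builds each char directly from two bytes, so no intermediate string or char[] is created, and it wipes the byte array afterwards.

```csharp
using System;
using System.Security;

public static class SecureStringHelper
{
    // Hypothetical helper: UTF-16LE bytes -> SecureString, clearing the input.
    public static SecureString FromUtf16Bytes(byte[] decryptedBytes)
    {
        if (decryptedBytes.Length % 2 != 0)
        {
            throw new ArgumentException("UTF-16 data must have an even number of bytes.", nameof(decryptedBytes));
        }

        var secStr = new SecureString();
        try
        {
            for (int i = 0; i < decryptedBytes.Length; i += 2)
            {
                // Low byte first, then high byte (little-endian).
                char c = (char)(decryptedBytes[i] | (decryptedBytes[i + 1] << 8));
                secStr.AppendChar(c);
            }
        }
        finally
        {
            // Wipe the plaintext bytes whether or not the loop succeeded.
            Array.Clear(decryptedBytes, 0, decryptedBytes.Length);
        }

        secStr.MakeReadOnly();
        return secStr;
    }
}
```

This only shortens the window in which the plaintext byte[] exists; the array returned by Decrypt still holds the secret until it is cleared.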
Use System.Text.Encoding.Default.GetString
GetString MSDN
