Node.js encrypt RSA [XML] - C#

I need to encrypt a string in Node.js using the key created in C# (XML format)
Public Key:
<RSAKeyValue><Modulus>mlDk9dIwcGJ+sS7kCOiG/xr/1RkM7v7/bUExalwSj7Q/Ul575l4cUGR1ZjC3BtEgmMZjW6xSRTCkgp0WMpdnXGmygV0mQbrAP32NTGoMoWgjTIevBbd+yOMfY8E87bUG0sYUA8+Wk55iEPk3O0Ua5FiLNWIqGTbrF2A5iSp1voc=</Modulus><Exponent>AQAB</Exponent></RSAKeyValue>
How can I do it?

You parse the XML using an XML parser, then retrieve and Base64-decode the two values: the modulus and the public exponent. Then you generate a key from those two components in whatever library you are using.
I could not directly see whether the number is encoded big- or little-endian: Microsoft doesn't specify it, but the lack of byte-reversal routines in implementations for other languages leads me to believe that the bytes represent a big-endian encoding.
The fact that the leftmost byte of the modulus is an even value is probably a good indication as well: n = p * q, and since p and q are (odd) primes, n must be odd too. An odd number's least significant byte is odd, so the even leftmost byte cannot be the least significant one, which rules out a little-endian encoding.
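As a quick sanity check in C# (assuming .NET Core 3.0 or later, where RSA.FromXmlString is available on the base RSA class; the class name XmlKeyCheck is invented for this example), .NET itself treats these values as big-endian, unsigned bytes:

using System;
using System.Security.Cryptography;

class XmlKeyCheck
{
    static void Main()
    {
        // The public key from the question.
        string xml = "<RSAKeyValue><Modulus>mlDk9dIwcGJ+sS7kCOiG/xr/1RkM7v7/bUExalwSj7Q/Ul575l4cUGR1ZjC3BtEgmMZjW6xSRTCkgp0WMpdnXGmygV0mQbrAP32NTGoMoWgjTIevBbd+yOMfY8E87bUG0sYUA8+Wk55iEPk3O0Ua5FiLNWIqGTbrF2A5iSp1voc=</Modulus><Exponent>AQAB</Exponent></RSAKeyValue>";

        using (RSA rsa = RSA.Create())
        {
            rsa.FromXmlString(xml); // parses the Base64 Modulus/Exponent
            RSAParameters p = rsa.ExportParameters(false);

            // .NET stores Modulus as big-endian, unsigned bytes.
            Console.WriteLine(p.Modulus[0]);                        // 154 (0x9A): even leftmost byte
            Console.WriteLine(p.Modulus[p.Modulus.Length - 1] & 1); // 1: n is odd, as expected
        }
    }
}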


How to efficiently store Huffman Tree and Encoded binary string into a file?

I can easily convert a character string into a Huffman tree and then encode it into a binary sequence.
How should I save these so that I can actually compress the original data and later recover it?
I searched the web, but I could only find guides and answers covering the steps I have already done. How can I take the Huffman algorithm further and actually achieve lossless compression?
I am using C# for this project.
EDIT: Here is what I've achieved so far; it might need rethinking.
I am attempting to compress a text file. I use the Huffman algorithm, but there are some key points I couldn't figure out:
"aaaabbbccdef", when compressed, gives this encoding:
Key = a, Value = 11
Key = b, Value = 01
Key = c, Value = 101
Key = d, Value = 000
Key = e, Value = 001
Key = f, Value = 100
11111111010101101101000001100 is the encoded version. It would normally need 12*8 bits, but we've compressed it down to 29 bits. This example might be a little unnecessary for a file this small, but let me explain what I tried to do.
We have 29 bits here, but we need 8*n bits, so I pad the encoded string with zeros until its length becomes a multiple of eight. Since the count can only be 0 to 7 zeros, a single byte is more than enough to store it. In this case I've added 3 zeros:
11111111010101101101000001100000. Then I add, as a binary byte, how many extra bits I've added to the front, and split the whole thing into 8-bit pieces:
00000011-11111111-01010110-11010000-01100000
Turn these into ASCII characters
ÿVÐ`
Now, if I have the encoding table, I can look at the first 8 bits, convert that to an integer ignoreBits, and, by ignoring the last ignoreBits bits, turn the rest back into the original form. A sketch of this packing step follows.
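For what it's worth, a short C# sketch of the padding scheme just described (BitPacker and PackBits are names invented for this example):

using System;
using System.Collections.Generic;

static class BitPacker
{
    // Pad the bit string with zeros to a byte boundary, prepend one byte
    // holding the pad count, and split into 8-bit pieces.
    public static byte[] PackBits(string bits)
    {
        int pad = (8 - bits.Length % 8) % 8;
        bits = bits.PadRight(bits.Length + pad, '0');

        var output = new List<byte> { (byte)pad };   // e.g. 00000011 for 3
        for (int i = 0; i < bits.Length; i += 8)
            output.Add(Convert.ToByte(bits.Substring(i, 8), 2));
        return output.ToArray();
    }
}

For the example above, PackBits("11111111010101101101000001100") yields the bytes 03-FF-56-D0-60, matching the five 8-bit groups shown.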
The problem is that I also want to include an uncompressed version of the encoding table in this file, to have a fully functional ZIP/UNZIP program, but I am having trouble deciding where my ignoreBits ends, where my encodingTable starts/ends, and where the encoded bits start/end.
I thought about using the null character as a separator, but there is no assurance that the Values cannot produce a null character: "ddd" in this scheme produces 00000000-0...
Your representation of the code needs to be self-terminating. Then you know that the next bit is the start of the Huffman codes. One way is to traverse the tree that resulted from the Huffman algorithm, writing a 0 bit for each branch node, or a 1 bit followed by the symbol for each leaf. When the traversal is done, you know that the next bit must start the codes.
You also need to make your data self-terminating. Note that in the example you give, the three added zero bits will be decoded as another 'd', so you would incorrectly get 'aaaabbbccdefd' as the result. You need to either precede the encoded data with a count of the symbols expected, or add a symbol with frequency 1 to your encoded set that marks the end of the data.
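A hedged C# sketch of that traversal (Node and BitWriter are hypothetical helper types invented for this example, not part of any standard library):

using System.Collections.Generic;

class Node
{
    public byte Symbol;                          // meaningful only for leaves
    public Node Left, Right;                     // both null for a leaf
    public bool IsLeaf => Left == null && Right == null;
}

class BitWriter
{
    private readonly List<byte> bytes = new List<byte>();
    private int bitCount;

    public void WriteBit(int bit)
    {
        if (bitCount % 8 == 0)
            bytes.Add(0);
        if (bit != 0)
            bytes[bytes.Count - 1] |= (byte)(0x80 >> (bitCount % 8));
        bitCount++;
    }

    public void WriteByte(byte b)
    {
        for (int i = 7; i >= 0; i--)
            WriteBit((b >> i) & 1);
    }

    public byte[] ToArray() => bytes.ToArray();  // trailing pad bits are zero
}

static class TreeCodec
{
    // Preorder traversal: 0 for an internal node, 1 followed by the 8-bit
    // symbol for a leaf. The decoder can rebuild the tree with no length
    // prefix, because the shape of the traversal terminates itself.
    public static void WriteTree(Node node, BitWriter writer)
    {
        if (node.IsLeaf)
        {
            writer.WriteBit(1);
            writer.WriteByte(node.Symbol);
        }
        else
        {
            writer.WriteBit(0);
            WriteTree(node.Left, writer);
            WriteTree(node.Right, writer);
        }
    }
}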

Reversible Hash in C#

Before we start, I want to say that "hash" is a bit of a misnomer for what I actually want.
Basically, I have a program that returns a 92-character string (that part is cryptographically secure) that I want to shorten, which is why I can't think of any other word; but I'll need to be able to reverse it.
So I'm looking for some way to take the 92-character Base64 string (s) and turn it into a much shorter string (n), and then reverse the process.
So the encoding would be like (n) + (hash function) = (s)
And then I'll be able to decode it with (s) + (hash function) = (n). I don't need this to be secure since I handled that when generating the string.
I was using Base65536, but that was mostly a quick joke, since it would be impractical for an actual user.
TL;DR - I need a hash (or encryption) function that will generate short strings out of long ones.
Just to clarify, I do NOT need to compress the file size; I need a shorter string to return to the user.
The most space-efficient way to store binary data is to store it as raw bytes. The only way you might get it even shorter is compression, but for 92 characters that will not amount to much.
As for Base64: there are cases where we are forced to transmit binary data over a medium that does not support arbitrary binary data, mostly text-based media (email, XML files, HTML), so we use Base64 as a way to encode binary data. While it is lossless, it is less storage-efficient: every 3 bytes of input need 4 characters of Base64 output, so each input byte costs about 1 1/3 output characters. Base64 is never the ideal choice, more a necessary evil.
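To make that concrete, a minimal C# sketch (a random stand-in replaces the question's 92-character string here, since 92 Base64 characters encode exactly 69 raw bytes; RandomNumberGenerator.GetBytes requires .NET 6+):

using System;
using System.Security.Cryptography;

class Base64Demo
{
    static void Main()
    {
        // Stand-in for the question's 92-character Base64 string.
        string s = Convert.ToBase64String(RandomNumberGenerator.GetBytes(69));

        byte[] raw = Convert.FromBase64String(s);        // 92 chars -> 69 bytes
        string roundTrip = Convert.ToBase64String(raw);  // lossless: roundTrip == s

        Console.WriteLine($"{s.Length} chars -> {raw.Length} bytes");
    }
}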

Good ways of converting letters to numbers in naive RSA implementation?

I'm implementing a short RSA program and have this code:
private string Encrypt(string data)
{
    BigInteger dataAsBigInteger = new BigInteger(Encoding.UTF8.GetBytes(data));
    BigInteger remainder = BigInteger.ModPow(dataAsBigInteger, exponentE, CalculatePublicKey());
    return Convert.ToBase64String(remainder.ToByteArray());
}

private string Decrypt(string data)
{
    BigInteger dataAsBigInteger = new BigInteger(Convert.FromBase64String(data));
    BigInteger remainder = BigInteger.ModPow(dataAsBigInteger, CalculatePrivateKey(), CalculatePublicKey());
    return Encoding.UTF8.GetString(remainder.ToByteArray());
}
Unfortunately, I seem to be getting weird ASCII values for the result. I tried using just numbers instead of text, and Decrypt(Encrypt(number)) == number, so I know the algorithm itself is fine; I think it is messing up when converting to and from byte arrays and performing operations on them.
If this can't be salvaged, I was thinking of a better formula for converting letters to numbers. I can't use A = 1, B = 2, etc., because "11" would be ambiguous between "AA" and K (the 11th letter). Maybe if each letter's position (A = 1, B = 2, etc.) were first multiplied by 10, then you would know the next letter begins at a non-zero digit?
Is something like this advisable or can the byte arrays be salvaged?
In principle your scheme should work, as long as the resulting BigInteger is not negative and not larger than the modulus.
If a cryptographically secure RSA padding such as OAEP is used, then you also need to subtract the overhead of the padding. Usually, though, you should only encrypt a symmetric key and use hybrid cryptography to allow for almost arbitrary message sizes.
The thing you are trying to do does not make sense. RSA can only encrypt fixed-length messages: integers of the same order of magnitude as the public modulus n. (Specifically, if the plaintext m, taken as a number, is small enough that m^e < n, then the encryption is trivially reversible, and if m is larger than n, it can't be encrypted at all.)
Moreover, you appear to be attempting to implement "textbook RSA", which is insecure. You need to redesign your application so that it instead uses RSA as part of a key encapsulation scheme, which securely delivers a symmetric (e.g. AES) key that is then used, in an authenticated operation mode, to encrypt the actual message.
This correction to your design will also render your encoding problem moot, since the message proper is now being encrypted using a symmetric cipher that operates on bit streams rather than numbers.
I would be very surprised if C# does not have libraries that do this for you.
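On the "can the byte arrays be salvaged?" part: in .NET, BigInteger(byte[]) interprets the array as little-endian two's complement, so any input whose last byte has its high bit set comes out negative, which is exactly the pitfall mentioned above. A sketch of one way to force a non-negative interpretation (this addresses only the encoding problem, not the insecurity of textbook RSA; the class name is invented for this example):

using System;
using System.Numerics;
using System.Text;

static class NaiveRsaEncoding
{
    public static BigInteger ToPositiveBigInteger(string data)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(data);
        byte[] padded = new byte[bytes.Length + 1]; // extra 0x00 sign byte at the most significant end
        Array.Copy(bytes, padded, bytes.Length);
        return new BigInteger(padded);              // guaranteed non-negative
    }

    public static string FromBigInteger(BigInteger value)
    {
        byte[] bytes = value.ToByteArray();
        int length = bytes.Length;
        if (length > 0 && bytes[length - 1] == 0)   // drop the trailing sign byte, if any
            length--;
        return Encoding.UTF8.GetString(bytes, 0, length);
    }
}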

How to double-decode UTF-8 bytes in C#

I have a problem.
Unicode code point U+2019 is this character:
’
It is a right single quote.
It gets encoded as UTF-8.
But I fear it gets double-encoded.
>>> u'\u2019'.encode('utf-8')
'\xe2\x80\x99'
>>> u'\xe2\x80\x99'.encode('utf-8')
'\xc3\xa2\xc2\x80\xc2\x99'
>>> u'\xc3\xa2\xc2\x80\xc2\x99'.encode('utf-8')
'\xc3\x83\xc2\xa2\xc3\x82\xc2\x80\xc3\x82\xc2\x99'
>>> print(u'\u2019')
’
>>> print('\xe2\x80\x99')
’
>>> print('\xc3\xa2\xc2\x80\xc2\x99')
’
>>> '\xc3\xa2\xc2\x80\xc2\x99'.decode('utf-8')
u'\xe2\x80\x99'
>>> '\xe2\x80\x99'.decode('utf-8')
u'\u2019'
The decoding chain at the end is the principle used above.
How can I do those decode steps in C#?
How can I take a UTF-8 encoded string, convert it to a byte array, convert THAT back to a string, and then decode it again?
I tried this method, but the output is not suitable in ISO-8859-1, it seems...
string firstLevel = "’";
byte[] decodedBytes = Encoding.UTF8.GetBytes(firstLevel);
Console.WriteLine(Encoding.UTF8.GetChars(decodedBytes));
// ’
Console.WriteLine(decodeUTF8String(firstLevel));
//â�,��"�
//I was hoping for this:
//’
Understanding Update:
Jon helped me with my most basic question: going from the double-encoded string to "â€™" and thence to "’". But I want to honor the recommendations at the heart of his answer:
understand what is happening
fix the original sin
I made an effort at number 1.
Encoding/Decoding
I get so confused with terms like these.
I confuse them with terms like Encrypting/Decrypting, simply because of "En..." and "De..."
I forget what they translate from, and what they translate to.
I confuse these start points and end points; it could be related to other vague terms like hex, character entities, code points, and character maps.
I wanted to settle the definition at a basic level.
Encoding and Decoding in the context of this question are:
Decode
Corresponds to C#: {Encoding}.GetString(byteArray)
Corresponds to Python: byteString.decode({encoding})
Takes bytes as input and converts them to a string as output, according to some conversion scheme called an "encoding", represented by {Encoding} above.
Bytes -> String
Encode
Corresponds to C#: {Encoding}.GetBytes(stringObject)
Corresponds to Python: stringObject.encode({encoding})
The reverse of Decode.
String -> Bytes (except in Python 2, where a "string" may already be bytes)
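A tiny C# round trip illustrating both definitions (U+2019 encodes to the UTF-8 bytes E2 80 99):

using System;
using System.Text;

class RoundTrip
{
    static void Main()
    {
        string text = "\u2019";                            // ’
        byte[] bytes = Encoding.UTF8.GetBytes(text);       // Encode: String -> Bytes (E2 80 99)
        string decoded = Encoding.UTF8.GetString(bytes);   // Decode: Bytes -> String
        Console.WriteLine(decoded == text);                // True
    }
}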
Bytes vs Strings in Python
So Encode and Decode take us back and forth between bytes and strings.
While Python helped me understand what was going wrong, it could also confuse my understanding of the "fundamentals" of Encoding/Decoding.
Jon said:
It's a shame that Python hides [the difference between binary data and text data] to a large extent
I think this is what the PEP means when it says:
Python's current string objects are overloaded. They serve to hold both sequences of characters and sequences of bytes. This overloading of purpose leads to confusion and bugs.
Python 3.* does not overload strings in this way:
Python 2.7
>>> # Encoding example. As a generalization, "encoding" produces bytes.
>>> # In Python 2.7, strings are overloaded to serve as bytes
>>> type(u'\u2019'.encode('utf-8'))
<type 'str'>
Python 3.*
>>> #In Python 3.*, bytes and strings are distinct
>>> type('\u2019'.encode('utf-8'))
<class 'bytes'>
Another important (related) difference between Python 2 and 3 is their default encoding:
>>> import sys
>>> sys.getdefaultencoding()
Python 2
'ascii'
Python 3
'utf-8'
And while Python 2 says 'ascii', I think it means a specific type of ASCII:
it does not mean ISO-8859-1, which supports range(256) and is what Jon uses to decode (discussed below);
it means plain ASCII, which covers only range(128).
And while Python 3 no longer overloads strings as both bytes and text, the interpreter still makes it easy to ignore what's happening and move between types, i.e.:
just put a u before a string literal in Python 2.* and it's a Unicode literal;
just put a b before a string literal in Python 3.* and it's a bytes literal.
Encoding and C#
Jon points out that C# uses UTF-16, correcting my "UTF-8 encoded string" comment above:
every string is effectively UTF-16.
My understanding of it is: if C# has a string object "s", the computer's memory actually holds the bytes of that character's UTF-16 code unit. That is (including a byte order mark??) feff0073.
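A quick check of this in C# (Encoding.Unicode is .NET's little-endian UTF-16 encoding, and GetBytes adds no byte order mark):

using System;
using System.Text;

class Utf16Check
{
    static void Main()
    {
        byte[] utf16 = Encoding.Unicode.GetBytes("s");
        Console.WriteLine(BitConverter.ToString(utf16));   // 73-00
    }
}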
He also uses ISO-8859-1 in the hack method I requested.
I'm not sure why.
My head is hurting at the moment, so I'll return to this post when I have some perspective. I hope I'm explaining properly. Should I make it a wiki?
You need to understand that fundamentally this is due to someone misunderstanding the difference between binary data and text data. It's a shame that Python hides that difference to a large extent - it's quite hard to accidentally perform this particular form of double-encoding in C#. Still, this code should work for you:
using System;
using System.Text;

class Test
{
    static void Main()
    {
        // Avoid encoding issues in the source file itself...
        string firstLevel = "\u00c3\u00a2\u00c2\u0080\u00c2\u0099";
        string secondLevel = HackDecode(firstLevel);
        string thirdLevel = HackDecode(secondLevel);
        Console.WriteLine("{0:x}", (int) thirdLevel[0]); // 2019
    }

    // Converts a string to a byte array using ISO-8859-1, then *decodes*
    // it using UTF-8. Any use of this method indicates broken data to start
    // with. Ideally, the source of the error should be fixed.
    static string HackDecode(string input)
    {
        byte[] bytes = Encoding.GetEncoding(28591)
                               .GetBytes(input);
        return Encoding.UTF8.GetString(bytes);
    }
}

C#: String -> MD5 -> Hex

In languages like PHP or Python there are convenient functions to turn an input string into an output string that is the hex representation of its MD5 digest.
I find this a very common and useful task (password storing and checking, checksums of file content...), but in .NET, as far as I know, you can only work on bytes.
A function to do the work is easy to put together (e.g. http://blog.stevex.net/index.php/c-code-snippet-creating-an-md5-hash-string/), but I'd like to know if I'm missing something, using the wrong pattern, or there is simply no such thing in .NET.
Thanks
The method you linked to seems right; a slightly different method is shown in the MSDN C# FAQ.
A comment suggests you can use:
System.Web.Security.FormsAuthentication.HashPasswordForStoringInConfigFile(string, "MD5");
Yes, you can only work with bytes (as far as I know). But you can easily turn those bytes into their hex representation by looping through them and doing something like:
myByte.ToString("x2");
And you can get the bytes that make up the string using:
System.Text.Encoding.UTF8.GetBytes(myString);
So it can be done in a couple of lines.
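Putting those pieces together, a minimal sketch of that "couple of lines" approach (the class and method names are invented for this example):

using System.Security.Cryptography;
using System.Text;

static class Md5Util
{
    // UTF-8 encode the string, MD5-hash the bytes, then hex-format
    // each byte with "x2", exactly as described above.
    public static string Md5Hex(string input)
    {
        using var md5 = MD5.Create();
        byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
        var sb = new StringBuilder(hash.Length * 2);
        foreach (byte b in hash)
            sb.Append(b.ToString("x2"));
        return sb.ToString();
    }
}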
One problem is with the very concept of "the hex representation of [a string]".
A string is a sequence of characters. How those characters are represented as individual bits depends on the encoding. The "native" encoding to .NET is UTF-16, but usually a more compact representation is achieved (while preserving the ability to encode any string) using UTF-8.
You can use Encoding.GetBytes to get the encoded version of a string once you've chosen an appropriate encoding - but the fact that there is that choice to make is the reason that there aren't many APIs which go straight from string to base64/hex or which perform encryption/hashing directly on strings. Any such APIs which do exist will almost certainly be doing the "encode to a byte array, perform appropriate binary operation, decode opaque binary data to hex/base64".
(That makes me wonder whether it wouldn't be worth writing a utility class which could take an encoding, a Func<byte[], byte[]> and an output format such as hex/base64 - that could represent an arbitrary binary operation applied to a string.)
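Such a utility class might look like the following sketch (all names here are invented for illustration; MD5-to-hex is shown as one instance of the pattern):

using System;
using System.Security.Cryptography;
using System.Text;

class StringTransformer
{
    private readonly Encoding encoding;
    private readonly Func<byte[], byte[]> transform;
    private readonly Func<byte[], string> format;

    public StringTransformer(Encoding encoding, Func<byte[], byte[]> transform, Func<byte[], string> format)
    {
        this.encoding = encoding;
        this.transform = transform;
        this.format = format;
    }

    // Encode the string, apply the binary operation, format the result.
    public string Apply(string input) => format(transform(encoding.GetBytes(input)));
}

// Usage, e.g. MD5-to-hex:
// var md5Hex = new StringTransformer(
//     Encoding.UTF8,
//     bytes => MD5.Create().ComputeHash(bytes),
//     bytes => BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant());
// string digest = md5Hex.Apply("hello");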
