Limit the encrypted result using AES - c#

I need to encrypt a text (16 chars), preferably using AES, and I need to limit the length of the resulting encrypted text (14 or 16 characters). The encrypted text has to consist only of letters and digits (no '=', '?', ...). Is it possible?
I'll need to get back the original text from the ciphertext (encrypted).
Is there a way to do this using RijndaelManaged (System.Security.Cryptography)?

The ciphertext needs to be at least as long (in an entropic sense) as the plaintext. You can't losslessly compress arbitrary texts. So if you limit your output to 16 characters drawn from digits and upper- and lower-case letters, it carries at most log2(10+2*26)*16 ≈ 95 bits, and the input can't have any more entropy than that. This has nothing to do with AES; it's a mathematical limitation that applies to all lossless encodings.
What's a character? A byte, a char, or a Unicode code point?
AES has the additional problem that it's a block cipher: the minimum output size is equal to the block size, 128 bits, and that already exceeds your limit. And since the output appears random, it can't be compressed after encrypting. On top of that, most encryption modes add a bit of additional padding.
There are functions which map arbitrary-length input to constant-length output. They are called hash functions. But by the pigeonhole principle they map multiple inputs to the same output, so you can't get back the input for all possible inputs.
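The capacity argument above can be checked numerically. The following is a minimal sketch; the class and method names are made up for illustration, and it assumes an alphabet of 62 symbols (10 digits plus 26 lower-case and 26 upper-case letters).

```csharp
using System;

public static class CipherTextCapacity
{
    // Bits of information that `length` symbols from an alphabet of
    // `alphabetSize` distinct symbols can carry at most.
    public static double CapacityBits(int alphabetSize, int length)
    {
        return Math.Log(alphabetSize, 2) * length;
    }

    // Minimum number of symbols from that alphabet needed to carry `bits` bits.
    public static int SymbolsNeeded(int alphabetSize, int bits)
    {
        return (int)Math.Ceiling(bits / Math.Log(alphabetSize, 2));
    }
}
```

CipherTextCapacity.CapacityBits(62, 16) is about 95.27, which is less than the 128 bits of a single AES block, and CipherTextCapacity.SymbolsNeeded(62, 128) is 22, so one AES block needs at least 22 alphanumeric characters.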

Related

Reversible Hash in C#

Before we start, I want to say "hash" is a bit of a misnomer from what I actually want.
Basically, I have a program that returns a 92-character string (this is cryptographically secure) that I want to shorten. "Hash" is the only word I can think of for that, but I'll need to be able to reverse it.
So I'm looking for some way that I can take the 92 character base64 string (s) and turn it into a much shorter string (n), and then reverse it.
So the encoding would be like (n) + (hash function) = (s)
And then I'll be able to decode it with (s) + (hash function) = (n). I don't need this to be secure since I handled that when generating the string.
I was using Base65536 but that was mostly for a quick joke since that would be impractical for an actual user.
TL;DR - I need a hash (or encryption) function that will generate short strings out of long ones.
Just to clarify, I do NOT need to compress the file size, I need a shorter string to return to the user.
The most space-efficient way to store binary data is to store it as bytes. The only way you may get it even shorter is via compression, but for 92 characters that will not amount to much.
As for Base64: there are cases where we are forced to transmit binary data over a medium that does not support arbitrary binary data, mostly text-based media (email, XML files, HTML). So we use Base64 as a way to encode binary data. While it is lossless, it is less storage efficient: every 3 bytes of input become 4 characters of output, so every byte of input needs 1 1/3 bytes in Base64 output. It is never the ideal case to use Base64, more a necessary evil.
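The size relationship described above can be sketched with the framework's own Base64 routines. The helper names below are hypothetical; only Convert.FromBase64String is a real framework call.

```csharp
using System;

public static class Base64Sizes
{
    // Number of Base64 characters (including '=' padding) produced
    // for `byteCount` input bytes: every group of 3 bytes becomes 4 characters.
    public static int EncodedLength(int byteCount)
    {
        return (byteCount + 2) / 3 * 4;
    }

    // Decodes a Base64 string back into the raw bytes it represents.
    public static byte[] ToRawBytes(string base64)
    {
        return Convert.FromBase64String(base64);
    }
}
```

Base64Sizes.EncodedLength(69) is 92, so an unpadded 92-character Base64 string holds 69 raw bytes. Those 69 bytes are the shortest lossless form; any text encoding of them will be longer again.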

AES Key length not matching

I'm sending some encrypted data to a client through a web service.
The client had requested that I encrypt the data using a given key and IV. I know you should ideally use a different random IV each time, and I've already raised that with them.
The IV they have provided is a string of length 25. This really doesn't seem right to me.
As far as I was aware, the IV length should match the block size, so either 128, 192 or 256 bits (string lengths 16, 24 or 32). Am I right, or am I missing something here...?
Please note that the IV was provided to me, and therefore I am not trying to pick it.
The provided IV was of the form "ghPNHfg544JUdfjdR5BGVbj67", which I do not believe is correct. (The provided key was a string 16 characters long.)
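One way to check the suspicion is to compare the IV's byte length with the algorithm's block size. This is only a sketch: the helper names are made up, and it assumes the client's string is meant to be used as plain ASCII bytes rather than as some other encoding of the IV.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class IvCheck
{
    // True when `iv` has exactly as many bytes as the algorithm's block size.
    public static bool HasValidLength(SymmetricAlgorithm algorithm, byte[] iv)
    {
        return iv.Length == algorithm.BlockSize / 8;
    }

    // Number of IV bytes AES expects (its block size is always 128 bits).
    public static int RequiredIvBytes()
    {
        using (Aes aes = Aes.Create())
        {
            return aes.BlockSize / 8;
        }
    }
}
```

IvCheck.RequiredIvBytes() is 16, while Encoding.ASCII.GetBytes("ghPNHfg544JUdfjdR5BGVbj67") is 25 bytes long, so HasValidLength returns false for it.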

How to reverse a Reed - Solomon algorithm? [duplicate]

I want to transmit binary data over a noisy channel.
I read that a good ECC algorithm to detect errors is Reed-Solomon.
The problem is I don't understand the input for this algorithm.
Here is my naive failed attempt with zxing.net:
int[] toEncode = { 123,232,432};
var gf = GenericGF.AZTEC_DATA_12;
ReedSolomonEncoder rse = new ReedSolomonEncoder(gf);
rse.encode(toEncode, 2);
ReedSolomonDecoder rsd = new ReedSolomonDecoder(gf);
rse.encode(toEncode, 2);
Please explain to me the input for the encoder and decoder.
Is this the implementation you are using here: ReedSolomonEncoder.cs?
If so, to encode N integers with M error-correction integers, you need to pass an array of length N+M. Your data should be in the first N indices, and the codes appear to be written into the final M entries.
Also, note the following restriction in the encoder:
Update: a more recent version is here: http://zxingnet.codeplex.com/. Its most recent version of ReedSolomonEncoder.cs does not have this restriction.
This class implements Reed-Solomon encoding schemes used in processing QR codes. A very brief description of Reed Solomon encoding is here: Reed-Solomon Codes.
An encoding choice of "QR_CODE_FIELD_256" (which is probably a reasonable choice for you) means that error correction codes are being generated on byte-sized chunks ("symbols") of your message, which means your maximum message length (data to encode plus error correction codes) is 255 bytes long. If you are sending more data you will need to break it into chunks.
Update 2: Using QR_CODE_FIELD_256, your integers need to be between 0 and 255 as well, so to encode a general byte stream, you need to put each byte into a separate integer in the integer array, pass the int array (plus space for error correction codes) through the encoder, then reconvert to a (larger) byte array. And the reverse for decoding.
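The byte-to-int round trip described above could look roughly like the sketch below. It assumes the ZXing.Net package is referenced, that encode(int[], int) fills the trailing entries in place, and that decode(int[], int) corrects the array in place and returns a bool; the wrapper class and method names are made up for illustration.

```csharp
using System;
using ZXing.Common.ReedSolomon;

public static class ReedSolomonRoundTrip
{
    // Copies each byte into its own int and leaves `ecSymbols` trailing
    // slots for the encoder to fill with error-correction symbols.
    public static int[] EncodeBytes(byte[] data, int ecSymbols)
    {
        var codeword = new int[data.Length + ecSymbols];
        for (int i = 0; i < data.Length; i++)
        {
            codeword[i] = data[i];
        }
        new ReedSolomonEncoder(GenericGF.QR_CODE_FIELD_256).encode(codeword, ecSymbols);
        return codeword;
    }

    // Repairs a received codeword and returns the data bytes,
    // or null when there are too many errors to correct.
    public static byte[] DecodeBytes(int[] received, int ecSymbols)
    {
        var work = (int[])received.Clone();
        bool ok = new ReedSolomonDecoder(GenericGF.QR_CODE_FIELD_256).decode(work, ecSymbols);
        if (!ok)
        {
            return null;
        }
        var data = new byte[work.Length - ecSymbols];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = (byte)work[i];
        }
        return data;
    }
}
```

With M error-correction symbols the decoder can repair up to M/2 corrupted symbols, and the whole codeword (data plus codes) has to stay within 255 symbols for this field.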

RijndaelManaged Padding when data matches block size

If I use PKCS7 padding in RijndaelManaged with 16 bytes of data then I get 32 bytes of data output. It appears that for PKCS7 when the data size matches the block size it adds a whole extra block of data.
If I use Zeros padding for 16 bytes of data I get out 16 bytes of data. So for Zeros padding if the data matches the block size then it doesn't pad.
I have searched through the documentation and it says nothing about this difference in padding behavior.
Can someone please point me to some kind of documentation which specifies what the padding behavior should be for the different padding modes when the data size matches the block size?
I came across this article which offers an explanation that seems to jibe with some other articles I found during my searching. Here's the basic reason:
You may be wondering what happens if our data length is a perfect
multiple of the block size. In this scenario, PaddingMode.None and
PaddingMode.Zeros add no padding. However, in the case of
PaddingMode.PKCS7, padding must be added because the cipher must be
able to reverse even a no-padding situation. In this case, an
additional block must be added to the plain text and the value of each
byte set to the block size in bytes.
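The behavior described in the quote can be observed directly. This is a small sketch using Aes.Create() with a random key and IV in CBC mode; the helper name is hypothetical, and RijndaelManaged with a 128-bit block size should behave the same way.

```csharp
using System;
using System.Security.Cryptography;

public static class PaddingDemo
{
    // Encrypts `plainLength` zero bytes with AES-CBC under a random key
    // and returns the length of the resulting ciphertext.
    public static int CipherLength(PaddingMode padding, int plainLength)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Mode = CipherMode.CBC;
            aes.Padding = padding;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] plain = new byte[plainLength];
                return encryptor.TransformFinalBlock(plain, 0, plain.Length).Length;
            }
        }
    }
}
```

PaddingDemo.CipherLength(PaddingMode.PKCS7, 16) is 32, while PaddingMode.Zeros and PaddingMode.None both give 16 for the same input. For 15 bytes of input, PKCS7 and Zeros both give 16.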

AES output, is it smaller than input?

I want to encrypt a string and embed it in a URL, so I want to make sure the encrypted output isn't bigger than the input.
Is AES the way to go?
It's impossible to create any algorithm which will always create a smaller output than the input, but can reverse any output back to the input. If you allow "no bigger than the input" then basically you're just talking isomorphic algorithms where they're always the same size as the input. This is due to the pigeonhole principle.
Added to that, encryption usually has a little bit of padding (e.g. "to the nearest 8 bytes, rounded up"; in AES, that's 16 bytes). Oh, and on top of that you've got the issue of converting between text and binary. Encryption algorithms usually work in binary, but URLs are in text. Even if you assume ASCII, you could end up with an encrypted binary value which isn't ASCII. The simplest way of representing arbitrary binary data in text is to use base64. There are other alternatives which would be highly fiddly, but the general "convert text to binary, encrypt, convert binary to text" pattern is the simplest one.
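The "convert text to binary, encrypt, convert binary to text" pattern might be sketched as follows. The assumptions here are mine, not the answer's: UTF-8 for the text, AES-CBC with PKCS7 padding (the framework defaults), a fresh random IV prepended to the ciphertext, and the URL-safe Base64 variant without '=' padding. The class name is made up, and the sketch provides no authentication.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class UrlToken
{
    // text -> UTF-8 bytes -> AES-CBC -> IV || ciphertext -> URL-safe Base64
    public static string Encrypt(string text, byte[] key)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV();
            byte[] iv = aes.IV;
            byte[] plain = Encoding.UTF8.GetBytes(text);
            byte[] cipher;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
            }
            byte[] packed = new byte[iv.Length + cipher.Length];
            Buffer.BlockCopy(iv, 0, packed, 0, iv.Length);
            Buffer.BlockCopy(cipher, 0, packed, iv.Length, cipher.Length);
            return Convert.ToBase64String(packed)
                .TrimEnd('=').Replace('+', '-').Replace('/', '_');
        }
    }

    // Reverses Encrypt: restores the Base64 padding, splits off the IV, decrypts.
    public static string Decrypt(string token, byte[] key)
    {
        string b64 = token.Replace('-', '+').Replace('_', '/');
        b64 = b64.PadRight(b64.Length + (4 - b64.Length % 4) % 4, '=');
        byte[] packed = Convert.FromBase64String(b64);
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            byte[] iv = new byte[16];
            Buffer.BlockCopy(packed, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                byte[] plain = decryptor.TransformFinalBlock(packed, 16, packed.Length - 16);
                return Encoding.UTF8.GetString(plain);
            }
        }
    }
}
```

With this layout a 16-character ASCII input becomes 48 bytes (16 bytes of IV plus 32 bytes of padded ciphertext), which is 64 characters of Base64, so the output is four times the length of the input.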
Simple answer is no.
Any symmetric encryption algorithm (AES included) will produce an output that is at minimum the same size as the input, but often slightly larger. As Jon Skeet points out, this is usually because of padding or alignment.
Of course you could compress your string using zlib and encrypt but you'd need to decompress after decrypting.
Disclaimer - compressing the string with zlib will not guarantee it comes out smaller though
What matters is not really the cipher that you use, but the encryption mode that you use. For example, the CTR mode has no length expansion, but every encryption needs a new distinct starting point for the counter. Other modes like OFB, CFB (or CBC with ciphertext stealing) also don't need to be padded to a multiple of the block length of the cipher, but they need an IV. It is unclear from your question if there is some information available from which an IV could be derived pseudorandomly and if any of these modes would be appropriate. It is also unclear if you need authentication, or if you need semantic security, i.e. is it a problem if you encrypt the same string twice and you get the same ciphertext twice?
If we are talking about symmetric encryption, where the original string must be recoverable from the encrypted one, it is not possible. I think that unless you use hashes (SHA1, SHA256...) you will never obtain an encrypted string smaller than the original text. The problem with hashes is that they are not a solution for retrieving the original string, because they are one-way algorithms.
When using AES, the output data will be rounded up to have a specific length (e.g. a length divisible by 16).
If you want to transfer secret data to another website, an HTTP POST may do better than embedding the data into the URL.
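To illustrate the point about CTR mode having no length expansion: the framework's CipherMode enum has no CTR member, so the sketch below builds the keystream by hand from AES in ECB mode. The class name and the big-endian counter layout are assumptions made for illustration, and the caller is responsible for never reusing a starting counter with the same key.

```csharp
using System;
using System.Security.Cryptography;

public static class AesCtr
{
    // Encrypts or decrypts `input` (the operation is its own inverse)
    // by XOR-ing it with AES-encrypted counter blocks. Output length == input length.
    public static byte[] Transform(byte[] key, byte[] initialCounter, byte[] input)
    {
        if (initialCounter.Length != 16)
        {
            throw new ArgumentException("Counter block must be 16 bytes.", "initialCounter");
        }
        byte[] counter = (byte[])initialCounter.Clone();
        byte[] keystream = new byte[16];
        byte[] output = new byte[input.Length];
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.Mode = CipherMode.ECB;
            aes.Padding = PaddingMode.None;
            using (ICryptoTransform ecb = aes.CreateEncryptor())
            {
                for (int offset = 0; offset < input.Length; offset += 16)
                {
                    // Keystream block = AES_k(counter)
                    ecb.TransformBlock(counter, 0, 16, keystream, 0);
                    int n = Math.Min(16, input.Length - offset);
                    for (int i = 0; i < n; i++)
                    {
                        output[offset + i] = (byte)(input[offset + i] ^ keystream[i]);
                    }
                    // Increment the counter as a big-endian integer, carrying leftwards.
                    for (int i = 15; i >= 0 && ++counter[i] == 0; i--) { }
                }
            }
        }
        return output;
    }
}
```

A 5-byte input yields a 5-byte output, and calling Transform again with the same key and starting counter returns the original bytes. The starting counter still has to be stored or derived somewhere, which is the cost the answer refers to.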
Also just another thing to clarify:
Not only is it true that symmetric encryption algorithms produce an output that is at least as large as the input, the same is true of asymmetric encryption.
"Asymmetric encryption" and "cryptographic hashes" are two different things.
Asymmetric encryption (e.g. RSA) means that given the output (i.e. the ciphertext), you can get the input (i.e. the plaintext) back if you have the right key, it's just that decrypting requires a different key than the key used for encrypting. For asymmetric encryption, the same "pigeonhole principle" argument applies.
Cryptographic hashes (e.g. SHA-1) mean that given the output (i.e. the hash) you can't get the input back, and you can't even find a different input that hashes to the same value (assuming the hash is secure). For cryptographic hashes, the hash can be shorter than the input. (In fact the hash is the same size regardless of the length of the input.)
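The fixed output size is easy to see in practice. The sketch below uses SHA-256 rather than SHA-1, purely as an example; the helper name is hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashSizes
{
    // Returns the length in bytes of the SHA-256 digest of `text` (UTF-8 encoded).
    public static int Sha256Length(string text)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.UTF8.GetBytes(text)).Length;
        }
    }
}
```

HashSizes.Sha256Length("") and HashSizes.Sha256Length(new string('x', 10000)) are both 32. Since far more than 2^256 inputs exist, many of them must share a digest, which is why the digest cannot be turned back into the input.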
And also one more thing: In any secure encryption system the ciphertext will be longer than the plaintext. This is because there are multiple possible ciphertexts that any given plaintext could encrypt to (e.g. using different IVs.) If this were not the case then the cipher would leak information because if two identical plaintexts were encrypted, they would encrypt to identical ciphertexts, and an adversary would then know that the plaintexts were the same.
