In my research I have found mixed messages on this subject so I'm looking for expertise to explain the best approach to encrypting variable amounts of data.
Requirements:
[Edit: Adding additional requirement #3 in response to comment]
I would like to use RSA for the public/private key encryption scheme
so I can distribute the public key to an application that should
encrypt data but should not know how to decrypt it
I need to support data lengths from 16 characters (credit card
number) to kilobytes (serialized objects) and beyond. Most of the
data I encrypt will be small (credit cards, addresses, etc).
This is for encrypting data at rest.
Options I'm Aware Of:
RSA-ONLY: Use RSACryptoServiceProvider to encrypt all data using public key.
Iterate through the data in blocks that are less than the key size
minus padding.
HYBRID: Use AesCryptoServiceProvider to encrypt the data, calling
.GenerateKey() and .GenerateIV() to generate a random key and IV.
Then use RSACryptoServiceProvider to encrypt the above key and IV
and prepend or append that to the data.
It seems to me the Hybrid approach gives me the best of both worlds: a strong block cipher (AES) and a distributed public key (RSA).
What are the pros and cons of these approaches? What is the standard? Surprisingly I have not found much opinion or information on the subject and would appreciate any references you might have.
Bonus:
I am rolling my own for various reasons including corporate licensing restrictions but I'm curious if there is a good standard opensource approach for C#.
In most cases RSA is used to encrypt a symmetric key (you don't really need to encrypt the IV, but hey...)
If you use RSA for encryption of data (instead of a key) you might run into the ECB (Electronic Code Book mode) problem that is known in the context of symmetric block ciphers: for a given key, a clear-text is always mapped to the same cipher-text ... that alone doesn't help in breaking the encryption, but it can leak information since an attacker can identify which data packages contain the same clear-texts
I'd choose the hybrid approach, because it's suitable for arbitrarily sized data, and won't be prone to this information leak unless you choose ECB for the mode of operation (CBC - Cipher Block Chaining mode - should do)
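A minimal sketch of how the pieces of the hybrid scheme could be laid out in a single blob, with the wrapped key prepended to the data as the question describes. It is written in Python only because the standard library keeps the byte handling short; the layout ports directly to C#. No encryption is performed here: the byte strings passed in stand for the RSA-encrypted AES key and the AES ciphertext. The names pack_envelope and unpack_envelope and the 2-byte length prefix are assumptions made for illustration, not part of any standard.

```python
import struct

IV_LEN = 16  # AES block size in bytes, which is also the CBC IV length


def pack_envelope(wrapped_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Prepend the RSA-wrapped AES key and the IV to the AES ciphertext."""
    if len(iv) != IV_LEN:
        raise ValueError("IV must be 16 bytes")
    # The wrapped key length depends on the RSA key size, so store it
    # in a 2-byte big-endian prefix (an assumed convention).
    return struct.pack(">H", len(wrapped_key)) + wrapped_key + iv + ciphertext


def unpack_envelope(blob: bytes):
    """Split a blob produced by pack_envelope back into its three parts."""
    (key_len,) = struct.unpack_from(">H", blob, 0)
    offset = 2
    wrapped_key = blob[offset:offset + key_len]
    offset += key_len
    iv = blob[offset:offset + IV_LEN]
    offset += IV_LEN
    if len(wrapped_key) != key_len or len(iv) != IV_LEN:
        raise ValueError("blob is truncated")
    return wrapped_key, iv, blob[offset:]
```

With a 2048-bit RSA key the wrapped key is 256 bytes, so the fixed overhead per record in this layout is 2 + 256 + 16 bytes. That overhead is worth keeping in mind when most records are as small as a credit card number.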
If you just want to use RSA to store a small amount of data, smaller than the number of bits in the key, you can pad the input data with random numbers. There are several padding schemes listed at https://en.wikipedia.org/wiki/RSA_(cryptosystem)#Padding
I tried to search stackoverflow for an answer to this particular question but couldn't find a good answer.
The BouncyCastle API offers a ton of different encryption algorithms for .NET. In this case I have to select an encryption algorithm for the following use case:
Encrypting several thousand short strings for storage in an unencrypted file, typical length 10-30 characters.
Encryption needs to be only moderately secure, all strings will be encrypted with the same key but a different initialization vector. Things like authorization (like in AES-GCM) are not needed.
Which encryption algorithm is both fastest and straightforward to apply for encrypting and decrypting such a set of several thousand small strings? I will store the encrypted data of each string as BASE64 in the file.
Thank you for your advice!
That processor still has AES-NI, so using that through Aes.Create() seems most logical. A higher-end solution would be to create CTR mode out of ECB (and cache the key streams).
Otherwise you are looking for a fast stream cipher in software. You could check if Salsa20 is working for you. Unfortunately that's the only eSTREAM-compatible cipher that I can find in Bouncy Castle for C#.
Note that you may want to look into multi-threading, and I would certainly check if it is possible to use a sequential nonce when using a stream cipher, as generating a 128 bit random value for each encryption seems wasteful.
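A small sketch of the sequential-nonce idea, in Python for brevity (the same counter logic is a few lines in C#). It assumes an 8-byte nonce, which is the size Salsa20 takes; the function name nonce_sequence is hypothetical. The cipher itself is not shown.

```python
import itertools

NONCE_LEN = 8  # Salsa20 takes a 64-bit nonce


def nonce_sequence(start: int = 0):
    """Yield nonces 0, 1, 2, ... encoded as fixed-width big-endian bytes."""
    for counter in itertools.count(start):
        if counter >= 2 ** (8 * NONCE_LEN):
            raise OverflowError("nonce space exhausted; rotate the key")
        yield counter.to_bytes(NONCE_LEN, "big")


nonces = nonce_sequence()
first = next(nonces)   # eight zero bytes
second = next(nonces)  # seven zero bytes followed by 0x01
```

A counter is only safe if a nonce is never reused with the same key, so the last value used would have to be persisted (or the key changed) between runs.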
I have an encoding application written in C# where users can optionally encrypt messages. I had been using the class in this answer, and it turns out I'm in good company because I found several places online that use the exact same code (one of which is Netflix's Open Source Platform).
However, comments to that answer (as well as later edits to that answer) led me to believe that this method was insecure. I opted to use the class in this answer to the same question instead.
How secure is AES encryption if you use a constant salt? How easily can this method be broken? I admit that I have very little experience in this area.
AES is a block cipher. A block cipher's input is a key and a block of plaintext. A block cipher is usually used in a block cipher mode of operation. All secure modes of operation use an Initialization Vector or IV. Otherwise identical plaintext would encrypt to identical ciphertext (for the same key), and this is leaking information.
Salt is not used by AES or modes of operation. It's usually used as input for Key Derivation Functions (KDFs), especially Password Based Key Derivation Functions (PBKDFs). .NET's Rfc2898DeriveBytes implements the PBKDF2 function as defined in - you'd guess it - RFC 2898: "PKCS #5: Password-Based Cryptography Specification Version 2.0".
If you use a static salt in a PBKDF2 then you would get the same key as output (for the same number of iterations). Now if you would ever leak the resulting key then all your ciphertext would be vulnerable. And if you would use multiple passwords then an attacker would be able to build a rainbow table; the PBKDF2 work factor would become less important; the attacker can simply build one table and then try all the resulting keys on all possible ciphertexts.
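To make the static-salt point concrete, here is a short sketch using PBKDF2 from Python's standard library (the equivalent of Rfc2898DeriveBytes). The password, the salt value, the iteration count and the choice of HMAC-SHA256 are all assumptions picked for the demo.

```python
import hashlib
import secrets

ITERATIONS = 100_000  # assumed work factor for this demo


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


# Static salt: the same password always yields the same key, for every
# user and every message, so one precomputed table covers all of them.
STATIC_SALT = b"constant-salt-01"
key_a = derive_key("hunter2", STATIC_SALT)
key_b = derive_key("hunter2", STATIC_SALT)
print(key_a == key_b)  # True

# Random per-use salt: the same password yields unrelated keys.
key_c = derive_key("hunter2", secrets.token_bytes(16))
key_d = derive_key("hunter2", secrets.token_bytes(16))
print(key_c == key_d)  # almost certainly False, since the salts differ
```

The random salt is not secret; it is stored next to the ciphertext so the key can be re-derived for decryption.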
So, as the salt is not actually used for AES it doesn't make much of a difference for the security. It is however still a horrible sin, even worse than using the default iteration count for PBKDF2 / Rfc2898DeriveBytes.
Note that horrible security sins are committed by a large number of people on a daily basis. That there are many many many persons that get it wrong doesn't tell you that you are in "good company". That there are 289 upvotes just tells you that SO answers about cryptography should not be trusted based on vote count.
Salt is there for a reason.
This enables same input to be encrypted differently.
If an attacker really insists, he can find patterns that repeat themselves in encryption without salt, and can eventually get to your key more easily.
Still, the attacker would have to work very hard.
Using a constant salt is equivalent to not using salt at all.
And it is highly recommended to use it, as it has no effect on the decryption process.
What is considered "best practice" for encrypting certain sensitive or personally identifiable data in a SQL database (under PCI, HIPAA, or other applicable compliance standards)?
There are many questions here regarding individual aspects of a solution, but I have not seen any that discuss the approach at a high level.
After looking around for quite some time, I came up with the following:
Use CryptoAPI and Rijndael
Generate IV and store it with the encrypted data
Use DPAPI (Machine scope) to "protect" the symmetric key
Store the symmetric key in the registry or a file or the database, split the key and store parts in multiple places for added protection
do not decrypt the data unless it is really needed, i.e. not upon read from the database. Instead, hold cipher text in memory.
Is this adequate? Outdated? Audit-safe? Reckless?
Your approach is good, with a few adjustments in my eyes (I code for PCI compliance generally):
Use CryptoAPI and Rijndael
Use Rijndael/AES256 at a minimum, regardless of other APIs
Generate IV and store it with the encrypted data
Good
Use DPAPI (Machine scope) to "protect" the symmetric key
Not sure if it matters. I'd just keep the IV next to the data that's encrypted, or if you're really paranoid on some other medium. Ensure that the IV is not accessible to the public.
Store the symmetric key in the registry or a file or the database, split the key and store parts in multiple places for added protection
Storing in multiple places will not help you if someone steals your media. It's a bit overkill to split the key up all over heck, but definitely do NOT store it with your IV and/or ciphertext. That'd be bad.
do not decrypt the data unless it is really needed, i.e. not upon read from the database. Instead, hold cipher text in memory.
Definitely. Holding cipher text in memory is fine, but don't pass it around anywhere, and don't decrypt except when you absolutely must, and even then don't EXPOSE the entire unencrypted dataset - only what is needed from it at the minimum. Also, do not hold the key in memory if possible - a memory dump could expose it.
Additions:
Whatever database you store your cipher text in, restrict read access entirely to the proc(s) that select for a given identifier. Do not allow read access to the tables that store this data to ANYONE, even the SA account. This way, a person who breaks into your system will have a hard time pulling down your cipher texts without knowing what IDs to look for. Do the same for any table(s) referencing the identifier on the ciphertext table. DO NOT ALLOW BLANKET READS OF THESE TABLES!
Restrict database access by IP
Never persist any unencrypted plaintext in memory over state. Allow it to be dereferenced/garbage collected as soon as the request is completed.
Restrict the server(s) running this code to as few users as possible.
Possibly combine encryption methods for a stronger ciphertext (AES + Blowfish for example)
Hope these help. Some of them are my personal opinions but remain PCI compliant to the best of my knowledge.
I saw that one of the previous comments mentioned that it doesn't matter if you use CryptoAPI. I just wanted to point out that CryptoAPI is FIPS 140-2 compliant, while Bouncy Castle and the built-in managed classes (all the ones with "Managed" at the end of their names in the System.Security.Cryptography namespace) are not. If you have a requirement for FIPS compliance, it's probably easiest for you to use CryptoAPI.
I would add:
Keeping the IV hidden is not important. It's OK if the IV is public. Just use good IVs, which means, use a cryptographic-strong random number generator so that your IVs are indistinguishable from random.
Storing the encryption key separate from the data that it encrypts.
Add authentication to your encryption. For example, add an HMAC keyed with a second symmetric encryption key, covering the ciphertext. If you don't use some form of authenticated encryption, then your ciphertext could be modified, and you have no way of knowing (AES will decrypt garbage just fine.) You want any tampering of the ciphertext to be noticed.
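A minimal sketch of the HMAC suggestion (encrypt-then-MAC), in Python using only the standard library. The AES step is not shown: the ciphertext is treated as opaque bytes. The function names seal and open_sealed, the choice of HMAC-SHA256 and the IV-then-ciphertext-then-tag layout are assumptions for illustration.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def seal(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Append an HMAC tag that covers both the IV and the ciphertext."""
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag


def open_sealed(mac_key: bytes, blob: bytes, iv_len: int = 16):
    """Verify the tag before anything is decrypted; return (iv, ciphertext)."""
    if len(blob) < iv_len + TAG_LEN:
        raise ValueError("blob too short")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    # compare_digest runs in constant time, avoiding a timing side channel
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed: data was modified")
    return body[:iv_len], body[iv_len:]
```

The MAC key is a second key, separate from the encryption key, as the answer describes. Decryption should only be attempted after open_sealed returns without raising.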
A more generic list of best practices, taken from OWASP (Cryptographic Storage Cheat Sheet):
Use strong approved cryptographic algorithms
Do not implement an existing cryptographic algorithm on your own
Only use approved public algorithms such as AES, RSA public key cryptography, and SHA-256 or better for hashing
Do not use weak algorithms, such as MD5 or SHA1
Avoid hashing for password storage, instead use Argon2, PBKDF2, bcrypt or scrypt
Use approved cryptographic modes
In general, you should not use AES, DES or other symmetric cipher primitives directly. NIST approved modes should be used instead. Quote from NIST: "The approved algorithms for encryption/decryption are symmetric key algorithms: AES and TDEA."
Use strong random numbers
Ensure that any secret key is protected from unauthorized access
Also, according to this Cisco article:
DES is to be avoided and so is RSA-768, -1024
RSA-2048 and RSA-3072 are acceptable
AES-CBC mode is acceptable, while
AES-GCM mode is part of the Next Generation Encryption.
I was told that there's an encryption library I can use and there's a couple that I can choose from (eg. AES, RSA, etc). I also read something about keys. Are keys something you just generate so you can encrypt and decrypt a series of texts? Do you have to purchase that key?
Also, is there a best practice that I need to be aware of in encrypting and decrypting? Is encrypting a password recommended? Would performance be affected?
You are correct. Base64 encoding is a world away from actually encrypting your data. The former simply converts the data to be representable using 64 unique characters, obfuscating the data at best, while the latter actually converts your data into a representation that can only make sense once it is decrypted using the proper key. Do not ever rely on base64 encoding if you want to keep something a secret.
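A quick illustration of why Base64 offers no secrecy, sketched in Python (Convert.ToBase64String and Convert.FromBase64String behave the same way in .NET). The sample value is a made-up test card number.

```python
import base64

secret = "4111 1111 1111 1111"  # made-up sample value

# Encoding needs no key...
encoded = base64.b64encode(secret.encode("ascii")).decode("ascii")

# ...and neither does decoding: anyone who sees the text can reverse it.
decoded = base64.b64decode(encoded).decode("ascii")
print(decoded == secret)  # True
```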
Are keys something you just generate so you can encrypt and decrypt a series of texts?
Yes.
Do you have to purchase that key?
No, you generate the keys yourself.
Is encrypting a password recommended?
Most definitely. You should always protect passwords whenever possible; for stored user passwords that normally means a salted hash rather than reversible encryption.
Would performance be affected?
When encrypting data, you're using more CPU cycles than you would have otherwise, so performance is affected, but it really depends on what algorithm you use, the amount of data, etc.
Here are some links that might help you out:
Some info on encryption in .NET
MSDN Article
More on Encryption
Start reading here:
http://msdn.microsoft.com/en-us/library/system.security.cryptography.aspx
Oh and yes - encrypting a password is recommended in most systems (do a search for hash and salt).
A common practice would be using Protected Configuration feature.
Encrypting and Decrypting Configuration Sections
After reading this post regarding the use of ECC to implement the hashing using a private key, I set about trying to find an implementation of ECDH and came across BouncyCastle.
Unfortunately documentation is minimal (as in zero!) and I'm unsure whether what I'm about to accomplish is completely correct/valid.
We want to simply hash 4 strings which will be the users registration information (Name, Company, their company ID and their account ID which are both 12 characters long) which will then compute a serial they can use to activate our software.
I have generated a key pair using PUTTYGEN.exe but I cannot work out how to apply this with BouncyCastle. Which class can I use to get started? Are there any examples out there?
So far I've concatenated the information and computed a MD5 hash of it (using the .NET classes). I cannot use the new Vista enhanced API functions as we still target XP - .NET 3.5.
Anyone have any ideas?
I think .NET has the RSACryptoServiceProvider class which is a full RSA implementation.
There's sample code for your particular application here:
http://www.codeproject.com/KB/security/xmldsiglic.aspx
In this example they use MS's sn.exe tool to create the key.
So far I've concatenated the information and computed a MD5 hash of it (using the .NET classes).....
That statement in itself worries me. MD5 is seriously crackable - not just theoretically but practically. Please, please don't use MD5 for secure hashing. Use SHA-256 or SHA-512 and here's why
Also the post you linked is not quite true - yes symmetric algorithms use the same key to encrypt/decrypt but public/private key is not a magic bullet.
1) Public/private key is slow
2) Most public/private algorithms just encrypt the symmetric key and then use symmetric encryption for the data because it's much faster
The point is that a good hashing algorithm is non-reversible and hence very difficult to crack, so it is perfectly fine for your purposes. However, I'd suggest using a salt, which is a cryptographically random number that you add to your user data before hashing it, as it makes your data much safer against dictionary attacks (where hackers use well-known terms and variants to crack passwords).
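A rough sketch of the salted-hash suggestion with SHA-256, in Python (SHA256Managed plus RNGCryptoServiceProvider would be the .NET 3.5 counterparts). The function name salted_digest, the 16-byte salt and the length-prefixing of each field are assumptions for illustration; the registration values are made up.

```python
import hashlib
import secrets


def salted_digest(fields, salt: bytes) -> str:
    """Hash a salt followed by each field, returning a hex SHA-256 digest."""
    h = hashlib.sha256()
    h.update(salt)
    for field in fields:
        data = field.encode("utf-8")
        # A length prefix keeps field boundaries unambiguous, so that
        # ("ab", "c") and ("a", "bc") do not hash to the same value.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


salt = secrets.token_bytes(16)  # cryptographically random salt
registration = ["Jane Doe", "Example Ltd", "COMP00000001", "ACCT00000001"]
digest = salted_digest(registration, salt)
```

The salt has to be kept alongside the digest, because the same salt is needed to recompute and compare the hash later.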