Securely storing and searching by social security number

So I'm working on a supplemental web-based system required by an HR department to store and search records of former personnel. I fought the requirement, but in the end it was handed down that the system has to both enable searching by full SSN, and retrieval of full SSN. My protestations aside, taking some steps to protect this data will actually be a huge improvement over what they are doing with it right now (you don't want to know).
I have been doing a lot of research, and I think I have come up with a reasonable plan -- but like all things crypto/security related there's an awful lot of complexity, and it's very easy to make a mistake. My rough plan is as follows:
1. On the first run of the application, generate a large random salt and a 128-bit AES key using RijndaelManaged.
2. Write both of these out to a plaintext file for emergency recovery. This file will be stored offline in a secure physical location. The application will check for the presence of the file, and scream warnings if it is still sitting there.
3. Store the salt and key securely somewhere. This is the part I don't have a great answer for. I was planning on using DPAPI, but I don't know how secure it really is at the end of the day. Would I be better off just leaving it in plaintext and restricting filesystem access to the directory it's stored in?
4. When writing a record to the database, hash the SSN along with the large salt value above to generate a field that is searchable (but not recoverable without obtaining the salt and brute forcing all possible SSNs), and AES encrypt the raw SSN value with a new IV (stored alongside) to generate a field that is retrievable (with the key/IV) but not searchable (because encrypting the same SSN twice should yield different output).
5. When searching, just hash the search value with the same salt and look it up in the DB.
6. When retrieving, decrypt the value from the DB using the AES key/IV.
Other than needing a way to store the keys in a relatively secure way (number 3 above) it seems solid enough.
Things that won't work for us:
"Don't do any of this" is not an option. This needs to be done, and if we don't do it they'll a) get mad at us and b) just pass all the numbers around in a plaintext document over email.
This will be internal to our network only, so we have that layer of protection at least on top of whatever is implemented here. And access to the application itself will be controlled by active directory.
Thank you for reading, and for any advice.
Update #1:
I realized from the comments that it makes no sense to keep a private IV for the SSN retrieval field. I updated the plan to properly generate a new IV for each record and store it alongside the encrypted value.
Update #2:
I'm removing the hardware stuff from my list of stuff we can't do. I did a bit of research, and it seems like that stuff is more accessible than I thought. Does making use of one of those USB security token things add meaningful security for key storage?

I've had to solve a similar problem recently and have decided to use an HMAC for the hashing. This would provide more security than a simple hash, especially as you can't salt the value (otherwise it wouldn't be searchable).
Then, as you say, use AES with a random IV per record for the reversible encryption.
It may be that you don't need to encrypt this data, but I had no choice and this seemed like a reasonable solution.
My question on IT Security https://security.stackexchange.com/questions/39017/least-insecure-way-to-encrypt-a-field-in-the-database-so-that-it-can-still-be-in
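For what it's worth, here is a minimal sketch of that combination: an HMAC for the searchable column and AES with a fresh IV for the retrievable column. The class and method names (SsnProtector, SearchToken, Encrypt, Decrypt, Normalize) are made up for illustration, and it assumes you already have two separate secret keys loaded from wherever you decide to keep them. It uses the framework's default AES mode (CBC with PKCS7 padding) and does not authenticate the ciphertext, so treat it as a starting point rather than a finished design.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; names are illustrative, not from the original post.
public static class SsnProtector
{
    // Deterministic keyed hash: the same SSN and key always give the same
    // token, so the column can be indexed and searched by equality.
    public static string SearchToken(string ssn, byte[] hmacKey)
    {
        using (var hmac = new HMACSHA256(hmacKey))
        {
            byte[] data = Encoding.UTF8.GetBytes(Normalize(ssn));
            return Convert.ToBase64String(hmac.ComputeHash(data));
        }
    }

    // Encrypts with a fresh random IV and returns IV || ciphertext.
    public static byte[] Encrypt(string ssn, byte[] aesKey)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = aesKey;
            aes.GenerateIV(); // new IV for every record
            byte[] iv = aes.IV;
            using (var encryptor = aes.CreateEncryptor())
            {
                byte[] plain = Encoding.UTF8.GetBytes(Normalize(ssn));
                byte[] cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
                byte[] result = new byte[iv.Length + cipher.Length];
                Buffer.BlockCopy(iv, 0, result, 0, iv.Length);
                Buffer.BlockCopy(cipher, 0, result, iv.Length, cipher.Length);
                return result;
            }
        }
    }

    // Splits IV || ciphertext and decrypts back to the normalized SSN.
    public static string Decrypt(byte[] stored, byte[] aesKey)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = aesKey;
            byte[] iv = new byte[aes.BlockSize / 8];
            Buffer.BlockCopy(stored, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (var decryptor = aes.CreateDecryptor())
            {
                byte[] plain = decryptor.TransformFinalBlock(
                    stored, iv.Length, stored.Length - iv.Length);
                return Encoding.UTF8.GetString(plain);
            }
        }
    }

    // Strip dashes and whitespace so "123-45-6789" and "123456789" match.
    static string Normalize(string ssn)
    {
        return ssn.Replace("-", "").Trim();
    }
}
```

You would store SearchToken(...) in an indexed column and the output of Encrypt(...) in a binary column; a search computes the token for the entered SSN and does an equality lookup. Using a different key for the HMAC than for AES means that compromising one does not automatically expose the other.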

With respect to key storage, there are two methods you can use if you choose to store your AES key in the web.config. The first is DPAPI, as you mentioned, which encrypts your web.config application settings for that box. The other is an RSA key (check out this MSDN tutorial), which encrypts your web.config just like DPAPI does; however, you can use the RSA key on multiple boxes, so if the application is clustered the RSA key is better (just more complicated to set up).
As for generating the key: do it before you run your application, and not on the machine running the app. This way there's no chance you're going to leave the text file in the directory. You should generate the key as follows.
Generate a random value using RNGCryptoServiceProvider
Generate a random salt value using RNGCryptoServiceProvider
Hash the two values with PBKDF2 (Rfc2898DeriveBytes)
The reason you use the key derivation method is it protects you in case RngCryptoServiceProvider was found to be insecure for some reason which happens with random number generators.
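A rough sketch of those three steps, in case it helps. KeyGenerator and GenerateAes256Key are names I made up, the iteration count of 100,000 is an arbitrary assumption, and the four-argument Rfc2898DeriveBytes constructor used here needs .NET Framework 4.7.2 or later (or .NET Core). RandomNumberGenerator.Create() is used in place of constructing RNGCryptoServiceProvider directly; both draw from the system's cryptographic RNG.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper; run once, offline, on a machine other than the web server.
public static class KeyGenerator
{
    public static byte[] GenerateAes256Key()
    {
        byte[] seed = new byte[32]; // random value that plays the role of the "password"
        byte[] salt = new byte[16]; // random salt for PBKDF2

        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(seed);
            rng.GetBytes(salt);
        }

        // Run both values through PBKDF2 and take 32 bytes (256 bits) as the key.
        using (var kdf = new Rfc2898DeriveBytes(seed, salt, 100000, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(32);
        }
    }
}
```

The seed and salt are thrown away here, so the only thing to back up and protect is the returned key, for example Convert.ToBase64String(KeyGenerator.GenerateAes256Key()) written to the offline recovery copy.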
Use AES 256 instead of AES 128. These algorithms are extremely fast anyway, so the higher security is almost free. Also make sure you're using the algorithm in CBC or CTR mode (CTR is available in the BouncyCastle library).
Now, this will not give your key absolute protection if someone were able to put an aspx file in your directory. Because that file would become part of your application, it would have access to your decrypted values, including your key. The reason I'm mentioning this is that your network and server security will have to be top notch, so I would highly recommend you work hand-in-hand with your network security team to ensure that nobody has access to that box except the parties in the HR department that need access (enforced by firewall, not just Active Directory). Do NOT make this application publicly accessible from the internet in any way, shape, or form.
You also cannot trust your HR department: someone could become a victim of a social engineering attack and end up giving away their login, thus destroying your security model. So in addition to working with your network team, you should integrate a two-factor authentication mechanism to get into the system. I highly recommend going with an actual RSA key or something similar rather than implementing TOTP. This way, even if someone from the department gives away their password because they thought they were winning a free iPad, the attacker would still need a physical device to get into the application.
Log everything. Any time someone sees an SSN, make sure to log it somewhere that will be part of a permanent record that's archived on a regular basis. This will allow you to mitigate quickly. I would also put limits on how many records a person can see in a particular time frame; this way you know if someone is mining data from within your application.
Create a SQL user specifically to access this table, do not let any other user have access to the table. This will ensure that only with a particular user id and password can you view the table data.
Before deploying to a production environment you should hire a penetration testing team to test the application and see what they can get, this will go a long way to harden the application from potential attackers, and they can give you great advice on how to harden the security of the application.

Create a new salt and IV for each record. If you need to dump the data into a report for some reason (hopefully without my SSN in it), you would be able to use the method you describe with the unique salt and IV. If you only need to search on an SSN, you could actually hash it instead of using a reversible encryption (more secure).

I think I read somewhere once that hashing a limited set of inputs gets you absolutely nothing. A quick google turned up this SO post with similar warnings:
Hashing SSNs and other limited-domain information
I must admit that I am also no security expert, but given that the number of possible inputs is at most 10^9, which any decent hacker should be able to breeze through in a matter of hours, hashing an SSN seems like you are adding a small layer of annoyance rather than an actual security/difficulty barrier.
Rather than doing it this way, could you do something else? For example, SSNs only have value to an attacker if they can associate a name with a number (since anyone can enumerate all the numbers easily enough). In that case, could you encrypt the user id that the SSN links to in such a way that would be impractical to attack? I am assuming your employees table has some sort of ID, but maybe instead of that do a hash on their email or some sort of guid? That way, even if they do get your SSN data, they would not be able to tell which employee it belongs to until they managed to brute force that link.
Then again, that approach is also flawed since your company may not have that many employees in total. At that point it would be a relatively simple matter of guessing and checking against a company directory to obtain everything. No matter how you slice it, this security flaw is going to exist if SSNs must be stored with other identifying data.

Related

Enterprise Data

We are developing a huge financial, budget, and expense management solution, and one of the requirements is that the user data and postings collected by our app and stored in SQL Server MUST be encrypted with a user-supplied key.
We are using SQL Server 2012, EF 6 and .NET 4.5.
What we have tried:
We created a class library in C# with two functions that do the encryption and decryption. The assemblies are compiled to a SQL assembly, and that works fine using a single encryption key.
The challenge:
The database contains data from different users who supply different keys. The questions are:
How do we store user-supplied keys securely, such that if a user loses or forgets the key used to encrypt their data the app can recover it, while the DBA who supports this database does NOT have access to the keys?
If we have 1M users, that means a million keys. The tables have relational references, so it becomes tricky to encrypt each row differently per user. What's the industry standard in this scenario?

First off, I would like to preface this answer by stating that I don't pretend to know the industry standard in this scenario -- I don't. That being said, here's what I would do.
In cryptography, there's an algorithm known as Shamir's Secret Sharing. In summary, it would let you split the key into multiple parts:
User chooses their private key, and splits it into 4 parts (n = 4), where any subset of 2 parts (k = 2) is sufficient to reconstruct their secret. You can vary n and k to suit your needs, where n would be the number of recovery options provided and k is the number that must be correct.
User then encrypts each part of the private key with their recovery options and sends the encrypted parts to the server to store.
When user requests file, server sends encrypted file to client who can then decrypt it with their key.
In the event the user forgets their key, they can request their encrypted key parts from the server, provide recovery answers in an attempt to decrypt at least k of them, and (hopefully) get their data back.
Notes:
Server doesn't store answers to the recovery options. This means it won't be able to decrypt the files without the user's help (unless you were to send the raw splits to the server as well, but that's a potential security risk). In essence, you could help the user get back to their key, but all bets are off if they can't remember any of their recovery options (e.g. amnesia, Alzheimer's, untimely death).
If the user were to change their secret key, every file would need to be decrypted and re-encrypted using the new key. This could be a potentially expensive task.
The sum of recovery options needed to remake the key must not be easy for an attacker to guess. For example, if I have 4 recovery options of which I must provide 2, and my choices are phone number, best friend's first name, and some others, then this would not be secure. There aren't very many possible choices, which would make that example combination very easy to brute force.
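To make the n/k splitting step a bit more concrete, here is a small sketch of Shamir's Secret Sharing over a prime field. Everything in it is an assumption on my part: the class name ShamirSketch, the choice of the Mersenne prime 2^127 - 1 as the field (so the secret must be a non-negative number below that), and representing shares as (X, Y) tuples, which needs C# 7 or later. It only covers splitting and recombining; encrypting each share with a recovery answer is a separate step not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Security.Cryptography;

// Hypothetical illustration of k-of-n secret sharing; not production code.
public static class ShamirSketch
{
    // 2^127 - 1 is a Mersenne prime; the secret must be smaller than this.
    public static readonly BigInteger Prime = BigInteger.Pow(2, 127) - 1;

    // Reduce into [0, Prime), because C#'s % can return negative values.
    static BigInteger Mod(BigInteger a)
    {
        return ((a % Prime) + Prime) % Prime;
    }

    static BigInteger RandomFieldElement()
    {
        var bytes = new byte[33];
        using (var rng = RandomNumberGenerator.Create()) rng.GetBytes(bytes);
        bytes[32] = 0; // zero top byte keeps the little-endian BigInteger non-negative
        return new BigInteger(bytes) % Prime;
    }

    // Builds a random polynomial of degree k-1 whose constant term is the
    // secret, and evaluates it at x = 1..n to produce n shares.
    public static List<(int X, BigInteger Y)> Split(BigInteger secret, int n, int k)
    {
        if (secret < 0 || secret >= Prime) throw new ArgumentOutOfRangeException("secret");
        if (k < 1 || k > n) throw new ArgumentOutOfRangeException("k");

        var coefficients = new BigInteger[k];
        coefficients[0] = secret;
        for (int i = 1; i < k; i++) coefficients[i] = RandomFieldElement();

        var shares = new List<(int X, BigInteger Y)>();
        for (int x = 1; x <= n; x++)
        {
            BigInteger y = 0;
            for (int i = k - 1; i >= 0; i--) y = Mod(y * x + coefficients[i]); // Horner's rule
            shares.Add((x, y));
        }
        return shares;
    }

    // Lagrange interpolation at x = 0 recovers the constant term (the secret)
    // when given at least k distinct shares.
    public static BigInteger Combine(IEnumerable<(int X, BigInteger Y)> shares)
    {
        var points = shares.ToList();
        BigInteger secret = 0;
        foreach (var pi in points)
        {
            BigInteger num = 1, den = 1;
            foreach (var pj in points)
            {
                if (pj.X == pi.X) continue;
                num = Mod(num * pj.X);
                den = Mod(den * (pj.X - pi.X));
            }
            // Modular inverse via Fermat's little theorem (Prime is prime).
            BigInteger inverse = BigInteger.ModPow(den, Prime - 2, Prime);
            secret = Mod(secret + pi.Y * num * inverse);
        }
        return secret;
    }
}
```

With n = 4 and k = 2 as in the example above, ShamirSketch.Split(secret, 4, 2) gives four shares and ShamirSketch.Combine on any two of them returns the original number; a single share on its own reveals nothing about the secret.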

Encryption of Data that should be stored in a Database. And understanding the concept of the "key" used,

I'm new to C# and ASP.NET and I have to do a project now. It deals with confidential data of a firm's employees so it needs to be encrypted. I am not sure if I will be able to get through with my own encryption algorithm. If I use any existing algorithms, they said that I should find a foolproof way to store the key.
To be honest, I don't really understand the term "key" in encryption. I would like someone to explain it briefly and help me with how I should move forward with this project.

http://en.wikipedia.org/wiki/Key_%28cryptography%29
dunno, but maybe start there?

IMHO:
as already advised, don't cobble up your "own", use existing algorithms in the framework that have been tested extensively. Whatever weaknesses they may have will (likely) still be better than what you can cobble up on your own.
understand what needs to be encrypted which pretty much means at some point will need to be decrypted vs. data that needs to be hashed (one-way - e.g. passwords).
decide if you want this to happen on the application side or perhaps, if resources are available to you like SQL server (to store data), on the database side (discuss this with your DBA). You can do both encryption and hashing in SQL server alone.
on the application side, you can think about storing keys in your web.config and subsequently encrypting that section - just like the option to do so for your db connection strings (encrypting the connection section of web.config). This way even your keys aren't in plain text.

The first rule of cryptography - never use your own algorithm, unless you are a Ph.D. and several other Ph.D's are helping you, even then, use only after public auditing.
What they mean about storing the key is that it shouldn't be exposed anywhere: if an attacker can get the key, they can decrypt all data in the database. Currently, there is no known foolproof way to do this. You can store the key in a file outside the website's root folder. This way, either the server itself must be compromised, your app must be compromised (e.g. by making it display the "../../key.txt" file, thus escaping the webroot), or your app must be tricked into encrypting/decrypting the data transparently for the attacker (e.g. by having a bug that allows authentication bypass, thus allowing them to use your app to talk to the database).
For the last part of the question, use @Haxx's answer :)

I'm making my first encryption program, any tips?

I'm making a program in C# that has passwords and I need to encrypt them. So far I flip the string backwards (so hello becomes olleh), and then I use a loop that goes through each character, with an inner loop that goes through another string containing the converted letters to see if they match. Using this, hello = Ghh#$, so it works fine. So anyway, is there any extra stuff I can add to it? PS: what is salting, and how is hashing one way?

Rule one of cryptography is don't write your own encryption scheme. Instead use a library such as http://www.cryptlib.com/why-use-cryptlib-10-good-reasons which has bindings for C#.
For more information check out the first answer to:
https://security.stackexchange.com/questions/2202/lessons-learned-and-misconceptions-regarding-encryption-and-cryptology

First off, the difference between encryption and hashing is, at a high level, that encrypted data can be decrypted with the right key, whereas hashed data cannot be retrieved except via brute force methods like pregeneration or rainbow tables.
Hashed passwords are validated by hashing the user's input each time that they log in in the same way that you do when they create the account, and comparing the result of the hash. For any given input, the hashed result should be the same.
Obligatory rant:
There is a good argument to be made that passwords should always be hashed using a cryptographically-strong algorithm. You may hear the excuse that "my application/web page/etc is not all that important, there is no sensitive information there", or "I'm just learning so it isn't important", but the fact is that if I can crack the security of one website, or you leave your machine logged in and I steal your password file from your "educational" app, I can take all of the users' email addresses and virtually guarantee that at least a few of them will use the same password for that gmail or yahoo account. I can then send reset requests for just about any site that their email tells me they have an account for and get access to those also. So it is very important that no matter what software you are writing, if it stores passwords, you should do the responsible thing and salt + hash them properly.

Salt: http://en.wikipedia.org/wiki/Salt_(cryptography)
Hashing: http://en.wikipedia.org/wiki/Cryptographic_hash_function
Simplistic example:
// HashingAlgorithm is a placeholder for whatever hash function you choose
var salt = "abc123";
var hashedPassword = HashingAlgorithm("password");
var hashedSaltedPassword = HashingAlgorithm("password" + salt);
Console.WriteLine(hashedPassword);
Console.WriteLine(hashedSaltedPassword);
writes out something like
aIdekXieklKq309nasdf
dfk#cxk)8lkdfesijcde
The point of salting your passwords is to prevent dictionary attacks. If anyone figures out your HashingAlgorithm, they can brute-force run through every word in the dictionary and figure out that "password" hashes to "aIdekXieklKq309nasdf". If you salt your to-be-hashed words, they'd have to know your salt too.
Also, it's good to hash your passwords into a database instead of using some two-way algorithm; that way nobody with access to the database (including you and your co-workers) can look and see what your users use as passwords (since a lot of users tend to reuse the same passwords on multiple sites).
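If you want something closer to real code than the placeholder above, here is one way it might look using PBKDF2 (Rfc2898DeriveBytes) from the framework, with a random salt per password instead of one fixed salt string. The class name PasswordHasher, the "salt:hash" storage format, and the iteration count are my own assumptions, and the constructor that takes a HashAlgorithmName needs .NET Framework 4.7.2 or later (or .NET Core).

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper showing salted, slow password hashing and verification.
public static class PasswordHasher
{
    const int SaltSize = 16;
    const int HashSize = 32;
    const int Iterations = 100000; // assumed work factor; tune for your hardware

    // Returns "base64(salt):base64(hash)", suitable for a single text column.
    public static string Hash(string password)
    {
        byte[] salt = new byte[SaltSize];
        using (var rng = RandomNumberGenerator.Create()) rng.GetBytes(salt);
        byte[] hash = Derive(password, salt);
        return Convert.ToBase64String(salt) + ":" + Convert.ToBase64String(hash);
    }

    // Re-derives the hash with the stored salt and compares the results.
    public static bool Verify(string password, string stored)
    {
        string[] parts = stored.Split(':');
        if (parts.Length != 2) return false;
        byte[] salt = Convert.FromBase64String(parts[0]);
        byte[] expected = Convert.FromBase64String(parts[1]);
        byte[] actual = Derive(password, salt);

        // Compare without exiting early, so timing does not leak
        // how many leading bytes matched.
        int diff = expected.Length ^ actual.Length;
        for (int i = 0; i < expected.Length && i < actual.Length; i++)
            diff |= expected[i] ^ actual[i];
        return diff == 0;
    }

    static byte[] Derive(string password, byte[] salt)
    {
        using (var kdf = new Rfc2898DeriveBytes(password, salt, Iterations, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(HashSize);
        }
    }
}
```

At registration you store PasswordHasher.Hash(password); at login you call PasswordHasher.Verify(enteredPassword, storedValue). Because the salt is random, hashing the same password twice gives two different stored values, which is what defeats precomputed tables.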

How should I derive the key and initialization vector for my AES encrypted database entries?

I've built a CMS system to allow users to create and manage online forms on my client's intranet app.
Of course some of the data handled by the forms may need to be encrypted, e.g. if the system is used to build a form that handles salary specifics or whatever. So I'm using the AesManaged class to symmetrically encrypt this sort of data prior to it going into our application db.
All is fine, but now, prior to release, I could do with a steer regarding the shared secret and salt.
My original idea was to make a (dynamic) shared secret by combining the (GUID-based) ID of the Form containing the encrypted field with the (again, GUID-based) id of the Question the field is the answer to:
FormId:QuestionId
My Salt is currently generated the same way, only with the order of Guids reversed ie.
QuestionID:FormID.
I'm new to this stuff, so I'm not sure if this is a sensible strategy or if I should be doing it some other way.

The salt should be a randomly generated value. Its purpose is to make dictionary/brute force attacks more difficult to execute. Wikipedia has a nice article on cryptographic salts:
http://en.wikipedia.org/wiki/Salt_(cryptography)
For the shared secret ideally it would not be a value that was stored unencrypted with the data that it was encrypting (such as your ids). It's generally a best practice that the key be chosen somehow by the end-user or admin so that they could rotate it periodically or if some sort of security breach occurred. This password key could be owned by each user of the CMS or perhaps by an admin account. If you have very serious security requirements you could pursue a third-party Key Management Server.
If the main goal here is more of obfuscation and the CMS will not be subject to some form of security audit then something along the lines of your initial idea would do. It would prevent the casual access of the data but would probably not pass an audit against formal standards that would require a random salt, a way to rotate the keys, and a way for the "owner" of the system to change the password such that you yourself could not access the data.
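As a sketch of the stricter variant: the secret comes from an admin or user (not from the form and question ids), a random salt is generated per field and stored alongside the data, the AES key is derived from secret + salt with PBKDF2, and the IV is random per encryption. EncryptedField and FieldCrypto are hypothetical names, the iteration count is an assumption, and the HashAlgorithmName constructor overload needs .NET Framework 4.7.2 or later (or .NET Core).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical container for what gets stored per encrypted answer.
public sealed class EncryptedField
{
    public byte[] Salt;   // random, stored in the clear next to the ciphertext
    public byte[] Iv;     // random, stored in the clear next to the ciphertext
    public byte[] Cipher; // AES-encrypted field value
}

public static class FieldCrypto
{
    const int Iterations = 100000; // assumed PBKDF2 work factor

    public static EncryptedField Encrypt(string plaintext, string secret)
    {
        var field = new EncryptedField { Salt = new byte[16] };
        using (var rng = RandomNumberGenerator.Create()) rng.GetBytes(field.Salt);

        using (var aes = Aes.Create())
        {
            aes.Key = DeriveKey(secret, field.Salt);
            aes.GenerateIV(); // fresh random IV for every encryption
            field.Iv = aes.IV;
            using (var encryptor = aes.CreateEncryptor())
            {
                byte[] plain = Encoding.UTF8.GetBytes(plaintext);
                field.Cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
            }
        }
        return field;
    }

    public static string Decrypt(EncryptedField field, string secret)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = DeriveKey(secret, field.Salt);
            aes.IV = field.Iv;
            using (var decryptor = aes.CreateDecryptor())
            {
                byte[] plain = decryptor.TransformFinalBlock(field.Cipher, 0, field.Cipher.Length);
                return Encoding.UTF8.GetString(plain);
            }
        }
    }

    // The key is derived from the secret and the stored random salt.
    static byte[] DeriveKey(string secret, byte[] salt)
    {
        using (var kdf = new Rfc2898DeriveBytes(secret, salt, Iterations, HashAlgorithmName.SHA256))
        {
            return kdf.GetBytes(32); // 256-bit AES key
        }
    }
}
```

Deriving a key per field is deliberately slow, so for forms with many encrypted answers you might derive one key per form submission and reuse it with a new IV for each field. Rotating the secret means decrypting and re-encrypting the existing data.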

How to encrypt a password for saving it later in a database or text file?

I want my application to save the password encrypted in a database or in a text file.
How can I do that, assuming that the database or text file can be opened by anyone?
Duplicate
Encrypting/Hashing plain text passwords in database
Not duplicate
I'm asking for code specific for .NET
EDIT: I'm saving the password for later use. I need to decode it and use it to login.
It doesn't have to be super secure, it just needs to be unreadable to the human eye, and difficult to decode with a trivial script.

StackOverflow readers don't know how to write secure password schemes and neither do you. If you're going to do that, save time by sticking with plain text. From Enough With The Rainbow Tables: What You Need To Know About Secure Password Schemes:
Rainbow tables are easy to beat. For each password, generate a random number (a nonce). Hash the password with the nonce, and store both the hash and the nonce. The server has enough information to verify passwords (the nonce is stored in the clear). But even with a small random value, say, 16 bits, rainbow tables are infeasible: there are now 65,536 “variants” of each hash, and instead of 300 billion rainbow table entries, you need quadrillions. The nonce in this scheme is called a “salt”.
Cool, huh? Yeah, and Unix crypt, almost the lowest common denominator in security systems, has had this feature since 1976. If this is news to you, you shouldn’t be designing password systems. Use someone else’s good one.
Use BCrypt - Strong Password Hashing for .NET and Mono. It's a single cleanly written .cs file that will continue to meet your needs as password cracking computers get faster.

Triple DES is one way to do it, as long as you mean "A password that my system needs to be able to recall in order to access a resource". If you mean the password is something a user needs to be able to gain access to your system, probably don't want encryption, just a hash will do. When you store the hashed password value, it is useless to anyone with direct database access, but can still be used for authentication. All you do is compare the stored hash against a hash of the incoming password. If they match, then you grant access.
It isn't perfect, by any means, but it is the way 99.999% of people store their passwords.
If you want to argue that you wish to provide the password back to a user if they lose/forget it, then please don't. Issue them with a temporary password (which you store hashed in the db) and get them to change it on first login.

Use Data Protection API either with the user or machine store (e.g. different key per account your program/database server runs under vs. one key per machine). This will help you decode the passwords later and you don't have to remember or store any encryption keys. The downside of it is that when you reinstall the system/delete the account you won't be able to recover the data, I believe.

If you use encryption for securely storing passwords, you'll need to store the encryption "key" somewhere, too. This will be the "weak link", since if someone gets hold of the encryption key, they will be able to decrypt the encrypted passwords.
Since this is passwords that we're talking about here, a much better solution is to use a one-way hash. You hash the password when the user first creates it (preferably hashing with a salt value) and store the resulting hash value. Since hashes are one-way, no one can reverse the hash to the original plain text value.
To check that a user's password is correct, you simply ask the user for the plain-text password, hash their input again and compare the resulting hash value with the hash value you have stored (taking salts into account of course). If the two hash values are the same, the user has entered the correct password.
Please see the following links for further info:
Hashing Password with Salt
For encryption (if you need to use that), I'd use Rijndael (AES).

Based on your question I can see two approaches depending on why you are storing the password.
A. if you only need to authenticate using their password and nothing else.
In that case, going using an algorithm that is not reversible (Hashing) would be your best choice. You will need to make sure of a couple of things:
Make sure that the connection is encrypted when transmitting the password from the client to the server. This will prevent it from being sniffed out. This is pretty trivial to do with web applications since the web server is doing the heavy lifting for you. If not, it gets a lot trickier and is the subject of a whole other question.
Choose a solid hashing algorithm to prevent collisions. I would recommend SHA-256 even if it does provide a larger result than SHA1 or MD5. The reference from Microsoft on using their implementation of the algorithm is here.
Salt the password to prevent attacks using rainbow tables (i.e. looking up the password in a large table with the precomputed hash and the associated password in clear text). The answer here (cited in your question) gives good pseudocode in Python on how to do it. There is also a good example of .NET code here.
B. if you need to be able to read the password for each user for other purposes than authenticating the user.
This case is easy if we are only talking about storing a password (or any kind of sensitive information) on a single computer (server). If that's the case, using the Microsoft Data Protection API would be a good solution, since it is tied to that computer and (depending on the way you work) the user under which your application runs, and it takes care of the worst of the job for you (creating, storing, and using keys). You can find some code reference from Microsoft here. If you need it on more than one system and are not willing to enter the password on each system you install your application on, then things get a lot more complex because you need to implement a lot of it from scratch. That would be the subject for another question, I would think.

If you need to decrypt the password for later use and it doesn't have to be SUPER secure, then use the method here:
http://support.microsoft.com/kb/307010
It's well documented, and easy to understand.

Do you ever need to decrypt it again? If not, use a hash function on it, then hash the password given by the user with the same hash function and check whether the hashes are equal.
The reason for preferring this over two-way encryption is that nobody can recover your original password from the stored value: a hash function is one-way, and many different inputs map to the same output.

Personally, I would use something that has one-way encryption - MD5, SHA1, etc...
You can use the FormsAuthentication class with its HashPasswordForStoringInConfigFile method. When validating the user, hash the entered password and compare it with the stored version.

Like ocdecio I would use TripleDes, I would also save the salt in the database too. The key for me is usually hard coded, but the salt should change for each encrypted item.

If you just need the password for an internal authentication process, you should not save the actual password, but save a hash of this password. When you need to check if a password is valid, you'll have to run the hash function on provided password and compare it with the hash you stored in your database/file. You can never find the original password from the hash.
If you need to keep the original password, you'll have to encrypt it. You can use for example a public key infrastructure if you have a process that writes the passwords (public key) and another one that reads them (private key).

Do you really need to be able to retrieve the password itself? If you're storing a password for the purposes of authenticating someone (or something), you should rather hash it (with salting) and then compare that hash to the hash of the password supplied by the party wishing to be authenticated.
If, on the other hand, you need to store the password in order to be able to retrieve it and supply it to some other authentication service later, then you might want to store it encrypted. In that case, use any decent symmetrical encryption algorithm you can, such as TripleDES or AES or Blowfish.

Briefly:
Get a big random number which you will keep private and only your application code will have access to.
Hash the password + random number with a hashing algorithm like SHA1; most programming languages have a cryptography framework.
Store the hashed password.
Later when you want to check inputted passwords, you can rehash the user input and compare to the "virtually" undecipherable stored passwords.

Here's a string encryption article with example code .NET
http://www.devarticles.com/c/a/VB.Net/String-Encryption-With-Visual-Basic-.NET/3/
There is no need to use anything fancy, because anyone with a little bit of skill and determination will break it anyway.
