Someone told me that he has seen software systems that:
retrieve MD5 encrypted passwords from other systems;
decrypt the encrypted passwords and
store the passwords in the database of the system using the systems own algorithm.
Is that possible? I thought that it wasn't possible / feasible to decrypt MD5 hashes.
I know there are MD5 dictionaries, but is there an actual decryption algorithm?
No. MD5 is not encryption (though it may be used as part of some encryption algorithms), it is a one way hash function. Much of the original data is actually "lost" as part of the transformation.
Think about this: An MD5 is always 128 bits long. That means that there are 2^128 possible MD5 hashes. That is a reasonably large number, and yet it is most definitely finite. And yet, there are an infinite number of possible inputs to a given hash function (and most of them contain more than 128 bits, or a measly 16 bytes). So there are actually an infinite number of possibilities for data that would hash to the same value. The thing that makes hashes interesting is that it is incredibly difficult to find two pieces of data that hash to the same value, and the chances of it happening by accident are almost 0.
A simple example for a (very insecure) hash function (and this illustrates the general idea of it being one-way) would be to take all of the bits of a piece of data, and treat it as a large number. Next, perform integer division using some large (probably prime) number n and take the remainder (see: Modulus). You will be left with some number between 0 and n - 1. If you were to perform the same calculation again (any time, on any computer, anywhere), using the exact same string, it will come up with the same value. And yet, there is no way to find out what the original value was, since there are an infinite number of numbers that have that exact remainder, when divided by n.
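A minimal Python sketch of that toy remainder hash may make the idea concrete. The function name `toy_hash` and the modulus 1_000_003 are assumptions chosen for illustration; this is not a real hash function and must not be used for anything security-related.

```python
# Toy "hash": interpret the bytes as one big integer and keep the remainder mod n.
N = 1_000_003  # arbitrary illustrative modulus (an assumption, not from the answer)

def toy_hash(data: bytes, n: int = N) -> int:
    return int.from_bytes(data, "big") % n

original = b"password"
h = toy_hash(original)

# Adding any multiple of n to the underlying number leaves the remainder unchanged,
# so a different input with the same hash is easy to build for this toy function.
other_number = int.from_bytes(original, "big") + 5 * N
other = other_number.to_bytes((other_number.bit_length() + 7) // 8, "big")

print(h == toy_hash(original))  # same input, same hash
print(h == toy_hash(other))     # different input, same hash
```

Given only `h`, nothing says which of the infinitely many matching inputs was the original one.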
That said, MD5 has been found to have some weaknesses, such that with some complex mathematics, it may be possible to find a collision without trying out 2^128 possible input strings. And the fact that most passwords are short, and people often use common values (like "password" or "secret") means that in some cases, you can make a reasonably good guess at someone's password by Googling for the hash or using a Rainbow table. That is one reason why you should always "salt" hashed passwords, so that two identical values, when hashed, will not hash to the same value.
Once a piece of data has been run through a hash function, there is no going back.
You can't - in theory. The whole point of a hash is that it's one way only. This means that if someone manages to get the list of hashes, they still can't get your password. Additionally it means that even if someone uses the same password on multiple sites (yes, we all know we shouldn't, but...) anyone with access to the database of site A won't be able to use the user's password on site B.
The fact that MD5 is a hash also means it loses information. For any given MD5 hash, if you allow passwords of arbitrary length there could be multiple passwords which produce the same hash. For a good hash it would be computationally infeasible to find them beyond a pretty trivial maximum length, but it means there's no guarantee that if you find a password which has the target hash, it's definitely the original password. It's astronomically unlikely that you'd see two ASCII-only, reasonable-length passwords that have the same MD5 hash, but it's not impossible.
MD5 is a bad hash to use for passwords:
It's fast, which means if you have a "target" hash, it's cheap to try lots of passwords and see whether you can find one which hashes to that target. Salting doesn't help with that scenario, but it helps to make it more expensive to try to find a password matching any one of multiple hashes using different salts.
I believe it has known flaws which make it easier to find collisions, although finding collisions within printable text (rather than arbitrary binary data) would at least be harder.
I'm not a security expert, so won't make a concrete recommendation beyond "Don't roll your own authentication system." Find one from a reputable supplier, and use that. Both the design and implementation of security systems is a tricky business.
Technically, it's 'possible', but under very strict conditions (rainbow tables, brute forcing based on the very small possibility that a user's password is in that hash database).
But that doesn't mean it's viable or secure.
You don't want to 'reverse' an MD5 hash. Using the methods outlined below, you'll never need to. 'Reversing' MD5 is actually considered malicious - a few websites offer the ability to 'crack' and bruteforce MD5 hashes - but all they are are massive databases containing dictionary words, previously submitted passwords and other words. There is a very small chance that it will have the MD5 hash you need reversed. And if you've salted the MD5 hash - this won't work either! :)
The way logins with MD5 hashing should work is:
During Registration:
User creates password -> Password is hashed using MD5 -> Hash stored in database
During Login:
User enters username and password -> (Username checked) Password is hashed using MD5 -> Hash is compared with stored hash in database
When 'Lost Password' is needed:
2 options:
User is sent a random password to log in with, then is prompted to change it on first login.
or
User is sent a link to change their password (with extra checking if you have a security question/etc) and then the new password is hashed and replaces the old hash in the database.
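The registration, login and reset steps above can be sketched roughly as follows. The `users` dictionary is a hypothetical stand-in for the database, and MD5 is used only to mirror the flow as described; other answers in this thread explain why a salted, slow password hash is the better choice.

```python
import hashlib
import hmac

users = {}  # hypothetical in-memory "database": username -> stored hex digest

def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    # Only the hash is stored, never the password itself.
    users[username] = md5_hex(password)

def login(username: str, password: str) -> bool:
    stored = users.get(username)
    if stored is None:
        return False
    # Hash the submitted password and compare digests in constant time.
    return hmac.compare_digest(stored, md5_hex(password))

def reset_password(username: str, new_password: str) -> None:
    # "Lost password": the old hash is simply replaced, never reversed.
    users[username] = md5_hex(new_password)

register("alice", "s3cret")
print(login("alice", "s3cret"))  # True
print(login("alice", "wrong"))   # False
```

At no point does the system need to recover the original password from the stored hash.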
Not directly. Because of the pigeonhole principle, there is (likely) more than one value that hashes to any given MD5 output. As such, you can't reverse it with certainty. Moreover, MD5 is made to make it difficult to find any such reversed hash (however there have been attacks that produce collisions - that is, produce two values that hash to the same result, but you can't control what the resulting MD5 value will be).
However, if you restrict the search space to, for example, common passwords with length under N, you might no longer have the irreversibility property (because the number of MD5 outputs is much greater than the number of strings in the domain of interest). Then you can use a rainbow table or similar to reverse hashes.
Not possible, at least not in a reasonable amount of time.
The way this is often handled is a password "reset". That is, you give them a new (random) password and send them that in an email.
You can't revert an MD5 password (in any language).
But you can:
give to the user a new one.
check in some rainbow table to maybe retrieve the old one.
No, he must have been confused about the MD5 dictionaries.
Cryptographic hashes (MD5, etc...) are one way and you can't get back to the original message with only the digest unless you have some other information about the original message, etc. that you shouldn't.
Decryption (directly getting the plain text from the hashed value, in an algorithmic way), no.
There are, however, methods that use what is known as a rainbow table. It is pretty feasible if your passwords are hashed without a salt.
MD5 is a hashing algorithm, you can not revert the hash value.
You should add a "change password" feature, where the user gives another password and the system calculates its hash and stores it as the new password.
There's no easy way to do it. This is kind of the point of hashing the password in the first place. :)
One thing you should be able to do is set a temporary password for them manually and send them that.
I hesitate to mention this because it's a bad idea (and it's not guaranteed to work anyway), but you could try looking up the hash in a rainbow table like milw0rm to see if you can recover the old password that way.
See all other answers here about how and why it's not reversible and why you wouldn't want to anyway.
For completeness though, there are rainbow tables which you can look up possible matches on. There is no guarantee that the answer in the rainbow table will be the original password chosen by your user so that would confuse them greatly.
Also, this will not work for salted hashes. Salting is recommended by many security experts.
No, it is not possible to reverse a hash function such as MD5: given the output hash value it is impossible to find the input message unless enough information about the input message is known.
Decryption is not a function that is defined for a hash function; encryption and decryption are functions of a cipher such as AES in CBC mode; hash functions do not encrypt nor decrypt. Hash functions are used to digest an input message. As the name implies there is no reverse algorithm possible by design.
MD5 has been designed as a cryptographically secure, one-way hash function. It is now easy to generate collisions for MD5 - even if a large part of the input message is pre-determined. So MD5 is officially broken and MD5 should not be considered a cryptographically secure hash anymore. It is however still infeasible to find an input message that leads to a given hash value: find X when only H(X) is known (and X doesn't have a pre-computed structure with at least one 128 byte block of precomputed data). There are no known practical pre-image attacks against MD5.
It is generally also possible to guess passwords using brute force or (augmented) dictionary attacks, to compare databases or to try and find password hashes in so-called rainbow tables. If a match is found then it is computationally certain that the input has been found. Hash functions such as MD5 are also still secure against second pre-image attacks: finding a different X' so that H(X') = H(X). So if an X is found it is computationally certain that it was indeed the input message. Otherwise you would have performed a second pre-image attack after all. Rainbow tables can be used to speed up the attacks and there are specialized internet resources out there that will help you find a password given a specific hash.
It is of course possible to re-use the hash value H(X) to verify passwords that were generated on other systems. The only thing that the receiving system has to do is to store the result of a deterministic function F that takes H(X) as input. When X is given to the system then H(X) and therefore F can be recalculated and the results can be compared. In other words, it is not required to decrypt the hash value to just verify that a password is correct, and you can still store the hash as a different value.
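A rough sketch of that idea: the receiving system only ever sees the imported MD5 value H(X) and stores F(H(X)). Here F is assumed to be salted PBKDF2-HMAC-SHA256 from Python's `hashlib`; the function names and the iteration count are illustrative choices, not something the answer prescribes.

```python
import hashlib
import hmac
import os

def legacy_md5(password: str) -> bytes:
    # H(X): what the old system stored.
    return hashlib.md5(password.encode("utf-8")).digest()

def wrap(legacy_hash: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    # F(H(X)): a deterministic function of the old hash (assumed here: PBKDF2).
    return hashlib.pbkdf2_hmac("sha256", legacy_hash, salt, iterations)

# Migration: only the old hash is available, not the password.
imported_hash = legacy_md5("correct horse")
salt = os.urandom(16)
stored = wrap(imported_hash, salt)

def verify(password: str, salt: bytes, stored: bytes) -> bool:
    # At login X is supplied, so H(X) and then F(H(X)) can be recomputed.
    return hmac.compare_digest(wrap(legacy_md5(password), salt), stored)

print(verify("correct horse", salt, stored))  # True
print(verify("wrong guess", salt, stored))    # False
```

The stored value differs from the original MD5 hash, yet nothing was ever decrypted.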
Instead of MD5 it is important to use a password hash or PBKDF (password based key derivation function). Such a function specifies how to use a salt together with a hash. That way identical hashes won't be generated for identical passwords (from other users or within other databases). Password hashes for that reason also do not allow rainbow tables to be used as long as the salt is large enough and properly randomized.
Password hashes also contain a work factor (sometimes configured using an iteration count) that can significantly slow down attacks that try to find the password given the salt and hash value. This is important as the database with salts and hash values could be stolen. Finally, the password hash may also be memory-hard so that a significant amount of memory is required to calculate the hash. This makes it much harder for an attacker to use special hardware (GPUs, ASICs, FPGAs etc.) to speed up the search. Other inputs or configuration options such as a pepper or the amount of parallelization may also be available to a password hash.
It will however still allow anybody to verify a password given H(X) even if H(X) is a password hash. Password hashes are still deterministic, so if anybody knows all the inputs and the hash algorithm itself then X can be used to calculate H(X) and - again - the results can be compared.
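As a hedged illustration of a salted password hash with a work factor, the sketch below uses PBKDF2-HMAC-SHA256 from Python's standard library. The helper names, the 16-byte salt and the iteration count of 200,000 are assumptions for the example; production values should follow current guidance.

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 200_000):
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Salt and iteration count are stored next to the digest; they are not secret.
    return salt, iterations, digest

def verify_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
    # Deterministic: same password, salt and iteration count give the same digest.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)

record_a = hash_password("hunter2")
record_b = hash_password("hunter2")

print(record_a[2] != record_b[2])            # same password, different stored hashes
print(verify_password("hunter2", *record_a)) # True
```

Raising `iterations` makes every guess proportionally more expensive for an attacker as well as for the server.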
Commonly used password hashes are bcrypt, scrypt and PBKDF2. There is also Argon2 in various forms which is the winner of the reasonably recent password hashing competition. Here on CrackStation is a good blog post on doing password security right.
It is possible to make it impossible for adversaries to perform the hash calculation needed to verify that a password is correct. For this a pepper can be used as input to the password hash. Alternatively, the hash value can of course be encrypted using a cipher such as AES and a mode of operation such as CBC or GCM. This however requires the storage of a secret / key independently and with higher access requirements than the password hash.
MD5 is considered broken, not because you can get back the original content from the hash, but because with work, you can craft two messages that hash to the same hash.
You cannot un-hash an MD5 hash.
There is no way of "reverting" a hash function in terms of finding the inverse function for it. As mentioned before, this is the whole point of having a hash function. It should not be reversible and it should allow for fast hash value calculation. So the only way to find an input string which yields a given hash value is to try out all possible combinations. This is called brute force attack for that reason.
Trying all possible combinations takes a lot of time and this is also the reason why hash values are used to store passwords in a relatively safe way. If an attacker is able to access your database with all the user passwords inside, you lose in any case. If you have hash values and (idealistically speaking) strong passwords, it will be a lot harder for the attacker to get the passwords out of the hash values.
Storing the hash values is also no performance problem because computing the hash value is relatively fast. So what most systems do is compute the hash value of the password the user keyed in (which is fast) and then compare it to the stored hash value in their user database.
You can find online tools that use a dictionary to retrieve the original message.
In some cases, the dictionary method might just be useless:
if the message is hashed together with a salt
if the message is hashed more than once
For example, here is one MD5 decrypter online tool.
The only thing that can work (assuming the passwords are simply hashed, without any salt; if a salt was used, you must know it) is a dictionary attack. Get a dictionary attack tool and word lists containing many words, numbers and so on, then build a table with two columns: one holds each dictionary entry and the other holds the hash of that entry. Compare the hashes, and if one matches you have the password.
That's the only way without going into cryptanalysis.
The MD5 hash algorithm is not reversible, so decoding MD5 is not possible, but some websites hold large sets of matched passwords and hashes, so you can try to look up an MD5 hash online.
Try online :
MD5 Decrypt
md5online
md5decrypter
Yes, exactly what you're asking for is possible.
It is not possible to 'decrypt' an MD5 password without help, but it is possible to re-encrypt an MD5 password into another algorithm, just not all in one go.
What you do is arrange for your users to be able to logon to your new system using the old MD5 password. At the point that they login they have given your login program an unhashed version of the password that you prove matches the MD5 hash that you have. You can then convert this unhashed password to your new hashing algorithm.
Obviously, this is an extended process because you have to wait for your users to tell you what the passwords are, but it does work.
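A sketch of that upgrade-on-login approach, under stated assumptions: the `user` record layout is hypothetical, and salted PBKDF2-HMAC-SHA256 stands in for "your new hashing algorithm".

```python
import hashlib
import hmac
import os

# Hypothetical record imported from the old system: only the MD5 hash is known.
user = {"md5": hashlib.md5(b"letmein").hexdigest(), "salt": None, "pbkdf2": None}

def new_hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def login(user: dict, password: str) -> bool:
    if user["pbkdf2"] is not None:
        # Already migrated: check against the new hash only.
        return hmac.compare_digest(new_hash(password, user["salt"]), user["pbkdf2"])
    # Not migrated yet: check the submitted password against the old MD5 hash.
    old = hashlib.md5(password.encode("utf-8")).hexdigest()
    if not hmac.compare_digest(old, user["md5"]):
        return False
    # The plain password is available right now, so re-hash it and drop the MD5.
    user["salt"] = os.urandom(16)
    user["pbkdf2"] = new_hash(password, user["salt"])
    user["md5"] = None
    return True

print(login(user, "letmein"))  # True, and the record is upgraded as a side effect
print(user["md5"])             # None
```

Accounts that never log in again keep their MD5 hash, which is the "extended process" caveat mentioned above.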
(NB: seven years later, oh well hopefully someone will find it useful)
No, it cannot be done. Either you can use a dictionary, or you can try hashing different values until you get the hash that you are seeking. But it cannot be "decrypted".
MD5 has its weaknesses (see Wikipedia), so there are some projects which try to precompute hashes. Wikipedia also hints at some of these projects. One I know of (and respect) is Ophcrack. You cannot tell the user their own password, but you might be able to tell them a password that works. But I think: just mail them a new password in case they forgot.
In theory it is not possible to decrypt a hash value but you have some dirty techniques for getting the original plain text back.
Bruteforcing: Every hash algorithm is open to brute forcing. Modern GPUs make this practical through massive parallelism: candidate plain texts are hashed in bulk on the graphics processor until one matches. The tool hashcat does this job. Last time I checked the CUDA version of it, I was able to brute-force a 7-character password within six minutes.
Internet search: Just copy and paste the hash on Google and see If you can find the corresponding plaintext there. This is not a solution when you are pentesting something but it is definitely worth a try. Some websites maintain the hash for almost all the words in the dictionary.
MD5 is a cryptographic (one-way) hash function, so there is no direct way to decode it. The entire purpose of a cryptographic hash function is that you can't undo it.
One thing you can do is a brute-force strategy, where you guess what was hashed, then hash it with the same function and see if it matches. Unless the hashed data is very easy to guess, it could take a long time though.
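The guess-hash-compare loop can be sketched in a few lines of Python. The function name, the lowercase-only alphabet and the length limit of 4 are assumptions that keep the example fast; real search spaces are far larger.

```python
import hashlib
import itertools
import string

def brute_force_md5(target_hex: str, alphabet: str = string.ascii_lowercase, max_len: int = 4):
    # Try every string over the alphabet, shortest first, until a hash matches.
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            guess = "".join(combo)
            if hashlib.md5(guess.encode("utf-8")).hexdigest() == target_hex:
                return guess
    return None  # nothing in the searched space matched

target = hashlib.md5(b"cab").hexdigest()
found = brute_force_md5(target)
print(found)
```

Each extra character multiplies the work by the alphabet size, which is why long, unpredictable inputs stay out of reach of this approach.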
It is not possible to put a hash of a password into an algorithm and get the password back in plain text, because hashing is a one way thing. But what people have done is to generate hashes and store them in a big table so that when you enter a particular hash, it checks the table for the password that matches the hash and returns that password to you. An example of a site that does that is http://www.md5online.org/ . Modern password storage systems counter this by salting, so that when the same password is entered into a password box during registration, different hashes are generated.
No, you cannot decrypt or reverse MD5, as it is a one-way hash function, unless a far more extensive vulnerability is found in MD5.
Another way: some websites hold large databases of passwords and their hashes, so you can try to look up your MD5 or SHA1 hash string online.
I tried a website like http://www.mycodemyway.com/encrypt-and-decrypt/md5 and it worked fine for me, but this depends entirely on your hash: only if that hash is stored in the site's database can you get the actual string.
The current top-voted answer to this question states:
Another one that's not so much a security issue, although it is security-related, is complete and abject failure to grok the difference between hashing a password and encrypting it. Most commonly found in code where the programmer is trying to provide unsafe "Remind me of my password" functionality.
What exactly is this difference? I was always under the impression that hashing was a form of encryption. What is the unsafe functionality the poster is referring to?
Hashing is a one way function (well, a mapping). It's irreversible, you apply the secure hash algorithm and you cannot get the original string back. The most you can do is to generate what's called "a collision", that is, finding a different string that provides the same hash. Cryptographically secure hash algorithms are designed to make collisions infeasible to find. You can attack a secure hash by the use of a rainbow table, which you can counteract by adding a salt to the password before hashing it.
Encrypting is a proper (two way) function. It's reversible, you can decrypt the mangled string to get original string if you have the key.
The unsafe functionality it's referring to is that if you encrypt the passwords, your application has the key stored somewhere and an attacker who gets access to your database (and/or code) can get the original passwords by getting both the key and the encrypted text, whereas with a hash it's impossible.
People usually say that if a cracker owns your database or your code he doesn't need a password, thus the difference is moot. This is naïve, because you still have the duty to protect your users' passwords, mainly because most of them do use the same password over and over again, exposing them to a greater risk by leaking their passwords.
Hashing is a one-way function, meaning that once you hash a password it is very difficult to get the original password back from the hash. Encryption is a two-way function, where it's much easier to get the original text back from the encrypted text.
Plain hashing is easily defeated using a dictionary attack, where an attacker just pre-hashes every word in a dictionary (or every combination of characters up to a certain length), then uses this new dictionary to look up hashed passwords. Using a unique random salt for each hashed password stored makes it much more difficult for an attacker to use this method. They would basically need to create a new unique dictionary for every salt value that you use, slowing down their attack terribly.
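A small sketch of why a precomputed dictionary stops working once a salt is involved. The four-word dictionary is hypothetical, and MD5 is used only because it is the hash under discussion.

```python
import hashlib
import os

words = ["password", "secret", "letmein", "qwerty"]  # hypothetical tiny dictionary

# The attacker hashes every dictionary word once, ahead of time.
lookup = {hashlib.md5(w.encode("utf-8")).hexdigest(): w for w in words}

# Unsalted hash: a single table lookup recovers the password.
unsalted = hashlib.md5(b"secret").hexdigest()
cracked_unsalted = lookup.get(unsalted)

# Salted hash: the precomputed table no longer contains a matching entry.
salt = os.urandom(16)
salted = hashlib.md5(salt + b"secret").hexdigest()
cracked_salted = lookup.get(salted)

def crack_with_salt(salt: bytes, target_hex: str):
    # With the salt known, the whole dictionary must be re-hashed for this one salt.
    for w in words:
        if hashlib.md5(salt + w.encode("utf-8")).hexdigest() == target_hex:
            return w
    return None

print(cracked_unsalted, cracked_salted, crack_with_salt(salt, salted))
```

The salt does not make a weak password strong; it only forces the attacker to repeat the dictionary work for every stored hash.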
It's unsafe to store passwords using an encryption algorithm because if it's easier for the user or the administrator to get the original password back from the encrypted text, it's also easier for an attacker to do the same.
If the password is encrypted, it remains a hidden secret from which someone holding the key can extract the plain text password. When the password is hashed, however, you can relax, as there is hardly any method of recovering the password from the hash value.
Extracted from Encrypted vs Hashed Passwords - Which is better?
Is encryption good?
Plain text passwords can be encrypted using symmetric encryption algorithms like DES, AES or any other algorithm and be stored inside the database. At authentication (confirming the identity with user name and password), the application will decrypt the encrypted password stored in the database and compare it with the user-provided password for equality. With this type of password handling, even if someone gets access to the database tables the passwords will not be simply reusable. However, there is bad news in this approach as well. If somehow someone obtains the cryptographic algorithm along with the key used by your application, he/she will be able to view all the user passwords stored in your database by decrypting them. "This is the best option I got", a software developer may scream, but is there a better way?
Cryptographic hash function (one-way-only)
Yes there is; maybe you have missed the point here. Did you notice that there is no requirement to decrypt and compare? Suppose there is a one-way-only conversion, where the password can be converted into some converted-word but the reverse operation (generating the password from the converted-word) is impossible. Now even if someone gets access to the database, there is no way that the passwords can be reproduced or extracted from the converted-words. With this approach there is hardly any way that someone could learn your users' top secret passwords, and this protects users who use the same password across multiple applications. What algorithms can be used for this approach?
I've always thought that Encryption can be converted both ways, in a way that the end value can bring you to original value and with Hashing you'll not be able to revert from the end result to the original value.
Hashing algorithms are usually cryptographic in nature, but the principal difference is that encryption is reversible through decryption, and hashing is not.
An encryption function typically takes input and produces encrypted output that is the same, or slightly larger size.
A hashing function takes input and produces a typically smaller output, typically of a fixed size as well.
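The fixed output size is easy to observe. In this short sketch (the sample inputs are arbitrary), MD5 digests of an empty input, a one-byte input and a one-megabyte input all come out at 16 bytes, i.e. 128 bits.

```python
import hashlib

inputs = [b"", b"a", b"a" * 1_000_000]
digests = [hashlib.md5(data).digest() for data in inputs]
sizes = [len(d) for d in digests]

print(sizes)  # [16, 16, 16]
```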
While it isn't possible to take a hashed result and "dehash" it to get back the original input, you can typically brute-force your way to something that produces the same hash.
In other words, if an authentication scheme takes a password, hashes it, and compares it to a hashed version of the required password, it might not be required that you actually know the original password, only its hash, and you can brute-force your way to something that will match, even if it's a different password.
Hashing functions are typically created to minimize the chance of collisions and make it hard to just calculate something that will produce the same hash as something else.
Hashing:
It is a one-way algorithm: once hashed, the value cannot be rolled back, and this is its sweet spot compared with encryption.
Encryption
If we perform encryption, there will be a key to do this. If this key is leaked, all of your passwords can be decrypted easily.
On the other hand, if you used hashed passwords, then even if your database is hacked or your server admin takes data from the DB, the attacker will not be able to break these hashed passwords. This is practically impossible if we use hashing with a proper salt and the additional security of PBKDF2.
If you want to take a look at how should you write your hash functions, you can visit here.
There are many algorithms to perform hashing.
MD5 - Uses the Message Digest Algorithm 5 (MD5) hash function. The output hash is 128 bits in length. The MD5 algorithm was designed by Ron Rivest in the early 1990s and is not a preferred option today.
SHA1 - Uses the Secure Hash Algorithm 1 (SHA-1) hash function, published in 1995. The output hash is 160 bits in length. Although still widely used, this is not a preferred option today.
HMACSHA256, HMACSHA384, HMACSHA512 - Use the functions SHA-256, SHA-384, and SHA-512 of the SHA-2 family. SHA-2 was published in 2001. The output hash lengths are 256, 384, and 512 bits, respectively, as the hash functions’ names indicate.
Ideally you should do both.
First hash the password for one-way security. Use a salt for extra security.
Then encrypt the hash to defend against dictionary attacks if your database of password hashes is compromised.
As correct as the other answers may be, in the context that the quote was in, hashing is a tool that may be used in securing information, encryption is a process that takes information and makes it very difficult for unauthorized people to read/use.
Here's one reason you may want to use one over the other - password retrieval.
If you only store a hash of a user's password, you can't offer a 'forgotten password' feature that retrieves the original password.
I have an encoding application written in C# where users can optionally encrypt messages. I had been using the class in this answer, and it turns out I'm in good company because I found several places online that use the exact same code (one of which is Netflix's Open Source Platform).
However, comments to that answer (as well as later edits to that answer) led me to believe that this method was insecure. I opted to use the class in this answer to the same question instead.
How secure is AES encryption if you use a constant salt? How easily can this method be broken? I admit that I have very little experience in this area.
AES is a block cipher. A block cipher's input is a key and a block of plaintext. A block cipher is usually used in a block cipher mode of operation. All secure modes of operation use an Initialization Vector or IV. Otherwise identical plaintext would encrypt to identical ciphertext (for the same key), and this is leaking information.
Salt is not used by AES or modes of operation. It's usually used as input for Key Derivation Functions (KDFs), especially Password Based Key Derivation Functions (PBKDFs). .NET's Rfc2898DeriveBytes implements the PBKDF2 function as defined in - you guessed it - RFC 2898: "PKCS #5: Password-Based Cryptography Specification Version 2.0".
If you use a static salt in a PBKDF2 then you would get the same key as output (for the same number of iterations). Now if you would ever leak the resulting key then all your ciphertext would be vulnerable. And if you would use multiple passwords then an attacker would be able to build a rainbow table; the PBKDF2 work factor would become less important; the attacker can simply build one table and then try all the resulting keys on all possible ciphertexts.
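The effect of a static salt on key derivation can be shown with a short sketch. It uses Python's `hashlib.pbkdf2_hmac` rather than .NET's Rfc2898DeriveBytes, and the salt value, password and iteration count are made-up examples.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    # PBKDF2-HMAC-SHA256 producing a 32-byte key (suitable in size for AES-256).
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations, dklen=32)

STATIC_SALT = b"constant-salt-01"  # hypothetical hard-coded salt

# Static salt: the same password always yields the same key, for every user and message.
key1 = derive_key("hunter2", STATIC_SALT)
key2 = derive_key("hunter2", STATIC_SALT)

# Random salt: the same password yields unrelated keys each time.
key3 = derive_key("hunter2", os.urandom(16))
key4 = derive_key("hunter2", os.urandom(16))

print(key1 == key2, key3 == key4)  # True False
```

With the static salt, one precomputed table of password-to-key mappings works against every ciphertext the application ever produces.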
So, as the salt is not actually used for AES it doesn't make much of a difference for the security. It is however still a horrible sin, even worse than using the default iteration count for PBKDF2 / Rfc2898DeriveBytes.
Note that horrible security sins are committed by a large number of people on a daily basis. That there are many many many persons that get it wrong doesn't tell you that you are in "good company". That there are 289 upvotes just tells you that SO answers about cryptography should not be trusted based on vote count.
Salt is there for a reason.
This enables same input to be encrypted differently.
If an attacker would really insist, he can find some patterns that repeat themselves in encryption without salt, and eventually can get to your key more easily.
Still, the attacker would have to work very hard.
Using a constant salt is equivalent to not using a salt at all.
And it is highly recommended to use it, as it has no effect on the decryption process.
This question is more of a design know how rather than a pure technical problem.
I am wondering what would be the best way to implement a authentication mechanism where users are asked to enter/select characters from specific positions in their whole password. ex. asking the user to enter/select the 3rd, 5th and 9th characters and then checking whether they have entered/selected the correct characters from their original password.
As far as I know, passwords are salted and encrypted using an irreversible algorithm before storing them. When a user enters his whole password, it is then salted again, encrypted and compared with the stored value. But in the above case where only certain characters are entered/selected how would one check against the whole password?
Below are the two (unsafe) ways I know to implement the above
Breaking the original password into individual characters and then salting and encrypting them. Then comparing the user entered value against the specific character.
Breaking the original password into predefined set of combinations like 2nd, 3rd & 7th characters together and then salting and encrypting them. Later comparing with the user entered value.
But I think with the above implementations it is easy to crack the password rather than cracking a password which is encrypted as a whole.
What would be the safest way of implementing this?
UPDATE:
Just found that this topic has been discussed in detail here and it mentions about using HSM and symmetric encryption.
Decomposing the password into different parts that a hacker can independently attack reduces the overall complexity of the password significantly. Essentially you allow the hacker to play "20 questions" with the password: "Are the separate digits 500? No, how about 750?"
Additionally, the vast majority of typical app users will find this task difficult, challenging and annoying.
I would not suggest doing this.
You are correct that the salted, hashed (not encrypted) password is typically stored. If you are hashing rather than encrypting, you would need some mechanism to separately know what those particular digits of the password are. You should be salting/hashing. Several high-profile sites had many user accounts compromised because they used encryption, and the encryption key was found by hackers.
By doing this you're lowering the entropy the password gives you. Take for example an 8 character password. Now let's assume that your password is chosen at random and we're trying to brute force it just by trying every combination. Because order matters and characters may repeat, the number of candidates is n^r, so for 8 characters (r=8) drawn from upper and lower case letters and digits (n = 26*2+10 = 62) there are 62^8 = 218,340,105,584,896 different combinations.
Now let's say that out of those you ask them to give you the 1st, 3rd and 6th characters. You've effectively reduced the number to choose (r) to 3, because to brute force every combination I only need 3 characters. The total number of combinations you have now is 62^3 = 238,328, which is a LOT easier to crack. In fact this would take no time.
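To check the arithmetic, this sketch counts the candidates two ways: as ordered strings (n^r, the count a brute-force search over passwords actually faces) and with the combinations-with-repetition formula C(n+r-1, r), which ignores character order. The variable names are illustrative.

```python
import math

n = 26 * 2 + 10  # upper case + lower case + digits = 62 symbols

# Ordered strings with repetition: n ** r
ordered_full = n ** 8
ordered_three = n ** 3

# Combinations with repetition (order ignored): C(n + r - 1, r)
multiset_full = math.comb(n + 8 - 1, 8)
multiset_three = math.comb(n + 3 - 1, 3)

print(ordered_full, ordered_three)
print(multiset_full, multiset_three)
```

Whichever way it is counted, asking for only three characters shrinks the search space by many orders of magnitude.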
You should never reduce your entropy. In fact, I recommend making passwords at least 10 characters long, with letters, numbers and symbols.
As to your question about "irreversible encryption," you're referring to hashing. Passwords should be stored hashed and salted, and in fact they should use a slow hash like BCrypt, SCrypt or PBKDF2. But to be able to do what you're describing, you would be required to encrypt the password with reversible encryption. This is a no-no because the key could get stolen, in which case your passwords might as well be in plain text.
The best approach to storing passwords is to not store them (use OpenID or something else if you can, so you don't have to take on the responsibility). The second best approach is to store them as if you're going to lose them: salted and hashed with a very slow algorithm, which buys you enough time to warn your users and gives them enough time to change their passwords.
Passwords are typically stored in a hashed format. Each one is hashed together with its own salt, a random value that is mixed into the input of the hash. This hashing is one-way: there is no way to decrypt it.
This way it is extremely difficult to build a rainbow table of passwords, because you would need to build one for every possible salt too. See the Wikipedia article: http://en.wikipedia.org/wiki/Rainbow_table
The method you are proposing would severely degrade security that you would get from a good password.
I am trying to get back a string from its hash value?
string str="Hello";
int hashStr=str.GetHashCode(); // hash value of "Hello" is -694847
Can I get back my string (i.e. "Hello") from the hashed value?
UPDATED
actually i am thinking to save password into my database after hashing to make it secure...
So it means a different password even have same value?
There are exactly 2^32 possible hash codes but way, way more strings. Thus, by the pigeonhole principle, there have to be multiple strings mapping to the same hash code. Therefore, an inverse map from hash code to string is impossible.
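To see the pigeonhole argument in action, here is a small sketch that shrinks the hash to 16 bits so that a collision turns up quickly. The `tiny_hash` function is a toy invented for this illustration (it keeps the top bits of SHA-256). It is not GetHashCode, and Python is used only for convenience.

```python
import hashlib


def tiny_hash(text: str, bits: int = 16) -> int:
    """Toy hash (hypothetical): keep only the top `bits` bits of SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int = 16) -> tuple[str, str]:
    """Try 'password0', 'password1', ... until two strings share a hash.

    With only 2**bits possible codes, a collision is guaranteed after at
    most 2**bits + 1 distinct inputs.
    """
    seen: dict[int, str] = {}
    candidate = 0
    while True:
        text = f"password{candidate}"
        code = tiny_hash(text, bits)
        if code in seen:
            return seen[code], text
        seen[code] = text
        candidate += 1


first, second = find_collision()
print(first, second, tiny_hash(first), tiny_hash(second))
```

Both strings are equally valid preimages of the same code, so nothing in the code alone can say which one was the original input.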
Edit: Response to your update.
actually i am thinking to save password into my database after hashing to make it secure...
So it means a different password even have same value?
Yes, it is possible for two passwords to have the same hash. This is basically a restatement of the above. But you shouldn't use GetHashCode to hash the password. Instead, use something secure like SHA-2.
To go one step further, never try to roll your own encryption/security etc. Find a library that does it for you.
actually I am thinking to save password into my database after hashing to make it secure
You are not competent to implement this code.
That's nothing to feel bad about. I'm not competent to do so either, and I've studied security systems for years. By studying security systems I've learned that security systems are insanely difficult to get right, require years of experience and detailed expertise of a complex domain. That's how I know I'm not competent. The fact that you think that hashes might be reversible indicates to me that you are not a security professional.
My advice: hire a security professional to do this task for you. There is no point in spending good money to make a bad security system that doesn't actually protect your resources. Rather than rolling your own cheap system now and spending a lot more money on cleaning up the disaster later, spend a little more up front now and get a professional implementation.
Furthermore, the documentation for GetHashCode specifically states that it is not suitable to be used for password hashing because the algorithm could be changed at any time. In fact the hash algorithm did change between CLR v1 and CLR v2, and that broke every single vendor who relied upon GetHashCode for a password hash who upgraded their system. GetHashCode is not stable, it is not secure, it is not crypto strength and it is not based on any industry standard algorithm. DO NOT UNDER ANY CIRCUMSTANCES use it for crypto hashing.
One answer that is missing here is explaining to the OP that hashing is not encryption. The terms hashing and encryption are often confused by junior programmers (myself included at one point) who need to deal with security for the first time.
From Wikipedia: A hash function is any well-defined procedure or mathematical function that converts a large, possibly variable-sized amount of data into a small datum, usually a single integer that may serve as an index to an array (cf. associative array). The values returned by a hash function are called hash values, hash codes, hash sums, checksums or simply hashes.
From Wikipedia: Encryption is the process of transforming information (referred to as plaintext) using an algorithm (called cipher) to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key.
Edit for Update:
Yes. Though unlikely and highly dependent on the type of hash algorithm, hashing of two or more different pieces of data could yield the same value.
Password hashing is often used to secure passwords in a database. But you cannot un-hash passwords. To check a password, you hash the value the user entered and compare the result with the stored hash to make sure they match. Here's an ASP-specific strategy for hashing passwords. Here is a good read, especially if you're working with web technologies.
Something not mentioned in here is you should salt your hashes.. yum yum.
What a salt is/does.
Let's say you get hold of someone's DB full of hashed passwords. If they hashed with no salt, then "breaking" passwords would be as easy as downloading a large pre-hashed dataset of a crap-ton of strings.
If the hash from one string matches, then you have a good chance of knowing the password. Even if it's not the correct password, you can still log in with it since it gives the same hash.
This is where salting your hashes comes in. If you add a salt (aka a pre-determined random string) to a password before it is hashed, then you can't just pre-hash a ton of strings.
Example:
No Salt:
Password: ABCD hashes into 1234EFG
A large list of pre-hashed strings has an entry with a hash of 1234EFG. The matching string may or may not be ABCD, but it will still work.
With Salt:
Password: ABCD concat 0315927429 hashes into 43BCF1
Each password has a different salt, so you can't use one precomputed hash lookup table; you'd have to re-compute the hashes for every password.
Re-computing would be incredibly time consuming. Now, the salt doesn't have to be securely stored for it to add lots of this benefit. Even if you store the salt in the same table, it would be incredibly hard for anyone to make a hash lookup to try to reverse any one person's password.
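The effect described above can be sketched in a few lines. This is only an illustration under some assumptions: the word list is made up, and plain SHA-256 is used to keep the example short, whereas a real system would use a deliberately slow function. The salt is appended to the password, as in the "ABCD concat 0315927429" example.

```python
import hashlib
import os


def unsalted_hash(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted_hash(password: str, salt: bytes) -> str:
    # Password concatenated with a per-password random salt, then hashed.
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


# The attacker's precomputed table: hash -> string (hypothetical word list).
common_passwords = ["password", "secret", "ABCD", "letmein"]
lookup = {unsalted_hash(p): p for p in common_passwords}

# Unsalted: a stolen hash is a direct key into the table.
stolen_unsalted = unsalted_hash("ABCD")
print(lookup.get(stolen_unsalted))

# Salted: the same password yields different hashes, none in the table.
salt_a, salt_b = os.urandom(16), os.urandom(16)
stolen_a = salted_hash("ABCD", salt_a)
stolen_b = salted_hash("ABCD", salt_b)
print(lookup.get(stolen_a), stolen_a == stolen_b)
```

The first lookup recovers the password. The salted hashes miss the table and differ from each other, so an attacker who knows the salts still has to redo the work for each account separately.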
To other responder: "One answer that is missing here is explaining to the OP that hashing is not encryption."
Hashes are sometimes referred to as "one-way encryption". This is a bad description and adds to the confusion you mentioned.
As others said, in general you can't do it as string to hash isn't a one-to-one function; infinite number of strings but only 2^32 ~ 4 billion hashes. That said, you can do a dictionary attack against an unsalted hash. Get a cluster of computers to calculate hashes for a wide variety of likely strings (e.g., dictionary words) and find a hash that matches.
The short answer to your question is : No. The hash is just one way.
If you want to secure your password as you said in the update, hash it with a hashing algorithm (MD5, SHA1, ...) and then store the hash in the database. When you want to verify the password given by the user, just hash it and compare the result to the hash stored in the database.
actually i am thinking to save password into my database after hashing to make it secure...
So it means a different password even have same value?
GetHashCode is not a Cryptographic hash function, so it isn't really appropriate for this purpose.
Yes, different passwords can have the same value. Even so, this still makes the user's passwords more secure, though this can safely be done client-side rather than server-side for improved protection. The purpose of hashing passwords before storing them is to make sure your database* cannot be used to determine a user's passwords. An attacker could still use the hashes stuck in your database to pose as your users, but knowing a user's actual site password is more valuable, since a good chunk of your users will use the same passwords everywhere else.
*There are other similar attacks this protects against like man-in-the-middle attacks, but in general it's all about ensuring you don't store a user's password in your database in plain text.
You can't get back value from hashed value, but what you can do (and this is what is done in almost every website that saves hashed passwords) is to compare the hash of the just-entered password to the hash you've saved.
And about your second question, it is true that more than one text can match one hash, but it's not like the hash of "hello" is equal to the hash of "goodbye". It's more like the hash of "hello" is equal to the hash of "sdd89sfu7w84haushf9478hfsklehf84hfwuhf...".
Don't use GetHashCode() to hash a password. It isn't a cryptographic hash and its resulting hash is way too short. GetHashCode is designed for use in HashTables and similar structures. A GetHashCode() which returns a constant value is valid (but slows down hashtables a lot).
For password-hashing there are several pitfalls:
Use a salt, so an attacker can't use rainbow tables (or similar pre-calculation attacks)
Use many iterations to slow down brute-force attacks
Use a cryptographic hash function
You'd better not implement it yourself, but instead use a standard Key Derivation Function (KDF) such as PBKDF2.
The .NET Framework contains classes to do this for you:
Rfc2898DeriveBytes implements PBKDF2
PasswordDeriveBytes implements PBKDF1
To check if the entered password is correct, you don't decrypt the saved password (which isn't possible); instead you hash the entered password with the same salt as the original password, and then compare the hashes.
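The hash-then-compare flow can be sketched as follows. This sketch uses Python's `hashlib.pbkdf2_hmac`, which implements the same PBKDF2 algorithm as Rfc2898DeriveBytes. It is not the .NET API itself. The iteration count, salt length and function names are assumptions chosen for illustration, not recommendations from the answers above.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed value; tune it to your hardware and policy
SALT_BYTES = 16       # assumed salt length


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); a fresh random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the entered password with the stored salt and compare."""
    _, derived = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(derived, expected)


# Registration: store both the salt and the derived key for the user.
stored_salt, stored_key = hash_password("correct horse battery staple")

# Login: nothing is decrypted; the attempt is hashed and compared.
print(verify_password("correct horse battery staple", stored_salt, stored_key))
print(verify_password("wrong guess", stored_salt, stored_key))
```

Only the salt and the derived key are stored. The original password never needs to be recoverable for the login check to work.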