I am in the process of creating an app in which a customer can add email addresses to an event. Each email address added to the list is sent two URLs via email: one to accept and one to decline. Each URL is made up of a number of query parameters, IDs etc.
The issue I have is that I want to prevent the scenario in which someone could "guess" another person's URL, i.e. guess the combination of parameters etc. While this is very unlikely, I still want to prevent it.
I have seen several suggestions to help prevent this, e.g. add a hash value, encrypt the URL etc. However, I am looking for the most secure, best-practice approach and would like any possible feedback.
As an aside, I am coding in C#, but I don't believe the solution to this is language specific.
Thanks in advance.
I agree this is not language specific. I had a situation very similar to this within the last few years. It needed to be extremely secure due to children and parents receiving the communications. The fastest solution was something like the following:
First store the information that you would use in the URL as parameters somewhere in a database. This should be relatively quick and simple.
Create two GUIDs.
Associate the first GUID with the data in the database that you would have used for processing an "acceptance".
Associate the second GUID for a "decline" record in the database.
Create the two URLs with only the GUIDs as parameters.
If the Acceptance URL is clicked, use the database data associated with it to process the acceptance.
If the Decline is clicked, delete the data out of the database, or archive it, or whatever.
After a timeframe, if no URL is clicked, delete or archive the data associated with those GUIDs so that they can no longer be used.
GUIDs are extremely hard to guess, and the chance of guessing one that is actually usable is so small that it is nearly impossible.
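The steps above can be sketched roughly as follows. This is only a minimal illustration: the names (InviteLinkStore, InviteLink, InviteAction) are made up for the example, and the in-memory dictionary stands in for the database table. Bear in mind that GUIDs are designed for uniqueness rather than secrecy, so if you need a guarantee of unpredictability you may prefer to generate the token from a cryptographic random number generator instead of Guid.NewGuid().

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; the dictionary stands in for a database table.
public enum InviteAction { Accept = 0, Decline = 1 }

public sealed class InviteLink
{
    public int InvitationId;      // the data you would otherwise have put in the URL
    public InviteAction Action;
    public DateTime ExpiresUtc;
}

public sealed class InviteLinkStore
{
    private readonly Dictionary<Guid, InviteLink> _links = new Dictionary<Guid, InviteLink>();

    // Creates one GUID per action. Index 0 is the accept token, index 1 the decline token.
    public Guid[] CreateLinks(int invitationId, DateTime nowUtc, TimeSpan lifetime)
    {
        var tokens = new Guid[2];
        foreach (InviteAction action in new[] { InviteAction.Accept, InviteAction.Decline })
        {
            Guid token = Guid.NewGuid();
            _links[token] = new InviteLink
            {
                InvitationId = invitationId,
                Action = action,
                ExpiresUtc = nowUtc + lifetime
            };
            tokens[(int)action] = token;
        }
        return tokens;
    }

    // Returns the stored data for a token, or null if the token is unknown or expired.
    // The token is removed on first use, so each link works only once.
    public InviteLink Redeem(Guid token, DateTime nowUtc)
    {
        InviteLink link;
        if (!_links.TryGetValue(token, out link)) return null;
        _links.Remove(token);
        return link.ExpiresUtc > nowUtc ? link : null;
    }
}
```

The URL then carries nothing but the GUID, and the page behind it calls Redeem and acts on the Action it gets back.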
I'm guessing you are saving these email addresses somewhere. So it's quite easy to make a secure identifier for each entry you have. Whether that is a hash or some encryption technique, doesn't really matter. But I guess a hash is easier to implement and actually meant for this job.
So you hash, for example, the email address, the PK value of the record, the timestamp of when it was added, and some really impossible-to-guess salt. Just concatenate the various fields together and hash them.
In the end, you send nothing but the hashed key to the server. So when you send those two links, they could look as follows:
http://www.url.com/newsletter/acceptsubscription.aspx?id=x1r15ff2svosdf4r2s0f1
http://www.url.com/newsletter/cancelsubscription.aspx?id=x1r15ff2svosdf4r2s0f1
When the user clicks such a link, your server looks in the database for the record which contains the supplied key. Easy to implement, and really safe if done right. No way in hell someone can guess another person's key. Just bear in mind the standard things when doing something with hashing. Such as:
Do not forget to add salt.
Pick a really slow, and really secure, hashing algorithm.
Just make sure that no one can figure out their own hash, from information they can possess.
If you are really scared of people doing bad things, make sure to stop brute-forcing by adding throttle control to the website. Only allow X number of requests per minute, for example. Or some form of banning on an IP address.
I'm not an expert at these things, so there might be room for improvement. However I think this should point you in the right direction.
edit: I have to add; the solution provided by Tim C is also good. GUIDs are indeed very useful for situations like these, and work effectively the same as my hashed solution above.
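For what it's worth, here is a rough C# sketch of the hashed-key idea. It is an assumption-laden illustration rather than a finished implementation: SubscriptionKey and secretSalt are names invented for the example, and HMAC-SHA256 is swapped in for plain "concatenate and hash" because a keyed hash is the usual way to mix a secret into a hash. The secret must stay on the server.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class SubscriptionKey
{
    // secretSalt is a hypothetical server-side secret that is never sent to the client.
    public static string Compute(string email, long recordId, DateTime addedUtc, byte[] secretSalt)
    {
        // Join the fields with a separator so that ("ab", 1) and ("a", "b1") cannot collide.
        string material = string.Join("|",
            email.ToLowerInvariant(),
            recordId.ToString(CultureInfo.InvariantCulture),
            addedUtc.ToString("o", CultureInfo.InvariantCulture));

        using (var hmac = new HMACSHA256(secretSalt))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(material));
            // Hex-encode so the key is safe to put in a query string.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

You would compute the key once when the record is added, store it next to the record, and look the record up by that key when a link is clicked.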
The idea is I'll have a page that will accept a user's promotion code. When the user clicks "Submit", the code will make a call to the database to ensure that the promo code is indeed valid. I plan on having a "PromoCode" table in my database which contains a list of available promo codes and a bit variable called something like "HasBeenClaimed". I'm not all that familiar with encryption/etc. but I would imagine that I would want to NOT store the actual clear text promotion code in this table but rather something like an encrypted/hashed/etc. version of it. So, if someone maliciously gains access to the table's data, they couldn't do anything with this hashed version of the promo code.
Anyways, so functionally, the user submits their promo code and the code does something like takes its hashed value and compares it with what's in the database. If it matches a record in the database and "HasBeenClaimed" is false, they continue on with the promo.
I am speaking purely pseudocode, and my terminology might not be correct. But I think you get the basic idea of what I want.
My promotions are not of high value - they're "Get the first two months half off" (which equates to $25 off each month for two months). Just FYI, I created a PayPal button that reflects this promotion to be used on the web page that the code will direct to if the promotion code is indeed valid.
QUESTION I don't know exactly where to start with this, nor do I know common best practices when it comes to "Promo Codes". Please advise on common best practices regarding implementing promo code functionality in an existing ASP.NET website - any advice would be great.
The answer to this question depends a lot on what kind of promos you are going to offer.
If the promo is fairly low value, like Get 1 dollar discount on your next purchase of 5 dollars or more, then I don't see much point in protecting the promo code(s) in the database. In a scenario like that, losing the promo code(s) to a hacker is not going to be the worst disaster. Rather, the mere fact that the hacker gained access to the database will be much more worrying than a few stolen promo codes.
If, on the other hand, the promo is high value, like Be one of the three out of 2 million users that wins a new car then it would make much sense to protect the promo code. In such a scenario you must make sure that:
The promo code itself is sufficiently long and random (making it random enough can be quite tricky) so that it becomes practically impossible to guess it.
The promo code is stored in a fashion that protects it if someone gains access to its storage location. Storing it in some sort of hashed or encrypted form would likely be the best bet (but with encryption you have a new problem: keeping the encryption keys safe). You could even break it up somehow and store part of it in several different places.
Keep in mind that in this case, your coworkers (and you) are the prime hacker candidates. Even if they are not eligible to claim it, they could steal the code and give it to their second cousin on their mother's side (or similar).
Also, the admins at your site host need to be kept from figuring out what the codes are from their storage form.
Also also, make sure that the page where the user enters his promo code is using SSL to prevent someone from intercepting it in transfer.
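On the first point in the list above (a code that is long and random enough), here is a minimal sketch of one way to do it in C#. PromoCodeGenerator and the alphabet are assumptions made for the example. The idea is to draw the randomness from a cryptographic generator rather than System.Random, and to use a 32-symbol alphabet so that every byte maps evenly onto a symbol.

```csharp
using System;
using System.Security.Cryptography;

public static class PromoCodeGenerator
{
    // 32 symbols (no I, O, 0 or 1, which are easy to misread).
    // Because 256 % 32 == 0, taking a random byte modulo 32 introduces no bias.
    private const string Alphabet = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

    public static string Generate(int length)
    {
        var bytes = new byte[length];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }

        var chars = new char[length];
        for (int i = 0; i < length; i++)
        {
            chars[i] = Alphabet[bytes[i] % Alphabet.Length];
        }
        return new string(chars);
    }
}
```

Each character carries 5 bits, so a 16-character code has 80 bits of randomness. How long the code needs to be depends on the value of the promo and on how well you throttle guesses.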
More generally speaking, you need to decide if promo codes are going to be single use or if several people can use the same code.
It's not uncommon to have promos like Visit us on [popular social network] to get a free baseball cap with your next purchase. In this case it makes sense to allow each user to use the same promo code even if there is a risk that someone might get his/her hands on the code without actually visiting.
You could of course support both types (single/multiple use).
You also need to figure out how the promo codes are generated and distributed. Do you send them out in email campaigns? Do you print them in a local newspaper? Are you going to print paper coupons and hand them out or snail mail them to people? Must the user break 20 captchas to gain a code?
And you need to decide who is eligible to use a promo code. Must it be a registered user or can anyone use it? How does an unregistered user use it?
Technically the options are many. It depends on what kind of web application we are talking about. I would first try to figure out what kind of different promotions to support. Candidates:
Additional discount on purchase
Free additional promotion product
Free shipping on the next order
2 months access to otherwise inaccessible part of the site
(etc)
Then I would build the framework (database tables, business logic etc) around the types of promotions I want to support. Personally I would not make separate pages for each promotion. I would try to integrate the promo into the existing flow of the site as much as possible.
Here is a simple hashing method you run in your codebehind:
// Requires: using System; using System.Security.Cryptography; using System.Text;
string clearTextPromoCode = TextBox1.Text;
byte[] clearTextBytes = Encoding.UTF8.GetBytes(clearTextPromoCode);
byte[] hashBytes;
using (SHA1 sha1 = SHA1.Create())
{
    hashBytes = sha1.ComputeHash(clearTextBytes);
}
// A hash is arbitrary binary data, not UTF-8 text, so encode it as Base64 before storing it.
string hashedPromoCode = Convert.ToBase64String(hashBytes);
SHA1 is quick and one-way, which is enough for low-value codes like these. It is no longer considered collision-resistant, though, so prefer a SHA-2 function such as SHA256 in new code. Note that this is hashing, not encryption. You do not add plain-text codes to your database - instead, you run them through this hashing method, and then add the result to the database. When a user enters a promo code, you run their text through the same hashing method, and then perform the database query with the hashed string.
The purpose of this is that the hashed values in your database, if stolen, will do the thief no good - he cannot simply enter these strings on your website - they would be run through the hash again and not match anything in your database.
It seems like you want the promo-codes to be single-use only. Most checkout systems let you validate your promo-code before the final purchase. I would advise you allow users to make sure their promo code is valid, and only mark the HasBeenClaimed database column after the sale has gone through.
Warning: You should be aware that SHA1, while one-way, can be circumvented using "rainbow tables". A hacker creates a program which runs every word in the dictionary (and then some) through a SHA1 hasher, and then does a reverse-lookup on the hash he stole from you. The way to prevent this is using a "salt" - adding a public, random string to the beginning (or end) of each promo code before it is hashed, completely changing the end result. You store the salt in plain-text in the database. But do you really need to worry about someone stealing your 20% off coupons? ;)
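A salted version might look something like the sketch below. The assumptions: PromoCodeHasher is a made-up name, SHA256 is used in place of SHA1, and a single site-wide salt is used so that you can still find a code with a simple equality query on the hash column. With a different salt per code you would need some other way to locate the row before comparing.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PromoCodeHasher
{
    // Hashes salt + normalized code with SHA-256 and returns the result as Base64.
    public static string Hash(string promoCode, byte[] salt)
    {
        // Normalize so that " abc123 " and "ABC123" are treated as the same code.
        byte[] codeBytes = Encoding.UTF8.GetBytes(promoCode.Trim().ToUpperInvariant());

        byte[] input = new byte[salt.Length + codeBytes.Length];
        Buffer.BlockCopy(salt, 0, input, 0, salt.Length);
        Buffer.BlockCopy(codeBytes, 0, input, salt.Length, codeBytes.Length);

        using (SHA256 sha = SHA256.Create())
        {
            return Convert.ToBase64String(sha.ComputeHash(input));
        }
    }
}
```

You store the output of Hash when a code is created, and when a user submits a code you hash it the same way and query for a matching, unclaimed row.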
You can use simple encryption for this: when saving the promo code in the database, encrypt it.
Then when the user enters the promo code, encrypt it with the same key and compare it with the database. If it matches the value in the database, accept that promo code and mark the bit field as true, meaning it has been used.
Some simple C# encryption code:
Simple insecure two-way "obfuscation" for C#
Really simple encryption with C# and SymmetricAlgorithm
Obviously it depends on the type/context of data returned to a web front-end (in my case the setup is HTML/JavaScript, a .NET C# back-end and JSON as the data transport), but if I have to return an ID, say of a message, that is an auto-generated primary key (Int64), what is the best way to "hide" this real ID?
For most things of course, I can understand it doesn't make too much difference. However, in an application I am working on, if a user "guesses" an ID in the URL to pull back another record, it could prove to be a security issue.
There seems to be lots of ideas/commentary about methods, but nothing has quite clicked.
I was thinking of having an auto-generated primary INT, but also a secondary alternate GUID too. The GUID would be returned to any front-end process, and of course the auto-generated primary ID would still be used in the backend.
The thinking of course is that the GUID would make it far more difficult to guess or obtain another ID to access a record?
Any ideas or best practices people use?
Thanks in advance,
David.
Regarding security you have several aspects:
Session hijacking
Accessing/Modifying/Creating/Deleting records the user is not authorized to
Non-Authenticated access
Cross-Site* attacks
Man-in-the-middle attacks
etc.
The measures to deal with these depend on your architecture and security needs.
Since you don't say much about your architecture and security needs it is really hard to give any specific advice...
Some points regarding "ID shouldn't be guessable":
"Correct" solution
The problem goes away the moment you implement authentication + authorization properly, because properly implemented these two make sure that only authenticated users can access anything at all AND that every user can only access things he is allowed to. Even if an authenticated user knows the correct ID of something he is not allowed to access, this would be secure because he would be prevented from accessing it.
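As a rough illustration of the authorization half, the check boils down to comparing the record's owner with the authenticated user on every read. Message, MessageService and OwnerUserId are hypothetical names, and the dictionary stands in for your real data access code.

```csharp
using System.Collections.Generic;

public sealed class Message
{
    public long Id;
    public string OwnerUserId;
    public string Body;
}

public sealed class MessageService
{
    private readonly Dictionary<long, Message> _messages = new Dictionary<long, Message>();

    public void Add(Message message)
    {
        _messages[message.Id] = message;
    }

    // Returns the message only if it exists AND belongs to the authenticated user.
    // "Not found" and "not yours" both return null, so a guessed ID reveals nothing.
    public Message GetForUser(long id, string authenticatedUserId)
    {
        Message message;
        if (!_messages.TryGetValue(id, out message)) return null;
        return message.OwnerUserId == authenticatedUserId ? message : null;
    }
}
```

With a check like this in place, it no longer matters whether the ID in the URL is guessable.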
"weak solution"
create a ConcurrentDictionary as a thread-safe in-memory-cache and put the real IDs plus the "temporary IDs" (for example upon first record access freshly generated GUIDs) in there. You can combine that temporary ID with some salt and/or encryption and/or hash of some connection-specific aspects (like client IP, time etc.). Then on every access you check with the ConcurrentDictionary and act accordingly... one positive effect: after app restart (for example app pool recycling) the same record gets a different ID because this is only an in-memory-cache... though this is hardly usable in a web-farming scenario
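A bare-bones sketch of that in-memory mapping, without the extra salting or connection-specific parts, could look like this. TemporaryIdMap is a made-up name and this is only one possible shape for it.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class TemporaryIdMap
{
    private readonly ConcurrentDictionary<long, Guid> _toTemporary = new ConcurrentDictionary<long, Guid>();
    private readonly ConcurrentDictionary<Guid, long> _toReal = new ConcurrentDictionary<Guid, long>();

    // Returns the temporary ID for a record, creating one on first access.
    public Guid GetTemporaryId(long realId)
    {
        // GetOrAdd may run the factory more than once under contention,
        // but only one GUID is ever stored and returned for a given real ID.
        Guid temporary = _toTemporary.GetOrAdd(realId, _ => Guid.NewGuid());
        _toReal[temporary] = realId;
        return temporary;
    }

    // Maps a temporary ID received from the client back to the real ID.
    public bool TryGetRealId(Guid temporaryId, out long realId)
    {
        return _toReal.TryGetValue(temporaryId, out realId);
    }
}
```

Because nothing is persisted, a fresh TemporaryIdMap (for example after an app pool recycle) hands out different IDs for the same records.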
"...an application I am working on means if a user "guesses" an ID in the URL to pull back another record, it could prove to be a security issue.."
If this is the case then you really need to step back and review the approach to security. If a user can access records which they don't have authorisation to view you do not provide appropriate security of your Object References - https://www.owasp.org/index.php/Top_10_2010-A4-Insecure_Direct_Object_References
The GUID approach attempts to provide security by obscurity (see "Is using a GUID security though obscurity?"). Whether or not it does, you will have to make up your own mind based on your circumstances.
Of course, technically, pulling back another record by guessing another ID is only a bad thing when that other ID shouldn't be visible to the user who's pulling it back. But then you have a security problem anyway, and you should focus on that rather than find a way to obfuscate the ID.
Anyways, if you want to mess up the URL, I recommend looking into Rijndael (the algorithm standardized as AES). We use it a lot here to pass around tokens. Basically, this encryption technique allows you to both encrypt and decrypt. Therefore you can encrypt the ID, send it to the client, the client posts it back and you can simply decrypt again. No need for an extra database record. Even more secure is to encrypt/decrypt the record ID salted with something like an IP for the current client, therefore even URL fishing will be a reduced problem.
See: http://msdn.microsoft.com/en-us/library/system.security.cryptography.rijndael.aspx
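A minimal sketch of the encrypt/decrypt round trip is below. It uses the Aes class rather than the Rijndael class from the link (AES is Rijndael with a 128-bit block), and IdProtector and its method names are invented for the example. Note that plain CBC encryption does not detect tampering: a modified token will either throw a CryptographicException or decrypt to some other number, so this obscures the ID but does not replace an authorization check.

```csharp
using System;
using System.Security.Cryptography;

public static class IdProtector
{
    private const int IvLength = 16; // AES block size in bytes

    // Encrypts the ID under the given key (16, 24 or 32 bytes) with a fresh random IV.
    public static string Protect(long id, byte[] key)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV();
            byte[] iv = aes.IV;
            byte[] plain = BitConverter.GetBytes(id);
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
                // Token layout: IV followed by ciphertext, Base64 with URL-friendly symbols.
                byte[] token = new byte[iv.Length + cipher.Length];
                Buffer.BlockCopy(iv, 0, token, 0, iv.Length);
                Buffer.BlockCopy(cipher, 0, token, iv.Length, cipher.Length);
                return Convert.ToBase64String(token).Replace('+', '-').Replace('/', '_');
            }
        }
    }

    // Reverses Protect. Throws if the token is malformed or was encrypted with another key.
    public static long Unprotect(string token, byte[] key)
    {
        byte[] data = Convert.FromBase64String(token.Replace('-', '+').Replace('_', '/'));
        byte[] iv = new byte[IvLength];
        Buffer.BlockCopy(data, 0, iv, 0, IvLength);
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                byte[] plain = decryptor.TransformFinalBlock(data, IvLength, data.Length - IvLength);
                return BitConverter.ToInt64(plain, 0);
            }
        }
    }
}
```

Because the IV is random, the same ID produces a different token every time, which also stops users from comparing tokens to spot a pattern.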
I would like to say that URLs are meant to be public; they are not confidential data. There's no need to hide the URL from users. If a URL can be seen by one user and should not be accessible to another user, you should check the user's privileges on the server side instead of hiding that URL.
All of the other answers (3) failed to cover the possibility of this being a non-cookied, non-authenticated, non-sessioned, non-logged-in user.
For example, a confirmation page after an order, etc...
In that case, your authentication is based on a secret in the URL. You use a secret that for all practical purposes is unguessable, and very unique per record. Then you assume that if the user has that secret, then they have access to said record, etc...
The real challenge is to find a good way to make a secret UUID. Many developers will take the SHA1() of rand() + time() + uuid() + remote_ip() or something like that (which is typically sufficient), but I'm sure there is plenty of documentation out there on this.
Yes, in a situation where you have a non-authenticated user accessing a specific piece of data or performing an action (such as password reset), you need to have a second identifier (eg, varchar 40) on your records with a unique key (as you had outlined). Fill it with very random data, and if they have that secret, then let them in.
Take care.
I have an ASP.NET app that accepts user comments and stores them in a SQL database. I want to make sure that I weed out any "naughty" words so I can keep my app respectable. Problem is that I'm finding there are LOTS of these words. ;>
My question is, what's the most efficient way to do this processing? Should I have a table in SQL and write a stored proc that does the work? Should I do it with C# and Regex in memory on the web server? Are there other options? Has anyone else successfully done this kind of text scanning at scale? If yes, what worked?
It's a futile task. If people want to swear then they will start typing things like f uck and sh*t.
There's no substitute for effective moderation. Anything else is likely to leave you with clbuttic errors on your page.
I remember a quote from somewhere about technical solutions to social problems, but I can't source it right now
Scunthorpe Problem
One should be embar***ed to try to solve this in code.
There are some things to consider here:
Do you want to be able to add or remove words from that black list later? If so it might make sense to do this only before showing the message, but store the original message.
Do you want to have a copy of the message later on (e.g. for legal reasons or customer support)? Then it also makes sense to keep the message unchanged in the database.
So I would keep the message in the database and parse it only before rendering it. To me it looks like the most efficient way to do that would be either to:
Keep the blacklist in an indexed column (lowercase) in the database and return the comments through a stored procedure which filters it
Keep the blacklist lowercase in some data structure that allows for efficient access (e.g. Dictionary) in memory on the middle layer.
In both cases you would simply run through each comment and filter it. The latter method is easier to implement but means that you would have to keep the list in memory, which stops making sense when you have a very large blacklist.
(I actually see no point in using regex.)
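To give an idea of the in-memory variant, here is a small sketch using a case-insensitive HashSet instead of a Dictionary. WordFilter is a hypothetical name. It only masks whole words at render time and leaves the stored comment alone, which sidesteps the substring false positives mentioned in other answers but will not catch deliberate misspellings.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class WordFilter
{
    private readonly HashSet<string> _blacklist;

    public WordFilter(IEnumerable<string> words)
    {
        // Case-insensitive lookups, so the list does not need to be lower-cased first.
        _blacklist = new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
    }

    // Returns a copy of the comment with blacklisted whole words replaced by asterisks.
    public string Mask(string comment)
    {
        var result = new StringBuilder(comment.Length);
        int i = 0;
        while (i < comment.Length)
        {
            if (!char.IsLetter(comment[i]))
            {
                result.Append(comment[i]);
                i++;
                continue;
            }

            // Collect one run of letters and look it up as a whole word.
            int start = i;
            while (i < comment.Length && char.IsLetter(comment[i])) i++;
            string word = comment.Substring(start, i - start);
            result.Append(_blacklist.Contains(word) ? new string('*', word.Length) : word);
        }
        return result.ToString();
    }
}
```

For example, with "darn" on the list, Mask("Darn, that darned thing") gives "****, that darned thing": the exact word is masked while the longer word that merely contains it is left alone.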
There are already some Perl modules out there to do all of that for you.
https://metacpan.org/pod/Regexp::Common::profanity
https://metacpan.org/pod/Regexp::Profanity::US
https://metacpan.org/pod/Plagger::Plugin::Filter::Profanity
I have to come up with a membership solution for a very large website. The site will be built using ASP.NET MVC 2 and a MS SQL2008 database.
The current Membership provider seems like a BIG overkill, there's way too much functionality.
All I want to store is email/password and basic profile information such as First/LastName, Phone number. I will only ever need 2 roles, administrators & users.
What are your recommendations on this type of scenario, considering there might be millions of users registered? What does StackOverflow use?
I've used the existing Membership API a lot in the past and have extended it to store additional information etc. But there's tables such as
aspnet_Applications
aspnet_Paths
aspnet_SchemaVersions
aspnet_WebEvent_Events
aspnet_PersonalizationAllUsers
aspnet_PersonalizationPerUser
which are extremely redundant and I've never found use for.
Edit
Just to clarify a few other redundancies after @drachenstern's answer, there are also extra columns which I have no use for in the Membership/Users table, but which would add to the payload of each select/insert statements.
MobilePIN
PasswordQuestion/PasswordAnswer (I'll do email based password recovery)
IsApproved (user will always be approved)
Comment
MobileAlias
Username/LoweredUsername (or Email/LoweredEmail) [email IS the username so only need 1 of these]
Furthermore, I've heard that GUIDs aren't all that fast, and would prefer to have integers instead (like Facebook does) which would also be publicly exposed.
How do I go about creating my own Membership Provider, re-using some of the Membership APIs (validation, password encryption, login cookie, etc) but only with tables that meet my requirements?
Links to articles and existing implementations are most welcome, my Google searches have returned some very basic examples.
Thanks in advance
Marko
@Marko, I can certainly understand that the standard membership system may contain more functionality than you need, but the truth is that it really isn't going to matter. There are parts of the membership system that you aren't going to use just like there are parts of .Net that you aren't going to use. There are plenty of things that .Net can do that you are never, ever going to use, but you aren't going to go through .Net and strip out that functionality, are you? Of course not. You have to focus on the things that are important to what you are trying to accomplish and work from there. Don't get caught up in the paralysis of analysis. You will waste your time, spin your wheels and not end up with anything better than what has already been created for you. Now Microsoft does get it wrong sometimes, but they do get a lot of things right. You don't have to embrace everything they do to accomplish your goals - you just have to understand what is important for your needs.
As for the Guids and ints as primary keys, let me explain something. There is a crucial difference between a primary key and a clustered index. You can add a primary key AND a clustered index on columns that aren't a part of the primary key! That means that if it is more important to have your data arranged by a name (or whatever), you can customize your clustered index to reflect exactly what you need without it affecting your primary key. Let me say it another way - a primary key and a clustered index are NOT one and the same. I wrote a blog post about how to add a clustered index and then a primary key to your tables. The clustered index will physically order the table rows the way you need them to be and the primary key will enforce the integrity that you need. Have a look at my blog post to see exactly how you can do it.
Here is the link - http://iamdotnetcrazy.blogspot.com/2010/09/primary-keys-do-not-or-should-not-equal.html.
It is really simple: you just add the clustered index FIRST and then add the primary key SECOND. It must be done in that order or you won't be able to do it. This assumes, of course, that you are using SQL Server. Most people don't realize this because SQL Server will create a clustered index on your primary key by default, but all you have to do is add the clustered index first and then add the primary key and you will be good to go. Using ints as a primary key can become VERY problematic as your database and server system scales out. I would suggest using Guids and adding the clustered index to reflect the way you actually need your data stored.
Now, in summary, I just want to tell you to go create something great and don't get bogged down with superficial details that aren't going to give you enough of a performance gain to actually matter. Life is too short. Also, please remember that your system can only be as fast as its slowest piece of code. So make sure that you look at the things that ACTUALLY DO take up a lot of time and take care of those.
And one more additional thing. You can't take everything you see on the web at face value. Technology changes over time. Sometimes you may view an answer to a question that someone wrote a long time ago that is no longer relevant today. Also, people will answer questions and give you information without having actually tested what they are saying to see if it is true or not. The best thing you can do for your application is to stress test it really well. If you are using ASP.Net MVC you can do this in your tests. One thing you can do is to add a for loop that adds users to your app in your test and then test things out. That is one idea. There are other ways. You just have to give it a little effort to design your tests well or at least well enough for your purposes.
Good luck to you!
The current Membership provider seems like a BIG overkill, there's way too much functionality.
All I want to store is email/password and basic profile information such as First/LastName, Phone number. I will only ever need 2 roles, administrators & users.
Then just use that part. It's not going to use the parts that you don't use, and you may find that you have a need for those other parts down the road. The classes are already present in the .NET framework so you don't have to provide any licensing or anything.
The size of the database is quite small, and if you do like I do, and leave aspnetdb to itself, then you're not really taking anything from your other databases.
Do you have a compelling reason to use a third-party component OVER what's in the framework already?
EDIT:
there are also extra columns which I have no use for in the Membership/Users table, but which would add to the payload of each select/insert statements.
MobilePIN
PasswordQuestion/PasswordAnswer (I'll do email based password recovery)
IsApproved (user will always be approved)
Comment
MobileAlias
Username/LoweredUsername (or Email/LoweredEmail) [email IS the username so only need 1 of these]
This sounds like you're trying to microoptimize. Passing empty strings is virtually without cost (ok, it's there, but you have to profile to know just how much it's costing you. It won't be THAT much per user). We already routinely don't use all these fields in our apps either, but we use the membership system with no measurable detrimental impact.
Furthermore, I've heard that Guid's aren't all that fast, and would prefer to have integers instead (like Facebook does) which would also be publicly exposed.
I've heard that the cookiemonster likes cookies. Again, without profiling, you don't know if that's detrimental. Usually people use GUIDs because they want it to be absolutely (well to a degree of absoluteness) unique, no matter when it's created. The cost of generating it ONCE per user isn't all that heavy. Not when you're already creating them a new account.
Since you are absolutely set on creating a MembershipProvider from scratch, here are some references:
http://msdn.microsoft.com/en-us/library/system.web.security.membershipprovider.aspx
https://web.archive.org/web/20211020202857/http://www.4guysfromrolla.com/articles/120705-1.aspx
http://msdn.microsoft.com/en-us/library/f1kyba5e.aspx
http://www.amazon.com/ASP-NET-3-5-Unleashed-Stephen-Walther/dp/0672330113
Stephen Walther goes into detail on that in his book and it's a good reference for you to have as is.
My recommendation would be for you to benchmark it. Add as many records as you think you will have in production and submit a similar number of requests as you would get in production and see how it performs for your environment.
My guess is that it would be OK, the overhead that you are talking about would be insignificant.
The issue is there is a database with around 20k customer records and I want to make a best effort to avoid duplicate entries. The database is Microsoft SQL Server 2005, the application that maintains that database is Microsoft Dynamics/SL. I am creating an ASP.NET webservice that interacts with that database. My service can insert customer records into the database, read records from it, or modify those records. Either in my webservice, or through MS Dynamics, or in Sql Server, I would like to give a list of possible matches before a user confirms a new record add.
So the user would submit a record, if it seems to be unique, the record will save and return a new ID. If there are possible duplications, the user can then resubmit with a confirmation saying, "yes, I see the possible duplicates, this is a new record, and I want to submit it".
This is easy if it is just a punctuation or space thing (such as if you are entering "Company, Inc." and there is a "Company Inc" in the database). But what if there are slight changes such as "Company Corp." instead of "Company Inc", or a fat-fingered misspelling such as "Cmpany, Inc."? Is it even possible to return records like that in the list? If it's absolutely not possible, I'll deal with what I have. It just causes more work later on, if records need to be merged due to duplications.
The specifics of which algorithm will work best for you depends greatly on your domain, so I'd suggest experimenting with a few different ones - you may even need to combine a few to get optimal results. Abbreviations, especially domain specific ones, may need to be preprocessed or standardized as well.
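As an example of the kind of preprocessing meant here, the sketch below lower-cases a company name, strips punctuation and maps a few long suffixes to a standard short form before any comparison. CompanyNameNormalizer and the tiny abbreviation table are assumptions for illustration; a real list would come from your own data.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CompanyNameNormalizer
{
    // A deliberately tiny, hypothetical abbreviation table.
    private static readonly Dictionary<string, string> Abbreviations =
        new Dictionary<string, string>
        {
            { "incorporated", "inc" },
            { "corporation", "corp" },
            { "limited", "ltd" }
        };

    public static string Normalize(string name)
    {
        // Lower-case and turn every non-alphanumeric character into a space.
        var cleaned = new StringBuilder(name.Length);
        foreach (char c in name.ToLowerInvariant())
        {
            cleaned.Append(char.IsLetterOrDigit(c) ? c : ' ');
        }

        // Split on spaces (dropping empties) and standardize known suffixes.
        string[] words = cleaned.ToString().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < words.Length; i++)
        {
            string standard;
            if (Abbreviations.TryGetValue(words[i], out standard)) words[i] = standard;
        }
        return string.Join(" ", words);
    }
}
```

With this, "Company, Inc." and "Company Inc" both normalize to "company inc", so exact matching already catches the punctuation and spacing cases before any fuzzy algorithm is involved.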
For the names, you'd probably be best off with a phonetic algorithm - which takes into account pronunciation. These will score Smith and Schmidt close together, as they are easy to confuse when saying the words. Double Metaphone is a good first choice.
For fat fingering, you'd probably be better off with an edit distance algorithm - which gives a "difference" between 2 words. These would score Smith and Smoth close together - even though the 2 may slip through the phonetic search.
T-SQL has SOUNDEX and DIFFERENCE - but they are pretty poor. A Levenshtein variant is the canonical choice, but there's other good choices - most of which are fairly easy to implement in C#, if you can't find a suitably licensed implementation.
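For reference, a plain Levenshtein distance is short enough to write yourself. The sketch below is the standard two-row dynamic programming version (EditDistance is just a name picked for the example). It counts single-character insertions, deletions and substitutions and is case-sensitive, so you would normally normalize the strings first.

```csharp
using System;

public static class EditDistance
{
    // Number of single-character edits needed to turn a into b.
    public static int Levenshtein(string a, string b)
    {
        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                current[j] = Math.Min(substitution, Math.Min(insertion, deletion));
            }

            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.Length];
    }
}
```

Smith and Smoth are 1 edit apart, as are Cmpany and Company. A common approach is to flag a candidate as a possible duplicate when the distance is small relative to the length of the name, with the threshold tuned on your own data.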
All of these are going to be much easier to code/use from C# than T-SQL (though I did find double metaphone in a horrendous abuse of T-SQL that may work in SQL).
Though this example is in Access (and I've never actually looked at the code, or used the implementation) the included presentation gives a fairly good idea of what you'll probably end up needing to do. The code is probably worth a look, and perhaps a port from VBA.
Look into SOUNDEXing within SQL Server. I believe it will give you the fuzziness of probable matches that you're looking for.
SOUNDEX @ MSDN
SOUNDEX @ Wikipedia
If it's possible to integrate Lucene.NET into your solution, you should definitely try it out.
You could try using Full Text Search with FreeText (or FreeTextTable) functions to try to find possible matches.