Routes with 16-bit Guids seem Crazy?

I just looked at the database schema from my DBA, and it's using 16-byte unique identifiers (GUIDs) as the primary key. My question is: how do I use these in MVC routing?
Something like http://www.app.com/project/21212/product/212121
This is a midsize enterprise application, why would you need a GUID for our tables anyway?
I know we can create a friendly ID field, and I know MVC routing guidance recommends against using database IDs in routes.
So I guess my questions are:
Why would we need 16-byte GUIDs for our primary key?
How could I use that in the route? The route isn't supposed to contain any database IDs.

On the DB part of the question
The decision about what kind of DB keys you are going to use should be completely independent of your MVC routes. DBAs might choose whatever value they think is appropriate for your application without having to worry about how you are going to craft your routes. I couldn't tell whether GUIDs make sense for your domain or not.
On the route/URL part of the question
Depending on what you are trying to do, adding a GUID to a route/URL might not be the best idea. The authors of "ASP.NET MVC in Action" (page 95) give some good guidelines on how URLs should be:
Simple and clean
Hackable
Allow URL parameters to clash
Short
Avoid exposing database IDs where possible
Consider adding unnecessary information
If you have GUIDs as database IDs see if you can use another value to craft the route to each resource/record. For example the name of the product plus the last 4 digits of the db ID, or another unique and user friendly (see guidelines) value that you can come up with based on the information that you are trying to access.
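A minimal sketch of the "name plus the last 4 digits of the db ID" idea. The class and method names are hypothetical, and it assumes the combination of name and suffix is kept unique elsewhere (for example by a unique index), since four hex digits alone are not unique.
```csharp
using System;
using System.Text;

public static class RouteSlug
{
    // Builds a URL segment such as "blue-widget-90ab" from a display name
    // and the last four hex digits of the record's GUID.
    public static string Build(string name, Guid id)
    {
        var sb = new StringBuilder();
        foreach (char c in name.ToLowerInvariant())
        {
            if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
            {
                sb.Append(c);
            }
            else if (sb.Length > 0 && sb[sb.Length - 1] != '-')
            {
                sb.Append('-'); // collapse any run of other characters into one dash
            }
        }

        string slug = sb.ToString().TrimEnd('-');
        string hex = id.ToString("N"); // 32 hex digits, no dashes
        return slug + "-" + hex.Substring(hex.Length - 4);
    }
}
```
The route would then look something like /product/blue-widget-90ab, and the action would look the record up by that stored slug rather than by the full GUID.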

Let's look at this very page as an example. I think we can all agree that StackOverflow is a successful MVC application...
https://stackoverflow.com/questions/4079861/routes-with-16-bit-guids-seem-crazy
What is that "4079861" in there? A database ID?
Note that the database ID is really the only important part as these links also arrive at the same location:
https://stackoverflow.com/questions/4079861/
https://stackoverflow.com/questions/4079861/Foo
So, the short answer is: yes, your routes will probably have a big ugly Guid in them. Go talk to your DBA if you have a problem with that.
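If the GUID does end up in the route, it can at least be made shorter. The sketch below is not something this answer proposes; it is one common approach, assumed here for illustration: URL-safe Base64 over the GUID's 16 bytes, which gives 22 characters instead of 36. GuidUrl is a hypothetical helper name.
```csharp
using System;

public static class GuidUrl
{
    // Encodes the 16 bytes of a GUID as 22 URL-safe Base64 characters.
    public static string Encode(Guid id)
    {
        return Convert.ToBase64String(id.ToByteArray())
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverses Encode; returns false for anything that is not a valid token.
    public static bool TryDecode(string text, out Guid id)
    {
        id = Guid.Empty;
        if (text == null || text.Length != 22)
        {
            return false;
        }

        string base64 = text.Replace('-', '+').Replace('_', '/') + "==";
        try
        {
            byte[] bytes = Convert.FromBase64String(base64);
            if (bytes.Length != 16)
            {
                return false;
            }
            id = new Guid(bytes);
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```
A controller action would take the 22-character segment as a string, call TryDecode, and return a 404 when it fails.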

You should ask yourself: Is there another way to uniquely identify my [insert name] ?
@Mark gives the StackOverflow example. The nice part is that it's a number, even if it's a long one. Numbers are nicer than GUIDs.
Your options are probably:
create a simple number to GUID mapping in the database, creating a redundant unique identifier, for routing purposes
actually display the GUID as part of your routing
find some other way to uniquely identify your record (e.g. a date + name combination, which blogs use), although here you have to make sure you don't allow duplicate entries of your routing identifier.
What you end up using will depend entirely on your situation and requirements.

Related

Should I expose primary keys in ASP.NET MVC views

In my web application I use primary keys to generate hyperlinks in order to navigate to different pages:
<td>
@Html.ActionLink("Edit", "Edit", new { id = item.Id }) |
@Html.ActionLink("Details", "Details", new { id = item.Id }) |
@Html.ActionLink("Delete", "Delete", new { id = item.Id })
</td>
I was wondering if this code is a security concern. Is it advisable to expose primary keys in ASP.NET MVC views? If this is the case what are the alternatives? Should I encrypt the IDs in my viewmodels or should I create a mapping table between public and private keys?
I appreciate your advice

Gone are the days when people who saw your primary or surrogate keys could use them to hack the database. SQL injection and backdoor attacks have subsided.
I disagree with the stance that exposing primary keys is a problem. It can be a problem if you make them visible to users because they are given meaning outside the system, which is usually what you're trying to avoid.
However to use IDs as the value for combo box list items? Go for it I say. What's the point in doing a translation to and from some intermediate value? You may not have a unique key to use. Such a translation introduces more potential for bugs.
Just don't neglect security.
If say you present the user with 6 items (ID 1 to 6), never assume you'll only get those values back from the user. Someone could try and breach security by sending back ID 7 so you still have to verify that what you get back is allowed.
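As a rough illustration of that check (names are hypothetical, and how the list of offered IDs is obtained is left to the application), the server can re-validate whatever comes back:
```csharp
using System.Collections.Generic;

public static class SelectionGuard
{
    // Returns true only if the ID posted back by the browser is one of the
    // IDs the server actually offered to this user.
    public static bool IsAllowed(int postedId, ICollection<int> offeredIds)
    {
        return offeredIds != null && offeredIds.Contains(postedId);
    }
}
```
With six items offered (IDs 1 to 6), a posted value of 7 is rejected before any data access happens.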
But avoiding that entirely? No way. No need.
As a comment on another answer says, look at the URL here. That includes what no doubt is the primary key for the question in the SO database. It's entirely fine to expose keys for technical uses.
Also, if you do use some surrogate value instead, that's not necessarily more secure.

Generally, there is no point in encrypting an item ID, because in most business domains it is not considered confidential information. Unless your domain specifically requires keeping IDs private, don't do this. Keep it simple, stupid.
There is no security concern associated with this.

What exactly do you mean by "primary key" in this context? That term is a database term. In your browser, it's just an identifier. What difference does it make if that identifier is stored in a column with a primary key constraint on it, or a column with a unique index on it, with or without some reversible transformation on its value before storing?
There is obviously no direct risk in exposing an identifier.
But there are risks associated with identifiers, that may have to be mitigated.
For example, you must ensure that knowledge of the identifier does not imply full access to the identified resource. You do that by properly authenticating and authorizing all resource access. (Update: some other answers have suggested that you may do that by making identifiers hard to guess, e.g. through encryption or signing. That is nonsense of course. You protect a resource by protecting it, not by trying to hide it.)
In some cases, the value of an identifier may carry information that you do not want to expose. For example, if you number your "orders" sequentially, and a user sees they have order number 17, they know how many orders you have received in the past. That may be competitive information that you do not want to expose. Also, if identifiers are sequential, they contain information about when the identifier was created, relative to other identifiers. That may be confidential as well.
So the question is not really "can I expose identifiers", but rather "how should I generate identifiers in such a way that no confidential information is exposed through them".
Well, if the number of identified resources is not confidential, just use a sequence (e.g. as generated by an identity column). If you want the identifier to be meaningless, use a cryptographic random number generator to generate them.
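A small sketch of the second option, assuming a 64-bit numeric identifier is wanted. RandomId is a hypothetical name; the caller would still need a unique constraint and a retry on the rare collision.
```csharp
using System;
using System.Security.Cryptography;

public static class RandomId
{
    // Returns a non-negative 63-bit identifier drawn from a cryptographic RNG,
    // so the value says nothing about creation order or record count.
    public static long Next()
    {
        byte[] buffer = new byte[8];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(buffer);
        }
        return BitConverter.ToInt64(buffer, 0) & long.MaxValue; // clear the sign bit
    }
}
```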

There is no issue with making an item's ID public in web applications.
If your AJAX CRUD requests take an ID parameter and act on it, a user can easily issue many such requests from Firebug. If you don't grant a guest user too many permissions, that will not be a big problem.
Security isn't really affected in this context. Just make sure all your code is safe from XSS.
Exposing the primary ID makes it easier for people to remember or hack the URL and go to the next one (item or page). The only thing you need to take care of is to always check security (XSS for this question).

I believe there is no risk in exposing primary keys to the public; I think you should pay attention to where vulnerabilities start. As long as your generated URLs are tamper-proof and you can be certain that a given URL was generated by your application and not altered by a man in the middle, all goes smoothly. To do that, I always use a hash-style mechanism and add an extra parameter to my URLs, made up from the primary key and something else, to check for tampering.
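The answer does not say which mechanism it uses. One way to sketch such a tamper check, assuming a server-side secret key and HMAC-SHA256 (both assumptions, as is the SignedId name), is:
```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class SignedId
{
    // Computes a hex HMAC over the ID; this is the extra URL parameter.
    public static string Sign(int id, byte[] key)
    {
        using (var hmac = new HMACSHA256(key))
        {
            byte[] data = Encoding.UTF8.GetBytes(id.ToString(CultureInfo.InvariantCulture));
            byte[] mac = hmac.ComputeHash(data);
            return BitConverter.ToString(mac).Replace("-", "").ToLowerInvariant();
        }
    }

    // Recomputes the HMAC and compares without an early exit on mismatch.
    public static bool Verify(int id, string signature, byte[] key)
    {
        if (signature == null)
        {
            return false;
        }

        string expected = Sign(id, key);
        if (expected.Length != signature.Length)
        {
            return false;
        }

        int diff = 0;
        for (int i = 0; i < expected.Length; i++)
        {
            diff |= expected[i] ^ signature[i];
        }
        return diff == 0;
    }
}
```
A link would then carry both values, for example ?id=42&sig=..., and the action would call Verify before doing anything else. This only detects edited URLs; deciding who may see the record is a separate check.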

Exposing Database IDs to the UI

This is a beginner pattern question for a web forms-over-data sort of thing. I read Exposing database IDs - security risk? and the accepted answer has me thinking that this is a waste of time, but wait...
I have an MVC project referencing a business logic library, and an assembly of NHibernate SQL repositories referencing the same. If something forced my hand to go and reference those repositories directly from my controller codebase, I'd know what went wrong. But when those controllers talk in URL parameters with the database record IDs, does it only seem wrong?
I can't conceive of those IDs ever turning un-consumable (by MVC actions). I don't think I'd ever need two UI entities corresponding to the same row in the database. I don't intend for the controller to interpret the ID in any way. Surrogate keys would make zero difference. Still, I want to have the problem because assumptions about the relational design aren't any better than layer-skipping dependencies.
How would you make a web application that only references the business logic assembly and talks in BL objects and GUIDs that only have meaning for that session, while the assembly persists transactions using database IDs?

You can encrypt or hash your IDs if you want, using the session ID as a salt. It depends on the context. On a public shopping site you want the catalog pages to be clear and easily copyable. For user account admin it's fine to encrypt the IDs, so users can't URL-hack into someone else's account.
I would not consider this to be security by obscurity. If a malicious user has one compromised account they can look at all the form fields, url ids, and cookie values set while logged in as that user. They can then try using those when logged in as a different user to escalate permissions. But by protecting them using session id as a salt, you have locked that data down so it's only useful in one session. The pages can't even be bookmarked. Could they figure out your protection? Possibly. But likely they'd just move on to another site. Locking your car door doesn't actually keep anyone out of your car if they want to get in, but it makes it harder, so everyone does it.

I'm no security expert, but I have no problem exposing certain IDs to the user, those such as Product IDs, User IDs, and anything that the user could normally read, meaning if I display a product to the user, displaying its Product ID is not a problem.
Things that are internal to the system that the users do not directly interact with, like Transaction IDs, I do not display to the user, not in fear of them editing it somehow, but just because that is not information that is useful to them.
Quite often in forms, I would have the action point to "mysite.com/messages/view/5", where 5 is the message they want to view. In all of these actions, I always ensure that the user has access to view it (or modify or delete it, whichever functionality is required) by doing a simple database check to ensure the logged-in user is the message's owner.
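A minimal sketch of that ownership check. The Message type and the in-memory collection are stand-ins for whatever the data layer really provides; with an ORM the same predicate would go into the query.
```csharp
using System.Collections.Generic;
using System.Linq;

public class Message
{
    public int Id { get; set; }
    public int OwnerId { get; set; }
    public string Body { get; set; }
}

public static class MessageAccess
{
    // Returns the message only when it exists and belongs to the current user;
    // otherwise returns null so the caller can respond with 404 or 403.
    public static Message FindForUser(IEnumerable<Message> messages, int messageId, int currentUserId)
    {
        return messages.FirstOrDefault(m => m.Id == messageId && m.OwnerId == currentUserId);
    }
}
```
The important part is that the owner condition is in the same lookup as the ID, so a guessed ID for someone else's message simply finds nothing.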

Be very very very careful as parameter tampering can lead to data modification. Rules on 'who can access what ids' must be very very carefully built into your application when exposing these ids.
For instance, if you are updating an Order based on OrderId, include in your where clause for load and updates that :
where order.orderid=passedInOrderId and Order.CustomerId=
I developed an extension to help with stored ids in MVC available here:
http://mvcsecurity.codeplex.com/
Also I talk about this a bit in my security course at: Hack Proofing your ASP.NET MVC and Web Forms Applications

Other than those responses, sometimes it's good to use obvious IDs so people can hack the URL for the information they want. For example, www.music.com/artist/acdc or www.music.com/artist/smashing-pumpkins. If it's meaningful to your users and you can increase the information the user understands from the page through the URL, then all the better; especially if your market segment is young or tech savvy, use the ID to your advantage. This will also boost your SEO.
I would say when it's not of use, then encode it. It only takes one developer one mistake to not check a customer id against a session and you expose your entire customer base.
But of course, your unit tests should catch that!

While you will find some people who say that IDs are just an implementation detail, in most systems you need a way of uniquely identifying a domain entity, and most likely you will generate an ID for that identifier. The fact that the ID is generated by the database is an implementation detail; but once it has been generated it becomes an attribute of the domain entity, and it is therefore perfectly reasonable to use it wherever you need to reference the entity.

How to hide a database ID from HTML/Javascript

Obviously it depends on the type/context of data returned to a web front-end (in my case the setup is HTML/JavaScript, a .NET C# back-end and JSON as the data transport), but if I have to return an ID, say of a message, that is an auto-generated primary key (Int64), what is the best way to "hide" this real ID?
For most things, of course, I understand it doesn't make too much difference. However, in an application I am working on, if a user "guesses" an ID in the URL to pull back another record, it could prove to be a security issue.
There seems to be lots of ideas/commentary about methods, but nothing has quite clicked.
I was thinking of having an auto-generated primary INT, but also a secondary alternate GUID. The GUID would be returned to any front-end process, and the auto-generated primary ID would still be used in the back-end.
The thinking of course is the GUID would be far more difficult to guess/obtain another one to access a record?
Any ideas or best practices people use?
Thanks in advance,
David.

Regarding security you have several aspects:
Session hijacking
Accessing/Modifying/Creating/Deleting records the user is not authorized to
Non-Authenticated access
Cross-Site* attacks
Man-in-the-middle attacks
etc.
The measures to deal with these depend on your architecture and security needs.
Since you don't say much about your architecture and security needs, it is really hard to give any specific advice...
Some points regarding "ID shouldn't be guessable":
"Correct" solution
The problem goes away the moment you implement authentication + authorization properly, because properly implemented these two make sure that only authenticated users can access anything at all AND that every user can only access things he is allowed to. Even if an authenticated user knows the correct ID of something he is not allowed to access, this would be secure because he would be prevented from accessing it.
"weak solution"
create a ConcurrentDictionary as a thread-safe in-memory-cache and put the real IDs plus the "temporary IDs" (for example upon first record access freshly generated GUIDs) in there. You can combine that temporary ID with some salt and/or encryption and/or hash of some connection-specific aspects (like client IP, time etc.). Then on every access you check with the ConcurrentDictionary and act accordingly... one positive effect: after app restart (for example app pool recycling) the same record gets a different ID because this is only an in-memory-cache... though this is hardly usable in a web-farming scenario
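A bare-bones sketch of that in-memory mapping, leaving out the salt, hashing, and connection-specific parts the answer mentions. TemporaryIdMap is a hypothetical name, and as noted above the mapping is lost on app restart and is not shared across servers.
```csharp
using System;
using System.Collections.Concurrent;

public class TemporaryIdMap
{
    private readonly ConcurrentDictionary<long, Guid> _realToTemp =
        new ConcurrentDictionary<long, Guid>();
    private readonly ConcurrentDictionary<Guid, long> _tempToReal =
        new ConcurrentDictionary<Guid, long>();

    // Returns the temporary GUID for a real ID, creating one on first access.
    public Guid GetOrCreate(long realId)
    {
        Guid temp = _realToTemp.GetOrAdd(realId, _ => Guid.NewGuid());
        _tempToReal[temp] = realId; // idempotent: same pair on every call
        return temp;
    }

    // Maps a temporary GUID received from the client back to the real ID.
    public bool TryResolve(Guid tempId, out long realId)
    {
        return _tempToReal.TryGetValue(tempId, out realId);
    }
}
```
The application would hold one instance for its lifetime, send GetOrCreate(realId) to the client, and call TryResolve on every incoming request.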

"...an application I am working on means if a user "guesses" an ID in the URL to pull back another record, it could prove to be a security issue."
If this is the case then you really need to step back and review the approach to security. If a user can access records which they don't have authorisation to view you do not provide appropriate security of your Object References - https://www.owasp.org/index.php/Top_10_2010-A4-Insecure_Direct_Object_References
The GUID approach attempts to provide security by obscurity (see "Is using a GUID security though obscurity?"). Whether or not it does, you will have to make up your own mind based on your circumstances.

Of course, technically, pulling back another record by guessing another ID is a bad thing only when that other ID shouldn't be visible to the user who's pulling it back. But then you have a security problem anyway, and you should focus on that rather than find a way to obfuscate the ID.
Anyway, if you want to mess up the URL, I recommend looking into Rijndael. We use it a lot here to pass around tokens. Basically, this encryption technique allows you to both encrypt and decrypt. Therefore you can encrypt the ID, send it to the client, the client posts it back, and you can simply decrypt it again. No need for an extra database record. Even more secure is to encrypt/decrypt the record ID salted with something like the IP of the current client, so that even URL fishing becomes less of a problem.
See: http://msdn.microsoft.com/en-us/library/system.security.cryptography.rijndael.aspx
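A rough sketch of the encrypt/decrypt round trip, using the Aes class (the standardized form of Rijndael in .NET) rather than the Rijndael class the answer links to. The IdCipher name, the token layout (IV followed by ciphertext, URL-safe Base64), and the key handling are all assumptions. The token is not authenticated, so a modified token may decrypt to a different number; the access check on the resulting ID is still needed.
```csharp
using System;
using System.Security.Cryptography;

public static class IdCipher
{
    // Encrypts a numeric ID into a URL-safe token: Base64Url(IV || ciphertext).
    public static string Encrypt(long id, byte[] key)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;      // 16, 24 or 32 bytes
            aes.GenerateIV();   // fresh IV per token
            byte[] iv = aes.IV;
            byte[] plain = BitConverter.GetBytes(id);
            byte[] cipher;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                cipher = encryptor.TransformFinalBlock(plain, 0, plain.Length);
            }

            byte[] token = new byte[iv.Length + cipher.Length];
            Buffer.BlockCopy(iv, 0, token, 0, iv.Length);
            Buffer.BlockCopy(cipher, 0, token, iv.Length, cipher.Length);
            return Convert.ToBase64String(token)
                .Replace('+', '-').Replace('/', '_').TrimEnd('=');
        }
    }

    // Reverses Encrypt; returns false for malformed or undecryptable tokens.
    public static bool TryDecrypt(string token, byte[] key, out long id)
    {
        id = 0;
        if (string.IsNullOrEmpty(token))
        {
            return false;
        }

        try
        {
            string base64 = token.Replace('-', '+').Replace('_', '/');
            base64 = base64.PadRight(base64.Length + (4 - base64.Length % 4) % 4, '=');
            byte[] data = Convert.FromBase64String(base64);
            if (data.Length != 32)
            {
                return false; // expect a 16-byte IV plus one 16-byte block
            }

            using (Aes aes = Aes.Create())
            {
                aes.Key = key;
                byte[] iv = new byte[16];
                Buffer.BlockCopy(data, 0, iv, 0, 16);
                aes.IV = iv;
                using (ICryptoTransform decryptor = aes.CreateDecryptor())
                {
                    byte[] plain = decryptor.TransformFinalBlock(data, 16, 16);
                    if (plain.Length != 8)
                    {
                        return false;
                    }
                    id = BitConverter.ToInt64(plain, 0);
                    return true;
                }
            }
        }
        catch (FormatException)
        {
            return false;
        }
        catch (CryptographicException)
        {
            return false;
        }
    }
}
```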

I would like to say that, the URL are meant to be public, it is not kind of confidential data. There's no need to hide the url from users. If a url can be seen by one user and should not be accessable to another user, you should check the privilege of the user from the server side instead of hiding that url.

All of the other answers (3) failed to cover the possibility of this being a non-cookied, non-authenticated, non-sessioned, non-logged-in user.
For example, a confirmation page after a order, etc...
In that case, your authentication is based on a secret in the URL. You use a secret that for all practical purposes is unguessable, and very unique per record. Then you assume that if the user has that secret, then they have access to said record, etc...
The real challenge is to find a good way to make a secret UUID. Many developers will take the SHA1() of rand() + time() + uuid() + remote_ip() or something like that (which is typically sufficient), but I'm sure there is plenty of documentation out there on this.
Yes, in a situation where you have a non-authenticated user accessing a specific piece of data or performing an action (such as password reset), you need to have a second identifier (eg, varchar 40) on your records with a unique key (as you had outlined). Fill it with very random data, and if they have that secret, then let them in.
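As an illustration of "fill it with very random data", the sketch below skips the SHA1-of-several-values recipe and takes 20 bytes straight from a cryptographic random number generator, which happens to fit a varchar(40) column as hex. The AccessToken name is hypothetical.
```csharp
using System;
using System.Security.Cryptography;

public static class AccessToken
{
    // Creates a 40-character lowercase hex token from 20 random bytes (160 bits).
    public static string Create()
    {
        byte[] bytes = new byte[20];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }
}
```
The token is stored on the record under a unique key when the order or reset request is created, placed in the emailed URL, and looked up when the link is followed.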
Take care.

ASP.NET Custom Membership Provider for very large application

I have to come up with a membership solution for a very large website. The site will be built using ASP.NET MVC 2 and a MS SQL2008 database.
The current Membership provider seems like a BIG overkill, there's way too much functionality.
All I want to store is email/password and basic profile information such as First/LastName, Phone number. I will only ever need 2 roles, administrators & users.
What are your recommendations on this type of scenario, considering there might be millions of users registered? What does StackOverflow use?
I've used the existing Membership API a lot in the past and have extended it to store additional information etc. But there's tables such as
aspnet_Applications
aspnet_Paths
aspnet_SchemaVersions
aspnet_WebEvent_Events
aspnet_PersonalizationAllUsers
aspnet_PersonalizationPerUser
which are extremely redundant and I've never found use for.
Edit
Just to clarify a few other redundancies after @drachenstern's answer, there are also extra columns which I have no use for in the Membership/Users table, but which would add to the payload of each select/insert statement.
MobilePIN
PasswordQuestion/PasswordAnswer (I'll do email based password recovery)
IsApproved (user will always be approved)
Comment
MobileAlias
Username/LoweredUsername (or Email/LoweredEmail) [email IS the username so only need 1 of these]
Furthermore, I've heard that GUID's aren't all that fast, and would prefer to have integers instead (like Facebook does) which would also be publicly exposed.
How do I go about creating my own Membership Provider, re-using some of the Membership APIs (validation, password encryption, login cookie, etc) but only with tables that meet my requirements?
Links to articles and existing implementations are most welcome, my Google searches have returned some very basic examples.
Thanks in advance
Marko

@Marko I can certainly understand that the standard membership system may contain more functionality than you need, but the truth is that it really isn't going to matter. There are parts of the membership system that you aren't going to use just like there are parts of .Net that you aren't going to use. There are plenty of things that .Net can do that you are never, ever going to use, but you aren't going to go through .Net and strip out that functionality are you? Of course not. You have to focus on the things that are important to what you are trying to accomplish and work from there. Don't get caught up in the paralysis of analysis. You will waste your time, spin your wheels and not end up with anything better than what has already been created for you. Now Microsoft does get it wrong sometimes, but they do get a lot of things right. You don't have to embrace everything they do to accomplish your goals - you just have to understand what is important for your needs.
As for the Guids and ints as primary keys, let me explain something. There is a crucial difference between a primary key and a clustered index. You can add a primary key AND a clustered index on columns that aren't a part of the primary key! That means that if it is more important to have your data arranged by a name (or whatever), you can customize your clustered index to reflect exactly what you need without it affecting your primary key. Let me say it another way - a primary key and a clustered index are NOT one and the same. I wrote a blog post about how to add a clustered index and then a primary key to your tables. The clustered index will physically order the table rows the way you need them to be and the primary key will enforce the integrity that you need. Have a look at my blog post to see exactly how you can do it.
Here is the link - http://iamdotnetcrazy.blogspot.com/2010/09/primary-keys-do-not-or-should-not-equal.html.
It is really simple, you just add the clustered index FIRST and then add the primary key SECOND. It must be done in that order or you won't be able to do it. This assumes, of course, that you are using Sql Server. Most people don't realize this because SQL Server will create a clustered index on your primary key by default, but all you have to do is add the clustered index first and then add the primary key and you will be good to go. Using ints as a primary key can become VERY problematic as your database and server system scales out. I would suggest using Guids and adding the clustered index to reflect the way you actually need your data stored.
Now, in summary, I just want to tell you to go create something great and don't get bogged down with superficial details that aren't going to give you enough of a performance gain to actually matter. Life is too short. Also, please remember that your system can only be as fast as its slowest piece of code. So make sure that you look at the things that ACTUALLY DO take up a lot of time and take care of those.
And one more additional thing. You can't take everything you see on the web at face value. Technology changes over time. Sometimes you may view an answer to a question that someone wrote a long time ago that is no longer relevant today. Also, people will answer questions and give you information without having actually tested what they are saying to see if it is true or not. The best thing you can do for your application is to stress test it really well. If you are using ASP.Net MVC you can do this in your tests. One thing you can do is to add a for loop that adds users to your app in your test and then test things out. That is one idea. There are other ways. You just have to give it a little effort to design your tests well or at least well enough for your purposes.
Good luck to you!

The current Membership provider seems like a BIG overkill, there's way too much functionality.
All I want to store is email/password and basic profile information such as First/LastName, Phone number. I will only ever need 2 roles, administrators & users.
Then just use that part. It's not going to use the parts that you don't use, and you may find that you have a need for those other parts down the road. The classes are already present in the .NET framework so you don't have to provide any licensing or anything.
The size of the database is quite small, and if you do like I do, and leave aspnetdb to itself, then you're not really taking anything from your other databases.
Do you have a compelling reason to use a third-party component OVER what's in the framework already?
EDIT:
there are also extra columns which I have no use for in the Membership/Users table, but which would add to the payload of each select/insert statements.
MobilePIN
PasswordQuestion/PasswordAnswer (I'll do email based password recovery)
IsApproved (user will always be approved)
Comment
MobileAlias
Username/LoweredUsername (or Email/LoweredEmail) [email IS the username so only need 1 of these]
This sounds like you're trying to microoptimize. Passing empty strings is virtually without cost (ok, it's there, but you have to profile to know just how much it's costing you. It won't be THAT much per user). We already routinely don't use all these fields in our apps either, but we use the membership system with no measurable detrimental impact.
Furthermore, I've heard that Guid's aren't all that fast, and would prefer to have integers instead (like Facebook does) which would also be publicly exposed.
I've heard that the cookiemonster likes cookies. Again, without profiling, you don't know if that's detrimental. Usually people use GUIDs because they want it to be absolutely (well to a degree of absoluteness) unique, no matter when it's created. The cost of generating it ONCE per user isn't all that heavy. Not when you're already creating them a new account.
Since you are absolutely set on creating a MembershipProvider from scratch, here are some references:
http://msdn.microsoft.com/en-us/library/system.web.security.membershipprovider.aspx
https://web.archive.org/web/20211020202857/http://www.4guysfromrolla.com/articles/120705-1.aspx
http://msdn.microsoft.com/en-us/library/f1kyba5e.aspx
http://www.amazon.com/ASP-NET-3-5-Unleashed-Stephen-Walther/dp/0672330113
Stephen Walther goes into detail on that in his book and it's a good reference for you to have as is.

My recommendation would be for you to benchmark it. Add as many records as you think you will have in production and submit a similar number of requests as you would get in production and see how it performs for your environment.
My guess is that it would be OK, the overhead that you are talking about would be insignificant.

Assistance with URL structure for accept/decline links

I am in the process of creating an app in which a customer can add email addresses to an event. This means that each email address is sent 2 URLs via email when added to the list: one to accept and the other to decline. Each URL is made up of a number of query parameters, IDs, etc.
The issue I have is that I want to prevent the scenario in which someone could "guess" another person's URL, i.e. guess the combination of parameters etc. While this is very unlikely, I still want to prevent it.
I have seen several ways to help prevent this, e.g. adding a hash value, encrypting the URL, etc. However, I am looking for the most secure and best-practice approach and would like any possible feedback.
As an aside, I am coding in C#, but I don't believe the solution to this is language specific.
Thanks in advance.

I agree this is not language specific. I had a situation very similar to this within the last few years. It needed to be extremely secure due to children and parents receiving the communications. The fastest solution was something like the following:
First store the information that you would use in the URL as parameters somewhere in a database. This should be relatively quick and simple.
Create two GUIDs.
Associate the first GUID with the data in the database that you would have used for processing an "acceptance".
Associate the second GUID for a "decline" record in the database.
Create the two URLs with only the GUIDs as parameters.
If the Acceptance URL is clicked, use the database data associated with it to process the acceptance.
If the Decline is clicked, delete the data out of the database, or archive it, or whatever.
After a timeframe, if no URL is clicked, delete or archive the data associated with those GUIDs so that they can no longer be used.
GUIDs are extremely hard to guess, and the likelihood of guessing one that is actually usable would be so unlikely it is nearly impossible.
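The steps above can be sketched roughly as follows, with an in-memory dictionary standing in for the database. All type and member names are hypothetical, retiring both tokens once either link is clicked is an assumption, and the timed expiry step is left out. The class is not thread-safe.
```csharp
using System;
using System.Collections.Generic;

public enum InviteResponse { Accepted, Declined }

public class InviteLinks
{
    public Guid AcceptToken { get; set; }
    public Guid DeclineToken { get; set; }
}

public class InviteStore
{
    // token -> (invitation ID, what clicking that token means)
    private readonly Dictionary<Guid, KeyValuePair<int, InviteResponse>> _tokens =
        new Dictionary<Guid, KeyValuePair<int, InviteResponse>>();

    // Creates the two GUIDs for one invitation and records what each one means.
    public InviteLinks Create(int invitationId)
    {
        var links = new InviteLinks
        {
            AcceptToken = Guid.NewGuid(),
            DeclineToken = Guid.NewGuid()
        };
        _tokens[links.AcceptToken] =
            new KeyValuePair<int, InviteResponse>(invitationId, InviteResponse.Accepted);
        _tokens[links.DeclineToken] =
            new KeyValuePair<int, InviteResponse>(invitationId, InviteResponse.Declined);
        return links;
    }

    // Resolves a clicked token, then retires every token of that invitation.
    public bool TryRedeem(Guid token, out int invitationId, out InviteResponse response)
    {
        invitationId = 0;
        response = InviteResponse.Declined;

        KeyValuePair<int, InviteResponse> entry;
        if (!_tokens.TryGetValue(token, out entry))
        {
            return false;
        }

        invitationId = entry.Key;
        response = entry.Value;

        var stale = new List<Guid>();
        foreach (KeyValuePair<Guid, KeyValuePair<int, InviteResponse>> pair in _tokens)
        {
            if (pair.Value.Key == invitationId)
            {
                stale.Add(pair.Key);
            }
        }
        foreach (Guid old in stale)
        {
            _tokens.Remove(old);
        }
        return true;
    }
}
```
Each emailed URL then carries only one GUID, and the server learns both the invitation and the intended action from the lookup.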

I'm guessing you are saving these email addresses somewhere. So it's quite easy to make a secure identifier for each entry you have. Whether that is a hash or some encryption technique, doesn't really matter. But I guess a hash is easier to implement and actually meant for this job.
So you hash, for example, the email address, the PK value of the record, the timestamp of when it was added, and some really impossible-to-guess salt. Just concatenate the various fields together and hash them.
In the end, you send nothing but the hashed key to the server. So when you send those two links, they could look as follows:
http://www.url.com/newsletter/acceptsubscription.aspx?id=x1r15ff2svosdf4r2s0f1
http://www.url.com/newsletter/cancelsubscription.aspx?id=x1r15ff2svosdf4r2s0f1
When the user clicks such a link, your server looks in the database for the record which contains the supplied key. Easy to implement, and really safe if done right. No way in hell someone can guess another person's key. Just bear in mind the standard things when doing something with hashing, such as:
Do not forget to add salt.
Pick a really slow, and really secure, hashing algorithm.
Just make sure that no one can figure out their own hash, from information they can possess.
If you are really scared of people doing bad things, make sure to stop bruteforcing by adding throttle control to the website. Only allow X number of requests per minute for example. Or some form of banning on an IP-address.
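A simple way to picture "X requests per minute" is a fixed-window counter per client. This is only a sketch: the class name is hypothetical, the client key would typically be the IP address, the current time is passed in to keep the logic testable, and old entries are never evicted.
```csharp
using System;
using System.Collections.Generic;

public class FixedWindowThrottle
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly object _gate = new object();

    // client key -> (start of current window, requests seen in that window)
    private readonly Dictionary<string, KeyValuePair<DateTime, int>> _hits =
        new Dictionary<string, KeyValuePair<DateTime, int>>();

    public FixedWindowThrottle(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Returns true if the client may proceed, false once it has used up
    // its allowance for the current window.
    public bool Allow(string clientKey, DateTime nowUtc)
    {
        lock (_gate)
        {
            KeyValuePair<DateTime, int> entry;
            if (!_hits.TryGetValue(clientKey, out entry) || nowUtc - entry.Key >= _window)
            {
                // First request, or the previous window has expired: start a new one.
                _hits[clientKey] = new KeyValuePair<DateTime, int>(nowUtc, 1);
                return true;
            }

            if (entry.Value >= _limit)
            {
                return false;
            }

            _hits[clientKey] = new KeyValuePair<DateTime, int>(entry.Key, entry.Value + 1);
            return true;
        }
    }
}
```
The accept/cancel pages would call Allow before looking up the key and return an error response when it comes back false.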
I'm not an expert at these things, so there might be room for improvement. However I think this should point you in the right direction.
edit: I have to add; the solution provided by Tim C is also good. GUID's are indeed very useful for situations like these, and work effectively the same as my hashed solution above.
