I am working on an order system (MVC) where orders transition between different states, e.g. new order, paid, shipped, etc. Each state can have multiple transitions. Originally I thought I would have a status table with an ID and a description, and then a transition table holding the current status and the status it can transition to, with each transition on a single row. To populate a selection box, I would have to do a join to get the descriptions. Now I am thinking I could do it all in one table and add a comma-separated column listing the possible transitions. Is this a good idea, or is there a better way?
Any RDBMS promotes database normalization. There are six commonly listed normal forms; normally, if you can get to the first three, that is good enough.
First normal form states that a column should hold only one piece of information per row, and that a column should store only one type of information.
In your case, you would be saving a comma-delimited list of transitions. What if you have to pick only the records with a particular transition state? It will be a messy query.
Also imagine having to update that column for a particular record when a transition state changes: again a messy, error-prone query that hurts performance.
Therefore, follow the basic rules of database normalization and stick to your first idea, which was to create a separate table, use IDs to define the transition states, and add a new row whenever a transition changes.
My Suggestion
Simply have one column for [current status] and one column for [transition], and add a new row every time either value changes.
Also add a datetime column that defaults to the current date and time. This lets you go back in history and see the status and transition state of a record at any point in time.
Store this information in only one column of only one table, and reference that column from other tables if you need to.
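A minimal sketch of the normalized design, using Python's built-in sqlite3 module only so that it is runnable anywhere. The table names, column names and the 'Cancelled' status are made up for illustration; the same layout should carry over to any RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE status (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL UNIQUE
    );
    -- One row per allowed transition (hypothetical table and column names).
    CREATE TABLE status_transition (
        from_status_id INTEGER NOT NULL REFERENCES status(id),
        to_status_id   INTEGER NOT NULL REFERENCES status(id),
        PRIMARY KEY (from_status_id, to_status_id)
    );
    -- One row per status change; changed_at defaults to the current time.
    CREATE TABLE order_status_history (
        order_id   INTEGER NOT NULL,
        status_id  INTEGER NOT NULL REFERENCES status(id),
        changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    INSERT INTO status (id, description) VALUES
        (1, 'New order'), (2, 'Paid'), (3, 'Shipped'), (4, 'Cancelled');
    INSERT INTO status_transition (from_status_id, to_status_id) VALUES
        (1, 2), (1, 4), (2, 3), (2, 4);
""")


def allowed_transitions(conn, current_status_id):
    """Return (id, description) pairs for the selection box."""
    return conn.execute(
        """
        SELECT s.id, s.description
        FROM status_transition AS t
        JOIN status AS s ON s.id = t.to_status_id
        WHERE t.from_status_id = ?
        ORDER BY s.id
        """,
        (current_status_id,),
    ).fetchall()


def change_status(conn, order_id, new_status_id):
    """Record a status change as a new history row."""
    with conn:
        conn.execute(
            "INSERT INTO order_status_history (order_id, status_id) VALUES (?, ?)",
            (order_id, new_status_id),
        )


def current_status(conn, order_id):
    """Latest status of an order; rowid reflects insertion order here."""
    row = conn.execute(
        "SELECT status_id FROM order_status_history "
        "WHERE order_id = ? ORDER BY rowid DESC LIMIT 1",
        (order_id,),
    ).fetchone()
    return row[0] if row else None
```

Finding every status that can lead to a given status is the same one-line join in the other direction, which is exactly the query that gets messy with a comma-separated column.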
Related
I have a view which I've created by joining several tables whose records can be changed so the content of the columns of the view can also be changed.
The columns of the view contain data such as addresses, random numbers, dates, random strings, etc.
I accept search text from the user and return the rows in which any column contains that text.
My view has millions of records, so a normal LIKE query won't work (it takes too long).
What is the most efficient way to search this view, given that it changes whenever its underlying tables change?
I'm using an Oracle database, C#, and Entity Framework.
For better performance, you should add proper indexes on the original tables. These indexes are maintained automatically by the RDBMS engine on every change, so you cannot get wrong data through an index: the index and the table always contain the same values.
You don't need to reindex every time. Occasionally (say, monthly) you can update the related statistics.
So indexes can improve your performance a lot, and this also applies to the view.
The view is built on top of the original tables on the fly and is not a stored copy of them, so the indexes help the view return the expected result faster.
Indexes, when properly designed, serve important purposes in a database server.
They let the RDBMS:
find groups of adjacent rows instead of single rows.
avoid sorting by reading the rows in a desired order.
satisfy (sometimes) entire queries from the index alone, avoiding (when possible) the need to access the table at all.
from mysql https://dev.mysql.com/doc/refman/5.5/en/mysql-indexes.html
https://dev.mysql.com/doc/refman/5.5/en/column-indexes.html
https://dev.mysql.com/doc/refman/5.5/en/multiple-column-indexes.html
http://code.tutsplus.com/tutorials/top-20-mysql-best-practices--net-7855
http://use-the-index-luke.com
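To see that an index on the base table is picked up by a query against the view, here is a small sketch. It uses SQLite through Python purely as a runnable stand-in (the question is about Oracle, which has its own EXPLAIN PLAN), and the table, view and index names are invented. It uses an equality predicate; whether a particular LIKE pattern can use an index depends on the database and on the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses (
        id     INTEGER PRIMARY KEY,
        city   TEXT NOT NULL,
        street TEXT NOT NULL
    );
    -- The index lives on the base table, not on the view.
    CREATE INDEX idx_addresses_city ON addresses (city);
    -- The view is just a stored query over the base table.
    CREATE VIEW address_view AS
        SELECT id, city, street FROM addresses;
""")
conn.executemany(
    "INSERT INTO addresses (city, street) VALUES (?, ?)",
    [("Berlin", "Street %d" % i) for i in range(1000)] + [("Oslo", "Main Street")],
)

# Ask the engine how it would run a query against the view.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id, street FROM address_view WHERE city = ?",
    ("Oslo",),
).fetchall()
plan_text = " ".join(row[3] for row in plan)  # 4th column holds the plan detail
print(plan_text)

rows = conn.execute(
    "SELECT street FROM address_view WHERE city = ?", ("Oslo",)
).fetchall()
```

The printed plan should mention idx_addresses_city, showing that the query against the view is answered through the index on the underlying table.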
Let's say we have a code list of all the countries including their country codes. The country code is primary key of the Countries table and it is used as a foreign key in many places in the database. In my application the countries are usually displayed as dropdowns on multiple forms.
Some of the countries that used to exist in the past don't exist any more, for example Serbia and Montenegro, which had the country code SCG.
I have two objectives:
don't allow the user to use these old values (so these values should not be visible in dropdowns when inserting data)
the user should still be able to (readonly) open old stuff and in this case the deprecated values should be visible in dropdowns.
I see two options:
Rename deprecated values, for instance from 'CountryName' to '!!!!!CountryName'. This approach is the easiest to implement, but with obvious drawbacks.
Add an IsActive column to the Countries table and set it to false for all deprecated values and true for all others. On all the forms where the user can insert data, display only the active values. On the read-only forms we can display all values (including deprecated ones), so the user will be able to view old data. But on some of my forms the user should also be able to edit data, which means that the deprecated values should be hidden from them. That means each dropdown needs some initialization logic like this: if the data displayed is read-only, include deprecated values in the dropdown; if the data is also editable, exclude them. But this is a lot of work and error-prone too.
Any other ideas?
I deal with this scenario a lot, and use an 'Active' flag to solve the problem, much as you described. When I populate a drop-down list with values, I only load 'active' data and include up to one deprecated value, but only if it is being used. (That is, if I am looking at a person record and that person has a deprecated country, then that country is included in the drop-down list along with the active countries.) I do this in read-only AND in edit mode, because in my cases, if a person record (for example) has a deprecated country listed, they can continue to use it, but once they change it to a non-deprecated country and save it, they can never switch back (your use case may vary).
So the key difference is that even in read-only mode I don't add all the deprecated countries to the DDL, just the deprecated country that applies to the record I am looking at, and even then only if that record was already using it.
Here is an example of the logic I use when loading the drop down list:
    protected void LoadSourceDropdownList(bool AddingNewRecord, int ExistingCode)
    {
        using (Entities db = new Entities())
        {
            // ToList() runs the query while the context is still open.
            if (AddingNewRecord) // when adding a new record, only show 'active' items in the drop-down list.
                ddlSource.DataSource = (from q in db.zLeadSources where (q.Active == true) select q).ToList();
            else // for existing records, show all active items AND the current value.
                ddlSource.DataSource = (from q in db.zLeadSources where ((q.Active == true) || (q.Code == ExistingCode)) select q).ToList();
            ddlSource.DataValueField = "Code";
            ddlSource.DataTextField = "Description";
            ddlSource.DataBind();
            // Placeholder entry at the top of the list.
            ddlSource.Items.Insert(0, "--Select--");
            ddlSource.Items[0].Value = "0";
        }
    }
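For readers not using Entity Framework, the same "active items plus the record's current value" filter can be sketched in plain SQL. This uses Python's sqlite3 module only to stay runnable; the countries table, the is_active column and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (
        code      TEXT PRIMARY KEY,
        name      TEXT NOT NULL,
        is_active INTEGER NOT NULL DEFAULT 1
    );
    INSERT INTO countries (code, name, is_active) VALUES
        ('DEU', 'Germany', 1),
        ('NLD', 'The Netherlands', 1),
        ('SCG', 'Serbia and Montenegro', 0);
""")


def dropdown_items(conn, existing_code=None):
    """Active countries, plus the record's current country if it is deprecated.

    Pass existing_code=None when adding a new record: "code = NULL" never
    matches, so only active rows come back.
    """
    return conn.execute(
        "SELECT code, name FROM countries "
        "WHERE is_active = 1 OR code = ? ORDER BY name",
        (existing_code,),
    ).fetchall()
```

Calling dropdown_items(conn) gives the list for a new record, and dropdown_items(conn, 'SCG') gives the list for an existing record that still points at the deprecated country.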
If you are displaying the record as read-only, why bother loading the standing data at all?
Here's what I would do:
The record will contain the country code in any case. I would also propose returning the country description with it (which admittedly makes things less efficient). But when the user loads "old stuff", the business service recognises that this record will be read-only, and you don't bother loading the country list (which makes things more efficient).
In my presentation service I then generally check whether the list of countries is null. If not (read/write), load the data into the list box; if so (read-only), populate the list box from the data in the record. A single entry in the list means read-only.
You can filter with CollectionViewSource, or you could just create a public IEnumerable property that filters the full list using LINQ.
See the CollectionViewSource class.
With LINQ: FieldDef.DispSearch is the active condition here. Returning IEnumerable can perform a little better than returning List, since it avoids materializing a copy.
    public IEnumerable<FieldDefApplied> FieldDefsAppliedSearch
    {
        get
        {
            return fieldDefsApplied.Where(df => df.FieldDef.DispSearch).OrderBy(df => df.FieldDef.DispName);
        }
    }
Why would you still want to display (for instance) customer-addresses with their OLD country-code?
If I understand correctly, you currently still have 'address' records that point to 'Serbia and Montenegro'. I think if you solve that problem, your current question becomes moot.
The term "country" is perhaps a little misleading: not all the "countries" in ISO 3166 are actually independent. Rather, many of them are geographically separate territories that are legally portions or dependencies of other countries.
Also note that withdrawn country codes are reserved for 5 years, meaning that after 5 years they may be reused. So moving away from using the country code itself as the primary key would make sense to me, especially if for historical reasons you need to track previous country codes.
So why not add a 'withdrawn' field (or table) that points to the new country IDs? You can still check (in SQL, for instance, since you were already using a table) whether this field is empty to get a true/false flag if you need one.
The way I see it: "country" codes may change, countries may merge and countries may divide.
If countries change or merge, you can update your address records with a simple query.
If countries divide, you need a way to determine which address belongs to which country.
You could use some automated system to do this (and write lengthy books about it).
OR
(when it is a forum-like site) you could ask users whose account still has a withdrawn country that points to multiple alternatives to update their country entry at login, where they can only choose from the list of new countries specified in the withdrawn field.
Think of this simplified country-table setup:
id  cc  cn                     withdrawn
1   DE  Germany
2   CS  Serbia and Montenegro  6,7
3   RH  Southern Rhodesia      5
4   NL  The Netherlands
5   ZW  Zimbabwe
6   RS  Serbia
7   ME  Montenegro
In this example, address records with country-id 3 are updated by a query to country-id 5; no user interaction (or other solution) is needed.
But for address records that specify country-id 2, the user is asked to select country-id 6 or 7 (of course, the text presented to the user shows the country name), or the records are selected for your custom automated update routine.
Also note: 'withdrawn' is a repeating group and as such you could/should make it into a separate table.
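A rough sketch of that separate table, with the example rows from above. It is written against SQLite via Python only so it can be run as-is; the country_successor table name and the three address rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (
        id INTEGER PRIMARY KEY,
        cc TEXT NOT NULL,
        cn TEXT NOT NULL
    );
    -- Replaces the repeating 'withdrawn' column: one row per successor.
    CREATE TABLE country_successor (
        withdrawn_id INTEGER NOT NULL REFERENCES country(id),
        successor_id INTEGER NOT NULL REFERENCES country(id),
        PRIMARY KEY (withdrawn_id, successor_id)
    );
    CREATE TABLE address (
        id         INTEGER PRIMARY KEY,
        country_id INTEGER NOT NULL REFERENCES country(id)
    );
    INSERT INTO country (id, cc, cn) VALUES
        (1, 'DE', 'Germany'), (2, 'CS', 'Serbia and Montenegro'),
        (3, 'RH', 'Southern Rhodesia'), (4, 'NL', 'The Netherlands'),
        (5, 'ZW', 'Zimbabwe'), (6, 'RS', 'Serbia'), (7, 'ME', 'Montenegro');
    INSERT INTO country_successor (withdrawn_id, successor_id) VALUES
        (2, 6), (2, 7), (3, 5);
    INSERT INTO address (id, country_id) VALUES (10, 1), (11, 2), (12, 3);
""")

# Withdrawn countries with exactly one successor can be migrated automatically.
with conn:
    conn.execute("""
        UPDATE address
        SET country_id = (SELECT successor_id FROM country_successor
                          WHERE withdrawn_id = address.country_id)
        WHERE country_id IN (SELECT withdrawn_id FROM country_successor
                             GROUP BY withdrawn_id HAVING COUNT(*) = 1)
    """)


def successor_choices(conn, address_id):
    """Countries the user must choose from; empty if no choice is needed."""
    return conn.execute(
        """
        SELECT c.id, c.cn
        FROM address AS a
        JOIN country_successor AS s ON s.withdrawn_id = a.country_id
        JOIN country AS c ON c.id = s.successor_id
        WHERE a.id = ?
        ORDER BY c.id
        """,
        (address_id,),
    ).fetchall()
```

After the update, the Southern Rhodesia address points at Zimbabwe, while the Serbia and Montenegro address is left alone and successor_choices returns the two options to present to the user.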
Implementing this idea (without downtime) in your scenario:
An SQL statement to build a new country table with numerical IDs as the primary key.
An SQL statement to add a new 'country-id' field to the address records and fill it with the country-id from the new country table that corresponds to the country code in that record's address field.
(An SQL statement to) create the withdrawn table and populate it with the correct data.
Then rewrite the SQL statements that supply your forms with data.
Add the check and the 'ask user to update country' routine.
Let the new forms go live.
Wait and watch for unintended bugs.
Delete the old country table and the (now unused) country-code column from the "address" table.
I am very curious what other experts think about this idea!!
I have seen people use a ‘DataListValue’ table to store values (Call_Types, DepartmentCodes, Divisions, etc.) that are used quite often in drop-down lists in the UI.
This way I can manage them in one table and have one screen to update codes.
I am wondering if it is okay to keep DepartmentCode, RoleCode and CountryCode in the Data_List table, or whether I should have them in separate tables?
It is common practice to have a single codes table that stores code/description pairs with some kind of table type column. For instance, you might commonly see a table like this:
CodesTable
TableId
Code
Description
However, I have never thought it was a good idea. Even if all you store is a code and a description, it's better to make a new table for each set of codes. That way your foreign key relationships will be more clear. Plus, inevitably, you will one day need to store some additional data about one of those code sets and you'll end up needing to add an additional column that only applies to one of the code sets that are stored in the table and the column will be null for all the other rows. It always turns ugly fast.
For instance, let's say, as in the example above, you set TableId to "C" for all the Country codes and you set it to "D" for all Department codes. But then next month a new requirement comes in that requires you to store a postal abbreviation for each Country code. Do you add a PostalAbbreviation column to the table even though it will never apply to Department codes? Or do you create another table that just stores additional data for each country code? How do you know what "C" and "D" mean unless you have some other place to look them up? All around it's just a bad idea.
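To make the foreign key argument concrete, here is a sketch with one table per code set. It runs on SQLite through Python for convenience; the employee table and all sample codes are hypothetical. With separate tables, the database itself rejects a department code used where a country code belongs, and the extra country-only column has an obvious home.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE country (
        code                TEXT PRIMARY KEY,
        description         TEXT NOT NULL,
        postal_abbreviation TEXT          -- only countries need this column
    );
    CREATE TABLE department (
        code        TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE employee (
        id              INTEGER PRIMARY KEY,
        country_code    TEXT NOT NULL REFERENCES country(code),
        department_code TEXT NOT NULL REFERENCES department(code)
    );
    INSERT INTO country (code, description) VALUES
        ('DE', 'Germany'), ('NL', 'The Netherlands');
    INSERT INTO department (code, description) VALUES
        ('FIN', 'Finance'), ('ENG', 'Engineering');
""")


def try_insert_employee(conn, country_code, department_code):
    """Return True if the row was accepted, False if a foreign key rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO employee (country_code, department_code) VALUES (?, ?)",
                (country_code, department_code),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With a single shared codes table, the second call in a pair like try_insert_employee(conn, 'DE', 'ENG') and try_insert_employee(conn, 'ENG', 'DE') could not be rejected by a plain foreign key, because both codes would live in the same table.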
I have an application with rows of data in a relational database. The table needs a status, which will always be one of:
Not Submitted, Awaiting Approval, Approved, Rejected
Now since these will never change, I was trying to decide the best way to implement them. I can think of either a Status enum whose int values are stored in the status column of the table row,
or a status table linked to the table, from which the user selects the current status.
I can't decide which is the better option. I currently have an enum in place with these values, which the approval pages use to populate the dropdown etc. and to set up the SQL (the table currently uses two bool columns, Approved and Submitted for Approval, but this is dirty for various reasons and needs to change).
I'm wondering what your thoughts on this are and whether I should go for one or the other.
If it makes any difference I am using Entity framework.
I would go with the Enum if it never changes since this will be more performant (no join to get the status). Also, it's the simpler solution :).
Now since these will never change...
You can count on this assumption being false, and sooner than you think.
I would use a lookup table. It's far easier to add or change values in a lookup table than to change the definition of an enum.
You can use a natural primary key in the lookup table so you don't need a join to get the value. Yes, a string takes a bit more space than an integer ID, but if your goal is to avoid the join, this accomplishes it.
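A small sketch of a lookup table with a natural key, again on SQLite via Python just so it runs. The request table and the extra 'Withdrawn' status are invented for illustration. The status text is stored directly in the referencing row, so reading it needs no join, while the foreign key still restricts it to known values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    -- The status name itself is the primary key (natural key).
    CREATE TABLE status (name TEXT PRIMARY KEY);
    INSERT INTO status (name) VALUES
        ('Not Submitted'), ('Awaiting Approval'), ('Approved'), ('Rejected');
    CREATE TABLE request (
        id     INTEGER PRIMARY KEY,
        status TEXT NOT NULL DEFAULT 'Not Submitted' REFERENCES status(name)
    );
    INSERT INTO request (id) VALUES (1);
""")


def set_status(conn, request_id, status):
    """Update a request's status; unknown values raise sqlite3.IntegrityError."""
    with conn:
        conn.execute(
            "UPDATE request SET status = ? WHERE id = ?", (status, request_id)
        )


def get_status(conn, request_id):
    """Read the status text straight from the row, with no join."""
    return conn.execute(
        "SELECT status FROM request WHERE id = ?", (request_id,)
    ).fetchone()[0]


def add_status(conn, name):
    """Adding a status later is a plain INSERT, not a code change."""
    with conn:
        conn.execute("INSERT INTO status (name) VALUES (?)", (name,))
```

If a new status such as 'Withdrawn' turns up later, add_status(conn, 'Withdrawn') is all that is needed on the database side.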
I use Enums and use the [Description("asdf")] attribute to bind meaningful sentences or other things that aren't allowed in Enums. Then use the Enum text itself as a value in drop downs and the Description as the visible text.
I have a list of countries. Each user can check multiple countries. Once saved, this "user country list" will be used to determine whether other users fit into the countries a given user chose.
The question is what the most efficient approach to this problem would be.
I have one idea: save the user's selection as a delimited list like Canada,USA,France ... in a single varchar(max) field. The problem is what happens when a user from Germany enters the page I perform this check on: to search for Germany I would need to fetch all items and split each field to check against the value, or use SQL LIKE, which again is pretty slow.
If you have a better solution or some tips, I would be glad to hear them.
To be clear: many users will each have their own selection of countries, and they want only users from those countries to land on their page, while millions of users will reach those pages. So the faster the approach, the better.
Technology: MSSQL and ASP.NET
thanks
You should not store a list of values in one cell. Consider having a separate table that stores each of the selected countries with a foreign key reference to the user table. This is standard Database Normalization.
PLEASE don't go down the route you're thinking of, storing multiple entries in one field. I've had to re-write more applications because of bad database design than for any other reason, and that is a bad design.
Added
I have this poster on my wall at work: http://www.informationqualitysolutions.com/FreeStuff/rettigNormalizationPoster.pdf
One of my predecessors was a newbie to DB Design, and this helped her a lot. I keep it for any new hires that may need it. It explains normalization very nicely, with examples.
Do not save delimited fields into your database. Your database will not be normalized.
You need a many-to-many table for users and countries:
UserId
CountryId
If you do start using a delimited field, you end up needing to parse it (either in SQL or your Code). It is more difficult to query and optimize.
In this case, you will want to create a table called UserCountries (or some such) which would store the UserID and CountryID. This is a standard relational construct. To beginners, it seems strange and too involved, but this structure makes it very easy and very fast to write flexible queries against this type of data. No delimiting required!
I think it would be better to use a UserCountry table, which contains a link to the User and the Country table. This creates a lot more possibilities to query against the database. Example queries that are much simpler this way:
Number of Countries per user
All users which selected a particular country
Sort all popular countries
Do not store multiple countries in a single field. Add 2 additional tables - Countries (ID, Name) and UserCountries (UserID, CountryID)
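Here is a minimal sketch of those tables and of the queries mentioned above. It uses SQLite through Python only so it can be run directly (the question is about MSSQL, where the same SQL shape applies); the Users table, the extra index and the sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Countries (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
    -- Many-to-many link: one row per (user, selected country).
    CREATE TABLE UserCountries (
        UserID    INTEGER NOT NULL REFERENCES Users(ID),
        CountryID INTEGER NOT NULL REFERENCES Countries(ID),
        PRIMARY KEY (UserID, CountryID)
    );
    -- Supports lookups that start from the country side.
    CREATE INDEX IX_UserCountries_Country ON UserCountries (CountryID, UserID);
    INSERT INTO Users (ID, Name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO Countries (ID, Name) VALUES
        (1, 'Canada'), (2, 'USA'), (3, 'France'), (4, 'Germany');
    INSERT INTO UserCountries (UserID, CountryID) VALUES
        (1, 1), (1, 2), (1, 3), (2, 4), (2, 1);
""")


def user_accepts(conn, user_id, country_name):
    """True if the page owner selected the visitor's country."""
    row = conn.execute(
        """
        SELECT EXISTS (
            SELECT 1
            FROM UserCountries AS uc
            JOIN Countries AS c ON c.ID = uc.CountryID
            WHERE uc.UserID = ? AND c.Name = ?
        )
        """,
        (user_id, country_name),
    ).fetchone()
    return bool(row[0])


def countries_per_user(conn):
    """Number of selected countries for each user."""
    return conn.execute(
        "SELECT UserID, COUNT(*) FROM UserCountries GROUP BY UserID ORDER BY UserID"
    ).fetchall()


def users_for_country(conn, country_name):
    """IDs of all users who selected a particular country."""
    rows = conn.execute(
        """
        SELECT uc.UserID
        FROM UserCountries AS uc
        JOIN Countries AS c ON c.ID = uc.CountryID
        WHERE c.Name = ?
        ORDER BY uc.UserID
        """,
        (country_name,),
    ).fetchall()
    return [r[0] for r in rows]
```

The check for a visitor from Germany becomes a keyed lookup in UserCountries instead of splitting or LIKE-scanning a delimited string.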