I use SQL Server and Entity Framework as ORM.
Currently I have a table Product which contains all products of any kind. The different kinds of products possess different attributes.
For example:
All products of kind TV have attributes title, resolution and contrast
Whereas all products of kind Car have attributes like model and horsepower
Based on this scenario I created a table called Attribute which contains all the attributes of a product.
Now to fetch a product from database I always have to join all the attributes.
To insert a product I have to insert all the attributes one by one as single rows.
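Simplified, the current design corresponds to entities roughly like these (a sketch only, not the real schema):

using System.Collections.Generic;

public class Product
{
    public int Id { get; set; }
    public string Kind { get; set; }                               // e.g. "TV" or "Car"
    public virtual ICollection<ProductAttribute> Attributes { get; set; }
}

public class ProductAttribute
{
    public int Id { get; set; }
    public int ProductId { get; set; }
    public string Name { get; set; }                               // e.g. "resolution", "horsepower"
    public string Value { get; set; }                              // every value stored as text
    public virtual Product Product { get; set; }
}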
The application is not just a shop or anything like that. It should be possible to add or remove an attribute to/from a kind of product on the fly, without changing the database.
But my questions to you are still:
Is this a bad design?
Is there another way of doing it?
Will my solution slow things down significantly? (e.g. an insert taking several seconds, assuming the product has hundreds of attributes...)
Update
The problem is that my application is really complex. There are a lot of huge algorithms. The software is used for statistical purposes.
One problem for example is the following one: In an algorithm-table I'm storing which attributes are used for filters. Say an administrator wants to filter all cars that have less than 100 horsepowers. The filters are dynamical, what means that I have a filter table which stores the filter type (lessThan) and the attribute (horsepowers). How can I keep this flexibility with the suggested approaches (with "hardcoded" columns)?
There is a thing about EF that I don't think everybody is aware of when designing the relations.
When you query something, EF (at least <= 4) wants to create a single SELECT for that query.
What that implies is that if you have entity A with a one-to-many relationship to entity B (say Item to Attributes), then EF joins the two together such that there is a returned row for every dependent B of each A. If A has many properties, multiple dependencies, or, even worse, if B has many sub-dependencies, then the returned table will be quite massive, since all of A's properties are copied into each row for its dependent Bs. Over time, as your entity models grow in complexity, this can turn into a real performance problem.
EF only includes the Bs if you explicitly tell it to eager-load the dependencies via Includes. If the Includes are omitted, your stuff will initially load faster, but once you access your attributes, they will be lazy-loaded by EF. This is known as the SELECT N+1 problem (loading N As triggers N additional lazy queries for their Bs, which can be a huge overhead).
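To make that concrete, a rough sketch (the Items/Attributes names and the context are assumptions, not your model):

// Lazy: the items load with one small query, but touching item.Attributes
// later fires one extra query per item - the SELECT N+1 problem.
var items = context.Items.Where(i => i.Kind == "TV").ToList();

// Eager: one big joined SELECT; every item column is repeated once per
// attribute row in the returned result set.
var itemsWithAttributes = context.Items
    .Include("Attributes")
    .Where(i => i.Kind == "TV")
    .ToList();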
While this is not a straight answer to your question, it is something to consider when designing your tables.
Also note that EF supports several strategies for mapping inheritance. One is to have a common base table that is automatically joined with the sub-entity tables (table-per-type). The alternative, which typically performs better but is harder to upgrade, is to have one table with a super-set of all properties of all sub-classes (table-per-hierarchy).
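As a sketch of what those two strategies look like from the code side (class names here are purely illustrative):

// Table-per-type: ProductBase maps to a common Products table that is joined to
// separate Tvs and Cars tables.
// Table-per-hierarchy: the whole hierarchy maps to one wide table with a
// discriminator column and a super-set of all sub-class properties (mostly nullable).
public abstract class ProductBase
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public class Tv : ProductBase
{
    public string Resolution { get; set; }
    public int Contrast { get; set; }
}

public class Car : ProductBase
{
    public string Model { get; set; }
    public int Horsepower { get; set; }
}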
More (over) generalized database design considerations:
The devil is in the details. You can make a whole career out of making good database design choices. There are no silver-bullet database patterns.
EF comes with a lot of limitations. This is the price for the convenience. If the model suits EF well, then EF is quite good, but do consider more flexible alternatives like NHibernate. Sometimes even plain old data tables with views and stored procedures are to be preferred.
EF is not efficient if your model has a lot of small dependents (like a ton of attributes on an item table). It will result in either a monster query and return table, or the SELECT N+1 problem. You can write multi-part LINQ queries to somewhat compensate, but it is tricky (see the sketch after this list).
SQL's strength is in integrity and reporting, which work best for rather rigid data models.
Depending on the details, your model looks like a great candidate for a NoSQL backend, like RavenDB or MongoDB. NoSQL is much better for dynamic data models and scales really well.
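The multi-part LINQ approach mentioned in the list above could look roughly like this (the Items/ItemAttributes sets are assumptions): load the parents and their dependents in two set-based queries and stitch them together in memory, avoiding both the giant join and SELECT N+1.

// Two set-based round trips instead of one giant join or N+1 lazy loads.
var items = context.Items
    .Where(i => i.Kind == "TV")
    .ToList();

var ids = items.Select(i => i.Id).ToList();

var attributesByItem = context.ItemAttributes
    .Where(a => ids.Contains(a.ItemId))         // translates to a single IN (...) query
    .ToList()
    .ToLookup(a => a.ItemId);

foreach (var item in items)
{
    var attributes = attributesByItem[item.Id]; // no extra query here
}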
Related
I've seen a few answers that get into the general area of this question, but none that directly address what I'm seeing. I've inherited a recent code base with a charter to make the code run faster. It looks to me like each table in the database (one database) is given its own DbContext.

These tables are highly relational, and the major issue I keep seeing is that decisions need to be made based on foreign key relationships between tables. With this design approach, every time two tables are involved, the developer nests using statements for each context (good on him for auto cleanup, right), grabs the data from table A, then loops through table B row by row to find matches (usually we're interested in the non-matches, so that we can just update table B), then proceeds.

I have certainly seen multiple contexts used in an app, but I've never seen this "pattern" and cannot figure out why one would use it. Before I rewrite this part of the app so that each table is in the same context, I felt I should do a sanity/reality check to make sure that there isn't some really bad monster waiting for me. Obviously, I cannot find anything. Here's a pseudo snippet (exact code with table and column name changes):
var beers = from b in beerContext.AllBeersOffered
            where b.CurrentStock == 1
            select b;

foreach (var beer in beers)
{
    Bars bars = barContext.AllBeersCarried.FirstOrDefault(x => x.BeerId == beer.Id);
    if (bars == null)
    {
        // Create a list of beers offered but not carried in that bar,
        // which is then added to another table.
    }
}
Again, my inclination is very much Whiskey-Tango-Foxtrot: move the tables to one DbContext and handle this in one insert. But that requires assuming a deeper level of really, really bad coding than I'm comfortable with. The original programmer is long gone from the business and, we think, from the profession entirely. But he apparently was a pretty experienced developer (in Java, though), so I don't like to assume incompetence when it could be something I'm missing. My question is: could you justify this design, and if so, what should I look for?
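For reference, the rewrite I have in mind would collapse this into one context and one set-based query, something like the following (BeerContext is a hypothetical context exposing both tables):

using (var context = new BeerContext())
{
    var carriedBeerIds = context.AllBeersCarried.Select(c => c.BeerId);

    var offeredButNotCarried = context.AllBeersOffered
        .Where(b => b.CurrentStock == 1 && !carriedBeerIds.Contains(b.Id))
        .ToList();                              // one query, via NOT IN / NOT EXISTS

    // offeredButNotCarried is then added to the other table in a single SaveChanges.
}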
This is an anti-pattern, big time!
I've seen beginners do this, especially when coming from a DAO background, in which each entity has its own data-access mirror.
It thoroughly defeats the purpose of a mature OR mapper like Entity Framework, that is designed to cross the chasm between a relational model and an object model. Such OR mappers are designed especially to map associations in tabular form (child refers to parent) to associations in object-oriented form (parent owns children).
So yes, go ahead and grab related entities into contexts that represent natural aggregates in the application.
I would recommend making coarse-grained DbContexts, but not moving the entire schema into one DbContext.
You can follow the DDD practice, where one important table is taken as the root, and then it goes together with a couple of related tables that strongly depend on it. That structure is called an aggregate. In my designs, I usually make one DbContext per aggregate.
On the other hand, operations should be redefined where needed, so that each operation only deals with one aggregate. It may pick IDs from other aggregates and make queries to collect all the data it needs, but it then makes changes to only one aggregate. That approach to design leads to great simplifications in the code.
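A rough sketch of that layout, with purely illustrative aggregate names:

using System.Data.Entity;

// One DbContext per aggregate: the root plus the tables that strongly depend on it.
public class OrderingContext : DbContext
{
    public DbSet<Order> Orders { get; set; }          // aggregate root
    public DbSet<OrderLine> OrderLines { get; set; }  // only meaningful through its Order
}

public class CatalogContext : DbContext
{
    public DbSet<Product> Products { get; set; }      // a separate aggregate root
    public DbSet<PriceTier> PriceTiers { get; set; }
}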
Introduction:
I'm refactoring (pretty much rewriting) a legacy application in my current internship. The part that this question will be concerned about is the database it uses and the way they retrieve data from it.
The database structure is:
There's a table that has the main records. Let's say each record is a measurement. It has some info about the measured material and different measurement information.
There's a view they use that has the same information columns, plus some extra columns that contain data calculated from the given measurements. It also filters some of the data from the table.
So let's say we have the main table with columns:
Measurement ID
Measurement A
Measurement B
The view has something like this:
Measurement ID
Measurement A
Measurement B
Some extra data (for example Measurement A * Measurement B)
The guy leading the development only knows some SQL, so he likes adding new columns, calculated from columns in the main table, for experimenting. And this is definitely a need at the moment.
Requirements are:
Different types of databases should be supported (like SQL Server, Oracle, and probably some others).
The frontend should be able to show the view, which means even though some main columns will always stay the same, there may be some new columns including newly calculated values.
My question is:
What kind of system should I use to accommodate the needs of this application? I wanted to use Entity Framework, but the fact that the view may have new columns in the future is, I think, a problem. As far as I understand, I would have to map my classes to the database before compiling.
The other thing that I'm considering is using Entity Framework to get data from the main table, doing the calculations and filtering that are currently done in the view directly in the frontend, and skipping the view altogether. That sounds fine, though I don't know if they will allow me to do that.
What would you do in my case? Please take into account that I have virtually no experience with databases and ORMs.
You are correct in that using Entity Framework will be a problem if the underlying DB schema is always changing. It will require you to update the EF model on your end every time to grab those new columns.
Ideally, all of your database access is hidden behind the interface to your DAL, so that your application doesn't need to know about which ORM is being used -- if any -- or which database it's connecting to.
I hate to say it, but given your requirements, an ORM might not make sense. You might want to go with something more generic, without any strong typing. You could simply always return a DataTable to your application layer, and it could loop through the columns and values to display whatever is returned. If there are fields you know will never change, you could create a manual mapping for those fields only into your application object(s).
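A minimal sketch of that untyped approach (the connection handling and the SqlClient provider are assumptions; an Oracle provider would work the same way):

using System.Data;
using System.Data.SqlClient;

public class GenericDal
{
    private readonly string connectionString;

    public GenericDal(string connectionString)
    {
        this.connectionString = connectionString;
    }

    // Returns whatever columns the view currently has; the caller enumerates
    // table.Columns and renders them dynamically.
    public DataTable GetView(string viewName)
    {
        // viewName should come from configuration, never from user input.
        using (var connection = new SqlConnection(connectionString))
        using (var adapter = new SqlDataAdapter("SELECT * FROM " + viewName, connection))
        {
            var table = new DataTable();
            adapter.Fill(table);
            return table;
        }
    }
}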
You may have a look at NoSQL systems, which are a lot more flexible about the schema, or at a document database like RavenDB. All these systems allow the schema to change dynamically. You need to check the pros and cons to see if they can fulfil your requirements.
(This answer is a bit off topic, as it's about replacing SQL Server rather than creating a DAL, but other answers cover that subject well and I would like to propose another way that may help.)
If your schema is unstable, then using Entity Framework as a beginner is going to be a headache. The assumption is that you can just refresh the design canvas periodically to let the tool handle database table changes. You can try that for a time to see when it becomes too much of a pain, but without any prior experience using ORMs or Entity Framework it may not be worth the effort.
I would probably use something like Rob Conery's Massive ORM (https://github.com/robconery/massive). It gives you more flexibility with the underlying database schema and is a very small library. I remember it being ~300 lines of code and very easy to use. It uses C# dynamics so you'll have to be using >= C# 4.0 and be comfortable with that one concept but IMO it's worth it for the low-overhead. A full-fledged ORM like Entity Framework or NHibernate is going to cost a lot of learning cycles.
You could, of course, just stick to ADO.NET DataTables. They're a bit ugly and verbose, but they'll do the job.
You can use Entity Framework - Database First if the DB is changing. Of course, you will have to regenerate your classes when the DB schema changes and you want to be able to access the new columns.
If you need to accommodate different database servers, then you should take a look at implementing a repository pattern and abstract all your data access that way.
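A minimal sketch of such an abstraction (the type and member names, including Measurement, are assumptions):

using System.Data;

public interface IMeasurementRepository
{
    Measurement GetById(int id);
    void Add(Measurement measurement);
    void Update(Measurement measurement);

    // The calculated view stays untyped because its columns can change.
    DataTable GetCalculatedView();
}

// One implementation can target SQL Server (e.g. via Entity Framework),
// another can target Oracle; the rest of the application only sees the interface.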
Your comment
it involves write operations to the main table but the main table never changes
confirms what I was hoping for. It means you can use Entity Framework as the core of your application and a different route to display data.
Suppose that for display (of the view) you use a classic DataTable (because all common grids support them, contrary to dynamic objects). I don't know how create/update/delete will be done, but saving changes will at some point involve mapping a DataRow to a MainEntity object. You can write one method for that, like:
MainEntity DataRowToEntity(DataRow row)
{
    var entity = new MainEntity();
    entity.PropertyA = (string)row["PropertyA"];
    // ... map the remaining fixed columns the same way
    return entity;
}
The MainEntity can be attached to a context, its status changed to Modified, and saved.
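The save path could then look roughly like this (the DbContext API is assumed here; MainContext and MainEntities are illustrative names):

using (var context = new MainContext())
{
    var entity = DataRowToEntity(changedRow);            // map the edited DataRow back
    context.MainEntities.Attach(entity);
    context.Entry(entity).State = EntityState.Modified;
    context.SaveChanges();
}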
I'm having trouble choosing an appropriate data access framework, partly because I'm very picky with my preferences and mostly because I don't have much experience with most of them :-)
I need a framework that will allow me to easily map between the DB tables (SQL Server) and my entities, and that will handle the CRUD operations for me (for the most part).
I want my entities to reside in a separate assembly from my DAL.
I prefer using attributes for the mappings over external file like XML.
It doesn't have to be an ORM, and I want to code my entities myself.
I don't mind writing stored procedures.
The project's database won't be very big. Less than 50 tables.
I'd like some of my entities to correspond to an inner join of two tables - one for static data entered manually during development and the other with data filled during runtime - without using two entities that reference one another (the result of this join will be a single entity).
Entity Framework sounded perfect until I realized it doesn't support Enums (yet - and I can't wait for EF 5.0).
I want these entities to include Enums, and plan on using lookup tables for the enums + code generation for the enum to keep it synchronized with the database.
Linq-to-SQL seems like a good candidate, but I don't know if it copes well with my previous demands.
Using Enterprise Library 5.0 DAAB with its RowMapper, and extending its abilities to perform updates and inserts, is also an option (but will require more coding on my part).
I plan on implementing the Repository Pattern.
How about NHibernate? Would it do? No experience there either.
I would be happy to hear all suggestions.. the more the merrier! Thanks in advance!
I think NHibernate is the way to go, although some of its main strengths (ORM, stored procedure generation, etc.) are things you listed as non-requirements. Anyway, NHibernate will do everything you want it to do. Technically it does use XML mappings, but these can easily be auto-generated using fluent mappings. I like this, as it IS done for you, but you get the customization too, just in case you need it. Good luck!
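For example, a mapping written (or generated) with Fluent NHibernate looks roughly like this; the entity and column names are illustrative only:

using FluentNHibernate.Mapping;

public class ProductMap : ClassMap<Product>
{
    public ProductMap()
    {
        Table("Products");
        Id(x => x.Id);
        Map(x => x.Name);
        Map(x => x.Price);
        References(x => x.Category);   // many-to-one to the Category table
    }
}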
I have a legacy database with a pretty evil design that I need to write some applications for. I am not allowed to touch the database design at all, seeing how this is a fragile old system held together by spit and prayers. I am of course very aware that this is not how the database should have been designed in the first place, but real life sometimes gets in the way...
For my new application I am using NHibernate (with Fluent for mappings and NHibernate LINQ for querying) and trying to Do Things Right. So there is IoC and repositories and more interfaces than I can count. However, the DB structure is giving me some headaches.
The system is very much focused around the concept of customers, and each customer lives in a campaign. These campaigns are created by one of the old applications. Each campaign in the system is defined in a table called CampaignSettings. One of the columns of this table is simply a text column called "Table", which refers to a database table that is created at the same time as the campaign entry in CampaignSettings. The name of this table is related to the name of the campaign, which can pretty much be anything the customer wants (within the constraints given by SQL Server (2000 or 2005)). In these tables the customers live.
So that is challenge #1 - I won't know the table names until runtime. And they will change from site to site - no static mapping, I guess.
To make it even worse, we have challenge #2 - this campaign table is also dynamic in structure, meaning it has a certain number of columns that are always there (customer id, name, phone number, email address and other housekeeping stuff), and then there are two other sets of columns, added depending on the requirements of the customer on a case-by-case basis.
The old applications use SQL to get the column names present in the table, then add the ones they don't know about as "custom fields" in the application. I need to handle this.
I know I probably can't handle these challenges simply by using mapping magic, and I am prepared to do some ugly SQL in addition to the ORM goodness that I get from NHibernate (there are 20-some "static" tables in here as well which NHibernate handles beautifully) - but how?
I will create a Customer entity that I guess I can populate manually by doing direct SQL like
SELECT * FROM SomeCampaignTable WHERE id=<?>
and then going through the columns one by one and putting stuff where it belongs. Not fun, but necessary.
And then I guess to discover the structure of the table in the first place, I could run SQL like this:
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'SomeCampaignTable'
ORDER BY ORDINAL_POSITION
And again do some manual work to configure my object to handle the custom fields.
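Roughly what I have in mind for that discovery step, as plain ADO.NET (the "known" column names here are made up):

using System.Collections.Generic;
using System.Data.SqlClient;

public static class CampaignSchema
{
    public static Dictionary<string, object> DiscoverCustomFields(string connectionString, string campaignTableName)
    {
        var knownColumns = new HashSet<string> { "Id", "Name", "Phone", "Email" };
        var customFields = new Dictionary<string, object>();

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS " +
            "WHERE TABLE_NAME = @table ORDER BY ORDINAL_POSITION", connection))
        {
            command.Parameters.AddWithValue("@table", campaignTableName);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    var column = reader.GetString(0);
                    if (!knownColumns.Contains(column))
                        customFields[column] = null;   // value filled in later from the row query
                }
            }
        }
        return customFields;
    }
}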
My question is simply - how can I do this in NHibernate? Is it a simple matter of finding a way to run my own SQL, then looping through the results, or is there a more elegant way to take the pain out of it?
While I appreciate that this database design belongs in some kind of Museum of Torture somewhere, answers like "Add some views" or "Change the DB" won't help me - I will be shot if I suggest something like that.
Thanks for anything that could help save my sanity here!
You might be able to use NHibernate using Native SQL Entity Queries. Forget Linq2NH - not that I would recommend Linq2NH for any serious application.
Check this page.
13.1.2. Entity queries
https://www.hibernate.org/hib_docs/nhibernate/1.2/reference/en/html/querysql.html
You could maybe do something like this:
Map your entities based on a 'fake' table to keep NHibernate happy when it compiles the mapping documents (I know you said you can't change the DB, but hopefully it's OK to create an empty table for this purpose).
Then run a query like this, as per 13.1.2 above:
sess.CreateSQLQuery("SELECT tempColumn1 as mappingFileColumn1, tempColumn2 as mappingFileColumn2, tempColumn3 as mappingFileColumn3 FROM tempTableName").AddEntity(typeof(Cat));
NHibernate should stitch together the columns you've returned with the mapped entity and give you the entity of type 'Cat' with all the properties populated. I am speculating here though; I do not know for sure if this will work, but it's the only way I can think of to use NHibernate for this, given that you don't know the tables/columns at compile time. You definitely cannot use HQL, Criteria, or Linq2NH, since you don't know the tables and columns at compile time, and HQL et al. all convert your mappings to the mapped column names to produce the underlying SQL. Native SQL queries are the only way, I think.
I'm a pretty good C# programmer who needs to learn SQL Server. What's the best way for me to learn SQL Server/Database development?
Note: I'm a total newb when it comes to DB's and SQL.
SQL is about set theory, or more correctly, relational algebra. Read a brief primer on that. And learn to think in sets, not in procedures.
On the practical side, there are four fundamental operations:
selects, which show some projection of a table's (or tables') data,
deletes, which remove some subset of a table's rows,
inserts, which add rows to a table,
updates, which (possibly) change data in a table.
(By subset, I mean any subset, including the empty set, and not necessarily a proper subset.)
Anywhere I can write a column name in DML (except as the target of an update), I can write an expression that uses column names, functions, or constants.
select 1, 2, 3 from table will return the resultset "1 2 3", once for each row in the table. If the column named create_date is of type date, and the function month returns a month number given a date, select month( create_date) from table will show me the month number for each create_date.
A where clause is a predicate that restricts the rows selected, deleted, or updated to those rows for which the predicate is true. A where clause can be composed of an arbitrary number of predicates connected by the logical operators and, or, and not. Just like the column list in a select, I can use column names, functions, and constants in my where clause. What result set do you think is returned from select * from table where 1 = 1;?
In a query, tables are related by joins, in which some datum or key in one is related by an operator to a datum or key in another table. The relational operator is often equality, but can in fact be any binary operator or even a function.
Tables are related, as I mentioned above, by keys; a row in a table may relate to zero, one, or many rows in another table; this is referred to as the cardinality of the relation. Relations may be one-to-one, one-to-many, many-to-many. There are standard ways of representing each relation. Before you look up the standard ways to do this, think about how you'd represent each one, what the minimum requirements of each kind is. You'll see that a many-to-many relation can in fact also model one-to-many and one-to-one; ask yourself why, given that, all relations are not many-to-many.
E. F. Codd, among others, pioneered the idea of normal form in relational databases. There are commonly held to be five or six normal forms, but the most important summary of normal form is simple: every entity that your database models should be represented by one row and one row only, every attribute should depend on the row's key, and every row should model an entity or a relationship. Read a primer on normal form, and understand why you can get data inconsistencies if your database isn't normalized.
In all this, try to understand why I like to say "if you lie to the database, it will lie to you". By this I don't mean bad data, I mean bad design. E.g., if you model a one-to-many relation as many-to-many, what "lies" can be recorded? What "lies" can happen if your tables aren't normalized?
A view, in practical terms, is a select query given a name and stored in the database. If I often join table student to table major through the many-to-many relation student_major, maybe I can write a view that selects the columns of interest from that join, and use the view instead of always rewriting the join.
Practical tips: first, write a view. Whatever you're doing, it'll be simpler and clearer if you write a view for every calculation or sub-calculation you do. Write a view that encapsulates each join, and a view that encapsulates each transformation. Almost anything you want to do can be done in a view.
Decomposing a query into views serves the same ends as functional decomposition serves in procedural code: it allows you to concentrate on doing one thing well, makes it more easily tested, and allows you to compose more complex functionality out of simpler operations. Here's an example where I use views to transform a table into forms that more easily allow me to apply successive transformations, in order to get to a goal.
Don't conflate data. Each table ought to unambiguously model one thing (one kind of entity) and only one thing; each column should express one and only one attribute of that thing. Different kinds of entities belong in different tables.
Metadata is your friend. Your database platform will provide some metadata; what it doesn't provide, you should add. Since metadata is data, all the rules for modeling data apply. You can get, for example, the names of all objects in your database from the system table sysobjects; syscolumns contains all the columns. To find all the columns in one table, you'd join sysobjects and syscolumns on id, and add a where clause restricting the result set to a particular table name: where sysobjects.name = 'mytable'.
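Since you're coming from C#, that same metadata query might look like this when run from code (the connection handling is an assumption):

using System;
using System.Data.SqlClient;

public static class MetadataDemo
{
    public static void PrintColumns(string connectionString)
    {
        const string sql =
            @"SELECT syscolumns.name
              FROM sysobjects
              JOIN syscolumns ON syscolumns.id = sysobjects.id
              WHERE sysobjects.name = 'mytable'";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine(reader.GetString(0));   // one row per column of mytable
            }
        }
    }
}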
Experiment. Sit down at a database and ask yourself, "How can I represent people with hair colors and professions and residences? What tables and relations are implied in modeling that?" Then model that, as tables.
Then ask yourself, "How can I show all blonde doctors who reside in Atlanta", and write the query that does that. Piece it together by writing views that show you all blondes, all doctors, and all people who reside in Atlanta.
You'll find that in asking "how can I find that", you'll expose deficiencies in your model, and you'll find that you want or even need to change the way your model works. Make the changes, see how they make your queries easier or harder to write.
I love Joe Celko's books, from novice to advanced. I also think virtual labs are great.
An easy way to learn SQL syntax?
Use Microsoft Access. Use the Northwind sample database, open Access up in Query view and run some queries.
Creating a Simple Query
Start with SELECT * FROM and work your way up to more complicated examples.
One of the Best resources is http://www.sqlservercentral.com/ Tons of articles
Another good resource is http://www.trainingspot.com/VideoLibrary/Default.aspx
And here is a list of books my DBA suggested I read for learning SQL
Best Damn Exchange, SQL and IIS Book Period, or on Google Books
Beginning SQL Server 2008 for Developers, or on Google Books
Here are the three books I strongly recommend you read in order.
Beginning SQL Server 2005 Programming
Professional SQL Server 2005 Programming
The Guru's Guide to Transact-SQL
W3Schools has a nice tutorial with a try-it-by-example setup. But other than installing an Express edition and having a bunch of trial runs with the demo databases, I'd say no book will teach you better.
I would say your very best bet is to sign up for a DB class at a local college. You can usually find an evening class. You will start with simple Database concepts like what is a database, and what are tables.
The instructor will usually give you a project as homework about halfway through the class, where you will design and implement a simple database for something like a video store. You will have interaction with other students who are at your same level and will be interested in discussing the technical details from a new DB guy's standpoint. And you will have an experienced instructor you can ask questions of and get timely interaction from, who won't be snarky like us internet posters :)
Get it from the horse's mouth --> http://www.asp.net/learn/videos/default.aspx?tabid=63#sql
These days most of the universities have their courses online. Try to research some good professors and learn the fundamentals. Their assignments are also useful.
Off the top of my head, I can think of MIT OpenCourseWare (OCW).
This depends on what you will need to do. If you just need to access databases, you should have a look at the various access strategies - DataReader, DataSet, LINQ to SQL, Entity Framework, NHibernate - and pick a solution.
If you need to develop databases, get a good book on that topic. Get familiar with the theoretical stuff - relational algebra, keys, referential integrity and normalisation. Then have a look at SQL, and finally you may have a closer look at ACID transactions, locking, concurrency control, indexes, and all the technical details that make a database server work.
I would suggest reading the Wikipedia articles - maybe the 100 most important ones - to get the big picture, and then approach the details where required. But this will probably be no replacement for a good book if you want to become a good database developer.
I tend to like books because I can read them anywhere, I can go at my own pace, and I can get eBook copies (when using Apress). I also happen to learn more efficiently in this manner, as I already know most of the concepts, like database types (int, bool, guid, etc.) - you will know those as well. So, essentially, I would recommend the Apress series of books - very comprehensive IMO. And you can generally find them used for very cheap on Amazon... Here is one tailored to you:
http://www.amazon.com/Beginning-SQL-Server-2008-Developers/dp/1590599586/ref=sr_1_1?ie=UTF8&s=books&qid=1239758026&sr=1-1
When you sign up for the Microsoft Press book newsletters, they actually give you a free ebook called Introducing SQL Server 2008.
http://csna01.libredigital.com/?urss1q2we6