DataSets to POCOs - an inquiry regarding DAL architecture

DataSets to POCOs - an inquiry regarding DAL architecture - c#

I have to develop a fairly large ASP.NET MVC project very quickly and I would like to get some opinions on my DAL design to make sure nothing will come back to bite me since the BL is likely to get pretty complex. A bit of background: I am working with an Oracle backend so the built-in LINQ to SQL is out; I also need to use production-level libraries so the Oracle EF provider project is out; finally, I am unable to use any GPL or LGPL code (Apache, MS-PL, BSD are okay) so NHibernate/Castle Project are out. I would prefer - if at all possible - to avoid dishing out money but I am more concerned about implementing the right solution. To summarize, there are my requirements:
Oracle backend
Rapid development
(L)GPL-free
Free
I'm reasonably happy with DataSets but I would benefit from using POCOs as an intermediary between DataSets and views. Who knows, maybe at some point another DAL solution will show up and I will get the time to switch it out (yeah, right). So, while I could use LINQ to convert my DataSets to IQueryable, I would like to have a generic solution so I don't have to write a custom query for each class.
I'm tinkering with reflection right now, but in the meantime I have two questions:
Are there any problems I overlooked with this solution?
Are there any other approaches you would recommend to convert DataSets to POCOs?
Thanks in advance.

There's no correct answer, though you'll find people who will try to give you one. Some things to keep in mind:
Since you can't get the advantages of EF or Linq-to-SQL, don't worry about using the IQuerable interface; you won't be getting the main advantage of it. Of course, once you've got your pocos, LINQ to object will be a great way of dealing with them! Many of your repository methods will return IQueryable<yourType>.
As long as you have a good repository to return your pocos, using reflection to fill them out is a good strategy, at first. If you have a well-encapsulated repository, I say again. You can always switch out the reflection-filled entity object code for more efficient code later, and nothing in you BL will know the difference. If you make yourself dependent on straight reflection (not optimized reflection like nHibernate), you might regret the inefficiency later.
I would suggest looking into T4 templates. I generated entity classes (and all the code to populate them, and persist them) from T4 templates a few months ago, for the first time. I'm sold! My code in my T4 template is pretty horrible this first try, but it spits out some nice, consistent code.
You will have to have a plan for your repository methods, and closely monitor all the methods your team creates. You can't have a general .GetOrders() method, because it will get all the customers every time, and then your LINQ to object will look nice, but will be covering some bad data access! Have methods like .GetOrderById(int OrderID) and .GetOrderByCustomer(int CustomerID). Make sure each method that returns entities uses an index at least in the DB. If the basic query returns some wasted records, that's fine, but it can't do table scans and return thousands of wasted records.
An example:
var Order = From O in rOrders.GetOrderByCustomer(CustID)
Where O.OrderDate > PromoBeginDate
Select O
In this example, all the Order for a customer would be retrieved, just to get some of the orders. But there won't be a huge amount of waste, and CustomerID should certainly be an indexed field on Orders. You have to decide whether this is acceptable, or whether to add a date distinction to your repository, either as a new method or with overloading other methods. There's no shortcut to this; you have walk the line between efficiency and maintaining your data abstraction. You don't want to have a method in your repository for every single data inquiry in your entire solution.
Some recent articles I've found where people are wrestling with how exactly to do this.:
http://mikehadlow.blogspot.com/2009/01/should-my-repository-expose-iqueryable.html
http://www.west-wind.com/WebLog/posts/160237.aspx

Devart dotConnect for Oracle supports the entity framework, you could then use LINQ to Entities.

Don't worry about using reflection to build DTOs from Datasets. They just work great.
An area of pain will be implementation of IComparer for each business object. Only load the data that is minimum requirement at the presentation layer. I burnt my fingers badly on in-memory sorting.
Also, plan in advanced for lazy-loading on DTOs.
We wrote our on Generic library to convert datatable/datarow into entitycollection/entityobjects. And they work pretty fast.

Related

Should I Use Entity Framework, DataSet or Custom classes?

I am really having a hard time here. I need to design a "Desktop app" that will use WCF as the communications channel. Its a multi-tiered application (DB and application server are the same, the client goes through the internet cloud).
The application is a little complex (in terms of SQL and code logics) then the usual LOB applications, but the concept is the same: Read from DB, update to DB, handle concurrency etc. My problem is that now with Entity Framework out in the open, I cant decide which way to proceed: Should I use Entity Framework, Dataset or Custom Classes.
As I understand by Entity Framework, it will create the object mapping of my DB tables ALONG WITH the CRUD scripts as well. Thats all well and good for simple CRUD, but most of the times the "Select" is complex and it requires a custom SQL. I understand I can use Stored Procedures in EF (I dont like SP btw, i dont know why, I like to code my SQL in the DAL by hand, I feel more secure and comfortable that way).
With DataSet, I will use my custom SQLs and populate on the data set. With Custom classes (objects for DB tables) I will populate my custom SQLs on those custom classes (collections and lists etc). I want to use EF, but i dont feel confident in deploying an application whose SQL I have not written and cant see in the code. Am I missing something here.
Any help in this regard would be greatly appreciated.
Xeshu

I would agree with Marc G. 100% - DataSets suck, especially in a WCF scenario (they add a lot of overhead for handling in-memory data manipulation) - don't use those. They're okay for beginners and two-tier desktop apps on a small scale maybe - but I wouldn't use them in a serious, professional app.
Basically, your question boils down to how do you transform your rows from the database into something you can remote across WCF. This means some form of mapping - either you do it yourself, using DataReaders and then shoving all the data into WCF [DataContract] classes - you can certainly do that, gives you the ultimate control, but it's also tedious, cumbersome, and error-prone.
Or you let some ready-made ORM handle this grunt work for you - take your pick amongst Linq-to-SQL (great, easy-to-use, flexible, but SQL Server only), EF v4 (out by March 2010 - looks very promising, very flexible) or any other ORM, really - whatever suits your needs best.
Other serious competitors in the ORM space might include Subsonic 3.0 and NHibernate (amongst many many others).
So to sum up:
forget about Datasets
either you have 100% control and to the mapping between SQL and your objects yourself
you let some capable ORM handle that (Linq-to-SQL, EF v4, Subsonic, NHibernate et al) - which one really doesn't matter all that much, i.e. it's also a matter of personal preference and coding style

I can't advocate datasets, especially in an SOA environment like WCF - it'll work, but for mostly the wrong reasons. They simply aren't portable, and IMO don't really "work" over service boundaries. Of course, IMO they don't work in most other scenarios too ;-p
So then it comes down to how much plumbing you want to do. Most ORMs will create WCF-serializable types for you; personally I'd use LINQ-to-SQL at the moment; it is both simpler and more complete than EF, although EF 4.0 is meant to be much better than EF in 3.5sp1. You can use custom TSQL (via ExecuteQuery, which still does the mapping back to objects), but I tend to use either SPROC (for complex queries) or LINQ-generated queries (for simple requests).
Writing the types yourself is fine too, and will work with NHibernate etc. So many options.

While EF works with WCF and sounds very promising, you should consider the effort to get on speed with it. Especially when doing some non trivial stuff, the designer in VS2008 can't open the model anymore and you have to code your model in xml.
Also keep in mind that EF works on a very high abstraction level. Because of the law of leaky abstractions its not all that shiny as it supposed to be :)
The other way round that means, you have to deal with very crazy and hard to read sql statements sent to your database when it comes to troubleshooting / performance issues.

Win-based application (C#) needs a class for handling DB Connection

I think about having a class clsConnection which we can take advantage of in order to execute every SQL query like select, insert, update, delete, .... is pretty good.
But how complete it could be? How?

You could use LINQ to SQL as AB Kolan suggested or, if you don't have time for the learning curve, I'd suggest taking a look at the Microsoft Enterprise Library Data Access Application Blocks.

You can use the DAB (SQlHelper) from the enterprise Library. This has all the methods/properties necessary for database operation. You dont need to create you own code.
Alternately you can use a ORM like LINQ or NHibernate.

It sounds to me like you're just re-writing the ADO.NET SqlConnection (which already has an attached property of type SqlCommand). Or Linq to SQL (or, even, Linq to Entities).

When doing data access i tend to split it into 2 tiers - purely for testability.
totally seperate the logic for getting key values and managing the really low level data collection from the atomic inserts, updates, selects deletes etc.
This way you can test the logic of the low level data collection very easily without needing to read and write from a database.
this way one layer of classes effectively manages writes to individual tables whilst the other is concerned with getting the data from lookups etc to populate these tables
The Business logic layer that sits on top of these 2 dal layers obviously manages the actual business logic - this means that the datastructure is as seperated from the business logic as is realistically possible ... Ie you could replace the dal and not feel the pain so much.
the 2 routes you can take that work well are
ADO.Net
this is very powerful as you have total control, but at the same time it is time consuming and feels repetative. Also its old school so most people are bored of it hence all the linq 2 sql comments. With this you open a connection to the DB and then execute a command against it.
Basically you create a class to interface with the database and use this to use stored procedures that are in the database. The lowest level class essentially fires off the command with its parameters and then populates itself with the returned values.
and Linq 2 SQL
This is a cool system. Essentially it makes SP's redundant for 90% of cases in return for allowing strongly typed sqlesque statements in your code - save time and are more reliable. I still use 2 dal layers with this but take advantage of the fact that it will generate the basic class with properties for you and simply add functionality to actually do the atomic operations. The higher level then implements the read and write logic for multiple objects.
The nicest part is that you can generate collections of collections easily with linq 2 sql and then write all the inserts and updates with one command (altohguh in reality you tend to do things seperatley).
L2S is powerful once you start playing with it wheras generating a collection of objects from ado.net can be a real pain in comparison - especially when you have to do it again and again.
Another alternative is Linq 2 entities
I ahve had problems with this due to linked servers, also it doesn't like views much and if your tables dont have pk's or constraints then it doesn't like life much either. Id stay clear of it for a while.
Of course if you mean that you want a generic class for writing and reading data from a database I think you will be adding complexity rather than solving a problem. Really you can;t avoid writing code ;) - each bit of data access is unique, trying to genericise it past ado.net or l2s is really asking for trouble imo.

Small project:
A singleton class (like DatabaseConnection) might be good for what you're doing.
Large project:
Enterprise Library has some database code; NHibernate or Entities Framework, perhaps.
Your question wasn't specific enough to give a very definitive answer on this.

Best design practices for .NET architecture with LINQ to SQL (DAL necessary? Can we truly use POCOs? Design pattern to adopt?)

I was avoiding writing what may seem like another thread on .net arch/n-tier architecture, but bear with me.
I, hopefully like others still am not 100% satisfied or clear on the best approach to take given today's trends and new emerging technologies when it comes to selecting an architecture to use for enterprise applications.
I suppose I am seeking mass community opinion on the direction and architectural implementation you would chose when building an enterprise application utilising most facets of today's .NET technology and what direction you would take. I better make this about me and my question, in fear of this being too vague otherwise; I would like to improve my architecture to improve and would really like to hear what you guys think given the list of technologies I am about to write.
Any and all best practices and architectural patterns you would suggest are welcome and if you have created a solution before for a similar type setup, any pitfalls or caveats you may have hit or overcome.
Here is a list of technologies adopted in my latest project, yep pretty much everything except WPF :)
Smart Client (WinForms)
WCF
Used by Smart Client
ASP.NET MVC
Admin tool
Client tool
LINQ to SQL
Used by WCF
Used ASP.NET MVC
Microsoft SQL Server 2008
Utilities and additional components to consider:
Dependency Injection - StructureMap
Exception Management - Custom
Unit Testing - MBUnit
I currently have this running in an n-Tier arch. adopting a Service-based design pattern utilising Request/Response (Not sure of it's formal name) and the Repository pattern, most of my structure was adopted from Rob Conery's Storefront.
I suppose I am more or less happy with most of my tiers (It's really just the DAL which I am a bit uneasy on).
Before I finish, these are the real questions I have faced with my current architecture:
I have a big question mark on if I should have a custom data access layer given the use of LINQ to SQL. Should I perform LINQ to SQL directly in my service/business layer or in a DAL in a repository method? Should you create a new instance of your DB context in each repository method call (using using())? or one in the class constructor/through DI?
Do you believe we can truly use POCO (plain old CLR objects) when using LINQ to SQL? I would love to see some examples as I encountered problems and it would have been particularly handy with the WCF work as I can't obviously be carrying L2S objects across the wire.
Creating an ASP.NET MVC project by itself quite clearly displays the design pattern you should adopt, by keeping view logic in the view, controller calling service/business methods and of course your data access in the model, but would you drop the 'model' facet for larger projects, particularly where the data access is shared, what approach would you take to get your data?
Thanks for hearing me out and would love to see sample code-bases on architectures and how it is split. As said I have seen Storefront, I am yet to really go through Oxite but just thought it would benefit myself and everyone.
Added additional question in DAL bullet point. / 15:42 GMT+10

To answer your questions:
Should I perform LINQ to SQL directly in my service/business layer or in a DAL in a repository method? LINQ to SQL specifically only makes sense if your database maps 1-to-1 with your business objects. In most enterprise situations that's not the case and Entities is more appropriate.
That having been said, LINQ in general is highly appropriate to use directly in your business layer, because the LINQ provider (whether that is LINQ to SQL or something else) is your DAL. The benefit of LINQ is that it allows you to be much more flexible and expressive in your business layer than DAL.GetBusinessEntityById(id), but the close-to-the-metal code which makes both LINQ and the traditional DAL code work are encapsulated away from you, having the same positive effect.
Do you believe we can truly use POCO (plain old CLR objects) when using LINQ to SQL? Without more specific info on your POCO problems regarding LINQ to SQL, it's difficult to say.
would you drop the 'model' facet for larger projects The MVC pattern in general is far more broad than a superficial look at ASP.NET MVC might imply. By definition, whatever you choose to use to connect to your data backing in your application becomes your model. If that is utilizing WCF or MQ to connect to an enterprise data cloud, so be it.

When I looked at Rob Connery's Storefront, it looked like he is using POCOs and Linq to SQL; However, he is doing it by translating from the Linq to SQL created entities to POCO (and back), which seems a little silly to me - Essentially we have a DAL for a DAL.
However, this appears to be the only way to use POCOs with Linq to SQL.
I would say you should use a Repository pattern, let it hide your Linq to SQL layer(or whatever you end up using for data access). That doesn't mean you can't use Linq in the other tiers, just make sure your repository returns IQueryable<T>.

Whether or not you use LINQ-to-SQL, it is often cmomon to use a separate DTO object for things like WCF. I have cited a few thoughts on this subject here: Pragmatic LINQ - but for me, the biggest is: don't expose IQueryable<T> / Expression<...> on the repository interface. If you do, your repository is no longer a black box, and cannot be tested in isolation, since it is at the whim of the caller. Likewise, you can't profile/optimise the DAL in isolation.
A bigger problem is the fact that IQueryable<T> leaks (LOLA). For example, Entity Framework doesn't like Single(), or Take() without an explicit OrderBy() - but L2S is fine with that. L2S should be an implementation detail of the DAL - it shouldn't define the repository.
For similar reasons, I mark the L2S association properties as internal - I can use them in the DAL to create interesting queries, but...

Simple Object to Database Product

I've been taking a look at some different products for .NET which propose to speed up development time by providing a way for business objects to map seamlessly to an automatically generated database. I've never had a problem writing a data access layer, but I'm wondering if this type of product will really save the time it claims. I also worry that I will be giving up too much control over the database and make it harder to track down any data level problems. Do these type of products make it better or worse in the already tough case that the database and business object structure must change?
For example:
Object Relation Mapping from Dev Express
In essence, is it worth it? Will I save "THAT" much time, effort, and future bugs?

I have used SubSonic and EntitySpaces. Once you get the hang of them, I beleive they can save you time, but as complexity of your app and volume of data grow, you may outgrow these tools. You start to lose time trying to figure out if something like a performance issue is related to the ORM or to your code. So, to answer your question, I think it depends. I tend to agree with Eric on this, high volume enterprise apps are not a good place for general purpose ORMs, but in standard fare smaller CRUD type apps, you might see some saved time.

I've found iBatis from the Apache group to be an excellent solution to this problem. My team is currently using iBatis to map all of our calls from Java to our MySQL backend. It's been a huge benefit as it's easy to manage all of our SQL queries and procedures because they're all located in XML files, not in our code. Separating SQL from your code, no matter what the language, is a great help.
Additionally, iBatis allows you to write your own data mappers to map data to and from your objects to the DB. We wanted this flexibility, as opposed to a Hibernate type solution that does everything for you, but also (IMO) limits your ability to perform complex queries.
There is a .NET version of iBatis as well.

I've recently set up ActiveRecord from the Castle Project for an app. It was pretty easy to get going. After creating a new app with it, I even used MyGeneration to script out class files for a legacy app that ActiveRecord could use in a pretty short time. It uses NHibernate to interact with the database, but takes away all the xml mapping that comes with NHibernate. The nice thing is though, if necessary, you already have NHibernate in your project, you can use its full power if you have some special cases. I'd suggest taking a look at it.

There are lots of choices of ORMs. Linq to Sql, nHibernate. For pure object databases there is db4o.
It depends on the application, but for a high volume enterprise application, I would not go this route. You need more control of your data.

I was discussing this with a friend over the weekend and it seems like the gains you make on ease of storage are lost if you need to be able to query the database outside of the application. My understanding is that these databases work by storing your object data in a de-normalized fashion. This makes it fast to retrieve entire sets of objects, but if you need to select data from a perspective that doesn't match your object model, the odbms might have a hard time getting at the particular data you want.

Should I start using LINQ To SQL?

Currently I am using NetTiers to generate my data access layer and service layer. I have been using NetTiers for over 2 years and have found it to be very useful. At some point I need to look at LINQ so my questions are...
Has anyone else gone from NetTiers to LINQ To SQL?
Was this switch over a good or bad thing?
Is there anything that I should be aware of?
Would you recommend this switch?
Basically I would welcome any thoughts
.

No
See #1
You should beware of standard abstraction overhead. Also it's very SQL Server based in it's current state.
Are you using SQL Server, then maybe. If you are using LINQ for other things right now like over XML data (great), Object data, Datasets, then yes you should could switch to have a uniform data syntax for all of them. Like lagerdalek mentioned if it ain't broke don't fix it.
From the quick look at .netTiers Application Framework, I'd say if you already have an investment with that solution it seems to give you much more than a simple Data Access Layer and you should stick with it.
From my experience LINQ to SQL is a good solution for small-medium sized projects. It is an ORM which is a great way to enhance productivity. It also should give you another layer of abstraction that will allow you to change out the layer underneath for something else. The designer in Visual Studio (and I belive VS Express also) is very easy and simple to use. It gives you the common drag-drop and property-based editing of the object mappings.
# Jason Jackson - The Designer does let you add properties by hand, however you need to specify the attributes for that property, but you do this once, it might take 3 minutes longer than the initial dragging of the table into the designer, however it is only necessary once per change in the database itself. This is not too different from other ORMs, however you are correct that they could make this much easier, and find only those properties that have changed, or even implement some kind of refactoring tool for such needs.
Resources:
Why use LINQ to SQL?
Scott Guthrie on LINQ to SQL
10 Tips to Improve your LINQ to SQL Application Performance
LINQ To SQL and Visual Studio 2008 Performance Update
Performance Comparisons LINQ to SQL / ADO / C#
LINQ to SQL 5 Minute Overview
Note that Parallel LINQ is being developed to allow for much greater performance on multi-core machines.

I tried to use Linq to SQL on a small project, thinking that I wanted something I could generate quickly. I ran into a lot of problems in the designer. For example, anytime you need to add a column to a table you basically have to remove and re-add the table definition in the designer. If you have set any properties on the table then you have to re-set those properties. For me this really slowed down the development process.
LINQ to SQL itself is nice. I really like the extensibility. If they can improve the designer I might try it again. I think that the framework would benefit from a little more functionality aimed at a disconnected model like web development.
Check out Scott Guthrie's LINQ to SQL series of blog posts for some great examples of how to use it.

NetTiers is very good for generating a heavy and robust DAL, and we use it internally for core libraries and frameworks.
As I see it, LINQ (in all its incarnations, but specifically as I think you're asking to SQL) is fantastic for quick data access, and we generally use it for more agile cases.
Both technologies are quite inflexible to change without regeneration of the code or dbml layer.
That being said, used properly LINQ 2 SQL is quite a robust solution, and you might even start using it for future development due to it's ease of use, but I wouldn't throw away your current DAL for it - if it aint broke ...

My experience tells me that using by using linq you can get things done faster, however the actual actions to the database are slower.
So... if you have a small database, i'll say go for it. If not, i would wait for some improvements before changing

I'm using LINQ to SQL on fairly large project right now (about 150 tables) and it is working out very well for me. The last ORM I used was IBatis and it worked well but took alot of legwork to get your mappings done. LINQ to SQL performs very well for me and so far has proved to be very easy to use out of the box. There are definately some differences you have to overcome in transition, but I would recommend it's use.
Side note, I have never used or read about NetTiers so I won't discount it's effectiveness, but LINQ to SQL in general has proven to be an extremely viable ORM.

Our team used to use NetTiers and found it to be useful. BUT... the more we used it, the more we found headaches and pain points with it. For example, anytime you make a change to the database, you need to re-generate the DAL with CodeSmith which involved:
re-generating thousands of lines of code in 3 separate projects
re-generating hundreds of stored procedures
Maybe there are other ways of doing it, but this is what we had to do. The re-gen of the source code was ok, scary, but ok. The real issue came with the stored procedures. It didn't clean any unused stored procedures so if you removed a table from your schema and re-gened your DAL, the stored procedures for that table did not get removed. Also, this became quite a headache for database change scripts where we had to compare the old database structure to the new one and create a change script to update client installations. This script could run into the tens of thousands of lines of sql code and if there was an issue executing it, which there invariably was, it was quite a pain to resolve it.
Then the light came on, NHibernate as an ORM. It certainly has a ramp-up time to it but it is well worth it. There is a ton of support for it so if there's something you need done, more than likely it's been done before. It is extremely flexible and allows you to control every aspect of it and then some. It is also becoming easier and easier to use. Fluent Nhibernate is up and coming as a great way to get rid of the xml mapping files that are needed and NHibernate Profiler provides an excellent interface to see what's going on behind the scenes to increase efficiency and remove redundancy.
Moving from NetTiers to NHibernate has been painful, but in a good way. It has forced us to move into a better architecture and re-evaluate functional needs. NetTiers provided tons of data access code, get this entity by its id, get this other entity by its foreign key, get a tlist and vlist of this and that, but most of it was unnecessary and unused. NHibernate with a generic repository and custom repositories only where needed reduced tons of unused code and really increased readability and reliability.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.