I work on a C# client application (SlimTune Profiler) that uses relational (and potentially embedded) database engines as its backing store. The current version already has to deal with SQLite and SQL Server Compact, and I'd like to experiment with support for other systems like MySQL, Firebird, and so on. Worse still, I'd like it to support plugins for any other backing data store -- ideally including ones that aren't SQL-based at all. Topping off the cake, the frontend itself supports plugins, so I have an unknown many-to-many mapping between querying code and engines handling the queries.
Right now, queries are basically handled via raw SQL code. I've already run into trouble making complex SELECTs work in a portable way. The problem can only get worse over time, and that doesn't even consider the idea of supporting non-SQL data. So then, what is the best way to query wildly disparate engines in a sane way?
I've considered something based on LINQ, possibly the DbLinq project. Another option is object persistence frameworks, Subsonic for example. But I'm not too sure what's out there, what the limitations are, or if I'm just hoping for too much.
(An aside, for the inevitable question of why I don't settle on one engine. I like giving the user a choice of the engine that works best for them. SQL Compact allows replication to a full SQL Server instance. SQLite is portable and supports in-memory databases. I can imagine a situation where a company wants to drop in a MySQL plugin so that they can easily store and collate an application's performance data over the course of time. Last and most importantly, I find the idea that I should have to be dependent on the implementation details of my underlying database engine to be absurd.)
Your best bet is to use an interface for all of your database access, then write an implementation of that interface for each database type you want to support. That is what I've had to do on projects in the past.
The problem with many database systems and storage tools is that they aim to solve different problems. You might not even want to store your data in a SQL database but instead store it as files in the App_Data folder of a web application. With an interface method you could do that quite easily.
There generally isn't one solution that fits all databases and storage systems well, or even a few of them well. If you find one that claims it does, I still wouldn't trust it. When you have a problem with one of the databases, it's going to be much easier to dig through your own objects than to dig through theirs.
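As a bare-bones sketch of the idea (the interface, type names, and file layout here are hypothetical):

using System.IO;

// Each backend implements the same contract, so callers never care what's underneath.
public interface IResultStore
{
    void Save(string name, byte[] payload);
    byte[] Load(string name);
}

// A non-SQL implementation that just writes files, e.g. under App_Data.
public class FileResultStore : IResultStore
{
    private readonly string _rootPath;

    public FileResultStore(string rootPath)
    {
        _rootPath = rootPath;
    }

    public void Save(string name, byte[] payload)
    {
        File.WriteAllBytes(Path.Combine(_rootPath, name + ".bin"), payload);
    }

    public byte[] Load(string name)
    {
        return File.ReadAllBytes(Path.Combine(_rootPath, name + ".bin"));
    }
}

A SQLite or SQL Server Compact implementation of the same interface would slot in without touching any calling code.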
Use an object-relational mapper. This will provide a high level of abstraction away from the different database engines, and won't impose (many) limitations on the kind of queries you can run. Many ORMs also include LINQ support. There are numerous questions on SO providing recommendations and comparisons (e.g. What is your favorite ORM for .NET? appears to be the most recent and has links to several others).
I would recommend the repository pattern. You can create a class that encapsulates all the actions you need the database for, and then create a different implementation for each database type you want to support. In many cases, for relational data stores, you can use the ADO.NET abstractions (IDbConnection, IDataReader, IDataAdapter, etc.) to create a single generic repository, and only write specific implementations for the database types that do not provide an ADO.NET driver.
public interface IExecutionResultsRepository
{
    void SaveExecutionResults(string name, ExecutionResults results);
    ExecutionResults GetExecutionResults(int id);
}
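As a rough sketch of the generic ADO.NET variant (ExecutionResults, the table layout, and the Serialize helper are all placeholders -- adapt them to the real schema):

using System.Data.Common;

// One repository written against the provider-neutral ADO.NET abstractions,
// so any engine that ships an ADO.NET driver can plug in.
public class AdoNetExecutionResultsRepository : IExecutionResultsRepository
{
    private readonly DbProviderFactory _factory; // e.g. the SQLite or SQL CE factory
    private readonly string _connectionString;

    public AdoNetExecutionResultsRepository(DbProviderFactory factory, string connectionString)
    {
        _factory = factory;
        _connectionString = connectionString;
    }

    public void SaveExecutionResults(string name, ExecutionResults results)
    {
        using (DbConnection connection = _factory.CreateConnection())
        {
            connection.ConnectionString = _connectionString;
            connection.Open();

            DbCommand command = connection.CreateCommand();
            command.CommandText = "INSERT INTO Results (Name, Payload) VALUES (@name, @payload)";

            DbParameter nameParam = command.CreateParameter();
            nameParam.ParameterName = "@name"; // parameter syntax varies slightly by provider
            nameParam.Value = name;
            command.Parameters.Add(nameParam);

            DbParameter payloadParam = command.CreateParameter();
            payloadParam.ParameterName = "@payload";
            payloadParam.Value = Serialize(results); // placeholder serialization helper
            command.Parameters.Add(payloadParam);

            command.ExecuteNonQuery();
        }
    }

    public ExecutionResults GetExecutionResults(int id)
    {
        // The symmetric SELECT is omitted for brevity.
        throw new System.NotImplementedException();
    }

    private static byte[] Serialize(ExecutionResults results)
    {
        // Placeholder: serialize however suits the actual data.
        throw new System.NotImplementedException();
    }
}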
I don't actually know what you are storing, so you'd have to adapt this for your actual needs. I'm also guessing this would require some heavy refactoring, as you might have SQL statements littered throughout your code, and pulling these out and encapsulating them might not be feasible. But IMO, that's the best way to achieve what you want to do.
Related
Having created a functional test project using NHibernate with all the typical trimmings, I've got a very good handle on how I can leverage NHibernate -- well, at least as much as I can so far. However, I'm wondering how other developers handle scenarios where the application needs to retrieve data from the backend database that may involve complex queries with joins/grouping, say for reporting purposes or other tasks that fall outside the OO paradigm.
Should I take the pure route of using NHibernate to execute these queries and fill the appropriate repository objects regardless of performance, OR should I simply go straight to the database, passing a DataSet back through the business layer?
Ultimately I'm comfortable using NHibernate for interacting with simple and complex business objects, but there are situations where I feel that simply delving into the database makes more sense. Working as the only .NET developer at the moment, I'm very interested in how other devs have handled this situation...
Thanks in advance
NHibernate offers much more than simple querying. The answer depends on what you need. Some examples:
Automatic mapping of data to classes;
Automatic change tracking;
Database independence;
Multiple choices in querying the database (currently 7) with support for easy refactoring;
Integrated caching capabilities;
Much much more.
If any of these properties fit your project, then yes, choose NHibernate.
Be pragmatic about it.
Many "complex" queries can be easily expressed in HQL, but if SQL is a better fit for some, by all means use it.
The reporting scenarios in particular are usually better suited for reporting tools that go directly against the DB.
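To illustrate the first point, a rough sketch of mixing HQL and native SQL in NHibernate (assuming an open ISession; entity and column names are illustrative):

// HQL for queries that map cleanly onto the object model...
IList<Order> bigOrders = session
    .CreateQuery("from Order o where o.Total > :minimum")
    .SetDecimal("minimum", 1000m)
    .List<Order>();

// ...and hand-written SQL where that is the better fit, e.g. for reporting.
IList revenueByCustomer = session
    .CreateSQLQuery("SELECT CustomerId, SUM(Total) AS Revenue FROM Orders GROUP BY CustomerId")
    .List();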
If your reports have relatively simple joins/grouping, then NHibernate will be OK.
If you have complex queries for reporting purposes, NHibernate may not be the best solution. I would recommend SSRS (SQL Server Reporting Services). It will be easier for grouping and other advanced functionality such as exporting to Word/Excel.
It really depends on the complexity of your reports.
I am really having a hard time here. I need to design a desktop app that will use WCF as the communications channel. It's a multi-tiered application (the DB and application server are the same; the client goes through the internet cloud).
The application is a little more complex (in terms of SQL and code logic) than the usual LOB applications, but the concept is the same: read from the DB, update to the DB, handle concurrency, etc. My problem is that now, with Entity Framework out in the open, I can't decide which way to proceed: should I use Entity Framework, DataSets, or custom classes?
As I understand it, Entity Framework will create the object mapping of my DB tables ALONG WITH the CRUD scripts as well. That's all well and good for simple CRUD, but most of the time the SELECT is complex and requires custom SQL. I understand I can use stored procedures in EF (I don't like SPs, by the way -- I don't know why; I like to code my SQL in the DAL by hand, and I feel more secure and comfortable that way).
With DataSets, I would use my custom SQL to populate the data set. With custom classes (objects for DB tables), I would populate those classes (collections and lists, etc.) from my custom SQL. I want to use EF, but I don't feel confident deploying an application whose SQL I have not written and can't see in the code. Am I missing something here?
Any help in this regard would be greatly appreciated.
Xeshu
I would agree with Marc G. 100% - DataSets suck, especially in a WCF scenario (they add a lot of overhead for handling in-memory data manipulation) - don't use those. They're okay for beginners and two-tier desktop apps on a small scale maybe - but I wouldn't use them in a serious, professional app.
Basically, your question boils down to how do you transform your rows from the database into something you can remote across WCF. This means some form of mapping - either you do it yourself, using DataReaders and then shoving all the data into WCF [DataContract] classes - you can certainly do that, gives you the ultimate control, but it's also tedious, cumbersome, and error-prone.
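To make that concrete, a hand-rolled version of the mapping might look like this (the table, columns, and PersonDto type are made up for the example):

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Runtime.Serialization;

[DataContract]
public class PersonDto
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
}

public static class PersonLoader
{
    public static List<PersonDto> LoadPeople(string connectionString)
    {
        var people = new List<PersonDto>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT Id, Name FROM People", connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Every column mapped by hand -- exactly the tedium an ORM removes.
                    people.Add(new PersonDto
                    {
                        Id = reader.GetInt32(0),
                        Name = reader.GetString(1)
                    });
                }
            }
        }
        return people;
    }
}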
Or you let some ready-made ORM handle this grunt work for you - take your pick amongst Linq-to-SQL (great, easy-to-use, flexible, but SQL Server only), EF v4 (out by March 2010 - looks very promising, very flexible) or any other ORM, really - whatever suits your needs best.
Other serious competitors in the ORM space might include Subsonic 3.0 and NHibernate (amongst many many others).
So to sum up:
forget about DataSets
either take 100% control and do the mapping between SQL and your objects yourself
or let some capable ORM handle that (LINQ to SQL, EF v4, Subsonic, NHibernate et al.) -- which one really doesn't matter all that much; it's largely a matter of personal preference and coding style
I can't advocate datasets, especially in an SOA environment like WCF - it'll work, but for mostly the wrong reasons. They simply aren't portable, and IMO don't really "work" over service boundaries. Of course, IMO they don't work in most other scenarios too ;-p
So then it comes down to how much plumbing you want to do. Most ORMs will create WCF-serializable types for you; personally I'd use LINQ to SQL at the moment; it is both simpler and more complete than EF, although EF 4.0 is meant to be much better than the EF in 3.5 SP1. You can use custom T-SQL (via ExecuteQuery, which still does the mapping back to objects), but I tend to use either SPROCs (for complex queries) or LINQ-generated queries (for simple requests).
Writing the types yourself is fine too, and will work with NHibernate etc. So many options.
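As a quick sketch of the ExecuteQuery option mentioned above (assuming a LINQ to SQL DataContext and a Person class whose property names match the selected columns):

using System.Collections.Generic;
using System.Data.Linq;

public static class PeopleQueries
{
    // Hand-written T-SQL, but rows still come back mapped to Person objects.
    // {0} is a query parameter placeholder here, not string formatting.
    public static List<Person> FindPeopleLike(string connectionString, string fragment)
    {
        using (var db = new DataContext(connectionString))
        {
            IEnumerable<Person> people = db.ExecuteQuery<Person>(
                "SELECT Id, Name FROM People WHERE Name LIKE {0}", "%" + fragment + "%");
            return new List<Person>(people);
        }
    }
}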
While EF works with WCF and sounds very promising, you should consider the effort needed to get up to speed with it. Especially when doing non-trivial things, the designer in VS2008 can't open the model anymore and you have to code your model in XML.
Also keep in mind that EF works at a very high level of abstraction. Because of the law of leaky abstractions, it's not as shiny as it's supposed to be :)
The flip side is that when it comes to troubleshooting or performance issues, you have to deal with some very convoluted, hard-to-read SQL statements being sent to your database.
I want to create a desktop application in C#, and I want to use an embedded database like SQLite or Berkeley DB. How can I start benchmarking these databases?
Recently, Oracle added the sqlite3 interface on top of BDB's btree storage, so you should be able to write your code against sqlite3 and then plug in BDB. The catch is licensing: BDB forces you to either pay or go open source; SQLite lets you do whatever you want.
Before thinking about benchmarking, you need to compare the features of the databases.
SQLite and BDB are completely different in the features they support. If the data is complicated, I'd suggest SQLite for easier querying of relational data (if that's how your data is laid out).
I agree with Osama that you should compare the features you're after first.
However, I disagree that "complicated" data should automatically drive you toward SQLite. While I haven't seen any benchmarks (nor cared to write any), I have a gut reaction (whatever that's worth) that says BerkeleyDB is going to outperform nearly every time.
That said, I don't think that's what I'd use to make my own decision. It goes back to those features. If all I want is a simple data store, then I'd probably choose SQLite because it's going to be easier. Likewise, if I want to be able to arbitrarily query my data on any field, or possibly one day store it in an "enterprise" SQL database, I'd likely go with SQLite because future migration will be easier. If, however, I intend to move beyond a simple data store, and am eyeing transactional safety, high concurrency, high availability, having many readers and writers, etc., and I have a set of fairly well-defined "queries", then I probably want BDB.
Notice that "complexity" of my data doesn't really enter into these equations. The reason is simple. BDB can hold my object in it's native serialized format. Sql of any flavor comes with the famous impedence mismatch which, IMO, complicates my application.
If you are seriously considering BDB, I need to warn you that you should decide the type of storage you're going to use up front, as the different types of stores that BDB offers are not necessarily compatible.
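To answer the original "how do I start" question with something concrete, here's a minimal timing harness (a sketch, assuming the System.Data.SQLite ADO.NET provider; run the same workload against each engine behind the same interface and compare):

using System;
using System.Data;
using System.Data.SQLite;
using System.Diagnostics;

class SqliteInsertBenchmark
{
    static void Main()
    {
        using (var connection = new SQLiteConnection("Data Source=bench.db"))
        {
            connection.Open();
            using (var create = new SQLiteCommand(
                "CREATE TABLE IF NOT EXISTS Samples (Id INTEGER PRIMARY KEY, Value TEXT)",
                connection))
            {
                create.ExecuteNonQuery();
            }

            // Time a bulk-insert workload; repeat with reads, updates, etc.
            var stopwatch = Stopwatch.StartNew();
            using (var transaction = connection.BeginTransaction())
            using (var insert = new SQLiteCommand(
                "INSERT INTO Samples (Value) VALUES (@value)", connection))
            {
                var parameter = insert.Parameters.Add("@value", DbType.String);
                for (int i = 0; i < 100000; i++)
                {
                    parameter.Value = "row " + i;
                    insert.ExecuteNonQuery();
                }
                transaction.Commit();
            }
            stopwatch.Stop();

            Console.WriteLine("100k inserts: {0} ms", stopwatch.ElapsedMilliseconds);
        }
    }
}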
(EDIT: I made it a community wiki as it is more suited to a collaborative format.)
There are a plethora of ways to access SQL Server and other databases from .NET. All have their pros and cons and it will never be a simple question of which is "best" - the answer will always be "it depends".
However, I am looking for a comparison at a high level of the different approaches and frameworks in the context of different levels of systems. For example, I would imagine that for a quick-and-dirty Web 2.0 application the answer would be very different from an in-house Enterprise-level CRUD application.
I am aware that there are numerous questions on Stack Overflow dealing with subsets of this question, but I think it would be useful to try to build a summary comparison. I will endeavour to update the question with corrections and clarifications as we go.
So far, this is my understanding at a high level - but I am sure it is wrong...
I am primarily focusing on the Microsoft approaches to keep this focused.
ADO.NET Entity Framework
Database agnostic
Good because it allows swapping backends in and out
Bad because it can hit performance and database vendors are not too happy about it
Seems to be MS's preferred route for the future
Complicated to learn (though, see 267357)
It is accessed through LINQ to Entities so provides ORM, thus allowing abstraction in your code
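For concreteness, the sort of query I mean (a sketch against a hypothetical generated NorthwindEntities context; entity and property names are illustrative):

// The provider translates this LINQ to Entities query into store-specific SQL.
using (var context = new NorthwindEntities())
{
    var cheapProducts = from p in context.Products
                        where p.UnitPrice < 10
                        select p.ProductName;

    foreach (var name in cheapProducts)
        Console.WriteLine(name);
}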
LINQ to SQL
Uncertain future (see Is LINQ to SQL truly dead?)
Easy to learn (?)
Only works with MS SQL Server
See also Pros and cons of LINQ
"Standard" ADO.NET
No ORM
No abstraction so you are back to "roll your own" and play with dynamically generated SQL
Direct access, allows potentially better performance
This ties in to the age-old debate of whether to focus on objects or relational data, to which the answer of course is "it depends on where the bulk of the work is" and since that is an unanswerable question hopefully we don't have to go in to that too much. IMHO, if your application is primarily manipulating large amounts of data, it does not make sense to abstract it too much into objects in the front-end code, you are better off using stored procedures and dynamic SQL to do as much of the work as possible on the back-end. Whereas, if you primarily have user interaction which causes database interaction at the level of tens or hundreds of rows then ORM makes complete sense. So, I guess my argument for good old-fashioned ADO.NET would be in the case where you manipulate and modify large datasets, in which case you will benefit from the direct access to the backend.
Another case, of course, is where you have to access a legacy database that is already guarded by stored procedures.
ASP.NET Data Source Controls
Are these something altogether different or just a layer over standard ADO.NET?
Would you really use these if you had a DAL, or if you implemented LINQ or Entities?
NHibernate
Seems to be a very powerful ORM?
Open source
Some other relevant links;
NHibernate or LINQ to SQL
Entity Framework vs LINQ to SQL
I think LINQ to SQL is good for projects targeted for SQL Server.
ADO.NET Entity Framework is better if we are targeting different databases. Currently I think a lot of providers are available for the ADO.NET Entity Framework: providers for PostgreSQL, MySQL, esql, Oracle, and many others (check http://blogs.msdn.com/adonet/default.aspx).
I don't want to use standard ADO.NET anymore because it's a waste of time. I always go for ORM.
Having worked on 20+ different C#/ASP.NET projects, I always end up using NHibernate. I often start with a completely different stack -- ADO.NET, ActiveRecord, hand-rolled weirdness. There are numerous reasons why NHibernate can work in a wide range of situations, but the absolute standout for me is the time saved, especially when linked to code generation. You can change the data model and the entities get rebuilt, but most/all of the other code doesn't need to be changed.
MS does have a nasty habit of pushing technologies in this area that parallel existing open source, and then dropping them when they don't take off. Does anyone remember ObjectSpaces?
Added for new technologies:
With Microsoft SQL Server for Linux out in beta right now, I think it's OK not to be database agnostic. The .NET Core plus MS-SQL route lets you run on Linux servers like Ubuntu entirely, with no Windows dependencies.
As such, IMO, a very good flow is to skip a full ORM framework and data controls, and instead leverage the power of SSDT Visual Studio projects (SQL Server Data Tools) and a micro-ORM.
In Visual Studio you can create a SQL Server project as a legit Visual Studio project. Doing so allows you to create the entire database via table designers or raw query editing, right inside Visual Studio.
Secondly, you get SSDT's Schema Compare tool, which you can use to compare your database project to a live Microsoft SQL Server database and update either side. You can sync your Visual Studio project with the server, pushing updates in your project out to the server, or sync the server with your project, updating your source code. Via this route you can easily pick up the changes the DBA made in maintenance last night and push out your new development changes for a new feature, all with a simple tool.
Using that same tool, you can compute the migration script without actually running it; if you need to pass that off to an operations department and submit a change order, it works for that flow too.
Now, for writing code against your MS-SQL database, I recommend PetaPoco.
PetaPoco works perfectly in line with the above SSDT solution. It comes with T4 text templates you can use to generate all your data entity classes, and it generates the bulk of the data layer classes for you.
The catch is, you have to write queries yourself, which isn't a bad thing.
So you end up with something like this:
var people = dbContext.Fetch<Person>("SELECT * FROM People WHERE UserName LIKE @0", "%bob%");
PetaPoco automatically handles parameterizing @0 for you; it also has the handy Sql class for building queries.
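The same query via that Sql class might look like this (a sketch; class and column names as above):

// PetaPoco's Sql builder composes the statement and its parameters together.
var sql = PetaPoco.Sql.Builder
    .Append("SELECT * FROM People")
    .Append("WHERE UserName LIKE @0", "%bob%");

var people = dbContext.Fetch<Person>(sql);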
Furthermore, PetaPoco is an order of magnitude faster than EF6 and 8+ times faster than EF7.
So in total, this solution uses SSDT for schema management and PetaPoco for code integration, gaining high maintainability, customization, and very good performance.
The only downfall to this approach is that you're tying yourself hard to Microsoft SQL Server. However, IMO, Microsoft SQL Server is one of the best RDBMSs out there.
It's got Database Mail, jobs, CLR object capabilities, and on and on. Plus, the integration between Visual Studio and MS-SQL Server is phenomenal, and you don't get any of that if you choose a different RDBMS.
I must say that I never used NHibernate because of the immense time needed just to get started -- time wasted on the XML setup.
I recently did a web application in MVC2, where I chose the ADO.NET Entity Framework and used LINQ all the time.
I must say I was impressed with the speed! Our site had around 35,000 unique visitors per day and around 60 GB of bandwidth per day (I radically reduced that 60 GB by hosting all static files on Amazon S3 -- great .NET wrapper they have, I must say).
I will always go this way. It's easy to start (just add a new data item, choose tables, and that's it! For every change in the database we just need to refresh the model -- done automatically in just two clicks) and it's fun to use -- LINQ rules!
I've been taking a look at some different products for .NET which propose to speed up development time by providing a way for business objects to map seamlessly to an automatically generated database. I've never had a problem writing a data access layer, but I'm wondering if this type of product will really save the time it claims. I also worry that I will be giving up too much control over the database and make it harder to track down any data level problems. Do these type of products make it better or worse in the already tough case that the database and business object structure must change?
For example:
Object-Relational Mapping from DevExpress
In essence, is it worth it? Will I save "THAT" much time, effort, and future bugs?
I have used SubSonic and EntitySpaces. Once you get the hang of them, I believe they can save you time, but as the complexity of your app and the volume of data grow, you may outgrow these tools. You start to lose time trying to figure out whether something like a performance issue is related to the ORM or to your code. So, to answer your question, I think it depends. I tend to agree with Eric on this: high-volume enterprise apps are not a good place for general-purpose ORMs, but in standard-fare smaller CRUD-type apps, you might see some saved time.
I've found iBatis from the Apache group to be an excellent solution to this problem. My team is currently using iBatis to map all of our calls from Java to our MySQL backend. It's been a huge benefit as it's easy to manage all of our SQL queries and procedures because they're all located in XML files, not in our code. Separating SQL from your code, no matter what the language, is a great help.
Additionally, iBatis allows you to write your own data mappers to map data to and from your objects to the DB. We wanted this flexibility, as opposed to a Hibernate type solution that does everything for you, but also (IMO) limits your ability to perform complex queries.
There is a .NET version of iBatis as well.
I've recently set up ActiveRecord from the Castle Project for an app. It was pretty easy to get going. After creating a new app with it, I even used MyGeneration to script out class files for a legacy app that ActiveRecord could use in a pretty short time. It uses NHibernate to interact with the database, but takes away all the XML mapping that comes with NHibernate. The nice thing is, though, that you still have NHibernate in your project, so if necessary you can use its full power for special cases. I'd suggest taking a look at it.
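To give a feel for it, a minimal ActiveRecord class might look like this (a sketch; table and property names are illustrative):

using Castle.ActiveRecord;

// Attribute-based mapping replaces NHibernate's hbm.xml files.
[ActiveRecord("People")]
public class Person : ActiveRecordBase<Person>
{
    [PrimaryKey]
    public virtual int Id { get; set; }

    [Property]
    public virtual string Name { get; set; }
}

// Usage (after ActiveRecordStarter initialization, not shown):
//   Person[] everyone = Person.FindAll();
//   new Person { Name = "Bob" }.Save();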
There are lots of ORM choices: LINQ to SQL, NHibernate. For pure object databases there is db4o.
It depends on the application, but for a high volume enterprise application, I would not go this route. You need more control of your data.
I was discussing this with a friend over the weekend, and it seems like the gains you make on ease of storage are lost if you need to be able to query the database outside of the application. My understanding is that these databases work by storing your object data in a de-normalized fashion. This makes it fast to retrieve entire sets of objects, but if you need to select data from a perspective that doesn't match your object model, the ODBMS might have a hard time getting at the particular data you want.