I'm building a data access layer and need to be able to switch between two providers in different environments.
How do I structure this? I'm using a repository pattern and have e.g. a CarRepository class and a Car class. The CarRepository class is responsible for saving, deleting and loading from the database.
I have a Database class, responsible for connecting to the database and executing the query (sending a SqlCommand for SQL Server). The SQL syntax for the underlying databases is different and the parameter syntax is also different (SQL Server uses @ and MySql uses ?).
I would like an approach where I can make the least effort in making my application run on both platforms.
The obvious method is making a MySqlCarRepository and a SqlServerCarRepository, but that introduces a heavy amount of maintenance. Are there any good approaches to this scenario? Maybe keeping a static class with static strings containing the SQL statements for the different SQL flavours? (how about parameter syntax then?)
Any advice is welcome
(Please note that ORM (Nhibernate, Linq2Sql etc) is not an option)
The approach I follow is to first-of-all use the ADO Provider Factories to abstract the data access implementation. So I will use IDbConnection and so forth in the code.
Then I have an abstraction for a query. I can then use Query objects that contain the actual SQL statements. These Query objects are created from RawQuery or various query builders (insert/update/delete/etc.) that have implementations for each provider type. The raw queries themselves still have to be written for each specific DB you need to support, since there is no getting past that.
There is quite a bit of leg work involved in coding this 'plumbing'. I have not had a situation where I actually required different platforms, so I have not bothered coding some small bits that I know need ironing out, but you are welcome to contact me if you are interested in seeing some code.
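As a rough illustration of the idea above (not the answerer's actual code), the provider-specific parts can be pushed behind a small "dialect" object while the rest of the code only talks to the ADO.NET interfaces. `ISqlDialect`, `SqlServerDialect`, `MySqlDialect`, `CommandFactory` and the `Car` table are hypothetical names; the `?` prefix for MySQL simply follows the question's statement and may differ for your connector version.

```csharp
using System;
using System.Data;

// Hypothetical abstraction: everything that differs between providers lives here.
public interface ISqlDialect
{
    // Turns a logical parameter name ("id") into the provider's syntax.
    string Parameter(string name);

    // Each raw query still has to be written once per provider.
    string SelectCarById { get; }
}

public sealed class SqlServerDialect : ISqlDialect
{
    public string Parameter(string name) => "@" + name;

    public string SelectCarById =>
        "SELECT TOP (1) Id, Make FROM Car WHERE Id = " + Parameter("id");
}

public sealed class MySqlDialect : ISqlDialect
{
    // Assumption taken from the question: MySQL parameters are prefixed with '?'.
    public string Parameter(string name) => "?" + name;

    public string SelectCarById =>
        "SELECT Id, Make FROM Car WHERE Id = " + Parameter("id") + " LIMIT 1";
}

public static class CommandFactory
{
    // Uses only provider-neutral ADO.NET interfaces, so it works with any IDbConnection.
    public static IDbCommand Create(
        IDbConnection connection, ISqlDialect dialect, string sql, string parameterName, object value)
    {
        IDbCommand command = connection.CreateCommand();
        command.CommandText = sql;

        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = dialect.Parameter(parameterName);
        parameter.Value = value ?? DBNull.Value;
        command.Parameters.Add(parameter);

        return command;
    }
}
```

A repository would then receive an `IDbConnection` and an `ISqlDialect` (for example through its constructor) and call `CommandFactory.Create(connection, dialect, dialect.SelectCarById, "id", 42)`, so only the dialect classes change between environments.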
Can you use any code generation tools?
I used to use CodeSmith in another life and had templates that would generate POCO objects from DB tables, repository classes for each object, and stored procedures. It worked all right after fine-tuning the templates, and there were plenty of examples on the net.
But this was way before I saw the light with NHibernate!
A pattern for accessing multiple database types is the DAO (Data Access Object) pattern. This could suit your particular need if you can't/don't use an ORM. The following article explains the pattern for Java but it is still very relevant for C#:
http://java.sun.com/blueprints/corej2eepatterns/Patterns/DataAccessObject.html
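A minimal C# sketch of the DAO idea, assuming a `Car` entity like the one in the question. `ICarDao`, `InMemoryCarDao` and `DaoFactory` are illustrative names, not part of any library; the in-memory class merely stands in for the real `SqlServerCarDao` / `MySqlCarDao` implementations you would write.

```csharp
using System;
using System.Collections.Generic;

public sealed class Car
{
    public int Id { get; set; }
    public string Make { get; set; }
}

// The DAO interface is all the rest of the application ever sees.
public interface ICarDao
{
    Car FindById(int id);
    void Save(Car car);
    void Delete(int id);
}

// Stand-in implementation; a real one would issue provider-specific SQL.
public sealed class InMemoryCarDao : ICarDao
{
    private readonly Dictionary<int, Car> cars = new Dictionary<int, Car>();

    public Car FindById(int id) => cars.TryGetValue(id, out Car car) ? car : null;
    public void Save(Car car) { cars[car.Id] = car; }
    public void Delete(int id) { cars.Remove(id); }
}

// Maps a provider name (e.g. read from configuration) to a DAO implementation.
public static class DaoFactory
{
    private static readonly Dictionary<string, Func<ICarDao>> Registry =
        new Dictionary<string, Func<ICarDao>>(StringComparer.OrdinalIgnoreCase);

    public static void Register(string provider, Func<ICarDao> create)
    {
        Registry[provider] = create;
    }

    public static ICarDao CreateCarDao(string provider)
    {
        if (!Registry.TryGetValue(provider, out Func<ICarDao> create))
            throw new NotSupportedException("No DAO registered for provider '" + provider + "'.");
        return create();
    }
}
```

At startup you would register one factory delegate per environment, e.g. `DaoFactory.Register("mysql", () => new MySqlCarDao(connectionString))`, and the calling code only ever asks for `DaoFactory.CreateCarDao(providerName)`.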
I'm writing a big C# application that communicates with a MS-SQL Server database.
As the app grows bigger, I find myself writing more and more "boilerplate" code containing various SQL queries in various classes and forms like this:
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Windows.Forms;

public class SomeForm : Form
{
    public void LoadData(int ticketId)
    {
        // some multi-table SQL select and join query
        string sqlQuery = @"
            SELECT TOP(1) [Foo].Id AS FooId, [Foo].Name AS FooName, [Foo].Address AS FooAddress,
                [Bar].Name AS BarName, [Bar].UnitPrice AS BarPrice,
                [Bif].Plop
            FROM [dbo].[Foo]
            INNER JOIN [dbo].[Bar]
                ON [Bar].Id = [Foo].BarId
            INNER JOIN [dbo].[Bif]
                ON [Bar].BifId = [Bif].Id
            WHERE [Foo].TicketId = @ticketId";
        SqlCommand sqlCmd = new SqlCommand();
        sqlCmd.CommandText = sqlQuery;
        sqlCmd.Parameters.AddWithValue("@ticketId", ticketId);
        // connection string params etc and connection open/close handled by this call below
        DataTable resultsDataTable = SqlQueryHelper.ExecuteSqlReadCommand(sqlCmd);
        if (resultsDataTable.Rows.Count > 0)
        {
            var row = resultsDataTable.Rows[0];
            // read-out the fields
            int fooId = 0;
            if (!row.IsNull("FooId"))
                fooId = row.Field<int>("FooId");
            string fooName = "";
            if (!row.IsNull("FooName"))
                fooName = row.Field<string>("FooName");
            // read out further fields...
            // display in form
            this.fooNameTextBox.Text = fooName;
            // etc.
        }
    }
}
There are dozens of forms in this project all doing conceptually the same thing, just with different SQL queries (different columns selected, etc.)
And each time the forms are opened, the database is being continually queried.
For a local DB server the speed is OK but using the app over a slow VPN is painful.
Are there better ways of cutting down the amount of querying the database? Some sort of caching the database in memory and performing the queries on the in-memory data?
I've added some data tables to a data source in my project but can't understand how I can do complex queries like the one stated above.
Is there a better way of doing this?
Thanks for all your suggestions folks!
I think none of the recommendations, like using a framework to access the database, will solve your problem. The problem is not writing SQL or LINQ queries. You always have to dispatch a query to the database or set of data at some point.
The problem is from where you query the database.
Your statement about writing "code containing various SQL queries in various classes and forms" gives me chills. I recently worked for a company and they did exactly the same. As a result they can't maintain their database anymore. Maintenance is still possible, but only rudimentary and very expensive/time consuming and very frustrating - so nobody likes to do it and therefore nobody does it and as a result it gets worse and worse. Queries are getting slower and slower and the only possible quick fix is to buy more bandwidth for the database server.
The actual queries are scattered all over the projects making it impossible to improve/refactor queries (and identify them) or improve the table design. But the worst is, they are not able to switch to a faster or cheaper database model e.g. a graph structured database, although there is a burning desire and an urgent need to do so. It makes you look sloppy in front of customers. Absolutely no fun at all to work in such an environment (so I left). Sounds bad?
You really should decouple the database and the SQL from your business code. All this should be hidden behind an interface:
IRepository repository = Factory.GetRepository();
// IRepository exposes query methods, which hide
// the actual query and the database related code details from the repository client or business logic
var customers = repository.GetCustomers();
Spreading this code through your project doesn't hurt. It will improve maintainability and readability. It separates the data persistence from the actual business logic, since you are hiding/encapsulating all the details like the actual database, the queries and the query language. If you want to switch to another database you just have to implement a new IRepository that contains the adapted queries. Changing the database won't break the application.
And all queries are implemented in one location/layer. And when talking about queries, this includes LINQ queries as well (that's why using a framework like Entity Framework doesn't solve your problem). You can use dependency injection or the Factory pattern to distribute the implementation of IRepository. This even allows switching between different databases at runtime without recompiling the application.
Using this pattern (Repository Pattern) also allows you to decouple frameworks like Entity Framework from your business logic.
Take a look at the Repository Pattern. If it's not too late you should start to refactor your existing code. The price is too high if you keep it like it is.
Regarding caching: the database or the DBMS already handles the caching of data very efficiently on its side. What you can do on the client side is the following:
Cache data locally if it won't change remotely (i.e. no other client will modify it), e.g. local client settings. This way you can reduce network traffic significantly.
Cache data locally if you access this data set frequently and remote changes are unlikely to occur or would have little impact. You can update the local data store periodically, or invalidate it after a period of time to force the client to dispatch a new query in order to update the cache.
You can also filter data locally using LINQ. This may also reduce network traffic, although you may end up reading more data than necessary. Also, filtering is generally done more efficiently by the DBMS.
You can consider upgrading the database server or VPN to increase bandwidth.
Indexing will also improve lookup times significantly.
You should consider refactoring all your SQL queries. There are many articles on how to improve the performance of SQL queries. The way you build up a query will have a significant impact on performance or execution times, especially with big data.
You can use data virtualization. It doesn't make sense to pull thousands of records from the database when you can only show 20 of them to the user. Pull more data when the user scrolls the view. Or even better, display a pre-selected list, e.g. of the most recent items, and then allow the user to search for the data of interest. This way you read only the data that the user explicitly asked for. This will improve overall performance drastically, since usually the user is only interested in very few records.
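The time-based invalidation mentioned in the caching points above could look roughly like the following. This is only a sketch under assumptions: `ICustomerRepository`, `CachingCustomerRepository` and the `Customer` shape are made-up names, and the cache is deliberately simple (one result set, one expiry time).

```csharp
using System;
using System.Collections.Generic;

public sealed class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public interface ICustomerRepository
{
    IReadOnlyList<Customer> GetAllCustomers();
}

// Wraps any loader (typically the method of a database-backed repository)
// and only calls it again once the cached result has expired.
public sealed class CachingCustomerRepository : ICustomerRepository
{
    private readonly Func<IReadOnlyList<Customer>> load;
    private readonly TimeSpan timeToLive;
    private readonly Func<DateTime> clock;
    private readonly object gate = new object();
    private IReadOnlyList<Customer> cached;
    private DateTime expiresAtUtc = DateTime.MinValue;

    public CachingCustomerRepository(
        Func<IReadOnlyList<Customer>> load, TimeSpan timeToLive, Func<DateTime> clock = null)
    {
        this.load = load ?? throw new ArgumentNullException(nameof(load));
        this.timeToLive = timeToLive;
        this.clock = clock ?? (() => DateTime.UtcNow); // injectable for tests
    }

    public IReadOnlyList<Customer> GetAllCustomers()
    {
        lock (gate)
        {
            DateTime now = clock();
            if (cached == null || now >= expiresAtUtc)
            {
                cached = load();                 // the only place that hits the database
                expiresAtUtc = now + timeToLive;
            }
            return cached;
        }
    }

    // Forces the next call to query the database again.
    public void Invalidate()
    {
        lock (gate) { cached = null; }
    }
}
```

Because the cache implements the same interface as the real repository, forms can be handed `new CachingCustomerRepository(sqlRepository.GetAllCustomers, TimeSpan.FromMinutes(5))` without knowing that caching is involved. Whether five minutes of staleness is acceptable depends entirely on the data.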
Before introducing an interface (Dependency Inversion)
The following examples are meant to show that this is a question of architecture or design rather than a question of frameworks or libraries.
Libraries or frameworks can help on a different level, but won't solve the problem, which is introduced by spreading environment-specific queries all over the business code. Queries should always be neutral. When looking at the business code, you shouldn't be able to tell whether the data is fetched from a file or a database. These details must be hidden or encapsulated.
When you spread the actual database access code (whether directly using plain SQL or with the help of a framework) throughout your business code, you are not able to write unit tests without a database attached. This is not desirable. It makes testing too complicated and the tests will execute unnecessarily slowly. You want to test your business logic and not the database. These are separate tests. You usually want to mock the database away.
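To make the testing argument concrete, here is a small sketch of business logic that depends only on an interface, together with a hand-written fake that replaces the database in a unit test. `CustomerStatistics`, `FakeRepository` and the `IsHighPriority` property are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Customer
{
    public string Name { get; set; }
    public bool IsHighPriority { get; set; }
}

public interface IRepository
{
    IEnumerable<Customer> GetAllCustomers();
}

// Business logic: knows nothing about SQL, Entity Framework or connections.
public sealed class CustomerStatistics
{
    private readonly IRepository repository;

    public CustomerStatistics(IRepository repository)
    {
        this.repository = repository ?? throw new ArgumentNullException(nameof(repository));
    }

    public int CountHighPriority()
    {
        return repository.GetAllCustomers().Count(c => c.IsHighPriority);
    }
}

// Test double: returns canned data instead of querying a database.
public sealed class FakeRepository : IRepository
{
    private readonly List<Customer> customers;

    public FakeRepository(params Customer[] customers)
    {
        this.customers = customers.ToList();
    }

    public IEnumerable<Customer> GetAllCustomers() => customers;
}
```

A unit test can now construct `new CustomerStatistics(new FakeRepository(...))` and assert on `CountHighPriority()` without any database being reachable; a mocking library would serve the same purpose as the hand-written fake.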
The problem:
you need data from the database in multiple places across the application's model or business logic.
The most intuitive approach is to dispatch a database query whenever and wherever you need the data. This means, when the database is a PostgreSQL database, all the code would of course use PostgreSQL or some framework like Entity Framework or an ORM in general. If you decide to change the database or DBMS, e.g. to Oracle, or want to use a different framework to manage your entities, you would be forced to touch and rewrite every piece of code that uses PostgreSQL or Entity Framework.
In a big business application, this will be the reason that forces your company to stay with what you have and leaves your team dreaming of a better world. Frustration levels will rise. Maintaining database-related code is nearly impossible, error-prone and time-consuming. Since the actual database access is not centralized, rewriting the database-related code means crawling through the complete application. The worst case is the spreading of meaningless SQL query strings that nobody understands or remembers. It becomes impossible to move to a new database or to refactor queries to improve performance without sacrificing valuable and expensive time and team resources.
Imagine the following simplified symbolic method is repeated in some form across the application's business logic, maybe accessing different entities and using different filters, but using the same query language, framework or library. Let's say we find similar code a thousand times:
private IEnumerable<Customer> GetCustomers()
{
    // Use Entity Framework to manage the database directly
    return DbContext.Customers;
}
We have introduced a tight coupling to the framework as it is woven deep into our business code. The code "knows" how the database is managed. It knows about Entity Framework as it has to use its classes or API everywhere.
The proof is that if you wanted to replace Entity Framework with some other framework, or just wanted to drop it, you would have to refactor the code in a thousand places - everywhere you used this framework in your application.
After introducing an interface (Dependency Inversion) and encapsulating all the database access
Dependency Inversion will help to remove a dependency on concrete classes by introducing interfaces. Since we prefer loose coupling between components and classes to enhance flexibility, testability and maintainability when using a helper framework or plain SQL dialect, we have to wrap this specific code and hide it behind an interface (Repository Pattern).
Instead of having a thousand places which explicitly use the database framework or SQL or LINQ queries to read, write or filter data, we now introduce interface methods, e.g. GetHighPriorityCustomers and GetAllCustomers. How the data is supplied or from which kind of database it is fetched are details that are only known to the implementation of this interface.
Now the application no longer uses any framework or database specific languages directly:
interface IRepository
{
    IEnumerable<Customer> GetHighPriorityCustomers();
    IEnumerable<Customer> GetAllCustomers();
}
The previous thousand places now look something like:
private IRepository Repository { get; } // Initialized e.g. from constructor

private IEnumerable<Customer> GetCustomers()
{
    // Use a repository hidden behind an interface.
    // We don't know in this place (business logic) how the interface is implemented
    // and what classes it uses. When the implementation changes from Entity Framework
    // to something else, no changes have to be made here (loose coupling).
    return this.Repository.GetAllCustomers();
}
The implementation of IRepository:
class EntityFrameworkRepository : IRepository
{
    // Interface members must be public in the implementing class
    public IEnumerable<Customer> GetAllCustomers()
    {
        // If we want to drop Entity Framework, we just have to provide a new implementation of IRepository
        return DbContext.Customers;
    }
    ...
}
Now you decide to use plain SQL. The only change to make is to implement a new IRepository, instead of changing a thousand places to remove the specialized code:
class MySqlRepository : IRepository
{
    // The caller still accesses this method via the interface IRepository.GetAllCustomers()
    public IEnumerable<Customer> GetAllCustomers()
    {
        return this.Connection.ExecuteQuery("SELECT * FROM ...");
    }
    ...
}
Now you decide to replace MySQL with Microsoft SQL Server. All you have to do is to implement a new IRepository.
You can swap in and out any database and change the query language or introduce helper frameworks without affecting your original business logic. Once written, never touched again (at least for the changes regarding the database).
If you move the implementation to a separate assembly, you can even swap them at runtime.
There is another solution besides Entity Framework.
For example you can use The Sharp Factory.
It is a commercial product but it not only maps database objects like Entity Framework but also creates a full Repository based on layers.
It is better than EF in my opinion if you are willing to pay for it.
There are some downsides to Entity Framework. For example the fact that your Repository will leak throughout your layers.
Because you need to reference Entity Framework on all consumers of your entities... So even if your architecture looks correct, at runtime you can still execute sql queries from your upper layers even unknowingly.
I can suggest several things:
Instead of using ADO.NET directly, switch to Entity Framework, where you can use LINQ to SQL/LINQ to EF (newer version) and write queries in plain C# without worrying about writing SQL.
Use stored procedures, SQL functions and views, written in the SQL Server database. Their execution plans are cached by SQL Server, which gives more efficient execution, and they also help with security and maintainability.
For making your Queries more efficient against database tables, consider using Full-Text Index over the tables which data you use more often in the Filtering operations, like Search.
Use the Repository and Unit of Work patterns in your C# code (including integration with Entity Framework), which will do exactly the thing you want, i.e. collect several SQL queries and send them to SQL Server for execution at once, instead of sending queries one by one. This will not only drastically improve performance but will also keep your code as simple as it can be.
Note: One of the problems with your queries is related not only to their execution but also to opening and closing SQL database connections each time you need to execute a particular query. This problem is solved with the Repository and Unit of Work design patterns approach.
Based on your business needs, use In Memory or Database Caching for the data, which repeats for a lot of users.
I'm looking for a class for Sql Server. I need to make insert, update, delete, select (retrieve many rows and columns) and execute Stored Procedure.
I didn't find a sample of this sort of class and I don't want to reinvent the wheel.
Can somebody point me to one?
You sound like you may be looking for an ORM (Object Relational Mapper). There are a great number available, some built right into the .NET framework itself. Look at the various websites and see if you can find one that fits your needs.
There's not a single class that does this, but instead a set of a few classes you need to know:
Sql Server specific:
System.Data.SqlClient.SqlConnection
System.Data.SqlClient.SqlCommand
System.Data.SqlClient.SqlDataReader
System.Data.SqlClient.SqlDataAdapter
System.Data.SqlClient.SqlParameter
Used by all database types
System.Data.DataTable
System.Data.DataSet
System.Data.DbType (enum; System.Data.SqlDbType is its SQL Server specific counterpart)
There are others as well, but these are the main ones. Together, these make up the ADO.Net API, and the Sql Server provider for the ADO.Net API.
Additionally, there are a number of Object Relational Mappers that build on top of ADO.Net to try to make this easier. Entity Framework, Linq To Sql, and NHibernate are a few of the more common options. One common characteristic of ORMs is that they try to free you from even knowing the sql language. If you want to write your own SELECT/INSERT/UPDATE/DELETE queries, which it sounds like you do, you should start at the native ADO.Net level.
To put your data access in one object, you create your own class that makes use of these other types. Don't try to build a new public method that accepts an sql string. Build individual methods for each query you will want to run that include the needed sql as part of the method, and have those methods use these types to change or return data.
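A minimal sketch of such a class, with one method per query and the SQL kept inside the method. The `Product` type, the `dbo.Product` table and its columns are assumptions made up for the example; on current .NET versions the `Microsoft.Data.SqlClient` package offers the same types as `System.Data.SqlClient`.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public sealed class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public sealed class ProductData
{
    private readonly string connectionString;

    public ProductData(string connectionString)
    {
        this.connectionString = connectionString
            ?? throw new ArgumentNullException(nameof(connectionString));
    }

    // One method per query: callers pass values, never SQL strings.
    public List<Product> GetProductsByName(string namePrefix)
    {
        const string sql = "SELECT Id, Name FROM dbo.Product WHERE Name LIKE @prefix + '%'";
        var products = new List<Product>();

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.Add("@prefix", SqlDbType.NVarChar, 100).Value = namePrefix;
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    products.Add(new Product { Id = reader.GetInt32(0), Name = reader.GetString(1) });
                }
            }
        }
        return products;
    }

    // Returns the number of affected rows.
    public int RenameProduct(int id, string newName)
    {
        const string sql = "UPDATE dbo.Product SET Name = @name WHERE Id = @id";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.Add("@name", SqlDbType.NVarChar, 100).Value = newName;
            command.Parameters.Add("@id", SqlDbType.Int).Value = id;
            connection.Open();
            return command.ExecuteNonQuery();
        }
    }
}
```

The `using` blocks close the connection even when an exception is thrown, and because every value goes through a parameter, no caller can inject SQL through this class.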
You might be interested in this tutorial.
There is builtin functionality (System.Data.SqlClient) to simply access an SQL server.
There is no single class that can do everything you need. Whichever option you choose, you will need to deal with multiple classes.
Look at it this way – in order to get data from SQL Server you typically need to do the following things:
Open connection
Create SQL query
Execute SQL Query
Accept results
Close connection
Putting all this functionality into a single class would make the class way too complex.
Here is a good reading material for what you need.
Beginners guide to accessing SQL Server through C#
I work on a C# client application (SlimTune Profiler) that uses relational (and potentially embedded) database engines as its backing store. The current version already has to deal with SQLite and SQL Server Compact, and I'd like to experiment with support for other systems like MySQL, Firebird, and so on. Worse still, I'd like it to support plugins for any other backing data store -- and not necessarily ones that are SQL based, ideally. Topping off the cake, the frontend itself supports plugins, so I have an unknown many-to-many mapping between querying code and engines handling the queries.
Right now, queries are basically handled via raw SQL code. I've already run into trouble making complex SELECTs work in a portable way. The problem can only get worse over time, and that doesn't even consider the idea of supporting non-SQL data. So then, what is the best way to query wildly disparate engines in a sane way?
I've considered something based on LINQ, possibly the DbLinq project. Another option is object persistence frameworks, Subsonic for example. But I'm not too sure what's out there, what the limitations are, or if I'm just hoping for too much.
(An aside, for the inevitable question of why I don't settle on one engine. I like giving the user a choice of the engine that works best for them. SQL Compact allows replication to a full SQL Server instance. SQLite is portable and supports in-memory databases. I can imagine a situation where a company wants to drop in a MySQL plugin so that they can easily store and collate an application's performance data over the course of time. Last and most importantly, I find the idea that I should have to be dependent on the implementation details of my underlying database engine to be absurd.)
Your best bet is to use an interface for all of your database access. Then, for each database type you want to support, write an implementation of the interface for that database. That is what I've had to do for projects in the past.
The problem with many database systems and storage tools is that they aim to solve different problems. You might not even want to store your data in a SQL database but instead store it as files in the App_Data folder of a web application. With an interface method you could do that quite easily.
There generally isn't a solution that fits all database and storage solutions well or even a few of them well. If you find one that claims it does I still wouldn't trust it. When you have a problem with one of the databases it's going to be much easier for you to dig through your objects than it will be to go dig through theirs.
Use an object-relational mapper. This will provide a high level of abstraction away from the different database engines, and won't impose (many) limitations on the kind of queries you can run. Many ORMs also include LINQ support. There are numerous questions on SO providing recommendations and comparisons (e.g. What is your favorite ORM for .NET? appears to be the most recent and has links to several others).
I would recommend the repository pattern. You can create a class that encapsulates all the actions that you need the database for, and then create a different implementation for each database type you want to support. In many cases, for relational data stores, you can use the ADO.NET abstractions (IDbConnection, IDataReader, IDataAdapter, etc) and create a single generic repository, and only write specific implementations for the database types that do not provide an ADO.NET driver.
public interface IExecutionResultsRepository
{
    void SaveExecutionResults(string name, ExecutionResults results);
    ExecutionResults GetExecutionResults(int id);
}
I don't actually know what you are storing, so you'd have to adapt this for your actual needs. I'm also guessing this would require some heavy refactoring as you might have sql statements littered throughout your code. And pulling these out and encapsulating them might not be feasible. But IMO, that's the best way to achieve what you want to do.
I think having a class such as clsConnection, which we can use to execute every SQL query (select, insert, update, delete, ...), would be pretty good.
But how complete could it be, and how should it be built?
You could use LINQ to SQL as AB Kolan suggested or, if you don't have time for the learning curve, I'd suggest taking a look at the Microsoft Enterprise Library Data Access Application Blocks.
You can use the DAAB (SqlHelper) from the Enterprise Library. This has all the methods/properties necessary for database operations. You don't need to write your own code.
Alternatively you can use an ORM like LINQ to SQL or NHibernate.
It sounds to me like you're just re-writing the ADO.NET SqlConnection (which can already create a SqlCommand for you via CreateCommand()). Or Linq to SQL (or, even, Linq to Entities).
When doing data access I tend to split it into 2 tiers - purely for testability.
Totally separate the logic for getting key values and managing the really low level data collection from the atomic inserts, updates, selects, deletes etc.
This way you can test the logic of the low level data collection very easily without needing to read and write from a database.
This way one layer of classes effectively manages writes to individual tables whilst the other is concerned with getting the data from lookups etc to populate these tables.
The business logic layer that sits on top of these 2 DAL layers obviously manages the actual business logic - this means that the data structure is as separated from the business logic as is realistically possible ... i.e. you could replace the DAL and not feel the pain so much.
the 2 routes you can take that work well are
ADO.Net
This is very powerful as you have total control, but at the same time it is time consuming and feels repetitive. Also it's old school so most people are bored of it, hence all the linq 2 sql comments. With this you open a connection to the DB and then execute a command against it.
Basically you create a class to interface with the database and use this to use stored procedures that are in the database. The lowest level class essentially fires off the command with its parameters and then populates itself with the returned values.
and Linq 2 SQL
This is a cool system. Essentially it makes SPs redundant for 90% of cases in return for allowing strongly typed SQL-esque statements in your code, which save time and are more reliable. I still use 2 DAL layers with this but take advantage of the fact that it will generate the basic class with properties for you and simply add functionality to actually do the atomic operations. The higher level then implements the read and write logic for multiple objects.
The nicest part is that you can generate collections of collections easily with linq 2 sql and then write all the inserts and updates with one command (although in reality you tend to do things separately).
L2S is powerful once you start playing with it whereas generating a collection of objects from ado.net can be a real pain in comparison - especially when you have to do it again and again.
Another alternative is Linq 2 entities
I have had problems with this due to linked servers; it also doesn't like views much, and if your tables don't have PKs or constraints then it doesn't like life much either. I'd stay clear of it for a while.
Of course if you mean that you want a generic class for writing and reading data from a database I think you will be adding complexity rather than solving a problem. Really you can't avoid writing code ;) - each bit of data access is unique, and trying to genericise it past ado.net or l2s is really asking for trouble imo.
Small project:
A singleton class (like DatabaseConnection) might be good for what you're doing.
Large project:
Enterprise Library has some database code; NHibernate or Entity Framework, perhaps.
Your question wasn't specific enough to give a very definitive answer on this.
I am trying to leverage ORM given the following requirements:
1) Using .NET Framework (latest Framework is okay)
2) Must be able to use Sybase, Oracle, MSSQL interchangeably
3) The schema is mostly static, BUT there are dynamic parts.
I am somewhat familiar with SubSonic and NHibernate, but not deeply.
I get the nagging feeling that the ORM can do what I want, but I don't know how to leverage it at the moment.
SubSonic probably isn't optimal, since it doesn't currently support Sybase, and writing my own provider for it is beyond my resources and ability right now.
For #3 (above), there are a couple of metadata tables, which describe tables which the vendors can "staple on" to the existing database.
Let's call these MetaTables, and MetaFields.
There is a base static schema, which the ORM (NHibernate ATM) handles nicely.
However, a vendor can add a table to the database (physically) as long as they also add the data to the metadata tables to describe their structure.
What I'd really like is for me to be able to somehow "feed" the ORM with that metadata (in a way that it understands) and have it at that point allow me to manipulate the data.
My primary goal is to reduce the amount of generic SQL statement building I have to do on these dynamic tables.
I'd also like to avoid having to worry about the differences in SQL being sent to Sybase, Oracle, or MSSQL.
My primary problem is that I don't have a way to let the ORM know about the dynamic tables until runtime, when I'll have access to the metadata.
Edit: An example of the usage might be like the one outlined here:
IDataReader rdr=new Query("DynamicTable1").WHERE("ArbitraryId",2).ExecuteReader();
(However, it doesn't look like SubSonic will work, as there is no Sybase provider; see above.)
According to this blog you can in fact use NHibernate with dynamic mapping. It takes a bit of tweaking though...
We did some of this using NHibernate, however we stopped the project since it didn't provide us with the ROI we wanted. We ended up writing our own ORM/SQL layer which worked very well (worked, since I no longer work there; I'm guessing it still works).
Our system used an open source project to generate the SQL (I don't remember the name any more) and we built all our queries in our own XML-based language (Query Markup Language - QML). We could then build an XmlDocument with selects, wheres, groups etc. and then send that to the SqlEngine that would turn it into a SQL statement and execute it. We discussed, but never implemented, a cache in all of this. That would've allowed us to cache the QMLs for frequently used queries.
I am a little confused as to how the orm would be used then at runtime? If the ORM would dynamically build something at runtime, how does the runtime code know what the orm did dynamically?
"have it at that point allow me to manipulate the data" - What is manipulating the data?
I may be missing something here and I apologize if that's the case. (I have only really used a bottom-up approach with ORMs.)
IDataReader doesn't map anything to an object you know. So your example should be written using classic query builder.
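For the dynamic tables, a small query builder driven by the metadata might be enough. The sketch below is an assumption-laden illustration, not an existing library: `MetaTable`, `BuiltQuery` and `DynamicSelectBuilder` are invented names, the metadata is assumed to have been loaded from the MetaTables/MetaFields tables already, and the parameter prefix is passed in because it differs between providers (for example `@` for SQL Server and `:` for Oracle).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Describes one vendor-added table, as read from the metadata tables.
public sealed class MetaTable
{
    public MetaTable(string name, IEnumerable<string> fields)
    {
        Name = name;
        Fields = fields.ToList();
    }

    public string Name { get; }
    public IReadOnlyList<string> Fields { get; }
}

public sealed class BuiltQuery
{
    public BuiltQuery(string sql, IReadOnlyDictionary<string, object> parameters)
    {
        Sql = sql;
        Parameters = parameters;
    }

    public string Sql { get; }
    public IReadOnlyDictionary<string, object> Parameters { get; }
}

public sealed class DynamicSelectBuilder
{
    private readonly MetaTable table;
    private readonly string parameterPrefix;
    private readonly List<KeyValuePair<string, object>> filters =
        new List<KeyValuePair<string, object>>();

    public DynamicSelectBuilder(MetaTable table, string parameterPrefix = "@")
    {
        this.table = table ?? throw new ArgumentNullException(nameof(table));
        this.parameterPrefix = parameterPrefix;
    }

    public DynamicSelectBuilder Where(string field, object value)
    {
        // Identifiers cannot be parameterized, so only names known to the metadata are accepted.
        string known = table.Fields.FirstOrDefault(
            f => string.Equals(f, field, StringComparison.OrdinalIgnoreCase));
        if (known == null)
            throw new ArgumentException("Unknown field '" + field + "' for table " + table.Name + ".");

        filters.Add(new KeyValuePair<string, object>(known, value));
        return this;
    }

    public BuiltQuery Build()
    {
        var parameters = new Dictionary<string, object>();
        var conditions = new List<string>();
        for (int i = 0; i < filters.Count; i++)
        {
            string parameterName = parameterPrefix + "p" + i;
            conditions.Add(filters[i].Key + " = " + parameterName);
            parameters[parameterName] = filters[i].Value; // values always travel as parameters
        }

        string sql = "SELECT " + string.Join(", ", table.Fields) + " FROM " + table.Name;
        if (conditions.Count > 0)
            sql += " WHERE " + string.Join(" AND ", conditions);

        return new BuiltQuery(sql, parameters);
    }
}
```

With a `MetaTable("DynamicTable1", new[] { "ArbitraryId", "Label" })`, calling `new DynamicSelectBuilder(table).Where("ArbitraryId", 2).Build()` produces the SQL `SELECT ArbitraryId, Label FROM DynamicTable1 WHERE ArbitraryId = @p0` plus a parameter dictionary that can be copied onto any `IDbCommand` before calling `ExecuteReader()`.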
Have you looked into using the ADO.NET Entity Framework?
MSDN: LINQ to Entities
It allows you to map database tables to an object model in such a manner that you can code without thinking about which database vendor is being used, and without worrying about minor variations made by a DBA to the actual tables. The mapping is kept in configuration files that can be modified when the db tables are modified without requiring a recompile.
Also, using LINQ to Entities, you can build queries in an OO manner, so you aren't writing actual SQL query strings.