OData queries and disposable SQL connections - C#

When implementing an OData service using Web API 2.0, the standard practice is to expose an IQueryable from your service to the ApiController's action. This way the framework can apply the OData query to your IQueryable.
I'm currently reading, amongst other things, about how important it is to always call Dispose on your database connections, preferably by using the "using" statement like so:
using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    // Execute operations against the database
} // Connection is automatically closed.
My question is: when is the database connection closed in the case of OData? You obviously can't dispose the connection yourself before the framework applies the OData query - this would throw an exception.
A side topic would be: do you agree with exposing IQueryable<> from your services? I've read about this and it's a long-debated issue - some people argue that the database work should be contained in the repository, while others like to give querying freedom to the services' clients. I agree with containing the queries in the repository, but in the case of OData I don't like to over-complicate things; if the framework expects IQueryable, then I give it IQueryable. What do you think?

Usually in such cases the database connection is closed when the ASP.NET controller instance is disposed. Suppose you use an Entity Framework context to run your queries. Then you create that context instance (maybe lazily) when needed, override the Dispose method of ODataController, and dispose of the context there. For example, take a look at this article: http://www.asp.net/web-api/overview/odata-support-in-aspnet-web-api/odata-v4/create-an-odata-v4-endpoint.
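For illustration, a minimal sketch of that pattern - the StoreContext DbContext and Product entity are hypothetical names, not from the question:
using System.Linq;
using System.Web.Http;
using System.Web.OData; // or System.Web.Http.OData in older Web API OData versions

public class ProductsController : ODataController
{
    // Hypothetical EF context; created with the controller, disposed with it.
    private readonly StoreContext _db = new StoreContext();

    [EnableQuery]
    public IQueryable<Product> Get()
    {
        // The framework applies $filter/$orderby/$top to this IQueryable
        // after the action returns, so the context must outlive the action body.
        return _db.Products;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            // Web API disposes the controller once the response has been
            // written, which is when the connection is actually released.
            _db.Dispose();
        }
        base.Dispose(disposing);
    }
}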
I personally never use this approach because I prefer to have more control over the allowed operations. However, I see no harm in allowing read access to certain tables via the IQueryable approach described above in certain cases - when the client has rich filtering possibilities. In such cases you would end up reinventing this approach anyway, because you would write custom filters or accept a lot of parameters in your querying method.

Related

How can I reduce the amount of SQL-querying in my code? [closed]

I'm writing a big C# application that communicates with a MS-SQL Server database.
As the app grows bigger, I find myself writing more and more "boilerplate" code containing various SQL queries in various classes and forms like this:
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Windows.Forms;

public class SomeForm : Form
{
    public void LoadData(int ticketId)
    {
        // some multi-table SQL select and join query
        string sqlQuery = @"
            SELECT TOP(1) [Foo].Id AS FooId, [Foo].Name AS FooName, [Foo].Address AS FooAddress,
                   [Bar].Name AS BarName, [Bar].UnitPrice AS BarPrice,
                   [Bif].Plop
            FROM [dbo].[Foo]
            INNER JOIN [dbo].[Bar]
                ON [Bar].Id = [Foo].BarId
            INNER JOIN [dbo].[Bif]
                ON [Bar].BifId = [Bif].Id
            WHERE [Foo].TicketId = @ticketId";

        SqlCommand sqlCmd = new SqlCommand();
        sqlCmd.CommandText = sqlQuery;
        sqlCmd.Parameters.AddWithValue("@ticketId", ticketId);

        // connection string params etc and connection open/close handled by this call below
        DataTable resultsDataTable = SqlQueryHelper.ExecuteSqlReadCommand(sqlCmd);

        if (resultsDataTable.Rows.Count > 0)
        {
            var row = resultsDataTable.Rows[0];

            // read-out the fields
            int fooId = 0;
            if (!row.IsNull("FooId"))
                fooId = row.Field<int>("FooId");

            string fooName = "";
            if (!row.IsNull("FooName"))
                fooName = row.Field<string>("FooName");

            // read out further fields...

            // display in form
            this.fooNameTextBox.Text = fooName;
            // etc.
        }
    }
}
There are dozens of forms in this project, all doing conceptually the same thing, just with different SQL queries (different columns selected, etc.).
And each time the forms are opened, the database is being continually queried.
For a local DB server the speed is OK but using the app over a slow VPN is painful.
Are there better ways of cutting down the amount of querying the database? Some sort of caching the database in memory and performing the queries on the in-memory data?
I've added some data tables to a data source in my project but can't understand how I can do complex queries like the one stated above.
Is there a better way of doing this?
Thanks for all your suggestions folks!
I think none of the recommendations, like using a framework to access the database, will solve your problem. The problem is not writing SQL or LINQ queries. You always have to dispatch a query to the database, or to a set of data, at some point.
The problem is where you query the database from.
Your statement about writing "code containing various SQL queries in various classes and forms" gives me chills. I recently worked for a company that did exactly the same. As a result, they can't maintain their database anymore. Maintenance is still possible, but only in a rudimentary way: very expensive, time consuming and very frustrating - so nobody likes to do it, therefore nobody does it, and as a result it gets worse and worse. Queries get slower and slower, and the only possible quick fix is to buy more bandwidth for the database server.
The actual queries are scattered all over the project, making it impossible to identify and improve/refactor queries or to improve the table design. But the worst part is that they are not able to switch to a faster or cheaper database model, e.g. a graph-structured database, although there is a burning desire and an urgent need to do so. It makes you look sloppy in front of customers. Absolutely no fun at all to work in such an environment (so I left). Sounds bad?
You really should decouple the database and the SQL from your business code. All this should be hidden behind an interface:
IRepository repository = Factory.GetRepository();
// IRepository exposes query methods, which hide
// the actual query and the database related code details from the repository client or business logic
var customers = repository.GetCustomers();
Spreading this code through your project doesn't hurt. It improves maintainability and readability. It separates data persistence from the actual business logic, since you are hiding/encapsulating all the details, like the actual database, the queries and the query language. If you want to switch to another database, you just have to implement a new IRepository or modify the existing queries. Changing the database won't break the application.
And all queries are implemented in one location/layer. When talking about queries, this includes LINQ queries as well (which is why using a framework like Entity Framework doesn't solve your problem on its own). You can use dependency injection or the Factory pattern to distribute the implementation of IRepository. This even allows you to switch between different databases at runtime without recompiling the application.
Using this pattern (the Repository pattern) also allows you to decouple frameworks like Entity Framework from your business logic.
Take a look at the Repository pattern. If it's not too late, you should start refactoring your existing code. The price is too high if you keep it like it is.
Regarding caching: the database or the DBMS already handles the caching of data very efficiently. What you can do is:
Cache data locally, e.g. when the data won't be changed remotely by some other client - local client settings, for example. This way you can reduce network traffic significantly.
Cache data locally, e.g. when you access a data set frequently and remote changes are unlikely to occur or to have impact. You can update the local data store periodically, or invalidate it after a period of time, to force the client to dispatch a new query and refresh the cache (a minimal caching sketch follows this list).
Filter data locally using LINQ. This may also reduce network traffic, although you may end up reading more data than necessary. Also, filtering is generally done more efficiently by the DBMS.
Consider upgrading the database server or the VPN to increase bandwidth.
Indexing will also improve lookup times significantly.
Consider refactoring all your SQL queries. There are many articles on how to improve the performance of SQL queries. The way you build up a query has a significant impact on performance and execution times, especially with big data.
Use data virtualization. It doesn't make sense to pull thousands of records from the database when you can only show 20 of them to the user. Pull more data as the user scrolls the view. Or even better, display a pre-selected list, e.g. of the most recent items, and allow the user to search for the data of interest. This way you read only the data that the user explicitly asked for. This will improve overall performance drastically, since usually the user is only interested in very few records.
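As a minimal sketch of the local-caching idea above, using System.Runtime.Caching's MemoryCache with absolute expiration - the loader delegate stands in for whatever database call you already have:
using System;
using System.Runtime.Caching;

public class CachedLookupService
{
    private readonly MemoryCache _cache = MemoryCache.Default;

    // Stand-in for the actual database call; injected so the cache is reusable.
    private readonly Func<string, object> _loadFromDatabase;

    public CachedLookupService(Func<string, object> loadFromDatabase)
    {
        _loadFromDatabase = loadFromDatabase;
    }

    public object Get(string key)
    {
        object cached = _cache.Get(key);
        if (cached != null)
            return cached;

        object value = _loadFromDatabase(key);
        // Invalidate after five minutes so remote changes eventually show up.
        _cache.Set(key, value, DateTimeOffset.Now.AddMinutes(5));
        return value;
    }
}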
Before introducing an interface (Dependency Inversion)
The following examples are meant to show that this is a question of architecture or design, rather than a question of frameworks or libraries.
Libraries or frameworks can help on a different level, but they won't solve the problem, which is introduced by spreading environment-specific queries all over the business code. Queries should always be neutral. When looking at the business code, you shouldn't be able to tell whether the data is fetched from a file or a database. These details must be hidden or encapsulated.
When you spread the actual database access code (whether plain SQL or with the help of a framework) throughout your business code, you are not able to write unit tests without a database attached. This is not desired. It makes testing too complicated and the tests will execute unnecessarily slowly. You want to test your business logic and not the database. Those are separate tests. You usually want to mock the database away.
The problem:
you need data from the database in multiple places across the application's model or business logic.
The most intuitive approach is to dispatch a database query whenever and wherever you need the data. This means that when the database is a PostgreSQL database, all the code would of course use PostgreSQL, or some framework like Entity Framework or an ORM in general. If you decide to change the database or DBMS, e.g. to Oracle, or want to use a different framework to manage your entities, you would be forced to touch and rewrite every piece of code that uses PostgreSQL or Entity Framework.
In a big business application, this will be the reason that forces your company to stay with what you have and leaves your team dreaming of a better world. Frustration levels will rise. Maintaining database-related code is nearly impossible, error prone and time consuming. Since the actual database access is not centralized, rewriting the database-related code means crawling through the complete application. The worst case is the spreading of meaningless SQL query strings that nobody understands or remembers. It becomes impossible to move to a new database or to refactor queries to improve performance without sacrificing valuable and expensive time and team resources.
Imagine the following simplified symbolic method repeated in some form across the application's business logic, maybe accessing different entities and using different filters, but using the same query language, framework or library. Let's say we find similar code a thousand times:
private IEnumerable<Customer> GetCustomers()
{
    // Use Entity Framework to manage the database directly
    return DbContext.Customers;
}
We have introduced tight coupling to the framework, as it is woven deep into our business code. The code "knows" how the database is managed. It knows about Entity Framework, since it has to use its classes or API everywhere.
The proof: if you wanted to replace Entity Framework with some other framework, or just wanted to drop it, you would have to refactor the code in a thousand places - everywhere you used this framework in your application.
After introducing an interface (Dependency Inversion) and encapsulating all the database access
Dependency Inversion helps to remove a dependency on concrete classes by introducing interfaces. Since we prefer loose coupling between components and classes, to enhance flexibility, testability and maintainability, we have to wrap the framework-specific or plain-SQL code and hide it behind an interface (the Repository pattern).
Instead of having a thousand places which explicitly use the database framework or SQL or LINQ queries to read, write or filter data, we now introduce interface methods, e.g. GetHighPriorityCustomers and GetAllCustomers. How the data is supplied, or from which kind of database it is fetched, are details that are only known to the implementation of this interface.
Now the application no longer uses any framework or database specific languages directly:
interface IRepository
{
    IEnumerable<Customer> GetHighPriorityCustomers();
    IEnumerable<Customer> GetAllCustomers();
}
The previous thousand places now look something like:
private IRepository Repository { get; } // Initialized e.g. from the constructor

private IEnumerable<Customer> GetCustomers()
{
    // Use a repository hidden behind an interface.
    // In this place (business logic) we don't know how the interface is implemented
    // and what classes it uses. When the implementation changes from Entity Framework
    // to something else, no changes have to be made here (loose coupling).
    return this.Repository.GetAllCustomers();
}
The implementation of IRepository:
class EntityFrameworkRepository : IRepository
{
    public IEnumerable<Customer> GetAllCustomers()
    {
        // If we want to drop Entity Framework, we just have to provide
        // a new implementation of IRepository
        return DbContext.Customers;
    }

    ...
}
Now, you decide to use plain SQL. The only change to make is to implement a new IRepository, instead of changing a thousand places to remove the specialized code:
class MySqlRepository : IRepository
{
    // The caller still accesses this method via the interface: IRepository.GetAllCustomers()
    public IEnumerable<Customer> GetAllCustomers()
    {
        return this.Connection.ExecuteQuery("SELECT * FROM ...");
    }

    ...
}
Now you decide to replace MySQL with Microsoft SQL. All you have to do is to implement a new IRepository.
You can swap in and out any database and change the query language or introduce helper frameworks without affecting your original business logic. Once written, never touched again (at least for the changes regarding the database).
If you move the implementation to a separate assembly, you can even swap them at runtime.
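A sketch of one way to do that, reading the concrete type name from configuration - the "RepositoryType" key and the type names in the comment are assumptions for illustration:
using System;
using System.Configuration;

public static class RepositoryFactory
{
    // app.config decides which assembly/type is used, e.g.
    // <add key="RepositoryType" value="MyApp.Data.MySqlRepository, MyApp.Data" />
    public static IRepository GetRepository()
    {
        string typeName = ConfigurationManager.AppSettings["RepositoryType"];
        Type repositoryType = Type.GetType(typeName, throwOnError: true);
        return (IRepository)Activator.CreateInstance(repositoryType);
    }
}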
There is another solution besides Entity Framework.
For example, you can use The Sharp Factory.
It is a commercial product, but it not only maps database objects like Entity Framework does - it also creates a full repository based on layers.
It is better than EF in my opinion, if you are willing to pay for it.
There are some downsides to Entity Framework. For example, the fact that your repository will leak throughout your layers, because you need to reference Entity Framework in all consumers of your entities. So even if your architecture looks correct, at runtime you can still execute SQL queries from your upper layers, even unknowingly.
I can suggest several things:
Instead of using raw ADO.NET, switch to Entity Framework, where you can use LINQ to SQL / LINQ to Entities (the newer version) and write queries in plain C#, without worrying about hand-written SQL.
Use stored procedures, SQL functions and views, defined in the SQL Server database; their execution plans are cached by SQL Server, which gives you more efficient execution, better security and more maintainability.
To make your queries against the tables more efficient, consider a full-text index over the tables whose data you use most often in filtering operations, like search.
Use the Repository and Unit of Work patterns in your C# code (including the integration with Entity Framework). This does exactly what you want: it collects several SQL queries and sends them to SQL Server at once, instead of sending queries one by one (a minimal Unit of Work sketch follows this list). This will not only drastically improve performance but also keep your code as simple as it can be.
Note: one of the problems with your queries is related not only to their execution but also to opening and closing the SQL database connection each time you need to execute a particular query. This problem is also addressed by the Repository and Unit of Work approach.
Based on your business needs, use in-memory or database caching for data that repeats for a lot of users.
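Here is the promised Unit of Work sketch, assuming Entity Framework 6; AppDbContext and the Customer entity are hypothetical names used only to make it self-contained:
using System;
using System.Data.Entity;

// Hypothetical EF context and entity, just for illustration.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class AppDbContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
}

// Minimal Unit of Work: changes are queued in memory and sent to SQL Server
// in a single SaveChanges call over one connection.
public class UnitOfWork : IDisposable
{
    private readonly AppDbContext _context = new AppDbContext();

    public void AddCustomer(Customer customer)
    {
        _context.Customers.Add(customer); // queued, not yet sent to the server
    }

    public int Commit()
    {
        return _context.SaveChanges(); // one round trip for all queued changes
    }

    public void Dispose()
    {
        _context.Dispose();
    }
}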

What is the best way to make database calls via Generic Handlers in ASP.NET?

I want to make database calls but am wary that the frequent connection and disconnection from the database in each request might cause performance issues.
I wouldn't worry much about that, since the SQL Server ADO.NET provider uses connection pooling by default: Info
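To illustrate, a minimal sketch - the connection string is a placeholder, and the pooling keywords shown are just the defaults made explicit:
using System.Data.SqlClient;

// Pooling is on by default; the keywords are spelled out only for illustration.
string connectionString =
    "Server=.;Database=MyDb;Integrated Security=true;" +
    "Pooling=true;Min Pool Size=1;Max Pool Size=100;";

for (int i = 0; i < 1000; i++)
{
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open(); // normally a cheap checkout from the pool, not a new login
    } // Dispose returns the connection to the pool instead of closing the socket
}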
I would also strongly advise you to have a look at an ORM like Entity Framework, if you haven't already. It will help you with the generic approach, among other useful things.

Architecture for database-aware Application

I'm looking for a reference implementation of the Unit of Work and Repository patterns for MS SQL Server or plain old ADO.NET. But all the samples are built around an existing context like Linq2SQL or EF. As far as I understand, these technologies themselves almost implement these patterns.
But how do I deal with a "plain" SQL repository without any context and SaveChanges() method? Is using TransactionScope the right way? For example, collect all SQL operations in a list of commands and then simply execute them one after another within a TransactionScope... or is this too simple?
Why am I looking for this? I have the task of building a data layer that can deal both with an ancient Sybase database and with SQL Server (maybe additionally in conjunction with a POCO-based EF4 component).
My idea is to create an abstraction layer with the Repository and Unit of Work patterns and to create a different implementation for each technology.
Update:
I was on vacation last week; sorry for the delay. Today I built up a basic picture of my architecture for this: [link] (s7.directupload.net/file/d/2570/whb7ulbs_jpg.htm). My idea is to create a simple object context, similar to the EF ObjectContext, that exists in parallel to the EF context and is used by my repository. This context collects atomic SQL operations in a kind of stack and executes them within the transaction in the Unit of Work part. Good idea? Bad idea? Hard to do? I'm looking forward to your views on this.
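For what it's worth, the idea from the question - collect all SQL operations in a list, then execute them within a TransactionScope - could be sketched like this (PlainSqlUnitOfWork is a hypothetical name, not a reference implementation):
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Transactions;

public class PlainSqlUnitOfWork
{
    private readonly List<SqlCommand> _pending = new List<SqlCommand>();

    public void Enqueue(SqlCommand command)
    {
        _pending.Add(command); // nothing touches the database yet
    }

    public void SaveChanges(string connectionString)
    {
        using (var scope = new TransactionScope())
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open(); // enlists in the ambient transaction
            foreach (var command in _pending)
            {
                command.Connection = connection;
                command.ExecuteNonQuery();
            }
            scope.Complete(); // any exception above rolls everything back
        }
        _pending.Clear();
    }
}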
I don't envy your task; supporting multiple backend databases in your application is going to be tricky.
Here's an example of a Unit Of Work pattern using ASP.NET MVC and LightSpeed: link
Personally, I would use EF or NHibernate (prefer EF); SQL Anywhere supports ADO.NET and Entity Framework, so (ideally) you wouldn't need to do anything special to support that database.
Good luck!
If you are just worried about transaction scope, let me point you towards the System.Transactions library and its TransactionScope class. Great class. Any SQL or other transaction-managed system that is manipulated within the same thread that instantiated the transaction scope will be automatically enlisted in the transaction. That way, if any part of the code fails and throws an exception, you simply don't call the scope.Complete() method, and all the operations within the transaction scope are rolled back. Very nice class.
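A minimal usage sketch; the connection string and the UPDATE statement are placeholders:
using System.Data.SqlClient;
using System.Transactions;

string connectionString = "..."; // placeholder

using (var scope = new TransactionScope())
{
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open(); // enlists automatically in the ambient transaction

        using (var command = new SqlCommand("UPDATE ...", connection))
        {
            command.ExecuteNonQuery();
        }
    }

    // If an exception is thrown before this line, Complete() is never called
    // and everything inside the scope is rolled back when it is disposed.
    scope.Complete();
}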

Custom data provider?

I have a database which does not (yet) provide a C# client library, only a binary TCP or HTTP REST protocol.
I have to write an application that can perform various operations on the DB: CRUD, management, etc.
These operations are expressed in an SQL query, with select/insert/update/delete and custom keywords for DB-specific operations.
I'm wondering what the path to this result is. I can ask the question from two points of view: in an ideal world, and in a practical world.
I'd appreciate any feedback! What is the recommended approach, problems encountered, etc.?
PS: I'm considering these approaches:
writing a custom ADO.NET provider (IDbCommand, IDbConnection, etc.)
writing a custom LINQ provider (which relies on the former)
maybe writing an EF provider
LINQ is just Language Integrated Query. It is syntax for expressing a query, but in the case of database querying it only creates an expression tree, which must be translated to SQL and executed somehow. For that execution an ADO.NET provider is used. The same holds for Entity Framework, which relies on an ADO.NET provider to access the database. So if you want to create an implementation, you should start with the ADO.NET provider anyway.
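A compilable skeleton of where such an ADO.NET provider would start - every server-facing member is stubbed, and "MyDb" is a placeholder name for the database in question:
using System;
using System.Data;
using System.Data.Common;

public class MyDbConnection : DbConnection
{
    private ConnectionState _state = ConnectionState.Closed;

    public override string ConnectionString { get; set; }
    public override string Database { get { return string.Empty; } }
    public override string DataSource { get { return string.Empty; } }
    public override string ServerVersion { get { return "0.1"; } }
    public override ConnectionState State { get { return _state; } }

    public override void Open()
    {
        // TODO: establish the binary TCP or HTTP REST session here.
        _state = ConnectionState.Open;
    }

    public override void Close()
    {
        // TODO: tear the session down.
        _state = ConnectionState.Closed;
    }

    public override void ChangeDatabase(string databaseName)
    {
        throw new NotImplementedException();
    }

    protected override DbTransaction BeginDbTransaction(IsolationLevel isolationLevel)
    {
        throw new NotImplementedException();
    }

    protected override DbCommand CreateDbCommand()
    {
        // TODO: return a MyDbCommand that translates CommandText for the server.
        throw new NotImplementedException();
    }
}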

More efficient database access

I am new to databases and LINQ, so my problem may be considered trivial. I currently start all my db requests in each class with:
DataClassesDataContext db = new DataClassesDataContext();
Then I proceed to make whatever LINQ request I need within the method and carry on with the main application logic.
Now, two interesting queries:
1) I believe I have seen people wrapping db usage within 'using'. Such as:
using (DataClassesDataContext db = new DataClassesDataContext())
{
...
}
If this is correct, then doesn't it mean that my class can't use a member 'db' variable anymore, but rather those db requests need to be made within each function call? Also, what exactly would happen if I don't use 'using' within the calls?
2) Running my app with SQL Profiler enabled, I see lots of connections opening and closing. Does this mean that each DataClassesDataContext call makes a separate connection? It seems inefficient, so is the right way to actually make the DataClassesDataContext object a static within each class being used?
In general, you should use one DataContext per database conversation. Only you can decide exactly what a conversation is, but typically it is a complete request (fetch the user's wish list, or fetch the user's closed orders, for example) that you might think of as a "unit of work."
Typically what happens is something like this:
WishList wishlist;
using (var context = new DataContext(connectionString)) {
    var service = new UserWishListService(context);
    wishlist = service.GetUserWishList();
}
Also, what exactly would happen if I don't use using within the calls?
The DataContext won't be disposed of properly (unless you've wrapped in a try-catch-finally, but generally you should just use using).
Does this means that each DataClassesDataContext call makes a separate connection?
Not quite. Your application will benefit from the SQL Server ADO.NET provider's built-in connection pooling. Don't worry about this, let the provider manage it for you.
It seems inefficient, so is the right way to actually make the DataClassesDataContext object a static within each class being used?
Absolutely not. DataContexts are not thread-safe (in fact, they are explicitly thread-unsafe) and this has "there be dragons" written all over it. Additionally, even in a single-threaded context, a static DataContext is a bad choice, because the DataContext maintains a cache (for object-tracking purposes) of all the entities it has pulled from the database. Over time, the memory consumption will become ginormous.
Since you added the asp.net tag, it means you are using the context within an HTTP call. A static member context is unusable in ASP.NET because you would need to synchronize access to it, and since your data context is required by every call, you could only serve one HTTP response at a time - a scalability fiasco of epic proportions.
This is why data contexts are created and disposed of 'on the go'. In fact, the class documentation clearly calls out this usage pattern:
In general, a DataContext instance is designed to last for one "unit of work" however your application defines that term. A DataContext is lightweight and is not expensive to create. A typical LINQ to SQL application creates DataContext instances at method scope or as a member of short-lived classes that represent a logical set of related database operations.
For ASP.NET, a sensible 'unit of work' context is the HTTP call itself. A longer discussion on this topic can be found at Linq to SQL DataContext Lifetime Management.
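One common way to get a per-request context in classic ASP.NET is a sketch like the following, using HttpContext.Items; the helper class is hypothetical, and DisposeCurrent would be called from Application_EndRequest in Global.asax:
using System.Web;

public static class RequestDataContext
{
    private const string Key = "__dataContext";

    // Lazily creates one DataClassesDataContext per HTTP request.
    public static DataClassesDataContext Current
    {
        get
        {
            var context = (DataClassesDataContext)HttpContext.Current.Items[Key];
            if (context == null)
            {
                context = new DataClassesDataContext();
                HttpContext.Current.Items[Key] = context;
            }
            return context;
        }
    }

    // Call from Application_EndRequest so the context is disposed exactly once.
    public static void DisposeCurrent()
    {
        var context = (DataClassesDataContext)HttpContext.Current.Items[Key];
        if (context != null)
            context.Dispose();
    }
}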
The issue of connections opening and closing is a non-issue. Normally the connections are pooled, and 'opening' is nothing but the reuse of a connection from the pool. If your 'open' is heavyweight (a fully fledged login), then you're using pooling incorrectly. Comparing the Logins/sec and Connection Resets/sec counters will quickly reveal whether that is indeed the case.
