I've been searching for some time now, here and in other places, and can't find a good answer to why LINQ-to-SQL with NOLOCK is not possible.
Every time I search for how to apply the WITH (NOLOCK) hint to a LINQ-to-SQL context (applied to a single SQL statement), people usually answer that you should force a transaction (TransactionScope) with IsolationLevel set to ReadUncommitted. What they rarely mention is that this causes the connection to open a transaction (which, I've also read somewhere, you must make sure is closed manually).
Using ReadUncommitted in my application as-is is really not that good. Right now I've got using statements for contexts on the same connection nested inside each other, like:
using (var ctx1 = new Context()) {
    ... some code here ...
    using (var ctx2 = new Context()) {
        ... some code here ...
        using (var ctx3 = new Context()) {
            ... some code here ...
        }
        ... some code here ...
    }
    ... some code here ...
}
With a total execution time of 1 sec and many users at the same time, changing the isolation level will cause the contexts to wait for each other to release a connection, because all the connections in the connection pool are in use.
So one (of many) reasons for changing to NOLOCK is to avoid deadlocks (right now we have 1 customer deadlock per day). The consequence of the above is just another kind of deadlock, and it really doesn't solve my issue.
So what I know I could do is:
Avoid nested usage of same connection
Increase the connection pool size at the server
But my problem is:
This is not possible in the near future, because it would require refactoring many lines of code, and it would conflict with the architecture (without even starting to discuss whether that is good or bad)
Even though this would of course work, it is what I would call "symptomatic treatment" - as I don't know how much the application will grow, I can't tell whether this is a reliable solution for the future (and I might end up in an even worse situation, with a lot more users being affected)
My thoughts are:
Can it really be true that NOLOCK is not possible (per statement, without starting transactions)?
If 1 is true - can it really be true that no one else has had this problem and solved it with a generic LINQ-to-SQL modification?
If 2 is true - why is this not an issue for others?
Is there another workaround I haven't looked at, maybe?
Is using the same connection (nested) many times such bad practice that no one else has this issue?
1: LINQ-to-SQL does indeed not allow you to specify hints like NOLOCK; it is possible to write your own TSQL, though, and use ExecuteQuery<T> etc (see the sketch at the end of this answer)
2: solving this in an elegant way would be pretty complicated, frankly; and there's a strong chance that you would be using it inappropriately. For example, in the "deadlock" scenario, I would wager that it is actually UPDLOCK that you should be using (during the first read), to ensure that the first read takes a write lock; this prevents a second, later query from getting a read lock, so you generally get blocking instead of a deadlock
3: using the same connection isn't necessarily a big problem (although note that new Context() won't generally share a connection; to share a connection you would use new Context(connection)). If you are seeing this issue, there are three likely solutions (if we exclude "use an ORM with hint support"):
using an explicit transaction (which doesn't have to be TransactionScope - it can be a connection level transaction) to specify the isolation level
write your own TSQL with hints
use a connection-level isolation level (noting the caveat I added as a comment)
IIRC there is also a way to subclass the data-context and override some of the transaction-creation code to control the isolation-level for the transactions that it creates internally.
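For example, a minimal sketch of options 1 and 2 above (Customer and the Customers table are assumed to exist in the model; Context is the data context from the question; this assumes System.Data and System.Linq are referenced):
using (var ctx = new Context())
{
    // option 2: hand-written TSQL with the hint; ExecuteQuery<T> maps the columns back
    var dirtyRead = ctx.ExecuteQuery<Customer>(
        "SELECT Id, Name FROM dbo.Customer WITH (NOLOCK) WHERE Id = {0}", id).ToList();

    // option 1: a connection-level transaction that specifies the isolation level
    ctx.Connection.Open();
    using (var tran = ctx.Connection.BeginTransaction(IsolationLevel.ReadUncommitted))
    {
        ctx.Transaction = tran;
        var rows = ctx.Customers.Where(c => c.Id == id).ToList();
        tran.Commit(); // read-only work, but complete the transaction explicitly
    }
}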
Scenario: my query variable is dynamic; there are 4 possible values for it depending on the report type (_reportType). That means there are 4 different queries, and some of them don't have #STAFF in the WHERE condition. So my question is: is it safe to just leave my
dBCommand.AddParameter("#STAFF", staff)
there, or should I include an if/else condition just to be safe?
Like this
if(_reportType == 1)
{
dBCommand.AddParameter("#STAFF", staff);
}
else if (_reportType == 2)
{
//code
}
else if (_reportType == 3)
{
//code
}
else
{
//Don't add dBCommand.AddParameter("#STAFF", staff);
}
Is it safe to just leave AddParameter("#STAFF", staff) there even though I'm not going to use it in the query?
For example, I'm going to write:
dBCommand.Initialize(string.Format(query, "RetailTable"), batch);
dBCommand.AddParameter("#STAFF", staff);
But the query value doesn't have #STAFF in the WHERE condition.
It should generally be ok to specify unused parameters, aside from the minor overhead of sending the value to the server. The exception is if you execute DDL queries that have a restriction of being the only statement in the batch (e.g. CREATE VIEW). Those would fail due to the parameter.
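For instance, a quick sketch using SqlClient directly (made-up table and column names; the dBCommand wrapper in the question presumably does something equivalent under the covers) - the unreferenced parameter is simply ignored:
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("SELECT * FROM RetailTable WHERE Batch = @BATCH", conn))
{
    cmd.Parameters.AddWithValue("@BATCH", batch);
    cmd.Parameters.AddWithValue("@STAFF", staff); // never referenced in the SQL text, but harmless
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        // read as usual
    }
}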
There are 2 glaring bad practices in your approach:
1. Generating dynamic queries within the code.
This approach has many drawbacks and possible security loopholes. You should almost always avoid doing that.
Please go through the following links to understand this more:
https://codingsight.com/dynamic-sql-vs-stored-procedure/
https://www.ecanarys.com/Blogs/ArticleID/112/SQL-injection-attack-and-prevention-using-stored-procedure
2. Trying to use a generic WHERE clause that fits all your variations.
This approach is a disaster in waiting, regardless of whether the query is written in your application code OR in a stored procedure.
It is an ugly code smell and a maintenance nightmare.
No developer can ever be 100% sure that there will not be any change required during the lifespan of the application, for the simple reason that the client WILL need enhancements on a regular basis.
So, even if this approach works for you for a short period of time, it will blow back.
Assume that, over time, a few more filter parameters are added due to new requirements. Now imagine what your code would look like and the problems you may run into if they are not handled properly, especially when YOU are not the one making those changes. Scary, right?
Always write code that is not only easy to read and understand, but also easy to enhance and maintain, regardless of who is writing the code.
So, IMHO, you should add those if-else conditions OR use switch-case blocks to safeguard yourself and your client. It may look like overkill at the start, but it will surely pay off in the future.
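For instance, a rough sketch along those lines, reusing the dBCommand calls from the question (which report types actually filter on #STAFF is assumed here):
switch (_reportType)
{
    case 1: // this query filters on #STAFF
        dBCommand.Initialize(string.Format(query, "RetailTable"), batch);
        dBCommand.AddParameter("#STAFF", staff);
        break;
    case 2: // these queries don't use #STAFF, so the parameter is simply not added
    case 3:
        dBCommand.Initialize(string.Format(query, "RetailTable"), batch);
        break;
    default:
        throw new ArgumentOutOfRangeException("_reportType");
}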
Hope this helps!
I'd like to ask a question. I've been trying to find some information regarding transactions with multiple connections, but I haven't been able to find any good source of information.
Now for what I'm trying to do. I have code that looks like this:
using (var Connection1 = m_Db.CreateConnection())
using (var Connection2 = m_Db.CreateConnection())
{
    Connection1.DoRead(..., (IDataReader Reader) =>
    {
        // Do stuff
        Connection2.DoWrite(...);
        Connection2.DoRead(..., (IDataReader Reader) =>
        {
            // Do more stuff
            using (var Connection3 = m_Db.CreateConnection())
            {
                Connection3.DoWrite(...);
                Connection3.Commit(); // Is this even right?
            }
        });
    });
    Connection1.DoRead(..., (IDataReader) =>
    {
        // Do yet more stuff
    });
    Connection1.Commit();
    Connection2.Commit();
}
Each CreateConnection creates a new transaction using MySqlConnection::BeginTransaction. The CreateConnection method creates a Connection object which wraps a MySqlConnection. The DoRead function executes some SQL, and disposes the IDataReader when done.
Every Connection will do a Rollback when disposed.
Now for some notes:
I have ONE server with multiple databases.
I am running MySql server with InnoDB databases.
I am doing both reads and writes to these databases.
For performance reasons, and to avoid messing up the database, I am using transactions.
The code is (at least, for now) entirely serial. There are NO concurrent threads. All inserts and queries are done in serial fashion.
I use multiple connections to the database because a read or write is not allowed while another read is in progress (basically the reader object has not yet been disposed).
I basically want every connection to see all changes. So for example, after Connection 3 does some writes, Connection 1 should see those. But the data should be in the transaction and not written to the database (yet).
Now, as for my questions:
Does this work? Will everything ONLY be committed once the last Commit function is called? Should I use another approach?
Is this right? Is my approach completely and utterly wrong and silly?
Any drawbacks? Especially regarding performance.
Thanks.
Welp, it seems no one knows. But that's okay.
For now, I just went with the method of using a single connection and reading all the results into a list first, then closing the reader, thereby avoiding the problem of having to use multiple connections.
Might there be performance problems? Maybe, but it's better than having to deal with uncertainty and deadlocks.
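For reference, a rough sketch of that approach using MySqlConnection/MySqlCommand directly rather than the wrapper above (table and column names are invented):
using (var conn = new MySqlConnection(connectionString))
{
    conn.Open();
    using (var tran = conn.BeginTransaction())
    {
        // read everything first, so the reader is disposed before any writes happen
        var ids = new List<long>();
        using (var cmd = new MySqlCommand("SELECT id FROM things", conn, tran))
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                ids.Add(reader.GetInt64(0));
        }

        // now the connection is free for writes on the same transaction
        foreach (var id in ids)
        {
            using (var cmd = new MySqlCommand(
                "UPDATE things SET processed = 1 WHERE id = @id", conn, tran))
            {
                cmd.Parameters.AddWithValue("@id", id);
                cmd.ExecuteNonQuery();
            }
        }

        tran.Commit(); // everything commits atomically on the single connection
    }
}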
My hosting company blocked my website for using more than 15 concurrent database connections. But in my code I closed each and every connection that I opened. Still, they say there are too many concurrent connections, and they suggested that I change the source code of my website. So please tell me the solution to this. Also, my website is dynamic; would making it a static, plain old HTML site make a difference or not?
Also note that, when I couldn't think of any other solution, I tried adding con.Close() before every con.Open(), so that any other connection that had been opened would be closed.
The first thing to do is to check when you open connections - see if you can minimise that. For example, are you doing "n+1" queries on different connections?
If you have a single server, the technical solution here is a semaphore - for example, something like:
someSemaphore.WaitOne();
try {
    using (var conn = GetConnection()) {
        ...
    }
} finally {
    someSemaphore.Release();
}
which will (assuming someSemaphore is shared, for example static) ensure that you can only get into that block "n" times at once. In your case, you would create the semaphore with 15 spaces:
static readonly Semaphore someSemaphore = new Semaphore(15,15);
However! Caution is recommended: in some cases you could get a deadlock: imagine 2 poorly written threads that each need 9 connections - thread A takes 7 and thread B takes 8. They both need more - and neither will ever get them. Thus, using WaitOne with a timeout is important:
static void TakeConnection() {
    if (!someSemaphore.WaitOne(3000)) {
        throw new TimeoutException("Unable to reserve connection");
    }
}
static void ReleaseConnection() {
    someSemaphore.Release();
}
...
TakeConnection();
try {
    using (var conn = GetConnection()) {
        ...
    }
} finally {
    ReleaseConnection();
}
It would also be possible to wrap that up in IDisposable to make usage more convenient.
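For example, a minimal sketch of such a wrapper (ConnectionSlot is an invented name; it reuses the someSemaphore shown above):
sealed class ConnectionSlot : IDisposable
{
    static readonly Semaphore someSemaphore = new Semaphore(15, 15);

    private ConnectionSlot() { }

    public static ConnectionSlot Take()
    {
        // same timeout logic as TakeConnection above
        if (!someSemaphore.WaitOne(3000))
            throw new TimeoutException("Unable to reserve connection");
        return new ConnectionSlot();
    }

    public void Dispose()
    {
        // the slot is returned when the using block ends
        someSemaphore.Release();
    }
}
// usage:
using (ConnectionSlot.Take())
using (var conn = GetConnection())
{
    ...
}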
Change hosting company.
Seriously.
Unless you run a pathetic little home blog.
You can easily have more than 15 pages / requests being handled at the same time. I am always wary of "runaway connections", but I would not consider 15 connections to even be worth mentioning. This is like a car rental company complaining that you drive more than 15 km - it is simply a REALLY low limit.
On a busy website you can have 50, 100, even 200 open connections just because you have that many requests at the same time.
This is something not so obvious, but even if you take care to open and close your connections properly, there is one thing in particular you have to look out for.
If you make even the smallest change to the text you use to build a connection string, .NET will create a whole new connection pool instead of using the one already opened (even if the connection uses MARS). So, just in case, check your code for places where you build connection strings on the fly instead of using a single one from your web.config.
I believe SQL connections are pooled. When you close one, you actually just return it to the connection pool.
You can use SqlConnection.ClearPool(connection) or SqlConnection.ClearAllPools to actually close the connection, but it will affect the performance of your site.
Also, you can disable pooling by using connection string parameter Pooling=false.
There is also Max Pool Size (default 100); you may want to set it to a lower number.
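For example (illustrative values only; the exact keywords depend on your provider):
var connectionString =
    "Server=myServer;Database=myDb;User Id=myUser;Password=...;Max Pool Size=15;";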
This all might work, but I would also suggest that you switch providers...
If you only fetch data from the database, then it is not very difficult to create some sort of cache. But if there is full CRUD, then the better solution is to change hosting provider.
I noticed some delay in loading content while using transactions to edit that content
(testing this situation is a bit hard for me, as I don't know of a good way to test it).
I have some doubts about how transactions should be used:
there are some minor issues and things I should understand about transactions,
and these parts are related to this question:
When should we use transactions in a CMS we have written ourselves?
My case-specific notes:
Should I use transactions in any CMS, given that we have sprocs for Insert, Update, Retrieve, ...?
Are transactions only necessary when we are working with more than one table?
The transaction strategy I used:
Adding Product method (which uses the add-product sproc):
TransactionOptions txOptions = new TransactionOptions();
using (TransactionScope txScope = new TransactionScope
    (TransactionScopeOption.Required, txOptions))
{
    try
    {
        connection.Open();
        command.ExecuteNonQuery();
        LastInserted = (int)pInsertedID.Value;
        txScope.Complete();
    }
    catch (Exception ex)
    {
        logErrors.Warn(ex.Message);
    }
    finally
    {
        command.Dispose();
        connection.Close();
    }
}
Transactions may help to ensure consistency of the database. For example, if a stored procedure used to add a product inserts data in more than one table, and something fails along the way, a transaction might be helpful to rollback the whole operation, thus the database is free of half-baked products (e.g. lacking some critical info in related tables).
Transaction scopes (TransactionScope) are used to provide an ambient implicit transaction for whatever code runs inside a code block. These scopes may help to severely simplify the code, however, they also may add complexities in multithreaded environments (unfortunately, I don't know quite a lot about such cases).
Therefore, the code you provided would probably make sense for ensuring the database's consistency, especially if the command touches more than one table. It may add some performance overhead; however, you would be better off relying on gathered profiling data rather than on any sort of feeling before conducting optimizations (i.e. try to gather some quantitative data on how much slower things are under transactions). Modern database engines usually handle transactions quite efficiently; in my own experience, I have never had to remove a transaction because of its performance overhead.
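To make the consistency point concrete, here is a rough illustration (invented table names, plain SqlClient, System.Transactions) of two inserts under one TransactionScope: if the second command throws, the first insert is rolled back as well, because Complete() is never reached.
using (var txScope = new TransactionScope())
using (var connection = new SqlConnection(connectionString))
{
    connection.Open(); // opened inside the scope, so it enlists in the ambient transaction

    using (var cmd = new SqlCommand("INSERT INTO Product (Name) VALUES (@name)", connection))
    {
        cmd.Parameters.AddWithValue("@name", name);
        cmd.ExecuteNonQuery();
    }

    using (var cmd = new SqlCommand(
        "INSERT INTO ProductDetail (ProductId, Info) VALUES (@id, @info)", connection))
    {
        cmd.Parameters.AddWithValue("@id", productId);
        cmd.Parameters.AddWithValue("@info", info);
        cmd.ExecuteNonQuery(); // if this throws, no half-baked product is left behind
    }

    txScope.Complete(); // only now does the work become permanent
}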
Lately, in apps I've been developing, I have been checking the number of rows affected by an insert, update, or delete to the database and logging an error if the number is unexpected. For example, on a simple insert, update, or delete of one row, if any number of rows other than one is returned from an ExecuteNonQuery() call, I will consider that an error and log it. Also, I realize now as I type this that I do not even try to roll back the transaction if that happens, which is not best practice and should definitely be addressed. Anyway, here's code to illustrate what I mean:
I'll have a data layer function that makes the call to the db:
public static int DLInsert(Person person)
{
    Database db = DatabaseFactory.CreateDatabase("dbConnString");
    using (DbCommand dbCommand = db.GetStoredProcCommand("dbo.Insert_Person"))
    {
        db.AddInParameter(dbCommand, "#FirstName", DbType.String, person.FirstName);
        db.AddInParameter(dbCommand, "#LastName", DbType.String, person.LastName);
        db.AddInParameter(dbCommand, "#Address", DbType.String, person.Address);
        return db.ExecuteNonQuery(dbCommand);
    }
}
Then a business layer call to the data layer function:
public static bool BLInsert(Person person)
{
    if (DLInsert(person) != 1)
    {
        // log exception
        return false;
    }
    return true;
}
And in the code-behind or view (I do both webforms and mvc projects):
if (BLInsert(person))
{
// carry on as normal with whatever other code after successful insert
}
else
{
// throw an exception that directs the user to one of my custom error pages
}
The more I use this type of code, the more I feel like it is overkill, especially in the code-behind/view. Is there any legitimate reason to think a simple insert, update, or delete wouldn't actually modify the correct number of rows in the database? Is it more plausible to only worry about catching an actual SqlException and handling that, instead of doing the monotonous check of rows affected every time?
Thanks. Hope you all can help me out.
UPDATE
Thanks everyone for taking the time to answer. I still haven't 100% decided what setup I will use going forward, but here's what I have taken away from all of your responses.
Trust the DB and .Net libraries to handle a query and do their job as they were designed to do.
Use transactions in my stored procedures to roll back the query on any errors, and potentially use RAISERROR to throw those errors back to the .NET code as a SqlException, which I could handle with a try/catch. This approach would replace the problematic return-code checking.
Would there be any issue with the second bullet point that I am missing?
I guess the question becomes, "Why are you checking this?" If it's just because you don't trust the database to perform the query, then it's probably overkill. However, there could exist a logical reason to perform this check.
For example, I worked at a company once where this method was employed to check for concurrency errors. When a record was fetched from the database to be edited in the application, it would come with a LastModified timestamp. The standard CRUD operations in the data access layer would then include a WHERE LastModified = #LastModified clause when doing an UPDATE and check the record-modified count. If no record was updated, it would assume a concurrency error had occurred.
I felt it was kind of sloppy for concurrency checking (especially the part about assuming the nature of the error), but it got the job done for the business.
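A compact sketch of that pattern (table, column, and property names are invented; SqlClient is used directly):
const string sql = @"UPDATE Person
                     SET FirstName = @FirstName, LastModified = GETUTCDATE()
                     WHERE Id = @Id AND LastModified = @LastModified";
using (var cmd = new SqlCommand(sql, conn))
{
    cmd.Parameters.AddWithValue("@FirstName", person.FirstName);
    cmd.Parameters.AddWithValue("@Id", person.Id);
    cmd.Parameters.AddWithValue("@LastModified", person.LastModified);
    if (cmd.ExecuteNonQuery() == 0)
        throw new DBConcurrencyException("The record was modified by another user.");
}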
What concerns me more in your example is the structure of how this is being accomplished. The 1 or 0 being returned from the data access code is a "magic number." That should be avoided. It's leaking an implementation detail from the data access code into the business logic code. If you do want to keep using this check, I'd recommend moving the check into the data access code and throwing an exception if it fails. In general, return codes should be avoided.
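For example, a sketch (reusing the Enterprise Library helpers from the question; the exception wording is hypothetical) of moving the check into the data layer and throwing instead of returning a magic number:
public static void DLInsert(Person person)
{
    Database db = DatabaseFactory.CreateDatabase("dbConnString");
    using (DbCommand dbCommand = db.GetStoredProcCommand("dbo.Insert_Person"))
    {
        db.AddInParameter(dbCommand, "#FirstName", DbType.String, person.FirstName);
        db.AddInParameter(dbCommand, "#LastName", DbType.String, person.LastName);
        db.AddInParameter(dbCommand, "#Address", DbType.String, person.Address);

        int rows = db.ExecuteNonQuery(dbCommand);
        if (rows != 1)
        {
            // callers no longer need to interpret a return code
            throw new DataException(
                string.Format("dbo.Insert_Person affected {0} row(s); expected 1.", rows));
        }
    }
}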
Edit: I just noticed a potentially harmful bug in your code as well, related to my last point above. What if more than one record is changed? It probably won't happen on an INSERT, but could easily happen on an UPDATE. Other parts of the code might assume that != 1 means no record was changed. That could make debugging very problematic :)
On the one hand, most of the time everything should behave the way you expect, and on those times the additional checks don't add anything to your application. On the other hand, if something does go wrong, not knowing about it means that the problem may become quite large before you notice it. In my opinion, the little bit of additional protection is worth the little bit of extra effort, especially if you implement a rollback on failure. It's kinda like an airbag in your car... it doesn't really serve a purpose if you never crash, but if you do it could save your life.
I've always preferred to RAISERROR in my sprocs and handle exceptions rather than counting. This way, if you update a sproc to do something else, like logging/auditing, you don't have to worry about keeping the row counts in check.
Though if you like the second check in your code, or would prefer not to deal with exceptions/RAISERROR, I've seen teams return 0 on successful sproc execution for every sproc in the db, and another number otherwise.
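On the .NET side, a RAISERROR with severity 16 in the sproc surfaces as a SqlException, so the calling code can be as simple as this sketch (reusing names from the question's snippets):
try
{
    db.ExecuteNonQuery(dbCommand);
}
catch (SqlException ex)
{
    logErrors.Warn(ex.Message); // or map it to a friendlier error for the UI
    throw;
}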
It is absolutely overkill. You should trust that your core platform (.NET libraries, SQL Server) works correctly; you shouldn't be worrying about that.
Now, there are some related instances where you might want to test, like if transactions are correctly rolled back, etc.
If there is a need for that check, why not do it within the database itself? You save yourself a round trip, and it's done at a more 'centralized' stage - if you check in the database, you can be assured the check is applied consistently by any application that hits that database; whereas if you put the logic in the UI, you need to make sure every UI application that hits that particular database applies the correct logic, and does so consistently.