I have a situation where I am using a transaction scope in .NET.
Within it are multiple method calls: the first ones perform database updates, and the last one reads the database.
My question is: will the database reads pick up the changes made by the earlier method calls that update the database? (Note: there are commits in these methods, but nothing is truly committed until the transaction scope completes.)
E.g. using TransactionScope:
{
Method 1 (Insert new comment into database).
Method 2 (Retrieve all comments from database).
complete.
}
Will Method 2's results include the Method 1 insert?
The thing that is confusing me is that I have run loads of tests, and sometimes the update is there, sometimes it's not!
I am aware there are isolation levels (at a high level); is there one that would allow reads of uncommitted data ONLY within the TransactionScope?
Any and all help greatly appreciated.
You can perform any operations you want on the database (MS SQL), and until you call
transaction.Commit()
none of those changes are visible outside the transaction.
Within the transaction, however, they are: even if you insert a NEW record in one transaction, you can read its value back in that same transaction (provided, of course, that you don't Rollback() it).
Yes, this is the purpose of transactions. Think about the situation where you have two tables, and one has a foreign key to the other. In your transaction, you insert into the first and then into the second with a foreign key referencing your first insert, and it works. If the data were not available to you inside the transaction, transactions would be pointless: you would be reduced to one operation at a time, each atomic on its own, which would negate the need for transactions.
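To make the scenario concrete, here is a minimal sketch (not the poster's actual methods; the Comments table, its column and the connection string are assumptions) showing an insert and a read on the same connection inside one TransactionScope. The SELECT sees the uncommitted INSERT because it runs inside the same transaction; a common cause of the intermittent results described above is a connection being opened before the scope is created, so it never enlists in the ambient transaction.
using System;
using System.Data.SqlClient;
using System.Transactions;

class TransactionScopeDemo
{
    static void Main()
    {
        // Illustrative connection string only.
        string connString = "Server=.;Database=MyDb;Integrated Security=true";

        using (var scope = new TransactionScope())
        using (var conn = new SqlConnection(connString))
        {
            conn.Open(); // opening inside the scope enlists in the ambient transaction

            // "Method 1": insert a new comment (uncommitted until scope.Complete()).
            using (var insert = new SqlCommand(
                "INSERT INTO Comments (CommentText) VALUES (@text)", conn))
            {
                insert.Parameters.AddWithValue("@text", "hello");
                insert.ExecuteNonQuery();
            }

            // "Method 2": read comments back; a transaction always sees its own writes.
            using (var select = new SqlCommand("SELECT COUNT(*) FROM Comments", conn))
            {
                int count = (int)select.ExecuteScalar(); // includes the row inserted above
                Console.WriteLine(count);
            }

            scope.Complete(); // nothing is durably committed until this point
        }
    }
}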
Related: Does a transaction lock my table when I'm running multiple queries?
For example: if another user tries to send data at the same time that I am using a transaction, what will happen?
Also, how can I avoid this while still being sure that all the data has been inserted into the database successfully?
BEGIN TRAN;
INSERT INTO Customers (name) VALUES ('name1');
UPDATE CustomerTrans
SET CustomerName = 'name2';
COMMIT;
You have to implement the transaction smartly. Below are some performance-related points (the sketch after this list shows the isolation level and timeout in code):
Locking, optimistic vs. pessimistic. With pessimistic locking, locks are taken up front and can cover many rows or even the whole table; with optimistic locking, only the specific rows being modified are affected and conflicts are detected when the change is saved.
Isolation level, read committed vs. read uncommitted. When the table is locked, it depends on your business scenario: if dirty reads are acceptable, you can read with NOLOCK (read uncommitted).
Use a WHERE clause in your updates and index the tables properly. For any heavy query, check the query plan.
The transaction timeout should be short, so that if the table is locked the call fails quickly with an error, and in the catch block you can retry.
These are a few things you can do.
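A minimal sketch of the isolation-level and timeout points above, assuming a Customers table and an invented connection string; the three-attempt retry is illustrative, not prescriptive.
using System;
using System.Data.SqlClient;
using System.Transactions;

class RetryDemo
{
    static void SaveWithRetry(string connString, string name)
    {
        var options = new TransactionOptions
        {
            IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted, // allow dirty reads, as in the isolation-level point above
            Timeout = TimeSpan.FromSeconds(5)                                     // short transaction timeout, as in the last point above
        };

        for (int attempt = 1; attempt <= 3; attempt++)
        {
            try
            {
                using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
                using (var conn = new SqlConnection(connString))
                {
                    conn.Open();
                    using (var cmd = new SqlCommand("INSERT INTO Customers (name) VALUES (@n)", conn))
                    {
                        cmd.Parameters.AddWithValue("@n", name);
                        cmd.ExecuteNonQuery();
                    }
                    scope.Complete();
                    return; // success
                }
            }
            catch (Exception ex) when (ex is SqlException || ex is TransactionAbortedException)
            {
                if (attempt == 3) throw; // retried enough, give up
            }
        }
    }
}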
You cannot prevent multiple users from loading data into the database, and it is neither feasible nor clever to lock the table every time a single user requests it. Actually, you do not have to worry about it, because the DB itself provides mechanisms to avoid such issues. I would recommend reading up on the ACID properties:
Atomicity
Consistency
Isolation
Durability
What may happen is that you suffer a "ghost" read, which basically means that you cannot read the data another user is inserting until that user commits. Likewise, even if you have finished inserting data but have not committed it, there is a fair chance that others will not see your changes.
DDL operations (creating or dropping objects, etc.) are, in many databases, committed implicitly as soon as they complete. DML operations (UPDATE, INSERT, DELETE, etc.), however, are not committed until the transaction is committed.
I'm looking for a solution to a thorny problem.
My colleagues and I have built an 'engine' (in C#) that performs various processing routines ('elaborations') on a SQL Server database.
Initially, these elaborations were contained in many stored procedures called in series in a nightly batch. It was a system with many flaws.
Now we have extracted every single query from each stored procedure and, strange as it may sound, we have stored the queries in the DB itself.
(Note: the reasons are various and I'm not listing them all; you just need to know that, for business reasons, we do not have the opportunity to make frequent software releases... but we have a lot of freedom with SQL scripts.)
Mainly, the logic behind our engine is:
there are Phases, called sequentially
each Phase contains several Steps, which are grouped into Sets
a Set is a group of Steps that will be executed sequentially
the Sets, unless otherwise specified, run in parallel with each other
a Step that does not belong to any Set is wrapped in its own Set (created at runtime)
a Set may have to wait for the completion of one or more Steps before it starts
each Step corresponds to an atomic (or nearly atomic) SQL query or C# method to run
at startup the engine queries the database and composes the Phases, Steps and Sets (and their configurations)... which are then executed
We have created the engine, we have all the configurations... and everything works.
However, we have a requirement: some phases must run inside a transaction. If even a single step of such a phase fails, we need to roll back the entire phase.
What creates problems is the management of the transaction.
Initially we created a single transaction and connection for the entire phase, but we soon realized that - because of multithreading - this is not thread-safe.
In addition, after several tests, we got exceptions regarding the transaction. Apparently, when a phase contains a LOT of steps (= many database queries), the same transaction cannot execute any further statements.
So we have now changed things so that each step in a phase that requires a transaction opens its own connection and transaction; if all goes well, every step commits (otherwise it rolls back).
It works. However, we have noticed a limitation: the use of temporary tables.
In a transactional phase, when I create a local temporary table (#TempTable1) in a step x, I can't use #TempTable1 in the next step y (SELECT TOP 1 1 FROM #TempTable1).
This is logical: step y runs on a separate connection and transaction, and #TempTable1 is dropped at the end of the session that created it.
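A stripped-down illustration of what the two steps above end up doing; the connection string and the step SQL are invented for the example.
using System.Data.SqlClient;

class TempTableScopeDemo
{
    const string ConnString = "Server=.;Database=MyDb;Integrated Security=true"; // assumption

    // Each step opens its own connection and transaction, exactly as described above.
    static void RunStep(string sql)
    {
        using (var conn = new SqlConnection(ConnString))
        {
            conn.Open();
            using (var tran = conn.BeginTransaction())
            using (var cmd = new SqlCommand(sql, conn, tran))
            {
                cmd.ExecuteNonQuery();
                tran.Commit();
            }
        } // the connection goes back to the pool here and its #temp tables are dropped
    }

    static void Main()
    {
        RunStep("SELECT 1 AS Id INTO #TempTable1;"); // step x: succeeds
        RunStep("SELECT TOP 1 1 FROM #TempTable1;"); // step y: fails with "Invalid object name '#TempTable1'"
    }
}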
We then tried using a global temp table, ##TempTable2, but in step y the SELECT blocks until the timeout expires.
I also tried lowering the transaction isolation level, but that changed nothing.
Now we are in the unfortunate position of having to create real tables instead of using temporary tables.
I'm looking for a compromise between using transactions across a large number of steps and using temporary tables. I believe the crux of the matter is how the transactions are managed. Any suggestions?
I need to update several rows of one of my tables as an atomic operation.
The update concerns incrementing some values in int columns of certain rows. I need to increment values in several rows as a single action.
What would be the best way to do this?
Answering this question, for me, comes down to answering the following two:
If I use LINQ to SQL, how do I achieve atomicity for the increment operation (do I use a transaction, or is there a better way)?
Are stored procedures executed atomically (in case I invoke the procedure on the DB)?
I am working in C# with SQL Server.
In SQL Server, atomicity across different operations is achieved by using explicit transactions: the user explicitly starts a transaction with the keywords BEGIN TRANSACTION, and once all the operations have completed without any errors the transaction is committed with the keywords COMMIT TRANSACTION. In case of an error/exception, you can undo the work done anywhere in the ongoing transaction with the keywords ROLLBACK TRANSACTION.
Write Ahead Strategy
SQL Server uses a write-ahead strategy to ensure the atomicity of transactions and the durability of data. When we make any changes/updates to the data, SQL Server takes the following steps:
Loads data pages into a buffer cache.
Updates the copy in the buffer.
Creates a log record in a log cache.
Saves the log record to disk via the checkpoint process.
Saves the data to disk.
So if, at any point in this process, you decide to ROLLBACK the transaction, your actual data on disk is left unchanged.
My Suggestion
BEGIN TRY
BEGIN TRANSACTION
------ Your Code Here ------
---- IF everything Goes fine (No errors/No Exceptions)
COMMIT TRANSACTION
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION --< this will ROLLBACK any half done operations
-- Your Code here ---------
END CATCH
I found my answer: the increment cannot be done through LINQ to SQL directly. However, stored procedures can be called from LINQ, and the increment can be done there.
My solution was to create a stored procedure that executes the necessary updates within a single WHILE loop inside a transaction. This way all the updates are executed as a single, atomic operation.
The UPDATE statement is atomic by itself.
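For reference, a hedged sketch of the same idea from plain LINQ to SQL: push the increments down to the server as set-based UPDATE statements via ExecuteCommand (each UPDATE is atomic on its own) and wrap them in a TransactionScope when more than one statement must succeed or fail together. The Counters table, its columns and the id parameters are assumptions for illustration.
using System.Transactions;

class IncrementSketch
{
    static void IncrementCounters(System.Data.Linq.DataContext db, int firstId, int secondId)
    {
        using (var scope = new TransactionScope())
        {
            // Each set-based UPDATE is atomic on its own; the scope makes the pair
            // commit or roll back together.
            db.ExecuteCommand(
                "UPDATE Counters SET Hits = Hits + 1 WHERE Id IN ({0}, {1})",
                firstId, secondId);

            db.ExecuteCommand(
                "UPDATE Counters SET LastTouched = GETDATE() WHERE Id = {0}",
                firstId);

            scope.Complete();
        }
    }
}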
What are the rules for how a linq-to-sql datacontext keeps the database connection open?
The question came up when we ran some performance tests comparing one SubmitChanges() call per updated entity against one SubmitChanges() call for the entire batch of entities. Results:
Inserting 3000 items in one SubmitChanges() call... Duration: 1318ms
Inserting 3000 items in one SubmitChanges() call, within a TransactionScope... Duration: 1280ms
Inserting 3000 items in individual SubmitChanges() calls... Duration: 4377ms
Inserting 3000 items in individual SubmitChanges() calls, within a transaction... Duration: 2901ms
Note that when doing an individual SubmitChanges() for each changed entity, putting everything within a transaction improves performance, which was quite unexpected to us. In SQL Server Profiler we can see that the individual SubmitChanges() calls within the transaction do not reset the DB connection for each call, as opposed to the ones without the transaction.
In what cases does the data context keep the connection open? Is there any detailed documentation available on how linq-to-sql handles connections?
You aren't showing the entire picture; LINQ to SQL will wrap a call to SubmitChanges in a transaction by default. If you are wrapping it with another transaction, then you won't see the connection reset; it cannot be reset until all of the SubmitChanges calls are complete and the external transaction is committed.
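For context, this is roughly the shape of the "individual SubmitChanges() calls within a transaction" case being discussed; the entity class, table name and connection string are placeholders, not the original test code.
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Transactions;

// Placeholder entity; the real tests presumably used designer-generated entity classes.
[Table(Name = "Items")]
class Item
{
    [Column(IsPrimaryKey = true, IsDbGenerated = true)]
    public int Id;
    [Column]
    public string Name;
}

class IndividualSubmitTest
{
    static void Run(string connString)
    {
        using (var db = new DataContext(connString))
        using (var scope = new TransactionScope())
        {
            var table = db.GetTable<Item>();
            for (int i = 0; i < 3000; i++)
            {
                table.InsertOnSubmit(new Item { Name = "item " + i });
                db.SubmitChanges(); // each call enlists in the ambient transaction...
            }
            scope.Complete();       // ...so the connection cannot be reset until this commit
        }
    }
}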
There may be a number of factors that could be influencing the timings besides when connections are opened/closed.
edit: I've removed the bit about tracked entities after realizing how linq2sql manages the cached entities and the dirty entities separately.
You can get a good idea of how the connections are managed under the covers by using Reflector or some other disassembler to examine the methods on the SqlConnectionManager class. SubmitChanges will call ClearConnection on its IProvider (typically SqlProvider, which in turn uses SqlConnectionManager) after the submit if it wrapped the submit in its own transaction, but not if the SubmitChanges is part of a larger transaction. When the connection is opened and closed depends on whether there is other activity making use of the SqlConnectionManager.
I messed about with this lately as well. Calling SubmitChanges 3000 times is not a good idea, but depending on how critical it is that each record gets inserted, you may want to do it; after all, it only takes 1000ms.
The transaction scope with multiple SubmitChanges calls is what I'd expect to see. Since you're still within one transaction, I'd expect SQL Server to handle it better, which it seems to do. One SubmitChanges with an explicit/implicit TransactionScope seems to yield the same result, which is to be expected; there shouldn't be much of a performance difference there.
I think connections are created when needed, but you have to remember they are pooled by your provider, so unless your connection string changes you will hit the same connection pool and get the same performance regardless of approach. Since LINQ to SQL uses SqlConnection behind the scenes, some information about it is available at the following:
http://msdn.microsoft.com/en-us/library/8xx3tyca(VS.80).aspx
If you're after brute-force performance, look at moving to a stored procedure for the insert, with an explicit TransactionScope. If that isn't fast enough, look at using SqlBulkCopy; 3000 rows should insert in less than 1000ms.
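A rough sketch of the SqlBulkCopy route, assuming a destination table named Items with a Name column (both invented for the example):
using System.Data;
using System.Data.SqlClient;

class BulkLoad
{
    static void Load(string connString)
    {
        // Build the rows in memory first; a DataTable is the simplest source for SqlBulkCopy.
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < 3000; i++)
            table.Rows.Add("item " + i);

        using (var bulk = new SqlBulkCopy(connString))
        {
            bulk.DestinationTableName = "Items";     // assumed target table
            bulk.ColumnMappings.Add("Name", "Name"); // map source column to destination column
            bulk.BatchSize = 3000;                   // send everything in one batch
            bulk.WriteToServer(table);               // streams the rows as a bulk insert
        }
    }
}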
Have you tried opening and closing the connection yourself:
Force the Opening of the DataContext's Connection (LINQ)
I think in that case you do not need the extra transaction.
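A hedged sketch of that suggestion, reusing the placeholder Item entity from the earlier sketch: the DataContext's connection is opened once and held for the whole batch, so no outer TransactionScope is needed just to keep the connection alive (each SubmitChanges still runs in its own small local transaction).
using System.Collections.Generic;
using System.Data.Linq;

class OpenConnectionSketch
{
    static void InsertWithOpenConnection(DataContext db, IEnumerable<Item> items)
    {
        db.Connection.Open();       // hold one physical connection for the whole batch
        try
        {
            var table = db.GetTable<Item>();
            foreach (var item in items)
            {
                table.InsertOnSubmit(item);
                db.SubmitChanges(); // reuses the already-open connection
            }
        }
        finally
        {
            db.Connection.Close();
        }
    }
}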
I'm doing some work that involves inserting a batch of records into a SQL database. The size of the batch will vary, but for argument's sake we can say 5000 records every 5 seconds. It is likely to be less, though. Multiple processes will be writing to this table; nothing is reading from it.
What I have noticed during a quick test is that using a SqlTransaction around this whole batch insert seems to improve performance.
e.g.
SqlTransaction trans = Connection.BeginTransaction();
myStoredProc.Transaction = trans;
sampleData.ForEach(ExecuteNonQueryAgainstDB);
trans.Commit();
I'm not interested in having the ability to roll back my changes, so I wouldn't really have considered using a transaction except that it seems to improve performance. If I remove this transaction code, my inserts go from taking 300ms to around 800ms!
What is the logic behind this? My understanding is that the transaction still writes the data to the DB but locks the records until it is committed; I would have expected that to add overhead...
What I am looking for is the fastest way to do this insert.
The commit is what costs time. Without your explicit transaction, you have one transaction per query executed. With the explicit transaction, no additional transaction is created for your queries. So, you have one transaction vs. multiple transactions. That's where the performance improvement comes from.
If you are looking for a fast way to insert/load data, have a look at the SqlBulkCopy class.
What you're getting is perfectly normal.
If you're working with a usual isolation level (say, read committed or snapshot), then when you don't use an explicit transaction the database engine has to check for conflicts every time you make an insert. That is, it has to make sure that whenever someone reads from that table (with a SELECT *, for example) they don't get dirty reads; in other words, it has to protect each insertion so that while the insert itself is taking place no one else is reading it.
That means: lock, insert row, unlock; lock, insert row, unlock; and so on.
When you encapsulate all of that in a transaction, what you're effectively achieving is reducing that series of locks and unlocks to just one, at the commit phase.
I've just finished writing a blog post on the performance gains you can get by explicitly specifying where transactions start and finish.
With Dapper I have observed transactions cutting batch insert times down to 1/2 of the original and batch update times down to 1/3 of the original.
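For reference, a minimal sketch of that Dapper pattern (the Items table and Dto class are assumptions): passing the list to Execute runs the INSERT once per element, and the explicit transaction collapses those per-row commits into a single commit.
using System.Collections.Generic;
using System.Data.SqlClient;
using Dapper;

class Dto { public string Name { get; set; } }

class DapperBatch
{
    static void Insert(string connString, IEnumerable<Dto> rows)
    {
        using (var conn = new SqlConnection(connString))
        {
            conn.Open();
            using (var tran = conn.BeginTransaction())
            {
                // Dapper executes the statement once per element of "rows".
                conn.Execute("INSERT INTO Items (Name) VALUES (@Name)", rows, tran);
                tran.Commit(); // one commit for the whole batch
            }
        }
    }
}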