I'm currently using the method below to get the ID of the last inserted row.
Database.ExecuteNonQuery(query, parameters);
//if another client/connection inserts a record at this time,
//could the line below return the incorrect row ID?
int fileid = Convert.ToInt32(Database.Scalar("SELECT last_insert_rowid()"));
return fileid;
This method has been working fine so far, but I'm not sure it's completely reliable.
Supposing there are two client applications, each with their own separate database connections, that invoke the server-side method above at exactly the same time. Keeping in mind that the client threads run in parallel and that SQLite can only run one operation at a time (or so I've heard), would it be possible for one client instance to return the row ID of a record that was inserted by the other instance?
Lastly, is there a better way of getting the last inserted row ID?
If another client/connection inserts a record at this time, could the line below return the incorrect row ID?
No, since the write will either happen after the read, or before the read, but not during the read.
Keeping in mind that the client threads run in parallel and that SQLite can only run one operation at a time, would it be possible for one client to get the row ID of the record that was inserted by the other client?
Yes, of course.
It doesn't matter that the server-side methods are invoked at exactly the same time. The database's locking allows concurrent reads, though not concurrent writes, or reads while writing.
if you haven't already, have a read over SQLite's file locking model.
If both the insert command and the get last inserted row id command are inside the same write lock and no other insert command in that write lock can run between those two commands, then you're safe.
If you start the write lock after the insert command, than there's no way to be sure that another thread didn't get a write lock first. If another thread did get a write lock first, then you won't be able to execute a search for the row id until after that other thread has released their lock. By then, it could be too late if the other thread inserted a new row.
Related
I am currently working on some kind of ERP like WPF application with SQL Server as the database.
Up to now, I had to work only with small tasks that does not need row locking on the server side. So the basic was "Create SQLConnection-> Select Data in the DataTable -> close connection".
Now I would like to create the functionality to work on orders.
How could I Lock the records that has been selected till the user finishes the work so no other user can read that rows?
I think I should use transactions, but I am not sure how to keep the transaction alive until the next statement, because I am closing the connection after each command.
Locking data like that is a bad practice. A transaction is intended to ensure that your data is completely saved or not at all. They are not intended to lock the data for the reason specified in your question.
It sounds like the data being entered could be a lot so you don't want a user spending time entering data to only be met with an error because someone else changed the data. You could have a locked_by column that you set when a user is editing the data and simply not allow anyone else to edit the data if that column is not NULL. You could still allow reads of the data or exclude locked data from view with queries depending on your need.
You may also want to include a locked_time column so you know when it was locked. You could then clear the lock if it's stale, or at least query how long it's been locked allowing for an admin user to look for lengthy locks so they can contact that user or clear the lock.
The query could look like this:
UPDATE Table SET locked_by = #lockedByUser, locked_time = #lockedTime
WHERE Id = #fetchId and locked_by IS NULL
SELECT * FROM Table WHERE locked_by = #lockedByUser
If no data is returned, the lock failed or the id doesn't exist. Either way, the data isn't available. You could retrieve the records updated count, to also verify if the lock was successful or not.
Don't close the connection
open transaction
on the select use an uplock so record(s) are locked
perform updates
commit or rollback the transaction
Put some type of timer on it.
One way to handle concurrency via application is implement some kind of "LastServerUpdateDateTime" column on the table you are working on.
When User A pulls the data for a row the ViewModel will have that LastServerUpdateDateTime value saved. Your User A does their updates and then try to save back to the DB. If the LastServerUpdateDateTime value is the same, then that means there was no updates while you were working and you are good to save (and LastServerUpdateDateTime is also updated). If at any point while User A is working on a set of data on the application side, and User B comes in makes their changes and saves, then when User A eventually saves the LastServerUpdateDateTime will be different than what they initially pulled down and save will be rejected. Yes User A then has to redo their changes, but it shouldn't happen often (depending on your application of course) and you don't have to deal with direct DB locking or anything like that.
I will describe the mechanism that I have used with success in the past.
1) Create a document ID table. In this table, each record represents a document type and an ID which can be incremented whenever a new document is created. The importance of this table is really as a root lock; the document ID is not strictly needed.
2) Create a lock table. In this table, each record represents a lock which includes a reference to a document record, a reference to the lock owner, and some additional data such as when the lock was created, when it was last acted upon, its status, or anything else you find useful. Each record means "user A holds a lock on document type X, document ID Y".
3) When locking a document (fetch + lock), lock (SELECT/UPDATE) the relevant record in the document ID table. Then, check the lock table for an existing lock record, and INSERT a new one as appropriate. At this point you may choose to over-write an existing lock, or return an error to the user.
4) When updating a document, again lock (SELECT/UPDATE) the relevant record in the document ID table. Then verify the user holds a lock, and if so do the actual update, and then DELETE the lock record. If the user does not hold a lock, you may choose to allow the update if no other user holds a lock, or return an error.
With this mechanism, a user goes through a open/lock operation, and a save/unlock, or discard/unlock operation. Additionally, locks can be removed by a cron job or by an administrator, in case users fail to update or discard (which they will).
This approach avoids holding record locks and transactions open for long periods of time, which can cause concurrency issues. It also allows locks to survive software crashes. It also allows all kinds of flexibility; for example, my implementation allowed a lock to be "demoted" after some period of time, and once a lock was demoted, it could be over-written by an ordinary user, while still allowing the owner to perform an update as long as the lock remained.
Suppose we have the following situation:
We have 2-3 tables in database with a huge amount of data (let it be 50-100mln of records) and we want to add 2k of new records. But before adding them we need to check our db on duplicates. So if this 2k contains records which we have in our DB we should ignore them. But to find out whether new record is a duplicate or not we need info from both tables (for example we need to make left join).
The idea of solution is: one task or thread create a suitable data for comparison and pushes data into queue (by batches, not record by record), so our queue(or concurrentQueue) is a global variable. The second thread gets batch from queue and look it through. But there's a problem - memory is growing...
How can I clean memory after I've surfed through the batch?
P.S. If smb has another idea how to optimize this process - please describe it...
This is not the specific answer to the question you are asking, because what you are asking, doesn't really make sense to me.
if you are looking to update specific rows:
INSERT INTO tablename (UniqueKey,columnname1, columnname2, etc...)
VALUES (UniqueKeyValue,value1,value2, etc....)
ON DUPLICATE KEY
UPDATE columnname1=value1, columnname2=value2, etc...
If not, simply ignore/remove the update statement.
This would be darn fast, considering, it would use the unique index of whatever field you want to be unique, and just do an insert or update. No need to validate in a separate table or anything.
We are creating a client server application using WPF/C# with SQL. Here we are generating a unique number b checking DB(To get the last maximum number) and with that max value, we are increment '1' and storing the value in DB. At this time another user also working on the same screen and creating unique numbers, in some case the the unique numbers gets duplicated and throws exception.
We found this is a concurrency issue.
Indeed, fetching a number out, adding one, and hoping it still isn't in use is a thread-race and a race between multiple clients - and should be avoided.
Options:
use an IDENTITY column in the database, and let the database generate the value itself during INSERT; the database server knows how to do this safely and reliably
if that isn't possible, you might want to delay this code until you are ready to INSERT so it is all part of a single database operation - and even then, if it isn't in a "serializable transaction" (with key-range read locks, etc), then you would have to loop on "get the max, increment, try to insert but note that we might have lost a race, so only insert if the value doesn't exist - which it might; repeat from start if unsuccessful"
alternatively, you could create the new record when you first need the number (even though the rest of the data isn't available), noting that you might still need the "loop until successful" approach
Frankly, the IDENTITY column approach is the simplest.
Finally, We have follwed Singleton pattern with lock to resolver this issue.
Thanks.
I have an Oracle database that I access using Devart and Entity Framework.
There's a table called IMPORTJOBS with a column STATUS.
I also have multiple processes running at the same time. They each read the first row in IMPORTJOBS that has status 'REGISTERED', put it to status 'EXECUTING', and if done put it to status 'EXECUTED'.
Now because these processes are running in parallel, I believe the following could happen:
process A reads row 10 which has status REGISTERED,
process B also reads row 10 which has still status REGISTERED,
process A updates row 10 to status EXECUTING.
Process B should not be able to read row 10 as process A already read it and is going to update its status.
How should I solve this? Put read and update in a transaction? Or should I use some versioning approach or something else?
Thanks!
EDIT: thanks to the accepted answer I got it working and documented it here: http://ludwigstuyck.wordpress.com/2013/02/28/concurrent-reading-and-writing-in-an-oracle-database.
You should use the built-in locking mechanisms of the database. Don't reinvent the wheel, especially since RDBMS are designed to deal with concurrency and consistency.
In Oracle 11g, I suggest you use the SKIP LOCKED feature. For example each process could call a function like this (assuming id are number):
CREATE OR REPLACE TYPE tab_number IS TABLE OF NUMBER;
CREATE OR REPLACE FUNCTION reserve_jobs RETURN tab_number IS
CURSOR c IS
SELECT id FROM IMPORTJOBS WHERE STATUS = 'REGISTERED'
FOR UPDATE SKIP LOCKED;
l_result tab_number := tab_number();
l_id number;
BEGIN
OPEN c;
FOR i IN 1..10 LOOP
FETCH c INTO l_id;
EXIT WHEN c%NOTFOUND;
l_result.extend;
l_result(l_result.size) := l_id;
END LOOP;
CLOSE c;
RETURN l_result;
END;
This will return 10 rows (if possible) that are not locked. These rows will be locked and the sessions will not block each other.
In 10g and before since Oracle returns consistent results, use FOR UPDATE wisely and you should not have the problem that you describe. For instance consider the following SELECT:
SELECT *
FROM IMPORTJOBS
WHERE STATUS = 'REGISTERED'
AND rownum <= 10
FOR UPDATE;
What would happen if all processes reserve their rows with this SELECT? How will that affect your scenario:
Session A gets 10 rows that are not processed.
Session B would get the same 10 rows, is blocked and waits for session A.
Session A updates the selected rows' statuses and commits its transaction.
Oracle will now (automatically) rerun Session B's select from the beginning since the data has been modified and we have specified FOR UPDATE (this clause forces Oracle to get the last version of the block).
This means that session B will get 10 new rows.
So in this scenario, you have no consistency problem. Also, assuming that the transaction to request a row and change its status is fast, the concurrency impact will be light.
Each process can issue a SELECT ... FOR UPDATE to lock the row when they read it. In this scenario, process A will read and lock the row, process B will attempt to read the row and block until process A releases the lock by committing (or rolling back) its transaction. Oracle will then determine whether the row still meets B's criteria and, in your example, won't return the row to B. This works but it means that your multi-threaded process may now be effectively single-threaded depending on how your transaction control needs to work.
Possible ways to improve scalability
A relatively common approach on the consumer to resolving this is to have a single coordinator thread that reads the data from the table, parcels out work to different threads, and updates the table appropriately (including knowing how to re-assign a job if the thread that was assigned it has died).
If you are using Oracle 11.1 or later, you can use the SKIP LOCKED clause on your FOR UPDATE so that each session gets back the first row that meets their criteria and is not locked (the clause existed in earlier versions but was not documented so it may not work correctly).
Rather than using a table for ImportJobs, you can use a queue with multiple consumers. This will allow Oracle to distribute messages to each process without you needing to build any additional locking (Oracle queues are doing it all behind the scenes).
Use versioning and optimistic concurrency.
The IMPORTJOBS table should have a timestamp column that you mark as ConcurrencyMode = Fixed in your model. Now when EF tries to do an update the timestamp column is incorporated in the update statement: WHERE timestamp = xxxxx.
For B, the timestamp changed in the mean time, so a concurrency exception is raised, which, in this case, you handle by skipping the update.
I'm from a SQL server background and I don't know the Oracle equivalent of timestamp (or rowversion), but the idea is that it's a field that auto-updates when an update is made to a record.
I am working on an auction system and one of the issues I am trying to make sure I don't get affected by is a situation where 2 people put in a bid at the exact same time for the same item.
To do this I need to put a lock on the table, get the highest bid for the current item, make sure the entered bid is greater than that bid, add a new bid entry into the table, then unlock the table.
I need to lock this so a second webserver does not trigger a bid insert between when I check for the highest bid and when I insert my new bid into the table, as this would cause data issues.
How do I accomplish this with Linq-to-sql?
Note, I don't know if transactionscopes can do this but I can't use them, as they tend to trigger a distributed transaction due to our webfarm setup, and I can't use distributed transactions.
There seem to be a couple of obstacles implementing a solution in pure Linq:
You should definitely avoid a table lock
A table lock would make it impossible for several items to be bid on during the processing of one single bid, thus severely harming performance
Linq to SQL does not seem to support pessimistic locking
as stated in other answers on SO.
If you cannot have transactions in your code, I suggest the following procedure:
generate a GUID for your operation
pseudo-lock the item's record using the guid:
UPDATE Items SET LockingGuid = #guid
WHERE ItemId = #ItemId and LockingGuid IS NULL
SELECT #recordsaffected = ##ROWCOUNT
the lock succeeded if ##rowcount == 1
perform your bidding operation
UPDATE the record back to LockingGuid = NULL
if the lock fails, either raise the failure to the .Net client, or busy-wait using WAITFOR.
You should implement proper exception handling so that item records do not get locked indefinitely by a dying or failing process, probably by adding a datetime column storing the timestamp the lock occurred, and cleaning up orphaned locks.
If your architecture allows for separate backend operation, you might want to have a look and CQRS and Event Sourcing for processing such bidding operations.
You could use a separate table to store information when this processing occurs. For example, your second table could be something like:
Table name:
ItemProcessing
Columns:
ItemId (int)
ProcessingToken (guid)
When a process wants to check on a current bid, it writes the ID of the item and a token/guid to the ItemProcessing table. That tells other processes that this item is currently being inspected. If there is already a row in the ItemProcessing table for this item, the other process must wait or abort. When the original process is done, it removes the token (sets it to null), or removes the row from ItemProcessing altogether. Then other processes know they can process that item.
Of course, you'll need a way to make sure both processes don't write to this processing table at the same time. You could accomplish that by inserting into this table where ProcessingToken is null. If another table just beat a process to it, the second process won't be able to insert because the ProcessingToken will exist.
While not a full solution, in detail, that's the basic idea.
You can manually begin a transaction and pass that transaction to the DataContext.
http://geekswithblogs.net/robp/archive/2009/04/02/your-own-transactions-with-linq-to-sql.aspx
I think it is necessary as well to manually control the opening and closing of the Connection to avoid an unwanted escalation to a distributed transaction. It seems that the DataContext will actually get in its own way and try to open two connections sometimes, thus causing a promotion to a distributed transaction.