In my application I have an InsertData HTTP method that inserts multiple rows into the DB (Postgres). While inserting, I take the MAX of the column "ColIndex" and increment it by 1; the increment is applied once per row in the batch we received.
Note: Column "ColIndex" is not a primary key.
My problem is: when two users call the API at the same time to insert different data, both inserts succeed, but User1 and User2 read the same MAX value from column "ColIndex". How do I make User2's request wait until User1's has finished?
Environment: .NET 6 & Postgres
How can I handle this concurrent API call scenario without hurting performance?
You have to have some form of concurrency control, which can be either pessimistic (taking a lock of some kind) or optimistic (lockless, but with some way to resolve conflicts).
One way to do this without taking a lock is to put a unique constraint on the ColIndex column. Something like:
CREATE TABLE YourTableName(..., ColIndex Integer, UNIQUE (ColIndex));
In your API code, implement a retry loop. Something like
const int MaxTries = 5;
for (int i = 1; ; i++)
{
    try
    {
        // Attempt your DB call, which includes inserting what might
        // or might not be the correct Max(ColIndex) + 1.
        break; // success, stop retrying
    }
    catch (PostgresException e)
        when (e.SqlState == PostgresErrorCodes.UniqueViolation)
    {
        if (i == MaxTries) throw;
        await Task.Delay(i * 100);
    }
}
How long you delay between tries is a whole other topic; for now I suggest a delay that grows with the attempt number and includes a random component (jitter), so colliding callers don't retry in lockstep.
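A minimal sketch of such a delay, assuming .NET 6's Random.Shared; the helper name is hypothetical:
// Hypothetical helper: linear backoff plus random jitter, so two callers that
// collided once are unlikely to retry at exactly the same moment again.
static Task BackoffAsync(int attempt) =>
    Task.Delay(attempt * 100 + Random.Shared.Next(0, 100));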
@tymtam's way (the next answer) is even easier, though both my answer and theirs require permission to modify the DB schema.
You could make ColIndex an auto-increment column:
CREATE SEQUENCE a_colIndex_seq;
CREATE TABLE A (
name VARCHAR PRIMARY KEY,
colIndex integer NOT NULL DEFAULT nextval('a_colIndex_seq')
);
ALTER SEQUENCE a_colIndex_seq OWNED BY A.colIndex;
Then:
INSERT INTO A(Name) VALUES ('Ashok');
INSERT INTO A(Name) VALUES ('Kohsa');
SELECT * FROM A
Result:
name  | colindex
------+---------
Ashok | 1
Kohsa | 2
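If the application needs the generated value back, Postgres can return it from the INSERT. A minimal Npgsql sketch (the connection string is a placeholder; the table is the A defined above):
// Sketch: insert a row and read the database-generated colIndex back via RETURNING.
await using var conn = new NpgsqlConnection("Host=localhost;Database=mydb");
await conn.OpenAsync();
await using var cmd = new NpgsqlCommand(
    "INSERT INTO A(name) VALUES (@name) RETURNING colIndex", conn);
cmd.Parameters.AddWithValue("name", "Ashok");
object? result = await cmd.ExecuteScalarAsync();
int colIndex = (int)result!;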
I have the following scenario:
Employee record is created
NextPayrollNumber is read from record in database table (settings table)
Number is incremented by 1 and added as PayrollNumber to Employee record, as well as overwriting current NextPayrollNumber
Employee Record is saved to database (employee table)
I need to ensure that two Employee records won't have the same number. From some searching, it looks like concurrency issues are usually handled with a concurrency token and concurrency-exception handling in the DbContext. But that involves adding another column to the settings table to store a rowversion, and adding code to the DbContext that would only be used for this one requirement and that the rest of the application doesn't need.
Are there other approaches to handling this? I would have liked to add a unique constraint to the database table (but the column will have null values) or to use a sequence, but the value needs to be based on the NextPayrollNumber, which is configurable by an end-user.
Given that the payroll # is not the PK for the row (which uses a built-in identity) and you don't want to derive the payroll number from that identity, my suggestion is: rather than trying to prevent two inserts happening together, put a unique constraint on the payroll # and handle the rare exception with a retry. Populate the payroll # and save the record as quickly as possible. If you hit a duplicate-key exception (which should be rare), handle it by fetching a new payroll # from your settings (making sure you reload the settings entity from the DB, not a cached row) and save again, retrying if necessary. If the next # comes back the same as what you already had, then you have a bigger problem with the insert and can bail out with an exception message.
var settings = _context.Settings.Single(); // However this is loaded, whether single row or row per Tenant...

var customer = new Customer
{
    // populate values...
    PayrollNumber = settings.NextPayrollNumber++
};
_context.Customers.Add(customer); // assuming a Customers DbSet on the context

int retryCount = 0;
bool complete = false;
while (retryCount < 5 && !complete)
{
    try
    {
        _context.SaveChanges();
        complete = true;
    }
    catch (SqlException ex)
    {
        var constraintErrorNumbers = new[] { 547, 2601, 2627 }; // SQL Server error numbers for constraint violations.
        if (constraintErrorNumbers.Contains(ex.Number))
        {
            _context.Entry(settings).Reload(); // Refresh the settings from the DB.
            int currentPayrollNumber = customer.PayrollNumber;
            customer.PayrollNumber = settings.NextPayrollNumber++;
            if (customer.PayrollNumber == currentPayrollNumber)
                throw; // The sequence hasn't changed, so it wasn't the payroll number that was duplicated.
            retryCount++;
        }
        else
            throw;
    }
}
You will most likely need to catch something like EF's DbUpdateException rather than SqlException and inspect the InnerException, which should be the SqlException.
This should update the NextPayrollNumber in the settings with a successful save of the new customer.
Typically I wouldn't recommend keeping a sequence in a table for something like a payroll number. Instead, generate or select a "should be unique" value (a random number, snowflake, hash, etc.) to build a new payroll number, then validate it. The same retry logic still applies for the rare case of a duplication, but you're no longer coordinating all inserts around one sequence row.
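For illustration only, a hedged sketch of such a generated candidate (the format and range are made up; uniqueness still comes from the constraint plus the retry loop above):
// Hypothetical: a random candidate payroll number; the unique constraint and
// the same retry loop still handle the (now much rarer) collision.
static string NewPayrollNumberCandidate() =>
    $"PR-{Random.Shared.Next(100_000, 1_000_000)}";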
I have a method that needs to "claim" a payment number to ensure it is available at a later time. I cannot just get a new payment number when I'm ready to commit to the database: the number is added to a signed token, and the payment number is later taken from the signed token when committing, so that the token can be linked to the payment afterwards.
Payment numbers are sequential and the current method used in existing code is:
Create a Payment
Get the last payment number from the database
Increment the payment number
Use this payment number for the Payment
Update the database with the incremented payment number
In my service I am trying to prevent the following race-condition:
My service reads the payment number (eg. 100)
Another service uses and updates the payment number (now 101)
My service increments the number locally (to 101) and updates the database (still 101)
This would produce two payments with a payment number of 100.
Here is my implementation so far, in my Transaction class:
private DbSet<PaymentIdentifier> paymentIdentifier;
//...
private int ClaimNextPaymentNumber()
{
    int nextPaymentNumber = -1;
    using (var dbTransaction = db.Database.BeginTransaction())
    {
        int lastPaymentNumber = paymentIdentifier.ElementAt(0).Identifier;
        nextPaymentNumber = lastPaymentNumber + 1;
        paymentIdentifier.ElementAt(0).Identifier = nextPaymentNumber;
        db.SaveChanges();
        dbTransaction.Commit();
    }
    return nextPaymentNumber;
}
The PaymentIdentifier table has a single row and a single column "Identifier" (hence the .ElementAt(0)). I am unable to change the database structure as there is lots of legacy code relying on it that is very brittle.
Will having the code wrapped in a transaction (as I have done) protect against the race condition, or are there Entity Framework / PostgreSQL idiosyncrasies I need to deal with to protect the identifier from being read while the transaction is in progress?
Thank you!
(As a side point, I believe lots of legacy code in the other software connecting to the database simply ignores the race condition and relies on it being "very fast")
This helps with the race condition only if all code, including the legacy code, uses this method. If there is still code that continues to use client-side incrementing without a transaction, you'll get the same problem; just swap 'My service' and 'Another service' in your description:
1. Another service reads the payment number (eg. 100) **without** transaction
2. My service uses and updates the payment number (now 101) **with** transaction
3. Another service increments the number locally (to 101) and updates the database (still 101) **without** transaction
Note that you can replace your code with something simpler by executing this query, without an explicit transaction:
update PaymentIdentifier set Identifier = Identifier + 1 returning Identifier;
But again, this will not solve your concurrency problem until you replace all the places where the Identifier is incremented. If you can change all of them, you would be better off using a SEQUENCE or generators that safely provide incremental IDs.
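For reference, a minimal sketch of issuing that atomic statement from .NET with plain Npgsql (conn is assumed to be an open NpgsqlConnection):
// The single UPDATE ... RETURNING executes atomically on the server, so no
// explicit transaction is needed around it.
await using var cmd = new NpgsqlCommand(
    "UPDATE PaymentIdentifier SET Identifier = Identifier + 1 RETURNING Identifier", conn);
object? result = await cmd.ExecuteScalarAsync();
int next = (int)result!;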
A transaction does not automatically lock your table. A transaction just ensures that multiple changes to the database happen all together or not at all (the A, atomic, in ACID). But what you want is that only one session at a time can read, add one, and update the value, and only after that is done is the next session allowed to do the same.
So you now have different possibilities:
Use a sequence. You can get the next value with, for example, SELECT nextval('mysequencename'). If two sessions ask for a value at the same time, they will get two different values.
If you have more complex needs and want to store every "token" as a row in a table with additional columns, you could use table locking (a sketch follows below). With this you can restrict access to the table so that only one session at a time is allowed in. But make sure you hold locks for as short a time as possible, because this will become your performance bottleneck.
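A minimal sketch of that second option with Npgsql (the table name tokens is a placeholder; conn is an open NpgsqlConnection):
// Serialize access with an explicit table lock; the lock is released at commit.
await using var tx = await conn.BeginTransactionAsync();
await using (var lockCmd = new NpgsqlCommand(
    "LOCK TABLE tokens IN ACCESS EXCLUSIVE MODE", conn, tx))
{
    await lockCmd.ExecuteNonQueryAsync();
}
// ... read, increment, and update the token row here ...
await tx.CommitAsync();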
The database prevents the race condition by throwing a concurrency violation error in this case. So I looked at how this is handled in the legacy code (following the suggestion by @sergey-l), and it uses a simple retry mechanism. So I did the same:
private int ClaimNextPaymentNumber()
{
    bool failed;
    int paymentNumber = -1;
    do
    {
        failed = false;
        using (var dbTransaction = db.Database.BeginTransaction())
        {
            try
            {
                paymentNumber = TryToClaimNextPaymentNumber();
                dbTransaction.Commit(); // only commit when the claim succeeded
                concurrencyExceptionRetryCount = 0;
            }
            catch (DbUpdateConcurrencyException ex)
            {
                failed = true;
                ResetForClaimPaymentNumberRetry(ex);
            }
        }
    }
    while (failed);

    return paymentNumber;
}
I am trying to read all new rows that are added to the database on a timer.
First I read the entire database and save it to a local data table, but then I want to read only the rows that have been added since. Here is how I'm trying to read new rows:
string accessDB1 = string.Format("SELECT * FROM {0} ORDER BY ID DESC", tableName);
setupaccessDB(accessDB1);

int dTRows = localDataTable.Rows.Count + 1;
localDataTable.Rows.Add();

using (readNext = command.ExecuteReader())
{
    while (readNext.Read())
    {
        for (int xyz = 0; xyz < localDataTable.Columns.Count; xyz++)
        {
            // Code
        }
        break;
    }
}
If only one row is added within the timer interval this works fine, but when multiple rows are added it only reads the latest row.
So is there any way I can read all of the added rows?
I am using an OleDbDataReader.
Thanks in advance
For most tables the primary key is based on an incremental value. This can be a very simple integer that is incremented by one, but it could also be a datetime-based GUID.
Anyway, if you know the ID of the last record, you can simply ask for all records that have a higher ID; a sketch follows below. That way you get the new records, but what about updated records? If you also want those, you might want to use a column that contains a datetime value.
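A hedged sketch of that higher-ID approach with OleDb (table and column names are placeholders; note that OleDb uses positional ? parameters):
// Remember the highest ID seen so far and ask only for newer rows each tick.
using var cmd = new OleDbCommand(
    $"SELECT * FROM {tableName} WHERE ID > ? ORDER BY ID", connection);
cmd.Parameters.AddWithValue("lastId", lastSeenId);
using var reader = cmd.ExecuteReader();
while (reader.Read())
{
    lastSeenId = reader.GetInt32(reader.GetOrdinal("ID"));
    // ... process the new row ...
}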
A little trickier are records that are deleted from the database; you can't retrieve those with a basic query. You could solve that by setting a TTL for each record you retrieve from the database, much like a cache: when the record has 'expired', you try to retrieve it again.
Some databases, like Microsoft SQL Server, also provide more advanced options in this regard. You can use query notifications via the broker services or enable change tracking on your database. The latter can even indicate the last action per record (insert, update or delete).
Your immediate problem lies here:
while (readNext.Read())
{
doSomething();
break;
}
This is what your loop basically boils down to. That break is going to exit the loop after processing the first item, regardless of how many items there are.
The first item, in this case, will probably be the last one added (as you state it is) since you're sorting by descending ID.
In terms of reading only newly added rows, there are a variety of ways to do it, some which will depend on the DBMS that you're using.
Perhaps the simplest and most portable would be to add an extra column processed which is set to false when a row is first added.
That way, you can simply have a query that looks for those records and, for each, process them and set the column to true.
In fact, you could use triggers to do this (force the flag to false on insertion) which opens up the possibility for doing it with updates as well.
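A hedged sketch of that processed-flag poll with OleDb (table and column names are placeholders; boolean-literal syntax varies by provider):
// Collect the unprocessed IDs first, then process each and flip its flag,
// so rows inserted mid-loop are not marked processed by accident.
var pending = new List<int>();
using (var select = new OleDbCommand(
    "SELECT ID FROM MyTable WHERE Processed = false ORDER BY ID", connection))
using (var reader = select.ExecuteReader())
{
    while (reader.Read())
        pending.Add(reader.GetInt32(0));
}
foreach (int id in pending)
{
    // ... process the row with this ID ...
    using var update = new OleDbCommand(
        "UPDATE MyTable SET Processed = true WHERE ID = ?", connection);
    update.Parameters.AddWithValue("id", id);
    update.ExecuteNonQuery();
}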
Tracking deletions is a little more difficult but still achievable. You could have a trigger which actually writes the record to a separate table before deleting it so that your processing code has access to those details as well.
The following works
using (readNext = command.ExecuteReader())
{
    while (readNext.Read())
    {
        int fieldCount = readNext.FieldCount;
        for (int s = 1; s < fieldCount; s++) // note: starting at 1 skips the first column; start at 0 to include it
        {
            var nextValue = readNext.GetValue(s);
        }
    }
}
The for loop reads the columns of the current row, and the while loop moves on to the next row.
I have some .NET code wrapped up in a repeatable read transaction that looks like this:
using (
var transaction = new TransactionScope(
TransactionScopeOption.Required,
new TransactionOptions { IsolationLevel = IsolationLevel.RepeatableRead },
TransactionScopeAsyncFlowOption.Enabled))
{
int theNextValue = GetNextValueFromTheDatabase();
var entity = new MyEntity
{
Id = Guid.NewGuid(),
PropertyOne = theNextValue, //An identity column
PropertyTwo = Convert.ToString(theNextValue),
PropertyThree = theNextValue,
...
};
DbSet<MyEntity> myDbSet = GetEntitySet();
myDbSet.Add(entity);
await this.databaseContext.Entities.SaveChangesAsync();
transaction.Complete();
}
The first method, GetNextValueFromTheDatabase, retrieves the max value stored in a column in a table in the database. I'm using repeatable read because I don't want two users to read and use the same value. Then, I simply create an Entity in memory and call SaveChangesAsync() to write the values to the database.
Sporadically, I see that the values of entity.PropertyOne, entity.PropertyTwo, and entity.PropertyThree do not match each other. For example, entity.PropertyOne has a value of 500, but entity.PropertyTwo and entity.PropertyThree have a value of 499. How is that possible? Even if the code weren't wrapped in a transaction, I would expect the values to match (just maybe duplicated across the Entities if two users ran at the same time).
I am using Entity Framework 6 and Sql Server 2008R2.
Edit:
Here is the code for GetNextValueFromTheDatabase
public async Task<int> GetNextValueFromTheDatabase()
{
return await myQuerable
.OrderByDescending(x => x.PropertyOne) //PropertyOne is an identity column (surprise!)
.Select(x => x.PropertyOne)
.Take(1)
.SingleAsync() + 1;
}
So this question could not be definitively answered as originally asked, because GetNextValueFromTheDatabase was not shown. I'm going off what you said it does:
REPEATABLE READ in SQL Server S-locks rows that you have read. When you read the current maximum, presumably from an index, that row is S-locked. Now, if a new maximum appears that row is unaffected by the lock. That's why the lock does not prevent other, competing maximum values from appearing.
You need SERIALIZABLE isolation if you obtain the maximum by reading the largest values from a table. This will result in deadlocks in your specific case. That can be solved through locking hints or retries.
You could also keep a separate table that stores the current maximum value. REPEATABLE READ is enough here because you always access the same row of that table. But you will see deadlocks here as well, even with REPEATABLE READ, unless you use locking hints.
Retries are a sound solution to deadlocks.
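For completeness, here is the scope from the question at SERIALIZABLE, which closes the read-then-insert window but, as noted, can deadlock under contention and so should be paired with a retry:
// Sketch: identical to the original scope except for the isolation level.
using (var transaction = new TransactionScope(
    TransactionScopeOption.Required,
    new TransactionOptions { IsolationLevel = IsolationLevel.Serializable },
    TransactionScopeAsyncFlowOption.Enabled))
{
    // ... read the max, build the entity, SaveChangesAsync ...
    transaction.Complete();
}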
I think you are basically experiencing a phantom read.
Consider two transactions T1 and T2, scheduled for execution as shown below. The thing is that in T1's first read you do not see the value (X) inserted by transaction T2; the second time, you do get the value (X) in your select statement. This is the scary nature of repeatable read: it does not block insertion into the whole table when some rows are read from it; it only locks the existing rows.
T1                             T2
SELECT A.X FROM WeirdTable
                               INSERT INTO WeirdTable (A) VALUES (X)
SELECT A.X FROM WeirdTable
UPDATE
It seems that this answer turned out to be irrelevant for this specific question. It relates to the repeatable read isolation level and matches the question's keywords, and it is not conceptually wrong, so I will leave it here.
I finally figured this out. As described in usr's response, multiple transactions can read the same max value at the same time (S-lock). The problem was that one of the columns is an identity column. EF allows you to specify an identity column's value when inserting, but ignores the value you specify. So the identity column seemed to update with the expected value most of the time, but in fact the value specified in the domain entity just happened to match what the database was generating internally.
So, for example, let's say the current max number is 499, and transactions A and B both read 499. When transaction A finishes, it successfully writes 500 to all three properties. Transaction B then attempts to write 500 to all three columns. The non-identity columns are updated to 500, but the identity column's value is incremented to the next available value automatically (without throwing an error).
A few solutions
The solution I used is to not set the value for any of the columns when inserting the record. Once the record is inserted, update the other two columns with the database assigned identity column's value.
Another option would be to change the column's configuration to .HasDatabaseGeneratedOption(DatabaseGeneratedOption.None), which would perform better than the first option but would require the changes usr suggested to mitigate the locking issues.
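A sketch of that configuration in EF6's fluent API, inside your DbContext (MyEntity and PropertyOne as in the question):
// Tell EF the value is supplied by the application, not generated by the DB.
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Entity<MyEntity>()
        .Property(e => e.PropertyOne)
        .HasDatabaseGeneratedOption(DatabaseGeneratedOption.None);
}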
I'm using C# and I have three DataTables; all three have an int column named idnum. What I need is to keep all the numbers in the idnum column unique across the three tables, and the next number should always be the smallest available. For example:
table a
idnum=1
idnum=3
idnum=4
table b
idnum=2
idnum=7
table c
idnum=8
In this case, the next number for a new row in any of the three tables would be 5, then 6, and then 9.
My question is: what would be the best approach to get the next number?
I don't want to use SQL.
thanks
nuno
You'd probably want a fourth table, to hold all the "gap" numbers. Otherwise you would have to check every number starting from 1.
On insert: Find the smallest number in the "gaps" table. Use that number when inserting a new item. If there are no items in the gap table, use Max+1 of the idnums across all tables.
On delete: Put the number that you just retired into the "gaps" table.
If your app is multi-threaded, you'd have to add a lock to make sure that two threads don't grab the same gap.
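An in-memory sketch of that scheme, assuming persistence isn't required (the class and member names are made up); a SortedSet hands back the smallest freed number first:
// Thread-safe allocator: reuse the smallest gap, else extend past the max.
sealed class IdAllocator
{
    private readonly SortedSet<int> gaps = new SortedSet<int>();
    private int max;
    private readonly object sync = new object();

    public int Next()
    {
        lock (sync)
        {
            if (gaps.Count > 0)
            {
                int id = gaps.Min;
                gaps.Remove(id);
                return id;
            }
            return ++max;
        }
    }

    public void Release(int id) // call when a row's idnum is retired
    {
        lock (sync) { gaps.Add(id); }
    }
}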
You're not going to be able to do this automatically; the auto-numbering features built into ADO.NET are all scoped to the individual table.
So, given that you're going to have to code your own method to handle this, what's the best way?
If you were using a database, I'd suggest a fourth table: make the ID column in the three main tables a foreign key storing the fourth table's ID, and synchronize inserting a row into the fourth table with inserting a row into any of the other three. Something like:
INSERT INTO Sequence (DateInserted) VALUES (GETDATE())
INSERT INTO TableA (SequenceID, ...) VALUES (SCOPE_IDENTITY(), ...)
But you don't want to use a database, which suggests to me that you don't really care about the persistence of these ID numbers. If they only need to exist while your application is running, you can just use a static field to store the last used ID and make a helper class:
public static class SequenceHelper
{
    private static int ID; // last used ID
    private static object LockObject = new object();

    public static int GetNextID()
    {
        lock (LockObject)
        {
            return ++ID; // increment first, so the first ID handed out is 1
        }
    }
}
The locking is only strictly necessary if rows can be added from multiple threads, but there's no harm in making this code thread-safe.
Then you can handle the TableNewRow event on each of your three data tables, e.g.:
DataTable t = MyDataSet.Tables["TableA"];
t.TableNewRow += (sender, e) =>
{
    e.Row["ID"] = SequenceHelper.GetNextID();
};
This will ensure that, whatever method adds a new row to each table (whether your code calling NewRow() or a new row added via a data-bound control), each row added will have its ID column set to the next ID.