I currently have a shopping cart API that adds an item to the table when it doesn't exist yet, and then increases the Qty column each time the same item is added again:
var exist = _context.Carts.Any(a => a.CartID == dto.CartSesID && a.SweetID == dto.SweetID);
if (!exist)
{
    // Create a new cart item if no cart item exists
    var cartItem = new Cart
    {
        SweetID = dto.SweetID,
        CartID = dto.CartSesID,
        Qty = qty,
        DateCreated = DateTime.Now
    };
    _context.Carts.Add(cartItem);
}
else
{
    var cartItem = _context.Carts.FirstOrDefault(a => a.CartID == dto.CartSesID && a.SweetID == dto.SweetID);
    // If the item does exist in the cart,
    // then adjust the quantity by qty
    if (type == "plus")
        cartItem.Qty = cartItem.Qty + qty;
    if (type == "minus")
        cartItem.Qty = cartItem.Qty - qty;
    if (cartItem.Qty == 0)
        _context.Carts.Remove(cartItem);
}
// Save changes
_context.SaveChanges();
The problem is that the if (!exist) check concludes the item doesn't exist when the button is clicked several times in quick succession (maybe one request hasn't finished when the next one starts?), resulting in the same item being added on several rows instead of a single row with an increased quantity.
Does anyone know an ideal fix?
You have a race condition here. When two requests follow one another rapidly, the first request may not have committed its changes to the DB before the second one queries for the existence of the item in question.
You need to apply some concurrency control to resolve this. Basically there are two ways to go:
Serializing the changes made to the DB. Again, two main options:
most RDBMSs support the serializable isolation level for transactions, or
you can use some locking mechanism. This can take place at the DB level (e.g. table locking) or the application level (.NET locking constructs), depending on your application architecture.
You can apply a trial-and-error (or, more precisely, trial-and-retry-on-error) approach, aka optimistic concurrency control.
Obviously, serializing has a negative impact on performance (especially option 1.1), so optimistic concurrency control is usually preferred and the other options are reserved for special cases.
Luckily, EF has built-in support for optimistic concurrency handling. All the details are discussed in this MSDN article.
In this particular case you have an even simpler way: define a compound unique constraint on the (CartID, SweetID) fields. Doing this guarantees that no duplicates can be inserted into the table. You get an exception when such an attempt is detected, and by catching it you can handle the situation according to your requirements, e.g. you can initiate an update (but keep in mind that even in this case you need optimistic concurrency checking to make the process absolutely fail-safe!).
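For illustration (not part of the original answer), the compound unique index and the catch-and-handle logic might look roughly like this in EF Core; with EF6 you would create the unique index in a migration instead, and the exact provider error wrapped by the exception depends on your database:

// In OnModelCreating: one row per (CartID, SweetID) pair, enforced by the database.
modelBuilder.Entity<Cart>()
    .HasIndex(c => new { c.CartID, c.SweetID })
    .IsUnique();

// In the add-to-cart handler, a duplicate insert caused by a racing request
// surfaces as a DbUpdateException wrapping the unique-violation error.
try
{
    _context.Carts.Add(new Cart
    {
        CartID = dto.CartSesID,
        SweetID = dto.SweetID,
        Qty = qty,
        DateCreated = DateTime.Now
    });
    _context.SaveChanges();
}
catch (DbUpdateException)
{
    // Another request inserted the row first; fall back to incrementing the
    // existing row's Qty (which itself should use optimistic concurrency).
}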
Footnote
Disabling the button just masks the problem, indeed. On the server side you cannot trust what JS does on the client side, because it is completely out of your control: JS can easily be disabled or modified by the user.
Update
Reading through my answer again, I feel there's a conclusion I should add:
In this particular case I think the best you can do is to
set the unique constraint as I suggested, but don't care about the exception thrown when a duplicate insert is detected, AND
disable the submit button on the client side for the duration of the submit.
This way you ensure that no invalid data can be stored in your DB even if someone manipulates your JS code. At the same time, you don't need to overcomplicate your data-persistence logic.
Related
I have the following method updating an entity. The only biff I had was that when a non-existing ID was provided, I got a harsh exception.
public bool Update(Thing thing)
{
    Context.Things.Update(thing);
    int result = Context.SaveChanges();
    return result == 1;
}
So I added a check to control the exception thrown (plus some nice logging and other facilitation). Eventually, I plan to skip the throwing entirely.
public bool UpdateWithCheck(Thing thing)
{
    Thing target = Context.Things.SingleOrDefault(a => a.Id == thing.Id);
    if (target == null)
        throw new CustomException($"No thing with ID {thing.Id}.");

    Context.Things.Update(thing);
    int result = Context.SaveChanges();
    return result == 1;
}
No, this doesn't work, because the entity already is being tracked. I have several options to handle that.
Change to Context.Where(...).AsNoTracking().
Explicitly set the updated fields in target and save it.
Horse around with entity states and tamper with the change tracker.
Remove the present entity and add the new one.
I can't decide which is the best practice. Googling gave me the default examples that do not contain the check for pre-existing status in the same operation.
The reason for the exception is that by loading the entity from the Context to check if it exists, you now have a tracked reference. When you go to update the detached reference, EF will complain that an instance is already tracked.
The simplest work-around would be:
public bool UpdateWithCheck(Thing thing)
{
    bool doesExist = Context.Things.Any(a => a.Id == thing.Id);
    if (!doesExist)
        throw new CustomException($"No thing with ID {thing.Id}.");

    Context.Things.Update(thing);
    int result = Context.SaveChanges();
    return result == 1;
}
However, there are two problems with this approach. Firstly, because we don't know the scope of the DbContext instance and cannot guarantee the order of methods, it is possible that at some point that DbContext instance could have loaded and tracked that instance of the thing. This can manifest as seemingly intermittent errors. The proper way to guard against that would be something like:
public bool UpdateWithCheck(Thing thing)
{
    bool doesExist = Context.Things.Any(a => a.Id == thing.Id);
    if (!doesExist)
        throw new CustomException($"No thing with ID {thing.Id}.");

    Thing existing = Context.Things.Local.SingleOrDefault(a => a.Id == thing.Id);
    if (existing != null)
        Context.Entry(existing).State = EntityState.Detached;

    Context.Things.Update(thing);
    int result = Context.SaveChanges();
    return result == 1;
}
This checks the local tracking cache for any loaded instances and, if found, detaches them. The risk here is that any modifications that haven't been persisted in those tracked references will be discarded, and any references floating around that you would have assumed were attached will now be detached.
The second significant issue is with using Update(). When you have detached entities being passed around, there is a risk that data you don't intend to be updated gets updated. Update will replace all columns, whereas typically a client might only be expected to update a subset of them. EF can be configured to check row versions or timestamps on entities against the database before updating, when your database is set up to support them (such as snapshot isolation), which can help guard against stale overwrites but still allows unexpected tampering.
As you've already figured out, the better approach is to avoid passing detached entities around, and instead use dedicated DTOs. This avoids potential confusion about what objects represent view/consumer state vs. data state. By explicitly copying the values across from the DTO to the entity, or configuring a mapper to copy supported values, you also protect your system from unexpected tampering and potential stale overwrites. One consideration with this approach is that you should guard updates to avoid unconditionally overwriting data with potentially stale data by ensuring your Entity and DTO have a RowVersion/Timestamp to compare. Before copying from DTO to the freshly loaded Entity, compare the version, if it matches then nothing has changed in the data row since you fetched and composed your DTO. If it has changed, that means someone else has updated the underlying data row since the DTO was read, so your modifications are against stale data. From there, take an appropriate action such as discard changes, overwrite changes, merge the changes, log the fact, etc.
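A rough sketch of that DTO + RowVersion flow, reusing the Thing example from the question (the ThingDto shape, its RowVersion property, and the copied fields are assumptions for illustration):

public bool UpdateFromDto(ThingDto dto)
{
    // Load the current entity; it is tracked, which is fine because we copy
    // values onto it rather than attaching a detached instance.
    Thing target = Context.Things.SingleOrDefault(t => t.Id == dto.Id);
    if (target == null)
        throw new CustomException($"No thing with ID {dto.Id}.");

    // Optimistic concurrency check: the DTO carries the RowVersion it was read with.
    if (!target.RowVersion.SequenceEqual(dto.RowVersion))
        throw new CustomException("The thing was changed by someone else since it was read.");

    // Copy only the fields the client is allowed to change.
    target.Name = dto.Name;
    target.Description = dto.Description;

    return Context.SaveChanges() == 1;
}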
Just alter the properties of target and call SaveChanges() - remove the Update call. I'd say the typical use case these days is for the input thing to not actually be a Thing but a ThingViewModel, ThingDto or some other variation on the theme of "an object that carries enough data to identify and update a Thing but isn't actually a DB entity". To that extent, if the notion of updating the properties of Thing from ThingViewModel by hand bores you, you can look at a mapper (AutoMapper is probably the best known, but there are many others) to do the copying for you, or even set you up with a new Thing if you decide to turn this method into an Upsert.
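A minimal sketch of that suggestion (the ThingViewModel and its fields are hypothetical):

public bool Update(ThingViewModel vm)
{
    Thing target = Context.Things.SingleOrDefault(t => t.Id == vm.Id);
    if (target == null)
        throw new CustomException($"No thing with ID {vm.Id}.");

    // No Update() call: target is already tracked, so just change its
    // properties and SaveChanges() persists exactly those modifications.
    target.Name = vm.Name;
    target.Price = vm.Price;
    // Or, with AutoMapper configured via CreateMap<ThingViewModel, Thing>():
    // _mapper.Map(vm, target);

    return Context.SaveChanges() == 1;
}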
I have an Entity Set employee_table. I'm getting the data from an Excel sheet which I have loaded into memory, and the user will click Save to save the changes in the db. It all works fine the first time the records are inserted; no issue with that.
But how can I update only the changes that are made? Meaning, let's say I have 10 rows and 5 columns, and out of the 10 rows say the 7th was modified, and out of the 5 columns say the 3rd was modified; I need to update only those changes and keep the existing values of the other columns.
I can do it by checking if (myexistingItem.Name != dbItem.Name) { //update }, but that is very tedious and not efficient, and I'm sure there is a better way to handle it.
Here is what I've got so far.
var excelData = SessionWrapper.GetSession_Model().DataModel.OrderBy(x => x.LocalName).ToList();
var dbData = context.employee_master.OrderBy(x => x.localname).ToList();
employee_master dbEntity = new employee_master();

if (dbData.Count > 0)
{
    //update
    foreach (var dbItem in dbData)
    {
        foreach (var xlData in excelData)
        {
            if (dbItem.customer == xlData.Customer)
            {
                dbEntity.customer = xlData.Customer;
            }
            //...do check rest of the props....
            db.Entry(dbEntity).State = EntityState.Modified;
            db.employee_master.Add(dbEntity);
        }
    }
    //save
    db.SaveChanges();
}
else
{
    //insert
}
You can make this checking more generic with reflection.
Using this answer to get the value by property name.
public static object GetPropValue(object src, string propName)
{
    return src.GetType().GetProperty(propName).GetValue(src, null);
}
Using this answer to set the value by property name.
public static void SetPropertyValue(object obj, string propName, object value)
{
    obj.GetType().GetProperty(propName).SetValue(obj, value, null);
}
And this answer to list all properties
public static void CopyIfDifferent(object target, object source)
{
    foreach (var prop in target.GetType().GetProperties())
    {
        var targetValue = GetPropValue(target, prop.Name);
        var sourceValue = GetPropValue(source, prop.Name);
        // Use the static Equals so a null target value doesn't throw.
        if (!Equals(targetValue, sourceValue))
        {
            SetPropertyValue(target, prop.Name, sourceValue);
        }
    }
}
Note: If you need to exclude some properties, you can implement this very easily by passing a list of properties to the method and checking in the if whether each one should be excluded or not.
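For example, a possible variant with an exclusion list (my own sketch, not from the original answer; Contains needs using System.Linq):

public static void CopyIfDifferent(object target, object source, params string[] excluded)
{
    foreach (var prop in target.GetType().GetProperties())
    {
        // Skip properties the caller asked to leave untouched.
        if (excluded.Contains(prop.Name))
            continue;

        var targetValue = GetPropValue(target, prop.Name);
        var sourceValue = GetPropValue(source, prop.Name);
        if (!Equals(targetValue, sourceValue))
        {
            SetPropertyValue(target, prop.Name, sourceValue);
        }
    }
}

// e.g. copy everything except hypothetical key/audit columns:
// CopyIfDifferent(dbItem, xlData, "Id", "DateCreated");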
Update:
I am updating this answer to provide a little more context as to why I suggested not going with a hand-rolled reflection-based solution for now. I also want to clarify that there is nothing wrong with such a solution per se, once you have identified that it fits the bill.
First of all, I assume from the code that this is a work in progress and therefore not complete. In that case, I feel that the code doesn't need more complexity before it's done, and a hand-rolled reflection-based approach is more code for you to write, test, debug and maintain.
For example, right now you seem to have a situation where there is a simple 1:1 copy from the data in the Excel sheet to the data in the employee_master object. In that case reflection seems like a no-brainer, because it saves you loads of boring manual property assignments.
But what happens when HR (or whoever uses this app) comes back to you with the requirement: if Field X is blank on the Excel sheet, then copy the value "Blank" to the target field, unless it's Friday, in which case copy the value "N.A"?
Now a generalised solution has to accommodate custom business logic and could start to get burdensome. I have been in this situation, and unless you are very careful, it tends to turn into a mess in the long run.
I just wanted to point this out - and to recommend at least looking at AutoMapper, because it already provides one very proven way to solve your issue.
In terms of efficiency concerns, they are only mentioned because the question mentioned them, and I wanted to point out that there are greater inefficiencies at play in the code as posted than the inefficiency of manually typing 40+ property assignments, or indeed the concern of only updating changed fields.
Why not rewrite the loop:
foreach (var xlData in excelData)
{
    //find existing record in database data:
    var existing = dbData.FirstOrDefault(d => d.customer == xlData.Customer);
    if (existing != null)
    {
        //it already exists in database, update it
        //see below for notes on this.
    }
    else
    {
        //it doesn't exist, create employee_master and insert it to context
        //or perform validation to see if the insert can be done, etc.
    }
    //and commit:
    context.SaveChanges();
}
This lets you avoid the initial if(dbData.Count>0) because you will always insert any row from the excel sheet that doesn't have a matching entry in dbData, so you don't need a separate block of code for first-time insertion.
It's also more efficient than the current loop because right now you are iterating every object in dbData for every object in xlData; that means if you have 1,000 items in each you have a million iterations...
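If that lookup itself ever becomes a concern for large sheets, one option (my own suggestion, assuming customer is effectively a unique key) is to index the database rows in a dictionary up front:

// Build a lookup once: O(n) instead of re-scanning dbData for every Excel row.
// Assumes customer values are unique; use ToLookup(...) if they are not.
var dbByCustomer = dbData
    .Where(d => d.customer != null)
    .ToDictionary(d => d.customer);

foreach (var xlData in excelData)
{
    if (dbByCustomer.TryGetValue(xlData.Customer, out var existing))
    {
        // it already exists in the database, update it
    }
    else
    {
        // it doesn't exist, create an employee_master and add it to the context
    }
}
context.SaveChanges();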
Notes on the update process and efficiency in general
(Note: I know the question wasn't directly about efficiency, but since you mentioned it in the context of copying properties, I just wanted to give some food for thought)
Unless you are building a system that has to do this operation for multiple entities, I'd caution against adding more complexity to your code by building a reflection-based property copier.
If you consider the number of properties you have to copy (i.e. the number of foo.x = bar.x type statements), and then consider the code required for a robust, fully tested and provably efficient reflection-based property copier (i.e. with a built-in cache so you don't constantly re-reflect type properties, a mechanism to specify exceptions, handling for edge cases where for whatever unknown reason you discover that for random column X the value "null" is to be treated a little differently in some cases, etc.), you may well find that the former is actually significantly less work :)
Bear in mind that even the fastest reflection-based solution will always still be slower than a good old fashioned foo.x = bar.x assignment.
By all means, if you have to do this operation for 10 or 20 separate entities, consider the general case; otherwise my advice would be to manually write the property copy assignments, get it right, and then think about generalising - or look at AutoMapper, for example.
In terms of only updating fields that have changed - I am not sure you even need to. If the record exists in the database, and the user has just presented a copy of that record which they claim to be the "correct" version of that object, then just copy all the values they gave and save them to the database.
The reason I say this is that, in all likelihood, the efficiency of only sending for example 4 modified fields versus 25 or whatever pales into insignificance next to the overhead of the actual round trip to the database itself; I'd be surprised if you were able to observe a meaningful performance increase in these kinds of operations by not sending all columns - unless of course all columns are NVARCHAR(MAX) or something :)
If concurrency is an issue (i.e. other users might be modifying the same data), then include a ROWVERSION-type column in the database table, map it in Entity Framework, and handle the concurrency issues if and when they arise.
I am fairly new to EF and SQL in general, so I could use some help clarifying this point.
Let's say I have a table "wallet" (and EF code first object Wallet) that has an ID and a balance. I need to do an operation like this:
if (wallet.balance > 100)
{
    doOtherChecksThatTake10Seconds();
    wallet.balance -= 50;
    context.SaveChanges();
}
As you can see, it checks to see if a condition is valid, then if so it has to do a bunch of other operations first that take a long time (in this exaggerated example we say 10 seconds), then if that passes it subtracts $50 from the wallet and saves the new data.
The issue is, there are other things happening that can change the wallet balance at any time (this is a web application). If this happens:
wallet.balance = 110;
this operation passes its "if" check because wallet.balance (110) is greater than 100
while it's doing the "doOtherChecksThatTake10Seconds()", a user transfers $40 out of their wallet
now wallet.balance = 70
"doOtherChecksThatTake10Seconds()" finishes, subtracts 50 from wallet.balance, and then saves the context with the new data.
In this case, the check of wallet.balance > 100 is no longer true, but the operation still happened because of the delay. I need to find a way of locking the table and not releasing it until the entire operation is finished, so nothing gets edited in the meantime. What is the most effective way to do this?
It should be noted that I have tried putting this operation within a TransactionScope(). I am not sure whether that has the intended effect or not, but I did notice it started causing a lot of deadlocks with an entirely different database operation that is running.
Use Optimistic concurrency http://msdn.microsoft.com/en-us/data/jj592904
// Object property:
public byte[] RowVersion { get; set; }

// Object configuration:
Property(p => p.RowVersion).IsRowVersion().IsConcurrencyToken();
This allows dirty reads, BUT when you go to update the record the system checks that the rowversion hasn't changed in the meantime; the update fails if someone else has changed the record.
The rowversion is maintained by the DB each time a record changes.
Out-of-the-box EF optimistic locking.
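For the wallet example above, handling a conflicting update might then look roughly like this (a sketch assuming an EF 6-style DbContext with a Wallets DbSet; the retry policy is up to you):

try
{
    var wallet = context.Wallets.Single(w => w.Id == walletId);
    if (wallet.balance > 100)
    {
        doOtherChecksThatTake10Seconds();
        wallet.balance -= 50;
        // Because RowVersion is a concurrency token, EF adds
        // "WHERE RowVersion = <original value>" to the UPDATE; if another
        // request changed the row in the meantime, this throws.
        context.SaveChanges();
    }
}
catch (DbUpdateConcurrencyException)
{
    // Someone modified the wallet while we were checking: reload and retry,
    // or report the conflict to the caller.
}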
You can use TransactionScope.
Import the namespace
using System.Transactions;
and use it like below:
public string InsertBrand()
{
    try
    {
        using (TransactionScope transaction = new TransactionScope())
        {
            // Do your operations here
            transaction.Complete();
            return "Mobile Brand Added";
        }
    }
    catch (Exception)
    {
        // Log here if needed, then rethrow without resetting the stack trace.
        throw;
    }
}
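Note that TransactionScope defaults to the Serializable isolation level, which is what actually prevents the balance check from racing, but it is also what tends to produce deadlocks (as the question observed). You can pass TransactionOptions to choose the level explicitly; a sketch, not from the original answer:

using System.Transactions;

var options = new TransactionOptions
{
    // Serializable (the default) gives the strongest guarantee for the
    // check-then-debit sequence; weaker levels reduce deadlocks but also
    // weaken that guarantee, so choose deliberately.
    IsolationLevel = IsolationLevel.Serializable,
    Timeout = TimeSpan.FromSeconds(30)
};

using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
{
    // wallet check + doOtherChecksThatTake10Seconds() + context.SaveChanges() here
    scope.Complete();
}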
Another approach could be to use one or more internal queues and consume the queue(s) with one thread only (producer-consumer pattern). I use this approach in a booking system and it works quite well and is very easy.
In my case I have multiple queues (one for each 'product') that are created and deleted dynamically, and multiple consumers, where only one consumer can be assigned to a given queue. This also allows handling higher concurrency. In a high-concurrency scenario with hundreds of thousands of users you could also use separate servers and queues, such as MSMQ, to handle this.
There might be a problem with this approach in a ticket system where a lot of users want a ticket for a concert, or in a shopping system when a new "Harry Potter" is released, but I don't have those scenarios.
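A minimal in-process sketch of that pattern using BlockingCollection (my own illustration; the WalletRequest type and Process method are assumptions):

using System.Collections.Concurrent;
using System.Threading.Tasks;

// All wallet mutations are funnelled through one queue and one consumer,
// so the check-then-debit sequence never interleaves with another request.
var queue = new BlockingCollection<WalletRequest>();

Task.Run(() =>
{
    foreach (var request in queue.GetConsumingEnumerable())
    {
        Process(request); // the long-running checks + balance update, one at a time
    }
});

// Web requests just enqueue and return (or await a completion signal):
queue.Add(new WalletRequest { WalletId = 42, Amount = 50 });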
Lately, in apps I've been developing, I have been checking the number of rows affected by an insert, update, or delete against the database and logging an error if the number is unexpected. For example, on a simple insert, update, or delete of one row, if any number of rows other than one is returned from an ExecuteNonQuery() call, I consider that an error and log it. Also, I realize now as I type this that I do not even try to roll back the transaction if that happens, which is not best practice and should definitely be addressed. Anyway, here's code to illustrate what I mean:
I'll have a data layer function that makes the call to the db:
public static int DLInsert(Person person)
{
    Database db = DatabaseFactory.CreateDatabase("dbConnString");
    using (DbCommand dbCommand = db.GetStoredProcCommand("dbo.Insert_Person"))
    {
        db.AddInParameter(dbCommand, "@FirstName", DbType.String, person.FirstName);
        db.AddInParameter(dbCommand, "@LastName", DbType.String, person.LastName);
        db.AddInParameter(dbCommand, "@Address", DbType.String, person.Address);
        return db.ExecuteNonQuery(dbCommand);
    }
}
Then a business layer call to the data layer function:
public static bool BLInsert(Person person)
{
    if (DLInsert(person) != 1)
    {
        // log exception
        return false;
    }
    return true;
}
And in the code-behind or view (I do both webforms and mvc projects):
if (BLInsert(person))
{
    // carry on as normal with whatever other code after successful insert
}
else
{
    // throw an exception that directs the user to one of my custom error pages
}
The more I use this type of code, the more I feel like it is overkill. Especially in the code-behind/view. Is there any legitimate reason to think a simple insert, update, or delete wouldn't actually modify the correct number of rows in the database? Is it more plausible to only worry about catching an actual SqlException and then handling that, instead of doing the monotonous check for rows affected every time?
Thanks. Hope you all can help me out.
UPDATE
Thanks everyone for taking the time to answer. I still haven't 100% decided what setup I will use going forward, but here's what I have taken away from all of your responses.
Trust the DB and .Net libraries to handle a query and do their job as they were designed to do.
Use transactions in my stored procedures to roll back the query on any errors, and potentially use RAISERROR to throw those errors back to the .NET code as a SqlException, which I can handle with a try/catch. This approach would replace the problematic return-code checking.
Would there be any issue with the second bullet point that I am missing?
I guess the question becomes, "Why are you checking this?" If it's just because you don't trust the database to perform the query, then it's probably overkill. However, there could exist a logical reason to perform this check.
For example, I worked at a company once where this method was employed to check for concurrency errors. When a record was fetched from the database to be edited in the application, it would come with a LastModified timestamp. Then the standard CRUD operations in the data access layer would include a WHERE LastModified = @LastModified clause when doing an UPDATE and check the record-modified count. If no record was updated, it would assume a concurrency error had occurred.
I felt it was kind of sloppy for concurrency checking (especially the part about assuming the nature of the error), but it got the job done for the business.
What concerns me more in your example is the structure of how this is being accomplished. The 1 or 0 being returned from the data access code is a "magic number." That should be avoided. It's leaking an implementation detail from the data access code into the business logic code. If you do want to keep using this check, I'd recommend moving the check into the data access code and throwing an exception if it fails. In general, return codes should be avoided.
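As a sketch of that suggestion, based on the question's own DLInsert (UnexpectedRowCountException is a hypothetical custom exception):

public static void DLInsert(Person person)
{
    Database db = DatabaseFactory.CreateDatabase("dbConnString");
    using (DbCommand dbCommand = db.GetStoredProcCommand("dbo.Insert_Person"))
    {
        db.AddInParameter(dbCommand, "@FirstName", DbType.String, person.FirstName);
        db.AddInParameter(dbCommand, "@LastName", DbType.String, person.LastName);
        db.AddInParameter(dbCommand, "@Address", DbType.String, person.Address);

        int rows = db.ExecuteNonQuery(dbCommand);
        if (rows != 1)
        {
            // The magic number stays inside the data layer; callers see a
            // meaningful exception instead of interpreting a return code.
            throw new UnexpectedRowCountException(
                $"dbo.Insert_Person affected {rows} rows, expected 1.");
        }
    }
}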
Edit: I just noticed a potentially harmful bug in your code as well, related to my last point above. What if more than one record is changed? It probably won't happen on an INSERT, but could easily happen on an UPDATE. Other parts of the code might assume that != 1 means no record was changed. That could make debugging very problematic :)
On the one hand, most of the time everything should behave the way you expect, and on those times the additional checks don't add anything to your application. On the other hand, if something does go wrong, not knowing about it means that the problem may become quite large before you notice it. In my opinion, the little bit of additional protection is worth the little bit of extra effort, especially if you implement a rollback on failure. It's kinda like an airbag in your car... it doesn't really serve a purpose if you never crash, but if you do it could save your life.
I've always preferred to RAISERROR in my sproc and handle exceptions rather than counting. This way, if you update a sproc to do something else, like logging/auditing, you don't have to worry about keeping the row counts in check.
Though if you like the second check in your code, or would prefer not to deal with exceptions/RAISERROR, I've seen teams return 0 on successful sproc executions for every sproc in the db, and return another number otherwise.
It is absolutely overkill. You should trust that your core platform (.NET libraries, SQL Server) works correctly - you shouldn't be worrying about that.
Now, there are some related instances where you might want to test, like if transactions are correctly rolled back, etc.
If there is a need for that check, why not do it within the database itself? You save yourself a round trip, and it's done at a more 'centralized' stage - if you check in the database, you can be assured it's being applied consistently from any application that hits that database. Whereas if you put the logic in the UI, you need to make sure that every UI application that hits that particular database applies the correct logic and does it consistently.
I am going through this MSDN article by noted DDD expert Udi Dahan, where he makes a great observation that he said took him years to realize: "Bringing all e-mail addresses into memory would probably get you locked up by the performance police. Even having the domain model call some service, which calls the database, to see if the e-mail address is there is unnecessary. A unique constraint in the database would suffice."
In a LOB presentation that captures some add or edit scenario, you wouldn't enable the Save-type action until all edits were considered valid, so the first trade-off of doing the above is that you need to enable Save and be prepared to notify the user if the uniqueness constraint is violated. But how best to do that, say with NHibernate?
I figure it needs to follow the lines of the pseudo-code below. Does anyone do something along these lines now?
Cheers,
Berryl
try { }
catch (GenericADOException)
{
    // "Abort due to constraint violation\r\ncolumn {0} is not unique", columnName
    // (1) determine which db column violated uniqueness
    // (2) potentially map the column name to something in context for the user
    // (3) throw something that can be translated into a BrokenRule for the UI presentation
    // (4) reset the NHibernate session
}
The pessimistic approach is to check for uniqueness before saving; the optimistic approach is to attempt the save and handle the exception. The biggest problem with the optimistic approach is that you have to be able to parse the exception returned by the database to know that it's a specific unique constraint violation, rather than one of the myriad other things that can go wrong.
For this reason, it's much easier to check uniqueness before saving. It's a trivial database call to make this check: select 1 where email = 'newuser@somewhere.com'. It's also a better user experience to notify the user that the value is a duplicate (perhaps they already registered with the site?) before making them fill out the rest of the form and click save.
The unique constraint should definitely be in place, but the UI should check that the address is unique at the time it is entered on the form.
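A rough sketch of that pre-save check with NHibernate (the User entity and Email property are assumptions for illustration; the database unique constraint remains the real safety net):

using System.Linq;
using NHibernate;
using NHibernate.Linq;

public bool IsEmailTaken(ISession session, string email)
{
    // Translates to a cheap exists-style query against the mapped table.
    return session.Query<User>().Any(u => u.Email == email);
}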