Should I add Locks or TransactionScope when using .Net Cache? - c#

I’m using HttpContext.Current.Cache to cache data from the DB (.Net 4 web application).
I want to make sure I don't run into any thread synchronization problems.
Scenario: 3 users pointing to the same Company object:
User A:
    Profile.Company.Name = "CompX";
    Profile.Company.Desc = "CompXDesc";
    Profile.Company.Update(); // update DB
User B:
    string Name = Profile.Company.Name;
User C:
    Profile.Company.Name = "CompY";
    Profile.Company.Update(); // update DB
Questions:
Does the Cache provide any type of locking?
Should I add locks like ReaderWriterLockSlim (and if so, how exactly)?
Existing Code:
ProfileBLL:
public CompanyBLL Company {
    get {
        return CompanyBLL.GetById(this.Company_ID);
    }
}
// HttpContext.Current.Cache
public static CompanyBLL GetById(int Company_ID) {
    string key = "GetById_" + Company_ID.ToString();
    CompanyBLL ret = null;
    if (Cache[key] != null) {
        ret = (CompanyBLL)Cache[key];
    }
    else {
        ret = DAL_Company<CompanyBLL>.GetById(Company_ID);
        Cache[key] = ret;
    }
    return ret;
}
Another option is to wrap any DB update in a TransactionScope:
User A:
using (TransactionScope Scope = new TransactionScope()) {
    Profile.Company.Name = "CompX";
    Profile.Company.Desc = "CompXDesc";
    Profile.Company.Update(); // update DB
    Scope.Complete(); // COMMIT TRANS
}
User B:
    string Name = Profile.Company.Name;
Will it solve any threading problems?
Thanks

You have nothing to worry about there: the Cache class itself is thread-safe.

If you're using SQL Server as a backing store, SQL will lock the row as it's being written (under pessimistic concurrency, which is the default), so you won't have to worry about that either. Transactions aren't going to provide thread safety, but you should use them anyway when making changes that need to be consistent.
You can always add a lock around any "write" methods you have.
If you want to make sure that any user calling a "read" method gets the absolute latest value, then put a lock around those methods as well.
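For the "how exactly" part, here is a minimal sketch using ReaderWriterLockSlim around the question's GetById (the CacheLock field name is mine; CompanyBLL and DAL_Company are from the question). Note the lock only guards the lookup-and-populate step; it cannot protect the fields of a CompanyBLL instance that is already being shared between requests:

private static readonly ReaderWriterLockSlim CacheLock = new ReaderWriterLockSlim();

public static CompanyBLL GetById(int Company_ID)
{
    string key = "GetById_" + Company_ID.ToString();

    CacheLock.EnterReadLock();
    try
    {
        var cached = Cache[key] as CompanyBLL;
        if (cached != null) return cached;
    }
    finally
    {
        CacheLock.ExitReadLock();
    }

    CacheLock.EnterWriteLock();
    try
    {
        // Re-check: another thread may have populated the entry between
        // releasing the read lock and acquiring the write lock.
        var ret = Cache[key] as CompanyBLL;
        if (ret == null)
        {
            ret = DAL_Company<CompanyBLL>.GetById(Company_ID);
            Cache[key] = ret;
        }
        return ret;
    }
    finally
    {
        CacheLock.ExitWriteLock();
    }
}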


Strange SaveChanges behavior in Entity Framework and SQL Server

I have some code (you can check the project on GitHub); the error occurs in the UploadController method GetExtensionId.
Database diagram:
Code (in this controller I send files to upload):
[HttpPost]
public ActionResult UploadFiles(HttpPostedFileBase[] files, int? folderid, string description)
{
    foreach (HttpPostedFileBase file in files)
    {
        if (file != null)
        {
            string fileName = Path.GetFileNameWithoutExtension(file.FileName);
            string fileExt = Path.GetExtension(file.FileName)?.Remove(0, 1);
            int? extensionid = GetExtensionId(fileExt);
            if (CheckFileExist(fileName, fileExt, folderid))
            {
                fileName = fileName + $" ({DateTime.Now.ToString("dd-MM-yy HH:mm:ss")})";
            }
            File dbFile = new File();
            dbFile.folderid = folderid;
            dbFile.displayname = fileName;
            dbFile.file_extensionid = extensionid;
            dbFile.file_content = GetFileBytes(file);
            dbFile.description = description;
            db.Files.Add(dbFile);
        }
    }
    db.SaveChanges();
    return RedirectToAction("Partial_UnknownErrorToast", "Toast");
}
I want to create the Extension in the database if it does not exist yet. I do that with GetExtensionId:
private static object locker = new object();

private int? GetExtensionId(string name)
{
    int? result = null;
    lock (locker)
    {
        var extItem = db.FileExtensions.FirstOrDefault(m => m.displayname == name);
        if (extItem != null) return extItem.file_extensionid;
        var fileExtension = new FileExtension()
        {
            displayname = name
        };
        db.FileExtensions.Add(fileExtension);
        db.SaveChanges();
        result = fileExtension.file_extensionid;
    }
    return result;
}
In the SQL Server database I have a unique constraint on the displayname column of FileExtension.
The problem starts only when I upload a few files with the same extension and that extension does not exist in the database yet.
If I remove the lock, GetExtensionId throws an exception about the unique constraint.
Maybe, for some reason, the next iteration of the foreach loop calls GetExtensionId without waiting? I don't know.
But with the lock in place my code works fine.
If you know why this happens, please explain.
This sounds like a simple concurrency race condition. Imagine two requests come in at once; they both check FirstOrDefault, which correctly says "nope" for both. Then they both try to insert; one wins, one fails because the DB has changed. While EF manages transactions around SaveChanges, that transaction doesn't start from when you query the data initially.
The lock appears to work by preventing them getting into the lookup code at the same time, but this is not a reliable solution for this in general, as it only works inside a single process, let alone a single node.
So: a few options here:
your code could detect the unique-constraint violation exception and recheck from the start (FirstOrDefault etc.), which keeps things simple in the success case (which is going to be the majority of the time) and not horribly expensive in the failure case (just an exception and an extra DB hit) - pragmatic enough (see the sketch after this list)
you could move the "select if exists, insert if it doesn't" into a single operation inside the database inside a transaction (ideally serializable isolation level, and/or using the UPDLOCK hint) - this requires writing TSQL yourself, rather than relying on EF, but minimises round trips and avoids writing "detect failure and compensate" code
you could perform the selects and possible inserts inside a transaction via EF - complicated and messy, frankly: don't do this (and it would again need to be serializable isolation level, but now the serializable transaction spans multiple round trips, which can start to impact locking, if at scale)
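A hedged sketch of the first option, assuming an EF6-style DbContext (DbUpdateException is what SaveChanges throws there; on SQL Server the inner SqlException for a unique-constraint violation carries error number 2601 or 2627):

private int? GetExtensionId(string name)
{
    // Cheap path first: most of the time the extension already exists.
    var extItem = db.FileExtensions.FirstOrDefault(m => m.displayname == name);
    if (extItem != null) return extItem.file_extensionid;

    var fileExtension = new FileExtension { displayname = name };
    db.FileExtensions.Add(fileExtension);
    try
    {
        db.SaveChanges();
        return fileExtension.file_extensionid;
    }
    catch (DbUpdateException)
    {
        // A concurrent request inserted the same displayname first.
        // Discard our failed insert and re-read the winning row.
        db.Entry(fileExtension).State = EntityState.Detached;
        return db.FileExtensions
            .First(m => m.displayname == name).file_extensionid;
    }
}

Ideally you would inspect the inner SqlException's error number before concluding it was the unique constraint, and rethrow otherwise.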

Bulk insert with EF

I need to insert some objects (about 4 million) into the database using C# and EF (using .NET 3.5). My method that adds the objects is called in a loop:
private DBModelContainer AddToContext(DBModelContainer db, tblMyTable item, int count)
{
    db.AddTottblMyTable(item);
    if ((count % 10000 == 0) || (count == this.toGenerate))
    {
        try
        {
            db.SaveChanges();
        }
        catch (Exception e)
        {
            Console.WriteLine(e.StackTrace);
        }
    }
    return db;
}
How do I detach the added objects (of type tblMyTable) from the context object? I don't need them for later use, and when more than 300000 objects have been added, the execution time between saves (db.SaveChanges()) increases considerably.
Regards
Entity Framework may not be the best tool for this type of operation. You may be better off with plain ADO.NET or some stored procedures. But if you had to use it, here are a number of suggestions:
Keep the active context graph small by using a new context for each unit of work
Turn off AutoDetectChangesEnabled - context.Configuration.AutoDetectChangesEnabled = false;
Batching: in your loop, call SaveChanges periodically
EDIT
using (var db = new DBModelContainer())
{
    db.tblMyTable.MergeOption = MergeOption.NoTracking;
    // Narrow the scope of your db context
    db.AddTottblMyTable(item);
    db.SaveChanges();
}
Keeping a long-running db context is not advisable, so consider refactoring your Add method so it does not keep reusing the same context.
See Rick Strahl's post on bulk inserts for more details
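For illustration, a rough sketch combining the suggestions above: batches of 10,000 with a fresh context per batch. This assumes a DbContext-style API (AutoDetectChangesEnabled and DbSet.Add), which the second suggestion implies; the items list is hypothetical:

const int batchSize = 10000;
for (int start = 0; start < items.Count; start += batchSize)
{
    using (var db = new DBModelContainer())
    {
        db.Configuration.AutoDetectChangesEnabled = false;
        foreach (var item in items.Skip(start).Take(batchSize))
        {
            db.tblMyTable.Add(item);
        }
        db.SaveChanges(); // one save per batch
    } // the tracked graph is discarded here, so it never grows past one batch
}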
AFAIK EF does not directly support bulk insert, so doing this manually would be tedious. Consider trying EntityFramework.BulkInsert:
using (var ctx = GetContext())
{
    using (var transactionScope = new TransactionScope())
    {
        // some stuff in dbcontext
        ctx.BulkInsert(entities);
        ctx.SaveChanges();
        transactionScope.Complete();
    }
}
You may also try the Unit of Work pattern: don't save the context (SaveChanges) on every record insert, but save it once at the end.

Executing part of code exactly 1 time inside Parallel.ForEach

I have to query my company's CRM solution (Oracle's Right Now) for our 600k users, and update them there if they exist or create them if they don't. To know whether a user already exists in Right Now, I consume a third-party WS, and with 600k users this can be a real pain due to the time it takes to get each response (around 1 second). So I managed to change my code to use Parallel.ForEach, querying each record in just 0.35 seconds, and adding it to a List<User> of records to be created or updated (Right Now is kinda dumb, so I need to separate them into 2 lists and call 2 distinct WS methods).
My code used to run perfectly before multithreading, but took too long. The problem is that I can't make a batch too large or I get a timeout when I try to update or create via the Web Service. So I'm sending around 500 records at once, and when the critical code part runs, it executes many times.
Parallel.ForEach(boDS.USERS.AsEnumerable(), new ParallelOptions { MaxDegreeOfParallelism = -1 }, row =>
{
    ...
    user = null;
    user = QueryUserById(row["USER_ID"].Trim());
    if (user == null)
    {
        isUpdate = false;
        gObject.ID = new ID();
    }
    else
    {
        isUpdate = true;
        gObject.ID = user.ID;
    }
    ... fill user attributes as generic fields ...
    gObject.GenericFields = listGenericFields.ToArray();
    if (isUpdate)
        listUserUpdate.Add(gObject);
    else
        listUserCreate.Add(gObject);
    if (i == batchSize - 1 || i == (boDS.USERS.Rows.Count - 1))
    {
        UpdateProcessingOptions upo = new UpdateProcessingOptions();
        CreateProcessingOptions cpo = new CreateProcessingOptions();
        upo.SuppressExternalEvents = false;
        upo.SuppressRules = false;
        cpo.SuppressExternalEvents = false;
        cpo.SuppressRules = false;
        RNObject[] results = null;
        // <Critical_code>
        if (listUserCreate.Count > 0)
        {
            results = _service.Create(_clientInfoHeader, listUserCreate.ToArray(), cpo);
        }
        if (listUserUpdate.Count > 0)
        {
            _service.Update(_clientInfoHeader, listUserUpdate.ToArray(), upo);
        }
        // </Critical_code>
        listUserUpdate = new List<RNObject>();
        listUserCreate = new List<RNObject>();
    }
    i++;
});
I thought about using a lock or mutex, but that isn't going to help me, since the threads will just wait and execute afterwards. I need some solution that executes that part of the code only ONCE, on only ONE thread. Is it possible? Can anyone shed some light?
Thanks and kind regards,
Leandro
As you stated in the comments, you're declaring the variables outside of the loop body. That's where your race conditions originate.
Let's take the variable listUserUpdate, for example. It's accessed randomly by parallel executing threads. While one thread is still adding to it, e.g. in listUserUpdate.Add(gObject);, another thread could already be resetting the lists in listUserUpdate = new List<RNObject>(); or enumerating it in listUserUpdate.ToArray().
You really need to refactor that code to:
make each loop iteration run as independently from the others as you can, by moving variables inside the loop body, and
access shared data in a synchronized way, using locks and/or concurrent collections (see the sketch below).
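A hedged sketch of what that refactoring could look like (names come from the question; BuildGenericObject is a hypothetical helper standing in for the elided attribute-filling code, and the flush is moved after the loop so it runs exactly once, on one thread; requires System.Collections.Concurrent and System.Linq):

// Thread-safe output collections: Add needs no lock.
var toCreate = new ConcurrentBag<RNObject>();
var toUpdate = new ConcurrentBag<RNObject>();

Parallel.ForEach(boDS.USERS.AsEnumerable(), row =>
{
    // Everything this iteration touches is local to it.
    var user = QueryUserById(row["USER_ID"].ToString().Trim());
    RNObject gObject = BuildGenericObject(row, user); // hypothetical helper
    if (user == null) toCreate.Add(gObject);
    else toUpdate.Add(gObject);
});

// Flush once, on the calling thread, in batches of 500 to avoid the
// web-service timeout.
var cpo = new CreateProcessingOptions { SuppressExternalEvents = false, SuppressRules = false };
RNObject[] created = toCreate.ToArray();
for (int i = 0; i < created.Length; i += 500)
{
    _service.Create(_clientInfoHeader, created.Skip(i).Take(500).ToArray(), cpo);
}
// ...and the same for toUpdate with UpdateProcessingOptions.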
You can use the double-checked locking pattern. This is usually used for singletons, but you're not making a singleton here, so generic singleton helpers like Lazy<T> do not apply.
It works like this:
Separate out your shared data into some sort of class:
class QuerySharedData
{
    // All the write-once-read-many fields that need to be shared between threads
    public QuerySharedData()
    {
        // Compute all the write-once-read-many fields. Or use a static Create method if that's handy.
    }
}
In your outer class add the following:
object padlock = new object();
volatile QuerySharedData data;
In your thread's callback delegate, do this:
if (data == null)
{
    lock (padlock)
    {
        if (data == null)
        {
            data = new QuerySharedData(); // this does all the work to initialize the shared fields
        }
    }
}
var localData = data;
Then use the shared query data from localData. By grouping the shared query data into a subordinate class you avoid the necessity of making its individual fields volatile.
More about volatile here: Part 4: Advanced Threading.
Update: my assumption here is that all the classes and fields held by QuerySharedData are read-only once initialized. If this is not true, for instance if you initialize a list once but add to it from many threads, this pattern will not work for you. You will have to consider using things like thread-safe collections instead.
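For that mutable case, a minimal illustrative sketch with a thread-safe collection (the key and value types here are arbitrary; ConcurrentDictionary and ConcurrentBag live in System.Collections.Concurrent):

// Many threads can record results without any explicit locking.
var resultsByKey = new ConcurrentDictionary<string, ConcurrentBag<int>>();

// Inside a parallel callback:
var bag = resultsByKey.GetOrAdd("results", _ => new ConcurrentBag<int>());
bag.Add(42); // safe from any thread

// Caveat: GetOrAdd may invoke the factory more than once under contention;
// only one created bag is kept, so the factory must be cheap and free of
// side effects.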

How to implement locking in a shared cachecontroller?

I have a static class which handles the cache read/write for frequently used data.
The code is this:
public static T GetFromCache<T>(double seconds, string cacheId, Func<T> method) where T : class
{
    HttpContext ctx = HttpContext.Current;
    object temp = null;
    temp = ctx.Cache[cacheId];
    if (temp == null)
    {
        lock (Sync)
        {
            temp = ctx.Cache[cacheId];
            if (temp == null)
            {
                temp = method.Invoke();
                AddToCache(temp as T, seconds, cacheId);
                return temp as T;
            }
        }
    }
    if (temp is T)
    {
        return (T)temp;
    }
    return null;
}
The code is used by various callers to read data from and write data to the cache.
Now I have a Sync object (private static readonly object Sync = new object();) which gets locked when data gets written to the cache.
As this code is called by multiple callers, I would like to create a list of Sync objects, one for each caller. (By "caller" I don't mean the user, but the calling code; I would identify a caller by the signature of the method parameter.)
The reason I want this is that every piece of calling code can have its own lock object; otherwise (I think) every call to this cache controller from different callers will use the same lock object. Then the caching of the list of countries will also lock the caching of the list of states, while with two different lock objects they would not be in each other's way.
I would then use the CacheItemRemovedCallback method to remove the lock items from the list.
The question is this: How can I do that?
Having one Sync object per user would defeat the purpose of synchronization: each user would hold its own lock, and there would be a chance that for the same cacheId you end up invoking the method multiple times. This might result in data becoming inconsistent.
If you wish to keep one Sync object per user, then it's better to make use of session variables, a per-user cache, or something similar; otherwise users will virtually end up messing with each other's cacheId results.
If you have a scenario where there can be multiple readers of the data but only a single writer at a time, try using ReaderWriterLockSlim. This is very fast compared to lock in a multi-user scenario.
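If what you are really after is one lock per cacheId (so that caching the countries list does not block caching the states list), here is a hedged sketch using a dictionary of lock objects (ConcurrentDictionary requires .NET 4; the Locks field name is mine):

private static readonly ConcurrentDictionary<string, object> Locks =
    new ConcurrentDictionary<string, object>();

public static T GetFromCache<T>(double seconds, string cacheId, Func<T> method) where T : class
{
    HttpContext ctx = HttpContext.Current;
    T temp = ctx.Cache[cacheId] as T;
    if (temp != null) return temp;

    // GetOrAdd may create a throw-away object under contention, which is
    // harmless: only one object is ever stored per key.
    lock (Locks.GetOrAdd(cacheId, _ => new object()))
    {
        // Re-check inside the lock: another caller may have filled it.
        temp = ctx.Cache[cacheId] as T;
        if (temp == null)
        {
            temp = method.Invoke();
            AddToCache(temp, seconds, cacheId);
        }
        return temp;
    }
}

This keeps unrelated cache keys out of each other's way while still guaranteeing that method runs at most once per key at a time. Note the entries in Locks are never removed here; the CacheItemRemovedCallback idea from the question could be used to evict them.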
Update1
Assuming the cacheId is unique per caller and not shared among them, you can use the following code. No lock is needed here: HttpContext.Cache is thread-safe, meaning you can safely read and write values in the Cache from multiple threads. But if the cached value's reference itself is shared among concurrent calls, you must synchronize access to that object yourself.
public static T GetFromCache<T>(double seconds, string cacheId, Func<T> method) where T : class
{
    HttpContext ctx = HttpContext.Current;
    object temp = ctx.Cache[cacheId];
    if (temp == null)
    {
        temp = method.Invoke();
        AddToCache(temp as T, seconds, cacheId);
    }
    return temp as T;
}
Regards

How to access related objects after using statement has finished in Entity Framework?

I use EF 3.5 in VS 2010. I have a method which returns a struct. In the struct there is an object armatuur. When the struct is returned I want to access the related objects of the armatuur instance. However, that fails with an exception (see below).
The method returning the struct:
public LampPostDetail getLamppostInfo(int id)
{
    LampPostDetail lpd;
    lpd.xPos = 0;
    lpd.ypos = 0;
    lpd.armatuur = new Armatuur();
    // get the info from object
    using (var db = new OvisionDBEntities())
    {
        var objects = from o in db.Objects
                      where o.ObjectId == id
                      select o;
        foreach (OVSL.Data.Object o in objects)
        {
            lpd.xPos = o.XCoordinatie;
            lpd.ypos = o.YCoordinatie;
            lpd.armatuur = o.Armatuur; // which is a table in my db
        }
        return lpd;
    }
}
The struct:
public struct LampPostDetail
{
    #region [ Data Members (14) ]
    // lamppost info
    public double? xPos;
    public double? ypos;
    // a lamppost can have several armaturen
    public OVSL.Data.Armatuur armatuur; // is a table in my db
    #endregion [ Data Members ]
}
When doing this in my client:
LampPostDetail lpd = client.getLamppostInfo(id);
string brand = lpd.armatuur.producer.name; // producer is a related object of armatuur
I get an ObjectDisposedException. I understand that this happens because the LampPostDetail object is disposed after the using block is finished. But how do I get this to work? Retrieving all the information I need (like the brand name) before I return it to the client is not an option.
The only thing that gets disposed here is the OvisionDBEntities context. After that, no lazy loading is possible. How to deal with that? In fact your question is: what can you do to feed a client with all the data that is potentially required for user actions at any time? I see three or four options:
The standard way to enable access to navigation properties of entities after context disposal is calling Include: from o in db.Objects.Include("Armatuur.Producer")... (see the sketch after this list). But that's clearly not an option for you.
Let the context live and rely on lazy loading to fetch data on demand. This may be an option for you. But long-lived contexts may cause problems, like gradually declining performance as the internal change-tracking record grows, and stale cached data giving rise to refresh/reload statements scattered all over the place.
Instead of navigation properties/lazy loading, fetch data on demand from a service/repository layer that uses context instances per call. I think this option could work well for you.
More a functional than a technical option: design use cases that can do with less data (so that Include may suffice after all). No one can take in a grid with thousands of rows and tens of columns. Well-designed user interaction can drastically reduce the amount of data that is pumped into a client (and I'm only at the beginning of getting this).
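For completeness, the first option would look roughly like this inside the question's method (string-based Include exists on ObjectQuery in EF 3.5; the path names come from the question's model):

using (var db = new OvisionDBEntities())
{
    // Eagerly load Armatuur and its Producer before the context is
    // disposed, so no lazy loading is needed afterwards.
    var o = (from obj in db.Objects.Include("Armatuur.Producer")
             where obj.ObjectId == id
             select obj).FirstOrDefault();
    // o.Armatuur and o.Armatuur.Producer remain usable after disposal.
}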
It's not your LampPostDetail that is getting disposed; it is the Armatuur object you retrieved from the database that it references, or an object that Armatuur is referencing.
I can see two options for getting around this. The first is to make the entity context an optional parameter of your getLamppostInfo method. Since you are using 3.5 you will have to add an overload to keep the original functionality:
public LampPostDetail getLamppostInfo(int id, OvisionDBEntities context)
{
    ...
    OvisionDBEntities db = null;
    try
    {
        if (context == null)
            db = new OvisionDBEntities();
        else
            db = context;
        ...
    }
    finally
    {
        if (context == null && db != null)
            db.Dispose(); // or Close, maybe
    }
    return lpd;
}

// Overloaded method to keep the original functionality (C# 3.5 does not
// have optional parameters)
public LampPostDetail getLamppostInfo(int id)
{
    return getLamppostInfo(id, null);
}
Now you can call it as:
using (var db = new OvisionDBEntities())
{
    LampPostDetail lpd = client.getLamppostInfo(id, db);
    string brand = lpd.armatuur.producer.name;
}
And your objects will still exist when you try to reference them.
The other option is to detach your referenced objects from the entity context before disposing of it:
db.Detach(o.Armatuur);
However, I don't believe that detaches any objects referenced by that object, so you would have to iterate the reference trees and detach those objects as well.
