How to clear the DataContext cache in LINQ to SQL - C#

I'm using LINQ to SQL to query a database. I only use LINQ to read data from the DB, and I make changes to it by other means. (This cannot be changed; it's a restriction of the app that we are extending. All updates must go through its SDK.)
This is fine, but I'm hitting some cache problems. Basically, I query a row using LINQ, then I delete it through external means, and then I create a new row externally. If I query that row again using LINQ, I get the old (cached) data.
I cannot turn off object tracking because that seems to prevent the DataContext from auto-loading associated properties (foreign keys).
Is there any way to clear the DataContext cache?
I found a method surfing the net, but it doesn't seem safe: http://blog.robustsoftware.co.uk/2008/11/clearing-cache-of-linq-to-sql.html
What do you think? What are my options?

If you want to refresh a specific object, then the Refresh() method may be your best bet.
Like this:
Context.Refresh(RefreshMode.OverwriteCurrentValues, objectToRefresh);
You can also pass an array of objects or an IEnumerable as the 2nd argument if you need to refresh more than one object at a time.
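If you need to refresh several rows at once, something like this should work (a minimal sketch; the Orders table and customerId filter are made-up names for illustration):

// Refresh every tracked row returned by the query, overwriting any locally
// cached values with whatever is currently in the database.
var staleRows = context.Orders.Where(o => o.CustomerId == customerId).ToList();
context.Refresh(RefreshMode.OverwriteCurrentValues, staleRows);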
Update
I see what you're talking about in the comments; in Reflector you can see this happening inside .Refresh():
object objectByKey = context.Services.GetObjectByKey(trackedObject.Type, keyValues);
if (objectByKey == null)
{
    throw Error.RefreshOfDeletedObject();
}
The method you linked seems to be your best option; the DataContext class doesn't provide any other way to clear a deleted row. The disposal checks and such live inside the ClearCache() method. It's really just checking for disposal and calling ResetServices() on the underlying CommonDataServices. The only ill effect would be clearing any pending inserts, updates, or deletes that you have queued.
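For reference, the linked approach boils down to invoking DataContext's internal ClearCache() method via reflection. A minimal sketch (the extension class is my own naming; since ClearCache() is internal and undocumented, it could change between framework versions):

using System.Data.Linq;
using System.Reflection;

public static class DataContextExtensions
{
    // Invokes the internal ClearCache() method, which checks for disposal and
    // resets the underlying CommonDataServices, dropping all tracked entities.
    public static void ClearCache(this DataContext context)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.NonPublic;
        typeof(DataContext).GetMethod("ClearCache", flags).Invoke(context, null);
    }
}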
There is one more option: can you fire up another DataContext for whatever operation you're doing? It wouldn't have any cache to it, but that does involve some computational cost, so if the pending inserts, updates, and deletes aren't an issue, I'd stick with the ClearCache() approach.
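For completeness, the fresh-context option might look like this (a sketch; MyDataContext and Orders stand in for your generated context and table):

// A brand-new context has an empty identity map, so this read always
// reflects what is actually in the database right now.
using (var freshContext = new MyDataContext(connectionString))
{
    var row = freshContext.Orders.SingleOrDefault(o => o.Id == id);
}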

I wrote this code to really CLEAR the "cached" entities by detaching them:

var entidades = Ctx.ObjectStateManager.GetObjectStateEntries(
    EntityState.Added | EntityState.Deleted | EntityState.Modified | EntityState.Unchanged);

foreach (var objectStateEntry in entidades)
    Ctx.Detach(objectStateEntry.Entity);

Where Ctx is my context.

You should be able to just requery the result sets that are using these objects. That would not pull a cached set, but would actually return the latest results. I know this may not be easy or feasible depending on how you set up your app...
HTH.

Related

Why is EF Extensions not throwing an exception when there is no such entity in the table to delete?

I am wondering why BulkDelete doesn't throw an exception if there is no such entity to delete in the database. I am forced to query the database for matching entities first, and only then call the BulkDelete method with the matched entities I got from that query. Does EF Extensions have some option to automate this?
You don't have to check the database. I am assuming you are using EF Extensions' BulkDelete. When you call BulkDelete you just delete it, but to make changes to the database you have to call .SaveChanges(), and this method returns the number of rows affected.
So if the number is 0, you know your DELETE failed. If the number is above 0, you know the DELETE was successful.
BulkDelete has been optimized for performance.
So indeed, there is no check first for whether the entity exists or not; we just perform the DELETE operation.
Something you could do on your side to make this easier is to use the BulkRead method before using BulkDelete:
var customers = context.Customers.BulkRead(deserializedCustomers);
Docs: https://entityframework-extensions.net/bulk-read
By using this method, you can easily compare against your current list to get, for example, the list of customers that don't exist in the database.
You could also get the rows-affected count and compare it with your list count: https://entityframework-extensions.net/rows-affected
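A rough sketch of that comparison (assuming a CustomerID key property; adjust the names to your model):

// BulkRead returns the rows that actually exist in the database for the
// given list, so anything it doesn't return is missing and wouldn't be deleted.
var existing = context.Customers.BulkRead(deserializedCustomers);
var existingIds = new HashSet<int>(existing.Select(c => c.CustomerID));
var missing = deserializedCustomers
    .Where(c => !existingIds.Contains(c.CustomerID))
    .ToList();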
I am wondering why BulkDelete doesn't throw an exception if there is no such entity to delete in the database?
Because there is nothing wrong with having a filter that ends up yielding 0 results. That is not inherently an erroneous state.
Sure, for a specific use case you may have been expecting to find at least something, but a generalized library cannot account for your specific expectation in your specific use case. There are plenty of cases where ending up not deleting something is not a problematic scenario, e.g. a daily cleanup job that removes old entries if there are any. If there are no old entries, that's not a reason to throw an exception. It just means that nothing needed to be deleted.
This is no different from why a foreach doesn't throw an exception when you pass it an empty collection. The foreach is still doing its job, i.e. performing the work for each item in the collection. There are 0 items, and therefore it performs the work 0 times.

Entity Framework Too slow /Memory leak

I'm doing a lot of work with Entity Framework, like millions of inserts and updates.
However, over time it gets slower and slower...
I tried using some ways to improve performance, like:
db.Configuration.AutoDetectChangesEnabled = false;
db.Configuration.ValidateOnSaveEnabled = false;
I also tried:
db.Table.AsNoTracking();
When I change all these things it really does get faster. However, memory usage starts to increase until it gives me an exception.
Has anyone had this situation?
Thanks
The DbContext stores all the entities you have fetched or added to a DbSet. As others have suggested, you need to dispose of the context after each group of operations (a set of closely-related operations - e.g. a web request) and create a new one.
In the case of inserting millions of entities, that might mean creating a new context every 1,000 entities for example. This answer gives you all you need to know about inserting thousands of entities.
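A rough sketch of that pattern (MyDbContext and Items are placeholder names; the batch size is just a starting point to tune):

// Recreate the context every N entities so the change tracker never holds
// more than one batch's worth of objects.
const int batchSize = 1000;
var db = new MyDbContext();
db.Configuration.AutoDetectChangesEnabled = false;
try
{
    for (int i = 0; i < items.Count; i++)
    {
        db.Items.Add(items[i]);
        if ((i + 1) % batchSize == 0)
        {
            db.SaveChanges();
            db.Dispose();
            db = new MyDbContext();
            db.Configuration.AutoDetectChangesEnabled = false;
        }
    }
    db.SaveChanges(); // flush the final partial batch
}
finally
{
    db.Dispose();
}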
If you are doing only inserts and updates, try executing raw SQL instead, e.g. db.Database.ExecuteSqlCommand(sql, parameters). (SqlQuery is meant for reads; ExecuteSqlCommand is the one that runs data-modifying statements.)
Entity Framework keeps all attached objects in memory, so having millions of them can cause what looks like a memory leak.
https://github.com/loresoft/EntityFramework.Extended offers a clean interface for doing faster bulk updates and deletes. I think it only works with SQL Server, but it may give you a quick solution to your performance issue.
Deletes can be done like this:
context.Users.Where(u => u.FirstName == "Firstname").Delete();
Updates can be done in a similar fashion:
context.Tasks.Where(t => t.StatusId == 1).Update(t => new Task { StatusId = 2 });
For millions of inserts and updates, everything gave me out-of-memory errors; I've tried it all.
It only worked for me when I stopped using the context and used ADO.NET or another micro-ORM like Dapper.
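For example, a set of updates with Dapper might look like this (a sketch; the Users table, users list, and connection string are placeholders). Dapper executes the command once per element of the parameter sequence, and nothing is held by a change tracker:

using System.Data.SqlClient;
using System.Linq;
using Dapper;

using (var connection = new SqlConnection(connectionString))
{
    // Runs the UPDATE once per anonymous object in the sequence.
    connection.Execute(
        "UPDATE Users SET FirstName = @FirstName WHERE Id = @Id",
        users.Select(u => new { u.FirstName, u.Id }));
}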

Multiple Execute Calls

I'm trying to update a field on a phone call entity and then close it. Currently, as far as I can tell, doing so takes two calls. But this is painfully slow: it has taken 30 minutes to process 60 phone calls, and I have around 200,000 to do. Is there a way to combine both into one call?
Here's my current code -
foreach (phonecall phonepointer in _businessEntityCollection.BusinessEntities.Cast<phonecall>()
    .Where(phonepointer => phonepointer.statecode.Value == PhoneCallState.Open))
{
    // Update fiserv_contactstatus value
    phonepointer.fiserv_contactstatus = Picklist;
    crmService.Update(phonepointer);

    // Cancel activity
    setStatePhoneCallRequest.PhoneCallState = PhoneCallState.Canceled;
    setStatePhoneCallRequest.PhoneCallStatus = 200011;
    setStatePhoneCallRequest.EntityId = phonepointer.activityid.Value;
    crmService.Execute(setStatePhoneCallRequest);
}
Unfortunately, there's little you can do.
You COULD try and use the new SDK and the XRM context (strongly typed classes) to batch-update the phone call entities (this should be faster), but you'll still need to use the old-fashioned CrmService to actually change the state of each entity, one by one.
EDIT:
You could also directly change the state of the entities in the database, but this should be your last resort, as manual changes to the CRM DB are unsupported and dangerous.
Seriously, last resort! No, I'm NOT joking!

new objects added during long loop

We currently have a production application that runs as a Windows service. Many times this application will end up in a loop that can take several hours to complete. We are using Entity Framework for .NET 4.0 for our data access.
I'm looking for confirmation that if we load new data into the system, after this loop is initialized, it will not result in items being added to the loop itself. When the loop is initialized we are looking for data "as of" that moment. Although I'm relatively certain that this will work exactly like using ADO and doing a loop on the data (the loop only cycles through data that was present at the time of initialization), I am looking for confirmation for co-workers.
Thanks in advance for your help.
// Update: here's some sample code in C#. The question is the same: will the enumeration change if new items are added to the table that EF is querying?
IEnumerable<myobject> myobjects = (from o in db.theobjects where o.id == myID select o);
foreach (myobject obj in myobjects)
{
    // perform action on obj here
}
It depends on your precise implementation.
Once a query has been executed against the database, the results of that query will not change (assuming you aren't using lazy loading). To ensure this, you can dispose of the context after retrieving the query results; this effectively "cuts the cord" between the retrieved data and the database.
Lazy loading can result in a mix of "initial" and "new" data; however once the data has been retrieved it will become a fixed snapshot and not susceptible to updates.
You mention this is a long running process; which implies that there may be a very large amount of data involved. If you aren't able to fully retrieve all data to be processed (due to memory limitations, or other bottlenecks) then you likely can't ensure that you are working against the original data. The results are not fixed until a query is executed, and any updates prior to query execution will appear in results.
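To make the distinction concrete, here is a small sketch using the question's own names:

// Deferred: the query runs each time it is enumerated, so rows added in
// the meantime can show up in a later enumeration.
IEnumerable<myobject> deferred = from o in db.theobjects where o.id == myID select o;

// Snapshot: ToList() executes the query immediately; the list's contents
// are fixed in memory and unaffected by later inserts.
List<myobject> snapshot = (from o in db.theobjects where o.id == myID select o).ToList();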
I think your best bet is to change the logic of your application so that when the "loop" logic is deciding whether to do another iteration or exit, you take the opportunity to load any newly added items into the list. See the pseudocode below:

var repo = new Repository();
while (repo.HasMoreItemsToProcess())
{
    var entity = repo.GetNextItem();
    // process entity here
}
Let me know if this makes sense.
The easiest way to ensure this happens, if the data itself isn't too big, is to convert the data you retrieve from the database to a List<>, e.g., something like this (pulled at random from my current project):
var sessionIds = room.Sessions.Select(s => s.SessionId).ToList();
And then iterate through the list, not through the IEnumerable<> that would otherwise be returned. Converting it to a list triggers the enumeration, and then throws all the results into memory.
If there's too much data to fit into memory, and you need to stick with an IEnumerable<>, then the answer to your question depends on various database and connection settings.
I'd take a snapshot of the IDs to be processed, quickly and as a transaction, then work through that list the way you do today.
In addition to accomplishing the goal of not changing the sample mid-stream, this also gives you the ability to extend your solution to track status on each item as it's processed. For a long-running process, that can be very helpful for progress reporting, restart/retry capabilities, etc.
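A sketch of that pattern, reusing the question's names (the key property is an assumption; substitute your table's primary key):

// Snapshot the keys up front; rows inserted after this point are excluded.
var keys = (from o in db.theobjects where o.id == myID select o.key).ToList();

foreach (var key in keys)
{
    var obj = db.theobjects.Single(o => o.key == key);
    // perform action on obj here; this is also where per-item
    // status tracking for restart/retry could live
}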

how does linq2sql keep track of database objects?

When using LINQ to SQL, everything automagically works. My experience is that going with the flow is not always the best solution, and understanding how something works internally is better, so you can use the technique optimally.
So, my question is about LINQ to SQL.
If I do a query and get some database objects, or I create a new one, somehow the DataContext keeps references to these objects. If something changes in one of the objects, the context 'knows' what has changed and needs updating.
If my references to the object are set to null, does this mean that the context also removes its link to this object? Or is the context slowly getting filled with tons of references, keeping my database objects from being garbage collected?
If not, how does this work?
Also, isn't it very slow for the context to always go through the entire list to see what changed and what to update?
Any insight in how this works would be excellent!
thanks
Yes, the context keeps references to the loaded objects. That's one of the reasons why a single instance isn't meant to be shared across different requests.
It keeps lists for the inserts/deletes. I am not sure whether it captures updates by adding them to a list, or whether it loops over everything at the end. But you shouldn't be loading large sets of data at a time anyway, because that alone would be a bigger performance hit than any final check it might do on the list.
The DataContext subscribes to your objects' PropertyChanged event so it knows when an object is modified. At that point it clones the original object and keeps the clone, so it can compare the two objects later when you call SubmitChanges().
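A simplified sketch of what the .dbml-generated classes do to make this possible (hand-written here for illustration; the real generated code is more elaborate):

using System.ComponentModel;

public partial class Customer : INotifyPropertyChanging, INotifyPropertyChanged
{
    private string _name;

    public event PropertyChangingEventHandler PropertyChanging;
    public event PropertyChangedEventHandler PropertyChanged;

    public string Name
    {
        get { return _name; }
        set
        {
            if (_name != value)
            {
                // The DataContext hooks these events; the "changing" event is
                // its cue to clone the original values before they are overwritten.
                if (PropertyChanging != null)
                    PropertyChanging(this, new PropertyChangingEventArgs("Name"));
                _name = value;
                if (PropertyChanged != null)
                    PropertyChanged(this, new PropertyChangedEventArgs("Name"));
            }
        }
    }
}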
If my references to the object are set to null, does this mean that the context object also removes it's link to this object?
Edit: No. Sorry, my original answer misinterpreted what you had written. In that case the DataContext still holds a reference to both objects, but it will remove the relationship between those two objects on the next SubmitChanges().
Be careful, though. If you created your own objects instead of using the ones generated from the .dbml, the "magic" the DataContext performs might not work properly.
