EF Code First object graph memory use in WPF app - c#

I'm building an app to display the historical log of changes from a source control repository. The app is implemented in .NET 4 WPF and Entity Framework Code First.
One of the problems I'm getting is that over time, as more log entries are added to the log, the application uses more and more memory and doesn't release references to the log entries. Each log entry holds a list of "changed files" and for each changed file, the before and after version of the file.
The UI displays a list of log entries and the diff between the old and new version of the currently selected LogEntry and ChangedFile. The data model is roughly as follows:
public class LogSubscription
{
    public List<LogEntry> Log { get; set; }
}
public class LogEntry
{
    public List<ChangedFile> ChangedFiles { get; set; }
}
public class ChangedFile
{
    public string OldVersion { get; set; }
    public string NewVersion { get; set; }
}
As I'm using EF Code First, the database is queried and the object model is built automatically by simply accessing the List properties. What I'd like to do is somehow de-reference the ChangedFiles list after a certain time and have the database re-queried and the object model rebuilt as necessary (i.e. the user has clicked back onto the log entry).
Is there any way to do this with EF Code First? Or should I be taking a different approach to control the memory used?
The app and full source code are hosted on GitHub here: https://github.com/tomhunter-gh/SourceLog

As said in the comment: this is likely to happen with a static context that is never disposed. I see in the source that there is a ThreadStaticContextBackground which is used by a LogEntry: in an active-record like pattern a LogEntry saves itself by its MarkAsReadAndSave method, for which it needs a context.
You probably did this (I did not really scrutinize the source) to avoid creating and disposing many contexts while saving log entries. But I think you should rethink the active record approach. Log entries are saved by the MainWindowViewModel. The view model should instead invoke a service method that saves log entries using a short-lived context and possibly caches the entries so they remain available for display etc. It should also pass the whole RemovedItems collection from DgLogSelectionChanged to that service, so the items are handled by one context instead of one context per item. Does that make any sense?

Related

EF6 SaveChanges fails claiming field supplied is NULL

I have been using EF for a while, but I am not sure what is missing, as I am following a pattern I have used several times in the past. This is the SQL table definition:
Table LogTable
Columns
LogID (int, Identity)
fk_ref (int, not null)
action (nvarchar(60))
notes (nvarchar(200))
This is the code (names changed for ease of reading/understanding)
using (myEntity _me = new myEntity())
{
    LogTable _lt = new LogTable();
    _lt.fk_ref = 10;
    _lt.action = "Some action";
    _lt.notes = "even more text";
    _me.LogTable.Add(_lt);
    _me.SaveChanges();
}
This is where it blows up claiming that the field "fk_ref" is null.
When I go to the edmx and ModelBrowser all the fields are represented.
When I check the select SQL on the table name "_me.LogTable" during debug the SELECT statement is missing the field it claims as NULL.
I hope I have given enough information to turn on the light bulb in my head.
NOTE: I have tried dropping and re-adding the table. Gone as far as drop, clean, rebuild, re-add and no change.
Would really appreciate any help.
UPDATE: Since this is new functionality I took the liberty of breaking the foreign key enforcement on the reference table and ran the code as demonstrated above. I also removed the Not Null limitation. It wrote out the record but put a NULL in the fk_ref field.
UPDATE 2 As someone asked for it. This is the CS modified to match the shortened definition above.
public LogTable()
{
this.fk_ref = 0;
}
public int LogID { get; set; }
public Nullable<int> fk_ref { get; set; }
public string action { get; set; }
public string notes { get; set; }
prior to the changes I mentioned in the first update it was
public LogTable()
{
}
public int LogID { get; set; }
public int fk_ref { get; set; }
public string action { get; set; }
public string notes { get; set; }
UPDATE 3 Moving ahead with this, I saved a record via the code above and, while debugging, checked the DB for the value inserted in the fk_ref field; it was null. So I fetched the record back into the app via the LogID, manually set the field value to a random number and called SaveChanges again. Still null. Here is the code following the SaveChanges() above
//... prior code ...
// assume that 4 is the log id of the record just inserted
// and 1000 is the fk_ref intended to be inserted
LogTable _new = _me.LogTable.Where(p => p.LogID == 4).FirstOrDefault();
// when I inspect _new the fk_ref post save changes the value is 1000
_new.fk_ref = 999;
_me.SaveChanges();
Retrieving the record from the db again fk_ref is still null
Found the Answer
I have no idea how to categorize this but the answer was in a scope not included in the original question. My thanks to all who responded as you pushed me to look under different rocks - sometimes that is all you need.
additional scope for question
This project is part of an enterprise-wide management tool delivered via a One-click interface. Each of the 20 or so different business surfaces has its own management project, which may or may not be written within the IT development group. (Democratization and all that). The subordinate projects are user control DLLs that are then distributed along with the main shell. The entire solution gets data from several servers and more than a hundred DBs. The connection strings are managed through the main shell. What I am currently working on is a project that has a number of shared components and controls, including an enhanced logger. (log4j and all related services are outlawed here). I was developing a new control for shared TimeKeeping that uses the shared controls project. More graphically it looks like this:
Timekeeping had a reference to the Shared Controls project, which had an EF object that connected to the DB that had the Log Table.
Project 1 had an EF object that connected to the DB that had the Log Table.
The Shared Controls EF object was up to date. (The fk_ref field was added some time back.)
Project 1 has been around a while (longer than the shared controls) and its EF object was out of date.
Even though Timekeeping did not have a reference to Project 1, when the EF object in the Shared Controls was writing it used the definition from Project 1.
Oddly enough, on a read the Shared Controls retrieved all the fields.
How I "proved" it
Created a mock up of the original application (WPF)
Added a user control project with EF connecting to a DB
used the control to write data to the DB
Closed out of VS
Used SSMS to modify the table
Opened VS added a new project with an EF project that connects to the same table
Added a third project that used a class in the second to write to the table referenced by the first and second.
any attempt to insert from the third project wrote a NULL into the added field
Updated Model from Database in the first control project - checked to make sure the new field was there
Attempted to insert from the third project and all fields were inserted.
This seems odd to me but heck with all VS and MS have done for me who can complain. Off to make up for lost time. Believe it or not, I am more than a little happy I figured it out with all your help. Maybe this experience will help someone else.

How to prevent to update changed data

I use SQL Azure and have an application that syncs data with an external resource. The data set is large, approx. 10K records, so I load it from the DB once, update records as necessary over several minutes, and then save the changes. This works, but there is a problem with simultaneous access to the data: if another service makes changes during those minutes, those changes are overwritten.
But in most cases this concerns fields that my application does not touch!
So, for example, my Table Device:
public partial class Device : BaseEntity
{
public string Name { get; set; }
public string IMEI { get; set; }
public string SN { get; set; }
public string ICCID { get; set; }
public string MacAddress { get; set; }
    public DeviceStatus Status { get; set; }
}
The first service (the application with the long-running process) can modify SN, ICCID and MacAddress, but not Status; the second service, vice versa, can modify only Status.
Code to update in the first service:
_allLocalDevicesWithIMEI = _context.GetAllDevicesWithImei().ToList();
(it gets entities, not DTOs, because there really are many fields that can be changed)
and then:
_context.Devices.Update(localDevice);
for every device, which should be changed
and, eventually:
await _context.SaveChangesAsync();
How can I mark the Status field so that it is excluded from tracking?
One simple way to avoid updating the Status field from the first service is to give that service an update entity that does not include Status, and to create a separate update entity, which does include Status, for the second service.
Another way to resolve this problem is to override the SaveChangesAsync method and control the update logic yourself, but I think that is complex, and because the behavior is implicit it will not be easy for others to understand your code.
To avoid overwrites, you can add a RowVersion to the entities. This is so-called optimistic concurrency: saving throws an error if the row was changed in the meantime, and you can then retry the operation. Alternatively, you can raise the transaction isolation level to something like RepeatableRead or Serializable to lock these rows for the entire operation (which of course has a large performance impact and risks timeouts). The second option is simple and good enough for background jobs and distributed transactions; the first is more flexible and usually faster, but harder to implement across multiple endpoints/entities.
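For the narrower goal of keeping Status out of the UPDATE statement, the change tracker also allows a single property to be flagged as unmodified after Update has marked the whole entity as modified. The following is only a sketch, assuming EF Core (the question's use of Update and SaveChangesAsync suggests it); Device here is a cut-down stand-in for the class in the question, and DeviceContext, DeviceUpdates and UpdateExceptStatus are hypothetical names.

```csharp
using Microsoft.EntityFrameworkCore;

// Cut-down, hypothetical stand-ins for the types in the question.
public enum DeviceStatus { Unknown, Active, Disabled }

public class Device
{
    public int Id { get; set; }
    public string SN { get; set; }
    public DeviceStatus Status { get; set; }
}

public class DeviceContext : DbContext
{
    public DeviceContext(DbContextOptions<DeviceContext> options) : base(options) { }

    public DbSet<Device> Devices { get; set; }
}

public static class DeviceUpdates
{
    // Marks every property of the device as modified, then takes Status
    // back out, so the generated UPDATE does not write that column.
    public static void UpdateExceptStatus(DeviceContext context, Device device)
    {
        context.Devices.Update(device);
        context.Entry(device).Property(d => d.Status).IsModified = false;
    }
}
```

If the entities are still tracked by the same context that loaded them, leaving out the Update call altogether may be enough, since the change tracker then only writes the properties whose values actually changed.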

Incrementing a single database row field multiple times at once while making sure that all changes are saved?

I have an ASP.NET 5 Web API application using EF Core with Npgsql (PostgreSQL).
I'm trying to implement a file download tracker that increments some fields in my database when a file is requested.
Let's say I have this object:
public class Stats
{
[Key]
public long ID { get; set; }
public long DownloadCount { get; set; }
}
This application will obviously have many users requesting many different files at the same time, resulting in lots of changes made to the same value for each file. Now, I'm wondering whether this type of thing is already implemented in the internal change tracker or not.
If it isn't, how can I implement it in a way that will make sure that every request is counted?

Updating relational entities

I have a scenario in which I need some help.
Let us assume that there is a User who listens to some type of Music.
class User
{
public virtual List<UserMusicType> Music { get; set; }
}
public class UserMusicType
{
public int ID { get; set; }
public MusicType name { get; set; }
}
public class MusicType
{
public int ID { get; set; }
public string Name { get; set; }
}
There is a form where I am asking users to check/select all types of Music he listens to. He selects 3 types namely { Pop, Rock, and Electronic }
CASE 1:
Now I want to update the User Entity and insert these 3 new types. From my understanding, I need to first remove whatever MusicTypes for this users were saved in the Database then insert these new types again. Is it a correct approach? Removing all previous and Inserting new ones? Or any other way to do it?
CASE 2:
I am taking MusicType names as string of course. Now while updating the User Entity, I'll have to first fetch the MusicType.ID after that I'll be able to do this:
user.Music.Add(new UserMusicType() { ID = SOME_ID });
Is there a better approach for this case?
I'll be glad to have some replies from experienced people in EF. I want to learn if there is an efficient way of doing it. Or even if my approach/Models are totally wrong or could be improved.
First of all, you don't need the UserMusicType class; you can just declare the User class as
class User
{
public virtual List<MusicType> Music { get; set; }
}
Entity Framework will then create a many-to-many relationship table in the database.
As for the first question, it depends. If you use this relationship anywhere else, like payment or an audit trail, then the best way is to compare the posted values to the saved values, e.g.:
The user selected Music 1, Music 2, Music 3 the first time and saved; in this case the 3 records are inserted.
The user edited his selection and chose Music 1, Music 3, Music 4; in this case you get the submitted values, which are 1, 3, 4, and retrieve the values stored in the database, which are 1, 2, 3.
Then you work out the new values, which are the items that exist in the new set but not in the old; in this case that is Music 4.
You also work out the removed values, which exist in the old set but not in the new; in this case that is Music 2.
The rest can be ignored.
So your update will be: add Music 4, remove Music 2.
If you don't depend on the relationship, then it is easier to just remove all of the user's music and add the collection again.
As for the second part of your question, I assume you will display some checkboxes for the user. You should make the value of each checkbox control the MusicType ID; this is what will be posted to the backend, and you can use it to link the type to the user.
e.g.:
user.Music.Add(new MusicType { ID = selectedId }); // selectedId: the ID posted by the checkbox
You should not depend on the music name.
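The comparison described above can be sketched with plain LINQ set operations. This is a minimal illustration, assuming the form posts MusicType IDs as integers; MusicSelectionDiff and Compute are hypothetical names and are not part of Entity Framework.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: works on plain ID collections, no EF dependency.
public static class MusicSelectionDiff
{
    // Returns the IDs to insert (posted but not stored) and the IDs to
    // delete (stored but not posted). IDs present in both are left alone.
    public static (List<int> ToAdd, List<int> ToRemove) Compute(
        IEnumerable<int> storedIds, IEnumerable<int> postedIds)
    {
        var stored = new HashSet<int>(storedIds);
        var posted = new HashSet<int>(postedIds);

        var toAdd = posted.Except(stored).OrderBy(id => id).ToList();
        var toRemove = stored.Except(posted).OrderBy(id => id).ToList();
        return (toAdd, toRemove);
    }
}
```

For the example above, Compute(new[] { 1, 2, 3 }, new[] { 1, 3, 4 }) gives ToAdd containing 4 and ToRemove containing 2; only those two rows need to be touched.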
First question:
Actually, it is a matter of personal preference. I wouldn't want to delete all the rows which belong to that user and then insert them again. I would compare the collection posted from the form with the rows stored in the database, then delete from the database those entities which no longer exist in the collection, and insert the new ones. You can even update those entities for which some additional details have been modified.
By the way, you can easily achieve this with the newly released EntityGraphOperations for Entity Framework Code First. I am the author of this product, and I have published it on GitHub, CodeProject and NuGet. With the help of the InsertOrUpdateGraph method, it will automatically set your entities as Added or Modified. And with the help of the DeleteMissingEntities method, you can delete those entities which exist in the database but not in the current collection.
// This will set the state of the main entity and all of its navigation
// properties as `Added` or `Modified`.
context.InsertOrUpdateGraph(user)
    .After(entity =>
    {
        // And this will delete missing UserMusicType objects.
        entity.HasCollection(p => p.Music)
            .DeleteMissingEntities();
    });
You can read my article on Code-project with a step-by-step demonstration and a sample project is ready for downloading.
Second question:
I don't know which platform you are developing your application on. But generally I store lookup data such as MusicType in a cache and use a DropDownList element for rendering all the types. When the user posts the form, I get the values (IDs) rather than the names of the selected types. So no additional work is required.

manage cache objects in asp.net

I have a list of products stored in the ASP.NET cache, but I have a problem with refreshing it. Our requirement is to refresh the cache every 15 minutes. What I want to know is: if a user asks for the product list while the cache is being refreshed, will he get an error, get the old list, or have to wait until the cache is refreshed?
the sample code is below
public class Product
{
public int Id{get;set;}
public string Name{get;set;}
}
we have a function which gives us list of Product in BLL
public List<Product> Products()
{
//////some code
}
Cache.Insert("Products", Products(), null, DateTime.Now.AddMinutes(15), TimeSpan.Zero);
I want to add one more scenario here: say I use a static object instead of the cache object. What will happen then, and which approach is best if we are on a standalone server and not on a cluster?
Sorry - this might be naive/obvious but just have a facade type class which does
// Read once into a local so the entry cannot expire between the check and the return.
var products = Cache["Products"] as List<Product>;
if (products == null)
{
    products = Products();
    Cache.Insert("Products", products, null, DateTime.Now.AddMinutes(15), TimeSpan.Zero);
}
return products;
As an alternative, there is also a CacheItemRemovedCallback delegate which you could use to repopulate an expired cache.
ALSO
Use the cache object rather than static objects. It is apparently more efficient (see "Asp.net - Caching vs Static Variable for storing a Dictionary") and you get all the cache management features (sliding expiration and so on)
EDIT
If there is a concern about update times then consider two cache objects plus a controller e.g.
Active Cache
Backup Cache - this is the one that will be updated
Cache controller (another cache object?) this will indicate which object is active
So the process to update will be
Update backup cache
Completes. Check is valid
Backup becomes active and vice versa. The controller now flags the Backup cache as being active
There needs to be a method which will fire when the products cache object is populated. I would probably use the CacheItemRemovedCallback delegate to initiate the cache repopulation. Or do an async call in the facade type class - you wouldn't want it blocking the current thread
I'm sure there are many other variants of this
EDIT 2
Actually thinking about this I would make the controller class something like this
public class CacheController
{
    public StateEnum Cache1State { get; set; }
    public StateEnum Cache2State { get; set; }
    public bool IsUpdating { get; set; }
}
The state would be active, backup, updating and perhaps inactive and error. You would set the IsUpdating flag when the update is occurring and then back to false once complete to stop multiple threads trying to update at once - i.e. a race condition. The class is just a general principle and could/should be amended as required
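The active/backup idea can also be reduced to a single reference that is swapped once the new list has been fully built, so a reader always sees either the old list or the new one and never waits for a refresh. The following is a minimal sketch, assuming the product list is treated as read-only once published; ProductCache, TryRefresh and loadProducts are hypothetical names, and a plain object is used instead of the ASP.NET Cache to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical holder for the "active" list plus an update guard.
public sealed class ProductCache
{
    private readonly Func<List<Product>> _loadProducts;
    private volatile IReadOnlyList<Product> _active = new List<Product>();
    private int _isUpdating; // 0 = idle, 1 = refresh in progress

    public ProductCache(Func<List<Product>> loadProducts)
    {
        _loadProducts = loadProducts;
    }

    // Readers never block: they get whichever list is currently active.
    public IReadOnlyList<Product> Current => _active;

    // Returns false if another thread is already refreshing.
    public bool TryRefresh()
    {
        if (Interlocked.CompareExchange(ref _isUpdating, 1, 0) != 0)
            return false;
        try
        {
            var backup = _loadProducts(); // build the backup copy first
            _active = backup;             // then publish it in one reference write
            return true;
        }
        finally
        {
            Volatile.Write(ref _isUpdating, 0);
        }
    }
}
```

A timer or an expiry callback could call TryRefresh every 15 minutes. If the load throws, the old list simply stays active, which roughly corresponds to the "check is valid" step above.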
