Best practice for archiving an entity in EF code first - c#

I'm trying to archive an entity of a table. There are a couple of ways to do this. One is to add an IsArchived column and set it to true when an entity is deleted or moved into history. A disadvantage of this design is that the table becomes very large.
Another way is to duplicate the class of the entity to be archived, create a separate table for it, and copy the data into that log table with the help of AutoMapper. In that case I need a duplicate class for every entity that has to be archived.
Are there any other solutions for archiving specific entities?

The best way would be to add a nullable ArchivedTimeStamp column to the table. This way, it is possible to tell if the row was archived or not, and if so, when it was archived.
If you are worried about the table size, you can partition the table and automatically move the archived rows onto a secondary / slower physical disk. You can even partition it in such a way that only rows that were archived, let's say, over a year ago are moved to the secondary partition.
More info on SQL archiving using partitioning can be found at http://www.mssqltips.com/sqlservertip/2780/archiving-sql-server-data-using-partitioning/
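As a rough illustration of the nullable-timestamp idea, here is a minimal sketch. The `Order` entity, the `ArchiveExtensions` class, and its method names are hypothetical; only the `ArchivedTimeStamp` column comes from the answer. The filters are written against `IQueryable<T>`, so the same expressions should also work on a `DbSet`, although this example only runs them in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; only ArchivedTimeStamp is taken from the answer.
public class Order
{
    public int Id { get; set; }
    public string Name { get; set; }
    // null = live row; a value = the moment the row was archived
    public DateTime? ArchivedTimeStamp { get; set; }
}

public static class ArchiveExtensions
{
    // Marks the entity as archived instead of deleting it.
    public static void Archive(this Order order, DateTime utcNow)
    {
        order.ArchivedTimeStamp = utcNow;
    }

    // Rows that have not been archived.
    public static IQueryable<Order> Active(this IQueryable<Order> source)
    {
        return source.Where(o => o.ArchivedTimeStamp == null);
    }

    // Rows archived before the cutoff, e.g. candidates for a slower partition.
    public static IQueryable<Order> ArchivedBefore(this IQueryable<Order> source, DateTime cutoff)
    {
        return source.Where(o => o.ArchivedTimeStamp != null && o.ArchivedTimeStamp < cutoff);
    }
}

public static class Program
{
    public static void Main()
    {
        var now = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        var orders = new List<Order>
        {
            new Order { Id = 1, Name = "live" },
            new Order { Id = 2, Name = "old", ArchivedTimeStamp = now.AddYears(-2) },
            new Order { Id = 3, Name = "recent" }
        };
        orders[2].Archive(now); // archive order 3 "now"

        var query = orders.AsQueryable();
        Console.WriteLine(query.Active().Count());                         // 1
        Console.WriteLine(query.ArchivedBefore(now.AddYears(-1)).Count()); // 1
    }
}
```

Storing a timestamp rather than a boolean is what makes the "archived over a year ago" cutoff possible without an extra column.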

You could have more than one database, with the same schema. You can then open a couple contexts, one to each database, using a different connection string. Query one, attach the entities to the other, and save.
I've never done this, but it should work. You might run into trouble because the entities are still attached to the source context and cannot be attached to the destination, but there are ways to detach and reattach them.

I have implemented a soft delete for the purposes of undo. My answer shows how to overcome some of the problems normally associated with soft deletes - i.e. joins and indexes. It suits my purposes well. However, if it was used for archiving then the tables would grow forever.
Your other idea is to create duplicate classes and use automapper. That sounds like a lot of extra coding.
I think you could create a database with the same schema - except, perhaps, the primary keys would not be database generated, and foreign keys not enforced. Then override the delete so that the data is copied over.
Something like this:
public override int SaveChanges()
{
    // I have a base class for entities with a single "ID" property;
    // all my entities derive from it.
    // ToList() takes a snapshot, because CustomDelete changes entry states.
    var deleted = ChangeTracker.Entries()
        .Where(p => p.State == EntityState.Deleted && p.Entity is ModelBase)
        .ToList();
    foreach (var entry in deleted)
        CustomDelete(entry);
    return base.SaveChanges();
}

private void CustomDelete(DbEntityEntry entry)
{
    var e = entry.Entity as ModelBase;
    // GetTableName is a helper that is not shown here.
    string tableName = GetTableName(e.GetType());
    // Copy the row into the archive schema, then delete the original.
    string sql = String.Format(@"INSERT INTO archive.{0} SELECT * FROM {0} WHERE ID = @id;
        DELETE FROM {0} WHERE ID = @id", tableName);
    Database.ExecuteSqlCommand(sql, new SqlParameter("id", e.ID));
    // Detach so that base.SaveChanges() does not issue a second DELETE.
    entry.State = EntityState.Detached;
}
Note that in EF6 you could also override the delete by altering the SQL in the migration file when mapping to stored procedures is used.

Related

Saving entity in many-to-many relation

(3am here, so bear with me!)
This is EF6. I have many-to-many relationship between entities Procedures and Points. I create a new Procedure object named P and add one of the existing Point objects to its Points collection. I then send P to EF for saving:
context.Procedures.Add(P);
context.SaveChanges();
Unfortunately this tries to INSERT the Point object too into the database, which obviously fails with duplicate primary key error.
Among several other things, I checked the value of context.ChangeTracker.Entries() and, to my astonishment, it contains 21 entries instead of just 2. Upon further investigation, it looks like EF is creating entries for the recursive Procedure > Point > Procedure relation. How can I fix this problem?
Edit
(4am now, :))
I got it to work with the following code:
var Temp = P.Points.ToArray();
P.Points.Clear();
foreach (var t in Temp)
P.Points.Add(context.Points.Find(t.Id));
context.Procedures.Add(P);
Is this really the correct way of doing it?

What logic determines the insert order of Entity Framework 6

So, I have a DBContext, and I am doing the following operations:
dbContext.SomeTables1.Add(object1);
dbContext.SomeTables2.AddRange(objectArray2);
dbContext.SomeTables3.AddRange(objectArray3);
dbContext.SaveChanges();
EF doesn't insert the records in this order; it inserts them in a seemingly random order. To insert them in the same order, I have to call dbContext.SaveChanges() after each addition. This is not an efficient solution: in my case it takes 10 seconds to do all my inserts, while the random order with a single save takes around 3 seconds.
N.B. I need the right order to solve a deadlock issue.
My questions are:
Is this issue resolved in EF7?
I can profile EF and determine the random order; however, is there a guarantee that it will consistently be the same random order, or does it change between requests? (I can adapt my other code if the answer to this question is positive.)
Is there a better way of maintaining the order than calling dbContext.SaveChanges() on every addition?
There is no way to specify a save order in EF6 or EF Core.
The issue is not resolved in EF Core, because it is not considered an issue.
The order will be the same if the predecessor is the same (which will likely rarely happen).
When you call SaveChanges, all entities are first ordered by an internal order in the method "ProduceDynamicCommands", then sorted again by the method "TryTopologicalSort", which loops and adds the commands that have no predecessor left (if you add A and B, and A depends on B, then B will be inserted before A).
That leaves you with inserting batch by batch.
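To make the "no predecessor left" rule concrete, here is a minimal sketch of a dependency-ordered sort. It is an illustration of the general idea only, not EF's actual `TryTopologicalSort` implementation; the `DependencyOrder` class and its `Sort` method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DependencyOrder
{
    // dependsOn[x] lists the items that must be written before x.
    // Returns an order in which every item comes after its predecessors.
    // Assumes every predecessor is itself contained in items.
    public static List<string> Sort(IEnumerable<string> items,
                                    IDictionary<string, string[]> dependsOn)
    {
        var remaining = items.ToList();
        var done = new HashSet<string>();
        var result = new List<string>();

        while (remaining.Count > 0)
        {
            // Pick the first item whose predecessors have all been emitted.
            var next = remaining.FirstOrDefault(i =>
                !dependsOn.TryGetValue(i, out var preds) || preds.All(p => done.Contains(p)));

            if (next == null)
                throw new InvalidOperationException("Circular or unresolved dependency.");

            remaining.Remove(next);
            done.Add(next);
            result.Add(next);
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        // A was added first, but A depends on B, so B is emitted first.
        var order = DependencyOrder.Sort(
            new[] { "A", "B" },
            new Dictionary<string, string[]> { { "A", new[] { "B" } } });

        Console.WriteLine(string.Join(", ", order)); // B, A
    }
}
```

Note that items without any dependency between them keep their original relative order in this sketch; the answer above states that EF applies its own internal order first, so the same should not be assumed there.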
Since it takes you 3 seconds to perform your inserts, I assume you have thousands of entities. Performing a bulk insert may cut the 10 seconds down, perhaps even below the initial 3 seconds!
To improve your performance, you can use http://entityframework-extensions.net/ (paid, but supports all cases).
Disclaimer: I'm the owner of the Entity Framework Extensions project.
I've found a way to do it. I just thought I'd let you know:
using (var dbContextTransaction = dbContext.Database.BeginTransaction())
{
    dbContext.SomeTables1.Add(object1);
    dbContext.SaveChanges();
    dbContext.SomeTables1.Add(object2);
    dbContext.SaveChanges();
    dbContextTransaction.Commit();
}
To explicitly set the values of the Primary Keys (and hence the order of the Clustered Index) in an Identity column in EF and EF Core, you need to manually turn on IDENTITY_INSERT before calling _context.SaveChanges() after which you need to turn off IDENTITY_INSERT like so:
This example assumes EF Core
// Add your items with Identity Primary Key field manually set
_context.SomeTables1.AddRange(yourItems);
_context.Database.OpenConnection();
try {
    _context.Database.ExecuteSqlRaw("SET IDENTITY_INSERT dbo.SomeTables1 ON");
    _context.SaveChanges();
    _context.Database.ExecuteSqlRaw("SET IDENTITY_INSERT dbo.SomeTables1 OFF");
} finally {
    _context.Database.CloseConnection();
}
I've found a very simple solution.
Just set the property for the ID (primary key) of the entity to a value that matches your desired order.
SaveChanges() first sorts by this ID, then by other properties.
The assigned ID may already exist in the database. A unique ID is assigned when writing to the database.
for (int i = 0; i < objectArray2.Count(); i++)
{
    objectArray2[i].Id = i;
}
dbContext.SomeTables2.AddRange(objectArray2);

Setting a primary key with ROW_NUMBER in a view mapped with Entity Fluent API makes linq timeout

My problem is the following: I map my view to an object through the Entity Framework Fluent API. I needed a view containing a few left joins, and there was no unique identifier in the tables, so Entity Framework always returned the same set of objects. In a few different threads / blogs, I saw a solution that consists of adding a column with
ROW_NUMBER() OVER (ORDER BY Id)
I then tried to map it in Entity :
in my class I add a property
public long Row { get; set; }
and in my configuration class I add
HasKey(imc => imc.Row).HasColumnName("Row")
Apparently, the mapping works. What doesn't work is that, when I query the objects with LINQ, even a Count() times out; however, the query itself only returns about 200 rows when run in SQL Server Management Studio.
Has anyone ever seen this issue?
EDIT:
I have been able to bypass the problem by replacing the "row_number()" with a newid() in the MS SQL View, but I'm still afraid it might be a problem later on.
Your query is slow, which causes the timeout. About 1 million people have seen this before. You would need to analyze the query plan. Computing a row number over the whole table can be slow if it is unindexed. Also, a row number cannot be used as a key because its values change when you change the underlying data. EF does not support changing keys.
If you use newid() as the "key" in the view, then you get fresh IDs each time. I think you might not be aware of the fact that a view is merely a shortcut for that particular query. Its contents are not stored anywhere.
Introduce a column that can be used as a key. For example an IDENTITY column.

EF 4.1 code first - How to update/delete many to many join table entries automatically

I have 2 entities, let's say, Trip and Activity. The relationship between them is many to many so a join table is created automatically by EF.
Entity Trip attributes:
-Id (PK) Generated by database
-Name
-Description
-Property1
-Property2
-Property3
Entity Activity attributes (this entity contains fixed records -read only-, no records are inserted here on performing inserts):
-Id (PK) Generated by database
-Name
-Description
-Cost
Join table contains 2 columns, that is, the IDs of the above entities, that are primary and foreign keys at the same time.
I have no problems inserting entries: EF automatically creates the join table TripActivities and adds entries to it successfully. Entries are also added successfully to Trip, and Activity is left unchanged.
My problem is on updating entries, for example, - suppose user can modify information related to a trip from the GUI - so I take all the info from this GUI and I perform the following steps to update the existing trip:
Trip trip = Context.Trips.Find(id); // Search for the appropriate trip to update from Id
trip.Name = ObtainNameFromGUI();
trip.Description = ObtainDescriptionFromGUI();
trip.Property1 = ObtainProperty1FromGUI();
trip.Property2 = ObtainProperty2FromGUI();
trip.Property3 = ObtainProperty3FromGUI();
trip.Activities = new List<Activity>();
// From the GUI user selects from a checkbox list the activities associated to the trip
// So we read its Ids and from those ids we fetch from database the activities to obtain
// the info related to each activity selected in the GUI. This is all done inside the
// below method.
List<Activity> activities = this.ObtainActivitiesSelectedFromGUI();
// If no activities are selected (= null), I want EF to automatically delete the entries in the
// join table for this trip. And of course, if there are activities selected, EF
// should replace the respective entries in the join table for this trip with the new
// ones.
if (activities != null)
{
    activities.ForEach(a =>
    {
        trip.Activities.Add(a);
    });
}
context.Trips.Add(trip);
context.SaveChanges();
By doing this I want EF to update all the related entities automatically (except Activity, which has fixed entries and must be kept unchanged), that is, Trip and the join table. But it does not work: a new trip is created, along with more entries in the join table (the only thing that works is that Activity is kept unchanged, as I want).
How to achieve this? I have spent a lot of hours trying to do this but without success...
Thanks in advance.
EDIT:
I have removed line:
context.Trips.Add(trip);
Now the results are:
-Entity Trip is correctly updated, no new records added which is Ok.
-Entity Activity is kept unchanged which is Ok.
-Join table: The old records for current trip being updated are not updated, instead new records are inserted for the current trip which is not correct.
I have used a different approach for a similar scenario that I faced, which works well with detached entities. What I ended up doing was finding out which entities were added and which ones were deleted by comparing the GUI (detached entity) values to the database values. Here is the sample code that I have used. The entities in play are RelayConfig and StandardContact, which have a many-to-many relationship.
public void Update(RelayConfig relayConfig, List<StandardContact> exposedContacts) {
    // Load the stored entity together with its current contacts.
    RelayConfig dbRelayConfig = context.RelayConfigs.Include(r => r.StandardContacts)
        .Where(r => r.Id == relayConfig.Id).SingleOrDefault();
    // Copy the scalar values from the detached entity.
    context.Entry<RelayConfig>(dbRelayConfig).CurrentValues.SetValues(relayConfig);
    // Contacts selected in the GUI that are not yet stored.
    List<StandardContact> addedExposedContacts =
        exposedContacts.Where(c1 => !dbRelayConfig.StandardContacts.Any(c2 => c1.Id == c2.Id)).ToList();
    // Contacts that are stored but no longer selected in the GUI.
    List<StandardContact> deletedExposedContacts =
        dbRelayConfig.StandardContacts.Where(c1 => !exposedContacts.Any(c2 => c2.Id == c1.Id)).ToList();
    StandardContact dbExposedContact = null;
    addedExposedContacts.ForEach(exposedContact => {
        dbExposedContact = context.StandardContacts.SingleOrDefault(sc => sc.Id == exposedContact.Id);
        dbRelayConfig.StandardContacts.Add(dbExposedContact);
    });
    deletedExposedContacts.ForEach(exposedContact => { dbRelayConfig.StandardContacts.Remove(exposedContact); });
}
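The added/deleted comparison in this answer can be looked at separately from EF. The following is a minimal sketch that assumes integer keys; `JoinTableDiff` and `Compute` are hypothetical names, and the code only works on IDs rather than on tracked entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinTableDiff
{
    // currentIds:  keys currently linked in the join table
    // selectedIds: keys chosen in the GUI (null is treated as "nothing selected")
    public static (List<int> ToAdd, List<int> ToRemove) Compute(
        IEnumerable<int> currentIds, IEnumerable<int> selectedIds)
    {
        var current = new HashSet<int>(currentIds);
        var selected = new HashSet<int>(selectedIds ?? Enumerable.Empty<int>());

        // Selected in the GUI but not yet stored: new join rows.
        var toAdd = selected.Except(current).OrderBy(id => id).ToList();
        // Stored but no longer selected: join rows to delete.
        var toRemove = current.Except(selected).OrderBy(id => id).ToList();

        return (toAdd, toRemove);
    }
}

public static class Program
{
    public static void Main()
    {
        var diff = JoinTableDiff.Compute(new[] { 1, 2, 3 }, new[] { 2, 3, 4 });
        Console.WriteLine("add: " + string.Join(",", diff.ToAdd));       // add: 4
        Console.WriteLine("remove: " + string.Join(",", diff.ToRemove)); // remove: 1
    }
}
```

Rows that appear in both sets are left alone, which is why this approach avoids inserting duplicate join-table entries for links that already exist.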
You can use something like this. Assuming that you get the related objects from the UI and just want to update them in the database, something like the following will work.
context.Products.Attach(product);
context.ObjectStateManager.ChangeObjectState(product, System.Data.EntityState.Modified);
context.ObjectStateManager.ChangeObjectState(product.ProductDescription, System.Data.EntityState.Modified);
context.ObjectStateManager.ChangeObjectState(product.ProductModel, System.Data.EntityState.Modified);
context.SaveChanges();
As you may see here, we are setting the EntityState as Modified which hints EF to perform update for the related tables too.
Please post back your queries or any issues that you may encounter in this implementation.

Database doubly connected relationship inserting problem

I have two tables, Plants and Information. Every plant has many Information records, but each plant has a single MainInformation. So there is a one-to-many relationship and a one-to-one relationship between the two. The Information table has a PlantID and the Plants table has a MainInformationID. I want both fields in both tables to be non-nullable. But now I can't insert either of the two records, because each one requires its field to be non-null, meaning it needs the other record to be created first. Perhaps this is not a good database design and something should be changed? (I am new to databases and Entity Framework.)
I tried inserting into the database manually, but I can't do it. I also tried this code with Entity Framework.
using (var context = new MyEntities())
{
var p = new Plant()
{
LatinName = "latinNameTest",
LocalName = "localNameTest",
CycleTime = 500
};
var i = new Information()
{
ShortDescription = "ShortDesc",
LongDescription = "LongDesc"
};
p.MainInformation = i;
i.Plant = p;
context.AddToPlants(p);
context.AddToInformation(i);
context.SaveChanges();
}
One of:
The 1-1 FK column has to allow NULL
The FK has to be disabled to allow the parent insert before the child
You have a single dummy Information row that is used by default in the FK column
SQL Server does not allow deferred constraint checking without "code change" rights so even wrapping in a transaction won't work
Sounds like an EAV schema, which has other problems
You need to change the tables to allow for null. There is no other way to do this.
You may want to look at database transactions and how to use them with Entity Framework. You can wrap both INSERTs into a single database transaction, so that either both of them go in or neither does.
Here is a link for transactions using EF. I didn't read through it but it seems to talk about them enough to get you started.
