Saving Entity causes duplicate insert into lookup data - c#

I am using EF 4.1 "code first" to create my db and objects.
Given:
public class Order
{
public int Id { get; set; }
public string Name { get; set; }
public virtual OrderType OrderType { get; set; }
}
public class OrderType
{
public int Id { get; set; }
public string Name { get; set; }
}
An order has one order type. An order type is just a lookup table; the values don't change. Using the Fluent API:
//Order
ToTable("order");
HasKey(key => key.Id);
Property(item => item.Id).HasColumnName("order_id").HasColumnType("int");
Property(item => item.Name).HasColumnName("name").HasColumnType("nvarchar").HasMaxLength(10).IsRequired();
HasRequired(item => item.OrderType).WithMany().Map(x => x.MapKey("order_type_id")).WillCascadeOnDelete(false);
//OrderType
ToTable("order_type");
HasKey(key => key.Id);
Property(item => item.Id).HasColumnName("order_type_id").HasColumnType("int");
Property(item => item.Name).HasColumnName("name").HasColumnType("nvarchar").HasMaxLength(100).IsRequired();
Now in our App we load all our lookup data and cache it.
var order = new Order
{
Name = "Bob",
OrderType = GetFromOurCache(5) //Get order type for id 5
};
var db = _db.GetContext();
db.Order.Add(order);
db.SaveChanges();
Our you-beaut order is saved, but with a new order type, courtesy of EF. So now we have two identical order types in our database. What can I do to alter this behaviour?
TIA

With EF 4.1 you can do this before calling SaveChanges:
db.Entry(order.OrderType).State = EntityState.Unchanged;

As an alternative to Yakimych's solution, you can attach the OrderType to the context before you add the order, to let EF know that the OrderType already exists in the database:
var order = new Order
{
Name = "Bob",
OrderType = GetFromOurCache(5) //Get order type for id 5
};
var db = _db.GetContext();
db.OrderTypes.Attach(order.OrderType);
db.Order.Add(order);
db.SaveChanges();

Yakimych / Slauma - thanks for the answers. Interestingly, I tried both ways and neither worked, hence I asked the question. Your answers confirmed that I must be doing something wrong, and sure enough I wasn't managing my DbContext properly.
Still, it's a pain that EF automatically wants to insert lookup/static data even when you supply the full object (including the lookup's unique Id). It puts the onus on the developer to remember to set the state. To make things a little easier I do:
var properties = entry.GetType().GetProperties().Where(x => x.PropertyType.GetInterface(typeof(ISeedData).Name) != null);
foreach (var staticProperty in properties)
{
var n = staticProperty.GetValue(entry, null);
Entry(n).State = EntityState.Unchanged;
}
in SaveChanges override.
Again thanks for the help.
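The property filter above matches the interface by name through GetInterface. A type-based check with Type.IsAssignableFrom is a common alternative. Below is a minimal sketch of that check without EF; ISeedData is the marker interface named in the post, while the simplified Order/OrderType shapes and the SeedDataFinder helper are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Marker interface for lookup/static entities (name taken from the post).
public interface ISeedData { }

public class OrderType : ISeedData
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public string Name { get; set; }
    public OrderType OrderType { get; set; }
}

// Hypothetical helper, not part of EF.
public static class SeedDataFinder
{
    // Returns the non-null values of all non-indexer properties
    // whose declared type implements ISeedData.
    public static List<object> GetSeedDataValues(object entity)
    {
        return entity.GetType()
            .GetProperties()
            .Where(p => typeof(ISeedData).IsAssignableFrom(p.PropertyType))
            .Where(p => p.GetIndexParameters().Length == 0)
            .Select(p => p.GetValue(entity, null))
            .Where(value => value != null)
            .ToList();
    }
}
```

In a SaveChanges override, each returned value could then be marked with Entry(value).State = EntityState.Unchanged as in the post; skipping null values avoids handing null to Entry.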


How to query IQueryable with Include - ThenInclude?

In .NET Core 2.2 I'm stuck with filtering IQueryable built as:
_context.Ports.Include(p => p.VesselsPorts)
.ThenInclude(p => p.Arrival)
.Include(p => p.VesselsPorts)
.ThenInclude(p => p.Departure)
.OrderBy(p => p.PortLocode);
in many-to-many relation. And the entity models are such as:
public class PortModel
{
[Key]
public string PortLocode { get; set; }
public double? MaxKnownLOA { get; set; }
public double? MaxKnownBreadth { get; set; }
public double? MaxKnownDraught { get; set; }
public virtual ICollection<VesselPort> VesselsPorts { get; set; }
}
public class VesselPort
{
public int IMO { get; set; }
public string PortLocode { get; set; }
public DateTime? Departure { get; set; }
public DateTime? Arrival { get; set; }
public VesselModel VesselModel { get; set; }
public PortModel PortModel { get; set; }
}
Based on this SO answer I managed to create a LINQ query like this:
_context.Ports.Include(p => p.VesselsPorts).ThenInclude(p => p.Arrival).OrderBy(p => p.PortLocode)
.Select(
p => new PortModel
{
PortLocode = p.PortLocode,
MaxKnownBreadth = p.MaxKnownBreadth,
MaxKnownDraught = p.MaxKnownDraught,
MaxKnownLOA = p.MaxKnownLOA,
VesselsPorts = p.VesselsPorts.Select(vp => vp.Arrival > DateTime.UtcNow.AddDays(-1)) as ICollection<VesselPort>
}).AsQueryable();
BUT what I need is to find all port records where the number of VesselsPorts entries with
Arrival > DateTime.UtcNow.AddDays(-1) is greater than some value, for example int x = 5. And I have no clue how to do it :/
Thanks to @GertArnold's comment, I ended up with this query:
ports = ports.Where(p => p.VesselsPorts.Where(vp => vp.Arrival > DateTime.UtcNow.AddDays(-1)).Count() > x);
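The final query counts the matching child rows per port and compares that count with x. The same shape can be tried against in-memory data. The sketch below uses simplified stand-ins for the entities (PortRow, VisitRow and PortFilter are hypothetical names, not from the post) and computes the cutoff once, outside the lambda.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for VesselPort: only the arrival time matters here.
public class VisitRow
{
    public DateTime? Arrival { get; set; }
}

// Simplified stand-in for PortModel.
public class PortRow
{
    public string PortLocode { get; set; }
    public List<VisitRow> VesselsPorts { get; set; } = new List<VisitRow>();
}

public static class PortFilter
{
    // Keeps the ports that have more than minCount visits arriving after the cutoff.
    // A null Arrival never satisfies the comparison, so such rows are not counted.
    public static List<PortRow> WithRecentArrivals(
        IEnumerable<PortRow> ports, DateTime cutoff, int minCount)
    {
        return ports
            .Where(p => p.VesselsPorts.Count(vp => vp.Arrival > cutoff) > minCount)
            .OrderBy(p => p.PortLocode)
            .ToList();
    }
}
```

Count with a predicate expresses the same thing as Where(...).Count() in one call.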
When using entity framework people tend to use Include instead of Select to save them some typing. It is seldom wise to do so.
The DbContext holds a ChangeTracker. Every complete row that you fetch from any table during the lifetime of the DbContext is stored in the ChangeTracker, along with a copy of its original values. If you change properties of the data you got, the tracked entity changes while the copy keeps the original values. During SaveChanges, the original is compared with the current values to see whether the data must be saved.
So if you are fetching quite a lot of data and use Include, a copy is kept for every fetched item. This might slow down your queries considerably.
Apart from this copying, you will probably fetch more properties than you actually plan to use. Database management systems are extremely optimized for combining tables and searching rows within tables. One of the slower parts is the transfer of the selected data to your local process.
For example, if you have a database with Schools and Students, with the obvious one-to-many relation, then every Student will have a foreign key to the School he attends.
So if you ask for School [10] with its 2000 Students, then every Student will have a foreign key value of [10]. If you use Include, you will be transferring this same value 10 over 2000 times. What a waste of processing power!
In entity framework, when querying data, always use Select to select the properties, and Select only the properties that you actually plan to use. Only use Include if you plan to change the fetched items.
Certainly don't use Include to save you some typing!
Requirement: Give me the Ports with their Vessels
var portsWithTheirVessels = dbContext.Ports
.Where(port => ...) // if you don't want all Ports
.Select(port => new
{
// only select the properties that you want:
PortLocode = port.PortLocode,
MaxKnownLOA = port.MaxKnownLOA,
MaxKnownBreadth = port.MaxKnownBreadth,
MaxKnownDraught = port.MaxKnownDraught,
// The Vessels in this port:
Vessels = port.VesselsPorts.Select(vessel => new
{
// again: only the properties that you plan to use
IMO = vessel.IMO,
...
// do not select the foreign key, you already know the value!
// PortLocode = vessel.PortLocode,
})
.ToList(),
});
Entity framework knows your one-to-many relation, and knows that if you use the virtual ICollection that it should do a (Group-)Join.
Some people prefer to do the Group-Join themselves, or they use a version of entity framework that does not support using the ICollection.
var portsWithTheirVessels = dbContext.Ports.GroupJoin(dbContext.VesselPorts,
port => port.PortLocode, // from every Port take the primary key
vessel => vessel.PortLocode, // from every Vessel take the foreign key to Port
// parameter resultSelector: take every Port with its zero or more Vessels to make one new
(port, vesselsInThisPort) => new
{
PortLocode = port.PortLocode,
...
Vessels = vesselsInThisPort.Select(vessel => new
{
...
})
.ToList(),
});
Alternative:
var portsWithTheirVessels = dbContext.Ports.Select(port => new
{
PortLocode = port.PortLocode,
...
Vessels = dbContext.VesselPorts.Where(vessel => vessel.PortLocode == port.PortLocode)
.Select(vessel => new
{
...
})
.ToList(),
});
Entity framework will translate this also to a GroupJoin.
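The GroupJoin shape described above can be tried on plain in-memory collections, which makes it easy to see that a port with no vessels still appears, with an empty list. This is only a sketch: PortRecord, VesselPortRecord, PortSummary and PortQueries are hypothetical names, and LINQ to Objects stands in for the database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PortRecord
{
    public string PortLocode { get; set; }
}

public class VesselPortRecord
{
    public int IMO { get; set; }
    public string PortLocode { get; set; }
}

// Result type holding only the properties the caller plans to use.
public class PortSummary
{
    public string PortLocode { get; set; }
    public List<int> VesselImos { get; set; }
}

public static class PortQueries
{
    // Pairs every port with its zero or more vessels, matched on PortLocode.
    public static List<PortSummary> PortsWithVessels(
        IEnumerable<PortRecord> ports, IEnumerable<VesselPortRecord> vesselPorts)
    {
        return ports.GroupJoin(
                vesselPorts,
                port => port.PortLocode,             // primary key of the port
                vesselPort => vesselPort.PortLocode, // foreign key to the port
                (port, vesselsInThisPort) => new PortSummary
                {
                    PortLocode = port.PortLocode,
                    // The foreign key is not selected again; it equals PortLocode.
                    VesselImos = vesselsInThisPort.Select(v => v.IMO).ToList()
                })
            .ToList();
    }
}
```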

Do we have to explicitly add to db context?

public class Practice
{
public List<Participation> Participations { get; set; }
}
public class Participation
{
public string Id { get; set; }
public virtual Practice Practice { get; set; }
}
public async Task test()
{
var practice = _ctx.Practice.SingleOrDefault(p => p.Id == practiceId);
practice.Participations.AddRange(NewParticipations);
_ctx.Participation.AddRange(NewParticipations);
await _ctx.SaveChangesAsync();
}
If I have the above, would I need the 3rd line in the test function to save new participations or would the practice.Participations.AddRange() handle that implicitly?
practice.Participations.AddRange should be enough.
If you reference a new entity from the navigation property of an entity that is already tracked by the context, the entity will be discovered and inserted into the database.
source: https://learn.microsoft.com/en-us/ef/core/saving/related-data#adding-a-related-entity
You can observe it like so...
var practice = _ctx.Practice.SingleOrDefault(p => p.Id == practiceId);
practice.Participations.AddRange(NewParticipations);
Debug.WriteLine(_ctx.Participation.Count()); //note count
await _ctx.SaveChangesAsync();
Debug.WriteLine(_ctx.Participation.Count()); //count increased
You should be able to add the new data to the database either way. If you added through the context, you would need to set the foreign key in the NewParticipations objects yourself, so that a link would exist to the Practice object.

Best way to load navigation properties in new entity

I am trying to add a new record to a SQL database using EF. The code looks like this:
public void Add(QueueItem queueItem)
{
var entity = queueItem.ApiEntity;
var statistic = new Statistic
{
Ip = entity.Ip,
Process = entity.ProcessId,
ApiId = entity.ApiId,
Result = entity.Result,
Error = entity.Error,
Source = entity.Source,
DateStamp = DateTime.UtcNow,
UserId = int.Parse(entity.ApiKey),
};
_statisticRepository.Add(statistic);
unitOfWork.Commit();
}
The Statistic entity has Api and User navigation properties, which I want to load into the new Statistic entity. I have tried loading them with the code below, but it produces large queries and hurts performance. Any suggestions on how to load the navigation properties another way?
public Statistic Add(Statistic statistic)
{
_context.Statistic.Include(p => p.Api).Load();
_context.Statistic.Include(w => w.User).Load();
_context.Statistic.Add(statistic);
return statistic;
}
Some of you may wonder why I want to load navigation properties while adding a new entity: I perform some calculations in DbContext.SaveChanges() before the entity is sent to the database. The code looks like this:
public override int SaveChanges()
{
var addedStatistics = ChangeTracker.Entries<Statistic>().Where(e => e.State == EntityState.Added).ToList().Select(p => p.Entity).ToList();
var userCreditsGroup = addedStatistics
.Where(w => w.User != null)
.GroupBy(g => g.User )
.Select(s => new
{
User = s.Key,
Count = s.Sum(p=>p.Api.CreditCost)
})
.ToList();
//Skip code
}
So the LINQ above will not work without loading the navigation properties, because it uses them.
I am also adding the Statistic entity for a full view:
public class Statistic : Entity
{
public Statistic()
{
DateStamp = DateTime.UtcNow;
}
public int Id { get; set; }
public string Process { get; set; }
public bool Result { get; set; }
[Required]
public DateTime DateStamp { get; set; }
[MaxLength(39)]
public string Ip { get; set; }
[MaxLength(2083)]
public string Source { get; set; }
[MaxLength(250)]
public string Error { get; set; }
public int UserId { get; set; }
[ForeignKey("UserId")]
public virtual User User { get; set; }
public int ApiId { get; set; }
[ForeignKey("ApiId")]
public virtual Api Api { get; set; }
}
As you say, the following operations against your context will generate large queries:
_context.Statistic.Include(p => p.Api).Load();
_context.Statistic.Include(w => w.User).Load();
These materialise the object graphs for all statistics with their associated api entities, and then for all statistics with their associated users, into the context.
Just replacing this with a single call as follows will reduce this to a single round trip:
_context.Statistic.Include(p => p.Api).Include(w => w.User).Load();
Once these have been loaded, the entity framework change tracker will fixup the relationships on the new statistics entities, and hence populate the navigation properties for api and user for all new statistics in one go.
Depending on how many new statistics are being created in one go versus the number of existing statistics in the database I quite like this approach.
However, looking at the SaveChanges method it looks like the relationship fixup is happening once per new statistic. I.e. each time a new statistic is added you are querying the database for all statistics and associated api and user entities to trigger a relationship fixup for the new statistic.
In which case I would be more inclined to do the following:
_context.Statistic.Add(statistic);
_context.Entry(statistic).Reference(s => s.Api).Load();
_context.Entry(statistic).Reference(s => s.User).Load();
This will only query for the Api and User of the new statistic rather than for all statistics. I.e you will generate 2 single row database queries for each new statistic.
Alternatively, if you are adding a large number of statistics in one batch, you could make use of the Local cache on the context by preloading all users and api entities upfront. I.e. take the hit upfront to pre cache all user and api entities as 2 large queries.
// preload all api and user entities
_context.Apis.Load();
_context.Users.Load();
// batch add new statistics
foreach (var statistic in statisticsToAdd)
{
statistic.User = _context.Users.Local.Single(x => x.Id == statistic.UserId);
statistic.Api = _context.Apis.Local.Single(x => x.Id == statistic.ApiId);
_context.Statistic.Add(statistic);
}
Would be interested to find out if Entity Framework does relationship fixup from its local cache.
I.e. if the following would populate the navigation properties from the local cache on all the new statistics. Will have a play later.
_context.ChangeTracker.DetectChanges();
Disclaimer: all code entered directly into browser so beware of the typos.
Sorry, I don't have the time to test this, but EF maps entities to objects. Therefore, shouldn't simply assigning the object work?
public void Add(QueueItem queueItem)
{
var entity = queueItem.ApiEntity;
var statistic = new Statistic
{
Ip = entity.Ip,
Process = entity.ProcessId,
//ApiId = entity.ApiId,
Api = _context.Apis.Single(a => a.Id == entity.ApiId),
Result = entity.Result,
Error = entity.Error,
Source = entity.Source,
DateStamp = DateTime.UtcNow,
//UserId = int.Parse(entity.ApiKey),
User = _context.Users.Single(u => u.Id == int.Parse(entity.ApiKey))
};
_statisticRepository.Add(statistic);
unitOfWork.Commit();
}
I did a little guessing at your naming; you should adjust it before testing.
How about building a lookup and loading only the necessary columns?
private readonly Dictionary<int, UserKeyType> _userKeyLookup = new Dictionary<int, UserKeyType>();
I'm not sure how you create the repository; you might need to clear the lookup once saving changes is completed, or at the beginning of the transaction.
_userKeyLookup.Clear();
First find in the lookup, if not found then load from context.
public Statistic Add(Statistic statistic)
{
// _context.Statistic.Include(w => w.User).Load();
UserKeyType key;
if (_userKeyLookup.ContainsKey(statistic.UserId))
{
key = _userKeyLookup[statistic.UserId];
}
else
{
key = _context.Users.Where(u => u.Id == statistic.UserId).Select(u => u.Key).FirstOrDefault();
_userKeyLookup.Add(statistic.UserId, key);
}
statistic.User = new User { Id = statistic.UserId, Key = key };
// similar code for api..
// _context.Statistic.Include(p => p.Api).Load();
_context.Statistic.Add(statistic);
return statistic;
}
Then change the grouping a little.
var userCreditsGroup = addedStatistics
.Where(w => w.User != null)
.GroupBy(g => g.User.Id)
.Select(s => new
{
User = s.First().User,
Count = s.Sum(p=>p.Api.CreditCost)
})
.ToList();
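Grouping on the user id rather than on the User object means the result does not depend on reference equality between separately created User instances. Below is a minimal in-memory sketch of that grouping. UserInfo, ApiInfo, StatRow and CreditCalculator are hypothetical stand-ins for the post's entities, and CreditCost is assumed to be an int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class UserInfo
{
    public int Id { get; set; }
}

public class ApiInfo
{
    public int Id { get; set; }
    public int CreditCost { get; set; } // assumed to be an int
}

public class StatRow
{
    public UserInfo User { get; set; }
    public ApiInfo Api { get; set; }
}

public static class CreditCalculator
{
    // Sums the credit cost of the added statistics per user id.
    // Rows without a User or an Api are skipped.
    public static Dictionary<int, int> CreditsByUser(IEnumerable<StatRow> added)
    {
        return added
            .Where(s => s.User != null && s.Api != null)
            .GroupBy(s => s.User.Id)
            .ToDictionary(g => g.Key, g => g.Sum(s => s.Api.CreditCost));
    }
}
```

Two rows that carry different User instances with the same Id land in the same group here, which would not happen when grouping on the User object itself unless User overrides equality.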

Get only some properties from an entity using Repository pattern and LINQ2Entities

I know it's pretty standard stuff, but right now the solution escapes me. I have an entity, Documents. In my service I can call DocumentsRepository.All() and then use only what I need, but I don't want to carry all the unneeded data. I guess I have to use an anonymous object to achieve this, but the exact implementation escapes me.
In Documents entity I have column Id and column UserId. How can I write my LINQ to only get those two values?
P.S
And what type should I use for my method? Maybe object but I would like something more specific.
Building upon Oliver's answer, if you want to return that from a method, you could use dynamic:
public dynamic ReturnSomeData()
{
return context.Documents.Select(d => new
{
Id = d.Id,
UserId = d.UserId
});
}
You have to keep in mind that you trade compiler checking for flexibility.
This should work for what you need. If you want to put this into a method, you should create a type that contains all the info you need.
var selectedItems = context.Documents.Select(d => new
{
Id = d.Id,
UserId = d.UserId
});
EDIT
Use in a method:
public class MyData
{
public int Id { get; set; }
public int UserId { get; set; }
}
public IEnumerable<MyData> GetMyDataFromDocuments()
{
return context.Documents.Select(d => new MyData
{
Id = d.Id,
UserId = d.UserId
});
}

Updating disconnected entities with many-to-many relationships

Suppose I have the following model classes in an Entity Framework Code-First setup:
public class Person
{
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<Team> Teams { get; set; }
}
public class Team
{
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<Person> People { get; set; }
}
The database created from this code includes a TeamPersons table, representing the many-to-many relationship between people and teams.
Now suppose I have a disconnected Person object (not a proxy, and not yet attached to a context) whose Teams collection contains one or more disconnected Team objects, all of which represent Teams already in the database. An object such as would be created by the following, just for example, if a Person with Id 1 and a Team with Id 3 already existed in the db:
var person = new Person
{
Id = 1,
Name = "Bob",
Teams = new HashSet<Team>
{
new Team { Id = 3, Name = "C Team"}
}
};
What is the best way of updating this object, so that after the update the TeamPersons table contains a single row for Bob, linking him to C Team? I've tried the obvious:
using (var context = new TestContext())
{
context.Entry(person).State = EntityState.Modified;
context.SaveChanges();
}
but the Teams collection is just ignored by this. I've also tried various other things, but nothing seems to do exactly what I'm after here. Thanks for any help.
EDIT:
So I get that I could fetch both the Person and the Team[s] from the db, update them and then commit changes:
using (var context = new TestContext())
{
var dbPerson = context.People.Find(person.Id);
dbPerson.Name = person.Name;
dbPerson.Teams.Clear();
foreach (var id in person.Teams.Select(x => x.Id))
{
var team = context.Teams.Find(id);
dbPerson.Teams.Add(team);
}
context.SaveChanges();
}
This is a pain if Person's a complicated entity, though. I know I could use Automapper or something to make things a bit easier, but still it seems a shame if there's no way of saving the original person object, rather than having to get a new one and copy all the properties over...
The general approach is to fetch the Team from the database and Add that to the Person's Teams collection. Setting EntityState.Modified only affects scalar properties, not navigation properties.
Try selecting the existing entities first, then attaching the team to the person object's team collection.
Something like this: (syntax might not be exactly correct)
using (var context = new TestContext())
{
var person = context.People.Where(f => f.Id == 1).FirstOrDefault();
var team = context.Teams.Where(f => f.Id == 3).FirstOrDefault();
person.Teams.Add(team);
context.Entry(person).State = EntityState.Modified;
context.SaveChanges();
}
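When the person already has team links in the database, it can help to work out which team ids need linking and which need unlinking before touching the tracked collection. The sketch below shows only that comparison, independent of EF. TeamLinkDiff is a hypothetical helper, and how the current ids are loaded is left to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of EF.
public static class TeamLinkDiff
{
    // Returns the team ids to add and to remove so that 'current' becomes 'desired'.
    // Both result lists are sorted ascending and contain no duplicates.
    public static (List<int> ToAdd, List<int> ToRemove) Compute(
        IEnumerable<int> current, IEnumerable<int> desired)
    {
        var currentSet = new HashSet<int>(current);
        var desiredSet = new HashSet<int>(desired);

        var toAdd = desiredSet.Where(id => !currentSet.Contains(id)).OrderBy(id => id).ToList();
        var toRemove = currentSet.Where(id => !desiredSet.Contains(id)).OrderBy(id => id).ToList();

        return (toAdd, toRemove);
    }
}
```

With the ids in hand, the tracked person's Teams collection would be changed only for those entries, so unchanged links are left alone.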
That's where EF s**ks: it is very inefficient in disconnected scenarios. You have to load data for every update or delete, and when re-attaching an updated entity you cannot simply attach it to the context, because an entity with the same key might already exist in the context, in which case EF will throw. What needs to be done is to check whether an entity with the same key is already in the context and then attach or update accordingly. It's worse when updating an entity with many-to-many children: a deleted child has to be removed from the child's entity set rather than through the reference property, which is very messy.
You can use the Attach method. Try this:
using (var context = new TestContext())
{
context.People.Attach(person);
//i'm not sure if this foreach is necessary, you can try without it to see if it works
foreach (var team in person.Teams)
{
context.Teams.Attach(team);
}
context.Entry(person).State = EntityState.Modified;
context.SaveChanges();
}
I didn't test this code; let me know if you have any problems.
