Best way to load navigation properties in a new entity - C#

I am trying to add a new record into a SQL database using EF. The code looks like this:
public void Add(QueueItem queueItem)
{
    var entity = queueItem.ApiEntity;
    var statistic = new Statistic
    {
        Ip = entity.Ip,
        Process = entity.ProcessId,
        ApiId = entity.ApiId,
        Result = entity.Result,
        Error = entity.Error,
        Source = entity.Source,
        DateStamp = DateTime.UtcNow,
        UserId = int.Parse(entity.ApiKey),
    };
    _statisticRepository.Add(statistic);
    unitOfWork.Commit();
}
There are Api and User navigation properties in the Statistic entity which I want to load into the new Statistic entity. I have tried to load the navigation properties using the code below, but it produces large queries and hurts performance. Any suggestions on how to load the navigation properties another way?
public Statistic Add(Statistic statistic)
{
    _context.Statistic.Include(p => p.Api).Load();
    _context.Statistic.Include(w => w.User).Load();
    _context.Statistic.Add(statistic);
    return statistic;
}
Some of you may wonder why I want to load navigation properties while adding a new entity; it's because I perform some calculations in DbContext.SaveChanges() before the entity is written to the database. The code looks like this:
public override int SaveChanges()
{
    var addedStatistics = ChangeTracker.Entries<Statistic>()
        .Where(e => e.State == EntityState.Added)
        .Select(p => p.Entity)
        .ToList();
    var userCreditsGroup = addedStatistics
        .Where(w => w.User != null)
        .GroupBy(g => g.User)
        .Select(s => new
        {
            User = s.Key,
            Count = s.Sum(p => p.Api.CreditCost)
        })
        .ToList();
    //Skip code
}
So the LINQ above will not work without loading the navigation properties, because it uses them.
I am also adding the Statistic entity for a full view:
public class Statistic : Entity
{
    public Statistic()
    {
        DateStamp = DateTime.UtcNow;
    }

    public int Id { get; set; }
    public string Process { get; set; }
    public bool Result { get; set; }
    [Required]
    public DateTime DateStamp { get; set; }
    [MaxLength(39)]
    public string Ip { get; set; }
    [MaxLength(2083)]
    public string Source { get; set; }
    [MaxLength(250)]
    public string Error { get; set; }
    public int UserId { get; set; }
    [ForeignKey("UserId")]
    public virtual User User { get; set; }
    public int ApiId { get; set; }
    [ForeignKey("ApiId")]
    public virtual Api Api { get; set; }
}

As you say, the following operations against your context will generate large queries:
_context.Statistic.Include(p => p.Api).Load();
_context.Statistic.Include(w => w.User).Load();
These materialise the object graphs for all statistics with their associated Api entities, and then all statistics with their associated User entities, into the context.
Just replacing this with a single call as follows will reduce this to a single round trip:
_context.Statistic.Include(p => p.Api).Include(w => w.User).Load();
Once these have been loaded, the Entity Framework change tracker will fix up the relationships on the new Statistic entities, and hence populate the Api and User navigation properties for all new statistics in one go.
Depending on how many new statistics are being created in one go versus the number of existing statistics in the database, I quite like this approach.
However, looking at the SaveChanges method, it looks like the relationship fixup is happening once per new statistic. I.e. each time a new statistic is added, you are querying the database for all statistics and their associated Api and User entities just to trigger a relationship fixup for that one new statistic.
In which case I would be more inclined to do the following:
_context.Statistics.Add(statistic);
_context.Entry(statistic).Reference(s => s.Api).Load();
_context.Entry(statistic).Reference(s => s.User).Load();
This will only query for the Api and User of the new statistic rather than for all statistics, i.e. you will generate two single-row database queries for each new statistic.
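Putting that together, the repository method might look like this (a sketch, reusing the question's _context field):
public Statistic Add(Statistic statistic)
{
    _context.Statistic.Add(statistic);
    // Each Load() issues one single-row query, keyed on the FK values
    // (ApiId/UserId) already set on the new entity.
    _context.Entry(statistic).Reference(s => s.Api).Load();
    _context.Entry(statistic).Reference(s => s.User).Load();
    return statistic;
}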
Alternatively, if you are adding a large number of statistics in one batch, you could make use of the Local cache on the context by preloading all User and Api entities upfront, i.e. take the hit upfront to pre-cache them as two large queries.
// preload all api and user entities
_context.Apis.Load();
_context.Users.Load();

// batch add new statistics
foreach (var statistic in statisticsToAdd)
{
    statistic.User = _context.Users.Local.Single(x => x.Id == statistic.UserId);
    statistic.Api = _context.Apis.Local.Single(x => x.Id == statistic.ApiId);
    _context.Statistics.Add(statistic);
}
I would be interested to find out whether Entity Framework does relationship fixup from its local cache, i.e. whether the following would populate the navigation properties from the local cache on all the new statistics. I will have a play later.
_context.ChangeTracker.DetectChanges();
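For what it's worth, relationship fix-up in EF6 also runs when DetectChanges is triggered, and Add triggers it internally, so with all Api and User entities pre-loaded into Local, setting just the FK values may be enough. A small check to verify, not guaranteed behaviour:
// assumes Apis and Users were pre-loaded into the Local cache as above,
// and apiId/userId are hypothetical FK values for the new row
statistic.ApiId = apiId;                // navigations deliberately left null
statistic.UserId = userId;
_context.Statistics.Add(statistic);     // Add triggers DetectChanges...
Debug.Assert(statistic.Api != null);    // ...which should fix up Api
Debug.Assert(statistic.User != null);   // ...and User from the local cache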
Disclaimer: all code was typed directly into the browser, so beware of typos.

Sorry, I don't have the time to test this, but EF maps entities to objects, so shouldn't simply assigning the object work?
public void Add(QueueItem queueItem)
{
    var entity = queueItem.ApiEntity;
    var statistic = new Statistic
    {
        Ip = entity.Ip,
        Process = entity.ProcessId,
        //ApiId = entity.ApiId,
        Api = _context.Apis.Single(a => a.Id == entity.ApiId),
        Result = entity.Result,
        Error = entity.Error,
        Source = entity.Source,
        DateStamp = DateTime.UtcNow,
        //UserId = int.Parse(entity.ApiKey),
        User = _context.Users.Single(u => u.Id == int.Parse(entity.ApiKey))
    };
    _statisticRepository.Add(statistic);
    unitOfWork.Commit();
}
I did a little guessing at your naming; you should adjust it before testing.
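If the Api and User might already be tracked by the context, Find could avoid the per-entity query (a sketch under the same naming assumptions):
var statistic = new Statistic
{
    // ... other properties as above ...
    // Find checks the context's local cache first and only queries
    // the database when the entity is not already tracked.
    Api = _context.Apis.Find(entity.ApiId),
    User = _context.Users.Find(int.Parse(entity.ApiKey))
};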

How about making a lookup and loading only the necessary columns?
private readonly Dictionary<int, UserKeyType> _userKeyLookup = new Dictionary<int, UserKeyType>();
I'm not sure how you create your repository; you might need to clear the lookup once saving changes has completed, or at the beginning of the transaction.
_userKeyLookup.Clear();
First look in the lookup; if the key is not found, load it from the context:
public Statistic Add(Statistic statistic)
{
    // _context.Statistic.Include(w => w.User).Load();
    UserKeyType key;
    if (!_userKeyLookup.TryGetValue(statistic.UserId, out key))
    {
        key = _context.Users.Where(u => u.Id == statistic.UserId).Select(u => u.Key).FirstOrDefault();
        _userKeyLookup.Add(statistic.UserId, key);
    }
    statistic.User = new User { Id = statistic.UserId, Key = key };

    // similar code for api..
    // _context.Statistic.Include(p => p.Api).Load();
    _context.Statistic.Add(statistic);
    return statistic;
}
Then change the grouping a little.
var userCreditsGroup = addedStatistics
    .Where(w => w.User != null)
    .GroupBy(g => g.User.Id)
    .Select(s => new
    {
        User = s.First().User,
        Count = s.Sum(p => p.Api.CreditCost)
    })
    .ToList();


How to Fetch a Lot of Records with EF6

I need to fetch a lot of records from a SQL Server database with EF6. The problem is that it takes a lot of time. The main culprit is an entity called Series, which contains Measurements: there are about 250K of those, and each has two nested entities called FrontDropPhoto and SideDropPhoto.
[Table("Series")]
public class DbSeries
{
[Key] public Guid SeriesId { get; set; }
public List<DbMeasurement> MeasurementsSeries { get; set; }
}
[Table("Measurements")]
public class DbMeasurement
{
[Key] public Guid MeasurementId { get; set; }
public Guid CurrentSeriesId { get; set; }
public DbSeries CurrentSeries { get; set; }
public Guid? SideDropPhotoId { get; set; }
[ForeignKey("SideDropPhotoId")]
public virtual DbDropPhoto SideDropPhoto { get; set; }
public Guid? FrontDropPhotoId { get; set; }
[ForeignKey("FrontDropPhotoId")]
public virtual DbDropPhoto FrontDropPhoto { get; set; }
}
[Table("DropPhotos")]
public class DbDropPhoto
{
[Key] public Guid PhotoId { get; set; }
}
I've written the fetch method like this (most of the properties omitted for clarity):
public async Task<List<DbSeries>> GetSeriesByUserId(Guid dbUserId)
{
    using (var context = new DDropContext())
    {
        try
        {
            var loadedSeries = await context.Series
                .Where(x => x.CurrentUserId == dbUserId)
                .Select(x => new
                {
                    x.SeriesId,
                }).ToListAsync();

            var dbSeries = new List<DbSeries>();
            foreach (var series in loadedSeries)
            {
                var seriesToAdd = new DbSeries
                {
                    SeriesId = series.SeriesId,
                };
                seriesToAdd.MeasurementsSeries = await GetMeasurements(seriesToAdd);
                dbSeries.Add(seriesToAdd);
            }
            return dbSeries;
        }
        catch (SqlException e)
        {
            throw new TimeoutException(e.Message, e);
        }
    }
}
public async Task<List<DbMeasurement>> GetMeasurements(DbSeries series)
{
    using (var context = new DDropContext())
    {
        var measurementForSeries = await context.Measurements
            .Where(x => x.CurrentSeriesId == series.SeriesId)
            .Select(x => new
            {
                x.CurrentSeriesId,
                x.MeasurementId,
                x.FrontDropPhotoId, // the photo ids are needed below
                x.SideDropPhotoId,
            })
            .ToListAsync();

        var dbMeasurementsForAdd = new List<DbMeasurement>();
        foreach (var measurement in measurementForSeries)
        {
            var measurementToAdd = new DbMeasurement
            {
                CurrentSeries = series,
                MeasurementId = measurement.MeasurementId,
                FrontDropPhotoId = measurement.FrontDropPhotoId,
                FrontDropPhoto = measurement.FrontDropPhotoId.HasValue
                    ? await GetDbDropPhotoById(measurement.FrontDropPhotoId.Value)
                    : null,
                SideDropPhotoId = measurement.SideDropPhotoId,
                SideDropPhoto = measurement.SideDropPhotoId.HasValue
                    ? await GetDbDropPhotoById(measurement.SideDropPhotoId.Value)
                    : null,
            };
            dbMeasurementsForAdd.Add(measurementToAdd);
        }
        return dbMeasurementsForAdd;
    }
}
private async Task<DbDropPhoto> GetDbDropPhotoById(Guid photoId)
{
    using (var context = new DDropContext())
    {
        var dropPhoto = await context.DropPhotos
            .Where(x => x.PhotoId == photoId)
            .Select(x => new
            {
                x.PhotoId,
            }).FirstOrDefaultAsync();

        if (dropPhoto == null)
        {
            return null;
        }

        var dbDropPhoto = new DbDropPhoto
        {
            PhotoId = dropPhoto.PhotoId,
        };
        return dbDropPhoto;
    }
}
Relationships are configured via the Fluent API:
modelBuilder.Entity<DbSeries>()
    .HasMany(s => s.MeasurementsSeries)
    .WithRequired(g => g.CurrentSeries)
    .HasForeignKey(s => s.CurrentSeriesId)
    .WillCascadeOnDelete();

modelBuilder.Entity<DbMeasurement>()
    .HasOptional(c => c.FrontDropPhoto)
    .WithMany()
    .HasForeignKey(s => s.FrontDropPhotoId);

modelBuilder.Entity<DbMeasurement>()
    .HasOptional(c => c.SideDropPhoto)
    .WithMany()
    .HasForeignKey(s => s.SideDropPhotoId);
I need all of this data to populate a WPF DataGrid. The obvious solution is to add paging to the DataGrid. That solution is tempting, but it would break the logic of my application badly: I want to create plots at runtime using this data, so I need all of it, not just some parts. I've tried to optimize a bit by making every method use async/await, but it wasn't helpful enough. I've also tried adding
.Configuration.AutoDetectChangesEnabled = false;
for each context, but the loading time is still really long. How should I approach this problem?
Other than the very large amount of data that you are intent on returning, the main problem is that the way your code is structured means that for each of the 250,000 Series you are performing another trip to the database to get the Measurements for that Series, and a further two trips to get the front/side DropPhotos for each Measurement. Apart from the round-trip time for the 750,000 calls, this completely forgoes SQL's set-based performance optimisations.
Try to ensure that EF submits as few queries as possible to return your data, preferably one:
var loadedSeries = await context.Series
    .Where(x => x.CurrentUserId == dbUserId)
    .Select(x => new DbSeries
    {
        SeriesId = x.SeriesId,
        MeasurementsSeries = x.MeasurementsSeries.Select(ms => new DbMeasurement
        {
            MeasurementId = ms.MeasurementId,
            FrontDropPhotoId = ms.FrontDropPhotoId,
            FrontDropPhoto = ms.FrontDropPhotoId.HasValue
                ? new DbDropPhoto { PhotoId = ms.FrontDropPhotoId.Value }
                : null,
            SideDropPhotoId = ms.SideDropPhotoId,
            SideDropPhoto = ms.SideDropPhotoId.HasValue
                ? new DbDropPhoto { PhotoId = ms.SideDropPhotoId.Value }
                : null,
        }).ToList()
    }).ToListAsync();
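One caveat worth hedging on: EF6 generally refuses to construct mapped entity types inside a LINQ-to-Entities projection ("The entity or complex type ... cannot be constructed in a LINQ to Entities query"), so you may need to project onto anonymous types (or DTOs) first and shape them into DbSeries in memory. A sketch:
var loadedSeries = await context.Series
    .Where(x => x.CurrentUserId == dbUserId)
    .Select(x => new   // anonymous types are allowed in the projection
    {
        x.SeriesId,
        Measurements = x.MeasurementsSeries.Select(ms => new
        {
            ms.MeasurementId,
            ms.FrontDropPhotoId,
            ms.SideDropPhotoId
        })
    })
    .ToListAsync();
// ...then map the anonymous results to DbSeries/DbMeasurement in memory.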
Firstly, async/await will not help you here. It isn't a "go faster" type of operation; it is about accommodating systems that can be doing something else while this operation is computing. If anything, it makes an operation slower in exchange for making the system more responsive.
My recommendation would be to separate your concerns: on the one hand you want to display detailed data; on the other hand you want to plot an overall graph. Separate these. A user doesn't need to see details for every record at once, so paginating server-side will greatly reduce the raw amount of data in play at any one time. Graphs want to see all the data, but they don't care about "heavy" details like bitmaps.
The next thing would be to separate your view's model from your domain model (entity). Doing stuff like:
var measurementToAdd = new DbMeasurement
{
    CurrentSeries = series,
    MeasurementId = measurement.MeasurementId,
    FrontDropPhotoId = measurement.FrontDropPhotoId,
    FrontDropPhoto = measurement.FrontDropPhotoId.HasValue
        ? await GetDbDropPhotoById(measurement.FrontDropPhotoId.Value)
        : null,
    SideDropPhotoId = measurement.SideDropPhotoId,
    SideDropPhoto = measurement.SideDropPhotoId.HasValue
        ? await GetDbDropPhotoById(measurement.SideDropPhotoId.Value)
        : null,
};
... is just asking for trouble. Any code that accepts a DbMeasurement should receive a complete (or completable) DbMeasurement, not a partially populated entity. It will burn you in the future. Define a view model for the data grid and populate that. This way you clearly differentiate what is an entity model and what is the view's model.
Next, for the data grid, absolutely implement server-side pagination:
public async Task<ICollection<MeasurementViewModel>> GetMeasurements(Guid seriesId, int pageNumber, int pageSize)
{
    using (var context = new DDropContext())
    {
        var measurementsForSeries = await context.Measurements
            .Where(x => x.CurrentSeriesId == seriesId)
            .OrderBy(x => x.MeasurementId) // Skip/Take requires a stable ordering
            .Select(x => new MeasurementViewModel
            {
                MeasurementId = x.MeasurementId,
                FrontDropPhoto = x.FrontDropPhoto.ImageData,
                SideDropPhoto = x.SideDropPhoto.ImageData
            })
            .Skip(pageNumber * pageSize)
            .Take(pageSize)
            .ToListAsync();
        return measurementsForSeries;
    }
}
This assumes that we want to pull image data for the rows if available. Leverage the navigation properties for related data in the query rather than iterating over results and going back to the database for each and every row.
For the graph plot you can return either the raw integer data or a data structure for just the fields needed rather than relying on the data returned for the grid. It can be pulled for the entire table without having the "heavy" image data. It may seem counter-productive to go to the database when the data might already be loaded once already, but the result is two highly efficient queries rather than one very inefficient query trying to serve two purposes.
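A sketch of what that plot-specific query might look like (PlotPoint is a hypothetical DTO; the numeric fields the plot needs are assumed to live on Measurements):
public class PlotPoint // hypothetical DTO: just the fields the plot needs
{
    public Guid MeasurementId { get; set; }
    // ... numeric fields for the plot
}

public async Task<List<PlotPoint>> GetPlotData(Guid seriesId)
{
    using (var context = new DDropContext())
    {
        // Project only the small numeric fields; no image columns are touched.
        return await context.Measurements
            .Where(x => x.CurrentSeriesId == seriesId)
            .Select(x => new PlotPoint
            {
                MeasurementId = x.MeasurementId
                // ... plus whichever numeric fields the plot actually needs
            })
            .ToListAsync();
    }
}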
Why are you reinventing the wheel and manually loading and constructing your related entities? You're causing an N+1 selects problem, resulting in abhorrent performance. Let EF query for related entities efficiently via .Include.
Example:
var results = context.Series
    .AsNoTracking()
    // EF6 spells nested includes with a Select inside Include
    // (ThenInclude is EF Core syntax):
    .Include(s => s.MeasurementsSeries.Select(ms => ms.FrontDropPhoto))
    .Include(s => s.MeasurementsSeries.Select(ms => ms.SideDropPhoto))
    .Where( ... )
    .ToList(); // should use async
This will speed up execution dramatically, though it may still not be quick enough for your requirements if it needs to construct hundreds of thousands to millions of objects, in which case you can retrieve the data in batches.
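A sketch of such batching, sequential for simplicity (each batch could instead be dispatched to parallel tasks, each with its own context; dbUserId and the batch size are assumptions):
const int batchSize = 10000;
var allSeries = new List<DbSeries>();
for (var skip = 0; ; skip += batchSize)
{
    using (var context = new DDropContext()) // fresh context per batch
    {
        var batch = context.Series
            .AsNoTracking()
            .Include(s => s.MeasurementsSeries.Select(ms => ms.FrontDropPhoto))
            .Include(s => s.MeasurementsSeries.Select(ms => ms.SideDropPhoto))
            .Where(s => s.CurrentUserId == dbUserId)
            .OrderBy(s => s.SeriesId) // Skip/Take needs a stable order
            .Skip(skip)
            .Take(batchSize)
            .ToList();
        if (batch.Count == 0) break;
        allSeries.AddRange(batch);
    }
}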

How to query IQueryable with Include - ThenInclude?

In .NET Core 2.2 I'm stuck with filtering IQueryable built as:
_context.Ports.Include(p => p.VesselsPorts)
    .ThenInclude(p => p.Arrival)
    .Include(p => p.VesselsPorts)
    .ThenInclude(p => p.Departure)
    .OrderBy(p => p.PortLocode);
in a many-to-many relation. The entity models are:
public class PortModel
{
    [Key]
    public string PortLocode { get; set; }
    public double? MaxKnownLOA { get; set; }
    public double? MaxKnownBreadth { get; set; }
    public double? MaxKnownDraught { get; set; }
    public virtual ICollection<VesselPort> VesselsPorts { get; set; }
}

public class VesselPort
{
    public int IMO { get; set; }
    public string PortLocode { get; set; }
    public DateTime? Departure { get; set; }
    public DateTime? Arrival { get; set; }
    public VesselModel VesselModel { get; set; }
    public PortModel PortModel { get; set; }
}
Based on this SO answer, I managed to create LINQ like this:
_context.Ports.Include(p => p.VesselsPorts).ThenInclude(p => p.Arrival).OrderBy(p => p.PortLocode)
    .Select(p => new PortModel
    {
        PortLocode = p.PortLocode,
        MaxKnownBreadth = p.MaxKnownBreadth,
        MaxKnownDraught = p.MaxKnownDraught,
        MaxKnownLOA = p.MaxKnownLOA,
        VesselsPorts = p.VesselsPorts.Where(vp => vp.Arrival > DateTime.UtcNow.AddDays(-1)).ToList() as ICollection<VesselPort>
    }).AsQueryable();
BUT what I need is to find all port records where the number of VesselsPorts entries with Arrival > DateTime.UtcNow.AddDays(-1) is greater than some value, e.g. int x = 5 (for the example). And I have no clue how to do it :/
Thanks to @GertArnold's comment, I ended up with this query:
ports = ports.Where(p => p.VesselsPorts.Where(vp => vp.Arrival > DateTime.UtcNow.AddDays(-1)).Count() > x);
When using Entity Framework, people tend to use Include instead of Select to save themselves some typing. It is seldom wise to do so.
The DbContext holds a ChangeTracker. Every complete row from any table that you fetch during the lifetime of the DbContext is stored in the ChangeTracker, together with a clone. You get a reference to one of them (the copy, or maybe the original). If you change properties of the data you got, they are changed in the copy held by the ChangeTracker. During SaveChanges, the original is compared to the copy to see whether the data must be saved.
So if you are fetching quite a lot of data and use Include, then every fetched item is cloned. This might slow down your queries considerably.
Apart from this cloning, you will probably fetch more properties than you actually plan to use. Database management systems are extremely optimized at combining tables and searching rows within tables. One of the slower parts is the transfer of the selected data to your local process.
For example, if you have a database with Schools and Students, with the obvious one-to-many relation, then every Student will have a foreign key to the School he attends.
So if you ask for School [10] with its 2000 Students, then every Student will have a foreign key value of [10]. If you use Include, you will be transferring this same value [10] two thousand times. What a waste of processing power!
In Entity Framework, when querying data, always use Select to select the properties, and Select only the properties that you actually plan to use. Only use Include if you plan to change the fetched items.
Certainly don't use Include to save you some typing!
Requirement: Give me the Ports with their Vessels
var portsWithTheirVessels = dbContext.Ports
    .Where(port => ...) // if you don't want all Ports
    .Select(port => new
    {
        // only select the properties that you want:
        PortLocode = port.PortLocode,
        MaxKnownLOA = port.MaxKnownLOA,
        MaxKnownBreadth = port.MaxKnownBreadth,
        MaxKnownDraught = port.MaxKnownDraught,

        // The Vessels in this port:
        Vessels = port.VesselsPorts.Select(vessel => new
        {
            // again: only the properties that you plan to use
            IMO = vessel.IMO,
            ...

            // do not select the foreign key, you already know the value!
            // PortLocode = vessel.PortLocode,
        })
        .ToList(),
    });
Entity Framework knows your one-to-many relation, and knows that if you use the virtual ICollection it should do a (Group-)Join.
Some people prefer to do the Group-Join themselves, or they use a version of Entity Framework that does not support using the ICollection.
var portsWithTheirVessels = dbContext.Ports.GroupJoin(dbContext.VesselPorts,
    port => port.PortLocode,     // from every Port take the primary key
    vessel => vessel.PortLocode, // from every Vessel take the foreign key to Port

    // parameter resultSelector: take every Port with its zero or more Vessels to make one new
    (port, vesselsInThisPort) => new
    {
        PortLocode = port.PortLocode,
        ...

        Vessels = vesselsInThisPort.Select(vessel => new
        {
            ...
        })
        .ToList(),
    });
Alternative:
var portsWithTheirVessels = dbContext.Ports.Select(port => new
{
    PortLocode = port.PortLocode,
    ...

    Vessels = dbContext.VesselPorts.Where(vessel => vessel.PortLocode == port.PortLocode)
        .Select(vessel => new
        {
            ...
        })
        .ToList(),
});
Entity framework will translate this also to a GroupJoin.
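Applied to the question's actual requirement (ports with more than x recent arrivals), the same Select-only style might look like this (a sketch):
var cutoff = DateTime.UtcNow.AddDays(-1);
var busyPorts = dbContext.Ports
    // keep only ports with more than x recent arrivals:
    .Where(port => port.VesselsPorts.Count(vp => vp.Arrival > cutoff) > x)
    .Select(port => new
    {
        PortLocode = port.PortLocode,
        // and only the recent visits, with just the properties you need:
        RecentArrivals = port.VesselsPorts
            .Where(vp => vp.Arrival > cutoff)
            .Select(vp => new { vp.IMO, vp.Arrival })
            .ToList(),
    });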

NHibernate join does not fully populate objects within transactions

We have a situation where a transaction is started on an NHibernate session, some rows are inserted into a couple of tables, and a query is executed which performs a join on the two tables.
Models:
public class A
{
    public virtual string ID { get; set; } // Primary key
    public IList<B> Bs { get; set; }
}

public class B
{
    public virtual string ID { get; set; } // Foreign key
}
NHibernate maps:
public class AMap : ClassMap<A>
{
    public AMap()
    {
        Table("dbo.A");
        Id(x => x.ID).Not.Nullable();
        HasMany(u => u.Bs).KeyColumn("ID");
    }
}

public class BMap : ClassMap<B>
{
    public BMap()
    {
        Table("dbo.B");
        Map(x => x.ID, "ID").Not.Nullable();
    }
}
A transaction is started and the following code is executed:
var a1 = new A
{
    ID = "One"
};
session.Save(a1);

var a2 = new A
{
    ID = "Two"
};
session.Save(a2);
session.Flush();

var b1 = new B
{
    ID = a1.ID
};
session.Save(b1);

var b2 = new B
{
    ID = a2.ID
};
session.Save(b2);
session.Flush();

A a = null;
B b = null;
var result = _session.QueryOver(() => a)
    .JoinQueryOver(() => a.Bs, () => b, JoinType.LeftOuterJoin)
    .List();
The result is a list of A. In the list, the objects of A do not have their Bs populated.
Although this example is simplified, the actual objects in question have additional properties associated with corresponding table columns; all those properties populate as expected; the issue is confined to the property mapped as HasMany (foreign key association).
If the table is populated first, and then the query is performed (either as separate processes or in consecutive transactions), the objects of A do have their Bs correctly populated. In other words, it seems as though queries executed in a transaction are not able to see the complete effect of inserts previously performed within the same transaction.
Inspection of the SQL generated by NHibernate confirms that it correctly performed all the inserts and correctly formulated the join query; it appears that it simply did not correctly populate the objects from the query result.
Are there any special steps required to ensure that database inserts/updates performed via NHibernate are fully visible to subsequent fetches in the same transaction?
HasMany(u => u.Bs).KeyColumn("ID");
looks wrong to me. The key column of a one-to-many relation should be A_ID: the foreign key on table B pointing back at A, not B's own ID.
You do lots of strange things in your code; I hope your real code doesn't look like this. You should not set foreign keys directly: they are managed by NH. You should not Flush all the time; normally you never flush.
Also note that the left outer join is not used to populate the list of Bs in A. (There is no information telling NHibernate that this would be a valid option.) There are mapping tricks to load entities and one of their collections in one query, but this is usually not a good idea, and I suggest not trying it unless you really know NH and how queries are processed. You'll only get the same A multiple times, plus some performance problems, if you don't break it completely. If you are afraid of the N+1 problem (I hope you are), use batch-size instead.
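For reference, batch-size in Fluent NHibernate is set on the collection mapping; a sketch, using the corrected key column:
HasMany(u => u.Bs)
    .KeyColumn("A_ID")  // the FK column on dbo.B pointing back at dbo.A
    .BatchSize(25);     // fetch the Bs for up to 25 As in a single query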
Figured out the solution. The gist of it is to add the "child" items to the "parent" and then save that.
So... classes now look like:
public class A
{
    public virtual string ID { get; set; } // Primary key
    public virtual IList<B> Bs { get; set; }
}

public class B
{
    public virtual A A { get; set; } // Foreign key now expressed as reference to "parent" object instead of property containing key value
}
ClassMaps for both parent and child express the relationship as object/list:
public class AMap : ClassMap<A>
{
    public AMap()
    {
        Table("dbo.A");
        Id(x => x.ID).Not.Nullable();
        HasMany(u => u.Bs).KeyColumn("ID").Cascade.SaveUpdate();
    }
}

public class BMap : ClassMap<B>
{
    public BMap()
    {
        Table("dbo.B");
        References(x => x.A, "ID").Not.Nullable(); // B's ID column is now mapped via the reference to A
    }
}
Finally, data is saved by constructing the objects and their relationships before saving them, i.e. the relationships are saved with the objects:
var a1 = new A
{
    ID = "One"
};
var b1 = new B
{
    A = a1
};
a1.Bs = new[] { b1 };
session.Save(a1);

var a2 = new A
{
    ID = "Two"
};
var b2 = new B
{
    A = a2
};
a2.Bs = new[] { b2 };
session.Save(a2);
session.Flush();
This query:
A a = null;
B b = null;
var result = _session.QueryOver(() => a)
    .JoinQueryOver(() => a.Bs, () => b, JoinType.LeftOuterJoin)
    .List();
Now returns the expected result, and within the same session/transaction.

ObjectStateManager error on Attach in foreach loop (Entity Framework)

So I have a table of (new) users and a table of groups. What I'm trying to do is add the users to the groups.
What I thought I'd do is:
using (var context = new MyEntity())
{
    foreach (var csvUser in csvSource)
    {
        User oUser = new User();
        oUser.Firstname = csvUser.Firstname;

        Group oGroup = new Group();
        // Set the primary key for attach
        oGroup.ID = csvUser.GroupID;
        context.Group.Attach(oGroup);
        oUser.Groups.Add(oGroup);
        context.Users.Add(oUser);
    }
    context.SaveChanges();
}
So basically: loop through all the new users, grab their group id from the CSV file (the group already exists in the db), attach the group, and then add the user. However, I'm running into an error: as soon as a user arrives with a group id that has already been attached, the attach call blows up.
An object with a key that matches the key of the supplied object could not be found in the ObjectStateManager. Verify that the key values of the supplied object match the key values of the object to which changes must be applied.
Which I can understand: it's trying to re-attach an object that is already attached and tracked in memory. Is there a way around this? All I need to do is attach a new user to a pre-existing group from the database.
That error is usually associated with ApplyCurrentValues in some form or shape, when it tries to update your (previously) detached entity.
It's not entirely clear why that is happening in your case, but maybe you have something else going on, or you're just 'confusing' EF by attaching the same group over again.
The simplest way, I think, is to just use Find and avoid Attach:
var oGroup = context.Group.Find(csvUser.GroupID);
This loads from the cache if the entity is already tracked, or from the db if needed.
If an entity with the given primary key values exists in the context,
then it is returned immediately without making a request to the store.
Otherwise, a request is made to the store for an entity with the given
primary key values and this entity, if found, is attached to the
context and returned. If no entity is found in the context or the
store, then null is returned
That should fix things for you.
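Dropped into your loop, that might look like this (a sketch of the same code with Find):
using (var context = new MyEntity())
{
    foreach (var csvUser in csvSource)
    {
        User oUser = new User();
        oUser.Firstname = csvUser.Firstname;

        // Returns the tracked Group if one with this key is already
        // attached; otherwise queries the database once and attaches it.
        var oGroup = context.Group.Find(csvUser.GroupID);
        oUser.Groups.Add(oGroup);
        context.Users.Add(oUser);
    }
    context.SaveChanges();
}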
You could also turn off applying values from the db, I think (but I'm unable to check that at the moment).
It seems like you're attaching a new Group instance for each user you're iterating on; try registering each distinct group just once and reusing it:
using (var context = new MyEntity())
{
    // attach each distinct pre-existing group once
    foreach (var groupId in csvSource.Select(x => x.GroupID).Distinct())
    {
        context.Groups.Attach(new Group { ID = groupId });
    }

    // add users
    foreach (var csvUser in csvSource)
    {
        User oUser = new User();
        oUser.Firstname = csvUser.Firstname;

        var existingGroup = context.Groups.Local.Single(x => x.ID == csvUser.GroupID);
        oUser.Groups.Add(existingGroup);
        context.Users.Add(oUser);
    }
    context.SaveChanges();
}
It seems like you have a many-to-many relationship between Users and Groups. If that is the case and you are using Code-First, then your model could be defined like this...
public class User
{
    public int Id { get; set; }
    public string Firstname { get; set; }
    // Other User properties...
    public virtual ICollection<UserGroup> UserGroups { get; set; }
}

public class Group
{
    public int Id { get; set; }
    // Other Group properties...
    public virtual ICollection<UserGroup> UserGroups { get; set; }
}

public class UserGroup
{
    public int UserId { get; set; }
    public User User { get; set; }
    public int GroupId { get; set; }
    public Group Group { get; set; }
}
Next, configure the many-to-many relationship...
public class UserGroupsConfiguration : EntityTypeConfiguration<UserGroup>
{
    public UserGroupsConfiguration()
    {
        // Define a composite key
        HasKey(a => new { a.UserId, a.GroupId });

        // User has many Groups
        HasRequired(a => a.User)
            .WithMany(s => s.UserGroups)
            .HasForeignKey(a => a.UserId)
            .WillCascadeOnDelete(false);

        // Group has many Users
        HasRequired(a => a.Group)
            .WithMany(p => p.UserGroups)
            .HasForeignKey(a => a.GroupId)
            .WillCascadeOnDelete(false);
    }
}
Add the configuration in your DbContext class...
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Configurations.Add(new UserGroupsConfiguration());
    ...
}
Now your task is simpler...
foreach (var csvUser in csvSource)
{
    User oUser = new User();
    oUser.Firstname = csvUser.Firstname;
    oUser.UserGroups = new List<UserGroup>(); // avoid a null collection on the new user

    // Find Group
    var group = context.Groups.Find(csvUser.GroupID);
    if (group == null)
    {
        // TODO: Handle case that group is null
    }
    else
    {
        // Group found, assign it to the new user
        oUser.UserGroups.Add(new UserGroup { Group = group });
        context.Users.Add(oUser);
    }
}
context.SaveChanges();

Saving Entity causes duplicate insert into lookup data

I am using EF 4.1 "code first" to create my db and objects.
Given:
public class Order
{
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual OrderType OrderType { get; set; }
}

public class OrderType
{
    public int Id { get; set; }
    public string Name { get; set; }
}
An order has one order type. An order type is just a lookup table; the values don't change. Using the Fluent API:
//Order
ToTable("order");
HasKey(key => key.Id);
Property(item => item.Id).HasColumnName("order_id").HasColumnType("int");
Property(item => item.Name).HasColumnName("name").HasColumnType("nvarchar").HasMaxLength(10).IsRequired();
HasRequired(item => item.OrderType).WithMany().Map(x => x.MapKey("order_type_id")).WillCascadeOnDelete(false);

//OrderType
ToTable("order_type");
HasKey(key => key.Id);
Property(item => item.Id).HasColumnName("order_type_id").HasColumnType("int");
Property(item => item.Name).HasColumnName("name").HasColumnType("nvarchar").HasMaxLength(100).IsRequired();
Now in our App we load all our lookup data and cache it.
var order = new Order
{
    Name = "Bob",
    OrderType = GetFromOurCache(5) // Get order type for id 5
};
var db = _db.GetContext();
db.Order.Add(order);
db.SaveChanges();
Our you-beaut order is saved, but with a new order type, courtesy of EF. So now we have two identical order types in our database. What can I do to alter this behaviour?
TIA
With EF 4.1 you can do this before calling SaveChanges:
db.Entry(order.OrderType).State = EntityState.Unchanged;
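In context, using the question's own code (a sketch):
var db = _db.GetContext();
db.Order.Add(order);
// Tell EF the cached OrderType already exists in the database,
// so it is not re-inserted alongside the order.
db.Entry(order.OrderType).State = EntityState.Unchanged;
db.SaveChanges();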
Alternatively to Yakimych's solution, you can attach the OrderType to the context before you add the order, to let EF know that the OrderType already exists in the database:
var order = new Order
{
    Name = "Bob",
    OrderType = GetFromOurCache(5) // Get order type for id 5
};
var db = _db.GetContext();
db.OrderTypes.Attach(order.OrderType);
db.Order.Add(order);
db.SaveChanges();
Yakimych / Slauma - thanks for the answers. Interestingly, I tried both ways and neither worked, hence I asked the question. Your answers confirmed that I must be doing something wrong, and sure enough, I wasn't managing my dbContext properly.
Still, it's a pain that EF automatically wants to insert lookup/static data even when you supply the full object (including the lookup's unique Id). It puts the onus on the developer to remember to set the state. To make things a little easier, I do:
var properties = entry.GetType().GetProperties()
    .Where(x => x.PropertyType.GetInterface(typeof(ISeedData).Name) != null);
foreach (var staticProperty in properties)
{
    var n = staticProperty.GetValue(entry, null);
    Entry(n).State = EntityState.Unchanged;
}
in my SaveChanges override.
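Wrapped up, the override might look like this (a sketch; it assumes ISeedData is the marker interface on all lookup entities):
public override int SaveChanges()
{
    var addedEntities = ChangeTracker.Entries()
        .Where(e => e.State == EntityState.Added)
        .Select(e => e.Entity)
        .ToList();
    foreach (var entry in addedEntities)
    {
        // Any navigation property typed as ISeedData is lookup data:
        // pin it to Unchanged so EF never re-inserts it.
        var properties = entry.GetType().GetProperties()
            .Where(x => x.PropertyType.GetInterface(typeof(ISeedData).Name) != null);
        foreach (var staticProperty in properties)
        {
            var n = staticProperty.GetValue(entry, null);
            if (n != null)
                Entry(n).State = EntityState.Unchanged;
        }
    }
    return base.SaveChanges();
}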
Again thanks for the help.
