My WebApi works with a lot of internal references between my objects, and I'm wondering what would be less costly for the application. I'm using EF database first, so I don't really control the generated classes (I know I can edit them, but that's not very smart).
For example, I have some areas with 5 relations, and those relations are deep, but I don't want to return them to the user every time because I won't use all that data; sometimes I just need the parent object. To work around that I'm using AutoMapper and creating some ViewModels into which I copy my object.
At the points in my API where I only want to return some of the entities, I set up AutoMapper and tell it what it should ignore for that case.
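For illustration, the ignore setup I mean looks roughly like this (just a sketch; the type names and the ignored members are placeholders for my real classes):

using AutoMapper;

public class EventProfile : Profile
{
    public EventProfile()
    {
        // copy the entity into the ViewModel, skipping the deep relations this endpoint doesn't need
        CreateMap<Event, EventVM>()
            .ForMember(dest => dest.Participants, opt => opt.Ignore())
            .ForMember(dest => dest.Venue, opt => opt.Ignore());
    }
}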
My problem is, as I said, that I have a lot of data and this system is going to be used by 15k - 20k users. Will AutoMapper ignoring that data become a bottleneck down the road? If so, would it be better to use some other alternative?
If this isn't the best option, what else could I use?
This is an example of how i'm working:
Controller:
public async Task<EventVM> Get(int id)
{
var eventVm = await eventService.Get(id); // "event" is a reserved word, so use a different variable name
return eventVm;
}
Service:
public async Task<EventVM> Get(int id)
{
var eventEntity = await _context.Event.FindAsync(id);
// map the entity to the ViewModel so only the data this endpoint needs is returned
return _mapper.Map<EventVM>(eventEntity);
}
Also, I checked my configuration and Lazy Loading is enabled.
Some of the things in your initial post are not clear at all.
You say you use code first but don't have access to generated classes. Well, if you use code first there won't be generated classes, but you must have some classes initially from which your sql tables get generated, right?
As a rule of thumb, do not use anything from EF in your WebApi. Have your API return only the data and properties you need for each endpoint. This means creating another set of classes, typically DTOs, which are much lighter: they don't have any methods, only public properties with exactly the data you need. Yes, you will need an extra step in between to transform the data, but that is absolutely fine.
This should help you get started; just remember the important rule: return exactly what you need, nothing more, nothing less.
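As a rough sketch (the type and property names are only placeholders, assuming an Event entity like the one in your question), a DTO and the query that projects into it could look like this:

// Hypothetical DTO: only the fields this endpoint actually returns
public class EventDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime StartDate { get; set; }
}

// Projecting straight into the DTO keeps the deep relations out of the query entirely
// (FirstOrDefaultAsync comes from the EF async LINQ extensions)
public async Task<EventDto> Get(int id)
{
    return await _context.Event
        .Where(e => e.Id == id)
        .Select(e => new EventDto { Id = e.Id, Name = e.Name, StartDate = e.StartDate })
        .FirstOrDefaultAsync();
}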
I am maintaining an application which uses EF Core to persist data to a SQL database.
I am trying to implement a new feature which requires me to retrieve an object from the database (let's pretend it's an order), manipulate it and some of the order lines attached to it, and save it back into the database. That wouldn't be a problem, but I have inherited some of this code, so I need to try to stick to the existing way of doing things.
The basic process for data access is :
UI -> API -> Service -> Repository -> DataContext
The methods in the repo follow this pattern (though I have simplified it for the purposes of this question):
public Order GetOrder(int id)
{
return _context.Orders.Include(o=>o.OrderLines).FirstOrDefault(x=>x.Id == id);
}
The service is where business logic and mapping to DTOs are applied; this is what the GetOrder method would look like:
public OrderDTO GetOrder(int id)
{
var ord = _repo.GetOrder(id);
return _mapper.Map<OrderDTO>(ord);
}
So, to retrieve and manipulate an order, my code would look something like this:
public void ManipulateAnOrder()
{
// Get the order DTO from the service
var order = _service.GetOrder(3);
// Manipulate the order
order.UpdatedBy = "Daneel Olivaw";
order.OrderLines.ForEach(ol=>ol.UpdatedBy = "Daneel Olivaw");
_service.SaveOrder(order);
}
And the method in the service which allows this to be saved back to the DB would look something like this:
public void SaveOrder(OrderDTO order)
{
// Get the original item from the database
var original = _repo.GetOrder(order.Id);
// Merge the original and the new DTO together
_mapper.Map(order, original);
_repo.Save(original);
}
Finally, the repository's Save method looks like this:
public void Save(Order order){
_context.Update(order);
_context.SaveChanges();
}
The problem I am encountering is that using this method of mapping the entities from the context into DTOs and back again causes the nested objects (in this instance the OrderLines) to be changed (or recreated) by AutoMapper in such a way that EF no longer recognises them as the entities it has just given us.
This results in errors when updating, along the lines of:
InvalidOperationException: The instance of 'ProductLine' cannot be tracked because another instance with the same key value for {'Id'} is already being tracked.
Now, to me it's not that there is ANOTHER instance of the object being tracked; it's the same one. But I understand that the mapping process has broken that link and EF can no longer determine that they are the same object.
So, I have been looking for ways to rectify this. Two approaches have jumped out at me as promising:
the answer mentioned here: EF & Automapper. Update nested collections
AutoMapper.Collection
AutoMapper.Collection seems to be the better route, but I can't find a good working example of it in use, and the implementation I have done doesn't seem to work.
So, I'm looking for advice from anyone who has used AutoMapper.Collection successfully before, or anyone who has suggestions as to how best to approach this.
Edit: I have knocked up a quick console app as an example. Note that when I say quick I mean... horrible: there is no DI or anything like that, and I have done away with the repositories and services to keep it simple.
I have also left in a commented-out mapper profile which does work, but isn't ideal. You will see what I mean when you look at it.
Repo is here https://github.com/DavidDBD/AutomapperExample
OK, after examining every scenario, and counting on the fact that I did what you're trying to do in a previous project and it worked out of the box: updating your Entity Framework Core NuGet packages to the latest stable version (3.1.8) solved the issue without modifying your code.
AutoMapper in fact "has broken that link", and the mapped entities you are trying to save are a set of new objects, not previously tracked by your DbContext. If the mapped entities were the same objects, you wouldn't have gotten this error.
In fact, it has nothing to do with AutoMapper and the mapping process, but how the DbContext is being used and how the entity states are being managed.
In your ManipulateAnOrder method after getting the mapped entities -
var order = _service.GetOrder(3);
your DbContext instance is still alive and at the repository layer it is tracking the entities you just retrieved, while you are modifying the mapped entities -
order.UpdatedBy = "Daneel Olivaw";
order.OrderLines.ForEach(ol=>ol.UpdatedBy = "Daneel Olivaw");
Then, when you are trying to save the modified entities -
_service.SaveOrder(order);
these mapped entities reach the repository layer and the DbContext tries to add them to its tracking list, but finds that it already has entities of the same type with the same Ids in the list (the previously fetched ones). EF can track only one instance of a specific type with a specific key; hence the complaining message.
One way to solve this, is when fetching the Order, tell EF not to track it, like at your repository layer -
public Order GetOrder(int id, bool tracking = true) // optional parameter
{
if(!tracking)
{
return _context.Orders.Include(o=>o.OrderLines).AsNoTracking().FirstOrDefault(x=>x.Id == id);
}
return _context.Orders.Include(o=>o.OrderLines).FirstOrDefault(x=>x.Id == id);
}
(or you can add a separate method for handling NoTracking calls) and then at your Service layer -
var order = _repo.GetOrder(id, false); // for this operation tracking is false
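If you still want to pursue the AutoMapper.Collection route from your question, the registration usually looks roughly like this (only a sketch; OrderLineDTO is assumed to exist alongside OrderDTO, and EqualityComparison is what lets the collection mapper match incoming line DTOs to the already-tracked OrderLine entities so they are updated in place instead of recreated) -

using AutoMapper;
using AutoMapper.EquivalencyExpression; // extension methods from the AutoMapper.Collection package

var config = new MapperConfiguration(cfg =>
{
    cfg.AddCollectionMappers();
    cfg.CreateMap<Order, OrderDTO>();
    cfg.CreateMap<OrderLine, OrderLineDTO>();
    cfg.CreateMap<OrderDTO, Order>();
    cfg.CreateMap<OrderLineDTO, OrderLine>()
        // match existing tracked OrderLines by Id so they are updated rather than replaced
        .EqualityComparison((dto, entity) => dto.Id == entity.Id);
});
var mapper = config.CreateMapper();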
I define the index API of a controller as follows:
[HttpGet]
public IEnumerable<Blog> GetDatas()
{
return _context.Blogs;
}
This always returns empty even though the database contains many blogs. However, when I do the following, for test reasons only, Entity Framework manages to see the data and can return all the blogs in the database:
[HttpGet]
public IEnumerable<Blog> GetDatas()
{
var blogs = _context.Blogs.ToList();
return _context.Blogs;
}
Any thoughts?
(maybe related to my other unanswered question).
Update 1
To avoid confusion around deferred execution in LINQ: I've tried the following two methods, and with neither of them does the returned JSON object contain the information already in the database. In other words, the serialized objects do not reflect the entities persisted in the database. I think these methods would trigger execution of the LINQ query, correct?
// Method 1:
[HttpGet]
public async Task<ActionResult<IEnumerable<Blog>>> GetDatas()
{
return await _context.Blogs.ToListAsync().ConfigureAwait(false);
}
// Method 2:
[HttpGet]
public IEnumerable<Blog> GetDatas()
{
return _context.Blogs.ToList();
}
As Daniel said, this is by design. See What are the benefits of a Deferred Execution in LINQ? for an extended discussion, but essentially data is loaded when it is used, not when it is requested. The only way you can see that it's empty is in the debugger; your runtime code doesn't see it that way, because as soon as you try to find out whether the first form is empty or not, it will fill with data. At that point (the point of use) it doesn't matter that it was empty up to then - nothing was using it to find out whether it was empty or full.
Think of it a bit like Schrödinger's cat.
It's quite helpful actually:
var w = worldPopulation.Where(e => e.Gender == Gender.Male);
if (name != null)
w = w.Where(e => e.Name == name);
The first query, if it ran immediately, could see 3.5 billion results being downloaded from your DB to your client (a low-spec machine compared with the server); then the name filter would reduce it to a few million. Better to only download a few million into your slow, low-spec machine over a very slow network in the first place... right?
One benefit of only running the query when you actually ask for the data is that at that point you finally, actually KNOW you want the data. Up to that point you might never have needed it, so downloading it would have been a waste of resources.
Use a scoped context or a context pool for the database instead of a singleton or transient one; i.e., use:
services.AddDbContextPool<BlogsContext>(options => {options.UseSqlServer();});
and avoid registrations such as (note ServiceLifetime.Singleton):
services.AddDbContext<BlogsContext>(options => {options.UseSqlServer();}, ServiceLifetime.Singleton);
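For completeness, a pooled registration normally also passes the connection string (just a sketch; BlogsContext and the "Blogs" connection string name are placeholders, and Configuration is the app's IConfiguration):

services.AddDbContextPool<BlogsContext>(options =>
    options.UseSqlServer(Configuration.GetConnectionString("Blogs")));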
So I am currently extending the classes that Entity Framework automatically generated for each of the tables in my database. I placed some helpful methods for processing data inside these partial classes that do the extending.
My question, however, concerns the insertion of rows into the database. Would it be good form to include a method in my extended classes to handle this?
For example, in the Product controller's Create method I have something like this:
[HttpPost]
public ActionResult Create(Product p)
{
p.InsertThisProductIntoTheDatabase(); //my custom method for inserting into db
return View();
}
Something about this feels wrong to me, but I can't put my finger on it. It feels like this functionality should instead be placed inside a generic MyHelpers.cs class, or something, and then just do this:
var h = new MyHelpers();
h.InsertThisProductIntoTheDatabase(p);
What do you guys think? I would prefer to do this the "correct" way.
MVC 5, EF 6
Edit: the InsertThisProductIntoTheDatabase method might look something like:
public partial class Product
{
public void InsertThisProductIntoTheDatabase()
{
var context = new MyEntities();
this.CreatedDate = DateTime.Now;
this.CreatedByID = SomeUserClass.ID;
//some additional transformation/preparation of the object's data would be done here too. My goal is to bring all of this out of the controller.
context.Products.Add(this);
context.SaveChanges(); // persist the new row
}
}
One of the problems I see is that the Entity Framework DbContext is a unit of work. If you create a unit of work on Application_BeginRequest and pass it into the controller constructor, it acts as a unit of work for the entire request. Maybe it's only updating one entity in your scenario, but you could be writing more information to your database. Unless you are wrapping everything in a TransactionScope, all these saves are going to be independent, which could leave your database in an inconsistent state. And even if you are wrapping everything with a TransactionScope, I'm pretty sure the transaction is going to be promoted to the DTC, because you are making multiple physical connections in a single controller and SQL Server isn't that smart.
Going the BeginRequest route seems like less work than adding methods to all of your entities to save themselves. Another issue is that an EF entity isn't really supposed to know anything about its own persistence; that's what the DbContext is for. So putting a reference back to the DbContext breaks this isolation.
Your second reason, adding audit information to the entity: again, adding this to each entity is a lot of work. You could override SaveChanges on the context and do it once for every entity. See this SO answer.
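A minimal sketch of that SaveChanges override (assuming your entities expose the CreatedDate/CreatedByID properties from your example through a hypothetical IAuditable interface added via the partial classes):

public partial class MyEntities // the generated context
{
    public override int SaveChanges()
    {
        // set the audit fields once, for every newly added entity that supports them
        foreach (var entry in ChangeTracker.Entries().Where(e => e.State == EntityState.Added))
        {
            var auditable = entry.Entity as IAuditable; // hypothetical interface on your partial entity classes
            if (auditable != null)
            {
                auditable.CreatedDate = DateTime.Now;
                auditable.CreatedByID = SomeUserClass.ID;
            }
        }
        return base.SaveChanges();
    }
}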
By going down this road I think you are breaking SOLID design principles, because your entities violate SRP. You introduce a bunch of coupling, and you end up writing more code than you need. So I'd advocate against doing it your way.
Why don't you simply use:
db.Products.Add(p);
db.SaveChanges();
Your code would be much cleaner, and it will certainly be easier for you to manage it and get help with it in the future. Most of the samples available on the internet use this scheme. Extension methods on entities do not look pleasant.
BTW: isn't the InsertThisProductIntoTheDatabase() method name a bit too long?
My .net web service reads an entity from the DB and sends it to a client application.
The client application modifies some fields in the entity and then submits the entity back to the server to be updated in the DB.
The surefire but laborious way to do this goes something like:
public void Update(MyEntity updatedEntity)
{
using (var context = new MyDataContext())
{
var existingEntity = context.MyEntities.Single(e => e.Id == updatedEntity.Id);
existingEntity.FirstName = updatedEntity.FirstName;
existingEntity.MiddleName = updatedEntity.MiddleName;
existingEntity.LastName = updatedEntity.LastName;
// Rinse, repeat for all members of MyEntity...
context.SubmitChanges();
}
}
I don't want to go down this path because it forces me to specify each and every member property of MyEntity. This will likely break if MyEntity's structure is changed.
How can I take the incoming updatedEntity and introduce it to LINQ to SQL whole for update?
I've tried achieving this with the DataContext's Attach() method and entered a world of pain.
Is Attach() the right way to do it? Can someone point to a working example of how to do this?
Attach is indeed one way to do it.
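For reference, the Attach route typically looks something like this (a sketch only; for LINQ to SQL to accept the entity as modified without original values, the table needs a timestamp/rowversion column or members set to UpdateCheck.Never):

public void Update(MyEntity updatedEntity)
{
    using (var context = new MyDataContext())
    {
        // attach the detached entity and mark it as modified so the whole row is updated
        context.MyEntities.Attach(updatedEntity, true);
        context.SubmitChanges();
    }
}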
That said...
The surefire but laborious way to do this goes something like
The right way if you ask me.
This will likely break if MyEntity's structure is changed
I personally would expect to modify my Update business method in case the database schema has changed:
if it's an internal change that doesn't change the business, then there is just no reason to modify the code that calls your business method. Let your business method be in charge of the internal stuff
if it's a change that requires you to modify your consumers, then so be it; you would have had to update the calling code anyway (at least to populate, for instance, the new properties you added to the entity)
Basically, my opinion on this subject is that you shouldn't try to pass entities to your business layer. I explained why I think that in a previous answer.
In the data architecture I have to contend with, there are no deletes. Instead, all records have a nullable datetime2 column that signals that the record has been "disabled". This means that in the case of direct selections on the entities, I'll always have to add a check to see whether the entity was disabled or not.
So far what I've come up with is just a simple extension method called .Enabled() that gets only the enabled rows. It seems effective so far, but it's also annoying that I have to type that in every case.
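For reference, the extension method amounts to something like this (a sketch; ISoftDeletable and DisabledDate are stand-ins for however the nullable datetime2 column is exposed on the entities):

// Hypothetical marker interface implemented by the entities (e.g. via partial classes)
public interface ISoftDeletable
{
    DateTime? DisabledDate { get; }
}

public static class QueryExtensions
{
    // .Enabled() keeps only the rows that have not been "disabled"
    public static IQueryable<T> Enabled<T>(this IQueryable<T> source) where T : class, ISoftDeletable
    {
        return source.Where(e => e.DisabledDate == null);
    }
}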
Surely someone else has skinned this cat before. Is there a better way to do this with Entity Framework that I'm just not privy to?
I suppose you could do something like this
public class MyContext : DbContext
{
public IDbSet<Thing> Things { get; set; }
public IQueryable<Thing> EnabledThings
{
get
{
return Things.Where(t => t.Enabled);
}
}
}
or the same as an extension method (but on the context, not the DbSet/queryable).
Personally, in practice I actually do exactly as you have in your example and use Things.Enabled().Whatever.
I don't know of anything native to Entity Framework. But normally when I run into this, I create a "repository" layer that I run most database transactions through. Then I create methods like GetAll(), which return all items with the appropriate where clause in place to hide "deleted" items.
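A rough sketch of what such a repository method looks like (names are placeholders; the filter mirrors the nullable "disabled" column from the question):

public class ThingRepository
{
    private readonly MyContext _context;

    public ThingRepository(MyContext context)
    {
        _context = context;
    }

    // reads go through the repository, so the soft-delete filter lives in one place
    public IQueryable<Thing> GetAll()
    {
        return _context.Things.Where(t => t.DisabledDate == null);
    }
}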