Entity Framework - improve the efficiency of queries that retrieve lots of data - c#

I have a database with lots of data (Excel file management).
The application manages objects, each of which contains an Excel file (number of sheets, list of rows for each sheet).
The application contains a Data Grid and a list of sheets. The user selects a revision number and a sheet name, and the lines of that sheet are displayed.
The objects are built like this:
A Version object contains a list of Pages, and each Page contains a list of PageLines.
What is the best way to retrieve the data?
For example, my PopulateGrid method:
public void PopulateGrid()
{
CurrentPageLineGridObjects.Clear();
PreviousPageLineGridObjects.Clear();
SetCurrentConnectorPageList();
// get current revision
CurrentPageLineGridObjects = CurrentCombinedPageList.Where(page => page.Name ==
PageNameSelected).FirstOrDefault().PageLines.ToList().ToObservablePageLineGridObjectCollection();
//get prev revision
RevisionCOMBINED prevRevCombined = pgroupDataService.GetRevisionCombinedForPGroup(((PGroup)PGroupSelected.Object).Id).Result;
// get pages and pagelines for revision eeprom and override.
List<Page> eepromPages =
revisionEEPROMDataService.GetEEPROMPages(prevRevCombined.RevisionEEPROM.Id).Result;
}
public async Task<List<Page>> GetEEPROMPages(int eepromRevId)
{
string[] includes = { "Pages", "Pages.PageLines" };
IEnumerable<RevisionEEPROM> list = (IEnumerable<RevisionEEPROM>)await dataService.GetAll(includes);
return list.Where(r => r.Id == eepromRevId).SelectMany(p => p.Pages).ToList();
}
public async Task<IEnumerable<T>> GetAll()
{
using (DeployToolDBContex contex = _contexFactory.CreateDbContext())
{
IEnumerable<T> entities = await contex.Set<T>().ToListAsync();
return entities;
}
}
As you can see I pull out all the version data along with all the Sheets and all the PageLines and only then filter by the given version key.
It takes me quite a while to load.
I would appreciate any advice.
I tried to use IQueryable:
public async Task<List<T>> GetQueryable(string[] includes = null)
{
using (DeployToolDBContex context = _contextFactory.CreateDbContext())
{
if (includes != null)
{
var query = context.Set<T>().AsQueryable();
foreach (var include in includes)
query = query.Include(include);
return query.ToList();
}
else
{
List<T> entities = await context.Set<T>().AsQueryable().ToListAsync();
return entities;
}
}
}

This is a terrible use of EF. For a start, code like this:
IEnumerable<RevisionEEPROM> list = (IEnumerable<RevisionEEPROM>)await dataService.GetAll(includes);
return list.Where(r => r.Id == eepromRevId).SelectMany(p => p.Pages).ToList();
You are fetching the entire table and associated includes (based on that includes array passed) into memory before filtering.
Given that you are scoping the DbContext within that data service method with a using block, the best option would be to introduce a GetPagesForEepromRevision() method in your data service to fetch the pages for a given ID. Your generic implementation should be a base class for these data services, so that it provides common functionality but can be extended with specific methods that optimize queries for each area. For instance, if you have:
public class DataService<T>
{
public async Task<IEnumerable<T>> GetAll() {...}
// ...
}
extend it using:
public class EepromDataService : DataService<EEPROM>
{
    public async Task<IEnumerable<Page>> GetPagesForEepromRevision(int eepromRevId)
    {
        using (DeployToolDBContext context = _contexFactory.CreateDbContext())
        {
            var pages = await context.Set<EEPROM>()
                .Where(x => x.Id == eepromRevId)
                .SelectMany(x => x.Pages)
                .ToListAsync();
            return pages;
        }
    }
}
So if your calling code was creating something like a var dataService = new DataService<EEPROM>(); this would change to var dataService = new EepromDataService();
The IQueryable option mentioned before:
public IQueryable<T> GetQueryable()
{
var query = _context.Set<T>().AsQueryable();
return query;
}
Then when you go to fetch your data:
var results = await dataService.GetQueryable()
.Where(r => r.Id == eepromRevId)
.SelectMany(r => r.Pages)
.ToListAsync();
return results;
This requires either a Unit of Work pattern, which would scope the DbContext at the consumer level (e.g. the GetEEPROMPages method), or a shared dependency-injected DbContext that spans both the caller, where ToListAsync would be called, and the data service. Since your example scopes the DbContext inside the data service with a using block, that's probably a bigger change.
Overall you need to review your use of asynchronous vs. synchronous calls. Other methods do things like:
RevisionCOMBINED prevRevCombined = pgroupDataService.GetRevisionCombinedForPGroup(((PGroup)PGroupSelected.Object).Id).Result;
Calling .Result like this is very bad practice: it blocks the calling thread and can deadlock when a synchronization context is present, for example in a UI application. If you need to call async methods from within a synchronous method, there are proper ways to do it that ensure things like exception bubbling can occur. For examples, see "How to call asynchronous method from synchronous method in C#?". If the code doesn't need to be asynchronous, leave it synchronous. async is not a silver "go faster" bullet; it is used to make the supporting code more responsive, so long as that code is actually written to leverage async the entire way (e.g. HTTP web requests in ASP.NET).
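As a rough illustration of "async the entire way", here is a minimal sketch. GetRevisionIdAsync and PopulateGridAsync are hypothetical stand-ins, not the asker's actual methods; the point is only that each caller awaits the task instead of blocking on .Result.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Hypothetical stand-in for a data service call such as GetRevisionCombinedForPGroup.
    public static async Task<int> GetRevisionIdAsync()
    {
        await Task.Delay(10); // simulates I/O latency
        return 42;
    }

    // The caller is async too and awaits the result rather than calling .Result.
    public static async Task<string> PopulateGridAsync()
    {
        int revisionId = await GetRevisionIdAsync();
        return $"Loaded revision {revisionId}";
    }

    public static async Task Main()
    {
        Console.WriteLine(await PopulateGridAsync());
    }
}
```

In a UI application the event handler that triggers the load would itself be async and await PopulateGridAsync, so no thread is blocked while the query runs.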

Related

IAsyncEnumerable and database queries

I have three controller methods returning IAsyncEnumerable of WeatherForecast.
The first one #1 uses SqlConnection and yields results from an async reader.
The second one #2 uses EF Core with the ability to use AsAsyncEnumerable extension.
The third one #3 uses EF Core and ToListAsync method.
I think the downside of #1 and #2 is if I, for example, do something time-consuming inside while or for each then the database connection will be open till the end. In scenario #3 I'm able to iterate over the list with a closed connection and do something else.
But, I don't know if IAsyncEnumerable makes sense at all for database queries. Are there any memory and performance issues? If I use IAsyncEnumerable for returning let's say HTTP request from API, then once a response is returned it's not in memory and I'm able to return the next one and so on. But what about the database, where is the whole table if I select all rows (with IAsyncEnumerable or ToListAsync)?
Maybe it's not a question for StackOverflow and I'm missing something big here.
#1
[HttpGet("db", Name = "GetWeatherForecastAsyncEnumerableDatabase")]
public async IAsyncEnumerable<WeatherForecast> GetAsyncEnumerableDatabase()
{
var connectionString = "";
await using var connection = new SqlConnection(connectionString);
string sql = "SELECT * FROM [dbo].[Table]";
await using SqlCommand command = new SqlCommand(sql, connection);
connection.Open();
await using var dataReader = await command.ExecuteReaderAsync();
while (await dataReader.ReadAsync())
{
yield return new WeatherForecast
{
Date = Convert.ToDateTime(dataReader["Date"]),
Summary = Convert.ToString(dataReader["Summary"]),
TemperatureC = Convert.ToInt32(dataReader["TemperatureC"])
};
}
await connection.CloseAsync();
}
#2
[HttpGet("ef", Name = "GetWeatherForecastAsyncEnumerableEf")]
public async IAsyncEnumerable<WeatherForecast> GetAsyncEnumerableEf()
{
await using var dbContext = _dbContextFactory.CreateDbContext();
await foreach (var item in dbContext
.Tables
.AsNoTracking()
.AsAsyncEnumerable())
{
yield return new WeatherForecast
{
Date = item.Date,
Summary = item.Summary,
TemperatureC = item.TemperatureC
};
}
}
#3
[HttpGet("eflist", Name = "GetWeatherForecastAsyncEnumerableEfList")]
public async Task<IEnumerable<WeatherForecast>> GetAsyncEnumerableEfList()
{
await using var dbContext = _dbContextFactory.CreateDbContext();
var result = await dbContext
.Tables
.AsNoTracking()
.Select(item => new WeatherForecast
{
Date = item.Date,
Summary = item.Summary,
TemperatureC = item.TemperatureC
})
.ToListAsync();
return result;
}
Server-Side
If I only cared about the server, I'd go with option 4 in .NET 6:
Use an injected DbContext, write a LINQ query and return the results with AsAsyncEnumerable() instead of ToListAsync()
public class WeatherForecastsController : ControllerBase
{
    private readonly WeatherDbContext _dbContext;

    public WeatherForecastsController(WeatherDbContext dbContext)
    {
        _dbContext = dbContext;
    }

    // Not marked async: the method returns the query's IAsyncEnumerable directly.
    // An async iterator can only yield items; it cannot return a sequence.
    public IAsyncEnumerable<WeatherForecast> GetAsync()
    {
        return _dbContext.Forecasts.AsNoTracking()
            .Select(item => new WeatherForecast
            {
                Date = item.Date,
                Summary = item.Summary,
                TemperatureC = item.TemperatureC
            })
            .AsAsyncEnumerable();
    }
}
A new Controller instance is created for every request, which means the DbContext will be around for as long as the request is being processed.
The [FromServices] attribute can be used to inject a DbContext into the action method directly. The behavior doesn't really change, the DbContext is still scoped to the request :
public IAsyncEnumerable<WeatherForecast> GetAsync([FromServices] WeatherDbContext dbContext)
{
...
}
ASP.NET Core will emit a JSON array but at least the elements will be sent to the caller as soon as they're available.
Client-Side
The client will still have to receive the entire JSON array before deserialization.
One way to handle this in .NET 6 is to use DeserializeAsyncEnumerable to parse the response stream and emit items as they come:
using var stream = await client.GetStreamAsync(...);
// The element type must be given explicitly; it cannot be inferred from the stream.
var forecasts = JsonSerializer.DeserializeAsyncEnumerable<WeatherForecast>(stream, new JsonSerializerOptions
{
    DefaultBufferSize = 128
});
await foreach (var forecast in forecasts)
{
    ...
}
The default buffer size is 16KB so a smaller one is needed if we want to receive objects as soon as possible.
This is a parser-specific solution though.
Use a streaming JSON response
A common workaround to this problem is to use streaming JSON, aka JSON per line, aka Newline Delimited JSON, aka JSON-NL or whatever. All names refer to the same thing: sending a stream of unindented JSON objects separated by newlines. It's an old technique that many tried to hijack and present as their own.
{ "Date": "2022-10-18", "Summary": "Blah", "TemperatureC": 18.5 }
{ "Date": "2022-10-18", "Summary": "Blah", "TemperatureC": 18.5 }
{ "Date": "2022-10-18", "Summary": "Blah", "TemperatureC": 18.5 }
Taken as a whole, that's not a valid JSON document, but many parsers can handle it. Even if a parser can't, we can simply read one line of text at a time and parse it.
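Reading one line at a time and parsing it could look roughly like the sketch below. The WeatherForecast class here is an assumed shape (with TemperatureC as a double so that 18.5 fits), and NdjsonReader and ReadForecasts are hypothetical names; the payload is an in-memory string standing in for a response stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Assumed shape of the payload; not taken from the original answer.
public class WeatherForecast
{
    public DateTime Date { get; set; }
    public string Summary { get; set; } = "";
    public double TemperatureC { get; set; }
}

public static class NdjsonReader
{
    // Yields one object per non-empty line, so items are available
    // as soon as their line has been read.
    public static IEnumerable<WeatherForecast> ReadForecasts(TextReader reader)
    {
        string? line;
        while ((line = reader.ReadLine()) != null)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            var item = JsonSerializer.Deserialize<WeatherForecast>(line);
            if (item != null) yield return item;
        }
    }

    public static void Main()
    {
        string payload =
            "{ \"Date\": \"2022-10-18\", \"Summary\": \"Blah\", \"TemperatureC\": 18.5 }\n" +
            "{ \"Date\": \"2022-10-19\", \"Summary\": \"Blah\", \"TemperatureC\": 17.0 }\n";

        using var reader = new StringReader(payload);
        foreach (var forecast in ReadForecasts(reader))
        {
            Console.WriteLine($"{forecast.Date:yyyy-MM-dd} {forecast.Summary}");
        }
    }
}
```

With a real HTTP response, the StringReader would be replaced by a StreamReader over the response stream.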
Use a different protocol
Even streaming JSON responses is a workaround. HTTP doesn't allow server-side streaming in the first place. The server has to send all data even if the client only reads the first 3 items since there's no way to cancel the response.
It's more efficient to use a protocol that does allow streaming. ASP.NET Core 6 offers two options:
Use ASP.NET Core SignalR with Server-side streaming to push data to clients over WebSockets
A gRPC Service that uses server-side streaming
In both cases the server sends objects to clients as soon as they're available. Clients can cancel the stream as needed.
In a SignalR hub, the code could return an IAsyncEnumerable or a Channel:
public class AsyncEnumerableHub : Hub
{
    ...
    // Not marked async: the query's IAsyncEnumerable is returned directly.
    public IAsyncEnumerable<WeatherForecast> GetForecasts()
    {
        return _dbContext.Forecasts.AsNoTracking()
            .Select(item => new WeatherForecast
            {
                Date = item.Date,
                Summary = item.Summary,
                TemperatureC = item.TemperatureC
            })
            .AsAsyncEnumerable();
    }
}
In gRPC, the server method writes objects to a response stream :
public override async Task StreamingFromServer(ForecastRequest request,
IServerStreamWriter<ForecastResponse> responseStream, ServerCallContext context)
{
...
await foreach (var item in queryResults)
{
if (context.CancellationToken.IsCancellationRequested)
{
return;
}
await responseStream.WriteAsync(new ForecastResponse{Forecast=item});
}
}

The operation cannot be completed because the DbContext has been disposed in Web API

My API call is throwing the exception "The operation cannot be completed because the DbContext has been disposed". Please guide me on how to resolve it.
The function, which is defined in a separate class named Common Class:
public IQueryable productquery()
{
try
{
using (ERPEntities db = new ERPEntities())
{
db.Configuration.ProxyCreationEnabled = false;
var query = db.Products.OrderBy(x => x.ID).AsQueryable();
var list = query.Skip(10).Take(50).AsQueryable();
return list;
}
}
catch (Exception ex)
{
throw;
};
}
Function Call from common function to api Controller
[HttpPost]
public IHttpActionResult GetProducts()
{
try
{
var result = _commonFunctions.productquery();
var resultf = result.Include(x => x.Catagory).ToList();
return Ok(resultf);
}
catch (Exception ex)
{
return BadRequest(ex.Message);
}
}
You are trying to include an entity after your DbContext has been disposed.
var resultf = result.Include(x => x.Catagory).ToList();
But you are creating the DbContext in a using block inside the productquery() method. If you want to keep doing so, you should finish all database work inside the using block:
var query = db.Products.Include(x => x.Catagory).OrderBy(x => x.ID).AsQueryable();
The problem is that you create your context:
ERPEntities db = new ERPEntities()
inside a using statement, so it is disposed as soon as the method returns:
return list;
(You certainly do need to dispose of the DB context, but you have to use it before it is disposed. Afterwards, you can't.)
For that reason GetProducts fails when it materializes the query.
One solution would be to call ToList before AsQueryable.
var list = query.Skip(10).Take(50).ToList().AsQueryable();
This materializes your query: the data is fetched from the database when you call ToList, and later on you don't need the context. However, you would then have a new problem with Include(x => x.Catagory). It should be placed where you build the query:
var query = db.Products
.Include(x => x.Catagory)
.OrderBy(x => x.ID)
.AsQueryable();
With that in place, I would say that you no longer need the call to AsQueryable; you can remove it from both the signature of your method and the places where you call it.
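The difference between returning a deferred query and materializing it inside the using block can be reproduced without a database. The sketch below uses a hypothetical FakeContext that refuses to enumerate once disposed; it is only a stand-in for a DbContext, not EF itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a DbContext: enumeration fails once disposed.
public sealed class FakeContext : IDisposable
{
    private bool _disposed;

    public IEnumerable<int> Products
    {
        get
        {
            for (int id = 1; id <= 100; id++)
            {
                if (_disposed) throw new ObjectDisposedException(nameof(FakeContext));
                yield return id;
            }
        }
    }

    public void Dispose() => _disposed = true;
}

public static class DisposalDemo
{
    // Returns a query that has not run yet; the context is disposed on return.
    public static IEnumerable<int> DeferredQuery()
    {
        using (var db = new FakeContext())
        {
            return db.Products.Skip(10).Take(50);
        }
    }

    // Runs the query while the context is still alive.
    public static List<int> MaterializedQuery()
    {
        using (var db = new FakeContext())
        {
            return db.Products.Skip(10).Take(50).ToList();
        }
    }

    public static void Main()
    {
        Console.WriteLine(MaterializedQuery().Count); // prints 50

        try
        {
            DeferredQuery().ToList(); // enumeration starts after disposal
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("context already disposed");
        }
    }
}
```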
The most correct answer is to use dependency injection to pass around one instance of your DBContext per http request.
However, the problem is actually just that you are disposing your DBContext before you're done with it. You can move your DBContext to a class-level variable and instantiate it in the CommonFunctions class' constructor. Then you can destroy it on Dispose. This will make it possible for you to do what you want. You will also have to create a function called SaveChanges() which calls .SaveChanges() on the context when you're done doing whatever it is you're doing that needs saving. This is a very quick explanation of something called the Unit of Work pattern.

Using a separate class file (model) for Entity Framework queries instead of writing in controller itself

Is it OK to write EF queries in a separate class file inside the Models folder instead of writing them in the controller itself?
I ask because I've seen MSDN examples that write all EF queries in the controller itself, while I have also read on MSDN that a controller is meant to be short.
Using models I use this approach:
In the controller:
public JsonResult SaveStorageLocation(StorageLocations objStorageLocation)
{
int Result = 0;
try
{
StorageLocationModel objStorageLocationModel = new StorageLocationModel();
if (objStorageLocation.Id == Guid.Empty)
{
Result = objStorageLocationModel.AddStorageLocation(objStorageLocation, SessionUserId);
}
else
{
Result = objStorageLocationModel.UpdateStorageLocation(objStorageLocation, SessionUserId);
}
}
catch (Exception ex)
{
Result = (int)MethodStatus.Error;
}
return Json(Result, JsonRequestBehavior.AllowGet);
}
In Model class:
public int AddStorageLocation(StorageLocations objStorageLocation, Guid CreatedBy)
{
MethodStatus Result = MethodStatus.None;
int DuplicateRecordCount = db.StorageLocations.Where(x => x.Location.Trim().ToLower() == objStorageLocation.Location.Trim().ToLower()).Count();
if (DuplicateRecordCount == 0)
{
objStorageLocation.Id = Guid.NewGuid();
objStorageLocation.CreatedBy = CreatedBy;
objStorageLocation.CreatedOn = DateTime.Now;
objStorageLocation.ModifiedBy = CreatedBy;
objStorageLocation.ModifiedOn = DateTime.Now;
objStorageLocation.Status = (int)RecordStatus.Active;
db.StorageLocations.Add(objStorageLocation);
db.SaveChanges();
Result = MethodStatus.Success;
}
else
{
Result = MethodStatus.MemberDuplicateFound;
}
return Convert.ToInt32(Result);
}
public int UpdateStorageLocation(StorageLocations objStorageLocationNewDetails, Guid ModifiedBy)
{
MethodStatus Result = MethodStatus.None;
int DuplicateRecordCount =
db.StorageLocations.
Where(x => x.Location == objStorageLocationNewDetails.Location &&
x.Id != objStorageLocationNewDetails.Id).Count();
if (DuplicateRecordCount == 0)
{
StorageLocations objStorageLocationExistingDetails = db.StorageLocations.Where(x => x.Id == objStorageLocationNewDetails.Id).FirstOrDefault();
if (objStorageLocationExistingDetails != null)
{
objStorageLocationExistingDetails.Location = objStorageLocationNewDetails.Location;
objStorageLocationExistingDetails.ModifiedBy = ModifiedBy;
objStorageLocationExistingDetails.ModifiedOn = DateTime.Now;
objStorageLocationExistingDetails.Status = (int)RecordStatus.Active;
db.SaveChanges();
Result = MethodStatus.Success;
}
}
else
{
Result = MethodStatus.MemberDuplicateFound;
}
return Convert.ToInt32(Result);
}
Or is it better to write all the code in controller itself?
I expect your question will be closed pretty soon because it will be deemed opinion-based.
With that aside, there are many advantages if you don't have your queries in your controller.
Controller should not dictate how to access data, that's the job for model.
It is much easier to mock the data access if you inject the model (or service or repository, whatever you want to call it in your application) as a dependency.
You may find out later on that certain queries are much better handled if you migrate them to stored procedures, for they process large amounts of data. This change will be easier to make if controller does not access the data store directly. You could either make the changes in your model class and keep the same interface, or write a new implementation which then gets injected.
Both controller and model can be more easily tested in isolation from each other.
You want your code to be testable, always.
Never put logic in your models. Putting logic in the Models folder makes your structure messy, and it's easier to lose the overview.
You should use a repository class, that implements an interface. In the repository class you can perform EF logic, catch database exceptions and so on.
Your controller will be injected with the repository interface. This way you can test your controller separately from the EF logic, because you can mock that interface. Your repository will be testable for its own functionality as well.
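A minimal sketch of that arrangement follows. All names here (IStorageLocationRepository, InMemoryStorageLocationRepository, StorageLocationController.Save) are hypothetical, and the in-memory class plays the role of a test double; a real implementation would wrap the EF context behind the same interface.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical repository contract; the controller depends only on this.
public interface IStorageLocationRepository
{
    bool Exists(string location);
    void Add(string location);
}

// Test double that keeps data in memory instead of calling EF.
public sealed class InMemoryStorageLocationRepository : IStorageLocationRepository
{
    private readonly HashSet<string> _locations =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public bool Exists(string location) => _locations.Contains(location.Trim());
    public void Add(string location) => _locations.Add(location.Trim());
}

public sealed class StorageLocationController
{
    private readonly IStorageLocationRepository _repository;

    public StorageLocationController(IStorageLocationRepository repository)
    {
        _repository = repository;
    }

    // Returns true when the location was saved, false when it is a duplicate.
    public bool Save(string location)
    {
        if (_repository.Exists(location)) return false;
        _repository.Add(location);
        return true;
    }
}

public static class RepositoryDemo
{
    public static void Main()
    {
        var controller = new StorageLocationController(new InMemoryStorageLocationRepository());
        Console.WriteLine(controller.Save("Warehouse A"));   // prints True
        Console.WriteLine(controller.Save(" warehouse a ")); // prints False (duplicate)
    }
}
```

The controller never sees the database, so the duplicate check can be tested with no EF setup at all.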

Nopcommerce Update entity issue

Using NopCommerce 3.8, Visual Studio 2015 Professional.
I have created a plugin that is responsible for making RESTful calls to my Web API, which exposes a different DB from Nop's.
The process is run via a Nop task. It successfully pulls the data back, and I can step through and manipulate it as I see fit; no issues so far.
The issue comes when I try to update a record in the product table: I perform the update, but nothing happens. No change, no error.
I believe this is because the context has no idea about my newly instantiated product object, but I'm drawing a blank on what I need to do in my particular case.
Similar questions usually reference a "model" object that is a parameter of the method call; "model" has a ToEntity method, which seems to be the answer in similar questions on Stack Overflow.
However, my example doesn't have the ToEntity method, possibly because my parameter is actually a list of products. To clarify, here is my code.
Method in RestClient.cs
public async Task<List<T>> GetAsync()
{
try
{
var httpClient = new HttpClient();
var json = await httpClient.GetStringAsync(ApiControllerURL);
var taskModels = JsonConvert.DeserializeObject<List<T>>(json);
return taskModels;
}
catch (Exception e)
{
return null;
}
}
Method in my Service Class
public async Task<List<MWProduct>> GetProductsAsync()
{
RestClient<MWProduct> restClient = new RestClient<MWProduct>(ApiConst.Products);
var productsList = await restClient.GetAsync();
InsertSyncProd(productsList.Select(x => x).ToList());
return productsList;
}
private void InsertSyncProd(List<MWProduct> inserted)
{
var model = inserted.Select(x =>
{
switch (x.AD_Action)
{
case "I":
//_productService.InsertProduct(row);
break;
case "U":
UpdateSyncProd(inserted);
.....
Then the method to bind and update
private void UpdateSyncProd(List<MWProduct> inserted)
{
var me = inserted.Select(x =>
{
var productEnt = _productRepos.Table.FirstOrDefault(ent => ent.Sku == x.Sku.ToString());
if(productEnt != null)
{
productEnt.Sku = x.Sku.ToString();
productEnt.ShortDescription = x.ShortDescription;
productEnt.FullDescription = x.FullDescription;
productEnt.Name = x.Name;
productEnt.Height = x.Pd_height != null ? Convert.ToDecimal(x.Pd_height) : 0;
productEnt.Width = x.Pd_width != null ? Convert.ToDecimal(x.Pd_width) : 0;
productEnt.Length = x.Pd_depth != null ? Convert.ToDecimal(x.Pd_depth) : 0;
productEnt.UpdatedOnUtc = DateTime.UtcNow;
}
//TODO: set to entity so the context knows and can update
_productService.UpdateProduct(productEnt);
return productEnt;
});
}
So as you can see, I get the data and pass it through to a certain method based on the action code. In that method I iterate over the list, pull the entity from the table, then update via the product service using the modified entity.
So what am I missing here? I'm sure it's one step, and I think it may be either because 1) the context still has no idea about the entity in question, or 2) the calls are incorrect.
Summary
The update is not updating, possibly because the context has no knowledge of the entity, or because my methodology is wrong (probably both).
UPDATE:
I added some logger.InsertLog calls all around my service. It runs through fine, right up to the call to update. But when I check the product again, nothing has changed in the admin section.
plugin
I have provided the full source, as I think this may have something to do with the rest of the code setup.
UPDATE:
Added the following for testing in my execute method.
var myprod = _productRepos.GetById(4852);
myprod.ShortDescription = "db test";
_productRepos.Update(myprod);
This successfully updates the product description. I moved my methods from my service into the task class, but still no luck. The more I look at it, the more I think that my async code is killing off the DB context somehow.
I turned off async and bound the GetById result to a new product, removed the lambda for the switch, and changed it to a foreach loop. It finally seems to update the results.
I cannot confirm whether async is the culprit; currently the Web API seems to be returning the same result even though the data has changed (some weird caching by default in .NET Core?), so I'm creating a new question for that.
UPDATE: It appears that the issue stems from poor debugging of async. In each instance I was iterating over the result of an async call; simply put, I was trying to iterate over a collection that technically may or may not have been completed yet. Probably due to poor debugging, I was not aware of it.
So the answer: await your collection, then iterate afterwards.
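One more thing worth checking, separately from the async explanation: in the code shown, both InsertSyncProd and UpdateSyncProd do their work inside a Select(...) lambda whose result is never enumerated. LINQ's Select is deferred, so such a lambda does not run until something iterates the result, which fits with the foreach rewrite fixing the update. The sketch below demonstrates only that LINQ behavior; the method names and SKU values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSelectDemo
{
    // The lambda never runs because the Select result is never enumerated.
    public static int CountUpdatesWithSelect(List<string> skus)
    {
        int updates = 0;
        var unused = skus.Select(sku => { updates++; return sku; });
        return updates;
    }

    // A foreach loop executes its body once per item, immediately.
    public static int CountUpdatesWithForeach(List<string> skus)
    {
        int updates = 0;
        foreach (var sku in skus)
        {
            updates++;
        }
        return updates;
    }

    public static void Main()
    {
        var skus = new List<string> { "A1", "B2", "C3" };
        Console.WriteLine(CountUpdatesWithSelect(skus));  // prints 0
        Console.WriteLine(CountUpdatesWithForeach(skus)); // prints 3
    }
}
```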

Controller method not updating(same result every time)

I have following method in my mvc controller:
[HttpGet]
public ActionResult UserProfile(String username)
{
var user = db.Users.Find(username);
return View(user);
}
This function returns View with user profile. But result of this is the same, regardless of changes in database.
When I debug it seems like db is not changing at all, while in other controllers everything works just fine.
EDIT:
The place where I make changes:
public ActionResult ExecuteRetreive(String username, String ISBN)
{
if (IsValid(username))
{
var resBook = db.Books.Find(ISBN);
var resUser = db.Users.Find(username);
var resRentedBooks = (from rb in db.RentedBooks
join b in db.Books on rb.ISBN equals b.ISBN
where b.ISBN == ISBN
where rb.Login == username
where rb.Returned == null
select rb).FirstOrDefault();
if (resRentedBooks == null)
{
return RedirectToAction("Fail", "FailSuccess",
new { error = "" });
}
resRentedBooks.Returned = DateTime.Now;
resBook.IsRented = false;
resUser.RentedBooks--;
db.SaveChanges();
return RedirectToAction("Success", "FailSuccess");
}
else
{
return RedirectToAction("Fail", "FailSuccess",
new { error = "Niepoprawna nazwa użytkownika" });
}
}
I'm new to this, so don't laugh at my code :P When I display resUser.RentedBooks, it is the same every time.
As a follow-up to what @JeroenVannevel said in the comments, another problem that you might be having because you're using a static context (and one that I've had to deal with in the past) is that once a specific DbContext has loaded an entity (or a set of entities, in my case), it won't tend to refresh just because some outside changes were made in the database. It loads those entities into Local and just refers to those automatically if you query for them.
The solution, then, is to always put your DbContext calls wrapped up in a using block, since DbContext implements IDisposable.
One word of caution with this approach, since you're using MVC: If you are using lazy loading, and you know that your View will need some information from a child object (or to list the names of a collection of child objects), you will absolutely need to hydrate those child entities before you get out of the using block, or you will find yourself getting exceptions saying that your context has been disposed.
