Count or Select in Entity Framework, Optimal solution - c#

I have to display the number of new messages from the database.
Which solution is optimal?
1) Create a trigger in the database that increments a countValue, and select this countValue from Entity Framework.
2) Count directly through Entity Framework (databaseContext.MyTable.Count();).
Thanks

The database keeps track of its tables' row counts, so you don't need to do that yourself.
EF's .Count() translates to SQL's COUNT() function, which will return your result in no time.
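For example, a filtered count like the sketch below runs entirely on the server as a single SELECT COUNT(*) statement; Messages and IsRead are hypothetical names for illustration:
// Hypothetical unread-message counter: EF translates this to
// SELECT COUNT(*) FROM Messages WHERE IsRead = 0, so no rows
// are pulled into memory.
var newMessageCount = databaseContext.Messages.Count(m => !m.IsRead);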

Related

Does the order of items in DbSet match the order of rows in the database table in Entity Framework Core?

For example, I have a Postgres database with a Clients table where the primary key is an INT and rows appear naturally sorted by ascending Id. And I have a .NET Core application with Entity Framework Core as the ORM and Npgsql as the data provider. So, my main questions:
Will the order of items in the collection returned by this listing always match the order of rows in the original table in the database?
var clients = context.Clients.ToList();
Will Take() applied to a DbSet without OrderBy() always return items from the beginning of the table in the correct order?
Will Skip() applied to a DbSet without OrderBy() always skip items from the beginning of the table in the correct order?
Are these two listings equivalent?
var clients = context.Clients
    .Skip(10)
    .Take(5)
    .ToList();
var clients = context.Clients
    .OrderBy(c => c.Id)
    .Skip(10)
    .Take(5)
    .ToList();
Do I always have to use OrderBy() in expressions with Skip() and Take() when I want to paginate a table?
Is all this behavior determined by the framework or by the data provider? For example, will these things be the same in MSSQL, Postgres and MySQL?
There is no inherent order in a table. Rows may be physically stored in the order of the clustered index, but the engine may return them in any order it sees fit for performance and/or consistency unless you specify a sort order.
The original spec (http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt) says:
If an order by clause is not specified, then the ordering of the
rows of Q is implementation-dependent.
You should not rely on implementation-dependent details, as they are prone to change.
So basically: yes, you must specify an order. No, the two listings are not equivalent. Yes, you need an OrderBy() to use Skip() or Take(). And it is determined by BOTH the provider and the framework, neither of which can be relied upon to stay this way, even between runs on the same version. Just because you get the results in the order you expect a number of times doesn't mean that will continue to happen.
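A minimal sketch of deterministic pagination over the Clients table from the question (page and pageSize are arbitrary values for illustration):
// Always order by a unique key before paginating, so every page
// is stable and no row is skipped or repeated between pages.
int page = 2, pageSize = 5;
var clients = context.Clients
    .OrderBy(c => c.Id)     // deterministic order
    .Skip(page * pageSize)  // rows to pass over
    .Take(pageSize)         // rows for this page
    .ToList();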

What logic determines the insert order of Entity Framework 6

So, I have a DbContext, and I am doing the following operations:
dbContext.SomeTables1.Add(object1);
dbContext.SomeTables2.AddRange(objectArray2);
dbContext.SomeTables3.AddRange(objectArray3);
dbContext.SaveChanges();
EF doesn't insert the records in this order; it inserts them in a seemingly random order. To insert them in the right order, I have to call dbContext.SaveChanges() after each addition. This is not an efficient solution: in my case it takes 10 seconds to do all my inserts, while the random order with one save takes around 3 seconds.
N.B. I need the right order to solve a deadlock issue.
My questions are:
Is this issue resolved in EF7?
I can profile EF and determine the random order, but is there a guarantee that it will consistently be the same order, or does it change between requests? (I can adapt my other code if the answer to this question is positive.)
Is there a better way of maintaining the order than dbContext.SaveChanges() on every addition?
There is no way to specify a save order in EF6 or EF Core.
The issue is not resolved in EF Core, since EF does not consider this an issue.
The order will be the same if the predecessors are the same (which will rarely happen).
When you call SaveChanges, all entities are ordered by an internal order in the method "ProduceDynamicCommands", then sorted again by the method "TryTopologicalSort", which loops, adding the commands that have no predecessor left (if you add A and B, and A depends on B, then B will be inserted before A).
You are left with inserting in batches (see the sketch after this answer).
Since it takes you 3 seconds to perform your inserts, I will assume you have thousands of entities; performing a bulk insert may improve your performance, reducing the 10 seconds to less, and then maybe even the initial 3 seconds!
To improve your performance, you can use http://entityframework-extensions.net/ (PAID, but it supports all cases)
Disclaimer: I'm the owner of the Entity Framework Extensions project.
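As for inserting in batches, here is a minimal sketch, assuming it is the relative order between the three tables that matters: one SaveChanges per table inside a single transaction, rather than one per entity.
using (var transaction = dbContext.Database.BeginTransaction())
{
    // Each SaveChanges flushes one table's inserts, so the order
    // between tables is fixed while inserts within a table still batch.
    dbContext.SomeTables1.Add(object1);
    dbContext.SaveChanges();
    dbContext.SomeTables2.AddRange(objectArray2);
    dbContext.SaveChanges();
    dbContext.SomeTables3.AddRange(objectArray3);
    dbContext.SaveChanges();
    transaction.Commit();
}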
I've found a way to do it. I just thought I'd let you know:
using (var dbContextTransaction = dbContext.Database.BeginTransaction())
{
    dbContext.SomeTables1.Add(object1);
    dbContext.SaveChanges();
    dbContext.SomeTables1.Add(object2);
    dbContext.SaveChanges();
    dbContextTransaction.Commit();
}
To explicitly set the values of the primary keys (and hence the order of the clustered index) in an identity column in EF and EF Core, you need to manually turn on IDENTITY_INSERT before calling _context.SaveChanges(), after which you need to turn IDENTITY_INSERT off, like so:
This example assumes EF Core
// Add your items with the identity primary key field manually set
_context.SomeTables1.AddRange(yourItems);
// Keep one connection open for the whole block:
// IDENTITY_INSERT is scoped to the session.
_context.Database.OpenConnection();
try
{
    _context.Database.ExecuteSqlRaw("SET IDENTITY_INSERT dbo.SomeTables1 ON");
    _context.SaveChanges();
    _context.Database.ExecuteSqlRaw("SET IDENTITY_INSERT dbo.SomeTables1 OFF");
}
finally
{
    _context.Database.CloseConnection();
}
I've found a very simple solution.
Just set the property for the ID (primary key) of each entity to a value that matches your desired order.
SaveChanges() first sorts by this ID, then by other properties.
The assigned ID may already exist in the database; it only serves as a sort key, and a unique ID is assigned when writing to the database.
for (int i = 0; i < objectArray2.Count(); i++)
{
    objectArray2[i].Id = i; // temporary value used only for ordering
}
dbContext.SomeTables2.AddRange(objectArray2);

Entity Framework - how can I optimize “Contains” statement?

In our current application we have performance issues with some of our queries. Usually we have something like:
List<int> idList = some data here…;
var query = (from a in someTable where idList.Contains(a.Id) select a);
While this is acceptable for simple queries, it becomes a bottleneck when there are more items in idList (in some queries we have about 700 IDs to check, for example).
Is there any way to use something other than Contains? We are thinking of using temporary tables: first insert the IDs, then execute a join instead of Contains. But it would seem Entity Framework does not support such operations (creating temporary tables in code) :(
What else can we try?
I suggest using LINQPad; it offers a "Transform to SQL" option which lets you see your query in SQL syntax.
There is a chance that Contains is already the optimal solution (if you're not into messy stuff).
You might also try holding idList as a sorted array and replacing the Contains call with a binary search (you can implement your own extension method, as sketched below).
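A minimal sketch of such an extension. Note this only applies when the filtering happens in memory (LINQ to Objects); EF cannot translate a custom method into SQL, so it does not help inside a LINQ-to-Entities query:
public static class SortedArrayExtensions
{
    // Binary search over a pre-sorted array: O(log n) per lookup
    // instead of O(n) for List<int>.Contains.
    public static bool BinaryContains(this int[] sortedIds, int id)
    {
        return Array.BinarySearch(sortedIds, id) >= 0;
    }
}

// Usage, assuming the rows to filter are already loaded in memory:
int[] sortedIds = idList.OrderBy(x => x).ToArray();
var filtered = loadedRows.Where(a => sortedIds.BinaryContains(a.Id)).ToList();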
You can try this:
var query = someTable.Where(a => idList.Any(id => id == a.Id));
If you don't mind having a physical table you could use a semi-temporary table. The basic idea is:
Create a physical table with a "query id" column
Generate a unique ID (not random, but unique)
Insert data into the table tagging the records with the query ID
Pass the query id to the main query, using it to join to the link table
Once the query is complete, delete the temporary records
At worst if something goes wrong you will have orphaned records in the link table (which is why you use a unique query ID).
It's not the cleanest solution but it will be faster than using Contains if you have a lot of values to check against.
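A rough sketch of the pattern, assuming a link table QueryIds with columns QueryId and Value, mapped to a hypothetical QueryIdRecord entity (all names made up for illustration):
// 1. Tag this request's IDs with a unique query id
var queryId = Guid.NewGuid();
foreach (var id in idList)
    context.QueryIdRecords.Add(new QueryIdRecord { QueryId = queryId, Value = id });
context.SaveChanges();

// 2. Join the main query to the link table instead of using Contains
var results = (from a in context.SomeTable
               join q in context.QueryIdRecords on a.Id equals q.Value
               where q.QueryId == queryId
               select a).ToList();

// 3. Clean up this request's rows; leftovers are harmless orphans
//    because they are tagged with the unique query id
context.Database.ExecuteSqlCommand("DELETE FROM QueryIds WHERE QueryId = {0}", queryId);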
When Entity Framework starts being a performance bottleneck, generally it's time to write actual SQL.
So what you could do for example is build a table-valued function that takes a table-valued parameter (your list of IDs) as parameter. The function would just return the result of your JOIN.
The table-valued function feature requires EF5, so it might not be an option if you're really stuck with EF4.
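A rough sketch of the call, assuming a user-defined table type dbo.IdList (with an Id column) and a table-valued function dbo.GetItemsByIds already exist in the database; those names and the SomeEntity result type are made up for illustration:
// Requires: using System.Data; using System.Data.SqlClient;
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
foreach (var id in idList)
    table.Rows.Add(id);

// Pass the whole list of IDs as a single table-valued parameter
var ids = new SqlParameter("@ids", SqlDbType.Structured)
{
    TypeName = "dbo.IdList",
    Value = table
};

var results = context.Database
    .SqlQuery<SomeEntity>("SELECT * FROM dbo.GetItemsByIds(@ids)", ids)
    .ToList();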
The idea is to refactor your queries to get rid of idList.
For example, suppose you need to return the orders of male users aged 18-25 from France. If you filter the Users table by age, sex and country to get an idList of users, you end up with 700+ IDs. Instead, join the Orders table with Users and apply the filters to the Users table. That way you don't have two requests (one for the IDs and one for the orders), and it works much faster because the database can use indexes while joining the tables.
Makes sense?
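A minimal sketch of that refactoring (entity and property names are made up for illustration):
// Before: two round trips and a huge IN clause
// var idList = context.Users.Where(...).Select(u => u.Id).ToList();
// var orders = context.Orders.Where(o => idList.Contains(o.UserId)).ToList();

// After: one query with a server-side join
var orders = (from o in context.Orders
              join u in context.Users on o.UserId equals u.Id
              where u.Sex == "M"
                 && u.Age >= 18 && u.Age <= 25
                 && u.Country == "France"
              select o).ToList();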

Performance - Delete multiple MSSQL database entries with LINQ

I have an MSSQL database table with a few million entries.
Every new entry gets an ID one higher than the last entry's, which means lower ID numbers belong to older entries. Now I want to delete old entries from the database with the help of their IDs.
I delete every Entry whose ID is lower than maxID.
while (true)
{
    List<Entry> entries = entity.Entry.Where(z => z.id < maxID).Take(1000).ToList();
    foreach (var entry in entries)
    {
        entity.Entry.DeleteObject(entry);
    }
    entity.SaveChanges(); // persist the deletes so the next query returns new rows
    if (entries.Count < 1000)
    {
        break;
    }
}
I can't take all entries in one query because that would raise a System.OutOfMemoryException, so I only take 1000 entries and repeat until every entry is deleted.
My question is: what would be the best number of entries to .Take() for performance?
It's faster to drop and recreate the tables in the database.
You can execute commands directly against the database by calling your stored procedure using the ExecuteStoreCommand method.
Any commands automatically generated by the Entity Framework may be more complex than similar commands written explicitly by a database developer. If you need explicit control over the commands executed against your data source, consider defining a mapping to a table-valued function or stored procedure. -MSDN
From what I can see of your code (please correct me if I am wrong, or improve the answer), you are loading entities into memory, which is an overhead cost, only to delete them, and EF will create a separate DELETE query for each entity marked by DeleteObject. So in terms of performance it will be better to call a stored procedure or execute your query directly against the database.
ExecuteStoreCommand Method
Directly Execute commands
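For example, the whole loop could collapse into one set-based statement, assuming the table is named Entry with an id column (ObjectContext.ExecuteStoreCommand returns the number of rows affected):
// One server-side DELETE: no entities are loaded into memory
int rowsDeleted = entity.ExecuteStoreCommand(
    "DELETE FROM [Entry] WHERE id < {0}", maxID);
For millions of rows you may still want to batch the statement server-side (for example DELETE TOP (10000) ... in a loop) to keep the transaction log and lock escalation manageable.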
Try this...
entity.Entry.Where(z => z.id < maxID).ToList().ForEach(entity.Entry.DeleteObject);
entity.SaveChanges();

How to know how many persistent objects were deleted using Session.Delete(query);

We are refactoring a project from plain MySQL queries to NHibernate.
The MySQL connector has the ExecuteNonQuery function, which returns the number of rows affected. So
int RowsDeleted = ExecuteNonQuery("DELETE FROM `table` WHERE ...");
would show me how many rows were effectively deleted.
How can I achieve the same with NHibernate? So far I can see it is not possible with Session.Delete(query);.
My current workaround is to first load all of the objects that are about to be deleted and delete them one by one, incrementing a counter on each delete. But I assume that will cost performance.
If you don't mind that NHibernate will create delete statements for each row, and maybe additional statements for orphans and/or other relationships, you can use session.Delete.
For better performance I would recommend batch deletes (see the example below).
session.Delete
If you delete many objects with session.Delete, NHibernate makes sure integrity is preserved; it will load everything into the session if needed anyway. So there is no real reason to count your objects or to have a method that returns the number of objects deleted: you can simply run a query before the delete to determine the number of objects that will be affected.
The following statement will delete all entities of type Post by id.
The select statement queries the database only for the ids, so it is actually very performant:
var idList = session.Query<Post>().Select(p => p.Id).ToList();
session.Delete(string.Format("from Post where Id in ({0})", string.Join(",", idList)));
The number of objects deleted will be equal to the number of ids in the list.
This is actually the same (in terms of the queries NHibernate fires against your database) as if you ran Query<T>, looped over the result and deleted the objects one by one.
Batch delete
You can use session.CreateSQLQuery to run native SQL commands. It also allows you to have input and output parameters.
The following statement simply deletes everything from the table, as you would expect:
session.CreateSQLQuery(@"DELETE FROM MyTableName").ExecuteUpdate();
To retrieve the number of rows deleted, we use the normal T-SQL @@ROWCOUNT variable and output it via SELECT. We add an output to the created query via AddScalar, and UniqueResult simply returns the integer:
var rowsAffected = session.CreateSQLQuery(@"
    DELETE FROM MyTableName;
    SELECT @@ROWCOUNT as NumberOfRows")
    .AddScalar("NumberOfRows", NHibernateUtil.Int32)
    .UniqueResult();
To pass input variables, use .SetParameter(<name>, <value>):
var rowsAffected = session.CreateSQLQuery(@"
    DELETE FROM MyTableName WHERE ColumnName = :val;
    SELECT @@ROWCOUNT as NumberOfRows;")
    .AddScalar("NumberOfRows", NHibernateUtil.Int32)
    .SetParameter("val", 1)
    .UniqueResult();
I'm not so comfortable with MySQL; the example I wrote is for MSSQL. I think the MySQL equivalent of @@ROWCOUNT would be SELECT ROW_COUNT();?
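An untested sketch of the MySQL variant, assuming ROW_COUNT() returns the rows affected by the preceding statement and that the connection allows multiple statements per command:
var rowsAffected = session.CreateSQLQuery(@"
    DELETE FROM MyTableName WHERE ColumnName = :val;
    SELECT ROW_COUNT() as NumberOfRows;")
    .AddScalar("NumberOfRows", NHibernateUtil.Int32)
    .SetParameter("val", 1)
    .UniqueResult();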
