Alternative to LINQ RemoveRange in EF 6 - c#

I have two tables, Transactions and TransactionsStaging.
I am using a LINQ query to fetch all rows in TransactionsStaging that have a duplicate in Transactions and then removing them from TransactionsStaging. Ultimately, I am removing all entries in TransactionsStaging that have a duplicate in the Transactions table.
I have produced the following so far:
IEnumerable<WebApi.Models.TransactionStaging> result = (from ts in db.TransactionsStaging
                                                         join t in db.Transactions
                                                             on ts.Description equals t.Description
                                                         select ts).ToList();
db.TransactionsStaging.RemoveRange(result);
db.SaveChanges();
The above works, but when inspecting the actual SQL queries being sent to the DB, I noticed that the RemoveRange produces a SQL DELETE statement for each row it is removing.
Is there a way to accomplish the same but avoid the multiple delete statements?
I wanted to explore this possibility before switching to a raw SQL statement instead of LINQ and the ORM.

If you want to issue only a single database command, a stored procedure or a raw SQL statement is the way to go, since Entity Framework 6 has no built-in support for set-based (bulk) deletes.
Alternatively, you could use one of the various third-party bulk/batch extensions available for EF.
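If you do drop down to raw SQL, a minimal sketch (assuming the table and column names from the question and EF6's DbContext API) could look like this; ExecuteSqlCommand returns the number of rows affected:
// Hedged sketch: one set-based DELETE through EF6's raw SQL API.
// Assumes the DbContext instance 'db' and that Description identifies duplicates,
// as in the LINQ query above.
int rowsDeleted = db.Database.ExecuteSqlCommand(@"
    DELETE ts
    FROM dbo.TransactionsStaging AS ts
    INNER JOIN dbo.Transactions AS t
        ON t.Description = ts.Description;");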

LINQ Joins vs Stored Procedure Joins

I read this article on using LINQ to do joins. I was wondering how much benefit this would give me compared to writing a stored procedure that joins the tables. Would using joins with LINQ cause any kind of performance issues?
UPDATE:
So using this as an example:
var employeeInfo =
    from employee in employees
    join addInfo in additionalInfo on employee.ID equals addInfo.CategoryID into allInfo
    select new { Employee = employee, AdditionalInfo = allInfo };
Would this simple join benefit me as opposed to a stored procedure? I know that the size of the tables and the number of tables you want to join can make a big difference in when to use LINQ versus a stored procedure. What would be a good "rule of thumb" for the number and size of tables where LINQ joins are fine, and where performing LINQ joins becomes too much of a performance hit?
The performance of joins in general depends on whether the proper indexes exist on the joined fields. If your query produces a full table scan because the fields aren't properly indexed, your performance will be directly impacted. Look at the execution plan if you're concerned about stored procedure performance.
As for explicit joins in LINQ, you generally don't want to write them.
Here is a good article on LINQ joins; from the article:
One of the greatest benefits of LINQ to SQL and LINQ to Entities is navigation properties that allows queries across several tables, without the need to use explicit joins. Unfortunately LINQ queries are often written as a direct translation of a SQL query, without taking advantage of the richer features offered by LINQ to SQL and LINQ to Entities.
https://coding.abel.nu/2012/06/dont-use-linqs-join-navigate/
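To make the article's point concrete, here is a hedged illustration with made-up Order/Customer entities: the second query relies on a navigation property and lets LINQ to Entities generate the join for you.
// Hypothetical entities: Order exposes a Customer navigation property.

// Explicit join, written as a direct translation of SQL:
var withJoin =
    from o in db.Orders
    join c in db.Customers on o.CustomerId equals c.Id
    select new { c.Name, o.Total };

// Navigation property: same result, the provider generates the join.
var withNavigation =
    from o in db.Orders
    select new { o.Customer.Name, o.Total };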

Can I Insert the Results of a Select Statement Into Another Table Without a Roundtrip?

I have a web application that is written in MVC.Net using C# and LINQ-to-SQL (SQL Server 2008 R2).
I'd like to query the database for some values, and also insert those values into another table for later use. Obviously, I could do a normal select, then take those results and do a normal insert, but that will result in my application sending the values back to the SQL server, which is a waste as the server is where the values came from.
Is there any way I can get the select results in my application and insert them into another table without the information making a roundtrip from the SQL server to my application and back again?
It would be cool if this was in one query, but that's less important than avoiding the roundtrip.
Assume whatever basic schema you like, I'll be extrapolating your simple example to a much more complex query.
Can I Insert the Results of a Select Statement Into Another Table Without a Roundtrip?
From a "single-query" and/or "avoid the round-trip" perspective: Yes.
From a "doing that purely in Linq to SQL" perspective: Well...mostly ;-).
The three pieces required are:
The INSERT...SELECT construct:
By using this we get half of the goal in that we have selected data and inserted it. And this is the only way to keep the data entirely at the database server and avoid the round-trip. Unfortunately, this construct is not supported by Linq-to-SQL (or Entity Framework): Insert/Select with Linq-To-SQL
The T-SQL OUTPUT clause:
This allows for doing what is essentially the tee command in Unix shell scripting: save and display the incoming rows at the same time. The OUTPUT clause just takes the set of inserted rows and sends it back to the caller, providing the other half of the goal. Unfortunately, this is also not supported by Linq-to-SQL (or Entity Framework). This type of operation can also be achieved across multiple queries without OUTPUT, but nothing is really gained, since you then either need to a) create a temp table to dump the initial results into, use it to insert into the target table, and then select it back to the caller, or b) have some way of knowing which of the rows just inserted into the table are new so that they can be properly selected back to the caller.
The DataContext.ExecuteQuery<TResult> (String, Object[]) method:
This is needed due to the two required T-SQL pieces not being supported directly in Linq-to-SQL. And even if the clunky approach to avoiding the OUTPUT clause is done (assuming it could be done in pure Linq/Lambda expressions), there is still no way around the INSERT...SELECT construct that would not be a round-trip.
Hence, multiple queries that are all pure Linq/Lambda expressions equates to a round-trip.
The only way to truly avoid the round-trip should be something like:
var _MyStuff = db.ExecuteQuery<Stuffs>(@"
    INSERT INTO dbo.Table1 (Col1, Col2, Col3)
    OUTPUT INSERTED.*
    SELECT Col1, Col2, Col3
    FROM dbo.Table2 t2
    WHERE t2.Col4 = {0};",
    _SomeID);
And just in case it helps anyone (since I already spent the time looking it up :), the equivalent command for Entity Framework is: Database.SqlQuery<TElement> (String, Object[])
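A hedged sketch of what that Entity Framework call might look like, reusing the placeholder table and column names from the snippet above and assuming an EF DbContext instance named db:
// Hedged sketch only: same placeholder schema as above, DbContext assumed to be 'db'.
var myStuff = db.Database.SqlQuery<Stuffs>(@"
    INSERT INTO dbo.Table1 (Col1, Col2, Col3)
    OUTPUT INSERTED.*
    SELECT Col1, Col2, Col3
    FROM dbo.Table2 t2
    WHERE t2.Col4 = {0};",
    _SomeID).ToList();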
Try this query, adjusted to your requirements:
INSERT INTO IndentProcessDetails (DemandId, DemandMasterId, DemandQty)
SELECT DemandId, DemandMasterId, DemandQty FROM DemandDetails

Entity Framework - how can I optimize “Contains” statement?

In our current application we have some performance issues with some of our queries. Usually we have something like:
List<int> idList = some data here…;
var query = (from a in someTable where idList.Contains(a.Id) select a);
While this is acceptable for simple queries, it becomes a bottleneck when we have more items in idList (in some queries we have about 700 ids to check, for example).
Is there any way to use something other than Contains? We are thinking of using a temporary table to first insert the ids into and then execute a join instead of Contains, but it seems Entity Framework does not support such operations (creating temporary tables from code) :(
What else can we try?
I suggest using LINQPad; it offers a "Transform to SQL" option which allows you to see your query in SQL syntax.
There is a chance that this is already the optimal solution (if you're not into messy stuff).
You might try holding idList as a sorted array and replacing the Contains call with a binary search (you can implement your own extension).
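If you do go the sorted-array route, a hedged sketch follows; note that it only helps when the filtering runs in memory (LINQ to Objects), since Entity Framework cannot translate a binary search into SQL, so the rows have to be materialized first.
// Hedged sketch: only useful if the rows are already (or acceptably) in memory.
int[] sortedIds = idList.OrderBy(i => i).ToArray();

var matches = someTable
    .AsEnumerable()   // forces in-memory evaluation; pulls the rows from the database first
    .Where(a => Array.BinarySearch(sortedIds, a.Id) >= 0)
    .ToList();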
You can try this:
var query = someTable.Where(a => idList.Any(b => b == a.Id));
If you don't mind having a physical table you could use a semi-temporary table. The basic idea is:
Create a physical table with a "query id" column
Generate a unique ID (not random, but unique)
Insert data into the table tagging the records with the query ID
Pass the query id to the main query, using it to join to the link table
Once the query is complete, delete the temporary records
At worst if something goes wrong you will have orphaned records in the link table (which is why you use a unique query ID).
It's not the cleanest solution but it will be faster than using Contains if you have a lot of values to check against.
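A hedged sketch of that idea, with a hypothetical QueryLink entity (QueryToken, ItemId) acting as the semi-temporary link table and a DbContext named db; all names are illustrative only:
// Hypothetical link entity and context; names are assumptions, not a real API.
var queryToken = Guid.NewGuid();   // unique id for this query run

// 1. Tag the ids and insert them into the link table.
foreach (var id in idList)
    db.QueryLinks.Add(new QueryLink { QueryToken = queryToken, ItemId = id });
db.SaveChanges();

// 2. Join against the link table instead of using Contains.
var results = (from a in db.SomeTable
               join q in db.QueryLinks on a.Id equals q.ItemId
               where q.QueryToken == queryToken
               select a).ToList();

// 3. Clean up; any orphans left by a crash are identifiable by their token.
db.Database.ExecuteSqlCommand(
    "DELETE FROM QueryLinks WHERE QueryToken = {0}", queryToken);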
When Entity Framework starts being a performance bottleneck, generally it's time to write actual SQL.
So what you could do, for example, is build a table-valued function that takes a table-valued parameter (your list of IDs) as a parameter. The function would just return the result of your JOIN.
The table-valued function feature requires EF5, so it might not be an option if you're really stuck with EF4.
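For what it's worth, here is a hedged sketch of the idea with entirely hypothetical names: the T-SQL sets up a table type and a table-valued function, and the call goes through raw SQL with a structured parameter rather than a mapped function import, which sidesteps the EF5 mapping requirement.
// Hedged sketch; requires: using System.Data; using System.Data.SqlClient;
// T-SQL set-up, run once (hypothetical names):
//   CREATE TYPE dbo.IdList AS TABLE (Id INT PRIMARY KEY);
//   CREATE FUNCTION dbo.FilterByIds (@ids dbo.IdList READONLY)
//   RETURNS TABLE AS RETURN
//       SELECT t.* FROM dbo.SomeTable AS t JOIN @ids AS i ON i.Id = t.Id;

var table = new DataTable();
table.Columns.Add("Id", typeof(int));
foreach (var id in idList)
    table.Rows.Add(id);

var idsParam = new SqlParameter("@ids", SqlDbType.Structured)
{
    TypeName = "dbo.IdList",
    Value = table
};

var results = db.Database
    .SqlQuery<SomeTableRow>("SELECT * FROM dbo.FilterByIds(@ids)", idsParam)
    .ToList();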
The idea is to refactor your queries to get rid of idList.
For example, say you need to return the list of orders of male users aged 18-25 from France. If you filter the Users table by age, sex and country to get an idList of users, you end up with 700+ ids. Instead, join the Orders table with Users and apply the filters to the Users table. That way you don't have two requests (one for the ids and one for the orders), and it works much faster because the database can use indexes while joining the tables.
Makes sense?
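A hedged sketch of that refactoring, using the hypothetical Orders/Users schema from the example above:
// Hypothetical schema: Orders with UserId, Users with Sex, Age, Country.
var orders = (from o in db.Orders
              join u in db.Users on o.UserId equals u.Id
              where u.Sex == "M"
                 && u.Age >= 18 && u.Age <= 25
                 && u.Country == "France"
              select o).ToList();   // one SQL statement, no 700-element IN clause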

How to know how many persistent objects were deleted using Session.Delete(query);

We are refactoring a project from plain MySQL queries to the usage of NHibernate.
In the MySQL connector there is the ExecuteNonQuery function that returns the rows affected. So
int RowsDeleted = ExecuteNonQuery("DELETE FROM `table` WHERE ...");
would show me how many rows were actually deleted.
How can I achieve the same with NHibernate? As far as I can see, it is not possible with Session.Delete(query);.
My current workaround is to first load all of the objects that are about to be deleted and delete them one by one, incrementing a counter on each delete. But I assume that will cost performance.
If you don't mind that NHibernate will create delete statements for each row, and maybe additional statements for orphans and/or other relationships, you can use session.Delete.
For better performance I would recommend doing batch deletes (see the example below).
session.Delete
If you delete many objects with session.Delete, NHibernate makes sure that integrity is preserved and will load everything into the session if needed anyway. So there is no real reason to count your objects or to have a method that retrieves the number of deleted objects: you can simply run a query before the delete to determine the number of objects that will be affected...
The following statement will delete all entities of type Post by Id.
The select statement will query the database only for the Ids so it is actually very performant...
var idList = session.Query<Post>().Select(p => p.Id).ToList<int>();
session.Delete(string.Format("from Post where Id in ({0})", string.Join(",", idList.ToArray())));
The number of objects deleted will be equal to the number of Ids in the list...
This is actually the same (in terms of the queries NHibernate will fire against your database) as if you were to Query<T>, loop over the result, and delete the objects one by one...
Batch delete
You can use session.CreateSqlQuery to run native SQL commands. It also allows you to have input and output parameters.
The following statement would simply delete everything from the table as you would expect
session.CreateSQLQuery("DELETE FROM MyTableName").ExecuteUpdate(); // ExecuteUpdate actually runs the statement
To retrieve the number of rows deleted, we'll use the normal T-SQL @@ROWCOUNT variable and output it via a SELECT. To read the selected row count, we add a scalar to the created query via AddScalar, and UniqueResult simply returns the integer:
var rowsAffected = session.CreateSQLQuery(@"
    DELETE FROM MyTableName;
    SELECT @@ROWCOUNT AS NumberOfRows")
    .AddScalar("NumberOfRows", NHibernateUtil.Int32)
    .UniqueResult();
To pass input variables you can use .SetParameter(<name>, <value>):
var rowsAffected = session.CreateSQLQuery(@"
    DELETE FROM MyTableName WHERE ColumnName = :val;
    SELECT @@ROWCOUNT AS NumberOfRows;")
    .AddScalar("NumberOfRows", NHibernateUtil.Int32)
    .SetParameter("val", 1)
    .UniqueResult();
I'm not so comfortable with MySQL; the example I wrote is for MSSQL. I think the MySQL equivalent of @@ROWCOUNT would be SELECT ROW_COUNT();?
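In case it helps, here is a hedged, untested MySQL variant of the parameterized example above, assuming the connection allows multiple statements in one command and that ROW_COUNT() can be read as an Int32:
// Hedged, untested MySQL variant: ROW_COUNT() instead of @@ROWCOUNT.
var rowsAffected = session.CreateSQLQuery(@"
    DELETE FROM MyTableName WHERE ColumnName = :val;
    SELECT ROW_COUNT() AS NumberOfRows;")
    .AddScalar("NumberOfRows", NHibernateUtil.Int32)
    .SetParameter("val", 1)
    .UniqueResult();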

Linq/Entity Framework, performing xpath query on column

We have a database that contains XML fields. At the moment we perform queries on this database that filter on values in the XML. In the near future we would like to migrate to Entity Framework or NHibernate as our ORM. Is this possible? These are two of the queries we run (in SQL):
SELECT
yy.[Description] as Name,
convert(xml, yy.Xml).value('(//Division)[1]', 'varchar(255)') as Division,
convert(xml, yy.Xml).value('(//Season)[1]', 'varchar(255)') as Season
into #Statistics .....
And
SELECT [dbo].[yy].[Id]
FROM [dbo].[yy]
WHERE [dbo].[yy].[ApplicationId] = 1
AND (((dbo.[yy].Xml.exist('(//qq[Season="Non seasonal"])') = 1)))
Is there any way to do this?
In EF there are a variety of ways to pass SQL directly through to the server, e.g. ObjectContext.ExecuteStoreQuery. You could also write a stored procedure which does the XPath query and map that as usual. There is no native support for XPath in LINQ or Entity SQL, as far as I know.
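For example, a hedged sketch of passing the second query from the question straight through, assuming an ObjectContext instance named context (the XQuery quoting may need adjusting for your schema):
// Hedged sketch: raw pass-through of the question's second query via ObjectContext.
var ids = context.ExecuteStoreQuery<int>(@"
    SELECT [dbo].[yy].[Id]
    FROM [dbo].[yy]
    WHERE [dbo].[yy].[ApplicationId] = 1
      AND dbo.[yy].Xml.exist('(//qq[Season=""Non seasonal""])') = 1")
    .ToList();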
