I'm developing an asp.net web forms 4.5 website,
and I've come to a problem where I have to perform an sql query
from sql server originally written by classic asp.
The original query goes like this..
select trait1, trait2, sum(qty) as qty from table2
right outer join table1 on (table1.col == table2.col)
where table2.idx is null
group by trait1, trait2
order by trait1, trait2
and as I'm writing a .net program, I'm trying to rewrite this query...
as something like this
var myHashSet = new HashSet<string>(table2.Select(c => c.col));
from item in table2
where !myHashSet.Contains(item.col)
group item.qty by new {item.trait1, item.trait2} into total
select new
{
trait1 = total.Key.trait1,
trait2 = total.Key.trait2,
qty = total.Sum()
} into anon
order by anon.qty descending
select anon
ps : ignore the order by part.. it's not important
for classic asp, it takes like 1.5 seconds, but for c# asp.net,
it takes roughly 8 seconds.
The queries are not exactly like that but its almost similar to what I wrote.
I cannot figure out why it's taking so long..
Can someone tell me why it's taking so long and how I should fix it?
you don't need the hashset, use a join instead, your query might look like
var inner = from two in table2
join one in table1
on two.col equals one.col
group two by new
{
two.trait1,
two.trait2
} into total
select new
{
total.Key.trait1,
total.Key.trait2,
qty = total.Sum(p => p.qty)
};
Edit
for completeness i'm adding also a left join variant
var left = from two in table2
from one in
(
from temp in table1
where temp.col == two.col
select temp
).DefaultIfEmpty()
group two by new
{
two.trait1,
two.trait2
} into total
select new
{
total.Key.trait1,
total.Key.trait2,
qty = total.Sum(p => p.qty)
};
Related
I'm trying to join three tables together, but I just can't get my LINQ statements to get the data I want to. My SQL query that I want to replicate is like this:
SELECT TOP 5 SUM(Grade) AS 'TotalGrade', COUNT(Restaurants.Name) AS 'NumberOfVisits', Restaurants.Name FROM Lunches
JOIN dbo.LunchRestaurant ON Lunches.LunchId = LunchRestaurant.LunchId
JOIN Restaurants ON Restaurants.RestaurantId = LunchRestaurant.RestaurantId
GROUP BY Restaurants.Name
The LINQ statement I currently have is:
var q = from l in lunchContext.Lunches
join lr in lunchContext.LunchRestaurant on l.LunchId equals lr.LunchId
join r in lunchContext.Restaurants on lr.RestaurantId equals r.RestaurantId
select new { l.Grade, r.RestaurantId};
But with this one, I can't get in my group by statement or the aggregate functions no matter how I try. Do you have any suggestions on what to do? There also doesn't seem to be a way to write raw SQL statements when dealing with several tables, at least not easily done in EF Core if I'm not mistaken. Otherwise that'd be an option too.
I think you're looking for linq grouping. From memory, it would be something like;
var q = from l in lunchContext.Lunches
join lr in lunchContext.LunchRestaurant on l.LunchId equals lr.LunchId
join r in lunchContext.Restaurants on lr.RestaurantId equals r.RestaurantId
group l by lr.RestaurantId into g
select new { Id = g.Key, Grade =g.Sum(x=>x.Grade)};
If you do need to go down the raw SQL route then you can levarage ADO.Net like this
using (var command = lunchContext.Database.GetDbConnection().CreateCommand())
{
command.CommandText = "{Your sql query here}";
context.Database.OpenConnection();
using (var result = command.ExecuteReader())
{
// do something with result
}
}
I have a query wherein I need to get the count from 2 different tables. Here is a very simple form of the query (my query has more joins and conditions but this is the part I am stuck on):
select (select count(1) from Table1) as One, (select count(1) from Table2) as Two
The following linq queries works but I would like to do the above with a single linq query sent to the SQL Server. This query and many other versions I have tried, result in 2 queries sent to the server:
var query1 = from m in this.Table1 select m;
var query2 = from sr in this.Table2 select sr;
var final = new { One = query1.Count(), Two = query2.Count() };
I also tried this and this also sends 2 queries:
var final = from dummy in new List<int> { 1 }
join one in query1 on 1 equals 1 into ones
join two in query2 on 1 equals 1 into twos
select new { One = ones.Count(), Two = twos.Count()};
You need to make it a single LINQ query that can be translated:
var final = (from m in this.Table1.DefaultIfEmpty()
select new {
One = (from m in this.Table1 select m).Count(),
Two = (from sr in this.Table2 select sr).Count()
}).First();
Note that putting the sub-queries into an IQueryable variable will cause three separate queries to be sent.
Alternatively, since Count() doesn't have a query syntax equivalent, this is a little more compact in lambda syntax:
var final = this.Table1.DefaultIfEmpty().Select(t => new {
One = this.Table1.Count(),
Two = this.Table2.Count()
}).First();
I tried the Internet and the SOF but couldn't locate a helpful resource. Perhaps I may not be using correct wording to search. If there are any previous questions I have missed due to this reason please let me know and I will take this question down.
I am dealing with a busy database so I am required to send less queries to the database.
If I access different columns of the same Linq query from different levels of the code then is Entity Framework smart enough to foresee the required columns and bring them all or does it call the db twice?
eg.
var query = from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) };
var records = query.Take(10);
// point x
var x = records.Select(a => a.Column1).ToArray();
var y = records.Select(a => a.Column2).ToArray();
Does EF generate query the database twice to faciliate x and y (send a query first to get Column1, and then send another to get Column2) or is it smart enough to know it needs both Columns to be materialised and bring them both at point x?
Added to clarify the intention of the question:
I understand I can simply add a greedy method to the end of query.Take(10) and get it done but I am trying to understand if the approach I try (and in my opinion, more elegant) does work of if not what makes EF to make two queries please.
Yes currently your code will generate 2 queries that will be executed to the database. Reason being is because you have 2 different sqls generated:
First is the top query, taking only 10 records and then only Column1
Second is the top query, taking only 10 records and then only Column2
The reason these are 2 queries is because you have a ToArray over different Select statements -> generating different sql. Most of linq queries are differed executed and will be executed only when you use something like ToArray()/ToList()/FirstOrDefault() and so on - those that actually give you the concrete data. In your original query you have 2 different ToArray on data that has not yet been retrieved - meaning 2 queries (once for the first field and then for the second).
The following code will result in a single query to the database
var records = (from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) })
.Take(10).ToList();
var x = records.Select(a => a.Column1).ToArray();
var y = records.Select(a => a.Column2).ToArray();
In my solution above I added a ToList() after filtering out only that data you need (Take(10)) and then at that point it will execute to the database. Then you have all the data in memory and you can do any other linq operation over it without it going again to the database.
Add to your code ToString() so you can check the generated sql at different points. Then you will understand when and what is being executed:
var query = from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) };
var generatedSql = query.ToString(); // Here you will see a query that brings all records
var records = query.Take(10);
generatedSql = query.ToString(); // Here you will see it taking only 10 records
// point x
var xQuery = records.Select(a => a.Column1);
generatedSql = xQuery.ToString(); // Here you will see only 1 column in query
// Still nothing has been executed to DB at this point
var x = xQuery.ToArray(); // And that is what will be executed here
// Now you are before second execution
var yQuery = records.Select(a => a.Column2);
generatedSql = yQuery.ToString(); // Here you will see only the second column in query
// Finally, second execution, now with the other column
var y = yQuery.ToArray();
When you are running linq statement on an entity in EF if only prepares the Select statement (thats why the type is IQueryable). The data is loaded lazily. When you try to use a value from that query then only the result gets evaluated using a enumerator.
So when you turn it to a collection (.toList() etc.) explicitly it tries to get data to populate the list and hence the sql command is fired.
It is designed so to enhance the performance. So if a particular property of an entity is to be used EF doesn't get the value for all the columns from that table
There are four tables:
Questions(questionId, question)
QuestionTags(questionTagId, questionId, tagId)
CodingKeys(codingKeyId, codingTypeId ..)
Codings(codingId, codingKeyId, coding ..)
I want to select all question tagIds and their codings (codingKeyId is foreign key of tagId) that are represented in Questions... So if I have 10 different codings in Codings table but only two of them are represented in Questions I want to select only these two.
I tried with join like this:
var query = from qt in context.QuestionTags
join c in context.Codings on qt.tagId equals c.codingKeyId
select new
{
tagId = qt.tagId,
coding = c.coding
};
But the above solution gave me double results. For example, if one tag is included in more than one question, I get the same tag twice (I tried distinct, but that didn't work).
I also tried using Any:
var query= context.QuestionTags
.Where(qt => qt.Questions.QuestionTags.Any(q => q.tagId == qt.tagId))
.Select(qt => new
{
codingKeyId = qt.questionId,
coding = context.Codings.FirstOrDefault(c => c.CodingKeys.codingKeyId == qt.tagId).coding
});
The same thing happened here, I got duplicate results, but Distinct didn't work (don't know why).
However, if I use this SQL statement:
SELECT distinct tagId, coding
FROM QuestionTags
LEFT OUTER JOIN Codings ON codingKeyId LIKE QuestionTags.tagId
WHERE Codings.languageId = 1
I get the right result, but I don't want to write and store a procedure for this. I really wan't to know if I can solve this with EF (linq), and I am also not sure if distinct is the right solution.
Thanks for your help.
You can use group by in order to get just the results you want.
var query = from qt in context.QuestionTags
join c in context.Codings on qt.tagId equals c.codingKeyId
group qt by new {tagId = qt.tagId,coding = c.coding } into element
select new
{
tagId = element.Key.tagId,
coding = element.Key.coding
};
Please mark it as answer if you find it useful
var result = from qt in context.QuestionTags
join c in context.Codings on qt.tagId equals c.codingKeyId
where c.languageId == 1
select new
{
codingKeyId = qt.tagId,
coding = c.coding
};
return result.Distinct()
Ok, like this it is working, but is this the only way to use it with distinct and join... I am not sure if this is the right solution (but it gives me the right result) ... maybe it can be optimized a bit...
Try
var query = from qt in context.Codings
join c in context.QuestionTags on qt.tagId equals c.codingKeyId
select new
{
tagId = qt.tagId,
coding = c.coding
};
I have the following LINQ query, that is returning the results that I expect, but it does not "feel" right.
Basically it is a left join. I need ALL records from the UserProfile table.
Then the LastWinnerDate is a single record from the winner table (possible multiple records) indicating the DateTime the last record was entered in that table for the user.
WinnerCount is the number of records for the user in the winner table (possible multiple records).
Video1 is basically a bool indicating there is, or is not a record for the user in the winner table matching on a third table Objective (should be 1 or 0 rows).
Quiz1 is same as Video 1 matching another record from Objective Table (should be 1 or 0 rows).
Video and Quiz is repeated 12 times because it is for a report to be displayed to a user listing all user records and indicate if they have met the objectives.
var objectiveIds = new List<int>();
objectiveIds.AddRange(GetObjectiveIds(objectiveName, false));
var q =
from up in MetaData.UserProfile
select new RankingDTO
{
UserId = up.UserID,
FirstName = up.FirstName,
LastName = up.LastName,
LastWinnerDate = (
from winner in MetaData.Winner
where objectiveIds.Contains(winner.ObjectiveID)
where winner.Active
where winner.UserID == up.UserID
orderby winner.CreatedOn descending
select winner.CreatedOn).First(),
WinnerCount = (
from winner in MetaData.Winner
where objectiveIds.Contains(winner.ObjectiveID)
where winner.Active
where winner.UserID == up.UserID
orderby winner.CreatedOn descending
select winner).Count(),
Video1 = (
from winner in MetaData.Winner
join o in MetaData.Objective on winner.ObjectiveID equals o.ObjectiveID
where o.ObjectiveNm == Constants.Promotions.SecVideo1
where winner.Active
where winner.UserID == up.UserID
select winner).Count(),
Quiz1 = (
from winner2 in MetaData.Winner
join o2 in MetaData.Objective on winner2.ObjectiveID equals o2.ObjectiveID
where o2.ObjectiveNm == Constants.Promotions.SecQuiz1
where winner2.Active
where winner2.UserID == up.UserID
select winner2).Count(),
};
You're repeating join winners table part several times. In order to avoid it you can break it into several consequent Selects. So instead of having one huge select, you can make two selects with lesser code. In your example I would first of all select winner2 variable before selecting other result properties:
var q1 =
from up in MetaData.UserProfile
select new {up,
winners = from winner in MetaData.Winner
where winner.Active
where winner.UserID == up.UserID
select winner};
var q = from upWinnerPair in q1
select new RankingDTO
{
UserId = upWinnerPair.up.UserID,
FirstName = upWinnerPair.up.FirstName,
LastName = upWinnerPair.up.LastName,
LastWinnerDate = /* Here you will have more simple and less repeatable code
using winners collection from "upWinnerPair.winners"*/
The query itself is pretty simple: just a main outer query and a series of subselects to retrieve actual column data. While it's not the most efficient means of querying the data you're after (joins and using windowing functions will likely get you better performance), it's the only real way to represent that query using either the query or expression syntax (windowing functions in SQL have no mapping in LINQ or the LINQ-supporting extension methods).
Note that you aren't doing any actual outer joins (left or right) in your code; you're creating subqueries to retrieve the column data. It might be worth looking at the actual SQL being generated by your query. You don't specify which ORM you're using (which would determine how to examine it client-side) or which database you're using (which would determine how to examine it server-side).
If you're using the ADO.NET Entity Framework, you can cast your query to an ObjectQuery and call ToTraceString().
If you're using SQL Server, you can use SQL Server Profiler (assuming you have access to it) to view the SQL being executed, or you can run a trace manually to do the same thing.
To perform an outer join in LINQ query syntax, do this:
Assuming we have two sources alpha and beta, each having a common Id property, you can select from alpha and perform a left join on beta in this way:
from a in alpha
join btemp in beta on a.Id equals btemp.Id into bleft
from b in bleft.DefaultIfEmpty()
select new { IdA = a.Id, IdB = b.Id }
Admittedly, the syntax is a little oblique. Nonetheless, it works and will be translated into something like this in SQL:
select
a.Id as IdA,
b.Id as Idb
from alpha a
left join beta b on a.Id = b.Id
It looks fine to me, though I could see why the multiple sub-queries could trigger inefficiency worries in the eyes of a coder.
Take a look at what SQL is produced though (I'm guessing you're running this against a database source from your saying "table" above), before you start worrying about that. The query providers can be pretty good at producing nice efficient SQL that in turn produces a good underlying database query, and if that's happening, then happy days (it will also give you another view on being sure of the correctness).