Linq correlated subquery to same table on multiple columns - c#

I've looked at several other questions related to correlated subqueries but it's still not clear to me how to accomplish what I need. I'm using Entity Framework and C#, and have a table called STEWARDSHIP with the following columns:
STEWARDSHIP_ID (the primary key)
SITE_ID
VISIT_DATE
VISIT_TYPE_ID
I need to identify cases where the same combination of SITE_ID, VISIT_DATE, VISIT_TYPE_ID exists more than once because it could represent a duplicate entry made by end users in error, and then I need to report on the details of these entries. In SQL I would do this by joining to the temporary result of a GROUP BY/HAVING like so:
SELECT * FROM stewardship AS s2,
(SELECT site_id, visit_type_id, CAST(visit_date AS DATE) AS visit_date
FROM stewardship
GROUP BY site_id, visit_type_id, CAST(visit_date AS DATE)
HAVING COUNT(*) > 1) AS s
WHERE s2.site_id = s.site_id
AND s2.visit_type_id = s.visit_type_id
AND CAST(s2.visit_date AS DATE) = s.visit_date
What's the best way to accomplish this in Linq?

Since you're open to a different approach that should be more performant, here is the new SQL to get what I think you're after.
select distinct s1.*
from stewardship s1
inner join stewardship s2 on
s1.stewardship_id <> s2.stewardship_id and
s1.site_id = s2.site_id and
s1.visit_type_id = s2.visit_type_id and
cast(s1.visit_date as date) = cast(s2.visit_date as date)
order by s1.site_id, s1.visit_type_id
Now, to translate that to LINQ, you can use the following statement.
var duplicates = (
from s in Stewardships
join s2 in Stewardships
// Note: EF6 does not translate .Date; there, DbFunctions.TruncateTime(...) is the usual substitute
on new { s.Site_id, s.Visit_type_id, s.Visit_date.Date } equals new { s2.Site_id, s2.Visit_type_id, s2.Visit_date.Date }
where s.Stewardship_id != s2.Stewardship_id
select s)
.Distinct()
.OrderBy(s => s.Site_id)
.ThenBy(s => s.Visit_type_id);
Note that you cannot use anything other than an equijoin for expression joins, so I had to put the non-equijoin (ensuring our matches aren't on the same record via PK) in the where expression. You could also accomplish this with lambdas via the Except() extension method.
The order by is there for readability of the results and to match the SQL statement above.
I hope this helps!
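For comparison, the same check can be phrased as a correlated subquery in method (lambda) syntax with Any(). The following is a minimal in-memory sketch, not the answer author's code: the Stewardship record and the DuplicateFinder class are hypothetical stand-ins for the entity, and whether a given EF provider translates this shape (in particular .Date) is an assumption to verify against the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row of the STEWARDSHIP table.
public record Stewardship(int StewardshipId, int SiteId, int VisitTypeId, DateTime VisitDate);

public static class DuplicateFinder
{
    // Keeps every row for which another row (different primary key) has the
    // same site, visit type, and calendar date.
    public static List<Stewardship> FindDuplicates(IEnumerable<Stewardship> rows)
    {
        var all = rows.ToList();
        return all
            .Where(s => all.Any(s2 =>
                s2.StewardshipId != s.StewardshipId &&
                s2.SiteId == s.SiteId &&
                s2.VisitTypeId == s.VisitTypeId &&
                s2.VisitDate.Date == s.VisitDate.Date))
            .OrderBy(s => s.SiteId)
            .ThenBy(s => s.VisitTypeId)
            .ToList();
    }

    public static void Main()
    {
        var rows = new[]
        {
            new Stewardship(1, 10, 1, new DateTime(2014, 8, 1, 9, 0, 0)),
            new Stewardship(2, 10, 1, new DateTime(2014, 8, 1, 15, 30, 0)), // same day as row 1
            new Stewardship(3, 10, 2, new DateTime(2014, 8, 1, 9, 0, 0)),
            new Stewardship(4, 11, 1, new DateTime(2014, 8, 2, 9, 0, 0)),
        };
        foreach (var s in FindDuplicates(rows))
            Console.WriteLine($"{s.StewardshipId}: site {s.SiteId}, type {s.VisitTypeId}, {s.VisitDate:yyyy-MM-dd}");
    }
}
```

Because each row is tested individually, no Distinct() is needed here: a row appears at most once no matter how many partners it has.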

It would be fairly similar to what you've already got.
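from s in context.stewardships
// DbFunctions.TruncateTime (EF6) drops the time part, like CAST(... AS DATE)
group s by new { s.site_id, s.visit_type_id, visit_date = DbFunctions.TruncateTime(s.visit_date) } into g
where g.Count() > 1
select g;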
This would give you groups of stewardships with matching values. You could "flatten" those results with a SelectMany afterward, but you might find them more useful to work with in groups.
Note that the grouping key needs the equivalent of the cast to date: DbFunctions.TruncateTime in EF6 (EntityFunctions.TruncateTime in earlier versions), or simply the .Date property in EF Core.
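The group-then-flatten idea can be tried out in memory. This is a small sketch under assumptions: the Visit record and GroupedDuplicates class are hypothetical names, and plain LINQ to Objects is used so that .Date works without any provider-specific function.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the STEWARDSHIP entity.
public record Visit(int Id, int SiteId, int VisitTypeId, DateTime VisitDate);

public static class GroupedDuplicates
{
    // Groups on (site, visit type, calendar date), keeps groups with more
    // than one member, then flattens the groups back into individual rows.
    public static List<Visit> Flatten(IEnumerable<Visit> visits) =>
        visits
            .GroupBy(v => new { v.SiteId, v.VisitTypeId, Day = v.VisitDate.Date })
            .Where(g => g.Count() > 1)
            .SelectMany(g => g)
            .ToList();

    public static void Main()
    {
        var visits = new[]
        {
            new Visit(1, 10, 1, new DateTime(2014, 8, 1, 9, 0, 0)),
            new Visit(2, 10, 1, new DateTime(2014, 8, 1, 17, 0, 0)), // duplicate of 1 (same day)
            new Visit(3, 10, 1, new DateTime(2014, 8, 2, 9, 0, 0)),  // different day
        };
        foreach (var v in Flatten(visits))
            Console.WriteLine(v);
    }
}
```

Dropping the SelectMany call leaves the groups intact, which is handy when the report should show each set of suspected duplicates together.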

Related

string.Join in Entity Framework, LINQ to SQL. Without Client-side evaluation

If you have a table, similar to here:
DataTypeID, DataValue
1,"Value1"
1,"Value2"
2,"Value3"
3,"Value4"
and want output like this:
DataTypeID,DataValues
1,"Value1,Value2"
2,"Value3"
3,"Value4"
Most answers suggest calling ToList() or AsEnumerable() and then string.Join(", ", DataValues) on the client side. This might work if the data is not huge, but it defeats the purpose of using EF. How can I do this without loading all the data into memory?
UPDATE: As of EF7 preview 7, you can simply use string.Join as usual, for example:
_context.MyTable
.GroupBy(keySelector => keySelector.MyKey, elemSelector => elemSelector.StringProp)
.Select(elem => string.Join(',', elem))
//.FirstOrDefaultAsync(cancellationToken), if (keyselector => 1) i.e. only 1 group so you get all rows
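To make the shape of that query concrete, here is an in-memory sketch over the sample rows from the question. It runs in LINQ to Objects, so it only illustrates what the GroupBy plus string.Join projection returns; the JoinPerGroup class and the tuple layout are hypothetical, and server-side translation of the same shape depends on the EF Core version as described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinPerGroup
{
    // Groups the values by DataTypeID and joins each group's values with a comma.
    public static Dictionary<int, string> JoinValues(
        IEnumerable<(int DataTypeID, string DataValue)> rows) =>
        rows.GroupBy(r => r.DataTypeID, r => r.DataValue)
            .ToDictionary(g => g.Key, g => string.Join(",", g));

    public static void Main()
    {
        var rows = new[]
        {
            (1, "Value1"), (1, "Value2"), (2, "Value3"), (3, "Value4"),
        };
        foreach (var pair in JoinValues(rows))
            Console.WriteLine($"{pair.Key},\"{pair.Value}\"");
    }
}
```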
Old answer
Well, as per this issue, string.Join() is yet to be implemented (as of now), and IEnumerable.Aggregate will not translate either.
In the meanwhile, you can create a view and write your SQL there.
For example, to group by id and string.Join(", ", Names);
CREATE VIEW V_Name AS
SELECT t1.ID,
Names = STUFF
(
(
SELECT DISTINCT ', ' + CAST(Child.Name AS VARCHAR(MAX))
FROM Child, MainTable
WHERE MainTable.ID = t1.ID -- this line is important: it correlates with the outer row
AND Child.ID = MainTable.ID
FOR XML PATH('')
), 1, 2, '' -- strip the leading ', ' (2 characters)
)
FROM MainTable t1
GROUP BY t1.ID
OR
CREATE VIEW V_Name AS
SELECT MainTable.ID, STRING_AGG(ChildTable.Name, ', ') AS Names
FROM MainTable
LEFT JOIN ChildTable ON MainTable.ID = ChildTable.ID
GROUP BY MainTable.ID
Now, in your C# you can simply join this with your ID, just like you normally would with an IQueryable:
from data in _dbcontext.sometable
join groupedAndJoinedNames in _dbcontext.viewname
on data.ID equals groupedAndJoinedNames.ID
select new
{
Names = groupedAndJoinedNames.Names
}

linq expressions from simple sql, seems there is limitation in linq

I have the following SQL query:
select * from one a
inner join one b
on
(
a.weekday=b.weekday
and a.starttime =b.starttime
and a.sl>b.sl
)
where a.weekday=b.weekday and a.starttime=b.starttime and a.endtime=b.endtime
I want to convert it to a LINQ statement, in both lambda-expression and SQL-like (query) syntax. I tried, but it turned out to be difficult. I also tried a tool like sqltolinq, but it does not seem to work.
The problem with the SQL-like syntax is that my query's join has multiple conditions, including both equality and greater-than operators.
Any help will be appreciated.
LINQ only supports equijoins, but you could do an equijoin on the weekday, starttime, and endtime parts and then use a where clause for the rest.
// Names changed to be more idiomatic where feasible. We have no
// idea what "sl" means.
var query = from a in db.TableA
join b in db.TableB
on new { a.WeekDay, a.StartTime, a.EndTime }
equals new { b.WeekDay, b.StartTime, b.EndTime }
where a.Sl > b.Sl
select ...;
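Since the question asks for both syntaxes, here is a runnable in-memory sketch of the same equijoin-plus-where pattern in query syntax and in method (lambda) syntax. The Slot record, its property names, and the SlotQueries class are hypothetical; the self-join mirrors the original SQL, which joins the table to itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; "Sl" is kept as-is because its meaning is unknown.
public record Slot(int Sl, string WeekDay, TimeSpan StartTime, TimeSpan EndTime);

public static class SlotQueries
{
    // Query syntax: equijoin on the composite key, non-equi condition in where.
    public static List<(int A, int B)> QuerySyntax(IReadOnlyList<Slot> slots) =>
        (from a in slots
         join b in slots
             on new { a.WeekDay, a.StartTime, a.EndTime }
             equals new { b.WeekDay, b.StartTime, b.EndTime }
         where a.Sl > b.Sl
         select (A: a.Sl, B: b.Sl)).ToList();

    // Method syntax: the same join expressed with Join(...) and Where(...).
    public static List<(int A, int B)> MethodSyntax(IReadOnlyList<Slot> slots) =>
        slots.Join(slots,
                   a => new { a.WeekDay, a.StartTime, a.EndTime },
                   b => new { b.WeekDay, b.StartTime, b.EndTime },
                   (a, b) => new { a, b })
             .Where(p => p.a.Sl > p.b.Sl)
             .Select(p => (A: p.a.Sl, B: p.b.Sl))
             .ToList();

    public static void Main()
    {
        var nine = TimeSpan.FromHours(9);
        var ten = TimeSpan.FromHours(10);
        var slots = new[]
        {
            new Slot(1, "Mon", nine, ten),
            new Slot(2, "Mon", nine, ten),
            new Slot(3, "Tue", nine, ten),
        };
        foreach (var (a, b) in QuerySyntax(slots))
            Console.WriteLine($"{a} > {b}");
    }
}
```

Both key selectors must build anonymous types with the same property names, types, and order; otherwise the join keys do not compare as equal (and the query-syntax form does not compile).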

using nHibernate QueryOver to join a subset

I am using nHibernate for our database access. I need to do a complicated query to find all member journal entries after a certain date with certain value, PreviousId, set for each member. I can easily write the SQL for it:
SELECT J.MemberId, J.PreviousId
FROM tblMemMemberStatusJournal J
INNER JOIN (
SELECT MemberId,
MIN(EffectiveDate) AS EffectiveDate
FROM tblMemMemberStatusJournal
WHERE EffectiveDate > @StartOfMonth
AND (PreviousId is NOT null)
GROUP BY MemberId
) AS X ON (X.EffectiveDate = J.EffectiveDate AND X.MemberId = J.MemberId)
However I am having a lot of trouble trying to get nHibernate to generate this information. There is not a lot of (any) documentation for how to use QueryOver.
I have been seeing information in other places, but none of it is very clear and very little has an actual explanation as to why things are done in certain ways. The answer for Selecting on Sub Queries in NHibernate with Critieria API did not give an adequate example as to what it is doing, so I haven't been able to replicate it.
I've gotten the inner part of the query created with this:
IList<object[]> result = session.QueryOver<MemberStatusJournal>()
.SelectList(list => list
.SelectGroup(a => a.Member.ID)
.SelectMin(a => a.EffectiveDate))
.Where(j => (j.EffectiveDate > firstOfMonth) && (j.PreviousId != null))
.List<object[]>();
Which, according to the profiler, makes this SQL:
SELECT this_.MemberId as y0_,
min(this_.EffectiveDate) as y1_
FROM tblMemMemberStatusJournal this_
WHERE (this_.EffectiveDate > '2014-08-01T00:00:00' /* @p0 */
and not (this_.PreviousLocalId is null))
GROUP BY this_.MemberId
But I am not finding a good example of how to actually join this subset with a parent query. Does anyone have any suggestions?
You aren't actually joining on a subset, you're filtering on a subset. Knowing this, you have the option of filtering via other means, in this case, a correlated subquery.
The solution below first creates a detached query to act as the inner subquery. We can correlate properties of the inner query with properties of the outer query through the use of an alias.
MemberStatusJournal memberStatusJournalAlias = null; // This will represent the
// object of the outer query
var subQuery = QueryOver.Of<MemberStatusJournal>()
.Select(Projections.GroupProperty(Projections.Property<MemberStatusJournal>(m => m.Member.ID)))
.Where(j => (j.EffectiveDate > firstOfMonth) && (j.PreviousId != null))
.Where(Restrictions.EqProperty(
Projections.Min<MemberStatusJournal>(j => j.EffectiveDate),
Projections.Property(() => memberStatusJournalAlias.EffectiveDate)
)
)
.Where(Restrictions.EqProperty(
Projections.GroupProperty(Projections.Property<MemberStatusJournal>(m => m.Member.ID)),
Projections.Property(() => memberStatusJournalAlias.Member.ID)
));
var results = session.QueryOver<MemberStatusJournal>(() => memberStatusJournalAlias)
.WithSubquery
.WhereExists(subQuery)
.List();
This would produce an SQL query like the following:
SELECT blah
FROM tblMemMemberStatusJournal J
WHERE EXISTS (
SELECT J2.MemberId
FROM tblMemMemberStatusJournal J2
WHERE J2.EffectiveDate > @StartOfMonth
AND (J2.PreviousId is NOT null)
GROUP BY J2.MemberId
HAVING MIN(J2.EffectiveDate) = J.EffectiveDate
AND J2.MemberId = J.MemberId
)
This looks less efficient than the inner join query you opened the question with. But my experience is that the SQL Server query optimizer is clever enough to convert this into an inner join. If you want to confirm this, you can use SQL Server Management Studio to generate and compare the execution plans of both queries.
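To check results from either form, it can help to have the intended logic written down independently of NHibernate. The sketch below restates the original inner-join SQL over in-memory objects with plain LINQ; it is not QueryOver code, and the JournalEntry record and JournalLogic class are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a tblMemMemberStatusJournal row.
public record JournalEntry(int MemberId, int? PreviousId, DateTime EffectiveDate);

public static class JournalLogic
{
    // For each member, finds the earliest qualifying EffectiveDate after
    // startOfMonth, then returns the journal rows matching that member and date.
    public static List<JournalEntry> FirstEntries(
        IEnumerable<JournalEntry> journal, DateTime startOfMonth)
    {
        var all = journal.ToList();
        var earliest = all
            .Where(j => j.EffectiveDate > startOfMonth && j.PreviousId != null)
            .GroupBy(j => j.MemberId)
            .Select(g => new { MemberId = g.Key, EffectiveDate = g.Min(j => j.EffectiveDate) });

        return (from j in all
                join x in earliest
                    on new { j.MemberId, j.EffectiveDate }
                    equals new { x.MemberId, x.EffectiveDate }
                select j).ToList();
    }

    public static void Main()
    {
        var journal = new[]
        {
            new JournalEntry(1, 3, new DateTime(2014, 7, 20)),
            new JournalEntry(1, 7, new DateTime(2014, 8, 5)),
            new JournalEntry(1, 8, new DateTime(2014, 8, 10)),
            new JournalEntry(2, null, new DateTime(2014, 8, 3)),
            new JournalEntry(2, 4, new DateTime(2014, 8, 8)),
        };
        foreach (var j in FirstEntries(journal, new DateTime(2014, 8, 1)))
            Console.WriteLine($"{j.MemberId}: {j.PreviousId}");
    }
}
```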

Writing a subquery using LINQ in C#

I would like to run a query against a DataTable that requires a subquery and returns its result as a DataTable. I am having trouble finding an appropriate example.
This is the subquery in SQL that I would like to create:
SELECT *
FROM SectionDataTable
WHERE SectionDataTable.CourseID = (SELECT SectionDataTable.CourseID
FROM SectionDataTable
WHERE SectionDataTable.SectionID = iSectionID)
I have the SectionID, iSectionID, and I would like to return all of the records in the Section table that have the same CourseID as the section identified by iSectionID.
I can do this using 2 separate queries as shown below, but I think a subquery would be better.
string tstrFilter = createEqualFilterExpression("SectionID", strCriteria);
tdtFiltered = TableInfo.Select(tstrFilter).CopyToDataTable();
iSelectedCourseID = tdtFiltered.AsEnumerable().Select(id => id.Field<int>("CourseID")).FirstOrDefault();
tdtFiltered.Clear();
tstrFilter = createEqualFilterExpression("CourseID", iSelectedCourseID.ToString());
tdtFiltered = TableInfo.Select(tstrFilter).CopyToDataTable();
Although it doesn't answer your question directly, what you are trying to do is much better suited for an inner join:
SELECT *
FROM SectionDataTable S1
INNER JOIN SectionDataTable S2 ON S1.CourseID = S2.CourseID
WHERE S2.SectionID = iSectionID
This could then be modeled very similarly using LINQ:
var query = from s1 in SectionDataTable
join s2 in SectionDataTable
on s1.CourseID equals s2.CourseID
where s2.SectionID == iSectionID
select s1;
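Against an actual System.Data.DataTable, the query needs AsEnumerable() and Field<T>() to reach the rows and their columns. The following is a sketch under assumptions: both columns are taken to be int, and the SectionQuery class and its method name are hypothetical.

```csharp
using System;
using System.Data;
using System.Linq;

public static class SectionQuery
{
    // Returns all rows sharing a CourseID with the section identified by sectionId.
    public static DataTable SameCourseSections(DataTable sections, int sectionId)
    {
        var rows = (from s1 in sections.AsEnumerable()
                    join s2 in sections.AsEnumerable()
                        on s1.Field<int>("CourseID") equals s2.Field<int>("CourseID")
                    where s2.Field<int>("SectionID") == sectionId
                    select s1).ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an
        // empty table with the same schema.
        return rows.Count > 0 ? rows.CopyToDataTable() : sections.Clone();
    }

    public static void Main()
    {
        var table = new DataTable("Section");
        table.Columns.Add("SectionID", typeof(int));
        table.Columns.Add("CourseID", typeof(int));
        table.Rows.Add(1, 100);
        table.Rows.Add(2, 100);
        table.Rows.Add(3, 200);

        var result = SameCourseSections(table, 1);
        foreach (DataRow row in result.Rows)
            Console.WriteLine($"Section {row["SectionID"]} (course {row["CourseID"]})");
    }
}
```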
When working in LINQ you have to think about things a bit differently. You can follow Miky's suggestion, but personally I would prefer to use navigation properties.
From your example I understand that you have at least 2 tables,
Course Master
Section Master
One Section must contain a Course reference
Which means
One Course can be in multiple Sections
Now if I see these tables as entities in my model, I would see navigation properties as,
Course.Sections //<- Sections is actually a collection
Section.Course //<- Course is an object
So the same query can be written as,
var lstSections = context.Sections.Where(s => s.Course.Sections.Any(c => c.SectionID == iSectionID)).ToList();
I think your main goal is to extract all the Sections that belong to the same Course as the given Section.

Is this LINQ Query "correct"?

I have the following LINQ query, that is returning the results that I expect, but it does not "feel" right.
Basically it is a left join. I need ALL records from the UserProfile table.
Then the LastWinnerDate is a single record from the winner table (possible multiple records) indicating the DateTime the last record was entered in that table for the user.
WinnerCount is the number of records for the user in the winner table (possible multiple records).
Video1 is basically a bool indicating there is, or is not a record for the user in the winner table matching on a third table Objective (should be 1 or 0 rows).
Quiz1 is the same as Video1, matching another record from the Objective table (should be 1 or 0 rows).
Video and Quiz is repeated 12 times because it is for a report to be displayed to a user listing all user records and indicate if they have met the objectives.
var objectiveIds = new List<int>();
objectiveIds.AddRange(GetObjectiveIds(objectiveName, false));
var q =
from up in MetaData.UserProfile
select new RankingDTO
{
UserId = up.UserID,
FirstName = up.FirstName,
LastName = up.LastName,
LastWinnerDate = (
from winner in MetaData.Winner
where objectiveIds.Contains(winner.ObjectiveID)
where winner.Active
where winner.UserID == up.UserID
orderby winner.CreatedOn descending
select winner.CreatedOn).First(),
WinnerCount = (
from winner in MetaData.Winner
where objectiveIds.Contains(winner.ObjectiveID)
where winner.Active
where winner.UserID == up.UserID
orderby winner.CreatedOn descending
select winner).Count(),
Video1 = (
from winner in MetaData.Winner
join o in MetaData.Objective on winner.ObjectiveID equals o.ObjectiveID
where o.ObjectiveNm == Constants.Promotions.SecVideo1
where winner.Active
where winner.UserID == up.UserID
select winner).Count(),
Quiz1 = (
from winner2 in MetaData.Winner
join o2 in MetaData.Objective on winner2.ObjectiveID equals o2.ObjectiveID
where o2.ObjectiveNm == Constants.Promotions.SecQuiz1
where winner2.Active
where winner2.UserID == up.UserID
select winner2).Count(),
};
You're repeating the join against the winners table several times. To avoid this, you can break the query into several consecutive selects: instead of one huge select, you write two selects with less code. In your example I would first select the winners collection before selecting the other result properties:
var q1 =
from up in MetaData.UserProfile
select new {up,
winners = from winner in MetaData.Winner
where winner.Active
where winner.UserID == up.UserID
select winner};
var q = from upWinnerPair in q1
select new RankingDTO
{
UserId = upWinnerPair.up.UserID,
FirstName = upWinnerPair.up.FirstName,
LastName = upWinnerPair.up.LastName,
LastWinnerDate = /* Here you will have more simple and less repeatable code
using winners collection from "upWinnerPair.winners"*/
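The snippet above stops before the remaining properties are filled in. One possible way to continue the idea, shown here as an in-memory sketch rather than the answer author's code, is to compute each value from the pre-filtered winners collection. The record types, the RankingBuilder class, and the nullable LastWinnerDate are assumptions; whether a particular ORM translates this shape to SQL would need to be checked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-ins for the entities in the question.
public record UserProfile(int UserID, string FirstName, string LastName);
public record Winner(int UserID, int ObjectiveID, bool Active, DateTime CreatedOn);
public record Ranking(int UserId, DateTime? LastWinnerDate, int WinnerCount);

public static class RankingBuilder
{
    public static List<Ranking> Build(
        IEnumerable<UserProfile> users,
        IEnumerable<Winner> winners,
        ICollection<int> objectiveIds)
    {
        var winnerList = winners.ToList();

        // Step 1: pair each user with their active winner rows (filter written once).
        var q1 = from up in users
                 select new
                 {
                     up,
                     winners = winnerList.Where(w => w.Active && w.UserID == up.UserID)
                 };

        // Step 2: derive the report columns from the paired collection.
        return (from p in q1
                let matching = p.winners
                    .Where(w => objectiveIds.Contains(w.ObjectiveID))
                    .ToList()
                select new Ranking(
                    p.up.UserID,
                    // Nullable, so users without winner rows do not throw.
                    matching.Count > 0 ? matching.Max(w => w.CreatedOn) : (DateTime?)null,
                    matching.Count)).ToList();
    }

    public static void Main()
    {
        var users = new[] { new UserProfile(1, "Ann", "Lee"), new UserProfile(2, "Bob", "Ray") };
        var winners = new[]
        {
            new Winner(1, 5, true, new DateTime(2014, 8, 1)),
            new Winner(1, 5, true, new DateTime(2014, 8, 3)),
        };
        foreach (var r in Build(users, winners, new List<int> { 5 }))
            Console.WriteLine(r);
    }
}
```

Note that in LINQ to Objects the original First() call would throw for a user with no winner rows, which is why this sketch uses a nullable date instead.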
The query itself is pretty simple: just a main outer query and a series of subselects to retrieve actual column data. While it's not the most efficient means of querying the data you're after (joins and using windowing functions will likely get you better performance), it's the only real way to represent that query using either the query or expression syntax (windowing functions in SQL have no mapping in LINQ or the LINQ-supporting extension methods).
Note that you aren't doing any actual outer joins (left or right) in your code; you're creating subqueries to retrieve the column data. It might be worth looking at the actual SQL being generated by your query. You don't specify which ORM you're using (which would determine how to examine it client-side) or which database you're using (which would determine how to examine it server-side).
If you're using the ADO.NET Entity Framework, you can cast your query to an ObjectQuery and call ToTraceString().
If you're using SQL Server, you can use SQL Server Profiler (assuming you have access to it) to view the SQL being executed, or you can run a trace manually to do the same thing.
To perform an outer join in LINQ query syntax, do this:
Assuming we have two sources alpha and beta, each having a common Id property, you can select from alpha and perform a left join on beta in this way:
from a in alpha
join btemp in beta on a.Id equals btemp.Id into bleft
from b in bleft.DefaultIfEmpty()
select new { IdA = a.Id, IdB = b.Id }
Admittedly, the syntax is a little oblique. Nonetheless, it works and will be translated into something like this in SQL:
select
a.Id as IdA,
b.Id as Idb
from alpha a
left join beta b on a.Id = b.Id
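One caveat when the same pattern runs in memory (LINQ to Objects) rather than being translated to SQL: for unmatched rows b is null, so b.Id would throw. A minimal sketch with a null-safe projection follows; the Alpha and Beta records and the LeftJoinDemo class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element types for the two sources.
public record Alpha(int Id);
public record Beta(int Id);

public static class LeftJoinDemo
{
    // Left join: every Alpha is kept; IdB is null when no Beta matches.
    public static List<(int IdA, int? IdB)> LeftJoin(
        IEnumerable<Alpha> alpha, IEnumerable<Beta> beta) =>
        (from a in alpha
         join btemp in beta on a.Id equals btemp.Id into bleft
         from b in bleft.DefaultIfEmpty()
         select (IdA: a.Id, IdB: b?.Id)).ToList();

    public static void Main()
    {
        var alpha = new[] { new Alpha(1), new Alpha(2) };
        var beta = new[] { new Beta(2) };
        foreach (var (idA, idB) in LeftJoin(alpha, beta))
            Console.WriteLine($"{idA} -> {(idB.HasValue ? idB.ToString() : "null")}");
    }
}
```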
It looks fine to me, though I could see why the multiple sub-queries could trigger inefficiency worries in the eyes of a coder.
Take a look at what SQL is produced though (I'm guessing you're running this against a database source from your saying "table" above), before you start worrying about that. The query providers can be pretty good at producing nice efficient SQL that in turn produces a good underlying database query, and if that's happening, then happy days (it will also give you another view on being sure of the correctness).
