When writing LINQ queries in C#, I know I can perform a join using the join keyword. But what does the following do?
from c in Companies
from e in c.Employees
select e;
A LINQ book I have says it's a type of join, but not a proper join (which uses the join keyword). So exactly what type of join is it?
Multiple "from" clauses form a compound LINQ query. They behave like nested foreach statements. The MSDN page has a good example:
var scoreQuery = from student in students
from score in student.Scores
where score > 90
select new { Last = student.LastName, score };
This statement could be rewritten as:
// SomeDupCollection is a placeholder for any collection that allows duplicate keys.
var nameScore = new SomeDupCollection<string, int>();
foreach (Student curStudent in students)
{
    foreach (int curScore in curStudent.Scores)
    {
        if (curScore > 90)
        {
            nameScore.Add(curStudent.LastName, curScore);
        }
    }
}
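This gets translated into a SelectMany() call. It is essentially a cross join between each company and its own Employees collection, which flattens the nested collections into one sequence.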
Jon Skeet talks about it on his blog, as part of the Edulinq series. (Scroll down to Secondary "from" clauses.)
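To make the SelectMany() translation concrete, here is a minimal sketch using LINQ to Objects. The company and employee names are made-up sample data, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: each company carries its own employee list.
var companies = new[]
{
    (Name: "Acme", Employees: new List<string> { "Ann", "Bob" }),
    (Name: "Globex", Employees: new List<string> { "Cid" }),
};

// Query syntax with two from clauses.
var viaQuery = (from c in companies
                from e in c.Employees
                select e).ToList();

// Equivalent method syntax: SelectMany with a collection selector and a result selector.
var viaMethod = companies.SelectMany(c => c.Employees, (c, e) => e).ToList();

Console.WriteLine(string.Join(", ", viaQuery));        // Ann, Bob, Cid
Console.WriteLine(viaQuery.SequenceEqual(viaMethod));  // True
```

Both forms walk the outer sequence and, for each element, walk its inner collection, which is why the nested foreach comparison holds.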
The code that you listed:
from c in company
from e in c.Employees
select e;
... will produce a list of every employee for every company in the company variable. If an employee works for two companies, they will be included in the list twice.
The only "join" that might occur here is when you say c.Employees. In an SQL-backed provider, this would translate to an inner join from the Company table to the Employee table.
However, the double-from construct is often used to perform "joins" manually, like so:
from c in companies
from e in employees
where c.CompanyId == e.CompanyId
select e;
This would have a similar effect to the code you posted, with potential subtle differences depending on what the employees variable contains. It would also be equivalent to the following join:
from c in companies
join e in employees
on c.CompanyId equals e.CompanyId
select e;
If you wanted a Cartesian product, however, you could just remove the where clause. (To make it worth anything, you'd probably want to change the select slightly, too, though.)
from c in companies
from e in employees
select new {c, e};
This last query would give you every possible combination of company and employee.
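As a rough check of the three variants above, the following sketch runs them against small in-memory arrays. The data is invented for illustration, and this is LINQ to Objects only; a SQL-backed provider may translate each form differently.

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
var companies = new[] { (CompanyId: 1, Name: "Acme"), (CompanyId: 2, Name: "Globex") };
var employees = new[]
{
    (CompanyId: 1, Name: "Ann"),
    (CompanyId: 2, Name: "Bob"),
    (CompanyId: 3, Name: "Cid"), // no matching company
};

// Manual "join": double from plus a where filter.
var viaWhere = (from c in companies
                from e in employees
                where c.CompanyId == e.CompanyId
                select e.Name).ToList();

// The same result with the join keyword.
var viaJoin = (from c in companies
               join e in employees on c.CompanyId equals e.CompanyId
               select e.Name).ToList();

// No filter at all: every company paired with every employee.
var cartesian = (from c in companies
                 from e in employees
                 select new { c, e }).ToList();

Console.WriteLine(string.Join(", ", viaWhere));      // Ann, Bob
Console.WriteLine(viaWhere.SequenceEqual(viaJoin));  // True
Console.WriteLine(cartesian.Count);                  // 6
```

For in-memory sequences the join form builds a lookup over the inner sequence, whereas the double from with a where clause tests every pair, so the join usually scales better.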
Every object in the first set is paired with every object in the second set. For example, the following test will pass:
[TestMethod()]
public void TestJoin()
{
var list1 = new List<Object1>();
var list2 = new List<Object2>();
list1.Add(new Object1 { Prop1 = 1, Prop2 = "2" });
list1.Add(new Object1 { Prop1 = 4, Prop2 = "2av" });
list1.Add(new Object1 { Prop1 = 5, Prop2 = "2gks" });
list2.Add(new Object2 { Prop1 = 3, Prop2 = "wq" });
list2.Add(new Object2 { Prop1 = 9, Prop2 = "sdf" });
var list = (from l1 in list1
from l2 in list2
select l1).ToList();
Assert.AreEqual(6, list.Count);
}
This is not a duplicate of: Given 2 C# Lists how to merge them and get only the non duplicated elements from both lists since he's looking at lists of the same type.
I have this scenario:
class A
{
    string id;
    // ... some other stuff
}
class B
{
    string id;
    // ... some other stuff
}
I would like to remove, both from A and B, elements that share an id field between the two lists.
I can do it in 3 steps: find the common ids, and then delete the records from both lists, but I'm wondering if there is something more elegant.
Edit: expected output
var A = [ 1, 3, 5, 7, 9 ]
var B = [ 1, 2, 3, 4, 5 ]
output:
A = [ 7, 9 ]
B = [ 2, 4 ]
but this is showing only the id field; as stated above, the lists are of different types, they just share ids.
You will require three steps, but you can use Linq to simplify the code.
Given two classes which have a property of the same (equatable) type, named "ID":
class Test1
{
public string ID { get; set; }
}
class Test2
{
public string ID { get; set; }
}
Then you can find the duplicates and remove them from both lists like so:
var dups =
(from item1 in list1
join item2 in list2 on item1.ID equals item2.ID
select item1.ID)
.ToArray();
list1.RemoveAll(item => dups.Contains(item.ID));
list2.RemoveAll(item => dups.Contains(item.ID));
But that is still three steps.
See .Net Fiddle example for a runnable example.
You can use LINQ Lambda expression for elegance:
var intersectValues = list2.Select(r => r.Id).Intersect(list1.Select(r => r.Id)).ToList();
list1.RemoveAll(r => intersectValues.Contains(r.Id));
list2.RemoveAll(r => intersectValues.Contains(r.Id));
Building on @Matthew Watson's answer, you can move all of it into a single LINQ expression:
(from item1 in list1
join item2 in list2 on item1.ID equals item2.ID
select item1.ID)
.ToList()
.ForEach(d =>
{
list1.RemoveAll(i1 => d == i1.ID);
list2.RemoveAll(i2 => d == i2.ID);
}
);
I don't know where you land on the performance scale. Note that this version calls RemoveAll on both lists once per duplicate ID, so it may well do more work than the three steps you already mentioned.
You also lose some readability as the from ... select result does not have a 'speaking' name like duplicates, to directly tell you what you will be working with in the ForEach.
Complete code example at https://gist.github.com/msdeibel/d2f8a97b754cca85fe4bcac130851597
An O(n) approach using hash sets of the IDs (listA and listB are the two input lists):
var aIds = new HashSet<string>(listA.Select(x => x.ID));
var bIds = new HashSet<string>(listB.Select(x => x.ID));
var result1 = new List<A>(listA.Count);
var result2 = new List<B>(listB.Count);
foreach (A item in listA)
{
    // Keep only the items whose ID does not appear in the other list.
    if (!bIds.Contains(item.ID))
        result1.Add(item);
}
foreach (B item in listB)
{
    if (!aIds.Contains(item.ID))
        result2.Add(item);
}
In the following code:
var finalArticles =
from domainArticle in articlesFoundInDomain
join articleCategoryVersion in dbc.ArticlesCategoriesVersions
on domainArticle.ArticleID equals articleCategoryVersion.ArticleID
join articleCategory in dbc.ArticleCategories
on articleCategoryVersion.CategoryID equals articleCategory.CategoryID
where articleCategory.ParentID == 52
group articleCategory by articleCategory.CategoryID
into newArticleCategoryGroup
I understand that the group clause should be returning an IEnumerable<IGrouping<k, v>>, where k is the key, in this case CategoryID.
I think I'm misunderstanding Linq at this point because I assume that for each 'k' there should be a list of articles in 'v', but I don't understand the mechanisms or terminology or something. When I try to project this statement into a new anonymous object I don't seem to get any articles... where are they?
Edit:
Okay so I've got a piece of code that is working, but unfortunately it's hitting the SQL server multiple times:
var articlesAssociatedWithKnowledgeTypes =
from categories in dbc.ArticleCategories
join categoryVersions in dbc.ArticlesCategoriesVersions
on categories.CategoryID equals categoryVersions.CategoryID
join articles in articlesFoundInGivenDomain
on categoryVersions.ArticleID equals articles.ArticleID
where categories.ParentID == 52 && articles.Version == categoryVersions.Version
select new
{
ArticleID = articles.ArticleID,
ArticleTitle = articles.Title,
ArticleVersion = articles.Version,
CategoryID = categories.CategoryID,
CategoryName = categories.Name
} into knowledgeTypesFlat
group knowledgeTypesFlat by new { knowledgeTypesFlat.CategoryID, knowledgeTypesFlat.CategoryName } into knowledgeTypesNested
select new
{
CategoryID = knowledgeTypesNested.Key.CategoryID,
CategoryName = knowledgeTypesNested.Key.CategoryName,
Articles = knowledgeTypesNested.ToList()
};
I thought the ToList() on Articles would sort that out, but it doesn't. The code works, although I'm not sure it is optimal.
The grouping returns an enumeration of IGroupings. IGrouping<K, V> itself implements IEnumerable<V>. Think of each group as an enumerable of all the members of that group plus an extra property, Key.
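A small LINQ to Objects sketch may help show where the grouped elements live. The article tuples below are made-up sample data, not the tables from the question.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for articles with a category.
var articles = new[]
{
    (CategoryID: 1, Title: "Intro"),
    (CategoryID: 2, Title: "Setup"),
    (CategoryID: 1, Title: "FAQ"),
};

// A query may end in a group clause; the result is a sequence of IGrouping<int, ...>.
var groups = from a in articles
             group a by a.CategoryID;

foreach (var g in groups)
{
    // g.Key is the CategoryID; enumerating g itself yields the members of the group.
    Console.WriteLine($"{g.Key}: {string.Join(", ", g.Select(a => a.Title))}");
}
// Prints "1: Intro, FAQ" and then "2: Setup".
```

The elements are not stored in a separate property: the group object is the collection, so g.ToList() or a foreach over g gives you the members.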
In your first query you are showing a group by and the second one is a group join, both return different results. The group by returns an IEnumerable<IGrouping<TKey, TElement>>. To get the result you're expecting you could group by CategoryId and CategoryName and project as I show below:
var finalArticles =
from domainArticle in articlesFoundInDomain
join articleCategoryVersion in dbc.ArticlesCategoriesVersions
on domainArticle.ArticleID equals articleCategoryVersion.ArticleID
join articleCategory in dbc.ArticleCategories
on articleCategoryVersion.CategoryID equals articleCategory.CategoryID
where articleCategory.ParentID == 52
group articleCategory by new{ articleCategory.CategoryID,articleCategory.CategoryName}
into g
select new {CatId=g.Key.CategoryID, CatName=g.Key.CategoryName,Articles =g.ToList() };
When you need the grouped elements you can call ToList or ToArray as I did above
Your finalArticles query results in an IEnumerable<IGrouping<int, Article>> (assuming CategoryID is int and your articles are of type Article).
Each IGrouping<int, Article> provides a Key property of type int (your CategoryID) and is itself an IEnumerable<Article> representing the sequence of articles for that CategoryID.
You can turn this for example into a Dictionary<int, List<Article>> mapping CategoryIDs to the lists of articles:
var dictionary = finalArticles.ToDictionary(group => group.Key, group => group.ToList());
or to a list of categories containing articles:
var categories = finalArticles.Select(group => new {
CategoryID = group.Key,
Articles = group.ToList()}).ToList();
Update after your comment:
var finalArticles =
from domainArticle in articlesFoundInDomain
join articleCategoryVersion in dbc.ArticlesCategoriesVersions
on domainArticle.ArticleID equals articleCategoryVersion.ArticleID
join articleCategory in dbc.ArticleCategories
on articleCategoryVersion.CategoryID equals articleCategory.CategoryID
where articleCategory.ParentID == 52
group articleCategory by new {articleCategory.CategoryID, articleCategory.Name}
into newArticleCategoryGroup
select new
{
CategoryID = newArticleCategoryGroup.Key.CategoryID,
CategoryName = newArticleCategoryGroup.Key.Name,
Articles = newArticleCategoryGroup.ToList()
};
OK, I've been banging my head against this for a few days, and after studying LINQ I think I am on the right track. But I have a SQL brain and that is sometimes hard to translate to C#.
I have two arrays, one sorted alphabetically, and the other ordered by ID. I need to order the second array alphabetically. The IDs are the joining factor. I.E. A.ID = P.ID.
Here are my arrays and example values;
private IGenericListItem[] _priceLevels = new IGenericListItem[0];
_priceLevels is in the form of {ID, Name}
{3, A}
{8, B}
{4, C}
{7, D}
{5, E}
{9, F}
{1, G}
Edit: updated this to show _assignmentControls contains a sub array. I didn't make it so excuse the insanity. It actually contains a copy of _priceLevels...
protected ArrayList _assignmentControls = new ArrayList();
_assignmentControls is in the form of {ID, LastPrice, NewPrice, _priceLevels[]}
{1, 1.00, 2.00, _priceLevels}
{2, 1.00, 2.00, _priceLevels}
{3, 1.00, 2.00, _priceLevels}
{4, 1.00, 2.00, _priceLevels}
Part of the problem is that I'm trying to compare/join an ArrayList and an IGenericListItem.
In SQL I would do something like this;
SELECT A.*
FROM _assignmentControls A JOIN _priceLevels P
ON A.ID = P.ID
ORDER BY P.Name
This returns the _assignmentControls rows sorted by the Name values in _priceLevels.
In C# LINQ I got this far, but can't seem to get it right;
var sortedList =
from a in _assignmentControls
join p in _priceLevels on a equals p.ID
orderby p.Name
select _assignmentControls;
I am getting red squigglies under join and orderby and the p in p.Name is red.
And A) it doesn't work. B) I'm not sure it will return sortedList as a sorted version of _assignmentControls sorted by _priceLevels.Name.
EDIT: When I hover over "join" I get "The type arguments for the method 'IEnumerable System.Linq.Enumerable.Join(this Enumerable,IEnumerable, Func,Func....' cannot be inferred from the query." I am researching that now.
Thanks for looking!
When I hover over "join" I get "The type arguments for the method IEnumerable System.Linq.Enumerable.Join(this Enumerable,IEnumerable, Func,Func.... cannot be inferred from the query."
I can explain what is going on here so that you can track it down.
When you say
from firstitem in firstcollection
join seconditem in secondcollection on firstkey equals secondkey
select result
the compiler translates that into:
Enumerable.Join(
firstcollection,
secondcollection,
firstitem=>firstkey,
seconditem=>secondkey,
(firstitem, seconditem)=>result)
Enumerable.Join is a generic method that has four type parameters: the element type of the first collection, the element type of the second collection, the key type, and the result type.
If you're getting that error then one of those four things cannot be deduced given the information you've provided to the compiler. For example, maybe:
The type of the first collection is not actually a sequence.
The type of the second collection is not actually a sequence.
The type of the result cannot be deduced.
The two keys are of inconsistent types and there is no unique best type.
That last point is the most likely one. Suppose for example the first key is int and the second key is short. Since every short can be converted to int, int would win, and the second key would be automatically converted to int. Now suppose that the first key type is Giraffe and the second key type is Tiger. Neither is better than the other. C# does not say "oh, they're both kinds of Animal, so let's pick that." Rather, it says that you haven't provided enough information to determine which one you meant; you should cast one of them to Animal and then it becomes clear.
Make sense?
There's a half-hour video of me explaining this feature back in 2006 -- this was back when I was adding the feature in question to the compiler -- so if you want a more in-depth explanation, check it out.
http://ericlippert.com/2006/11/17/a-face-made-for-email-part-three/
UPDATE: I just read your question again more carefully:
Part of the problem is that I'm trying to compare/join an ArrayList and an IGenericListItem.
There's the problem. The type of the sequence cannot be determined from an ArrayList. You should not use ArrayList anymore. In fact, you should not use it in any code written after 2005. Use List<T> for some suitable T.
Your select clause is wrong, it should be like this:
var sortedList =
from a in _assignmentControls
join p in _priceLevels on a equals p.ID
orderby p.Name
select a;
Another issue is that _assignmentControls is of type ArrayList, which has elements of type Object, so the compiler doesn't know the actual type of a, and can't use it as the join criteria since a doesn't have the same type as p.ID.
You should use a List<int> (assuming p.ID is of type int) instead of ArrayList. Another option is to specify the type of a explicitly:
var sortedList =
from int a in _assignmentControls
join p in _priceLevels on a equals p.ID
orderby p.Name
select a;
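Here is a minimal sketch of the explicitly typed range variable, under the same assumption as above that the ArrayList holds plain int IDs. The price levels are invented tuples, not the IGenericListItem type from the question.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical data: a non-generic ArrayList of int IDs and some named price levels.
var ids = new ArrayList { 1, 3, 4 };
var priceLevels = new[] { (ID: 3, Name: "A"), (ID: 4, Name: "C"), (ID: 1, Name: "G") };

// "from int a in ids" compiles to ids.Cast<int>(), which gives the compiler
// the element type it cannot get from ArrayList on its own.
var sorted = (from int a in ids
              join p in priceLevels on a equals p.ID
              orderby p.Name
              select a).ToList();

Console.WriteLine(string.Join(", ", sorted)); // 3, 4, 1
```

Cast<int>() throws InvalidCastException at enumeration time if any element is not an int, so switching to List<T> remains the more robust fix.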
I think you should write:
var sortedList =
from a in _assignmentControls
join p in _priceLevels on a.ID equals p.ID
orderby p.AnotherValue
select a;
When you write from a in _assignmentControls, you are declaring a range variable that refers to the current element of the sequence being processed. When you call select, you project an element from that sequence. Think of it as a conveyor belt.
Let me give you an example with dummy data:
public class SomeCLass
{
public int ID { get; set; }
public string Name { get; set; }
}
public class AnotherClass
{
public int ID { get; set; }
public int Value { get; set; }
public int AnotherValue { get; set; }
}
public void TestMEthod()
{
List<SomeCLass> _assignmentControls = new List<SomeCLass>()
{
new SomeCLass() { ID = 1, Name = "test"},
new SomeCLass() { ID = 2, Name = "another test"}
};
List<AnotherClass> _priceLevels = new List<AnotherClass>()
{
new AnotherClass() {ID = 1, AnotherValue = 15, Value = 13},
new AnotherClass() {ID = 2, AnotherValue = 5, Value = 13}
};
var sortedList =
// a is the range variable: it refers to the current element while iterating over _assignmentControls
from a in _assignmentControls
join p in _priceLevels on a.ID equals p.ID
orderby p.AnotherValue
select a;
foreach (var someCLass in sortedList)
{
Console.WriteLine(someCLass.Name);
}
}
Result:
another test
test
Still pretty new to entity framework. So forgive me if this is a noob question. Hoping someone can shed some light on this.
I am trying to select data from 3 related tables.
Leagues -> Teams -> Rosters ->
The relationships are League.LeagueID => Team.LeagueID => Roster.TeamID
In the Roster table there is a PlayerID column
I need a query that can select all leagues where Roster has PlayerID = 1
I cannot seem to filter results on the grandchild record no matter what I try. Not finding too much on the internet either.
I have found a way to do this with anonymous types, but those are read-only, so I can't make changes to the data. I must be able to update the data after it returns.
db.Leagues.Where(l => l.Teams.Any(t => t.Rosters.Any(r => r.PlayerID == 1)));
The SQL generated should get you what you want, even if it looks unreadable ;)
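To show the shape of filtering a parent by a condition on its grandchildren, here is an in-memory sketch with nested Any() calls. It uses LINQ to Objects with invented league and team names rather than a real Entity Framework context, so it says nothing about the SQL a provider would generate.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for Leagues -> Teams -> Rosters (player IDs).
var leagues = new[]
{
    (Name: "East", Teams: new[]
    {
        (Name: "Hawks", PlayerIDs: new[] { 1, 2 }),
        (Name: "Owls",  PlayerIDs: new[] { 3 }),
    }),
    (Name: "West", Teams: new[]
    {
        (Name: "Bears", PlayerIDs: new[] { 4 }),
    }),
};

// Keep a league if any of its teams has any roster entry for player 1.
var withPlayer1 = leagues
    .Where(l => l.Teams.Any(t => t.PlayerIDs.Any(id => id == 1)))
    .Select(l => l.Name)
    .ToList();

Console.WriteLine(string.Join(", ", withPlayer1)); // East
```

Because Where only filters, the elements that come back are the original league objects, which is what lets an ORM keep tracking them for updates.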
If you want to specifically use inner joins to do this, you can do so with code like this:
from l in db.Leagues
join t in db.Teams on l.LeagueID equals t.LeagueID
join r in db.Rosters on t.TeamID equals r.TeamID
where r.PlayerID == 1
select l
UPDATE
To do this while eager loading the child associations, use Include():
((from l in db.Leagues
join t in db.Teams on l.LeagueID equals t.LeagueID
join r in db.Rosters on t.TeamID equals r.TeamID
where r.PlayerID == 1
select l) as ObjectQuery<League>).Include(l => l.Teams.Select(t => t.Rosters))
db.Rosters.Where(r => r.PlayerID == 1).Select(r => r.Team).Select(t => t.League).Distinct()
If a Roster has many teams and a team has many leagues, you can use .SelectMany instead of .Select.
Example of .SelectMany from MSDN:
PetOwner[] petOwners =
{ new PetOwner { Name="Higa, Sidney",
Pets = new List<string>{ "Scruffy", "Sam" } },
new PetOwner { Name="Ashkenazi, Ronen",
Pets = new List<string>{ "Walker", "Sugar" } },
new PetOwner { Name="Price, Vernette",
Pets = new List<string>{ "Scratches", "Diesel" } } };
// Query using SelectMany().
IEnumerable<string> query1 = petOwners.SelectMany(petOwner => petOwner.Pets);
Preface: I don't understand what this does:
o => o.ID, i => i.ID, (o, id) => o
So go easy on me. :-)
I have 2 lists that I need to join together:
// list1 contains ALL contacts for a customer.
// Each item has a unique ID.
// There are no duplicates.
ContactCollection list1 = myCustomer.GetContacts();
// list2 contains the customer contacts (in list1) relevant to a REPORT
// the items in this list may have properties that differ from those in list1.
/*****/// e.g.:
/*****/ bool SelectedForNotification;
/*****/// may be different.
ContactCollection list2 = myReport.GetContacts();
I need to create a third ContactCollection that contains all of the contacts in list1 but with the properties of the items in list2, if the item is in list2 (list3.Count == list1.Count).
I need to replace all items in list1 with the items in list2 where items in list1 have the IDs of the items in list2. The resulting list (list3) should contain the same number of items as list1.
I feel as though I'm not making any sense. So, please ask questions in the comments and I'll try to clarify.
Joins are not so difficult, but your problem could probably use some further explanation.
To join two lists, you could do something like
var joined = from Item1 in list1
join Item2 in list2
on Item1.Id equals Item2.Id // join on some property
select new { Item1, Item2 };
this will give an IEnumerable<'a>, where 'a is an anonymous type holding an item from list1 and its related item from list2. You could then choose which objects' properties to use as needed.
To get the result to a concrete list, all that is needed is a call to .ToList(). You can do that like
var list3 = joined.ToList();
// or
var list3 = (from Item1 in list1
join Item2 in list2
on Item1.Id equals Item2.Id // join on some property
select new { Item1, Item2 }).ToList();
To do a left join to select all elements from list1 even without a match in list2, you can do something like this
var list3 = (from Item1 in list1
join Item2 in list2
on Item1.Id equals Item2.Id // join on some property
into grouping
from Item2 in grouping.DefaultIfEmpty()
select new { Item1, Item2 }).ToList();
This will give you a list where Item1 equals the item from the first list and Item2 will either equal the matching item from the second list or the default, which will be null for a reference type.
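As a runnable sketch of the left join, the following uses anonymous objects with made-up Id and Selected properties in place of real contacts. It keeps every item from the first list and swaps in the second list's version where one exists.

```csharp
using System;
using System.Linq;

// Hypothetical contacts; anonymous types are reference types, so a missing match is null.
var list1 = new[]
{
    new { Id = 1, Selected = false },
    new { Id = 2, Selected = false },
    new { Id = 3, Selected = false },
};
var list2 = new[] { new { Id = 2, Selected = true } };

// Left join: every item1 survives; match is null when list2 has no item with that Id.
var list3 = (from item1 in list1
             join item2 in list2 on item1.Id equals item2.Id into matches
             from match in matches.DefaultIfEmpty()
             select match ?? item1).ToList();

foreach (var c in list3)
    Console.WriteLine($"{c.Id}: {c.Selected}");
// Prints "1: False", "2: True", "3: False".
```

If list2 could contain several items with the same Id, this pattern would emit one row per match, so the result could grow larger than list1.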
Here is what I came up with (based on this):
List<Contact> list3 = (from item1 in list1
join item2 in list2
on item1.ContactID equals item2.ContactID into g
from o in g.DefaultIfEmpty()
select o == null ? item1 :o).ToList<Contact>();
My favorite part is the big nosed smiley
:o)
Thanks for your help!
Here is a DotNetFiddle with a Linq Group Join
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
class Order
{
public int Id;
public string Name;
public Order(int id, string name)
{
this.Id = id;
this.Name = name;
}
}
class OrderItem
{
public int Id;
public string Name;
public int OrderId;
public OrderItem(int id, string name, int orderId)
{
this.Id = id;
this.Name = name;
this.OrderId = orderId;
}
}
List<Order> orders = new List<Order>()
{
new Order(1, "one"),
new Order(2, "two")
};
List<OrderItem> orderItems = new List<OrderItem>()
{
new OrderItem(1, "itemOne", 1),
new OrderItem(2, "itemTwo", 1),
new OrderItem(3, "itemThree", 1),
new OrderItem(4, "itemFour", 2),
new OrderItem(5, "itemFive", 2)
};
var joined =
from o in orders
join oi in orderItems
on o.Id equals oi.OrderId into gj // gj (group join) is the collection of OrderItems matching this order
select new { o, gj };
// this is just to write the results to the console
string columns = "{0,-20} {1, -20}";
Console.WriteLine(string.Format(columns, "Order", "Item Count"));
foreach(var j in joined)
{
Console.WriteLine(columns, j.o.Name, j.gj.Count() );
}
It looks like you don't really need a full-join. You could instead do a semi-join, checking each contact in list 2 to see if it is contained in list 1:
var list3 = list2.Where(c => list1.Contains(c));
I don't know how big your lists are, but note that this approach has O(nm) complexity unless list1 is sorted or supports fast lookups (as in a hash set), in which case it could be as efficient as O(n log m), or it could be rewritten as a merge join and be O(n).
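One way to get the fast lookup is to build a HashSet of the IDs from list1 first. This is only a sketch: the contacts are invented tuples, and it matches on an Id field instead of relying on Contains and object equality.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical contacts keyed by Id.
var list1 = new[] { (Id: 1, Name: "Ann"), (Id: 2, Name: "Bob"), (Id: 3, Name: "Cid") };
var list2 = new[] { (Id: 2, Name: "Bob (report)"), (Id: 9, Name: "Zed") };

// Build the lookup once; each Contains call is then O(1) on average.
var idsInList1 = new HashSet<int>(list1.Select(c => c.Id));

// Semi-join: keep the list2 items whose Id also appears in list1.
var list3 = list2.Where(c => idsInList1.Contains(c.Id)).ToList();

Console.WriteLine(string.Join(", ", list3.Select(c => c.Name))); // Bob (report)
```

Building the set costs one pass over list1 and the filter one pass over list2, so the expected total is O(n + m) at the price of the extra memory for the set.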