joining in Linq - c#

When I declare
int[] a = { 1, 2 ,10,30};
int[] b = { 10, 20, 30 };
var q = from item1 in a
join item2 in b
on item1 equals item2
into g
select g;
1) What is actually being selected into g? I find the into keyword difficult to understand; an example that explains it would be appreciated.
2) How does the following query actually get projected?
1 from a in db.coll1
2 join b in db.coll2 on a.PK equals b.PK into b_join
3 from b in b_join.DefaultIfEmpty()
4 where
5 b.PK == null
6 select new {
7 a.PK,
8 a.Value,
9 Column1 = (System.Int32?)b.PK,
10 Column2 = b.Value
}
Line 2 uses "b" to select the item, and line 3 uses the same b. Does that mean we are overwriting the data we selected at line 2?

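1) You are doing a group join. The query produces one g for each element in a, and each g is an IEnumerable<int> containing all of the matching elements from b (an empty sequence when nothing matches). Unlike the result of group ... by, g is not an IGrouping and has no Key; if you need the element from a alongside its matches, select both, e.g. select new { item1, g }. To understand this fully, try it out in LINQPad. Also, read the documentation.
2) The into keyword replaces the first b with b_join, which holds the sequence of matching coll2 items for the current a, as described above. After that, the first b is no longer in scope, so the b declared on line 3 is a new range variable that iterates over b_join (DefaultIfEmpty supplies a single null when b_join is empty). Nothing is overwritten.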
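A minimal in-memory sketch of the second query may make the scoping easier to see. The Row class and the sample data are hypothetical stand-ins for db.coll1 and db.coll2, and the range variable on the second from is renamed to match to stress that it is a new variable rather than the b from the join line. Because this runs as LINQ to Objects, the filter checks match == null instead of b.PK == null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeftJoinScopeDemo
{
    // Hypothetical row type standing in for the entities in db.coll1 / db.coll2.
    public sealed class Row
    {
        public int PK { get; set; }
        public string Value { get; set; }
    }

    // Returns the PKs of rows in coll1 that have no partner in coll2.
    public static List<int> UnmatchedKeys(IEnumerable<Row> coll1, IEnumerable<Row> coll2)
    {
        var query =
            from a in coll1
            // "b" exists only inside this join clause; "into" replaces it with b_join.
            join b in coll2 on a.PK equals b.PK into b_join
            // "match" is a new range variable: one item of b_join at a time,
            // or a single null when b_join is empty (added by DefaultIfEmpty).
            from match in b_join.DefaultIfEmpty()
            where match == null
            select a.PK;

        return query.ToList();
    }

    public static void Main()
    {
        var coll1 = new[] { new Row { PK = 1, Value = "x" }, new Row { PK = 2, Value = "y" } };
        var coll2 = new[] { new Row { PK = 2, Value = "z" } };

        // Only PK 1 has no match in coll2.
        Console.WriteLine(string.Join(",", UnmatchedKeys(coll1, coll2)));
    }
}
```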

If you use .NET Reflector on your compiled code, you can see that your first query is equivalent to the following:
IEnumerable<IEnumerable<int>> q = a.GroupJoin(b,
item1 => item1,
item2 => item2,
(item1, group) => group);
Note that you are performing a group join, but only returning the groups. Compare this with an ordinary join:
IEnumerable<int> q = a.Join(b,
item1 => item1,
item2 => item2,
(item1, item2) => item2);
This returns, for each element of a, the matching elements of b, if any exist.
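To see the difference concretely, here is a small sketch that runs both forms on the arrays from the question. The class and method names are made up for illustration; only GroupJoin and Join come from System.Linq.

```csharp
using System;
using System.Linq;

public static class GroupJoinVsJoin
{
    // One inner array per element of a, in a's order; empty when nothing matches.
    public static int[][] Groups(int[] a, int[] b) =>
        a.GroupJoin(b, x => x, y => y, (x, matches) => matches.ToArray()).ToArray();

    // A single flat array with one entry per matching pair.
    public static int[] Flat(int[] a, int[] b) =>
        a.Join(b, x => x, y => y, (x, y) => y).ToArray();

    public static void Main()
    {
        int[] a = { 1, 2, 10, 30 };
        int[] b = { 10, 20, 30 };

        // Prints [] [] [10] [30], one group per line.
        foreach (var g in Groups(a, b))
            Console.WriteLine("[" + string.Join(",", g) + "]");

        // Prints 10,30
        Console.WriteLine(string.Join(",", Flat(a, b)));
    }
}
```

The group join keeps a slot for every element of a, which is what makes the left-join pattern with DefaultIfEmpty possible; the ordinary join drops elements that have no match.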

Related

LINQ to JSON group query on array

I have a sample of JSON data that I am converting to a JArray with NewtonSoft.
string jsonString = @"[{'features': ['sunroof','mag wheels']},{'features': ['sunroof']},{'features': ['mag wheels']},{'features': ['sunroof','mag wheels','spoiler']},{'features': ['sunroof','spoiler']},{'features': ['sunroof','mag wheels']},{'features': ['spoiler']}]";
I am trying to retrieve the features that are most commonly requested together. Based on the above dataset, my expected output would be:
sunroof, mag wheels, 2
sunroof, 1
mag wheels, 1
sunroof, mag wheels, spoiler, 1
sunroof, spoiler, 1
spoiler, 1
However, my LINQ is rusty, and the code I am using to query my JSON data is returning the count of the individual features, not the features selected together:
JArray autoFeatures = JArray.Parse(jsonString);
var features = from f in autoFeatures.Select(feat => feat["features"]).Values<string>()
group f by f into grp
orderby grp.Count() descending
select new { indFeature = grp.Key, count = grp.Count() };
foreach (var feature in features)
{
Console.WriteLine("{0}, {1}", feature.indFeature, feature.count);
}
Actual Output:
sunroof, 5
mag wheels, 4
spoiler, 3
I was thinking maybe my query needs a 'distinct' in it, but I'm just not sure.
The problem is in the Select. Calling Values<string>() on the whole sequence flattens the arrays, so every individual value becomes its own item. Instead, you need to combine the values of each features array into a single string and group on that. Here is how you do it:
var features = from f in autoFeatures.Select(feat => string.Join(",",feat["features"].Values<string>()))
group f by f into grp
orderby grp.Count() descending
select new { indFeature = grp.Key, count = grp.Count() };
Produces the following output
sunroof,mag wheels, 2
sunroof, 1
mag wheels, 1
sunroof,mag wheels,spoiler, 1
sunroof,spoiler, 1
spoiler, 1
You could use a HashSet to identify the distinct sets of features, and group on those sets. That way, your Linq looks basically identical to what you have now, but you need an additional IEqualityComparer class in the GroupBy to help compare one set of features to another to check if they're the same.
For example:
var featureSets = autoFeatures
.Select(feature => new HashSet<string>(feature["features"].Values<string>()))
.GroupBy(a => a, new HashSetComparer<string>())
.Select(a => new { Set = a.Key, Count = a.Count() })
.OrderByDescending(a => a.Count);
foreach (var result in featureSets)
{
Console.WriteLine($"{String.Join(",", result.Set)}: {result.Count}");
}
And the comparer class leverages the SetEquals method of the HashSet class to check if one set is the same as another (and this handles the strings being in a different order within the set, etc.)
public class HashSetComparer<T> : IEqualityComparer<HashSet<T>>
{
public bool Equals(HashSet<T> x, HashSet<T> y)
{
// so if x and y both contain "sunroof" only, this is true
// even if x and y are a different instance
return x.SetEquals(y);
}
public int GetHashCode(HashSet<T> obj)
{
// force comparison every time by always returning the same,
// or we could do something smarter like hash the contents
return 0;
}
}
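As an alternative to hand-writing the comparer, HashSet<T> ships with a static CreateSetComparer() method that returns an IEqualityComparer<HashSet<T>> based on set equality, with a hash code derived from the contents. The sketch below uses it on plain string arrays rather than a JArray so that it runs without Newtonsoft; the class and method names are hypothetical, and the keys are sorted only to make the printed text predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FeatureSetCounter
{
    // Counts how often each distinct set of features occurs, ignoring order within a set.
    public static List<KeyValuePair<string, int>> CountSets(IEnumerable<string[]> rows)
    {
        return rows
            .Select(features => new HashSet<string>(features))
            // Built-in comparer: two sets are equal when they hold the same elements.
            .GroupBy(set => set, HashSet<string>.CreateSetComparer())
            .Select(g => new KeyValuePair<string, int>(
                string.Join(",", g.Key.OrderBy(f => f, StringComparer.Ordinal)),
                g.Count()))
            .OrderByDescending(pair => pair.Value)
            .ToList();
    }

    public static void Main()
    {
        var rows = new[]
        {
            new[] { "sunroof", "mag wheels" },
            new[] { "sunroof" },
            new[] { "mag wheels" },
            new[] { "sunroof", "mag wheels", "spoiler" },
            new[] { "sunroof", "spoiler" },
            new[] { "mag wheels", "sunroof" },   // same set as the first row, different order
            new[] { "spoiler" },
        };

        foreach (var pair in CountSets(rows))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```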

Linq JOIN, COUNT, TUPLE in C#

I have a table with columns A, B and some other columns (not important here), for example:
A       B       C  D
Peter   apple
Thomas  apple
Thomas  banana
Lucy    null
How can I get a list of tuples {A, count of B} using a join?
For my table it is: {Peter, 1}, {Thomas, 2}, {Lucy, 0}
Thanks
You just have to group the records by column A and count the rows where B is not null:
var result = (from t1 in cartItems
group t1 by t1.A into t2
select new
{
t2.Key,
count = t2.Count(p=> p.B != null)
}).ToList();
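As a self-contained check of the grouping approach, here is a sketch that runs it over the sample rows from the question. The (A, B) value tuples are a hypothetical stand-in for whatever row type cartItems really holds.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountPerName
{
    // Groups rows by A and counts only the rows whose B is not null.
    public static List<Tuple<string, int>> CountB(IEnumerable<(string A, string B)> rows)
    {
        return rows
            .GroupBy(r => r.A)
            .Select(g => Tuple.Create(g.Key, g.Count(r => r.B != null)))
            .ToList();
    }

    public static void Main()
    {
        var rows = new (string A, string B)[]
        {
            ("Peter", "apple"),
            ("Thomas", "apple"),
            ("Thomas", "banana"),
            ("Lucy", null),
        };

        // Prints: Peter, 1 / Thomas, 2 / Lucy, 0
        foreach (var t in CountB(rows))
            Console.WriteLine($"{t.Item1}, {t.Item2}");
    }
}
```

No join is needed here: a name with only null values in B still forms a group, so it appears in the result with a count of 0.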
Since you mentioned a table, I assume it is a DataTable.
You could use simple LINQ statements for what you need. The query returns a List<Tuple<string, int>>, where Item1 is the name and Item2 is the count of non-null B values:
var results = dt.AsEnumerable()
    .GroupBy(row => row.Field<string>("A"))
    .Select(g => new Tuple<string, int>(g.Key, g.Count(row => row.Field<string>("B") != null)))
    .ToList();

How to compare two lists with multiple objects and set values?

I have two lists. Each item in a list has a Name and a Value. I want to loop through list1 and check whether each item's Name matches the Name of an item in list2 (the LINQ code below does this).
If they match, I want the list1 Value to be set to the list2 Value. How can this be done?
list1            list2
Name   Value     Name   Value
john   apple     John   orange
peter  null      Peter  grape
I need it to look like this:
list1            list2
Name   Value     Name   Value
john   orange    john   orange
peter  grape     peter  grape
Linq code:
var x = list1.Where(n => list2.Select(n1 => n1.Name).Contains(n.Name));
For filtering you can use LINQ, to set the values use a loop:
var commonItems = from x in list1
join y in list2
on x.Name equals y.Name
select new { Item = x, NewValue = y.Value };
foreach(var x in commonItems)
{
x.Item.Value = x.NewValue;
}
In one result, you can get the objects joined together:
var output= from l1 in list1
join l2 in list2
on l1.Name equals l2.Name
select new { List1 = l1, List2 = l2};
Then loop through the results and set the values:
foreach (var result in output)
result.List1.Value = result.List2.Value;
You are looking for a left join
var x = from l1 in list1
join l2 in list2 on l1.Name equals l2.Name into l3
from l2 in l3.DefaultIfEmpty()
select new { Name = l1.Name, Value = (l2 == null ? l1.Value : l2.Value) };
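One detail in the sample data is that the names differ in case (john vs. John), and equals in a join clause compares strings case-sensitively. If case-insensitive matching is what is wanted (an assumption, since the question does not say), a dictionary lookup is one way to sketch it. The Entry class is hypothetical, and when list2 contains the same name twice the last value wins.

```csharp
using System;
using System.Collections.Generic;

public static class ValueCopier
{
    // Hypothetical item type with the two properties described in the question.
    public sealed class Entry
    {
        public string Name { get; set; }
        public string Value { get; set; }
    }

    // Copies Value from list2 into list1 wherever the names match, ignoring case.
    public static void CopyValues(List<Entry> list1, List<Entry> list2)
    {
        var lookup = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var source in list2)
            lookup[source.Name] = source.Value;   // last entry wins on duplicate names

        foreach (var target in list1)
        {
            if (lookup.TryGetValue(target.Name, out var newValue))
                target.Value = newValue;
        }
    }

    public static void Main()
    {
        var list1 = new List<Entry>
        {
            new Entry { Name = "john", Value = "apple" },
            new Entry { Name = "peter", Value = null },
        };
        var list2 = new List<Entry>
        {
            new Entry { Name = "John", Value = "orange" },
            new Entry { Name = "Peter", Value = "grape" },
        };

        CopyValues(list1, list2);

        // Prints: john orange / peter grape
        foreach (var e in list1)
            Console.WriteLine($"{e.Name} {e.Value}");
    }
}
```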

comparing two lists with LINQ

Let's say that I have these two lists of Persons. The Person object has FirstName, LastName, and Age properties.
List A
David Smith, 38
David Smith, 38
Susan Johnson, 23
List B
David Smith, 38
David Smith, 38
Susan Johnson, 23
Daniel Wallace, 55
I want to see if A is a subset of B by comparing the three properties. No, in this case I do not have a unique ID for each person.
EDIT: There can be duplicates in List A (David Smith, 38). List B must contain the same duplicates to qualify as a superset of A.
Once you've got a class which implements IEquatable<T> or IEqualityComparer<T>, it's easy to do the rest with Except and Any:
if (collectionA.Except(collectionB).Any())
{
// There are elements in A which aren't in B
}
or
if (collectionA.Except(collectionB, equalityComparer).Any())
{
// There are elements in A which aren't in B
}
EDIT: If there are duplicates, you'd probably want to group each collection, then check the counts:
var groupedA = collectionA.GroupBy(p => p,
(Value, g) => new { Value, Count = g.Count() });
var groupedB = collectionB.GroupBy(p => p,
(Value, g) => new { Value, Count = g.Count() });
var extras = from a in groupedA
join b in groupedB on a.Value equals b.Value into match
where !match.Any() || a.Count > match.First().Count
select a;
// ListA has at least one entry not in B, or with more duplicates than in B
if (extras.Any())
{
}
This is pretty horrible though...
If Person does not implement IEquatable<Person>, the "brute force" method would be:
var isSubset = listA.All(pa => listB.Any(pb => pb.FirstName == pa.FirstName &&
                                               pb.LastName == pa.LastName &&
                                               pb.Age == pa.Age));
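You can use a join:
var lst1 = new List<Person>(); // subset
var lst2 = new List<Person>(); // set of all values
var res = from l1 in lst1
          join l2 in lst2
          on new { l1.FirstName, l1.LastName, l1.Age }
          equals new { l2.FirstName, l2.LastName, l2.Age }
          select new { result = l1 };
Then compare the counts. If they are equal, the set contains the subset (this assumes lst2 has no duplicate entries; otherwise each duplicate adds an extra match and inflates the count):
bool flag = res.Count() == lst1.Count();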
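For the duplicate-aware requirement from the question's EDIT, one possible sketch is to count occurrences in B keyed by a value tuple of the three properties, then "spend" one occurrence for each person in A. The Person class and the method name are assumptions for illustration; value tuples compare by their components, so no custom comparer is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubsetCheck
{
    // Hypothetical Person type with the three properties from the question.
    public sealed class Person
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int Age { get; set; }
    }

    // True when every person in a can be paired with a distinct equal person in b.
    public static bool IsSubsetWithDuplicates(IEnumerable<Person> a, IEnumerable<Person> b)
    {
        // How many times each (FirstName, LastName, Age) combination occurs in b.
        var remaining = b
            .GroupBy(p => (p.FirstName, p.LastName, p.Age))
            .ToDictionary(g => g.Key, g => g.Count());

        foreach (var p in a)
        {
            var key = (p.FirstName, p.LastName, p.Age);
            if (!remaining.TryGetValue(key, out var left) || left == 0)
                return false;              // b has run out of copies of this person
            remaining[key] = left - 1;     // use up one copy
        }
        return true;
    }

    public static void Main()
    {
        var david = new Person { FirstName = "David", LastName = "Smith", Age = 38 };
        var susan = new Person { FirstName = "Susan", LastName = "Johnson", Age = 23 };
        var daniel = new Person { FirstName = "Daniel", LastName = "Wallace", Age = 55 };

        var listA = new[] { david, david, susan };
        var listB = new[] { david, david, susan, daniel };

        Console.WriteLine(IsSubsetWithDuplicates(listA, listB));   // True
        Console.WriteLine(IsSubsetWithDuplicates(listB, listA));   // False
    }
}
```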

c# where in with list and linq

I have two types of objects, ObjectA and ObjectB, like this:
ObjectA
{
Int64 idObjectA;
String name;
....
}
ObjectB
{
Int64 idObjectB;
Int64 idObjectA;
String name;
....
}
I have two lists, one of ObjectA and one of ObjectB. I want to create a new list C that contains only the ObjectB items whose idObjectA matches the ID of some item in list A.
In SQL it would be something like this:
select * from B where IDObjectA IN(1,2,3,4...);
In my case, the values for the IN clause come from the list of ObjectA, via the idObjectA property.
You can use the Join linq method to achieve this by joining listB and listA by their idObjectA, then select itemB.
var result = (from itemB in listB
join itemA in listA on itemB.idObjectA equals itemA.idObjectA
select itemB).ToList();
This method has roughly linear complexity, O(n + m), because Join builds a hash lookup over listA once. Using Where(... => ....Contains()) on a list, or a double foreach, has quadratic complexity (O(n * m)).
The same with Join and without Contains:
var listC = listB.Join(listA, b => b.idObjectA, a => a.idObjectA, (b, a) => b).ToList();
This is a slightly different way of doing it, as opposed to a join.
List<ObjectA> listA = ..
List<ObjectB> listB = ..
List<long> listAIds = listA.Select(a => a.idObjectA).ToList();
//^^ this projects the list of objects into a list of IDs
//It reads like this...
//get items in listB WHERE..
var listC = listB.Where(b => listAIds.Contains(b.idObjectA)).ToList();
//b.idObjectA is in listA, OR where listA contains b.idObjectA
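If the Where/Contains shape is preferred because it reads like SQL's IN, collecting the IDs into a HashSet<long> keeps each Contains call at roughly constant time. This is a sketch under the assumption that the two classes look like the outlines in the question; the FilterByA name is made up, and unlike a join it returns each ObjectB at most once even if listA repeats an ID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ObjectA
{
    public long idObjectA;
    public string name;
}

public class ObjectB
{
    public long idObjectB;
    public long idObjectA;
    public string name;
}

public static class InClauseDemo
{
    // Equivalent of: select * from B where idObjectA IN (ids taken from listA)
    public static List<ObjectB> FilterByA(List<ObjectA> listA, List<ObjectB> listB)
    {
        // Hash set of the IDs present in listA; lookups do not scan the whole list.
        var ids = new HashSet<long>(listA.Select(a => a.idObjectA));
        return listB.Where(b => ids.Contains(b.idObjectA)).ToList();
    }

    public static void Main()
    {
        var listA = new List<ObjectA>
        {
            new ObjectA { idObjectA = 1, name = "a1" },
            new ObjectA { idObjectA = 2, name = "a2" },
        };
        var listB = new List<ObjectB>
        {
            new ObjectB { idObjectB = 10, idObjectA = 1, name = "b10" },
            new ObjectB { idObjectB = 11, idObjectA = 3, name = "b11" },
            new ObjectB { idObjectB = 12, idObjectA = 2, name = "b12" },
        };

        // Prints: b10 and b12
        foreach (var b in FilterByA(listA, listB))
            Console.WriteLine(b.name);
    }
}
```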
Not LINQ, but it does what you want:
List<ObjectB> C = new List<ObjectB>();
foreach (var n in B)
{
    foreach (var c in A)
    {
        if (n.idObjectA == c.idObjectA)
        {
            C.Add(n);
            break;
        }
    }
}
Or if you wanted higher performance, use a for, and higher than that use Cédric Bignon's solution.
