I'm a big noob with LINQ and trying to learn, but I've hit a blocking point here.
I have a structure of type:
Dictionary<MyType, List<MyObj>>
I would like to query that structure with LINQ to extract all the MyObj instances that appear in more than one list within the dictionary.
What would such a query look like?
Thanks.
from myObjectList in myObjectDictionary.Values
from myObject in myObjectList.Distinct()
group myObject by myObject into myObjectGroup
where myObjectGroup.Skip(1).Any()
select myObjectGroup.Key
The Distinct() on each list ensures MyObj instances which repeat solely in the same list are not reported.
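To see the effect of the per-list Distinct(), here is a minimal runnable sketch of the same query. The class name SharedItemsDemo, the method InMoreThanOneList, and the sample data are made up for illustration, and plain strings stand in for MyObj.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SharedItemsDemo
{
    // Hypothetical helper: returns every value that occurs in at least two different lists.
    public static List<T> InMoreThanOneList<T>(IEnumerable<List<T>> lists)
    {
        var query =
            from list in lists
            from item in list.Distinct()   // ignore repeats inside a single list
            group item by item into g
            where g.Skip(1).Any()          // at least two lists contain the item
            select g.Key;
        return query.ToList();
    }

    public static void Main()
    {
        // Sample data (an assumption): "a" repeats only inside one list.
        var data = new Dictionary<string, List<string>>
        {
            ["x"] = new List<string> { "a", "a", "b" },
            ["y"] = new List<string> { "b", "c" },
            ["z"] = new List<string> { "c", "d" },
        };
        Console.WriteLine(string.Join(", ", InMoreThanOneList(data.Values)));
    }
}
```

With this data, b and c are reported. a is not, because both of its occurrences sit in the same list. Grouping relies on the default equality comparer, so a real MyObj class would need sensible Equals/GetHashCode (or you would group by an ID instead).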
You could do something like this:
var multipleObjs =
    MyObjDictionary.Values                   // Take every List<MyObj> stored in the dictionary
        .SelectMany(list => list)            // Flatten all the MyObjs from each List<MyObj> into a single IEnumerable
        .GroupBy(obj => obj)                 // Group by the object itself (or by an ID or unique property, if one exists)
        .Where(group => group.Count() >= 2)  // Keep only the groups with at least 2 objects
        .Select(group => group.Key);         // Re-select the objects using the group key
Edit
I realized that this could also be read differently: it may not matter whether a MyObj occurs multiple times in the same list, only whether it occurs in different lists. In that case, when we initially aggregate the lists of MyObjs we can select Distinct values, or use a slightly different query:
var multipleObjs =
    MyObjDictionary.Values                   // Take every List<MyObj> stored in the dictionary
        .SelectMany(v => v.Distinct())       // Flatten the distinct MyObjs from each List<MyObj> into a single IEnumerable
        .GroupBy(obj => obj)                 // Group by the object itself (or by an ID or unique property, if one exists)
        .Where(group => group.Count() >= 2)  // Keep only the groups with at least 2 objects
        .Select(group => group.Key);         // Re-select the objects using the group key
var multipleObjs =
    MyObjDictionary.SelectMany(kvp =>              // Select from all the KeyValuePairs
        kvp.Value.Where(obj =>
            MyObjDictionary.Any(kvp2 =>            // Where any of the KeyValuePairs
                (kvp.Key != kvp2.Key)              // is not the current KeyValuePair
                && kvp2.Value.Contains(obj))))     // and also contains the same MyObj
        .Distinct();                               // Report each shared MyObj only once
Related
I'm trying to iterate over my two LINQ queries using foreach nested loop pattern and then add the elements to the KeyValuePair instance of a list:
LINQ queries:
var nameQuery = org.deltagerRelation
    .SelectMany(r => r.Deltager ?? new List<Deltager>())
    .Where(a => a.Enhedstype != null && a.Enhedstype == "PERSON")
    .SelectMany(d => d.Navne)
    .Select(n => n.Navn);
var boardMembersQuery = org.deltagerRelation
    .SelectMany(o => o.Organisationer ?? new List<Organisationer>())
    .SelectMany(m => m.MedlemsData)
    .SelectMany(a => a.Attributter)
    .SelectMany(v => v.Vaerdier)
    .Where(v => v.Vaerdi != null && v.Vaerdi == "BESTYRELSE")
    .Select(v => v.Vaerdi);
The output I get is as expected: nameQuery returns two names and boardMembersQuery returns two board members. However, when I run them through a nested foreach loop like this:
foreach (var name in nameQuery)
{
    foreach (var boardMember in boardMembersQuery)
    {
        result.BoardMembers.Add(new KeyValuePair<string, JToken>(name, boardMember));
    }
}
Each of the query values gets added two times to the list, and I end up having four key/value pairs instead of two. Is the pattern of for each loops implemented incorrectly or is there something else going on that I'm missing?
Of course you have 4 key-value pairs.
nameQuery has 2 results.
boardMembersQuery has 2 results.
The outer loop runs 2 times, and the inner loop runs 2 times per outer loop iteration:
2 x 2 = 4
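The arithmetic above can be checked with a small sketch. The class CrossProductDemo, the method AllPairs, and the sample names are hypothetical, and plain strings stand in for the JToken values from the question.

```csharp
using System;
using System.Collections.Generic;

public static class CrossProductDemo
{
    // Pairs every name with every role, exactly as two nested foreach loops do.
    public static List<KeyValuePair<string, string>> AllPairs(
        IEnumerable<string> names, IEnumerable<string> roles)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (var name in names)
        {
            foreach (var role in roles)
            {
                pairs.Add(new KeyValuePair<string, string>(name, role));
            }
        }
        return pairs;
    }

    public static void Main()
    {
        var names = new[] { "Anna", "Bent" };              // made-up sample values
        var roles = new[] { "BESTYRELSE", "BESTYRELSE" };
        Console.WriteLine(AllPairs(names, roles).Count);   // 2 * 2 = 4
    }
}
```

In general, nesting the loops produces one pair per combination, so the count is the product of the two sequence lengths.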
Well, I figured out that I do not really need a nested foreach loop; I just used the Zip method on my LINQ query results:
var combinedResult = nameQuery.Zip(boardMembersQuery, (n, b) => new { Name = n, BoardMember = b });
and just iterated over the combinedResult and added the properties to the list!
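For comparison with the nested loops, here is a small sketch of positional pairing with Zip. ZipDemo, PairUp, and the sample values are hypothetical, and strings again stand in for JToken. It assumes the two queries yield their results in corresponding order, which the question does not confirm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipDemo
{
    // Pairs the i-th name with the i-th role; Zip stops at the shorter sequence.
    public static List<KeyValuePair<string, string>> PairUp(
        IEnumerable<string> names, IEnumerable<string> roles)
    {
        return names
            .Zip(roles, (n, r) => new KeyValuePair<string, string>(n, r))
            .ToList();
    }

    public static void Main()
    {
        var names = new[] { "Anna", "Bent" };              // made-up sample values
        var roles = new[] { "BESTYRELSE", "BESTYRELSE" };
        foreach (var pair in PairUp(names, roles))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

Two names and two roles give two pairs. If one sequence is longer than the other, the extra elements are silently dropped, so it may be worth checking the counts first.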
I have got this assignment. I need to create a method which works with JSON data in this form:
On input N, what are the top N movies? The score of a movie is its average rating.
So I have a JSON file with 5 million movies inside. Each row looks like this:
{ Reviewer:1, Movie:1535440, Grade:1, Date:'2005-08-18'},
{ Reviewer:1, Movie:1666666, Grade:2, Date:'2006-09-20'},
{ Reviewer:2, Movie:1535440, Grade:3, Date:'2008-05-10'},
{ Reviewer:3, Movie:1535440, Grade:5, Date:'2008-05-11'},
This file is deserialized and then stored as an IEnumerable. I then wanted to create a method that returns a List<int>, where each int is a MovieId. The movies in the list are ordered by score, descending, and the number of "top" movies is specified as a parameter of the method.
My method looks like this:
public List<int> GetSpecificAmountOfBestMovies(int amountOfMovies)
{
    var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
        .GroupBy(movieId => movieId.Movie)
        .Select(group => new
        {
            Key = group.Key,
            Average = group.Average(g => g.Grade)
        })
        .OrderByDescending(a => a.Average)
        .Take(amountOfMovies)
        .ToList();
    var moviesSortedList = new List<int>();
    foreach (var movie in moviesAndAverageGradeSortedList)
    {
        var key = movie.Key;
        moviesSortedList.Add(key);
    }
    return moviesSortedList;
}
So moviesAndAverageGradeSortedList is a List<{int,double}> because of the .Select call. I cannot return it directly, because the method's return type is List<int>: I want only the movieIds, not their average grades.
So I created a new List<int> and a foreach loop that goes through moviesAndAverageGradeSortedList and saves only the keys from that list.
I don't think this solution is right, because the foreach loop could be very slow when I pass a large number as the parameter. Does anybody know how I can get the keys (movieIds) from the first list and so avoid creating another List<int> and the foreach loop?
I will be thankful for every solution.
You can avoid the second list creation by just adding another .Select after the ordering. Also to make it all a bit cleaner you could:
return _deserializator.RatingCollection()
    .GroupBy(i => i.Movie)
    .OrderByDescending(g => g.Average(i => i.Grade))
    .Select(g => g.Key)
    .Take(amountOfMovies)
    .ToList();
Note that this won't really improve performance much (if at all), because even in your original implementation the second list is built only from the first n items. The expensive operation is ordering the groups by their averages, and that has to be performed on all the items in the JSON file, regardless of the number of items you want to return.
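As a quick check, the same pipeline can be run against the four sample rows from the question. The Rating class and the TopMoviesDemo.TopMovies helper below are assumptions made for this sketch; the real deserialized type is not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one deserialized row (the Date field is omitted).
public sealed class Rating
{
    public int Reviewer { get; set; }
    public int Movie { get; set; }
    public int Grade { get; set; }
}

public static class TopMoviesDemo
{
    // Returns the ids of the 'count' movies with the highest average grade.
    public static List<int> TopMovies(IEnumerable<Rating> ratings, int count)
    {
        return ratings
            .GroupBy(r => r.Movie)
            .OrderByDescending(g => g.Average(r => r.Grade))
            .Take(count)
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        var ratings = new List<Rating>
        {
            new Rating { Reviewer = 1, Movie = 1535440, Grade = 1 },
            new Rating { Reviewer = 1, Movie = 1666666, Grade = 2 },
            new Rating { Reviewer = 2, Movie = 1535440, Grade = 3 },
            new Rating { Reviewer = 3, Movie = 1535440, Grade = 5 },
        };
        // Averages: 1535440 -> (1 + 3 + 5) / 3 = 3, 1666666 -> 2
        Console.WriteLine(string.Join(", ", TopMovies(ratings, 1)));  // prints 1535440
    }
}
```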
You could add another select after you have ordered the list by average
var moviesAndAverageGradeSortedList = _deserializator.RatingCollection()
    .GroupBy(movieId => movieId.Movie)
    .Select(group => new
    {
        Key = group.Key,
        Average = group.Average(g => g.Grade)
    })
    .OrderByDescending(a => a.Average)
    .Take(amountOfMovies)
    .Select(s => s.Key)
    .ToList();
I have two lists. One is a dynamic list of objects. Another is a list of strings. I want to sort the object list based on the other list.
List<dynamic> List1; // Object1,Object2,Object3,Object4
List <String> List2; //"abc","bcd","da"
Each object has an attribute "alphabets", and the list has to be sorted on that attribute.
The number of objects may not equal the number of elements in the second list.
Something like this might work, if the indexes of the two lists align how you want them to. You'd have to ensure that the lists have the same length for this to work correctly though.
var result = list1
    .Select((item, index) =>
        new
        {
            Item = item,
            Order = list2[index]
        })
    .OrderBy(x => x.Order)
    .Select(x => x.Item);
If they aren't the same length, what would the ordering criterion be for the extra items? The problem doesn't define that. One approach would be to put them at the end of the list:
var result = list1.Take(list2.Count)   // List<T> exposes Count, not Length
    .Select((item, index) =>
        new
        {
            Item = item,
            Order = list2[index]
        })
    .OrderBy(x => x.Order)
    .Select(x => x.Item);
var concatted = result.Concat(list1.Skip(list2.Count));
Ok, assuming that List1 contains a list of objects, and each object contains an attribute called "alphabet", and you want to sort this list of objects, but the sort order is specified in List2 which has the possible values of alphabet in sorted order, then you could do this:
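// Pair each string in List2 with its position, which serves as the sort rank.
// The indexed Select avoids a mutable counter that would keep growing if the query were enumerated twice.
var List2WithRowNum = List2.Select((str2, rowNum) => new { rowNum, str2 });
// listObject stands for the actual type of the objects in List1.
var sortedList = from obj1 in List1.AsEnumerable()
                 join strKey in List2WithRowNum on ((listObject)obj1).alphabet equals strKey.str2
                 orderby strKey.rowNum
                 select obj1;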
sortedList would then be a list of your original objects (from List1) sorted by their "alphabet" attribute, in List2 order.
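Note that a join drops any object whose key is missing from the order list. As an alternative sketch (not the approach above), a rank dictionary keeps such objects and places them at the end. The names SortByOrderDemo and SortByKeyOrder, the tuple element names, and the sample data are hypothetical; the sketch also assumes the order list contains no duplicate entries, since ToDictionary would throw on them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortByOrderDemo
{
    // Sorts items by the position of their key in keyOrder.
    // Items whose key is not listed keep their relative order and go last.
    public static List<T> SortByKeyOrder<T>(
        IEnumerable<T> items, Func<T, string> keySelector, IList<string> keyOrder)
    {
        // Assumption: keyOrder has no duplicates (ToDictionary throws otherwise).
        var rank = keyOrder
            .Select((key, index) => new { key, index })
            .ToDictionary(x => x.key, x => x.index);

        return items
            .OrderBy(item => rank.TryGetValue(keySelector(item), out var r) ? r : int.MaxValue)
            .ToList();
    }

    public static void Main()
    {
        var objects = new List<(string Name, string Alphabets)>
        {
            ("Object1", "da"),
            ("Object2", "zz"),   // not present in the order list
            ("Object3", "abc"),
            ("Object4", "bcd"),
        };
        var order = new List<string> { "abc", "bcd", "da" };

        foreach (var o in SortByKeyOrder(objects, o => o.Alphabets, order))
        {
            Console.WriteLine(o.Name);   // Object3, Object4, Object1, Object2
        }
    }
}
```

OrderBy is a stable sort, so unlisted objects stay in their original relative order at the end.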
I have two lists of key/value pairs which I want to filter.
I would like to retrieve the key/value pairs from list B whose value differs from the value stored under the same key in list A.
List A    List B
<a,1>     <b,4>
<b,2>     <c,5>
<c,3>
So if I filter the above two key/value pair lists I would get the following:
List C
<b,4>
<c,5>
Is this possible without having to use a foreach loop and checking individual key values?
Join both lists by keys, then select those items, which have different values:
from kvpA in listA
join kvpB in listB on kvpA.Key equals kvpB.Key
where kvpA.Value != kvpB.Value
select kvpB
Lambda syntax:
listA.Join(listB,
        kvpA => kvpA.Key,
        kvpB => kvpB.Key,
        (kvpA, kvpB) => new { kvpA, kvpB })
    .Where(x => x.kvpA.Value != x.kvpB.Value)
    .Select(x => x.kvpB)
    .ToList()
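Here is a minimal runnable sketch of the join, using the sample lists from the question. The class JoinFilterDemo and the method ChangedInB are made-up names, and string keys with int values are an assumption about the pair types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinFilterDemo
{
    // Returns the pairs from listB whose key also exists in listA with a different value.
    public static List<KeyValuePair<string, int>> ChangedInB(
        IEnumerable<KeyValuePair<string, int>> listA,
        IEnumerable<KeyValuePair<string, int>> listB)
    {
        return (from kvpA in listA
                join kvpB in listB on kvpA.Key equals kvpB.Key
                where kvpA.Value != kvpB.Value
                select kvpB).ToList();
    }

    public static void Main()
    {
        var listA = new List<KeyValuePair<string, int>>
        {
            new KeyValuePair<string, int>("a", 1),
            new KeyValuePair<string, int>("b", 2),
            new KeyValuePair<string, int>("c", 3),
        };
        var listB = new List<KeyValuePair<string, int>>
        {
            new KeyValuePair<string, int>("b", 4),
            new KeyValuePair<string, int>("c", 5),
        };

        foreach (var kvp in ChangedInB(listA, listB))
        {
            Console.WriteLine(kvp.Key + "=" + kvp.Value);   // b=4, then c=5
        }
    }
}
```

Because this is an inner join, a pair in list B whose key does not exist in list A at all is left out of the result. If such pairs should be kept, a lookup-based filter would be needed instead.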
Try something like this (note that it compares values only and ignores the keys):
ListB.Where(kvpB => !ListA.Select(kvpA => kvpA.Value).Contains(kvpB.Value))
I have a LINQ query against an XML document that gives me a list of nested lists, each sublist holding the attribute values of one "row" element.
var items = loadbodies.Descendants("row").Select(a => a.Attributes().Select(b => b.Value).ToList()).ToList();
This works as intended, but what I actually need is to check it against another list of values, so that no sublist is added when the element's "messageID" attribute is on the second list. I can do this for one value, but I need to check against the entire second list.
The query to exclude a single sublist by a single hardcoded value from the second list is below.
var items = loadbodies.Descendants("row").Where(c => (string)c.Attribute("messageID") != "avaluefromthesecondlist").Select(a => a.Attributes().Select(b => b.Value).ToList()).ToList();
Any help would be much appreciated.
Just use Contains. Note that splitting lines helps readability considerably:
var ids = ...; // Some sequence of ids, e.g. a List<string> or HashSet<string>
var items = loadbodies
    .Descendants("row")
    // Exclude rows whose messageID is in ids; XML attribute names are case-sensitive
    .Where(row => !ids.Contains((string) row.Attribute("messageID")))
    .Select(a => a.Attributes()
                  .Select(b => b.Value)
                  .ToList())
    .ToList();
Note that you could use a Join call too... but so long as you've got relatively few IDs, this should be fine.
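A self-contained sketch of the exclusion filter follows, keeping only the rows whose messageID is not in the second list, as the question asks. The XML snippet, the class XmlFilterDemo, and the method RowsExcluding are invented for illustration; a HashSet<string> is used for the ids so that each lookup stays cheap.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlFilterDemo
{
    // Returns the attribute values of every "row" whose messageID is NOT in excludedIds.
    public static List<List<string>> RowsExcluding(XContainer root, ISet<string> excludedIds)
    {
        return root
            .Descendants("row")
            .Where(row => !excludedIds.Contains((string)row.Attribute("messageID")))
            .Select(row => row.Attributes().Select(a => a.Value).ToList())
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample document.
        var xml = XElement.Parse(
            "<rows>" +
            "<row messageID=\"1\" subject=\"hello\" />" +
            "<row messageID=\"2\" subject=\"spam\" />" +
            "<row messageID=\"3\" subject=\"bye\" />" +
            "</rows>");
        var excluded = new HashSet<string> { "2" };

        foreach (var row in RowsExcluding(xml, excluded))
        {
            Console.WriteLine(string.Join(", ", row));   // "1, hello" then "3, bye"
        }
    }
}
```

A row without a messageID attribute converts to null, which the set does not contain, so such rows are kept.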