Inverting a Hierarchy with Linq - c#

Given the class
public class Article
{
public string Title { get; set; }
public List<string> Tags { get; set; }
}
and
List<Article> articles;
How can I create a "map" from individual tags (that may be associated with 1 or more articles) with Linq?
Dictionary<string, List<Article>> articlesPerTag;
I know that I can select all of the tags like this
var allTags = articles.SelectMany(a => a.Tags);
However, I'm not sure how to associate back from each selected tag to the article it originated from.
I know I can write this conventionally along the lines of
Dictionary<string, List<Article>> map = new Dictionary<string, List<Article>>();
foreach (var a in articles)
{
foreach (var t in a.Tags)
{
List<Article> articlesForTag;
bool found = map.TryGetValue(t, out articlesForTag);
if (found)
articlesForTag.Add(a);
else
map.Add(t, new List<Article>() { a });
}
}
but I would like to understand how to accomplish this with Linq.

If you specifically need it as a dictionary from tags to articles, you could use something like this.
var map = articles.SelectMany(a => a.Tags.Select(t => new { t, a }))
.GroupBy(x => x.t, x => x.a)
.ToDictionary(g => g.Key, g => g.ToList());
However, it would be more efficient to use a lookup instead, since that is precisely the structure you are trying to build:
var lookup = articles.SelectMany(a => a.Tags.Select(t => new { t, a }))
.ToLookup(x => x.t, x => x.a);
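
As a rough, self-contained sketch of the lookup approach above (the TagIndex class, the Build method and the sample articles are made up for illustration and are not part of the question), the result can be consumed like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Article
{
    public string Title { get; set; }
    public List<string> Tags { get; set; }
}

// Hypothetical helper class, named only for this example.
public static class TagIndex
{
    // Pair every tag with the article it came from, then index the pairs by tag.
    public static ILookup<string, Article> Build(IEnumerable<Article> articles)
    {
        return articles
            .SelectMany(a => a.Tags.Select(t => new { Tag = t, Article = a }))
            .ToLookup(x => x.Tag, x => x.Article);
    }

    public static void Main()
    {
        var articles = new List<Article>
        {
            new Article { Title = "Intro to LINQ", Tags = new List<string> { "linq", "c#" } },
            new Article { Title = "Async basics", Tags = new List<string> { "c#" } }
        };

        var lookup = Build(articles);

        // Both sample articles carry the "c#" tag.
        foreach (var article in lookup["c#"])
            Console.WriteLine(article.Title);

        // Indexing a lookup with a missing key yields an empty sequence instead of throwing.
        Console.WriteLine(lookup["java"].Count());
    }
}
```

Unlike a Dictionary<string, List<Article>>, the lookup cannot be modified after it is built, which is usually fine for a read-only index like this one.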

Here is one more way using GroupBy. It is a bit complicated, though, and it rescans the article list once per distinct tag:
articles.SelectMany(article => article.Tags)
.Distinct()
.GroupBy(tag => tag, tag => articles.Where(a => a.Tags.Contains(tag)))
.ToDictionary(group => group.Key,
group => group.ToList().Aggregate((x, y) => x.Concat(y).Distinct()));

Related

LINQ to SQL - order by, group by and order by each group with skip and take

This is an extension of a question already answered by Jon Skeet, which you can find here.
The desired result is the following:
A 100
A 80
B 80
B 50
B 40
C 70
C 30
given the following class:
public class Student
{
public string Name { get; set; }
public int Grade { get; set; }
}
In the ideal scenario, this result can be obtained with Jon Skeet's answer:
var query = grades.GroupBy(student => student.Name)
.Select(group =>
new { Name = group.Key,
Students = group.OrderByDescending(x => x.Grade) })
.OrderBy(group => group.Students.FirstOrDefault().Grade);
However, in my case the query also has to support paging. This means calling SelectMany() and then Skip() and Take(). But Skip() requires an OrderBy() first, and that is where my ordering breaks again, because I need to preserve the order I get after SelectMany().
How can I achieve this?
var query = grades.GroupBy(student => student.Name)
.Select(group =>
new { Name = group.Key,
Students = group.OrderByDescending(x => x.Grade) })
.OrderBy(group => group.Students.FirstOrDefault().Grade)
.SelectMany(s => s.Students)
.OrderBy(/* something magical that doesn't break the ordering */)
.Skip(skip)
.Take(take);
I know I could manually sort again the records when my query is materialised but I would like to avoid this and do all of it in one SQL query that is translated from LINQ.
You can take another approach using Max instead of ordering each group and taking the first value. After that you can order by max grade, name (in case two students have the same max grade) and grade:
var query = grades
.GroupBy(s => s.Name, (k, g) => g
.Select(s => new { MaxGrade = g.Max(s2 => s2.Grade), Student = s }))
.SelectMany(s => s)
.OrderBy(s => s.MaxGrade)
.ThenBy(s => s.Student.Name)
.ThenByDescending(s => s.Student.Grade)
.Select(s => s.Student)
.Skip(toSkip)
.Take(toTake)
.ToList();
All these methods are supported by EF6 so you should get your desired result.
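
The composite-key idea above can be tried out in memory with LINQ to Objects. This is only a sketch: the GradePaging class and its Page method are invented for the example, it does not exercise any SQL translation, and it sorts groups by their best grade in descending order so that the full ordering matches the listing in the question (swap OrderByDescending for OrderBy to get ascending groups).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public string Name { get; set; }
    public int Grade { get; set; }
}

// Hypothetical helper class, named only for this example.
public static class GradePaging
{
    // Attach each group's best grade to every row, then sort on a composite key.
    // The key fully determines the row order, so Skip/Take can follow directly.
    public static List<Student> Page(IEnumerable<Student> grades, int skip, int take)
    {
        return grades
            .GroupBy(s => s.Name)
            .SelectMany(g => g.Select(s => new { MaxGrade = g.Max(x => x.Grade), Student = s }))
            .OrderByDescending(x => x.MaxGrade)      // groups with the best top grade first
            .ThenBy(x => x.Student.Name)             // tie-break between groups
            .ThenByDescending(x => x.Student.Grade)  // best grade first inside a group
            .Select(x => x.Student)
            .Skip(skip)
            .Take(take)
            .ToList();
    }

    public static void Main()
    {
        var grades = new List<Student>
        {
            new Student { Name = "B", Grade = 50 }, new Student { Name = "A", Grade = 80 },
            new Student { Name = "C", Grade = 30 }, new Student { Name = "B", Grade = 80 },
            new Student { Name = "A", Grade = 100 }, new Student { Name = "C", Grade = 70 },
            new Student { Name = "B", Grade = 40 }
        };

        // Second page of the full ordering when skipping two rows and taking three.
        foreach (var s in Page(grades, 2, 3))
            Console.WriteLine(s.Name + " " + s.Grade);
    }
}
```

Because every row carries its own sort key, no second "order-preserving" OrderBy is needed after SelectMany.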
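Just re-index the results and remove the index before returning. Note that the indexed Select overload is not translated to SQL by LINQ to SQL or EF6, so this only works once the sequence is enumerated in memory: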
var query = grades.GroupBy(student => student.Name)
.Select(group =>
new { Name = group.Key,
Students = group.OrderByDescending(x => x.Grade)
})
.OrderBy(group => group.Students.FirstOrDefault().Grade)
.SelectMany(s => s.Students)
.Select((obj,index) => new {obj,index})
.OrderBy(newindex => newindex.index)
.Skip(skip).Take(take)
.Select(final=> final.obj);

Using linq to filter List of List

I don't have a good grasp of LINQ yet, and I feel my code could be optimised, so I am looking for help.
I have a Patient class and a Med class; each has a public bool IsSelected. These are wrapped in the PatientMeds and PatientsMeds classes:
public class PatientMeds
{
public Patient Patient;
public List<Med> Meds;
}
public class PatientsMeds
{
public List<PatientMeds> PatientMedsList;
}
I want to filter these: if Patient.IsSelected == false, ignore the whole entry; otherwise keep the entry but ignore the Meds where IsSelected == false.
Now, this code works:
List<PatientMeds> patientMedsList = PatientsMeds.PatientMedsList
.Where(x => x.Patient.IsSelected)
.ToList();
foreach (PatientMeds patientMeds in patientMedsList)
{
var medsToRemove = patientMeds.Meds.Where(m => m.IsSelected == false).ToList();
foreach (Med med in medsToRemove)
{
patientMeds.Meds.Remove(med);
}
}
But it just seems 'clunky'. How can I make it better?
I would use the ForEach and RemoveAll methods:
List<PatientMeds> patientMedsList = PatientsMeds.PatientMedsList
.Where(x => x.Patient.IsSelected)
.ToList();
patientMedsList.ForEach(p=> p.Meds.RemoveAll(m=>!m.IsSelected));
You could construct a new list with new PatientMeds instances containing only selected patients and meds:
var selectedPatientsWithSelectedMeds = patientMedsList.Where(p => p.Patient.IsSelected)
.Select(p => new PatientMeds
{
Patient = p.Patient,
Meds = p.Meds.Where(m => m.IsSelected).ToList()
})
.ToList();
Here Where(p => p.Patient.IsSelected) keeps only the selected patients, and Select(p => new PatientMeds { ... }) constructs new PatientMeds instances.
Finally, p.Meds.Where(m => m.IsSelected).ToList() constructs a new list with only the selected meds.
However, it is not clear whether constructing new PatientMeds and List<Med> instances is viable in your case. For example, inside new PatientMeds { ... } you will need to map every property of PatientMeds.
Try shortening the foreach loop as follows:
foreach (PatientMeds patientMeds in patientMedsList)
{
patientMeds.Meds.RemoveAll(m => m.IsSelected == false);
}
You can try RemoveAll
patientsMeds
.PatientMedsList
.Where(m => m.Patient.IsSelected)
.ToList()
.ForEach(m => m.Meds.RemoveAll(med => !med.IsSelected));
Because PatientMeds is a reference type, the new list created by ToList() still holds references to the same PatientMeds objects. The removals are therefore also reflected in the patientsMeds variable.
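
The difference between editing in place and building copies can be checked with a small sketch. The Patient and Med classes below are minimal stand-ins (the question does not show them, and their Name fields are hypothetical), as are the SelectionFilter class and its method names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the classes in the question; the Name fields are hypothetical.
public class Patient { public string Name; public bool IsSelected; }
public class Med { public string Name; public bool IsSelected; }
public class PatientMeds { public Patient Patient; public List<Med> Meds; }

// Hypothetical helper class, named only for this example.
public static class SelectionFilter
{
    // In place: ToList() copies only the outer list, so RemoveAll edits the original Meds lists.
    public static List<PatientMeds> FilterInPlace(List<PatientMeds> source)
    {
        var selected = source.Where(pm => pm.Patient.IsSelected).ToList();
        selected.ForEach(pm => pm.Meds.RemoveAll(m => !m.IsSelected));
        return selected;
    }

    // Copying: builds new PatientMeds objects and new Meds lists, leaving the source untouched.
    public static List<PatientMeds> FilterCopy(IEnumerable<PatientMeds> source)
    {
        return source
            .Where(pm => pm.Patient.IsSelected)
            .Select(pm => new PatientMeds
            {
                Patient = pm.Patient,
                Meds = pm.Meds.Where(m => m.IsSelected).ToList()
            })
            .ToList();
    }

    public static void Main()
    {
        var source = new List<PatientMeds>
        {
            new PatientMeds
            {
                Patient = new Patient { Name = "Ann", IsSelected = true },
                Meds = new List<Med>
                {
                    new Med { Name = "M1", IsSelected = true },
                    new Med { Name = "M2", IsSelected = false }
                }
            }
        };

        var copy = FilterCopy(source);
        Console.WriteLine(copy[0].Meds.Count);    // 1: the copy holds only the selected med
        Console.WriteLine(source[0].Meds.Count);  // 2: the source is untouched

        FilterInPlace(source);
        Console.WriteLine(source[0].Meds.Count);  // 1: the source itself was edited
    }
}
```

Which variant fits depends on whether other code still needs the unfiltered Meds lists.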
Just use:
var bb = patientMedsList
.Where(p => p.Patient.IsSelected)
.ToList()
.Select(p => new PatientMeds
{
Patient = p.Patient,
Meds = p.Meds.Where(m => m.IsSelected).ToList()
})
.ToList();

How to partition an array into multiple arrays with LINQ?

In order to explain the problem I've created a simplified example. In real life the data class is somewhat more complicated. Consider the following data class:
public class Data
{
public Data(string source, string path, string information)
{
this.Source = source;
this.Path = path;
this.Information = information;
}
public string Source { get; set; }
public string Path { get; set; }
public string Information { get; set; }
}
Now consider the following array:
var array = new Data[] {
new Data("MS", @"c:\temp\img1.jpg", "{a}"),
new Data("IBM", @"c:\temp\img3.jpg", "{b}"),
new Data("Google", @"c:\temp\img1.jpg", "{c}"),
new Data("MS", @"c:\temp\img2.jpg", "{d}"),
new Data("MS", @"c:\temp\img3.jpg", "{e}"),
new Data("Google", @"c:\temp\img1.jpg", "{f}"),
new Data("IBM", @"c:\temp\img2.jpg", "{g}")
};
I would like to process the data by partitioning it on the Path and sorting each partition on Source. The output needs to look like this:
c:\temp\img1.jpg
"Google": "{c}"
"Google": "{f}"
"MS": "{a}"
c:\temp\img2.jpg
"IBM": "{g}"
"MS": "{d}"
c:\temp\img3.jpg
"IBM": "{b}"
"MS": "{e}"
How can I create these partitions with LINQ?
Here you can play with the code: https://dotnetfiddle.net/EbKluE
You can use LINQ's OrderBy and GroupBy to sort your items by Source and group your ordered items by Path:
var partitioned = array
.OrderBy(data => data.Source)
.GroupBy(data => data.Path);
See this fiddle for a demo.
You can use GroupBy and OrderBy like this:
Dictionary<string, Data[]> result =
array.GroupBy(d => d.Path)
.ToDictionary(g => g.Key, g => g.OrderBy(d => d.Source).ToArray());
This gives you a dictionary with Path as keys. Each value is an array of Data that have this Path and are sorted by their Source.
I would recommend the group by clause of LINQ.
For your case:
var queryImageNames =
from image in array // <-- Array is your name for the datasource
group image by image.Path into newGroup
orderby newGroup.Key
select newGroup;
foreach (var ImageGroup in queryImageNames)
{
Console.WriteLine("Key: {0}", ImageGroup.Key);
foreach (var image in ImageGroup)
{
Console.WriteLine("\t{0}, {1}", image.Source, image.Information);
}
}
You could use GroupBy and do this.
var results = array
.GroupBy(x=>x.Path)
.Select(x=>
new
{
Path =x.Key,
values=x.Select(s=> string.Format("{0,-8}:{1}", s.Source, s.Information))
.OrderBy(o=>o)
})
.ToList();
Output:
c:\temp\img1.jpg
Google :{c}
Google :{f}
MS :{a}
c:\temp\img3.jpg
IBM :{b}
MS :{e}
c:\temp\img2.jpg
IBM :{g}
MS :{d}
Check this fiddle
You can use Enumerable.GroupBy to group by the Path property:
var pathPartitions = array.GroupBy(x => x.Path);
foreach(var grp in pathPartitions)
{
Console.WriteLine(grp.Key);
var orderedPartition = grp.OrderBy(x => x.Source);
foreach(var x in orderedPartition )
Console.WriteLine($"\"{x.Source}\": \"{x.Information}\"");
}
If you want to create a collection you could create a Tuple<string, Data[]>[]:
Tuple<string, Data[]>[] pathPartitions = array
.GroupBy(x => x.Path)
.Select(g => Tuple.Create(g.Key, g.OrderBy(x => x.Source).ToArray()))
.ToArray();
or a Dictionary<string, Data[]>:
Dictionary<string, Data[]> pathPartitions = array
.GroupBy(x => x.Path)
.ToDictionary(g => g.Key, g => g.OrderBy(x => x.Source).ToArray());
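
Since a Dictionary does not promise any enumeration order, a variant that also sorts the partitions by path may be closer to the output requested in the question. The following is a sketch under that assumption; the PathPartitions class and its Build method are invented for the example, the Data class is trimmed to read-only properties, and ordinal string comparison is an arbitrary choice made here for deterministic results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public Data(string source, string path, string information)
    {
        Source = source;
        Path = path;
        Information = information;
    }

    public string Source { get; }
    public string Path { get; }
    public string Information { get; }
}

// Hypothetical helper class, named only for this example.
public static class PathPartitions
{
    // Group by Path, sort the groups by Path, and sort each group's items by Source.
    public static List<KeyValuePair<string, Data[]>> Build(IEnumerable<Data> items)
    {
        return items
            .GroupBy(d => d.Path)
            .OrderBy(g => g.Key, StringComparer.Ordinal)
            .Select(g => new KeyValuePair<string, Data[]>(
                g.Key,
                g.OrderBy(d => d.Source, StringComparer.Ordinal).ToArray()))
            .ToList();
    }

    public static void Main()
    {
        var array = new[]
        {
            new Data("MS", @"c:\temp\img1.jpg", "{a}"),
            new Data("IBM", @"c:\temp\img3.jpg", "{b}"),
            new Data("Google", @"c:\temp\img1.jpg", "{c}"),
            new Data("MS", @"c:\temp\img2.jpg", "{d}")
        };

        // Print each path followed by its entries, sorted by Source.
        foreach (var partition in Build(array))
        {
            Console.WriteLine(partition.Key);
            foreach (var d in partition.Value)
                Console.WriteLine($"  \"{d.Source}\": \"{d.Information}\"");
        }
    }
}
```

OrderBy is a stable sort, so items that share a Source keep their original relative order within a partition.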

ToDictionary on anonymous types

I am trying to mimic a function that already exists in the code base I am working on. The first function works, but when I modify it to store strings in the dictionary, it does not work: I get System.Linq.Enumerable+WhereSelectEnumerableIterator`2[<>f__AnonymousType3`2[System.Int32,System.String],System.String] as the value for the comments. I know that the first one uses Average, which is an aggregate, but I cannot figure out how to aggregate the comments, since they are strings.
public static Dictionary<int, double> getRatingAverages(string EventID)
{
List<tbMultipurposeVertical> allMain = DynamicData.Vertical.getRecords(EventID, appcode, -2).ToList();
Dictionary<int, double> ratings;
using (FBCDBDataContext db = new FBCDBDataContext())
{
ratings = db.tbMultipurposeVerticals.Where(v => v.eventid == EventID & v.appcode == "ratinglabel" & v.label == "Rater")
.Select(v => new
{
AbstractID = v.parent,
Rating = int.Parse(db.tbMultipurposeVerticals.First(r => r.parent == v.id & r.label == "Rating").value)
})
.GroupBy(r => r.AbstractID).ToDictionary(k => k.Key, v => v.Select(r => r.Rating).Average());
}
return ratings;
}
public static Dictionary<int, string> getRatingComments(string EventID)
{
List<tbMultipurposeVertical> allMain = DynamicData.Vertical.getRecords(EventID, appcode, -2).ToList();
Dictionary<int, string> comments;
using (FBCDBDataContext db = new FBCDBDataContext())
{
comments = db.tbMultipurposeVerticals.Where(v => v.eventid == EventID & v.appcode == "ratinglabel" & v.label == "Rater")
.Select(v => new
{
AbstractID = v.parent,
Comment = db.tbMultipurposeVerticals.First(r => r.parent == v.id & r.label == "Comment").ToString()
})
.GroupBy(r => r.AbstractID).ToDictionary(k => k.Key, v => v.Select(r => r.Comment).ToString());
}
return comments;
}
In the first method you take the average of the ratings, which is an aggregate. In the second method you treat each group as if it were a single comment.
It's not giving you what you expect because of the .GroupBy(): each dictionary value is a sequence of comments, and calling ToString() on that sequence returns its type name.
As Steve Greene suggests, you can either take the first comment (v.First().Comment) or concatenate the comments (for example with string.Join) if that makes sense in your application.
Otherwise you may want your dictionary to be of the form Dictionary<int, List<string>>.
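
Both options can be sketched in memory as follows. The RatingRow class stands in for the anonymous type produced by the Select in the question, and the CommentAggregation class, its method names and the separator are assumptions made for this example; no database access is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the anonymous { AbstractID, Comment } rows in the question.
public class RatingRow
{
    public int AbstractID { get; set; }
    public string Comment { get; set; }
}

// Hypothetical helper class, named only for this example.
public static class CommentAggregation
{
    // Option 1: collapse each group's comments into one string.
    public static Dictionary<int, string> Joined(IEnumerable<RatingRow> rows, string separator)
    {
        return rows
            .GroupBy(r => r.AbstractID)
            .ToDictionary(g => g.Key, g => string.Join(separator, g.Select(r => r.Comment)));
    }

    // Option 2: keep every comment by storing a list per key.
    public static Dictionary<int, List<string>> Listed(IEnumerable<RatingRow> rows)
    {
        return rows
            .GroupBy(r => r.AbstractID)
            .ToDictionary(g => g.Key, g => g.Select(r => r.Comment).ToList());
    }

    public static void Main()
    {
        var rows = new List<RatingRow>
        {
            new RatingRow { AbstractID = 1, Comment = "Clear" },
            new RatingRow { AbstractID = 1, Comment = "Too long" },
            new RatingRow { AbstractID = 2, Comment = "Good" }
        };

        Console.WriteLine(Joined(rows, "; ")[1]);   // the two comments for abstract 1, joined
        Console.WriteLine(Listed(rows)[1].Count);   // number of comments for abstract 1
    }
}
```

With a database-backed query, the grouping and joining would typically need to happen after the rows are materialised, since string.Join is not generally translated to SQL by older LINQ providers.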

Grouping a list of list using linq

I have these tables
public class TaskDetails
{
public string EmployeeName {get; set;}
public decimal EmployeeHours {get; set;}
}
public class Tasks
{
public string TaskName {get; set;}
public List<TaskDetails> TaskList {get; set;}
}
I have a function that returns a List<Tasks>. I need to create a new list that groups by EmployeeName and sums the EmployeeHours irrespective of the TaskName. That is, I need the total hours of each employee. How can I get that?
P.S. As for what I have done so far: I have stared at the code for a long time and tried rubber duck problem solving to no avail. I can get the result using a foreach and a Dictionary<string, decimal>: if the key does not exist, add it and assign the value; if the key exists, add the decimal value to the existing one. But I feel that is too much here. I feel there is a ForEach - GroupBy - Sum combination that I am missing.
Any pointers on how to do it would be very helpful.
var results = tasks.SelectMany(x => x.TaskList)
.GroupBy(x => x.EmployeeName)
.ToDictionary(g => g.Key, g => g.Sum(x => x.EmployeeHours));
Gives you Dictionary<string, decimal>.
To get a list just replace ToDictionary with Select/ToList chain:
var results = tasks.SelectMany(x => x.TaskList)
.GroupBy(x => x.EmployeeName)
.Select(g => new {
EmployeeName = g.Key,
Sum = g.Sum(x => x.EmployeeHours)
}).ToList();
A SelectMany would help, I think.
It will "flatten" the lists of TaskDetails of all your Tasks elements into a single IEnumerable<TaskDetails>:
var result = listOfTasks.SelectMany(x => x.TaskList)
.GroupBy(m => m.EmployeeName)
.Select(m => new {
empName = m.Key,
hours = m.Sum(x => x.EmployeeHours)
});
var emplWithHours = allTasks
.SelectMany(t => t.TaskList)
.GroupBy(empl => empl.EmployeeName)
.Select(empl => new
{
EmployeeName = empl.Key,
TotalHours = empl.Sum(hour => hour.EmployeeHours)
}).ToDictionary(i => i.EmployeeName, i => i.TotalHours);
Also note that a member cannot have the same name as its enclosing type: if both the class and its list member are named Tasks, you get a compile-time error:
Error 1 'Tasks': member names cannot be the same as their enclosing type
I would have named your class Task since it represents a single task.
I would do it this way:
var query =
(
from t in tasks
from td in t.TaskList
group td.EmployeeHours by td.EmployeeName into ghs
select new
{
EmployeeName = ghs.Key,
EmployeeHours = ghs.Sum(),
}
).ToDictionary(x => x.EmployeeName, x => x.EmployeeHours);
A slightly more succinct query would be this:
var query =
(
from t in tasks
from td in t.TaskList
group td.EmployeeHours by td.EmployeeName
).ToDictionary(x => x.Key, x => x.Sum());
There are pros and cons to each. I think the first is more explicit, but the second a little neater.
