Build Dictionary with LINQ - c#

Let's say we have a variable 'data' which is a list of Id's and Child Id's:
var data = new List<Data>
{
new()
{
Id = 1,
ChildIds = new List<int> {123, 234, 345}
},
new()
{
Id = 1,
ChildIds = new List<int> {123, 234, 345}
},
new()
{
Id = 2,
ChildIds = new List<int> {678, 789}
},
};
I would like to have a dictionary with ChildId's and the related Id's. If the ChildId is already in the dictionary, it should overwrite with the new Id.
Currently I have this code:
var dict = new Dictionary<int, int>();
foreach (var dataItem in data)
{
foreach (var child in dataItem.ChildIds)
{
dict[child] = dataItem.Id;
}
}
This works fine, but I don't like the fact that I am using two loops. I prefer to use Linq ToDictionary to build up the dictionary in a Functional way.
What is the best way to build up the dictionary by using Linq?
Why? I prefer functional code over mutating a state. Besides that, I was just curious how to build up the dictionary by using Linq ;-)

In this case your foreach approach is both readable and efficient, so even though I'm a fan of LINQ, I would use it. The loop has the added bonus that you can debug it easily or add logging if necessary (for example, for invalid IDs).
However, if you want to use LINQ, I would probably use SelectMany and ToLookup. The former flattens child collections such as ChildIds, and the latter creates a collection that is very similar to your dictionary. One difference is that it allows duplicate keys; in that case you get multiple values per key:
ILookup<int, int> idLookup = data
.SelectMany(d => d.ChildIds.Select(c => (Id:d.Id, ChildId:c)))
.ToLookup(x => x.ChildId, x => x.Id);
Now you already have everything you need, since the lookup can be used like a dictionary with the same lookup performance. If you want to create the dictionary anyway, you can use:
Dictionary<int, int> dict = idLookup.ToDictionary(x => x.Key, x => x.First());
If you want later Ids to overwrite earlier ones, as mentioned in the question, simply use Last() instead of First().
.NET-Fiddle: https://dotnetfiddle.net/mUBZPi
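As a minimal sketch of the First()/Last() difference, the snippet below uses hypothetical sample data (not the data from the question) in which child 234 appears under Id 1 and again under Id 3. The class and method names (LookupDemo, Sample, BuildLookup, LastWins) are made up for illustration, and the sketch assumes the lookup keeps the values of each key in source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public int Id { get; set; }
    public List<int> ChildIds { get; set; } = new List<int>();
}

public static class LookupDemo
{
    // Hypothetical sample: child 234 appears under Id 1 and again under Id 3.
    public static List<Data> Sample() => new List<Data>
    {
        new Data { Id = 1, ChildIds = new List<int> { 123, 234 } },
        new Data { Id = 3, ChildIds = new List<int> { 234 } },
    };

    // Flatten to (Id, ChildId) pairs, then index the Ids by ChildId.
    public static ILookup<int, int> BuildLookup(IEnumerable<Data> data) =>
        data.SelectMany(d => d.ChildIds.Select(c => (Id: d.Id, ChildId: c)))
            .ToLookup(x => x.ChildId, x => x.Id);

    // Last() keeps the most recent Id per child, like dict[child] = id in the loop.
    public static Dictionary<int, int> LastWins(IEnumerable<Data> data) =>
        BuildLookup(data).ToDictionary(g => g.Key, g => g.Last());
}
```

With this sample, the lookup holds two Ids for child 234, and LastWins maps 234 to the later Id.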

The SelectMany LINQ operator has a few lesser-known overloads. One of them takes a result selector, which is a perfect fit for your scenario.
The following snippet turns the data into a dictionary. Note that I had to use Distinct, since two entries have Id 1 and share the same child Ids, and those duplicate pairs would otherwise make ToDictionary throw.
void Main()
{
// Get the data
var list = GetData();
// Turn it into a dictionary
var dict = list
.SelectMany(d => d.ChildIds, (data, childId) => new {data.Id, childId})
.Distinct()
.ToDictionary(x => x.childId, x => x.Id);
// show the content of the dictionary
dict.Keys
.ToList()
.ForEach(k => Console.WriteLine($"{k} {dict[k]}"));
}
public List<Data> GetData()
{
return
new List<Data>
{
new Data
{
Id = 1,
ChildIds = new List<int> {123, 234, 345}
},
new Data
{
Id = 1,
ChildIds = new List<int> {123, 234, 345}
},
new Data
{
Id = 2,
ChildIds = new List<int> {678, 789}
},
};
}
public class Data
{
public int Id { get; set; }
public List<int> ChildIds { get; set; }
}
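One caveat worth illustrating: Distinct only removes pairs that are identical in both Id and child Id. If the same child Id shows up under two different Ids, the pairs differ, both survive, and ToDictionary still throws. The sketch below uses made-up pairs and a hypothetical helper name (DistinctCaveat.ThrowsOnConflict) to show this.

```csharp
using System;
using System.Linq;

public static class DistinctCaveat
{
    // Returns true when ToDictionary rejects the conflicting child Id.
    public static bool ThrowsOnConflict()
    {
        var pairs = new[]
        {
            new { Id = 1, childId = 123 },
            new { Id = 1, childId = 123 }, // exact duplicate: removed by Distinct
            new { Id = 2, childId = 123 }, // same key, different Id: survives Distinct
        };

        try
        {
            pairs.Distinct().ToDictionary(x => x.childId, x => x.Id);
            return false;
        }
        catch (ArgumentException)
        {
            // Duplicate key 123 reached ToDictionary.
            return true;
        }
    }
}
```

So this approach matches the foreach loop only when a child Id never belongs to more than one Id.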

The approach is to create pairs of each combination of Id and ChildId, and build a dictionary of these:
var list = new List<(int Id, int[] ChildIds)>()
{
(1, new []{10, 11}),
(2, new []{11, 12})
};
var result = list
.SelectMany(pair => pair.ChildIds.Select(childId => (childId, pair.Id)))
.ToDictionary(p => p.childId, p => p.Id);
ToDictionary throws an ArgumentException if there are duplicate keys, which is the case here because child Id 11 appears under both Ids. To avoid this, you can look at this answer and create your own ToDictionary:
public static Dictionary<K, V> ToDictionaryOverWriting<TSource, K, V>(
this IEnumerable<TSource> source,
Func<TSource, K> keySelector,
Func<TSource, V> valueSelector)
{
Dictionary<K, V> output = new Dictionary<K, V>();
foreach (TSource item in source)
{
output[keySelector(item)] = valueSelector(item);
}
return output;
}
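Put together, the extension method and the sample list from this answer might be used as in the following sketch. The wrapper names (OverwriteExtensions, OverwriteDemo, Build) are only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverwriteExtensions
{
    // Same idea as above: the indexer overwrites instead of throwing on duplicates.
    public static Dictionary<K, V> ToDictionaryOverWriting<TSource, K, V>(
        this IEnumerable<TSource> source,
        Func<TSource, K> keySelector,
        Func<TSource, V> valueSelector)
    {
        var output = new Dictionary<K, V>();
        foreach (TSource item in source)
        {
            output[keySelector(item)] = valueSelector(item);
        }
        return output;
    }
}

public static class OverwriteDemo
{
    public static Dictionary<int, int> Build()
    {
        var list = new List<(int Id, int[] ChildIds)>
        {
            (1, new[] { 10, 11 }),
            (2, new[] { 11, 12 }),
        };

        // Child 11 occurs twice; the later pair (Id 2) overwrites the earlier one.
        return list
            .SelectMany(pair => pair.ChildIds.Select(childId => (childId, pair.Id)))
            .ToDictionaryOverWriting(p => p.childId, p => p.Id);
    }
}
```

Build() yields three entries: 10 maps to 1, while 11 and 12 both map to 2.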

With LINQ you can achieve the result like this:
Dictionary<int, int> dict = (from item in data
from childId in item.ChildIds
select new { item.Id, childId}
).Distinct()
.ToDictionary(kv => kv.childId, kv => kv.Id);
Update:
A version fully compatible with the foreach loop would use group by with Last() instead of Distinct():
Dictionary<int, int> dict2 = (from item in data
from childId in item.ChildIds
group new { item.Id, childId } by childId into g
select g.Last()
).ToDictionary(kv => kv.childId, kv => kv.Id);
As some have already pointed out, depending on the order of the input elements does not feel "functional". The LINQ expression also becomes more convoluted than the original foreach loop.

There is an overload of SelectMany which not only flattens the collection but also allows you to have any form of result.
var all = data.SelectMany(
data => data.ChildIds, //collectionSelector
(data, ChildId) => new { data.Id, ChildId } //resultSelector
);
Now, if you want to transform all into a Dictionary, you have to remove the duplicate ChildIds first, because the dictionary key (ChildId) must be unique. You can use GroupBy as below and then pick the last item from each group (since you stated in your question that you want to overwrite Ids as you go):
var dict = all.GroupBy(x => x.ChildId)
.Select(x => x.Last())
.ToDictionary(x => x.ChildId, x => x.Id);
Alternatively, you can write a new class that implements IEquatable<> and use it as the return type of resultSelector (instead of new { data.Id, ChildId }). Then write all.Reverse().Distinct().ToDictionary(x => x.ChildId); so that duplicates are detected by your own implementation of Equals (and GetHashCode). Reverse is needed because you said you want the last occurrence of the duplicates.
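To check that the GroupBy variant really behaves like the original nested loop, a small side-by-side sketch can help. The names below (GroupByLastDemo, WithLoop, WithLinq) and the tuple-based input shape are assumptions for illustration; GroupBy keeps the elements of each group in source order, which is what makes Last() equivalent to overwriting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByLastDemo
{
    // The original imperative version: later Ids overwrite earlier ones.
    public static Dictionary<int, int> WithLoop(IEnumerable<(int Id, int[] ChildIds)> data)
    {
        var dict = new Dictionary<int, int>();
        foreach (var item in data)
        {
            foreach (var child in item.ChildIds)
            {
                dict[child] = item.Id;
            }
        }
        return dict;
    }

    // SelectMany with a result selector, then one surviving pair per ChildId.
    public static Dictionary<int, int> WithLinq(IEnumerable<(int Id, int[] ChildIds)> data) =>
        data.SelectMany(d => d.ChildIds, (d, childId) => new { d.Id, ChildId = childId })
            .GroupBy(x => x.ChildId)
            .Select(g => g.Last())
            .ToDictionary(x => x.ChildId, x => x.Id);
}
```

Both methods should return the same mapping for any input, including inputs where a child Id appears under several Ids.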

Related

How to custom sort a list of strings given a desired sorting hierarchy?

I would like to sort a List<string> in a particular way. Below is a unit test showing the input, the specific way (which I am calling a "hierarchy" - feel free to correct my terminology so that I may learn), and the desired output. The code should be self explanatory.
[Test]
public void CustomSortByHierarchy()
{
List<string> input = new List<string>{"TJ", "DJ", "HR", "HR", "TJ"};
List<string> hierarchy = new List<string>{"HR", "TJ", "DJ" };
List<string> sorted = input.Sort(hierarchy); // <-- does not compile. How do I sort by the hierarchy?
// ...and if the sort worked as desired, these assert statements would return true:
Assert.AreEqual("HR", sorted[0]);
Assert.AreEqual("HR", sorted[1]);
Assert.AreEqual("TJ", sorted[2]);
Assert.AreEqual("TJ", sorted[3]);
Assert.AreEqual("DJ", sorted[4]);
}
Another way to do it:
var hierarchy = new Dictionary<string, int>{
{ "HR", 1},
{ "TJ", 2},
{ "DJ", 3} };
var sorted = input.OrderBy(s => hierarchy[s]).ToList();
There are so many ways to do this.
It's not great to create a static dictionary - especially when you have a static list of the values already in the order that you want (i.e. List<string> hierarchy = new List<string>{"HR", "TJ", "DJ" };). The problem with a static dictionary is that it is static - to change it you must recompile your program - and also it's prone to errors - you might mistype a number. It's best to dynamically create the dictionary. That way you can adjust your hierarchy at run-time and use it to order your input.
Here's the basic way to create the dictionary:
Dictionary<string, int> indices =
hierarchy
.Select((value, index) => new { value, index })
.ToDictionary(x => x.value, x => x.index);
Then it's an easy sort:
List<string> sorted = input.OrderBy(x => indices[x]).ToList();
However, if input contains a value that is missing from the hierarchy, this will throw a KeyNotFoundException.
Try with this input:
List<string> input = new List<string> { "TJ", "DJ", "HR", "HR", "TJ", "XX" };
You need to decide if you are removing missing items from the list or concatenating them at the end of the list.
To remove you'd do this:
List<string> sorted =
input
.Where(x => indices.ContainsKey(x))
.OrderBy(x => indices[x])
.ToList();
Or to sort to the end you'd do this:
List<string> sorted =
input
.OrderBy(x => indices.ContainsKey(x) ? indices[x] : int.MaxValue)
.ThenBy(x => x) // groups missing items together and is optional
.ToList();
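The "sort to the end" variant can also be written with TryGetValue, which does a single dictionary lookup per element instead of ContainsKey followed by the indexer. This is only a sketch; the class and method names (HierarchySort, SortMissingLast) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HierarchySort
{
    public static List<string> SortMissingLast(
        IEnumerable<string> input, IEnumerable<string> hierarchy)
    {
        // Map each hierarchy value to its position.
        Dictionary<string, int> indices = hierarchy
            .Select((value, index) => new { value, index })
            .ToDictionary(x => x.value, x => x.index);

        // Unknown values get int.MaxValue and therefore sort last.
        // OrderBy is stable, so unknown values keep their input order.
        return input
            .OrderBy(x => indices.TryGetValue(x, out var i) ? i : int.MaxValue)
            .ToList();
    }
}
```

For the input TJ, DJ, HR, HR, TJ, XX and the hierarchy HR, TJ, DJ this returns HR, HR, TJ, TJ, DJ, XX.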
If you simply want to remove items from input that aren't in hierarchy then there are a couple of other options that might be appealing.
Try this:
List<string> sorted =
(
from x in input
join y in hierarchy.Select((value, index) => new { value, index })
on x equals y.value
orderby y.index
select x
).ToList();
Or this:
ILookup<string, string> lookup = input.ToLookup(x => x);
List<string> sorted = hierarchy.SelectMany(x => lookup[x]).ToList();
Personally, I like this last one. It's a two liner and it doesn't rely on indices at all.
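As a self-contained sketch of that last approach (the wrapper names LookupSortDemo and SortByHierarchy are made up here): the lookup indexer returns an empty sequence for keys it does not contain, so hierarchy entries that never occur in input contribute nothing, and input values that are not in the hierarchy are dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupSortDemo
{
    public static List<string> SortByHierarchy(
        IEnumerable<string> input, IEnumerable<string> hierarchy)
    {
        // Group equal strings together, keyed by the string itself.
        ILookup<string, string> lookup = input.ToLookup(x => x);

        // Walk the hierarchy in order and emit every matching input item.
        return hierarchy.SelectMany(x => lookup[x]).ToList();
    }
}
```

One thing to keep in mind with this design: if the hierarchy itself contained the same value twice, the matching input items would be emitted twice.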

order objects by given values

Given:
class C
{
public string Field1;
public string Field2;
}
template = new [] { "str1", "str2", ... }.ToList() // presents allowed values for C.Field1 as well as order
list = new List<C> { ob1, ob2, ... }
Question:
How can I perform Linq's
list.OrderBy(x => x.Field1)
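which will use template above for order (so objects with Field1 == "str1" come first, then objects with "str2" and so on)?
In LINQ to Objects, use Array.IndexOf (this requires template to be an array; for a List<string>, template.IndexOf(x.Field1) is the equivalent):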
var ordered = list
.Select(x => new { Obj = x, Index = Array.IndexOf(template, x.Field1)})
.OrderBy(p => p.Index < 0 ? 1 : 0) // Items with missing text go to the end
.ThenBy(p => p.Index) // The actual ordering happens here
.Select(p => p.Obj); // Drop the index from the result
This wouldn't work in EF or LINQ to SQL, so you would need to bring objects into memory for sorting.
Note: The above assumes that the list is not exhaustive. If it is, a simpler query would be sufficient:
var ordered = list.OrderBy(x => Array.IndexOf(template, x.Field1));
I think IndexOf might work here:
list.OrderBy(_ => Array.IndexOf(template, _.Field1))
Please note that it will return -1 when object is not present at all, which means it will come first. You'll have to handle this case. If your field is guaranteed to be there, it's fine.
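The effect of the -1 return value can be seen in a short sketch. It assumes template is a string[] (hypothetical values "str1" and "str2"), and the names IndexOfDemo, NaiveOrder and MissingLast are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexOfDemo
{
    public static readonly string[] Template = { "str1", "str2" };

    // Values missing from Template get index -1 and therefore sort first.
    public static List<string> NaiveOrder(IEnumerable<string> values) =>
        values.OrderBy(v => Array.IndexOf(Template, v)).ToList();

    // Map -1 to int.MaxValue so that missing values sort last instead.
    public static List<string> MissingLast(IEnumerable<string> values) =>
        values.OrderBy(v =>
        {
            int i = Array.IndexOf(Template, v);
            return i < 0 ? int.MaxValue : i;
        }).ToList();
}
```

For the values "str2", "other", "str1", NaiveOrder puts "other" first, while MissingLast puts it at the end.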
As others have said, Array.IndexOf should do the job just fine. However, if template is long and/or list is long, it might be worthwhile transforming your template into a dictionary. Something like:
var templateDict = template.Select((item,idx) => new { item, idx })
.ToDictionary(k => k.item, v => v.idx);
(or you could just start by creating a dictionary instead of an array in the first place - it's more flexible when you need to reorder stuff)
This will give you a dictionary keyed off the string from template with the index in the original array as your value. Then you can sort like this:
var ordered = list.OrderBy(x => templateDict[x.Field1]);
Which, since lookups in a dictionary are O(1) will scale better as template and list grow.
Note: The above code assumes all values of Field1 are present in template. If they are not, you would have to handle the case where x.Field1 isn't in templateDict.
var orderedList = list.OrderBy(d => Array.IndexOf(template, d.MatchingColumnFromTemplate) < 0 ? int.MaxValue : Array.IndexOf(template, d.MatchingColumnFromTemplate)).ToList();
I've actually written a method to do this before. Here's the source:
public static IOrderedEnumerable<T> OrderToMatch<T, TKey>(this IEnumerable<T> source, Func<T, TKey> sortKeySelector, IEnumerable<TKey> ordering)
{
var orderLookup = ordering
.Select((x, i) => new { key = x, index = i })
.ToDictionary(k => k.key, v => v.index);
if (!orderLookup.Any())
{
throw new ArgumentException("Ordering collection cannot be empty.", nameof(ordering));
}
T[] sourceArray = source.ToArray();
return sourceArray
.OrderBy(x =>
{
int index;
if (orderLookup.TryGetValue(sortKeySelector(x), out index))
{
return index;
}
return Int32.MaxValue;
})
.ThenBy(x => Array.IndexOf(sourceArray, x));
}
You can use it like this:
var ordered = list.OrderToMatch(x => x.Field1, template);
If you want to see the source, the unit tests, or the library it lives in, you can find it on GitHub. It's also available as a NuGet package.

Select single item from each group in multiple groups

I have a list (specifically IEnumerable) of items of a specific class:
internal class MyItem
{
public MyItem(DateTime timestamp, string code)
{
Timestamp = timestamp;
Code = code;
}
public DateTime Timestamp { get; private set; }
public string Code { get; private set; }
}
Within this list, there will be multiple items with the same code. Each will have a timestamp, which may or may not be unique.
I'm attempting to retrieve a dictionary of MyItem's (Dictionary<string, MyItem>) where the key is the code associated with the item.
public Dictionary<string, MyItem> GetLatestCodes(IEnumerable<MyItem> items, DateTime latestAllowableTimestamp)
Given this signature, how would I retrieve the MyItem with a timestamp closest to, but not after latestAllowableTimestamp for each code?
For example, given the following for input:
IEnumerable<MyItem> items = new List<MyItem>{
new MyItem(DateTime.Parse("1/1/2014"), "1"),
new MyItem(DateTime.Parse("1/2/2014"), "2"),
new MyItem(DateTime.Parse("1/3/2014"), "1"),
new MyItem(DateTime.Parse("1/4/2014"), "1"),
new MyItem(DateTime.Parse("1/4/2014"), "2")};
If the latestAllowableTimestamp is 1/3/2014, the result would contain only the following items:
Timestamp | Code
----------------
1/3/2014 | 1
1/2/2014 | 2
I can manage to filter the list down to only those timestamps prior to latestAllowableTimestamp, but I don't know linq well enough to pick the most recent for each code and insert it into a dictionary.
var output = items.Where(t => (t.Timestamp <= latestAllowableTimestamp)).GroupBy(t => t.Code);
At this point, I've ended up with two groups, but don't know how to select a single item across each group.
Here is the actual method you are trying to write. It even returns a dictionary and everything:
static Dictionary<string, MyItem> GetLatestCodes(
IEnumerable<MyItem> items, DateTime latestAllowableTimestamp)
{
return items
.Where(item => item.Timestamp <= latestAllowableTimestamp)
.GroupBy(item => item.Code)
.Select(group => group
.OrderByDescending(item => item.Timestamp)
.First())
.ToDictionary(item => item.Code);
}
See Enumerable.ToDictionary
This is the part you should have posted in your question (as LB pointed out):
var list = new List<MyItem>()
{
new MyItem(){ code = "1" , timestamp = new DateTime(2014,1,1)},
new MyItem(){ code = "2" , timestamp = new DateTime(2014,1,2)},
new MyItem(){ code = "1" , timestamp = new DateTime(2014,1,3)},
new MyItem(){ code = "1" , timestamp = new DateTime(2014,1,4)},
new MyItem(){ code = "2" , timestamp = new DateTime(2014,1,4)}
};
DateTime latestAllowableTimestamp = new DateTime(2014, 1, 3);
This is my answer
var result = list.GroupBy(x => x.code)
.Select(x => x.OrderByDescending(y => y.timestamp)
.FirstOrDefault(z => z.timestamp <= latestAllowableTimestamp))
.ToList();
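One difference between filtering inside each group and filtering before grouping is worth spelling out: when a code has no item at or before the cutoff, FirstOrDefault returns null for that group, so the result list contains a null entry. The sketch below contrasts the two orders of operations; the sample data and the names CutoffDemo, FilterInsideGroups and FilterFirst are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyItem
{
    public MyItem(DateTime timestamp, string code)
    {
        Timestamp = timestamp;
        Code = code;
    }
    public DateTime Timestamp { get; }
    public string Code { get; }
}

public static class CutoffDemo
{
    public static List<MyItem> Sample() => new List<MyItem>
    {
        new MyItem(new DateTime(2014, 1, 1), "1"),
        new MyItem(new DateTime(2014, 1, 4), "2"), // only entry for code "2"
    };

    // Group first, filter inside each group: a code without a match yields null.
    public static List<MyItem> FilterInsideGroups(IEnumerable<MyItem> items, DateTime cutoff) =>
        items.GroupBy(x => x.Code)
             .Select(g => g.OrderByDescending(y => y.Timestamp)
                           .FirstOrDefault(z => z.Timestamp <= cutoff))
             .ToList();

    // Filter first: codes without a match never form a group at all.
    public static List<MyItem> FilterFirst(IEnumerable<MyItem> items, DateTime cutoff) =>
        items.Where(x => x.Timestamp <= cutoff)
             .GroupBy(x => x.Code)
             .Select(g => g.OrderByDescending(y => y.Timestamp).First())
             .ToList();
}
```

With a cutoff of 3 January 2014, the first method returns two entries (one of them null for code "2"), while the second returns only the item for code "1". Filtering first is therefore the safer input for ToDictionary.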
To create your Dictionary, you could construct your query like so:
var newDict = items.Where(a => a.Timestamp <= latestAllowableTimestamp)
.GroupBy(b => b.Timestamp)
.ToDictionary(c => c.First().Timestamp, c => c.First());
This should create a Dictionary from your data, with no duplicate days. Note that without the GroupBy query, you'll raise an exception, because ToDictionary doesn't filter out keys it's already seen.
And then if you wanted to get only one MyItem for any given code number, you could use this query:
newDict.Select(a => a.Value)
.OrderByDescending(b => b.Timestamp)
.GroupBy(c => c.Code)
.Select(d => d.First());
The First call returns only one element from each group. Because the items are ordered by descending timestamp before grouping, this gives you the MyItem closest to the latest date for any given code.

Using LINQ to build a Dictionary from a List of delimited strings

I have a list of strings that look like this:
abc|key1|486997
def|key1|488979
ghi|key2|998788
gkl|key2|998778
olz|key1|045669
How can I use LINQ and ToDictionary to produce a Dictionary<string, List<string>> that looks like
key1 : { abc|key1|486997, def|key1|488979, olz|key1|045669 }
key2 : { ghi|key2|998788, gkl|key2|998778 }
Basically, I want to extract the second element as the key and use ToDictionary() to create the dictionary in one go-round.
I'm currently doing this ..
var d = new Dictionary<string, List<string>>();
foreach(var l in values)
{
var b = l.Split('|');
var k = b.ElementAtOrDefault(1);
if (!d.ContainsKey(k))
d.Add(k, new List<string>());
d[k].Add(l);
}
I've seen the questions on building dictionaries from a single string of delimited values, but I'm wondering if there's an elegant way to do this when starting with a list of delimited strings instead.
var list = new []
{
"abc|key1|486997",
"def|key1|488979",
"ghi|key2|998788",
"gkl|key2|998778",
"olz|key1|045669"
};
var dict = list.GroupBy(x => x.Split('|')[1])
.ToDictionary(x => x.Key, x => x.ToList());
You can also transform it to a lookup (which is very similar to a Dictionary<K,IEnumerable<V>>) in one shot:
var lookup = list.ToLookup(x => x.Split('|')[1]);
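A practical difference between the two shapes is how they treat a key that is not present: the lookup indexer returns an empty sequence, whereas the dictionary indexer throws a KeyNotFoundException (so TryGetValue or ContainsKey is needed). A minimal sketch, with hypothetical wrapper names (LookupVsDictionary, AsLookup, AsDictionary):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupVsDictionary
{
    public static readonly string[] Lines =
    {
        "abc|key1|486997",
        "def|key1|488979",
        "ghi|key2|998788",
        "gkl|key2|998778",
        "olz|key1|045669",
    };

    // Keyed by the second '|'-separated field; values are the full lines.
    public static ILookup<string, string> AsLookup() =>
        Lines.ToLookup(x => x.Split('|')[1]);

    public static Dictionary<string, List<string>> AsDictionary() =>
        Lines.GroupBy(x => x.Split('|')[1])
             .ToDictionary(g => g.Key, g => g.ToList());
}
```

Here AsLookup()["key9"] is simply empty, while AsDictionary()["key9"] would throw.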
var data = new[]
{
"abc|key1|486997",
"def|key1|488979",
"ghi|key2|998788",
"gkl|key2|998778",
"olz|key1|045669"
};
var dictionary = data.Select(row => row.Split('|'))
.GroupBy(row => row[1])
.ToDictionary(group => group.Key, group => group);
If your data is guaranteed to be consistent like that, you could do something like this:
var data = new[]
{
"abc|key1|486997",
"def|key1|488979",
"ghi|key2|998788",
"gkl|key2|998778",
"olz|key1|045669"
};
var items = data
.GroupBy(k => k.Split('|')[1])
.ToDictionary(k => k.Key, v => v.ToList());
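If the data is not guaranteed to be consistent, Split('|')[1] throws an IndexOutOfRangeException for a line without a delimiter. One possible way to guard against that, sketched here under the assumption that malformed lines should simply be skipped (the names SafeGrouping and GroupBySecondField are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SafeGrouping
{
    public static Dictionary<string, List<string>> GroupBySecondField(
        IEnumerable<string> lines) =>
        lines
            // Split once and keep the original line next to its parts.
            .Select(line => (line, parts: line.Split('|')))
            // Skip lines that have no second field.
            .Where(x => x.parts.Length > 1)
            .GroupBy(x => x.parts[1], x => x.line)
            .ToDictionary(g => g.Key, g => g.ToList());
}
```

Splitting once per line also avoids calling Split again wherever the key is needed.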

Convert List<MyObject> to Dictionary <obj.string, List<obj.ID>>

I would like to take a list of objects and convert it to a dictionary where the key is a field in the object, and the value is a list of a different field in the objects that match on the key. I can do this now with a loop but I feel this should be able to be accomplished with linq and not having to write the loop. I was thinking a combination of GroupBy and ToDictionary but have been unsuccessful so far.
Here's how I'm doing it right now:
var samplesWithSpecificResult = new Dictionary<string, List<int>>();
foreach(var sample in sampleList)
{
List<int> sampleIDs = null;
if (samplesWithSpecificResult.TryGetValue(sample.ResultString, out sampleIDs))
{
sampleIDs.Add(sample.ID);
continue;
}
sampleIDs = new List<int>();
sampleIDs.Add(sample.ID);
samplesWithSpecificResult.Add(sample.ResultString, sampleIDs);
}
The farthest I can get with .GroupBy().ToDictionary() is Dictionary<sample.ResultString, List<sample>>.
Any help would be appreciated.
Try the following
var dictionary = sampleList
.GroupBy(x => x.ResultString, x => x.ID)
.ToDictionary(x => x.Key, x => x.ToList());
The GroupBy clause will group every Sample instance in the list by its ResultString member, but it will keep only the Id part of each sample. This means every element will be an IGrouping<string, int>.
The ToDictionary portion uses the Key of the IGrouping<string, int> as the dictionary Key. IGrouping<string, int> implements IEnumerable<int> and hence we can convert that collection of samples' Id to a List<int> with a call to ToList, which becomes the Value of the dictionary for that given Key.
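For a runnable illustration of the element-selector overload, the sketch below uses a hypothetical Sample class with just the two members mentioned in the question (ID and ResultString); the class shape and the names GroupingDemo and IdsByResult are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Sample
{
    public int ID { get; set; }
    public string ResultString { get; set; } = "";
}

public static class GroupingDemo
{
    public static Dictionary<string, List<int>> IdsByResult(IEnumerable<Sample> samples) =>
        samples
            // Key selector picks ResultString; element selector keeps only the ID.
            .GroupBy(x => x.ResultString, x => x.ID)
            // Each group is an IGrouping<string, int>, so ToList() gives List<int>.
            .ToDictionary(g => g.Key, g => g.ToList());
}
```

For samples with IDs 1 and 3 sharing the result "pass" and ID 2 having "fail", the dictionary maps "pass" to the list 1, 3 and "fail" to the list 2.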
Yeah, super simple. The key is that when you do a GroupBy on IEnumerable<T>, each "group" is an object that implements IEnumerable<T> as well (that's why I can say g.Select below, and I'm projecting the elements of the original sequence with a common key):
var dictionary =
sampleList.GroupBy(x => x.ResultString)
.ToDictionary(
g => g.Key,
g => g.Select(x => x.ID).ToList()
);
See, the result of sampleList.GroupBy(x => x.ResultString) is an IEnumerable<IGrouping<string, Sample>> and IGrouping<T, U> implements IEnumerable<U> so that every group is a sequence of Sample with the common key!
Dictionary<string, List<int>> resultDictionary =
(
from sample in sampleList
group sample.ID by sample.ResultString
).ToDictionary(g => g.Key, g => g.ToList());
You might want to consider using a Lookup instead of the Dictionary of Lists
ILookup<string, int> idLookup = sampleList.ToLookup(
sample => sample.ResultString,
sample => sample.ID
);
used thusly
foreach(IGrouping<string, int> group in idLookup)
{
string resultString = group.Key;
List<int> ids = group.ToList();
//do something with them.
}
//and
List<int> ids = idLookup[resultString].ToList();
var samplesWithSpecificResult =
sampleList.GroupBy(s => s.ResultString)
.ToDictionary(g => g.Key, g => g.Select(s => s.ID).ToList());
What we're doing here is grouping the samples based on their ResultString, which puts them into an IGrouping<string, Sample>. Then we project the collection of IGroupings to a dictionary, using the Key of each as the dictionary key and enumerating over each grouping (IGrouping<string, Sample> is also an IEnumerable<Sample>) to select the ID of each sample to make a list for the dictionary value.
