Linq aggregate dictionary into new dictionary - c#

class Key { string s; int i; }
Given a Dictionary<Key,int>, I want a new Dictionary<string,int> that maps each distinct Key.s to the minimum dictionary value over all keys sharing that s.
I feel like this should be easy but I just can't get it.
Thanks
clarification:
var dict = new Dictionary<Key,int>();
dict.Add(new Key("a", 123), 19);
dict.Add(new Key("a", 456), 12);
dict.Add(new Key("a", 789), 13);
dict.Add(new Key("b", 998), 99);
dict.Add(new Key("b", 999), 11);
and I want to produce the dictionary:
"a" -> 12
"b" -> 11
hope that helps.

I'm not clear on exactly what you're trying to do, but you can do a mapping from one dictionary to another with .Select(... and/or .ToDictionary(...
For example:
Dictionary<Key, int> original = ...
Dictionary<string, int> mapped = original.ToDictionary((kvp) => kvp.Key.s, (kvp) => kvp.Key.i);
If you improve your question to be more clear, I'll improve my answer.
EDIT: (question was clarified)
var d = dict.GroupBy(kvp => kvp.Key.s).ToDictionary(g => g.Key, g => g.Min(k => k.Value));
You want to group by the key's s property, then take the minimum of the dictionary values in each group as the new dictionary value.
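For completeness, here is a minimal self-contained sketch of the grouping approach. The Key class in the question has no constructor or public members, so the version below (public read-only fields plus a constructor) and the MinPerS helper name are assumptions made for illustration only.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the question's Key class: public fields and a constructor.
public class Key
{
    public readonly string s;
    public readonly int i;
    public Key(string s, int i) { this.s = s; this.i = i; }
}

public static class MinByKeyName
{
    // Groups the entries by Key.s and keeps the smallest value in each group.
    public static Dictionary<string, int> MinPerS(Dictionary<Key, int> dict)
    {
        return dict
            .GroupBy(kvp => kvp.Key.s)
            .ToDictionary(g => g.Key, g => g.Min(kvp => kvp.Value));
    }
}
```

With the sample data from the question, MinByKeyName.MinPerS(dict) maps "a" to 12 and "b" to 11.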

A more generic method that skips the Lookup created by .GroupBy:
public static Dictionary<K, V> aggregateBy<T, K, V>(
    this IEnumerable<T> source,
    Func<T, K> keySelector,
    Func<T, V> valueSelector,
    Func<V, V, V> aggregate,
    int capacity = 0,
    IEqualityComparer<K> comparer = null)
{
    var dict = new Dictionary<K, V>(capacity, comparer);
    foreach (var t in source)
    {
        K key = keySelector(t);
        V accumulator, value = valueSelector(t);
        if (dict.TryGetValue(key, out accumulator))
            value = aggregate(accumulator, value);
        dict[key] = value;
    }
    return dict;
}
Sample use:
var dict = new Dictionary<Tuple<string,int>, int>();
dict.Add(Tuple.Create("a", 123), 19);
dict.Add(Tuple.Create("a", 456), 12);
dict.Add(Tuple.Create("a", 789), 13);
dict.Add(Tuple.Create("b", 998), 99);
dict.Add(Tuple.Create("b", 999), 11);
var d = dict.aggregateBy(p => p.Key.Item1, p => p.Value, Math.Min);
Debug.Print(string.Join(", ", d)); // "[a, 12], [b, 11]"

Related

Remove all switched dictionary pairs

I have a Dictionary and want to LINQ-remove all pairs (B, A) if there is a pair (A, B).
Dictionary<int, int> dictionary = new Dictionary<int, int>();
dictionary.Add(1, 2);
dictionary.Add(3, 4); // keep it
dictionary.Add(4, 3); // remove it
//dictionary.Add(4, 3); // remove it (ignore this impossible line, @Rahul Singh is right)
You need to implement a custom equality comparer and use the Distinct method.
Dictionary<int, int> dictionary = new Dictionary<int, int>();
dictionary.Add(1, 2);
dictionary.Add(3, 4);
dictionary.Add(4, 3);
var result = dictionary.Distinct(new KeyValuePairEqualityComparer()).ToDictionary(x => x.Key, x => x.Value);
The equality comparer is defined as
private class KeyValuePairEqualityComparer : IEqualityComparer<KeyValuePair<int, int>>
{
    public bool Equals(KeyValuePair<int, int> x, KeyValuePair<int, int> y)
    {
        // Two pairs are treated as equal when one is the mirror of the other.
        return x.Key == y.Value && x.Value == y.Key;
    }
    public int GetHashCode(KeyValuePair<int, int> obj)
    {
        // Hash codes are compared before Equals is called. The product is
        // symmetric in key and value, so a pair and its mirror get the same
        // hash code and are then compared via the Equals method.
        return obj.Key * obj.Value;
    }
}
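The naive approach would be to simply filter them as you need. A pair is dropped only when its mirror exists and the pair is the one with the larger key; without that tie-breaker both (3, 4) and (4, 3) would be removed.
dictionary = dictionary
    .Where(kvp => kvp.Key <= kvp.Value || !(dictionary.ContainsKey(kvp.Value) && dictionary[kvp.Value] == kvp.Key))
    .ToDictionary(kvp => kvp.Key, kvp => kvp.Value);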
Suppose your pair is (1, 2). To remove it from the dictionary you need not bother with the value, since keys are unique, so you can delete it with: dictionary.Remove(pair.Key);
Remove does not throw when the key is missing; it simply returns false (KeyNotFoundException comes from the indexer getter, not from Remove). A check is still useful if you want to remove the entry only when its stored value matches the pair:
int value;
if (dictionary.TryGetValue(pair.Key, out value) && value == pair.Value)
{
    dictionary.Remove(pair.Key);
}
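If you would rather avoid both a custom comparer and repeated lookups into the dictionary being filtered, a plain loop is another option. The sketch below is only an illustration: the MirrorPairs class and RemoveMirrors method are made-up names. It keeps whichever of (A, B) and (B, A) is enumerated first. Dictionary enumeration order is formally unspecified, so do not rely on that being the first pair added.

```csharp
using System.Collections.Generic;

public static class MirrorPairs
{
    // Returns a copy of source without mirrored duplicates: when both (A, B)
    // and (B, A) are present, only the one enumerated first is kept.
    public static Dictionary<int, int> RemoveMirrors(Dictionary<int, int> source)
    {
        var result = new Dictionary<int, int>();
        foreach (KeyValuePair<int, int> kvp in source)
        {
            int mirroredValue;
            bool mirrorAlreadyKept =
                result.TryGetValue(kvp.Value, out mirroredValue) && mirroredValue == kvp.Key;
            if (!mirrorAlreadyKept)
            {
                result.Add(kvp.Key, kvp.Value);
            }
        }
        return result;
    }
}
```

For the question's data this keeps (1, 2) and exactly one of (3, 4) and (4, 3).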

Dictionary<> value count c#

I have a dictionary object like this:
var dictionary = new Dictionary<string, List<int>>();
The number of keys is not very large, but the list of integers in each value can be quite large (on the order of thousands).
Given a list of keys (keylist), I need to count the number of times each integer appears in the lists for those keys and return the integers ordered by frequency.
Output:
{int1, count1}
{int2, count2}
...
This is the solution I have come up with:
var query = _keylist
    .SelectMany(n => _dictionary[n])
    .GroupBy(g => g)
    .Select(g => new[] { g.Key, g.Count() })
    .OrderByDescending(g => g[1]);
Although this query produces the desired result, it's not very efficient.
Is there a clever way to produce the same result with less processing?
I would do it this way:
var query =
    from k in _keylist
    from v in dictionary[k]
    group v by v into gvs
    let result = new
    {
        key = gvs.Key,
        count = gvs.Count(),
    }
    orderby result.count descending
    select result;
To me this is quite straightforward and simple, and well worth accepting any (minor) performance hit from using LINQ.
An alternative approach that doesn't create the large list of groups would be to do this:
var query =
    _keylist
        .SelectMany(k => dictionary[k])
        .Aggregate(
            new Dictionary<int, int>(),
            (d, v) =>
            {
                if (d.ContainsKey(v))
                {
                    d[v] += 1;
                }
                else
                {
                    d[v] = 1;
                }
                return d;
            })
        .OrderByDescending(kvp => kvp.Value)
        .Select(kvp => new
        {
            key = kvp.Key,
            count = kvp.Value,
        });
From an algorithmic space- and time-usage point of view, the only thing I see that is suboptimal is the use of GroupBy when you don't actually need the groups (only the group counts). You can use the following extension method instead.
public static Dictionary<K, int> CountBy<T, K>(
    this IEnumerable<T> source,
    Func<T, K> keySelector)
{
    return source.SumBy(keySelector, item => 1);
}
public static Dictionary<K, int> SumBy<T, K>(
    this IEnumerable<T> source,
    Func<T, K> keySelector,
    Func<T, int> valueSelector)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (keySelector == null)
    {
        throw new ArgumentNullException("keySelector");
    }
    var dictionary = new Dictionary<K, int>();
    foreach (var item in source)
    {
        var key = keySelector(item);
        int count;
        if (!dictionary.TryGetValue(key, out count))
        {
            count = 0;
        }
        dictionary[key] = count + valueSelector(item);
    }
    return dictionary;
}
Note the advantage is that the lists of numbers are enumerated but not stored. Only the counts are stored. Note also that the keySelector parameter is not even necessary in your case and I only included it to make the extension method slightly more general.
The usage is then as follows. SelectMany flattens the per-key lists into a single sequence of integers before counting.
var query = _keylist
    .SelectMany(k => _dictionary[k])
    .CountBy(n => n)
    .OrderByDescending(p => p.Value);
This will get you a sequence of KeyValuePair<int, int> where the Key is the number from your original lists and the Value is the count.
To more efficiently handle a sequence of queries, you can preprocess your data.
Dictionary<string, Dictionary<int, int>> preprocessedDictionary
    = _dictionary.ToDictionary(p => p.Key, p => p.Value.CountBy(n => n));
Now you can perform a query more efficiently.
var query = _keylist
    .SelectMany(k => preprocessedDictionary[k])
    .SumBy(p => p.Key, p => p.Value)
    .OrderByDescending(p => p.Value);
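To make the counting idea concrete, here is a small self-contained sketch with sample data. FrequencyDemo, CountValues and ByFrequency are hypothetical names, and the ThenBy tie-breaker is an addition so that the ordering is deterministic; treat it as an illustration of the one-pass, counts-only approach rather than a drop-in replacement for the extension methods above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FrequencyDemo
{
    // One pass over the selected lists; only one counter per distinct value is stored.
    public static Dictionary<int, int> CountValues(
        IDictionary<string, List<int>> dictionary,
        IEnumerable<string> keylist)
    {
        var counts = new Dictionary<int, int>();
        foreach (string key in keylist)
        {
            foreach (int value in dictionary[key])
            {
                int count;
                counts.TryGetValue(value, out count); // count is 0 when value is unseen
                counts[value] = count + 1;
            }
        }
        return counts;
    }

    // Orders the counts by descending frequency, then by value to break ties.
    public static List<KeyValuePair<int, int>> ByFrequency(
        IDictionary<string, List<int>> dictionary,
        IEnumerable<string> keylist)
    {
        return CountValues(dictionary, keylist)
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key)
            .ToList();
    }
}
```

For the lists x = [1, 2, 2, 3] and y = [2, 3, 3, 3] with keylist [x, y], the result is {3, 4}, {2, 3}, {1, 1}.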

Join two dictionaries by value diffs

I have these two dictionaries:
Dictionary<char, double> analyzed_symbols = new Dictionary<char, double>();
Dictionary<char, double> decode_symbols = new Dictionary<char, double>();
I need to create another dictionary that pairs their keys, with a key of the first dictionary as the key and a key of the second as the value, like this:
Dictionary<char, char> replace_symbols = new Dictionary<char, char>();
The condition to "join" them is that difference between values should be minimal, like this:
Math.Min(Math.Abs(analyzed_symbols[key] - decode_symbols[key]))
I guess I should use LINQ for this purpose but can't figure out how to write query properly.
Data Sample:
analyzed_symbols = [a, 173], [b, 1522], [z, 99]
decode_symbols = [в, 100], [д, 185], [e, 1622]
For these dicts output data should look like this:
replace_symbols = [z, в], [b, е], [a, д]
I've found a question that is pretty close to what I need, but not exactly. Snowy asks there about a single closest value, but I need to do the same thing for two dictionaries.
This is my take on it:
var analyzed_symbols = new Dictionary<char, double>(){ {'a', 173}, {'b', 1522}, {'z', 99} };
var decode_symbols = new Dictionary<char, double>(){ {'в', 100}, {'д', 185}, {'e', 1622} };
var q = from a in analyzed_symbols
        from d in decode_symbols
        let tmp = new { A = a.Key, D = d.Key, Value = Math.Abs(a.Value - d.Value) }
        group tmp by tmp.A into g
        select new
        {
            Key = g.Key,
            Value = g.OrderBy(x => x.Value).Select(x => x.D).First()
        };
var replace_symbols = q.ToDictionary(x => x.Key, x => x.Value);
Okay, I'll try. I divided it into several queries, because it's more readable that way.
// sort the values of both dictionaries so the closest ones are easy to find
var analyzedSortedValues = analyzed_symbols.Values.OrderBy(k => k);
var decodeSortedValues = decode_symbols.Values.OrderBy(k => k);
// create pairs of the closest values; the iterator index i is used to skip
// values that have been used already (is it correct?)
var t = analyzedSortedValues.Select((k, i) => new
{
    a = k,
    d = decodeSortedValues.Skip(i).Any() ? decodeSortedValues.Skip(i).First() : -1
});
// print the results by getting the appropriate keys from the corresponding dictionaries
foreach (var item in t)
{
    Console.WriteLine("[{0}, {1}]",
        analyzed_symbols.FirstOrDefault(kvp => kvp.Value == item.a).Key,
        decode_symbols.FirstOrDefault(kvp => kvp.Value == item.d).Key);
}
I am not exactly sure how to do it via LINQ, but here is the longhand version of what you want to do.
private static Dictionary<char, char> BuildReplacementDictionary(
    Dictionary<char, double> analyzedSymbols,
    Dictionary<char, double> decodeSymbols)
{
    Dictionary<char, char> replaceSymbols = new Dictionary<char, char>(analyzedSymbols.Count);
    foreach (KeyValuePair<char, double> analyzedKvp in analyzedSymbols)
    {
        double bestMatchValue = double.MaxValue;
        foreach (KeyValuePair<char, double> decodeKvp in decodeSymbols)
        {
            var testValue = Math.Abs(analyzedKvp.Value - decodeKvp.Value);
            if (testValue <= bestMatchValue)
            {
                bestMatchValue = testValue;
                replaceSymbols[analyzedKvp.Key] = decodeKvp.Key;
            }
        }
    }
    return replaceSymbols;
}
It goes through each element of the analyzed dictionary, tests every element of the decode dictionary, and whenever a match is as good as or better than the best one found so far, it uses that key from the decode dictionary.

"Grouping" dictionary by value

I have a dictionary: Dictionary<int,int>. I want to get a new dictionary in which the keys of the original dictionary are grouped into a List<int> for each distinct value. This is what I mean:
var prices = new Dictionary<int,int>();
The prices contain the following data:
1 100
2 200
3 100
4 300
I want to get the IList<Dictionary<int,List<int>>>:
int List<int>
100 1,3
200 2
300 4
How can I do this?
var prices = new Dictionary<int, int>();
prices.Add(1, 100);
prices.Add(2, 200);
prices.Add(3, 100);
prices.Add(4, 300);
Dictionary<int,List<int>> test =
    prices.GroupBy(r=> r.Value)
        .ToDictionary(t=> t.Key, t=> t.Select(r=> r.Key).ToList());
You can use GroupBy.
Dictionary<int,List<int>> groups =
    prices.GroupBy(x => x.Value)
        .ToDictionary(x => x.Key, x => x.Select(i => i.Key).ToList());
Here is my reply. When the dictionaries get large, you will likely find the GroupBy() extension methods less efficient than you would like, as they provide many guarantees that you don't need, such as retaining order.
public static class DictionaryExtensions
{
    public static IDictionary<TValue,List<TKey>> Reverse<TKey,TValue>(this IDictionary<TKey,TValue> src)
    {
        var result = new Dictionary<TValue,List<TKey>>();
        foreach (var pair in src)
        {
            List<TKey> keyList;
            if (!result.TryGetValue(pair.Value, out keyList))
            {
                keyList = new List<TKey>();
                result[pair.Value] = keyList;
            }
            keyList.Add(pair.Key);
        }
        return result;
    }
}
And an example to use in LinqPad:
void Main()
{
    var prices = new Dictionary<int, int>();
    prices.Add(1, 100);
    prices.Add(2, 200);
    prices.Add(3, 100);
    prices.Add(4, 300);
    // Dump method is provided by LinqPad.
    prices.Reverse().Dump();
}
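You can use GroupBy followed by the Func<TSource, TKey>, Func<TSource, TElement> overload of Enumerable.ToDictionary. Note that each list here holds the original KeyValuePair<int, int> entries; add a Select(p => p.Key) before ToList() if you want only the keys:
var d = prices.GroupBy(x => x.Value).ToDictionary(x => x.Key, x => x.ToList());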
You can use Lookup instead.
var prices = new Dictionary<int, int> { {1, 100}, { 2, 200 }, { 3, 100 }, { 4, 300 } };
ILookup<int, int> groups = prices.ToLookup(x => x.Value, y => y.Key);
foreach (var group in groups)
{
    foreach (var item in group)
    {
        Console.WriteLine(item);
    }
}
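One thing that makes a lookup convenient here is its indexer: asking for a key that is not present returns an empty sequence instead of throwing. A minimal sketch follows; the LookupDemo class and ByPrice method are names invented for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Builds a lookup from price (the dictionary value) to the ids (the dictionary keys).
    public static ILookup<int, int> ByPrice(Dictionary<int, int> prices)
    {
        return prices.ToLookup(p => p.Value, p => p.Key);
    }
}
```

With the prices from the question, ByPrice(prices)[100] contains 1 and 3, while ByPrice(prices)[999] is simply empty.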
In the particular case of .NET Framework 2.0, where LINQ is not available, we can do as follows:
var prices = new Dictionary<int, int>();
prices.Add(1, 100);
prices.Add(2, 200);
prices.Add(3, 100);
prices.Add(4, 300);
Dictionary<int, List<int>> grouping = new Dictionary<int, List<int>>();
var enumerator = prices.GetEnumerator();
while (enumerator.MoveNext())
{
    var pair = enumerator.Current;
    if (!grouping.ContainsKey(pair.Value))
        grouping[pair.Value] = new List<int>();
    grouping[pair.Value].Add(pair.Key);
}

zipping/merging two sorted lists

I have two sorted dictionaries, both with the same type signature, i.e.
SortedDictionary<decimal, long> A
SortedDictionary<decimal, long> B
I want to merge the two lists where the key is the same, thus creating a new list like
SortedDictionary<decimal, KeyValuePair<long,long>>
or
SortedDictionary<decimal, List<long>>
This may not be the best way of approaching the situation, but could someone give me a heads-up on how to do this, or suggest a better way to approach it?
This is what I've got:
SortedDictionary<decimal, List<long>> merged = new SortedDictionary<decimal, List<long>>
(
    A.Union(B)
        .ToLookup(x => x.Key, x => x.Value)
        .ToDictionary(x => x.Key, x => new List<long>(x))
);
EDIT: The above solution also includes keys that are not present in both collections. This should select only the keys that are the same:
SortedDictionary<decimal, List<long>> merged = new SortedDictionary<decimal, List<long>>
(
    A.Where(x => B.ContainsKey(x.Key))
        .ToDictionary(x => x.Key, x => new List<long>() { x.Value, B[x.Key] })
);
You can do this simply using LINQ:
var query = from a in A
            join b in B
            on a.Key equals b.Key
            select new {
                Key = a.Key,
                Value = Tuple.Create(a.Value, b.Value)
            };
var merged = new SortedDictionary<decimal, Tuple<long, long>>(
    query.ToDictionary(x => x.Key, x => x.Value)
);
I think you should use Tuple<long, long> as your TValue in the merged dictionary.
Another LINQ way of doing this that I think captures the intent better in terms of set operations:
SortedDictionary<decimal, long> a = new SortedDictionary<decimal, long>();
SortedDictionary<decimal, long> b = new SortedDictionary<decimal, long>();
a.Add(0, 10);
a.Add(1, 10);
a.Add(2, 100);
a.Add(100, 1);
b.Add(0, 4);
b.Add(4, 4);
b.Add(2, 10);
var result = a.Union(b)
    .GroupBy(x => x.Key)
    .ToDictionary(x => x.Key, x => x.Select(y => (long)y.Value).ToList());
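One caveat with Union is worth illustrating: it is a set operation, so when both dictionaries contain the same key with the same value, the two identical KeyValuePair entries collapse into one and the resulting list has a single element. Concat keeps both. The sketch below contrasts the two; MergeByKey, WithUnion and WithConcat are names made up for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MergeByKey
{
    // Set union: identical (key, value) pairs from a and b are de-duplicated.
    public static Dictionary<decimal, List<long>> WithUnion(
        IEnumerable<KeyValuePair<decimal, long>> a,
        IEnumerable<KeyValuePair<decimal, long>> b)
    {
        return a.Union(b)
            .GroupBy(p => p.Key)
            .ToDictionary(g => g.Key, g => g.Select(p => p.Value).ToList());
    }

    // Concatenation: every entry from a and b is kept.
    public static Dictionary<decimal, List<long>> WithConcat(
        IEnumerable<KeyValuePair<decimal, long>> a,
        IEnumerable<KeyValuePair<decimal, long>> b)
    {
        return a.Concat(b)
            .GroupBy(p => p.Key)
            .ToDictionary(g => g.Key, g => g.Select(p => p.Value).ToList());
    }
}
```

If a and b both contain the entry (1, 5), WithUnion gives a one-element list for key 1, whereas WithConcat gives two elements. Which behaviour is wanted depends on the use case.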
Try something like this; it is not trivial:
Dictionary<decimal, long> dic1 = new Dictionary<decimal, long>{ {3,23}, {2,3}, {5,4}, {6,8}};
Dictionary<decimal, long> dic2 = new Dictionary<decimal, long>{ {3,2}, {2,5}, {5,14}, {12,2}};
// recover shared keys (the keys that are present in both dictionaries)
var sharedKeys = dic1.Select(dic => dic.Key).Intersect(dic2.Select(d2 => d2.Key));
sharedKeys.Dump();
// add to the final dictionary
var final = new Dictionary<decimal, List<long>>();
foreach (var shk in sharedKeys) {
    if (!final.ContainsKey(shk))
        final[shk] = new List<long>();
    final[shk].Add(dic1[shk]);
    final[shk].Add(dic2[shk]);
}
**EDIT**
// Skip the part below if you need only the keys present in both dictionaries.
//-----------------------------------------------------------------
// get the keys present only in dic1 and add them
var nonsharedkeys1 = dic1.Select(d => d.Key).Where(k => !sharedKeys.Contains(k));
foreach (var nshk in nonsharedkeys1) {
    final[nshk] = new List<long>();
    final[nshk].Add(dic1[nshk]);
}
// get the keys present only in dic2 and add them
var nonsharedkeys2 = dic2.Select(d => d.Key).Where(k => !sharedKeys.Contains(k));
foreach (var nshk in nonsharedkeys2) {
    final[nshk] = new List<long>();
    final[nshk].Add(dic2[nshk]);
}
Should work for you.
You could "abuse" Concat and Aggregate like this:
var A = new SortedDictionary<decimal,long>();
var B = new SortedDictionary<decimal,long>();
A.Add(1, 11);
A.Add(2, 22);
A.Add(3, 33);
B.Add(2, 222);
B.Add(3, 333);
B.Add(4, 444);
var C = A.Concat(B).Aggregate(
    new SortedDictionary<decimal, List<long>>(),
    (result, pair) => {
        List<long> val;
        if (result.TryGetValue(pair.Key, out val))
            val.Add(pair.Value);
        else
            result.Add(pair.Key, new[] { pair.Value }.ToList());
        return result;
    }
);
foreach (var x in C)
    Console.WriteLine(
        string.Format(
            "{0}:\t{1}",
            x.Key,
            string.Join(", ", x.Value)
        )
    );
The resulting output:
1: 11
2: 22, 222
3: 33, 333
4: 444
This is pretty much the same as if you wrote a "normal" foreach and would in fact work on any IEnumerable<KeyValuePair<decimal, long>> (not just SortedDictionary<decimal, long>) and is easy to extend to more than two input collections if needed.
Unfortunately, it also completely disregards the fact that the input SortedDictionary is, well, sorted, so performance is not optimal. For optimal performance you'd have to fiddle with linearly advancing separate IEnumerator for each of the input sorted dictionaries, while constantly comparing the underlying elements - you could completely avoid TryGetValue that way...
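The enumerator-advancing idea described above could look roughly like the following. This is only a sketch under some assumptions: it keeps just the keys present in both inputs (an inner join), it assumes both dictionaries use the same comparer (a.Comparer is used for the comparisons), and the names SortedMerge and JoinSorted are invented for the example. The result is returned as a list already in key order; loading it into a SortedDictionary afterwards would add the usual insertion cost.

```csharp
using System;
using System.Collections.Generic;

public static class SortedMerge
{
    // Walks both sorted inputs once, advancing whichever enumerator has the smaller key.
    public static List<KeyValuePair<decimal, Tuple<long, long>>> JoinSorted(
        SortedDictionary<decimal, long> a,
        SortedDictionary<decimal, long> b)
    {
        var result = new List<KeyValuePair<decimal, Tuple<long, long>>>();
        using (IEnumerator<KeyValuePair<decimal, long>> ea = a.GetEnumerator())
        using (IEnumerator<KeyValuePair<decimal, long>> eb = b.GetEnumerator())
        {
            bool hasA = ea.MoveNext();
            bool hasB = eb.MoveNext();
            while (hasA && hasB)
            {
                int cmp = a.Comparer.Compare(ea.Current.Key, eb.Current.Key);
                if (cmp < 0)
                {
                    hasA = ea.MoveNext();   // key only in a so far: skip it
                }
                else if (cmp > 0)
                {
                    hasB = eb.MoveNext();   // key only in b so far: skip it
                }
                else
                {
                    // Same key in both: emit the pair of values and advance both sides.
                    result.Add(new KeyValuePair<decimal, Tuple<long, long>>(
                        ea.Current.Key,
                        Tuple.Create(ea.Current.Value, eb.Current.Value)));
                    hasA = ea.MoveNext();
                    hasB = eb.MoveNext();
                }
            }
        }
        return result;
    }
}
```

No TryGetValue or ContainsKey call is needed, and each input is enumerated exactly once. For the A and B used in the Concat example, the result holds key 2 with (22, 222) and key 3 with (33, 333).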
