How to calculate employees reporting to a manager recursively using LINQ? - c#

I have a dictionary that contains mapping of employee and his/her manager like this
Dictionary<string, string> employees = new Dictionary<string, string>()
{
{ "A","C" },
{ "B","C" },
{ "C","F" },
{ "D","E" },
{ "E","F" },
{ "F","F" }
};
I want to get the number of employees under each manager in the hierarchy: not just their direct reports, but everyone further down the chain.
In the above dictionary the root node/CEO is listed as reporting to himself. This is the only node that is guaranteed to have this self relationship.
How can I find the total number of employees reporting to each manager? The output should be
A - 0
B - 0
C - 2
D - 0
E - 1
F - 5
This is what I have tried so far, but it only gives the counts of direct reports, not all reports in the chain:
var reports = employees
.GroupBy(e => e.Value, (key, g) => new { employee = key, reports = g.Count() });

The problem you describe is virtually identical to the problem described in this blog post.
Your spec could be written as (this is a trivial adaptation from the quoted text):
The complete set of reports upon which an Employee depends is the transitive closure of the directly-reports-to relationship.
The post then proceeds to provide the following code to compute the transitive closure of a particular relationship:
static HashSet<T> TransitiveClosure<T>(
this Func<T, IEnumerable<T>> relation,
T item)
{
var closure = new HashSet<T>();
var stack = new Stack<T>();
stack.Push(item);
while (stack.Count > 0)
{
T current = stack.Pop();
foreach (T newItem in relation(current))
{
if (!closure.Contains(newItem))
{
closure.Add(newItem);
stack.Push(newItem);
}
}
}
return closure;
}
So all that's left is to provide the code for the directly-reports-to relationship.
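This can be easily computed by creating a lookup from your dictionary that maps each manager to their direct reports. The root's self-reference is filtered out first; otherwise the root would be counted among its own reports (6 instead of 5 for F):
var directReportLookup = employees
.Where(pair => pair.Key != pair.Value) // skip the root, which reports to itself
.ToLookup(pair => pair.Value, pair => pair.Key);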
Func<string, IEnumerable<string>> directReports =
employee => directReportLookup[employee];
employees.Select(pair => new
{
Employee = pair.Key,
Count = directReports.TransitiveClosure(pair.Key).Count,
});
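As a rough end-to-end sketch of the approach above, the lookup and the closure walk can be folded into a single helper. The names ReportCounter and CountAllReports are made up for illustration, and the code assumes the map is a tree whose only self-reference is the root:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReportCounter
{
    // Counts direct and indirect reports for every employee in the map.
    // Assumption: managerOf describes a tree whose root maps to itself.
    public static Dictionary<string, int> CountAllReports(
        IDictionary<string, string> managerOf)
    {
        // manager -> direct reports, ignoring the root's self-reference
        var directReports = managerOf
            .Where(pair => pair.Key != pair.Value)
            .ToLookup(pair => pair.Value, pair => pair.Key);

        var counts = new Dictionary<string, int>();
        foreach (var employee in managerOf.Keys)
        {
            // Iterative depth-first walk over everyone below this employee
            var seen = new HashSet<string>();
            var stack = new Stack<string>();
            stack.Push(employee);
            while (stack.Count > 0)
            {
                // A lookup returns an empty sequence for unknown keys
                foreach (var report in directReports[stack.Pop()])
                {
                    if (seen.Add(report))
                    {
                        stack.Push(report);
                    }
                }
            }
            counts[employee] = seen.Count;
        }
        return counts;
    }
}
```

With the dictionary from the question, ReportCounter.CountAllReports(employees) gives A = 0, B = 0, C = 2, D = 0, E = 1 and F = 5.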

You might be interested in recursion within anonymous methods. Functional languages offer an interesting approach for this: the fixed-point combinator.
It looks like this:
public static class Combinator
{
public static Func<TInput, TResult> Y<TInput, TResult>(Func<Func<TInput,TResult>, TInput, TResult> function)
{
return input => function(Y(function), input);
}
}
And can be used like this:
var result = employees
.Select(employee => Combinator.Y<string, int>
(
(f, e) => employees.Where(x => x.Value == e && x.Value != x.Key)
.Aggregate(employees.Count(x => x.Value == e && x.Value != x.Key), (current, next) => current + f(next.Key))
)
.Invoke(employee.Key))
.ToList();
Of course, it will be more useful for simpler tasks, like this:
var fact = Combinator.Y<int, int>((f, n) => n > 1 ? n * f(n - 1) : 1);
var fib = Combinator.Y<uint, int>((f, n) => n > 2 ? f(n - 1) + f(n - 2) : (n == 0 ? 0 : 1));

Related

How to combine two different GroupedStreams in Rx.NET?

This question is similar, but it does not apply to my case, since that user needed to merge observable streams from the same IGroupedObservable, while I want to combine streams from different groups.
I have the following structures and streams:
type A = {
Id: int
Value: int
}
type B = {
Id: int
Value: int
}
//subjects to test input, just any source of As and Bs
let subjectA: Subject<A> = Subject.broadcast
let subjectB: Subject<B> = Subject.broadcast
//grouped streams
let groupedA: IObservable<IGroupedObservable<int, A>> = Observable.groupBy (fun a -> a.Id) subjectA
let groupedB: IObservable<IGroupedObservable<int, B>> = Observable.groupBy (fun b -> b.Id) subjectB
I want to somehow merge the internal observables of A and B when groupedA.Key = groupedB.Key, and get an observable of (A, B) pairs where A.Id = B.Id
The signature I want is something like
IObservable<IGroupedObservable<int, A>> -> IObservable<IGroupedObservable<int, B>> -> IObservable<IGroupedObservable<int, (A, B)>> where for all (A, B), A.Id = B.Id
I tried a bunch of combineLatest, groupJoin, filters and maps variations, but with no success.
I'm using F# with Rx.Net and FSharp.Control.Reactive, but if you know the answer in C# (or any language, really) please post it
Here is a custom operator GroupJoin that you could use. It is based on the Select, Merge, GroupBy and Where operators:
/// <summary>
/// Groups and joins the elements of two observable sequences, based on common keys.
/// </summary>
public static IObservable<(TKey Key, IObservable<TLeft> Left, IObservable<TRight> Right)>
GroupJoin<TLeft, TRight, TKey>(
this IObservable<TLeft> left,
IObservable<TRight> right,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
IEqualityComparer<TKey> keyComparer = null)
{
// Arguments validation omitted
keyComparer ??= EqualityComparer<TKey>.Default;
return left
.Select(x => (x, (TRight)default, Type: 1, Key: leftKeySelector(x)))
.Merge(right.Select(x => ((TLeft)default, x, Type: 2, Key: rightKeySelector(x))))
.GroupBy(e => e.Key, keyComparer)
.Select(g => (
g.Key,
g.Where(e => e.Type == 1).Select(e => e.Item1),
g.Where(e => e.Type == 2).Select(e => e.Item2)
));
}
Usage example:
var subjectA = new Subject<A>();
var subjectB = new Subject<B>();
IObservable<IGroupedObservable<int, (A, B)>> query = subjectA
.GroupJoin(subjectB, a => a.Id, b => b.Id)
.SelectMany(g => g.Left.Zip(g.Right, (a, b) => (g.Key, a, b)))
.GroupBy(e => e.Key, e => (e.a, e.b));
I'm not clear if this is what you want. So it may be helpful to clarify first with runner code. Assuming the following runner code:
var aSubject = new Subject<A>();
var bSubject = new Subject<B>();
var groupedA = aSubject.GroupBy(a => a.Id);
var groupedB = bSubject.GroupBy(b => b.Id);
//Initiate solution
solution.Merge()
.Subscribe(t => Console.WriteLine($"(Id = {t.a.Id}, AValue = {t.a.Value}, BValue = {t.b.Value} )"));
aSubject.OnNext(new A() { Id = 1, Value = 1 });
aSubject.OnNext(new A() { Id = 1, Value = 2 });
bSubject.OnNext(new B() { Id = 1, Value = 10 });
bSubject.OnNext(new B() { Id = 1, Value = 20 });
bSubject.OnNext(new B() { Id = 1, Value = 30 });
Do you want to see the following output:
(Id = 1, AValue = 1, BValue = 10)
(Id = 1, AValue = 2, BValue = 10)
(Id = 1, AValue = 1, BValue = 20)
(Id = 1, AValue = 2, BValue = 20)
(Id = 1, AValue = 1, BValue = 30)
(Id = 1, AValue = 2, BValue = 30)
If that's the case, you can get to solution as follows:
var solution = groupedA.Merge()
.Join(groupedB.Merge(),
_ => Observable.Never<Unit>(),
_ => Observable.Never<Unit>(),
(a, b) => (a, b)
)
.Where(t => t.a.Id == t.b.Id)
.GroupBy(g => g.a.Id);
I'll caution that there are memory/performance impacts here if this is part of a long-running process. This keeps all A and B objects in memory indefinitely, waiting to see if they can be paired off. To shorten the amount of time they're kept in memory, change the Observable.Never() calls to appropriate windows for how long to keep each object in memory.
As a start, this has the signature you want:
let cartesian left right =
rxquery {
for a in left do
for b in right do
yield a, b
}
let mergeGroups left right =
rxquery {
for (leftGroup : IGroupedObservable<'key, 'a>) in left do
for (rightGroup : IGroupedObservable<'key, 'b>) in right do
if leftGroup.Key = rightGroup.Key then
let merged = cartesian leftGroup rightGroup
yield {
new IGroupedObservable<_, _> with
member __.Key = leftGroup.Key
member __.Subscribe(observer) = merged.Subscribe(observer)
}
}
However, in my testing, the groups are all empty. I don't have enough Rx experience to know why, but perhaps someone else does.

.Net LINQ - Filter a dictionary using another dictionary

I have two dictionaries of the same type, A and B.
Dictionary<string, IEnumerable<object>>
I'm using object to represent a complex type having a property 'Id'.
I'm looking for all items in A having objects that exist in B (using Id), but under a different key. It's basically to tell if an object has moved keys. A is the new dictionary and B is the old.
Is there a reasonable way to accomplish this using LINQ? I would like the result to be a dictionary of all key-value pairs in A meeting the criteria. Thanks in advance.
I use an interface IHasId to expose the Id property:
public interface IHasId
{
int Id { get; }
}
And a class AAA that implements the interface:
public class AAA: IHasId
{
public int Id { get; set; }
}
Here is the LINQ query you are looking for:
Dictionary<string, IEnumerable<IHasId>> A = new Dictionary<string, IEnumerable<IHasId>>();
A.Add("111", new List<IHasId> { new AAA { Id = 1 }, new AAA { Id = 2 } });
A.Add("333", new List<IHasId> { new AAA { Id = 3 } });
Dictionary<string, IEnumerable<IHasId>> B = new Dictionary<string, IEnumerable<IHasId>>();
B.Add("111", new List<IHasId> { new AAA { Id = 1 }});
B.Add("222", new List<IHasId> { new AAA { Id = 2 }});
B.Add("333", new List<IHasId> { new AAA { Id = 3 } });
var res = A.Where(a => a.Value.Any(c => B.Any(v => v.Value
.Select(x => x.Id).Contains(c.Id) && a.Key != v.Key))).ToList();
In this example it returns key 111, which holds the object with Id = 2 that moved from key 222 to key 111.
If you want the result as a dictionary, you can replace ToList with ToDictionary:
var res = A.Where(a => a.Value.Any(c => B.Any(v => v.Value
.Select(x => x.Id).Contains(c.Id) && a.Key != v.Key)))
.ToDictionary(a=>a.Key, a=>a.Value);
If you want the new dictionary to contain only the values that have changed (in the example, key 111 with only the object with Id = 2), you can do it like this:
var res = A.Select(a => new KeyValuePair<string, IEnumerable<IHasId>>(a.Key,
a.Value.Where(c => B.Any(v => v.Value.Select(x => x.Id).Contains(c.Id) && a.Key != v.Key))))
.Where(a => a.Value.Any())
.ToDictionary(a => a.Key, a => a.Value);
In terms of searchability, your dictionary has it backwards; it is efficient for looking up an object given a string, but you need to be able to look up the strings for a given object. An efficient data structure for this purpose would be a Lookup<object,string>.
First, use ToLookup() to create a lookup table where the key is the object and the value is the list of keys in both list A and B. Use Union (instead of Concat) to eliminate duplicates.
var lookup = listA
.Union( listB )
.ToLookup( pair => pair.Value, pair => pair.Key );
Once you have the lookup, the problem is trivial.
var results = lookup.Where( x => x.Count() > 1);
See this DotNetFiddle for a working example with sample data.
If you need A entries with original objects, it could be:
var result = A.Where(a => B.Any(b => b.Key != a.Key && b.Value.Intersect(a.Value).Any()));
If you need A entries with only matching objects from B, it could be:
var result = A.Select(a => new KeyValuePair<string, IEnumerable<object>>(a.Key, B.Where(b => b.Key != a.Key).SelectMany(b => b.Value.Intersect(a.Value)))).Where(x => x.Value.Any());
You can provide a custom equality comparer for Intersect to match items by Id or whatever.
Use new Dictionary<string, IEnumerable<object>>(result) if you need it as a dictionary.
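As a minimal sketch of the custom comparer mentioned above: the Item class below is a hypothetical stand-in for the question's complex type with an Id property, and ItemIdComparer and MovedItems are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's "complex type having a property Id".
public sealed class Item
{
    public int Id { get; set; }
}

// Treats two items as equal when their Ids match.
public sealed class ItemIdComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Item obj) => obj.Id.GetHashCode();
}

public static class MovedItems
{
    // Returns the entries of 'current' that hold at least one item which
    // 'previous' stores under a different key (matched by Id).
    public static Dictionary<string, IEnumerable<Item>> Find(
        IDictionary<string, IEnumerable<Item>> current,
        IDictionary<string, IEnumerable<Item>> previous)
    {
        var byId = new ItemIdComparer();
        return current
            .Where(a => previous.Any(b =>
                b.Key != a.Key && b.Value.Intersect(a.Value, byId).Any()))
            .ToDictionary(a => a.Key, a => a.Value);
    }
}
```

Passing the comparer to Intersect is what makes two distinct instances with the same Id count as the same object; without it, Intersect falls back to reference equality for a class like Item.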
Use the Join operator (see join clause (C# Reference)):
var dictionary = (
from a in (from entry in A from Value in entry.Value select new { entry.Key, Value })
join b in (from entry in B from Value in entry.Value select new { entry.Key, Value })
on ((dynamic)a.Value).Id equals ((dynamic)b.Value).Id
where a.Key != b.Key
select a
).ToDictionary(a => a.Key, a => a.Value);

Dictionary<> value count c#

I have dictionary object like this:
var dictionary = new Dictionary<string, List<int>>();
The number of keys is not very large but the list of integers in the value can be quite large (in the order of 1000's)
Given a list of keys (keylist), I need to count the number of times each integer appears for each key and return them ordered by frequency.
Output:
{int1, count1}
{int2, count2}
...
This is the solution I have come up with:
var query = _keylist.SelectMany(
n=>_dictionary[n]).GroupBy(g=>g).Select(
g=> new[] {g.Key, g.Count()}).OrderByDescending(g=>g[1]);
Even though this query produces the desired result, it's not very efficient.
Is there a clever way to produce the same result with less processing?
I would do it this way:
var query =
from k in _keylist
from v in dictionary[k]
group v by v into gvs
let result = new
{
key = gvs.Key,
count = gvs.Count(),
}
orderby result.count descending
select result;
To me this is quite straightforward and simple, and well worth accepting any (minor) performance hit from using LINQ.
An alternative approach that doesn't create the large list of groups would be to do this:
var query =
_keylist
.SelectMany(k => dictionary[k])
.Aggregate(
new Dictionary<int, int>(),
(d, v) =>
{
if (d.ContainsKey(v))
{
d[v] += 1;
}
else
{
d[v] = 1;
}
return d;
})
.OrderByDescending(kvp => kvp.Value)
.Select(kvp => new
{
key = kvp.Key,
count = kvp.Value,
});
From an algorithmic space- and time-usage point of view, the only thing I see that is suboptimal is the use of GroupBy when you don't actually need the groups (only the group counts). You can use the following extension method instead.
public static Dictionary<K, int> CountBy<T, K>(
this IEnumerable<T> source,
Func<T, K> keySelector)
{
return source.SumBy(keySelector, item => 1);
}
public static Dictionary<K, int> SumBy<T, K>(
this IEnumerable<T> source,
Func<T, K> keySelector,
Func<T, int> valueSelector)
{
if (source == null)
{
throw new ArgumentNullException("source");
}
if (keySelector == null)
{
throw new ArgumentNullException("keySelector");
}
var dictionary = new Dictionary<K, int>();
foreach (var item in source)
{
var key = keySelector(item);
int count;
if (!dictionary.TryGetValue(key, out count))
{
count = 0;
}
dictionary[key] = count + valueSelector(item);
}
return dictionary;
}
Note the advantage is that the lists of numbers are enumerated but not stored. Only the counts are stored. Note also that the keySelector parameter is not even necessary in your case and I only included it to make the extension method slightly more general.
The usage is then as follows.
var query = _keylist
.SelectMany(k => _dictionary[k])
.CountBy(n => n)
.OrderByDescending(p => p.Value);
This will get you a sequence of KeyValuePair<int, int> where the Key is the number from your original lists and the Value is the count.
To more efficiently handle a sequence of queries, you can preprocess your data.
Dictionary<string, Dictionary<int, int>> preprocessedDictionary
= _dictionary.ToDictionary(p => p.Key, p => p.Value.CountBy(n => n));
Now you can perform a query more efficiently.
var query = _keylist
.SelectMany(k => preprocessedDictionary[k])
.SumBy(p => p.Key, p => p.Value)
.OrderByDescending(p => p.Value);

Use Linq to find consecutively repeating elements

Let's assume I have a list with objects of type Value. Value has a Name property:
private List<Value> values = new List<Value> {
new Value { Id = 0, Name = "Hello" },
new Value { Id = 1, Name = "World" },
new Value { Id = 2, Name = "World" },
new Value { Id = 3, Name = "Hello" },
new Value { Id = 4, Name = "a" },
new Value { Id = 5, Name = "a" },
};
Now I want to get a list of all "repeating" values (elements where the name property was identical with the name property of the previous element).
In this example I want a list with the two elements "World" and "a" (Id = 2 and 5) to be returned.
Is this even possible with LINQ?
Of course I could do something like this:
List<Value> tempValues = new List<Value>();
String lastName = String.Empty;
foreach (var v in values)
{
if (v.Name == lastName) tempValues.Add(v);
lastName = v.Name;
}
but since I want to use this query in a more complex context, maybe there is a "linqish" solution.
There won't be anything built in along those lines, but if you need this frequently you could roll something bespoke but fairly generic:
static IEnumerable<TSource> WhereRepeated<TSource>(
this IEnumerable<TSource> source)
{
return WhereRepeated<TSource,TSource>(source, x => x);
}
static IEnumerable<TSource> WhereRepeated<TSource, TValue>(
this IEnumerable<TSource> source, Func<TSource, TValue> selector)
{
using (var iter = source.GetEnumerator())
{
if (iter.MoveNext())
{
var comparer = EqualityComparer<TValue>.Default;
TValue lastValue = selector(iter.Current);
while (iter.MoveNext())
{
TValue currentValue = selector(iter.Current);
if (comparer.Equals(lastValue, currentValue))
{
yield return iter.Current;
}
lastValue = currentValue;
}
}
}
}
Usage:
foreach (Value value in values.WhereRepeated(x => x.Name))
{
Console.WriteLine(value.Name);
}
You might want to think about what to do with triplets etc - currently everything except the first will be yielded (which matches your description), but that might not be quite right.
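If triplets should produce a single result rather than two, one possible variation is sketched below. WhereRepeatedOnce and RunExtensions are made-up names, and the choice to yield the second element of each run is an assumption about what "repeated" should mean:

```csharp
using System;
using System.Collections.Generic;

public static class RunExtensions
{
    // Yields the second element of every run of equal keys, so the
    // sequence "a, a, a" produces one result instead of two.
    public static IEnumerable<TSource> WhereRepeatedOnce<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        bool hasPrevious = false;   // false only for the first element
        bool reported = false;      // true once the current run was yielded
        TKey previous = default(TKey);

        foreach (var item in source)
        {
            TKey current = selector(item);
            if (hasPrevious && comparer.Equals(previous, current))
            {
                if (!reported)
                {
                    reported = true;
                    yield return item;
                }
            }
            else
            {
                // A new run starts here
                reported = false;
            }
            previous = current;
            hasPrevious = true;
        }
    }
}
```

For the names "x", "y", "y", "y", "z", "z" this yields "y" and "z" once each, whereas the version above would yield "y" twice.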
You could implement a Zip extension, then Zip your list with .Skip(1) and then Select the rows that match.
This should work and be fairly easy to maintain:
values
.Skip(1)
.Zip(values, (first,second) => first.Name==second.Name?first:null)
.Where(i => i != null);
The slight disadvantage of this method is that you iterate through the list twice.
I think this would work (untested) -- this will give you both the repeated element and its index in the original list. For multiple repeats you could traverse this list and check for consecutive indices.
var query = values.Select( (v,i) => new { Value = v, Index = i } )
.Where( x => x.Index > 0 && x.Value.Name == values[x.Index - 1].Name );
Here's another simple approach that should work if the IDs are always sequential as in your sample:
var data = from v2 in values
join v1 in values on v2.Id equals v1.Id + 1
where v1.Name == v2.Name
select v2;
I know this question is ancient but I was just working on the same thing so ....
static class utils
{
public static IEnumerable<T> FindConsecutive<T>(this IEnumerable<T> data, Func<T,T,bool> comparison)
{
return Enumerable.Range(0, data.Count() - 1)
.Select( i => new { a=data.ElementAt(i), b=data.ElementAt(i+1)})
.Where(n => comparison(n.a, n.b)).Select(n => n.a);
}
}
Should work for anything - just provide a function to compare the elements
You could use the GroupBy extension to do this.
Something like this
var dupsNames =
from v in values
group v by v.Name into g
where g.Count() > 1 // If a group has only one element, just ignore it
select g.Key;
should work. You can then use the results in a second query:
dupsNames.Select( d => values.Where( v => v.Name == d ) )
This should return a grouping with key=name, values = { elements with name }
Disclaimer: I did not test the above, so I may be way off.

Find the most occurring number in a List<int>

Is there a quick and nice way using linq?
How about:
var most = list.GroupBy(i=>i).OrderByDescending(grp=>grp.Count())
.Select(grp=>grp.Key).First();
or in query syntax:
var most = (from i in list
group i by i into grp
orderby grp.Count() descending
select grp.Key).First();
Of course, if you will use this repeatedly, you could add an extension method:
public static T MostCommon<T>(this IEnumerable<T> list)
{
return ... // previous code
}
Then you can use:
var most = list.MostCommon();
Not sure about the lambda expressions, but I would
Sort the list [O(n log n)]
Scan the list [O(n)] finding the longest run-length.
Scan it again [O(n)] reporting each number having that run-length.
This is because there could be more than one most-occurring number.
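The sort-and-scan idea above could look roughly like this for a list of ints. SortedModes is an illustrative name, and as a small variation the two scans are merged into one pass by resetting the result list whenever a longer run is found:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedModes
{
    // Returns every value that occurs the maximum number of times.
    public static List<int> Find(IEnumerable<int> source)
    {
        // Sorting places equal values next to each other: O(n log n)
        var sorted = source.OrderBy(x => x).ToList();
        var modes = new List<int>();
        int best = 0;   // longest run length seen so far
        int i = 0;

        while (i < sorted.Count)
        {
            // Advance j to the end of the run that starts at i
            int j = i;
            while (j < sorted.Count && sorted[j] == sorted[i])
            {
                j++;
            }

            int run = j - i;
            if (run > best)
            {
                // Longer run found: earlier candidates no longer qualify
                best = run;
                modes.Clear();
            }
            if (run == best)
            {
                modes.Add(sorted[i]);
            }
            i = j;
        }
        return modes;
    }
}
```

For the input 1, 2, 3, 1, 2 this returns 1 and 2, and an empty input returns an empty list.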
Taken from my answer here:
public static IEnumerable<T> Mode<T>(this IEnumerable<T> input)
{
var dict = input.ToLookup(x => x);
if (dict.Count == 0)
return Enumerable.Empty<T>();
var maxCount = dict.Max(x => x.Count());
return dict.Where(x => x.Count() == maxCount).Select(x => x.Key);
}
var none = new int[0].Mode().ToArray(); //returns { }
var all = new[] { 1, 2, 3 }.Mode().ToArray(); //returns { 1, 2, 3 }
var one = new[] { 1, 1, 2, 3 }.Mode().ToArray(); //returns { 1 }
var two = new[] { 1, 2, 3, 1, 2 }.Mode().ToArray(); //returns { 1, 2 }
I went for a performance test between the above approach and David B's TakeWhile.
source = { }, iterations = 1000000
mine - 300 ms, David's - 930 ms
source = { 1 }, iterations = 1000000
mine - 1070 ms, David's - 1560 ms
source = 100+ ints with 2 duplicates, iterations = 10000
mine - 300 ms, David's - 500 ms
source = 10000 random ints with about 100+ duplicates, iterations = 1000
mine - 1280 ms, David's - 1400 ms
Here is another answer, which seems to be fast. I think Nawfal's answer is generally faster but this might shade it on long sequences.
public static IEnumerable<T> Mode<T>(
this IEnumerable<T> source,
IEqualityComparer<T> comparer = null)
{
var counts = source.GroupBy(t => t, comparer)
.Select(g => new { g.Key, Count = g.Count() })
.ToList();
if (counts.Count == 0)
{
return Enumerable.Empty<T>();
}
var maxes = new List<int>(5);
int maxCount = 1;
for (var i = 0; i < counts.Count; i++)
{
if (counts[i].Count < maxCount)
{
continue;
}
if (counts[i].Count > maxCount)
{
maxes.Clear();
maxCount = counts[i].Count;
}
maxes.Add(i);
}
return maxes.Select(i => counts[i].Key);
}
Someone asked for a solution where there are ties. Here's a stab at that:
int indicator = 0;
var result =
list.GroupBy(i => i)
.Select(g => new {i = g.Key, count = g.Count()})
.OrderByDescending(x => x.count)
.TakeWhile(x =>
{
if (x.count == indicator || indicator == 0)
{
indicator = x.count;
return true;
}
return false;
})
.Select(x => x.i);
Here's a solution I've written for when there are multiple most common elements.
public static List<T> MostCommonP<T>(this IEnumerable<T> list)
{
return list.GroupBy(element => element)
.GroupBy(group => group.Count())
.MaxBy(groups => groups.Key)
.Select(group => group.Key)
.ToList();
}
