I'm really struggling to understand how Aggregate works, and I have a solution that maps an IEnumerable to a newer C# 7 tuple.
I think I could understand it a little better if it were written in LINQ query (SQL-like) syntax.
Would anyone like to take a stab at it?
IEnumerable<(string Key, string Value)> many = DataToPivot();
(string XXXX, string YYYY, string ZZZZ) agg =
many.Aggregate((XXXX: default(string),
YYYY: default(string),
ZZZZ: default(string)),
(a, i) =>
{
switch (i.Key)
{
case "xxxx":
return (i.Value, a.YYYY, a.ZZZZ);
case "yyyy":
return (a.XXXX, i.Value, a.ZZZZ);
case "zzzz":
return (a.XXXX, a.YYYY, i.Value);
default:
return a;
}
});
As far as I know Aggregate doesn't have a query syntax (for more info see documentation). The documentation should also be able to explain how the function works.
The overload you're using is taking the initial value of the aggregate (1st argument), and applying the accumulation function (2nd argument) to each element, returning the intermediate aggregate value. So your example produces 3 strings from the input data basically returning the last string value for each key (or default(string) when input data doesn't contain any items for that key).
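To make the mechanics concrete, here is a minimal sketch that runs the seeded overload next to an equivalent foreach loop. The sample data is made up for illustration and stands in for DataToPivot(), and the tuple is cut down to two fields to keep it short. Both versions walk the sequence left to right and feed each intermediate result back in as the accumulator.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for DataToPivot().
var many = new (string Key, string Value)[]
{
    ("xxxx", "1"), ("yyyy", "2"), ("xxxx", "3"),
};

// Seeded overload: the first argument is the starting accumulator,
// the second maps (accumulator, item) to the next accumulator.
(string XXXX, string YYYY) viaAggregate = many.Aggregate(
    (XXXX: default(string), YYYY: default(string)),
    (a, i) =>
    {
        switch (i.Key)
        {
            case "xxxx": return (i.Value, a.YYYY);
            case "yyyy": return (a.XXXX, i.Value);
            default: return a;
        }
    });

// The same fold written as a plain loop.
(string XXXX, string YYYY) viaLoop = (null, null);
foreach (var item in many)
{
    if (item.Key == "xxxx") viaLoop = (item.Value, viaLoop.YYYY);
    else if (item.Key == "yyyy") viaLoop = (viaLoop.XXXX, item.Value);
}

Console.WriteLine(viaAggregate); // (3, 2)
Console.WriteLine(viaLoop);      // (3, 2)
```

The later "xxxx" entry overwrites the earlier one, which is why the result holds the last value seen for each key.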
If this is your requirement you don't (and shouldn't) need to use the Aggregate function, because you are not aggregating. You can get identical results with the following example (assuming all keys are present in the many input):
IEnumerable<(string Key, string Value)> many = DataToPivot();
var d = many.GroupBy(i => i.Key)
.ToDictionary(g => g.Key, g => g.Last().Value);
(string XXXX, string YYYY, string ZZZZ) agg = (d["xxxx"], d["yyyy"], d["zzzz"]);
If the tuple is not required, the following also handles cases where a key is not present in the data set at all (the default is returned when the key doesn't exist):
d.TryGetValue("xxxx", out string x);
d.TryGetValue("yyyy", out string y);
d.TryGetValue("zzzz", out string z);
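As a rough end-to-end sketch of this approach, the snippet below builds the dictionary from made-up input where one key is missing and another is repeated, then reads the three values with TryGetValue. The input list is an assumption for illustration only. It also shows that the tuple can still be assembled from the three locals if you want it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input: "yyyy" is missing on purpose, "xxxx" appears twice.
var many = new List<(string Key, string Value)>
{
    ("xxxx", "first"), ("zzzz", "z"), ("xxxx", "last"),
};

// Keep only the last value seen for each key.
Dictionary<string, string> d = many
    .GroupBy(i => i.Key)
    .ToDictionary(g => g.Key, g => g.Last().Value);

// TryGetValue leaves the out variable at default(string), i.e. null,
// when the key is absent, so nothing is thrown for "yyyy".
d.TryGetValue("xxxx", out string x);
d.TryGetValue("yyyy", out string y);
d.TryGetValue("zzzz", out string z);

(string XXXX, string YYYY, string ZZZZ) agg = (x, y, z);
Console.WriteLine(agg.XXXX);         // last
Console.WriteLine(agg.YYYY == null); // True
Console.WriteLine(agg.ZZZZ);         // z
```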
Aggregate would be used e.g. for string concatenation - but there you would go with String.Join() instead:
many.GroupBy(i => i.Key)
.ToDictionary(g => g.Key, g => string.Join(",", g.Select(i => i.Value)));
If you still want to use Aggregate, you can rewrite it like this:
many.GroupBy(i => i.Key)
.ToDictionary(g => g.Key, g => g.Aggregate((a, i) => i));
This is basically Last() implemented using Aggregate(); and with TryGetValue you can get what you need.
On a bit more general note: using this approach you can accommodate multiple key values without needing to specifically code them. In that case you might not even need the ToDictionary call, e.g. like this:
many.GroupBy(i => i.Key)
.Select(g => new { g.Key, Result = g.Aggregate((a, i) => i) })
.ToList();
So far, I have this:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)));
Configuration folder will contain pairs of files:
abc.json
abc-input.json
def.json
def-input.json
GetReportName() method strips off the "-input" and title cases the filename, so you end up with a grouping of:
Abc
abc.json
abc-input.json
Def
def.json
def-input.json
I have a ReportItem class that has a constructor (Name, str1, str2). I want to extend the Linq to create the ReportItems in a single statement, so really something like:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Select(x => new ReportItem(x.Key, x[0], x[1]));
Obviously the last line doesn't work, because the grouping doesn't support array indexing like that. Each item should be constructed as "Abc", "abc.json", "abc-input.json", etc.
If you know that each group of interest contains exactly two items, use First() to get the item at index 0, and Last() to get the item at index 1:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Where(g => g.Count() == 2) // Make sure we have exactly two items
.Select(x => new ReportItem(x.Key, x.First(), x.Last()));
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Select(x => new ReportItem(x.Key, x.FirstOrDefault(), x.Skip(1).FirstOrDefault()));
But are you sure there will be exactly two items in each group? Maybe it makes sense for ReportItem to accept an IEnumerable<string>, not just two strings?
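If indexing is what you are after, one option is to materialize each group into a list first. The following is only a sketch under stated assumptions: the file names are hard-coded instead of read from disk, the GetReportName() body is a guess at what the question describes (strip "-input", then title-case), and a tuple stands in for the ReportItem class.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical file names; "orphan.json" has no "-input" partner.
string[] files = { "abc.json", "abc-input.json", "def.json", "def-input.json", "orphan.json" };

// Hypothetical stand-in for GetReportName(): strip "-input", then title-case.
string GetReportName(string stem)
{
    const string suffix = "-input";
    if (stem.EndsWith(suffix, StringComparison.Ordinal))
        stem = stem.Substring(0, stem.Length - suffix.Length);
    return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(stem);
}

var reports = files
    .GroupBy(f => GetReportName(Path.GetFileNameWithoutExtension(f)))
    .Select(g => (Name: g.Key, Files: g.ToList()))  // a List<string> can be indexed
    .Where(p => p.Files.Count == 2)                 // skip incomplete pairs
    .Select(p => (p.Name, First: p.Files[0], Second: p.Files[1]))
    .ToList();

foreach (var r in reports)
    Console.WriteLine($"{r.Name}: {r.First}, {r.Second}");
// Abc: abc.json, abc-input.json
// Def: def.json, def-input.json
```

Elements inside a group keep their source order, so which file ends up first depends on the order in which the files are enumerated.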
Here is my code:
IEnumerable<ServiceTicket> troubletickets = db.ServiceTickets.Include(t => t.Company).Include(t => t.UserProfile);
var ticketGroups = new Dictionary<string, List<ServiceTicket>>();
ticketGroups = troubletickets
.GroupBy(o => o.DueDate).ToDictionary(
group => {
var firstOrDefault = @group.FirstOrDefault();
return firstOrDefault != null
? firstOrDefault.DueDate.HasValue
? firstOrDefault.DueDate.Value.ToShortDateString()
: ""
: "";
},
group => group.ToList()
).OrderBy(g => g.Key).ToDictionary(g => g.Key, g => g.Value);
The error that I am getting is: 'An item with the same key has already been added.' This is because the DueDate value is occasionally repeated. My question is how can I keep the key from being added if it already exists in the dictionary?
It seems that you are grouping by one value (the DueDate value), but using a different value as the dictionary key.
Can you not just use the custom code for grouping instead?
ticketGroups = troubletickets
.GroupBy(o => o.DueDate.HasValue
? o.DueDate.Value.ToShortDateString()
: "")
.ToDictionary(g => g.Key, g => g.ToList());
Note that I took out the superfluous OrderBy and second ToDictionary call. I assumed you were trying to "order" the dictionary, which won't work because a plain dictionary is not ordered.
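Here is a small sketch of why grouping by the formatted string avoids the exception. It assumes the DueDate values carry a time component, which the question does not state; the tuple type is a hypothetical stand-in for ServiceTicket.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ServiceTicket: just an id and a nullable due date.
var tickets = new (int Id, DateTime? DueDate)[]
{
    (1, new DateTime(2024, 1, 15, 9, 0, 0)),
    (2, new DateTime(2024, 1, 15, 17, 30, 0)), // same day, different time
    (3, null),
};

// Grouping by the raw DueDate gives three groups, but two of them format
// to the same short date string, so using that string as a dictionary key
// afterwards would produce a duplicate.
int rawGroups = tickets.GroupBy(t => t.DueDate).Count();

// Grouping by the formatted string merges them, so the keys are unique.
Dictionary<string, List<int>> byDay = tickets
    .GroupBy(t => t.DueDate.HasValue ? t.DueDate.Value.ToShortDateString() : "")
    .ToDictionary(g => g.Key, g => g.Select(t => t.Id).ToList());

Console.WriteLine(rawGroups);         // 3
Console.WriteLine(byDay.Count);       // 2
Console.WriteLine(byDay[""].Count);   // 1
```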
You get duplicate keys because there are two ways to get an empty string as key, either an empty group, or an empty date. The duplicate will always be the empty string. I wonder if you really intended to get an empty string as key when the group is empty. Anyway, it's not necessary, you can always filter empty groups later.
It's easier to group by date (including null) first through the database engine and then apply string formatting in memory:
IQueryable<ServiceTicket> troubletickets = db.ServiceTickets
.Include(t => t.Company)
.Include(t => t.UserProfile);
Dictionary<string, List<ServiceTicket>> ticketGroups =
troubletickets
.GroupBy(ticket => ticket.DueDate)
.AsEnumerable() // Continue in memory
.ToDictionary(g => g.Key.HasValue
? g.Key.Value.ToShortDateString()
: string.Empty,
g => g.ToList());
Now the grouping is by the Key value, not by the First element in the group. The Key is never null, it's always a Nullable<DateTime>, with or without a value.
Side note: you'll notice that EF will not generate a SQL group by statement, that's because the SQL statement is "destructive": it only returns grouped columns and aggregate data, not the individual records that a LINQ GroupBy does return. For this reason, the generated SQL is pretty bloated and it may enhance performance if you place the AsEnumerable before the .GroupBy.
This is related to my other question here. James World presented a solution as follows:
// idStream is an IObservable<int> of the input stream of IDs
// alarmInterval is a Func<int, TimeSpan> that gets the interval given the ID
var idAlarmStream = idStream
.GroupByUntil(key => key, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.IgnoreElements().Concat(Observable.Return(grp.Key)));
Edit 2:
Question: How do I start the timers immediately, without waiting for the first events to arrive? That's the root problem in my question, I guess. To that end, I planned on sending off dummy objects with the IDs I know should be there. But as I describe below, I ended up with some other problems. Nevertheless, I think solving those would be interesting too.
On to the other interesting parts, then! Suppose I'd like to wrap each ID in a complex object like the following and group by its key (this won't compile):
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.IgnoreElements().Concat(Observable.Return(grp.Key)));
then I get into trouble. I'm unable to modify the part about SelectMany, Concat and Observable.Return so that the query would work as before. For instance, if I make query as
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.IgnoreElements().Concat(Observable.Return(grp.Key.First())))
.Subscribe(i => Console.WriteLine(i.Id + "-" + i.IsTest));
Then two events are needed before an output can be observed in the Subscribe. It's the effect of the call to First, I gather. Furthermore, I would like to use the complex object's attributes in the call to alarmInterval too.
Can someone offer an explanation of what's going on, perhaps even a solution? The problem with using the unmodified solution is that the grouping doesn't look at the Ids alone for the key value, but also at the IsTest field.
Edit: As a note, the problem could probably be solved by first creating an explicit class or struct that implements a custom IEquatable, and then using James' code as-is so that grouping happens by IDs alone. It feels like a hack, though.
Also, if you want to count the number of times you've seen an item before the alarm goes off you can do it like this, taking advantage of the counter overload in Select.
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.Select((alarm, count) => new { count, alarm }).TakeLast(1));
Note, this will be 0 for the first (seed) item - which is probably what you want anyway.
You are creating an anonymous type in your Select. Let's call it A1. I will assume your idStream is an IObservable<int>. Since the Id is the key in the GroupByUntil, you do not need to worry about key comparison: int equality is fine.
The GroupByUntil is an IObservable<IGroupedObservable<int, A1>>.
The SelectMany as written is trying to be an IObservable<A1>. You need to just Concat(Observable.Return(grp.Key)) here, but the type of the Key and the type of the group elements must match or the SelectMany won't work. So the key would have to be an A1 too. Anonymous types use structural equality, and the return type would be a stream of A1, but you can't declare that as a public return type.
If you just want the Id, project the group's elements with .Select(x => x.Id) before IgnoreElements, so that the element type matches the int key:
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.Select(x => x.Id).IgnoreElements().Concat(Observable.Return(grp.Key)));
If you want A1 instead - you'll need to create a concrete type that implements Equality.
EDIT
I've not tested it, but you could also flatten it more simply like this, which I think is easier. It outputs A1 though, so you'll have to deal with that if you need to return the stream somewhere.
var idAlarmStream = idStream
.Select(i => new { Id = i, IsTest = true })
.GroupByUntil(key => key.Id, grp => grp.Throttle(alarmInterval(grp.Key)))
.SelectMany(grp => grp.TakeLast(1));
I'm trying to do a GroupBy and then OrderBy to a list I have. Here is my code so far:
reportList.GroupBy(x => x.Type).ToDictionary(y=>y.Key, z=>z.OrderBy(a=>a.Lost));
With the help of the last question I asked on linq I think the ToDictionary is probably unneeded, but without it I don't know how to access the inner value.
To be clear, I need to GroupBy the Type property and want the inner groups I get to be ordered by the Lost property (an integer). I want to know if there is a better, more efficient way, or at least one better than what I've done.
An explanation and not just an answer would be very much appreciated.
Yes, there is a better approach. Do not use random names (x, y, z, a) for variables:
reportList.GroupBy(r => r.Type)
.ToDictionary(g => g.Key, g => g.OrderBy(r => r.Lost));
You can even use long names to make code more descriptive (depends on context in which you are creating query)
reportList.GroupBy(report => report.Type)
.ToDictionary(group => group.Key,
group => group.OrderBy(report => report.Lost));
Your code does basically the following things:
Group elements by type
Convert the GroupBy result into a dictionary where the values of the dictionary are IEnumerables coming from a call to OrderBy
As far as code correctness goes, it is perfectly fine IMO, but it can maybe be improved in terms of efficiency (though that depends on your needs).
In fact, with your code, the values of your dictionary are lazily evaluated each time you enumerate them, resulting in a call to OrderBy method.
Probably you could perform it once and store the result in this way:
var dict = reportList
.GroupBy(x => x.Type)
.ToDictionary(y => y.Key, z => z.OrderBy(a => a.Lost).ToList());
// note the ToList call
or in this way:
var dict = reportList.OrderBy(a => a.Lost)
.GroupBy(x => x.Type)
.ToDictionary(y => y.Key, z => z);
// here we order then we group,
// since GroupBy guarantees to preserve the original order
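The deferred evaluation described above can be observed by counting how often the sort key selector runs. This is only an illustrative sketch: the report rows are made-up tuples and the counter exists purely for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical report rows: (Type, Lost).
var reportList = new[]
{
    (Type: "A", Lost: 3), (Type: "A", Lost: 1),
    (Type: "B", Lost: 2), (Type: "B", Lost: 0),
};

int sortCalls = 0;

// Deferred: each dictionary value is an unevaluated OrderBy query.
var lazy = reportList
    .GroupBy(r => r.Type)
    .ToDictionary(g => g.Key, g => g.OrderBy(r => { sortCalls++; return r.Lost; }));

int callsBeforeEnumeration = sortCalls;    // nothing has been sorted yet
lazy["A"].ToList();
int callsAfterFirst = sortCalls;           // keys computed for group A
lazy["A"].ToList();
int callsAfterSecond = sortCalls;          // and computed again

// Materialized: the sort runs once per group, while the dictionary is built.
sortCalls = 0;
var eager = reportList
    .GroupBy(r => r.Type)
    .ToDictionary(g => g.Key, g => g.OrderBy(r => { sortCalls++; return r.Lost; }).ToList());

int callsAfterBuild = sortCalls;
var firstRead = eager["A"];
var secondRead = eager["A"];
int callsAfterReads = sortCalls;           // reading the lists does not re-sort

Console.WriteLine(callsBeforeEnumeration);              // 0
Console.WriteLine(callsAfterSecond > callsAfterFirst);  // True
Console.WriteLine(callsAfterReads == callsAfterBuild);  // True
```

Whether the extra sorts matter depends on how large the groups are and how often the values are enumerated.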
Looks fine to me. If you use an anonymous type instead of a Dictionary, you could probably improve the readability of the code that uses the results of this query.
reportList.GroupBy(r => r.Type)
.Select(g => new { Type = g.Key, Reports = g.OrderBy(r => r.Lost) });
Why does this yield an empty set?
Object[] types = {23, 234, "hello", "test", true, 23};
var newTypes = types.Select(x => x.GetType().Name)
.Where(x => x.GetType().Name.Equals("Int32"))
.OrderBy(x => x);
newTypes.Dump();
When you do your select you're getting an IEnumerable<String>. Then you're taking the types of each string in the list (which is all "String") and filtering them out where they aren't equal to "Int32" (which is the entire list). Ergo...the list is empty.
Equals works just fine, it's your query that isn't correct. If you want to select the integers in the list use:
var newTypes = types.Where( x => x.GetType().Name.Equals("Int32") )
.OrderBy( x => x );
Reverse the order of the operations:
var newTypes = types.Where(x => x is int)
.OrderBy(x => x)
.Select(x => x.GetType().Name);
(Notice this also uses a direct type check instead of the rather peculiar .GetType().Name.Equals(…)).
The thing with LINQ is you've got to stop thinking in SQL terms. In SQL we think like this:-
SELECT Stuff
FROM Stuff
WHERE Stuff
ORDER BY Stuff
That is what your code looks like. However in LINQ we need to think like this :-
FROM Stuff
WHERE Stuff
SELECT Stuff
ORDER BY Stuff
var newTypes = types.Select(x => x.GetType().Name)
.Where(x => x.Equals("Int32"))
.OrderBy(x => x);
This doesn't work because the Select statement will convert every value in the collection to the name of the underlying type of that value. The resulting collection will contain only string values and hence they won't ever have the name Int32.
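A short sketch that runs both orderings side by side on the array from the question may help. The OfType<int>() call at the end is an extra alternative that none of the answers mention; it filters and casts in one step.

```csharp
using System;
using System.Linq;

object[] types = { 23, 234, "hello", "test", true, 23 };

// Select first: every element becomes a string (the type name), so asking
// each of those strings for its own type name always yields "String".
var selectFirst = types.Select(x => x.GetType().Name)
                       .Where(x => x.GetType().Name.Equals("Int32"))
                       .ToList();

// Filter first, then project.
var whereFirst = types.Where(x => x is int)
                      .Select(x => x.GetType().Name)
                      .ToList();

// OfType<int>() keeps only the ints and returns them typed as int.
var ints = types.OfType<int>().OrderBy(n => n).ToList();

Console.WriteLine(selectFirst.Count);            // 0
Console.WriteLine(string.Join(",", whereFirst)); // Int32,Int32,Int32
Console.WriteLine(string.Join(",", ints));       // 23,23,234
```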