var result = booklist.GroupJoin(authorlist, authorlist2,
x => x.ISBN,
y => y.AuthorId,
z => z.AuthorID,
(x, y, z) => new
{
x.Book_Name,
y.AuthorName,
z.AuthorName
}).GroupBy(od => new
{
od.Book_Name,
od.AuthorName
}).OrderBy(d => d.Key.Book_Name).Select(grp => new
{
AuthorName = grp.Key.AuthorName,
Book_Name = grp.Key.Book_Name,
});
foreach (var item in result)
{
Console.WriteLine(string.Format(" BookName:- {0}\n AuthorName:- {1}\n", item.Book_Name, item.AuthorName));
}
Here I used a lambda expression to print one book with two author names, but I get an error on GroupJoin. Please check and tell me where I made a mistake.
The error description says that the GroupJoin method has no overload that takes six parameters. GroupJoin can join only two lists, using two key selectors, one for each list.
So concatenate the two author lists into one first. After concatenation, the grouping will look like:
var result = booklist.GroupJoin(concatenatedAuthorlist,
left => left.AuthorId, // key from booklist
right => right.AuthorID, // key from concatenatedAuthorlist
(left, right) => new
{
left.Book_Name,
left.AuthorName,
right.AuthorName
}).
//omitted for brevity
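A minimal sketch of the concatenation step, assuming hypothetical Book and Author records (the original classes are not shown) and assuming each book carries the AuthorId to match on:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes; the original Book/Author classes are not shown.
public record Book(string Book_Name, int AuthorId);
public record Author(int AuthorID, string AuthorName);

public static class GroupJoinDemo
{
    public static List<string> Run()
    {
        var booklist = new List<Book>
        {
            new Book("LINQ Basics", 1),
            new Book("C# Notes", 2)
        };
        var authorlist = new List<Author> { new Author(1, "Ann") };
        var authorlist2 = new List<Author> { new Author(1, "Bob"), new Author(2, "Cy") };

        // Merge the two author lists first, then join once.
        var concatenatedAuthorlist = authorlist.Concat(authorlist2);

        return booklist
            .GroupJoin(concatenatedAuthorlist,
                book => book.AuthorId,      // key from booklist
                author => author.AuthorID,  // key from concatenatedAuthorlist
                (book, authors) => book.Book_Name + ": " +
                    string.Join(", ", authors.Select(a => a.AuthorName)))
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Run())
            Console.WriteLine(line);
    }
}
```

Each book appears once, followed by every author from either list whose AuthorID matches, so no second GroupBy is needed to collapse duplicates.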
I know that there are some similar questions to this one, but I have not been able to fix my issue.
I have the following method:
public IDictionary<string, int> CountXXX(Expression<Func<MessageStatus, bool>> whereFilter = null)
{
try
{
var trackingOpens = whereFilter != null ? _context.MessageStatus.Where(whereFilter).Join(_context.TrackingOpens, x => x.MessageId, g => g.MessageId, (x, g) => x) :
_context.MessageStatus.Join(_context.TrackingOpens, x => x.MessageId, g => g.MessageId, (x, g) => x);
return trackingOpens
.GroupBy(x => x.VariationId)
.Select(g => new { Variation = g.Key.ToString(), Count = g.Count() })
.ToDictionary(x => x.Variation, x => x.Count);
}
catch (Exception e)
{
throw new Exception($"There's been an error trying to count the tracking opens from the database. Details: {e.Message}", e);
}
}
I have the class MessageStatus with a VariationId property; I need to group the records by it and count each group. The problem is that I need to Join with TrackingOpens on MessageId. The LINQ statement here throws the exception mentioned in the title.
The whereFilter parameter is just another LINQ expression that goes inside the .Where clause.
TrackingOpens does not have the field VariationId and I cannot add it to that model
Can anyone help me fix this Linq statement?
I have a class like
public class Foo
{
public string X;
public string Y;
public int Z;
}
and the query I want to achieve is, given an IEnumerable<Foo> called foos,
"Group by X, then by Y, and choose the largest subgroup
from each supergroup; if there is a tie, choose the one with the
largest Z."
In other words, a not-so-compact solution would look like
var outer = foos.GroupBy(f => f.X);
foreach(var g1 in outer)
{
var inner = g1.GroupBy(g2 => g2.Y);
int maxCount = inner.Max(g3 => g3.Count());
var winners = inner.Where(g4 => g4.Count() == maxCount);
if(winners.Count() > 1)
{
yield return winners.MaxBy(w => w.Max(f => f.Z));
}
else
{
yield return winners.Single();
}
}
and a not-so-efficient solution would be like
from foo in foos
group foo by new { foo.X, foo.Y } into g
orderby g.Key.X, g.Count(), g.Max(f => f.Z)
. . . // can't figure the rest out
but ideally I'd like both compact and efficient.
You are enumerating the same enumerables repeatedly. Each re-enumeration executes the whole query again, which can significantly hurt performance in some cases.
Your not-so-compact code can be simplified to this:
foreach (var byX in foos.GroupBy(f => f.X))
{
yield return byX.GroupBy(f => f.Y, f => f, (_, byY) => byY.ToList())
.MaxBy(l => l.Count)
.MaxBy(f => f.Z);
}
Here is how it goes:
The items are grouped by X, hence the variable is named byX, which means every item in a byX enumerable has the same X.
Now you group these grouped items by Y. The variable named byY means every item in a byY enumerable has the same Y and also the same X.
Finally you select the largest list, i.e. the winners (MaxBy(l => l.Count)), and from the winners you select the item with the highest Z (MaxBy(f => f.Z)).
The reason I used byY.ToList() was to prevent the duplicate enumeration that would otherwise be caused by Count() and MaxBy().
Alternatively, you can change your entire iterator into a single return statement:
return foos.GroupBy(f => f.X, f => f, (_, byX) =>
byX.GroupBy(f => f.Y, f => f,(__, byY) => byY.ToList())
.MaxBy(l => l.Count)
.MaxBy(f => f.Z));
Based on the wording of your question I assume that you want the result to be an IEnumerable<IEnumerable<Foo>>. Elements are grouped by both X and Y so all elements in a specific inner sequence will have the same value for X and Y. Furthermore, every inner sequence will have different (unique) values for X.
Given the following data
X Y Z
-----
A p 1
A p 2
A q 1
A r 3
B p 1
B q 2
the resulting sequence of sequences should consist of two sequences (for X = A and X = B)
X Y Z
-----
A p 1
A p 2
X Y Z
-----
B q 2
You can get this result using the following LINQ expression:
var result = foos
.GroupBy(
outerFoo => outerFoo.X,
(x, xFoos) => xFoos
.GroupBy(
innerFoo => innerFoo.Y,
(y, yFoos) => yFoos)
.OrderByDescending(yFoos => yFoos.Count())
.ThenByDescending(yFoos => yFoos.Select(foo => foo.Z).Max())
.First());
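As a quick check, the query can be run against the sample rows from the table above. This is only a sketch: the class and method names LargestSubgroupDemo and Run are made up for illustration, and each inner group is materialized with ToList() (a small deviation from the expression above) so that it is not re-enumerated while sorting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public string X;
    public string Y;
    public int Z;
}

public static class LargestSubgroupDemo
{
    public static List<List<Foo>> Run()
    {
        // Sample rows copied from the table above.
        var foos = new List<Foo>
        {
            new Foo { X = "A", Y = "p", Z = 1 },
            new Foo { X = "A", Y = "p", Z = 2 },
            new Foo { X = "A", Y = "q", Z = 1 },
            new Foo { X = "A", Y = "r", Z = 3 },
            new Foo { X = "B", Y = "p", Z = 1 },
            new Foo { X = "B", Y = "q", Z = 2 },
        };

        return foos
            .GroupBy(
                outerFoo => outerFoo.X,
                (x, xFoos) => xFoos
                    .GroupBy(innerFoo => innerFoo.Y, (y, yFoos) => yFoos.ToList())
                    .OrderByDescending(yFoos => yFoos.Count)          // largest subgroup first
                    .ThenByDescending(yFoos => yFoos.Max(f => f.Z))   // ties: highest Z first
                    .First())
            .ToList();
    }

    public static void Main()
    {
        foreach (var group in Run())
            Console.WriteLine(string.Join("; ", group.Select(f => $"{f.X} {f.Y} {f.Z}")));
        // Prints:
        // A p 1; A p 2
        // B q 2
    }
}
```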
If you really care about performance you can most likely improve it at the cost of some complexity:
When picking the group with most elements or highest Z value two passes are performed over the elements in each group. First the elements are counted using yFoos.Count() and then the maximum Z value is computed using yFoos.Select(foo => foo.Z).Max(). However, you can do the same in one pass by using Aggregate.
Also, it is not necessary to sort all the groups to find the "largest" group. Instead a single pass over all the groups can be done to find the "largest" group again using Aggregate.
result = foos
.GroupBy(
outerFoo => outerFoo.X,
(x, xFoos) => xFoos
.GroupBy(
innerFoo => innerFoo.Y,
(y, yFoos) => new
{
Foos = yFoos,
Aggregate = yFoos.Aggregate(
(Count: 0, MaxZ: int.MinValue),
(accumulator, foo) =>
(Count: accumulator.Count + 1,
MaxZ: Math.Max(accumulator.MaxZ, foo.Z)))
})
.Aggregate(
new
{
Foos = Enumerable.Empty<Foo>(),
Aggregate = (Count: 0, MaxZ: int.MinValue)
},
(accumulator, grouping) =>
grouping.Aggregate.Count > accumulator.Aggregate.Count
|| grouping.Aggregate.Count == accumulator.Aggregate.Count
&& grouping.Aggregate.MaxZ > accumulator.Aggregate.MaxZ
? grouping : accumulator)
.Foos);
I am using a ValueTuple as the accumulator in Aggregate as I expect it to perform well. However, if you really want to know, you should measure.
You can pretty much ignore the outer grouping; what is left is just a slightly more advanced MaxBy, somewhat like sorting on two keys. If you implement that, you would end up with something like:
public IEnumerable<IGrouping<string, Foo>> GetFoo2(IEnumerable<Foo> foos)
{
return foos.GroupBy(f => f.X)
.Select(f => f.GroupBy(g => g.Y)
.MaxBy2(g => g.Count(), g => g.Max(m => m.Z)));
}
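MaxBy2 is not a built-in LINQ method and its implementation is not shown here. One possible sketch, assuming both keys are comparable and that the first element wins a complete tie, could look like this (the names MaxBy2Extensions and MaxBy2Demo are made up for illustration):

```csharp
using System;
using System.Collections.Generic;

public static class MaxBy2Extensions
{
    // Hypothetical helper: returns the element with the largest primary key,
    // breaking ties with the secondary key. Throws on an empty sequence.
    public static T MaxBy2<T, TKey1, TKey2>(
        this IEnumerable<T> source,
        Func<T, TKey1> primary,
        Func<T, TKey2> secondary)
        where TKey1 : IComparable<TKey1>
        where TKey2 : IComparable<TKey2>
    {
        using var e = source.GetEnumerator();
        if (!e.MoveNext())
            throw new InvalidOperationException("Sequence contains no elements.");

        var best = e.Current;
        var bestKey1 = primary(best);
        var bestKey2 = secondary(best);
        while (e.MoveNext())
        {
            var key1 = primary(e.Current);
            int cmp = key1.CompareTo(bestKey1);
            if (cmp < 0) continue;              // clearly worse, skip the secondary key
            var key2 = secondary(e.Current);
            if (cmp > 0 || key2.CompareTo(bestKey2) > 0)
            {
                best = e.Current;
                bestKey1 = key1;
                bestKey2 = key2;
            }
        }
        return best;
    }
}

public static class MaxBy2Demo
{
    public static void Main()
    {
        var pairs = new[] { (Count: 2, Z: 1), (Count: 3, Z: 0), (Count: 3, Z: 5), (Count: 1, Z: 9) };
        var best = pairs.MaxBy2(p => p.Count, p => p.Z);
        Console.WriteLine(best); // (3, 5)
    }
}
```

Each key selector is called at most once per element, so the Count() and Max() of a group are not recomputed on every comparison.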
It is questionable how much you can call this a LINQ approach, as you have moved all the functionality into a quite ordinary function. You can also implement the functionality with Aggregate. There are two options: with a seed and without a seed. I like the latter option:
public IEnumerable<IGrouping<string, Foo>> GetFoo3(IEnumerable<Foo> foos)
{
return foos.GroupBy(f => f.X)
.Select(f => f.GroupBy(g => g.Y)
.Aggregate((a, b) =>
a.Count() > b.Count() ? a :
a.Count() < b.Count() ? b :
a.Max(m => m.Z) >= b.Max(m => m.Z) ? a : b
));
}
The performance would suffer if Count() is not constant time, which is not guaranteed, but in my tests it worked fine. The variant with a seed would be more complicated, but may be faster if done right.
Thinking about this further, I realized your orderby could vastly simplify everything, though I am still not sure it is that understandable.
var ans = foos.GroupBy(f => f.X, (_, gXfs) => gXfs.GroupBy(gXf => gXf.Y).Select(gXgYfs => gXgYfs.ToList())
.OrderByDescending(gXgYfs => gXgYfs.Count).ThenByDescending(gXgYfs => gXgYfs.Max(gXgYf => gXgYf.Z)).First());
While it is possible to do this in LINQ, I don't find it any more compact or understandable if you make it into one statement when using query comprehension syntax:
var ans = from foo in foos
group foo by foo.X into foogX
let foogYs = (from foo in foogX
group foo by foo.Y into rfoogY
select rfoogY)
let maxYCount = foogYs.Max(y => y.Count())
let foogYsmZ = from fooY in foogYs
where fooY.Count() == maxYCount
select new { maxZ = fooY.Max(f => f.Z), fooY = from f in fooY select f }
let maxMaxZ = foogYsmZ.Max(y => y.maxZ)
select (from foogY in foogYsmZ where foogY.maxZ == maxMaxZ select foogY.fooY).First();
If you are willing to use lambda syntax, some things become easier and shorter, though not necessarily more understandable:
var ans = from foogX in foos.GroupBy(f => f.X)
let foogYs = foogX.GroupBy(f => f.Y)
let maxYCount = foogYs.Max(foogY => foogY.Count())
let foogYmCmZs = foogYs.Where(fooY => fooY.Count() == maxYCount).Select(fooY => new { maxZ = fooY.Max(f => f.Z), fooY })
let maxMaxZ = foogYmCmZs.Max(foogYmZ => foogYmZ.maxZ)
select foogYmCmZs.Where(foogYmZ => foogYmZ.maxZ == maxMaxZ).First().fooY.Select(y => y);
With lots of lambda syntax, you can go completely incomprehensible:
var ans = foos.GroupBy(f => f.X, (_, gXfs) => gXfs.GroupBy(gXf => gXf.Y).Select(gXgYf => new { fCount = gXgYf.Count(), maxZ = gXgYf.Max(f => f.Z), gXgYfs = gXgYf.Select(f => f) }))
.Select(fC_mZ_gXgYfs_s => {
var maxfCount = fC_mZ_gXgYfs_s.Max(fC_mZ_gXgYfs => fC_mZ_gXgYfs.fCount);
var fC_mZ_gXgYfs_mCs = fC_mZ_gXgYfs_s.Where(fC_mZ_gXgYfs => fC_mZ_gXgYfs.fCount == maxfCount).ToList();
var maxMaxZ = fC_mZ_gXgYfs_mCs.Max(fC_mZ_gXgYfs => fC_mZ_gXgYfs.maxZ);
return fC_mZ_gXgYfs_mCs.Where(fC_mZ_gXgYfs => fC_mZ_gXgYfs.maxZ == maxMaxZ).First().gXgYfs;
});
(I modified this third possibility to reduce repetitive calculations and be more DRY, but that did make it a bit more verbose.)
I am developing an ASP.NET MVC website and am looking for a way to improve this routine. It can be improved either at the LINQ level or at the SQL Server level. Ideally it would be done within one query call.
Here are the tables involved and some example data:
We have no constraint that every Key has to have a value for each LanguageId, and indeed the business logic does not allow such a constraint. However, at the application level, we want to warn the admin that a key is missing one or more language values. So I have this class and query:
public class LocalizationKeyWithMissingCodes
{
public string Key { get; set; }
public IEnumerable<string> MissingCodes { get; set; }
}
This method gets the Key list, as well as any missing codes (for example, if we have the en + jp + ch language codes, and the key only has values for en + ch, the list will contain jp):
public IEnumerable<LocalizationKeyWithMissingCodes> GetAllKeysWithMissingCodes()
{
var languageList = Utils.ResolveDependency<ILanguageRepository>().GetActive();
var languageIdList = languageList.Select(q => q.Id);
var languageIdDictionary = languageList.ToDictionary(q => q.Id);
var keyList = this.GetActive()
.Select(q => q.Key)
.Distinct();
var result = new List<LocalizationKeyWithMissingCodes>();
foreach (var key in keyList)
{
// Get missing codes
var existingCodes = this.Get(q => q.Active && q.Key == key)
.Select(q => q.LanguageId);
// ToList to make sure it is processed at application
var missingLangId = languageList.Where(q => !existingCodes.Contains(q.Id))
.ToList();
result.Add(new LocalizationKeyWithMissingCodes()
{
Key = key,
MissingCodes = missingLangId
.Select(q => languageIdDictionary[q.Id].Code),
});
}
result = result.OrderByDescending(q => q.MissingCodes.Count() > 0)
.ThenBy(q => q.Key)
.ToList();
return result;
}
I think my current solution is not good, because it makes a query call for each key. Is there a way to improve it, either by making it faster or by packing it into one query call?
EDIT: This is the final query of the answer:
public IQueryable<LocalizationKeyWithMissingCodes> GetAllKeysWithMissingCodes()
{
var languageList = Utils.ResolveDependency<ILanguageRepository>().GetActive();
var localizationList = this.GetActive();
return localizationList
.GroupBy(q => q.Key, (key, items) => new LocalizationKeyWithMissingCodes()
{
Key = key,
MissingCodes = languageList
.GroupJoin(
items,
lang => lang.Id,
loc => loc.LanguageId,
(lang, loc) => loc.Any() ? null : lang)
.Where(q => q != null)
.Select(q => q.Code)
}).OrderByDescending(q => q.MissingCodes.Count() > 0) // Show the missing keys on the top
.ThenBy(q => q.Key);
}
Another possibility, using LINQ:
public IEnumerable<LocalizationKeyWithMissingCodes> GetAllKeysWithMissingCodes(
List<Language> languages,
List<Localization> localizations)
{
return localizations
.GroupBy(x => x.Key, (key, items) => new LocalizationKeyWithMissingCodes
{
Key = key,
MissingCodes = languages
.GroupJoin( // check if there is one or more match for each language
items,
x => x.Id,
y => y.LanguageId,
(x, ys) => ys.Any() ? null : x)
.Where(x => x != null) // eliminate all languages with a match
.Select(x => x.Code) // grab the code
})
.Where(x => x.MissingCodes.Any()); // eliminate all complete keys
}
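To see what the GroupJoin version produces, here is a small in-memory sketch using the en + jp + ch example from the question. The Language and Localization records and the sample rows are assumptions made for illustration; the real entity classes are not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes, reduced to the fields the query uses.
public record Language(int Id, string Code);
public record Localization(string Key, int LanguageId);

public static class MissingCodesDemo
{
    public static Dictionary<string, List<string>> Run()
    {
        var languages = new List<Language>
        {
            new Language(1, "en"), new Language(2, "jp"), new Language(3, "ch")
        };
        var localizations = new List<Localization>
        {
            new Localization("hello", 1), new Localization("hello", 3),
            new Localization("bye", 1), new Localization("bye", 2), new Localization("bye", 3)
        };

        return localizations
            .GroupBy(loc => loc.Key, (key, items) => new
            {
                Key = key,
                MissingCodes = languages
                    .GroupJoin(items, lang => lang.Id, loc => loc.LanguageId,
                        (lang, matches) => matches.Any() ? null : lang)
                    .Where(lang => lang != null)   // keep only languages with no match
                    .Select(lang => lang.Code)
                    .ToList()
            })
            .Where(x => x.MissingCodes.Any())      // drop keys that are complete
            .ToDictionary(x => x.Key, x => x.MissingCodes);
    }

    public static void Main()
    {
        foreach (var pair in Run())
            Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
        // Prints:
        // hello: jp
    }
}
```

This runs as LINQ to Objects; whether a given ORM provider can translate the same shape into a single SQL statement is something to verify against the provider in use.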
Here is the SQL logic to identify the keys that are missing "complete" language assignments:
SELECT
[all].[Key],
[all].LanguageId
FROM
(
SELECT
loc.[Key],
lang.LanguageId
FROM
Language lang
FULL OUTER JOIN
Localization loc
ON (1 = 1)
WHERE
lang.Active = 1
) [all] -- ALL is a reserved keyword, so the alias must be bracketed
LEFT JOIN
Localization loc
ON (loc.[Key] = [all].[Key])
AND (loc.LanguageId = [all].LanguageId)
WHERE
loc.[Key] IS NULL;
To see all keys (instead of filtering):
SELECT
[all].[Key],
[all].LanguageId,
CASE WHEN loc.[Key] IS NULL THEN 1 ELSE 0 END AS Flagged
FROM
(
SELECT
loc.[Key],
lang.LanguageId
FROM
Language lang
FULL OUTER JOIN
Localization loc
ON (1 = 1)
WHERE
lang.Active = 1
) [all] -- ALL is a reserved keyword, so the alias must be bracketed
LEFT JOIN
Localization loc
ON (loc.[Key] = [all].[Key])
AND (loc.LanguageId = [all].LanguageId);
Your code seems to be doing a lot of database querying and materialization.
In terms of LINQ, a single query would look like this.
We take the Cartesian product of the language and localization tables to get all combinations of (key, code), and then subtract the (key, code) tuples that exist in the relationship. This gives us the (key, code) combinations that don't exist.
var result = context.Languages.Join(context.Localizations, lang => true,
loc => true, (lang, loc) => new { Key = loc.Key, Code = lang.Code })
.Except(context.Languages.Join(context.Localizations, lang => lang.Id,
loc => loc.LanguageId, (lang, loc) => new { Key = loc.Key, Code = lang.Code }))
.GroupBy(r => r.Key).Select(r => new LocalizationKeyWithMissingCodes
{
Key = r.Key,
MissingCodes = r.Select(kc => kc.Code).ToList()
})
.ToList()
.OrderByDescending(lkmc => lkmc.MissingCodes.Count())
.ThenBy(lkmc => lkmc.Key).ToList();
P.S. I typed this LINQ query on the go, so let me know if it has syntax issues.
The gist of the query is that we take a Cartesian product and subtract the matching rows.
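The same "product minus matches" idea can be tried in memory. The sketch below is an illustration under assumptions: the Language and Localization records and the sample rows are made up, and the cross join is written with SelectMany rather than the Join on a constant key used in the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes, reduced to the fields the query uses.
public record Language(int Id, string Code);
public record Localization(string Key, int LanguageId);

public static class ExceptDemo
{
    public static List<string> Run()
    {
        var languages = new List<Language>
        {
            new Language(1, "en"), new Language(2, "jp"), new Language(3, "ch")
        };
        var localizations = new List<Localization>
        {
            new Localization("hello", 1), new Localization("hello", 3),
            new Localization("bye", 1), new Localization("bye", 2), new Localization("bye", 3)
        };
        var keys = localizations.Select(loc => loc.Key).Distinct().ToList();

        // Cartesian product: every (key, code) combination that could exist.
        var allPairs = languages.SelectMany(
            lang => keys, (lang, key) => new { Key = key, lang.Code });

        // Combinations that actually exist.
        var existing = localizations.Join(
            languages, loc => loc.LanguageId, lang => lang.Id,
            (loc, lang) => new { loc.Key, lang.Code });

        // Anonymous types compare by value, so Except removes the matching pairs.
        return allPairs.Except(existing)
            .Select(p => p.Key + ":" + p.Code)
            .ToList();
    }

    public static void Main()
    {
        foreach (var missing in Run())
            Console.WriteLine(missing); // hello:jp
    }
}
```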
Given the class
public class Article
{
public string Title { get; set; }
public List<string> Tags { get; set; }
}
and
List<Article> articles;
How can I create a "map" from individual tags (that may be associated with 1 or more articles) with Linq?
Dictionary<string, List<Article>> articlesPerTag;
I know that I can select all of the tags like this
var allTags = articles.SelectMany(a => a.Tags);
However, I'm not sure how to associate back from each selected tag to the article it originated from.
I know I can write this conventionally along the lines of
Dictionary<string, List<Article>> map = new Dictionary<string, List<Article>>();
foreach (var a in articles)
{
foreach (var t in a.Tags)
{
List<Article> articlesForTag;
bool found = map.TryGetValue(t, out articlesForTag);
if (found)
articlesForTag.Add(a);
else
map.Add(t, new List<Article>() { a });
}
}
but I would like to understand how to accomplish this with Linq.
If you specifically need it as a dictionary from tags to articles, you could use something like this.
var map = articles.SelectMany(a => a.Tags.Select(t => new { t, a }))
.GroupBy(x => x.t, x => x.a)
.ToDictionary(g => g.Key, g => g.ToList());
Though it would be more efficient to use a lookup instead; that is precisely what you are trying to build.
var lookup = articles.SelectMany(a => a.Tags.Select(t => new { t, a }))
.ToLookup(x => x.t, x => x.a);
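A short usage sketch of the lookup, with two made-up articles (the class name LookupDemo and the sample data are assumptions for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Article
{
    public string Title { get; set; }
    public List<string> Tags { get; set; }
}

public static class LookupDemo
{
    public static ILookup<string, Article> Build(IEnumerable<Article> articles)
    {
        // Pair every tag with its article, then index the pairs by tag.
        return articles
            .SelectMany(a => a.Tags.Select(t => new { t, a }))
            .ToLookup(x => x.t, x => x.a);
    }

    public static void Main()
    {
        var articles = new List<Article>
        {
            new Article { Title = "Intro to LINQ", Tags = new List<string> { "linq", "csharp" } },
            new Article { Title = "Async basics", Tags = new List<string> { "csharp" } }
        };
        var lookup = Build(articles);

        foreach (var article in lookup["csharp"])
            Console.WriteLine(article.Title);

        // Unlike a Dictionary, indexing a missing key yields an empty sequence.
        Console.WriteLine(lookup["missing"].Any()); // False
    }
}
```

The empty-sequence behavior for unknown tags is one practical reason to prefer ILookup over Dictionary<string, List<Article>> here, since callers do not need TryGetValue.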
One more way using GroupBy. A bit complicated though.
articles.SelectMany(article => article.Tags)
.Distinct()
.GroupBy(tag => tag, tag => articles.Where(a => a.Tags.Contains(tag)))
.ToDictionary(group => group.Key,
group => group.ToList().Aggregate((x, y) => x.Concat(y).Distinct()));
I'm trying to write a LINQ GroupJoin, and I receive the aforementioned error. This is the code:
public Dictionary<string, List<QuoteOrderline>> GetOrderlines(List<string> quoteNrs)
{
var quoteHeadersIds = portalDb.nquote_orderheaders
.Where(f => quoteNrs.Contains(f.QuoteOrderNumber))
.Select(f => f.ID).ToList();
List<nquote_orderlines> orderlines = portalDb.nquote_orderlines
.Where(f => quoteHeadersIds.Contains(f.QuoteHeaderID))
.ToList();
var toRet = quoteNrs
.GroupJoin(orderlines, q => q, o => o.QuoteHeaderID, (q => o) => new
{
quoteId = q,
orderlines = o.Select(g => new QuoteOrderline()
{
Description = g.Description,
ExtPrice = g.UnitPrice * g.Qty,
IsInOrder = g.IsInOrder,
PartNumber = g.PartNo,
Price = g.UnitPrice,
ProgramId = g.ProgramId,
Quantity = (int)g.Qty,
SKU = g.SKU
}).ToList()
});
}
I suspect this is the immediate problem:
(q => o) => new { ... }
I suspect you meant:
(q, o) => new { ... }
In other words, "here's a function taking a query and an order, and returning an anonymous type". The first syntax simply doesn't make sense - even thinking about higher-order functions, you'd normally have q => o => ... rather than (q => o) => ....
Now that won't be enough on its own... because GroupJoin doesn't return a dictionary. (Indeed, you don't even have a return statement yet.) You'll need a ToDictionary call after that. Alternatively, it may well be more appropriate to return an ILookup<string, QuoteOrderline> via ToLookup.
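Putting both fixes together, a simplified in-memory sketch might look like the following. The OrderLine record is a stand-in for the real entity, and it assumes each line carries the quote number as a string key, which may not match the actual schema.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the real order line entity.
public record OrderLine(string QuoteNumber, string PartNo, decimal UnitPrice, int Qty);

public static class OrderlineDemo
{
    public static Dictionary<string, List<OrderLine>> GetOrderlines(
        List<string> quoteNrs, List<OrderLine> orderlines)
    {
        return quoteNrs
            .GroupJoin(orderlines,
                q => q,
                o => o.QuoteNumber,
                (q, o) => new { quoteId = q, orderlines = o.ToList() }) // (q, o), not (q => o)
            .ToDictionary(x => x.quoteId, x => x.orderlines);           // GroupJoin alone is not a dictionary
    }

    public static void Main()
    {
        var lines = new List<OrderLine>
        {
            new OrderLine("Q1", "P-100", 2.50m, 4),
            new OrderLine("Q1", "P-200", 1.00m, 1),
            new OrderLine("Q3", "P-300", 9.99m, 2)
        };
        var result = GetOrderlines(new List<string> { "Q1", "Q2" }, lines);

        foreach (var pair in result)
            Console.WriteLine($"{pair.Key}: {pair.Value.Count} line(s)");
        // Prints:
        // Q1: 2 line(s)
        // Q2: 0 line(s)
    }
}
```

Note that ToDictionary throws if quoteNrs contains the same quote number twice; calling Distinct() first, or switching to ToLookup, avoids that.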