Using LINQ Lambda expression determining value by group by and where condition - c#

So I have the following data table:
Region Class
Reg100 A
Reg100 B
Reg200 A
Reg300 B
I want to find the regions that have both classes A and B. In this case it would be Reg100. How could I write this using a lambda expression?
I have tried something like the following, but I am not getting what I want.
dt.Where(x => x.Class.Contains(listOfAandB).GroupBy(x=>x.Region).FirstOrDefault()

Let's define the input:
class Data
{
public string Region;
public string Class;
}
var dt = new[]
{
new Data { Region = "Reg100", Class = "A" },
new Data { Region = "Reg100", Class = "B" },
new Data { Region = "Reg200", Class = "A" },
new Data { Region = "Reg300", Class = "B" },
};
Now, using GroupBy we can group the input by Region.
dt.GroupBy(x => x.Region)
This yields { Reg100 (A, B), Reg200 (A), Reg300 (B) }. Now we keep only the groups that contain both A and B:
dt.GroupBy(x => x.Region)
.Where(g => g.Any(x => x.Class == "A") && g.Any(x => x.Class == "B"))
And finally as we are only interested in the region, we project to it:
dt.GroupBy(x => x.Region)
.Where(g => g.Any(x => x.Class == "A") && g.Any(x => x.Class == "B"))
.Select(g => g.Key);
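As a self-contained sketch, the steps above can be combined and generalized to any number of required classes. The `required` array and the use of anonymous objects instead of the `Data` class are assumptions made for illustration; the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;

// Same sample rows as above, as anonymous objects to keep the sketch short.
var dt = new[]
{
    new { Region = "Reg100", Class = "A" },
    new { Region = "Reg100", Class = "B" },
    new { Region = "Reg200", Class = "A" },
    new { Region = "Reg300", Class = "B" },
};

// Hypothetical list of classes that a region must contain.
var required = new[] { "A", "B" };

// Keep the groups in which every required class occurs at least once.
var regions = dt
    .GroupBy(x => x.Region)
    .Where(g => required.All(r => g.Any(x => x.Class == r)))
    .Select(g => g.Key)
    .ToList();

Console.WriteLine(string.Join(", ", regions)); // prints "Reg100"
```

Using `required.All(...)` avoids writing one `g.Any(...)` per class when the list of classes grows.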

Related

SQLite and LINQ: find all objects that have a sub list with all ids present in a supplied list of IDs

I have the following class:
public class Article
{
long Id;
List<Category> Categories;
}
I am using EF Core 5, and what I need is a LINQ query against SQLite that returns all the articles that have all the categories that I specify.
I tried the following code:
List<long> cIds = c.Select (x => x.Id).ToList ();
query.Where (art => cIds.All (cId => art.Categories.Select (c => c.Id).Contains (cId)));
but at runtime I get
InvalidOperationException: The LINQ expression 'DbSet<Article>()
.Where(a => __cIds_0
.All(cId => DbSet<Dictionary<string, object>>("ArticleCategory")
.Where(a0 => EF.Property<Nullable<long>>(a, "Id") != null && object.Equals(
objA: (object)EF.Property<Nullable<long>>(a, "Id"),
objB: (object)EF.Property<Nullable<long>>(a0, "ArticlesId")))
.Join(
inner: DbSet<Category>(),
outerKeySelector: a0 => EF.Property<Nullable<long>>(a0, "CategoriesId"),
innerKeySelector: c => EF.Property<Nullable<long>>(c, "Id"),
resultSelector: (a0, c) => new TransparentIdentifier<Dictionary<string, object>, Category>(
Outer = a0,
Inner = c
))
.Select(ti => ti.Inner.Id)
.Any(p => p == cId)))' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.
How can I obtain it?
A possible workaround I found is the following:
List<long> cIds = c.Select (x => x.Id).ToList ();
query = query.Where (art => art.Categories.Select (c => c.Id).Any (x => cIds.Contains (x)));
query = query.Include (x => x.Categories);
result = await query.ToListAsync ();
result = result.Where (art => cIds.All (cId => art.Categories.Select (c => c.Id).Contains (cId))).ToList ();
But I was wondering if I could obtain the same result with a single LINQ query.
Thanks in advance
UPDATE:
I'll add the function where this code will be used and give an example to make things clearer:
This is the function where the code will be used:
public async Task<List<Article>> SearchAsync (string search, Section s, Website w,
List<Category> c)
{
List<Article> result = new List<Article> ();
if (
search == ""
&& s == null
&& w == null
&& c.Count == 0
)
return result;
IQueryable<Article> query = dbSet.AsQueryable ();
if (search != "")
query = query.Where (x => x.Title.Contains (search) || x.Summary.Contains (search));
if (s != null)
query = query.Where (x => x.SectionId == s.Id);
if (w != null)
query = query.Where (x => x.WebsiteId == w.Id);
if (c.Count > 0)
{
List<long> cIds = c.Select (x => x.Id).ToList ();
query = query.Where (art => art.Categories.Select (c => c.Id).Any (x => cIds.Contains (x)));
}
query = query.Include (x => x.Categories);
result = await query.ToListAsync ();
if (c.Count > 0)
{
List<long> cIds = c.Select (x => x.Id).ToList ();
result = result.Where (art => cIds.All (cId => art.Categories.Select (c => c.Id).Contains (cId))).ToList ();
}
return result;
}
And here is an example:
Let's say c will contain ids 9,10,11 and the articles collection is the following pseudo code:
List<Article> articles = new List<Article> ()
{
new Article () {Id = 1, Categories = "12,44,55"},
new Article () {Id = 2, Categories = "7,8,9,10,11"},
new Article () {Id = 3, Categories = "9,10,11"}
};
The LINQ query should return the articles with Id 2 and 3, because both contain all of the ids present in c.
One solution uses Intersect, but we first have to prepare the data for the intersection.
// articles query
var query = ...
var cIds = c.Select(x => x.Id).ToList();
var idsCount = cIds.Count();
// translating list of IDs to IQueryable
var categoryIdsQuery = dbContext.Categories
.Where(c => cIds.Contains(c.Id))
.Select(c => c.Id);
query = query
.Where(art => art.Categories
.Select(c => c.Id)
.Intersect(categoryIdsQuery)
.Count() == idsCount
)
.Include(x => x.Categories);
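The "intersect and compare counts" idea can be checked in memory with the sample data from the question. This is only a LINQ to Objects sketch: the anonymous objects and the `CategoryIds` property are hypothetical stand-ins for the entities, and it does not show whether EF Core translates the query for SQLite.

```csharp
using System;
using System.Linq;

// Sample data from the question: article Id plus its category ids.
var articles = new[]
{
    new { Id = 1L, CategoryIds = new long[] { 12, 44, 55 } },
    new { Id = 2L, CategoryIds = new long[] { 7, 8, 9, 10, 11 } },
    new { Id = 3L, CategoryIds = new long[] { 9, 10, 11 } },
};

// Distinct() guards against duplicate ids, which would otherwise
// make the count comparison below fail.
var cIds = new long[] { 9, 10, 11 }.Distinct().ToList();

// An article matches when the intersection of its categories with cIds
// has as many elements as cIds itself.
var matchingIds = articles
    .Where(art => art.CategoryIds.Intersect(cIds).Count() == cIds.Count)
    .Select(art => art.Id)
    .ToList();

Console.WriteLine(string.Join(", ", matchingIds)); // prints "2, 3"
```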
What I need is a LINQ query against SQLite that returns all the articles that have all the categories that I specify.
So you have a sequence of Category Ids and you want all Articles, each Article with only the Categories that are in your sequence of Category Ids.
I'm not sure what your variable 'c' is, but it seems to me that the following statement returns the Ids of all c:
List<long> cIds = c.Select (x => x.Id).ToList ();
If c is your sequence of Categories, then you will have the Ids of all existing categories. This will mean that you will have all Articles, each with all Categories.
If you have a local sequence of Category Ids, with a limited count (say about 250), then you should use Contains:
IEnumerable<long> categoryIds = ...
var articlesWithTheseCategories = dbContext.Articles.Select(article => new
{
Id = article.Id,
Categories = article.Categories
.Where(category => categoryIds.Contains(category.Id))
.ToList(),
});
So if you have CategoryIds 2, 3, and 12, this query will give you all Articles with only the Categories with ids 2, 3, 12.
If Article 40 has only Categories 20, 21, 22, then Article 40 will be in your result, but it will have an empty Categories list.
If you don't have your Category Ids locally, but you have a predicate to select the Category Ids, then your query will be like:
IQueryable<long> categoryIds = dbContext.Categories
.Where(category => category.Status == StatusCode.Obsolete) // predicate
.Select(category => category.Id);
var articlesWithTheseCategories = dbContext.Articles.Select(article => new
{
Id = article.Id,
Categories = article.Categories
.Where(category => categoryIds.Contains(category.Id))
.ToList(),
});
Because your first query is an IQueryable<...> it is not executed yet. If you want you can make it one big statement:
var articlesWithTheseCategories = dbContext.Articles.Select(article => new
{
Id = article.Id,
Categories = article.Categories
.Where(category => dbContext.Categories
.Where(c => c.Status == StatusCode.Obsolete)
.Select(c => c.Id)
.Contains(category.Id))
.ToList(),
});
Although this will not improve efficiency, it surely deteriorates readability.

optimize the comparison in two lists with LINQ

I have two lists of objects:
Customer and Employee
I need to check whether there is at least one client with the same name as an employee.
Currently I have:
client.ForEach(a =>
{
if (employee.Any(m => m.Name == a.Name && m.FirstName == a.FirstName))
{
// OK TRUE
}
});
Can I improve readability by doing it another way?
Why not check it beforehand using Join?
var mergedClients = Client.Join(listSFull,
x => new { x.Name, x.FirstName},
y => new { Name = y.Name, FirstName= y.FirstName},
(x, y) => new { x, y }).ToList();
and then iterate over the new collection:
mergedClients.ForEach(a =>
{
// your logic
});
Only disadvantage of this approach (if it bothers you) is that null values will not be included.
I would go either with Join
var isDuplicated = clients.Join(employees,
c => new { c.Name, c.FirstName },
e => new { e.Name, e.FirstName },
(c, e) => new { c, e })
.Any();
or Intersect
var clientNames = clients.Select(c => new { c.Name, c.FirstName });
var employeeNames = employees.Select(e => new { e.Name, e.FirstName });
var isDuplicated = clientNames.Intersect(employeeNames).Any();
Both Join and Intersect use hashing and run in close to O(n) time.
Note: equality (and the hash code) of anonymous objects (new { , }) is evaluated by value. That is, two anonymous objects of the same type are equal (and therefore have the same hash code) when all their properties are equal.
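A minimal sketch of that equality behaviour, with made-up names, using top-level statements (C# 9 or later):

```csharp
using System;

// Two separate instances of the same anonymous type with equal property values.
var a = new { Name = "Smith", FirstName = "Ann" };
var b = new { Name = "Smith", FirstName = "Ann" };

Console.WriteLine(a.Equals(b));                        // True: all properties are equal
Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True: equal objects share a hash code
Console.WriteLine(object.ReferenceEquals(a, b));       // False: they are distinct instances
```

This is why the composite keys in Join and the projected names in Intersect match by value even though each key is a new object.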
=== EDIT: Ok, I was interested myself (hope your question was about performance :P)
[TestMethod]
public void PerformanceTest()
{
var random = new Random();
var clients = Enumerable.Range(0, 10000)
.Select(_ => new Person { FirstName = $"{random.Next()}",
LastName = $"{random.Next()}" })
.ToList();
var employees = Enumerable.Range(0, 10000)
.Select(_ => new Person { FirstName = $"{random.Next()}",
LastName = $"{random.Next()}" })
.ToList();
var joinElapsedMs = MeasureAverageElapsedMs(() =>
{
var isDuplicated = clients.Join(employees,
c => new { c.FirstName, c.LastName },
e => new { e.FirstName, e.LastName },
(c, e) => new { c, e })
.Any();
});
var intersectElapsedMs = MeasureAverageElapsedMs(() =>
{
var clientNames = clients.Select(c => new { c.FirstName, c.LastName });
var employeeNames = employees.Select(e => new { e.FirstName, e.LastName });
var isDuplicated = clientNames.Intersect(employeeNames).Any();
});
var anyAnyElapsedMs = MeasureAverageElapsedMs(() =>
{
var isDuplicated = clients.Any(c => employees.Any(
e => c.FirstName == e.FirstName && c.LastName == e.LastName));
});
Console.WriteLine($"{nameof(joinElapsedMs)}: {joinElapsedMs}");
Console.WriteLine($"{nameof(intersectElapsedMs)}: {intersectElapsedMs}");
Console.WriteLine($"{nameof(anyAnyElapsedMs)}: {anyAnyElapsedMs}");
}
private static double MeasureAverageElapsedMs(Action action) =>
Enumerable.Range(0, 10).Select(_ => MeasureElapsedMs(action)).Average();
private static long MeasureElapsedMs(Action action)
{
var stopWatch = Stopwatch.StartNew();
action();
return stopWatch.ElapsedMilliseconds;
}
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
Output:
joinElapsedMs: 5.9
intersectElapsedMs: 3.5
anyAnyElapsedMs: 3185.8
Note: any-any is O(n^2): in the worst case, every employee is iterated once for each client.

Returning a LINQ database query from a Method

Hello everyone I have this query I am performing in multiple places. Instead of retyping the query over and over, I would like to be able to call a method that returns the query. I am not sure what to put as the return type for the method or if this is even possible to do. I use the query to write a csv file of the information, and I use the query to add items to my observable collection that is bound to a list view.
using (ProjectTrackingDBEntities context = new ProjectTrackingDBEntities())
{
var result = context.TimeEntries.Where(Entry => Entry.Date >= FilterProjectAfterDate
&& Entry.Date <= FilterProjectBeforerDate
&& (FilterProjectName != null ? Entry.ProjectName.Contains(FilterProjectName) : true))
.GroupBy(m => new { m.ProjectName, m.Phase })
.Join(context.Projects, m => new { m.Key.ProjectName, m.Key.Phase }, w => new { w.ProjectName, w.Phase }, (m, w) => new { te = m, proj = w })
.Select(m => new
{
Name = m.te.Key.ProjectName,
Phase = m.te.Key.Phase,
TimeWorked = m.te.Sum(w => w.TimeWorked),
ProposedCompletionDate = m.proj.ProposedCompletionDate,
ActualCompletionDate = m.proj.ActualCompletionDate,
Active = m.proj.Active,
StartDate = m.proj.StartDate,
Description = m.proj.Description,
EstimatedHours = m.proj.EstimatedHours
});
}
I am able to do both right now by retyping the query and performing the subsequent foreach() loops on the data. I would rather be able to do something like:
var ReturnedQuery = GetProjectsQuery();
foreach(var item in ReturnedQuery)
{
//do stuff
}
Any help would be appreciated.
You want to return IQueryable<T> with a known model type that represents what you are returning; you should not return an anonymous type. You also want to pass in the DbContext so that the caller disposes of it rather than the method. Otherwise, enumerating the query throws an exception saying that the DbContext has been disposed.
For example:
public IQueryable<ProjectModel> GetProjectsQuery(ProjectTrackingDBEntities context) {
return context.TimeEntries.Where(Entry => Entry.Date >= FilterProjectAfterDate
&& Entry.Date <= FilterProjectBeforerDate
&& (FilterProjectName != null ? Entry.ProjectName.Contains(FilterProjectName) : true))
.GroupBy(m => new { m.ProjectName, m.Phase })
.Join(context.Projects, m => new { m.Key.ProjectName, m.Key.Phase }, w => new { w.ProjectName, w.Phase }, (m, w) => new { te = m, proj = w })
.Select(m => new ProjectModel
{
Name = m.te.Key.ProjectName,
Phase = m.te.Key.Phase,
TimeWorked = m.te.Sum(w => w.TimeWorked),
ProposedCompletionDate = m.proj.ProposedCompletionDate,
ActualCompletionDate = m.proj.ActualCompletionDate,
Active = m.proj.Active,
StartDate = m.proj.StartDate,
Description = m.proj.Description,
EstimatedHours = m.proj.EstimatedHours
});
}
ProjectModel.cs
public class ProjectModel {
public string Name {get;set;}
public string Phase {get;set;}
// rest of properties
}
Calling code
using (ProjectTrackingDBEntities context = new ProjectTrackingDBEntities())
{
var ReturnedQuery = GetProjectsQuery(context);
foreach(var item in ReturnedQuery)
{
//do stuff
}
}
It is easy to return an enumerable, but unfortunately a method cannot declare an anonymous type as its return type. Probably the easiest path forward is to return the full row objects, like this:
public IEnumerable<TimeEntries> GetTimeEntries()
{
using (ProjectTrackingDBEntities context = new ProjectTrackingDBEntities())
{
// Return the full rows; grouping, joining and projecting to an
// anonymous type are left to the caller.
return context.TimeEntries
.Where
(
Entry =>
Entry.Date >= FilterProjectAfterDate &&
Entry.Date <= FilterProjectBeforerDate &&
(FilterProjectName != null ? Entry.ProjectName.Contains(FilterProjectName) : true)
)
.ToList(); // materialize before the context is disposed
}
}
And use it like this:
var rows = GetTimeEntries();
foreach (var row in rows.Select(m => new { Name = m.ProjectName }))
{
Console.WriteLine(row.Name);
}

Linq command to group by date - how to group

I am trying to create an IQueryable method that returns the number of connections to a service for each day. This data is read from a SQL Server database.
Here is the ConnectionItem class:
public class ConnectionItem
{
public DateTime CreatedDate { get; set; }
public int NumberOfConnections { get; set; }
}
And here is my IQueryable method:
private IQueryable<ConnectionItem> ListItems(DataContext dataContext)
{
return dataContext.Connections
.Join(dataContext.Configurations,
connections => connections.ConfigID,
config => config.ConfigID,
(connections, config) => new { cx = connections, cf = config })
.Join(dataContext.Users,
config => config.cf.UserID,
users => users.UserID,
(config, users) => new { cf = config, su = users})
.Where(q => q.su.AccountEventID == 123 && q.cf.cx.Successful == true)
.GroupBy(g => g.cf.cx.CreatedDate.ToShortDateString())
.Select(s => new ConnectionItem
{
CreatedDate = ????,
NumberOfConnections = ????
});
}
How do I access the grouped date value and the number of items per group?
Also, is there an easier way to write this kind of statement? I am not 100% sure how the aliases cx, cf, etc. work.
Any input is appreciated.
Group by the Date portion of the DateTime objects. The Date property simply drops the time part. You're converting your dates to strings so you're losing the fidelity of a DateTime object.
var eventId = 123;
return dataContext.Connections.Join(dataContext.Configurations,
conn => conn.ConfigID,
cfg => cfg.ConfigID,
(conn, cfg) => new { conn, cfg })
.Join(dataContext.Users,
x => x.cfg.UserID,
u => u.UserID,
(x, u) => new { x.conn, u })
.Where(x => x.conn.Successful && x.u.AccountEventID == eventId)
.GroupBy(x => x.conn.CreatedDate.Date)
.Select(g => new ConnectionItem
{
CreatedDate = g.Key,
NumberOfConnections = g.Count(),
});
The above could be more nicely expressed using query syntax.
var eventId = 123;
return
from conn in dataContext.Connections
join cfg in dataContext.Configurations on conn.ConfigID equals cfg.ConfigID
join u in dataContext.Users on cfg.UserID equals u.UserID
where conn.Successful && u.AccountEventID == eventId
group 1 by conn.CreatedDate.Date into g
select new ConnectionItem
{
CreatedDate = g.Key,
NumberOfConnections = g.Count(),
};
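To see the grouping on its own, here is a small in-memory sketch. The timestamps are hypothetical, and the joins and filter are left out so that only the `GroupBy` on `Date` and the `Count()` remain.

```csharp
using System;
using System.Linq;

// Hypothetical connection timestamps; the first two fall on the same day.
var createdDates = new[]
{
    new DateTime(2020, 1, 1, 8, 30, 0),
    new DateTime(2020, 1, 1, 17, 45, 0),
    new DateTime(2020, 1, 2, 9, 0, 0),
};

// Group by the Date part (time set to midnight) and count each group.
var perDay = createdDates
    .GroupBy(d => d.Date)
    .Select(g => new { CreatedDate = g.Key, NumberOfConnections = g.Count() })
    .OrderBy(x => x.CreatedDate)
    .ToList();

// Two groups result: one with 2 connections, one with 1.
foreach (var item in perDay)
    Console.WriteLine($"{item.CreatedDate:yyyy-MM-dd}: {item.NumberOfConnections}");
```

Because the key stays a `DateTime`, it can be assigned directly to `ConnectionItem.CreatedDate`, which a string key from `ToShortDateString()` could not.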
The GroupBy LINQ method returns a sequence of IGrouping<TKey, TElement> objects; each one is basically an enumerable of the grouped items with a Key property holding the value you grouped by.
So here
Select(s => new ConnectionItem
{
CreatedDate = ????,
NumberOfConnections = ????
});
You're iterating through an IEnumerable<IGrouping<TKey, TElement>>, so you can do this:
Select(s => new ConnectionItem
{
CreatedDate = s.Key,
NumberOfConnections = s.Count()
});
Edited as per the comment: I realized you're looking for the number, not an actual list.
Just use s.Key and s.Count(). You can get it like this:
private IQueryable<ConnectionItem> ListItems(DataContext dataContext)
{
return dataContext.Connections
.Join(dataContext.Configurations,
connections => connections.ConfigID,
config => config.ConfigID,
(connections, config) => new {cx = connections, cf = config})
.Join(dataContext.Users,
config => config.cf.UserID,
users => users.UserID,
(config, users) => new {cf = config, su = users})
.Where(q => q.su.AccountEventID == 123 && q.cf.cx.Successful == true)
.GroupBy(g => g.cf.cx.CreatedDate.Date) // keep a DateTime key so s.Key can be assigned to CreatedDate
.Select(s => new ConnectionItem
{
CreatedDate = s.Key,
NumberOfConnections = s.Count()
});
}
The group clause returns a sequence of IGrouping<TKey, TElement> objects that contain zero or more items that match the key value for the group. For example, you can group a sequence of strings according to the first letter in each string. In this case, the first letter is the key and has a type char, and is stored in the Key property of each IGrouping<TKey, TElement> object. The compiler infers the type of the key.
Group clause docs

returning multiple column and sum using linq expression

I need to return two fields using a lambda expression. The first one is the sum of the amount field and the second one is CurrentFinancial year. Below is the code that I have written, how do I include CurrentFinancialYear?
var amount = dealingContext.vw_GetContribution
.Where(o => o.ContactID == contactId)
.Sum(o => o.Amount);
return new Contribution { Amount = amount ?? 0, CurrentFinancialYear = };
Grouping by Year should do the trick:
from entry in ledger.Entries
where entry.ContactID == contactId
&& entry.Time.Year == currentFinancialYear
group entry by entry.Time.Year
into g
select new Contribution ()
{
Amount = g.ToList ().Sum (e => e.Amount),
CurrentFinancialYear = g.Key
};
UPDATE - just return the first/default result...
(from entry in ledger.Entries
where entry.ContactID == contactId
&& entry.Time.Year == currentFinancialYear
group entry by entry.Time.Year
into g
select new Contribution ()
{
Amount = g.ToList ().Sum (e => e.Amount),
CurrentFinancialYear = g.Key
}).FirstOrDefault();
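The grouping idea can be tried in memory as follows. The rows are hypothetical, and `Amount` is assumed to be nullable because the question uses `amount ?? 0`.

```csharp
using System;
using System.Linq;

// Hypothetical contribution rows for one contact.
var rows = new[]
{
    new { CurrentFinancialYear = 2012, Amount = (decimal?)100m },
    new { CurrentFinancialYear = 2012, Amount = (decimal?)50m },
    new { CurrentFinancialYear = 2013, Amount = (decimal?)null },
};

// One result per year: the group key is the year, the sum is taken over the group.
var totals = rows
    .GroupBy(r => r.CurrentFinancialYear)
    .Select(g => new { Year = g.Key, Amount = g.Sum(r => r.Amount) ?? 0m })
    .OrderBy(t => t.Year)
    .ToList();

foreach (var t in totals)
    Console.WriteLine($"{t.Year}: {t.Amount}");
// prints "2012: 150" and then "2013: 0"
```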
First of all use a simple select
var contribution = dealingContext.vw_GetContribution
.Where(o => o.ContactID == contactId).ToList();
It will give you a list of type vw_GetContribution
Then use groupby on this list as
var groupedContribution = contribution.GroupBy(b => b.CurrentFinancialYear).ToList();
Now you can iterate through or use this list as
foreach(var obj in groupedContribution.SelectMany(result => result).ToList())
{
var amount = obj.Amount;
var Year = obj.CurrentFinancialYear;
}
OR
In single line, you can do all the above as
var contList = context.vw_GetContribution
.Select(a => new { a.Amount, a.CurrentFinancialYear })
.GroupBy(b => b.CurrentFinancialYear)
.SelectMany(result => result).ToList();
I hope this will solve your problem.
Can you try this:
var amount = dealingContext.vw_GetContribution
.Where(o => o.ContactID == contactId)
.GroupBy(o => o.CurrentFinancialYear)
.Select(group =>
new {
year = group.Key,
sum = group.Sum(x => x.Amount)
});
