How do I make this LINQ query faster? - C#

modelData has 100,000 items in the list.
I am doing two Selects inside two nested loops.
Could it be structured differently? It takes a long time: about 10 minutes.
public class ModelData
{
public string name;
public DateTime DT;
public int real;
public int trade;
public int position;
public int dayPnl;
}
List<ModelData> modelData;
var dates = modelData.Select(x => x.DT.Date).Distinct();
var names = modelData.Select(x => x.name).Distinct();
foreach (var aDate in dates)
{
var dateRealTrades = modelData.Select(x => x)
.Where(x => x.DT.Date.Equals(aDate) && x.real.Equals(1));
foreach (var aName in names)
{
var namesRealTrades = dateRealTrades.Select(x => x)
.Where(x => x.name.Equals(aName));
// DO MY PROCESSING
}
}

I believe what you want can be achieved with two queries using group by. One to create a lookup by the date and the other to give you the name-date grouped items.
var data = modelData.Where(x => x.real.Equals(1))
.GroupBy(x => new { x.DT.Date, x.name });
var byDate = modelData.Where(x => x.real.Equals(1))
.ToLookup(x => x.DT.Date);
foreach(var item in data)
{
var aDate = item.Key.Date;
var aName = item.Key.name;
var namesRealTrades = item.ToList();
var dateRealTrades = byDate[aDate].ToList();
// DO MY PROCESSING
}
The first query will give you items grouped by the name and date to iterate over and the second will give you a lookup to get all the items associated with a given date. The second uses a lookup so that the list is iterated once and gives you fast access to the resulting list of items.
This should greatly reduce the number of times you iterate over modelData from what you currently have.
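A minimal sketch of how ToLookup behaves, assuming a hypothetical Trade class as a reduced stand-in for ModelData (the class and method names are made up for illustration). The source is walked once when the lookup is built, and indexing a key that is not present yields an empty sequence instead of throwing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupSketch
{
    // Hypothetical stand-in for ModelData, reduced to the fields used here.
    public sealed class Trade
    {
        public string Name = "";
        public DateTime DT;
        public int Real;
    }

    // Builds the lookup of real trades keyed by calendar date.
    public static ILookup<DateTime, Trade> RealTradesByDate(IEnumerable<Trade> trades)
    {
        // ToLookup executes immediately: the source is enumerated once, here.
        return trades.Where(t => t.Real == 1).ToLookup(t => t.DT.Date);
    }

    public static void Main()
    {
        var trades = new List<Trade>
        {
            new Trade { Name = "A", DT = new DateTime(2014, 1, 1, 9, 0, 0), Real = 1 },
            new Trade { Name = "B", DT = new DateTime(2014, 1, 1, 15, 30, 0), Real = 1 },
            new Trade { Name = "A", DT = new DateTime(2014, 1, 2, 9, 0, 0), Real = 0 },
        };

        var byDate = RealTradesByDate(trades);

        // Two real trades on 1 January.
        Console.WriteLine(byDate[new DateTime(2014, 1, 1)].Count()); // 2
        // No real trades on 2 January: the indexer returns an empty sequence.
        Console.WriteLine(byDate[new DateTime(2014, 1, 2)].Count()); // 0
    }
}
```

Because a missing key gives an empty sequence, the loop body does not need a ContainsKey-style check before indexing the lookup.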

You could rewrite your for loop like this:
foreach (var namesRealTrades in names.Select(aName => dateRealTrades.Where(x => x.name.Equals(aName))))
{
//DO STUFF
}
Depending on your data, this could reduce the number of queries you have to make.

Did you try compiling your query, as suggested on the MSDN website?
When you have an application that executes structurally similar
queries many times, you can often increase performance by compiling
the query one time and executing it several times with different
parameters. For example, an application might have to retrieve all the
customers who are in a particular city, where the city is specified at
runtime by the user in a form. LINQ to SQL supports the use of
compiled queries for this purpose.
https://msdn.microsoft.com/en-us/library/bb399335(v=vs.110).aspx

A couple of things:
Use .ToList() to calculate a sequence once, so you can keep it for later.
Use .GroupBy() to avoid re-searching modelData for things you have already found.
// Collections of models having the same Date or Name.
var dates = modelData.GroupBy(x => x.DT.Date);
var names = modelData.GroupBy(x => x.name);
foreach (var modelsWithDate in dates)
{
var aDate = modelsWithDate.Key;
var dateRealTrades = modelsWithDate.Where(x => x.real == 1).ToList();
foreach (var modelsWithName in names)
{
var aName = modelsWithName.Key;
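// Restrict to this date's real trades; modelsWithName alone would span every date.
var namesRealTrades = dateRealTrades.Where(x => x.name == aName).ToList();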
// DO MY PROCESSING
}
}

There are two ways in which the code is inefficient.
names has deferred evaluation. Every time you iterate over it, it has to go through the whole data set to find all the distinct names again. You should save the result.
You find the distinct values in the collection, and then you go through the collection again for every distinct value to look for its occurrences. You should use grouping.
The rewritten code can look like this:
var dates = modelData.GroupBy(x => x.DT.Date);
var names = modelData.Select(x => x.name).Distinct().ToArray();
foreach (var date in dates)
{
var dateRealTrades = date.Where(x => x.real.Equals(1)).ToArray();
var namesRealTradesLookup = dateRealTrades.ToLookup(x => x.name);
foreach (var aName in names)
{
var namesRealTrades = namesRealTradesLookup[aName];
// DO MY PROCESSING
// var aDate = date.Key;
}
}
In case you are not interested in date/name combinations with no real trades, it can be done in a much more straightforward way:
var realModelData = modelData.Where(x => x.real.Equals(1));
foreach (var dateRealTrades in realModelData.ToLookup(x => x.DT.Date))
{
foreach (var namesRealTrades in dateRealTrades.ToLookup(x => x.name))
{
// DO MY PROCESSING
//var aDate = dateRealTrades.Key;
//var aName = namesRealTrades.Key;
//foreach(var trade in namesRealTrades) { ...
//foreach(var trade in dateRealTrades) { ...
}
}
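The deferred-evaluation point above can be sketched as follows. This is an illustration only: the DeferredSketch class and the CountVisits helper are hypothetical, and the counter inside Select exists purely to make the repeated work visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Returns how many source elements are visited while the distinct
    // names are iterated `iterations` times.
    public static int CountVisits(string[] source, bool materialize, int iterations)
    {
        int visits = 0;

        // Deferred: nothing runs yet, the query is only described.
        IEnumerable<string> names = source
            .Select(s => { visits++; return s; })
            .Distinct();

        if (materialize)
        {
            // ToArray runs the query once and keeps the result.
            names = names.ToArray();
        }

        for (int i = 0; i < iterations; i++)
        {
            foreach (var name in names)
            {
                // Processing would go here.
            }
        }

        return visits;
    }

    public static void Main()
    {
        var source = new[] { "A", "B", "A", "C" };

        // Without ToArray, every loop re-reads all 4 source elements.
        Console.WriteLine(CountVisits(source, materialize: false, iterations: 3)); // 12
        // With ToArray, the source is read only once.
        Console.WriteLine(CountVisits(source, materialize: true, iterations: 3));  // 4
    }
}
```

In the original question the inner loop over names runs once per date, so the un-materialized Distinct() re-scans modelData for every date.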

Related

Operation Intersect with linq

Sorry for the strange title of the question, but I don't know how to phrase it more concisely. If you know how to phrase it better, I will be glad if you edit my question.
So, I have the following table:
I'm talking about the CustomerId and EventType fields. The rest is not important. As you can probably tell, this table is a log of customer events. When a customer triggers an event, a row is added to the table. Simple.
I need to select all events of every customer who has both an event of type registration and an event of type deposit. In other words: has the customer registered? Has the same customer made a deposit? If both are true, I need to select all events of this customer.
How can I do that with LINQ?
I can write the SQL like this:
select *
From "CustomerEvents"
where "CustomerId" in (
select distinct "CustomerId"
from "CustomerEvents"
where "EventType" = 'deposit'
intersect
select distinct "CustomerId"
from "CustomerEvents"
where "EventType" = 'registration'
)
It works, but how do I write it in LINQ?
And a second question. The SQL above works, but it is not general. What if tomorrow I need to show events of customers who have registration, deposit and a new event type, visit? I would have to write a new query, like:
select *
From "CustomerEvents"
where "CustomerId" in (
select "CustomerId"
from "CustomerEvents"
where "EventType" = 'deposit'
intersect
select distinct "CustomerId"
from "CustomerEvents"
where "EventType" = 'registration'
intersect
select distinct "CustomerId"
from "CustomerEvents"
where "EventType" = 'visit'
)
Uncomfortable :(
As source data, I have a List of event types. Is there some way to build the query dynamically? I mean, when a new event type is added to the list, a new intersect is added.
P.S. I use Postgres and .NET Core 3.1.
Update
I am attaching a schema here.
I haven't tested to see if this will translate to SQL correctly, but if we assume ctx.CustomerEvents is DbSet<CustomerEvent> you could try this:
var targetCustomerIds = ctx
    .CustomerEvents
    // "event" is a reserved word in C#, so the lambda parameter is named "e"
    .GroupBy(e => e.CustomerId)
    .Where(grouped =>
        grouped.Any(e => e.EventType == "deposit")
        && grouped.Any(e => e.EventType == "registration"))
    .Select(x => x.Key)
    .ToList();
and then select all events for these customers:
var events = ctx.CustomerEvents.Where(e => targetCustomerIds.Contains(e.CustomerId));
To get targetCustomerIds dynamically with a variable number of event types, you could try this:
// for example
var requiredEventTypes = new [] { "deposit", "registration" };
// First group by customer ID
var groupedByCustomerId = ctx
    .CustomerEvents
    .GroupBy(e => e.CustomerId);
// Then filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);
// Then select the target customer IDs
var targetCustomerIds = filtered.Select(x => x.Key).ToList();
// Finally, select your target events
var events = ctx.CustomerEvents.Where(e =>
    targetCustomerIds.Contains(e.CustomerId));
You can define the GetFilteredGroups method like this:
private static IQueryable<IGrouping<int, CustomerEvent>> GetFilteredGroups(
IQueryable<IGrouping<int, CustomerEvent>> grouping,
IEnumerable<string> requiredEventTypes)
{
var result = grouping.Where(x => true);
foreach (var eventType in requiredEventTypes)
{
result = result.Where(x => x.Any(e => e.EventType == eventType));
}
return result;
}
Alternatively, instead of selecting the target customer IDs, you can try to directly select your target events from the filtered groupings:
// ...
// Filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);
// Select your events here
var results = filtered.SelectMany(x => x).Distinct().ToList();
Regarding the inability to translate the query to SQL
Depending on your database size and particularly on the size of CustomerEvents table, this solution may or may not be ideal, but what you could do is load the optimized collection to memory and perform the grouping there:
// for example
var requiredEventTypes = new [] { "deposit", "registration" };
// First group by customer ID, but load into memory
var groupedByCustomerId = ctx
    .CustomerEvents
    .Where(e => requiredEventTypes.Contains(e.EventType))
    .Select(e => new CustomerEventViewModel
    {
        Id = e.Id,
        CustomerId = e.CustomerId,
        EventType = e.EventType
    })
    .GroupBy(e => e.CustomerId)
    .AsEnumerable();
// Then filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);
// Then select the target customer IDs
var targetCustomerIds = filtered.Select(x => x.Key).ToList();
// Finally, select your target events
var events = ctx.CustomerEvents.Where(e =>
    targetCustomerIds.Contains(e.CustomerId));
You will need to create a type called CustomerEventViewModel like this (so you don't have to load the entire CustomerEvent entity instances to memory):
public class CustomerEventViewModel
{
public int Id { get; set; }
public int CustomerId { get; set; }
public string EventType { get; set; }
}
And change the GetFilteredGroups like this:
// The groups now contain CustomerEventViewModel items, so the signature uses that type.
private static IEnumerable<IGrouping<int, CustomerEventViewModel>> GetFilteredGroups(
    IEnumerable<IGrouping<int, CustomerEventViewModel>> grouping,
    IEnumerable<string> requiredEventTypes)
{
    var result = grouping.Where(x => true);
    foreach (var eventType in requiredEventTypes)
    {
        result = result.Where(x => x.Any(e => e.EventType == eventType));
    }
    return result;
}
It should now work fine.
Thanks to @Dejan Janjušević. He is an experienced developer, but it seems EF can't translate his solution to SQL (or I am just doing something wrong). I am posting my own solution for this situation here. It's simple and stupid. The table has an EventType column, which is a string. From the client I receive the following filter request:
List<string> eventType
Just a list of event types. So, in the action, I have the following filter code:
if (eventType.Any())
{
    // null means "no event type processed yet"; an empty list is a valid (empty) intersection
    List<int> ids = null;
    foreach (var e in eventType)
    {
        var customerIdsList = _context.customerEvents
            .Where(x => x.EventType == e)
            .Select(x => x.CustomerId.Value)
            .Distinct()
            .ToList();
        ids = ids == null
            ? customerIdsList
            : ids.Intersect(customerIdsList).ToList();
    }
    customerEvents = customerEvents.Where(x => ids.Contains(x.CustomerId.Value));
}
Not very fast, but works.
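For comparison, here is an in-memory sketch (LINQ to Objects) of the same "customer has every required event type" rule for any number of event types. The CustomerEvent class below is a hypothetical, simplified stand-in with a non-nullable CustomerId, and nothing here is claimed to translate to SQL through EF:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RequiredEventsSketch
{
    // Hypothetical, simplified stand-in for the CustomerEvents row.
    public sealed class CustomerEvent
    {
        public int CustomerId;
        public string EventType = "";
    }

    // Returns all events of customers who have at least one event
    // of every type listed in requiredEventTypes.
    public static List<CustomerEvent> EventsOfCustomersWithAll(
        IEnumerable<CustomerEvent> events,
        IEnumerable<string> requiredEventTypes)
    {
        var all = events.ToList();
        var required = new HashSet<string>(requiredEventTypes);

        // A customer qualifies when the required set is a subset
        // of the event types that customer actually has.
        var ids = new HashSet<int>(
            all.GroupBy(e => e.CustomerId)
               .Where(g => required.IsSubsetOf(g.Select(e => e.EventType)))
               .Select(g => g.Key));

        return all.Where(e => ids.Contains(e.CustomerId)).ToList();
    }

    public static void Main()
    {
        var events = new List<CustomerEvent>
        {
            new CustomerEvent { CustomerId = 1, EventType = "registration" },
            new CustomerEvent { CustomerId = 1, EventType = "deposit" },
            new CustomerEvent { CustomerId = 2, EventType = "registration" },
        };

        var result = EventsOfCustomersWithAll(events, new[] { "registration", "deposit" });

        // Only customer 1 has both types, so both of that customer's events are returned.
        Console.WriteLine(result.Count); // 2
    }
}
```

Adding "visit" to the required list needs no code change, which is the dynamic behaviour the question asks for. Whether loading the rows into memory is acceptable depends on the table size.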

C# LINQ - Comparing an IEnumerable<string> against an anonymous list?

The basic question
I have:
IEnumerable<string> listA
var listB (this is an anonymous list generated by a LINQ query)
I want to query a list of objects that contain listA to see if they match to listB:
someObjectList.Where(x => x.listA == listB)
The comparison doesn't work - so how do I ensure that both lists are the same type for comparison?
The detailed question
I am grouping a larger list into a subset that contains a name and related date(s).
var listGroup = from n in list
                group n by new { n.NAME } into d
                select new
                {
                    NAME = d.Key.NAME,
                    listOfDates = from x in d select new { Date = x.DATE }
                };
I have an object to hold the values for further processing:
class SomeObject
{
public SomeObject()
{
_listOfDates = new List<DateTime>();
}
private IEnumerable<DateTime> _listOfDates;
public IEnumerable<DateTime> ListOfDates
{
get { return _listOfDates; }
set { _listOfDates = value; }
}
}
I am then iterating over the listGroup and adding into a generic List<> of SomeObject:
foreach(var item in listGroup)
{
SomeObject so = new SomeObject();
// ...do some stuff
if (some match occurs then add into List<SomeObject>)
}
As I iterate, I want to check the existing List<SomeObject> for matches:
var record = someObjectList.Where(x => x.NAME == item.NAME &&
x.ListOfDates == item.listOfDates)
.SingleOrDefault();
The problem is that comparing x.ListOfDates against item.listOfDates doesn't work.
There is no compiler error, but I suspect that the two lists are of different types. How do I get the lists into a common form so they can be compared?
Update #1
This seems to work to get the listOfDates into a similar format:
IEnumerable<DateTime> tempList = item.listOfDates.Select(x => x.Date).ToList();
Then I followed the SequenceEqual suggestion from @Matt Burland.
You can't just compare one IEnumerable<DateTime> to another IEnumerable<DateTime> with ==; you need to compare the sequences. Luckily, there's Enumerable.SequenceEqual (usable both as a static method and as an extension method), which should work here.
So something like:
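var record = someObjectList
    // listOfDates holds anonymous objects, so project them to DateTime before comparing
    .Where(x => x.NAME == item.NAME && x.ListOfDates.SequenceEqual(item.listOfDates.Select(d => d.Date)))
    .SingleOrDefault();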
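A small sketch of the difference between == and SequenceEqual on two sequences of dates. The SequenceEqualSketch class and the SameDates helper are hypothetical names used only for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceEqualSketch
{
    // True when both sequences contain equal dates in the same order.
    public static bool SameDates(IEnumerable<DateTime> a, IEnumerable<DateTime> b)
    {
        return a.SequenceEqual(b);
    }

    public static void Main()
    {
        var d1 = new DateTime(2014, 1, 1);
        var d2 = new DateTime(2014, 2, 1);

        IEnumerable<DateTime> a = new List<DateTime> { d1, d2 };
        IEnumerable<DateTime> b = new[] { d1, d2 };

        // == on two IEnumerable<DateTime> references compares object identity.
        Console.WriteLine(a == b);                      // False
        // SequenceEqual compares element by element, in order.
        Console.WriteLine(SameDates(a, b));             // True
        // Order matters: the same dates reversed are not sequence-equal.
        Console.WriteLine(SameDates(a, b.Reverse()));   // False
    }
}
```

If the order of the dates should not matter, sorting both sequences before calling SequenceEqual is one option.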

Getting the number of rows with the same month with LINQ from an MVC model using DateTime, is it possible?

I need to know how many rows in a table have the same month, and I have no idea how to do it. I thought I'd try some LINQ, but I've never used it before, so I don't even know if it's possible. Please help me out!
public ActionResult returTest()
{
ViewData["RowsWithSameMonth"] = // I'm guessing I can put some LINQ here?
var returer = from s in db2.ReturerDB select s;
return View(returer.ToList());
}
Ideally I would get something like a two-dimensional array with the month in the first cell and the number of rows from the database in the second.
I'd like the result to look something like this:
string[,] statistics = new string[,]
{
{"2013-11", "5"},
{"2013-12", "10"},
{"2014-01", "3"}
};
Is this doable? Or should I just query the database and do a whole lot of processing myself? I think I could solve this on my own, but it would mean a lot of ugly code. Background: self-taught C# developer at an IT company with one year's experience of ugly codesmanship and no official degree of any kind.
EDIT
var returer = from s in db2.ReturerDB select s;
var dateRange = returer.ToList();
var groupedData = dateRange.GroupBy(dateRow => dateRow.ToString())
.OrderBy(monthGroup => monthGroup.Key)
.Select(monthGroup => new
{
Month = monthGroup.Key,
MountCount = monthGroup.Count()
});
string test01 = "";
string test02 = "";
foreach (var item in groupedData)
{
test01 = item.Month.ToString();
test02 = item.MountCount.ToString();
}
In debug, test01 is "Namespace.Models.ReturerDB" and test02 is "6" as was expected, or at least wanted. What am I doing wrong?
You can do this:
var groupedData = db2.ReturerDB.GroupBy(r => new { r.Date.Year, r.Date.Month })
    .Select(g => new { g.Key.Year, g.Key.Month, Count = g.Count() })
    .OrderBy(x => x.Year).ThenBy(x => x.Month)
    .ToList();
var result = groupedData
.ToDictionary(g => string.Format("{0}-{1:00}", g.Year, g.Month),
g => g.Count);
Which will give you
Key Value
---------------
2013-11 5
2013-12 10
2014-01 3
(Creating a dictionary is slightly easier than a two-dimensional array)
This will work against a SQL back-end through Entity Framework or LINQ to SQL, because the expressions r.Date.Year and r.Date.Month can be translated into SQL.
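If the two-dimensional string array from the question is really wanted, the same grouping can fill one. This is an in-memory sketch over a plain list of dates; the MonthCountSketch class and the CountByMonth helper are hypothetical names, not part of the asker's model:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MonthCountSketch
{
    // Returns one row per month: [i, 0] is "yyyy-MM", [i, 1] is the row count.
    public static string[,] CountByMonth(IEnumerable<DateTime> dates)
    {
        var groups = dates
            .GroupBy(d => new { d.Year, d.Month })
            .OrderBy(g => g.Key.Year).ThenBy(g => g.Key.Month)
            .Select(g => new
            {
                Label = string.Format("{0}-{1:00}", g.Key.Year, g.Key.Month),
                Count = g.Count()
            })
            .ToList();

        var table = new string[groups.Count, 2];
        for (int i = 0; i < groups.Count; i++)
        {
            table[i, 0] = groups[i].Label;
            table[i, 1] = groups[i].Count.ToString();
        }
        return table;
    }

    public static void Main()
    {
        var dates = new List<DateTime>
        {
            new DateTime(2013, 11, 3),
            new DateTime(2013, 11, 20),
            new DateTime(2013, 12, 1),
            new DateTime(2014, 1, 15),
        };

        var statistics = CountByMonth(dates);
        for (int i = 0; i < statistics.GetLength(0); i++)
        {
            Console.WriteLine(statistics[i, 0] + " " + statistics[i, 1]);
        }
    }
}
```

For the sample dates this gives three rows: 2013-11 with 2, 2013-12 with 1, and 2014-01 with 1.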
With a nod to mehrandvd, here is how you'd achieve this using the LINQ method-chain approach:
var dateRange = /* your base collection with the dates */;
// make sure you change MyDateField to match your own DateTime field
// note: "MM" is the month; lowercase "mm" would format minutes
var groupedData = dateRange
.GroupBy(dateRow => dateRow.MyDateField.ToString("yyyy-MM"))
.OrderBy(monthGroup => monthGroup.Key)
.Select(monthGroup => new
{
Month = monthGroup.Key,
MountCount = monthGroup.Count()
});
This would give you the results you required, as per the OP.
[edit] - as requested, example of how to access the newly created anonymous type:
foreach (var item in groupedData)
{
Console.WriteLine(item.Month);
Console.WriteLine(item.MountCount);
}
Or you could return the whole caboodle as a JsonResult to your client app and iterate over it there, i.e. the final line of your action would be:
return Json(groupedData, JsonRequestBehavior.AllowGet);
hope this clarifies.
What you need is grouping.
Assuming you have a list of dates, a solution would be this:
var dateRows = /* get the dates from the database */;
var monthlyRows = from dateRow in dateRows
                  // "MM" is the month; lowercase "mm" would format minutes
                  group dateRow by dateRow.ToString("yyyy/MM") into monthGroup
                  orderby monthGroup.Key
                  select new { Month = monthGroup.Key, MonthCount = monthGroup.Count() };
// Your results would be a sequence of objects which have `Month` and `MonthCount` properties.
// {Month="2014/01", MonthCount=24}
// {Month="2014/02", MonthCount=28}

Removing duplicates from a sorted list c#

I have a list of details about a large number of files. This list contains the file ID, last modified date and the file path. The problem is there are duplicates of the files which are older versions and sometimes have different file paths. I want to only store the newest version of a file regardless of file path. So I created a loop that iterates through the ordered list, checks to see if the ID is unique and if it is, it gets stored in a new unique list.
var ordered = list.OrderBy(x => x.ID).ThenByDescending(x => x.LastModifiedDate);
List<Item> unique = new List<Item>();
string curAssetId = null;
foreach (Item result in ordered)
{
if (!result.ID.Equals(curAssetId))
{
unique.Add(result);
curAssetId = result.ID;
}
}
However this is still allowing duplicates into the DB and I can't figure out why this code isn't working as expected. By duplicates I mean, the files have the same ID but different file paths, which like I said before shouldn't be an issue. I just want the latest version regardless of pathway. Can anyone else see what the issue is? Thanks
var ordered = listOfItems.OrderBy(x => x.AssetID).ThenByDescending(x => x.LastModifiedDate);
List<Item> uniqueItems = new List<Item>();
foreach (Item result in ordered)
{
if (!uniqueItems.Any(x => x.AssetID.Equals(result.AssetID)))
{
uniqueItems.Add(result);
}
}
this is what I have now and it is still allowing duplicates
This is because you are not searching the entire list to check whether the ID is unique.
List<Item> unique = new List<Item>();
string curAssetId = null; // here is the problem
foreach (Item result in ordered)
{
if (!result.ID.Equals(curAssetId)) // here you only compare the last value.
{
unique.Add(result);
curAssetId = result.ID; // You are only assign the current ID value and
}
}
To solve this, change the following
if (!result.ID.Equals(curAssetId)) // here you only compare the last value.
{
unique.Add(result);
curAssetId = result.ID; // You are only assign the current ID value and
}
to
if (!unique.Any(x=>x.ID.Equals(result.ID)))
{
unique.Add(result);
}
I don't know if this code is just simplified, but have you considered grouping on ID, sorting on LastModifiedDate, then just taking the first from each group?
Something like:
var unique = list.GroupBy(i => i.ID).Select(x => x.OrderByDescending(y => y.LastModifiedDate).First());
var ordered = list.OrderBy(x => x.ID).ThenByDescending(x => x.LastModifiedDate).Distinct() ??
For this purpose you have to create your own IEqualityComparer<Item>, and after that you can use LINQ's Distinct method (see Enumerable.Distinct on MSDN).
Also, I think you could keep your current code, but you would have to modify it like this (as a sample):
var ordered = list.OrderByDescending(x => x.LastModifiedDate);
var unique = new List<Item>();
foreach (Item result in ordered)
{
if (unique.Any(x => x.ID == result.ID))
continue;
unique.Add(result);
}
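A sketch of the comparer route mentioned above, assuming a hypothetical Item class with ID, LastModifiedDate and Path members (the asker's real type is not shown). It also assumes that Distinct yields the first occurrence of each key in source order; LINQ to Objects behaves this way in practice, but treat that as an assumption rather than a documented guarantee:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctByIdSketch
{
    // Hypothetical stand-in for the asker's Item type.
    public sealed class Item
    {
        public string ID = "";
        public DateTime LastModifiedDate;
        public string Path = "";
    }

    // Treats two items as equal when their IDs match, ignoring path and date.
    public sealed class ItemIdComparer : IEqualityComparer<Item>
    {
        public bool Equals(Item x, Item y)
        {
            if (ReferenceEquals(x, y)) return true;
            if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
            return x.ID == y.ID;
        }

        public int GetHashCode(Item obj)
        {
            return obj.ID.GetHashCode();
        }
    }

    // Newest first, then keep one item per ID (assumed to be the first seen, i.e. the newest).
    public static List<Item> NewestPerId(IEnumerable<Item> items)
    {
        return items
            .OrderByDescending(i => i.LastModifiedDate)
            .Distinct(new ItemIdComparer())
            .ToList();
    }

    public static void Main()
    {
        var items = new List<Item>
        {
            new Item { ID = "1", LastModifiedDate = new DateTime(2014, 1, 1), Path = "old/a.txt" },
            new Item { ID = "1", LastModifiedDate = new DateTime(2014, 2, 1), Path = "new/a.txt" },
            new Item { ID = "2", LastModifiedDate = new DateTime(2014, 1, 15), Path = "b.txt" },
        };

        foreach (var item in NewestPerId(items))
        {
            Console.WriteLine(item.ID + " " + item.Path);
        }
    }
}
```

The GroupBy-then-First answer earlier in the thread does not depend on that ordering assumption, so it may be the safer choice when the guarantee matters.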
List<Item> p = new List<Item>();
var x = p.Select(c => new Item
{
    AssetID = c.AssetID,
    LastModifiedDate = c.LastModifiedDate.Date
}).OrderBy(y => y.AssetID).ThenByDescending(c => c.LastModifiedDate).Distinct();

Filter across 2 lists using LINQ

I have two lists:
a. requestedAmenities
b. units with amenities.
I want to filter those units that have any one of the "requested amenities".
I have tried to achieve this using foreach loops, but I believe it should be much easier using LINQ. Can someone please help or advise?
UnitAmenities unitSearchRequestAmenities = unitSearchRequest.Amenities;
var exactMatchApartmentsFilteredByAmenities= new Units();
IEnumerable<string> requestAmenitiesIds = unitSearchRequestAmenities.Select(element => element.ID);
foreach (var unitCounter in ExactMatchApartments)
{
IEnumerable<string> unitAmenities = unitCounter.Amenities.Select(element => element.ID);
foreach (var requestAmenityId in requestAmenitiesIds)
{
foreach (var unitAmenity in unitAmenities)
{
if (requestAmenityId == unitAmenity)
{
exactMatchApartmentsFilteredByAmenities.Add(unitCounter);
//break to the outmost foreach loop
}
}
}
}
You could filter based on compliance with an Intersect rule
var matchedAmenities = ExactMatchApartments.Where(ema => ema.Amenities
.Any(x => unitSearchRequestAmenities
.Count(y => y.ID == x.ID) == 1));
exactMatchApartmentsFilteredByAmenities.AddRange(matchedAmenities);
This is a somewhat "custom" Intersect given that the default LINQ Intersect extension doesn't support lambda expressions.
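Another way to express "units that have any one of the requested amenities" is to put the requested IDs into a HashSet and test membership. This is a sketch with hypothetical Amenity and Unit classes standing in for the asker's types, which are not shown in full:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AmenityFilterSketch
{
    // Hypothetical stand-ins for the asker's amenity and unit types.
    public sealed class Amenity
    {
        public string ID = "";
    }

    public sealed class Unit
    {
        public string Name = "";
        public List<Amenity> Amenities = new List<Amenity>();
    }

    // Keeps every unit that has at least one of the requested amenities.
    public static List<Unit> WithAnyRequested(IEnumerable<Unit> units, IEnumerable<Amenity> requested)
    {
        // Build the set once so each membership test is a hash lookup.
        var requestedIds = new HashSet<string>(requested.Select(a => a.ID));

        return units
            .Where(u => u.Amenities.Any(a => requestedIds.Contains(a.ID)))
            .ToList();
    }

    public static void Main()
    {
        var units = new List<Unit>
        {
            new Unit { Name = "1A", Amenities = { new Amenity { ID = "pool" }, new Amenity { ID = "gym" } } },
            new Unit { Name = "2B", Amenities = { new Amenity { ID = "parking" } } },
            new Unit { Name = "3C" },
        };
        var requested = new List<Amenity> { new Amenity { ID = "gym" }, new Amenity { ID = "sauna" } };

        foreach (var unit in WithAnyRequested(units, requested))
        {
            Console.WriteLine(unit.Name); // 1A
        }
    }
}
```

Each unit appears at most once in the result, even when several of its amenities match, which the nested foreach version in the question does not guarantee.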
It's hard to tell from your types, but I think the following should do the trick
from unit in ExactMatchApartments
from amenity in unit.Amenities
join requestedAmenity in unitSearchRequestAmenities
on amenity.ID equals requestedAmenity.ID
select unit
This is a case where a query expression is easier to read and understand than dot notation.
Thanks Jason, I believe it must be Intersect, not Except. I have changed the code to the following:
var amenities = unitSearchRequest.Amenities;
if (amenities.Count > 0)
{
//filter the unit's amenities's id's with the search request amenities's ID's.
var exactMatchApartmentsFilteredByAmenities= new Units();
var requestAmenitiesIds = amenities.Select(element => element.ID);
foreach (var unitCounter in ExactMatchApartments)
{
var unitAmenities = unitCounter.Amenities.Select(element => element.ID);
var intersect = unitAmenities.Intersect(requestAmenitiesIds);
if (intersect.Any())
{
exactMatchApartmentsFilteredByAmenities.Add(unitCounter);
// no break here: breaking would leave the outer foreach after the first matching unit
}
}
}
I will test the code and update here my results.
