add item count to LINQ select - c#

I have a problem I cant wrap my head around.
I have a Sharepoint List of Items, which have Categories. I want to read all Categories and count how often, they occur.
In another method, I want to take the categoryCount, divide it by the total number of tickets and multiply by 100 to get a percentage.
The problem is the Count.
This is my query so far:
public IEnumerable<KategorieVM> GetAllCategories()
{
int counter = 0;
var result = (from t in Tickets
where t.Kategorie != Kategorie.Invalid && t.Kategorie != Kategorie.None && t.Kategorie != null
select new KategorieVM() { _name = t.Kategorie.ToString(), _val = counter++ });
return result;
}
the problem is, I can't use counter++. Is there a clean workaround? The option to build a query for the purpose of counting each category is not a valid option. The list has 15000 Listitems and growing. In the end I need to iterate through every category and call the query to count the tickets which just takes about 3 minutes.
So counting the cateogry in one query is mandatory.
Any help is highly appreciated.
/edit: for the sake of clearity:
the counter++ as count was just a brainfart - I dont know why I tried it; this would have resulted in an index. I needed a way to count how often the 'category' occured in those 15k entries.

You can use GroupBy to perform the Count within the query itself:
return Tickets
.Where(t => t.Kategorie != Kategorie.Invalid && t.Kategorie != Kategorie.None && t.Kategorie != null)
.GroupBy(t => t.Kategorie.ToString())
.Select(g => new KategorieVM() { _name = g.Key, _val = g.Count() });

Related

how to to SUM specific column including the current data using LINQ

I have a query it should first add up the amount in the database starting from 3 months ago until the current date,and if its more than a specific amount which i put in the condition,it should return false.
public Task<bool> SecurityCheck(CustomerData cust)
{
var checkRsult = (from x in dbContext.CustomerModel
where x.CustomerReference == cust.CustomerReference
&& x.Created >= DateTime.Today.AddMonths(-3)
select new
{
AccomulateAmount = x.AmountToTransfer
}).Sum(x => x.AccomulateAmount);
}
var finalResult=checkRsult+cust.Amount;
if(finalResult>250000){
//return false
}
else{
//store th model in the db
}
first of all im not sure if the way i query is right or not(the LINQ part),my second question is ,is there any way to sum all including the current incoming one(cust.amount)inside a single query? Rather than get the database sum first and then add the current one to it?
It's slightly long winded, you could make it
dbContext.CustomerModel
.Where(cm => cm.CustomerReference == cust.CustomerReference && cm.Created >= DateTime.Today.AddMonths(-3))
.Sum(cm => cm.AmountToTransfer)

How to get second and third rows using Entity Framework

I need to get the first row of the location record in order to added time,
My query as follows,
private Location GetFirstRecord()
{
return clientContext.Lobbies.Where(location => location.FkBranchId == locationRecord.FkBranchId && !location.FkAppointmentId.HasValue && location.Status == 1 && location.IsActive).OrderBy(location => location.AddedTime).FirstOrDefault();
}
The above query works fine, now I need to get second, third, fourth.....10th row records from the location. Do I need to write separate 10 separate queries to get data? or have any other possible way to do it. I don't have an idea to how to get second, third, fourth... values from the records. please provide me a sample query to do this. thanks in advance
You can use Skip method to get specific record like this
private Location GetRecord(int skip)
{
return clientContext.Lobbies.Where(location => location.FkBranchId == locationRecord.FkBranchId && !location.FkAppointmentId.HasValue && location.Status == 1 && location.IsActive)
.OrderBy(location => location.AddedTime)
.Skip(skip)
.FirstOrDefault();
}
and use
var data = GetRecord(0);//First Record
var data = GetRecord(1);//Second record
var data = GetRecord(9);// 10th record
Do I need to write 10 separate queries to get data?
No, you can use Skip(n) to skip first n records and then get First record by FirstOrDefault()
Like,
private Location GetLocation(int skipIndex)
{
return clientContext.Lobbies
.Where(location => location.FkBranchId == locationRecord.FkBranchId
&& !location.FkAppointmentId.HasValue
&& location.Status == 1 && location.IsActive) //Filter using where clause
.OrderBy(location => location.AddedTime) //Order by AddedTime
.Skip(skipIndex) //Skip first n records
.FirstOrDefault(); //Take the record at given position.
}
You need to pass skipIndex = 2 as a parameter to get third Location.
var secondLocation = GetLocation(1);
var thirdLocation = GetLocation(2);
...

Most efficient way to search enumerable

I am writing a small program that takes in a .csv file as input with about 45k rows. I am trying to compare the contents of this file with the contents of a table on a database (SQL Server through dynamics CRM using Xrm.Sdk if it makes a difference).
In my current program (which takes about 25 minutes to compare - the file and database are the exact same here both 45k rows with no differences), I have all existing records on the database in a DataCollection<Entity> which inherits Collection<T> and IEnumerable<T>
In my code below I am filtering using the Where method and then doing a logic based the count of matches. The Where seems to be the bottleneck here. Is there a more efficient approach than this? I am by no means a LINQ expert.
foreach (var record in inputDataLines)
{
var fields = record.Split(',');
var fund = fields[0];
var bps = Convert.ToDecimal(fields[1]);
var withdrawalPct = Convert.ToDecimal(fields[2]);
var percentile = Convert.ToInt32(fields[3]);
var age = Convert.ToInt32(fields[4]);
var bombOutTerm = Convert.ToDecimal(fields[5]);
var matchingRows = existingRecords.Entities.Where(r => r["field_1"].ToString() == fund
&& Convert.ToDecimal(r["field_2"]) == bps
&& Convert.ToDecimal(r["field_3"]) == withdrawalPct
&& Convert.ToDecimal(r["field_4"]) == percentile
&& Convert.ToDecimal(r["field_5"]) == age);
entitiesFound.AddRange(matchingRows);
if (matchingRows.Count() == 0)
{
rowsToAdd.Add(record);
}
else if (matchingRows.Count() == 1)
{
if (Convert.ToDecimal(matchingRows.First()["field_6"]) != bombOutTerm)
{
rowsToUpdate.Add(record);
entitiesToUpdate.Add(matchingRows.First());
}
}
else
{
entitiesToDelete.AddRange(matchingRows);
rowsToAdd.Add(record);
}
}
EDIT: I can confirm that all existingRecords are in memory before this code is executed. There is no IO or DB access in the above loop.
Himbrombeere is right, you should execute the query first and put the result into a collection before you use Any, Count, AddRange or whatever method will execute the query again. In your code it's possible that the query is executed 5 times in every loop iteration.
Watch out for the term deferred execution in the documentation. If a method is implemented in that way, then it means that this method can be used to construct a LINQ query(so you can chain it with other methods and at the end you have a query). But only methods that don't use deferred execution like Count, Any, ToList(or a plain foreach) will actually execute it. If you dont want that the whole query is executed everytime and you have to access this query multiple times , it's better to store the result in a collection(.f.e with ToList).
However, you could use a different approach which should be much more efficient, a Lookup<TKey, TValue> which is similar to a dictionary and can be used with an anonymous type as key:
var lookup = existingRecords.Entities.ToLookup(r => new
{
fund = r["field_1"].ToString(),
bps = Convert.ToDecimal(r["field_2"]),
withdrawalPct = Convert.ToDecimal(r["field_3"]),
percentile = Convert.ToDecimal(r["field_4"]),
age = Convert.ToDecimal(r["field_5"])
});
Now you can access this lookup in the loop very efficiently.
foreach (var record in inputDataLines)
{
var fields = record.Split(',');
var fund = fields[0];
var bps = Convert.ToDecimal(fields[1]);
var withdrawalPct = Convert.ToDecimal(fields[2]);
var percentile = Convert.ToInt32(fields[3]);
var age = Convert.ToInt32(fields[4]);
var bombOutTerm = Convert.ToDecimal(fields[5]);
var matchingRows = lookup[new {fund, bps, withdrawalPct, percentile, age}].ToList();
entitiesFound.AddRange(matchingRows);
if (matchingRows.Count() == 0)
{
rowsToAdd.Add(record);
}
else if (matchingRows.Count() == 1)
{
if (Convert.ToDecimal(matchingRows.First()["field_6"]) != bombOutTerm)
{
rowsToUpdate.Add(record);
entitiesToUpdate.Add(matchingRows.First());
}
}
else
{
entitiesToDelete.AddRange(matchingRows);
rowsToAdd.Add(record);
}
}
Note that this will work even if the key does not exist(an empty list is returned).
Add a ToList after your Convert.ToDecimal(r["field_5"]) == age);-line to force an immediate execution of the query.
var matchingRows = existingRecords.Entities.Where(r => r["field_1"].ToString() == fund
&& Convert.ToDecimal(r["field_2"]) == bps
&& Convert.ToDecimal(r["field_3"]) == withdrawalPct
&& Convert.ToDecimal(r["field_4"]) == percentile
&& Convert.ToDecimal(r["field_5"]) == age)
.ToList();
The Where doesn´t actually execute your query, it just prepares it. The actual execution happens later in a delayed way. In your case that happens when calling Count which itself will iterate the entire collection of items. But if the first condition fails, the second one is checked leading to a second iteration of the complete collection when calling Count. In this case you actually execute that query a thrird time when calling matchingRows.First().
When forcing an immediate execution you´re executing the query only once and thus iterating the entire collection only once also which will decrease your overall-time.
Another option, which is basically along the same lines as the other answers, is to prepare your data first, so that you're not repeatedly calling things like r["field_2"] (which are relatively slow to look up).
This is a (1) clean your data, (2) query/join your data, (3) process your data approach.
Do this:
(1)
var inputs =
inputDataLines
.Select(record =>
{
var fields = record.Split(',');
return new
{
fund = fields[0],
bps = Convert.ToDecimal(fields[1]),
withdrawalPct = Convert.ToDecimal(fields[2]),
percentile = Convert.ToInt32(fields[3]),
age = Convert.ToInt32(fields[4]),
bombOutTerm = Convert.ToDecimal(fields[5]),
record
};
})
.ToArray();
var entities =
existingRecords
.Entities
.Select(entity => new
{
fund = entity["field_1"].ToString(),
bps = Convert.ToDecimal(entity["field_2"]),
withdrawalPct = Convert.ToDecimal(entity["field_3"]),
percentile = Convert.ToInt32(entity["field_4"]),
age = Convert.ToInt32(entity["field_5"]),
bombOutTerm = Convert.ToDecimal(entity["field_6"]),
entity
})
.ToArray()
.GroupBy(x => new
{
x.fund,
x.bps,
x.withdrawalPct,
x.percentile,
x.age
}, x => new
{
x.bombOutTerm,
x.entity,
});
(2)
var query =
from i in inputs
join e in entities on new { i.fund, i.bps, i.withdrawalPct, i.percentile, i.age } equals e.Key
select new { input = i, matchingRows = e };
(3)
foreach (var x in query)
{
entitiesFound.AddRange(x.matchingRows.Select(y => y.entity));
if (x.matchingRows.Count() == 0)
{
rowsToAdd.Add(x.input.record);
}
else if (x.matchingRows.Count() == 1)
{
if (x.matchingRows.First().bombOutTerm != x.input.bombOutTerm)
{
rowsToUpdate.Add(x.input.record);
entitiesToUpdate.Add(x.matchingRows.First().entity);
}
}
else
{
entitiesToDelete.AddRange(x.matchingRows.Select(y => y.entity));
rowsToAdd.Add(x.input.record);
}
}
I would suspect that this will be the among the fastest approaches presented.

Resharper warning on *all* uses of an enumerable about "Possible Multiple Enumeration"

I'm getting the "Possible Multiple Enumeration of IEnumerable" warning from Reshaper. How to handle it is already asked in another SO question. My question is slightly more specific though, about the various places the warning will pop up.
What I'm wondering is whether or not Resharper is correct in giving me this warning. My main concern is that the warning occurs on all instances of the users variable below, indicated in code by "//Warn".
My code is gathering information to be displayed on a web page in a grid. I'm using server-side paging, since the entire data set can be tens or hundreds of thousands of rows long. I've commented the code as best as possible.
Again, please let me know whether or not this code is susceptible to multiple enumerations. My goal is to perform my filtering and sorting of data before calling ToList(). Is that the correct way to do this?
private List<UserRow> GetUserRows(UserFilter filter, int start, int limit,
string sort, SortDirection dir, out int count)
{
count = 0;
// LINQ applies filter to Users object
var users = (
from u in _userManager.Users
where filter.Check(u)
select new UserRow
{
UserID = u.UserID,
FirstName = u.FirstName,
LastName = u.LastName,
// etc.
}
);
// LINQ orders by given sort
if (!String.IsNullOrEmpty(sort))
{
if (sort == "UserID" && dir == SortDirection.ASC)
users = users.OrderBy(u => u.UserID); //Warn
else if (sort == "UserID" && dir == SortDirection.DESC)
users = users.OrderByDescending(u => u.UserID); //Warn
else if (sort == "FirstName" && dir == SortDirection.ASC)
users = users.OrderBy(u => u.FirstName); //Warn
else if (sort == "FirstName" && dir == SortDirection.DESC)
users = users.OrderByDescending(u => u.FirstName); //Warn
else if (sort == "LastName" && dir == SortDirection.ASC)
users = users.OrderBy(u => u.LastName); //Warn
else if (sort == "LastName" && dir == SortDirection.DESC)
users = users.OrderByDescending(u => u.LastName); //Warn
// etc.
}
else
{
users = users.Reverse(); //Warn
}
// Output variable
count = users.Count(); //Warn
// Guard case - shouldn't trigger
if (limit == -1 || start == -1)
return users.ToList(); //Warn
// Pagination and ToList()
return users.Skip((start / limit) * limit).Take(limit).ToList(); //Warn
}
Yes, ReSharper is right: count = users.Count(); enumerates unconditionally, and then if the limit or the start is not negative 1, the ToList would enumerate users again.
It appears that once ReSharper decides that something is at risk of being enumerated multiple times, it flags every single reference to the item in question with the multiple enumeration warning, even though it's not the code that is responsible for multiple enumeration. That's why you see the warning on so many lines.
A better approach would add a separate call to set the count. You can do it upfront in a separate statement, like this:
count = _userManager.Users.Count(u => filter.Check(u));
This way you would be able to leave users in its pre-enumerated state until the final call of ToList.
Your warning is hopefully generated by the call to Count, which does run an extra query.
In the case where limit == -1 || start == -1 you could make the ToList call and then get the count from that, but in the general case there's nothing you can do - you are making two queries, one for the full count and one for a subset of the items.
I'd be curious to see whether fixing the special case causes the warning to go away.
Edit: As this is LINQ-to-objects, you can replace the last return line with a foreach loop that goes through your whole collection counting them, but also builds up your restricted skip/take sublist dynamically, and thus only iterates once.
You could also benefit from only projecting (select new UserRow) in this foreach loop and just before your special-case ToList, rather than projecting your whole collection and then potentially discarding the majority of your objects.
var users = _userManager.Users.Where(u => filter.Check(u));
// Sort as above
List<UserRow> rtn;
if (limit == -1 || start == -1)
{
rtn = users.Select(u => new UserRow { UserID = u.UserID, ... }).ToList();
count = rtn.Length;
}
else
{
int takeFrom = (start / limit) * limit;
int forgetFrom = takeFrom + limit;
count = 0;
rtn = new List<UserRow>();
foreach(var u in users)
{
if (count >= takeFrom && count < forgetFrom)
rtn.Add(new UserRow { UserID = u.UserID, ... });
count++;
}
}
return rtn;

Use LINQ to group a sequence by date with no gaps

I'm trying to select a subgroup of a list where items have contiguous dates, e.g.
ID StaffID Title ActivityDate
-- ------- ----------------- ------------
1 41 Meeting with John 03/06/2010
2 41 Meeting with John 08/06/2010
3 41 Meeting Continues 09/06/2010
4 41 Meeting Continues 10/06/2010
5 41 Meeting with Kay 14/06/2010
6 41 Meeting Continues 15/06/2010
I'm using a pivot point each time, so take the example pivot item as 3, I'd like to get the following resulting contiguous events around the pivot:
ID StaffID Title ActivityDate
-- ------- ----------------- ------------
2 41 Meeting with John 08/06/2010
3 41 Meeting Continues 09/06/2010
4 41 Meeting Continues 10/06/2010
My current implementation is a laborious "walk" into the past, then into the future, to build the list:
var activity = // item number 3: Meeting Continues (09/06/2010)
var orderedEvents = activities.OrderBy(a => a.ActivityDate).ToArray();
// Walk into the past until a gap is found
var preceedingEvents = orderedEvents.TakeWhile(a => a.ID != activity.ID);
DateTime dayBefore;
var previousEvent = activity;
while (previousEvent != null)
{
dayBefore = previousEvent.ActivityDate.AddDays(-1).Date;
previousEvent = preceedingEvents.TakeWhile(a => a.ID != previousEvent.ID).LastOrDefault();
if (previousEvent != null)
{
if (previousEvent.ActivityDate.Date == dayBefore)
relatedActivities.Insert(0, previousEvent);
else
previousEvent = null;
}
}
// Walk into the future until a gap is found
var followingEvents = orderedEvents.SkipWhile(a => a.ID != activity.ID);
DateTime dayAfter;
var nextEvent = activity;
while (nextEvent != null)
{
dayAfter = nextEvent.ActivityDate.AddDays(1).Date;
nextEvent = followingEvents.SkipWhile(a => a.ID != nextEvent.ID).Skip(1).FirstOrDefault();
if (nextEvent != null)
{
if (nextEvent.ActivityDate.Date == dayAfter)
relatedActivities.Add(nextEvent);
else
nextEvent = null;
}
}
The list relatedActivities should then contain the contiguous events, in order.
Is there a better way (maybe using LINQ) for this?
I had an idea of using .Aggregate() but couldn't think how to get the aggregate to break out when it finds a gap in the sequence.
Here's an implementation:
public static IEnumerable<IGrouping<int, T>> GroupByContiguous(
this IEnumerable<T> source,
Func<T, int> keySelector
)
{
int keyGroup = Int32.MinValue;
int currentGroupValue = Int32.MinValue;
return source
.Select(t => new {obj = t, key = keySelector(t))
.OrderBy(x => x.key)
.GroupBy(x => {
if (currentGroupValue + 1 < x.key)
{
keyGroup = x.key;
}
currentGroupValue = x.key;
return keyGroup;
}, x => x.obj);
}
You can either convert the dates to ints by means of subtraction, or imagine a DateTime version (easily).
In this case I think that a standard foreach loop is probably more readable than a LINQ query:
var relatedActivities = new List<TActivity>();
bool found = false;
foreach (var item in activities.OrderBy(a => a.ActivityDate))
{
int count = relatedActivities.Count;
if ((count > 0) && (relatedActivities[count - 1].ActivityDate.Date.AddDays(1) != item.ActivityDate.Date))
{
if (found)
break;
relatedActivities.Clear();
}
relatedActivities.Add(item);
if (item.ID == activity.ID)
found = true;
}
if (!found)
relatedActivities.Clear();
For what it's worth, here's a roughly equivalent -- and far less readable -- LINQ query:
var relatedActivities = activities
.OrderBy(x => x.ActivityDate)
.Aggregate
(
new { List = new List<TActivity>(), Found = false, ShortCircuit = false },
(a, x) =>
{
if (a.ShortCircuit)
return a;
int count = a.List.Count;
if ((count > 0) && (a.List[count - 1].ActivityDate.Date.AddDays(1) != x.ActivityDate.Date))
{
if (a.Found)
return new { a.List, a.Found, ShortCircuit = true };
a.List.Clear();
}
a.List.Add(x);
return new { a.List, Found = a.Found || (x.ID == activity.ID), a.ShortCircuit };
},
a => a.Found ? a.List : new List<TActivity>()
);
Somehow, I don't think LINQ was truly meant to be used for bidirectional-one-dimensional-depth-first-searches, but I constructed a working LINQ using Aggregate. For this example I'm going to use a List instead of an array. Also, I'm going to use Activity to refer to whatever class you are storing the data in. Replace it with whatever is appropriate for your code.
Before we even start, we need a small function to handle something. List.Add(T) returns null, but we want to be able to accumulate in a list and return the new list for this aggregate function. So all you need is a simple function like the following.
private List<T> ListWithAdd<T>(List<T> src, T obj)
{
src.Add(obj);
return src;
}
First, we get the sorted list of all activities, and then initialize the list of related activities. This initial list will contain the target activity only, to start.
List<Activity> orderedEvents = activities.OrderBy(a => a.ActivityDate).ToList();
List<Activity> relatedActivities = new List<Activity>();
relatedActivities.Add(activity);
We have to break this into two lists, the past and the future just like you currently do it.
We'll start with the past, the construction should look mostly familiar. Then we'll aggregate all of it into relatedActivities. This uses the ListWithAdd function we wrote earlier. You could condense it into one line and skip declaring previousEvents as its own variable, but I kept it separate for this example.
var previousEvents = orderedEvents.TakeWhile(a => a.ID != activity.ID).Reverse();
relatedActivities = previousEvents.Aggregate<Activity, List<Activity>>(relatedActivities, (items, prevItem) => items.OrderBy(a => a.ActivityDate).First().ActivityDate.Subtract(prevItem.ActivityDate).Days.Equals(1) ? ListWithAdd(items, prevItem) : items).ToList();
Next, we'll build the following events in a similar fashion, and likewise aggregate it.
var nextEvents = orderedEvents.SkipWhile(a => a.ID != activity.ID);
relatedActivities = nextEvents.Aggregate<Activity, List<Activity>>(relatedActivities, (items, nextItem) => nextItem.ActivityDate.Subtract(items.OrderBy(a => a.ActivityDate).Last().ActivityDate).Days.Equals(1) ? ListWithAdd(items, nextItem) : items).ToList();
You can properly sort the result afterwards, as now relatedActivities should contain all activities with no gaps. It won't immediately break when it hits the first gap, no, but I don't think you can literally break out of a LINQ. So it instead just ignores anything which it finds past a gap.
Note that this example code only operates on the actual difference in time. Your example output seems to imply that you need some other comparison factors, but this should be enough to get you started. Just add the necessary logic to the date subtraction comparison in both entries.

Categories