Using parallelism with linq - c#

I have a foreach loop over the results of a LINQ query. I am trying to get it to run faster (it takes about an hour), but when I convert it to Parallel.ForEach, the results differ from those of the standard foreach, even though the run time is cut in half. Can those of you who are much better at LINQ and parallelism help me out with this?
I would really like some way to speed it up, and I am confused about why Parallel.ForEach is not giving me the same results.
The standard foreach:
var studentTestGroup = from st in this
group st by new { st.TestName, st.STI }
into studentGroups
select new { TestName = studentGroups.Key.TestName, STI = studentGroups.Key.STI, students = studentGroups };
//Loop through each group that has more than one test, or where there exists any retests at all.
foreach (var studentGroup in studentTestGroup.Where(t => t.students.Count() > 1 || t.students.Any(x => x.Retest == "Y")))
{
if (studentGroup.students.Any(t => t.Retest == "Y") && studentGroup.students.Count(t => t.Retest == "N" || t.Retest == "") == 1)
{
//For a test name and STI, if there exists a retest and only 1 non-retest, keep the highest and discard the rest
var studentToKeep = studentGroup.students.OrderByDescending(t => t.TestScaledScore).First();
this.RemoveAll(t => t.STI == studentGroup.STI && t.TestName == studentGroup.TestName && t.PrimaryKey != studentToKeep.PrimaryKey);
}
else if (studentGroup.students.Any(t => t.Retest == "Y") && studentGroup.students.Count(t => t.Retest == "N" || t.Retest == "") > 1)
{
//For a test name and STI, if there exists a retest and more than 1 non-retest,
//then keep the highest (number of non-retests) scores and discard the rest
int numRetests = studentGroup.students.Count(t => t.Retest == "N" || t.Retest == "");
var studentsToKeep = studentGroup.students.OrderByDescending(t => t.TestScaledScore).Take(numRetests);
this.RemoveAll(t => t.STI == studentGroup.STI && t.TestName == studentGroup.TestName && !studentsToKeep.Any(x => x.PrimaryKey == t.PrimaryKey));
}
else if (studentGroup.students.Any(t => t.Retest == "Y") && !studentGroup.students.Any(t => t.Retest == "N" || t.Retest == ""))
{
this.RemoveAll(t => t.STI == studentGroup.STI && t.TestName == studentGroup.TestName && Convert.ToInt32(t.TestScaledScore) < 400);
}
}
The part where I converted to a parallel foreach:
Parallel.ForEach (studentTestGroup.AsParallel().Where(t => t.students.Count() > 1 || t.students.Any(x => x.Retest == "Y")).AsParallel(), studentGroup =>
{

Do not combine PLINQ (AsParallel) and TPL (Parallel.ForEach). That can even reduce speed, because you overload the thread pool. Use one technique or the other. The most you can get from parallelism is a speed-up proportional to your CPU core count. Beyond that, use a profiler, or apply heuristics about your collections that only you know.
For the code you provided: do not duplicate work! For example:
studentGroup.students.Count(t => t.Retest == "N" || t.Retest == "")
can be calculated once instead of being recomputed in every condition.
The same goes for:
studentGroup.students.Any(t => t.Retest == "Y")
"Any" iterates the collection until the predicate matches (the whole collection if nothing matches), so do not iterate large collections again and again for every if statement with the same condition!
Ask yourself about your collections: maybe you could use dictionaries or other structures to look items up. As I said, that is more a matter of heuristics for your data, but it may provide some speed-up.
Hope this helps.
If you want more than that, you need a profiler.

What type is this?
I think there is a race condition on this.RemoveAll(). If you modify a list/collection from multiple threads at the same time, the result of the operation on the collection is undefined.
In this case you could put a lock statement around your RemoveAll() calls, but then the benefit of your parallel foreach would be gone.
Another possibility is to remember all items that should be removed and remove them after the foreach. Note that List<T>.Add is not thread-safe either, so collecting the items from multiple threads needs a lock or a concurrent collection such as ConcurrentBag<T>.
Edit:
This could be a faster implementation to remove the specified items:
// Student is a placeholder for the element type of this collection, which the question does not show.
var itemsToRemove = new List<Student>();
foreach (var studentGroup in studentTestGroup.Where(t => t.students.Count() > 1 || t.students.Any(x => x.Retest == "Y")))
{
    // Evaluate each predicate once per group instead of once per branch.
    int countNo = studentGroup.students.Count(t => t.Retest == "N" || t.Retest == "");
    bool anyYes = studentGroup.students.Any(t => t.Retest == "Y");
    if (anyYes && countNo == 1)
    {
        var studentToKeep = studentGroup.students.Single(t => t.Retest == "N" || t.Retest == "");
        itemsToRemove.AddRange(this.Where(t => t.STI == studentGroup.STI && t.TestName == studentGroup.TestName && t.PrimaryKey != studentToKeep.PrimaryKey));
    }
    else if (anyYes && countNo > 1)
    {
        // ToList() so the filter is not re-evaluated for every element checked below.
        var studentsToKeep = studentGroup.students.Where(t => t.Retest == "N" || t.Retest == "").ToList();
        itemsToRemove.AddRange(this.Where(t => t.STI == studentGroup.STI && t.TestName == studentGroup.TestName && !studentsToKeep.Any(x => x.PrimaryKey == t.PrimaryKey)));
    }
    else if (anyYes && countNo == 0)
    {
        itemsToRemove.AddRange(this.Where(t => t.STI == studentGroup.STI && t.TestName == studentGroup.TestName && Convert.ToInt32(t.TestScaledScore) < 400));
    }
}
// Nothing is removed until every group has been examined.
foreach (var itemToRemove in itemsToRemove)
{
    this.Remove(itemToRemove);
}
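The collect-then-remove idea can also be combined with Parallel.ForEach. The following is only a sketch under stated assumptions: Student and RetestCleaner are hypothetical names, TestScaledScore is assumed to be an int, and the keep/discard rules are taken from the question's original loop (keep as many top scores as there are non-retests; with no non-retests, discard scores below 400). The parallel part only reads per-group snapshots and writes to a ConcurrentBag; the list itself is modified once, on a single thread, afterwards.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's element type.
public class Student
{
    public int PrimaryKey { get; set; }
    public string TestName { get; set; }
    public string STI { get; set; }
    public string Retest { get; set; }
    public int TestScaledScore { get; set; } // assumed to be numeric
}

public static class RetestCleaner
{
    public static void RemoveSupersededScores(List<Student> students)
    {
        // Materialize the groups first so the parallel loop never touches the live list.
        var groups = students
            .GroupBy(s => new { s.TestName, s.STI })
            .Select(g => g.ToList())
            .Where(g => g.Count > 1 || g.Any(s => s.Retest == "Y"))
            .ToList();

        // Thread-safe collection for the keys that should be removed.
        var keysToRemove = new ConcurrentBag<int>();

        Parallel.ForEach(groups, group =>
        {
            // Groups without any retest are left untouched, as in the question.
            if (!group.Any(s => s.Retest == "Y")) return;

            int nonRetests = group.Count(s => s.Retest == "N" || s.Retest == "");

            // No non-retests: discard scores below 400.
            // Otherwise: keep the top `nonRetests` scores and discard the rest.
            IEnumerable<Student> discard = nonRetests == 0
                ? group.Where(s => s.TestScaledScore < 400)
                : group.OrderByDescending(s => s.TestScaledScore).Skip(nonRetests);

            foreach (var s in discard) keysToRemove.Add(s.PrimaryKey);
        });

        // Single-threaded removal with a hash lookup per element.
        var removeSet = new HashSet<int>(keysToRemove);
        students.RemoveAll(s => removeSet.Contains(s.PrimaryKey));
    }
}
```

Because each group is independent and the only shared write target is the ConcurrentBag, the result should not depend on thread scheduling. Whether it is actually faster than the sequential version is something to measure.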

Related

To convert if condition to linq in cshtml

Code
if(Model.CurrentStatus == 1 || Model.CurrentStatus == 2)
{
//can display those records..
}
else if((Model.CurrentStatus == 3 || Model.CurrentStatus == 4) && Model.Date != null)
{
if(Model.Date <= 30 days)
{
//can display those records..
}
}
I tried the following code but could not complete it as expected:
@Html.Partial("Filter", new IndexModel()
{
Id = Model.Id,
Collection = Model.Collection.Where((a => a.CurrentStatus == 1 || a.CurrentStatus == 2)
&& )
})
How can I convert the above if condition to LINQ in cshtml? Thanks.
The else-if relationship is an OR relationship, so simply combine the two conditions. The nested if inside the else-if is an AND relationship, so it goes into the second set of parentheses:
Collection = Model.Collection.Where(a =>
    (a.CurrentStatus == 1 || a.CurrentStatus == 2) ||
    // "a.Date <= 30" stands for the question's 30-day check
    ((a.CurrentStatus == 3 || a.CurrentStatus == 4) && a.Date != null && a.Date <= 30)
)
EDIT:
Here is another suggestion: extract the readable code into its own method that evaluates the condition and returns the boolean result. That gives you a predicate that the Where method accepts:
private bool IsForDisplay( ModelDataType Model )
{
if(Model.CurrentStatus == 1 || Model.CurrentStatus == 2)
{
//can display those records..
return true;
}
else if((Model.CurrentStatus == 3 || Model.CurrentStatus == 4) && Model.Date != null)
{
if(Model.Date <= 30 days)
{
//can display those records..
return true;
}
}
return false;
}
Now you can simply use it in the LINQ expression:
@Html.Partial("Filter", new IndexModel()
{
Id = Model.Id,
Collection = Model.Collection.Where(a => IsForDisplay(a))
});
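The "30 days" comparison in the code above is pseudocode carried over from the question. As a minimal sketch, assuming Date is a nullable DateTime and that the condition means "the date is no more than 30 days old", the predicate could look like this. Record, DisplayRules and the now parameter are hypothetical names introduced for the illustration.

```csharp
using System;

// Hypothetical stand-in for the model's item type.
public class Record
{
    public int CurrentStatus { get; set; }
    public DateTime? Date { get; set; } // assumed to be a nullable DateTime
}

public static class DisplayRules
{
    // 'now' is passed in so the rule can be checked with fixed dates.
    public static bool IsForDisplay(Record r, DateTime now)
    {
        // Statuses 1 and 2 are always displayed.
        if (r.CurrentStatus == 1 || r.CurrentStatus == 2) return true;

        // Statuses 3 and 4 are displayed only when a date exists and is at most 30 days old.
        bool lateStatus = r.CurrentStatus == 3 || r.CurrentStatus == 4;
        return lateStatus
            && r.Date.HasValue
            && (now - r.Date.Value).TotalDays <= 30;
    }
}
```

It would then be called as Model.Collection.Where(a => DisplayRules.IsForDisplay(a, DateTime.Now)), which works on an in-memory collection.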

How can I pull a repeated where clause expression from linq into a function?

I have a large project where I have dozens of linq statements where I am looking for a matching record by checking several fields to see if they match or both field and compared field are null.
var testRecord = new { firstField = "bob", secondField = (string)null, thirdField = "ross" };
var matchRecord = dataContext.RecordsTable.FirstOrDefault(vi =>
(vi.first == testRecord.firstField || ((vi.first == null || vi.first == string.Empty) && testRecord.firstField == null))
&& (vi.second == testRecord.secondField || ((vi.second == null || vi.second == string.Empty) && testRecord.secondField == null))
&& (vi.third == testRecord.thirdField || ((vi.third == null || vi.third == string.Empty) && testRecord.thirdField == null)));
//do stuff with matchRecord
Ideally I would replace all that duplicated code (used around 50 times across the system I'm working on) with something like the following
Expression<Func<string, string, bool>> MatchesOrBothNull = (infoItem, matchItem) => (
infoItem == matchItem || ((infoItem == null || infoItem == string.Empty) && matchItem == null));
var matchRecord = dataContext.RecordsTable.FirstOrDefault(vi =>
MatchesOrBothNull(vi.first, testRecord.firstField)
&& MatchesOrBothNull(vi.second, testRecord.secondField)
&& MatchesOrBothNull(vi.third, testRecord.thirdField));
//do stuff with matchRecord
My question is two-fold: First, is there a matched or both null function already available? (I've looked without luck).
Second, the code block above compiles, but throws a "no supported translation to sql" error, is there a way to have a function in the where clause? I know that there is a translation because it works if I don't pull it into the function. How can I get that translated?
First of all, you can check whether a string is null or empty with a single call: String.IsNullOrEmpty(vi.first). You need a method like this one:
public bool MatchesOrBothNull(string first, string second)
{
    // True when the values are equal, or when the stored value is null/empty and the value to match is null.
    return first == second || (String.IsNullOrEmpty(first) && second == null);
}
You can use it in a where clause once the rows are in memory. A plain C# method like this has no SQL translation, so it has to run client-side (AsEnumerable() below), which loads the whole table:
var matchRecord = dataContext.RecordsTable.AsEnumerable().Where(vi =>
    MatchesOrBothNull(vi.first, testRecord.firstField)
    && MatchesOrBothNull(vi.second, testRecord.secondField)
    && MatchesOrBothNull(vi.third, testRecord.thirdField)
).FirstOrDefault();
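To keep the comparison inside the query itself, one option is to build each condition as an expression tree rather than calling a method. The sketch below is an assumption-laden illustration, not a tested LINQ to SQL recipe: NullSafeMatch and Row are hypothetical names. Since the value to match is known when the query is built, the helper picks the simpler of the two branches up front. It produces the same kind of tree the compiler generates for the inline comparison, so a provider that translates the inline version would be expected to handle this one too, but that should be verified against the actual provider.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for a row of RecordsTable.
public class Row
{
    public string First { get; set; }
    public string Second { get; set; }
}

public static class NullSafeMatch
{
    // Builds: row => field(row) == match                      when match is not null
    //         row => field(row) == null || field(row) == ""   when match is null
    public static Expression<Func<T, bool>> MatchesOrBothNull<T>(
        Expression<Func<T, string>> field, string match)
    {
        Expression body;
        if (match == null)
        {
            body = Expression.OrElse(
                Expression.Equal(field.Body, Expression.Constant(null, typeof(string))),
                Expression.Equal(field.Body, Expression.Constant(string.Empty, typeof(string))));
        }
        else
        {
            body = Expression.Equal(field.Body, Expression.Constant(match, typeof(string)));
        }

        // Reuse the selector's parameter so the body stays bound to the same row.
        return Expression.Lambda<Func<T, bool>>(body, field.Parameters);
    }
}
```

Chained Where calls are combined with AND, so the three-field lookup would read query.Where(NullSafeMatch.MatchesOrBothNull<Row>(r => r.First, a)).Where(NullSafeMatch.MatchesOrBothNull<Row>(r => r.Second, b)).FirstOrDefault(). The type argument has to be given explicitly because it cannot be inferred from the lambda alone.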

Entity Framework: Count() very slow on large DbSet and complex WHERE clause

I need to perform a count operation on this Entity Framework (EF6) data set using a relatively complex expression as a WHERE clause and expecting it to return about 100k records.
The count operation is obviously where records are materialized and therefore the slowest of operations to take place. The count operation is taking about 10 seconds in our production environment which is unacceptable.
Note that the operation is performed on the DbSet directly (db being the Context class), so no lazy loading should be taking place.
How can I further optimize this query in order to speed up the process?
The main use case is displaying an index page with multiple filter criteria, but the function is also used for writing generic queries to the ParcelOrders table as required for other operations in the service classes. That might be a bad idea: laziness leads to very complex queries, which could become a problem in the future.
The count is later used for pagination, and a much smaller number of records (e.g. 500) is actually displayed. This is a database-first project using SQL Server.
ParcelOrderSearchModel is a C# class that serves to encapsulate query parameters and is used exclusively by service classes in order to call the GetMatchingOrders function.
Note that on the majority of calls, the majority of the parameters of ParcelOrderSearchModel will be null.
public List<ParcelOrderDto> GetMatchingOrders(ParcelOrderSearchModel searchModel)
{
// cryptic id known --> allow public access without login
if (String.IsNullOrEmpty(searchModel.KeyApplicationUserId) && searchModel.ExactKey_CrypticID == null)
throw new UnableToCheckPrivilegesException();
Func<ParcelOrder, bool> userPrivilegeValidation = (x => false);
if (searchModel.ExactKey_CrypticID != null)
{
userPrivilegeValidation = (x => true);
}
else if (searchModel.KeyApplicationUserId != null)
userPrivilegeValidation = privilegeService.UserPrivilegeValdationExpression(searchModel.KeyApplicationUserId);
var criteriaMatchValidation = CriteriaMatchValidationExpression(searchModel);
var parcelOrdersWithNoteHistoryPoints = db.HistoryPoint.Where(hp => hp.Type == (int)HistoryPointType.Note)
.Select(hp => hp.ParcelOrderID)
.Distinct();
Func<ParcelOrder, bool> completeExpression = order => userPrivilegeValidation(order) && criteriaMatchValidation(order);
searchModel.PaginationTotalCount = db.ParcelOrder.Count(completeExpression);
// todo: use this count for pagination
}
public Func<ParcelOrder, bool> CriteriaMatchValidationExpression(ParcelOrderSearchModel searchModel)
{
Func<ParcelOrder, bool> expression =
po => po.ID == 1;
expression =
po =>
(searchModel.KeyUploadID == null || po.UploadID == searchModel.KeyUploadID)
&& (searchModel.KeyCustomerID == null || po.CustomerID == searchModel.KeyCustomerID)
&& (searchModel.KeyContainingVendorProvidedId == null || (po.VendorProvidedID != null && searchModel.KeyContainingVendorProvidedId.Contains(po.VendorProvidedID)))
&& (searchModel.ExactKeyReferenceNumber == null || (po.CustomerID + "-" + po.ReferenceNumber) == searchModel.ExactKeyReferenceNumber)
&& (searchModel.ExactKey_CrypticID == null || po.CrypticID == searchModel.ExactKey_CrypticID)
&& (searchModel.ContainsKey_ReferenceNumber == null || (po.CustomerID + "-" + po.ReferenceNumber).Contains(searchModel.ContainsKey_ReferenceNumber))
&& (searchModel.OrKey_Referencenumber_ConsignmentID == null ||
((po.CustomerID + "-" + po.ReferenceNumber).Contains(searchModel.OrKey_Referencenumber_ConsignmentID)
|| (po.VendorProvidedID != null && po.VendorProvidedID.Contains(searchModel.OrKey_Referencenumber_ConsignmentID))))
&& (searchModel.KeyClientName == null || po.Parcel.Name.ToUpper().Contains(searchModel.KeyClientName.ToUpper()))
&& (searchModel.KeyCountries == null || searchModel.KeyCountries.Contains(po.Parcel.City.Country))
&& (searchModel.KeyOrderStates == null || searchModel.KeyOrderStates.Contains(po.State.Value))
&& (searchModel.KeyFromDateRegisteredToOTS == null || po.DateRegisteredToOTS > searchModel.KeyFromDateRegisteredToOTS)
&& (searchModel.KeyToDateRegisteredToOTS == null || po.DateRegisteredToOTS < searchModel.KeyToDateRegisteredToOTS)
&& (searchModel.KeyFromDateDeliveredToVendor == null || po.DateRegisteredToVendor > searchModel.KeyFromDateDeliveredToVendor)
&& (searchModel.KeyToDateDeliveredToVendor == null || po.DateRegisteredToVendor < searchModel.KeyToDateDeliveredToVendor);
return expression;
}
public Func<ParcelOrder, bool> UserPrivilegeValdationExpression(string userId)
{
var roles = GetRolesForUser(userId);
Func<ParcelOrder, bool> expression =
po => po.ID == 1;
if (roles != null)
{
if (roles.Contains("ParcelAdministrator"))
expression =
po => true;
else if (roles.Contains("RegionalAdministrator"))
{
var user = db.AspNetUsers.First(u => u.Id == userId);
if (user.RegionalAdministrator != null)
{
expression =
po => po.HubID == user.RegionalAdministrator.HubID;
}
}
else if (roles.Contains("Customer"))
{
var customerID = db.AspNetUsers.First(u => u.Id == userId).CustomerID;
expression =
po => po.CustomerID == customerID;
}
else
{
expression =
po => false;
}
}
return expression;
}
If you can possibly avoid it, don't count for pagination. Just return the first page. It's always expensive to count and adds little to the user experience.
And in any case you're building the dynamic search wrong.
You're calling IEnumerable.Count(Func<ParcelOrder,bool>), which will force client-side evaluation where you should be calling IQueryable.Count(Expression<Func<ParcelOrder,bool>>). Here:
Func<ParcelOrder, bool> completeExpression = order => userPrivilegeValidation(order) && criteriaMatchValidation(order);
searchModel.PaginationTotalCount = db.ParcelOrder.Count(completeExpression);
But there's a simpler, better pattern for this in EF: just conditionally add criteria to your IQueryable.
E.g., put a method on your DbContext like this:
public IQueryable<ParcelOrder> SearchParcels(ParcelOrderSearchModel searchModel)
{
    // Typed as IQueryable so the filtered query can be assigned back to q.
    IQueryable<ParcelOrder> q = this.ParcelOrder;
    if (searchModel.KeyUploadID != null)
    {
        q = q.Where(po => po.UploadID == searchModel.KeyUploadID);
    }
    if (searchModel.KeyCustomerID != null)
    {
        q = q.Where(po => po.CustomerID == searchModel.KeyCustomerID);
    }
    //. . .
    return q;
}
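The same pattern can be tried without a database. This is a minimal sketch with trimmed-down, hypothetical ParcelOrder and SearchModel classes and an in-memory list exposed through AsQueryable(); with EF the source would be the DbSet, and the Where calls would become part of the generated SQL instead of running in memory.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed, hypothetical versions of the question's classes.
public class ParcelOrder
{
    public int ID { get; set; }
    public int? UploadID { get; set; }
    public int? CustomerID { get; set; }
}

public class SearchModel
{
    public int? KeyUploadID { get; set; }
    public int? KeyCustomerID { get; set; }
}

public static class ParcelSearch
{
    // Adds a Where clause only for the criteria that are actually set.
    public static IQueryable<ParcelOrder> Apply(IQueryable<ParcelOrder> q, SearchModel m)
    {
        if (m.KeyUploadID != null)
            q = q.Where(po => po.UploadID == m.KeyUploadID);
        if (m.KeyCustomerID != null)
            q = q.Where(po => po.CustomerID == m.KeyCustomerID);
        return q;
    }
}
```

Since the result is still an IQueryable, the count and the page can both come from the same composed query, for example filtered.Count() and filtered.OrderBy(po => po.ID).Skip(offset).Take(pageSize).ToList(), where filtered, offset and pageSize are illustrative names.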

Is it possible to pass a condition that is not initialized?

Simply put I was wondering if this is possible?
if (Descriptionsearch.Checked)
{
searchResult = searchResultBuilder(a.Description == textBox.Text))
}
else if (titleSearch.Checked)
{
searchReuslt = searchResultBuilder(a.title == textBox.Text))
}
As you can see I am simply sending a condition of a variable that has not yet been initialized but will be at the time of use.
private List<int> searchResultBuilder(Func<bool> condition)
{
foreach (var element in currentPosition.Where(a => condition()))
{
searchResults.Add(currentPosition.IndexOf(element));
}
return searchResults;
}
I simply wanted to know if there is a way to do this.
Since people are asking, this is the foreach loop from my original code:
foreach(var element in main.currentPosition.Where(a => (a.key != null && main.msgSigCollection1.msgSig[(int)a.key].Description.IndexOf(searchTextBox.Text, StringComparison.OrdinalIgnoreCase) >= 0) || (a.value != null && main.msgSigCollection2.msgSig[(int)a.value].Description.IndexOf(searchTextBox.Text, StringComparison.OrdinalIgnoreCase) >= 0)))
{
searchResults.Add(main.currentPosition.IndexOf(element));
}
where currentPosition is a List<int?,int?>
You need to use lambda syntax to achieve what you want.
Ex: (input parameters) => expression
if (Descriptionsearch.Checked)
    searchResult = searchResultBuilder(a => a.Description == textBox.Text);
else if (titleSearch.Checked)
    searchResult = searchResultBuilder(a => a.title == textBox.Text);
MSDN
It's hard to tell based on your question, but I think what you want is to be passing lambdas, which is not what you're doing currently.
Something like this seems closer to what you want:
if (Descriptionsearch.Checked)
    searchResult = searchResultBuilder(a => a.Description == textBox.Text);
else if (titleSearch.Checked)
    searchResult = searchResultBuilder(a => a.title == textBox.Text);
private List<int> searchResultBuilder<T>(Func<T, bool> condition){
var searchResults = new List<int>();
foreach (var element in currentPosition.Where(condition))
{
searchResults.Add(currentPosition.IndexOf(element));
}
return searchResults;
}
In truth though, your question should state what you're trying to accomplish, not just how you're attempting to get there. There probably is a much easier way to accomplish everything you're trying to do here with LINQ.
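As a self-contained illustration of passing the condition as a lambda, here is a small sketch. Message, Search and FindIndexes are hypothetical names, and the list is passed in explicitly instead of being read from a field. It walks the list by index rather than calling IndexOf for every match, which avoids rescanning the list and also behaves sensibly when the list contains duplicates.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type with the two searchable fields from the question.
public class Message
{
    public string Title { get; set; }
    public string Description { get; set; }
}

public static class Search
{
    // Returns the positions of all items for which the condition holds.
    public static List<int> FindIndexes<T>(IReadOnlyList<T> items, Func<T, bool> condition)
    {
        var result = new List<int>();
        for (int i = 0; i < items.Count; i++)
        {
            if (condition(items[i])) result.Add(i);
        }
        return result;
    }
}
```

The caller then chooses the condition at the call site, for example Search.FindIndexes(messages, m => m.Description == text) or Search.FindIndexes(messages, m => m.Title == text). The lambda's parameter only receives a value when the method invokes it for each element, which is what the question was asking for.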

Despite attaching the object to the context, I get the error "Cannot remove an entity that has not been attached"

I have attached the object to the context, yet I am still getting the error "Cannot remove an entity that has not been attached."
if (itemRemove != -1)
{
//var deleteDetails = DBContext.ProductCustomizationMasters.Where(p => p.ProductID == this.ProductID && p.CustomCategoryID == catId && p.CustomType == (short)catTypeId).Single();
var deleteDetails = DBContext.ProductCustomizationMasters.Single(p => p.ProductID == this.ProductID && p.CustomCategoryID == catId && p.CustomType == (short)catTypeId);
DBContext.ProductCustomizationMasters.Attach(deleteDetails);
DBContext.ProductCustomizationMasters.DeleteOnSubmit(deleteDetails);
RemoveCategoryItems(catId, catTypeId);
}
private void RemoveCategoryItems(int catId, CategoryType catTypeId)
{
switch (catTypeId)
{
case CategoryType.Topping:
(this.ToppingItems.Where(xx => xx.ToppingInfo.CatID == catId && xx.ProductID == this.ProductID).Single()).IsDefault = false;
FreeToppingItems.RemoveAll(x => x.ProductID == this.ProductID && x.ToppingInfo.CatID == catId);
break;
case CategoryType.Dressing:
(this.DressingItems.Where(xx => xx.DressingInfo.CatID == catId && xx.ProductID == this.ProductID).Single()).IsDefault = false;
FreeDressingItems.RemoveAll(x => x.ProductID == this.ProductID && x.DressingInfo.CatID == catId);
break;
case CategoryType.SpecialInstruction:
(this.InstructionItems.Where(xx => xx.InstructionInfo.CatID == catId && xx.ProductID == this.ProductID).Single()).IsDefault = false;
FreeInstructionItems.RemoveAll(x => x.ProductID == this.ProductID && x.InstructionInfo.CatID == catId);
break;
}
}
FIRST IDEA - You don't need to attach the item that you are going to delete. The item is already attached to the context and the state is being managed. Just skip the attach line and delete the object.
EDIT Since the attach doesn't appear to have been your problem, give this a shot:
if (itemRemove != -1)
{
var deleteDetails = DBContext.ProductCustomizationMasters.Single(p => p.ProductID == this.ProductID && p.CustomCategoryID == catId && p.CustomType == (short)catTypeId);
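//Obviously, this isn't going to work directly; you need to actually assign the ID / primary key field here...
var deleteMe = new ProductCustomizationMasters() { PrimaryKey = deleteDetails.PrimaryKey };
//Attach and DeleteOnSubmit are methods of the table, not of the DataContext itself
DBContext.ProductCustomizationMasters.Attach(deleteMe);
DBContext.ProductCustomizationMasters.DeleteOnSubmit(deleteMe);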
DBContext.SubmitChanges();
RemoveCategoryItems(catId, catTypeId);
}
EDIT AGAIN - The code you have posted doesn't appear to be the source of your problem. Something outside of this code is marking an object for deletion from the context that has not been attached. I suggest working back through your code, inspecting all references to "DeleteOnSubmit", and making sure that those entities are attached when you mark them.
