Foreach loop takes too much time to complete - C#

I have a foreach over a list. The list will always be huge, around 200K items.
Inside the foreach, the logic works against another collection of roughly 1 million items. On each iteration that collection is filtered, a property is updated on the matching items, and the collection is returned as it is. With this approach the process never completes.
foreach (var list in iterationlist)
{
    var filteredCollections = collection.Where(a => a.name == list.name);
    foreach (var x in filteredCollections)
    {
        x.city = "xxxx";
    }
}
What are the ways to make this logic faster? The current implementation has been running for more than 3 hours and still has not completed.

You can use a Lookup (it's similar to Dictionary<key, List<value>>):
var lookup = collection.ToLookup(a => a.name);
foreach (var list in iterationlist)
    foreach (var x in lookup[list.name])
        x.city = "xxxx";
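One difference from a Dictionary worth noting: indexing a Lookup with a key that has no entries returns an empty sequence instead of throwing, which is why the inner foreach above needs no extra Contains check. A small standalone illustration (the sample data is made up):
var lookup = new[] { "Bonn", "Berlin", "Bremen", "Munich" }.ToLookup(s => s[0]);
foreach (var city in lookup['B'])   // three matching cities
    Console.WriteLine(city);
foreach (var city in lookup['X'])   // key not present: empty sequence, no exception
    Console.WriteLine(city);        // never executes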
Another option is to hash the names:
var names = new HashSet<string>(iterationlist.Select(x => x.name));
foreach (var x in collection)
    if (names.Contains(x.name))
        x.city = "xxxx";


More efficient way of using LINQ to compare two items?

I am updating records on a SharePoint list based on data from a SQL database. Let's say my table looks something like this:
VendorNumber  ItemNumber  Description
1001          1           abc
1001          2           def
1002          1           ghi
1002          3           jkl
There can be multiple keys in each table. I am trying to make a generic solution that will work for multiple different table structures. In the above example, VendorNumber and ItemNumber would be considered keys.
I am able to retrieve the SharePoint list items as a C# List<Microsoft.SharePoint.Client.ListItem>.
I need to search through that List to determine which individual ListItem corresponds to the current SQL DataRow I am on. Since both ListItem and DataRow allow bracket notation to specify column names, this is easy to do using LINQ if you only have one key column. What I need is a way to do this with anywhere from 1 to N keys. I have found this solution but realize it is very inefficient. Is there a more efficient way of doing this?
List<string> keyFieldNames = new List<string>() { "VendorNumber", "ItemNumber" };
List<ListItem> itemList = MyFunction_GetSharePointItemList();
DataRow row = MyFunction_GetOneRow();
//this is the part I would like to make more efficient:
foreach (string key in keyFieldNames)
{
    //this filters the list with each successive pass.
    itemList = itemList.FindAll(item => item[key].ToString().Trim() == row[key].ToString().Trim());
}
Edited to Add: Here is a link to the ListItem class documentation:
Microsoft.SharePoint.Client.ListItem
While ListItem is not a DataTable object, its structure is very similar. I have intentionally designed it so that both the ListItem and my DataRow object will have the same number of columns and the same column names. This was done to make comparing them easier.
A quick optimization tip first:
Create a Dictionary<string, string> to use instead of row
List<string> keyFieldNames = new List<string>() { "VendorNumber", "ItemNumber" };
DataRow row = MyFunction_GetOneRow();
// Key: field name; Value: the trimmed row value, computed once up front.
var rowData = keyFieldNames.ToDictionary(name => name, name => row[name].ToString().Trim());
foreach (string key in keyFieldNames)
{
    itemList = itemList.FindAll(item => item[key].ToString().Trim() == rowData[key]);
}
This avoids doing the ToString and Trim on the same row values over and over; that is probably taking a third to half of the loop's time (the comparison itself is fast compared to the string manipulation).
Beyond that, all I can think of is using reflection to build a specific comparison function on the fly. But that would be a big effort, and I don't see it saving that much time. Whatever you do will still have to do the same basic work: look up the values by key and compare them, and that is what takes the majority of the time.
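If the goal is simply to avoid re-filtering the list once per key, a lighter-weight alternative to building a function via reflection is a single precomputed predicate over all keys. A rough sketch reusing the question's itemList, row and keyFieldNames (an illustration, not the reflection approach described above):
// Compute the row-side strings once.
var rowValues = keyFieldNames.ToDictionary(
    name => name,
    name => row[name].ToString().Trim());

// One predicate that checks every key, applied in a single pass over the items.
Predicate<ListItem> matchesRow = item =>
    keyFieldNames.All(key => item[key].ToString().Trim() == rowValues[key]);

itemList = itemList.FindAll(matchesRow);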
After I stopped looking for an answer, I stumbled across one. I have now realized that .Where uses deferred execution. This means that even though the foreach loop builds up several filters, the combined LINQ query only executes once, when the result is finally enumerated. This was the part I was struggling to wrap my head around.
My new pseudocode:
List<string> keyFieldNames = new List<string>() { "VendorNumber", "ItemNumber" };
// IEnumerable here so the Where calls can be chained without executing yet.
IEnumerable<ListItem> itemList = MyFunction_GetSharePointItemList();
DataRow row = MyFunction_GetOneRow();
//this is the part I would like to make more efficient:
foreach (string key in keyFieldNames)
{
    //this adds a filter with each pass; nothing runs until itemList is enumerated.
    itemList = itemList.Where(item => item[key].ToString().Trim() == row[key].ToString().Trim());
}
I know the .ToString().Trim() is still inefficient and I will address that at some point, but for now my mind can rest knowing that the LINQ query executes only once.
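For anyone else puzzled by this, here is a minimal standalone illustration of deferred execution (made-up data, nothing to do with SharePoint):
var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

IEnumerable<int> query = numbers;
query = query.Where(n => n % 2 == 0); // nothing is filtered yet
query = query.Where(n => n > 2);      // still nothing is filtered

// Both filters run here, in a single pass over the list: { 4, 6 }
var result = query.ToList();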

Correctly stepping through a list of dates in a foreach loop

I have a problem in my code: every run starts with the same date from the date list. I set the process up to stop the foreach loop after it had run for the first date, but on the next run (when the second date should be picked up in "arr") it starts again from the first date. How can I do this the right way?
This is my code:
#region Präsenzzeit-Check
var arrivalDates = dataAccess.Get_dates(userID, date, DateTime.DaysInMonth(date.Year, date.Month)); //get dates for user
var arrival = DateTime.Now;
A_Zeitplan presences = null;
int processed = 0;
foreach (var arr in arrivalDates)
{
    var arrivalDate = dataAccess.CheckPraesenzzeit(userID, arr);
    arrival = arrivalDate.Select(x => x.ZPZ_Bis).Where(x => x != null).LastOrDefault();
    foreach (var pr in presenceBlock)
    {
        presences = pr;
    }
    if (++processed == 1) break;
}
errorMessages.Add(error);
#endregion
Looking at your code, I'm not sure what the second foreach loop is for at all, and because of the break the first foreach loop only ever runs once (in which case it's not really a loop at all, just a code block).
I presume that rather than breaking, you need to perform some method call and process your items in your loop, one by one. Alternatively, put them in a modifiable collection, remove the first one, process it and leave the remaining ones, so that next time this code runs its first item is a different one. This concept / container is called a Queue - a collection where items are added to the end and processed from the start (first in, first out).
I don't have any specific recommendations for how to modify this code snippet to implement these, because it's a very small snippet with no surrounding context and it's not immediately obvious how it integrates into your wider program. The list of dates is fetched each time, which will defeat the "just take the first" style I propose. Perhaps you should do something elsewhere so that Get_dates doesn't return the same first date any more (mark it as processed somehow and exclude it from consideration).
As such, in a pseudocode point of view I think you need to look at a code structure like:
var collection = GetData(); //only return unprocessed items
foreach(item in collection)
ProcessItemAndMarkAsCompleted(item)
Where you process all items in a loop and then move on with your program, or you need to queue things up and process them:
_classLevelQueueCollection.AddAllToEnd(GetData()); //GetData might return between 0 and 10 items
var item = _classLevelQueueCollection.Pop(); //remove one item from queue front
ProcessItemAndMarkAsCompleted(item);
In the second approach, the only way the queue gets shorter is if GetData() often returns 0 items; if on average it returns more than 1 item, your queue will get longer and longer.
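For reference, a rough sketch of that queue pattern with the BCL's Queue<T> might look like this (GetData and ProcessItemAndMarkAsCompleted are the placeholder names from above; the field and method names here are made up):
// Class-level queue so unprocessed dates survive between calls.
private readonly Queue<DateTime> _pendingDates = new Queue<DateTime>();

private void ProcessNextDate()
{
    foreach (var d in GetData())      // enqueue any newly fetched dates at the end
        _pendingDates.Enqueue(d);

    if (_pendingDates.Count > 0)
    {
        var item = _pendingDates.Dequeue();  // take the oldest unprocessed date from the front
        ProcessItemAndMarkAsCompleted(item);
    }
}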
This would be inefficient, but would work:
var collection = GetData(); //GetData returns all unprocessed items
var item = collection[0]; //get the first
ProcessItemAndMarkAsCompleted(item); //when marked as completed that item will not be returned by GetData() next time

How to update values in a list, using a foreach loop

I have some values stored in a database that I am querying.
For each object returned, I want to save the value to my list.
Currently I'm doing it in a function that I call once per second, like this:
foreach (var obj in results)
{
MyList.Add (ValueSavedOnDatabase);
}
But my issue is that if I have, for example, 20 objects in the database, this function saves the values in the first 20 indexes of the list.
Then each time the function gets called (once per second), it adds 20 new entries to the list instead of just overwriting the existing 20 indexes, which is what I need.
So after 3 seconds I end up with 60 entries in the list, when I only wanted 20.
Does anyone have advice on how I can overwrite these indexes in the list instead of continually adding new ones?
I'm using Unity and C#.
You need to empty your list before the loop:
MyList.Clear(); // Clear the list
foreach (var obj in results) { // do whatever you need
MyList.Add (ValueSavedOnDatabase);
}
Or if the size of the list is always the same per each call, you can do this:
int i = 0;
foreach (var obj in results) {
MyList[i] = ValueSavedOnDatabase;
i++;
}
N.B. I only suggest this last approach if the list is already populated (e.g. an ArrayList that already holds the elements), since assigning by index requires the element to already exist.

LINQ Intersect but add the result to a New List

I have a list called DiscountableObject. Each item in the list in turn has a Discounts collection. What I need is a list of Discounts that are common across all DiscountableObjects.
Code:
List<Discount> IntersectionOfDiscounts = new List<Discount>();
foreach (var discountableItem in DiscoutableObject)
{
    IntersectionOfDiscounts = IntersectionOfDiscounts.Intersect(discountableItem.Discounts).ToList();
}
This will undoubtedly return an empty list, because IntersectionOfDiscounts was empty in the first place.
What I want is to take item 1 of DiscountableObject, compare it with the next item of DiscountableObject, and so on.
I know what I am trying to do is wrong because I am doing the intersection and building the list at the same time, but how else to do it baffles me.
How do I get around this?
Initialize IntersectionOfDiscounts to the first item's Discounts list (after checking there is at least one item) rather than to an empty list. You can then skip the first item in the foreach loop.
// add check to ensure at least 1 item.
List<Discount> IntersectionOfDiscounts = DiscoutableObject.First().Discounts;
foreach(var discountableItem in DiscoutableObject.Skip(1))
{
    IntersectionOfDiscounts = IntersectionOfDiscounts.Intersect(discountableItem.Discounts).ToList();
}
Possibly more elegant way:
var intersection = discountableObject
.Select(discountableItem => discountableItem.Discounts)
.Aggregate( (current, next) => current.Intersect(next).ToList());
Missed your 6 minute deadline, but I like it anyway...
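One caveat, not from the answer: the seedless Aggregate overload throws an InvalidOperationException when the source sequence is empty, so if the collection can be empty it may be safer to guard it, for example:
// Assumes Discounts is a List<Discount>, as in the accepted answer above.
var intersection = discountableObject.Any()
    ? discountableObject
        .Select(discountableItem => discountableItem.Discounts)
        .Aggregate((current, next) => current.Intersect(next).ToList())
    : new List<Discount>();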

HashSet Iterating While Removing Items in C#

I have a HashSet in C# from which I am removing items when a condition is met while iterating through it, and I cannot do this with a foreach loop as shown below.
foreach (String hashVal in hashset)
{
    if (hashVal == "somestring")
    {
        hashset.Remove("somestring");
    }
}
So, how can I remove elements while iterating?
Use the RemoveWhere method of HashSet instead:
hashset.RemoveWhere(s => s == "somestring");
You specify a condition/predicate as the parameter to the method. Any item in the hashset that matches the predicate will be removed.
This avoids the problem of modifying the hashset whilst it is being iterated over.
In response to your comment:
's' represents the current item being evaluated from within the hashset.
The above code is equivalent to:
hashset.RemoveWhere(delegate(string s) {return s == "somestring";});
or:
hashset.RemoveWhere(ShouldRemove);
public bool ShouldRemove(string s)
{
return s == "somestring";
}
EDIT:
Something has just occurred to me: since HashSet is a set that contains no duplicate values, just calling hashset.Remove("somestring") will suffice. There is no need to do it in a loop as there will never be more than a single match.
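In other words, no loop is needed at all; Remove also reports whether the value was actually present:
bool removed = hashset.Remove("somestring"); // true if it was in the set, false otherwise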
You can't remove items from a collection while looping over it with an enumerator. Two approaches to solve this are:
Loop backwards over the collection using a regular indexed for-loop (which I believe is not an option in the case of a HashSet)
Loop over the collection, add items to be removed to another collection, then loop over the "to-be-deleted"-collection and remove the items:
Example of the second approach:
HashSet<string> hashSet = new HashSet<string>();
hashSet.Add("one");
hashSet.Add("two");
List<string> itemsToRemove = new List<string>();
foreach (var item in hashSet)
{
    if (item == "one")
    {
        itemsToRemove.Add(item);
    }
}
foreach (var item in itemsToRemove)
{
    hashSet.Remove(item);
}
I would avoid using two foreach loops - one foreach loop is enough:
HashSet<string> anotherHashSet = new HashSet<string>();
foreach (var item in hashSet)
{
    if (!shouldBeRemoved) // shouldBeRemoved: whatever condition marks the item for removal
    {
        anotherHashSet.Add(item);
    }
}
hashSet = anotherHashSet;
For people who are looking for a way to process elements in a HashSet while removing them, I did it the following way:
var set = new HashSet<int> { 1, 2, 3 };
while (set.Count > 0)
{
    var element = set.FirstOrDefault();
    Process(element);
    set.Remove(element);
}
There is a much simpler solution here.
var mySet = new HashSet<string>();
foreach (var val in mySet.ToArray())
{
    Console.WriteLine(val);
    mySet.Remove(val);
}
.ToArray() already creates a copy for you, so you can loop to your heart's content.
Usually when I want to iterate over a collection and remove values, I loop backwards by index:
// Works for indexed collections such as List<T>; HashSet<T> has no indexer.
for (int index = list.Count - 1; index >= 0; index--)
{
    if (ShouldRemove(list[index]))
        list.RemoveAt(index);
}
