Foreach loop problem for IQueryable object - c#

Can we use a foreach loop with an IQueryable object?
I'd like to do something as follows:
IQueryable<Myclass> query = objDataContext.Myclass; // objDataContext is an instance of the LINQ DataContext class
int[] arr1 = new int[] { 3, 4, 5 };
foreach (int i in arr1)
{
query = query.Where(q => (q.f_id1 == i || q.f_id2 == i || q.f_id3 == i));
}
I get the wrong output because the value of i changes on each iteration.

The problem you're facing is deferred execution. You should be able to find a lot of information on this, but basically none of the query code is executed until you actually read data from the IQueryable (by enumerating it, converting it to a List, or a similar operation). This means it all happens after the foreach has finished, when i is set to its final value.
If I recall correctly, one thing you can do is initialize a new variable inside the loop, like this:
foreach (int i in arr1)
{
int tmp = i;
query = query.Where(q => (q.f_id1 == tmp || q.f_id2 == tmp || q.f_id3 == tmp));
}
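Because the new variable is re-created on each iteration, every lambda captures its own copy, and that copy does not change before you execute the IQueryable. (Since C# 5 the foreach iteration variable is itself scoped per iteration, so the copy is only needed with older compilers or inside a for loop.)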
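A minimal in-memory sketch of the same effect, assuming LINQ to Objects instead of a real data context. The captured variable is declared outside the loop on purpose, so the shared-variable behaviour shows up on any compiler version; rows, shared, copied and current are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] rows = { 3, 4, 5, 6 };   // hypothetical stand-in for the table
int[] arr1 = { 3, 4 };

// One variable shared by every lambda: each filter reads its final value (4).
IEnumerable<int> shared = rows;
int current = 0;
foreach (int value in arr1)
{
    current = value;
    shared = shared.Where(r => r != current);
}

// A fresh copy per iteration: each filter keeps the value it was built with.
IEnumerable<int> copied = rows;
foreach (int value in arr1)
{
    int tmp = value;
    copied = copied.Where(r => r != tmp);
}

// Nothing has run yet; the filters execute only when the results are enumerated.
Console.WriteLine(string.Join(",", shared)); // 3,5,6
Console.WriteLine(string.Join(",", copied)); // 5,6
```

The first query filters out 4 twice and never filters out 3, which is the kind of "wrong output" described in the question.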

You don't need a foreach; try it like this:
query = objDataContext.Myclass.Where(q => (arr1.Contains(q.f_id1) || arr1.Contains(q.f_id2) || arr1.Contains(q.f_id3)));

This is because "i" is not evaluated until you actually iterate the query, and by that time I believe "i" will hold the last value.

Related

Loop to check for duplicate strings

I want to create a loop to check a list of titles for duplicates.
I currently have this:
var productTitles = SeleniumContext.Driver.FindElements(By.XPath(ComparisonTableElements.ProductTitle));
foreach (var x in productTitles)
{
var title = x.Text;
productTitles = SeleniumContext.Driver.FindElements(By.XPath(ComparisonTableElements.ProductTitle));
foreach (var y in productTitles.Skip(productTitles.IndexOf(x) + 1))
{
if (title == y.Text)
{
Assert.Fail("Found duplicate product in the table");
}
}
}
But this seems to take the item I skip out of the array for the next loop, so item 2 is never checked against item 1; it moves straight to item 3.
I was under the impression that Skip just passed over the index you pass in rather than removing it from the list.
You can use GroupBy:
var anyDuplicates = SeleniumContext
.Driver
.FindElements(By.XPath(ComparisonTableElements.ProductTitle))
.GroupBy(p => p.Text, p => p)
.Any(g => g.Count() > 1);
Assert.That(anyDuplicates, Is.False);
or Distinct:
var productTitles = SeleniumContext
.Driver
.FindElements(By.XPath(ComparisonTableElements.ProductTitle))
.Select(p => p.Text)
.ToArray();
var distinctProductTitles = productTitles.Distinct().ToArray();
Assert.AreEqual(productTitles.Length, distinctProductTitles.Length);
Or, if it is enough to find a first duplicate without counting all of them it's better to use a HashSet<T>:
var titles = new HashSet<string>();
foreach (var title in SeleniumContext
.Driver
.FindElements(By.XPath(ComparisonTableElements.ProductTitle))
.Select(p => p.Text))
{
if (!titles.Add(title))
{
Assert.Fail("Found duplicate product in the table");
}
}
All of these approaches have better computational complexity (O(n)) than what you propose (O(n²)).
You don't need a nested loop. Simply use the Where() function to find all entries with the same title; if there is more than one, they're duplicates:
var productTitles = SeleniumContext.Driver.FindElements(By.XPath(ComparisonTableElements.ProductTitle));
foreach(var x in productTitles) {
if (productTitles.Where(y => x.Text == y.Text).Count() > 1) {
Assert.Fail("Found duplicate product in the table");
}
}
I would try a slightly different way since you only need to check for duplicates in a one-dimensional array.
You only have to compare each element with the elements that come after it in the array/collection, so using LINQ to iterate through all of the items seems a bit unnecessary.
Here's a piece of code to illustrate:
var productTitles = SeleniumContext.Driver.FindElements(By.XPath(ComparisonTableElements.ProductTitle));
for (int i = 0; i < productTitles.Count; i++)
{
var currentObject = productTitles[i];
for (int j = i + 1; j < productTitles.Count; j++)
{
if (currentObject.Text == productTitles[j].Text)
{
// here's your duplicate
}
}
}
Since you've checked that item at index 0 is not the same as item placed at index 3 there's no need to check that again when you're at index 3. The items will remain the same.
The Skip(IEnumerable, n) method returns a new IEnumerable that doesn't "contain" the first n elements of the IEnumerable it's called on. The source itself is left unchanged.
Also, I don't know what sort of behaviour could arise from this, but I wouldn't assign a new IEnumerable to the variable that the foreach is iterating over.
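A small sketch of that Skip behaviour on an in-memory list (the titles are made up; the Selenium collection is assumed to behave the same way as any other IEnumerable here):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var titles = new List<string> { "A", "B", "C" };   // made-up titles

// Skip builds a new lazy sequence; it does not remove anything from titles.
var rest = titles.Skip(1).ToList();

Console.WriteLine(string.Join(",", rest));   // B,C
Console.WriteLine(titles.Count);             // 3
```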
Here's another possible solution with LINQ:
int i = 0;
foreach (var x in productTitles)
{
var possibleDuplicate = productTitles.Skip(++i).FirstOrDefault(y => y.Text == x.Text);
// if possibleDuplicate is not null, x has a duplicate later in the list
//do stuff here
}
This goes without saying, but the best solution for you will depend on what you are trying to do. Also, I think the Skip call is more trouble than it's worth, as I'm fairly sure it makes the search less efficient.

Most efficient way to search enumerable

I am writing a small program that takes in a .csv file as input with about 45k rows. I am trying to compare the contents of this file with the contents of a table on a database (SQL Server through dynamics CRM using Xrm.Sdk if it makes a difference).
In my current program (which takes about 25 minutes to compare; the file and the database are identical here, both 45k rows with no differences), I have all existing records from the database in a DataCollection<Entity>, which inherits Collection<T> and IEnumerable<T>.
In my code below I am filtering using the Where method and then applying logic based on the count of matches. The Where seems to be the bottleneck here. Is there a more efficient approach than this? I am by no means a LINQ expert.
foreach (var record in inputDataLines)
{
var fields = record.Split(',');
var fund = fields[0];
var bps = Convert.ToDecimal(fields[1]);
var withdrawalPct = Convert.ToDecimal(fields[2]);
var percentile = Convert.ToInt32(fields[3]);
var age = Convert.ToInt32(fields[4]);
var bombOutTerm = Convert.ToDecimal(fields[5]);
var matchingRows = existingRecords.Entities.Where(r => r["field_1"].ToString() == fund
&& Convert.ToDecimal(r["field_2"]) == bps
&& Convert.ToDecimal(r["field_3"]) == withdrawalPct
&& Convert.ToDecimal(r["field_4"]) == percentile
&& Convert.ToDecimal(r["field_5"]) == age);
entitiesFound.AddRange(matchingRows);
if (matchingRows.Count() == 0)
{
rowsToAdd.Add(record);
}
else if (matchingRows.Count() == 1)
{
if (Convert.ToDecimal(matchingRows.First()["field_6"]) != bombOutTerm)
{
rowsToUpdate.Add(record);
entitiesToUpdate.Add(matchingRows.First());
}
}
else
{
entitiesToDelete.AddRange(matchingRows);
rowsToAdd.Add(record);
}
}
EDIT: I can confirm that all existingRecords are in memory before this code is executed. There is no IO or DB access in the above loop.
Himbrombeere is right, you should execute the query first and put the result into a collection before you use Any, Count, AddRange or whatever method will execute the query again. In your code it's possible that the query is executed 5 times in every loop iteration.
Watch out for the term deferred execution in the documentation. A method implemented that way can be used to construct a LINQ query (so you can chain it with other methods, and at the end you have a query). Only methods that don't use deferred execution, like Count, Any or ToList (or a plain foreach), will actually execute it. If you don't want the whole query to be executed every time and you have to access it multiple times, it's better to store the result in a collection (for example with ToList).
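To make the repeated execution visible, here is a rough sketch with a predicate that counts its own calls. The data and the names calls, query and cached are assumptions for illustration, not part of the original program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int calls = 0;
int[] source = { 1, 2, 3, 4 };

// The predicate counts its own invocations so re-execution is visible.
IEnumerable<int> query = source.Where(n => { calls++; return n % 2 == 0; });

int first = query.Count();    // runs the filter over all 4 items
int second = query.Count();   // runs it over all 4 items again
int deferredCalls = calls;    // 8

calls = 0;
List<int> cached = query.ToList();   // runs the filter once more, then stores the result
int third = cached.Count;            // property read, no predicate calls
int cachedCalls = calls;             // 4

Console.WriteLine($"{first} {second} {third}");           // 2 2 2
Console.WriteLine($"{deferredCalls} vs {cachedCalls}");   // 8 vs 4
```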
However, you could use a different approach which should be much more efficient: a Lookup<TKey, TElement>, which is similar to a dictionary and can be used with an anonymous type as key:
var lookup = existingRecords.Entities.ToLookup(r => new
{
fund = r["field_1"].ToString(),
bps = Convert.ToDecimal(r["field_2"]),
withdrawalPct = Convert.ToDecimal(r["field_3"]),
percentile = Convert.ToInt32(r["field_4"]), // int, so the key type matches the one built in the loop
age = Convert.ToInt32(r["field_5"])
});
Now you can access this lookup in the loop very efficiently.
foreach (var record in inputDataLines)
{
var fields = record.Split(',');
var fund = fields[0];
var bps = Convert.ToDecimal(fields[1]);
var withdrawalPct = Convert.ToDecimal(fields[2]);
var percentile = Convert.ToInt32(fields[3]);
var age = Convert.ToInt32(fields[4]);
var bombOutTerm = Convert.ToDecimal(fields[5]);
var matchingRows = lookup[new {fund, bps, withdrawalPct, percentile, age}].ToList();
entitiesFound.AddRange(matchingRows);
if (matchingRows.Count() == 0)
{
rowsToAdd.Add(record);
}
else if (matchingRows.Count() == 1)
{
if (Convert.ToDecimal(matchingRows.First()["field_6"]) != bombOutTerm)
{
rowsToUpdate.Add(record);
entitiesToUpdate.Add(matchingRows.First());
}
}
else
{
entitiesToDelete.AddRange(matchingRows);
rowsToAdd.Add(record);
}
}
Note that this works even if the key does not exist (an empty sequence is returned).
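A self-contained sketch of that lookup behaviour, assuming made-up tuple rows in place of the CRM entities. The key properties must have the same names, types and order on both sides, otherwise the compiler treats them as different anonymous types.

```csharp
using System;
using System.Linq;

// Made-up rows standing in for the entities.
var rows = new[]
{
    (Fund: "A", Age: 30, Term: 1.5m),
    (Fund: "A", Age: 30, Term: 2.0m),
    (Fund: "B", Age: 40, Term: 3.0m),
};

// Anonymous types compare by value, so they work as composite keys.
var lookup = rows.ToLookup(r => new { Fund = r.Fund, Age = r.Age });

int hits = lookup[new { Fund = "A", Age = 30 }].Count();     // 2
int misses = lookup[new { Fund = "Z", Age = 99 }].Count();   // 0, no exception

Console.WriteLine($"{hits} {misses}");   // 2 0
```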
Add a ToList after your Convert.ToDecimal(r["field_5"]) == age) line to force immediate execution of the query.
var matchingRows = existingRecords.Entities.Where(r => r["field_1"].ToString() == fund
&& Convert.ToDecimal(r["field_2"]) == bps
&& Convert.ToDecimal(r["field_3"]) == withdrawalPct
&& Convert.ToDecimal(r["field_4"]) == percentile
&& Convert.ToDecimal(r["field_5"]) == age)
.ToList();
The Where doesn't actually execute your query, it just prepares it. The actual execution is deferred until the results are needed. In your case that happens when calling Count, which iterates the entire collection of items. If the first condition fails, the second one is checked, leading to a second iteration of the complete collection when calling Count again. In that case you actually execute the query a third time when calling matchingRows.First().
By forcing immediate execution you execute the query only once, and thus iterate the entire collection only once, which will decrease your overall time.
Another option, which is basically along the same lines as the other answers, is to prepare your data first, so that you're not repeatedly calling things like r["field_2"] (which are relatively slow to look up).
This is a (1) clean your data, (2) query/join your data, (3) process your data approach.
Do this:
(1)
var inputs =
inputDataLines
.Select(record =>
{
var fields = record.Split(',');
return new
{
fund = fields[0],
bps = Convert.ToDecimal(fields[1]),
withdrawalPct = Convert.ToDecimal(fields[2]),
percentile = Convert.ToInt32(fields[3]),
age = Convert.ToInt32(fields[4]),
bombOutTerm = Convert.ToDecimal(fields[5]),
record
};
})
.ToArray();
var entities =
existingRecords
.Entities
.Select(entity => new
{
fund = entity["field_1"].ToString(),
bps = Convert.ToDecimal(entity["field_2"]),
withdrawalPct = Convert.ToDecimal(entity["field_3"]),
percentile = Convert.ToInt32(entity["field_4"]),
age = Convert.ToInt32(entity["field_5"]),
bombOutTerm = Convert.ToDecimal(entity["field_6"]),
entity
})
.ToArray()
.GroupBy(x => new
{
x.fund,
x.bps,
x.withdrawalPct,
x.percentile,
x.age
}, x => new
{
x.bombOutTerm,
x.entity,
});
(2)
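var query =
from i in inputs
join e in entities on new { i.fund, i.bps, i.withdrawalPct, i.percentile, i.age } equals e.Key into groups
// group join: inputs without a match are kept, with an empty matchingRows
select new { input = i, matchingRows = groups.SelectMany(g => g).ToArray() };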
(3)
foreach (var x in query)
{
entitiesFound.AddRange(x.matchingRows.Select(y => y.entity));
if (x.matchingRows.Count() == 0)
{
rowsToAdd.Add(x.input.record);
}
else if (x.matchingRows.Count() == 1)
{
if (x.matchingRows.First().bombOutTerm != x.input.bombOutTerm)
{
rowsToUpdate.Add(x.input.record);
entitiesToUpdate.Add(x.matchingRows.First().entity);
}
}
else
{
entitiesToDelete.AddRange(x.matchingRows.Select(y => y.entity));
rowsToAdd.Add(x.input.record);
}
}
I suspect that this will be among the fastest approaches presented.

Comparing 2 strings in a linq to sql statement

var j = from c in User.USERs
where (c.USER_NAME.Equals(tempUserName))
select c;
This keeps giving me an empty sequence.
Both are just strings; I'm comparing user input with the database.
Do something like this:
var j = User.USERs.First(c => c.USER_NAME == tempUserName);
or
var j = User.USERs.Single(c => c.USER_NAME == tempUserName);
or just take the first element of the result your own query gives you, e.g. j.First().
P.S. - both First and Single will throw an exception if no item matches the query. If you want null returned when nothing is found, use FirstOrDefault or SingleOrDefault respectively.
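The differences can be seen in a small in-memory sketch. The names array is made up, and this runs against LINQ to Objects rather than LINQ to SQL, so treat it as an illustration of the operators only.

```csharp
using System;
using System.Linq;

string[] names = { "ann", "bob", "bob" };   // made-up user names

// No match: FirstOrDefault yields null, First throws.
var missing = names.FirstOrDefault(n => n == "zed");
bool firstThrew = false;
try { names.First(n => n == "zed"); }
catch (InvalidOperationException) { firstThrew = true; }

// Two matches: First returns the first one, Single throws.
var firstBob = names.First(n => n == "bob");
bool singleThrew = false;
try { names.Single(n => n == "bob"); }
catch (InvalidOperationException) { singleThrew = true; }

Console.WriteLine($"{missing == null} {firstThrew} {firstBob} {singleThrew}");
// True True bob True
```

Single is the stricter choice when user names are supposed to be unique, since it also fails on duplicates.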
To broaden the search, try something like this:
string userToSearchFor = tempUserName.Trim().ToLower();
var j = User.USERs.FirstOrDefault(c => c.USER_NAME.ToLower() == userToSearchFor);
if (j != null)
{
//found something
}
If it is returning an empty sequence, then your where clause is evaluating to false for every row. Check what SQL it is generating if you need to solve that problem first.
To answer your question, to get a single element you usually use
.Single()
.SingleOrDefault()
.First()
.FirstOrDefault()
I would do it like this:
var result = User.USERs.SingleOrDefault(x => x.USER_NAME.Equals(tempUserName));
if (result != null)
{
//do your thing
}

Linq Collection gets reset on next iteration of foreach

I have the following foreach expression within which I am building a predicate and then filtering the collection by executing .Where().
But what stumps me is, result.Count() gives me 0 even before I execute .Where() in the next iteration.
var result = SourceCollection;
foreach (var fieldName in FilterKeys)
{
if (!conditions.ContainsKey(fieldName)) continue;
if (!conditions[fieldName].IsNotNullOrEmpty()) continue;
var param = conditions[fieldName];
Func<BaseEntity, bool> predicate = (d) => fieldName != null && d.GetFieldValue(fieldName).ContainsIgnoreCase(param);
result = result.Where(predicate);
}
Does anybody know of any LINQ behavior that I might have overlooked that is causing this?
I think you want this:
var result = SourceCollection;
foreach (var fieldName in FilterKeys)
{
if (!conditions.ContainsKey(fieldName)) continue;
if (!conditions[fieldName].IsNotNullOrEmpty()) continue;
var param = conditions[fieldName];
var f = fieldName; // per-iteration copy for the lambda to capture
Func<BaseEntity, bool> predicate = (d) => f != null && d.GetFieldValue(f).ContainsIgnoreCase(param);
result = result.Where(predicate);
}
Notice the use of f in the predicate. You don't want to capture the foreach variable. In your original code, when the second iteration starts, param is still the captured value from the first iteration, but fieldName has changed.
I agree that the issue comes from not capturing the foreach variable, but there is a deeper issue and that is the combination of LINQ with imperative control flow structures - i.e. mixing LINQ & foreach in the first place.
Try this instead:
var predicates =
from fieldName in FilterKeys
where conditions.ContainsKey(fieldName)
let param = conditions[fieldName]
where param.IsNotNullOrEmpty()
select (Func<BaseEntity, bool>)
(d => fieldName != null
&& d.GetFieldValue(fieldName).ContainsIgnoreCase(param));
var result = predicates.Aggregate(
SourceCollection as IEnumerable<BaseEntity>,
(xs, p) => xs.Where(p));
No foreach loop needed and the code stays purely in LINQ. Friends shouldn't let friends mix imperative and functional... :-)

Linq query referencing objects and a string array

I am having some trouble in converting the following code to use LINQ.
int occurs = 0;
foreach (string j in items)
{
if (!string.IsNullOrEmpty(j))
{
WorkflowModule tempWM = new WorkflowModule(j);
if (tempWM.StateID == item.StateID)
{
occurs++;
}
}
}
return occurs;
So far, I have:-
var lstItems = (from lstItem in items
where !string.IsNullOrEmpty(lstItem)
let objWorkflowModule = new WorkflowModule(lstItem)
select new
{
tempWM = objWorkflowModule.StateID
}).Where(item.StateID == tempWM));
return lstItems.Count();
but intellisense is not liking the line '.Where(item.StateID == tempWM))'
Can anyone help me achieve this?
Thanks.
When you use the method syntax, you need to use a lambda on the Where operator:
...
}).Where(x => x.tempWM == item.StateID);
In other words, you need to "declare" the variable x which holds the result of the previous part of the query.
It doesn't look like item is initialized anywhere in your statement.
Here's how I'd do this
var lstItems = from lstItem in items
where !string.IsNullOrEmpty(lstItem)
let objWorkflowModule = new WorkflowModule(lstItem)
select objWorkflowModule.StateID;
return lstItems.Count(t => t == item.StateID);
I'm assuming item is a variable defined outside of the original code you submitted. Basically, you don't need to create the anonymous class in the query, and you can move the predicate from your Where into Count instead. But as others have said, the main issue is that you need to express your predicate as a lambda.
