Linq query against a collection property with a collection parameter

Linq query against a collection property with a collection parameter - c#

I have a method which takes an array of strings as parameter and queries against a collection property which is also a collection of strings. If that property has one of the values inside the string array passed as parameter, it should be returned.
Here is my code:
public IEnumerable<BlogPost> GetAll(string[] tags,
bool includeUnapprovedEntries = false) {
foreach (var tag in tags) {
foreach (var blogPost in GetAll(includeUnapprovedEntries).
ToList().Where(x => x.Tags.Any(t => t == tag))) {
yield return blogPost;
}
}
}
Note:
Here is the complete code:
https://github.com/tugberkugurlu/MvcBloggy/blob/master/src/MvcBloggy.Data/DataAccess/SqlServer/BlogPostRepository.cs
This does the job but it just doesn't seem right. I could have made this better with some extension methods but couldn't figure out what would do the trick and make this implementation right.
Any idea?

How about this:
public IEnumerable<BlogPost> GetAll(string[] tags,
bool includeUnapprovedEntries = false) {
return GetAll(includeUnapprovedEntries)
.Where(x => x.Tags.Any(t => tags.Contains(t));
}
You may want to call ToList() to materialize the result. Note that this will (hopefully!) result in an IN query in SQL; if you have a large number of tags, I wouldn't be surprised if that failed. (I don't know how the Entity Framework handles that situation.) I believe it should be okay with smaller numbers of tags though.
Note that whether or not this is supported may depend on the version of the entity framework you're using; I seem to remember that some transformations like this (using Contains on a "local" collection) to translate to IN in SQL) have improved over time. Make sure you develop against the same version you'll be deploying against :)

Related

Setting value of object property in List<T> using LINQ

I have a List<Email>() and my Email object looks like this:
public class Email
{
public string EmailAddress { get; set; }
public bool IsPrimary { get; set; }
}
When I add a new email address that is set as primary, I want to set all the others as non-primary. I currently handle this using a foreach. Can I handle this using LINQ?
My current code is:
foreach (var item in emails)
{
if(item.EmailAddress.ToLower() != newEmailAddress.ToLower() && item.IsPrimary)
item.IsPrimary = false;
}

Linq queries collections, it doesn't modify them. The only spot in this equation that linq would come into play is actually making it a part of the enumeration - filtering the collection you're iterating over rather than doing an if statement inside it.
foreach (var item in emails.Where(e => e.IsPrimary && !e.EmailAddress.Equals(newEmailAddress, StringComparison.InvariantCultureIgnoreCase)))
{
item.IsPrimary = false;
}
EDIT: I didn't originally include it as it's not LINQ and that's what the question is about, but as mentioned in the comments on your question List<T> does include a ForEach method.
It would look like this:
emails.ForEach(item =>
{
item.IsPrimary = item.IsPrimary && item.EmailAddress.Equals(newEmailAddress, StringComparison.InvariantCultureIgnoreCase);
});

LINQ is intended for querying and not modification. Having said that, there is a List.ForEach operator, but with no increase in readability most of the time.
Having said that, I personally prefer not having side effect causing code that modifies the collection but I am not opposed to modifying the objects in the collection.
Add an extension method on IEnumerable to encapsulate the foreach loop:
public static void ForEach<T>(this IEnumerable<T> source, Action<T> action) {
foreach (var s in source)
action(s);
}
Then you can re-write your code as follows:
emails.Where(item => item.IsPrimary && !item.EmailAddress.Equals(newEmailAddress, StringComparison.InvariantCultureIgnoreCase))
.ForEach(item => item.IsPrimary = false);
(Thanks to #McAden for the better string comparison I always forget.)
However, since you are creating a race condition anyway, if practical I would suggest reversing your order of operations:
// before adding newEmailAddress
emails[emails.FindIndex(item => item.IsPrimary)].IsPrimary = false; // add error handling if it is possible no `IsPrimary` exists.
// now assign the newEmailAddress and set that item.IsPrimary to true

You can easily do that, but you should not as no one would expect LINQ code to modify the items in the collection.
emails
.Where(item =>
(item.EmailAddress.ToLower() != newEmailAddress.ToLower() && item.IsPrimary)
.Select(item => { item.IsPrimary = false; return true;})
.All();
Note that since LINQ queries are actually executed when result is enumerated you need something that will actually enumerate result. I.e. .All() call.
What would happen after you write this code - someone (or you in a week) remove that stupid and pointless .All() call at the end and things will be somewhat
ok, but modification no longer happen, person will spend a day sorting it out and then use some words to describe author of the code. Don't go there.

Custom Method in LINQ Query

I sum myself to the hapless lot that fumbles with custom methods in LINQ to EF queries. I've skimmed the web trying to detect a pattern to what makes a custom method LINQ-friendly, and while every source says that the method must be translatable into a T-SQL query, the applications seem very diverse. So, I'll post my code here and hopefully a generous SO denizen can tell me what I'm doing wrong and why.
The Code
public IEnumerable<WordIndexModel> GetWordIndex(int transid)
{
return (from trindex in context.transIndexes
let trueWord = IsWord(trindex)
join trans in context.Transcripts on trindex.transLineUID equals trans.UID
group new { trindex, trans } by new { TrueWord = trueWord, trindex.transID } into grouped
orderby grouped.Key.word
where grouped.Key.transID == transid
select new WordIndexModel
{
Word = TrueWord,
Instances = grouped.Select(test => test.trans).Distinct()
});
}
public string IsWord(transIndex trindex)
{
Match m = Regex.Match(trindex.word, #"^[a-z]+(\w*[-]*)*",
RegexOptions.IgnoreCase);
return m.Value;
}
With the above code I access a table, transIndex that is essentially a word index of culled from various user documents. The problem is that not all entries are actually words. Nubers, and even underscore lines, such as, ___________,, are saved as well.
The Problem
I'd like to keep only the words that my custom method IsWord returns (at the present time I have not actually developed the parsing mechanism). But as the IsWord function shows it will return a string.
So, using let I introduce my custom method into the query and use it as a grouping parameter, the is selectable into my object. Upon execution I get the omninous:
LINQ to Entities does not recognize the method
'System.String IsWord(transIndex)' method, and this
method cannot be translated into a store expression."
I also need to make sure that only records that match the IsWord condition are returned.
Any ideas?

It is saying it does not understand your IsWord method in terms of how to translate it to SQL.
Frankly it does not do much anyway, why not replace it with
return (from trindex in context.transIndexes
let trueWord = trindex.word
join trans in context.Transcripts on trindex.transLineUID equals trans.UID
group new { trindex, trans } by new { TrueWord = trueWord, trindex.transID } into grouped
orderby grouped.Key.word
where grouped.Key.transID == transid
select new WordIndexModel
{
Word = TrueWord,
Instances = grouped.Select(test => test.trans).Distinct()
});
What methods can EF translate into SQL, i can't give you a list, but it can never translate a straight forward method you have written. But their are some built in ones that it understands, like MyArray.Contains(x) for example, it can turn this into something like
...
WHERE Field IN (ArrItem1,ArrItem2,ArrItem3)
If you want to write a linq compatible method then you need to create an expresion tree that EF can understand and turn into SQL.
This is where things star to bend my mind a little but this article may help http://blogs.msdn.com/b/csharpfaq/archive/2009/09/14/generating-dynamic-methods-with-expression-trees-in-visual-studio-2010.aspx.

If the percentage of bad records in return is not large, you could consider enumerate the result set first, and then apply the processing / filtering?
var query = (from trindex in context.transIndexes
...
select new WordIndexModel
{
Word,
Instances = grouped.Select(test => test.trans).Distinct()
});
var result = query.ToList().Where(word => IsTrueWord(word));
return result;
If the number of records is too high to enumerate, consider doing the check in a view or stored procedure. That will help with speed and keep the code clean.
But of course, using stored procedures has disadvatages of reusability and maintainbility (because of no refactoring tools).
Also, check out another answer which seems to be similar to this one: https://stackoverflow.com/a/10485624/3481183

How do I make a streaming LINQ expression that delivers the filtered out items as well as the filtered items?

I am transforming an Excel spreadsheet into a list of "Elements" (this is a domain term). During this transformation, I need to skip the header rows and throw out malformed rows that cannot be transformed.
Now comes the fun part. I need to capture those malformed records so that I can report on them. I constructed a crazy LINQ statement (below). These are extension methods hiding the messy LINQ operations on the types from the OpenXml library.
var elements = sheet
.Rows() <-- BEGIN sheet data transform
.SkipColumnHeaders()
.ToRowLookup()
.ToCellLookup()
.SkipEmptyRows() <-- END sheet data transform
.ToElements(strings) <-- BEGIN domain transform
.RemoveBadRecords(out discard)
.OrderByCompositeKey();
The interesting part starts at ToElements, where I transform the row lookup to my domain object list (details: it's called an ElementRow, which is later transformed into an Element). Bad records are created with just a key (the Excel row index) and are uniquely identifiable vs. a real element.
public static IEnumerable<ElementRow> ToElements(this IEnumerable<KeyValuePair<UInt32Value, Cell[]>> map)
{
return map.Select(pair =>
{
try
{
return ElementRow.FromCells(pair.Key, pair.Value);
}
catch (Exception)
{
return ElementRow.BadRecord(pair.Key);
}
});
}
Then, I want to remove those bad records (it's easier to collect all of them before filtering). That method is RemoveBadRecords, which started like this...
public static IEnumerable<ElementRow> RemoveBadRecords(this IEnumerable<ElementRow> elements)
{
return elements.Where(el => el.FormatId != 0);
}
However, I need to report the discarded elements! And I don't want to muddy my transform extension method with reporting. So, I went to the out parameter (taking into account the difficulties of using an out param in an anonymous block)
public static IEnumerable<ElementRow> RemoveBadRecords(this IEnumerable<ElementRow> elements, out List<ElementRow> discard)
{
var temp = new List<ElementRow>();
var filtered = elements.Where(el =>
{
if (el.FormatId == 0) temp.Add(el);
return el.FormatId != 0;
});
discard = temp;
return filtered;
}
And, lo! I thought I was hardcore and would have this working in one shot...
var discard = new List<ElementRow>();
var elements = data
/* snipped long LINQ statement */
.RemoveBadRecords(out discard)
/* snipped long LINQ statement */
discard.ForEach(el => failures.Add(el));
foreach(var el in elements)
{
/* do more work, maybe add more failures */
}
return new Result(elements, failures);
But, nothing was in my discard list at the time I looped through it! I stepped through the code and realized that I successfully created a fully-streaming LINQ statement.
The temp list was created
The Where filter was assigned (but not yet run)
And the discard list was assigned
Then the streaming thing was returned
When discard was iterated, it contained no elements, because the elements weren't iterated over yet.
Is there a way to fix this problem using the thing I constructed? Do I have to force an iteration of the data before or during the bad record filter? Is there another construction that I've missed?
Some Commentary
Jon mentioned that the assignment /was/ happening. I simply wasn't waiting for it. If I check the contents of discard after the iteration of elements, it is, in fact, full! So, I don't actually have an assignment problem. Unless I take Jon's advice on what's good/bad to have in a LINQ statement.

When the statement was actually iterated, the Where clause ran and temp filled up, but discard was never assigned again!
It doesn't need to be assigned again - the existing list which will have been assigned to discard in the calling code will be populated.
However, I'd strongly recommend against this approach. Using an out parameter here is really against the spirit of LINQ. (If you iterate over your results twice, you'll end up with a list which contains all the bad elements twice. Ick!)
I'd suggest materializing the query before removing the bad records - and then you can run separate queries:
var allElements = sheet
.Rows()
.SkipColumnHeaders()
.ToRowLookup()
.ToCellLookup()
.SkipEmptyRows()
.ToElements(strings)
.ToList();
var goodElements = allElements.Where(el => el.FormatId != 0)
.OrderByCompositeKey();
var badElements = allElements.Where(el => el.FormatId == 0);
By materializing the query in a List<>, you only process each row once in terms of ToRowLookup, ToCellLookup etc. It does mean you need to have enough memory to keep all the elements at a time, of course. There are alternative approaches (such as taking an action on each bad element while filtering it) but they're still likely to end up being fairly fragile.
EDIT: Another option as mentioned by Servy is to use ToLookup, which will materialize and group in one go:
var lookup = sheet
.Rows()
.SkipColumnHeaders()
.ToRowLookup()
.ToCellLookup()
.SkipEmptyRows()
.ToElements(strings)
.OrderByCompositeKey()
.ToLookup(el => el.FormatId == 0);
Then you can use:
foreach (var goodElement in lookup[false])
{
...
}
and
foreach (var badElement in lookup[true])
{
...
}
Note that this performs the ordering on all elements, good and bad. An alternative is to remove the ordering from the original query and use:
foreach (var goodElement in lookup[false].OrderByCompositeKey())
{
...
}
I'm not personally wild about grouping by true/false - it feels like a bit of an abuse of what's normally meant to be a key-based lookup - but it would certainly work.

Filtering a string list with logic?

I have a list of strings. I need to be able to filter them in a similar way to a Google query.
Ex: NOT water OR (ice AND "fruit juice")
Meaning return strings that do not have the word water or return strings that can have water if they have ice and "fruit juice".
Is there a mechanism in .NET that can allow the user to write queries in this form (say in a textbox) and given a List or IEnumerable of string, return the ones that contain this.
Can LINQ maybe do something like this?
I am aware that I can do this with LINQ, I'm more concerned with parsing an arbitrary string into an executable expression.

There is nothing built in.
You will need to parse such a string yourself and possibly use the Expression classes to build up an executable expression from which to filter.

For this query: Meaning return strings that do not have the word water or return strings that can have water if they have ice and "fruit juice".
Try something like this if you are going to use LINQ
yourList.Where(i => !i.Contains("water") ||
(i.Contains("water") &&
i.Contains("ice") &&
i.Contains("fruit juice")));

I think we can't answer you using a code sample since as you said this is more logical.
what I would do in such a scenario is to have predefined conditions (saved somewhere) which will contain all the conditions you need along with their "coding translations" like:
and
or
and not
and or
etc...
and what you will do at runtime is to translate these conditions/criteria into sql or linq or whatever language you need to pass it to.

In LINQ: list.Where(item => !item.Contains("water") || (item.Contains("ice") && item.Contains("fruit juice")))

You can try to use the DataTable.Select method:
public static class ExpressionExtensions {
public static IEnumerable<T> Select<T>(this IEnumerable<T> self, string expression) {
var table = new DataTable();
table.Columns.Add("Value", typeof(T));
foreach (var item in self) {
var row = table.NewRow();
row["Value"] = item;
table.Rows.Add(item);
}
return table.Select(expression).Select(row => (T)row["Value"]);
}
}
But you have to follow its format to create your expression:
var filtered = strings.Select("NOT Value LIKE '*water*' OR (Value LIKE '*ice*' AND Value LIKE '*fruit juice*')");
Also note that in this case, since the string fruit juice already contains the string ice, the second condition is redundant. You'd have to find a way to express "words" and not "substrings".
In the end, you might be better off implementing the parsing logic yourself.

LINQ to SQL where clause verifying string contains list element

I'm using a view returning Domains according to an id. The Domains column can be 'Geography' or can be stuffed domains 'Geography,History'. (In any way, the data returned is a VARCHAR)
In my C# code, I have a list containing main domains:
private static List<string> _mainDomains = new List<string>()
{
"Geography",
"Mathematics",
"English"
};
I want to filter my LINQ query in order to return only data related to one or many main Domain:
expression = i => _mainDomains.Any(s => i.Domains.Contains(s));
var results = (from v_lq in context.my_view
select v_lq).Where(expression)
The problem is I can't use the Any key word, nor the Exists keyword, since they aren't available in SQL. I've seen many solutions using the Contains keyword, but it doesn't fit to my problem.
What should I do?

You can use contains:
where i.Domains.Any(s => _mainDomains.Contains<string>(s.xxx))
Notice that the generic arguments are required (even if Resharper might tell you they are not). They are required to select Enumerable.Contains, not List.Contains. The latter one is not translatable (which I consider an error in the L2S product design).
(I might not have gotten the query exactly right for your data model. Please just adapt it to your needs).

I figured it out. Since I can't use the Any keyword, I used this function:
public static bool ContainsAny(this string databaseString, List<string> stringList)
{
if (databaseString == null)
{
return false;
}
foreach (string s in stringList)
{
if (databaseString.Contains(s))
{
return true;
}
}
return false;
}
So then I can use this expression in my Where clause:
expression = i => i.Domains.ContainsAny(_mainDomains);
Update:
According to usr, the query would return all the values and execute the where clause server side. A better solution would be to use a different approach (and not use stuffed/comma-separated values)

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Linq query against a collection property with a collection parameter - c#

Related

Setting value of object property in List<T> using LINQ

Custom Method in LINQ Query

How do I make a streaming LINQ expression that delivers the filtered out items as well as the filtered items?

Filtering a string list with logic?

LINQ to SQL where clause verifying string contains list element

Categories

Resources