I have code like this where I want to query to MongoDB using Linq.
I get an AsQueryable from MongoDB collection.
public IEnumerable<IVideo> GetVideos()
{
var collection = database.GetCollection<IVideo>("Videos");
return collection.AsQueryable();
}
I call it like so,
var finalList = Filter2(Filter1(GetVideos())).Skip(2).Take(30);
foreach(var v in finalList)
{
....
}
Functions with the queries.
public IEnumerable<IVideo> Filter1(IEnumerable<IVideo> list)
{
return list.Where(q=>q.Categorized)
}
public IEnumerable<IVideo> Filter2(IEnumerable<IVideo> list)
{
var query = from d in list
where d.File == "string1" || d.File == "string2"
select d;
return query;
}
My code works fine. I have my code hosted in an IIS and have around 50,000 records and the queries are a bit complex than the example. My worker process spikes to 17% and takes a few seconds to execute when the foreach is called. This is a ridiculous high for such a low date amount.
I have a couple of questions.
Is the query being executed by .net or MongoDB? If it is executed by MongoDB why is my worker process taking such a hit?
What are the steps I can take to improve the execution time to render the query and reduce the server load.
Thanks
You're downloading all entries client-side by accident
public IEnumerable<IVideo> Filter1(IEnumerable<IVideo> list)
{
var list = list.Where(q=>q.Categorized)
}
IEnumerable causes the queryable to execute and return results. Change the filter methods to accept and return IQueryable.
EDIT:
The code you posted:
public IEnumerable<IVideo> Filter1(IEnumerable<IVideo> list)
{
var list = list.Where(q=>q.Categorized)
}
Does not compile.
Your code should look like this:
public IQueryable<IVideo> Filter1(IQueryable<IVideo> qVideos)
{
return qVideos.Where(q => q.Categorized);
}
public IQueryable<IVideo> Filter2(IQueryable<IVideo> qVideos)
{
return qVideos
.Where(e => e.File == "string1" || e.File == "string2");
}
public DoSomething()
{
// This is the query, in debug mode you can inspect the actual query generated under a property called 'DebugView'
var qVideos = Filter2(Filter1(GetVideos()))
.Skip(1)
.Take(30);
// This runs the actual query and loads the results client side.
var videos = qVideos.ToList();
// now iterated
foreach (var video in videos)
{
}
}
Using Entity Framework 6.0.2 and .NET 4.5.1 in Visual Studio 2013 Update 1 with a DbContext connected to SQL Server:
I have a long chain of filters I am applying to a query based on the caller's desired results. Everything was fine until I needed to add paging. Here's a glimpse:
IQueryable<ProviderWithDistance> results = (from pl in db.ProviderLocations
let distance = pl.Location.Geocode.Distance(_geo)
where pl.Location.Geocode.IsEmpty == false
where distance <= radius * 1609.344
orderby distance
select new ProviderWithDistance() { Provider = pl.Provider, Distance = Math.Round((double)(distance / 1609.344), 1) }).Distinct();
if (gender != null)
{
results = results.Where(p => p.Provider.Gender == (gender.ToUpper() == "M" ? Gender.Male : Gender.Female));
}
if (type != null)
{
int providerType;
if (int.TryParse(type, out providerType))
results = results.Where(p => p.Provider.ProviderType.Id == providerType);
}
if (newpatients != null && newpatients == true)
{
results = results.Where(p => p.Provider.ProviderLocations.Any(pl => pl.AcceptingNewPatients == null || pl.AcceptingNewPatients == AcceptingNewPatients.Yes));
}
if (string.IsNullOrEmpty(specialties) == false)
{
List<int> _ids = specialties.Split(',').Select(int.Parse).ToList();
results = results.Where(p => p.Provider.Specialties.Any(x => _ids.Contains(x.Id)));
}
if (string.IsNullOrEmpty(degrees) == false)
{
List<int> _ids = specialties.Split(',').Select(int.Parse).ToList();
results = results.Where(p => p.Provider.Degrees.Any(x => _ids.Contains(x.Id)));
}
if (string.IsNullOrEmpty(languages) == false)
{
List<int> _ids = specialties.Split(',').Select(int.Parse).ToList();
results = results.Where(p => p.Provider.Languages.Any(x => _ids.Contains(x.Id)));
}
if (string.IsNullOrEmpty(keyword) == false)
{
results = results.Where(p =>
(p.Provider.FirstName + " " + p.Provider.LastName).Contains(keyword));
}
Here's the paging I added to the bottom (skip and max are just int parameters):
if (skip > 0)
results = results.Skip(skip);
results = results.Take(max);
return new ProviderWithDistanceDto { Locations = results.AsEnumerable() };
Now for my question(s):
As you can see, I am doing an orderby in the initial LINQ query, so why is it complaining that I need to do an OrderBy before doing a Skip (I thought I was?)...
I was under the assumption that it won't be turned into a SQL query and executed until I enumerate the results, which is why I wait until the last line to return the results AsEnumerable(). Is that the correct approach?
If I have to enumerate the results before doing Skip and Take how will that affect performance? Obviously I'd like to have SQL Server do the heavy lifting and return only the requested results. Or does it not matter (or have I got it wrong)?
I am doing an orderby in the initial LINQ query, so why is it complaining that I need to do an OrderBy before doing a Skip (I thought I was?)
Your result starts off correctly as an ordered queryable: the type returned from the query on the first line is IOrderedQueryable<ProviderWithDistance>, because you have an order by clause. However, adding a Where on top of it makes your query an ordinary IQueryable<ProviderWithDistance> again, causing the problem that you see down the road. Logically, that's the same thing, but the structure of the query definition in memory implies otherwise.
To fix this, remove the order by in the original query, and add it right before you are ready for the paging, like this:
...
if (string.IsNullOrEmpty(languages) == false)
...
if (string.IsNullOrEmpty(keyword) == false)
...
result = result.OrderBy(r => r.distance);
As long as ordering is the last operation, this should fix the runtime problem.
I was under the assumption that it won't be turned into a SQL query and executed until I enumerate the results, which is why I wait until the last line to return the results AsEnumerable(). Is that the correct approach?
Yes, that is the correct approach. You want your RDBMS to do as much work as possible, because doing paging in memory defeats the purpose of paging in the first place.
If I have to enumerate the results before doing Skip and Take how will that affect performance?
It would kill the performance, because your system would need to move around a lot more data than it did before you added paging.
Let's say I have this method to seach my DB for products that fit a certain keyword:
public List<Product> GetByKeyword(string keyword)
{
using(var db = new DataEntities())
{
var query = db.Products.Where(x => x.Description.Contains(keyword);
return query.ToList();
}
}
This works fine, but somewhere else in my project, I want to get the active products only, still by keyword. I would like to do something like :
...
var result = ProductStore.GetByKeyword("Apple", x => x.isActive == 1);
Therefore, I created this method:
public List<Product> GetByKeyword(string keyword, Func<Product, bool> predicate = null)
{
using(var db = new DataEntities())
{
var query = db.Products.Where(x => x.Description.Contains(keyword);
if(predicate != null)
query = query.Where(x => predicate(x));
return query.ToList();
}
}
While this compiles well, the ToList() call generates a NotSupportedException because LINQ does not support the Invoke method.
Of course, I could to it with another method
i.e. GetActiveByKeyword(string keyword) but then I would have to do one for every possible variation, including the ones I didn't think of...
How do I get this to work? Thanks!
Isn't it just this:
if(predicate != null)
query = query.Where(predicate);
it's just as AD.Net says the reason why it works with Expression before is because if you say that the compiler knows it would be a lambda expression
I have this function:
public static IQueryable<Article> WhereArticleIsLive(this IQueryable<Article> q)
{
return q.Where(x =>
x != null
&& DateTime.UtcNow >= x.PublishTime
&& x.IsPublished
&& !x.IsDeleted);
}
And it works just fine in this query:
from a in Articles.WhereArticleIsLive()
where a.Id == 5
select new { a.Title }
But it doesn't work in this only slightly more complex query:
from s in Series
from a in Articles.WhereArticleIsLive()
where s.Id == a.SeriesId
select new { s, a }
I get this error message:
NotSupportedException: LINQ to Entities does not recognize the method 'System.Linq.IQueryable1[TheFraser.Data.Articles.Article] WhereArticleIsLive(System.Linq.IQueryable1[TheFraser.Data.Articles.Article])' method, and this method cannot be translated into a store expression.
Any idea why? Is there another way to consolidate query parameters like this?
Thanks in advance.
EDIT: corrections by Craig.
I'm leaving this here, because I think it's a valuable tool: Use linqkit! But not for solving this question though :-)
Instead of returning IQueryable, use Expression to factor out predicates. E.g. you could define the following static method on Article:
public static Expression<Func<Article,bool>> IsLive()
{
return x =>
x != null
&& DateTime.UtcNow >= x.PublishTime
&& x.IsPublished
&& !x.IsDeleted
}
Then, ensure to store a reference to this expression when building your query, something along the lines of (not tested):
var isLive = Article.IsLive();
from s in Series
from a in Articles.Where(isLive)
where s.Id == a.SeriesId
select new { s, a }
I've got a lot of ugly code that looks like this:
if (!string.IsNullOrEmpty(ddlFileName.SelectedItem.Text))
results = results.Where(x => x.FileName.Contains(ddlFileName.SelectedValue));
if (chkFileName.Checked)
results = results.Where(x => x.FileName == null);
if (!string.IsNullOrEmpty(ddlIPAddress.SelectedItem.Text))
results = results.Where(x => x.IpAddress.Contains(ddlIPAddress.SelectedValue));
if (chkIPAddress.Checked)
results = results.Where(x => x.IpAddress == null);
...etc.
results is an IQueryable<MyObject>.
The idea is that for each of these innumerable dropdowns and checkboxes, if the dropdown has something selected, the user wants to match that item. If the checkbox is checked, the user wants specifically those records where that field is null or an empty string. (The UI doesn't let both be selected at the same time.) This all adds to the LINQ Expression which gets executed at the end, after we've added all the conditions.
It seems like there ought to be some way to pull out an Expression<Func<MyObject, bool>> or two so that I can put the repeated parts in a method and just pass in what changes. I've done this in other places, but this set of code has me stymied. (Also, I'd like to avoid "Dynamic LINQ", because I want to keep things type-safe if possible.) Any ideas?
I'd convert it into a single Linq statement:
var results =
//get your inital results
from x in GetInitialResults()
//either we don't need to check, or the check passes
where string.IsNullOrEmpty(ddlFileName.SelectedItem.Text) ||
x.FileName.Contains(ddlFileName.SelectedValue)
where !chkFileName.Checked ||
string.IsNullOrEmpty(x.FileName)
where string.IsNullOrEmpty(ddlIPAddress.SelectedItem.Text) ||
x.FileName.Contains(ddlIPAddress.SelectedValue)
where !chkIPAddress.Checked ||
string.IsNullOrEmpty(x. IpAddress)
select x;
It's no shorter, but I find this logic clearer.
In that case:
//list of predicate functions to check
var conditions = new List<Predicate<MyClass>>
{
x => string.IsNullOrEmpty(ddlFileName.SelectedItem.Text) ||
x.FileName.Contains(ddlFileName.SelectedValue),
x => !chkFileName.Checked ||
string.IsNullOrEmpty(x.FileName),
x => string.IsNullOrEmpty(ddlIPAddress.SelectedItem.Text) ||
x.IpAddress.Contains(ddlIPAddress.SelectedValue),
x => !chkIPAddress.Checked ||
string.IsNullOrEmpty(x.IpAddress)
}
//now get results
var results =
from x in GetInitialResults()
//all the condition functions need checking against x
where conditions.All( cond => cond(x) )
select x;
I've just explicitly declared the predicate list, but these could be generated, something like:
ListBoxControl lbc;
CheckBoxControl cbc;
foreach( Control c in this.Controls)
if( (lbc = c as ListBoxControl ) != null )
conditions.Add( ... );
else if ( (cbc = c as CheckBoxControl ) != null )
conditions.Add( ... );
You would need some way to check the property of MyClass that you needed to check, and for that you'd have to use reflection.
Have you seen the LINQKit? The AsExpandable sounds like what you're after (though you may want to read the post Calling functions in LINQ queries at TomasP.NET for more depth).
Don't use LINQ if it's impacting readability. Factor out the individual tests into boolean methods which can be used as your where expression.
IQueryable<MyObject> results = ...;
results = results
.Where(TestFileNameText)
.Where(TestFileNameChecked)
.Where(TestIPAddressText)
.Where(TestIPAddressChecked);
So the the individual tests are simple methods on the class. They're even individually unit testable.
bool TestFileNameText(MyObject x)
{
return string.IsNullOrEmpty(ddlFileName.SelectedItem.Text) ||
x.FileName.Contains(ddlFileName.SelectedValue);
}
bool TestIPAddressChecked(MyObject x)
{
return !chkIPAddress.Checked ||
x.IpAddress == null;
}
results = results.Where(x =>
(string.IsNullOrEmpty(ddlFileName.SelectedItem.Text) || x.FileName.Contains(ddlFileName.SelectedValue))
&& (!chkFileName.Checked || string.IsNullOrEmpty(x.FileName))
&& ...);
Neither of these answers so far is quite what I'm looking for. To give an example of what I'm aiming at (I don't regard this as a complete answer either), I took the above code and created a couple of extension methods:
static public IQueryable<Activity> AddCondition(
this IQueryable<Activity> results,
DropDownList ddl,
Expression<Func<Activity, bool>> containsCondition)
{
if (!string.IsNullOrEmpty(ddl.SelectedItem.Text))
results = results.Where(containsCondition);
return results;
}
static public IQueryable<Activity> AddCondition(
this IQueryable<Activity> results,
CheckBox chk,
Expression<Func<Activity, bool>> emptyCondition)
{
if (chk.Checked)
results = results.Where(emptyCondition);
return results;
}
This allowed me to refactor the code above into this:
results = results.AddCondition(ddlFileName, x => x.FileName.Contains(ddlFileName.SelectedValue));
results = results.AddCondition(chkFileName, x => x.FileName == null || x.FileName.Equals(string.Empty));
results = results.AddCondition(ddlIPAddress, x => x.IpAddress.Contains(ddlIPAddress.SelectedValue));
results = results.AddCondition(chkIPAddress, x => x.IpAddress == null || x.IpAddress.Equals(string.Empty));
This isn't quite as ugly, but it's still longer than I'd prefer. The pairs of lambda expressions in each set are obviously very similar, but I can't figure out a way to condense them further...at least not without resorting to dynamic LINQ, which makes me sacrifice type safety.
Any other ideas?
#Kyralessa,
You can create extension method AddCondition for predicates that accepts parameter of type Control plus lambda expression and returns combined expression. Then you can combine conditions using fluent interface and reuse your predicates. To see example of how it can be implemented see my answer on this question:
How do I compose existing Linq Expressions
I'd be wary of the solutions of the form:
// from Keith
from x in GetInitialResults()
//either we don't need to check, or the check passes
where string.IsNullOrEmpty(ddlFileName.SelectedItem.Text) ||
x.FileName.Contains(ddlFileName.SelectedValue)
My reasoning is variable capture. If you're immediately execute just the once you probably won't notice a difference. However, in linq, evaluation isn't immediate but happens each time iterated occurs. Delegates can capture variables and use them outside the scope you intended.
It feels like you're querying too close to the UI. Querying is a layer down, and linq isn't the way for the UI to communicate down.
You may be better off doing the following. Decouple the searching logic from the presentation - it's more flexible and reusable - fundamentals of OO.
// my search parameters encapsulate all valid ways of searching.
public class MySearchParameter
{
public string FileName { get; private set; }
public bool FindNullFileNames { get; private set; }
public void ConditionallySearchFileName(bool getNullFileNames, string fileName)
{
FindNullFileNames = getNullFileNames;
FileName = null;
// enforce either/or and disallow empty string
if(!getNullFileNames && !string.IsNullOrEmpty(fileName) )
{
FileName = fileName;
}
}
// ...
}
// search method in a business logic layer.
public IQueryable<MyClass> Search(MySearchParameter searchParameter)
{
IQueryable<MyClass> result = ...; // something to get the initial list.
// search on Filename.
if (searchParameter.FindNullFileNames)
{
result = result.Where(o => o.FileName == null);
}
else if( searchParameter.FileName != null )
{ // intermixing a different style, just to show an alternative.
result = from o in result
where o.FileName.Contains(searchParameter.FileName)
select o;
}
// search on other stuff...
return result;
}
// code in the UI ...
MySearchParameter searchParameter = new MySearchParameter();
searchParameter.ConditionallySearchFileName(chkFileNames.Checked, drpFileNames.SelectedItem.Text);
searchParameter.ConditionallySearchIPAddress(chkIPAddress.Checked, drpIPAddress.SelectedItem.Text);
IQueryable<MyClass> result = Search(searchParameter);
// inform control to display results.
searchResults.Display( result );
Yes it's more typing, but you read code around 10x more than you write it. Your UI is clearer, the search parameters class takes care of itself and ensures mutually exclusive options don't collide, and the search code is abstracted away from any UI and doesn't even care if you use Linq at all.
Since you are wanting to repeatedly reduce the original results query with innumerable filters, you can use Aggregate(), (which corresponds to reduce() in functional languages).
The filters are of predictable form, consisting of two values for every member of MyObject - according to the information I gleaned from your post. If every member to be compared is a string, which may be null, then I recommend using an extension method, which allows for null references to be associated to an extension method of its intended type.
public static class MyObjectExtensions
{
public static bool IsMatchFor(this string property, string ddlText, bool chkValue)
{
if(ddlText!=null && ddlText!="")
{
return property!=null && property.Contains(ddlText);
}
else if(chkValue==true)
{
return property==null || property=="";
}
// no filtering selected
return true;
}
}
We now need to arrange the property filters in a collection, to allow for iterating over many. They are represented as Expressions for compatibility with IQueryable.
var filters = new List<Expression<Func<MyObject,bool>>>
{
x=>x.Filename.IsMatchFor(ddlFileName.SelectedItem.Text,chkFileName.Checked),
x=>x.IPAddress.IsMatchFor(ddlIPAddress.SelectedItem.Text,chkIPAddress.Checked),
x=>x.Other.IsMatchFor(ddlOther.SelectedItem.Text,chkOther.Checked),
// ... innumerable associations
};
Now we aggregate the innumerable filters onto the initial results query:
var filteredResults = filters.Aggregate(results, (r,f) => r.Where(f));
I ran this in a console app with simulated test values, and it worked as expected. I think this at least demonstrates the principle.
One thing you might consider is simplifying your UI by eliminating the checkboxes and using an "<empty>" or "<null>" item in your drop down list instead. This would reduce the number of controls taking up space on your window, remove the need for complex "enable X only if Y is not checked" logic, and would enable a nice one-control-per-query-field.
Moving on to your result query logic, I would start by creating a simple object to represent a filter on your domain object:
interface IDomainObjectFilter {
bool ShouldInclude( DomainObject o, string target );
}
You can associate an appropriate instance of the filter with each of your UI controls, and then retrieve that when the user initiates a query:
sealed class FileNameFilter : IDomainObjectFilter {
public bool ShouldInclude( DomainObject o, string target ) {
return string.IsNullOrEmpty( target )
|| o.FileName.Contains( target );
}
}
...
ddlFileName.Tag = new FileNameFilter( );
You can then generalize your result filtering by simply enumerating your controls and executing the associated filter (thanks to hurst for the Aggregate idea):
var finalResults = ddlControls.Aggregate( initialResults, ( c, r ) => {
var filter = c.Tag as IDomainObjectFilter;
var target = c.SelectedValue;
return r.Where( o => filter.ShouldInclude( o, target ) );
} );
Since your queries are so regular, you might be able to simplify the implementation even further by using a single filter class taking a member selector:
sealed class DomainObjectFilter {
private readonly Func<DomainObject,string> memberSelector_;
public DomainObjectFilter( Func<DomainObject,string> memberSelector ) {
this.memberSelector_ = memberSelector;
}
public bool ShouldInclude( DomainObject o, string target ) {
string member = this.memberSelector_( o );
return string.IsNullOrEmpty( target )
|| member.Contains( target );
}
}
...
ddlFileName.Tag = new DomainObjectFilter( o => o.FileName );