Most maintainable way to search in various columns - c#

A table has a Categories column that holds integers representing Property, Cars and Others.
Each category has its own columns of interest, shown below: keyword searches for property focus on PropertyType, State, County and NoOfBaths, while keyword searches for cars focus on make, model, year and so on.
All entries have data in all columns, but the data can mean slightly different things for different categories. For instance, the PropertyType column holds CarType data for cars and ItemType data for others, but the column is only of interest when searching property.
Property: PropertyType, Location State, Location County, No of Baths
Cars: Make, Model, Year, Location State
Others: Itemname, Make, Colour, Location State
The columns of interest were limited to four for performance reasons. A single search text box is used in the UI, just like Google. The algorithm that identifies the user's search category before the query is fired has an acceptable 98% accuracy rate. It also makes a good guess at which terms are a colour, state, county, etc.
The site started as a small ads site developed using C#, Entity Framework and SQL Server.
Since it was conceived as a small project, I thought I could pull it off with LINQ to Entities. Using if statements to eliminate null fields, there was a finite number of queries (2 to the power of 4, i.e. 16) for each category.
Eg. 1
and some listings for the queryHelper
where the null value is checked before the where clause is composed.
By the time I was done, I was not sure if a small project like that deserved this kind of logic even though it seemed more flexible and maintainable. The columns of interest could be changed without affecting the code.
The question is: is there an easier way to achieve this?
Secondly, why isn’t there an ‘Ignorable()’ function in linq such that a given portion of the where clause can be ignored if the value being compared is null or empty?
Eg. 1 modified
var results = context.Items.Where(m=>m.make.Ignorable() == make &&
m.model.Ignorable() == model && m.year.Ignorable() ==year &&
m.state.Ignorable() == state);
…
Or a symbol, say ‘¬’, which achieves the same thing, like so
Eg. 1 modified
var results = context.Items.Where(m=>m.make ¬== make && m.model ¬== model
&& m.year ¬==year && m.state ¬== state);
…

I think an easier and more maintainable way to do this is to override the Equals() method in the class concerned, so that changes to the properties do not require changes to the LINQ queries. Let me explain with an example class, Cars, defined like this:
public class Cars
{
// Properties
public string Make { get; set; }
public string Model { get; set; }
public int Year { get; set; }
public string Location_State { get; set; }
// overrided Equals method definition
public override bool Equals(object obj)
{
return this.Equals(obj as Cars);
}
public bool Equals(Cars other)
{
if (other == null)
return false;
return (this.Make == other.Make) &&
(this.Model == other.Model) &&
(this.Year == other.Year) &&
(this.Location_State == other.Location_State);
}
}
Now let objCars be the object you want to compare with the cars in context.Items; you can then write your LINQ query like this:
context.Items.Where(m=> m.Equals(objCars));
Note: You can put any number of conditions in the Equals method, so you avoid checking for null or empty values each time before executing the LINQ query, or even within it. When the class's properties change, you only need to alter the condition in the overridden method.

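// AsQueryable() types q as IQueryable<T>, so each Where result can be assigned back to it
var q = context.Items.AsQueryable();
if (!string.IsNullOrEmpty(make))
{
    q = q.Where(m => m.make == make);
}
if (!string.IsNullOrEmpty(model))
{
    q = q.Where(m => m.model == model);
}
//. . .
var results = q.ToList();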
The query can be built up over several statements before it is executed; nothing is run until ToList() is called.
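The conditional chaining in this answer can be folded into a small extension method, which comes close to the "Ignorable()" idea from the question. The following is only a sketch: WhereIf, QueryableExtensions, ItemSearch and the Item class are hypothetical names introduced here for illustration, not part of LINQ or Entity Framework.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableExtensions
{
    // Hypothetical helper: applies the predicate only when 'condition' is true,
    // otherwise returns the query unchanged.
    public static IQueryable<T> WhereIf<T>(
        this IQueryable<T> source,
        bool condition,
        Expression<Func<T, bool>> predicate)
    {
        return condition ? source.Where(predicate) : source;
    }
}

// Hypothetical entity standing in for the rows behind context.Items.
public class Item
{
    public string Make { get; set; }
    public string Model { get; set; }
    public int Year { get; set; }
    public string State { get; set; }
}

public static class ItemSearch
{
    // Null or empty criteria are skipped, so one method covers all 16 combinations.
    public static IQueryable<Item> Search(
        IQueryable<Item> items, string make, string model, int? year, string state)
    {
        return items
            .WhereIf(!string.IsNullOrEmpty(make), m => m.Make == make)
            .WhereIf(!string.IsNullOrEmpty(model), m => m.Model == model)
            .WhereIf(year.HasValue, m => m.Year == year.Value)
            .WhereIf(!string.IsNullOrEmpty(state), m => m.State == state);
    }
}
```

Because each skipped criterion leaves the query untouched, only the supplied criteria end up in the composed query.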

Related

Entity Framework Core - storing/querying multilingual records in the database efficiently

I'm building an application that must support more than one language.
Therefore, some of the records in my database need to have multiple versions for each language.
I will first explain how I currently achieve this: consider an entity called Region which represents a geographical location and simply has a name.
I would design my entities like this:
public class Region
{
public int Id { get; set; }
public List<RegionLanguage> Languages { get;set; }
}
public class RegionLanguage
{
public Region Region { get;set; } // Parent record this language applies to
public string CultureCode { get; set; } // Will store culture code such as en-US or fr-CA
// This column/property will be in the language specified by Culturecode
[StringLength(255)]
public string Name { get;set; }
}
From a database perspective, this works great because it's infinitely scalable to any number of records. However, due to the way Entity Framework Core works, it becomes less scalable.
Using the above structure, I can query a Region and generate a view model based on specific culture information:
var region = _context.Regions.Where(e => e.Id == 34)
.Include(e => e.Languages)
.FirstOrDefault();
var viewModel = new RegionViewModel
{
Name = region.Languages.FirstOrDefault(e => e.CultureCode == "en-US")?.Name // en-US would be dynamic based on the user's current language preference
};
You can see this becomes inefficient since I have to include ALL language records for the entity I'm fetching, when I actually only need one and then search for the correct language in memory. Of course this becomes even worse when I need to fetch a list of Regions which then has to return a large amount of unnecessary data.
Of course, this is possible using SQL directly simply by adding an extra clause on the join statement:
select *
from Regions
left join RegionLanguage on (RegionLanguage.Region = Regions.Id and RegionLanguage.CultureCode = 'en-US')
However, to my understanding, this is not possible to do natively from Entity Framework Core without using a RawQuery (EF: Include with where clause)
So that begs the question: is there a better way to achieve multilingual records in the database using EF Core? Or should I just continue with my approach and hope that EF Core implements Include filtering by the time my application actually needs it (I'll admit I might be optimizing slightly prematurely, but I'm genuinely curious if there is a better way to achieve this).
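You can use a projection, so that only the name in the requested language is fetched:
var viewModel = await _context.Regions
    .Where(r => r.Id == 34)
    .Select(r => new RegionViewModel
    {
        Name = r.Languages
            .Where(l => l.CultureCode == "en-US")
            .Select(l => l.Name)
            .FirstOrDefault()
    })
    .FirstOrDefaultAsync();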
If regions and languages are not changing frequently you can use caching.
You could use a global query filter:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<RegionLanguage>(builder =>
{
builder.HasQueryFilter(rl => rl.CultureCode == "en-US");
});
}

How to dynamically build and store complex linq queries against a list of hierarchical objects?

I have a list of objects in a hierarchical structure. I want to build complex LINQ queries against that object list based on "conditions" a client company sets and are stored in the database. So I need to build these at run time, but because they will be run repeatedly whenever the client's users update or refresh their data I would like to store the LINQ queries in objects rather than rebuild them each time.
I have looked at ScottGu's Blog about Dynamic LINQ.
Also this article about using expression trees.
Neither of these appear to provide an adequate solution, but I may not be understanding them adequately. I'm afraid that I'm trying to use LINQ when I should consider other options.
My object hierarchy:
WorkOrder[]
Field[]
Task[]
Field[]
Here is an example of a LINQ query that I would like to store and execute. I can reasonably build this format based on the database records that define the conditions.
var query =
from wo in WorkOrders
from woF in wo.Fields
from task in wo.Tasks
from taskF in task.Fields
from taskF2 in task.Fields
where woF.Name == "System Status"
&& woF.Value.Contains("SETC")
&& taskF.Name == "Material"
&& taskF.Value == "Y"
&& taskF2.Name == "Planner"
&& taskF2.Value == "GR5259"
select new
{
wo_id = wo.ID,
task_id = task.ID
};
A few considerations.
Depending on the complexity of the user defined conditions I may or may not need to pull from the different object lists: the "froms" are dynamic.
Note that in this example I pulled twice from the task.fields[] so I aliased it two times.
The example LINQ structure allows me to have complex ANDs, ORs, parentheses, etc., which I don't believe is practical with dynamic chaining or expression trees.
In my code I envision:
//1) Retrieve business rules from DB. I can do this.
//2) Iterate through the business rules to build the linq queries.
foreach (BusinessRule br in BusinessRules) {
//Grab the criteria for the rule from the DB.
//Create a linq to object query based on the criteria just built.
//Add this query to a list for later use.
}
...Elsewhere in application.
//Iterate through and execute the linq queries in order to apply business rules to data cached in the application.
foreach (LinqQuery q in LinqQueries) {
//Execute the query
//Apply business rule to the results.
}
Thank you very much for your thoughts, effort and ideas.
You can technically achieve what you need using only LINQ, but the PredicateBuilder is a nice utility class:
public enum AndOr
{
And,
Or
}
public enum QueryableObjects
{
WorkOrderField,
TaskField
}
public class ClientCondition
{
public AndOr AndOr;
public QueryableObjects QueryableObject;
public string PropertyName;
public string PropertyValue;
}
public void PredicateBuilderExample()
{
var conditions = new List<ClientCondition> {
{
new ClientCondition { AndOr = AndOr.And,
QueryableObject = QueryableObjects.WorkOrderField,
PropertyName = "System Status",
PropertyValue = "SETC"
}
},
{
new ClientCondition{AndOr = AndOr.And,
QueryableObject = QueryableObjects.TaskField,
PropertyName = "Material",
PropertyValue = "Y"
}
},
{
new ClientCondition{AndOr = AndOr.Or,
QueryableObject = QueryableObjects.TaskField,
PropertyName = "Planner",
PropertyValue = "GR5259"
}
}
};
//Obviously this WorkOrder object is empty so it will always return empty lists when queried.
//Populate this yourself.
var WorkOrders = new List<WorkOrder>();
var wofPredicateBuilder = PredicateBuilder.True<WorkOrderField>();
var tfPredicateBuilder = PredicateBuilder.True<TaskField>();
foreach (var condition in conditions)
{
if (condition.AndOr == AndOr.And)
{
if (condition.QueryableObject == QueryableObjects.WorkOrderField)
{
wofPredicateBuilder = wofPredicateBuilder.And(
wof => wof.Name == condition.PropertyName &&
wof.Value.Contains(condition.PropertyValue));
}
}
if (condition.AndOr == AndOr.Or)
{
if (condition.QueryableObject == QueryableObjects.TaskField)
{
tfPredicateBuilder = tfPredicateBuilder.Or(
tf => tf.Name == condition.PropertyName &&
tf.Value.Contains(condition.PropertyValue));
}
}
//And so on for each condition type.
}
var query = from wo in WorkOrders
from woF in wo.Fields.AsQueryable().Where(wofPredicateBuilder)
from task in wo.Tasks
from taskF in task.Fields.AsQueryable().Where(tfPredicateBuilder)
select new
{
wo_id = wo.ID,
task_id = task.ID
};
}
Note that I use the enums to limit the possible conditions your clients can send you. To have a truly dynamic queryable engine, you will need to use Reflection to ensure the object names you receive are valid. That seems like a rather large scope, and at that point I would recommend researching a different approach, such as ElasticSearch.
Also note that the order of And and Ors matters significantly. Essentially you are allowing your customers to build SQL queries against your data, and that usually ends in tears. It's your job to limit them to the proper set of conditions they should be querying.
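For readers who have not seen how such a predicate combinator works, here is a minimal sketch of the idea, together with a way to compile a rule once and keep the delegate for repeated use against in-memory data. SimplePredicateBuilder, RuleCompiler and this Field class are hypothetical names for illustration; this is not the source of the PredicateBuilder utility mentioned above, and the Invoke-based composition is intended for LINQ to Objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical minimal combinator, written for illustration only.
public static class SimplePredicateBuilder
{
    public static Expression<Func<T, bool>> True<T>() { return x => true; }
    public static Expression<Func<T, bool>> False<T>() { return x => false; }

    public static Expression<Func<T, bool>> And<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        // Invoke 'right' with the parameter of 'left' so both share one input.
        var invoked = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, invoked), left.Parameters);
    }

    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        var invoked = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, invoked), left.Parameters);
    }
}

// Hypothetical stand-in for the Field objects in the question's hierarchy.
public class Field
{
    public string Name { get; set; }
    public string Value { get; set; }
}

public static class RuleCompiler
{
    // Builds "(Name == n1 && Value == v1) || (Name == n2 && Value == v2) ..."
    // and compiles it once; the returned delegate can be stored and reused.
    public static Func<Field, bool> CompileAnyOf(
        IEnumerable<(string Name, string Value)> pairs)
    {
        var predicate = SimplePredicateBuilder.False<Field>(); // neutral seed for OR
        foreach (var (name, value) in pairs)
        {
            predicate = predicate.Or(f => f.Name == name && f.Value == value);
        }
        return predicate.Compile();
    }
}
```

An OR chain is seeded with False and an AND chain with True; seeding an OR chain with True would make the combined predicate match everything.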
Based on the discussion with Guillaume, I would only suggest paying attention to the type of the resulting query when doing advanced dynamic query generation. If you change the shape of what is returned via Select, Aggregate or one of the other methods, the element type changes accordingly. If you are just filtering with Where, you can keep adding as many additional cases as you want; if you need OR behavior, something like PredicateBuilder helps. If you pull in more data via Join, Zip, etc., you are doing so either to filter, to add to the rows returned, or to change the shape of the data.
I've done a lot of this in the past and had most success focusing on specific helper methods that allow for common cases that I need and then leaning on linq expression trees and patterns such as the visitor pattern to allow custom expression built at runtime.

Improving performance of double for loops

I'm working on an algorithm that recommends restaurants to a client. These recommendations are based on a few filters, but mostly on comparing reviews people have left on restaurants. (I'll spare you the details.)
To calculate a Pearson correlation (a number that measures how well users match each other), I have to check where users have left a review on the same restaurant. To increase the number of matches, I've included a match on the price range of the subjects. I'll try to explain; here is my Restaurant class:
public class Restaurant
{
public Guid Id { get; set; }
public int PriceRange { get; set; }
}
This is a simplified version, but it's enough for my example. A pricerange can be an integer of 1-5 which determines how expensive the restaurant is.
Here's the for loop I'm using to check if they left reviews on the same restaurant or a review on a restaurant with the same pricerange.
//List<Review> user1Reviews is a list of all reviews from the first user
//List<Review> user2Reviews is a list of all reviews from the second user
Dictionary<Review, Review> shared_items = new Dictionary<Review, Review>();
foreach (var review1 in user1Reviews)
foreach (var review2 in user2Reviews)
if (review1.Restaurant.Id == review2.Restaurant.Id ||
review1.Restaurant.PriceRange == review2.Restaurant.PriceRange)
if (!shared_items.ContainsKey(review1))
shared_items.Add(review1, review2);
Now here's my actual problem. You can see I'm looping the second list for each review the first user has left. Is there a way to improve the performance of these loops? I have tried using a hashset and the .contains() function, but I need to include more criteria (I.e. the price range). I couldn't figure out how to include that in a hashset.
I hope it's not too confusing, and thanks in advance for any help!
Edit: After testing both LINQ and the for loops, I have concluded that the for loops are twice as fast as LINQ. Thanks for your help!
You could try replacing your inner loop by a Linq query using the criteria of the outer loop:
foreach (var review1 in user1Reviews)
{
var review2 = user2Reviews.FirstOrDefault(r2 => r2.Restaurant.Id == review1.Restaurant.Id ||
r2.Restaurant.PriceRange == review1.Restaurant.PriceRange);
if (review2 != null)
{
if (!shared_items.ContainsKey(review1))
shared_items.Add(review1, review2);
}
}
If there are multiple matches you should use Where and deal with the potential list of results.
I'm not sure it would be any quicker though as you still have to check all the user2 reviews against the user1 reviews.
However, if you write a custom comparer for your Restaurant class, you can use the Intersect overload that takes an IEqualityComparer to get the restaurants both users have reviewed:
var commonRestaurants = user1Reviews.Select(r => r.Restaurant)
    .Intersect(user2Reviews.Select(r => r.Restaurant), new RestaurantComparer());
Where RestaurantComparer looks something like this:
// Custom comparer for the Restaurant class
class RestaurantComparer : IEqualityComparer<Restaurant>
{
    // Restaurants are equal if their ids and price ranges are equal.
    public bool Equals(Restaurant x, Restaurant y)
    {
        //Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;
        //Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;
        //Check whether the properties are equal.
        return x.Id == y.Id && x.PriceRange == y.PriceRange;
    }
    // If Equals() returns true for a pair of objects
    // then GetHashCode() must return the same value for these objects.
    public int GetHashCode(Restaurant restaurant)
    {
        //Check whether the object is null
        if (Object.ReferenceEquals(restaurant, null)) return 0;
        //Get hash code for the Id field.
        int hashId = restaurant.Id.GetHashCode();
        //Get hash code for the PriceRange field.
        int hashPriceRange = restaurant.PriceRange.GetHashCode();
        //Calculate the hash code for the restaurant.
        return hashId ^ hashPriceRange;
    }
}
You basically need a fast way to locate a review by Id or PriceRange. Normally you would use a fast hash-based lookup structure like Dictionary<TKey, TValue> for a single key, or a composite key if the match operation were "and". Unfortunately yours is "or", so a single dictionary doesn't work.
You can, however, use two dictionaries, and since a dictionary lookup is O(1), the whole operation will still be O(N + M), including building them (rather than O(N * M) as with the inner loop / naïve LINQ).
Since the keys are not unique, instead of dictionaries you can use lookups, keeping the same efficiency:
var lookup1 = user2Reviews.ToLookup(r => r.Restaurant.Id);
var lookup2 = user2Reviews.ToLookup(r => r.Restaurant.PriceRange);
foreach (var review1 in user1Reviews)
{
    var review2 = lookup1[review1.Restaurant.Id].FirstOrDefault() ??
                  lookup2[review1.Restaurant.PriceRange].FirstOrDefault();
if (review2 != null)
{
// do something
}
}

Filtering LINQ Query against member object list

I have an application that I inherited from a coworker that tracks feedback cards. I also have a form that filters the cards displayed on a web page based upon a number of user-entered filters. All of the filters work fine, except the filter that is applied against feedback details (service was fine/bad, room was clean/dirty, etc.). These are stored in a list of a member class in my Card class.
Below is a set of snippets of each class.
public class Card {
    public long ID { get; set; }
    public List<Feedback> FeedbackValues { get; set; }
    ...
}
public class Feedback {
    public long ID { get; set; }
    ...
}
public class CardFilter {
    public ICollection<long> FeedbackDetails { get; set; }
    ...
}
...
public IQueryable<CardType> GetFeedbackQueryable<CardType>(CardFilter filter = null)
    where CardType : Card
{
    var data = Service.GetRepository<CardType>();
    var cardQuery = data.All;
    ...
    if (filter.FeedbackDetails != null && filter.FeedbackDetails.Count != 0)
    {
        cardQuery = cardQuery.Where(card => card.FeedbackValues
            .All(fbv => filter.FeedbackDetails.Contains(fbv.ID)));
    }
    return cardQuery;
}
...
When I try the filter:
cardQuery = cardQuery.Where(card => card.FeedbackValues
    .All(fbv => filter.FeedbackDetails.Contains(fbv.ID)));
It returns the 15 card instances without any feedback. If I use the filter:
cardQuery = cardQuery.Where(card => card.FeedbackValues
    .Any(fbv => filter.FeedbackDetails.Contains(fbv.ID)));
Nothing is returned, even though I can look through the data and see the appropriate cards.
I'm new to LINQ, so I know I'm missing something. Please point me in the right direction here.
EDIT:
To give a little more background on this application, I'll be a bit more verbose. The Card table/Model has the information about the card and the person submitting it. By that I mean name or anonymous, address, location being commented upon and a few other basic facts. The feedback items are listed in another table and displayed on the web form and the user can check either positive or negative for each. There are three possible answers for each feedback detail; 0 (positive), 1 (negative) or nothing (no answer).
The Card Model has all of the basic card information as well as a collection of feedback responses. My filter that is giving me trouble is against that collection of responses. Each card can have from 0 to 52 possible responses which may not apply to all situations, so I need to see all cards that are about a specific situation (cleanliness, etc.) whether they are positive or negative. That is the purpose of this filter.
You can't use All here: its predicate must hold for every element, so it is only true when every feedback value's ID is in the filter list.
In your Where call, which is the filter clause, you are not actually filtering anything.
And you are comparing FeedbackValues with an ID? Are they the same thing?
Can you post some more details about
Maybe try:
cardQuery = cardQuery.Where(card => filter.FeedbackDetails.Contains(card.Id /* or details */))
    .Select(se => se).ToList();
var ifExist = YourList.Any(lambda expression) checks whether YourList<T> contains an object which fulfils the lambda expression. It only returns true or false. If you want a list of objects, you should use var YourNewList = YourList.Where(lambda expression).ToList().
Try this. Although I'm not entirely sure about your filter obj.
cardQuery = cardQuery.Query().Select(card => card.FeedbackValues).Where(fbv => filter.FeedbackDetails.Contains(fbv.ID));
I was able to solve this. All of the answers that were posted helped me in the right direction. I wish I could have flagged them all as the answer.
I ended up reworking my Feedback model slightly to include another identity field from the database. It duplicated existing data (bad design, I know. It wasn't mine), but had a unique name. Using the new field, I was able to apply an Any filter. I guess I was confusing LINQ with multiple fields named ID. Once I used FeedbackID, it worked fine.
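The symptom described in the question, where the All filter returned only cards without any feedback, matches how All behaves on an empty sequence: it returns true when there are no elements to test. The sketch below shows this with in-memory data only. The Card and Feedback classes are simplified stand-ins, CardFilters is a hypothetical helper, and FeedbackID is the field name mentioned in the resolution above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the models in the question.
public class Feedback
{
    public long FeedbackID { get; set; }
}

public class Card
{
    public long ID { get; set; }
    public List<Feedback> FeedbackValues { get; set; } = new List<Feedback>();
}

public static class CardFilters
{
    // Cards with at least one feedback item whose id is in the selection.
    public static IEnumerable<Card> WithAnySelected(
        IEnumerable<Card> cards, ICollection<long> selectedIds)
    {
        return cards.Where(c => c.FeedbackValues.Any(f => selectedIds.Contains(f.FeedbackID)));
    }

    // Cards whose feedback items are ALL in the selection.
    // Note: a card with no feedback passes, because All is true for an empty list.
    public static IEnumerable<Card> WithAllSelected(
        IEnumerable<Card> cards, ICollection<long> selectedIds)
    {
        return cards.Where(c => c.FeedbackValues.All(f => selectedIds.Contains(f.FeedbackID)));
    }
}
```

For a filter meaning "show cards that mention this topic at all", Any is therefore the operator that matches the intent.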

Fast selection of elements from a set, based on a property

How can I store multiple values of a large set to be able to find them quickly with a lambda expression based on a property with non-unique values?
Sample case (not optimized for performance):
class Product
{
public string Title { get; set; }
public int Price { get; set; }
public string Description { get; set; }
}
IList<Product> products = this.LoadProducts();
var q1 = products.Where(c => c.Title == "Hello"); // 1 product.
var q2 = products.Where(c => c.Title == "Sample"); // 5 products.
var q3 = products.Where(c => string.IsNullOrEmpty(c.Title)); // 12 345 products.
If title was unique, it would be easy to optimize performance by using IDictionary or HashSet. But what about the case where the values are not unique?
The simplest solution is to use a dictionary of collections of Product. The easiest way to get one is ToLookup:
var products = this.LoadProducts().ToLookup(p => p.Title);
var example1 = products["Hello"]; // 1 product
var example2 = products["Sample"]; // 5 products
Your third example is a little harder, but you could use ApplyResultSelector() for that.
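One alternative for the third example, sketched below under the assumption that null and empty titles should be treated the same, is to normalize the key when the lookup is built, so that all such products land under the empty-string key. ProductIndex and ByTitle are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Title { get; set; }
    public int Price { get; set; }
    public string Description { get; set; }
}

public static class ProductIndex
{
    // Groups products by title once; null titles are folded into the "" key
    // so that a single key covers the string.IsNullOrEmpty case.
    public static ILookup<string, Product> ByTitle(IEnumerable<Product> products)
    {
        return products.ToLookup(p => p.Title ?? string.Empty);
    }
}
```

Usage would then look like var byTitle = ProductIndex.ByTitle(products); followed by byTitle["Sample"] or byTitle[""]. Indexing a lookup with a key that is not present returns an empty sequence rather than throwing, so no existence check is needed.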
What you need is the ability to run indexed queries in LINQ. (same as we do in SQL)
There is a library called i4o which apparently can solve your problem:
http://i4o.codeplex.com/
from their website:
i4o (index for objects) is the first class library that extends LINQ
to allow you to put indexes on your objects. Using i4o, the speed of
LINQ operations are often over one thousand times faster than without
i4o.
i4o works by allowing the developer to specify an
IndexSpecification for any class, and then using the
IndexableCollection to implement a collection of that class that
will use the index specification, rather than sequential search, when
doing LINQ operations that can benefit from indexing.
also the following provides an example of how to use i4o:
http://www.hookedonlinq.com/i4o.ashx
In short, you need to:
Add the [Indexable()] attribute to your "Title" property.
Use IndexableCollection<Product> as your data source.
From this point on, any LINQ query that uses an indexable field will use the index rather than doing a sequential search, resulting in order-of-magnitude performance increases for queries using the index.
