Is there a wildcard for the .Take method in LINQ? - C#

I am trying to create a method using LINQ that takes X amount of products from the DB, so I am using the .Take method for that.
The thing is, in some situations I need to take all the products. Is there a wildcard I can give to .Take, or some other method, that would bring me all the products in the DB?
Also, what happens if I do a .Take(50) and there are only 10 products in the DB?
My code looks something like:
var ratingsToPick = context.RatingAndProducts
    .ToList()
    .OrderByDescending(c => c.WeightedRating)
    .Take(pAmmount);

You could split it into a separate call based on your flag:
IQueryable<RatingAndProducts> ratingsToPick = context.RatingAndProducts
    .OrderByDescending(c => c.WeightedRating);

if (!takeAll)
    ratingsToPick = ratingsToPick.Take(pAmmount);

var results = ratingsToPick.ToList();
If you don't include the Take, then it will simply take everything.
Note that you may need to explicitly type your original query as IQueryable<RatingAndProducts>, because OrderByDescending returns an IOrderedQueryable, which isn't reassignable from the result of the Take call. (Or you can simply work around this as appropriate based on your actual code.)
Also, as @Rene147 pointed out, you should move your ToList to the end; otherwise it will retrieve all items from the database every time, and the OrderByDescending and Take will then operate on a List<> of objects in memory instead of being executed as a database query, which I assume is unintended.
Regarding your second question: if you perform a Take(50) and only 10 entries are available, the behavior might depend on your database provider, but in my experience providers are smart enough not to throw exceptions and will simply give you however many items are available. (I would suggest you run a quick test to make sure for your specific case.)
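At least for LINQ to Objects, Take simply stops early when the source runs out, so asking for more than exists is harmless; here is a minimal in-memory sketch (not a database query) you can run to convince yourself:
var items = Enumerable.Range(1, 10).ToList(); // only 10 items exist
var taken = items.Take(50).ToList();          // no exception is thrown
Console.WriteLine(taken.Count);               // prints 10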

Your current solution always loads all products from the database, because you are calling ToList() first. After loading all products from the database, you take the first N in memory. In order to conditionally load only the first N products, you need to build the query before executing it:
int? countToTake = 50;

IQueryable<RatingAndProducts> ratingsToPick = context.RatingAndProducts
    .OrderByDescending(c => c.WeightedRating);

// conditionally take only the first N results
if (countToTake.HasValue)
    ratingsToPick = ratingsToPick.Take(countToTake.Value);

var result = ratingsToPick.ToList(); // execute query
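If this pattern comes up in more than one place, you could wrap it in a small helper; here is a sketch where GetTopRatings and MyContext are assumed names, and passing null for countToTake acts as the "wildcard" that returns everything:
public static List<RatingAndProducts> GetTopRatings(MyContext context, int? countToTake)
{
    IQueryable<RatingAndProducts> query = context.RatingAndProducts
        .OrderByDescending(c => c.WeightedRating);

    // null means "take all": no Take is appended, so the query returns every row
    if (countToTake.HasValue)
        query = query.Take(countToTake.Value);

    return query.ToList(); // the query executes here, on the database
}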

Related

Query ODataV4 connected service with LINQ - Get last record from table

I'm trying to query my OData web service from a C# application.
When I do the following:
var SecurityDefs = from SD in nav.ICESecurityDefinition.Take(1)
                   orderby SD.Entry_No descending
                   select SD;
I get an exception because .top() and .orderby are not supposed to be used together.
I need to get the last record in the dataset and only the last.
The purpose is to get the last used entry number in a ledger and then continue creating new entries, incrementing the found entry number.
I can't seem to find anything online that explains how to do this.
It's very important that the service only returns the last record from the feed, since speed is paramount in this solution.
"I get an exception because .top() and .orderby are not supposed to be used together."
Where did you read that? In general, .top() or .Take() should ONLY be used in conjunction WITH .orderby(); otherwise the record being retrieved is not guaranteed to be repeatable or predictable.
Probably the compounding issue here is mixing query and fluent expression syntax, which is valid, but you have to understand the order of precedence.
Your syntax takes 1 record and then applies a sort order to it... you might find it easier to start with a query like this:
// Build your query.
var SecurityDefsQuery = from SD in nav.ICESecurityDefinition
                        orderby SD.Entry_No descending
                        select SD;

// Take the first item from the list, if it exists; this will be a single record.
var SecurityDefs = SecurityDefsQuery.FirstOrDefault();

// Take an array of only the first record, if it exists.
var SecurityDefsDeferred = SecurityDefsQuery.Take(1);
This can be executed on a single line using brackets, but you can see how the query is the same in both cases. SecurityDefs here is a single ICESecurityDefinition-typed record, whereas SecurityDefsDeferred is an IQueryable<ICESecurityDefinition> that contains only a single record.
If you only need the record itself, you can use this one-liner:
var SecurityDefs = (from SD in nav.ICESecurityDefinition
                    orderby SD.Entry_No descending
                    select SD).FirstOrDefault();
You can execute the same query using fluent notation as well:
var SecurityDefs = nav.ICESecurityDefinition
                      .OrderByDescending(sd => sd.Entry_No)
                      .FirstOrDefault();
In both cases, .Take(1) or .top() is being implemented through .FirstOrDefault(). You have indicated that speed is important, so use .First() or .FirstOrDefault() instead of .Single() or .SingleOrDefault(): the Single variants actually request .Take(2) and throw an exception if more than one result comes back (and .Single() also throws when there are no results at all).
The OrDefault variants of both of these queries will not impact the performance of the query itself and should have a negligible effect on your code; use whichever is appropriate for the logic that consumes the returned record and for whether you need to handle the case where no record exists yet.
If the record being returned has many columns and you are only interested in the Entry_No column value, then perhaps you should simply query for that specific value itself:
Query expression:
var lastEntryNo = (from SD in nav.ICESecurityDefinition
                   orderby SD.Entry_No descending
                   select SD.Entry_No).FirstOrDefault();
Fluent expression:
var lastEntryNo = nav.ICESecurityDefinition
                     .OrderByDescending(sd => sd.Entry_No)
                     .Select(sd => sd.Entry_No)
                     .FirstOrDefault();
If speed is paramount, then look at providing a specific custom endpoint on the service to serve the record, or do not process the Entry_No in the client at all: make that the job of the code that receives data from the client, and compute it at the time the entries are inserted.
Making the query perform faster is not the silver bullet you might be looking for, though. Even if this is highly optimised, your current pattern means that X number of clients could all call the service to get the current value of Entry_No, meaning all of them would start incrementing from the same value.
If you MUST increment the Entry_No from the client, then you should look at putting a custom endpoint on the service that simply returns the next Entry_No to use. This should be optimistic, meaning that you don't care whether the Entry_No actually gets used in the end, but you can implement the endpoint such that every call increments the field in the database and returns the next value.
It's getting a bit beyond the scope of your initial post, but SQL Server now has support for sequences, which formalise this type of logic from a database and schema point of view. Using a sequence simplifies how we manage these kinds of incrementations from the client, because we no longer rely on data updates being committed to the table before the client can compute the next record (which is what your TOP / ORDER BY DESC solution is trying to do).
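As a rough sketch of that idea (dbo.EntryNoSequence and connectionString are assumed names; the sequence would be created once with CREATE SEQUENCE dbo.EntryNoSequence START WITH 1 INCREMENT BY 1), the endpoint could hand out numbers like this:
// Assumes: using System.Data.SqlClient;
int nextEntryNo;
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("SELECT NEXT VALUE FOR dbo.EntryNoSequence", connection))
{
    connection.Open();
    nextEntryNo = Convert.ToInt32(command.ExecuteScalar());
}
Every call advances the sequence, so no two clients can receive the same value.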

Least expensive operation to check for new records in table with EF

I have PostgreSQL running together with EF7.
My table structure:
Id (bigint)
Content (text)
CreatedAt (timestamp without time zone NOT NULL)
I have infinite scroll in my frontend app and query results with .Skip(n) & .Take(m).
E.g.:
.Skip(0), .Take(10);
.Skip(10), .Take(10);
.Skip(20), .Take(10);
<...>
Now, while scrolling, if there are newer records, I need to know how many, and add that number to the .Skip(n) value. I don't need to display them; I just need to account for them while skipping.
Currently I'm checking for them like this, but it seems this will become quite expensive once the table exceeds 50k-100k records:
_myRepository.GetAll().Where(x => x.CreatedAt > newestActivityDate).Count();
GetAll():
public IQueryable<Activity> GetAll()
{
    return _context.Activities.OrderByDescending(x => x.CreatedAt);
}
What would be the best (most performant) way to check whether there are new records, and how many? Checking by date first, and only doing the count if there are any new records?
EDIT:
Added GetAll() description for more clarity.
I don't know what your .GetAll() method does, so that really is the crux of your problem.
LINQ uses what is called deferred execution. What that means is that the query will not be executed until the results are acted upon. So in the case of building up a SQL query, nothing is sent to the database until you finish with it.
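A tiny in-memory illustration of deferred execution (LINQ to Objects, made-up data):
var numbers = new List<int> { 1, 2, 3 };
var query = numbers.Where(n => n > 1); // nothing is evaluated yet
numbers.Add(4);
Console.WriteLine(query.Count());      // prints 3: the query runs here and sees the item added later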
So if your .GetAll() method includes a .ToList() in there, it will pull everything from the database immediately, before the rest of your filtering is applied. In that case, yes, it will be very expensive.
However, you can get around that by just asking for the count. You already have most of that set up.
If you add a new method in your repository like this:
public int GetNewRecordsCount(DateTime newestActivityDate)
{
    return _context.Activities.Count(x => x.CreatedAt > newestActivityDate);
}
This will produce a different SQL query, similar to this:
SELECT COUNT(*)
FROM Activities
WHERE CreatedAt > @newestActivityDate;
That is a very quick operation in which the database does all of the filtering for you. The only thing returned to your code is a single-row, single-column result containing the total count.
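A hypothetical usage on the paging side (pageIndex and pageSize are assumed variables), folding the new-record count into the skip so the scroll position stays stable:
int newCount = _myRepository.GetNewRecordsCount(newestActivityDate);
var page = _myRepository.GetAll()
    .Skip(pageIndex * pageSize + newCount)
    .Take(pageSize)
    .ToList();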

NHibernate with a restriction outside of SQL

Is it possible to add a restriction in NHibernate (version 3.3) that is based on a calculation outside of the database? For example, say someCalculation below calls into some other method in my code and returns a boolean. For the sake of argument, someCalculation() cannot be executed in the database. Is there a way to get it to work? It's currently throwing, and I'm not sure whether it's because I am way off or because I'm doing something else wrong.
query.UnderlyingCriteria.Add(Restrictions.Where<MyEntity>(x => someCalculation(x.id)));
The answer relates more to SQL than to NHibernate. Simply put: either we send the result of that computation upfront, before execution, or we implement such a function on the DB side. No other mixture of the two is possible.
The first would end up in a statement like this:
var allowedIds = someCalculation(); // the computation happens up front, in code
query.WhereRestrictionOn(c => c.id).IsIn(allowedIds.ToArray());
If the id must be part of the calculation, we can first load the (somehow filtered) IDs, do the computation, and then execute a second select, similar to the above:
var ids = session.QueryOver<MyEntity>()
    .Select(c => c.id)
    .List<int>();

var allowedIds = someCalculation(ids); // someCalculation(x.id)
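The second select then restricts on the computed IDs, along the same lines as the first statement; a sketch:
var allowed = session.QueryOver<MyEntity>()
    .WhereRestrictionOn(c => c.id).IsIn(allowedIds.ToArray())
    .List();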
If that is still not efficient enough, the only way is to create a function on the DB side and call it. There is a detailed Q&A here:
Using SQL CONVERT function through nHibernate Criterion

EF LINQ ToList is very slow

I am using ASP.NET MVC 4.5 and EF6 with code-first migrations.
I have this code, which takes about 6 seconds.
var filtered = _repository.Requests.Where(r => some conditions); // this is fast, conditions match only 8 items
var list = filtered.ToList(); // this takes 6 seconds, has 8 items inside
I thought that this was because of relations and that it must be building them in memory, but that is not the case, because even when I return 0 fields it is still just as slow.
var filtered = _repository.Requests.Where(r => some conditions).Select(e => new {}); // this is fast, conditions match only 8 items
var list = filtered.ToList(); // this takes still around 5-6 seconds, has 8 items inside
Now the Requests table is quite complex, lots of relations and has ~16k items. On the other hand, the filtered list should only contain proxies to 8 items.
Why is the ToList() method so slow? I actually think the problem is not in the ToList() method itself, but is probably an EF issue or a bad design problem.
Has anyone had experience with anything like this?
EDIT:
These are the conditions:
_repository.Requests.Where(r => ids.Any(a => a == r.Student.Id) && r.StartDate <= cycle.EndDate && r.EndDate >= cycle.StartDate)
So basically, I am checking whether the Student id is in my id list and whether the dates match.
Your filtered variable contains a query, which is a question; it doesn't contain the answer. When you request the answer by calling .ToList(), that is when the query is executed. And that is why it is slow: the query is only executed by your database when you call .ToList().
This is called deferred execution. A quick search will give you more information about it.
If you show some of your conditions, we might be able to say why it is slow.
In addition to Maarten's answer, I think the problem is one of two situations:
some condition is complex and results in complex, heavy joins or queries in your database
some condition filters on a column which does not have an index, and this causes a full table scan and makes your query slow
I suggest you start monitoring the queries generated by Entity Framework. It's very simple: you just need to set the Log property of your context and look at the results:
using (var context = new MyContext())
{
    context.Database.Log = Console.Write;
    // Your code here...
}
If you see something strange in the generated query, try to improve it by breaking it into parts; sometimes the queries Entity Framework generates are not very good.
If the query looks okay, then the problem lies in your database (assuming no network problem).
Run your query with a SQL profiler and check what's wrong.
UPDATE
I suggest you:
add an index for the StartDate column and another for the EndDate column in your table (one for each, not one covering both)
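One more hedged suggestion along the same lines: ids.Any(a => a == r.Student.Id) sometimes translates to clumsier SQL than ids.Contains(...), which Entity Framework turns into a plain IN clause, so it may be worth comparing the two in the log:
var filtered = _repository.Requests.Where(r =>
    ids.Contains(r.Student.Id)
    && r.StartDate <= cycle.EndDate
    && r.EndDate >= cycle.StartDate);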
ToList executes the query against the DB, while the first line does not.
Can you show some of your conditions code here?
To increase the performance you need to optimize the query and/or create indexes on the DB tables.
Your first line of code only returns an IQueryable. This is a representation of a query that you want to run, not the result of the query. The query itself is only run on the database when you call .ToList() on your IQueryable, because that is the first point at which you actually ask for data.
Your adjustment to add the .Select only adds to the existing IQueryable query definition. It doesn't change which conditions have to execute. You have essentially changed the following, where you get back 8 records:
select * from Requests where [some conditions];
to something like:
select '' from Requests where [some conditions];
You will still have to perform the full query with the conditions giving you 8 records, but for each one you only asked for an empty string, so you get back 8 empty strings.
The long and the short of this is that any performance problem you are having is coming from your "some conditions". Without seeing them, it is difficult to know. But I have seen people in the past add .Where clauses inside a loop before calling .ToList() and inadvertently create a massively complicated query.
Jaanus, the most likely reason for this issue is the complexity of the SQL query generated by Entity Framework. I would guess that your filter condition involves checks against other tables.
Try to check the generated query with SQL Server Profiler, then copy it into Management Studio and look at the estimated execution plan. As a rule, Management Studio generates index recommendations for your query; try to follow those recommendations.

IQueryable<>.ToString() too slow

I'm using the BatchDelete method found in the answer to this question: EF Code First Delete Batch From IQueryable<T>?
The method seems to waste too much time building the delete clause from the IQueryable. Specifically, deleting 20,000 elements using the IQueryable below takes almost two minutes.
context.DeleteBatch(context.SomeTable.Where(x => idList.Contains(x.Id)));
All the time is spent on this line:
var sql = clause.ToString();
The line is part of this method, available on the original question linked above but pasted here for convenience:
private static string GetClause<T>(DbContext context, IQueryable<T> clause) where T : class
{
    const string Snippet = "FROM [dbo].[";

    var sql = clause.ToString();
    var sqlFirstPart = sql.Substring(sql.IndexOf(Snippet, System.StringComparison.OrdinalIgnoreCase));

    sqlFirstPart = sqlFirstPart.Replace("AS [Extent1]", string.Empty);
    sqlFirstPart = sqlFirstPart.Replace("[Extent1].", string.Empty);

    return sqlFirstPart;
}
I imagine turning context.SomeTable.Where(x => idList.Contains(x.Id)) into a compiled query could help, but AFAIK you can't compile queries while using DbContext in EF5. In theory they should be cached, but I see no sign of improvement on a second execution of the same BatchDelete.
Is there a way to make this faster? I would like to avoid manually building the SQL delete statement.
The IQueryable isn't cached, and each time you evaluate it you go out to SQL. Running ToList() or ToArray() on it will evaluate it once, after which you can work with the list as the cached version.
If you want to preserve your interfaces, use ToList().AsQueryable(), and this will pass in a cached version.
Related post.
How do I cache an IQueryable object?
It seems there is no way to cache the IQueryable in this case, because the query contains a list of ids to check against, and that list changes on every call.
The only way I found to avoid the two-minute delay in building the query every time I need to mass-delete objects was to use ExecuteSqlCommand, as below:
var list = string.Join("','", ids.Select(x => x.ToString()));
var qry = string.Format("DELETE FROM SomeTable WHERE Id IN ('{0}')", list);
context.Database.ExecuteSqlCommand(qry);
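If the ids could ever contain user-supplied strings, a safer variant parameterizes the list instead of concatenating it; a sketch (assumes using System.Data.SqlClient;):
var parameters = ids.Select((id, i) => new SqlParameter("@p" + i, id.ToString())).ToArray();
var placeholders = string.Join(",", parameters.Select(p => p.ParameterName));
var qry = string.Format("DELETE FROM SomeTable WHERE Id IN ({0})", placeholders);
context.Database.ExecuteSqlCommand(qry, parameters);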
I'll mark this as the answer for now. If any other technique is suggested that doesn't rely on ExecuteSqlCommand, I'll gladly change the answer.
There is an EF pattern that works OK. It uses projection to return ONLY the keys from the DB (projections are not added to the context, so this is pretty quick). Then you build key-only stub POCOs, attach them to the context, and light the fuse... Basically:
// Load ONLY the keys of the rows to delete; the projection is not tracked by the context.
var deleteMagazine = context.Set<DeadMeat>()
    .Where(t => t.IhateYou)
    .Select(t => t.THEKEY)
    .ToList();

// Now instantiate a dummy POCO with KEY only for each entry in the list.
var shotsFired = 0;
foreach (var bullet in deleteMagazine)
{
    var stub = new DeadMeat { THEKEY = bullet };
    context.Set<DeadMeat>().Attach(stub);
    context.Set<DeadMeat>().Remove(stub);

    // Consider saving changes every 1000 records; trial different values for performance.
    if (++shotsFired % 1000 == 0)
        context.SaveChanges();
}

// Shoot anyone still moving.
context.SaveChanges();
Check SQL Server Profiler to verify what is actually being sent.
