NHibernate paging with detached criteria - C#

I am working on an application in which I would like to implement paging. I have the following class, which derives from DetachedCriteria:
public class PagedData : DetachedCriteria
{
    public PagedData(int pageIndex, int pageSize) : base(typeof(mytype))
    {
        AddOrder(Order.Asc("myId"));
        var subquery = DetachedCriteria.For(typeof(mytype2))
            .SetProjection(Projections.Property("mytype.myId"));
        Add(Subqueries.PropertyIn("myId", subquery));
        SetFirstResult((pageIndex - 1) * pageSize);
        SetMaxResults(pageSize);
    }
}
This works fine - it returns exactly the data that I am trying to retrieve. The problem I am running into is getting the total row count for my page navigation. Since I am using SetFirstResult and SetMaxResults in my detached criteria, the row count is always limited to the pageSize variable that is coming in.
My question is this: How can I get the total row count? Should I just create another DetachedCriteria to calculate the row count? If so, will that add round trips to the db? Would I be better off not using DetachedCriteria and using a straight criteria query, in which I could then utilize futures? Or can I somehow use futures with what I am currently doing?
Please let me know if any further information is needed.
Thanks

I do it like this, inside my class which is used for paged criteria access:
// In order to be able to determine the NumberOfItems in an efficient manner,
// we'll clone the Criteria that has been given, and use a Projection so that
// NHibernate will issue a SELECT COUNT(*) against the ICriteria.
ICriteria countQuery = CriteriaTransformer.TransformToRowCount(_criteria);
NumberOfItems = countQuery.UniqueResult<int>();
Where NumberOfItems is a property (with a private setter) inside my 'PagedCriteriaResults' class.
The PagedCriteriaResults class takes an ICriteria instance in its constructor.
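For reference, here is a minimal sketch of what such a PagedCriteriaResults class could look like - the generic parameter and the exact paging details are assumptions beyond what's described above:
using System.Collections.Generic;
using NHibernate;

public class PagedCriteriaResults<T>
{
    public int NumberOfItems { get; private set; }
    public IList<T> Items { get; private set; }

    public PagedCriteriaResults(ICriteria criteria, int pageIndex, int pageSize)
    {
        // Clone the criteria into a SELECT COUNT(*) before paging is applied.
        ICriteria countQuery = CriteriaTransformer.TransformToRowCount(criteria);
        NumberOfItems = countQuery.UniqueResult<int>();

        // Apply paging to the original criteria and fetch the requested page.
        Items = criteria.SetFirstResult((pageIndex - 1) * pageSize)
                        .SetMaxResults(pageSize)
                        .List<T>();
    }
}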

You can create a second DetachedCriteria to get the row count with the built-in CriteriaTransformer:
DetachedCriteria countSubquery = NHibernate.CriteriaTransformer.TransformToRowCount(subquery);
This will of course result in a second call to the db.
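To actually execute the count, attach the detached criteria to a session; roughly (assuming a session variable is in scope):
int totalCount = countSubquery.GetExecutableCriteria(session).UniqueResult<int>();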

Discussed here:
How can you do paging with NHibernate?

Drawing on the two answers above, I created this method for paged searching using detached criteria.
Basically, I just take an ordinary detached criteria and, after I've created the real ICriteria from the session, I transform it to a row-count criteria and then use Future on both of them. Works great!
public PagedResult<T> SearchPaged<T>(PagedQuery query)
{
    try
    {
        // The PagedQuery object is just a holder for a detached criteria and the paging variables.
        ICriteria crit = query.Query.GetExecutableCriteria(_session);
        crit.SetMaxResults(query.PageSize);
        crit.SetFirstResult(query.PageSize * (query.Page - 1));
        var data = crit.Future<T>();

        // TransformToRowCount clones the criteria and strips its paging and ordering,
        // so the count covers the whole result set.
        ICriteria countQuery = CriteriaTransformer.TransformToRowCount(crit);
        var rowcount = countQuery.FutureValue<int>();

        // Enumerating the first future executes both queries together.
        IList<T> list = new List<T>();
        foreach (T t in data)
        {
            list.Add(t);
        }

        PagedResult<T> res = new PagedResult<T>();
        res.Page = query.Page;
        res.PageSize = query.PageSize;
        res.TotalRowCount = rowcount.Value;
        res.Result = list;
        return res;
    }
    catch (Exception ex)
    {
        _log.Error("error", ex);
        throw; // rethrow without resetting the stack trace
    }
}
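The PagedQuery and PagedResult types referenced above aren't shown in the answer; here is a minimal sketch of what they might look like, with the property names inferred from their usage in the method:
using System.Collections.Generic;
using NHibernate.Criterion;

// Holder for a detached criteria plus the paging variables.
public class PagedQuery
{
    public DetachedCriteria Query { get; set; }
    public int Page { get; set; }     // 1-based page number
    public int PageSize { get; set; }
}

// Holder for one page of results plus the total row count.
public class PagedResult<T>
{
    public int Page { get; set; }
    public int PageSize { get; set; }
    public int TotalRowCount { get; set; }
    public IList<T> Result { get; set; }
}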

Related

Iterating over Linq-to-Entities IEnumerable causes OutOfMemoryException

The part of the code I'm working on receives an
IEnumerable<T> items
parameter, where each item is a class whose properties reflect an MSSQL database table.
The database table has a total count of 953,664 rows.
The dataset in code is filtered down to a set of 284,360 rows.
The following code throws an OutOfMemoryException when the process reaches about 1.5 GB of allocated memory.
private static void Save<T>(IEnumerable<T> items, IList<IDataWriter> dataWriters, IEnumerable<PropertyColumn> columns) where T : MyTableClass
{
    foreach (var item in items)
    {
        // ...
    }
}
The variable items is actually of type
IQueryable<MyTableClass>
I can't find anyone with the same setup, and others' solutions that I've found don't apply here.
I've also tried paging, using Skip and Take with a page size of 500, but that just takes a long time and ends up with the same result. It seems like objects aren't being released after each iteration. Why is that?
How can I rewrite this code to cope with a larger collection?
Well, as Servy has already said, you didn't provide your code, so I'll try to make some predictions... (Sorry for my English.)
If you get the exception in "foreach (var item in items)" even when you are using paging then, I guess, something is wrong with the paging itself. I wrote a couple of examples to explain my idea.
In the first example I suggest (just as a test) putting your filter inside the Save function.
private static void Save<T>(IQueryable<T> items, IList<IDataWriter> dataWriters, IEnumerable<PropertyColumn> columns) where T : MyTableClass
{
    int pageSize = 500; // Only 500 records will be loaded at a time.
    int currentStep = 0;
    while (true)
    {
        // Here we create a new request into the database using our filter.
        // ToList() materializes the page so the query is executed only once per step.
        var tempList = items.Where(yourFilter)
                            .Skip(currentStep * pageSize)
                            .Take(pageSize)
                            .ToList();
        foreach (var item in tempList)
        {
            // If you have an exception here, maybe something is wrong in your dataWriters or columns.
        }
        currentStep++;
        if (tempList.Count == 0) // No records have been loaded, so we can leave.
            break;
    }
}
The second example shows how to use paging without any changes to the Save function:
int pageSize = 500;
int currentStep = 0;
while (true)
{
    // Here we create a new request into the database using our filter.
    var tempList = items.Where(yourFilter)
                        .Skip(currentStep * pageSize)
                        .Take(pageSize)
                        .ToList(); // Materialize once so Save and the count below reuse the same list.
    Save(tempList, dataWriters, columns); // Calling the saving function.
    currentStep++;
    if (tempList.Count == 0) // No records have been loaded, so we can leave.
        break;
}
Try both of them and you'll either resolve your problem or find another place where the exception is raised.
By the way, another potential culprit is your dataWriters. I guess you accumulate there all the data you have received from the database. Maybe you shouldn't keep all of it in memory - just calculate the memory that all those objects require.
P.S. And don't use while(true) in your code - it's just for the example. :)

Having trouble ordering search results using Lucene

I am running the following search query to bring back results from Dynamics CRM. Search is working fine, but it is bringing back results based on relevance. We want to order them in descending order of the 'createdon' field. As we are displaying only 10 results per page, I can't sort the results after this query returns.
Is there any way to order based on a field?
public IEnumerable<SearchResult> Search(string term, int? pageNumber, int pageSize,
    out int totalHits, IEnumerable<string> logicalNames)
{
    var searchProvider = SearchManager.Provider;
    var query = new CrmEntityQuery(term, pageNumber.GetValueOrDefault(1), pageSize, logicalNames);
    return GetSearchResults(out totalHits, searchProvider, query);
}

private IEnumerable<SearchResult> GetSearchResults(out int totalHits,
    SearchProvider searchProvider, CrmEntityQuery query)
{
    using (ICrmEntityIndexSearcher searcher = searchProvider.GetIndexSearcher())
    {
        Portal.StoreRequestItem("SearchDeduplicateListForAuthorisation", new List<Guid>());
        var results = searcher.Search(query);
        totalHits = results.ApproximateTotalHits;
        return from x in results
               select new SearchResult(x);
    }
}
I've not used Lucene myself, so I can't comment on that.
However, if you were doing this in basic CRM, you would use a QueryExpression with an OrderExpression; when you page the results, they are then paged in that order.
Here is an example of a QueryExpression, with an OrderExpression, and paging.
Page large result sets with QueryExpression
Presumably at some point the data is being pulled out of CRM, either within Lucene or your own code - maybe in CrmEntityQuery? You could add the sort there.
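For illustration, here is a rough sketch of that QueryExpression approach - the entity name "contact" and the organizationService/pageNumber/pageSize variables are placeholders, and the linked article covers the full paging-cookie pattern:
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

// Query sorted by createdon descending, fetching one page at a time.
var query = new QueryExpression("contact")
{
    ColumnSet = new ColumnSet(true),
    PageInfo = new PagingInfo
    {
        PageNumber = pageNumber,        // 1-based
        Count = pageSize,               // records per page
        ReturnTotalRecordCount = true   // exposes TotalRecordCount on the result
    }
};
query.AddOrder("createdon", OrderType.Descending);

EntityCollection results = organizationService.RetrieveMultiple(query);
int totalHits = results.TotalRecordCount;
Because the order is part of the query itself, every page comes back already sorted by createdon.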

How to retrieve more than 4000 records from RavenDB in a single session [duplicate]

I know variants of this question have been asked before (even by me), but I still don't understand a thing or two about this...
It was my understanding that one could retrieve more documents than the 128 default setting by doing this:
session.Advanced.MaxNumberOfRequestsPerSession = int.MaxValue;
And I've learned that a WHERE clause should be an expression tree instead of a Func, so that it's treated as IQueryable instead of IEnumerable. So I thought this should work:
public static List<T> GetObjectList<T>(Expression<Func<T, bool>> whereClause)
{
    using (IDocumentSession session = GetRavenSession())
    {
        return session.Query<T>().Where(whereClause).ToList();
    }
}
However, that only returns 128 documents. Why?
Note, here is the code that calls the above method:
RavenDataAccessComponent.GetObjectList<Ccm>(x => x.TimeStamp > lastReadTime);
If I add Take(n), then I can get as many documents as I like. For example, this returns 200 documents:
return session.Query<T>().Where(whereClause).Take(200).ToList();
Based on all of this, it would seem that the appropriate way to retrieve thousands of documents is to set MaxNumberOfRequestsPerSession and use Take() in the query. Is that right? If not, how should it be done?
For my app, I need to retrieve thousands of documents (that have very little data in them). We keep these documents in memory and used as the data source for charts.
** EDIT **
I tried using int.MaxValue in my Take():
return session.Query<T>().Where(whereClause).Take(int.MaxValue).ToList();
And that returns 1024. Argh. How do I get more than 1024?
** EDIT 2 - Sample document showing data **
{
"Header_ID": 3525880,
"Sub_ID": "120403261139",
"TimeStamp": "2012-04-05T15:14:13.9870000",
"Equipment_ID": "PBG11A-CCM",
"AverageAbsorber1": "284.451",
"AverageAbsorber2": "108.442",
"AverageAbsorber3": "886.523",
"AverageAbsorber4": "176.773"
}
It is worth noting that since version 2.5, RavenDB has an "unbounded results API" to allow streaming. The example from the docs shows how to use this:
var query = session.Query<User>("Users/ByActive").Where(x => x.Active);

using (var enumerator = session.Advanced.Stream(query))
{
    while (enumerator.MoveNext())
    {
        User activeUser = enumerator.Current.Document;
    }
}
There is support for standard RavenDB queries, Lucene queries, and there is also async support.
The documentation can be found here. Ayende's introductory blog article can be found here.
The Take(n) function will only give you up to 1024 by default. However, you can change this default in Raven.Server.exe.config:
<add key="Raven/MaxPageSize" value="5000"/>
For more info, see: http://ravendb.net/docs/intro/safe-by-default
The Take(n) function will only give you up to 1024 by default. However, you can use it in a pair with Skip(n) to page through everything:
var points = new List<T>();
List<T> nextGroupOfPoints;
const int ElementTakeCount = 1024;
int i = 0;
int skipResults = 0;
RavenQueryStatistics stats;
do
{
    nextGroupOfPoints = session.Query<T>()
                               .Statistics(out stats)
                               .Where(whereClause)
                               .Skip(i * ElementTakeCount + skipResults)
                               .Take(ElementTakeCount)
                               .ToList();
    i++;
    // Results skipped server-side still consume paging slots, so carry them forward.
    skipResults += stats.SkippedResults;
    points.AddRange(nextGroupOfPoints);
}
while (nextGroupOfPoints.Count == ElementTakeCount);
return points;
RavenDB Paging
The number of requests per session is a separate concept from the number of documents retrieved per call. Sessions are short-lived and are expected to have only a few calls issued over them.
If you are getting more than 10 of anything from the store (even less than the default of 128) for human consumption, then something is wrong, or your problem requires different thinking than hauling a truckload of documents out of the data store.
RavenDB indexing is quite sophisticated. There is a good article about indexing here and about facets here.
If you need to perform data aggregation, create a map/reduce index which results in aggregated data, e.g.:
Index:
// Map
from post in docs.Posts
select new { post.Author, Count = 1 }

// Reduce
from result in results
group result by result.Author into g
select new
{
    Author = g.Key,
    Count = g.Sum(x => x.Count)
}
Query:
session.Query<AuthorPostStats>("Posts/ByUser/Count").Where(x => x.Author == author).ToList();
You can also use a predefined index with the Stream method. You may use a Where clause on indexed fields.
var query = session.Query<User, MyUserIndex>();
// or, filtering on an indexed field:
var query = session.Query<User, MyUserIndex>().Where(x => !x.IsDeleted);

using (var enumerator = session.Advanced.Stream<User>(query))
{
    while (enumerator.MoveNext())
    {
        var user = enumerator.Current.Document;
        // do something
    }
}
Example index:
public class MyUserIndex : AbstractIndexCreationTask<User>
{
    public MyUserIndex()
    {
        this.Map = users =>
            from u in users
            select new
            {
                u.IsDeleted,
                u.Username,
            };
    }
}
Documentation: What are indexes?
Session : Querying : How to stream query results?
Important note: the Stream method will NOT track objects. If you change objects obtained from this method, SaveChanges() will not be aware of any change.
Another note: you may get the following exception if you do not specify the index to use.
InvalidOperationException: StreamQuery does not support querying dynamic indexes. It is designed to be used with large data-sets and is unlikely to return all data-set after 15 sec of indexing, like Query() does.

Caching/compiling complex Linq query (Entity Framework)

I have a complex Entity Framework query. My performance bottleneck is not actually querying the database, but translating the IQueryable into query text.
My code is something like this:
var query = context.Hands.Where(...);
if (x)
    query = query.Where(...);
....
var result = query.OrderBy(...);
var page = result.Skip(500 * pageNumber).Take(500).ToList(); // long time here, even before calling the DB
do
{
    foreach (var h in page) { ... }
    pageNumber += 1;
    page = result.Skip(500 * pageNumber).Take(500).ToList(); // same here
}
while (y);
What can I do? I am using DbContext (with SQLite), so I can't use precompiled queries (and even then, it would be cumbersome with a query-building algorithm like this).
What I basically need is to cache a "page" query and only change the "skip" and "take" parameters, without recompiling it from the ground up each time.
Your premise is incorrect. Because you have a ToList call at the end of your query, you are querying the database at the lines you've indicated in order to construct the list; execution is no longer deferred. That's why it takes so long: you aren't spending a long time constructing the query, you're spending a long time going to the database and actually executing it.
If it helps you can use the following method to do the pagination for you. It will defer fetching each page until you ask for the next one:
public static IEnumerable<IEnumerable<T>> Paginate<T>(
this IQueryable<T> query, int pagesize)
{
int pageNumber = 0;
var page = query.Take(pagesize).ToList();
while (page.Any())
{
yield return page;
pageNumber++;
page = query.Skip(pageNumber * pagesize)
.Take(pagesize)
.ToList();
}
}
So if you had this code:
var result = query.OrderBy(...);
var pages = result.Paginate(500); // still haven't hit the database
// each iteration of this loop will query the DB once to get that page
foreach (var page in pages)
{
    //use page
}
If you want to get an IEnumerable<IQueryable<T>> in which you have all of the pages as queries (meaning you could add further filters to them before sending them to the database), then the major problem is that you don't know how many pages there will be. You need to actually execute a given query to know whether it's the last page or not. You either need to fetch each page as you go, as the code above does, or you need to query the count of the un-paged query at the start (which means one more DB query than you would otherwise need). Doing that would look like:
public static IEnumerable<IQueryable<T>> Paginate<T>(
    this IQueryable<T> query, int pagesize)
{
    // note that this is hitting the DB
    int numPages = (int)Math.Ceiling(query.Count() / (double)pagesize);
    for (int i = 0; i < numPages; i++)
    {
        var page = query.Skip(i * pagesize)
                        .Take(pagesize);
        yield return page;
    }
}
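As a usage sketch of this second variant (the context.Hands source is from the question above, but the Id and IsActive members are hypothetical): each page is still an IQueryable<T>, so a further Where is translated into that page's SQL before it executes.
// 'h.Id' and 'h.IsActive' are made-up names for illustration.
var pages = context.Hands.OrderBy(h => h.Id).Paginate(500);
foreach (var page in pages)
{
    var rows = page.Where(h => h.IsActive).ToList(); // one DB query per page
    foreach (var h in rows)
    {
        // process h
    }
}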

LINQ with ManyToMany: Filtering based on multiple selection

I am a newbie to C# and have to use it for my master's thesis. At the moment I am facing a problem that is a bit too complex for me.
I have set up a database with a many-to-many relationship like this:
Table Relay:
- id (PK)
- Name
- Input
Table ProtectionFunction:
- id (PK)
- ANSI
- IEC
- Description
Table RelayConfig (junction table)
- RelayID (PK)
- ProtFuncID (PK)
- TimeToSaturate
- Remanence
The thing is, a Relay can have multiple protection functions, and for each it has specific values for TimeToSaturate and Remanence. Now I want to build a filter: the user selects protection functions via checkboxes in a DataGridView, and a ListBox should show all Relays that support ALL of these protection functions.
I have already created the LINQ-to-SQL classes for my project. But now I am stuck because I don't know how to realize the filtering. All LINQ queries I have found so far would give me all Relays for one protection function.
I really hope one of you can give me a hint.
var ids = new int[] { ... };
// If ids is null or ids.Length == 0, return null or an empty list -
// do not go further, otherwise you'll get Relays without any function filter.
var query = Relays.AsQueryable();
foreach (var id in ids)
{
    var tempId = id;
    query = query.Where(r => r.RelayConfigs.Any(rc => rc.ProtFuncID == tempId));
}
var items = query.ToList();
Update: I just saw this on the PredicateBuilder page:
"The temporary variable in the loop is required to avoid the outer variable trap, where the same variable is captured for each iteration of the foreach loop."
It's easier if you start from the RelayConfigs. Something like this should work (note that it returns relays matching ANY of the selected functions, not all of them - see the update below for the ALL case):
var protFuncIds = new[] { 1, 2, 3 };
var query = from rc in db.RelayConfigs
            where protFuncIds.Contains(rc.ProtFuncID)
            select rc.Relay;
var relays = query.Distinct().ToList();
UPDATE:
Based on your comment, the following should work; however, do monitor the SQL generated...
IQueryable<Relay> query = db.Relays;
foreach (var id in ids)
{
    var tempId = id; // avoid the outer variable trap
    query = query.Where(r => r.RelayConfigs.Select(x => x.ProtFuncId).Contains(tempId));
}
var relays = query.ToList();
// Build a list of protection function ids from your checkbox list
var protFuncIDs = new[] { 1, 2, 3, 4 };
using (var dc = new MyDataContext())
{
    var result = dc.Relays
        .Where(r => protFuncIDs
            .Join(r.RelayConfigs, pf => pf, rc => rc.ProtFuncID, (pf, rc) => pf)
            .Count() == protFuncIDs.Length)
        .ToArray();
}
It's not especially efficient, but that should do the trick for you.
I have done this in Lightswitch, and here was my preprocess query:
partial void UnusedContactTypesByContact_PreprocessQuery(int? ContactID, ref IQueryable<ContactType> query)
{
    query = from contactType in query
            where !contactType.ContactToContactTypes.Any(c => c.Contact.Id == ContactID)
            select contactType;
}
Hope that helps.
