I understand AsEnumerable() is used to switch from "LINQ to SQL" to "LINQ to Objects", so we can use some extra (mostly user-defined) methods in our LINQ queries. But in my experience, using AsEnumerable() makes the query much slower. In that case I can enumerate the list later to apply my own methods, but the result is still pretty slow.
Can anyone suggest a better approach?
Here is a code sample of what I'm trying to do.
With AsEnumerable():
var Data = (from r in _context.PRD_ChemProdReq.AsEnumerable()
//where r.RecordStatus == "NCF"
orderby r.RequisitionNo descending
select new PRDChemProdReq
{
RequisitionID = r.RequisitionID,
RequisitionNo = r.RequisitionNo,
RequisitionCategory = DalCommon.ReturnRequisitionCategory(r.RequisitionCategory),
RequisitionType = DalCommon.ReturnOrderType(r.RequisitionType),
ReqRaisedOn = (Convert.ToDateTime(r.ReqRaisedOn)).ToString("dd'/'MM'/'yyyy"),
RecordStatus= DalCommon.ReturnRecordStatus(r.RecordStatus),
RequisitionFromName = DalCommon.GetStoreName(r.RequisitionFrom),
RequisitionToName = DalCommon.GetStoreName(r.RequisitionTo)
}).ToList();
without AsEnumerable():
var Data = (from r in _context.PRD_ChemProdReq
//where r.RecordStatus == "NCF"
orderby r.RequisitionNo descending
select new PRDChemProdReq
{
RequisitionID = r.RequisitionID,
RequisitionNo = r.RequisitionNo,
RequisitionCategory = r.RequisitionCategory,
RequisitionType = (r.RequisitionType),
ReqRaisedOnTemp = (r.ReqRaisedOn),
RecordStatus= (r.RecordStatus),
RequisitionFrom = (r.RequisitionFrom),
RequisitionTo = (r.RequisitionTo)
}).ToList();
foreach (var item in Data)
{
item.RequisitionCategory = DalCommon.ReturnRequisitionCategory(item.RequisitionCategory);
item.RequisitionType = DalCommon.ReturnOrderType(item.RequisitionType);
item.ReqRaisedOn = (Convert.ToDateTime(item.ReqRaisedOnTemp)).ToString("dd'/'MM'/'yyyy");
item.RecordStatus = DalCommon.ReturnRecordStatus(item.RecordStatus);
item.RequisitionFromName = DalCommon.GetStoreName(item.RequisitionFrom);
item.RequisitionToName = DalCommon.GetStoreName(item.RequisitionTo);
}
It looks like you're treating these two interfaces as two totally different things. In fact IQueryable inherits from IEnumerable, so whatever worked for you with the latter will work with the former as well, so you don't need to use AsEnumerable().
Behind the scenes, though, these interfaces are implemented quite differently: IEnumerable will process your collection in memory, while IQueryable will pass the query to the underlying data provider. You can imagine that if a database table contains millions of records and you try to sort it, the DB server can do it very quickly (using indexes), so IQueryable will shine. With IEnumerable, all the data needs to be loaded into your computer's memory and sorted there.
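As a minimal sketch of that difference (assuming the same context as in the question), the placement of the filter decides where it runs:
// Filter is translated to SQL and runs on the database server:
var serverSide = _context.PRD_ChemProdReq
    .Where(r => r.RecordStatus == "NCF")
    .ToList();

// The whole table is streamed to the client first; the filter then runs in memory:
var clientSide = _context.PRD_ChemProdReq
    .AsEnumerable()
    .Where(r => r.RecordStatus == "NCF")
    .ToList();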
For longer answers, search for "IEnumerable IQueryable difference" on SO; you will find plenty of details.
Update: If you remove the .ToList() call from your second example, the result won't be automatically loaded into memory. At this point you need to decide which items you want to keep in memory and call your functions only for those.
var Data = (from r in _context.PRD_ChemProdReq
orderby r.RequisitionNo descending
select new PRDChemProdReq
{
// do your initialization
});
var subsetOfData = Data.Take(100).ToList(); // Now it's loaded to memory
foreach (var item in subsetOfData)
{
item.RequisitionCategory = DalCommon.ReturnRequisitionCategory(item.RequisitionCategory);
item.RequisitionType = DalCommon.ReturnOrderType(item.RequisitionType);
item.ReqRaisedOn = (Convert.ToDateTime(item.ReqRaisedOnTemp)).ToString("dd'/'MM'/'yyyy");
item.RecordStatus = DalCommon.ReturnRecordStatus(item.RecordStatus);
item.RequisitionFromName = DalCommon.GetStoreName(item.RequisitionFrom);
item.RequisitionToName = DalCommon.GetStoreName(item.RequisitionTo);
}
Now, if you actually need to assign these properties for all your data and the data can be arbitrarily large, you need to work out a strategy for how to do it. A very simple option is to save the results to a new table in the database; then the size of the processed data is limited only by the capacity of your database.
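As a rough sketch of such a strategy (the page size and ordering here are just assumptions), you can process the rows in fixed-size chunks so only one chunk is ever in memory:
const int pageSize = 500;
for (int page = 0; ; page++)
{
    // Skip/Take are translated to SQL, so only one chunk crosses the wire.
    var chunk = _context.PRD_ChemProdReq
        .OrderByDescending(r => r.RequisitionNo)
        .Skip(page * pageSize)
        .Take(pageSize)
        .ToList();

    if (chunk.Count == 0) break;

    foreach (var r in chunk)
    {
        // apply the DalCommon.* conversions here and persist the results
        // (e.g. into a new table, as suggested above)
    }
}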
AsEnumerable() will be slower if you add any query elements after it.
Even though AsEnumerable() doesn't execute the query directly, applying a where or orderby after AsEnumerable() means that SQL will return all items, and the filtering and ordering are then applied to the collection in memory.
In short:
Without AsEnumerable() = filtering and ordering done in SQL
With AsEnumerable() followed by Where or orderby = filtering and ordering applied to the whole collection after it has been brought into memory.
You can only run user-defined functions on a collection in memory (as LINQ to SQL will not be able to translate your functions into SQL code). So your second code snippet (without AsEnumerable()) is probably best.
The only other alternative is to apply your user defined functions in SQL itself.
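One middle ground worth noting (a sketch based on the original query, not tested): keep the where/orderby on the IQueryable so they are translated to SQL, and call AsEnumerable() only just before the projection that needs the user-defined methods:
var data = _context.PRD_ChemProdReq
    .Where(r => r.RecordStatus == "NCF")          // runs in SQL
    .OrderByDescending(r => r.RequisitionNo)      // runs in SQL
    .AsEnumerable()                               // switch to LINQ to Objects from here on
    .Select(r => new PRDChemProdReq
    {
        RequisitionID = r.RequisitionID,
        RequisitionNo = r.RequisitionNo,
        RequisitionCategory = DalCommon.ReturnRequisitionCategory(r.RequisitionCategory),
        RequisitionType = DalCommon.ReturnOrderType(r.RequisitionType),
        ReqRaisedOn = Convert.ToDateTime(r.ReqRaisedOn).ToString("dd'/'MM'/'yyyy"),
        RecordStatus = DalCommon.ReturnRecordStatus(r.RecordStatus),
        RequisitionFromName = DalCommon.GetStoreName(r.RequisitionFrom),
        RequisitionToName = DalCommon.GetStoreName(r.RequisitionTo)
    })
    .ToList();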
OK guys, I took note of the different points from all of you and came up with this:
var Data = (from r in _context.PRD_ChemProdReq.AsEnumerable()
//where r.RecordStatus == "NCF"
join rf in _context.SYS_Store on (r.RequisitionFrom==null?0: r.RequisitionFrom) equals rf.StoreID into requisitionfrom
from rf in requisitionfrom.DefaultIfEmpty()
join rt in _context.SYS_Store on (r.RequisitionTo == null ? 0 : r.RequisitionTo) equals rt.StoreID into requisitionto
from rt in requisitionto.DefaultIfEmpty()
orderby r.RequisitionNo descending
select new PRDChemProdReq
{
RequisitionID = r.RequisitionID,
RequisitionNo = r.RequisitionNo,
RequisitionCategory = DalCommon.ReturnRequisitionCategory(r.RequisitionCategory),
RequisitionType = r.RequisitionType == "UR" ? "Urgent" : "Normal",
ReqRaisedOn = (Convert.ToDateTime(r.ReqRaisedOn)).ToString("dd'/'MM'/'yyyy"),
RecordStatus = (r.RecordStatus=="NCF"? "Not Confirmed": "Approved"),
RequisitionFromName = (rf==null? null: rf.StoreName),
RequisitionToName = (rt == null ? null : rt.StoreName)
});
First of all, I removed my ToList(), which did nothing but execute a query that was already executed when I called AsEnumerable(); there is no point in executing the same query twice. My custom method calls within the select block were also playing a major part in slowing things down. I reduced those method calls and used joins where possible instead, which makes things considerably faster. Thank you all.
Related
I'm learning Entity Framework and hitting a wall. Here is my code:
public IOrderedEnumerable<ArchiveProcess> getHistory()
{
using (ArchiveVMADDatabase.ArchiveDatabaseModel dataContext = new ArchiveDatabaseModel())
{
var query = (from history in dataContext.ArchiveProcess.AsNoTracking()
orderby history.ArchiveBegin descending
select history).Take(10).ToList();
return query as IOrderedEnumerable<ArchiveProcess>;
}
}
When I step through this code, query is a List<ArchiveProcess> containing my ten desired results. However, as soon as I exit the method and the context is disposed of, query becomes null. How can I avoid this? I tried doing this instead:
select new ArchiveProcess
{
ArchiveBegin = history.ArchiveBegin,
ArchiveEnd = history.ArchiveEnd,
DeploysHistoryCount = history.DeploysHistoryCount,
MachinesHistory = history.MachinesHistory,
ScriptHistory = history.ScriptHistory
}
But then I received a NotSupportedException. Why does entity framework delete my precious entities as soon as the context is disposed of and how do I tell it to stop?
I think there are several ways to avoid this, but in general you should know precisely how long you want your context to live; it's usually better to have the using statement wrap the entire operation.
To keep the results from disappearing with the context, declare the object first and then assign the values to it inside the using block:
List<ArchiveProcess> query;
using (ArchiveVMADDatabase.ArchiveDatabaseModel dataContext = new ArchiveDatabaseModel())
{
query = (from history in dataContext.ArchiveProcess.AsNoTracking()
orderby history.ArchiveBegin descending
select history).Take(10).ToList();
return query; // you do not really need to cast it to IOrderedEnumerable<ArchiveProcess>
}
query as IOrderedEnumerable<ArchiveProcess>;
query is a List<ArchiveProcess>, and as returns null when you try to use it to cast something to an interface it doesn't implement. List<ArchiveProcess> is not an IOrderedEnumerable<ArchiveProcess>, so query as IOrderedEnumerable<ArchiveProcess> is null.
The only thing that IOrderedEnumerable<T> does that IEnumerable<T> doesn't do, is implement CreateOrderedEnumerable<TKey>, which can be called directly or through ThenBy and ThenByDescending, so you can add a secondary sort on the enumerable that only affects items considered equivalent by the earlier sort.
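For illustration, a small sketch of what that buys you (assuming processes is some in-memory IEnumerable<ArchiveProcess>; the names are made up):
// OrderByDescending returns an IOrderedEnumerable<ArchiveProcess>, which remembers
// how it was ordered, so ThenBy can add a secondary sort for ties on ArchiveBegin.
IOrderedEnumerable<ArchiveProcess> ordered = processes
    .OrderByDescending(p => p.ArchiveBegin)
    .ThenBy(p => p.ArchiveEnd);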
If you don't use CreateOrderedEnumerable() either directly or through ThenBy() or ThenByDescending(), then change the method so it doesn't claim to provide it:
public IEnumerable<ArchiveProcess> getHistory()
{
using (ArchiveVMADDatabase.ArchiveDatabaseModel dataContext = new ArchiveDatabaseModel())
{
return (from history in dataContext.ArchiveProcess.AsNoTracking()
orderby history.ArchiveBegin descending
select history).Take(10).ToList();
}
}
Otherwise reapply the ordering, so that ThenBy etc. can be used with it:
public IOrderedEnumerable<ArchiveProcess> getHistory()
{
using (ArchiveVMADDatabase.ArchiveDatabaseModel dataContext = new ArchiveDatabaseModel())
{
return (from history in dataContext.ArchiveProcess.AsNoTracking()
orderby history.ArchiveBegin descending
select history).Take(10).ToList().OrderBy(h => h.ArchiveBegin);
}
}
However this adds a bit more overhead, so don't do it if you don't need it.
Remember, IOrderedEnumerable<T> is not just an ordered enumerable (all enumerables are in some order, however arbitrary); it's an ordered enumerable that has knowledge about the way in which it is ordered, so as to provide for secondary sorting. If you don't need that, then you don't need IOrderedEnumerable<T>.
using (ArchiveVMADDatabase.ArchiveDatabaseModel dataContext = new ArchiveDatabaseModel())
{
var query = dataContext.ArchiveProcess.AsNoTracking().OrderByDescending(o => o.ArchiveBegin).Take(10).ToList();
return query;
}
I sum myself to the hapless lot that fumbles with custom methods in LINQ to EF queries. I've skimmed the web trying to detect a pattern to what makes a custom method LINQ-friendly, and while every source says that the method must be translatable into a T-SQL query, the applications seem very diverse. So, I'll post my code here and hopefully a generous SO denizen can tell me what I'm doing wrong and why.
The Code
public IEnumerable<WordIndexModel> GetWordIndex(int transid)
{
return (from trindex in context.transIndexes
let trueWord = IsWord(trindex)
join trans in context.Transcripts on trindex.transLineUID equals trans.UID
group new { trindex, trans } by new { TrueWord = trueWord, trindex.transID } into grouped
orderby grouped.Key.TrueWord
where grouped.Key.transID == transid
select new WordIndexModel
{
Word = grouped.Key.TrueWord,
Instances = grouped.Select(test => test.trans).Distinct()
});
}
public string IsWord(transIndex trindex)
{
Match m = Regex.Match(trindex.word, @"^[a-z]+(\w*[-]*)*",
RegexOptions.IgnoreCase);
return m.Value;
}
With the above code I access a table, transIndex, that is essentially a word index culled from various user documents. The problem is that not all entries are actually words. Numbers, and even underscore lines such as ___________, are saved as well.
The Problem
I'd like to keep only the words that my custom method IsWord returns (at present I have not actually developed the parsing mechanism). As the IsWord function shows, it will return a string.
So, using let, I introduce my custom method into the query and use it as a grouping parameter, which is then selectable into my object. Upon execution I get the ominous:
LINQ to Entities does not recognize the method
'System.String IsWord(transIndex)' method, and this
method cannot be translated into a store expression."
I also need to make sure that only records that match the IsWord condition are returned.
Any ideas?
It is saying it does not understand your IsWord method in terms of how to translate it to SQL.
Frankly, it does not do much anyway, so why not replace it with:
return (from trindex in context.transIndexes
let trueWord = trindex.word
join trans in context.Transcripts on trindex.transLineUID equals trans.UID
group new { trindex, trans } by new { TrueWord = trueWord, trindex.transID } into grouped
orderby grouped.Key.TrueWord
where grouped.Key.transID == transid
select new WordIndexModel
{
Word = grouped.Key.TrueWord,
Instances = grouped.Select(test => test.trans).Distinct()
});
As for which methods EF can translate into SQL, I can't give you a list, but it can never translate a straightforward method you have written yourself. There are, however, some built-in ones that it understands, like MyArray.Contains(x) for example; it can turn this into something like
...
WHERE Field IN (ArrItem1,ArrItem2,ArrItem3)
If you want to write a LINQ-compatible method then you need to create an expression tree that EF can understand and turn into SQL.
This is where things start to bend my mind a little, but this article may help: http://blogs.msdn.com/b/csharpfaq/archive/2009/09/14/generating-dynamic-methods-with-expression-trees-in-visual-studio-2010.aspx.
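As a rough illustration of that idea (a simplified filter, not the regex from the question, since Regex has no SQL translation), you can expose the filter as an expression tree instead of a plain method:
using System;
using System.Linq;
using System.Linq.Expressions;

public static class TransIndexFilters
{
    // Because this returns an expression tree rather than compiled code,
    // LINQ to Entities can inspect it and translate it into the SQL WHERE clause
    // (StartsWith becomes a LIKE, the comparisons become plain predicates).
    public static Expression<Func<transIndex, bool>> LooksLikeWord()
    {
        return t => t.word != null
                 && t.word != ""
                 && !t.word.StartsWith("_");
    }
}

// Usage:
// var candidates = context.transIndexes.Where(TransIndexFilters.LooksLikeWord());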
If the percentage of bad records returned is not large, you could consider enumerating the result set first, and then applying the processing / filtering:
var query = (from trindex in context.transIndexes
...
select new WordIndexModel
{
Word,
Instances = grouped.Select(test => test.trans).Distinct()
});
var result = query.ToList().Where(word => IsTrueWord(word));
return result;
If the number of records is too high to enumerate, consider doing the check in a view or stored procedure. That will help with speed and keep the code clean.
But of course, using stored procedures has disadvantages for reusability and maintainability (because there are no refactoring tools).
Also, check out another answer which seems to be similar to this one: https://stackoverflow.com/a/10485624/3481183
I don't think this is possible but wanted to ask to make sure. I am currently debugging some software someone else wrote and it's a bit unfinished.
One part of the software is a search function which searches by different fields in the database, and the person who wrote the software wrote a great big case statement with 21 cases in it, one for each field the user may want to search by.
Is it possible to reduce this down using a case statement within the Linq or a variable I can set with a case statement before the Linq statement?
Example of one of the LINQ queries (only the where clause changes in each query):
var list = (from data in dc.MemberDetails
where data.JoinDate.ToString() == searchField
select new
{
data.MemberID,
data.FirstName,
data.Surname,
data.Street,
data.City,
data.County,
data.Postcode,
data.MembershipCategory,
data.Paid,
data.ToPay
}
).ToList();
Update / Edit:
This is what comes before the case statement:
string searchField = txt1stSearchTerm.Text;
string searchColumn = cmbFirstColumn.Text;
switch (cmbFirstColumn.SelectedIndex + 1)
{
The cases are then done by the index of the combo box which holds the list of field names.
Given that Where takes a predicate, you can pass any method or function which takes a MemberDetail as a parameter and returns a boolean, and then migrate the switch statement inside it (a sketch of what that might look like follows at the end of this answer).
private bool IsMatch(MemberDetail detail)
{
// The comparison goes here.
}
var list = (from data in dc.MemberDetails
where this.IsMatch(data)
select new
{
data.MemberID,
data.FirstName,
data.Surname,
data.Street,
data.City,
data.County,
data.Postcode,
data.MembershipCategory,
data.Paid,
data.ToPay
}
).ToList();
Note that:
You may look for a more object-oriented way to do the comparison, rather than using a huge switch block.
An anonymous type with ten properties that you use in your select is kinda weird. Can't you return an instance of MemberDetail? Or an instance of its base class?
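As a sketch of what the migrated switch might look like (the column names and comparisons below are guesses, not the real 21 cases):
private bool IsMatch(MemberDetail data)
{
    // searchColumn and searchField come from the combo box and text box,
    // as shown in the question's snippet.
    switch (searchColumn)
    {
        case "First Name": return data.FirstName == searchField;
        case "Surname":    return data.Surname == searchField;
        case "Join Date":  return data.JoinDate.ToString() == searchField;
        // ... one case per searchable column
        default:           return false;
    }
}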
How are the different where statements handled? Are they mutually exclusive, or do they all limit the query somehow?
Here is how you can apply one or more filters to the same query and materialize it only after all the filters have been applied.
var query = (from data in dc.MemberDetails
select ....);
if (!String.IsNullOrEmpty(searchField))
query = query.Where(pr => pr.JoinDate.ToString() == searchField);
if (!String.IsNullOrEmpty(otherField))
query = query.Where(....);
return query.ToList();
I'm just looking for some links or tips for some general direction. I'm writing a program where the user will have access to many different tables in a SQL Server database. For example, if a user clicks on the "Foo" button, it will bring up a GUI dialog which displays all of the columns of the "Foo" table. It also has a textbox at the top for filtering through the data in the columns. For the most part, what I've written so far works at least decently, but there are times when the performance is really slow and I can't help but feel I'm doing something wrong. Generally my code goes like this.
//GUI Constructor will call a Load Function
private void Load()
{
//Context is a DbContext entity
var query = from q in context.Foo
select q;
datagrid.Datasource = query.ToList();
}
The part which I feel goes wrong is the textbox filter searching. The way I do it right now is to basically re-query the database to get more specific rows. This function gets called on the TextBox TextChanged event. I know it would be bad to call this on every keypress; I was going to add a timer to wait for the user to stop typing before applying the filter. Anyway, here is the code:
private void TextFilter()
{
var query = from q in context.Foo.Where( x => x.Name == FilterTextbox.Text )
select q;
datagrid.Datasource = query.ToList();
}
I would assume it'd be better to store the entire table from the load function into a list, so that all of the info is in the program's memory already, but this actually was slower than just querying the database again. I also tried using Context.Foo.Local and querying off of that, but it proved to be as slow as storing all the data in my own List.
private void AlternateLoad()
{
context.Foo.Load();
datagrid.Datasource = context.Foo.Local.ToList();
}
private void AlternateTextSearch()
{
var query = from q in context.Foo.Local.Where( x => x.Name == FilterTextBox.Text )
select q;
datagrid.Datasource = query.ToList();
}
I've experimented with AsParallel() when using Local or my own List but it doesn't seem to make a difference. Anyways I just want to see how to speed this up. In one specific scenario, I was prefiltering the database before displaying the data, and the prefilter took about 19 seconds before it could display its 7 row result. The smaller tables are fine but the tables with 100k+ rows definitely reveal the weakness of the code. Any tips or just general links to on how to do this would be greatly appreciated. I've been searching all over and I have had no luck in finding anything.
Thanks very much!
It's really up to you to estimate what the best solution is for the application, but the following should be some solid pointers to continue:
Don't return data that you don't need: top 100 rows / paging.
In the LINQ query ==> .First(), .Skip(), .Take() (see the paging sketch below).
In-memory filtering can be very fast too! Keep the entire list in memory and run a LINQ query against this collection...
Create stored procedures with a parameter to filter on, and put an index on the columns you most likely want to filter on.
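For the paging point, a small sketch (the page size and ordering column are assumptions; adapt them to the real table):
private void LoadPage(int pageIndex, int pageSize = 100)
{
    // Skip/Take are translated into SQL paging, so only one page of rows
    // is pulled from the 100k+ row table.
    var page = (from q in context.Foo
                orderby q.Name            // Skip/Take need a defined order
                select q)
               .Skip(pageIndex * pageSize)
               .Take(pageSize)
               .ToList();

    datagrid.Datasource = page;
}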
I would suggest adding a private member variable (a list) to the class:
public class MyClass
{
private List<Foo> myFoo = new List<Foo>();
//GUI Constructor will call a Load Function
private void Load()
{
//Context is a DbContext entity
var query = from q in context.Foo
select q;
// set the private member variable "myFoo" here
datagrid.Datasource = myFoo = query.ToList();
}
private void TextFilter()
{
// don't requery the DB, query the list myFoo
//var query = from q in context.Foo.Where( x => x.Name == FilterTextbox.Text )
//select q;
var query = myFoo.Where(x => x.Column == SearchCriteria).ToList();
datagrid.Datasource = query;
}
}
Note: The above code isn't tested or anything; hopefully it points you in the right direction.
Also important: because you're not re-querying the database, the data in the list could become out of date.
I'm trying to produce the set of entities to be created in a target context by comparing IDs from a second context.
This is what I came up with but I'm looking for a better way or confirmation this is the right way.
The main points I've noted so far are:
ToList() is required to prevent an error occurring when a query uses multiple contexts.
I know that if I force an IN clause into the generated SQL there's an upper limit to how many values it can handle, and I don't want that error condition looming.
public override IEnumerable<Campaign> Execute()
{
using (var eom = eomDatabase.Create())
using (var cake = cakeEntities.Create())
{
var campaigns = eom.Campaigns.Select(c => c.pid).ToList();
var offers = cake.CakeOffers.Select(c => c.Offer_Id).ToList();
var newOffers = offers.Except(campaigns).ToList();
var newCampaigns = from offer in cake.CakeOffers
where newOffers.Contains(offer.Offer_Id)
select new Campaign {
pid = offer.Offer_Id,
campaign_name = offer.OfferName
};
return newCampaigns.ToList();
}
}
UPDATE: apparently let statements do not work the way I thought - the above produces no error, while my original code (below) causes a multiple-contexts error.
var newCampaigns = from offer in cake.CakeOffers
let campaigns = eom.Campaigns.Select(c => c.pid).ToList()
let offers = cake.CakeOffers.Select(c => c.Offer_Id).ToList()
let newOffers = offers.Except(campaigns).ToList()
where newOffers.Contains(offer.Offer_Id)
select new Campaign {
pid = offer.Offer_Id,
campaign_name = offer.OfferName
};
1) The impact of ToList() is to have the query executed at that point. Therefore, you are pulling all the Ids into memory. Depending on the relative sizes of the data sets this may or may not be optimal. If there are more campaigns than cake offers, you may be better off querying the cake offer id's into memory first using a cakeEntities context - then dispose of it and manage the rest separately.
You can easily circumvent any IN clause limitations by batching your reading of cake offers - just handle a fixed number at a time using the Take operator: e.g.
IList<int> cakeOffers;
using (var cakeDb = new cakeEntities())
{
cakeOffers = cakeDb.CakeOffers.Take(10).Select(c => c.Offer_Id).ToList();
}
2) (EDIT - I think you knew this...) You aren't actually creating any new Campaigns in the context. You're just constructing a bunch of them in memory. You need to add them to the context and then save it to create them.
EDIT
I would then just query the eomDatabase for existing Campaigns with matching offer_Ids using a Contains clause: e.g.:
var existingCampaignOffers = campaignDb.Campaigns
.Where(co => cakeOffers.Contains(co.pid)).Select(c => c.pid).ToList();
Then finally use Except() in memory to get the exclusive list Offer_Ids you need to create your new Campaigns.
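A small sketch of that last step, continuing with the in-memory id lists from above (the campaign_name lookup is only indicated, not implemented):
// Offer ids that do not yet have a matching campaign.
var missingOfferIds = cakeOffers.Except(existingCampaignOffers).ToList();

// Construct the new Campaign entities; fetching OfferName for each id would
// need one more (batched) query against the cake context.
var newCampaigns = missingOfferIds
    .Select(id => new Campaign { pid = id })
    .ToList();

// Remember to add them and save, per point 2:
// eom.Campaigns.AddRange(newCampaigns);
// eom.SaveChanges();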