I have a very peculiar problem with Entity Framework performance. I use version 7 of the framework with the SQLite provider (both from NuGet). The database has around 10 million records, and in the future there will be around 100 million. The structure of the db is very simple:
public class Sample
{
public int SampleID { get; set; }
public long Time { get; set; }
public short Channel { get; set; } /* values from 0 to 8191, in the presented test 0-15 */
public byte Events { get; set; } /* 1-255 */
}
public class Channel
{
public int ChannelID { get; set; }
public short Ch { get; set; }
public int Es { get; set; }
}
public class MyContext : DbContext
{
// This property defines the table
public DbSet<Sample> Samples { get; set; }
public DbSet<Channel> Spectrum { get; set; }
// This method connects the context with the database
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
var connectionStringBuilder = new SqliteConnectionStringBuilder { DataSource = "E://database.db" };
var connectionString = connectionStringBuilder.ToString();
var connection = new SqliteConnection(connectionString);
optionsBuilder.UseSqlite(connection);
}
}
I try to group events by channel and sum them up into something like a spectrum. When I use LINQ I get very poor performance: for 10M records the query takes about 15 minutes, uses around 1 GB of RAM, and then throws an OutOfMemoryException. I think that Entity Framework is loading all records as objects into memory - but why? On the other hand, plain SQL needs about 3 seconds and no significant amount of RAM.
using (var db = new MyContext())
{
var res1 = from sample in db.Samples
group sample by sample.Channel into g
select new { Channel=g.Key, Events = g.Sum(s => s.Events) };
res1.ToArray();
var res2 = db.Spectrum.FromSql("SELECT Channel as ChannelID, Channel as Ch, SUM(Events) as Es FROM Sample GROUP BY Channel");
var data = res2.ToArray();
}
Any suggestions? Thanks for the help ;)
Suggestion? IGNORE ENTITY FRAMEWORK.
As in: this is so totally not an EF issue it is not even funny.
Look at the SQL that EF sends out, then optimize from that level. Granted, you have little influence on the SQL; but for a trivial statement like this the SQL will be optimal.
What will not be optimal - and there is a hint here that you never looked at the SQL - is the database. Are the indices there? Code first is ignorant of the intricacies of the database, so you need to look at it FIRST from an "is my database optimal" perspective. Indices. And - sadly - hardware. If you hit 100 million rows, you need the power in the database to handle it.
I think that Entity Framework is loading all records as objects into memory -
but why?
Rule 1 in performance debugging: DO NOT THINK - CHECK. Look at the SQL generated (log it; the res1 variable can show it to you) and see what gets submitted to the database.
It is possible that you just have that much data. You say nothing about how many channels exist - this may well require a bigger machine.
Check it.
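For example, a minimal sketch of how to surface the generated SQL. This assumes EF Core 5 or later, where LogTo exists on DbContextOptionsBuilder; the early EF7 prereleases wired up a logger factory instead:
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    var connectionStringBuilder = new SqliteConnectionStringBuilder { DataSource = "E://database.db" };
    var connection = new SqliteConnection(connectionStringBuilder.ToString());
    optionsBuilder
        .UseSqlite(connection)
        // Requires using Microsoft.Extensions.Logging; prints every SQL
        // statement EF generates, so you can see whether the GROUP BY is
        // translated to SQL or evaluated client-side.
        .LogTo(Console.WriteLine, LogLevel.Information);
}
Once you can see the SQL, you know whether to blame the query EF produces or the database executing it.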
Also: it is not really smart to pull the results into an array unless you need that. Arrays are memory-problematic in this scenario (reallocations to reach the final size) and a List may be better (uses more memory but requires no reallocation). In general, though, you want to AVOID materializing the result set - i.e. work from the enumerable. Not always, but your test may simply be showing problems on that side. The resulting array may be huge - and it requires one contiguous piece of memory.
And seriously, question your choice of database technology. SQLite is nice - it is small, it is lightweight, it can even run in memory. It is NOT suitable for huge amounts of data; it is not a full-scale database server. You may be much better off using SQL Server Express (if anything, SQL Server Express will use memory for caching that is NOT in your process but separate). I personally would not use SQLite for something that may hold hundreds of millions of records.
Also: note your two queries are different. The EF part has an OrderBy (which is not needed); the SQL does not. Ordering may well be expensive - which puts us back to "get the SQL generated by Entity Framework".
The problem was connected with the SQLite provider. After changing to SQL Server Compact everything works fine ;)
I have three simple tables - usergroups, staff, and salutations. All have ID, Name/Desc, and Active columns. The usergroups are also assigned an optional staff ID, and the staff are assigned a non-optional salutation ID. I wish to query these tables to return a complete list of all active usergroups, with their related staff members (if any) and their related salutations.
A working SQL query is as follows:
SELECT grp.ID, grp.Desc, grp.Active, sub.Name, sub.Desc
FROM Tbl_UserGroup AS grp
LEFT JOIN (
SELECT st.ID, st.Name, sal.Desc
FROM PrmTbl_Staff AS st
LEFT JOIN PrmTbl_Salutation AS sal ON st.SalutationID = sal.ID
WHERE 1
) AS sub ON grp.StaffID = sub.ID
WHERE grp.Active = TRUE
ORDER BY grp.ID DESC
I have a ViewModel as follows:
public class StaffUserGroup
{
public int GroupID { get; set; }
public string GroupDesc { get; set; }
public bool GroupActive { get; set; }
public int? StaffID { get; set; }
public string StaffName { get; set; }
public string SalutationName { get; set; }
public List<PrmTbl_Staff> StaffsList { get; set; }
}
And an attempt at a LINQ query:
IEnumerable<Tbl_UserGroup> grpsQuery;
grpsQuery = from grp in db.Tbl_UserGroups
join sub in(
from st in db.PrmTbl_Staffs
join sal in db.PrmTbl_Salutations on st.SalutationID equals sal.ID
select new { StID = st.ID, st.Name, Salt = sal.Desc }
) on grp.StaffID equals sub.StID
where grp.Active == true
orderby grp.ID descending
select new { grp.ID, grp.Desc, grp.Active, sub.Name, sub.Salt, sub.StID };
Which is loaded in my Controller:
var viewModel = grpsQuery.Select(group =>
new StaffUserGroup
{
GroupID = group.GroupID,
GroupDesc = group.GroupDesc,
GroupActive = group.GroupActive,
StaffID = group.StaffID,
StaffName = group.StaffName,
SalutationName = group.SalutationName,
StaffsList = rtrnStaff
}
);
Note that IntelliSense was flagging identically named columns between the subquery and the main query, so I introduced some aliases. I also wish to pass the view a dropdown list of all available staff, hence the List in the ViewModel.
I am getting an error on the select call in the LINQ statement: Cannot implicitly convert type 'System.Linq.IQueryable<AnonymousType#4>' to '<...StaffUserGroup>'. An explicit conversion exists. Are you missing a cast?
I don't know:
Why I need a ViewModel when I can just query the data I need directly to the Controller
What the ViewModel class then actually does with the data retrieved from the query - does it filter it? Construct an object from it? From my background in PHP and MySQL, what would be a comparison?
How to query specific columns from a table. I'm using select new {}, because I'm assuming that's the equivalent?
Why the above LINQ statement doesn't work.
I can post Models, Views, or Controllers if needed. Any help and advice is greatly appreciated!
Question 1
Why I need a ViewModel when I can just query the data I need directly to the Controller
A ViewModel is a POCO that you write which defines exactly what a view needs in order to display itself correctly.
For example, let's suppose you have a page (view) that welcomes a user.
Welcome, Bob. Your last visit was 2013-10-11.
A ViewModel is a simple class that defines exactly the things that the view needs.
The user's name
The user's last visit
Therefore:
public class UserDetailsViewModel
{
public string FirstName { get; set; }
public DateTime LastVisit { get; set; }
}
It's (usually) the controller's responsibility to create the ViewModel, ensure it's populated, give it to the view and finally return the view. The controller doesn't do much else; its responsibilities are limited and the code in an action should be fairly small.
The reason that you do this is that it's good practice. But that answer isn't good enough on its own, so let me explain.
It is possible to simply run a query, return an IEnumerable of some domain object (for example a list of users) and give that to a view. This is done in many MVC demos. The problem is it's very limited/restrictive. What happens if you want to change what the view displays later? What happens if the domain model changes slightly? It's easier to manage and change things when they are neatly organised and concerns are separated.
Question 2
What the ViewModel class then actually does with the data retrieved from the query - does it filter it? Construct an object from it? From my background in PHP and MySQL, what would be a comparison?
The ViewModel is a concept that is native (or at least common) to certain architectural patterns such as MVC and MVVM. The ViewModel doesn't really 'do' anything. It doesn't do any logic; it has no methods. It just contains a list of properties (and attributes) which define what a view that is using this ViewModel will need.
There isn't exactly a PHP equivalent because a ViewModel isn't specific to .NET. It's just a concept that is associated with MVC, MVVM and so on. The PHP equivalent would be a PHP MVC ViewModel. Remember that ASP.NET MVC is just an implementation of the MVC pattern. PHP has its own MVC implementations.
Question 3
How to query specific columns from a table. I using select new {}, because I'm assuming that's equivalent?
This depends on how you're doing it. EntityFramework is an object-relational mapper that is often used in ASP.NET MVC applications. In this way, you don't query your underlying storage or columns directly. Instead EF will map the tables and columns to .NET objects and you manipulate those.
I'd recommend you try to stay with dealing in objects, rather than creating anonymous types as you go and trying to grab specific columns. Remember that LINQ isn't SQL. The approach shouldn't be "query this table, grab these columns, where this clause" but instead "from this group of objects, grab the object, where this clause".
For example:
var query = from user in Users
where user.FirstName == "Bob"
select user;
Question 4
Why the above LINQ statement doesn't work.
As the description says, you're trying to assign an IEnumerable of an anonymous type where an IEnumerable of StaffUserGroup is expected. I believe this is because of the way you are selecting things in order to populate your ViewModel. It's hard to fix the code without knowing more about how things are structured. My recommendation here is to go look at how other people are doing LINQ/EntityFramework in MVC. It just takes a bit of practice until you get comfortable with how things work.
It seems you may be trying to put items of an anonymous type
select new { grp.ID, grp.Desc, grp.Active, sub.Name, sub.Salt, sub.StID }
into a collection intended for items of type
IEnumerable<Tbl_UserGroup> grpsQuery;
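One hedged sketch of a way out (reusing the ViewModel and table names from the question; StaffsList is left for the controller to fill in): project straight into StaffUserGroup inside the query, so no anonymous type ever needs converting:
var grpsQuery = from grp in db.Tbl_UserGroups
                join sub in (
                    from st in db.PrmTbl_Staffs
                    join sal in db.PrmTbl_Salutations on st.SalutationID equals sal.ID
                    select new { StID = st.ID, st.Name, Salt = sal.Desc }
                ) on grp.StaffID equals sub.StID
                where grp.Active == true
                orderby grp.ID descending
                // Projecting into the ViewModel makes the result
                // IEnumerable<StaffUserGroup> instead of an anonymous type.
                select new StaffUserGroup
                {
                    GroupID = grp.ID,
                    GroupDesc = grp.Desc,
                    GroupActive = grp.Active,
                    StaffID = sub.StID,
                    StaffName = sub.Name,
                    SalutationName = sub.Salt
                };
Note this is an inner join, like the original LINQ attempt; the LEFT JOIN from the SQL version would need a "join ... into ... DefaultIfEmpty()" pattern.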
I've written this code to project a one-to-many relation, but it's not working:
using (var connection = new SqlConnection(connectionString))
{
connection.Open();
IEnumerable<Store> stores = connection.Query<Store, IEnumerable<Employee>, Store>
(@"Select Stores.Id as StoreId, Stores.Name,
Employees.Id as EmployeeId, Employees.FirstName,
Employees.LastName, Employees.StoreId
from Store Stores
INNER JOIN Employee Employees ON Stores.Id = Employees.StoreId",
(a, s) => { a.Employees = s; return a; },
splitOn: "EmployeeId");
foreach (var store in stores)
{
Console.WriteLine(store.Name);
}
}
Can anybody spot the mistake?
EDIT:
These are my entities:
public class Product
{
public int Id { get; set; }
public string Name { get; set; }
public double Price { get; set; }
public IList<Store> Stores { get; set; }
public Product()
{
Stores = new List<Store>();
}
}
public class Store
{
public int Id { get; set; }
public string Name { get; set; }
public IEnumerable<Product> Products { get; set; }
public IEnumerable<Employee> Employees { get; set; }
public Store()
{
Products = new List<Product>();
Employees = new List<Employee>();
}
}
EDIT:
I changed the query to:
IEnumerable<Store> stores = connection.Query<Store, List<Employee>, Store>
(@"Select Stores.Id as StoreId, Stores.Name, Employees.Id as EmployeeId,
Employees.FirstName,Employees.LastName,Employees.StoreId
from Store Stores INNER JOIN Employee Employees
ON Stores.Id = Employees.StoreId",
(a, s) => { a.Employees = s; return a; }, splitOn: "EmployeeId");
and I got rid of the exceptions! However, Employees are not mapped at all. I am still not sure what problem it had with IEnumerable<Employee> in the first query.
This post shows how to query a highly normalised SQL database, and map the result into a set of highly nested C# POCO objects.
Ingredients:
8 lines of C#.
Some reasonably simple SQL that uses some joins.
Two awesome libraries.
The insight that allowed me to solve this problem is to separate the MicroORM from mapping the result back to the POCO Entities. Thus, we use two separate libraries:
Dapper as the MicroORM.
Slapper.Automapper for mapping.
Essentially, we use Dapper to query the database, then use Slapper.Automapper to map the result straight into our POCOs.
Advantages
Simplicity. It's less than 8 lines of code. I find this a lot easier to understand, debug, and change.
Less code. A few lines of code is all Slapper.Automapper needs to handle anything you throw at it, even if we have a complex nested POCO (i.e. POCO contains List<MyClass1> which in turn contains List<MySubClass2>, etc).
Speed. Both of these libraries have an extraordinary amount of optimization and caching to make them run almost as fast as hand tuned ADO.NET queries.
Separation of concerns. We can change the MicroORM for a different one, and the mapping still works, and vice-versa.
Flexibility. Slapper.Automapper handles arbitrarily nested hierarchies, it isn't limited to a couple of levels of nesting. We can easily make rapid changes, and everything will still work.
Debugging. We can first see that the SQL query is working properly, then we can check that the SQL query result is properly mapped back to the target POCO Entities.
Ease of development in SQL. I find that creating flattened queries with inner joins to return flat results is much easier than creating multiple select statements, with stitching on the client side.
Optimized queries in SQL. In a highly normalized database, creating a flat query allows the SQL engine to apply advanced optimizations to the whole query, which would not normally be possible if many small individual queries were constructed and run.
Trust. Dapper is the back end for StackOverflow, and, well, Randy Burden is a bit of a superstar. Need I say any more?
Speed of development. I was able to do some extraordinarily complex queries, with many levels of nesting, and the dev time was quite low.
Fewer bugs. I wrote it once, it just worked, and this technique is now helping to power a FTSE company. There was so little code that there was no unexpected behavior.
Disadvantages
Scaling beyond 1,000,000 rows returned. Works well when returning < 100,000 rows. However, if we are bringing back >1,000,000 rows, in order to reduce the traffic between us and SQL server, we should not flatten it out using inner join (which brings back duplicates), we should instead use multiple select statements and stitch everything back together on the client side (see the other answers on this page).
This technique is query oriented. I haven't used this technique to write to the database, but I'm sure that Dapper is more than capable of doing this with some extra work, as StackOverflow itself uses Dapper as its Data Access Layer (DAL).
Performance Testing
In my tests, Slapper.Automapper added a small overhead to the results returned by Dapper, which meant that it was still 10x faster than Entity Framework, and the combination is still pretty darn close to the theoretical maximum speed SQL + C# is capable of.
In most practical cases, most of the overhead would be in a less-than-optimum SQL query, and not with some mapping of the results on the C# side.
Performance Testing Results
Total number of iterations: 1000
Dapper by itself: 1.889 milliseconds per query, using 3 lines of code to return the dynamic.
Dapper + Slapper.Automapper: 2.463 milliseconds per query, using an additional 3 lines of code for the query + mapping from dynamic to POCO Entities.
Worked Example
In this example, we have a list of Contacts, and each Contact can have one or more phone numbers.
POCO Entities
public class TestContact
{
public int ContactID { get; set; }
public string ContactName { get; set; }
public List<TestPhone> TestPhones { get; set; }
}
public class TestPhone
{
public int PhoneId { get; set; }
public int ContactID { get; set; } // foreign key
public string Number { get; set; }
}
SQL Table TestContact
SQL Table TestPhone
Note that this table has a foreign key ContactID which refers to the TestContact table (this corresponds to the List<TestPhone> in the POCO above).
SQL Which Produces Flat Result
In our SQL query, we use as many JOIN statements as we need to get all of the data we need, in a flat, denormalized form. Yes, this might produce duplicates in the output, but these duplicates will be eliminated automatically when we use Slapper.Automapper to automatically map the result of this query straight into our POCO object map.
USE [MyDatabase];
SELECT tc.[ContactID] as ContactID
,tc.[ContactName] as ContactName
,tp.[PhoneId] AS TestPhones_PhoneId
,tp.[ContactId] AS TestPhones_ContactId
,tp.[Number] AS TestPhones_Number
FROM TestContact tc
INNER JOIN TestPhone tp ON tc.ContactId = tp.ContactId
C# code
const string sql = @"SELECT tc.[ContactID] as ContactID
,tc.[ContactName] as ContactName
,tp.[PhoneId] AS TestPhones_PhoneId
,tp.[ContactId] AS TestPhones_ContactId
,tp.[Number] AS TestPhones_Number
FROM TestContact tc
INNER JOIN TestPhone tp ON tc.ContactId = tp.ContactId";
string connectionString = "..."; // insert your SQL connection string here
using (var conn = new SqlConnection(connectionString))
{
conn.Open();
// Can set default database here with conn.ChangeDatabase(...)
{
// Step 1: Use Dapper to return the flat result as a Dynamic.
dynamic test = conn.Query<dynamic>(sql);
// Step 2: Use Slapper.Automapper for mapping to the POCO Entities.
// - IMPORTANT: Let Slapper.Automapper know how to do the mapping;
// let it know the primary key for each POCO.
// - Must also use underscore notation ("_") to name parameters in the SQL query;
// see Slapper.Automapper docs.
Slapper.AutoMapper.Configuration.AddIdentifiers(typeof(TestContact), new List<string> { "ContactID" });
Slapper.AutoMapper.Configuration.AddIdentifiers(typeof(TestPhone), new List<string> { "PhoneID" });
var testContact = (Slapper.AutoMapper.MapDynamic<TestContact>(test) as IEnumerable<TestContact>).ToList();
foreach (var c in testContact)
{
foreach (var p in c.TestPhones)
{
Console.Write("ContactName: {0}: Phone: {1}\n", c.ContactName, p.Number);
}
}
}
}
Output
POCO Entity Hierarchy
Looking in Visual Studio, we can see that Slapper.Automapper has properly populated our POCO Entities: we have a List<TestContact>, and each TestContact has a List<TestPhone>.
Notes
Both Dapper and Slapper.Automapper cache everything internally for speed. If you run into memory issues (very unlikely), ensure that you occasionally clear the cache for both of them.
Ensure that you name the columns coming back, using the underscore (_) notation to give Slapper.Automapper clues on how to map the result into the POCO Entities.
Ensure that you give Slapper.Automapper clues on the primary key for each POCO Entity (see the lines Slapper.AutoMapper.Configuration.AddIdentifiers). You can also use attributes on the POCO for this, as sketched below. If you skip this step, then it could go wrong (in theory), as Slapper.Automapper would not know how to do the mapping properly.
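As a sketch of the attribute route (hedged: the Id attribute shown is the one described in the Slapper.AutoMapper documentation; verify it against the version you use):
public class TestPhone
{
    // Marks the primary key so Slapper.Automapper can collapse duplicate rows,
    // replacing the Configuration.AddIdentifiers call shown above.
    [Slapper.AutoMapper.Id]
    public int PhoneId { get; set; }

    public int ContactID { get; set; } // foreign key
    public string Number { get; set; }
}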
Update 2015-06-14
Successfully applied this technique to a huge production database with over 40 normalized tables. It worked perfectly to map an advanced SQL query with over 16 inner and left joins into the proper POCO hierarchy (with 4 levels of nesting). The queries are blindingly fast, almost as fast as hand-coding it in ADO.NET (typically 52 milliseconds for the query, and 50 milliseconds for the mapping from the flat result into the POCO hierarchy). This is really nothing revolutionary, but it sure beats Entity Framework for speed and ease of use, especially if all we are doing is running queries.
Update 2016-02-19
Code has been running flawlessly in production for 9 months. The latest version of Slapper.Automapper has all of the changes that I applied to fix the issue related to nulls being returned in the SQL query.
Update 2017-02-20
Code has been running flawlessly in production for 21 months, and has handled continuous queries from hundreds of users in a FTSE 250 company.
Slapper.Automapper is also great for mapping a .csv file straight into a list of POCOs. Read the .csv file into a list of IDictionary, then map it straight into the target list of POCOs. The only trick is that you have to add a property int Id { get; set; } and make sure it's unique for every row (or else the automapper won't be able to distinguish between the rows).
Update 2019-01-29
Minor update to add more code comments.
See: https://github.com/SlapperAutoMapper/Slapper.AutoMapper
I wanted to keep it as simple as possible. My solution:
public List<ForumMessage> GetForumMessagesByParentId(int parentId)
{
var sql = @"
select d.id_data as Id, d.cd_group As GroupId, d.cd_user as UserId, d.tx_login As Login,
d.tx_title As Title, d.tx_message As [Message], d.tx_signature As [Signature], d.nm_views As Views, d.nm_replies As Replies,
d.dt_created As CreatedDate, d.dt_lastreply As LastReplyDate, d.dt_edited As EditedDate, d.tx_key As [Key]
from
t_data d
where d.cd_data = @DataId order by id_data asc;
select d.id_data As DataId, di.id_data_image As DataImageId, di.cd_image As ImageId, i.fl_local As IsLocal
from
t_data d
inner join T_data_image di on d.id_data = di.cd_data
inner join T_image i on di.cd_image = i.id_image
where d.id_data = @DataId and di.fl_deleted = 0 order by d.id_data asc;";
var mapper = _conn.QueryMultiple(sql, new { DataId = parentId });
var messages = mapper.Read<ForumMessage>().ToDictionary(k => k.Id, v => v);
var images = mapper.Read<ForumMessageImage>().ToList();
foreach(var imageGroup in images.GroupBy(g => g.DataId))
{
messages[imageGroup.Key].Images = imageGroup.ToList();
}
return messages.Values.ToList();
}
I still make one call to the database, and while I now execute 2 queries instead of one, the second query uses an INNER JOIN instead of a less optimal LEFT JOIN.
A slight modification of Andrew's answer that utilizes a Func to select the parent key instead of GetHashCode.
public static IEnumerable<TParent> QueryParentChild<TParent, TChild, TParentKey>(
this IDbConnection connection,
string sql,
Func<TParent, TParentKey> parentKeySelector,
Func<TParent, IList<TChild>> childSelector,
dynamic param = null, IDbTransaction transaction = null, bool buffered = true, string splitOn = "Id", int? commandTimeout = null, CommandType? commandType = null)
{
Dictionary<TParentKey, TParent> cache = new Dictionary<TParentKey, TParent>();
connection.Query<TParent, TChild, TParent>(
sql,
(parent, child) =>
{
if (!cache.ContainsKey(parentKeySelector(parent)))
{
cache.Add(parentKeySelector(parent), parent);
}
TParent cachedParent = cache[parentKeySelector(parent)];
IList<TChild> children = childSelector(cachedParent);
children.Add(child);
return cachedParent;
},
param as object, transaction, buffered, splitOn, commandTimeout, commandType);
return cache.Values;
}
Example usage
conn.QueryParentChild<Product, Store, int>("sql here", prod => prod.Id, prod => prod.Stores)
According to this answer, there is no one-to-many mapping support built into Dapper.Net. Queries will always return one object per database row. There is an alternative solution included, though.
Here is another method:
Order (one) - OrderDetail (many)
using (var connection = new SqlCeConnection(connectionString))
{
var orderDictionary = new Dictionary<int, Order>();
var list = connection.Query<Order, OrderDetail, Order>(
sql,
(order, orderDetail) =>
{
Order orderEntry;
if (!orderDictionary.TryGetValue(order.OrderID, out orderEntry))
{
orderEntry = order;
orderEntry.OrderDetails = new List<OrderDetail>();
orderDictionary.Add(orderEntry.OrderID, orderEntry);
}
orderEntry.OrderDetails.Add(orderDetail);
return orderEntry;
},
splitOn: "OrderDetailID")
.Distinct()
.ToList();
}
Source: http://dapper-tutorial.net/result-multi-mapping#example---query-multi-mapping-one-to-many
Here is a crude workaround
public static IEnumerable<TOne> Query<TOne, TMany>(this IDbConnection cnn, string sql, Func<TOne, IList<TMany>> property, dynamic param = null, IDbTransaction transaction = null, bool buffered = true, string splitOn = "Id", int? commandTimeout = null, CommandType? commandType = null)
{
var cache = new Dictionary<int, TOne>();
cnn.Query<TOne, TMany, TOne>(sql, (one, many) =>
{
if (!cache.ContainsKey(one.GetHashCode()))
cache.Add(one.GetHashCode(), one);
var localOne = cache[one.GetHashCode()];
var list = property(localOne);
list.Add(many);
return localOne;
}, param as object, transaction, buffered, splitOn, commandTimeout, commandType);
return cache.Values;
}
It's by no means the most efficient way, but it will get you up and running. I'll try and optimise this when I get a chance.
Use it like this:
conn.Query<Product, Store>("sql here", prod => prod.Stores);
Bear in mind your objects need to implement GetHashCode, perhaps like this:
public override int GetHashCode()
{
return this.Id.GetHashCode();
}
How can I store multiple values from a large set so that I can find them quickly with a lambda expression, based on a property with non-unique values?
Sample case (not optimized for performance):
class Product
{
public string Title { get; set; }
public int Price { get; set; }
public string Description { get; set; }
}
IList<Product> products = this.LoadProducts();
var q1 = products.Where(c => c.Title == "Hello"); // 1 product.
var q2 = products.Where(c => c.Title == "Sample"); // 5 products.
var q3 = products.Where(c => string.IsNullOrEmpty(c.Title)); // 12 345 products.
If Title were unique, it would be easy to optimize performance by using an IDictionary or a HashSet. But what about the case where the values are not unique?
The simplest solution is to use a dictionary of collections of Product. The easiest way to get one is ToLookup:
var products = this.LoadProducts().ToLookup(p => p.Title);
var example1 = products["Hello"]; // 1 product
var example2 = products["Sample"]; // 5 products
Your third example is a little harder; you could use ApplyResultSelector() for that, or normalize the key, as sketched below.
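A small sketch of the key-normalization idea (hedged; it simply maps null titles to the empty string so the string.IsNullOrEmpty case becomes an ordinary lookup):
// Map null titles to "" so byTitle[""] covers the string.IsNullOrEmpty case.
var byTitle = products.ToLookup(p => p.Title ?? string.Empty);
var hello = byTitle["Hello"];         // 1 product
var samples = byTitle["Sample"];      // 5 products
var untitled = byTitle[string.Empty]; // 12 345 products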
What you need is the ability to run indexed queries in LINQ (the same as we do in SQL).
There is a library called i4o which apparently can solve your problem:
http://i4o.codeplex.com/
from their website:
i4o (index for objects) is the first class library that extends LINQ to allow you to put indexes on your objects. Using i4o, the speed of LINQ operations is often over one thousand times faster than without i4o.
i4o works by allowing the developer to specify an IndexSpecification for any class, and then using the IndexableCollection to implement a collection of that class that will use the index specification, rather than sequential search, when doing LINQ operations that can benefit from indexing.
The following also provides an example of how to use i4o:
http://www.hookedonlinq.com/i4o.ashx
To make it short, you need to:
Add [Indexable()] attribute to your "Title" property
Use IndexableCollection<Product> as your data source.
From this point, any LINQ query that uses an indexable field will use the index rather than doing a sequential search, resulting in order-of-magnitude performance increases for queries using the index, as sketched below.
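A hedged sketch of what that looks like, using only the attribute and collection names mentioned above (the exact construction API may differ between i4o versions):
public class Product
{
    [Indexable] // tells i4o to maintain an index on this property
    public string Title { get; set; }

    public int Price { get; set; }
    public string Description { get; set; }
}

// Queries on Title can now use the index instead of a sequential scan.
var products = new IndexableCollection<Product>(this.LoadProducts());
var samples = from p in products
              where p.Title == "Sample"
              select p;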
I don't know Linq2Sql very well yet, and I was wondering if there is a trick for this probably common MVVM scenario. I have a Linq2Sql data context containing domain models, but I am fetching data for my customized ViewModel object from it.
var query = from ord in ctx.Table_Orders
select new OrderViewModel()
{
OrderId = ord.OrderId,
OrderSum = ord.OrderSum,
OrderCurrencyId = ord.OrderCurrencyId,
OrderCurrencyView = ord.Currency.CurrencyText
};
So I want my ViewModel to include both the CurrencyId from the domain object and the CurrencyText from the related table, to show it nicely in the View.
This code works great. It generates one DB call with a join to fetch the CurrencyText. But the model is simplified; the real one has many more fields. I want to make the code reusable, because I have many different queries that return the same ViewModel. Right now every minor change to OrderViewModel requires lots of maintenance.
So I moved the code to OrderViewModel itself as a constructor.
public OrderViewModel(Table_Order ord)
{
OrderId = ord.OrderId;
OrderSum = ord.OrderSum;
OrderCurrencyId = ord.OrderCurrencyId;
OrderCurrencyView = ord.Currency.CurrencyText;
}
And call it like this.
var query = from ord in ctx.Table_Orders
select new OrderViewModel(ord);
The problem: the join is gone and the DB query is no longer optimised. Now I get 1+N calls to the database, fetching the CurrencyText separately for every line.
Any comments are welcome. Maybe I have missed a different, better approach.
This is how far I could get on my own toward code reusability. I created a function that does the job and takes multiple parameters. Then I need to explicitly pass it everything that comes from outside the entity itself.
var query = ctx.Table_Orders.Select(m =>
    new OrderViewModel(m, m.Currency.CurrencyText));
The DB call is again optimized, but it still does not feel like I am there yet! What tricks do you know for this case?
EDIT: The final solution
Thanks to a hint by @Muhammad Adeel Zahid I arrived at this solution.
I created an extension for IQueryable
public static class Mappers
{
public static IEnumerable<OrderViewModel> OrderViewModels(this IQueryable<Table_Order> q)
{
return from ord in q
select new OrderViewModel()
{
OrderId = ord.OrderId,
OrderSum = ord.OrderSum,
OrderCurrencyId = ord.OrderCurrencyId,
OrderCurrencyView = ord.Currency.CurrencyText
};
}
}
Now I can do this to get the whole list:
var orders = ctx.Table_Orders.OrderViewModels().ToList();
or this to get a single item, or anything in between with Where(x => ..)
var order = ctx.Table_Orders
    .Where(x => x.OrderId == id).OrderViewModels().SingleOrDefault();
And that completely solves this question. The SQL generated is perfect and the object-translation code is reusable. An approach like this should work with both LINQ to SQL and LINQ to Entities (not tested with the latter). Thank you again, @Muhammad Adeel Zahid.
Whenever we query the database, we mostly want either an enumeration of objects (more than one record in the db) or a single entity (one record in the db). You can write your mapping code in a method that returns an enumeration for the whole table, like:
public IEnumerable<OrderViewModel> GetAllOrders()
{
return from ord in ctx.Table_Orders
select new OrderViewModel()
{
OrderId = ord.OrderId,
OrderSum = ord.OrderSum,
OrderCurrencyId = ord.OrderCurrencyId,
OrderCurrencyView = ord.Currency.CurrencyText
};
}
Now you may want to filter these records and return another enumeration, for example by CurrencyID:
public IEnumerable<OrderViewModel> GetOrdersByCurrency(int CurrencyID)
{
return GetAllOrders().Where(x=>x.CurrencyId == CurrencyID);
}
You may also want to find a single record out of all these view models:
public OrderViewModel GetOrder(int OrderID)
{
return GetAllOrders().SingleOrDefault(x=>x.OrderId == OrderID);
}
The beauty of IEnumerable is that conditions keep being added to the query, and it does not execute until it is needed. So your whole table will not be loaded unless you really want it, and you have kept your code in a single place. Now if there is any change in the ViewModel mapping or in the query itself, it has to be done only in the GetAllOrders() method; the rest of the code stays unchanged.
You can avoid the N+1 queries problem by having Linq2SQL eagerly load the referenced entities you need to construct your ViewModels. This way you can build one list of objects (and some referenced objects) and use it to construct everything. Have a look at this blog post.
One word of warning though: this technique (setting LoadOptions on the Linq2SQL data context) can only be done once per data context. If you need to perform a second query with a different eager-loading configuration, you must re-initialize your data context. I automated this with a simple wrapper class around my context.
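A minimal sketch of that technique, assuming the Table_Order/Currency association from the question (DataLoadOptions lives in System.Data.Linq and must be assigned before the first query runs; the context name is hypothetical):
// requires using System.Data.Linq;
var ctx = new MyDataContext();

// Fetch the related Currency row in the same query, so that
// ord.Currency.CurrencyText no longer costs one extra call per order.
var loadOptions = new DataLoadOptions();
loadOptions.LoadWith<Table_Order>(ord => ord.Currency);
ctx.LoadOptions = loadOptions;

var query = from ord in ctx.Table_Orders
            select new OrderViewModel(ord); // one joined query, no 1+N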