I would like to know the best way to populate an object that has a collection of child objects, where each child object may in turn have its own collection of objects, from the database without making a separate database call to fetch the children of each object. Basically the data is hierarchical: for example, a customer has orders and each order has order items. Is it best to retrieve the data in XML format (SQL Server 2005), or to retrieve a dataset by joining the related tables together and then map that data to the objects? Thanks in advance for your help.
There are a lot of variables still there:
Are the child objects of the same type? If so you can select them all at the same time and then set up the parent/child relationships in your object mapping layer.
Can the child objects have children of their own? If the nesting is unlimited, then you can't get all the data at the same time unless you get all the data.
You could certainly do a join on all of the customers->orders->order items and break everything up in code, but that seems like it would mean a lot of overhead in duplicated parent rows and a lot of work in processing that big mess.
Trying to avoid multiple calls might be a premature optimization. Are you having performance problems from making too many calls to the database?
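For what it's worth, the "one big join, break it up in code" option mentioned above can be sketched roughly like this. It is a minimal illustration in Python, not tied to any particular ORM; the row shape (customer_id, order_id, item_id) and the function name nest_joined_rows are assumptions made up for the example. It also shows where the duplicated parent values come from: every item row repeats its customer and order IDs.

```python
def nest_joined_rows(rows):
    """Group flat rows from a customers/orders/items join into nested dicts.

    Each row is assumed to be (customer_id, order_id, item_id); order_id and
    item_id may be None when the join was a LEFT JOIN and nothing matched.
    """
    customers = {}
    for customer_id, order_id, item_id in rows:
        # The customer ID repeats on every row; setdefault keeps one entry.
        orders = customers.setdefault(customer_id, {})
        if order_id is None:
            continue  # customer with no orders
        items = orders.setdefault(order_id, [])
        if item_id is not None:
            items.append(item_id)
    return customers
```

With real tables each row would also carry the other columns of all three tables, which is the overhead described above.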
Edit: Based on your comments, you should be able to do one query per hierarchy level:
Select * from orders
where orders.customerid = my_customer_id
--Do some orm mappings and make a list of child object ids--
Select * from child_order_object
where child_order_object_id in (list of child object ids)
--Do some more ORM mapping and link child objects to previous parent objects--
...
--Repeat for more levels--
That way you run just one query per relationship level, rather than an exploding number of queries to fetch each object by ID.
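Here is a small runnable sketch of the one-query-per-level idea, written in Python against an in-memory SQLite database purely so it can be tried without a server. The table and column names (orders, order_items, customer_id, order_id, product) are assumptions for illustration, not the asker's schema, and the hand-written mapping stands in for whatever ORM layer you use.

```python
import sqlite3

def load_customer_orders(conn, customer_id):
    """Load a customer's orders and their items using one query per level."""
    # Level 1: every order for the customer, keyed by order ID.
    orders = {
        row[0]: {"id": row[0], "items": []}
        for row in conn.execute(
            "SELECT id FROM orders WHERE customer_id = ? ORDER BY id",
            (customer_id,),
        )
    }
    if not orders:
        return []
    # Level 2: all items for those orders in a single IN (...) query.
    placeholders = ",".join("?" for _ in orders)
    item_rows = conn.execute(
        "SELECT id, order_id, product FROM order_items "
        f"WHERE order_id IN ({placeholders}) ORDER BY id",
        list(orders),
    )
    # Link each child row to the parent object loaded at the previous level.
    for item_id, order_id, product in item_rows:
        orders[order_id]["items"].append({"id": item_id, "product": product})
    return list(orders.values())

def demo_levels():
    # Hypothetical sample data, only for demonstration.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        CREATE TABLE order_items (
            id INTEGER PRIMARY KEY, order_id INTEGER, product TEXT);
        INSERT INTO orders VALUES (1, 10), (2, 10), (3, 20);
        INSERT INTO order_items VALUES
            (1, 1, 'pen'), (2, 1, 'ink'), (3, 2, 'pad'), (4, 3, 'cap');
    """)
    return load_customer_orders(conn, 10)
```

Two queries run here no matter how many orders the customer has. Databases cap the number of parameters in one statement, so a very long ID list may need to be sent in batches.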
You may take a look at ORMs such as NHibernate and Entity Framework that are designed exactly for such scenarios.
MS SQL 2005 supports Common Table Expressions, which can be used for this purpose. Basically they allow you to do a recursive query. Do a keyword search on CTE / MS SQL and you'll find a lot of stuff like this: Apply a recursive CTE on grouped table rows (SQL server 2005)
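As a rough illustration of what a recursive CTE looks like, here is a sketch that walks a self-referencing table from a chosen root. It runs against SQLite from Python so it is easy to try; SQLite spells it WITH RECURSIVE, whereas SQL Server uses plain WITH for the same construct. The nodes table and its columns are assumptions made up for the example.

```python
import sqlite3

# Anchor member selects the root row; the recursive member joins children
# onto rows already in the result, tracking depth as it goes.
SUBTREE_SQL = """
WITH RECURSIVE subtree(id, parent_id, name, depth) AS (
    SELECT id, parent_id, name, 0 FROM nodes WHERE id = ?
    UNION ALL
    SELECT n.id, n.parent_id, n.name, s.depth + 1
    FROM nodes AS n
    JOIN subtree AS s ON n.parent_id = s.id
)
SELECT id, parent_id, name, depth FROM subtree ORDER BY depth, id
"""

def load_subtree(conn, root_id):
    """Return every row at or below root_id in a single round trip."""
    return conn.execute(SUBTREE_SQL, (root_id,)).fetchall()

def demo_subtree():
    # Hypothetical sample hierarchy, only for demonstration.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE nodes (
            id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
        INSERT INTO nodes VALUES
            (1, NULL, 'root'), (2, 1, 'a'), (3, 1, 'b'),
            (4, 2, 'a1'), (5, NULL, 'other');
    """)
    return load_subtree(conn, 1)
```

This fits hierarchies where parent and child live in the same table. For customer/order/item data spread over different tables, one query per level is usually the simpler route.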
This question is old, but the new answer (if you're using Entity Framework) is to use the Include method on the object query. This will eagerly load all the navigation properties specified.
https://msdn.microsoft.com/en-us/library/bb738708(v=vs.100).aspx
Related
I was reading this article http://blogs.msdn.com/b/adonet/archive/2011/01/27/using-dbcontext-in-ef-feature-ctp5-part-2-connections-and-models.aspx and was trying to figure out how to create private setters (the section in the article DbContext with read-only set properties is right before the summary). How would you create private setters? I was playing around with different methods but nothing seemed to work. I am doing this because I need to group the original table based on a query I have because the original table is a heap and I need a primary key for the entity. So anytime a client asks for this table it is already grouped. Not even sure if this is the correct way to do that. Thanks.
EDIT: Sorry for being vague. I am doing code first. For example, there exists a SQL table with JobNbr, Qty and Date, and I need to group by JobNbr, sum on Qty and take the oldest expiration date; that will be my entity, since this table has no primary key. The way I am doing it now gives me the error below from a method I created in the DbContext class. I do have an EntityTypeConfiguration class. Do I do this in that class?
EDIT: You might be wondering why I am doing this. Basically I need to get data from the heap and save it in another database. My original approach was database.SqlQuery() to get grouped rows from the heap, but sometimes I have too many parameters for execute_sql. So I decided to create an entity for the grouped query without tracking changes (since all I am doing is reading from the table and saving to another DB). See my post here with the issue I am having https://stackoverflow.com/questions/22106030/entity-framework-6-this-database-sqlquery-character-limitation-with-sp-executes. The only way I know to get around it is to create an entity (even though in this case the entity is a query and not a table).
The entity or complex type '…' cannot be constructed in a LINQ to Entities query.
I'm using the PetaPoco mini-ORM, which in my implementation runs stored procedures and maps them to object models I've defined. This works very intuitively for queries that pull out singular tables (i.e. SELECT * FROM Orders), but less so when I start writing queries that pull aggregate results. For example, say I've got a Customers table and Orders table, where the Orders table contains a foreign key reference to a CustomerID. I want to retrieve a list of all orders, but in the view of my application, display the Customer name as well as all the other order fields, i.e.
SELECT
Customers.Name,
Orders.*
FROM
Orders
INNER JOIN Customers
ON Orders.CustomerID = Customers.ID
Having not worked with an ORM of any sort before, I'm unsure of the proper method to handle this sort of data. I see two options right now:
Create a new aggregate model for the specific operation. I feel like I would end up with a ton of models in any large application by doing this, but it would let me map a query result directly to an object.
Have two separate queries, one that retrieves Orders, another that retrieves Customers, then join them via LINQ. This seems a better alternative than #1, but similarly seems obtuse as I am pulling out 30 columns when I desire one (although my particular mini-ORM allows me to pull out just one row and bind it to a model).
Is there a preferred method of doing this, either of the two I mentioned, or a better way I haven't thought of?
Option #1 is common in CQRS-based architectures. It makes sense when you think about it: even though it requires some effort, it maps intuitively to what you are doing, and it doesn't impact other pieces of your solution. So if you have to change it, you can do so without breaking anything elsewhere.
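To make option #1 concrete, here is a minimal sketch of a dedicated read model for the joined query. It is written in Python with SQLite rather than PetaPoco so it stays self-contained; the OrderSummary class, the total column and the lowercase table names are assumptions for illustration only. The point is that the model has exactly the columns the view needs and nothing else depends on it.

```python
import sqlite3
from dataclasses import dataclass

@dataclass(frozen=True)
class OrderSummary:
    """Read-only model shaped for one screen, not for a single table."""
    order_id: int
    total: float
    customer_name: str

def list_order_summaries(conn):
    # Select only the columns the view needs, in the model's field order.
    rows = conn.execute(
        "SELECT o.id, o.total, c.name "
        "FROM orders AS o "
        "INNER JOIN customers AS c ON o.customer_id = c.id "
        "ORDER BY o.id"
    )
    return [OrderSummary(*row) for row in rows]

def demo_summaries():
    # Hypothetical sample data, only for demonstration.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
        INSERT INTO orders VALUES (100, 1, 9.5), (101, 2, 20.0), (102, 1, 3.0);
    """)
    return list_order_summaries(conn)
```

If the screen later needs another column, only this query and this class change.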
I've got a list of entity object Individual for an employee survey app - an Individual represents an employee or outside rater. The individual has the parent objects Team and Team.Organization, and the child objects Surveys, Surveys.Responses. Responses, in turn, are related to Questions.
So usually, when I want to check the complete information about an Individual, I need to fetch Individuals.Include("Team.Organization").Include("Surveys.Responses.Question").
That's obviously a lot of includes, and has a performance cost, so when I fetch a list of Individuals and don't need their related objects, I don't bother with the Includes... but then the user wants to manipulate an Individual. So here's the challenge. I seem to have 3 options, all bad:
1) Modify the query that downloads the big list of Individuals to .Include("Team.Organization").Include("Surveys.Responses.Question"). This gives it bad performance.
2) Individuals.Load(), TeamReference.Load(), OrganizationReference.Load(), Surveys.Load(), (and iterate through the list of Surveys and load their Responses and the Responses' Questions).
3) When a user wishes to manipulate an Individual, I drop that reference and fetch a whole brand new Individual from the database by its primary key. This works, but is ugly because it means I have two different kinds of Individuals, and I can never use one in place of the other. It also creates ugly problems if I'm iterating across a list repeatedly, as it's tricky to avoid loading and dropping the fully-included Individuals repeatedly, which is wasteful.
Is there any way to say
myIndividual.Include("Team.Organization").Include("Surveys.Responses.Question");
with an existing Individual entity, instead of taking approach (3)?
That is, is there any middle-ground between "fetch everything from the database up-front" and "late-load one relationship at a time"?
Possible solution that I'm hoping I could get insight about:
So there's no way to do a manually-implemented explicit load on a navigational-property? No way to have the system interpret
Individual.Surveys = from survey in MyEntities.Surveys.Include("Responses.Question")
where survey.IndividualID == Individual.ID
select survey; //Individual.Surveys is the navigation collection property holding Surveys on the Individual.
Individual.Team = from team in MyEntities.Teams.Include("Organization")
where team.ID == Individual.TeamID
select team;
as just loading Individual's related objects from the database instead of being an assignment/update operation? If this means no actual change in X and Y, can I just do that?
I want a way to manually implement a lazy or explicit load that doesn't do it the dumb (one relation at a time) way. Really, the Teams and Organizations aren't the problem, but the Survey.Responses.Questions are a massive buttload of database hits.
I'm using 3.5, but for the sake of others (and when my project finally migrates to 4) I'm sure responses relevant to 4 would be appreciated. In that context, similar customization of lazy loading would be good to hear about too.
edit: Switched the alphabet soup to my problem domain, edited for clarity.
Thanks
The Include statement is designed to do exactly what you're hoping to do. Having multiple includes does indeed eager load the related entities.
Here is a good blog post about it:
http://thedatafarm.com/blog/data-access/the-cost-of-eager-loading-in-entity-framework/
In addition, you can use strongly typed "Includes" using some nifty ObjectContext extension methods. Here is an example:
http://blogs.microsoft.co.il/blogs/shimmy/archive/2010/08/06/say-goodbye-to-the-hard-coded-objectquery-t-include-calls.aspx
I'm developing a hierarchical object model that is self-referencing as a 0/1 --> * relationship. An object without a parentID is a root element. The parentID is also the foreign key on the self-join. From my understanding, using the parentID as a foreign key will only point to a column where child elements may be found --> does this force an iteration through the entire data set for that column? Is this a scenario where a clustered index should be formed? Or would it be proper to use the XML data type to store all childrenIDs in a single field, then load and reference that document for each object? It seems doing this would at least allow me to simplify my object persistence layer and give me more control over recording transactions.
Any advice?
I would strongly suggest against using XML to store the child IDs. It will cause countless headaches trying to maintain it down the road, not to mention trying to use it outside of your application (for example, from a reporting solution or for ETL).
Have you looked into the HIERARCHYID data type? It's in SQL 2008 and may be useful for you here. I don't know what kind of support the various programming languages/ODBC/OLE DB have for it, but you can convert it to a string with .ToString() and that can be manipulated pretty easily. It also then allows you to use the other methods of HIERARCHYID in T-SQL, like .GetAncestor(), etc.
Suppose I have a composite class PharmaProduct (which represents the product hierarchy of a pharmaceutical company) and a database table for it. I have thought of two ways to load the data into a PharmaProduct object.
(1) Construct the entire object tree when an object is instantiated. Make changes to the tree and persist those changes by applying a recursive loop to the tree (this is actually the way the C# DataSet works).
(2) Load a node. Load other nodes only if
PharmaProduct GetParent()
or,
List<PharmaProduct> GetChildren()
are called (which actually do the direct database access). Make change to the node. Only save that node.
This type of table may have a thousand entries, depending on how many types of items a pharmaceutical company manufactures. So in that case, I think the 1st approach will be too clumsy (and also memory consuming).
How should I actually do the database access in case of any Composite Pattern problem?
Take a look at the Proxy pattern. Using it, you would put PharmaProductProxy objects in the tree that have the same interface as PharmaProduct, but lazy load themselves when they are accessed.
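A rough sketch of that idea, in Python for brevity: the proxy holds only an ID and a loader, and fetches the real node the first time something is read from it. The loader callable, the child_ids field and make_fake_loader are assumptions invented for the example; in a real application the loader would run the database query for one node.

```python
class PharmaProduct:
    """The real node, as loaded from the database."""
    def __init__(self, product_id, name, child_ids):
        self.product_id = product_id
        self.name = name
        self.child_ids = child_ids

class PharmaProductProxy:
    """Stands in for a PharmaProduct and loads it on first access."""
    def __init__(self, product_id, loader):
        self._id = product_id
        self._loader = loader  # hypothetical callable: id -> PharmaProduct
        self._real = None

    def _load(self):
        # Hit the data source only once, then reuse the cached node.
        if self._real is None:
            self._real = self._loader(self._id)
        return self._real

    @property
    def name(self):
        return self._load().name

    def get_children(self):
        # Children are returned as unloaded proxies, so nothing below this
        # node is fetched until someone actually reads it.
        return [PharmaProductProxy(cid, self._loader)
                for cid in self._load().child_ids]

def make_fake_loader(rows, calls):
    """Build an in-memory loader that records which IDs were fetched."""
    def loader(product_id):
        calls.append(product_id)
        name, child_ids = rows[product_id]
        return PharmaProduct(product_id, name, child_ids)
    return loader
```

Creating a proxy costs nothing; the first read of name or get_children() triggers one load for that node only, which matches approach (2) in the question while keeping callers unaware of when the loading happens.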