Suppose I have a composite class PharmaProduct (which represents the product hierarchy of a pharmaceutical company) and a database table for it. I have thought of two ways to load the data into a PharmaProduct object.
(1) Construct the entire object tree when an object is instantiated. Make changes to the tree and persist those changes by applying a recursive loop to the tree (this is actually the way the C# DataSet works).
(2) Load a node. Load other nodes only if
PharmaProduct GetParent()
or,
List<PharmaProduct> GetChildren()
are called (these do the direct database access). Make changes to the node. Save only that node.
This type of table may have a thousand entries, depending on how many types of items a pharmaceutical company manufactures. So in that case, I think the first approach will be too clumsy (and also memory-consuming).
How should I actually do the database access in case of any Composite Pattern problem?
Take a look at the Proxy pattern. Using it, you would put PharmaProductProxy objects in the tree that have the same interface as PharmaProduct, but lazy load themselves when they are accessed.
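A minimal sketch of that idea follows. It assumes a shared IPharmaProduct interface and a ProductTable class that stands in for the real database table; both names, the sample rows, and the QueryCount counter are hypothetical and exist only for illustration. The proxy holds just a key and loads its name and children the first time they are requested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Common interface shared by the real node and its proxy (hypothetical name).
public interface IPharmaProduct
{
    int Id { get; }
    string Name { get; }
    IReadOnlyList<IPharmaProduct> GetChildren();
}

// Hypothetical stand-in for the database table: (Id, ParentId, Name) rows.
public static class ProductTable
{
    public static int QueryCount; // counts simulated database hits

    private static readonly List<(int Id, int? ParentId, string Name)> Rows =
        new List<(int, int?, string)>
        {
            (1, null, "Antibiotics"),
            (2, 1, "Penicillins"),
            (3, 1, "Macrolides"),
            (4, 2, "Amoxicillin"),
        };

    public static string LoadName(int id)
    {
        QueryCount++;
        return Rows.First(r => r.Id == id).Name;
    }

    public static List<int> LoadChildIds(int parentId)
    {
        QueryCount++;
        return Rows.Where(r => r.ParentId == parentId).Select(r => r.Id).ToList();
    }
}

// Proxy: knows only its key until a member is actually used.
public sealed class PharmaProductProxy : IPharmaProduct
{
    private readonly Lazy<string> _name;
    private readonly Lazy<IReadOnlyList<IPharmaProduct>> _children;

    public PharmaProductProxy(int id)
    {
        Id = id;
        _name = new Lazy<string>(() => ProductTable.LoadName(id));
        // Children are themselves proxies, so loading one level never
        // drags in the whole subtree.
        _children = new Lazy<IReadOnlyList<IPharmaProduct>>(() =>
            ProductTable.LoadChildIds(id)
                .Select(childId => (IPharmaProduct)new PharmaProductProxy(childId))
                .ToList());
    }

    public int Id { get; }
    public string Name => _name.Value;
    public IReadOnlyList<IPharmaProduct> GetChildren() => _children.Value;
}
```

Constructing new PharmaProductProxy(1) touches no data; the first GetChildren() call issues one simulated query, and repeated calls reuse the cached list. A GetParent() method could be added the same way if the row's parent key is kept in the proxy.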
Excuse me for my broken English.
In my application, all objects in the context have a property called ObsoleteFlag, which basically means if the object should still be used on the frontend. It's some sort of "soft-delete" flag without actually having to delete the data.
Now I want to prevent EF from returning any object where ObsoleteFlag is set to true (1).
If for example I retrieve object X, the navigational list property Y contains all the related objects of type Y, no matter what the ObsoleteFlag is set to.
Is there some general way of preventing EF from doing this? I don't want to check on the ObsoleteFlag property everywhere I access the context, and for every navigational property that may be loaded too.
Thanks and sorry for my broken English.
Two different approaches:
In your repository layer, have a GetAllWhatever() that returns IQueryable<Whatever> and uses Where(x => !x.ObsoleteFlag), and use this whenever you retrieve objects of this type.
Create a view, e.g. CREATE VIEW ActiveWhatever AS SELECT * FROM Whatever WHERE ObsoleteFlag = 0, and bind to that rather than the table.
The first is essentially checking the flag every time, but doing so in one place, so you don't have to keep thinking about it.
The second is much the same, but the work is pushed to the database instead of the .NET code. If you are going to modify the entities or add new entities you will have to make it a modifiable view, but just how that is done depends on the database in question (e.g. you can do it with triggers in SQL Server, and triggers or rules in PostgreSQL).
The second can also include having a rule or trigger for DELETE that sets your obsolete property instead of deleting, so that a normal delete as far as Entity Framework is concerned becomes one of your soft-deletes as far as the database is concerned.
I'd go for that approach unless you had a reason to object to a view existing just to help the application's implementation (that is, you're heavily into the database being "pure" in being concerned with the data rather than its use). But then, if it's handy for one application it's likely handy for more, given the very meaning of this "obsolete".
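The first approach could be sketched roughly as below. The Whatever class and WhateverRepository are hypothetical names, and the repository takes any IQueryable<Whatever> so the sketch can run against an in-memory list; in a real application the source would presumably be the context's entity set. Note that this only filters the root query, so navigation collections loaded from a returned entity would still need the view approach or their own filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity carrying the soft-delete flag from the question.
public class Whatever
{
    public int Id { get; set; }
    public bool ObsoleteFlag { get; set; }
}

public class WhateverRepository
{
    private readonly IQueryable<Whatever> _source;

    // Assumption: the caller passes in the underlying query source,
    // e.g. the entity set exposed by the context.
    public WhateverRepository(IQueryable<Whatever> source)
    {
        _source = source;
    }

    // The single place where the flag is checked. Because the result is
    // still IQueryable, callers can add further Where/OrderBy clauses
    // and the filter stays part of the composed query.
    public IQueryable<Whatever> GetAllWhatever()
    {
        return _source.Where(x => !x.ObsoleteFlag);
    }
}
```

For example, new WhateverRepository(list.AsQueryable()).GetAllWhatever().Where(x => x.Id > 1) yields only the non-obsolete items with Id greater than 1.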
I've got a list of entity object Individual for an employee survey app - an Individual represents an employee or outside rater. The individual has the parent objects Team and Team.Organization, and the child objects Surveys, Surveys.Responses. Responses, in turn, are related to Questions.
So usually, when I want to check the complete information about an Individual, I need to fetch Individuals.Include("Team.Organization").Include("Surveys.Responses.Question").
That's obviously a lot of includes, and has a performance cost, so when I fetch a list of Individuals and don't need their related objects, I don't bother with the Includes... but then the user wants to manipulate an Individual. So here's the challenge. I seem to have 3 options, all bad:
1) Modify the query that downloads the big list of Individuals to .Include("Team.Organization").Include("Surveys.Responses.Question"). This gives it bad performance.
2) Individuals.Load(), TeamReference.Load(), OrganizationReference.Load(), Surveys.Load(), (and iterate through the list of Surveys and load their Responses and the Responses' Questions).
3) When a user wishes to manipulate an Individual, I drop that reference and fetch a whole brand new Individual from the database by its primary key. This works, but is ugly because it means I have two different kinds of Individuals, and I can never use one in place of the other. It also creates ugly problems if I'm iterating across a list repeatedly, as it's tricky to avoid loading and dropping the fully-included Individuals repeatedly, which is wasteful.
Is there any way to say
myIndividual.Include("Team.Organization").Include("Surveys.Responses.Question");
with an existing Individual entity, instead of taking approach (3)?
That is, is there any middle-ground between "fetch everything from the database up-front" and "late-load one relationship at a time"?
Possible solution that I'm hoping I could get insight about:
So there's no way to do a manually-implemented explicit load on a navigational-property? No way to have the system interpret
Individual.Surveys = from survey in MyEntities.Surveys.Include("Responses.Question")
where survey.IndividualID == Individual.ID
select survey; // Individual.Surveys is the navigation collection property holding Surveys on the Individual.
Individual.Team = from team in MyEntities.Teams.Include("Organization")
where team.ID == Individual.TeamID
select team;
as just loading Individual's related objects from the database instead of being an assignment/update operation? If this means no actual change to the Surveys and the Team, can I just do that?
I want a way to manually implement a lazy or explicit load that isn't doing it in a dumb (one relation at a time) way. Really, the Teams and Organizations aren't the problem, but the Survey.Responses.Questions are a massive buttload of database hits.
I'm using 3.5, but for the sake of others (and when my project finally migrates to 4) I'm sure responses relevant to 4 would be appreciated. In that context, similar customization of lazy loading would be good to hear about too.
edit: Switched the alphabet soup to my problem domain, edited for clarity.
Thanks
The Include statement is designed to do exactly what you're hoping to do. Having multiple includes does indeed eager load the related entities.
Here is a good blog post about it:
http://thedatafarm.com/blog/data-access/the-cost-of-eager-loading-in-entity-framework/
In addition, you can use strongly typed "Includes" using some nifty ObjectContext extension methods. Here is an example:
http://blogs.microsoft.co.il/blogs/shimmy/archive/2010/08/06/say-goodbye-to-the-hard-coded-objectquery-t-include-calls.aspx
I'm developing a hierarchical object model that is self-referencing as a 0/1 --> * relationship. An object without a parentID is a root element. The parentID is also the foreign key on the self-join. From my understanding, using the parentID as a foreign key will only point to a column where child elements may be found. Does this force an iteration through the entire data set for that column? Is this a scenario where a clustered index should be formed? Or would it be proper to use the XML data type to store all childrenIDs in a single field, then load and reference that document for each object? It seems doing this would at least allow me to simplify my object persistence layer and give me more control over recording transactions.
Any advice?
I would strongly suggest against using XML to store the child IDs. It will cause countless headaches trying to maintain it down the road, not to mention trying to use it outside of your application (for example, from a reporting solution or for ETL).
Have you looked into the HIERARCHYID data type? It's in SQL 2008 and may be useful for you here. I don't know what kind of support the various programming languages/ODBC/OLE DB have for it, but you can convert it to a string with .ToString() and that can be manipulated pretty easily. It also then allows you to use the other methods of HIERARCHYID in T-SQL, like .GetAncestor(), etc.
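As a rough illustration of how easily the string form can be manipulated, the sketch below works on paths such as "/1/3/2/", which is the format HIERARCHYID's .ToString() produces (the root is "/"). The HierarchyPath class and its method names are hypothetical helpers written to mirror the T-SQL methods; they are not part of any library.

```csharp
using System;
using System.Linq;

// Hypothetical helper operating on the string form of a hierarchy path.
public static class HierarchyPath
{
    private static string[] Segments(string path)
    {
        return path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Depth of the node: "/" is level 0, "/1/3/2/" is level 3.
    public static int GetLevel(string path)
    {
        return Segments(path).Length;
    }

    // Path of the n-th ancestor, or null when n exceeds the node's level
    // (mirroring the NULL that T-SQL's GetAncestor returns in that case).
    public static string GetAncestor(string path, int n)
    {
        var segments = Segments(path);
        if (n < 0 || n > segments.Length) return null;
        var kept = segments.Take(segments.Length - n).ToArray();
        return kept.Length == 0 ? "/" : "/" + string.Join("/", kept) + "/";
    }

    // True when 'path' lies at or under 'ancestor' in the hierarchy.
    public static bool IsDescendantOf(string path, string ancestor)
    {
        return path.StartsWith(ancestor, StringComparison.Ordinal);
    }
}
```

With this, HierarchyPath.GetAncestor("/1/3/2/", 1) gives "/1/3/", so the parent of a node can be found in application code without another round trip.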
I would like to know the best way to populate an object that has a collection of child objects, where each child object may in turn have a collection of objects, from the database without making multiple calls to the database to get the child objects for each object. Basically, the data is hierarchical: for example, a customer has orders and each order has order items. Is it best to retrieve the data in XML format (SQL Server 2005), or to retrieve a dataset by joining the related tables together and then map that data to the object? Thanks in advance for your help.
There are a lot of variables still there:
Are the child objects of the same type? If so you can select them all at the same time and then set up the parent/child relationships in your object mapping layer.
Can the child objects have children of their own? If the nesting is unlimited, then you can't get all the data at the same time unless you get all the data.
You could certainly do a join on all of the customers->orders->order items and break everything up in code, but it seems that would mean a lot of overhead in duplicated parent rows and a lot of work in processing that big mess.
Trying to avoid doing multiple calls might be a premature optimization. Are you having performance problems with doing too many calls to the database?
Edit: Based on your comments, you should be able to do one query per hierarchy level:
Select * from orders
where orders.customerid = my_customer_id
--Do some orm mappings and make a list of child object ids--
Select * from child_order_object
where child_order_object_id in (list of child object ids)
--Do some more ORM mapping and link child objects to previous parent objects--
...
--Repeat for more levels--
You should be able to have just one query per relationship level rather than the exploding amount of queries to get just one object by id.
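The mapping step between the two queries might look something like the sketch below. Order, OrderItem, and OrderLoader are hypothetical names, and the two delegates stand in for the two SQL queries so the sketch runs without a database. One deliberate difference from the pseudo-SQL above: the second query here filters the child rows by the parent order IDs (an order_id IN (...) style filter) rather than by a list of child IDs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical classes for the customer -> orders -> order items example.
public class OrderItem
{
    public int Id;
    public int OrderId;
    public string Product;
}

public class Order
{
    public int Id;
    public int CustomerId;
    public List<OrderItem> Items = new List<OrderItem>();
}

public static class OrderLoader
{
    // queryOrders and queryItems are placeholders for the two database
    // queries; each is called exactly once, regardless of how many
    // orders the customer has.
    public static List<Order> LoadOrders(
        int customerId,
        Func<int, IEnumerable<Order>> queryOrders,
        Func<IReadOnlyCollection<int>, IEnumerable<OrderItem>> queryItems)
    {
        // Level 1: all orders for the customer.
        var orders = queryOrders(customerId).ToList();
        var ordersById = orders.ToDictionary(o => o.Id);
        if (ordersById.Count == 0) return orders;

        // Level 2: all items for those orders, fetched in one go and
        // attached to their parent through the dictionary lookup.
        foreach (var item in queryItems(ordersById.Keys.ToList()))
        {
            ordersById[item.OrderId].Items.Add(item);
        }
        return orders;
    }
}
```

Each further level of the hierarchy would add one more query and one more dictionary pass of the same shape.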
You may take a look at ORMs such as NHibernate and Entity Framework that are designed exactly for such scenarios.
MS SQL 2005 supports Common Table Expressions, which can be used for this purpose. Basically they allow you to do a recursive query. Do a keyword search on CTE / MS SQL and you'll find a lot of stuff like this: Apply a recursive CTE on grouped table rows (SQL server 2005)
This question is old, but the new answer (if you're using Entity Framework) is to use the Include method on the object query. This will eagerly load all the navigation properties specified.
https://msdn.microsoft.com/en-us/library/bb738708(v=vs.100).aspx
What is the best way to mark some entities for DeleteOnSubmit()? Is there a way to check and tell the context that an entity is for deletion?
Example: I have an Entity which references an EntitySet<> and I delete 4 of the 8 entities from the EntitySet<>. When submitting changes I want to call DeleteOnSubmit() on those 4! This scenario should work for a single EntityRef<> too.
Of course DataContext lives in another layer so...grabbing, changing, sending back is the job.
Thank you.
This is pretty hard to answer based on the description of your architecture. Just because you're using a layered approach doesn't mean that you can't call DeleteOnSubmit... you'd just call your own method that wraps that I presume.
Unless, of course, you're instantiating your DataContext object in the update routine. In this case you'd have to do something else. Your data layer could expose a method like MarkForDelete() which just adds the entity to a collection, then expose a separate SubmitChanges() that iterates over the collected items for deletion, attaches them to the DataContext and then does the actual DeleteAllOnSubmit() call.
That said I've never really bothered with the whole entity serialization/deserialization/reattach thing as it seems fraught with peril. I usually just collect the primary keys in a list, select out the entities and re-delete them. It's no more work, really.
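A sketch of the key-collecting variant, under stated assumptions: DeleteQueue is a hypothetical class, and the selectByKeys and deleteAll delegates are placeholders for the data-layer calls. With LINQ to SQL these would presumably be a query such as table.Where(e => keys.Contains(e.Id)) and a call to table.DeleteAllOnSubmit(entities) followed by the context's SubmitChanges().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data-layer helper: callers mark keys for deletion, and
// the entities are re-selected and deleted when changes are submitted.
public class DeleteQueue<TKey>
{
    private readonly HashSet<TKey> _pending = new HashSet<TKey>();

    // Records the primary key only; no entity has to be kept alive or
    // re-attached across layers. Marking the same key twice is harmless.
    public void MarkForDelete(TKey key)
    {
        _pending.Add(key);
    }

    // selectByKeys: fetches the live entities for the pending keys.
    // deleteAll:    performs the actual bulk delete on those entities.
    // Returns the number of entities handed to deleteAll.
    public int SubmitChanges<TEntity>(
        Func<IReadOnlyCollection<TKey>, IEnumerable<TEntity>> selectByKeys,
        Action<IEnumerable<TEntity>> deleteAll)
    {
        if (_pending.Count == 0) return 0;

        var entities = selectByKeys(_pending.ToList()).ToList();
        deleteAll(entities);
        _pending.Clear();
        return entities.Count;
    }
}
```

The upper layers only ever see MarkForDelete(key), so the DataContext can be created and disposed inside the data layer's submit routine.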
Take a look at DeleteAllOnSubmit(). You pass this method a list of entities to be deleted.