I'm using the PetaPoco mini-ORM, which in my implementation runs stored procedures and maps their results to object models I've defined. This works very intuitively for queries that pull from a single table (e.g. SELECT * FROM Orders), but less so when I start writing queries that pull aggregate results. For example, say I've got a Customers table and an Orders table, where the Orders table contains a foreign key reference to a CustomerID. I want to retrieve a list of all orders, but in the view of my application, display the Customer name as well as all the other order fields, like so:
SELECT
    Customers.Name,
    Orders.*
FROM
    Orders
    INNER JOIN Customers
        ON Orders.CustomerID = Customers.ID
Having not worked with an ORM of any sort before, I'm unsure of the proper method to handle this sort of data. I see two options right now:
1) Create a new aggregate model for the specific operation. I feel like I would end up with a ton of models in any large application by doing this, but it would let me map a query result directly to an object.
2) Have two separate queries, one that retrieves Orders, another that retrieves Customers, then join them via LINQ. This seems a better alternative than #1, but similarly seems obtuse as I am pulling out 30 columns when I desire one (although my particular mini-ORM allows me to pull out just one row and bind it to a model).
Is there a preferred method of doing this, either of the two I mentioned, or a better way I haven't thought of?
Option #1 is common in CQRS-based architectures. It makes sense when you think about it: even though it requires some effort, it maps intuitively to what you are doing, and it doesn't impact other pieces of your solution. So if you have to change it, you can do so without breaking anything elsewhere.
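As a minimal sketch of what option #1 could look like, the class below is a read model shaped for one specific view. Everything in it is an assumption made for illustration: `OrderListItem`, the `Total` column and the property names are hypothetical, since the question does not list the Orders columns. With PetaPoco such a class would typically be filled by something like `db.Fetch<OrderListItem>(sql)`; here the rows are joined in memory so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical table-shaped models; the real column lists are not given in the question.
public class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public decimal Total { get; set; }
}

// Read model for one specific view: order fields plus the customer name.
public class OrderListItem
{
    public int OrderID { get; set; }
    public decimal Total { get; set; }
    public string CustomerName { get; set; }
}

public static class OrderQueries
{
    // In-memory equivalent of the INNER JOIN in the question,
    // projecting straight into the read model.
    public static List<OrderListItem> ListOrders(
        IEnumerable<Order> orders, IEnumerable<Customer> customers)
    {
        return (from o in orders
                join c in customers on o.CustomerID equals c.ID
                select new OrderListItem
                {
                    OrderID = o.ID,
                    Total = o.Total,
                    CustomerName = c.Name
                }).ToList();
    }
}
```

Because the read model is only used by this one view, its properties can be added or renamed without touching the table-shaped `Order` and `Customer` classes.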
Related
I have two tables in my database: TPM_AREAS and TPM_WORKGROUPS. There exists a many-to-many relationship between these two tables, and these relationships are stored in a table called TPM_AREAWORKGROUPS. This table looks like this:
What I need to do is load all these mappings into memory at once, in the quickest way possible. As TPM_AREAWORKGROUPS is an association, I can't just say:
var foo = (from aw in context.TPM_AREAWORKGROUPS select aw);
I can think of three ways to possibly do this, but I'm not quite sure how to accomplish each of them, nor which one is best.
1) Load in every workgroup, including the associated areas:
Something like:
var allWG = (from w in context.TPM_WORKGROUPS.Include("TPM_AREAS")
             where w.TPM_AREAS.Count > 0
             select w);
// Loop through this enumeration and manually build a mapping of distinct AREAID/WORKGROUPID combinations.
Pros: This is probably the standard EntityFramework way of doing things, and doesn't require me to change any of the database structure or mappings.
Cons: Could potentially be slow, since the TPM_WORKGROUPS table is rather large and the TPM_AREAWORKGROUPS table only has 13 rows. Plus, there's no TPM_AREAWORKGROUPS class, so I'd have to return a collection of Tuples or make a new class for this.
2) Change my model
Ideally, I'd like a TPM_AREAWORKGROUP class, and a context.TPM_AREAWORKGROUP property. I used the designer to create this model directly from the database, so I'm not quite sure how to force this association to be an actual model. Is there an easy way to do this?
Pros: It would allow me to select directly against this table, done in one line of code. Yay!
Cons: Forces me to change my model, but is this a bad thing?
3) Screw it, use raw SQL to get what I want.
I can get the StoreConnection property of the context, and call CreateCommand() directly. I can then just do:
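using (DbCommand cmd = conn.CreateCommand())
{
    cmd.CommandText = "SELECT AreaId, WorkgroupId FROM TPM_AREAWORKGROUPS";
    // Dispose the reader as well as the command
    using (DbDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // Get the AreaId/WorkgroupId mapping from the current row
        }
    }
}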
Pros: Fast, easy, doesn't require me to change my model.
Cons: Seems kind of hacky. Everywhere else in the project, we're just using standard Entity Framework code so this deviates from the norm. Also, it has the same issues as the first option; there's still no TPM_AREAWORKGROUPS class.
Question: What's the best solution for this problem?
Ideally, I'd like to do #2, but I'm not quite sure how to adjust my model. Or perhaps someone knows of a better way than my three options.
You could do:
var result = context
    .TPM_WORKGROUPS
    .SelectMany(z => z.TPM_AREAS.Select(z2 => new
    {
        z2.AREAID,
        z.WORKGROUPID
    }));
The translated SQL will be a simple SELECT AREAID, WORKGROUPID FROM TPM_AREAWORKGROUPS.
About other options:
I wouldn't use option 3) because I personally avoid raw SQL as much as possible when using Entity Framework (see https://stackoverflow.com/a/8880157/870604 for some reasons).
I wouldn't use option 2) because you would have to change your model, and there is a simple and efficient way that avoids changing it.
What about using projection to load the data?
You could use it to fill an anonymous object and then work with that however you like.
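The projection idea can be sketched against plain in-memory objects. This is only an illustration under stated assumptions: `Area`, `Workgroup` and `AreaWorkgroupPair` are hypothetical stand-ins for the designer-generated entities, and the query runs as LINQ to Objects rather than through Entity Framework. A small named class is used in place of an anonymous type so the pairs can be returned from a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated TPM_AREAS / TPM_WORKGROUPS entities.
public class Area
{
    public int AREAID { get; set; }
}

public class Workgroup
{
    public int WORKGROUPID { get; set; }
    public List<Area> TPM_AREAS { get; set; } = new List<Area>();
}

// Named result type, used instead of an anonymous type so it can cross method boundaries.
public class AreaWorkgroupPair
{
    public int AreaId { get; set; }
    public int WorkgroupId { get; set; }
}

public static class MappingLoader
{
    // Flattens every workgroup's areas into one pair per association.
    // Workgroups without areas contribute nothing, so no Count > 0 filter is needed.
    public static List<AreaWorkgroupPair> LoadPairs(IEnumerable<Workgroup> workgroups)
    {
        return workgroups
            .SelectMany(w => w.TPM_AREAS.Select(a => new AreaWorkgroupPair
            {
                AreaId = a.AREAID,
                WorkgroupId = w.WORKGROUPID
            }))
            .ToList();
    }
}
```

The result is one flat list of AREAID/WORKGROUPID combinations, which is the shape the question asks to hold in memory.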
I see tons of questions on LINQ to SQL vs Stored Procs. I'm more curious about the benefits of using them in tandem as relates to object mapping.
I have my business objects defined, and I have stored procedures for all of my CRUD transactions.
Is it better to plop all the stored procs into a DBML file and call them from there, and then map the results to my business objects, or is it better to just use a DataReader and map it from there?
It's annoying to me because I want my objects as I define them, rather than use MyStoredProcResult objects as linq2sql generates, so I feel I'm doing the same field by field mapping as I would with a data reader.
Performance isn't necessarily key here (unless it's ridiculously slow). I'm looking to create a standard way for all our developers to load data from a database into an object in the simplest fashion with the least amount of code.
Mapping to LINQ2SQL has a serious advantage in being type-safe - you don't really have to worry about parsing the results or adding command parameters. It does it all for you.
On the other hand, calling stored procedures directly with SqlCommand and DataReader generally gives better performance (especially when reading or changing a lot of data).
Regardless of which you choose, it is better to build a separate Data Access Layer, as it allows more flexibility. The logic for accessing and changing the database should not be built into your business objects, because if you are forced to change how you store your data, updating your software will be painful.
This is not a direct answer to your question, but if you want your own objects as the result of a query, you should probably consider code-first schemas. Linq2SQL does not support this, but Entity Framework and NHibernate do.
The direct answer is that a DataReader will obviously have less overhead, but it will also involve many more magic strings. Overhead is bad for performance (though in your case not by much). Magic strings are bad for code maintainability. So in the end this will be your personal choice.
LINQ2SQL can provide your objects populated with the results of the query. You will have to build child objects in such a way as to support either a List(Of T) or List<T>, depending on your language choice.
Suppose you have a table with an ID, a Company Name, and a Phone Number for fields. Querying that table would be straight-forward in either LINQ or a stored procedure. The advantage that LINQ brings is the ability to map the results to either anonymous types or your own classes. So a query of:
var doSomething = from sList in myTableRef select sList;
would return instances of the entity class that LINQ to SQL generated for the table, not your own type. However, if you also have a class like this:
public class Company
{
    public int ID;
    // A member cannot share the name of its enclosing class, hence CompanyName
    public string CompanyName;
    public string PhoneNumber;
}
changing your query to this will populate Company objects as it moves through the data:
List<Company> companies = (from sList in myTableRef
                           select new Company
                           {
                               ID = sList.id,
                               CompanyName = sList.company,
                               PhoneNumber = sList.phonenumber
                           }).ToList();
My C# syntax may not be 100% correct as I mainly code in VB, but it will be close enough to get you there.
I've got a list of entity object Individual for an employee survey app - an Individual represents an employee or outside rater. The individual has the parent objects Team and Team.Organization, and the child objects Surveys, Surveys.Responses. Responses, in turn, are related to Questions.
So usually, when I want to check the complete information about an Individual, I need to fetch Individuals.Include(Team.Organization).Include(Surveys.Responses.Question).
That's obviously a lot of includes, and has a performance cost, so when I fetch a list of Individuals and don't need their related objects, I don't bother with the Includes... but then the user wants to manipulate an Individual. So here's the challenge. I seem to have 3 options, all bad:
1) Modify the query that downloads the big list of Individuals to .Include(Team.Organization).Include(Surveys.Responses.Question). This gives it bad performance.
2) Individuals.Load(), TeamReference.Load(), OrganizationReference.Load(), Surveys.Load(), (and iterate through the list of Surveys and load their Responses and the Responses' Questions).
3) When a user wishes to manipulate an Individual, I drop that reference and fetch a whole brand new Individual from the database by its primary key. This works, but is ugly because it means I have two different kinds of Individuals, and I can never use one in place of the other. It also creates ugly problems if I'm iterating across a list repeatedly, as it's tricky to avoid loading and dropping the fully-included Individuals repeatedly, which is wasteful.
Is there any way to say
myIndividual.Include("Team.Organization").Include("Surveys.Responses.Question");
with an existing Individual entity, instead of taking approach (3)?
That is, is there any middle-ground between "fetch everything from the database up-front" and "late-load one relationship at a time"?
Possible solution that I'm hoping I could get insight about:
So there's no way to do a manually-implemented explicit load on a navigational-property? No way to have the system interpret
Individual.Surveys = from survey in MyEntities.Surveys.Include("Responses.Question")
                     where survey.IndividualID == Individual.ID
                     select survey; //Individual.Surveys is the navigation collection property holding Surveys on the Individual.
Individual.Team = from team in MyEntities.Teams.Include("Organization")
                  where team.ID == Individual.TeamID
                  select team;
as just loading Individual's related objects from the database instead of being an assignment/update operation? If this means no actual change in X and Y, can I just do that?
I want a way to manually implement a lazy or explicit load that doesn't do it in a dumb (one relation at a time) way. Really, the Teams and Organizations aren't the problem, but the Survey.Responses.Questions are a massive buttload of database hits.
I'm using 3.5, but for the sake of others (and when my project finally migrates to 4) I'm sure responses relevant to 4 would be appreciated. In that context, similar customization of lazy loading would be good to hear about too.
edit: Switched the alphabet soup to my problem domain, edited for clarity.
Thanks
The Include statement is designed to do exactly what you're hoping to do. Having multiple includes does indeed eager load the related entities.
Here is a good blog post about it:
http://thedatafarm.com/blog/data-access/the-cost-of-eager-loading-in-entity-framework/
In addition, you can use strongly typed "Includes" using some nifty ObjectContext extension methods. Here is an example:
http://blogs.microsoft.co.il/blogs/shimmy/archive/2010/08/06/say-goodbye-to-the-hard-coded-objectquery-t-include-calls.aspx
Here's the story: a Site (physical location of interest) has zero or more Contacts. Those Contacts are people associated with a Company who are authorized to deal with matters regarding the Site.
The schema looks like:
Person -< CompanyContact -< CompanySiteContact >- Site
  |
  |-< PersonPhone
  |
  |-< PersonAddress
My entry point is Site. I need the list of Contacts. There is very little field data of interest until you get to Person. So, I'd like to collapse Person, CompanyContact and CompanySiteContact into one domain class.
The options I've come up with:
Create one domain class and use joins in the FluentNH map to flatten the layers as it retrieves the data. It never sounded simple, and I'm running into problems with the multi-level join (if A joins B joins C, you can't specify the join to C within the join to B). I think, however, that if it's possible to specify the joins, that's just a one-time thing and so this will end up being the most maintainable solution.
Replicate the deep model in a set of "DTOs" that map 1:1 to the tables and can be passed to the constructor of a "flat" domain model. It works, but it feels like cheating (there is no problem that cannot be solved with another layer of abstraction, EXCEPT for having too many layers of abstraction), and my instinct tells me this will somehow eventually cause more problems than it solves.
Replicate the domain model 1:1 with the schema and use pass-through properties on CompanySiteContact to access properties down in the depths of a Person record. Again, works now, but it doesn't really solve the problem, and every new property that becomes of interest will require changes to the mapping, the actual domain class, AND the top-level domain class. Not very SOLID.
So, the Q is, how would I structure the mapping? Like I said, I'm not able to specify a join in a join. I think the way I have to do it is map the PK of each table, and use it in the next join from the top level, but I'm not exactly sure how to set that up (haven't used FluentNH to set up anything close to this complex before).
I'd recommend creating your domain model to closely match your database. From there I'd create DTOs and use AutoMapper to do the flattening. Easy.
Thanks to James for his answer; +1, but I don't think AutoMapper is necessary at this juncture, and I'm a little uneasy at including something that does the job quite that "automagically". I thought of a few more options:
Set up a view in the DB. This will work because due to business rules, the contact information is read-only; the app I'm developing must never update a contact directly because a different department maintains this rolodex.
Map my domain 1:1 as James suggested, but then use a Linq query to flatten the model into a DTO. This query can be encapsulated within a helper of the Repository, allowing developers to query the DTO directly using the same methods on the Repository as for other classes. It's more complex than the view with the same result, but it doesn't require schema changes.
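The second option can be sketched roughly as follows. The classes mirror the table names in the schema above, but every property (`Name`, `CompanyName`, `Contacts`, and so on) and the `SiteContactDto` type are assumptions for illustration, since the post does not list the actual fields. The query is written as LINQ to Objects over an already loaded `Site`; whether the same projection translates through NHibernate's LINQ provider is not something this sketch establishes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Domain classes mapped 1:1 to the schema; all properties are hypothetical.
public class Person
{
    public string Name { get; set; }
}

public class CompanyContact
{
    public Person Person { get; set; }
    public string CompanyName { get; set; }
}

public class CompanySiteContact
{
    public CompanyContact CompanyContact { get; set; }
}

public class Site
{
    public int Id { get; set; }
    public List<CompanySiteContact> Contacts { get; set; } = new List<CompanySiteContact>();
}

// Flat, read-only view of a contact as seen from a Site.
public class SiteContactDto
{
    public int SiteId { get; set; }
    public string PersonName { get; set; }
    public string CompanyName { get; set; }
}

public static class SiteContactQueries
{
    // Collapses CompanySiteContact -> CompanyContact -> Person into one DTO per contact.
    public static List<SiteContactDto> ContactsFor(Site site)
    {
        return site.Contacts
            .Select(sc => new SiteContactDto
            {
                SiteId = site.Id,
                PersonName = sc.CompanyContact.Person.Name,
                CompanyName = sc.CompanyContact.CompanyName
            })
            .ToList();
    }
}
```

A helper like `ContactsFor` is the piece that could be encapsulated behind the Repository, so callers only ever see the flat DTO.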
I'll probably go with the first option, and resort to the second if necessary.
I would like to know the best way to populate an object that has a collection of child objects, where each child object may in turn have its own collection of objects, without making a separate database call to get the children of each object. Basically, the data is hierarchical: for example, a customer has orders and each order has order items. Is it best to retrieve the data in XML format (SQL Server 2005), or to retrieve a dataset by joining the related tables together and then map that data to the objects? Thanks in advance for your help.
There are a lot of variables still there:
Are the child objects of the same type? If so you can select them all at the same time and then set up the parent/child relationships in your object mapping layer.
Can the child objects have children of their own? If the nesting is unlimited, then you can't get all the data at the same time unless you get all the data.
You could certainly do a join on all of the customers->orders->order items and break everything up in code, but that seems like it would mean a lot of overhead in duplicated parent rows and a lot of work in processing that big mess.
Trying to avoid multiple calls might be a premature optimization. Are you having performance problems from making too many calls to the database?
Edit: Based on your comments, you should be able to do one query per hierarchy level:
Select * from orders
    where orders.customerid = my_customer_id
-- Do some ORM mappings and make a list of child object ids --
Select * from child_order_object
    where child_order_object_id in (list of child object ids)
-- Do some more ORM mapping and link child objects to previous parent objects --
...
-- Repeat for more levels --
You should be able to have just one query per relationship level rather than the exploding amount of queries to get just one object by id.
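The "link child objects to previous parent objects" step might look like the sketch below. It assumes the two result sets have already been fetched, one query per level, and mapped into flat lists; `OrderWithItems`, `OrderItem` and their properties are hypothetical names, not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row classes, one per hierarchy level.
public class OrderItem
{
    public int Id { get; set; }
    public int OrderId { get; set; }   // foreign key back to the parent order
    public string Product { get; set; }
}

public class OrderWithItems
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
    public List<OrderItem> Items { get; set; } = new List<OrderItem>();
}

public static class HierarchyBuilder
{
    // Attaches each item to its parent order using a single pass over the items.
    // A lookup returns an empty sequence for orders that have no items.
    public static void AttachItems(
        IEnumerable<OrderWithItems> orders, IEnumerable<OrderItem> items)
    {
        ILookup<int, OrderItem> itemsByOrder = items.ToLookup(i => i.OrderId);
        foreach (OrderWithItems order in orders)
        {
            order.Items = itemsByOrder[order.Id].ToList();
        }
    }
}
```

Grouping the children once with `ToLookup` keeps the stitching linear in the number of rows, and the same pattern can be repeated for each further level of the hierarchy.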
You may take a look at ORMs such as NHibernate and Entity Framework that are designed exactly for such scenarios.
MS SQL 2005 supports Common Table Expressions, which can be used for this purpose. Basically they allow you to do a recursive query. Do a keyword search on CTE / MS SQL and you'll find a lot of stuff like this: Apply a recursive CTE on grouped table rows (SQL server 2005)
This question is old, but the newer answer (if you're using Entity Framework) is to use the Include method on the object query. This will eagerly load all the navigation properties specified.
https://msdn.microsoft.com/en-us/library/bb738708(v=vs.100).aspx