As a new user of NHibernate and its utility library, Fluent NHibernate, I am trying to learn enough to be dangerous with a good database.
I am having a great deal of difficulty understanding the concept of Projections. Specifically, what in the world are they?
I have literally done exact searches on 'What are projections?' and 'Projects in nHibernate' and 'nHibernate, Projections, Definition', etc. And I am still very confused. The most helpful posts so far are This other StackOverflow Question and This Blog Post by Colin Ramsay. But I am still vastly confused. My knowledge of databases is still entry-level at best.
I do not really understand what projections are, why I would want to use them, what they are accomplishing, etc. I see in the blog post that he is using them to get a list of integers (I presume Primary Keys) so that he can use them in a different query, but this is kind of nebulous in the way it is functioning and the why.
Here's a practical example.
Let's say that you have an online store and one of your domain classes is a Brand like "Samsung". This class has a boatload of properties associated with it, perhaps an integer Identity, a Name, a free-text Description field, a reference to a Vendor object, and so on.
Now let's say that you want to display a menu with a list of all the brands offered on your online store. If you just do session.CreateCriteria<Brand>().List(), then you are indeed going to get all of the brands. But you'll also have sucked all of the long Description fields and references to Vendors from the database, and you don't need that to display a menu; you just need the Name and the Identity. Performance-wise, sucking all of this extra data down from the database slows things down and is unnecessary.
Instead, you can create a "projection" object that contains just the Identity and the Name; call it, say, NameIdentityPair:
public class NameIdentityPair
{
    public int Identity { get; set; }
    public string Name { get; set; }
}
And you could tell NHibernate to only select the data that you really need to perform the task at hand by telling it to transform the result set onto your projection:
var brandProjections = this.session.CreateCriteria<Brand>()
    .SetProjection(Projections.ProjectionList()
        .Add(Projections.Property("Name"), "Name")
        .Add(Projections.Property("Identity"), "Identity"))
    .SetResultTransformer(Transformers.AliasToBean<NameIdentityPair>())
    .List<NameIdentityPair>();
foreach (var brandProjection in brandProjections)
{
    Console.WriteLine(
        "Identity: {0}, Name: {1}",
        brandProjection.Identity,
        brandProjection.Name);
}
Now you don't have a list of Brands but instead a list of NameIdentityPairs, and NHibernate will have only issued a SQL statement like SELECT b.Identity, b.Name from dbo.Brand b to obtain this projection, as opposed to a massive SQL statement that grabs everything necessary to hydrate a Brand object (e.g., SELECT b.Identity, b.Name, b.Description from dbo.brand b left join dbo.vendor v ....).
Hope this helps.
If you're familiar with SQL, a projection is the SELECT clause of a query, used to select which fields from the available results to return.
For example, assume you have a Person with FirstName, LastName, Address, and Phone fields. If you want a query to return everything, you can leave off the projection, which is like SELECT * FROM Person in SQL. If you just want the first and last names, you would create a projection with FirstName and LastName -- which would be SELECT FirstName, LastName FROM Person in SQL terms.
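The same idea can be sketched with plain LINQ over an in-memory list. This is only an illustration, not NHibernate code: the Person class below is a stand-in built from the fields named above, and PersonName and ProjectionDemo are hypothetical names introduced for the example. With an ORM's LINQ provider, a Select of this shape is what typically ends up as the SELECT list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in entity with the four fields from the example above.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
    public string Phone { get; set; }
}

// Hypothetical projection type: only the two fields we want back.
public class PersonName
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class ProjectionDemo
{
    // No projection: whole entities come back (like SELECT * FROM Person).
    public static List<Person> Everything(IEnumerable<Person> people)
    {
        return people.ToList();
    }

    // Projection: keep FirstName and LastName only
    // (like SELECT FirstName, LastName FROM Person).
    public static List<PersonName> NamesOnly(IEnumerable<Person> people)
    {
        return people
            .Select(p => new PersonName { FirstName = p.FirstName, LastName = p.LastName })
            .ToList();
    }
}
```

Calling ProjectionDemo.NamesOnly(people) returns objects that carry no Address or Phone at all, which is the whole point of a projection.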
You can use projections to call SQL functions like SUM, COUNT... or select single fields without returning an entity.
"...Retrieving only properties of an entity or entities, without the overhead of loading
the entity itself in a transactional scope. This is sometimes called a report
query; it’s more correctly called projection." [NHibernate in Action]
When I try to select some items, they come back with their related objects included, even though I did not include those objects in my LINQ query.
public List<Institution> GetListWithCities(Expression<Func<Institution,bool>> filter = null)
{
    using (var context = new DbContext())
    {
        return filter == null
            ? context.Set<Institution>()
                .Include(x => x.City)
                .ToList()
            : context.Set<Institution>()
                .Include(x => x.City)
                .Where(filter)
                .ToList();
    }
}
[Table("Institution")]
public class Institution{
    public int ID;
    public string Name;
    public int CITY_ID;
    public int RESPONSIBLE_INSTUTION_ID;
    public virtual City City{ get; set; }
    public virtual Institution ResponsibleInstution{ get; set; }
}
I expect the result to include each institution's city, but my method returns both the city and the responsible institution, and it continues recursively.
People tend to use Include instead of Select even when they don't plan to use the functionality that Include gives, wasting the processing power that Include uses.
In Entity Framework, always use Select to fetch data. Only use Include if you plan to update the included items.
One of the slower parts of a database query is the transport of the fetched data from the database management system to your local process. Hence it is wise to Select only those properties that you really plan to use.
Apparently your Institution is in exactly one City, namely the City that the foreign key (CityId?) is referring to. If Institution [10] is located in City [15], then Institution.CityId will have a value 15, equal to City.Id. So you are transferring this value twice.
using (var dbContext = new MyDbContext())
{
    IQueryable<Institution> filteredInstitutions = (filter == null) ?
        dbContext.Institutions :
        dbContext.Institutions.Where(filter);
    // Note: some Entity Framework versions refuse to construct a mapped entity type
    // inside a query; if yours does, project into a separate result class instead.
    return filteredInstitutions.Select(institution => new Institution
    {
        // Select only the Institution properties that you actually plan to use:
        Id = institution.Id,
        Name = institution.Name,
        City = new City
        {
            Id = institution.City.Id,
            Name = institution.City.Name,
            ...
        }
        // not needed: you already know the value:
        // CityId = institution.City.Id,
    })
    .ToList(); // run the query before the context is disposed
}
Possible improvement
Apparently you chose to add a layer between Entity Framework and the users of your functions: although they use your functions, they don't really have to know that you use Entity Framework to access the database. This gives you the freedom to use SQL instead of Entity Framework. Hell, it even gives you the freedom to get rid of your database and use an XML file instead of a DBMS; your users won't know the difference, which is nice if you want to write unit tests.
Although you chose to separate the method you use to persist the data, you chose to expose your database layout, including foreign keys, to the outside world. This makes it more difficult to change your database in the future: your users have to change as well.
Consider writing repository classes for Institution and City that only expose those properties that the users of your persistency really need. If people only query "some properties of institutions with some properties of the City in which they are located", or the other way round "Several properties of Cities with several properties of the Institutions located in these Cities", then they won't need the foreign keys.
The intermediate repository classes give you more freedom to change your database. Apart from that, it will give you the freedom to hide certain properties for certain users.
For instance: suppose you add the possibility to delete an institution, but you don't want to immediately delete all information about this institution, for instance because this allows you to restore it if someone accidentally deletes the institution. You might add a nullable property ObsoleteDate.
Most people who query institutions don't want the obsolete institutions. If you had an intermediate repository institution class, where you omitted the ObsoleteDate, and all queries removed all Institutions that have a non-null ObsoleteDate, then for your users it would be as if an obsolete institution had been deleted from the database.
Only one user will need access to the ObsoleteDate: a cleaner task that every now and then deletes all Institutions that have been obsolete for a considerable time.
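A minimal sketch of that ObsoleteDate idea, assuming an in-memory list in place of the real database. All names here (InstitutionRecord, InstitutionInfo, InstitutionRepository, Purge) are hypothetical and only illustrate the shape; a real repository would run the same filters through Entity Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stored shape: includes the soft-delete marker.
public class InstitutionRecord
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime? ObsoleteDate { get; set; }
}

// Shape exposed to ordinary callers: no ObsoleteDate, no foreign keys.
public class InstitutionInfo
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class InstitutionRepository
{
    private readonly List<InstitutionRecord> _store;

    public InstitutionRepository(List<InstitutionRecord> store)
    {
        _store = store;
    }

    // Every ordinary query hides obsolete rows.
    public List<InstitutionInfo> GetInstitutions()
    {
        return _store
            .Where(i => i.ObsoleteDate == null)
            .Select(i => new InstitutionInfo { Id = i.Id, Name = i.Name })
            .ToList();
    }

    // "Delete" only marks the row, so it can still be restored.
    public void Delete(int id, DateTime now)
    {
        var record = _store.FirstOrDefault(i => i.Id == id);
        if (record != null) record.ObsoleteDate = now;
    }

    // Meant for the cleaner task only: physically removes rows that
    // became obsolete before the cutoff and returns how many were removed.
    public int Purge(DateTime cutoff)
    {
        return _store.RemoveAll(i => i.ObsoleteDate != null && i.ObsoleteDate < cutoff);
    }
}
```

Ordinary callers only ever see InstitutionInfo, so for them a soft-deleted institution is indistinguishable from one that is really gone.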
A third improvement for an intermediate repository class would be that you can give different users access to the same data, with different interfaces: some users can only query information about institutions, some are also allowed to change some data, while others are allowed to change other data. If you give them an interface, they can break this by casting them back to the original Institution.
With separate repository classes, you will have the possibility to give each of these users their own data, and nothing more than this data.
The disadvantage of a repository pattern is that you have to think about different users and create different query functions. The advantage is that a repository is easier to change and easier to test, and thus easier to keep bug free after future changes.
I am currently working on a project which uses linq2sql as its database access framework. There are a lot of linq queries which basically do the following:
var result = from <some_table>
join <some_other_table>
join <another_table>
select <some_other_domain_model> // This is a non linq2SQL poco
return result.Where(<Some_Predicate>);
So for example assume you read 3 tables, and then collate the contents into one big higher level model, for sending to a view. Now ignore the mixing of domains, as that doesn't bother me too much; it's the final where clause which does.
Now I have not used Linq2Sql much before so would I be right in saying what is going to happen is:
Generate SQL based off the from, join, join, select linq
Retrieve all rows
Map all this data into one big model (in memory)
Loop through all models and then return only the applicable ones
This is the crux of my question. The above flow is what makes sense in my mind, but people who apparently know the framework a lot better than I do have argued that the 4th step is somehow factored into the SQL generation, so it will not be pulling back all records. I don't know how it could be doing that, as it NEEDS all the data up front to populate the model, which it then applies a separate where clause to, so I assume that by the 4th point the rows have all been read and are already in memory.
I am trying to push for them to move their where clause into the linq so that it filters out un-needed records at the database level, however I was wondering if anyone can advise as to if my assumptions above are right?
== Edit ==
Have added a comment to draw more attention to the fact that <some_other_domain_model> is not a linq2sql generated object and is some random poco hand rolled elsewhere, just to narrow down where my main focus is in the context of the question. The question is LESS about "does it matter where I put the where clause" and more about "Does the where clause still get factored into the underlying query when it is applied to a non linq2sql object generated from a linq2sql query".
Here is another more concise example of what I mean hopefully drawing the point more towards where my lack of understanding is:
/*
I am only going to put auto properties into the linq2sql entities,
although in the real world they would be a mix of private backing
fields with public properties doing the notifying.
*/
[global::System.Data.Linq.Mapping.TableAttribute(Name="dbo.some_table_1")]
public class SomeLinq2SqlTable1
{
    [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="some_table_1_id", AutoSync=AutoSync.OnInsert, DbType="Int NOT NULL IDENTITY", IsPrimaryKey=true, IsDbGenerated=true)]
    public int Id {get;set;}
}
[global::System.Data.Linq.Mapping.TableAttribute(Name="dbo.some_table_2")]
public class SomeLinq2SqlTable2
{
    [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="some_table_2_id", AutoSync=AutoSync.OnInsert, DbType="Int NOT NULL", IsPrimaryKey=true, IsDbGenerated=true)]
    public int Id {get;set;}
    [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="some_table_2_name", AutoSync=AutoSync.OnInsert, DbType="Varchar NOT NULL", IsPrimaryKey=false)]
    public string Name {get;set;}
}
[global::System.Data.Linq.Mapping.TableAttribute(Name="dbo.some_table_3")]
public class SomeLinq2SqlTable3
{
    [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="some_table_3_id", AutoSync=AutoSync.OnInsert, DbType="Int NOT NULL", IsPrimaryKey=true, IsDbGenerated=true)]
    public int Id {get;set;}
    [global::System.Data.Linq.Mapping.ColumnAttribute(Storage="some_table_3_other", AutoSync=AutoSync.OnInsert, DbType="Varchar NOT NULL", IsPrimaryKey=false)]
    public string Other {get;set;}
}
/*
This is some hand rolled Poco, has NOTHING to do with Linq2Sql, think of it as
a view model of sorts.
*/
public class SomeViewModel
{
public int Id {get;set;}
public string Name {get;set;}
public string Other {get;set;}
}
/*
Here is a pseudo query to join all tables, then populate the
view model item from the query and finally do a where clause
on the view model objects.
*/
var result = from // Linq2SqlTable1 as t1
             join // Linq2SqlTable2.id on Linq2SqlTable1.id as t2
             join // Linq2SqlTable3.id on Linq2SqlTable1.id as t3
             select new SomeViewModel { Id = t1.Id, Name = t2.Name, Other = t3.Other }
return result.Where(viewModel => viewModel.Name.Contains("some-guff"));
So given the example above, will the final Where statement be factored into the underlying query, or will the where on the viewModel cause a retrieval and then evaluate in memory?
Sorry for the verbosity to this question but there is very little documentation about it, and this is quite a specific question.
You do not need to push the Where clause any higher. It is fine where it is, as long as result is IQueryable<T> (for some T). LINQ is composable. Indeed, there's absolutely no difference between using the LINQ syntax and using the extension-method syntax; either would work identically. Basically, when you create a query, it is only building a model of what has been requested. Nothing is executed until you start iterating it (foreach, ToList(), etc). So adding an extra Where on the end is fine: that will get built into the composed query.
You can verify this very simply by monitoring the SQL connection; you'll see that it includes the where clause in the TSQL, and filters at the SQL server.
This allows for some interesting scenarios, for example a flexible search:
IQueryable<Customer> query = db.Customers;
if(name != null) query = query.Where(x => x.Name == name);
if(region != null) query = query.Where(x => x.Region == region);
...
if(dob != null) query = query.Where(x => x.DoB == dob);
var results = query.Take(50).ToList();
In terms of your assumptions, they are incorrect - it is really:
build composable query, composing (separately) from, join, join, select
further compose the query, adding a where (no different to the above compositions)
at some point later, iterate the query
generate sql from the fully-composed query
retrieve rows
map into model
yield the results
Note that the SQL generation only happens when the query is iterated; until then you can keep composing it all day long. It doesn't touch the SQL server until it is iterated.
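The "nothing runs until you iterate" part can be checked without a database. The sketch below uses LINQ to Objects, so it only demonstrates deferral; unlike LINQ to SQL it filters in memory rather than in generated SQL. DeferredExecutionDemo and ReadCount are hypothetical names for the illustration, and the counter stands in for "a row was fetched".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many source elements have actually been read.
    public static int ReadCount;

    private static IEnumerable<int> Source()
    {
        foreach (var n in new[] { 1, 2, 3, 4, 5 })
        {
            ReadCount++;   // stands in for "a row was fetched"
            yield return n;
        }
    }

    public static List<int> Run()
    {
        ReadCount = 0;

        // Composing the query: nothing is read yet.
        IEnumerable<int> query = Source().Select(n => n * 10);
        query = query.Where(n => n > 20);
        int readsBeforeIteration = ReadCount;

        // Iterating the query: only now is the source read.
        List<int> results = query.ToList();

        if (readsBeforeIteration != 0)
            throw new InvalidOperationException("query ran before it was iterated");
        return results;
    }
}
```

ReadCount stays at zero through both composition steps and only moves once ToList() iterates the query.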
I did a little research about LINQ to SQL best practices, because I'm always using this technology in my projects. Take a look at my blog post; maybe it can help you.
http://msguy.net/post/2012/03/20/LINQ-to-SQL-Practices-and-approaches.aspx
The provider knows how the populated properties from your custom model are mapped (because of the select clause in your query) to the actual columns on the database table. So it knows what column on the table it needs to filter when you filter on a property of your custom model. Think of your selected model (whether it be a designer-defined entity with all of the columns of the table, or a custom model defined somewhere, or an anonymous type with just the data you need) as just the selected columns before the FROM clause in the SQL query. Selecting an anonymous model makes it easily recognizable that the fields in the model correspond to the SELECT list in SQL.
Most important: always remember that var result = from ... is just a query until it gets iterated, e.g. with result.ToArray(). Try calling your variable query instead of result, and the world may take on new colors when you look again.
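One way to see that the trailing Where really becomes part of the query, rather than a separate in-memory step, is to look at the expression tree the query carries. The sketch below is an assumption-laden stand-in: AsQueryable() over a list plays the role of a LINQ to SQL table, and Row, RowViewModel and ComposedQueryDemo are hypothetical names. A real provider would translate this same tree into SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Row
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hand-rolled view model, unrelated to any mapping framework.
public class RowViewModel
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ComposedQueryDemo
{
    public static IQueryable<RowViewModel> Build(IEnumerable<Row> rows)
    {
        // AsQueryable() stands in for a LINQ to SQL table here.
        IQueryable<RowViewModel> query = rows.AsQueryable()
            .Select(r => new RowViewModel { Id = r.Id, Name = r.Name });

        // The Where on the view model is appended to the same expression tree.
        return query.Where(vm => vm.Name.Contains("some-guff"));
    }

    // Textual form of the tree that a provider would receive.
    public static string Describe(IQueryable<RowViewModel> query)
    {
        return query.Expression.ToString();
    }
}
```

The described expression mentions both the Select and the Where, which is the information a provider uses to map vm.Name back to the underlying column.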
I've seen 2 types of entities, like this:
public class Person
{
    public int Id {get;set;}
    public string Name {get;set;}
    public Country Country {get;set;}
}
and like this:
public class Person
{
    public int Id {get;set;}
    public string Name {get;set;}
    public int CountryId {get;set;}
}
I think that the 2nd approach is more lightweight, and you get related data only if you need it;
which one do you think is better?
It depends what you want. If you only want to get the Country's ID then go for the second option. If you actually want to make use of navigation properties and/or lazy loading, then go for the first option.
Personally, I use Entity Framework and combine options one and two:
public class Person
{
    public int Id {get;set;}
    public string Name {get;set;}
    public int CountryId {get;set;}
    public Country Country {get;set;}
}
So I have a choice when it comes to returning data from my repositories. This also means that when I come to save, I can just populate the actual value type properties, instead of having to load the country object and assign it to the person.
Taken at face value, the first is an example of a rich domain model, and the second is a data driven approach. Allowing rich domain models is one of the main benefits of ORM.
The only reason I would include the CountryId (either in place of the Country, or in addition to it) would be for optimization for some very specific performance problem. Even then I would think twice. And optimization is something you shouldn't be thinking about too much at the initial design stage. What's wrong with Person.Country.Id? (Assuming you need the id at all, and it's not just infrastructure).
If you are looking at this from any other angle than performance optimisation, then you are probably taking the wrong approach by including 'foreign keys' in your domain model. I had the same problem when first using NHibernate, coming from an ADO type background. I would almost certainly go with the first example.
There are two considerations, Platforms and Traffic, outlined below...
All in Microsoft Platform
In multi-tier solutions, where the end client is Silverlight and you are going to share your generated code via RIA Services, or you have a WPF client with WCF RIA Services, the first solution gives you a better design.
Non Microsoft End client
If your end client is a non-Microsoft client like Flex/Flash, Java or any Ajax-based smart client, then the first model will not be useful, as it needs to track itself (self-tracking objects). The second model is preferred here.
Low Traffic applications
If network traffic is not much of an issue and the design of your software is more important, or you have highly scalable middle tiers for caching (App Fabric etc.), the first solution is a good one and will give you a better design.
High Traffic applications
The first model will serialize more data than necessary, and that can be a real performance issue in high-traffic applications. In that case the second model will work better, because referenced data is loaded only if the user requests it.
This is quite a tradeoff issue between "Better Design" vs "Better Performance", and it needs to be selected based on parameters mentioned above and there can be more parameters depending upon complexity of project, team size, documentation and more.
Good question! For me
public List<Person> GetPersonsLivingIn(int countryId) {
    return ObjectContext.Persons.Where(x => x.CountryId == countryId).ToList();
}
just looks like it works that way without knowing about all the magic (leaky) abstractions that may be present in the ORM that would make x => x.Country == country work. I came from Linq2Sql where I had some problems with the first one when passing around objects created in different object contexts.
But I would do as GenericTypeTea said and include both the id and the navigation property. After all, you'll want a navigable object graph at some point. And that way you can still make
public List<Person> GetPersonsLivingIn(Country country) {
    return ObjectContext.Persons.Where(x => x.CountryId == country.CountryId).ToList();
}
which has a more OO feeling interface, but still looks like it would work without magic.
Except in some weird edge cases, there are no good reasons for the second design.
They are both equally lightweight (references are lazily loaded by default), but the second one doesn't give you navigational capabilities, which restricts and complicates your queries.
STOP!
In NHibernate, there is NO need to specify the foreign key in your domain model, not even for performance reasons.
Assuming you have lazy loading enabled (it's enabled by default), calling:
int countryId = person.Country.Id;
...won't incur a database hit to retrieve the Country entity. NHibernate will return a dynamic proxy of your Country, not an actual Country. Because of the proxy, a database hit will only occur on first access to a property on your Country entity, but NHibernate is smart enough to realise that 'person.Country.Id' is the same as accessing the country ID foreign key in your Person table, which gets loaded anyway.
However, the following code:
string countryName = person.Country.Name;
...will hit the database: the call to the 'Name' property will load the entire Country instance.
This behavior assumes you have set-up your mapping like so:
<many-to-one name="Country" class="Country" column="Country_ID" lazy="proxy" />
(note that lazy="proxy" is the default).
Simply put, there is no need to map foreign keys in your domain model with NHibernate.
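To make the proxy behaviour concrete, here is a hand-written stand-in. It is emphatically not NHibernate's implementation (NHibernate generates its proxies at runtime); LazyCountryProxy, IsLoaded and the loader delegate are hypothetical names used only to show why reading Id is free while reading Name costs a load.

```csharp
using System;

public class Country
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

// Hand-written stand-in for a lazy proxy; NOT NHibernate's implementation.
public class LazyCountryProxy : Country
{
    private readonly int _id;                   // already known from the Person row's foreign key
    private readonly Func<int, Country> _load;  // hypothetical loader, stands in for the database
    private Country _target;

    public LazyCountryProxy(int id, Func<int, Country> load)
    {
        _id = id;
        _load = load;
    }

    public bool IsLoaded
    {
        get { return _target != null; }
    }

    // Id is answered from the foreign key value: no load needed.
    public override int Id
    {
        get { return _id; }
        set { throw new NotSupportedException("the identifier of a proxy is fixed"); }
    }

    // Any other property forces the real entity to be loaded once.
    public override string Name
    {
        get { return Target().Name; }
        set { Target().Name = value; }
    }

    private Country Target()
    {
        if (_target == null) _target = _load(_id);
        return _target;
    }
}
```

Reading Id never invokes the loader; the first read of Name invokes it exactly once, and later reads reuse the loaded instance.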
I'm reading about the Entity Framework 4.0 and I was wondering why should I create a complex type and not a new Entity (Table) and a relation between them?
The perfect example is an address. Using a complex type for an address is much easier to deal with than a new entity. With complex types you do not have to deal with the primary key. Think about how many common types of entities would have an address (Business Units, People, Places). Imagine populating many people's addresses and needing to set a key for each one. With complex types you simply access the internal properties of the type and you're done. Here is an MSDN link of an example. http://msdn.microsoft.com/en-us/library/bb738613.aspx
This question has been here a while already, but I'm going to add an answer anyway in the hopes that the next poor sob that comes along knows what he's in for.
Complex types do not support lazy loading, at least not in EF 4.3. Let's take the address situation as an example. You have a Person table with 15 columns, 5 of which contain address information for certain individuals. It has 50k records. You create entity Person for the table with a complex type Address.
If you need a list of names of all individuals in your database you would do
var records = context.Persons;
which also includes addresses, pumping 5*50k values into your list for no reason and with noticeable delay. You could opt to only load the values you need in an anonymous type with
var records = from p in context.Persons
              select new {
                  LastName = p.LastName,
                  FirstName = p.FirstName,
              };
which works well for this case, but if you needed a more comprehensive list with, say, 8 non-address columns you would either need to add each one in the anonymous type or just go with the first case and go back to loading useless address data.
Here's the thing about anonymous types: While they are very useful within a single method, they force you to use dynamic variables elsewhere in your class or class children, which negate some of Visual Studio's refactoring facilities and leave you open to run-time errors. Ideally you want to circulate entities among your methods, so those entities should carry as little baggage as possible. This is why lazy loading is so important.
When it comes to the above example, the address information should really be in a table of its own with a full blown entity covering it. As a side benefit, if your client asks for a second address for a person, you can add it to your model by simply adding an extra Address reference in Person.
If unlike the above example you actually need the address data in almost every query you make and really want to have those fields in the Person table, then simply add them to the Person entity. You won't have the neat Address prefix any more, but it's not exactly something to lose sleep over.
But wait, there's more!
Complex types are a special case, a bump on the smooth landscape of plain EF entities. The ones in your project may not be eligible to inherit from your entity base class, making it impossible to put them through methods dealing with your entities in general.
Assume that you have an entity base class named EntityModel which defines a property ID. This is the key for all your entity objects, so you can now create
class EntityModelComparer<T> : IEqualityComparer<T> where T : EntityModel
which you then can use with Distinct() to filter duplicates from any IQueryable of type T where T is an entity class. A complex type can't inherit from EntityModel because it doesn't have an ID property, but that's fine because you wouldn't be using distinct on it anyway.
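A minimal sketch of such a comparer, assuming EntityModel exposes an int ID as described. The Person class is a stand-in for illustration. One caveat I would flag: LINQ providers generally cannot translate a custom IEqualityComparer into SQL, so in practice this is applied to an in-memory sequence (for example after ToList()).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed base class, as described above: every entity has an ID key.
public abstract class EntityModel
{
    public int ID { get; set; }
}

// Stand-in entity for the example.
public class Person : EntityModel
{
    public string LastName { get; set; }
}

// Treats two entities of type T as equal when their IDs match.
public class EntityModelComparer<T> : IEqualityComparer<T> where T : EntityModel
{
    public bool Equals(T x, T y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.ID == y.ID;
    }

    public int GetHashCode(T obj)
    {
        return obj == null ? 0 : obj.ID.GetHashCode();
    }
}
```

Usage on an in-memory list would look like people.Distinct(new EntityModelComparer<Person>()).ToList(), which keeps the first entity seen for each ID.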
Further down the line you come across a situation where you need some way to go through any entity and perform an operation. Maybe you want to dynamically list the properties of an entity on the UI and let the user perform queries on them. So you build a class that you can instantiate for a particular type and have it take care of the whole thing:
public class GenericModelFilter<T> where T : EntityModel
Oh wait, your complex type is not of type EntityModel. Now you have to complicate your entity inheritance tree to accommodate complex types or get rid of the EntityModel contract and reduce visibility.
Moving along, you add a method to your class that based on user selections can create an expression that you can use with linq to filter any entity class
Expression<Func<T, bool>> GetPredicate() { ... }
so now you can do something like this:
var personFilter = new GenericModelFilter<Person>();
var companyFilter = new GenericModelFilter<Company>();
var addressFilter = new GenericModelFilter<Address>(); // Complex type for Person
...
var query = from p in context.Persons.Where(personFilter.GetPredicate())
            join c in context.Companies.Where(companyFilter.GetPredicate()) on p.CompanyID equals c.ID
            select p;
This works the same for all entity objects... except Address with its special needs. You can't do a join for it like you did with Company. You can navigate to it from Person, but how do you apply that Expression on it and still end up with Person at the end? Now you have to take a moment and figure out this special case for a simple system that works easily everywhere else.
This pattern repeats itself throughout the lifetime of a project. Do I speak from experience? I wish I didn't. Complex types keep stopping your progress, like a misbehaved student at the back of the class, without adding anything of essence. Do yourself a favor and opt for actual entity objects instead.
Based on Domain-Driven Design concepts, an aggregate root can have one or more internal objects as its parts. In this case, the internal objects, inside the boundary of the aggregate root, do not have any key of their own; the parent's key is applied to them, or something like that. The answer comes down to the benefit of keeping all parts inside the aggregate root, which makes your model more robust and much simpler.
I had a quick question about the proper object relationship I should set up for this situation:
I have a Customer object with associated parameters and a depot object with associated parameters. Each depot serves a set of customers and the customer needs access to particular information for their respective depot.
I'm wondering what the proper relationship is that I should set up so that a set of customer objects all reference the same instance of a particular depot object. I wanted to be sure it wasn't creating a duplicate Depot object for each customer. Furthermore, I'd like to be able to change properties of the Depot without going through the customer itself.
I know this is probably a fairly basic question but C# has so many different "features" it gets confusing from time to time.
Thanks for your help!
Charlie
If I understand your question correctly, I think a solution to your problem might be an OR mapper. Microsoft provides two OR mappers at the moment, LINQ to SQL and Entity Framework. If you are using .NET 3.5, I recommend using LINQ to SQL, but if you are able to experiment with .NET 4.0, I would highly recommend looking into Entity Framework. (I discourage the use of Entity Framework in .NET 3.5, as it was released very prematurely and has a LOT of problems.)
Both of these OR mappers provide visual modeling tools that allow you to build a conceptual entity model. With LINQ to SQL, you can generate a model from your database, which will provide you with entity classes, as well as associations between those classes (representing your foreign keys from your DB schema). The LINQ to SQL framework will handle generating SQL queries for you, and will automatically map database query results into object graphs. Relationships such as the one you described, with multiple customers in a set referencing the same single depot, are handled automatically for you; you don't need to worry about them at all. You also have the ability to query your database using LINQ, and can avoid having to write a significant amount of stored procedures and plumbing/mapping code.
If you use .NET 4.0, Entity Framework is literally LINQ to SQL on steroids. It supports everything LINQ to SQL does, and a hell of a lot more. It supports model-driven design, allowing you to build a conceptual model from which code AND database schema are generated. It supports a much wider variety of mappings, providing a much more flexible platform. It also provides Entity SQL (eSQL), which is a text-based query language that can be used to query the model in addition to LINQ to Entities. Like LINQ to SQL, it will solve the scenario you used as an example, as well as many others.
OR mappers can be a huge time, money, and effort saver, greatly reducing the amount of effort required to interact with a relational database. They provide both dynamic querying as well as dynamic, optimistic updates/inserts/deletes with conflict resolution.
This sounds like you've got a Many-to-many relationship going on. (Customers know about their Depots, and vice versa)
Ideally this seems best suited for a database application where you define a weak-entity table ... Of course using a database is overkill if we're talking about 10 Customers and 10 Depots...
Assuming a database is overkill, this can be modeled in code with some Dictionaries. Assuming you're using int for the unique identifiers for both Depot and Customer you could create something like the following:
// creating a derived class for readability.
public class DepotIDToListOfCustomerIDs : Dictionary<int,List<int>> {}
public class CustomerIDToListOfDepotIDs : Dictionary<int,List<int>> {}
public class DepotIDToDepotObject : Dictionary<int,Depot>{}
public class CustomerIDToCustomerObject : Dictionary<int, Customer>{}
//...
// class scope for a class that manages all these objects...
DepotIDToListOfCustomerIDs _d2cl = new DepotIDToListOfCustomerIDs();
CustomerIDToListOfDepotIDs _c2dl = new CustomerIDToListOfDepotIDs();
DepotIDToDepotObject _d2do = new DepotIDToDepotObject();
CustomerIDToCustomerObject _c2co = new CustomerIDToCustomerObject();
//...
// Populate all the lists with the cross referenced info.
//...
// in a method that needs to build a list of depots for a given customer
// param: Customer c
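// ds is declared outside the if so it is still usable afterwards
// (it stays empty when the customer has no depots).
List<Depot> ds = new List<Depot>();
List<int> dids;
if (_c2dl.TryGetValue(c.ID, out dids))
{
    foreach (int did in dids)
    {
        // skip IDs that have no matching Depot object
        Depot d;
        if (_d2do.TryGetValue(did, out d))
            ds.Add(d);
    }
}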
// building the list of customers for a Depot would be similar to the above code.
EDIT 1: Note that I've crafted the code above to avoid circular references. Having a customer reference a depot that also references that same customer will prevent these from being quickly garbage collected. If these objects will persist for the entirety of the application's lifespan, a simpler approach certainly could be taken. In that approach you'd have two lists: one of Customer instances, the other of Depot instances. The Customer and Depot would contain lists of Depots and Customers respectively. However, you will still need two dictionaries in order to resolve the Depot IDs for the customers, and vice versa. The resulting code would be 99% the same as the above.
EDIT 2:
As is outlined in other replies, you can (and should) have an object broker model that makes the relationships and answers questions about the relationships. For those who have misread my code: it is by no means intended to craft the absolute and full object model for this situation. However, it is intended to illustrate how the object broker would manage these relationships in a manner that prevents circular references. You have my apologies for the confusion it caused on the first go-around. And my thanks for illustrating a good OO presentation that would be readily consumed by others.
In reply to #Jason D, and for the sake of #Nitax: I'm really skimming the surface, because while it's basically easy, it also can get complicated. There's no way I'm going to re-write it better than Martin Fowler either (certainly not in 10 minutes).
You first have to sort out the issue of only 1 object in memory that refers to a specific depot. We'll achieve that with something called a Repository. CustomerRepository has a GetCustomer() method, and the DepotRepository has a GetDepot() method. I'm going to wave my hands and pretend that just happens.
Second, you need to write some tests that indicate how you want the code to work. I can't know that, but bear with me anyways.
// sample code for how we access customers and depots
Customer customer = Repositories.CustomerRepository.GetCustomer("Bob");
Depot depot = Repositories.DepotRepository.GetDepot("Texas SW 17");
Now the hard part here is: How do you want to model the relationship? In OO systems you don't really have to do anything. In C# I could just do the following.
Customers keep a list of the depots they are with
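class Customer
{
    // backing list, declared here so the snippet compiles
    private readonly IList<Depot> _depotList = new List<Depot>();
    public IList<Depot> Depots { get { return _depotList; } }
}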
alternatively, Depots keep a list of the customers they are with
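class Depot
{
    // backing list, declared here so the snippet compiles
    private readonly IList<Customer> _customerList = new List<Customer>();
    public IList<Customer> Customers { get { return _customerList; } }
}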
// * code is very brief to illustrate.
In its most basic form, any number of Customers can refer to any number of Depots. m:n solved. References are cheap in OO.
Mind you, the problem we hit is that while the Customer can keep a list of references to all the depots it cares about (first example), there's no easy way for the Depot to enumerate all of its Customers.
To get a list of all Customers for a Depot (first example) we have to write code that iterates over all customers and checks the customer.Depots property:
List<Customer> CustomersForDepot(Depot depot)
{
    List<Customer> allCustomers = Repositories.CustomerRepository.AllCustomers();
    List<Customer> customersForDepot = new List<Customer>();
    foreach (Customer customer in allCustomers)
    {
        if (customer.Depots.Contains(depot))
        {
            customersForDepot.Add(customer);
        }
    }
    return customersForDepot;
}
If we were using LINQ, we could write it as
var query = from o in allCustomers
            where o.Depots.Contains(depot)
            select o;
return query.ToList();
Have 10,000,000 Customers stored in a database? Ouch! You really don't want to have to load all 10,000,000 customers each time a Depot needs to determine its customers. On the other hand, if you only have 10 Depots, a query loading all Depots once in a while isn't a big deal. You should always think about your data and your data access strategy.
We could have the list in both Customer and Depot. When we do that we have to be careful about the implementation. When adding or removing an association, we need to make the change to both lists at once. Otherwise we have customers thinking they are associated with a depot, but the depot doesn't know anything about the customer.
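A minimal sketch of keeping both lists in step, assuming a single helper is the only code allowed to change them. The Associations class, the AddDepot/RemoveDepot/AddCustomer/RemoveCustomer methods, and the Name property are hypothetical names made up for this illustration; only Customer, Depot, Depots and Customers come from the examples above.

```csharp
using System.Collections.Generic;

public class Customer
{
    private readonly List<Depot> _depots = new List<Depot>();
    public string Name { get; private set; }
    public Customer(string name) { Name = name; }

    // Read-only view, so callers cannot update one side on its own.
    public IList<Depot> Depots { get { return _depots.AsReadOnly(); } }

    // Hypothetical helpers, meant to be called only by Associations.
    internal void AddDepot(Depot depot) { if (!_depots.Contains(depot)) _depots.Add(depot); }
    internal void RemoveDepot(Depot depot) { _depots.Remove(depot); }
}

public class Depot
{
    private readonly List<Customer> _customers = new List<Customer>();
    public string Name { get; private set; }
    public Depot(string name) { Name = name; }

    public IList<Customer> Customers { get { return _customers.AsReadOnly(); } }

    internal void AddCustomer(Customer customer) { if (!_customers.Contains(customer)) _customers.Add(customer); }
    internal void RemoveCustomer(Customer customer) { _customers.Remove(customer); }
}

// Hypothetical single entry point: every change touches both lists.
public static class Associations
{
    public static void Associate(Customer customer, Depot depot)
    {
        customer.AddDepot(depot);
        depot.AddCustomer(customer);
    }

    public static void Dissociate(Customer customer, Depot depot)
    {
        customer.RemoveDepot(depot);
        depot.RemoveCustomer(customer);
    }
}
```

Because the public properties hand out read-only views, the only way to change an association is through Associate and Dissociate, which is what keeps the two sides from drifting apart.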
If we don't like that, and decide we don't really need to couple the objects so tightly, we can remove the explicit lists and introduce a third object that is just the relationship (and also include another repository).
class CustomerDepotAssociation
{
    public Customer Customer { get; private set; }
    public Depot Depot { get; private set; }
}
class CustomerDepotAssociationRepository
{
IList<Customer> GetCustomersFor(Depot depot) ...
IList<Depot> GetDepotsFor(Customer customer) ...
void Associate(Depot depot, Customer customer) ...
void DeAssociate(Depot depot, Customer customer) ...
}
It's yet another alternative. The repository for the association doesn't need to expose how it associates Customers to Depots (and by the way, from what I can tell, this is what #Jason D's code is attempting to do)
I might prefer the separate object in this instance because what we're saying is the association of Customer and Depot is an entity unto itself.
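For what it's worth, here is a rough in-memory sketch of such an association repository, assuming two dictionaries of sets keyed by object reference. The method names follow the outline above; the private fields, the GetOrAdd helper, and the bare-bones Customer and Depot classes are assumptions made only so the snippet stands alone. A database-backed version would look quite different.

```csharp
using System.Collections.Generic;

// Placeholder entities; the real ones would carry far more.
public class Customer { public string Name; }
public class Depot { public string Name; }

public class CustomerDepotAssociationRepository
{
    // One index per direction so both lookups avoid scanning everything.
    private readonly Dictionary<Customer, HashSet<Depot>> _depotsByCustomer =
        new Dictionary<Customer, HashSet<Depot>>();
    private readonly Dictionary<Depot, HashSet<Customer>> _customersByDepot =
        new Dictionary<Depot, HashSet<Customer>>();

    public void Associate(Depot depot, Customer customer)
    {
        GetOrAdd(_depotsByCustomer, customer).Add(depot);
        GetOrAdd(_customersByDepot, depot).Add(customer);
    }

    public void DeAssociate(Depot depot, Customer customer)
    {
        HashSet<Depot> depots;
        if (_depotsByCustomer.TryGetValue(customer, out depots)) depots.Remove(depot);
        HashSet<Customer> customers;
        if (_customersByDepot.TryGetValue(depot, out customers)) customers.Remove(customer);
    }

    public IList<Customer> GetCustomersFor(Depot depot)
    {
        HashSet<Customer> customers;
        return _customersByDepot.TryGetValue(depot, out customers)
            ? new List<Customer>(customers)   // copy, so callers cannot edit the index
            : new List<Customer>();
    }

    public IList<Depot> GetDepotsFor(Customer customer)
    {
        HashSet<Depot> depots;
        return _depotsByCustomer.TryGetValue(customer, out depots)
            ? new List<Depot>(depots)
            : new List<Depot>();
    }

    // Returns the set stored under key, creating an empty one on first use.
    private static HashSet<TValue> GetOrAdd<TKey, TValue>(
        Dictionary<TKey, HashSet<TValue>> map, TKey key)
    {
        HashSet<TValue> set;
        if (!map.TryGetValue(key, out set))
        {
            set = new HashSet<TValue>();
            map[key] = set;
        }
        return set;
    }
}
```

Neither Customer nor Depot holds a reference to the other here; the repository is the only thing that knows who is associated with whom.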
So go ahead and read some Domain-Driven Design books, and also buy Martin Fowler's PoEAA (Patterns of Enterprise Application Architecture).
Hope this is self-explanatory.
[Diagrams: OO model, ER model]