Problem modelling address generalization

I'm unsure how best to model this rather simple scenario in UML and ERM. Once designed, it has to be implemented in C# (Entity Framework 4) and SQL Server 2008.
The case is that I have a person who has a postal address. This address can be one of two types: either a postal box or a house identifier (house number, street, etc.). I believe this is called generalization, and it might be modelled the following way using UML:
http://i.imgur.com/Vzx4Z.png (sorry for the link, but I don't have enough reputation to post images yet)
PostalAddressPostalBox and PostalAddressHouseIdentifier will of course have more properties and relations to other entities/tables. The same goes for the Person class. Other classes will also reference a postal address, so it's not only persons.
My question is whether this is the correct way to model the problem, how I can implement it in SQL (schema-wise), and what my entities in C# should look like.
Thank you in advance

I don't think you should share one PostalAddress between different entities; doing so makes it harder to change the address of a single entity.
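For the relational database there are several options: class table inheritance (one table per class, joined on a shared key), single table inheritance (one table for the whole hierarchy, with a discriminator column), or concrete table inheritance (one table per concrete class). It's up to you to weigh the trade-offs.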
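On the C# side, whichever table layout is chosen, the entities could look roughly like the sketch below. Only the three address class names and Person come from the question; every property name here (BoxNumber, Street, HouseNumber, PostalCode, City) and the FormatLine method are assumptions made for illustration.

```csharp
using System;

// Base type that Person (and any other entity) refers to.
public abstract class PostalAddress
{
    public int Id { get; set; }
    public string PostalCode { get; set; }   // assumed shared property
    public string City { get; set; }         // assumed shared property

    // Each subtype decides how it is rendered as one address line.
    public abstract string FormatLine();
}

public class PostalAddressPostalBox : PostalAddress
{
    public string BoxNumber { get; set; }    // assumed property

    public override string FormatLine() =>
        $"P.O. Box {BoxNumber}, {PostalCode} {City}";
}

public class PostalAddressHouseIdentifier : PostalAddress
{
    public string Street { get; set; }       // assumed property
    public string HouseNumber { get; set; }  // assumed property

    public override string FormatLine() =>
        $"{Street} {HouseNumber}, {PostalCode} {City}";
}

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }

    // The person only knows the base type; either subtype can be assigned.
    public PostalAddress Address { get; set; }
}
```

With class table inheritance each of the three address classes would get its own table; with single table inheritance they would share one table plus a discriminator column. The C# classes stay the same in both cases.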

Related

On the "owned" types in EF Core

In my project I use the EF Core fluent configuration, code first. I have read a little about owned types, but the situation below is not really clear to me:
Suppose I have a Project entity and a ProjectType.
Should I map that property as
Entity<Project>.HasOne<ProjectType>();
or rather as
Entity<Project>.OwnsOne<ProjectType>();
The ProjectType entity should be mapped to a table ProjectType(ProjectTypeId, Name, Description)
As I read, owned types are
"types that can only ever appear on navigation properties of other entity types. These are called owned entity types. The entity containing an owned entity type is its owner. Owned entities are essentially a part of the owner and cannot exist without it"
In my case
"ProjectType can only ever appear on navigation properties of Project entity type. ProjectType is essentially a part of the Project and cannot exist without it"... however, in order to create a separate table, as I understood it, I need to use HasOne, not OwnsOne. It would be great if someone could explain this idea better. Thanks a lot.

ProjectTypes sounds like a reference table that might be modifiable over the course of an application's lifespan, for example through a system administration role. The new "Owns" configuration is a convention that helps enforce concepts such as type-specific composition and linking tables in a relational data model.
A better example for composition: say you have a Project, and as part of a Project there are some details that are fairly large and infrequently used, such as an image or other binary data, or perhaps some large text. Keeping these BLOB/CLOB details inside the Project table can be a recipe for disaster when you fetch a Project, or many Projects, so you normalize them out into a separate related table called ProjectDetails, or possibly several related tables. That way, when you are working with Project and loading those entities, you mostly don't have to worry about pulling back these large fields all of the time; you can include ProjectDetails only when it is actually needed. Since ProjectDetails doesn't really serve any purpose on its own, it doesn't need a DbSet or anything of the like, so we can set up the relationship so that Project OwnsOne ProjectDetails.
ProjectType, on the other hand, would potentially have a DbSet so that new project types can be established while configuring an application. You may also want to associate other project-related details with a ProjectType. In this case it makes more sense for Project to HasOne ProjectType. We can have a DbSet of ProjectTypes to manage, and other entities may filter by ProjectType as well: Project Stages/Phases, etc.
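As far as the database schema goes, Owns and Has can produce the same tables. (By default EF Core stores an owned type's columns in the owner's table, so a separate table needs an explicit ToTable call.) Beyond that, the difference is about how the EF DbContext expects to work with the entities.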
Other common examples of using Owns are linking tables. For instance, you have an Address table which is shared between Orders, Customers, etc. Neither "Owns" addresses, but they do own their linking table: Order Owns OrderAddress, Customer Owns CustomerAddress. Each of these linking entities "Has" an Address. We may still want to review Addresses on their own, as they represent physical locations, and there is a difference between associating an Order etc. with a different location and "adjusting" the details recorded for a physical location (e.g. correcting a street name or municipality). There is never a need to deal with OrderAddresses or CustomerAddresses outside the scope of the Order or Customer respectively.
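As a rough sketch of the Project example above, the two mappings might be configured as follows. This assumes the Microsoft.EntityFrameworkCore package with a relational provider. The property names (Type, Details, LongDescription, Image), the AppDbContext class and the "ProjectDetails" table name are made up for illustration.

```csharp
using Microsoft.EntityFrameworkCore;

// Reference data that is managed on its own, so it gets a DbSet.
public class ProjectType
{
    public int ProjectTypeId { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
}

// Large, rarely used data that only makes sense as part of a Project.
public class ProjectDetails
{
    public string LongDescription { get; set; }  // hypothetical property
    public byte[] Image { get; set; }            // hypothetical property
}

public class Project
{
    public int ProjectId { get; set; }
    public string Name { get; set; }
    public int ProjectTypeId { get; set; }
    public ProjectType Type { get; set; }
    public ProjectDetails Details { get; set; }
}

public class AppDbContext : DbContext
{
    public DbSet<Project> Projects { get; set; }
    public DbSet<ProjectType> ProjectTypes { get; set; }
    // No DbSet<ProjectDetails>: it is only reachable through its owner.

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Regular relationship: ProjectType is an entity in its own right.
        modelBuilder.Entity<Project>()
            .HasOne(p => p.Type)
            .WithMany()
            .HasForeignKey(p => p.ProjectTypeId);

        // Owned type, explicitly placed in its own table.
        modelBuilder.Entity<Project>()
            .OwnsOne(p => p.Details, details => details.ToTable("ProjectDetails"));
    }
}
```

So the choice between HasOne and OwnsOne does not depend on whether a separate table is wanted; it depends on whether the type should be queryable and manageable on its own.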

DDD and CRUD on lookup tables using repository pattern

So I am trying to follow the Domain Driven Design approach and am not sure how to handle lookup tables that require CRUD operations. Here is a contrived example to demonstrate my question.
Let's say I have a Person class:
public class Person
{
    public string Address { get; private set; }
}
My database has a table of People and a table of Addresses. The People table has a column for Address_Id which is a foreign key to the Address table. The obvious idea being you can't add a person record to the People table with an address value that doesn't exist in the Addresses table since there is a foreign key relationship.
In my application, Person is an aggregate root and thus has an associated repository. Using the repository, I load people which also loads the associated address.
My question is how do I do CRUD operations on the Addresses table? Following DDD principles, I don't think I am supposed to create a repository for the Addresses table.
The requirement for my application is that when creating a new Person object, a drop-down list of addresses is presented from which the user will select an address. Users are not allowed to type addresses by hand; they must pick one that already exists. Hence the CRUD operations on the Addresses table. The administrator of the system will need to manage the Addresses lookup table so that the proper selections are presented to a user creating Person objects.
Again, this is a very contrived example and I get that nobody would ever design such a system. It is simply to help illustrate the question.

IMO, you have two use cases: 1) saving Person objects, and before that 2) listing all available Addresses so that the right one can be selected.
So I would create an AddressRepository, maybe not a full CRUD one, but one only for fetching entities.
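A fetch-only repository of that kind could be sketched as follows. The names IAddressLookup, InMemoryAddressLookup, GetAll and FindById are invented here, and the in-memory dictionary merely stands in for whatever database access the real implementation would use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Address
{
    public Address(int id, string text)
    {
        Id = id;
        Text = text;
    }

    public int Id { get; }
    public string Text { get; }
}

// Read-only contract: enough to fill the drop-down, no Add/Update/Delete.
public interface IAddressLookup
{
    IReadOnlyList<Address> GetAll();
    Address FindById(int id);
}

// In-memory stand-in for a database-backed implementation.
public sealed class InMemoryAddressLookup : IAddressLookup
{
    private readonly Dictionary<int, Address> _byId;

    public InMemoryAddressLookup(IEnumerable<Address> addresses)
    {
        _byId = addresses.ToDictionary(a => a.Id);
    }

    // Sorted by text so the drop-down has a stable order.
    public IReadOnlyList<Address> GetAll() =>
        _byId.Values.OrderBy(a => a.Text, StringComparer.Ordinal).ToList();

    // Returns null when the id is unknown.
    public Address FindById(int id) =>
        _byId.TryGetValue(id, out var address) ? address : null;
}
```

The administrator's maintenance screens could then use a separate, write-capable interface, so the Person use case never sees more than it needs.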

Are you ever editing or retrieving addresses on their own? The repository is essentially a mapper between a business object and a relational database; it is supposed to encapsulate how the object is persisted so you don't have to worry about it.
If the object is persisted in multiple tables, the business logic does not have to know that. So unless you need to edit Address objects on their own, I wouldn't add a repository for Address.
Have a look at this: Repository Pattern Step by Step Explanation

Well, I guess the property string Address should really be Address Address, i.e. typed as an Address object.
In that case, when you store a Person through PersonRepository and the given Address doesn't exist in the underlying store, the repository's technology-specific implementation will create the address record in your Addresses table for you.
Also, I guess you'll be using a repository on top of an existing data mapper - an OR/M - which should manage these cases easily: it's just about mapping the property as a many-to-one association.
Actually, I believe a repository should store aggregate roots, as you mention in your own question.
It depends on your own domain. If an address can live alone because it can be associated with zero or more persons, you should consider adding addresses through a specific AddressRepository, registering addresses with it, and later associating one with some Person.

SQL Server and Entity Framework - Dynamic Columns

I use SQL Server and Entity Framework as ORM.
Currently I have a table Product which contains all products of any kind. The different kinds of products possess different attributes.
For example:
All products of kind TV have attributes title, resolution and contrast
whereas all products of kind Car have attributes like model and horsepower
Based on this scenario I created a table called Attribute which contains all the attributes of a product.
Now to fetch a product from database I always have to join all the attributes.
To insert a product I have to insert all the attributes one by one as single rows.
The application is not just a shop or anything like it. It should be possible to add/remove an attribute to/from a kind of product on the fly without changing the db.
But my questions to you are still:
Is this a bad design?
Is there another way of doing it?
Will my solution slow down significantly? (For example, an insert might take several seconds if the product has hundreds of attributes.)
Update
The problem is that my application is really complex. There are a lot of huge algorithms. The software is used for statistical purposes.
One problem, for example, is the following: in an algorithm table I'm storing which attributes are used for filters. Say an administrator wants to filter all cars that have less than 100 horsepower. The filters are dynamic, which means that I have a filter table which stores the filter type (lessThan) and the attribute (horsepower). How can I keep this flexibility with the suggested approaches (with "hardcoded" columns)?

There is a thing about EF that I don't think everybody is aware of when designing the relations.
When you query something, EF (at least up to version 4) wants to create a single SELECT for that query.
What that implies is that if you have entity A with a one-to-many relationship to entity B (say Item to Attributes), then EF joins the two together such that one row is returned for every dependent B of each A. If A has many properties or multiple dependencies, or even worse if B has many sub-dependencies, then the returned table will be quite massive, since all A-properties will be copied into each row of dependent B. Over time, as your entity models grow in complexity, this can turn into a real performance problem.
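The row duplication described above can be reproduced without a database. The following in-memory LINQ join is only an analogy for the SQL that EF generates; Item, ItemAttribute, FlatRow and JoinDemo are names made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public class ItemAttribute
{
    public int ItemId { get; set; }
    public string Name { get; set; }
    public string Value { get; set; }
}

// One row of the flattened result: the Item columns repeat per attribute.
public class FlatRow
{
    public int ItemId { get; set; }
    public string Title { get; set; }
    public string AttributeName { get; set; }
    public string AttributeValue { get; set; }
}

public static class JoinDemo
{
    // In-memory equivalent of SELECT ... FROM Item JOIN Attribute ON ...
    public static List<FlatRow> Flatten(
        IEnumerable<Item> items, IEnumerable<ItemAttribute> attributes)
    {
        return (from item in items
                join attribute in attributes on item.Id equals attribute.ItemId
                select new FlatRow
                {
                    ItemId = item.Id,
                    Title = item.Title,
                    AttributeName = attribute.Name,
                    AttributeValue = attribute.Value
                }).ToList();
    }
}
```

An item with three attributes comes back as three rows, each carrying a full copy of the item's own columns, which is why wide parent entities make the result set grow quickly.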
EF only includes the Bs if you explicitly tell it to eager load the dependencies via "include"s. If the includes are omitted, your data will initially load faster, but once you access the attributes they will be lazy-loaded by EF. This is known as the SELECT N+1 problem (one query for the As, plus one lazy query for each of the N As to fetch its Bs, which can be a huge overhead).
While this is not a straight answer to your question, it is something to consider when designing your tables.
Also note that EF supports several alternatives for mapping base classes. One strategy is to have a common table that is automatically joined with the tables of the sub-entities. The alternative, which typically performs better but is harder to upgrade, is to have one table with a superset of all properties of all subclasses.
More (over) generalized database design considerations:
The devil is in the details. You can make a whole career out of making good database design choices. There are no silver-bullet database patterns.
EF comes with a lot of limitations. This is the price for the convenience. If the model suits EF well, then EF is quite good, but do consider more flexible alternatives like NHibernate. Sometimes even plain old data tables with views and stored procedures are to be preferred.
EF is not efficient if your model has a lot of small dependents (like a ton of attributes to an item table). It will result in either a monster query and return table or the select n+1 problem. You can write some tricky multi-part LINQ queries to somewhat compensate, but it is tricky.
SQL's strength is in integrity and reporting which works best for rather rigid data models.
Depending on the details, your model looks like a great candidate for a NoSQL backend such as RavenDB or MongoDB. NoSQL stores are much better suited to dynamic data models and scale really well.
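Whatever backend is chosen, the dynamic filter from the question's update (a filter type such as lessThan plus an attribute such as horsepower) can be evaluated against a product's attribute bag without hard-coded columns. The sketch below runs in memory only. FilterKind, AttributeFilter and the decision to store values as decimal are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public enum FilterKind { LessThan, GreaterThan, EqualTo }

// One row of the hypothetical filter table: attribute, operator, threshold.
public sealed class AttributeFilter
{
    public AttributeFilter(string attributeName, FilterKind kind, decimal value)
    {
        AttributeName = attributeName;
        Kind = kind;
        Value = value;
    }

    public string AttributeName { get; }
    public FilterKind Kind { get; }
    public decimal Value { get; }

    // A product without the attribute never matches.
    public bool Matches(IReadOnlyDictionary<string, decimal> attributes)
    {
        if (!attributes.TryGetValue(AttributeName, out var actual))
        {
            return false;
        }

        switch (Kind)
        {
            case FilterKind.LessThan: return actual < Value;
            case FilterKind.GreaterThan: return actual > Value;
            case FilterKind.EqualTo: return actual == Value;
            default: throw new InvalidOperationException("Unknown filter kind.");
        }
    }
}
```

Because the filter only refers to attributes by name, adding a new attribute to a kind of product needs no code or schema change. The cost is that the evaluation happens in application code unless the store can translate it into a query.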

Abstracting away database specific id:s with the repository pattern?

I'm learning DDD (domain driven design) and the repository pattern (in C#). I would like to be able to use the repository pattern to persist an entity and not care which database is actually used (Oracle, MySQL, MongoDB, RavenDB, etc.). I am, however, not sure how to handle the database-specific IDs that most (all?) databases use. RavenDB, for example, requires that each entity it stores has an id property of type string. Others may require an id property of type int. Since this is handled differently by different databases, I cannot make the database id a part of the entity class. But it would have to exist at some point, at least when I store the actual entity. My question is: what is the best practice regarding this?
The idea I am currently pursuing is to implement, for each database I want to support, database-specific "value objects" for each business object type. These value objects would then have the database-specific id property, and I would map between the two upon reads and writes. Does this seem like a good idea?

This is the classic case of a leaky abstraction. You can't possibly abstract away the type of database behind a repository interface unless you want to lose all the good things that come with each database. The requirements on ID type (string, Guid or whatever) are only the tip of a huge iceberg, with the majority of its mass under the muddy waters.
Think about transaction handling, concurrency and other stuff. I understand your point about persistence ignorance. It's a good thing for sure to not depend on specific database technology in the domain model. But you also can't get rid of any dependency on any persistence technology.
It's relatively easy to make your domain model work well with any RDBMS. Most of them have standardized data types. Using an ORM like NHibernate will help you a lot. It's much harder to achieve the same among NoSQL databases because they tend to differ a lot (which is actually very good).
So my advice would be to do some research on the set of persistence technologies you will have to deal with and then choose an appropriate level of abstraction for the persistence subsystem.
If this won't work for you, think about Event Sourcing. The event store is one of the least demanding persistence techniques. Using a library such as Jonathan Oliver's EventStore will allow you to use virtually any storage technology, including the file system.

I would go ahead and create an int Id field in the entity and then convert it to a string in any repository where the Id must be a string. I think the effort to abstract your persistence is well worthwhile and actually eases maintenance.
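A minimal sketch of that conversion, assuming a store that only accepts string keys: the dictionary below stands in for such a store, and the "customers/" key prefix, the Customer class and the repository name are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Repository for a store that requires string keys.
public class StringKeyedCustomerRepository
{
    // Stand-in for the real document store.
    private readonly Dictionary<string, Customer> _store =
        new Dictionary<string, Customer>();

    // The only place that knows how the int id maps to a store key.
    private static string ToStoreKey(int id) =>
        "customers/" + id.ToString(CultureInfo.InvariantCulture);

    public void Save(Customer customer)
    {
        _store[ToStoreKey(customer.Id)] = customer;
    }

    // Returns null when no customer with that id has been saved.
    public Customer Load(int id) =>
        _store.TryGetValue(ToStoreKey(id), out var customer) ? customer : null;
}
```

The entity keeps a plain int, and swapping in a repository for an int-keyed database would not touch the Customer class.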

You are doing the right thing! Abstract yourself away from the constraints of the database's primary key types!
Don't try to translate types, just use a different field.
Specifically: Do not try to use the database's primary key, except in your data access logic. If you need a friendly ID for an object, just create an additional field, of whatever type you like, and require your database to store that. Only in your data access layer would you need to find & update the DB record(s) based on your object's friendly ID. Easy.
Then your constraint on which databases can persist your objects changes from "must be able to have a primary key of type xxxx" to simply "must be able to store type xxxx". I think you'll then find you can use any database in the world. Happy coding! DDD is the best!
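The "additional field" idea might look like the following sketch, in which the domain object carries its own Guid while the storage row keeps a separate key that never leaves the data access code. The in-memory list stands in for a table, and all names (CustomerId, DbKey, RowCount) are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Customer
{
    public Customer(Guid customerId, string name)
    {
        CustomerId = customerId;
        Name = name;
    }

    public Guid CustomerId { get; }  // domain-level ID, independent of any database
    public string Name { get; }
}

public sealed class CustomerRepository
{
    // Storage-side row: DbKey is the database's own primary key.
    private sealed class Row
    {
        public int DbKey;
        public Guid CustomerId;
        public string Name;
    }

    private readonly List<Row> _table = new List<Row>();
    private int _nextKey = 1;

    public int RowCount => _table.Count;

    // Inserts a new row or updates the row carrying the same domain ID.
    public void Save(Customer customer)
    {
        var row = _table.FirstOrDefault(r => r.CustomerId == customer.CustomerId);
        if (row == null)
        {
            row = new Row { DbKey = _nextKey++, CustomerId = customer.CustomerId };
            _table.Add(row);
        }
        row.Name = customer.Name;
    }

    // Returns null when the domain ID is unknown.
    public Customer Find(Guid customerId)
    {
        var row = _table.FirstOrDefault(r => r.CustomerId == customerId);
        return row == null ? null : new Customer(row.CustomerId, row.Name);
    }
}
```

In a real database the CustomerId column would usually want a unique index, since every lookup goes through it rather than through the primary key.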

You can potentially keep the ids in the entity without exposing them as part of the entity's public interface. This is possible with NHibernate because it allows you to map a table column to a private field.
So you can potentially have something like
class Customer {
    private readonly Int32? _relationalId;
    private readonly String? _documentId;
    ...
This is not ideal because your persistence logic "bleeds" into the business logic, but given the requirements it is probably easier and more robust than maintaining a mapping between an entity and its id somewhere outside the entity. I would also highly recommend that you evaluate a "database agnostic" approach, which would be more realistic if you only want to support relational databases. In that case you can at least reuse an ORM like NHibernate for your repository implementation, and most relational databases support the same id types. In your scenario you need not only an ORM but also something like an "Object-Document-Mapper", and I can see that you will have to write tons and tons of infrastructure code. I highly recommend that you re-evaluate your requirements and choose between relational and document databases. Read this: Pros/cons of document-based databases vs. relational databases

How would you code a repository pattern like a "factory" design pattern?

I thought I would rewrite this question (same iteration). The original was how to wrap a repository pattern around an EAV/CR database. I am trying a different approach.
Question: How could you code a data repository in a "factory" design pattern way? I have a fixed number of entities, but the attributes to these entities are fairly customer specific. They advertise Products which are all similar, but each customer attaches different information to them based on their business model. For example, some care about the percent slab off waste while others care about the quantity of pounds sold. Every time we find another customer, we add a bunch of fields, remove a bunch of fields, and then spend hours keeping each solution current to the latest common release.
I thought we could put the repository classes in a factory pattern, so that when I know the customer type, then I know what fields they would use. Practical? Better way? The web forms use User Controls which are modified to reflect what fields are on the layouts. We currently "join" the fields found on the layout to the fields found in the product table, then CRUD common fields.
Previous question content:
We have an EAV/CR data model that allows different classes for the same entity. This tracks products where customers have wildly different products. Customers can define a "class" of product, load it up with fields, then populate it with data. For example,
Product.Text_Fields.Name
Product.Text_Fields.VitaminEContent
Any suggestion on how to wrap a repository pattern around this?
We have a three table EAV: a Product table, a value table, and a meta table that lists the field names and data types (we list the data types because we have other tables like Product.Price and Product.Price Meta data, along with others like Product.Photo.) Customers track all kinds of prices like a competitor's percent off difference as well as on the fly calculations.
We currently use Linq to SQL with C#.
Edit:
I like the "Dynamic Query" Linq below. The idea behind our DB is like a locker room locker storage. Each athlete (or customer) organizes their own locker in the way they wish, and we handle storing it for them. We don't care what's in the locker as long as they can do what they need with it.
Very interesting... The objects passed to the repository could be dynamic? This almost makes sense, almost like a factory pattern. Would it be possible for the customer to put their own class definitions in a text file, then we inherit them and store them in the DB?

As I understand it, the repository pattern abstracts the physical implementation of the database from the application. Are you planning to store the data in differing data stores? If you are happy with Linq to SQL, then I'd suggest that perhaps you don't need to abstract in this way, as it looks to be extremely complex. That said, I can see that providing an EAV-style repository, i.e. one where a query is passed the Table, Field Type and Field Name along with any condition required, might offer you the abstraction you are seeking.
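To make the shape of such an EAV-style repository concrete, here is a small in-memory sketch. A real version would translate the call into a Linq to SQL query over the value and meta tables. The class and method names (FieldValue, EavRepository, Set, Find) are invented, and values are kept as strings purely to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class FieldValue
{
    public int EntityId { get; set; }
    public string Table { get; set; }
    public string FieldType { get; set; }
    public string FieldName { get; set; }
    public string Value { get; set; }
}

public sealed class EavRepository
{
    // Stand-in for the value table.
    private readonly List<FieldValue> _values = new List<FieldValue>();

    // Inserts or overwrites a single field value for one entity.
    public void Set(int entityId, string table, string fieldType,
                    string fieldName, string value)
    {
        var existing = _values.FirstOrDefault(v =>
            v.EntityId == entityId && v.Table == table &&
            v.FieldType == fieldType && v.FieldName == fieldName);

        if (existing != null)
        {
            existing.Value = value;
            return;
        }

        _values.Add(new FieldValue
        {
            EntityId = entityId,
            Table = table,
            FieldType = fieldType,
            FieldName = fieldName,
            Value = value
        });
    }

    // Returns the ids of entities whose named field satisfies the condition.
    public List<int> Find(string table, string fieldType, string fieldName,
                          Func<string, bool> condition)
    {
        return _values
            .Where(v => v.Table == table && v.FieldType == fieldType &&
                        v.FieldName == fieldName)
            .Where(v => condition(v.Value))
            .Select(v => v.EntityId)
            .Distinct()
            .ToList();
    }
}
```

A call such as Find("Product", "Text_Fields", "VitaminEContent", v => v == "high") then reads much like the Product.Text_Fields.VitaminEContent notation from the question.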
I'm not sure if that would still qualify as a repository pattern in the strictest terms as you aren't really abstracting the storage from the application. It's going to be a toss-up between benefit and effort and there I'm unable to help.
You might want to take a look at the Dynamic Linq extensions found in the dynamic data preview on Codeplex.
