I am starting my first DDD project (in C#), and at this stage I feel we will probably go with MongoDB or maybe CouchDB for persistence (an ORM like Entity Framework seems like overkill for what we want). That said, I have pretty much zero experience with MongoDB or CouchDB at this point.
As I am creating my domain, I thought about using GUIDs as the IDs for my entities (coming from a relational database world, I am still having trouble moving away from it).
If I go down this route, will I be able to easily plug in my persistence layer (MongoDB/CouchDB), or will I have to change my domain model? Currently the constructors on my entity objects take a string ID as a parameter, which will hold the GUID.
JD
With MongoDB you probably want to have a collection per aggregate root, which means that your aggregate roots need IDs, since they will be the documents in the DB. If you want to keep your domain model free of MongoDB-specific code, those IDs can be strings.
I would not include the IDs in the constructor arguments; I would just let them be writable properties. As with an ORM, I would handle reading and storing entities via repositories, and keep the MongoDB code in there.
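As a rough sketch of what that shape can look like (the Customer entity, collection name, and repository members here are made up, and the calls follow the newer 2.x MongoDB C# driver API):

using MongoDB.Bson;
using MongoDB.Driver;

// Domain entity: plain string ID as a writable property, no driver types.
public class Customer
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public interface ICustomerRepository
{
    Customer GetById(string id);
    void Save(Customer customer);
}

// All MongoDB-specific code stays inside the repository.
public class MongoCustomerRepository : ICustomerRepository
{
    private readonly IMongoCollection<Customer> _customers;

    public MongoCustomerRepository(IMongoDatabase database)
    {
        // One collection per aggregate root.
        _customers = database.GetCollection<Customer>("customers");
    }

    public Customer GetById(string id)
    {
        return _customers.Find(c => c.Id == id).FirstOrDefault();
    }

    public void Save(Customer customer)
    {
        // The repository, not the domain, decides what the string ID holds.
        if (string.IsNullOrEmpty(customer.Id))
            customer.Id = ObjectId.GenerateNewId().ToString();

        _customers.ReplaceOne(c => c.Id == customer.Id, customer,
                              new ReplaceOptions { IsUpsert = true });
    }
}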
Related
I'm developing a .NET web service while trying to maintain a layered architecture that keeps my Models in one project and DB access (DAL) in a different one. The idea behind this is that if I have to change the DB technology, it's just a matter of creating a different DAL, while the rest of the application remains unchanged.
In the data-access layer that I'm developing, I am using the MongoDB C# Driver.
I've seen that:
Properties named "Id" will be mapped by the C# driver to the database's "_id" field (convention over configuration);
Int + Auto-increment in MongoDB is not a good idea;
Using GUIDs as IDs in MongoDB isn't a good idea either;
The recommended data type for the ID of documents stored in MongoDB is ObjectId; the C# driver provides a class to represent it;
However, if I use this data type (from MongoDB.Bson) in my Models, then they will become dependent on the MongoDB C# Driver and I don't want that: I want my models to be DB-independent; only my DALs can depend on whatever data access technologies I use.
So what data type should I use for my POCOs' IDs in order to guarantee uniqueness in the DB? Would a string representation of a Guid be horrible in terms of performance?
Your feedback is welcome.
Good question.
From experience, I can say that you're right: neither GUIDs nor auto-increment is the best idea (with GUIDs being a lot better than auto-increment), not only for the reasons mentioned in the SO question you linked to, but mostly because you need to be aware of the implications of monotonic vs. non-monotonic keys.
With the ObjectIds, I see three options:
Map between the domain model and the DAL. In the domain model, you could use the ObjectId's string representation. That's a bit annoying, but it enforces separation of concerns.
Use your own data type and implement a type converter / MongoDB serializer. I haven't tried that, but I don't see why it wouldn't work (see the sketch after this list).
Accept the MongoDB dependency. After all, if you ever really swap out your database, that will be a huge task anyway. Different databases have very different characteristics and require very different data models. The whole "swap out the database in a minute" idea is bogus IMHO; it's never that easy, and a database is a much leakier abstraction than anyone wants to admit. Trying to stay independent is a PITA. Anyway, doing a seek-and-destroy on the word ObjectId would be less than 1% of the other work.
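For option 2, here is a minimal sketch of what the DAL-side registration could look like (Product and the bootstrap class are made up; the BsonClassMap calls are the MongoDB C# driver's mapping API). The model keeps a plain string Id, and the driver stores it as a real ObjectId:

using MongoDB.Bson;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.IdGenerators;
using MongoDB.Bson.Serialization.Serializers;

// Model project: no MongoDB references at all.
public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// DAL project: run once at startup.
public static class MongoMappings
{
    public static void Register()
    {
        BsonClassMap.RegisterClassMap<Product>(cm =>
        {
            cm.AutoMap();
            // Generate ObjectId-based string IDs and persist them
            // as native ObjectIds in the "_id" field.
            cm.MapIdProperty(p => p.Id)
              .SetIdGenerator(StringObjectIdGenerator.Instance)
              .SetSerializer(new StringSerializer(BsonType.ObjectId));
        });
    }
}

This keeps the uniqueness and index-friendliness of ObjectId while your POCOs stay DB-independent.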
I'm having trouble choosing an appropriate data access framework, partly because I'm very picky with my preferences and mostly because I don't have much experience with most of them :-)
I need a framework that will allow me to easily map between the DB tables (SQL Server) and my entities, and that will handle the CRUD operations for me (for the most part).
I want my entities to reside in a separate assembly from my DAL.
I prefer using attributes for the mappings over external files like XML.
It doesn't have to be an ORM, and I want to code my entities myself.
I don't mind writing stored procedures.
The project's database won't be very big. Less than 50 tables.
I'd like some of my entities to correspond to an inner join of two tables - one for static data entered manually during development and the other with data filled during runtime - without using two entities that reference one another (the result of this join will be a single entity).
Entity Framework sounded perfect until I realized it doesn't support Enums (yet - and I can't wait for EF 5.0).
I want these entities to include enums, and plan on using lookup tables for the enums plus code generation to keep them synchronized with the database.
Linq-to-SQL seems like a good candidate, but I don't know if it copes well with my previous demands.
Using Enterprise Library 5.0 DAAB with its RowMapper, and extending its abilities to perform updates and inserts, is also an option (but would require more coding on my part).
I plan on implementing the Repository Pattern.
How about NHibernate? Would it do? No experience there either.
I would be happy to hear all suggestions... the more the merrier! Thanks in advance!
I think NHibernate is the way to go, although some of its main strengths (ORM, stored procedure generation, etc.) are things you listed as non-requirements. Anyway, NHibernate will do everything you want it to do. Technically it uses XML mappings, but these can easily be generated for you with Fluent NHibernate's code-based mappings. I like this, as it IS done for you, but you still get the customization in case you need it. Good luck!
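For a taste of what that looks like, here is a minimal Fluent NHibernate sketch (the Order/Customer/OrderLine entities and table names are made up):

using System;
using System.Collections.Generic;
using FluentNHibernate.Mapping;

// Hypothetical entities; members are virtual so NHibernate can proxy them.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public class OrderLine
{
    public virtual int Id { get; set; }
    public virtual string Product { get; set; }
}

public class Order
{
    public virtual int Id { get; set; }
    public virtual DateTime CreatedOn { get; set; }
    public virtual Customer Customer { get; set; }
    public virtual IList<OrderLine> Lines { get; set; }
}

// Fluent NHibernate turns this into the hbm.xml mapping at startup.
public class OrderMap : ClassMap<Order>
{
    public OrderMap()
    {
        Table("Orders");
        Id(x => x.Id).GeneratedBy.Identity();
        Map(x => x.CreatedOn);
        References(x => x.Customer);          // many-to-one
        HasMany(x => x.Lines).Cascade.All();  // one-to-many
    }
}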
I'm learning DDD (domain-driven design) and the repository pattern (in C#). I would like to be able to use the repository pattern to persist an entity without caring which database is actually used (Oracle, MySQL, MongoDB, RavenDB, etc.). I am, however, not sure how to handle the database-specific IDs that most (all?) databases use. RavenDB, for example, requires that each entity it stores has an ID property of type string; others may require an ID property of type int. Since this is handled differently by different databases, I cannot make the database ID part of the entity class. But it has to exist at some point, at least when I store the actual entity. My question is: what is the best practice regarding this?
The idea I am currently pursuing is, for each database I want to support, to implement database-specific "value objects" for each business object type. These value objects would have the database-specific ID property, and I would map between the two on reads and writes. Does this seem like a good idea?
This is the classic case of leaking abstractions. You can't possibly abstract away the type of database under a repository interface unless you want to lose all the good things that come with each database. The requirements on ID type (string, Guid, or whatever) are only the very tip of a huge iceberg, with the majority of its mass under the muddy waters.
Think about transaction handling, concurrency, and other such concerns. I understand your point about persistence ignorance; it's certainly a good thing not to depend on a specific database technology in the domain model. But you can't get rid of every dependency on every persistence technology either.
It's relatively easy to make your domain model work well with any RDBMS, since most of them have standardized data types, and using an ORM like NHibernate will help you a lot. It's much harder to achieve the same among NoSQL databases, because they tend to differ a lot (which is actually a good thing).
So my advice would be to do some research on the set of possible persistence technologies you will have to deal with, and then choose an appropriate level of abstraction for the persistence subsystem.
If this won't work for you, think about Event Sourcing. The event store is one of the least demanding persistence techniques. Using a library such as Jonathan Oliver's EventStore will allow you to use virtually any storage technology, including the file system.
I would go ahead and create an int Id field in the entity and then convert it to a string in the repository where the ID must be a string. I think the effort to abstract your persistence is very worthwhile, and it actually eases maintenance.
You are doing the right thing! Abstract yourself away from the constraints of the databases' primary key types!
Don't try to translate types, just use a different field.
Specifically: Do not try to use the database's primary key, except in your data access logic. If you need a friendly ID for an object, just create an additional field, of whatever type you like, and require your database to store that. Only in your data access layer would you need to find & update the DB record(s) based on your object's friendly ID. Easy.
Then your constraints on which databases can persist your objects have changed from 'must be able to have a primary key of type xxxx' to simply 'must be able to store type xxxx'. I think you'll then find you can use any database in the world. Happy coding! DDD is the best!
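A minimal sketch of that idea (Shipment and the repository members are made up):

using System;

// The domain owns a friendly Guid identity; whatever primary key
// the database uses never leaves the DAL.
public class Shipment
{
    public Guid PublicId { get; private set; }

    public Shipment()
    {
        PublicId = Guid.NewGuid();
    }
}

public interface IShipmentRepository
{
    // The DAL resolves PublicId to its own primary key internally
    // (ideally via an indexed column/field).
    Shipment GetByPublicId(Guid publicId);
    void Save(Shipment shipment);
}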
You can potentially have the IDs in the entity but not expose them as part of the entity's public interface. This is possible with NHibernate because it allows you to map a table column to a private field.
So you can potentially have something like
class Customer
{
    // Populated by the persistence layer through field access;
    // not part of the entity's public API.
    private Int32? _relationalId;
    private string _documentId;
    ...
}
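For reference, the matching NHibernate mapping would use field access; a minimal hbm.xml sketch (the table, column, and Name property are assumptions):

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <class name="Customer" table="Customers">
    <!-- access="field" lets NHibernate set the private field directly,
         keeping the id off the entity's public interface -->
    <id name="_relationalId" access="field" column="CustomerId">
      <generator class="native" />
    </id>
    <property name="Name" />
  </class>
</hibernate-mapping>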
This is not ideal, because your persistence logic "bleeds" into your business logic, but given the requirements it is probably easier and more robust than maintaining a mapping between an entity and its ID somewhere outside the entity.

I would also highly recommend evaluating a "database-agnostic" approach, which would be more realistic if you only want to support relational databases. In that case you can at least reuse an ORM like NHibernate for your repository implementation, and most relational databases support the same ID types. In your scenario you not only need an ORM, you also need something like an "object-document mapper", and I can see that you would have to write tons and tons of infrastructure code. I highly recommend you reevaluate your requirements and choose between relational and document databases. Read this: Pros/cons of document-based databases vs. relational databases
We are using C#/.NET 4.0, VS 2010, EF 4.1, and legacy code in the project we are working on.
I'm working on a WinForms project where I have decided to start using Entity Framework 4.1 for accessing a MS SQL database. The code base is quite old, and we have an existing data layer that uses data adapters, which are used all over the place (in web apps and WinForms apps). My plan is to replace the old DB access code with EF over time and get rid of the tight coupling between the UI layers and the data layer.
My idea is to more or less combine EF with the legacy data access layer and slowly replace the legacy layer with a more modern take on things using EF. For now, then, we need to use both EF and the legacy DB access code.
What I have done so far is to add a project containing the edmx file and context. The edmx is generated using the database-first approach. I have also added another project that contains the POCO classes (by using the ADO.NET POCO Entity Generator). I have more or less followed Julia Lerman's approach in her book "Programming Entity Framework" on how to split the model and the generated POCO classes. The database model has been set for years and it's not an option to change the tables, relationships, triggers, stored procedures, etc., so I'm basically stuck with the DB model as it is.
I have read about the Repository pattern and Unit of Work, and I kind of like the patterns, but I struggle to implement them when I have both EF and the legacy DB access code to deal with. Especially since I don't have the time to replace all of the legacy DB access code with a pure EF implementation. In a perfect world I would start all over again with a fresh take on the data model, but that is not an option here.
Are the Repository and Unit of Work patterns the way to go here? In order to use the POCO classes in my business layer, I sometimes need both EF and the legacy DB code to populate my POCO classes. In other words, I can sometimes use EF to retrieve part of the data I need and then use the old DB access layer to retrieve the rest, and then map the data to my POCO classes. When I want to update some data, I need to pick data from the POCO classes and use the legacy data access code to store it in the database. So I need to map the data retrieved from the legacy data access layer to my POCO classes when I want to display the data in the UI, and vice versa when I want to save data to the database.
To complicate things, we store some data in tables that we don't know the name of before runtime (please don't ask me why :-) ). So in the old DB access layer, we had to create SQL statements on the fly, inserting the table and column names based on information from other tables.
I also find that the relationships between the POCO classes are somewhat too database-centric. In other words, I feel that I need a more simplified domain model to work with. Perhaps I should create a domain model that fits the bill and then use the POCO classes as "DAOs" to populate the domain model classes?
How would you implement this using the Repository pattern and Unit of Work pattern? (if that is the way to go)
Alarm bells are ringing for me! We tried to do something similar a while ago (only with NHibernate, not EF4). We had several problems running ADO.NET alongside an ORM - database concurrency being a big one.
"The database model has been set for years and it's not an option to change the tables, relationships, triggers, stored procedures, etc., so I'm basically stuck with the DB model as it is."
Yep. Same thing! The problem was that our stored procs contained a lot of business logic and weren't simple CRUD procs, so keeping the ORM in sync with the various updates performed by a stored procedure was not easy at all - the Single Responsibility Principle is not a good one to break!
"My plan is to replace the old DB access code with EF over time and get rid of the tight coupling between the UI layers and the data layer."
Maybe you could decouple without the need for an ORM - how about putting a service/facade layer in front of your UI layer to coordinate all interactions with the underlying domain and hide it from the UI?
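Something as simple as this (all names made up) already hides the data access from the UI; how it is implemented - ADO.NET today, EF later - stays behind the interface:

// A hypothetical facade the UI talks to.
public class CustomerDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public interface ICustomerService
{
    CustomerDto GetCustomer(int customerId);
    void RenameCustomer(int customerId, string newName);
}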
If your database is 'king' and your app is highly data-driven, I think you will always be fighting an uphill battle implementing the patterns you mention.
Embrace ADO.NET for this project - use EF4 and DDD patterns on your next greenfield project :)
EDMX + the POCO class generator results in EFv4 code, not EFv4.1 code, but you don't have to bother with these details. EFv4.1 just offers a different API which does exactly the same thing (it is only a wrapper around the EFv4 API).
Depending on how you use datasets, you can run into some very hard problems. Datasets are a representation of the change set pattern: they know what changes were made to the data, and they are able to store just those changes. EF entities know this only if they are attached to the context which loaded them from the database. Once you work with detached entities, you must make a big effort to tell EF what has changed - especially when modifying relations (detached entities are a common scenario in web applications and web services). For those purposes EF offers another template called self-tracking entities, but they have other problems and limitations (for example missing lazy loading, you cannot apply changes when an entity with the same key is attached to the context, etc.).
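To illustrate the detached-entity pain with the DbContext API (ShopContext and Customer are stand-ins, not your types):

using System.Data;          // EntityState lives here in EF 4.1
using System.Data.Entity;

public class CustomerUpdater
{
    public void Update(Customer detachedCustomer)
    {
        using (var context = new ShopContext())
        {
            // EF has no idea what changed, so the caller must say so.
            context.Customers.Attach(detachedCustomer);
            // Marks every scalar property as modified; changed relations
            // still have to be reconciled by hand.
            context.Entry(detachedCustomer).State = EntityState.Modified;
            context.SaveChanges();
        }
    }
}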
EF also doesn't support several features used in datasets - for example unique keys and batch updates. It's funny that newer MS APIs usually solve some of the pains of previous APIs, but at the same time provide far fewer features than those previous APIs, which introduces new pains.
Another problem can be performance - EF is slower than direct data access with datasets and has higher memory consumption (and yes, there are some memory leaks reported).
You can forget about using EF to access tables you don't know at design time. EF doesn't allow any dynamic behavior: table names and the type of database server are fixed in the mapping. Another problem can be the way you use triggers - ORM tools don't like triggers, and EF has limited support for database-computed values (a value can be filled either in the database or in the application, but not both).
Filling POCOs from EF + datasets sounds like something that will not be possible using EF alone. EF has some allowed mapping patterns, but the possibilities for mapping several tables to a single POCO class are extremely limited and constrained (if you want those tables to be editable). If you mean just loading one entity from EF and another entity from a data adapter and making a reference between them, you should be OK. In this scenario the repository sounds like a reasonable pattern, because the purpose of a repository is exactly this: to load or persist data. Unit of work can also be useful, because you will most probably want to reuse a single database connection between EF and the data adapters to avoid a distributed transaction when saving changes. The UoW will be the place responsible for handling this connection.
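A sketch of such a unit of work (ShopContext is a stand-in assumed to forward the DbContext(DbConnection, bool) constructor):

using System;
using System.Data.SqlClient;

// Owns one connection and hands it to both EF and the legacy data
// adapters, so a single local transaction can cover both.
public class UnitOfWork : IDisposable
{
    private readonly SqlConnection _connection;
    private readonly ShopContext _context;

    public UnitOfWork(string connectionString)
    {
        _connection = new SqlConnection(connectionString);
        _connection.Open();
        // DbContext can wrap an existing connection without owning it.
        _context = new ShopContext(_connection, contextOwnsConnection: false);
    }

    public ShopContext Context { get { return _context; } }

    // The legacy code builds its SqlDataAdapters over this same connection.
    public SqlConnection Connection { get { return _connection; } }

    public void Dispose()
    {
        _context.Dispose();
        _connection.Dispose();
    }
}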
EF mapping is related to database design - you can introduce some object-oriented modifications, but EF is still closely dependent on the database. If you want a more advanced domain model, you will probably need separate domain classes filled from EF and datasets. Again, it will be the responsibility of the repository to hide these details.
From what we have implemented so far, I have learned the following things.
POCO and self-tracking objects are difficult to deal with: if you do not have a solid understanding of what goes on inside them, you will hit a number of unexpected behaviors in code that may have worked well in your previous projects.
Changing patterns is not easy. So far we had been managing simple CRUD without the Unit of Work and Identity Map patterns; a lot of the legacy code we wrote in the past does not account for these new patterns, and its logic will not work correctly.
In our previous code, we were simply using transactions and single insert/update/delete statements sent directly to the database, assuming that transactions on the server side would take care of all operations.
Under those conditions, we were dealing with IDs directly all the time, and newly generated IDs were immediately available after a single insert statement; this is not the case with EF.
In EF, we are not dealing with IDs; we are dealing with navigation properties, which is a huge change from earlier ADO.NET programming methods.
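For example (Order and ShopContext are stand-ins), with EF the INSERT is deferred until SaveChanges, and only then is the identity value read back:

public int CreateOrder(ShopContext context)
{
    var order = new Order { CreatedOn = DateTime.Now };
    context.Orders.Add(order);
    context.SaveChanges();   // the INSERT executes here
    return order.Id;         // only now populated from the database-generated key
}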
From our experience, we found that simply swapping EF in for the earlier data access code will result in chaos. But EF + RIA Services offers a completely new solution where you will probably get everything you need, and your UI will bind to it very easily. So if you are thinking about a complete rewrite using UI + RIA Services + EF, it is worth it, because a lot of the query-management dependencies are reduced automatically. You will be focusing only on business logic. But this is a big decision, and the number of man-hours required for a complete rewrite versus just swapping in EF is almost the same.
So we went the UI + RIA Services + EF way, and we started replacing one module at a time. Mostly, EF will easily co-exist with your existing infrastructure, so there is no harm.
We are currently developing a new WinForms application (C# .NET 3.5).
The project is currently 40% complete; however, we're spending a considerable amount of time writing the DAL implementation (CRUD). We now want to move to NHibernate as an ORM solution to take advantage of its many benefits and to relieve some of the DAL coding work.
We would much rather concentrate on solving business problems.
At the current time we plan to migrate to NHibernate and Fluent NHibernate, but we have a few questions.
Is the change to NHibernate worth the steep learning curve? From a performance point of view, do you think NHibernate would be a more sensible option than continuing to write our own?
We currently employ "soft delete" and read data through views in the database which filter on "Deleted = null" (Deleted is a TIMESTAMP). From my understanding, when we map each class we can also specify a "where" clause, which means we would no longer need any "filtering" views in our database. Is that correct?
In relation to the question above: we also have a "Purge" function that can delete records from the database. Can we employ "soft delete" and still have a purge function?
Can we persist BLOBs to the database through NHibernate?
What would be the best migration strategy for us? How would you get started on an NHibernate migration, keeping in mind that the application has not been released and we are open to changing the database structure? Ideally I am thinking of mapping each of our business objects and then having NHibernate generate the schema for us. Does this sound like a good way to go?
Can NHibernate work with lookup data? We currently read lookup data into a global dictionary that we use throughout the life of the application. Can we still do this with NHibernate?
Apologies if some of these questions are elementary, I am still trying to get a handle on NHibernate.
(Answers to your question below, referencing the original question number)
Going to NHibernate is absolutely worth the learning curve - I did it at my current job, and we've never looked back. NHibernate in Action is an excellent book to start with.
You can easily include a 'Where' clause as part of your map. We use it for filtering some common-use tables and views in our NHibernate mappings.
For your purge function, just add a secondary map that reverses the where clause (or one without the flag filtered) and you're golden (we sometimes have several maps to the same entities for data shaping); see the sketch below.
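Something along these lines with Fluent NHibernate (entity names made up, assuming the nullable Deleted column from your question):

using FluentNHibernate.Mapping;

public class CustomerMap : ClassMap<Customer>
{
    public CustomerMap()
    {
        Table("Customers");
        Where("Deleted is null");   // soft-delete filter baked into every query
        Id(x => x.Id);
        Map(x => x.Name);
    }
}

// A second class mapped to the same table without the filter,
// used only by the purge feature.
public class PurgeableCustomerMap : ClassMap<PurgeableCustomer>
{
    public PurgeableCustomerMap()
    {
        Table("Customers");
        Id(x => x.Id);
        Map(x => x.Deleted);
    }
}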
Re BLOBs, etc., here's an article on them by Ayende, and one on Calyptus.
Migration is probably a larger question. Personally, we use a repository pattern with an interface for the repository (for unit testing and mocks), a concrete implementation of the repository, and our model (POCOs). We keep NHibernate-specific code out of everything but our repositories, to reduce dependencies and to aid in testing.
Again, look at NHibernate in Action for some great info on the product, as well as NHForge.org, TekPub's NHibernate series, etc. (I even have some tutorials on my blog, linked in my profile).
For lookup data, NHibernate works fine, and it also supports caching.
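For example, a lookup map can be made read-only and cached (Country is made up), so the dictionary-style lookups stay cheap:

using FluentNHibernate.Mapping;

public class CountryMap : ClassMap<Country>
{
    public CountryMap()
    {
        Table("Countries");
        ReadOnly();         // mutable="false": NHibernate never updates these rows
        Cache.ReadOnly();   // second-level cache, shared across sessions
        Id(x => x.Id);
        Map(x => x.Name);
    }
}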