I have come across a situation where it would be better to represent a particular part of my domain model as relational. I read the section on database references in MongoDB and understand you can provide references to multiple documents by providing a JSON array of $ref references to various documents in a foreign collection.
All the examples I've seen for adding a reference to a foreign document in code have been for a single document only, and they create a public property of type MongoDBRef. In my opinion there is a lot of unnecessary overhead with this approach, and it also doesn't make it clear what to do about storing references to multiple documents.
If you would like to provide a one-to-many relationship between foreign documents in Mongo, is it necessary to provide a collection property containing MongoDBRef objects? Is it possible to stick to a collection of standard entity objects in my C# code and map it to Mongo documents using BsonClassMap?
Below is a simple class that represents the model I currently have. It seems to be saving the document and references correctly, but I don't like exposing a public collection of MongoDBRef objects, or the overhead it creates for anybody using the Person class to add new documents.
If it matters, I'm using MongoDB 2.0 and their C# driver.
// This is how my class currently looks
public class Person
{
    public string Name { get; set; }
    public List<MongoDBRef> Vehicles { get; private set; }

    public Person()
    {
        Vehicles = new List<MongoDBRef>();
    }
}

// This is what I want my class to look like
public class Person
{
    public string Name { get; set; }
    public List<Vehicle> Vehicles { get; private set; }

    public Person()
    {
        Vehicles = new List<Vehicle>();
    }
}
DBRefs are not the appropriate tool for storing references to a known document type. Instead, just save the _id values of the referenced documents. Given a good mapping library (I'm not sure about C#, but whatever the C# equivalent of pymongo, mongoose, morphia, etc. is), it will let you do exactly what you want.
DBRefs should only be used if you do not know at compile time what kind of document you need to store a reference to (e.g. a "content" field that either holds an Image or a Text, etc.).
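A minimal sketch of that approach with the C# driver, assuming Vehicle documents get their own collection and an ObjectId id (the VehicleIds name is just illustrative):

public class Person
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }

    // Plain _id references into a separate vehicles collection.
    public List<ObjectId> VehicleIds { get; private set; }

    public Person()
    {
        VehicleIds = new List<ObjectId>();
    }
}

Resolving the references is then an explicit second query, for example with Builders<Vehicle>.Filter.In(v => v.Id, person.VehicleIds).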
I have the following class hierarchy
public class Beneficiary
{
    public string Id { get; set; }
    public string Name { get; set; }
    public InvoiceNumber InvoiceNumber { get; set; }
    public ICollection<Invoice> Invoices { get; set; }
}

public class InvoiceNumber
{
    public string Current { get; set; }
    public DateTime IssueDate { get; }
}

public class Invoice
{
    public string Number { get; set; }
    public DateTime IssueDate { get; }
    public ICollection<InvoiceEntry> InvoiceEntries { get; set; }
}

public class InvoiceEntry
{
    public decimal BillableHours { get; set; }
}
Up till now I've used EF to configure relations, but I would like to move to Mongo (this is just for learning purposes).
In Entity Framework I know how to map this type of hierarchy, but I don't know how in Mongo.
As you can see, only Beneficiary has an Id attached and generated; the rest is just a dependent hierarchy, so if the beneficiary gets deleted, everything else should get deleted with it.
Can I map such a class structure in Mongo? I'm using the fluent API for mapping:
BsonClassMap.RegisterClassMap<Beneficiary>(type =>
{
    type.MapIdProperty(prop => prop.Id)
        .SetIdGenerator(new StringObjectIdGenerator())
        .SetSerializer(new StringSerializer(BsonType.ObjectId));
    type.MapProperty(prop => prop.Name)
        .SetIsRequired(true);
});
but I'm stuck because I don't know how to continue mapping the complex data types.
TL;DR: After having decided on the document structure, you add maps for the classes if you want to customize the way they are serialized, as you already did for Beneficiary. For the most part, the defaults work well and the driver will (de)serialize your class structure to a document just fine.
If your data model is more complicated, you will have several document types; relationships are not depicted using navigation properties like in Entity Framework, but need dedicated query and update statements.
Before mapping the data, you should decide how to model it. In this regard, MongoDB offers more options than the relational "normalize-everything" approach. When it comes to relationships between data types, the following factors come into play:
Which type of relationship is it? 1:1 as in Beneficiary <-> InvoiceNumber, 1:n (where n is a few), or 1:z (where z means a lot, like zillions)? As regards the relationship Beneficiary <-> Invoices, you have to check which of the latter two is the case for most of the documents. Including a few invoices in Beneficiary is OK, but storing a huge number of invoices means putting them in a separate collection (or at least in separate documents).
Do you query the data together? If you query all the data together in most of the cases, putting all of them in a single document is an option; if you do not query them together or query only a subset of invoices (e.g. the last ten), you can have a look at the Subset pattern where Beneficiary stores only the latest invoices and the others are put into a separate collection.
If you load a Beneficiary, do you need all data for the invoices or only a part of it? If for instance you want to show the Beneficiary with an overview of the latest ten invoices, you could opt for a combination of the Subset and Extended reference patterns. This means you store Invoices in a separate collection, but include the most important data (e.g. id, number, date, totals) in the Beneficiary document to be able to fill the overview from a single document.
Are there outliers in the number of invoices per Beneficiary? Do most Beneficiaries have 10 invoices, but a few have several thousand? In this case, have a look at the Outlier pattern, which prepares your documents to store invoices in the Beneficiary document but adds a flag signalling that more invoices can be found in a different collection.
How often do you update the documents? Do you need to update Beneficiary every time you add an invoice?
How important are atomic updates to you? If all of the data are in a single document, all updates to this document are atomic whereas you need transactions if consistency is as important as it is in the relational world. I could imagine that InvoiceNumber is updated every time an Invoice is created and that consistency matters in this aspect.
Another option is to put Beneficiary and Invoices in separate documents in the same collection. You discern between the document types with a type discriminator (usually the _t property) but you have the advantage to be able to load both the Beneficiary and the corresponding Invoices with a single statement.
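If that option appeals to you, here is a hedged sketch of how it could look with the C# driver; the shared base class, the collection name, and changing both classes to derive from the base are assumptions made purely for illustration:

// A shared base class makes the driver write a "_t" discriminator into each
// document (Beneficiary and Invoice would be changed to inherit from it).
[BsonKnownTypes(typeof(Beneficiary), typeof(Invoice))]
public abstract class BillingDocument
{
}

// One collection holds both document types...
IMongoCollection<BillingDocument> billing = database.GetCollection<BillingDocument>("billing");

// ...and each type can still be queried on its own via the discriminator.
List<Invoice> invoices = await billing.OfType<Invoice>()
    .Find(Builders<Invoice>.Filter.Empty)
    .ToListAsync();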
After having decided on the structure, mapping the data is usually not too complicated. Compared to Entity Framework, you do not have navigation properties; you only create properties for the classes that are contained in your Beneficiary document. If your class contains subdocuments (like Beneficiary.InvoiceNumber), the driver stores the value as a subdocument. Likewise, in your current structure, a collection like Beneficiary.Invoices is stored as an array. The other way round, when loading a Beneficiary, the properties are deserialized from the BSON document, so Beneficiary.InvoiceNumber will contain an object of type InvoiceNumber (unless it is null) and Beneficiary.Invoices will contain a collection of Invoice objects.
If you want to customize the way that the other classes are serialized, you create another mapping as you already did for Beneficiary.
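For instance, a minimal sketch of one more such map; the element name and the required flag are illustrative assumptions, not something your model demands:

BsonClassMap.RegisterClassMap<Invoice>(type =>
{
    // Invoice is embedded in the Beneficiary document, so it needs no id map;
    // AutoMap picks up all members and the call below only tweaks one of them.
    type.AutoMap();
    type.MapProperty(prop => prop.Number)
        .SetElementName("number")
        .SetIsRequired(true);
});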
However, if you decide to store invoices in separate documents, you do not have the same comfort as in Entity Framework when loading the data (e.g. Include), but have to create separate query and update statements for Invoices (maybe with transactions if - and only if - required).
I am very new to MongoDB (I only spent a day learning). I have a relatively simple problem to solve, and I chose to take the opportunity to learn about this popular NoSQL database.
In C# I have the following classes:
public class Item
{
    [BsonId]
    public string ItemId { get; set; }
    public string Name { get; set; }
    public ICollection<Detail> Details { get; set; }
}

public class Detail
{
    //[BsonId]
    public int DetailId { get; set; }
    public DateTime StartDate { get; set; }
    public double Qty { get; set; }
}
I want to be able to add multiple objects (Details) to the Details collection. However, I know that some of the items I have (coming from a REST API) will already be stored in the database, and I want to avoid duplicates.
So far I can think of two ways of doing it, but I am not really happy with either:
1. Get all stored details (per item) from MongoDB, then filter in .NET, find the new items, and add them to the DB. This way I can be sure that there will be no duplicates. That is, however, far from an ideal solution.
2. I can add the [BsonId] attribute to DetailId (without this attribute this solution does not work) and then use AddToSetEach. This works, and my only problem with it is that I don't quite understand it. I mean, it's supposed to only add the new objects if they do not already exist in the database, but how does it know? How does it compare the objects? Do I have any control over that comparison process? Can I supply custom comparers? Also, I noticed that if I pass two objects with the same DetailId (this should never happen in the real app), it still adds both, so the [BsonId] attribute does not guarantee uniqueness?
Is there any elegant solution for this problem? Basically I just want to update the Details collection by passing another collection (which I know contains some objects already stored in the DB, i.e. in the first collection) and ignore all duplicates.
The AddToSetEach-based version is certainly the way to go, since it is the only one that scales properly.
I would, however, recommend that you drop the entire DetailId field unless it is really required for some other part of your application. Judging from a distance, it would appear that any entry in your list of item details is uniquely identifiable by its StartDate field (plus potentially Qty, too). So why would you need the DetailId on top of that?
That leads directly to your question of why adding a [BsonId] attribute to the DetailId property does not result in guaranteed uniqueness inside your collection of Detail elements. One reason is that MongoDB simply cannot do it (see this link). The second reason is that the MongoDB C# driver does not create a unique index or attempt other magic to ensure uniqueness here - probably because of reason #1. ;) All the [BsonId] attribute does is tell the driver to serialize the attributed property as the "_id" field (and map it back the other way round upon deserialization).
On the topic of "how does MongoDB know which objects are already present", the documentation is pretty clear:
If the value is a document, MongoDB determines that the document is a duplicate if an existing document in the array matches the to-be-added document exactly; i.e. the existing document has the exact same fields and values and the fields are in the same order. As such, field order matters and you cannot specify that MongoDB compare only a subset of the fields in the document to determine whether the document is a duplicate of an existing array element.
And, no, there is no option to specify custom comparers.
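For reference, a minimal sketch of the AddToSetEach update with the 2.x C# driver; itemsCollection, itemId and newDetails are assumed to be an IMongoCollection<Item>, the target item's id, and the incoming list of Detail objects, respectively:

// $addToSet with $each: adds only those Detail documents that do not already
// appear (field for field, in the same order) in the item's Details array.
var filter = Builders<Item>.Filter.Eq(i => i.ItemId, itemId);
var update = Builders<Item>.Update.AddToSetEach(i => i.Details, newDetails);
await itemsCollection.UpdateOneAsync(filter, update);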
What I want to do is store an array of ints in my database. I am using Code First Entity Framework to create my database, but I can't seem to create a List or array element in the database: the List property doesn't create a field. I gather that this is because it would need a separate table.
What is the best way to store my array in a single field here? Should I convert the array to a string, or is there an easier way?
My model contains:
public string Title { get; set; }
public List<int> IdArray { get; set; }
NHibernate supports this, but EF doesn't.
If you need to do it, the only option is to store the list serialized. For example, you can use:
an XML column, and store it serialized as XML
a char column, and store it serialized as JSON
EF will not automatically serialize/deserialize it. You should add a property that does the serialization/deserialization for you:
public string IdsColumn { get; set; } // Mapped to DB

[NotMapped]
public List<int> Ids
{
    get { return Deserialize(IdsColumn); }
    set { IdsColumn = Serialize(value); }
}
Of course, you have to provide the implementation of the serialization functions, using XmlSerializer or JSON.NET, for example.
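For illustration, a minimal sketch of those two helpers using JSON.NET; treating a null or empty column as an empty list is an assumption, not a requirement:

private static List<int> Deserialize(string column)
{
    // An empty column maps to an empty list rather than to null.
    return string.IsNullOrEmpty(column)
        ? new List<int>()
        : JsonConvert.DeserializeObject<List<int>>(column);
}

private static string Serialize(List<int> ids)
{
    return JsonConvert.SerializeObject(ids ?? new List<int>());
}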
If you want this functionality implemented, look here: EF data user voice
For example: "Richer collection support, including ordered collections and dictionaries of value types".
You can vote for it.
I'm not too sure of your situation, but most of the time this isn't necessary...
Basically, if you arrive at such a situation you should look at normalizing your tables further. Data in tables needs to be "atomic"; once retrieved as objects, it can be in a list.
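To make the normalization suggestion concrete, a rough sketch of what it could look like in Code First; the MyEntity and IdValue names are made up for illustration:

public class MyEntity
{
    public int Id { get; set; }
    public string Title { get; set; }

    // Each int becomes a row in a child table instead of one serialized column.
    public virtual ICollection<IdValue> IdArray { get; set; }
}

public class IdValue
{
    public int Id { get; set; }
    public int Value { get; set; }

    // Foreign key back to the owning entity; EF picks this up by convention.
    public int MyEntityId { get; set; }
}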
In DocumentDb, what is the best way and place to decouple data in order to save them in separate collections?
So far, most of the examples of how to manage data with DocumentDb use simple objects but in real life, we hardly ever do. I just want to understand how and where I need to handle my complex classes before I save them as Json objects in DocumentDb.
Let's look at the following example. I'll be saving my project information into the Projects collection, but I do NOT want to save the full names of people on the project team within a project document. I just want to save their EmployeeIds in the project document. I have a separate Employees collection where I want to save person/employee-specific information. My project object looks like this:
public class Project
{
    [JsonProperty(PropertyName="id")]
    public int ProjectId {get; set;}

    [JsonProperty(PropertyName="projectName")]
    public string ProjectName {get; set;}

    [JsonProperty(PropertyName="projectType")]
    public string ProjectType {get; set;}

    [JsonProperty(PropertyName="projectTeam")]
    public List<TeamMember> ProjectTeam {get; set;}
}
My TeamMember class inherits from Employee object and looks like this:
public class TeamMember : Employee
{
    [JsonProperty(PropertyName="position")]
    public string Position {get; set;}
}
My Employee class looks like this:
public class Employee
{
    [JsonProperty(PropertyName="id")]
    public int EmployeeId {get; set;}

    [JsonProperty(PropertyName="firstName")]
    public string FirstName {get; set;}

    [JsonProperty(PropertyName="lastName")]
    public string LastName {get; set;}

    [JsonProperty(PropertyName="gender")]
    public string Gender {get; set;}

    [JsonProperty(PropertyName="emailAddress")]
    public string EmailAddress {get; set;}
}
Before saving to Projects collection, here's an example of what my Project document should look like:
{
    "id": 12345,
    "projectName": "My first project",
    "projectType": "Construction Project",
    "projectTeam": [
        { "id": 7777, "position": "Engineer" },
        { "id": 8998, "position": "Project Manager" }
    ]
}
As you can see, I decoupled my project information from the employee data in order to store them in their own collections, Projects and Employees collections respectively.
Let's not get into why I should or should not decouple data. I just want to see how and where I should handle decoupling in order to produce the fastest results. I want to follow best practices, so I just want to see how experts working with DocumentDb handle this scenario.
I can think of two places to handle this but I want to understand if there's a better, more direct way to do this:
I can convert my Project class into a JSON object within my C# code and pass the JSON object to DocumentDb for storage.
Alternatively, I can pass my Project object directly to DocumentDb, into a JavaScript stored procedure and I can handle decoupling and storing data in two or more collections within DocumentDb.
Here's what I'd like to know:
Which is the right place to handle decoupling data?
Which would provide better performance?
Is there a better way to handle this? I keep reading about how I can just pass my POCO classes to DocumentDb and it will just handle them for me. Would DocumentDb handle this more complex scenario? If so, how?
I appreciate your help. Thank you.
In a NoSQL store such as this, you can store different types of documents with different schemas in the same collection.
Please do not treat collections as tables. Think of collections rather as units of partition and as boundaries for the execution of queries, transactions, etc.
So with that in mind, there is nothing wrong with storing your project document as shown and including the employee documents in the same collection.
Now, all of that said, if you still wanted to do this, then you can...
In order to achieve this, your Project object would have to change.
Instead of having TeamMember : Employee (which would include the entire Employee object), have the TeamMember object mimic what you want in your JSON, i.e.:
public class TeamMember
{
    public int id { get; set; }
    public string position { get; set; }
}
Now when DocumentDB serializes your Project object, you would end up with JSON that resembles what you wanted. You could then separately save your Employee objects somewhere else.
If you don't want to do this, or can't do it because you don't control the definition of the model, or because other parts of the system already depend on it, then you could investigate building a custom JSON converter for your Project object that spits out the JSON you want.
Then decorate your Project object with that JsonConverter, and when DocumentDB does the conversion, the correct result would be produced each time.
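A rough sketch of that converter idea using JSON.NET (Newtonsoft.Json and Newtonsoft.Json.Linq), which the DocumentDB SDK relies on for serialization; here the converter targets the TeamMember items rather than the whole Project, and the class name is made up:

public class TeamMemberRefConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(TeamMember);
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        // Write only the reference fields; the full Employee data stays out
        // of the Projects collection.
        var member = (TeamMember)value;
        writer.WriteStartObject();
        writer.WritePropertyName("id");
        writer.WriteValue(member.EmployeeId);
        writer.WritePropertyName("position");
        writer.WriteValue(member.Position);
        writer.WriteEndObject();
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        // Reading back fills only the reference fields; the rest of the
        // Employee would have to be loaded from the Employees collection.
        var obj = JObject.Load(reader);
        return new TeamMember
        {
            EmployeeId = (int)obj["id"],
            Position = (string)obj["position"]
        };
    }
}

It could then be hooked up with something like [JsonProperty(PropertyName="projectTeam", ItemConverterType = typeof(TeamMemberRefConverter))] on the ProjectTeam property.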
I have the following in Entity Framework.
Table - Country
Fields
Country_ID
Dialing_Code
ISO_Alpha2
ISO_Alpha3
ISO_Full
I would like to map only selected fields from this entity model to my domain class.
My domain model class is
public class DomainCountry
{
    public int Country_ID { get; set; }
    public string Dialing_Code { get; set; }
    public string ISO_3166_1_Alpha_2 { get; set; }
}
The following will work; however, insert or update is not possible. To get insert or update we would need to use ObjectSet<>, but that won't work in my case.
IQueryable<DomainCountry> countries =
    context.Countries.Select(c =>
        new DomainCountry
        {
            Country_ID = c.Country_Id,
            Dialing_Code = c.Dialing_Code,
            ISO_3166_1_Alpha_2 = c.ISO_3166_1_Alpha_2
        });
Is there a nice solution for this? It would be really fantastic.
Ideally it would be some kind of proxy class that supports all the features yet is highly customizable.
That is, it would expose only the columns we want to expose to the outside world.
The term for "plain .NET classes" is POCO - plain old CLR objects (inspired by POJO, plain old Java objects).
Read this blog post series, it helped me a lot:
http://blogs.msdn.com/b/adonet/archive/2009/05/21/poco-in-the-entity-framework-part-1-the-experience.aspx
I want to do the same thing. My goal is to build a WCF service that can use the same set of objects as the application I'm building, by sharing a DLL and sending/receiving the same classes. Additionally, I wanted to limit which fields are exposed. After thinking about this for a while, it seems a user-defined cast might do the trick. Have a look and see if it works for you:
http://www.roque-patrick.com/windows/final/bbl0065.html
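As a rough illustration of the user-defined cast idea; the Country entity's property names are assumed from the field list in the question:

public class DomainCountry
{
    public int Country_ID { get; set; }
    public string Dialing_Code { get; set; }
    public string ISO_3166_1_Alpha_2 { get; set; }

    // Explicit conversion from the full EF entity to the slimmed-down domain class.
    public static explicit operator DomainCountry(Country country)
    {
        return new DomainCountry
        {
            Country_ID = country.Country_ID,
            Dialing_Code = country.Dialing_Code,
            ISO_3166_1_Alpha_2 = country.ISO_Alpha2
        };
    }
}

Usage would then look like var domainCountry = (DomainCountry)context.Countries.First(); note that the cast runs in memory, so the projection shown in the question is still the way to go for bulk, server-side queries.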