In DocumentDb, what is the best way and place to decouple data in order to save them in separate collections?
So far, most of the examples of how to manage data with DocumentDb use simple objects, but in real life we hardly ever work with simple objects. I just want to understand how and where I need to handle my complex classes before I save them as JSON objects in DocumentDb.
Let's look at the following example. I'll be saving my project information into the Projects collection but I do NOT want to save full names of people in the project team within a project document. I just want to save their EmployeeId's in the project document. I have a separate Employees collection where I want to save person/employee specific information. My project object looks like this:
public class Project
{
[JsonProperty(PropertyName="id")]
public int ProjectId {get; set;}
[JsonProperty(PropertyName="projectName")]
public string ProjectName {get; set;}
[JsonProperty(PropertyName="projectType")]
public string ProjectType {get; set;}
[JsonProperty(PropertyName="projectTeam")]
public List<TeamMember> ProjectTeam {get; set;}
}
My TeamMember class inherits from Employee object and looks like this:
public class TeamMember : Employee
{
[JsonProperty(PropertyName="position")]
public string Position {get; set;}
}
My Employee class looks like this:
public class Employee
{
[JsonProperty(PropertyName="id")]
public int EmployeeId {get; set;}
[JsonProperty(PropertyName="firstName")]
public string FirstName {get; set;}
[JsonProperty(PropertyName="lastName")]
public string LastName {get; set;}
[JsonProperty(PropertyName="gender")]
public string Gender {get; set;}
[JsonProperty(PropertyName="emailAddress")]
public string EmailAddress {get; set;}
}
Before saving to Projects collection, here's an example of what my Project document should look like:
{
id: 12345,
projectName: "My first project",
projectType: "Construction Project",
projectTeam: [
{ id: 7777, position: "Engineer" },
{ id: 8998, position: "Project Manager" }
]
}
As you can see, I decoupled my project information from the employee data in order to store them in their own collections, Projects and Employees collections respectively.
Let's not get into why I should or should not decouple data. I just want to see how and where I should handle decoupling in order to produce fastest results. I want to follow the best practices so I just want to see how experts working with DocumentDb handle this scenario.
I can think of two places to handle this but I want to understand if there's a better, more direct way to do this:
I can convert my Project class into a JSON object within my C# code and pass the JSON object to DocumentDb for storage.
Alternatively, I can pass my Project object directly to DocumentDb, into a JavaScript stored procedure and I can handle decoupling and storing data in two or more collections within DocumentDb.
Here's what I'd like to know:
Which is the right place to handle decoupling data?
Which would provide better performance?
Is there a better way to handle this? I keep reading about how I can just pass my POCO classes to DocumentDb and it will just handle them for me. Would DocumentDb handle these more complex scenarios? If so, how?
I appreciate your help. Thank you.
In a NoSQL store such as this, you can store different types of documents with different schemas in the same collection.
Please do not treat collections as tables. Think of collections rather as units of partition and boundaries for the execution of queries, transactions, etc.
So with that in mind, there is nothing wrong with storing your project document as shown and including the employee documents in the same collection.
Having said all that, if you still want to do this, then you can.
In order to achieve this, your Project object would have to change.
Instead of having TeamMember : Employee (which would include the entire Employee object), have the TeamMember object mimic what you want from your JSON, i.e.
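public class TeamMember
{
[JsonProperty(PropertyName="id")]
public int EmployeeId {get; set;}
[JsonProperty(PropertyName="position")]
public string Position {get; set;}
}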
Now when DocumentDB serializes your project object you would end up with JSON that resembled what you wanted. And then you could separately save your Employee object somewhere else.
If you don't want to do this, or can't do this because you don't control the definition of the Model or because other parts of the system are built to depend on this already then you could investigate building a custom JSON converter for your Project object that would spit out the JSON you wanted.
Then decorate your Project object with that JsonConverter and when DocumentDB does the conversion the correct result would be created each time.
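As a serializer-independent sketch of the first suggestion, the split can also be done with a small mapping step before anything is saved. The types ProjectDocument, TeamMemberRef and ProjectSplitter are hypothetical names, not part of DocumentDB or Json.NET, and the JsonProperty attributes from the question are left out so the sketch has no dependencies:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Domain types, trimmed from the question.
public class Employee
{
    public int EmployeeId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class TeamMember : Employee
{
    public string Position { get; set; }
}

public class Project
{
    public int ProjectId { get; set; }
    public string ProjectName { get; set; }
    public List<TeamMember> ProjectTeam { get; set; } = new List<TeamMember>();
}

// Hypothetical storage shapes: what would go into the Projects collection.
public class TeamMemberRef
{
    public int Id { get; set; }
    public string Position { get; set; }
}

public class ProjectDocument
{
    public int Id { get; set; }
    public string ProjectName { get; set; }
    public List<TeamMemberRef> ProjectTeam { get; set; }
}

public static class ProjectSplitter
{
    // Keeps only the employee id and position for each team member.
    public static ProjectDocument ToDocument(Project project)
    {
        return new ProjectDocument
        {
            Id = project.ProjectId,
            ProjectName = project.ProjectName,
            ProjectTeam = project.ProjectTeam
                .Select(m => new TeamMemberRef { Id = m.EmployeeId, Position = m.Position })
                .ToList()
        };
    }

    // Copies the employee-specific fields into plain Employee objects
    // that could be saved to the Employees collection.
    public static List<Employee> ToEmployees(Project project)
    {
        return project.ProjectTeam
            .Select(m => new Employee
            {
                EmployeeId = m.EmployeeId,
                FirstName = m.FirstName,
                LastName = m.LastName
            })
            .ToList();
    }
}
```

The result of ToDocument(project) would then go to the Projects collection, and each item from ToEmployees(project) to the Employees collection.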
I have an idea for a web app where I will want the user to create their own database through a web application, with their own table names and field types.
I thought about creating a database structure using Object Oriented Programming so that a pre-made database will support all kinds of Entities with custom properties. Something like this:
CustomType
{
public long TypeId {get;set;}
public string ActiveType {get;set;}
}
CustomProperty
{
public int wholeNumber {get;set;}
public string text {get;set;}
public bool boolean {get;set;}
public decimal dec {get;set;}
// Chosen Id of the type to work with
public long TypeId {get;set;}
public bool wholeNumber_ACTIVE {get;set;}
public bool text_ACTIVE {get;set;}
public bool boolean_ACTIVE {get;set;}
public bool dec_ACTIVE {get;set;}
}
CustomEntity
{
public string TableName {get;set;}
public CustomProperty Prop01 {get;set;}
public CustomProperty Prop02 {get;set;}
public CustomProperty Prop03 {get;set;}
public CustomProperty Prop04 {get;set;}
public CustomProperty Prop05 {get;set;}
}
The idea behind this is to let the user decide what they want their database to store, on a pre-made database for them to work with, without having to create it during runtime since this is a web app.
I believe I can manage it like this for them to store whatever they need, but I'm also thinking about the following issues:
How will I manage relationships when the user needs to link tables with Ids and foreign keys?
(I thought about managing a public long ForeignId {get;set;} and just storing the Id they need to associate.)
How will I manage queries, since tables will have CodeNames and each will have a different meaning for each person that sets it up?
(I thought about renaming the table during runtime, but I'm afraid of errors and DB corruption.)
I also thought about sending direct queries to create the database according to the user's needs, but then again inexperienced users can really mess up here or find it hard to manage.
How can I manage migrations or DB changes with code instead of using the PowerShell console?
If we have multiple users, each with a unique database but the same web app, how can we manage web.config files to work with this idea?
I know there are a lot of questions here. I'm looking for the best way to achieve this: having multiple users own their small web app through the internet, using the MVC pattern, with lots of options available through a browser.
I would recommend an Entity Attribute Value (EAV) pattern as a solution. With the EAV pattern, rather than creating new tables with new columns for every custom property you wish to store, you store those properties in rows. For example, instead of every custom table being defined like this:
You define them like this instead:
This allows for flexible creation of multiple entities with multiple properties. The classes in your business logic will then be Entity classes with a collection of Property objects.
In case you haven’t spotted the trade-off already, the limitation of the EAV model is the inability to specify field types (int, varchar, decimal, etc.). In fact, all your property values will be stored as a single type (usually strings).
There are a number of ways to address this. Some handle all the validation in the business logic; others create Field tables per type, so based on my example, rather than having just one EntityFields table, you’ll have several, separated by type.
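A minimal sketch of how the business-logic side of this might look, assuming values are stored as strings and validated on read. The class and member names (Entity, EntityField, Set, TryGetInt) are made up for illustration and are not taken from any library:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// One row of the EAV table: a property name and its value stored as text.
public class EntityField
{
    public string Name { get; set; }
    public string Value { get; set; }
}

public class Entity
{
    public string EntityName { get; set; }
    public List<EntityField> Fields { get; } = new List<EntityField>();

    // Adds the property or overwrites an existing one; the value is stored as a string.
    public void Set(string name, object value)
    {
        string text = Convert.ToString(value, CultureInfo.InvariantCulture);
        EntityField field = Fields.FirstOrDefault(f => f.Name == name);
        if (field == null)
        {
            Fields.Add(new EntityField { Name = name, Value = text });
        }
        else
        {
            field.Value = text;
        }
    }

    // Type validation lives in the business logic:
    // returns false for a missing property or a non-integer value.
    public bool TryGetInt(string name, out int result)
    {
        result = 0;
        EntityField field = Fields.FirstOrDefault(f => f.Name == name);
        return field != null
            && int.TryParse(field.Value, NumberStyles.Integer, CultureInfo.InvariantCulture, out result);
    }
}
```

With per-type Field tables, the single string Value would instead be replaced by one typed column per table, at the cost of more joins.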
I'm creating a project which will parse .html files into a database (SQLite or some other kind; it's not important yet). The database will have many tables and relationships, and the full schema would be difficult to understand, so I'll show you a simpler schema.
For example:
Models:
Subject: SubjectId, Name
Teacher: TeacherId, SubjectFk, Name, Surname
ClassRoom: ClassRoomId, year
Student: StudentId, ClassRoomFk, Name, Surname
Relations (it's not important!):
One subject is taught by multiple teachers; one teacher teaches only one subject
One ClassRoom contains many students; one student belongs to only one ClassRoom
The pair TeacherId, ClassRoomId is unique (in one ClassRoom a subject can be taught by only one particular teacher; several teachers cannot teach the same subject in the same ClassRoom. I'm not sure, but it's not important).
Now I build a project hierarchy:
ParseData - solution name
ParseData.Repository - contains App.config, which holds the app setting for the root folder where the data is located.
ParseData.Domain - classes for the data model that will be parsed, for example:
public class Student
{
public int StudentId {get;set;}
public string Name {get;set;}
public string Surname {get;set;}
public int ClassRoomFk {get;set;}
}
public class ClassRoom
{
public int ClassRoomId {get; set;}
public List<Student> Students {get;set;}
}
ParseData.Core - contains all the algorithms and classes which will read a file from a path and convert the data to the class model, for example:
public class StudentParse : IEntity<Student>
{
public Student Student {get; set;}
public StudentParse(string filePathWithDataForStudentsFromParticularClass, int ClassRoomFk) { (...) }
/* All methods, which will parse data to StudentModel */
}
public class ClassRoomParse : IEntity<ClassRoom>
{
public ClassRoom ClassRoom {get;set;}
public ClassRoomParse(string filePathWithClassRoomsData) { (...) }
/* All methods, which will parse data to ClassRoomModel */
}
public interface IParser
{
string filePathToMainFile {get;set;}
List<ClassRoom> Start();
}
ParseData.UI - Console application. Here I can write some code which will present the results of fetching the data. For example:
IParser parser = new Parser(Repository.MainFilePath);
List<ClassRoom> classRooms = parser.Start();
/* LINQ or other actions..save to file or something else */
I'm looking for advice on how to organize my solution according to best practices. I am open to criticism of my approach and inexperience.
Another option besides a ReadRepository and WriteRepository I mentioned earlier is to apply the Extract, Transform, Load (ETL) method. This is a clearer approach, mainly because the data is one-way only.
The solution would have a ParseData.Extract project (the ReadRepository) that loads the data from the HTML files into DTOs which match the data structure of the HTML files. A ParseData.Transform project would transform the DTOs to the database models (the entities if you're using Entity Framework for example). A ParseData.Load project would then serve as the WriteRepository mentioned earlier to save the entities to the database.
The ParseData.UI project can still be used to orchestrate the ETL process.
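As a rough sketch of the Transform step only, assuming the extracted DTO carries a full name and a class room label as raw text. StudentDto, StudentEntity, StudentTransform and the classRoomIds lookup are hypothetical and would need to match the real HTML and database:

```csharp
using System;
using System.Collections.Generic;

// Extract output: mirrors the raw text found in the HTML (hypothetical shape).
public class StudentDto
{
    public string FullName { get; set; }
    public string ClassRoomLabel { get; set; }
}

// Load input: matches the Student model from the question.
public class StudentEntity
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public int ClassRoomFk { get; set; }
}

public static class StudentTransform
{
    // Splits the full name at the first space and resolves the class room label to its key.
    // Throws KeyNotFoundException if the label is not in the lookup.
    public static StudentEntity ToEntity(StudentDto dto, IDictionary<string, int> classRoomIds)
    {
        string[] parts = dto.FullName.Trim().Split(new[] { ' ' }, 2);
        return new StudentEntity
        {
            Name = parts[0],
            Surname = parts.Length > 1 ? parts[1] : string.Empty,
            ClassRoomFk = classRoomIds[dto.ClassRoomLabel]
        };
    }
}
```

Keeping the transform free of file and database access makes it easy to test on its own, which is the main benefit of separating the three stages.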
What you have are two data sources: One to read from (the HTML files) and one to write to (the database). Ideally you would hide how the files are read from disk and how they are persisted to the database. That is what the Repository pattern is for.
Your application has a clear purpose: It should import the data from the files to the database. Creating one repository would make the architecture of your application less clear. It would first 'read' from the repository only to 'write' it to the same repository again.
Because of that I suggest creating two Repository projects: One ReadRepository and one WriteRepository. That would make the console application project really simple: Instantiate the repositories, query the ReadRepository and save to the WriteRepository. The Core project would in effect become the ReadRepository. Both repositories would use the Domain objects.
I would also suggest letting the UI console application decide where the files are stored, so store the location in the App.config of the console application. That way you can override the file location with a command-line parameter if needed.
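The two-repository idea could be sketched roughly like this. The interface names and the Importer class are assumptions for illustration, and the in-memory implementations only stand in for the HTML reader and the database writer:

```csharp
using System;
using System.Collections.Generic;

public interface IReadRepository<T>
{
    IEnumerable<T> GetAll();
}

public interface IWriteRepository<T>
{
    void Save(T item);
}

public static class Importer
{
    // Queries the read side and saves every item to the write side;
    // returns the number of items copied.
    public static int Run<T>(IReadRepository<T> source, IWriteRepository<T> target)
    {
        int count = 0;
        foreach (T item in source.GetAll())
        {
            target.Save(item);
            count++;
        }
        return count;
    }
}

// In-memory stand-ins so the sketch runs without HTML files or a database.
public class ListReadRepository<T> : IReadRepository<T>
{
    private readonly List<T> _items;
    public ListReadRepository(IEnumerable<T> items) { _items = new List<T>(items); }
    public IEnumerable<T> GetAll() { return _items; }
}

public class ListWriteRepository<T> : IWriteRepository<T>
{
    public List<T> Saved { get; } = new List<T>();
    public void Save(T item) { Saved.Add(item); }
}
```

The console application then only has to construct the two repositories and call Importer.Run.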
I need to store some simple data (just some POCO objects with a few attributes, nothing fancy).
public class MyPOCO
{
public int Id {get; set;}
public string Title {get; set;}
public string Email {get; set;}
// ...
}
Basically, at some point of my web application I need to check if the MyPOCO object is already persisted (matching by Id), and if it's not, persist it. That's all I need.
It can't be any database, so probably XML or JSON. What's an easy way (or Nuget package) to store it?
I suggest using an in-process database like SQL Server CE or SQLite together with an O/R mapper. This is easier and more maintainable than reinventing a small database.
I have come across a situation where it would be better to represent a particular part of my domain model as relational. I read the section on database references in MongoDB and understand you can provide references to multiple documents by providing a JSON array of $ref references to various documents in a foreign collection.
All the examples I've seen for adding a reference to a foreign document in code have been only for a single document and they've created a public property of type MongoDBRef. There is a lot of unnecessary overhead with this approach, in my opinion, but it also doesn't make it clear what to do about storing references to multiple documents.
If you would like to provide a one-to-many relationship between foreign documents in Mongo, is it necessary to provide a collection property containing MongoDBRef objects? Is it possible to stick to a collection of standard entity objects in my C# code and map it to Mongo documents using BsonClassMap?
Below is a simple class that represents the model I currently have. It seems to be saving the document and references correctly, but I don't like exposing a public collection of MongoDBRef objects and the overhead it takes to add new documents for anybody that uses the Person class.
If it matters, I'm using MongoDB 2.0 and their C# driver.
// This is how my class currently looks
public class Person
{
public string Name { get; set; }
public List<MongoDBRef> Vehicles { get; private set; }
public Person()
{
Vehicles = new List<MongoDBRef>();
}
}
// This is what I want my class to look like
public class Person
{
public string Name { get; set; }
public List<Vehicle> Vehicles { get; private set; }
public Person()
{
Vehicles = new List<Vehicle>();
}
}
DBRefs are not the appropriate tool for storing references to a known document type. Instead just save the _id values of the referred documents in your collection. Given a good mapping library (not sure about C# but the C# equivalent of pymongo, mongoose, morphia, etc.) it will allow you to do exactly what you want.
DBRefs should only be used if you do not know at compile time what kind of document you need to store a reference to (e.g. a "content" field that either holds an Image or a Text, etc.).
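To illustrate the shape this suggests, here is a driver-free sketch in which the person stores only the ids of its vehicles. Plain strings stand in for _id values and a dictionary stands in for the vehicles collection; VehicleIds and VehicleLookup are hypothetical names, and no MongoDB driver API is used or implied:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Vehicle
{
    public string Id { get; set; }      // stands in for the document's _id
    public string Model { get; set; }
}

public class Person
{
    public string Name { get; set; }

    // Only the _id values of the referenced vehicles are stored with the person.
    public List<string> VehicleIds { get; } = new List<string>();
}

public static class VehicleLookup
{
    // Resolves the stored ids against a vehicle store, skipping ids that no longer exist.
    public static List<Vehicle> Resolve(Person person, IDictionary<string, Vehicle> vehiclesById)
    {
        return person.VehicleIds
            .Where(id => vehiclesById.ContainsKey(id))
            .Select(id => vehiclesById[id])
            .ToList();
    }
}
```

A mapping layer could hide the id list behind a List<Vehicle> property, but that resolution step has to happen somewhere.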
I have the following in Entity Framework.
Table - Country
Fields
Country_ID
Dialing_Code
ISO_Alpha2
ISO_Alpha3
ISO_Full
I would like to map only selected fields from this entity model to my domain class.
My domain model class is
public class DomainCountry
{
public int Country_ID { get; set; }
public string Dialing_Code { get; set; }
public string ISO_3166_1_Alpha_2 { get; set; }
}
The following will work, but insert or update is not possible. In order to insert or update we need to use ObjectSet<>, but that is not supported in my case.
IQueryable<DomainCountry> countries =
context.Countries.Select(
c =>
new DomainCountry
{
Country_ID = c.Country_ID,
Dialing_Code = c.Dialing_Code,
ISO_3166_1_Alpha_2 = c.ISO_Alpha2
});
Is there a nice solution for this? It would be really fantastic.
Ideally it would be a kind of proxy class which supports all the features but is highly customizable.
That is, only the columns we want to expose to the outer world.
The term for "plain .NET classes" is POCO - plain old CLR objects (inspired by POJO, plain old Java objects).
Read this blog post series, it helped me a lot:
http://blogs.msdn.com/b/adonet/archive/2009/05/21/poco-in-the-entity-framework-part-1-the-experience.aspx
I want to do the same thing. My goal is to build a WCF service that can use the same set of objects as the application I'm building by sharing a DLL and sending/receiving the same classes. Additionally, I also wanted to limit what fields are exposed. After thinking about this for a while it seems a user-defined cast might do the trick. Have a look to see if it works for you.
http://www.roque-patrick.com/windows/final/bbl0065.html
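A minimal sketch of what such user-defined casts might look like, using the classes from the question. The Country class here is only a stand-in for the generated Entity Framework entity, and the sketch assumes the cast is applied to objects that have already been loaded into memory:

```csharp
using System;

// Stand-in for the Entity Framework entity, using the fields listed in the question.
public class Country
{
    public int Country_ID { get; set; }
    public string Dialing_Code { get; set; }
    public string ISO_Alpha2 { get; set; }
    public string ISO_Alpha3 { get; set; }
    public string ISO_Full { get; set; }
}

public class DomainCountry
{
    public int Country_ID { get; set; }
    public string Dialing_Code { get; set; }
    public string ISO_3166_1_Alpha_2 { get; set; }

    // Entity -> domain: copies only the fields exposed to the outside world.
    public static explicit operator DomainCountry(Country c)
    {
        return new DomainCountry
        {
            Country_ID = c.Country_ID,
            Dialing_Code = c.Dialing_Code,
            ISO_3166_1_Alpha_2 = c.ISO_Alpha2
        };
    }

    // Domain -> entity: fields not present on the domain class keep their defaults.
    public static explicit operator Country(DomainCountry d)
    {
        return new Country
        {
            Country_ID = d.Country_ID,
            Dialing_Code = d.Dialing_Code,
            ISO_Alpha2 = d.ISO_3166_1_Alpha_2
        };
    }
}
```

Usage would be along the lines of var domain = (DomainCountry)entity; and, for an insert or update, var entity = (Country)domain; before handing the entity to the context.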