I'm creating a project that will parse .html files into a database (SQLite or something else; that's not important yet). The database will have many tables and relationships, and the full schema would be hard to follow, so I'll show you a simpler schema.
For example:
Models:
Subject: SubjectId, Name
Teacher: TeacherId, SubjectFk, Name, Surname
ClassRoom: ClassRoomId, year
Student: StudentId, ClassRoomFk, Name, Surname
Relations (not important!):
One subject is taught by multiple teachers; one teacher teaches only one subject
One ClassRoom contains many students; one student belongs to only one ClassRoom
The pair (TeacherId, ClassRoomId) is unique (in one ClassRoom there can be only one subject taught by a particular teacher, and several teachers cannot teach the same subject in one ClassRoom; I'm not sure, but it's not important).
Now I build a project hierarchy:
ParseData - solution name
ParseData.Repository - contains App.Config, which holds the app setting for the root folder where the data lives.
ParseData.Domain - classes for the data model that will be parsed, for example:
public class Student
{
public int StudentId {get;set;}
public string Name {get;set;}
public string Surname {get;set;}
public int ClassRoomFk {get;set;}
}
public class ClassRoom
{
public int ClassRoomId {get; set;}
public List<Student> Students {get;set;}
}
ParseData.Core - contains all the algorithms and classes that read files from a path and convert the data to the class model, for example:
public class StudentParse : IEntity<Student>
{
public Student Student {get; set;}
public StudentParse(string filePathWithDataForStudentsFromParticularClass, int ClassRoomFk) { (...) }
/* All methods, which will parse data to StudentModel */
}
public class ClassRoomParse : IEntity<ClassRoom>
{
public ClassRoom ClassRoom {get;set;}
public ClassRoomParse(string filePathWithClassRoomsData) { (...) }
/* All methods, which will parse data to ClassRoomModel */
}
public interface IParser
{
string filePathToMainFile {get;set;}
List<ClassRoom> Start();
}
ParseData.UI - Console application. Here I can write some code that presents the results of fetching the data. For example:
IParser parser = new Parser(Repository.MainFilePath);
List<ClassRoom> classRooms = parser.Start();
/* LINQ or other actions..save to file or something else */
I'm looking for guidance on how to organize my solution according to best practices. I am open to criticism of my approach and inexperience.
Another option besides a ReadRepository and WriteRepository I mentioned earlier is to apply the Extract, Transform, Load (ETL) method. This is a clearer approach, mainly because the data is one-way only.
The solution would have a ParseData.Extract project (the ReadRepository) that loads the data from the HTML files into DTOs which match the data structure of the HTML files. A ParseData.Transform project would transform the DTOs to the database models (the entities if you're using Entity Framework for example). A ParseData.Load project would then serve as the WriteRepository mentioned earlier to save the entities to the database.
The ParseData.UI project can still be used to orchestrate the ETL process.
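A minimal sketch of how the three stages could fit together, assuming hypothetical names that are not in the question (StudentDto, IExtractor, ILoader, StudentTransformer, EtlPipeline) and a made-up DTO shape. The actual HTML parsing and database code are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Extract side: a DTO that mirrors the HTML structure, raw strings only (hypothetical shape).
public class StudentDto
{
    public string FullName { get; set; }
    public string ClassRoomYear { get; set; }
}

// Load side: the entity as it is stored, matching the Student class in the question.
public class Student
{
    public int StudentId { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
    public int ClassRoomFk { get; set; }
}

// ParseData.Extract would implement this against the HTML files.
public interface IExtractor { IEnumerable<StudentDto> Extract(); }

// ParseData.Load would implement this against the database.
public interface ILoader { void Load(IEnumerable<Student> students); }

// ParseData.Transform: pure functions from DTOs to entities.
public static class StudentTransformer
{
    public static Student Transform(StudentDto dto, IDictionary<string, int> classRoomIdsByYear)
    {
        // Split "Name Surname" into at most two parts.
        var parts = dto.FullName.Trim().Split(new[] { ' ' }, 2, StringSplitOptions.RemoveEmptyEntries);
        return new Student
        {
            Name = parts[0],
            Surname = parts.Length > 1 ? parts[1].Trim() : string.Empty,
            ClassRoomFk = classRoomIdsByYear[dto.ClassRoomYear]
        };
    }
}

// ParseData.UI would call this to orchestrate the three stages.
public static class EtlPipeline
{
    public static void Run(IExtractor extractor, ILoader loader, IDictionary<string, int> classRoomIdsByYear)
    {
        var entities = extractor.Extract()
            .Select(dto => StudentTransformer.Transform(dto, classRoomIdsByYear))
            .ToList();
        loader.Load(entities);
    }
}
```

Keeping the transform step free of file and database access means it can be unit tested with plain objects.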
What you have are two data sources: One to read from (the HTML files) and one to write to (the database). Ideally you would hide how the files are read from disk and how they are persisted to the database. That is what the Repository pattern is for.
Your application has a clear purpose: It should import the data from the files to the database. Creating one repository would make the architecture of your application less clear. It would first 'read' from the repository only to 'write' it to the same repository again.
Because of that I suggest creating two Repository projects: One ReadRepository and one WriteRepository. That would make the console application project really simple: Instantiate the repositories, query the ReadRepository and save to the WriteRepository. The Core project would in effect become the ReadRepository. Both repositories would use the Domain objects.
I would also suggest to let the UI console application decide where the files are stored. So store the location in the App.config of the console application. That way you can perhaps overwrite the file location with a command-line parameter.
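As a rough sketch of that layout, assuming hypothetical interface and class names (IReadRepository, IWriteRepository, ImportSettings, Importer) and a trimmed-down ClassRoom: the console application resolves the folder, builds the two repositories and hands them to one small method.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down domain object, shared by both repositories.
public class ClassRoom
{
    public int ClassRoomId { get; set; }
}

// Implemented by the project that parses the HTML files (the former Core project).
public interface IReadRepository
{
    List<ClassRoom> GetClassRooms();
}

// Implemented by the project that writes to the database.
public interface IWriteRepository
{
    void Save(IEnumerable<ClassRoom> classRooms);
}

public static class ImportSettings
{
    // A command-line argument, when present, overrides the value read from App.config.
    public static string ResolveRootFolder(string[] args, string configuredPath)
    {
        bool hasArgument = args != null && args.Length > 0 && !string.IsNullOrWhiteSpace(args[0]);
        return hasArgument ? args[0] : configuredPath;
    }
}

public static class Importer
{
    // The whole job of the console application: read from one side, write to the other.
    public static int Run(IReadRepository source, IWriteRepository target)
    {
        List<ClassRoom> classRooms = source.GetClassRooms();
        target.Save(classRooms);
        return classRooms.Count;
    }
}
```

The configured path is passed in as a plain string here so the sketch does not depend on how App.config is read.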
Related
I have an idea for a web app where I will want the user to create their own database through a web application, with their own table names and field types.
I thought about creating a database structure using Object Oriented Programming so that a pre-made database will support all kinds of Entities with custom properties. Something like this:
CustomType
{
public long TypeId {get;set;}
public string ActiveType {get;set;}
}
CustomProperty
{
public int wholeNumber {get;set;}
public string text {get;set;}
public bool boolean {get;set;}
public decimal dec {get;set;}
//Choosen Id of the type to work with
public long TypeId {get;set;}
public bool wholeNumber_ACTIVE {get;set;}
public bool text_ACTIVE {get;set;}
public bool boolean_ACTIVE {get;set;}
public bool dec_ACTIVE {get;set;}
}
CustomEntity
{
public string TableName {get;set;}
public CustomProperty Prop01 {get;set;}
public CustomProperty Prop02 {get;set;}
public CustomProperty Prop03 {get;set;}
public CustomProperty Prop04 {get;set;}
public CustomProperty Prop05 {get;set;}
}
The idea behind this is to let the user decide what they want their database to store, on a pre-made database for them to work with, without having to create it during runtime since this is a web app.
I believe I can manage it like this for them to store whatever they need, but I'm also thinking about the following issues:
How will I manage relationships when the user needs to link tables with Ids and foreign keys?
(I thought about adding a public long ForeignId {get;set;} and just storing the Id they need to associate.)
How will I manage queries, since tables will have CodeNames and each will have a different meaning for each person who sets it up?
(I thought about renaming the table at runtime, but I'm afraid of errors and DB corruption.)
I also thought about sending direct queries to create the database according to the user's needs, but then again inexperienced users could really mess things up here or find it hard to manage.
How can I manage migrations or DB changes from code instead of using the PowerShell console?
If we have multiple users, each with a unique database but the same web app, how can we manage web.config files to work with this idea?
I know there's a lot of questions here, I'm looking for the best way to achieve this, having multiple users own their small web app through the internet using MVC pattern and lots of options through a browser.
I would recommend an Entity Attribute Value (EAV) pattern as a solution. With the EAV pattern, rather than creating new tables with new columns for every custom property you wish to store, you store those properties in rows. For example, instead of every custom table being defined like this:
You define them like this instead:
This allows for flexible creation of multiple entities with multiple properties. The classes in your business logic will then be Entity classes with a collection of Property objects.
In case you haven’t spotted the trade-offs already, the limitation of the EAV model is the inability to specify field types (int, varchar, decimal etc.); in fact, all your property values will be stored as a single type (usually strings).
There are a number of ways to address this. Some handle all the validation etc. in the business logic; others create Field tables per type, so based on my example, rather than having just one EntityFields table, you’ll have multiple, separated by type.
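A minimal sketch of the first option (validation in the business logic), with every value stored as a string. The names Entity, Property, FieldType and PropertyValidator are assumptions made for illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// The field types a user may pick for a custom property (hypothetical set).
public enum FieldType { Text, WholeNumber, Decimal, Boolean }

// One row of the property table: the value is always persisted as a string.
public class Property
{
    public string Name { get; set; }
    public FieldType Type { get; set; }
    public string Value { get; set; }
}

// One row of the entity table plus its collection of property rows.
public class Entity
{
    public long EntityId { get; set; }
    public string EntityName { get; set; }
    public List<Property> Properties { get; } = new List<Property>();

    // Validates the raw value against the declared type before storing it.
    public void Set(string name, FieldType type, string rawValue)
    {
        if (!PropertyValidator.IsValid(type, rawValue))
            throw new ArgumentException("'" + rawValue + "' is not a valid " + type + " for '" + name + "'.");
        Properties.RemoveAll(p => p.Name == name);
        Properties.Add(new Property { Name = name, Type = type, Value = rawValue });
    }
}

public static class PropertyValidator
{
    // Because the database only sees strings, type checking happens here.
    public static bool IsValid(FieldType type, string raw)
    {
        switch (type)
        {
            case FieldType.WholeNumber:
                return long.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out _);
            case FieldType.Decimal:
                return decimal.TryParse(raw, NumberStyles.Number, CultureInfo.InvariantCulture, out _);
            case FieldType.Boolean:
                return bool.TryParse(raw, out _);
            default:
                return raw != null;
        }
    }
}
```

With the per-type table variant, the Type column would instead decide which table a Property row is written to.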
I have a three tier app with a class library as the Infrastructure Layer, which contains an Entity Framework data model (database first).
Entity Framework creates entities under the Model.tt folder. These classes are populated with data from the database.
In the past I would map the classes created by Entity Framework (in the data project) to classes in the Domain project e.g. Infrastructure.dbApplication was mapped to Domain.Application.
My reading is telling me that I should be using the classes contained in .tt as the domain classes, i.e. add domain methods to the classes generated by Entity Framework. However, this would mean that the domain classes would be contained in the Infrastructure project, wouldn't it? Is it possible to relocate the classes generated by Entity Framework to the Domain project? Am I missing something fundamental here?
I think in the true sense it is a Data Model, not a Domain Model. Although people talk about having the Entity Framework Model as a domain concept, I don't see how you can easily retrofit value objects such as, say, an amount, which in the true domain sense would be represented like this:
public class CustomerTransaction
{
public int Id { get; set; }
public string TransactionNumber { get; set; }
public Amount Amount { get; set; }
}
public class Amount
{
public decimal Value { get; }
public Currency Currency { get; }
}
As opposed to a more incorrect data model approach:
public class CustomerTransaction
{
public int Id { get; set; }
public string TransactionNumber { get; set; }
public int CurrencyType { get; set; }
public decimal Amount { get; set; }
}
Yes, the example is anaemic, but I'm only interested in the properties for clarity's sake, not behaviour. For starters, you would need to decide on the visibility of the properties and whether the "business/data object" needs a default constructor.
So in the domain sense, Amount is a value object on a Customer Transaction - which I am assuming as an entity in the example.
So how would this translate to database mappings via Entity Framework? There might be a way to hold the above in a single CustomerTransaction table as the flat structure in the data model, but my way would be to add an additional repository around it and map out to the data structures.
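A sketch of what that mapping step might look like, assuming a hypothetical Currency enum, a data-model class renamed to CustomerTransactionRecord to avoid the name clash, and a hypothetical CustomerTransactionMapper that a repository would call on the way in and out.

```csharp
using System;

// Hypothetical currency list; the integer values are what the table would store.
public enum Currency { USD = 1, EUR = 2 }

// Value object: immutable, created only through its constructor.
public sealed class Amount
{
    public decimal Value { get; }
    public Currency Currency { get; }

    public Amount(decimal value, Currency currency)
    {
        Value = value;
        Currency = currency;
    }
}

// Domain entity.
public class CustomerTransaction
{
    public int Id { get; set; }
    public string TransactionNumber { get; set; }
    public Amount Amount { get; set; }
}

// Flat data model, one property per column.
public class CustomerTransactionRecord
{
    public int Id { get; set; }
    public string TransactionNumber { get; set; }
    public int CurrencyType { get; set; }
    public decimal Amount { get; set; }
}

// The repository would use this to translate between the two shapes.
public static class CustomerTransactionMapper
{
    public static CustomerTransactionRecord ToRecord(CustomerTransaction t)
    {
        return new CustomerTransactionRecord
        {
            Id = t.Id,
            TransactionNumber = t.TransactionNumber,
            CurrencyType = (int)t.Amount.Currency,
            Amount = t.Amount.Value
        };
    }

    public static CustomerTransaction ToDomain(CustomerTransactionRecord r)
    {
        // Reject currency codes the domain does not know about.
        if (!Enum.IsDefined(typeof(Currency), r.CurrencyType))
            throw new ArgumentException("Unknown currency type " + r.CurrencyType);
        return new CustomerTransaction
        {
            Id = r.Id,
            TransactionNumber = r.TransactionNumber,
            Amount = new Amount(r.Amount, (Currency)r.CurrencyType)
        };
    }
}
```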
Udi Dahan has some good information on DDD and ORM in the true sense. I thought somewhere he talked about DDD and ORM having the Data Model instance as a private field in the domain object but I might be wrong.
Also, that data model suffers from Primitive Obsession (I think Fowler coined the term in his Refactoring book; either way, it is covered there). Jimmy Bogard talks about that here.
Check out Udi Dahan stuff.
You should move your model to a different project; that is good practice. I don't quite get what you meant by "moving to the Domain project". Normally the classes generated by Entity Framework are used as the domain model, so there is no need to create a separate domain model. This model should be used only close to the database operations, whereas the web (or Windows) application should use only DTOs (Data Transfer Objects).
I don't know whether you use it or not, but this is a nice tool for recreating the model from the database:
https://marketplace.visualstudio.com/items?itemName=SimonHughes.EntityFrameworkReversePOCOGenerator
This lets you keep the model in classes (instead of EDMX). Some people refer to this as "code first", but that is a misunderstanding: you can use this tool to create the model and still be "database first". It is done simply to avoid using EDMX as the model definition.
You can relocate the entity classes by creating a new item in your Domain project: DbContext EF 6.x Generator (not sure of the name and you might have to install a plugin to get this item in the list, also exists for EF 5.x).
Once you have created this new item, you have to edit it to set the path of your EDMX at the very beginning of the file. In my project for example it is:
const string inputFile = @"..\..\DAL.Impl\GlobalSales\Mapping\GlobalSalesContext.edmx";
You will also need to edit the DbContext.tt file to add the right using directives at the top of the generated class. After each change to the EDMX, you will also have to right-click the generator and click "Run custom tool" to generate the new classes.
That being said, is it a good practice? As you can see, that's what I have done in my project. As long as you do not have EF-specific annotations or the like in the generated entity classes, I would say that it is acceptable.
If you need to change your ORM, you can just keep the generated classes and remove all the EF stuff (.tt files, etc) and the rest of your application will work the same. But that's opinion based.
I am doing this project in C#, and when designing the database I follow the rule that each class is basically an SQL table (at least each class that has to be persisted).
Since some classes are used purely to define business settings and are rather flat, I am curious whether it makes any sense to do something like this:
Transform business layer class
class Contact
{
public string Name {get;set;}
public string PhoneNumber {get;set;}
public bool AcceptsTextMessages {get;set;}
public int AllowedHoursForTextMessagesStart {get;set;} // hour of day, e.g. 0
public int AllowedHoursForTextMessagesEnd {get;set;} // hour of day, e.g. 24
public List<DayOfWeek> SendMessagesOnlyOnWorkdays {get;set;}
}
to a data layer class that looks something like this (and persist it in SQL)
public class Settings
{
public int ID {get;set;}
public string Name {get;set;}
public string Value {get;set;} // every value is stored as text
}
with real-life data
ID  Name                              Value
1   Name                              John Doe
2   PhoneNumber                       01234657
3   AcceptsTextMessages               true
4   AllowedHoursForTextMessagesStart  0
5   AllowedHoursForTextMessagesEnd    24
6   SendMessagesOnlyOnWorkdays        1,2,3,4,5
The primary reason for this is to have one settings table instead of having as many tables as classes, possibly easier class modification, easier manipulation of properties between classes (in case there is a business logic need to move one property from one class to another)
Decomposing your objects into IDs and attribute-value pairs is one of those techniques that's sometimes extremely useful. EAV data is much more complicated to manage than a flat table with individual columns, so it's not something to implement lightly.
Given what you've posted, I probably wouldn't. All the fields you have seem reliably relevant to being-a-contact and unlikely to require changing around dynamically in production (since one starts or stops accepting text messages, rather than ascending to a plane of existence where text messages are epistemologically irrelevant).
Even if it made sense to represent certain fields as pairs, I'd only do it for those fields: keep a users table with a primary key and the essential data, then put the rest off in an EAV table with a foreign key relationship to users.
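A sketch of that hybrid layout expressed as classes, assuming hypothetical names (ContactRow, ContactSettingRow, ContactSettings): the essential fields stay as real columns and only the optional ones become name/value rows keyed by the contact's id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fixed columns: one row per contact, strongly typed.
public class ContactRow
{
    public int ContactId { get; set; }
    public string Name { get; set; }
    public string PhoneNumber { get; set; }
}

// EAV rows: only for the fields that really need to vary.
public class ContactSettingRow
{
    public int ContactId { get; set; }   // foreign key to ContactRow
    public string Name { get; set; }
    public string Value { get; set; }
}

public static class ContactSettings
{
    // Turns a bag of optional settings into rows for the EAV table.
    public static List<ContactSettingRow> ToRows(int contactId, IDictionary<string, string> optional)
    {
        return optional
            .Select(kv => new ContactSettingRow { ContactId = contactId, Name = kv.Key, Value = kv.Value })
            .ToList();
    }

    // Reads one setting back, returning the fallback when no row exists.
    public static string Get(IEnumerable<ContactSettingRow> rows, int contactId, string name, string fallback = null)
    {
        return rows
            .Where(r => r.ContactId == contactId && r.Name == name)
            .Select(r => r.Value)
            .FirstOrDefault() ?? fallback;
    }
}
```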
In DocumentDb, what is the best way and place to decouple data in order to save them in separate collections?
So far, most of the examples of how to manage data with DocumentDb use simple objects but in real life, we hardly ever do. I just want to understand how and where I need to handle my complex classes before I save them as Json objects in DocumentDb.
Let's look at the following example. I'll be saving my project information into the Projects collection but I do NOT want to save full names of people in the project team within a project document. I just want to save their EmployeeId's in the project document. I have a separate Employees collection where I want to save person/employee specific information. My project object looks like this:
public class Project
{
[JsonProperty(PropertyName="id")]
public int ProjectId {get; set;}
[JsonProperty(PropertyName="projectName")]
public string ProjectName {get; set;}
[JsonProperty(PropertyName="projectType")]
public string ProjectType {get; set;}
[JsonProperty(PropertyName="projectTeam")]
public List<TeamMember> ProjectTeam {get; set;}
}
My TeamMember class inherits from Employee object and looks like this:
public class TeamMember : Employee
{
[JsonProperty(PropertyName="position")]
public string Position {get; set;}
}
My Employee class looks like this:
public class Employee
{
[JsonProperty(PropertyName="id")]
public int EmployeeId {get; set;}
[JsonProperty(PropertyName="firstName")]
public string FirstName {get; set;}
[JsonProperty(PropertyName="lastName")]
public string LastName {get; set;}
[JsonProperty(PropertyName="gender")]
public string Gender {get; set;}
[JsonProperty(PropertyName="emailAddress")]
public string EmailAddress {get; set;}
}
Before saving to Projects collection, here's an example of what my Project document should look like:
{
id: 12345,
projectName: "My first project",
projectType: "Construction Project",
projectTeam: [
{ id: 7777, position: "Engineer" },
{ id: 8998, position: "Project Manager" }
]
}
As you can see, I decoupled my project information from the employee data in order to store them in their own collections, Projects and Employees collections respectively.
Let's not get into why I should or should not decouple data. I just want to see how and where I should handle decoupling in order to produce fastest results. I want to follow the best practices so I just want to see how experts working with DocumentDb handle this scenario.
I can think of two places to handle this but I want to understand if there's a better, more direct way to do this:
I can convert my Project class into a JSON object within my C# code and pass the JSON object to DocumentDb for storage.
Alternatively, I can pass my Project object directly to DocumentDb, into a JavaScript stored procedure and I can handle decoupling and storing data in two or more collections within DocumentDb.
Here's what I'd like to know:
Which is the right place to handle decoupling data?
Which would provide better performance?
Is there a better way to handle this? I keep reading about how I can just pass my POCO classes to DocumentDb and it will just handle them for me. Would DocumentDb handle this more complex scenarios? If so, how?
I appreciate your help. Thank you.
In a NoSQL store such as this, you can store different types of documents with different schemas in the same collection.
Please do not treat collections as tables. Think of collections rather as units of partition and as boundaries for the execution of queries, transactions etc.
So with that in mind, there is nothing wrong with storing your project document as shown and including the employee documents in the same collection.
Having said all of that, if you still want to do this, then you can.
In order to achieve this, your project object would have to change.
instead of having TeamMember : Employee (which would include the entire Employee object) have the TeamMember object mimic what you want from your JSON ... i.e.
public class TeamMember
{
// public, so the serializer includes them in the JSON
public int id {get;set;}
public string position {get;set;}
}
Now when DocumentDB serializes your project object you would end up with JSON that resembled what you wanted. And then you could separately save your Employee object somewhere else.
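If the rich classes have to stay as they are, one way to get the same effect is to project them into storage-shaped classes just before saving. This is only a sketch: ProjectDocument, TeamMemberRef and ProjectMapper are hypothetical names, the classes are trimmed, and the actual calls that save the documents are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rich model, trimmed from the question.
public class Employee
{
    public int EmployeeId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class TeamMember : Employee
{
    public string Position { get; set; }
}

public class Project
{
    public int ProjectId { get; set; }
    public string ProjectName { get; set; }
    public List<TeamMember> ProjectTeam { get; set; }
}

// Storage shapes: property names match the desired JSON.
public class TeamMemberRef
{
    public int id { get; set; }
    public string position { get; set; }
}

public class ProjectDocument
{
    public int id { get; set; }
    public string projectName { get; set; }
    public List<TeamMemberRef> projectTeam { get; set; }
}

public static class ProjectMapper
{
    // What goes into the Projects collection: ids and positions only.
    public static ProjectDocument ToDocument(Project project)
    {
        return new ProjectDocument
        {
            id = project.ProjectId,
            projectName = project.ProjectName,
            projectTeam = project.ProjectTeam
                .Select(m => new TeamMemberRef { id = m.EmployeeId, position = m.Position })
                .ToList()
        };
    }

    // What goes into the Employees collection: plain Employee copies without Position.
    public static List<Employee> ExtractEmployees(Project project)
    {
        return project.ProjectTeam
            .Select(m => new Employee { EmployeeId = m.EmployeeId, FirstName = m.FirstName, LastName = m.LastName })
            .ToList();
    }
}
```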
If you don't want to do this, or can't do this because you don't control the definition of the Model or because other parts of the system are built to depend on this already then you could investigate building a custom JSON converter for your Project object that would spit out the JSON you wanted.
Then decorate your Project object with that JsonConverter and when DocumentDB does the conversion the correct result would be created each time.
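A rough sketch of such a converter, assuming the Json.NET (Newtonsoft.Json) package that the JsonProperty attributes in the question come from, version 11 or later for the generic JsonConverter<T> base class. TeamMemberReferenceConverter is a made-up name and the model is trimmed. Here the converter is passed to JsonConvert.SerializeObject directly instead of being attached with an attribute, and how it gets registered with the DocumentDB client is not shown.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class Employee
{
    [JsonProperty("id")]
    public int EmployeeId { get; set; }
    [JsonProperty("firstName")]
    public string FirstName { get; set; }
}

public class TeamMember : Employee
{
    [JsonProperty("position")]
    public string Position { get; set; }
}

public class Project
{
    [JsonProperty("id")]
    public int ProjectId { get; set; }
    [JsonProperty("projectTeam")]
    public List<TeamMember> ProjectTeam { get; set; }
}

// Writes a team member as { id, position } only, dropping the inherited Employee fields.
public class TeamMemberReferenceConverter : JsonConverter<TeamMember>
{
    public override void WriteJson(JsonWriter writer, TeamMember value, JsonSerializer serializer)
    {
        writer.WriteStartObject();
        writer.WritePropertyName("id");
        writer.WriteValue(value.EmployeeId);
        writer.WritePropertyName("position");
        writer.WriteValue(value.Position);
        writer.WriteEndObject();
    }

    public override TeamMember ReadJson(JsonReader reader, Type objectType, TeamMember existingValue,
        bool hasExistingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null) return null;
        JObject obj = JObject.Load(reader);
        // Only the reference is stored; the employee details live in the other collection.
        return new TeamMember { EmployeeId = (int)obj["id"], Position = (string)obj["position"] };
    }
}

public static class ProjectSerializer
{
    public static string ToJson(Project project)
    {
        return JsonConvert.SerializeObject(project, new TeamMemberReferenceConverter());
    }
}
```

The Employee objects would still need to be saved separately, since the converter deliberately leaves their details out of the project document.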
I am following Microsoft's movie tutorial to learn ASP.NET MVC 4. I'd like some more information, if possible.
I am at the page http://www.asp.net/mvc/tutorials/mvc-4/getting-started-with-aspnet-mvc4/adding-a-model .
It says to add this class to the file containing the Movie class:
public class MovieDBContext : DbContext
{
public DbSet<Movie> Movies { get; set; }
}
And then to create a new connection string
<add name="MovieDBContext"
connectionString="Data Source=(LocalDB)\v11.0;AttachDbFilename=|DataDirectory|\Movies.mdf;Integrated Security=True"
providerName="System.Data.SqlClient"
/>
And this is to get the info for the movies.
My question, however, is the following. Suppose we have two tables in the database, for example one for the movies and one for the actors, and we want to get the data from both of them. We should create two different models, right? But we would use only one database (one connection string). How, then, can we say that the actor model takes its data from the actor table and the movie model takes its data from the movie table, when both use the same connection string?
EDIT:
Suppose I already have the two database tables before I create my project. How do I make my models point to them?
When the database is created, metadata is added and Entity Framework knows which table to reference for each model. The connection string is identical because there are two tables in one database, and a connection string holds database-level connection information, not the translation from table data to model.
Essentially, the basics of code first are One Model = One Table. You can do things like foreign key relationships so if your Actor Model references Movies, the underlying data structures will be created to match as well.
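As an illustration, the two models might look like this. The property names are assumptions rather than part of the tutorial. As far as I know, Entity Framework's default conventions pick up MovieID as the foreign key for the Movie navigation property, and the [Table] attribute (from System.ComponentModel.DataAnnotations.Schema) names the table a class maps to, which is also one way to point a model at a table that already exists.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations.Schema;

[Table("Movies")]   // maps this model to the Movies table
public class Movie
{
    public int ID { get; set; }
    public string Title { get; set; }

    // Navigation property: all actors whose MovieID points at this movie.
    public virtual ICollection<Actor> Actors { get; set; } = new List<Actor>();
}

[Table("Actors")]   // maps this model to the Actors table
public class Actor
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Foreign key column plus the navigation property it belongs to.
    public int MovieID { get; set; }
    public virtual Movie Movie { get; set; }
}
```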
EDIT: If your confusion is around the way to create the DBContext, you could create a single DBContext Class with both DBSet types in it and only use one connection string, e.g:
public class ApplicationDBContext : DbContext
{
public DbSet<Movie> Movies { get; set; }
public DbSet<Actor> Actors { get; set; }
}
Looks like a good tutorial. If only there was one more table to complete the picture.
When you get to the next section, you will see the author uses LINQ to access the data. If you are adding a related table, you will be able to add an .Include("Actors") statement to get the additional data.
SolidFish notes http://www.asp.net/mvc/tutorials/mvc-music-store/mvc-music-store-part-4 - which is the tutorial i used to get started and contains multiple tables.
The connectionstring is for the database, so you will only ever use one.
Once you start getting multiple models, you should then start using viewmodels that contain the model data and setting your view from that.
i.e. a rough example:
public class MoviesViewModel
{
// Properties
public Movie Movie { get; set; }
public IEnumerable<Actor> Actors { get; set; }
}
The connection string does not point to a table; it points to the database. The model you create will match up with the proper table. If you use the EF code first approach, it will even create separate tables to match your models.
Update:
Simplest example I could find for Database first approach.
http://weblogs.asp.net/scottgu/archive/2010/08/03/using-ef-code-first-with-an-existing-database.aspx