Eager loading includes endless self-join tables without include tables


When I select some items, they come back with related objects populated even though I did not include those objects in my LINQ query.
public List<Institution> GetListWithCities(Expression<Func<Institution,bool>> filter = null)
{
using (var context = new DbContext())
{
return filter == null
? context.Set<Institution>()
.Include(x => x.City)
.ToList()
: context.Set<Institution>()
.Include(x => x.City)
.Where(filter)
.ToList();
}
}
[Table("Institution")]
public class Institution{
public int ID;
public string Name;
public int CITY_ID;
public int RESPONSIBLE_INSTUTION_ID;
public virtual City City{ get; set; }
public virtual Institution ResponsibleInstution{ get; set; }
}
I expect each institution to come back with only its City, but my method also returns the ResponsibleInstution, and that continues recursively.

People tend to use Include instead of Select even when they don't plan to use the functionality that Include provides, so they pay for the extra work that Include does without needing it.
In Entity Framework, always use Select to fetch data. Only use Include if you plan to update the included items.
One of the slower parts of a database query is the transport of the fetched data from the database management system to your local process. Hence it is wise to Select only those properties that you really plan to use.
Apparently your Institution is in exactly one City, namely the City that the foreign key (CityId?) is referring to. If Institution [10] is located in City [15], then Institution.CityId will have a value 15, equal to City.Id. So you are transferring this value twice.
using (var dbContext = new MyDbContext())
{
IQueryable<Institution> filteredInstitutions = (filter == null) ?
dbContext.Institutions :
dbContext.Institutions.Where(filter);
return filteredInstitutions.Select(institution => new Institution
{
// Select only the Institution properties that you actually plan to use:
Id = institution.Id,
Name = institution.Name,
City = new City
{
Id = institution.City.Id,
Name = institution.City.Name,
...
}
// not needed: you already know the value:
// CityId = institution.City.Id,
})
.ToList();
}
Possible improvement
Apparently you chose to add a layer between Entity Framework and the users of your functions: although they use your functions, they don't really have to know that you use Entity Framework to access the database. This gives you the freedom to use SQL instead of Entity Framework. Hell, it even gives you the freedom to get rid of your database and use an XML file instead of a DBMS: your users won't know the difference, which is nice if you want to write unit tests.
Although you chose to separate out the method you use to persist the data, you also chose to expose your database layout, including foreign keys, to the outside world. This makes it more difficult to change your database in the future: your users have to change as well.
Consider writing repository classes for Institution and City that only expose those properties that the users of your persistency really need. If people only query "some properties of institutions with some properties of the City in which they are located", or the other way round "Several properties of Cities with several properties of the Institutions located in these Cities", then they won't need the foreign keys.
The intermediate repository classes give you more freedom to change your database. Apart from that, it will give you the freedom to hide certain properties for certain users.
For instance: suppose you add the possibility to delete an institution, but you don't want to immediately delete all information about it, because you want to be able to restore it if someone accidentally deletes the institution. You might then add a nullable property ObsoleteDate.
Most people who query institutions don't want the obsolete ones. If you had an intermediate repository institution class that omitted the ObsoleteDate, and all queries filtered out every Institution with a non-null ObsoleteDate, then for your users it would be as if an obsolete institution had been deleted from the database.
Only one user will need access to the ObsoleteDate: a cleaner task that every now and then deletes all Institutions that have been obsolete for a considerable time.
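A minimal sketch of that idea, assuming hypothetical names (InstitutionRecord, InstitutionView, InstitutionRepository) that are not part of the original question. It runs against any IQueryable, so an in-memory list can stand in for the database in a unit test:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical storage-side class: ObsoleteDate is null while the institution is live.
public class InstitutionRecord
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime? ObsoleteDate { get; set; }
}

// What ordinary callers see: no ObsoleteDate and no foreign keys.
public class InstitutionView
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class InstitutionRepository
{
    private readonly IQueryable<InstitutionRecord> _source;

    public InstitutionRepository(IQueryable<InstitutionRecord> source)
    {
        _source = source;
    }

    // Every regular query filters out obsolete rows before projecting.
    public List<InstitutionView> GetAll()
    {
        return _source
            .Where(i => i.ObsoleteDate == null)
            .Select(i => new InstitutionView { Id = i.Id, Name = i.Name })
            .ToList();
    }

    // Only the cleanup task needs this: ids that became obsolete before the cutoff.
    public List<int> GetIdsObsoleteBefore(DateTime cutoff)
    {
        return _source
            .Where(i => i.ObsoleteDate != null && i.ObsoleteDate < cutoff)
            .Select(i => i.Id)
            .ToList();
    }
}
```

Callers of GetAll never see the obsolete rows, while the cleaner task is the only code that calls GetIdsObsoleteBefore.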
A third advantage of an intermediate repository class is that you can give different users access to the same data through different interfaces: some users can only query information about institutions, some are also allowed to change some data, while others are allowed to change other data. If you merely handed them an interface implemented by the entity, they could bypass it by casting the object back to the original Institution.
With separate repository classes, you will have the possibility to give each of these users their own data, and nothing more than this data.
The disadvantage of a repository pattern is that you have to think about different users and create different query functions. The advantage is that a repository is easier to change and easier to test, and thus it is easier to keep everything bug free after future changes.

Related

Extract one specific entry from a table?

I am currently looking for a way to pass a foreign key to a table entry that is listed in one table
and should be extracted in another table.
For example purposes I created this:
public class Parent
{
public string Name {get; set;}
public virtual ICollection<Child> Children {get; set;}
public virtual ICollection<School> Schools {get; set;}
}
public class Child
{
public string Name {get; set;}
public School Schoola {get; set;} // Which should be a school Name that the Parent should know?
}
public class School
{
//ParentID
//ChildID
public string SchoolName {get; set;}
}
How do I give my Child instance a SchoolName that the Parent contains within the SchoolNames?
Children and SchoolNames are separate tables, but a child only needs to know one specific entry.

Caveat
Your code does not work, since EF does not serialize collections of primitive types. EF Core does have value conversions but it is unclear what you're exactly looking for. I'm going to assume you meant to store these as actual School entities, since your question asks how to "extract one entry from a table".
For the sake of answering your question, I assume that your child should have a reference to the school entity, not a string property that's technically unrelated to the school entity itself, which would make it a question not related to Entity Framework and thus the question tags would be wrong.
I'll address both my assumption and your literal question, just to be sure.
If you need a relationship between a child and a school
From a purely database standpoint, there is no way to specify that an entity's (Child) foreign key should refer to an entity (School) which in and of itself has a foreign key to another entity (Parent). It simply doesn't exist in SQL and therefore EF cannot generate this behavior for you.
What you can do, is implement business validation on your code and refuse to store any child with a school that doesn't belong to its parent. Keep in mind, this requires you to load the parent and their schools every time you want to save a child to the database (because otherwise you can't check if the selected school is allowed for this child), so it will become a somewhat expensive operation.
However, that doesn't prevent the possibility for someone to introduce data into the database (circumventing your business logic, e.g. by a DBA) where this rule is violated but the FK constraint itself is upheld.
How you handle these bad data states is up to you. Do you remove those entries when you stumble upon them? Do you proactively scan the database once in a while? Do you allow it to exist but restrict your application's users to only choosing schools from the parent's scope? These are all business decisions that we cannot make for you.
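The business validation described above could look roughly like the following. This is a sketch under assumptions: the ChildSchoolValidator class and the idea of comparing integer school ids are hypothetical, since the question does not show key properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChildSchoolValidator
{
    // parentSchoolIds: ids of the schools loaded for the child's parent (assumed input).
    // chosenSchoolId: id of the school the caller wants to assign to the child.
    public static void EnsureSchoolAllowed(IEnumerable<int> parentSchoolIds, int chosenSchoolId)
    {
        if (parentSchoolIds == null)
            throw new ArgumentNullException(nameof(parentSchoolIds));

        // Refuse to continue when the school is outside the parent's scope.
        if (!parentSchoolIds.Contains(chosenSchoolId))
            throw new InvalidOperationException(
                "School " + chosenSchoolId + " is not one of the parent's schools.");
    }
}
```

You would call this just before saving the child, after loading the parent's schools. It only protects writes that go through your code, not rows inserted directly into the database.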
If a child needs a school name without a relation to the school itself
At first sight, this seems to me to be a bad solution. What happens when the school's name changes? Wouldn't you expect the child's schoolname to also change? Because that's not going to happen in your current setup.
In either case, if you are looking to set a string property, that's trivial, you simply set the property. Presumably, your question is how to restrict the user's options to the child's parent's schools.
This restrictive list can be fetched from the database using the child's identifier:
var childId = 123;
var schoolsFromParent = db
.Children
.Where(c => c.Id == childId)
.Select(c => c.Parent.Schools)
.FirstOrDefault();
Note that this code works regardless of whether you have a School entity or a list of strings - though the type of schoolsFromParent will be different.
Then restrict your end user to picking only from the presented options. Note that to prevent bad data, you should double-check the chosen name after the user has selected it.

Using AutoMapper to load entities from the database?

Most of what I've read (e.g. from the author) indicates that AutoMapper should be used to map an entity to a DTO. It should not load anything from the database.
But what if I have this:
public class Customer {
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<Order> Orders { get; set; }
}
public class CustomerDto {
public int Id { get; set; }
public string Name { get; set; }
public IEnumerable<int> OrderIds { get; set; } // here is the problem
}
I need to map from DTO to entity (i.e. from CustomerDto to Customer), but first I must use that list of foreign keys to load corresponding entities from the database. AutoMapper can do that with a custom converter.
I agree that it doesn't feel right... but what are the alternatives? Sticking that logic into a controller, service, a repository, some manager class? All that seems to be pushing the logic somewhere else, in the same tier. And if I do that, I must also perform the mapping manually!
From a DDD perspective, the DTO should not be part of the domain. So AutoMapper is also not part of the domain, because it knows about that DTO. So AutoMapper is in the same tier as the controllers, services, etc.
So does it make sense to put the DTO-to-entity logic (which includes accessing the database, and possibly throwing exceptions) into an AutoMapper mapping?
EDIT
@ChrisSimon's great answer below explains from a DDD perspective why I shouldn't do this. From a non-DDD perspective, is there a compelling reason not to use AutoMapper to load from the db?

To start with, I'm going to summarise my understanding of Entities in DDD:
Entities can be created - often using a factory. This is the start of their life-cycle.
Entities can be mutated - have their state modified - by calling methods on the entity. This is how they progress through their lifecycle. By ensuring that the entity owns its own state, and can only have its state modified by calling its methods, the logic that controls the entity's state is all within the entity class, leading to cleaner separation of business logic and more maintainable systems.
Using Automapper to convert from a Dto to the entity means the entity is giving up ownership of its state. If the dto is in an invalid state and you map that directly onto the entity, the entity may end up in an invalid state - you have lost the value of making entities contain data + logic, which is the foundation of the DDD entity.
To make a suggestion as to how you should approach this, I'd ask - what is the operation you are trying to achieve? DDD encourages us not to think about CRUD operations, but to think about real business processes, and to model them on our entities. In this case it looks like you are linking Orders to the Customer entity.
In an Application Service I would have a method like:
void LinkOrdersToCustomer(CustomerDto dto)
{
using (var dbTxn = _txnFactory.NewTransaction())
{
var customer = _customerRepository.Get(dto.Id);
foreach (var orderId in dto.OrderIds)
{
var order = _orderRepository.Get(orderId);
customer.LinkToOrder(order);
}
dbTxn.Save();
}
}
Within the LinkToOrder method, I would have explicit logic that did things like:
Check that order is not null
Check that the customer's state permits adding the order (are they currently active? is their account closed? etc.)
Check that the order actually does belong to the customer (what would happen if the order referenced by orderId belonged to another customer?)
Ask the order (via a method on the order entity) if it is in a valid state to be added to a customer.
Only then would I add it to the customer's Orders collection.
This way, the application 'flow' and infrastructure management is contained within the application/services layer, but the true business logic is contained within the domain layer - within your entities.
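The checks listed above might be sketched like this. The specific rules and members (IsActive, CustomerId, CanBeLinked, IsCancelled) are assumptions made up for illustration; your domain will have its own.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    public int Id { get; }
    public int CustomerId { get; }
    public bool IsCancelled { get; private set; }

    public Order(int id, int customerId)
    {
        Id = id;
        CustomerId = customerId;
    }

    public void Cancel() { IsCancelled = true; }

    // The order decides for itself whether it may be linked (hypothetical rule).
    public bool CanBeLinked() { return !IsCancelled; }
}

public class Customer
{
    private readonly List<Order> _orders = new List<Order>();

    public int Id { get; }
    public bool IsActive { get; private set; }
    public IReadOnlyCollection<Order> Orders { get { return _orders.AsReadOnly(); } }

    public Customer(int id)
    {
        Id = id;
        IsActive = true;
    }

    public void Deactivate() { IsActive = false; }

    // All state changes go through this method, so the rules live in one place.
    public void LinkToOrder(Order order)
    {
        if (order == null)
            throw new ArgumentNullException(nameof(order));
        if (!IsActive)
            throw new InvalidOperationException("Customer is not active.");
        if (order.CustomerId != Id)
            throw new InvalidOperationException("Order belongs to another customer.");
        if (!order.CanBeLinked())
            throw new InvalidOperationException("Order cannot be linked in its current state.");

        // Linking the same order twice is treated as a no-op.
        if (!_orders.Contains(order))
            _orders.Add(order);
    }
}
```

Because Orders is exposed as a read-only collection, a mapper or controller cannot add orders behind the entity's back.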
If the above requirements are not relevant in your application, you may have other requirements. If not, then perhaps it is not necessary to go the route of DDD - while DDD has a lot to add, its overheads are generally only worth it in systems with lots of complex business logic.
This isn't related to the question you asked, but I'd also suggest you take a look at the modelling of Customer and Order. Are they both independent Aggregates? If so, modelling Customer as containing a collection of Order may lead to problems down the road - what happens when a customer has a million orders? Even if the collection is lazy loaded, you know at some point something will attempt to load it, and there goes your performance. There's some great reading about aggregate design here: http://dddcommunity.org/library/vernon_2011/ which recommends modelling references by Id rather than reference. In your case, you could have a collection of OrderIds, or possibly even a completely new entity to represent the link - CustomerOrderLink which would have two properties - CustomerId, and OrderId. Then none of your entities would have embedded collections.

How to delete a record when it is nowhere used in multiple other tables

I'm making an application in C# and I'm using the EF Code First for my database-creation (for a SQL-Server database).
I have a class "Address" which is used in several other classes.
So several records can relate to the same Address-record. Is there an option in the EF where I can delete the Address-record when it is nowhere used anymore? Unless if I'm wrong, the CascadeOnDelete-option will remove the record once a certain record is deleted while others still relate to the Address-record.
Also, it wouldn't be very useful to create a new Address-record for each record that relates to it, because most Address-record would be exactly the same (for example, a lot of Address-record would just contain the name of the same country or city).
Sorry if it all sounds a bit fuzzy, I would give some code but I don't really know what code that would be.

The short answer is no, EF doesn't natively provide that feature.
The way I see it, there are a couple of things that you can (should) do to get to your desired result:
First, if you're concerned about duplicate data (Country, City, State, Region, etc.) in an address, you should extract them into their own tables and provide references. This will mean instead of having a varchar of United States in Address.Country for example, it would be a foreign key of 1 maybe, to the Countries table. This isn't a bad idea as it will also allow you to standardize more easily (so you don't get US, U.S., and United States say).
Second, you can have a business logic layer on top of your database (let's call it BO), so when you need to save a Person, you call PersonBO.Save(Person), which interacts with your database on your behalf. Then, extract your check one level further, to a static mainBO class maybe. Your PersonBO (and any other classes that use Addresses) can then call mainBO.FindAndDeleteUnusedAddresses(), passing the applicable object (Person person in this case):
public static void FindAndDeleteUnusedAddresses(Person person) {
using (var db = new Entities())
{
var personCount = db.Persons.Where(p => p.Address == person.Address).Count();
var businessCount = db.Businesses.Where(b => b.Address == person.Address).Count();
// other objects that have addresses here
if (personCount + businessCount == 0) // others included as necessary
{
db.Entry(person.Address).State = EntityState.Deleted;
db.SaveChanges();
}
}
}

This is how EF behaves when CascadeOnDelete is switched off.
It will throw an exception on any attempt to delete a row that is still used in a relationship. You will need to catch this exception to show a custom message or do whatever else you need in this case.
If you need to remove an Address at the moment the last entity that uses it is deleted, it is probably better to check whether the address is used anywhere else and, if not, mark it for deletion manually.

Can someone better explain what 'Projections' are in nHibernate?

As a new user of nHibernate and its utility library, fluent nhibernate, I am trying to learn enough to be dangerous with a good database.
I am having an exceptionally great deal of difficulty understanding the concept of Projections. Specifically, What in the world are they?
I have literally done exact searches on 'What are projections?' and 'Projects in nHibernate' and 'nHibernate, Projections, Definition', etc. And I am still very confused. The most helpful posts so far are This other StackOverflow Question and This Blog Post by Colin Ramsay. But I am still vastly confused. My knowledge of databases is still entry-level at best.
I do not really understand what projections are, why I would want to use them, what they are accomplishing, etc. I see in the blog post that he is using them to get a list of integers (I presume Primary Keys) so that he can use them in a different query, but this is kind of nebulous in the way it is functioning and the why.

Here's a practical example.
Let's say that you have an online store and one of your domain classes is a Brand like "Samsung". This class has a boatload of properties associated with it, perhaps an integer Identity, a Name, a free-text Description field, a reference to a Vendor object, and so on.
Now let's say that you want to display a menu with a list of all the brands offered on your online store. If you just do session.CreateCriteria<Brand>().List(), then you are indeed going to get all of the brands. But you'll also have sucked all of the long Description fields and references to Vendors from the database, and you don't need that to display a menu; you just need the Name and the Identity. Performance-wise, sucking all of this extra data down from the database slows things down and is unnecessary.
Instead, you can create a "projection" object that contains just the Identity and the Name calling it, say, NameIdentityPair:
public class NameIdentityPair
{
public int Identity { get; set; }
public string Name { get; set; }
}
And you could tell NHibernate to only select the data that you really need to perform the task at hand by telling it to transform the result set onto your projection:
var brandProjections = this.session.CreateCriteria<Brand>()
.SetProjection(Projections.ProjectionList()
.Add(Projections.Property("Name"), "Name")
.Add(Projections.Property("Identity"), "Identity"))
.SetResultTransformer(Transformers.AliasToBean<NameIdentityPair>())
.List<NameIdentityPair>();
foreach (var brandProjection in brandProjections)
{
Console.WriteLine(
"Identity: {0}, Name: {1}",
brandProjection.Identity,
brandProjection.Name);
}
Now you don't have a list of Brands but instead a list of NameIdentityPairs, and NHibernate will have only issued a SQL statement like SELECT b.Identity, b.Name from dbo.Brand b to obtain this projection, as opposed to a massive SQL statement that grabs everything necessary to hydrate a Brand object (e.g., SELECT b.Identity, b.Name, b.Description from dbo.brand b left join dbo.vendor v ....).
Hope this helps.

If you're familiar with SQL, a projection is the SELECT clause of a query, used to select which fields from the available results to return.
For example, assume you have a Person with FirstName, LastName, Address, and Phone fields. If you want a query to return everything, you can leave off the projection, which is like SELECT * FROM Person in SQL. If you just want the first and last names, you would create a projection with FirstName and LastName -- which would be SELECT FirstName, LastName FROM Person in SQL terms.
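The same idea expressed in LINQ, as a rough sketch: the Person and PersonName classes below are hypothetical, and the query runs over an in-memory sequence rather than through NHibernate. With an ORM's LINQ provider, a Select like this is generally what gets turned into the column list of the SQL SELECT.

```csharp
using System.Collections.Generic;
using System.Linq;

// Full entity with every field.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
    public string Phone { get; set; }
}

// The projection target: only the two fields the caller asked for.
public class PersonName
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class PersonQueries
{
    // Equivalent in spirit to: SELECT FirstName, LastName FROM Person
    public static List<PersonName> SelectNames(IEnumerable<Person> people)
    {
        return people
            .Select(p => new PersonName { FirstName = p.FirstName, LastName = p.LastName })
            .ToList();
    }
}
```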

You can use projections to call SQL functions like SUM and COUNT, or to select single fields without returning an entity.
"...Retrieving only properties of an entity or entities, without the overhead of loading
the entity itself in a transactional scope. This is sometimes called a report
query; it’s more correctly called projection." [NHibernate in Action]

What is the proper object relationship? (C#)

I had a quick question about the proper object relationship I should set up for this situation:
I have a Customer object with associated parameters and a depot object with associated parameters. Each depot serves a set of customers and the customer needs access to particular information for their respective depot.
I'm wondering what the proper relationship I should set up so that a set of customer objects all reference the same instance of a particular depot object. I wanted to be sure it wasn't creating a duplicate Depot object for each customer. Furthermore, i'd like to be able to change properties of the Depot without going through the customer itself.
I know this is probably a fairly basic question but C# has so many different "features" it gets confusing from time to time.
Thanks for your help!
Charlie

If I understand your question correctly, I think a solution to your problem might be an OR mapper. Microsoft provides two OR mappers at the moment, LINQ to SQL and Entity Framework. If you are using .NET 3.5, I recommend using LINQ to SQL, but if you are able to experiment with .NET 4.0, I would highly recommend looking into Entity Framework. (I discourage the use of Entity Framework in .NET 3.5, as it was released very prematurely and has a LOT of problems.)
Both of these OR mappers provide visual modeling tools that allow you to build a conceptual entity model. With LINQ to SQL, you can generate a model from your database, which will provide you with entity classes, as well as associations between those classes (representing your foreign keys from your DB schema). The LINQ to SQL framework will handle generating SQL queries for you, and will automatically map database query results into object graphs. Relationships such as the one you described, with multiple customers in a set referencing the same single depot, are handled automatically for you; you don't need to worry about them at all. You also have the ability to query your database using LINQ, and can avoid having to write a significant number of stored procedures and plumbing/mapping code.
If you use .NET 4.0, Entity Framework is literally LINQ to SQL on steroids. It supports everything LINQ to SQL does, and a hell of a lot more. It supports model-driven design, allowing you to build a conceptual model from which code AND database schema are generated. It supports a much wider variety of mappings, providing a much more flexible platform. It also provides Entity SQL (eSQL), which is a text-based query language that can be used to query the model in addition to LINQ to Entities. Like LINQ to SQL, it will solve the scenario you used as an example, as well as many others.
OR mappers can be a huge time, money, and effort saver, greatly reducing the amount of effort required to interact with a relational database. They provide both dynamic querying as well as dynamic, optimistic updates/inserts/deletes with conflict resolution.

This sounds like you've got a Many-to-many relationship going on. (Customers know about their Depots, and vice versa)
Ideally this seems best suited for a database application where you define a weak-entity table ... Of course using a database is overkill if we're talking about 10 Customers and 10 Depots...
Assuming a database is overkill, this can be modeled in code with some dictionaries. Assuming you're using int for the unique identifiers for both Depot and Customer, you could create something like the following:
// creating a derived class for readability.
public class DepotIDToListOfCustomerIDs : Dictionary<int,List<int>> {}
public class CustomerIDToListOfDepotIDs : Dictionary<int,List<int>> {}
public class DepotIDToDepotObject : Dictionary<int,Depot>{}
public class CustomerIDToCustomerObject : Dictionary<int, Customer>{}
//...
// class scope for a class that manages all these objects...
DepotIDToListOfCustomerIDs _d2cl = new DepotIDToListOfCustomerIDs();
CustomerIDToListOfDepotIDs _c2dl = new CustomerIDToListOfDepotIDs();
DepotIDToDepotObject _d2do = new DepotIDToDepotObject();
CustomerIDToCustomerObject _c2co = new CustomerIDToCustomerObject();
//...
// Populate all the lists with the cross referenced info.
//...
// in a method that needs to build a list of depots for a given customer
// param: Customer c
if (_c2dl.ContainsKey(c.ID))
{
List<int> dids=_c2dl[c.ID];
List<Depot> ds=new List<Depot>();
foreach(int did in dids)
{
if (_d2do.ContainsKey(did))
ds.Add(_d2do[did]);
}
}
// building the list of customers for a Depot would be similar to the above code.
EDIT 1: note that with the code above, I've crafted it to avoid circular references. Having a customer reference a depot that also references that same customer will prevent these from being quickly garbage collected. If these objects will persist for the entirety of the applications lifespan a simpler approach certainly could be taken. In that approach you'd have two lists, one of Customer instances, the other would be a list of Depot instances. The Customer and Depot would contain lists of Depots and Customers respectively. However, you will still need two dictionaries in order to resolve the Depot IDs for the customers, and vice versa. The resulting code would be 99% the same as the above.
EDIT 2:
As is outlined in others replies you can (and should) have an object broker model that makes the relationships and answers questions about the relationships. For those who have misread my code; it is by no means intended to craft the absolute and full object model for this situation. However, it is intended to illustrate how the object broker would manage these relationships in a manner that prevents circular references. You have my apologies for the confusion it caused on the first go around. And my thanks for illustrating a good OO presentation that would be readily consumed by others.

In reply to @Jason D, and for the sake of @Nitax: I'm really skimming the surface, because while it's basically easy, it also can get complicated. There's no way I'm going to re-write it better than Martin Fowler either (certainly not in 10 minutes).
You first have to sort out the issue of only 1 object in memory that refers to a specific depot. We'll achieve that with something called a Repository. CustomerRepository has a GetCustomer() method, and the DepotRepository has a GetDepot() method. I'm going to wave my hands and pretend that just happens.
Second, you need to write some tests that indicate how you want the code to work. I can't know that, but bear with me anyway.
// sample code for how we access customers and depots
Customer customer = Repositories.CustomerRepository.GetCustomer("Bob");
Depot depot = Repositories.DepotRepository.GetDepot("Texas SW 17");
Now the hard part here is: How do you want to model the relationship? In OO systems you don't really have to do anything. In C# I could just do the following.
Customers keep a list of the depots they are with
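class Customer
{
// backing list, added so the snippet compiles
private readonly IList<Depot> _depotList = new List<Depot>();
public IList<Depot> Depots { get { return _depotList; } }
}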
alternatively, Depots keep a list of the customers they are with
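class Depot
{
// backing list, added so the snippet compiles
private readonly IList<Customer> _customerList = new List<Customer>();
public IList<Customer> Customers { get { return _customerList; } }
}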
// * code is very brief to illustrate.
In its most basic form, any number of Customers can refer to any number of Depots. m:n solved. References are cheap in OO.
Mind you, the problem we hit is that while the Customer can keep a list of references to all the depots it cares about (first example), there's no easy way for the Depot to enumerate all the Customers.
To get a list of all Customers for a Depot (first example) we have to write code that iterates over all customers and checks the customer.Depots property:
List<Customer> CustomersForDepot(Depot depot)
{
List<Customer> allCustomers = Repositories.CustomerRepository.AllCustomers();
List<Customer> customersForDepot = new List<Customer>();
foreach( Customer customer in allCustomers )
{
if( customer.Depots.Contains(depot) )
{
customersForDepot.Add(customer);
}
}
return customersForDepot;
}
If we were using Linq, we could write it as
var query = from o in allCustomers
where o.Depots.Contains(depot)
select o;
return query.ToList();
Have 10,000,000 Customers stored in a database? Ouch! You really don't want to have to load all 10,000,000 customers each time a Depot needs to determine its customers. On the other hand, if you only have 10 Depots, a query loading all Depots once in a while isn't a big deal. You should always think about your data and your data access strategy.
We could have the list in both Customer and Depot. When we do that we have to be careful about the implementation. When adding or removing an association, we need to make the change to both lists at once. Otherwise we have customers thinking they are associated with a depot, but the depot doesn't know anything about the customer.
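One way to keep both lists in step, sketched under the assumption that all changes go through the Customer side. The AddDepot and RemoveDepot methods and the internal CustomerList field are invented for this illustration:

```csharp
using System;
using System.Collections.Generic;

public class Depot
{
    // Mutated only by Customer.AddDepot / RemoveDepot, so both sides stay in sync.
    internal readonly List<Customer> CustomerList = new List<Customer>();

    public string Name { get; }
    public IReadOnlyList<Customer> Customers { get { return CustomerList; } }

    public Depot(string name) { Name = name; }
}

public class Customer
{
    private readonly List<Depot> _depots = new List<Depot>();

    public string Name { get; }
    public IReadOnlyList<Depot> Depots { get { return _depots; } }

    public Customer(string name) { Name = name; }

    // Adds the association on both sides; adding it twice is a no-op.
    public void AddDepot(Depot depot)
    {
        if (depot == null) throw new ArgumentNullException(nameof(depot));
        if (_depots.Contains(depot)) return;
        _depots.Add(depot);
        depot.CustomerList.Add(this);
    }

    // Removes the association from both sides if it exists.
    public void RemoveDepot(Depot depot)
    {
        if (depot == null) throw new ArgumentNullException(nameof(depot));
        if (_depots.Remove(depot))
            depot.CustomerList.Remove(this);
    }
}
```

Exposing the lists as read-only means callers cannot update one side and forget the other.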
If we don't like that, and decide we don't really need to couple the objects so tightly, we can remove the explicit lists and introduce a third object that is just the relationship (and also include another repository).
class CustomerDepotAssociation
{
public Customer Customer { get; }
public Depot Depot { get; }
}
class CustomerDepotAssociationRepository
{
IList<Customer> GetCustomersFor(Depot depot) ...
IList<Depot> GetDepotsFor(Customer customer) ...
void Associate(Depot depot, Customer customer) ...
void DeAssociate(Depot depot, Customer customer) ...
}
It's yet another alternative. The repository for the association doesn't need to expose how it associates Customers to Depots (and by the way, from what I can tell, this is what @Jason D's code is attempting to do).
I might prefer the separate object in this instance because what we're saying is the association of Customer and Depot is an entity unto itself.
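For illustration only, here is one possible in-memory implementation of such an association repository. The class name, the use of a HashSet of tuples, and the bare-bones Customer and Depot classes are all assumptions; a real implementation would more likely sit on top of a link table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; }
    public Customer(string name) { Name = name; }
}

public class Depot
{
    public string Name { get; }
    public Depot(string name) { Name = name; }
}

public class InMemoryCustomerDepotAssociationRepository
{
    // Each tuple is one Customer/Depot link; the set ignores duplicate links.
    // Customer and Depot do not override Equals, so links compare by reference.
    private readonly HashSet<Tuple<Customer, Depot>> _links =
        new HashSet<Tuple<Customer, Depot>>();

    public void Associate(Depot depot, Customer customer)
    {
        _links.Add(Tuple.Create(customer, depot));
    }

    public void DeAssociate(Depot depot, Customer customer)
    {
        _links.Remove(Tuple.Create(customer, depot));
    }

    public IList<Customer> GetCustomersFor(Depot depot)
    {
        return _links.Where(l => ReferenceEquals(l.Item2, depot))
                     .Select(l => l.Item1)
                     .ToList();
    }

    public IList<Depot> GetDepotsFor(Customer customer)
    {
        return _links.Where(l => ReferenceEquals(l.Item1, customer))
                     .Select(l => l.Item2)
                     .ToList();
    }
}
```

Neither Customer nor Depot holds a collection here, so there is a single place where an association can be created or removed.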
So go ahead and read some Domain Driven Design books, and also buy Martin Fowler's PoEAA (Patterns of Enterprise Application Architecture).

Hope this is self-explanatory.
[Two diagrams appeared here in the original answer: one labelled "OO" and one labelled "ER".]
