Entity Framework with multiple databases - c#

Is there any way to map multiple SQL Server databases in a single EF context? For instance, I'm trying to do something like this:
from order in context.Orders
where context.Users.Any(user => user.UserID == order.UserID)
select order
And I'd like to get generated SQL along the lines of:
select .. from store.dbo.order where userID in
(select userID from authentication.dbo.user)
Note that the database names are different: store in one place, authentication in the other.
I've found a few articles that deal with multiple schemas ('dbo' in this case), but none dealing with multiple database names.

As a potential workaround, you could create a view of the table from the second database in the first database and point your mappings to the view.

I'm pretty sure this isn't possible. The context derives from DbContext.
A DbContext instance represents a combination of the Unit Of Work and Repository patterns such that it can be used to query from a database and group together changes that will then be written back to the store as a unit. DbContext is conceptually similar to ObjectContext.
Configuration (connection string, schema, etc) for a DbContext is specific to a single database.

It's not possible. A context sits below the level of a database, and allowing this would probably be bad practice. It could lead developers to forget that they are dealing with two databases and to overlook the performance implications that come with that.
I imagine you should still be able to use two contexts and write elegant code.
var userIds = AuthContext.Users
    .Where(user => user.Name == "Bob")
    .Select(user => user.UserId)
    .ToList();
var orders = StoreContext.Orders
    .Where(order => userIds.Contains(order.UserId))
    .ToList();
First execute the query on the authentication database context to obtain the parameters for the second query.
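The two-step pattern above can be tried without a database. The following is a minimal sketch that uses in-memory arrays as stand-ins for the two contexts; the names `authUsers` and `storeOrders` and the sample data are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AuthContext.Users (authentication database).
var authUsers = new[]
{
    (UserId: 1, Name: "Bob"),
    (UserId: 2, Name: "Alice"),
    (UserId: 3, Name: "Bob"),
};

// Hypothetical stand-in for StoreContext.Orders (store database).
var storeOrders = new[]
{
    (OrderId: 100, UserId: 1),
    (OrderId: 101, UserId: 2),
    (OrderId: 102, UserId: 3),
    (OrderId: 103, UserId: 1),
};

// Step 1: materialize the keys from the first source.
List<int> userIds = authUsers
    .Where(user => user.Name == "Bob")
    .Select(user => user.UserId)
    .ToList();

// Step 2: filter the second source with the in-memory key list.
List<int> orderIds = storeOrders
    .Where(order => userIds.Contains(order.UserId))
    .Select(order => order.OrderId)
    .ToList();

Console.WriteLine(string.Join(", ", orderIds)); // prints "100, 102, 103"
```

Because the key list is held in memory between the two steps, this approach is probably best suited to cases where the first query returns a modest number of IDs.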

Related

Does EF automatically load many to many references collections

Imagine we have the following db structure
Organization
{
Guid OrganizationId
//....
}
User
{
Guid UserId
}
OrganizationUsers
{
Guid OrganizationId
Guid UserId
}
When the EDMX generates the classes, it abstracts OrganizationUsers away into a many-to-many reference, so no POCO class is generated for it.
Say I'm loading data from my context, but to avoid a Cartesian product I don't use an Include; instead I make two separate queries.
using (var context = new EntitiesContext())
{
    // Organizations is the assumed name of the entity set; the original snippet omitted it.
    var organizationsQuery = context.Organizations.Where(FilterByParent);
    var organizations = organizationsQuery.ToList();
    // Load() returns void, so its result cannot be assigned to a variable.
    organizationsQuery.SelectMany(x => x.Users).Load();
}
Is it safe to assume that the connected entities are loaded?
Would this make any difference if I loaded the users directly from the DbSet?
From the database's point of view:
Is it safe to assume that the connected entities are loaded?
Yes, it's safe. The organizations are already tracked by the EF change tracker, so when Load is called in the next statement, EF knows that the results should be attached to the tracked entities.
Would this make any difference if I loaded the users directly from the DbSet?
In fact, using Load this way is no better than using Include.
If you use Include, EF translates it to a LEFT JOIN; if you use Load, it is translated to an INNER JOIN; and if you fetch the users directly by their IDs using the Contains method, it is translated to IN on the SQL side.
In the Load and Contains cases you execute two queries (in two passes) on the SQL side, whereas with Include it is done in one pass, so overall Include outperforms your approach.
You can compare these approaches yourself using the SQL Profiler tool.
Update:
Based on the conversation, I realized that Johnny's main issue is simply that the OrganizationUsers object does not exist. So I suggest changing your approach from Database First to Code First, where this object can exist explicitly. See this to help you along the way.
Another approach that might work is customizing the T4 template, which seems harder but not impossible.

Authorize data access at database level

From the question title you might guess what this is about. I'll try to describe what I currently have and what I want to achieve.
Suppose an application that handles four entities: User, Team, Repository and Document. The relationships between those entities are:
Each user belongs to zero or more teams.
Each document belongs to one repository.
A user may own zero or more repositories.
Each repository can be created as public or private.
The content of a public repository is visible to all users who share a team with the repository's owner.
A private repository is only visible to its owner.
Accessing the documents of a user is not a problem: those are all the documents stored in repositories that he owns. But things get complicated because what I really need is all documents visible to a user, that is, all his own documents plus the documents that other people who share a team with him made public.
Currently I'm enforcing this authorization mechanism in the Data Access Layer. This implies fetching all documents and filtering them according to the rules above. I'm aware that this implementation is not scalable, and I wonder if I can improve my database model by moving the authorization logic to the database. This way the filtering will be done by the DB engine and only the requested entities will be returned to the client code.
This question is not tied to a specific implementation, but I'll tag it with the specific tools I'm using. Maybe that will be useful for someone's answer.
First let me explain why using Entity Framework (or another ORM tool) is more elegant than using stored procedures.
Stored Procedures are evil. That's why. As the link explains in detail, stored procedures tend to grow into a second BL and are therefore difficult to maintain. A simple task such as renaming a column becomes a big task when the column is used in multiple stored procedures. When you use an ORM tool, Visual Studio will do most of the work for you.
That brings me to the second advantage of Entity Framework: you can compose your query in your favorite .NET language. Entity Framework will not execute your query immediately. You control when the query is executed, as you can read here. At that point Entity Framework compiles your LINQ statements into a complete T-SQL statement and runs it against the database. So there is absolutely no need to fetch all the data and loop through each record.
Tip: Move your cursor over the variable name and EF will give you a preview of the T-SQL statement it will compile.
So what should your LINQ query look like? I composed a test database based on your description and made an Entity Framework (EF6) model of it, which looks like:
This LINQ query will do what you want, at least if I understood your question correctly.
private IEnumerable<Document> GetDocumentsOfUser(Guid userId)
{
    using (var db = new DocumentRepositoryEntities())
    {
        // Get the repositories owned by the user
        var ownedRepositories = db.Repositories
            .Where(r => r.Owner.UserId == userId);
        // Get all users of the teams the user belongs to
        var userInOtherTeams =
            db.Users.Where(u => u.UserId == userId)
                .SelectMany(u => u.Teams)
                .SelectMany(t => t.Users);
        // Get the public repositories owned by the team members
        var repositoriesOwnedByTeamMembers =
            userInOtherTeams.Where(u => u.Repositories.Any())
                .SelectMany(u => u.Repositories)
                .Where(r => !r.Private);
        // Combine the two lists of repositories (duplicates are removed by Distinct below)
        var allRepositories = ownedRepositories.Concat(
            repositoriesOwnedByTeamMembers);
        // Get all the documents from the selected repositories
        return allRepositories.SelectMany(r => r.Documents)
            .Distinct()
            .ToArray(); // the query is translated to SQL and executed here
    }
}
Note that the linq statement will be compiled to a TSQL select statement when the call to .ToArray() is made.
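The deferred execution described above can be observed with plain LINQ to Objects. This is only a sketch under that assumption: with Entity Framework the equivalent moment is when the SQL statement is sent to the database, while here a counter (a name introduced purely for illustration) shows when the filter actually runs.

```csharp
using System;
using System.Linq;

int evaluations = 0;
var numbers = new[] { 1, 2, 3, 4 };

// Composing the query runs nothing yet; the lambda has not been called.
var evens = numbers.Where(n =>
{
    evaluations++;
    return n % 2 == 0;
});
int beforeMaterializing = evaluations;

// ToArray() enumerates the query, which is when the filter actually runs.
int[] result = evens.ToArray();
int afterMaterializing = evaluations;

Console.WriteLine($"{beforeMaterializing} -> {afterMaterializing}"); // prints "0 -> 4"
```

The same reasoning is why further Where or SelectMany calls can be chained onto a query before materializing it without causing extra round trips.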
Based on your description, the goal is to find all of the repositories that the user currently has access to, then retrieve the documents from each of those repositories.
If this were my implementation, I would add a stored procedure to the database that accepts the current user's ID, then gathers the list of accessible repositories into a local table variable, then select from the documents table where the repository for the document is in the list of accessible repositories.
DECLARE @Teams TABLE (TeamID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY)
DECLARE @Repositories TABLE (RepositoryID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY)
/* Get the list of teams the user is a member of */
INSERT INTO @Teams
SELECT TeamUsers.TeamID
FROM Teams INNER JOIN TeamUsers ON Teams.ID = TeamUsers.TeamID
WHERE TeamUsers.UserID = @UserID
/* Get the list of repositories the user owns or shares a team member with */
INSERT INTO @Repositories
SELECT RepositoryID
FROM Repositories
WHERE OwnerID = @UserID
OR (OwnerID IN (SELECT DISTINCT TeamUsers.UserID
FROM TeamUsers INNER JOIN @Teams AS t ON TeamUsers.TeamID = t.TeamID)
AND IsShared = 1)
/* Finally, retrieve the documents in the specified repositories */
SELECT Documents.*
FROM Documents INNER JOIN @Repositories AS r ON Documents.RepositoryID = r.RepositoryID
While the answer competent_tech suggests is valid, and good if your need is a one-off, what you would ideally want to do is implement your authorization requirements in a dedicated layer, in an externalized fashion. Reasons to do this include:
easier to maintain a decoupled architecture
you can update your authorization without touching your application and/or database
you do not need SQL / stored procedure knowledge
you can report more easily on what authorization is applied where: this is important if you have auditors breathing down your neck.
To achieve externalized authorization (see here for a Gartner report on the topic), you need to consider attribute-based access control (ABAC - see here for a report on ABAC by NIST) and the eXtensible Access Control Markup Language (XACML - more info here) as a means to implement ABAC.
If you follow the ABAC approach you get:
a clean, decoupled architecture with the notion of
an enforcement point or interceptor that will sit between your application and your database (in the case of ABAC applied to databases)
an authorization decision engine that reaches decisions and will produce a filter statement (a WHERE clause in the case of a SQL database) that the enforcement point will append to the original SQL statement
a policy-based and attribute-based authorization model whereby you can write authorization requirements in easy-to-understand statements instead of procedures, PL-SQL or other SQL artefacts. Examples include:
a user can edit a document they own
a user can view documents if the user's team == the document's team
a user can view documents of another team if and only if the documents are marked as public
a user with the role editor can edit documents that belong to their team if and only if the document state is draft
In the above examples, the user type, the resource type (document), the action (view, edit), the document's team, the user's team, and the document's visibility (private or public) are all examples of attributes. Attributes are the lifeline, the building blocks of ABAC.
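To make the idea of attribute-based rules concrete, here is a minimal sketch that expresses the example policies as predicates over attributes. It is not XACML and not any particular product's API; the tuple shapes, the names `CanView` and `CanEdit`, and the sample users and documents are all assumptions made for illustration.

```csharp
using System;

// Hypothetical attribute bags for two users and two documents.
var alice = (Id: "alice", Team: "red", Role: "editor");
var bob = (Id: "bob", Team: "blue", Role: "viewer");
var draftDoc = (Owner: "carol", Team: "red", IsPublic: false, State: "draft");
var publicDoc = (Owner: "carol", Team: "red", IsPublic: true, State: "final");

// View policy: same team, or the document is marked public.
bool CanView((string Id, string Team, string Role) user,
             (string Owner, string Team, bool IsPublic, string State) doc)
    => user.Team == doc.Team || doc.IsPublic;

// Edit policy: the owner, or an editor on the document's team while it is a draft.
bool CanEdit((string Id, string Team, string Role) user,
             (string Owner, string Team, bool IsPublic, string State) doc)
    => user.Id == doc.Owner
       || (user.Role == "editor" && user.Team == doc.Team && doc.State == "draft");

Console.WriteLine(CanView(alice, draftDoc));  // True: same team
Console.WriteLine(CanView(bob, draftDoc));    // False: other team, not public
Console.WriteLine(CanView(bob, publicDoc));   // True: public document
Console.WriteLine(CanEdit(alice, draftDoc));  // True: editor, same team, draft
Console.WriteLine(CanEdit(alice, publicDoc)); // False: document is not a draft
```

In an externalized setup these predicates would live in the policy engine rather than in application code; the sketch only shows that each decision depends on attributes of the user, the resource and the action.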
ABAC can easily help you implement your authorization requirements from the simplest ones to the more advanced ones (such as can be found in export regulations, compliance regulations, or other business rules).
One neat benefit of this approach is that it is not specific to databases. You can apply the same principle and policies to home-grown apps, APIs, web services, and more. That's what I call the any-depth architecture / approach to externalized authorization. The following diagram summarizes it well:
The PDP is your centralized authorization engine.

Concatenating two queries based on a foreign key

I'm (at least trying) to implement the Repository pattern in my .NET C# project, so when I need to communicate with the database I use something like this:
IList<Sole> soles = SoleService.All().ToList();
As the name of the service method suggests, the query above returns all records from the Sole table. I don't want to keep too much custom logic in my service, and I think this is the right way to implement this pattern. What I mean is that I only want to keep the All() method and have each modification of the result made outside the service methods.
The current problem is this. I have the entity Sole and the entity SoleColor. SoleColor has a foreign key column SoleID that relates the two tables. Right now, for those two entities I can call only the All() method:
var soleColors = SoleColorService.All();
var soles = SoleService.All();
But here I need some customization: selecting only those rows from Sole that are related to a SoleColor entity. In other words, I want to end up with a list of only those rows from Sole whose Sole.ID appears as a value in the SoleColor.SoleID foreign key column.
Right now I'm a bit confused, since it's been a while since I last used plain SQL syntax. I think this is easily achieved in SQL with a JOIN. But with LINQ involved, my experience so far tells me that I need these two queries:
var soleColors = SoleColorService.All();
var soles = SoleService.All();
And then do some kind of JOIN/UNION to keep only the results I need.
So which tools do I need in this kind of situation? It's not the only place I'm going to need this, and I want to learn to do it myself as well as solve the current case.
After your last comment I think this is what you're looking for:
from s in SoleService.All()
join sc in SoleColorService.All() on s.ID equals sc.SoleID
select s
But this only works if both repositories have the same context instance. If not, you have to do it in two steps:
var ids = SoleColorService.All().Select(sc => sc.SoleID).ToArray();
var soles = SoleService.All().Where(s => ids.Contains(s.ID));
I'm a bit suspicious though about the static All() methods. They suggest that you use static contexts, which is considered bad practice. Further I wonder about the associations. By the sound of the words I'd expect Sole to have a SoleColor, i.e. Sole to have a SoleColorId FK.
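One detail worth checking with the join shown above: if a Sole can have several SoleColor rows, a plain join returns that Sole once per matching row. The sketch below uses in-memory tuples as assumed stand-ins for the two repositories (the sample data is made up) and compares the join with an Any-based filter that keeps each Sole at most once.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for SoleService.All() and SoleColorService.All().
var soles = new[] { (ID: 1, Name: "Flat"), (ID: 2, Name: "Wedge"), (ID: 3, Name: "Heel") };
var soleColors = new[] { (ID: 10, SoleID: 1), (ID: 11, SoleID: 1), (ID: 12, SoleID: 3) };

// A plain join yields one row per matching SoleColor, so sole 1 appears twice.
var joined = (from s in soles
              join sc in soleColors on s.ID equals sc.SoleID
              select s.ID).ToList();

// Filtering with Any (an existence check) keeps each sole at most once.
var filtered = soles
    .Where(s => soleColors.Any(sc => sc.SoleID == s.ID))
    .Select(s => s.ID)
    .ToList();

Console.WriteLine(string.Join(", ", joined));   // prints "1, 1, 3"
Console.WriteLine(string.Join(", ", filtered)); // prints "1, 3"
```

Appending Distinct() to the join would be another way to get the second result; which form reads better is largely a matter of taste.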

Dynamically loading SQL tables in Entity Framework

I need to dynamically access some SQL tables hopefully using the Entity Framework. Here's some pseudo code:
var Account = DB.Accounts.SingleOrDefault(x => x.ID == 12345);
This returns an Account object that contains fields called "PREFIX" and "CAMPAIGN ID". Further information about the accounts is stored in separate SQL tables with the naming convention PREFIX_CAMPAIGNID_MAIN.
The tables all have the same fields so I was thinking of creating a new Entity that isn't mapped anywhere and then dynamically loading it, like so:
var STA01_MAIN = new MyAccount(); // my "un-mapped" entity
DB.LoadTable('STA01_MAIN').LoadInto(STA01_MAIN);
I can now get anything about the STA01_MAIN account: STA01_MAIN.AccountId.
So my question is: how do I access these tables using the Entity Framework?
I don't think EF has a LoadTable and LoadInto method, but ObjectContext.ExecuteStoreQuery might be what you're looking for:
http://msdn.microsoft.com/en-us/library/dd487208.aspx
This should let you execute an arbitrary query against your database, and then map the results to an arbitrary type that you specify (even if it's not otherwise mapped in EF).
It goes without saying that you would be responsible for putting together a query that supplied the necessary columns for mapping into the destination type, and also adjusting said query when this type changes.
Here's some further discussion concerning its usage
http://social.msdn.microsoft.com/Forums/en-US/adonetefx/thread/44cf5582-63f8-4f81-8029-7b43469c028d/
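Because the table name is assembled from data (the prefix and campaign ID) and table names cannot be supplied as SQL parameters, it is probably worth validating those parts before building the query text. The following is a sketch under that assumption; `BuildMainTableQuery` is a hypothetical helper, and the sample values are made up.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: builds the SELECT text for a PREFIX_CAMPAIGNID_MAIN table.
string BuildMainTableQuery(string prefix, string campaignId)
{
    // Only plain letters and digits are accepted; \A and \z anchor the whole string.
    var safePart = new Regex(@"\A[A-Za-z0-9]+\z");
    if (!safePart.IsMatch(prefix) || !safePart.IsMatch(campaignId))
        throw new ArgumentException("Prefix and campaign id must be alphanumeric.");

    return $"SELECT * FROM [{prefix}_{campaignId}_MAIN]";
}

string sql = BuildMainTableQuery("STA", "01");
Console.WriteLine(sql); // prints "SELECT * FROM [STA_01_MAIN]"
```

The resulting string is the kind of command text that could then be handed to ExecuteStoreQuery together with the unmapped result type.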
Have you considered mapping all of these tables (with the identical columns) into an inheritance relationship in EF, and then querying them as
db.BaseTypes.OfType<SpecificType>().Where(/*.....*/);

Linq: To join or not to join (which is the better way, joins or relationships)

I have written quite a bit of code which uses the Linq2Sql table relationships provided to me just by having foreign keys on my database. But, this is proving to be a bit laborious to mock data for my unit tests. I have to manually set up any relationships in my test harness.
So, I am wondering if writing Linq joins rather than relying on the relationships would give me more easily testable and possibly more performant code.
var query =
from orderItem in data.OrderItems
select new
{
orderItem.Order.Reference,
orderItem.SKU,
orderItem.Quantity,
};
Console.WriteLine("Relationship Method");
query.ToList().ForEach(x => Console.WriteLine(string.Format("Reference = {0}, {1} x {2}", x.Reference, x.Quantity, x.SKU)));
var query2 =
from orderItem in data.OrderItems
join order in data.Orders
on orderItem.OrderID equals order.OrderID
select new
{
order.Reference,
orderItem.SKU,
orderItem.Quantity,
};
Console.WriteLine();
Console.WriteLine("Join Method");
query2.ToList().ForEach(x => Console.WriteLine(string.Format("Reference = {0}, {1} x {2}", x.Reference, x.Quantity, x.SKU)));
Both queries above give me the same result, but is one better than the other in terms of performance and in terms of testability?
What are you testing? Linq to SQL's ability to read data? Since Linq to SQL is a thin veneer over a database, the Linq to SQL code itself is generally considered "pristine" and therefore doesn't need to be tested.
I am hugely not in favor of complicating your code in this way, just so that you can mock out the linq to sql DBML. If you want to test your business logic, it is far better to just hook up a test database to the DBML (there is a constructor overload for the datacontext that allows you to do this) and use database transactions to test your data interactions. That way, you can roll the transaction back to undo the changes to the database, leaving the test database in its original state.
In terms of performance, both queries will evaluate to the same SQL (Scott Guthrie has a blog post on how to view the SQL generated by LINQ queries). I don't think that either option is inherently more "testable" than the other. However, I prefer to use the foreign keys and relationships because when using SQL Metal it lets you know really quickly that your database has the appropriate keys.
I don't think either approach has an advantage in either performance or testability. The first form is easier to read though, and so I would personally go with that. It's a subjective matter though.
It seems to me that your problem lies with being able to setup your data in an easy way, and have the foreign key values and entity references remain consistent. I don't think that's an easy thing to solve. You could write some sort of framework which creates object proxies and uses the entity metadata to intercept FK and related entity property setters in order to sync them up, but before you know it, you'll have implemented an in-memory database!
