Working with Cross Context Joins in LINQ-to-SQL - c#

Initially I had written this query using LINQ-to-SQL
var result = from w in PatternDataContext.Windows
join cf in PatternDataContext.ControlFocus on w.WindowId equals cf.WindowId
join p in PatternDataContext.Patterns on cf.CFId equals p.CFId
join r in ResultDataContext.Results on p.PatternId equals r.PatternId
join fi in ResultDataContext.IclFileInfos on r.IclFileId equals fi.IclFileId
join sp in sessionProfileDataContext.ServerProfiles on fi.ServerProfileId equals sp.ProfileId
join u in infrastructure.Users on sp.UserId equals u.Id
where w.Process.Equals(processName)
select u.DistributedAppId;
When I executed it and inspected the result in QuickWatch, it showed this message:
the query contains references to items defined on a different data context
After some googling, I found a topic on Stack Overflow itself about simulating cross-context joins. As suggested there, I changed my query to this:
var result = from w in PatternDataContext.Windows
join cf in PatternDataContext.ControlFocus on w.WindowId equals cf.WindowId
join p in PatternDataContext.Patterns on cf.CFId equals p.CFId
join r in SimulateJoinResults() on p.PatternId equals r.PatternId
join fi in SimulateJoinIclFileInfos() on r.IclFileId equals fi.IclFileId
join sp in SimulateJoinServerProfiles() on fi.ServerProfileId equals sp.ProfileId
join u in SimulateJoinUsers() on sp.UserId equals u.Id
where w.Process.Equals(processName)
select u.DistributedAppId;
This query is using these SimulateXyz methods:
private static IQueryable<Result> SimulateJoinResults()
{
return from r in SessionDataProvider.Instance.ResultDataContext.Results select r;
}
private static IQueryable<IclFileInfo> SimulateJoinIclFileInfos()
{
return from f in SessionDataProvider.Instance.ResultDataContext.IclFileInfos select f;
}
private static IQueryable<ServerProfile> SimulateJoinServerProfiles()
{
return from sp in sessionProfileDataContext.ServerProfiles select sp;
}
private static IQueryable<User> SimulateJoinUsers()
{
return from u in infrastructureDataContext.Users select u;
}
But even this approach didn't solve the problem. I'm still getting this message in QuickWatch...:
the query contains references to items defined on a different data context
Any solution for this problem? Along with the solution, I would also want to know why the problem still exists, and how exactly the new solution removes it, so that from next time I could solve such problems myself. I'm new to LINQ, by the way.

I've had to do this before, and there are two ways to do it.
The first is to move all the servers into a single context. You do this by pointing LINQ-to-SQL to a single server, then, in that server, creating linked servers to all the other servers. Then you just create views for any tables you're interested in from the other servers, and add those views to your context.
The second is to manually do the joins yourself, by pulling in data from one context, and using just the properties you need to join into another context. For example,
int[] patternIds = SessionDataProvider.Instance.ResultDataContext.Results.Select(o => o.PatternId).ToArray();
var results = from p in PatternDataContext.Patterns
where patternIds.Contains(p.PatternId)
select p;
Though the first is easier to work with, it does have its share of problems. The problem is that you're relying on SQL Server to be performant with linked servers, something it is notoriously bad at. For example, consider this query:
var results = from p in DataContext.Patterns
join r in DataContext.LinkedServerResults on p.PatternId equals r.PatternId
where r.userId == 10
select p;
When you enumerate this query, the following will occur (let's call the normal and linked servers MyServer and MyLinkedServer, respectively)
MyServer asks MyLinkedServer for the Results
MyLinkedServer sends the Results back to MyServer
MyServer takes those Results, joins them on the Patterns table, and returns only the ones with Results.userId = 10
So now the question is: When is the filtering done - on MyServer or MyLinkedServer? In my experience, for such a simple query, it will usually be done on MyLinkedServer. However, once the query gets more complicated, you'll suddenly find that MyServer is requesting the entire Results table from MyLinkedServer and doing the filtering after the join! This wastes bandwidth, and, if the Results table is large enough, could turn a 50ms query into a 50 second query!
You could fix slow cross-server joins using stored procedures, but if you do a lot of complex cross-server joins, you may end up writing stored procedures for most of your queries, which is a lot of work and defeats part of the purpose of using L2SQL in the first place (not having to write a lot of SQL).
In comparison, the following code would always perform the filtering on the server containing the Results table:
int[] patternIds = (from r in SessionDataProvider.Instance.ResultDataContext.Results
where r.userId == 10
select r.PatternId).ToArray();
var results = from p in PatternDataContext.Patterns
where patternIds.Contains(p.PatternId)
select p;
Which approach is best for your situation is a judgement call.
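The key-list approach can be tried without a database. The following is only a sketch: two in-memory lists stand in for tables that would live in two different data contexts, and the Result and Pattern classes, their properties, and the method names are assumptions made for illustration, not the asker's real schema.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row types standing in for tables in two separate contexts.
public class Result  { public int PatternId { get; set; } public int UserId { get; set; } }
public class Pattern { public int PatternId { get; set; } public string Name { get; set; } }

public static class ManualJoinDemo
{
    // Step 1: filter on the "Results" side and bring back only the join keys.
    public static int[] PatternIdsForUser(IEnumerable<Result> results, int userId)
    {
        return results.Where(r => r.UserId == userId)
                      .Select(r => r.PatternId)
                      .Distinct()
                      .ToArray();
    }

    // Step 2: use those keys to filter the "Patterns" side.
    public static List<Pattern> PatternsFor(IEnumerable<Pattern> patterns, int[] patternIds)
    {
        return patterns.Where(p => patternIds.Contains(p.PatternId)).ToList();
    }

    public static void Main()
    {
        var results = new List<Result>
        {
            new Result { PatternId = 1, UserId = 10 },
            new Result { PatternId = 2, UserId = 11 },
            new Result { PatternId = 3, UserId = 10 }
        };
        var patterns = new List<Pattern>
        {
            new Pattern { PatternId = 1, Name = "A" },
            new Pattern { PatternId = 2, Name = "B" },
            new Pattern { PatternId = 3, Name = "C" }
        };

        int[] ids = PatternIdsForUser(results, 10);
        foreach (Pattern p in PatternsFor(patterns, ids))
            Console.WriteLine(p.Name);   // prints A, then C
    }
}
```

With real data contexts, each of the two steps would run as its own query against its own server, and only the key array would travel between them.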
Note that there is a third potential solution which I did not mention, as it is not really a programmer-solution: you could ask your server admins to set up a replication task to copy the necessary data from MyLinkedServer to MyServer once a day/week/month. This is only an option if:
Your program can work with slightly stale data from MyLinkedServer
You only need to read, never write, to MyLinkedServer
The tables you need from MyLinkedServer are not exorbitantly huge
You have the space/bandwidth available
Your database admins are not stingy/lazy

Your SimulateJoin methods can't work because they return IQueryable. Your current solution is exactly the same as your former one, which is why you get the same exception. If you check the linked question again, you will see that their helper methods return IEnumerable, which is the only way to perform cross-context operations. It means the join is performed in memory on the application server instead of on the database server: all the data from your partial queries is pulled across and the join is executed as LINQ to Objects.
Cross context join on database level is IMO not possible. You can have different connections, different connection strings with different servers, etc. Linq-to-sql does not handle this.

You could work around it by "escaping from" Linq to SQL on the second context, i.e., calling for instance .ToList() on ResultDataContext.Results and ResultDataContext.IclFileInfos so that your query ended up looking like:
var result = from w in PatternDataContext.Windows
join cf in PatternDataContext.ControlFocus on w.WindowId equals cf.WindowId
join p in PatternDataContext.Patterns on cf.CFId equals p.CFId
join r in ResultDataContext.Results.ToList()
on p.PatternId equals r.PatternId
join fi in ResultDataContext.IclFileInfos.ToList()
on r.IclFileId equals fi.IclFileId
join sp in sessionProfileDataContext.ServerProfiles on
fi.ServerProfileId equals sp.ProfileId
join u in infrastructure.Users on sp.UserId equals u.Id
where w.Process.Equals(processName)
select u.DistributedAppId;
Or use AsEnumerable(); either works as long as you "get out" of Linq to SQL and into Linq to Objects for the "offending" context.

Old question, but as I happened to have the same problem, my solution was to pass the manually crafted T-SQL cross-server query (with linked servers) directly to the provider through the ExecuteQuery method of the first context:
db.ExecuteQuery(Of cTechSupportCall)(strSql).ToList
This just saves you from having to create a view server side, and Linq to SQL still maps the results to the proper type. This is useful when there is that one query that is just impossible to formulate in Linq.

Related

LinqToSQL joining local list to table - confusion

I am developing an application using LinqToSQL. As part of this I create a list of integers, which represent keys I want to filter. Every time in the past that I've done this and tried to join my list and the data table I get the following error:
Local sequence cannot be used in LINQ to SQL implementation of query operators except the Contains() operator
Now this is fine because, as I understand it, it is a limitation/feature of LinqToSQL. I've been using the Contains operator for my queries as shown:
List<CargoProduct> cargoProducts = context.CargoProducts
.Where(cp => cargos.Contains(cp.CargoID))
.ToList();
Recently I've come across the 2100 item limitation in Contains, so was looking for other ways to do it, eventually coming up with the following:
List<CargoProduct> cargoProducts = context.CargoProducts.AsEnumerable()
.Join(cargos, cp => cp.CargoID, c => c, (cp, c) => cp)
.ToList();
Now, that works fine, so I was putting together a knowledge-sharing email for the other developers in case they came across this limitation. I was trying to reproduce the error message, so I put together another query that I'd expect to fail:
List<CargoProduct> results = (from c in cargos
join cp in context.CargoProducts on c equals cp.CargoID
select cp).ToList();
Much to my surprise, not only did this not throw an error but it returned exactly the same results as the previous query. So, what am I missing here? I'm sure it's something obvious!
For reference, context is my LinqToSQL connection and cargos is instantiated as:
List<int> cargos = context.Cargos.Select(c => c.CargoID).ToList();
Update
As mentioned in the reply it would indeed appear to be the order in which I am joining stuff, as if I use the following then I get the expected error message:
List<CargoProduct> test3 = (from cp in context.CargoProducts
join c in cargos on cp.CargoID equals c
select cp).ToList();
It's interesting functionality and I think I understand why it is doing what it does. Could be a good workaround instead of using Contains for smaller datasets.
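For larger key lists, another way around the 2100 item limit is to keep using Contains but send the keys in batches. The following is a minimal sketch, assuming the keys are distinct; QueryInBatches and its parameters are hypothetical names, and an in-memory array stands in for context.CargoProducts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchedContains
{
    // Splits 'keys' into batches of at most 'batchSize' items and concatenates
    // the rows that 'queryBatch' returns for each batch.
    public static List<TRow> QueryInBatches<TRow>(
        IList<int> keys, int batchSize, Func<List<int>, IEnumerable<TRow>> queryBatch)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        var rows = new List<TRow>();
        for (int start = 0; start < keys.Count; start += batchSize)
        {
            List<int> batch = keys.Skip(start).Take(batchSize).ToList();
            rows.AddRange(queryBatch(batch)); // one query per batch
        }
        return rows;
    }

    public static void Main()
    {
        // In-memory stand-in for the CargoID column of context.CargoProducts.
        int[] cargoProductCargoIds = Enumerable.Range(1, 10).ToArray();
        var cargos = new List<int> { 2, 4, 6, 8, 11 };

        List<int> matches = QueryInBatches(cargos, 2,
            batch => cargoProductCargoIds.Where(id => batch.Contains(id)));

        Console.WriteLine(string.Join(",", matches)); // prints 2,4,6,8
    }
}
```

Against a real context the lambda would be something like batch => context.CargoProducts.Where(cp => batch.Contains(cp.CargoID)), with a batch size comfortably below 2100.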
In this query
List<CargoProduct> results = (from c in cargos
join cp in context.CargoProducts on c equals cp.CargoID
select cp).ToList();
the left operand in the join statement is of type IEnumerable, so the Enumerable.Join extension method is chosen by method overload resolution. This means that the whole CargoProducts table is loaded into memory and filtered via Linq To Objects. It is similar to calling context.CargoProducts.AsEnumerable().
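The overload choice can be observed without a database. This sketch uses AsQueryable() over an in-memory list as a stand-in for the LINQ to SQL table, so neither query throws here; the class and method names are made up for the demonstration. It only shows which Join overload the compiler binds to in each ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CargoProduct { public int CargoID { get; set; } }

public static class JoinBindingDemo
{
    // Local list first: only Enumerable.Join applies, so this is LINQ to Objects.
    public static object LocalListFirst(List<int> cargos, IQueryable<CargoProduct> products)
    {
        return from c in cargos
               join cp in products on c equals cp.CargoID
               select cp;
    }

    // Queryable source first: Queryable.Join is chosen, so the query provider
    // is asked to translate the whole join, including the local list.
    public static object QueryableFirst(List<int> cargos, IQueryable<CargoProduct> products)
    {
        return from cp in products
               join c in cargos on cp.CargoID equals c
               select cp;
    }

    public static void Main()
    {
        var cargos = new List<int> { 1 };
        IQueryable<CargoProduct> products =
            new List<CargoProduct> { new CargoProduct { CargoID = 1 } }.AsQueryable();

        Console.WriteLine(LocalListFirst(cargos, products) is IQueryable); // False
        Console.WriteLine(QueryableFirst(cargos, products) is IQueryable); // True
    }
}
```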

LINQ to Dynamics CRM query optimisation

I'm working with a website that links into a Dynamics CRM Online. I'm new to both of these but find the best way to learn is to put yourself under pressure.
Anyway, I have the following LINQ query that I've built using LinqPad:
from m in py3_membershipSet
join c in ContactSet on m.py3_Member.Id equals c.ContactId
where m.statuscode.Value == 1
orderby m.py3_name
select m
However, this gives an out of memory exception. It runs ok if I use Take(100) but I expect there to be about 1200 results to retrieve in total. Whether the memory issue is a LinqPad related problem I don't know but either way, I am assuming the above query isn't the most efficient way to pull these results.
I could really do with some help on making this more efficient, if it is as much of a memory hog as it appears via LinqPad.
An OutOfMemory exception "...is thrown when there is not enough memory to continue the execution of a program."
So I don't think it is anything in particular with the LINQ you have written, apart from it returning more data than your client can cope with. I suspect this issue has more to do with your client than with CRM or LINQ.
This might be something to do with LinqPad (I've not used it myself). Have you tried running that script from a console app, to rule out any LinqPad issues?
1200 doesn't sound like an awful lot of data. I often retrieve around 1000 records without issue, and I have happily retrieved far more (around 5000).
Paging might avoid the problem; see Page Large Result Sets with LINQ.
Related reading: Troubleshooting Exceptions: System.OutOfMemoryException
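As a rough illustration of the paging idea, the sketch below reads an ordered query one page at a time with Skip and Take. It is a generic LINQ example over an in-memory source, not CRM-specific code; the Pages helper is a hypothetical name, and whether a given provider translates Skip and Take efficiently is something to verify against that provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Yields the rows of 'orderedQuery' one page at a time.
    // The query should have a stable ordering, otherwise pages may overlap.
    public static IEnumerable<List<T>> Pages<T>(IQueryable<T> orderedQuery, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");

        for (int page = 0; ; page++)
        {
            List<T> rows = orderedQuery.Skip(page * pageSize).Take(pageSize).ToList();
            if (rows.Count == 0) yield break;        // nothing left
            yield return rows;
            if (rows.Count < pageSize) yield break;  // short page means last page
        }
    }

    public static void Main()
    {
        // In-memory stand-in for an ordered query against a remote source.
        IQueryable<int> source = Enumerable.Range(1, 7).AsQueryable().OrderBy(n => n);

        foreach (List<int> page in Pages(source, 3))
            Console.WriteLine(string.Join(",", page));
    }
}
```

Only one page of rows is held in memory at a time, as long as the caller does not keep references to earlier pages.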
Because the query does not know what fields will be needed later, all columns are returned from the entity when only the entity is specified in the select clause. In order to specify only the fields you will use, you must return a new object in the select clause, specifying the fields you want to use.
So instead of this:
from m in py3_membershipSet
join c in ContactSet on m.py3_Member.Id equals c.ContactId
where m.statuscode.Value == 1
orderby m.py3_name
select m
Use this:
from m in py3_membershipSet
join c in ContactSet on m.py3_Member.Id equals c.ContactId
where m.statuscode.Value == 1
orderby m.py3_name
select new py3_membership()
{
py3_membershipid = m.py3_membershipid,
py3_name = m.py3_name
}
Check out this post for more details: To Linq or not to Linq

DbContext times out on remote server only

I've got a Linq To Sql query (or with brackets) here that runs on my local SQL2008 in about a second or less, but on the remote server it takes around 2 minutes 10 seconds. There are about 56k items in dbo.Movies, dbo.BoxArts, and 300k in dbo.OmdbEntries.
SELECT
-- pull distinct t_meter out of the created object
Distinct2.t_Meter AS t_Meter
-- match all movie data on the same movie_id
FROM ( SELECT DISTINCT
Extent2.t_Meter AS t_Meter
FROM dbo.Movies AS Extent1
INNER JOIN dbo.OmdbEntries AS Extent2 ON Extent1.movie_ID = Extent2.movie_ID
INNER JOIN dbo.BoxArts AS Extent3 ON Extent1.movie_ID = Extent3.movie_ID
-- pull the genres matched on movie_ids
INNER JOIN (SELECT DISTINCT
Extent4.movie_ID AS movie_ID
FROM dbo.MovieToGenres AS Extent4
-- all genres matched on movie ids
INNER JOIN dbo.Genres AS Extent5 ON Extent4.genre_ID = Extent5.genre_ID ) AS Distinct1 ON Distinct1.movie_ID = Extent1.movie_ID
WHERE 1 = 1
) AS Distinct2
-- sort the t_meters ascending
ORDER BY Distinct2.t_Meter ASC
The inner query first takes all the related items in the tables and creates a new object, then from that object finds only the t_Meters that aren't null. Then, from those t_Meters, it selects only the distinct items and sorts them, to return a list of 98 or so ints.
I don't know enough about SQL databases yet to know intuitively whether that's an extreme set of db calls to put into a single query, but since it only takes a second or less on my local server, I thought it was alright.
edit: Here's the LINQ code that I haven't really cleaned up at all: http://pastebin.com/JUkdjHDJ It's messy, but it gets the job done... The fix I found was to call ToArray after OrderBy but before Distinct, which helped out immensely. So instead of
var results = IQueryableWithDBDatasTMeter.Distinct().OrderBy().ToArray()
I did
var orderedResults = IQueryableWithDBDatasTMeter.OrderBy().ToArray()
var distinctOrderedResults = orderedResults.Distinct().ToArray()
I'm sure had I linked the Linq code (and cleaned it up) rather than the autogenerated SQL query, you would have been able to solve this easily, sorry about that.
I guess it works because it's running the Distinct only against the Array in memory, rather than the entire DB's worth of entries? I'm not really sure though, since the old LINQ works flawlessly on my local server.

Linq To SQL equivalent group by with multiple table columns in the output

I have just started on a project that uses Linq To SQL (there are various reasons why this is so, but for the moment, that is what is being used, not EF or ANOther ORM).
I have been tasked with migrating old (and I'm talking VB6 here) legacy code.
I come from a predominantly T-SQL background, so I knocked up a query that would do what I want, but I have to use LINQ to SQL (c# 3.5), which I don't have much experience with.
Note that the database will be SQL Server 2008 R2 and/or SQL Azure
Here is the T-SQL (simplified)
SELECT TBS.ServiceAvailID, sum(Places) as TakenPlaces,MAX(SA.TakenPlaces)
FROM TourBookService TBS
JOIN TourBooking TB
ON TB.TourBookID=TBS.TourBookID
JOIN ServiceAvail SA
ON TBS.ServiceAvailID = SA.ServiceAvailID
WHERE TB.Status = 10
AND ServiceSvrCode='test'
GROUP BY TBS.ServiceAvailID
HAVING sum(Places) <> MAX(SA.TakenPlaces)
So, there is a TourBooking table which has details of a customer's booking. This hangs off the TourBookService table which has details of the service they have booked. There is also a ServiceAvail table which links to the TourBookService table. Now, the sum of the Places should equal the Taken places amount in the ServiceAvail table, but sometimes this is not the case. This query gives back anything where this is not the case. I can create the Linq to just get the sum(places) details, but I am struggling to get the syntax to also get the TakenPlaces (note that this doesn't include the HAVING clause either)
var q = from tbs in TourBookServices
join tb in TourBookings on tbs.TourBookID equals tb.TourBookID
join sa in ServiceAvails on tbs.ServiceAvailID equals sa.ServiceAvailID
where (tb.Status == 10)
&& ( tbs.ServiceSvrCode =="test")
group tbs by tbs.ServiceAvailID
into g
select new {g.Key, TotalPlaces = g.Sum(p => p.Places)};
I need to somehow get the sa table into the group so that I can add g.Max(p => p.TakenPlaces) to the select.
Am I trying to force T-SQL thinking into LINQ ?
I could just have another query that gets all the appropriate details from the ServiceAvail table, then loop through both result sets and match on the key, which would be easy to do, but feels wrong (but that may just be me!)
Any comments would be appreciated.
UPDATE:
As per the accepted answer below, this is what Linqer gave me. I will have a play and see what SQL it actually creates.
from tbs in db.TourBookService
join sa in db.ServiceAvail on tbs.ServiceAvailID equals sa.ServiceAvailID
where
tbs.TourBooking.Status == 10 &&
tbs.ServiceSvrCode == "test"
group new {tbs, sa} by new {
tbs.ServiceAvailID
} into g
where g.Sum(p => p.tbs.Places) != g.Max(p => p.sa.TakenPlaces)
select new {
ServiceAvailID = (System.Int32?)g.Key.ServiceAvailID,
TakenPlaces = (System.Int32?)g.Sum(p => p.tbs.Places),
Column1 = (System.Int32?)g.Max(p => p.sa.TakenPlaces)
}
In your case I would try some kind of converter. In my personal experience, the program at http://sqltolinq.com/ often works very well for converting SQL to LINQ.
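The core of the converted query is grouping a pair of joined rows, so that aggregates from both tables are available on the group. Below is a cut-down sketch of that idea over in-memory lists; the classes are simplified stand-ins for the real tables, and the Status and ServiceSvrCode filters are deliberately left out to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the TourBookService and ServiceAvail tables.
public class TourBookService { public int ServiceAvailID { get; set; } public int Places { get; set; } }
public class ServiceAvail    { public int ServiceAvailID { get; set; } public int TakenPlaces { get; set; } }

public static class GroupDemo
{
    // Returns the ServiceAvailIDs whose booked places do not add up to TakenPlaces.
    public static List<int> MismatchedServiceAvailIds(
        IEnumerable<TourBookService> services, IEnumerable<ServiceAvail> avails)
    {
        var q = from tbs in services
                join sa in avails on tbs.ServiceAvailID equals sa.ServiceAvailID
                // Group the joined pair so both rows are reachable from the group.
                group new { tbs, sa } by tbs.ServiceAvailID into g
                // Equivalent of HAVING sum(Places) <> MAX(SA.TakenPlaces).
                where g.Sum(p => p.tbs.Places) != g.Max(p => p.sa.TakenPlaces)
                select g.Key;
        return q.ToList();
    }

    public static void Main()
    {
        var avails = new List<ServiceAvail>
        {
            new ServiceAvail { ServiceAvailID = 1, TakenPlaces = 5 },
            new ServiceAvail { ServiceAvailID = 2, TakenPlaces = 4 }
        };
        var services = new List<TourBookService>
        {
            new TourBookService { ServiceAvailID = 1, Places = 2 },
            new TourBookService { ServiceAvailID = 1, Places = 3 }, // 2 + 3 == 5, consistent
            new TourBookService { ServiceAvailID = 2, Places = 1 },
            new TourBookService { ServiceAvailID = 2, Places = 1 }  // 1 + 1 != 4, mismatch
        };

        Console.WriteLine(string.Join(",", MismatchedServiceAvailIds(services, avails))); // prints 2
    }
}
```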

Joins and subqueries in LINQ

I am trying to do a join with a subquery and can't seem to get it right. Here is what it looks like, working, in SQL. How do I get it to work in LINQ?
SELECT po.*, p.PermissionID
FROM PermissibleObjects po
INNER JOIN PermissibleObjects_Permissions po_p ON (po.PermissibleObjectID = po_p.PermissibleObjectID)
INNER JOIN Permissions p ON (po_p.PermissionID = p.PermissionID)
LEFT OUTER JOIN
(
SELECT u_po.PermissionID, u_po.PermissibleObjectID
FROM Users_PermissibleObjects u_po
WHERE u_po.UserID = '2F160457-7355-4B59-861F-9871A45FD166'
) used ON (p.PermissionID = used.PermissionID AND po.PermissibleObjectID = used.PermissibleObjectID)
WHERE used.PermissionID is null
Without seeing your database and data model, it's pretty impossible to offer any real help. But, probably the best way to go is:
download linqpad - http://www.linqpad.net/
create a connection to your database
start with the innermost piece - the subquery with the "where" clause
get each small query working, then join them up. Linqpad will show you the generated SQL, as well as the results, so build your small queries up until they are right
So, basically, split your problem up into smaller pieces. Linqpad is fantastic as it lets you test these things out, and check your results as you go
hope this helps, good luck
Toby
The LINQ translation for your query is surprisingly simple:
from pop in PermissibleObjectPermissions
where !pop.UserPermissibleObjects.Any (
upo => upo.UserID == new Guid ("2F160457-7355-4B59-861F-9871A45FD166"))
select new { pop.PermissibleObject, pop.PermissionID }
In words: "From all object permissions, retrieve those that have no user-permission whose UserID is 2F160457-7355-4B59-861F-9871A45FD166".
You'll notice that this query uses association properties for navigating relationships - this avoids the need for "joining" and simplifies the query. As a result, the LINQ query is much closer to its description in English than the original SQL query.
The trick, when writing LINQ queries, is to get out of the habit of "transliterating" SQL into LINQ.
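To see the same "left outer join ... where null" logic expressed with !Any outside a database, here is a small in-memory sketch. There are no association properties on plain lists, so it compares the keys explicitly; the two classes and the NotGrantedTo method are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for PermissibleObjects_Permissions and Users_PermissibleObjects.
public class ObjectPermission
{
    public int PermissibleObjectID { get; set; }
    public int PermissionID { get; set; }
}

public class UserObjectPermission
{
    public Guid UserID { get; set; }
    public int PermissibleObjectID { get; set; }
    public int PermissionID { get; set; }
}

public static class AntiJoinDemo
{
    // Returns every object/permission pair that the given user does NOT hold.
    public static List<ObjectPermission> NotGrantedTo(
        Guid userId, IEnumerable<ObjectPermission> all, IEnumerable<UserObjectPermission> granted)
    {
        return all.Where(op => !granted.Any(g =>
                      g.UserID == userId &&
                      g.PermissionID == op.PermissionID &&
                      g.PermissibleObjectID == op.PermissibleObjectID))
                  .ToList();
    }

    public static void Main()
    {
        var user = new Guid("2F160457-7355-4B59-861F-9871A45FD166");
        var all = new List<ObjectPermission>
        {
            new ObjectPermission { PermissibleObjectID = 1, PermissionID = 100 },
            new ObjectPermission { PermissibleObjectID = 1, PermissionID = 200 }
        };
        var granted = new List<UserObjectPermission>
        {
            new UserObjectPermission { UserID = user, PermissibleObjectID = 1, PermissionID = 100 }
        };

        foreach (ObjectPermission op in NotGrantedTo(user, all, granted))
            Console.WriteLine(op.PermissibleObjectID + ":" + op.PermissionID); // prints 1:200
    }
}
```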
