I have bought the premium version of LINQPad. I thought it would also be possible to perform cross-database queries with DevForce models.
There are two ways to do this. The simplest is the drag-and-drop
approach: hold down the Ctrl key while dragging additional databases
from the Schema Explorer to the query editor. To access those
additional databases in your queries, use database.table notation,
e.g., Northwind.Regions.Take(100). The databases that you query must
reside on the same server.
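For example, a cross-database join under this approach might look like the following sketch (Northwind.Regions comes from the example above; the Customers table and the join keys are hypothetical):

//Hypothetical sketch: Customers is in the primary database; Northwind was
//dragged in with Ctrl held down, so its tables take the Northwind. prefix.
var query =
    from c in Customers
    join r in Northwind.Regions on c.RegionID equals r.RegionID
    select new { c.CompanyName, r.RegionDescription };
query.Take(100).Dump();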
The second approach is to list the extra database(s) that you want to
query in the connection properties dialog. This dialog also lets you
choose databases from linked servers. Here's how to proceed:
Add a new LINQ to SQL connection.
Choose Specify New or Existing Database and choose the primary database that you want to query.
Click the Include Additional Databases checkbox and pick the extra database(s) you want to include. You can also choose databases from
linked servers in this dialog.
Source
But apparently there isn't any way to do this with DevForce models, is there? Does anyone have a solution for this?
Cross-database querying works only with standard SQL Server connections, with databases on the same server or on linked servers. The main rationale is to ensure server-side joining (otherwise you'd end up pulling entire tables back to the client whenever you joined).
I have considered adding a feature to LINQPad to allow arbitrary cross-database queries, because sometimes it would be useful even with client-side joining. However, getting this to work with custom data contexts (such as DevForce or Entity Framework) turned out to be really tricky, and so the feature ended up in the "too-hard basket". A major problem was dealing with namespace/assembly/app.config conflicts.
Bear in mind that there's nothing to stop you from pressing F4 and adding a reference to an assembly containing an additional datacontext. Of course, you'd have to manually instantiate the second data context, but that shouldn't be a huge deal. You'll still get autocompletion, and you'll still be able to see its schema in the tree view if you create a separate connection for it. And functionally, that's what you'd end up with anyway, if LINQPad supported multi-connection queries.
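A minimal sketch of that workaround (the context class name, connection string, and table are all hypothetical; the context type would come from the assembly you referenced via F4):

//Hypothetical: SecondDataContext is defined in the assembly added via F4.
var db2 = new SecondDataContext(@"Server=OTHERSERVER;Database=OtherDb;Integrated Security=True");
db2.SomeTable.Take(10).Dump(); //autocompletion still works against the referenced types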
What's special about LINQPad's cross-database querying support for SQL Server is that it does something you couldn't otherwise do simply by adding a reference to another assembly, which is to allow efficient cross-database querying by leveraging server-side joins.
You can instantiate as many contexts as you like to disparate SQL instances and execute pseudo cross database joins, copy data, etc. Note, joins across contexts are performed locally so you must call ToList(), ToArray(), etc to execute the queries using their respective data sources individually before joining. In other words if you "inner" join 10 rows from DB1.TABLE1 with 20 rows from DB2.TABLE2, both sets (all 30 rows) must be pulled into memory on your local machine before Linq performs the join and returns the related/intersecting set (20 rows max per example).
//EF6 context not selected in Linqpad Connection dropdown
var remoteContext = new YourContext();
remoteContext.Database.Connection.ConnectionString = "Server=[SERVER];Database="
+ "[DATABASE];Trusted_Connection=false;User ID=[SQLAUTHUSERID];Password="
+ "[SQLAUTHPASSWORD];Encrypt=True;";
remoteContext.Database.Connection.Open();
var DB1 = new Repository(remoteContext);
//EF6 connection to remote database
var remote = DB1.GetAll<Table1>()
.Where(x=>x.Id==123)
//note...depending on the default Linqpad connection you may get
//"EntityWrapperWithoutRelationships" results for
//results that include a complex type. you can use a Select() projection
//to specify only simple type columns
.Select(x=>new { x.Col1, x.Col2 /* etc. */ })
.Take(1)
.ToList().Dump(); // you must execute query by calling ToList(), ToArray(),
// etc before joining
//Linq-to-SQL default connection selected in Linqpad Connection dropdown
Table2.Where(x=>x.Id == 123)
.ToList() // you must execute query by calling ToList(), ToArray(),
// etc before joining
.Join(remote, a=> a.d, b=> (short?)b.Id, (a,b)=>new{b.Col1, b.Col2, a.Col1})
.Dump();
remoteContext.Database.Connection.Close();
remoteContext = null;
I am currently using EF Extensions. One thing I don't understand: it's supposed to help with performance,
yet placing a million+ records into a List variable is a memory issue in itself.
So if I want to update a million+ records without holding everything in memory, how can this be done efficiently?
Should we use a for loop and update in batches of, say, 10,000? Does EF Extensions' BulkUpdate have any native functionality to support this?
Example:
var productUpdate = _dbContext.Set<Product>()
.Where(x => x.ProductType == "Electronics"); // this creates an IQueryable
await productUpdate.ForEachAsync(c => c.ProductBrand = "ABC Company");
await _dbContext.BulkUpdateAsync(productUpdate.ToList());
Resource:
https://entityframework-extensions.net/bulk-update
This is actually something that EF is not made for. EF's database interactions start from the record object and flow from there. EF cannot generate a partial UPDATE (i.e. one that doesn't overwrite everything) if the entity hasn't been change-tracked (and therefore loaded), and similarly it cannot DELETE records based on a condition instead of a key.
There is no EF equivalent (without loading all of those records) for conditional update/delete logic such as
UPDATE People
SET FirstName = 'Bob'
WHERE FirstName = 'Robert'
or
DELETE FROM People
WHERE FirstName = 'Robert'
Doing this using the EF approach will require you to load all of these entities just to send them back (with an update or delete) to the database, and that's a waste of bandwidth and performance as you've already found.
The best solution I've found here is to bypass EF's LINQ-friendly methods and instead execute the raw SQL yourself. This can still be done using an EF context.
using (var ctx = new MyContext())
{
string updateCommand = "UPDATE People SET FirstName = 'Bob' WHERE FirstName = 'Robert'";
int noOfRowsUpdated = ctx.Database.ExecuteSqlCommand(updateCommand);
string deleteCommand = "DELETE FROM People WHERE FirstName = 'Robert'";
int noOfRowsDeleted = ctx.Database.ExecuteSqlCommand(deleteCommand);
}
More information here. Of course don't forget to protect against SQL injection where relevant.
The specific syntax to run raw SQL may vary per version of EF/EF Core but as far as I'm aware all versions allow you to execute raw SQL.
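For instance, in EF Core 3 and later the equivalent call is ExecuteSqlRaw (or ExecuteSqlRawAsync); a minimal sketch reusing the command from the example above:

using (var ctx = new MyContext())
{
    //EF Core: same idea as ExecuteSqlCommand, different method name
    int noOfRowsUpdated = ctx.Database.ExecuteSqlRaw(
        "UPDATE People SET FirstName = 'Bob' WHERE FirstName = 'Robert'");
}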
I can't comment on the performance of EF Extensions or BulkUpdate specifically, and I'm not going to buy it from them.
Based on their documentation, they don't seem to have the methods with the right signatures to allow for conditional update/delete logic.
BulkUpdate doesn't seem to allow you to input the logical condition (the WHERE in your UPDATE command) that would allow you to optimize this.
BulkDelete still has a BatchSize setting, which suggests that they are still handling the records one at a time (well, per batch I guess), and not using a single DELETE query with a condition (WHERE clause).
Based on your intended code in the question, EF Extensions isn't really giving you what you need. It's more performant and cheaper to simply execute raw SQL on the database, as this bypasses EF's need to load its entities.
Update
I might stand corrected: there is some support for conditional update logic, as seen here. However, it is unclear to me why the example still loads everything into memory, and what the purpose of that conditional WHERE logic is if you've already loaded it all into memory (why not use in-memory LINQ then?)
However, even if this works without loading the entities, it's still:
more limited (only equality checks are allowed, compared to SQL allowing any boolean condition that is valid SQL),
relatively complex (I don't like their syntax, maybe that's subjective)
and more costly (still a paid library)
compared to rolling your own raw SQL query. I would still suggest rolling your own raw SQL here, but that's just my opinion.
I found the "proper" EF Extensions way to do a bulk update with a query-like condition:
var productUpdate = _dbContext.Set<Product>()
.Where(x => x.ProductType == "Electronics")
.UpdateFromQuery( x => new Product { ProductBrand = "ABC Company" });
This should result in a proper SQL UPDATE ... SET ... WHERE, without the need to load entities first, as per the documentation:
Why UpdateFromQuery is faster than SaveChanges, BulkSaveChanges, and BulkUpdate?
UpdateFromQuery executes a statement directly in SQL such as UPDATE [TableName] SET [SetColumnsAndValues] WHERE [Key].
Other operations normally require one or multiple database round-trips which makes the performance slower.
You can check the working syntax on this dotnet fiddle example, adapted from their example of BulkUpdate.
Other considerations
No mention of batch operations for this, unfortunately.
Before doing a big update like this, it might be worth disabling any indexes you have on the updated column and rebuilding them afterward. This is especially useful if you have many of them.
Be careful about the condition in the Where: if EF can't translate it to SQL, it will be evaluated client-side, meaning the "usual" terrible round trip of load, change in memory, update.
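As a hypothetical illustration of that last point (SomeLocalHelper is a made-up client-side method):

//Translatable: EF can turn this predicate into a SQL WHERE clause.
var ok = _dbContext.Set<Product>().Where(x => x.ProductType == "Electronics");
//Not translatable: EF has no SQL equivalent for an arbitrary local method, so
//depending on the EF version this either throws or silently loads rows and
//filters them in memory.
var bad = _dbContext.Set<Product>().Where(x => SomeLocalHelper(x.ProductType));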
I had the following:
List<Message> unreadMessages = this.context.Messages
.Where( x =>
x.AncestorMessage.MessageID == ancestorMessageID &&
x.Read == false &&
x.SentTo.Id == userID ).ToList();
foreach(var unreadMessage in unreadMessages)
{
unreadMessage.Read = true;
}
this.context.SaveChanges();
But there must be a way of doing this without having to do two SQL queries: one for selecting the items, and one for updating the list.
How do I do this?
Current idiomatic support in EF
As far as I know, there is no direct support for "bulk updates" yet in Entity Framework (there has been an ongoing discussion for bulk operation support for a while though, and it is likely it will be included at some point).
(Why) Do you want to do this?
It is clear that this is an operation that, in native SQL, can be achieved in a single statement, and provides some significant advantages over the approach followed in your question. Using the single SQL statement, only a very small amount of I/O is required between client and DB server, and the statement itself can be completely executed and optimized by the DB server. No need to transfer to and iterate through a potentially large result set client side, just to update one or two fields and send this back the other way.
How
So although not directly supported by EF, it is still possible to do this, using one of two approaches.
Option A. Handcode your SQL update statement
This is a very simple approach, that does not require any other tools/packages and can be performed Async as well:
var sql = "UPDATE TABLE x SET FIELDA = @fieldA WHERE FIELDB = @fieldB";
var parameters = new SqlParameter[] { ..., ... };
int result = db.Database.ExecuteSqlCommand(sql, parameters);
or
int result = await db.Database.ExecuteSqlCommandAsync(sql, parameters);
The obvious downside is, well breaking the nice linqy paradigm and having to handcode your SQL (possibly for more than one target SQL dialect).
Option B. Use one of the EF extension/utility packages
For a while now, a number of open-source NuGet packages have been available that offer specific extensions to EF. A number of them provide a nice "linqy" way to issue a single update SQL statement to the server. Two examples are:
Entity Framework Extended Library that allows performing a bulk update using a statement like:
context.Messages.Update(
x => x.Read == false && x.SentTo.Id == userID,
x => new Message { Read = true });
It is also available on github
EntityFramework.Utilities that allows performing a bulk update using a statement like:
EFBatchOperation
.For(context, context.Messages)
.Where(x => x.Read == false && x.SentTo.Id == userID)
.Update(x => x.Read, x => true);
It is also available on github
And there are definitely other packages and libraries out there that provide similar support.
Even SQL has to do this in two steps in a sense, in that an UPDATE query with a WHERE clause first runs the equivalent of a SELECT behind the scenes, filtering via the WHERE clause, then applying the update. So really, I don't think you need to be worried about improving this.
Further, the reason why it's broken into two steps like this in LINQ is precisely for performance reasons. You want that "select" to be as minimal as possible, i.e. you don't want to load any more objects from the database into in memory objects than you have to. Only then do you alter objects (in the foreach).
If you really want to run a native UPDATE on the SQL side, you could use a System.Data.SqlClient.SqlCommand to issue the update, instead of having LINQ give you back objects that you then update. That will be faster, but then you conceptually move some of your logic out of your C# code object model space into the database model space (you are doing things in the database, not in your object space), even if the SqlCommand is being issued from your code.
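A minimal sketch of that approach (the connection string and the exact table/column names are assumptions based on the question's model):

//Raw server-side update via System.Data.SqlClient; nothing is loaded into memory.
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "UPDATE Messages SET [Read] = 1 " +
    "WHERE AncestorMessageID = @ancestorId AND SentToId = @userId AND [Read] = 0", conn))
{
    cmd.Parameters.AddWithValue("@ancestorId", ancestorMessageID);
    cmd.Parameters.AddWithValue("@userId", userID);
    conn.Open();
    int affected = cmd.ExecuteNonQuery(); //rows updated directly in the database
}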
I use two stored procedures that return the data with the same structure (list of records of the same type).
I call my method Execute(ISession session) twice: the first time for the first stored procedure (it returns the correct list of 6 rows), and the second time for the second stored procedure (it returns a list of 11 rows, but the first 6 rows come from the first request and overwrite the correct rows).
I found
Impact on NHibernate caching for searches with results including calculated value mapped as a formula (e.g. rank)
But I can't use it for IQuery
Any ideas or links on how this can be fixed?
public dynamic Execute(ISession session)
{
var query = session.GetNamedQuery(QueryName)
.SetCacheable(false)
.SetCacheMode(CacheMode.Ignore)
.SetReadOnly(true);
var results = query.List<T>();
return results;
}
I'm going to take a stab at answering this, because I think I have a hunch of what's going on, and I want to set you on the right track. I've made a lot of assumptions here, so please don't be too harsh on me if I was completely wrong with my guesses.
It feels like you're trying to use NHibernate as a tool to simply translate rows into objects. Instead, NHibernate is a tool that translates between your object-oriented domain model and your relational database domain model. It does a lot more than just turn rows into objects. In particular, the NHibernate feature that you're tripping over here is how NHibernate ensures that, within a single NHibernate session, a single row in the database representing a single entity will correspond to a single instance of an object. It uses its first-level cache to accomplish this.
Let's say you have two queries, QueryA and QueryB. These queries have been constructed so that they each pull from separate tables, TableA and TableB, so really they represent separate entities. However, the queries have also somehow been built so that the results look to NHibernate like the same entity. If QueryA and QueryB happen to return some of the same ids, then NHibernate will combine them into the same instance, so you would see some of the results from QueryA repeated when you run QueryB.
So how do we fix it?
The quick and dirty fix would be to use different sessions for each of those two queries, or throw a session.Clear() in-between them. The more appropriate fix would be to change these named queries so that they actually do return two different entities.
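A sketch of the quick-and-dirty fix (QueryA/QueryB follow the hypothetical example above; the entity types are equally hypothetical):

var resultsA = session.GetNamedQuery("QueryA").List<EntityA>();
session.Clear(); //evict the first-level cache so QueryB's rows aren't
                 //merged with instances already loaded by QueryA
var resultsB = session.GetNamedQuery("QueryB").List<EntityB>();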
I had the same problem. At first I resolved it with session.Clear(), but that solution led to another bug. Reading Daniel's response helped me detect that the issue was in the stored procedure: it did not return a unique identifier, and this produced the error when I mapped the ID with NHibernate.
I've read many posts about this, the last being a statement saying the behaviour I'm experiencing is the expected, if not preferred, way.
I've got a handful of domain models mapped to their database tables.
I'm using a Criteria object to query them.
var query = session.CreateCriteria(typeof(Posting))
.CreateAlias("Person", "person", JoinType.InnerJoin).SetFetchMode("Person", FetchMode.Eager)
.CreateAlias("Location", "location", JoinType.InnerJoin).SetFetchMode("Location", FetchMode.Eager)
.CreateAlias("Post", "post", JoinType.InnerJoin).SetFetchMode("Post", FetchMode.Eager)
.CreateAlias("post.Zone", "postzone", JoinType.LeftOuterJoin).SetFetchMode("post.Zone", FetchMode.Select)
It produces the exact statement I'm looking for. I'd expect it to populate the objects eagerly (which it does) and leave nulls for post.Zone where the relationship fails.
In my mapping I've had to say not-found=ignore (NotFound.Ignore() in FNH), which, due to the outer join, is causing NH to generate a number of sub-queries (all on the Zone table) that don't seem necessary to me, especially as it already has the data it needs to populate the Zone object.
Can anyone shed any light on any changes I can make to the mapping or Query/Criteria to keep this load to a single query, or should I expect an extra statement for every record outer-joined to?
Thanks,
Sam
It is unfortunately a known issue that not-found=ignore causes additional selects. Bug reports have been filed for this in both NHibernate and Hibernate for quite some time (since 2007), and currently no known workaround is available other than removing not-found=ignore or changing the data in the database so that it is no longer required.
A possible work around, although not a nice one, would be to have a stored procedure perform the query you want and then call this from NHibernate. This would remove the redundant queries.
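A sketch of calling such a stored procedure from NHibernate (the named query "GetPostingsEager" is hypothetical; it would be mapped to the stored procedure in the .hbm.xml or FNH mapping):

//Assumes a named query mapped to a stored procedure that performs the joins.
var postings = session.GetNamedQuery("GetPostingsEager")
    .List<Posting>();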
I am often comparing data in tables in different databases. These databases do not have the same schema. In T-SQL, I can reference them with database.owner.table notation (DB1.dbo.Stores, DB2.dbo.OtherPlaces) to pull the data for comparison. I like the idea of LINQPad quite a bit, but I just can't seem to easily pull data from two different data contexts within the same set of statements.
I've seen people suggest simply changing the connection string to pull the data from the other source into the current schema but, as I mentioned, this will not do. Did I just skip a page in the FAQ? This seems a fairly routine procedure to be unavailable to me.
In the "easy" world, I'd love to be able to simply reference the typed datacontext that LINQPad creates. Then I could simply:
DB1DataContext db1 = new DB1DataContext();
DB2DataContext db2 = new DB2DataContext();
And work from there.
Update: it's now possible to do cross-database SQL Server queries in LINQPad (from LINQPad v4.31, with a LINQPad Premium license). To use this feature, hold down the Control key while dragging databases from the Schema Explorer to the query window.
It's also possible to query linked servers (that you've linked by calling sp_add_linkedserver). To do this:
Add a new LINQ to SQL connection.
Choose Specify New or Existing Database and choose the primary database you want to query.
Click the Include Additional Databases checkbox and pick the linked server(s) from the list.
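Once configured, the additional databases show up as properties on the typed data context, so you can mix them in one query with database.table notation. A hypothetical sketch using the table names from the question (the StoreID key is made up):

//Stores comes from the primary database; DB2.OtherPlaces from the additional one.
//The filtering should happen server-side, as with any cross-database query here.
Stores
    .Where(s => !DB2.OtherPlaces.Any(o => o.StoreID == s.StoreID))
    .Dump();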
Keep in mind that you can always create another context on your own.
public FooEntities GetFooContext()
{
var entityBuilder = new EntityConnectionStringBuilder
{
Provider = "Devart.Data.Oracle",
ProviderConnectionString = "User Id=foo;Password=foo;Data Source=Foo.World;Connect Mode=Default;Direct=false",
Metadata = @"D:\FooModel.csdl|D:\FooModel.ssdl|D:\FooModel.msl"
};
return new FooEntities(entityBuilder.ToString());
}
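Usage is then straightforward (the entity set name below is hypothetical):

var foo = GetFooContext();
foo.SomeEntities.Take(5).Dump(); //hypothetical DbSet on FooEntities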
You can instantiate as many contexts as you like to disparate SQL instances and execute pseudo cross database joins, copy data, etc. Note, joins across contexts are performed locally so you must call ToList(), ToArray(), etc to execute the queries using their respective data sources individually before joining. In other words if you "inner" join 10 rows from DB1.TABLE1 with 20 rows from DB2.TABLE2, both sets (all 30 rows) must be pulled into memory on your local machine before Linq performs the join and returns the related/intersecting set (20 rows max per example).
//EF6 context not selected in Linqpad Connection dropdown
var remoteContext = new YourContext();
remoteContext.Database.Connection.ConnectionString = "Server=[SERVER];Database="
+ "[DATABASE];Trusted_Connection=false;User ID=[SQLAUTHUSERID];Password="
+ "[SQLAUTHPASSWORD];Encrypt=True;";
remoteContext.Database.Connection.Open();
var DB1 = new Repository(remoteContext);
//EF6 connection to remote database
var remote = DB1.GetAll<Table1>()
.Where(x=>x.Id==123)
//note...depending on the default Linqpad connection you may get
//"EntityWrapperWithoutRelationships" results for
//results that include a complex type. you can use a Select() projection
//to specify only simple type columns
.Select(x=>new { x.Col1, x.Col2 /* etc. */ })
.Take(1)
.ToList().Dump(); // you must execute query by calling ToList(), ToArray(),
// etc before joining
//Linq-to-SQL default connection selected in Linqpad Connection dropdown
Table2.Where(x=>x.Id == 123)
.ToList() // you must execute query by calling ToList(), ToArray(),
// etc before joining
.Join(remote, a=> a.d, b=> (short?)b.Id, (a,b)=>new{b.Col1, b.Col2, a.Col1})
.Dump();
remoteContext.Database.Connection.Close();
remoteContext = null;
I do not think you are able to do this. See this LinqPad request.
However, you could build multiple dbml files in a separate dll and reference them in LinqPad.
Drag-and-drop approach: hold down the Ctrl key while dragging additional databases
from the Schema Explorer to the query editor.
Use case:
//Access Northwind
var ID = new Guid("107cc232-0319-4cbe-b137-184c82ac6e12");
LotsOfData.Where(d => d.Id == ID).Dump();
//Access Northwind_v2
this.NORTHWIND_V2.LotsOfData.Where(d => d.Id == ID).Dump();
Multiple databases are, as far as I know, only available in the paid version of LINQPad (what I wrote applies to LINQPad 6 Premium).
For more details, see this answer on Stack Overflow (section "Multiple database support").