I have two tables in my database: TPM_AREAS and TPM_WORKGROUPS. There exists a many-to-many relationship between these two tables, and these relationships are stored in a table called TPM_AREAWORKGROUPS. This table looks like this:
What I need to do is load all these mappings into memory at once, in the quickest way possible. As TPM_AREAWORKGROUPS is an association, I can't just say:
var foo = (from aw in context.TPM_AREAWORKGROUPS select aw);
I can think of three ways to do this, but I'm not quite sure how to accomplish each of them, nor which one is best.
1) Load in every workgroup, including the associated areas:
Something like:
var allWG = (from w in context.TPM_WORKGROUPS.Include("TPM_AREAS")
where w.TPM_AREAS.Count > 0
select w);
// Loop through this enumeration and manually build a mapping of distinct AREAID/WORKGROUPID combinations.
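The manual mapping step might look something like this (a rough sketch, assuming integer AREAID/WORKGROUPID keys):
var mapping = new HashSet<Tuple<int, int>>();
foreach (var w in allWG)
    foreach (var a in w.TPM_AREAS)
        mapping.Add(Tuple.Create(a.AREAID, w.WORKGROUPID)); // HashSet keeps combinations distinct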
Pros: This is probably the standard EntityFramework way of doing things, and doesn't require me to change any of the database structure or mappings.
Cons: Could potentially be slow, since the TPM_WORKGROUPS table is rather large and the TPM_AREAWORKGROUPS table only has 13 rows. Plus, there's no TPM_AREAWORKGROUPS class, so I'd have to return a collection of Tuples or make a new class for this.
2) Change my model
Ideally, I'd like a TPM_AREAWORKGROUP class, and a context.TPM_AREAWORKGROUP property. I used the designer to create this model directly from the database, so I'm not quite sure how to force this association to be an actual model. Is there an easy way to do this?
Pros: It would allow me to select directly against this table, done in one line of code. Yay!
Cons: Forces me to change my model, but is this a bad thing?
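Regarding option 2: for what it's worth, if this were a code-first model (EF 4.1+) rather than a designer-generated one, surfacing the join table as its own entity would look roughly like this sketch (the class and property names are my own guesses):
public class TPM_AREAWORKGROUP
{
    public int AreaId { get; set; }
    public int WorkgroupId { get; set; }
}

// In the DbContext:
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    // Composite primary key over the two foreign key columns
    modelBuilder.Entity<TPM_AREAWORKGROUP>()
        .HasKey(x => new { x.AreaId, x.WorkgroupId })
        .ToTable("TPM_AREAWORKGROUPS");
}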
3) Screw it, use raw SQL to get what I want.
I can get the StoreConnection property of the context, and call CreateCommand() directly. I can then just do:
using (DbCommand cmd = conn.CreateCommand())
{
    cmd.CommandText = "SELECT AreaId, WorkgroupId FROM TPM_AREAWORKGROUPS";
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // Read each AreaId/WorkgroupId mapping
        }
    }
}
Pros: Fast, easy, doesn't require me to change my model.
Cons: Seems kind of hacky. Everywhere else in the project, we're just using standard Entity Framework code so this deviates from the norm. Also, it has the same issues as the first option; there's still no TPM_AREAWORKGROUPS class.
Question: What's the best solution for this problem?
Ideally, I'd like to do #2 however I'm not quite sure how to adjust my model. Or, perhaps someone knows of a better way than my three options.
You could do:
var result = context
.TPM_WORKGROUPS
.SelectMany(z => z.TPM_AREAS.Select(z2 => new
{
z2.AREAID,
z.WORKGROUPID
}));
The translated SQL will be a simple SELECT AREAID, WORKGROUPID FROM TPM_AREAWORKGROUPS.
About other options:
I wouldn't use option 3) because I personally avoid raw SQL as much as possible when using Entity Framework (see https://stackoverflow.com/a/8880157/870604 for some reasons).
I wouldn't use option 2) because you would have to change your model, and there is a simple and efficient way that doesn't require changing it.
What about using projection to load the data?
You could fill an anonymous object with just the columns you need and then work with it however you like.
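For example, a minimal sketch reusing the entity names from the question:
var mappings = (from w in context.TPM_WORKGROUPS
                from a in w.TPM_AREAS
                select new { a.AREAID, w.WORKGROUPID }).ToList();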
Related
I'm using EF 6 to work with a somewhat shoddily constructed database. I'm using a code-first model.
A lot of the logical relations there aren't implemented correctly using keys, but use various other strategies (Such as character-separated ids or strings, for example) that were previously manipulated using complex SQL queries.
(Changing the schema is not an option)
I really want to capture those relations as properties. It's possible to do this by using explicit queries instead of defining actual relations using the fluent/attribute syntax.
I'm planning to do this by having IQueryable<T> properties that perform a query. For example:
partial class Product {
public IQueryable<tblCategory> SubCategories {
get {
//SubCategoriesID is a string like "1234, 12351, 12" containing a list of IDs.
var ids = SubCategoriesID.Split(',').Select(x => int.Parse(x.Trim()));
return from category in this.GetContext().tblCategories
where ids.Contains(category.CategoryID)
select category;
}
}
}
(The GetContext() method is an extension method that somehow acquires an appropriate DbContext)
However, is there a better way to do this that I'm not familiar with?
Furthermore, if I do do this, what's the best way of getting the DbContext for the operation? It could be:
Just create a new one. I'm a bit leery of doing this, since I don't know much about how they work.
Use some tricks to get the context that was used to create this specific instance.
Do something else?
First, I would recommend not returning an IQueryable, as that retains a relationship to the original DbContext. Instead, I'd ToList the results of the query and return them as an IEnumerable<tblCategory>.
Try not to keep DbContext instances hanging around; there's a lot of state management baked into them, and since they are not thread-safe you don't want to have multiple threads hitting the same instance. The pattern I personally tend to follow on data access methods is to use a new DbContext in a using block:
using (var ctx = new YourDbContextTypeHere()) {
return (from category in ctx.tblCategories
where ids.Contains(category.CategoryID)
select category).ToList();
}
Beware that .Contains() over an in-memory list of ids is very slow in EF, because the list gets translated into a huge IN clause; try to avoid it. I'd use a subquery instead, such as:
var subCategories = context.SubCategories.Where(...);
var categories = context.Categories.Where(c => subCategories.Select(s => s.Id).Contains(c.CategoryId));
In this setup the ids are never materialized in the application; the filter translates to a single SQL query with a subquery, and it will be fast.
I see tons of questions on LINQ to SQL vs stored procs. I'm more curious about the benefits of using them in tandem as it relates to object mapping.
I have my business objects defined, and I have stored procedures for all of my CRUD transactions.
Is it better to plop all the stored procs into a DBML file and call them from there, and then map the results to my business objects, or is it better to just use a DataReader and map it from there?
It's annoying to me because I want my objects as I define them, rather than the MyStoredProcResult objects that LINQ to SQL generates, so I feel I'm doing the same field-by-field mapping as I would with a DataReader.
Performance isn't necessarily key here (unless it's ridiculously slow). I'm looking to create a standard way for all our developers to load data from a database into an object in the simplest fashion with the least amount of code.
Mapping with LINQ2SQL has a serious advantage in being type-safe - you don't have to worry about parsing the results or adding command parameters; it does it all for you.
On the other hand, calling stored procedures directly with SqlCommand and a DataReader gives better performance (especially when reading or changing a lot of data).
Regardless of which you choose, it is better to build a separate Data Access Layer, as it allows more flexibility. The logic for accessing and changing the database should not be built into your business objects, because if you are ever forced to change how you store your data, updating your software will be painful.
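As a rough illustration of that boundary (a sketch only; the repository name and methods are assumptions):
// Business code depends on this abstraction rather than on LINQ to SQL,
// SqlCommand, or stored-procedure plumbing directly.
public interface ICompanyRepository
{
    Company GetById(int id);
    IList<Company> GetAll();
    void Save(Company company);
}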
Not a direct answer to your question, but if you want your objects as the result of a query, you should probably consider a code-first approach. LINQ to SQL does not support this, but Entity Framework and NHibernate do.
The direct answer is that a DataReader will obviously have less overhead, but at the same time it will have many more magic strings. Overhead is bad in terms of performance (in your case, not that big); magic strings are bad in terms of maintainability. So ultimately it comes down to your personal choice.
LINQ2SQL can provide your objects populated with the results of the query. You will have to build child objects in such a way as to support either a List(Of T) or a List<T>, depending on your language choice.
Suppose you have a table with an ID, a Company Name, and a Phone Number for fields. Querying that table would be straightforward in either LINQ or a stored procedure. The advantage that LINQ brings is the ability to map the results to either anonymous types or your own classes. So a query of:
var doSomething = from sList in myTableRef select sList;
would return an anonymous type. However, if you also have a class like this:
public class Company
{
    public int ID;
    public string Name; // a field can't share its enclosing class's name
    public string PhoneNumber;
}
changing your query to this will populate Company objects as it moves through the data:
List<Company> companies = (from sList in myTableRef
                           select new Company
                           {
                               ID = sList.id,
                               Name = sList.company,
                               PhoneNumber = sList.phonenumber
                           }).ToList();
My C# syntax may not be 100% correct as I mainly code in VB, but it will be close enough to get you there.
I'm using the PetaPoco mini-ORM, which in my implementation runs stored procedures and maps them to object models I've defined. This works very intuitively for queries that pull from a single table (e.g. SELECT * FROM Orders), but less so when I start writing queries that pull aggregate results. For example, say I've got a Customers table and an Orders table, where the Orders table contains a foreign key reference to a CustomerID. I want to retrieve a list of all orders, but in the view of my application, display the Customer name as well as all the other order fields, i.e.
SELECT
Customers.Name,
Orders.*
FROM
Orders
INNER JOIN Customers
ON Orders.CustomerID = Customers.ID
Having not worked with an ORM of any sort before, I'm unsure of the proper method to handle this sort of data. I see two options right now:
Create a new aggregate model for the specific operation. I feel like I would end up with a ton of models in any large application by doing this, but it would let me map a query result directly to an object.
Have two separate queries, one that retrieves Orders, another that retrieves Customers, then join them via LINQ. This seems a better alternative than #1, but similarly seems obtuse as I am pulling out 30 columns when I desire one (although my particular mini-ORM allows me to pull out just one row and bind it to a model).
Is there a preferred method of doing this, either of the two I mentioned, or a better way I haven't thought of?
Option #1 is common in CQRS-based architectures. It makes sense when you think about it: even though it requires some effort, it maps intuitively to what you are doing, and it doesn't impact other pieces of your solution. So if you have to change it, you can do so without breaking anything elsewhere.
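As a concrete sketch of option #1 (the model and connection-string names here are assumptions; Database.Fetch<T> is standard PetaPoco):
// Aggregate read model for the orders-with-customer-name view
public class OrderWithCustomer
{
    public int ID { get; set; }
    public int CustomerID { get; set; }
    public string Name { get; set; } // Customers.Name
    // ...remaining Orders columns...
}

var db = new PetaPoco.Database("connectionStringName");
var orders = db.Fetch<OrderWithCustomer>(@"
    SELECT Customers.Name, Orders.*
    FROM Orders
    INNER JOIN Customers ON Orders.CustomerID = Customers.ID");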
I need to reset a boolean field in a specific table before I run an update.
The table could have a million or so records, and I'd prefer not to have to do a select before the update, as it's taking too much time.
Basically, what I need is code that produces the following T-SQL (the flag is a bit column, hence 0/1):
update tablename
set flag = 0
where flag = 1
I have something close to what I need here: http://www.aneyfamily.com/terryandann/post/2008/04/Batch-Updates-and-Deletes-with-LINQ-to-SQL.aspx
I have yet to implement it, but I was wondering if there is a more standard way.
To keep within the restrictions we have for this project, we can't use sprocs or directly write T-SQL in an ExecuteStoreCommand parameter on the context, which I believe you can do.
I'm aware that what I need to do may not be directly supported in EF4, and we may need to look at a sproc for the job [in the total absence of any other way], but I just need to fully explore all the possibilities first.
In an ideal EF world, the call above to update the flag would be possible. Alternatively, it would be possible to fetch entities with only the id and the boolean flag (minus the associated entities), loop through them setting the flag, and make a single SaveChanges call. But that may not be the way it works.
Any ideas?
Thanks in advance.
Liam
I would go to the stakeholder who introduced the restrictions about not using SQL or sprocs directly and present these facts:
Updates in an ORM (like Entity Framework) work this way: you load the object, you modify it, you save it. That is the only valid way.
Obviously, in your case that would mean loading 1M entities and executing 1M updates separately (EF has no command batching - each command runs in its own roundtrip to the DB) - usually an absolutely useless solution.
The example you provided looks very interesting, but it is for LINQ to SQL, not Entity Framework. Unless you implement it you can't be sure it will work for EF, because the infrastructure in EF is much more complex. You could spend several man-days on this without any result - that risk should be approved by the stakeholder.
A solution with a sproc or direct SQL will take you a few minutes, and it will simply work.
With either solution you will have to deal with another problem. If you already have materialized entities and you run such a command (via the mentioned extension or via SQL), the changes will not be mirrored in the already loaded entities - you will have to iterate over them and set the flag yourself.
Both scenarios break the unit of work, because some data changes are executed before the unit of work completes.
It is all about using the right tool for the right requirement.
Btw. loading of related tables can be avoided; it is just a matter of the query you run. Don't use Include and don't access navigation properties (in the case of lazy loading), and you will not load the relations.
It is possible to select only the Id (via projection), create a dummy entity (set only the id and the flag), and execute updates of the flag alone, but it will still execute up to 1M updates:
using (var myContext = new MyContext(connectionString))
{
    var query = from o in myContext.MyEntities
                where o.Flag == false
                select o.Id;

    foreach (var id in query)
    {
        // Build a stub entity carrying only the key and the new flag value
        var entity = new MyEntity
        {
            Id = id,
            Flag = true
        };

        // Attach to the entity set, then mark only the Flag property as modified
        myContext.MyEntities.Attach(entity);
        myContext.ObjectStateManager.GetObjectStateEntry(entity).SetModifiedProperty("Flag");
    }

    myContext.SaveChanges();
}
Moreover, it will only work with an empty object context (or at least with no entity from the updated table attached to the context). So in some scenarios, running this before other updates will require two ObjectContext instances, which means manually sharing a DbConnection or using two database connections - and, in the case of transactions, a distributed transaction and another performance hit.
Make a new EF model and only add the one table you need to update. That way none of the joins occur, which will greatly speed up your processing.
ObjectContext.ExecuteStoreCommand ( _
commandText As String, _
ParamArray parameters As Object() _
) As Integer
http://msdn.microsoft.com/en-us/library/system.data.objects.objectcontext.executestorecommand.aspx
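For completeness, a usage sketch against the question's table (names taken from the T-SQL above):
// Executes the UPDATE directly against the store and returns the number of affected rows
int rows = myContext.ExecuteStoreCommand("UPDATE tablename SET flag = 0 WHERE flag = 1");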
Edit
Sorry, did not read the post all the way.
I have a Linq-To-Sql based repository class which I have been successfully using. I am adding some functionality to the solution, which will provide WCF based access to the database.
I have not exposed the generated Linq classes as DataContracts, I've instead created my own "ViewModel" as a POCO for each entity I am going to be returning.
My question is: in order to do updates and take advantage of some of the LINQ to SQL features like cyclic references from within my service, do I need to add a RowVersion/Timestamp field to each table in my database so I can use code like dc.Table.Attach(myDisconnectedObject)? The alternative seems ugly:
var updateModel = dc.Table.SingleOrDefault(t => t.ID == myDisconnectedObject.ID);
updateModel.PropertyA = myDisconnectedObject.PropertyA;
updateModel.PropertyB = myDisconnectedObject.PropertyB;
updateModel.PropertyC = myDisconnectedObject.PropertyC;
// and so on and so forth
dc.SubmitChanges();
I guess a RowVersion/Timestamp column on each table might be the best and least intrusive option - just check that one value, and you know whether or not your data has been modified in the meantime. All other columns can be set to Update Check=Never. This takes care of the possible concurrency issues when updating your database from "returning" objects.
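In code, that attach path might look like this (a sketch only; the DataContext and table names are assumptions, and it presumes the timestamp column is in place):
using (var dc = new MyDataContext())
{
    // With a RowVersion/Timestamp column on the table, the disconnected
    // object can be attached directly and treated as modified:
    dc.Tables.Attach(myDisconnectedObject, true); // true = attach as modified
    dc.SubmitChanges();
}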
However, the other thing you should definitely check out is AutoMapper - it's a great little component to ease those left-right-assignment orgies you have to go through when using ViewModels / Data Transfer Objects, by making the mapping between two object types a snap. It's widely used, well tested, and very stable - a winner!
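A minimal sketch with AutoMapper's classic static API (the type names here are placeholders):
// One-time configuration, typically at application startup
Mapper.CreateMap<Table, TableViewModel>();

// Each field-by-field assignment block then collapses to a one-liner
TableViewModel vm = Mapper.Map<Table, TableViewModel>(myEntity);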