Accessing foreign keys through LINQ - c#

I have a setup on SQL Server 2008 with three tables. The first has a string identifier as its primary key. The second is an attribute table. The third simply holds foreign keys into both of the other tables, so that the attributes themselves aren't held in the first table but are instead referred to. Apparently this is common in database normalization, although it still seems insane to me: since the key is a string, it would take a maximum of 1 attribute per 30 first-table row entries to yield a space benefit, let alone the time and complexity problems.
How can I write a LINQ to SQL query to only return values from the first table, such that they hold only specific attributes, as defined in the list in the second table? I attempted to use a Join or GroupJoin, but apparently SQL Server 2008 cannot use a Tuple as the return value.

"I attempted to use a Join or GroupJoin, but apparently SQL Server 2008 cannot use a Tuple as the return value."
You can use anonymous types instead of Tuples; anonymous types are supported by LINQ to SQL.
For example:
from x in source group x by new {x.Field1, x.Field2}

I'm not quite clear what you're asking for. Some code might help. Are you looking for something like this?
var q = from i in ctx.Items
        select new
        {
            i.ItemId,
            i.ItemTitle,
            Attributes = from map in i.AttributeMaps
                         select map.Attribute
        };
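If the goal is to return only the items that are linked to certain attributes, one possible shape is a filter over the mapping association. The following is a minimal sketch, not the asker's actual schema: Item, AttributeMap, AttributeId and wantedAttributeIds are assumed names, and in-memory lists stand in for the LINQ to SQL tables so that the query can run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the mapping (third) table.
public class AttributeMap
{
    public string ItemId { get; set; }
    public int AttributeId { get; set; }
}

// Hypothetical stand-in for the first table, with its association to the mapping rows.
public class Item
{
    public string ItemId { get; set; }
    public List<AttributeMap> AttributeMaps { get; set; } = new List<AttributeMap>();
}

public static class ItemQueries
{
    // Returns the items that are mapped to at least one of the wanted attribute ids.
    public static List<Item> WithAnyAttribute(IQueryable<Item> items, List<int> wantedAttributeIds)
    {
        var q = from i in items
                where i.AttributeMaps.Any(m => wantedAttributeIds.Contains(m.AttributeId))
                select i;
        return q.ToList();
    }
}
```

Against a real data context you would pass the table itself (something like ctx.Items) instead of an in-memory list. Whether the provider turns this into the SQL you expect is worth checking in a profiler.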

I use this page all the time for figuring out complex linq queries when I know the sql approach I want to use.
VB http://msdn.microsoft.com/en-us/vbasic/bb688085
C# http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx
If you know how to write the sql query to get the data you want then this will show you how to get the same result translating it into linq syntax.

Related

Querying with many (~100) search terms with Entity Framework

I need to do a query on my database that might be something like this where there could realistically be 100 or more search terms.
public IQueryable<Address> GetAddressesWithTown(string[] towns)
{
    IQueryable<Address> addressQuery = DbContext.Addresses;
    // Where returns a new query, so the result must be assigned.
    addressQuery = addressQuery.Where(x => towns.Any(y => x.Town == y));
    return addressQuery;
}
However, when it contains more than about 15 terms it throws an exception on execution because the SQL generated is too long.
Can this kind of query be done through Entity Framework?
What other options are there available to complete a query like this?
Sorry, are we talking about THIS EXACT SQL?
In that case it is a very simple "open your eyes thing".
There is a way (Contains) to map that string array into an IN clause, which results in ONE SQL condition (town in ('','',''))
Let me see whether I get this right:
addressQuery.Where( x => towns.Any( y=> x.Town == y ) );
should be
addressQuery.Where(x => towns.Contains(x.Town));
The resulting SQL will be a LOT smaller. 100 items is still taxing it. I would dare say you may have a DB or app design issue here that requires a business-side analysis; I have not met this requirement in the 20 years I have worked with databases.
This looks like a scenario where you'd want to use the PredicateBuilder as this will help you create an Or based predicate and construct your dynamic lambda expression.
This is part of a library called LinqKit by Joseph Albahari, who created LINQPad.
public IQueryable<Address> GetAddressesWithTown(string[] towns)
{
    var predicate = PredicateBuilder.False<Address>();
    foreach (string town in towns)
    {
        string temp = town;
        predicate = predicate.Or(p => p.Town.Equals(temp));
    }
    return DbContext.Addresses.Where(predicate);
}
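For readers who would rather not take a dependency on LinqKit, the same OR-chaining idea can be sketched with System.Linq.Expressions directly. This is a hand-rolled illustration, not PredicateBuilder itself: the Address class and the TownFilter helper are assumed names, and how a given provider renders the resulting expression tree as SQL is not guaranteed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical entity with only the property the predicate needs.
public class Address
{
    public string Town { get; set; }
}

public static class TownFilter
{
    // Builds p => false || p.Town == t1 || p.Town == t2 || ...
    public static Expression<Func<Address, bool>> BuildTownPredicate(IEnumerable<string> towns)
    {
        ParameterExpression p = Expression.Parameter(typeof(Address), "p");
        MemberExpression town = Expression.Property(p, nameof(Address.Town));

        // Start from "false" so that an empty list matches nothing.
        Expression body = Expression.Constant(false);
        foreach (string t in towns)
        {
            Expression isMatch = Expression.Equal(town, Expression.Constant(t, typeof(string)));
            body = Expression.OrElse(body, isMatch);
        }
        return Expression.Lambda<Func<Address, bool>>(body, p);
    }
}
```

The result can be passed to Where on an IQueryable<Address>, or compiled with Compile() for in-memory use. Note that an OR chain grows with the number of towns just as the original query did, so it does not by itself shorten the generated SQL.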
You've broadly got two options:
You can replace .Any with a .Contains alternative.
You can use plain SQL with table-valued-parameters.
Using .Contains is easier to implement and will help performance because it translates to an inline SQL IN clause, so 100 towns shouldn't be a problem. However, it also means that the exact SQL depends on the exact number of towns: you're forcing SQL Server to recompile the query for each number of towns. These recompilations can be expensive when the query is complex, and they can evict other query plans from the cache as well.
Using table-valued-parameters is the more general solution, but it's more work to implement, particularly because it means you'll need to write the SQL query yourself and cannot rely on the entity framework. (Using ObjectContext.Translate you can still unpack the query results into strongly-typed objects, despite writing sql). Unfortunately, you cannot use the entity framework yet to pass a lot of data to sql server efficiently. The entity framework doesn't support table-valued-parameters, nor temporary tables (it's a commonly requested feature, however).
A bit of TVP SQL would look like this: select ... from ... join @townTableArg townArg on townArg.town = address.town, or: select ... from ... where address.town in (select town from @townTableArg).
You probably can work around the EF restriction, but it's not going to be fast and will probably be tricky. A workaround would be to insert your values into some intermediate table, then join with that - that's still 100 inserts, but those are separate statements. If a future version of EF supports batch CUD statements, this might actually work reasonably.
Almost equivalent to table-valued parameters would be to bulk-insert into a temporary table and join with that in your query. Mostly that just means your table name will start with '#' rather than '@' :-). The temp table has a little more overhead, but you can put indexes on it and in some cases that means the subsequent query will be much faster (for really huge data quantities).
Unfortunately, using either temporary tables or bulk insert from C# is a hassle. The simplest solution here is to make a DataTable; this can be passed to either. However, DataTables are relatively slow; the overhead might be relevant once you start adding millions of rows. The fastest (general) solution is to implement a custom IDataReader; almost as fast is an IEnumerable<SqlDataRecord>.
By the way, to use a table-valued-parameter, the shape ("type") of the table parameter needs to be declared on the server; if you use a temporary table you'll need to create it too.
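As a rough illustration of the DataTable route, the sketch below passes a list of towns as a table-valued parameter through plain ADO.NET. The table type dbo.TownList, the table dbo.Address and its AddressLine column are hypothetical names that would have to exist on your server; only the SqlClient calls themselves are standard.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class TownTvp
{
    // Assumed to have been declared on the server beforehand (hypothetical names):
    //   CREATE TYPE dbo.TownList AS TABLE (Town nvarchar(100) NOT NULL);

    // Builds a one-column DataTable whose shape matches the assumed table type.
    public static DataTable BuildTownTable(IEnumerable<string> towns)
    {
        var table = new DataTable();
        table.Columns.Add("Town", typeof(string));
        foreach (string town in towns)
        {
            table.Rows.Add(town);
        }
        return table;
    }

    // Joins the hypothetical dbo.Address table against the table-valued parameter.
    public static List<string> GetAddressLinesForTowns(string connectionString, IEnumerable<string> towns)
    {
        const string sql =
            "SELECT a.AddressLine FROM dbo.Address a " +
            "JOIN @towns t ON t.Town = a.Town";

        var results = new List<string>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            SqlParameter p = command.Parameters.AddWithValue("@towns", BuildTownTable(towns));
            p.SqlDbType = SqlDbType.Structured; // marks the parameter as table-valued
            p.TypeName = "dbo.TownList";        // must match the type declared on the server

            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    results.Add(reader.GetString(0));
                }
            }
        }
        return results;
    }
}
```

The SQL text stays the same however many towns are passed, which is what avoids the per-count recompilation described above.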
Some pointers to get you started:
http://lennilobel.wordpress.com/2009/07/29/sql-server-2008-table-valued-parameters-and-c-custom-iterators-a-match-made-in-heaven/
SqlBulkCopy from a List<>

Generate IN statement for decimal values with Oracle

I'm having trouble getting LINQ to generate an IN statement. I have a function that looks like the one below, taking as input a list of long representing the primary keys to find. Using the contains methods ends up generating a query of composite OR statements. We're running against an Oracle database. We're using the newest version of LINQ/EF. QueryableDbSet is a DbSet. The PrimaryKey property is a decimal value and is the primary integer identifier for the object in Oracle.
List<long> GetList(List<long> ids)
{
    // Convert to decimal to match the type of the PrimaryKey property.
    List<decimal> dIds = ids.Select(x => (decimal)x).ToList();
    var query = from c in QueryableDbSet
                where dIds.Contains(c.PrimaryKey)
                select (long)c.PrimaryKey;
    return query.ToList();
}
The above code will generate composite OR statements. Does anyone know how to get LINQ to generate the faster IN statements? Thanks.
EDIT: Doing some more research it appears that Oracle actually converts IN statements to OR chains, so maybe this is an exercise in futility. Can anyone confirm or deny this?
Also, I'm aware that the above function is completely pointless/redundant. My real use case is more complicated and involves other conditionals in the where clause.

lightswitch LINQ PreprocessQuery

I use the PreprocessQuery method to extend a query in lightswitch.
Something like this:
query = (from item in query
         where validIDs.Contains(item.tableIDs.myID) &&
               elementCount[item.ID] <= maxEleCount
         select item);
Where validIDs is a HashSet<int> and elementCount is a Dictionary<int, int>.
The first where clause works fine, but the second (elementCount[item.ID] <= maxEleCount) does not.
What I want to do is filter a table by some IDs (validIDs) and also check that, in another table, the number of entries for each of these IDs does not exceed a limit.
Any ideas?
EDIT
I found a solution. Instead of a Dictionary I also used a HashSet for the second where clause. It seems it is not possible to do the Dictionary lookup inside the LINQ statement for some reason (?)
First, although being a bit pedantic, what you're doing in a PreProcessQuery method is "restricting" records in the query, not "extending" the query.
What you put in a LINQ query has to be able to be processed by the Entity Framework data provider (in the case of LS, the SQL Server Data Provider).
Sometimes you'll find that while your LINQ query compiles, it fails at runtime. This is because the data provider is unable to express it to the data store (again in this case SQL Server).
You're normally restricted to "primitive" values, so if you hadn't said that using a Dictionary actually worked, I would have said that it wouldn't.
Any time you have a static (as in non-changing) value, I'd suggest that you create a variable outside of your LINQ query, then use the variable in the LINQ query. By doing this, you're simply passing a value, the data provider doesn't have to try to figure out how to pass it to the data store.
Reading your code again, this might not be what you're doing, but hopefully this explanation will still be helpful.
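One way to apply the "resolve it outside the query" advice to the dictionary case is to reduce the dictionary to a set of acceptable IDs before the query is built, so that the query itself only contains Contains calls. This is a sketch under assumptions: the Element class and its TableId property are hypothetical stand-ins for the LightSwitch entity, and it mirrors the HashSet workaround described in the question's edit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shape; the real LightSwitch entity will differ.
public class Element
{
    public int ID { get; set; }
    public int TableId { get; set; }
}

public static class ElementFilter
{
    public static IQueryable<Element> Restrict(
        IQueryable<Element> query,
        HashSet<int> validIDs,
        Dictionary<int, int> elementCount,
        int maxEleCount)
    {
        // Do the dictionary lookup up front, in memory, outside the query.
        var idsWithinLimit = new HashSet<int>(
            elementCount.Where(kv => kv.Value <= maxEleCount).Select(kv => kv.Key));

        // The query now only uses Contains, which the question reports as working.
        return from item in query
               where validIDs.Contains(item.TableId) && idsWithinLimit.Contains(item.ID)
               select item;
    }
}
```

IDs that are missing from the dictionary are simply excluded here, whereas the original indexer lookup would have thrown for them in memory.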

Linq efficiency - How do I best query a database for a list of values?

Let us say I have a database of Terms and a list of strings. Is this a good (efficient) idea? It works smoothly, but I'm not sure it is scalable or the most efficient approach.
var results =
    from t in Terms
    join x in Targets on t.Term equals x
    select t;
Here Terms is a database table with an indexed column Term. Targets is an IEnumerable of strings. Terms might hold millions of rows, Targets between 10 and 20 strings. Any thoughts?
Ultimately what matters, as far as efficiency is concerned, is if the query that is executed against the database is efficient. To see this, you can either use SQL Profiler or find an application that will show you SQL generated by linq-to-sql.
If you use SQL Profiler, be sure to have it look for stored procedures, as LINQ to SQL uses the sp_executesql procedure to execute queries.
If you need to join two tables on one key, as in your example, there's no other way to express it than an actual join. What you have is as efficient as it CAN get.
However, change the select to return only the fields you're interested in, and make sure you trim them, because sql databases like to return char fields with trailing spaces, and they take time to process and transfer across the network.
Hmm, I didn't know you could join a local collection in like that. Perhaps that's a .Net 4.0 feature?
I have frequently issued queries like this:
IQueryable<Term> query =
    from t in Terms
    where Targets.Contains(t.Term)
    select t;
There are a few caveats.
The variable Targets must be a List<string> reference. It may not be an IList<string> reference.
Each string in the list is translated into a sql parameter. While linq to sql will happily translate many thousands of strings into parameters (I've seen 50k parameters), Sql Server will only accept ~2100. If you exceed this limit, you'll get a sql exception.
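nvarchar vs varchar indexes: the string parameters are sent as nvarchar, so if the indexed column is varchar, the implicit conversion can keep SQL Server from using the index efficiently.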
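If a list might approach the roughly 2100-parameter limit mentioned above, one workaround is to split it into batches and run the Contains query once per batch. The sketch below is a generic helper under assumptions: Batching, Split and QueryInBatches are made-up names, and the batch size is yours to choose.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Splits a sequence into consecutive lists of at most 'size' items.
    public static IEnumerable<List<T>> Split<T>(IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var batch = new List<T>(size);
        foreach (T item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);
            }
        }
        // Emit the final, possibly shorter, batch.
        if (batch.Count > 0) yield return batch;
    }

    // Runs the supplied query once per batch and concatenates the results.
    public static List<TResult> QueryInBatches<TResult>(
        IEnumerable<string> targets,
        int batchSize,
        Func<List<string>, IEnumerable<TResult>> runQuery)
    {
        var results = new List<TResult>();
        foreach (List<string> batch in Split(targets, batchSize))
        {
            results.AddRange(runQuery(batch));
        }
        return results;
    }
}
```

A call might look like Batching.QueryInBatches(targets, 2000, batch => Terms.Where(t => batch.Contains(t.Term))). Each batch is a List<string>, which matches the first caveat. If the target list contains duplicates that land in different batches, the same row can come back more than once.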

Retrieving a tree structure from a database using LINQ

I have an organization chart tree structure stored in a database.
It is something like
ID (int);
Name (String);
ParentID (int)
In C# it is represented by a class like
class Employee
{
    public int ID { get; set; }
    public string Name { get; set; }
    public IList<Employee> Subs { get; set; }
}
I am wondering what the best way is to retrieve these values from the database and fill the C# objects using LINQ (I am using Entity Framework).
There must be something better than making a call to get the top level then making repeated calls to get subs and so on.
How best to do it?
You can build a stored proc that has built in recursion. Take a look at http://msdn.microsoft.com/en-us/library/ms190766.aspx for more info on Common Table Expressions in SQL Server
You might want to find a different (better?) way to model your data. http://www.sqlteam.com/article/more-trees-hierarchies-in-sql lists a popular way of modeling hierarchical data in a database. Changing the modeling can allow you to create queries that can be expressed without recursion.
If you're using SQL Server 2008, you could make use of the new HIERARCHYID feature.
Organizations have struggled in past with the representation of tree like structures in the databases, lot of joins lots of complex logic goes into the place, whether it is organization hierarchy or defining a BOM (Bill of Materials) where one finished product is dependent on another semi finished materials / kit items and these kit items are dependent on another semi finished items or raw materials.
SQL Server 2008 has the solution to the problem where we store the entire hierarchy in the data type HierarchyID. HierarchyID is a variable length system data type. HierarchyID is used to locate the position in the hierarchy of the element like Scott is the CEO and Mark as well as Ravi reports to Scott and Ben and Laura report to Mark, Vijay, James and Frank report to Ravi.
So use the new functions available, and simply return the data you need without using LINQ. The drawback is you'll need to use UDF or stored procedures for anything beyond a simple root query:
SELECT @Manager = CAST('/1/' AS hierarchyid)
SELECT @FirstChild = @Manager.GetDescendant(NULL,NULL)
I'd add a field to the entity to include the parent ID, then I'd pull the whole table into memory, leaving the List subs null. I'd then iterate through the objects and populate the list using LINQ to Objects. Only one DB query, so it should be reasonable.
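The "load everything once, then link in memory" approach could look roughly like the sketch below. It makes some assumptions: EmployeeRow stands for the flat rows as they come back from the single query, EmployeeNode for the tree class, and a null ParentID (or a parent that is not in the result set) marks a root.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat row as returned by the single database query.
public class EmployeeRow
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int? ParentID { get; set; }
}

// Tree node; Subs starts empty rather than null to simplify linking.
public class EmployeeNode
{
    public int ID { get; set; }
    public string Name { get; set; }
    public List<EmployeeNode> Subs { get; set; } = new List<EmployeeNode>();
}

public static class OrgTree
{
    // Builds the tree in memory and returns the root nodes.
    public static List<EmployeeNode> Build(IEnumerable<EmployeeRow> rows)
    {
        List<EmployeeRow> rowList = rows.ToList();

        // One node per row, indexed by ID for constant-time parent lookup.
        Dictionary<int, EmployeeNode> nodes = rowList.ToDictionary(
            r => r.ID,
            r => new EmployeeNode { ID = r.ID, Name = r.Name });

        var roots = new List<EmployeeNode>();
        foreach (EmployeeRow r in rowList)
        {
            EmployeeNode parent;
            if (r.ParentID.HasValue && nodes.TryGetValue(r.ParentID.Value, out parent))
            {
                parent.Subs.Add(nodes[r.ID]);
            }
            else
            {
                // No parent, or the parent is not in the result set: treat as a root.
                roots.Add(nodes[r.ID]);
            }
        }
        return roots;
    }
}
```

With Entity Framework, the rows would come from a single query over the whole table, materialized with ToList() before Build is called. This is only practical while the table fits comfortably in memory.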
An Entity Framework query should allow you to include related entity sets, though in a unary relationship, not sure how it would work...
Check this out for more information on that: http://msdn.microsoft.com/en-us/library/bb896272.aspx
Well... even with LINQ you will need two queries, because any single query will duplicate the main employee and thus will result in multiple employees (that are really the same) being created... However, you can hide this a bit with linq when you create the object, that's when you would execute the second query, something like this:
var v = from u in TblUsers
        select new {
            SupervisorName = u.DisplayName,
            Subs = (from sub in TblUsers where sub.SupervisorID.Value == u.UserID select sub.DisplayName).ToList()
        };
