I'm having trouble getting LINQ to generate an IN statement. I have a function that looks like the one below, taking as input a list of long values representing the primary keys to find. Using the Contains method ends up generating a query of composite OR statements. We're running against an Oracle database, using the newest version of LINQ/EF. QueryableDbSet is a DbSet. The PrimaryKey property is a decimal value and is the primary integer identifier for the object in Oracle.
    List<long> GetList(List<long> ids)
    {
        // Convert up front so the comparison matches the decimal key column.
        List<decimal> dIds = ids.Select(x => (decimal)x).ToList();
        var query = from c in QueryableDbSet
                    where dIds.Contains(c.PrimaryKey)
                    select (long)c.PrimaryKey;
        return query.ToList();
    }
The above code will generate composite OR statements. Does anyone know how to get LINQ to generate the faster IN statements? Thanks.
EDIT: Doing some more research it appears that Oracle actually converts IN statements to OR chains, so maybe this is an exercise in futility. Can anyone confirm or deny this?
Also, I'm aware that the above function is completely pointless/redundant. My real use case is more complicated and involves other conditionals in the where clause.
After hours of trying to figure out why a piece of code was out of sync, I came to the realization that T-SQL ORDER BY differs greatly from LINQ OrderBy for strings. So naturally I had to find a way to make sure a statement from T-SQL returns the same order as LINQ, so I used a uniqueidentifier (GUID) as my primary key; however, that ordering seems to be off as well.
Here is the order using Linq (using the Guid(UniqueID) type for the property)
var listings = validListings.Select(x => x.TempListing)
.OrderBy(x => x.UniqueID)
.ToList();
It seems T-SQL compares the binary value of the uniqueidentifier, while LINQ compares the string value, even though the property is a GUID.
LINQ RESULTS (Printed) (first few)
00460400-a41d-465d-83c5-225f697e7bb5
015bef8d-5fa3-4c03-8d05-bfecf74b36b9
0202b433-4748-4660-97a1-94d119209aa6
03f34eb0-45cd-4586-b7d2-6e337b441c43
05e41d20-be24-4f4f-b098-574744dd84f0
0767e5d5-afba-49ab-a047-c9f509c80d3a
08f87ba1-8aa8-48a6-8f98-c3b4c6511b76
0b4157c4-7bdc-4e98-a00c-9259af754844
0bd194d0-fb66-4a69-9718-2128594ff9b0
0cda256a-7632-47de-b867-a2bc46382881
0d36f81a-ca37-446e-a325-87b46ef5b8d3
0d89fd26-4204-4d0a-b187-73a36536a848
0e345ca9-3d5d-43ed-aa75-fbd356f94535
0e767557-87ea-4c31-9f54-75d354a87d0f
0f62fc97-85b0-4611-b3e5-0c5ae4f12a18
1020d776-9810-4122-a9ef-3c527f21970c
TSQL FIRST FEW
9C5231CE-01DE-4A20-A4C9-001AD0D28512
3D52B47C-B29C-44A8-99F9-00AA660610A8
FDA7B67D-AEDB-4644-96E4-0147A0EEC29D
C8C7B677-76EB-41D3-B11C-020B9047EB00
487FF542-599B-42D4-BCE3-02C5D569E509
BDAA48DB-60AF-4A36-AFDB-02FA706EE87F
2CD9D59C-C2B5-433C-9FD1-0444F0384BB3
D44695A3-6FEF-4842-BFCB-048C110FA178
28FF051C-38A7-424F-B657-0698452DFE36
D9320EC6-64CD-4C26-8C5C-088C04E22AD7
D9F7FDC1-16D6-4C3A-B117-0908A234DF95
7DB09D09-F10B-4F33-9390-09211F9B2958
D970EE98-B575-4E73-BBAC-0981D6DC1682
9B05CDD9-2D85-486B-BC6B-0BA7E44F6021
539D22ED-FF2A-4376-A650-0BFE184C0E26
0F62FC97-85B0-4611-B3E5-0C5AE4F12A18
5D8EF134-0DC2-4B32-9F02-0C65940C1BCF
How can I make them both return the same result?
As you have discovered, C# (System.Guid) and SQL Server use different algorithms to sort GUIDs.
If you want SQL Server to sort GUIDs the way System.Guid does, then convert it to VARCHAR.
If you want to sort Guids in C# the way SQL does, convert the Guid values to System.Data.SqlGuid:
    // SqlGuid lives in the System.Data.SqlTypes namespace.
    var listings = validListings.Select(x => x.TempListing)
        .OrderBy(x => new SqlGuid(x.UniqueID))
        .ToList();
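To make the difference concrete, here is a minimal sketch that sorts three of the GUIDs from the listings above both ways. The class and method names (GuidOrdering, ClrOrder, SqlOrder) are made up for illustration; only System.Guid and System.Data.SqlTypes.SqlGuid come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlTypes;
using System.Linq;

public static class GuidOrdering
{
    // Three GUIDs taken from the listings in the question.
    public static readonly Guid[] Sample =
    {
        Guid.Parse("0F62FC97-85B0-4611-B3E5-0C5AE4F12A18"),
        Guid.Parse("9C5231CE-01DE-4A20-A4C9-001AD0D28512"),
        Guid.Parse("3D52B47C-B29C-44A8-99F9-00AA660610A8"),
    };

    // System.Guid ordering: compares starting from the first group.
    public static List<Guid> ClrOrder(IEnumerable<Guid> ids) =>
        ids.OrderBy(g => g).ToList();

    // SQL Server ordering: SqlGuid gives the last six bytes the most weight.
    public static List<Guid> SqlOrder(IEnumerable<Guid> ids) =>
        ids.OrderBy(g => new SqlGuid(g)).ToList();
}
```

Calling GuidOrdering.ClrOrder(GuidOrdering.Sample) should put the 0F62FC97... value first, while GuidOrdering.SqlOrder(GuidOrdering.Sample) should put 9C5231CE... first, which matches the T-SQL listing in the question.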
That said, GUID is not an appropriate data type for a sortable column - I would switch to an IDENTITY column if possible.
I executed a LINQ query using Entity Framework, like below:
    GroupMaster getGroup = null;
    getGroup = DataContext.Groups.FirstOrDefault(item => keyword.IndexOf(item.Keywords, StringComparison.OrdinalIgnoreCase) >= 0 && item.IsEnabled);
When executing this method I got an exception like the one below:
    LINQ to Entities does not recognize the method 'Int32 IndexOf(System.String, System.StringComparison)' method, and this method cannot be translated into a store expression.
The Contains() method is case sensitive by default, so again I would need to convert to lower case. Is there any method for checking a string match other than Contains, and is there any way to solve the IndexOf issue?
The IndexOf method of the string class is not recognized by Entity Framework. Please replace it with one of the SqlFunctions or canonical functions.
You can also take help from here or maybe here
You can use below code sample:
    // CHARINDEX is 1-based and returns 0 when there is no match, so test for > 0.
    DataContext.Groups.FirstOrDefault(item =>
        System.Data.Objects.SqlClient.SqlFunctions.CharIndex(item.Keywords, keyword).Value > 0 && item.IsEnabled)
You really only have four options here.
1. Change the collation of the database globally. This can be done in several ways; a simple Google search should reveal them.
2. Change the collation of individual tables or columns.
3. Use a stored procedure and specify the COLLATE clause in your query.
4. Perform a query that returns a large set of results, then filter in memory using LINQ to Objects.
Number 4 is not a good option unless your result set is pretty small. Number 3 is good if you can't change the database (but you can't use LINQ with it).
Numbers 1 and 2 are choices you need to make about your data model as a whole, or about specific fields if you only want to change those.
Changing the server's collation:
http://technet.microsoft.com/en-us/library/ms179254.aspx
Changing the Database Collation:
http://technet.microsoft.com/en-us/library/ms179254.aspx
Changing the Columns Collation:
http://technet.microsoft.com/en-us/library/ms190920(v=sql.105).aspx
Using the Collate statement in a stored proc:
http://technet.microsoft.com/en-us/library/ms184391.aspx
Alternatively, you can convert the strings to lower case before comparing:
    var lowerCaseItem = item.ToLower();
This assumes item is a string; it might get you past that exception.
Erik Funkenbush's answer is perfectly valid when you look at this as a database problem. But I get the feeling that you need a better structure for keeping data regarding keywords if you want to traverse them efficiently.
Note that this answer isn't intended to be better, it is intended to fix the problem in your data model rather than making the environment adapt to the current (apparently flawed, since there is an issue) data model you have.
My main suggestion, regardless of time constraint (I realize this isn't the easiest fix) would be to add a separate table for the keywords (with a many-to-many relationship with its related classes).
[GROUPS] * ------- * [KEYWORD]
This should allow you to search for the keyword first, and only then retrieve the items related to that keyword (based on ID rather than a compound string).
    // Project to int? so that "no match" yields null rather than 0 (assumes Id is an int).
    int? keywordID = DataContext.Keywords.Where(x => x.Name == keywordFilter).Select(x => (int?)x.Id).FirstOrDefault();
    if (keywordID != null)
    {
        getGroup = DataContext.Groups.FirstOrDefault(group => group.Keywords.Any(kw => kw.Id == keywordID));
    }
But I can understand completely if this type of fix is not possible anymore in the current project. I wanted to mention it though, in case anyone in the future stumbles on this question and still has the option for improving the data structure.
I need to do a query on my database that might be something like this where there could realistically be 100 or more search terms.
    public IQueryable<Address> GetAddressesWithTown(string[] towns)
    {
        IQueryable<Address> addressQuery = DbContext.Addresses;
        // Where returns a new query, so the result must be assigned.
        addressQuery = addressQuery.Where(x => towns.Any(y => x.Town == y));
        return addressQuery;
    }
However, when it contains more than about 15 terms it throws an exception on execution because the generated SQL is too long.
Can this kind of query be done through Entity Framework?
What other options are there available to complete a query like this?
Sorry, are we talking about THIS EXACT SQL?
In that case it is a very simple "open your eyes thing".
There is a way (Contains) to map that string array into an IN clause, which results in ONE SQL condition (town in ('','',''))
Let me see whether I get this right:
addressQuery.Where( x => towns.Any( y=> x.Town == y ) );
should be
addressQuery.Where( x => towns.Contains( x.Town ) );
The resulting SQL will be a LOT smaller. 100 items is still taxing it. I would dare say you may have a DB or app design issue here, and that requires a business-side analysis; I have not met this requirement in the 20 years I have worked with databases.
This looks like a scenario where you'd want to use the PredicateBuilder as this will help you create an Or based predicate and construct your dynamic lambda expression.
This is part of a library called LinqKit by Joseph Albahari, who created LINQPad.
    public IQueryable<Address> GetAddressesWithTown(string[] towns)
    {
        var predicate = PredicateBuilder.False<Address>();
        foreach (string town in towns)
        {
            string temp = town;
            predicate = predicate.Or(p => p.Town.Equals(temp));
        }
        return DbContext.Addresses.Where(predicate);
    }
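For readers curious what an OR-based predicate of this kind amounts to, here is a rough sketch that builds one by hand with System.Linq.Expressions. It does not use LinqKit and is not how PredicateBuilder is implemented internally; the Address class and the TownFilter.AnyTown helper are hypothetical stand-ins for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the Address entity in the question.
public class Address
{
    public string Town { get; set; }
}

public static class TownFilter
{
    // Builds: a => false || a.Town == towns[0] || a.Town == towns[1] ...
    public static Expression<Func<Address, bool>> AnyTown(IEnumerable<string> towns)
    {
        ParameterExpression a = Expression.Parameter(typeof(Address), "a");
        MemberExpression town = Expression.Property(a, nameof(Address.Town));

        // Start from "false" so an empty list matches nothing.
        Expression body = Expression.Constant(false);
        foreach (string t in towns)
        {
            Expression equalsTown = Expression.Equal(town, Expression.Constant(t, typeof(string)));
            body = Expression.OrElse(body, equalsTown);
        }
        return Expression.Lambda<Func<Address, bool>>(body, a);
    }
}
```

It can be applied to any IQueryable<Address>, for example someAddresses.AsQueryable().Where(TownFilter.AnyTown(towns)). Note that a predicate built this way still grows by one OR term per town, so it does not by itself shorten the generated SQL.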
You've broadly got two options:
You can replace .Any with a .Contains alternative.
You can use plain SQL with table-valued-parameters.
Using .Contains is easier to implement and will help performance because it translates to an inline SQL IN clause, so 100 towns shouldn't be a problem. However, it also means that the exact SQL depends on the exact number of towns: you're forcing SQL Server to recompile the query for each number of towns. These recompilations can be expensive when the query is complex, and they can evict other query plans from the cache as well.
Using table-valued-parameters is the more general solution, but it's more work to implement, particularly because it means you'll need to write the SQL query yourself and cannot rely on the entity framework. (Using ObjectContext.Translate you can still unpack the query results into strongly-typed objects, despite writing sql). Unfortunately, you cannot use the entity framework yet to pass a lot of data to sql server efficiently. The entity framework doesn't support table-valued-parameters, nor temporary tables (it's a commonly requested feature, however).
A bit of TVP SQL would look like this: select ... from ... join @townTableArg townArg on townArg.town = address.town, or: select ... from ... where address.town in (select town from @townTableArg).
You probably can work around the EF restriction, but it's not going to be fast and will probably be tricky. A workaround would be to insert your values into some intermediate table, then join with that - that's still 100 inserts, but those are separate statements. If a future version of EF supports batch CUD statements, this might actually work reasonably.
Almost equivalent to table-valued parameters would be to bulk-insert into a temporary table and join with that in your query. Mostly that just means your table name will start with '#' rather than '@' :-). The temp table has a little more overhead, but you can put indexes on it, and in some cases that means the subsequent query will be much faster (for really huge data quantities).
Unfortunately, using either temporary tables or bulk insert from C# is a hassle. The simplest solution here is to make a DataTable; this can be passed to either. However, DataTables are relatively slow; the overhead might be relevant once you start adding millions of rows. The fastest (general) solution is to implement a custom IDataReader; almost as fast is an IEnumerable<SqlDataRecord>.
By the way, to use a table-valued-parameter, the shape ("type") of the table parameter needs to be declared on the server; if you use a temporary table you'll need to create it too.
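As a starting point for the DataTable route mentioned above, here is a small sketch that only builds the table. The helper name TownTable.Build is made up, and the single "town" column is an assumption based on the SQL fragment earlier in this answer; the column layout must match whatever table type or temp table you declare on the server.

```csharp
using System.Collections.Generic;
using System.Data;

public static class TownTable
{
    // Builds a one-column DataTable holding the towns to search for.
    // Assumption: the server-side table type has a single string column named "town".
    public static DataTable Build(IEnumerable<string> towns)
    {
        var table = new DataTable();
        table.Columns.Add("town", typeof(string));
        foreach (string town in towns)
        {
            table.Rows.Add(town);
        }
        return table;
    }
}
```

The resulting DataTable can then be handed to a table-valued SqlParameter or to SqlBulkCopy; that wiring is left out here because it depends on your connection setup and the declared table type.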
Some pointers to get you started:
http://lennilobel.wordpress.com/2009/07/29/sql-server-2008-table-valued-parameters-and-c-custom-iterators-a-match-made-in-heaven/
SqlBulkCopy from a List<>
I use the PreprocessQuery method to extend a query in LightSwitch.
Something like this:
    query = (from item in query
             where validIDs.Contains(item.tableIDs.myID) &&
                   elementCount[item.ID] <= maxEleCount
             select item);
Here validIDs is a HashSet<int> and elementCount is a Dictionary<int, int>.
The first where condition works fine, but the second one, elementCount[item.ID] <= maxEleCount, does not.
What I want to do is filter a table by some IDs (validIDs) and also check that, in another table, the number of entries for each of these IDs does not exceed a limit.
Any ideas?
EDIT
I found a solution. Instead of a Dictionary I also used a HashSet for the second where clause. It seems it is not possible to do the Dictionary lookup inside the LINQ statement for some reason (?)
First, although being a bit pedantic, what you're doing in a PreProcessQuery method is "restricting" records in the query, not "extending" the query.
What you put in a LINQ query has to be able to be processed by the Entity Framework data provider (in the case of LS, the SQL Server Data Provider).
Sometimes you'll find that while your LINQ query compiles, it fails at runtime. This is because the data provider is unable to express it to the data store (again in this case SQL Server).
You're normally restricted to "primitive" values, so if you hadn't said that using a Dictionary actually worked, I would have said that it wouldn't.
Any time you have a static (as in non-changing) value, I'd suggest that you create a variable outside of your LINQ query, then use the variable in the LINQ query. By doing this, you're simply passing a value, the data provider doesn't have to try to figure out how to pass it to the data store.
Reading your code again, this might not be what you're doing, but hopefully this explanation will still be helpful.
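One way to follow this advice, and roughly what the asker's own fix describes, is to reduce the Dictionary to a plain set of IDs before the query runs. The sketch below is only an illustration; the helper name ElementLimits.IdsWithinLimit is made up, while elementCount and maxEleCount are the names from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ElementLimits
{
    // Returns the IDs whose entry count does not exceed the limit.
    // This runs in memory, before the LINQ query is handed to the data provider.
    public static HashSet<int> IdsWithinLimit(IDictionary<int, int> elementCount, int maxEleCount)
    {
        return new HashSet<int>(
            elementCount.Where(kv => kv.Value <= maxEleCount)
                        .Select(kv => kv.Key));
    }
}
```

The query can then test allowedIDs.Contains(item.ID) next to validIDs.Contains(...), where allowedIDs is the set returned above. One behavioural difference to keep in mind: IDs that are missing from the dictionary are simply excluded by the set, whereas the dictionary indexer would throw for them.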
I have a setup on SQL Server 2008. I've got three tables. One has a string identifier as a primary key. The second table holds indices into an attribute table. The third simply holds foreign keys into both tables- so that the attributes themselves aren't held in the first table but are instead referred to. Apparently this is common in database normalization, although it is still insane because I know that, since the key is a string, it would take a maximum of 1 attribute per 30 first table room entries to yield a space benefit, let alone the time and complexity problems.
How can I write a LINQ to SQL query to only return values from the first table, such that they hold only specific attributes, as defined in the list in the second table? I attempted to use a Join or GroupJoin, but apparently SQL Server 2008 cannot use a Tuple as the return value.
"I attempted to use a Join or GroupJoin, but apparently SQL Server 2008 cannot use a Tuple as the return value".
Instead of Tuples, you can use anonymous types, which LINQ to SQL supports.
For example:
from x in source group x by new {x.Field1, x.Field2}
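A slightly fuller sketch of grouping by an anonymous-type key follows. It runs against an in-memory list rather than LINQ to SQL, and the Row class and GroupingDemo.CountByPair helper are hypothetical; Field1 and Field2 are the placeholder names from the fragment above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type with the two fields used as the composite key.
public class Row
{
    public string Field1 { get; set; }
    public int Field2 { get; set; }
}

public static class GroupingDemo
{
    // Groups by the (Field1, Field2) pair using an anonymous type as the key,
    // then counts the rows in each group.
    public static Dictionary<string, int> CountByPair(IEnumerable<Row> source)
    {
        var grouped = from x in source
                      group x by new { x.Field1, x.Field2 } into g
                      select new { g.Key.Field1, g.Key.Field2, Count = g.Count() };

        // Flatten the anonymous results into something a method can return.
        return grouped.ToDictionary(r => r.Field1 + "/" + r.Field2, r => r.Count);
    }
}
```

Anonymous types compare by the values of their properties, which is why they can serve as a composite key here.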
I'm not quite clear what you're asking for. Some code might help. Are you looking for something like this?
    var q = from i in ctx.Items
            select new
            {
                i.ItemId,
                i.ItemTitle,
                Attributes = from map in i.AttributeMaps
                             select map.Attribute
            };
I use this page all the time for figuring out complex linq queries when I know the sql approach I want to use.
VB http://msdn.microsoft.com/en-us/vbasic/bb688085
C# http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx
If you know how to write the sql query to get the data you want then this will show you how to get the same result translating it into linq syntax.