LINQ to SQL with null values - c#

Can anyone help me figure this out?
The code below works fine and gets inside the if statement:
foreach (var m in msg)
{
if (string.IsNullOrEmpty(m.PhoneNumber))
{
m.PhoneNumber = (from c in db.Customers
where c.CustomerID == m.CustomerID
select c.PhoneNumber).Single();
}
}
However, in the code below PhoneNumber is never set:
foreach (var m in msg.Where(z => (z.PhoneNumber == null || z.PhoneNumber == "")))
{
m.PhoneNumber = (from c in db.Customers
where c.CustomerID == m.CustomerID
select c.PhoneNumber).Single();
}
I'm presuming it's because the top code actually evaluates the expression whereas the code below doesn't. If that is the case, how can you check for null on an unevaluated LINQ query?
EDIT: Just to avoid confusion, here is how msg is populated in both cases:
var msg = from m in db.Messages
where (m.StatusID == (int)MessageStatus.Submitted && m.MessageBoxTypeID == (int)MessageBoxType.Outbox)
select m;

I’m somewhat baffled by this one, but I have a wild guess. If the msg sequence is an IQueryable<T> which translates to an SQL query, then the behavior of the two snippets may vary. Suppose you have:
var msg =
from m in dataContext.MyTable
select m;
Your first snippet would cause the entire msg sequence to be enumerated, thereby issuing an unfiltered SELECT…FROM command to the database and fetching all the rows within your table.
foreach (var m in msg)
On the other hand, your second snippet applies a filter to your sequence before it is enumerated. Thus, the command issued to the database is a SELECT…FROM…WHERE.
foreach (var m in msg.Where(z => (z.PhoneNumber == null || z.PhoneNumber == "")))
There are various cases where the behavior of a filter applied in .NET would differ from its translation to Transact-SQL. Case-sensitivity is one. In your case, I assume that the mismatch is caused by entries whose PhoneNumber consists of whitespace: SQL Server ignores trailing spaces when comparing strings with =, so a value made up only of spaces compares equal to the empty string there, while in .NET it does not.
To test this possibility, check what happens if you change your second snippet to:
foreach (var m in msg.ToList().Where(z => (z.PhoneNumber == null || z.PhoneNumber == "")))
Edit: Your issue might be that your query is being executed again during subsequent access (when you check whether PhoneNumber was set).
If you execute:
foreach (var m in msg.Where(z => (z.PhoneNumber == null || z.PhoneNumber == "")))
{
m.PhoneNumber = …
}
bool stillHasNulls = msg.Any(z => z.PhoneNumber == null || z.PhoneNumber == "");
You will find that stillHasNulls might still evaluate to true, since your assignment to m.PhoneNumber is being lost when you re-evaluate the msg sequence (in the above case, when you execute msg.Any, which issues an EXISTS command to the database).
For your m.PhoneNumber assignments to be preserved, you need to either persist them to the database (if that’s what you want), or else make sure that you’re accessing the same sequence elements each time. One way to do this would be to pre-populate the sequence as a collection, using ToList.
msg = msg.Where(z => (z.PhoneNumber == null || z.PhoneNumber == "")).ToList();
foreach (var m in msg)
{
m.PhoneNumber = …
}
In the above code, the filter still gets issued to the database as a SELECT…FROM…WHERE, but the result is evaluated eagerly, and then stored as a list within msg. Any subsequent queries on msg would be evaluated against the pre-populated in-memory collection (which would contain any new values you assign to its elements).
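The effect described above can be sketched with plain LINQ to Objects. This is only an analogy under stated assumptions: the rows array and the dictionary-based "message" objects are hypothetical stand-ins, and a real DataContext may behave differently because of its own object tracking. The point is only that a deferred query is re-run on each access, while a list is not.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a table: one phone number per row.
string?[] rows = { null, "555-0100", "" };

// Deferred query: every enumeration runs the Select again and builds NEW
// objects, loosely mimicking a query that is re-issued on each access.
IEnumerable<Dictionary<string, string?>> msg =
    rows.Select(p => new Dictionary<string, string?> { ["PhoneNumber"] = p });

foreach (var m in msg.Where(z => string.IsNullOrEmpty(z["PhoneNumber"])))
{
    m["PhoneNumber"] = "555-0199"; // assigned to an object nobody keeps
}

// Re-enumerating rebuilds the objects from the source, so the change is gone.
bool lostWithoutToList = msg.Any(z => string.IsNullOrEmpty(z["PhoneNumber"]));

// Materializing once keeps the same object instances for later access.
var list = msg.ToList();
foreach (var m in list.Where(z => string.IsNullOrEmpty(z["PhoneNumber"])))
{
    m["PhoneNumber"] = "555-0199";
}
bool lostWithToList = list.Any(z => string.IsNullOrEmpty(z["PhoneNumber"]));

Console.WriteLine($"{lostWithoutToList} {lostWithToList}"); // prints "True False"
```

In the deferred case the assignments land on objects that are discarded after the loop. In the materialized case they land on the elements that later code reads.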

Related

EF Count() > 0 but First() throws exception

I have run into a strange problem. When a user comes to any page of my web app,
I check whether the user has permission to access it, and grant a trial period if it is their first visit.
Here is my piece of code:
List<string> temp_workers_id = new List<string>();
...
if (temp_workers_id.Count > 6)
{
System.Data.SqlTypes.SqlDateTime sqlDate = new System.Data.SqlTypes.SqlDateTime(DateTime.Now.Date);
var rusers = dbctx.tblMappings.Where(tm => temp_workers_id.Any(c => c == tm.ModelID));
var permissions = dbctx.UserPermissions
.Where(p => rusers
.Any(ap => ap.UserID == p.UserID)
&& p.DateStart != null
&& p.DateEnd != null
&& p.DateStart <= sqlDate.Value
&& p.DateEnd >= sqlDate.Value);
if (permissions.Count() < 1)
{
permissions = dbctx.UserPermissions
.Where(p => rusers
.Any(ap => ap.UserID == p.UserID)
&& p.DateStart == null
&& p.DateEnd == null);
var used = dbctx.UserPermissions
.Where(p => rusers
.Any(ap => ap.UserID == p.UserID)
&& p.DateStart != null
&& p.DateEnd != null);
if (permissions.Count() > 0 && used.Count() < 1)
{
var p = permissions.First();
using (Models.TTTDbContext tdbctx = new Models.TTTDbContext())
{
var tp = tdbctx.UserPermissions.SingleOrDefault(tup => tup.UserID == p.UserID);
tp.DateStart = DateTime.Now.Date;
tp.DateEnd = DateTime.Now.Date.AddDays(60);
tdbctx.SaveChanges();
}
Here the First() method throws an exception:
Sequence contains no elements
How could that even happen?
EDIT:
I don't think the user opened two browsers and navigated here at the same time, but could this be a concurrency issue?
You claim you only found this in the server logs and didn't encounter it during debugging. That means that between these lines:
if (permissions.Count() > 0)
{
var p = permissions.First();
some other process or thread changed your database, so that the query didn't match any rows anymore.
This happens because permissions holds a lazily evaluated query, meaning that the query is only executed when you iterate it (which both Count() and First() do).
So in the Count(), the query is executed:
SELECT COUNT(*) ... WHERE ...
Which returns, at that moment, one row. Then the data is modified externally, causing the next query (at First()):
SELECT n1, n2, ... WHERE ...
To return zero rows, causing First() to throw.
How to solve that is up to you, and depends entirely on how you want to model this scenario. Note that the second query was actually correct: at that moment, there were no more rows that fulfilled the query criteria. You could materialize the query once:
permissions = query.Where(...).ToList()
But that would mean your logic operates on stale data. The same would happen if you'd use FirstOrDefault():
var permissionToApply = permissions.FirstOrDefault();
if (permissionToApply != null)
{
// rest of your logic
}
So it's basically a lose-lose scenario. There's always the chance that you're operating on stale data, which means that the next code:
tdbctx.UserPermissions.SingleOrDefault(tup => tup.UserID == p.UserID);
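would return null if the row has been removed in the meantime, and the tp.DateStart assignment that follows it would then throw a NullReferenceException. So every time you query the database, you'll have to write the code in such a way that it can handle the records not being present anymore.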
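The race described in this answer can be reproduced in memory. The following is a minimal sketch, assuming a hypothetical List<string> named table in place of the UserPermissions table and a Clear() call in place of the concurrent change. It is LINQ to Objects, not Entity Framework, but the deferred-execution behavior is the same in kind.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the UserPermissions table.
var table = new List<string> { "user-1" };

// Deferred query: nothing runs until Count(), First() etc. enumerate it.
IEnumerable<string> permissions = table.Where(p => p.StartsWith("user"));

bool hadRows = permissions.Count() > 0; // first execution of the query
table.Clear();                          // simulated concurrent change

bool firstThrew = false;
try
{
    permissions.First();                // second execution sees no rows
}
catch (InvalidOperationException)       // "Sequence contains no elements"
{
    firstThrew = true;
}

// A single execution that tolerates an empty result instead of throwing.
string? candidate = permissions.FirstOrDefault();

Console.WriteLine($"{hadRows} {firstThrew} {candidate is null}"); // prints "True True True"
```

Replacing the Count()/First() pair with one FirstOrDefault() call and a null check removes the window between the two executions, although the result can still be stale by the time it is used.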

Proper way to use LINQ for this type of query?

I was originally using a foreach loop and then for each element in the loop, I perform a LINQ query like so:
foreach (MyObject identifier in identifiers.Where(i => i.IsMarkedForDeletion == false))
{
if (this.MyEntities.Identifiers.Where(pi => identifier.Field1 == pi.Field1 && identifier.Field2 == pi.Field2 && identifier.Field3 == pi.Field3).Any())
{
return false;
}
}
return true;
Then I modified it like so:
if (identifiers.Any(i => !i.IsMarkedForDeletion && this.MyEntities.Identifiers.Where(pi => i.Field1 == pi.Field1 && i.Field2 == pi.Field2 && i.Field3 == pi.Field3).Any()))
{
return false;
}
return true;
My question: is this still the wrong way to use LINQ? Basically, I want to eliminate the foreach loop (which it seems I should be able to get rid of) and also make the DB query faster by not performing a separate DB query for each element of the list. Instead, I want to perform one query for all elements. Thanks!
You can change your code in this way, and it will be converted to SQL statement as expected.
To prevent runtime errors during translation, it is better to store the DbSet in an IQueryable variable; identifiers should be an IQueryable too, so you should change your code into something like this (to be honest, ReSharper converted your foreach into this short lambda):
IQueryable<MyObject2> identifiers = MyEntities.Identifiers.Where(i => i.IsMarkedForDeletion == false);
IQueryable<MyObject2> ids = MyEntities.Identifiers.AsQueryable();
return identifiers.All(identifier => !ids.Any(pi => identifier.Field1 == pi.Field1 && identifier.Field2 == pi.Field2 && identifier.Field3 == pi.Field3));
If identifiers is an in-memory collection, you can change the code in this way (assuming the fields are strings):
IQueryable<MyObject2> ids = MyEntities.Identifiers.AsQueryable();
string[] values = identifiers.Where(i => i.IsMarkedForDeletion == false).Select(i => String.Concat(i.Field1, i.Field2, i.Field3)).ToArray();
return !ids.Any(i => values.Contains(i.Field1 + i.Field2 + i.Field3));
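One caveat with matching on concatenated fields: different field combinations can produce the same concatenated string. The sketch below uses made-up field values to show the collision, and shows that a separator character avoids it as long as the separator cannot occur in the data. Whether a given expression with a separator translates to SQL depends on the provider, so treat this as an assumption to verify.

```csharp
using System;

// Two different (Field1, Field2, Field3) triples, hypothetical values.
string plainA = string.Concat("ab", "c", "");
string plainB = string.Concat("a", "bc", "");
bool plainCollides = plainA == plainB;          // both are "abc"

// Same triples joined with a separator that is assumed not to occur in the data.
string sepA = string.Join("|", "ab", "c", "");  // "ab|c|"
string sepB = string.Join("|", "a", "bc", "");  // "a|bc|"
bool separatedCollides = sepA == sepB;

Console.WriteLine($"{plainCollides} {separatedCollides}"); // prints "True False"
```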
Unfortunately your modified version will be executed exactly the same way (i.e. multiple database queries) as in the original foreach approach because EF does not support database query with joins to in memory collection (except for primitive and enumeration type collections), so if you try the most logical way
bool result = this.MyEntities.Identifiers.Any(pi => identifiers.Any(i =>
!i.IsMarkedForDeletion &&
i.Field1 == pi.Field1 && i.Field2 == pi.Field2 && i.Field3 == pi.Field3));
you'll get
NotSupportedException: Unable to create a constant value of type 'YourType'. Only primitive types or enumeration types are supported in this context.
The only way to let EF execute a single database query is to manually build a LINQ query, using Concat to add one subquery per item of the in-memory collection, like this:
IQueryable<Identifier> query = null;
foreach (var item in identifiers.Where(i => !i.IsMarkedForDeletion))
{
var i = item;
var subquery = this.MyEntities.Identifiers.Where(pi =>
pi.Field1 == i.Field1 && pi.Field2 == i.Field2 && pi.Field3 == i.Field3);
query = query != null ? query.Concat(subquery) : subquery;
}
bool result = query != null && query.Any();
See Logging and Intercepting Database Operations for how to monitor what EF sends to the database.
I would use it as follows:
if (identifiers.Where(i => !i.IsMarkedForDeletion &&
this.MyEntities.Identifiers.Field1 == i.Field1 &&
this.MyEntities.Identifiers.Field2 == i.Field2 &&
this.MyEntities.Identifiers.Field3 == i.Field3).Any()))
{
return false;
}
return true;
I hope this helps. Even though it is more to type out, it is more understandable and readable than using multiple 'where' statements.

what can i do to improve performance of this code

I have this code that looks through all Contacts, counts the emails that have been sent to each of them, and returns a contact in a list if they haven't opened/clicked the last X emails.
At the moment the code takes about 10 minutes to run. Is there anything I can do to improve this?
I know I could limit the number of contacts returned, but that's still slow.
var contactList =
(from c in db.Contacts
where c.Accounts_CustomerID == Account.AccountID && !c.Deleted && !c.EmailOptOut
select c).ToList();
foreach (var person in contactList)
{
var SentEmails =
(from c in db.Comms_Emails_EmailsSents where c.ContactID == person.ID select c).OrderBy(
x => x.DateSent).Take(Last).ToList();
if (SentEmails.Count == Last)
{
if (!Clicks)
{
if (SentEmails.Count(x => x.Opens == 0) == Last)
{
ReturnContacts.Add(person);
}
}
else
{
if (SentEmails.Count(x => x.Clicks == 0) == Last)
{
ReturnContacts.Add(person);
}
}
}
}
return ReturnContacts;
Remove the .ToList() calls and work with IQueryable instead. With IQueryable the query is composed first and executed once on the database, which also reduces memory use. ToList() retrieves all entities and stores them in memory, which you don't want.
Run the logic on the db - rewrite a query using joins etc., so that it returns a result set that already contains relevant data.
What you're doing now is performing a db query for each initial query result. That can mean A LOT of queries.
If you offload that to the RDBMS you can always try to optimize it there (by introducing indexes etc.).
EDIT: I rewrote the code in notepad:
// One joined query instead of one email query per contact.
foreach (var record in (from c in db.Contacts
join es in db.Comms_Emails_EmailsSents
on c.ID equals es.ContactID
where c.Accounts_CustomerID == Account.AccountID && !c.Deleted && !c.EmailOptOut
orderby c.ID, es.DateSent descending
select new { opens = es.Opens, clicks = es.Clicks, person = c })
.GroupBy(r => r.person))
{
// The Last most recent emails for this contact.
var mails = record.Take(Last).ToList();
if (mails.Count == Last)
{
if (!Clicks)
{
if (mails.Count(x => x.opens == 0) == Last)
{
ReturnContacts.Add(record.Key);
}
}
else
{
if (mails.Count(x => x.clicks == 0) == Last)
{
ReturnContacts.Add(record.Key);
}
}
}
}
I don't have time at hand to mock up a db and test it. Also, this approach performs a join between contacts and emails, and if you have 100k emails per person, this might be a very bad idea. You could optimize it by using rank function, but I'd say that if performance is still bad, you could start thinking of doing db-side optimizations, as this data structure is - at least to my, non-dba eyes - not perfectly suited for this kind of querying.
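The selection rule itself ("the contact's most recent Last emails exist and none of them was opened") can be written as one grouped query. The sketch below runs against hypothetical in-memory tuples, not against the database, and the names emails, last and inactive are made up for illustration. Whether the same shape translates efficiently to SQL depends on the provider and would need to be checked.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: (ContactId, DateSent, Opens).
var emails = new[]
{
    (ContactId: 1, DateSent: new DateTime(2024, 1, 1), Opens: 1),
    (ContactId: 1, DateSent: new DateTime(2024, 1, 2), Opens: 0),
    (ContactId: 1, DateSent: new DateTime(2024, 1, 3), Opens: 0),
    (ContactId: 2, DateSent: new DateTime(2024, 1, 1), Opens: 0),
    (ContactId: 2, DateSent: new DateTime(2024, 1, 2), Opens: 3),
    (ContactId: 3, DateSent: new DateTime(2024, 1, 1), Opens: 0),
};
int last = 2; // plays the role of Last in the question

var inactive = emails
    .GroupBy(e => e.ContactId)
    .Select(g => new
    {
        ContactId = g.Key,
        // Most recent emails first, then keep only the newest `last` of them.
        Recent = g.OrderByDescending(e => e.DateSent).Take(last).ToList()
    })
    // Keep contacts that have enough emails and opened none of the recent ones.
    .Where(x => x.Recent.Count == last && x.Recent.All(e => e.Opens == 0))
    .Select(x => x.ContactId)
    .ToList();

Console.WriteLine(string.Join(",", inactive)); // prints "1"
```

Note that this sorts by DateSent descending so that Take picks the newest emails.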

The method ‘Skip’ is only supported for sorted input in LINQ to Entities. The method ‘OrderBy’ must be called before the method ‘Skip’

Using Entity Framework 6.0.2 and .NET 4.5.1 in Visual Studio 2013 Update 1 with a DbContext connected to SQL Server:
I have a long chain of filters I am applying to a query based on the caller's desired results. Everything was fine until I needed to add paging. Here's a glimpse:
IQueryable<ProviderWithDistance> results = (from pl in db.ProviderLocations
let distance = pl.Location.Geocode.Distance(_geo)
where pl.Location.Geocode.IsEmpty == false
where distance <= radius * 1609.344
orderby distance
select new ProviderWithDistance() { Provider = pl.Provider, Distance = Math.Round((double)(distance / 1609.344), 1) }).Distinct();
if (gender != null)
{
results = results.Where(p => p.Provider.Gender == (gender.ToUpper() == "M" ? Gender.Male : Gender.Female));
}
if (type != null)
{
int providerType;
if (int.TryParse(type, out providerType))
results = results.Where(p => p.Provider.ProviderType.Id == providerType);
}
if (newpatients != null && newpatients == true)
{
results = results.Where(p => p.Provider.ProviderLocations.Any(pl => pl.AcceptingNewPatients == null || pl.AcceptingNewPatients == AcceptingNewPatients.Yes));
}
if (string.IsNullOrEmpty(specialties) == false)
{
List<int> _ids = specialties.Split(',').Select(int.Parse).ToList();
results = results.Where(p => p.Provider.Specialties.Any(x => _ids.Contains(x.Id)));
}
if (string.IsNullOrEmpty(degrees) == false)
{
List<int> _ids = degrees.Split(',').Select(int.Parse).ToList();
results = results.Where(p => p.Provider.Degrees.Any(x => _ids.Contains(x.Id)));
}
if (string.IsNullOrEmpty(languages) == false)
{
List<int> _ids = languages.Split(',').Select(int.Parse).ToList();
results = results.Where(p => p.Provider.Languages.Any(x => _ids.Contains(x.Id)));
}
if (string.IsNullOrEmpty(keyword) == false)
{
results = results.Where(p =>
(p.Provider.FirstName + " " + p.Provider.LastName).Contains(keyword));
}
Here's the paging I added to the bottom (skip and max are just int parameters):
if (skip > 0)
results = results.Skip(skip);
results = results.Take(max);
return new ProviderWithDistanceDto { Locations = results.AsEnumerable() };
Now for my question(s):
As you can see, I am doing an orderby in the initial LINQ query, so why is it complaining that I need to do an OrderBy before doing a Skip (I thought I was?)...
I was under the assumption that it won't be turned into a SQL query and executed until I enumerate the results, which is why I wait until the last line to return the results AsEnumerable(). Is that the correct approach?
If I have to enumerate the results before doing Skip and Take how will that affect performance? Obviously I'd like to have SQL Server do the heavy lifting and return only the requested results. Or does it not matter (or have I got it wrong)?
I am doing an orderby in the initial LINQ query, so why is it complaining that I need to do an OrderBy before doing a Skip (I thought I was?)
Your result starts off correctly as an ordered query, because you have an orderby clause. However, the operators applied on top of it (the Distinct() at the end of the first statement and every Where added afterwards) return an ordinary IQueryable<ProviderWithDistance>, so the ordering is no longer guaranteed by the time Skip is called, causing the problem that you see down the road. Logically, that's the same thing, but the structure of the query definition in memory implies otherwise.
To fix this, remove the order by in the original query, and add it right before you are ready for the paging, like this:
...
if (string.IsNullOrEmpty(languages) == false)
...
if (string.IsNullOrEmpty(keyword) == false)
...
results = results.OrderBy(r => r.Distance);
As long as ordering is the last operation before Skip and Take, this should fix the runtime problem.
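The operator order can be illustrated with a small in-memory sketch. It uses a hypothetical array of distances and LINQ to Objects, which never raises the Skip error, so it shows only the intended sequence of operators: filter first, order last, then page.

```csharp
using System;
using System.Linq;

// Hypothetical distances standing in for the query results.
int[] distances = { 5, 3, 9, 1, 7, 2 };
int skip = 2, max = 2; // paging parameters, as in the question

var page = distances
    .Where(d => d > 1)   // all filters first
    .OrderBy(d => d)     // ordering applied last, directly before paging
    .Skip(skip)
    .Take(max)
    .ToList();

Console.WriteLine(string.Join(",", page)); // prints "5,7"
```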
I was under the assumption that it won't be turned into a SQL query and executed until I enumerate the results, which is why I wait until the last line to return the results AsEnumerable(). Is that the correct approach?
Yes, that is the correct approach. You want your RDBMS to do as much work as possible, because doing paging in memory defeats the purpose of paging in the first place.
If I have to enumerate the results before doing Skip and Take how will that affect performance?
It would kill the performance, because your system would need to move around a lot more data than it did before you added paging.

Extremely Slow Linq to Excel

I am trying to create an application that will extract some data out of an automatically generated excel file. This can be very easily done with Access but the file is in Excel and the solution must be a one button sort of thing.
For some reason, simply looping through the data without doing any actions is slow. The code below is my attempt at optimizing it from something that was far slower. I have arrived at using LINQ to Excel after a few attempts at this with the Interop classes directly and through different wrappers.
I also have read the answers to a few questions on here and Google. In an attempt to see what is causing the slowness, I have removed all instructions but kept "i++" from the relevant section. It is still very slow. I also tried to optimize it by limiting the number of records retrieved in the where clause in the third line but that didn't work. Your help would be appreciated.
Thank you.
Dictionary<string,double> instructors = new Dictionary<string,double>();
var t = from c in excel.Worksheet("Course_201410_M1")
// where c["COURSE CODE"].ToString().Substring(0,4) == "COSC" || c["COURSE CODE"].ToString().Substring(0,3) == "COEN" || c["COURSE CODE"].ToString().Substring(0,3) == "GEIT" || c["COURSE CODE"].ToString().Substring(0,3) == "ITAP" || c["COURSE CODE"] == "PRPL 0012" || c["COURSE CODE"] == "ASSE 4311" || c["COURSE CODE"] == "GEEN 2312" || c["COURSE CODE"] == "ITLB 1311"
select c;
HashSet<string> uniqueForce = new HashSet<string>();
foreach (var c in t)
{
if(uniqueForce.Add(c["Instructor"]))
instructors.Add(c["Instructor"],0.0);
}
foreach (string name in instructors.Keys)
{
var y = from d in t
where d["Instructor"] == name
select d;
int i = 1;
foreach(var z in y)
{
// This is the really slow part. It takes a couple of minutes to finish.
// The file has fewer than 1000 records.
i++;
}
}
Put the query that forms var t into parentheses and then call ToList() on it:
var t = (from c in excel.Worksheet("Course_201410_M1")
select c).ToList();
Due to LINQ's lazy/deferred execution model, every time you iterate over the query it goes back to the data source again, unless you give it a List to work with.
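The difference can be made visible by counting how often the source is read. The following is a minimal sketch: ReadSheet is a hypothetical stand-in for the worksheet query (it is not part of LinqToExcel), and the reads counter goes up each time the "sheet" is enumerated from the start.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int reads = 0;

// Hypothetical stand-in for the worksheet: each enumeration is one full read.
IEnumerable<string> ReadSheet()
{
    reads++;
    yield return "Ann";
    yield return "Bob";
    yield return "Ann";
}

// Deferred: the source is read again for every query built on top of it.
IEnumerable<string> lazy = ReadSheet();
var names = lazy.Distinct().ToList();        // one read
foreach (var name in names)
{
    _ = lazy.Count(n => n == name);          // one more read per instructor
}
int lazyReads = reads;

// Materialized: the source is read once, later queries hit the list.
reads = 0;
var cached = ReadSheet().ToList();           // one read
var cachedNames = cached.Distinct().ToList();
foreach (var name in cachedNames)
{
    _ = cached.Count(n => n == name);        // no further reads
}
int cachedReads = reads;

Console.WriteLine($"{lazyReads} {cachedReads}"); // prints "3 1"
```

With the deferred query the number of reads grows with the number of instructors, which matches the nested loop in the question.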
