Occasionally, my users hit an issue where the log files show this exception being thrown: Sequence contains no elements.
From searching around, I can see this exception happens when you try to access or aggregate an empty list.
I searched through the code around this exception (too bad no stack trace was logged), and the only "potential" culprit is the lines below (which use First(), Last(), Single(), or an Aggregate). However, I can't understand why, and I'm not able to reproduce it locally. Please advise.
if (data.Any())
return data.OrderByDescending(d => d.UpdatedTime).First().UpdatedTime;
Here, data is a List<MyObject>, and MyObject has a DateTime property called UpdatedTime.
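For reference, a minimal sketch of MyObject as implied by the description and the code below (the real class presumably has more members):

public class MyObject
{
    public DateTime UpdatedTime { get; set; }      // used by Build() and GetRecentUpdates
    public DateTime CreatedDateTime { get; set; }  // used by GetRecentRequests
}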
===== More surrounding code =====
This is where I got the unhandled exception in the log. The GetRecentRequests method (below) has its own try/catch block, so I've ruled it out.
public ActionResult GetUpdatedTime(long lastUpdated) {
var data = dataAccess.GetRecentUpdates(lastUpdated);
var html = htmlBuilder.Build(data);
return Content(html);
}
public List<MyObject> GetRecentUpdates(long lastUpdatedInTicks) {
var list = _cache.GetRecentRequests(_userCache.UserId);
if (list != null) {
var lastUpdated = new DateTime(lastUpdatedInTicks);
list = list.Where(l => l != null && l.UpdatedTime > lastUpdated).ToList();
}
return list ?? new List<MyObject>();
}
public List<MyObject> GetRecentRequests(string userId) {
    List<MyObject> requests = null;
    try {
        // simplified, but the idea stays
        requests = dictionary.Get(userId);
        var commonRequests = dictionary.Get("common");
        if (requests != null) {
            if (commonRequests != null)
                requests = requests.Union(commonRequests).ToList();
        } else {
            requests = commonRequests;
        }
        if (requests != null) {
            requests = requests.OrderByDescending(r => r.CreatedDateTime).ToList();
        }
    }
    catch (Exception ex) {
        // log the exception (handled)
    }
    return requests;
}
public string Build(List<MyObject> data) {
var lastUpdated = DateTime.MinValue;
if (data.Any())
lastUpdated = data.OrderByDescending(d => d.UpdatedTime).First().UpdatedTime;
return String.Format("<tr style=\"display:none\"><td><div Id='MetaInfo' data-lastUpdated='{0}' /></td></tr>", lastUpdated.Ticks);
}
The JavaScript calls GetUpdatedTime every 10s. Usually everything goes fine, but every once in a while this exception is thrown. Once it's thrown, it keeps being thrown every 10s until the user refreshes the page.
Update:
Another version after some investigation: as you've said, your code is running in a multithreading environment, and the data object can be accessed by two or more threads. As it's a reference-type variable, the reference it holds can be modified. So, consider this situation:
The first thread enters the Build method and checks the condition:
if (data.Any())
and data isn't empty at this moment, so it enters the true block. At exactly this time, another thread enters the Build method, but at that moment the data variable is empty, and every reference to it points to an empty List. But the first thread is already inside the true block:
lastUpdated = data.OrderByDescending(d => d.UpdatedTime).First().UpdatedTime;
And it fails with your exception. And now the good news: you can fix it in many ways:
First of all, check the logic that creates data. Maybe it is a static or shared variable, or the object it is populated from is static or shared, and you have a race condition on this resource. You may change the logic of its creation, or wrap it in a synchronization primitive so that only one thread can Build at a time (but this can affect the performance of your program).
Change the logic of GetRecentRequests. I can't say for sure, but I think the situation is something like this: commonRequests is empty all the time, and the dictionary returns data for the first thread but nothing for the second, so the data object gets overridden and is empty. A way to debug it: add a Barrier primitive to your program during a test run, and wait for 10-15 threads to be waiting on the barrier. After that they'll start building your data simultaneously, and the error will happen with high probability (do not insert breakpoints - they'll synchronize your threads). A minimal repro sketch is shown after these suggestions.
Make a local copy of the data object, something like this:
var localData = data.Select(d => d).ToList();
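Here is the minimal repro sketch mentioned above (my illustration, not the original code: it assumes the MyObject shape sketched earlier and uses a plain unsynchronized List<MyObject> in place of the real cache):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

static class RaceRepro
{
    // Stand-in for the shared cached list; List<T> is not thread-safe.
    static readonly List<MyObject> shared = new List<MyObject>();
    const int ThreadCount = 15;
    static readonly Barrier barrier = new Barrier(ThreadCount);

    static void Main()
    {
        var threads = Enumerable.Range(0, ThreadCount).Select(_ => new Thread(Work)).ToList();
        threads.ForEach(t => t.Start());
        threads.ForEach(t => t.Join());
    }

    static void Work()
    {
        barrier.SignalAndWait(); // hold every thread here, then release them all at once
        try
        {
            for (int i = 0; i < 100000; i++)
            {
                shared.Clear();                                             // writer half of the race
                shared.Add(new MyObject { UpdatedTime = DateTime.UtcNow });
                if (shared.Any())                                           // reader half: same pattern as Build()
                    _ = shared.OrderByDescending(d => d.UpdatedTime).First();
            }
        }
        catch (Exception ex)
        {
            // Expect "Sequence contains no elements" sooner or later
            // (other failures are possible too, since List<T> is unsynchronized).
            Console.WriteLine(ex.Message);
        }
    }
}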
Hope this helps.
Your code checks whether some data is available, and after that filters the data by date. As you are using the LINQ extension methods, I think that data is an IEnumerable, not a List; so when you call the Any() method, the sequence is enumerated, and after that you call the First() method, which enumerates it again.
So, if your data is the result of some yield return method over a one-shot source, it is consumed on the first enumeration; the second time around there is no data left, and the sequence is empty.
Consider changing your code to work with data as a List or an Array, or use the FirstOrDefault method to get null when there is no data, like this:
var dataList = data.OrderByDescending(d => d.UpdatedTime).ToList();
if (dataList.Count > 0)
    return dataList[0].UpdatedTime;
or
var firstElement = data.OrderByDescending(d => d.UpdatedTime).FirstOrDefault();
return firstElement != null ? firstElement.UpdatedTime : DateTime.MinValue;
Try the FirstOrDefault() method.
var lastUpdated = DateTime.MinValue;
var first = data.OrderByDescending(d => d.UpdatedTime).FirstOrDefault();
if (first != null)
{
lastUpdated = first.UpdatedTime;
}
I don't know what the MyObject definition looks like, but I am guessing the UpdatedTime field is DateTime?. If so, it happens in the last method, when:
lastUpdated = data.OrderByDescending(d => d.UpdatedTime).First().UpdatedTime;
return String.Format("<tr style=\"display:none\"><td><div Id='MetaInfo' data-lastUpdated='{0}' /></td></tr>", lastUpdated.Ticks);
Where UpdatedTime is null, and lastUpdated.Ticks throws a NullReferenceException.
Your code first checks whether any element of the data collection satisfies a predicate with data.Any(). Since there's no predicate, this call is equivalent to asking whether your data collection has elements or not.
For this reason, I don't think that the line
return data.OrderByDescending(d => d.UpdatedTime).First().UpdatedTime;
is the real problem, because an exception with the message Sequence contains no elements is thrown when some operation is performed on an empty collection. Since there are elements within data (and so data.Any() returns true), the exception must be coming from another line in your code.
You should add more logging to your application, in order to get the full stack trace of the exception and better understand which line is causing the error.
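For example, a minimal sketch using the question's action method (Logger here is a stand-in for whatever logging framework is in use):

public ActionResult GetUpdatedTime(long lastUpdated)
{
    try
    {
        var data = dataAccess.GetRecentUpdates(lastUpdated);
        var html = htmlBuilder.Build(data);
        return Content(html);
    }
    catch (Exception ex)
    {
        // ex.ToString() includes the message, exception type and full stack trace.
        Logger.Error(ex.ToString()); // "Logger" is a placeholder for your logging framework
        throw; // rethrow so behaviour is unchanged, but the trace is now in the log
    }
}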
Related
I am transforming an Excel spreadsheet into a list of "Elements" (this is a domain term). During this transformation, I need to skip the header rows and throw out malformed rows that cannot be transformed.
Now comes the fun part. I need to capture those malformed records so that I can report on them. I constructed a crazy LINQ statement (below). These are extension methods hiding the messy LINQ operations on the types from the OpenXml library.
var elements = sheet
.Rows() <-- BEGIN sheet data transform
.SkipColumnHeaders()
.ToRowLookup()
.ToCellLookup()
.SkipEmptyRows() <-- END sheet data transform
.ToElements(strings) <-- BEGIN domain transform
.RemoveBadRecords(out discard)
.OrderByCompositeKey();
The interesting part starts at ToElements, where I transform the row lookup to my domain object list (details: it's called an ElementRow, which is later transformed into an Element). Bad records are created with just a key (the Excel row index) and are uniquely identifiable vs. a real element.
public static IEnumerable<ElementRow> ToElements(this IEnumerable<KeyValuePair<UInt32Value, Cell[]>> map)
{
return map.Select(pair =>
{
try
{
return ElementRow.FromCells(pair.Key, pair.Value);
}
catch (Exception)
{
return ElementRow.BadRecord(pair.Key);
}
});
}
Then, I want to remove those bad records (it's easier to collect all of them before filtering). That method is RemoveBadRecords, which started like this...
public static IEnumerable<ElementRow> RemoveBadRecords(this IEnumerable<ElementRow> elements)
{
return elements.Where(el => el.FormatId != 0);
}
However, I need to report the discarded elements! And I don't want to muddy my transform extension method with reporting. So, I went to the out parameter (taking into account the difficulties of using an out param in an anonymous block)
public static IEnumerable<ElementRow> RemoveBadRecords(this IEnumerable<ElementRow> elements, out List<ElementRow> discard)
{
var temp = new List<ElementRow>();
var filtered = elements.Where(el =>
{
if (el.FormatId == 0) temp.Add(el);
return el.FormatId != 0;
});
discard = temp;
return filtered;
}
And, lo! I thought I was hardcore and would have this working in one shot...
var discard = new List<ElementRow>();
var elements = data
/* snipped long LINQ statement */
.RemoveBadRecords(out discard)
/* snipped long LINQ statement */
discard.ForEach(el => failures.Add(el));
foreach(var el in elements)
{
/* do more work, maybe add more failures */
}
return new Result(elements, failures);
But, nothing was in my discard list at the time I looped through it! I stepped through the code and realized that I successfully created a fully-streaming LINQ statement.
The temp list was created
The Where filter was assigned (but not yet run)
And the discard list was assigned
Then the streaming thing was returned
When discard was iterated, it contained no elements, because the elements weren't iterated over yet.
Is there a way to fix this problem using the thing I constructed? Do I have to force an iteration of the data before or during the bad record filter? Is there another construction that I've missed?
Some Commentary
Jon mentioned that the assignment was happening. I simply wasn't waiting for it. If I check the contents of discard after the iteration of elements, it is, in fact, full! So, I don't actually have an assignment problem. Unless I take Jon's advice on what's good/bad to have in a LINQ statement.
When the statement was actually iterated, the Where clause ran and temp filled up, but discard was never assigned again!
It doesn't need to be assigned again - the existing list which will have been assigned to discard in the calling code will be populated.
However, I'd strongly recommend against this approach. Using an out parameter here is really against the spirit of LINQ. (If you iterate over your results twice, you'll end up with a list which contains all the bad elements twice. Ick!)
I'd suggest materializing the query before removing the bad records - and then you can run separate queries:
var allElements = sheet
.Rows()
.SkipColumnHeaders()
.ToRowLookup()
.ToCellLookup()
.SkipEmptyRows()
.ToElements(strings)
.ToList();
var goodElements = allElements.Where(el => el.FormatId != 0)
.OrderByCompositeKey();
var badElements = allElements.Where(el => el.FormatId == 0);
By materializing the query in a List<>, you only process each row once in terms of ToRowLookup, ToCellLookup etc. It does mean you need enough memory to keep all the elements in memory at once, of course. There are alternative approaches (such as taking an action on each bad element while filtering it, sketched below) but they're still likely to end up being fairly fragile.
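For illustration, the "take an action on each bad element while filtering" approach could look like this (my sketch, not Jon's code; the onBadRecord callback is an invented parameter):

public static IEnumerable<ElementRow> RemoveBadRecords(
    this IEnumerable<ElementRow> elements, Action<ElementRow> onBadRecord)
{
    foreach (var el in elements)
    {
        if (el.FormatId == 0)
            onBadRecord(el); // report the bad record as it streams past
        else
            yield return el;
    }
}

It shares the fragility Jon describes: the callback only fires when the sequence is actually enumerated, and fires again on every re-enumeration.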
EDIT: Another option as mentioned by Servy is to use ToLookup, which will materialize and group in one go:
var lookup = sheet
.Rows()
.SkipColumnHeaders()
.ToRowLookup()
.ToCellLookup()
.SkipEmptyRows()
.ToElements(strings)
.OrderByCompositeKey()
.ToLookup(el => el.FormatId == 0);
Then you can use:
foreach (var goodElement in lookup[false])
{
...
}
and
foreach (var badElement in lookup[true])
{
...
}
Note that this performs the ordering on all elements, good and bad. An alternative is to remove the ordering from the original query and use:
foreach (var goodElement in lookup[false].OrderByCompositeKey())
{
...
}
I'm not personally wild about grouping by true/false - it feels like a bit of an abuse of what's normally meant to be a key-based lookup - but it would certainly work.
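If the true/false lookup feels like an abuse, one alternative (my sketch, assuming the materialized allElements list from earlier) is a single explicit pass that fills two lists:

var goodElements = new List<ElementRow>();
var badElements = new List<ElementRow>();
foreach (var el in allElements)
{
    // Route each element to the matching bucket in one pass.
    (el.FormatId == 0 ? badElements : goodElements).Add(el);
}
var ordered = goodElements.OrderByCompositeKey(); // order only the good ones, as before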
I have a method which instantiates an object; some of the properties of this object are arrays, where I use LINQ to fetch the data.
private static GrowthYieldStructure CreateGrowthYieldStructure(int timberType, IEnumerable<Tree> trees)
{
var trees1 = trees;
return new GrowthYieldStructure
{
TimberType = timberType,
CurrentDbhList = trees1.Select(x => x.DBH).ToArray(),
CurrentSpeciesList = trees1.Select(x => x.SpeciesNumber).ToArray(),
CurrentTpaList = trees1.Select(x => x.TPA).ToArray(),
CurrentTreeListLength = trees1.Count()
};
}
The first time I call this method it works fine.
The second time, it fails with no exception on the second Select statement - no matter which value it's attempting to select.
For instance, trees1.Select(x => x.DBH).ToArray() works, then trees1.Select(x => x.SpeciesNumber).ToArray() crashes.
(I've tried switching the fetching order, making local copies of variables, and checking that values exist - they do, nothing out of the ordinary - and using try/catch (no exception caught).)
Edit:
I made more local variables to store the IEnumerable; it still fails.
If I only have one Select statement, it runs fine...
--
Edit2: (calling code - could be off, I'm going from memory)
stands, plots, trees are all IEnumerable<T> (T being Stand, Plot, Tree)
foreach (var plot in plots.Where(x => x.StandID.Equals(stands.ID))) {
    var plot1 = plot;
    var treeList = trees.Where(x => x.PlotID.Equals(plot1.ID));
    var growthYieldStructure = CreateGrowthYieldStructure(stands.TimberType, treeList);
}
Edit3:
Finally saw this error:
A first chance exception of type 'System.AccessViolationException' occurred in Unknown Module.
Then I finally realized my error:
It was actually the code running after the object creation.
I was passing the arrays to an external library; since arrays are reference types, this worked out the way I expected.
But I was not copying the arrays - I was creating a new local variable holding the same memory reference.
This caused the next object initialization to fail, since it wanted to write to the same memory location.
I just changed the object to use IEnumerable, so that I hold the array reference only once.
Sorry for the confusion.
Any thoughts as to why it is crashing?
It might be due to the IEnumerable actually being an IQueryable that is enumerated more than once.
Try changing the line:
var trees1 = trees;
to
var trees1 = trees.ToList();
That will force the enumeration, and trees1 will be a List instead of a possibly deferred IQueryable.
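Applied to the method in the question, the fix might look like this (same body, with the snapshot taken once up front):

private static GrowthYieldStructure CreateGrowthYieldStructure(int timberType, IEnumerable<Tree> trees)
{
    var trees1 = trees.ToList(); // enumerate the source exactly once

    return new GrowthYieldStructure
    {
        TimberType = timberType,
        CurrentDbhList = trees1.Select(x => x.DBH).ToArray(),
        CurrentSpeciesList = trees1.Select(x => x.SpeciesNumber).ToArray(),
        CurrentTpaList = trees1.Select(x => x.TPA).ToArray(),
        CurrentTreeListLength = trees1.Count // List<T>.Count is a property; no re-enumeration
    };
}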
public readonly IEnumerable<string> PeriodToSelect = new string[] { "MONTH" };
var dataCollection = from p in somedata
from h in p.somemoredate
where h.Year > (DateTime.Now.Year - 2)
where PeriodToSelect.Contains(h.TimePeriod)
select new
{
p.Currency,
h.Year.Month, h.Value
};
Can someone tell me why an exception is thrown at the following line of code?
int count = dataCollection.Count();
This is the exception:
System.NullReferenceException: Object reference not set to an instance of an object.
at System.Linq.Enumerable.<SelectManyIterator>d__31`3.MoveNext()
at System.Linq.Enumerable.<SelectManyIterator>d__31`3.MoveNext()
at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
at ...
This looks like a normal null reference exception in LINQ to Objects, thrown while it executes your predicates or projections.
The cases where you'd get a null reference exception that I can think of are: some elements of the somedata collection are null, h.Year is null (what type is that?), or p.somemoredate is null.
Deferred execution strikes again!
(First off, my first guess is that this is caused by p.somemoredate being null somewhere in your collection.)
Given your example, there's no way for us to really know, since you've simplified away the bits that are being queried. Taking it at face value, I would say that whatever somedata and somemoredate are, those are the things you need to look at.
To figure this out (when I get really desperate), I split the query into parts and watch where exceptions get thrown. Notice the .ToArray() calls, which basically "stop" deferred execution from happening temporarily:
var sd = somedata.ToArray();
var x = (from p in sd from h in p.somemoredate.ToArray() select h).ToArray(); //My guess is that you'll get your exception here.
Broken up like this, it's a lot easier to see where the exception gets thrown, and where to look for problems.
The exception is thrown at the Count() statement because LINQ uses deferred execution: the actual LINQ query is not executed until you call .Count(), .ToList(), etc.
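A tiny self-contained illustration of that deferral (my example, not from the question; assumes using System.Linq is in scope):

var source = new List<int> { 1, 2, 3 };
var query = source.Where(n => n > 1); // nothing executes here

source.Add(4); // the query will observe this element too

int count = query.Count(); // the query runs now: count == 3 (elements 2, 3, 4)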
I ran into the same issue. It is annoying that MS did not provide built-in null detection and handling for the aggregate functions. The other issue is that I wanted to ensure I got a 0 (or $0) result for null/empty query results, as I was working on dashboards/reporting. All of those tables will have data eventually, but early on you get a lot of null returns. After reading multiple postings on the subject, I came up with this:
Retrieve the fields you want to return or later apply aggregate functions to first.
Test for a null return. Return 0 if a null is detected.
If you do get actual data returned then you can safely utilize/apply the Count or Sum aggregate Linq functions.
public ActionResult YourTestMethod()
{
var linqResults = (from e in db.YourTable
select e.FieldYouWantToCount);
if (linqResults != null)
{
return Json(linqResults.ToList().Count(), JsonRequestBehavior.AllowGet);
}
else
{
return Json(0, JsonRequestBehavior.AllowGet);
}
}
Sum Example Below
public ActionResult YourTestMethod()
{
var linqResults = (from e in db.YourTable
select e.Total);
if (linqResults != null)
{
return Json(linqResults.ToList().Sum(), JsonRequestBehavior.AllowGet);
}
else
{
return Json(0, JsonRequestBehavior.AllowGet);
}
}
I have an extension method called ToListIfNotNullOrEmpty(), which is hitting the DB twice, instead of once. The first time it returns one result, the second time it returns all the correct results.
I'm pretty sure the first database hit happens when the .Any() method is called.
Here's the code.
public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
if (value.IsNullOrEmpty())
{
return null;
}
if (value is IList<T>)
{
return (value as IList<T>);
}
return new List<T>(value);
}
public static bool IsNullOrEmpty<T>(this IEnumerable<T> value)
{
if (value != null)
{
return !value.Any();
}
return true;
}
I'm hoping to refactor it so that, before the .Any() method is called, it actually enumerates through the entire list.
If I do the following, only one DB call is made, because the list is already enumerated.
var pewPew = (from x in whatever
select x)
.ToList() // This enumerates.
.ToListIfNotNullOrEmpty(); // This checks the enumerated result.
I'd sorta rather not call ToList() and then my extension method.
Any ideas, folks?
I confess that I see little point in this method. Surely, if you simply do a ToList(), checking whether the list is empty suffices as well. It's arguably harder to handle a null result when you expect a list, because then you always have to check for null before you iterate over it.
I think that:
var query = (from ...).ToList();
if (query.Count == 0) {
...
}
works as well and is less burdensome than
var query = (from ...).ToListIfNotNullOrEmpty();
if (query == null) {
...
}
and you don't have to implement (and maintain) any code.
How about something like this?
public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
if (value == null)
return null;
var list = value.ToList();
return (list.Count > 0) ? list : null;
}
To actually answer your question:
This method hits the database twice because the extension methods provided by the System.Linq.Enumerable class exhibit what is called deferred execution. Essentially, this is to eliminate the need for constructing a string of temporarily cached collections for every part of a query. To understand this, consider the following example:
var firstMaleTom = people
.Where(p => p.Gender == Gender.Male)
.Where(p => p.FirstName == "Tom")
.FirstOrDefault();
Without deferred execution, the above code might require that the entire people collection be enumerated, populating a temporary buffer array with all the individuals whose Gender is Male. Then it would need to be enumerated again, populating another buffer array with all of the individuals from the first buffer whose first name is Tom. After all that work, the last part would return the first item from the resulting array.
That's a lot of pointless work. The idea with deferred execution is that the above code really just sets up the firstMaleTom variable with the information it needs to return what's being requested with the minimal amount of work.
Now, there's a flip side to this: in the case of querying a database, deferred execution means that the database gets queried when the return value is evaluated. So, in your IsNullOrEmpty method, when you call Any, the value parameter is actually being evaluated right then and there -- hence a database query. After this, in your ToListIfNotNullOrEmpty method, the line return new List<T>(value) also evaluates the value parameter -- because it's enumerating over the values and adding them to the newly created List<T>.
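To see the double evaluation concretely, here's a small counter illustration (my example, written inside a method with System.Linq in scope; a database-backed sequence behaves analogously, issuing one query per enumeration):

int enumerations = 0;

IEnumerable<int> Numbers()
{
    enumerations++; // runs once per enumeration, on the first MoveNext()
    yield return 1;
    yield return 2;
}

var seq = Numbers();
bool any = seq.Any();          // first enumeration:  enumerations == 1
var list = new List<int>(seq); // second enumeration: enumerations == 2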
You could stick the .ToList() call inside the extension method. The effect is slightly different, but does this still work in the cases you have?
public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
if(value == null)
{
return null;
}
var result = value.ToList();
return result.IsNullOrEmpty() ? null : result;
}
I built a wrapper around NpgSQL for a bunch of the methods I usually use in my projects' DAL. Two of them I usually use to fill DTOs straight from a DataReader. Usually, in a fill helper method, I'll instantiate the DTO and iterate through its properties, mapping the DataReader's data to the corresponding property. The fill method is generated most of the time.
Since I allow many of the properties to be null or to use the DTO's default values, I've used a method to check whether the dataReader's data is valid for the property before filling in the property. So I'll have IsValidString("fieldname") and DRGetString("fieldname") methods, like so:
public bool IsValidString(string fieldName)
{
    int ordinal = data.GetOrdinal(fieldName);
    return ordinal != -1 && !data.IsDBNull(ordinal);
}
public string DRGetString(string fieldName)
{
return data.GetString(data.GetOrdinal(fieldName));
}
My fill method is delegated to whatever method executed the query, and looks like:
public static object FillObject(DataParse<PostgreSQLDBDataParse> dataParser)
{
TipoFase obj = new TipoFase();
if (dataParser.IsValidInt32("T_TipoFase"))
obj.T_TipoFase = dataParser.DRGetInt32("T_TipoFase");
if (dataParser.IsValidString("NM_TipoFase"))
obj.NM_TipoFase = dataParser.DRGetString("NM_TipoFase");
//...rest of the properties .. this is usually autogenerated by a T4 template
return obj;
}
This was working fine and dandy in NpgSQL pre-2.02: when the GetOrdinal method was called and the field didn't exist in the dataReader, I'd simply get -1 returned. Easy to return false from IsValidString() and simply skip to the next property. The performance hit from checking nonexistent fields was practically negligible.
Unfortunately, changes to NpgSQL make GetOrdinal throw an exception when the field doesn't exist. I have a simple workaround in which I wrap the code in a try/catch and return false within the catch, but I can feel the hit in performance, especially when I go into debug mode: filling in a long list takes minutes.
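For reference, that try/catch workaround looks something like this (a sketch; ADO.NET's GetOrdinal conventionally throws IndexOutOfRangeException for a missing field, but check what your NpgSQL version actually throws):

public bool IsValidString(string fieldName)
{
    try
    {
        int ordinal = data.GetOrdinal(fieldName); // throws if the field doesn't exist
        return !data.IsDBNull(ordinal);
    }
    catch (IndexOutOfRangeException)
    {
        return false; // field not present in this reader: skip the property
    }
}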
Supposedly, NpgSQL has a parameter that can be added to the connection string (Compatability) to support backward compatibility for this method, but I've never gotten that to work correctly (I always get an exception because of a malformed connection string). Anyway, I'm looking for suggestions for better workarounds: any better way to fill the object from the dataReader, or even somehow work around the exception problem?
I have created a solution to my problem that doesn't require great changes and seems to perform well. It might just turn into a new parsing library/wrapper.
Basically, I iterate through the dataReader's fields and copy each one to a collection (in my case, a List). Then I check for valid data, and if the data is considered valid, I copy it to the object's property.
So I'll have:
public class ParserFields
{
public string FieldName { get; set; }
public Type FieldType { get; set; }
public object Data { get; set; }
}
and I'll fill the object using:
public static object FillObjectHashed(DataParse<PostgreSQLDBDataParse> dataParser)
{
// Get the field list, with each field's type and data
List<ParserFields> pflist = dataParser.GetReaderFieldList();
//create resulting object instance
CandidatoExtendido obj = new CandidatoExtendido();
//check for existing field and valid data and create object
ParserFields pfdt = pflist.Find(objt => objt.FieldName == "NS_Candidato");
if (pfdt != null && pfdt.FieldType == typeof(int) && pfdt.Data.ToString() != String.Empty)
obj.NS_Candidato = (int)pfdt.Data;
pfdt = pflist.Find(objt => objt.FieldName == "NM_Candidato");
if (pfdt != null && pfdt.FieldType == typeof(string) && pfdt.Data.ToString() != String.Empty)
obj.NM_Candidato = (string)pfdt.Data;
pfdt = pflist.Find(objt => objt.FieldName == "Z_Nasc");
if (pfdt != null && pfdt.FieldType == typeof(DateTime) && pfdt.Data.ToString() != String.Empty)
obj.Z_Nasc = (DateTime)pfdt.Data;
//...
return obj;
}
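The GetReaderFieldList call isn't shown above; a plausible implementation (my sketch, assuming data is the open IDataReader) would be:

public List<ParserFields> GetReaderFieldList()
{
    var fields = new List<ParserFields>(data.FieldCount);
    for (int i = 0; i < data.FieldCount; i++)
    {
        fields.Add(new ParserFields
        {
            FieldName = data.GetName(i),         // column name as returned by the query
            FieldType = data.GetFieldType(i),    // CLR type of the column
            Data = data.IsDBNull(i) ? null : data.GetValue(i)
        });
    }
    return fields;
}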
I timed my variations to see the differences. I did a search that returned 612 results. First I queried the database twice, to take into account the first run of the query and the subsequent differences related to caching (which were quite significant). My FillObject method simply created a new instance of the desired object to be added to the results list.
1st query, to a list of object instances: 2896K ticks
2nd query (same as the first): 1141K ticks
Then I tried the previous fill methods:
To a list of the desired objects, filled with returned data or defaults, checking all of the object's properties: 3323K ticks
To a list of the desired objects, checking only the object's properties returned by the search: 1127K ticks
To a list of the desired objects, using the lookup list, checking only the returned fields: 1097K ticks
To a list of the desired objects, using the lookup list, checking all of the fields (minus a few nested properties): 1107K ticks
The original code I was using consumed nearly 3 times more ticks than a method limited to the desired fields; the exceptions were killing it.
With the new code for the FillObject method, the overhead of checking nonexistent fields was minimal compared to checking only the desired fields.
This seems to work nicely, for now at least. I might try looking for a couple of optimizations.
Any suggestions will be appreciated!