Is returning a null generic list preferred when no data is found? - c#

I have client code that is virtually identical in four different methods (the difference being the particular Web API RESTful method being called and the corresponding manipulated generic list).
In three of the four cases, I can break out of the while loop (see How can I safely loop until there is nothing more to do without using a "placeholder" while condition?) like so:
if (arr.Count <= 0) break;
...but in one case, that causes an NRE once there is no more data returned from the RESTful method. In that method, I have to use:
if (null == arr) break;
I now know why, thus this:
UPDATE
The reason for the different behavior is that the Repository code differs. Therefore, I am changing the question from "Why would checking JArray.Count work in most instances, but cause an NRE in one specific case?" to the one in the title.
Here is how it's done in the three methods where checking for array count works:
public IEnumerable<Subdepartment> Get(int ID, int CountToFetch)
{
    return subdepartments.Where(i => i.Id > ID).Take(CountToFetch);
}
...and here's the "alternate version" contained in RedemptionRepository:
public IEnumerable<Redemption> Get(int ID, int CountToFetch)
{
IEnumerable<Redemption> redempts = null;
if (redemptions.Where(i => i.Id > ID).Take(CountToFetch).Count() > 0)
{
redempts = redemptions.Where(i => i.Id > ID).Take(CountToFetch);
}
return redempts;
}
So, to be consistent across all four methods, I can either make all the other Repository methods like the above (returning null when no data is found) and change the test condition in the client to a null check, OR I can revert the Redemption repository code to what it was formerly, like the others.
So the question: Which is the preferred method (no pun intended)?

You should definitely change the last method to match the previous ones:
public IEnumerable<Redemption> Get(int ID, int CountToFetch)
{
    return redemptions.Where(i => i.Id > ID).Take(CountToFetch);
}
And NullReferenceException is not the only reason. Because LINQ is lazy and execution is deferred, the other approach executes the query twice: once to get Count() and a second time to get the actual collection of results. If you really want to return null instead of an empty collection, you should use the following:
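public IEnumerable<Redemption> Get(int ID, int CountToFetch)
{
    // Materialize once so the query is not executed a second time.
    var redempts = redemptions.Where(i => i.Id > ID).Take(CountToFetch).ToList();
    // Check the materialized result, not the source collection.
    if (redempts.Any())
    {
        return redempts;
    }
    return null;
}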
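The double execution is easy to observe in memory. The following is only a sketch, not the repository code: CountingDemo, Source and the predicateCalls counter are made-up names for illustration. It counts how often the Where predicate runs when a lazy query is first counted and then enumerated, compared with materializing it once via ToList().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountingDemo
{
    // Hypothetical stand-in for the repository's backing collection.
    private static readonly int[] Source = { 1, 2, 3, 4, 5 };

    // Counts predicate calls when the lazy query is evaluated twice.
    public static int CountThenEnumerate()
    {
        int predicateCalls = 0;
        IEnumerable<int> query = Source.Where(i => { predicateCalls++; return i > 2; });

        if (query.Count() > 0)               // first pass over Source
        {
            List<int> items = query.ToList(); // second pass over Source
        }
        return predicateCalls;
    }

    // Counts predicate calls when the query is materialized once.
    public static int MaterializeOnce()
    {
        int predicateCalls = 0;
        List<int> items = Source.Where(i => { predicateCalls++; return i > 2; }).ToList();
        bool hasItems = items.Count > 0;     // inspects the list, no re-query
        return predicateCalls;
    }

    public static void Main()
    {
        Console.WriteLine(CountThenEnumerate()); // 10: every element is tested twice
        Console.WriteLine(MaterializeOnce());    // 5: every element is tested once
    }
}
```

Against a database the extra pass would presumably be a second round trip rather than a few extra predicate calls, which is why materializing once matters more there.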

You should return an empty collection.
All LINQ methods (that I am aware of) that have an IEnumerable-based return type return empty collections rather than null. Returning null from a method prevents you from chaining method calls, since you now need to check for null to avoid a NullReferenceException (as you've discovered).
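As a rough illustration of why the empty collection is easier on the caller, here is a minimal sketch. The Ids list, the Get signature on plain ints and the FetchAll loop are assumptions made for the example, not the actual repository or client code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptyResultDemo
{
    // Hypothetical data standing in for the repository's list.
    private static readonly List<int> Ids = new List<int> { 1, 2, 3, 4, 5 };

    // Returns up to countToFetch ids greater than lastId; never null.
    public static IEnumerable<int> Get(int lastId, int countToFetch)
    {
        return Ids.Where(i => i > lastId).Take(countToFetch);
    }

    // Client-side loop: fetch batches until an empty one comes back.
    public static int FetchAll(int batchSize)
    {
        int lastId = 0, total = 0;
        while (true)
        {
            List<int> batch = Get(lastId, batchSize).ToList();
            if (batch.Count == 0) break;   // the same test works for every repository
            total += batch.Count;
            lastId = batch.Max();
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(FetchAll(2));                             // 5
        Console.WriteLine(Get(99, 10).Select(i => i * 2).Count());  // 0, chaining works without a null check
    }
}
```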


How to get an overloaded == operator to work with LINQ and EF Core?

So basically, I have a project which uses EF Core. To shorten my lambdas when checking whether two objects (class Protocol) are equal, I've overridden my Equals method and overloaded the == and != operators. However, LINQ doesn't seem to care about it and still uses reference equality. Thanks
As I've said before, I've overridden the Equals method and overloaded the == and != operators. With no luck. I've also tried implementing the IEquatable interface. Also no luck.
I am using:
EF Core 2.2.4
//the protocol class
[Key]
public int ProtocolId {get;set;}
public string Repnr {get;set;}
public string Service {get;set;}
public override bool Equals(object obj)
{
if (obj is Protocol other)
{
return this.Equals(other);
}
return false;
}
public override int GetHashCode()
{
return $"{Repnr}-{Service}".GetHashCode();
}
public bool Equals(Protocol other)
{
return this?.Repnr == other?.Repnr && this?.Service == other?.Service;
}
public static bool operator ==(Protocol lhs, Protocol rhs)
{
return lhs.Equals(rhs);
}
public static bool operator !=(Protocol lhs, Protocol rhs)
{
return !lhs.Equals(rhs);
}
//the problem
using (var db = new DbContext())
{
var item1 = new Protocol() { Repnr = "1666", Service = "180" };
db.Protocols.Add(item1 );
db.SaveChanges();
var item2 = new Protocol() { Repnr = "1666", Service = "180" };
var result1 = db.Protocols.FirstOrDefault(a => a == item2);
var result2 = db.Protocols.FirstOrDefault(a => a.Equals(item2));
//both result1 and result2 are null
}
I would expect both result1 and result2 to be item1. However, they're both null. I know I could just do a.Repnr == b.Repnr && a.Service == b.Service, but that just isn't as clean. Thanks
To understand why the incorrect equality comparer is used, you have to be aware of the difference between IEnumerable<...> and IQueryable<...>.
IEnumerable
An object that implements IEnumerable<...>, is an object that represents a sequence of similar objects. It holds everything to fetch the first item of the sequence, and once you've got an item of the sequence you can get the next item, as long as there is a next item.
You start enumerating either explicitly, by calling GetEnumerator() and then repeatedly calling MoveNext(), or, more commonly, implicitly by using foreach or LINQ terminating methods like ToList(), ToDictionary(), FirstOrDefault(), Count() or Any(). This group of LINQ methods internally uses either foreach, or GetEnumerator() and MoveNext() / Current.
IQueryable
An object that implements IQueryable<...> also represents an enumerable sequence. The difference however, is that this sequence usually is not held by your process, but by a different process, like a database management system.
The IQueryable does not (necessarily) hold everything to enumerate. Instead it holds an Expression and a Provider. The Expression is a generic description about what must be queried. The Provider knows which process will execute the query (usually a database management system) and how to communicate with this process (usually something SQL-like).
An IQueryable<..> also implements IEnumerable<..>, so you can start enumerating the sequence as if it was a standard IEnumerable. Once you start enumerating an IQueryable<...> by calling (internally) GetEnumerator(), the Expression is sent to the Provider, who translates the Expression into SQL and executes the query. The result is presented as an enumerator, which can be enumerated using MoveNext() / Current.
This means, that if you want to enumerate an IQueryable<...>, the Expression must be translated into a language that the Provider supports. As the compiler does not really know who will execute the query, the compiler can't complain if your Expression holds methods or classes that your Provider doesn't know how to translate to SQL. In such cases you'll get a run-time error.
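The Expression and Provider can be inspected directly. The sketch below uses an in-memory array wrapped with AsQueryable() instead of a real database, so the provider here runs the expression locally rather than translating it to SQL. QueryableDemo and BuildQuery are names made up for the example.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class QueryableDemo
{
    public static IQueryable<int> BuildQuery()
    {
        // Nothing is executed here: the Where call is only recorded
        // in the query's expression tree.
        return new[] { 1, 2, 3, 4 }.AsQueryable().Where(n => n > 2);
    }

    public static void Main()
    {
        IQueryable<int> query = BuildQuery();

        // The description of what must be queried.
        Console.WriteLine(query.Expression);

        // The object that knows how to execute that description.
        Console.WriteLine(query.Provider.GetType().Name);

        // Enumerating (here via Count) hands the expression to the provider.
        Console.WriteLine(query.Count()); // 2
    }
}
```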
It is easy to see, that SQL does not know your own defined Equals method. In fact, there are even several standard LINQ functions that are not supported. See Supported and Unsupported LINQ Methods (LINQ to Entities).
So what should I do if I want to use an unsupported function?
One of the things that you could do is move the data to your local process, and then call the unsupported function.
This can be done using ToList, but if you will only use one or a few of the fetched items, this would be a waste of processing power.
One of the slower parts of a database query is the transport of the selected data to your local process. Hence it is wise to limit the data to the data that you actually plan to use.
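A smarter solution would be to use AsEnumerable. Everything after it runs in your local process, and the selected data is only pulled from the database as you enumerate it, "per page" so to speak (how much is fetched at a time depends on the provider). It will fetch the first page, and once you've enumerated through the fetched page (using MoveNext), it will fetch the next page.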
So if you only use a few of the fetched items, you will have fetched some items that are not used, but at least you won't have fetched all of them.
Example
Suppose you have a local function that takes a Student as input and returns a Boolean
bool HasSpecialAbility(Student student);
Requirement: give me three Students that live in New York City that have the special Ability.
Alas, HasSpecialAbility is a local function, so it can't be translated into SQL. You'll have to get the Students to your local process before calling it.
var result = dbContext.Students
    // limit the transported data as much as you can:
    .Where(student => student.CityCode == "NYC")
    // transport to local process per page:
    .AsEnumerable()
    // now you can call HasSpecialAbility:
    .Where(student => HasSpecialAbility(student))
    .Take(3)
    .ToList();
Ok, you might have fetched a page of 100 Students while you only needed 3, but at least you haven't fetched all 25000 students.
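For the Protocol case in the question, another option is to keep the comparison on the database side by expressing it as an expression tree over the two properties, so the lambda at the call site stays short. This is only a sketch: SameKeyAs and ProtocolDemo are hypothetical names, the in-memory array stands in for db.Protocols, and it has not been run against EF Core 2.2.4, so whether that version translates it is an assumption.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public class Protocol
{
    public int ProtocolId { get; set; }
    public string Repnr { get; set; }
    public string Service { get; set; }

    // Hypothetical helper: "same Repnr and Service" as an expression tree,
    // built only from property comparisons a provider can inspect.
    public static Expression<Func<Protocol, bool>> SameKeyAs(Protocol other)
    {
        string repnr = other.Repnr;
        string service = other.Service;
        return p => p.Repnr == repnr && p.Service == service;
    }
}

public static class ProtocolDemo
{
    public static Protocol Find(IQueryable<Protocol> protocols, Protocol wanted)
    {
        // Uses the Queryable overload, so the expression reaches the provider.
        return protocols.FirstOrDefault(Protocol.SameKeyAs(wanted));
    }

    public static void Main()
    {
        // In-memory stand-in for db.Protocols.
        var stored = new[]
        {
            new Protocol { ProtocolId = 1, Repnr = "1666", Service = "180" }
        }.AsQueryable();

        var probe = new Protocol { Repnr = "1666", Service = "180" };
        Console.WriteLine(Find(stored, probe)?.ProtocolId); // 1
    }
}
```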

Error: “Cannot implicitly convert type”

I am working on a project and I have a question about type conversion. I want to create a simple search for my project, but it fails with this message:
Error 1 Cannot implicitly convert type 'System.Collections.Generic.List<EmployeeDataAccess.TimeWorkMonthly>' to 'EmployeeDataAccess.TimeWorkMonthly'
public TimeWorkMonthly Get(int id)
{
using (EmployeeDbEntities Entities = new EmployeeDbEntities())
{
List<TimeWorkMonthly> persons = new List<TimeWorkMonthly>();
var result = Entities.TimeWorkMonthlies
.Where(e => e.KartNo == id)
.Select(e => e)
.ToList();
return result.ToList();
}
}
The return type of your method is TimeWorkMonthly, but the method body returns a List<TimeWorkMonthly>.
You should either
1) change your method's return type to IEnumerable<TimeWorkMonthly>
(You could use List<TimeWorkMonthly>, but using an interface to abstract a collection type is better for many reasons)
2) use the FirstOrDefault, First, SingleOrDefault or Single extension methods of IEnumerable if you aim to return only one element and you do not care about anything except for the first element
Which of those methods is better depends on your data and search criteria - i.e. whether you expect this ID to be unique or not.
From your semantics it looks like you're doing a sort of repository-like ID lookup, so my guess would be solution 2), using Single or SingleOrDefault.
The last choice is how you want your program to behave if nothing is found by ID
If you want an exception, use Single
If you want a null use SingleOrDefault
In summary, all you have to do is change your last line of code to
return result.Single();
(And of course, you don't need a call to ToList() just before that)
Your method signature indicates you just want to return a single object. But you're returning a List of objects. Using .ToList() is not appropriate when you just want to return one object. There are four appropriate extension methods:
First - will return the first item from the collection and throw an exception if the collection is empty.
FirstOrDefault - will return the first item in the collection, or the default of the type if the collection is empty.
Single - if there is one item in the collection, it will return it. If there are no items in the collection an exception is thrown. If there are multiple items in the collection, an exception is thrown.
SingleOrDefault - if there is one item in the collection it will return it. If there are no items in the collection it will return the default value for the type. If there are multiple items in the collection it will throw an exception.
Since you're searching by ID, you probably don't ever expect to match two or more elements. So that rules out First and FirstOrDefault. You should use Single or SingleOrDefault depending on what you want the behavior to be if there is no item found that has the matching ID.
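The four behaviors described above can be checked quickly on small in-memory arrays. This is just an illustrative sketch; SingleVsFirstDemo and the Thrown helper are made up for the example.

```csharp
using System;
using System.Linq;

public static class SingleVsFirstDemo
{
    // Returns the name of the exception type thrown by the action, or "none".
    public static string Thrown(Action action)
    {
        try { action(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    public static void Main()
    {
        int[] empty = { };
        int[] one = { 7 };
        int[] two = { 7, 8 };

        Console.WriteLine(empty.FirstOrDefault());               // 0, the default of int
        Console.WriteLine(empty.SingleOrDefault());              // 0
        Console.WriteLine(one.Single());                         // 7
        Console.WriteLine(two.First());                          // 7
        Console.WriteLine(Thrown(() => empty.First()));          // InvalidOperationException
        Console.WriteLine(Thrown(() => empty.Single()));         // InvalidOperationException
        Console.WriteLine(Thrown(() => two.Single()));           // InvalidOperationException
        Console.WriteLine(Thrown(() => two.SingleOrDefault()));  // InvalidOperationException
    }
}
```

For a reference type such as TimeWorkMonthly, the default returned by the OrDefault variants is null rather than 0.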
public TimeWorkMonthly Get(int id)
{
    using (EmployeeDbEntities Entities = new EmployeeDbEntities())
    {
        var result = Entities.TimeWorkMonthlies.Where(e => e.KartNo == id).Single();
        return result;
    }
}
Note I eliminated the persons variable because you never did anything with it. And your usage of the .Select extension method was superfluous, since you just selected the same object already being iterated over. Select is for when you want to transform the object.
The problem is in your query and the return type. If you want to return the result of ToList(), the method has to return a list, so change it like this:
public List<TimeWorkMonthly> Get(int id)
{
    using (EmployeeDbEntities Entities = new EmployeeDbEntities())
    {
        // Query keywords (from, where, select) are lowercase in C#.
        var result = from x in Entities.TimeWorkMonthlies
                     where x.KartNo == id
                     select x;
        return result.ToList();
    }
}
You can now convert it to a list with ToList() and use it according to your need.

How to pass a predicate as parameter c#

How can I pass a predicate into a method but also have it work if no predicate is passed? I thought maybe something like this, but it doesn't seem to be correct.
private bool NoFilter() { return true; }
private List<thing> GetItems(Predicate<thing> filter = new Predicate<thing>(NoFilter))
{
return rawList.Where(filter).ToList();
}
private List<thing> GetItems(Func<thing, bool> filter = null)
{
return rawList.Where(filter ?? (s => true)).ToList();
}
In this expression s => true is the fallback filter which is evaluated if the argument filter is null. It just takes each entry of the list (as s) and returns true.
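A short usage sketch of that pattern, with and without a filter. The string list and the FilterDemo class are stand-ins for the question's rawList and thing type, which are not shown in the original.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterDemo
{
    // Hypothetical data; the question's "thing" type is replaced by string.
    private static readonly List<string> rawList =
        new List<string> { "apple", "kiwi", "banana" };

    public static List<string> GetItems(Func<string, bool> filter = null)
    {
        // Fall back to an accept-everything filter when none is given.
        return rawList.Where(filter ?? (s => true)).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(GetItems().Count);                   // 3, no filter passed
        Console.WriteLine(GetItems(s => s.Length > 4).Count);  // 2, "apple" and "banana"
    }
}
```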
There are two parts to this.
First, you need to adjust the NoFilter() function to be compatible with Predicate<T>. Notice the latter is generic, but the former is not. Make NoFilter() look like this:
private bool NoFilter<T>(T item) { return true; }
I know you never use the generic type argument, but it's necessary to make this compatible with your predicate signature.
For fun, you could also define NoFilter this way (as a field it has to name the concrete type, unless the containing class is itself generic):
private Predicate<thing> NoFilter = x => true;
Now the second part: we can look at using the new generic method as the default argument for GetItems(). The trick here is you can only use constants. While NoFilter() will never change, from the compiler's view that's not quite the same thing as a formal constant. In fact, there is only one possible constant you can use for this: null. That means your method signature must look like this:
private List<thing> GetItems(Predicate<thing> filter = null)
Then you can check for null at the beginning of your function and replace it with NoFilter:
private List<thing> GetItems(Predicate<thing> filter = null)
{
    if (filter == null) filter = NoFilter;
    // Enumerable.Where expects a Func<thing, bool>, so wrap the Predicate<thing>.
    return rawList.Where(item => filter(item)).ToList();
}
And if you also do want to explicitly pass this to the method when calling it, that would look like this:
var result = GetItems(NoFilter);
That should fully answer the original question, but I don't want to stop here. Let's look deeper.
Since you need the if condition anyway now, at this point I would tend to remove the NoFilter<T>() method entirely, and just do this:
private IEnumerable<thing> GetItems(Predicate<thing> filter = null)
{
    if (filter == null) return rawList;
    // Enumerable.Where expects a Func<thing, bool>, so wrap the Predicate<thing>.
    return rawList.Where(item => filter(item));
}
Note that I also changed the return type and removed the ToList() call at the end. If you find yourself calling ToList() at the end of a function just to match a List<T> return type, it's almost always much better to change the method signature to return IEnumerable<T> instead. If you really need a list (and usually, you don't), you can always call ToList() after calling the function.
This change makes your method more useful, by giving you a more abstract type that will be more compatible with other interfaces, and it potentially sets you up for a significant performance bump, both in terms of lowered memory use and in terms of lazy evaluation.
One final addition here is, if you do pare down to just IEnumerable, we can see now this method does not really provide much value at all beyond the base rawList field. You might look at converting to a property, like this:
public IEnumerable<thing> Items {get {return rawList;}}
This still allows the consumer of your type to use a predicate (or not) if they want via the existing .Where() method, while also continuing to hide the underlying raw data (you can't directly just call .Add() etc on this).

Why does this linq extension method hit the database twice?

I have an extension method called ToListIfNotNullOrEmpty(), which is hitting the DB twice, instead of once. The first time it returns one result, the second time it returns all the correct results.
I'm pretty sure the first time it hits the database, is when the .Any() method is getting called.
here's the code.
public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
if (value.IsNullOrEmpty())
{
return null;
}
if (value is IList<T>)
{
return (value as IList<T>);
}
return new List<T>(value);
}
public static bool IsNullOrEmpty<T>(this IEnumerable<T> value)
{
if (value != null)
{
return !value.Any();
}
return true;
}
I'm hoping to refactor it so that, before the .Any() method is called, it actually enumerates through the entire list.
If I do the following, only one DB call is made, because the list is already enumerated.
var pewPew = (from x in whatever
              select x)
             .ToList() // This enumerates.
             .ToListIfNotNullOrEmpty(); // This checks the enumerated result.
I sorta don't really want to call ToList() then my extension method.
Any ideas, folks?
I confess that I see little point in this method. Surely if you simply do a ToList(), a check to see if the list is empty suffices as well. It's arguably harder to handle the null result when you expect a list because then you always have to check for null before you iterate over it.
I think that:
var query = (from ...).ToList();
if (query.Count == 0) {
...
}
works as well and is less burdensome than
var query = (from ...).ToListIfNotNullOrEmpty();
if (query == null) {
...
}
and you don't have to implement (and maintain) any code.
How about something like this?
public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
if (value == null)
return null;
var list = value.ToList();
return (list.Count > 0) ? list : null;
}
To actually answer your question:
This method hits the database twice because the extension methods provided by the System.Linq.Enumerable class exhibit what is called deferred execution. Essentially, this is to eliminate the need for constructing a string of temporarily cached collections for every part of a query. To understand this, consider the following example:
var firstMaleTom = people
    .Where(p => p.Gender == Gender.Male)
    .Where(p => p.FirstName == "Tom")
    .FirstOrDefault();
Without deferred execution, the above code might require that the entire collection people be enumerated over, populating a temporary buffer array with all the individuals whose Gender is Male. Then it would need to be enumerated over again, populating another buffer array with all of the individuals from the first buffer whose first name is Tom. After all that work, the last part would return the first item from the resulting array.
That's a lot of pointless work. The idea with deferred execution is that the above code really just sets up the firstMaleTom variable with the information it needs to return what's being requested with the minimal amount of work.
Now, there's a flip side to this: in the case of querying a database, deferred execution means that the database gets queried when the return value is evaluated. So, in your IsNullOrEmpty method, when you call Any, the value parameter is actually being evaluated right then and there -- hence a database query. After this, in your ToListIfNotNullOrEmpty method, the line return new List<T>(value) also evaluates the value parameter -- because it's enumerating over the values and adding them to the newly created List<T>.
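To make the two evaluations visible without a database, here is a small sketch in which each enumeration of a lazy sequence bumps a counter. FakeQuery and the Queries counter are invented for the example and merely simulate a round trip; AnyThenCopy mirrors the shape of the original extension and CopyThenCheck the materialize-first variant.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RoundTripDemo
{
    // Number of times the simulated "database" was queried.
    public static int Queries;

    // Hypothetical lazy source: each enumeration counts as one query.
    public static IEnumerable<int> FakeQuery()
    {
        Queries++;
        yield return 1;
        yield return 2;
    }

    // Shape of the original extension: Any() and new List<T>() each enumerate.
    public static IList<int> AnyThenCopy(IEnumerable<int> value)
    {
        if (value == null || !value.Any()) return null;
        return new List<int>(value);
    }

    // Materialize once, then inspect the in-memory list.
    public static IList<int> CopyThenCheck(IEnumerable<int> value)
    {
        if (value == null) return null;
        List<int> list = value.ToList();
        return list.Count > 0 ? list : null;
    }

    public static void Main()
    {
        Queries = 0;
        AnyThenCopy(FakeQuery());
        Console.WriteLine(Queries);   // 2

        Queries = 0;
        CopyThenCheck(FakeQuery());
        Console.WriteLine(Queries);   // 1
    }
}
```

Any() stops after the first element, which matches the observation in the question that the first database hit returns a single result.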
You could stick the .ToList() call inside the extension. The effect is slightly different, but does this still work in the cases you have?
public static IList<T> ToListIfNotNullOrEmpty<T>(this IEnumerable<T> value)
{
if(value == null)
{
return null;
}
var result = value.ToList();
return result.IsNullOrEmpty() ? null : result;
}

NpgSQLdataReader GetOrdinal throwing exceptions.. any way around?

I built a wrapper around NpgSQL for a bunch of the methods I usually use in my projects' DAL. Two of them I usually use to fill DTOs straight from a DataReader. Usually, in a fill helper method, I'll instantiate the DTO and iterate through the properties, mapping the DataReader's data to the corresponding property. The fill method is generated most of the time.
Since I allow many of the properties to be null or to use the DTO's default values, I've used a method to check whether the DataReader's data is valid for the property before filling in the property. So I'll have IsValidString("fieldname") and DRGetString("fieldname") methods, like so:
public bool IsValidString(string fieldName)
{
if (data.GetOrdinal(fieldName) != -1
&& !data.IsDBNull(data.GetOrdinal(fieldName)))
return true;
else
return false;
}
public string DRGetString(string fieldName)
{
return data.GetString(data.GetOrdinal(fieldName));
}
My fill method is delegated to whatever method executed the query and looks like:
public static object FillObject(DataParse<PostgreSQLDBDataParse> dataParser)
{
TipoFase obj = new TipoFase();
if (dataParser.IsValidInt32("T_TipoFase"))
obj.T_TipoFase = dataParser.DRGetInt32("T_TipoFase");
if (dataParser.IsValidString("NM_TipoFase"))
obj.NM_TipoFase = dataParser.DRGetString("NM_TipoFase");
//...rest of the properties .. this is usually autogenerated by a T4 template
return obj;
}
This was working fine and dandy in NpgSQL pre 2.02. When the GetOrdinal method was called and the field did not exist in the DataReader, I'd simply get -1 returned. Easy to return false in IsValidString() and simply skip to the next property. The performance hit from checking nonexistent fields was practically negligible.
Unfortunately, changes to NpgSQL make GetOrdinal throw an exception when the field doesn't exist. I have a simple workaround in which I wrap the code in a try/catch and return false within the catch. But I can feel the hit in performance, especially when I go into debug mode. Filling in a long list takes minutes.
Supposedly, NpgSQL has a parameter that can be added to the connection string (Compatability) to support backward compatibility for this method, but I've never got that to work correctly (I always get an exception because of a malformed connection string). Anyway, I'm looking for suggestions for better workarounds. Any better way to fill in the object from the DataReader, or even somehow work around the exception problem?
I have created a solution to my problem that doesn't require great changes and shows interesting performance (or so it seems). Might just be a new parsing library / wrapper.
Basically, I'll iterate through the DataReader's fields and copy each to a collection (in my case a List). Then I'll check for valid data and, if it is considered valid, I'll copy the data to the object's property.
So I'll have:
public class ParserFields
{
public string FieldName { get; set; }
public Type FieldType { get; set; }
public object Data { get; set; }
}
and I'll fill the object using:
public static object FillObjectHashed(DataParse<PostgreSQLDBDataParse> dataParser)
{
//Get the field list with field type and data
List<ParserFields> pflist = dataParser.GetReaderFieldList();
//create resulting object instance
CandidatoExtendido obj = new CandidatoExtendido();
//check for existing field and valid data and create object
ParserFields pfdt = pflist.Find(objt => objt.FieldName == "NS_Candidato");
if (pfdt != null && pfdt.FieldType == typeof(int) && pfdt.Data.ToString() != String.Empty)
obj.NS_Candidato = (int)pfdt.Data;
pfdt = pflist.Find(objt => objt.FieldName == "NM_Candidato");
if (pfdt != null && pfdt.FieldType == typeof(string) && pfdt.Data.ToString() != String.Empty)
obj.NM_Candidato = (string)pfdt.Data;
pfdt = pflist.Find(objt => objt.FieldName == "Z_Nasc");
if (pfdt != null && pfdt.FieldType == typeof(DateTime) && pfdt.Data.ToString() != String.Empty)
obj.Z_Nasc = (DateTime)pfdt.Data;
//...
return obj;
}
I timed my variations to see the differences. I did a search that returned 612 results. First I queried the database twice to take into account the first run of the query and the subsequent differences related to caching (and those were quite significant). My FillObject method simply created a new instance of the desired object to be added to the results list.
1st query to List of object's instances : 2896K ticks
2nd query (same as first) : 1141K ticks
Then I tried using the previous fill objects
To List of desired object, filled with return data or defaults, checking all of the objects properties: 3323K ticks
To List of desired objects, checking only the object's properties returned in the search: 1127K ticks
To list of desired objects, using lookup list, checking only the returned fields: 1097K ticks
To list of desired objects, using lookup list, checking all of the fields (minus a few nested properties): 1107K ticks
The original code I was using was consuming nearly 3 times more ticks than when using a method limited to the desired fields. The exceptions were killing it.
With the new code for the FillObject method, the overhead for checking nonexistent fields was minimal compared to just checking for the desired fields.
This seems to work nicely, for now at least. Might try looking for a couple of optimizations.
Any suggestion will be appreciated!
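A related idea, sketched here only as a possibility: build a name-to-ordinal lookup once per reader from FieldCount and GetName, so a missing field becomes a failed dictionary lookup instead of an exception. The sketch relies only on the standard System.Data IDataRecord members and uses an in-memory DataTable reader so it runs without a database. OrdinalLookupDemo, BuildOrdinals, IsValid and SampleReader are hypothetical names, not part of NpgSQL or of the wrapper above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class OrdinalLookupDemo
{
    // Builds a name -> ordinal map once per reader.
    public static Dictionary<string, int> BuildOrdinals(IDataRecord record)
    {
        var ordinals = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < record.FieldCount; i++)
            ordinals[record.GetName(i)] = i;
        return ordinals;
    }

    // True when the field exists and is not DBNull; a missing field just returns false.
    public static bool IsValid(IDataRecord record, Dictionary<string, int> ordinals, string fieldName)
    {
        return ordinals.TryGetValue(fieldName, out int ordinal) && !record.IsDBNull(ordinal);
    }

    // In-memory reader with one row, standing in for a real query result.
    public static IDataReader SampleReader()
    {
        var table = new DataTable();
        table.Columns.Add("T_TipoFase", typeof(int));
        table.Columns.Add("NM_TipoFase", typeof(string));
        table.Rows.Add(1, DBNull.Value);
        return table.CreateDataReader();
    }

    public static void Main()
    {
        using (IDataReader reader = SampleReader())
        {
            Dictionary<string, int> ordinals = BuildOrdinals(reader);
            reader.Read();
            Console.WriteLine(IsValid(reader, ordinals, "T_TipoFase"));  // True
            Console.WriteLine(IsValid(reader, ordinals, "NM_TipoFase")); // False, value is DBNull
            Console.WriteLine(IsValid(reader, ordinals, "Missing"));     // False, no such field
        }
    }
}
```

Compared with copying every value into a list of ParserFields, this keeps the data in the reader and only caches the column positions; which of the two is faster in practice would need to be measured.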
