I'm working with Simple.Data. The answer does not depend on that technology, but it helps get the point across, so ignore the syntax etc.
I am querying a database with a simple query, but the query changes based on a set of conditions.
For example (very simplified; in reality there are probably 5-10 conditions):
var result;
if(LoggedAtSelected)
{
// Condition 1 - Calls Logged after a certain date
result = db.Jobs.FindAll(db.Jobs.Logged_At >= startDate);
}
else
{
// Condition 2 - Calls Closed after a certain date
result = db.Jobs.FindAll(db.Jobs.Closed_At >= startDate && db.Jobs.Closed_At <= endDate);
}
foreach(var JobRecord in result)
{
}
The code above is what I would ideally write, but sadly it is not possible: a var declaration needs an initializer so the compiler can infer its type. What is the best practice for this kind of situation? My only idea is to write a "var result = condition..." for every condition and, in the if..else if..else, convert it to a known type and assign it to a variable declared outside the branches, then use that variable in the "foreach". That sounds like a lot of work. Any ideas? Or is that it?
Instead of:
var result;
Use the actual type returned by db.Jobs.FindAll:
IEnumerable<Job> result;
You can only use var if the compiler can know exactly which type to use (or how to define a new type for you).
In your case you can either define it with the type say
List<Job> result;
or call the constructor to return an instance:
var result = new List<Job>();
(Of course your query will return an IEnumerable instance instead of a List; I just used List as an example because you can't instantiate an interface.)
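As a minimal sketch of the explicit-type approach: the Job class, the JobQueries helper, and the in-memory source below are assumptions made for illustration, standing in for the real table and for whatever type db.Jobs.FindAll actually returns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the Jobs table.
public class Job
{
    public DateTime LoggedAt { get; set; }
    public DateTime? ClosedAt { get; set; }
}

public static class JobQueries
{
    // Because result has an explicit type, it can be declared first
    // and assigned in whichever branch applies.
    public static IEnumerable<Job> Select(IEnumerable<Job> jobs, bool loggedAtSelected,
                                          DateTime startDate, DateTime endDate)
    {
        IEnumerable<Job> result;
        if (loggedAtSelected)
        {
            // Condition 1 - jobs logged on or after startDate
            result = jobs.Where(j => j.LoggedAt >= startDate);
        }
        else
        {
            // Condition 2 - jobs closed between startDate and endDate
            result = jobs.Where(j => j.ClosedAt >= startDate && j.ClosedAt <= endDate);
        }
        return result;
    }
}
```

The returned sequence can then be consumed by a single foreach, regardless of which branch produced it.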
Just as a note: since your if statements determine the filters for the query rather than the query itself, you might want to build up a SimpleExpression there and run the query afterwards. For example:
SimpleExpression whereClause;
if(LoggedAtSelected)
{
// Condition 1 - Calls Logged after a certain date
whereClause = db.Jobs.Logged_At >= startDate;
}
else
{
// Condition 2 - Calls Closed after a certain date
whereClause = db.Jobs.Closed_At >= startDate && db.Jobs.Closed_At <= endDate;
}
List<Job> results = db.Jobs.All.Where(whereClause);
foreach(Job record in results)
{
...
}
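The same "build the filter first, run the query once" idea can be sketched without Simple.Data. In this sketch an ordinary Func<Job, bool> stands in for SimpleExpression, and the Job class, JobFilter helper, and parameter names are all hypothetical. It also shows one way to let several optional conditions accumulate instead of choosing between exactly two.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the Jobs table.
public class Job
{
    public DateTime LoggedAt { get; set; }
    public DateTime? ClosedAt { get; set; }
}

public static class JobFilter
{
    // Each supplied argument adds one condition; null means "not selected".
    public static Func<Job, bool> Build(DateTime? loggedFrom, DateTime? closedFrom, DateTime? closedTo)
    {
        var conditions = new List<Func<Job, bool>>();
        if (loggedFrom.HasValue)
            conditions.Add(j => j.LoggedAt >= loggedFrom.Value);
        if (closedFrom.HasValue)
            conditions.Add(j => j.ClosedAt >= closedFrom.Value);
        if (closedTo.HasValue)
            conditions.Add(j => j.ClosedAt <= closedTo.Value);

        // A job matches when every selected condition holds.
        return j => conditions.All(c => c(j));
    }
}
```

The query itself then appears only once, for example jobs.Where(JobFilter.Build(startDate, null, null)), however many conditions exist.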
I have the code below, which calculates the running total for a customer account statement. However, the first value is always added to itself and I'm not sure why, though I suspect I've missed something obvious:
decimal? runningTotal = 0;
IEnumerable<StatementModel> statement = sage.Repository<FDSSLTransactionHistory>()
.Queryable()
.Where(x => x.CustomerAccountNumber == sageAccount)
.OrderBy(x=>x.UniqueReferenceNumber)
.AsEnumerable()
.Select(x => new StatementModel()
{
SLAccountId = x.CustomerAccountNumber,
TransactionReference = x.TransactionReference,
SecondReference = x.SecondReference,
Currency = x.CurrencyCode,
Value = x.GoodsValueInAccountCurrency,
TransactionDate = x.TransactionDate,
TransactionType = x.TransactionType,
TransactionDescription = x.TransactionTypeName,
Status = x.Status,
RunningTotal = (runningTotal += x.GoodsValueInAccountCurrency)
});
Which outputs:
29/02/2012 00:00:00 154.80 309.60
30/04/2012 00:00:00 242.40 552.00
30/04/2012 00:00:00 242.40 794.40
30/04/2012 00:00:00 117.60 912.00
Where the 309.60 of the first row should be simply 154.80
What have I done wrong?
EDIT:
As per ahruss's comment below, I was calling Any() on the result in my View, causing the first to be evaluated twice - to resolve I appended ToList() to my query.
Thanks all for your suggestions
Add a ToList() to the end of the call to avoid duplicate invocations of the selector.
This is a stateful LINQ query with side-effects, which is by nature unpredictable. Somewhere else in the code, you called something that caused the first element to be evaluated, like First() or Any(). In general, it is dangerous to have side-effects in LINQ queries, and when you find yourself needing them, it's time to think about whether or not it should just be a foreach instead.
Edit, or Why is this happening?
This is a result of how LINQ queries are evaluated: until you actually use the results of a query, nothing really happens to the collection. It doesn't evaluate any of the elements. Instead, it stores expression trees or just the delegates it needs to evaluate the query. Then it evaluates those only when the results are needed, and unless you explicitly store the results, they're thrown away afterwards and re-evaluated the next time.
So the question becomes: why does it produce different results each time? The answer is that runningTotal is only initialized the first time around. After that, its value is whatever it was after the last execution of the query, which can lead to strange results.
This means the question could just as easily have been "Why is the total always twice what it should be?" if the asker were doing something like this:
Console.WriteLine(statement.Count()); // this enumerates all the elements!
foreach (var item in statement) { Console.WriteLine(item.RunningTotal); }
Because the only way to get the number of elements in the sequence is to actually evaluate all of them.
Similarly, what actually happened in this question was that somewhere there was code like this:
if (statement.Any()) // this actually involves getting the first result
{
// do something with the statement
}
// ...
foreach (var item in statement) { Console.WriteLine(item.RunningTotal); }
It seems innocuous, but if you know how LINQ and IEnumerable work, you know that .Any() is basically the same as .GetEnumerator().MoveNext(), which makes it more obvious that it requires getting the first element.
It all boils down to the fact that LINQ is based on deferred execution, which is why the solution is to use ToList, which circumvents that and forces immediate execution.
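The effect can be reproduced in isolation. The sketch below is not the asker's code: RunningTotalDemo and its iterator source are hypothetical, and the four values are taken from the output shown in the question. An iterator method is used as the source so that Any() really has to pull the first element through the selector.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunningTotalDemo
{
    // Iterator source, standing in for the database query.
    static IEnumerable<decimal> Source()
    {
        yield return 154.80m;
        yield return 242.40m;
        yield return 242.40m;
        yield return 117.60m;
    }

    // Deferred query: Any() runs the selector for the first element,
    // so the later foreach starts from 154.80 instead of 0.
    public static List<decimal> Deferred()
    {
        decimal runningTotal = 0;
        IEnumerable<decimal> totals = Source().Select(v => runningTotal += v);
        var output = new List<decimal>();
        if (totals.Any())
        {
            foreach (var t in totals) output.Add(t);
        }
        return output; // 309.60, 552.00, 794.40, 912.00
    }

    // ToList() runs the selector exactly once per element up front.
    public static List<decimal> Materialized()
    {
        decimal runningTotal = 0;
        List<decimal> totals = Source().Select(v => runningTotal += v).ToList();
        var output = new List<decimal>();
        if (totals.Any())
        {
            foreach (var t in totals) output.Add(t);
        }
        return output; // 154.80, 397.20, 639.60, 757.20
    }
}
```

Deferred() reproduces the doubled first row from the question, while Materialized() gives the expected totals.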
If you don't want to freeze the results with ToList, a solution to the outer scope variable problem is using an iterator function, like this:
IEnumerable<StatementModel> GetStatement(IEnumerable<DataObject> source) {
decimal runningTotal = 0;
foreach (var x in source) {
yield return new StatementModel() {
...
RunningTotal = (runningTotal += x.GoodsValueInAccountCurrency)
};
}
}
Then pass to this function the source query (not including the Select):
var statement = GetStatement(sage.Repository...AsEnumerable());
Now it is safe to enumerate statement multiple times. Basically, this creates an enumerable that re-executes this entire block on each enumeration, as opposed to executing a selector (which equates to only the foreach part) -- so runningTotal will be reset.
I don't think this is possible, but I wanted to ask to make sure. I am currently debugging some software someone else wrote, and it's a bit unfinished.
One part of the software is a search function which searches by different fields in the database. The person who wrote it used a great big case statement with 21 cases in it, one for each field the user may want to search by.
Is it possible to reduce this down, using either a case statement within the LINQ query or a variable I can set with a case statement before the LINQ statement?
Example of 1 of the Linq queries: (Only the Where is changing in each query)
var list = (from data in dc.MemberDetails
where data.JoinDate.ToString() == searchField
select new
{
data.MemberID,
data.FirstName,
data.Surname,
data.Street,
data.City,
data.County,
data.Postcode,
data.MembershipCategory,
data.Paid,
data.ToPay
}
).ToList();
Update / Edit:
This is what comes before the case statement:
string searchField = txt1stSearchTerm.Text;
string searchColumn = cmbFirstColumn.Text;
switch (cmbFirstColumn.SelectedIndex + 1)
{
The cases are then done by the index of the combo box which holds the list of field names.
Given that where takes a predicate, you can pass any method or function which takes MemberDetail as a parameter and returns a boolean, then migrate the switch statement inside.
private bool IsMatch(MemberDetail detail)
{
// The comparison goes here.
}
var list = (from data in dc.MemberDetails
where this.IsMatch(data)
select new
{
data.MemberID,
data.FirstName,
data.Surname,
data.Street,
data.City,
data.County,
data.Postcode,
data.MembershipCategory,
data.Paid,
data.ToPay
}
).ToList();
Note that:
You may look for a more object-oriented way to do the comparison, rather than using a huge switch block.
An anonymous type with ten properties that you use in your select is kinda weird. Can't you return an instance of MemberDetail? Or an instance of its base class?
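One possible shape for replacing the switch is a lookup from column name to predicate. This is only a sketch: the MemberDetail class below is a cut-down, hypothetical version of the real entity, the MemberSearch helper and its three columns are invented for illustration, and the predicates are built as expression trees on the assumption that the query provider needs to translate them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical, reduced version of the entity from the question.
public class MemberDetail
{
    public int MemberID { get; set; }
    public string FirstName { get; set; }
    public string Surname { get; set; }
    public string City { get; set; }
}

public static class MemberSearch
{
    // Maps a column name to a factory that builds the predicate for a search term.
    static readonly Dictionary<string, Func<string, Expression<Func<MemberDetail, bool>>>> Filters =
        new Dictionary<string, Func<string, Expression<Func<MemberDetail, bool>>>>
        {
            { "FirstName", term => m => m.FirstName == term },
            { "Surname",   term => m => m.Surname == term },
            { "City",      term => m => m.City == term },
        };

    // One query; only the predicate varies with the chosen column.
    public static IQueryable<MemberDetail> Search(IQueryable<MemberDetail> source,
                                                  string column, string term)
    {
        return source.Where(Filters[column](term));
    }
}
```

With this layout, adding a searchable field means adding one dictionary entry rather than another copy of the whole query.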
How are the different where statements handled? Are they mutually exclusive, or do they all limit the query somehow?
Here is how you can apply one or more filters to the same query and materialize it only after all filters have been applied.
var query = (from data in dc.MemberDetails
select ....);
if (!String.IsNullOrEmpty(searchField))
query = query.Where(pr => pr.JoinDate.ToString() == searchField);
if (!String.IsNullOrEmpty(otherField))
query = query.Where(....);
return query.ToList();
I am searching the land for an elegant, reusable solution to a problem that has been bothering me for ages.
Say I have some business logic I use all over the site (don't get held up on how simple this is; it could be complex):
public DateTime ExpiryDate
{
get { return DateAdded.Date.AddMonths(ApplicationConfiguration.Rule3ExpiryLengthInMonths); }
}
And a Linq statement:
groupedByPatient.Count(x =>
x.Max(a => System.Data.Objects.EntityFunctions.AddMonths(a.DateAdded, ApplicationConfiguration.Rule3ExpiryLengthInMonths))
<= DateTime.Now);
This "expired" logic has got to be repeated as (understandably) Expired is not a column in the db. The net result is that we end up with scattered business logic across the code. Ideally we would have:
var count = groupedByPatient.Count(x =>
x.Max(a => a.ExpiryDate)
<= DateTime.Now);
Theoretically as long as you conform to Linq's "c#" rules you should be able to abstract this code out, say:
public DateTime ExpiryDate
{
get { return System.Data.Objects.EntityFunctions.AddMonths(
DateAdded, ApplicationConfiguration.Rule3ExpiryLengthInMonths).D }
}
Why don't you create an extension method on DateTime? That way, whenever you have a date you can just call that to get your expiry date:
static class DateTimeExtensions
{
public static DateTime ExpiryDate(this DateTime dte)
{
return dte.AddMonths(ApplicationConfiguration.Rule3ExpiryLengthInMonths);
}
}
If I understand your example correctly, DateAdded is a date column in your table, from which you wish to find the expiry date. Then, just do this:
var count = groupedByPatient.Count(x =>
x.Max(a => a.DateAdded.ExpiryDate()) <= DateTime.Now);
I'm not sure from the subject of this vs the code samples you've put in, but I'm pretty sure the 2nd here is what you're looking for.
If you just want the result for a materialised query (i.e. after you've got the data), then use extensions:
public static DateTime ToExpiryDate(this DateTime date)
{
return date.AddMonths(ApplicationConfiguration.Rule3ExpiryLengthInMonths);
}
If you want the result from within an IQueryable (which by your subject is what I think you are looking for), then you can use expressions:
public static Expression<Func<IEnumerable<YourEntity>, DateTime?>> MaxExpiryDate = (y) => y.Max(
a => System.Data.Objects.EntityFunctions.AddMonths(a.DateAdded, ApplicationConfiguration.Rule3ExpiryLengthInMonths)
);
Then your query would look like:
var count = groupedByPatient.Count(x => x.YourEntities(MaxExpiryDate) <= DateTime.Now);
NOTE: The Func<> MUST be wrapped in Expression<>. Both versions will appear to work, but without the Expression<> wrapper the query is forced to materialise before the function runs. Wrapping the function in Expression<> tells EF to evaluate it as part of the query.
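The difference between the two can be seen without a database. In this sketch (ExpressionVsFunc is a hypothetical helper, and an in-memory list wrapped with AsQueryable stands in for the EF query), passing an Expression<Func<...>> binds to Queryable.Where and the result stays an IQueryable, while passing a plain Func<...> binds to Enumerable.Where and the result is only an in-memory IEnumerable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class ExpressionVsFunc
{
    static IQueryable<int> Source()
    {
        return new List<int> { 1, 2, 3, 4 }.AsQueryable();
    }

    // Expression<Func<...>>: overload resolution picks Queryable.Where,
    // so the filter remains part of the query the provider sees.
    public static bool StaysQueryableWithExpression()
    {
        Expression<Func<int, bool>> isEven = n => n % 2 == 0;
        object filtered = Source().Where(isEven);
        return filtered is IQueryable<int>;
    }

    // Func<...>: only Enumerable.Where applies, so everything after
    // this point runs in memory over the enumerated source.
    public static bool StaysQueryableWithFunc()
    {
        Func<int, bool> isEven = n => n % 2 == 0;
        object filtered = Source().Where(isEven);
        return filtered is IQueryable<int>;
    }
}
```

Both calls return the same elements when enumerated, which is why the Func version "appears to work" even though it has left the query provider behind.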
Edit: Changed my test around as there was a flaw with the way the test was being run.
I was fighting some performance issues with Fluent Nhibernate recently and I came across something I thought was very odd. When I made an IEnumerable a List performance increased dramatically. I was trying to figure out why. It didn't seem like it should, and google didn't turn anything up.
Here's the basic test I ran:
//Class has various built in type fields, but no references to anything
public class Something
{
public int ID;
public decimal Value;
}
var someRepository = new Repository(uow);
//RUN 1
var start = DateTime.Now;
// Returns a IEnumerable from a session.Linq<SomeAgg> based on the passed in parameters, nothing fancy. Has about 1300 rows that get returned.
var somethings = someRepository.GetABunchOfSomething(various, parameters);
var returnValue = SumAllFunction(somethings);
var timeSpent = DateTime.Now - start; //Takes {00:00:00.3580358} on my box
//RUN2
var start2 = DateTime.Now;
var returnValue2 = SumAllFunction(somethings);
var timeSpent2 = DateTime.Now - start2; //Takes {00:00:00.0560000} on my box
public decimal SumAllFunction(IEnumerable<Something> somethings)
{
return somethings.Sum(x => x.Value); //Value is a decimal that's part of the Something class
}
Now if I take the same code and just change the line that calls someRepository.GetABunchOfSomething so that it appends .ToList():
//RUN 1
var start = DateTime.Now;
var somethings = someRepository.GetABunchOfSomething(various, parameters).ToList();
var returnValue = SumAllFunction(somethings);
var timeSpent = DateTime.Now - start; //Takes {00:00:00.3580358} on my box
//RUN 2
var start2 = DateTime.Now;
var returnValue2 = SumAllFunction(somethings);
var timeSpent2 = DateTime.Now - start2; //Takes {00:00:00.0010000} on my box
Nothing else changed. These results are very repeatable. So it's not just a one off timing issue.
The TLDR version is this:
When running the same IEnumerable through a loop twice, the second run takes anywhere from 10-20 times longer than if I change the IEnumerable to a List using .ToList() before running it through the two loops.
I checked the SQL, and when it's a List the SQL only gets run once; the results appear to be cached and used again rather than fetched from the database a second time.
If it's an IEnumerable, then every time it goes to access the children of the IEnumerable it makes a trip to the database to rehydrate them.
I understand that you can't add to/delete from an IEnumerable, but my understanding was that the IEnumerable would have been initially filled with the proxy objects and then the proxy objects would have been hydrated later on when needed. After they were hydrated you wouldn't have to go back to the DB again, but it does not appear to be that way. I obviously have a work around for this, but I thought it was odd and I was curious why it behaves the way it does.
When you call ToList() on your GetABunchOfSomething result, the query is performed at that moment, and the results are placed in a list. When you don't call ToList(), the query is not performed until SumAllFunction enumerates the results, and your timer doesn't take that into account.
I think you'll find that the time difference between the two is due to that.
Update
The results, though maybe counter-intuitive to you, make sense. The query isn't run until you iterate, and the results aren't cached, by design. Say you wanted to call your repository method in two places in your code: one time sorted by Foo, another time filtered by Bar. If the repository method returns an IQueryable<YourClass>, any additional modifications made to that object will actually affect the SQL that gets emitted rather than causing the collection to be modified in memory. For example, if you ran this:
someRepository
.GetABunchOfSomething(various, parameters)
.Where(s => s.Bar == "SomeValue");
The generated SQL might look something like this once you iterate:
select *
from someTable
where Bar = 'SomeValue'
However, if you did this instead:
someRepository
.GetABunchOfSomething(various, parameters)
.ToList()
.Where(s => s.Bar == "SomeValue");
Then you'll be retrieving all rows from the table instead, and your application would filter the results.
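The re-execution behaviour described above can be observed with a small counter. This is a sketch under assumptions: EnumerationCounter and GetValues are hypothetical, and an iterator method stands in for the repository call, with the counter incremented wherever the real code would hit the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCounter
{
    public static int QueryRuns;

    // Stand-in for the repository method. The body does not run when the
    // method is called, only when a consumer starts enumerating.
    public static IEnumerable<decimal> GetValues()
    {
        QueryRuns++; // where the real code would go to the database
        yield return 1m;
        yield return 2m;
        yield return 3m;
    }

    // Two enumerations of the deferred sequence run the "query" twice.
    public static int RunsWithoutToList()
    {
        QueryRuns = 0;
        IEnumerable<decimal> values = GetValues();
        values.Sum();
        values.Sum();
        return QueryRuns;
    }

    // ToList() enumerates once; later passes read the list in memory.
    public static int RunsWithToList()
    {
        QueryRuns = 0;
        List<decimal> values = GetValues().ToList();
        values.Sum();
        values.Sum();
        return QueryRuns;
    }
}
```

RunsWithoutToList() reports two runs and RunsWithToList() reports one, which matches the single SQL statement observed in the question once .ToList() was added.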
In the code below, I need to set "Variable1" to an arbitrary value so I don't get into scope issues further below. What is the best arbitrary value for a variable of type var, or is there a better way to avoid the scope issue I'm having?
var Variable1;
if(Something == 0)
{
//DB = DatabaseObject
Variable1 =
from a in DB.Table
select new {Data = a};
}
int RowTotal = Variable1.Count();
Well, you could do:
// It's not clear from your example what the type of Data should
// be; adjust accordingly.
var variable1 = Enumerable.Repeat(new { Data = 0 }, 0).AsQueryable();
if (something == 0)
{
//DB = DatabaseObject
variable1 = from a in DB.Table
select new {Data = a};
}
int rowTotal = variable1.Count();
This is effectively "typing by example". To be honest, I'd try to avoid it - but it's hard to know exactly how I'd do so without seeing the rest of the method. If possible, I'd try to keep the anonymous type scope as tight as possible.
Note: in this case you could just select a instead of an anonymous type. I'm assuming your real use case is more complex. Likewise if you genuinely only need the row total, then set that inside the braces. The above solution is only applicable if you really, really need the value of the variable later on.
Are you using Variable1 later in your code, or just to find the row count?
If the latter, it's just:
int RowTotal = DB.Table.Count();
If for the full block:
int RowTotal = (Something == 0) ? DB.Table.Count() : 0;
It looks like you can define it as IEnumerable. Then you can use the count function like you are trying to.
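One way to read that suggestion is to declare the variable as IEnumerable<object> and start it off empty. The sketch below assumes that reading; ScopeDemo is a hypothetical helper and a plain string sequence stands in for DB.Table. It relies on IEnumerable<T> being covariant, which allows a sequence of anonymous-type objects to be assigned to IEnumerable<object>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScopeDemo
{
    // table is a stand-in for DB.Table.
    public static int RowTotal(IEnumerable<string> table, int something)
    {
        // Declared with an explicit type and an empty default,
        // so it is definitely assigned whichever path is taken.
        IEnumerable<object> variable1 = Enumerable.Empty<object>();
        if (something == 0)
        {
            // The anonymous type is a class, so covariance allows this assignment.
            variable1 = from a in table
                        select new { Data = a };
        }
        return variable1.Count();
    }
}
```

The trade-off is that the Data property is no longer accessible through variable1 without reflection or dynamic, so this only suits cases like the row count where the element's shape does not matter afterwards.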