I have a table with an event id pk, date column and an event type column.
I want to use LINQ to get the count of events on each day.
The issue is that the table is sparse, i.e. values are not stored in days which did not have any events.
Since I want to use this data for a line chart, I need to fill out the data with the missing dates and give them a value of zero.
Is there any way to do this inside LINQ, or do I have to do it manually?
Is there any recommended method of doing this?
Edit:
I created the following method:
public string GetDailyData(int month, int year)
{
int days = DateTime.DaysInMonth(year,month);
DateTime firstOfTheMonth = new DateTime(year, month, 1);
PaymentModelDataContext db = new PaymentModelDataContext();
var q = from daynumber in Enumerable.Range(0, days)
let day = firstOfTheMonth.AddDays(daynumber)
join data in db.TrackingEvents on day equals data.timestamp.Day into d2
from x in d2.DefaultIfEmpty()
select Tuple.Create(x.Key, x.Value);
return ParseJson(q);
}
The problem is I get an error on the 'join' keyword:
"The type of one of the expressions in the join clause is incorrect. Type inference failed in the call to 'GroupJoin'"
Edit 2:
I made the changes suggested and tried to group the results.
When I send them to the parsing function, I get a null object ref error.
Here is the new code:
[WebMethod]
public string GetDailyData(int month, int year)
{
int days = DateTime.DaysInMonth(year, month);
DateTime firstOfTheMonth = new DateTime(year, month, 1);
PaymentModelDataContext db = new PaymentModelDataContext();
var q = from daynumber in Enumerable.Range(0, days)
let day = firstOfTheMonth.AddDays(daynumber)
join data in db.TrackingEvents on day equals data.timestamp.Date into d2
from x in d2.DefaultIfEmpty()
group x by x.timestamp.Date;
return ParseJson(q);
}
And the parsing function:
private string ParseJson<TKey, TValue>(IEnumerable<IGrouping<TKey, TValue>> q)
{
string returnJSON = "[{ \"type\" : \"pie\", \"name\" : \"Campaigns\", \"data\" : [ ";
foreach (var grp in q)
{
double currCount = grp.Count();
if (grp.Key != null)
returnJSON += "['" + grp.Key + "', " + currCount + "],";
else
returnJSON += "['none', " + currCount + "],";
}
returnJSON = returnJSON.Substring(0, returnJSON.Length - 1);
returnJSON += "]}]";
return returnJSON;
}
You should be able to use LINQ. One method is to use Enumerable.Range to create a collection of dates between the min and max dates, and then perform an outer join (using GroupJoin) against the sparse table. (See MSDN Reference: How to Perform Outer Joins (C# Programming Guide))
For instance, if numdays is the date range (in days), MinDate is the initial date, and SparseData is your sparse data, and SparseData has an instance property Day that specifies the date, then you might do:
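var q = Enumerable.Range(0, numdays)
    .Select(a => MinDate.AddDays(a))
    .GroupJoin(
        SparseData,
        day => day,     // outer key: the generated date
        sd => sd.Day,   // inner key: the date stored on the sparse row
        (key, value) =>
            Tuple.Create(
                key,
                // default value (null for a reference type) when the date has no row
                value.DefaultIfEmpty().First()
            )
    );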
Or, equivalently,
var q2 = from daynumber in Enumerable.Range(0, numdays)
         let day = MinDate.AddDays(daynumber)
         join data in SparseData on day equals data.Day into d2
         from x in d2.DefaultIfEmpty()
         select Tuple.Create(day, x); // x is null when the date has no row
The code I've written follows an almost identical approach to that suggested in @drf's answer: outer joining the aggregated results to the complete set of dates.
However, it's slightly simpler, and I believe it produces the output format you want (also, I've compiled and run it, so it at least does what I expect it to :-)).
I've assumed a collection called events, the members of which have a property timestamp.
Note that I've assumed the timestamps may include times as well as dates. If this isn't the case, you can simplify the code slightly by omitting the .Date calls.
Finally, I've taken the range of dates to be the period you have data for; obviously you can change the startDate and endDate values.
DateTime startDate = events.OrderBy(e=>e.timestamp).First().timestamp.Date;
DateTime endDate = events.OrderBy(e=>e.timestamp).Last().timestamp.Date;
var allDates = Enumerable.Range(0, (endDate - startDate).Days + 1)
.Select(a => startDate.AddDays(a))
.GroupJoin(events, d => d, e => e.timestamp.Date,
(d, e) =>
new{date = d, count = e.Count()});
Not in LINQ2SQL as far as I can figure out, but the standard trick when you write a stored procedure is to generate a list of all dates in the range, filter out those already in the list and take a union of the results.
This should be quite easy to do in LINQ2Objects once you have retrieved the sparse data.
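A minimal LINQ to Objects sketch of that trick, assuming the sparse per-day counts have already been loaded into a dictionary keyed by midnight dates. The DateFiller class and FillMissingDays method are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DateFiller
{
    // Returns one (date, count) pair per day in [start, end], using 0 for
    // days that are absent from the sparse data. Assumes the dictionary
    // keys carry no time-of-day component.
    public static List<KeyValuePair<DateTime, int>> FillMissingDays(
        IDictionary<DateTime, int> sparseCounts, DateTime start, DateTime end)
    {
        // Every date in the requested range.
        var allDates = Enumerable.Range(0, (end.Date - start.Date).Days + 1)
                                 .Select(offset => start.Date.AddDays(offset));

        // Dates with no stored events, each given a count of zero.
        var missing = allDates
            .Where(d => !sparseCounts.ContainsKey(d))
            .Select(d => new KeyValuePair<DateTime, int>(d, 0));

        // Union of the stored rows and the generated rows, in date order.
        return sparseCounts
            .Where(kv => kv.Key >= start.Date && kv.Key <= end.Date)
            .Concat(missing)
            .OrderBy(kv => kv.Key)
            .ToList();
    }
}
```

The result can be handed straight to a charting routine, since every day in the range now has exactly one entry.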
Related
I have data from which I should count rows by weeks and weekdays. As result I should get
Starting day of the week, weekday, count of data for that day
I have tried this code:
var GroupedByDate = from r in dsDta.Tables[0].Rows.Cast<DataRow>()
let eventTime = (DateTime)r["EntryTime"]
group r by new
{
WeekStart = DateTime(eventTime.Year, eventTime.Month, eventTime.AddDays(-(int)eventTime.DayOfWeek).Day),
WeekDay = eventTime.DayOfWeek
}
into g
select new
{
g.Key,
g.WeekStart,
g.WeekDay,
LoadCount = g.Count()
};
However, from DateTime(eventTime.Year, ...)
I get an error "C# non-invocable member datetime cannot be used like a method."
What to do differently?
The immediate error is due to you missing the new part from your constructor call. However, even with that, you'd still have a problem as you're using the month and year of the existing date, even if the start of the week is in a previous month or year.
Fortunately, you can simplify it very easily:
group r by new
{
WeekStart = eventTime.AddDays(-(int)eventTime.DayOfWeek),
WeekDay = eventTime.DayOfWeek
}
Or if eventTime isn't always a date, use eventTime.Date.AddDays(...).
Alternatively, for clarity, you could extract a separate method:
group r by new
{
WeekStart = GetStartOfWeek(eventTime),
WeekDay = eventTime.DayOfWeek
}
...
private static DateTime GetStartOfWeek(DateTime date)
{
// Whatever implementation you want
}
That way you can test the implementation of GetStartOfWeek separately, and also make it more complicated if you need to without it impacting your query.
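One possible implementation of GetStartOfWeek, offered as a sketch rather than the only way to do it. The WeekHelper class and the optional firstDay parameter are assumptions added here; the default of Sunday matches the AddDays(-(int)eventTime.DayOfWeek) arithmetic above:

```csharp
using System;

public static class WeekHelper
{
    // Returns midnight on the first day of the week containing 'date'.
    // 'firstDay' is an assumption of this sketch; pass DayOfWeek.Monday
    // if your weeks start on Monday.
    public static DateTime GetStartOfWeek(DateTime date, DayOfWeek firstDay = DayOfWeek.Sunday)
    {
        // Number of days since the most recent 'firstDay' (0 to 6).
        int offset = ((int)date.DayOfWeek - (int)firstDay + 7) % 7;
        return date.Date.AddDays(-offset);
    }
}
```

Because the subtraction is done on the full date, a week that starts in the previous month or year is handled without any special casing.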
I am new to C# and would like to make a dropdown list containing all Thursday Dates?
I currently have an SQL table with all of these dates, but would rather have them generated in a function and used to populate a dropdown.
Any examples of best approach to this?
Update: I ended up using a calendar (bootstrap-datepicker) and disabled all days except Thursdays. This gave me the current month's Thursdays and solved my issue.
You could do something like this: hard-code the first Thursday you want to display, and then set how many Thursdays you want.
var list = new List<DateTime>();
DateTime firstThursday = new DateTime(2018, 2, 22); // 22 February 2018 was a Thursday
var numberOfThursdayWanted = 1000;
for (int i = 0; i < numberOfThursdayWanted; ++i)
{
list.Add(firstThursday.AddDays(i*7));
}
return list;
You can use LINQ to generate a list of "all":
DateTime firstThursday = DateTime.MinValue.AddDays(Enumerable.Range(0, 7).First(d => DateTime.MinValue.AddDays(d).DayOfWeek == DayOfWeek.Thursday));
int weeks = (int)Math.Ceiling((DateTime.MaxValue - firstThursday).Days / 7.0);
var allThursdays = Enumerable.Range(0, weeks).Select(d => firstThursday.AddDays(d * 7));
Note that allThursdays is just the LINQ query. It doesn't make sense to store all of those DateTimes in a collection. But maybe you want only those within a specific time span, for example:
DateTime start = DateTime.Today.AddYears(-10);
DateTime end = DateTime.Today.AddYears(10);
DateTime[] allThursDaysInLast10YearsUntilNext10Years = allThursdays
.Where(d => d >= start && d <= end)
.ToArray();
First you need to somehow narrow your results. For example by year, or by count as Dimitri suggests.
For that you can check this answer: Create an array or List of all dates between two dates
Once you have your IEnumerable<DateTime>, you just have to use LINQ to filter the Thursdays as follows:
IEnumerable<DateTime> allDateTimes = null; //change this for whatever range you need
IEnumerable<DateTime> onlyThursdays = allDateTimes.Where(d => d.DayOfWeek == DayOfWeek.Thursday);
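As a sketch of how the placeholder range could be filled in, assuming an inclusive start and end date. The ThursdayFinder class and ThursdaysBetween method are made-up names for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ThursdayFinder
{
    // Lists every Thursday in the inclusive range [start, end].
    public static List<DateTime> ThursdaysBetween(DateTime start, DateTime end)
    {
        int dayCount = (end.Date - start.Date).Days + 1;
        if (dayCount <= 0)
        {
            // An end date before the start date yields an empty list.
            return new List<DateTime>();
        }

        return Enumerable.Range(0, dayCount)
            .Select(offset => start.Date.AddDays(offset))
            .Where(d => d.DayOfWeek == DayOfWeek.Thursday)
            .ToList();
    }
}
```

The resulting list could then be bound to the dropdown, formatting each entry with something like d.ToString("yyyy-MM-dd").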
I have a CSV file that I want to filter, something like this.
Example of my CSV File:
Name,LastName,Date
David,tod,09/09/1990
David,lopez,09/09/1994
David,cortez,09/09/1994
Maurice,perez,09/09/1980
Maurice,ruiz,09/09/1996
I want to know how many people were born between date 1 (01/01/1990) and date 2 (01/01/1999), with both dates chosen with a DateTimePicker.
The DataGridView should then show something like this:
Name,Frecuency
David,3
Maurice,1
I don't know how to compare the dates, but I have this code with the LINQ grouping logic:
DataTable dtDataSource = new DataTable();
dtDataSource.Columns.Add("Name");
dtDataSource.Columns.Add("Frecuency");
int[] array = new int[10];
array[0] = 1;
array[1] = 1;
array[2] = 1;
array[3] = 2;
array[4] = 1;
array[5] = 2;
array[6] = 1;
array[7] = 1;
array[8] = 2;
array[9] = 3;
var group = from i in array
group i by i into g
select new
{
g.Key,
Sum = g.Count()
};
foreach (var g in group)
{
dtDataSource.Rows.Add(g.Key,g.Sum);
}
if (dtDataSource != null)
{
dataGridViewReporte.DataSource = dtDataSource;
}
Thanks!
The best, and easiest, way to work with dates in .NET is with the DateTimeOffset structure. This type exposes several methods for parsing dates (which makes converting the date strings from your CSV file easy), and also enables simple comparisons between dates with the standard operators.
See the DateTimeOffset documentation on MSDN.
Note: .NET also has a DateTime structure. I would encourage you to use DateTimeOffset wherever possible, as it helps prevent time zone bugs from creeping in to your code.
Simple Example
As a simple example, this code demonstrates how you can parse a string to a DateTimeOffset in .NET, and then compare it to another date.
// Static property to get the current time, in UTC.
DateTimeOffset now = DateTimeOffset.UtcNow;
string dateString = "09/09/1990";
DateTimeOffset date;
// Use TryParse to defensively parse the date.
if (DateTimeOffset.TryParse(dateString, out date))
{
// The date is valid; we can use a standard operator to compare it.
if (date < now)
{
Console.WriteLine("The parsed date is in the past.");
}
else
{
Console.WriteLine("The parsed date is in the future.");
}
}
Using LINQ
The key element you were missing from your sample code was a Where clause in the LINQ expression. Now that we've seen how to parse dates, it's simply a matter of comparing them to the start and end dates you care about.
.Where(p => p.BirthDate >= startDate && p.BirthDate <= endDate)
Note: I've found that LINQ expressions are really nice to work with when they're strongly typed to some object. I've included a simple Person class in this example, which hopefully clears up the code a lot. This should be fine for most cases, but do keep in mind that LINQ-to-Objects, while incredibly productive, is not always the most efficient solution when you have a lot of data.
The Person class:
class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
public DateTimeOffset BirthDate { get; set; }
}
Example code:
// Representing the CSV file as an array of strings.
var csv = new []
{
"Name,LastName,Date",
"David,tod,09/09/1990",
"David,lopez,09/09/1994",
"David,cortez,09/09/1994",
"Maurice,perez,09/09/1980",
"Maurice,ruiz,09/09/1996"
};
// Parse each line of the CSV file into a Person object, skipping the first line.
// I'm using DateTimeOffset.Parse for simplicity, but production code should
// use the .TryParse method to be defensive.
var people = csv
.Skip(1)
.Select(line =>
{
var parts = line.Split(',');
return new Person
{
FirstName = parts[0],
LastName = parts[1],
BirthDate = DateTimeOffset.Parse(parts[2]),
};
});
// Create start and end dates we can use to compare.
var startDate = new DateTimeOffset(year: 1990, month: 01, day: 01, hour: 0, minute: 0, second: 0, offset: TimeSpan.Zero);
var endDate = new DateTimeOffset(year: 1999, month: 01, day: 01, hour: 0, minute: 0, second: 0, offset: TimeSpan.Zero);
// First, we filter the people by their birth dates.
// Then, we group by their first name and project the counts.
var groups = people
.Where(p => p.BirthDate >= startDate && p.BirthDate <= endDate)
.GroupBy(p => p.FirstName)
.Select(firstNameGroup => new
{
Name = firstNameGroup.Key,
Count = firstNameGroup.Count(),
});
foreach (var group in groups)
{
dtDataSource.Rows.Add(group.Name, group.Count);
}
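The example above uses DateTimeOffset.Parse for brevity. A more defensive variant might look like the sketch below, which assumes the CSV dates are in MM/dd/yyyy form and should be treated as UTC (the sample data does not say which). The BirthDateParser class and ParseBirthDates method are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class BirthDateParser
{
    // Parses "Name,LastName,Date" lines and returns the dates that could be
    // read. Lines with the wrong number of fields or an unparseable date
    // (including the header row) are skipped rather than throwing.
    public static List<DateTimeOffset> ParseBirthDates(IEnumerable<string> lines)
    {
        var result = new List<DateTimeOffset>();
        foreach (var line in lines)
        {
            var parts = line.Split(',');
            DateTimeOffset parsed;
            if (parts.Length == 3 &&
                DateTimeOffset.TryParseExact(
                    parts[2],
                    "MM/dd/yyyy",                   // assumed format
                    CultureInfo.InvariantCulture,   // do not depend on the machine's culture
                    DateTimeStyles.AssumeUniversal, // assumed: no offset in the file means UTC
                    out parsed))
            {
                result.Add(parsed);
            }
        }
        return result;
    }
}
```

Fixing the format and culture keeps the result the same on machines whose regional settings would otherwise read 09/10/1990 as 9 October rather than 10 September.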
LINQ Syntax
As a matter of personal preference, I typically use the LINQ extension methods (.Where, .Select, .GroupBy, etc.) instead of the query syntax. Following the style from your example above, the same query could be written as:
var groups = from p in people
where p.BirthDate >= startDate && p.BirthDate <= endDate
group p by p.FirstName into g
select new
{
Name = g.Key,
Count = g.Count(),
};
I'm trying to group some timestamps with the following LINQ statement:
var ds = (from wl in dbEntities.tbl_weblog
group wl by new
{
wl.tms_stamp.Value.Date,
wl.tms_stamp.Value.TimeOfDay
} into dateGrp
select new
{
Date = dateGrp.Key.Date,
Time = dateGrp.Key.TimeOfDay,
HitCount = dateGrp.Count(),
TotalKB = dateGrp.Sum(m => m.int_bytes).Value / 1024
}
).ToList();
return Helpers.ToDataSet(ds);
But I'm getting the error "The specified type member 'Date' is not supported in LINQ to Entities. Only initializers, entity members, and entity navigation properties are supported.".
Can someone help me to resolve this?
Linq-To-Entities doesn't have a mapping for DateTime.Date to SQL. So, instead you have to break it down into the Year, Month, Day, and Hour to get the results you are looking for.
var ds = (from wl in dbEntities.tbl_weblog
group wl by new
{
wl.tms_stamp.Value.Year,
wl.tms_stamp.Value.Month,
wl.tms_stamp.Value.Day,
wl.tms_stamp.Value.Hour
} into dateGrp
select new
{
Year = dateGrp.Key.Year,
Month = dateGrp.Key.Month,
Day = dateGrp.Key.Day,
Hour = dateGrp.Key.Hour,
HitCount = dateGrp.Count(),
TotalKB = dateGrp.Sum(m => m.int_bytes).Value / 1024
}).ToList();
Then when you consume ds you can put the date parts back together.
foreach(var item in ds)
{
var date = new DateTime(item.Year, item.Month, item.Day);
var hour = item.Hour;
}
I have a Linq query that basically counts how many entries were created on a particular day, which is done by grouping by year, month, day. The problem is that because some days won't have any entries I need to back fill those missing "calendar days" with an entry of 0 count.
My guess is that this can probably be done with a Union or something, or maybe even some simple for loop to process the records after the query.
Here is the query:
from l in context.LoginToken
where l.CreatedOn >= start && l.CreatedOn <= finish
group l by
new{l.CreatedOn.Year, l.CreatedOn.Month, l.CreatedOn.Day} into groups
orderby groups.Key.Year , groups.Key.Month , groups.Key.Day
select new StatsDateWithCount {
Count = groups.Count(),
Year = groups.Key.Year,
Month = groups.Key.Month,
Day = groups.Key.Day
}));
If I have data for 12/1 - 12/4/2009 like (simplified):
12/1/2009 20
12/2/2009 15
12/4/2009 16
I want an entry with 12/3/2009 0 added by code.
I know that in general this should be done in the DB using a denormalized table that you either populate with data or join to a calendar table, but my question is how would I accomplish this in code?
Can it be done in Linq? Should it be done in Linq?
I just did this today. I gathered the complete data from the database and then generated a "sample empty" table. Finally, I did an outer join of the empty table with the real data and used the DefaultIfEmpty() construct to deal with knowing when a row was missing from the database to fill it in with defaults.
Here's my code:
int days = 30;
// Gather the data we have in the database, which will be incomplete for the graph (i.e. missing dates/subsystems).
var dataQuery =
from tr in SourceDataTable
where (DateTime.UtcNow - tr.CreatedTime).Days < days
group tr by new { tr.CreatedTime.Date, tr.SubSystem } into g
orderby g.Key.Date ascending, g.Key.SubSystem ascending
select new MyResults()
{
Date = g.Key.Date,
SubSystem = g.Key.SubSystem,
Count = g.Count()
};
// Generate the list of subsystems we want.
var subsystems = new[] { SubSystem.Foo, SubSystem.Bar }.AsQueryable();
// Generate the list of Dates we want.
var datetimes = new List<DateTime>();
for (int i = 0; i < days; i++)
{
datetimes.Add(DateTime.UtcNow.AddDays(-i).Date);
}
// Generate the empty table, which is the shape of the output we want but without counts.
var emptyTableQuery =
from dt in datetimes
from subsys in subsystems
select new MyResults()
{
Date = dt.Date,
SubSystem = subsys,
Count = 0
};
// Perform an outer join of the empty table with the real data and use the magic DefaultIfEmpty
// to handle the "there's no data from the database case".
var finalQuery =
from e in emptyTableQuery
join realData in dataQuery on
new { e.Date, e.SubSystem } equals
new { realData.Date, realData.SubSystem } into g
from realDataJoin in g.DefaultIfEmpty()
select new MyResults()
{
Date = e.Date,
SubSystem = e.SubSystem,
Count = realDataJoin == null ? 0 : realDataJoin.Count
};
return finalQuery.OrderBy(x => x.Date).AsEnumerable();
I made a helper function that is designed to be used with anonymous types and to be reusable in as generic a way as possible.
Let's say this is your query to get a list of orders for each date.
var orders = db.Orders
.GroupBy(o => o.OrderDate)
.Select(o => new
{
OrderDate = o.Key,
OrderCount = o.Count(),
Sales = o.Sum(i => i.SubTotal)
})
.OrderBy(o => o.OrderDate);
For my function to work please note this list must be ordered by date. If we had a day with no sales there would be a hole in the list.
Now for the function that will fill in the blanks with a default value (instance of anonymous type).
private static IEnumerable<T> FillInEmptyDates<T>(IEnumerable<DateTime> allDates, IEnumerable<T> sourceData, Func<T, DateTime> dateSelector, Func<DateTime, T> defaultItemFactory)
{
    // iterate through the source collection, tracking whether an item is available
    // (Current is undefined once MoveNext returns false, so it is not tested for null)
    using (var iterator = sourceData.GetEnumerator())
    {
        bool hasItem = iterator.MoveNext();
        // for each date in the desired list
        foreach (var desiredDate in allDates)
        {
            // check if the current item exists and is the 'desired' date
            if (hasItem && dateSelector(iterator.Current) == desiredDate)
            {
                // if so then return it and move to the next item
                yield return iterator.Current;
                hasItem = iterator.MoveNext();
                // ensure next item (if any) is not a duplicate
                if (hasItem && dateSelector(iterator.Current) == desiredDate)
                {
                    throw new Exception("More than one item found in source collection with date " + desiredDate);
                }
            }
            else
            {
                // if the current 'desired' item doesn't exist then
                // create a dummy item using the provided factory
                yield return defaultItemFactory(desiredDate);
            }
        }
    }
}
The usage is as follows:
// first you must determine your desired list of dates which must be in order
// determine this however you want
var desiredDates = ....;
// fill in any holes
var ordersByDate = FillInEmptyDates(desiredDates,
// Source list (with holes)
orders,
// How do we get a date from an order
(order) => order.OrderDate,
// How do we create an 'empty' item
(date) => new
{
OrderDate = date,
OrderCount = 0,
Sales = 0
});
Must make sure there are no duplicates in the desired dates list
Both desiredDates and sourceData must be in order
Because the method is generic if you are using an anonymous type then the compiler will automatically tell you if your 'default' item is not the same 'shape' as a regular item.
Right now I include a check for duplicate items in sourceData but there is no such check in desiredDates
If you want to ensure the lists are ordered by date you will need to add extra code
Essentially what I ended up doing here is creating a list of the same type with all the dates in the range and a count of 0, and then taking the union of the results from my original query with this list. The major hurdle was simply creating a custom IEqualityComparer.
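A rough sketch of that union approach, assuming the query results have been materialized into a simple type with a date and a count. DayCount, DayCountDateComparer and DayCountFiller are illustrative names, not the types from the original post:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DayCount
{
    public DateTime Date { get; set; }
    public int Count { get; set; }
}

// Treats two entries as equal when they fall on the same calendar date,
// ignoring the Count, so a zero-count filler matches a real row.
public class DayCountDateComparer : IEqualityComparer<DayCount>
{
    public bool Equals(DayCount x, DayCount y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Date.Date == y.Date.Date;
    }

    public int GetHashCode(DayCount obj)
    {
        return obj.Date.Date.GetHashCode();
    }
}

public static class DayCountFiller
{
    public static List<DayCount> Fill(IEnumerable<DayCount> actual, DateTime start, DateTime finish)
    {
        // One zero-count entry for every date in the range.
        var zeros = Enumerable.Range(0, (finish.Date - start.Date).Days + 1)
            .Select(i => new DayCount { Date = start.Date.AddDays(i), Count = 0 });

        // Union keeps the first occurrence of each date, so the real rows
        // must be the first sequence and the zero fillers the second.
        return actual.Union(zeros, new DayCountDateComparer())
                     .OrderBy(d => d.Date)
                     .ToList();
    }
}
```

The order of the two sequences matters: swapping them would let the zero-count fillers win over the real rows.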
You can generate the list of dates starting from "start" and ending at "finish", and then check the count for each date separately, step by step.
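A plain loop version of that idea could look like the following sketch. It assumes the grouped query results have been copied into a dictionary keyed by midnight dates; DailyCounter and CountPerDay are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class DailyCounter
{
    // Walks from start to finish one day at a time and looks up each date,
    // falling back to 0 when the date has no entry.
    public static List<KeyValuePair<DateTime, int>> CountPerDay(
        IDictionary<DateTime, int> countsByDate, DateTime start, DateTime finish)
    {
        var result = new List<KeyValuePair<DateTime, int>>();
        for (var day = start.Date; day <= finish.Date; day = day.AddDays(1))
        {
            int count;
            if (!countsByDate.TryGetValue(day, out count))
            {
                count = 0; // no entries were created on this day
            }
            result.Add(new KeyValuePair<DateTime, int>(day, count));
        }
        return result;
    }
}
```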