I have a datatable that I have grouped as follows:
var result = from data in view.AsEnumerable()
group data by new {Group = data.Field<string>("group_no")}
into grp
select new
{
Group = grp.Key.Group,
PRAS = grp.Average(c => Convert.ToDouble(c.Field<string>("pAKT Total")))
};
Now, the average function is also counting the empty cells in its calculation. For example, if there are 10 cells and only 5 are populated with values, I want the average to be the sum of the 5 values divided by 5.
How can I ensure that it does what I want?
Thanks.
Maybe something like this:
PRAS = grp.Select(row => row.Field<string>("pAKT Total"))
.Where(s => !String.IsNullOrEmpty(s))
.Select(Convert.ToDouble)
.Average()
To my knowledge, that's not possible with the Average method.
You can however achieve the result you want to, with the following substitute:
PRAS = grp.Sum(c => Convert.ToDouble(c.Field<string>("pAKT Total"))) / grp.Count(c => !c.IsNull("pAKT Total"))
This only makes sense when you want to select the "empty" rows in the group but don't want to include them in your average. Note that it assumes the empty cells are null (DBNull); Convert.ToDouble returns 0 for a null string but throws on an empty string. If you don't need the "empty" rows at all, don't select them in the first place, i.e. add a where clause that excludes them.
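To make the filtering approach concrete, here is a minimal sketch written as top-level statements (C# 9 or later). The table contents are made up for illustration, and the use of double.Parse with the invariant culture and the NaN fallback for all-empty groups are assumptions, not part of the original question.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.Linq;

// Hypothetical sample data: some cells are DBNull, some are empty strings.
var table = new DataTable();
table.Columns.Add("group_no", typeof(string));
table.Columns.Add("pAKT Total", typeof(string));
table.Rows.Add("A", "2.0");
table.Rows.Add("A", DBNull.Value);
table.Rows.Add("A", "4.0");
table.Rows.Add("B", "");
table.Rows.Add("B", "10");

var averages = table.AsEnumerable()
    .GroupBy(row => row.Field<string>("group_no"))
    .Select(grp => new
    {
        Group = grp.Key,
        // Drop null/empty cells before averaging, so they affect
        // neither the sum nor the count.
        Average = grp.Select(row => row.Field<string>("pAKT Total"))
                     .Where(s => !string.IsNullOrWhiteSpace(s))
                     .Select(s => double.Parse(s, CultureInfo.InvariantCulture))
                     .DefaultIfEmpty(double.NaN) // assumption: NaN when a group has no values
                     .Average()
    })
    .ToList();

foreach (var a in averages)
    Console.WriteLine($"{a.Group}: {a.Average}"); // prints "A: 3" then "B: 10"
```

Without DefaultIfEmpty, Average() would throw InvalidOperationException for a group whose cells are all empty.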
I have a list of data retrieved from SQL and stored in a class. I want to now aggregate the data using LINQ in C# rather than querying the database again on a different dataset.
Example data I have is above.
Date, Period, Price, Vol and I am trying to create a histogram using this data. I tried to use Linq code below but seem to be getting a 0 sum.
Period needs to be a where clause based on a variable
Volume needs to be aggregated for the price ranges
Price needs to be a bucket and grouped on this column
I don't want a range, just a number for each bucket.
Example output I want is (not real data just as example):
Bucket SumVol
18000 50
18100 30
18200 20
I attempted the following LINQ query, but my Sum seems to be empty. I still need to add my where clause, but for some reason the data is not aggregating.
var ranges = new[] { 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, 20000 };
var priceGroups = eod.GroupBy(x => ranges.FirstOrDefault(r => r > x.price))
.Select(g => new { Price = g.Key, Sum = g.Sum(s => s.vol)})
.ToList();
var grouped = ranges.Select(r => new
{
Price = r,
Sum = priceGroups.Where(g => g.Price > r || g.Price == 0).Sum(g => g.Sum)
});
First things first... There seems to be nothing wrong with your priceGroups list. I've run that on my end and, as far as I can understand your purpose, it seems to be grabbing the expected values from your dataset.
var ranges = new[] { 10000, 11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, 20000 };
var priceGroups = eod.GroupBy(x => ranges.FirstOrDefault(r => r > x.price))
.Select(g => new { Price = g.Key, Sum = g.Sum(s => s.vol) })
.ToList();
Now, I assume your intent with the grouped list was to obtain yet another anonymous type list, much like you did with your priceGroups list, which is also an anonymous type list... List<'a> in C#.
var grouped = ranges.Select(r => new
{
Price = r,
Sum = priceGroups.Where(g => g.Price > r || g.Price == 0).Sum(g => g.Sum)
});
For starters, you are missing the ToList() method call at the end of it. However, that's not the main issue here, as you could still work with an IEnumerable<'a> just as well for most purposes.
As I see it, the core problem is in how you assign the anonymous property Sum. Why are you filtering for g.Price > r || g.Price == 0?
With the data I tested, there is no element with Price equal to zero in your priceGroups list. Its keys are a subset of ranges, and zero only shows up as a key if some price is at or above the largest range (FirstOrDefault then returns its default, 0). You are then comparing every value in ranges against that subset in priceGroups, and consolidating the Sums of every element in priceGroups that has a Price higher than the range being evaluated. In other words, the property Sum in your grouped list is a sum of sums.
Keep in mind that priceGroups is already an aggregated list. It seems to me you are trying to aggregate it again when you call the Sum() method after a Where() clause like you are doing. That doesn't make much sense.
What you want (I believe) is for the Sum property in the grouped list to be the same as the Sum property in the priceGroups list whenever the range being evaluated matches the Price being evaluated. Furthermore, where there are no matches, you want the Sum in your grouped list to be zero, as that means the range being evaluated was not in the original dataset. You can achieve that with the following instead:
Sum = priceGroups.FirstOrDefault(g => g.Price == r)?.Sum ?? 0
You said your Sum was "empty" in your post, but that's not the behavior I saw on my end. Try the above and, if still not behaving as you would expect, share a small dataset for which you know the expected output with me and I can try to help you further.
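Putting the pieces together, the whole pipeline could look roughly like the sketch below (top-level statements, C# 9 or later). The sample rows, the shortened ranges array and the selectedPeriod variable are hypothetical stand-ins; only eod, price, vol, ranges, priceGroups and grouped come from the question.

```csharp
using System;
using System.Linq;

// Hypothetical rows standing in for the data loaded from SQL.
var eod = new[]
{
    (period: 1, price: 17950, vol: 20),
    (period: 1, price: 17990, vol: 30),
    (period: 1, price: 18050, vol: 30),
    (period: 2, price: 18050, vol: 99),
};
var ranges = new[] { 18000, 18100, 18200 };
int selectedPeriod = 1; // hypothetical filter variable

// Aggregate volume per bucket; the key is the first range above the price.
var priceGroups = eod
    .Where(x => x.period == selectedPeriod)
    .GroupBy(x => ranges.FirstOrDefault(r => r > x.price))
    .Select(g => new { Price = g.Key, Sum = g.Sum(s => s.vol) })
    .ToList();

// One row per range; ranges with no matching group get a Sum of zero.
var grouped = ranges
    .Select(r => new
    {
        Price = r,
        Sum = priceGroups.FirstOrDefault(g => g.Price == r)?.Sum ?? 0
    })
    .ToList();

foreach (var g in grouped)
    Console.WriteLine($"{g.Price} {g.Sum}"); // 18000 50, 18100 30, 18200 0
```

Note that each bucket here is labelled by its upper bound, because the key is the first range strictly greater than the price.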
Using LINQ instead of querying the DB again is a good choice, mainly because you avoid the cost of another call to your DB. And if your DB is not updated frequently (that is, the data does not change very quickly), you can use the retrieved data to calculate everything with LINQ.
I think what I need is relatively simple but every example I Google just returns results using First(), which I'm already doing. Here is my expression:
var options = configData.AsEnumerable().GroupBy(row => row["myColumn"]).Select(grp => grp.First());
What I need is only ONE column from the grp portion, and to be able to append .ToList() without an error. As it stands I receive 4 columns but only need a specific one. I would like something like (grp => grp["myColumn"]), but that results in Error 153: Cannot apply indexing with [] to an expression of type 'System.Linq.IGrouping<object,System.Data.DataRow>'.
Also, Key does not work in the grouping portion, as these results are from a DataTable object.
If you want only the keys, you can use
var options = configData.AsEnumerable().Select(row=>row["myColumn"]).Distinct();
I think that this is what you want:
configData.AsEnumerable()
.GroupBy(r => r["myColumn"])
.Select(g => new
{
myColumnValue = g.Key,
myColumnItems = g.Select(r => r["OtherColumn"]).ToList()
});
Do you understand how/what this does, though? Try it out and inspect the resulting IEnumerable. I'm not sure you have a perfect understanding of how GroupBy works, but take your time with the above example.
See this part:
new
{
myColumnValue = g.Key,
myColumnItems = g.Select(r => r["OtherColumn"]).ToList()
}
This creates an anonymous type which outputs the values of "OtherColumn" column into a list grouped by "myColumn" where value of "myColumn" is in the myColumnValue property.
I'm not sure this answers your question but it looks like this is what you want.
The variable g is of the type IGrouping<object, DataRow>; it is not a DataRow. The IGrouping interface is designed to provide a list of DataRows grouped by object values. It does not produce a flat list; if it did, it would just be a sort, not a GroupBy.
Just specify the field you want after your call to First() e.g.
.Select(grp => grp.FirstOrDefault()["MyFieldName"]);
This will take the first record from the grouping and select the specified field from that record.
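As a rough illustration of the options discussed in this thread, the sketch below builds a small DataTable and extracts a single column per group. The table contents are invented; the column names myColumn and OtherColumn are the placeholders used above. It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical stand-in for configData.
var configData = new DataTable();
configData.Columns.Add("myColumn", typeof(string));
configData.Columns.Add("OtherColumn", typeof(string));
configData.Rows.Add("a", "x");
configData.Rows.Add("a", "y");
configData.Rows.Add("b", "z");

// Option 1: only the distinct grouping values, as a flat list.
var options = configData.AsEnumerable()
    .Select(row => row.Field<string>("myColumn"))
    .Distinct()
    .ToList();

// Option 2: one field taken from the first row of each group.
var firstOther = configData.AsEnumerable()
    .GroupBy(row => row.Field<string>("myColumn"))
    .Select(grp => grp.First().Field<string>("OtherColumn"))
    .ToList();

Console.WriteLine(string.Join(", ", options));    // a, b
Console.WriteLine(string.Join(", ", firstOther)); // x, z
```

Using Field<string> instead of the indexer yields a List<string> rather than a List<object>, which is usually easier to bind or compare.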
This code is a working example of the contents of a select case statement. It runs in response to a link button click whose ID is passed in via a Session variable. The link buttons represent the High (3), Medium (2) and Low (1) risk categories.
The logic is: if you select Medium (2), its related rows (RiskCategory = 2) are displayed first, then the remaining rows are listed by risk category descending (so 3, then 1, then 0).
As I said, my ugly implementation using the LINQ Concat function does produce correct results, but it also showcases my need to spend more weekends reviewing and creating more intricate samples than the simple 101 LINQ Samples tutorial project provides.
There must be a more elegant way to group and order while allowing the selected group to come first, with the remaining groups descending. Again, if I select group 1 (LowRiskCategory), I have to display LowRiskCategory (1) first, then 3, 2 and 0 respectively in the sorted result set.
var midQuery = enumerableVersionTable.Where(x => x["RiskCategory"].Equals(intRiskCategoryEnum));
midQuery.OrderByDescending(v => v["DateOfService"]);
midQuery.OrderBy(v => v["Reviewed"]);
var midQueryZero = enumerableVersionTable.Where(x => x["RiskCategory"].Equals(0));
midQueryZero.OrderByDescending(v => v["DateOfService"]);
midQueryZero.OrderBy(v => v["Reviewed"]);
var midQueryOne = enumerableVersionTable.Where(x => x["RiskCategory"].Equals(1));
midQueryOne.OrderByDescending(v => v["DateOfService"]);
midQueryOne.OrderBy(v => v["Reviewed"]);
var midQueryThree = enumerableVersionTable.Where(x => x["RiskCategory"].Equals(3));
midQueryThree.OrderByDescending(v => v["DateOfService"]);
midQueryThree.OrderBy(v => v["Reviewed"]);
var querySummation = midQuery.Concat(midQueryThree);
querySummation = querySummation.Concat(midQueryOne);
querySummation = querySummation.Concat(midQueryZero);
dtQueryResults = querySummation.CopyToDataTable();
Just the sight of those hardcoded numeric values after the translated enum value for case 2 makes me want to hurl. There has got to be a more elegant way to do the groups, order by a specific group, and of course apply all my other odd sorting (as you can see, date of service and reviewed).
Lastly, if you are going to vote this down AGAIN, at least explain why, please. Thank you.
var dtQueryResults = yourData
.OrderByDescending(v => v["RiskCategory"].Equals(intRiskCategoryEnum)) // true sorts after false, so descending puts the selected category first
.ThenByDescending(v => v["RiskCategory"]) // the remaining categories follow in descending order
.ThenBy(v => v["Reviewed"]) // inside each group, the rest of your sorts are applied
.ThenByDescending(v => v["DateOfService"]);
Just change the lambda you use for the OrderBy. You are not limited to picking just one field. I'd use a ternary expression to select what to sort on, based on what is selected.
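A minimal sketch of the ternary-key idea, using in-memory tuples instead of DataRows so that it runs on its own (top-level statements, C# 9 or later). The sample rows and the variable name selected are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical rows; in the question these would be DataRows.
var rows = new[]
{
    (RiskCategory: 0, Reviewed: false, DateOfService: new DateTime(2020, 1, 5)),
    (RiskCategory: 3, Reviewed: true,  DateOfService: new DateTime(2020, 1, 1)),
    (RiskCategory: 2, Reviewed: true,  DateOfService: new DateTime(2020, 1, 2)),
    (RiskCategory: 1, Reviewed: false, DateOfService: new DateTime(2020, 1, 3)),
    (RiskCategory: 2, Reviewed: false, DateOfService: new DateTime(2020, 1, 4)),
};
int selected = 2; // the category chosen via the link button

var ordered = rows
    .OrderBy(r => r.RiskCategory == selected ? 0 : 1) // selected category first
    .ThenByDescending(r => r.RiskCategory)            // remaining categories descending
    .ThenBy(r => r.Reviewed)                          // false before true
    .ThenByDescending(r => r.DateOfService)
    .ToList();

Console.WriteLine(string.Join(", ", ordered.Select(r => r.RiskCategory))); // 2, 2, 3, 1, 0
```

Because OrderBy and ThenBy return a new sequence rather than sorting in place, the result has to be assigned (here to ordered) for the ordering to take effect.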
Say you have columns AppleType, CreationDate and want to order each group of AppleType by CreationDate. Furthermore, you want to create a new column which explicitly ranks the order of the CreationDate per AppleType.
So, the resulting DataSet would have three columns, AppleType, CreationDate, OrderIntroduced.
Is there a LINQ way of doing this? Would I have to actually go through the data programmatically (but not via LINQ), create an array, convert that to a column and add it to the DataSet? I hope there is a LINQ way of doing this. Please use LINQ non-method syntax if possible.
So are the values actually appearing in the right order? If so, it's easy - but you do need to use method syntax, as the query expression syntax doesn't support the relevant overload:
var queryWithIndex = queryWithoutIndex.Select((x, index) => new
{
x.AppleType,
x.CreationDate,
OrderIntroduced = index + 1,
});
(That's assuming you want OrderIntroduced starting at 1.)
I don't know offhand how you'd then put that back into a DataSet - but do you really need it in a DataSet as opposed to in the strongly-typed sequence?
EDIT: Okay, the requirements are still unclear, but I think you want something like:
var query = dataSource.GroupBy(x => x.AppleType)
.SelectMany(g => g.OrderBy(x => x.CreationDate)
.Select((x, index ) => new {
x.AppleType,
x.CreationDate,
OrderIntroduced = index + 1 }));
Note: The GroupBy and SelectMany calls here can be put in query expression syntax, but I believe it would make it more messy in this case. It's worth being comfortable with both forms.
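For illustration, here is a self-contained sketch of the grouped ranking with made-up sample data, followed by one possible way to copy the result into a DataTable with a plain loop. The sample values and the loop-based copy are assumptions, not something stated in the question (top-level statements, C# 9 or later).

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical source rows.
var apples = new[]
{
    (AppleType: "Fuji", CreationDate: new DateTime(2021, 3, 1)),
    (AppleType: "Gala", CreationDate: new DateTime(2021, 1, 1)),
    (AppleType: "Fuji", CreationDate: new DateTime(2021, 1, 15)),
    (AppleType: "Gala", CreationDate: new DateTime(2021, 2, 1)),
};

// Rank each row within its AppleType by CreationDate, starting at 1.
var ranked = apples
    .GroupBy(x => x.AppleType)
    .SelectMany(g => g.OrderBy(x => x.CreationDate)
                      .Select((x, index) => new
                      {
                          x.AppleType,
                          x.CreationDate,
                          OrderIntroduced = index + 1
                      }))
    .ToList();

// Copy the ranked rows into a new three-column DataTable.
var table = new DataTable();
table.Columns.Add("AppleType", typeof(string));
table.Columns.Add("CreationDate", typeof(DateTime));
table.Columns.Add("OrderIntroduced", typeof(int));
foreach (var r in ranked)
    table.Rows.Add(r.AppleType, r.CreationDate, r.OrderIntroduced);

foreach (DataRow row in table.Rows)
    Console.WriteLine($"{row["AppleType"]} {row["CreationDate"]:yyyy-MM-dd} {row["OrderIntroduced"]}");
```

Groups come out in the order their keys first appear in the source (Fuji, then Gala here), each ordered by date internally.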
If you want a pure Linq to Entities/SQL solution you can do something like this:
Modified to handle duplicate CreationDate values (rows with the same date receive the same rank):
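var query = from a in context.AppleGroup
orderby a.AppleType, a.CreationDate
select new
{
AppleType = a.AppleType,
CreationDate = a.CreationDate,
// Rank within the same AppleType: count the earlier rows of that type, plus one.
OrderIntroduced = (from b in context.AppleGroup
where b.AppleType == a.AppleType && b.CreationDate < a.CreationDate
select b).Count() + 1
};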
I need to add a literal value to a query. My attempt:
var aa = new List<long>();
aa.Add(0);
var a = Products.Select(p => p.sku).Distinct().Union(aa);
a.ToList().Dump(); // LinqPad's way of showing the values
In the above example, I get an error:
"Local sequence cannot be used in LINQ to SQL implementation
of query operators except the Contains() operator."
If I am using Entity Framework 4 for example, what could I add to the Union statement to always include the "seed" ID?
I am trying to produce SQL code like the following:
select distinct ID
from product
union
select 0 as ID
So later I can join the list to itself so I can find all values where the next highest value is not present (finding the lowest available ID in the set).
Edit: Original Linq Query to find lowest available ID
var skuQuery = Context.Products
.Where(p => p.sku > skuSeedStart &&
p.sku < skuSeedEnd)
.Select(p => p.sku).Distinct();
var lowestSkuAvailableList =
(from p1 in skuQuery
from p2 in skuQuery.Where(a => a == p1 + 1).DefaultIfEmpty()
where p2 == 0 // zero is default for long where it would be null
select p1).ToList();
var Answer = (lowestSkuAvailableList.Count == 0
? skuSeedStart :
lowestSkuAvailableList.Min()) + 1;
This code creates two SKU sets offset by one, then selects the SKU where the next highest doesn't exist. Afterward, it selects the minimum of that (lowest SKU where next highest is available).
For this to work, the seed must be in the set joined together.
Your problem is that your query is being turned entirely into a LINQ-to-SQL query, when what you need is a LINQ-to-SQL query with local manipulation on top of it.
The solution is to tell the compiler that you want to use LINQ-to-Objects after processing the query (in other words, change the extension method resolution to look at IEnumerable<T>, not IQueryable<T>). The easiest way to do this is to tack AsEnumerable() onto the end of your query, like so:
var aa = new List<long>();
aa.Add(0);
var a = Products.Select(p => p.sku).Distinct().AsEnumerable().Union(aa);
a.ToList().Dump(); // LinqPad's way of showing the values
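Once the SKUs are local, the rest of the "lowest available ID" search could also be done in memory. The following is only a sketch under that assumption: the skusFromDb array stands in for the result of the database query, and the HashSet-based gap search is a substitute for the self-join in the question, not the asker's method (top-level statements, C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Products.Select(p => p.sku).Distinct().AsEnumerable().
IEnumerable<long> skusFromDb = new long[] { 1, 2, 3, 5, 6 };
long seed = 0; // the literal "seed" ID from the question

// Union runs in LINQ to Objects here, so a local sequence is allowed.
var withSeed = skusFromDb.Union(new[] { seed }).ToList();

// A SKU is a candidate if its successor is missing from the set;
// the lowest such SKU plus one is the lowest available ID.
var known = new HashSet<long>(withSeed);
long lowestAvailable = withSeed.Where(s => !known.Contains(s + 1)).Min() + 1;

Console.WriteLine(lowestAvailable); // 4
```

Because the seed is always part of the set, Min() never sees an empty sequence: with no SKUs at all, the result is seed + 1.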
Up front: not answering exactly the question you asked, but solving your problem in a different way.
How about this:
var a = Products.Select(p => p.sku).Distinct().ToList();
a.Add(0);
a.Dump(); // LinqPad's way of showing the values
You could create a database table for storing constant values and pass a query on this table to the Union operator.
For example, let's imagine table "Defaults" with fields "Name" and "Value" with only one record ("SKU", 0).
Then you can rewrite your expression like this:
var zero = context.Defaults.Where(_=>_.Name == "SKU").Select(_=>_.Value);
var result = context.Products.Select(p => p.sku).Distinct().Union(zero).ToList();