compare two successive rows - group by - c#

I have a model as below:
ID  Date                 BoitierNumber
1   07/04/2012 14:01:46  1
2   07/04/2012 14:01:50  2
3   07/04/2012 14:01:50  3
4   07/04/2012 14:01:56  1
5   07/04/2012 14:02:06  1
6   07/04/2012 14:02:10  2
I have grouped the rows by BoitierNumber:
*boitier Number 1
1 07/04/2012 14:01:46
4 07/04/2012 14:01:56
5 07/04/2012 14:02:06
*boitier Number 2
2 07/04/2012 14:01:50
6 07/04/2012 14:02:10
*boitier Number 3
3 07/04/2012 14:01:50
To do this I used the following code:
var groups = context.Essais.GroupBy(p => p.BoitierNumber)
    .Select(g => new { GroupName = g.Key, Members = g });
foreach (var g in groups)
{
    Console.WriteLine("Members of {0}", g.GroupName);
    foreach (var member in g.Members.OrderBy(x => x.Id))
    {
        Console.WriteLine("{0} {1}", member.Id, member.Date);
    }
}
So far everything works fine.
Now I want to compare the dates of two successive rows within each group:
if row[i].date > row[i-1].date, I will delete row[i-1].
For example:
*boitier Number 1
1 07/04/2012 14:01:46
4 07/04/2012 14:01:56
5 07/04/2012 14:02:06
8 07/04/2012 14:01:00
10 07/04/2012 14:00:00
13 07/04/2012 14:03:00
For boitier Number 1:
---> Date of row ID 4 > Date of row ID 1
then I will delete row ID 1
---> Date of row ID 5 > Date of row ID 4
then I will delete row ID 4
---> Date of row ID 8 < Date of row ID 5
then I will skip it
---> Date of row ID 10 < Date of row ID 8
then I will skip it
---> Date of row ID 13 > Date of row ID 10
then I will delete row ID 10
...
Therefore, after this process, only rows 13 and 8 will remain.

The quickest change you can make now is:
var groups = context.Essais.GroupBy(p => p.BoitierNumber)
    .Select(g => new
    {
        GroupName = g.Key,
        Members = g.OrderBy(m => m.Id)
    });
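EDIT
It seems EF does not support calling ToList() inside the projection, so call it while iterating instead.
Members is now ordered by Id, so you can use a for loop instead of a foreach:
foreach (var g in groups)
{
    var members = g.Members.ToList();
    for (int i = 1; i < members.Count; i++)
    {
        var previousMember = members[i - 1];
        var currentMember = members[i];
        // Condition taken from the question: a later date on the current row
        // means the previous row should go.
        if (currentMember.Date > previousMember.Date)
        {
            // code to delete previousMember
        }
    }
}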
Just a note: because the initial query projects the grouped Members, it will issue additional queries to select the members for each group key. You still load the entire table, just across subsequent queries. Alternatively, you can do the grouping in memory:
var groups = context.Essais
    .AsEnumerable()
    .GroupBy(p => p.BoitierNumber)
    .Select(g => new { GroupName = g.Key, Members = g.OrderBy(m => m.Id).ToList() });
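To make the comparison loop concrete, here is a minimal in-memory sketch. The Essai class, the SuccessiveRows helper and the Sample data are hypothetical stand-ins for the EF entities (the day and month of the dates are assumed; only the time of day matters). It applies the stated rule literally and only collects the ids to delete, since the actual delete call depends on the EF version in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the Essais entity.
public class Essai
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public int BoitierNumber { get; set; }
}

public static class SuccessiveRows
{
    // Ids of rows whose successor (by Id, within the same BoitierNumber)
    // has a strictly later Date.
    public static List<int> IdsToDelete(IEnumerable<Essai> rows)
    {
        var ids = new List<int>();
        foreach (var g in rows.GroupBy(r => r.BoitierNumber))
        {
            var members = g.OrderBy(r => r.Id).ToList();
            for (int i = 1; i < members.Count; i++)
            {
                if (members[i].Date > members[i - 1].Date)
                    ids.Add(members[i - 1].Id);
            }
        }
        return ids;
    }

    // The six-row "boitier Number 1" example from the question.
    public static List<Essai> Sample()
    {
        Func<int, int, int, int, Essai> row = (id, h, m, s) => new Essai
        {
            Id = id,
            Date = new DateTime(2012, 4, 7, h, m, s), // assumed day/month order
            BoitierNumber = 1
        };
        return new List<Essai>
        {
            row(1, 14, 1, 46), row(4, 14, 1, 56), row(5, 14, 2, 6),
            row(8, 14, 1, 0), row(10, 14, 0, 0), row(13, 14, 3, 0)
        };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", IdsToDelete(Sample())));
        // prints: 1, 4, 10
    }
}
```

Collecting the ids first and deleting afterwards avoids modifying the data while it is still being enumerated.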

Related

Sum duplicated objects in an array

I have an unsorted array of objects with CustomerId, ProductId and Count (all ints).
I want to combine records where CustomerId and ProductId match, summing the count.
For example:
CId PId Cnt
1 100 5
1 100 1
2 100 7
Desired output:
CId PId Cnt
1 100 6
2 100 7
As you can see, the two records for CId 1 and PId 100 have been merged and their counts summed.
Can this be done with LINQ?
I know it could be done with loops, but I'm hoping for a more elegant way.
Here I have assumed that the class name is Item:
var result = array.GroupBy(x => new { x.CId, x.PId })
    .Select(g => new Item { CId = g.Key.CId, PId = g.Key.PId, Cnt = g.Sum(x => x.Cnt) });
Here is a Live Demo
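As a self-contained illustration of the grouping above, the following sketch runs the same query on the three sample records from the question. The MergeDemo class and its Sample and Merge helpers are hypothetical names added for the example; Item follows the class name assumed in the answer.

```csharp
using System;
using System.Linq;

public class Item
{
    public int CId { get; set; }
    public int PId { get; set; }
    public int Cnt { get; set; }
}

public static class MergeDemo
{
    // Groups on the composite key (CId, PId) and sums Cnt within each group.
    public static Item[] Merge(Item[] items)
    {
        return items
            .GroupBy(x => new { x.CId, x.PId })
            .Select(g => new Item { CId = g.Key.CId, PId = g.Key.PId, Cnt = g.Sum(x => x.Cnt) })
            .ToArray();
    }

    // The sample data from the question.
    public static Item[] Sample()
    {
        return new[]
        {
            new Item { CId = 1, PId = 100, Cnt = 5 },
            new Item { CId = 1, PId = 100, Cnt = 1 },
            new Item { CId = 2, PId = 100, Cnt = 7 }
        };
    }

    public static void Main()
    {
        foreach (var item in Merge(Sample()))
            Console.WriteLine(item.CId + " " + item.PId + " " + item.Cnt);
        // prints:
        // 1 100 6
        // 2 100 7
    }
}
```

The anonymous type used as the key compares by value, which is what lets two records with the same CId and PId land in the same group.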

Grouping records that haven't groups values

Please consider these records:
Id Week Value
-----------------------------
1 1 1000
2 1 1200
3 2 800
4 3 1800
5 3 1100
6 3 1000
I want to group the records over 4 weeks, but there are no records for week 4. For example:
Week Count
---------------------
1 2
2 1
3 3
4 0
How can I do this with LINQ?
Thanks
First you need a list of weeks; then this query might help:
var weeks = new List<int> { 1, 2, 3, 4 };
var q = from w in weeks
        join rw in (
            from r in table
            group r by r.Week into g
            select new { week = g.Key, count = g.Count() }) on w equals rw.week into p
        from x2 in p.DefaultIfEmpty()
        select new { w, count = (x2 != null ? x2.count : 0) };
online result in .net fiddle
You can try:
var result = Enumerable.Range(1, 4)
    .GroupJoin(table,
        week => week,
        record => record.Week,
        (week, records) => new { Week = week, Count = records.Count() });
As suggested by jessehouwing, the Enumerable.Range will return the possible week numbers to be used as left outer keys within the join.
After the inner sequence (table), GroupJoin then accepts as parameters:
A lambda/delegate/method that returns the left outer key.
A lambda/delegate/method that extracts the right key from your table.
A lambda/delegate/method that builds an item of the result.
Regards,
Daniele.
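For reference, here is a minimal runnable sketch of the GroupJoin approach using the six records from the question. The Record class and the WeekCounts helper are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Record
{
    public int Id { get; set; }
    public int Week { get; set; }
    public int Value { get; set; }
}

public static class WeekCounts
{
    // Left outer join of the week numbers 1..weeks against the records:
    // weeks without any record still appear, with a count of 0.
    public static List<KeyValuePair<int, int>> CountPerWeek(IEnumerable<Record> table, int weeks)
    {
        return Enumerable.Range(1, weeks)
            .GroupJoin(table,
                week => week,
                record => record.Week,
                (week, records) => new KeyValuePair<int, int>(week, records.Count()))
            .ToList();
    }

    // The sample data from the question.
    public static List<Record> Sample()
    {
        return new List<Record>
        {
            new Record { Id = 1, Week = 1, Value = 1000 },
            new Record { Id = 2, Week = 1, Value = 1200 },
            new Record { Id = 3, Week = 2, Value = 800 },
            new Record { Id = 4, Week = 3, Value = 1800 },
            new Record { Id = 5, Week = 3, Value = 1100 },
            new Record { Id = 6, Week = 3, Value = 1000 }
        };
    }

    public static void Main()
    {
        foreach (var pair in CountPerWeek(Sample(), 4))
            Console.WriteLine(pair.Key + " " + pair.Value);
        // prints:
        // 1 2
        // 2 1
        // 3 3
        // 4 0
    }
}
```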

Group by with return everything from one group but only the first from others

First things first - I am connecting to a SqlServerCe database using C# in Visual Studio 2012.
I am using Entity Framework 6 and LINQ to perform this function.
So - On to the question.
I have a table as follows
ItemGroup
(
ID INT PRIMARY KEY,
ItemID INT,
GroupID INT
)
There are two tables that link to this (Items and Groups) via their IDs and the two foreign key columns in the Item Group table.
Each item can be part of one group, and each group can have many items.
If an item has a group ID of 0, it is considered not to be part of a group.
The result of this is that there are about 3000 groups, each with 2 to ~30 items, but there is one group that has about 4000 items.
My problem is that I have a list of items, and I want to return only one from each group unless the item is part of group 0 (ie no group). In the case of group 0 I want to return all items that match.
for example:
**Group 0**
*Item 1,
Item 2,
Item 3,*
**Group 1**
*Item 4,
Item 5*
**Group 2**
*Item 6,
Item 7,
Item 8*
**Group 3**
*Item 9*
I have list of the following items:
*Item1, Item2, Item4, Item5, Item6, Item7*
In this case I want to output all the items from my list that are in group 0 so:
*Item1, Item2*
Item 4 is part of group 1 so we want to display that, but as item 5 is part of the same group we do not want that, so the remainder of my list would be displayed as follows:
*Item4, Item6*
Giving a full list of:
*Item1, Item2, Item4, Item6*
I have tried several approaches, mainly through the use of a Union whereby I get all those records that are part of group 0 first, then do a group by first on the other records then union them together to get the final results.
However this seems tremendously inefficient and takes an age to perform - not to mention the Linq statement is very difficult to follow.
Can someone point me in a direction that I might be able to follow in order to perform this function?
You want to use SelectMany(), conditionally returning all or just one of the grouped sequences depending on the group ID:
var result = (from item in data
              group item by item.Group)
    .SelectMany(group => group.Key == 0 ? group : group.Take(1));
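The following sketch runs that idea in memory (LINQ to Objects) on the list from the question. GroupedItem and OnePerGroup are hypothetical names, and whether the same expression translates to SQL under Entity Framework 6 with SQL Server CE is not verified here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row of the ItemGroup table.
public class GroupedItem
{
    public int ItemID { get; set; }
    public int GroupID { get; set; }
}

public static class OnePerGroup
{
    // Keeps every item of group 0 and only the first item of any other group.
    public static List<int> Pick(IEnumerable<GroupedItem> items)
    {
        return items
            .GroupBy(i => i.GroupID)
            .SelectMany(g => g.Key == 0 ? g : g.Take(1))
            .Select(i => i.ItemID)
            .ToList();
    }

    // Items 1, 2, 4, 5, 6, 7 with the group memberships from the question.
    public static List<GroupedItem> Sample()
    {
        return new List<GroupedItem>
        {
            new GroupedItem { ItemID = 1, GroupID = 0 },
            new GroupedItem { ItemID = 2, GroupID = 0 },
            new GroupedItem { ItemID = 4, GroupID = 1 },
            new GroupedItem { ItemID = 5, GroupID = 1 },
            new GroupedItem { ItemID = 6, GroupID = 2 },
            new GroupedItem { ItemID = 7, GroupID = 2 }
        };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Pick(Sample())));
        // prints: 1, 2, 4, 6
    }
}
```

In LINQ to Objects the groups come out in the order their keys first appear in the source, so "first item of a group" here means the first one in the input list.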
This code gives you the results for the non-zero groups. You can handle group 0 in a similar way. I hope this helps.
var query1 = from t in context.Table1
             where t.GroupID != 0
             group t by t.GroupID into g
             select new
             {
                 ID = g.Key,
                 Groups = g.Take(1)
             };
Console.WriteLine("items with non 0 group");
foreach (var item in query1)
{
    foreach (var g in item.Groups)
    {
        Console.WriteLine(" ID " + g.ID + " " + "Group ID " + g.GroupID + " " + " Item ID " + g.ItemID);
    }
}
Input data
ID ItemID GroupID
1 1 0
2 2 0
3 3 0
4 4 1
5 5 1
6 6 2
7 7 2
8 8 2
Output generated
items with non 0 group
ID 4 Group ID 1 Item ID 4
ID 6 Group ID 2 Item ID 6

Using Linq to get the last N number of rows that have duplicated values in a field

Given a database table, a column name C, and a number N larger than 1, how can I get a group of rows with equal values of column C which has at least N rows? If there exists more than one such group, I need to get the group which contains the newest entry (the one with the largest Id).
Is it possible to do this using LINQ to Entities?
Example:
> Id | Mycolumn
> - - - - - - -
> 1 | name55555
> 2 | name22
> 3 | name22
> 4 | name22
> 5 | name55555
> 6 | name55555
> 7 | name1
Primary Key: ID
OrderBy: ID
Repeated column: Mycolumn
If N = 3 and C = Mycolumn, then we need to get rows which have the column MyColumn duplicated at least 3 times.
For the example above, it should return rows 1, 5 and 6, because last index of name55555 is 6, and last index of name22 (which is also repeated 3 times) is 4.
data.Mytable
    .OrderByDescending(m => m.Id)
    .GroupBy(m => m.Mycolumn)
    .FirstOrDefault(group => group.Count() >= N)
    .Take(N)
    .Select(m => m.Id)
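Here is a hedged in-memory sketch of the same query on the example table. Row and LatestDuplicates are hypothetical names. It relies on LINQ to Objects, where GroupBy yields groups in the order their keys first appear and keeps the source order inside each group; that ordering should not be assumed for LINQ to Entities. It also guards against FirstOrDefault returning null when no group has at least N rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row of Mytable.
public class Row
{
    public int Id { get; set; }
    public string Mycolumn { get; set; }
}

public static class LatestDuplicates
{
    // Ids (newest first) of up to n rows from the group that contains the
    // newest entry among all groups with at least n rows; empty if none.
    public static List<int> Find(IEnumerable<Row> rows, int n)
    {
        var newest = rows
            .OrderByDescending(m => m.Id)
            .GroupBy(m => m.Mycolumn)
            .FirstOrDefault(g => g.Count() >= n);

        return newest == null
            ? new List<int>()
            : newest.Take(n).Select(m => m.Id).ToList();
    }

    // The example table from the question.
    public static List<Row> Sample()
    {
        var names = new[] { "name55555", "name22", "name22", "name22", "name55555", "name55555", "name1" };
        return names.Select((name, index) => new Row { Id = index + 1, Mycolumn = name }).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Find(Sample(), 3)));
        // prints: 6, 5, 1
    }
}
```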
If the rows are identical (all columns), then frankly there's no point fetching more than one of each, since they will be indistinguishable. I don't know about LINQ, but you can do something like:
select id, name /* more cols */, count(1) from #foo
group by id, name /* more cols */ having count(1) > 1
You can probably do that in LINQ using GroupBy etc. If they aren't entirely identical (for example, the IDENTITY is different, but the other columns are the same), it gets more difficult, and certainly there is no easy LINQ syntax for it; at the TSQL level, though:
select id, name /* more cols */
from (
    select id, name /* more cols */,
        ROW_NUMBER() over (partition by name /* more cols */ order by id) as [_row]
    from #foo) x where x._row > 1
I put this together in LINQPad; it should give you the results you want:
int Border = 3;
var table = new List<table>
{
    new table {Id = 1, Value = "Name1"},
    new table {Id = 2, Value = "Name2"},
    new table {Id = 3, Value = "Name5"},
    new table {Id = 4, Value = "Name5"},
    new table {Id = 5, Value = "Name2"},
    new table {Id = 6, Value = "Name5"},
    new table {Id = 7, Value = "Name5"},
};
var results = from p in table
              group p.Id by p.Value into g
              where g.Count() > Border
              select new { rows = g.ToList() };
// Dump() is only available in LINQPad
results.Dump();
This yields the rows 3, 4, 6, 7.
However, you only want the last occurrence, not all of them, so you have to query results again:
results.Skip(Math.Max(0, results.Count() - 1)).Take(1);
Kind regards

Datatable group by sum

In a Queue I have DataTables in the following format.
One table in the Queue:
Name Rank
AAA 9
BBB 5
CCC 1
DDD 5
Another table in the Queue:
Name Rank
AAA 1
SSS 5
MMM 1
DDD 8
Using LINQ, I need to process those tables one by one, continuously, and add the results to a global DataTable in the following format:
Name  Rank1  Rank2  Rank3  Rank>3
AAA   1      0      0      1
BBB   0      0      0      1
CCC   1      0      0      0
DDD   0      0      0      2
SSS   0      0      0      1
MMM   1      0      0      0
In the global table, four columns state how many times a name was ranked 1, 2, 3 or >3.
If the name already exists in the global table, I will not add it again but only increment the rank count columns; if it does not exist, I add it.
I've done this with nested loops, but I wonder if anyone can help me with the LINQ syntax to do this. Also, will using LINQ make the process faster than nested loops?
Note that new tables are added to the Queue every second, and I will take a table from the Queue and process it into the global DataTable.
table1.AsEnumerable().Concat(table2.AsEnumerable())
    .GroupBy(r => r.Field<string>("Name"))
    .Select(g => new {
        Name = g.Key,
        Rank1 = g.Count(x => x.Field<int>("Rank") == 1),
        Rank2 = g.Count(x => x.Field<int>("Rank") == 2),
        Rank3 = g.Count(x => x.Field<int>("Rank") == 3),
        OtherRank = g.Count(x => x.Field<int>("Rank") > 3)
    }).CopyToDataTable();
You will need an implementation of the CopyToDataTable method for the case where the generic type T is not a DataRow.
A slightly optimized solution (the Rank field is parsed only once per row, and the counts run over the extracted ranks):
(from row in table1.AsEnumerable().Concat(table2.AsEnumerable())
 group row by row.Field<string>("Name") into g
 let ranks = g.Select(x => x.Field<int>("Rank")).ToList()
 select new {
     Name = g.Key,
     Rank1 = ranks.Count(r => r == 1),
     Rank2 = ranks.Count(r => r == 2),
     Rank3 = ranks.Count(r => r == 3),
     OtherRank = ranks.Count(r => r > 3)
 }).CopyToDataTable();
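If you would rather not write a generic CopyToDataTable, one alternative is to fill the result DataTable by hand. The sketch below is only an illustration under assumptions: RankDemo, MakeTable, Summarize and Sample are hypothetical names, the column name OtherRank follows the answer above, and the two input tables are the ones from the question.

```csharp
using System;
using System.Data;
using System.Linq;

public static class RankDemo
{
    // Builds a (Name, Rank) table from alternating name/rank values.
    public static DataTable MakeTable(params object[] nameRankPairs)
    {
        var t = new DataTable();
        t.Columns.Add("Name", typeof(string));
        t.Columns.Add("Rank", typeof(int));
        for (int i = 0; i < nameRankPairs.Length; i += 2)
            t.Rows.Add(nameRankPairs[i], nameRankPairs[i + 1]);
        return t;
    }

    // Groups all rows of all tables by Name and counts ranks 1, 2, 3 and >3.
    public static DataTable Summarize(params DataTable[] tables)
    {
        var result = new DataTable();
        result.Columns.Add("Name", typeof(string));
        foreach (var column in new[] { "Rank1", "Rank2", "Rank3", "OtherRank" })
            result.Columns.Add(column, typeof(int));

        var groups = tables
            .SelectMany(t => t.AsEnumerable())
            .GroupBy(r => r.Field<string>("Name"));

        foreach (var g in groups)
        {
            var ranks = g.Select(r => r.Field<int>("Rank")).ToList();
            result.Rows.Add(g.Key,
                ranks.Count(r => r == 1),
                ranks.Count(r => r == 2),
                ranks.Count(r => r == 3),
                ranks.Count(r => r > 3));
        }
        return result;
    }

    // The two tables from the question, summarized together.
    public static DataTable Sample()
    {
        var table1 = MakeTable("AAA", 9, "BBB", 5, "CCC", 1, "DDD", 5);
        var table2 = MakeTable("AAA", 1, "SSS", 5, "MMM", 1, "DDD", 8);
        return Summarize(table1, table2);
    }

    public static void Main()
    {
        foreach (DataRow row in Sample().Rows)
            Console.WriteLine(string.Join(" ", row.ItemArray));
    }
}
```

This version recomputes the summary from a set of tables in one pass; merging each newly dequeued table into an existing global table would need an extra lookup-and-increment step that is not shown here.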
