Check duplicate rows on condition in DataTable using LINQ - C#

I have a DataTable with 5 columns: Type, A1, A2, B1, B2.
If Type is A, I want to make sure no two rows have the same data in the A1 and A2 columns;
for Type B, no two rows may have the same data in the B1 and B2 columns.
For example:
| Row | Type | A1  | A2 | B1 | B2 |
|-----|------|-----|----|----|----|
| 1   | A    | 123 | XY |    |    |
| 2   | A    | 123 | XY |    |    |
| 3   | B    |     |    | TT | LL |
| 4   | A    | 456 | YZ |    |    |
| 5   | B    |     |    | TT | LL |
| 6   | A    | 123 | YZ |    |    |
| 7   | B    |     |    | TT | LL |
| 8   | A    | 456 | YZ |    |    |
In this case I want to flag an error on rows 1, 2, 4 and 8,
and another error on rows 3, 5 and 7.
Row 6 is OK.
To start with, I have grouped by Type:
var groups = dt.AsEnumerable().GroupBy(r => r["Type"]).ToList();
I am not sure whether I should now loop over each group with foreach or whether there is a better way in LINQ.
Please advise.
Thanks.

You can use an anonymous type as the GroupBy key. Including the type flag in the key keeps a Type A row from being grouped with a Type B row that happens to hold the same values:
var dupGroups = table.AsEnumerable()
    .Select(r => new
    {
        Row = r,
        IsA = r.Field<string>("Type") == "A",
        A1 = r.Field<string>("A1"),
        A2 = r.Field<string>("A2"),
        B1 = r.Field<string>("B1"),
        B2 = r.Field<string>("B2"),
    })
    // Key on (A1, A2) for Type A rows and on (B1, B2) for all other rows
    .GroupBy(x => x.IsA
        ? new { x.IsA, Val1 = x.A1, Val2 = x.A2 }
        : new { x.IsA, Val1 = x.B1, Val2 = x.B2 })
    .Where(g => g.Count() > 1);
// Flag every row that belongs to a group with more than one member
foreach (var dupGroup in dupGroups)
    foreach (DataRow row in dupGroup.Select(x => x.Row))
        row.RowError = "Duplicate detected";
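Result:
DataTable errors = table.GetErrors().CopyToDataTable();
| Row | Type | A1  | A2 | B1 | B2 |
|-----|------|-----|----|----|----|
| 1   | A    | 123 | XY |    |    |
| 2   | A    | 123 | XY |    |    |
| 3   | B    |     |    | TT | LL |
| 4   | A    | 456 | YZ |    |    |
| 5   | B    |     |    | TT | LL |
| 7   | B    |     |    | TT | LL |
| 8   | A    | 456 | YZ |    |    |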
As you can see, row 6 is not part of the error-table because it's not a duplicate.
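For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes all five columns are strings (the question does not say) and that empty cells are null; it uses top-level statements, so it needs C# 9 or later. The key includes Type so that A rows and B rows can never collide.

```csharp
using System;
using System.Data;
using System.Linq;

// Sample table from the question; string columns are an assumption.
var table = new DataTable();
foreach (var name in new[] { "Type", "A1", "A2", "B1", "B2" })
    table.Columns.Add(name, typeof(string));

table.Rows.Add("A", "123", "XY", null, null);
table.Rows.Add("A", "123", "XY", null, null);
table.Rows.Add("B", null, null, "TT", "LL");
table.Rows.Add("A", "456", "YZ", null, null);
table.Rows.Add("B", null, null, "TT", "LL");
table.Rows.Add("A", "123", "YZ", null, null);
table.Rows.Add("B", null, null, "TT", "LL");
table.Rows.Add("A", "456", "YZ", null, null);

// Group on (Type, first value, second value); the value columns depend on Type.
var dupGroups = table.AsEnumerable()
    .GroupBy(r =>
    {
        var type = r.Field<string>("Type");
        bool isA = type == "A";
        return new
        {
            Type = type,
            Val1 = r.Field<string>(isA ? "A1" : "B1"),
            Val2 = r.Field<string>(isA ? "A2" : "B2")
        };
    })
    .Where(g => g.Count() > 1)
    .ToList();

// Mark every row that sits in a group of two or more.
foreach (var row in dupGroups.SelectMany(g => g))
    row.RowError = "Duplicate detected";

// 1-based positions of the flagged rows.
var flagged = table.GetErrors()
    .Select(r => table.Rows.IndexOf(r) + 1)
    .OrderBy(n => n)
    .ToList();

Console.WriteLine(string.Join(",", flagged)); // prints 1,2,3,4,5,7,8
```

If you need a different message per duplicate set, you can build the RowError text from g.Key inside the loop instead.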

Related

C# Datatable - group by multiple columns with linq

I have a DataTable like this:
| Supplier | Product | Price1 | Price2 | Price3 | ... | PriceN |
|-----------|---------|--------|--------|--------|-----|--------|
| Supplier1 | Orange | 100 | 105 | 150 | ... | 180 |
| Supplier1 | Orange | 110 | 130 | 140 | ... | 180 |
| Supplier2 | Orange | 200 | 250 | 270 | ... | 350 |
| Supplier2 | Orange | 250 | 270 | 320 | ... | 270 |
I want to group the rows as follows:
| Supplier | Product | Price1 | Price2 | Price3 | ... | PriceN |
|-----------|---------|---------|---------|---------|-----|---------|
| Supplier1 | Orange | 100-110 | 105-130 | 140-150 | ... | 180 |
| Supplier2 | Orange | 200-250 | 250-270 | 270-320 | ... | 270-350 |
The number of columns like "PriceN" can be arbitrary.
How can I do this with LINQ?
You can group by Supplier and Product as follows:
var result = from x in data
             group x by new { x.Supplier, x.Product } into g
             select g;
or
var result = data.GroupBy(x => new { x.Supplier, x.Product });
Similarly, you can use any number of properties in the group clause.
To group by further properties, apply a separate GroupBy with another lambda expression:
result.ToList().GroupBy(p => new { p.x, p.x2, p.x3 ... });
You can use GroupBy and string.Join to concatenate the values of the price columns:
var groupedSupplier = supplier.GroupBy(s => new { s.Supplier, s.Product })
    .Select(g => new
    {
        g.Key.Supplier,
        g.Key.Product,
        Price1 = string.Join(",", g.Select(x => x.Price1)),
        Price2 = string.Join(",", g.Select(x => x.Price2)),
        ...
    });
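To group by multiple columns in a lambda expression, use an anonymous type as the key. (In the two-lambda overload GroupBy(s => s.Supplier, s => s.Product), the second lambda is an element selector, not a second key.)
var groupedSupplier = supplier.GroupBy(s => new { s.Supplier, s.Product });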
For details, see "C# in Depth" by Jon Skeet, Chapter 11 "Query expressions and LINQ to Objects", section 11.6 "Groupings and continuations": http://csharpindepth.com/
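None of the snippets above produce the "min-max" ranges from the question or cope with an arbitrary number of price columns, so here is one possible sketch that does both. It assumes the price columns are int and that every column other than Supplier and Product is a price column; both are assumptions, not something the question states. The sample uses only two price columns to stay short.

```csharp
using System;
using System.Data;
using System.Linq;

// Reduced sample data: two price columns instead of N.
var data = new DataTable();
data.Columns.Add("Supplier", typeof(string));
data.Columns.Add("Product", typeof(string));
data.Columns.Add("Price1", typeof(int));
data.Columns.Add("Price2", typeof(int));
data.Rows.Add("Supplier1", "Orange", 100, 105);
data.Rows.Add("Supplier1", "Orange", 110, 130);
data.Rows.Add("Supplier2", "Orange", 200, 250);
data.Rows.Add("Supplier2", "Orange", 250, 270);

// Assumption: everything that is not a key column is a price column.
var priceColumns = data.Columns.Cast<DataColumn>()
    .Select(c => c.ColumnName)
    .Where(n => n != "Supplier" && n != "Product")
    .ToList();

// The result table stores the ranges as text.
var result = new DataTable();
result.Columns.Add("Supplier", typeof(string));
result.Columns.Add("Product", typeof(string));
foreach (var name in priceColumns)
    result.Columns.Add(name, typeof(string));

var groups = data.AsEnumerable()
    .GroupBy(r => new
    {
        Supplier = r.Field<string>("Supplier"),
        Product = r.Field<string>("Product")
    });

foreach (var g in groups)
{
    var newRow = result.NewRow();
    newRow["Supplier"] = g.Key.Supplier;
    newRow["Product"] = g.Key.Product;
    foreach (var name in priceColumns)
    {
        int min = g.Min(r => r.Field<int>(name));
        int max = g.Max(r => r.Field<int>(name));
        // A single value when min and max agree, otherwise "min-max".
        newRow[name] = min == max ? min.ToString() : $"{min}-{max}";
    }
    result.Rows.Add(newRow);
}

foreach (DataRow row in result.Rows)
    Console.WriteLine(string.Join(" | ", row.ItemArray));
// Supplier1 | Orange | 100-110 | 105-130
// Supplier2 | Orange | 200-250 | 250-270
```

The per-column loop runs outside the LINQ query because anonymous types cannot have a variable number of properties; a DataTable (or a dictionary per group) can.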

Join multiple tables to get clones that have demand but no supply

Plantdemand
Id | FId | FY
------------------
22 | 1 | 2011-15
No.PlantDemand
Id | PDId | CId | Demand
------------------------
1 | 22 | 1 | 100
2 | 22 | 2 | 200
3 | 22 | 3 | 300
(PDId is the Id of Plantdemand)
PlantSupply
Id | FId | DId | FY
---------------------
11 | 1 | 22 | 2012-13
(DId is the Id of Plantdemand)
No.PlantSupply
ID | PSId | CId | Supply
---------------------------
1 | 11 | 1 | 10
2 | 11 | 2 | 10
(PSId is the Id of PlantSupply)
I am stuck trying to get the CId entries for FId = 1 that are not in table No.PlantSupply, i.e. the clones for which a demand was entered but nothing was supplied.
var getNoofPlantDemand = (from r in getdemand
                          join nd in context.tbl_NoOfPlantDemanded on r.PlantDemandId equals nd.PlantDemandId into list1
                          from l1 in list1.DefaultIfEmpty()
                          join p in context.PlantationTypes on r.PlantationTypeId equals p.Id into list3
                          from l3 in list3.DefaultIfEmpty()
                          select new
                          {}).ToList()
var getCloneDemand = (from r in getNoofPlantDemand
                      join cl in context.Clones on r.CloneId equals cl.Id into list4
                      from l4 in list4.DefaultIfEmpty()
var getPlantSupply = (from r in getCloneDemand
                      join s in context.PlantSupply on r.PlantDemandId equals s.DemandId into list
                      from l1 in list.DefaultIfEmpty()
                      join ns in context.No.PlantSupply on l1.Id equals ns.PSId into list1
                      from l2 in list1.DefaultIfEmpty()
                      where r.CloneId != l2.CloneId && r.PlantDemandId == l1.DemandId && r.PlantationTypeId == l2.PlantationTypeId
                      select new
                      {}).ToList()
My required result:
Id | FId | FY | CId | Demand
22 | 1 | 2011-012 | 3 | 300
Please let me know if anybody knows how I can get the clones that have only a demand entry and have not been supplied.
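One way to approach this is an anti-join: first collect the (demand Id, CId) pairs that do have a supply entry, then keep only the demand lines whose pair is missing from that set. The sketch below runs against in-memory stand-ins for the four tables; the variable and field names are hypothetical and simply mirror the columns shown above, and it is LINQ to Objects, so it may need adjusting before it translates against a real context.

```csharp
using System;
using System.Linq;

// In-memory stand-ins for the four tables (hypothetical names).
var plantDemand = new[] { (Id: 22, FId: 1, FY: "2011-15") };
var noPlantDemand = new[]
{
    (Id: 1, PDId: 22, CId: 1, Demand: 100),
    (Id: 2, PDId: 22, CId: 2, Demand: 200),
    (Id: 3, PDId: 22, CId: 3, Demand: 300),
};
var plantSupply = new[] { (Id: 11, FId: 1, DId: 22, FY: "2012-13") };
var noPlantSupply = new[]
{
    (Id: 1, PSId: 11, CId: 1, Supply: 10),
    (Id: 2, PSId: 11, CId: 2, Supply: 10),
};

// (demand Id, clone Id) pairs that have at least one supply entry.
var supplied = (from s in plantSupply
                join ns in noPlantSupply on s.Id equals ns.PSId
                select (DemandId: s.DId, ns.CId)).ToHashSet();

// Demand lines for FId = 1 whose (demand Id, clone Id) pair was never supplied.
var unsupplied = (from d in plantDemand
                  where d.FId == 1
                  join nd in noPlantDemand on d.Id equals nd.PDId
                  where !supplied.Contains((d.Id, nd.CId))
                  select new { d.Id, d.FId, d.FY, nd.CId, nd.Demand }).ToList();

foreach (var u in unsupplied)
    Console.WriteLine($"{u.Id} | {u.FId} | {u.FY} | {u.CId} | {u.Demand}");
// 22 | 1 | 2011-15 | 3 | 300
```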

Join 2 DataTables on different columns

I have 2 DataTables. I want to use LINQ to join them on different columns. How can I do that?
Table A:
+--------+-------+-------+
| ACol1 | ACol2 | ACol3 |
+--------+-------+-------+
| 1 | tbA12 | tbA13 |
| 2 | tbA22 | tbA23 |
| 3 | tbA32 | tbA33 |
| 4 | tbA42 | tbA43 |
| 5 | tbA52 | tbA53 |
+--------+-------+-------+
Table B:
+-------+-------+-------+
| BCol1 | BCol2 | BCol3 |
+-------+-------+-------+
| 1 | XX | tbB13 |
| XX | 1 | tbB23 |
| XX | 2 | tbB33 |
| 4 | XX | tbB43 |
+-------+-------+-------+
SQL Query:
SELECT a.*, b.BCol3
FROM tableA a
JOIN tableB b ON a.ACol1=b.BCol1 OR a.ACol1=b.BCol2
Expected Result:
+--------+-------+-------+-------+
| ACol1 | ACol2 | ACol3 | BCol3 |
+--------+-------+-------+-------+
| 1 | tbA12 | tbA13 | tbB13 |
| 1 | tbA12 | tbA13 | tbB23 |
| 2 | tbA22 | tbA23 | tbB33 |
| 4 | tbA42 | tbA43 | tbB43 |
+--------+-------+-------+-------+
Currently my LINQ queries are as follows:
var query1 = from rowA in tableA.AsEnumerable()
             join rowB in tableB.AsEnumerable()
                 on rowA["ACol1"].ToString() equals rowB["BCol1"].ToString()
             select new
             {
                 ACol1 = rowA["ACol1"],
                 ACol2 = rowA["ACol2"],
                 ACol3 = rowA["ACol3"],
                 BCol3 = rowB["BCol3"]
             };
var query2 = from rowA in tableA.AsEnumerable()
             join rowB in tableB.AsEnumerable()
                 on rowA["ACol1"].ToString() equals rowB["BCol2"].ToString()
             select new
             {
                 ACol1 = rowA["ACol1"],
                 ACol2 = rowA["ACol2"],
                 ACol3 = rowA["ACol3"],
                 BCol3 = rowB["BCol3"]
             };
var result = query1.Union(query2);
Any better idea how to solve this?
The LINQ join clause only supports equality conditions (equijoins), so an OR condition cannot go in the on part. You could do a cross join and move your condition to the where clause. Note that anonymous type members built from indexer expressions need explicit names:
var query1 = from rowA in tableA.AsEnumerable()
             from rowB in tableB.AsEnumerable()
             where rowA["ACol1"].ToString() == rowB["BCol1"].ToString()
                || rowA["ACol1"].ToString() == rowB["BCol2"].ToString()
             select new
             {
                 ACol1 = rowA["ACol1"],
                 ACol2 = rowA["ACol2"],
                 ACol3 = rowA["ACol3"],
                 BCol3 = rowB["BCol3"]
             };
Try this:
var result = from a in tableA.AsEnumerable()
             from b in tableB.AsEnumerable()
             where a.Field<string>("ACol1") == b.Field<string>("BCol1")
                || a.Field<string>("ACol1") == b.Field<string>("BCol2")
             select new
             {
                 ACol1 = a["ACol1"],
                 ACol2 = a["ACol2"],
                 ACol3 = a["ACol3"],
                 BCol3 = b["BCol3"]
             };
Here is the complete working Fiddle. Copy and paste it into your own editor to test it, because DotNetFiddle does not support AsEnumerable.
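In case it helps, a self-contained version of the cross-join approach with the sample data from the question could look like the sketch below. It assumes every column is a string, which the question does not confirm.

```csharp
using System;
using System.Data;
using System.Linq;

// Table A: ACol1 = 1..5 (stored as strings, an assumption).
var tableA = new DataTable();
foreach (var n in new[] { "ACol1", "ACol2", "ACol3" })
    tableA.Columns.Add(n, typeof(string));
for (int i = 1; i <= 5; i++)
    tableA.Rows.Add(i.ToString(), $"tbA{i}2", $"tbA{i}3");

// Table B as shown in the question.
var tableB = new DataTable();
foreach (var n in new[] { "BCol1", "BCol2", "BCol3" })
    tableB.Columns.Add(n, typeof(string));
tableB.Rows.Add("1", "XX", "tbB13");
tableB.Rows.Add("XX", "1", "tbB23");
tableB.Rows.Add("XX", "2", "tbB33");
tableB.Rows.Add("4", "XX", "tbB43");

// Cross join, then filter with the OR condition from the SQL query.
var result = (from a in tableA.AsEnumerable()
              from b in tableB.AsEnumerable()
              let key = a.Field<string>("ACol1")
              where key == b.Field<string>("BCol1") || key == b.Field<string>("BCol2")
              select new
              {
                  ACol1 = key,
                  ACol2 = a.Field<string>("ACol2"),
                  ACol3 = a.Field<string>("ACol3"),
                  BCol3 = b.Field<string>("BCol3")
              }).ToList();

foreach (var r in result)
    Console.WriteLine($"{r.ACol1} | {r.ACol2} | {r.ACol3} | {r.BCol3}");
// 1 | tbA12 | tbA13 | tbB13
// 1 | tbA12 | tbA13 | tbB23
// 2 | tbA22 | tbA23 | tbB33
// 4 | tbA42 | tbA43 | tbB43
```

A cross join compares every row of A with every row of B, so for large tables the two-equijoin approach from the question may perform better.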

Group and count in linq C#

I have a query that needs to retrieve 3 fields:
| MaintenanceID | MaintenanceIDCount | StatusID |
| 1 | 2 | -1 |
| 3 | 2 | -1 |
The field MaintenanceIDCount, as the name says, is the number of rows for each MaintenanceID.
My basic query expression is below:
var result = from m in Maintenance
             select new
             {
                 m.MaintenanceID,
                 m.StatusID
             };
The result of this query is:
| MaintenanceID | StatusID |
| 1 | -1 |
| 1 | -1 |
| 3 | -1 |
| 3 | -1 |
How can I group and build my query so that it returns a column with the MaintenanceID count?
Any tips?
from m in Maintenance
group m by new { m.MaintenanceID, m.StatusID } into g
select new
{
    g.Key.MaintenanceID,
    g.Key.StatusID,
    MaintenanceIDCount = g.Count()
}
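To check the grouping against the sample rows from the question, here is a small runnable sketch. The in-memory array of tuples is a stand-in for the Maintenance source, which is an assumption about its shape.

```csharp
using System;
using System.Linq;

// Stand-in for the Maintenance source (hypothetical in-memory data).
var maintenance = new[]
{
    (MaintenanceID: 1, StatusID: -1),
    (MaintenanceID: 1, StatusID: -1),
    (MaintenanceID: 3, StatusID: -1),
    (MaintenanceID: 3, StatusID: -1),
};

// Group on both columns and count the rows in each group.
var result = (from m in maintenance
              group m by new { m.MaintenanceID, m.StatusID } into g
              select new
              {
                  g.Key.MaintenanceID,
                  MaintenanceIDCount = g.Count(),
                  g.Key.StatusID
              }).ToList();

foreach (var r in result)
    Console.WriteLine($"{r.MaintenanceID} | {r.MaintenanceIDCount} | {r.StatusID}");
// 1 | 2 | -1
// 3 | 2 | -1
```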

Linq query summary by column with ordering

The scenario is like this:
public class Test {
    public string name;
    public int val_1;
    public int val_2;
}
name |val 1 |val 2|
'aa' | 10 | 4 |
'aa' | 30 | 5 |
'bb' | 14 | 4 |
'bb' | 16 | 6 |
'cc' | 5 | 5 |
'cc' | 2 | 1 |
'cc' | 1 | 1 |
What is the best way to group by name and get the sums of val_1 and val_2 for every name,
like this:
name |val 1 |val 2|
'aa' | 40 | 9 |
'bb' | 30 | 10 |
'cc' | 8 | 7 |
Try this:
var results =
    from t in db.Tests
    group t by t.name into g
    orderby g.Key
    select new
    {
        name = g.Key,
        val_1 = g.Sum(x => x.val_1),
        val_2 = g.Sum(x => x.val_2)
    };
Or, if you prefer fluent syntax:
var results = db.Tests.GroupBy(t => t.name)
    .OrderBy(g => g.Key)
    .Select(g => new
    {
        name = g.Key,
        val_1 = g.Sum(x => x.val_1),
        val_2 = g.Sum(x => x.val_2)
    });
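If you want to verify the sums without a database, the fluent version can be run against an in-memory array, as in this sketch. The tuple array stands in for db.Tests and is an assumption.

```csharp
using System;
using System.Linq;

// Stand-in for db.Tests with the sample rows from the question.
var tests = new[]
{
    (name: "aa", val_1: 10, val_2: 4),
    (name: "aa", val_1: 30, val_2: 5),
    (name: "bb", val_1: 14, val_2: 4),
    (name: "bb", val_1: 16, val_2: 6),
    (name: "cc", val_1: 5, val_2: 5),
    (name: "cc", val_1: 2, val_2: 1),
    (name: "cc", val_1: 1, val_2: 1),
};

// Group by name, order by the key, and sum both value columns.
var results = tests.GroupBy(t => t.name)
    .OrderBy(g => g.Key)
    .Select(g => new
    {
        name = g.Key,
        val_1 = g.Sum(x => x.val_1),
        val_2 = g.Sum(x => x.val_2)
    })
    .ToList();

foreach (var r in results)
    Console.WriteLine($"{r.name} | {r.val_1} | {r.val_2}");
// aa | 40 | 9
// bb | 30 | 10
// cc | 8 | 7
```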
