Convert Datatable GroupBy Multiple Columns with Sum using Linq - c#

I want to sum the TotalImages column after grouping by CountryId and CityId, but the code below gives me an error.
Can anyone tell me what is going wrong?
Note that I want to keep this syntax and get a DataTable, not a List. I would be grateful for any help.
Sample data:
CountryId | CityId | TotalImages
1 | 1 | 2
1 | 2 | 2
1 | 2 | 3
1 | 3 | 4
2 | 1 | 2
2 | 2 | 2
2 | 2 | 3
2 | 3 | 4
DataTable dt = dt.AsEnumerable()
.GroupBy(r => new { Col1 = r["CountryId"], Col2 = r["CityId"]})
.Select(g => g.Sum(r => r["TotalImages"]).First())
.CopyToDataTable();

You can use this:-
DataTable countriesTable = dt.AsEnumerable().GroupBy(x => new { CountryId = x.Field<int>("CountryId"), CityId = x.Field<int>("CityId") })
.Select(x => new Countries
{
CountryId = x.Key.CountryId,
CityId = x.Key.CityId,
TotalSum = x.Sum(z => z.Field<int>("TotalImages"))
}).PropertiesToDataTable<Countries>();
Since the CopyToDataTable method cannot be used with anonymous types, I used an extension method taken from here and modified it accordingly.
public static DataTable PropertiesToDataTable<T>(this IEnumerable<T> source)
{
DataTable dt = new DataTable();
var props = TypeDescriptor.GetProperties(typeof(T));
foreach (PropertyDescriptor prop in props)
{
DataColumn dc = dt.Columns.Add(prop.Name, prop.PropertyType);
dc.Caption = prop.DisplayName;
dc.ReadOnly = prop.IsReadOnly;
}
foreach (T item in source)
{
DataRow dr = dt.NewRow();
foreach (PropertyDescriptor prop in props)
{
dr[prop.Name] = prop.GetValue(item);
}
dt.Rows.Add(dr);
}
return dt;
}
And, here is the Countries type:-
public class Countries
{
public int CountryId { get; set; }
public int CityId { get; set; }
public int TotalSum { get; set; }
}
You can use any other approach to convert it to a DataTable if you wish.
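As one such alternative, here is a minimal sketch that skips the helper class and the extension method and adds the grouped rows to a new DataTable in a plain loop. The result table name summed and its TotalSum column are assumptions chosen for illustration; the input rows are the sample data from the question.

```csharp
using System;
using System.Data;
using System.Linq;

// Build the sample table from the question.
var dt = new DataTable();
dt.Columns.Add("CountryId", typeof(int));
dt.Columns.Add("CityId", typeof(int));
dt.Columns.Add("TotalImages", typeof(int));
int[][] sampleRows =
{
    new[] { 1, 1, 2 }, new[] { 1, 2, 2 }, new[] { 1, 2, 3 }, new[] { 1, 3, 4 },
    new[] { 2, 1, 2 }, new[] { 2, 2, 2 }, new[] { 2, 2, 3 }, new[] { 2, 3, 4 }
};
foreach (var r in sampleRows)
    dt.Rows.Add(r[0], r[1], r[2]);

// Hypothetical result table: one row per (CountryId, CityId) pair.
var summed = new DataTable();
summed.Columns.Add("CountryId", typeof(int));
summed.Columns.Add("CityId", typeof(int));
summed.Columns.Add("TotalSum", typeof(int));

// Group on both key columns, then add one summed row per group.
var groups = dt.AsEnumerable()
    .GroupBy(r => new
    {
        CountryId = r.Field<int>("CountryId"),
        CityId = r.Field<int>("CityId")
    });
foreach (var g in groups)
    summed.Rows.Add(g.Key.CountryId, g.Key.CityId, g.Sum(r => r.Field<int>("TotalImages")));

foreach (DataRow row in summed.Rows)
    Console.WriteLine($"{row["CountryId"]} | {row["CityId"]} | {row["TotalSum"]}");
```

This keeps everything in DataTable form, at the cost of declaring the result columns by hand.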

Related

DataTable: how to get the duplicates and the row number of the duplicates

I have the following DataTable:
Article Price
ART1 99
ART2 100
ART3 150
ART2 90
ART1 50
Now I need to create a new DataTable that lists the row positions of the duplicates, like this:
Article Duplicates
ART1 1,5
ART2 2,4
ART3
ART2 2,4
ART1 1,5
So the key is the "Article" column.
The only examples I found show how to find which values are duplicated and how many times they are repeated with LINQ.
How can I achieve something like this with LINQ?
Thank you.
You can use this approach:
var articleLookup = yourTable.AsEnumerable()
.Select((row, index) => new { Row = row, RowNum = index + 1 })
.ToLookup(x=> x.Row.Field<string>("Article"));
DataTable dupTable = new DataTable();
dupTable.Columns.Add("Article");
dupTable.Columns.Add("Duplicates");
foreach(DataRow row in yourTable.Rows)
{
DataRow addedRow = dupTable.Rows.Add();
string article = row.Field<string>("Article");
var dupRowNumList = articleLookup[article].Select(x => x.RowNum).ToList();
string dupRowNumText = dupRowNumList.Count == 1 ? "" : String.Join(",", dupRowNumList);
addedRow.SetField("Article", article);
addedRow.SetField("Duplicates", dupRowNumText);
}
I tried your exact requirement using a List of objects and got the result you expect. The important part is the LINQ query, which produces the result.
Here is the main class:
class Program
{
static void Main(string[] args)
{
List<data> datas = new List<data>();
datas.Add(new data() {atricle = "ART1", price = 99});
datas.Add(new data() { atricle = "ART2", price = 100 });
datas.Add(new data() { atricle = "ART3", price = 150 });
datas.Add(new data() { atricle = "ART2", price = 90 });
datas.Add(new data() { atricle = "ART1", price = 50 });
Console.WriteLine($"Atricle | Duplicates");
foreach (data templist in datas)
{
var duplicates = datas.Select((data, index) => new {atricle = data.atricle, Index = index + 1})
.Where(x => x.atricle == templist.atricle)
.GroupBy(pair => pair.atricle)
.Where(g => g.Count() > 1)
.Select(grp => grp.Select(g => g.Index.ToString()).ToArray())
.ToArray();
string joined = duplicates.Length>0 ? string.Join(",", duplicates[0].ToList()):"";
Console.WriteLine($"{templist.atricle} | {joined}");
}
Console.ReadLine();
}
}
Here is the Data model class
public class data{
public string atricle { get; set; }
public int price { get; set; }
}

remove duplicate values from a list with multiple properties

I have a list of MyClass:
class MyClass
{
public DateTime? DueDate;
public string Desc;
public Decimal Amount;
}
var sample = new List<MyClass>();
This is how sample data looks like :
DueDate Desc Amount
06-29-2015 ABC 100
06-29-2015 DEF 200
01-15-2015 ABC 100
01-15-2015 DEF 200
Output I want in this format
DueDate Desc Amount
06-29-2015 ABC 100
DEF 200
01-15-2015 ABC 100
DEF 200
So basically I would like to remove duplicate DueDate values while keeping the adjacent Desc and Amount values.
I tried this, but it removes the values from the adjacent columns as well:
var test = sample.GroupBy(d => d.DueDate).Select(a => a.First()).ToList();
Any suggestions?
Here's how to "remove" (set to null) duplicate, adjacent DueDates from the sample list:
sample.GroupBy(d => d.DueDate).ToList()
.ForEach(g => g.Skip(1).ToList().ForEach(o => o.DueDate = null));
This is done by Group-ing by DueDate, and for each group, Skip-ing the first element, setting the remainder of the elements in the group DueDates to null.
Output with format:
Console.WriteLine("DueDate Desc Amount");
foreach (var item in sample)
{
var dateString = item.DueDate != null
? item.DueDate.Value.ToString("MM-dd-yyyy")
: string.Empty;
Console.WriteLine(dateString.PadRight(12) + item.Desc + " " + item.Amount);
}
Result:
DueDate Desc Amount
06-29-2015 ABC 100
DEF 200
01-15-2015 ABC 100
DEF 200
var finalData = data
.GroupBy(d=>d.DueDate)
.Select(g=>
new {
DueDate = g.Key,
Values = g.Select(d2=>new{d2.Desc, d2.Amount})});
The final structure would be:
finalData = [
{
DueDate:'06-29-2015',
Values:[{Desc:"ABC", Amount:100}, {Desc:"DEF", Amount:200}]
},
{...}
]
EDIT:
var finalData = data
.GroupBy(d=>d.DueDate)
.Select(g=>
new {
DueDate = g.Key,
Values = g.Select(d2=>d2)
})
.ToDictionary(o=>o.DueDate, o=>o.Values);
What you want is a pivot table. This is how it is done:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
MyClass myClass = new MyClass();
myClass.Load();
myClass.CreatePivotTable();
}
}
class MyClass
{
public static List<MyClass> samples = new List<MyClass>();
public DateTime dueDate { get; set; }
public string desc { get; set; }
public Decimal amount { get; set; }
public static DataTable dt = new DataTable();
public void Load()
{
samples = new List<MyClass>() {
new MyClass() { dueDate = DateTime.Parse("06-29-2015"), desc = "ABC", amount = 100},
new MyClass() { dueDate = DateTime.Parse("06-29-2015"), desc = "DEF", amount = 200},
new MyClass() { dueDate = DateTime.Parse("01-15-2015"), desc = "ABC", amount = 100},
new MyClass() { dueDate = DateTime.Parse("01-15-2015"), desc = "DEF", amount = 200}
};
}
public void CreatePivotTable()
{
string[] uniqueDescription = samples.Select(x => x.desc).Distinct().ToArray();
dt.Columns.Add("Due Date", typeof(DateTime));
foreach (string desc in uniqueDescription)
{
dt.Columns.Add(desc, typeof(decimal));
}
var groups = samples.GroupBy(x => x.dueDate);
foreach(var group in groups)
{
DataRow newRow = dt.Rows.Add();
newRow["Due Date"] = group.Key;
foreach (string col in uniqueDescription)
{
newRow[col] = group.Where(x => x.desc == col).Sum(x => x.amount);
}
}
}
}
}
I'd simply prefer that you loop through your records after you got them in the correct order. Just start with an empty variable and keep the last date in it. If the next value is the same, just don't plot it out. If you find another date value the next iteration, plot it and overwrite your variable for further iterations.
Yeah I know, Linq and Lambdas are cool and stuff (and I love them too) but in this case it seems to be appropriate to me.
var last = DateTime.MinValue;
foreach (var f in sample.OrderBy(x => x.DueDate))
{
if (f.DueDate.Equals(last))
Console.WriteLine("{0}\t{1}\t{2}", "SKIP DATE", f.Desc, f.Amount);
else
{
Console.WriteLine("{0}\t{1}\t{2}", f.DueDate.ToShortDateString(), f.Desc, f.Amount);
last = f.DueDate;
}
}
Based on your latest comments I have edited my answer.
As I understand it, your requirements are:
Group by DueDate, and only allow the first item of each group to have a DueDate.
The results have to have the same structure.
If you want to remove the DueDate property from all i>0 items in a group then you need to make your property nullable: public DateTime? DueDate;. This way you can assign the value of null to subsequent items in the group.
//New list to hold our new items
var outputList = new List<MyClass>();
//Groups all the items together by DueDate
foreach(var grouping in samples.GroupBy(d => d.DueDate))
{
//Iterates through all items in a group (selecting the index as well)
foreach(var item in grouping.Select((Value, Index) => new { Value, Index }))
{
//If this is any item after the first one, we remove the due date
if(item.Index > 0)
{
item.Value.DueDate = null;
}
outputList.Add(item.Value);
}
}

Join 2 DataTables on dynamic number of columns

I'm trying to join two DataTables on a dynamic number of columns. I've gotten as far as the code below. The problem is the ON statement of the join. How can I make this dynamic based on how many column names are in the list "joinColumnNames".
I was thinking I will need to build some sort of expression tree, but I can't find any examples of how to do this with multiple join columns and with the DataRow object which doesn't have properties for each column.
private DataTable Join(List<string> joinColumnNames, DataTable pullX, DataTable pullY)
{
DataTable joinedTable = new DataTable();
// Add all the columns from pullX
foreach (string colName in joinColumnNames)
{
joinedTable.Columns.Add(pullX.Columns[colName]);
}
// Add unique columns from PullY
foreach (DataColumn col in pullY.Columns)
{
if (!joinedTable.Columns.Contains((col.ColumnName)))
{
joinedTable.Columns.Add(col);
}
}
var Join = (from PX in pullX.AsEnumerable()
join PY in pullY.AsEnumerable() on
// This must be dynamic and join on every column mentioned in joinColumnNames
new { A = PX[joinColumnNames[0]], B = PX[joinColumnNames[1]] } equals new { A = PY[joinColumnNames[0]], B = PY[joinColumnNames[1]] }
into Outer
from PY in Outer.DefaultIfEmpty<DataRow>(pullY.NewRow())
select new { PX, PY });
foreach (var item in Join)
{
DataRow newRow = joinedTable.NewRow();
foreach (DataColumn col in joinedTable.Columns)
{
var pullXValue = item.PX.Table.Columns.Contains(col.ColumnName) ? item.PX[col.ColumnName] : string.Empty;
var pullYValue = item.PY.Table.Columns.Contains(col.ColumnName) ? item.PY[col.ColumnName] : string.Empty;
newRow[col.ColumnName] = (pullXValue == null || string.IsNullOrEmpty(pullXValue.ToString())) ? pullYValue : pullXValue;
}
joinedTable.Rows.Add(newRow);
}
return joinedTable;
}
Adding a specific example to show input/output using 3 join columns (Country, Company, and DateId):
Pull X:
Country Company DateId Sales
United States Test1 Ltd 20160722 $25
Canada Test3 Ltd 20160723 $30
Italy Test4 Ltd 20160724 $40
India Test2 Ltd 20160725 $35
Pull Y:
Country Company DateId Downloads
United States Test1 Ltd 20160722 500
Mexico Test2 Ltd 20160723 300
Italy Test4 Ltd 20160724 900
Result:
Country Company DateId Sales Downloads
United States Test1 Ltd 20160722 $25 500
Canada Test3 Ltd 20160723 $30
Mexico Test2 Ltd 20160723 300
Italy Test4 Ltd 20160724 $40 900
India Test2 Ltd 20160725 $35
var Join =
from PX in pullX.AsEnumerable()
join PY in pullY.AsEnumerable()
on string.Join("\0", joinColumnNames.Select(c => PX[c]))
equals string.Join("\0", joinColumnNames.Select(c => PY[c]))
into Outer
from PY in Outer.DefaultIfEmpty<DataRow>(pullY.NewRow())
select new { PX, PY };
Another way is to have both DataTable in a DataSet and use DataRelation
How To: Use DataRelation to perform a join on two DataTables in a DataSet?
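A minimal sketch of the DataRelation idea, assuming both tables can be placed in the same DataSet and that the join columns have matching data types. The relation name "XtoY" and the trimmed sample rows are assumptions for illustration. Passing false for createConstraints avoids requiring the parent columns to be unique. Note that walking parent rows this way behaves like a left outer join, not the full outer join shown in the question's expected result.

```csharp
using System;
using System.Data;
using System.Linq;

var joinColumnNames = new[] { "Country", "Company", "DateId" };

var pullX = new DataTable("PullX");
pullX.Columns.Add("Country", typeof(string));
pullX.Columns.Add("Company", typeof(string));
pullX.Columns.Add("DateId", typeof(int));
pullX.Columns.Add("Sales", typeof(int));
pullX.Rows.Add("Italy", "Test4 Ltd", 20160724, 40);
pullX.Rows.Add("India", "Test2 Ltd", 20160725, 35);

var pullY = new DataTable("PullY");
pullY.Columns.Add("Country", typeof(string));
pullY.Columns.Add("Company", typeof(string));
pullY.Columns.Add("DateId", typeof(int));
pullY.Columns.Add("Downloads", typeof(int));
pullY.Rows.Add("Italy", "Test4 Ltd", 20160724, 900);
pullY.Rows.Add("Mexico", "Test2 Ltd", 20160723, 300);

// A DataRelation can only link tables that live in the same DataSet.
var ds = new DataSet();
ds.Tables.Add(pullX);
ds.Tables.Add(pullY);

// Build the column arrays dynamically from the list of join column names.
var relation = new DataRelation(
    "XtoY",
    joinColumnNames.Select(c => pullX.Columns[c]).ToArray(),
    joinColumnNames.Select(c => pullY.Columns[c]).ToArray(),
    false); // do not create unique/foreign-key constraints
ds.Relations.Add(relation);

int matched = 0, unmatched = 0;
foreach (DataRow x in pullX.Rows)
{
    // Rows of pullY whose join columns equal those of the current pullX row.
    DataRow[] partners = x.GetChildRows(relation);
    if (partners.Length == 0)
    {
        unmatched++;
        Console.WriteLine($"{x["Country"]}: Sales={x["Sales"]}, no match in PullY");
    }
    foreach (DataRow y in partners)
    {
        matched++;
        Console.WriteLine($"{x["Country"]}: Sales={x["Sales"]}, Downloads={y["Downloads"]}");
    }
}
```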
Since you are using LINQ to Objects, there is no need to use expression trees. You can solve your problem with a custom equality comparer.
Create an equality comparer that can compare equality between two DataRow objects based on the values of specific columns. Here is an example:
public class MyEqualityComparer : IEqualityComparer<DataRow>
{
private readonly string[] columnNames;
public MyEqualityComparer(string[] columnNames)
{
this.columnNames = columnNames;
}
public bool Equals(DataRow x, DataRow y)
{
return columnNames.All(cn => x[cn].Equals(y[cn]));
}
public int GetHashCode(DataRow obj)
{
unchecked
{
int hash = 19;
foreach (var value in columnNames.Select(cn => obj[cn]))
{
hash = hash * 31 + value.GetHashCode();
}
return hash;
}
}
}
Then you can use it to make the join like this:
public class TwoRows
{
public DataRow Row1 { get; set; }
public DataRow Row2 { get; set; }
}
private static List<TwoRows> LeftOuterJoin(
List<string> joinColumnNames,
DataTable leftTable,
DataTable rightTable)
{
return leftTable
.AsEnumerable()
.GroupJoin(
rightTable.AsEnumerable(),
l => l,
r => r,
(l, rlist) => new {LeftValue = l, RightValues = rlist},
new MyEqualityComparer(joinColumnNames.ToArray()))
.SelectMany(
x => x.RightValues.DefaultIfEmpty(rightTable.NewRow()),
(x, y) => new TwoRows {Row1 = x.LeftValue, Row2 = y})
.ToList();
}
Please note that I am using method syntax because I don't think that you can use a custom equality comparer otherwise.
Please note that the method does a left outer join, not a full outer join. Based on the example you provided, you seem to want a full outer join. To do this you need to do two left outer joins (see this answer). Here is how the full method would look like:
private static DataTable FullOuterJoin(
List<string> joinColumnNames,
DataTable pullX,
DataTable pullY)
{
var pullYOtherColumns =
pullY.Columns
.Cast<DataColumn>()
.Where(x => !joinColumnNames.Contains(x.ColumnName))
.ToList();
var allColumns =
pullX.Columns
.Cast<DataColumn>()
.Concat(pullYOtherColumns)
.ToArray();
var allColumnsClone =
allColumns
.Select(x => new DataColumn(x.ColumnName, x.DataType))
.ToArray();
DataTable joinedTable = new DataTable();
joinedTable.Columns.AddRange(allColumnsClone);
var first =
LeftOuterJoin(joinColumnNames, pullX, pullY);
var resultRows = new List<DataRow>();
foreach (var item in first)
{
DataRow newRow = joinedTable.NewRow();
foreach (DataColumn col in joinedTable.Columns)
{
var value = pullX.Columns.Contains(col.ColumnName)
? item.Row1[col.ColumnName]
: item.Row2[col.ColumnName];
newRow[col.ColumnName] = value;
}
resultRows.Add(newRow);
}
var second =
LeftOuterJoin(joinColumnNames, pullY, pullX);
foreach (var item in second)
{
DataRow newRow = joinedTable.NewRow();
foreach (DataColumn col in joinedTable.Columns)
{
var value = pullY.Columns.Contains(col.ColumnName)
? item.Row1[col.ColumnName]
: item.Row2[col.ColumnName];
newRow[col.ColumnName] = value;
}
resultRows.Add(newRow);
}
var uniqueRows =
resultRows
.Distinct(
new MyEqualityComparer(
joinedTable.Columns
.Cast<DataColumn>()
.Select(x => x.ColumnName)
.ToArray()));
foreach (var uniqueRow in uniqueRows)
joinedTable.Rows.Add(uniqueRow);
return joinedTable;
}
Please note also how I clone the columns. You cannot use the same column object in two tables.
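The column cloning point can be illustrated with a small sketch. The table and column names here are made up for the example. Adding a DataColumn that already belongs to one table to another table is expected to throw an ArgumentException, so the sketch copies the name and type into new DataColumn objects instead.

```csharp
using System;
using System.Data;

var source = new DataTable("Source");
source.Columns.Add("Country", typeof(string));
source.Columns.Add("Sales", typeof(decimal));

var target = new DataTable("Target");

bool reuseFailed = false;
try
{
    // This column already belongs to 'source', so the add is expected to fail.
    target.Columns.Add(source.Columns["Country"]);
}
catch (ArgumentException)
{
    reuseFailed = true;
}

// Create fresh columns that carry over only the name and data type.
foreach (DataColumn col in source.Columns)
    target.Columns.Add(new DataColumn(col.ColumnName, col.DataType));

Console.WriteLine($"Reuse failed: {reuseFailed}, target columns: {target.Columns.Count}");
```

When the whole schema of a single table is needed, DataTable.Clone() is another option, since it copies the structure without the rows.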

How to use Inner Join and then fill a DataSet with result of the join?

In short, I have two tables, Consequents and Atomic propositions:
AtomicP table
ID | Proposition |
1 | A |
1 | B |
1 | C |
2 | D |
2 | E |
Consequent Table
ID | Consequent |
1 | A |
2 | B |
All I want to do is implement an inner join that gives me all the values where the ID is the same in both tables, i.e.:
AtomicP Table "A" "B" "C" -> "A" Consequent Table
and then save the result of the inner join in a DataSet, or in another data structure if one would be better.
Best regards.
Assuming the destination table has the columns id, proposition and consequent:
insert into newtable (id, proposition, consequent) select atomicP.id, atomicP.proposition, consequent.consequent from atomicP, consequent where atomicP.id = consequent.id
public class Proposition
{
public int Id;
public string Value;
public Proposition(int id, string value){
Id = id;
Value = value;
}
}
public class Consequent
{
public int Id;
public string Value;
public Consequent(int id, string value){
Id = id;
Value = value;
}
}
var atomicP = new List<Proposition>{
new Proposition(1, "A"),
new Proposition(1, "B"),
new Proposition(1, "C"),
new Proposition(2, "D"),
new Proposition(2, "E"),
};
var consequents = new List<Consequent>{
new Consequent(1, "A"),
new Consequent(2, "B"),
};
// LINQ join clauses use the "equals" keyword, not the == operator.
var query = from proposition in atomicP
join consequent in consequents on proposition.Id equals consequent.Id
select proposition.Value;
return query.ToList();
use this function
private DataTable JoinDataTables(DataTable t1, DataTable t2, params Func<DataRow, DataRow, bool>[] joinOn)
{
DataTable result = new DataTable();
foreach (DataColumn col in t1.Columns)
{
if (result.Columns[col.ColumnName] == null)
result.Columns.Add(col.ColumnName, col.DataType);
}
foreach (DataColumn col in t2.Columns)
{
if (result.Columns[col.ColumnName] == null)
result.Columns.Add(col.ColumnName, col.DataType);
}
foreach (DataRow row1 in t1.Rows)
{
var joinRows = t2.AsEnumerable().Where(row2 =>
{
foreach (var parameter in joinOn)
{
if (!parameter(row1, row2)) return false;
}
return true;
});
foreach (DataRow fromRow in joinRows)
{
DataRow insertRow = result.NewRow();
foreach (DataColumn col1 in t1.Columns)
{
insertRow[col1.ColumnName] = row1[col1.ColumnName];
}
foreach (DataColumn col2 in t2.Columns)
{
insertRow[col2.ColumnName] = fromRow[col2.ColumnName];
}
result.Rows.Add(insertRow);
}
}
return result;
}
An example of how you might use this:
var test = JoinDataTables(Consequents, Atomic,
(row1, row2) =>
row1.Field<int>("ID") == row2.Field<int>("ID"));
I assume you want to do the join in C# and get a DataTable (the question is a bit unclear).
This code snippet joins two DataTables using LINQ and inserts the result into another table.
DataTable results = new DataTable();
results.Columns.Add("ID", typeof(int));
results.Columns.Add("Proposition", typeof(string));
results.Columns.Add("Consequent", typeof(string));
// LINQ queries are lazy, so ToList() is needed to actually run the query and load the rows into results.
var result1 = (from arow in AtomicP.AsEnumerable()
join con in Consequent.AsEnumerable()
on arow.Field<int>("ID") equals con.Field<int>("ID")
select results.LoadDataRow(new object[]
{
arow.Field<int>("ID"),
arow.Field<string>("Proposition"),
con.Field<string>("Consequent")
}, false)).ToList();
Now we can read the joined data by iterating through results.
foreach(DataRow row in results.Rows)
{
foreach(DataColumn column in results.Columns)
{
//Console.WriteLine(row[column]);
}
}

How to select the data from datatable which not in an IEnumerable

I have a DataTable dbcrs and I want to get only the data which is not in the following enumerable:
IEnumerable<Crs> res
Note : the key in both is id.
Here is my suggestion:
// Assumes the id column holds int values.
var result = dbcrs.AsEnumerable().Where(row => res.FirstOrDefault(resItem => resItem.Id == row.Field<int>("id")) == null);
First you need to use AsEnumerable() in order to query against the DataTable's Rows collection, then use !Contains as not in like this:
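// Assumes the id column holds int values; a DataRow has no Id property, so read it with Field<int>.
var query = from r in dbcrs.AsEnumerable()
where !(from s in res select s.Id)
.Contains(r.Field<int>("id"))
select r;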
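A variation on the !Contains filter above, sketched under a few assumptions: the id column is an int, res is stood in for by an array of tuples instead of Crs objects, and the names knownIds and missing are made up for the example. Collecting the ids into a HashSet first avoids rescanning res for every row, and the guard matters because CopyToDataTable throws an InvalidOperationException when the source has no rows.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

var dbcrs = new DataTable();
dbcrs.Columns.Add("id", typeof(int));
dbcrs.Columns.Add("name", typeof(string));
dbcrs.Rows.Add(1, "First");
dbcrs.Rows.Add(2, "Second");
dbcrs.Rows.Add(5, "Fifth");

// Stand-in for IEnumerable<Crs>: only the Id member is used by the filter.
var res = new[] { (Id: 2, Description: "Second"), (Id: 4, Description: "Fourth") };

// Collect the ids once so each row needs only a hash lookup.
var knownIds = new HashSet<int>(res.Select(s => s.Id));

var missing = dbcrs.AsEnumerable()
    .Where(r => !knownIds.Contains(r.Field<int>("id")))
    .ToList();

// Fall back to an empty table with the same schema when nothing is left.
DataTable result = missing.Count > 0 ? missing.CopyToDataTable() : dbcrs.Clone();

foreach (DataRow row in result.Rows)
    Console.WriteLine($"{row["id"]}, {row["name"]}");
```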
An example of doing this with Except and IEquatable<>
A benefit of this way is that you can define what you mean by "Equals", so that two lists which may have the same ID's but are NOT equal can still be used.
e.g. You get data from two tables, so the Id's can repeat but some other properties define if they are actually equal.
class Crs:IEquatable<Crs>
{
public int Id { get; set; }
public string Description { get; set; }
public bool Equals(Crs other)
{
if (Object.ReferenceEquals(other, null))
return false;
if (Object.ReferenceEquals(this, other))
return true;
return Id.Equals(other.Id) && string.Equals(Description, other.Description);
}
public override int GetHashCode()
{
int hashId = Id.GetHashCode();
int hashDescription = Description == null ? 0 : Description.GetHashCode();
return hashId ^ hashDescription;
}
}
internal static void RunMe()
{
var dataTable = new List<Crs>(){
new Crs{Id=1, Description="First"},
new Crs{Id=2, Description="Second"},
new Crs{Id=5, Description="Fifth"}
};
var enumerable = new List<Crs>(){
new Crs{Id=2, Description="Second"},
new Crs{Id=4, Description="Fourth"}
};
var distinct = dataTable.Except(enumerable);
distinct.ToList().ForEach(d => Console.WriteLine("{0}: {1}", d.Id, d.Description));
}
DataTable dt = new DataTable();
dt.Columns.AddRange(new DataColumn[]
{
new DataColumn("Id", typeof(System.Int32)),
new DataColumn("Name", typeof(System.String))
});
dt.Rows.Add (new Object[]{1,"Test"});
dt.Rows.Add(new Object[] {2, "Test" });
var l = new Int32[] { 2, 4 };
var l1 = dt.AsEnumerable().Where(p1 => Array.IndexOf(l, p1.Field<Int32>(0))<0).CopyToDataTable();
This returns one row: the DataTable and the array have only the value 2 in common, so that row is filtered out and the remaining row is kept. The output will be
1, Test
