How to compare 2 columns in different datatables - c#

table1
id | fileName | fileDateTime
1 | somefile | somedatetime
2 | somefile2 | somedatetime2
table2
id | fileName | fileDateTime
| somefile1 | somedatetime1
| somefile2 | somedatetime2
output table3
id | fileName | fileDatetime
| somefile1 | somedatetime1
I want to compare the two tables (columns 2 and 3 only) and keep only the rows that are not in both tables. There is no ID value in the second table. I then plan to parse the data in each file and add the file info to the database to record that the file has been parsed. I am having trouble comparing the two fields; this does not seem to work:
for (int i = 0; i < finalTable.Rows.Count; i++)
{
for (int r = 0; r < filesTable.Rows.Count; i++)
{
if (finalTable.Rows[i][2] == filesTable.Rows[r][2])
{
finalTable.Rows.Remove(finalTable.Rows[i]);
}
}
}

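Assuming the value is a string, you could just do
// Iterate backwards so that removing a row does not shift rows that have not been visited yet.
for (int i = finalTable.Rows.Count - 1; i >= 0; i--)
{
for (int r = 0; r < filesTable.Rows.Count; r++)
{
// Field<string> returns a typed value, so == compares the strings rather than object references.
if (finalTable.Rows[i].Field<string>(2) == filesTable.Rows[r].Field<string>(2))
{
finalTable.Rows.Remove(finalTable.Rows[i]);
break; // row i is gone, move on to the next outer row
}
}
}
If it's another type, just change <string> to the real type.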

You can use LINQ to achieve it, as shown below.
This assumes your fields are strings; for other data types, cast them accordingly:
using System.Data; // AsEnumerable() and CopyToDataTable() come from System.Data.DataSetExtensions
using System.Linq;
......
......
// Merge both tables' data and group the rows by the values of the compared columns
dt1.Merge(dt2);
var g = dt1.AsEnumerable()
.GroupBy(x => x[1]?.ToString() + "|" + x[2]?.ToString()) // Build the unique key here from columns 2 and 3; I used string concatenation
.ToDictionary(x => x.Key, y => y.ToList());
var unique = dt1.Clone();
foreach (var e in g)
{
if (e.Value.Count() == 1) // In only one of the tables (symmetric difference)
{
e.Value.CopyToDataTable(unique, LoadOption.OverwriteChanges);
}
if (e.Value.Count() == 2)
{
// In both tables (intersection)
}
}
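The original loop fails for two reasons that are visible in the question's code: the inner loop increments i instead of r, and finalTable.Rows[i][2] returns object, so == compares references rather than values. A minimal sketch of a set-based alternative follows. It assumes the columns are named fileName and fileDateTime as in the tables above and that comparing their string forms is acceptable; the names TableDiff and RowsNotIn are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TableDiff
{
    // Returns a copy of the rows in 'candidates' whose (fileName, fileDateTime)
    // pair does not occur in 'known'. The id column is ignored.
    public static DataTable RowsNotIn(DataTable candidates, DataTable known)
    {
        // Collect every key pair that is already known.
        var seen = new HashSet<(string, string)>();
        foreach (DataRow row in known.Rows)
        {
            seen.Add((Convert.ToString(row["fileName"]),
                      Convert.ToString(row["fileDateTime"])));
        }

        // Clone() copies the schema only, so the result starts empty.
        DataTable result = candidates.Clone();
        foreach (DataRow row in candidates.Rows)
        {
            var key = (Convert.ToString(row["fileName"]),
                       Convert.ToString(row["fileDateTime"]));
            if (!seen.Contains(key))
            {
                result.ImportRow(row);
            }
        }
        return result;
    }
}
```

Calling TableDiff.RowsNotIn(table2, table1) with the sample data would leave only the somefile1 row. Neither input table is modified, which avoids the problem of removing rows while iterating.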


How to split a string with separator into 5 columns [duplicate]

How can I extract this whole string into 5 columns?
"1;2;3;4;5;A;AAA;AA;AAAA;AAAAA;NA;NA;PASS;YES;NO;test1;test2;test3;test4;test5;Word4;Word5;Word3;Word2;Word1"
Like this:
And I want to put the final result in a List by rows;
eg:
result 1 =
result 2 =
The question does not stipulate a final data structure.
Let's suppose we want something like this at the end:
List<string[]>
{
string[]{ "1", "A", "NA", "test1", "Word4" },
string[]{ "2", "AAA", "NA", "test2", "Word5" },
...
}
A more strongly typed data structure would be more suitable.
Let's also suppose that no value in the input string contains the separator ;
var data = "1;2;3;4;5;A;AAA;AA;AAAA;AAAAA;NA;NA;PASS;YES;NO;test1;test2;test3;test4;test5;Word4;Word5;Word3;Word2;Word1";
var splitData = data.Split(';');
var columnCount = 5;
// We check that the input data has the right number of elements (multiple of columnCount).
if (splitData.Length % columnCount != 0)
{
throw new InvalidOperationException();
}
var results = splitData
.Select((columnVal, i) => (columnVal, i)) // projection of each element including its index.
.GroupBy( x => x.i % columnCount, // group elements by 'row' [0,5,10,15,20][1,6,11,16,21][2,7,12,17,22]...
x => x.columnVal, // selecting only the value (we don't need the index anymore).
(_, columns) => columns.ToArray()) // putting all column values for each row in an array, here we could map to a different data structure.
.ToList();
// the 'results' variable has the target data structure.
If we display the parsed data on the console:
for (var i = 0; i < results.Count; i++)
{
var row = results[i];
// the value of column 'j' for this row is in 'row[j]'
var columnStr = string.Join(" | ", row);
Console.WriteLine($"Row {i} -> {columnStr}");
}
/*
Row 0 -> 1 | A | NA | test1 | Word4
Row 1 -> 2 | AAA | NA | test2 | Word5
Row 2 -> 3 | AA | PASS | test3 | Word3
Row 3 -> 4 | AAAA | YES | test4 | Word2
Row 4 -> 5 | AAAAA | NO | test5 | Word1
*/
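As a sketch of the "more structured" option mentioned above, each row could be mapped to a small class instead of a string[]. The question does not name the columns, so ParsedRow, RowParser and the property names below are purely hypothetical. The sketch assumes, as the answer does, that the input lists all values of column 1 first, then all values of column 2, and so on.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type; the property names are placeholders.
public sealed class ParsedRow
{
    public string Number { get; set; }
    public string Code { get; set; }
    public string Status { get; set; }
    public string Test { get; set; }
    public string Word { get; set; }
}

public static class RowParser
{
    private const int FieldCount = 5;

    public static List<ParsedRow> Parse(string data)
    {
        string[] parts = data.Split(';');
        if (parts.Length % FieldCount != 0)
        {
            throw new ArgumentException("Expected a multiple of 5 values.", nameof(data));
        }

        // The input is column-major: field k of row r sits at index k * rowCount + r.
        int rowCount = parts.Length / FieldCount;
        var rows = new List<ParsedRow>(rowCount);
        for (int r = 0; r < rowCount; r++)
        {
            rows.Add(new ParsedRow
            {
                Number = parts[r],
                Code = parts[rowCount + r],
                Status = parts[2 * rowCount + r],
                Test = parts[3 * rowCount + r],
                Word = parts[4 * rowCount + r],
            });
        }
        return rows;
    }
}
```

With the sample string, RowParser.Parse(data)[0] would hold 1, A, NA, test1 and Word4.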
Here is the solution with assertion tests
string input = "1;2;3;4;5;A;AAA;AA;AAAA;AAAAA;NA;NA;PASS;YES;NO;test1;test2;test3;test4;test5;Word4;Word5;Word3;Word2;Word1";
var elementSplit = input.Split(';');
int columnNumber = 5;
List<string> result = new List<string>();
for (int columnIndex = 0; columnIndex < columnNumber; columnIndex++)
{
StringBuilder row = new StringBuilder();
for (int i = columnIndex; i < elementSplit.Length; i+= columnNumber)
{
row.Append(elementSplit[i]);
row.Append(";");
}
row.Remove(row.Length - 1, 1);
result.Add(row.ToString());
}
Assert.AreEqual("1;A;NA;test1;Word4", result[0]);
Assert.AreEqual("2;AAA;NA;test2;Word5", result[1]);

Remove rows and adjust number

I am trying to remove objects from a list where a certain property value is identical to the previous/next object's property value. If such an object is found, I need to update the nested objects' values.
Example:
Level | Text
1 | General
2 | Equipment
3 | Field Staff
2 | Scheduling
3 | Scheduling
4 | Deadlines
4 | Windows
1 | Specialities
In the example above I want to remove the second Scheduling and change the Level of both Deadlines and Windows to 3.
I tried to look ahead and compare with the next object in the list while keeping a counter, but it didn't work.
int counter = 0;
for (int i = 0; i < notes.Count(); i++)
{
if (i <= notes.Count() - 1)
{
var currentRow = notes.ElementAt(i);
var nextRow = notes.ElementAt(i + 1);
if (currentRow.Text.Equals(nextRow.Text))
{
notes.Remove(nextRow);
counter++;
}
else
{
notes.ElementAt(i).Level = notes.ElementAt(i).Level - counter;
counter = 0;
}
}
}
Could anyone point me in the correct direction?
You can do it with LINQ:
1 - Get the distinct lines
var distinctList = notes
.GroupBy(p => p.Text)
.Select(v => v.First());
2 - Get the levels of the deleted lines
IEnumerable<int> deletedLevel = notes
.Except(distinctList)
.Select(l => l.Level);
3 - Update your distinct list
foreach(int level in deletedLevel)
{
distinctList
.Where(l => l.Level >= level + 1)
.ToList()
.ForEach(item => { item.Level -= 1; });
}
Result :
Level | Text
1 | General
2 | Equipment
3 | Field Staff
2 | Scheduling
3 | Deadlines
3 | Windows
1 | Specialities
I hope that helps.
Try this:
var query = notesList.GroupBy(x => x.Text)
.Where(g => g.Count() > 1)
.Select(y => y.Key)
.Select(y => new { Element = y, Index = Array.FindIndex<Notes>(notesList.ToArray(), t => t.Text ==y) })
.ToList();
var filteredList = new List<Notes>();
foreach (var duplicate in query)
{
filteredList = notesList.Where((n, index) => index < duplicate.Index + 1).ToList();
var newElems = notesList.Where((n, index) => index > duplicate.Index + 1).Select(t =>
new Notes {Level = t.Level == 1 ? 1 : t.Level - 1, Text = t.Text});
filteredList.AddRange(newElems);
}
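Both answers above match duplicates anywhere in the list. A single-pass sketch that only collapses an item repeating the text of the item kept immediately before it, and that lifts only the removed item's descendants, might look like the following. It assumes a simple class with Level and Text properties; Note, NoteCollapser and Collapse are hypothetical names, and the input list is left unchanged.

```csharp
using System.Collections.Generic;

// Hypothetical item type with the two properties shown in the question.
public sealed class Note
{
    public int Level { get; set; }
    public string Text { get; set; }
}

public static class NoteCollapser
{
    public static List<Note> Collapse(IEnumerable<Note> notes)
    {
        var result = new List<Note>();
        // Original levels of removed items whose subtrees are still open.
        var removedLevels = new Stack<int>();
        string previousText = null;

        foreach (Note note in notes)
        {
            // An item at or above a removed item's level closes that subtree.
            while (removedLevels.Count > 0 && note.Level <= removedLevels.Peek())
            {
                removedLevels.Pop();
            }

            // Same text as the last kept item: drop it and remember its level.
            if (note.Text == previousText)
            {
                removedLevels.Push(note.Level);
                continue;
            }

            // Each removed ancestor lifts this item by one level.
            result.Add(new Note
            {
                Level = note.Level - removedLevels.Count,
                Text = note.Text,
            });
            previousText = note.Text;
        }
        return result;
    }
}
```

For the example list this drops the second Scheduling, moves Deadlines and Windows to level 3, and leaves Specialities at level 1 because it closes the removed item's subtree.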

Delete duplicates within a single collection c#

I'm new to C#, here is my problem
My expected result is removing the entire row of ABC.
Both rows (with duplicate ABC) will be removed.
I need to do it the iterative way. Can't use distinct and stuff as recommended by the other post.
I tried to remove the duplicates but it didn't work.
So I decided to add the non-duplicates to a new collection.
But that isn't working either.
CollectionIn --> My sample collection
| Folder | Times |
------------------------
| ABC | 3 |
| CDE | 2 |
| ACD | 2 |
| ABC | 1 |
CollectionOut = new DataTable();
CollectionOut.Columns.Add("Folder");
CollectionOut.Columns.Add("Times");
bool duplicate = false;
for (int i = 0; i < CollectionIn.Rows.Count; i++)
{
string value1 = CollectionIn.Rows[i].ItemArray[0].ToString().ToLower();
for (int z = 0; z < i; z++)
{
string value2 = CollectionIn.Rows[z].ItemArray[0].ToString().ToLower();
if (value1 == value2)
{
duplicate = true;
break;
}
}
if (!duplicate)
{
CollectionOut.Rows.Add(value1);
}
}
Can anyone help take a look? Thanks!
Since you don't want to use Distinct, you can do it with LINQ's GroupBy instead:
var newList = myList.GroupBy(s => s).Where(g => g.Count() == 1).Select(g => g.Key).ToList();
I would use Linq-To-DataTable:
List<DataRow> duplicates = CollectionIn.AsEnumerable()
.GroupBy(r => r.Field<string>("Folder"))
.Where(g => g.Count() > 1)
.SelectMany(grp => grp)
.ToList();
duplicates.ForEach(CollectionIn.Rows.Remove);
This will remove the duplicates from the original collection (DataTable) without creating a new one.
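Since the question asks for an iterative approach, here is a two-pass sketch without LINQ. In the question's loop the duplicate flag is never reset between rows, and each row is only compared with the rows before it, so the first occurrence of ABC would always be kept. The sketch assumes the key column is named Folder and that matching is case-insensitive, as the ToLower() calls in the question suggest; FolderFilter and KeepUniqueFolders are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class FolderFilter
{
    // Returns a new table holding only the rows whose Folder value occurs exactly once.
    public static DataTable KeepUniqueFolders(DataTable input)
    {
        // Pass 1: count how often each folder name occurs.
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (DataRow row in input.Rows)
        {
            string folder = Convert.ToString(row["Folder"]);
            counts.TryGetValue(folder, out int seen); // seen stays 0 when the key is absent
            counts[folder] = seen + 1;
        }

        // Pass 2: copy the rows that were counted exactly once.
        DataTable output = input.Clone();
        foreach (DataRow row in input.Rows)
        {
            if (counts[Convert.ToString(row["Folder"])] == 1)
            {
                output.ImportRow(row);
            }
        }
        return output;
    }
}
```

With the sample collection this would keep the CDE and ACD rows, including their Times values, and drop both ABC rows.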

Retrieve pairs from a list that are within a certain time of one and another

I am trying to get groups that have a pair of items in it.
public enum eState
{
On,
Off,
blah,
}
public class TestA
{
public DateTime Time {get; set;}
public eState State {get; set;}
}
I have a list of TestA items. If an item's state is On, it will be followed by an Off. There could be other items between the On and the Off. I need to get a list of the On/Off pairs that are within 7 minutes of each other.
For instance if my List looked like:
1) 5/30/2014 8:30 | blah
2) 5/30/2014 8:32 | On
3) 5/30/2014 8:33 | blah
4) 5/30/2014 8:34 | blah
5) 5/30/2014 8:35 | Off
6) 5/30/2014 8:36 | blah
7) 5/30/2014 8:37 | On
8) 5/30/2014 8:55 | blah
9) 5/30/2014 8:56 | Off
10) 5/30/2014 8:57 | On
11) 5/30/2014 8:58 | Off
12) 5/30/2014 8:59 | blah
It should return these Pairs:
2,5 and 10,11. Both of these On and Off pairs are within 7 minutes of each other.
There will never be a situation where there is an Off before a corresponding On.
If you are not planning on using the linq expression against a dataprovider (database) I would recommend using a for loop to iterate the list.
Here is a linq implementation if you are still interested:
var query = from x in list
from z in list.Where(y => y.State == eState.Off)
.Where(y => y.Time > x.Time)
.Where(y => y.Time <= x.Time.AddMinutes(7))
.OrderBy(y => y.Time)
.Take(1)
where x.State == eState.On
select new
{
x, z
};
// Single pass: remember the most recent unmatched On and pair it with the next Off.
TestA lastOn = null;
int lastIndex = -1;
for (int i = 0; i < tests.Length; i++)
{
switch (tests[i].State)
{
case eState.On:
if (lastOn == null)
{
lastOn = tests[i];
lastIndex = i;
}
break;
case eState.Off:
if (lastOn != null)
{
// Report the pair only if the Off came less than 7 minutes after the On.
if (tests[i].Time - lastOn.Time < TimeSpan.FromMinutes(7))
Console.WriteLine(lastIndex + "," + i);
lastOn = null;
}
break;
}
}

Merging two DataTables in memory and grouping them to get the sum of columns. Using LINQ but kind of lost here

I have two tables in which two columns are fixed. Some of the other columns are identical and some are new; the columns are dynamic.
I have to do this in code, and I am trying loops and conditions.
What I want is to generate a report that meets the following conditions:
All columns in table1 and table2 must be present.
If a column is common to both tables and has a value, it should be added to the value in the matching row of the other table.
If a row is present in one table but not in the other, it should be included.
Example data
Table1
ID | NAME | P1 | P2 | P3
----------------------------
1 | A1 | 1 | 2 | 3.3
2 | A2 | 4.4 | 5 | 6
TABLE 2
ID | NAME | P1 | P2 | P4
---------------------------
1 | A1 | 10 | 11 | 12
2 | A2 | 12 | 14 | 15
3 | A3 | 16 | 17 | 18
Expected output:
ID | NAME | P1 | P2 | P3 | P4
---------------------------------
1 | A1 | 11 | 13 | 3.3 | 12
2 | A2 | 16.4 | 19 | 6 | 15
3 | A3 | 16 | 17 | null| 18
Progress so far:
First I merged the two tables into table1
table1.Merge(table2);
Then I tried to group over it
var query = from row in table1.AsEnumerable()
group row by new
{
ID = row.Field<int>("ID"),
Name = row.Field<string>("Name")
}
into grp
select new
{
ID = grp.Key.ID,
Name = grp.Key.Name,
Phase1 = grp.Sum(r => r.Field<decimal>("P1"))
};
I have modified this code to get a datatable. Please see attached cs file.
This works, but since the number of columns is dynamic, I guess I have to repeat it for each of the other columns and then join all these small tables, each of which adds one column.
How can I merge all those small tables?
I am lost here. Is there another way? This feels like a clumsy approach.
Any help would be appreciated.
Attached File:
http://dl.dropbox.com/u/26252340/Program.cs
You want to use an implementation of a full outer join. Something like what follows.
Some setup so you can try this yourself:
DataTable t1 = new DataTable();
t1.Columns.Add("ID", typeof(int));
t1.Columns.Add("Name", typeof(string));
t1.Columns.Add("P1", typeof(double));
t1.Columns.Add("P2", typeof(double));
t1.Columns.Add("P3", typeof(double));
DataRow dr1 = t1.NewRow();
dr1["ID"] = 1;
dr1["Name"] = "A1";
dr1["P1"] = 1;
dr1["P2"] = 2;
dr1["P3"] = 3.3;
t1.Rows.Add(dr1);
DataRow dr2 = t1.NewRow();
dr2["ID"] = 2;
dr2["Name"] = "A2";
dr2["P1"] = 4.4;
dr2["P2"] = 5;
dr2["P3"] = 6;
t1.Rows.Add(dr2);
DataTable t2 = new DataTable();
t2.Columns.Add("ID", typeof(int));
t2.Columns.Add("Name", typeof(string));
t2.Columns.Add("P1", typeof(double));
t2.Columns.Add("P2", typeof(double));
t2.Columns.Add("P4", typeof(double));
DataRow dr3 = t2.NewRow();
dr3["ID"] = 1;
dr3["Name"] = "A1";
dr3["P1"] = 10;
dr3["P2"] = 11;
dr3["P4"] = 12;
t2.Rows.Add(dr3);
DataRow dr4 = t2.NewRow();
dr4["ID"] = 2;
dr4["Name"] = "A2";
dr4["P1"] = 12;
dr4["P2"] = 14;
dr4["P4"] = 15;
t2.Rows.Add(dr4);
DataRow dr5 = t2.NewRow();
dr5["ID"] = 3;
dr5["Name"] = "A3";
dr5["P1"] = 16;
dr5["P2"] = 17;
dr5["P4"] = 18;
t2.Rows.Add(dr5);
The queries look like:
var ids = (from r1 in t1.AsEnumerable() select new { ID = r1["ID"], Name = r1["Name"] }).Union(
from r2 in t2.AsEnumerable() select new { ID = r2["ID"], Name = r2["Name"] });
var query = from id in ids
join r1 in t1.AsEnumerable() on id equals new { ID = r1["ID"], Name = r1["Name"] } into left
from r1 in left.DefaultIfEmpty()
join r2 in t2.AsEnumerable() on id equals new { ID = r2["ID"], Name = r2["Name"] } into right
from r2 in right.DefaultIfEmpty()
select new
{
ID = (r1 == null) ? r2["ID"] : r1["ID"],
Name = (r1 == null) ? r2["Name"] : r1["Name"],
P1 = (r1 == null) ? r2["P1"] : (r2 == null) ? r1["P1"] : (double)r1["P1"] + (double)r2["P1"], // r2 is null when the row exists only in t1
P2 = (r1 == null) ? r2["P2"] : (r2 == null) ? r1["P2"] : (double)r1["P2"] + (double)r2["P2"],
P3 = (r1 == null) ? null : r1["P3"],
P4 = (r2 == null) ? null : r2["P4"]
};
Got this solved by
table1.Merge(table2, true, MissingSchemaAction.Add);
finalTable = table1.Clone();
finalTable.PrimaryKey = new DataColumn[] { finalTable.Columns["ID"], finalTable.Columns["Name"] };
List<string> columnNames = new List<string>();
for (int colIndex = 2; colIndex < finalTable.Columns.Count; colIndex++)
{
columnNames.Add(finalTable.Columns[colIndex].ColumnName);
}
foreach (string cols in columnNames)
{
var temTable = new DataTable();
temTable.Columns.Add("ID", typeof(int));
temTable.Columns.Add("Name", typeof(string));
temTable.Columns.Add(cols, typeof(decimal));
(from row in table1.AsEnumerable()
group row by new { ID = row.Field<int>("ID"), Name = row.Field<string>("Name") } into grp
orderby grp.Key.ID
select new
{
ID = grp.Key.ID,
Name = grp.Key.Name,
cols = grp.Sum(r => r.Field<decimal?>(cols)), // sum of the current dynamic column for this ID/Name
})
.Aggregate(temTable, (dt, r) => { dt.Rows.Add(r.ID, r.Name, r.cols); return dt; });
finalTable.Merge(temTable, false, MissingSchemaAction.Ignore);
}
Since the columns are dynamic you'll need to return an object with dynamic properties. You could do this with an ExpandoObject.
The following code is ugly in many ways - I would do some massive refactoring before letting it go - but it gets the job done and might help you out to achieve what you want.
(Sorry for using the other linq syntax.)
var query = table1.AsEnumerable()
.GroupBy(row => new
{
ID = row.Field<int>("ID"),
Name = row.Field<string>("Name")
})
.Select(grp =>
{
dynamic result = new ExpandoObject();
var dict = result as IDictionary<string, object>;
result.ID = grp.Key.ID;
result.Name = grp.Key.Name;
foreach (DataRow row in grp)
{
foreach (DataColumn column in table1.Columns)
{
string columnName = column.ColumnName;
if (columnName.Equals("ID") || columnName.Equals("Name"))
continue;
//else
if (!dict.Keys.Contains(columnName))
dict[columnName] = row[columnName];
else
{
if (row[columnName] is System.DBNull)
continue;
if (dict[columnName] is System.DBNull)
{
dict[columnName] = row[columnName];
continue;
}
//else
dict[columnName] = (decimal)dict[columnName] + (decimal)row[columnName];
}
}
}
return result;
});
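If the goal is a DataTable rather than dynamic objects, the same idea can be sketched with plain loops and a dictionary keyed by (ID, Name). This is only a sketch under stated assumptions: the key columns are named ID (int) and Name (string) as in the question, every other column holds a numeric value, and sums are stored as double. TableSummer and SumByKey are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TableSummer
{
    // Combines tables that share the key columns "ID" and "Name".
    // All other columns are summed per key; missing values stay DBNull.
    public static DataTable SumByKey(params DataTable[] tables)
    {
        var result = new DataTable();
        result.Columns.Add("ID", typeof(int));
        result.Columns.Add("Name", typeof(string));
        var rowsByKey = new Dictionary<(int, string), DataRow>();

        foreach (DataTable table in tables)
        {
            // Add every value column the result has not seen yet.
            foreach (DataColumn column in table.Columns)
            {
                if (!result.Columns.Contains(column.ColumnName))
                {
                    result.Columns.Add(column.ColumnName, typeof(double));
                }
            }

            foreach (DataRow source in table.Rows)
            {
                var key = ((int)source["ID"], (string)source["Name"]);
                if (!rowsByKey.TryGetValue(key, out DataRow target))
                {
                    target = result.NewRow();
                    target["ID"] = key.Item1;
                    target["Name"] = key.Item2;
                    result.Rows.Add(target);
                    rowsByKey[key] = target;
                }

                foreach (DataColumn column in table.Columns)
                {
                    string name = column.ColumnName;
                    if (name == "ID" || name == "Name" || source.IsNull(column))
                    {
                        continue;
                    }
                    double value = Convert.ToDouble(source[column]);
                    target[name] = target.IsNull(name) ? value : (double)target[name] + value;
                }
            }
        }
        return result;
    }
}
```

Calling TableSummer.SumByKey(table1, table2) on the unmerged example tables would produce one row per ID/Name pair with columns P1 to P4, leaving P3 empty for the row that exists only in the second table. Because it works on the original tables, no prior Merge call is needed.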
