Outer joins in LINQ query - C#

This code works well: it matches each Unit_No with its VehicleName, but it only shows the matches. I also need the rows that don't match.
I am loading one DataGrid in WPF with SQL Server data and another DataGrid with Oracle data.
private void GetSQLOraclelinqData()
{
var TstarData = GetTrackstarTruckData();
var M5Data = GetM5Data();
DataTable ComTable = new DataTable();
foreach (DataColumn OraColumn in M5Data.Columns)
{
ComTable.Columns.Add(OraColumn.ColumnName, OraColumn.DataType);
}
foreach (DataColumn SQLColumn in TstarData.Columns)
{
if (SQLColumn.ColumnName == "VehicleName")
ComTable.Columns.Add(SQLColumn.ColumnName + "2", SQLColumn.DataType);
else
ComTable.Columns.Add(SQLColumn.ColumnName, SQLColumn.DataType);
}
var results = M5Data.AsEnumerable().Join(TstarData.AsEnumerable(),
a => a.Field<String>("Unit_No"),
b => b.Field<String>("VehicleName"),
(a, b) =>
{
DataRow row = ComTable.NewRow();
row.ItemArray = a.ItemArray.Concat(b.ItemArray).ToArray();
ComTable.Rows.Add(row);
return row;
}).ToList();
}
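Join only returns pairs that match, which is why the unmatched rows disappear. A common way to keep them is GroupJoin followed by DefaultIfEmpty, which gives a left outer join. The following is only a sketch under assumptions: the class and method names (OuterJoinSketch, LeftOuterJoin) are hypothetical, both key columns are assumed to be strings as in the question, and it keeps unmatched rows from the left table only. To also keep unmatched rows from the other side, the same call could be repeated with the tables swapped.

```csharp
using System;
using System.Data;
using System.Linq;

public static class OuterJoinSketch
{
    // Left outer join: every row of `left` appears once per match in `right`,
    // or once with DBNull in the right-hand columns when nothing matches.
    public static DataTable LeftOuterJoin(
        DataTable left, DataTable right, string leftKey, string rightKey)
    {
        var combined = new DataTable();
        foreach (DataColumn c in left.Columns)
            combined.Columns.Add(c.ColumnName, c.DataType);
        foreach (DataColumn c in right.Columns)
        {
            // Suffix clashing names, similar to the "VehicleName2" handling in the question.
            string name = combined.Columns.Contains(c.ColumnName)
                ? c.ColumnName + "2"
                : c.ColumnName;
            combined.Columns.Add(name, c.DataType);
        }

        // Placeholder values used when a left row has no partner.
        object[] noMatch = Enumerable
            .Repeat<object>(DBNull.Value, right.Columns.Count)
            .ToArray();

        var query = left.AsEnumerable()
            .GroupJoin(right.AsEnumerable(),
                a => a.Field<string>(leftKey),
                b => b.Field<string>(rightKey),
                (a, matches) => new { a, matches })
            .SelectMany(
                x => x.matches.DefaultIfEmpty(),   // yields null when there is no match
                (x, b) => x.a.ItemArray
                    .Concat(b == null ? noMatch : b.ItemArray)
                    .ToArray());

        foreach (object[] values in query)
            combined.Rows.Add(values);
        return combined;
    }
}
```

With the tables from the question, the call would look like OuterJoinSketch.LeftOuterJoin(M5Data, TstarData, "Unit_No", "VehicleName").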

Related

How to convert string separated by new line and comma to DataTable in C#

I have a string like this:
"Product,Price,Condition
Cd,13,New
Book,9,Used
"
Which is being passed like this:
"Product,Price,Condition\r\nCd,13,New\r\nBook,9,Used"
How could I convert it to DataTable?
Trying to do it with this helper function:
DataTable dataTable = new DataTable();
bool columnsAdded = false;
foreach (string row in data.Split(new string[] { "\r\n" }, StringSplitOptions.None))
{
DataRow dataRow = dataTable.NewRow();
foreach (string cell in row.Split(','))
{
string[] keyValue = cell.Split('~');
if (!columnsAdded)
{
DataColumn dataColumn = new DataColumn(keyValue[0]);
dataTable.Columns.Add(dataColumn);
}
dataRow[keyValue[0]] = keyValue[1];
}
columnsAdded = true;
dataTable.Rows.Add(dataRow);
}
return dataTable;
However I don't get that "connecting cells with appropriate columns" part - my cells don't have ~ in string[] keyValue = cell.Split('~'); and I obviously get an IndexOutOfRange at DataColumn dataColumn = new DataColumn(keyValue[0]);
Based on your implementation, I have written the code below. I have not tested it, but you can use the concept.
DataRow dataRow = dataTable.NewRow();
int i = 0;
foreach (string cell in row.Split(','))
{
if (!columnsAdded)
{
DataColumn dataColumn = new DataColumn(cell);
dataTable.Columns.Add(dataColumn);
}
else
{
dataRow[i] = cell;
}
i++;
}
if(columnsAdded)
{
dataTable.Rows.Add(dataRow);
}
columnsAdded = true;
You can do that simply with LINQ (there is also LinqToCSV on NuGet, which you might prefer):
void Main()
{
string data = @"Product,Price,Condition
Cd,13,New
Book,9,Used
";
var table = ToTable(data);
Form f = new Form();
var dgv = new DataGridView { Dock = DockStyle.Fill, DataSource = table };
f.Controls.Add(dgv);
f.Show();
}
private DataTable ToTable(string CSV)
{
DataTable dataTable = new DataTable();
var lines = CSV.Split(new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
foreach (var colname in lines[0].Split(','))
{
dataTable.Columns.Add(new DataColumn(colname));
}
foreach (var row in lines.Where((r, i) => i > 0))
{
dataTable.Rows.Add(row.Split(','));
}
return dataTable;
}
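One caveat with splitting on '\n' alone: if the input uses "\r\n" line endings, as the string in the question does, the last cell of each line keeps a trailing '\r'. A possible variant that splits on both endings is sketched below. The class name CsvTableSketch is hypothetical, and the sketch assumes a simple format with no quoted fields or embedded commas.

```csharp
using System;
using System.Data;
using System.Linq;

public static class CsvTableSketch
{
    // Builds a DataTable from comma-separated text whose first line holds the column names.
    public static DataTable ToTable(string csv)
    {
        var table = new DataTable();

        // "\r\n" is listed first so Windows line endings are consumed whole.
        string[] lines = csv.Split(
            new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length == 0)
            return table;

        foreach (string name in lines[0].Split(','))
            table.Columns.Add(name);

        // Rows.Add throws if a line has more cells than there are columns.
        foreach (string line in lines.Skip(1))
            table.Rows.Add(line.Split(','));

        return table;
    }
}
```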
You can split the given string into a flat string array in one call. Then you can iterate through the array and populate a list of objects.
That part is optional, since you can immediately populate DataTable but I think it's way easier (more maintainable) to work with strongly-typed objects when dealing with DataTable.
string input = "Product,Price,Condition\r\nCd,13,New\r\nBook,9,Used";
string[] deconstructedInput = input.Split(new string[] { "\r\n", "," }, StringSplitOptions.None);
List<Product> products = new List<Product>();
for (int i = 3; i < deconstructedInput.Length; i += 3)
{
products.Add(new Product
{
Name = deconstructedInput[i],
Price = Decimal.Parse(deconstructedInput[i + 1]),
Condition = deconstructedInput[i + 2]
});
}
public class Product
{
public string Name { get; set; }
public decimal Price { get; set; }
public string Condition { get; set; }
}
So the products collection holds two objects, which you can easily iterate over to populate your DataTable.
Note: This requires further checks to avoid possible runtime exceptions, and it is not dynamic. That means it won't work if your input is structured differently.
DataTable dataTable = new DataTable();
dataTable.Columns.Add(new DataColumn(nameof(Product.Name)));
dataTable.Columns.Add(new DataColumn(nameof(Product.Price)));
dataTable.Columns.Add(new DataColumn(nameof(Product.Condition)));
foreach (var product in products)
{
var row = dataTable.NewRow();
row[nameof(Product.Name)] = product.Name;
row[nameof(Product.Price)] = product.Price;
row[nameof(Product.Condition)] = product.Condition;
dataTable.Rows.Add(row);
}

Put LINQ query result to DataTable

I have this query:
var smallExchangeReport = from ex in exchangeProgReport
where !string.IsNullOrEmpty(ex.comment)
group ex by new { ex.siteName } into g
select new SummuryReportTraffic
{
siteName = g.Key.siteName,
exchangeCounter = g.Where(x => x.Prog1ToProg2Check == 1).Count(),
descriptions = (from t in g
group t by new { t.comment, t.siteName } into grp
select new Description
{
title = grp.Key.comment,
numbers = grp.Select(x => x.comment).Count()
})
};
At some point I put it to the dataTable using foreach loop:
foreach (var item in smallExchangeReport)
{
dr = smrTable.NewRow();
foreach (var d in item.descriptions)
{
dr[d.title] = d.numbers;
}
smrTable.Rows.Add(dr);
}
But I need to put the LINQ result to dataTable without using foreach loop.
So I made some changes to my code above according to this link:
DataTable dt = new DataTable();
DataRow dr = dt.NewRow();
IEnumerable<DataRow> smallExchangeReport = from ex in exchangeProgReport.AsEnumerable()
where !string.IsNullOrEmpty(ex.comment)
group ex by new { ex.siteName } into g
select new
{
siteName = g.Key.siteName,
exchangeCounter = g.Where(x => x.Prog1ToProg2Check == 1).Count(),
descriptions = (from t in g.AsEnumerable()
group t by new { t.comment, t.siteName } into grp
select new
{
title = grp.Key.comment,
numbers = grp.Select(x => x.comment).Count()
})
};
// Create a table from the query.
DataTable boundTable = smallExchangeReport.CopyToDataTable<DataRow>();
But on changed LINQ query I get this error:
Cannot implicitly convert type:'System.Collections.Generic.IEnumerable<<anonymous type: string siteName, int exchangeCounter>>' to
'System.Collections.Generic.IEnumerable<System.Data.DataRow>'. An explicit conversion exists (are you missing a cast?)
My question is: how do I cast the query to make it work? I tried to cast the result of the LINQ query to (DataRow), but it didn't work.
In your LINQ query, you are trying to get IEnumerable<DataRow> as the result, but actually you select new objects of an anonymous type: select new { siteName = .... }. This cannot work because your anonymous type cannot be cast to DataRow.
What you need to do is use a function that would populate a DataRow like this:
DataRow PopulateDataRow(
DataTable table,
string siteName,
int exchangeCounter,
IEnumerable<Description> descriptions)
{
var dr = table.NewRow();
// populate siteName and exchangeCounter
// (not sure how your data row is structured, so I leave it to you)
foreach (var d in descriptions)
{
dr[d.title] = d.numbers;
}
return dr;
}
Then, in your LINQ query, use it as follows:
IEnumerable<DataRow> smallExchangeReport =
from ex in exchangeProgReport.AsEnumerable()
where !string.IsNullOrEmpty(ex.comment)
group ex by new { ex.siteName } into g
select PopulateDataRow(
smrTable,
siteName: g.Key.siteName,
exchangeCounter: g.Where(x => x.Prog1ToProg2Check == 1).Count(),
descriptions: (from t in g.AsEnumerable()
group t by new { t.comment, t.siteName } into grp
select new Description {
title = grp.Key.comment,
numbers = grp.Select(x => x.comment).Count()
}
)
);
This solution gets rid of one foreach (on rows) and leaves the other one (on descriptions).
If removing the second foreach is important... I would still leave it inside PopulateDataRow. I don't see an elegant way to remove it. You can call a method from LINQ query which reads like a deterministic function, but actually creates the side effect of setting a column value on a data row, but it doesn't feel right to me.
This may help you.
First, define the table structure:
DataTable tbl = new DataTable();
tbl.Columns.Add("Id");
tbl.Columns.Add("Name");
Then we need a function that creates a DataRow from an anonymous type:
Func<object, DataRow> createRow = (object data) =>
{
var row = tbl.NewRow();
row.ItemArray = data.GetType().GetProperties().Select(a => a.GetValue(data)).ToArray();
return row;
};
Test it with a fake query:
var enumarate = Enumerable.Range(0, 10);
var rows = from i in enumarate
select createRow( new { Id = i, Name = Guid.NewGuid().ToString() });
var dataTable = rows.CopyToDataTable<DataRow>();
You can use this method:
private DataTable ListToDataTable<T>(List<T> objs, string tableName) {
var table = new DataTable(tableName);
var lists = new List<List<object>>();
// init columns
var propertyInfos = new List<PropertyInfo>();
foreach (var propertyInfo in typeof(T).GetProperties()) {
propertyInfos.Add(propertyInfo);
if(propertyInfo.PropertyType.IsEnum || propertyInfo.PropertyType.IsNullableEnum()) {
table.Columns.Add(propertyInfo.Name, typeof(int));
} else {
table.Columns.Add(propertyInfo.Name, Nullable.GetUnderlyingType(propertyInfo.PropertyType) ?? propertyInfo.PropertyType);
}
table.Columns[table.Columns.Count - 1].AllowDBNull = true;
}
// fill rows
foreach(var obj in objs) {
var list = new List<object>();
foreach(var propertyInfo in propertyInfos) {
object currentValue;
if(propertyInfo.PropertyType.IsEnum || propertyInfo.PropertyType.IsNullableEnum()) {
var val = propertyInfo.GetValue(obj);
if(val == null) {
currentValue = DBNull.Value;
} else {
currentValue = (int)propertyInfo.GetValue(obj);
}
} else {
var val = propertyInfo.GetValue(obj);
currentValue = val ?? DBNull.Value;
}
list.Add(currentValue);
}
lists.Add(list);
}
lists.ForEach(x => table.Rows.Add(x.ToArray()));
return table;
}
Edit:
this extension method is used:
public static bool IsNullableEnum(this Type t) {
var u = Nullable.GetUnderlyingType(t);
return u != null && u.IsEnum;
}

Convert IEnumerable string array to datatable

I have a csv file delimited with pipe(|). I am reading it using the following line of code:
IEnumerable<string[]> lineFields = File.ReadAllLines(FilePath).Select(line => line.Split('|'));
Now, I need to bind this to a GridView. So I am creating a dynamic DataTable as follows:
DataTable dt = new DataTable();
int i = 0;
foreach (string[] order in lineFields)
{
if (i == 0)
{
foreach (string column in order)
{
DataColumn _Column = new DataColumn();
_Column.ColumnName = column;
dt.Columns.Add(_Column);
i++;
//Response.Write(column);
//Response.Write("\t");
}
}
else
{
int j = 0;
DataRow row = dt.NewRow();
foreach (string value in order)
{
row[j] = value;
j++;
//Response.Write(column);
//Response.Write("\t");
}
dt.Rows.Add(row);
}
//Response.Write("\n");
}
This works fine. But I want to know if there is a better way to convert IEnumerable<string[]> to a DataTable. I need to read many CSVs like this, so I think the above code might have performance issues.
Starting from .NET 4:
use ReadLines.
DataTable FileToDataTable(string FilePath)
{
var dt = new DataTable();
IEnumerable<string[]> lineFields = File.ReadLines(FilePath).Select(line => line.Split('|'));
dt.Columns.AddRange(lineFields.First().Select(i => new DataColumn(i)).ToArray());
foreach (var order in lineFields.Skip(1))
dt.Rows.Add(order);
return dt;
}
(Edit: instead of this code, use the code from @Jodrell's answer; it avoids enumerating the file twice.)
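The double enumeration happens because the LINQ query is deferred: First() and Skip(1) each start a fresh pass over the source, so with File.ReadLines the file is opened and read from the top again. The sketch below illustrates the effect with a stand-in for File.ReadLines; the names DeferredExecutionSketch, FakeReadLines, and SourceStarts are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Counts how often the lazy source is started from the beginning.
    public static int SourceStarts;

    // Stand-in for File.ReadLines: yields lines lazily and records each fresh start.
    public static IEnumerable<string> FakeReadLines()
    {
        SourceStarts++;
        yield return "Id|Name";
        yield return "1|Cd";
        yield return "2|Book";
    }

    // Mirrors the First() + Skip(1) pattern and reports how many passes it caused.
    public static int CountStartsForFirstThenSkip()
    {
        SourceStarts = 0;
        IEnumerable<string[]> fields = FakeReadLines().Select(l => l.Split('|'));

        string[] header = fields.First();               // first pass over the source
        List<string[]> rows = fields.Skip(1).ToList();  // second pass, from the top

        return SourceStarts;
    }
}
```

Working with a single enumerator, or materializing the sequence once with ToList(), keeps it to one pass.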
Before .NET 4:
use streaming:
DataTable FileToDataTable1(string FilePath)
{
var dt = new DataTable();
using (var st = new StreamReader(FilePath))
{
// process the first line (column headers)
if (st.Peek() >= 0)
{
var order = st.ReadLine().Split('|');
dt.Columns.AddRange(order.Select(i => new DataColumn(i)).ToArray());
}
while (st.Peek() >= 0)
dt.Rows.Add(st.ReadLine().Split('|'));
}
return dt;
}
Since, in your linked example, the file has a header row:
const char Delimiter = '|';
var dt = new DataTable();
using (var m = File.ReadLines(filePath).GetEnumerator())
{
m.MoveNext();
foreach (var name in m.Current.Split(Delimiter))
{
dt.Columns.Add(name);
}
while (m.MoveNext())
{
dt.Rows.Add(m.Current.Split(Delimiter));
}
}
This reads the file in one pass.

Deleting rows from DataTable causes error

I am trying to remove rows that are not needed from a DataTable. Basically, there may be several rows where the itemID is identical. I want to find the rows where the column "failEmail" = "fail", and using the itemID of those rows, remove all rows from the emails DataTable that have the same itemID.
Here is what I have tried:
System.Diagnostics.Debug.Print(emails.Rows.Count.ToString() + " emails!");
// create a list of the email IDs for records that will be deleted
List<DataRow> rows2Delete = new List<DataRow>();
foreach (DataRow dr in emails.Rows)
{
if (dr["failEmail"].ToString().ToLower() == "fail")
{
rows2Delete.Add(dr);
}
}
foreach (DataRow row in rows2Delete)
{
DataRow[] drRowsToCheck =emails.Select("itemID ='" + row["itemID"].ToString() +"'");
foreach (DataRow drCheck in drRowsToCheck)
{
emails.Rows.Remove(drCheck);
emails.AcceptChanges();
}
}
Here is the error message I am getting on the second pass:
This row has been removed from a table and does not have any data.
BeginEdit() will allow creation of new data in this row
How can I do what I need to without throwing errors like that? Is there a better way, like using a LINQ query?
The problem is that when the same itemID has multiple entries with 'fail', you are trying to remove them multiple times.
// 1. Find the Unique itemIDs to remove
var idsToRemove = emails.Select("failEmail = 'fail'").Select (x => x["itemID"]).Distinct();
// 2. Find all the rows that match the itemIDs found
var rowsToRemove = emails.Select(string.Format("itemID in ({0})", string.Join(", ", idsToRemove)));
// 3. Remove the found rows.
foreach(var rowToRemove in rowsToRemove)
{
emails.Rows.Remove(rowToRemove);
}
emails.AcceptChanges();
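To see why a second removal attempt fails, it helps to look at what happens to a DataRow object after it is removed: the object still exists, but it is detached from the table and its column values can no longer be read. A minimal sketch of that behavior follows; the class RemovedRowSketch and the sample values are hypothetical.

```csharp
using System.Data;

public static class RemovedRowSketch
{
    private static DataTable BuildEmails(out DataRow row)
    {
        var emails = new DataTable();
        emails.Columns.Add("itemID", typeof(string));
        emails.Columns.Add("failEmail", typeof(string));
        row = emails.Rows.Add("42", "fail");
        return emails;
    }

    // Returns the state of a row after it has been removed from its table.
    public static DataRowState StateAfterRemove()
    {
        DataRow row;
        DataTable emails = BuildEmails(out row);
        emails.Rows.Remove(row);
        return row.RowState; // the row is now detached from the table
    }

    // Returns true if reading a column of a removed row throws.
    public static bool ReadingRemovedRowThrows()
    {
        DataRow row;
        DataTable emails = BuildEmails(out row);
        emails.Rows.Remove(row);
        try
        {
            object value = row["itemID"]; // same access the question's second pass performs
            return false;
        }
        catch (RowNotInTableException)
        {
            return true;
        }
    }
}
```

In the question's loop, rows2Delete still holds references to rows that an earlier iteration already removed, so row["itemID"] on such a row raises the quoted error.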
This is what I ended up doing, based on an answer I got from the MSDN C# forums.
Create an extension on DataTable to enable LINQ querying of the DataTable:
public static class DataTableExtensions
{
public static IEnumerable<DataRow> RowsAsEnumerable ( this DataTable source )
{
return (source != null) ? source.Rows.OfType<DataRow>() : Enumerable.Empty<DataRow>();
}
}
Then I modified my code as below:
//Get IDs to delete
var deleteIds = from r in emails.RowsAsEnumerable()
where String.Compare(r["failEmail"].ToString(), "fail", true) == 0
select r["itemID"];
//Get all rows to delete
var rows2Delete = (from r in emails.RowsAsEnumerable()
where deleteIds.Contains(r["itemID"])
select r).ToList();
//Now delete them
foreach (var row in rows2Delete)
emails.Rows.Remove(row);
emails.AcceptChanges();
Now it works; I just wish I could do it the normal way successfully.
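foreach (DataRow rowFail in emails.Select("failEmail = 'fail'"))
{
// Skip rows already deleted via an earlier failing row with the same itemID;
// reading a column of a deleted row would throw.
if (rowFail.RowState == DataRowState.Deleted || rowFail.RowState == DataRowState.Detached)
continue;
DataRow[] rowsItem = emails.Select(String.Format("itemID = '{0}'", rowFail["itemID"]));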
for (int i = rowsItem.Length - 1; i >= 0; i--)
{
rowsItem[i].Delete();
}
}
emails.AcceptChanges();
DataTable.Select returns an array of all DataRow objects that match the filter criteria.

How do I get the index of a DataTable row from the values of two other columns?

About my application:
I have two DataTables and I want to filter the first DataTable into the second. For this I read the columns User and Modul from the first DataTable, and if that combination does not exist in the second, I add a new row.
This is the structure of my second DataTable:
User | Modul | Time | Department | Status
I want to check, using two columns (User and Modul), whether a row with these values exists in the second DataTable. If the entry exists, I need the row index. How can I do this with LINQ?
The name of my second DataTable is analyse_table.
Here is my code:
private static DataTable FilterDataTable(DataTable nofilter_datatable)
{
DataTable analyse_table = new DataTable("Filter_Analyse");
DataColumn User = new DataColumn("User", typeof(string));
DataColumn Modul = new DataColumn("Modul", typeof(string));
DataColumn TIME = new DataColumn("TIME", typeof(string));
DataColumn Department = new DataColumn("Department", typeof(string));
DataColumn Status = new DataColumn("Status", typeof(string));
analyse_table.Columns.Add(User);
analyse_table.Columns.Add(Modul);
analyse_table.Columns.Add(TIME);
analyse_table.Columns.Add(Department);
analyse_table.Columns.Add(Status);
foreach (DataRow nf_row in nofilter_datatable.Rows)
{
string user = nf_row["User"].ToString();
string modul = nf_row["Modul"].ToString();
string OUT = nf_row["OUT"].ToString();
string IN = nf_row["IN"].ToString();
bool contains_user = analyse_table.AsEnumerable()
.Any(row => user == row.Field<string>("User"));
bool contains_modul = analyse_table.AsEnumerable()
.Any(Row => modul == Row.Field<string>("Modul"));
if (!contains_user || !contains_modul)
{
try
{
DataRow row = analyse_table.NewRow();
row["User"] = user;
row["Modul"] = modul;
if (OUT != string.Empty)
{
row["TIME"] = OUT;
row["Status"] = "OUT";
}
else if (IN != string.Empty)
{
row["TIME"] = IN;
row["Status"] = "IN";
}
string[] userSpli = user.Split('@');
row["Department"] = GetActiveDirectoryAttribute(userSpli[0], "Department", domaincontroller);
analyse_table.Rows.Add(row);
}
catch (Exception)
{
}
}
if (contains_user && contains_modul)
{
//index??
//string status = analyse_table.Rows[0]["Status"].ToString();
}
}
return analyse_table;
}
I need help.
There is no need to know the index in analyse_table. Using FirstOrDefault lets you find the required row directly:
var rowUser = analyse_table.AsEnumerable()
.FirstOrDefault(row => user == row.Field<string>("User"));
var rowModul = analyse_table.AsEnumerable()
.FirstOrDefault(Row => modul == Row.Field<string>("Modul"));
if (rowUser == null || rowModul == null)
{
// Not exist so I add a new row
}
if (rowUser != null && rowModul != null)
{
string statusUser = rowUser["Status"].ToString();
string statusModul = rowModul["Status"].ToString();
}
However, because these are two separate queries, there is no guarantee that both return the same row. So you probably need to change your code to search for both user and modul in the same row:
var rowResult = analyse_table.AsEnumerable()
.FirstOrDefault(row => (user == row.Field<string>("User") &&
modul == row.Field<string>("Modul")));
if(rowResult == null)
// add new
else
// read status
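If the index itself is still needed, it can be looked up from the row that was found, since DataRowCollection has an IndexOf method. A minimal sketch, assuming the column names from the question; the class and method names (RowIndexSketch, IndexOfUserModul) are hypothetical:

```csharp
using System.Data;
using System.Linq;

public static class RowIndexSketch
{
    // Returns the index of the first row matching both values, or -1 if none does.
    public static int IndexOfUserModul(DataTable table, string user, string modul)
    {
        DataRow match = table.AsEnumerable()
            .FirstOrDefault(row => user == row.Field<string>("User")
                                && modul == row.Field<string>("Modul"));

        return match == null ? -1 : table.Rows.IndexOf(match);
    }
}
```

A call such as RowIndexSketch.IndexOfUserModul(analyse_table, user, modul) could then replace the two separate Any checks, with -1 meaning "add a new row".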
I assume you want to add all rows that are in the first table but missing from the second, where rows are identified by the two columns User + Modul.
You could use a "LEFT OUTER JOIN" to link both tables on an anonymous type containing these columns. This is much more efficient.
var newRowsInFirst = from r1 in first_table.AsEnumerable()
join r2 in analyse_table.AsEnumerable()
on new { User=r1.Field<string>("User"), Modul=r1.Field<string>("Modul") }
equals new { User=r2.Field<string>("User"), Modul=r2.Field<string>("Modul") }
into gj from g2 in gj.DefaultIfEmpty()
where g2 == null
select r1;
Then use a simple foreach-loop to add these rows.
foreach(var newRow in newRowsInFirst)
{
analyse_table.ImportRow(newRow);
// or:
//DataRow addedRow = analyse_table.Rows.Add();
//addedRow.ItemArray = newRow.ItemArray;
}
