How to add Sub total row and grand total row using linq - c#

I want to add a sub total row for each BusinessTypeCode and a grand total row for all BusinessTypeCodes. How can I add these two kinds of rows in my LINQ and place each sub total below the rows of its BusinessTypeCode?
MY CURRENT CODE
var query = (from _transaction in _entities.Transactions
join _cd in _entities.Organisations on _transaction.Refno equals _cd.Refno
join _b in _entities.BusinessType on _transaction.BusinessTypeCode equals _b.BusinessTypeCode
group new
{
_trans = _transaction,
cd = _cd,
}
by new { _transaction.BusinessTypeCode,_transaction.Refno, _cd.BusinessName, _b.Description } into _group
orderby _group.Key.BusinessTypeCode
select new
{
BusinessTypeCode = _group.Key.BusinessTypeCode,
BusType = _group.Key.BusinessTypeCode + " - " +_group.Key.Description,
BusName = _group.Key.BusinessName,
BusL = _group.Sum(x=>x._trans.BusL),
BusInterrest = _group.Sum(x => x._trans.BusInterrest),
BusAdmin = _group.Sum(x => x._trans.BusAdmin),
BusPenalty = _group.Sum(x => x._trans.BusPenalty),
TotalBusCollected =_group.Sum(x=>x._trans.TotalBusCollected)
});
DataTable dt=new DataTable();
DataSet ds = new DataSet();
ds.Tables.Add(query.CopyToDataTable());
ds.Tables[0].TableName = "Table1";
var subtotal = query.GroupBy(x=>x.BusinessTypeCode ).Select(s=>new
{
BusinessTypeCode =s.Key,
BusLSub = s.Sum(x=>x.BusL),
BusInterrestSub = s.Sum(x=>x.BusInterrest),
BusAdminSub = s.Sum(x=>x.BusAdmin),
BusPenaltySub = s.Sum(x=>x.BusPenalty),
TotalBusCollectedSub = s.Sum(x=>x.TotalBusCollected),
});
foreach (var a in subtotal)
{
dt = ds.Tables[0];
dt.NewRow();
dt.Rows.Add(a.BusLSub, a.BusInterrestSub, a.BusAdminSub, a.BusPenaltySub, a.TotalBusCollectedSub );
}
return ds;
Current Output
BusType |BusName | BusL |BusInterest|BusAdmin| BusPenalty|TotalBusCollected
1 - ACCOUNTING |HIGHVELD |-23.91 | 0 |-22.84 | 0 |-46.75
1 - ACCOUNTING |BHP |-50.81 |-79.21 |-76 |-20.02 |-226.04
2 - FOOD |SAB |-14.18 |-435.97 |-2.57 |-67.55 |-520.27
2 - FOOD |DISTIL |-43.05 |0 |-66,59 |0 |-109.64
3 - MINING |ANGLOGOLD |-4.43 |0 |-72 |0 |-76.43
-74.72 |-79.21 |-98.84 |-20.02 |-272.79
-57.23 |-435.97 |-69.16 |-67.55 |-629.91
-4.43 |0 |-72 |0 |-76.43
How can I insert each sub total after the rows whose BusinessTypeCode matches?
THE OUTPUT IS SUPPOSED TO LOOK LIKE
BusType |BusName | BusL |BusInterest|BusAdmin| BusPenalty|TotalBusCollected
1 - ACCOUNTING |HIGHVELD |-23.91 | 0 |-22.84 | 0 |-46.75
1 - ACCOUNTING |BHP |-50.81 |-79.21 |-76 |-20.02 |-226.04
--------------------------+-------+-----------+--------+-----------+-----------------
Sub Total |-74.72 |-79.21 |-98.84 |-20.02 |-272.79
--------------------------+-------+-----------+--------+-----------+-----------------
2 - FOOD |SAB |-14.18 |-435.97 |-2.57 |-67.55 |-520.27
2 - FOOD |DISTIL |-43.05 |0 |-66,59 |0 |-109.64
--------------------------+-------+-----------+--------+-----------+-----------------
Sub Total |-57.23 |-435.97 |-69.16 |-67.55 |-629.91
--------------------------+-------+-----------+--------+-----------+-----------------
3 - MINING |ANGLOGOLD |-4.43 |0 |-72 |0 |-76.43
--------------------------+-------+-----------+--------+-----------+-----------------
Sub Total |-4.43 |0 |-72 |0 |-76.43
--------------------------+-------+-----------+--------+-----------+-----------------
GRAND TOTAL |-136.38|-515.38 |-240 |-87.57 |-979.13

As I am currently on mobile I won't be able to verify this piece of code.
You can do something like this:
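foreach (var item in subtotal)
{
// DataRowCollection has no LINQ operators of its own, so go through AsEnumerable().
var lastRecord = ds.Tables[0].AsEnumerable().LastOrDefault(r => Equals(r["BusinessTypeCode"], item.BusinessTypeCode));
var lastIndex = ds.Tables[0].Rows.IndexOf(lastRecord);
DataRow dr = ds.Tables[0].NewRow();
dr["BusinessTypeCode"] = item.BusinessTypeCode;
// etc.
// Insert after the last row of the group, not before it.
ds.Tables[0].Rows.InsertAt(dr, lastIndex + 1);
}
// insert the grand total row at the end
return ds;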
This is just to show the idea on how you can transform the current DataSet output to your desired output.

It looks like what you need is some sort of interleaving. I haven't tested this but the idea you want is similar to the following (assuming you have a class called BusRow instead of your anonymous type). I've also taken the liberty of making the subrow's type the same as BusRow, but this need not be the case when you do it.
public static IEnumerable<BusRow> Interleave(List<BusRow> rowItems, List<BusRow> subTotalItems)
{
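for(int i = 0 ; i < rowItems.Count; ++i)
{
// When the business type changes, emit the previous group's subtotal before the first row of the new group.
if(i > 0 && rowItems[i].BusinessTypeCode != rowItems[i-1].BusinessTypeCode)
{
yield return subTotalItems.Single(x => x.BusinessTypeCode == rowItems[i-1].BusinessTypeCode);
}
yield return rowItems[i];
}
// The last group's subtotal has no following row to trigger it.
if(rowItems.Count > 0)
yield return subTotalItems.Single(x => x.BusinessTypeCode == rowItems[rowItems.Count-1].BusinessTypeCode);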
}
Of course you're going to change your original query to return BusRow like:
var subtotal = query.GroupBy(x=>x.BusinessTypeCode ).Select(s=>new BusRow
{
BusinessTypeCode =s.Key,
BusL = s.Sum(x=>x.BusL),
BusInterrest = s.Sum(x=>x.BusInterrest),
BusAdmin = s.Sum(x=>x.BusAdmin),
BusPenalty = s.Sum(x=>x.BusPenalty),
TotalBusCollected = s.Sum(x=>x.TotalBusCollected),
});
Then call
var interleaved = Interleave(query.ToList(), subtotal.ToList());
dt = ds.Tables[0];
foreach (var a in interleaved)
{
if(String.IsNullOrEmpty(a.BusName)) //subtotal row: format it as you like - add --rows before and after
dt.Rows.Add(a.BusL, a.BusInterrest, a.BusAdmin, a.BusPenalty, a.TotalBusCollected );
else
dt.Rows.Add(a.BusinessTypeCode, a.BusName, a.BusL, a.BusInterrest, a.BusAdmin, a.BusPenalty, a.TotalBusCollected ); //normal row
}
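The same layout can also be produced before anything touches the DataTable. The following is only a sketch, not the code from either answer: BusRow here is a hypothetical class standing in for the anonymous type (trimmed to two numeric columns), and Totals.WithTotals is a made-up helper name. It assumes the rows are already in memory, because it relies on LINQ to Objects' GroupBy returning groups in first-appearance order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the anonymous type in the question.
public class BusRow
{
    public string BusType { get; set; }
    public string BusName { get; set; }
    public decimal BusL { get; set; }
    public decimal TotalBusCollected { get; set; }
}

public static class Totals
{
    // Returns the detail rows grouped by BusType, each group followed by a
    // "Sub Total" row, with a single "GRAND TOTAL" row at the very end.
    public static List<BusRow> WithTotals(IEnumerable<BusRow> rows)
    {
        var detail = rows.ToList();

        var result = detail
            .GroupBy(r => r.BusType)
            .SelectMany(g => g.Concat(new[]
            {
                new BusRow
                {
                    BusType = "Sub Total",
                    BusL = g.Sum(r => r.BusL),
                    TotalBusCollected = g.Sum(r => r.TotalBusCollected)
                }
            }))
            .ToList();

        // The grand total is computed from the detail rows only,
        // so the subtotal rows are not counted twice.
        result.Add(new BusRow
        {
            BusType = "GRAND TOTAL",
            BusL = detail.Sum(r => r.BusL),
            TotalBusCollected = detail.Sum(r => r.TotalBusCollected)
        });
        return result;
    }
}
```

Each element of the returned list can then be added to the DataTable in order, which keeps the row-insertion loop free of any index arithmetic.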

Related

Getting the sum per hour from 2 datatable

I'm writing a txt file from 2 data table.
Following is the 2 data table.
dt1
Transaction No. Time Amount Date
1 10:00:00 200.00 03/05/2020
2 10:30:11 250.00 03/05/2020
3 11:05:22 140.00 03/05/2020
4 11:45:33 230.00 03/05/2020
5 12:15:10 220.00 03/05/2020
dt2
Transaction No. Added Amount Date
1 40.00 03/05/2020
2 25.00 03/05/2020
3 40.00 03/05/2020
4 30.00 03/05/2020
5 30.00 03/05/2020
following is my code
using (StreamWriter sw = File.AppendText(fileName))
{
for (int a = 6; a <= 23; a++)
{
string aa = a.ToString().PadLeft(2, '0');
double salex = double.Parse(dt1.Rows[0]["Amount"].ToString());
if (salex.Equals(""))
{
salex = 0;
}
else
{
salex = double.Parse(dt1.Rows[0]["Amount"].ToString());
}
double vatx = double.Parse(dt2.Rows[0]["Added Amount"].ToString());
if (vatx.Equals(""))
{
vatx = 0;
}
else
{
vatx = double.Parse(dt2.Rows[0]["Added Amount"].ToString());
}
double dailysaleHRLY = -salex + -vatx;
sw.Write(dtpDate.Value.ToString("MM/dd/yyyy") + ",");
sw.Write(aa + ":00" + ",");
sw.Write(dailysaleHRLY.ToString("0.00") + ",");
}
for (int a = 0; a <= 5; a++)
{
string aa = a.ToString().PadLeft(2, '0');
double salex = double.Parse(dt1.Rows[0]["Amount"].ToString());
if (salex.Equals(""))
{
salex = 0;
}
else
{
salex = double.Parse(dt1.Rows[0]["Amount"].ToString());
}
double vatx = double.Parse(dt2.Rows[0]["Added Amount"].ToString());
if (vatx.Equals(""))
{
vatx = 0;
}
else
{
vatx = double.Parse(dt2.Rows[0]["Added Amount"].ToString());
}
double dailysaleHRLY = -salex + -vatx;
sw.Write(dtpDate.Value.ToString("MM/dd/yyyy") + ",");
sw.Write(aa + ":00" + ",");
sw.Write(dailysaleHRLY.ToString("0.00") + ",");
}
MessageBox.Show("Txt File succesfully created!", "SYSTEM", MessageBoxButtons.OK, MessageBoxIcon.Information);
}
This is the output of my code.
Date, Time, Sum
03/05/2020,06:00,515.00
03/05/2020,07:00,515.00
03/05/2020,08:00,515.00
03/05/2020,09:00,515.00
03/05/2020,10:00,515.00
03/05/2020,11:00,515.00
03/05/2020,12:00,515.00
03/05/2020,13:00,515.00
03/05/2020,14:00,515.00
03/05/2020,15:00,515.00
03/05/2020,16:00,515.00
03/05/2020,17:00,515.00
03/05/2020,18:00,515.00
03/05/2020,19:00,515.00
03/05/2020,20:00,515.00
03/05/2020,21:00,515.00
03/05/2020,22:00,515.00
03/05/2020,23:00,515.00
03/05/2020,00:00,515.00
03/05/2020,01:00,515.00
03/05/2020,02:00,515.00
03/05/2020,03:00,515.00
03/05/2020,04:00,515.00
03/05/2020,05:00,515.00
I just want to get the sum of Amount and Added Amount based on the hour, like this:
Date, Time, Sum
03/05/2020,06:00,0.00
03/05/2020,07:00,0.00
03/05/2020,08:00,0.00
03/05/2020,09:00,0.00
03/05/2020,10:00,515.00
03/05/2020,11:00,440.00
03/05/2020,12:00,250.00
03/05/2020,13:00,0.00
03/05/2020,14:00,0.00
03/05/2020,15:00,0.00
03/05/2020,16:00,0.00
03/05/2020,17:00,0.00
03/05/2020,18:00,0.00
03/05/2020,19:00,0.00
03/05/2020,20:00,0.00
03/05/2020,21:00,0.00
03/05/2020,22:00,0.00
03/05/2020,23:00,0.00
03/05/2020,00:00,0.00
03/05/2020,01:00,0.00
03/05/2020,02:00,0.00
03/05/2020,03:00,0.00
03/05/2020,04:00,0.00
03/05/2020,05:00,0.00
Assuming that you have two DataTable-s and you have them filled with the mentioned data.
var dt1 = new DataTable();
var dt2 = new DataTable();
dt1.Columns.AddRange(new[]
{
new DataColumn("Transaction No.", typeof(int)),
new DataColumn("Time", typeof(DateTime)),
new DataColumn("Amount", typeof(decimal)),
new DataColumn("Date", typeof(DateTime)),
});
dt2.Columns.AddRange(new[]
{
new DataColumn("Transaction No.", typeof(int)),
new DataColumn("Added Amount", typeof(decimal)),
new DataColumn("Date", typeof(DateTime)),
});
Note: The double types have been replaced with decimal types, since decimal is the right type to use when dealing with money.
As I understand the problem, you want to group the rows of dt1 by hour part of the Time field, sum the Amount, and add to the sum the Added Amount from dt2 rows where their Transaction No. equals to any Transaction No. of the grouped rows of dt1.
This will do:
var group = dt1.AsEnumerable().GroupBy(x => x.Field<DateTime>(1).Hour);
var sb = new StringBuilder();
sb.Append("Date,");
sb.Append("Time,".PadLeft(12, ' '));
sb.AppendLine("Sum".PadLeft(5, ' '));
//if PadLeft is not required in the output, then just:
//sb.AppendLine($"Date, Time, Sum");
foreach (var g in group)
{
var sum = 0M;
foreach (var r in g)
sum += r.Field<decimal>(2) + dt2.AsEnumerable()
.Where(x => x.Field<int>(0) == r.Field<int>(0))
.Sum(x => x.Field<decimal>(1));
sb.AppendLine($"{g.First().Field<DateTime>(3).ToString("MM/dd/yyyy")}, {g.Key.ToString("00")}:00, {sum.ToString("0.00")}");
}
Note: You can use the fields names instead of their indexes.
The output is:
Date, Time, Sum
03/05/2020, 10:00, 515.00
03/05/2020, 11:00, 440.00
03/05/2020, 12:00, 250.00
I don't know whether the DataTable-s already contain the required data to generate the output mentioned in the last quote block or you want to append the rest before writing to the text file. In case of the second scenario, you can do something like:
var group = dt1.AsEnumerable().GroupBy(x => x.Field<DateTime>(1).Hour);
var sb = new StringBuilder();
sb.AppendLine($"Date, Time, Sum");
for (var i = 0; i < 24; i++)
{
var g = group.FirstOrDefault(x => x.Key == i);
if (g != null)
{
var sum = 0M;
foreach (var r in g)
sum += r.Field<decimal>(2) + dt2.AsEnumerable()
.Where(x => x.Field<int>(0) == r.Field<int>(0))
.Sum(x => x.Field<decimal>(1));
sb.AppendLine($"{g.First().Field<DateTime>(3).ToString("MM/dd/yyyy")}, {g.Key.ToString("00")}:00, {sum.ToString("0.00")}");
}
else
sb.AppendLine($"{group.First().First().Field<DateTime>(3).ToString("MM/dd/yyyy")}, {i.ToString("00")}:00, 0.00");
}
If you need to preserve the same order of the hours:
for (var ii = 6; ii < 30; ii++)
{
var i = ii > 23 ? ii % 24 : ii;
var g = group.FirstOrDefault(x => x.Key == i);
if (g != null)
{
//The same...
}
else
{
//The same...
}
}
Finally, to create or overwrite the text file (fileName):
File.WriteAllText(fileName, sb.ToString());
Or to append the output:
File.AppendAllText(fileName, sb.ToString());
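The loops above rescan dt2 for every row of dt1. As an alternative sketch (not part of the answer above), dt2 can be indexed once and the totals collected into a 24-slot array. HourlySales.SumByHour is a hypothetical helper name, and the column names and types are assumed to match the DataTable definitions shown earlier in this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class HourlySales
{
    // Returns 24 totals indexed by hour (0..23).
    // Hours without any transaction keep the default value 0.
    public static decimal[] SumByHour(DataTable dt1, DataTable dt2)
    {
        // Index dt2 once: transaction number -> total added amount.
        Dictionary<int, decimal> added = dt2.AsEnumerable()
            .GroupBy(r => r.Field<int>("Transaction No."))
            .ToDictionary(g => g.Key, g => g.Sum(r => r.Field<decimal>("Added Amount")));

        var totals = new decimal[24];
        foreach (DataRow r in dt1.Rows)
        {
            int hour = r.Field<DateTime>("Time").Hour;

            // extra stays 0 when dt2 has no row for this transaction.
            decimal extra;
            added.TryGetValue(r.Field<int>("Transaction No."), out extra);

            totals[hour] += r.Field<decimal>("Amount") + extra;
        }
        return totals;
    }
}
```

Writing the file then becomes a plain loop over the hours in whatever order is required (6 to 23, then 0 to 5), reading totals[hour] for each line.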

C# MySQLDataReader.AffectedRows = -1

In my MySQL console, I can see the results of select Price from rates order by id:
mysql> select Price from rates order by id;
+-------+
| Price |
+-------+
| 100 |
| 120 |
| 150 |
| 200 |
| 350 |
| 700 |
| 500 |
| 700 |
| 800 |
| 1300 |
| 1500 |
| 7000 |
| 8000 |
| 15000 |
| 20000 |
+-------+
15 rows in set
but when I run it in this method as the string command;
public List<string[]> ExecuteQuery(string command)
{
com = new MySqlCommand(command, con);
reader = com.ExecuteReader();
if (reader.HasRows)
{
List<string[]> records = new List<string[]>();
while (reader.Read())
{
string[] row = new string[reader.FieldCount];
for (int i = 0; i < reader.RecordsAffected; i++)
row[i] = reader[i].ToString();
records.Add(row);
}
reader.Close();
return records;
}
else
{
reader.Close();
return new List<string[]>();
}
}
the reader.RecordsAffected is -1 and it messes up the whole process...
But this one works just fine...
MySqlCommand com = new MySqlCommand("select Access from useraccounts where Username = '" + tbxUsername.Text + "' and Pass = '" + tbxPassword.Text + '\'', d.con);
object result = com.ExecuteScalar();
I am using this connection string: datasource = 192.168.43.191; database = database_name; user = user_name; password = pass_word; in a Visual Studio 2015 and XAMPP with Apache and MySQL running.
This is the first time I've encountered this problem. I hope you can help
RecordsAffected is set when your query is an INSERT/UPDATE/DELETE, not when it is a SELECT. In your code it seems that you want to use the FieldCount property instead:
while (reader.Read())
{
string[] row = new string[reader.FieldCount];
for (int i = 0; i < reader.FieldCount; i++)
row[i] = reader[i].ToString();
records.Add(row);
}
You can also change your code to this shorter one
public List<string[]> ExecuteQuery(string command)
{
List<string[]> records = new List<string[]>();
using(com = new MySqlCommand(command, con))
using(reader = com.ExecuteReader())
{
while (reader.Read())
{
string[] row = new string[reader.FieldCount];
for (int i = 0; i < reader.FieldCount; i++)
row[i] = reader[i].ToString();
records.Add(row);
}
}
return records;
}
However, in general, I recommend avoiding these do-it-all methods: they cannot handle, in the most performant way, the many different kinds of queries an application requires.
For example, the method returns a list of string arrays while, in reality, you are returning a single column (no array needed), and the values are probably decimals that are converted to strings and then converted back to decimals when you use them. And we haven't even started talking about dates. Do you see how this method propagates its problem through your whole application?
If you want a general solution, choose a good ORM that abstracts the use of the database and returns data properly converted to object instances. Look at Entity Framework or Dapper (many others exist).
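Short of adopting an ORM, a narrower typed helper already avoids the string round trip described above. The sketch below is an illustration under assumptions, not part of the original answer: ReadColumn is a hypothetical extension method written against the provider-neutral System.Data.IDataReader interface, and it assumes the column contains no NULL values.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderExtensions
{
    // Reads a single column of the current result set into a typed list.
    // Assumption: the column has no NULLs (DBNull would make Convert.ChangeType throw).
    public static List<T> ReadColumn<T>(this IDataReader reader, int ordinal = 0)
    {
        var values = new List<T>();
        while (reader.Read())
        {
            // Convert.ChangeType bridges provider types that differ from T,
            // for example an INT column read as decimal.
            values.Add((T)Convert.ChangeType(reader.GetValue(ordinal), typeof(T)));
        }
        return values;
    }
}
```

With a MySqlDataReader, which implements IDataReader, the call would look like reader.ReadColumn<decimal>() for the Price query, and the caller receives decimals rather than strings.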
The RecordsAffected property should not be used with SELECT statements, since it is only meaningful for INSERT, UPDATE and DELETE statements. The following should fix your issue:
Int32 fields = reader.FieldCount;
List<String[]> records = new List<String[]>();
while (reader.Read())
{
String[] row = new String[fields];
for (Int32 i = 0; i < fields; ++i)
row[i] = reader.GetString(i);
records.Add(row);
}

In C#, what is the best way to parse this WIKI markup?

I need to take data that I am reading in from a WIKI markup page and store it as a table structure. I am trying to figure out how to properly parse the below markup syntax into some table data structure in C#
Here is an example table:
|| Owner || Action || Status || Comments ||
| Bill | Fix the lobby | In Progress | This is easy |
| Joe | Fix the bathroom | In Progress | Plumbing \\
\\
Electric \\
\\
Painting \\
\\
\\ |
| Scott | Fix the roof | Complete | This is expensive |
and here is how it comes in directly:
|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|
So as you can see:
The column headers have "||" as the separator
Each row's columns have a separator of "|"
A row might span multiple lines (as in the second data row example above), so I would have to keep reading until I hit the same number of "|" (cols) that I have in the header row.
I tried reading in line by line and then concatenating lines that had "\\" in between them, but that seemed a bit hacky.
I also tried to simply read it in as a full string, parse by "||" first, and then keep reading until I hit the same number of "|" before going to the next row. This seemed to work, but it feels like there might be a more elegant way using regular expressions or something similar.
Can anyone suggest the correct way to parse this data?
I have largely replaced the previous answer, due to the fact that the format of the input after your edit is substantially different from the one posted before. This leads to a somewhat different solution.
Because there are no longer any line breaks after a row, the only way to determine for sure where a row ends, is to require that each row has the same number of columns as the table header. That is at least if you don't want to rely on some potentially fragile white space convention present in the one and only provided example string (i.e. that the row separator is the only | not preceded by a space). Your question at least does not provide this as the specification for a row delimiter.
The below "parser" provides at least the error handling validity checks that can be derived from your format specification and example string and also allows for tables that have no rows. The comments explain what it is doing in basic steps.
public class TableParser
{
const StringSplitOptions SplitOpts = StringSplitOptions.None;
const string RowColSep = "|";
static readonly string[] HeaderColSplit = { "||" };
static readonly string[] RowColSplit = { RowColSep };
static readonly string[] MLColSplit = { @"\\" };
public class TableRow
{
public List<string[]> Cells;
}
public class Table
{
public string[] Header;
public TableRow[] Rows;
}
public static Table Parse(string text)
{
// Isolate the header columns and rows remainder.
var headerSplit = text.Split(HeaderColSplit, SplitOpts);
Ensure(headerSplit.Length > 1, "At least 1 header column is required in the input");
// Need to check whether there are any rows.
var hasRows = headerSplit.Last().IndexOf(RowColSep) >= 0;
var header = headerSplit.Skip(1)
.Take(headerSplit.Length - (hasRows ? 2 : 1))
.Select(c => c.Trim())
.ToArray();
if (!hasRows) // If no rows for this table, we are done.
return new Table() { Header = header, Rows = new TableRow[0] };
// Get all row columns from the remainder.
var rowsCols = headerSplit.Last().Split(RowColSplit, SplitOpts);
// Require same amount of columns for a row as the header.
Ensure((rowsCols.Length % (header.Length + 1)) == 1,
"The number of row colums does not match the number of header columns");
var rows = new TableRow[(rowsCols.Length - 1) / (header.Length + 1)];
// Fill rows by sequentially taking # header column cells
for (int ri = 0, start = 1; ri < rows.Length; ri++, start += header.Length + 1)
{
rows[ri] = new TableRow() {
Cells = rowsCols.Skip(start).Take(header.Length)
.Select(c => c.Split(MLColSplit, SplitOpts).Select(p => p.Trim()).ToArray())
.ToList()
};
};
return new Table { Header = header, Rows = rows };
}
private static void Ensure(bool check, string errorMsg)
{
if (!check)
throw new InvalidDataException(errorMsg);
}
}
When used like this:
public static void Main(params string[] args)
{
var wikiLine = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
var table = TableParser.Parse(wikiLine);
Console.WriteLine(string.Join(", ", table.Header));
foreach (var r in table.Rows)
Console.WriteLine(string.Join(", ", r.Cells.Select(c => string.Join(Environment.NewLine + "\t# ", c))));
}
It will produce the below output:
Where "\t# " represents a newline caused by the presence of \\ in the input.
Here's a solution which populates a DataTable. It does require a little bit of data massaging (Trim), but the main parsing is Splits and Linq.
var str = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
var headerStop = str.LastIndexOf("||");
var headers = str.Substring(0, headerStop).Split(new string[1] { "||" }, StringSplitOptions.None).Skip(1).ToList();
var records = str.Substring(headerStop + 4).TrimEnd(new char[2] { ' ', '|' }).Split(new string[1] { "| |" }, StringSplitOptions.None).ToList();
var tbl = new DataTable();
headers.ForEach(h => tbl.Columns.Add(h.Trim()));
records.ForEach(r => tbl.Rows.Add(r.Split('|')));
This makes some assumptions but seems to work for your sample data. I'm sure if I worked at it I could combine the expressions and clean it up, but you'll get the idea.
It will also allow for rows that do not have the same number of cells as the header, which I think is something Confluence can do.
List<List<string>> table = new List<List<string>>();
var match = Regex.Match(raw, @"(?:(?:\|\|([^|]*))*\n)?");
if (match.Success)
{
var headersWithExtra = match.Groups[1].Captures.Cast<Capture>().Select(c=>c.Value);
List<String> headerRow = headersWithExtra.Take(headersWithExtra.Count()-1).ToList();
if (headerRow.Count > 0)
{
table.Add(headerRow);
}
}
match = Regex.Match(raw + "\r\n", @"[^\n]*\n" + @"(?:\|([^|]*))*");
var cellsWithExtra = match.Groups[1].Captures.Cast<Capture>().Select(c=>c.Value);
List<string> row = new List<string>();
foreach (string cell in cellsWithExtra)
{
if (cell.Trim(' ', '\t') == "\r\n")
{
if (!table.Contains(row) && row.Count > 0)
{
table.Add(row);
}
row = new List<string>();
}
else
{
row.Add(cell);
}
}
This ended up very similar to Jon Tirjan's answer, although it cuts the LINQ to a single statement (the code to replace that last one was horrifically ugly) and is a bit more extensible. For example, it will replace the Confluence line breaks \\ with a string of your choosing, you can choose to trim or not trim whitespace from around elements, etc.
private void ParseWikiTable(string input, string newLineReplacement = " ")
{
string separatorHeader = "||";
string separatorRow = "| |";
string separatorElement = "|";
input = Regex.Replace(input, @"[ \\]{2,}", newLineReplacement);
string inputHeader = input.Substring(0, input.LastIndexOf(separatorHeader));
string inputContent = input.Substring(input.LastIndexOf(separatorHeader) + separatorHeader.Length);
string[] headerArray = SimpleSplit(inputHeader, separatorHeader);
string[][] rowArray = SimpleSplit(inputContent, separatorRow).Select(r => SimpleSplit(r, separatorElement)).ToArray();
// do something with output data
TestPrint(headerArray);
foreach (var r in rowArray) { TestPrint(r); }
}
private string[] SimpleSplit(string input, string separator, bool trimWhitespace = true)
{
input = input.Trim();
if (input.StartsWith(separator)) { input = input.Substring(separator.Length); }
if (input.EndsWith(separator)) { input = input.Substring(0, input.Length - separator.Length); }
string[] segments = input.Split(new string[] { separator }, StringSplitOptions.None);
if (trimWhitespace)
{
for (int i = 0; i < segments.Length; i++)
{
segments[i] = segments[i].Trim();
}
}
return segments;
}
private void TestPrint(string[] lst)
{
string joined = "[" + String.Join("::", lst) + "]";
Console.WriteLine(joined);
}
Console output from your direct input string:
[Owner::Action::Status::Comments]
[Bill::fix the lobby::In Progress::This is eary]
[Joe::fix the bathroom::In progress::plumbing Electric Painting]
[Scott::fix the roof::Complete::this is expensive]
A generic regex solution that populates a DataTable and is a little flexible with the syntax.
var text = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
// Get Headers
var regHeaders = new Regex(@"\|\|\s*(\w[^\|]+)", RegexOptions.Compiled);
var headers = regHeaders.Matches(text);
//Get Rows, based on number of headers columns
var regLinhas = new Regex(String.Format(@"(?:\|\s*(\w[^\|]+)){{{0}}}", headers.Count));
var rows = regLinhas.Matches(text);
var tbl = new DataTable();
foreach (Match header in headers)
{
tbl.Columns.Add(header.Groups[1].Value);
}
foreach (Match row in rows)
{
tbl.Rows.Add(row.Groups[1].Captures.OfType<Capture>().Select(col => col.Value).ToArray());
}
Here's a solution involving regular expressions. It takes a single string as input and returns a List<string> of headers and a List<List<string>> of rows/columns. It also trims white space, which may or may not be the desired behavior, so be aware of that. It even prints things nicely :)
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace parseWiki
{
class Program
{
static void Main(string[] args)
{
string content = @"|| Owner || Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
content = content.Replace(@"\\", "");
string headerContent = content.Substring(0, content.LastIndexOf("||") + 2);
string cellContent = content.Substring(content.LastIndexOf("||") + 2);
MatchCollection headerMatches = new Regex(@"\|\|([^|]*)(?=\|\|)", RegexOptions.Singleline).Matches(headerContent);
MatchCollection cellMatches = new Regex(@"\|([^|]*)(?=\|)", RegexOptions.Singleline).Matches(cellContent);
List<string> headers = new List<string>();
foreach (Match match in headerMatches)
{
if (match.Groups.Count > 1)
{
headers.Add(match.Groups[1].Value.Trim());
}
}
List<List<string>> body = new List<List<string>>();
List<string> newRow = new List<string>();
foreach (Match match in cellMatches)
{
if (newRow.Count > 0 && newRow.Count % headers.Count == 0)
{
body.Add(newRow);
newRow = new List<string>();
}
else
{
newRow.Add(match.Groups[1].Value.Trim());
}
}
body.Add(newRow);
print(headers, body);
}
static void print(List<string> headers, List<List<string>> body)
{
var CELL_SIZE = 20;
for (int i = 0; i < headers.Count; i++)
{
Console.Write(headers[i].Truncate(CELL_SIZE).PadRight(CELL_SIZE) + " ");
}
Console.WriteLine("\n" + "".PadRight( (CELL_SIZE + 2) * headers.Count, '-'));
for (int r = 0; r < body.Count; r++)
{
List<string> row = body[r];
for (int c = 0; c < row.Count; c++)
{
Console.Write(row[c].Truncate(CELL_SIZE).PadRight(CELL_SIZE) + " ");
}
Console.WriteLine("");
}
Console.WriteLine("\n\n\n");
Console.ReadKey(false);
}
}
public static class StringExt
{
public static string Truncate(this string value, int maxLength)
{
if (string.IsNullOrEmpty(value) || value.Length <= maxLength) return value;
return value.Substring(0, maxLength - 3) + "...";
}
}
}
Read the input string one character at a time and use a state-machine to decide what should be done with each input character. This approach probably needs more code, but it will be easier to maintain and to extend than regular expressions.
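A minimal sketch of such a state machine follows. It is an illustration only: WikiTable and WikiTableStateMachine are hypothetical names, a row is assumed to be complete once it has as many cells as the header, the Confluence line break \\ is collapsed to a single space, and a trailing incomplete row is silently dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public sealed class WikiTable
{
    public List<string> Header = new List<string>();
    public List<List<string>> Rows = new List<List<string>>();
}

public static class WikiTableStateMachine
{
    private enum State { Start, HeaderCell, RowCell, BetweenRows }

    public static WikiTable Parse(string text)
    {
        var table = new WikiTable();
        var cell = new StringBuilder();
        List<string> row = null;
        var state = State.Start;

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            bool pipe = c == '|';
            bool doublePipe = pipe && i + 1 < text.Length && text[i + 1] == '|';

            switch (state)
            {
                case State.Start: // skip everything before the first "||"
                    if (doublePipe) { state = State.HeaderCell; i++; }
                    break;

                case State.HeaderCell:
                    if (doublePipe) { table.Header.Add(Flush(cell)); i++; }
                    // A single "|" means the header is over and the first row begins.
                    else if (pipe) { cell.Clear(); row = new List<string>(); state = State.RowCell; }
                    else cell.Append(c);
                    break;

                case State.RowCell:
                    if (!pipe) { cell.Append(c); break; }
                    row.Add(Flush(cell));
                    // Assumption: a row ends when it has as many cells as the header.
                    if (row.Count == table.Header.Count) { table.Rows.Add(row); state = State.BetweenRows; }
                    break;

                case State.BetweenRows: // ignore whitespace until the next row opens
                    if (pipe) { row = new List<string>(); state = State.RowCell; }
                    break;
            }
        }
        return table;
    }

    // Returns the buffered cell text with "\\" markers collapsed to spaces, then empties the buffer.
    private static string Flush(StringBuilder cell)
    {
        var parts = cell.ToString()
            .Split(new[] { @"\\" }, StringSplitOptions.None)
            .Select(p => p.Trim())
            .Where(p => p.Length > 0);
        cell.Clear();
        return string.Join(" ", parts);
    }
}
```

Supporting a new piece of syntax then means adding a state or a transition, rather than reworking one large pattern.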

reading a CSV into a Datatable without knowing the structure

I am trying to read a CSV into a datatable.
The CSV maybe have hundreds of columns and only up to 20 rows.
It will look something like this:
+----------+-----------------+-------------+---------+---+
| email1 | email2 | email3 | email4 | … |
+----------+-----------------+-------------+---------+---+
| ccemail1 | anotherccemail1 | 3rdccemail1 | ccemail | |
| ccemail2 | anotherccemail2 | 3rdccemail2 | | |
| ccemail3 | anotherccemail3 | | | |
| ccemail4 | anotherccemail4 | | | |
| ccemail5 | | | | |
| ccemail6 | | | | |
| ccemail7 | | | | |
| … | | | | |
+----------+-----------------+-------------+---------+---+
I am trying to use GenericParser for this; however, I believe that it requires you to know the column names.
string strID, strName, strStatus;
using (GenericParser parser = new GenericParser())
{
parser.SetDataSource("MyData.txt");
parser.ColumnDelimiter = "\t".ToCharArray();
parser.FirstRowHasHeader = true;
parser.SkipStartingDataRows = 10;
parser.MaxBufferSize = 4096;
parser.MaxRows = 500;
parser.TextQualifier = '\"';
while (parser.Read())
{
strID = parser["ID"]; //as you can see this requires you to know the column names
strName = parser["Name"];
strStatus = parser["Status"];
// Your code here ...
}
}
Is there a way to read this file into a DataTable without knowing the column names?
It's so simple!
var adapter = new GenericParsing.GenericParserAdapter(filepath);
DataTable dt = adapter.GetDataTable();
This will automatically do everything for you.
I looked at the source code, and you can access the data by column index too, like this
var firstColumn = parser[0];
Replace the 0 with the column number.
The number of columns can be found using
parser.ColumnCount
I'm not familiar with that GenericParser; I would suggest using tools like TextFieldParser, FileHelpers or this CSV-Reader.
But this simple manual approach should work also:
IEnumerable<String> lines = File.ReadAllLines(filePath);
String header = lines.First();
var headers = header.Split(new[]{','}, StringSplitOptions.RemoveEmptyEntries);
DataTable tbl = new DataTable();
for (int i = 0; i < headers.Length; i++)
{
tbl.Columns.Add(headers[i]);
}
var data = lines.Skip(1);
foreach(var line in data)
{
var fields = line.Split(new[]{','}, StringSplitOptions.RemoveEmptyEntries);
DataRow newRow = tbl.Rows.Add();
newRow.ItemArray = fields;
}
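Because the sample file has many empty cells and rows of different widths, a variant that keeps empty fields in place may be safer. This is a rough sketch under assumptions, not a full CSV parser: CsvLoader.FromLines is a hypothetical helper, it assumes unique header names, and it does not handle quoted fields or delimiters inside values.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class CsvLoader
{
    // Builds a DataTable from delimited lines; the first line supplies the column names.
    public static DataTable FromLines(IEnumerable<string> lines, char delimiter = ',')
    {
        var table = new DataTable();
        bool first = true;

        foreach (string line in lines)
        {
            // Plain Split keeps empty fields, so values stay in their own columns.
            string[] fields = line.Split(delimiter);

            if (first)
            {
                foreach (string name in fields)
                    table.Columns.Add(name.Trim());
                first = false;
                continue;
            }

            // Grow the table with auto-named columns if a row is wider than the header.
            while (table.Columns.Count < fields.Length)
                table.Columns.Add();

            // Rows shorter than the header leave the remaining cells as DBNull.
            table.Rows.Add(fields);
        }
        return table;
    }
}
```

A call such as CsvLoader.FromLines(File.ReadLines(filePath)) would feed it from a file; passing the lines in separately keeps the helper easy to try out on in-memory data.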
I used GenericParser to do it.
On the first pass through the loop I collect the column names, then use them to read each value and add it to a list.
In my case I have pivoted the data, but here is a code sample in case it helps someone:
bool firstRow = true;
List<string> columnNames = new List<string>();
List<ProjectCost> csvData = new List<ProjectCost>();
while (parser.Read())
{
if (firstRow)
{
for (int i = 0; i < parser.ColumnCount; i++)
{
if (parser.GetColumnName(i).Contains("FY"))
{
columnNames.Add(parser.GetColumnName(i));
Console.WriteLine("Column found: {0}", parser.GetColumnName(i));
}
}
firstRow = false;
}
foreach (var col in columnNames)
{
double actualCost = 0;
bool hasValueParsed = Double.TryParse(parser[col], out actualCost);
csvData.Add(new ProjectCost
{
ProjectItem = parser["ProjectItem"],
ActualCosts = actualCost,
ColumnName = col
});
}
}

Sorting DataTable-columns

I have a datatable something like this:
| Col1 | Col6 | Col3 | Col43 | Col0 |
---------------------------------------------------
RowA | 1 | 6 | 54 | 4 | 123 |
As you see, the Cols are not sorted by their numbers. That is what I want it to look like after the "magic":
| Col0 | Col1 | Col3 | Col6 | Col43 |
---------------------------------------------------
RowA | 123 | 1 | 54 | 6 | 4 |
Is there a built-in function for such things in C#? And if not, how could I get started with this?
You can do the column sorting in the table itself:
dt.Columns["Col0"].SetOrdinal(0);
dt.Columns["Col1"].SetOrdinal(1);
dt.Columns["Col2"].SetOrdinal(2);
You don't need to sort the columns in the DataTable object, just copy the column names to an array and sort the array. Then use the array to access the column values in the right order.
Sample:
class Program
{
static void Main(string[] args)
{
var dt = new DataTable { Columns = { "A3", "A2", "B1", "B3", "B2", "A1" } };
dt.BeginLoadData();
dt.Rows.Add("A3val", "A2val", "B1val", "B3val", "B2val", "A1val");
dt.EndLoadData();
string[] names=new string[dt.Columns.Count];
for (int i = 0; i < dt.Columns.Count;i++ )
{
names[i] = dt.Columns[i].ColumnName;
}
Array.Sort(names);
foreach (var name in names)
{
Console.Out.WriteLine("{0}={1}", name, dt.Rows[0][name]);
}
Console.ReadLine();
}
}
Here is my code. It is surely not the best solution, but it works. In my case I keep a fixed column, either "Nombre" or "Problema", which always stays first in the column order.
// class
public class stringInt
{
public string Nombre;
public int orden;
}
// function
static public DataTable AlphabeticDataTableColumnSort(DataTable dtTable)
{
// extract all the column names, put them in a list and sort it
int orden = 1;
List<stringInt> listaColumnas = new List<stringInt>();
foreach (DataColumn dc in dtTable.Columns)
{
stringInt columna = new stringInt();
columna.Nombre = dc.Caption;
if ((dc.Caption != "Problema") && (dc.Caption != "Nombre")) columna.orden = 1;
else columna.orden = 0;
listaColumnas.Insert(0,columna);
}
listaColumnas.Sort(delegate(stringInt si1, stringInt si2) { return si1.Nombre.CompareTo(si2.Nombre); });
// now the list is sorted by column name
foreach (stringInt si in listaColumnas)
{
// if the order is 1 (not a fixed column), assign increasing positions
if (si.orden != 0)
{
si.orden = orden;
orden++;
}
}
listaColumnas.Sort(delegate(stringInt si1, stringInt si2) { return si1.orden.CompareTo(si2.orden); });
// the list now holds the column order; apply it to the DataTable
foreach(stringInt si in listaColumnas)
dtTable.Columns[si.Nombre].SetOrdinal(si.orden);
return dtTable;
}
var columnArray = new DataColumn[table.Columns.Count];
table.Columns.CopyTo(columnArray, 0);
var ordinal = -1;
foreach (var orderedColumn in columnArray.OrderBy(c => c.ColumnName))
orderedColumn.SetOrdinal(++ordinal);
You'll probably need to implement your own IComparer<T>, because the default string ordering would be Col0, Col1, Col3, Col43 and Col6 ("4" sorts before "6").
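One possible shape for such a comparer is sketched below. TrailingNumberComparer and ColumnSorter are hypothetical names, and the sketch assumes non-null column names of the form prefix plus optional ASCII digits (such as Col43), with a numeric part small enough to fit in a long.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Orders names by their non-numeric prefix first, then by the trailing number.
public sealed class TrailingNumberComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int byPrefix = string.CompareOrdinal(Prefix(x), Prefix(y));
        return byPrefix != 0 ? byPrefix : Number(x).CompareTo(Number(y));
    }

    // Index of the first character of the trailing digit run.
    private static int DigitStart(string s)
    {
        int i = s.Length;
        while (i > 0 && s[i - 1] >= '0' && s[i - 1] <= '9') i--;
        return i;
    }

    private static string Prefix(string s)
    {
        return s.Substring(0, DigitStart(s));
    }

    // Names without a trailing number sort before numbered ones.
    private static long Number(string s)
    {
        string digits = s.Substring(DigitStart(s));
        return digits.Length == 0 ? -1 : long.Parse(digits);
    }
}

public static class ColumnSorter
{
    public static void SortColumns(DataTable table)
    {
        // Materialize the order first, since SetOrdinal changes the collection.
        List<DataColumn> ordered = table.Columns.Cast<DataColumn>()
            .OrderBy(c => c.ColumnName, new TrailingNumberComparer())
            .ToList();

        for (int i = 0; i < ordered.Count; i++)
            ordered[i].SetOrdinal(i);
    }
}
```

Calling ColumnSorter.SortColumns(table) on the table from the question should leave the columns as Col0, Col1, Col3, Col6, Col43.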
Here's a combination of the preceding answers. Use the built in Sort() method of a List or an Array of strings to sort a list of the column names, then use the DataColumn.SetOrdinal() method to rearrange your DataTable's columns to match the sorted list.
List<string> columnNames = new List<string>();
foreach (DataColumn col in table.Columns)
{
columnNames.Add(col.ColumnName);
}
columnNames.Sort();
int i = 0;
foreach (string name in columnNames)
{
table.Columns[name].SetOrdinal(i);
i++;
}
(Where "table" is the name of your DataTable.)
