I have a string like the one below. I want to put the second column (3, 9, 10, 11, ...) into one array and the third column (5, 8, 4, 3, ...) into another.
C8| 3| 5| 0| | 0|1|
C8| 9| 8| 0| | 0|1|
C8| 10| 4| 0| | 0|1|
C8| 11| 3| 0| | 0|1|
C8| 12| 0| 0| | 0|1|
C8| 13| 0| 0| | 0|1|
C8| 14| 0| 0| | 0|1|
This method originally parsed numbers by rows. Now I have columns.
How can I do this in the Parse method? I have been trying for hours and don't know what to do.
The Add method expects two integers: int secondNumberFinal, int thirdNumberFinal
private Parse(string lines)
{
const int secondColumn = 1;
const int thirdColum = 2;
var secondNumbers = lines[secondColumn].Split('\n'); // i have to split by new line, right?
var thirdNumbers = lines[thirdColum].Split('\n'); // i have to split by new line, right?
var res = new Collection();
for (var i = 0; i < secondNumbers.Length; i++)
{
try
{
var secondNumberFinal = Int32.Parse(secondNumbers[i]);
var thirdNumberFinal = Int32.Parse(thirdNumbers[i]);
res.Add(secondNumberFinal, thirdNumberFinal);
}
catch (Exception ex)
{
log.Error(ex);
}
}
return res;
}
thank you!
The code below should do it. The logic is simple: split the string on '\n' (check whether you need "\r\n" or some other line ending), then split each line on '|'. Returning the data as an IEnumerable of Tuple gives you both flexibility and lazy execution. You can convert it to a List at the call site with the Enumerable.ToList extension method if you wish.
It uses LINQ (Select) instead of foreach loops because that is more concise in this situation.
static IEnumerable<Tuple<int, int>> Parse(string lines) {
const int secondColumn = 1;
const int thirdColum = 2;
return lines.Split('\n')
.Select(line => line.Split('|'))
.Select(items => Tuple.Create(int.Parse(items[secondColumn]), int.Parse(items[thirdColum])));
}
If the original is a single string, split it once on newline to produce an array of strings. Parse each of those strings by splitting on | and select the second and third values.
Partially rewriting your method for you:
private void Parse(string lines)
{
const int secondColumn = 1;
const int thirdColum = 2;
// Split on '\n' and trim a trailing '\r' so both "\n" and "\r\n" input work.
string[] arrlines = lines.Split('\n');
foreach (string line in arrlines)
{
string[] numbers = line.TrimEnd('\r').Split('|');
var secondNumberFinal = Int32.Parse(numbers[secondColumn]);
var thirdNumberFinal = Int32.Parse(numbers[thirdColum]);
// Whatever you want to do with them here
}
}
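Both answers above throw on the first malformed line. A more forgiving variant could look like the sketch below. It is only an illustration, not the original poster's method: ParseColumns is a made-up name, it returns value tuples instead of calling Add, and it is written as C# 9 top-level statements so it can be pasted into a single file.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns one (second, third) pair per well-formed line.
static List<(int Second, int Third)> ParseColumns(string text)
{
    const int secondColumn = 1;
    const int thirdColumn = 2;
    var pairs = new List<(int Second, int Third)>();

    // Splitting on both characters covers "\n" and "\r\n" input alike.
    var lines = text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    foreach (var line in lines)
    {
        var cells = line.Split('|');
        if (cells.Length <= thirdColumn)
            continue; // too few columns, skip the line

        // int.TryParse accepts the leading space in cells such as " 10".
        if (int.TryParse(cells[secondColumn], out var second) &&
            int.TryParse(cells[thirdColumn], out var third))
        {
            pairs.Add((second, third));
        }
    }
    return pairs;
}

var sample = "C8| 3| 5| 0| | 0|1|\r\nC8| 9| 8| 0| | 0|1|\nC8| 10| 4| 0| | 0|1|";
foreach (var pair in ParseColumns(sample))
    Console.WriteLine($"{pair.Second}, {pair.Third}");
```

Lines that cannot be parsed are skipped silently here; if you need to log them as in the question, add an else branch to the TryParse check.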
I have a list of a custom class, ModeTime, whose structure is shown below:
private class ModeTime
{
public DateTime Date { get; set; }
public string LineName { get; set; }
public string Mode { get; set; }
public TimeSpan Time { get; set; }
}
In this list, some consecutive items have the same LineName and Mode. I need to sum the Time property of such items and replace them with a single item that carries the summed Time, leaving LineName and Mode unchanged; Date should be taken from the first of the replaced items. Here is an example:
Original:
Date | LineName | Mode | Time
01.09.2018 | Line1 | Auto | 00:30:00
01.09.2018 | Line2 | Auto | 00:10:00
01.09.2018 | Line2 | Auto | 00:05:00
01.09.2018 | Line2 | Manual | 00:02:00
01.09.2018 | Line2 | Auto | 00:08:00
01.09.2018 | Line1 | Manual | 00:25:00
01.09.2018 | Line2 | Auto | 00:05:00
02.09.2018 | Line2 | Auto | 00:12:00
02.09.2018 | Line2 | Auto | 00:07:00
02.09.2018 | Line1 | Auto | 00:05:00
Modified:
Date | LineName | Mode | Time
01.09.2018 | Line1 | Auto | 00:30:00
01.09.2018 | Line2 | Auto | 00:15:00
01.09.2018 | Line2 | Manual | 00:02:00
01.09.2018 | Line2 | Auto | 00:08:00
01.09.2018 | Line1 | Manual | 00:25:00
01.09.2018 | Line2 | Auto | 00:24:00
02.09.2018 | Line1 | Auto | 00:05:00
I tried to write a method to do this. It partly works, but some items still remain unmerged.
private static List<ModeTime> MergeTime(List<ModeTime> modeTimes)
{
modeTimes = modeTimes.OrderBy(e => e.Date).ToList();
var mergedModeTimes = new List<ModeTime>();
for (var i = 0; i < modeTimes.Count; i++)
{
if (i - 1 != -1)
{
if (modeTimes[i].LineName == modeTimes[i - 1].LineName &&
modeTimes[i].Mode == modeTimes[i - 1].Mode)
{
mergedModeTimes.Add(new ModeTime
{
Date = modeTimes[i - 1].Date,
LineName = modeTimes[i - 1].LineName,
Mode = modeTimes[i - 1].Mode,
Time = modeTimes[i - 1].Time + modeTimes[i].Time
});
i += 2;
}
else
{
mergedModeTimes.Add(modeTimes[i]);
}
}
else
{
mergedModeTimes.Add(modeTimes[i]);
}
}
return mergedModeTimes;
}
I have also tried wrapping the for loop in do {} while () and shrinking the source list modeTimes. Unfortunately that leads to an endless loop and runaway memory use (I waited until it reached 5 GB).
I hope someone can help. I searched for this problem; in similar cases people use GroupBy. But I don't think it will work in my case, because I must sum items with the same LineName and Mode only if they are adjacent in the list.
The most primitive solution would be something like this:
var items = GetItems();
var sum = TimeSpan.Zero;
for (int index = items.Count - 1; index > 0; index--)
{
var item = items[index];
var previousItem = items[index - 1]; // the neighbour closer to the start of the list
if (item.LineName == previousItem.LineName && item.Mode == previousItem.Mode)
{
sum += item.Time;
items.RemoveAt(index);
}
else
{
item.Time += sum;
sum = TimeSpan.Zero;
}
}
items.First().Time += sum;
Edit: I had missed the last line, which adds the leftover sum. It matters only when the first and second elements of the collection have the same LineName and Mode; without it, the aggregated time would not be assigned to the first element.
You can use LINQ's GroupBy. To group only consecutive elements, this uses a trick. It stores the key values in a tuple together with a group index which is only incremented when LineName or Mode changes.
int i = 0; // Used as group index.
(int Index, string LN, string M) prev = default; // Stores previous key for later comparison.
var modified = original
.GroupBy(mt => {
var ret = (Index: prev.LN == mt.LineName && prev.M == mt.Mode ? i : ++i,
LN: mt.LineName, M: mt.Mode);
prev = (Index: i, LN: mt.LineName, M: mt.Mode);
return ret;
})
.Select(g => new ModeTime {
Date = g.Min(mt => mt.Date),
LineName = g.Key.LN,
Mode = g.Key.M,
Time = new TimeSpan(g.Sum(mt => mt.Time.Ticks))
})
.ToList();
This produces the expected 7 result rows.
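If you would rather not rely on side effects inside the GroupBy key selector, a single forward pass does the same job. The sketch below is an assumption-laden illustration rather than a drop-in replacement: MergeConsecutive is a hypothetical generic helper, and the usage part uses value tuples instead of ModeTime so that it stays self-contained (C# 9 top-level statements).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: walks the sequence once and folds each item into the
// previous output item whenever sameRun says they belong to the same run.
static List<T> MergeConsecutive<T>(
    IEnumerable<T> source, Func<T, T, bool> sameRun, Func<T, T, T> combine)
{
    var merged = new List<T>();
    foreach (var item in source)
    {
        int last = merged.Count - 1;
        if (last >= 0 && sameRun(merged[last], item))
            merged[last] = combine(merged[last], item); // extend the current run
        else
            merged.Add(item);                           // a new run starts here
    }
    return merged;
}

var day = new DateTime(2018, 9, 1);
var sample = new[]
{
    (Date: day, LineName: "Line2", Mode: "Auto", Time: TimeSpan.FromMinutes(10)),
    (Date: day, LineName: "Line2", Mode: "Auto", Time: TimeSpan.FromMinutes(5)),
    (Date: day, LineName: "Line2", Mode: "Manual", Time: TimeSpan.FromMinutes(2)),
    (Date: day, LineName: "Line2", Mode: "Auto", Time: TimeSpan.FromMinutes(8)),
};

var result = MergeConsecutive(
    sample,
    (a, b) => a.LineName == b.LineName && a.Mode == b.Mode,
    // Keep Date, LineName and Mode of the first item, add up the times.
    (a, b) => (a.Date, a.LineName, a.Mode, a.Time + b.Time));

foreach (var entry in result)
    Console.WriteLine($"{entry.LineName} {entry.Mode} {entry.Time}");
```

Because the first item of each run is the one kept in the output list, its Date survives, which matches the requirement in the question.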
I am trying to parse the following table with C#:
+-------------+-----------------------------------------------------------------------------------+----------------+
| 1 | 2 | 3 |
+-------------+-----------------------------------------------------------------------------------+----------------+
| 000 | Собственные средства (капитал), итого, | |
| | в том числе: | 1024231079 |
+-------------+-----------------------------------------------------------------------------------+----------------+
| 100 |Источники базового капитала: | 1291298211 |
+-------------+-----------------------------------------------------------------------------------+----------------+
| 100.1 |Уставный капитал кредитной организации: | 651033884 |
+-------------+-----------------------------------------------------------------------------------+----------------+
| 100.1.1 |сформированный обыкновенными акциями | 129605413 |
+-------------+-----------------------------------------------------------------------------------+----------------+
| 100.1.2 |сформированный привилегированными акциями | 521428471 |
+-------------+-----------------------------------------------------------------------------------+----------------+
| 100.1.3 |сформированный долями | 0 |
+-------------+-----------------------------------------------------------------------------------+----------------+
| 100.2 |Эмиссионный доход: | 439401101 |
+-------------+-----------------------------------------------------------------------------------+----------------+
| 100.2.1 |кредитной организации в организационно-правовой форме акционерного общества, всего,| |
| | в том числе: | 439401101 |
+-------------+-----------------------------------------------------------------------------------+----------------+
My code is
string[] dels = { "\r\n" };
string[] strArr = someStr.Split(dels, StringSplitOptions.None);
Console.WriteLine(strArr);
foreach (String sourcestring in strArr)
{
if (sourcestring != null)
{
Console.WriteLine("Processing string: ");
Console.WriteLine(sourcestring);
//Regex regex = new Regex(@"^(\|)(.*)(\|)(.*[а-я]{3}.*)(\|)(.*\d+.*)(\|)(.*[\d+|Х].*)(\|)(.*[\d+|Х].*)(\|)(.*\d+.*)(\|)$");
//Regex regex = new Regex(@"^(\|)(\s?|\d+[\.?])(\|)(.*[а-я]{3}.*)(\|)(.*\d+.*)(\|)(.*[\d+|Х].*)(\|)(.*[\d+|Х].*)(\|)(.*\d+.*)(\|)$");
Regex regex = new Regex(@"^(\|)(\d+\.?\d+)");
MatchCollection mc = regex.Matches(sourcestring);
int mIdx = 0;
foreach (Match m in mc)
{
for (int gIdx = 0; gIdx < m.Groups.Count; gIdx++)
{
Console.WriteLine("[{0}][{1}] = {2}", mIdx, regex.GetGroupNames()[gIdx], m.Groups[gIdx].Value);
}
mIdx++;
}
Console.WriteLine("---------------------------------------------------------");
}
}
I need to extract values of lines
4 - ' 000 ', ' Собственные средства (капитал), итого, ', ' '
5 - ' ', ' в том числе: ', ' 1024231079 '
and lines 7, 9, ...
The main issue now is that I don't know how to write a regular expression that matches the first-column values, which can be:
' 000 '
' '
' 100 '
' 100.1 '
' 100.1.1 '
and so on.
The second issue is in the second column. I tried to parse it with (.*[а-я]{3}.*), but it fails on lines that contain symbols such as '(', ',', '.', ':'.
I would appreciate any possible solutions.
I think regex would be overkill in this case; a simple manual parsing approach would be a lot easier:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Two approaches that might work in this case:
Parse the first line (+---+--- ...) to determine the width of each column, then cut your data apart with Substring.
Split each line by |.
Below, I've outlined the basics of the second approach (no sanity checks).
If your data can contain | too, you might want to parse it based on cell width rather than splitting on |.
// Row is defined below - simple data storage for the three columns
List<Row> rows = new List<Row>();
Row currentRow = null;
// Process each line
foreach (string line in input.Split(new string[] {"\r\n"}, StringSplitOptions.RemoveEmptyEntries))
{
// Row separator or content?
if (line.StartsWith("+"))
{
if (currentRow != null)
{
rows.Add(currentRow);
currentRow = null;
}
}
else if (line.StartsWith("|"))
{
string[] parts = line.Split(new char[] {'|'});
if(currentRow == null)
currentRow = new Row();
// Might need additional processing
currentRow.Column1 += parts[1].Trim();
currentRow.Column2 += parts[2].TrimEnd();
currentRow.Column3 += parts[3].TrimStart();
}
else
{
//Invalid data?
}
}
// Show result
foreach(Row row in rows)
{
Console.WriteLine("[{0}][{1}] = {2}", row.Column1, row.Column2, row.Column3);
}
Instead of a custom class you could of course use a Tuple<string,string,string> or whatever fits your data types.
public class Row
{
public string Column1 = "";
public string Column2 = "";
public string Column3 = "";
}
Example on DotNetFiddle
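The first approach listed above (taking the column widths from the border line and cutting with Substring) is not shown in the answer. A rough sketch of it follows; ColumnSpans and CutCells are hypothetical names, and it assumes every content line is laid out exactly like the border line. It is written as C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Reads column boundaries from a border line such as "+-----+---+".
// Returns (start, length) of the text between each pair of '+' characters.
static List<(int Start, int Length)> ColumnSpans(string border)
{
    var spans = new List<(int Start, int Length)>();
    int prev = border.IndexOf('+');
    while (prev >= 0)
    {
        int next = border.IndexOf('+', prev + 1);
        if (next < 0) break;
        spans.Add((prev + 1, next - prev - 1));
        prev = next;
    }
    return spans;
}

// Cuts one content line into trimmed cells using the spans from the border.
static string[] CutCells(string line, List<(int Start, int Length)> spans)
{
    var cells = new string[spans.Count];
    for (int i = 0; i < spans.Count; i++)
    {
        var (start, length) = spans[i];
        // Guard against lines that are shorter than the border line.
        cells[i] = start >= line.Length
            ? ""
            : line.Substring(start, Math.Min(length, line.Length - start)).Trim();
    }
    return cells;
}

var borderLine = "+-----+----------+-----+";
var contentLine = "| 100 | a | b    |  42 |"; // the middle cell contains a '|'
var widths = ColumnSpans(borderLine);
Console.WriteLine(string.Join(" / ", CutCells(contentLine, widths)));
```

Unlike Split('|'), this keeps a | that appears inside a cell, at the cost of depending on the fixed-width layout.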
For example, this SecurityProtocol property can be assigned using an OR operator:
System.Net.ServicePointManager.SecurityProtocol =
SecurityProtocolType.Tls | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls12;
Now, if we wanted to avoid hard-coding this assignment in the application by moving it to AppSettings as a comma-separated string like "192,768,3072", how would we convert the string to its enumeration values and assign the result to a property using an OR operator?
[Flags]
public enum E
{
Foo = 1 << 2,
Bar = 1 << 4,
Baz = 1 << 9,
Planxty = Foo | Bar | Baz
}
...
var s = "16,4,512";
E enumresult =
// Split string by commas...
s.Split(',')
// Parse each numeric substring in turn and cast the result to the enum type...
.Select(nstr => (E)Int32.Parse(nstr))
// bitwise or each succeeding value against the rest
.Aggregate((a, b) => a | b);
You can cast an integer directly to an enum value, provided it's a valid value for that enum type. This will work fine:
var x = (E)(4 | 512);
You can split the string on commas and apply | in a loop over the parsed values:
string v = "192,768,3072";
string[] vals = v.Split(',');
var result = (SecurityProtocolType)int.Parse(vals[0]);
for (int i = 1; i < vals.Length; i++)
result = result | (SecurityProtocolType)int.Parse(vals[i]);
Here is the DEMO
You can create a function for it like this:
public static SecurityProtocolType GetProtocolType(string v)
{
string[] vals = v.Split(',');
var result = (SecurityProtocolType)int.Parse(vals[0]);
for (int i = 1; i < vals.Length; i++)
result = result | (SecurityProtocolType)int.Parse(vals[i]);
return result;
}
and use it like:
GetProtocolType("192,3072");
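Since the value comes from configuration, int.Parse will throw on an empty or mistyped setting. A defensive variant might look like this sketch; TryParseProtocols is a hypothetical name, not part of any library, and the snippet is written as C# 9 top-level statements.

```csharp
using System;
using System.Net;

// Hypothetical helper: ORs together the numeric values in a comma-separated
// setting such as "192,768,3072". Returns false if any entry is not a number.
static bool TryParseProtocols(string setting, out SecurityProtocolType protocols)
{
    protocols = 0;
    if (string.IsNullOrWhiteSpace(setting))
        return false;

    SecurityProtocolType combined = 0;
    foreach (var part in setting.Split(','))
    {
        // int.TryParse ignores surrounding spaces, so "192, 768" is fine.
        if (!int.TryParse(part, out var number))
            return false;
        combined |= (SecurityProtocolType)number;
    }
    protocols = combined;
    return true;
}

if (TryParseProtocols("192, 768, 3072", out var parsed))
    Console.WriteLine((int)parsed); // prints 4032
```

Note that the cast accepts any integer, so a number that does not correspond to a defined protocol is not rejected here.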
I need to take data that I am reading in from a WIKI markup page and store it as a table structure. I am trying to figure out how to properly parse the below markup syntax into some table data structure in C#
Here is an example table:
|| Owner || Action || Status || Comments ||
| Bill | Fix the lobby | In Progress | This is easy |
| Joe | Fix the bathroom | In Progress | Plumbing \\
\\
Electric \\
\\
Painting \\
\\
\\ |
| Scott | Fix the roof | Complete | This is expensive |
and here is how it comes in directly:
|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|
So as you can see:
The column headers have "||" as the separator
Row columns have "|" as the separator
A row might span multiple lines (as in the second data row above), so I would have to keep reading until I hit the same number of "|" (columns) as in the header row.
I tried reading line by line and then concatenating lines that had "\\" in between them, but that seemed a bit hacky.
I also tried reading the input as one full string, parsing by "||" first, and then reading until I hit the same number of "|" before moving to the next row. This seemed to work, but it feels like there might be a more elegant way using regular expressions or something similar.
Can anyone suggest the correct way to parse this data?
I have largely replaced the previous answer, due to the fact that the format of the input after your edit is substantially different from the one posted before. This leads to a somewhat different solution.
Because there are no longer any line breaks after a row, the only way to determine for sure where a row ends, is to require that each row has the same number of columns as the table header. That is at least if you don't want to rely on some potentially fragile white space convention present in the one and only provided example string (i.e. that the row separator is the only | not preceded by a space). Your question at least does not provide this as the specification for a row delimiter.
The below "parser" provides at least the error handling validity checks that can be derived from your format specification and example string and also allows for tables that have no rows. The comments explain what it is doing in basic steps.
public class TableParser
{
const StringSplitOptions SplitOpts = StringSplitOptions.None;
const string RowColSep = "|";
static readonly string[] HeaderColSplit = { "||" };
static readonly string[] RowColSplit = { RowColSep };
static readonly string[] MLColSplit = { @"\\" };
public class TableRow
{
public List<string[]> Cells;
}
public class Table
{
public string[] Header;
public TableRow[] Rows;
}
public static Table Parse(string text)
{
// Isolate the header columns and rows remainder.
var headerSplit = text.Split(HeaderColSplit, SplitOpts);
Ensure(headerSplit.Length > 1, "At least 1 header column is required in the input");
// Need to check whether there are any rows.
var hasRows = headerSplit.Last().IndexOf(RowColSep) >= 0;
var header = headerSplit.Skip(1)
.Take(headerSplit.Length - (hasRows ? 2 : 1))
.Select(c => c.Trim())
.ToArray();
if (!hasRows) // If no rows for this table, we are done.
return new Table() { Header = header, Rows = new TableRow[0] };
// Get all row columns from the remainder.
var rowsCols = headerSplit.Last().Split(RowColSplit, SplitOpts);
// Require same amount of columns for a row as the header.
Ensure((rowsCols.Length % (header.Length + 1)) == 1,
"The number of row columns does not match the number of header columns");
var rows = new TableRow[(rowsCols.Length - 1) / (header.Length + 1)];
// Fill rows by sequentially taking # header column cells
for (int ri = 0, start = 1; ri < rows.Length; ri++, start += header.Length + 1)
{
rows[ri] = new TableRow() {
Cells = rowsCols.Skip(start).Take(header.Length)
.Select(c => c.Split(MLColSplit, SplitOpts).Select(p => p.Trim()).ToArray())
.ToList()
};
}
return new Table { Header = header, Rows = rows };
}
private static void Ensure(bool check, string errorMsg)
{
if (!check)
throw new InvalidDataException(errorMsg);
}
}
When used like this:
public static void Main(params string[] args)
{
var wikiLine = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
var table = TableParser.Parse(wikiLine);
Console.WriteLine(string.Join(", ", table.Header));
foreach (var r in table.Rows)
Console.WriteLine(string.Join(", ", r.Cells.Select(c => string.Join(Environment.NewLine + "\t# ", c))));
}
It will produce the below output:
Where "\t# " represents a newline caused by the presence of \\ in the input.
Here's a solution which populates a DataTable. It does require a little bit of data massaging (Trim), but the main parsing is Splits and LINQ.
var str = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
var headerStop = str.LastIndexOf("||");
var headers = str.Substring(0, headerStop).Split(new string[1] { "||" }, StringSplitOptions.None).Skip(1).ToList();
var records = str.Substring(headerStop + 4).TrimEnd(new char[2] { ' ', '|' }).Split(new string[1] { "| |" }, StringSplitOptions.None).ToList();
var tbl = new DataTable();
headers.ForEach(h => tbl.Columns.Add(h.Trim()));
records.ForEach(r => tbl.Rows.Add(r.Split('|')));
This makes some assumptions but seems to work for your sample data. I'm sure that if I worked at it I could combine the expressions and clean it up, but you'll get the idea.
It will also allow for rows that do not have the same number of cells as the header, which I think is something Confluence can do.
List<List<string>> table = new List<List<string>>();
var match = Regex.Match(raw, @"(?:(?:\|\|([^|]*))*\n)?");
if (match.Success)
{
var headersWithExtra = match.Groups[1].Captures.Cast<Capture>().Select(c=>c.Value);
List<String> headerRow = headersWithExtra.Take(headersWithExtra.Count()-1).ToList();
if (headerRow.Count > 0)
{
table.Add(headerRow);
}
}
match = Regex.Match(raw + "\r\n", @"[^\n]*\n" + @"(?:\|([^|]*))*");
var cellsWithExtra = match.Groups[1].Captures.Cast<Capture>().Select(c=>c.Value);
List<string> row = new List<string>();
foreach (string cell in cellsWithExtra)
{
if (cell.Trim(' ', '\t') == "\r\n")
{
if (!table.Contains(row) && row.Count > 0)
{
table.Add(row);
}
row = new List<string>();
}
else
{
row.Add(cell);
}
}
This ended up very similar to Jon Tirjan's answer, although it cuts the LINQ to a single statement (the code to replace that last one was horrifically ugly) and is a bit more extensible. For example, it will replace the Confluence line breaks \\ with a string of your choosing, you can choose to trim or not trim whitespace from around elements, etc.
private void ParseWikiTable(string input, string newLineReplacement = " ")
{
string separatorHeader = "||";
string separatorRow = "| |";
string separatorElement = "|";
input = Regex.Replace(input, @"[ \\]{2,}", newLineReplacement);
string inputHeader = input.Substring(0, input.LastIndexOf(separatorHeader));
string inputContent = input.Substring(input.LastIndexOf(separatorHeader) + separatorHeader.Length);
string[] headerArray = SimpleSplit(inputHeader, separatorHeader);
string[][] rowArray = SimpleSplit(inputContent, separatorRow).Select(r => SimpleSplit(r, separatorElement)).ToArray();
// do something with output data
TestPrint(headerArray);
foreach (var r in rowArray) { TestPrint(r); }
}
private string[] SimpleSplit(string input, string separator, bool trimWhitespace = true)
{
input = input.Trim();
if (input.StartsWith(separator)) { input = input.Substring(separator.Length); }
if (input.EndsWith(separator)) { input = input.Substring(0, input.Length - separator.Length); }
string[] segments = input.Split(new string[] { separator }, StringSplitOptions.None);
if (trimWhitespace)
{
for (int i = 0; i < segments.Length; i++)
{
segments[i] = segments[i].Trim();
}
}
return segments;
}
private void TestPrint(string[] lst)
{
string joined = "[" + String.Join("::", lst) + "]";
Console.WriteLine(joined);
}
Console output from your direct input string:
[Owner::Action::Status::Comments]
[Bill::fix the lobby::In Progress::This is eary]
[Joe::fix the bathroom::In progress::plumbing Electric Painting]
[Scott::fix the roof::Complete::this is expensive]
A generic regex solution that populates a DataTable and is a little flexible with the syntax.
var text = @"|| Owner|| Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
// Get Headers
var regHeaders = new Regex(@"\|\|\s*(\w[^\|]+)", RegexOptions.Compiled);
var headers = regHeaders.Matches(text);
//Get Rows, based on number of headers columns
var regLinhas = new Regex(String.Format(@"(?:\|\s*(\w[^\|]+)){{{0}}}", headers.Count));
var rows = regLinhas.Matches(text);
var tbl = new DataTable();
foreach (Match header in headers)
{
tbl.Columns.Add(header.Groups[1].Value);
}
foreach (Match row in rows)
{
tbl.Rows.Add(row.Groups[1].Captures.OfType<Capture>().Select(col => col.Value).ToArray());
}
Here's a solution involving regular expressions. It takes a single string as input and returns a List<string> of headers and a List<List<string>> of rows/columns. It also trims white space, which may or may not be the desired behavior, so be aware of that. It even prints things nicely :)
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace parseWiki
{
class Program
{
static void Main(string[] args)
{
string content = @"|| Owner || Action || Status || Comments || | Bill\\ | fix the lobby |In Progress | This is eary| | Joe\\ |fix the bathroom\\ | In progress| plumbing \\Electric \\Painting \\ \\ | | Scott \\ | fix the roof \\ | Complete | this is expensive|";
content = content.Replace(@"\\", "");
string headerContent = content.Substring(0, content.LastIndexOf("||") + 2);
string cellContent = content.Substring(content.LastIndexOf("||") + 2);
MatchCollection headerMatches = new Regex(@"\|\|([^|]*)(?=\|\|)", RegexOptions.Singleline).Matches(headerContent);
MatchCollection cellMatches = new Regex(@"\|([^|]*)(?=\|)", RegexOptions.Singleline).Matches(cellContent);
List<string> headers = new List<string>();
foreach (Match match in headerMatches)
{
if (match.Groups.Count > 1)
{
headers.Add(match.Groups[1].Value.Trim());
}
}
List<List<string>> body = new List<List<string>>();
List<string> newRow = new List<string>();
foreach (Match match in cellMatches)
{
if (newRow.Count > 0 && newRow.Count % headers.Count == 0)
{
body.Add(newRow);
newRow = new List<string>();
}
else
{
newRow.Add(match.Groups[1].Value.Trim());
}
}
body.Add(newRow);
print(headers, body);
}
static void print(List<string> headers, List<List<string>> body)
{
var CELL_SIZE = 20;
for (int i = 0; i < headers.Count; i++)
{
Console.Write(headers[i].Truncate(CELL_SIZE).PadRight(CELL_SIZE) + " ");
}
Console.WriteLine("\n" + "".PadRight( (CELL_SIZE + 2) * headers.Count, '-'));
for (int r = 0; r < body.Count; r++)
{
List<string> row = body[r];
for (int c = 0; c < row.Count; c++)
{
Console.Write(row[c].Truncate(CELL_SIZE).PadRight(CELL_SIZE) + " ");
}
Console.WriteLine("");
}
Console.WriteLine("\n\n\n");
Console.ReadKey(false);
}
}
public static class StringExt
{
public static string Truncate(this string value, int maxLength)
{
if (string.IsNullOrEmpty(value) || value.Length <= maxLength) return value;
return value.Substring(0, maxLength - 3) + "...";
}
}
}
Read the input string one character at a time and use a state-machine to decide what should be done with each input character. This approach probably needs more code, but it will be easier to maintain and to extend than regular expressions.
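To make the state-machine suggestion concrete, here is a small sketch under stated assumptions: the header uses "||", rows use "|", every row has exactly as many cells as the header, and a doubled backslash inside a cell marks a line break. ParseWikiTable and the three state names are made up for this illustration, and the snippet is written as C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Character-level state machine for the single-line wiki table format.
static (List<string> Header, List<List<string>> Rows) ParseWikiTable(string text)
{
    const int InHeader = 0, InRow = 1, BetweenRows = 2;

    var header = new List<string>();
    var rows = new List<List<string>>();
    var row = new List<string>();
    var cell = new StringBuilder();
    int state = InHeader;
    bool headerOpened = false; // true once the first "||" has been seen

    for (int i = 0; i < text.Length; i++)
    {
        char c = text[i];
        bool doubled = i + 1 < text.Length && text[i + 1] == c;

        if (c == '\\' && doubled)
        {
            cell.Append('\n'); // wiki line break inside a cell
            i++;
        }
        else if (c != '|')
        {
            if (state != BetweenRows) cell.Append(c);
        }
        else if (state == InHeader && doubled)
        {
            // "||" closes the current header cell (or opens the first one).
            if (headerOpened) header.Add(cell.ToString().Trim());
            headerOpened = true;
            cell.Clear();
            i++;
        }
        else if (state == InHeader || state == BetweenRows)
        {
            // A single "|" opens a data row.
            state = InRow;
            cell.Clear();
        }
        else
        {
            // A single "|" inside a row closes the current cell.
            row.Add(cell.ToString().Trim());
            cell.Clear();
            if (row.Count == header.Count)
            {
                rows.Add(row);
                row = new List<string>();
                state = BetweenRows;
            }
        }
    }
    return (header, rows);
}

var wiki = @"|| Owner|| Action || | Bill\\ | fix the lobby | | Joe | fix the roof \\ today |";
var table = ParseWikiTable(wiki);
Console.WriteLine(string.Join(", ", table.Header));
foreach (var r in table.Rows)
    Console.WriteLine(string.Join(" / ", r));
```

Adding a rule (for example escaped pipes) means adding one more branch or state, which is the maintainability argument made above. A trailing incomplete row is silently dropped in this sketch.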
I am trying to read a CSV into a datatable.
The CSV may have hundreds of columns and only up to 20 rows.
It will look something like this:
+----------+-----------------+-------------+---------+---+
| email1 | email2 | email3 | email4 | … |
+----------+-----------------+-------------+---------+---+
| ccemail1 | anotherccemail1 | 3rdccemail1 | ccemail | |
| ccemail2 | anotherccemail2 | 3rdccemail2 | | |
| ccemail3 | anotherccemail3 | | | |
| ccemail4 | anotherccemail4 | | | |
| ccemail5 | | | | |
| ccemail6 | | | | |
| ccemail7 | | | | |
| … | | | | |
+----------+-----------------+-------------+---------+---+
I am trying to use GenericParser for this; however, I believe it requires you to know the column names.
string strID, strName, strStatus;
using (GenericParser parser = new GenericParser())
{
parser.SetDataSource("MyData.txt");
parser.ColumnDelimiter = "\t".ToCharArray();
parser.FirstRowHasHeader = true;
parser.SkipStartingDataRows = 10;
parser.MaxBufferSize = 4096;
parser.MaxRows = 500;
parser.TextQualifier = '\"';
while (parser.Read())
{
strID = parser["ID"]; //as you can see this requires you to know the column names
strName = parser["Name"];
strStatus = parser["Status"];
// Your code here ...
}
}
Is there a way to read this file into a DataTable without knowing the column names?
It's so simple!
var adapter = new GenericParsing.GenericParserAdapter(filepath);
DataTable dt = adapter.GetDataTable();
This will automatically do everything for you.
I looked at the source code, and you can also access the data by column index, like this:
var firstColumn = parser[0];
Replace the 0 with the column number.
The number of columns can be found using
parser.ColumnCount
I'm not familiar with GenericParser; I would suggest using tools like TextFieldParser, FileHelpers or this CSV-Reader.
But this simple manual approach should also work:
IEnumerable<String> lines = File.ReadAllLines(filePath);
String header = lines.First();
var headers = header.Split(new[]{','}, StringSplitOptions.RemoveEmptyEntries);
DataTable tbl = new DataTable();
for (int i = 0; i < headers.Length; i++)
{
tbl.Columns.Add(headers[i]);
}
var data = lines.Skip(1);
foreach(var line in data)
{
// Keep empty entries here, otherwise values after an empty cell shift into the wrong column.
var fields = line.Split(new[]{','}, StringSplitOptions.None);
DataRow newRow = tbl.Rows.Add();
newRow.ItemArray = fields;
}
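For comparison, the TextFieldParser route mentioned above could look roughly like the sketch below. ReadCsv is a hypothetical helper; it assumes comma-delimited input, unique non-empty header names, and data rows with no more fields than the header. TextFieldParser lives in Microsoft.VisualBasic.FileIO, so the project needs a reference to the Microsoft.VisualBasic assembly. The snippet is written as C# 9 top-level statements.

```csharp
using System;
using System.Data;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Loads delimited text into a DataTable, taking column names from the first row.
static DataTable ReadCsv(TextReader reader)
{
    var tableOut = new DataTable();
    using (var parser = new TextFieldParser(reader))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.HasFieldsEnclosedInQuotes = true;

        if (parser.EndOfData)
            return tableOut; // empty input, no columns

        foreach (var name in parser.ReadFields())
            tableOut.Columns.Add(name);

        while (!parser.EndOfData)
        {
            // Quoted fields and embedded commas are handled by the parser.
            string[] fields = parser.ReadFields();
            tableOut.Rows.Add(fields);
        }
    }
    return tableOut;
}

var csv = "email1,email2,email3\ncc1,other1,third1\ncc2,,\n";
var loaded = ReadCsv(new StringReader(csv));
Console.WriteLine($"{loaded.Columns.Count} columns, {loaded.Rows.Count} rows");
```

For a file on disk, pass File.OpenText(filePath) instead of the StringReader. No column names need to be known in advance, which is what the question asks for.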
I used GenericParser to do it.
On the first pass through the loop I collect the column names, then use them to read each value and add it to a list.
In my case I have pivoted the data, but here is a code sample in case it helps someone:
bool firstRow = true;
List<string> columnNames = new List<string>();
// ProjectCost is the poster's own class (not shown); csvData collects the pivoted values.
List<ProjectCost> csvData = new List<ProjectCost>();
while (parser.Read())
{
if (firstRow)
{
for (int i = 0; i < parser.ColumnCount; i++)
{
if (parser.GetColumnName(i).Contains("FY"))
{
columnNames.Add(parser.GetColumnName(i));
Console.WriteLine("Column found: {0}", parser.GetColumnName(i));
}
}
firstRow = false;
}
foreach (var col in columnNames)
{
double actualCost = 0;
bool hasValueParsed = Double.TryParse(parser[col], out actualCost);
csvData.Add(new ProjectCost
{
ProjectItem = parser["ProjectItem"],
ActualCosts = actualCost,
ColumnName = col
});
}
}