CSVHelper, go back to the beginning after reading the header row - c#

I have the following code, where I look at the column headers before a conditional statement:
var csvConfig = new CsvHelper.Configuration.CsvConfiguration(CultureInfo.InvariantCulture)
{
    // csvconfig.Delimiter = "\t";
    HasHeaderRecord = true
};
using (StreamReader reader = new StreamReader(filePath, Encoding.UTF8))
using (CsvReader csv1 = new CsvReader(reader, csvConfig))
{
    CsvReader csv2 = csv1;
    csv2.Read();
    csv2.ReadHeader();
    string[] headers = csv2.HeaderRecord;
    if (headers[0].Replace("\0", "").ToUpper() == "TEST")
    {
        using (var dr1 = new CsvDataReader(csv1))
        {
            var dt1 = new DataTable();
            dt1.Load(dr1);
        }
    }
}
I thought that having two variables for the CsvReader (csv1 and csv2) would make this feasible, but it seems that they both refer to the same object in memory.
Therefore, when I want to fill my DataTable, the header row has already been read and is not loaded into the DataTable.
How can I make sure that csv2 contains the whole CSV and is distinct from csv1? Is there a method to go back to the beginning, or do I need to read the whole CSV again with a new CsvReader?
Thank you

In C#, types are categorized by how their values are stored in memory: value types and reference types (pointer types also exist in unsafe code).
For example, an int is a value type, whereas classes such as your reader are reference types (string is a reference type too, although it is immutable).
When you write csv2 = csv1, both variables refer to the same object. They are two names for the same thing, so any action performed through csv1 is also visible through csv2.
Check whether CsvReader implements Clone():
CsvReader csv2 = csv1.Clone();
If it does, the method creates a new object with the same information that does not share the same memory area.
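The aliasing described above can be reproduced with the standard library alone. The following is a minimal sketch that uses a StringReader as a stand-in for the CsvReader; the class and method names (AliasDemo, ReadThroughAlias) are made up for illustration.

```csharp
using System;
using System.IO;

public static class AliasDemo
{
    // Advances the reader through a second variable and then reads
    // through the first one to show that both point at the same object.
    public static string ReadThroughAlias()
    {
        var reader1 = new StringReader("header\nrow1\nrow2");
        StringReader reader2 = reader1;   // copies the reference, not the object

        reader2.ReadLine();               // consumes "header" through the alias

        // reader1 has advanced as well, because it is the same object.
        return reader1.ReadLine();
    }

    public static void Main()
    {
        Console.WriteLine(ReadThroughAlias()); // prints "row1"
    }
}
```

Assigning one reference-type variable to another never copies the object, which is why the header row is already consumed when the second variable is used.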

You need a separate CsvReader for csv1 and csv2. As you and @UserNam3 both note, you don't want csv2 to reference the same object as csv1 in memory. Both readers will still share the same StreamReader, so after reading the header with csv2 you need to reset the stream to the beginning and discard the StreamReader's buffered data.
using (StreamReader reader = new StreamReader(filePath, Encoding.UTF8))
using (CsvReader csv1 = new CsvReader(reader, csvConfig))
using (CsvReader csv2 = new CsvReader(reader, csvConfig))
{
    csv2.Read();
    csv2.ReadHeader();
    string[] headers = csv2.HeaderRecord;
    // Rewind the stream and drop the StreamReader's buffer so csv1 starts at the top
    reader.BaseStream.Position = 0;
    reader.DiscardBufferedData();
    if (headers[0].Replace("\0", "").ToUpper() == "TEST")
    {
        using (var dr1 = new CsvDataReader(csv1))
        {
            var dt1 = new DataTable();
            dt1.Load(dr1);
        }
    }
}
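Rewinding can be tried in isolation with a MemoryStream. This is a minimal standard-library sketch, not CsvHelper code: it reads the first line, moves the underlying stream back to position 0, calls StreamReader.DiscardBufferedData() so the reader does not keep serving data from its internal buffer, and reads the first line again. The names RewindDemo and ReadFirstLineTwice are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class RewindDemo
{
    // Reads the first line, rewinds the reader, and reads the first line again.
    public static (string First, string Again) ReadFirstLineTwice(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string first = reader.ReadLine();

            // Move the underlying stream back to the start, then drop the
            // reader's internal buffer so the next read uses the new position.
            reader.BaseStream.Seek(0, SeekOrigin.Begin);
            reader.DiscardBufferedData();

            string again = reader.ReadLine();
            return (first, again);
        }
    }

    public static void Main()
    {
        byte[] bytes = Encoding.UTF8.GetBytes("TEST,Name\n1,Alice\n2,Bob\n");
        var (first, again) = ReadFirstLineTwice(new MemoryStream(bytes));
        Console.WriteLine(first);  // prints "TEST,Name"
        Console.WriteLine(again);  // prints "TEST,Name"
    }
}
```

This only works on seekable streams such as FileStream or MemoryStream; for a non-seekable source the file would have to be opened a second time.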

Related

CsvHelper - Set the header row and data row

I have sample data that looks like this:
1 This is a random line in the file
2
3 SOURCE_ID|NAME|START_DATE|END_DATE|VALUE_1|VALUE_2
4
5 Another random line in the file
6
7
8
9
10 GILBER|FRED|2019-JAN-01|2019-JAN-31|ABC|DEF
11 ALEF|ABC|2019-FEB-01|2019-AUG-31|FBC|DGF
12 GILBER|FRED|2019-JAN-01|2019-JAN-31|ABC|TEF
13 FLBER|RED|2019-JUN-01|2019-JUL-31|AJC|DEH
14 GI|JOE|2020-APR-01|2020-DEC-31|GBC|DER
I am unable to save changes to the file, i.e., I can't manipulate or clean the original files before consumption. Any manipulation must be done on the fly in memory. But what if the files are large (e.g., I am currently testing with some files that have 5M+ records)?
I am using CsvHelper
I have already referred to the following threads for guidance:
CSVHelper to skip record before header
Better way to skip extraneous lines at the start?
How to read a header from a specific line with CsvHelper?
What I would like to do is:
Set row where header is = 3 (I will know where the header is)
Set row where data starts = 10 (I will know where the data starts from)
Load data into data table, to be displayed into datagridview
If I need to perform some stream manipulation before passing this to CsvHelper, do let me know if that's the missing piece (any assistance on how to achieve that in one block of code would be greatly appreciated).
So far I have come up with the below:
string filepath = Path.Combine(txtTst04_File_Location.Text, txtTst04_File_Name.Text);
using (var reader = new StreamReader(filepath))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    // skip rows to get the header
    for (int i = 0; i < 4; i++)
    {
        csv.Read();
    }
    csv.Configuration.Delimiter = "|"; // Set delimiter
    csv.Configuration.IgnoreBlankLines = false;
    csv.Configuration.HasHeaderRecord = true;
    // how do I set the row where the actual data starts?
    using (var dr = new CsvDataReader(csv))
    {
        var dt = new DataTable();
        dt.Load(dr);
        dgvTst04_View.DataSource = dt; // Set datagridview source to datatable
    }
}
I get the below result:
Do let me know if you would like me to expand on any point.
thanks!
EDIT:
I created a new linked post that tries to reach the same objective in a different way, but it runs into a new error:
Filestream and datagridview memory issue with CsvHelper
I can get it to work with ShouldSkipRecord. The only problem is it will fail if any of the random lines has a "|" delimiter in it.
using (var reader = new StreamReader(filepath))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    csv.Configuration.Delimiter = "|"; // Set delimiter
    csv.Configuration.ShouldSkipRecord = row => row.Length == 1;
    using (var dr = new CsvDataReader(csv))
    {
        var dt = new DataTable();
        dt.Load(dr);
        dgvTst04_View.DataSource = dt; // Set datagridview source to datatable
    }
}
If you know how many columns there are, you could set it to skip any rows that have fewer than that many columns.
csv.Configuration.ShouldSkipRecord = row => row.Length < 6;
I came up with another approach that allows you to skip the lines to the header and then to the records.
using (var reader = new StreamReader(filepath))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    csv.Configuration.Delimiter = "|"; // Set delimiter
    csv.Configuration.IgnoreBlankLines = false;
    // skip to header
    for (int i = 0; i < 3; i++)
    {
        csv.Read();
    }
    csv.ReadHeader();
    var headers = csv.Context.HeaderRecord;
    // skip to records
    for (int i = 0; i < 6; i++)
    {
        csv.Read();
    }
    var dt = new DataTable();
    foreach (var header in headers)
    {
        dt.Columns.Add(header);
    }
    while (csv.Read())
    {
        var row = dt.NewRow();
        for (int i = 0; i < headers.Length; i++)
        {
            row[i] = csv.GetField(i);
        }
        dt.Rows.Add(row);
    }
}
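The same "header on line N, data from line M" idea can also be sketched without CsvHelper, by counting lines while streaming the file. This is only an illustration under stated assumptions: fields are split naively on the delimiter (no quoted fields), the header line comes before the first data line, and data rows have no more fields than the header. OffsetLoader and Load are hypothetical names.

```csharp
using System;
using System.Data;
using System.IO;

public static class OffsetLoader
{
    // headerLine and dataStartLine are 1-based line numbers.
    public static DataTable Load(TextReader reader, int headerLine, int dataStartLine, char delimiter)
    {
        var table = new DataTable();
        string line;
        int lineNumber = 0;

        // Read one line at a time so the raw file is never held in memory as a whole.
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            if (lineNumber == headerLine)
            {
                foreach (string name in line.Split(delimiter))
                {
                    table.Columns.Add(name);
                }
            }
            else if (lineNumber >= dataStartLine && line.Length > 0)
            {
                // Naive split: assumes no field contains the delimiter.
                object[] fields = line.Split(delimiter);
                table.Rows.Add(fields);
            }
        }
        return table;
    }

    public static void Main()
    {
        string text = "junk\n\nA|B\n\nmore junk\n1|2\n3|4\n";
        DataTable dt = Load(new StringReader(text), 3, 6, '|');
        Console.WriteLine(dt.Columns.Count); // prints 2
        Console.WriteLine(dt.Rows.Count);    // prints 2
    }
}
```

For a real file, a StreamReader can be passed in place of the StringReader. The DataTable itself still holds every row, so memory use grows with the number of records either way.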

How to merge multiple CSV files into one with newline after each dataset

I wrote a method that creates a list of strings. The strings hold accounting data.
When I click a button, a new .csv file is created.
It looks like this:
As you can see, there is no carriage return/line feed at the end of the line.
I would like to combine all of these .csv files into one, with each dataset on its own row.
I tried that manually with the simple cmd command copy *.csv allcsv.csv, but the datasets are all appended to the first row instead of being added as new rows:
What do I need to add or change in my code to include the newline character at the end of each row?
What is the easiest way to include the cmd copy command in my method?
private void BuchungssatzBilden(object obj)
{
    //Lieferschein-Buchungswerte in Liste speichern
    List<string> bs = new List<string>();
    bs.Add(SelItem.Umsatz.ToString());
    bs.Add(SelItem.Gegenkonto);
    bs.Add(SelItem.Beleg);
    bs.Add(SelItem.Buchungsdatum);
    bs.Add(SelItem.Konto);
    bs.Add(SelItem.Kost1);
    bs.Add(SelItem.Kost2);
    bs.Add(SelItem.Text);
    using (var stream = new MemoryStream())
    using (var reader = new StreamReader(stream))
    using (var sr = new StreamWriter(@"C:\" + SelItem.Beleg + SelItem.Text + SelItem.Hv + ".csv", true, Encoding.UTF8))
    {
        using (var csv = new CsvWriter(sr, System.Globalization.CultureInfo.CurrentCulture))
        {
            //csv.Configuration.Delimiter = ";";
            //csv.Configuration.HasHeaderRecord = true;
            foreach (var s in bs)
            {
                csv.WriteField(s);
            }
            csv.Flush();
            stream.Position = 0;
            reader.ReadToEnd();
        }
    }
    MessageBox.Show("CSV erfolgreich erstellt!");
}
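One possible direction, as a sketch only: CsvHelper's CsvWriter.NextRecord() ends the current record with a line break, so calling it after the WriteField loop should give each file a terminated row. For the merge step, the cmd copy command could be replaced by a few lines of standard-library C#. The names below (CsvMerger, Merge, MergeFiles) are made up for illustration; the sketch simply strips trailing line breaks from each file's content and writes each one followed by exactly one CRLF.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvMerger
{
    // Joins CSV fragments so that every fragment ends with exactly one CRLF.
    public static string Merge(IEnumerable<string> fragments)
    {
        var sb = new StringBuilder();
        foreach (string fragment in fragments)
        {
            string trimmed = fragment.TrimEnd('\r', '\n');
            if (trimmed.Length == 0) continue; // skip empty files
            sb.Append(trimmed).Append("\r\n");
        }
        return sb.ToString();
    }

    // Merges all *.csv files of a directory into outputPath.
    public static void MergeFiles(string directory, string outputPath)
    {
        string fullOutput = Path.GetFullPath(outputPath);
        var fragments = new List<string>();
        foreach (string path in Directory.GetFiles(directory, "*.csv"))
        {
            // Do not read the output file if it lives in the same directory.
            if (string.Equals(Path.GetFullPath(path), fullOutput, StringComparison.OrdinalIgnoreCase))
                continue;
            fragments.Add(File.ReadAllText(path, Encoding.UTF8));
        }
        File.WriteAllText(fullOutput, Merge(fragments), Encoding.UTF8);
    }

    public static void Main()
    {
        Console.Write(Merge(new[] { "100;4400;RE1", "200;4400;RE2\r\n" }));
    }
}
```

MergeFiles could be called at the end of the button handler, or once after all individual files have been written.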

CSVHelper BadDataFound in a valid csv

Our customer started reporting bugs when importing data from a CSV file. After seeing the CSV file, we decided to switch from our custom CSV parser to CsvHelper, but CsvHelper can't read some valid CSV files.
Users can load any CSV file into our application, so we can't use a class map. We use csv.Parser.Read to read the data rows as string[]. We can't change how this CSV file is generated: it is produced by another company, and we can't convince them to change the generation while the file is in a valid format.
If we use the BadDataFound handler, context.RawRecord is:
"1000084;SMRSTOVACI TRUBICE PBF 12,7/6,4 (1/2\") H;"
the data row in csv file is:
1000084;SMRSTOVACI TRUBICE PBF 12,7/6,4 (1/2") H;;;ks;21,59;26,46;21.00;;;8591735015183;8591735015183;Technik;Kabelový spojovací materiál;Označování, smršťování, izolace;Bužírky, smršťovačky;
This should be a valid csv file by RFC 4180.
The code is:
using (var reader = new StreamReader(filePath, Encoding.Default))
{
    using (var csv = new CsvReader(reader))
    {
        csv.Read();
        csv.ReadHeader();
        List<string> badRecord = new List<string>();
        csv.Configuration.BadDataFound = context => badRecord.Add(context.RawRecord);
        header = csv.Context.HeaderRecord.ToList();
        while (true)
        {
            var dataRow = csv.Parser.Read();
            if (dataRow == null)
            {
                break;
            }
            data.Add(dataRow);
        }
    }
}
Can you help me configure CsvHelper so that it can load this row into a string[]? Or can you suggest a different parser that is able to do that?
Thank you
I believe it is the quote in the middle of the row that is causing the issue. Try setting the configuration to ignore quotes.
using (var reader = new StreamReader(filePath, Encoding.Default))
{
    using (var csv = new CsvReader(reader))
    {
        csv.Configuration.Delimiter = ";";
        csv.Configuration.IgnoreQuotes = true;
        csv.Read();
        csv.ReadHeader();
        List<string> badRecord = new List<string>();
        csv.Configuration.BadDataFound = context => badRecord.Add(context.RawRecord);
        header = csv.Context.HeaderRecord.ToList();
        while (true)
        {
            var dataRow = csv.Parser.Read();
            if (dataRow == null)
            {
                break;
            }
            data.Add(dataRow);
        }
    }
}
Updated for version 27.2.1
using (var reader = new StreamReader(filePath, Encoding.Default))
{
    List<string> badRecord = new List<string>();
    var config = new CsvConfiguration(CultureInfo.InvariantCulture)
    {
        Delimiter = ";",
        Mode = CsvMode.NoEscape,
        BadDataFound = context => badRecord.Add(context.RawRecord)
    };
    using (var csv = new CsvReader(reader, config))
    {
        csv.Read();
        csv.ReadHeader();
        header = csv.Context.Reader.HeaderRecord.ToList();
        while (csv.Parser.Read())
        {
            data.Add(csv.Parser.Record);
        }
    }
}
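When quotes are treated as ordinary characters, as the answer above suggests, reading a row amounts to splitting it on the delimiter. The following standard-library sketch (not CsvHelper code; NoEscapeSketch and SplitRecord are hypothetical names) shows the effect on a shortened version of the problem row.

```csharp
using System;

public static class NoEscapeSketch
{
    // Splits a record on the delimiter and treats quotes as ordinary characters.
    public static string[] SplitRecord(string line, char delimiter)
    {
        return line.Split(delimiter);
    }

    public static void Main()
    {
        // Shortened version of the row from the question; the inch mark stays in the field.
        string line = "1000084;SMRSTOVACI TRUBICE PBF 12,7/6,4 (1/2\") H;;;ks";
        string[] fields = SplitRecord(line, ';');
        Console.WriteLine(fields.Length); // prints 5
        Console.WriteLine(fields[1]);     // prints SMRSTOVACI TRUBICE PBF 12,7/6,4 (1/2") H
    }
}
```

The trade-off is that a field can then no longer contain the delimiter or a line break, because nothing marks where a quoted field begins or ends.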

My code won't read my CSV file into my object list

Either my CSV file is not being read properly or I'm adding my objects to the list incorrectly, but no values are being added to my list.
I've tried using different paths to my CSV file and different ways of reading a CSV file, but nothing has worked.
void TheDex()
{
    List<Class1> Pokedex = new List<Class1>();
    TextFieldParser parser = new TextFieldParser("pokemon.csv");
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    while (!parser.EndOfData)
    {
        string row = parser.ReadLine();
        String[] Columns = row.Split(Convert.ToChar(","));
        Class1 Dex = new Class1();
        Dex.DexNumber = Columns[0];
        Dex.Name = Columns[1];
        Dex.Type1 = Columns[2];
        Dex.Type2 = Columns[3];
        Dex.Total = Columns[4];
        Dex.HP = Columns[5];
        Dex.ATK = Columns[6];
        Dex.DEf = Columns[7];
        Dex.SpAtk = Columns[8];
        Dex.SpDef = Columns[9];
        Dex.Spd = Columns[10];
        Dex.Generation = Columns[11];
        Dex.Legendary = Columns[12];
        Pokedex.Add(Dex);
    }
    parser.Close();
}
I want the list Pokedex to contain my objects that hold the data from the CSV but so far the Pokedex list stays empty.
You are using the TextFieldParser class incorrectly.
After you have initialized the parser you can use the ReadFields method to get all the fields. It will do this using the delimiters you have specified:
List<Class1> Pokedex = new List<Class1>();
string[] delimiters = { "," };
using (TextFieldParser parser = FileSystem.OpenTextFieldParser("Names.txt", delimiters))
{
    while (!parser.EndOfData)
    {
        string[] fields = parser.ReadFields();
        // populate your class here
        Class1 Dex = new Class1();
        Dex.DexNumber = fields[0];
        Dex.Name = fields[1];
        // add the other fields here
        Pokedex.Add(Dex);
    }
}
There are lots of examples of the correct way to use TextFieldParser on the internet. I found this one that shows the example above.
And by wrapping the TextFieldParser in a using statement, you don't need to close the parser once you have finished, as the using statement will handle that for you.
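A small self-contained sketch of ReadFields, reading from a StringReader instead of a file so it can be run as-is. FieldParserDemo and Parse are hypothetical names, and the sample rows are invented; the second row contains a quoted field with a comma, which ReadFields keeps together whereas a plain Split(',') would break it apart.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class FieldParserDemo
{
    // Parses comma-delimited text into one string[] per record.
    public static List<string[]> Parse(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            while (!parser.EndOfData)
            {
                // ReadFields splits the current line and advances to the next one.
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    public static void Main()
    {
        var rows = Parse("1,Bulbasaur,Grass\n2,\"Ivysaur, shiny\",Grass\n");
        Console.WriteLine(rows.Count);      // prints 2
        Console.WriteLine(rows[1][1]);      // prints Ivysaur, shiny
        Console.WriteLine(rows[1].Length);  // prints 3
    }
}
```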

Reading only headers from csv

I am trying to read only the headers from a CSV file using CsvHelper, but I can't find the GetFieldHeaders() method.
I have taken the code from this link: Source
public static String[] GetHeaders(string filePath)
{
    using (CsvReader csv = new CsvReader(new StreamReader("data.csv")))
    {
        int fieldCount = csv.FieldCount;
        string[] headers = csv.GetFieldHeaders(); // Error: doesn't contain a definition
    }
}
But GetFieldHeaders is not working.
Note: I only want to read headers from csv file
Update : Headers in my csv files are like below :
Id,Address,Name,Rank,Degree,Fahrenheit,Celcius,Location,Type,Stats
So can anybody tell me what I am missing?
Please try below code ... hope this will help you.
var csv = new CsvReader(new StreamReader("YOUR FILE PATH"));
csv.ReadHeader();
var headers = csv.Parser.RawRecord;
Note: headers contains all the headers in a single string, so you will need to split it on each comma to get each header separately.
I have not used this library myself, but a quick look at the documentation suggests this possible solution:
public static String[] GetHeaders(string filePath)
{
    using (CsvReader csv = new CsvReader(new StreamReader("data.csv")))
    {
        csv.Configuration.HasHeaderRecord = true;
        int fieldCount = csv.FieldCount;
        string[] headers = csv.GetFieldHeaders();
    }
}
* See documentation and search for header.
You can try this instead:
public IEnumerable<string> ReadHeaders(string path)
{
    using (var reader = new StreamReader(path))
    {
        var csv = new CsvReader(reader);
        if (csv.Read())
            return csv.FieldHeaders;
    }
    return null;
}
All of the methods in the other answers seem to have been removed in the latest version?
This is what worked for me:
using (var fileReader = File.OpenText("data.csv"))
{
    var csv = new CsvReader(fileReader);
    csv.Configuration.HasHeaderRecord = true;
    csv.Read();
    csv.ReadHeader();
    string[] headers = ((CsvFieldReader)((CsvParser)csv.Parser).FieldReader).Context.HeaderRecord;
}
