CSV to C# FileHelpers - c#

I want to use FileHelpers to read an extremely basic CSV file into C#.
I have a Model that looks like this;
[DelimitedRecord(",")]
[IgnoreFirst()]
public class NavigationButton
{
public int ID;
public string Text;
public string Path;
}
The CSV file looks like this;
I want to be able to read the appropriate lines and create a new NavigationButton for each record read in from the CSV file. I have read them into a DataTable using;
public DataTable GetNavigationButtonNames()
{
var filename = #"C:\Desktop\NavigationButtons.csv";
var engine = new FileHelperEngine(typeof(NavigationButton));
return engine.ReadFileAsDT(filename);
}
but I now cannot loop through the DataTable as it doesn't implement IEnumerable. I would have created a new NavigationButton in a foreach loop and added in the appropriate rows, however this cannot be done the way I have started out.
How can I change this method so that I can loop through the object I read into from the CSV file and create a new button for each row in the CSV file?

Use the generic version of FileHelperEngine<T> instead, then you can do:
var filename = #"C:\Desktop\NavigationButtons.csv";
var engine = new FileHelperEngine<NavigationButton>();
// ReadFile returns an array of NavigationButton
var records = engine.ReadFile(filename);
// then you can do your foreach or just get the button names with LINQ
return records.Select(x => x.Text);
The documentation for ReadFile() is here.
Or there is also ReadFileAsList() if you prefer.

How about this:
List<NavigationButton> buttons = new List<NavigationButton>();
DataTable dt = GetNavigationButtonNames();
foreach (DataRow dr in dt.Rows)
{
buttons.Add(new NavigationButton
{
ID = int.Parse(dr["id"]),
Text = dr["Text"].ToString(),
Path = dr["Path"].ToString() });
});
}

Related

Is there a way to filter a CSV file for data validation without for loops. (Lumenworks CSVReader)

I want to be able to filter out a CSV file and perform data validation on the filtered data. I imagine for loops, but the file has 2 million cells and it would take a long time. I am using Lumenworks CSVReader for accessing the file using C#.
I found this method csvfile.Where<> but I have no idea what to put in the parameters. Sorry I am still new to coding as well.
[EDIT] This is my code for loading the file. Thanks for all the help!
//Creating C# table from CSV data
var csvTable = new DataTable();
var csvReader = new CsvReader(newStreamReader(System.IO.File.OpenRead(filePath[0])), true);
csvTable.Load(csvReader);
//grabs header from the CSV data table
string[] headers = csvReader.GetFieldHeaders(); //this method gets the headers of the CSV file
string filteredData[] = csvReader.Where // this is where I would want to implement the where method, or some sort of way to filter the data
//I can access the rows and columns with this
csvTable.Rows[0][0]
csvTable.Columns[0][0]
//After filtering (maybe even multiple filters) I want to add up all the filtered data (assuming they are integers)
var dataToValidate = 0;
foreach var data in filteredData{
dataToValidate += data;
}
if (dataToValidate == 123)
//data is validated
I would read some of the documentation for the package you are using:
https://github.com/phatcher/CsvReader
https://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader
To specifically answer the filtering question, so it only contains the data you are searching for consider the following:
var filteredData = new List<List<string>>();
using (CsvReader csv = new CsvReader(new StreamReader(System.IO.File.OpenRead(filePath[0])), true));
{
string searchTerm = "foo";
while (csv.ReadNextRecord())
{
var row = new List<string>();
for (int i = 0; i < csv.FieldCount; i++)
{
if (csv[i].Contains(searchTerm))
{
row.Add(csv[i]);
}
}
filteredData.Add(row);
}
}
This will give you a list of a list of string that you can enumerate over to do your validation
int dataToValidate = 0;
foreach (var row in filteredData)
{
foreach (var data in row)
{
// do the thing
}
}
--- Old Answer ---
Without seeing the code you are using to load the file, it might be a bit difficult to give you a full answer, ~2 Million cells may be slow no matter what what.
Your .Where comes from System.Linq
https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.where?view=net-6.0
A simple example using .Where
//Read the file and return a list of strings that match the where clause
public List<string> ReadCSV()
{
List<string> data = File.ReadLines(#"C:\Users\Public\Documents\test.csv");
.Select(line => line.Split(','))
// token[x] where x is the column number, assumes ID is column 0
.Select(tokens => new CsvFileStructure { Id = tokens[0], Value = tokens[1] })
// Where filters based on whatever you are looking for in the CSV
.Where(csvFileStructure => csvFileStructure.Id == "1")
.ToList();
return data;
}
// Map of your data structure
public class CsvFileStructure
{
public long Id { get; set; }
public string Name { get; set; }
public string Value { get; set; }
}
Modified from this answer:
https://stackoverflow.com/a/10332737/7366061
There is no csvreader.Where method. The "where" is part of Linq in C#. The link below shows an example of computing columns in a csv file using Linq:
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/how-to-compute-column-values-in-a-csv-text-file-linq

C# - check which element in a csv is not in an other csv and then write the elements to another csv

My task is to check which of the elements of a column in one csv are not included in the elements of a column in the other csv. There is a country column in both csv and the task is to check which countries are not in the secong csv but are in the first csv.
I guess I have to solve it with Lists after I read the strings from the two csv. But I dont know how to check which items in the first list are not in the other list and then put it to a third list.
There are many way to achieve this, for many real world CSV applications it is helpful to read the CSV input into a typed in-memory store there are standard libraries that can assist with this like CsvHelper as explained in this canonical post: Parsing CSV files in C#, with header
However for this simple requirement we only need to parse the values for Country form the master list, in this case the second csv. We don't need to manage, validate or parse any of the other fields in the CSVs
Build a list of unique Country values from the second csv
Iterate the first csv
Get the Country value
Check against the list of countries from the second csv
Write to the third csv if the country was not found
You can test the following code on .NET Fiddle
NOTE: this code uses StringWriter and StringReader as their interfaces are the same as the file reader and writers in the System.IO namespace. but we can remove the complexity associated with file access for this simple requirement
string inputcsv = #"Id,Field1,Field2,Country,Field3
1,one,two,Australia,three
2,one,two,New Zealand,three
3,one,two,Indonesia,three
4,one,two,China,three
5,one,two,Japan,three";
string masterCsv = #"Field1,Country,Field2
one,Indonesia,...
one,China,...
one,Japan,...";
string errorCsv = "";
// For all in inputCsv where the country value is not listed in the masterCsv
// Write to errorCsv
// Step 1: Build a list of unique Country values
bool csvHasHeader = true;
int countryIndexInMaster = 1;
char delimiter = ',';
List<string> countries = new List<string>();
using (var masterReader = new System.IO.StringReader(masterCsv))
{
string line = null;
if (csvHasHeader)
{
line = masterReader.ReadLine();
// an example of how to find the column index from first principals
if(line != null)
countryIndexInMaster = line.Split(delimiter).ToList().FindIndex(x => x.Trim('"').Equals("Country", StringComparison.OrdinalIgnoreCase));
}
while ((line = masterReader.ReadLine()) != null)
{
string country = line.Split(delimiter)[countryIndexInMaster].Trim('"');
if (!countries.Contains(country))
countries.Add(country);
}
}
// Read the input CSV, if the country is not in the master list "countries", write it to the errorCsv
int countryIndexInInput = 3;
csvHasHeader = true;
var outputStringBuilder = new System.Text.StringBuilder();
using (var outputWriter = new System.IO.StringWriter(outputStringBuilder))
using (var inputReader = new System.IO.StringReader(inputcsv))
{
string line = null;
if (csvHasHeader)
{
line = inputReader.ReadLine();
if (line != null)
{
countryIndexInInput = line.Split(delimiter).ToList().FindIndex(x => x.Trim('"').Equals("Country", StringComparison.OrdinalIgnoreCase));
outputWriter.WriteLine(line);
}
}
while ((line = inputReader.ReadLine()) != null)
{
string country = line.Split(delimiter)[countryIndexInInput].Trim('"');
if(!countries.Contains(country))
{
outputWriter.WriteLine(line);
}
}
outputWriter.Flush();
errorCsv = outputWriter.ToString();
}
// dump output to the console
Console.WriteLine(errorCsv);
Since you write about solving it with lists, I assume you can load those values from the CSV to the lists, so let's start with:
List<string> countriesIn1st = LoadDataFrom1stCsv();
List<string> countriesIn2nd = LoadDataFrom2ndCsv();
Then you can easily solve it with linq:
List<string> countriesNotIn2nd = countriesIn1st.Where(country => !countriesIn2nd.Contains(country)).ToList();
Now you have your third list with countries that are in first, but not in the second list. You can save it.

C# - LinQ - Read text files, group by first column and order by last column?

hi have a text files that contains 3 columns something like this:
contract1;pdf1;63
contract1;pdf2;5
contract1;pdf3;2
contract1;pdf4;00
contract2;pdf1;2
contract2;pdf2;30
contract2;pdf3;5
contract2;pdf4;80
now, i want to write those information into another text files ,and the output will be order put for first the records with the last column in "2,5", something like this:
contract1;pdf3;2
contract1;pdf2;5
contract1;pdf1;63
contract1;pdf4;00
contract2;pdf1;2
contract2;pdf3;5
contract2;pdf2;30
contract2;pdf4;80
how can i do?
thanks
You can use LINQ to group and sort the lines after reading, then put them back together:
var output = File.ReadAllLines(#"path-to-file")
.Select(s => s.Split(';'))
.GroupBy(s => s[0])
.SelectMany(sg => sg.OrderBy(s => s[2] == "2" ? "-" : s[2] == "5" ? "+" : s[2]).Select(sg => String.Join(";", sg)));
Then just write them to a file.
I'm not going to write your program for you, but I would recommend this library for reading and writing delimited files:
https://joshclose.github.io/CsvHelper/getting-started/
When you new up the reader make sure to specify your semi-colon delimiter:
using (var reader = new StreamReader("path\\to\\input_file.csv"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
csv.Configuration.Delimiter = ";";
var records = csv.GetRecords<Row>();
// manipulate the data as needed here
}
Your "Row" class (choose a more appropriate name for clarity) will specify the schema of the flat file. It sounds like you don't have headers? If not, you can specify the Order of each item.
public class Row
{
[Index(1)]
public string MyValue1 { get; set; }
[Index(2)]
public string MyValue2 { get; set; }
}
After reading the data in, you can manipulate it as needed. If the output format is different from the input format, you should convert the input class into an output class. You can use the Automapper library if you would like. However, for a simple project I would suggest to just manually convert the input class into the output class.
Lastly, write the data back out:
using (var writer = new StreamWriter("path\\to\\output_file.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.WriteRecords(records);
}

Read all values from CSV into a List using CsvHelper

So I've been reading that I shouldn't write my own CSV reader/writer, so I've been trying to use the CsvHelper library installed via nuget. The CSV file is a grey scale image, with the number of rows being the image height and the number columns the width. I would like to read the values row-wise into a single List<string> or List<byte>.
The code I have so far is:
using CsvHelper;
public static List<string> ReadInCSV(string absolutePath)
{
IEnumerable<string> allValues;
using (TextReader fileReader = File.OpenText(absolutePath))
{
var csv = new CsvReader(fileReader);
csv.Configuration.HasHeaderRecord = false;
allValues = csv.GetRecords<string>
}
return allValues.ToList<string>();
}
But allValues.ToList<string>() is throwing a:
CsvConfigurationException was unhandled by user code
An exception of type 'CsvHelper.Configuration.CsvConfigurationException' occurred in CsvHelper.dll but was not handled in user code
Additional information: Types that inherit IEnumerable cannot be auto mapped. Did you accidentally call GetRecord or WriteRecord which acts on a single record instead of calling GetRecords or WriteRecords which acts on a list of records?
GetRecords is probably expecting my own custom class, but I'm just wanting the values as some primitive type or string. Also, I suspect the entire row is being converted to a single string, instead of each value being a separate string.
According to #Marc L's post you can try this:
public static List<string> ReadInCSV(string absolutePath) {
List<string> result = new List<string>();
string value;
using (TextReader fileReader = File.OpenText(absolutePath)) {
var csv = new CsvReader(fileReader);
csv.Configuration.HasHeaderRecord = false;
while (csv.Read()) {
for(int i=0; csv.TryGetField<string>(i, out value); i++) {
result.Add(value);
}
}
}
return result;
}
If all you need is the string values for each row in an array, you could use the parser directly.
var parser = new CsvParser( textReader );
while( true )
{
string[] row = parser.Read();
if( row == null )
{
break;
}
}
http://joshclose.github.io/CsvHelper/#reading-parsing
Update
Version 3 has support for reading and writing IEnumerable properties.
The whole point here is to read all lines of CSV and deserialize it to a collection of objects. I'm not sure why do you want to read it as a collection of strings. Generic ReadAll() would probably work the best for you in that case as stated before. This library shines when you use it for that purpose:
using System.Linq;
...
using (var reader = new StreamReader(path))
using (var csv = new CsvReader(reader))
{
var yourList = csv.GetRecords<YourClass>().ToList();
}
If you don't use ToList() - it will return a single record at a time (for better performance), please read https://joshclose.github.io/CsvHelper/examples/reading/enumerate-class-records
Please try this. This had worked for me.
TextReader reader = File.OpenText(filePath);
CsvReader csvFile = new CsvReader(reader);
csvFile.Configuration.HasHeaderRecord = true;
csvFile.Read();
var records = csvFile.GetRecords<Server>().ToList();
Server is an entity class. This is how I created.
public class Server
{
private string details_Table0_ProductName;
public string Details_Table0_ProductName
{
get
{
return details_Table0_ProductName;
}
set
{
this.details_Table0_ProductName = value;
}
}
private string details_Table0_Version;
public string Details_Table0_Version
{
get
{
return details_Table0_Version;
}
set
{
this.details_Table0_Version = value;
}
}
}
You are close. It isn't that it's trying to convert the row to a string. CsvHelper tries to map each field in the row to the properties on the type you give it, using names given in a header row. Further, it doesn't understand how to do this with IEnumerable types (which string implements) so it just throws when it's auto-mapping gets to that point in testing the type.
That is a whole lot of complication for what you're doing. If your file format is sufficiently simple, which yours appear to be--well known field format, neither escaped nor quoted delimiters--I see no reason why you need to take on the overhead of importing a library. You should be able to enumerate the values as needed with System.IO.File.ReadLines() and String.Split().
//pseudo-code...you don't need CsvHelper for this
IEnumerable<string> GetFields(string filepath)
{
foreach(string row in File.ReadLines(filepath))
{
foreach(string field in row.Split(',')) yield return field;
}
}
static void WriteCsvFile(string filename, IEnumerable<Person> people)
{
StreamWriter textWriter = File.CreateText(filename);
var csvWriter = new CsvWriter(textWriter, System.Globalization.CultureInfo.CurrentCulture);
csvWriter.WriteRecords(people);
textWriter.Close();
}

C# Linq to CSV Dynamic Object runtime column name

I'm new to using Dynamic Objects in C#. I am reading a CSV file very similarly to the code found here: http://my.safaribooksonline.com/book/programming/csharp/9780321637208/csharp-4dot0-features/ch08lev1sec3
I can reference the data I need with a static name, however I can not find the correct syntax to reference using a dynamic name at run time.
For example I have:
var records = from r in myDynamicClass.Records select r;
foreach(dynamic rec in records)
{
Console.WriteLine(rec.SomeColumn);
}
And this works fine if you know the "SomeColumn" name. I would prefer to have a column name a a string and be able to make the same type refrence at run time.
Since one has to create the class which inherits from DynamicObject, simply add an indexer to the class to achieve one's result via strings.
The following example uses the same properties found in the book example, the properties which holds the individual line data that has the column names. Below is the indexer on that class to achieve the result:
public class myDynamicClassDataLine : System.Dynamic.DynamicObject
{
string[] _lineContent; // Actual line data
List<string> _headers; // Associated headers (properties)
public string this[string indexer]
{
get
{
string result = string.Empty;
int index = _headers.IndexOf(indexer);
if (index >= 0 && index < _lineContent.Length)
result = _lineContent[index];
return result;
}
}
}
Then access the data such as
var csv =
#",,SomeColumn,,,
ab,cd,ef,,,"; // Ef is the "SomeColumn"
var data = new myDynamicClass(csv); // This holds multiple myDynamicClassDataLine items
Console.WriteLine (data.OfType<dynamic>().First()["SomeColumn"]); // "ef" is the output.
You will need to use reflection. To get the names you would use:
List<string> columnNames = new List<string>(records.GetType().GetProperties().Select(i => i.Name));
You can then loop through your results and output the values for each column like so:
foreach(dynamic rec in records)
{
foreach (string prop in columnNames)
Console.Write(rec.GetType().GetProperty (prop).GetValue (rec, null));
}
Try this
string column = "SomeColumn";
var result = rec.GetType().GetProperty (column).GetValue (rec, null);

Categories