Trouble with parsing CSV files in C# - c#

I'm trying to import a CSV file into my C# site and save it in the database. While doing research I learned about CSV parsing; I've tried to implement it, but I've run into some trouble. Here is a portion of my code so far:
string fileext = Path.GetExtension(fupcsv.PostedFile.FileName);
if (fileext == ".csv")
{
    string csvPath = Server.MapPath("~/CSVFiles/") + Path.GetFileName(fupcsv.PostedFile.FileName);
    fupcsv.SaveAs(csvPath);
    // Add Columns to Datatable to bind data
    DataTable dtCSV = new DataTable();
    dtCSV.Columns.AddRange(new DataColumn[2] { new DataColumn("ModuleId", typeof(int)), new DataColumn("CourseId", typeof(int)) });
    // Read all the lines of the text file and close it.
    string[] csvData = File.ReadAllLines(csvPath);
    // iterate over each row
    foreach (string row in csvData)
    {
        // Check for null or empty row record
        if (!string.IsNullOrEmpty(row))
        {
            using (TextFieldParser parser = new TextFieldParser(csvPath))
            {
                parser.TextFieldType = FieldType.Delimited;
                parser.SetDelimiters(",");
                while (!parser.EndOfData)
                {
                    // Process row
                    string[] fields = parser.ReadFields();
                    int i = 1;
                    foreach (char cell in row)
                    {
                        dtCSV.NewRow()[i] = cell;
                        i++;
                    }
                }
            }
        }
    }
}
I keep getting the error "There is no row at position -1" at " dtCSV.Rows[dtCSV.Rows.Count - 1][i] = cell;"
Any help would be greatly appreciated, thanks

You are trying to index rows that you have not created. Instead of
dtCSV.Rows[dtCSV.Rows.Count - 1][i] = cell;
use
dtCSV.NewRow()[i] = cell;
I also suggest you start indexing i from 0 and not from 1.
All right, it turns out there were a bunch of errors in your code, so I made some edits.
string fileext = Path.GetExtension(fupcsv.PostedFile.FileName);
if (fileext == ".csv")
{
    string csvPath = Server.MapPath("~/CSVFiles/") + Path.GetFileName(fupcsv.PostedFile.FileName);
    fupcsv.SaveAs(csvPath);
    DataTable dtCSV = new DataTable();
    dtCSV.Columns.AddRange(new DataColumn[2] { new DataColumn("ModuleId", typeof(int)), new DataColumn("CourseId", typeof(int)) });
    var csvData = File.ReadAllLines(csvPath);
    bool headersSkipped = false;
    foreach (string line in csvData)
    {
        if (!headersSkipped)
        {
            headersSkipped = true;
            continue;
        }
        // Check for null or empty row record
        if (!string.IsNullOrEmpty(line))
        {
            // Process row
            int i = 0;
            var row = dtCSV.NewRow();
            foreach (var cell in line.Split(','))
            {
                row[i] = Int32.Parse(cell);
                i++;
            }
            dtCSV.Rows.Add(row);
            dtCSV.AcceptChanges();
        }
    }
}
I ditched the TextFieldParser solution solely because I'm not familiar with it, but if you want to stick with it, it shouldn't be hard to reintegrate it.
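If you do want to keep TextFieldParser, it could be reintegrated roughly like this (a sketch, not tested against your data; it assumes the same two int columns and header row as the code above):

```csharp
// Sketch: the same import loop, but parsed with TextFieldParser instead of
// File.ReadAllLines + Split. TextFieldParser handles quoted fields and
// embedded commas, which a plain Split(',') does not.
using (TextFieldParser parser = new TextFieldParser(csvPath))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    bool headersSkipped = false;
    while (!parser.EndOfData)
    {
        string[] fields = parser.ReadFields();
        if (!headersSkipped) { headersSkipped = true; continue; }
        var row = dtCSV.NewRow();
        for (int i = 0; i < fields.Length; i++)
        {
            row[i] = int.Parse(fields[i]);
        }
        dtCSV.Rows.Add(row);
    }
}
```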
Here are some of the things you got wrong:
Not calling NewRow() to create a new row, or adding it to the table with Rows.Add(row)
Iterating through the characters in row instead of the fields you parsed
Not parsing the value of cell: its type is string and you are trying to add it to an int column
Some other things worth noting (just to improve your code's performance and readability :))
Consider using var when declaring new variables; it takes a lot of the stress away from having to worry about exactly what type of variable you are creating
As others in the comments said, use ReadAllLines(); it parses your text file into lines neatly, making it easier to iterate through
Most of the time when working with arrays or lists, you need to index from 0, not from 1
You have to use AcceptChanges() to commit all the changes you've made

Related

After import csv file to datagrid i need to add or insert new records to datagridview then export lastest data to csv file

After adding new data to the DataGridView, the imported data is no longer showing in the grid; only the new data is shown.
This is my code; I don't know where I am going wrong.
-- Here I need to add new data and click Add to get a new row.
namespace Bind_DataGridView_Using_DataTable
{
    public partial class Bind_DataGridView_Using_DataTable : Form
    {
        public Bind_DataGridView_Using_DataTable()
        {
            InitializeComponent();
        }

        DataTable table = new DataTable();
        int selectedRow;

        private void Bind_DataGridView_Using_DataTable_Load(object sender, EventArgs e)
        {
            // Create headers for the DataTable
            table.Columns.Add("Id", typeof(int));
            table.Columns.Add("FirstName", typeof(string));
            table.Columns.Add("LastName", typeof(string));
            table.Columns.Add("Profession", typeof(string));
            dataGridView1.DataSource = table;
        }

        // Add to DataGridView
        bool found = false;

        private void BtnAdd_Click(object sender, EventArgs e)
        {
            if (!string.IsNullOrEmpty(IDTxt.Text))
            {
                if (dataGridView1.Rows.Count > 0)
                {
                    foreach (DataGridViewRow row in dataGridView1.Rows)
                    {
                        if (Convert.ToString(row.Cells[0].Value) == IDTxt.Text)
                        {
                            found = true;
                            MessageBox.Show("Person Id already Exist");
                        }
                    }
                    if (!found)
                    {
                        table.Rows.Add(IDTxt.Text, fisrtTxt.Text, SurNameTxt.Text, ProfesTxt.Text);
                        dataGridView1.DataSource = table;
                        cleatTxts();
                    }
                }
            }
            else if (string.IsNullOrEmpty(IDTxt.Text))
            {
                label1.Text = "Person Id should not be empty";
            }
        }

        private void ImportBtn_Click(object sender, EventArgs e)
        {
            try
            {
                OpenFileDialog dlg = new OpenFileDialog();
                if (DialogResult.OK == dlg.ShowDialog())
                {
                    string path = dlg.FileName;
                    BindData(path);
                    MessageBox.Show("Import Action Completed");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("The path does not exist", ex);
            }
        }

        private void BindData(string filePath)
        {
            DataTable table = new DataTable();
            string[] lines = System.IO.File.ReadAllLines(filePath);
            if (lines.Length > 0)
            {
                // First line creates the header
                string firstLine = lines[0];
                string[] headerLabels = firstLine.Split(',');
                foreach (string headerWord in headerLabels)
                {
                    table.Columns.Add(new DataColumn(headerWord));
                }
                // Data rows
                for (int i = 1; i < lines.Length; i++)
                {
                    string[] dataWords = lines[i].Split(',');
                    DataRow dr = table.NewRow();
                    int columnIndex = 0;
                    foreach (string headerWord in headerLabels)
                    {
                        dr[headerWord] = dataWords[columnIndex++];
                    }
                    table.Rows.Add(dr);
                }
                dataGridView1.DataSource = table;
            }
        }
    }
}
After adding new data, the imported data is gone; I want both the old and the new data in the grid.
This is what I need to do, but after adding, only the new data appears (output attached in the other picture).
As I commented, when the form loads, the code creates the “global” DataTable variable table. Inside the load event, the code creates the four columns and then sets the grid's DataSource to this table. The form is displayed and we can see a grid with one “empty” new row displayed.
Then you click the ImportBtn button, which fires its click event, where the user picks a file; this file path is then passed to the method BindData. Inside BindData …
The code “creates” a new LOCAL DataTable variable named table. Note, this is a different local table and NOT the table defined globally and set up in the load event. The code creates the columns, adds the rows, then sets this local table as the DataSource of the grid. When execution leaves the BindData method, the local variable table goes out of scope.
Therefore, after the imported data is displayed, we add some legal values to the text boxes and click the ADD button. In the BtnAdd_Click event, the code adds the row to the “global” variable table. However, this table is empty, because the import code used a different local table. So when you add the row and set the grid's DataSource to table… it will only have the one newly added row.
I will assume you do NOT want to create a new table variable in the BindData method. Instead we want to use the “global” DataTable variable table to add the imported rows to. This change is shown below. The other changes are that instead of referencing the new row cells by name, I used indexes. Also, a check is made to catch the case where the row read from the CSV file has fewer values than the data row has columns, and vice versa. The code simply uses the smaller of the two lengths to stay in bounds: totCols = Math.Min(dataRowCount, dataWords.Length);
private void BindData(string filePath) {
    string[] lines = System.IO.File.ReadAllLines(filePath);
    if (lines.Length > 0) {
        int totCols;
        int dataRowCount = table.Columns.Count;
        string[] dataWords;
        DataRow dr;
        for (int i = 1; i < lines.Length; i++) {
            dataWords = lines[i].Split(',');
            dr = table.NewRow();
            totCols = Math.Min(dataRowCount, dataWords.Length);
            for (int colIndex = 0; colIndex < totCols; colIndex++) {
                dr[colIndex] = dataWords[colIndex];
            }
            table.Rows.Add(dr);
        }
    }
}
Next, in the ADD button click event, since the grid is already bound to the variable table, after we add the row we simply call the table's AcceptChanges method and this should update the grid.
if (!found) {
    table.Rows.Add(IDTxt.Text, fisrtTxt.Text, SurNameTxt.Text, ProfesTxt.Text);
    table.AcceptChanges();
}
I hope this makes sense.
Edit update as per comment…
@Caius Jard … Thanks for the info. I switched the ReadAllLines to ReadLines and switched the type of lines to IEnumerable<string>. …
Before this change, I changed the code as you suggested with row.ItemArray = line.Split(','); and this failed. The error was that it did not like the string value when adding it to the first column, Id, which is an int column.
I copied the same code the OP used, so the Id column in the table is an int type. Initially I thought I must be using a string instead of an int; however, this is not the case… So this seems odd to me, since the first code I posted in my answer assigns the value in column zero (0) as…
dr[colIndex] = dataWords[colIndex];
Since dataWords is a string[] array, I was wondering why I did not get the same error. From a couple of tests, it appears there is a conversion going on behind the scenes when the data is loaded from the CSV file and also when the row is added to the table in the ADD event. In the add event, the code is simply adding the row from the text boxes…
table.Rows.Add(IDTxt.Text, fisrtTxt.Text, SurNameTxt.Text, ProfesTxt.Text);
Here IDTxt.Text is obviously a string; however, if the string is a valid int, it will get converted to an int when added. I would have thought this would throw the same error even when the text was a valid int.
This had me scratching my head, as in my past experience this would throw the grid's DataError. At least I thought it would. It appears it is OK and will do the behind-the-scenes conversion as long as the string is a valid int. I tested this in the BindData method with an Id of something like “a5.” When the table was filled, I got the format exception complaining about the bad int value.
So, in either case, checking for a valid int value would appear necessary to avoid getting these exceptions. In this case, when reading the file, the code below simply ignores rows with invalid Id values.
Also, if a line in the CSV has fewer values than the data row has columns, then that CSV row is ignored. If the line in the CSV has more values than the number of columns in the data row, then only the first n values are used. Where n is the number of columns in the data row.
In addition, when reading the CSV file, a check is made to see if there are “duplicate” Ids in the CSV file. If a duplicate Id is found in the CSV then, that row is ignored.
The int.TryParse method is used when reading the CSV file for converting the Id number from a string to an int. It is also used when adding a new row from the text boxes. If the Id text is not a valid int number, the row is not added.
The updated BindData method is below and renamed to AddCSVDataToGrid.
private void AddCSVDataToGrid(string filePath) {
    IEnumerable<string> lines = File.ReadLines(filePath);
    int tableColumnCount = table.Columns.Count;
    string[] dataValues;
    DataRow dr;
    foreach (string line in lines.Skip(1)) { // <- Skip the header row
        dataValues = line.Split(',');
        if (dataValues.Length >= tableColumnCount) {
            if (int.TryParse(dataValues[0], out int idValue)) { // <- check if Id is a valid int
                if ((table.Select("Id = " + idValue)).Length == 0) { // <- check if Id is duplicate
                    dr = table.NewRow();
                    dr[0] = idValue;
                    for (int colIndex = 1; colIndex < tableColumnCount; colIndex++) {
                        dr[colIndex] = dataValues[colIndex];
                    }
                    table.Rows.Add(dr);
                }
                else {
                    // MessageBox.Show("Duplicate Id value in CSV file ..." + dataValues[0] + " - Skipping duplicate Id row");
                }
            }
            else {
                // MessageBox.Show("Person Id in CSV is not an integer..." + dataValues[0] + " - Skipping invalid Id row");
            }
        }
        else {
            // MessageBox.Show("Missing data in CSV ..." + dataValues[0] + " - Skipping CSV row");
        }
    }
}
Next is the updated ADD button click event.
private void btnAdd_Click(object sender, EventArgs e) {
    string idText = IDTxt.Text.Trim();
    if (!string.IsNullOrEmpty(idText) && int.TryParse(idText, out int idValue)) {
        if ((table.Select("Id = " + idValue)).Length == 0) {
            table.Rows.Add(idValue, fisrtTxt.Text.Trim(), SurNameTxt.Text.Trim(), ProfesTxt.Text.Trim());
        }
        else {
            MessageBox.Show("Person Id already Exist");
        }
    }
    else {
        label1.Text = "Person Id should not be empty && must be a valid int value";
    }
}

[C#]TextFieldParser not behaving as expected

I am trying to read the csv you can download from here: https://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=planets . Just click on "Download Table" and select CSV, all columns, all rows.
The code has some problems:
How do I recognize the comments? I expected the class to simply skip them and not put them in the fields variable, but they are included.
Why is the number of columns wrong? There are 403, but it finds 405. According to pandas (Python 3) there are 403. In fact, when I try to use TextFieldParser for more complicated operations on this CSV I get index-out-of-range errors on the array (of course: the columns are 403, but it thinks they are 405).
Code:
private void loadData(string fileName) {
    int rows = 0;
    int columns = 0;
    using (TextFieldParser parser = new TextFieldParser(fileName, Encoding.UTF8))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.CommentTokens = new []{"#"};
        parser.TrimWhiteSpace = false;
        parser.HasFieldsEnclosedInQuotes = false;
        while (!parser.EndOfData)
        {
            // Process row
            string[] fields = parser.ReadFields();
            foreach (string field in fields)
            {
                // TODO: Process field
            }
            if (fields.Length == 0) {
                // Should be a comment
                printLine("Comment found on row " + rows);
            }
            if (fields.Length > columns)
                columns = fields.Length;
            rows++;
        }
        printLine("Rows: " + rows);
        printLine("Columns: " + columns);
        printLine("Errors on line: " + parser.ErrorLineNumber);
    }
}
To ignore the commented lines you need to change your parser.CommentTokens statement to use new string[] as below
parser.CommentTokens = new string []{"#"};
Once you change that, the comments will be ignored. There are 3 lines in the file that have a different number of columns than the 403 that all the others have.
I added the check below to flag lines where the number of fields is greater than 403 (lines 159, 3310, and 3311 have 404 and 405 columns/fields):
if (fields.Length > 403)
{
    Console.WriteLine($"Line:{lineNo} has {fields.Length}.");
}
With the above, at least you can do some kind of checking/cleanup on the lines that have more than the expected number of fields.
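The lineNo variable in the snippet above is not defined in the question's code; one way to get it (a sketch, fitted into the question's read loop) is the parser's own LineNumber property, read before ReadFields() advances it:

```csharp
while (!parser.EndOfData)
{
    long lineNo = parser.LineNumber;   // line number about to be read
    string[] fields = parser.ReadFields();
    if (fields.Length > 403)
    {
        Console.WriteLine($"Line:{lineNo} has {fields.Length}.");
    }
    rows++;
}
```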

C#: Reading a variable structured CSV file into a datatable with a row counter

I am trying to develop a tool that will take a CSV file and import it into a datatable with the first column in the datatable being a row counter.
The CSV files are from different customers and so have different structures. Some have a header line; some have several header lines; some have no header line. They also have varying numbers of columns.
So far, I have the code below.
public void Import_CSV()
{
    OpenFileDialog dialog = new OpenFileDialog();
    dialog.Filter = "CSV Files (*.csv)|*.csv";
    bool? result = dialog.ShowDialog();
    if (result ?? false)
    {
        string[] headers;
        string CSVFilePathName = dialog.FileName;
        string delimSelect = cboDelimiter.Items.GetItemAt(cboDelimiter.SelectedIndex).ToString();
        // If the user hasn't selected a delimiter, assume comma
        if (delimSelect == "")
        {
            delimSelect = ",";
        }
        string[] delimiterType = new string[] { cboDelimiter.Items.GetItemAt(cboDelimiter.SelectedIndex).ToString() };
        DataTable dt = new DataTable();
        // Read the first line of the file to get the number of fields and create columns and column numbers in the data table
        using (StreamReader sr1 = new StreamReader(CSVFilePathName))
        {
            headers = sr1.ReadLine().Split(delimiterType, StringSplitOptions.None);
            //dt.Columns.Add("ROW", typeof(int));
            //dt.Columns["ROW"].AutoIncrement = true;
            //dt.Columns["ROW"].AutoIncrementSeed = 1;
            //dt.Columns["ROW"].AutoIncrementStep = 1;
            int colCount = 1;
            foreach (string header in headers)
            {
                dt.Columns.Add("C" + colCount.ToString());
                colCount++;
            }
        }
        using (StreamReader sr = new StreamReader(CSVFilePathName))
        {
            while (!sr.EndOfStream)
            {
                string[] rows = sr.ReadLine().Split(delimiterType, StringSplitOptions.None);
                DataRow dr = dt.NewRow();
                for (int i = 0; i < headers.Length; i++)
                {
                    dr[i] = rows[i];
                }
                dt.Rows.Add(dr);
            }
        }
        dtGrid.ItemsSource = dt.DefaultView;
        txtColCount.Text = dtGrid.Columns.Count.ToString();
        txtRowCount.Text = dtGrid.Items.Count.ToString();
    }
}
This works, inasmuch as it creates column headers (C1, C2, ... according to how many there are in the CSV file) and the rows are then written in, but I want to add a column at the far left with a row number as the rows are added. In the code, you can see I've got a section commented out that creates an auto-number column, but I'm totally stuck on how the rows are written into the datatable. If I uncomment that section, I get errors as the first column in the CSV file tries to write into an int field. I know you can specify which field in each row goes in which column, but that won't help here as the columns are unknown at this point. I just need it to be able to read ANY file in, regardless of structure, but with the row counter.
Hope that makes sense.
You write in your question that uncommenting the code that adds the first column leads to errors. This is because of your loop: it starts at 0, but the 0th column is the one you added manually. So you just need to skip it in your loop by starting at 1. However, the source array still has to be processed from the 0th element.
So the solution is:
First, uncomment the row adding code.
Then, in your loop, introduce an offset to leave the first column untouched:
for (int i = 0; i < headers.Length; i++)
{
    dr[i + 1] = rows[i];
}
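Put together, the relevant part of the import could look like this (a sketch based on the code above):

```csharp
// Sketch: uncomment the auto-increment counter column, then offset the
// data columns by one so the counter column stays untouched.
dt.Columns.Add("ROW", typeof(int));
dt.Columns["ROW"].AutoIncrement = true;
dt.Columns["ROW"].AutoIncrementSeed = 1;
dt.Columns["ROW"].AutoIncrementStep = 1;
// ... add the C1..Cn columns as before ...

DataRow dr = dt.NewRow();      // the ROW value is filled in automatically
for (int i = 0; i < headers.Length; i++)
{
    dr[i + 1] = rows[i];       // +1 skips the counter column
}
dt.Rows.Add(dr);
```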

Exporting table to CSV using LINQ

I'm having a hard time exporting a DB table to a CSV file using LINQ. I've tried a few things from related topics, but they were all far too long and I need the simplest solution. There has to be something.
The problem with this code is that the file is created, but it's empty. When I tried to debug, the query was fine; everything I want to export is there. What am I doing wrong?
private void Save_Click(object sender, RoutedEventArgs e)
{
    StreamWriter sw = new StreamWriter("test.csv");
    DataDataContext db = new DataDataContext();
    var query = from x in db.Zbozis
                orderby x.Id
                select x;
    foreach (var something in query)
    {
        sw.WriteLine(something.ToString());
    }
}
Edit: OK, I tried all your suggestions, sadly with the same result (the CSV was created, but all it contained was 10x "Lekarna.Zbozi", i.e. the name of the project/DB + the name of the table).
So I used a method that I found (why reinvent the wheel, huh).
public string ConvertToCSV(IQueryable query, string replacementDelimiter)
{
    // Create the CSV by looping through each row and then each field in each row,
    // separating the columns by commas

    // String builder for our header row
    StringBuilder header = new StringBuilder();
    // Get the properties (aka columns) to set in the header row
    PropertyInfo[] rowPropertyInfos = null;
    rowPropertyInfos = query.ElementType.GetProperties();
    // Set up the header row
    foreach (PropertyInfo info in rowPropertyInfos)
    {
        if (info.CanRead)
        {
            header.Append(info.Name + ",");
        }
    }
    // New row
    header.Append("\r\n");
    // String builder for our data rows
    StringBuilder data = new StringBuilder();
    // Set up the data rows
    foreach (var myObject in query)
    {
        // Loop through the fields in each row, separating them by commas and replacing
        // any commas in each field value with the replacement delimiter
        foreach (PropertyInfo info in rowPropertyInfos)
        {
            if (info.CanRead)
            {
                // Get the field's value, then replace any commas with the replacement
                // delimiter (Replace returns a new string, so assign the result back)
                string tmp = Convert.ToString(info.GetValue(myObject, null));
                if (!String.IsNullOrEmpty(tmp))
                {
                    tmp = tmp.Replace(",", replacementDelimiter);
                }
                data.Append(tmp + ",");
            }
        }
        // New row
        data.Append("\r\n");
    }
    // Check the data results... if they are empty then return an empty string,
    // otherwise append the data to the header
    string result = data.ToString();
    if (string.IsNullOrEmpty(result) == false)
    {
        header.Append(result);
        return header.ToString();
    }
    else
    {
        return string.Empty;
    }
}
So I have a modified version of previous code:
StreamWriter sw = new StreamWriter("pokus.csv");
ExportToCSV ex = new ExportToCSV();
var query = from x in db.Zbozis
orderby x.Id
select x;
string s = ex.ConvertToCSV(query,"; ");
sw.WriteLine(s);
sw.Flush();
Everything is fine, except that it exports every line into one column and does not separate the values. See here -> http://i.stack.imgur.com/XSNK0.jpg
The question then is obvious: how do I divide it into columns like I have in my DB?
Thanks
You are not closing the file. Either use "using":
using (StreamWriter sw = new StreamWriter("test.csv"))
{
    ..............
}
or simply try this
File.WriteAllLines("test.csv",query);
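Note that File.WriteAllLines expects a sequence of strings, so the entity query would need a projection first (a sketch; Id is the only property shown in the question, so the line format here is an assumption):

```csharp
// Project each entity to a CSV line before writing; writing the entities
// directly just calls ToString() and produces "Lekarna.Zbozi" lines.
File.WriteAllLines("test.csv",
    query.AsEnumerable().Select(x => x.Id.ToString()));
```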

Is there any way to insert data from a text file into a DataSet?

I have a text file that looks like this:
1 \t a
2 \t b
3 \t c
4 \t d
I have a dataset: DataSet ZX = new DataSet();
Is there any way to insert the text file values into this dataset?
Thanks in advance
You will have to parse the file manually. Maybe like this:
string data = System.IO.File.ReadAllText("myfile.txt");
DataRow row = null;
DataSet ds = new DataSet();
DataTable tab = new DataTable();
tab.Columns.Add("First");
tab.Columns.Add("Second");
string[] rows = data.Split(new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
foreach (string r in rows)
{
    string[] columns = r.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries);
    if (columns.Length <= tab.Columns.Count)
    {
        row = tab.NewRow();
        for (int i = 0; i < columns.Length; i++)
            row[i] = columns[i];
        tab.Rows.Add(row);
    }
}
ds.Tables.Add(tab);
UPDATE
If you don't know how many columns are in the text file, you can modify my original example as follows (assuming the number of columns is constant across all rows):
// ...
string[] columns = r.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries);
if (tab.Columns.Count == 0)
{
    for (int i = 0; i < columns.Length; i++)
        tab.Columns.Add("Column" + (i + 1));
}
if (columns.Length <= tab.Columns.Count)
{
    // ...
Also remove the initial creation of table columns:
// tab.Columns.Add("First");
// tab.Columns.Add("Second")
-- Pavel
Sure there is:
Define a DataTable and add DataColumns with the data types you want.
Read the file line by line, split each line on the tab character, and add the values to the DataTable as a new DataRow created with NewRow().
There is nice sample code on MSDN; take a look and follow the steps.
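The steps above could be sketched like this (assuming the two-column tab-separated file from the question; the file name is a placeholder):

```csharp
// Sketch: define the table, read line by line, split on tab,
// and add each line as a new row.
DataTable tab = new DataTable();
tab.Columns.Add("First", typeof(int));
tab.Columns.Add("Second", typeof(string));

using (StreamReader sr = new StreamReader("myfile.txt"))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        string[] parts = line.Split('\t');
        DataRow row = tab.NewRow();
        row[0] = int.Parse(parts[0].Trim());
        row[1] = parts[1].Trim();
        tab.Rows.Add(row);
    }
}
```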
Yes, create a DataTable on the fly; refer to this article for how to do it.
Read your file line by line and add the values to your DataTable; refer to this article for how to read a text file.
Try this
private DataTable GetTextToTable(string path)
{
    try
    {
        DataTable dataTable = new DataTable
        {
            Columns = {
                { "MyID", typeof(int) },
                "MyData"
            },
            TableName = "MyTable"
        };
        // Create an instance of StreamReader to read from a file.
        // The using statement also closes the StreamReader.
        using (StreamReader sr = new StreamReader(path))
        {
            String line;
            // Read and display lines from the file until the end of
            // the file is reached.
            while ((line = sr.ReadLine()) != null)
            {
                string[] words = line.Split(new string[] { "\\t" }, StringSplitOptions.RemoveEmptyEntries);
                dataTable.Rows.Add(words[0], words[1]);
            }
        }
        return dataTable;
    }
    catch (Exception e)
    {
        // Let the user know what went wrong.
        throw new Exception(e.Message);
    }
}
Call it like
GetTextToTable(Path.Combine(Server.MapPath("."), "TextFile.txt"));
You could also check out CSV File Imports in .NET
I'd also like to add the following to volpan's code:
String _source = System.IO.File.ReadAllText(FilePath, Encoding.GetEncoding(1253));
It's good to specify the encoding of your text file, so you can read the data correctly and, in my case, export it after modification to another file.
