Reading .txt file content to DataTable with Column headers on first line - c#

I am trying to load data from a .txt file which looks like this:
|ABC|DEF|GHI|
|111|222|333|
|444|555|666|
With code:
using (StringReader reader = new StringReader(new StreamReader(fileStream, Encoding.Default).ReadToEnd()))
{
string line;
//reader.ReadLine(); //skip first line
while (reader.Peek() != -1)
{
line = reader.ReadLine();
if (line == null || line.Length == 0)
continue;
string[] values = line.Split('|').Skip(1).ToArray();
if (!isColumnCreated)
{
for (int i = 0; i < values.Count(); i++)
{
table.Columns.Add(values[i]);
}
isColumnCreated = true;
}
DataRow row = table.NewRow();
for (int i = 0; i < values.Count(); i++)
{
row[i] = values[i];
}
table.Rows.Add(row);
products++;
}
}
The problem is that when I generate the DataTable, the first line is used for the column names, but the first line:
|ABC|DEF|GHI|
also appears in the rows.
How can I use the first line as column headers and the rest as rows?
I do not want to use CSVHelper for that, if possible.

You just need to skip adding a row on the pass where the header is created:
string line;
bool bheader= false;
//reader.ReadLine(); //skip first line
while (reader.Peek() != -1)
{
line = reader.ReadLine();
if (line == null || line.Length == 0)
continue;
string[] values = line.Split('|').Skip(1).ToArray();
if (!isColumnCreated)
{
for (int i = 0; i < values.Count(); i++)
{
table.Columns.Add(values[i]);
}
isColumnCreated = true;
bheader = true;
}
if(bheader ==false){
DataRow row = table.NewRow();
for (int i = 0; i < values.Count(); i++)
{
row[i] = values[i];
}
table.Rows.Add(row);
products++;
}
bheader = false; // reset inside the loop so the following lines are added as rows
}
}

The issue with your current code is that you handle the case where isColumnCreated is false, but not the case where it is true. If you change this:
if (!isColumnCreated)
{
for (int i = 0; i < values.Count(); i++)
{
table.Columns.Add(values[i]);
}
isColumnCreated = true;
}
DataRow row = table.NewRow();
for (int i = 0; i < values.Count(); i++)
{
row[i] = values[i];
}
table.Rows.Add(row);
products++;
to this
if (!isColumnCreated)
{
for (int i = 0; i < values.Count(); i++)
{
table.Columns.Add(values[i]);
}
isColumnCreated = true;
}
else
{
DataRow row = table.NewRow();
for (int i = 0; i < values.Count(); i++)
{
row[i] = values[i];
}
table.Rows.Add(row);
}
it should work just fine. Because a row is created only when the column headers already exist, the first line is used solely to build the columns and is never added as a row.

I would think you want to either add the columns or add a row, not both:
if (!isColumnCreated)
{
for (int i = 0; i < values.Count(); i++)
{
table.Columns.Add(values[i]);
}
isColumnCreated = true;
}
else
{
DataRow row = table.NewRow();
for (int i = 0; i < values.Count(); i++)
{
row[i] = values[i];
}
table.Rows.Add(row);
}
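For reference, here is a minimal self-contained sketch of the same header-or-row idea, assuming the pipe-delimited layout from the question. The names PipeTableParser and Parse are hypothetical, and unlike the question's code it trims the leading and trailing '|' so that no empty extra column is created:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class PipeTableParser
{
    // Hypothetical helper: the first non-empty line supplies the column
    // names, every later non-empty line becomes a row.
    public static DataTable Parse(IEnumerable<string> lines)
    {
        var table = new DataTable();
        bool isColumnCreated = false;

        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line))
                continue;

            // Trim the outer pipes so "|ABC|DEF|GHI|" yields exactly three values.
            string[] values = line.Trim().Trim('|').Split('|');

            if (!isColumnCreated)
            {
                foreach (string name in values)
                    table.Columns.Add(name);
                isColumnCreated = true;
            }
            else
            {
                // Rows.Add(params object[]) fills the columns in order;
                // it throws if a line has more values than there are columns.
                table.Rows.Add(values);
            }
        }
        return table;
    }
}
```

It could be called with something like PipeTableParser.Parse(File.ReadLines(path)), where path is whatever file you are loading.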

This would work (replace 'YourDelimiter' with your delimiter character, for example '|'):
DataTable dt = new DataTable();
using (System.IO.StreamReader sr = new System.IO.StreamReader("PathToFile"))
{
string currentline = string.Empty;
bool doneHeader = false;
while ((currentline = sr.ReadLine()) != null)
{
if (!doneHeader)
{
foreach (string item in currentline.Split('YourDelimiter'))
{
dt.Columns.Add(item);
}
doneHeader = true;
continue;
}
dt.Rows.Add();
int colCount = 0;
foreach (string item in currentline.Split('YourDelimiter'))
{
dt.Rows[dt.Rows.Count - 1][colCount] = item;
colCount++;
}
}
}

Another method, more LINQ-oriented:
Use File.ReadAllLines to read all the file's lines into a string array.
Create a List<string[]> containing all the data rows. The column values are obtained by splitting each row string on a provided delimiter.
The values of the first row are used to build the DataTable's Columns.
The first row is then removed from the List.
All the other rows are added to the DataTable.Rows collection.
Set the DataGridView.DataSource to the new DataTable.
char Delimiter = '|';
string[] Lines = File.ReadAllLines("[SomeFilePath]", Encoding.Default);
List<string[]> FileRows = Lines.Select(line =>
line.Split(new[] { Delimiter }, StringSplitOptions.RemoveEmptyEntries)).ToList();
DataTable dt = new DataTable();
dt.Columns.AddRange(FileRows[0].Select(col => new DataColumn() { ColumnName = col }).ToArray());
FileRows.RemoveAt(0);
FileRows.ForEach(row => dt.Rows.Add(row));
dataGridView1.DataSource = dt;

Related

How to remove duplicates Id and names from the csv file in c# winform before import to datagrid or datatable

Here is my code. Right now it removes only rows with a duplicate ID, but I want to remove rows with a duplicate ID and Name before importing the CSV file.
namespace Export_Import_CSV
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
this.BindDataGridView();
}
DataTable dt = new DataTable();
private void ImportBtn_Click(object sender, EventArgs e)
{
dt.Rows.Clear();
OpenFileDialog dlg = new OpenFileDialog();
dlg.Multiselect = true;
if (DialogResult.OK == dlg.ShowDialog())
{
string path = dlg.FileName;
// BindData(path);
AddCSVDataToGrid(path);
MessageBox.Show("Import Action Completed");
}
}
private void BindData(string filePath)
{
string[] Lines = System.IO.File.ReadAllLines(filePath);
string[] lines = Lines.Distinct().ToArray();
if (Lines.Length != lines.Length)
{
MessageBox.Show("duplicates data found,Ignored duplicates data from csv file");
}
if (lines.Length > 0)
{
string[] dataWords;
DataRow dr;
for (int i = 1; i < lines.Length; i++)
{
dataWords = lines[i].Split(',');
dr = dt.NewRow();
int totCols = dt.Columns.Count;
for (int colIndex = 0; colIndex < totCols; colIndex++)
{
dr[colIndex] = dataWords[colIndex];
}
dt.Rows.Add(dr);
}
dataGridView1.DataSource = dt;
}
}
}
}
By using two HashSets (one for the ID and one for the name), we can check whether an ID or name has already come up:
string[] lines = System.IO.File.ReadAllLines(filePath);
HashSet<int> IDs = new HashSet<int>();
HashSet<string> Names = new HashSet<string>();
bool hasError = false;
string[] dataWords;
DataRow dr;
for (int i = 1; i < lines.Length; i++)
{
dataWords = lines[i].Split(',');
int ID = int.Parse(dataWords[0]);
string name = dataWords[1] + " " + dataWords[2];
if (IDs.Contains(ID) || Names.Contains(name))
{
hasError = true;
continue;
}
IDs.Add(ID);
Names.Add(name);
dr = dt.NewRow();
int totCols = dt.Columns.Count;
for (int colIndex = 0; colIndex < totCols; colIndex++)
{
dr[colIndex] = dataWords[colIndex];
}
dt.Rows.Add(dr);
}
dataGridView1.DataSource = dt;
if (hasError)
{
MessageBox.Show("duplicates data found,Ignored duplicates data from csv file");
}
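The same two-HashSet check can be sketched without any WinForms dependency, which makes it easier to try out in isolation. This is only an illustration: CsvDeduplicator and RemoveDuplicates are hypothetical names, the column layout (ID, first name, last name) is taken from the code above, and IDs are kept as strings here so that a non-numeric ID does not throw:

```csharp
using System;
using System.Collections.Generic;

public static class CsvDeduplicator
{
    // Hypothetical helper: keeps the header plus every data line whose ID
    // (column 0) and full name (columns 1 and 2) have not been seen before.
    public static List<string> RemoveDuplicates(IReadOnlyList<string> lines, out bool foundDuplicates)
    {
        var ids = new HashSet<string>();
        var names = new HashSet<string>();
        var kept = new List<string>();
        foundDuplicates = false;

        for (int i = 0; i < lines.Count; i++)
        {
            if (i == 0)
            {
                kept.Add(lines[i]); // header line is always kept
                continue;
            }

            string[] words = lines[i].Split(',');
            if (words.Length < 3)
                continue; // skip blank or malformed lines

            string id = words[0].Trim();
            string name = words[1].Trim() + " " + words[2].Trim();

            // A line is dropped when either its ID or its name was already seen.
            if (ids.Contains(id) || names.Contains(name))
            {
                foundDuplicates = true;
                continue;
            }

            ids.Add(id);
            names.Add(name);
            kept.Add(lines[i]);
        }
        return kept;
    }
}
```

The returned list can then be fed to the existing row-building loop, and foundDuplicates decides whether to show the message box.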
// Make sure each line has been trimmed before it is compared
string[] Lines = System.IO.File.ReadAllLines(filePath);
string[] lines = Lines.Distinct().ToArray();
// Distinct() drops repeated lines, so a shorter array means duplicates were found
if (lines.Length < Lines.Length)
{
MessageBox.Show("duplicates data found,Ignored duplicates data from csv file");
}

How can I paste multi-rows from Excel to a DataGridView in C#?

I am trying to paste rows from an Excel sheet to a DataGridView in C#.
I have used the following code:
private void PasteClipboard(DataGridView myDataGridView)
{
DataObject o = (DataObject)Clipboard.GetDataObject();
if (o.GetDataPresent(DataFormats.Text))
{
if (myDataGridView.RowCount > 0)
myDataGridView.Rows.Clear();
if (myDataGridView.ColumnCount > 0)
myDataGridView.Columns.Clear();
bool columnsAdded = false;
string[] pastedRows = Regex.Split(o.GetData(DataFormats.Text).ToString().TrimEnd("\r\n".ToCharArray()), "\r\n");
foreach (string pastedRow in pastedRows)
{
string[] pastedRowCells = pastedRow.Split(new char[] { '\t' });
if (!columnsAdded)
{
for (int i = 0; i < pastedRowCells.Length; i++)
myDataGridView.Columns.Add("col" + i, pastedRowCells[i]);
columnsAdded = true;
continue;
}
myDataGridView.Rows.Add();
int myRowIndex = myDataGridView.Rows.Count - 1;
using (DataGridViewRow myDataGridViewRow = myDataGridView.Rows[myRowIndex])
{
for (int i = 0; i < pastedRowCells.Length; i++)
myDataGridViewRow.Cells[i].Value = pastedRowCells[i];
}
}
}
}
However, as a result, only one row contains data while the others are empty. For instance, if I copy and paste 3 rows, the 3rd row is the only row with data and the other two rows are empty. What am I doing wrong?
You need to do this:
int myRowIndex = myDataGridView.Rows.Add();
Instead of this:
myDataGridView.Rows.Add();
int myRowIndex = myDataGridView.Rows.Count - 1;
Note that when you create a new row, you also receive the index of that row as the return value of myDataGridView.Rows.Add(). Your code ignores that value and instead assumes that the newly created row will always be the last one: myDataGridView.Rows.Count - 1. That assumption fails when AllowUserToAddRows is true, because the last row is then the empty placeholder row for new records.

CSV to 2D array but Split function throws a NullPointerException

I was trying to loop through a CSV file and put it into a 2D array.
My app crashes at var tokens = sr.ReadLine().Split(','); which throws a NullPointerException. How can I fix this?
Below is my whole method named csvToArray:
public string[,] csvToArray (string filePath)
{
int col = colCount(filePath);
int row = rowCount(filePath);
string line;
string[,] data = new string[col, row];
using (StreamReader sr = new StreamReader(filePath))
{
for (int i = 0; i < col; i++)
{
var tokens = sr.ReadLine().Split(',');
for (int j = 0; j < row; j++)
{
data[i, j] = tokens[j];
}
}
}
return data;
}
What does not make sense is that it finished the whole loop. The variables below the exception have the values that they were supposed to have.
Use the outer loop for the rows and the inner loop for the columns:
public static string[,] csvToArray(string filePath)
{
int col = colCount(filePath);
int row = rowCount(filePath);
string line;
string[,] data = new string[row, col];
using (StreamReader sr = new StreamReader(filePath))
{
for (int i = 0; i < row; i++)
{
var tokens = sr.ReadLine().Split(',');
for (int j = 0; j < col; j++)
{
data[i, j] = tokens[j];
}
}
}
return data;
}
In your code you are effectively calling null.Split(), because ReadLine() returns null once the end of the file is reached. That is why you get the exception.
If you want to insert 0 in each cell for a blank row, you can use the following code:
for (int i = 0; i < row; i++)
{
string content = sr.ReadLine();
if (!string.IsNullOrEmpty(content))
{
var tokens = content.Split(',');
for (int j = 0; j < col; j++)
{
data[i, j] = tokens[j];
}
}
else
{
for (int j = 0; j < col; j++)
{
data[i, j] = "0";
}
}
}
Let's first analyse the errors in your example.
1) Why do you even need to know the number of rows and columns at the beginning? It is overhead.
2) The rows and columns in your loops are swapped.
3) The exception is thrown in your example because you reached EOF.
So, here is a better way to read a CSV into a 2D matrix:
public int[][] csvToArray (string filePath)
{
string line = null;
var result = new List<int[]>();
using (var sr = new StreamReader(filePath))
{
while((line = sr.ReadLine()) != null)
{
if(string.IsNullOrWhiteSpace(line)) continue;
result.Add(line.Split(',').Select(x=> string.IsNullOrWhiteSpace(x) ? 0 : int.Parse(x)).ToArray());
}
}
return result.ToArray();
}
Then you can just check your matrix for consistency.
At least this way you won't open your file three times, and you are protected from counting errors.
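One possible reading of "check your matrix for consistency" is verifying that every row has the same number of cells. A minimal sketch under that assumption (MatrixCheck and IsRectangular are hypothetical names, not part of the answer above):

```csharp
using System;
using System.Linq;

public static class MatrixCheck
{
    // Hypothetical helper: true when every row has the same number of cells.
    // An empty matrix is treated as consistent.
    public static bool IsRectangular(int[][] matrix)
    {
        if (matrix.Length == 0)
            return true;

        int width = matrix[0].Length;
        return matrix.All(row => row.Length == width);
    }
}
```

Calling MatrixCheck.IsRectangular(csvToArray(filePath)) right after reading would flag files with ragged rows before the data is used.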

How to Convert String Data to Data Table in C# asp.net?

So far I have tried to convert a DataTable to a string as follows:
public static string convertDataTableToString(DataTable dataTable)
{
string data = string.Empty;
int rowsCount = dataTable.Rows.Count;
for (int i = 0; i < rowsCount; i++)
{
DataRow row = dataTable.Rows[i];
int columnsCount = dataTable.Columns.Count;
for (int j = 0; j < columnsCount; j++)
{
data += dataTable.Columns[j].ColumnName + "~" + row[j];
if (j == columnsCount - 1)
{
if (i != (rowsCount - 1))
data += "$";
}
else
data += "|";
}
}
return data;
}
Now I want to convert returned string into DataTable again.
You can use String.Split to break your string into rows and cells. If the column setup is always the same (as it should be), then you can simply add the columns on your first iteration through the cells.
Here's a simple example:
public static DataTable convertStringToDataTable(string data)
{
DataTable dataTable = new DataTable();
bool columnsAdded = false;
foreach(string row in data.Split('$'))
{
DataRow dataRow = dataTable.NewRow();
foreach(string cell in row.Split('|'))
{
string[] keyValue = cell.Split('~');
if (!columnsAdded)
{
DataColumn dataColumn = new DataColumn(keyValue[0]);
dataTable.Columns.Add(dataColumn);
}
dataRow[keyValue[0]] = keyValue[1];
}
columnsAdded = true;
dataTable.Rows.Add(dataRow);
}
return dataTable;
}
Alternatively you could get a list of all columns prior to the loop, but this way is likely easier for your purpose.
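The alternative mentioned above, reading the column list before the loop, might look roughly like this. It is a sketch only: DataTableText and Parse are hypothetical names, and it assumes every row carries the same columns in the same order and that values never contain '|' or '$':

```csharp
using System;
using System.Data;

public static class DataTableText
{
    // Hypothetical variant: reads the column names from the first row
    // before looping, then fills each row by position.
    public static DataTable Parse(string data)
    {
        var table = new DataTable();
        if (string.IsNullOrEmpty(data))
            return table;

        string[] rows = data.Split('$');

        // Every row repeats the column names, so the first row is enough.
        foreach (string cell in rows[0].Split('|'))
            table.Columns.Add(cell.Split('~')[0]);

        foreach (string row in rows)
        {
            DataRow dataRow = table.NewRow();
            string[] cells = row.Split('|');
            for (int i = 0; i < cells.Length; i++)
            {
                // Split into at most two parts so a '~' inside the value survives.
                string[] keyValue = cells[i].Split(new[] { '~' }, 2);
                dataRow[i] = keyValue.Length > 1 ? keyValue[1] : string.Empty;
            }
            table.Rows.Add(dataRow);
        }
        return table;
    }
}
```

Filling by position avoids a column-name lookup per cell, at the cost of relying on a stable column order.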

NPOI with ASP.NET (C#)

I am trying to get NPOI to work with ASP.NET (C#) and I want to read an excel file and put it in a DataSet. Here is the code I attempted:
public static DataTable getExcelData(string FileName, string strSheetName)
{
DataTable dt = new DataTable();
HSSFWorkbook hssfworkbook;
using (FileStream file = new FileStream(FileName, FileMode.Open, FileAccess.Read))
{
hssfworkbook = new HSSFWorkbook(file);
}
ISheet sheet = hssfworkbook.GetSheet(strSheetName);
System.Collections.IEnumerator rows = sheet.GetRowEnumerator();
while (rows.MoveNext())
{
IRow row = (HSSFRow)rows.Current;
if (dt.Columns.Count == 0)
{
for (int j = 0; j < row.LastCellNum; j++)
{
dt.Columns.Add(row.GetCell(j).ToString());
}
continue;
}
DataRow dr = dt.NewRow();
for (int i = 0; i < row.LastCellNum; i++)
{
ICell cell = row.GetCell(i);
if (cell == null)
{
dr[i] = null;
}
else
{
dr[i] = cell.ToString();
}
}
dt.Rows.Add(dr);
}
return dt;
}
The Error that I get is
+ $exception {"Object reference not set to an instance of an object."} System.Exception {System.NullReferenceException}
The odd thing is that this actually works with 2 excel files that I have, but when I put in a third one it crashes with that error.
This returns null if strSheetName isn't found:
ISheet sheet = hssfworkbook.GetSheet(strSheetName);
Try:
for( int iSheet = 0; iSheet < hssfworkbook.NumberOfSheets; ++iSheet )
{
ISheet sheet = hssfworkbook.GetSheetAt(iSheet); // could cast to HSSFSheet
String strSheetNameActual = sheet.SheetName;
}
Then figure out how you want to compare strSheetName to strSheetNameActual or which sheets you want to process and how.
Try using this:
for (int j = row.FirstCellNum; j < row.LastCellNum; j++)
and
for (int i = row.FirstCellNum; i < row.LastCellNum; i++)
Instead of:
for (int j = 0; j < row.LastCellNum; j++)
and
for (int i = 0; i < row.LastCellNum; i++)
Also, make sure that you manage the case when the cells on the first row are null:
if (dt.Columns.Count == 0)
{
int empty = 0;
for (int j = row.FirstCellNum; j < row.LastCellNum; j++)
{
ICell cell = row.GetCell(j);
if (cell == null)
{
dt.Columns.Add(String.Format("emptyColumnName_{0}", empty++));
}
else
{
dt.Columns.Add(row.GetCell(j).ToString());
}
}
continue;
}
If you always want to read from the first sheet (which would also let you drop the second method parameter, the sheet name, which is the likely cause of your error), you may use:
// rest of the method's code
ISheet sheet = hssfworkbook.GetSheetAt(0);
if (sheet == null)
return dt;
var rows = sheet.GetRowEnumerator();
// rest of the method's code
