How do I parse a text file in C#? [closed] - c#

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
How do I parse a text file in C#?

Check out this interesting approach, LINQ to Text Files. It is very nice: you only need an IEnumerable<string> method that yields the result of every file.ReadLine() call, and then you run your query over it.
Here is another article that better explains the same technique.
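As a rough illustration of that technique (not the code from the linked articles), the sketch below wraps a TextReader in an iterator and runs a LINQ query over it. The class name LineQueries and the "skip blank lines and lines starting with #" rule are assumptions made up for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper class, named for illustration only.
public static class LineQueries
{
    // Lazily yields one line at a time, so the whole file is never held in memory.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }

    // Example query (an assumed rule): trimmed, non-empty lines that do not start with '#'.
    public static List<string> NonCommentLines(TextReader reader)
    {
        return ReadLines(reader)
            .Select(l => l.Trim())
            .Where(l => l.Length > 0 && l[0] != '#')
            .ToList();
    }
}
```

To query a real file, pass in a StreamReader opened on it. Since .NET 4, File.ReadLines(path) offers the same lazy line enumeration out of the box.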

using (TextReader rdr = new StreamReader(fullFilePath))
{
string line;
while ((line = rdr.ReadLine()) != null)
{
// use line here
}
}
Set the variable "fullFilePath" to the full path, e.g. C:\temp\myTextFile.txt

The algorithm might look like this:
Open Text File
For every line in the file:
Parse Line
There are several approaches to parsing a line.
The easiest from a beginner standpoint is to use the String methods.
System.String at MSDN
If you are up for more of a challenge, then you can use the System.Text.RegularExpressions namespace to parse your text.
RegEx at MSDN
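To make the two options concrete, here is a minimal sketch that parses the same line both ways. It assumes a made-up "key = value" line format; the class and method names are hypothetical and not taken from the MSDN pages.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; the "key = value" line format is an assumption.
public static class LineParsing
{
    // String-method version: split on the first '=' and trim both halves.
    public static bool TryParseWithStringMethods(string line, out string key, out string value)
    {
        key = null;
        value = null;
        int eq = line.IndexOf('=');
        if (eq <= 0) return false;
        key = line.Substring(0, eq).Trim();
        value = line.Substring(eq + 1).Trim();
        return key.Length > 0;
    }

    // Regex version: named groups capture the key and the value, ignoring surrounding whitespace.
    private static readonly Regex KeyValue =
        new Regex(@"^\s*(?<key>[^=\s]+)\s*=\s*(?<value>.*?)\s*$");

    public static bool TryParseWithRegex(string line, out string key, out string value)
    {
        Match m = KeyValue.Match(line);
        key = m.Success ? m.Groups["key"].Value : null;
        value = m.Success ? m.Groups["value"].Value : null;
        return m.Success;
    }
}
```

The string-method version is easier to read and debug. The regex version puts the whole line format in one pattern, which tends to pay off once the format grows more complicated.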

You might want to use a helper class such as the one described at http://www.blackbeltcoder.com/Articles/strings/a-text-parsing-helper-class.

From years of analyzing CSV files, including ones that are broken or have edge cases, here is my code that passes virtually all of my unit tests:
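/// <summary>
/// Parse a single line of delimited text into an array of fields
/// </summary>
/// <param name="s">The line of text to parse</param>
/// <param name="delimiter">The field delimiter, e.g. a comma</param>
/// <param name="text_qualifier">The text qualifier, e.g. a double quote</param>
/// <param name="array">The fields that were parsed</param>
/// <returns>False if a text qualifier was opened but never closed; true otherwise</returns>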
public static bool TryParseCSVLine(string s, char delimiter, char text_qualifier, out string[] array)
{
bool success = true;
List<string> list = new List<string>();
StringBuilder work = new StringBuilder();
for (int i = 0; i < s.Length; i++) {
char c = s[i];
// If we are starting a new field, is this field text qualified?
if ((c == text_qualifier) && (work.Length == 0)) {
int p2;
while (true) {
p2 = s.IndexOf(text_qualifier, i + 1);
// for some reason, this text qualifier is broken
if (p2 < 0) {
work.Append(s.Substring(i + 1));
i = s.Length;
success = false;
break;
}
// Append this qualified string
work.Append(s.Substring(i + 1, p2 - i - 1));
i = p2;
// If this is a double quote, keep going!
if (((p2 + 1) < s.Length) && (s[p2 + 1] == text_qualifier)) {
work.Append(text_qualifier);
i++;
// otherwise, this is a single qualifier, we're done
} else {
break;
}
}
// Does this start a new field?
} else if (c == delimiter) {
list.Add(work.ToString());
work.Length = 0;
// Test for special case: when the user has written a casual comma, space, and text qualifier, skip the space
// Checks if the second parameter of the if statement will pass through successfully
// e.g. "bob", "mary", "bill"
if (i + 2 <= s.Length - 1) {
if (s[i + 1].Equals(' ') && s[i + 2].Equals(text_qualifier)) {
i++;
}
}
} else {
work.Append(c);
}
}
list.Add(work.ToString());
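// If we have nothing in the list, and it's possible that this might be a tab delimited list, try that before giving up
// (ParseLine, DEFAULT_TAB_DELIMITER and DEFAULT_QUALIFIER are not shown here; presumably they are defined elsewhere in the same library)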
if (list.Count == 1 && delimiter != DEFAULT_TAB_DELIMITER) {
string[] tab_delimited_array = ParseLine(s, DEFAULT_TAB_DELIMITER, DEFAULT_QUALIFIER);
if (tab_delimited_array.Length > list.Count) {
array = tab_delimited_array;
return success;
}
}
// Return the array we parsed
array = list.ToArray();
return success;
}
However, this function does not actually parse every valid CSV file out there! Some files have embedded newlines in them, and you need to enable your stream reader to parse multiple lines together to return an array. Here's a tool that does that:
/// <summary>
/// Parse a line whose values may include newline symbols or CR/LF
/// </summary>
/// <param name="sr"></param>
/// <returns></returns>
public static string[] ParseMultiLine(StreamReader sr, char delimiter, char text_qualifier)
{
StringBuilder sb = new StringBuilder();
string[] array = null;
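while (!sr.EndOfStream) {
// Read in a line
sb.Append(sr.ReadLine());
// Does it parse?
string s = sb.ToString();
if (TryParseCSVLine(s, delimiter, text_qualifier, out array)) {
return array;
}
// A qualified field is still open, so it continues on the next line.
// ReadLine() strips the line break, so put one back (the original line-ending style is not preserved).
sb.Append(Environment.NewLine);
}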
// Fails to parse - return the best array we were able to get
return array;
}
For reference, I placed my open source CSV code on code.google.com.

If you have more than a trivial language, use a parser generator. It drove me nuts, but I've heard good things about ANTLR. (Note: get the manual and read it before you start. If you have only used other parser generators before, you will probably not approach it correctly right off the bat; at least I didn't.)
Other tools also exist.

What do you mean by parse? Parse usually means to split the input into tokens, which you might do if you're trying to implement a programming language. If you're just wanting to read the contents of a text file, look at System.IO.FileInfo.

Without really knowing what sort of text file you're on about, it's hard to answer. However, the FileHelpers library has a broad set of tools to help with fixed-length file formats, multirecord files, delimited files, etc.

A small improvement on Pero's answer:
FileInfo txtFile = new FileInfo(@"c:\myfile.txt");
if (!txtFile.Exists) { /* error handling */ }
using (TextReader rdr = txtFile.OpenText())
{
// use the text file as Pero suggested
}
The FileInfo class gives you the opportunity to "do stuff" with the file before you actually start reading from it. You can also pass it around between functions as a better abstraction of the file's location (rather than using the full path string). FileInfo canonicalizes the path so it's absolutely correct (e.g. turning / into \ where appropriate) and lets you extract extra data about the file -- parent directory, extension, name only, permissions, etc.

To begin with, make sure that you have the following namespaces:
using System.Collections;
using System.Data;
using System.IO;
using System.Text.RegularExpressions;
Next, we build a function that parses any CSV input string into a DataTable:
public DataTable ParseCSV(string inputString) {
DataTable dt=new DataTable();
// declare the Regular Expression that will match versus the input string
Regex re=new Regex("((?<field>[^\",\\r\\n]+)|\"(?<field>([^\"]|\"\")+)\")(,|(?<rowbreak>\\r\\n|\\n|$))");
ArrayList colArray=new ArrayList();
ArrayList rowArray=new ArrayList();
int colCount=0;
int maxColCount=0;
string rowbreak="";
string field="";
MatchCollection mc=re.Matches(inputString);
foreach(Match m in mc) {
// retrieve the field and replace two double-quotes with a single double-quote
field=m.Result("${field}").Replace("\"\"","\"");
rowbreak=m.Result("${rowbreak}");
if (field.Length > 0) {
colArray.Add(field);
colCount++;
}
if (rowbreak.Length > 0) {
// add the column array to the row Array List
rowArray.Add(colArray.ToArray());
// create a new Array List to hold the field values
colArray=new ArrayList();
if (colCount > maxColCount)
maxColCount=colCount;
colCount=0;
}
}
if (rowbreak.Length == 0) {
// this is executed when the last line doesn't
// end with a line break
rowArray.Add(colArray.ToArray());
if (colCount > maxColCount)
maxColCount=colCount;
}
// create the columns for the table
for(int i=0; i < maxColCount; i++)
dt.Columns.Add(String.Format("col{0:000}",i));
// convert the row Array List into an Array object for easier access
Array ra=rowArray.ToArray();
for(int i=0; i < ra.Length; i++) {
// create a new DataRow
DataRow dr=dt.NewRow();
// convert the column Array List into an Array object for easier access
Array ca=(Array)(ra.GetValue(i));
// add each field into the new DataRow
for(int j=0; j < ca.Length; j++)
dr[j]=ca.GetValue(j);
// add the new DataRow to the DataTable
dt.Rows.Add(dr);
}
// in case no data was parsed, create a single column
if (dt.Columns.Count == 0)
dt.Columns.Add("NoData");
return dt;
}
Now that we have a parser for converting a string into a DataTable, all we need now is a function that will read the content from a CSV file and pass it to our ParseCSV function:
public DataTable ParseCSVFile(string path) {
string inputString="";
// check that the file exists before opening it
if (File.Exists(path)) {
// the using block closes the reader even if ReadToEnd throws
using (StreamReader sr = new StreamReader(path)) {
inputString = sr.ReadToEnd();
}
}
return ParseCSV(inputString);
}
And now you can easily fill a DataGrid with data coming off the CSV file:
protected System.Web.UI.WebControls.DataGrid DataGrid1;
private void Page_Load(object sender, System.EventArgs e) {
// call the parser
DataTable dt=ParseCSVFile(Server.MapPath("./demo.csv"));
// bind the resulting DataTable to a DataGrid Web Control
DataGrid1.DataSource=dt;
DataGrid1.DataBind();
}
Congratulations! You are now able to parse CSV into a DataTable. Good luck with your programming.

Related

CSV export function: what to do if a string contains the separator character?

I use a function to convert a DataTable to CSV, and I use File.WriteAllText to save it to a file.
private static string DataTableToCSV(DataTable dtable, char seperator)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i < dtable.Columns.Count; i++)
{
sb.Append(dtable.Columns[i]);
if (i < dtable.Columns.Count - 1)
sb.Append(seperator);
}
sb.AppendLine();
foreach (DataRow dr in dtable.Rows)
{
for (int i = 0; i < dtable.Columns.Count; i++)
{
sb.Append(dr[i].ToString());
if (i < dtable.Columns.Count - 1)
{
sb.Append(seperator);
}
}
sb.AppendLine();
}
return sb.ToString();
}
Well, the code is working. My problem is that the separator in the CSV is ';'. Now, of course, errors occur when a string in the table contains a semicolon. Is there perhaps an elegant way to solve the problem?
You should consider using a library to handle the CSV part for you. The current accepted answer handles quoting when the value contains the delimiter, but what happens when the value contains or starts with the quote character, or what if the value contains a newline? That approach will create an invalid file. The de-facto CSV standard specifies that fields should be quoted when the value contains a delimiter or a newline, and that quotes should be doubled up to "escape" them.
There are many libraries that can help with this, including one that I'm the author of: Sylvan.Data.Csv. Sylvan handles your scenarios in a very straightforward way:
using Sylvan.Data.Csv;
static string DataTableToCSV(DataTable dtable, char seperator)
{
using var sw = new StringWriter();
var opts = new CsvDataWriterOptions { Delimiter = seperator };
using var csvw = CsvDataWriter.Create(sw, opts);
csvw.Write(dtable.CreateDataReader());
return sw.ToString();
}
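If pulling in a library is not an option, the quoting rules described above (quote a field that contains the delimiter, a quote, or a newline, and double any embedded quotes) can be sketched by hand as follows. This is a minimal illustration under those stated rules, not Sylvan's implementation; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, named for illustration only.
public static class CsvEscaping
{
    // Quotes the field only when needed and doubles any embedded quote characters.
    public static string EscapeField(string value, char delimiter)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;
        bool needsQuotes = value.IndexOf(delimiter) >= 0
            || value.IndexOf('"') >= 0
            || value.IndexOf('\n') >= 0
            || value.IndexOf('\r') >= 0;
        if (!needsQuotes) return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Escapes every field and joins the results with the delimiter.
    public static string JoinRow(IEnumerable<string> fields, char delimiter)
    {
        return string.Join(delimiter.ToString(), fields.Select(f => EscapeField(f, delimiter)));
    }
}
```

In the DataTableToCSV function from the question, each sb.Append(dr[i].ToString()) call would then become sb.Append(CsvEscaping.EscapeField(dr[i].ToString(), seperator)).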
I wrote a little helper for formatting every line of my CSV.
private string FormatForCsv(string value) => value != null && value.Contains(';') ? "\"" + value + "\"" : value;
So then you can implement using:
sb.Append(FormatForCsv(dr[i]?.ToString()));
Of course you can use the same for the headers too.
I also added a null check when converting dr[i] to a string, just to be on the safe side.

Use the continue keyword to proceed with the loop

I am reading data line by line from the columns of an Excel file (which is actually a comma-separated CSV file) that is sent by an external entity. Among the columns to be read is the time, which is in 00.00 format, so a split method is used to read all the different columns. However, the file sometimes comes with extra columns (commas between the elements), so the split elements are not always correct. Below is the code used to read and split the different columns; these elements will be saved in the database.
public void SaveFineDetails()
{
List<string> erroredFines = new List<string>();
try
{
log.Debug("Start : SaveFineDetails() - Saving Downloaded files fines..");
if (!this.FileLines.Any())
{
log.Info(string.Format("End : SaveFineDetails() - DataFile was Empty"));
return;
}
using (RAC_TrafficFinesContext db = new RAC_TrafficFinesContext())
{
this.FileLines.RemoveAt(0);
this.FileLines.RemoveAt(FileLines.Count - 1);
int itemCnt = 0;
int errorCnt = 0;
int duplicateCnt = 0;
int count = 0;
foreach (var line in this.FileLines)
{
count++;
log.DebugFormat("Inserting {0} of {1} Fines..", count.ToString(), FileLines.Count.ToString());
string[] bits = line.Split(',');
int bitsLength = bits.Length;
if (bitsLength == 9)
{
string fineNumber = bits[0].Trim();
string vehicleRegistration = bits[1];
string offenceDateString = bits[2];
string offenceTimeString = bits[3];
int trafficDepartmentId = this.TrafficDepartments.Where(tf => tf.DepartmentName.Trim().Equals(bits[4], StringComparison.InvariantCultureIgnoreCase)).Select(tf => tf.DepartmentID).FirstOrDefault();
string proxy = bits[5];
decimal fineAmount = GetFineAmount(bits[6]);
DateTime fineCreatedDate = DateTime.Now;
DateTime offenceDate = GetOffenceDate(offenceDateString, offenceTimeString);
string username = Constants.CancomFTPServiceUser;
bool isAartoFine = bits[7] == "1" ? true : false;
string fineStatus = "Sent";
try
{
var dupCheck = db.GetTrafficFineByNumber(fineNumber);
if (dupCheck != null)
{
duplicateCnt++;
string ExportFileName = (base.FileName == null) ? string.Empty : base.FileName;
DateTime FileDate = DateTime.Now;
db.CreateDuplicateFine(ExportFileName, FileDate, fineNumber);
}
else
{
var adminFee = db.GetAdminFee();
db.UploadFTPFineData(fineNumber, fineAmount, vehicleRegistration, offenceDate, offenceDateString, offenceTimeString, trafficDepartmentId, proxy, false, "Imported", username, adminFee, isAartoFine, dupCheck != null, fineStatus);
}
itemCnt++;
}
catch
{
errorCnt++;
}
}
else
{
erroredFines.Add(line);
continue;
}
}
Now the problem is that this file doesn't always come with 9 elements as we expect. For example, in this image the lines are not the same (ignore the first line; it contains the headers).
On the first line, FM is supposed to be part of 36DXGP instead of being a separate element. This means there is now an extra column. That brings us to the issue at hand, which is the time element: because of the extra comma, the time is now read as 20161216, so the split on the time element does not work at all. So what I did was read the incorrect line and check its length; if the length is not 9, I add the line to the error list and continue.
But my continue keyword doesn't seem to work: it gets into the else part and then goes back to read the very same error line.
I have checked the answers on Break vs Continue, and they provide good examples of how continue works. I introduced the else because the format in those examples did not work for me (although the else did not make any difference either). Here is the sample data.
NOTE: the first line to be read starts with 96
H,1789,,,,,,,,
96/17259/801/035415,FM,36DXGP,20161216,17.39,city hall-cape town,Makofane,200,0,0
MA/80/034808/730,CA230721,20170117,17.43,malmesbury,PATEL,200,0,0,
What is it that I am doing so wrong here?
I have found a way to solve my problem. There was an issue with the length of the line because of the trailing comma, which caused an empty element. I got rid of this empty element with the following code and determined the new length:
bits = bits.Where(x => !string.IsNullOrEmpty(x)).ToArray();
int length = bits.Length;
All is well now.
I suggest you use the following overload for performance and readability reasons:
line.Split(new char[] {','}, StringSplitOptions.RemoveEmptyEntries);
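One caveat with both fixes is that they remove every empty element, not just the trailing one, so a line with an empty field in the middle would have its later columns shifted left. The sketch below compares the behaviours on the sample data. The class and method names are hypothetical, and trimming trailing commas is offered as one possible alternative, not as what the original poster used.

```csharp
using System;

// Hypothetical helper class, named for illustration only.
public static class SplitDemo
{
    // Plain Split: a trailing comma produces an extra empty last element.
    public static int CountAll(string line)
    {
        return line.Split(',').Length;
    }

    // RemoveEmptyEntries drops ALL empty elements, including ones in the middle.
    public static int CountNonEmpty(string line)
    {
        return line.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    // Alternative: strip trailing commas first, so empty middle fields keep their position.
    // Note that TrimEnd(',') removes every trailing comma, so trailing empty fields are dropped too.
    public static string[] SplitIgnoringTrailingComma(string line)
    {
        return line.TrimEnd(',').Split(',');
    }
}
```

For the second sample line, CountAll gives 10 and the other two give 9. For a line such as "a,,c,", RemoveEmptyEntries leaves only 2 elements, while trimming the trailing comma keeps all 3 positions.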

How to format and read CSV file?

Here is just an example of the data I need to format.
The first column is simple; the problem is the second column.
What would be the best approach to formatting multiple data fields in one column?
How can I parse this data?
Important: the second column needs to contain multiple values, as in the example below.
Name Details
Alex Age:25
Height:6
Hair:Brown
Eyes:Hazel
A csv should probably look like this:
Name,Age,Height,Hair,Eyes
Alex,25,6,Brown,Hazel
Each cell should be separated by exactly one comma from its neighbor.
You can reformat it as such by using a simple regex which replaces certain newline and non-newline whitespace with commas (you can easily find each block because it has values in both columns).
A CSV file is normally defined using commas as field separators and a line break (CR/LF) as the row separator. You are using line breaks within your second column, which will cause problems. You'll need to reformat your second column to use some other separator between the multiple values. A common alternative separator is the | (pipe) character.
Your format would then look like:
Alex,Age:25|Height:6|Hair:Brown|Eyes:Hazel
In your parsing, you would first parse the comma separated fields (which would return two values), and then parse the second field as pipe separated.
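That two-stage parse can be sketched as follows, with a third split on ':' to separate each detail's name from its value. The class and method names are hypothetical, and the sketch assumes that names and values contain no commas, pipes, or quotes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, named for illustration only.
public static class DetailParser
{
    // Parses a line such as "Alex,Age:25|Height:6|Hair:Brown|Eyes:Hazel"
    // into the name plus a dictionary of detail name/value pairs.
    public static KeyValuePair<string, Dictionary<string, string>> ParseRecord(string line)
    {
        // Stage 1: comma-separated fields (name, details).
        string[] fields = line.Split(new[] { ',' }, 2);
        if (fields.Length != 2)
            throw new FormatException("Expected 'name,details' but got: " + line);

        // Stage 2: pipe-separated details, each of the form Key:Value.
        var details = new Dictionary<string, string>();
        foreach (string pair in fields[1].Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] kv = pair.Split(new[] { ':' }, 2);
            if (kv.Length != 2)
                throw new FormatException("Expected 'Key:Value' but got: " + pair);
            details[kv[0].Trim()] = kv[1].Trim();
        }
        return new KeyValuePair<string, Dictionary<string, string>>(fields[0].Trim(), details);
    }
}
```

For the line above, the result has the key "Alex" and four details, with details["Age"] equal to "25".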
This is an interesting one. It can be quite difficult to parse files in a specific custom format, which is why people often write dedicated classes to deal with them. More conventional file formats, like CSV or other delimited formats, are easier to read because they are all formatted in a similar way.
A problem like the above can be addressed in the following way:
1) What should the output look like?
In your instance, and this is just a guess, but I believe you are aiming for the following:
Name, Age, Height, Hair, Eyes
Alex, 25, 6, Brown, Hazel
In which case, you have to parse out this information based on the structure above. If it's repeated blocks of text like the above then we can say the following:
a. Every person is in a block starting with Name Details
b. The name value is the first text after Details, with the other columns being delimited in the format Column:Value
However, you might also have sections with additional attributes, or attributes that are missing if the original input was optional, so tracking the column and ordinal would be useful too.
So one approach might look like the following:
public void ParseFile(){
String currentLine;
bool newSection = false;
//Store the column names and ordinal position here.
List<String> nameOrdinals = new List<String>();
nameOrdinals.Add("Name"); //IndexOf == 0
Dictionary<Int32, List<String>> nameValues = new Dictionary<Int32 ,List<string>>(); //Use this to store each person's details
Int32 rowNumber = 0;
using (TextReader reader = File.OpenText("D:\\temp\\test.txt"))
{
while ((currentLine = reader.ReadLine()) != null) //This will read the file one row at a time until there are no more rows to read
{
string[] lineSegments = currentLine.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries);
if (lineSegments.Length == 2 && String.Compare(lineSegments[0], "Name", StringComparison.InvariantCultureIgnoreCase) == 0
&& String.Compare(lineSegments[1], "Details", StringComparison.InvariantCultureIgnoreCase) == 0) //Looking for a Name Details Line - Start of a new section
{
rowNumber++;
newSection = true;
continue;
}
if (newSection && lineSegments.Length > 1) //We can start adding a new person's details - we know that
{
nameValues.Add(rowNumber, new List<String>());
nameValues[rowNumber].Insert(nameOrdinals.IndexOf("Name"), lineSegments[0]);
//Get the first column:value item
ParseColonSeparatedItem(lineSegments[1], nameOrdinals, nameValues, rowNumber);
newSection = false;
continue;
}
if (lineSegments.Length > 0 && lineSegments[0] != String.Empty) //Ignore empty lines
{
ParseColonSeparatedItem(lineSegments[0], nameOrdinals, nameValues, rowNumber);
}
}
}
//At this point we should have collected a big list of items. We can then write out the CSV. We can use a StringBuilder for now, although your requirements will
//be dependent upon how big the source files are.
//Write out the columns
StringBuilder builder = new StringBuilder();
for (int i = 0; i < nameOrdinals.Count; i++)
{
if(i == nameOrdinals.Count - 1)
{
builder.Append(nameOrdinals[i]);
}
else
{
builder.AppendFormat("{0},", nameOrdinals[i]);
}
}
builder.Append(Environment.NewLine);
foreach (int key in nameValues.Keys)
{
List<String> values = nameValues[key];
for (int i = 0; i < values.Count; i++)
{
if (i == values.Count - 1)
{
builder.Append(values[i]);
}
else
{
builder.AppendFormat("{0},", values[i]);
}
}
builder.Append(Environment.NewLine);
}
//At this point you now have a StringBuilder containing the CSV data you can write to a file or similar
}
private void ParseColonSeparatedItem(string textToSeparate, List<String> columns, Dictionary<Int32, List<String>> outputStorage, int outputKey)
{
if (String.IsNullOrWhiteSpace(textToSeparate)) { return; }
string[] colVals = textToSeparate.Split(new[] { ":" }, StringSplitOptions.RemoveEmptyEntries);
List<String> outputValues = outputStorage[outputKey];
if (!columns.Contains(colVals[0]))
{
//Add the column to the list of expected columns. The index of the column determines its index in the output
columns.Add(colVals[0]);
}
if (outputValues.Count < columns.Count)
{
outputValues.Add(colVals[1]);
}
else
{
outputStorage[outputKey].Insert(columns.IndexOf(colVals[0]), colVals[1]); //We append the value to the list at the place where the column index expects it to be. That way we can miss values in certain sections yet still have the expected output
}
}
After running this against your file, the string builder contains:
"Name,Age,Height,Hair,Eyes\r\nAlex,25,6,Brown,Hazel\r\n"
Which matches the above (\r\n is effectively the Windows new line marker)
This approach demonstrates how a custom parser might work - it's purposefully over verbose as there is plenty of refactoring that could take place here, and is just an example.
Improvements would include:
1) This function assumes there are no spaces in the actual text items themselves. This is a pretty big assumption and, if wrong, would require a different approach to parsing out the line segments. However, this only needs to change in one place: as you read a line at a time, you could apply a regex, or just read in characters and assume that everything after the first "column:" section is a value, for example.
2) No exception handling
3) Text output is not quoted. You could test each value to see if it's a date or number - if not, wrap it in quotes as then other programs (like Excel) will attempt to preserve the underlying datatypes more effectively.
4) Assumes no column names are repeated. If they are, then you have to check whether a column item has already been added, and then create a ColName2 column in the parsing section.

C#: Checking That ArrayList Elements have specific type

ArrayList fileList = new ArrayList();
private void button2_Click(object sender, EventArgs e)
{
if (openFileDialog1.ShowDialog() == DialogResult.OK)
{
string line;
// Read the file and display it line by line.
System.IO.StreamReader file = new System.IO.StreamReader(openFileDialog1.FileName);
while ((line = file.ReadLine()) != null)
{
// Puts elements in table
fileList.Add(line.Split(';'));
}
file.Close();
}
for (int i = 0; i < fileList.Count; i++)
{
for (int x = 0; x < (fileList[i] as string[]).Length; x++)
{
// if (x ==0)
// {
//fileList[0] must Be int
// }
// if (x==1)
//fileList[1] must be string
this.textBox2.Text += ((fileList[i] as string[])[x] + " ");
}
this.textBox2.Text += Environment.NewLine;
}
}
I am so far here.
I take the elements from a CSV file.
I now need to be sure that the first column has only integers (1, 2, 3, 4, 5), the second column has only names (so it will have the type string or character), the third has only surnames, and so on.
The rows are presented like this : 1;George;Mano;
How can I be sure that the CSV file has the correct types?
I think that any more code about this problem will be placed inside the 2 for statements.
Thank you very much,
George.
I think your question needs more work.
You don't show your declaration for fileList. Whatever it is, there is no reason to convert it to string[] just to get the length. The length will be the same no matter what type it is. You cannot use this method to determine which items are strings.
You'll need to loop through the items and see if they contain only digits or whatever.
Also, your code to read CSV files is not quite right. CSV files are comma-separated, and it's possible that they could contain commas within double quotes. These commas should be ignored. A better way to read CSV files can be seen here.
An Arraylist contains object.
System.IO.StreamReader.ReadLine returns a String.
Checking the value of the first line read and trying to convert the string into an integer would be a valid approach.
Your current approach is adding the String that is returned by System.IO.StreamReader.ReadLine into your collection which you later turn into a String[] by using the String.Split method.
Your other requirements will be a great deal more difficult because every line you are reading is already a String. So you would have to look at each character within the string to determine whether it appears to be a name.
In other words you might want to find a different way to provide an input. I would agree that a regular expression might be the best way to get rid of junk data.
Edit: Now that we know it's really CSV, here's a columnar answer ;-)
Your ArrayList contains string[], so you need to verify that each array has the appropriate type of string.
for (int i = 0; i < fileList.Count; i++)
{
string[] lineItems = (string[])fileList[i];
// verbatim strings (@"...") keep the backslash in \d intact
if (!Regex.IsMatch (lineItems[0], @"^\d+$")) // numbers
throw new ArgumentException ("invalid id at row " + i);
if (!Regex.IsMatch (lineItems[1], "^[a-zA-Z]+$")) // names - letters-only
throw new ArgumentException ("invalid name at row " + i);
if (!Regex.IsMatch (lineItems[2], "^[a-zA-Z]+$")) // surnames - letters-only
throw new ArgumentException ("invalid surname at row " + i);
}
You can use Regex class.
The first column must be an int:
int x;
if (int.TryParse(((string[])fileList[i])[0], out x))
{
// do whatever here; x will hold the integer value.
// TryParse returns false if the text is not an integer, so the if will not fire.
}
The second column must be a string:
Iterate over the string and check that each element is a letter. Look at the char methods (for example, char.IsLetter) for the appropriate one.
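Putting the two checks together, a minimal sketch of a per-row validator might look like this. The class and method names are hypothetical, and it assumes the id;name;surname column order from the question's sample row.

```csharp
using System.Linq;

// Hypothetical helper class, named for illustration only.
public static class RowValidator
{
    // Expects fields in the order id, name, surname (as in "1;George;Mano;").
    public static bool IsValidRow(string[] fields)
    {
        int id;
        return fields.Length >= 3
            && int.TryParse(fields[0], out id) // also accepts a sign and surrounding whitespace
            && IsName(fields[1])
            && IsName(fields[2]);
    }

    // A "name" here means a non-empty string made up only of letters.
    private static bool IsName(string s)
    {
        return s.Length > 0 && s.All(char.IsLetter);
    }
}
```

Real names can contain spaces, hyphens, or apostrophes, so the letters-only rule is probably stricter than real data allows.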

How to check if the last column of line in csv file is "True" in a buffered reader

I have a CSV file and I am reading data byte by byte by using buffered stream. I want to ignore reading the line if the last column = "True". How do I achieve it?
So far I have got:
BufferedStream stream = new BufferedStream(csvFile, 1000);
int byteIn = stream.ReadByte();
while (byteIn != -1 && (char)byteIn != '\n' && (char)byteIn != '\r')
byteIn = stream.ReadByte();
I want to ignore reading the line if the last column of the line is "True"
Firstly, I wouldn't approach any file IO byte-by-byte without an absolute need for it. Secondly, reading lines from a text file in .NET is a really cheap operation.
Here is some naive starter code, which ignores the possibility of quoted string values in the CSV:
List<string> matchingLines = new List<string>();
using (var reader = new StreamReader("data.csv"))
{
string rawline;
while (null != (rawline = reader.ReadLine()))
{
if (rawline.TrimEnd().Split(',').Last() == "True") continue;
matchingLines.Add(rawline);
}
}
In reality, it would be advised to parse each CSV line into a strongly typed object and then filter on that collection using LINQ. However, that can be a separate answer for a separate question.
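As a rough idea of what that could look like, the sketch below parses each line into a small object and filters with LINQ. The CsvRow type and its members are hypothetical, and the same naive comma split is used, so quoted values are still not handled.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type: keeps the raw fields plus the last column parsed as a bool.
public sealed class CsvRow
{
    public string[] Fields { get; }
    public bool Flag { get; }

    public CsvRow(string[] fields, bool flag)
    {
        Fields = fields;
        Flag = flag;
    }
}

public static class RowFilter
{
    public static CsvRow ParseRow(string line)
    {
        string[] fields = line.TrimEnd().Split(',');
        bool flag;
        // bool.TryParse is case-insensitive; anything unparseable leaves flag as false.
        bool.TryParse(fields[fields.Length - 1], out flag);
        return new CsvRow(fields, flag);
    }

    // Keeps only the rows whose last column is not true; blank lines are skipped.
    public static List<CsvRow> RowsWithoutTrueFlag(IEnumerable<string> lines)
    {
        return lines
            .Where(l => l.Trim().Length > 0)
            .Select(ParseRow)
            .Where(r => !r.Flag)
            .ToList();
    }
}
```

The lines could come from File.ReadLines("data.csv"), which streams the file instead of loading it all at once.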
I would read/import the entire CSV file into a DataTable object and then do a Select on the datatable to include rows where last column not equal to true.
Here is a solution using a StreamReader, rather than a BufferedStream:
public string RemoveTrueRows( string csvFile )
{
var sr = new StreamReader( csvFile );
var line = string.Empty;
var contentsWithoutTrueRows = string.Empty;
while ( ( line = sr.ReadLine() ) != null )
{
var columns = line.Split( ',' );
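// keep only the rows whose last column is NOT "True"
if ( columns[ columns.Length - 1 ] != "True" )
{
// ReadLine() strips the line break, so add it back
contentsWithoutTrueRows += line + Environment.NewLine;
}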
}
sr.Close();
return contentsWithoutTrueRows;
}
In addition to jkirkwood's answer, you could also read each line and conditionally add a class or struct to a list of objects.
Some quick, semi-pseudocode:
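struct MyObject
{
public int Property1;
public string Property2;
public bool Property3;
}
List<MyObject> ObjectList = new List<MyObject>();
// "data.csv" is a placeholder path
using (StreamReader reader = new StreamReader("data.csv"))
{
string buffer;
while ((buffer = reader.ReadLine()) != null)
{
string[] LineData = buffer.Split(',');
// skip the line when its last field is "True" (compared case-insensitively)
if (string.Equals(LineData[LineData.Length - 1].Trim(), "true", StringComparison.OrdinalIgnoreCase)) continue;
MyObject CurrentObject = new MyObject();
CurrentObject.Property1 = Convert.ToInt32(LineData[1]);
CurrentObject.Property2 = LineData[2];
CurrentObject.Property3 = Convert.ToBoolean(LineData[LineData.Length - 1]);
ObjectList.Add(CurrentObject);
}
}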
It really kind of depends on what you want to do with the data once you've read it.
Hopefully this example is a bit helpful.
EDIT
As noted in comments, please be aware this is just a quick example. Your CSV file may have qualifiers and other things which make the string split completely useless. The take-away concept is to read line data into some sort of temporary variable, evaluate it for the desired condition, then output it or add it to your collection as needed.
EDIT 2
If the line lengths vary, you'll need to grab the last field instead of the *n*th field, so I changed the boolean field grabber to show how you would always get the last field instead of, say, the 42nd one.
