Read CSV line by line in C#

I am reading a CSV file and want to read it line by line. The code below does not throw any error, but when I execute it, it starts reading from the middle of the CSV: it only prints the last four lines, and I need the whole CSV data as output. Please help me see what I am missing in my code.
I want to achieve this using StreamReader only, not a parser.
using (StreamReader rd = new StreamReader(@"C:\Test.csv"))
{
    while (!rd.EndOfStream)
    {
        string splits = rd.ReadLine();
        string[] value = splits.Split(',');
        foreach (var test in value)
        {
            Console.WriteLine(test);
        }
    }
}
Test.csv
TEST Value ,13:00,,,14:00,,,15:00,,,
"Location","Time1","Transaction1","Transaction2","Tim2",
"Pune","1.07","-","-","0.99",
"Mumbai","0.55","-","-","0.59",
"Delhi","1.00","-","-","1.08",
"Chennai","0.52","-","-","0.50",

There is already a Stack Overflow question about this.
That question also provides a much better way to do the same thing:
// TextFieldParser lives in the Microsoft.VisualBasic.FileIO namespace
// (add a reference to the Microsoft.VisualBasic assembly).
using (TextFieldParser parser = new TextFieldParser(@"c:\test.csv"))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    while (!parser.EndOfData)
    {
        // Process row
        string[] fields = parser.ReadFields();
        foreach (string field in fields)
        {
            // TODO: Process field
        }
    }
}

I believe there is something wrong with your CSV file; it may contain some unexpected characters. Several things you can try:
You can let the StreamReader class detect the correct encoding of your CSV:
new StreamReader(@"C:\Test.csv", System.Text.Encoding.Default, true)
You can force your StreamReader to read your CSV from the beginning:
rd.DiscardBufferedData();
rd.BaseStream.Seek(0, SeekOrigin.Begin);
rd.BaseStream.Position = 0;
You can try to fix your CSV file, for example by cleaning out null characters and converting Unix newlines to Windows newlines, as sketched below.
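For that last option, here is a minimal sketch (the file paths and the decision to write to a separate cleaned file are assumptions, not part of the original answer):
using System.IO;

// Read the raw text, strip null characters, normalize Unix newlines to
// Windows newlines, and write the cleaned text to a new file.
string raw = File.ReadAllText(@"C:\Test.csv");
string cleaned = raw
    .Replace("\0", "")        // remove null characters
    .Replace("\r\n", "\n")    // collapse any existing CRLF first
    .Replace("\n", "\r\n");   // then convert every LF to CRLF
File.WriteAllText(@"C:\Test_clean.csv", cleaned);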

Related

C#: How to convert StreamReader text to an int

I'm reading numbers from a file, and when I try to convert the text to an int I get this error: System.FormatException: 'Input string was not in a correct format.' Reading the file works, I've tested all of that; it just seems to get stuck on this no matter what I try. This is what I've done so far:
StreamReader share_1 = new StreamReader("Share_1_256.txt");
string data_1 = share_1.ReadToEnd();
int intData1 = Int16.Parse(data_1);
And if the parsing line is included, it doesn't print anything.
As we can see in your post, your input file contains not one number but several. So you will need to iterate through all the lines of your file and try the parsing for each line.
EDIT: The old code used an external library. For plain C#, try:
using (StringReader reader = new StringReader(input))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // Do something with the line
    }
}
In addition, I encourage you to always parse strings to numbers using the TryParse method, not Parse (a short sketch follows below).
You can find some details and different implementations for that common problem in C#: C#: Looping through lines of multiline string
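A minimal sketch of the TryParse approach, assuming each line of Share_1_256.txt holds one number (lines that do not parse are simply skipped):
using (StreamReader reader = new StreamReader("Share_1_256.txt"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // TryParse returns false instead of throwing on bad input
        if (int.TryParse(line, out int number))
        {
            Console.WriteLine(number);
        }
    }
}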
Parse every single line:
while (!reader.EndOfStream)
{
    string line = reader.ReadLine();
    int intData1 = Int16.Parse(line);
}
You can simplify the code and get rid of StreamReader with the help of the File class and LINQ:
// Turn the text file into an IEnumerable<int>:
var data = File
    .ReadLines("Share_1_256.txt")
    .Select(line => int.Parse(line));
// TODO: add .OrderBy(item => item) if you want to sort the items
// Loop over all numbers within the file: 15, 1, 48, ..., 32
foreach (int item in data)
{
    // TODO: Put relevant code here, e.g. Console.WriteLine(item);
}

Count all characters in a file while reading the CSV

I just want to ask if there is any possibility of getting the number of all characters in a file while reading the CSV file. I don't want to load the file into memory twice (once for parsing, a second time for counting).
I need to parse a CSV file, but I also need the count of all characters in the file (including delimiters). Does anyone have an idea how to do that in the most efficient way?
using (TextReader stream = new StreamReader(file.OpenReadStream()))
{
    CsvReader reader = new CsvReader(stream, GetCsvReaderOptions());
    while (reader.Read())
    {
        // parsing
    }
}
One option is to iterate through all fields of the current reader row and at the end increment the length by the delimiters (number of fields == number of delimiters).
I also had the idea of counting characters on the parsed objects via reflection (getting all property values from each object).
I don't think these options will be efficient.
Thanks in advance.
You can use Reader.Context.RawRecord and remove the line endings (assuming you don't want to count those):
using (TextReader stream = new StreamReader(file.OpenReadStream()))
{
    var count = 0;
    CsvReader reader = new CsvReader(stream, GetCsvReaderOptions());
    while (reader.Read())
    {
        count += reader.Context.RawRecord.Replace("\n", "").Replace("\r", "").Length;
        // parsing
    }
}
The basic way of doing this could be the following:
using (TextReader stream = new StreamReader(file.OpenReadStream()))
{
    var content = stream.ReadToEnd();
    var length = content.Length;
}
The variable "length" will then contain the count of all characters in the passed file.
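As another option (not from the answers above), here is a minimal sketch of a counting TextReader wrapper: any CSV parser that reads from it gives you the character count in a single pass. Note that this counts everything the parser consumes, including line endings.
using System.IO;

public class CountingTextReader : TextReader
{
    private readonly TextReader _inner;
    public long CharCount { get; private set; }

    public CountingTextReader(TextReader inner) => _inner = inner;

    public override int Peek() => _inner.Peek();

    public override int Read()
    {
        int c = _inner.Read();
        if (c != -1) CharCount++;        // count each consumed character
        return c;
    }

    public override int Read(char[] buffer, int index, int count)
    {
        int read = _inner.Read(buffer, index, count);
        if (read > 0) CharCount += read; // count buffered reads as well
        return read;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}
Usage: wrap your StreamReader in a CountingTextReader, pass the wrapper to the CSV parser, and read CharCount when parsing is done.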

C#: write an ArrayList to a text file

I need to write my ArrayList to a text file, and so far I have come up with this code.
Now I'm confused as to how to write the lines to my text file using the TextWriter.
One procedure loads the list from a txt file, below:
public void LoadArrayList()
{
    TextReader tr;
    tr = File.OpenText("C:\\Users\\Mirro\\Documents\\Visual Studio 2010\\Projects\\Assessment2\\Assessment2\\act\\actors.txt");
    string line = tr.ReadToEnd();
    Console.WriteLine(line);
    if (line != null)
    {
        ActorArrayList.Add(line);
    }
    else
    {
        tr.Close();
    }
}
Then I have one that populates the combo box in my form.
public void PopulateActors()
{
    cboActor.Items.Clear();
    foreach (string line in ActorArrayList)
    {
        cboActor.Items.AddRange(File.ReadAllLines("C:\\Users\\Mirro\\Documents\\Visual Studio 2010\\Projects\\Assessment2\\Assessment2\\act\\actors.txt"));
    }
}
And in this procedure I need to write my whole list, ActorArrayList, into the text file.
public void WriteArrayList()
{
}
I'm sorry for being confusing originally.
Try the following code:
// Example #1: Write an array of strings to a file.
// Create a string array that consists of three lines.
string[] lines = { "First line", "Second line", "Third line" };
// WriteAllLines creates a file, writes a collection of strings to the file,
// and then closes the file.
System.IO.File.WriteAllLines(@"C:\Users\Mirro\Documents\Visual Studio 2010\Projects\Assessment2\Assessment2\act\actors.txt", lines);
OUTPUT :
// First line
// Second line
// Third line
The best way is @Leez's way, but you can also use TextWriter and a foreach loop to do this:
// your array
string[] yourArray = { "first", "second", "third" };
string text = "C:\\Users\\Mirro\\Documents\\Visual Studio 2010\\Projects\\Assessment2\\Assessment2\\act\\actors.txt";
using (TextWriter writer = File.CreateText(text))
{
    foreach (string i in yourArray)
    {
        writer.WriteLine(i);
    }
}
System.IO.File.WriteAllText("FILE_PATH", line);
BTW, where is the ArrayList in your code? Also, consider using System.IO.File.ReadAllText("FILE_PATH") for everyday file reading.
If you were to actually write an ArrayList to a disk file, that would require you to first serialize the contents of the ArrayList to maybe XML or binary etc. Then you can use the above methods to write that serialized representation to a file. Also note that serializing collections involves a concept called deep and shallow copying. This question may help you better understand the concept.
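For illustration, a hedged sketch (the sample data and path are hypothetical, not from the question) of serializing the contents of an ArrayList of strings to XML before writing it to disk:
using System.Collections;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Copy the ArrayList into a strongly-typed array, then serialize that array.
ArrayList actors = new ArrayList { "Actor One", "Actor Two" };
var serializer = new XmlSerializer(typeof(string[]));
using (var writer = new StreamWriter(@"C:\temp\actors.xml")) // hypothetical path
{
    serializer.Serialize(writer, actors.Cast<string>().ToArray());
}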
// ArrayList.ToArray() returns object[], so convert the items to a string[] first:
File.WriteAllLines(filePath, (string[])ActorArrayList.ToArray(typeof(string)));
WriteAllLines ends each line with the Windows line ending, which is two characters (carriage return plus line feed, \r\n). If you only want a single \n at the end of each line, you can use StreamWriter:
using (StreamWriter sw = new StreamWriter(@"C:\mypath\file.txt"))
{
    foreach (string s in linesArray)
    {
        sw.Write(s + "\n");
    }
}

What's the best way to get all the content in between two tagged lines of a file so that you can deserialize it?

I've been noticing that the following segment of code does not scale well for large files (I think that appending to the paneContent string is slow):
string paneContent = String.Empty;
bool lineFound = false;
foreach (string line in File.ReadAllLines(path))
{
    if (line.Contains(tag))
    {
        lineFound = !lineFound;
    }
    else
    {
        if (lineFound)
        {
            paneContent += line;
        }
    }
}
using (TextReader reader = new StringReader(paneContent))
{
    data = (PaneData)(serializer.Deserialize(reader));
}
What's the best way to speed this up? I have a file that looks like this (I want to get all the content between the two tag lines and then deserialize it):
A line with some tag
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with content I want to get into a single stream or string
A line with some tag
Note: These tags are not XML tags.
You could use a StringBuilder instead of a string; that is exactly what StringBuilder is for. Some example code is below:
var paneContent = new StringBuilder();
bool lineFound = false;
foreach (string line in File.ReadLines(path))
{
    if (line.Contains(tag))
    {
        lineFound = !lineFound;
    }
    else
    {
        if (lineFound)
        {
            paneContent.Append(line);
        }
    }
}
using (TextReader reader = new StringReader(paneContent.ToString()))
{
    data = (PaneData)(serializer.Deserialize(reader));
}
As mentioned in this answer, a StringBuilder is preferred to a string when you are concatenating in a loop, which is the case here.
Here is an example of how to use groups with regexes and retrieve their contents afterwards.
What you want is a regex that will match your tags, label the content in between as a group, and then retrieve the data of that group, as in the sketch below.
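A minimal sketch of that idea, assuming the tag appears on the first and last line of the block (the pattern and variable names are illustrative, not from the original answer):
using System.IO;
using System.Text.RegularExpressions;

string text = File.ReadAllText(path);
// Match a line containing the tag, then capture everything up to the next
// line containing the tag into the named group "content".
string pattern = Regex.Escape(tag) + @"[^\n]*\n(?<content>.*?)\n[^\n]*" + Regex.Escape(tag);
Match match = Regex.Match(text, pattern, RegexOptions.Singleline);
if (match.Success)
{
    string paneContent = match.Groups["content"].Value;
    // deserialize paneContent as before
}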
Use a StringBuilder to build your data string (paneContent). It's much faster because concatenating strings results in a new memory allocation each time. StringBuilder pre-allocates memory, and if you expect large data strings, you can customize the initial allocation, as shown below.
It's also a good idea to read your input file line by line so you can avoid loading the whole file into memory if you expect files with many lines of text.
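For example (the 1 MB figure is just an illustrative assumption):
// Pre-size the builder for roughly 1 MB of text to avoid repeated
// internal re-allocations as it grows.
var paneContent = new StringBuilder(1024 * 1024);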

How to read a csv file one line at a time and replace/edit certain lines as you go?

I have a 60GB csv file I need to make some modifications to. The customer wants some changes to the file's data, but I don't want to regenerate the data in that file because it took 4 days to do.
How can I read the file, line by line (not loading it all into memory!), and make edits to those lines as I go, replacing certain values etc.?
The process would be something like this:
1. Open a StreamWriter to a temporary file.
2. Open a StreamReader to the target file.
3. For each line:
3.1. Split the text into columns based on a delimiter.
3.2. Check the columns for the values you want to replace, and replace them.
3.3. Join the column values back together using your delimiter.
3.4. Write the line to the temporary file.
4. When you are finished, delete the target file, and move the temporary file to the target file path.
Note regarding Steps 2 and 3.1: If you are confident in the structure of your file and it is simple enough, you can do all this out of the box as described (I'll include a sample in a moment). However, there are factors in a CSV file that may need attention (such as recognizing when a delimiter is being used literally in a column value). You can drudge through this yourself, or try an existing solution.
Basic example just using StreamReader and StreamWriter:
var sourcePath = @"C:\data.csv";
var delimiter = ",";
var firstLineContainsHeaders = true;
var tempPath = Path.GetTempFileName();
var lineNumber = 0;
var splitExpression = new Regex(@"(" + delimiter + @")(?=(?:[^""]|""[^""]*"")*$)");
using (var writer = new StreamWriter(tempPath))
using (var reader = new StreamReader(sourcePath))
{
    string line = null;
    string[] headers = null;
    if (firstLineContainsHeaders)
    {
        line = reader.ReadLine();
        lineNumber++;
        if (string.IsNullOrEmpty(line)) return; // file is empty
        headers = splitExpression.Split(line).Where(s => s != delimiter).ToArray();
        writer.WriteLine(line); // write the original header to the temp file
    }
    while ((line = reader.ReadLine()) != null)
    {
        lineNumber++;
        var columns = splitExpression.Split(line).Where(s => s != delimiter).ToArray();
        // if there are no headers, do a simple sanity check to make sure you always have the same number of columns in a line
        if (headers == null) headers = new string[columns.Length];
        if (columns.Length != headers.Length) throw new InvalidOperationException(string.Format("Line {0} is missing one or more columns.", lineNumber));
        // TODO: search and replace in columns
        // example: replace 'v' in the first column with '\/': if (columns[0].Contains("v")) columns[0] = columns[0].Replace("v", @"\/");
        writer.WriteLine(string.Join(delimiter, columns));
    }
}
File.Delete(sourcePath);
File.Move(tempPath, sourcePath);
Memory-mapped files are a feature introduced in .NET Framework 4 that can be used to edit large files.
Read more here: http://msdn.microsoft.com/en-us/library/dd997372.aspx
or search for "memory-mapped files". A rough sketch follows below.
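A minimal sketch of reading through a memory-mapped file line by line (the path is an assumption; true in-place edits only work when the replacement text has exactly the same length):
using System.IO;
using System.IO.MemoryMappedFiles;

using (var mmf = MemoryMappedFile.CreateFromFile(@"C:\data.csv", FileMode.Open))
using (var stream = mmf.CreateViewStream())
using (var reader = new StreamReader(stream))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // inspect the line here; for length-changing edits you still need to
        // write the result to a separate file, as in the answer above
        // note: the view is rounded up to a page boundary, so the reader may
        // see trailing '\0' characters at the very end
    }
}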
Just read the file, line by line, with StreamReader, and then use regex! The most amazing tool in the world.
using (var sr = new StreamReader(new FileStream(@"C:\temp\file.csv", FileMode.Open)))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // do stuff with line
    }
}
