I'm trying to write a text file, from a datatable, text file which later on will be imported by some other application, therefor I need to have no empty lines in the exported file (at the end or between).
I used the following code:
DataTable table = FisierSoldata.dtSoldata;
var result = new StringBuilder();
foreach (DataRow row in table.Rows)
{
for (int i = 0; i < table.Columns.Count; i++)
{
result.Append(row[i].ToString());
result.Append(i == table.Columns.Count - 1 ? "\n" : " ");
}
result.AppendLine();
}
StreamWriter objWriter = new StreamWriter(filePath, false);
objWriter.WriteLine(result.ToString());
objWriter.Close();
My datatable has 26 records and when the file is generated, it has one dataline followed by one empty line and another dataline and so on and at the end after the last dataline it has 3 empty lines.
How can I modify the code in order to have only datalines without the empty one between data and without the last 3 empty ones?
Regards,
LE:I've found a workaround, by reading some similar thread, but the solution was to use both streamwriter and streamreader in order to remove the unneeded data (empty lines) from the file which will be generated. But I wanted to know if there's a way of getting the file as needed from the start.
Why not using File.WriteAllLines, it's using a StreamWriter behind the scenes but allows simpler code:
var fileLines = table.AsEnumerable()
.Select(row => String.Join(" ", row.ItemArray))
.Where(line => !String.IsNullOrWhiteSpace(line));
File.WriteAllLines(filePath, fileLines);
But you really want to use a space as column delimiter? That's very error-prone. It will fail as soon as one field value contains a space.
I would try to avoid to add the empty lines to the Stringbuild. The Stringclass should give appropriate functions, maybe something like this:
String stringToTestIsEmpty = row[i].ToString();
//Check if it is not an empty line
if( ! String.IsNullOrWhiteSpace(stringToTestIsEmpty ))
{
resutl.Append(stringToTestIsEmpty);
}
Related
This question already has answers here:
How to delete a line from a text file in C#?
(11 answers)
Closed last year.
I'm working on a Unity application and I have a csv file with 6 columns and 5000+ rows. The files might grow significantly. The second column contains either "0" or "1". All rows with a "0" are not needed.
I want to read the CSV file and write every row that has no "0" at the 2nd column into a new csv file, so that I have a new CSV file with only rows that where 1. (The "1" also doesn't need to be in the new CSV file then.)
I tried something like this, but I removed the parts that didn't work. Maybe someone has a idea. Mainly the index x and i where somehow off. Sometimes it didn't read the first column but I might just have confused myself with the incrementation.
public void UpdateCSV()
{
string[] values = File.ReadAllText(datapath).Split(new char[] { ',' });
StringBuilder ObjStringBuilder = new StringBuilder();
// x as index for second column
int x = 1;
for (int i = 0; i < values.Length; i++)
{
// if column 2 is not 0 it needs to be printed in the new file
if (values[x] != "0")
{
//using ObjStringBuilder.Append() to pass the values of the row that I want to keep
}
}
ObjStringBuilder.ToString().Remove(ObjStringBuilder.Length - 1);
File.WriteAllText("Assets/Datasets/UpdatedCSV.csv", ObjStringBuilder.ToString());
}
Try: ReadAllLines - that will give you an array of lines. Then iterate through all using the Split function like you have and check values[1].
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
public void UpdateCSV()
{
string[] lines = File.ReadAllLines(datapath);
StringBuilder sb = new StringBuilder();
foreach (string line in lines)
{
List<string> values = new(line.Split(','));
if (values[1] != "0")
{
values.Remove(1); // or values[1] = string.empty;
string newLine = string.Join(",", values);
sb.Append(newLine + Environment.NewLine);
}
}
File.WriteAllText("Assets/Datasets/UpdatedCSV.csv", sb.ToString());
}
so I have this application that I have inherited from someone that is long gone. The gist of the application is that it reads in a .cvs file that has about 5800 lines in it, copies it over to another .cvs, which it creates new each time, after striping out a few things , #, ', &. Well everything works great, or it has until about a month ago. so I started checking into it, and what I have found so far is that there are about 131 items missing from the spreadsheet. Now I read someplace that the maximun amount of data a string can hold is over 1,000,000,000 chars, and my spreadsheet is way under that, around 800,000 chars, but the only thing I can think is doing it is the string object.
So anyway, here is the code in question, this piece appears
to both read in from the existing field, and output to the new file:
StreamReader s = new StreamReader(File);
//Read the rest of the data in the file.
string AllData = s.ReadToEnd();
//Split off each row at the Carriage Return/Line Feed
//Default line ending in most windows exports.
//You may have to edit this to match your particular file.
//This will work for Excel, Access, etc. default exports.
string[] rows = AllData.Split("\r\n".ToCharArray(), System.StringSplitOptions.RemoveEmptyEntries);
//Now add each row to the DataSet
foreach (string r in rows)
{
//Split the row at the delimiter.
string[] items = r.Split(delimiter.ToCharArray());
//Add the item
result.Rows.Add(items);
}
If anyone can help me I would really appreciate it. I either need to figure out how to split the data better, or I need to figure out why it is cutting out the last 131 lines from the existing excel file to the new excel file.
One easier way to do this, since you're using "\r\n" for lines, would be to just use the built-in line reading method: File.ReadLines(path)
foreach(var line in File.ReadLines(path))
{
var items = line.Split(',');
result.Rows.Add(items);
}
You may want to check out the TextFieldParser class, which is part of the Microsoft.VisualBasic.FileIO namespace (yes, you can use this with C# code)
Something along the lines of:
using(var reader = new TextFieldParser("c:\\path\\to\\file"))
{
//configure for a delimited file
reader.TextFieldType = FieldType.Delimited;
//configure the delimiter character (comma)
reader.Delimiters = new[] { "," };
while(!reader.EndOfData)
{
string[] row = reader.ReadFields();
//do stuff
}
}
This class can help with some of the issues of splitting a line into its fields, when the field may contain the delimiter.
Here is just an example of the data I need to format.
The first column is simple, the problem the second column.
What would be the best approach to format multiple data fields in one column?
How to parse this data?
Important*: The second column needs to contain multiple values, like in an example below
Name Details
Alex Age:25
Height:6
Hair:Brown
Eyes:Hazel
A csv should probably look like this:
Name,Age,Height,Hair,Eyes
Alex,25,6,Brown,Hazel
Each cell should be separated by exactly one comma from its neighbor.
You can reformat it as such by using a simple regex which replaces certain newline and non-newline whitespace with commas (you can easily find each block because it has values in both columns).
A CSV file is normally defined using commas as field separators and CR for a row separator. You are using CR within your second column, this will cause problems. You'll need to reformat your second column to use some other form of separator between multiple values. A common alternate separator is the | (pipe) character.
Your format would then look like:
Alex,Age:25|Height:6|Hair:Brown|Eyes:Hazel
In your parsing, you would first parse the comma separated fields (which would return two values), and then parse the second field as pipe separated.
This is an interesting one - it can be quite difficult to parse specific format files which is why people often write specific classes to deal with them. More conventional file formats like CSV, or other delimited formats are [more] easy to read because they are formatted in a similar way.
A problem like the above can be addressed in the following way:
1) What should the output look like?
In your instance, and this is just a guess, but I believe you are aiming for the following:
Name, Age, Height, Hair, Eyes
Alex, 25, 6, Brown, Hazel
In which case, you have to parse out this information based on the structure above. If it's repeated blocks of text like the above then we can say the following:
a. Every person is in a block starting with Name Details
b. The name value is the first text after Details, with the other columns being delimited in the format Column:Value
However, you might also have sections with addtional attributes, or attributes that are missing if the original input was optional, so tracking the column and ordinal would be useful too.
So one approach might look like the following:
public void ParseFile(){
String currentLine;
bool newSection = false;
//Store the column names and ordinal position here.
List<String> nameOrdinals = new List<String>();
nameOrdinals.Add("Name"); //IndexOf == 0
Dictionary<Int32, List<String>> nameValues = new Dictionary<Int32 ,List<string>>(); //Use this to store each person's details
Int32 rowNumber = 0;
using (TextReader reader = File.OpenText("D:\\temp\\test.txt"))
{
while ((currentLine = reader.ReadLine()) != null) //This will read the file one row at a time until there are no more rows to read
{
string[] lineSegments = currentLine.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries);
if (lineSegments.Length == 2 && String.Compare(lineSegments[0], "Name", StringComparison.InvariantCultureIgnoreCase) == 0
&& String.Compare(lineSegments[1], "Details", StringComparison.InvariantCultureIgnoreCase) == 0) //Looking for a Name Details Line - Start of a new section
{
rowNumber++;
newSection = true;
continue;
}
if (newSection && lineSegments.Length > 1) //We can start adding a new person's details - we know that
{
nameValues.Add(rowNumber, new List<String>());
nameValues[rowNumber].Insert(nameOrdinals.IndexOf("Name"), lineSegments[0]);
//Get the first column:value item
ParseColonSeparatedItem(lineSegments[1], nameOrdinals, nameValues, rowNumber);
newSection = false;
continue;
}
if (lineSegments.Length > 0 && lineSegments[0] != String.Empty) //Ignore empty lines
{
ParseColonSeparatedItem(lineSegments[0], nameOrdinals, nameValues, rowNumber);
}
}
}
//At this point we should have collected a big list of items. We can then write out the CSV. We can use a StringBuilder for now, although your requirements will
//be dependent upon how big the source files are.
//Write out the columns
StringBuilder builder = new StringBuilder();
for (int i = 0; i < nameOrdinals.Count; i++)
{
if(i == nameOrdinals.Count - 1)
{
builder.Append(nameOrdinals[i]);
}
else
{
builder.AppendFormat("{0},", nameOrdinals[i]);
}
}
builder.Append(Environment.NewLine);
foreach (int key in nameValues.Keys)
{
List<String> values = nameValues[key];
for (int i = 0; i < values.Count; i++)
{
if (i == values.Count - 1)
{
builder.Append(values[i]);
}
else
{
builder.AppendFormat("{0},", values[i]);
}
}
builder.Append(Environment.NewLine);
}
//At this point you now have a StringBuilder containing the CSV data you can write to a file or similar
}
private void ParseColonSeparatedItem(string textToSeparate, List<String> columns, Dictionary<Int32, List<String>> outputStorage, int outputKey)
{
if (String.IsNullOrWhiteSpace(textToSeparate)) { return; }
string[] colVals = textToSeparate.Split(new[] { ":" }, StringSplitOptions.RemoveEmptyEntries);
List<String> outputValues = outputStorage[outputKey];
if (!columns.Contains(colVals[0]))
{
//Add the column to the list of expected columns. The index of the column determines it's index in the output
columns.Add(colVals[0]);
}
if (outputValues.Count < columns.Count)
{
outputValues.Add(colVals[1]);
}
else
{
outputStorage[outputKey].Insert(columns.IndexOf(colVals[0]), colVals[1]); //We append the value to the list at the place where the column index expects it to be. That way we can miss values in certain sections yet still have the expected output
}
}
After running this against your file, the string builder contains:
"Name,Age,Height,Hair,Eyes\r\nAlex,25,6,Brown,Hazel\r\n"
Which matches the above (\r\n is effectively the Windows new line marker)
This approach demonstrates how a custom parser might work - it's purposefully over verbose as there is plenty of refactoring that could take place here, and is just an example.
Improvements would include:
1) This function assumes there are no spaces in the actual text items themselves. This is a pretty big assumption and, if wrong, would require a different approach to parsing out the line segments. However, this only needs to change in one place - as you read a line at a time, you could apply a reg ex, or just read in characters and assume that everything after the first "column:" section is a value, for example.
2) No exception handling
3) Text output is not quoted. You could test each value to see if it's a date or number - if not, wrap it in quotes as then other programs (like Excel) will attempt to preserve the underlying datatypes more effectively.
4) Assumes no column names are repeated. If they are, then you have to check if a column item has already been added, and then create an ColName2 column in the parsing section.
I'm writing a program for some data entry I have to periodically do. I have begun testing a few things that the program will have to do but i'm not sure about this part.
What i need this part to do is:
read a .txt file of data
take the first 12 characters from each line
take the first 12 characters from each line of the data that has been entered in a multi-line text box
compare the two lists line by line
if one of the 12 character blocks from the multi-line text box match one of the blocks in the .txt file then overwrite that entire line (only 17 characters in total)
if one of the 12 character blocks from the multi-line text box DO NOT match any of the blocks in the.txt file then append that entire line to the file
thats all it has to do.
i'll do an example:
TXT FILE:
G01:78:08:32 JG05
G08:80:93:10 JG02
G28:58:29:28 JG04
MULTI-LINE TEXT BOX:
G01:78:08:32 JG06
G28:58:29:28 JG03
G32:10:18:14 JG01
G32:18:50:78 JG07
RESULTING TXT FILE:
G01:78:08:32 JG06
G08:80:93:10 JG02
G28:58:29:28 JG03
G32:10:18:14 JG01
G32:18:50:78 JG07
as you can see lines 1 and 3 were overwriten, line 2 was left alone as it did not match any blocks in the text box, lines 4 and 5 were appended to the file.
thats all i want it to do.
How do i go about this?
Thanks in advance
Edit
The code i'm using is this:
private void WriteToFile()
{
// Read all lines from file into memory
System.IO.StreamReader objReader = new System.IO.StreamReader("Jumpgate List.JG");
List<String> fileTextList = new List<String>();
do
{
fileTextList.Add(objReader.ReadLine());
}
while (objReader.Peek() != -1);
objReader.Close();
// Read all lines from the Input textbox into memory
System.IO.StringReader objReaderi = new System.IO.StringReader(txtInput.Text);
List<String> inputTextList = new List<String>();
do
{
inputTextList.Add(objReaderi.ReadLine());
}
while (objReaderi.Peek() != -1);
objReaderi.Close();
for(int i=0;i<fileTextList.Count;i++)
{
for(int j=0;j<inputTextList.Count;j++)
//compare the first 12 characters of each string
if (String.Compare(fileTextList[i], 0, inputTextList[j], 0, 12) == 0) // strings are equal
{
//replace the fileTextList string with the inputTextList string
fileTextList[i] = inputTextList[j];
// now that you have matched you inputTextList line you remember not to append it at the end
inputTextList[j] = String.Empty; // or nothing
}
}
for(int i=0;i<inputTextList.Count;i++)
{
if (!string.IsNullOrEmpty(inputTextList[i])) fileTextList.Add(inputTextList[i]);
}
System.IO.StreamWriter objWriter = new System.IO.StreamWriter("Jumpgate List.JG");
// Overwrite the Jumpgate List.JG file using the updated fileTextList
objWriter.Write(fileTextList);
objWriter.Close();
}
However, when i open the txt file all i get is: System.Collections.Generic.List`1[System.String]
I'm not going to write the whole code for doing this but it would be something like this:
Disclaimer: I have not used a code editor to try the code, just wrote it here, hopefully you'll get the idea and fill in the missing pieces :)
1) get all the lines in the file in a list. Something like this
StreamReader rd = new StreamReader("sadasd");
List<String> llist = new List<String>();
do
{
llist.Add(rd.ReadLine());
} while (rd.Peek() != -1);
2) get all the lines in your multiline text box (the procedure should be similar to the one above): multiTextList
3) now that you can compare the content of the 2 lists iterating through them
for(int i=0;i<fileTextList.Count;i++)
{
for(int j=0;j<multiTextList.Count;j++)
//compare the first 12 characters of each string
if String.Compare(fileTextList[i], 0, multiTextList[j], 0, 12) == 0 // strings are equal
{
//replace the initial line with whatever you want
fileTextList[i] = //whatever
// now that you have matched you multiTextList line you remember not to append it at the end
multiTextList[j] = String.empty // or nothing
}
}
4) at the end you will have in fileTextList the initial rows, modified where necessary
In multiTextList you will have only the lines that were not matched so we add them to the initial file rows
for(int i=0;i<multiTextList.Count;i++)
{
if !string.isnullorempty(multitextlist[i]) fileTextList.add(multitextlist[i])
}
5) now in fileTextList you have all the rows you require so you can print them one by one in a file and you have your result
StringBuilder lSb = new StringBuilder();
for (int i = 0; i < fileTextList.Count; i++)
{
lSb.AppendLine(fileTextList[i]);
}
File.WriteAllText(#"C:/test2.txt",lSb.ToString());
In C:/test2.txt you should have the results.
Hope this helps!
// this variable maps the timestamps to complete lines
var dict = new Dictionary<string, string>();
// create the map of stamp => line for the original text file
string fileLine = file.ReadLine();
string fileStamp = fileLine.Substring(0, 12);
dict[fileStamp] = fileLine;
// now update the map with results from the text input. This will overwrite text
// strings that already exist in the file
foreach (string inputLine in textInputString.Split('\n'))
{
string inputStamp = inputLine.Substring(0, 12);
dict[inputStamp] = inputLine;
}
// write out the new file with the updated lines
foreach (string line in dict.Values)
{
outputFile.WriteLine(line);
}
if the file is large, loading the entire file into a dictionary to update a handful of lines from a textfield is probably excessive.
In pseudocode I would probably:
Create a list of booleans or other structure to track if a line was matched.
open the file in read/write mode.
For each line in file
{
for each unmatched line in text field
{
If stamps match
Update file
record that it was matched
}
}
for each unmatched line in text field
{
append to file
}
If the lines are fixed width, you can probably optimize by only reading the stamp rather than the whole line. If they match your file pointer is in the right spot to start writing, if not you move to the next line.
I have an Excel spreadsheet being converted into a CSV file in C#, but am having a problem dealing with line breaks. For instance:
"John","23","555-5555"
"Peter","24","555-5
555"
"Mary,"21","555-5555"
When I read the CSV file, if the record does not starts with a double quote (") then a line break is there by mistake and I have to remove it. I have some CSV reader classes from the internet but I am concerned that they will fail on the line breaks.
How should I handle these line breaks?
Thanks everybody very much for your help.
Here's is what I've done so far. My records have fixed format and all start with
JTW;...;....;...;
JTW;...;...;....
JTW;....;...;..
..;...;... (wrong record, line break inserted)
JTW;...;...
So I checked for the ; in the [3] position of each line. If true, I write; if false, I'll append on the last (removing the line-break)
I'm having problems now because I'm saving the file as a txt.
By the way, I am converting the Excel spreadsheet to csv by saving as csv in Excel. But I'm not sure if the client is doing that.
So the file as a TXT is perfect. I've checked the records and totals. But now I have to convert it back to csv, and I would really like to do it in the program. Does anybody know how?
Here is my code:
namespace EditorCSV
{
class Program
{
static void Main(string[] args)
{
ReadFromFile("c:\\source.csv");
}
static void ReadFromFile(string filename)
{
StreamReader SR;
StreamWriter SW;
SW = File.CreateText("c:\\target.csv");
string S;
char C='a';
int i=0;
SR=File.OpenText(filename);
S=SR.ReadLine();
SW.Write(S);
S = SR.ReadLine();
while(S!=null)
{
try { C = S[3]; }
catch (IndexOutOfRangeException exception){
bool t = false;
while (t == false)
{
t = true;
S = SR.ReadLine();
try { C = S[3]; }
catch (IndexOutOfRangeException ex) { S = SR.ReadLine(); t = false; }
}
}
if( C.Equals(';'))
{
SW.Write("\r\n" + S);
i = i + 1;
}
else
{
SW.Write(S);
}
S=SR.ReadLine();
}
SR.Close();
SW.Close();
Console.WriteLine("Records Processed: " + i.ToString() + " .");
Console.WriteLine("File Created SucacessFully");
Console.ReadKey();
}
}
}
CSV has predefined ways of handling that. This site provides an easy to read explanation of the standard way to handle all the caveats of CSV.
Nevertheless, there is really no reason to not use a solid, open source library for reading and writing CSV files to avoid making non-standard mistakes. LINQtoCSV is my favorite library for this. It supports reading and writing in a clean and simple way.
Alternatively, this SO question on CSV libraries will give you the list of the most popular choices.
Rather than check if the current line is missing the (") as the first character, check instead to see if the last character is a ("). If it is not, you know you have a line break, and you can read the next line and merge it together.
I am assuming your example data was accurate - fields were wrapped in quotes. If quotes might not delimit a text field (or new-lines are somehow found in non-text data), then all bets are off!
There is a built-in method for reading CSV files in .NET (requires Microsoft.VisualBasic assembly reference added):
public static IEnumerable<string[]> ReadSV(TextReader reader, params string[] separators)
{
var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(reader);
parser.SetDelimiters(separators);
while (!parser.EndOfData)
yield return parser.ReadFields();
}
If you're dealing with really large files this CSV reader claims to be the fastest one you'll find: http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader
I've used this piece of code recently to parse rows from a CSV file (this is a simplified version):
private void Parse(TextReader reader)
{
var row = new List<string>();
var isStringBlock = false;
var sb = new StringBuilder();
long charIndex = 0;
int currentLineCount = 0;
while (reader.Peek() != -1)
{
charIndex++;
char c = (char)reader.Read();
if (c == '"')
isStringBlock = !isStringBlock;
if (c == separator && !isStringBlock) //end of word
{
row.Add(sb.ToString().Trim()); //add word
sb.Length = 0;
}
else if (c == '\n' && !isStringBlock) //end of line
{
row.Add(sb.ToString().Trim()); //add last word in line
sb.Length = 0;
//DO SOMETHING WITH row HERE!
currentLineCount++;
row = new List<string>();
}
else
{
if (c != '"' && c != '\r') sb.Append(c == '\n' ? ' ' : c);
}
}
row.Add(sb.ToString().Trim()); //add last word
//DO SOMETHING WITH LAST row HERE!
}
Try CsvHelper (a library I maintain). It ignores empty rows. I believe there is a flag you can set in FastCsvReader to have it handle empty rows also.
Heed the advice from the experts and Don't roll your own CSV parser.
Your first thought is, "How do I handle new line breaks?"
Your next thought is, "I need to handle commas inside of quotes."
Your next thought will be, "Oh, crap, I need to handle quotes inside of quotes. Escaped quotes. Double quotes. Single quotes..."
It's a road to madness. Don't write your own. Find a library with an extensive unit test coverage that hits all the hard parts and has gone through hell for you. For .NET, use the free CsvHelper library.
Maybe you could count for (") during the ReadLine(). If they are odd, that will raise the flag. You could either ignore those lines, or get the next two and eliminate the first "\n" occurrence of the merge lines.
What I usually do is read the text in character by character opposed to line by line, due to this very problem.
As you're reading each character, you should be able to figure out where each cell starts and stops, but also the difference between a linebreak in a row and in a cell: If I remember correctly, for Excel generated files anyway, rows start with \r\n, and newlines in cells are only \r.
There is an example parser is c# that seems to handle your case correctly. Then you can read your data in and purge the line breaks out of it post-read.
Part 2 is the parser, and there is a Part 1 that covers the writer portion.
Read the line.
Split into columns(fields).
If you have enough columns expected for each line, then process.
If not, read the next line, and capture the remaining columns until you get what you need.
Repeat.
A somewhat simple regular expression could be used on each line. When it matches, you process each field from the match. When it doesn't find a match, you skip that line.
The regular expression could look something like this.
Match match = Regex.Match(line, #"^(?:,?(?<q>['"](?<field>.*?\k'q')|(?<field>[^,]*))+$");
if (match.Success)
{
foreach (var capture in match.Groups["field"].Captures)
{
string fieldValue = capture.Value;
// Use the value.
}
}
Have a look at FileHelpers Library
It supports reading\writing CSV with line breaks as well as reading\writing to excel
The LINQy solution:
string csvText = File.ReadAllText("C:\\Test.txt");
var query = csvText
.Replace(Environment.NewLine, string.Empty)
.Replace("\"\"", "\",\"").Split(',')
.Select((i, n) => new { i, n }).GroupBy(a => a.n / 3);
You might also check out my CSV parser SoftCircuits.CsvParser on NuGet. It will not only parse a CSV file but--if wanted--can also automatically map column values to your class properties. And it runs nearly four times faster than CsvHelper.
For a line break to exist in a CSV, there must be an open double quote that's not closed.
Assuming that all CSVs cells must open and close a double quote, just check if there's an odd number of quotation marks
my_string.Count(c => c == '"') % 2 == 1
and if that's the case, continue reading until you have the even number.