Convert .XYZ to .csv using C#

Hi, I am using the method below to replace " " with ",", but it fails when I run it on data that has 32 million lines. Does anyone know how to modify it so it works?
List<String> lines = new List<String>();
// loop through each line of the file and replace " " with ","
using (StreamReader sr = new StreamReader(inputfile))
{
    int id = 1;
    int i = File.ReadAllLines(inputfile).Count();
    while (sr.Peek() >= 0)
    {
        // out of memory issue here
        string fileLine = sr.ReadLine();
        // do something with line
        string ttt = fileLine.Replace(" ", ", ");
        //Debug.WriteLine(ttt);
        lines.Add(ttt);
        //lines.Add(id++, 'ID');
    }
    using (StreamWriter writer = new StreamWriter(outputfile, false))
    {
        foreach (String line in lines)
        {
            writer.WriteLine(line + "," + id);
            id++;
        }
    }
}
// change the extension to .csv
FileInfo f = new FileInfo(outputfile);
f.MoveTo(Path.ChangeExtension(outputfile, ".csv"));
In general, I am trying to convert a big .XYZ file to .csv format and add an incremental ID field at the end. To be honest, this is the first time in my life I am using C# :) Can you help me?

See my comment above: you could modify your reading/writing as follows:
using (StreamReader sr = new StreamReader(inputfile))
{
    using (StreamWriter writer = new StreamWriter(outputfile, false))
    {
        int id = 1;
        while (sr.Peek() >= 0)
        {
            string fileLine = sr.ReadLine();
            // do something with line
            string ttt = fileLine.Replace(" ", ", ");
            writer.WriteLine(ttt + "," + id);
            id++;
        }
    }
}
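A compact alternative sketch, under the same assumptions (inputfile and outputfile hold the paths, and the .XYZ lines are space-separated): File.ReadLines streams one line at a time just like the StreamReader loop, so the 32-million-line file is never held in memory.
// streams the input, appends an incremental id, and writes the .csv directly
int id = 1;
using (StreamWriter writer = new StreamWriter(Path.ChangeExtension(outputfile, ".csv"), false))
{
    foreach (string fileLine in File.ReadLines(inputfile))
    {
        writer.WriteLine(fileLine.Replace(" ", ", ") + "," + id);
        id++;
    }
}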

Related

Read a text file until a line contains some string, then keep reading lines until another string is encountered

I have to read a text file, and if a line contains ".engineering $name", then look for the line which contains ".default" and do some operation with that line. I need to keep reading lines until I find ".default" within that set of lines (the set runs until I hit the next ".engineering"). The loop then continues the same way for the next ".engineering $name".
Note:
".engineering" is a fixed string, $name is read dynamically,
".default" is a fixed string.
I am able to do the first part, reading the line which contains ".engineering $name".
I am unable to work out the logic for the next part: finding ".default" before it hits the next ".engineering".
I am looking for the logic, or C# code for it. Thank you.
Code:
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
using (var reader = new StreamReader(stream))
{
    while (!reader.EndOfStream)
    {
        string[] def_arr = null;
        var line1 = reader.ReadLine();
        if (line1.Contains(".engineering " + name + " ") && !reader.EndOfStream)
        {
            var nextLine = reader.ReadLine(); // nextLine contains ".default"
            def_arr = nextLine.Split(' ');
            def_val = def_arr[1].Replace("\"", "");
            port_DefaultValues.Add(name + ", " + def_val);
        }
    }
}
var nextLine is the line containing ".default". I coded it as if the line immediately following the ".engineering" line contains ".default", but that is not always the case: ".default" can be on any line before the next ".engineering" is hit.
I hope the problem statement is clear.
Try this code:
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
using (var reader = new StreamReader(stream))
{
    while (!reader.EndOfStream)
    {
        string[] def_arr = null;
        var line1 = reader.ReadLine();
        if (line1.Contains(".engineering " + name + " ") && !reader.EndOfStream)
        {
            var nextLine = reader.ReadLine(); // nextLine contains ".default"
            while (!nextLine.Contains(".default") && !reader.EndOfStream)
            {
                nextLine = reader.ReadLine();
            }
            def_arr = nextLine.Split(' ');
            def_val = def_arr[1].Replace("\"", "");
            port_DefaultValues.Add(name + ", " + def_val);
        }
    }
}
I have just added a loop that keeps reading the next line until it encounters ".default". Keep in mind it can throw an exception if ".default" is never found in the rest of the file.
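If a ".engineering" block can appear with no ".default" line at all, a slightly more defensive sketch stops scanning at the next ".engineering" and skips that block. This assumes the same surrounding variables as above (path, name, port_DefaultValues):
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
using (var reader = new StreamReader(stream))
{
    string line = reader.ReadLine();
    while (line != null)
    {
        if (line.Contains(".engineering " + name + " "))
        {
            // scan forward until ".default" or the start of the next block
            string inner = reader.ReadLine();
            while (inner != null && !inner.Contains(".default") && !inner.Contains(".engineering"))
            {
                inner = reader.ReadLine();
            }
            if (inner != null && inner.Contains(".default"))
            {
                var parts = inner.Split(' ');
                if (parts.Length > 1)
                {
                    port_DefaultValues.Add(name + ", " + parts[1].Replace("\"", ""));
                }
                line = reader.ReadLine();
            }
            else
            {
                // no ".default" in this block; re-examine the line we stopped on
                line = inner;
            }
        }
        else
        {
            line = reader.ReadLine();
        }
    }
}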

C# - Split CSV File by Removing Bad Rows

I have a csv file with 2 million rows and a file size of 2 GB. A couple of free-text columns contain stray CRLF characters, which causes the file to fail to load into the SQL Server table; I get an error that the last column does not end with ".
I have the following code, but it throws an OutOfMemoryException on the line that reads the file:
var lines = File.ReadAllLines(fileName);
How can I fix it? Ideally, I would like to split the file into good and bad rows, or delete the rows that do not end with "CRLF.
int goodRow = 0;
int badRow = 0;
String badRowFileName = fileName.Substring(0, fileName.Length - 4) + "BadRow.csv";
String goodRowFileName = fileName.Substring(0, fileName.Length - 4) + "GoodRow.csv";
var charGood = "\"\"";
String lineOut = string.Empty;
String str = string.Empty;
var lines = File.ReadAllLines(fileName);
StringBuilder sbGood = new StringBuilder();
StringBuilder sbBad = new StringBuilder();
foreach (string line in lines)
{
    if (line.Contains(charGood))
    {
        goodRow++;
        sbGood.AppendLine(line);
    }
    else
    {
        badRow++;
        sbBad.AppendLine(line);
    }
}
if (badRow > 0)
{
    File.WriteAllText(badRowFileName, sbBad.ToString());
}
if (goodRow > 0)
{
    File.WriteAllText(goodRowFileName, sbGood.ToString());
}
sbGood.Clear();
sbBad.Clear();
msg = msg + "Good Rows - " + goodRow.ToString() + " Bad Rows - " + badRow.ToString() + " Done.";
You can translate that code like this to be much more efficient:
int goodRow = 0, badRow = 0;
String badRowFileName = fileName.Substring(0, fileName.Length - 4) + "BadRow.csv";
String goodRowFileName = fileName.Substring(0, fileName.Length - 4) + "GoodRow.csv";
var charGood = "\"\"";
// File.ReadLines streams the file lazily, so the whole 2 GB is never held in memory
var lines = File.ReadLines(fileName);
using (var swGood = new StreamWriter(goodRowFileName))
using (var swBad = new StreamWriter(badRowFileName))
{
    foreach (string line in lines)
    {
        if (line.Contains(charGood))
        {
            goodRow++;
            swGood.WriteLine(line);
        }
        else
        {
            badRow++;
            swBad.WriteLine(line);
        }
    }
}
msg += $"Good Rows: {goodRow,9} Bad Rows: {badRow,9} Done.";
But I'd also look at using a real csv parser for this. There are plenty on NuGet. That might even let you clean up the data on the fly.
I would not suggest reading the entire file into memory, then processing the file, then writing all modified contents out to the new file.
Instead using file streams:
using (var rdr = new StreamReader(fileName))
using (var wrtrGood = new StreamWriter(goodRowFileName))
using (var wrtrBad = new StreamWriter(badRowFileName))
{
string line = null;
while ((line = rdr.ReadLine()) != null)
{
if (line.Contains(charGood))
{
goodRow++;
wrtr.WriteLine(line);
}
else
{
badRow++;
wrtrBad.WriteLine(line);
}
}
}
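Note that both versions classify physical lines, not CSV records: a record whose free-text column contains an embedded CRLF arrives as several physical lines, and each fragment is classified on its own. One way to stitch the fragments back into logical records is to keep appending lines while the number of double quotes seen so far is odd (a quoted field is still open). The helper below is only an illustrative sketch (the name ReadCsvRecords is made up); a dedicated CSV parser from NuGet remains the more robust option.
// requires: using System.Collections.Generic; using System.IO;
//           using System.Linq; using System.Text;
static IEnumerable<string> ReadCsvRecords(string fileName)
{
    var record = new StringBuilder();
    int quoteCount = 0;
    foreach (string line in File.ReadLines(fileName))
    {
        if (record.Length > 0)
            record.Append(' '); // replace the embedded CRLF with a space
        record.Append(line);
        quoteCount += line.Count(c => c == '"');
        if (quoteCount % 2 == 0) // every quoted field is closed: record is complete
        {
            yield return record.ToString();
            record.Clear();
            quoteCount = 0;
        }
    }
    if (record.Length > 0)
        yield return record.ToString(); // trailing, possibly malformed, record
}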

c# File1's text keeps replacing file2's when I run it

All I need is for file1 and file2 to show the text inside the file. File1 is working great! File2 not so much. I believe there is something wrong with how I wrote file2 being read. Because I made a class so that I can make file2's text go to another file called outputfile2, and even that isn't working.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Threading.Tasks;

namespace RandomName
{
    class Program
    {
        static void Main(string[] args)
        {
            string winDir = "C:/Users/RandomPerson/Desktop/RandomName/bin/Debug/";
            string fileName = "file1.txt";
            StreamReader reader = new StreamReader(winDir + fileName);
            string outputFileName = "upperfile" + fileName;
            StreamWriter writer = new StreamWriter(outputFileName);
            int n = 0;
            string st = "";
            string upperString = "";
            int n2 = 0;
            string st2 = "";
            string upperString2 = "";
            string fileName2 = "file2.txt";
            StreamReader reader2 = new StreamReader(winDir + fileName2);
            string outputFileName2 = "output" + fileName2;
            StreamWriter writer2 = new StreamWriter(outputFileName2);
            do
            {
                ++n;
                st = reader.ReadLine(); // read one line from disk file
                Console.WriteLine("Line #" + n + ": " + st); // write to the console
                writer.WriteLine(st); // write line to disk file instead, using WriteLine() method
                upperString = upperString + "\n" + st; // append each line to the big string
            }
            while (!reader.EndOfStream);
            do
            {
                ++n2;
                st2 = reader2.ReadLine(); // read one line from disk file
                Console.WriteLine("Line #" + n2 + ": " + st2); // write to the console
                writer2.WriteLine(st2); // write line to disk file instead, using WriteLine() method
                upperString2 = upperString2 + "\n" + st2; // append each line to the big string
            }
            while (!reader2.EndOfStream);
            reader.Close();
            writer.Close();
            Console.WriteLine("\nHere is the entire file in a string:");
            Console.WriteLine(upperString);
            Console.WriteLine(upperString2);
            UpperString b = new UpperString(upperString);
            UpperString2 c = new UpperString2(upperString2);
            Console.WriteLine("\nThe string in reverse case: ");
            b.showReverseCase();
            Console.WriteLine("\n");
            c.readingFile2();
            c.toNewFile2();
        }
    }
}
"b." is for another class that I have. I copied the code from that class into the "c." one, changing names of strings and such. And that didn't work. Which is why I think something is wrong somewhere in the main.
Here is the class
class UpperString2
{
    private string upperString2;

    public UpperString2() { }
    public UpperString2(string c) { upperString2 = c; }

    public void readingFile2()
    {
        string[] lines = System.IO.File.ReadAllLines("C:/Users/SomeName/Desktop/FolderName/bin/Debug/file2.txt");
        System.Console.WriteLine("\nAnother Poem \n");
        foreach (string line in lines)
        {
            // Use a tab to indent each line of the file.
            Console.WriteLine(line);
        }
    }

    public void toNewFile2()
    {
        using (StreamWriter writetext = new StreamWriter("outputfile2.txt"))
        {
            string newText = (upperString2.ToUpper()).ToString();
            writetext.WriteLine(newText);
        }
    }
}
I am a bit new to StreamReader and StreamWriter, which is why I think I went wrong somewhere with them; I'm just not sure where. Thank you to anyone who can help me get file2's text to show up without it being overwritten by file1's text!
The problem is that "outputfile2.txt" was already opened by writer2 in Main():
string fileName2 = "file2.txt";
StreamReader reader2 = new StreamReader(winDir + fileName2);
string outputFileName2 = "output" + fileName2; // <-- outputfile2.txt
StreamWriter writer2 = new StreamWriter(outputFileName2);
Then it raises an exception when you try to open the same file for writing in toNewFile2():
public void toNewFile2()
{
    using (StreamWriter writetext = new StreamWriter("outputfile2.txt"))
    {
        string newText = (upperString2.ToUpper()).ToString();
        writetext.WriteLine(newText);
    }
}
This happens because the writer2 object is still alive in Main(), locking the file, and there is no using statement to dispose of it when it is no longer needed.
Since you have moved the code into a class, drop writer2 from Main() and let that class write the output file instead.
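As a rough sketch of that fix (keeping the UpperString2 class as it is, and winDir/fileName2 as defined in Main()), wrapping the reader in a using block and dropping writer2 means no handle is left holding outputfile2.txt open, so toNewFile2() can create it without a conflict:
string upperString2 = "";
int n2 = 0;
using (StreamReader reader2 = new StreamReader(winDir + fileName2))
{
    string st2;
    while ((st2 = reader2.ReadLine()) != null)
    {
        ++n2;
        Console.WriteLine("Line #" + n2 + ": " + st2);
        upperString2 = upperString2 + "\n" + st2; // append each line to the big string
    }
} // reader2 is disposed (and file2.txt unlocked) here

UpperString2 c = new UpperString2(upperString2);
c.readingFile2();
c.toNewFile2(); // outputfile2.txt is not locked by anything else, so this write succeeds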

Is there a more efficient way of reading and writing a text file at the same time?

I'm back at it again with another question, this time with regards to editing text files. My homework is as follows:
Write a program that reads the contents of a text file and inserts the line numbers at the beginning of each line, then rewrites the file contents.
This is what I have so far, though I am not sure it is the most efficient way of doing it. I've only just started learning how to handle text files.
static void Main(string[] args)
{
    string fileName = @"C:\Users\Nate\Documents\Visual Studio 2015\Projects\Chapter 15\Chapter 15 Question 3\Chapter 15 Question 3\TextFile1.txt";
    StreamReader reader = new StreamReader(fileName);
    int lineCounter = 0;
    List<string> list = new List<string>();
    using (reader)
    {
        string line = reader.ReadLine();
        while (line != null)
        {
            list.Add("line " + (lineCounter + 1) + ": " + line);
            line = reader.ReadLine();
            lineCounter++;
        }
    }
    StreamWriter writer = new StreamWriter(fileName);
    using (writer)
    {
        foreach (string line in list)
        {
            writer.WriteLine(line);
        }
    }
}
your help would be appreciated!
thanks once again. :]
This should be enough (as long as the file is relatively small):
using System.IO;
(...)
static void Main(string[] args)
{
    string fileName = @"C:\Users\Nate\Documents\Visual Studio 2015\Projects\Chapter 15\Chapter 15 Question 3\Chapter 15 Question 3\TextFile1.txt";
    string[] lines = File.ReadAllLines(fileName);
    for (int i = 0; i < lines.Length; i++)
    {
        lines[i] = string.Format("{0} {1}", i + 1, lines[i]);
    }
    File.WriteAllLines(fileName, lines);
}
I suggest using LINQ; use File.ReadLines to read the content.
// Read all lines and apply the format.
var formattedLines = File
    .ReadLines("filepath") // read the lines lazily
    .Select((line, i) => string.Format("line {0}: {1}", i + 1, line)); // format each line
// Write the formatted lines to a new output file (writing back to the same path
// would fail here, because File.ReadLines still has the source file open).
File.WriteAllLines("outputfilepath", formattedLines);
Just one loop here. I think it will be efficient.
class Program
{
    public static void Main()
    {
        string path = Directory.GetCurrentDirectory() + @"\MyText.txt";
        StreamReader sr1 = File.OpenText(path);
        string s = "";
        int counter = 1;
        StringBuilder sb = new StringBuilder();
        while ((s = sr1.ReadLine()) != null)
        {
            var lineOutput = counter++ + " " + s;
            Console.WriteLine(lineOutput);
            sb.AppendLine(lineOutput); // AppendLine, so each numbered line stays on its own line
        }
        sr1.Close();
        Console.WriteLine();
        StreamWriter sw1 = File.CreateText(path); // overwrite the file with the numbered lines
        sw1.Write(sb);
        sw1.Close();
    }
}
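If the file is too large to buffer in a List or StringBuilder, another option is to stream the numbered lines into a temporary file and then swap it in place of the original. A minimal sketch (the .tmp path is just an illustration, and fileName is assumed to hold the path of the file to renumber):
string tempFile = fileName + ".tmp"; // illustrative temp path
using (var reader = new StreamReader(fileName))
using (var writer = new StreamWriter(tempFile, false))
{
    string line;
    int lineNumber = 1;
    while ((line = reader.ReadLine()) != null)
    {
        writer.WriteLine("line " + lineNumber + ": " + line);
        lineNumber++;
    }
}
File.Delete(fileName);
File.Move(tempFile, fileName);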

c# Remove rows from csv

I have two csv files. In the first file I have a list of users, and in the second file I have a list of duplicate users. I'm trying to remove the rows in the first file that match rows in the second file.
Here is the code I have so far:
StreamWriter sw = new StreamWriter(path3);
StreamReader sr = new StreamReader(path2);
string[] lines = File.ReadAllLines(path);
foreach (string line in lines)
{
    string user = sr.ReadLine();
    if (line != user)
    {
        sw.WriteLine(line);
    }
}
File 1 example:
Modify,ABAMA3C,Allpay - Free State - HO,09072701
Modify,ABCG327,Processing Centre,09085980
File 2 Example:
Modify,ABAA323,Group HR Credit Risk & Finance
Modify,ABAB959,Channel Sales & Service,09071036
Any suggestions?
Thanks.
All you have to do is change the file paths in the code below, and you will get file one back without the duplicate users from file 2. This code was written to be easy to understand; there are more elegant solutions, but I wanted to keep it as basic as possible for you:
(Paste this in the main method of your program)
string line;
StreamReader sr = new StreamReader(@"C:\Users\J\Desktop\texts\First.txt");
StreamReader sr2 = new StreamReader(@"C:\Users\J\Desktop\texts\Second.txt");
List<String> fileOne = new List<string>();
List<String> fileTwo = new List<string>();
while (sr.Peek() >= 0)
{
    line = sr.ReadLine();
    if (line != "")
    {
        fileOne.Add(line);
    }
}
sr.Close();
while (sr2.Peek() >= 0)
{
    line = sr2.ReadLine();
    if (line != "")
    {
        fileTwo.Add(line);
    }
}
sr2.Close();
var t = fileOne.Except(fileTwo);
StreamWriter sw = new StreamWriter(@"C:\Users\justin\Desktop\texts\First.txt");
foreach (var z in t)
{
    sw.WriteLine(z);
}
sw.Flush();
If this is not homework but a production task, and you can install assemblies, you'll save yourself hours if you swallow your pride and use a piece of the VB library: Microsoft.VisualBasic.FileIO.TextFieldParser.
CSV has many edge cases (CR/LF between commas is legal inside quotes, different kinds of quoting, etc.), and this parser handles anything Excel will export or import.
Sample code to load a 'Person' class pulled from a program I used it in:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(CSVPath)
    Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
    Reader.Delimiters = New String() {","}
    Reader.TrimWhiteSpace = True
    Reader.HasFieldsEnclosedInQuotes = True
    While Not Reader.EndOfData
        Try
            Dim st2 As New List(Of String)
            st2.AddRange(Reader.ReadFields())
            If iCount > 0 Then ' ignore first row = field names
                Dim p As New Person
                p.CSVLine = st2
                p.FirstName = st2(1).Trim
                If st2.Count > 2 Then
                    p.MiddleName = st2(2).Trim
                Else
                    p.MiddleName = ""
                End If
                p.LastNameSuffix = st2(0).Trim
                If st2.Count > 5 Then
                    p.TestCase = st2(5).Trim
                End If
                If st2(3) > "" Then
                    p.AccountNumbersFromCase.Add(st2(3))
                End If
                While p.CSVLine.Count < 15
                    p.CSVLine.Add("")
                End While
                cases.Add(p)
            End If
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
        End Try
        iCount += 1
    End While
End Using
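The same parser can be used from C# by adding a reference to the Microsoft.VisualBasic assembly. A minimal sketch for this question (treating field index 1 as the user id column is an assumption based on the sample rows above):
// requires a reference to Microsoft.VisualBasic
using Microsoft.VisualBasic.FileIO;

using (var parser = new TextFieldParser(path))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    parser.HasFieldsEnclosedInQuotes = true;
    while (!parser.EndOfData)
    {
        string[] fields = parser.ReadFields(); // handles quoted commas and embedded CR/LF
        // fields[1] is assumed to be the user id column (e.g. ABAMA3C)
    }
}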
Use using blocks to close the streams properly:
using (var sw = new StreamWriter(path3))
using (var sr = new StreamReader(path2))
{
    string[] lines = File.ReadAllLines(path);
    foreach (string line in lines)
    {
        string user = sr.ReadLine();
        if (line != user)
        {
            sw.WriteLine(line);
        }
    }
}
For help with the actual removal/comparison logic, answer El Ronnoco's comment above...
You need to close the streams, or use a using clause:
sw.Close();
using (StreamWriter sw = new StreamWriter(@"c:\test3.txt"))
You can use LINQ...
class Program
{
    static void Main(string[] args)
    {
        var fullList = "TextFile1.txt".ReadAsLines();
        var removeThese = "TextFile2.txt".ReadAsLines();

        // Change this line if you need to change the filter results.
        // Note: this assumes you want to remove a result from the first
        // list when the entire record matches. If you want to match on
        // only part of the record you will need to split/parse the records
        // and then filter your results.
        var cleanedList = fullList.Except(removeThese);

        cleanedList.WriteAsLinesTo("result.txt");
    }
}

public static class Tools
{
    public static IEnumerable<string> ReadAsLines(this string filename)
    {
        using (var reader = new StreamReader(filename))
            while (!reader.EndOfStream)
                yield return reader.ReadLine();
    }

    public static void WriteAsLinesTo(this IEnumerable<string> lines, string filename)
    {
        using (var writer = new StreamWriter(filename) { AutoFlush = true, })
            foreach (var line in lines)
                writer.WriteLine(line);
    }
}
using (var sw = new StreamWriter(path3))
using (var sr = new StreamReader(path))
{
    // build a lookup of the user ids (second column) to remove
    string[] arrRemove = File.ReadAllLines(path2);
    HashSet<string> listRemove = new HashSet<string>();
    foreach (string s in arrRemove)
    {
        string[] sa = s.Split(',');
        if (sa.Length < 2) continue;
        listRemove.Add(sa[1].ToUpper());
    }
    string line = sr.ReadLine();
    while (line != null)
    {
        string[] sa = line.Split(',');
        if (sa.Length < 2)
            sw.WriteLine(line);
        else if (!listRemove.Contains(sa[1].ToUpper()))
            sw.WriteLine(line);
        line = sr.ReadLine();
    }
}
