How do I remove corrupted data from .csv file? - c#

So I have lots of data, but I'm not sure how to remove the corrupt data.
In the file, the rows look like this:
EMERIE,ESPARZA,166,57,34,BLUE,BLONDE
ADALINE,PARSONS,158,39,£$**),BROWN,GREY
The £$**) represents corrupted data, but I don't know how to remove it. I have over 10,000 names, and some of them look like this.

Assuming you want to completely remove the corrupted data rows rather than modify them, you could do something like the following:
public void RemoveCorruptData()
{
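string path = @"C:\CSV.txt";
string newPath = @"C:\new-CSV.txt";
List<string> lines = new List<string>();
// Escape the marker so that $ and * are matched literally rather than treated as regex metacharacters
Regex corrupt = new Regex(Regex.Escape("£$**"));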
if (File.Exists(path))
{
using (StreamReader reader = new StreamReader(path))
{
string line;
while ((line = reader.ReadLine()) != null)
{
if (!corrupt.IsMatch(line))
{
lines.Add(line);
}
}
}
using (StreamWriter writer = new StreamWriter(newPath, false))
{
foreach (String line in lines)
writer.WriteLine(line);
}
}
}
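Matching one literal marker only catches rows corrupted in exactly that way. A minimal sketch of an alternative is to validate each row's shape instead. It assumes (this is not stated in the question) that every valid row has exactly 7 fields and that the third to fifth fields are whole numbers; the names CsvCleaner, IsValidRow and RemoveInvalidRows are made up for illustration.

```csharp
using System;
using System.IO;
using System.Linq;

public static class CsvCleaner
{
    // Assumption: a valid row has exactly 7 comma-separated fields,
    // and fields 3 to 5 (zero-based indexes 2..4) are whole numbers.
    public static bool IsValidRow(string line)
    {
        string[] fields = line.Split(',');
        if (fields.Length != 7)
            return false;

        int ignored;
        for (int i = 2; i <= 4; i++)
        {
            if (!int.TryParse(fields[i], out ignored))
                return false;
        }
        return true;
    }

    // Streams the source file and writes only the rows that pass validation.
    public static void RemoveInvalidRows(string path, string newPath)
    {
        File.WriteAllLines(newPath, File.ReadLines(path).Where(line => IsValidRow(line)));
    }
}
```

With the two sample rows from the question, the EMERIE row passes and the ADALINE row is dropped because its fifth field is not a number.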

Related

Replacing a certain word in a text file

I know this has been asked a few times, but I have seen a lot of regex etc., and I'm sure there is another way to do this with just a stream reader/writer. Below is my code. I'm trying to replace "tea" with the word "cabbage". Can somebody help? I believe I have the wrong syntax.
namespace Week_9_Exer_4
{
class TextImportEdit
{
public void EditorialControl()
{
string fileName;
string lineReadFromFile;
Console.WriteLine("");
// Ask for the name of the file to be read
Console.Write("Which file do you wish to read? ");
fileName = Console.ReadLine();
Console.WriteLine("");
// Open the file for reading
StreamReader fileReader = new StreamReader("C:\\Users\\Greg\\Desktop\\Programming Files\\story.txt");
// Read the lines from the file and display them
// until a null is returned (indicating end of file)
lineReadFromFile = fileReader.ReadLine();
Console.WriteLine("Please enter the word you wish to edit out: ");
string editWord = Console.ReadLine();
while (lineReadFromFile != null)
{
Console.WriteLine(lineReadFromFile);
lineReadFromFile = fileReader.ReadLine();
}
String text = File.ReadAllText("C:\\Users\\Greg\\Desktop\\Programming Files\\story.txt");
fileReader.Close();
StreamWriter fileWriter = new StreamWriter("C:\\Users\\Greg\\Desktop\\Programming Files\\story.txt", false);
string newText = text.Replace("tea", "cabbage");
fileWriter.WriteLine(newText);
fileWriter.Close();
}
}
}
If you don't care about memory usage:
string fileName = @"C:\Users\Greg\Desktop\Programming Files\story.txt";
File.WriteAllText(fileName, File.ReadAllText(fileName).Replace("tea", "cabbage"));
If you have a multi-line file that doesn't randomly split words at the end of the line, you could modify one line at a time in a more memory-friendly way:
// Create a temporary file path where we can write the modified lines
// (declared outside the using blocks so it is still in scope for File.Replace)
string tempFile = Path.Combine(Path.GetDirectoryName(fileName), "story-temp.txt");
// Open a stream for the source file
using (var sourceFile = File.OpenText(fileName))
{
// Open a stream for the temporary file
using (var tempFileStream = new StreamWriter(tempFile))
{
string line;
// Read lines while the file has them
while ((line = sourceFile.ReadLine()) != null)
{
// Do the word replacement
line = line.Replace("tea", "cabbage");
// Write the modified line to the new file
tempFileStream.WriteLine(line);
}
}
}
// Replace the original file with the temporary one (both streams are closed by now)
File.Replace(tempFile, fileName, null);
In the end I used this. I hope it helps others:
public List<string> EditorialResponse(string fileName, string searchString, string replacementString)
{
List<string> list = new List<string>();
using (StreamReader reader = new StreamReader(fileName))
{
string line;
while ((line = reader.ReadLine()) != null)
{
line = line.Replace(searchString, replacementString);
list.Add(line);
Console.WriteLine(line);
}
reader.Close();
}
return list;
}
}
class Program
{
static void Main(string[] args)
{
TextImportEdit tie = new TextImportEdit();
List<string> ls = tie.EditorialResponse(@"C:\Users\Tom\Documents\Visual Studio 2013\story.txt", "tea", "cockrel");
StreamWriter writer = new StreamWriter(@"C:\Users\Tom\Documents\Visual Studio 2013\story12.txt");
foreach (string line in ls)
{
writer.WriteLine(line);
}
writer.Close();
}
}
}

C# loading listView subItems from file using StreamReader

I need some help with loading text file into a listView. Text file looks like this:
1,6 sec,5 sec,1 sec,17,
2,6 sec,4 sec,2 sec,33,
3,7 sec,5 sec,3 sec,44,
I have to load this into a listView control and every subitem should be separated by comma (or any other character). I tried something like this:
using (var sr = new StreamReader(file))
{
string fileLine = sr.ReadLine();
foreach (string piece in fileLine.Split(','))
{
listView1.Items.Add(piece);
}
sr.Close();
}
It would work fine, except that only the first line is loaded, and all of it goes into the first column of the ListView. I cannot figure it out.
Thanks for your time!
KR!
You have to advance to the next line. You can use a while loop:
using (var sr = new StreamReader(file))
{
string fileLine;
while ((fileLine = sr.ReadLine()) != null)
{
foreach (string piece in fileLine.Split(','))
{
listView1.Items.Add(piece);
}
}
}
Note that you don't need to close the stream manually; the using statement does that.
Another way is using File.ReadLines or File.ReadAllLines which can help to simplify your code:
var allPieces = File.ReadLines(file).SelectMany(line => line.Split(','));
foreach(string piece in allPieces)
listView1.Items.Add(piece);
using (var sr = new StreamReader(file))
{
while(!sr.EndOfStream)
{
string fileLine = sr.ReadLine();
foreach (string piece in fileLine.Split(','))
{
listView1.Items.Add(piece);
}
sr.Close();
}
}
I guess you just have to add:
while (!sr.EndOfStream)
{
string fileLine = sr.ReadLine();
foreach (string piece in fileLine.Split(','))
{
listView1.Items.Add(piece);
}
}
sr.Close(); // Close the reader after the while loop. The text has multiple lines, so closing it inside the loop stops the second line from being read and throws an exception.
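The loops above add every piece as its own item, so each value lands in the first column. If the goal is one row per line with the remaining values as subitems, a sketch along these lines may help. RowParser, ParseRow and ReadRows are hypothetical names, and the sketch assumes the trailing comma on each line is not meant to produce an empty extra column.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class RowParser
{
    // Splits one line such as "1,6 sec,5 sec,1 sec,17," into its columns.
    // Assumption: the trailing comma is a terminator, so it is trimmed first.
    public static string[] ParseRow(string line)
    {
        return line.TrimEnd(',').Split(',');
    }

    // Reads the whole file and returns one string array per non-blank line.
    public static List<string[]> ReadRows(string file)
    {
        var rows = new List<string[]>();
        foreach (string line in File.ReadLines(file))
        {
            if (line.Trim().Length == 0)
                continue; // skip blank lines
            rows.Add(ParseRow(line));
        }
        return rows;
    }
}

// In the form (Windows Forms), each row can then become one ListViewItem;
// the first element is the item text and the rest become subitems:
//
//     foreach (string[] row in RowParser.ReadRows(file))
//         listView1.Items.Add(new ListViewItem(row));
```

Subitems are only visible when the ListView is in Details view and has a column defined for each value.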

Remove Stop Words From text File

I want to remove stop words from my text file, and I wrote the following code for this purpose:
TextWriter tw = new StreamWriter("D:\\output.txt");
private void button1_Click(object sender, EventArgs e)
{
StreamReader reader = new StreamReader("D:\\input1.txt");
string line;
while ((line = reader.ReadLine()) != null)
{
string[] parts = line.Split(' ');
string[] stopWord = new string[] { "is", "are", "am","could","will" };
foreach (string word in stopWord)
{
line = line.Replace(word, "");
tw.Write("+"+line);
}
tw.Write("\r\n");
}
But it doesn't produce any result: the output file remains empty.
A regular expression might be perfect for the job:
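// Verbatim string (@) so that \b reaches the regex engine as a word boundary, not a backspace character
Regex replacer = new Regex(@"\b(?:is|are|am|could|will)\b");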
using (TextWriter writer = new StreamWriter("C:\\output.txt"))
{
using (StreamReader reader = new StreamReader("C:\\input.txt"))
{
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
// Replace returns a new string, so assign the result back
line = replacer.Replace(line, "");
writer.WriteLine(line);
}
}
writer.Flush();
}
This method replaces only whole stop words with empty strings; stop words that are part of another word are left untouched.
Good luck with your quest.
The following works as expected for me. However, it's not a good approach because it will remove the stop words even when they are part of a larger word. Also, it doesn't clean up extra spaces between removed words.
string[] stopWord = new string[] { "is", "are", "am","could","will" };
TextWriter writer = new StreamWriter("C:\\output.txt");
StreamReader reader = new StreamReader("C:\\input.txt");
string line;
while ((line = reader.ReadLine()) != null)
{
foreach (string word in stopWord)
{
line = line.Replace(word, "");
}
writer.WriteLine(line);
}
reader.Close();
writer.Close();
Also, I recommend wrapping your streams in using statements when you create them, to ensure the files are closed in a timely manner.
You should wrap your IO objects in using statements so that they are disposed properly.
using (TextWriter tw = new StreamWriter("D:\\output.txt"))
{
using (StreamReader reader = new StreamReader("D:\\input1.txt"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
string[] parts = line.Split(' ');
string[] stopWord = new string[] { "is", "are", "am","could","will" };
foreach (string word in stopWord)
{
line = line.Replace(word, "");
tw.Write("+"+line);
}
}
}
}
Try wrapping StreamWriter and StreamReader in using() {} clauses.
using (TextWriter tw = new StreamWriter(@"D:\output.txt"))
{
...
}
You may also want to call tw.Flush() at the very end.
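Two problems raised in the answers above, stop words being removed from inside larger words and leftover double spaces, can both be avoided without a regex by comparing whole tokens. The following is a rough sketch under the assumption that words are separated by spaces only; punctuation attached to a word (for example "is,") would not match. StopWordFilter and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class StopWordFilter
{
    // Same stop words as in the question; compared case-insensitively (an assumption).
    private static readonly HashSet<string> StopWords = new HashSet<string>(
        new[] { "is", "are", "am", "could", "will" },
        StringComparer.OrdinalIgnoreCase);

    // Drops whole-word stop words and rejoins the rest with single spaces.
    public static string RemoveStopWords(string line)
    {
        IEnumerable<string> kept = line
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => !StopWords.Contains(word));
        return string.Join(" ", kept);
    }

    // Streams the input file line by line and writes the filtered lines.
    public static void FilterFile(string inputPath, string outputPath)
    {
        File.WriteAllLines(outputPath,
            File.ReadLines(inputPath).Select(line => RemoveStopWords(line)));
    }
}
```

Because whole tokens are compared, a word such as "island" is kept even though it contains "is".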

How to add lines of a text file into individual items on a ListBox (C#)

How would it be possible to read a text file with several lines, and then to put each line in the text file on a separate row in a ListBox?
The code I have so far:
richTextBox5.Text = File.ReadAllText("ignore.txt");
String text = File.ReadAllText("ignore.txt");
var result = Regex.Split(text, "\r\n|\r|\n");
foreach(string s in result)
{
lstBox.Items.Add(s);
}
string[] lines = System.IO.File.ReadAllLines(@"ignore.txt");
foreach (string line in lines)
{
listBox.Items.Add(line);
}
Write a helper method that returns the collection of lines:
static IEnumerable<string> ReadFromFile(string file)
{// check if file exist, null or empty string
string line;
using(var reader = File.OpenText(file))
{
while((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
Use it:
var lines = ReadFromFile(myfile);
myListBox.ItemsSource = lines.ToList(); // or change it to ObservableCollection. also you can add to the end line by line with myListBox.Items.Add()
You should use a StreamReader to read the file one line at a time.
using (StreamReader sr = new StreamReader("ignore.txt"))
{
string line;
while ((line = sr.ReadLine()) != null)
listBox1.Items.Add(line);
}
StreamReader info -> http://msdn.microsoft.com/en-us/library/system.io.streamreader.aspx
ListBox info -> http://msdn.microsoft.com/en-us/library/system.windows.forms.listbox.aspx

Delete specific line from a text file?

I need to delete an exact line from a text file, but I cannot for the life of me work out how to go about doing this.
Any suggestions or examples would be greatly appreciated.
Related Questions
Efficient way to delete a line from a text file (C#)
If the line you want to delete is based on the content of the line:
string line = null;
string line_to_delete = "the line i want to delete";
using (StreamReader reader = new StreamReader("C:\\input")) {
using (StreamWriter writer = new StreamWriter("C:\\output")) {
while ((line = reader.ReadLine()) != null) {
if (String.Compare(line, line_to_delete) == 0)
continue;
writer.WriteLine(line);
}
}
}
Or if it is based on line number:
string line = null;
int line_number = 0;
int line_to_delete = 12;
using (StreamReader reader = new StreamReader("C:\\input")) {
using (StreamWriter writer = new StreamWriter("C:\\output")) {
while ((line = reader.ReadLine()) != null) {
line_number++;
if (line_number == line_to_delete)
continue;
writer.WriteLine(line);
}
}
}
The best way to do this is to open the file in text mode, read each line with ReadLine(), and then write it to a new file with WriteLine(), skipping the one line you want to delete.
There is no generic delete-a-line-from-file function, as far as I know.
One way to do it if the file is not very big is to load all the lines into an array:
string[] lines = File.ReadAllLines("filename.txt");
string[] newLines = RemoveUnnecessaryLine(lines);
File.WriteAllLines("filename.txt", newLines);
I hope this short, simple code helps.
List<string> linesList = File.ReadAllLines("myFile.txt").ToList();
// Remove the first line (index 0)
linesList.RemoveAt(0);
File.WriteAllLines("myFile.txt", linesList.ToArray());
Or use this:
public void DeleteLinesFromFile(string strLineToDelete)
{
string strFilePath = "Provide the path of the text file";
string strSearchText = strLineToDelete;
string strOldText;
string n = "";
StreamReader sr = File.OpenText(strFilePath);
while ((strOldText = sr.ReadLine()) != null)
{
if (!strOldText.Contains(strSearchText))
{
n += strOldText + Environment.NewLine;
}
}
sr.Close();
File.WriteAllText(strFilePath, n);
}
You can actually use C# generics for this to make it real easy:
var file = new List<string>(System.IO.File.ReadAllLines("C:\\path"));
file.RemoveAt(12);
File.WriteAllLines("C:\\path", file.ToArray());
This can be done in three steps:
// 1. Read the content of the file
string[] readText = File.ReadAllLines(path);
// 2. Empty the file
File.WriteAllText(path, String.Empty);
// 3. Fill up again, but without the deleted line
using (StreamWriter writer = new StreamWriter(path))
{
foreach (string s in readText)
{
if (!s.Equals(lineToBeRemoved))
{
writer.WriteLine(s);
}
}
}
Read and remember each line
Identify the one you want to get rid of
Forget that one
Write the rest back over the top of the file
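The steps above could be sketched roughly as follows. Instead of remembering every line, this variant streams the lines into a temporary file and then copies it over the original. LineRemover and DeleteLineAt are hypothetical names, the ".tmp" suffix is an arbitrary choice, and the line index is assumed to be zero-based.

```csharp
using System;
using System.IO;
using System.Linq;

public static class LineRemover
{
    // Removes the line at the given zero-based index from the file at 'path'.
    public static void DeleteLineAt(string path, int lineIndex)
    {
        // Hypothetical temp file name next to the original.
        string tempPath = path + ".tmp";

        // Stream every line except the unwanted one into the temp file.
        // The source reader is closed once the enumeration finishes.
        File.WriteAllLines(tempPath,
            File.ReadLines(path).Where((line, index) => index != lineIndex));

        // Overwrite the original with the filtered copy, then clean up.
        File.Copy(tempPath, path, true);
        File.Delete(tempPath);
    }
}
```

Like the ReadLine/WriteLine answers, this writes the environment's line endings rather than preserving the original ones.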
I cared about the file's original line-ending characters ("\n" or "\r\n") and wanted to maintain them in the output file (not overwrite them with whatever the current environment's characters are, as the other answers appear to do). So I wrote my own method to read a line without removing the line-ending characters, then used it in my DeleteLines method (I wanted the option to delete multiple lines, hence the use of a collection of line numbers to delete).
DeleteLines was implemented as a FileInfo extension and ReadLineKeepNewLineChars a StreamReader extension (but obviously you don't have to keep it that way).
public static class FileInfoExtensions
{
public static FileInfo DeleteLines(this FileInfo source, ICollection<int> lineNumbers, string targetFilePath)
{
var lineCount = 1;
using (var streamReader = new StreamReader(source.FullName))
{
using (var streamWriter = new StreamWriter(targetFilePath))
{
string line;
while ((line = streamReader.ReadLineKeepNewLineChars()) != null)
{
if (!lineNumbers.Contains(lineCount))
{
streamWriter.Write(line);
}
lineCount++;
}
}
}
return new FileInfo(targetFilePath);
}
}
public static class StreamReaderExtensions
{
private const char EndOfFile = '\uffff';
/// <summary>
/// Reads a line, similar to ReadLine method, but keeps any
/// new line characters (e.g. "\r\n" or "\n").
/// </summary>
public static string ReadLineKeepNewLineChars(this StreamReader source)
{
if (source == null)
throw new ArgumentNullException(nameof(source));
char ch = (char)source.Read();
if (ch == EndOfFile)
return null;
var sb = new StringBuilder();
while (ch != EndOfFile)
{
sb.Append(ch);
if (ch == '\n')
break;
ch = (char)source.Read();
}
return sb.ToString();
}
}
Are you on a Unix operating system?
You can do this with the "sed" stream editor. Read the man page for "sed"
Open the file, seek to the position of the line, then overwrite that line through a stream.
It is simple and fast, and because it streams, no array eats up memory.
This works in VB. The example searches for the line culture=id, where culture is the name and id is the value, and changes it to culture=en.
Fileopen(1, "text.ini")
dim line as string
dim currentpos as long
while true
line = lineinput(1)
dim namevalue() as string = split(line, "=")
if namevalue(0) = "line name value that i want to edit" then
currentpos = seek(1)
fileclose()
dim fs as filestream("test.ini", filemode.open)
dim sw as streamwriter(fs)
fs.seek(currentpos, seekorigin.begin)
sw.write(null)
sw.write(namevalue + "=" + newvalue)
sw.close()
fs.close()
exit while
end if
msgbox("org ternate jua bisa, no line found")
end while
That's all.
