parsing large textfile output to another textfile [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I want to parse a large text file and, if a line contains a certain substring, append that line to a new text file. I need the solution with the lowest memory usage. This is what I have so far; the comments describe what I need help adding:
.
.
.
if (File.ReadLines(filepath).Any(line => line.Contains(myXML.searchSTRING)))
{
    // code to grab that line and append it to a new text file
    // if the new text file doesn't exist then create it.
    // All text files I'm parsing have the same header, I want to grab
    // the third line and use it as my new text file header.
    // Only write the header once, I do not want it written every time a new
    // text file is opened for parsing
}

Try:
var count = 1;
File.WriteAllLines(newFilePath,
    File.ReadLines(filepath)
        // Keep the third line (the header) plus every line containing the search string.
        .Where(l => count++ == 3 || l.Contains(myXML.searchSTRING))
);
Both WriteAllLines() and ReadLines() work with enumerators, so lines are streamed one at a time and memory usage should stay relatively low.
I'm not sure how you would know to write the header only once; it depends on how your list of files to open is available. Are they in an array? If so, wrap the File.WriteAllLines call in a foreach loop over that array.
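One possible way to handle the "write the header only once" requirement is to check whether the output file already exists (or is empty) before appending. The following is only a sketch under assumptions: the class and method names (HeaderOnceFilter, AppendMatches) are hypothetical, and it assumes the header is the third line of every input file and that data lines start on line 4.

```csharp
using System.IO;

public static class HeaderOnceFilter
{
    // Appends lines containing `search` from inputPath to outputPath.
    // Assumption: line 3 of the input is the header and data starts on line 4.
    // The header is written only if the output file is missing or empty.
    public static void AppendMatches(string inputPath, string outputPath, string search)
    {
        bool needsHeader = !File.Exists(outputPath)
                           || new FileInfo(outputPath).Length == 0;

        // append: true creates the file if it does not exist yet.
        using (var writer = new StreamWriter(outputPath, append: true))
        {
            int lineNumber = 0;
            // File.ReadLines streams the file, so only one line is held at a time.
            foreach (string line in File.ReadLines(inputPath))
            {
                lineNumber++;
                if (lineNumber == 3)
                {
                    if (needsHeader)
                    {
                        writer.WriteLine(line);
                    }
                }
                else if (lineNumber > 3 && line.Contains(search))
                {
                    writer.WriteLine(line);
                }
            }
        }
    }
}
```

Calling AppendMatches once per input file with the same output path should then produce a single header followed by the matching lines of every file.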

Something like this should do it (edited to reflect @JimMischel's comments):
private static void WriteFile(string mySearchString, string fileToWrite, params string[] filesToRead)
{
    using (var sw = new StreamWriter(fileToWrite, true))
    {
        var count = 1;
        foreach (var file in filesToRead)
        {
            using (var sr = new StreamReader(file))
            {
                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    if (count == 3)
                    {
                        sw.WriteLine(line);
                    }
                    if (count > 3 && line.Contains(mySearchString))
                    {
                        sw.WriteLine(line);
                    }
                    count++;
                }
            }
        }
    }
}
You would call it like this:
WriteFile("Foobar", "fileToWrite.txt", "input1.txt", "input2.txt", "input3.txt");

You can use a StreamWriter for that:
using (var fs = new FileStream(outputFilePath, FileMode.Append, FileAccess.Write))
{
    using (var sw = new StreamWriter(fs))
    {
        foreach (var line in File.ReadLines(filepath).Where(line => line.Contains(myXML.searchSTRING)))
        {
            sw.WriteLine(line);
        }
    }
}

I think the most important thing is to use Where instead of Any. Any returns true or false depending on whether any element of the collection matches, whereas you want to filter the collection. The code below, combined with the answers above, should get you started (I would use LINQ for clarity, though).
string filepath = "infile.txt";
var searchString = "temp";
using (var outFile = new StreamWriter("output.txt"))
{
    // The third line of the input file is the header.
    var header = File.ReadLines(filepath).Skip(2).First();
    outFile.WriteLine(header);
    // Select() is lazy and cannot wrap a void call, so loop over the filtered lines instead.
    foreach (var line in File.ReadLines(filepath).Where(x => x.Contains(searchString)))
    {
        outFile.WriteLine(line);
    }
}

Please read this article on MemoryMappedFile:
http://www.dotnetperls.com/memorymappedfile-benchmark

Related

How to increment the value of a variable while writing to a txt file C#

I have the following program: a database (if you can call it that, since it is built on text files).
When writing to the text file, I need to increase the record id by one.
I cannot work out which method to use to implement a loop that increments the id. Can anyone tell me?
I have a method that appends a record to the text file from WPF text boxes:
using (StreamReader sr = new StreamReader("D:\\123.txt", true))
{
    while (sr.ReadLine() != null)
    {
        id++;
        using (StreamWriter txt = new StreamWriter("D:\\123.txt", true))
        {
            txt.WriteLine(string.Format("{0} {1} {2} {3} {4} {5} {6}\n", id, TBName, TBLName, TBMName, TBInfo, TBMat, TBFiz));
        }
        MessageBox.Show("Data saved successfully!");
    }
}
How can the id increase by 1 with each new entry in the text file?
The data is loaded into the DataGrid as follows:
private void Work()
{
    try
    {
        List<Student> list = new List<Student>();
        using (StreamReader sr = new StreamReader(fileName, true))
        {
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                var parsed = line.Split(' ');
                list.Add(new Student
                (
                    Convert.ToInt32(parsed[0]),
                    parsed[1],
                    parsed[2],
                    parsed[3],
                    Convert.ToInt32(parsed[4]),
                    Convert.ToInt32(parsed[5]),
                    Convert.ToInt32(parsed[6])
                ));
            }
        }
        DGridStudents.ItemsSource = list;
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }
}

The code shown has some problems. First, you cannot write to a file that is still open for reading through the StreamReader/StreamWriter classes. But even if you could, look closely at how the loop works: you open the file, read a line, write a new line to the same file, then read the next line (which could be the very line you have just written).
In the best case, the file would keep growing until it filled your disk.
To increment the id stored in the last line, you could use an approach like this:
// First read through the whole file and get its last line
int id = 0;
string lastLine = File.ReadLines("D:\\123.txt").LastOrDefault();
if (!string.IsNullOrEmpty(lastLine))
{
    // Now split the line and convert its first part to a number
    var parts = lastLine.Split();
    if (parts.Length > 0)
    {
        Int32.TryParse(parts[0], out id);
    }
}
// You can now increment and write the new line
id++;
using (StreamWriter txt = new StreamWriter("D:\\123.txt", true))
{
    txt.WriteLine($"{id} {TBName} {TBLName} {TBMName} {TBInfo} {TBMat} {TBFiz}");
}
This approach forces you to read the whole file to find the last line. However, you could add a second file (an index file) next to your txt file, with the same name but the idx extension. This file contains only the last number written:
int id = 0;
string firstLine = File.ReadLines("D:\\123.idx").FirstOrDefault();
if (!string.IsNullOrEmpty(firstLine))
{
    // Now split the line and convert its first part to a number
    var parts = firstLine.Split();
    if (parts.Length > 0)
    {
        Int32.TryParse(parts[0], out id);
    }
}
id++;
using (StreamWriter txt = new StreamWriter("D:\\123.txt", true))
{
    txt.WriteLine($"{id} {TBName} {TBLName} {TBMName} {TBInfo} {TBMat} {TBFiz}");
}
// Store the id just written so the next call can pick it up
File.WriteAllText("D:\\123.idx", id.ToString());
This second approach is probably better if the txt file is big, because it does not require reading the whole txt file, but it has more points of possible failure. You have two files to handle, which doubles the chance of IO errors, and of course we are not even considering the multiuser scenario.
A database, even a file-based one such as SQLite or Access, is better suited to these tasks.

How to randomly generate a word in a CSV file [duplicate]

This question already has answers here:
How do I generate a random integer in C#?
(31 answers)
Closed 3 years ago.
I have a CSV file on my Windows 10 machine. The file contains a list of rows, e.g.:
John, 5656, Phil, Simon,,Jude, Helen, Andy
Conor, 5656, Phil, Simon,,Jude, Helen, Andy
I am an automation tester using C#, Selenium and Visual Studio. In the application I am testing, there is an upload button which imports the CSV file.
How do I randomly change the second value automatically, so that it would be, say, 1234 on the first row and 4444 on the second row (just chosen at random)? I think I would need a random generator for this.
Any advice or snippets of code would be appreciated.

Do you want to modify the CSV file before it is uploaded to the program or after? Either way it would look something like this:
public string UpdateFile(string filePath)
{
    var modifiedLines = new List<string>();
    using (StreamReader sr = File.OpenText(filePath))
    {
        string s;
        while ((s = sr.ReadLine()) != null)
        {
            // RandomlyGeneratedSuffix() is a placeholder for your own generator.
            modifiedLines.Add(s + RandomlyGeneratedSuffix());
        }
    }
    string outputPath = "names.txt";
    using (StreamWriter sw = new StreamWriter(outputPath))
    {
        foreach (string line in modifiedLines)
        {
            sw.WriteLine(line);
        }
    }
    // Return the path of the new file.
    return outputPath;
}

Reading the file before uploading, changing the number in the second position of each CSV row, and writing it back to disk should work. Here is a very simple approach to help you get started:
var fileLines = File.ReadAllLines("file.csv");
var randomGenerator = new Random();
var newFileLines = new List<string>();
foreach (var fileLine in fileLines)
{
    var lineValues = fileLine.Split(',');
    // Replace the second value with a random number.
    lineValues[1] = randomGenerator.Next(1000, int.MaxValue).ToString();
    var newLine = string.Join(",", lineValues);
    newFileLines.Add(newLine);
}
File.WriteAllLines("file.csv", newFileLines);

Instead of updating an existing CSV file for testing I would generate a new one from code.
There are a lot of code examples online showing how to create a CSV file in C#, for example: Writing data into CSV file in C#
For random numbers you can use the random class: https://learn.microsoft.com/en-us/dotnet/api/system.random?view=netframework-4.7.2
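As a rough illustration of generating the file from code, here is a minimal sketch. The names TestCsvGenerator, BuildRows and WriteCsv are hypothetical, the column layout simply mirrors the sample rows from the question (without the spaces after the commas), and the four-digit range is an assumption based on the example values.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TestCsvGenerator
{
    // Builds one CSV row per name; the second column is a random 4-digit number.
    public static List<string> BuildRows(IEnumerable<string> names, Random random)
    {
        var rows = new List<string>();
        foreach (string name in names)
        {
            // The upper bound of Random.Next is exclusive, so this yields 1000..9999.
            int number = random.Next(1000, 10000);
            rows.Add(string.Join(",", name, number.ToString(),
                                 "Phil", "Simon", "", "Jude", "Helen", "Andy"));
        }
        return rows;
    }

    // Writes the generated rows to the given path, replacing any existing file.
    public static void WriteCsv(string path, IEnumerable<string> names, Random random)
    {
        File.WriteAllLines(path, BuildRows(names, random));
    }
}
```

A test could call TestCsvGenerator.WriteCsv("upload.csv", new[] { "John", "Conor" }, new Random()) right before driving the upload button, so every run imports fresh values.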

Getting the next line of a file or writing to the previous line in C# [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
Is it possible to either see what the next line of a file is or write to the previous line of a file?
I am reading through a roughly 13,000 line file and if a line matches one of my regular expressions I then change the line, if not, it stays the same. These lines are getting written to a new file. It looks like this, roughly of course.
//create Streamreader sr
//create Streamwriter sw
//loop through file by line
//if line matches REGEX, change it. Else, don't change it
//write line to new file
//if end of file, close sr and sw
I need to either look to the next for ENDREC so I can write a new line before it
OR if the current line is ENDREC I need to write to the line before it. Any ideas?

If loading the whole file into memory isn't a problem, try something like this:
public void Test()
{
    string fileName = "oldFileName";
    string newFileName = "newFileName";
    string[] allLines = File.ReadAllLines(fileName);
    string changedLine = "Changed";
    var changedLines = allLines.Select(p => ((Regex.IsMatch(p, "test")) ? changedLine : p));
    File.WriteAllLines(newFileName, changedLines);
}

How about something like this?
var regex = new Regex(@"[a-zA-Z0-9_]+", RegexOptions.Compiled);
using (var reader = File.OpenText("in.txt"))
using (var writer = File.CreateText("out.txt"))
{
    while (!reader.EndOfStream)
    {
        var line = reader.ReadLine();
        var match = regex.Match(line);
        if (match.Success)
        {
            // Alter the line however you wish here.
        }
        writer.WriteLine(line);
    }
    writer.Flush();
}
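For the ENDREC part of the question, no look-ahead is strictly needed: because the lines are being copied into a new file, "writing to the previous line" amounts to writing the extra line just before ENDREC is copied. A minimal sketch, assuming ENDREC sits alone on its line; the names EndRecInserter, Process and extraLine are hypothetical:

```csharp
using System.IO;

public static class EndRecInserter
{
    // Copies every line from reader to writer and emits extraLine
    // immediately before each line that consists of "ENDREC".
    public static void Process(TextReader reader, TextWriter writer, string extraLine)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim() == "ENDREC")
            {
                // ENDREC has not been written yet, so this lands on the line before it.
                writer.WriteLine(extraLine);
            }
            writer.WriteLine(line);
        }
    }
}
```

With files, this could be called as Process(File.OpenText("in.txt"), File.CreateText("out.txt"), "new line"), wrapped in using blocks as in the snippet above; the regex-based changes can be applied to line before it is written.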

Search string , if no match then delete line [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I am looking for code to open a text file and read it line by line; if a line (each line stores approximately 5 values) does not contain a certain value, e.g. "hart", then I want to remove that line. I am using C# and VS2012. Could anyone show me how to do this? The file being read is a CSV file. I have no code example here, as my current code does not work and I feel giving an example would cause more confusion than asking someone to show me a clean, fresh approach.
I have added the code I currently have, which writes all of the data to a text file; what I still need to figure out is how to take these results and filter them:
foreach (DataRow dr in this.CalcDataSet.Items)
{
    foreach (object field in dr.ItemArray)
    {
        str.Append(field.ToString() + ",");
    }
    str.Replace(",", "\n", str.Length - 1, 1);
}
try
{
    System.IO.File.WriteAllText(Filepath, str.ToString());
}
catch (Exception ex)
{
    MessageBox.Show("Write Error :" + ex.Message);
}
var lines = System.IO.File.ReadAllLines(Filepath).ToList();
var acceptedLines = new List<string>();
foreach (var line in lines)
    if (Matches(line))
        acceptedLines.Add(line);
System.IO.File.WriteAllLines(Filepath, acceptedLines);
}
private bool Matches(string s)
{
    if (s == cmbClientList.SelectedText.ToString())
    {
        return true;
    }
    else return false;
}

Use the TextFieldParser class to open and read the file, and split the values into an array. You can then examine each item on each line to see if it contains the value you want.
If the line contains the value, then write the line to a new file. If it doesn't contain the value, then do not write to the new file.
When you're done, close the input and output files. Then delete the original input file and rename the output file.
You can't easily read and modify a text file in-place.
Another option would be to read using TextFieldParser and write to an in-memory stream. At the end, write from the memory stream back to the original file. This will work if the file is small enough to fit in memory.
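The write-to-a-new-file-then-replace steps described above can be sketched roughly as follows. For brevity this sketch splits each line with a plain Split(',') rather than using TextFieldParser, so it does not handle quoted fields that contain commas. The names CsvLineFilter and KeepLinesWithValue and the ".tmp" suffix are hypothetical, and matching a whole field exactly (after trimming) is an assumption.

```csharp
using System.IO;
using System.Linq;

public static class CsvLineFilter
{
    // Keeps only the lines in which at least one comma-separated field equals `value`.
    public static void KeepLinesWithValue(string path, string value)
    {
        string tempPath = path + ".tmp";

        using (var writer = new StreamWriter(tempPath))
        {
            // File.ReadLines streams the input; the foreach closes it when done.
            foreach (string line in File.ReadLines(path))
            {
                string[] fields = line.Split(',');
                if (fields.Any(f => f.Trim() == value))
                {
                    writer.WriteLine(line);
                }
            }
        }

        // Both files are closed here, so the original can be replaced.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

Because the input is streamed and the output is written as it goes, memory use stays small even for large files; swapping f.Trim() == value for f.Contains(value) would give substring matching instead.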

This will basically do what you want:
var lines = System.IO.File.ReadAllLines("somefile.csv");
var acceptedLines = new List<string>();
foreach (var line in lines)
    if (Matches(line))
        acceptedLines.Add(line);
System.IO.File.WriteAllLines("output.csv", acceptedLines);
private bool Matches(string s) {
    // Whatever you want: return true to include the line, false to exclude it
}

You can do this:
string[] lines = File.ReadAllLines("yourfile.csv");
List<string> linesToWrite = new List<string>();
foreach (string s in lines)
{
    if (s.Contains("YourKeyValue"))
        linesToWrite.Add(s);
}
File.WriteAllLines("yourfile.csv", linesToWrite);

read only given last x lines in txt file [duplicate]

This question already has answers here:
Get last 10 lines of very large text file > 10GB
(21 answers)
Closed 9 years ago.
Currently I'm reading file content using File.ReadAllText(), but now I need to read last x lines in my txt file. How can I do that?
content of myfile.txt
line1content
line2content
line3content
line4content
string contentOfLastTwoLines = ...

What about this:
// Note: Reverse() buffers every line, and the result is in reverse order (last line first).
List<string> text = File.ReadLines("file.txt").Reverse().Take(2).ToList();

Use a Queue<string> to store the last X lines, dropping the oldest one each time a new line is read:
int x = 4; // number of lines you want to get
var buffor = new Queue<string>(x);
using (var file = new StreamReader("Input.txt"))
{
    while (!file.EndOfStream)
    {
        string line = file.ReadLine();
        if (buffor.Count >= x)
            buffor.Dequeue();
        buffor.Enqueue(line);
    }
}
string[] lastLines = buffor.ToArray();
string contentOfLastLines = String.Join(Environment.NewLine, lastLines);

You can use ReadLines to avoid reading the entire file into memory, like this:
const int neededLines = 5;
var lines = new List<String>();
foreach (var s in File.ReadLines("c:\\myfile.txt")) {
    lines.Add(s);
    if (lines.Count > neededLines) {
        lines.RemoveAt(0);
    }
}
Once the foreach loop is finished, the lines list contains up to the last neededLines lines of text from the file. Of course, if the file does not contain as many lines as required, fewer lines will be placed in the lines list.

Read the lines into an array, then extract the last two:
string[] lines = File.ReadAllLines("myfile.txt");
string last2 = lines[lines.Length - 2] + Environment.NewLine + lines[lines.Length - 1];
Assuming your file is reasonably small, it's easier to just read the whole thing and throw away what you don't need.

Reading a file is done linearly, usually line by line, so simply read line by line and remember the last two lines (you can use a queue if you want, or just two string variables). When you get to EOF, you'll have your last two lines.

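You want to read the file backwards using the ReverseLineReader class from this answer:
How to read a text file reversely with iterator in C#
Then run .Take(2) on it.
var lines = new ReverseLineReader(filename);
var last = lines.Take(2);
OR
Use a System.IO.StreamReader and keep the two most recent lines while reading to the end of the file.
string line1 = null, line2 = null;
using (StreamReader reader = new StreamReader("myFile.txt")) {
    string line;
    while ((line = reader.ReadLine()) != null) {
        line1 = line2; // second-to-last line read so far
        line2 = line;  // last line read so far
    }
}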
