How to split a large text file (32 GB) using C#

I tried to split a file of about 32 GB using the code below, but I got an out-of-memory exception.
Please suggest a way to split the file using C#.
string[] splitFile = File.ReadAllLines(@"E:\\JKS\\ImportGenius\\0.txt");
int cycle = 1;
int splitSize = Convert.ToInt32(txtNoOfLines.Text);
var chunk = splitFile.Take(splitSize);
var rem = splitFile.Skip(splitSize);
while (chunk.Take(1).Count() > 0)
{
    string filename = "file" + cycle.ToString() + ".txt";
    using (StreamWriter sw = new StreamWriter(filename))
    {
        foreach (string line in chunk)
        {
            sw.WriteLine(line);
        }
    }
    chunk = rem.Take(splitSize);
    rem = rem.Skip(splitSize);
    cycle++;
}

Well, to start with you need to use File.ReadLines (assuming you're using .NET 4) so that it doesn't try to read the whole thing into memory. Then I'd just keep calling a method to spit the "next" however many lines to a new file:
int splitSize = Convert.ToInt32(txtNoOfLines.Text);
using (var lineIterator = File.ReadLines(...).GetEnumerator())
{
    bool stillGoing = true;
    for (int chunk = 0; stillGoing; chunk++)
    {
        stillGoing = WriteChunk(lineIterator, splitSize, chunk);
    }
}
...
private static bool WriteChunk(IEnumerator<string> lineIterator,
                               int splitSize, int chunk)
{
    using (var writer = File.CreateText("file " + chunk + ".txt"))
    {
        for (int i = 0; i < splitSize; i++)
        {
            if (!lineIterator.MoveNext())
            {
                return false;
            }
            writer.WriteLine(lineIterator.Current);
        }
    }
    return true;
}

Do not read all lines into an array at once; use the StreamReader.ReadLine method instead, like:
using (StreamReader sr = new StreamReader(@"E:\\JKS\\ImportGenius\\0.txt"))
{
    while (sr.Peek() >= 0)
    {
        var fileLine = sr.ReadLine();
        // do something with line
    }
}

File.ReadAllLines
That will read the whole file into memory.
To work with large files you need to only read what you need now into memory, and then throw that away as soon as you have finished with it.
A better option would be File.ReadLines, which returns a lazy enumerator: data is only read into memory as you get the next line from the enumerator. Provided you avoid multiple enumerations (e.g. don't use Count()), only part of the file is in memory at any one time.

Instead of reading all the file at once using File.ReadAllLines, use File.ReadLines in a foreach loop to read the lines as needed.
foreach (var line in File.ReadLines(@"E:\\JKS\\ImportGenius\\0.txt"))
{
    // Do something
}
Edit: On an unrelated note, you don't have to escape your backslashes when prefixing the string with '@'. So write either "E:\\JKS\\ImportGenius\\0.txt" or @"E:\JKS\ImportGenius\0.txt"; in @"E:\\JKS\\ImportGenius\\0.txt" the doubled backslashes are redundant.

The problem here is that you are reading the entire file's content into memory at once with File.ReadAllLines(). What you need to do is open a FileStream with File.OpenRead() and read/write smaller chunks.
Edit: Actually for your case ReadLine is obviously better. See other answers. :)

Use a StreamReader to read the file, write with a StreamWriter.
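Putting the answers above together, a minimal sketch of the splitter could look like the following. The FileSplitter.Split helper and the output naming scheme (prefix + number + ".txt") are assumptions for illustration, not part of the question. It reads one line at a time and only opens a new output file when there is a line to put in it, so no empty trailing file is created.

```csharp
using System;
using System.IO;

static class FileSplitter
{
    // Hypothetical helper: splits inputPath into files holding at most
    // linesPerFile lines each and returns the number of files written.
    public static int Split(string inputPath, string outputPrefix, int linesPerFile)
    {
        if (linesPerFile <= 0)
            throw new ArgumentOutOfRangeException(nameof(linesPerFile));

        int fileCount = 0;
        using (var reader = new StreamReader(inputPath))
        {
            StreamWriter writer = null;
            try
            {
                int linesInCurrent = 0;
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    if (writer == null)
                    {
                        // Open the next output file lazily, only when a line is available.
                        fileCount++;
                        writer = new StreamWriter(outputPrefix + fileCount + ".txt");
                        linesInCurrent = 0;
                    }

                    writer.WriteLine(line);

                    if (++linesInCurrent == linesPerFile)
                    {
                        // Current chunk is full; close it so the next line starts a new file.
                        writer.Dispose();
                        writer = null;
                    }
                }
            }
            finally
            {
                writer?.Dispose();
            }
        }
        return fileCount;
    }
}
```

Only one line of the input is held in memory at a time, so the size of the source file should not matter.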

Related

Read & write a single line from a file without overwrite [duplicate]

I have two text files, Source.txt and Target.txt. The source will never be modified and contains N lines of text. I want to delete a specific line of text in Target.txt and replace it with a specific line of text from Source.txt. I know which line number I need: it is line 2 in both files.
I have something like this:
string line = string.Empty;
int line_number = 1;
int line_to_edit = 2;
using StreamReader reader = new StreamReader(@"C:\target.xml");
using StreamWriter writer = new StreamWriter(@"C:\target.xml");
while ((line = reader.ReadLine()) != null)
{
    if (line_number == line_to_edit)
        writer.WriteLine(line);
    line_number++;
}
But when I open the writer, the target file gets erased. It writes the lines, but when opened, the target file only contains the copied lines; the rest are lost.
What can I do?
The easiest way is:
static void lineChanger(string newText, string fileName, int line_to_edit)
{
    string[] arrLine = File.ReadAllLines(fileName);
    arrLine[line_to_edit - 1] = newText;
    File.WriteAllLines(fileName, arrLine);
}
Usage:
lineChanger("new content for this line", "sample.text", 34);
You can't rewrite a line without rewriting the entire file (unless the lines happen to be the same length). If your files are small then reading the entire target file into memory and then writing it out again might make sense. You can do that like this:
using System;
using System.IO;
class Program
{
static void Main(string[] args)
{
int line_to_edit = 2; // Warning: 1-based indexing!
string sourceFile = "source.txt";
string destinationFile = "target.txt";
// Read the appropriate line from the file.
string lineToWrite = null;
using (StreamReader reader = new StreamReader(sourceFile))
{
for (int i = 1; i <= line_to_edit; ++i)
lineToWrite = reader.ReadLine();
}
if (lineToWrite == null)
throw new InvalidDataException("Line does not exist in " + sourceFile);
// Read the old file.
string[] lines = File.ReadAllLines(destinationFile);
// Write the new file over the old file.
using (StreamWriter writer = new StreamWriter(destinationFile))
{
for (int currentLine = 1; currentLine <= lines.Length; ++currentLine)
{
if (currentLine == line_to_edit)
{
writer.WriteLine(lineToWrite);
}
else
{
writer.WriteLine(lines[currentLine - 1]);
}
}
}
}
}
If your files are large it would be better to create a new file so that you can read streaming from one file while you write to the other. This means that you don't need to have the whole file in memory at once. You can do that like this:
using System;
using System.IO;
class Program
{
static void Main(string[] args)
{
int line_to_edit = 2;
string sourceFile = "source.txt";
string destinationFile = "target.txt";
string tempFile = "target2.txt";
// Read the appropriate line from the file.
string lineToWrite = null;
using (StreamReader reader = new StreamReader(sourceFile))
{
for (int i = 1; i <= line_to_edit; ++i)
lineToWrite = reader.ReadLine();
}
if (lineToWrite == null)
throw new InvalidDataException("Line does not exist in " + sourceFile);
// Read from the target file and write to a new file.
int line_number = 1;
string line = null;
using (StreamReader reader = new StreamReader(destinationFile))
using (StreamWriter writer = new StreamWriter(tempFile))
{
while ((line = reader.ReadLine()) != null)
{
if (line_number == line_to_edit)
{
writer.WriteLine(lineToWrite);
}
else
{
writer.WriteLine(line);
}
line_number++;
}
}
// TODO: Delete the old file and replace it with the new file here.
}
}
You can move the file afterwards, once you are sure that the write operation has succeeded (no exception was thrown and the writer is closed).
Note that in both cases it is a bit confusing that you are using 1-based indexing for your line numbers. It might make more sense in your code to use 0-based indexing. You can have 1-based index in your user interface to your program if you wish, but convert it to a 0-indexed before sending it further.
Also, a disadvantage of directly overwriting the old file with the new file is that if it fails halfway through then you might permanently lose whatever data wasn't written. By writing to a third file first you only delete the original data after you are sure that you have another (corrected) copy of it, so you can recover the data if the computer crashes halfway through.
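The write-then-swap idea could be sketched as follows. The ReplaceLine helper, the 0-based line index, and the temporary file name (target path + ".tmp") are assumptions for illustration. File.Replace swaps the files only after the copy has been fully written and closed; passing null as the third argument means no backup of the old file is kept.

```csharp
using System.IO;

static class LineReplacer
{
    // Hypothetical helper: streams targetPath to a temporary file, substituting
    // newLine at the 0-based lineIndex, then swaps the temporary file into place.
    // Returns true if the requested line existed and was replaced.
    public static bool ReplaceLine(string targetPath, int lineIndex, string newLine)
    {
        string tempPath = targetPath + ".tmp"; // assumed temp location, same folder as the target
        bool replaced = false;

        using (var reader = new StreamReader(targetPath))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            int current = 0;
            while ((line = reader.ReadLine()) != null)
            {
                if (current == lineIndex)
                {
                    writer.WriteLine(newLine);
                    replaced = true;
                }
                else
                {
                    writer.WriteLine(line);
                }
                current++;
            }
        }

        // Both streams are closed here, so the original can be swapped out.
        File.Replace(tempPath, targetPath, null);
        return replaced;
    }
}
```

If anything throws while copying, the original file is untouched and only the temporary file needs cleaning up.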
A final remark: I noticed that your files had an xml extension. You might want to consider if it makes more sense for you to use an XML parser to modify the contents of the files instead of replacing specific lines.
When you create a StreamWriter this way, it always creates the file from scratch. You will have to create a third file, copy from the target while replacing what you need, and then replace the old file.
But as far as I can see, what you need is XML manipulation; you might want to use XmlDocument and modify your file using XPath.
You need to open the output file for write access rather than using a new StreamWriter, which always overwrites the output file.
FileStream stm = null;
FileInfo fi = new FileInfo(@"C:\target.xml");
if (fi.Exists)
    stm = fi.OpenWrite();
Of course, you will still have to seek to the correct line in the output file, which will be hard since you can't read from it, so unless you already KNOW the byte offset to seek to, you probably really want read/write access.
FileStream stm = fi.Open(FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None);
With this stream, you can read until you get to the point where you want to make changes, then write. Keep in mind that you are writing bytes, not lines, so to overwrite a line you will need to write the same number of bytes as the line you want to change.
I guess the below should work (instead of the writer part from your example). Unfortunately I have no build environment at hand, so it's from memory, but I hope it helps.
using (var fs = File.Open(filePath, FileMode.Open, FileAccess.ReadWrite))
{
    var destinationReader = new StreamReader(fs);
    var writer = new StreamWriter(fs);
    // Note: untested. StreamReader reads ahead into its own buffer, so the shared
    // stream position may not sit at a line boundary when the writer runs.
    while ((line = reader.ReadLine()) != null)
    {
        if (line_number == line_to_edit)
        {
            writer.WriteLine(lineToWrite);
        }
        else
        {
            destinationReader.ReadLine();
        }
        line_number++;
    }
}
The solution works fine. But I need to change a single line of text when the same text appears in multiple places. For this, define a trackText to start searching after, and then replace the line containing oldText with newText.
private int FindLineNumber(string fileName, string trackText, string oldText, string newText)
{
    int lineNumber = 0;
    bool traced = false; // becomes true once trackText has been seen
    string[] textLine = System.IO.File.ReadAllLines(fileName);
    for (int i = 0; i < textLine.Length; i++)
    {
        if (textLine[i].Contains(trackText)) // start looking for matching text from here on
            traced = true;
        if (traced)
            if (textLine[i].Contains(oldText)) // match text
            {
                textLine[i] = newText; // replace the line with the new one
                traced = false;
                System.IO.File.WriteAllLines(fileName, textLine);
                lineNumber = i;
                break; // leave the loop
            }
    }
    return lineNumber;
}

asp.net MVC Seed a database from a .TXT file with code first (over 10000 words) [duplicate]

I am using a list to limit the file size since the target is limited in disk and ram.
This is what I am doing now but is there a more efficient way?
readonly List<string> LogList = new List<string>();
...
var logFile = File.ReadAllLines(LOG_PATH);
foreach (var s in logFile) LogList.Add(s);
var logFile = File.ReadAllLines(LOG_PATH);
var logList = new List<string>(logFile);
Since logFile is an array, you can pass it to the List<T> constructor. This eliminates unnecessary overhead when iterating over the array, or using other IO classes.
Actual constructor implementation:
public List(IEnumerable<T> collection)
{
...
ICollection<T> c = collection as ICollection<T>;
if( c != null) {
int count = c.Count;
if (count == 0)
{
_items = _emptyArray;
}
else {
_items = new T[count];
c.CopyTo(_items, 0);
_size = count;
}
}
...
}
A small update to Evan Mulawski's answer to make it shorter:
List<string> allLinesText = File.ReadAllLines(fileName).ToList();
Why not use a generator instead?
private IEnumerable<string> ReadLogLines(string logPath) {
using(StreamReader reader = File.OpenText(logPath)) {
string line = "";
while((line = reader.ReadLine()) != null) {
yield return line;
}
}
}
Then you can use it like you would use the list:
var logFile = ReadLogLines(LOG_PATH);
foreach(var s in logFile) {
// Do whatever you need
}
Of course, if you need to have a List<string>, then you will need to keep the entire file contents in memory. There's really no way around that.
You can simply read it this way:
List<string> lines = System.IO.File.ReadLines(completePath).ToList();
[Edit]
If you are doing this to trim the beginning of a log file, you can avoid loading the entire file by doing something like this:
// count the number of lines in the file
int count = 0;
using (var sr = new StreamReader("file.txt"))
{
    while (sr.ReadLine() != null)
        count++;
}
// skip the first (count - LOG_MAX) lines so that the last LOG_MAX lines remain
count = count - LOG_MAX;
using (var sr = new StreamReader("file.txt"))
using (var sw = new StreamWriter("output.txt"))
{
    // skip several lines
    while (count > 0 && sr.ReadLine() != null)
        count--;
    // continue copying
    string line = "";
    while ((line = sr.ReadLine()) != null)
        sw.WriteLine(line);
}
First of all, since File.ReadAllLines loads the entire file into a string array (string[]), copying to a list is redundant.
Second, you must understand that a List is implemented using a dynamic array under the hood. This means that the CLR will need to allocate and copy several arrays until it can accommodate the entire file. Since the file is already on disk, you might consider trading speed for memory and working on disk data directly, or processing it in smaller chunks.
If you need to load it entirely in memory, at least try to leave it in an array:
string[] lines = File.ReadAllLines("file.txt");
If it really needs to be a List, load lines one by one:
List<string> lines = new List<string>();
using (var sr = new StreamReader("file.txt"))
{
while (sr.Peek() >= 0)
lines.Add(sr.ReadLine());
}
Note: List<T> has a constructor which accepts a capacity parameter. If you know the number of lines in advance, you can prevent multiple allocations by preallocating the array in advance:
List<string> lines = new List<string>(NUMBER_OF_LINES);
Even better, avoid storing the entire file in memory and process it "on the fly":
using (var sr = new StreamReader("file.txt"))
{
string line;
while ((line = sr.ReadLine()) != null)
{
// process the file line by line
}
}
Don't store it if possible. Just read through it if you are memory constrained. You can use a StreamReader:
using (var reader = new StreamReader("file.txt"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // process line here
    }
}
This can be wrapped in a method which yields each line as it is read, if you want to use LINQ.
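As a rough illustration of combining lazy reading with LINQ, the sketch below uses File.ReadLines, which already yields lines one at a time. The LogQueries.FirstMatches helper and its parameters are made up for this example. Because Where and Take are lazy, reading stops as soon as enough matching lines have been found.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class LogQueries
{
    // Hypothetical helper: returns at most max lines of the file that contain keyword.
    // Only the matching lines are kept in memory, not the whole file.
    public static List<string> FirstMatches(string path, string keyword, int max)
    {
        return File.ReadLines(path)                         // lazy: one line at a time
                   .Where(line => line.Contains(keyword))   // filter while streaming
                   .Take(max)                               // stop early once max lines are found
                   .ToList();
    }
}
```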
// this is only good in .NET 4
// read your file:
List<string> ReadFile = File.ReadAllLines(@"C:\TEMP\FILE.TXT").ToList();
// manipulate data here
foreach (string line in ReadFile)
{
    // do something here
}
// write back to your file:
File.WriteAllLines(@"C:\TEMP\FILE2.TXT", ReadFile);
List<string> lines = new List<string>();
using (var sr = new StreamReader("file.txt"))
{
while (sr.Peek() >= 0)
lines.Add(sr.ReadLine());
}
I would suggest this... of Groo's answer.
string inLine = reader.ReadToEnd();
myList = inLine.Split(new string[] { "\r\n" }, StringSplitOptions.None).ToList();
I also use Environment.NewLine.ToCharArray() as well, but found that it didn't work on a couple of files that did end in \r\n. Try either one and I hope it works well for you.
string inLine = reader.ReadToEnd();
myList = inLine.Split(new string[] { "\r\n" }, StringSplitOptions.None).ToList();
This answer misses the original point, which was that they were getting an OutOfMemory error. If you proceed with the above version, you are sure to hit it if your system does not have the appropriate CONTIGUOUS available ram to load the file.
You simply must break it into parts, and either store as List or String[] either way.

File cannot be accessed because it is being used by another program

I am trying to remove the space at the end of each line and then write that line to another file.
But when the program reaches the StreamWriter, it gives me the following error:
Process can't be accessed because it is being used by another process.
The code is as below.
private void FrmCounter_Load(object sender, EventArgs e)
{
string[] filePaths = Directory.GetFiles(@"D:\abc", "*.txt", SearchOption.AllDirectories);
string activeDir = @"D:\dest";
System.IO.StreamWriter fw;
string result;
foreach (string file in filePaths)
{
result = Path.GetFileName(file);
System.IO.StreamReader f = new StreamReader(file);
string newFileName = result;
// Combine the new file name with the path
string newPath = System.IO.Path.Combine(activeDir, newFileName);
File.Create(newPath);
fw = new StreamWriter(newPath);
int counter = 0;
int spaceAtEnd = 0;
string line;
// Read the file and display it line by line.
while ((line = f.ReadLine()) != null)
{
if (line.EndsWith(" "))
{
spaceAtEnd++;
line = line.Substring(0, line.Length - 1);
}
fw.WriteLine(line);
fw.Flush();
counter++;
}
MessageBox.Show("File Name : " + result);
MessageBox.Show("Total Space at end : " + spaceAtEnd.ToString());
f.Close();
fw.Close();
}
}
File.Create itself returns a stream.
Use that stream to write the file. You are receiving this error because the stream returned by File.Create is still open when you try to open the file again for writing.
Either close the stream returned by File.Create or, better, use that stream for writing:
Stream newFile = File.Create(newPath);
fw = new StreamWriter(newFile);
Even though you solved your initial problem, if you want to write everything into a new file in the original location, you can try to read all of the data into an array and close the original StreamReader. Performance note: If your file is sufficiently large though, this option is not going to be the best for performance.
And you don't need File.Create as the StreamWriter will create a file if it doesn't exist, or overwrite it by default or if you specify the append parameter as false.
result = Path.GetFileName(file);
String[] f = File.ReadAllLines(file); // major change here...
                                      // now f is an array containing all lines
                                      // instead of a stream reader
int spaceAtEnd = 0; // declared outside the using block so it is still in scope below
using (var fw = new StreamWriter(result, false))
{
    int counter = f.Length; // you aren't using counter anywhere, so I don't know if
                            // it is needed, but now you can just access the `Length`
                            // property of the array and get the length without a
                            // counter
    // Go through the lines and write each one out.
    foreach (var item in f)
    {
        var line = item;
        if (line.EndsWith(" "))
        {
            spaceAtEnd++;
            line = line.Substring(0, line.Length - 1);
        }
        fw.WriteLine(line);
        fw.Flush();
    }
}
MessageBox.Show("File Name : " + result);
MessageBox.Show("Total Space at end : " + spaceAtEnd.ToString());
Also, you will not remove multiple spaces from the end of the line using this method. If you need to do that, consider replacing line = line.Substring(0, line.Length - 1); with line = line.TrimEnd(' ');
You have to close any files you are reading before you attempt to write to them in your case.
Wrap the stream in a using statement, like:
using (System.IO.StreamReader f = new StreamReader(file))
{
//your code goes here
}
EDIT:
Zafar is correct, however, maybe this will clear things up.
Because File.Create returns a stream.. that stream has opened your destination file. This will make things clearer:
File.Create(newPath).Close();
Using the above line, makes it work, however, I would suggest re-writing that properly. This is just for illustrative purposes.
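A rewritten version might look like the sketch below. The TrailingSpaceTrimmer.Trim helper is an assumption for illustration and leaves out the question's form code and message boxes. It drops File.Create entirely, since StreamWriter creates the file itself, and wraps both the reader and the writer in using statements so that each file is closed even if an exception is thrown. It uses TrimEnd(' '), so several trailing spaces are removed at once.

```csharp
using System.IO;

static class TrailingSpaceTrimmer
{
    // Hypothetical helper: copies sourcePath to destPath with trailing spaces
    // removed from every line. Returns the number of lines that were changed.
    public static int Trim(string sourcePath, string destPath)
    {
        int trimmedLines = 0;

        using (var reader = new StreamReader(sourcePath))
        using (var writer = new StreamWriter(destPath, false)) // false = overwrite, creates the file if missing
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string cleaned = line.TrimEnd(' ');
                if (cleaned.Length != line.Length)
                    trimmedLines++;
                writer.WriteLine(cleaned);
            }
        }

        return trimmedLines;
    }
}
```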

Fastest way to find strings in a file

I have a log file that is usually no more than 10 KB (the file size can go up to 2 MB at most) and I want to find out whether at least one group of these strings occurs in the file. These strings will be on different lines, like:
ACTION:.......
INPUT:...........
RESULT:..........
I need to know whether at least one such group exists in the file. I have to do this about 100 times for a test (the log is different each time, so I have to reload and read the log), so I am looking for the fastest and best way to do this.
I looked in the forums for the fastest way, but I don't think my file is big enough for those solutions.
Thanks for looking.
I would read it line by line and check the conditions. Once you have seen a group you can quit. This way you don't need to read the whole file into memory. Like this:
public bool ContainsGroup(string file)
{
using (var reader = new StreamReader(file))
{
var hasAction = false;
var hasInput = false;
var hasResult = false;
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (!hasAction)
{
if (line.StartsWith("ACTION:"))
hasAction = true;
}
else if (!hasInput)
{
if (line.StartsWith("INPUT:"))
hasInput = true;
}
else if (!hasResult)
{
if (line.StartsWith("RESULT:"))
hasResult = true;
}
if (hasAction && hasInput && hasResult)
return true;
}
return false;
}
}
This code checks if there is a line starting with ACTION then one with INPUT and then one with RESULT. If the order of those is not important then you can omit the if () else if () checks. In case the line does not start with the strings replace StartsWith with Contains.
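For the order-independent case mentioned above, a sketch could look like this. The MarkerScanner.ContainsAllMarkers name is made up for illustration. Each flag is set whenever its prefix is seen, regardless of which one comes first, and the scan stops as soon as all three have been found.

```csharp
using System;
using System.IO;

static class MarkerScanner
{
    // Hypothetical helper: returns true if the file has at least one line starting
    // with each of "ACTION:", "INPUT:" and "RESULT:", in any order.
    public static bool ContainsAllMarkers(string path)
    {
        bool hasAction = false, hasInput = false, hasResult = false;

        foreach (string line in File.ReadLines(path))
        {
            if (line.StartsWith("ACTION:", StringComparison.Ordinal))
                hasAction = true;
            else if (line.StartsWith("INPUT:", StringComparison.Ordinal))
                hasInput = true;
            else if (line.StartsWith("RESULT:", StringComparison.Ordinal))
                hasResult = true;

            // Stop reading as soon as every marker has been seen.
            if (hasAction && hasInput && hasResult)
                return true;
        }

        return false;
    }
}
```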
Here's one possible way to do it:
string fileContents;
string[] logFiles = Directory.GetFiles(@"C:\Logs");
foreach (string file in logFiles)
{
    using (StreamReader sr = new StreamReader(file))
    {
        fileContents = sr.ReadToEnd();
        if (fileContents.Contains("ACTION:") || fileContents.Contains("INPUT:") || fileContents.Contains("RESULT:"))
        {
            // Do what you need to here
        }
    }
}
You may need to do some variation based on your exact implementation needs - for example, what if the word spans two lines, does the line need to start with the word, etc.
Added
Alternate line-by-line check:
string line;
string[] logFiles = Directory.GetFiles(@"C:\Logs");
foreach (string file in logFiles)
{
    using (StreamReader sr = new StreamReader(file))
    {
        while ((line = sr.ReadLine()) != null)
        {
            if (line.Contains("ACTION:") || line.Contains("INPUT:") || line.Contains("RESULT:"))
            {
                // Do what you need to here
            }
        }
    }
}
Take a look at How to Read Text From a File. You might also want to take a look at the String.Contains() method.
Basically you will loop through all the files. For each file read line-by-line and see if any of the lines contains 1 of your special "Sections".
You don't have much of a choice with text files when it comes to efficiency. The easiest way would definitely be to loop through each line of data. When you grab a line in a string, split it on the spaces. Then match those words to your words until you find a match. Then do whatever you need.
I don't know how to do it in C#, but in VB it would be something like...
Dim yourString As String
Dim words As String()
Do While objReader.Peek() <> -1
    yourString = objReader.ReadLine()
    words = yourString.Split(" "c)
    For Each word In words
        If Myword = word Then
            ' do stuff
        End If
    Next
Loop
Hope that helps
This code sample searches for strings in a large text file. The words are contained in a HashSet. It writes the found lines in a temp file.
if (File.Exists(@"temp.txt")) File.Delete(@"temp.txt");
String line;
String oldLine = "";
using (var fs = File.OpenRead(largeFileName))
using (var sr = new StreamReader(fs, Encoding.UTF8, true))
{
    HashSet<String> hash = new HashSet<String>();
    hash.Add("house");
    using (var sw = new StreamWriter(@"temp.txt"))
    {
        while ((line = sr.ReadLine()) != null)
        {
            foreach (String str in hash)
            {
                if (oldLine.Contains(str))
                {
                    sw.WriteLine(oldLine);
                    // write the next line as well (optional)
                    sw.WriteLine(line + "\r\n");
                }
            }
            oldLine = line;
        }
    }
}
