C# language compare - c#

i wan to compare two xml file and find their differences and similarities,
private void checkLanguage(string file1, string file2)
{
XmlDocument xmldoc1 = new XmlDocument();
XmlDocument xmldoc2 = new XmlDocument();
XmlNodeList xmlnode1;
XmlNodeList xmlnode2;
int i = 0;
int j = 0;
string str = null;
FileStream fs1 = new FileStream(file1, FileMode.Open, FileAccess.Read);
xmldoc1.Load(fs1);
FileStream fs2 = new FileStream(file2, FileMode.Open, FileAccess.Read);
xmldoc2.Load(fs2);
xmlnode1 = xmldoc1.GetElementsByTagName("data");
xmlnode2 = xmldoc2.GetElementsByTagName("data");
for (i = 0; i <= xmlnode1.Count - 1; i++)
{
str = xmlnode1[i].Attributes["name"].Value;
for (j = 0; j <= xmlnode2.Count - 1; j++)
{
if (str == xmlnode2[j].Attributes["name"].Value)
{
lblResult.ForeColor = Color.Green;
lblResult.Text += Environment.NewLine + xmlnode1[i].Attributes["name"].Value;
}
else
{
label4.ForeColor = Color.Red;
label4.Text += Environment.NewLine + xmlnode1[i].Attributes["name"].Value;
}
}
}
}
my problem is the similar languange in both xml file list out in the difference field also..how to solve this..
can someone help me? thanks

Best way to do this is using linq to xml. Load both xml's and reachout to language value to compare. The same way for other tags.
Thanks
AHARI

If doing a general diff is adequate I wouldn't use an XML parser at all. Instead, I'd use Google's DiffMatchPatch library. Here is some sample code below;
var dmf = new diff_match_patch();
var diffs = dmf.diff_main(FileOneAsString, FileTwoAsString);
After that you have a list of all the diff's. These are categorized as EQUAL, ADD, or DELETE. They show what transformations have to happen in file2 to arrive at file1. To make it more human readable I'd recommend using the semantic clean up method. There's an int you can set that determines how major the clean up is (I don't remember the name but you can find it easily if you go this route).
dmf.diff_cleanupSemantic(diffs);
At that point you can see what's equal and what's not by looping over the diffs, and looking at the strings (the diff is an object with the type and the string).
The library can be downloaded here http://code.google.com/p/google-diff-match-patch/ and you can test out how much semantic cleanup you want to do using the diff demo.
If all you want is the differences in the two files this will be a lot simpler than using an xml parser.

Related

Read & write a single line from a file without overwrite [duplicate]

I have two text files, Source.txt and Target.txt. The source will never be modified and contain N lines of text. So, I want to delete a specific line of text in Target.txt, and replace by an specific line of text from Source.txt, I know what number of line I need, actually is the line number 2, both files.
I haven something like this:
string line = string.Empty;
int line_number = 1;
int line_to_edit = 2;
using StreamReader reader = new StreamReader(#"C:\target.xml");
using StreamWriter writer = new StreamWriter(#"C:\target.xml");
while ((line = reader.ReadLine()) != null)
{
if (line_number == line_to_edit)
writer.WriteLine(line);
line_number++;
}
But when I open the Writer, the target file get erased, it writes the lines, but, when opened, the target file only contains the copied lines, the rest get lost.
What can I do?
the easiest way is :
static void lineChanger(string newText, string fileName, int line_to_edit)
{
string[] arrLine = File.ReadAllLines(fileName);
arrLine[line_to_edit - 1] = newText;
File.WriteAllLines(fileName, arrLine);
}
usage :
lineChanger("new content for this line" , "sample.text" , 34);
You can't rewrite a line without rewriting the entire file (unless the lines happen to be the same length). If your files are small then reading the entire target file into memory and then writing it out again might make sense. You can do that like this:
using System;
using System.IO;
class Program
{
static void Main(string[] args)
{
int line_to_edit = 2; // Warning: 1-based indexing!
string sourceFile = "source.txt";
string destinationFile = "target.txt";
// Read the appropriate line from the file.
string lineToWrite = null;
using (StreamReader reader = new StreamReader(sourceFile))
{
for (int i = 1; i <= line_to_edit; ++i)
lineToWrite = reader.ReadLine();
}
if (lineToWrite == null)
throw new InvalidDataException("Line does not exist in " + sourceFile);
// Read the old file.
string[] lines = File.ReadAllLines(destinationFile);
// Write the new file over the old file.
using (StreamWriter writer = new StreamWriter(destinationFile))
{
for (int currentLine = 1; currentLine <= lines.Length; ++currentLine)
{
if (currentLine == line_to_edit)
{
writer.WriteLine(lineToWrite);
}
else
{
writer.WriteLine(lines[currentLine - 1]);
}
}
}
}
}
If your files are large it would be better to create a new file so that you can read streaming from one file while you write to the other. This means that you don't need to have the whole file in memory at once. You can do that like this:
using System;
using System.IO;
class Program
{
static void Main(string[] args)
{
int line_to_edit = 2;
string sourceFile = "source.txt";
string destinationFile = "target.txt";
string tempFile = "target2.txt";
// Read the appropriate line from the file.
string lineToWrite = null;
using (StreamReader reader = new StreamReader(sourceFile))
{
for (int i = 1; i <= line_to_edit; ++i)
lineToWrite = reader.ReadLine();
}
if (lineToWrite == null)
throw new InvalidDataException("Line does not exist in " + sourceFile);
// Read from the target file and write to a new file.
int line_number = 1;
string line = null;
using (StreamReader reader = new StreamReader(destinationFile))
using (StreamWriter writer = new StreamWriter(tempFile))
{
while ((line = reader.ReadLine()) != null)
{
if (line_number == line_to_edit)
{
writer.WriteLine(lineToWrite);
}
else
{
writer.WriteLine(line);
}
line_number++;
}
}
// TODO: Delete the old file and replace it with the new file here.
}
}
You can afterwards move the file once you are sure that the write operation has succeeded (no excecption was thrown and the writer is closed).
Note that in both cases it is a bit confusing that you are using 1-based indexing for your line numbers. It might make more sense in your code to use 0-based indexing. You can have 1-based index in your user interface to your program if you wish, but convert it to a 0-indexed before sending it further.
Also, a disadvantage of directly overwriting the old file with the new file is that if it fails halfway through then you might permanently lose whatever data wasn't written. By writing to a third file first you only delete the original data after you are sure that you have another (corrected) copy of it, so you can recover the data if the computer crashes halfway through.
A final remark: I noticed that your files had an xml extension. You might want to consider if it makes more sense for you to use an XML parser to modify the contents of the files instead of replacing specific lines.
When you create a StreamWriter it always create a file from scratch, you will have to create a third file and copy from target and replace what you need, and then replace the old one.
But as I can see what you need is XML manipulation, you might want to use XmlDocument and modify your file using Xpath.
You need to Open the output file for write access rather than using a new StreamReader, which always overwrites the output file.
StreamWriter stm = null;
fi = new FileInfo(#"C:\target.xml");
if (fi.Exists)
stm = fi.OpenWrite();
Of course, you will still have to seek to the correct line in the output file, which will be hard since you can't read from it, so unless you already KNOW the byte offset to seek to, you probably really want read/write access.
FileStream stm = fi.Open(FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None);
with this stream, you can read until you get to the point where you want to make changes, then write. Keep in mind that you are writing bytes, not lines, so to overwrite a line you will need to write the same number of characters as the line you want to change.
I guess the below should work (instead of the writer part from your example). I'm unfortunately with no build environment so It's from memory but I hope it helps
using (var fs = File.Open(filePath, FileMode.Open, FileAccess.ReadWrite)))
{
var destinationReader = StreamReader(fs);
var writer = StreamWriter(fs);
while ((line = reader.ReadLine()) != null)
{
if (line_number == line_to_edit)
{
writer.WriteLine(lineToWrite);
}
else
{
destinationReader .ReadLine();
}
line_number++;
}
}
The solution works fine. But I need to change single-line text when the same text is in multiple places. For this, need to define a trackText to start finding after that text and finally change oldText with newText.
private int FindLineNumber(string fileName, string trackText, string oldText, string newText)
{
int lineNumber = 0;
string[] textLine = System.IO.File.ReadAllLines(fileName);
for (int i = 0; i< textLine.Length;i++)
{
if (textLine[i].Contains(trackText)) //start finding matching text after.
traced = true;
if (traced)
if (textLine[i].Contains(oldText)) // Match text
{
textLine[i] = newText; // replace text with new one.
traced = false;
System.IO.File.WriteAllLines(fileName, textLine);
lineNumber = i;
break; //go out from loop
}
}
return lineNumber
}

How can I read a Lync conversation file containing HTML?

I'm having trouble reading a local file, into a string, in c#.
Here's what I came up with till now:
string file = #"C:\script_test\{5461EC8C-89E6-40D1-8525-774340083829}.html";
using (StreamReader reader = new StreamReader(file))
{
string line = "";
while ((line = reader.ReadLine()) != null)
{
textBox1.Text += line.ToString();
}
}
And it's the only solution that seems to work.
I've tried some other suggested methods for reading a file, such as:
string file = #"C:\script_test\{5461EC8C-89E6-40D1-8525-774340083829}.html";
string html = File.ReadAllText(file).ToString();
textBox1.Text += html;
Yet it does not work as expected.
Here are the first few lines of the file i'm trying to read:
as you can see, it has some funky characters, honestly I don't know if that's the cause of this weird behavior.
But in the first case, the code seems to skip those lines, printing only "Document generated by Office Communicator..."
Your task would be easier if you could use an API or the SDK or even would have a description of the format you try to read. However the binary format looks not to be that complicated and with an hexviewer installed I got this far to get the html out of the example you provided.
To parse non-text files you fall-back to the BinaryReader and then use one of the Read methods to read the correct type from the bytestream. I used ReadByte and ReadInt32. Notice how in the description of the method is explained how many bytes are read. That becomes handy when you try to decipher your file.
private string ParseHist(string file)
{
using (var f = File.Open(file, FileMode.Open))
{
using (var br = new BinaryReader(f))
{
// read 4 bytes as an int
var first = br.ReadInt32();
// read integer / zero ended byte arrays as string
var lead = br.ReadInt32();
// until we have 4 zero bytes
while (lead != 0)
{
var user = ParseString(br);
Trace.Write(lead);
Trace.Write(":");
Trace.Write(user.Length);
Trace.Write(":");
Trace.WriteLine(user);
lead = br.ReadInt32();
// weird special case
if (lead == 2)
{
lead = br.ReadInt32();
}
}
// at the start of the html block
var htmllen = br.ReadInt32();
Trace.WriteLine(htmllen);
// parse the html
var html = ParseString(br);
Trace.Write(len);
Trace.Write(":");
Trace.Write(html.Length);
Trace.Write(":");
Trace.WriteLine(html);
// other structures follow, left unparsed
return html.ToString();
}
}
}
// a string seems to be ascii encoded and ends with a zero byte.
private static string ParseString(BinaryReader br)
{
var ch = br.ReadByte();
var sb = new StringBuilder();
while (ch != 0)
{
sb.Append((char)ch);
ch = br.ReadByte();
}
return sb.ToString();
}
You could use the simple parsing logic in a winform application as follows:
private void button1_Click(object sender, EventArgs e)
{
webBrowser1.DocumentText = ParseHist(#"5461EC8C-89E6-40D1-8525-774340083829-Copia.html");
}
Keep in mind that this is not bullet proof or the recommended way but it should get you started. For files that don't parse well you'll need to go back to the hexviewer and work-out what other byte structures are new or different from what you already had. That is not something I intend to help you with, that is left as an exercise for you to figure out.
I don't know if it's the right way to answer this, but here's what I've managed to do so far:
string file = #"C:\script_test\{1C0365BC-54C6-4D31-A1C1-586C4575F9EA}.hist";
string outText = "";
//Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
StreamReader reader = new StreamReader(file, utf8);
char[] text = reader.ReadToEnd().ToCharArray();
//skip first n chars
/*
for (int i = 250; i < text.Length; i++)
{
outText += text[i];
}
*/
for (int i = 0; i < text.Length; i++)
{
//skips non printable characters
if (!Char.IsControl(text[i]))
{
outText += text[i];
}
}
string source = "";
source = WebUtility.HtmlDecode(outText);
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(source);
string html = "<html><style>";
foreach (HtmlNode node in htmlDoc.DocumentNode.SelectNodes("//style"))
{
html += node.InnerHtml+ Environment.NewLine;
}
html += "</style><body>";
foreach (HtmlNode node in htmlDoc.DocumentNode.SelectNodes("//body"))
{
html += node.InnerHtml + Environment.NewLine;
}
html += "</body></html>";
richTextBox1.Text += html+Environment.NewLine;
webBrowser1.DocumentText = html;
The conversation displays correctly, both style and encoding.
So it's a start for me.
Thank you all for the support!
EDIT
Char.IsControl(char)
skips non printable characters :)

Asynchronous Call using Delegate

I want that separate Async threads of method splitFile should run so that the task will become faster but below code is not working. When I debug , it reaches till line RecCnt = File.ReadAllLines(SourceFile).Length - 1; and comes out. Please help.
public delegate void SplitFile_Delegate(FileInfo file);
static void Main(string[] args)
{
DirectoryInfo d = new DirectoryInfo(#"D:\test\Perf testing Splitter"); //Assuming Test is your Folder
FileInfo[] Files = d.GetFiles("*.txt"); //Getting Text files
foreach (FileInfo file in Files)
{
SplitFile_Delegate LocalDelegate = new SplitFile_Delegate(SplitFile);
IAsyncResult R = LocalDelegate.BeginInvoke(file, null, null); //invoking the method
LocalDelegate.EndInvoke(R);
}
}
private static void SplitFile(FileInfo file)
{
try
{
String fname;
//int FileLength;
int RecCnt;
int fileCount;
fname = file.Name;
String SourceFile = #"D:\test\Perf testing Splitter\" + file.Name;
RecCnt = File.ReadAllLines(SourceFile).Length - 1;
fileCount = RecCnt / 10000;
FileStream fs = new FileStream(SourceFile, FileMode.Open);
using (StreamReader sr = new StreamReader(fs))
{
while (!sr.EndOfStream)
{
String dataLine = sr.ReadLine();
for (int x = 0; x < (fileCount + 1); x++)
{
String Filename = #"D:\test\Perf testing Splitter\Destination Files\" + fname + "_" + x + "by" + (fileCount + 1) + ".txt"; //test0by4
using (StreamWriter Writer = file.AppendText(Filename))
{
for (int y = 0; y < 10000; y++)
{
Writer.WriteLine(dataLine);
dataLine = sr.ReadLine();
}
Writer.Close();
}
}
}
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
}
Your code doesn't really need any multi-threading. It doesn't really even need asynchronous processing all that much - you're saturating the I/O most likely, and unless you've got multiple drives as the data sources, you're not going to improve that by adding parallelism.
On the other hand, your code is reading each file twice. For no reason, wasting memory, time and even CPU. Instead, just do this:
FileStream fs = new FileStream(SourceFile, FileMode.Open);
using (StreamReader sr = new StreamReader(fs))
{
string line;
string fileName = null;
StreamWriter outputFile = null;
int lineCounter = 0;
int outputFileIndex = 0;
while ((line = sr.ReadLine()) != null)
{
if (fileName == null || lineCounter >= 10000)
{
lineCounter = 0;
outputFileIndex++;
fileName = #"D:\Output\" + fname + "_" + outputFileIndex + ".txt";
if (outputFile != null) outputFile.Dispose();
outputFile = File.AppendText(fileName);
}
outputFile.WriteLine(line);
lineCounter++;
}
}
If you really need to have the filename in format XOutOfY, you can just rename them afterwards - it's a lot cheaper than reading the source file twice, line after line. Or, if you don't care about keeping the whole file in memory at once, just use the array you got from ReadAllLines and iterate over that, rather than doing the reading all over again.
To make this even easier, you can also use foreach (var line in File.ReadLines(fileName)).
If you really want to make this asynchronous, the way to handle that is by using asynchronous I/O, not just by spooling new threads. So you can use await with StreamReader.ReadLineAsync etc.
You are not required to call EndInvoke and really all EndInvoke does is wait on the return value for you. Since SplitFile returns void, my guess is there's an optimization that kicks in because you don't need to wait on anything and it simply ignores the wait. For more details: C# Asynchronous call without EndInvoke?
That being said, your usage of Begin/EndInvoke will likely not be faster than serial programming (and will likely be marginally slower) as your for loop is still serialized, and you're still running the iteration in serial. All that has changed is you're using a delegate where it looks like one isn't necessary.
It's possible that what you meant to use was Parallel.ForEach (MSDN: https://msdn.microsoft.com/en-us/library/dd992001(v=vs.110).aspx) which will potentially run iterations in parallel.
Edit: As someone else has mentioned, having multiple threads engage in file operations will likely not improve performance as your file ops are probably disk bound. The main benefit you would get from an async file read/write would probably be unblocking the main thread for a UI update. You will need to specify what you want with "performance" if you want a better answer.

Replace a line in text file without creating another file [duplicate]

I have two text files, Source.txt and Target.txt. The source will never be modified and contain N lines of text. So, I want to delete a specific line of text in Target.txt, and replace by an specific line of text from Source.txt, I know what number of line I need, actually is the line number 2, both files.
I haven something like this:
string line = string.Empty;
int line_number = 1;
int line_to_edit = 2;
using StreamReader reader = new StreamReader(#"C:\target.xml");
using StreamWriter writer = new StreamWriter(#"C:\target.xml");
while ((line = reader.ReadLine()) != null)
{
if (line_number == line_to_edit)
writer.WriteLine(line);
line_number++;
}
But when I open the Writer, the target file get erased, it writes the lines, but, when opened, the target file only contains the copied lines, the rest get lost.
What can I do?
the easiest way is :
static void lineChanger(string newText, string fileName, int line_to_edit)
{
string[] arrLine = File.ReadAllLines(fileName);
arrLine[line_to_edit - 1] = newText;
File.WriteAllLines(fileName, arrLine);
}
usage :
lineChanger("new content for this line" , "sample.text" , 34);
You can't rewrite a line without rewriting the entire file (unless the lines happen to be the same length). If your files are small then reading the entire target file into memory and then writing it out again might make sense. You can do that like this:
using System;
using System.IO;
class Program
{
static void Main(string[] args)
{
int line_to_edit = 2; // Warning: 1-based indexing!
string sourceFile = "source.txt";
string destinationFile = "target.txt";
// Read the appropriate line from the file.
string lineToWrite = null;
using (StreamReader reader = new StreamReader(sourceFile))
{
for (int i = 1; i <= line_to_edit; ++i)
lineToWrite = reader.ReadLine();
}
if (lineToWrite == null)
throw new InvalidDataException("Line does not exist in " + sourceFile);
// Read the old file.
string[] lines = File.ReadAllLines(destinationFile);
// Write the new file over the old file.
using (StreamWriter writer = new StreamWriter(destinationFile))
{
for (int currentLine = 1; currentLine <= lines.Length; ++currentLine)
{
if (currentLine == line_to_edit)
{
writer.WriteLine(lineToWrite);
}
else
{
writer.WriteLine(lines[currentLine - 1]);
}
}
}
}
}
If your files are large it would be better to create a new file so that you can read streaming from one file while you write to the other. This means that you don't need to have the whole file in memory at once. You can do that like this:
using System;
using System.IO;
class Program
{
static void Main(string[] args)
{
int line_to_edit = 2;
string sourceFile = "source.txt";
string destinationFile = "target.txt";
string tempFile = "target2.txt";
// Read the appropriate line from the file.
string lineToWrite = null;
using (StreamReader reader = new StreamReader(sourceFile))
{
for (int i = 1; i <= line_to_edit; ++i)
lineToWrite = reader.ReadLine();
}
if (lineToWrite == null)
throw new InvalidDataException("Line does not exist in " + sourceFile);
// Read from the target file and write to a new file.
int line_number = 1;
string line = null;
using (StreamReader reader = new StreamReader(destinationFile))
using (StreamWriter writer = new StreamWriter(tempFile))
{
while ((line = reader.ReadLine()) != null)
{
if (line_number == line_to_edit)
{
writer.WriteLine(lineToWrite);
}
else
{
writer.WriteLine(line);
}
line_number++;
}
}
// TODO: Delete the old file and replace it with the new file here.
}
}
You can afterwards move the file once you are sure that the write operation has succeeded (no excecption was thrown and the writer is closed).
Note that in both cases it is a bit confusing that you are using 1-based indexing for your line numbers. It might make more sense in your code to use 0-based indexing. You can have 1-based index in your user interface to your program if you wish, but convert it to a 0-indexed before sending it further.
Also, a disadvantage of directly overwriting the old file with the new file is that if it fails halfway through then you might permanently lose whatever data wasn't written. By writing to a third file first you only delete the original data after you are sure that you have another (corrected) copy of it, so you can recover the data if the computer crashes halfway through.
A final remark: I noticed that your files had an xml extension. You might want to consider if it makes more sense for you to use an XML parser to modify the contents of the files instead of replacing specific lines.
When you create a StreamWriter it always create a file from scratch, you will have to create a third file and copy from target and replace what you need, and then replace the old one.
But as I can see what you need is XML manipulation, you might want to use XmlDocument and modify your file using Xpath.
You need to Open the output file for write access rather than using a new StreamReader, which always overwrites the output file.
StreamWriter stm = null;
fi = new FileInfo(#"C:\target.xml");
if (fi.Exists)
stm = fi.OpenWrite();
Of course, you will still have to seek to the correct line in the output file, which will be hard since you can't read from it, so unless you already KNOW the byte offset to seek to, you probably really want read/write access.
FileStream stm = fi.Open(FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None);
with this stream, you can read until you get to the point where you want to make changes, then write. Keep in mind that you are writing bytes, not lines, so to overwrite a line you will need to write the same number of characters as the line you want to change.
I guess the below should work (instead of the writer part from your example). I'm unfortunately with no build environment so It's from memory but I hope it helps
using (var fs = File.Open(filePath, FileMode.Open, FileAccess.ReadWrite)))
{
var destinationReader = StreamReader(fs);
var writer = StreamWriter(fs);
while ((line = reader.ReadLine()) != null)
{
if (line_number == line_to_edit)
{
writer.WriteLine(lineToWrite);
}
else
{
destinationReader .ReadLine();
}
line_number++;
}
}
The solution works fine. But I need to change single-line text when the same text is in multiple places. For this, need to define a trackText to start finding after that text and finally change oldText with newText.
private int FindLineNumber(string fileName, string trackText, string oldText, string newText)
{
int lineNumber = 0;
string[] textLine = System.IO.File.ReadAllLines(fileName);
for (int i = 0; i< textLine.Length;i++)
{
if (textLine[i].Contains(trackText)) //start finding matching text after.
traced = true;
if (traced)
if (textLine[i].Contains(oldText)) // Match text
{
textLine[i] = newText; // replace text with new one.
traced = false;
System.IO.File.WriteAllLines(fileName, textLine);
lineNumber = i;
break; //go out from loop
}
}
return lineNumber
}

How to split the large text file(32 GB) using C#

I tried to split the file about 32GB using the below code but I got the memory exception.
Please suggest me to split the file using C#.
string[] splitFile = File.ReadAllLines(#"E:\\JKS\\ImportGenius\\0.txt");
int cycle = 1;
int splitSize = Convert.ToInt32(txtNoOfLines.Text);
var chunk = splitFile.Take(splitSize);
var rem = splitFile.Skip(splitSize);
while (chunk.Take(1).Count() > 0)
{
string filename = "file" + cycle.ToString() + ".txt";
using (StreamWriter sw = new StreamWriter(filename))
{
foreach (string line in chunk)
{
sw.WriteLine(line);
}
}
chunk = rem.Take(splitSize);
rem = rem.Skip(splitSize);
cycle++;
}
Well, to start with you need to use File.ReadLines (assuming you're using .NET 4) so that it doesn't try to read the whole thing into memory. Then I'd just keep calling a method to spit the "next" however many lines to a new file:
int splitSize = Convert.ToInt32(txtNoOfLines.Text);
using (var lineIterator = File.ReadLines(...).GetEnumerator())
{
bool stillGoing = true;
for (int chunk = 0; stillGoing; chunk++)
{
stillGoing = WriteChunk(lineIterator, splitSize, chunk);
}
}
...
private static bool WriteChunk(IEnumerator<string> lineIterator,
int splitSize, int chunk)
{
using (var writer = File.CreateText("file " + chunk + ".txt"))
{
for (int i = 0; i < splitSize; i++)
{
if (!lineIterator.MoveNext())
{
return false;
}
writer.WriteLine(lineIterator.Current);
}
}
return true;
}
Do not read immediately all lines into an array, but use StremReader.ReadLine method, like:
using (StreamReader sr = new StreamReader(#"E:\\JKS\\ImportGenius\\0.txt"))
{
while (sr.Peek() >= 0)
{
var fileLine = sr.ReadLine();
//do something with line
}
}
File.ReadAllLines
That will read the whole file into memory.
To work with large files you need to only read what you need now into memory, and then throw that away as soon as you have finished with it.
A better option would be File.ReadLines which returns a lazy enumerator, data is only read into memory as you get the next line from the enumerator. Providing you avoid multiple enumerations (eg. don't use Count()) only parts of the file will be read.
Instead of reading all the file at once using File.ReadAllLines, use File.ReadLines in a foreach loop to read the lines as needed.
foreach (var line in File.ReadLines(#"E:\\JKS\\ImportGenius\\0.txt"))
{
// Do something
}
Edit: On an unrelated note, you don't have to escape your backslashes when prefixing the string with a '#'. So either write "E:\\JKS\\ImportGenius\\0.txt" or #"E:\JKS\ImportGenius\0.txt", but #"E:\\JKS\\ImportGenius\\0.txt" is redundant.
The problem here is that you are reading the entire file's content into memory at once with File.ReadAllLines(). What you need to do is open a FileStream with File.OpenRead() and read/write smaller chunks.
Edit: Actually for your case ReadLine is obviously better. See other answers. :)
Use a StreamReader to read the file, write with a StreamWriter.

Categories