I am keeping several text log files that I want to prevent from growing too large. I searched and found a lot of people asking the same thing, and a couple of solutions whose efficiency looked questionable, so I tried rolling my own function. I did the same thing previously in VB6 and ended up using the function in all my apps, so I know I will be using it frequently in my C# programs too. This should probably be community wiki, but since marking a question as CW is disabled, I am posting it here. My question: since I will be using this a lot, is it efficient, and if not, what should I change to improve it? Currently I limit the log files to 1 MB, and these are the largest logs I have kept, so I don't anticipate them getting much, if any, larger.
private static void ShrinkFile(string file)
{
    StreamReader sr = new StreamReader(file);
    for (int i = 0; i < 10; i++) // throw away the first 10 lines
    {
        sr.ReadLine();
    }
    string remainingContents = sr.ReadToEnd();
    sr.Close();
    File.WriteAllText(file, remainingContents);
}
Besides suggesting that you use a proper logging framework like Log4Net or NLog (or any other), to improve your code you can at a minimum make sure you always close the stream with a using block:
private static void ShrinkFile(string file)
{
    string tempFile = file + ".tmp";

    using (var sr = new StreamReader(file))
    using (var sw = new StreamWriter(tempFile, false)) // false means overwrite if the temp file already exists
    {
        for (int i = 0; i < 10; i++) // throw away the first 10 lines
        {
            sr.ReadLine();
        }
        sw.Write(sr.ReadToEnd());
    }

    // Swap the trimmed copy in place of the original.
    File.Delete(file);
    File.Move(tempFile, file);
}
Writing to a temporary file and then swapping it in avoids opening the same file for reading and writing at once; the reader only shares the file for reading, so a second writable handle on it would throw an IOException. If you also want to avoid buffering the whole remainder in a string with ReadToEnd, you can write into the StreamWriter line by line instead.
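For example, a line-by-line variant that never buffers more than one line (a sketch using the same temp-file swap as above; the method name is just for illustration):

private static void ShrinkFileLineByLine(string file)
{
    string tempFile = file + ".tmp";

    using (var sr = new StreamReader(file))
    using (var sw = new StreamWriter(tempFile, false))
    {
        for (int i = 0; i < 10; i++) // throw away the first 10 lines
        {
            sr.ReadLine();
        }

        string line;
        while ((line = sr.ReadLine()) != null)
        {
            sw.WriteLine(line); // copy one line at a time
        }
    }

    File.Delete(file);
    File.Move(tempFile, file);
}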
Fairly new to C# here, just sitting down and practicing. I have a single file listing 10 million passwords that I downloaded to practice with.
I want to break the file down into lists of 99 lines: stop at 99, do something, then pick up where it left off and repeat the "do something" with the next 99, until it reaches the last item in the file.
I can do the counting part well; it is stopping at 99 and continuing where I left off that I am having trouble with. Anything I find online is not close to what I am trying to do, and anything I add to this code on my own does not work.
I am more than happy to share more information if I am not being clear. Just ask and I will respond, although I might not be able to respond until tomorrow depending on the time.
Here is the code I have started:
using System;
using System.IO;

namespace lists01
{
    class Program
    {
        static void Main(string[] args)
        {
            int count = 0;
            var f1 = @"c:\tmp\10-million-password-list-top-1000000.txt";
            var content = File.ReadAllLines(f1);

            foreach (var v2 in content)
            {
                count++;
                Console.WriteLine(v2 + "\t" + count);
            }
        }
    }
}
My end goal is to do this with any list of items from files I have. I am only using this password list because it is sizable and I thought it would be good for this exercise.
Thank you
Keith
Here are a couple of different ways to approach this. Normally, I would suggest the ReadAllLines method that you already have in your code. The trade-off is that you load the entire file into memory at once and then operate on it.
Using ReadAllLines in concert with LINQ's Skip() and Take() methods, you can chop the lines up into groups like this:
var lines = File.ReadAllLines(fileName); // requires using System.IO; and using System.Linq;
int linesAtATime = 99;

for (int i = 0; i < lines.Length; i += linesAtATime)
{
    List<string> currentLinesGroup = lines.Skip(i).Take(linesAtATime).ToList();
    DoSomethingWithLines(currentLinesGroup);
}
But, if you are working with a really large file, it might not be practical to load the entire file into memory. Plus, you might not want to leave the file open while you are working on the lines. This option gives you more control over how you move through the file. It just loads the part it needs into memory, and closes the file while you are working on the current set of lines.
List<string> lines = new List<string>();
int maxLines = 99;
long seekPosition = 0;
bool fileLoaded = false;
string line;

while (!fileLoaded)
{
    using (Stream stream = File.Open(fileName, FileMode.Open))
    {
        // Jump back to where the previous batch ended.
        stream.Seek(seekPosition, SeekOrigin.Begin);

        using (StreamReader reader = new StreamReader(stream))
        {
            while (!reader.EndOfStream && lines.Count < maxLines)
            {
                line = reader.ReadLine();
                // Track how much data has been read. Note the +2 assumes
                // CRLF ("\r\n") line endings and a single-byte encoding;
                // for LF-only files or multi-byte text this will drift.
                seekPosition += (line.Length + 2);
                lines.Add(line);
            }
            fileLoaded = reader.EndOfStream;
        }
    }

    DoSomethingWithLines(lines);
    lines.Clear();
}
In this case, I used Stream because it has the ability to seek to a specific position in the file, and then wrapped it in a StreamReader because that has the ReadLine() method.
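A third option (a sketch, not from the answer above) is File.ReadLines, which streams the file lazily so you get batching without any manual seek bookkeeping; note that, unlike the seek-based version, it keeps the file open while each batch is processed. DoSomethingWithLines is the same placeholder as before:

// Streams the file lazily; at most 99 lines are held in memory at a time.
var batch = new List<string>(99);

foreach (string current in File.ReadLines(fileName))
{
    batch.Add(current);
    if (batch.Count == 99)
    {
        DoSomethingWithLines(batch);
        batch.Clear();
    }
}

if (batch.Count > 0)
    DoSomethingWithLines(batch); // don't forget the final partial batch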
I have the following loop inside a function:
for (int i = 0; i < 46; i++)
{
    String[] arrStr = File.ReadAllLines(path + "File_" + i + ".txt");
    List<String> output = new List<String>();

    for (int j = 0; j < arrStr.Length; j++)
    {
        // Do Something
        output.Add(someString);
    }

    File.WriteAllLines(path + "output_File_" + i + ".txt", output.ToArray());
    output.Clear();
}
Each txt file has about 20k lines. The function opens 46 of them, and I need to run the function more than 1k times, so I'm planning to leave the program running overnight. So far I haven't found any errors, but since a 20k-element string array is referenced on each iteration of the loop, I'm afraid there might be some issue with garbage accumulating from the arrays of past iterations. If there is such a risk, which method is best to dispose of the old array in this case?
Also, is it memory-safe to run 3 programs like this at the same time?
Use streams with using blocks; this will handle the memory management for you:
for (int i = 0; i < 46; i++)
{
    using (StreamReader reader = new StreamReader(path + "File_" + i + ".txt"))
    using (StreamWriter writer = new StreamWriter(path + "output_File_" + i + ".txt"))
    {
        while (!reader.EndOfStream)
        {
            string line = reader.ReadLine();
            // do something with line
            writer.WriteLine(line);
        }
    }
}
The Dispose methods of StreamReader and StreamWriter are called automatically when execution leaves the using block, releasing the file handles and internal buffers they hold. Using streams also ensures your entire file isn't in memory at once.
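For reference, a using block is essentially compiler shorthand for a try/finally, which is why Dispose runs even if an exception is thrown mid-loop. A rough sketch of what the compiler generates:

StreamReader reader = new StreamReader(path);
try
{
    // work with reader here
}
finally
{
    if (reader != null)
        ((IDisposable)reader).Dispose();
}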
More info on MSDN - File Stream and I/O
Sounds like you came from the C world :-)
C# garbage collection is fine, you will not have any problems with that.
I would be more worried about file-system errors.
I'm trying to get my program to read text from a .txt file and then print it back to me, but for some reason it crashes when I run it. Could someone let me know what I'm doing wrong? Thanks! :)
using System;
using System.IO;

public class Hello1
{
    public static void Main()
    {
        string winDir = System.Environment.GetEnvironmentVariable("windir");
        StreamReader reader = new StreamReader(winDir + "\\Name.txt");
        try
        {
            do
            {
                Console.WriteLine(reader.ReadLine());
            }
            while (reader.Peek() != -1);
        }
        catch
        {
            Console.WriteLine("File is empty");
        }
        finally
        {
            reader.Close();
        }
        Console.ReadLine();
    }
}
I don't like your solution for two simple reasons:
1) I don't like the "gotta catch 'em all" try/catch. To avoid it, check whether the file exists using System.IO.File.Exists("YourPath").
2) With this code you never dispose the StreamReader. To avoid that, it is better to use the using construct, like this: using (StreamReader sr = new StreamReader(path)) { /* your code */ }
Usage example:
string path = "filePath";

if (System.IO.File.Exists(path))
{
    using (System.IO.StreamReader sr = new System.IO.StreamReader(path))
    {
        while (sr.Peek() > -1)
            Console.WriteLine(sr.ReadLine());
    }
}
else
{
    Console.WriteLine("The file does not exist!");
}
If your file is located in the same folder as the .exe, all you need to do is StreamReader reader = new StreamReader("File.txt");
Otherwise, where File.txt is, put the full path to the file. Personally, I think it's easier if they are in the same location.
From there, it's as simple as Console.WriteLine(reader.ReadLine());
If you want to read all lines and display all at once, you could do a for loop:
for (int i = 0; i < lineAmount; i++) // lineAmount = the number of lines you want to read
{
    Console.WriteLine(reader.ReadLine());
}
Use the code below if you want the result as a string instead of an array.
File.ReadAllText(Path.Combine(winDir, "Name.txt"));
Why not use System.IO.File.ReadAllLines(winDir + "\\Name.txt")?
If all you're trying to do is display this as output in the console, you could do that pretty compactly:
private static string winDir = Environment.GetEnvironmentVariable("windir");

static void Main(string[] args)
{
    Console.Write(File.ReadAllText(Path.Combine(winDir, "Name.txt")));
    Console.Read();
}
using (var fs = new FileStream(winDir + "\\Name.txt", FileMode.Open, FileAccess.Read))
{
    using (var reader = new StreamReader(fs))
    {
        // your code
    }
}
The .NET Framework has a variety of ways to read a text file. Each has pros and cons... let's go through two.
The first, is one that many of the other answers are recommending:
String allTxt = File.ReadAllText(Path.Combine(winDir, "Name.txt"));
This will read the entire file into a single String. It will be quick and painless. It comes with a risk, though... if the file is large enough, you may run out of memory. Even if you can fit the entire thing in memory, it may be large enough to cause paging, which will make your software run quite slowly. The next option addresses this.
The second solution allows you to work with one line at a time and not load the entire file into memory:
foreach (String line in File.ReadLines(Path.Combine(winDir, "Name.txt")))
{
    // Do work with the single line.
    Console.WriteLine(line);
}
This solution may take a little longer because it goes back to the file more often, working with smaller pieces of its contents... however, it will prevent awkward memory errors.
I tend to go with the second solution, but only because I'm paranoid about loading huge Strings into memory.
While troubleshooting a performance problem, I came across an issue in Windows 8 which relates to file names containing .dat (e.g. file.dat, file.data.txt).
I found that it takes over 6x as long to create them as any file with any other extension.
The same issue occurs in Windows Explorer, where copying folders containing .dat* files takes significantly longer.
I have created some sample code to illustrate the issue.
internal class DatExtnIssue
{
    internal static void Run()
    {
        CreateFiles("txt");
        CreateFiles("dat");
        CreateFiles("dat2");
        CreateFiles("doc");
    }

    internal static void CreateFiles(string extension)
    {
        var folder = Path.Combine(@"c:\temp\FileTests", extension);
        if (!Directory.Exists(folder))
            Directory.CreateDirectory(folder);

        var sw = new Stopwatch();
        sw.Start();
        for (var n = 0; n < 500; n++)
        {
            var fileName = Path.Combine(folder, string.Format("File-{0:0000}.{1}", n, extension));
            using (var fileStream = File.Create(fileName))
            {
                // Left empty to show the problem is due to creation alone.
                // The same issue occurs regardless of writing, closing or flushing.
            }
        }
        sw.Stop();
        Console.WriteLine(".{0} = {1,6:0.000}secs", extension, sw.ElapsedMilliseconds / 1000.0);
    }
}
Results from creating 500 files with the following extensions:
.txt = 0.847secs
.dat = 5.200secs
.dat2 = 5.493secs
.doc = 0.806secs
I got similar results using:
using (var fileStream = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
{ }
and:
File.WriteAllText(fileName, "a");
This caused a problem as I had a batch application which was taking far too long to run. I finally tracked it down to this.
Does anyone have any idea why this would be happening? Is this by design? I hope not, as it could cause problems for high-volume application creating .dat files.
It could be something on my PC, but I have checked the Windows registry and found no unusual extension settings.
If all else fails, try a kludge:
Write all the files out as .txt and then rename *.txt to *.dat. Maybe it will be faster :)
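A minimal sketch of that kludge, reusing the folder and n variables from the question's test loop (untested; just to illustrate the rename trick):

// Create the file with a .txt extension first...
var tmpName = Path.Combine(folder, string.Format("File-{0:0000}.txt", n));
using (var fileStream = File.Create(tmpName))
{
    // write contents here
}
// ...then rename *.txt to *.dat afterwards.
File.Move(tmpName, Path.ChangeExtension(tmpName, ".dat"));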
I'm trying to write 4 sets of 15 txt files into 4 large txt files in order to make it easier to import into another app.
Here's my code:
using System;
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace AggregateMultipleFiles
{
    /* This program reduces multiple input files into one file for easier app loading. */
    class AggMultiFilestoOneFile
    {
        static void Main(string[] args)
        {
            TextWriter writer = new StreamWriter("G:/user/data/yr2009/fy09_filtered.txt");
            int linelen = 495;
            char[] buf = new char[linelen];
            int line_num = 1;

            for (int i = 1; i <= 15; i++)
            {
                TextReader reader = File.OpenText("G:/user/data/yr2009/fy09_filtered" + i + ".txt");
                while (true)
                {
                    int nin = reader.Read(buf, 0, buf.Length);
                    if (nin == 0)
                    {
                        Console.WriteLine("File ended");
                        break;
                    }
                    writer.Write(new String(buf));
                    line_num++;
                }
                reader.Close();
            }

            Console.WriteLine("done");
            Console.WriteLine(DateTime.Now);
            Console.ReadLine();
            writer.Close();
        }
    }
}
My problem is somewhere in detecting the end of each file. It doesn't finish writing the last line of a file, and then it starts writing the first line of the next file halfway through the last line of the previous one.
This is throwing off all of my columns and data in the app it imports into.
Someone suggested that perhaps I need to pad the end of each line of each of the 15 files with carriage and line return, \r\n.
Why doesn't what I have work?
Would padding work instead? How would I write that?
Thank you!
I strongly suspect this is the problem:
writer.Write(new String(buf));
You're always creating a string from all of buf, rather than just the first nin characters. If any of your files are short, you may end up with "null" Unicode characters (i.e. U+0000) which may be seen as string terminators in some apps.
There's no need even to create a string - just use:
writer.Write(buf, 0, nin);
(I would also strongly suggest using using statements instead of manually calling Close, by the way.)
It's also worth noting that there's nothing to guarantee that you're really reading a line at a time. You might as well increase your buffer size to something like 32K in order to read the files in potentially fewer chunks.
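Putting those suggestions together (write only the nin characters actually read, using blocks, a bigger buffer), a sketch of the chunk-copying loop with the same paths as the question:

char[] buf = new char[32 * 1024]; // 32K chunks; no need to match the line length

using (var writer = File.CreateText("G:/user/data/yr2009/fy09_filtered.txt"))
{
    for (int i = 1; i <= 15; i++)
    {
        using (var reader = File.OpenText("G:/user/data/yr2009/fy09_filtered" + i + ".txt"))
        {
            int nin;
            while ((nin = reader.Read(buf, 0, buf.Length)) > 0)
            {
                writer.Write(buf, 0, nin); // only the characters actually read
            }
        }
    }
}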
Additionally, if the files are small enough, you could read each one into memory completely, which would make your code simpler:
using (var writer = File.CreateText("G:/user/data/yr2009/fy09_filtered.txt"))
{
    for (int i = 1; i <= 15; i++)
    {
        string inputName = "G:/user/data/yr2009/fy09_filtered" + i + ".txt";
        writer.Write(File.ReadAllText(inputName));
    }
}