Using threads (tasks) for work that involves I/O - C#

I need to read data from a file, process it, and write the result to another file. I use a BackgroundWorker to show the progress. I wrote something like this to use in the DoWork event of the BackgroundWorker:
private void ProcData(string fileToRead, string fileToWrite)
{
    byte[] buffer = new byte[4 * 1024];
    //fileToRead & fileToWrite have the same size
    FileInfo fileInfo = new FileInfo(fileToRead);
    using (FileStream streamReader = new FileStream(fileToRead, FileMode.Open))
    using (BinaryReader binaryReader = new BinaryReader(streamReader))
    using (FileStream streamWriter = new FileStream(fileToWrite, FileMode.Open))
    using (BinaryWriter binaryWriter = new BinaryWriter(streamWriter))
    {
        while (streamWriter.Position < fileInfo.Length)
        {
            if (streamWriter.Position + buffer.Length > fileInfo.Length)
            {
                buffer = new byte[fileInfo.Length - streamWriter.Position];
            }
            //read
            buffer = binaryReader.ReadBytes(buffer.Length);
            //process
            Proc(buffer);
            //write
            binaryWriter.Write(buffer);
            //report if the percentage changed
            //...
        }//while
    }//using
}
but it is about 5 times slower than just reading from fileToRead and writing to fileToWrite, so I thought about threading. I read some questions on this site and tried something like the following, based on this question:
private void ProcData2(string fileToRead, string fileToWrite)
{
    int threadNumber = 4; //for example
    Task[] tasks = new Task[threadNumber];
    long[] startByte = new long[threadNumber];
    long[] length = new long[threadNumber];
    //divide the file into threadNumber (4) parts
    //and update startByte & length
    var parentTask = Task.Run(() =>
    {
        for (int i = 0; i < threadNumber; i++)
        {
            int part = i; //copy the loop variable so each task captures its own index
            tasks[part] = Task.Factory.StartNew(() =>
            {
                Proc2(fileToRead, fileToWrite, startByte[part], length[part]);
            });
        }
    });
    parentTask.Wait();
    Task.WaitAll(tasks);
}
//
private void Proc2(string fileToRead, string fileToWrite, long fileStartByte, long partLength)
{
    byte[] buffer = new byte[4 * 1024];
    using (FileStream streamReader = new FileStream(fileToRead, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (BinaryReader binaryReader = new BinaryReader(streamReader))
    using (FileStream streamWriter = new FileStream(fileToWrite, FileMode.Open, FileAccess.Write, FileShare.Write))
    using (BinaryWriter binaryWriter = new BinaryWriter(streamWriter))
    {
        streamReader.Seek(fileStartByte, SeekOrigin.Begin);
        streamWriter.Seek(fileStartByte, SeekOrigin.Begin);
        while (streamWriter.Position < fileStartByte + partLength)
        {
            if (streamWriter.Position + buffer.Length > fileStartByte + partLength)
            {
                buffer = new byte[fileStartByte + partLength - streamWriter.Position];
            }
            //read
            buffer = binaryReader.ReadBytes(buffer.Length);
            //process
            Proc(buffer);
            //write
            binaryWriter.Write(buffer);
            //report if the percentage changed
            //...
        }//while
    }//using
}
but I think it has some problems, and each time a task is switched it needs to seek again. I thought about reading the file, using threading for Proc() and then writing the result, but it seems wrong. How can I do it properly? (Read a buffer from a file, process it and write it to another file using tasks.)
//===================================================================
Based on Pete Kirkham's post I modified my method. I do not know why, but it did not work for me. I added the new method for whoever it may help. Thanks everybody.
private void ProcData3(string fileToRead, string fileToWrite)
{
    int bufferSize = 4 * 1024;
    int threadNumber = 4; //example
    List<byte[]> bufferPool = new List<byte[]>();
    Task[] tasks = new Task[threadNumber];
    //fileToRead & fileToWrite have the same size
    FileInfo fileInfo = new FileInfo(fileToRead);
    using (FileStream streamReader = new FileStream(fileToRead, FileMode.Open))
    using (BinaryReader binaryReader = new BinaryReader(streamReader))
    using (FileStream streamWriter = new FileStream(fileToWrite, FileMode.Open))
    using (BinaryWriter binaryWriter = new BinaryWriter(streamWriter))
    {
        while (streamWriter.Position < fileInfo.Length)
        {
            //read
            for (int g = 0; g < threadNumber; g++)
            {
                if (streamWriter.Position + bufferSize <= fileInfo.Length)
                {
                    bufferPool.Add(binaryReader.ReadBytes(bufferSize));
                }
                else
                {
                    bufferPool.Add(binaryReader.ReadBytes((int)(fileInfo.Length - streamWriter.Position)));
                    break;
                }
            }
            //do
            var parentTask = Task.Run(() =>
            {
                for (int th = 0; th < bufferPool.Count; th++)
                {
                    int index = th;
                    //threads
                    tasks[index] = Task.Factory.StartNew(() =>
                    {
                        Proc(bufferPool[index]);
                    });
                }//for th
            });
            //wait for the parent task (it starts the child tasks)
            parentTask.Wait();
            //wait till all tasks are done
            Task.WaitAll(tasks);
            //write
            for (int g = 0; g < bufferPool.Count; g++)
            {
                binaryWriter.Write(bufferPool[g]);
            }
            //clear the pool so the next batch starts empty
            //(otherwise it grows past the tasks array and re-writes old buffers)
            bufferPool.Clear();
            //report if the percentage changed
            //...
        }//while
    }//using
}

Essentially you want to split the processing of the data up into parallel tasks, but you don't want to split the I/O up.
How this happens depends on the size of your data. If it is small enough to fit into memory, then you can read it all into an input array and create an output array, then create tasks to process parts of the input array and populate parts of the output array, then write the whole output array to the file.
If the data is too large for this, then you need to put a limit on the amount of data read and written at a time. So your main flow starts off by reading N blocks of data and creating N tasks to process them. You then wait for the tasks to complete in order, and each time one completes you write the block of output, read a new block of input and create another task. Some experimentation will be required to find a good value for N and a block size such that the tasks tend to complete at about the same rate as the I/O.
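For illustration, here is a minimal sketch of that second approach, assuming Proc transforms each buffer in place and returns nothing, as in the question's code; ProcDataPipelined, the block size and n are placeholder choices to tune, not part of the original answer:
private void ProcDataPipelined(string fileToRead, string fileToWrite)
{
    const int blockSize = 64 * 1024; // tune experimentally
    const int n = 4;                 // number of blocks in flight, tune experimentally

    using (var input = new FileStream(fileToRead, FileMode.Open, FileAccess.Read))
    using (var output = new FileStream(fileToWrite, FileMode.Open, FileAccess.Write))
    {
        // Each queue entry pairs a buffer with the task processing it.
        var inFlight = new Queue<Tuple<byte[], Task>>();
        while (true)
        {
            // Keep up to n blocks being processed at once.
            while (inFlight.Count < n)
            {
                byte[] buffer = new byte[blockSize];
                int read = input.Read(buffer, 0, buffer.Length);
                if (read == 0)
                    break; // end of input
                if (read < buffer.Length)
                    Array.Resize(ref buffer, read); // last, partial block
                byte[] block = buffer; // local copy captured by the closure
                inFlight.Enqueue(Tuple.Create(block, Task.Run(() => Proc(block))));
            }
            if (inFlight.Count == 0)
                break; // nothing left to process or write

            // Wait for the oldest block and write it, so output stays in order.
            var oldest = inFlight.Dequeue();
            oldest.Item2.Wait();
            output.Write(oldest.Item1, 0, oldest.Item1.Length);
        }
    }
}
The single reader/writer keeps the I/O sequential while up to n Proc calls run concurrently; increasing n or the block size only helps while Proc, not the disk, is the bottleneck.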

Related

How to read file by chunks

I'm a little bit confused about how I should read a large file (> 8 GB) in chunks, in the case where each chunk has its own size.
If I know the chunk size, it looks like the code below:
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, ProgramOptions.BufferSizeForChunkProcessing))
{
    using (BufferedStream bs = new BufferedStream(fs, ProgramOptions.BufferSizeForChunkProcessing))
    {
        byte[] buffer = new byte[ProgramOptions.BufferSizeForChunkProcessing];
        int byteRead;
        while ((byteRead = bs.Read(buffer, 0, ProgramOptions.BufferSizeForChunkProcessing)) > 0)
        {
            byte[] originalBytes;
            using (MemoryStream mStream = new MemoryStream())
            {
                mStream.Write(buffer, 0, byteRead);
                originalBytes = mStream.ToArray();
            }
        }
    }
}
But imagine I've read a large file in chunks, done some encoding on each chunk (the chunk's size changes after that operation) and written all the processed chunks to another new file. Now I need to do the opposite operation, but I don't know the exact chunk size. I have an idea: after each chunk has been processed, I have to write the new chunk size before the chunk bytes. Like this:
Number of block bytes
Block bytes
Number of block bytes
Block bytes
So in that case, the first thing I need to do is read the chunk's header and learn what the chunk size is exactly. I read and write to files only as byte arrays. But I have a question - what should the chunk's header look like? Maybe the header has to contain some boundary?
If the file is rigidly structured so that each block of data is preceded by a 32-bit length value, then it is easy to read. The "header" for each block is just the 32-bit length value.
If you want to read such a file, the easiest way is probably to encapsulate the reading into a method that returns IEnumerable<byte[]> like so:
public static IEnumerable<byte[]> ReadChunks(string path)
{
    var lengthBytes = new byte[sizeof(int)];
    using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        while (true)
        {
            int n = fs.Read(lengthBytes, 0, sizeof(int)); // Read block size.
            if (n == 0) // End of file.
                yield break;
            if (n != sizeof(int))
                throw new InvalidOperationException("Invalid header");
            int blockLength = BitConverter.ToInt32(lengthBytes, 0);
            var buffer = new byte[blockLength];
            n = fs.Read(buffer, 0, blockLength);
            if (n != blockLength)
                throw new InvalidOperationException("Missing data");
            yield return buffer;
        }
    }
}
Then you can use it simply:
foreach (var block in ReadChunks("MyFileName"))
{
    // Process block.
}
Note that you don't need to provide your own buffering.
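For the writing side, a minimal matching sketch might look like the following; WriteChunk is an illustrative name, the 4-byte little-endian length produced by BitConverter is assumed on both ends, and processedBlocks stands in for whatever sequence of transformed chunks you produce:
public static void WriteChunk(Stream output, byte[] block)
{
    byte[] lengthBytes = BitConverter.GetBytes(block.Length); // 4-byte length header
    output.Write(lengthBytes, 0, lengthBytes.Length);
    output.Write(block, 0, block.Length);
}

using (var fs = new FileStream("MyFileName", FileMode.Create, FileAccess.Write))
{
    foreach (var processedBlock in processedBlocks) // your transformed chunks
        WriteChunk(fs, processedBlock);
}
As long as every block is written through WriteChunk, ReadChunks above will recover the same block boundaries without any separate boundary marker.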
Try this:
public static IEnumerable<byte[]> ReadChunks(string fileName)
{
    const int MAX_BUFFER = 1048576; // 1MB
    byte[] filechunk = new byte[MAX_BUFFER];
    int numBytes;
    using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        long remainBytes = fs.Length;
        int bufferBytes = MAX_BUFFER;
        while (true)
        {
            if (remainBytes <= MAX_BUFFER)
            {
                filechunk = new byte[remainBytes];
                bufferBytes = (int)remainBytes;
            }
            if ((numBytes = fs.Read(filechunk, 0, bufferBytes)) > 0)
            {
                remainBytes -= bufferBytes;
                yield return filechunk;
            }
            else
            {
                break;
            }
        }
    }
}

Closing MemoryStream in async task

I am writing code that loads multiple instances of the same task at once and waits for them all to finish. Each task reads from a file and uploads a byte array of a portion of that file.
var requests = new Task[parts.Count];
foreach (var part in parts)
{
    var partNumber = part.Item1;
    var partSize = part.Item2;
    var ms = new MemoryStream(partSize);
    var bw = new BinaryWriter(ms);
    var offset = (partNumber - 1) * partMaxSize;
    var count = partSize;
    bw.Write(assetContentBytes, offset, count);
    ms.Position = 0;
    Console.WriteLine("beginning upload of part " + partNumber);
    requests[partNumber - 1] = uploadClient.UploadPart(uploadResult.AssetId, partNumber, ms);
}
await Task.WhenAll(requests);
I would like to close these MemoryStreams after the related task is complete, but if I write stream.Close() into the loop, the streams close before the task is complete. Is it possible to close each stream after the task is complete? Thanks.
Just extract the part that uses the stream to another async method:
var requests = new Task[parts.Count];
foreach (var part in parts)
{
    var partNumber = part.Item1;
    var partSize = part.Item2;
    requests[partNumber - 1] = UploadPartAsync(partNumber, partSize);
}
await Task.WhenAll(requests);
...
async Task UploadPartAsync(int partNumber, int partSize)
{
    using (var ms = new MemoryStream(partSize))
    using (var bw = new BinaryWriter(ms))
    {
        var offset = (partNumber - 1) * partMaxSize;
        var count = partSize;
        bw.Write(assetContentBytes, offset, count);
        ms.Position = 0;
        Console.WriteLine("beginning upload of part " + partNumber);
        await uploadClient.UploadPart(uploadResult.AssetId, partNumber, ms);
    }
}

.NET 4.5 file read performance sync vs async

We're trying to measure the performance difference between reading a series of files using sync methods vs async. We were expecting about the same time for the two, but it turns out that using async is about 5.5x slower.
This might be due to the overhead of managing the threads, but we just wanted to know your opinion. Maybe we're just measuring the timings wrong.
These are the methods being tested:
static void ReadAllFile(string filename)
{
    var content = File.ReadAllBytes(filename);
}

static async Task ReadAllFileAsync(string filename)
{
    using (var file = File.OpenRead(filename))
    {
        using (var ms = new MemoryStream())
        {
            byte[] buff = new byte[file.Length];
            await file.ReadAsync(buff, 0, (int)file.Length);
        }
    }
}
And this is the method that runs them and starts the stopwatch:
static void Test(string name, Func<string, Task> gettask, int count)
{
    Stopwatch sw = new Stopwatch();
    Task[] tasks = new Task[count];
    sw.Start();
    for (int i = 0; i < count; i++)
    {
        string filename = "file" + i + ".bin";
        tasks[i] = gettask(filename);
    }
    Task.WaitAll(tasks);
    sw.Stop();
    Console.WriteLine(name + " {0} ms", sw.ElapsedMilliseconds);
}
Which is all run from here:
static void Main(string[] args)
{
    int count = 10000;
    for (int i = 0; i < count; i++)
    {
        Write("file" + i + ".bin");
    }
    Console.WriteLine("Testing read...!");
    Test("Read Contents", (filename) => Task.Run(() => ReadAllFile(filename)), count);
    Test("Read Contents Async", (filename) => ReadAllFileAsync(filename), count);
    Console.ReadKey();
}
And the helper write method:
static void Write(string filename)
{
    Data obj = new Data()
    {
        Header = "random string size here"
    };
    int size = 1024 * 20; // 1024 * 256;
    obj.Body = new byte[size];
    for (var i = 0; i < size; i++)
    {
        obj.Body[i] = (byte)(i % 256);
    }
    Stopwatch sw = new Stopwatch();
    sw.Start();
    MemoryStream ms = new MemoryStream();
    Serializer.Serialize(ms, obj);
    ms.Position = 0;
    using (var file = File.Create(filename))
    {
        ms.CopyToAsync(file).Wait();
    }
    sw.Stop();
    //Console.WriteLine("Writing file {0}", sw.ElapsedMilliseconds);
}
The results:
-Read Contents 574 ms
-Read Contents Async 3160 ms
We will really appreciate it if anyone can shed some light on this, as we searched Stack Overflow and the web but can't really find a proper explanation.
There are lots of things wrong with the testing code. Most notably, your "async" test does not use async I/O; with file streams, you have to explicitly open them as asynchronous or else you're just doing synchronous operations on a background thread. Also, your file sizes are very small and can be easily cached.
I modified the test code to write out much larger files, to have comparable sync vs async code, and to make the async code asynchronous:
static void Main(string[] args)
{
    Write("0.bin");
    Write("1.bin");
    Write("2.bin");

    ReadAllFile("2.bin"); // warmup

    var sw = new Stopwatch();
    sw.Start();
    ReadAllFile("0.bin");
    ReadAllFile("1.bin");
    ReadAllFile("2.bin");
    sw.Stop();
    Console.WriteLine("Sync: " + sw.Elapsed);

    ReadAllFileAsync("2.bin").Wait(); // warmup

    sw.Restart();
    ReadAllFileAsync("0.bin").Wait();
    ReadAllFileAsync("1.bin").Wait();
    ReadAllFileAsync("2.bin").Wait();
    sw.Stop();
    Console.WriteLine("Async: " + sw.Elapsed);

    Console.ReadKey();
}

static void ReadAllFile(string filename)
{
    using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, false))
    {
        byte[] buff = new byte[file.Length];
        file.Read(buff, 0, (int)file.Length);
    }
}

static async Task ReadAllFileAsync(string filename)
{
    using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, true))
    {
        byte[] buff = new byte[file.Length];
        await file.ReadAsync(buff, 0, (int)file.Length);
    }
}

static void Write(string filename)
{
    int size = 1024 * 1024 * 256;
    var data = new byte[size];
    var random = new Random();
    random.NextBytes(data);
    File.WriteAllBytes(filename, data);
}
On my machine, this test (built in Release, run outside the debugger) yields these numbers:
Sync: 00:00:00.4461936
Async: 00:00:00.4429566
All I/O operations are asynchronous at the OS level; the thread just waits (it gets suspended) for the I/O operation to finish. That's why, when you read Jeffrey Richter, he always tells you to do I/O asynchronously, so that your thread is not wasted waiting around.
From Jeffrey Richter:
Also, creating a thread is not cheap. Each thread gets 1 MB of address space reserved for user mode and another 12 KB for kernel mode. After this the OS has to notify all the DLLs in the system that a new thread has been spawned. The same happens when you destroy a thread. Also think about the complexities of context switching.
Found a great SO answer here

Making self-extracting executable with C#

I'm creating a simple self-extracting archive, using a magic number to mark the beginning of the content.
For now it is a text file:
MAGICNUMBER .... content of the text file
Next, the text file is copied to the end of the executable:
copy programm.exe /b + textfile.txt /b sfx.exe
I'm trying to find the second occurrence of the magic number (the first one would be a hardcoded constant obviously) using the following code:
string my_filename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
StreamReader file = new StreamReader(my_filename);
const int block_size = 1024;
const string magic = "MAGICNUMBER";
char[] buffer = new Char[block_size];
Int64 count = 0;
Int64 glob_pos = 0;
bool flag = false;
while (file.ReadBlock(buffer, 0, block_size) > 0)
{
    var rel_pos = buffer.ToString().IndexOf(magic);
    if ((rel_pos > -1) & (!flag))
    {
        flag = true;
        continue;
    }
    if ((rel_pos > -1) & (flag == true))
    {
        glob_pos = block_size * count + rel_pos;
        break;
    }
    count++;
}
using (FileStream fs = new FileStream(my_filename, FileMode.Open, FileAccess.Read))
{
    byte[] b = new byte[fs.Length - glob_pos];
    fs.Seek(glob_pos, SeekOrigin.Begin);
    fs.Read(b, 0, (int)(fs.Length - glob_pos));
    File.WriteAllBytes("c:/output.txt", b);
}
but for some reason I'm copying almost the entire file, not the last few kilobytes. Is it because of compiler optimization, inlining the magic constant in the while loop, or something similar?
How should I do a self-extracting archive properly?
I guessed I should read the file backwards to avoid the problem of the compiler inlining the magic constant multiple times.
So I've modified my code in the following way:
string my_filename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
StreamReader file = new StreamReader(my_filename);
const int block_size = 1024;
const string magic = "MAGIC";
char[] buffer = new Char[block_size];
Int64 count = 0;
Int64 glob_pos = 0;
while (file.ReadBlock(buffer, 0, block_size) > 0)
{
    var rel_pos = buffer.ToString().IndexOf(magic);
    if (rel_pos > -1)
    {
        glob_pos = block_size * count + rel_pos;
    }
    count++;
}
using (FileStream fs = new FileStream(my_filename, FileMode.Open, FileAccess.Read))
{
    byte[] b = new byte[fs.Length - glob_pos];
    fs.Seek(glob_pos, SeekOrigin.Begin);
    fs.Read(b, 0, (int)(fs.Length - glob_pos));
    File.WriteAllBytes("c:/output.txt", b);
}
So I've scanned the whole file once, found what I thought would be the last occurrence of the magic number, and copied from there to the end. While the file created by this procedure seems smaller than in the previous attempt, it is in no way the same file I attached to my "self-extracting" archive. Why?
My guess is that the position calculation of the beginning of the attached file is wrong due to the conversion from binary to string. If so, how should I modify my position calculation to make it correct?
Also, how should I choose a magic number when working with real files, PDFs for example? I won't be able to modify PDFs easily to include a predefined magic number in them.
Try this out. Some C# Stream IO 101:
public static void Main()
{
    String path = @"c:\here is your path";

    // Method A: Read all information into a Byte Stream
    Byte[] data = System.IO.File.ReadAllBytes(path);
    String[] lines = System.IO.File.ReadAllLines(path);

    // Method B: Use a stream to do essentially the same thing. (More powerful)
    // Using block essentially means 'close when we're done'. See 'using block' or 'IDisposable'.
    using (FileStream stream = File.OpenRead(path))
    using (StreamReader reader = new StreamReader(stream))
    {
        // This will read all the data as a single string
        String allData = reader.ReadToEnd();
    }

    String outputPath = @"C:\where I'm writing to";

    // Copy from one file-stream to another
    using (FileStream inputStream = File.OpenRead(path))
    using (FileStream outputStream = File.Create(outputPath))
    {
        inputStream.CopyTo(outputStream);
        // Again, this will close both streams when done.
    }

    // Copy to an in-memory stream
    using (FileStream inputStream = File.OpenRead(path))
    using (MemoryStream outputStream = new MemoryStream())
    {
        inputStream.CopyTo(outputStream);
        // Again, this will close both streams when done.
        // If you want to hold the data in memory, just don't wrap your
        // memory stream in a using block.
    }

    // Use serialization to store data.
    var serializer = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();

    // We'll serialize a person to the memory stream.
    MemoryStream memoryStream = new MemoryStream();
    serializer.Serialize(memoryStream, new Person() { Name = "Sam", Age = 20 });

    // Now the person is stored in the memory stream (just as easy to write to disk
    // using a file stream as well).

    // Now let's reset the stream to the beginning:
    memoryStream.Seek(0, SeekOrigin.Begin);

    // And deserialize the person
    Person deserializedPerson = (Person)serializer.Deserialize(memoryStream);
    Console.WriteLine(deserializedPerson.Name); // Should print Sam
}

// Mark serializable stuff as serializable.
// This means that C# will automatically format this to be put in a stream
[Serializable]
class Person
{
    public String Name { get; set; }
    public Int32 Age { get; set; }
}
The easiest solution is to replace
const string magic = "MAGICNUMBER";
with
static string magic = "magicnumber".ToUpper();
But there are more problems with the whole magic string approach. What if the file contains the magic string? I think that the best solution is to put the file size after the file. The extraction is much easier that way: read the length from the last bytes and then read the required amount of bytes from the end of the file.
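A rough sketch of that trailing-length idea, assuming an 8-byte length field appended after the payload (the method names, paths and field size are illustrative choices, not a fixed format):
// Pack: append the payload, then its length, to a copy of the stub executable.
static void Pack(string stubExe, string payloadFile, string outputExe)
{
    File.Copy(stubExe, outputExe, true);
    byte[] payload = File.ReadAllBytes(payloadFile);
    using (var fs = new FileStream(outputExe, FileMode.Append, FileAccess.Write))
    {
        fs.Write(payload, 0, payload.Length);
        byte[] lengthBytes = BitConverter.GetBytes((long)payload.Length);
        fs.Write(lengthBytes, 0, lengthBytes.Length); // last 8 bytes = payload length
    }
}

// Extract: read the trailing length, then the payload that sits just before it.
static byte[] Extract(string selfPath)
{
    using (var fs = new FileStream(selfPath, FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        byte[] lengthBytes = new byte[sizeof(long)];
        fs.Seek(-lengthBytes.Length, SeekOrigin.End);
        fs.Read(lengthBytes, 0, lengthBytes.Length);
        long payloadLength = BitConverter.ToInt64(lengthBytes, 0);

        byte[] payload = new byte[payloadLength];
        fs.Seek(-(lengthBytes.Length + payloadLength), SeekOrigin.End);
        fs.Read(payload, 0, payload.Length); // loop here for very large payloads
        return payload;
    }
}
This avoids any magic-number collision with the payload, because the extractor never searches the content at all.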
Update: This should work unless your files are very big (you'd need a revolving pair of buffers in that case, to read the file in small blocks):
string inputFilename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
string outputFilename = inputFilename + ".secret";
string magic = "magic".ToUpper();
byte[] data = File.ReadAllBytes(inputFilename);
byte[] magicData = Encoding.ASCII.GetBytes(magic);
for (int idx = magicData.Length - 1; idx < data.Length; idx++) {
    bool found = true;
    for (int magicIdx = 0; magicIdx < magicData.Length; magicIdx++) {
        if (data[idx - magicData.Length + 1 + magicIdx] != magicData[magicIdx]) {
            found = false;
            break;
        }
    }
    if (found) {
        using (FileStream output = new FileStream(outputFilename, FileMode.Create)) {
            output.Write(data, idx + 1, data.Length - idx - 1);
        }
    }
}
Update 2: This should be much faster, use little memory and work on files of any size, but your program must be a proper executable (with a size that is a multiple of 512 bytes):
string inputFilename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
string outputFilename = inputFilename + ".secret";
string marker = "magic".ToUpper();
byte[] markerData = Encoding.ASCII.GetBytes(marker);
int markerLength = markerData.Length;
const int blockSize = 512; //important!
using (FileStream input = File.OpenRead(inputFilename)) {
    long lastPosition = 0;
    byte[] buffer = new byte[blockSize];
    while (input.Read(buffer, 0, blockSize) >= markerLength) {
        bool found = true;
        for (int idx = 0; idx < markerLength; idx++) {
            if (buffer[idx] != markerData[idx]) {
                found = false;
                break;
            }
        }
        if (found) {
            input.Position = lastPosition + markerLength;
            using (FileStream output = File.OpenWrite(outputFilename)) {
                input.CopyTo(output);
            }
        }
        lastPosition = input.Position;
    }
}
Read about some approaches here: http://www.strchr.com/creating_self-extracting_executables
You can add the compressed file as a resource to the project itself:
Project > Properties
Set the property of this resource to Binary.
You can then retrieve the resource with
byte[] resource = Properties.Resources.NameOfYourResource;
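At runtime the embedded bytes can then be written back out to disk, for example (NameOfYourResource and the destination path are placeholders):
// Write the embedded resource back to disk as the extracted file.
byte[] resource = Properties.Resources.NameOfYourResource;
File.WriteAllBytes(Path.Combine(Path.GetTempPath(), "extracted.dat"), resource);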
Search backwards rather than forwards (assuming your file won't contain said magic number).
Or append your (text) file and then, lastly, its length (or the length of the original exe), so you only need to read the last DWORD / few bytes to see how long the file is - then no magic number is required.
More robustly, store the file as an additional data section within the executable file. This is more fiddly without external tools as it requires knowledge of the PE file format used for NT executables, q.v. http://msdn.microsoft.com/en-us/library/ms809762.aspx

c# continuously read file

I want to read a file continuously, like GNU tail with the "-f" parameter. I need it to live-read a log file.
What is the right way to do it?
A more natural approach, using FileSystemWatcher:
var wh = new AutoResetEvent(false);
var fsw = new FileSystemWatcher(".");
fsw.Filter = "file-to-read";
fsw.EnableRaisingEvents = true;
fsw.Changed += (s, e) => wh.Set();

var fs = new FileStream("file-to-read", FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
using (var sr = new StreamReader(fs))
{
    var s = "";
    while (true)
    {
        s = sr.ReadLine();
        if (s != null)
            Console.WriteLine(s);
        else
            wh.WaitOne(1000);
    }
}
wh.Close();
Here the main reading cycle stops to wait for incoming data, and FileSystemWatcher is used just to wake the main reading cycle.
You want to open a FileStream in binary mode. Periodically, seek to the end of the file minus 1024 bytes (or whatever), then read to the end and output. That's how tail -f works.
Answers to your questions:
Binary because it's difficult to randomly access the file if you're reading it as text. You have to do the binary-to-text conversion yourself, but it's not difficult. (See below)
1024 bytes because it's a nice convenient number, and should handle 10 or 15 lines of text. Usually.
Here's an example of opening the file, reading the last 1024 bytes, and converting it to text:
static void ReadTail(string filename)
{
    using (FileStream fs = File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        // Seek 1024 bytes from the end of the file
        fs.Seek(-1024, SeekOrigin.End);
        // read 1024 bytes
        byte[] bytes = new byte[1024];
        fs.Read(bytes, 0, 1024);
        // Convert bytes to string
        string s = Encoding.Default.GetString(bytes);
        // or string s = Encoding.UTF8.GetString(bytes);
        // and output to console
        Console.WriteLine(s);
    }
}
Note that you must open with FileShare.ReadWrite, since you're trying to read a file that's currently open for writing by another process.
Also note that I used Encoding.Default, which in US/English and for most Western European languages will be an 8-bit character encoding. If the file is written in some other encoding (like UTF-8 or another Unicode encoding), it's possible that the bytes won't convert correctly to characters. You'll have to handle that by determining the encoding if you think this will be a problem. Search Stack Overflow for info about determining a file's text encoding.
If you want to do this periodically (every 15 seconds, for example), you can set up a timer that calls the ReadTail method as often as you want. You could optimize things a bit by opening the file only once at the start of the program. That's up to you.
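For instance, a periodic setup with System.Threading.Timer might look like this (the interval and the log path are placeholders):
// Call ReadTail every 15 seconds until the timer is disposed.
var timer = new System.Threading.Timer(
    _ => ReadTail(@"C:\logs\app.log"), // path is a placeholder
    null,
    TimeSpan.Zero,
    TimeSpan.FromSeconds(15));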
To continuously monitor the tail of the file, you just need to remember the length of the file before.
public static void MonitorTailOfFile(string filePath)
{
    var initialFileSize = new FileInfo(filePath).Length;
    var lastReadLength = initialFileSize - 1024;
    if (lastReadLength < 0) lastReadLength = 0;

    while (true)
    {
        try
        {
            var fileSize = new FileInfo(filePath).Length;
            if (fileSize > lastReadLength)
            {
                using (var fs = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
                {
                    fs.Seek(lastReadLength, SeekOrigin.Begin);
                    var buffer = new byte[1024];
                    while (true)
                    {
                        var bytesRead = fs.Read(buffer, 0, buffer.Length);
                        lastReadLength += bytesRead;
                        if (bytesRead == 0)
                            break;
                        var text = ASCIIEncoding.ASCII.GetString(buffer, 0, bytesRead);
                        Console.Write(text);
                    }
                }
            }
        }
        catch { }
        Thread.Sleep(1000);
    }
}
I had to use ASCIIEncoding, because this code isn't smart enough to cater for variable character lengths of UTF8 on buffer boundaries.
Note: You can change the Thread.Sleep part to be different timings, and you can also link it with a filewatcher and blocking pattern - Monitor.Enter/Wait/Pulse. For me the timer is enough, and at most it only checks the file length every second, if the file hasn't changed.
This is my solution
static IEnumerable<string> TailFrom(string file)
{
    using (var reader = File.OpenText(file))
    {
        while (true)
        {
            string line = reader.ReadLine();
            if (reader.BaseStream.Length < reader.BaseStream.Position)
                reader.BaseStream.Seek(0, SeekOrigin.Begin);
            if (line != null) yield return line;
            else Thread.Sleep(500);
        }
    }
}
So, in your code you can do:
foreach (string line in TailFrom(file))
{
Console.WriteLine($"line read= {line}");
}
You could use the FileSystemWatcher class which can send notifications for different events happening on the file system like file changed.
private void button1_Click(object sender, EventArgs e)
{
    if (folderBrowserDialog.ShowDialog() == DialogResult.OK)
    {
        path = folderBrowserDialog.SelectedPath;
        fileSystemWatcher.Path = path;
        string[] str = Directory.GetFiles(path);
        string line;
        fs = new FileStream(str[0], FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
        tr = new StreamReader(fs);
        while ((line = tr.ReadLine()) != null)
        {
            listBox.Items.Add(line);
        }
    }
}

private void fileSystemWatcher_Changed(object sender, FileSystemEventArgs e)
{
    string line;
    line = tr.ReadLine();
    listBox.Items.Add(line);
}
If you are just looking for a tool to do this, then check out the free version of BareTail.
