Reading file after writing it - c#

I have a strange problem. My code works as follows:
1. The exe takes some data from the user.
2. It calls a web service that creates and writes a CSV file for the data at a particular network location (say \\some-server\some-directory). The web service is hosted on the same machine as this folder (i.e. I could also change the path to c:\some-directory). The service returns after writing the file.
3. The exe checks whether the file exists. If it does, processing continues; otherwise it quits with an error.
The problem I am having is at step 3. When I try to read the file immediately after it has been written, I always get a file-not-found exception (even though the file is there). I do not get this exception when I am debugging (because stepping through the code introduces a delay) or when I call Thread.Sleep(3000) before reading the file.
This is really strange because I close the StreamWriter before the call returns to the exe. According to the documentation, Close should force a flush of the stream. The problem is not related to the size of the file either. I am also not making async calls for writing and reading the file. The two run serially, one after the other (the writing is done by the web service and the reading by the exe, but the call is still serial).
I do not know for sure, but it feels like there is a delay between calling Close() and the file actually being written to disk. However, this is baffling because it is not related to size at all: it happens for every file size. I have tried files with 10, 50, 100 and 200 lines of data.
Another thing I suspected was that, since I was writing the file to a network location, Windows might be optimizing the call by writing to a cache first and then to the network location. So I changed the code to write to a local drive (i.e. c:\some-directory) rather than the network location, but that resulted in the same error.
There is no error in the code (for reading or writing). As explained earlier, it starts working once a delay is added. Some other useful information:
The exe is .Net Framework 3.5
Windows Server 2008(64 bit, 4 GB Ram)
Edit 1
File.AppendAllText() is not the correct solution, as it creates a new file if it does not exist.
Edit 2
Code for writing:
    using (FileStream fs = new FileStream(outFileName, FileMode.Create))
    {
        using (StreamWriter writer = new StreamWriter(fs, Encoding.Unicode))
        {
            writer.WriteLine(someString);
        }
    }
Code for reading:
    StreamReader rdr = new StreamReader(File.OpenRead(CsvFilePath));
    string header = rdr.ReadLine();
    rdr.Close();
Edit 3
Used TextWriter, same error:
    using (TextWriter writer = File.CreateText(outFileName))
    {
    }
Edit 4
Finally, as suggested by some users, I now check for the file in a while loop a certain number of times before I throw the file-not-found exception.
    int i = 1;
    while (i++ < 10)
    {
        bool fileExists = File.Exists(CsvFilePath);
        if (!fileExists)
            System.Threading.Thread.Sleep(500);
        else
            break;
    }

So you are writing a stream to a file, then reading the file back to a stream? Do you need to write the file then post process it, or can you not just use the source stream directly?
If you need the file, I would use a loop that keeps checking if the file exists every second until it appears (or a silly amount of time has passed) - the writer would give you an error if you couldn't write the file, so you know it will turn up eventually.
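A minimal sketch of that kind of wait loop, assuming a hypothetical helper named WaitForFile and caller-chosen timeout and poll interval (none of these names come from the question):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

public static class FileWait
{
    // Polls until the file exists or the timeout elapses.
    // Returns true if the file was seen, false if the timeout was reached first.
    public static bool WaitForFile(string path, TimeSpan timeout, TimeSpan pollInterval)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (!File.Exists(path))
        {
            if (elapsed.Elapsed >= timeout)
            {
                return false; // give up; the caller decides whether to throw
            }
            Thread.Sleep(pollInterval);
        }
        return true;
    }
}
```

It could be called as FileWait.WaitForFile(CsvFilePath, TimeSpan.FromSeconds(10), TimeSpan.FromSeconds(1)) before opening the reader. This only hides the delay; it does not explain where the delay comes from.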

Since you're writing over a network, the best solution would be to save your file on the local system first and then copy it to the network location. This way you avoid network connection problems while writing, and you also have a backup in case of network failure.
Based on your update, try this instead:
    File.WriteAllText(outFileName, someString);
    string header = null;
    using (StreamReader reader = new StreamReader(CsvFilePath))
    {
        header = reader.ReadLine();
    }

Have you tried reading after disposing the writer's FileStream?
Like this:
    using (FileStream fs = new FileStream(outFileName, FileMode.Create))
    {
        using (StreamWriter writer = new StreamWriter(fs, Encoding.Unicode))
        {
            writer.WriteLine(someString);
        }
    }
    using (StreamReader rdr = new StreamReader(File.OpenRead(CsvFilePath)))
    {
        string header = rdr.ReadLine();
    }

Related

HttpContent.CopyToAsync for large files

Hope you're all doing well!
Let's say I'm downloading a file from an HTTP API endpoint and the file is quite large. The API returns application/octet-stream, i.e. HttpContent in my download method.
When I use
    using (FileStream fs = new FileStream(somepath, FileMode.Create))
    {
        // this operation takes a few seconds to write to disk
        await httpContent.CopyToAsync(fs);
    }
As soon as the using statement is entered, I see the file created on the file system at the given path, although it is 0 KB at this point. When CopyToAsync() finishes executing, the file size is as expected.
The problem is that another service is constantly polling the folder where these files are saved, and it often picks up 0 KB files or sometimes even partial files (this seems to be the case when I use WriteAsync(bytes[])).
Is there a way to not save the file on the file system until it's ready to be saved?
One weird workaround I could think of was:
    using (var memStream = new MemoryStream())
    {
        await httpContent.CopyToAsync(memStream);
        using (FileStream file = new FileStream(destFilePath, FileMode.Create, FileAccess.Write))
        {
            memStream.Position = 0;
            await memStream.CopyToAsync(file);
        }
    }
I copy the HttpContent over to a MemoryStream and then copy the MemoryStream over to a FileStream. This seems to have worked, but it costs memory.
Another workaround I could think of was to first save the files into a secondary location and, when the operation is complete, move the file over to the primary folder.
Thank you in Advance,
Johny
I ended up saving the file into a temporary folder and, when the operation is complete, moving the downloaded file to my primary folder. Since a move within the same volume is effectively atomic, I do not have this issue anymore.
Thank you for those who commented!
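For anyone landing here later, a rough sketch of the temp-folder-then-move approach. The class name AtomicSave, the method SaveAsync and the tempDir parameter are made up for illustration, and it assumes the temp folder is on the same volume as the destination and that the destination file does not exist yet:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AtomicSave
{
    // Writes 'source' to a file inside tempDir, then moves the finished file to destPath.
    // Assumption: tempDir and destPath are on the same volume, so the move is a rename.
    // File.Move throws an IOException if destPath already exists.
    public static async Task SaveAsync(Stream source, string tempDir, string destPath)
    {
        Directory.CreateDirectory(tempDir);
        string tempPath = Path.Combine(tempDir, Path.GetRandomFileName());

        using (var fs = new FileStream(tempPath, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            await source.CopyToAsync(fs);
        }

        // The polled folder only ever sees the file once it is complete.
        File.Move(tempPath, destPath);
    }
}
```

With an HttpContent, the source stream could come from await httpContent.ReadAsStreamAsync(). The polling service must not watch tempDir, otherwise the original problem just moves to that folder.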

EPPlus Open File and lock file through multiple saves

I want to be able to open an Excel file (or create it if it doesn't exist) and add data to it asynchronously. I have the async component working quite well using a blocking collection, but if I want to save on every iteration of my while loop I keep running into issues.
I either get file corruption, or the data never saves at all. Sometimes it only saves the first or the second data segment in my two-part test.
The following code is a cut-down version that shows the issue:
    BlockingCollection<Excel_Data> collection = null;
    FileStream fs = new FileStream(this.path, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.Read);
    ExcelPackage excel = new ExcelPackage(fs);
    int i = 0;
    while (true) {
        //---- do some async operations
        Excel_Data dict_item = collection.Take();
        excel.Workbook.Worksheets.Add("sheet" + i.ToString());
        //excel.Save();
        excel.SaveAs(fs);
        if (++i == 2) {
            break;
        }
    }
    fs.Close();
In the above example, after simply creating 2 sheets, the file is already corrupted, and I am unsure how to fix this without switching from FileStream to FileInfo entirely. But then I would never be able to lock my file for writing for the duration of my app.

Capturing changes to a log file

I'm developing a small C# application that scans a log file for lines containing certain keywords and alerts the user when one of the keywords is found. This log is potentially extremely large (several gigabytes in the worst case), but the only lines in the log that are relevant to me are the ones added while my application is running.
Is there a way I can capture each text line being appended to the file, without having to worry about the file content that was already present?
I already found out about the FileSystemWatcher class while searching for a solution, and while that seems great for notifying when I have new content to fetch from the log, it doesn't seem to help for telling me what was added to it.
If you keep a FileStream open in Read mode (allowing writers, of course), you should be able to initially scan through the whole file and wait at the end until the FSW notifies you that the file has been modified.
Just be careful to reset your reading thread somehow if the file is deleted, for example if the log file that you are tailing gets rolled.
Here, I knocked together an example. Run this, and while it is running, edit C:\Temp\Temp.txt in Notepad and save it:
    using System;
    using System.IO;
    using System.Threading;

    public static class Program
    {
        public static void Main()
        {
            var lockMe = new object();
            using (var latch = new ManualResetEvent(true))
            using (var fs = new FileStream(@"C:\Temp\Temp.txt", FileMode.OpenOrCreate, FileAccess.Read, FileShare.ReadWrite))
            using (var fsw = new FileSystemWatcher(@"C:\Temp\"))
            {
                fsw.Changed += (s, e) =>
                {
                    lock (lockMe)
                    {
                        if (e.FullPath != @"C:\Temp\Temp.txt") return;
                        latch.Set(); // wake the reader: the file changed
                    }
                };
                fsw.EnableRaisingEvents = true; // the watcher raises no events until this is set
                using (var sr = new StreamReader(fs))
                    while (true)
                    {
                        latch.WaitOne();
                        lock (lockMe)
                        {
                            String line;
                            while ((line = sr.ReadLine()) != null)
                                Console.Out.WriteLine(line);
                            latch.Reset(); // caught up; block until the next change
                        }
                    }
            }
        }
    }
The most efficient solution (if your application needs it) is to write a file hook driver to capture all write access to the file. That driver might tell you what bytes were changed. If you don't want to write the driver in C/C++, perhaps you can use EasyHook. EasyHook is great because, if you know the exact application that's writing to the log file, you can write a very simple user-mode hook (check his examples on CodePlex). If you don't know the name of the application, you might have to write a kernel hook (which is still easier with EasyHook).
Instead of reading the text from the file (what I assume you are doing), read the bytes of the file. If you can assume that writes to the file will always be appended, and you know the text encoding of the file, then you can just read in the bytes starting at the file size of the original file. Then convert the bytes to text using the proper encoding.
In a similar way to this question, but you'll need to have the old file size recorded. Then instead of seeking back 10 newlines, just seek back the size difference. You'll have to be careful about encodings though.
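The byte-offset idea from the last two answers might look roughly like this. LogTail and ReadAppended are hypothetical names, and the sketch assumes the file is only ever appended to and that the caller knows its encoding. A multi-byte character split across two reads would be decoded incorrectly, so treat it as a starting point only:

```csharp
using System;
using System.IO;
using System.Text;

public static class LogTail
{
    // Reads everything appended after 'offset' and returns it as text.
    // 'offset' is advanced by the number of bytes that were read.
    public static string ReadAppended(string path, Encoding encoding, ref long offset)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            if (fs.Length < offset)
            {
                offset = 0; // assumption: a shorter file means the log was rolled, so start over
            }

            fs.Seek(offset, SeekOrigin.Begin);
            var buffer = new byte[fs.Length - offset];
            int total = 0;
            while (total < buffer.Length)
            {
                int n = fs.Read(buffer, total, buffer.Length - total);
                if (n == 0) break; // end of file reached early
                total += n;
            }

            offset += total;
            return encoding.GetString(buffer, 0, total);
        }
    }
}
```

Record new FileInfo(path).Length once at startup as the initial offset, then call ReadAppended whenever the FileSystemWatcher reports a change and scan the returned text for the keywords.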

File contents stops updating on windows share

I have a program that (repeatedly) reads the contents of a file and, if new data arrives, does some processing. The reading is quite straightforward, something like:
    class Reader
    {
        FileStream fs_ = null;
        StreamReader sr_ = null;
        Reader(string filename)
        {
            fs_ = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite | FileShare.Delete);
            sr_ = new StreamReader(fs_);
        }
        void Read()
        {
            string line;
            while ((line = sr_.ReadLine()) != null)
            {
                // ...
            }
        }
    }
The Read() method is called every 300 ms. There is some other code that closes and reopens the file if it is renamed or deleted by external actors.
Generally it works fine, but sometimes (I've run into this twice during the last month) something strange happens. The file on the share reports the correct length, but reading from it returns about one and a half lines followed by zeroed content (0x00 bytes, not '0' characters) for the rest of the file. Moreover, I get the same picture when I open the file in any external text or binary editor on the machine that hosts my program. From other machines on the network the file reads without any problems and shows its full contents. The problem persists until I reboot the machine running my program.
Any idea what is happening and how I can fix it?

Reusing a filestream

In the past I've always used a FileStream object to write or rewrite an entire file after which I would immediately close the stream. However, now I'm working on a program in which I want to keep a FileStream open in order to allow the user to retain access to the file while they are working in between saves. ( See my previous question).
I'm using XmlSerializer to serialize my classes to and from an XML file. But now I'm keeping the FileStream open so it can be used to save (reserialize) my class instance later. Are there any special considerations I need to make if I'm reusing the same FileStream over and over again, versus using a new FileStream? Do I need to reset the stream to the beginning between saves? If a later save is smaller in size than the previous save, will the FileStream leave the remaining bytes from the old file and thus create a corrupted file? Do I need to do something to clear the file so it behaves as if I'm writing an entirely new file each time?
Your suspicion is correct - if you reset the position of an open file stream and write content that's smaller than what's already in the file, it will leave trailing data and result in a corrupt file (depending on your definition of "corrupt", of course).
If you want to overwrite the file, you really should close the stream when you're finished with it and create a new stream when you're ready to re-save.
I notice from your linked question that you are holding the file open in order to prevent other users from writing to it at the same time. This probably wouldn't be my choice, but if you are going to do that, then I think you can "clear" the file by invoking stream.SetLength(0) between successive saves.
There are various ways to do this; if you are re-opening the file, perhaps set it to truncate:
    using (var file = new FileStream(path, FileMode.Truncate)) {
        // write
    }
If you are overwriting the file while already open, then just trim it after writing:
file.SetLength(file.Position); // assumes we're at the new end
I would try to avoid delete/recreate, since this loses any ACLs etc.
Another option might be to use SetLength(0) to truncate the file before you start rewriting it.
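Putting the SetLength(0) suggestion together with XmlSerializer from the question, a save through an already-open stream could be sketched like this. ReusableSave and Save are illustrative names only, and the sketch assumes the stream was opened with write access and is seekable:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class ReusableSave
{
    // Rewrites the whole file through a stream that stays open between saves.
    public static void Save<T>(FileStream stream, T value)
    {
        stream.Position = 0;   // go back to the start of the file
        stream.SetLength(0);   // drop the previous contents so no stale bytes remain
        new XmlSerializer(typeof(T)).Serialize(stream, value);
        stream.Flush();        // push buffered bytes to the operating system
    }
}
```

One design note: truncating first means a crash in the middle of Serialize leaves an empty or partial file, which is the trade-off for keeping the handle open instead of writing a new file and swapping it in.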
Recently ran into the same requirement. In fact, previously, I used to create a new FileStream within a using statement and overwrite the previous file. Seems like the simple and effective thing to do.
    using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write))
    {
        ProtoBuf.Serializer.Serialize(stream, value);
    }
However, I ran into locking issues where some other process is locking the target file. In my attempt to thwart this I retried the write several times before pushing the error up the stack.
    int attempt = 0;
    while (true)
    {
        try
        {
            using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write))
            {
                ProtoBuf.Serializer.Serialize(stream, value);
            }
            break;
        }
        catch (IOException)
        {
            // could be locked by another process
            // make up to X attempts to write the file
            attempt++;
            if (attempt >= X)
            {
                throw;
            }
            Thread.Sleep(100);
        }
    }
That seemed to work for almost everyone. Then that problem machine came along and forced me down the path of maintaining a lock on the file the entire time. So in lieu of retrying to write the file in the case it's already locked, I'm now making sure I get and hold the stream open so there are no locking issues with later writes.
    int attempt = 0;
    while (true)
    {
        try
        {
            _stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.Read);
            break;
        }
        catch (IOException)
        {
            // could be locked by another process
            // make up to X attempts to open the file
            attempt++;
            if (attempt >= X)
            {
                throw;
            }
            Thread.Sleep(100);
        }
    }
Now when I write the file the FileStream position must be reset to zero, as Aaronaught said. I opted to "clear" the file by calling _stream.SetLength(0). Seemed like the simplest choice. Then using our serializer of choice, Marc Gravell's protobuf-net, serialize the value to the stream.
_stream.SetLength(0);
ProtoBuf.Serializer.Serialize(_stream, value);
This works just fine most of the time and the file is completely written to the disk. However, on a few occasions I've observed the file not being immediately written to the disk. To ensure the stream is flushed and the file is completely written to disk I also needed to call _stream.Flush(true).
_stream.SetLength(0);
ProtoBuf.Serializer.Serialize(_stream, value);
_stream.Flush(true);
Based on your question I think you'd be better served closing/re-opening the underlying file. You don't seem to be doing anything other than writing the whole file. The value you can add by re-writing Open/Close/Flush/Seek will be next to 0. Concentrate on your business problem.
