File IO slow or cached in a web service? - c#

I am writing a simple web service using .NET; one method is used to send a chunk of a file from the client to the server, and the server opens a temp file and appends this chunk. The files are quite large, around 80 MB. The network IO seems fine, but the append write to the local file slows down progressively as the file gets larger.
The following is the code that slows down, running on the server, where aFile is a string and aData is a byte[]:
using (StreamWriter lStream = new StreamWriter(aFile, true))
{
BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
lWriter.Write(aData);
}
Debugging this process I can see that exiting the using statement is slower and slower.
If I run this code in a simple standalone test application the writes take the same time every run, about 3 ms. Note the buffer (aData) is always the same size, about 0.5 MB.
I have tried all sorts of experiments with different writers and with system copies to append scratch files; everything slows down when running under the web service.
Why is this happening? I suspect the web service is trying to cache access to local file system objects; how can I turn this off for specific files?
More information -
If I hard-code the path the speed is fine, like so:
using (StreamWriter lStream = new StreamWriter("c:\\test.dat", true))
{
BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
lWriter.Write(aData);
}
But then it is slow copying this scratch file to the final destination later on:
File.Copy("c:\\test.dat", aFile);
If I use any variable in the path it gets slow again, so for example:
using (StreamWriter lStream = new StreamWriter("c:\\test" + someVariable, true))
{
BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
lWriter.Write(aData);
}
It has been commented that I should not use StreamWriter. Note that I tried many ways to open the file using FileStream, none of which made any difference when the code is running under the web service; I tried WriteThrough etc.
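One of the FileStream variants tried looked roughly like this (a sketch; the exact buffer size and options varied between attempts):
// Append with WriteThrough to bypass the OS write cache.
using (FileStream lStream = new FileStream(aFile, FileMode.Append, FileAccess.Write,
    FileShare.None, 4096, FileOptions.WriteThrough))
{
    lStream.Write(aData, 0, aData.Length);
}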
It's the strangest thing; I even tried this:
Write the data to file a.dat
Spawn system "cmd" "copy /b b.dat + a.dat b.dat"
Delete a.dat
This slows down in the same way!
Makes me think the web server is running in some protected file IO environment, catching all file operations in this process and its child processes. I could understand this if I were generating a file that might later be served to a client, but I am not; what I am doing is storing large binary blobs on disk, with an index/pointer to them stored in a database. If I comment out the write to the file, the whole process flies with no performance issues at all.
I started reading about web server caching strategies, which makes me wonder: is there a web.config setting to mark a folder as uncached? Or am I completely barking up the wrong tree?

A long shot: is it possible that you need to close some resources when you have finished?

If the file is binary, then why are you using a StreamWriter, which is derived from TextWriter? Just use a FileStream.
Also, BinaryWriter implements IDisposable, so you need to put it in a using block.
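Something along these lines, for example (a sketch using your variable names):
// Append the raw bytes with a FileStream; both writer and stream get disposed.
using (FileStream lStream = new FileStream(aFile, FileMode.Append, FileAccess.Write))
using (BinaryWriter lWriter = new BinaryWriter(lStream))
{
    lWriter.Write(aData);
}
For a plain byte[] you could even drop the BinaryWriter and call lStream.Write(aData, 0, aData.Length) directly.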

Update... I replicated the basic code, no database, kept it simple, and it seems to work fine, so I suspect there is another reason. I will sleep on it over the weekend...
Here is the replicated server code -
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Services;
using System.IO;
namespace TestWS
{
/// <summary>
/// Summary description for Service1
/// </summary>
[WebService(Namespace = "http://tempuri.org/")]
[WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
[System.ComponentModel.ToolboxItem(false)]
// To allow this Web Service to be called from script, using ASP.NET AJAX, uncomment the following line.
// [System.Web.Script.Services.ScriptService]
public class Service1 : System.Web.Services.WebService
{
private string GetFileName ()
{
if (File.Exists("index.dat"))
{
using (StreamReader lReader = new StreamReader("index.dat"))
{
return lReader.ReadLine();
}
}
else
{
using (StreamWriter lWriter = new StreamWriter("index.dat"))
{
string lFileName = Path.GetRandomFileName();
lWriter.Write(lFileName);
return lFileName;
}
}
}
[WebMethod]
public string WriteChunk(byte[] aData)
{
Directory.SetCurrentDirectory(Server.MapPath("Data"));
DateTime lStart = DateTime.Now;
using (FileStream lStream = new FileStream(GetFileName(), FileMode.Append))
{
BinaryWriter lWriter = new BinaryWriter(lStream);
lWriter.Write(aData);
}
DateTime lEnd = DateTime.Now;
return lEnd.Subtract(lStart).TotalMilliseconds.ToString();
}
}
}
And the replicated client code -
static void Main(string[] args)
{
Service1 s = new Service1();
byte[] b = new byte[1024 * 512];
for ( int i = 0 ; i < 160 ; i ++ )
{
Console.WriteLine(s.WriteChunk(b));
}
}

Based on your code, it appears you're using the default handling inside of StreamWriter for files, which means synchronous and exclusive locks on the file.
Based on your comments, it seems the issue you really want to solve is the return time from the web service -- not necessarily the write time for the file. While the write time is the current gating factor as you've discovered, you might be able to get around your issue by going to an asynchronous-write mode.
Alternatively, I prefer completely de-coupled asynchronous operations. In that scenario, the inbound byte[] of data would be saved to its own file (or some other structure), then appended to the master file by a secondary process. More complex operationally, but also less prone to failure.
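A rough sketch of that first half (chunkFolder, uploadId and chunkIndex are illustrative placeholders, not from the original service):
// In the web method: just persist this chunk to its own file and return.
// A background job appends the numbered chunks to the master file later.
string chunkPath = Path.Combine(chunkFolder,
    string.Format("{0}.{1:D6}.chunk", uploadId, chunkIndex));
using (FileStream fs = new FileStream(chunkPath, FileMode.CreateNew, FileAccess.Write))
{
    fs.Write(aData, 0, aData.Length);
}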

I don't have enough points to vote up an answer, but jro has the right idea. We do something similar in our service; each chunk is saved to a single temp file, then as soon as all chunks are received they're reassembled into a single file.
I'm not certain about the underlying process for appending data to a file using StreamWriter, but I would assume it would have to at least read to the end of the current file before attempting to write whatever is in the buffer to it. So as the file gets larger it would have to read more and more of the existing file before writing the next chunk.
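A sketch of the reassembly step described in the first paragraph, once all chunks have arrived (names are placeholders; it avoids Stream.CopyTo so it also works on older framework versions):
// Stitch the numbered chunk files back into a single file, in order.
string[] chunks = Directory.GetFiles(chunkFolder, uploadId + ".*.chunk");
Array.Sort(chunks, StringComparer.Ordinal);   // D6 numbering keeps ordinal sort correct
using (FileStream output = new FileStream(finalPath, FileMode.Create, FileAccess.Write))
{
    byte[] buffer = new byte[64 * 1024];
    foreach (string chunk in chunks)
    {
        using (FileStream input = File.OpenRead(chunk))
        {
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);
        }
        File.Delete(chunk);   // remove the temp chunk once copied
    }
}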

Well, I found the root cause: "Microsoft Forefront Security". Group policy has this running real-time scanning, and I could see the process go to 30% CPU usage when I closed the file. Kill this process and everything runs at the same speed, outside and inside the web service!
Next task: find a way to add an exclusion to MFS!

Related

C# How do I append code to the end of a running .NET .EXE, preferably from inside that .EXE?

Can a running .NET .EXE append data to itself? What's stopping it?
I could launch a separate process to do it just fine.
But I can't figure out how to make it write to itself while it's running. Is there any way to do this in .NET?
EDIT: And preferably no hacky solutions like writing it out somewhere else then copying/renaming
EDIT2: Clarifying type of executable
EDIT3: Purpose: Writing binary stream to my running EXE file allows me to then parse the .EXE file on disk for those bytes and use them in the program. Without having to create any new files or registry entries or stuff like that. It is self contained. This is extremely convenient.
EDIT4: For those against this idea, please think about the functions of FILE ZIPPING, DLL LINKING, and PORTABLE APPLICATIONS before trying to discredit it.
There are a lot of bad consequences of storing data this way, as said in the comments, but there's a bigger problem: the answer to the "What's stopping it?" question. The Windows PE loader locks the image file for writing while it is in execution, so you can't get a HANDLE to the file with write permissions: the NtCreateFile and NtOpenFile system calls with the FILE_WRITE_DATA option will fail, as will any attempt to delete the file. This block is implemented at kernel level and set during the NtCreateProcess system call, before the process and its modules' entry points are actually called.
The only dirty trick possible without writing data to disk, sending it to a remote server, or having kernel privileges is to use another process, via a helper executable, code injection, or command-line scripts (e.g. with PowerShell), which can kill your process to release the lock, append data to the end of the file, and restart it. Of course these options have even worse consequences; I wrote this only to make clear the OS limitations (made on purpose) and why no professional software uses this technique to store data.
EDIT: since you are so determined to accomplish this behavior, I'll post a proof of concept for appending data via a helper executable (file copy). The method relies on executing a new copy of the image in the TEMP folder and passing it the path to the original executable, which can then be "written to" because it isn't running and locked. FOR READERS: I SUGGEST YOU DON'T USE THIS IN PRODUCTION.
using System;
using System.IO;
using System.Reflection;
using System.Diagnostics;
namespace BadStorage
{
class Program
{
static void Main(string[] args)
{
var temp = Path.GetTempPath();
var exePath = Assembly.GetExecutingAssembly().Location;
if (exePath.IndexOf(temp, StringComparison.OrdinalIgnoreCase) >= 0 && args.Length > 0)
{
// "Real" main
var originalExe = args[0];
if (File.Exists(originalExe))
{
// Your program code...
byte[] data = { 0xFF, 0xEE, 0xDD, 0xCC };
// Write
using (var fs = new FileStream(originalExe, FileMode.Append, FileAccess.Write, FileShare.None))
fs.Write(data, 0, data.Length);
// Read
using (var fs = new FileStream(originalExe, FileMode.Open, FileAccess.Read, FileShare.Read))
{
fs.Seek(-data.Length, SeekOrigin.End);
fs.Read(data, 0, data.Length);
}
}
}
else
{
// Self-copy
var exeCopy = Path.Combine(temp, Path.GetFileName(exePath));
File.Copy(exePath, exeCopy, true);
var p = new Process()
{
StartInfo = new ProcessStartInfo()
{
FileName = exeCopy,
Arguments = $"\"{exePath}\"",
UseShellExecute = false
}
};
p.Start();
}
}
}
}
Despite all the negativity, there is a clean way to do this:
The way I have found only requires the program be executed on an NTFS drive.
The trick is to have your app copy itself to an alternate stream as soon as it's launched, then execute that image and immediately close itself. This can be easily done with the commands:
type myapp.exe > myapp.exe:image
forfiles /m myapp.exe /c myapp.exe:image
Once your application is running from an alternate stream (myapp.exe:image), it is free to modify the original file (myapp.exe) and read the data that's stored within it. The next time the program starts, the modified application will be copied to the alternate stream and executed.
This allows you to get the effect of an executable writing to itself while running, without dealing with any extra files and allows you to store all settings within a single .exe file.
The file must be executed on an NTFS partition, but that is not a big deal since all Windows installations use this format. You can still copy the file to other filesystems, you just cannot execute it there.

Writing to file, memory used steadily increasing

I have an application where I need to constantly write binary data to a file. The pieces of data are small, about 1K each. The computers this runs on aren't great and are running XP. I've run into the problem that when I turn on the logging, the computers just get totally hosed; I watch Task Manager and see the memory usage going up and up until it crashes.
A coworker suggested that I just keep the packets in memory until a certain amount of time has passed and then write it all at once instead of writing each one separately - tried that, same issue.
This is the code (loggingBuffer is the List<byte[]> I'm storing the packets in while the interval passes):
if ((DateTime.Now - lastStoreTime).TotalSeconds > 10)
{
string fileName = @"C:\Storage\file";
FileMode fm = File.Exists(fileName) ? FileMode.Append : FileMode.Create;
using (BinaryWriter w = new BinaryWriter(File.Open(fileName, fm), Encoding.ASCII))
{
foreach (byte[] packetData in loggingBuffer)
{
w.Write(packetData);
}
}
loggingBuffer.Clear();
lastStoreTime= DateTime.Now;
}
Is there anything different I should be doing to accomplish this?
Seems to me that, since you're only writing every 10 seconds, you could close the file in between and clean up all the related file-writing objects. Perhaps that would solve your problem.
Secondly, I'd suggest creating the BinaryWriter outside the function where you actually write the data. It'll keep things clearer. In your current code you're checking each time whether to append data or to create a new file and then write to it. If you do this outside the function, just once, perhaps that will save memory too. All untested by me, that is :)
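A rough sketch of that idea (PacketLogger is an illustrative name, not from your code): open the file once, keep the writer around, and just flush each interval.
// Holds one long-lived writer instead of re-opening the file every 10 seconds.
public class PacketLogger : IDisposable
{
    private readonly BinaryWriter _writer;

    public PacketLogger(string fileName)
    {
        // FileMode.Append creates the file when it doesn't exist yet.
        _writer = new BinaryWriter(
            new FileStream(fileName, FileMode.Append, FileAccess.Write));
    }

    public void WritePackets(IEnumerable<byte[]> packets)
    {
        foreach (byte[] packet in packets)
            _writer.Write(packet);
        _writer.Flush();   // push buffered bytes down to the OS each interval
    }

    public void Dispose()
    {
        _writer.Close();
    }
}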

Reading file after writing it

I have a strange problem. My code flow is as follows:
1. The exe takes some data from the user.
2. It calls a web service to write the file (and create a CSV of the data) at a particular network location (say \\some-server\some-directory). This web service is hosted at the same location where that folder is (i.e. I could also change it to c:\some-directory). It returns after writing the file.
3. The exe checks that the file exists; if it does, processing continues, else it quits with an error.
The problem I am having is at step 3. When I try to read the file immediately after it has been written, I always get a file not found exception (but the file is present). I do not get this exception when I am debugging (because debugging introduces a delay) or when I put a Thread.Sleep(3000) before reading the file.
This is really strange because I close the StreamWriter before I return the call to the exe. According to the documentation, Close should force the stream to flush. This is also not related to the size of the file. Also, I am not doing async thread calls for writing and reading the file; they run in the same thread, serially, one after another (only the writing is done by the web service and the reading by the exe, but the calls are still serial).
I do not know, but it feels like there is some time difference between when you call Close() and when the file actually gets written to disk. However, this is baffling because it is not at all related to size; it happens for every file size. I have tried this with files of 10, 50, 100 and 200 lines of data.
Another thing I suspected was that, since I was writing this file to a network location, Windows might be optimizing the call by writing first to a cache and then to the network location. So I went ahead and changed the code to write to a local drive (i.e. use c:\some-directory) rather than the network location, but that resulted in the same error.
There is no error in the code (for reading and writing); as explained earlier, with a delay it starts working fine. Some other useful information:
The exe is .Net Framework 3.5
Windows Server 2008(64 bit, 4 GB Ram)
Edit 1
File.AppendAllText() is not the correct solution, as it creates a new file if one does not exist.
Edit 2
code for writing
using (FileStream fs = new FileStream(outFileName, FileMode.Create))
{
using (StreamWriter writer = new StreamWriter(fs, Encoding.Unicode))
{
writer.WriteLine(someString);
}
}
code for reading
StreamReader rdr = new StreamReader(File.OpenRead(CsvFilePath));
string header = rdr.ReadLine();
rdr.Close();
Edit 3
Used TextWriter, same error:
using (TextWriter writer = File.CreateText(outFileName))
{
}
Edit 4
Finally, as suggested by some users, I check for the file in a while loop a certain number of times before I throw the file not found exception.
int i = 1;
while (i++ < 10)
{
bool fileExists = File.Exists(CsvFilePath);
if (!fileExists)
System.Threading.Thread.Sleep(500);
else
break;
}
So you are writing a stream to a file, then reading the file back to a stream? Do you need to write the file then post process it, or can you not just use the source stream directly?
If you need the file, I would use a loop that keeps checking if the file exists every second until it appears (or a silly amount of time has passed) - the writer would give you an error if you couldn't write the file, so you know it will turn up eventually.
Since you're writing over a network, the most robust solution would be to save your file on the local system first, then copy it to the network location. This way you can avoid network connection problems, and you also have a backup in case of network failure.
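A sketch of what that could look like with the code from the question (assuming outFileName is the network path):
// Write to a local temp file first, then copy the finished file to the share.
string localPath = Path.Combine(Path.GetTempPath(), Path.GetFileName(outFileName));
using (StreamWriter writer = new StreamWriter(localPath, false, Encoding.Unicode))
{
    writer.WriteLine(someString);
}
File.Copy(localPath, outFileName, true);   // overwrite any stale copy on the network
File.Delete(localPath);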
Based on your update, try this instead:
File.WriteAllText(outFileName, someString);
header = null;
using(StreamReader reader = new StreamReader(CsvFilePath)) {
header = reader.ReadLine();
}
Have you tried reading after disposing the writer's FileStream?
Like this:
using (FileStream fs = new FileStream(outFileName, FileMode.Create))
{
using (StreamWriter writer = new StreamWriter(fs, Encoding.Unicode))
{
writer.WriteLine(someString);
}
}
using (StreamReader rdr = new StreamReader(File.OpenRead(CsvFilePath)))
{
string header = rdr.ReadLine();
}

Prune simple text log file using C# .NET 4.0

An external Windows service I work with maintains a single text-based log file that it continuously appends to. This log file grows unbounded over time. I'd like to prune this log file periodically to retain, say, the most recent 5 MB of log entries. How can I efficiently implement the file I/O code in C# .NET 4.0 to prune the file to, say, 5 MB?
Updated:
The way service dependencies are set up, my service always starts before the external service. This means I get exclusive access to the log file to truncate it, if required. Once the external service starts up, I will not access the log file. I can gain exclusive access to the file on desktop startup. The problem is, the log file may be a few gigabytes in size, and I'm looking for an efficient way to truncate it.
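For concreteness, here is a minimal sketch of one way to keep just the tail at startup (logPath is a placeholder; it holds the retained 5 MB in memory and may cut the oldest retained entry mid-line):
const long MAX_BYTES = 5L * 1024 * 1024;   // keep the most recent 5 MB
using (FileStream log = new FileStream(logPath, FileMode.Open, FileAccess.ReadWrite))
{
    if (log.Length > MAX_BYTES)
    {
        byte[] tail = new byte[MAX_BYTES];
        log.Seek(-MAX_BYTES, SeekOrigin.End);      // position at the last 5 MB
        int read = 0;
        while (read < tail.Length)
        {
            int n = log.Read(tail, read, tail.Length - read);
            if (n == 0) break;
            read += n;
        }
        log.Position = 0;
        log.Write(tail, 0, read);                  // move the tail to the front
        log.SetLength(read);                       // discard everything after it
    }
}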
It's going to take the amount of memory that you want to store to process the "new" log file, but if you only want 5 MB then it should be fine. If you are talking about GB+ then you probably have other problems; however, it could still be accomplished using a temp file and some locking.
As noted before, you may experience a race condition but that's not the case if this is the only thread writing to this file. This would replace your current writing to the file.
const int MAX_FILE_SIZE_IN_BYTES = 5 * 1024 * 1024; //5 MB
const string LOG_FILE_PATH = @"ThisFolder\log.txt";
string newLogMessage = "Hey this happened";
#region Use one or the other, I mean you could use both below if you really want to.
//Use this one to save an extra character
if (!newLogMessage.StartsWith(Environment.NewLine))
newLogMessage = Environment.NewLine + newLogMessage;
//Use this one to imitate a write line
if (!newLogMessage.EndsWith(Environment.NewLine))
newLogMessage = newLogMessage + Environment.NewLine;
#endregion
int newMessageSize = newLogMessage.Length*sizeof (char);
byte[] logMessage = new byte[MAX_FILE_SIZE_IN_BYTES];
//Append new log to end of "file"
System.Buffer.BlockCopy(newLogMessage.ToCharArray(), 0, logMessage, MAX_FILE_SIZE_IN_BYTES - newMessageSize, newMessageSize);
FileStream logFile = File.Open(LOG_FILE_PATH, FileMode.Open, FileAccess.ReadWrite);
int sizeOfRetainedLog = (int)Math.Min(MAX_FILE_SIZE_IN_BYTES - newMessageSize, logFile.Length);
//Set start position/offset of the file
logFile.Position = logFile.Length - sizeOfRetainedLog;
//Read remaining portion of file to beginning of buffer
logFile.Read(logMessage, 0, sizeOfRetainedLog);
//Clear the file
logFile.SetLength(0);
logFile.Flush();
//Write the retained portion of the old log, then the new message, and close the file
logFile.Write(logMessage, 0, sizeOfRetainedLog);
logFile.Write(logMessage, MAX_FILE_SIZE_IN_BYTES - newMessageSize, newMessageSize);
logFile.Close();
I wrote this really quick, I apologize if I'm off by 1 somewhere.
Depending on how often it is written to, I'd say you might be facing a race condition trying to modify the file without damaging the log. You could always try writing a service to monitor the file size and, once it reaches a certain point, lock the file, duplicate and clear the whole thing, and close it. Then store the data in another file whose size the service can control easily. Alternatively, you could see if the external service has an option for logging to a database, which would make it pretty simple to roll off the oldest data.
You could use a file observer to monitor the file:
FileSystemWatcher logWatcher = new FileSystemWatcher();
logWatcher.Path = @"c:\";            // Path must be a directory; the file name goes in Filter
logWatcher.Filter = "example.log";
logWatcher.Changed += logWatcher_Changed;
logWatcher.EnableRaisingEvents = true;
Then when the event is raised you can use a StreamReader to read the file
private void logWatcher_Changed(object sender, FileSystemEventArgs e)
{
using (StreamReader readFile = new StreamReader(e.FullPath))
{
string line;
while ((line = readFile.ReadLine()) != null)
{
// Here you delete the lines you want or move it to another file, so that your log keeps small. Then save the file.
}
}
}
It's an option.

MemoryMappedFile.CreateFromFile always throws UnauthorizedAccessException

I realize .NET 4.0 is in Beta, but I'm hoping someone has a resolution for this. I'm trying to create a memory mapped file from a DLL:
FileStream file = File.OpenRead(@"C:\mydll.dll");
using (MemoryMappedFile mappedFile = MemoryMappedFile.CreateFromFile(file,
"PEIMAGE", 1024 * 1024, MemoryMappedFileAccess.ReadExecute))
{
using (MemoryMappedViewStream viewStream = mappedFile.CreateViewStream())
{
// read from the view stream
}
}
Unfortunately, no matter what I do, I always get an UnauthorizedAccessException, for which the MSDN documentation states:
The operating system denied the specified access to the file; for example, access is set to Write or ReadWrite, but the file or directory is read-only.
I've monitored my application with Sysinternals Process Monitor, which shows that the file is indeed being opened successfully. I've also tried memory mapping other non-DLL files, but with the same result.
Well, I've got an example based on the above which runs without exceptions. I've made two important changes:
I've only specified MemoryMappedFileAccess.Read when creating the MemoryMappedFile. You've opened the file for reading, so you can only read. I haven't tried fixing it to allow execute as well by changing how the FileStream is opened.
I've made the CreateViewStream call explicitly use MemoryMappedFileAccess.Read as well. I'm not sure why it doesn't use the existing access rights by itself, but there we go.
Full program:
using System.IO;
using System.IO.MemoryMappedFiles;
class Test
{
static void Main()
{
FileStream file = File.OpenRead("Test.cs");
using (MemoryMappedFile mappedFile = MemoryMappedFile.CreateFromFile
(file, "PEIMAGE", file.Length, MemoryMappedFileAccess.Read, null, 0, false))
{
using (var viewStream = mappedFile.CreateViewStream
(0, file.Length, MemoryMappedFileAccess.Read))
{
// read from the view stream
}
}
}
}
I had the same behaviour when calling the CreateViewAccessor(...) method.
Turns out the error was only thrown when the size parameter exceeded the length of the file. It's not the same behaviour as we're used to with streams, where size is a maximum value; instead it appears to take the parameter literally, and the result is an attempt to map past the end of the file.
I fixed my problem by checking that the size doesn't exceed the size of the open file.
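In other words, something like this guard before creating the view (a sketch; requestedSize stands in for whatever length was being passed before, and mappedFile and file are as in the program above):
// Never map more than the file actually contains.
long viewSize = Math.Min(requestedSize, file.Length);
using (var accessor = mappedFile.CreateViewAccessor(0, viewSize, MemoryMappedFileAccess.Read))
{
    // read from the accessor
}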
