Multi-threaded file write enqueueing

So I have a static class that is supposed to be used as a log file manager. It can add "messages" (strings) to a Queue object and pushes the messages out to a file. The trouble is that many different threads should be enqueueing, and the writer needs to be async as well. Currently, when I insert into the queue, I also check whether the writer is writing (a bool check); if it's not, I set the bool and then start the writing. But I'm getting intermittent IO exceptions about file access, and sometimes weird writing behavior.
Someone want to give me a hand on this?

If you don't want to restructure your code dramatically like I suggested in my other answer, you could try this, which assumes your LogManager class has:
a static thread-safe queue, _SynchronizedQueue
a static object to lock on when writing, _WriteLock
and these methods:
public static void Log(string message) {
    LogManager._SynchronizedQueue.Enqueue(message);
    // Pass the method itself as the WaitCallback
    ThreadPool.QueueUserWorkItem(LogManager.Write);
}
// QueueUserWorkItem accepts a WaitCallback that requires an object parameter
private static void Write(object data) {
    // This ensures only one thread can write at a time, but it's dangerous
    lock (LogManager._WriteLock) {
        string message = (string)LogManager._SynchronizedQueue.Dequeue();
        if (message != null) {
            // Your file writing logic here
        }
    }
}
There's only one problem: the lock statement in the Write method above will guarantee only one thread can write at a time, but this is dangerous. A lot can go wrong when trying to write to a file, and you don't want to hold onto (block) thread pool threads indefinitely. Therefore, you need to use a synchronization object that lets you specify a timeout, such as a Monitor, and rewrite your Write method like this:
// Keeps the object parameter so it still matches the WaitCallback signature
private static void Write(object data) {
    if (!Monitor.TryEnter(LogManager._WriteLock, 2000)) {
        // Do whatever you want when you can't get a lock in time
    } else {
        try {
            string message = (string)LogManager._SynchronizedQueue.Dequeue();
            if (message != null) {
                // Your file writing logic here
            }
        }
        finally {
            Monitor.Exit(LogManager._WriteLock);
        }
    }
}

It sounds like the queue is driving the file writing operation. I recommend that you invert the control relationship so that the writer drives the process and checks the queue for work instead.
The simplest way to implement this is to add a polling mechanism to the writer in which it checks the queue for work at regular intervals.
Alternatively, you could create an observable queue class that notifies subscribers (the writer) whenever the queue transitions from empty to non-empty: the subscribing writer could then begin its work. (At this time, the writer should also unsubscribe from the queue's broadcast, or otherwise change the way it reacts to the queue's alerts.)
After completing its job, the writer then checks the queue for more work. If there's no more work to do, it either goes to sleep and resumes polling, or goes to sleep and resubscribes to the queue's alerts.
As Irwin noted in his answer, you also need to use the thread-safe wrapper provided by the Queue class' Synchronized method or manually synchronize access to your Queue if multiple threads are reading from it and writing to it (as in SpaceghostAli's example).
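The writer-driven design described above can be sketched roughly as follows. This is only an illustration under assumptions: the class name QueuedFileLogger and its members are hypothetical, it uses ConcurrentQueue<string> (which needs .NET 4 or later) instead of the synchronized Queue wrapper, and it leaves out error handling for file I/O failures.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Hypothetical names throughout; a sketch of a writer that drives the process.
public sealed class QueuedFileLogger : IDisposable
{
    private readonly ConcurrentQueue<string> _queue = new ConcurrentQueue<string>();
    private readonly AutoResetEvent _workAvailable = new AutoResetEvent(false);
    private readonly Thread _writer;
    private readonly string _path;
    private volatile bool _stopping;

    public QueuedFileLogger(string path)
    {
        _path = path;
        _writer = new Thread(WriterLoop) { IsBackground = true, Name = "LogWriter" };
        _writer.Start();
    }

    // Called from any thread: enqueue and wake the writer.
    public void Log(string message)
    {
        _queue.Enqueue(message);
        _workAvailable.Set();
    }

    private void WriterLoop()
    {
        while (true)
        {
            // Drain everything that is currently queued with a single file open.
            if (!_queue.IsEmpty)
            {
                using (var writer = new StreamWriter(_path, append: true))
                {
                    while (_queue.TryDequeue(out string message))
                        writer.WriteLine(message);
                }
            }
            if (_stopping && _queue.IsEmpty) return;
            // Sleep until signalled; the timeout doubles as a polling interval.
            _workAvailable.WaitOne(TimeSpan.FromSeconds(1));
        }
    }

    // Flushes remaining messages and stops the writer thread.
    public void Dispose()
    {
        _stopping = true;
        _workAvailable.Set();
        _writer.Join();
        _workAvailable.Dispose();
    }
}
```

Only the writer thread ever touches the file, so the file-access exceptions caused by competing writers should not occur with this layout. Calling Log after Dispose is not supported in this sketch.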

I would have just one thread doing the writes to avoid contention, while I would use multiple threads to enqueue.
You are advised "To guarantee the thread safety of the Queue, all operations must be done through the wrapper returned by the Synchronized method." - from http://msdn.microsoft.com/en-us/library/system.collections.queue.aspx
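As a small illustration of that advice, the sketch below routes every operation through the wrapper returned by Queue.Synchronized. The class and method names are made up for the example. One detail that is easy to miss: checking Count and then calling Dequeue are two separate calls, so the pair is guarded with the wrapper's SyncRoot.

```csharp
using System;
using System.Collections;

public static class SynchronizedQueueExample
{
    // All access goes through the wrapper, never through the inner queue.
    private static readonly Queue Messages = Queue.Synchronized(new Queue());

    public static void Add(string message) => Messages.Enqueue(message);

    // Returns null when nothing is queued. Count followed by Dequeue is two
    // separate calls, so the pair is guarded with SyncRoot to keep it atomic.
    public static string TryTake()
    {
        lock (Messages.SyncRoot)
        {
            return Messages.Count > 0 ? (string)Messages.Dequeue() : null;
        }
    }
}
```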

You should synchronize around your queue. Have multiple threads send to the queue and a single thread read from the queue and write to the file.
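public void Log(string entry)
{
    // Producers hold the mutex only long enough to enqueue
    _logMutex.WaitOne();
    _logQueue.Enqueue(entry);
    _logMutex.ReleaseMutex();
}
private void Write()
{
    ...
    // The single writer holds the mutex only while dequeuing
    _logMutex.WaitOne();
    string entry = _logQueue.Dequeue();
    _logMutex.ReleaseMutex();
    // The file write happens outside the mutex so producers are not blocked by I/O
    _filestream.WriteLine(entry);
    ...
}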

Let me address the problem at a different level:
If you're writing a business application, then you'd want to focus on the business-logic portions rather than the infrastructure code, more so if this infrastructure code is already available, tested and deployed to multiple production sites (taking care of your NFRs).
I'm sure you're aware of the existence of logging frameworks like log4net and others (http://csharp-source.net/open-source/logging).
Have you given these a chance before hand-rolling your own logger?
Take this option to the technical architect of the enterprise you're writing for and see what she thinks.
Cheers

Related

Scaling Connections with BlockingCollection<T>()

I have a server which communicates with 50 or more devices over TCP LAN. There is a Task.Run for each socket reading message loop.
I buffer each message read into a blocking queue, where each blocking queue has a Task.Run using a BlockingCollection.Take().
So something like (semi-pseudocode):
Socket Reading Task
Task.Run(() =>
{
    while (notCancelled)
    {
        element = ReadXml();
        switch (element)
        {
            case messageheader:
                MessageBlockingQueue.Add(deserialze<messageType>());
            ...
        }
    }
});
Message Buffer Task
Task.Run(() =>
{
    while (notCancelled)
    {
        Process(MessageQueue.Take());
    }
});
So that would make 50+ reading tasks and 50+ tasks blocking on their own buffers.
I did it this way to avoid blocking the reading loop and allow the program to distribute processing time on messages more fairly, or so I believe.
Is this an inefficient way to handle it? What would be a better way?

You may be interested in the "channels" work, in particular: System.Threading.Channels. The aim of this is to provide asynchronous producer/consumer queues, covering both single and multiple producer and consumer scenarios, upper limits, etc. By using an asynchronous API, you aren't tying up lots of threads just waiting for something to do.
Your read loop would become:
while (notCancelled) {
    var next = await queue.Reader.ReadAsync(optionalCancellationToken);
    Process(next);
}
and the producer:
switch (element)
{
    case messageheader:
        queue.Writer.TryWrite(deserialze<messageType>());
    ...
}
so: minimal changes
Alternatively - or in combination - you could look into things like "pipelines" (https://www.nuget.org/packages/System.IO.Pipelines/) - since you're dealing with TCP data, this would be an ideal fit, and is something I've looked at for the custom web-socket server here on Stack Overflow (which deals with huge numbers of connections). Since the API is async throughout, it does a good job of balancing work - and the pipelines API is engineered with typical TCP scenarios in mind, for example partially consuming incoming data streams as you detect frame boundaries. I've written about this usage a lot, with code examples mostly here. Note that "pipelines" doesn't include a direct TCP layer, but the "kestrel" server includes one, or the third-party library https://www.nuget.org/packages/Pipelines.Sockets.Unofficial/ does (disclosure: I wrote it).
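To make the channel-based loop above a little more concrete, here is a minimal self-contained sketch. The class name ChannelExample and the use of plain strings as messages are assumptions made for illustration; the real code would deserialize message types from the socket and call its own Process method.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelExample
{
    // Pumps the given messages through an unbounded channel and returns
    // them in the order the consumer processed them.
    public static async Task<List<string>> RunAsync(IEnumerable<string> messages)
    {
        var channel = Channel.CreateUnbounded<string>(
            new UnboundedChannelOptions { SingleReader = true });
        var processed = new List<string>();

        // Consumer: awaits instead of blocking a thread while the channel is empty.
        Task consumer = Task.Run(async () =>
        {
            while (await channel.Reader.WaitToReadAsync())
            {
                while (channel.Reader.TryRead(out string message))
                    processed.Add(message); // stand-in for Process(message)
            }
        });

        // Producer: TryWrite succeeds on an unbounded channel until it is completed.
        foreach (string message in messages)
            channel.Writer.TryWrite(message);

        channel.Writer.Complete(); // lets WaitToReadAsync return false once drained
        await consumer;
        return processed;
    }
}
```

With 50+ connections, each connection could own one such channel, and the consumers would only occupy a pool thread while they actually have a message to process.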

I actually do something similar in another project. What I learned or would do differently are the following:
First of all, better to use dedicated threads for the reading/writing loop (with new Thread(ParameterizedThreadStart)) because Task.Run uses a pool thread and as you use it in a (nearly) endless loop the thread is practically never returned to the pool.
var thread = new Thread(ReaderLoop) { Name = nameof(ReaderLoop) }; // priority, etc if needed
thread.Start(cancellationToken);
Your Process can be an event, which you can invoke asynchronously so your reader loop can return immediately to process the new incoming packages as fast as possible:
private void ReaderLoop(object state)
{
    var token = (CancellationToken)state;
    while (!token.IsCancellationRequested)
    {
        try
        {
            var message = MessageQueue.Take(token);
            OnMessageReceived(new MessageReceivedEventArgs(message));
        }
        catch (OperationCanceledException)
        {
            if (!disposed && IsRunning)
                Stop();
            break;
        }
    }
}
Please note that if a delegate has multiple targets, its async invocation is not trivial. I created this extension method for invoking a delegate on pool threads:
public static void InvokeAsync<TEventArgs>(this EventHandler<TEventArgs> eventHandler, object sender, TEventArgs args)
{
    void Callback(IAsyncResult ar)
    {
        var method = (EventHandler<TEventArgs>)ar.AsyncState;
        try
        {
            method.EndInvoke(ar);
        }
        catch (Exception e)
        {
            HandleError(e, method);
        }
    }
    // Each target is invoked separately so one slow handler does not delay the others
    foreach (EventHandler<TEventArgs> handler in eventHandler.GetInvocationList())
        handler.BeginInvoke(sender, args, Callback, handler);
}
So the OnMessageReceived implementation can be:
protected virtual void OnMessageReceived(MessageReceivedEventArgs e)
    => messageReceivedHandler.InvokeAsync(this, e);
Finally it was a big lesson that BlockingCollection<T> has some performance issues. It uses SpinWait internally, whose SpinOnce method waits longer and longer times if there is no incoming data for a long time. This is a tricky issue because even if you log every single step of the processing you will not notice that everything is started delayed unless you can mock also the server side. Here you can find a fast BlockingCollection implementation using an AutoResetEvent for triggering incoming data. I added a Take(CancellationToken) overload to it as follows:
/// <summary>
/// Takes an item from the <see cref="FastBlockingCollection{T}"/>
/// </summary>
public T Take(CancellationToken token)
{
    T item;
    while (!queue.TryDequeue(out item))
    {
        waitHandle.WaitOne(cancellationCheckTimeout); // can be 10-100 ms
        token.ThrowIfCancellationRequested();
    }
    return item;
}
Basically that's it. Maybe not everything is applicable in your case, eg. if the nearly immediate response is not crucial the regular BlockingCollection also will do it.

Yes, this is a bit inefficient, because you block ThreadPool threads.
I already discussed this problem in "Using Task.Yield to overcome ThreadPool starvation while implementing producer/consumer pattern".
You can also look at examples that test a producer-consumer pattern here:
https://github.com/BBGONE/TestThreadAffinity
You can use await Task.Yield in the loop to give other tasks access to this thread.
You can also solve it by using dedicated threads or, better, a custom TaskScheduler which uses its own thread pool. But it is inefficient to create 50+ plain threads. It is better to adjust the task so that it is more cooperative.
A BlockingCollection can block the thread for a long time while waiting to write (if bounded) or while waiting for items to read, so if you use one it is better to switch to System.Threading.Tasks.Channels https://github.com/stephentoub/corefxlab/blob/master/src/System.Threading.Tasks.Channels/README.md
Channels don't block the thread while waiting for the collection to become available for writing or reading. There's an example of how they are used at https://github.com/BBGONE/TestThreadAffinity/tree/master/ThreadingChannelsCoreFX/ChannelsTest

Nested lock in Task.ContinueWith - Safe, or playing with fire?

Windows service: generating a set of FileWatcher objects from a list of directories to watch in a config file. I have the following requirements:
1. File processing can be time consuming - events must be handled on their own task threads
2. Keep handles to the event handler tasks to wait for completion in an OnStop() event.
3. Track the hashes of uploaded files; don't reprocess if not different
4. Persist the file hashes to allow OnStart() to process files uploaded while the service was down.
5. Never process a file more than once.
(Regarding #3, we do get events when there are no changes... most notably because of the duplicate-event issue with FileWatchers)
To do these things, I have two dictionaries - one for the files uploaded, and one for the tasks themselves. Both objects are static, and I need to lock them when adding/removing/updating files and tasks. Simplified code:
public sealed class TrackingFileSystemWatcher : FileSystemWatcher {
    private static readonly object fileWatcherDictionaryLock = new object();
    private static readonly object runningTaskDictionaryLock = new object();
    private readonly Dictionary<int, Task> runningTaskDictionary = new Dictionary<int, Task>(15);
    private readonly Dictionary<string, FileSystemWatcherProperties> fileWatcherDictionary = new Dictionary<string, FileSystemWatcherProperties>();
    // Wired up elsewhere
    private void OnChanged(object sender, FileSystemEventArgs eventArgs) {
        this.ProcessModifiedDatafeed(eventArgs);
    }
    private void ProcessModifiedDatafeed(FileSystemEventArgs eventArgs) {
        lock (TrackingFileSystemWatcher.fileWatcherDictionaryLock) {
            // Read the file and generate hash here
            // Properties if the file has been processed before
            // ContainsNonNullKey is an extension method
            if (this.fileWatcherDictionary.ContainsNonNullKey(eventArgs.FullPath)) {
                try {
                    fileProperties = this.fileWatcherDictionary[eventArgs.FullPath];
                }
                catch (KeyNotFoundException keyNotFoundException) {}
                catch (ArgumentNullException argumentNullException) {}
            }
            else {
                // Create a new properties object
            }
            fileProperties.ChangeType = eventArgs.ChangeType;
            fileProperties.FileContentsHash = md5Hash;
            fileProperties.LastEventTimestamp = DateTime.Now;
            Task task;
            try {
                task = new Task(() => new DatafeedUploadHandler().UploadDatafeed(this.legalOrg, datafeedFileData), TaskCreationOptions.LongRunning);
            }
            catch {
                ..
            }
            // Only lock long enough to add the task to the dictionary
            lock (TrackingFileSystemWatcher.runningTaskDictionaryLock) {
                try {
                    this.runningTaskDictionary.Add(task.Id, task);
                }
                catch {
                    ..
                }
            }
            try {
                task.ContinueWith(t => {
                    try {
                        lock (TrackingFileSystemWatcher.runningTaskDictionaryLock) {
                            this.runningTaskDictionary.Remove(t.Id);
                        }
                        // Will this lock burn me?
                        lock (TrackingFileSystemWatcher.fileWatcherDictionaryLock) {
                            // Persist the file watcher properties to
                            // disk for recovery at OnStart()
                        }
                    }
                    catch {
                        ..
                    }
                });
                task.Start();
            }
            catch {
                ..
            }
        }
    }
}
What's the effect of requesting a lock on the FileSystemWatcher collection in the ContinueWith() delegate when the delegate is defined within a lock on the same object? I would expect it to be fine, that even if the task starts, completes, and enters the ContinueWith() before ProcessModifiedDatafeed() releases the lock, the task thread would simply be suspended until the creating thread has released the lock. But I want to make sure I'm not stepping on any delayed execution landmines.
Looking at the code, I may be able to release the lock sooner, avoiding the issue, but I'm not certain yet... need to review the full code to be sure.
UPDATE
To stem the rising "this code is terrible" comments, there are very good reasons why I catch the exceptions I do, and am catching so many of them. This is a Windows service with multi-threaded handlers, and it may not crash. Ever. Which it will do if any of those threads have an unhandled exception.
Also, those exception handlers are written for future bulletproofing. The example I've given in comments below would be adding a factory for the handlers... as the code is written today, there will never be a null task, but if the factory is not implemented correctly, the code could throw an exception. Yes, that should be caught in testing. However, I have junior developers on my team... "May. Not. Crash." (also, it must shut down gracefully if there is an unhandled exception, allowing currently-running threads to complete - which we do with an unhandled exception handler set in main()). We have enterprise-level monitors configured to send alerts when application errors appear in the event log; those exceptions will log and flag us. The approach was a deliberate and discussed decision.
Each possible exception has been carefully considered and chosen to fall into one of two categories - those that apply to a single datafeed and will not shut down the service (the majority), and those that indicate clear programming or other errors that fundamentally render the code useless for all datafeeds. For example, we've chosen to shut the service down if we can't write to the event log, as that's our primary mechanism for indicating datafeeds are not getting processed. The exceptions are caught locally, because the local context is the only place where the decision to continue can be made. Furthermore, allowing exceptions to bubble up to higher levels (1) violates the concept of abstraction, and (2) makes no sense in a worker thread.
I'm surprised at the number of people who argue against handling exceptions. If I had a dime for every try..catch(Exception){do nothing} I see, you'd get your change in nickels for the rest of eternity. I would argue to the death [1] that if a call into the .NET framework or your own code throws an exception, you need to consider the scenario that would cause that exception to occur and explicitly decide how it should be handled. My code catches UnauthorizedAccessException in IO operations, because when I considered how that could happen, I realized that adding a new datafeed directory requires permissions to be granted to the service account (it won't have them by default).
I appreciate the constructive input... just please don't criticize simplified example code with a broad "this sucks" brush. The code does not suck - it is bulletproof, and necessarily so.
[1] I would only argue a really long time if Jon Skeet disagrees

First, your question: it's not a problem in itself to request a lock inside ContinueWith. If it bothers you that you do this inside another lock block, just don't. Your continuation will execute asynchronously, at a different time and typically on a different thread.
Now, the code itself is questionable. Why do you use so many try-catch blocks around statements that almost cannot throw exceptions? For example here:
try {
    task = new Task(() => new DatafeedUploadHandler().UploadDatafeed(this.legalOrg, datafeedFileData), TaskCreationOptions.LongRunning);
}
catch {}
You just create a task; I cannot imagine when this can throw. Same story with ContinueWith. Here:
    this.runningTaskDictionary.Add(task.Id, task);
you can just check whether such a key already exists. But even that is not necessary, because task.Id is a unique id for the task instance which you just created. This:
try {
    fileProperties = this.fileWatcherDictionary[eventArgs.FullPath];
}
catch (KeyNotFoundException keyNotFoundException) {}
catch (ArgumentNullException argumentNullException) {}
is even worse. You should not use exceptions like this: don't catch KeyNotFoundException, but use the appropriate methods on Dictionary (like TryGetValue).
So to start with, remove all try catch blocks and either use one for the whole method, or use them on statements that can really throw exceptions and you cannot handle that situation otherwise (and you know what to do with exception thrown).
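A minimal sketch of the TryGetValue lookup suggested above. FileProperties is a hypothetical stand-in for the question's FileSystemWatcherProperties type, and GetOrCreate is an illustrative helper name; the caller is still assumed to hold whatever lock protects the dictionary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's FileSystemWatcherProperties type.
public sealed class FileProperties
{
    public string FileContentsHash { get; set; }
    public DateTime LastEventTimestamp { get; set; }
}

public static class FilePropertiesLookup
{
    // Returns the tracked entry for the path, creating one if it is missing.
    public static FileProperties GetOrCreate(
        Dictionary<string, FileProperties> tracked, string fullPath)
    {
        if (fullPath == null) throw new ArgumentNullException(nameof(fullPath));

        // One lookup, no exceptions for the ordinary "not seen before" case.
        if (!tracked.TryGetValue(fullPath, out FileProperties properties))
        {
            properties = new FileProperties();
            tracked.Add(fullPath, properties);
        }
        return properties;
    }
}
```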
Then, your approach to handling filesystem events is not very scalable or reliable. Many programs will generate multiple change events in short intervals when they are saving changes to a file (there are also other cases of multiple events for the same file arriving in sequence). If you just start processing the file on every event, this might lead to various kinds of trouble. So you might need to throttle the events coming in for a given file and only start processing after a certain delay following the last detected change. That might be a bit advanced, though.
Don't forget to grab a read lock on the file as soon as possible, so that other processes cannot change file while you are working with it (for example, you might calculate md5 of a file, then someone changes file, then you start uploading - now your md5 is invalid). Other approach is to record last write time and when it comes to uploading - grab read lock and check if file was not changed in between.
What is more important is that there can be a lot of changes at once. Say I copied 1000 files very fast: you do not want to start uploading them all at once with 1000 threads. You need a queue of files to process, and several threads taking items from that queue. This way thousands of events might happen at once and your upload will still work reliably. Right now you create a new thread for each change event, where you immediately start the upload (according to the method names); this will fail under a serious load of events (and in the cases described above).
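The queue-plus-workers idea above could look roughly like this. It is a sketch under assumptions: FileWorkQueue and its members are hypothetical names, the process delegate stands in for the real upload logic, and exception handling inside the workers is omitted.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public sealed class FileWorkQueue : IDisposable
{
    private readonly BlockingCollection<string> _paths = new BlockingCollection<string>();
    private readonly Task[] _workers;

    // 'process' is a placeholder for the real upload logic.
    public FileWorkQueue(int workerCount, Action<string> process)
    {
        _workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Factory.StartNew(() =>
            {
                // Blocks until a path is available; ends after CompleteAdding.
                foreach (string path in _paths.GetConsumingEnumerable())
                    process(path);
            }, TaskCreationOptions.LongRunning))
            .ToArray();
    }

    // Called from the FileSystemWatcher event handler: cheap and quick.
    public void Enqueue(string path) => _paths.Add(path);

    // Stops accepting work and waits for queued files to finish.
    public void Dispose()
    {
        _paths.CompleteAdding();
        Task.WaitAll(_workers);
        _paths.Dispose();
    }
}
```

The number of concurrent uploads is then fixed by workerCount, no matter how many change events arrive at once.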

No, it will not burn you. Even if the ContinueWith is inlined into the current thread that was running the new Task(() => new DatafeedUploadHandler().. it will get the lock, i.e. there is no deadlock.
The lock statement uses the Monitor class internally, and it is reentrant, i.e. a thread can acquire a lock multiple times if it already owns the lock. Multithreading and Locking (Thread-Safe operations)
And the other case where the task.ContinueWith starts before the ProcessModifiedDatafeed finished is like you said. The thread that is running the ContinueWith simply would have to wait to get the lock.
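The reentrancy mentioned above is easy to check in isolation. The following sketch (the names ReentrancyExample and Gate are illustrative) takes the same lock twice on one thread:

```csharp
using System.Threading;

public static class ReentrancyExample
{
    private static readonly object Gate = new object();

    // Takes the same lock twice on one thread; returns true if both were acquired.
    public static bool NestedLock()
    {
        lock (Gate)
        {
            lock (Gate) // same thread, same object: acquired immediately
            {
                return Monitor.IsEntered(Gate);
            }
        }
    }
}
```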
I would really consider moving the task.ContinueWith and the task.Start() outside of the lock once you have reviewed it. Based on your posted code, that is possible.
You should also take a look at ConcurrentDictionary in the System.Collections.Concurrent namespace. It would make the code simpler, and you don't have to manage the locking yourself. With if (this.fileWatcherDictionary.ContainsNonNullKey(eventArgs.FullPath)) you are doing a kind of check-then-add, i.e. only add if the key is not already in the dictionary, and that has to be one atomic operation. ConcurrentDictionary covers this with its GetOrAdd and AddOrUpdate methods, so maybe you can rewrite the code using them. Based on your code, you could safely use a ConcurrentDictionary at least for the runningTaskDictionary.
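For the running-task bookkeeping, a ConcurrentDictionary version might look like the sketch below. RunningTaskRegistry and its members are hypothetical names, not part of the question's code, and the sketch assumes the task is passed in unstarted, as in the question.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class RunningTaskRegistry
{
    private readonly ConcurrentDictionary<int, Task> _running =
        new ConcurrentDictionary<int, Task>();

    public int Count => _running.Count;

    // Registers the task, removes it again when it completes, then starts it.
    // Returns the continuation so callers can wait for the cleanup as well.
    public Task Track(Task unstarted)
    {
        _running.TryAdd(unstarted.Id, unstarted);
        Task continuation = unstarted.ContinueWith(t => _running.TryRemove(t.Id, out _));
        unstarted.Start();
        return continuation;
    }

    // Useful in OnStop(): completes when every tracked task has finished.
    public Task WhenAllCompleted() => Task.WhenAll(_running.Values);
}
```

No explicit lock is needed here, so the question of nesting locks inside the continuation does not come up for this dictionary.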
Oh, and TaskCreationOptions.LongRunning in practice creates a new thread for every task, which is a fairly expensive operation. The thread pool in recent Windows and .NET versions is intelligent and adapts dynamically. It will "see" that you are doing lots of IO stuff and will spawn new threads as needed and practical.
Greetings

I have not fully followed the logic of this code but are you aware that task continuations and calls to Wait/Result can be inlined onto the current thread? This can cause reentrancy.
This is very dangerous and has burned many.
Also, I don't quite see why you are creating the task first and starting it later. This is a code smell. And why are you wrapping the task creation in a try? This can never throw.
This clearly is a partial answer. But the code looks very tangled to me. If it's this hard to audit it you probably should write it differently in the first place.

What is the reason for "while(true) { Thread.Sleep }"?

I sometimes encounter code in the following form:
while (true) {
    //do something
    Thread.Sleep(1000);
}
I was wondering if this is considered good or bad practice and if there are any alternatives.
Usually I "find" such code in the main-function of services.
I recently saw code in the "Run" function in a windows azure worker role which had the following form:
ClassXYZ xyz = new ClassXYZ(); //ClassXYZ creates separate Threads which execute code
while (true) {
    Thread.Sleep(1000);
}
I assume there are better ways to prevent a service (or azure worker role) from exiting.
Does anyone have a suggestion for me?

Well when you do that with Thread.Sleep(1000), your processor wastes a tiny amount of time to wake up and do nothing.
You could do something similar with CancellationTokenSource.
When you call WaitOne(), it will wait until it receives a signal.
CancellationTokenSource cancelSource = new CancellationTokenSource();
public override void Run()
{
    //do stuff
    cancelSource.Token.WaitHandle.WaitOne();
}
public override void OnStop()
{
    cancelSource.Cancel();
}
This will keep the Run() method from exiting without wasting your CPU time on busy waiting.

An alternative approach is to use an AutoResetEvent that is instantiated non-signaled, so that WaitOne() blocks until another thread calls Set().
public class Program
{
    // false = initially non-signaled; with true, WaitOne() would return immediately
    public static readonly AutoResetEvent ResetEvent = new AutoResetEvent(false);
    public static void Main(string[] args)
    {
        Task.Factory.StartNew
        (
            () =>
            {
                // Imagine sleep is a long task which ends in 10 seconds
                Thread.Sleep(10000);
                // Signal the event to release the waiting main thread
                ResetEvent.Set();
            }
        );
        // Once the other thread sets the AutoResetEvent, the program ends
        ResetEvent.WaitOne();
    }
}
Is the so-called while(true) a bad practice?
Well, in fact, a literal true as the while loop condition may be considered a bad practice, since it's an unbreakable loop: I would always use a variable condition which may evaluate to true or false.
When would I use a while loop, and when something like the AutoResetEvent approach?
When to use a while loop...
...when you need to execute code while waiting for the program to end.
When to use the AutoResetEvent approach...
...when you just need to hold the main thread in order to prevent the program from ending, and the main thread only needs to wait until some other thread requests a program exit.

If you see code like this...
while (true)
{
    //do something
    Thread.Sleep(1000);
}
It's most likely using Sleep() as a means of waiting for some event to occur: something like user input/interaction, a change in the file system (such as a file being created or modified in a folder), a network or device event, etc. That would suggest using more appropriate tools:
If the code is waiting for a change in the file system, use a FileSystemWatcher.
If the code is waiting for a thread or process to complete, or a network event to occur, use the appropriate synchronization primitive and WaitOne(), WaitAny() or WaitAll() as appropriate. If you use an overload with a timeout in a loop, it gives you cancelability as well.
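The "overload with a timeout in a loop" idea from the list above can be sketched like this. WaitLoopExample and its parameter names are illustrative assumptions; any WaitHandle (ManualResetEvent, AutoResetEvent, a process handle wrapper, and so on) can be passed in.

```csharp
using System;
using System.Threading;

public static class WaitLoopExample
{
    // Waits for 'signal', waking every 'pollInterval' to check for cancellation.
    // Returns true if the signal arrived, false if cancellation was requested first.
    public static bool WaitForSignal(WaitHandle signal, CancellationToken token, TimeSpan pollInterval)
    {
        while (!token.IsCancellationRequested)
        {
            if (signal.WaitOne(pollInterval))
                return true;
        }
        return false;
    }
}
```

When a CancellationToken is available, WaitHandle.WaitAny over the signal and token.WaitHandle is another option that avoids the periodic wake-up altogether.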
But without knowing the actual context, it's rather hard to say categorically that it's either good, bad or indifferent. If you've got a daemon running that has to poll on a regular basis (say an NTP client), a loop like that would make perfect sense (though the daemon would need some logic to monitor for shutdown events occurring). And even with something like that, you could replace it with a scheduled task: a different, but not necessarily better, design.

If you use while(true) you have no programmatic means of ending the loop from outside the loop.
I'd prefer, at least, a while(mySingletonValue) which would allow us to switch the loop as needed.
An additional approach would be to separate the functional behavior from the looping behavior. Your loop may still be infinite, but it calls a function defined elsewhere. The looping behavior is therefore completely isolated from what is being executed by the loop:
while(GetMySingletonValue())
{
    someFunction();
}
In this way your singleton controls the looping behavior entirely.

There are better ways to keep the Azure service running and to exit when needed.
Refer:
http://magnusmartensson.com/howto-wait-in-a-workerrole-using-system-timers-timer-and-system-threading-eventwaithandle-over-system-threading-thread-sleep
http://blogs.lessthandot.com/index.php/DesktopDev/MSTech/azure-worker-role-exiting-safely/

It really depends on that //do something on how it determines when to break out of the loop.
In general terms, a more appropriate way to do it is to wait on some synchronization primitive (like ManualResetEvent) and have the code that does the processing and decides when the loop should end (on the other thread) signal that primitive. This way you don't have a thread wasting resources by being scheduled every second to do nothing, and it is a much cleaner way to do it.

I personally don't like Thread.Sleep code, because it blocks the calling thread (the UI thread in a Windows application). You can write something like this instead, which gives you more flexibility and can be awaited:
bool switchControl = true;
while (switchControl) {
    //do something
    await Wait(1);
}
// Returns a Task so the caller can await it; an async void method cannot be awaited
async Task Wait(int seconds)
{
    // Task.Delay waits without blocking the thread or spinning in a loop
    await Task.Delay(TimeSpan.FromSeconds(seconds));
}

What is the best method for creating a static class that uses threads?

Let's say I am designing a simple logging class (yes - I know there are those already out there in the wild!) and I want the class to be static so the rest of my code can call it without having to instantiate it first. Maybe something like this:
internal static class Log
{
    private static string _logFile = "";
    internal static void InitializeLogFile(string path)
    {
        ...
    }
    internal static void WriteHeader()
    {
        ...
    }
    internal static void WriteLine(params string[] items)
    {
        ...
    }
}
Now, I want the internals to spin up their own thread and execute in an asynchronous manner, possibly using BackgroundWorker to help simplify things. Should I just create a new BackgroundWorker in each method, create a static BackgroundWorker as a private property of the static class, or is there something I am overlooking altogether?

You definitely do not want to spin up a new thread or BackgroundWorker on each invocation of the methods. I would use the producer-consumer pattern here. As it turns out, this is such a common pattern that Microsoft provided us with the BlockingCollection class, which simplifies the implementation greatly. The nice things about this approach are that:
there is only one extra thread required
the Log methods will have asynchronous semantics
the temporal ordering of the log messages is preserved
Here is some code to get you started.
internal static class Log
{
    private static BlockingCollection<string> s_Queue = new BlockingCollection<string>();
    static Log()
    {
        // A single background thread drains the queue for the lifetime of the process
        var thread = new Thread(Run);
        thread.IsBackground = true;
        thread.Start();
    }
    private static void Run()
    {
        while (true)
        {
            // Take blocks until an item is available
            string line = s_Queue.Take();
            // Add code to append the line to the log here.
        }
    }
    internal static void WriteLine(params string[] items)
    {
        foreach (string item in items)
        {
            s_Queue.Add(item);
        }
    }
}

You only want to have 1 thread per log file/db. Otherwise, the order of items in the log is unreliable. Have a background thread that pulls from a thread-safe queue and does the writing.

Good call,
You definitely want the logging operations to occur in a separate thread from the code that is doing the logging. For instance, the accessor methods (such as "logEvent(myEvent)") should not block on file I/O operations while the logger logs the event to a file.
Make a queue so that the accessors simply push items onto the queue. This way your code shouldn't block while it is trying to log an event.
Start-up a second thread to empty the internal queue of events. This thread can run on a static private method of your logger class.
The performance drawback comes when you try to ensure thread safety of the underlying event queue. You will need to acquire a lock on the queue every time before a pop or push onto the queue.
Hope this helps.

I think that my recommendation is not exactly what you expect, but I hope it is useful anyway:
Don't use a static class. Instead, use a regular class and hold a single instance of it (the singleton pattern); using a dependency injection engine helps a lot with this (I use MS Unity and it works fine). If you define an interface for your logging class as well, your code will be much more testable.
As for the threading stuff, if I understand correctly you want the logging work to be performed in separate threads. Are you sure that you really need this? A logger should be light enough that you can simply call the "Write" methods and expect that your application performance will not suffer.
A last note: you mention the BackgroundWorker class, but if I am not wrong this class is intended for use with desktop applications, not with ASP.NET. In this environment you should probably use something like the ThreadPool class.
Just my 2 euro cents...

I created a thread safe logging class myself a while back. I used it something like this.
Logging obj = new Logging(filename);
Action<string> log = obj.NewQueueLog();
NewQueueLog would return an anonymous method that wrote to its own Queue. Because each Queue had exactly one writer and one reader, I didn't use any locks when calling log(). (Queue<T> is not documented as thread-safe even for that case; ConcurrentQueue<T> allows the same lock-free usage with a guarantee.)
The actual Logging object would create a new thread that ran in the background and would periodically check all of the queues. If a queue had a string in it, it would write it to a buffered file stream.
I added a little extra code to the reading thread so that for each pass it made over the queues, if there was nothing written, it would sleep an extra 10 ms, up to a max of 100 ms. This way the thread didn't spin too much. But if there was heavy writing going on, it would poll the queues every 10 ms.
Here's a snippet of the return code for the requested queue. The "this.blNewData = true" was so I didn't need to hit up every queue to see if any new data was written. No lock is involved, because a false positive still did no work since all the queues would be empty anyway.
OutputQueue was the list of Queues that I looped through to see if anything was written. The code to loop through the list was in a lock in case NewQueueLog() was called and caused the list to get resized.
public Action<String> NewQueueLog()
{
    Queue<String> tmpQueue = new Queue<String>(32);
    lock (OutputQueue)
    {
        OutputQueue.Add(tmpQueue);
    }
    return (String Output) =>
    {
        tmpQueue.Enqueue(Output);
        this.blNewData = true;
    };
}
In the end, writing to the log was lock free, which helped when lots of threads were writing.

.NET 2.0 : File.AppendAllText(...) - Thread safe implementation

As an exercise in idle curiosity more than anything else, consider the following simple logging class:
internal static class Logging
{
    private static object threadlock;
    static Logging()
    {
        threadlock = new object();
    }
    internal static void WriteLog(string message)
    {
        try
        {
            lock (threadlock)
            {
                File.AppendAllText(@"C:\logfile.log", message);
            }
        }
        catch
        {
            // ...handle logging errors...
        }
    }
}
Is the lock needed around File.AppendAllText(...) or is the method inherently thread-safe by its own implementation ?
Searching for information on this yields a lot of contradictory information, some say yes, some say no. MSDN says nothing.

File.AppendAllText is going to acquire an exclusive write-lock on the log file, which would cause any concurrent thread attempting to access the file to throw an exception. So yes, you need a static lock object to prevent multiple threads from trying to write to the log file at the same time and raising an IOException.
If this is going to be an issue, I'd really suggest logging to a database table which will do a better job of handling concurrent log writers.
Alternatively, you can use TextWriterTraceListener which is thread-safe (well, it's going to do the locking for you; I'd rather write as little of my own multithreaded code as possible).
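A minimal sketch of the TextWriterTraceListener route, assuming the default Trace.UseGlobalLock setting so that the Trace class serializes calls to the listener. The helper class TraceLogging and its Attach/Detach methods are illustrative names only.

```csharp
using System.Diagnostics;

public static class TraceLogging
{
    // Routes Trace output to the given file. With the default Trace.UseGlobalLock
    // setting, the Trace class serializes calls to the listener.
    public static TextWriterTraceListener Attach(string path)
    {
        var listener = new TextWriterTraceListener(path);
        Trace.Listeners.Add(listener);
        Trace.AutoFlush = true; // flush after every write
        return listener;
    }

    // Removes the listener and closes the underlying file.
    public static void Detach(TextWriterTraceListener listener)
    {
        Trace.Listeners.Remove(listener);
        listener.Dispose();
    }
}
```

After Attach, any thread can call Trace.WriteLine(message) without its own lock object. Note that Trace calls are only compiled in when the TRACE symbol is defined, which is the usual default for project builds.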

Testing parallel writes shows that you would get a System.IO.IOException if you were to comment out your lock statement.
[Test]
public void Answer_Question()
{
    var ex = Assert.Throws<AggregateException>(() => Parallel.Invoke(
        () => Logging.WriteLog("abc"),
        () => Logging.WriteLog("123")
    ));
    // System.IO.IOException: The process cannot access the file 'C:\Logs\thread-safety-test.txt' because it is being used by another process.
    Console.Write(ex);
}

It is thread safe in the sense that it opens the file with Read sharing, so assuming your filesystem honors file locks, only one thread will be allowed to write to the file at a time. Other threads may, however, get dirty reads if they are attempting to read the same file.
