System Uptime & MemoryBarrier - C#

I need a robust way of getting the system uptime, and ended up with something like the following.
I've added some comments to help people read it. I cannot use Tasks, as this has to run on a .NET 3.5 application.
// This is a structure, can't be marked as volatile
// need to implement MemoryBarrier manually as appropriate
private static TimeSpan _uptime;
private static TimeSpan GetUptime()
{
// Try to set the uptime using perf counters
var uptimeThread = new Thread(GetPerformanceCounterUptime);
uptimeThread.Start();
// If our thread hasn't finished in 5 seconds, perf counters are broken
if (!uptimeThread.Join(5 * 1000))
{
// Kill the thread and use Environment.TickCount
uptimeThread.Abort();
_uptime = TimeSpan.FromMilliseconds(
Environment.TickCount & Int32.MaxValue);
}
Thread.MemoryBarrier();
return _uptime;
}
// This sets the System uptime using the perf counters
// this gives the best result but on a system with corrupt perf counters
// it can freeze
private static void GetPerformanceCounterUptime()
{
using (var uptime = new PerformanceCounter("System", "System Up Time"))
{
uptime.NextValue();
_uptime = TimeSpan.FromSeconds(uptime.NextValue());
}
}
The part I am struggling with is where Thread.MemoryBarrier() should be placed.
I am placing it before reading the value, but either the current thread or a different thread could have written to it. Does the above look correct?
Edit: answer based on Daniel's
This is what I ended up implementing; thank you both for the insight.
private static TimeSpan _uptime;
private static TimeSpan GetUptime()
{
var uptimeThread = new Thread(GetPerformanceCounterUptime);
uptimeThread.Start();
if (uptimeThread.Join(5*1000))
{
return _uptime;
}
else
{
uptimeThread.Abort();
return TimeSpan.FromMilliseconds(
Environment.TickCount & Int32.MaxValue);
}
}
private static void GetPerformanceCounterUptime()
{
using (var uptime = new PerformanceCounter("System", "System Up Time"))
{
uptime.NextValue();
_uptime = TimeSpan.FromSeconds(uptime.NextValue());
}
}
Edit 2
Updated based on Bob's comments.
private static DateTimeOffset _uptime;
private static DateTimeOffset GetUptime()
{
var uptimeThread = new Thread(GetPerformanceCounterUptime);
uptimeThread.Start();
if (uptimeThread.Join(5*1000))
{
return _uptime;
}
else
{
uptimeThread.Abort();
return DateTimeOffset.Now.Subtract(TimeSpan.FromMilliseconds(
Environment.TickCount & Int32.MaxValue));
}
}
private static void GetPerformanceCounterUptime()
{
if (_uptime != default(DateTimeOffset))
{
return;
}
using (var uptime = new PerformanceCounter("System", "System Up Time"))
{
uptime.NextValue();
_uptime = DateTimeOffset.Now.Subtract(
TimeSpan.FromSeconds(uptime.NextValue()));
}
}

Thread.Join already ensures that writes performed by the uptimeThread are visible on the main thread. You don't need any explicit memory barrier. (without the synchronization performed by Join, you'd need barriers on both threads - after the write and before the read)
However, there's a potential problem with your code: writing to a TimeSpan struct isn't atomic, and the main thread and the uptimeThread may write to it at the same time (Thread.Abort just signals abortion, but doesn't wait for the thread to finish aborting), causing a torn write.
My solution would be to not use the field at all when aborting. Also, multiple concurrent calls to GetUptime() may cause the same problem, so you should use an instance field instead.
private static TimeSpan GetUptime()
{
// Try to set the uptime using perf counters
var helper = new Helper();
var uptimeThread = new Thread(helper.GetPerformanceCounterUptime);
uptimeThread.Start();
// If our thread hasn't finished in 5 seconds, perf counters are broken
if (uptimeThread.Join(5 * 1000))
{
return helper._uptime;
}
else
{
// Kill the thread and use Environment.TickCount
uptimeThread.Abort();
return TimeSpan.FromMilliseconds(
Environment.TickCount & Int32.MaxValue);
}
}
class Helper
{
internal TimeSpan _uptime;
// This sets the System uptime using the perf counters
// this gives the best result but on a system with corrupt perf counters
// it can freeze
internal void GetPerformanceCounterUptime()
{
using (var uptime = new PerformanceCounter("System", "System Up Time"))
{
uptime.NextValue();
_uptime = TimeSpan.FromSeconds(uptime.NextValue());
}
}
}
However, I'm not sure if aborting the performance counter thread will work correctly at all - Thread.Abort() only aborts managed code execution. If the code is hanging within a Windows API call, the thread will keep running.
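One partial mitigation, if you keep the worker-thread approach: mark uptimeThread as a background thread, so a hung perf-counter call can at least never keep the process alive. A minimal sketch (this does not stop the stuck thread; it only stops it from blocking process exit):

    var uptimeThread = new Thread(GetPerformanceCounterUptime)
    {
        // A background thread never prevents the process from exiting,
        // even if it is stuck inside a Windows API call.
        IsBackground = true
    };
    uptimeThread.Start();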

AFAIK writes in .NET are volatile, so the only place where you would need a memory fence would be before each read, since they are subject to reordering and/or caching. To quote from a post by Joe Duffy:
For reference, here are the rules as I have come to understand them
stated as simply as I can:
Rule 1: Data dependence among loads and stores is never violated.
Rule 2: All stores have release semantics, i.e. no load or store may move after one.
Rule 3: All volatile loads are acquire, i.e. no load or store may move before one.
Rule 4: No loads and stores may ever cross a full-barrier.
Rule 5: Loads and stores to the heap may never be introduced.
Rule 6: Loads and stores may only be deleted when coalescing adjacent loads and
stores from/to the same location.
Note that by this definition, non-volatile loads are not required to
have any sort of barrier associated with them. So loads may be freely
reordered, and writes may move after them (though not before, due to
Rule 2). With this model, the only true case where you’d truly need
the strength of a full-barrier provided by Rule 4 is to prevent
reordering in the case where a store is followed by a volatile load.
Without the barrier, the instructions may reorder.
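To make that store-then-load case concrete, here is a minimal, self-contained sketch (all names are illustrative) of the pattern the quote describes. Without a full barrier between each thread's store and its subsequent load, both stores may be delayed past the loads, so r1 and r2 can both end up 0:

    using System;
    using System.Threading;

    class StoreThenLoadDemo
    {
        static int x, y;

        static void Main()
        {
            int r1 = 0, r2 = 0;
            var a = new Thread(() =>
            {
                x = 1;                  // store (release, Rule 2)
                Thread.MemoryBarrier(); // full barrier (Rule 4); remove it and
                r1 = y;                 // this load may complete before the store is visible
            });
            var b = new Thread(() =>
            {
                y = 1;
                Thread.MemoryBarrier();
                r2 = x;
            });
            a.Start(); b.Start();
            a.Join(); b.Join();
            // With the barriers in place, (r1, r2) can never be (0, 0).
            Console.WriteLine("r1={0}, r2={1}", r1, r2);
        }
    }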

Related

Trying to find a lock-less solution for a C# concurrent queue

I have the following code in C#:
(_StoreQueue is a ConcurrentQueue)
var S = _StoreQueue.FirstOrDefault(_ => _.TimeStamp == T);
if (S == null)
{
lock (_QueueLock)
{
// try again
S = _StoreQueue.FirstOrDefault(_ => _.TimeStamp == T);
if (S == null)
{
S = new Store(T);
_StoreQueue.Enqueue(S);
}
}
}
The system is collecting data in real time (fairly high frequency, around 300-400 calls / second) and puts it in bins (Store objects) that represent a 5 second interval. These bins are in a queue as they get written and the queue gets emptied as data is processed and written.
So, when data arrives, a check is done to see if there is a bin for that timestamp (rounded to 5 seconds); if not, one is created.
Since this is quite heavily multi-threaded, the system goes with the following logic:
If there is a bin, it is used to put data.
If there is no bin, a lock gets initiated, and within that lock the check is done again to make sure a bin wasn't created by another thread in the meantime; if there is still no bin, one gets created.
With this system, the lock is only taken roughly once every 2k calls.
I am trying to see if there is a way to remove the lock, mostly because I'm thinking there has to be a better solution than the double check.
An alternative I have been thinking about is to create empty bins ahead of time, which would entirely remove the need for any locks, but the search for the right bin would become slower, as it would have to scan the list of pre-built bins to find the proper one.
Using a ConcurrentDictionary can fix the issue you are having. Here I assumed a type of double for your TimeStamp property, but it can be anything, as long as you make the ConcurrentDictionary key type match.
using System.Collections.Concurrent;

class Program
{
static ConcurrentDictionary<double, Store> _StoreQueue = new ConcurrentDictionary<double, Store>();
static void Main(string[] args)
{
var T = 17d;
// add the Store for timestamp 17 if it does not already exist
_StoreQueue.GetOrAdd(T, new Store(T));
}
}
public class Store
{
public double TimeStamp { get; set; }
public Store(double timeStamp)
{
TimeStamp = timeStamp;
}
}
}
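One caveat worth knowing about the snippet above: the value overload of GetOrAdd constructs the Store even when the key already exists. The factory overload (same ConcurrentDictionary API) defers construction; note the factory may run more than once under contention, with only one result kept:

    // Only allocate a Store when the key is actually missing.
    var store = _StoreQueue.GetOrAdd(T, timeStamp => new Store(timeStamp));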

Implementing ConcurrentDictionary

I'm trying to create my own cache implementation for an API. It is the first time I've worked with ConcurrentDictionary, and I do not know if I am using it correctly. In a test, something threw an error, and so far I have not been able to reproduce it. Maybe some concurrency professional / ConcurrentDictionary expert can look at the code and find what may be wrong. Thank you!
private static readonly ConcurrentDictionary<string, ThrottleInfo> CacheList = new ConcurrentDictionary<string, ThrottleInfo>();
public override void OnActionExecuting(HttpActionContext actionExecutingContext)
{
if (CacheList.TryGetValue(userIdentifier, out var throttleInfo))
{
if (DateTime.Now >= throttleInfo.ExpiresOn)
{
if (CacheList.TryRemove(userIdentifier, out _))
{
//TODO:
}
}
else
{
if (throttleInfo.RequestCount >= defaultMaxRequest)
{
actionExecutingContext.Response = ResponseMessageExtension.TooManyRequestHttpResponseMessage();
}
else
{
throttleInfo.Increment();
}
}
}
else
{
if (CacheList.TryAdd(userIdentifier, new ThrottleInfo(Seconds)))
{
//TODO:
}
}
}
public class ThrottleInfo
{
private int _requestCount;
public int RequestCount => _requestCount;
public ThrottleInfo(int addSeconds)
{
Interlocked.Increment(ref _requestCount);
ExpiresOn = ExpiresOn.AddSeconds(addSeconds);
}
public void Increment()
{
// this is about as thread safe as you can get.
// From MSDN: Increments a specified variable and stores the result, as an atomic operation.
Interlocked.Increment(ref _requestCount);
// you can return the result of Increment if you want the new value,
// but DO NOT set the counter to the result (i.e. counter = Interlocked.Increment(ref counter);) - this will break the atomicity.
}
public DateTime ExpiresOn { get; } = DateTime.Now;
}
If I understand correctly, what you are trying to do is: if ExpiresOn has passed, remove the entry; otherwise update it, or add it if it does not exist.
You can certainly take advantage of the AddOrUpdate method to simplify some of your code.
Take a look here for some good examples: https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/how-to-add-and-remove-items
Hope this helps.
The ConcurrentDictionary is sufficient as a thread-safe container only in cases where (1) the whole state that needs protection is its internal state (the keys and values it contains), and only if (2) this state can be mutated atomically using the specialized API it offers (GetOrAdd, AddOrUpdate). In your case the second requirement is not met, because you need to remove keys conditionally depending on the state of their value, and this scenario is not supported by the ConcurrentDictionary class.
So your current cache implementation is not thread safe. The fact that it throws exceptions sporadically is a coincidence. It would still be non-thread-safe even if it never threw, because it would not be error-proof: it could occasionally (or permanently) transition to a state incompatible with its specification (returning expired values, for example).
Regarding the ThrottleInfo class, it suffers from a visibility bug that could remain unobserved if you tested the class extensively in one machine, and then suddenly emerge when you deployed your app in another machine with a different CPU architecture. The non-volatile private int _requestCount field is exposed through the public property RequestCount, so there is no guarantee (based on the C# specification) that all threads will see its most recent value. You can read this article by Igor Ostrovsky about the peculiarities of the memory models, which may convince you (like me) that employing lock-free techniques (using the Interlocked class in this case) with multithreaded code is more trouble than it's worth. If you read it and like it, there is also a part 2 of this article.
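To illustrate the visibility point concretely, a minimal sketch of how the property could pair an acquire read with the Interlocked increments (Volatile.Read exists from .NET 4.5; on older frameworks Thread.VolatileRead is the rough equivalent):

    private int _requestCount;

    // Volatile.Read has acquire semantics, so every thread observes
    // the most recent Interlocked.Increment rather than a stale value.
    public int RequestCount => Volatile.Read(ref _requestCount);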

Lock text file during read and write or alternative

I have an application where I need to create files with a unique and sequential number as part of the file name. My first thought was to use (since this application does not have any other data storage) a text file that would contain a number; I would increment this number so that my application would always create a file with a unique id.
Then I thought that when more than one user submits to this application at the same time, one process might read the txt file before it has been written by the previous process. So I am looking for a way to read and write a file in one operation, without unlocking the file in between (with try/catch, so I can tell when it's being used by another process and then wait and retry the read a few times).
If what I am saying above sounds like a bad option, could you please give me an alternative to this? How would you then keep track of unique identification numbers for an application like my case?
Thanks.
If it's a single application then you can store the current number in your application settings. Load that number at startup. Then with each request you can safely increment it and use the result. Save the sequential number when the program shuts down. For example:
private int _fileNumber;
// at application startup
_fileNumber = LoadFileNumberFromSettings();
// to increment
public int GetNextFile()
{
return Interlocked.Increment(ref _fileNumber);
}
// at application shutdown
SaveFileNumberToSettings(_fileNumber);
Or, you might want to make sure that the file number is saved whenever it's incremented. If so, change your GetNextFile method:
private readonly object _fileLock = new object();
public int GetNextFile()
{
lock (_fileLock)
{
int result = ++_fileNumber;
SaveFileNumberToSettings(_fileNumber);
return result;
}
}
Note also that it might be reasonable to use the registry for this, rather than a file.
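A minimal sketch of the registry variant (the key and value names are illustrative, and note this alone is not atomic across processes, so it would still need the cross-process locking discussed below):

    using Microsoft.Win32;

    // Read, increment, and persist the counter under HKCU.
    using (var key = Registry.CurrentUser.CreateSubKey(@"Software\MyApp"))
    {
        int current = (int)key.GetValue("FileNumber", 0);
        key.SetValue("FileNumber", current + 1, RegistryValueKind.DWord);
    }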
Edit: As Alireza pointed out in the comments, this is not a valid way to lock between multiple applications.
You can always lock access to the file (so you won't need to rely on exceptions), e.g.:
// Create a lock in your class
private static object LockObject = new object();
// and then lock on this object when you access the file like this:
lock(LockObject)
{
... access to the file
}
Edit 2: It seems that you can use a Mutex to perform inter-process signalling.
private static System.Threading.Mutex m = new System.Threading.Mutex(false, "LockMutex");
void AccessMethod()
{
try
{
m.WaitOne();
// Access the file
}
finally
{
m.ReleaseMutex();
}
}
But it's not the best pattern for generating unique ids. Maybe a sequence in a database would be better? If you don't have a database, you can use Guids or a local database (even Access would be better, I think).
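If strict sequencing is not actually a requirement, the Guid route is a one-liner that needs no coordination at all (the name pattern is illustrative):

    // Globally unique without any locking or shared state.
    string fileName = "upload_" + Guid.NewGuid().ToString("N") + ".dat";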
I would prefer a more involved but universal solution using a global mutex. It uses a mutex whose name is prefixed with "Global\", which makes it system-wide, i.e. one mutex instance is shared across all processes. If your program runs in a friendly environment, or you can specify strict permissions limited to a user account you can trust, then it works well.
Keep in mind that this solution is not transactional and is not protected against thread-abortion/process-termination.
Not transactional means that if your process/thread is caught in the middle of a storage-file modification and is terminated/aborted, the storage file will be left in an unknown state. For instance, it can be left empty. You can protect yourself against loss of data (loss of the last used index) by writing the new value first, saving the file, and only then removing the previous value. The reading procedure should expect a file with multiple numbers and should take the greatest.
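A minimal sketch of that read/update procedure, assuming one number per line in the storage file (method names and file layout are illustrative):

    using System;
    using System.IO;

    static int ReadCurrentIndex(string path)
    {
        // A file caught mid-update may contain several numbers; take the greatest.
        int max = 0;
        foreach (var line in File.ReadAllLines(path))
        {
            int n;
            if (int.TryParse(line, out n) && n > max) max = n;
        }
        return max;
    }

    static void StoreNextIndex(string path, int next)
    {
        // Append the new value first, so a crash here loses nothing...
        File.AppendAllText(path, next + Environment.NewLine);
        // ...then compact the file down to just the latest value.
        File.WriteAllText(path, next + Environment.NewLine);
    }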
Not protected against thread-abortion means that if a thread which obtained the mutex is aborted unexpectedly, and/or you do not have proper exception handling, then the mutex could stay locked for the life of the process that created that thread. In order to make the solution abort-protected you will have to implement timeouts on obtaining the lock, i.e. replace the following line, which waits forever,
blnResult = iLock.Mutex.WaitOne();
with something with timeout.
Summing up: if you are looking for a really robust solution, you will end up utilizing some kind of transactional database, or writing such a database yourself :)
Here is the working code without timeout handling (I do not need it in my solution). It is robust enough to begin with.
using System;
using System.IO;
using System.Security.AccessControl;
using System.Security.Principal;
using System.Threading;
namespace ConsoleApplication31
{
class Program
{
//You only need one instance of that Mutex for each application domain (commonly each process).
private static SMutex mclsIOLock;
static void Main(string[] args)
{
//Initialize the mutex. Here you need to know the path to the file you use to store application data.
string strEnumStorageFilePath = Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData),
"MyAppEnumStorage.txt");
mclsIOLock = IOMutexGet(strEnumStorageFilePath);
}
//Template for the main processing routine.
public static void RequestProcess()
{
//This flag is used to protect against unwanted lock releases in case of recursive routines.
bool blnLockIsSet = false;
try
{
//Obtain the lock.
blnLockIsSet = IOLockSet(mclsIOLock);
//Read file data, update file data. Do not put much of long-running code here.
//Other processes may be waiting for the lock release.
}
finally
{
//Release the lock if it was obtained in this particular call stack frame.
IOLockRelease(mclsIOLock, blnLockIsSet);
}
//Put your long-running code here.
}
private static SMutex IOMutexGet(string iMutexNameBase)
{
SMutex clsResult = null;
clsResult = new SMutex();
string strSystemObjectName = @"Global\" + iMutexNameBase.Replace('\\', '_');
//Give permissions to all authenticated users.
SecurityIdentifier clsAuthenticatedUsers = new SecurityIdentifier(WellKnownSidType.AuthenticatedUserSid, null);
MutexSecurity clsMutexSecurity = new MutexSecurity();
MutexAccessRule clsMutexAccessRule = new MutexAccessRule(
clsAuthenticatedUsers,
MutexRights.FullControl,
AccessControlType.Allow);
clsMutexSecurity.AddAccessRule(clsMutexAccessRule);
//Create the mutex or open an existing one.
bool blnCreatedNew;
clsResult.Mutex = new Mutex(
false,
strSystemObjectName,
out blnCreatedNew,
clsMutexSecurity);
clsResult.IsMutexHeldByCurrentAppDomain = false;
return clsResult;
}
//Release IO lock.
private static void IOLockRelease(
SMutex iLock,
bool? iLockIsSetInCurrentStackFrame = null)
{
if (iLock != null)
{
lock (iLock)
{
if (iLock.IsMutexHeldByCurrentAppDomain &&
(!iLockIsSetInCurrentStackFrame.HasValue ||
iLockIsSetInCurrentStackFrame.Value))
{
iLock.MutexOwnerThread = null;
iLock.IsMutexHeldByCurrentAppDomain = false;
iLock.Mutex.ReleaseMutex();
}
}
}
}
//Set the IO lock.
private static bool IOLockSet(SMutex iLock)
{
bool blnResult = false;
try
{
if (iLock != null)
{
if (iLock.MutexOwnerThread != Thread.CurrentThread)
{
blnResult = iLock.Mutex.WaitOne();
iLock.IsMutexHeldByCurrentAppDomain = blnResult;
if (blnResult)
{
iLock.MutexOwnerThread = Thread.CurrentThread;
}
else
{
throw new ApplicationException("Failed to obtain the IO lock.");
}
}
}
}
//An AbandonedMutexException means a previous owner terminated without
//releasing the mutex; the wait has still succeeded and we own it now.
catch (AbandonedMutexException)
{
blnResult = true;
iLock.IsMutexHeldByCurrentAppDomain = true;
iLock.MutexOwnerThread = Thread.CurrentThread;
}
return blnResult;
}
}
internal class SMutex
{
public Mutex Mutex;
public bool IsMutexHeldByCurrentAppDomain;
public Thread MutexOwnerThread;
}
}

How to freeze a popsicle in .NET (make a class immutable)

I'm designing a class that I wish to make readonly after a main thread is done configuring it, i.e. "freeze" it. Eric Lippert calls this popsicle immutability. After it is frozen, it can be accessed by multiple threads concurrently for reading.
My question is how to write this in a thread safe way that is realistically efficient, i.e. without trying to be unnecessarily clever.
Attempt 1:
public class Foobar
{
private Boolean _isFrozen;
private Object _value;
public void Freeze() { _isFrozen = true; }
// Only intended to be called by main thread, so checks if class is frozen. If it is the operation is invalid.
public void WriteValue(Object val)
{
if (_isFrozen)
throw new InvalidOperationException();
_value = val; // write ...
}
public Object ReadSomething()
{
return _value;
}
}
Eric Lippert seems to suggest this would be OK in this post.
I know writes have release semantics, but as far as I understand this only pertains to ordering, and it doesn't necessarily mean that all threads will see the value immediately after the write. Can anyone confirm this? This would mean this solution is not thread safe (this may not be the only reason of course).
Attempt 2:
The above, but using Interlocked.Exchange to ensure the value is actually published:
public class Foobar
{
private Int32 _isFrozen;
public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void WriteValue(Object val)
{
if (_isFrozen == 1)
throw new InvalidOperationException();
// write ...
}
}
The advantage here would be that we ensure the value is published without suffering the overhead on every read. Since the Interlocked method uses a full memory barrier, none of the reads should move before the write to _isFrozen, so I would guess this is thread safe. However, who knows what the compiler will do (and according to section 3.10 of the C# spec that seems like quite a lot), so I don't know if this is threadsafe.
Attempt 3:
Also do the read using Interlocked.
public class Foobar
{
private Int32 _isFrozen;
public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void WriteValue(Object val)
{
if (Interlocked.CompareExchange(ref _isFrozen, 0, 0) == 1)
throw new InvalidOperationException();
// write ...
}
}
Definitely thread safe, but it seems a little wasteful to have to do the compare exchange for every read. I know this overhead is probably minimal, but I'm looking for a reasonably efficient method (although perhaps this is it).
Attempt 4:
Using volatile:
public class Foobar
{
private volatile Boolean _isFrozen;
public void Freeze() { _isFrozen = true; }
public void WriteValue(Object val)
{
if (_isFrozen)
throw new InvalidOperationException();
// write ...
}
}
But Joe Duffy declared "sayonara volatile", so I won't consider this a solution.
Attempt 5:
Lock everything, seems a bit overkill:
public class Foobar
{
private readonly Object _syncRoot = new Object();
private Boolean _isFrozen;
public void Freeze() { lock(_syncRoot) _isFrozen = true; }
public void WriteValue(Object val)
{
lock(_syncRoot) // as above we could include an attempt that reads *without* this lock
if (_isFrozen)
throw new InvalidOperationException();
// write ...
}
}
Also seems definitely thread safe, but has more overhead than using the Interlocked approach above, so I would favour attempt 3 over this one.
And then I can come up with at least some more (I'm sure there are many more):
Attempt 6: use Thread.VolatileWrite and Thread.VolatileRead, but these are supposedly a little on the heavy side.
Attempt 7: use Thread.MemoryBarrier, seems a little too internal.
Attempt 8: create an immutable copy - don't want to do this
Summarising:
which attempt would you use and why (or how would you do it if entirely different)? (i.e. what is the best way for publishing a value once that is then read concurrently, while being reasonably efficient without being overly "clever"?)
do the "release" semantics of writes in .NET's memory model imply that all other threads see updates (cache coherency etc.)? I generally don't want to think too much about this, but it's nice to have an understanding.
EDIT:
Perhaps my question wasn't clear, but I am looking in particular for reasons as to why the above attempts are good or bad. Note that I am talking here about a scenario of one single writer that writes then freezes before any concurrent reads. I believe attempt 1 is OK but I'd like to know exactly why (as I wonder if reads could be optimized away somehow, for example).
I care less about whether or not this is good design practice but more about the actual threading aspect of it.
Many thanks for the responses the question received, but I have chosen to mark this as the answer myself because I feel that the answers given do not quite answer my question, and I do not want to give visitors to the site the impression that the marked answer is correct simply because it was automatically marked as such due to the bounty expiring.
Furthermore, I do not think the answer with the highest number of votes was voted for overwhelmingly enough to mark it as the answer automatically.
I am still leaning to attempt #1 being correct, however, I would have liked some authoritative answers. I understand x86 has a strong model, but I don't want to (and shouldn't) code for a particular architecture, after all that's one of the nice things about .NET.
If you are in doubt about the answer, go for one of the locking approaches, perhaps with the optimizations shown here to avoid a lot of contention on the lock.
Maybe slightly off topic, but just out of curiosity :) why don't you use "real" immutability? E.g. make Freeze() return an immutable copy (without "write methods" or any other possibility of changing the inner state) and use this copy instead of the original object. You could even go without changing the state and return a new copy (with the changed state) on each write operation instead (AFAIK the string class works this way). "Real immutability" is inherently thread safe.
I vote for Attempt 5, use the lock(this) implementation.
This is the most reliable means of making this work. Reader/writer locks could be employed, but to very little gain. Just go with using a normal lock.
If necessary you could improve the 'frozen' performance by first checking _isFrozen and then locking:
void Freeze() { lock (this) _isFrozen = true; }
object ReadValue()
{
if (_isFrozen)
return Read();
else
lock (this) return Read();
}
void WriteValue(object value)
{
lock (this)
{
if (_isFrozen) throw new InvalidOperationException();
Write(value);
}
}
If you really create, fill and freeze the object before showing it to other threads, then you don't need anything special to deal with thread safety (the strong memory model of .NET is already your guarantee), so solution 1 is valid.
But if you give the unfrozen object to another thread (or if you are simply creating your class without knowing how users will use it), then the solution that returns a new, fully immutable instance is probably better. In this case, the mutable instance is like the StringBuilder and the immutable instance is like the string. If you need an extra guarantee, the mutable instance may check its creator thread and throw exceptions if it is used from any other thread (in all methods, to avoid possible partial reads).
Attempt 2 is thread safe on x86 and other processors that have a strong memory model, but how I would do it is to make thread safety the consumer's problem, because there is no way for you to do it efficiently within the consumed code. Consider:
if(!foo.frozen)
{
foo.apropery = "avalue";
}
The thread safety of the frozen property and the guard code in apropery's setter doesn't really matter, because even if they are perfectly thread safe you still have a race condition. Instead I would write it like:
lock(foo)
{
if(!foo.frozen)
{
foo.apropery = "avalue";
}
}
and have neither of the properties inherently thread safe.
#1 - reader not thread-safe - I believe the problem would be on the reader side, not the writer (code not shown)
#2 - reader not thread-safe - same as #1
#3 - promising; the read check can be optimized out for most cases (when CPU caches are in sync)
Attempt 3:
Also do the read using Interlocked.
public class Foobar {
private object _syncRoot = new object();
private object _val; // the value being published
private int _isFrozen = 0; // explicit default; the compiler may warn the initializer is redundant
// Why Exchange to 1 then throw away result. Best to just increment.
//public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void Freeze() { Interlocked.Increment(ref _isFrozen); }
public void WriteValue(Object val) {
// if this core can see _isFrozen then no special lock or sync needed
if (_isFrozen != 0)
throw new InvalidOperationException();
lock(_syncRoot) {
if (_isFrozen != 0)
throw new InvalidOperationException(); // the 'throw' is 100x-1000x more costly than the lock, just eat it
_val = val;
}
}
public object Read() {
// frozen is one-way, if one-way state has been published
// to my local CPU cache then just read _val.
// There are very strange corner cases when _isFrozen and _val fields are in
// different cache lines, but should be nearly impossible to hit unless
// dealing with very large structs (make it more likely to cross
// 4k cache line).
if (_isFrozen != 0)
return _val;
// else
lock(_syncRoot) { // _isFrozen is 0 here
if (_isFrozen != 0) // if _isFrozen is 1 here we just collided with writer using lock on other thread, or our CPU cache was out of sync and lock() forced the dirty cache line to be read from main memory
return _val;
throw new InvalidOperationException(); // throw is 100x-1000x more expensive than lock, eat the cost of lock
}
}
}
Joe Duffy's post about 'volatile is dead' is, I think, in the context of his next-gen CLR/OS architecture and of the CLR on ARM. For those of us doing multi-core x64/x86, I think volatile is fine. If perf is the primary concern, I suggest you measure the code above and compare it to volatile.
Unlike other folks posting answers, I wouldn't jump straight to lock() if you have lots of readers (3 or more threads likely to read the same object at the same time). But in your sample you mix a perf-sensitive question with exceptions thrown when a collision happens, which doesn't make much sense. If you're using exceptions, then you can also use other higher-level constructs.
If you want complete safety but need to optimize for lots of concurrent readers, change lock()/Monitor to ReaderWriterLockSlim.
.NET has new primitives to handle publishing values. Take a look at Rx. It can be very fast and lockless for some cases (I think they use optimizations similar to above).
If the value is written multiple times but only one value is kept - in Rx that is "new ReplaySubject(bufferSize: 1)". If you try it you might be surprised how fast it is. At the same time, I applaud your attempt to learn this level of detail.
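A minimal sketch of the ReplaySubject idea (requires the Rx library; the value type and names are illustrative):

    using System;
    using System.Reactive.Subjects;

    // Keeps only the most recent value; late subscribers receive it immediately.
    var published = new ReplaySubject<int>(bufferSize: 1);
    published.OnNext(1);
    published.OnNext(2); // replaces 1 in the single-slot buffer
    published.Subscribe(v => Console.WriteLine(v)); // prints 2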
If you want to go lockless, get over your distaste for Thread.MemoryBarrier(). It is extremely important. But it has the same gotchas as volatile, as described by Joe Duffy - it was designed as a hint to the compiler & CPU to prevent reordering of memory reads (which take a long time in CPU terms, so they are aggressively reordered when no hints are present). When this reordering is combined with CLR constructs like auto-inlining of functions, you can see very surprising behavior at the memory & register level. MemoryBarrier() just disables those single-threaded memory-access assumptions that the CPU and CLR use most of the time.
Perhaps my question wasn't clear, but I am looking in particular for reasons as to why the above attempts are good or bad. Note that I am talking here about a scenario of one single writer that writes then freezes before any concurrent reads. I believe attempt 1 is OK but I'd like to know exactly why (as I wonder if reads could be optimized away somehow, for example). I care less about whether or not this is good design practice but more about the actual threading aspect of it.
Ok, now I better understand what you are doing and looking for in a response. Allow me to elaborate on my previous answer promoting the use of locks by first addressing each of your attempts.
Attempt 1:
The approach of using a simple class that has no synchronization primitives of any form is entirely viable in your example. Since the 'authoring' thread is the only thread having access to this class during its mutating state, this should be safe. If and only if another thread has the potential to access the class before it is 'frozen' would you need to provide synchronization. Essentially, it's not possible for a thread to have a cache of something it has never seen.
Aside from a thread having a cached copy of the internal state of this list, there is one other concurrency issue that you should be concerned with: write reordering by the authoring thread. Your example solution doesn't have enough code for me to address this, but the process of handing this 'frozen' list to another thread is the heart of the issue. Are you using Interlocked.Exchange, or writing to a volatile field?
I still advocate that this is not the best approach, simply because there is no guarantee that another thread has not seen the instance while it was mutating.
Attempt 2:
Attempt 2 should not be used. If you are using atomic writes to a member, you should also use atomic reads. I would never recommend one without the other; without both reads and writes being atomic you haven't gained anything. The correct application of atomic reads and writes is your 'Attempt 3'.
Attempt 3:
This will guarantee an exception is thrown if a thread has attempted to mutate a frozen list. However, it makes no assertion that a read is only acceptable on a frozen instance. This, IMHO, is just as bad as accessing our _isFrozen variable with both atomic and non-atomic accessors. If you are going to say that it's important to safeguard writes, then you should always safeguard reads. One without the other is just 'odd'.
Overlooking my own feelings about writing code that guards writes but not reads, this is an acceptable approach given your specific use: I have one writer, I write, I freeze, then I make it available to readers. Under this scenario your code works correctly. You rely on the atomic set of _isFrozen to provide the required memory barrier prior to handing the class to another thread.
In a nutshell this approach works, but again if a thread has an instance that is not frozen it's going to break.
Attempt 4:
While at heart this is nearly the same as attempt 3 (given one writer) there is one big difference. In this example, if you check _isFrozen in the reader then every access will require a memory barrier. This is unnecessary overhead once the list is frozen.
Still this has the same issue as Attempt 3 in that no assertions are made about the state of _isFrozen during the read so the performance should be identical in your example usage.
Attempt 5:
As I said this is my preference given the modification to read as appears in my other answer.
Attempt 6:
Is essentially the same as #4.
Attempt 7:
You could solve your specific needs with a Thread.MemoryBarrier. Essentially using the code from Attempt 1, you create the instance, call Freeze(), add your Thread.MemoryBarrier, and then share the instance (or share it within a lock). This should work great, again only under your limited use case.
Attempt 8:
Without knowing more about this, I can't advise on the cost of the copy.
Summary
Again I prefer using a class that has some threading guarantee or none at all. Creating a class that is only 'partially' thread safe is, IMO, dangerous.
In the words of a famous jedi master:
Either do or do not; there is no try.
The same goes for thread safety. The class should either be thread safe or not. Taking this approach you are left with either using my augmentation of Attempt 5, or using Attempt 7. Given the choice, I would never recommend #7.
So my recommendation stands firmly behind a completely thread-safe version. The performance cost between the two is so infinitesimally small it's almost non-existent. The reader threads will never hit the lock, simply because of your usage scenario of having a single writer. Yet if they do, proper behavior is still a certainty. Thus as your code changes over time, and suddenly your instance is being shared prior to being frozen, you don't wind up with a race condition that crashes your program. Thread-safe or not, don't be half-in, or you'll wind up with a nasty surprise someday.
My preference is all classes shared by more than one thread are one of two types:
Completely immutable.
Completely Thread-safe.
Since a popsicle list is not immutable by design it does not fit #1. Therefore if you are going to share the object across threads it should fit #2.
Hopefully all this ranting further explains my reasoning :)
_syncRoot
Many people have noticed that I skipped the use of a _syncRoot on my locking implementation. While the reasons to use _syncRoot are valid they are not always necessary. In your example usage where you have a single writer the use of lock(this) should suffice nicely without adding another heap allocation for _syncRoot.
Is the thing constructed and written to, then permanently frozen and read multiple times?
Or do you freeze and unfreeze and refreeze it multiple times?
If it's the former, then perhaps the "is frozen" check should be in the reader method not the writer method (to prevent it reading before it's frozen).
Or, if it's the latter, then the use case you need to beware of is:
Main thread invokes the writer method, finds that it's not frozen, and therefore begins to write
Before the write has finished, someone tries to freeze the object and then reads from it, while the other (main) thread is still writing
In the latter case, Google shows a lot of results for multiple reader single writer which you might find interesting.
In general, each mutable object should have precisely one clearly-defined "owner"; shared objects should be immutable. Popsicles should not be accessible by multiple threads until after they are frozen.
Personally, I don't like forms of popsicle immutability with an exposed "freeze" method. I think a cleaner approach is to have AsMutable and AsImmutable methods (each of which would simply return the object unmodified when appropriate). Such an approach can allow for more robust promises about immutability. For example, if an "unshared mutable object" is being mutated while its AsImmutable member is being called (behavior which would be contrary to the object being "unshared"), the state of the data in the copy may be indeterminate, but whatever was returned would be immutable. By contrast, if one thread froze an object and then assumed it was immutable while another thread was writing to it, the "immutable" object could end up changing after it was frozen and its values were read.
Edit
Based on further description, I would suggest having code which writes to the object do so within a monitor lock, and having the freeze routine look something like:
public Thingie Freeze() // Returns the object in question
{
if (isFrozen) // Private field
return this;
else
return DoFreeze();
}
Thingie DoFreeze()
{
if (Monitor.TryEnter(whatever))
{
isFrozen = true;
return this;
}
else if (isFrozen)
return this;
else
throw new InvalidOperationException("Object in use by writer");
}
The Freeze method may be called any number of times by any number of threads; it should be short enough to be inlined (though I haven't profiled it), and should thus take almost no time to execute. If the first access of the object in any thread is via the Freeze method, that should guarantee proper visibility under any reasonable memory model (even if the thread didn't see the updates to the object performed by the thread which created and originally froze it, it would perform the TryEnter, which would guarantee a memory barrier, and after that failed it would notice that the object was frozen and return it).
If code which is going to write the object acquires the lock first, an attempt to write to a frozen object could deadlock. If one would rather have such code throw an exception, one can use TryEnter and throw an exception if it can't get the lock.
The object used for locking should be something which is exclusively held by the object to be frozen. If the object to be frozen doesn't hold a purely-private reference to anything, one could either lock on this or create a private object purely for locking purposes. Note that it is safe to abandon 'entered' monitor locks without cleanup; the GC will simply forget about them, since if no references exist to a lock there's no way anybody will ever care (or could even ask) whether the lock was entered at the time it was abandoned.
I am not sure how the following approach will do in terms of cost, but it is a bit different. Only initially, if there are multiple threads trying to write a value simultaneously, will they encounter locks. Once the object is frozen, all later calls will get the exception directly.
Attempt 9:
public class Foobar
{
private readonly Object _syncRoot = new Object();
private object _val;
private Boolean _isFrozen;
private Action<object> WriteValInternal;
public void Freeze() { _isFrozen = true; }
public Foobar()
{
WriteValInternal = BeforeFreeze;
}
private void BeforeFreeze(object val)
{
lock (_syncRoot)
{
if (_isFrozen == false)
{
//Write the values....
_val = val;
//...
//...
//...
//and then modify the write value function
WriteValInternal = AfterFreeze;
Freeze();
}
else
{
throw new InvalidOperationException();
}
}
}
private void AfterFreeze(object val)
{
throw new InvalidOperationException();
}
public void WriteValue(Object val)
{
WriteValInternal(val);
}
public Object ReadSomething()
{
return _val;
}
}
Have you checked out Lazy<T>?
http://msdn.microsoft.com/en-us/library/dd642331.aspx
which uses ThreadLocal<T>:
http://msdn.microsoft.com/en-us/library/dd642243.aspx
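For the single-writer-then-many-readers scenario in the question, a minimal Lazy<T> sketch might look like this (Foobar and BuildFoobar are illustrative names):

    // ExecutionAndPublication guarantees exactly one initializer runs and
    // that every reader sees the fully constructed instance.
    private static readonly Lazy<Foobar> _frozen = new Lazy<Foobar>(
        BuildFoobar, LazyThreadSafetyMode.ExecutionAndPublication);

    private static Foobar BuildFoobar()
    {
        var f = new Foobar();
        // ... configure f here, before any other thread can observe it ...
        return f;
    }

    // Any thread may then read _frozen.Value concurrently.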
And actually looking further there is a Freezable class...
http://msdn.microsoft.com/en-us/library/vstudio/ms602734(v=vs.100).aspx
You may achieve this using PostSharp.
First, define an interface:
public interface IPseudoImmutable
{
bool IsFrozen { get; }
bool Freeze();
}
Then derive your attribute from InstanceLevelAspect, like this:
/// <summary>
/// implement by divyang
/// </summary>
[Serializable]
[IntroduceInterface(typeof(IPseudoImmutable),
AncestorOverrideAction = InterfaceOverrideAction.Ignore, OverrideAction = InterfaceOverrideAction.Fail)]
public class PseudoImmutableAttribute : InstanceLevelAspect, IPseudoImmutable
{
private volatile bool isFrozen;
#region "IPseudoImmutable"
[IntroduceMember]
public bool IsFrozen
{
get
{
return this.isFrozen;
}
}
[IntroduceMember(IsVirtual = true, OverrideAction = MemberOverrideAction.Fail)]
public bool Freeze()
{
if (!this.isFrozen)
{
this.isFrozen = true;
}
return this.IsFrozen;
}
#endregion
[OnLocationSetValueAdvice]
[MulticastPointcut(Targets = MulticastTargets.Property | MulticastTargets.Field)]
public void OnValueChange(LocationInterceptionArgs args)
{
if (!this.IsFrozen)
{
args.ProceedSetValue();
}
}
}
public class ImmutableException : Exception
{
/// <summary>
/// The location name.
/// </summary>
private readonly string locationName;
/// <summary>
/// Initializes a new instance of the <see cref="ImmutableException"/> class.
/// </summary>
/// <param name="message">
/// The message.
/// </param>
public ImmutableException(string message)
: base(message)
{
}
public ImmutableException(string message, string locationName)
: base(message)
{
this.locationName = locationName;
}
public string LocationName
{
get
{
return this.locationName;
}
}
}
Then apply it to your class like this:
[PseudoImmutableAttribute]
public class MyTestClass
{
public string MyString { get; set; }
public int MyInitval { get; set; }
}
Then run it with multiple threads:
/// <summary>
/// The program.
/// </summary>
public class Program
{
/// <summary>
/// The main.
/// </summary>
/// <param name="args">
/// The args.
/// </param>
public static void Main(string[] args)
{
Console.Title = "Divyang Demo ";
var w = new Worker();
w.Run();
Console.ReadLine();
}
}
internal class Worker
{
private object SyncObject = new object();
public Worker()
{
var r = new Random();
this.ObjectOfMyTestClass = new MyTestClass { MyInitval = r.Next(500) };
}
public MyTestClass ObjectOfMyTestClass { get; set; }
public void Run()
{
Task readWork;
readWork = Task.Factory.StartNew(
action: () =>
{
for (;;)
{
System.Threading.Thread.Sleep(1000); // Task.Delay without await would not actually pause
try
{
this.DoReadWork();
}
catch (Exception exception)
{
// Console.SetCursorPosition(80,80);
// Console.SetBufferSize(100,100);
Console.WriteLine("Read Exception : {0}", exception.Message);
}
}
// ReSharper disable FunctionNeverReturns
});
Task writeWork;
writeWork = Task.Factory.StartNew(
action: () =>
{
for (int i = 0; i < int.MaxValue; i++)
{
System.Threading.Thread.Sleep(1000); // Task.Delay without await would not actually pause
try
{
this.DoWriteWork();
}
catch (Exception exception)
{
Console.SetCursorPosition(80, 80);
Console.SetBufferSize(100, 100);
Console.WriteLine("write Exception : {0}", exception.Message);
}
if (i == 5000)
{
((IPseudoImmutable)this.ObjectOfMyTestClass).Freeze();
}
}
});
Task.WaitAll(readWork, writeWork);
}
/// <summary>
/// The do read work.
/// </summary>
public void DoReadWork()
{
// ThreadId where reading is done
var threadId = System.Threading.Thread.CurrentThread.ManagedThreadId;
// printing on screen
lock (this.SyncObject)
{
Console.SetCursorPosition(0, 0);
Console.SetBufferSize(290, 290);
Console.WriteLine("\n");
Console.WriteLine("Read Start");
Console.WriteLine("Read => Thread Id: {0} ", threadId);
Console.WriteLine("Read => this.objectOfMyTestClass.MyInitval: {0} ", this.ObjectOfMyTestClass.MyInitval);
Console.WriteLine("Read => this.objectOfMyTestClass.MyString: {0} ", this.ObjectOfMyTestClass.MyString);
Console.WriteLine("Read End");
Console.WriteLine("\n");
}
}
/// <summary>
/// The do write work.
/// </summary>
public void DoWriteWork()
{
// ThreadId where reading is done
var threadId = System.Threading.Thread.CurrentThread.ManagedThreadId;
// random number generator
var r = new Random();
var count = r.Next(15);
// new value for Int property
var tempInt = r.Next(5000);
this.ObjectOfMyTestClass.MyInitval = tempInt;
// new value for string Property
var tempString = "Randome" + r.Next(500).ToString(CultureInfo.InvariantCulture);
this.ObjectOfMyTestClass.MyString = tempString;
// printing on screen
lock (this.SyncObject)
{
Console.SetBufferSize(290, 290);
Console.SetCursorPosition(125, 25);
Console.WriteLine("\n");
Console.WriteLine("Write Start");
Console.WriteLine("Write => Thread Id: {0} ", threadId);
Console.WriteLine("Write => this.objectOfMyTestClass.MyInitval: {0} and New Value :{1} ", this.ObjectOfMyTestClass.MyInitval, tempInt);
Console.WriteLine("Write => this.objectOfMyTestClass.MyString: {0} and New Value :{1} ", this.ObjectOfMyTestClass.MyString, tempString);
Console.WriteLine("Write End");
Console.WriteLine("\n");
}
}
}
But it will still allow you to change the contents of reference-type properties such as arrays and lists. With more logic in the aspect, it could be made to work for all types of properties and fields.
I'd do something like this, inspired by C++ movable types. Just remember not to access the object after Freeze/Thaw.
Of course, you can add a _data != null check/throw if you want to be clear about why the user gets an NRE if accessing after thaw/freeze.
public class Data
{
public string _foo;
public int _bar;
}
public class Mutable
{
private Data _data = new Data();
public Mutable() {}
public Mutable(Data data) => _data = data; // used by Frozen.Thaw()
public string Foo { get => _data._foo; set => _data._foo = value; }
public int Bar { get => _data._bar; set => _data._bar = value; }
public Frozen Freeze()
{
var f = new Frozen(_data);
_data = null;
return f;
}
}
public class Frozen
{
private Data _data;
public Frozen(Data data) => _data = data;
public string Foo => _data._foo;
public int Bar => _data._bar;
public Mutable Thaw()
{
var m = new Mutable(_data);
_data = null;
return m;
}
}

Is the Managed heap not scalable to multi-core systems

I was seeing some strange behavior in a multi threading application which I wrote and which was not scaling well across multiple cores.
The following code illustrates the behavior I am seeing. It appears that heap-intensive operations do not scale across multiple cores; rather, they seem to slow down, i.e. using a single thread would be faster.
class Program
{
public static Data _threadOneData = new Data();
public static Data _threadTwoData = new Data();
public static Data _threadThreeData = new Data();
public static Data _threadFourData = new Data();
static void Main(string[] args)
{
// Do heap intensive tests
var start = DateTime.Now;
RunOneThread(WorkerUsingHeap);
var finish = DateTime.Now;
var timeLapse = finish - start;
Console.WriteLine("One thread using heap: " + timeLapse);
start = DateTime.Now;
RunFourThreads(WorkerUsingHeap);
finish = DateTime.Now;
timeLapse = finish - start;
Console.WriteLine("Four threads using heap: " + timeLapse);
// Do stack intensive tests
start = DateTime.Now;
RunOneThread(WorkerUsingStack);
finish = DateTime.Now;
timeLapse = finish - start;
Console.WriteLine("One thread using stack: " + timeLapse);
start = DateTime.Now;
RunFourThreads(WorkerUsingStack);
finish = DateTime.Now;
timeLapse = finish - start;
Console.WriteLine("Four threads using stack: " + timeLapse);
Console.ReadLine();
}
public static void RunOneThread(ParameterizedThreadStart worker)
{
var threadOne = new Thread(worker);
threadOne.Start(_threadOneData);
threadOne.Join();
}
public static void RunFourThreads(ParameterizedThreadStart worker)
{
var threadOne = new Thread(worker);
threadOne.Start(_threadOneData);
var threadTwo = new Thread(worker);
threadTwo.Start(_threadTwoData);
var threadThree = new Thread(worker);
threadThree.Start(_threadThreeData);
var threadFour = new Thread(worker);
threadFour.Start(_threadFourData);
threadOne.Join();
threadTwo.Join();
threadThree.Join();
threadFour.Join();
}
static void WorkerUsingHeap(object state)
{
var data = state as Data;
for (int count = 0; count < 100000000; count++)
{
var property = data.Property;
data.Property = property + 1;
}
}
static void WorkerUsingStack(object state)
{
var data = state as Data;
double dataOnStack = data.Property;
for (int count = 0; count < 100000000; count++)
{
dataOnStack++;
}
data.Property = dataOnStack;
}
public class Data
{
public double Property
{
get;
set;
}
}
}
This code was run on a Core 2 Quad (4 core system) with the following results:
One thread using heap: 00:00:01.8125000
Four threads using heap: 00:00:17.7500000
One thread using stack: 00:00:00.3437500
Four threads using stack: 00:00:00.3750000
So using the heap with four threads did 4 times the work, but took almost 10 times as long. This means it would be twice as fast in this case to use only one thread?
The stack results were much more in line with what I expected.
I would like to know what is going on here. Can the heap only be written to from one thread at a time?
The answer is simple - run outside of Visual Studio...
I just copied your entire program, and ran it on my quad core system.
Inside VS (Release Build):
One thread using heap: 00:00:03.2206779
Four threads using heap: 00:00:23.1476850
One thread using stack: 00:00:00.3779622
Four threads using stack: 00:00:00.5219478
Outside VS (Release Build):
One thread using heap: 00:00:00.3899610
Four threads using heap: 00:00:00.4689531
One thread using stack: 00:00:00.1359864
Four threads using stack: 00:00:00.1409859
Note the difference. The extra time in the build outside VS is pretty much all due to the overhead of starting the threads. Your work in this case is too small to really test, and you're not using the high performance counters, so it's not a perfect test.
Main rule of thumb - always do perf. testing outside VS, ie: use Ctrl+F5 instead of F5 to run.
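On the high-performance-counter point: a minimal change is to time with Stopwatch, which wraps the high-resolution counter, instead of DateTime.Now:

    var sw = System.Diagnostics.Stopwatch.StartNew();
    RunFourThreads(WorkerUsingHeap);
    sw.Stop();
    Console.WriteLine("Four threads using heap: " + sw.Elapsed);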
Aside from the debug-vs-release effects, there is something more you should be aware of.
You cannot effectively evaluate multi-threaded code for performance in 0.3s.
The point of threads is two-fold: effectively model parallel work in code, and effectively exploit parallel resources (cpus, cores).
You are trying to evaluate the latter. Given that thread start overhead is not vanishingly small in comparison to the interval over which you are timing, your measurement is immediately suspect. In most perf test trials, a significant warm-up interval is appropriate. This may sound silly to you - it's a computer program after all, not a lawnmower. But warm-up is absolutely imperative if you are really going to evaluate multi-thread performance. Caches get filled, pipelines fill up, pools get filled, GC generations get filled. The steady-state, continuous performance is what you would like to evaluate. For the purposes of this exercise, the program behaves like a lawnmower.
You could say - Well, no, I don't want to evaluate the steady state performance. And if that is the case, then I would say that your scenario is very specialized. Most app scenarios, whether their designers explicitly realize it or not, need continuous, steady performance.
If you truly need the perf to be good only over a single 0.3s interval, you have found your answer. But be careful to not generalize the results.
If you want general results, you need to have reasonably long warm up intervals, and longer collection intervals. You might start at 20s/60s for those phases, but here is the key thing: you need to vary those intervals until you find the results converging. YMMV. The valid times vary depending on the application workload and the resources dedicated to it, obviously. You may find that a measurement interval of 120s is necessary for convergence, or you may find 40s is just fine. But (a) you won't know until you measure it, and (b) you can bet 0.3s is not long enough.
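A sketch of that warm-up/measure split (interval lengths and the workload delegate are illustrative):

    // Run the workload untimed until caches, pools and GC generations settle,
    // then measure only the steady state.
    static double MeasureOpsPerSecond(Action workload, TimeSpan warmup, TimeSpan measure)
    {
        var sw = System.Diagnostics.Stopwatch.StartNew();
        while (sw.Elapsed < warmup) workload();

        sw.Restart();
        long iterations = 0;
        while (sw.Elapsed < measure) { workload(); iterations++; }

        return iterations / sw.Elapsed.TotalSeconds;
    }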
[edit]Turns out, this is a release vs. debug build issue -- not sure why it is, but it is. See comments and other answers.[/edit]
This was very interesting -- I wouldn't have guessed there'd be that much difference. (similar test machine here -- Core 2 Quad Q9300)
Here's an interesting comparison -- add a decent-sized additional element to the 'Data' class -- I changed it to this:
public class Data
{
public double Property { get; set; }
public byte[] Spacer = new byte[8096];
}
It's still not quite the same time, but it's very close (running it for 10x as long results in 13.1s vs. 17.6s on my machine).
If I had to guess, I'd speculate that it's related to cross-core cache coherency (what's usually called false sharing), at least if I'm remembering how CPU caches work. With the small version of 'Data', if a single cache line contains multiple instances of Data, the cores are having to constantly invalidate each other's caches (worst case if they're all on the same cache line). With the 'spacer' added, their memory addresses are far enough apart that one CPU's write of a given address doesn't invalidate the caches of the other CPUs.
Another thing to note -- the 4 threads start nearly concurrently, but they don't finish at the same time -- another indication that there's cross-core issues at work here. Also, I'd guess that running on a multi-cpu machine of a different architecture would bring more interesting issues to light here.
I guess the lesson from this is that in a highly-concurrent scenario, if you're doing a bunch of work with a few small data structures, you should try to make sure they aren't all packed on top of each other in memory. Of course, there's really no way to guarantee that, but there are techniques (like adding spacers, sketched below) that can be used to try to make it happen.
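A hedged sketch of the spacer technique (the 64-byte figure matches common x86/x64 cache lines but is architecture-dependent, and the CLR makes no promises about field layout):

    // Pad the hot field so two instances allocated back-to-back are unlikely
    // to land on the same cache line; field names are illustrative.
    public class PaddedData
    {
        public double Property;                         // the field each thread hammers
    #pragma warning disable 0169                        // padding is intentionally unused
        private long _p0, _p1, _p2, _p3, _p4, _p5, _p6; // 56 bytes of padding
    #pragma warning restore 0169
    }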
[edit]
This was too interesting -- I couldn't put it down. To test this out further, I thought I'd try varying-sized spacers, and use an integer instead of a double to keep the object without any added spacers smaller.
class Program
{
static void Main(string[] args)
{
Console.WriteLine("name\t1 thread\t4 threads");
RunTest("no spacer", WorkerUsingHeap, () => new Data());
var values = new int[] { -1, 0, 4, 8, 12, 16, 20 };
foreach (var sv in values)
{
var v = sv;
RunTest(string.Format(v == -1 ? "null spacer" : "{0}B spacer", v), WorkerUsingHeap, () => new DataWithSpacer(v));
}
Console.ReadLine();
}
public static void RunTest(string name, ParameterizedThreadStart worker, Func<object> fo)
{
var start = DateTime.UtcNow;
RunOneThread(worker, fo);
var middle = DateTime.UtcNow;
RunFourThreads(worker, fo);
var end = DateTime.UtcNow;
Console.WriteLine("{0}\t{1}\t{2}", name, middle-start, end-middle);
}
public static void RunOneThread(ParameterizedThreadStart worker, Func<object> fo)
{
var data = fo();
var threadOne = new Thread(worker);
threadOne.Start(data);
threadOne.Join();
}
public static void RunFourThreads(ParameterizedThreadStart worker, Func<object> fo)
{
var data1 = fo();
var data2 = fo();
var data3 = fo();
var data4 = fo();
var threadOne = new Thread(worker);
threadOne.Start(data1);
var threadTwo = new Thread(worker);
threadTwo.Start(data2);
var threadThree = new Thread(worker);
threadThree.Start(data3);
var threadFour = new Thread(worker);
threadFour.Start(data4);
threadOne.Join();
threadTwo.Join();
threadThree.Join();
threadFour.Join();
}
static void WorkerUsingHeap(object state)
{
var data = state as Data;
for (int count = 0; count < 500000000; count++)
{
var property = data.Property;
data.Property = property + 1;
}
}
public class Data
{
public int Property { get; set; }
}
public class DataWithSpacer : Data
{
public DataWithSpacer(int size) { Spacer = size == 0 ? null : new byte[size]; }
public byte[] Spacer;
}
}
Result:
name          1 thread          4 threads
no spacer     00:00:06.3480000  00:00:42.6260000
null spacer   00:00:06.2300000  00:00:36.4030000
0B spacer     00:00:06.1920000  00:00:19.8460000
4B spacer     00:00:06.1870000  00:00:07.4150000
8B spacer     00:00:06.3750000  00:00:07.1260000
12B spacer    00:00:06.3420000  00:00:07.6930000
16B spacer    00:00:06.2250000  00:00:07.5530000
20B spacer    00:00:06.2170000  00:00:07.3670000
No spacer = 1/6th the speed, null spacer = 1/5th the speed, 0B spacer = 1/3rd the speed, 4B spacer = full speed.
I don't know the full details of how the CLR allocates or aligns objects, so I can't speak to what these allocation patterns look like in real memory, but these definitely are some interesting results.
