How to avoid slowdown due to locked code? - c#

I am wondering how a piece of locked code can slow down my code even though the code is never executed. Here is an example below:
public void Test_PerformanceUnit()
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    Random r = new Random();
    for (int i = 0; i < 10000; i++)
    {
        testRand(r);
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedTicks);
}
public object testRand(Random r)
{
    if (r.Next(1) > 10) // r.Next(1) always returns 0, so this branch never runs
    {
        lock (this)
        {
            return null;
        }
    }
    return r;
}
This code runs in ~1300 ms on my machine. If we remove the lock block (but keep its body), it runs in ~750 ms. Almost double the time, even though the locked code is never executed!
Of course this code does nothing. I noticed it while adding some lazy initialization in a class where the code checks if the object is initialized and if not initializes it. The problem is that the initialization is locked and slows down everything even after the first call.
My questions are:
Why is this happening?
How can I avoid the slowdown?

As for why it's happening, it has been discussed in the comments: it's due to the initialization of the try...finally block that the compiler generates for the lock statement.
To avoid this slowdown, you can extract the locking code into a separate method, so that the locking mechanism is only initialized if that method is actually called.
I tried it with this simple code:
private readonly object _lock = new object();

public object testRand(Random r)
{
    if (r.Next(1) > 10)
    {
        return LockingFeature();
    }
    return r;
}

private object LockingFeature()
{
    lock (_lock)
    {
        return null;
    }
}
And here are my times (in ticks):
your code, no lock : ~500
your code, with lock : ~1200
my code : ~500
EDIT: My test code (running a bit slower than the code with no locks) was actually using static methods; it appears that when the code is run "inside" an object, the timings are the same. I have fixed the timings accordingly.
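For the lazy-initialization scenario described in the question, another option (a sketch, not from the original answers) is the BCL's Lazy<T>, which confines all locking to the very first access so the hot path stays lock-free afterwards; the Service class and its "initialized" payload here are hypothetical placeholders:

```csharp
using System;

public class Service
{
    // The default LazyThreadSafetyMode.ExecutionAndPublication only
    // synchronizes the first initialization; subsequent reads of
    // Value are plain field reads with no lock overhead.
    private readonly Lazy<string> _expensive =
        new Lazy<string>(() => "initialized");

    public string Value => _expensive.Value;
}
```

This moves the try...finally cost into the framework and out of the method that is called on every access.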

Related

Inform the compiler that a variable might be updated from another thread

This would generally be done using volatile. But in the case of a long or double that's impossible.
Perhaps just making it public is enough, and the compiler then knows that this can be used by another assembly and won't "optimize it out"? Can this be relied upon? Some other way?
To be clear, I'm not worried about concurrent reading/writing of the variable. Only one thing - that it doesn't get optimized out. (Like in https://stackoverflow.com/a/1284007/939213 .)
The best way to prevent code removal is to use the code.
If you are worried about the compiler optimizing away the while loop in your example:
class Test
{
    long foo;

    static void Main()
    {
        var test = new Test();
        new Thread(delegate() { Thread.Sleep(500); test.foo = 255; }).Start();
        while (test.foo != 255) ;
        Console.WriteLine("OK");
    }
}
You could still use volatile to do this by modifying your while loop:
volatile int temp;
// code skipped in this sample
while (test.foo != 255) { temp = (int)test.foo; }
Now, assuming you are SURE you won't have any thread-safety issues: you are using your long foo, so it won't be optimized away, and you don't care about losing any part of your long since you are just trying to keep it alive.
Make sure you mark your code very clearly if you do something like this; possibly write a VolatileLong class that wraps your long (and your volatile int) so other people understand what you are doing.
Also, other thread-safety tools like locks will prevent code removal. For example, the compiler is smart enough not to remove the double if check in the singleton pattern like this:
if (_instance == null) {
    lock (_lock) {
        if (_instance == null) {
            _instance = new Singleton();
        }
    }
}
return _instance;
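For the long/double case in the question specifically, a sketch using Volatile.Read/Volatile.Write (available since .NET 4.5) sidesteps the rule that the volatile keyword cannot be applied to 64-bit fields; the Flag class and _foo field names are just illustrative:

```csharp
using System.Threading;

class Flag
{
    private long _foo; // cannot be declared volatile: 64-bit field

    // Volatile.Write has release semantics and is atomic even for longs.
    public void Set(long value) => Volatile.Write(ref _foo, value);

    // Volatile.Read has acquire semantics, so a polling loop like
    //   while (flag.Get() != 255) ;
    // cannot be optimized into an infinite loop, and reads are never torn.
    public long Get() => Volatile.Read(ref _foo);
}
```

This gets you the "don't optimize it out" guarantee without the volatile-int wrapper trick, and without tearing.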

Writing a unit test for concurrent C# code?

I've been trying to solve this issue for quite some time now. I've written some example code showcasing the usage of lock in C#. Running my code manually I can see that it works the way it should, but of course I would like to write a unit test that confirms my code.
I have the following ObjectStack.cs class:
enum ExitCode
{
    Success = 0,
    Error = 1
}

public class ObjectStack
{
    private readonly Stack<Object> _objects = new Stack<object>();
    private readonly Object _lockObject = new Object();
    private const int NumOfPopIterations = 1000;

    public ObjectStack(IEnumerable<object> objects)
    {
        foreach (var anObject in objects) {
            Push(anObject);
        }
    }

    public void Push(object anObject)
    {
        _objects.Push(anObject);
    }

    public void Pop()
    {
        _objects.Pop();
    }

    public void ThreadSafeMultiPop()
    {
        for (var i = 0; i < NumOfPopIterations; i++) {
            lock (_lockObject) {
                try {
                    Pop();
                }
                //Because of lock, the stack will be emptied safely and no exception is ever caught
                catch (InvalidOperationException) {
                    Environment.Exit((int)ExitCode.Error);
                }
                if (_objects.Count == 0) {
                    Environment.Exit((int)ExitCode.Success);
                }
            }
        }
    }

    public void ThreadUnsafeMultiPop()
    {
        for (var i = 0; i < NumOfPopIterations; i++) {
            try {
                Pop();
            }
            //Because there is no lock, an exception is caught when popping an already empty stack
            catch (InvalidOperationException) {
                Environment.Exit((int)ExitCode.Error);
            }
            if (_objects.Count == 0) {
                Environment.Exit((int)ExitCode.Success);
            }
        }
    }
}
And Program.cs:
public class Program
{
    private const int NumOfObjects = 100;
    private const int NumOfThreads = 10000;

    public static void Main(string[] args)
    {
        var objects = new List<Object>();
        for (var i = 0; i < NumOfObjects; i++) {
            objects.Add(new object());
        }
        var objectStack = new ObjectStack(objects);
        Parallel.For(0, NumOfThreads, x => objectStack.ThreadUnsafeMultiPop());
    }
}
I'm trying to write a unit test that exercises the thread-unsafe method by checking the exit code (0 = success, 1 = error) of the executable.
I tried to start and run the application executable as a process in my test a couple of hundred times, checking the exit code each time. Unfortunately, it was 0 every single time.
Any ideas are greatly appreciated!
Logically, there is one, very small, piece of code where this problem can happen. Once one of the threads enters the block of code that pops a single element, then either the pop will work in which case the next line of code in that thread will Exit with success OR the pop will fail in which case the next line of code will catch the exception and Exit.
This means that no matter how much parallelization you put into the program, there is still only one single point in the whole program execution stack where the issue can occur and that is directly before the program exits.
The code is genuinely unsafe, but the probability of an issue happening in any single execution of the code is extremely low as it requires the scheduler to decide not to execute the line of code that will exit the environment cleanly and instead let one of the other Threads raise an exception and exit with an error.
It is extremely difficult to "prove" that a concurrency bug exists, except for really obvious ones, because you are completely dependent on what the scheduler decides to do.
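One way to make such a race far more likely to show up (a sketch, not from the original answers; the thread and iteration counts are arbitrary) is to line every worker up on a Barrier so they all hit the unsynchronized code at the same instant, instead of hoping the scheduler interleaves them:

```csharp
using System;
using System.Threading;

class RaceDemo
{
    // Runs `threads` workers that each increment a shared counter
    // `increments` times without synchronization, all released together
    // by a Barrier to maximize contention. Returns the final count.
    public static int Run(int threads, int increments)
    {
        int counter = 0;
        var barrier = new Barrier(threads);
        var workers = new Thread[threads];

        for (int i = 0; i < threads; i++)
        {
            workers[i] = new Thread(() =>
            {
                barrier.SignalAndWait(); // release all threads at once
                for (int j = 0; j < increments; j++)
                    counter++;           // unsynchronized: lost updates likely
            });
            workers[i].Start();
        }

        foreach (var t in workers) t.Join();
        return counter;
    }

    static void Main()
    {
        // With the barrier widening the race window, the result is
        // usually well below 4 * 100000.
        Console.WriteLine(Run(4, 100000));
    }
}
```

The same barrier trick could be applied to the Pop threads above so that they all reach the final element simultaneously.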
Looking at some other posts, I see this one, which is written about Java but also applies to C#: How should I unit test threaded code?
It includes a link to this which might be useful to you: http://research.microsoft.com/en-us/projects/chess/
Hope this is useful and apologies if it is not. Testing concurrency is inherently unpredictable as is writing example code to cause it.
Thanks for all the input! Although I do agree that this is a concurrency issue quite hard to detect due to the scheduler execution among other things, I seem to have found an acceptable solution to my problem.
I wrote the following unit test:
[TestMethod]
public void Executable_Process_Is_Thread_Safe()
{
    const string executablePath = "Thread.Locking.exe";
    for (var i = 0; i < 1000; i++) {
        var process = new Process() { StartInfo = { FileName = executablePath } };
        process.Start();
        process.WaitForExit();
        if (process.ExitCode == 1) {
            Assert.Fail();
        }
    }
}
When I ran the unit test, it seemed that the Parallel.For execution in Program.cs threw strange exceptions at times, so I had to change that to traditional for-loops:
public class Program
{
    private const int NumOfObjects = 100;
    private const int NumOfThreads = 10000;

    public static void Main(string[] args)
    {
        var objects = new List<Object>();
        for (var i = 0; i < NumOfObjects; i++) {
            objects.Add(new object());
        }
        var tasks = new Task[NumOfThreads];
        var objectStack = new ObjectStack(objects);
        for (var i = 0; i < NumOfThreads; i++)
        {
            var task = new Task(objectStack.ThreadUnsafeMultiPop);
            tasks[i] = task;
        }
        for (var i = 0; i < NumOfThreads; i++)
        {
            tasks[i].Start();
        }
        Task.WaitAll(tasks); // without this, Main can return (exit code 0) before the tasks run

        //Using this seems to throw exceptions from unit test context
        //Parallel.For(0, NumOfThreads, x => objectStack.ThreadUnsafeMultiPop());
    }
}
Of course, the unit test is quite dependent on the machine you're running it on (a fast processor may be able to empty the stack and exit safely before reaching the critical section in all cases).
1.) You could inject context switches into your IL in a post-build step, in the form of Thread.Sleep(0) calls emitted with ILGenerator, which would make these issues much more likely to arise.
2.) I would recommend you take a look at the CHESS project by Microsoft research team.

How to acquire multiple locks in VS2012 without messing up indentation

This looks like a silly question, but I'm not able to find a solution to this.
My problem is that C# doesn't allow for the acquisition of multiple locks in a single lock statement. This won't work:
lock (a, b, c, d)
{
    // ...
}
Instead, it seems to require an insane amount of indentation in order to do this:
lock (a)
    lock (b)
        lock (c)
            lock (d)
            {
                // ...
            }
Coupled with all other indentation levels that the code is already in (namespaces, class, method, conditionals, loops, ...), this gets insane. So instead, I want to use this formatting:
lock (a) lock (b) lock (c) lock (d)
{
    // ...
}
and preserve my sanity. But Visual Studio (I'm using 2012) won't hear of it. As soon as I enter any closing brace, the above is transformed to something silly, like:
lock (a) lock (b) lock (c) lock (d)
            {
                // ...
            }
And there seems there's nothing I can do. Is there any way to make this work?
Just an idea :-)
static class LockAndExecute
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    static void _gen(Action a, object[] objs, int i = 0)
    {
        bool lockWasTaken = false;
        var temp = objs[i];
        try {
            Monitor.Enter(temp, ref lockWasTaken);
            if (i + 1 >= objs.Length)
                a();
            else
                _gen(a, objs, i + 1);
        }
        finally {
            if (lockWasTaken)
                Monitor.Exit(temp);
        }
    }

    public static void Do(object[] objectsToLock, Action action)
    {
        _gen(action, objectsToLock);
    }
}
and the usage:
LockAndExecute.Do(new[] { a, b }, () => {
    Console.WriteLine("Eww!");
});
Using that many locks at a time is just asking for deadlock. Heck, even acquiring two different locks at a time runs that risk.
At the very least, you should be very very careful to only ever take these locks in exactly the same order everywhere that more than one is acquired at a time.
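That advice can be sketched as a helper that sorts the lock objects into one global order before acquiring them. The identity hash used as the ordering key here is an assumption for illustration (real code would attach an explicit sequence number to each lockable object, since hash codes can collide):

```csharp
using System;
using System.Linq;
using System.Runtime.CompilerServices;

static class OrderedLock
{
    // Acquire every lock in a stable global order so that two callers
    // locking overlapping sets of objects can never deadlock each other.
    public static void Do(object[] locks, Action action)
    {
        var ordered = locks.OrderBy(RuntimeHelpers.GetHashCode).ToArray();
        Acquire(ordered, 0, action);
    }

    static void Acquire(object[] locks, int i, Action action)
    {
        if (i == locks.Length)
        {
            action();
            return;
        }
        lock (locks[i])
        {
            Acquire(locks, i + 1, action);
        }
    }
}
```

Usage mirrors the LockAndExecute answer below: OrderedLock.Do(new[] { a, b, c, d }, () => { /* ... */ });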
Also, "nice formatting" is in the eye of the beholder. That is, everyone's got their own idea of what's best. But, the following should work, without VS messing with it unless you specifically ask it to (e.g. by triggering an auto-format rule or explicitly auto-formatting):
lock (a)
lock (b)
lock (c)
lock (d)
{
}
You can also use this approach with using statements (where it's much more common to have more than one in a row), where the VS IDE already anticipates it.
You could work around the IDE's annoying behavior by changing your code, though the idea of changing your code to work around IDE behavior pains my conscience a little. I'd do it if it was a toy project but not on anything serious that another developer might work on.
Implement the lock with an IDisposable implementation. The using statement does not have the annoying indentation issue that the lock statements do.
class myLock : IDisposable
{
    private object _obj;

    public myLock(object obj)
    {
        _obj = obj;
        System.Threading.Monitor.Enter(obj);
    }

    public void Dispose()
    {
        System.Threading.Monitor.Exit(_obj);
        _obj = null;
    }

    public static void example()
    {
        var obj1 = new object();
        var obj2 = new object();
        var obj3 = new object();

        lock (obj1)
        lock (obj2)
        lock (obj3)
        {
            // Stupid indentation >:(
        }

        using (new myLock(obj1))
        using (new myLock(obj2))
        using (new myLock(obj3))
        {
            // Ahhhh... :-)
        }
    }
}

EmptyEnumerable<T>.Instance assignment and multi-threading design

This is more of a design question I guess than an actual bug or a rant. I wonder what people think about the following behavior:
In .NET, when you want to represent an empty IEnumerable efficiently you can use Enumerable.Empty<MyType>(), this will cache the empty enumerable instance. It's a nice and free micro-optimization I guess that could help if relied upon heavily.
However, this is how the implementation looks:
public static IEnumerable<TResult> Empty<TResult>() {
    return EmptyEnumerable<TResult>.Instance;
}

internal class EmptyEnumerable<TElement>
{
    static volatile TElement[] instance;

    public static IEnumerable<TElement> Instance {
        get {
            if (instance == null) instance = new TElement[0];
            return instance;
        }
    }
}
I would expect the assignment to happen within a lock, after another null check, but that's not what happens.
I wonder if this is a conscious decision (i.e. we don't care of potentially creating several objects we will just throw away immediately if this is accessed concurrently, because we would rather avoid locking) or just ignorance?
What would you do?
This is safe because volatile sequences all reads and writes to that field. Before the read in return instance; there is always at least one write setting that field to a valid value.
It is unclear what value is going to be returned because multiple arrays can potentially be created here. But there will always be a non-null array.
Why did they do it? Well, a lock has more overhead than volatile and the implementation is easy enough to pull off. Those extra instances will only be created a few times if multiple threads happen to race to this method. For each thread racing at most one instance will be created. After initialization is complete there is zero garbage.
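For contrast, a sketch of an alternative lock-free publication pattern (not what the BCL code above does): Interlocked.CompareExchange still lets a losing thread create a throwaway array, but guarantees every caller observes the same published instance:

```csharp
using System.Threading;

internal static class EmptyArray<TElement>
{
    private static TElement[] _instance;

    public static TElement[] Instance
    {
        get
        {
            if (_instance == null)
            {
                // Publish our candidate only if the field is still null;
                // if another thread won the race, our array becomes garbage
                // and we fall through to return the winner's array.
                Interlocked.CompareExchange(ref _instance, new TElement[0], null);
            }
            return _instance;
        }
    }
}
```

Unlike the volatile version, two concurrent callers can never walk away holding different arrays.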
Note that without volatile the instance field could appear to flip back to null after having been assigned. That is very counter-intuitive. Without any synchronization the compiler would be allowed to rewrite the code like this:
var instanceRead1 = instance;
TElement[] returnValue = null;
if (instanceRead1 == null) {
    returnValue = new TElement[0];
    instance = returnValue;
}
var instanceRead2 = instance;
if (instanceRead2 == returnValue) return instanceRead2;
else return null;
In the presence of concurrent writes instanceRead2 can be a different value than was just written. No compiler would do such a rewrite but it is legal. The CPU might do something like that on some architectures. Unlikely, but legal. Maybe there is a more plausible rewrite.
In that code there is the possibility of creating more than one array. It's possible to either have a thread create an array and then end up actually using the one created from another thread, or for two different threads to each end up with their own array. However, that just doesn't matter here. The code will work correctly whether multiple objects are created or not. As long as an array is returned it doesn't matter which array is returned by any call ever. Additionally the "expense" of creating an empty array is simply not very high. The decision was made (likely after a fair bit of testing) that the expense of synchronizing access to the field every time the field is accessed ever was greater than the very unlikely possibility that a couple of additional empty arrays were created.
This is not a pattern that you should emulate in your own (quasi) singletons unless you are also in the position in which creating a new instance is cheap, and creating multiple instances doesn't affect the functionality of the code. In effect the only situation in which this works is when you're trying to cache the value of a cheaply computed operation. That's a micro optimization; it's not wrong, but it's also not a big win either.
Although running benchmarks on such small code does not really yield the most trustworthy results, here are a few options compared (very bluntly, though):
The current implementation with volatile instance and null check without lock.
A lock on static object syncRoot.
Static type initializer.
A lock on typeof(T) (so that there is no static type initializer created).
Results (seconds for 1 billion iterations):
volatile with null check: 21.7
lock on syncRoot: 28.8
static type initializer: 20.3
lock on typeof(T): 29.3
As you can see, the lock approach is by far the worst. The best is the static type initializer, which would also make the code cleaner. The actual reason is probably not the lock itself but rather the size of the getter and things like code inlining and the compiler having more options to optimize the code.
For comparison, creating 1 million (not billion this time) empty arrays takes 26 ms on the same machine.
The code:
using System;

namespace ConsoleSandbox
{
    class T1<T>
    {
        static volatile T[] _instance;

        public static T[] Instance
        {
            get
            {
                if (_instance == null) _instance = new T[0];
                return _instance;
            }
        }
    }

    class T2<T>
    {
        static T[] _instance;
        static object _syncRoot = new object();

        public static T[] Instance
        {
            get
            {
                if (_instance == null)
                    lock (_syncRoot)
                        if (_instance == null)
                            _instance = new T[0];
                return _instance;
            }
        }
    }

    class T3<T>
    {
        static T[] _instance = new T[0];

        public static T[] Instance
        {
            get
            {
                return _instance;
            }
        }
    }

    class T4<T>
    {
        static T[] _instance;

        public static T[] Instance
        {
            get
            {
                if (_instance == null)
                    lock (typeof(T4<T>))
                        if (_instance == null)
                            _instance = new T[0];
                return _instance;
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            int[][] res = new int[2][];
            var sw = new System.Diagnostics.Stopwatch();

            sw.Start();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T1<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T2<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T3<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T4<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000; i++)
                res[i % 2] = new int[0];
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            Console.WriteLine(res[0]);
            Console.WriteLine(res[1]);
        }
    }
}

How to unit test Thread Safe Generic List in C# using NUnit?

I asked a question about building a custom Thread Safe Generic List, and now I am trying to unit test it, and I absolutely have no idea how to do that. Since the lock happens inside the ThreadSafeList class, I am not sure how to make the list lock for a period of time while I try to mimic multiple Add calls. Thanks.
Can_add_one_item_at_a_time
[Test]
public void Can_add_one_item_at_a_time() //this test won't pass
{
    //I am not sure how to do this test...
    var list = new ThreadSafeList<string>();

    //somehow need to call lock and sleep inside the list instance
    //say the list somehow locks for 1 sec
    var ta = new Thread(x => list.Add("a"));
    ta.Start(); //does it need to abort, say, before 1 sec if locked?

    var tb = new Thread(x => list.Add("b"));
    tb.Start(); //does it need to abort, say, before 1 sec if locked?

    //it involves using GetSnapshot(),
    //which is a bad idea for unit testing I think
    var snapshot = list.GetSnapshot();
    Assert.IsFalse(snapshot.Contains("a"), "Should not contain a.");
    Assert.IsFalse(snapshot.Contains("b"), "Should not contain b.");
}
Snapshot_should_be_point_of_time_only
[Test]
public void Snapshot_should_be_point_of_time_only()
{
    var list = new ThreadSafeList<string>();
    var ta = new Thread(x => list.Add("a"));
    ta.Start();
    ta.Join();

    var snapshot = list.GetSnapshot();

    var tb = new Thread(x => list.Add("b"));
    tb.Start();
    var tc = new Thread(x => list.Add("c"));
    tc.Start();
    tb.Join();
    tc.Join();

    Assert.IsTrue(snapshot.Count == 1, "Snapshot should only contain 1 item.");
    Assert.IsFalse(snapshot.Contains("b"), "Should not contain b.");
    Assert.IsFalse(snapshot.Contains("c"), "Should not contain c.");
}
Instance method
public ThreadSafeList<T> Instance<T>()
{
    return new ThreadSafeList<T>();
}
Let's look at your first test, Can_add_one_item_at_a_time.
First of all, your exit conditions don't make sense. Both items should be added, just one at a time. So of course your test will fail.
You also don't need to make a snapshot; remember, this is a test, nothing else is going to be touching the list while your test is running.
Last but not least, you need to make sure that you aren't trying to evaluate your exit conditions until all of the threads have actually finished. Simplest way is to use a counter and a wait event. Here's an example:
[Test]
public void Can_add_from_multiple_threads()
{
    const int MaxWorkers = 10;
    var list = new ThreadSafeList<int>(MaxWorkers);
    int remainingWorkers = MaxWorkers;
    var workCompletedEvent = new ManualResetEvent(false);

    for (int i = 0; i < MaxWorkers; i++)
    {
        int workerNum = i; // Make a copy of local variable for next thread
        ThreadPool.QueueUserWorkItem(s =>
        {
            list.Add(workerNum);
            if (Interlocked.Decrement(ref remainingWorkers) == 0)
                workCompletedEvent.Set();
        });
    }

    workCompletedEvent.WaitOne();
    workCompletedEvent.Close();

    for (int i = 0; i < MaxWorkers; i++)
    {
        Assert.IsTrue(list.Contains(i), "Element was not added");
    }

    Assert.AreEqual(MaxWorkers, list.Count,
        "List count does not match worker count.");
}
Now this does carry the possibility that the Add happens so quickly that no two threads will ever attempt to do it at the same time. No Refunds No Returns partially explained how to insert a conditional delay. I would actually define a special testing flag, instead of DEBUG. In your build configuration, add a flag called TEST, then add this to your ThreadSafeList class:
public class ThreadSafeList<T>
{
    // snip fields

    public void Add(T item)
    {
        lock (sync)
        {
            TestUtil.WaitStandardThreadDelay();
            innerList.Add(item);
        }
    }

    // snip other methods/properties
}

static class TestUtil
{
    [Conditional("TEST")]
    public static void WaitStandardThreadDelay()
    {
        Thread.Sleep(1000);
    }
}
This will cause the Add method to wait 1 second before actually adding the item as long as the build configuration defines the TEST flag. The entire test should take at least 10 seconds; if it finishes any faster than that, something's wrong.
With that in mind, I'll leave the second test up to you. It's similar.
You will need to insert some TEST-ONLY code that adds a delay inside your lock. You can create a function like this:
[Conditional("DEBUG")]
void SleepForABit(int delay) { Thread.Sleep(delay); }
and then call it in your class. The Conditional attribute ensures it is only called in DEBUG builds, and you can leave it in your compiled code.
Write something which consistently delays 100 ms or so and something that never waits, and let 'em slug it out.
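A sketch of that slug-it-out idea, using a plain lock-guarded list as a stand-in for the ThreadSafeList under test (the 100 ms delay and the element counts are arbitrary choices): one writer holds the lock a long time per add, the other adds as fast as it can, and at the end no adds may have been lost:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class SlugItOut
{
    static readonly object Sync = new object();
    static readonly List<int> Items = new List<int>();

    // Holds the lock for ~100 ms per add, starving the fast writer.
    static void SlowAdd(int item)
    {
        lock (Sync)
        {
            Thread.Sleep(100);
            Items.Add(item);
        }
    }

    // Adds with no artificial delay.
    static void FastAdd(int item)
    {
        lock (Sync)
        {
            Items.Add(item);
        }
    }

    static void Main()
    {
        var slow = new Thread(() => { for (int i = 0; i < 3; i++) SlowAdd(i); });
        var fast = new Thread(() => { for (int i = 3; i < 100; i++) FastAdd(i); });
        slow.Start(); fast.Start();
        slow.Join(); fast.Join();

        // Every one of the 100 adds must be present: 3 slow + 97 fast.
        Console.WriteLine(Items.Count); // prints 100
    }
}
```

If the lock were removed from FastAdd, the fast writer could interleave with the slow writer's List<T>.Add and corrupt the list, which is exactly the failure this kind of test is fishing for.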
You might want to take a look at Chess. It's a program specifically designed to find race conditions in multi-threaded code.
