Category: concurrency

Lock-free thread safe initialisation in C#

I used to do quite a lot of concurrent programming early in my career. Recently I’ve been working again on optimising a large piece of code to behave better when ran in concurrent environment and I must share a secret with you – I had a great fun. It all started with a semi-serious challenge from a friend, and ended up consuming big part of my Christmas-new year break.

One of the problems I found, was that a group of objects that was being used on multiple threads had to be initialised in a thread safe manner – that is the initialisation code had to be ran just once, atomically, on a single thread. However afterwards it could be safely executed concurrently on multiple threads. This was being accomplished by using code similar to following:

public void Foo GetInitializedFoo()
{
   if(this.fooIsInitialized) return foo;
   lock(lockObject)
   {
      if(this.fooIsInitialized) return foo;
      
      InitializeFoo();
      return foo;
   }
}

As you can see the code usually works lock-free, and only the first time when foo is not initialised it has to lock to perform the initialisation algorithm. The reason that was suboptimal is that in heavily multi-threaded scenarios it yielded suboptimal performance when multiple threads were trying to initialise foo.

Imagine there are four threads. First of them succeeds in obtaining the lock, and goes ahead to perform the initialisation, while the other two go into the waiting queue and sleep there waiting for the lock to be freed. The first thread finishes the initialisation pretty soon, but by the time the first thread wakes up already quite a lot of time is wasted. Then the second thread obtains the lock, but the remaining two, even though the variable is already initialised and therefore safe to be read concurrently, still wait in the lock queue. The three remaining threads could execute all at once now, yet they will be woken up and executed each in turn.

Mad scientist solution

The problem was that using lock proved to be suboptimal for the problem at hand. To try to address this (and to make my inner geek happy) I decided to look at making this code perform better.

While .NET framework library has lots of synchronisation primitives none of them really seemed to be any better suited for the job then good old Monitor. I didn’t want to put waiting threads to sleep immediately, as this is quite an expensive operation (and so is waking them up back again), and the initialisation would usually be performed quite quickly. I also wanted all the waiting threads to be able to execute concurrently with no synchronisation after the first one of them finished initialisation. To accomplish this I came up with the following solution:

public sealed class ThreadSafeInit
{
	// We are free to use negative values here, as ManagedTreadId is guaranteed to be always > 0
	// http://www.netframeworkdev.com/net-base-class-library/threadmanagedthreadid-18626.shtml
	// the ids can be recycled as mentioned, but that does not affect us since at any given point in time
	// there can be no two threads with the same managed id, and that's all we care about
	private const int Initialized = int.MinValue + 1;
	private const int NotInitialized = int.MinValue;
	private int state = NotInitialized;

	public void EndThreadSafeOnceSection()
	{
		if (state == Initialized)
		{
			return;
		}
		if (state == Thread.CurrentThread.ManagedThreadId)
		{
			state = Initialized;
		}
	}

	public bool ExecuteThreadSafeOnce()
	{
		if (state == Initialized)
		{
			return false;
		}
		var inProgressByThisThread = Thread.CurrentThread.ManagedThreadId;
		var preexistingState = Interlocked.CompareExchange(ref state, inProgressByThisThread, NotInitialized);
		if (preexistingState == NotInitialized)
		{
			return true;
		}
		if (preexistingState == Initialized || preexistingState == inProgressByThisThread)
		{
			return false;
		}

		var spinWait = new SpinWait();
		while (state != Initialized)
		{
			spinWait.SpinOnce();
		}
		return false;
	}
}

The ThreadSafeInit class manages lock-free access to part of codebase. It has a state int flag which starts off in a non-initialised state. Then as ExecuteThreadSafeOnce method, which marks the beginning of the initialisation section is being called, it uses Interlocked.CompareExchange to set the flag in a lock-free thread safe manned to the id of current thread. The thread that succeeds returns true and is free to go ahead and do the initialisation. Any other thread waits for the first one to finish, but instead of going immediately to sleep it spins for a while, effectively wasting cycles, and then once the first thread finishes, and calls EndThreadSafeOnceSection method, all the other threads can proceed in parallel.

Usage pattern looks something like this:

public void Foo GetInitializedFoo()
{
   if(this.fooIsInitialized) return foo;
   var initialising = false;
   try
   {
      initializing = init.ExecuteThreadSafeOnce();
      if(this.fooIsInitialized) return foo;
      
      InitializeFoo();
      return foo;
   }
   finally
   {
      if(initialising)
      {
         init.EndThreadSafeOnceSection();
      }
   }
}

Using this code I was able to measure over two-fold improvement over the old solution in this part of code. Sure, it’s a micro-optimisation and in 99.9% of cases lock would be just fine, but as I said, it was a challenge, and I had a blast working on this, and that’s all I care about. Well, not all perhaps. If you see any flaws in this code, ways to improve it, or I just replicated something that someone else did much better long time ago, let me know.