For those of us who work in the multi-threaded environment, the concept of mutex (“mutual exclusion”) should be no stranger. There are two common categories of mutex – simple and recursive.
Simple mutex can only be locked once, and if the same thread tries to lock the same mutex again, it will deadlock.
Recursive mutex can be locked by the same thread numerous times, and the same thread must unlock the same number of times before the critical region can be accessed by another thread.
Recursive mutex is common. In fact, mutexes are recursive by default in Windows.
With all that said, I believe recursive mutex should be avoid as much as possible.
Since recursive mutex tracks more information than a simple mutex, it should be no surprise that it is more expensive.
Here is a sample test case with multiple threads accessing a critical region. The mutex implementation is from Boost.Thread library.
A single recursive mutex is quite a bit more expensive than a simple mutex. If recursive mutex is used, it is probably going to be locked more than once by the same thread (otherwise, why would you use it?). The graph also demonstrates the cost of recursively locking the mutex again, up to five times. As you can see, it is linear, and it is definitely not free.
The term “recursive mutex” might sound fancy, but they are nothing but redundant locks. After the critical region is locked, each additional recursive mutex adds nothing but overhead.
Some might argue that this cost is negligible. This might be true depending on the application, but this is just the surface of the problem.
The Real Problem
Recursive mutex makes locking easy by taking away the programmer’s responsibility of tracking mutexes. Programmers can add recursive mutex to existing functions and member functions easily without refactoring code. This would not have been possible with simple mutex. However, this advantage is merely an illusion.
Don’t believe it? Here’s the thought of David Butenhof, the person who introduced recursive mutex in POSIX thread.
Here’s some of his powerful arguments:
1. Recursive mutex allows you to lose track of your locking scheme and scope. As your call stack becomes deeper, you will have less sense on how long you’ve been locking the critical region.
2. Recursive mutex are reference counted, and releasing it does not imply that the critical region is freed. So how would you know how to unlock?
I’ve learned a lot from his post. His argument addresses the fundamental design issue in software that use recursive mutex.
To avoid using recursive mutex, we have to consider refactoring.
In functional programming, a typical refactoring technique to break up recursive locking is by splitting a function into two components– outer and inner. The outer function holds the lock, and invoke the inner functions to perform its task. The inner functions is an internal function that is not exposed as an API.
In Object Oriented programming, this technique can also be applied in form of Pimpl idiom or private functions.
Consider this example:
Class A has two public functions – One and Two. One calls Two. Both methods are thread-safe, they both lock the same recursive mutex, which is a private member of Class A. Below demonstrates that the functional programming refactoring technique can be applied in OO programming.
Refactoring – A More Complicated Scenario
When more than one classes are involved, things gets more complicated. The technique above will not hold.
Here’s a more complicated (yet realistic) example:
Class A has two public functions – One and Three. Both functions are thread-safe, and lock the same recursive mutex, which is a private member variable of A. Class B has one public function – Two.
A::One calls B::Two, and B::Two calls A::Three.
Now we have a mess. The possible call paths grows exponentially as the call stack gets deeper.
In the example, there are three possible paths.
1. A::One, B::Two, A::Three
2. B::Two, A::Three
If we apply the previous technique, we would refactor A::Three into A::Three and A::Three_Impl. That won’t work here because class B expects A::Three_Impl to be a public interface.
There is no fix-all solution here, but here’s some suggestion.
1. Refactor A and B such that they do not circular reference each other. In other word, A can reference B, or B can reference A, but not at the same time.
2. Merge A and B in one class, or move B::Two into class A.
Use Recursive Mutex If It Is Absolutely Positively Necessary
Unfortunately, in the real world, things get even more complicated. The complicated example above only involves two classes. But what if more classes are involved, the number of call paths grows exponentially, and becomes unrealistic to refactor?
As a discipline of computer science, I would argue that for every solution with recursive mutex, there must be a solution for simple mutex.
As a software engineer who is under the gun with a deadline… I say just hack it and use recursive mutex. We are shipping tomorrow! :)
But as David Butenhof said, recursive mutex is a hack. Instead of relying on recursive mutex up front, we should avoid it until it is absolutely necessary.
Recursive mutex is dangerous because you lose sense of locking scope. It costs more than a simple mutex.
Consider to refactor the code to avoid the use of recursive mutex.
Use recursive mutex if it can’t be refactored economically.
The source and the spreadsheet can be downloaded here.
Tools: Visual Studio 2008, Boost 1.39, Window XP SP3 32bit