Ostep-Concurrency-Problems
Common Concurrency Problems
A study by Lu et al. analyzed concurrency bugs in popular applications like MySQL, Apache, Mozilla, and OpenOffice and found common categories of concurrency problems.
Non-deadlock Bugs
Most of these came down to two forms:
Atomicity-Violation Bugs
Taken from MySQL and simplified:
// thread 1
if (thd->proc_info) {
    ...
    fputs(thd->proc_info, ...);
    ...
}

// thread 2
thd->proc_info = NULL;
This one should be pretty obvious given the chapter up to this point: thread 2 can set proc_info to NULL after thread 1's check but before the fputs, causing a NULL-pointer dereference. We need locks around both critical sections:
pthread_mutex_t proc_info_lock = PTHREAD_MUTEX_INITIALIZER;

// thread 1
pthread_mutex_lock(&proc_info_lock);
if (thd->proc_info) {
    ...
    fputs(thd->proc_info, ...);
    ...
}
pthread_mutex_unlock(&proc_info_lock);

// thread 2
pthread_mutex_lock(&proc_info_lock);
thd->proc_info = NULL;
pthread_mutex_unlock(&proc_info_lock);
Order-Violation Bugs
// thread 1
void init() {
    ...
    mThread = PR_CreateThread(mMain, ...);
    ...
}

// thread 2
void mMain(..) {
    ...
    mState = mThread->State;
    ...
}
Not quite as obvious as the first, perhaps, but the issue is that the code assumes a certain ordering: that mThread gets assigned in thread 1 before it is referenced in thread 2. However, it is possible that thread 2 starts executing immediately, before the assignment of the return value to mThread has run. In that case, mThread might be NULL or reference old memory.
The solution, of course, is condition variables:
pthread_mutex_t mtLock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t mtCond = PTHREAD_COND_INITIALIZER;
int mtInit = 0;

// thread 1
void init() {
    ...
    mThread = PR_CreateThread(mMain, ...);

    // signal that the thread is created and mThread is ready for referencing
    pthread_mutex_lock(&mtLock);
    mtInit = 1;
    pthread_cond_signal(&mtCond);
    pthread_mutex_unlock(&mtLock);
    ...
}

// thread 2
void mMain(..) {
    ...
    // wait for the thread to be initialized
    pthread_mutex_lock(&mtLock);
    while (mtInit == 0) {
        pthread_cond_wait(&mtCond, &mtLock);
    }
    pthread_mutex_unlock(&mtLock);

    // ready to go!
    mState = mThread->State;
    ...
}
Deadlock Bugs
Deadlocks occur when one thread holds a lock L1 and wants a lock L2, while another thread holds L2 but wants L1. Both threads get stuck indefinitely. It's an easy problem to state but generally a hard one to solve.
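For example, a minimal sketch of that situation, using two pthread mutexes named L1 and L2 (the names match the description above; everything else is illustrative):

// thread 1
pthread_mutex_lock(&L1);
pthread_mutex_lock(&L2);

// thread 2
pthread_mutex_lock(&L2);
pthread_mutex_lock(&L1);

If thread 1 grabs L1 and thread 2 grabs L2 before either acquires its second lock, each waits forever for a lock the other holds.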
Conditions necessary for deadlock:
- Mutual exclusion: threads claim exclusive resource control (locks)
- Hold-and-wait: threads hold resources (locks) while waiting for others
- No preemption: resources (locks) cannot be forcibly removed from holders
- Circular wait: there exists a chain of threads such that each one holds resources (locks) requested by the next thread in the chain. (Think cycle in directed graph.)
It stands to reason, then, that preventing any one of these conditions is enough to prevent deadlock.
Circular wait
A simple and practical way to prevent circular wait is to enforce a total ordering on lock acquisition. Given the description above, if L1 is always acquired before L2, no deadlock can occur. In more complex systems with many locks, a partial ordering can be useful as well (see the source code for memory mapping in Linux for an example).
Tip: You can use the address of the lock to enforce a total ordering.
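As a rough sketch of that tip (the function name and its arguments are just for illustration), a helper that needs two locks can always take the lower-addressed one first, so every caller agrees on the same order no matter how the arguments are passed:

void do_both_locked(pthread_mutex_t *m1, pthread_mutex_t *m2) {
    // Always acquire the lock with the lower address first, so two calls
    // with the arguments swapped still take the locks in the same order.
    if (m1 > m2) {
        pthread_mutex_t *tmp = m1;
        m1 = m2;
        m2 = tmp;
    }
    pthread_mutex_lock(m1);
    pthread_mutex_lock(m2);
    // ... do work that needs both locks ...
    pthread_mutex_unlock(m2);
    pthread_mutex_unlock(m1);
}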
Hold-and-wait
You can avoid hold-and-wait by using one of the pthread_mutex_trylock-style variants, which don't block when the lock isn't available. However, this likely involves looping around to try to obtain the locks again, and it can result in a livelock where two threads constantly loop, each trying and failing to obtain all the locks it needs.
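A minimal sketch of that trylock pattern, again with two illustrative locks L1 and L2: grab the first lock, and if the second can't be acquired immediately, release the first and start over.

for (;;) {
    pthread_mutex_lock(&L1);
    if (pthread_mutex_trylock(&L2) == 0) {
        break;                  // got both locks; done
    }
    // couldn't get L2: release L1 so we aren't holding-and-waiting,
    // then try the whole sequence again (this retry loop is where
    // livelock can show up)
    pthread_mutex_unlock(&L1);
}
// ... critical section using both locks ...
pthread_mutex_unlock(&L2);
pthread_mutex_unlock(&L1);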
Mutual exclusion
Lastly, there are lock-free and wait-free approaches to building data structures. For example, instead of using a lock to accomplish an atomic increment, imagine something like:
void atomic_increment(int *value, int amount) {
    int old;    // declared outside the loop so it is in scope in the while condition
    do {
        // get the current value
        old = *value;
        // if the compare-and-swap succeeds, great, exit;
        // otherwise the value changed underneath us, so read it and try again
    } while (compare_and_swap_instr(value, old, old + amount) == 0);
}
which will just repeatedly try to atomically update the value until it succeeds. No deadlocks can occur here! (But a livelock is still possible.)
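The compare_and_swap_instr above is a stand-in for a hardware compare-and-swap; assuming a GCC or Clang toolchain, the __sync_bool_compare_and_swap builtin gives a compilable version of the same idea:

#include <pthread.h>
#include <stdio.h>

int counter = 0;

// Same retry-until-CAS-succeeds idea as above, using a real compiler builtin.
void atomic_increment(int *value, int amount) {
    int old;
    do {
        old = *value;   // snapshot the current value
    } while (!__sync_bool_compare_and_swap(value, old, old + amount));
}

void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++)
        atomic_increment(&counter, 1);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d\n", counter);  // should print 200000
    return 0;
}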
Avoidance strategies
Scheduling
In a multi-core setting, one could also imagine a smart scheduler that knows which threads could deadlock if run concurrently and instead makes sure they run sequentially on a single CPU.
Detect and recover
A perhaps less elegant but pragmatic solution is to allow deadlocks to occur but have a detection mechanism that can recover. For example, Postgres and other database management systems have a deadlock detector that runs periodically: it builds a resource graph and checks for cycles, and if it finds one, it can abort an offending transaction to break the cycle.
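As a rough illustration of the detection step (a sketch only, not how Postgres actually implements it; the waits_for encoding and MAX_THREADS bound are made up for the example), a depth-first search over a waits-for graph that reports whether a cycle exists:

#include <stdbool.h>

#define MAX_THREADS 8

// waits_for[i][j] is true if thread i is waiting for a lock held by thread j.
bool waits_for[MAX_THREADS][MAX_THREADS];

// Depth-first search: returns true if a cycle (i.e., a deadlock) is reachable
// from thread t. on_path[] marks the threads on the current DFS path.
static bool dfs(int t, bool on_path[], bool visited[]) {
    if (on_path[t])
        return true;    // came back to a thread on the current path: cycle
    if (visited[t])
        return false;   // already fully explored, no cycle through here
    visited[t] = on_path[t] = true;
    for (int u = 0; u < MAX_THREADS; u++)
        if (waits_for[t][u] && dfs(u, on_path, visited))
            return true;
    on_path[t] = false;
    return false;
}

// Returns true if any cycle exists in the waits-for graph.
bool deadlock_detected(void) {
    bool visited[MAX_THREADS] = { false };
    for (int t = 0; t < MAX_THREADS; t++) {
        bool on_path[MAX_THREADS] = { false };
        if (!visited[t] && dfs(t, on_path, visited))
            return true;
    }
    return false;
}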