Ostep-Concurrency-Problems

Common Concurrency Problems

A study by Lu et al. analyzed popular concurrent applications like MySQL, Apache, Mozilla, and OpenOffice and found common categories of concurrency problems.

Non-deadlock Bugs

Most non-deadlock bugs came down to two forms:

Atomicity-Violation Bugs

Taken from MySQL and simplified:

// thread 1
if (thd->proc_info) {
	...
	fputs(thd->proc_info, ...);
	...
}

// thread 2
thd->proc_info = NULL;

This one should be pretty obvious given the chapter up to this point: thread 1 assumes the check of proc_info and the subsequent fputs happen atomically, but thread 2 can set proc_info to NULL in between, and fputs will dereference NULL. We need locks around both critical sections:

pthread_mutex_t proc_info_lock = PTHREAD_MUTEX_INITIALIZER;

// thread 1
pthread_mutex_lock(&proc_info_lock);
if (thd->proc_info) {
	...
	fputs(thd->proc_info, ...);
	...
}
pthread_mutex_unlock(&proc_info_lock);

// thread 2
pthread_mutex_lock(&proc_info_lock);
thd->proc_info = NULL;
pthread_mutex_unlock(&proc_info_lock);

Order-Violation Bugs

// thread 1
void init() {
	...
	mThread = PR_CreateThread(mMain, ...);
	...
}

// thread 2
void mMain(...) {
	...
	mState = mThread->State;
	...
}

Not quite as obvious as the first, perhaps, but the issue is that the code assumes a certain ordering: that mThread gets assigned in thread 1 before it is referenced in thread 2. However, it is possible that thread 2 starts executing immediately, before the instruction that assigns the return value to mThread runs. In that case, mThread might be NULL or referencing old memory.
The solution, of course, is condition variables:

pthread_mutex_t mtLock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t mtCond = PTHREAD_COND_INITIALIZER;
int mtInit = 0;

// thread 1
void init() {
	...
	mThread = PR_CreateThread(mMain, ...);
	// signal that thread is created and mThread is ready for referencing
	pthread_mutex_lock(&mtLock);
	mtInit = 1;
	pthread_cond_signal(&mtCond);
	pthread_mutex_unlock(&mtLock);
	...
}

// thread 2
void mMain(...) {
	...
	// wait for thread to be initialized
	pthread_mutex_lock(&mtLock);
	while (mtInit == 0) {
		pthread_cond_wait(&mtCond, &mtLock);
	}
	pthread_mutex_unlock(&mtLock);
	// ready to go!
	mState = mThread->State;
	...
}

Deadlock Bugs

Deadlocks occur when one thread holds a lock L1 and wants a lock L2 and another thread holds L2 but wants L1. Both threads get stuck indefinitely. It's an easy problem to state but generally can be a hard one to solve.
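To make this concrete, here's a minimal sketch of that situation with two pthread mutexes (L1 and L2 are just placeholder names): if the threads interleave so that each grabs its first lock before either grabs its second, both block forever.

pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER;

// thread 1
pthread_mutex_lock(&L1);
pthread_mutex_lock(&L2);

// thread 2
pthread_mutex_lock(&L2);
pthread_mutex_lock(&L1);
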
Conditions necessary for deadlock:

  1. Mutual exclusion: threads claim exclusive resource control (locks)
  2. Hold-and-wait: threads hold resources (locks) while waiting for others
  3. No preemption: resources (locks) cannot be forcibly removed from holders
  4. Circular wait: there exists a chain of threads such that each one holds resources (locks) requested by the next thread in the chain. (Think cycle in directed graph.)

It stands to reason, then, that preventing any one of these conditions is enough to prevent deadlock.

Circular wait

A simple and practical way to prevent circular wait is to enforce a total ordering on lock acquisition. Given the description above, if L1 is always requested before L2, no deadlock can occur. In more complex systems with many locks, a partial ordering can be useful as well (see the memory-mapping code in Linux for an example).
Tip: You can use the address of the lock to enforce a total ordering.
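For example, a function that must take two locks can compare their addresses and always acquire the lower one first; the do_something name and its arguments here are made up for illustration (and it assumes m1 != m2):

#include <pthread.h>

void do_something(pthread_mutex_t *m1, pthread_mutex_t *m2) {
	// acquire in address order so every caller agrees on the same total order
	if (m1 < m2) {
		pthread_mutex_lock(m1);
		pthread_mutex_lock(m2);
	} else {
		pthread_mutex_lock(m2);
		pthread_mutex_lock(m1);
	}
	// ... do work while holding both locks ...
	pthread_mutex_unlock(m2);
	pthread_mutex_unlock(m1);
}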

Hold-and-wait

You can avoid hold-and-wait by using the pthread_mutex_trylock style variants, which return immediately instead of blocking when the lock isn't available: if you fail to grab one lock, release whatever you already hold and start over. However, this looping around to re-acquire multiple locks can result in a livelock, where two threads constantly loop, each trying and failing to obtain all the locks it needs.
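A sketch of that pattern with two locks, assuming L1 and L2 are pthread mutexes declared elsewhere (the goto-based retry is just one way to write the loop):

top:
	pthread_mutex_lock(&L1);
	if (pthread_mutex_trylock(&L2) != 0) {
		// couldn't get L2: release L1 so we never wait while holding a lock,
		// then start over from the top
		pthread_mutex_unlock(&L1);
		goto top;
	}
	// ... both locks held ...
	pthread_mutex_unlock(&L2);
	pthread_mutex_unlock(&L1);

Adding a short random delay before jumping back to top makes the two-threads-colliding-forever livelock much less likely.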

Mutual exclusion

Lastly, there are lock-free and wait-free approaches to building data structures. For example, instead of using a lock to accomplish an atomic increment, imagine something like

void atomic_increment(int *value, int amount) {
	int old;
	do {
		// grab the current value
		old = *value;
		// if we swapped successfully, great, exit;
		// otherwise someone else changed *value, so re-read and try again
	} while (compare_and_swap_instr(value, old, old + amount) == 0);
}

which will just repeatedly try to atomically update the value until it succeeds. No deadlocks can occur here! (But a livelock is still possible.)
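For something that actually compiles, here is a minimal sketch of the same loop using the GCC/Clang builtin __sync_bool_compare_and_swap (compare_and_swap_instr above is pseudocode for a hardware CAS; the builtin and the tiny main below are my assumptions, not from the original):

#include <stdio.h>

// keep retrying the compare-and-swap until no other thread races us
void atomic_increment(int *value, int amount) {
	int old;
	do {
		old = *value;
	} while (!__sync_bool_compare_and_swap(value, old, old + amount));
}

int main(void) {
	int counter = 0;
	atomic_increment(&counter, 5);
	printf("counter = %d\n", counter); // prints counter = 5
	return 0;
}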

Avoidance strategies

Scheduling

In a multi-core setting, one could also imagine a smart scheduler that knows which threads could deadlock if run concurrently, and makes sure they are instead run one after the other on a single CPU.

Detect and recover

A perhaps less elegant but pragmatic solution is to allow deadlocks to occur, but have a detection mechanism that can recover. For example, Postgres and other database management systems have a deadlock detector that runs periodically. It builds a resource graph and checks for cycles; if it finds one, it can abort the offending transactions.
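To sketch what "build a graph and check for cycles" might look like, here's a toy waits-for graph over a fixed set of threads with a depth-first search for a cycle (the NTHREADS constant, the waits_for matrix, and the function names are all made up for illustration; a real DBMS builds this from its lock tables):

#include <stdbool.h>

#define NTHREADS 4

// waits_for[i][j] == true means thread i is waiting for a lock held by thread j
bool waits_for[NTHREADS][NTHREADS];

// depth-first search: reaching a node already on the current path means a cycle
static bool dfs(int t, bool on_path[], bool visited[]) {
	visited[t] = true;
	on_path[t] = true;
	for (int next = 0; next < NTHREADS; next++) {
		if (!waits_for[t][next])
			continue;
		if (on_path[next])
			return true; // found a cycle => deadlock
		if (!visited[next] && dfs(next, on_path, visited))
			return true;
	}
	on_path[t] = false;
	return false;
}

bool deadlock_detected(void) {
	bool on_path[NTHREADS] = { false };
	bool visited[NTHREADS] = { false };
	for (int t = 0; t < NTHREADS; t++)
		if (!visited[t] && dfs(t, on_path, visited))
			return true;
	return false;
}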