5. Thread Cancellation

This section addresses a critical robustness issue: What happens if a thread is killed (cancelled) while it is waiting for a lock?

1. The Scenario: A “Hung” Lock

Imagine the following sequence of events using the code we just wrote:

Thread A calls pthread_rwlock_rdlock. It sees a writer is active, so it increments rw_nwaitreaders (count becomes 1) and calls pthread_cond_wait to sleep.
Thread B (the main thread) decides Thread A is taking too long and calls pthread_cancel(Thread A).
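Thread A is cancelled while inside pthread_cond_wait, which is a cancellation point. Before the thread terminates, the mutex is re-acquired on its behalf.
The Bug: The line rw_nwaitreaders-- (which comes after the wait) is never executed, and neither is the pthread_mutex_unlock at the end of the function. The lock now records a permanent “ghost” reader waiter, and rw_mutex is held by a thread that no longer exists, so any later call on this lock blocks forever.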

Stevens provides a test program (testcancel) in this section that demonstrates exactly this behavior: the program hangs because the internal counters get desynchronized from reality.
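The failure can be reproduced in miniature without the full lock. The sketch below is not Stevens's testcancel program; it is a hypothetical stand-in in which demo_mutex, demo_cond and demo_nwait play the roles of rw_mutex, rw_condreaders and rw_nwaitreaders. It cancels a thread that is blocked in pthread_cond_wait with no cleanup handler installed, then uses pthread_mutex_trylock to probe the mutex instead of hanging on it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-ins for rw_mutex, rw_condreaders and rw_nwaitreaders. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  demo_cond  = PTHREAD_COND_INITIALIZER;
static int demo_nwait = 0;

/* Waits with NO cleanup handler installed, like the unfixed lock code. */
static void *unprotected_waiter(void *arg) {
    (void) arg;
    pthread_mutex_lock(&demo_mutex);
    demo_nwait++;
    for (;;)                         /* nobody ever signals demo_cond */
        pthread_cond_wait(&demo_cond, &demo_mutex);
    /* Not reached: the decrement and the unlock would go here. */
    return NULL;
}

/* Returns 1 if cancelling the waiter left a stale count AND a locked mutex.
 * Call at most once per process: the global mutex stays locked afterwards. */
int demo_cancel_without_cleanup(void) {
    pthread_t tid;
    void *status = NULL;
    int waiting = 0;

    int rc = pthread_create(&tid, NULL, unprotected_waiter, NULL);
    assert(rc == 0);
    (void) rc;

    /* Once we can take the mutex and see the count at 1, the waiter must
     * have released the mutex, i.e. it is blocked inside pthread_cond_wait. */
    while (!waiting) {
        pthread_mutex_lock(&demo_mutex);
        waiting = (demo_nwait == 1);
        pthread_mutex_unlock(&demo_mutex);
        if (!waiting)
            sched_yield();
    }

    pthread_cancel(tid);
    pthread_join(tid, &status);
    assert(status == PTHREAD_CANCELED);

    int stale_count = (demo_nwait == 1);    /* decrement never ran */
    int mutex_stuck = (pthread_mutex_trylock(&demo_mutex) == EBUSY);
    return stale_count && mutex_stuck;
}
```

On a conforming implementation the function should return 1: the count is still 1 and the mutex is still locked, even though the only thread that could fix either is gone.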

2. The Solution: Cleanup Handlers

To fix this, POSIX provides a mechanism similar to a try...finally block in higher-level languages. We can “push” a function onto a per-thread stack of cleanup handlers, and the handlers on that stack run if the thread is cancelled (or calls pthread_exit).

The API:

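pthread_cleanup_push(function, arg): Registers a cleanup function; it will be called with arg as its only argument.
pthread_cleanup_pop(execute): Removes the most recently pushed function from the stack. If execute is non-zero, it also runs the function. The two calls may be implemented as macros, so they must appear as a matched pair in the same lexical scope.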
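To see the stack behaviour and the meaning of the execute argument in isolation, here is a small single-threaded sketch. The names demo_log, record, note_handler and demo_push_pop are made up for the illustration; only pthread_cleanup_push and pthread_cleanup_pop come from the API above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static char   demo_log[8];   /* one character per event, in order */
static size_t demo_len;

/* Appends the first character of tag to the log. */
static void record(const char *tag) {
    assert(demo_len + 1 < sizeof demo_log);
    demo_log[demo_len++] = tag[0];
    demo_log[demo_len] = '\0';
}

/* Cleanup handler: just notes that it ran. */
static void note_handler(void *arg) {
    record(arg);
}

/* Pushes two handlers, does some "work", then pops both.
 * Returns the log of what happened. */
const char *demo_push_pop(void) {
    demo_len = 0;
    demo_log[0] = '\0';

    pthread_cleanup_push(note_handler, "A");
    pthread_cleanup_push(note_handler, "B");

    record("w");                 /* the protected work */

    pthread_cleanup_pop(1);      /* removes B and runs it */
    pthread_cleanup_pop(0);      /* removes A without running it */

    return demo_log;
}
```

The log ends up as "wB": the work runs first, B runs because it was popped with a non-zero argument, and A never runs because it was popped with 0. Note that the return statement comes after both pops; leaving the function between a push and its pop is not allowed.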

3. The Fix Implementation

We need to wrap our “wait” calls with these handlers. We create two small static functions—one to clean up readers, one for writers.

A. The Cleanup Functions

These functions simply restore the state (decrement the count and release the mutex) if the thread is killed.

// Cleanup handler for Readers
static void rwlock_cancelrdwait(void *arg) {
    pthread_rwlock_t *rw = arg;
    
    rw->rw_nwaitreaders--;             // 1. Undo the increment
    pthread_mutex_unlock(&rw->rw_mutex); // 2. Release the mutex (it is re-acquired before this handler runs)
}
 
// Cleanup handler for Writers
static void rwlock_cancelwrwait(void *arg) {
    pthread_rwlock_t *rw = arg;
    
    rw->rw_nwaitwriters--;             // 1. Undo the increment
    pthread_mutex_unlock(&rw->rw_mutex); // 2. Release the mutex
}

B. Applying the Fix (The “Sandwich”)

We modify the locking functions to “sandwich” the wait call between a push and a pop.

Fixed pthread_rwlock_rdlock:

int pthread_rwlock_rdlock(pthread_rwlock_t *rw) {
    // ... (mutex locking and checks) ...
 
    while (rw->rw_refcount < 0 || rw->rw_nwaitwriters > 0) {
        rw->rw_nwaitreaders++;
 
        // --- THE FIX ---
        // 1. Register the cleanup handler
        pthread_cleanup_push(rwlock_cancelrdwait, (void *) rw);
 
        // 2. Wait (Point of potential cancellation)
        result = pthread_cond_wait(&rw->rw_condreaders, &rw->rw_mutex);
 
        // 3. Remove the handler. Passing '0' means "don't run it now" 
        //    (because we woke up normally and will decrement manually below).
        pthread_cleanup_pop(0); 
        // ----------------
 
        rw->rw_nwaitreaders--;
        if (result != 0) break;
    }
 
    // ... (rest of function) ...
}
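The writer side follows the same sandwich pattern. The sketch below is a cut-down, hypothetical version rather than the chapter's actual code: demo_rwlock_t keeps only the fields needed here, the name rw_condwriters and the wait condition rw_refcount != 0 are assumptions, and the small harness at the end exists only to check that a cancelled writer leaves the lock clean.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical cut-down lock; field names mirror the ones in this section. */
typedef struct {
    pthread_mutex_t rw_mutex;
    pthread_cond_t  rw_condwriters;  /* assumed name for the writers' condvar */
    int rw_nwaitwriters;
    int rw_refcount;                 /* -1: a writer holds it, >0: reader count */
} demo_rwlock_t;

/* Cleanup handler for a cancelled writer (same shape as rwlock_cancelwrwait). */
static void demo_cancelwrwait(void *arg) {
    demo_rwlock_t *rw = arg;
    rw->rw_nwaitwriters--;
    pthread_mutex_unlock(&rw->rw_mutex);
}

int demo_wrlock(demo_rwlock_t *rw) {
    int result;

    if ((result = pthread_mutex_lock(&rw->rw_mutex)) != 0)
        return result;

    while (rw->rw_refcount != 0) {           /* assumed: wait until unowned */
        rw->rw_nwaitwriters++;
        pthread_cleanup_push(demo_cancelwrwait, rw);
        result = pthread_cond_wait(&rw->rw_condwriters, &rw->rw_mutex);
        pthread_cleanup_pop(0);              /* woke normally: do not run it */
        rw->rw_nwaitwriters--;
        if (result != 0)
            break;
    }
    if (result == 0)
        rw->rw_refcount = -1;                /* writer now owns the lock */

    pthread_mutex_unlock(&rw->rw_mutex);
    return result;
}

static void *writer_thread(void *arg) {
    demo_wrlock(arg);
    return NULL;
}

/* Harness: a "reader" holds the lock, a writer blocks and is cancelled.
 * Returns 1 if the waiter count is back to 0 and the mutex is free. */
int demo_cancel_blocked_writer(void) {
    demo_rwlock_t rw;
    pthread_t tid;
    int blocked = 0;

    pthread_mutex_init(&rw.rw_mutex, NULL);
    pthread_cond_init(&rw.rw_condwriters, NULL);
    rw.rw_nwaitwriters = 0;
    rw.rw_refcount = 1;                      /* pretend one reader holds it */

    int rc = pthread_create(&tid, NULL, writer_thread, &rw);
    assert(rc == 0);
    (void) rc;

    while (!blocked) {                       /* wait until the writer sleeps */
        pthread_mutex_lock(&rw.rw_mutex);
        blocked = (rw.rw_nwaitwriters == 1);
        pthread_mutex_unlock(&rw.rw_mutex);
        if (!blocked)
            sched_yield();
    }

    pthread_cancel(tid);
    pthread_join(tid, NULL);

    int clean = (rw.rw_nwaitwriters == 0);
    if (pthread_mutex_trylock(&rw.rw_mutex) == 0)
        pthread_mutex_unlock(&rw.rw_mutex);
    else
        clean = 0;                           /* mutex was left locked */

    pthread_cond_destroy(&rw.rw_condwriters);
    pthread_mutex_destroy(&rw.rw_mutex);
    return clean;
}
```

With the handler in place, demo_cancel_blocked_writer should return 1; removing the push/pop pair from demo_wrlock should make it return 0.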

4. Visualizing the Flow

It helps to visualize the execution flow of a normal wake-up versus a cancellation.

Normal Path: The thread pushes the handler, sleeps, wakes up, pops the handler (removing it), and continues. The handler never runs.
Cancelled Path: The thread pushes the handler, sleeps, and gets cancelled. The threads library re-acquires the mutex for the dying thread and then runs the handlers on its cleanup stack. Our handler fixes the counter and releases the mutex before the thread terminates.
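The two paths can also be observed directly by counting how often the handler runs. In the sketch below everything (flow_state_t, flow_cleanup, flow_waiter, flow_run) is a hypothetical test rig built around a plain condition variable, not part of the lock itself.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int nwait;          /* waiter count, in the role of rw_nwaitreaders */
    int ready;          /* predicate the waiter sleeps on */
    int handler_runs;   /* how many times the cleanup handler ran */
} flow_state_t;

static void flow_cleanup(void *arg) {
    flow_state_t *st = arg;
    st->handler_runs++;
    st->nwait--;                         /* undo the increment */
    pthread_mutex_unlock(&st->mutex);    /* mutex is held when we get here */
}

static void *flow_waiter(void *arg) {
    flow_state_t *st = arg;

    pthread_mutex_lock(&st->mutex);
    while (!st->ready) {
        st->nwait++;
        pthread_cleanup_push(flow_cleanup, st);
        pthread_cond_wait(&st->cond, &st->mutex);
        pthread_cleanup_pop(0);          /* normal wake-up: handler not run */
        st->nwait--;
    }
    pthread_mutex_unlock(&st->mutex);
    return NULL;
}

/* Runs one waiter and ends its wait by cancellation (cancel != 0) or by a
 * normal signal (cancel == 0). Returns the number of handler runs, or -1 if
 * the waiter count did not return to 0. */
int flow_run(int cancel) {
    flow_state_t st;
    pthread_t tid;
    int sleeping = 0;

    pthread_mutex_init(&st.mutex, NULL);
    pthread_cond_init(&st.cond, NULL);
    st.nwait = st.ready = st.handler_runs = 0;

    int rc = pthread_create(&tid, NULL, flow_waiter, &st);
    assert(rc == 0);
    (void) rc;

    while (!sleeping) {                  /* wait until the thread is asleep */
        pthread_mutex_lock(&st.mutex);
        sleeping = (st.nwait == 1);
        pthread_mutex_unlock(&st.mutex);
        if (!sleeping)
            sched_yield();
    }

    if (cancel) {
        pthread_cancel(tid);             /* cancelled path */
    } else {
        pthread_mutex_lock(&st.mutex);   /* normal path */
        st.ready = 1;
        pthread_cond_signal(&st.cond);
        pthread_mutex_unlock(&st.mutex);
    }
    pthread_join(tid, NULL);

    int runs = (st.nwait == 0) ? st.handler_runs : -1;
    pthread_cond_destroy(&st.cond);
    pthread_mutex_destroy(&st.mutex);
    return runs;
}
```

flow_run(0) should return 0 (handler popped, never run) and flow_run(1) should return 1 (handler run exactly once by the cancellation). In both cases the waiter count is back at 0 and the mutex is free.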

5. Summary of Chapter 8

This concludes Chapter 8 on Read-Write Locks. We have covered:

Concept: Allowing multiple readers but only one writer.
API: pthread_rwlock_rdlock/wrlock.
Implementation: Building one from scratch using Mutexes and Condition Variables.
Robustness: Handling thread cancellation using pthread_cleanup_push.
