This chapter marks a significant shift in how we handle concurrency. Up until now, whenever we wanted to handle multiple clients simultaneously, we used fork() to create a new process. While robust, that approach has limitations that threads aim to solve.
The text identifies two main problems with the traditional Unix model of using fork() (one process per client):
- Performance Cost: Forking is “expensive.” Even with optimizations like copy-on-write, the operating system still has to duplicate the page tables and the file descriptor table and set up a new address space. According to the text, creating a thread can be 10–100 times faster than creating a process.
- Communication Difficulty: Passing information from a parent to a child is easy (since the child gets a copy of the parent’s variables at the moment of the fork). However, returning information from the child to the parent is harder and requires some form of interprocess communication (IPC), such as pipes or message queues.
The Thread Solution (“Lightweight Processes”)
Threads exist within a single process. The key trade-off is that while they are faster and can easily share data, they introduce the danger of synchronization bugs (race conditions).
The most important concept to grasp in this section is exactly what is shared and what is private between threads.
Shared Resources (All threads in a process share these):
- Global variables and memory.
- Process instructions (the code).
- Open file descriptors (If Thread A closes a socket, it is closed for Thread B too).
- Signal handlers.
- User IDs and Group IDs.
Private Resources (Each thread has its own):
- Thread ID (TID).
- Stack (for local variables).
- Registers (including the Program Counter and Stack Pointer).
- errno (This is crucial; otherwise, a network error in one thread would overwrite a file error in another).
- Signal mask.
The Analogy
The text compares threads to signal handlers.
- In standard Unix programming, if a signal arrives, the main program pauses and a signal handler function runs. The handler sees the same global variables as the main program but keeps its own local variables on the stack.
- If the main program was updating a linked list and the signal handler tries to update the same list, corruption occurs. Threads work the same way: they run independently but share the same data space, so you must be careful.
The Standard: Pthreads
The book focuses on POSIX Threads (often called Pthreads), standardized in 1995 as part of POSIX.1c. You will recognize them because all the functions begin with pthread_ (e.g., pthread_create).