| CPSC 331 | Operating Systems | Spring 2026 |
This project has several goals:
Your primary task is to add support for threads in xv6 as described in the project README and below. As part of this, you will add two system calls to support threads in the kernel and build a small user library providing support for basic thread management and mutex locks. Again, there isn't a lot of code to write; as with previous projects, the crux of the assignment is understanding what is going on in the relevant portions of xv6.
Review the course policy on academic integrity.
Certain uses of AI are permitted on this assignment. AI use is not required.
The basic rule: you may use the code completion features of Copilot but may not use features (such as the coding agent) where code is generated from English language prompts. It is also essential that you understand and think critically about any code suggested by Copilot, both to help develop your C programming skills and because while the code suggestions are often uncannily on target, they are not always exactly what you want.
Using the code explanation features of Copilot Chat is permitted, though be careful that this doesn't spill over into code generation.
Review the policy on late work.
Revise and resubmit applies for this assignment. Review the details of how revise and resubmit works.
To hand in your project:
Copy your entire xv6-threads directory to your handin directory (/classes/cs331/handin/username).
Check that the handin was successful and that the directory structure is correct: your handin folder /classes/cs331/handin/username should contain a folder xv6-threads which in turn contains a src directory.
Do the steps outlined in reddish boxes below before you start writing code.
Copy the xv6-threads directory (and all of its contents) from /classes/cs331 to your workspace folder (~/cs331/workspace). Make sure that you end up with the xv6-threads directory inside your workspace folder; don't just copy its contents.
The provided code includes the xv6 codebase (the same thing you have started with for each xv6 project) and several user programs to test your system calls and thread library routines. There are no tester test cases this time.
Very important! One of the things the autoformatter can do is reorganize the #include directives in your code. While useful in some circumstances, this will break the xv6 code. You should have taken care of this as part of the setup for the previous xv6 projects. (If you didn't, go back and look at that now.)
Three test programs (threadtest, threadlibtest, and locktest) have been provided to help you test your code. Since they won't compile until you have completed the relevant parts of the project, you will need to add them to the UPROGS list in src/Makefile when you are ready. (This is the same thing you did for the hello world program in the first xv6 project.) Once compiled, run them from the xv6 shell prompt in QEMU.
Note that threadtest.c and locktest.c have a couple of commented-out sections — look for the TODO comments. Also feel free to modify the provided programs to further test your code.
To print messages for debugging kernel code, use cprintf instead of printf. (printf is only for printing when running in user mode — the two different versions have to do with using different sections of the address space for kernel vs user mode.) cprintf is similar to printf but you don't have to specify the file descriptor first — all output from cprintf goes to the console. Look for where it is defined/used in the xv6 code for examples.
Getting the stack set up properly for a new thread to begin executing can be tricky, and a common problem is to access a bad memory address. This will tend to show up as an error
unexpected trap 14 from cpu 0
Trap 14 is a page fault. Other information printed out is the value of %eip (the instruction pointer) and %cr2 (a control register which stores the address that caused the page fault). To figure out the problem, think carefully about the addresses and values you expect, then use cprintf to examine what is actually going on.
You have three main tasks: add kernel threads (the clone() and join() system calls), add user-level library routine wrappers thread_create() and thread_join() around those system calls, and add library routines for mutex locks (lock_init(), lock_acquire(), and lock_release()). The project README provides an overview and some implementation details, though note that there are a few changes to the specifications (and many more details) below.
To do this:
Read through the README for an overview of what you will be doing and to find out what information it contains.
Work through the sections below, in order. For each, read through the whole section to see what it contains before getting started writing code. There are a number of specific directions and hints that will make things a great deal easier if you are aware of them rather than just launching straight into trying to achieve the specifications.
You will be implementing clone() and join() as described in the README, with two changes: the caller (rather than clone) will decide how much space to allocate for the stack, and the stack parameter to clone will be the address of the top of the stack instead of the beginning of the space allocated for the stack. This is a bit cleaner than assuming a fixed stack size, and passing the top of the stack matches what many real systems (like Linux) do.
You also do not need to deal with making sure that resizing the address space is handled properly for multithreaded processes (mentioned at the end of the "Overview" section of the README).
Since the kernel will keep track of both threads and processes in the process table, add a flag int isthread to the per-process information in order to distinguish them: locate the definition of struct proc and add a new field, then initialize it to 0 (not a thread i.e. a process) when new processes are created (in allocproc()).
We'll also store a pointer to the thread's stack so that memory can be deallocated when the thread exits. Add a field void* stack to struct proc and initialize it to 0 (null) in allocproc().
clone() is a system call, so start with setting up the usual structure for system calls: do everything except implement the main functionality (the body of the helper). Since threads are related to processes, sys_clone and the helper thrclone should go into sysproc.c and proc.c, respectively. Also be sure to add the header for the helper to defs.h. See the "Overview" section in the README for the header for clone(). Retrieve the parameters as ints (use argint rather than argptr) even though they are declared as void * because the size of what is being pointed to by the address is unknown to clone().
Creating a new thread is very similar to creating a new process, except that the parent's address space is shared rather than a new address space being created and the new thread needs to be set up to run (by setting up its stack and instruction pointer). The body of fork() will be used as the basis for implementing thrclone().
Read the "Building clone() from fork()" section in the README to get an overview of the idea. That also explains some of what is going on in fork().
Copy the body of fork() to be the starting point for the body of thrclone().
Add: initialize np->isthread to 1 for the new thread.
Add: initialize np->stack to the stack parameter passed to thrclone().
Modify: instead of creating a copy of the parent's address space and pointing the new thread's page directory there, simply set np->pgdir to point to the parent's page directory. (The cleanup that is done if the copy operation fails is also no longer needed.)
The new thread will start running as if it is returning from a trap, so the "saved" state needs to be set up accordingly. When a trap occurs, registers are stored in the tf (trap frame) field of the process table info for the process/thread.
First, set up the new thread's entry point (where the thread will start running): %eip is the register holding the address of the next instruction (i.e. the instruction pointer or program counter), and the fcn parameter to clone() is a pointer to the function the thread should execute.
Add
np->tf->eip = (uint)fcn;
Finally, since the new thread is effectively starting off with a function call, its stack needs to be set up properly for that call. See the slides from class for details about how the stack is arranged and important pointer arithmetic tips for the following step.
Add: push the return address, arg1, and arg2 onto the stack and set the "saved" stack pointer np->tf->esp to point to the top of the stack.
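The pointer arithmetic for this step can be tried out in an ordinary C program before touching the kernel. The following is only a sketch under stated assumptions, not the required implementation: setup_call_frame is a hypothetical helper (it does not exist in xv6), the stack is modeled as an array of 32-bit words, and FAKE_RETURN_PC is an assumed sentinel standing in for a return address that should never be used, since a thread function is expected to call exit() rather than return. It shows the layout a function expects on entry under the x86 C calling convention: the return address at the lowest address, with arg1 and arg2 directly above it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sentinel: a thread function should call exit(), never return. */
#define FAKE_RETURN_PC 0xffffffffu

/*
 * Hypothetical helper (not part of xv6): lay out a call frame on a
 * downward-growing stack. `top` points one word past the highest usable
 * word. Returns the value the saved stack pointer should take.
 */
static uint32_t *setup_call_frame(uint32_t *top, uint32_t arg1, uint32_t arg2)
{
    uint32_t *sp = top;
    assert(sp != 0);

    *--sp = arg2;            /* last argument goes at the highest address */
    *--sp = arg1;            /* first argument sits just above the return address */
    *--sp = FAKE_RETURN_PC;  /* return address is what sp points at on entry */

    return sp;
}
```

In the kernel, the same idea would apply to the stack address passed to thrclone(), with the returned pointer stored (cast to uint) as the saved stack pointer. Decrementing before each store matters: the address passed in as the top of the stack is one past the last usable word.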
You should now be able to test thread creation — look at the code in user/threadtest.c to see how it calls clone(), update the Makefile to compile it (add threadtest to UPROGS), run xv6 with make qemu, and then run threadtest from the xv6 command line.
join() is a system call, so start with setting up the usual structure for system calls as you did with clone(). Again consult the "Overview" section in the README for the header, and name the helper thrjoin.
Joining a child thread is very similar to waiting for a child process, except that we don't free its address space (the parent process is still using it!). (The thread's stack needs to be cleaned up instead.) The body of wait() will be used as the basis for implementing thrjoin().
Copy the body of wait() to be the starting point for the body of thrjoin().
Zombie processes (and threads) are those which have ended (signaled by calling exit()) but whose resources haven't been cleaned up yet. wait() looks for a zombie child process, then cleans up its resources by deallocating allocated memory and zeroing out the info stored in its process table entry. join() should do the same for threads, with the following changes:
Only look for child threads — add a check of isthread along with the existing check of whether p is a child.
Don't deallocate the address space! Remove the line that frees the page directory.
Copy the location of the thread's stack to the argument stack so that it can be freed (by the caller). Add
*stack = p->stack;
to the rest of the cleanup done when a zombie child is found.
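The effect of the isthread check can be seen in a small stand-alone model of the process-table scan. This is a sketch only: the struct proc here is a cut-down stand-in containing just the fields discussed above, find_zombie_child is a hypothetical helper, and locking and sleeping (which the real wait() does) are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

enum procstate { UNUSED, RUNNABLE, ZOMBIE };

/* Simplified stand-in for xv6's struct proc: only the relevant fields. */
struct proc {
    enum procstate state;
    struct proc *parent;
    int isthread;   /* 1 for a thread, 0 for a process */
    void *stack;    /* stack address recorded when the thread was created */
};

/*
 * Hypothetical helper: return the first zombie child of `parent` whose
 * isthread flag equals `want_thread`, or NULL if there is none.
 * A join-style scan passes want_thread = 1; a wait-style scan passes 0.
 */
static struct proc *find_zombie_child(struct proc *table, size_t n,
                                      struct proc *parent, int want_thread)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        struct proc *p = &table[i];
        if (p->parent != parent || p->isthread != want_thread)
            continue;           /* not our child, or the wrong kind of child */
        if (p->state == ZOMBIE)
            return p;
    }
    return NULL;
}
```

With a table holding one zombie child process and one zombie child thread of the same parent, the scan with want_thread set to 1 returns only the thread, so its stack pointer can be handed back to the caller without disturbing the child process.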
Finally, wait() itself needs a few updates so that it behaves properly when there are both threads and processes. (The "Overview" section in the README mentions this.) There are two considerations: cleaning up threads that have finished is different from cleaning up processes that have finished, and shared address spaces mean that a process which exits before all of its threads have finished requires special handling. The latter issue can be dodged if a process always joins its threads before itself exiting, but, of course, an OS shouldn't count on user programs always being well-behaved.
Fix the different cleanup needs by updating wait() to only apply to child processes and not child threads — add a check of isthread (in this case, that p is not a thread) along with the existing check of whether p is a child.
[extra credit] Deal with processes that exit while they still have running threads. The README mentions freeing the address space only if this (the zombie child being waited for) is the last reference to it. (How will you know when it is the last reference?) Make sure you fully understand what is going on: trace through the implementation of the exit() system call in kernel/proc.c to find out what happens when exit() is called, and think through two scenarios: process p creates a multithreaded child process c and c exits before all of its threads, and multithreaded process p exits before all of its threads. Do the address spaces get cleaned up properly in both cases?
You should now be able to test join — see the TODO comments in threadtest.c and make changes accordingly, then compile and run.
The thread library puts a slightly nicer wrapper around the system calls.
Implement thread_create() and thread_join() as described in the "Overview" section of the README. See the additional notes and specifications below.
Additional notes and specifications:
The user.h and ulib.c files are in the include and ulib subdirectories, respectively, not user as the README says.
Allocate one page (4096 bytes) for the stack.
See the provided user/threadtest.c for examples of using clone() and join() (including allocating and freeing the thread's stack). Note that you need to pass the top of the stack to clone() — and that this is what is passed back by join() — but the beginning of the allocated block to free(). You can simply pass the arg1 and arg2 parameters along to clone().
Do error-checking, and return -1 if malloc(), clone(), or join() fail. Don't forget to free the allocated stack if clone() fails. (Otherwise the stack will be freed by join().)
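The difference between the address passed to clone() and the address passed to free() is easy to get wrong, so here is a minimal sketch of just that arithmetic, written as a regular C program using the host malloc() and free(). The helper names stack_alloc_top and stack_free_from_top are hypothetical and are not part of the required library interface.

```c
#include <assert.h>
#include <stdlib.h>

#define STACK_SIZE 4096  /* one page, per the specification above */

/*
 * Hypothetical helper: allocate a stack and return the address of its top
 * (one past the last byte), which is what clone() expects.
 * Returns NULL if the allocation fails.
 */
static void *stack_alloc_top(void)
{
    char *base = malloc(STACK_SIZE);
    if (base == NULL)
        return NULL;
    return base + STACK_SIZE;
}

/*
 * Hypothetical helper: given the top-of-stack address (as handed back by
 * join()), recover the start of the allocated block and free it.
 */
static void stack_free_from_top(void *top)
{
    assert(top != NULL);
    free((char *)top - STACK_SIZE);
}
```

A thread_create() along these lines would call the first helper before clone() and the second if clone() fails; thread_join() would call the second on the address that join() stores through its stack argument.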
You should now be able to test your library routines — look at the code in user/threadlibtest.c to see how it works (it is the same as threadtest.c except using the library routines instead of the system calls directly), update the Makefile to compile it (add threadlibtest to UPROGS), run xv6 with make qemu, and then run threadlibtest from the xv6 command line.
Implement ticket locks as described in the "Overview" section of the README. See the additional notes and specifications below.
Chapter 28 in the book contains code for implementing ticket locks. The main task in this part is copying that code into the right places in the xv6 code with a few adaptations — you don't need to figure out the implementation from scratch. Note that the README specifies slightly different names for the functions, so you'll need to adapt the book's code accordingly.
Additional notes and specifications:
Put the whole definition of the lock_t type and prototypes (headers) for lock_init(), lock_acquire(), and lock_release() in include/user.h and put the bodies in ulib/ulib.c.
It is recommended that you add a field pid to lock_t to store the pid of the thread holding the lock. This is handy for debugging purposes! Set pid to -1 when the lock is initialized or released, and use the getpid() system call to retrieve the pid of the current process to set it when the lock is acquired.
Copy the x86 fetch-and-add implementation referred to in the README into include/x86.h. Add static at the beginning of the header (see the other definitions in include/x86.h). Note that the function's name is slightly different than what is assumed in the book's code so you will need to adapt the book's code accordingly.
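For orientation, here is a sketch of the ticket-lock idea with the pid field added, written so that it compiles as an ordinary C program outside xv6. Two substitutions are assumptions made for that purpose: C11's atomic_fetch_add from <stdatomic.h> stands in for the x86 fetch-and-add routine you will copy into include/x86.h, and the host getpid() from <unistd.h> stands in for the xv6 system call. The field names ticket and turn are assumed to follow the book's presentation. In xv6 itself the fields would be plain ints and the loop would use your fetch-and-add function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <unistd.h>

typedef struct {
    atomic_uint ticket;  /* next ticket number to hand out */
    atomic_uint turn;    /* ticket number currently allowed to hold the lock */
    int pid;             /* pid of the holder (debugging aid), -1 when free */
} lock_t;

static void lock_init(lock_t *lock)
{
    assert(lock != 0);
    atomic_init(&lock->ticket, 0);
    atomic_init(&lock->turn, 0);
    lock->pid = -1;
}

static void lock_acquire(lock_t *lock)
{
    /* Atomically take a ticket; fetch-and-add returns the old value. */
    unsigned int myturn = atomic_fetch_add(&lock->ticket, 1);

    while (atomic_load(&lock->turn) != myturn)
        ;  /* spin until our ticket comes up */

    lock->pid = (int)getpid();  /* record the holder once the lock is ours */
}

static void lock_release(lock_t *lock)
{
    lock->pid = -1;                    /* clear the holder before handing off */
    atomic_fetch_add(&lock->turn, 1);  /* let the next ticket holder proceed */
}
```

Clearing pid before advancing turn, and setting it only after the spin loop ends, keeps the debugging field from being overwritten by a thread that does not hold the lock.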
You should now be able to test your lock routines: look at the code in user/locktest.c to see how it works, update the Makefile to compile it (add locktest to UPROGS), run xv6 with make qemu, and then run locktest from the xv6 command line. Note that the provided version creates and initializes a lock but doesn't use it. Run it that way first to demonstrate the problem of race conditions, then uncomment the lines indicated by the TODO comment to (hopefully) see successful mutual exclusion. It is also recommended that you add printfs in the lock routines so you can more easily trace the flow of lock acquisition and release.