| CPSC 331 | Operating Systems | Spring 2026 |
This project has several goals:
Your task is to implement a simplified version of the Unix shell largely as described in the project README.
Review the course policy on academic integrity.
Certain uses of AI are permitted on this assignment. AI use is not required.
The basic rule: you may use the code completion features of Copilot but may not use features (such as the coding agent) where code is generated from English language prompts. It is also essential that you understand and think critically about any code suggested by Copilot, both to help develop your C programming skills and because while the code suggestions are often uncannily on target, they are not always exactly what you want.
Using the code explanation features of Copilot Chat is permitted, though be careful that this doesn't spill over into code generation.
Review the policy on late work.
Revise and resubmit applies for this assignment. Review the details of how revise and resubmit works.
To hand in your project:
Make sure that you have included a comment at the beginning of your program containing your name and a description of the program, and that you have auto-formatted it.
Copy your entire processes-shell directory to your handin directory (/classes/cs331/handin/username).
Check that the handin was successful and that the directory structure is correct: your handin folder /classes/cs331/handin/username should contain a folder processes-shell which in turn contains a Makefile, a file wish.c, and, if you added any test cases, a subdirectory tests containing the appropriate files.
Do the steps outlined in reddish boxes below before you start writing code.
Copy the processes-shell directory (and all of its contents) from /classes/cs331 to your workspace folder (~/cs331/workspace). Make sure that you end up with the processes-shell directory inside your workspace folder; don't just copy its contents.
The processes-shell/tests directory contains test cases for your shell. Run them from the command line: change to your processes-shell directory as needed, then run
./test-wish.sh
Treat the provided tests as a backup check: practice your testing skills by first doing your own testing, then run the provided tests to see if you missed anything. Also note that the provided tests run your shell in batch mode, so you'll need that functionality before you can run any tests.
The tester directory (copied as part of the setup for the previous project) contains scripts for running the provided tests along with a README that describes the setup for each test case. There's no need to do anything with the contents of this directory unless you want to define your own test cases.
Extra credit is possible if you find cases missed by the provided tests — see the README in the tester directory for information on how to define a test case and hand in the files for your new case(s) along with your code. Be sure to number your test cases starting one higher than the provided ones — don't modify or replace the provided test cases — and include a description of what the test case covers in the appropriate file.
Also see the Reference section from the previous project.
Look up system calls and library routines in the man pages to find out more about them even if the README or handout tells you what to use. Pay careful attention to the #includes needed along with the descriptions and return values.
A few particular notes:
wait(NULL) waits for any child to finish, returning the pid of the exiting child or -1 if an error occurred. Not having any unwaited-for children results in -1.
There is a lot of string manipulation needed for this project, and, in fact, that is likely the most challenging part of the project. Working with strings looks very different in C than it does in Java — in Java, String is a built-in object with methods and automatic memory management. In C, a string is simply a sequence of characters stored in memory and terminated by a special null character ('\0'). You do not call methods on strings — instead, you pass character arrays (or pointers to them) to library functions. This means you must be aware of where memory comes from, how long it remains valid, and whether operations modify the data in place.
For this project you can avoid using malloc to allocate memory for strings — you can either use routines that allocate memory themselves or create arrays. Keep in mind, though, that you will need to keep track of when string routines allocate memory so that you can free it (using free) when it is no longer needed — unlike in Java, there is no garbage collector to reclaim unused storage automatically; any memory obtained through library or system interfaces must eventually be released explicitly. For example, this means remembering which buffers were created by functions such as getline or strdup, and ensuring they are freed once their contents are no longer required. A useful habit is to treat memory ownership deliberately: note where each block originates, avoid losing the pointer to it, and clean it up at a well-defined point in the program's control flow (for example, at the end of each command-processing loop). Careful tracking prevents memory leaks, reduces long-running resource consumption, and reinforces disciplined thinking about how data structures and system resources are managed at the systems level.
Lines of text can be read using getline(), which was introduced in the previous project. getline() has a version where it will allocate the memory needed for what is read in for you so that you can read an arbitrary-length line without having to deal with resizing arrays:
char *line = NULL; // getline will allocate this
size_t len = 0; // size of the allocated buffer
ssize_t nread = getline(&line, &len, stdin); // ssize_t is signed, so it can hold -1
if (nread == -1) {
    // EOF or error
    free(line); // getline may have allocated a buffer even when it fails
    exit(1);
}
// ...use line...
// When done, free the buffer
free(line);
Once a line is read, you will need to break it into tokens, for example, to separate the command name from its arguments and the arguments from each other. The strsep function is useful here: it walks through the string, replaces delimiter (separator) characters with '\0', and returns pointers to successive tokens. This effectively divides up the original string but does not create any new strings. strsep is used as follows (assume this code goes where the "use line" comment is in the previous example):
char *rest = line; // pointer that strsep will update
char *token;
while ((token = strsep(&rest, " \t\n")) != NULL) {
    if (strlen(token) == 0) { // skip empty tokens from repeated delimiters
        continue;
    }
    // ...use token...
}
This splits line based on whitespace (spaces and tabs \t; including the newline \n as a delimiter strips the newline from the last token), with token taking on the values of the successive tokens. Note that strsep splits at each individual delimiter character, so two spaces in a row, for example, will result in a 0-length token between them.
It is important to keep in mind that strsep changes its input in two ways:
Its first parameter is updated to point to the rest of the line after the current token. That is why the example creates rest: if strsep(&line, ...) were used instead, we would lose the pointer to the beginning of the original buffer, which is still needed in order to free it.
The buffer pointed to by line has all occurrences of the delimiters replaced by the null terminator '\0' — if you print line after the end of the loop, you will only see the first token. The rest of the original line is still there, though — the successive values returned for token let you access those portions of the buffer.
(Try printing out the values of token, rest, and line at the beginning of the loop body to see this.) If you want to keep the original buffer unchanged, or if you want to access the returned tokens beyond the lifetime of the original buffer (each token returned is just a pointer into the original buffer, not a new string), you'll need to make a copy of the string.
Use strdup to make a copy of a string. This allocates new memory for the copy, which you will need to keep track of and eventually free when you are done with it.
When concatenating strings in C, you typically use strcat, which appends one null-terminated string to the end of another. Unlike Java's + operator, strcat does not create new storage — it writes into the destination buffer — so that buffer must already have enough space to hold the combined result (including the terminating '\0'). A common approach is to allocate storage up front by declaring a character array large enough for the expected text. For example:
char path[256]; // destination buffer
char dir[] = "/bin/";
char cmd[] = "ls";
path[0] = '\0'; // start with an empty string
strcat(path, dir);
strcat(path, cmd);
// path now contains "/bin/ls"
Here the array path provides the memory where concatenation occurs, and strcat successively appends dir and cmd. Because C does not automatically check bounds, it is the programmer's responsibility to ensure the array is large enough (including having space for the null terminator '\0'); otherwise, memory corruption can occur.
You cannot compare strings using == as you might compare object references in Java; instead you use functions such as strcmp, which examines characters and returns whether two strings are equal or which is lexicographically larger.
The function strlen is used to determine the length of a C string. Unlike Java's length() method, it does not store the length as part of the string object; instead, it counts characters starting at the beginning of the array until it encounters the terminating null character ('\0'). The value returned is the number of characters excluding that terminator.
C library and system functions communicate problems through return values rather than exceptions. After calls such as getline, strsep, or strdup, you should verify that the result is not an error indicator (such as -1 or NULL). Similarly, guard against exceeding the size of your token array, and handle edge cases like empty input or consecutive delimiters producing empty tokens. This defensive style is part of effective systems programming: because C provides direct control over memory and data representation, correctness and robustness depend on explicitly checking and validating each step.
Dive Into Systems also has some reference material on strings and arrays:
Your task is to implement a simplified version of the Unix shell largely as described in the project README.
To do this:
Start with the big picture: read the introduction, Overview, and Program Specifications sections of the README and the "additional or modified specifications" below, then look through the Structure section (except for Miscellaneous Hints). "Look through" means to get an idea of what is covered without trying to master the technical details — in this case, focus on the different kinds of functionality your shell will support and the related concepts but skip over technical implementation details like the specific system calls to use. The goal here is to understand what you are to do, without worrying about how exactly you will do it.
Look through the Miscellaneous Hints in the README and "hints/suggestions" below — the goal is to discover what there are hints about rather than absorb the specifics of what the hints are telling you.
Start coding, practicing incremental development — the hints provide a suggested plan of attack. With each chunk of functionality, review the relevant section(s) in the README and in this handout as well as the provided hints and suggestions, this time paying attention to the technical specifics.
Additional or modified specifications:
Name the program file wish.c.
Create a Makefile with targets for all, test, and clean; you can copy the one for wcat from the previous project and just modify the variable definitions.
Practice good coding style including choosing appropriate (descriptive) variable names and defining subroutines to avoid repeated code and to help organize your program.
Hints/suggestions:
The Miscellaneous Hints section in the README outlines a plan of attack. Keep in mind that you can break things down even further, e.g., start with a program that repeatedly prints a prompt, reads the command line entered using getline(), and prints what was read on the screen. (A loop repeatedly reading using getline() might sound familiar...like what you did in wgrep.) Then run that command instead of printing it, building up various pieces of the functionality one by one: running the exact single command entered (e.g. /bin/ls), hardcoding the path (e.g. the user enters ls and the shell runs /bin/ls), using the path variable, supporting arguments.
Implement interactive mode first, doing your own testing. Once the main functionality is in place (except perhaps full error-checking and robustness), add batch mode and run the provided tests.
When you partially implement something (e.g. without error checking) as part of incremental development, include a comment tagged with something obvious like TODO to remind yourself to come back to it later.
Print stuff to help with testing and debugging. It is not a bad idea to prefix all debugging output with something obvious like DEBUG. That makes it easy to spot both when you are running your program and when you want to remove it (or at least comment it out) for handin.
Keep in mind relevant examples from class — wgrep contained a loop repeatedly reading lines with getline(), and program p4.c (figure 5.4) from the book shows how to redirect standard output to a file in a child process.