Xv6, a simple Unix-like teaching operating system

Xv6 is a teaching operating system developed in the summer of 2006 for MIT's operating systems course, “6.828: Operating Systems Engineering.” We used it for 6.828 in Fall 2006 and Fall 2007 and are using it this semester (Fall 2008). We hope that xv6 will be useful in other courses too. This page collects resources to aid the use of xv6 in other courses.

History and Background

For many years, MIT had no operating systems course. In the fall of 2002, Frans Kaashoek, Josh Cates, and Emil Sit created a new, experimental course (6.097) to teach operating systems engineering. In the course lectures, the class worked through Sixth Edition Unix (aka V6) using John Lions's famous commentary. In the lab assignments, students wrote most of an exokernel operating system, eventually named Jos, for the Intel x86. Exposing students to multiple systems–V6 and Jos–helped develop a sense of the spectrum of operating system designs. In the fall of 2003, the experimental 6.097 became the official course 6.828; the course has been offered each fall since then.

V6 presented pedagogic challenges from the start. Students doubted the relevance of an obsolete 30-year-old operating system written in an obsolete programming language (pre-K&R C) running on obsolete hardware (the PDP-11). Students also struggled to learn the low-level details of two different architectures (the PDP-11 and the Intel x86) at the same time. By the summer of 2006, we had decided to replace V6 with a new operating system, xv6, modeled on V6 but written in ANSI C and running on multiprocessor Intel x86 machines. Xv6's use of the x86 makes it more relevant to students' experience than V6 was and unifies the course around a single architecture. Adding multiprocessor support requires handling concurrency head-on with locks and threads (instead of using special-case uniprocessor solutions such as enabling and disabling interrupts) and also makes xv6 more relevant. Finally, writing a new system allowed us to write cleaner versions of the rougher parts of V6, like the scheduler and file system.

6.828 substituted xv6 for V6 in the fall of 2006. Based on that experience, we cleaned up rough patches of xv6 for the course in the fall of 2007. Since then, xv6 has stabilized, so we are making it available in the hopes that others will find it useful too.

6.828 uses both xv6 and Jos. Courses taught at UCLA, NYU, Peking University, Stanford, Tsinghua, and the University of Texas (Austin) have used Jos without xv6; we believe other courses could use xv6 without Jos, though we are not aware of any that have.

Xv6 sources

The latest xv6 is xv6-rev2.tar.gz. We distribute the sources in electronic form but also as a printed booklet with line numbers that keep everyone together during lectures. The booklet is available as xv6-rev2.pdf. The xv6 source code is licensed under the traditional MIT license; see the LICENSE file in the source distribution.

Xv6 compiles using the GNU C compiler, targeted at the x86 using ELF binaries. On BSD and Linux systems, you can use the native compilers; on OS X, which doesn't use ELF binaries, you must use a cross-compiler. Xv6 does boot on real hardware, but typically we run it using the Bochs emulator. Both the GCC cross-compiler and Bochs can be found on the 6.828 tools page.

Lectures

In 6.828, the lectures in the first half of the course introduce the PC hardware, the Intel x86, and then xv6. The lectures in the second half consider advanced topics using research papers; for some, xv6 serves as a useful base for making discussions concrete. This section describes a typical 6.828 lecture schedule, linking to lecture notes and homework. A course using only xv6 (not Jos) will need to adapt a few of the lectures, but we hope these are a useful starting point.

Lecture 1. Operating systems

The first lecture introduces both the general topic of operating systems and the specific approach of 6.828. After defining “operating system,” the lecture examines the implementation of a Unix shell to look at the details of the traditional Unix system call interface. This is relevant to both xv6 and Jos: in the final Jos labs, students implement a Unix-like interface, culminating in a Unix shell.
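The core of a shell's use of the system call interface can be sketched in a few lines. The following is a minimal sketch, not the shell examined in lecture: it assumes a POSIX system, and run_command is a hypothetical name introduced here for illustration. It forks a child, execs the requested program in the child, and waits for it in the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run one command the way a minimal Unix shell would: fork a child,
 * exec the program in the child, and wait for it in the parent.
 * Returns the child's exit status, or -1 if fork or wait fails or the
 * child did not exit normally.
 * run_command is a hypothetical name, not taken from the lecture. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed */
    if (pid == 0) {
        execvp(argv[0], argv);     /* returns only if exec failed */
        _exit(127);                /* conventional "command not found" status */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real shell adds parsing, I/O redirection, and pipes on top of this loop, but each of those features is built from the same small set of calls.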

lecture notes OS abstractions slides

Lecture 2. PC hardware and x86 programming

This lecture introduces the PC architecture, the 16- and 32-bit x86, the stack, and the GCC x86 calling conventions. It also introduces the pieces of a typical C tool chain–compiler, assembler, linker, loader–and the Bochs emulator.

Reading: PC Assembly Language

Homework: familiarize yourself with Bochs

lecture notes x86 intro slides homework

Lecture 3. Operating system organization

This lecture continues Lecture 1's discussion of what an operating system does. An operating system provides a “virtual computer” interface to user space programs. At a high level, the main job of the operating system is to implement that interface using the physical computer it runs on.

The lecture discusses four approaches to that job: monolithic operating systems, microkernels, virtual machines, and exokernels. Exokernels might not be worth mentioning except that the Jos labs are built around one.

Reading: Engler et al., Exokernel: An Operating System Architecture for Application-Level Resource Management

lecture notes

Lecture 4. Address spaces using segmentation

This is the first lecture that uses xv6. It introduces the idea of address spaces and the details of the x86 segmentation hardware. It makes the discussion concrete by reading the xv6 source code and watching xv6 execute using the Bochs simulator.
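The essence of segment translation is a bounds check followed by an addition. The following is a simplified sketch, not xv6 code: the struct and function names are hypothetical, and real x86 descriptors carry type, privilege, and granularity bits that are left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of a segment: real x86 descriptors also carry
 * type, privilege, and granularity bits that are omitted here. */
struct segment {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* largest valid offset within the segment */
};

/* Translate a segment-relative offset to a linear address.
 * Returns false (the hardware would raise a fault instead) when the
 * offset lies beyond the segment limit. */
bool seg_translate(const struct segment *seg, uint32_t offset, uint32_t *linear)
{
    if (offset > seg->limit)
        return false;
    *linear = seg->base + offset;
    return true;
}
```

Giving each process a different base and limit is enough to place processes in disjoint regions of physical memory while each one sees addresses starting at zero.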

Reading: x86 MMU handout, xv6: bootasm.S, bootother.S, bootmain.c, main.c, init.c, and setupsegs in proc.c.

Homework: Bochs stack introduction

lecture notes x86 virtual memory slides homework

Lecture 5. Address spaces using page tables

This lecture continues the discussion of address spaces, examining the other x86 virtual memory mechanism: page tables. Xv6 does not use page tables, so there is no xv6 here. Instead, the lecture uses Jos as a concrete example. An xv6-only course might skip or shorten this discussion.
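As a small illustration of the mechanism, 32-bit x86 paging with 4 KB pages splits each linear address into a page directory index, a page table index, and a page offset. The helper names below are hypothetical and are not taken from Jos or xv6; this is a sketch of the address arithmetic only, not of a page table walk.

```c
#include <stdint.h>

/* 32-bit x86 paging with 4 KB pages splits a linear address into:
 *   bits 31..22  page directory index (10 bits)
 *   bits 21..12  page table index     (10 bits)
 *   bits 11..0   offset within page   (12 bits)
 * The function names are hypothetical helpers for illustration. */
static inline uint32_t pd_index(uint32_t la) { return (la >> 22) & 0x3ffu; }
static inline uint32_t pt_index(uint32_t la) { return (la >> 12) & 0x3ffu; }
static inline uint32_t page_off(uint32_t la) { return la & 0xfffu; }

/* Rebuild a linear address from its three fields. */
static inline uint32_t make_la(uint32_t pdx, uint32_t ptx, uint32_t off)
{
    return (pdx << 22) | (ptx << 12) | off;
}
```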

Reading: x86 manual excerpts

Homework: GDT exercises (XXX not appropriate here; should be in Lecture 4)

lecture notes

Lecture 6. Interrupts and exceptions

How does a user program invoke the operating system kernel? How does the kernel return to the user program? What happens when a hardware device needs attention? This lecture answers these questions by explaining interrupt and exception handling.

It explains the x86 trap setup mechanisms and then examines their use in xv6's SETGATE (mmu.h), tvinit (trap.c), idtinit (trap.c), vectors.pl, and vectors.S.
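To make the trap setup concrete, the following sketch packs a 32-bit x86 interrupt or trap gate into its 8-byte descriptor form. It is not xv6's SETGATE macro: make_gate and the enum names are hypothetical, and the sketch always marks the gate present. The descriptor privilege level matters because a gate reached by a user-mode int instruction, such as a system call gate, must have DPL 3.

```c
#include <stdint.h>

enum { GATE_INTERRUPT = 0xE, GATE_TRAP = 0xF };  /* 32-bit gate types */

/* Pack a present 32-bit interrupt or trap gate into its 8-byte form.
 *   bits  0..15  handler offset, low half
 *   bits 16..31  code segment selector
 *   bits 40..43  gate type
 *   bits 45..46  descriptor privilege level (DPL)
 *   bit  47      present
 *   bits 48..63  handler offset, high half
 * make_gate is a hypothetical helper, not xv6's SETGATE. */
uint64_t make_gate(uint32_t handler, uint16_t selector, unsigned type, unsigned dpl)
{
    uint64_t d = 0;
    d |= (uint64_t)(handler & 0xffffu);
    d |= (uint64_t)selector << 16;
    d |= (uint64_t)(type & 0xfu) << 40;
    d |= (uint64_t)(dpl & 0x3u) << 45;
    d |= (uint64_t)1 << 47;
    d |= (uint64_t)(handler >> 16) << 48;
    return d;
}
```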

It then traces through a call to the system call open: init.c, usys.S, vector48 and alltraps (vectors.S), trap (trap.c), syscall (syscall.c), sys_open (sysfile.c), and fetcharg, fetchint, argint, argptr, argstr (syscall.c).
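The two ideas at the end of that trace, checked argument fetching and table-driven dispatch, can be modeled outside the kernel. The sketch below is a toy model rather than xv6's code: user memory is a plain byte array, all names (uspace, fetch_int, arg_int, sys_demo_add, do_syscall) are hypothetical, and it assumes the arguments sit on the user stack just above a 4-byte return address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy "user address space": a byte array of known size. */
struct uspace {
    const uint8_t *mem;
    uint32_t size;
};

/* Copy a 32-bit integer from user address addr, refusing any access
 * that is not entirely inside the user's memory. Returns 0 or -1. */
int fetch_int(const struct uspace *u, uint32_t addr, int32_t *out)
{
    if (addr >= u->size || u->size - addr < sizeof(*out))
        return -1;
    memcpy(out, u->mem + addr, sizeof(*out));
    return 0;
}

/* Fetch the n-th 32-bit argument. Assumes the user stack pointer
 * points at a return address, so arguments start 4 bytes above it. */
int arg_int(const struct uspace *u, uint32_t user_sp, unsigned n, int32_t *out)
{
    return fetch_int(u, user_sp + 4 + 4 * n, out);
}

typedef int32_t (*syscall_fn)(const struct uspace *u, uint32_t user_sp);

/* Hypothetical handler: returns the sum of its two integer arguments. */
static int32_t sys_demo_add(const struct uspace *u, uint32_t user_sp)
{
    int32_t a, b;
    if (arg_int(u, user_sp, 0, &a) < 0 || arg_int(u, user_sp, 1, &b) < 0)
        return -1;
    return a + b;
}

static const syscall_fn syscalls[] = { [1] = sys_demo_add };

/* Dispatch on the system call number; unknown numbers fail with -1. */
int32_t do_syscall(const struct uspace *u, uint32_t num, uint32_t user_sp)
{
    if (num >= sizeof(syscalls) / sizeof(syscalls[0]) || syscalls[num] == NULL)
        return -1;
    return syscalls[num](u, user_sp);
}
```

The point of the bounds check is that the kernel must never trust a pointer supplied by a user program, including the user stack pointer itself.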

The interrupt controller, briefly: pic_init and pic_enable (picirq.c). The timer and keyboard, briefly: timer_init (timer.c), console_init (console.c). Enabling and disabling of interrupts.

Reading: x86 manual excerpts, xv6: trapasm.S, trap.c, syscall.c, and usys.S. Skim lapic.c, ioapic.c, picirq.c.

Homework: Explain the 35 words on the top of the stack at first invocation of syscall.

lecture notes homework

Lecture 7. Multiprocessors and locking

This lecture introduces the problems of coordination and synchronization on a multiprocessor and then the solution of mutual exclusion locks. Topics include atomic instructions, test-and-set locks, lock granularity, and (the mistake of) recursive locks.

Although xv6 user programs cannot share memory, the xv6 kernel itself is a program with multiple threads executing concurrently and sharing memory. Illustration: the xv6 scheduler's proc_table_lock (proc.c) and the spin lock implementation (spinlock.c).
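A test-and-set lock can be sketched in portable C11. This is not the code in spinlock.c: it uses the standard atomic_flag type in place of an x86 atomic exchange instruction, it does not disable interrupts, and the names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct spinlock {
    atomic_flag locked;   /* set while some thread holds the lock */
};

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* One test-and-set attempt. The operation atomically sets the flag and
 * returns its previous value, so "previously clear" means we got it. */
bool spin_try_acquire(struct spinlock *lk)
{
    return !atomic_flag_test_and_set_explicit(&lk->locked, memory_order_acquire);
}

/* Spin until the lock is acquired. */
void spin_acquire(struct spinlock *lk)
{
    while (!spin_try_acquire(lk))
        ;  /* busy-wait */
}

/* Release the lock so another thread's test-and-set can succeed. */
void spin_release(struct spinlock *lk)
{
    atomic_flag_clear_explicit(&lk->locked, memory_order_release);
}
```

Note that a thread calling spin_acquire on a lock it already holds spins forever, which is one way to see why this simple design is not recursive.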

Reading: xv6: spinlock.c. Skim mp.c.

Homework: Interaction between locking and interrupts. Try not disabling interrupts in the disk driver and watch xv6 break.

lecture notes homework

Lecture 8. Threads, processes and context switching

The last lecture introduced some of the issues in writing threaded programs, using xv6's processes as an example. This lecture introduces the issues in implementing threads, continuing to use xv6 as the example.

The lecture defines a thread of computation as a register set and a stack. A process is an address space plus one or more threads of computation sharing that address space. Thus the xv6 kernel can be viewed as a single process with many threads (each user process) executing concurrently.
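The definition of a thread as a register set plus a stack can be demonstrated in user space. The sketch below is only an analogy to swtch.S, not xv6 code: it assumes a Linux/glibc system that provides the ucontext functions, and all names are hypothetical. A worker thread of computation gets its own stack, and swapcontext saves one register set and loads another.

```c
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];   /* the worker's private stack */
static int trace[4];                   /* order in which steps ran */
static int ntrace;

static void worker(void)
{
    trace[ntrace++] = 2;
    swapcontext(&worker_ctx, &main_ctx);   /* yield back to main */
    trace[ntrace++] = 4;
    /* returning from here resumes uc_link, which is main_ctx */
}

/* Switch back and forth between main and the worker, copy the order of
 * the recorded steps into out, and return how many were recorded. */
int run_switch_demo(int out[4])
{
    ntrace = 0;
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof(worker_stack);
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);

    trace[ntrace++] = 1;
    swapcontext(&main_ctx, &worker_ctx);   /* run worker until it yields */
    trace[ntrace++] = 3;
    swapcontext(&main_ctx, &worker_ctx);   /* resume worker after its yield */

    for (int i = 0; i < ntrace; i++)
        out[i] = trace[i];
    return ntrace;
}
```

The steps interleave between the two stacks even though only one flow of control runs at a time, which is the same effect a kernel context switch produces.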

Illustrations: thread switching (swtch.S), scheduler (proc.c), sys_fork (sysproc.c)

Reading: proc.c, swtch.S, sys_fork (sysproc.c)

Homework: trace through stack switching.

lecture notes (need to be updated to use swtch) homework

Lecture 9. Processes and coordination

This lecture introduces the idea of sequence coordination and then examines the particular solution illustrated by sleep and wakeup (proc.c). It introduces and refines a simple producer/consumer queue to illustrate the need for sleep and wakeup and then the sleep and wakeup implementations themselves.
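The bookkeeping behind sleep and wakeup can be modeled without a scheduler. The following is a toy, single-threaded model rather than the code in proc.c: it omits locking and the actual switch to the scheduler, and the names (model_sleep, model_wakeup, and the table layout) are hypothetical. A sleeping process records the channel it waits on, and wakeup makes every process sleeping on that channel runnable.

```c
#include <stddef.h>

enum pstate { UNUSED, RUNNABLE, SLEEPING };

struct proc {
    enum pstate state;
    const void *chan;    /* what the process is waiting for, if SLEEPING */
};

#define NPROC 4
struct proc ptable[NPROC];

/* Record that p is waiting on chan. A real kernel would then switch to
 * the scheduler; this model only updates the bookkeeping. */
void model_sleep(struct proc *p, const void *chan)
{
    p->chan = chan;
    p->state = SLEEPING;
}

/* Make every process sleeping on chan runnable; returns how many woke. */
int model_wakeup(const void *chan)
{
    int n = 0;
    for (struct proc *p = ptable; p < &ptable[NPROC]; p++) {
        if (p->state == SLEEPING && p->chan == chan) {
            p->state = RUNNABLE;
            p->chan = NULL;
            n++;
        }
    }
    return n;
}
```

Because the model has no lock, it cannot show the lost-wakeup race between checking a condition and going to sleep; that race is what the lock arguments in the real implementation exist to prevent.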

Reading: proc.c, sys_exec, sys_sbrk, sys_wait, sys_kill (sysproc.c).

Homework: Explain how sleep and wakeup would break without proc_table_lock. Explain how devices would break without the second lock argument to sleep.

lecture notes homework

Lecture 10. Files and disk I/O

This is the first of three file system lectures. This lecture introduces the basic file system interface and then considers the on-disk layout of individual files and the free block bitmap.
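Both on-disk structures reduce to simple index arithmetic. The sketch below is illustrative only and is not taken from fs.c: the block size and the number of direct pointers are assumed values, the function names are hypothetical, and it models one level of indirection. The first function finds where the block for a file offset is recorded; the second allocates from a free block bitmap.

```c
#include <stdint.h>

#define BSIZE     512u                    /* assumed block size in bytes */
#define NDIRECT   12u                     /* assumed direct pointers per inode */
#define NINDIRECT (BSIZE / sizeof(uint32_t))

/* Find where the block holding file offset off is recorded.
 * Returns 0 for a direct slot in the inode, 1 for a slot in the
 * indirect block, or -1 if off is past the maximum file size.
 * On success *slot receives the index within that array. */
int locate_block(uint32_t off, uint32_t *slot)
{
    uint32_t bn = off / BSIZE;
    if (bn < NDIRECT) {
        *slot = bn;
        return 0;
    }
    bn -= NDIRECT;
    if (bn < NINDIRECT) {
        *slot = bn;
        return 1;
    }
    return -1;
}

/* Allocate the first free block in a bitmap covering nblocks blocks
 * (bit set = in use). Returns the block number, or -1 if none is free. */
int bitmap_alloc(uint8_t *bitmap, uint32_t nblocks)
{
    for (uint32_t b = 0; b < nblocks; b++) {
        uint8_t mask = (uint8_t)(1u << (b % 8));
        if ((bitmap[b / 8] & mask) == 0) {
            bitmap[b / 8] |= mask;
            return (int)b;
        }
    }
    return -1;
}
```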

Reading: iread, iwrite, fileread, filewrite, wdir, mknod1, and code related to these calls in fs.c, bio.c, ide.c, and file.c.

Homework: Add print to bwrite to trace every disk write. Explain the disk writes caused by some simple shell commands.

lecture notes homework

Lecture 11. Naming

The last lecture discussed on-disk file system representation. This lecture covers the implementation of file system paths (namei in fs.c) and also discusses the security problems of a shared /tmp and symbolic links.
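Path resolution works one component at a time: extract the next name, look it up in the current directory, and repeat on the rest of the path. The sketch below shows only the extraction step. It is not the namei code: next_elem is a hypothetical helper, and the maximum name length is an assumed value.

```c
#include <string.h>

#define NAME_MAX_LEN 14   /* assumed maximum length of one path component */

/* Copy the next path element into name (NUL-terminated, truncated to
 * NAME_MAX_LEN characters) and return a pointer to the rest of the
 * path with leading slashes removed. Returns NULL when no element
 * remains. name must have room for NAME_MAX_LEN + 1 bytes. */
const char *next_elem(const char *path, char *name)
{
    while (*path == '/')
        path++;
    if (*path == '\0')
        return NULL;
    const char *start = path;
    while (*path != '/' && *path != '\0')
        path++;
    size_t len = (size_t)(path - start);
    if (len > NAME_MAX_LEN)
        len = NAME_MAX_LEN;
    memcpy(name, start, len);
    name[len] = '\0';
    while (*path == '/')
        path++;
    return path;
}
```

Calling next_elem repeatedly on "/usr//bin/ls" yields "usr", "bin", and "ls" in turn; a full resolver would look each name up in the directory reached so far before moving on.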

Understanding exec (exec.c) is left as an exercise.

Reading: namei in fs.c, sysfile.c, file.c.

Homework: Explain how to implement symbolic links in xv6.

lecture notes homework

Lecture 12. High-performance file systems

This lecture is the first of the research paper-based lectures. It discusses the “soft updates” paper, using xv6 as a concrete example.

Feedback

If you are interested in using xv6 or have used xv6 in a course, we would love to hear from you. If there's anything that we can do to make xv6 easier to adopt, we'd like to hear about it. We'd also be interested to hear what worked well and what didn't.

Russ Cox (rsc@swtch.com)
Frans Kaashoek (kaashoek@mit.edu)
Robert Morris (rtm@mit.edu)

You can reach all of us at 6.828-staff@pdos.csail.mit.edu.