6.858 Fall 2013 Lab 3: Server-side sandboxing for executable profiles

Handed out: Wednesday, October 9, 2013

All parts due: Friday, October 18, 2013 (5:00pm)

Note: In lab 4, classmates will review your zoobar code, so please only submit your code once you're completely done with lab 3.

Introduction

In this lab, we will extend the Zoobar web application to support executable profiles, which allow users to use Python code as their profiles. To make a profile, a user saves a Python program in their profile on their Zoobar home page. (To indicate that the profile contains Python code, the first line must be #!python.) Whenever another user views the user's Python profile, the server will execute the Python code in that user's profile to generate the resulting profile output. This will allow users to implement a variety of features in their profiles, such as:

A profile that greets visitors by their user name.
A profile that keeps track of the last several visitors to that profile.
A profile that gives a zoobar to every visitor (limit 1 per minute).
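
The #!python marker rule described above can be sketched as a small check. This is a minimal illustration, not the Zoobar implementation: the function name is_python_profile is hypothetical, and tolerating trailing whitespace on the first line is an assumption.

```python
PYTHON_MARKER = "#!python"

def is_python_profile(profile_text):
    """Return True if the first line of the profile is the #!python marker.

    Hypothetical helper: trailing whitespace on the first line is ignored,
    which is an assumption rather than documented Zoobar behavior.
    """
    lines = profile_text.splitlines()
    return bool(lines) and lines[0].strip() == PYTHON_MARKER
```

A profile that does not start with the marker would be treated as ordinary (non-executable) profile text.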

Supporting this safely requires sandboxing the profile code on the server, so that it cannot perform arbitrary operations or access arbitrary files. On the other hand, this code may need to keep track of persistent data in some files, or to access existing zoobar databases, to function properly. You will use the RPC library and some shim code that we provide to securely sandbox executable profiles.

To fetch the new source code for this lab, use git to commit your lab 2 solutions, fetch the latest version of the course repository, create a local branch called lab3 based on our lab3 branch, and merge your lab 2 solutions onto the lab3 branch:

httpd@vm-6858:~$ cd lab
httpd@vm-6858:~/lab$ git commit -am 'my solution to lab2'
[lab2 f524ff8] my solution to lab2
 1 files changed, 1 insertions(+), 0 deletions(-)
httpd@vm-6858:~/lab$ git pull
Already up-to-date.
httpd@vm-6858:~/lab$ git checkout -b lab3 origin/lab3
Branch lab3 set up to track remote branch lab3 from origin.
Switched to a new branch 'lab3'
httpd@vm-6858:~/lab$ git merge lab2
Merge made by recursive.
...
httpd@vm-6858:~/lab$

As with the previous lab, in some cases, Git may not be able to figure out how to merge your changes with the new lab assignment (e.g. if you modified some of the code that is changed in our lab 3 assignment). In that case, the git merge command will tell you which files are conflicted, and you should first resolve the conflict (by editing the relevant files) and then commit the resulting files with git commit -a.

The new source code includes the following components, which you should familiarize yourself with:

First, the profiles/ directory contains several executable profiles, which you will use as examples throughout this lab:
- profiles/hello-user.py is a simple profile that prints back the name of the visitor when the profile code is executed, along with the current time.
- profiles/visit-tracker.py keeps track of the last time that each visitor looked at the profile, and prints out the last visit time (if any).
- profiles/last-visits.py records the last three visitors to the profile, and prints them out.
- profiles/xfer-tracker.py prints out the last zoobar transfer between the profile owner and the visitor.
- profiles/granter.py gives the visitor one zoobar. To make sure visitors can't quickly steal all zoobars from a user, this profile grants a zoobar only if the profile owner has some zoobars left, the visitor has fewer than 20 zoobars, and it has been at least a minute since the last time that visitor got a zoobar from this profile.
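
The three conditions that granter.py checks can be expressed as a single predicate. The sketch below is illustrative only: the function name, parameter names, and constants are hypothetical and are not taken from profiles/granter.py.

```python
MAX_VISITOR_BALANCE = 20   # visitor must have fewer than this many zoobars
MIN_INTERVAL_SECS = 60     # at least a minute between grants to one visitor

def should_grant(owner_balance, visitor_balance, last_grant_time, now):
    """Return True only if all three granting conditions hold.

    last_grant_time is None if this visitor has never received a grant;
    otherwise it and `now` are timestamps in seconds.
    """
    if owner_balance <= 0:
        return False                      # owner has no zoobars left
    if visitor_balance >= MAX_VISITOR_BALANCE:
        return False                      # visitor already has enough
    if last_grant_time is not None and now - last_grant_time < MIN_INTERVAL_SECS:
        return False                      # granted too recently
    return True
```
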
Second, zoobar/sandboxlib.py is a Python module that implements sandboxing for untrusted Python profile code; see the Sandbox class, and the run() method which executes a specified function in the sandbox. The run method works by forking off a separate process and calling setresuid in the child process before executing the untrusted code, so that the untrusted code does not have any privileges. The parent process reads the output from the child process (i.e., the untrusted code) and returns this output to the caller of run(). If the child doesn't exit after a short timeout (5 seconds by default), the parent process kills the child.
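
The fork, drop-privileges, read-output, and kill-on-timeout pattern described above can be sketched as follows. This is not the code in zoobar/sandboxlib.py: the function name run_in_child and its parameters are hypothetical, the sketch assumes Python 3, and it leaves out chroot, the lockfile, and resource limits.

```python
import os
import select
import signal
import sys
import time

def run_in_child(func, timeout=5.0, uid=None):
    """Run func() in a forked child and return what it wrote to fd 1.

    If uid is given, the child calls setresuid before running func; that
    call only succeeds when the parent is running as root. The child is
    killed if it has not finished after `timeout` seconds.
    """
    sys.stdout.flush()              # avoid duplicating buffered parent output
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: point stdout at the pipe, drop privileges, run the code.
        status = 1
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            if uid is not None:
                os.setresuid(uid, uid, uid)
            func()
            sys.stdout.flush()
            status = 0
        finally:
            os._exit(status)        # never return into the parent's code

    # Parent: collect output until EOF or until the deadline passes.
    os.close(write_fd)
    chunks = []
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        ready, _, _ = select.select([read_fd], [], [], remaining)
        if not ready:
            break                   # timed out waiting for output
        data = os.read(read_fd, 4096)
        if not data:
            break                   # EOF: the child closed its end
        chunks.append(data)
    os.close(read_fd)
    try:
        os.kill(pid, signal.SIGKILL)   # harmless if the child already exited
    except ProcessLookupError:
        pass
    os.waitpid(pid, 0)              # reap the child
    return b"".join(chunks)
```
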
Sandbox.run() also uses chroot to restrict the untrusted code to a specific directory, passed as an argument to the Sandbox constructor. This allows the untrusted profile code to perform some limited file system access, but the creator of Sandbox gets to decide what directory is accessible to the profile code.
Sandbox uses just one user ID for running untrusted profiles. This means that it's important that at most one profile be executing in the sandbox at a time. Otherwise, one sandboxed process could tamper with another sandboxed process, since they both have the same user ID! To enforce this guarantee, Sandbox uses a lockfile; whenever it tries to run a sandbox, it first locks the lockfile, and releases it only after the sandboxed process has exited. If two processes try to run some sandboxed code at the same time, only one will get the lockfile at a time. It's important that all users of Sandbox specify the same lockfile name if they use the same UID.
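
One common way to get this kind of mutual exclusion on Unix is an advisory lock taken with flock. The sketch below shows the idea under that assumption; the actual locking call used by Sandbox may differ, and the name sandbox_lock is hypothetical.

```python
import contextlib
import fcntl
import os

@contextlib.contextmanager
def sandbox_lock(lockfile_path):
    """Hold an exclusive advisory lock on lockfile_path inside a with-block."""
    fd = os.open(lockfile_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)   # blocks until no one else holds it
        yield
    finally:
        os.close(fd)                     # closing the descriptor releases the lock
```

Every caller that shares the sandbox user ID would wrap its sandboxed run in a with-block on the same path, so a second caller waits until the first has finished.
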
How does Sandbox know that some sandboxed code has fully exited and that it is safe to reuse the user ID to run a different user's profile? After all, the untrusted code could have forked off another process that is waiting for some other profile to start running with the same user ID. To prevent this, Sandbox uses Unix resource limits: it calls setrlimit to limit the number of processes with a given user ID, so that the sandboxed code simply cannot fork. This means that, after the parent process kills the child process (or notices that it has exited), it can safely conclude there are no remaining processes with that user ID.
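
In Python, a process limit of this kind can be set with the standard resource module. The sketch below is an assumption about how such a limit might be applied, not a copy of sandboxlib.py: the function name and the choice of a limit of zero are hypothetical.

```python
import resource

def forbid_new_processes():
    """Set the soft and hard RLIMIT_NPROC limits to zero for this process.

    Intended to be called in the sandboxed child. On Linux the limit is
    not enforced for sufficiently privileged processes, so it only takes
    effect once the child has also given up root.
    """
    resource.setrlimit(resource.RLIMIT_NPROC, (0, 0))
```

Because the hard limit is lowered as well, an unprivileged process cannot raise the limit again afterwards.
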
The final piece of code is zoobar/profile-server.py: an RPC server that accepts requests to run some user's profile code, and returns the output from executing that code.
This server uses sandboxlib.py to create a Sandbox and execute the profile code in it (via the run_profile function). profile-server.py also sets up an RPC server that allows the profile code to get access to things outside of the sandbox, such as the zoobar balances of different users. The ProfileAPIServer implements this interface; profile-server.py forks off a separate process to run the ProfileAPIServer, and also passes an RPC client connected to this server to the sandboxed profile code.
Because profile-server.py uses sandboxlib.py, which in turn needs to call setresuid to sandbox some process, the main profile-server.py process needs to run as root. As an aside, this is a somewhat ironic limitation of Unix mechanisms: if you want to improve your security by running untrusted code with a different user ID, you are forced to run some part of your code as root, which is a dangerous thing to do from a security perspective.

Python profiles with privilege separation

To get started, you will need to add profile-server.py to your zook.conf and modify chroot-setup.sh to create a directory for its socket, /jail/profilesvc. Remember that profile-server.py needs to run as root, so put 0 for the uid in its zook.conf entry.

Exercise 1. Add profile-server.py to your web server. Change the uid value in ProfileServer.run_rpc() from 0 to some other value compatible with your design from lab 2.

Make sure that your Zoobar site can support all five profiles. Depending on how you implemented privilege separation in lab 2, you may need to adjust how ProfileAPIServer implements rpc_get_xfers or rpc_xfer.

Run sudo make check to verify that your modified configuration passes our tests. The test case (see check_lab3.py) creates some user accounts, stores one of the Python profiles in the profile of one user, has another user view that profile, and checks that the other user sees the right output.

If the make check tests fail, check /tmp/html.out for the HTML output of the profiles and /tmp/zookld.out for the output of the server. Server errors will usually appear in /tmp/zookld.out.

The next problem we need to solve is that some of the user profiles store data in files; for example, see last-visits.py and visit-tracker.py. However, all of the user profiles currently run with access to the same files, because ProfileServer.rpc_run() sets userdir to /tmp and passes that as the directory to Sandbox (which in turn chroots the profile code to that directory). As a result, one user's profile can corrupt the files stored by another user's profile.

Exercise 2. Modify rpc_run in profile-server.py so that each user's profile has access to its own files, and cannot tamper with the files of other user profiles.

Remember to consider the possibility of usernames with special characters. Also be sure to protect all of these files from other services on the same machine (such as the zookfs that serves static files).
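
To illustrate the concern about special characters: a username such as ../etc must not be used directly as a path component. One possible approach, shown here only as a sketch, is to encode the username so that the directory name contains nothing but hexadecimal digits. The function name user_dir, the "u" prefix, and the base directory argument are all hypothetical, and creating the directory with suitable ownership and permissions is still left to you.

```python
import binascii
import os

def user_dir(base_dir, username):
    """Map a username to a path under base_dir that uses only [0-9a-f].

    The "u" prefix keeps an empty username from mapping to base_dir itself.
    """
    encoded = binascii.hexlify(username.encode("utf-8")).decode("ascii")
    return os.path.join(base_dir, "u" + encoded)
```

Because the encoding is reversible, two different usernames never map to the same directory.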

Run make check to see whether your implementation passes our test cases.

Finally, recall that all of profile-server.py currently runs as root because it needs to create a sandbox. This is dangerous, and we would like to reduce the amount of code in profile-server.py that runs as root. In particular, the ProfileAPIServer that runs as part of profile-server.py does not strictly need to run as root (it does not invoke the sandbox), and in fact, it might be the most vulnerable part of the code to attacks, because it accepts RPC commands from the untrusted profile code!

Exercise 3. Change ProfileAPIServer in profile-server.py to avoid running as root. Recall that profile-server.py forks off a separate child process to run ProfileAPIServer, so you can switch to a different user ID (and group ID, if necessary) in ProfileAPIServer.__init__.
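
When switching away from root, the order of the calls matters: the group IDs have to be changed while the process still has root privileges, and the user ID is changed last. The following is a generic sketch of that pattern, not code from profile-server.py; the function name drop_privileges is hypothetical, and which uid and gid to use depends on your design.

```python
import os

def drop_privileges(uid, gid):
    """Permanently switch the current process from root to uid and gid.

    Must be called while still running as root.
    """
    os.setgroups([])              # clear supplementary groups
    os.setresgid(gid, gid, gid)   # change group IDs first, while still root
    os.setresuid(uid, uid, uid)   # after this, root cannot be regained
```

Anything that requires root has to happen before this function is called.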

You will need to make sure that rpc_xfer can still perform transfers from the profile owner's account. It may be helpful to obtain the correct token before giving up root privileges.

As before, use make check to ensure your code passes our tests.

You are now done with the basic sandbox.

Exercise 4. Think of some interesting features that you could implement using Python server-side profiles, possibly in combination with extending the sandboxing infrastructure (e.g., providing an API for sending messages between users, or for sharing files between users). For example, can you build profile code that analyzes the social graph of who visited whose profile, or an equivalent to a Facebook wall, all using untrusted profile code?

Write a profile that demonstrates this functionality in profiles/my-profile.py. Describe what your profile is implementing in a comment at the top of the profile source code. Make any changes to your ProfileAPIServer necessary to support your feature.

Challenge! (optional) Now that profiles contain Python code, and can give away the user's zoobars, it's important that the user's profile code is not modified by an attacker, and only the correct profile code is executed by profile-server.py.

Create an RPC server that is in charge of modifying user profiles, and which requires a valid user token in order to modify a user's profile. Change the rest of the Zoobar application code to modify user profiles via this RPC server. Set permissions on the profile database so that the rest of the Zoobar application cannot modify profiles directly. Change profile-server.py to read profile code directly from the profile database, instead of accepting it as input to the run RPC call.

make check only does a cursory inspection of the person db, so it may be that your solution is correct but the test fails, or that the test succeeds but your solution is wrong. Therefore, if you've completed the challenge and want us to grade it, add an empty file named challenge3.txt to the lab directory so we know to take a look at your solution.

You are done! Run make submit to upload lab3-handin.tar.gz to the submission web site.
