Privilege separation in OpenSSH
===============================

Starting a new module in this class: case studies of system design for security.
  Big theme in many of these case studies: privilege separation.
  Privilege separation is also the focus of lab 2.

Problem: exploitable bugs in software.
  Software is complex -> bugs -> exploits.
  What to do about this?

Plan A: find them, fix them, avoid making new ones.
  We will talk about various techniques like this in the next module.
  Much progress here, but for large software systems, not enough.

OpenSSH example from this paper.
  Listening process runs as root, accepts connections on port 22.
    Need root privilege to bind to port 22.
    Need root privilege for subsequent operations.
  Forks off new process for each incoming connection.
    Processes arbitrary network messages.
    But still runs as root (will need to check password, start shell, etc).
  Lots of code that could be buggy.
    zlib for compression over the network.
    Parsing network packets.
    Encryption, key exchange.
    Authentication: checking password, challenge-response, etc.
    Starting a shell.
    Re-keying after some period of time.
  Bugs can be quite damaging.
    Buffer overflows and such, as in lab 1.
    But also leaking sensitive memory contents (like the private key).
    Or accessing the wrong files as root.
  This "hard shell, soft inside" setup makes bugs devastating.

Plan B: build systems that are secure even if there are bugs.
  Can we do anything like this?

Goal: principle of least privilege.
  Each component should have the least privileges needed to do its job.

Big idea: privilege separation.
  Divide up the s/w and data to limit damage from bugs.
  Two related benefits:
    Limit damage from successful exploit -- "least privilege".
    Limit attacker's access to buggy code -- "attack surface".

Privilege separation is difficult.
  Need to come up with a fruitful separation plan.
  Need to isolate (client/server, VMs, containers, processes, SFI, &c).
  Need to allow controlled interaction (narrow API, meaningful security checks).
  Need to retain good performance (few domain crossings on critical path, etc).
  Need to refactor the software to work with separation plan.

System designer must choose the separation plan(s):
  by service / type of data (friend lists vs passwords)
  by user (my e-mail vs your e-mail)
  by bugginess (image resizing vs everything else)
  by exposure to direct attack (network message parsing vs everything else)
  by inherent privilege (hide superuser processes; hide the keys or DB; etc)

Separation plan is highly application-dependent.
  Today's paper: OpenSSH case study.
  Lab 2: web application privilege separation.
  Different plans, though similar underlying principles.

How does OpenSSH choose to do privilege separation?
  Privileged listening process (runs as root) accepts incoming connections.
  Privileged monitor process (runs as root) per connection.
  Unprivileged worker process for doing much of the connection work.
    Network message parsing, crypto protocol, key exchange, compression, ...
    TCP connection is passed to the unprivileged worker process.
  Worker process gets re-spawned after authentication.
    Mostly a detail due to how user ID can be set for a process.
  Eventually spawns a user shell process.
    Worker process stays around, handling the encrypted network session.

What privileged operations need to happen for each connection?
  Protocol requires signing a message with server's host private key.
  Need to check user's password.
  Need to authenticate user with public-key auth (challenge-response).
  Need to allocate a pseudo-terminal for user's login session.
  Need to start shell with user's UID.
  Most of these privileged operations require root privilege on Unix.
    E.g., host private key file is only readable by root.
    This is why OpenSSH used to run as root.

How does the unprivileged worker do all of these operations?
  Privilege separation defines a new interface, between worker and monitor.
    Enumerated list of operations that can be requested.
      [ Ref: https://github.com/openssh/openssh-portable/blob/master/monitor.h ]
    Monitor process will perform just these corresponding operations.
      [ Ref: https://github.com/openssh/openssh-portable/blob/master/monitor.c ]
  Tight control over which operations are allowed at what point.
    E.g., MON_ONCE and MON_AUTH flags.
    Can only do certain operations one time (sign with server's private key).
    Can only do certain operations after supplying valid username (check pw).
  Meaningful security boundary between child worker process and monitor parent.
    Should assume that worker child process is compromised.
    Adversary can issue arbitrary requests to the monitor.
  Monitor process has full root privileges.
    But the operations that it exports are much less damaging.
    Can't get private key, can only sign with it.
    Can't get list of users, can only check a particular user name.
    Can't get password file, can only check a user's password.
    Etc.

What's the attack surface of the worker process?
  Arbitrary network messages.
  Parsing, compression.
  Encryption, key exchange implementations.

What's the attack surface of the monitor process?
  Accepting a network connection.
  Monitor requests (monitor.h).

What's the attack surface of the listening process?
  Almost nothing: new TCP connection coming in.
  No data, just spawns a new monitor process for each accepted connection.

What's the damage if the worker process gets compromised?
  Could try to log in as a user.
    But could have mostly done that by trying to log in via ssh, too.
  Could sign messages using server's host private key.
    Slightly worrisome: could impersonate server for another connection.
    But not for future connections: would need to sign future random msg.
  Post-authentication: could access that user's state.
    But could have done that just by logging in, too.
  Post-authentication: could allocate a pseudo-terminal.
    Not much damage.
  Could send spam or attack other things from the server machine.
    Network access not limited for the unprivileged worker.
  Could run machine out of memory, processes, CPU time, etc.
    Perhaps could enforce memory/forking limits on worker process.

Why does the challenge-response authentication require the monitor?
  Could just have the child process generate a random challenge, check sig.
  A: monitor has to check authentication result, depends on fresh challenge.

What are the mechanisms for isolation and control over sharing?
  Paper uses Unix processes, user IDs (UIDs), file permissions, and fd passing.
  What is setuid(uid)?
    A process can drop its privileges from root to an ordinary uid.
  What is chroot(dirname)?
    Causes / to refer to dirname for this process and descendants,
    so they can't name files outside of dirname.
  How to prevent interference between worker processes?
    P_SUGID prevents one process from debugging another process using ptrace.
    Even though the two processes are running with the same user ID.
  What is FD passing?
    One process opens file descriptor, passes it to another process.
    E.g., monitor allocates a pseudo-terminal, passes fd to worker process.

Challenge: how to set user ID after successful user authentication?
  Cannot pass user ID as a file descriptor.
  OpenSSH plan: kill old worker child process, start new one.
  Need to pass all of the relevant state from old to new process.
  State (section 4.1):
    Encryption/authentication algorithms and keys.
    Network message sequence counters.
    Buffered network data.
    Compression state.

UNIX process-level isolation tools are hard to use.
  Many global name-spaces: files, UIDs, PIDs, ports.
    Each may allow processes to see what others are up to.
    Each is an invitation for bugs or careless set-up.
  No idea of "default to no access".
    Thus hard for designer to reason about what a process can do.
  No fine-grained grants of privilege.
    Can't say "process can read only these three files."
    No way to limit network access.
  chroot() and setuid() can only be used by superuser.
    So non-superusers can't reduce/limit their own privilege.
    Awkward since security suggests *not* running as superuser.

Lab 2 uses Linux containers (LXC).
  Didn't exist when authors designed OpenSSH privilege separation.
  Containers provide the illusion of virtual machines without using virtual machines.
    Containers are more efficient than virtual machines.
  Container is a Linux process, but strongly isolated:
    Limited access to the kernel name spaces.
    Limited access to system calls.
    No access to the host file system.
  Containers behave like a virtual machine:
    Started from a VM image.
    Have their own IP address.
    Have their own file system.
  Lab 2 uses *unprivileged* containers.
    These containers run as non-root user processes.
    If the process inside the container runs as root, still limited privileges.
  More difficult to break out of a container than out of a chrooted process.

How do the authors add privilege separation to existing OpenSSH code?
  Step 1: design separation plan.
    Required some refactoring of the code to expose this boundary.
  Step 2: RPC wrappers for functions at monitor interface boundary.
    Example from paper: PRIVSEP(auth_password(authctxt, pwd)).
    When PRIVSEP is disabled, this is just auth_password(authctxt, pwd).
    When PRIVSEP is enabled, this is mm_auth_password(authctxt, pwd).
    mm_auth_password() is an RPC client stub.
    [ Ref: https://github.com/openssh/openssh-portable/blob/master/monitor_wrap.c ]
  Step 3: send current state to monitor when authentication succeeds.
    [ Ref: https://github.com/openssh/openssh-portable/blob/master/sshd.c
      call to mm_send_keystate() ]
    And correspondingly, unpack this state when monitor starts new worker process.

Challenge: privilege separation for existing libraries.
  Example problem in OpenSSH: pre-authentication worker used zlib for compression.
    zlib allocated its own buffers.
    How to transfer those buffers to new post-authentication worker?
  OpenSSH solution: give zlib a special malloc/free implementation.
    Allocates memory in a shared memory region.
    This shared memory region will get passed to new worker as-is.
  Good: transparent to existing code (like zlib).
    Many privilege separation libraries/toolkits play such games.
  Bad: complicated interface with monitor process.
    But at least monitor doesn't look at this shared memory region.
    Just passes shared memory to new post-authentication worker process.
  Bad: could have corrupted pointers, will cause arbitrary errors in worker.
    Probably not too bad, because requires being able to log in as user.
  Shared memory allocation was removed.
    For cryptographic reasons, pre-authentication compression was undesirable.
    Shared memory allocation code was complex.
    Bugs in it due to undefined behavior, even.
      [ Ref: https://github.com/openssh/openssh-portable/commit/0082fba4efdd492f765ed4c53f0d0fbd3bdbdf7f ]
    Removed in 2016 by not doing compression in pre-auth slave process.

Where should an attacker look for weaknesses?
  Might be bugs in the worker process, good starting point.
  Bugs in OS kernel.
    Exploit kernel bug, become root, break out of isolation.
  Bugs in monitor's authentication code.
    Buffer overflow, logic error, crypto mistakes.
    Incorrectly authenticate as victim user.

How secure is the resulting privilege-separated OpenSSH?
  Section 5.
  One measure of potential security vulnerabilities: lines of code.
    Unprivileged worker is about 2/3 of the code.
    Privileged monitor is about 1/3 of the code.
    Fewer lines of code -> fewer bugs.
  Another measure: attack surface.
    Unprivileged worker: arbitrary network messages.
    Privileged monitor: well-defined interface, few operations, fixed structure.
    Relatively less likely to have memory corruption, etc?
  Empirical case study: many previous vulnerabilities would have been prevented.
    Pre-authentication: integer overflow in network packet processing code.
    Pre-authentication: zlib bug.
    Post-authentication: off-by-one error in channel code.
    Post-authentication: Kerberos ticket passing.
    Privilege separation helps even with post-authentication bugs.
      Used to continue running as root, due to key re-negotiation (need to sign).

What's the performance overhead?
  Described/deployed design: virtually no performance overhead!
    Section 6.
  Fast because privilege separation is not on the critical path for data xfer.
    After login, everything works basically the same as without privsep.
    Minor overhead for establishing new connection / login, but not significant.
  Direct result of carefully choosing the right privilege separation interface.
  Alternative design ("3 process") from section 4.3 would be slower.
    Would avoid the need for complex state transfer.
    Existing worker keeps handling encryption/compression on network connection.
    New worker handles user session.
    Introduces some overhead in steady-state: more context switching.

OpenSSH still uses this basic privilege separation design today.
  Relatively minor changes (like getting rid of shared memory stuff for zlib).

OpenSSH has some unique aspects that play into its privilege separation plan.
  Every connection is largely independent.
    Shared state is in the files that the user can access after logging in.
    Not really OpenSSH's problem.
  One privileged monitor process, relatively few privileged resources.
    Server private key, password database, ability to setuid().

Other systems we will look at are quite different.
  Web applications: lab 2 and Thursday's Google paper.
  Many different resources (user authentication, databases, services, etc).
  Stateful services (e.g., DB) rather than starting a fresh worker each time.
  Dynamic permissions (e.g., Google's user permission tickets).
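
Appendix sketch: the worker/monitor interface idea from earlier in these notes
(enumerated requests, operations allowed only once, operations unlocked only
after a valid username). This is a toy Python model and not OpenSSH's code:
the Monitor class, the request names, the ONCE flag, and the HMAC stand-in for
host-key signing are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical flag, loosely modeled on the MON_ONCE idea from the notes.
ONCE = 0x1  # request is disabled after its first use


class Monitor:
    """Toy privileged side: holds the secrets, exports a narrow interface."""

    def __init__(self, host_key, passwords):
        self._host_key = host_key      # bytes; never returned to the worker
        self._passwords = passwords    # dict user -> password; never returned
        self._user = None              # set by a successful "getpwnam"
        self.authenticated = False
        # Enumerated requests: name -> (handler, flags). Nothing else exists.
        self._table = {
            "sign": (self._sign, ONCE),
            "getpwnam": (self._getpwnam, ONCE),
            "checkpw": (self._checkpw, 0),
        }
        # Requests the worker may issue at this point in the protocol.
        self._permitted = {"sign", "getpwnam"}

    def request(self, name, arg):
        """Single entry point; the caller is assumed to be compromised."""
        if name not in self._table or name not in self._permitted:
            raise PermissionError(f"request {name!r} not permitted")
        handler, flags = self._table[name]
        if flags & ONCE:
            self._permitted.discard(name)
        return handler(arg)

    def _sign(self, msg):
        # HMAC stands in for a real host-key signature: the worker gets the
        # result of using the key, never the key itself.
        return hmac.new(self._host_key, msg, hashlib.sha256).digest()

    def _getpwnam(self, user):
        # Worker can ask about one particular user, not list all users.
        if user in self._passwords:
            self._user = user
            self._permitted.add("checkpw")  # unlocked only by a valid user
            return True
        return False

    def _checkpw(self, pw):
        # Worker learns yes/no; the password database stays in the monitor.
        expected = self._passwords[self._user].encode()
        ok = hmac.compare_digest(expected, pw.encode())
        self.authenticated = self.authenticated or ok
        return ok
```

A worker would call something like mon.request("sign", b"..."), then
mon.request("getpwnam", "alice"), then mon.request("checkpw", "...").
Anything outside the table, or issued at the wrong point, raises
PermissionError. In the real design the requests cross a process boundary;
here they are plain method calls to keep the sketch short.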