TPMs and integrity measurement
==============================

What's the problem this paper is trying to address?
  Users are communicating with a server over a network.
  Users expect the server to be intact, and ideally would like some assurance.
  Worry: the server might be compromised by an attacker, misconfigured, etc.
  In general, this is the problem of server integrity.

Defining integrity, or a model for integrity, is difficult.
  Quick aside: how do we define secrecy?
    Some data d is secret if it is not revealed to unintended users.
    Mechanisms for enforcing secrecy are, of course, another matter:
    access control, encryption, etc.

Biba integrity model.
  A process is high-integrity if all of its inputs were high-integrity.
  Files have high integrity if they were modified only by high-integrity processes.
  Need to assign integrity to inputs (e.g., keyboard, network).
  Actually used in practice, in some situations.
    LOMAC for FreeBSD.
    SELinux could be configured in a similar way.
    Idea: keyboard is high integrity, network is low integrity.
    System binaries cannot be modified by low-integrity processes.
    Physical analogue used in military systems: air gaps.
  Problem: too inflexible.
    It's often safe to accept low-integrity inputs (but not always).
    Convenience wins out: want to upgrade servers remotely, ...

Clark-Wilson integrity model.
  A system has integrity if:
    1. The system started out with integrity.
    2. All changes were made through approved transformations.
  What do these mean?
    1. Initially installed from an untampered CD, ...
    2. All requests went to approved servers, e.g., Apache.
  How realistic is this model?
    Big assumption: approved transformations are safe.
    This is the opposite of Biba's problem: not all inputs are safe!
    But, in practice, this is how many systems are constructed today.
    Linux assumes all setuid-root binaries and root servers are "approved".

For data, can attest to integrity through cryptographic techniques.
  E.g., compute a hash, signature, MAC, etc.
  However, it is difficult to preserve a signature on computed data.
  Some theoretical results exist, but they are not nearly practical yet.

This paper: consider integrity to be a property of executable code.
  I.e., if the system is executing high-integrity code, it has integrity.
  How realistic is this model?
    Much like Clark-Wilson, doesn't capture buffer overflows or other bugs.
    Also doesn't talk about the integrity of the data (across reboots).
    However, it does track the integrity of code across reboots (if executed).

How to track the integrity of code?
  Every time some piece of code is executed, measure it (compute a hash).
  Secure boot: check whether the hash is allowed, and if it isn't, stop.
  Authenticated boot: record the hash in a tamper-proof manner.
  Why is authenticated boot interesting?
    Remote clients may be able to verify what's running on a server.
    Original motivation: DRM for music.

Tripwire (e.g., Athena dialup machines).
  Periodically checksum all of the files on the system.
  Check whether they match a previously recorded set of checksums.
  Signal an integrity violation if there is a mismatch.
  What would this catch or not catch?

How to record hashes in a tamper-proof way?
  x86 hardware: TPM chip.

      DRAM                  /-- BIOS
       |                    |
      CPU --- Northbridge --+-- TPM

  TPM chip has an ephemeral set of registers (PCR0, PCR1, ...) and a key.
  Supported operations:
    TPM_extend(m): extend a PCR register, PCRn = SHA1(PCRn || m)
    TPM_quote(n, m): generate a signature of (PCRn, m) with the TPM's key
    Other operations (seal, unseal, counters) are not relevant for now.
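  A minimal Python sketch to make the extend/quote semantics concrete.
  The ToyTPM class, the HMAC "signature", the PCR number, and the example
  boot stages are all illustrative stand-ins, not the real TPM interface;
  a real TPM signs quotes with an RSA key certified by the manufacturer.

    import hashlib
    import hmac

    class ToyTPM:
        def __init__(self, num_pcrs=16, key=b"key-certified-by-manufacturer"):
            # PCRs are ephemeral: they reset to zero only when the machine resets.
            self.pcrs = [b"\x00" * 20 for _ in range(num_pcrs)]
            self._key = key   # signing key; never leaves the chip

        def extend(self, n, measurement):
            # TPM_extend(m): PCRn = SHA1(PCRn || m)
            self.pcrs[n] = hashlib.sha1(self.pcrs[n] + measurement).digest()

        def quote(self, n, nonce):
            # TPM_quote(n, m): signature over (PCRn, m); HMAC stands in for RSA.
            return hmac.new(self._key, self.pcrs[n] + nonce, hashlib.sha1).digest()

    # Authenticated boot: each stage hashes the next stage and extends a PCR
    # before running it (BIOS -> boot loader -> kernel -> measured binaries).
    tpm = ToyTPM()
    for stage in [b"bios", b"grub", b"kernel+initrd", b"/usr/sbin/httpd"]:
        tpm.extend(10, hashlib.sha1(stage).digest())
    print(tpm.quote(10, nonce=b"fresh client nonce").hex())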
Who does the measurement for authenticated boot?
  PCR values get reset to zero only when the entire computer is reset.
  Important: the CPU must jump to BIOS code, which has not been tampered with.
  BIOS code measures itself (bootstrapping step), extends a PCR.
  BIOS code loads the boot loader (grub), measures it, extends a PCR, runs it.
  Boot loader loads the kernel, measures it, extends a PCR, runs it.
  In the paper, the kernel loads binaries, measures them, extends a PCR, runs them.
  Kernel maintains a list of all files that were measured.

How does a client trust that the measurements were done correctly?
  Chain of trust from the BIOS onward.  BIOS must be trusted.

How does a client trust the key of the TPM?
  Must come with a certificate from the hardware manufacturer.
  Client trusts the hardware manufacturer to only sign legitimate TPMs / systems.

What are all the things that have to be measured and extended into a PCR?
  BIOS, boot loader, kernel + initrd.
  Paper says grub.conf doesn't have to be measured.  Why?
  What about kernel parameters?  The boot loader would have to measure them.
  Who measures kernel modules?  Is it safe to trust the kernel to do so?

Why does the proposed system measure every process, not just, e.g., Apache?
  Any process running as root can tamper with any other process.
  Non-root processes might have ways of affecting root processes too.
  Linux doesn't provide good integrity isolation guarantees.
  LOMAC might be able to provide partial integrity guarantees.

What about non-binary files?
  Shell scripts, Java bytecode, etc. should be measured.
  Config files probably should be measured (if they're "relevant"?).
  Data files (e.g., a user's email messages) might not be relevant.
  Command-line arguments to binaries?  Hard to say.

Is measuring the set of files enough?
  It might matter how a program is loaded.
  A legitimate shell script might be a bad config file for Apache.
    E.g., Apache might disable some security options.
  A better scheme might be to annotate measured files with context info.
    E.g., the measurement list might include <"shell script", F> or
    <"apache config file", F> instead of just F.
  For shell scripts: environment variables and the current directory matter!
    Might record these in the context too.
  Of course, this relies on the application to get it right.

Is it possible to measure too much?
  Easy problem: performance cost.
  Hard problem: privacy.

Why do we need the measurement list in addition to PCRs?
  The measurement list allows a remote client to verify the PCR value in a Quote.

How does a client know what to do with a measurement list?
  Presumably compares it with some approved list of measurements/binaries.
  A very hard problem: the client must convince itself some program is safe!
  Typically boils down to trusting code from Microsoft/Redhat/...
  Could rely on some online service, but the service would do the same.

What about software upgrades, or config file changes?
  Need a signature from the software vendor attesting to the new binaries.
  How to tell if someone's Apache config file is right or not?
  One approach to solve this: rely on the site owner's own signatures.
    Of course, if the site owner is compromised, no security.
    But maybe the site owner can keep their key more secure than all the servers.
    E.g., Amazon might sign its software/config setup and keep the key offline.

What if the measurement list is tampered with?
  Integrity challenge protocol.
  What does the protocol guarantee?
    A server with the given measurement list exists, right now.
    The nonce ensures that an adversary cannot replay old messages.
  What would the client like to know?
    That the client is talking to this server, right now.
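  A rough sketch of the client's side of this check, under the same toy
  assumptions as the earlier sketch (an HMAC standing in for the TPM's RSA
  signature, and a plain set of approved hashes): replay the measurement list
  to recompute the PCR, check that the Quote covers that PCR plus the fresh
  nonce, then decide whether every measured file is trusted.  The names and
  formats are not the paper's; this only shows where the measurement list,
  the PCR, and the nonce fit together.

    import hashlib
    import hmac

    def replay_pcr(measurement_list):
        # Recompute the PCR implied by the list: PCR = SHA1(PCR || m) per entry.
        pcr = b"\x00" * 20
        for m in measurement_list:
            pcr = hashlib.sha1(pcr + m).digest()
        return pcr

    def verify_attestation(measurement_list, quote, nonce, tpm_key, approved):
        expected_pcr = replay_pcr(measurement_list)
        # 1. The Quote must sign (PCR, nonce): the list matches the TPM's PCR,
        #    and the fresh nonce rules out replaying an old Quote.
        expected = hmac.new(tpm_key, expected_pcr + nonce, hashlib.sha1).digest()
        if not hmac.compare_digest(expected, quote):
            return False
        # 2. Every measured file must be one the client considers high-integrity.
        #    This is the hard part: deciding what belongs in `approved`.
        return all(m in approved for m in measurement_list)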
What's a potential attack?
  Adversary compromises the server, forwards the nonce to another server.
  How to solve this?  Tie the nonce to the session key somehow.
    Include the SSL session key in the Quote, or use the session key instead
    of the nonce (a small sketch of this binding appears at the end of these notes).

What happens with buffer overflows?
  Executable stack: does the shell code get measured?
  Non-executable stack: are return-to-libc attacks detected?
  What happens afterwards -- e.g., the attacker downloads a botnet?
  Quick aside: the Xbox (I think) had secure boot, but was compromised through
    a buffer overflow in the saved-game load code of some game.  The attacker
    could boot the game from CD, load a corrupted saved game from a USB drive,
    exploit the buffer overflow, and use that to start running Linux.

SQL injection attacks?
  Could measure SQL queries as code, but a query is also data.

What happens with root logins, e.g., via SSH?
  Ideally, would treat keyboard input as a shell script.
  Hard to measure keyboard input as a single file.
  But measuring each command separately would be meaningless (no context!).
    Can't tell the meaning of some command, e.g., "echo foo >> bar".

What if an attacker tries to compromise the kernel?
  Paper has a defense against measurement bypasses by invalidating PCRs.
  Does it matter?
    By the paper's (implicit) assumption, trusted code wouldn't be malicious.
    On the other hand, if malicious code was trying to bypass measurement, it
    should have been measured itself; otherwise the system has already lost.

Big problem: assumption that code integrity means system integrity.
  Lots of ways to construct bad behavior from existing code:
    Return-to-libc attacks.
    Run legitimate Unix commands with new arguments.
    Run legitimate Unix commands in new environments.
    Run applications with different config files, ...
    Modify config-like data, such as the password file.

Attacks on the TPM chip itself?
  Worst case: extract the TPM key.
    The chip is supposed to be physically tamper-proof.
  Bad: reset the TPM without resetting the CPU.
    The assumption is that the link with the CPU is tamper-resistant.
    Much harder to ensure -- typically just epoxy.
    Could short out the power pins to the TPM, short out the bus to the CPU, ...
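  Appendix: one possible shape of the session-key binding mentioned above
  (tying the Quote to the SSL session rather than to a bare nonce).  This is
  only a sketch: session_key stands for whatever secret the SSL layer shares
  between this client and this server, and the HMAC again stands in for the
  TPM's signature.  A Quote relayed from some other machine fails because
  that machine cannot derive a challenge from this session's key.

    import hashlib
    import hmac

    def channel_bound_challenge(session_key, client_nonce):
        # Challenge covers both freshness (the nonce) and this SSL session.
        return hashlib.sha1(session_key + client_nonce).digest()

    def quote_for_session(tpm_key, pcr_value, session_key, client_nonce):
        # Server side: feed the channel-bound challenge into TPM_quote
        # instead of the raw nonce.
        challenge = channel_bound_challenge(session_key, client_nonce)
        return hmac.new(tpm_key, pcr_value + challenge, hashlib.sha1).digest()

    # Client side: recompute the same challenge from its own view of the
    # session key and verify the Quote against it.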