BitLocker
=========

Administrivia:
  Submit final project proposals by Monday, via email to course staff list.

What's the problem this paper is trying to solve?
  Prevent data from being stolen if an attacker physically steals a laptop.
  Effectively, want some form of disk encryption.

What security properties might users want from disk encryption?
  Data secrecy: adversary that gets access to disk cannot get data.
  Data integrity: adversary cannot replace data on disk without detection.
  Data freshness: adversary cannot roll back to an old version without det.
  Deniability:
    Ensure that adversary can't tell you have certain data encrypted.
    Ensure that adversary can't tell if they have the real password.
    E.g., someone forces you to give up your password.

Where do encryption keys come from?
  Simple approach: user provides key in some form.
    User might enter password (hashed to produce key, like in Kerberos).
    User might plug in a USB drive containing the key.
    Problem: users don't want to keep entering special key at bootup.

Bitlocker: give key to legitimate OS, let OS verify user's credentials.
  How to tell if legit OS is running, or attacker boots from his own CD?
  One approach: only run software signed by some authority.
    E.g., a PC only runs Windows signed by Microsoft.
    Even if adversary boots from CD, can only boot Windows.
    Windows will still enforce file permissions, password checks, etc.
    Sometimes works on specialized devices (game consoles, cell phones, ..)
    Too restrictive for general-purpose PCs.
  Another approach: authenticated boot.
    Instead of restricting what software can boot, authenticate ("measure") it.
    Use a piece of trusted hardware: a TPM (Trusted Platform Module).

               DRAM       /-- BIOS
                 |        |
    CPU --- Northbridge --+-- TPM

  TPM chip has an ephemeral set of registers (PCR0, PCR1, ..), and a key.
  Some supported operations (others also exist, but irrelevant here):
    TPM_extend(m): extend a PCR register, PCRn = SHA1(PCRn || m)
    TPM_quote(n, m): generate signature of (PCRn, m) with TPM's key
    TPM_seal(n, PCR_value, plaintext): return ciphertext.
    TPM_unseal(ciphertext): return plaintext, if PCRn matches PCR_value.

Who does the measurement for authenticated boot?
  PCR values get reset to zero only when the entire computer is reset.
  Important: CPU must jump to BIOS code, which is not tampered with.
  BIOS code "measures" itself: extends PCR with hash of its code.
  BIOS code loads boot loader (e.g., Linux grub), measures it
    (extends PCR with the hash of the boot loader), runs it.
  Boot loader loads kernel, measures it (extends PCR with hash of
    kernel), runs it.

What does this allow us to do?
  Can prove to others over the network that you're running some software.
    Use TPM_quote() to get the TPM to sign a message on your behalf.
    Assumption: remote party trusts your TPM (but not you directly).
    TPM has its own secret key, HW mfg signs public key, stores cert on TPM.
  Can encrypt data in a way that's only accessible to specific software.
    Use TPM_seal, TPM_unseal.
    Sealed data can be decrypted only by chosen recipient (PCR).
    Each TPM has its own randomly-generated key for encryption.
  Assumption: adversary does not tamper with CPU, TPM, or their link.

Bitlocker's TPM mode: key stored in TPM.
  Idea: store key in the TPM (or rather, seal it using the TPM).
    Advantage: no need for user to interact with the BIOS.
  What's the point of TPM-only mode?
    Key can only be obtained if the machine boots up the same OS (Windows).
    As a result, security boils down to whatever plan Windows has.
    One possibility: user has Windows password.
      Why is this better than the password-in-BIOS approach?
      1. No need to enter password twice: in BIOS and in Windows.
      2. Windows can rate-limit login attempts, prevent pw guessing.
    Another possibility: user cannot access sensitive data directly.
      User might have to access sensitive data via privileged process.
      Privileged process will not divulge entire data set.

What gets measured at boot in BitLocker?
  Two partitions on disk.
  First partition contains BitLocker's bootstrapping code.
  Second partition contains encrypted data ("OS volume").
  First partition measured at boot.
  BitLocker's key sealed with first partition's PCR measurement.
  Why not measure the second partition?
    Changes frequently.
    Expectation: adversary won't be able to meaningfully change it.
  What if we need to upgrade the first partition?
    One possibility: re-seal key with new PCR value before upgrade.
  What if we need to upgrade laptops?  Or, laptop died and need to recover?
    Disk encryption key is stored encrypted with a recovery password.
    (Or, stored in active directory encrypted with admin's password.)
    User can type in their recovery password to gain access to disk.

How do we encrypt the disk once we have the key?
  Encrypt disk blocks one-at-a-time.
    Why one-at-a-time?  Atomicity, performance.
  Potential problem: integrity (adversary can modify sectors on disk).
    Why is this a problem for a disk encryption scheme?
    Why is it insufficient to do secure boot (check signatures on code)?
  What are the options for ensuring integrity?
    Ideally, store a MAC (~keyed hash) for the sector somewhere on disk.
    Recall, disks write sectors at a time: need one MAC per sector.
    Store MAC in adjacent sector: effectively cut space by a factor of 2.
    Store MAC in a table elsewhere: two seeks (and breaks atomicity).
    Store MACs for group of sectors nearby: breaks atomicity.
  Where can we store MACs for integrity?
    Buy really expensive disks (NetApp, EMC) that have jumbo sectors.
    "Enterprise" disks have 520-byte sectors, instead of standard 512.
    Extra 8 bytes used to store checksums, transaction IDs, etc.
    Could be used to store MAC.
    Not going to fly for common machines.
  BitLocker approach: "poor-man's authentication"
    Assume adversary cannot change the ciphertext in a "useful" fashion.
    I.e., cannot have a predictable effect on the plaintext.
  When would this work or not?
    Works if applications detect or crash when important data is garbled.
    Must be true at sector-level, which attacker can corrupt separately.
    Probably true for code: random instructions will raise an exception.
    Worst case for data: 1 bit (e.g., "require login?") alone in a sector.
    Adversary can guess random ciphertexts, see when that bit changes.
    If application doesn't notice other bits corrupted, game over.
    Hopefully this is not how the registry is constructed, so maybe OK..
  What about freshness?
    Harder to achieve: need to have some state that can't be rolled back.
    Strawman: hash all blocks, store hash in TPM.
    Problem: updates require re-hashing entire disk, slow.
    Idea: tree of hashes (Merkle tree).
    Even that is often too expensive to update.
  How do we achieve poor-man's authentication?
    Let's look at some encryption mechanisms.
    We consider symmetric-key encryption schemes (no need for public-key).
    Key sizes are typically on the order of 128 bits for symmetric-key.

Block ciphers.
  Encrypt entire block of data at a time; hopefully no direct bitwise deps.
  Standard block cipher today: AES, 128-bit block size (128- or 256-bit key).
  How to encrypt something larger than a block size?

Block cipher modes of operation
  Have plaintext blocks P_1 .. P_n
  Want ciphertext blocks C_1 .. C_n
  ECB: Apply block cipher to each block-sized chunk in turn.
    C_i = E_k(P_i)
    Advantage: simple.
    Disadvantage: attacker can permute blocks, find equal blocks, ..
  CBC: XOR each block's plaintext with previous block's ciphertext.
    C_i = E_k(P_i XOR C_{i-1})
    Can we decrypt this?  P_i = D_k(C_i) XOR C_{i-1}
    Advantage: attacker cannot permute blocks.
               different ciphertexts for matching plaintext blocks.
    Disadvantage: encryption hard to parallelize (but decryption can be).
    What if plaintext is changed by 1 bit?  All subsequent C's affected.
    What if ciphertext changed by 1 bit?  Only 2 plaintexts affected.
      Plaintext for changed ciphertext is fully garbled.
      Plaintext for next block is changed by 1 bit, in the same position!
    What's P_0 here, the initial XOR value?
      Called the initiatlization vector (IV).
      Needed for decryption, to decrypt the first block.
    What if IV is revealed after encryption?  Usually OK.
    What if IV is predictable or reused?  Usually a bad idea.
      Leaks information about the contents of first block.
      Watermarking attacks on file system.
        Adversary can create file patterns that are visible after enc.
    How to choose IVs for disk encryption?
      Typically want to choose different IVs for different sectors.
        Different encryptions for different sectors.
      Avoids watermarking, swapping of sectors by adversary.
      However, IV is typically reused, so ideally keep IV secret.
  Many more block cipher modes of operation exist.
    Can provide provable CCA2 security (UFE), authentication (OCB/CCM), ..

BitLocker's AES-CBC + Elephant mode
  Figure 1 in the paper.
  Start with basic AES-CBC: closest to what we want.
  Why is the IV derived from the sector number?
    Make encryption functions of each sector different, prevent switching.
  Why is the sector number encrypted in the IV?
    Make sure the IV is not predictable, avoid watermarking attacks.
  What are the diffusers doing?
    XORing bits of the block.
    Ensures that one bit change on either side will flip many other bits.
  Why do we have another sector-dependent transform upfront (sector key)?
    So that diffusers don't operate on predictable data on either side.
    Even if adversary learns the IV for a pair of sectors, cannot swap them.
  Sector key: why is it different from the AES key?
    Guarantees that sector-key operations cannot weaken AES.
    Even if the entire secret key is leaked, AES-CBC still intact.

Potential attacks on BitLocker?
  Not intended to be a perfect security solution, by design.
  Hardware attacks: DMA, cold boot attacks, ..
  Security vulnerabilities in Windows (buffer overflows, root access).
  Rolling back disk blocks to an old version.
    Adversary likely doesn't have interesting old blocks.
    Hard (expensive in terms of performance) to defend against.

Alternative to BitLocker's sector encryption: filesystem-level encryption.
  FS can solve atomicity/consistency problems.
  FS can find space for extra MACs, random IVs to prevent IV reuse, etc.
  FS might require much more code to be "measured" into the TPM.
    FS comes up much later in the boot process.
    Requires TPM re-sealing for FS upgrades, driver upgrades, etc.
  FS might not interpose on swapping/paging.
  FS harder to deploy (changes to FS, cannot deploy incrementally).