SSL and HTTPS
=============

Overall problem: security in the presence of a network adversary.
  Web browser communicates with web servers via network.
  Unlike previous lectures, adversary assumed to intercept, modify packets.
  Turns out this is a good model for many situations:
    Nearby adversaries can intercept packets on wired, wireless networks.
    Adversaries can often spoof packets from arbitrary sources.
  How to build secure systems in the presence of such adversaries?

Recall: two kinds of encryption schemes.
  E is encrypt, D is decrypt
  Symmetric key cryptography means same key is used to encrypt & decrypt
    ciphertext = E_k(plaintext)
    plaintext = D_k(ciphertext)
  Asymmetric key (public-key) cryptography: encrypt & decrypt keys differ
    ciphertext = E_PK(plaintext)
    plaintext = D_SK(ciphertext)
    PK and SK are called public and secret (private) key, respectively
  Public-key cryptography is orders of magnitude slower than symmetric

  Encryption provides data secrecy, often also want integrity.
  Message authentication code (MAC) with symmetric keys can provide integrity.
    Look up HMAC if you're interested in more details.
  Can use public-key crypto to sign and verify, almost the opposite:
    Use secret key to generate signature (compute D_SK)
    Use public key to check signature (compute E_PK)
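
  Sketch of the two integrity mechanisms above, using only the Python stdlib.
    Assumptions (not from the lecture): the key, the message, and the tiny
    RSA parameters (p=61, q=53, e=17, d=2753) are illustrative. Textbook RSA
    without padding and with a 12-bit modulus is NOT secure; it only shows
    which key is used for which operation.

```python
import hashlib
import hmac

# --- Integrity with a shared symmetric key: HMAC-SHA256 ---
key = b"shared-secret-key"            # hypothetical key known only to A and B
msg = b"transfer $10 to alice"        # hypothetical message

tag = hmac.new(key, msg, hashlib.sha256).digest()


def mac_ok(key: bytes, msg: bytes, tag: bytes) -> bool:
    """Recompute the MAC over msg and compare it to tag in constant time."""
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# --- Signatures with textbook RSA (toy parameters, no padding) ---
p, q = 61, 53
n = p * q        # public modulus
e = 17           # public exponent:  PK = (n, e)
d = 2753         # secret exponent:  SK = (n, d), chosen so e*d = 1 mod (p-1)(q-1)


def digest_mod_n(msg: bytes) -> int:
    """Hash the message and reduce it into the range [0, n)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n


def sign(msg: bytes) -> int:
    """Signer applies the secret-key operation (the notes' D_SK)."""
    return pow(digest_mod_n(msg), d, n)


def verify(msg: bytes, sig: int) -> bool:
    """Verifier applies the public-key operation (the notes' E_PK)."""
    return pow(sig, e, n) == digest_mod_n(msg)
```

  Usage: mac_ok(key, msg, tag) holds for the original message and fails if
    the message is altered; verify(msg, sign(msg)) holds for any message.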

How to secure network communication with cryptography?  (Simple sketch.)
  Suppose two computers already have a shared secret key.
    Use symmetric encryption and MAC to encrypt, authenticate messages.
    Adversary cannot decrypt or tamper with messages.
  What can we do if two computers don't have a shared secret?
  One possibility: two computers know each other's public keys.
    Use public-key encryption (expensive) to exchange symmetric keys.
    Strawman: A picks symmetric key, encrypts with PK_B, sends to B.
    Now fall back to symmetric encryption/MAC case.
  What can go wrong with strawman?
    Adversary can replay all of A's traffic and B would not notice.
    Solution: have B (the server) send a fresh nonce (random value).
      Incorporate the nonce into the final master secret:
        K_master = f(K_pre-master, nonce)
        (K_pre-master is the symmetric key that A picked and sent to B.)
    Adversary can impersonate A, by sending another symmetric key to B.
    Possible solution (one of many; if B cares who A is):
      B also chooses and sends a symmetric key to A, encrypted with PK_A.
      Then both A and B use a hash of the two keys combined.
    Adversary can later obtain SK_B, decrypt symmetric key and all messages.
    Solution: use a key exchange protocol like Diffie-Hellman,
      which provides forward secrecy.
  What if neither computer knows each other's public key?
    Common approach: use a trusted third party to generate certificates.
    Certificate is tuple (name, pubkey), signed by certificate authority.
    Meaning: certificate authority claims that name's public key is pubkey.
    B sends A a pubkey along with a certificate.
    If A trusts certificate authority, continue as above.
    The process to establish K_master is called the "handshake"
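
  Sketch of a handshake that combines two ideas from the list above:
    Diffie-Hellman for the pre-master secret and nonces mixed into K_master.
    Assumptions (not from the lecture): the group (P = 2^127 - 1, G = 3) is a
    toy choice and far too small for real use; f is taken to be HMAC-SHA256;
    both sides contribute a nonce. The exchanged public values are not
    authenticated here, so this alone does not stop an active adversary.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman group (illustrative only, not a recommended group).
P = 2**127 - 1
G = 3


def dh_keypair():
    """Pick a fresh secret exponent and compute the matching public value."""
    secret = secrets.randbelow(P - 3) + 2      # uniform in [2, P-2]
    public = pow(G, secret, P)
    return secret, public


def derive_master(pre_master: int, nonce_a: bytes, nonce_b: bytes) -> bytes:
    """K_master = f(K_pre-master, nonces), with f assumed to be HMAC-SHA256."""
    pm_bytes = pre_master.to_bytes(16, "big")  # pre_master < P < 2**128
    return hmac.new(pm_bytes, nonce_a + nonce_b, hashlib.sha256).digest()


def handshake():
    """Run both sides locally; return (A's K_master, B's K_master)."""
    a_secret, a_public = dh_keypair()          # A sends a_public and nonce_a
    b_secret, b_public = dh_keypair()          # B sends b_public and nonce_b
    nonce_a = secrets.token_bytes(16)
    nonce_b = secrets.token_bytes(16)

    pre_a = pow(b_public, a_secret, P)         # computed by A
    pre_b = pow(a_public, b_secret, P)         # computed by B (same value)

    k_a = derive_master(pre_a, nonce_a, nonce_b)
    k_b = derive_master(pre_b, nonce_a, nonce_b)
    return k_a, k_b
```

  Design note: the secrets are fresh per handshake and discarded afterwards,
    so a later theft of a long-term key does not reveal past K_master values.
    Fresh nonces mean a replayed transcript yields a different K_master.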

Plan for securing web browsers: HTTPS
  New protocol: https instead of http (e.g., https://www.paypal.com/).
  1. How to ensure data is not sniffed or tampered with on the network?
    Use SSL (a cryptographic protocol that uses certificates).
    SSL encrypts and authenticates network traffic.
    Negotiate ciphers (and other features: compression, extensions).
    Negotiation is done in the clear. Handshake ends with a MAC over all
      handshake messages, so tampering with the negotiation is detected.
  2. How to ensure that we are talking with the right server?
    SSL certificate name must match hostname in the URL
    In our example, certificate name must be www.paypal.com.
    One level of wildcard is also allowed (*.paypal.com)
    Browsers trust a number of certificate authorities.
    What happens if adversary tampers with DNS records?
      Good news: security doesn't depend on DNS.
      We already assumed adversary can tamper with network packets.
      Wrong server will not know correct private key matching certificate.
  3. How to ensure client-side Javascript cannot be used to subvert security?
    Origin (from the same-origin policy) includes the protocol.
      http://www.paypal.com/ is different from https://www.paypal.com/
      Here, we care about integrity of data (e.g., Javascript code).
      Result: non-HTTPS pages cannot tamper with HTTPS pages.
      Rationale: non-HTTPS pages could have been modified by adversary.
  4. How to ensure user credentials are not sent to wrong server?
    Server certificates help clients differentiate between servers.
    Cookies (common form of user credentials) have a "Secure" flag.
    Secure cookies can only be sent with HTTPS requests.
    Non-Secure cookies can be sent with HTTP and HTTPS requests.
  5. Finally, users can enter credentials directly.  How to secure that?
    Lock icon in the browser tells user they're interacting with HTTPS site.
    Browser should indicate to the user the name in the site's certificate.
    User should verify site name they intend to give credentials to.
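
  Sketch of the name check from step 2: certificate name vs. URL hostname,
    with one level of wildcard. The function name and the exact rules
    (wildcard only as the entire leftmost label) are assumptions for
    illustration; real clients follow more detailed rules.

```python
def name_matches(cert_name: str, hostname: str) -> bool:
    """Return True if hostname is covered by cert_name.

    A '*' is honored only as the whole leftmost label, and it stands for
    exactly one label, so *.paypal.com covers www.paypal.com but neither
    paypal.com nor a.b.paypal.com.
    """
    cert_labels = cert_name.lower().split(".")
    host_labels = hostname.lower().split(".")

    # A wildcard matches exactly one label, so label counts must agree.
    if len(cert_labels) != len(host_labels):
        return False

    for index, (cert_label, host_label) in enumerate(zip(cert_labels, host_labels)):
        if index == 0 and cert_label == "*":
            if not host_label:          # wildcard must match a non-empty label
                return False
            continue
        if cert_label != host_label:
            return False
    return True
```

  Usage: name_matches("*.paypal.com", "www.paypal.com") is True, while
    name_matches("*.paypal.com", "paypal.com") is False. In practice, let the
    TLS library do this; e.g., Python's ssl.create_default_context() enables
    hostname checking by default.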

How can this plan go wrong?
  As you might expect, every step above can go wrong.
  Not an exhaustive list, but gets at problems that ForceHTTPS wants to solve.

1. Cryptography.
  There have been some attacks on the cryptographic parts of SSL.
  Attack by Rizzo and Duong can allow adversary to learn some plaintext by
    issuing many carefully-chosen requests over a single SSL connection. (BEAST)
  More recent attack by same people using compression, mentioned in iSEC
    lecture. (CRIME)
  Some servers/CAs use weak crypto, e.g. certificates using MD5.
  Some clients choose weak crypto (e.g., SSL on Android).
  But, cryptography is rarely the weakest part of a system.

2. Authenticating the server.
  Adversary may be able to obtain a certificate for someone else's name.
    Used to require a faxed request on company letterhead (but how to check?)
    Now often requires receiving secret token at root@domain.com or similar.
    Security depends on the policy of least secure certificate authority.
    There are hundreds of trusted certificate authorities in most browsers.
    Several CA compromises in 2011 (certs for gmail, etc. obtained).
    Servers may be compromised and the corresponding private key stolen.
  How to deal with compromised certificate (e.g., invalid cert or stolen key)?
    Certificates have expiration dates.
    Checking certificate status with CA on every request is hard to scale.
    Certificate Revocation List (CRL) published by some CAs, but relatively
      few certificates in them (spot-checking: most have zero revoked certs).
    CRL must be periodically downloaded by client.
      Could be slow, if many certs are revoked.
      Not a problem if few or zero certs are revoked, but not too useful.
    OCSP: online certificate status protocol.
      Query whether a certificate is valid or not.
    Various heuristics for guessing whether certificate is OK or not.
      CertPatrol, EFF's SSL Observatory, ...
      Not as easy as "did the cert change?". Websites sometimes test new CAs.
    Problem: online revocation checks are soft-fail. An active network attacker
      can just make the checks unavailable. Browsers don't like blocking on a
      side channel. (Performance, single point of failure, captive portals, etc.)
    In practice browsers push updates with blacklist after major breaches.
  SSL implementations have bugs in verifying certificate names.
    Remember important principle from 6.033: "be explicit".
    Certificate contains length (in bytes) followed by that many name bytes.
    Many C implementations store names as standard C strings (end at first \0).
    Some CAs would provide certificates for www.paypal.com\0.attacker.com.
    To non-C code (e.g., Java), looks like a valid attacker.com subdomain.
    To C code in the client, the name stops at \0: looks like www.paypal.com.
  Users ignore certificate mismatch errors.
    Despite certificates being easy to obtain, many sites misconfigure them.
    Some don't want to deal with (non-zero) cost of getting certificates.
    Others forget to renew them (certificates have expiration dates).
    End result: browsers allow users to override mismatched certificates.
    About 60% of bypass buttons shown by Chrome are clicked through.
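
  Sketch of the \0 certificate-name bug described above. The one-byte length
    prefix is a simplified stand-in for the real certificate encoding, and
    all function names are hypothetical. as_c_string() models what
    strcmp()-style C code would see.

```python
def parse_name(encoded: bytes) -> bytes:
    """Length-prefixed parsing: 1 length byte, then exactly that many bytes."""
    length = encoded[0]
    return encoded[1:1 + length]


def as_c_string(name: bytes) -> bytes:
    """Model of C string handling: everything after the first NUL is lost."""
    return name.split(b"\0", 1)[0]


def safe_name(name: bytes) -> bytes:
    """Being explicit: reject names containing NUL instead of truncating."""
    if b"\0" in name:
        raise ValueError("certificate name contains a NUL byte")
    return name


# The name an attacker who controls attacker.com might ask a CA to certify.
evil = b"www.paypal.com\0.attacker.com"
encoded = bytes([len(evil)]) + evil

full_view = parse_name(encoded)      # view of length-aware code
c_view = as_c_string(full_view)      # view of C-string-based code
```

  Here full_view ends in b".attacker.com" (so the request looks legitimate
    to length-aware code), while c_view is b"www.paypal.com". safe_name()
    raises on this input rather than picking one interpretation.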

3. Mixing HTTP and HTTPS content.
  Web page origin is determined by the URL of the page itself.
  Page can have many embedded elements:
    Javascript via <SCRIPT> tags
    CSS style sheets via <LINK> tags (or @import inside <STYLE>)
    Flash code via <EMBED> tags
    Images via <IMG> tags
  If adversary can tamper with these elements, could control the page.
    In particular, Javascript and Flash code give control over page.
    CSS gives less control, but still abusable. Particularly with
    complex attribute selectors.
  Probably the developer wouldn't include Javascript from attacker's site.
  But, if the URL is non-HTTPS, adversary can tamper with HTTP response.
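
  Sketch of a mixed-content check for an HTTPS page: list embedded elements
    whose URL is plain http://. Assumptions (not from the lecture): the tag
    table, the active/passive labels, and the example URLs are illustrative;
    style sheets are assumed to be loaded via <link href=...>. Real browsers
    consider many more tags and attributes.

```python
from html.parser import HTMLParser

# Embedded-element tags and the attribute that holds the resource URL.
URL_ATTRS = {"script": "src", "link": "href", "embed": "src", "img": "src"}
# Elements whose content can take control of the embedding page.
ACTIVE_TAGS = {"script", "link", "embed"}


class MixedContentFinder(HTMLParser):
    """Collects (tag, url, kind) for every http:// subresource it sees."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased.
        attr_name = URL_ATTRS.get(tag)
        if attr_name is None:
            return
        url = dict(attrs).get(attr_name) or ""
        if url.lower().startswith("http://"):
            kind = "active" if tag in ACTIVE_TAGS else "passive"
            self.findings.append((tag, url, kind))


def find_mixed_content(html: str):
    """Return the insecure embeds found in the markup of an HTTPS page."""
    finder = MixedContentFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


page = (
    '<SCRIPT SRC="http://cdn.example/lib.js"></SCRIPT>'
    '<img src="https://bank.example/logo.png">'
    '<IMG src="http://img.example/banner.png">'
)
findings = find_mixed_content(page)
```

  For this page, findings holds the http:// script (active) and the http://
    image (passive); the https:// image is not reported.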

4. Protecting cookies.
  Web application developer could make a mistake and forget the Secure flag.
  User visits http://bank.com/ instead of https://bank.com/, leaks cookie.

  Suppose the user only visits https://bank.com/.  Why is this still a problem?
  1. Adversary can cause another HTTP site to redirect to http://bank.com/.
  2. Even if user never visits any HTTP site, application code might be buggy.
    Some sites serve login forms over HTTPS and serve other content over HTTP.
    Be careful when serving over both HTTP and HTTPS.
      E.g., Google's login service creates new cookies on request.
      Login service has its own (Secure) cookie.
      Can request login to a Google site by loading login's HTTPS URL.
      Used to be able to also log in via cookie that wasn't Secure.
      ForceHTTPS solves problem by redirecting HTTP URLs to HTTPS.
      http://blog.icir.org/2008/02/sidejacking-forced-sidejacking-and.html

  Cookie integrity: a non-Secure cookie set on http://bank.com will still be
  sent to https://bank.com. No way to determine who set the cookie.
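
  Sketch of the Secure flag from both sides: a server building a Set-Cookie
    value with Python's http.cookies, and the browser-side rule for when a
    cookie is sent. The cookie name "session" and both function names are
    hypothetical.

```python
from http.cookies import SimpleCookie


def make_session_cookie(value: str) -> str:
    """Build a Set-Cookie header value with the Secure flag set."""
    cookie = SimpleCookie()
    cookie["session"] = value
    cookie["session"]["secure"] = True    # only send over HTTPS
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


def browser_would_send(cookie_is_secure: bool, scheme: str) -> bool:
    """Secure cookies go only with HTTPS requests; others go with both."""
    return scheme == "https" or not cookie_is_secure


header_value = make_session_cookie("abc123")
```

  Note what the rule does not cover: it restricts where a cookie is sent
    (confidentiality), not who may set it, so it does not address the cookie
    integrity problem above.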

5. Users directly entering credentials.
  Phishing attacks.
  Users don't check for lock icon.
  Users don't carefully check domain name, don't know what to look for.
    E.g., typo domains (paypa1.com), Unicode look-alike characters
  Web developers put login forms on HTTP pages (target login script is HTTPS).
    Adversary can modify login form to point to another URL.
    Login form not protected from tampering, user has no way to tell.

How can we address some of these problems?
  ForceHTTPS (this paper):
    A flag that a server can set for itself.
    - Makes SSL certificate misconfigurations into a fatal error.
    - Redirects HTTP requests to HTTPS.
    - Prohibits non-HTTPS embedding (+ performs ForceHTTPS for them).
    What problems does this solve?  Mostly 2, 3, and to some extent 4.
    Is this really necessary? Can we just only use HTTPS, set Secure
    cookies, etc.?
    - Users can still click through errors, so it still helps for #2.
    - Not necessary for #3 assuming the web developer never makes a mistake.
    - Still helpful for #4. Marking cookies as Secure gives confidentiality,
      but not integrity. Active attacker can serve fake site at http://bank.com,
      and set cookies for https://bank.com. (https://bank.com cannot distinguish)
    Why not just turn it on for everyone?
    - HTTPS site might not exist
    - If it does, might not be the same site (https://web.mit.edu is
      authenticated, but http://web.mit.edu isn't)
    - HTTPS page may expect users to click through (self-signed certs).

Implementing ForceHTTPS
  The ForceHTTPS bit is stored in a cookie.
  Interesting issues:
    State exhaustion (the ForceHTTPS cookie getting evicted).
    Denial of service (force entire domain; force via JS; force via HTTP).
    Bootstrapping (how to get ForceHTTPS bit; how to avoid privacy leaks).
      Possible solution 1: DNSSEC.
      Possible solution 2: embed ForceHTTPS bit in URL name (if possible).
      If there's a way to get some authenticated bits from server owner
        (DNSSEC, URL name, etc), should we just get the public key directly?
      Difficulties: users have unreliable networks. Browsers are unwilling
        to block the handshake on a side-channel request.

Current status of ForceHTTPS:
  Some ideas from ForceHTTPS are being adopted into standards.
  HTTP Strict-Transport-Security header is similar to a ForceHTTPS cookie.
    Uses header instead of magic cookie:
      Strict-Transport-Security: max-age=7884000; includeSubDomains
    Turns HTTP links into HTTPS links.
    Prohibits user from overriding SSL errors (e.g., bad certificate).
    Optionally applies to all subdomains.
      Why is this useful?
      non-Secure and forged cookies can be leaked or set on subdomains.
    Optionally provides an interface for users to manually enable it.
    Implemented in Chrome, Firefox, and Opera.
    Bootstrapping largely unsolved. Chrome has a hard-coded list of preloads.
    Soon to be an RFC.
  IE9 and Chrome block mixed scripting by default. Firefox 18 to follow up.
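
  Sketch of how a client might process the Strict-Transport-Security header
    shown above: parse it, decide which hosts it covers, and upgrade URLs.
    The class and function names are hypothetical, and the parsing is
    simplified compared with the standard (e.g., no handling of duplicate
    directives or expiry bookkeeping).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HstsPolicy:
    max_age: int                      # lifetime of the policy, in seconds
    include_subdomains: bool = False


def parse_hsts(value: str) -> Optional[HstsPolicy]:
    """Parse a header value; return None if max-age is missing or invalid."""
    max_age = None
    include_subdomains = False
    for directive in value.split(";"):
        name, _, arg = directive.strip().partition("=")
        name = name.strip().lower()   # directive names are case-insensitive
        if name == "max-age":
            arg = arg.strip().strip('"')
            if not (arg.isascii() and arg.isdigit()):
                return None
            max_age = int(arg)
        elif name == "includesubdomains":
            include_subdomains = True
    if max_age is None:
        return None
    return HstsPolicy(max_age, include_subdomains)


def policy_covers(policy_host: str, policy: HstsPolicy, host: str) -> bool:
    """Does a policy received from policy_host apply to host?"""
    policy_host, host = policy_host.lower(), host.lower()
    if host == policy_host:
        return True
    return policy.include_subdomains and host.endswith("." + policy_host)


def upgrade(url: str) -> str:
    """Turn an http:// URL into the corresponding https:// URL."""
    prefix = "http://"
    if url.lower().startswith(prefix):
        return "https://" + url[len(prefix):]
    return url


policy = parse_hsts("max-age=7884000; includeSubDomains")
```

  With this policy noted for bank.com, policy_covers("bank.com", policy,
    "login.bank.com") is True, so a link to http://login.bank.com/ would be
    upgraded before any request is sent.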

Another solution: HTTPS-Everywhere
   Collaboration between the Tor Project and EFF
   Add-on for Firefox and Chrome
   Comes with rules to rewrite URLs for popular web sites
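
   Sketch of rule-based URL rewriting in the spirit of the add-on. The rules
     and domains below are made up for illustration and are not taken from
     the add-on's actual rule set or rule format.

```python
import re

# Hypothetical rules: (pattern matching an HTTP URL, HTTPS replacement).
RULES = [
    (re.compile(r"^http://(www\.)?example\.com/"), r"https://www.example.com/"),
    (re.compile(r"^http://([^/:@]+)\.example\.org/"), r"https://\1.example.org/"),
]


def rewrite(url: str) -> str:
    """Apply the first matching rule; leave the URL alone if none match."""
    for pattern, replacement in RULES:
        new_url, count = pattern.subn(replacement, url, count=1)
        if count:
            return new_url
    return url
```

   Sites without a rule are left on HTTP, which is why such a list only
     helps for the (popular) sites it covers.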

Other ways to address problems in SSL
  Better tools / better developers to avoid programming mistakes.
    Mark all sensitive cookies as Secure (#4).
    Avoid any insecure embedding (#3).
    Unfortunately, seems error-prone.
    Does not help end-users (requires developer involvement).
  EV certificates.
    Trying to address problem 5: users don't know what to look for in cert.
    In addition to URL, embed the company name (e.g., "PayPal, Inc.")
    Typically shows up as a green box next to the URL bar.
    Why would this be more secure?
    When would it actually improve security?
    Might indirectly help solve #2, if users come to expect EV certificates.
  Blacklist weak crypto.
    Browsers are starting to reject MD5 signatures on certificates
      (iOS 5, Chrome 18, Firefox 16)
    and RSA keys shorter than 1024 bits
      (Chrome 18, OS X 10.7.4, Windows XP+ after a recent update).
  OCSP stapling.
    OCSP responses are signed by CA.
    Server sends OCSP response in handshake instead of querying online (#2).
    Effectively a short-lived certificate.
    Problems:
    - Not widely deployed.
    - Only possible to staple one OCSP response.
  Key pinning.
    Only accept certificates signed by per-site whitelist of CAs.
    Remove reliance on least secure CA (#2).
    Currently a hard-coded list of sites in Chrome.
    DigiNotar compromise caught in 2011 because of key pinning.
    Plans to add mechanism for sites to advertise pins (HTTP header, TACK).
    Same bootstrapping difficulty as in ForceHTTPS.
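
  Sketch of a key-pinning check: hash each public key in the presented
    certificate chain and accept only if one matches the site's pin list.
    Assumptions (not from the lecture): pins are base64 SHA-256 hashes, the
    key bytes are placeholders rather than real encoded keys, and extracting
    keys from certificates is out of scope here.

```python
import base64
import hashlib


def pin_of(public_key_bytes: bytes) -> str:
    """Pin = base64 of the SHA-256 hash of the encoded public key."""
    digest = hashlib.sha256(public_key_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


def chain_is_pinned(chain_keys, allowed_pins) -> bool:
    """Accept if at least one key in the chain matches an allowed pin."""
    return any(pin_of(key) in allowed_pins for key in chain_keys)


# Placeholder key material standing in for real public keys.
trusted_ca_key = b"placeholder: public key of the CA this site uses"
rogue_ca_key = b"placeholder: public key of some other trusted CA"
site_key = b"placeholder: public key of the site"

# Per-site whitelist: only chains that include the expected CA are accepted.
site_pins = {pin_of(trusted_ca_key)}

good_chain = [site_key, trusted_ca_key]
bad_chain = [site_key, rogue_ca_key]    # valid to the browser, but not pinned
```

  chain_is_pinned(good_chain, site_pins) is True; for bad_chain it is False
    even though the rogue CA may be in the browser's trusted set, which is
    how pinning removes reliance on the least secure CA.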

References:
  http://www.educatedguesswork.org/2011/09/security_impact_of_the_rizzodu.html
  http://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security
  http://tools.ietf.org/html/draft-ietf-websec-strict-transport-sec-14
  http://blogs.msdn.com/b/ie/archive/2011/06/23/internet-explorer-9-security-part-4-protecting-consumers-from-malicious-mixed-content.aspx
  http://blog.chromium.org/2012/08/ending-mixed-scripting-vulnerabilities.html
  http://www.imperialviolet.org/2012/07/19/hope9talk.html
  http://www.thoughtcrime.org/papers/ocsp-attack.pdf
  http://www.imperialviolet.org/2011/03/18/revocation.html
  http://www.imperialviolet.org/2012/02/05/crlsets.html
  http://tools.ietf.org/html/draft-ietf-websec-key-pinning-02
  http://tack.io/
  http://dankaminsky.com/2011/08/31/notnotar/
  http://op-co.de/blog/posts/android_ssl_downgrade/