Csep-561-Lec-7

The Application Layer (DNS, CDNs, HTTPS), and Security

  • Note on TCP congestion control algorithms
    • BBR, CUBIC, etc. all still operate within the TCP protocol. All that changes is when the sender sends packets.
  • SACK (Selective ACK) actually does add an extra TCP option: the receiver sends ACK ranges, which lets the sender know more precisely what to resend. This is a big improvement.
    • State of the art circa 2010, still in wide use today
  • CUBIC is the default congestion control algorithm in the standard TCP stacks
    • Linux >= 2.6.19
    • Windows >= 10.1709
    • macOS >= Yosemite
    • Keeps tricks from NewReno about fast recovery and fast retransmit w/ SACK; just replaces additive increase with cubic function
    • Throws away all that beautiful convergence of AIMD from last lecture. Performs better, but is just an empirical approach.
    • Seeks to resolve
      • Flows with lower RTTs grow faster than those with higher RTTs
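As a rough illustration of the cubic function that replaces additive increase, here is a minimal sketch. The constants C = 0.4 and beta = 0.7 are the values suggested in RFC 8312 and are an assumption here, not something stated in lecture; the name `cubic_window` and the example window of 100 segments are made up for illustration.

```python
def cubic_window(t, w_max, c=0.4, beta=0.7):
    """Congestion window (in segments) t seconds after the last loss event.

    w_max is the window size just before that loss. c and beta are the
    constants suggested in RFC 8312 (assumed here, not from the lecture).
    """
    # k is the time the curve takes to climb back up to w_max.
    k = (w_max * (1 - beta) / c) ** (1 / 3)
    return c * (t - k) ** 3 + w_max


if __name__ == "__main__":
    # Window starts at beta * w_max, flattens near w_max, then probes beyond it.
    for t in range(0, 9, 2):
        print(t, round(cubic_window(t, w_max=100), 1))
```

Note that the window here is a function of elapsed time since the last loss, not of the number of RTTs, which is how this approach tries to keep low-RTT flows from growing faster than high-RTT ones.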

Application layer

  • Where we are in the course:
    physical -> link -> network -> transport -> application

  • Recall that the OSI model actually has these layers from transport up:

    • 4: Transport
    • 5: Session
    • 6: Presentation
    • 7: Application
      but in the Internet we bundle layers 5 and 6 into the application layer.
  • A session is a series of related data over the network, e.g. your Zoom stream, or a particular website loading.

  • The concept of presentation is how to identify a type of content and encode it for transfer

    • MIME types, image/jpeg, etc.
  • Evolution of internet applications:

    • 82% of traffic by byte is video
    • Most traffic is mobile as of 2016
    • Lots of attack traffic from China, Russia, and U.S.
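To make the presentation idea above (identifying a content type such as image/jpeg) a bit more concrete, here is a small sketch using Python's standard `mimetypes` module. The helper name `content_type_for` and the file names are made up for illustration.

```python
import mimetypes


def content_type_for(filename):
    """Guess the MIME type a server might label this file with."""
    mime, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    return mime or "application/octet-stream"


if __name__ == "__main__":
    for name in ["photo.jpg", "index.html", "mystery.unknownext"]:
        print(name, "->", content_type_for(name))
```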

DNS (Domain Name System)

Lets you go to www.uw.edu instead of 183.23.2.32.

  • Goals:

    • Easy to manage
    • Efficient
  • Approach:

    • Distributed directory based on a hierarchical namespace
    • Automated protocol to tie pieces together
  • Names are higher-level IDs for resources

  • Addresses are lower-level locators for resources

  • Resolution is mapping a name to an address

  • Before DNS

    • there was a HOSTS.TXT file regularly retrieved for all hosts from a central machine at the Network Information Center
    • Names were initially flat, i.e. no domains.

Domains are hierarchical, starting from the right:

  • robot.cs.washington.edu
    • .edu is the top level of the hierarchy; robot is the most specific part.
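Reading a name right to left can be sketched like this; the function name `hierarchy` is just illustrative.

```python
def hierarchy(name):
    """Return the chain of domains from the TLD down to the full name."""
    labels = name.rstrip(".").split(".")
    # Build suffixes right to left: edu, washington.edu, cs.washington.edu, ...
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


if __name__ == "__main__":
    print(hierarchy("robot.cs.washington.edu"))
```

Each entry in the returned list is one level of the hierarchy, which is roughly the order in which nameservers get consulted during resolution.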

TLDs (Top Level Domains) are

  • run by ICANN (Internet Corporation for Assigned Names and Numbers)

  • 700+ generic TLDs, many created over time

  • 250 country code TLDs

  • DNS Resolution Example

    • requesting robots.cs.washington.edu from flits.cs.vu.nl
    • queries local name server (cs.vu.nl)
    • which queries root name server a.root-servers.net
    • then edu name server a.edu-servers.net
    • then name server UW
    • then name server UWCS
    • ... that is a lot of steps
  • Two types of queries:

    • Recursive query: the nameserver completes the resolution and returns the final answer
      • Lets client offload burden to server
      • Lets server cache results for a bunch of clients
    • Iterative query: the nameserver just responds with the next nameserver to contact
      • Lets server "fire and forget"
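The resolution example and the two query types can be sketched with a toy, in-memory hierarchy. This is only a simulation under stated assumptions: the names ns.washington.edu and ns.cs.washington.edu and the address 192.0.2.10 (a documentation-only range) are made up, and no real DNS traffic is sent.

```python
# Toy stand-in for the nameserver hierarchy. The UW server names and the
# final address are hypothetical, chosen only for illustration.
SERVERS = {
    "a.root-servers.net": {"referrals": {"edu": "a.edu-servers.net"}},
    "a.edu-servers.net": {"referrals": {"washington.edu": "ns.washington.edu"}},
    "ns.washington.edu": {"referrals": {"cs.washington.edu": "ns.cs.washington.edu"}},
    "ns.cs.washington.edu": {"answers": {"robot.cs.washington.edu": "192.0.2.10"}},
}


def query(server, name):
    """One iterative query: return ("answer", ip) or ("referral", next_server)."""
    zone = SERVERS[server]
    if name in zone.get("answers", {}):
        return "answer", zone["answers"][name]
    for suffix, next_server in zone.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} knows nothing about {name}")


def resolve(name, root="a.root-servers.net"):
    """What a local nameserver does for a recursive query: chase referrals."""
    server, path = root, []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        server = value  # follow the referral to the next nameserver


if __name__ == "__main__":
    address, path = resolve("robot.cs.washington.edu")
    print(address, "via", " -> ".join(path))
```

Here `query` plays the role of a server answering iteratively, while `resolve` is the work a local nameserver takes on when a client sends it a recursive query.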

Local nameservers are often run by IT

  • Or you can use Google's public DNS (8.8.8.8) or Cloudflare's public DNS (1.1.1.1), instead of using e.g. Comcast's.

The root zone is served by 13 named root servers

  • a.root-servers.net through m.root-servers.net
  • all nameservers need to know the root servers' IP addresses

Caching

  • Resolution latency needs to be low
  • Name-to-address mappings don't change very often
  • You don't want every query to have to hit the root nameserver
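A minimal sketch of what a caching nameserver might do, assuming each answer comes with a time-to-live (TTL) in seconds. The class name `DnsCache` and the injectable `clock` parameter are illustrative choices, not part of any real resolver.

```python
import time


class DnsCache:
    """Minimal TTL cache; `clock` is injectable so the example is testable."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl):
        """Remember an answer for `ttl` seconds."""
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        """Return the cached address, or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # stale: force a fresh lookup
            return None
        return address
```

A resolver would call `get` first and only walk the hierarchy on a miss, so most queries never reach the root nameservers.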

DNS Protocol

  • Built on UDP port 53
  • ARQ for reliability
  • Server is stateless
  • Relies on replicas to make the service redundant
  • A zone consists of DNS resource records that give info for its domain names
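To give a feel for how small a DNS message is, here is a sketch that encodes a query for an A record in the RFC 1035 wire format. The function name `build_query` and the query ID 0x1234 are arbitrary choices for illustration; the sketch only builds the bytes and does not send them.

```python
import struct


def build_query(name, query_id=0x1234):
    """Encode a standard DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD, "recursion desired", set),
    # 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN, Internet).
    return header + qname + struct.pack("!HH", 1, 1)


if __name__ == "__main__":
    packet = build_query("www.uw.edu")
    print(len(packet), packet.hex())
```

A client would typically send these bytes in a single UDP datagram to port 53 of its nameserver and, since UDP gives no delivery guarantee, resend after a timeout if no reply arrives. That retry loop is the ARQ mentioned above.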

Security is a major issue here.

  • Mostly added as an afterthought in the 90s

  • Need to secure the mapping, so users don't go to a malicious bank.com

  • Goal: integrity and authenticity (over confidentiality)

    • Impossible to cache encrypted nameserver lookups, so we forgo confidentiality
  • DNS spoofing attacks target the caches of intermediary nameservers

  • DNSSEC (DNS Security Extensions) prevents this

    • Extends DNS with new record types
    • RRSIG for digital signatures of records
    • DNSKEY for public keys for validation
    • Root servers upgraded in 2010
    • Check this box when setting up your website!

HTTP (HyperText Transfer Protocol)

  • Created by Sir Tim Berners-Lee, the inventor of the Web, who directs the W3C
  • Keep in mind the Web is an application built on top of the Internet, which is a much broader thing.
  • A "web page" consists of a set of related HTTP transactions
  • In this web context, HTTP is a request/response protocol for fetching web resources.
  • A URL (Uniform Resource Locator)
    • protocol + server + page on server
    • http://en.wikipedia.org/wiki/vegemite
      • protocol: http
      • server: en.wikipedia.org
      • page on server: /wiki/vegemite
  • PLT (Page Load Time) was a key measure of web performance
    • Hard to measure in the modern world, with infinite scroll and whatnot
    • Modern metrics are
      • FCP (First Contentful Paint): is it happening?
      • FMP (First Meaningful Paint): is it useful?
      • TTI (Time to Interactive): is it usable?
  • HTTP/1.0
    • used one TCP connection to fetch one web resource, not efficient at all.
  • HTTP/1.1
    • Supports pipelining, parallel requests
    • State of the art as of 2015
  • HTTPS
    • A protocol for securely connecting to server
    • Authenticates that the server is who they say they are via a chain of certificates.
    • As long as you trust each link in the chain, you're good
    • Encrypts and validates all traffic to/from server
    • Built on top of TLS (Transport Layer Security).
      • A renamed, newer version of "SSL"
      • More secure than SSL: SSL is old and broken, don't use it!
      • Requires a complex handshake (Diffie-Hellman key exchange!) between the client and server
    • Basically the same as HTTP just with a new added TLS layer
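The Diffie-Hellman key exchange mentioned in the TLS handshake can be illustrated with toy numbers. This is only a sketch: the parameters p = 23 and g = 5 and the private values are tiny made-up examples that are far too small to be secure, and a real TLS handshake involves much more than this (certificates, negotiation, much larger groups or elliptic curves).

```python
def dh_public(g, p, private):
    """Public value each side sends in the clear: g^private mod p."""
    return pow(g, private, p)


def dh_shared(their_public, p, private):
    """Shared secret each side computes: their_public^private mod p."""
    return pow(their_public, private, p)


# Toy parameters for illustration only; not secure.
p, g = 23, 5
client_private, server_private = 6, 15

# Each side publishes its public value...
client_public = dh_public(g, p, client_private)
server_public = dh_public(g, p, server_private)

# ...and combines the other side's public value with its own private one.
client_secret = dh_shared(server_public, p, client_private)
server_secret = dh_shared(client_public, p, server_private)

print(client_secret == server_secret)
```

Both sides end up with the same secret even though only the public values crossed the network, and that secret can then be used to derive the keys that encrypt the HTTP traffic.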