Lec-5
Bandwidth Allocation and Congestion Control
Last thoughts on SDN
- SDN is more of a philosophy than a technique
- Disaggregate control logic from hardware.
- Behavior is no longer emergent from a graph of routers; it is controlled centrally.
- The best part of this is managing complexity with automation.
- Enables practical, efficient virtual networking at cloud scale.
Transport Layer
Where we are in the course:
Physical -> Link -> Network -> Transport -> Application
The transport layer is where we get actual end-to-end connectivity across a network of networks.
Common protocols: TCP, UDP. These sit on top of IP.
- Why do we need it?
- Reliable transmission! Packets that are lost in IP are gone. Transport knows that we need to resend.
- IP lets computers talk to computers. But what happens at the computer?
- You don't want AIM getting your browser's website packets!
- Data rates
- Reassembly of packets
Various transport layer services for different kinds of data delivery:
| | Unreliable | Reliable |
|-|-|-|
|Messages|Datagrams (UDP)|Chunks (SCTP) (Not common)|
|Bytestream|Streams (RTP)|Streams (TCP)|
- RTP is the Real-time Transport Protocol, e.g. this is what Zoom uses
- TCP is full-featured
- UDP is not. It's more of a glorified packet.
- TCP is a very raw bytestream. You need to provide your own framing mechanism at the app level.
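Because TCP delivers bytes rather than messages, the application has to mark message boundaries itself. Below is a minimal sketch of one common approach, length-prefix framing. The 4-byte big-endian header and the names `frame` and `Deframer` are assumptions made for illustration, not something these notes prescribe.

```python
import struct

# Assumed framing format: a 4-byte big-endian length, then the message body.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Reassembles messages from arbitrarily sized chunks of a bytestream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add received bytes; return every message that is now complete."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages


# Two messages sent back to back, delivered in 3-byte chunks that ignore
# message boundaries (as a TCP receiver might see them).
stream = frame(b"hello") + frame(b"world!")
deframer = Deframer()
received = []
for i in range(0, len(stream), 3):
    received.extend(deframer.feed(stream[i:i + 3]))
print(received)  # [b'hello', b'world!']
```

The receiver never assumes that one `recv` call equals one message; it only trusts the length header.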
Socket API
- Used by both UDP and TCP.
- Tip: It makes it feel like you are getting "messages", just remember TCP is a bytestream.
- Sockets let apps attach to the local network at different ports.
- This is what lets you run multiple apps on one host simultaneously.
- E.g. app 1 on port 1, app 2 on port 2.
- Part of the POSIX API.
Ports
- A connection (and so the application process behind it) is identified by a 5-tuple:
- IP address src
- IP address dest
- Transport protocol
- Port src
- Port dest
- Ports are 16-bit integers that basically identify local "mailboxes" that processes lease.
- Servers often bind to conventional well-known ports
- E.g. HTTP: 80, HTTPS: 443, SSH:22, FTP: 20/21, SMTP: 25
- RTSP: 554 for media player control
- IPP: 631 for printer sharing
- IMAP: 143 and POP-3: 110 are for remote email access
- Binding to ports below 1024 requires admin privileges
- Clients often assigned ephemeral ports, chosen by OS, only for temporary use.
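To make the 5-tuple concrete, here is a small sketch of how a host could map incoming packets to the right process. The table, the function names, and the "worker" owners are hypothetical, and the addresses come from the reserved documentation ranges. Real operating systems use more elaborate lookup rules (listening sockets with wildcards, for example).

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int


# Hypothetical demultiplexing table on a server: connection -> owning process.
connections: dict[FiveTuple, str] = {}


def register(flow: FiveTuple, owner: str) -> None:
    """Record which process owns a connection."""
    connections[flow] = owner


def demultiplex(flow: FiveTuple) -> str:
    """Return the process that should receive a packet belonging to this flow."""
    return connections.get(flow, "no listener")


# Same client, same server port 443, different ephemeral source ports.
tab_a = FiveTuple("192.0.2.10", "198.51.100.5", "TCP", 50001, 443)
tab_b = FiveTuple("192.0.2.10", "198.51.100.5", "TCP", 50002, 443)
register(tab_a, "worker-1")
register(tab_b, "worker-2")

print(demultiplex(tab_a))                            # worker-1
print(demultiplex(tab_b))                            # worker-2
print(demultiplex(tab_a._replace(protocol="UDP")))   # no listener
```

The two browser tabs differ only in their ephemeral source port, which is enough to keep their traffic apart.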
UDP
Used by apps that don't want reliability or bytestreams
- E.g.
- Voice-over-IP
- DNS, RPC
- DHCP
- These apps implement their own reliability.
- Since applications run asynchronously in user space, the OS has to buffer incoming datagrams for them.
- Each port has a UDP buffer queue where the OS puts the datagrams
- The size of the queue is chosen by the OS
- In TCP you get to choose!
- OS is in charge of taking packet from the network and de-multiplexing to correct queue.
- If queue full, usually packet gets dropped
- Very thin wrapper over IP packet.
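The following sketch sends two datagrams over the loopback interface using the standard `socket` module. It assumes loopback networking is available. The payloads and the 2-second timeout are arbitrary choices for the example.

```python
import socket

with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
        socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS choose an ephemeral port
    receiver.settimeout(2.0)         # UDP gives no delivery guarantee; don't block forever
    address = receiver.getsockname()

    # No connection setup: each sendto is an independent datagram.
    sender.sendto(b"first", address)
    sender.sendto(b"second", address)

    # Each recvfrom returns exactly one datagram from the socket's receive queue.
    first, _ = receiver.recvfrom(2048)
    second, _ = receiver.recvfrom(2048)

    # Size in bytes of the receive queue the OS gave this socket.
    rcvbuf = receiver.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

print(first, second, rcvbuf)
```

Unlike TCP, message boundaries are preserved here: one `sendto` corresponds to one `recvfrom`. On a real network either datagram could be lost or reordered.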
NAT Hell
- Recall NATs are inspecting layers that they shouldn't.
- Now there are millions of devices on the Internet with knowledge of only a few legacy transports.
- This is why SCTP is not used in practice: it's not supported by a lot of existing nodes that are inspecting Transport layer headers when they shouldn't be!
Instead, we work around NAT hell by building protocols on top of UDP instead of on top of IP directly.
- So now you can use SCTP over UDP (RFC6951).
- QUIC (originally "Quick UDP Internet Connections") is more aggressive in its stance - it actually encrypts most of its headers!
- Partly for security & privacy
- But partly so that middleboxes can't see it! And so the protocol can stay agile.
TCP
A very high level overview of "the workhorse transport"
(See the text's chapter on TCP for a more thorough overview)
- First, a three way handshake to connect
- A ---> SYN, SEQ=x ---> B
- A <--- SYN, SEQ=y, ACK=x+1 <--- B
- A ---> SEQ=x+1, ACK=y+1 ---> B
- To release, there's a similar handshake using FIN segments. Each direction is closed separately, so both parties need to close before the connection is fully closed.
- There's also an awkward bit where the party that closes first needs to wait in the TIME_WAIT state (twice the maximum segment lifetime, on the order of minutes), just in case its final ACK gets lost and the peer resends its FIN.
- So there's a lot of resources tied up in TCP closing.
- To solve this, TCP also has an abortive close called the reset (RST).
- Intended for use if connection becomes corrupted.
- Typically handled as exception, but often sent by endpoints instead of a full close, just for the sake of the server's resources.
- Lots of different SEQ/ACK techniques
- Sliding window
- Go Back N
- Selective Repeat
- Flow control
- Sliding window
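The sequence and acknowledgement numbers in the three-way handshake above can be traced with a few lines of code. This is only a sketch: the `Segment` class and the initial sequence numbers 1000 and 5000 are made up for illustration, and 32-bit wraparound is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(x: int, y: int) -> list[Segment]:
    """Return the three segments exchanged, given initial sequence
    numbers x (chosen by A) and y (chosen by B)."""
    syn = Segment("SYN", seq=x)                                  # A -> B
    syn_ack = Segment("SYN+ACK", seq=y, ack=syn.seq + 1)         # B -> A, acknowledges x
    ack = Segment("ACK", seq=syn_ack.ack, ack=syn_ack.seq + 1)   # A -> B, acknowledges y
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, because the SYN itself consumes one sequence number.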
Congestion
Early TCP used a fixed window size (e.g. 8 packets).
As the ARPANET grew, links stayed busy but the useful transfer rate crashed (congestion collapse).
- Queues help absorb bursts when input > output, but if this happens persistently, the queue will overflow! This is congestion.
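The difference between a burst and persistent overload can be seen in a toy model of a router queue. The tick-based model, the function name, and the numbers below are assumptions chosen for illustration.

```python
def simulate_queue(arrivals: list[int], service_rate: int, capacity: int) -> tuple[int, int]:
    """Return (packets left in the queue, packets dropped).

    Each tick: packets arrive, anything beyond the queue capacity is dropped,
    then up to service_rate packets are forwarded.
    """
    queued = 0
    dropped = 0
    for arriving in arrivals:
        queued += arriving
        if queued > capacity:
            dropped += queued - capacity
            queued = capacity
        queued -= min(queued, service_rate)
    return queued, dropped


# A single burst of 8 packets: the queue absorbs it and then drains.
print(simulate_queue([8, 0, 0, 0], service_rate=3, capacity=10))   # (0, 0)

# Persistent overload: 5 packets arrive per tick but only 3 can leave.
print(simulate_queue([5] * 10, service_rate=3, capacity=10))       # (7, 13)
```

In the second case the queue fills after a few ticks and every further tick loses packets, no matter how large the buffer is made.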
To fix this, we need the Transport and Network layers to work together:
- Network layer witnesses congestion
- Only it can provide feedback
- Transport layer causes congestion
- Only it can reduce the load
A good solution is both:
- efficient: it uses the full capacity of the network
- fair: it divides the capacity among hosts equally
However, these goals are often at odds. In practice we typically prioritize efficiency and settle for a minimal notion of fairness: no node should "starve" with no bandwidth.
A **max-min fair** allocation is one in which:
- increasing the rate of one flow would decrease the rate of a flow that is already smaller
- this "maximizes the minimum flow"
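One way to compute a max-min fair allocation is progressive filling. The sketch below handles only a simplified case that is an assumption of this example: a single shared link, where each flow has a known maximum demand. The function name and the numbers are illustrative.

```python
def max_min_fair(capacity: float, demands: list[float]) -> list[float]:
    """Max-min fair shares of one link among flows with the given demands."""
    allocation = [0.0] * len(demands)
    remaining = float(capacity)
    # Visit flows from smallest demand to largest.
    unsatisfied = sorted(range(len(demands)), key=lambda i: demands[i])
    while unsatisfied:
        share = remaining / len(unsatisfied)  # equal split of what is left
        smallest = unsatisfied[0]
        if demands[smallest] <= share:
            # This flow wants less than its equal share: give it what it
            # asks for and redistribute the leftover among the rest.
            allocation[smallest] = float(demands[smallest])
            remaining -= demands[smallest]
            unsatisfied.pop(0)
        else:
            # Every remaining flow wants more than the equal share.
            for i in unsatisfied:
                allocation[i] = share
            break
    return allocation


print(max_min_fair(20, [2, 4, 10, 10]))  # [2.0, 4.0, 7.0, 7.0]
```

The two large flows cannot get more than 7 without taking capacity from each other or from a smaller flow, which is exactly the max-min condition.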
How do we implement this? AIMD (Additive Increase Multiplicative Decrease)
- Hosts additively increase rate while network not congested
- Hosts multiplicatively decrease rate when congested
- Used by TCP!
- Converges to an allocation that is both efficient and fair when all hosts run it.
- Really cool!
- Requires only binary feedback from the network (congested or not congested)
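The convergence claim can be checked with a toy simulation of two flows sharing one link. This is a sketch under simplifying assumptions: synchronized rounds, both flows seeing the same congestion signal, and illustrative parameters (increase of 1, decrease factor of 0.5). It is not how TCP actually measures congestion.

```python
def aimd(rates: list[float], capacity: float, increase: float = 1.0,
         decrease: float = 0.5, rounds: int = 200) -> list[float]:
    """Run AIMD for a number of rounds and return the final rates."""
    rates = list(rates)
    for _ in range(rounds):
        if sum(rates) > capacity:
            # Binary feedback says "congested": multiplicative decrease.
            rates = [r * decrease for r in rates]
        else:
            # Not congested: additive increase.
            rates = [r + increase for r in rates]
    return rates


# Start very unfairly: one flow at rate 1, the other at rate 80.
a, b = aimd([1.0, 80.0], capacity=100.0)
print(a, b, abs(a - b))
```

Additive increase leaves the gap between the two rates unchanged, while each multiplicative decrease halves it, so the rates drift toward an equal split while their sum keeps oscillating just below the capacity.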