How many times have people told you that QoS (quality of service) in the Internet will magically be solved once we get optical fibre and loads of bandwidth everywhere? An attractive assertion, to be sure, but one that rather misunderstands why IP QoS fails to materialize in today’s Internet.
Faster is… faster
High bandwidth optical networks will solve IP QoS in the same way as bigger and wider roads and freeways have solved vehicular congestion. When the system is vastly under-loaded, you can drive from point A to point B with a minimum of fuss and few delays at on-ramps, off-ramps, and intersections. That’s kind of obvious. But as demand increases (for example, during peak travel periods) everything becomes more bursty – short periods of free-flowing traffic are interrupted by backlogged intersections, on-ramps, and off-ramps. And it doesn’t really have anything to do with the absolute size of the roads. Transient congestion occurs when people arrive at an intersection faster than the intersection can clear them. Whether it is two interstate highways crossing, two city streets, or a freeway off-ramp dumping traffic into a regular street – capacity mismatches lead to transient congestion. This in turn leads to additional traffic management techniques such as traffic lights, high occupancy lanes, electronic toll lanes, and rules giving precedence to emergency services vehicles.
The Internet isn’t much different.
Congestion occurs where there’s capacity mismatch
Regardless of how wonderfully fast your backbone links are, there will be points where traffic moves from a faster link to a slower link. Yes, from your home dial-up connection you could move to DSL or cable modem, and your ISP may aggregate all their customers onto DS3, OC-3 or OC-12 intermediate backbone links. All very wonderful. But think about the other end – the popular web or media server being pummelled by the aggregate traffic coming in, or the mid-speed inter-provider exchanges (the ‘interchanges’ between ISP backbones). At these points of bandwidth mismatch, queues are built into routers and switches to handle transient overloads (rather as a freeway off-ramp fills with, and empties of, vehicles as the traffic lights control traffic flow onto the city streets). Whether we’re talking about 1.5Mbit/sec T1 links or 9.6Gbit/sec OC-192 links, where there’s a bandwidth mismatch we need packet queues. And where queuing occurs, average latency rises, jitter gets added, and packet loss rates climb. Those are the little beasties that screw up IP QoS when they’re unconstrained.
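You can watch this happen in a toy simulation (my own illustrative sketch, not anything from the original page – all the numbers are invented). Packets arrive from a fast upstream link at a random rate; a slower downstream link drains a finite buffer one packet at a time. As the arrival rate approaches the link’s capacity, queuing delay and tail drops both climb:

```python
import random

random.seed(42)

SERVICE_TIME = 1.0   # time units to transmit one packet on the slow link
QUEUE_LIMIT = 20     # router buffer size, in packets (made-up figure)
N_PACKETS = 10_000

def simulate(mean_gap):
    """Random (Poisson-ish) arrivals with the given mean inter-arrival gap.

    Returns (average queuing delay, number of tail-dropped packets)."""
    clock = 0.0          # arrival clock
    link_free_at = 0.0   # when the downstream link next goes idle
    delays, drops = [], 0
    for _ in range(N_PACKETS):
        clock += random.expovariate(1.0 / mean_gap)
        backlog = max(0.0, link_free_at - clock)   # queued work ahead of us
        if backlog / SERVICE_TIME >= QUEUE_LIMIT:
            drops += 1                             # buffer full: tail drop
            continue
        delays.append(backlog)                     # this packet's queuing delay
        link_free_at = max(link_free_at, clock) + SERVICE_TIME
    return sum(delays) / len(delays), drops

# A gap of 2.0 means the link runs at ~50% load; 1.05 means ~95% load.
for gap in (2.0, 1.25, 1.05):
    avg_delay, drops = simulate(gap)
    print(f"mean gap {gap}: avg queuing delay {avg_delay:.2f}, drops {drops}")
```

The absolute link speed never appears – only the ratio of arrival rate to service rate matters, which is exactly why a faster link merely moves the congestion point rather than removing it.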
The bandwidth of optical data transport systems is an essential part of a network provider’s toolkit. In many environments today, over-provisioning the network is the simplest (and often the only practical) way to ensure peak-period congestion doesn’t occur. But you cannot avoid the need for queuing mechanisms in routers and switches – especially at the Internet’s off-ramps. And as soon as someone decides certain traffic has a higher priority than all other traffic, you will need differentiated queuing and scheduling in even the high-end, gigabit/sec routers and switches – the Internet equivalent of high occupancy lanes and “get out of the way of emergency vehicles” rules.
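The simplest form of that differentiated scheduling is strict priority between two queues sharing one output link. Here’s a minimal sketch (again my own, with invented packet names and numbers): the link transmits one packet per tick, and the low-priority queue is served only when the high-priority queue is empty – the packet analogue of pulling over for an ambulance:

```python
from collections import deque

def run(arrivals):
    """arrivals: list of (tick, klass) pairs, klass 'hi' or 'lo', sorted by tick.

    Strict-priority scheduler: one packet transmitted per tick, 'hi' always
    served before 'lo'. Returns the average queuing delay per class."""
    queues = {"hi": deque(), "lo": deque()}
    delays = {"hi": [], "lo": []}
    last_tick = max(t for t, _ in arrivals)
    i = 0
    for tick in range(last_tick + 200):        # run on until the queues drain
        while i < len(arrivals) and arrivals[i][0] == tick:
            queues[arrivals[i][1]].append(tick)   # enqueue with arrival time
            i += 1
        for klass in ("hi", "lo"):             # strict priority: 'hi' first
            if queues[klass]:
                delays[klass].append(tick - queues[klass].popleft())
                break                          # only one packet per tick
    return {k: sum(v) / len(v) for k, v in delays.items()}

# Two packets arrive per tick (one of each class), overloading a link
# that can carry only one: the high-priority class sails through while
# the low-priority class absorbs all the queuing delay.
arrivals = [(t, k) for t in range(50) for k in ("hi", "lo")]
print(run(arrivals))
```

Real routers use subtler schedulers (weighted fair queuing and friends) so low-priority traffic isn’t starved outright, but the core idea – some queues jump ahead of others at the point of bandwidth mismatch – is the same.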
Historical note: This blog entry was actually made on June 11th 2012, to archive a web page I wrote on May 3rd 2001 (see Internet Archive’s Wayback Machine link from April 2003). Although largely for my own benefit, I’ve archived it in the hope it will be useful to others (like you, if you’re reading this page and you aren’t me).