Latency

  • Network latency is the time delay between when data is sent from source to destination - measured as round-trip time (RTT) or one-way delay
  • Caused by multiple factors that accumulate as packets traverse the network path
  • Critical performance metric that affects user experience, especially for real-time applications (VoIP, video conferencing, gaming)
  • Different from bandwidth (capacity) - you can have high bandwidth but still experience high latency

Types of Latency

  • Propagation Delay: Time for signal to travel through transmission medium
    • Fiber optic: ~5 microseconds per kilometer
    • Copper: ~5.5 microseconds per kilometer
    • Satellite: ~250-280ms one way via geostationary orbit (uplink plus downlink)
  • Transmission Delay: Time to push all packet bits onto the wire
    • Formula: Packet size / Link bandwidth
    • Example: 1500-byte frame on Fast Ethernet = 1500 × 8 bits / 100 Mbps = 0.12ms
  • Processing Delay: Time for device to examine packet headers and make forwarding decisions
    • Routers: 1-10ms (varies by hardware/software)
    • Switches: <1ms (hardware-based forwarding)
  • Queuing Delay: Time packet waits in output queue before transmission
    • Variable based on network congestion and QoS policies
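The propagation and transmission components above are easy to sketch numerically. A minimal Python example; the constants are the rule-of-thumb figures from these notes, not vendor specifications:

```python
# Rough per-component delay estimates using the figures from these notes.

FIBER_US_PER_KM = 5.0  # propagation delay in fiber, microseconds per km

def propagation_delay_ms(distance_km, us_per_km=FIBER_US_PER_KM):
    """Propagation delay in milliseconds over a path of the given length."""
    return distance_km * us_per_km / 1000

def transmission_delay_ms(packet_bytes, link_bps):
    """Time to push all of a packet's bits onto the wire, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000

# 1500-byte frame on Fast Ethernet (100 Mbps)
print(round(transmission_delay_ms(1500, 100_000_000), 2))  # 0.12
# 1000 km of fiber
print(propagation_delay_ms(1000))  # 5.0
```

Note how at 100 Mbps the transmission delay of a full-size frame is already smaller than the propagation delay of a few dozen kilometers of fiber; on long paths, distance dominates.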

Common Latency Sources

  • WAN Links: Especially satellite (250-500ms RTT) and long-distance terrestrial circuits
  • Router Hops: Each Layer 3 hop adds processing delay
    • Rule of thumb: 1-5ms per router hop on modern equipment
  • Network Congestion: Causes increased queuing delays and potential packet loss
  • Protocol Overhead: TCP handshakes, acknowledgments, and retransmissions add extra round trips
  • DNS Resolution: Can add 20-100ms before actual data transfer begins
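The DNS contribution is easy to observe directly. A small Python sketch that times a lookup; note that a cached answer from a local resolver returns far faster than the 20-100ms range above:

```python
import socket
import time

def dns_lookup_ms(hostname):
    """Time a name resolution via the OS resolver, in milliseconds.
    Repeat lookups are often near-zero thanks to resolver caching."""
    start = time.perf_counter()
    socket.getaddrinfo(hostname, None)
    return (time.perf_counter() - start) * 1000

print(dns_lookup_ms("localhost"))
```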

Measuring Latency

Tool               | Purpose                       | Typical Output
ping               | Basic RTT measurement         | Average, min, max RTT
traceroute         | Per-hop latency analysis      | RTT to each router in path
pathping (Windows) | Combined ping/traceroute      | Packet loss and latency per hop
iperf3             | Throughput and jitter testing | Jitter and throughput under load
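Where ICMP is blocked or raw sockets are unavailable, RTT can be approximated by timing a TCP handshake instead. A hedged Python sketch; this measures connect time (SYN, SYN/ACK, ACK), not an ICMP echo, so results differ slightly from ping:

```python
import socket
import time

def tcp_rtt_ms(host, port, timeout=2.0):
    """Approximate RTT by timing a TCP three-way handshake.
    Needs no special privileges, unlike ICMP-based tools."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close immediately
    return (time.perf_counter() - start) * 1000
```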

Latency Requirements by Application

Application Type   | Acceptable Latency | Critical Threshold
Web Browsing       | <200ms             | >1000ms unusable
VoIP               | <150ms preferred   | >300ms poor quality
Video Conferencing | <150ms             | >400ms disruptive
Online Gaming      | <50ms competitive  | >100ms noticeable lag
Financial Trading  | <1ms               | >10ms significant impact

Latency Optimization Techniques

  • QoS Implementation: Prioritize time-sensitive traffic using DSCP markings
    • Voice: EF (Expedited Forwarding) - 46
    • Video: AF41 (Assured Forwarding) - 34
  • Traffic Shaping: Prevent buffer bloat by controlling burst rates
  • Path Optimization: Use routing protocols with latency-aware metrics
    • EIGRP: Includes delay in composite metric calculation
    • OSPF: Can use delay-based cost modifications
  • Local Caching: Content Delivery Networks (CDNs) and proxy caches reduce RTT
  • Protocol Tuning: TCP window scaling, selective acknowledgments (SACK)
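Application-side DSCP marking can be sketched with a standard socket option. A minimal Python example, assuming the OS allows setting IP_TOS; the TOS byte carries the DSCP in its upper six bits (the low two are ECN), and routers honor the marking only where end-to-end QoS policy permits:

```python
import socket

# DSCP code points from the notes above
DSCP_EF   = 46   # Expedited Forwarding - voice
DSCP_AF41 = 34   # Assured Forwarding 4/1 - interactive video

def mark_socket(sock, dscp):
    """Mark an IPv4 socket's outgoing packets with a DSCP value.
    TOS byte = DSCP << 2 (the two low-order bits are reserved for ECN)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
mark_socket(s, DSCP_EF)
```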

Vocabulary

  • RTT (Round-Trip Time): Total time for packet to travel to destination and acknowledgment to return
  • Jitter: Variation in latency over time - critical for real-time applications
  • Buffer Bloat: Excessive buffering causing increased latency under load
  • Cut-Through Switching: Forwards frames before complete reception (reduces latency vs store-and-forward)
  • Serialization Delay: Time to clock a frame's bits onto the physical medium bit by bit - equivalent to transmission delay
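The jitter definition above can be made concrete with the smoothed estimator RTP uses (RFC 3550 style), where each new inter-sample difference nudges the running estimate by 1/16 to damp outliers. A small sketch:

```python
def interarrival_jitter(samples_ms):
    """RFC 3550-style smoothed jitter from successive latency samples.
    Each new |difference| moves the estimate by 1/16 of the gap."""
    j = 0.0
    for prev, cur in zip(samples_ms, samples_ms[1:]):
        j += (abs(cur - prev) - j) / 16
    return j

print(interarrival_jitter([20, 20, 20]))  # 0.0 - perfectly steady
print(interarrival_jitter([20, 36]))      # 1.0 - one 16ms swing, damped
```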

Notes

  • Latency is often more noticeable to users than bandwidth limitations - 10Mbps with 20ms latency feels faster than 100Mbps with 200ms latency for interactive applications
  • Satellite connections have inherent high latency due to distance (22,236 miles to geostationary orbit) - no amount of bandwidth can overcome physics
  • Use ping -t (Windows) or plain ping (Linux, which runs continuously by default) for ongoing latency monitoring during troubleshooting
  • Consider asymmetric latency - upload and download paths may have different delays, especially with satellite or cellular connections
  • QoS policies must be implemented end-to-end to be effective - single bottleneck point can negate optimization efforts elsewhere
  • Modern applications use techniques like prefetching and connection pooling to mask latency effects from users