Jitter
Variation in packet arrival times causing irregular delivery of audio and video data
What is Jitter?
Jitter is the variation in the time delay between when packets are sent and when they arrive. While latency measures the average delay, jitter measures the inconsistency in that delay. Think of it like a train schedule: latency is the average travel time, but jitter is the unpredictability—sometimes the train is on time, sometimes it's early, sometimes it's late.
In video calling, packets should ideally arrive at regular intervals. Your encoder might send a packet every 20 milliseconds, and the receiver should get one every 20ms. But in reality, network conditions vary—packet 1 arrives after 50ms, packet 2 after 35ms, packet 3 after 60ms, packet 4 after 40ms. This variation is jitter.
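The arrival pattern above can be condensed into a single jitter number. RFC 3550 (the RTP specification) defines interarrival jitter as a smoothed average of how much each packet's transit time differs from the previous packet's; a minimal sketch (function and variable names are illustrative):

```javascript
// RFC 3550 interarrival jitter estimator (sketch).
// For each packet, compare its transit time (receive − send) against the
// previous packet's, then smooth the difference with a 1/16 gain.
function computeJitter(sendTimesMs, recvTimesMs) {
  let jitter = 0;
  let prevTransit = null;
  for (let i = 0; i < sendTimesMs.length; i++) {
    const transit = recvTimesMs[i] - sendTimesMs[i];
    if (prevTransit !== null) {
      const d = Math.abs(transit - prevTransit);
      jitter += (d - jitter) / 16; // exponential smoothing per RFC 3550
    }
    prevTransit = transit;
  }
  return jitter;
}

// Packets sent every 20ms, arriving with the delays from the example above
// (50, 35, 60, 40 ms):
const sent = [0, 20, 40, 60];
const received = [50, 55, 100, 100];
console.log(computeJitter(sent, received).toFixed(2)); // → "3.54"
```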
High jitter causes audio dropouts, video freezing, robotic or choppy sound, and overall degraded call quality. It's particularly damaging to real-time communication because it disrupts the smooth, continuous playback that human perception expects.
Jitter vs. Latency
These terms are often confused, but they measure different aspects of network performance:
- Latency: Average time for packets to travel from sender to receiver. Measures overall delay
- Jitter: Variation in packet arrival times. Measures inconsistency in delay
You can have high latency with low jitter (consistent but slow delivery, like satellite internet) or low latency with high jitter (fast but unpredictable, like congested WiFi). For video calling, you need both low latency AND low jitter.
Jitter reflects short-term inconsistencies in packet flow, while latency (often reported as round-trip time, RTT) is averaged over a longer period. A network might have 50ms average latency but 30ms jitter, meaning packets arrive anywhere from roughly 20ms to 80ms after being sent.
How Jitter Occurs
Network Congestion
The most common cause. When routers and switches are busy, packets wait in queues. Queue lengths fluctuate constantly—sometimes a packet gets through immediately, sometimes it waits 50ms. This variability creates jitter.
Congestion is worse during peak hours (evenings for residential networks, business hours for corporate networks) and on shared connections (public WiFi, cellular networks).
Route Changes
Internet routing is dynamic. If the network path changes mid-call (common on mobile networks as you move between towers, or when ISPs adjust routing), packets suddenly experience different delays, causing jitter spikes.
WiFi Interference
Wireless networks are inherently more jittery than wired connections. Radio interference, competing devices, signal strength variations, and retransmissions all introduce timing variability. WiFi typically adds 10-30ms of jitter compared to wired Ethernet's <5ms.
Packet Prioritization (QoS)
Ironically, Quality of Service (QoS) mechanisms can sometimes increase jitter. When routers prioritize certain traffic, other packets get delayed variably depending on current priority queue loads.
Packet Routing Variability
Not all packets take the same path. Some might take a direct route, others might bounce through additional hops. This path diversity creates arrival time variation.
Impact on Audio and Video Quality
Audio Impact
Audio is extremely sensitive to jitter because our ears detect timing inconsistencies easily:
- Low jitter (0-20ms): Imperceptible, audio sounds natural
- Moderate jitter (20-50ms): Occasionally choppy or slightly robotic sound
- High jitter (50-100ms+): Frequent dropouts, severe distortion, unintelligible speech
When jitter exceeds the jitter buffer's capacity, the buffer either runs out of packets to play, causing audio gaps (silence), or plays late packets out of order, causing garbled sound.
Video Impact
Video is slightly more tolerant of jitter than audio, but still suffers:
- Low jitter: Smooth playback at consistent frame rate
- Moderate jitter: Occasional frame drops or stuttering
- High jitter: Frequent freezing, jerky motion, frames displayed out of order
Because video frames can be decoded and displayed with some flexibility (unlike audio which must play continuously), jitter buffers for video can be larger without as much perceived quality impact.
Jitter Buffers: The Solution
A jitter buffer is a small queue on the receiving side that temporarily stores incoming packets before playing them. Instead of playing packets immediately as they arrive (which would sound choppy due to jitter), the buffer collects packets and plays them at regular intervals.
How Jitter Buffers Work
- Packets arrive at irregular intervals due to jitter
- The jitter buffer stores these packets temporarily
- The buffer waits until it has enough packets to ensure continuous playback
- Packets are then played out at regular intervals (e.g., every 20ms for audio)
- This smooths out the timing variations, providing consistent playback
The trade-off: larger buffers smooth out more jitter but add latency. Smaller buffers reduce latency but risk running empty if jitter is high.
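This trade-off shows up even in a toy simulation: with a fixed playout delay, any packet arriving after its scheduled slot causes a gap, and enlarging the buffer trades added latency for fewer gaps. All names and timings below are illustrative:

```javascript
// Toy fixed jitter buffer: each packet is scheduled for playout at
// (sendTime + bufferDelay); anything arriving after its slot is "late"
// and would produce a gap in playback.
function simulatePlayout(sendTimesMs, recvTimesMs, bufferDelayMs) {
  let late = 0;
  for (let i = 0; i < sendTimesMs.length; i++) {
    const playoutDeadline = sendTimesMs[i] + bufferDelayMs;
    if (recvTimesMs[i] > playoutDeadline) late++;
  }
  return late;
}

const sentAt = [0, 20, 40, 60, 80];
const arrivedAt = [50, 55, 100, 100, 95]; // jittery arrivals
console.log(simulatePlayout(sentAt, arrivedAt, 40)); // → 2 late packets (gaps)
console.log(simulatePlayout(sentAt, arrivedAt, 60)); // → 0 gaps, but +20ms latency
```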
Fixed vs. Adaptive Jitter Buffers
Fixed jitter buffers use a constant size (e.g., always 60ms). Simple but inefficient—wastes latency on good networks, insufficient on bad networks.
Adaptive jitter buffers dynamically adjust size based on current network conditions. When jitter is low, the buffer shrinks to minimize latency. When jitter increases, the buffer grows to prevent dropouts.
Modern WebRTC implementations universally use adaptive jitter buffers. They continuously monitor packet arrival patterns and adjust buffer size in real-time, typically ranging from 15-120ms for audio.
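One simple adaptive policy, purely as a sketch (real implementations such as NetEQ use far richer statistics): set the target delay to the mean inter-arrival deviation plus a two-standard-deviation safety margin, clamped to the 15-120ms range mentioned above. The constants and names are illustrative assumptions:

```javascript
// Sketch of adaptive buffer sizing from recent inter-arrival times.
// expectedMs is the nominal packet interval (20ms for typical audio).
function targetBufferDelay(interArrivalMs, expectedMs = 20) {
  const deviations = interArrivalMs.map(d => d - expectedMs);
  const mean = deviations.reduce((a, b) => a + b, 0) / deviations.length;
  const variance =
    deviations.reduce((a, b) => a + (b - mean) ** 2, 0) / deviations.length;
  // Target: mean deviation plus a 2-sigma safety margin, clamped to 15-120ms.
  const target = Math.abs(mean) + 2 * Math.sqrt(variance);
  return Math.min(120, Math.max(15, Math.round(target)));
}

console.log(targetBufferDelay([20, 21, 19, 20, 20])); // stable network → 15 (shrinks)
console.log(targetBufferDelay([5, 45, 10, 50, 20])); // jittery network → 43 (grows)
```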
NetEQ: WebRTC's Audio Jitter Buffer
NetEQ is Chromium's sophisticated audio jitter buffer implementation, used in all Chromium-based browsers (Chrome, Edge, Opera). It's one of the most advanced jitter buffer algorithms in production.
NetEQ Features
- Adaptive buffering: Continuously optimizes delay based on network jitter
- Packet loss concealment: Synthesizes missing audio when packets are lost
- Time stretching/compression: Subtly speeds up or slows down audio to maintain buffer levels without pitch changes
- Comfort noise generation: Adds gentle background noise during silence to avoid jarring gaps
- Dynamic range adaptation: Adjusts to varying jitter patterns throughout a call
Buffer Size Behavior
NetEQ typically starts with a ~40ms buffer. On stable networks with minimal jitter, it can shrink to 15-20ms, minimizing latency. On poor networks with high jitter, it expands to 100-120ms to prevent dropouts.
The algorithm constantly evaluates: "Can I reduce the buffer without risking dropouts?" and "Do I need to increase the buffer to handle current jitter levels?"
Video Jitter Buffers
Video jitter buffers work similarly to audio but with different constraints:
- Can be larger (50-200ms) because video frames are displayed at discrete intervals (e.g., 30 fps = every 33ms)
- Must handle dependencies between frames (I-frames, P-frames, B-frames in codecs like H.264)
- Can skip frames when buffer runs low, unlike audio which must play continuously
- Often prioritize keyframes (I-frames) over delta frames to ensure decodability
Acceptable Jitter Levels
ITU-T recommendations and industry standards suggest:
- Excellent: <10ms jitter (LAN, quality wired connections)
- Good: 10-30ms jitter (typical broadband, good WiFi)
- Acceptable: 30-50ms jitter (marginal connections, busy networks)
- Poor: 50-100ms jitter (severely congested or unstable networks)
- Unusable: >100ms jitter (call quality severely degraded)
These thresholds assume adequate jitter buffering. Without jitter buffers, even 20-30ms jitter would cause noticeable quality issues.
Measuring Jitter
WebRTC Stats API
Use getStats() on RTCPeerConnection to access jitter metrics:
- jitter: packet arrival time variation (in seconds, typically 0.001-0.100)
- jitterBufferDelay: cumulative time, in seconds, that emitted samples have spent in the jitter buffer
- jitterBufferEmittedCount: total number of samples emitted from the buffer
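A sketch of reading these metrics in the browser (assumes `pc` is an established RTCPeerConnection; the helper name is illustrative). Because `jitterBufferDelay` is a cumulative sum, dividing it by `jitterBufferEmittedCount` gives the average per-sample buffer delay:

```javascript
// Average time each emitted sample spent in the jitter buffer, in ms.
function avgJitterBufferDelayMs(stat) {
  if (!stat.jitterBufferEmittedCount) return 0;
  return (stat.jitterBufferDelay / stat.jitterBufferEmittedCount) * 1000;
}

// Browser-only: iterate inbound-rtp audio stats on a live connection.
async function logJitter(pc) {
  const report = await pc.getStats();
  report.forEach(stat => {
    if (stat.type === "inbound-rtp" && stat.kind === "audio") {
      console.log(`jitter: ${(stat.jitter * 1000).toFixed(1)} ms`);
      console.log(`avg buffer delay: ${avgJitterBufferDelayMs(stat).toFixed(1)} ms`);
    }
  });
}
```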
Browser Tools
- Chrome: chrome://webrtc-internals displays real-time jitter graphs
- Firefox: about:webrtc shows jitter statistics per connection
Network Testing Tools
Tools like iperf, mtr, or online jitter tests can measure network-level jitter independently of WebRTC.
Reducing Jitter
1. Use Wired Connections
Ethernet has much lower jitter than WiFi (typically <5ms vs. 10-30ms). For critical calls, use a wired connection when possible.
2. Reduce Network Congestion
- Close bandwidth-heavy applications (streaming, downloads, cloud sync)
- Limit other users on your network during important calls
- Upgrade to higher bandwidth if consistently congested
3. Quality of Service (QoS)
Configure your router to prioritize WebRTC traffic (UDP ports used by STUN/TURN, or use DSCP marking). This ensures video call packets get priority over less time-sensitive traffic like file downloads.
4. Optimize WiFi
- Use 5GHz band instead of 2.4GHz (less congestion, lower jitter)
- Position close to the access point for strong signal
- Reduce interference by changing WiFi channels
- Use WiFi 6 (802.11ax) if available—better jitter characteristics under load
5. Choose Better ISPs/Networks
Some ISPs have more stable routing and better peering agreements, resulting in lower jitter. Fiber connections typically have lower jitter than cable or DSL.
6. Increase Packet Size (Packetization Time)
Sending larger packets less frequently can reduce the impact of jitter. For audio, use 20ms packetization on good networks, but increase to 60ms or even 120ms on poor networks. With fewer packets per second, fewer arrival events are exposed to timing variation.
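The effect of packetization time (ptime) is easy to quantify: fewer packets per second means fewer arrival events and less per-packet header overhead. A rough calculation, assuming approximately 40 bytes of IPv4+UDP+RTP headers per packet (an illustrative figure):

```javascript
// Packets per second for a given packetization interval.
function packetsPerSecond(ptimeMs) {
  return 1000 / ptimeMs;
}

// Header overhead in kbps, assuming ~40 bytes of IP/UDP/RTP headers per packet.
function headerOverheadKbps(ptimeMs, headerBytes = 40) {
  return (packetsPerSecond(ptimeMs) * headerBytes * 8) / 1000;
}

console.log(packetsPerSecond(20), headerOverheadKbps(20)); // → 50 pkt/s, 16 kbps of headers
console.log(packetsPerSecond(60), headerOverheadKbps(60)); // → ~16.7 pkt/s, ~5.3 kbps
```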
7. Edge Servers
Deploy WebRTC SFU servers closer to users. Shorter network paths have fewer hops, reducing opportunities for jitter to accumulate.
Jitter vs. Packet Loss
These often occur together but are different problems:
- Jitter: Packets arrive at irregular times but (usually) all arrive eventually
- Packet loss: Some packets never arrive at all
High jitter can lead to packet loss if the jitter buffer overflows (packets arrive too late and are discarded as useless). Conversely, packet loss doesn't necessarily cause jitter—packets might be lost consistently without timing variations.
The Bottom Line
Jitter is the silent saboteur of video call quality. While latency determines overall delay and bandwidth determines maximum quality, jitter determines consistency. A network with perfect bandwidth and acceptable latency can still deliver terrible call quality if jitter is high.
Fortunately, WebRTC's adaptive jitter buffers (like NetEQ for audio) are remarkably effective at masking jitter, automatically adjusting to network conditions. But there's a limit—extreme jitter (>100ms) cannot be fully compensated without adding unacceptable latency.
Understanding jitter helps you diagnose "the call sounds choppy" complaints, optimize network infrastructure, and set realistic quality expectations. Wired connections, QoS prioritization, and edge deployment are your best defenses against jitter.