Choosing Simulcast vs SVC for Large Conferences

This page is part of the Simulcast & SVC Implementation guide and answers a single deployment question: in a conference with 50 or more participants, do you ship simulcast or SVC? The decision turns on where you can afford to spend (publisher CPU, uplink bandwidth, decoder cost, or SFU complexity), because no choice is free on all four axes at once.

Context & Trade-offs

Both mechanisms give the Selective Forwarding Unit multiple quality points to forward from a single publisher. They pay for it differently.

Simulcast runs N parallel encoder instances on the publisher. Three VP8 layers cost roughly 1.8–2.2× the CPU of a single 720p encode (the layers are smaller, so the cost is sublinear, not 3×) and sum to the most total uplink, because the publisher ships all three streams whether or not any subscriber wants the top one. SVC runs one encoder pass that emits one layered stream, so publisher CPU is close to a single encode and total uplink is the bitrate of just the full-quality stream, since lower layers are subsets of it rather than separate streams. On a constrained publisher uplink, the typical mobile presenter's situation, that difference can tip the balance: three simulcast layers at 150/500/1500 kbps need ~2.15 Mbps up, whereas an L3T3 SVC stream needs ~2.0 Mbps for the same operating points (about 7% less) and degrades more gracefully.
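The uplink arithmetic in the paragraph above can be checked in a few lines. This is only a worked calculation using the bitrates already quoted; the variable names are illustrative.

```javascript
// Layer bitrates in kbps, taken from the paragraph above.
const simulcastLayersKbps = [150, 500, 1500];
const svcFullStreamKbps = 2000; // approximate L3T3 figure quoted above

// Simulcast ships every layer, so the publisher uplink is the sum of all layers.
const simulcastUplinkKbps = simulcastLayersKbps.reduce((sum, kbps) => sum + kbps, 0);

// Relative saving of the single SVC stream over the simulcast total.
const savingFraction = (simulcastUplinkKbps - svcFullStreamKbps) / simulcastUplinkKbps;

console.log(simulcastUplinkKbps);                     // 2150
console.log((savingFraction * 100).toFixed(1) + '%'); // 7.0%
```

The saving is real but small in relative terms, which is why CPU and degradation behavior carry as much weight in the decision as raw bytes.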

The costs invert on the receive and server side. SVC decode is heavier, because the decoder reconstructs the dependency structure, and spatial SVC is codec-gated: VP9 and AV1 only (VP8 offers temporal layers but no spatial ones), with AV1 adding significant encode CPU (covered in Configuring AV1 SVC Layers in WebRTC). Simulcast works on plain VP8, so it survives any browser mix including older Safari. The SFU is where complexity flips hardest: simulcast forwarding is a trivial rid-to-SSRC match, while SVC forwarding requires parsing the dependency descriptor on every packet to drop the right frames, as detailed in Simulcast-Aware Forwarding.
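The difference in SFU forwarding logic can be sketched as two predicates. This is a simplified illustration, not a real SFU API: the `packet` and `target` shapes are hypothetical, and it assumes the spatial and temporal IDs have already been parsed out of the dependency descriptor. The SVC rule shown fits a fully dependent structure such as L3T3; a mode like L3T3_KEY, where layers depend on each other only at keyframes, needs a different rule.

```javascript
// Simulcast: each layer is its own RTP stream, so the decision is a single
// comparison per packet (hypothetical packet.rid / target.rid fields).
function shouldForwardSimulcast(packet, target) {
  return packet.rid === target.rid;
}

// SVC with full inter-layer dependencies: forward a packet only if its layer
// is at or below the subscriber's target on both the spatial and temporal axes.
// Assumes spatialId/temporalId were already extracted from the descriptor.
function shouldForwardSvc(packet, target) {
  return packet.spatialId <= target.spatialId &&
         packet.temporalId <= target.temporalId;
}

// Example: a subscriber limited to the middle resolution at the full frame rate.
const svcTarget = { spatialId: 1, temporalId: 2 };
console.log(shouldForwardSvc({ spatialId: 0, temporalId: 1 }, svcTarget)); // true
console.log(shouldForwardSvc({ spatialId: 2, temporalId: 0 }, svcTarget)); // false
```

The predicate itself is cheap; the server cost comes from parsing the descriptor on every packet to obtain those IDs in the first place.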

Axis	Simulcast (3 layers)	SVC (L3T3)
Publisher CPU	~1.8–2.2× single encode	~1.1× single encode
Total uplink	sum of all layers (~2.15 Mbps)	full stream only (~2.0 Mbps)
Decoder cost	low (plain decode)	higher (layered decode)
Codec support	VP8/VP9/AV1, all browsers	VP9/AV1 only
SFU complexity	trivial (rid match)	high (dependency descriptor)

At 50+ participants the SFU forwarding cost scales with subscribers, not publishers, so SVC’s per-packet parsing tax multiplies across every forward — but its lower publisher uplink wins when presenters are on weak mobile links. The practical rule: SVC wins large conferences where clients reliably run VP9/AV1 and publisher uplink is the bottleneck; simulcast wins when the browser mix is uncertain, publisher CPU is plentiful, or you want the simplest, cheapest SFU. The server’s actual layer-picking logic is the same either way — see Bandwidth-Aware Layer Selection in an SFU.

Two scaling effects sharpen the decision past 50 participants. First, most large conferences are asymmetric: a handful of active speakers publish full quality while dozens of listeners subscribe but rarely send video. That asymmetry favors SVC on the send side: only the few active publishers encode, and they are the clients where the CPU and uplink savings matter most, although every listener still pays SVC's heavier decode. Second, the failure mode at scale differs. A simulcast publisher that loses CPU headroom drops a whole layer abruptly, so subscribers pinned to that layer snap to a lower quality. An SVC publisher under the same pressure sheds a temporal layer first, dropping frame rate before resolution, which is a gentler, more recoverable degradation that large rooms tolerate better. Neither effect changes the codec-support reality: if a meaningful fraction of your room cannot decode VP9 or AV1, SVC is off the table regardless of its scaling advantages, and you ship simulcast on VP8.

Minimal Runnable Implementation

A capability-driven selector that picks SVC when the negotiated codec supports it and falls back to simulcast otherwise keeps one publisher path working across a mixed conference:

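/**
 * Returns the encodings to publish with, given the negotiated codec MIME type
 * (for example 'video/VP9'). Note: `transceiver` is not read by this function.
 */
function configurePublishEncodings(transceiver, codecMime) {
  const supportsSpatialSvc = /vp9|av1/i.test(codecMime); // VP8 has no spatial SVC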

  if (supportsSpatialSvc) {
    // One encode pass, one SSRC — lowest publisher CPU + uplink for 50+ rooms
    return [{
      active: true,
      maxBitrate: 2_000_000,
      scalabilityMode: 'L3T3_KEY'   // 3 spatial + 3 temporal layers; inter-layer prediction on keyframes only
    }];
  }

  // Fallback: independent layers, trivial SFU forwarding, works on plain VP8
  return [
    { rid: 'high', active: true, maxBitrate: 1_500_000, scaleResolutionDownBy: 1.0 },
    { rid: 'mid',  active: true, maxBitrate:   500_000, scaleResolutionDownBy: 2.0 },
    { rid: 'low',  active: true, maxBitrate:   150_000, scaleResolutionDownBy: 4.0 }
  ];
}
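
The returned array still has to reach the browser. One way to do that, sketched here under assumptions, is to pass it as `sendEncodings` when the transceiver is created, because `setParameters()` cannot add or remove encodings afterwards. The helper name `publishVideo` is hypothetical, and `pc` is assumed to be an `RTCPeerConnection`.

```javascript
// Hypothetical helper: creates a send-only video transceiver whose encodings
// are fixed up front. `encodings` is an array of RTCRtpEncodingParameters,
// for example the array returned by a selector like the one above.
function publishVideo(pc, track, encodings) {
  return pc.addTransceiver(track, {
    direction: 'sendonly',
    sendEncodings: encodings,
  });
}
```

Because the encodings are chosen before negotiation completes, the codec MIME type used to pick them usually comes from a capability probe, not from the negotiated result.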

Reproduction Steps & Debugging Log Patterns

1. Negotiate a session and read RTCRtpSender.getCapabilities('video') to learn whether VP9/AV1 is available on this client.
2. Apply the SVC encoding on a VP9 publisher and a simulcast encoding on a VP8 publisher in the same room.
3. Poll getStats() on each publisher and compare totalEncodeTime growth per second; the simulcast publisher’s encode time rises faster.
4. Compare aggregate bytesSent (sum across rids for simulcast vs the single SVC SSRC) to quantify the uplink difference.
5. On the SFU, log per-subscriber forward decisions and confirm both publishers reach every downlink tier.
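
The capability check in the first step could look like the sketch below. The function name `pickPublishCodec` and the default preference order are assumptions made for illustration (VP9 is listed before AV1 only because AV1 encode is heavier). The input is the object returned by `RTCRtpSender.getCapabilities('video')`. It reports what this client can send, not what every receiver can decode.

```javascript
// Assumed preference order: SVC-capable codecs first, VP8 as the fallback.
const DEFAULT_PREFERENCE = ['video/VP9', 'video/AV1', 'video/VP8'];

// Returns the first preferred MIME type present in the capabilities, or null.
function pickPublishCodec(capabilities, preference = DEFAULT_PREFERENCE) {
  const available = new Set(
    (capabilities?.codecs ?? []).map((codec) => codec.mimeType.toLowerCase())
  );
  return preference.find((mime) => available.has(mime.toLowerCase())) ?? null;
}

// In a browser:
//   const codecMime = pickPublishCodec(RTCRtpSender.getCapabilities('video'));
```

A result of 'video/VP8' or null is the signal to take the simulcast path.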

Expected contrast in the stats:

// VP8 simulcast publisher (3 SSRCs)
// totalEncodeTime delta ~22ms/s   bytesSent(sum) ~268 KB/s
// VP9 SVC publisher (1 SSRC, L3T3)
// totalEncodeTime delta ~13ms/s   bytesSent ~250 KB/s

If the SVC publisher shows only a single operating point at the SFU, the scalabilityMode was rejected and silently collapsed to one layer — probe capabilities and fall back to simulcast.
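
The encode-time and byte comparisons from the third and fourth steps could be computed from two `getStats()` snapshots as sketched below. The function name is hypothetical. It assumes both arguments are Map-like, as `RTCStatsReport` is, and it reads the standard `outbound-rtp` fields `totalEncodeTime` (seconds), `bytesSent`, and `timestamp` (milliseconds).

```javascript
// Sums encode time and bytes across all outbound video streams between two
// snapshots, so a 3-SSRC simulcast publisher and a 1-SSRC SVC publisher can
// be compared on the same scale.
function summarizeOutboundVideo(previousReport, currentReport) {
  let encodeSeconds = 0;
  let bytes = 0;
  let elapsedMs = 0;
  let streams = 0;
  for (const current of currentReport.values()) {
    if (current.type !== 'outbound-rtp' || current.kind !== 'video') continue;
    const previous = previousReport.get(current.id);
    if (!previous) continue; // stream appeared between the two snapshots
    encodeSeconds += (current.totalEncodeTime ?? 0) - (previous.totalEncodeTime ?? 0);
    bytes += current.bytesSent - previous.bytesSent;
    elapsedMs = Math.max(elapsedMs, current.timestamp - previous.timestamp);
    streams += 1;
  }
  if (elapsedMs <= 0) return null; // nothing comparable in the two reports
  const seconds = elapsedMs / 1000;
  return {
    streams,
    encodeMsPerSecond: (encodeSeconds * 1000) / seconds,
    kilobytesPerSecond: bytes / 1000 / seconds,
  };
}
```

In a browser the two reports would come from successive `await sender.getStats()` calls taken about a second apart.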

A second measurement worth taking is the SFU’s own CPU per forward. Instrument the server to time the per-packet path for both publisher types and divide by the subscriber count. In a 50-subscriber room the SVC dependency-descriptor parse runs once per forwarded packet per subscriber, so a publisher at 30 fps across nine operating points generates measurably more server work than the simulcast publisher whose forward is a constant-time rid lookup. If that per-forward delta, multiplied by your peak subscriber count, exceeds the publisher CPU and uplink you saved, the trade has gone the wrong way for your hardware — which is exactly the calculation that tips some large deployments back toward simulcast despite SVC’s client-side appeal.
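
The server-side half of that calculation can be sketched as a small estimate. Every input is a measurement you would supply from your own SFU; the function and parameter names are illustrative, and the numbers in the example are placeholders, not benchmarks.

```javascript
// Estimates the extra SFU CPU an SVC publisher costs over a simulcast one,
// expressed as a fraction of one core (1.0 = one fully busy core).
function extraSfuCpuFraction({
  svcMicrosPerForward,       // measured per-packet forward time for SVC
  simulcastMicrosPerForward, // measured per-packet forward time for simulcast
  packetsPerSecond,          // packets the publisher sends per second
  subscribers,               // peak subscriber count for that publisher
}) {
  const deltaMicros = svcMicrosPerForward - simulcastMicrosPerForward;
  const forwardsPerSecond = packetsPerSecond * subscribers;
  // Extra microseconds of work per wall-clock second, scaled to one core.
  return (deltaMicros * forwardsPerSecond) / 1_000_000;
}

// Placeholder inputs: 2 µs extra per forward, 200 packets/s, 50 subscribers.
const extra = extraSfuCpuFraction({
  svcMicrosPerForward: 3,
  simulcastMicrosPerForward: 1,
  packetsPerSecond: 200,
  subscribers: 50,
});
console.log(extra); // 0.02, i.e. about 2% of one core for this publisher
```

Multiply the result by the number of concurrent SVC publishers to get the fleet-level figure to weigh against the client-side savings.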

A Decision Procedure

Reduce the choice to four questions answered in order, and the conference almost always resolves cleanly. First: can every active publisher’s browser encode VP9 or AV1? If not, ship simulcast — SVC is unavailable to those publishers and a mixed mechanism per room is more operational complexity than it is worth. Second: is the publisher uplink or CPU the binding constraint? Mobile presenters on cellular uplinks and battery-bound devices favor SVC’s single encode and lower total uplink; desktop publishers on wired connections rarely feel the difference and can take simulcast’s simpler path. Third: how loaded is the SFU at peak? If the server is already near its CPU ceiling forwarding to large rooms, simulcast’s constant-time rid lookup is cheaper than SVC’s per-packet descriptor parse, and you should weight toward simulcast. Fourth: how forgiving does degradation need to be? Rooms where a brief quality dip is unacceptable benefit from SVC’s gentler temporal-first shedding.
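
The four questions can be written down as a small function. This is one possible encoding, not a definitive rule: the input names are hypothetical, question one is treated as a hard gate, the remaining three are treated as votes, and a tie resolves to simulcast as the simpler option.

```javascript
// Returns 'svc' or 'simulcast' for a room, given four yes/no answers.
function chooseMechanism({
  allPublishersEncodeSvcCodec, // Q1: every active publisher can encode VP9 or AV1
  publisherConstrained,        // Q2: publisher uplink or CPU is the binding constraint
  sfuNearCpuCeiling,           // Q3: the SFU is close to its CPU limit at peak
  needsGentleDegradation,      // Q4: brief quality dips are unacceptable
}) {
  // Question 1 is a hard gate: without codec support SVC is unavailable.
  if (!allPublishersEncodeSvcCodec) return 'simulcast';

  // Questions 2 and 4 weigh toward SVC; question 3 weighs toward simulcast.
  const svcVotes = Number(publisherConstrained) + Number(needsGentleDegradation);
  const simulcastVotes = Number(sfuNearCpuCeiling);

  // Assumption: a tie goes to simulcast, the simpler mechanism.
  return svcVotes > simulcastVotes ? 'svc' : 'simulcast';
}
```

Equal weights are the crudest choice; a deployment with hard SFU cost limits might instead treat question three as a second gate.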

In practice the common large-conference shape (a few mobile-capable speakers, many listeners, modern Chrome clients, an SFU with headroom) points to SVC, because the expensive encode is rare and the uplink and degradation wins land where they matter. The common enterprise shape (a strict, possibly older browser matrix, even where Chrome is nominally guaranteed, and a cost-sensitive server fleet) points to simulcast for its universality and cheap forwarding. When in doubt, ship simulcast first: it works everywhere, the SFU is trivial, and you can layer SVC in for the publishers that benefit once the baseline is solid.

Common Implementation Mistakes

Choosing SVC for a mixed-browser room. If any meaningful share of clients are VP8-only or older Safari, SVC silently degrades to one layer for them. Detect capability per publisher.
Ignoring SFU cost at scale. SVC’s per-packet dependency-descriptor parsing multiplies across every forward in a 50+ room — budget server CPU, not just publisher CPU.
Assuming SVC always saves bandwidth. The uplink win is real but modest (~5–10%); the bigger SVC advantage is publisher CPU and graceful degradation, not raw bytes.
Hardcoding one mechanism site-wide. Presenters on mobile uplinks benefit from SVC while desktop publishers with spare CPU are fine on simulcast — choose per publisher.
Forgetting decode cost on low-end subscribers. SVC decode is heavier; on weak mobile receivers it can cost more than the uplink it saved.

FAQ

At what participant count does SVC start to win? There is no hard threshold, but past ~50 participants the publisher uplink and CPU savings of SVC usually outweigh its SFU cost — provided clients run VP9 or AV1. Below that, simulcast’s simplicity often wins.

Can I mix simulcast and SVC publishers in one conference? Yes. The SFU treats them uniformly at the layer-selection level; only the forwarding mechanism differs per publisher. A capability-driven selector lets each publisher pick its best mechanism.

Does SVC reduce the SFU’s work? No — it increases it. The SFU must parse the dependency descriptor per packet for SVC, versus a trivial rid match for simulcast. SVC saves publisher resources, not server resources.

Related: return to Simulcast & SVC Implementation, then read Configuring AV1 SVC Layers in WebRTC and Simulcast with Three Quality Layers in Chrome, and cross to Simulcast-Aware Forwarding for the server side.