Munging SDP to Prefer Opus DTX

Opus discontinuous transmission (DTX) lets an encoder stop sending packets during silence, collapsing the audio bitrate from a steady 24–40 kbps down to a few hundred bps of comfort-noise updates while no one is speaking. The catch: there is no RTCRtpEncodingParameters flag for DTX, so enabling it means editing the a=fmtp line in the SDP by hand. This page is part of the SDP Offer/Answer Lifecycle guide. It covers when that string edit is justified, how to do it without breaking BUNDLE, and why you should reach for setCodecPreferences() for everything else.

Context & Trade-offs

The first rule of SDP manipulation is to prefer the structured APIs. RTCRtpTransceiver.setCodecPreferences() reorders codecs and the transceiver's direction attribute controls media flow, both without touching the text. Both also preserve a=group:BUNDLE and m-line ordering, which is precisely what regex munging tends to corrupt. Reordering or rewriting media sections by hand is the leading cause of the rejections covered in Debugging SDP m-line Mismatches, so the default answer to “should I munge the SDP?” is no.
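
As a minimal sketch of the structured route, the helper below reorders a codec capability list so Opus comes first. The name preferCodec is hypothetical, and the commented browser usage assumes that `pc` is an RTCPeerConnection which already has an audio transceiver.

```javascript
// Hypothetical helper: move codecs whose mimeType matches to the front,
// keeping the relative order of everything else. No SDP text is touched.
function preferCodec(codecs, mimeType) {
  const wanted = mimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const rest = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...preferred, ...rest];
}

// Browser usage (assumption: `pc` already has an audio transceiver):
//   const transceiver = pc.getTransceivers()
//     .find((t) => t.receiver.track.kind === 'audio');
//   const { codecs } = RTCRtpReceiver.getCapabilities('audio');
//   transceiver.setCodecPreferences(preferCodec(codecs, 'audio/opus'));
//   transceiver.direction = 'sendonly'; // media direction, also without editing SDP
```

Because the browser regenerates the SDP from these settings, the m-line order and BUNDLE group are never edited by hand.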

DTX is the narrow exception. The browser exposes no API surface for the usedtx fmtp parameter, so the only way to request it is to append usedtx=1 to the Opus codec’s a=fmtp line after createOffer() and before setLocalDescription(). The payoff is real: in a multi-party call where most participants are silent most of the time, DTX cuts per-stream upstream audio from roughly 24–40 kbps to near zero during silence, which compounds across a room and meaningfully reduces both client uplink and SFU forwarding cost. The trade-off is a small risk of a brief audio artifact at the silence-to-speech boundary on aggressive encoders, and — the part teams get wrong — the obligation to edit only the parameter and leave the m-line order, payload-type numbers, and BUNDLE group completely untouched.
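
To make the compounding effect concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration (10 participants, each speaking 10% of the time, 32 kbps while speaking, 0.4 kbps of comfort-noise updates while silent), not a measurement.

```javascript
// Rough average rate of one audio stream, in kbps.
// speakingFraction: share of time the participant is talking (0..1).
// activeKbps / silentKbps: assumed encoder rates while speaking / while silent.
function averageAudioKbps(speakingFraction, activeKbps, silentKbps) {
  return speakingFraction * activeKbps + (1 - speakingFraction) * silentKbps;
}

// Hypothetical room; all figures are illustrative assumptions.
const participants = 10;
// Without DTX the encoder sends the full rate even during silence.
const withoutDtx = participants * averageAudioKbps(0.1, 32, 32); // about 320 kbps
// With DTX the silent share drops to comfort-noise updates.
const withDtx = participants * averageAudioKbps(0.1, 32, 0.4);   // about 35.6 kbps
console.log({ withoutDtx, withDtx });
```

Under these assumptions the aggregate audio the SFU has to receive falls by roughly a factor of nine; the real ratio depends on how much each participant actually speaks.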

| Approach | Use for | BUNDLE safety |
| --- | --- | --- |
| setCodecPreferences() | Codec ordering, codec selection | Preserved by the API |
| direction attribute | Media direction / pausing | Preserved by the API |
| a=fmtp munge (usedtx=1) | DTX (no API exists) | Your responsibility |

It helps to picture what the encoder actually does. With usedtx=1 negotiated, the Opus encoder runs a voice-activity gate: when the input drops below a speech threshold it stops emitting full frames and sends only periodic comfort-noise updates, then resumes full-rate encoding the moment speech returns. The decoder, told to expect DTX, synthesises low-level comfort noise across the gaps so the listener hears a natural quiet room rather than a dead channel. This is purely an encoder-side optimisation negotiated through SDP — it changes nothing about the transport, the SSRC, or the RTP timestamps, which keep advancing so the receiver’s jitter buffer and lip-sync stay aligned. That is exactly why the edit must be surgical: you are toggling a codec parameter, not altering the media topology, and the m-line order, payload numbering, and BUNDLE group must come through byte-identical.

Minimal Runnable Implementation

Edit only the Opus a=fmtp line, keyed to the dynamically negotiated payload type, after generating the offer and before committing it.

// Append usedtx=1 to the Opus fmtp line without touching anything else.
function enableOpusDtx(sdp) {
  // 1. find Opus's negotiated payload type (it is dynamic, never assume 111);
  //    match the whole rtpmap line, including a trailing channel count such as "/2"
  const rtpmap = sdp.match(/a=rtpmap:(\d+) opus\/48000[^\r\n]*/i);
  if (!rtpmap) return sdp;                 // no Opus offered; leave SDP untouched
  const pt = rtpmap[1];

  // 2. locate that payload type's existing fmtp line
  const fmtpRe = new RegExp(`a=fmtp:${pt} (.*)`);
  const fmtp = sdp.match(fmtpRe);
  if (!fmtp) {
    // no fmtp line yet: add one immediately after the rtpmap, preserving m-line order
    return sdp.replace(rtpmap[0], `${rtpmap[0]}\r\na=fmtp:${pt} usedtx=1`);
  }
  if (/usedtx=/.test(fmtp[1])) return sdp;  // usedtx already present; leave its value alone

  // 3. append the param to the existing list, leaving order and BUNDLE intact
  return sdp.replace(fmtpRe, `a=fmtp:${pt} ${fmtp[1]};usedtx=1`);
}

async function makeOfferWithDtx(pc, signaling) {
  const offer = await pc.createOffer();
  // build a fresh description so the code does not rely on offer.sdp being writable
  const munged = { type: offer.type, sdp: enableOpusDtx(offer.sdp) }; // change ONLY the fmtp param
  await pc.setLocalDescription(munged);      // commit; m-line order unchanged
  signaling.send({ type: 'offer', sdp: pc.localDescription.sdp });
}

Key the edit to the payload type from a=rtpmap, never a hardcoded number — Opus is a dynamic payload type and differs across engines and negotiations. Touch nothing but the usedtx token; do not reorder fmtp parameters, do not renumber payload types, and do not move the m=audio section, or you will re-introduce the m-line drift this edit is supposed to avoid. The committed pc.localDescription.sdp is what you transmit, consistent with the SDP Offer/Answer Lifecycle commit sequence.
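
One way to enforce the "touch nothing else" rule is to diff the SDP before and after the munge and refuse to commit if anything other than the fmtp line moved. The sketch below is an illustrative check, not part of any browser API; the function names are hypothetical, and it covers only the common case where an Opus a=fmtp line already exists (a newly added fmtp line changes the line count and is reported as not surgical).

```javascript
// Compare two SDP strings line by line; return the differing pairs,
// or null if the number of lines changed.
function changedLines(beforeSdp, afterSdp) {
  const a = beforeSdp.split('\r\n');
  const b = afterSdp.split('\r\n');
  if (a.length !== b.length) return null; // a line was added or removed
  const changes = [];
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) changes.push({ before: a[i], after: b[i] });
  }
  return changes;
}

// True when exactly one line changed, it is an a=fmtp line, and the only
// difference is a trailing ";usedtx=1".
function isSurgicalDtxEdit(beforeSdp, afterSdp) {
  const changes = changedLines(beforeSdp, afterSdp);
  return changes !== null &&
    changes.length === 1 &&
    changes[0].before.startsWith('a=fmtp:') &&
    changes[0].after === `${changes[0].before};usedtx=1`;
}

// Usage: compare the SDP from createOffer() with the munged string
// before calling setLocalDescription(), and log or abort on false.
```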

Reproduction Steps & Debugging Log Patterns

  1. Create an offer, run enableOpusDtx(), and confirm the Opus a=fmtp line now contains usedtx=1 while the m= line order is byte-for-byte unchanged.
  2. Commit with setLocalDescription() and verify the promise resolves and signalingState advances to have-local-offer. A rejection here (typically an SDP parse or modification error) means the munge corrupted the description.
  3. Complete the exchange and stay silent on the sending side; poll getStats() at 1-second intervals and watch the outbound audio bytesSent delta fall toward zero during silence.
  4. Speak, and confirm the bitrate computed from the bytesSent delta climbs back to the normal 24–40 kbps range.
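  5. Diff the offer against the answer’s mid ordering to prove BUNDLE survived intact.

// During silence, outbound audio nearly flatlines under DTX.
// Sum bytesSent over every outbound audio RTP stream in a stats report.
const audioBytesSent = (report) => [...report.values()]
  .filter((s) => s.type === 'outbound-rtp' && s.kind === 'audio')
  .reduce((sum, s) => sum + s.bytesSent, 0);

const before = audioBytesSent(await pc.getStats());
await new Promise((resolve) => setTimeout(resolve, 5000)); // 5s of silence (or speech)
const after = audioBytesSent(await pc.getStats());
console.log(`RTC: [dtx] bytesSent delta over 5s: ${after - before} bytes`);
// RTC: [dtx] bytesSent delta during silence ~ a few hundred bytes (comfort noise)
// RTC: [dtx] bytesSent delta while speaking ~ 15-25 KB over 5s (24-40 kbps)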

If audio cuts out entirely rather than dropping to comfort noise, the receiver likely lacks DTX support and is misreading the gaps — fall back by omitting usedtx=1. If setLocalDescription() throws, your edit altered more than the fmtp value; revert and re-derive the payload type from a=rtpmap. Cross-check raw SDP in chrome://webrtc-internals or Firefox about:webrtc.
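
The mid-ordering check from step 5 can be sketched as a small text comparison. The helper names below are hypothetical, and the sketch assumes each media section carries exactly one a=mid line, which is what current browsers emit.

```javascript
// Collect the a=mid values in m-section order, plus the BUNDLE group members.
function sdpTopology(sdp) {
  const lines = sdp.split(/\r?\n/);
  const mids = lines
    .filter((l) => l.startsWith('a=mid:'))
    .map((l) => l.slice('a=mid:'.length));
  const groupLine = lines.find((l) => l.startsWith('a=group:BUNDLE'));
  const bundle = groupLine ? groupLine.split(' ').slice(1) : [];
  return { mids, bundle };
}

// True when offer and answer list the same mids in the same order.
function sameMidOrder(offerSdp, answerSdp) {
  const o = sdpTopology(offerSdp).mids;
  const a = sdpTopology(answerSdp).mids;
  return o.length === a.length && o.every((mid, i) => mid === a[i]);
}

// Usage: sameMidOrder(pc.localDescription.sdp, pc.remoteDescription.sdp)
// after the answer is applied; also log sdpTopology(...).bundle for both sides.
```

The BUNDLE members are returned separately rather than compared strictly, because an answerer may legitimately bundle fewer sections than were offered if it rejects one of them.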

When the connection runs through an SFU rather than peer-to-peer, verify the savings end to end. DTX reduces what the client uploads, but the server only forwards the savings if it relays the gaps faithfully instead of padding the stream back to a constant rate. Inspect the SFU’s outbound bytesSent for the silent participant the same way you inspected the client’s, and confirm the BUNDLE group survived the server’s own SDP rewrite — many servers regenerate descriptions per subscriber, and a server-side munge can drop usedtx=1 or, worse, shuffle the m-line order. The interaction with simulcast is benign because DTX only affects the audio section, but the BUNDLE check still matters: audio shares the transport with video, and corrupting the group breaks both.

Common Implementation Mistakes

FAQ

Why prefer setCodecPreferences() over regex munging? Because the structured APIs preserve m-line order and the BUNDLE group automatically, while string edits routinely corrupt both. Munge only when no API exposes the parameter you need — DTX’s usedtx=1 is the canonical example.

How much bandwidth does Opus DTX actually save? During silence the encoder drops from a steady 24–40 kbps to a few hundred bps of comfort-noise updates. In a multi-party call where most participants are silent most of the time, that compounds across every stream and cuts both client uplink and server-side forwarding cost substantially.

Can munging a=fmtp break BUNDLE? Only if your edit strays beyond the parameter value — reordering fmtp tokens, renumbering payload types, or moving the media section. Append usedtx=1 to the existing Opus fmtp line and leave everything else byte-identical, and BUNDLE is unaffected.

Related: return to the SDP Offer/Answer Lifecycle guide, and see Debugging SDP m-line Mismatches and SDP Renegotiation Without Dropping Streams.