Self-Hosting Coturn STUN vs Public STUN Servers

Almost every WebRTC tutorial hardcodes stun:stun.l.google.com:19302, and for a prototype that is fine. The question this page settles is when that default stops being acceptable and a self-hosted coturn STUN listener earns its operational cost — judged on reliability, privacy, rate limits, and latency, not folklore. This guide is part of the STUN Server Deployment Strategies section, which in turn sits under the WebRTC Protocol Stack & Signaling Servers guide.

Context & Trade-offs

A public STUN server resolves your client’s reflexive (srflx) candidate for free, with zero infrastructure to run. The trade-off is everything you do not control.

Reliability. Servers like stun.l.google.com carry no SLA. They are best-effort, can change IPs, and can be unreachable from networks that block or deprioritise their ranges. STUN sits on the critical path of every connection, so an outage you cannot page on becomes call failures you cannot explain. The free public resolvers exist primarily to serve their operators’ own products; you are a guest on infrastructure tuned for someone else’s traffic shape, and nothing contractually obliges it to stay reachable, keep the same address, or answer your volume of requests.

Privacy. Every Binding Request hands a third party your users’ source IP and a timing signal for when a call starts. For consumer apps that may be acceptable; for healthcare, finance, or enterprise deployments with data-residency obligations, sending reflexive lookups to a third-party operator is often disqualifying on its own.

Rate limits. Public resolvers apply undocumented per-source throttling. Behind a corporate NAT or carrier-grade NAT, hundreds of clients share one public IP, and the resolver cannot tell them apart — collectively they can trip a rate limit, and the resulting dropped Binding Responses surface as random, unreproducible gathering failures for a subset of users. This is the most insidious failure mode of public STUN: it does not fail cleanly, it fails for a fraction of one organisation’s users, intermittently, in a way your own logs cannot see because the throttling happens on infrastructure you do not operate. By the time support escalates “video doesn’t connect from our Frankfurt office,” you have no instrumentation to confirm STUN is the cause.
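The shared-NAT effect described above can be illustrated with a small simulation. This is a minimal sketch, not a description of how any public resolver actually throttles: the token-bucket limiter, its capacity and refill rate, and the names TokenBucket and simulateSharedNat are all assumptions made for illustration.

```javascript
// Hypothetical per-source token bucket: the kind of limiter a resolver *might* apply.
class TokenBucket {
  constructor(capacity, refillPerSecond) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastMs = 0;
  }

  // Returns true if a request arriving at nowMs is answered, false if it is dropped.
  allow(nowMs) {
    const elapsedSec = (nowMs - this.lastMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Simulate `clients` users behind one NAT, each sending one Binding Request
// within the same second. The limiter is keyed on the shared public IP, so it
// cannot tell the users apart.
function simulateSharedNat(clients, capacity, refillPerSecond) {
  const buckets = new Map(); // source IP -> TokenBucket
  const sharedIp = '198.51.100.7'; // documentation address standing in for the NAT's public IP
  let answered = 0;
  for (let i = 0; i < clients; i++) {
    if (!buckets.has(sharedIp)) {
      buckets.set(sharedIp, new TokenBucket(capacity, refillPerSecond));
    }
    const nowMs = Math.floor((i * 1000) / clients); // arrivals spread evenly over one second
    if (buckets.get(sharedIp).allow(nowMs)) answered++;
  }
  return { answered, dropped: clients - answered };
}

console.log(simulateSharedNat(5, 50, 10));   // a handful of users: nothing dropped
console.log(simulateSharedNat(300, 50, 10)); // an office behind one IP: most requests dropped
```

With the assumed limits, a few users never notice the limiter, while an office of 300 sharing one address loses most of its lookups in the same second, which matches the intermittent, per-organisation failure pattern described above.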

Latency. A single global public endpoint means a fixed, often cross-continent round trip. A self-hosted node placed near your users can keep the reflexive lookup under roughly 30 ms, and a multi-region deployment can cut initial connect latency by something on the order of 40–60% compared with one distant resolver. On mobile networks and CGNAT, where NAT bindings can time out in under 30 s, a fast local response also matters because a slowly gathered candidate can go stale before it is used. The reflexive candidate also frequently sits on the critical path: when direct host paths fail but the NAT is not symmetric, the pair ICE nominates is built on the srflx candidate, so its gathering time is the connect time the user feels.

Self-hosting coturn flips all four: you own the SLA, the lookups never leave your infrastructure, you set the rate limits, and you choose placement. The cost is running and monitoring the nodes — but a STUN-only coturn is stateless and cheap, far lighter than the TURN Server Configuration & Auth it often sits beside. A common hybrid keeps public STUN as a secondary entry so a self-hosted outage degrades rather than fails.

There is also a measurement angle. With a public resolver you cannot instrument the lookup — you have no logs, no Prometheus counters, and no way to correlate a spike in failed connections with STUN behaviour. A self-hosted node exports request rates, dropped packets, and error responses, which turns “some users can’t connect” from an unactionable report into a graph you can alarm on. For any team that takes connection reliability seriously enough to have an on-call rotation, that observability alone often justifies the node before privacy or latency enter the argument.

The decision is not all-or-nothing. The pragmatic posture for most production apps is: self-host STUN in the regions where your users concentrate, list a public resolver as a secondary entry, and always pair both with a TURN relay for the symmetric-NAT clients that STUN can never serve regardless of who operates it.

A rough threshold helps. A weekend prototype or an internal demo with a handful of users gains nothing from a self-hosted node — the public default is correct, and the operational overhead is wasted. The calculus changes when any of three things become true: you have a paying user base whose connection failures generate support tickets, you carry data-residency or privacy obligations that make third-party reflexive lookups unacceptable, or your users concentrate in a region far from the public resolvers’ best-served networks and you can measure the latency penalty. Cross any one of those lines and a STUN-only coturn node — stateless, cheap, and trivial to run beside the TURN relay you already need — stops being premature optimisation and starts being basic operational hygiene.

Minimal Runnable Implementation

A STUN-only coturn node needs almost nothing. A plain STUN Binding exchange (RFC 8489) runs without authentication, so there is no realm, secret, or credential to manage. Strip everything TURN-related and keep the listener.

# /etc/turnserver.conf: STUN-only node (no relay, no auth)
listening-port=3478           # standard STUN/TURN port
stun-only                     # answer Binding requests only; ignore TURN requests
external-ip=203.0.113.21      # the node's PUBLIC IP when it sits behind 1:1 NAT; never the cloud-internal RFC 1918 address
no-tls                        # plain STUN needs no TLS listener; drop it to shrink attack surface
no-tcp                        # serve UDP only; STUN can run over TCP, but browsers gather srflx over UDP
no-auth                       # plain STUN Binding is unauthenticated
no-cli                        # disable the telnet admin console in production
no-multicast-peers            # block multicast relay peers (only matters if relay is ever enabled)
denied-peer-ip=10.0.0.0-10.255.255.255       # refuse private ranges as relay peers (defence in depth)
denied-peer-ip=172.16.0.0-172.31.255.255
denied-peer-ip=192.168.0.0-192.168.255.255
# These quotas cap concurrent TURN allocations, not Binding requests.
# Rate-limit Binding traffic per source IP at the firewall instead.
user-quota=12
total-quota=1200

Point the client at your node, and keep a public resolver as a backup entry so a self-hosted outage degrades instead of failing:

// `creds` is assumed to hold short-lived TURN credentials fetched from your backend beforehand.
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.yourdomain.com:3478' },     // self-hosted, near the user
    { urls: 'stun:stun.l.google.com:19302' },       // public fallback if self-hosted is down
    { urls: 'turn:turn.yourdomain.com:3478',        // relay for symmetric NAT
      username: creds.username, credential: creds.credential }
  ],
  iceCandidatePoolSize: 10  // warm srflx gathering before createOffer()
});
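For a multi-region deployment, the same list can be assembled per user. The following is a rough sketch only: the region keys, the hostnames in REGIONAL_STUN, and the buildIceServers helper are hypothetical names introduced here, not part of any WebRTC API.

```javascript
// Hypothetical region -> self-hosted STUN hostname map; replace with your own nodes.
const REGIONAL_STUN = {
  'eu-central': 'stun-eu.yourdomain.com',
  'us-east': 'stun-us.yourdomain.com',
};
const DEFAULT_STUN = 'stun.yourdomain.com';
const PUBLIC_FALLBACK = 'stun:stun.l.google.com:19302';

// Returns an iceServers array: nearest self-hosted node, public fallback,
// and a TURN relay entry when credentials are available.
function buildIceServers(region, creds) {
  const host = REGIONAL_STUN[region] || DEFAULT_STUN;
  const servers = [
    { urls: `stun:${host}:3478` },
    { urls: PUBLIC_FALLBACK },
  ];
  if (creds) {
    servers.push({
      urls: 'turn:turn.yourdomain.com:3478',
      username: creds.username,
      credential: creds.credential,
    });
  }
  return servers;
}
```

The result can be passed straight to the constructor, for example new RTCPeerConnection({ iceServers: buildIceServers(userRegion, creds) }), where userRegion is whatever region hint your signalling layer already has.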

Reproduction Steps & Debugging Log Patterns

1. Start the node: docker run -d --network host -v /etc/turnserver.conf:/etc/turnserver.conf:ro coturn/coturn -c /etc/turnserver.conf (the bind mount makes the host's config file visible inside the container). Confirm it is listening with ss -ulpn | grep 3478 (expect a udp row, not tcp).
2. Probe a real Binding cycle from a machine outside the node's network, for example with stuntman's stunclient: stunclient --mode basic stun.yourdomain.com 3478. Expect a Mapped address line showing <public-ip>:<port>, where the address is the public IP of the machine running the probe. If the mapped address is an RFC 1918 10.x/192.168.x value, the probe reached the node without crossing a NAT (typically because it ran inside the same private network), so repeat it from outside.
3. In a browser, open chrome://webrtc-internals, start a connection, and watch the candidate list. A working self-hosted node emits a srflx candidate within tens of milliseconds: candidate:... typ srflx raddr 0.0.0.0 rport 0 ... generation 0
4. Compare against the public path by temporarily swapping the first iceServers entry to stun.l.google.com and re-reading the gathering timestamps. The self-hosted node should resolve the reflexive candidate noticeably sooner for local users.
5. To reproduce a rate-limit symptom, drive many requests from one source IP against a resolver that throttles; dropped responses appear as missing srflx candidates while host candidates still gather, and iceGatheringState reaches complete with fewer candidates than expected.
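The Binding cycle probed in the steps above is small enough to sketch by hand. The following Node.js example builds a 20-byte Binding Request and decodes the XOR-MAPPED-ADDRESS attribute from the response, following the RFC 8489 wire format. It is a sketch under stated assumptions: the function names are made up for illustration, only IPv4 is handled, and the probe helper does not verify the transaction ID or retransmit.

```javascript
const crypto = require('node:crypto');
const dgram = require('node:dgram');

const MAGIC_COOKIE = 0x2112a442; // fixed value defined by RFC 8489
const BINDING_REQUEST = 0x0001;
const BINDING_SUCCESS = 0x0101;
const ATTR_XOR_MAPPED_ADDRESS = 0x0020;

// Build a Binding Request: type, length 0, magic cookie, 96-bit transaction ID.
function buildBindingRequest(transactionId = crypto.randomBytes(12)) {
  const msg = Buffer.alloc(20);
  msg.writeUInt16BE(BINDING_REQUEST, 0);
  msg.writeUInt16BE(0, 2); // no attributes follow the header
  msg.writeUInt32BE(MAGIC_COOKIE, 4);
  transactionId.copy(msg, 8);
  return msg;
}

// Extract the IPv4 XOR-MAPPED-ADDRESS from a Binding Success Response, or return null.
function parseXorMappedAddress(msg) {
  if (msg.length < 20 || msg.readUInt16BE(0) !== BINDING_SUCCESS) return null;
  if (msg.readUInt32BE(4) !== MAGIC_COOKIE) return null;
  const end = Math.min(msg.length, 20 + msg.readUInt16BE(2));
  let offset = 20;
  while (offset + 4 <= end) {
    const type = msg.readUInt16BE(offset);
    const length = msg.readUInt16BE(offset + 2);
    const value = offset + 4;
    const isIpv4 = length >= 8 && value + 8 <= end && msg[value + 1] === 0x01;
    if (type === ATTR_XOR_MAPPED_ADDRESS && isIpv4) {
      // Port is XORed with the top 16 bits of the cookie, address with all 32 bits.
      const port = msg.readUInt16BE(value + 2) ^ (MAGIC_COOKIE >>> 16);
      const addr = (msg.readUInt32BE(value + 4) ^ MAGIC_COOKIE) >>> 0;
      const ip = [addr >>> 24, (addr >>> 16) & 0xff, (addr >>> 8) & 0xff, addr & 0xff].join('.');
      return { ip, port };
    }
    offset = value + length + ((4 - (length % 4)) % 4); // attributes are padded to 4 bytes
  }
  return null;
}

// Send one request over UDP and resolve with the mapped address (or null).
function probe(host, port = 3478, timeoutMs = 2000) {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket('udp4');
    const timer = setTimeout(() => {
      socket.close();
      reject(new Error('STUN timeout'));
    }, timeoutMs);
    socket.on('message', (msg) => {
      clearTimeout(timer);
      socket.close();
      resolve(parseXorMappedAddress(msg));
    });
    socket.on('error', (err) => {
      clearTimeout(timer);
      socket.close();
      reject(err);
    });
    socket.send(buildBindingRequest(), port, host);
  });
}
```

Calling probe('stun.yourdomain.com').then(console.log) from a machine outside the node's network should print that machine's public address and port; a timeout suggests the UDP listener or a firewall rule is the problem rather than the browser.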

Common Implementation Mistakes

Treating public STUN as production-grade. No SLA, no latency guarantee, and undocumented throttling make stun.l.google.com a prototype convenience, not a dependency to build call reliability on.
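external-ip left at the cloud-internal address. Behind a 1:1 NAT gateway the node then advertises a private address wherever coturn reports its own address (relay allocations and alternate-address attributes), which breaks the TURN half of the same binary. The srflx candidate itself is the client's address as the node sees it, so verify it with a Binding probe from outside the node's network after deploy.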
Self-hosting STUN but skipping a relay. STUN cannot traverse symmetric NAT no matter where you host it; users behind symmetric or carrier-grade NAT still need TURN. Pair the node with configuring Coturn for production TURN relay.
No fallback entry. A single self-hosted endpoint with no public backup turns one node outage into total connection failure. List a public resolver as a secondary iceServers entry.
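Omitting rate limits because STUN is unauthenticated. A Binding response is larger than the request that triggered it and is sent to whatever source address the request claimed, so an open resolver can be abused for UDP reflection and amplification. Rate-limit per source IP even on a STUN-only node, and do it at the firewall, because coturn's user-quota and total-quota cap TURN allocations rather than Binding requests.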
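The amplification concern in the last item can be put into numbers. The sketch below uses sizes that follow from RFC 8489 (a 20-byte header and a 12-byte IPv4 XOR-MAPPED-ADDRESS attribute); the 100-byte figure for a response carrying extra attributes is an assumed example, not a measurement of coturn.

```javascript
const STUN_HEADER = 20;           // bytes, fixed by RFC 8489
const XOR_MAPPED_IPV4 = 4 + 8;    // attribute header + IPv4 value
const IPV4_UDP_OVERHEAD = 20 + 8; // IPv4 header without options + UDP header

// Ratio of response size to request size, for the STUN payload and on the wire.
function amplification(requestPayload, responsePayload) {
  return {
    payload: responsePayload / requestPayload,
    onWire: (responsePayload + IPV4_UDP_OVERHEAD) / (requestPayload + IPV4_UDP_OVERHEAD),
  };
}

// Smallest useful exchange: bare request, response with only XOR-MAPPED-ADDRESS.
const minimal = amplification(STUN_HEADER, STUN_HEADER + XOR_MAPPED_IPV4);
console.log(minimal); // { payload: 1.6, onWire: 1.25 }

// Assumed example: a response that also carries software and origin attributes.
const verbose = amplification(STUN_HEADER, 100);
console.log(verbose.payload); // 5
```

The minimal case is a modest gain, but every extra attribute in the response raises it, which is why trimming optional response attributes and rate-limiting per source both help.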

FAQ

Is a self-hosted STUN server expensive to run? No. A STUN-only coturn is stateless and answers a single UDP request/response per lookup, so a small instance per region handles very high request rates. The cost is operational — placement, monitoring, and health checks — not compute. Relaying media through TURN is what consumes real bandwidth.

Can I keep using public STUN as a fallback? Yes, and it is a good default. List your self-hosted node first and a public resolver second in iceServers. The ICE agent gathers from both, so a self-hosted outage degrades to the public path instead of failing the connection outright.

Does self-hosting STUN remove the need for TURN? No. STUN — public or self-hosted — only discovers a reflexive address and cannot traverse symmetric NAT, where the port mapping changes per destination. A TURN relay remains the mandatory fallback for those clients; see traversing symmetric NAT with TURN.

How do I know whether public STUN is actually hurting me? Instrument gathering. Log the timestamp of each srflx candidate against iceGatheringState transitions in chrome://webrtc-internals, segmented by user region, and watch for two signals: reflexive candidates that consistently arrive 150 ms+ after host candidates (a latency problem), and connections where srflx is missing entirely for a subset of users sharing a corporate or carrier NAT (a rate-limit problem). Both point at a self-hosted node placed nearer those users with quotas you control.
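The two signals in that answer can be collected from your own client rather than read off a single browser tab. What follows is a hedged sketch: recordGathering, classifyGathering, and summarizeByRegion are hypothetical helper names, the region label is assumed to come from your own application, and the 150 ms threshold is the figure quoted above.

```javascript
// Attach to a peer connection right before gathering starts (for example just
// before setLocalDescription). Calls onDone with timings once gathering completes.
function recordGathering(pc, region, onDone) {
  const start = performance.now();
  const sample = { region, hostMs: null, srflxMs: null };
  pc.addEventListener('icecandidate', (event) => {
    if (!event.candidate) return; // null candidate marks end of gathering
    const elapsed = performance.now() - start;
    if (event.candidate.type === 'host' && sample.hostMs === null) sample.hostMs = elapsed;
    if (event.candidate.type === 'srflx' && sample.srflxMs === null) sample.srflxMs = elapsed;
  });
  pc.addEventListener('icegatheringstatechange', () => {
    if (pc.iceGatheringState === 'complete') onDone(sample);
  });
}

// Label one sample: missing srflx hints at throttling, late srflx at latency.
function classifyGathering(sample, lateThresholdMs = 150) {
  if (sample.srflxMs === null) return 'srflx-missing';
  const hostMs = sample.hostMs === null ? 0 : sample.hostMs;
  if (sample.srflxMs - hostMs >= lateThresholdMs) return 'srflx-late';
  return 'ok';
}

// Count labels per region so a cluster of failures in one office stands out.
function summarizeByRegion(samples) {
  const summary = {};
  for (const sample of samples) {
    if (!summary[sample.region]) {
      summary[sample.region] = { ok: 0, 'srflx-late': 0, 'srflx-missing': 0 };
    }
    summary[sample.region][classifyGathering(sample)] += 1;
  }
  return summary;
}
```

Samples reported by recordGathering can be shipped to whatever telemetry pipeline you already run; summarizeByRegion is only one way to aggregate them.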

Related: return to STUN Server Deployment Strategies for multi-region placement, and read configuring Coturn for production TURN relay and TURN Server Configuration & Auth for the stateful relay half of the same coturn binary.
