SFU, WebRTC Multiparty Video Architecture and its uses

What is an SFU?

SFU is short for Selective Forwarding Unit. To understand what this means, we must first ask a question: what is a multiparty video conference?

A multiparty video conference is a video conference where many people from different locations can all be on one video call at the same time. Each participant of the conference will have a live video feed from every other participant. To accomplish this, an effective routing mechanism must be in place.

What are the models used to deploy a multiparty video conference?

In general, there are three main models for deploying a multiparty video conference.

  1. Mesh
  2. MCU
  3. SFU

In a mesh architecture, each participant sends their streams directly to every other participant. The connections are peer-to-peer, and for video communication this approach is generally practical for no more than about 4 participants.

In an MCU (Multipoint Control Unit) architecture, each participant sends their streams to a single server, which mixes and combines all inputs. The resulting stream is then sent to each participant. This architecture eases the load on the network, since each participant sends and receives only one stream, but it is expensive because the server has to decode, mix, and re-encode the media.

In an SFU architecture, each participant sends their streams to a routing server, which forwards them to the other participants, adapting each forwarded stream if needed.
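To make the difference between the three models concrete, here is a minimal sketch that counts media streams for a call in which every participant sends one stream and watches everyone else. The names `StreamLoad` and `stream_load` are hypothetical and exist only for this illustration. The counts are a simplification that ignores audio/video separation and any adaptation the server may apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamLoad:
    """Media streams handled by one participant and by the central server."""
    client_uploads: int
    client_downloads: int
    server_inbound: int
    server_outbound: int


def stream_load(model: str, participants: int) -> StreamLoad:
    """Count streams when everyone sends one stream and sees everyone else."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    others = participants - 1
    if model == "mesh":
        # No server: each client uploads to, and downloads from, every peer.
        return StreamLoad(others, others, 0, 0)
    if model == "mcu":
        # The server mixes all inputs into one combined stream per participant.
        return StreamLoad(1, 1, participants, participants)
    if model == "sfu":
        # The server forwards each incoming stream, unmixed, to every other participant.
        return StreamLoad(1, others, participants, participants * others)
    raise ValueError(f"unknown model: {model}")


for name in ("mesh", "mcu", "sfu"):
    print(name, stream_load(name, 4))
```

Under these assumptions, a mesh client's upload count grows with every participant who joins, while MCU and SFU clients always upload a single stream. The SFU pays for this with more outbound streams on the server, but it never has to mix them.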

Which model is better?

The answer to this question depends on the need. For example, if there will only be two participants in a video call, then a mesh (peer-to-peer) connection is the best choice because it is easy to implement. However, all media processing is performed by the hardware of each participating system, so this architecture breaks down as the number of participants in the call increases: the processors on those systems run out of capacity to handle all the incoming and outgoing media streams.


When is SFU useful?

SFU shines when there are multiple people in a video conference with each one possibly on different devices and network types.

The advantage of SFU is that the routing server decides which media streams to send to each participant based on criteria such as device resolution and network speed. By default, a WebRTC sender tries to send media such as video at the highest quality it can. But sending a 4K video to a device that cannot display it, for example, is a waste of resources. In this case, the server can send a lower-resolution video to that device.
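The per-receiver decision described above can be sketched as follows. This is an illustration rather than the logic of any particular SFU: it assumes the sender makes several encodings ("layers") of its video available to the server, and the names `Layer`, `Receiver`, and `select_layer`, as well as the resolutions and bitrates, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Layer:
    """One encoding of a sender's video (illustrative values only)."""
    name: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # approximate bandwidth needed to receive it


@dataclass(frozen=True)
class Receiver:
    max_height: int      # tallest video the device can usefully display
    bandwidth_kbps: int  # estimated downlink bandwidth available for this stream


def select_layer(layers: List[Layer], receiver: Receiver) -> Optional[Layer]:
    """Return the highest-resolution layer the receiver can display and afford."""
    usable = [
        layer for layer in layers
        if layer.height <= receiver.max_height
        and layer.bitrate_kbps <= receiver.bandwidth_kbps
    ]
    # None means no layer fits; a real server might then pause the video.
    return max(usable, key=lambda layer: layer.height, default=None)


LAYERS = [
    Layer("low", 180, 150),
    Layer("medium", 360, 500),
    Layer("high", 720, 1500),
]

phone = Receiver(max_height=360, bandwidth_kbps=2000)
laptop_on_slow_link = Receiver(max_height=1080, bandwidth_kbps=400)

print(select_layer(LAYERS, phone).name)
print(select_layer(LAYERS, laptop_on_slow_link).name)
```

Here the phone is limited by its screen and the laptop by its network, so each gets a different layer of the same source video without the server re-encoding anything.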

Put simply, the cost of implementing an SFU scales well as the number of simultaneous users grows.


There isn't one catch-all solution when it comes to WebRTC implementation, but there are case-by-case solutions. One-to-one communication can easily be handled by a peer-to-peer connection. Many-to-many communication, on the other hand, can be handled by an MCU or an SFU, with SFU-based deployments able to handle participant numbers in the range of several thousand or even millions.

In short, if you want to provide WebRTC services to millions of customers at once, then SFU might be the implementation that you are looking for.