To get started, look at how your application uses WebRTC. Think about questions such as these: What is the maximum number of users I need to support in a session? What is the average number of users I expect to support in a session? How many users will send media within a session? How many users will receive media within a session?

These questions will help you decide on the technical architecture that most closely matches your needs.

Here are some technical architectures that help you scale your WebRTC application.

  • Mesh Architecture

Mesh architecture depends entirely on the capabilities of each client.

Each client connects directly to every other client, resulting in each client managing (n-1) bidirectional connections, where n is the total number of clients.

With 3 clients in a session, each client has to manage 2 separate connections. Since each connection is encoded uniquely, each client is responsible for encrypting and uploading its media two separate times.

This type of architecture can be represented as a complete graph, where each vertex is a client and each edge is a connection. The total number of edges/connections grows quadratically, equal to n(n-1)/2, which starts out small but adds up in a hurry: 3 connections for 3 clients, but 45 for 10 clients.
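The growth described above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names are illustrative and not part of any WebRTC API.

```python
def mesh_connections_per_client(n: int) -> int:
    """Connections each client manages in a mesh of n clients: n - 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n - 1


def mesh_total_connections(n: int) -> int:
    """Edges in a complete graph with n vertices: n(n-1)/2."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * (n - 1) // 2


# Print how both numbers grow as the session gets larger.
for n in (2, 3, 5, 10, 20):
    print(f"{n:>2} clients: {mesh_connections_per_client(n):>2} per client, "
          f"{mesh_total_connections(n):>3} total")
```

Each client's own upload work grows linearly with n, while the session-wide connection count grows quadratically.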

Mesh requires no server infrastructure beyond the requisite signalling and TURN server(s), so for applications where sessions typically involve 2-3 clients, mesh is an excellent, low-cost approach.

  • Forwarding Architecture

Forwarding Architecture depends on the selective forwarding unit, or SFU, which acts as an intelligent media relay in the middle of a session.

Every client connects to the SFU once to send media, and then once more for every other client to receive media, resulting in each client managing n unidirectional connections, where n is the total number of clients.

With 8 clients in a session, each client has to manage 8 separate connections. One of those connections is reserved for sending media, while the others are exclusively for receiving media.

The total number of connections in this architecture is n². While this is more than in Mesh Architecture, it is more scalable for your clients. It plays well into the asymmetric nature of most Internet connections by requiring each client to upload only once. Related to this, each client encrypts its media only once, which alleviates pressure on the device itself.
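As a rough illustration of these counts, the sketch below computes the connections for a forwarding session of n clients. The function names are made up for this example.

```python
def sfu_connections_per_client(n: int) -> int:
    """One upstream connection plus one downstream per other client."""
    if n < 1:
        raise ValueError("n must be at least 1")
    send = 1
    receive = n - 1
    return send + receive


def sfu_total_connections(n: int) -> int:
    """n clients, each managing n unidirectional connections."""
    return n * sfu_connections_per_client(n)


n = 8
print(f"{n} clients: {sfu_connections_per_client(n)} connections per client "
      f"(1 send, {n - 1} receive), {sfu_total_connections(n)} in total")
```

Only one of each client's connections carries upload traffic, regardless of how large n gets.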

The Forwarding Architecture can employ various scaling techniques. Because it acts as a proxy between the sender and receiver, it can monitor the bandwidth capabilities of each leg and selectively apply temporal (frame-rate) and spatial (resolution) scaling to the packet stream as it moves through the server.
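One way to picture this selective scaling is as a lookup over a ladder of stream variants. The sketch below is a simplified illustration, not how any particular SFU is implemented: the `Layer` type, the ladder, and every bitrate in it are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Layer:
    width: int   # spatial scaling (resolution)
    height: int
    fps: int     # temporal scaling (frame rate)
    kbps: int    # approximate bitrate; illustrative values only


# Hypothetical layer ladder, not taken from any real SFU configuration.
LAYERS = (
    Layer(1280, 720, 30, 1500),
    Layer(1280, 720, 15, 1000),
    Layer(640, 360, 30, 600),
    Layer(640, 360, 15, 400),
    Layer(320, 180, 15, 150),
)


def select_layer(available_kbps: int, layers: Sequence[Layer] = LAYERS) -> Layer:
    """Pick the highest-bitrate layer that fits the receiver's estimated
    bandwidth, falling back to the lowest layer if none fits."""
    fitting = [layer for layer in layers if layer.kbps <= available_kbps]
    if fitting:
        return max(fitting, key=lambda layer: layer.kbps)
    return min(layers, key=lambda layer: layer.kbps)


# Each receiving leg gets its own choice based on its own estimate.
for estimate in (5000, 700, 100):
    print(estimate, "kbps ->", select_layer(estimate))
```

The sender still uploads once; the per-receiver decision happens on the server for each downstream leg.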

  • Mixing Architecture

Mixing Architecture depends on a multipoint control unit, or MCU, which acts as a high-powered media mixer in the middle of a session. Every client connects to the MCU, resulting in each client managing just a single bidirectional connection, regardless of the number of other clients present. The connection is used to send media to the server and to receive a mix of media from it.
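To put the three architectures side by side, the sketch below counts how many media streams each client uploads and downloads in a session of n clients. The function and the architecture labels are illustrative names for this example only.

```python
def per_client_streams(architecture: str, n: int) -> tuple[int, int]:
    """Return (uploads, downloads) per client for a session of n clients."""
    if n < 2:
        raise ValueError("a session needs at least 2 clients")
    if architecture == "mesh":
        # One bidirectional connection to every other client.
        return (n - 1, n - 1)
    if architecture == "forwarding":
        # One upload to the SFU, one download per other client.
        return (1, n - 1)
    if architecture == "mixing":
        # One bidirectional connection carrying a single mixed stream back.
        return (1, 1)
    raise ValueError(f"unknown architecture: {architecture}")


n = 30
for name in ("mesh", "forwarding", "mixing"):
    up, down = per_client_streams(name, n)
    print(f"{name:<10} n={n}: {up} up, {down} down per client")
```

With mixing, the per-client numbers stay constant as n grows, and the cost moves to the server that decodes, mixes, and re-encodes the media.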

For applications with large numbers of active participants, like virtual classrooms where devices are particularly bandwidth- and resource-constrained, this approach is a strong fit.

  • Hybrid Architecture

A hybrid architecture is a combination of Mesh, Forwarding, and Mixing Architectures. In a hybrid mode, participants can join a session based on whatever makes the most sense for the session or even for a particular endpoint.
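A per-endpoint decision like this could be sketched as a small rule function. Everything below is an assumption made for illustration: the `mesh_limit` threshold of 3 follows the "2-3 clients" guidance for mesh, and the `constrained_device` flag is a hypothetical input that a real application would have to derive for itself.

```python
def choose_architecture(participants: int,
                        constrained_device: bool,
                        mesh_limit: int = 3) -> str:
    """Pick how one endpoint should join a session.

    mesh_limit and constrained_device are illustrative assumptions,
    not values prescribed by WebRTC or by any media server.
    """
    if participants < 2:
        raise ValueError("a session needs at least 2 participants")
    if participants <= mesh_limit:
        # Small sessions: direct connections, no media server needed.
        return "mesh"
    if constrained_device:
        # Single connection with a mixed stream is lightest on the device.
        return "mixing"
    # Default for larger sessions: upload once, receive per participant.
    return "forwarding"


print(choose_architecture(2, constrained_device=False))
print(choose_architecture(8, constrained_device=False))
print(choose_architecture(30, constrained_device=True))
```

In practice the inputs would come from the answers to the planning questions at the start of this section, such as expected session size and how many participants send media.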