Janus with Kubernetes: Demystifying the Myths

First off, I’d like to express, once more, my deep appreciation to the Meetecho team for developing such straightforward yet powerful software. It has kept us thoroughly engaged over the past year, bringing not only challenges but also immense enjoyment. Among the numerous hurdles, scaling Janus in a cloud setting while minimizing the number of ports (ideally to one) posed a significant challenge. Judging from numerous forum discussions, this issue remains unresolved for many, with little clear guidance toward a reliable solution.

Our previous setup with Docker Swarm showed promising results. We ran the Janus services on the host network, registering them with Consul for health checks and service discovery, since Swarm’s overlay (mesh) network doesn’t support host-networked services. To avoid port conflicts, we allocated one EC2 host (c6i.xlarge) per Janus instance. A Coturn server, situated in the same network as Janus and designated for clients, relayed packets to the appropriate Janus instance (note: only clients were configured to use Coturn).
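For anyone curious what the Consul side of this looks like, here is a minimal sketch, not our exact registration sidecar: it assumes the default Consul agent on 127.0.0.1:8500, the default Janus HTTP port 8088, and a service name of `janus`; the health check simply polls Janus’s `/janus/info` endpoint.

```python
# Minimal sketch: register a host-networked Janus instance with the local
# Consul agent and attach an HTTP health check against Janus's /janus/info
# endpoint. Ports and the service name are assumptions, not our exact config.
import socket
import requests

CONSUL_AGENT = "http://127.0.0.1:8500"   # local Consul agent (default port)
JANUS_HTTP_PORT = 8088                   # Janus default HTTP API port


def register_janus(service_name: str = "janus") -> None:
    # Good enough for a sketch; on some hosts you may want the ENI/private IP instead.
    host_ip = socket.gethostbyname(socket.gethostname())
    payload = {
        "Name": service_name,
        "ID": f"{service_name}-{host_ip}",
        "Address": host_ip,
        "Port": JANUS_HTTP_PORT,
        "Check": {
            # Consul marks the instance unhealthy if /janus/info stops answering.
            "HTTP": f"http://{host_ip}:{JANUS_HTTP_PORT}/janus/info",
            "Interval": "10s",
            "Timeout": "2s",
            "DeregisterCriticalServiceAfter": "1m",
        },
    }
    resp = requests.put(f"{CONSUL_AGENT}/v1/agent/service/register", json=payload)
    resp.raise_for_status()


if __name__ == "__main__":
    register_janus()
```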

However, we recently transitioned to Kubernetes for various reasons, which introduced new uncertainties regarding the scalability of Janus within this environment.

After a strenuous journey of migrating our services to Kubernetes, I’m thrilled to share that we’ve successfully deployed multiple Janus instances on Kubernetes, running numerous conferences across load-balanced Januses. Several questions remain unanswered, such as the bandwidth capacities of internet gateways and network load balancers, but our proof-of-concept deployment worked flawlessly. I’m sharing our journey in the hope of drawing insights from those with relevant experience, easing the testing and fine-tuning process for myself, and helping others facing similar challenges.

Our working configuration included a Coturn server in host network mode inside a private network, connected via a NAT internet gateway and accessible externally through a network load balancer. Multiple Janus instances, also in host network mode (one per c6i.xlarge host), registered themselves with Consul for service discovery and health checks. Our janus-management service handled load balancing based on the number of active rooms (a simplified sketch of that selection step follows below). Because all instances sat on the host network, Coturn could reach them, while they remained inaccessible from the internet. All clients were configured to use the Coturn server. We also ran a cluster autoscaler, though scaling on CPU metrics alone proved too limited, prompting us to explore custom scaling based on dynamic subscriber counts. I am currently building a testing framework to assess bandwidth and CPU usage under various loads.
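To make the “balance by active rooms” idea concrete, here is a simplified sketch of the selection step, not our actual janus-management code. It assumes the Consul agent at 127.0.0.1:8500 and the `janus` service name from the earlier sketch, and it leaves the per-instance room counting (`count_active_rooms`) as a stand-in you would back with the Janus API (for example, the VideoRoom plugin’s room list).

```python
# Minimal sketch of the selection logic in a management service:
# ask Consul for the healthy Janus instances, then pick the one hosting
# the fewest active rooms. `count_active_rooms` is a hypothetical callback
# for however you obtain per-instance room counts.
from typing import Callable
import requests

CONSUL_AGENT = "http://127.0.0.1:8500"


def healthy_janus_instances(service_name: str = "janus") -> list[str]:
    """Return base URLs of Janus instances whose Consul checks are passing."""
    resp = requests.get(
        f"{CONSUL_AGENT}/v1/health/service/{service_name}",
        params={"passing": "true"},
    )
    resp.raise_for_status()
    return [
        f"http://{entry['Service']['Address']}:{entry['Service']['Port']}"
        for entry in resp.json()
    ]


def pick_least_loaded(count_active_rooms: Callable[[str], int]) -> str:
    """Choose the healthy instance currently hosting the fewest rooms."""
    instances = healthy_janus_instances()
    if not instances:
        raise RuntimeError("no healthy Janus instances registered in Consul")
    return min(instances, key=count_active_rooms)
```

In practice we also had to decide what to do when a new room is created versus when participants join an existing one; room-count balancing only matters at room creation time, since everyone in a room must land on the same instance.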

That summarizes our journey with deploying Janus on Kubernetes. Alongside, we manage 14 other services, including Redis, Postgres, Mongo, RabbitMQ, and the ELK stack, topics reserved for future discussions. Should you have any suggestions, advice, or inquiries, please don’t hesitate to engage. I’m eager to hear about your Janus deployment experiences. A heartfelt thanks again to the Meetecho team and the remarkable Janus community.

Cheers


Hi @Koyukan, glad to hear that everything is working well for you. Can you help me out a little bit here?
I am also trying to set up an architecture where I plan to deploy a janus-gateway server with the SIP plugin and a proxy service as a sidecar, using Docker on EC2. So far I have only been able to achieve this with network_mode=host.

More details here: Deploy janus-gateway server with SIP plugin inside Docker in AWS EC2