I am currently using Janus with its SIP plugin to serve as a broker between a native WebRTC client (a native Android app written in Kotlin) and a PBX (in my case, 3CX). So far, Janus is working great in terms of SIP registration and events.
My Android app communicates with a “man-in-the-middle” service that talks to the Android app on one side and manages the WebSocket connections to Janus on the other. I call this the “Broker Interface”.
When calling from the Android app to an external PSTN number (for example, my mobile phone), the following steps are taken:
- The Android app signals to the Broker Interface that it wants to call phone number XYZ using SIP account PQR, and includes an SDP offer without candidates (I use trickle ICE). It starts signaling ICE candidates as soon as the offer has been generated, and also signals the “gathering completed” event to the Broker Interface.
- The Broker Interface uses the Janus WebSocket API to call that number and includes the candidate-less SDP offer in the “jsep” property.
- The Broker Interface receives Janus events for the call, indicating that the call is being placed, followed by the ringing events from the PBX. In the meantime, the Broker Interface forwards the ICE candidates from the Android app to Janus.
- When the call is accepted, Janus creates a WebRTC PeerConnection and produces the corresponding SDP answer (with candidates included, since Janus uses half-trickle). The Broker Interface immediately forwards this answer to the Android app.
- When the Android app receives this answer, it immediately sets it as the PeerConnection’s RemoteDescription so that the ICE connectivity checks can start and media can finally flow.
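For reference, the messages the Broker Interface sends to Janus in the steps above can be sketched roughly as follows. This is a minimal Python sketch, not my actual broker code: the helper names (`janus_message`, `sip_call`, `trickle`) and the session/handle IDs, SIP URI and SDP strings are placeholders I made up for illustration. The message shapes follow my understanding of the Janus WebSocket API and the SIP plugin's `call` request.

```python
import json
import uuid


def janus_message(session_id, handle_id, payload):
    """Wrap a payload with the fields each Janus WebSocket request carries."""
    return {
        "session_id": session_id,
        "handle_id": handle_id,
        "transaction": uuid.uuid4().hex,  # lets us match the ack/response
        **payload,
    }


def sip_call(session_id, handle_id, uri, offer_sdp):
    """SIP plugin `call` request with the candidate-less offer in `jsep`."""
    return janus_message(session_id, handle_id, {
        "janus": "message",
        "body": {"request": "call", "uri": uri},
        "jsep": {"type": "offer", "sdp": offer_sdp},
    })


def trickle(session_id, handle_id, candidate=None):
    """Forward one ICE candidate, or signal end-of-candidates when None."""
    return janus_message(session_id, handle_id, {
        "janus": "trickle",
        "candidate": candidate if candidate is not None else {"completed": True},
    })


if __name__ == "__main__":
    # Placeholder values only; real IDs come from Janus "create"/"attach".
    call = sip_call(1234, 5678, "sip:XYZ@pbx.example.com", "v=0\r\n...")
    print(json.dumps(call, indent=2))
    print(json.dumps(trickle(1234, 5678), indent=2))
```

The point is that the offer and the candidates are already in Janus's hands long before the call is accepted; only the answer arrives late.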
This all works as expected. However, the Android WebRTC SDK takes quite some time between the moment I set the remote description and the moment media starts flowing. This can easily take a second. When calling from the Android app to a landline (PSTN), this delay is unacceptable, because the caller misses the callee’s greeting (the “Hello, this is ” never reaches the caller, because the caller is still establishing its ICE connection).
What I am looking for is a way for Janus (or the SIP plugin) to create a WebRTC PeerConnection before the SIP call is accepted, so that my “slow” Android WebRTC SDK can already establish the ICE connection. Then, when the SIP call is accepted, media can start flowing immediately, without a delay or a “missed” part at the beginning.
I know that part of the functionality I described is already covered by “early media” support. However, not all SIP servers I have to work with support this feature. (On my test server (3CX), I have never seen a 183 response to an INVITE. Is there something I missed there?)
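To make the early-media case concrete: as far as I understand, when the PBX does send a 183 with SDP, the SIP plugin emits a `progress` event carrying a `jsep` answer, so the broker could forward that answer instead of waiting for `accepted`. Below is a minimal sketch of that logic. `AnswerForwarder` and the `send_to_app` callback are hypothetical names, and I am assuming the first answer seen is the one the Android app should apply.

```python
def extract_answer(event):
    """Return (sip_event_name, jsep) for a Janus plugin event, else (None, None)."""
    if event.get("janus") != "event":
        return None, None
    data = event.get("plugindata", {}).get("data", {})
    name = data.get("result", {}).get("event")
    return name, event.get("jsep")


class AnswerForwarder:
    """Forwards the first SDP answer for a call, from early media or from accept."""

    def __init__(self, send_to_app):
        # Hypothetical callback that delivers a message to the Android app.
        self.send_to_app = send_to_app
        self.forwarded = False

    def on_janus_event(self, event):
        name, jsep = extract_answer(event)
        # Assumption: "progress" (183 with SDP) and "accepted" (200 OK) are
        # the two SIP plugin events that can carry the answer.
        if name in ("progress", "accepted") and jsep and not self.forwarded:
            self.forwarded = True
            self.send_to_app({
                "type": "answer",
                "sdp": jsep["sdp"],
                "early": name == "progress",
            })
```

With 3CX this never triggers on `progress` for me, which is exactly the problem: without a 183 there is no answer to forward until the call is accepted.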
Another option could be to create an AudioBridge room for each call, so the Android app can already connect to the room. When the SIP plugin reports that the call has been accepted, its SDP answer would then be added as an offer to the room created for that call. This feels… inefficient and a bit clumsy, however.
It would be fantastic if Janus or its SIP plugin could create a WebRTC PeerConnection before the SIP call is accepted. If this is not possible, is there another way to solve my issue?