Delay between SIP call accepted and media flowing

I am currently using Janus with its SIP plugin to serve as a broker between a native WebRTC client (a native Android app written in Kotlin) and a PBX (in my case, 3CX). So far, Janus is working great in terms of SIP registration and events.

My Android app communicates with a “man-in-the-middle” service that talks to the Android app on the one side, but manages the WebSocket connections to Janus on the other side. I call this the “Broker Interface”.

When calling from the Android app to an external landline (for example, my mobile phone), the following steps are taken:

  1. The Android app signals to the Broker Interface that it wants to call phone number XYZ using SIP account PQR. It also includes an SDP offer without candidates (I use trickle ICE). The Android app starts signaling ICE candidates immediately after the offer has been generated, and also signals the “gathering completed” event to the Broker Interface.
  2. The Broker Interface uses the Janus WebSocket interface to call that number and includes the SDP offer without candidates in the “jsep” property.
  3. The Broker Interface receives Janus events regarding the call, indicating that the call is indeed being placed and it receives the ringing events from the PBX. In the meantime, the Broker Interface also forwards the ICE candidates from the Android app to Janus.
  4. When the call is accepted, Janus correctly creates a WebRTC PeerConnection with its respective SDP answer (with candidates, Janus uses half-trickle). The Broker Interface then immediately forwards this answer to the Android app.
  5. When the Android app receives this answer, it immediately sets the PeerConnection’s RemoteDescription to the answer so the ICE connection can start and finally, media can start flowing.
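
For reference, here is a rough sketch of the messages the Broker Interface sends to Janus in steps 2 and 3, written in Python for brevity. The helper names, the SIP URI and the session/handle ids are placeholders I made up for illustration; the message shapes follow the Janus WebSocket API and the SIP plugin's `call` request as I understand them, so treat this as an approximation rather than my exact code.

```python
import json
import uuid


def janus_message(session_id, handle_id, janus_type, **fields):
    """Build one Janus WebSocket message addressed to a plugin handle."""
    msg = {
        "janus": janus_type,
        "session_id": session_id,
        "handle_id": handle_id,
        "transaction": uuid.uuid4().hex,  # lets us match the reply to this request
    }
    msg.update(fields)
    return msg


def call_request(session_id, handle_id, sip_uri, offer_sdp):
    # Step 2: "call" request to the SIP plugin; the candidate-less offer goes in "jsep".
    return janus_message(
        session_id, handle_id, "message",
        body={"request": "call", "uri": sip_uri},
        jsep={"type": "offer", "sdp": offer_sdp},
    )


def trickle_candidate(session_id, handle_id, candidate, sdp_mid, sdp_mline_index):
    # Step 3: forward one ICE candidate received from the Android app.
    return janus_message(
        session_id, handle_id, "trickle",
        candidate={
            "candidate": candidate,
            "sdpMid": sdp_mid,
            "sdpMLineIndex": sdp_mline_index,
        },
    )


def trickle_completed(session_id, handle_id):
    # Sent once the Android app reports "gathering completed".
    return janus_message(session_id, handle_id, "trickle", candidate={"completed": True})
```

Each dictionary is serialized with `json.dumps(...)` and sent over the WebSocket to Janus; the answer from step 4 arrives later as the `jsep` of an asynchronous event on the same handle.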

This all works as expected. However, the Android WebRTC SDK takes quite some time from the moment I set the remote description to the moment that media starts flowing. This can easily take a second. When calling from the Android app to a landline (PSTN), this delay is unacceptable, because the caller will miss the callee’s greeting: the “Hello, this is ” will never reach the caller, because the caller is still establishing its ICE connection.

What I am looking for is a possibility for Janus (or the SIP plugin) to create a WebRTC peer connection before the SIP call is accepted, so my “slow” Android WebRTC SDK can already start the ICE connection. Consequently, when the SIP call is accepted, media can start flowing immediately, without delay or “missed” parts in the beginning.

I know that part of the functionality I described is already supported through “early media” support. However, not all SIP servers I have to work with support this feature. On my test server (3CX), I have never seen a 183 response to an INVITE. Is there something I missed there?

Another possible option could also be to create a room using the AudioBridge for each call, so the Android app can already connect to the room, and when the SIP plugin gets the SIP accepted event, add the SIP plugin SDP answer as offer to the room we just created for that call. This feels… inefficient and a bit clumsy, however.

It would be fantastic if Janus or its SIP plugin could create a WebRTC PeerConnection before the SIP call is accepted. If this is not possible, is there another way to solve my issue?

That’s not possible. How would Janus know what to use in the SDP, if the answer to relay hasn’t been received yet? Besides, it’s not like Janus can create a PC on its own: the other device is a part of that. In the SIP plugin, Janus is not a PBX: it will always just relay SDPs back and forth, leaving negotiation up to the endpoints and only stepping in for gatewaying the media.

Try looking at the Admin API to see what’s taking too long (ICE? DTLS? something else?), to see what you can optimize in the configuration to speed things up. Otherwise, as you said, see if SIP early media can help you (even though I think we only support it on the way in).

Thanks for your fast response. The delay is not on the Janus side. I have tested your demo HTML files with my hosted instance of Janus, and I see that there is a delay of about 150 ms between “call accepted” and “media flowing”. The longer delay (400 ms for the ICE connection and then 500 ms until media starts flowing) is on the Android app side, but it is not in my power to change that; I use the Google WebRTC SDK.

Which Admin API endpoints would you recommend me to check?

The Admin API has a way to inspect a specific handle, in your case the handle associated with the SIP session. It will contain timing info related to different aspects (e.g., when ICE started, when it succeeded, etc.), which allows you to figure out which parts took longer. But if it’s Android setting things up that’s slowing everything down then you’re right, it wouldn’t help much.
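
As a rough sketch of what those Admin API requests look like (Python, building the JSON only): `list_sessions`, `list_handles` and `handle_info` are the Admin API requests for walking down to a handle. The helper name and the `admin_secret` value are placeholders, and this assumes a transport where the session and handle ids travel in the message body (as with WebSockets); over plain HTTP they would be part of the request path instead.

```python
import json
import uuid


def admin_request(janus_type, admin_secret, session_id=None, handle_id=None):
    """Build one Janus Admin API request as a dictionary."""
    msg = {
        "janus": janus_type,
        "transaction": uuid.uuid4().hex,
        "admin_secret": admin_secret,  # placeholder; use the secret from your Janus config
    }
    # Only include the ids the request actually needs.
    if session_id is not None:
        msg["session_id"] = session_id
    if handle_id is not None:
        msg["handle_id"] = handle_id
    return msg


def requests_for_handle(admin_secret, session_id, handle_id):
    """The three requests, in order: find sessions, find handles, inspect one handle."""
    return [
        admin_request("list_sessions", admin_secret),
        admin_request("list_handles", admin_secret, session_id=session_id),
        admin_request("handle_info", admin_secret, session_id=session_id, handle_id=handle_id),
    ]
```

The `handle_info` response describes the ICE and DTLS state of the handle; compare the timing fields it reports against the moment the SIP call was accepted to see which phase dominates.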

Not sure where the delay comes from, though? If Android is calling, then you already accessed the device, created a PeerConnection, and called setLocalDescription, which I can understand takes time: does setRemoteDescription for the remote answer really take that long?

Thanks again for your quick reply. I will look into the Admin API tomorrow and I will try to get some detailed timing logs.

Hi Lorenzo, I hope you have a nice Christmas.

I have dug a bit deeper into comparing the ICE connection delays of the native Android app vs. the SIP gateway demo HTML file in the browser. I have enabled libnice debugging on the Janus instance, so I can see what is going on there.

Since I cannot expect you to read thousands of lines of log, I have created two log files: one containing the libnice/Janus debug log when calling using the native Android app and one containing the libnice/Janus debug log when calling using the HTML file in the web browser. To make it a good comparison, both the native Android app and the web browser call the same number using the same extension on the same PBX. On the browser side, I even forced ICE relay using my TURN server, because the native Android app uses that too (it is in an LTE network).

From the moment the call is accepted to the moment the log shows “stream 1 component 1 STATE-CHANGE connecting → connected”, it takes the browser 81 ms, whereas the native Android app takes 475 ms. A difference of 394 ms!

When comparing both logs, one difference is immediately clear. The libnice log shows the entry "Message HMAC-SHA1 fingerprint: ". In the browser log, the next log entry (an inbound STUN packet from the TURN server) follows after 4 ms. In the case of the native Android app, it follows after 369 ms. That means that of the total ICE connection setup time difference of 394 ms, 369 ms (94%) comes from this one single point of delay.
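
The arithmetic behind those figures, as a small Python check. All four input values are the measurements quoted above; the variable names are just for illustration.

```python
# Time from "call accepted" to ICE "connected", in milliseconds.
browser_ice_ms = 81
android_ice_ms = 475

# Gap after the "Message HMAC-SHA1 fingerprint" entry until the next log entry.
browser_gap_ms = 4
android_gap_ms = 369

# Total difference in ICE setup time between the two clients.
total_difference_ms = android_ice_ms - browser_ice_ms

# Share of that difference taken up by the single Android gap.
share_of_gap = android_gap_ms / total_difference_ms

# Stricter variant: only count the time Android spends beyond the browser's gap.
extra_gap_ms = android_gap_ms - browser_gap_ms
share_of_extra_gap = extra_gap_ms / total_difference_ms

print(total_difference_ms)                # 394
print(round(share_of_gap * 100, 1))       # 93.7
print(round(share_of_extra_gap * 100, 1)) # 92.6
```

Either way of counting puts well over 90% of the extra setup time at this one step.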

Do you have any pointers / clues where this delay could come from? I suspect that it originates from the TURN server (a self-hosted instance of coturn, in my case).