When a WebRTC caller calls a SIP user behind FreeSWITCH, I notice that the caller cannot hear the first 1-2 seconds of the callee’s audio. I illustrated the problem with the following sequence diagram. I'd appreciate it if someone could chime in with ideas on how to avoid this. Is it possible for Janus to generate the SDP answer before receiving the SIP 200 OK, so that the WebRTC connection can be established in advance?
Obviously not, since Janus does not know what the peer will accept or refuse, and which codec they’ll prefer. It can only relay SDPs between the involved parties.
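To make the dependency concrete, here is a minimal sketch (not Janus code) of why an answer cannot be produced before the peer has replied. The function name `negotiate_answer` and the plain codec-name lists are assumptions for illustration only; real SDP negotiation involves payload types, directions, and more.

```python
def negotiate_answer(offered, peer_supported):
    """Return the codecs that would appear in the answer.

    offered: codec names from the WebRTC caller's offer.
    peer_supported: codec names the SIP peer accepted, in its order of
    preference, or None if the peer has not replied yet.
    """
    if peer_supported is None:
        # Without the peer's SDP there is nothing to base the answer on.
        raise ValueError("cannot build an answer before the peer's SDP is known")
    offered_set = set(offered)
    # Keep only codecs both sides know, in the peer's preference order.
    return [codec for codec in peer_supported if codec in offered_set]
```

Calling `negotiate_answer(["opus", "PCMU", "PCMA"], None)` raises, which mirrors the situation before the 200 OK arrives: the gateway has the offer but not the peer's choice.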
As a side note, I think you’re using the term clipping incorrectly here. There’s no clipping: it just takes more time for the WebRTC PeerConnection to complete, after the SIP dialog has already been established, which is normal due to the gatewaying process. In the past we tried delaying the ACK, but that caused a whole set of other troubles (SIP retransmissions and timeouts).
A few ideas for how Janus could generate the SDP answer before the SIP 200 OK:
Pre-configure the SIP plugin (e.g. specify the codecs beforehand, and let the SIP plugin throw an error and hang up the call if the codecs it receives differ from the preconfigured ones).
Doing so is logically like pre-accepting the call in anticipation of an accept from the SIP UA. If the SIP UA does not respond in time or rejects the call, the JSEP WebRTC session can be closed. In business use cases, the SIP UA is in most cases an automated agent such as an IVR, so it will always accept.
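The pre-accept idea above could be sketched roughly as follows. This is purely hypothetical: `resolve_preaccepted_call`, `Outcome`, and the codec lists are made-up names for illustration, and nothing like this exists in the Janus SIP plugin.

```python
from enum import Enum


class Outcome(Enum):
    KEEP = "keep"      # the pre-established WebRTC session can stay up
    HANGUP = "hangup"  # close the WebRTC session and end the SIP call


def resolve_preaccepted_call(preconfigured, sip_answer_codecs, timed_out=False):
    """Decide what to do with a call that was pre-accepted on the WebRTC side.

    preconfigured: codecs assumed when the early answer was generated.
    sip_answer_codecs: codecs in the SIP UA's 200 OK, or None if it rejected.
    timed_out: True if the SIP UA did not respond in time.
    """
    if timed_out or sip_answer_codecs is None:
        return Outcome.HANGUP
    allowed = set(preconfigured)
    # Any codec outside the preconfigured set counts as a mismatch.
    if not sip_answer_codecs or any(c not in allowed for c in sip_answer_codecs):
        return Outcome.HANGUP
    return Outcome.KEEP
```

For example, with `preconfigured = ["PCMU", "PCMA"]`, a 200 OK carrying only `["PCMU"]` keeps the call, while one carrying `["G722"]` tears it down.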
By clipping, I mean that the caller misses the first few words of the audio from the SIP UA, as I described in the sequence diagram. I have a full setup where I'm observing this issue myself.
The term clipping is used in a few RFCs, such as rfc5479#section-4.1 and rfc8839#name-offer-in-invite.
Pre-configuring is most definitely never going to happen. As for provisional responses, we already support early media (183), but I think only in one direction (Janus user as callee) at the moment, and not the other.