Suggestions on how to avoid audio clipping with SIP plugin

When a WebRTC caller calls a SIP user behind FreeSWITCH, I notice the caller cannot hear the first 1-2 seconds of the callee’s audio. I illustrated the problem in the following sequence diagram. I’d appreciate it if someone could chime in with ideas on how to avoid this. Is it possible for Janus to generate the SDP answer before receiving the SIP 200 OK, so that the WebRTC connection can be established in advance?

The PlantUML code for this diagram is on Pastebin.

Thank you very much

Obviously not, since Janus does not know what the peer will accept or refuse, and which codec they’ll prefer. It can only relay SDPs between the involved parties.

As a side note, I think you’re using the term clipping incorrectly here. There’s no clipping: it just takes more time for the WebRTC PeerConnection to complete, after the SIP dialog has already been established, which is normal due to the gatewaying process. In the past we tried delaying the ACK, but that caused a whole set of other troubles (SIP retransmissions and timeouts).

Hi Lorenzo, thank you for sharing your thoughts.

A few ideas for how Janus could generate the SDP answer before the SIP 200 OK:

  1. Pre-configure the SIP plugin (e.g. specify the codecs beforehand, and have the SIP plugin raise an error and hang up the call if it receives codecs other than the preconfigured ones).
  2. Parse the SDP from a provisional response, as described in RFC 8839 (Session Description Protocol (SDP) Offer/Answer Procedures for Interactive Connectivity Establishment (ICE)). I’m not sure how many SIP implementations support this, though.

Doing so is logically like pre-accepting the call in anticipation of an accept from the SIP UA. If the SIP UA does not respond in time or rejects the call, the WebRTC connection set up via JSEP can be closed. In business use cases, the SIP UA is in most cases an automated agent such as an IVR, so it will always accept.
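
To make the pre-accept idea a bit more concrete, here is a minimal sketch in Python of the check that would run once the final SIP response arrives. It is only an illustration under assumptions: `PRECONFIGURED_CODECS`, `codecs_in_sdp` and `keep_preaccepted_call` are hypothetical names, and nothing like this exists in the SIP plugin today.

```python
import re
from typing import Set

# Hypothetical allow-list of codecs configured ahead of time (assumption:
# the SIP plugin has no such setting; this is purely illustrative).
PRECONFIGURED_CODECS = {"opus", "pcmu", "pcma", "telephone-event"}


def codecs_in_sdp(sdp: str) -> Set[str]:
    """Return the lowercase codec names found in a=rtpmap lines.

    Static payload types that are listed without an rtpmap line are
    ignored in this sketch.
    """
    names = set()
    for line in sdp.splitlines():
        match = re.match(r"a=rtpmap:\d+ ([^/]+)/", line.strip())
        if match:
            names.add(match.group(1).lower())
    return names


def keep_preaccepted_call(final_status: int, sdp: str) -> bool:
    """Decide whether a pre-accepted WebRTC leg should stay up.

    The leg is kept only if the SIP side answered with a 2xx and every
    codec in its SDP is in the preconfigured set; otherwise the caller
    of this function would close the WebRTC connection.
    """
    if not 200 <= final_status < 300:
        return False
    found = codecs_in_sdp(sdp)
    return bool(found) and found <= PRECONFIGURED_CODECS
```

For example, a 200 OK whose SDP only lists PCMU and telephone-event would keep the call, while a 486 Busy Here or an answer offering only G729 would tear it down.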

By clipping, I mean that the caller misses the first few words of audio from the SIP UA, as described in the sequence diagram. I have a full setup where I’m observing this issue myself.

The term clipping is used in a few RFCs, e.g. rfc5479#section-4.1 and rfc8839#name-offer-in-invite.

Pre-configuring is most definitely never going to happen. As for provisional responses, we already support early media (183), but I think only in one direction at the moment (Janus user as callee), and not the other one.
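
For readers less familiar with early media: it relies on a provisional (1xx) response, typically a 183 Session Progress, carrying an SDP body before the final 200 OK. The sketch below shows one way to detect that case in a raw SIP response. It is a simplified illustration, not the SIP plugin's code: `parse_sip_response` and `carries_early_media` are hypothetical helpers, and compact or folded headers are not handled.

```python
def parse_sip_response(raw: str):
    """Split a raw SIP response into (status_code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The status line looks like: SIP/2.0 183 Session Progress
    status_code = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


def carries_early_media(raw: str) -> bool:
    """True if this is a provisional response with a non-empty SDP body."""
    status, headers, body = parse_sip_response(raw)
    is_provisional = 100 <= status < 200
    content_type = headers.get("content-type", "").lower()
    has_sdp = content_type.startswith("application/sdp") and body.strip() != ""
    return is_provisional and has_sdp
```

A 183 with `Content-Type: application/sdp` and a body is reported as early media, whereas a plain 180 Ringing without a body, or a final 200 OK, is not.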

FYI, someone contributed a patch to add early media support to the SIP plugin:

We plan to merge this soon, so if you’re interested in this feature, please test it and provide feedback on the PR page.