libnice-related crash: libnice:ERROR:../agent/conncheck.c:3056:priv_prune_pending_checks: assertion failed: (priority > 0)

Good day all!

I'm looking more for hints than fixes, but if you've got a fix for me, I'll be a happy camper.

We recently hit a problem we had never seen before, which left some customers without access to video communications for a bit north of 45 minutes. I want to know (or at least estimate) whether this is a configuration error, an implementation error, or a one-off (though that last one seems unlikely), so that I can debug, fix, or report it accordingly. However, I think I'm a bit out of my depth given my current knowledge of networking and of all the moving parts involved here.

Looking at the logs, I found that the error happened in libnice:
libnice:ERROR:../agent/conncheck.c:3056:priv_prune_pending_checks: assertion failed: (priority > 0)

Janus version used: 0.13.3

NB: we are using a custom plugin. vroom.js is a Duktape-based plugin of ours, and it is linked to by another class, janus.plugin.vroom, which is near-identical to duktape.c with some slight modifications.

NB2: we have been seeing the "Failed to add some remote candidates (added 0, expected 1)" warning for a long time. I don't know whether it is related to this error; everything used to work fine in spite of the warning, so I never took the time to debug it. Let me know if this is something I should look into.

Below: first the logs, then the configurations.

Janus' log at the relevant section:

 {"timestamp": 1693828779687653, "line": "[895523731306862] Creating ICE agent (ICE Lite mode, controlled)\n"}
 {"timestamp": 1693828779780181, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.780+00:00 Handling message for session: 817308666 {\"action\": \"processSdpAnswer\", \"runAsync\": true} {\"type\": \"answer\", \"sdp\": \"v=0\\r\\no=mozilla...THIS_IS_SDPARTA-99.0 1837066192733370317 0 IN IP4 1.1.1.1\\r\\ns=-\\r\\nt=0 0\\r\\nm=audio 9 UDP/TLS/RTP/SAVPF 111 9 0 8 126\\r\\nc=IN IP4 1.1.1.1\\r\\na=recvonly\\r\\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\\r\\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\\r\\na=fmtp:111 maxplaybackrate=48000;stereo=1;useinbandfec=1\\r\\na=fmtp:126 0-15\\r\\na=rtpmap:111 opus/48000/2\\r\\na=rtpmap:9 G722/8000/1\\r\\na=rtpmap:0 PCMU/8000\\r\\na=rtpmap:8 PCMA/8000\\r\\na=rtpmap:126 telephone-event/8000/1\\r\\nm=video 9 UDP/TLS/RTP/SAVPF 96 106 108 98\\r\\nc=IN IP4 1.1.1.1\\r\\na=recvonly\\r\\na=extmap:14 urn:ietf:params:rtp-hdrext:toffset\\r\\na=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\\r\\na=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01\\r\\na=extmap:5/recvonly http://www.webrtc.org/experiments/rtp-hdrext/playout-delay\\r\\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\\r\\na=fmtp:106 profile-level-id=42e01f;level-asymmetry-allowed=1;packetization-mode=1\\r\\na=fmtp:108 profile-level-id=42e01f;level-asymmetry-allowed=1\\r\\na=fmtp:96 max-fs=12288;max-fr=60\\r\\na=fmtp:98 max-fs=12288;max-fr=60\\r\\na=rtcp-fb:96 nack\\r\\na=rtcp-fb:96 nack pli\\r\\na=rtcp-fb:96 ccm fir\\r\\na=rtcp-fb:96 goog-remb\\r\\na=rtcp-fb:96 transport-cc\\r\\na=rtcp-fb:106 nack\\r\\na=rtcp-fb:106 nack pli\\r\\na=rtcp-fb:106 ccm fir\\r\\na=rtcp-fb:106 goog-remb\\r\\na=rtcp-fb:106 transport-cc\\r\\na=rtcp-fb:108 nack\\r\\na=rtcp-fb:108 nack pli\\r\\na=rtcp-fb:108 ccm fir\\r\\na=rtcp-fb:108 goog-remb\\r\\na=rtcp-fb:108 transport-cc\\r\\na=rtcp-fb:98 nack\\r\\na=rtcp-fb:98 nack pli\\r\\na=rtcp-fb:98 ccm fir\\r\\na=rtcp-fb:98 goog-remb\\r\\na=rtcp-fb:98 transport-cc\\r\\na=rtpmap:96 VP8/90000\\r\\na=rtpmap:106 H264/90000\\r\\na=rtpmap:108 H264/90000\\r\\na=rtpmap:98 VP9/90000\\r\\n\"}\n"}
 {"timestamp": 1693828779780497, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.780+00:00 Resuming coroutines\n"}
 {"timestamp": 1693828779780591, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.780+00:00  pushEvent\n"}
 {"timestamp": 1693828779780718, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.780+00:00 Coroutines resumed\n"}
 {"timestamp": 1693828779813164, "line": "[WARN] [2077434969869207] Failed to add some remote candidates (added 0, expected 1)\n"}
 {"timestamp": 1693828779886143, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.886+00:00 Handling message for session: 2207909989 {\"action\": \"processSdpAnswer\", \"runAsync\": true} {\"type\": \"answer\", \"sdp\": \"v=0\\r\\no=mozilla...THIS_IS_SDPARTA-99.0 820194989820274871 0 IN IP4 1.1.1.1\\r\\ns=-\\r\\nt=0 0\\r\\nm=audio 9 UDP/TLS/RTP/SAVPF 111 9 0 8 126\\r\\nc=IN IP4 1.1.1.1\\r\\na=recvonly\\r\\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\\r\\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\\r\\na=fmtp:111 maxplaybackrate=48000;stereo=1;useinbandfec=1\\r\\na=fmtp:126 0-15\\r\\na=rtpmap:111 opus/48000/2\\r\\na=rtpmap:9 G722/8000/1\\r\\na=rtpmap:0 PCMU/8000\\r\\na=rtpmap:8 PCMA/8000\\r\\na=rtpmap:126 telephone-event/8000/1\\r\\nm=video 9 UDP/TLS/RTP/SAVPF 96 106 108 98\\r\\nc=IN IP4 1.1.1.1\\r\\na=recvonly\\r\\na=extmap:14 urn:ietf:params:rtp-hdrext:toffset\\r\\na=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\\r\\na=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01\\r\\na=extmap:5/recvonly http://www.webrtc.org/experiments/rtp-hdrext/playout-delay\\r\\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\\r\\na=fmtp:106 profile-level-id=42e01f;level-asymmetry-allowed=1;packetization-mode=1\\r\\na=fmtp:108 profile-level-id=42e01f;level-asymmetry-allowed=1\\r\\na=fmtp:96 max-fs=12288;max-fr=60\\r\\na=fmtp:98 max-fs=12288;max-fr=60\\r\\na=rtcp-fb:96 nack\\r\\na=rtcp-fb:96 nack pli\\r\\na=rtcp-fb:96 ccm fir\\r\\na=rtcp-fb:96 goog-remb\\r\\na=rtcp-fb:96 transport-cc\\r\\na=rtcp-fb:106 nack\\r\\na=rtcp-fb:106 nack pli\\r\\na=rtcp-fb:106 ccm fir\\r\\na=rtcp-fb:106 goog-remb\\r\\na=rtcp-fb:106 transport-cc\\r\\na=rtcp-fb:108 nack\\r\\na=rtcp-fb:108 nack pli\\r\\na=rtcp-fb:108 ccm fir\\r\\na=rtcp-fb:108 goog-remb\\r\\na=rtcp-fb:108 transport-cc\\r\\na=rtcp-fb:98 nack\\r\\na=rtcp-fb:98 nack pli\\r\\na=rtcp-fb:98 ccm fir\\r\\na=rtcp-fb:98 goog-remb\\r\\na=rtcp-fb:98 transport-cc\\r\\na=rtpmap:96 VP8/90000\\r\\na=rtpmap:106 H264/90000\\r\\na=rtpmap:108 H264/90000\\r\\na=rtpmap:98 VP9/90000\\r\\n\"}\n"}
 {"timestamp": 1693828779886416, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.886+00:00 Resuming coroutines\n"}
 {"timestamp": 1693828779886501, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.886+00:00  pushEvent\n"}
 {"timestamp": 1693828779886617, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.886+00:00 Coroutines resumed\n"}
 {"timestamp": 1693828779900154, "line": "[WARN] [895523731306862] Failed to add some remote candidates (added 0, expected 1)\n"}
 {"timestamp": 1693828779931380, "line": "[2077434969869207] The DTLS handshake has been completed\n"}
 {"timestamp": 1693828779931413, "line": "WebRTC media is now available\n"}
 {"timestamp": 1693828779931480, "line": "[\u001B[36mvroom.js\u001B[0m]  WebRTC PeerConnection is up for session: 817308666\n"}
 {"timestamp": 1693828779937021, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.936+00:00 Handling message for session: 817308666 {\"runAsync\": true, \"toId\": 102830013, \"action\": \"addRecipient\"} null\n"}
 {"timestamp": 1693828779937333, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.937+00:00 Resuming coroutines\n"}
 {"timestamp": 1693828779937451, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.937+00:00  pushEvent\n"}
 {"timestamp": 1693828779937575, "line": "[\u001B[36mvroom.js\u001B[0m]  2023-09-04 11:59:39.937+00:00 Coroutines resumed\n"}
 {"timestamp": 1693828780022172, "line": "[895523731306862] The DTLS handshake has been completed\n"}
 {"timestamp": 1693828780022196, "line": "WebRTC media is now available\n"}
 {"timestamp": 1693828780022255, "line": "[\u001B[36mvroom.js\u001B[0m]  WebRTC PeerConnection is up for session: 2207909989\n"}
 {"timestamp": 1693828812440305, "line": "[janus.plugin.vroom-0x7f3144008880] No WebRTC media anymore\n"}
 {"timestamp": 1693828812440486, "line": "[\u001B[36mvroom.js\u001B[0m]  WebRTC PeerConnection is down for session: 1511280016\n"}
 {"timestamp": 1693828812441098, "line": "[6663004314392501] WebRTC resources freed; 0x7f3158026b70 0x7f314c034af0\n"}
 {"timestamp": 1693828812444063, "line": "[janus.plugin.vroom-0x7f3144008dc0] No WebRTC media anymore\n"}
 {"timestamp": 1693828812444182, "line": "[\u001B[36mvroom.js\u001B[0m]  WebRTC PeerConnection is down for session: 815050429\n"}
 {"timestamp": 1693828812444770, "line": "[6457550628113394] WebRTC resources freed; 0x7f315805ca10 0x7f314c034af0\n"}
 {"timestamp": 1693828812447173, "line": "[ERR] [dtls-bio.c:janus_dtls_bio_agent_write:157] [563309218106504] Error sending DTLS message on component 1 of stream 1 (-1)\n"}
 {"timestamp": 1693828812447202, "line": "[janus.plugin.vroom-0x7f3144007220] No WebRTC media anymore\n"}
 {"timestamp": 1693828812447275, "line": "[\u001B[36mvroom.js\u001B[0m]  WebRTC PeerConnection is down for session: 329048855\n"}
 {"timestamp": 1693828812447638, "line": "[563309218106504] WebRTC resources freed; 0x7f3158043600 0x7f314c034af0\n"}
 {"timestamp": 1693832789394510, "line": "JANUS JSON logger plugin initialized!\n"}

System log (pretty similar):

Sep  4 11:59:39 my_instance janus[846]: [895523731306862] Creating ICE agent (ICE Lite mode, controlled)
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.780+00:00 Handling message for session: 817308666 {"action": "processSdpAnswer", "runAsync": true} {"type": "answer", "sdp": "v=0\r\no=mozilla...THIS_IS_SDPARTA-99.0 1837066192733370317 0 IN IP4 1.1.1.1\r\ns=-\r\nt=0 0\r\nm=audio 9 UDP/TLS/RTP/SAVPF 111 9 0 8 126\r\nc=IN IP4 1.1.1.1\r\na=recvonly\r\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\r\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=fmtp:111 maxplaybackrate=48000;stereo=1;useinbandfec=1\r\na=fmtp:126 0-15\r\na=rtpmap:111 opus/48000/2\r\na=rtpmap:9 G722/8000/1\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:126 telephone-event/8000/1\r\nm=video 9 UDP/TLS/RTP/SAVPF 96 106 108 98\r\nc=IN IP4 1.1.1.1\r\na=recvonly\r\na=extmap:14 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01\r\na=extmap:5/recvonly http://www.webrtc.org/experiments/rtp-hdrext/playout-delay\r\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=fmtp:106 profile-level-id=42e01f;level-asymmetry-allowed=1;packetization-mode=1\r\na=fmtp:108 profile-level-id=42e01f;level-asymmetry-allowed=1\r\na=fmtp:96 max-fs=12288;max-fr=60\r\na=fmtp:98 max-fs=12288;max-fr=60\r\na=rtcp-fb:96 nack\r\na=rtcp-fb:96 nack pli\r\na=rtcp-fb:96 ccm fir\r\na=rtcp-fb:96 goog-remb\r\na=rtcp-fb:96 transport-cc\r\na=rtcp-fb:106 nack\r\na=rtcp-fb:106 nack pli\r\na=rtcp-fb:106 ccm fir\r\na=rtcp-fb:106 goog-remb\r\na=rtcp-fb:106 transport-cc\r\na=rtcp-fb:108 nack\r\na=rtcp-fb:108 nack pli\r\na=rtcp-fb:108 ccm fir\r\na=rtcp-fb:108 goog-remb\r\na=rtcp-fb:108 transport-cc\r\na=rtcp-fb:98 nack\r\na=rtcp-fb:98 nack pli\r\na=rtcp-fb:98 ccm fir\r\na=rtcp-fb:98 goog-remb\r\na=rtcp-fb:98 transport-cc\r\na=rtpmap:96 VP8/90000\r\na=rtpmap:106 H264/90000\r\na=rtpmap:108 H264/90000\r\na=rtpmap:98 VP9/90000\r\n"}
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.780+00:00 Resuming coroutines
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.780+00:00  pushEvent
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.780+00:00 Coroutines resumed
Sep  4 11:59:39 my_instance janus[846]: [WARN] [2077434969869207] Failed to add some remote candidates (added 0, expected 1)
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.886+00:00 Handling message for session: 2207909989 {"action": "processSdpAnswer", "runAsync": true} {"type": "answer", "sdp": "v=0\r\no=mozilla...THIS_IS_SDPARTA-99.0 820194989820274871 0 IN IP4 1.1.1.1\r\ns=-\r\nt=0 0\r\nm=audio 9 UDP/TLS/RTP/SAVPF 111 9 0 8 126\r\nc=IN IP4 1.1.1.1\r\na=recvonly\r\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\r\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=fmtp:111 maxplaybackrate=48000;stereo=1;useinbandfec=1\r\na=fmtp:126 0-15\r\na=rtpmap:111 opus/48000/2\r\na=rtpmap:9 G722/8000/1\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:126 telephone-event/8000/1\r\nm=video 9 UDP/TLS/RTP/SAVPF 96 106 108 98\r\nc=IN IP4 1.1.1.1\r\na=recvonly\r\na=extmap:14 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01\r\na=extmap:5/recvonly http://www.webrtc.org/experiments/rtp-hdrext/playout-delay\r\na=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid\r\na=fmtp:106 profile-level-id=42e01f;level-asymmetry-allowed=1;packetization-mode=1\r\na=fmtp:108 profile-level-id=42e01f;level-asymmetry-allowed=1\r\na=fmtp:96 max-fs=12288;max-fr=60\r\na=fmtp:98 max-fs=12288;max-fr=60\r\na=rtcp-fb:96 nack\r\na=rtcp-fb:96 nack pli\r\na=rtcp-fb:96 ccm fir\r\na=rtcp-fb:96 goog-remb\r\na=rtcp-fb:96 transport-cc\r\na=rtcp-fb:106 nack\r\na=rtcp-fb:106 nack pli\r\na=rtcp-fb:106 ccm fir\r\na=rtcp-fb:106 goog-remb\r\na=rtcp-fb:106 transport-cc\r\na=rtcp-fb:108 nack\r\na=rtcp-fb:108 nack pli\r\na=rtcp-fb:108 ccm fir\r\na=rtcp-fb:108 goog-remb\r\na=rtcp-fb:108 transport-cc\r\na=rtcp-fb:98 nack\r\na=rtcp-fb:98 nack pli\r\na=rtcp-fb:98 ccm fir\r\na=rtcp-fb:98 goog-remb\r\na=rtcp-fb:98 transport-cc\r\na=rtpmap:96 VP8/90000\r\na=rtpmap:106 H264/90000\r\na=rtpmap:108 H264/90000\r\na=rtpmap:98 VP9/90000\r\n"}
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.886+00:00 Resuming coroutines
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.886+00:00  pushEvent
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.886+00:00 Coroutines resumed
Sep  4 11:59:39 my_instance janus[846]: [WARN] [895523731306862] Failed to add some remote candidates (added 0, expected 1)
Sep  4 11:59:39 my_instance janus[846]: [2077434969869207] The DTLS handshake has been completed
Sep  4 11:59:39 my_instance janus[846]: WebRTC media is now available
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  WebRTC PeerConnection is up for session: 817308666
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.936+00:00 Handling message for session: 817308666 {"runAsync": true, "toId": 102830013, "action": "addRecipient"} null
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.937+00:00 Resuming coroutines
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.937+00:00  pushEvent
Sep  4 11:59:39 my_instance janus[846]: [#033[36mvroom.js#033[0m]  2023-09-04 11:59:39.937+00:00 Coroutines resumed
Sep  4 11:59:40 my_instance janus[846]: [895523731306862] The DTLS handshake has been completed
Sep  4 11:59:40 my_instance janus[846]: WebRTC media is now available
Sep  4 11:59:40 my_instance janus[846]: [#033[36mvroom.js#033[0m]  WebRTC PeerConnection is up for session: 2207909989
Sep  4 12:00:12 my_instance janus[846]: [janus.plugin.vroom-0x7f3144008880] No WebRTC media anymore
Sep  4 12:00:12 my_instance janus[846]: [#033[36mvroom.js#033[0m]  WebRTC PeerConnection is down for session: 1511280016
Sep  4 12:00:12 my_instance janus[846]: [6663004314392501] WebRTC resources freed; 0x7f3158026b70 0x7f314c034af0
Sep  4 12:00:12 my_instance janus[846]: **
Sep  4 12:00:12 my_instance janus[846]: libnice:ERROR:../agent/conncheck.c:3056:priv_prune_pending_checks: assertion failed: (priority > 0)
Sep  4 12:00:12 my_instance janus[846]: [janus.plugin.vroom-0x7f3144008dc0] No WebRTC media anymore
Sep  4 12:00:12 my_instance janus[846]: [#033[36mvroom.js#033[0m]  WebRTC PeerConnection is down for session: 815050429
Sep  4 12:00:12 my_instance janus[846]: [6457550628113394] WebRTC resources freed; 0x7f315805ca10 0x7f314c034af0
Sep  4 12:00:12 my_instance janus[846]: [ERR] [dtls-bio.c:janus_dtls_bio_agent_write:157] [563309218106504] Error sending DTLS message on component 1 of stream 1 (-1)
Sep  4 12:00:12 my_instance janus[846]: [janus.plugin.vroom-0x7f3144007220] No WebRTC media anymore
Sep  4 12:00:12 my_instance janus[846]: [#033[36mvroom.js#033[0m]  WebRTC PeerConnection is down for session: 329048855
Sep  4 12:00:12 my_instance janus[846]: [563309218106504] WebRTC resources freed; 0x7f3158043600 0x7f314c034af0
Sep  4 12:00:12 my_instance janus[846]: Bail out! libnice:ERROR:../agent/conncheck.c:3056:priv_prune_pending_checks: assertion failed: (priority > 0)
Sep  4 12:00:41 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:00:41.140502445Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:03 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:04:03.547008876Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:03 my_instance promtail-linux-amd64[1287]: level=error ts=2023-09-04T12:04:03.547131628Z caller=client.go:294 component=client host= msg="final error sending batch" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:04 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:04:04.609685481Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:05 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:04:05.374643002Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:06 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:04:06.697364616Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:10 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:04:10.09170874Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:14 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:04:14.657290114Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:28 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:04:28.778613496Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:04:44 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:04:44.820458825Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:05:18 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:05:18.444393408Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:06:52 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:06:52.4933317Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:10:39 my_instance promtail-linux-amd64[1287]: level=warn ts=2023-09-04T12:10:39.393754368Z caller=client.go:288 component=client host= msg="error sending batch, will retry" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:10:39 my_instance promtail-linux-amd64[1287]: level=error ts=2023-09-04T12:10:39.393918167Z caller=client.go:294 component=client host= msg="final error sending batch" status=-1 error="Post \"url\": unsupported protocol scheme \"\""
Sep  4 12:17:01 my_instance CRON[3242]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Sep  4 13:06:29 my_instance systemd[1]: janus-gateway.service: Main process exited, code=killed, status=6/ABRT
Sep  4 13:06:29 my_instance systemd[1]: janus-gateway.service: Failed with result 'signal'.
Sep  4 13:06:29 my_instance systemd[1]: janus-gateway.service: Consumed 44min 33.685s CPU time.
Sep  4 13:06:29 my_instance systemd[1]: janus-gateway.service: Scheduled restart job, restart counter is at 1.
Sep  4 13:06:29 my_instance systemd[1]: Stopped Janus WebRTC Server.
Sep  4 13:06:29 my_instance systemd[1]: janus-gateway.service: Consumed 44min 33.685s CPU time.
Sep  4 13:06:29 my_instance systemd[1]: Started Janus WebRTC Server.

Crash log (without core dump) is on Pastebin, because it busts the character limit on here.

Relevant part of the decoded core dump:

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140646270621248) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140646270621248) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140646270621248, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007fec55192476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007fec551787f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007fec5598db57 in  () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x00007fec559e770f in g_assertion_message_expr () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#7  0x00007fec55d04456 in priv_prune_pending_checks (component=0x7feac40af150, stream=0x7feac408af10, agent=0x7fe9f402ecf0) at ../agent/conncheck.c:3056
#8  conn_check_update_check_list_state_for_ready (agent=agent@entry=0x7fe9f402ecf0, stream=stream@entry=0x7feac408af10, component=component@entry=0x7feac40af150) at ../agent/conncheck.c:2205
#9  0x00007fec55d09ad8 in conn_check_prune_socket (agent=agent@entry=0x7fe9f402ecf0, stream=stream@entry=0x7feac408af10, component=component@entry=0x7feac40af150, sock=sock@entry=0x7feaa8053d80)
    at ../agent/conncheck.c:5052
#10 0x00007fec55cff0a5 in nice_component_remove_socket (agent=0x7fe9f402ecf0, cmp=0x7feac40af150, nsocket=0x7feaa8053d80) at ../agent/component.c:178
#11 0x00007fec55cfbcf5 in component_io_cb (gsocket=<optimized out>, condition=<optimized out>, user_data=<optimized out>) at ../agent/agent.c:6487
#12 0x00007fec55ba149b in  () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#13 0x00007fec559c4c44 in g_main_context_dispatch () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#14 0x00007fec55a1a258 in  () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#15 0x00007fec559c42b3 in g_main_loop_run () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#16 0x000056357d6eaa35 in janus_ice_handle_thread (data=0x7fec3c04dad0) at ice.c:1281
#17 0x00007fec559f3a51 in  () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#18 0x00007fec551e4b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#19 0x00007fec55276a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

From libnice/agent/conncheck.c (the excerpt ends at line 3056, the asserting line):

static guint priv_prune_pending_checks (NiceAgent *agent, NiceStream *stream, NiceComponent *component)
{
  GSList *i;
  guint64 priority;
  guint in_progress = 0;
  guint triggered_check = 0;
  gchar prio[NICE_CANDIDATE_PAIR_PRIORITY_MAX_SIZE];

  nice_debug ("Agent %p: Pruning pending checks for s%d/c%d",
      agent, stream->id, component->id);

  /* Called when we have at least one selected pair */
  priority = component->selected_pair.priority;
  g_assert (priority > 0); // line 3056 here
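
For context on what that priority is: it is the candidate-pair priority of the component's selected pair, computed per RFC 8445 §6.1.2.3 from the two candidates' own priorities. Below is a minimal sketch of that computation (the function and variable names are mine, not libnice's). Since both candidates in a usable pair normally have non-zero priorities, a value of 0 here would suggest the selected pair was cleared, or never populated, by the time the socket-pruning path in the backtrace reached this assertion.

#include <stdint.h>

/* Illustration only (not libnice code): candidate-pair priority per
 * RFC 8445 §6.1.2.3, where G is the controlling agent's candidate
 * priority and D the controlled agent's:
 *   pair_priority = 2^32 * MIN(G, D) + 2 * MAX(G, D) + (G > D ? 1 : 0)
 */
static uint64_t candidate_pair_priority (uint32_t g, uint32_t d)
{
  uint64_t lo = (g < d) ? g : d;   /* MIN(G, D) */
  uint64_t hi = (g > d) ? g : d;   /* MAX(G, D) */

  return (lo << 32) + 2 * hi + ((g > d) ? 1 : 0);
}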

Janus Configs

janus.jcfg

general: {
        configs_folder = "/opt/janus/etc/janus"                 # Configuration files folder
        plugins_folder = "/opt/janus/lib/janus/plugins"         # Plugins folder
        transports_folder = "/opt/janus/lib/janus/transports"   # Transports folder
        events_folder = "/opt/janus/lib/janus/events"           # Event handlers folder
        loggers_folder = "/opt/janus/lib/janus/loggers"         # External loggers folder
        debug_level = 4                                         # Debug/logging level, valid values are 0-7
}

media: { rtp_port_range = "10001-65535" }

nat: {
        nice_debug = false
        full_trickle = true
        ice_lite = true
        ice_tcp = true
        ignore_mdns = true
        ice_enforce_list = "ens3,ens4"
}

plugins: {
        disable = "libjanus_voicemail.so,libjanus_recordplay.so,libjanus_audiobridge.so,libjanus_echotest.so,libjanus_nosip.so,libjanus_sip.so,libjanus_streaming.so,libjanus_textroom.so,libjanus_videocall.so,libjanus_videoroom.so"
}

transports: {
        disable = "libjanus_rabbitmq.so,libjanus_http.so,libjanus_mqtt.so,libjanus_nanomsg.so,libjanus_pfunix.so"
}

events: {
        disable = "libjanus_sampleevh.so,libjanus_gelfevh.so,libjanus_mqttevh.so,libjanus_nanomsgevh.so,libjanus_rabbitmqevh.so"
}

On top of that, we've encountered the same issue on two other instances, with the same backtrace in gdb.

Questions related to this:

  1. Apport seems to be using a lot of resources when it produces its report. I worry that this only adds to the problem, as it seems to prevent the instance from either stabilizing or rebooting. Any suggestions about this? I don't think disabling apport is usually something one wants to do… but should I? (I've sketched below, right after question 2, what I have in mind.)

  2. Would we be better off trying to implement the multistream version instead of the legacy one? We've been delaying that because we lacked the time, but if it could help, hell, I'd be willing to look into it.
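
Regarding question 1, here is the kind of thing I have in mind, purely as a sketch I have not validated: a hypothetical systemd drop-in for janus-gateway.service (created with something like systemctl edit janus-gateway.service). The blunter alternative would be setting enabled=0 in /etc/default/apport, at the cost of losing crash reports entirely.

# Hypothetical drop-in: /etc/systemd/system/janus-gateway.service.d/override.conf
[Service]
# Don't write the huge core dump that apport then spends ages processing
LimitCORE=0
# Restart promptly after the SIGABRT
Restart=on-failure
RestartSec=2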

Any help is appreciated :pray:

I don’t think Janus has anything to do with it. Update libnice before you try anything else, since that’s where it’s crashing.

An update on the situation: we have since updated libnice, but we are still having issues. (I work on the same team as @pticrix.) We have not updated Janus yet and are still on 0.13.3; do you think the latest 0.x version could fix the issue? I saw a mention of "Fixed rare crash in AudioBridge" in the changelog, but there is no link provided for that entry. Can you give us some insight into the fixed problem?

For reference, here is the extracted syslog portion showing the error; before that, everything runs smoothly:

Jan 23 15:19:21 janus-gateway-5-classilio-gra7 janus[10305]: libnice:ERROR:../agent/conncheck.c:3056:priv_prune_pending_checks: assertion failed: (priority > 0)
Jan 23 15:19:21 janus-gateway-5-classilio-gra7 janus[10305]: Bail out! libnice:ERROR:../agent/conncheck.c:3056:priv_prune_pending_checks: assertion failed: (priority > 0)
Jan 23 15:19:25 janus-gateway-5-classilio-gra7 systemd[1]: janus-gateway.service: Main process exited, code=killed, status=6/ABRT
Jan 23 15:19:25 janus-gateway-5-classilio-gra7 systemd[1]: janus-gateway.service: Failed with result 'signal'.
Jan 23 15:19:25 janus-gateway-5-classilio-gra7 systemd[1]: janus-gateway.service: Consumed 43min 59.314s CPU time.
Jan 23 15:19:26 janus-gateway-5-classilio-gra7 systemd[1]: janus-gateway.service: Scheduled restart job, restart counter is at 2.
Jan 23 15:19:26 janus-gateway-5-classilio-gra7 systemd[1]: Stopped Janus WebRTC Server.
Jan 23 15:19:26 janus-gateway-5-classilio-gra7 systemd[1]: janus-gateway.service: Consumed 43min 59.314s CPU time.

We only fix crashes in Janus, not libnice. If that’s still the error, you need to report it to the libnice repo.

Alright, seems fair! I found an existing issue on the libnice tracker about this. In the meantime, would it be possible not to kill the whole Janus service because of such an error? Maybe restart libnice, or drop the faulty connection/session, instead of crashing the server. Do you think that would be possible?

That’s not how libraries work :slightly_smiling_face:
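
To illustrate (a deliberately minimal sketch of my own, not Janus or libnice code): a failed g_assert() inside any library loaded into the process ends in abort(), so the SIGABRT takes down the whole hosting process. There is no per-library restart, and nothing gets a chance to drop just the faulty session.

#include <glib.h>

int main (void)
{
  guint64 priority = 0;      /* same shape as the failing check in conncheck.c */
  g_assert (priority > 0);   /* fails -> g_assertion_message_expr -> abort() -> SIGABRT for the whole process */
  return 0;                  /* never reached */
}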
