I noticed that in the Janus EchoTest, and also in our application, temporal simulcast no longer seems to work for VP8. I click on L, but I keep receiving 24 or 30 FPS. I remember that 1-2 years ago it was working fine and lowering the FPS. Could you double-check temporal layers in VP8? Maybe something broke during the VP9 simulcast refactoring effort? I noticed that L temporal works fine for us in VP9 simulcast, but not in VP8 anymore.
I've just tried the EchoTest with Firefox and it works as expected for me: the UI shows three buttons for temporal layers, but there really only are two (TL1 and TL0).
With Chrome apparently only TL0 exists, but that's not a problem in Janus. It probably means Chrome stopped sending temporal layers for VP8, since they still do work with Firefox.
But if I look in the WebRTC stats, I still receive a high FPS (around 30), even if I click TL0. Can you check the actual traffic and confirm that FPS and bytes per second are lower in the WebRTC stats?
about:webrtc in Firefox and chrome://webrtc-internals
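To make the check concrete, here is a minimal sketch of how FPS and bytes per second can be derived from two snapshots of the inbound video stats. It assumes each snapshot is a plain dict carrying the standard `timestamp` (milliseconds), `framesDecoded` and `bytesReceived` fields of an `inbound-rtp` stats entry; the function name and the sample numbers are made up for illustration.

```python
from typing import Dict, Tuple


def rates_between(prev: Dict[str, float], curr: Dict[str, float]) -> Tuple[float, float]:
    """Return (frames per second, bytes per second) between two stats snapshots."""
    elapsed = (curr["timestamp"] - prev["timestamp"]) / 1000.0  # timestamps are in ms
    if elapsed <= 0:
        raise ValueError("snapshots must be in chronological order")
    fps = (curr["framesDecoded"] - prev["framesDecoded"]) / elapsed
    bytes_per_sec = (curr["bytesReceived"] - prev["bytesReceived"]) / elapsed
    return fps, bytes_per_sec


# Made-up sample values, taken two seconds apart (hypothetical, not measured)
before = {"timestamp": 1000.0, "framesDecoded": 300, "bytesReceived": 500_000}
after = {"timestamp": 3000.0, "framesDecoded": 360, "bytesReceived": 700_000}
fps, bytes_per_sec = rates_between(before, after)
print(f"{fps:.1f} fps, {bytes_per_sec:.0f} bytes/s")
```

If a lower temporal layer is really being forwarded, both values should drop shortly after clicking the button.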
That's because it's the only temporal layer that exists. My guess is Chrome stopped adding temporal layers when doing VP8 simulcast.
I see there's a new rtp-hdrext/video-layers-allocation00 extension being negotiated, which apparently provides info on which layers exist. I'll take note of negotiating/parsing that extension for experimentation, as that may give more details on what a sender is actually configured to send.
Thank you for the prompt response! I noticed this old Chrome discussion from 2020 (https://groups.google.com/g/discuss-webrtc/c/N1sMEBJhOz4), from when Chrome switched from 3 to 2 temporal layers for VP8. I also noticed this comment, not sure if it's related:
There are platforms where Chrome uses HW encoders that don't have temporal layer support. In which case Chrome will only send a single temporal layer.
I doubt my Dell laptop has a VP8 hardware encoder
FYI, I created a basic parser for that extension, which seems to say there should still be two temporal layers:
a1549c048407b401ac023c6404ff02cf1e027f01671e013f00b31e (54 bytes)
-- 27 bytes
a1 54 9c 04 84 07 b4 01 ac 02 3c 64 04 ff 02 cf 1e 02 7f 01 67 1e 01 3f 00 b3 1e
-- -- rid=2, ns=2 (3 RTP streams), sl_bm=1 (0001)
-- -- -- RTP #0, sl_bm=1 (0001)
-- -- -- RTP #1, sl_bm=1 (0001)
-- -- -- RTP #2, sl_bm=1 (0001)
-- -- Temporal layers (54)
-- -- -- RTP #0, sl=0, tl=1 (2 temporal layers)
-- -- -- RTP #1, sl=0, tl=1 (2 temporal layers)
-- -- -- RTP #2, sl=0, tl=1 (2 temporal layers)
-- -- Target bitrates (9c 04 84 ...)
-- -- -- RTP #0, sl=0, tl=0, bitrate=540
-- -- -- RTP #0, sl=0, tl=1, bitrate=900
-- -- -- RTP #1, sl=0, tl=0, bitrate=180
-- -- -- RTP #1, sl=0, tl=1, bitrate=300
-- -- -- RTP #2, sl=0, tl=0, bitrate=60
-- -- -- RTP #2, sl=0, tl=1, bitrate=100
-- -- Resolutions (04 ff 02 ...)
-- -- -- RTP #0, sl=0, res=1280x720 @ 30fps
-- -- -- RTP #1, sl=0, res=640x360 @ 30fps
-- -- -- RTP #2, sl=0, res=320x180 @ 30fps
Parsed 27/27 bytes
Bye!
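For anyone who wants to reproduce the dump above, here is a minimal Python sketch of such a parser. It is not the parser from the Janus PR: it follows my reading of the extension layout (one header byte, 2-bit temporal layer counts, LEB128 bitrates in kbps, optional 5-byte resolution entries) and has only been checked against the 27 bytes shown above. The function names are hypothetical, and the case where `sl_bm` is 0 (per-stream bitmasks) is deliberately left out.

```python
from typing import Dict, Tuple


def read_leb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value; return (value, next position)."""
    value, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_layers_allocation(data: bytes) -> Dict:
    rid = data[0] >> 6                          # index of the stream carrying this extension
    num_streams = ((data[0] >> 4) & 0x03) + 1   # "ns" is the number of streams minus one
    sl_bm = data[0] & 0x0F                      # spatial layer bitmask shared by all streams
    if sl_bm == 0:
        # Per-stream bitmasks follow in this case; not handled in this sketch.
        raise NotImplementedError("per-stream spatial layer bitmasks")
    pos = 1
    # One (stream, spatial layer) entry per bit set in the shared bitmask.
    layers = [(s, sl) for s in range(num_streams) for sl in range(4) if sl_bm & (1 << sl)]
    # Temporal layer counts: 2 bits each (count minus one), zero-padded to a byte boundary.
    tl_counts = []
    for i in range(len(layers)):
        byte = data[pos + i // 4]
        tl_counts.append(((byte >> (6 - 2 * (i % 4))) & 0x03) + 1)
    pos += (len(layers) + 3) // 4
    # Target bitrates in kbps: one LEB128 value per temporal layer.
    bitrates = []
    for (stream, sl), count in zip(layers, tl_counts):
        for tl in range(count):
            kbps, pos = read_leb128(data, pos)
            bitrates.append((stream, sl, tl, kbps))
    # Optional resolutions: width-1 and height-1 (16 bits each), then max fps (8 bits).
    resolutions = []
    if pos < len(data):
        for stream, sl in layers:
            width = int.from_bytes(data[pos:pos + 2], "big") + 1
            height = int.from_bytes(data[pos + 2:pos + 4], "big") + 1
            resolutions.append((stream, sl, width, height, data[pos + 4]))
            pos += 5
    return {"rid": rid, "num_streams": num_streams, "temporal_layers": tl_counts,
            "bitrates": bitrates, "resolutions": resolutions, "parsed": pos}


raw = bytes.fromhex("a1549c048407b401ac023c6404ff02cf1e027f01671e013f00b31e")
result = parse_layers_allocation(raw)
for stream, sl, tl, kbps in result["bitrates"]:
    print(f"RTP #{stream}, sl={sl}, tl={tl}, bitrate={kbps}")
```

Running it on the same hex string yields the same bitrates and resolutions as the dump above, including two temporal layers per stream.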
When I do an unencrypted Wireshark capture, though, I see different things:
- Firefox is setting X=1 in the VP8 descriptor to specify there's more info, and then T=1 to say the descriptor contains the temporal layer index. This is what we parse and use for simulcast switching.
- Chrome is setting X=0, which means all the additional info we need isn't there. I don't know if it isn't there because, despite what the extension says, there actually aren't multiple temporal layers, or if it's because they're now signalling that info somewhere else (maybe the AV1 descriptor? but why? that's optional)
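The difference between the two browsers can be illustrated with a small Python sketch that reads the temporal layer index from a VP8 payload descriptor, following the layout in RFC 7741. This is not the Janus code (which is C); the function name and the sample bytes are made up for illustration.

```python
from typing import Optional


def vp8_temporal_id(payload: bytes) -> Optional[int]:
    """Return the TID from a VP8 payload descriptor, or None if it is not signalled."""
    if not payload or not payload[0] & 0x80:
        return None  # X=0: there is no extension byte, so no temporal info at all
    if len(payload) < 2:
        return None
    ext = payload[1]
    has_picture_id = ext & 0x80   # I bit
    has_tl0picidx = ext & 0x40    # L bit
    has_tid = ext & 0x20          # T bit
    pos = 2
    if has_picture_id:
        if pos >= len(payload):
            return None
        # M bit set means a 15-bit PictureID spanning two bytes, otherwise one byte
        pos += 2 if payload[pos] & 0x80 else 1
    if has_tl0picidx:
        pos += 1
    if not has_tid or pos >= len(payload):
        return None
    return payload[pos] >> 6      # TID is the top two bits of the TID/Y/KEYIDX byte


# Hypothetical descriptors: one with X=1 and T=1 (TID=1), one with X=0
firefox_like = bytes([0x90, 0xE0, 0x92, 0x34, 0x05, 0x60])
chrome_like = bytes([0x10, 0x9D, 0x01, 0x2A])
print(vp8_temporal_id(firefox_like), vp8_temporal_id(chrome_like))
```

With X=0 the function has nothing to read, which is why switching temporal layers has no effect on such a stream.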
Thanks to Philipp Hancke, we found out the root cause of the issue:
https://issues.webrtc.org/issues/42226269
When the Dependency Descriptor extension is negotiated, Chrome will stop sending the temporal layer index in the payload descriptor, and put it in the DD instead. We always negotiate the extension because we might need it in case AV1 is used, but don't do anything with it for other codecs, since it's an extension that was originally conceived only for AV1 SVC usage.
As such, there are a few potential fixes:
1. just munge the SDP and remove that extension before sending the offer to Janus: this is the easiest fix that you can deploy right away, no change to Janus needed;
2. we change the code in Janus (or the plugins) so that we only negotiate the extension if AV1 is the codec we'll actually use;
3. we change the code in Janus (or the plugins) so that we try the "usual" way of getting the temporal index first, and if we can't find it, we look in the DD.
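As an illustration of the first option (munging the SDP), here is a minimal Python sketch of the string transformation involved. In a browser the same logic would be applied to the offer's SDP string before it is sent to Janus. The function name and the sample offer are hypothetical; the URI is the one the Dependency Descriptor extension is advertised with in `a=extmap` lines.

```python
DD_URI = ("https://aomediacodec.github.io/av1-rtp-spec/"
          "#dependency-descriptor-rtp-header-extension")


def strip_dependency_descriptor(sdp: str) -> str:
    """Drop every a=extmap line that advertises the Dependency Descriptor extension."""
    kept = [line for line in sdp.splitlines(keepends=True)
            if not (line.startswith("a=extmap:") and DD_URI in line)]
    return "".join(kept)


# Hypothetical, heavily shortened offer used only to exercise the function
offer = (
    "m=video 9 UDP/TLS/RTP/SAVPF 96\r\n"
    "a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid\r\n"
    f"a=extmap:12 {DD_URI}\r\n"
    "a=rtpmap:96 VP8/90000\r\n"
)
munged = strip_dependency_descriptor(offer)
print(munged)
```

Only the matching `a=extmap` line is removed; the other lines and their line endings are left untouched.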
My guess is that 3. will be the "cleanest" approach in the longer run, but it's also the one that will probably take me more time, as I'll have to check the implications of using resources I've mostly hardcoded to AV1/SVC for something else as well.
I ended up implementing 2., which was much easier and quicker to do, and transparent to plugins (no change needed there): Don't accept Dependency Descriptor extension unless the negotiated co… · meetecho/janus-gateway@13e7260 · GitHub
In case we want to go for 3. in the future, that will need a dedicated refactoring, as the code does make many assumptions associated with SVC. It will probably require a new dedicated structure or set of utilities that can transparently use payload descriptors or DD by providing a unified C API to plugins.
PS: I created a PR for the parsing of the video-layers-allocation00 RTP extension: the patch makes partial use of it in the EchoTest demo; if it works and makes sense, we can extend that to the VideoRoom plugin.