Referencing issue possibly?

Hi, I’m investigating crash on our fork then helper threads are enabled and found some missing reference increment to room here: janus-gateway/src/plugins/janus_videoroom.c at master · meetecho/janus-gateway · GitHub

In other places publisher->room reference is incremented(in dummy and remote publishers). Is it correct or bug maybe?

I’m not sure crash is occuring on master version so I don’t make issue on github, since I can’t provide any details now, continue to digging into it. In our case looks like room or streams references cleared ahead of time and janus crashes in helper thread on room destroy.

I also found what ref counter is incremented on helper threads creation, before it was initialized:

this one is ok, since reference created earlier on janus-gateway/src/plugins/janus_videoroom.c at master · meetecho/janus-gateway · GitHub

but I found a problem - if helper threads are enabled and if room is destroyed via ‘destroy’ call, refs to room from helper threads aren’t cleared

Dummy publishers were using references incorrectly: see here for a fix we’ve prepared that should be merged soon. As such, that’s what should be used as an example as to how references are handled.

I haven’t investigated how helper threads use references, as I can’t remember on the top of my head. Any additional reference that may be needed will need to be dealt with accordingly when cleaning up, though: this was the issue with dummy publishers, for instance, which don’t go through the same cleanup procedures regular and remote publishers go through. Besides, there are differences between destroying a room using “destroy” and having it destroyed by just shutting down Janus: the former may have more things happening, while the latter relies entirely on the callback that’s invoked when the room is removed from its hashtable.

I got the following on room destroy(don’t look on line numbers, this is our fork, so it’s not related to master of janus):

2025-04-23T09:24:05.167031831Z [plugins/janus_videoroom.c:janus_videoroom_room_destroy:2891:decrease] 0x61200061a3b8 (38)
2025-04-23T09:24:05.167074441Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (37)
2025-04-23T09:24:05.167262265Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (36)
2025-04-23T09:24:05.167401455Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (35)
2025-04-23T09:24:05.167581355Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (34)
2025-04-23T09:24:05.167683167Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (33)
2025-04-23T09:24:05.167688216Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (32)
2025-04-23T09:24:05.167839583Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (31)
2025-04-23T09:24:05.167850145Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (30)
2025-04-23T09:24:05.167853535Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (29)
2025-04-23T09:24:05.167905480Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (28)
2025-04-23T09:24:05.168003048Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x61200061a3b8 (27)
2025-04-23T09:24:05.168009406Z [plugins/janus_videoroom.c:janus_videoroom_process_synchronous_request:5661:decrease] 0x61200061a3b8 (26)
2025-04-23T09:24:05.199128516Z [plugins/janus_videoroom.c:janus_videoroom_destroy_session:4628:decrease] 0x61200061a3b8 (25)
2025-04-23T09:24:05.221558701Z [plugins/janus_videoroom.c:janus_videoroom_destroy_session:4628:decrease] 0x61200061a3b8 (24)
2025-04-23T09:24:05.227838943Z [plugins/janus_videoroom.c:janus_videoroom_destroy_session:4628:decrease] 0x61200061a3b8 (23)
2025-04-23T09:24:05.231123795Z [plugins/janus_videoroom.c:janus_videoroom_destroy_session:4628:decrease] 0x61200061a3b8 (22)
2025-04-23T09:24:05.234499909Z [plugins/janus_videoroom.c:janus_videoroom_destroy_session:4628:decrease] 0x61200061a3b8 (21)
2025-04-23T09:24:05.237615655Z [plugins/janus_videoroom.c:janus_videoroom_destroy_session:4628:decrease] 0x61200061a3b8 (20)
2025-04-23T09:24:05.916402152Z [plugins/janus_videoroom.c:janus_videoroom_remote_publisher_thread:14615:decrease] 0x61200061a3b8 (19)
2025-04-23T09:24:05.969971884Z [plugins/janus_videoroom.c:janus_videoroom_remote_publisher_thread:14615:decrease] 0x61200061a3b8 (18)
2025-04-23T09:24:06.124293857Z [plugins/janus_videoroom.c:janus_videoroom_remote_publisher_thread:14615:decrease] 0x61200061a3b8 (17)
2025-04-23T09:24:06.162886473Z [plugins/janus_videoroom.c:janus_videoroom_remote_publisher_thread:14615:decrease] 0x61200061a3b8 (16)

16 - is the number of our helper threads

I think since helper threads use g_async_queue_pop, it blocks forever if no data available and thread will run forever even if room already detroyed (janus-gateway/src/plugins/janus_videoroom.c at master · meetecho/janus-gateway · GitHub )

maybe we should change it to g_async_queue_timeout_pop? to force while cycle check for room->destroyed periodically?

You can probably check with gdb if that’s indeed where those threads are waiting. In the core we artificially inject a fake “exit” object whose only purpose is to wake those pop and break the loop (e.g., janus_ice_detach_handle in ice.c), so I guess we could do the same here.

yeah, in this case helper thread awaits for exit_packet, but exit_packet queued on janus_videoroom_room_free, which called if room ref counter goes to 0, looks like deadlock for me

I think the problem may be that we only add exit_packet in janus_videoroom_room_free, but that’s the method that’s invoked when the room’s references get to 0: since helper threads hold a ref too, that never happens, so it’s indeed a catch-22 kind of situation. Maybe that block of code should be moved to janus_videoroom_room_destroy, even though we should then ensure no race condition occurs for publishers still using them.

moving to janus_videoroom_room_destroy fixes it:

2025-04-24T07:34:08.113640416Z [plugins/janus_videoroom.c:janus_videoroom_leave_or_unpublish:4530:increase] 0x612000023b38 (21)
2025-04-24T07:34:08.113859596Z [plugins/janus_videoroom.c:janus_videoroom_leave_or_unpublish:4569:decrease] 0x612000023b38 (20)
2025-04-24T07:34:08.126752014Z [plugins/janus_videoroom.c:janus_videoroom_process_synchronous_request:5615:increase] 0x612000023b38 (20)
2025-04-24T07:34:08.126816872Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (19)
2025-04-24T07:34:08.126941673Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (18)
2025-04-24T07:34:08.126955350Z [plugins/janus_videoroom.c:janus_videoroom_room_destroy:2899:decrease] 0x612000023b38 (17)
2025-04-24T07:34:08.127135918Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x612000023b38 (16)
2025-04-24T07:34:08.127252828Z [plugins/janus_videoroom.c:janus_videoroom_room_dereference:2886:decrease] 0x612000023b38 (15)
2025-04-24T07:34:08.127551317Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (14)
2025-04-24T07:34:08.127561325Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (13)
2025-04-24T07:34:08.127569269Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (12)
2025-04-24T07:34:08.127582185Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (11)
2025-04-24T07:34:08.127660782Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (11)
2025-04-24T07:34:08.127687402Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (9)
2025-04-24T07:34:08.127707631Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (8)
2025-04-24T07:34:08.127719653Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (7)
2025-04-24T07:34:08.127746333Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (6)
2025-04-24T07:34:08.127753205Z [plugins/janus_videoroom.c:janus_videoroom_process_synchronous_request:5663:decrease] 0x612000023b38 (5)
2025-04-24T07:34:08.127761791Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (4)
2025-04-24T07:34:08.127790850Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (3)
2025-04-24T07:34:08.127799871Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (2)
2025-04-24T07:34:08.127962664Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (1)
2025-04-24T07:34:08.127976467Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14678:decrease] 0x612000023b38 (0)

Cool! Before submitting a patch to do that, though, you may want to make sure that messing with room->threads there doesn’t cause any issue in other places, e.g., publishers still pushing traffic to the room and trying to use helpers for that before they’re removed.

How can be be sure, can you suggest?
We run janus under load testing and I see no problems now(janus doesn’t crash) all streams are cleared before room is destroyed.
Maybe check for room->destroyed before trying to iterate on room->threads?

I also found what helper->ref != 0 on room destroy:

2025-04-25T13:06:17.760739783Z [plugins/janus_videoroom.c:janus_videoroom_process_synchronous_request:5071:init] 0x60700001d950 (1)
2025-04-25T13:06:17.760857039Z [plugins/janus_videoroom.c:janus_videoroom_process_synchronous_request:5075:increase] 0x60700001d950 (2)
2025-04-25T13:06:17.760916837Z [plugins/janus_videoroom.c:janus_videoroom_process_synchronous_request:5082:increase] 0x60700001d950 (3)
2025-04-25T13:15:46.529385498Z [plugins/janus_videoroom.c:janus_videoroom_helper_destroy:2414:decrease] 0x60700001d950 (2)
2025-04-25T13:15:46.529414289Z [plugins/janus_videoroom.c:janus_videoroom_helper_thread:14651:decrease] 0x60700001d950 (1)