Janus stops responding after running for a few hours

Hey,

I set up a video streaming system with Janus.
It runs for a few hours and then stops by itself.

I ran it under GDB, but I only saw the normal input and output, no errors.
I’m using systemd and configured it to restart the service when something abnormal happens, but that doesn’t work.

I am using the system over WebSocket.
When I check the WebSocket traffic, this is the only data I see:
{"janus":"create","transaction":"qBIQcSZcjeqa"}

I can’t reach the janus/info page until I restart Janus manually.
The log is about 9 GB, so I can’t examine it (the logs always grow to roughly this size).

How can I track down this error?

Thank you
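
For a log that is too large to open in an editor, reading only its last lines is often enough to start with. Below is a minimal sketch that reads a file backwards in blocks instead of loading all of it. The function name `tail_lines` and the path in the usage example are placeholders, not anything Janus provides.

```python
import os

def tail_lines(path, max_lines=200, block_size=64 * 1024):
    """Return the last max_lines lines of a file, reading backwards in blocks."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        data = b""
        # Prepend blocks until enough newlines are buffered or the start is reached.
        while position > 0 and data.count(b"\n") <= max_lines:
            read_size = min(block_size, position)
            position -= read_size
            f.seek(position)
            data = f.read(read_size) + data
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines[-max_lines:]

if __name__ == "__main__":
    # Hypothetical path; use wherever your Janus log is actually written.
    for line in tail_lines("/path/to/janus.log", max_lines=500):
        print(line)
```

Only the end of the file is read, so this should stay fast regardless of the total log size.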

I split the log into parts, and when I look at the last lines, the following messages appear.

I understand that the WebSocket connection is closing, but I don’t know why.

[Sun Jul  2 13:12:49 2023] [WSS-0x7f285c174c50] WS connection down, closing
[transports/janus_websockets.c:janus_websockets_destroy_client:838:lock] 0x7f285c320668
[Sun Jul  2 13:12:49 2023] [WSS-0x7f285c174c50] Destroying WebSocket client
[transports/janus_websockets.c:janus_websockets_destroy_client:846:lock] 0x7f2880bc36d0
[transports/janus_websockets.c:janus_websockets_destroy_client:849:unlock] 0x7f2880bc36d0
[transports/janus_websockets.c:janus_websockets_destroy_client:875:unlock] 0x7f285c320668
[Sun Jul  2 13:12:49 2023] A janus.transport.websockets transport instance has gone away (0x7f285c320650)
[janus.c:janus_transport_gone:3448:lock] 0x555821a6ff70
[janus.c:janus_transport_gone:3471:unlock] 0x555821a6ff70
[Sun Jul  2 13:12:49 2023] [WSS-0x7f285c174c50]   -- closed

As unlikely as it may be, there may be a deadlock, and that’s something you can investigate with gdb: just attach to the process when it’s not responding, and show the threads, and you may find one stuck somewhere, e.g. on a mutex or something else.
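
The attach-and-show-threads step can be scripted, so it can run at the moment the process stops responding. This is a minimal sketch that assumes `gdb` is installed and that you have permission to attach to the process. The helper names and the output path are made up for illustration.

```python
import subprocess

def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid and prints every thread's backtrace."""
    # -batch makes gdb exit once the -ex commands have run.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]

def dump_threads(pid, out_path):
    """Run gdb against the process and save its output for later reading."""
    result = subprocess.run(gdb_backtrace_command(pid), capture_output=True, text=True)
    with open(out_path, "w") as f:
        f.write(result.stdout)
        f.write(result.stderr)
    return result.returncode

if __name__ == "__main__":
    import sys
    # Usage: python dump_threads.py <pid of the janus process>
    dump_threads(int(sys.argv[1]), "janus-threads.txt")
```

In the saved output, several threads whose top frames sit in a mutex lock call (for example `pthread_mutex_lock`) would be a hint of a deadlock.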

Hello,
Sorry to bother you again, but I caught something.
I think WebSocket crashes.
I caught this in the logs the last time it crashed.

How do I find out why WebSocket crashed?

[Sat Jul 22 13:04:20 2023] A janus.transport.websockets transport instance has gone away (0x7f78b0402070)
[janus.c:janus_transport_gone:3448:lock] 0x555b910dbf70
[janus.c:janus_transport_gone:3471:unlock] 0x555b910dbf70
[Sat Jul 22 13:04:20 2023] [WSS-0x7f78b04e95d0]   -- closed
[transports/janus_websockets.c:janus_websockets_destroy:819:lock] 0x7f78d4bc36d0
[transports/janus_websockets.c:janus_websockets_destroy:824:unlock] 0x7f78d4bc36d0
[Sat Jul 22 13:04:20 2023] JANUS WebSockets transport plugin destroyed!
[Sat Jul 22 13:04:20 2023] Ending requests thread...
[Sat Jul 22 13:04:47 2023] [file-live-sample] Rewind! (/opt/janus/share/janus/streams/radio.alaw)

I don’t see any crash, I see the transport plugin unloading, which only happens when Janus is shutting down. Maybe you have something that sends a SIGINT or SIGTERM to Janus on that machine?

Yes, I misunderstood.
When it crashes I can still use WebSocket.
The WebSocket connects and I send a command to create a room, but I get no response.

I’ll do some research with gdb and let you know if I come across anything.

I found a small workaround so that users do not suffer.
I try to connect to the /info page of the Janus system; when the page comes back blank, I detect that and reboot the system automatically.
:slight_smile:
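
A check like this can be sketched in a few lines. The sketch treats a blank or non-JSON body, an HTTP error, or a timeout as unhealthy. The URL in the usage example is an assumption (adjust host, port, and path to your setup), and the restart action is left to the caller.

```python
import http.client
import json
import urllib.request

def looks_healthy(body):
    """True if the body parses as a JSON object, which a blank page will not."""
    try:
        return isinstance(json.loads(body), dict)
    except (ValueError, TypeError):
        return False

def check_info(url, timeout=5.0):
    """Fetch the info URL; network errors, timeouts, and blank bodies count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        return False
    return looks_healthy(body)

if __name__ == "__main__":
    # Assumed URL, shown only for illustration.
    if not check_info("http://localhost:8088/janus/info"):
        print("Janus info endpoint is not answering with JSON")
```

The timeout matters here: a frozen process may accept the connection and never answer, and without a timeout the check itself would hang.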

Could you review this log when you have time?
I captured the log from just before it crashed.

I split the 45 GB log into parts and shared the last part with you. Since I saw a destroy (“Forced to stop it here…”) in the last part, I thought the problem would be around there.

I don’t see any crash:

Stopping server, please wait...
[Tue Jul 25 23:02:20 2023] Ending sessions timeout watchdog...
[Tue Jul 25 23:02:20 2023] Sessions watchdog stopped

You or something else told Janus to shut down (maybe via a SIGINT or SIGTERM signal, as I said in my previous message), which Janus then did.
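
To tell an orderly shutdown from a freeze in a very large log, one option is to search for the shutdown messages quoted in this thread. This is a minimal sketch: the marker strings are copied from the excerpts above, while the function name and log path are placeholders.

```python
SHUTDOWN_MARKERS = (
    "Stopping server, please wait...",
    "Ending sessions timeout watchdog...",
    "JANUS WebSockets transport plugin destroyed!",
)

def find_shutdown_lines(lines):
    """Return (index, line) pairs for lines that contain a known shutdown message."""
    hits = []
    for index, line in enumerate(lines):
        if any(marker in line for marker in SHUTDOWN_MARKERS):
            hits.append((index, line.rstrip("\n")))
    return hits

if __name__ == "__main__":
    # Hypothetical path; a file object is iterated line by line, so a
    # multi-gigabyte log is never loaded into memory at once.
    with open("/path/to/janus.log", errors="replace") as log:
        for index, line in find_shutdown_lines(log):
            print(index + 1, line)
```

If these messages appear right before the outage, Janus was asked to stop. If they are absent and the log simply ends or keeps going without answering requests, a hang is the more likely explanation.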

I only run Janus on this Linux server.
When the system freezes, my bot restarts the server, and that’s it.

If Janus has been stopped, the info page cannot be reached at all.
But when the system freezes, the info page link returns a blank page.
Since the system is working normally right now, the info page returns its JSON data.

I set up logging for SIGTERM and SIGINT, so I will know if that is the cause.