How to extract the audio of a stream with ffmpeg

Hello :wave:

Apologies if this is slightly off-topic for this forum, but we’re unsure where else to ask for help regarding this issue.

The Chilean Senate uses Janus to stream its live sessions (available at https://tv.senado.cl/), and we are trying to produce a live transcription of these sessions.

Here’s how we currently transcribe streams from the Chamber of Deputies, which uses YouTube for streaming:

  1. We capture 5 minutes of live audio from the stream using ffmpeg.
  2. We send the audio to the Whisper API to generate the transcript.

However, when we try the same approach with the Janus m3u8 stream, it fails: ffmpeg cannot open the Janus m3u8 stream URL.

For example, when we request a stream URL like this one:

https://janus-tv-ply.senado.cl/playlist/stream_hls.m3u8?s=tvsenado-sd&t=1727214595&rand=0.8606446947152353

we get the following response instead of a standard HLS playlist:

status:ok
uuid:8aba7643-0365-4213-abdd-b48cc73b5f17
start:1725915602
last:1727214852
files:
https://janus-tv-jhs.senado.cl/hls/tvsenado-sd/1727136000/1727211600/tvsenado-sd_1727214592.ts
https://janus-tv-jhs.senado.cl/hls/tvsenado-sd/1727136000/1727211600/tvsenado-sd_1727214602.ts
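One possible way to work with this response is to parse it ourselves and pull out the segment URLs. The sketch below assumes the format is exactly what the sample above suggests (`key:value` lines, then a `files:` marker followed by one URL per line). That layout is inferred from this single response and is not documented anywhere we know of, and the names `SegmentList` and `parse_segment_list` are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentList:
    """Parsed form of the non-standard playlist response (hypothetical type)."""
    fields: dict[str, str] = field(default_factory=dict)
    files: list[str] = field(default_factory=list)


def parse_segment_list(text: str) -> SegmentList:
    """Parse 'key:value' header lines, then the URLs listed after 'files:'.

    The layout is an assumption based on one observed response.
    """
    result = SegmentList()
    in_files = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if in_files:
            result.files.append(line)  # every line after 'files:' is a segment URL
        elif line == "files:":
            in_files = True
        else:
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line.partition(":")
            if sep:
                result.fields[key.strip()] = value.strip()
    return result


SAMPLE = """status:ok
uuid:8aba7643-0365-4213-abdd-b48cc73b5f17
start:1725915602
last:1727214852
files:
https://janus-tv-jhs.senado.cl/hls/tvsenado-sd/1727136000/1727211600/tvsenado-sd_1727214592.ts
https://janus-tv-jhs.senado.cl/hls/tvsenado-sd/1727136000/1727211600/tvsenado-sd_1727214602.ts
"""

parsed = parse_segment_list(SAMPLE)
```

Here `parsed.files` holds the two `.ts` URLs and `parsed.fields["status"]` is `"ok"`. The numbers in the two file names differ by 10, which may mean each segment covers about 10 seconds, but that is a guess.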

This doesn’t look like a standard .m3u8 playlist, which would start with #EXTM3U and describe each segment with an #EXTINF tag. That might be why ffmpeg and even VLC are unable to handle the stream properly.

Our goal:

We would like to get a live audio feed in a format like .mp3 (or another audio format) so that we can send it to Whisper for transcription. We are relatively new to Janus, so it’s possible we are misunderstanding something fundamental here.
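As a rough idea of what we are aiming for: if the .ts segments were already downloaded to local files, ffmpeg's `concat:` protocol could join them and extract the audio as MP3. This is a sketch under that assumption, not something we have working. The function name and file names are hypothetical, and it assumes an ffmpeg build with `libmp3lame`.

```python
def build_concat_command(segment_paths: list[str], out_path: str) -> list[str]:
    """Build an ffmpeg command that joins local MPEG-TS files and keeps only audio.

    Uses ffmpeg's 'concat:' protocol, which takes paths separated by '|'.
    """
    if not segment_paths:
        raise ValueError("at least one segment is required")
    if any("|" in path for path in segment_paths):
        # '|' is the separator of the concat protocol, so it cannot appear in a path.
        raise ValueError("segment paths must not contain '|'")
    return [
        "ffmpeg",
        "-y",                                      # overwrite the output if it exists
        "-i", "concat:" + "|".join(segment_paths),  # join the segments in order
        "-vn",                                     # drop the video track
        "-c:a", "libmp3lame",                      # encode the audio as MP3
        out_path,
    ]
```

For example, `build_concat_command(["seg1.ts", "seg2.ts"], "chunk.mp3")` returns a command list that could be passed to `subprocess.run(..., check=True)`.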

Questions:

  1. Is there a specific way to handle this type of Janus-generated .m3u8 file for extracting live audio?
  2. Is there a method or plugin within Janus that would allow us to get a continuous audio stream in .mp3 or a more standard format that tools like ffmpeg can process?

We appreciate any guidance you can provide. Thanks in advance for your help!

– Toto

I don’t know what you mean by "Janus m3u8 stream": the m3u8 most definitely doesn’t come from us, as we only deal with WebRTC, not HLS. I don’t know if whoever built the platform for you built an HLS restreamer out of a Janus WebRTC stream, or if you’re talking about a different Janus application?

If you’re using the Janus WebRTC Server, then there are different ways of doing a live transcription. We personally use RTP forwarders to send the audio to a separate backend that does the transcription (in the past using Deepgram, more recently using our own Whisper-based Whispy). That backend could also be configured to do what you’re doing now (accumulate 5 minutes of audio and transcribe it).
