Chatbot plugin (audio+text) advice

Hello there!

My passion is creating various automated (chat)bots and then connecting them with the outside world through any means available. Lately, I have been thinking about expanding the possibilities by adopting WebRTC and, among other things, tried Janus. It is a big chunk of tech to get used to, so applause for your thorough documentation! But to the point:

What would be the most straightforward way to connect a single user with an audio and text bot “backend”?
Audio: IN + OUT
Text: IN + OUT

I understand the generic way is to write a plugin from scratch that does the “routing” of both modalities (text messages via DataChannel, audio separately). I wonder whether I have overlooked something obvious that already exists: for example, the Streaming plugin could theoretically be used for “routing” audio from the bot to the user (for the other direction, the Record plugin is the closest fit). I could create a patchwork of existing plugins to build exactly the chimaera I need.

Could someone kindly advise me whether I am thinking in the right direction, or whether I am heading for a cliff? Is there any existing example code to push me in the right direction? Your thoughts are much appreciated.

The easiest way is using VideoRoom + RTP forwarders on the way in: this way, the audio/video of the user can be forwarded to an external application via plain RTP, and from there you can do whatever you want (recently I used it to do transcriptions with Deepgram, for instance). If you also need media to be sent back to the user, then you need a separate PeerConnection, probably with the Streaming plugin, which can do RTP→WebRTC instead.

This presentation will give you an overview of both: FOSDEM 2020 - Janus as a WebRTC "enabler"
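
To make the "plain RTP to an external application" part a bit more concrete, here is a minimal sketch of what the receiving side could look like in Python. It only parses the fixed RTP header as defined in RFC 3550 and hands back the payload. The listening address and port (`127.0.0.1:5002`) and the names `RtpPacket`, `parse_rtp` and `receive_forever` are assumptions made up for illustration; the port has to match whatever you configure in the forwarder. The payload is still encoded audio (typically Opus with WebRTC), so a real bot would need to decode it before doing anything like transcription.

```python
import socket
import struct
from typing import NamedTuple


class RtpPacket(NamedTuple):
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    marker: bool
    payload: bytes


def parse_rtp(datagram: bytes) -> RtpPacket:
    """Parse the fixed RTP header (RFC 3550) and return the fields plus payload."""
    if len(datagram) < 12:
        raise ValueError("datagram shorter than the 12-byte RTP header")
    first, second, seq, ts, ssrc = struct.unpack("!BBHII", datagram[:12])
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    has_padding = bool(first & 0x20)
    has_extension = bool(first & 0x10)
    csrc_count = first & 0x0F
    offset = 12 + 4 * csrc_count  # skip the optional CSRC list
    if has_extension:
        if len(datagram) < offset + 4:
            raise ValueError("truncated RTP header extension")
        # Extension header: 16-bit profile id, 16-bit length in 32-bit words.
        _, ext_words = struct.unpack("!HH", datagram[offset:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(datagram)
    if has_padding:
        end -= datagram[-1]  # last byte holds the padding length
    if offset > end:
        raise ValueError("malformed RTP packet")
    return RtpPacket(second & 0x7F, seq, ts, ssrc,
                     bool(second & 0x80), datagram[offset:end])


def receive_forever(host: str = "127.0.0.1", port: int = 5002) -> None:
    """Listen for forwarded RTP on a hypothetical host/port and print a summary."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        data, _addr = sock.recvfrom(2048)
        pkt = parse_rtp(data)
        # Hand pkt.payload to a decoder / bot backend here.
        print(pkt.sequence_number, pkt.timestamp, len(pkt.payload))


if __name__ == "__main__":
    receive_forever()
```

The way back (bot to user) would be roughly the mirror image: the backend builds RTP packets with the same header layout and sends them over UDP to the port the Streaming plugin mountpoint listens on.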

Much appreciated, cheers!
This is a lot to take in, but knowing the right direction always helps when wandering through the software jungle.