The new multistream support added in #2211 made it possible to send/receive multiple streams to/from Janus in the same PeerConnection. That said, so far it hasn't been easy to really take advantage of it on the sending side when using `janus.js`: while that basic SDK does have a way to capture devices to send, it does so by using a `media` object where you specify if you want audio and/or video (and maybe data), if they should be sent/received and so on. As it is, the `media` object is very rigid, since it doesn't allow you to add more than one audio or video stream for instance, and even just updating sessions is quite a nightmare. A few users did struggle with this, which led to proposals like #2966.
I had wanted to change this and allow for more flexibility since even before we merged multistream support, so I eventually decided to bite the bullet and start working on it. As you can see, it's currently a draft PR because it's not complete (more info below), but I wanted to share the current state of the effort nevertheless, mainly to figure out if this new approach makes sense, in particular from an API perspective.
Basically, the main idea was to drop this flat `media` object when you call `createOffer` or `createAnswer`, and use an array-based approach instead, where the user provides an array called `tracks` with info on everything they're interested in. This is an example from the updated EchoTest demo code:
```javascript
echotest.createOffer(
    {
        // We want bidirectional audio and video, plus data channels
        tracks: [
            { type: 'audio', capture: true, recv: true },
            { type: 'video', capture: true, recv: true,
                // We may need to enable simulcast or SVC on the video track
                simulcast: doSimulcast, svc: (vcodec === 'av1' && doSvc) ? doSvc : null },
            { type: 'data' },
        ],
        // [..] callbacks and other properties omitted
    });
```
As you can see, the `tracks` array contains three objects:
1. an audio track, where we ask `janus.js` to capture the default audio device (`capture: true`) and we clarify we want to receive media as well (`recv: true`);
2. a video track, formatted pretty much the same way, but with additional properties like whether we want this track to do simulcast or SVC too;
3. a data channel.
This simple example should already show how much more flexible this approach is: since `tracks` is an array, we could add more than one audio or video track if we wanted, and for video possibly do simulcast for one but not all tracks, etc. In this example we're only showing how to capture generic audio, video or data, but screen sharing is supported as well, by setting `type: "screen"`.
The `capture` property is particularly important, of course, as it specifies whether we want to actually capture or send anything. Setting it to `true` is the equivalent of passing `audio: true` or `video: true` to `getUserMedia` (or `getDisplayMedia` for screen sharing). For video, this property supports the same string values we had for `media` (e.g., `hires` for 1280x720), as well as explicit constraints (e.g., for ideal/exact resolutions). Passing a `MediaStreamTrack` to `capture`, instead, will have `janus.js` skip the capture part and pass the track directly to the PeerConnection: this is the equivalent of what passing an external stream did before, meaning you can still do the `getUserMedia` yourself and then pass the result to `janus.js` directly. You can see examples of that in the canvas-based demos (e.g., canvas and virtual backgrounds), where we originate a stream externally rather than in `janus.js`. Insertable Streams should be supported as well (the e2etest demo does work) by setting the transform functions in the related `tracks` item, which is an improvement on the more generic audio/video transforms we passed globally before. That said, I haven't tested this in an SFU scenario, so if you do test that, it would be nice to know whether it works as it is.
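To make the different `capture` values a bit more concrete, here is a minimal sketch that classifies them the way the paragraph above describes. The `describeCapture` helper is hypothetical and only meant as an illustration: it is not part of `janus.js`, and it doesn't claim to mirror how the library inspects the property internally.

```javascript
// Hypothetical helper (NOT part of janus.js): classifies a `capture` value
// following the description above.
function describeCapture(capture) {
    if (capture === undefined || capture === null || capture === false)
        return 'none';            // nothing to capture or send
    if (capture === true)
        return 'default-device';  // like audio: true / video: true in getUserMedia
    if (typeof capture === 'string')
        return 'preset';          // e.g. 'hires' for 1280x720
    // Duck-typing a MediaStreamTrack, so that the sketch also runs outside a browser
    if (typeof capture.kind === 'string' && typeof capture.stop === 'function')
        return 'external-track';  // capture is skipped, the track is used as it is
    return 'constraints';         // explicit getUserMedia-style constraints
}

// A few track items using the different forms
const examples = [
    { type: 'audio', capture: true, recv: true },
    { type: 'video', capture: 'hires', recv: true },
    { type: 'video', capture: { width: { ideal: 1280 }, height: { ideal: 720 } } },
];
const kinds = examples.map(item => describeCapture(item.capture));
```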
It's important to point out you only need to provide a `tracks` array if you want to add/remove/update something: if you issue a renegotiation without providing anything, then nothing is changed, which is helpful when you don't really want to change anything but only, e.g., issue an ICE restart. This is supposed to be an improvement on how `media` handled that, where you had to pass weird properties like `keepAudio` or `keepVideo` to avoid unexpected changes. A `tracks` array is also optional when doing a `createAnswer` for which you don't want to capture anything, e.g., a subscription; this is an example from the updated Streaming demo, for instance:
```javascript
streaming.createAnswer(
    {
        jsep: jsep,
        // We only specify data channels here, as this way in
        // case they were offered we'll enable them. Since we
        // don't mention audio or video tracks, we autoaccept them
        // as recvonly (since we won't capture anything ourselves)
        tracks: [
            { type: 'data' }
        ],
        // [..] callbacks and other properties omitted
    });
```
In this case, we are providing a `tracks` array only to add an (optional) data channel, just in case data channels were offered. We're not specifying any audio or video track, which means `janus.js` will automatically accept anything that is offered as `recvonly`. If you want to reject something, add a track item with something to identify it (like the `mid` of the track) and pass `recv: false`.
Updating sessions should also be fairly easy now. First of all, if you just want to replace a mic or webcam for instance, there's a new handle method called `replaceTracks` that uses the same `tracks` array and allows you to make changes without triggering a renegotiation. An example might be:
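As an illustration of rejecting something that was offered, the sketch below builds a `tracks` array that keeps the optional data channel and rejects specific tracks by `mid`. The `buildSubscriberTracks` helper and the sample `mid` value are assumptions made up for this example, not something the PR provides.

```javascript
// Hypothetical helper (NOT part of janus.js): everything that is not listed
// is left out of the array, so it would be accepted as recvonly.
function buildSubscriberTracks(rejected) {
    // Data channels are only enabled if they were offered
    const tracks = [{ type: 'data' }];
    for (const { type, mid } of rejected) {
        // recv: false rejects the track identified by this mid
        tracks.push({ type: type, mid: mid, recv: false });
    }
    return tracks;
}

// Assumption: the offer contains a video track on mid '1' we are not interested in
const subscriberTracks = buildSubscriberTracks([{ type: 'video', mid: '1' }]);
```

The resulting array could then be passed as the `tracks` property of the `createAnswer` call shown above.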
```javascript
echotest.replaceTracks([
    {
        type: 'video',
        mid: '1', // We assume mid 1 is video
        capture: { deviceId: { exact: videoDeviceId } }
    }
]);
```
In this example we're asking `janus.js` to replace the video we're sending in `mid` 1 with a specific webcam. You can see this API in action in the updated Device Test demo. Of course you can also use a `tracks` array to update an existing session, e.g., to replace an audio track and/or stop sending video in a session: to replace a track in a new `createOffer`/`createAnswer` you need to add `replace: true`, while to remove one you pass `remove: true` instead. This example shows how we can remove a specific video stream while keeping everything else unaltered:
```javascript
echotest.createOffer({
    tracks: [{ type: 'video', mid: '1', remove: true }],
    success: function(jsep) {
        echotest.send({ message: { video: true }, jsep: jsep });
    }
});
```
The `add: true` can be used to dynamically add a new track instead (e.g., to start screensharing in a session where you're sending audio and video already):
```javascript
echotest.createOffer({
    tracks: [{ type: 'screen', add: true, capture: true }],
    success: function(jsep) {
        echotest.send({ message: { video: true }, jsep: jsep });
    }
});
```
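Since the text above mentions replacing an audio track and/or stopping video in the same session update, here is a small sketch of what a combined `tracks` array might look like. The `buildSessionUpdate` helper, its parameter names and the sample values are hypothetical; whether mixing the flags in a single renegotiation behaves well in all cases is something to verify against the PR.

```javascript
// Hypothetical helper (NOT part of janus.js): one array that swaps the
// microphone and stops sending video in the same renegotiation.
function buildSessionUpdate({ audioMid, audioDeviceId, videoMid }) {
    return [
        // replace: true swaps what we are sending on an existing mid
        { type: 'audio', mid: audioMid, replace: true,
            capture: { deviceId: { exact: audioDeviceId } } },
        // remove: true stops sending on that mid
        { type: 'video', mid: videoMid, remove: true },
    ];
}

// Assumed sample values: audio on mid '0', video on mid '1'
const update = buildSessionUpdate({
    audioMid: '0', audioDeviceId: 'my-usb-mic', videoMid: '1'
});
```

As in the previous examples, the array would be passed as `tracks` to a new `createOffer`.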
To make it easier to figure out which local and remote tracks are available, and which mids they're associated with, you can use the new handle methods `getLocalTracks()` and `getRemoteTracks()`: these are particularly useful when you want to change a specific track, but are not sure how to uniquely address it in a `tracks` item. The result also includes some additional info when available, like the track label: this is mostly useful on local devices, as that will tell you which device was captured for instance, but you may have a use for remote labels too.
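As a sketch of how that lookup might be used, the snippet below searches a list of tracks for the `mid` of a specific device and prepares a `tracks` item to remove it. The `findLocalMids` helper is hypothetical, and so is the shape of the list (objects with `mid`, `type` and `label` properties): the text only says mids and labels are included, so check the actual return value of `getLocalTracks()` before relying on these names.

```javascript
// Hypothetical helper (NOT part of janus.js). Assumption: each entry looks
// like { mid, type, label }, which may not match the real return value.
function findLocalMids(localTracks, type, labelPart) {
    return localTracks
        .filter(track => track.type === type)
        .filter(track => !labelPart || (track.label || '').includes(labelPart))
        .map(track => track.mid);
}

// Made-up sample data standing in for the result of echotest.getLocalTracks()
const sampleTracks = [
    { mid: '0', type: 'audio', label: 'Built-in Microphone' },
    { mid: '1', type: 'video', label: 'Integrated Webcam' },
    { mid: '2', type: 'video', label: 'USB Camera' },
];
const [usbMid] = findLocalMids(sampleTracks, 'video', 'USB');
// A tracks item that would stop sending the USB camera
const removeUsbCamera = [{ type: 'video', mid: usbMid, remove: true }];
```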
Do you find all this exciting? I hope so, because I do need some feedback on whether this really works in practice! It does seem to do the job in the demos I've tested, but of course real usage by other developers is fundamental to understand if the API as it is needs tweaking or fixing.
That said, as I mentioned this is a draft because of a few existing issues and limitations that I have to work around. But before getting to that, a question you may be asking yourself...
## Is the old media object still supported?
Well, in theory yes, in practice it may not always work. I did create a helper function that translates a `media` object to a `tracks` array (which is called automatically if you only pass a `media`), but it doesn't support each and every property we had, so a rarely used property that you need may be missing. Besides, I haven't tested what happens when you renegotiate stuff: as I explained, you don't need to provide a `tracks` again if you don't change anything, so in case you pass a `media` in a renegotiation you may end up with new stuff being captured. I may try and fix things like this, but the idea is to migrate to `tracks` as soon as possible in a new version of Janus (1.1), without worrying too much about the burden of the legacy `media` object. Feedback is welcome of course, especially if you do test what happens with your applications that still rely on `media`.
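To give an idea of what such a translation involves, here is a deliberately simplified sketch. It is not the helper from this PR: the `mediaToTracks` name is made up, only a handful of legacy properties (`audio`, `video`, `data`, and the `Send`/`Recv` variants) are considered, and the defaults are an assumption about how `media` used to behave.

```javascript
// Simplified, hypothetical translation (NOT the actual helper in the PR).
// Assumption: audio and video default to sendrecv unless explicitly disabled.
function mediaToTracks(media = {}) {
    const tracks = [];
    for (const type of ['audio', 'video']) {
        if (media[type] === false)
            continue;                            // medium explicitly disabled
        const send = media[type + 'Send'] !== false;
        const recv = media[type + 'Recv'] !== false;
        if (!send && !recv)
            continue;
        const track = { type: type, recv: recv };
        if (send) {
            // Reuse the legacy value (true, a preset like 'hires', constraints)
            track.capture = (media[type] === undefined) ? true : media[type];
        }
        tracks.push(track);
    }
    if (media.data)
        tracks.push({ type: 'data' });
    return tracks;
}

const translated = mediaToTracks({ audio: true, video: 'hires', data: true });
```

A sketch like this also shows why renegotiations are tricky: translating the same `media` object again would ask for the same devices to be captured again.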
Now, about things that still don't work or need fixing...
## What's the catch?
There are a few limitations in the way this new approach is implemented right now. The plan is to sort them out before making this a mergeable PR, and of course feedback and suggestions would be more than welcome here, as for a couple of things I'm struggling to understand the best way to address them (if they should be addressed at all, that is).
1. ~~One of the first things you'll notice is that, when you ask for both audio and video, you get different permission dialogs, and not a single one for both. This is because, since we pass an array, we traverse tracks in order, and capture them in order accordingly. I still haven't decided whether this should be changed or not, mostly because I don't have a clear idea of how to take care of this in practice. Besides the technical aspect of how to process two tracks at the same time in the existing code, there's the problem of how to do that semantically: if a `tracks` array contains two audio tracks and a video one, which ones do you ask together? Even if we assume a way is added for the developer to specify what to ask together, how do you address them within an array that doesn't have unique IDs?~~ Fixed with the (possibly implicit) concept of grouping (see comments below).
2. ~~Another thing you'll notice is that error management is awful at the moment, or sometimes entirely missing. This is because to process tracks in sequence I'm making use of some `async` functions internally and `await` directives on a few WebRTC APIs, and so it's problematic to intercept errors and report them the way we did before. Cleanup is also more problematic in this context. Anyway, this will need to be done properly, so it will hopefully be addressed soon: I was more interested in getting something working first, so I didn't care too much about that for the time being.~~ This should be fixed now.
3. ~~While we support providing tracks externally (as external streams worked before), there still isn't code to ensure external tracks are not closed when we get rid of them, e.g., via a `remove: true`. The idea is not to force `janus.js` to keep track at all, but envisage an additional property you can specify (e.g., `dontCloseTrack: true`) for when you don't want the track to be closed for whatever reason. Considering the developer knows when they provided a track they captured themselves, it should be trivial to just add the property in that case. That said, this isn't in the PR yet.~~ This should be addressed now: applications can add a `dontStop: true` property to a track when providing a `capture`, and if so `janus.js` will not stop the track itself when the time comes to get rid of it.
4. One demo I didn't test at all was the SIP one, and most importantly if/how the new approach works when updating sessions, e.g., dynamically adding/removing videos and stuff like that. The existing SIP demo doesn't provide easy ways to do that dynamically, so if you have SIP applications built on top of `janus.js` it would be helpful to see if that works as expected.
This should be all, feedback welcome!