
Make ToxAV implementation pluggable #1369

Open
@strfry


Motivation

I'm currently doing some experiments with making ToxAV compatible with the standard RTP/WebRTC stack.
My impression is that Tox initially followed this path (encapsulating RTP in Tox), as evidenced by the naming and header fields of RTPMessage, but then diverged and invented custom solutions for standard problems (for example, inventing a "Large Frame" protocol with the data_length and offset fields instead of using RFC 7741 VP8 payloading).
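
For comparison, here is a minimal sketch of what a receiver has to parse under RFC 7741: a one-octet mandatory descriptor (X, N, S and PID bits) followed by optional extension octets. The struct and function names are hypothetical and not part of toxcore; only the bit layout is taken from the RFC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for an RFC 7741 VP8 payload descriptor. */
typedef struct {
    int start_of_partition; /* S bit */
    int non_reference;      /* N bit */
    unsigned partition_id;  /* PID, 0..7 */
    size_t header_len;      /* octets to skip before the VP8 payload */
} vp8_descriptor;

/* Returns 0 on success, -1 if the buffer is too short for the descriptor. */
static int vp8_parse_descriptor(const uint8_t *buf, size_t len, vp8_descriptor *out)
{
    if (len < 1) {
        return -1;
    }

    size_t pos = 1;
    const uint8_t first = buf[0];
    out->non_reference = (first & 0x20) != 0;
    out->start_of_partition = (first & 0x10) != 0;
    out->partition_id = first & 0x07;

    if (first & 0x80) { /* X: extension octet present */
        if (len < pos + 1) {
            return -1;
        }
        const uint8_t ext = buf[pos++];

        if (ext & 0x80) { /* I: PictureID present */
            if (len < pos + 1) {
                return -1;
            }
            /* M bit selects a 15-bit (two octets) or 7-bit (one octet) PictureID */
            pos += (buf[pos] & 0x80) ? 2 : 1;
        }
        if (ext & 0x40) { /* L: TL0PICIDX octet */
            pos += 1;
        }
        if (ext & 0x30) { /* T or K: shared TID/Y/KEYIDX octet */
            pos += 1;
        }
        if (pos > len) {
            return -1;
        }
    }

    out->header_len = pos;
    return 0;
}
```

A packet with S set and PID equal to 0 marks the start of a new VP8 frame, which is the information the legacy offset field encodes implicitly (offset 0).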

The "batteries-included" approach of ToxAV, meaning that library users only deal with uncompressed buffers, makes it easy to build an AV-enabled Tox client and shields the application programmer from the nitty-gritty details of video streaming, but it also makes some features impossible, such as hardware-accelerated video decoding with zero-copy display.

As a result, Tox now maintains an inadequate, incompatible implementation of an AV streaming stack, an entirely out-of-scope endeavour for a project this size.
IMHO, Tox should redefine its role as being a distributed, secure transport layer for a standards-compliant RTP implementation.
In the future, TokTok might still provide a basic AV implementation, but focus on application-layer compatibility, and avoid sinking developer time in developing isolated solutions for problems that have already been solved by a multitude of other projects.

Besides the benefits of using existing implementations, an RTP-compliant ToxAV would enable many other applications, like bridging to WebRTC peers, video-on-demand delivery and more.

Migration Path

Since legacy compatibility is an important requirement for Tox, replacing the protocol overnight isn't an option. A migration path could look like this:

  1. Carve out the specification of the current/"legacy" ToxAV protocol
  2. Write adapter code to convert ToxAV->RTP / RTP->ToxAV
  3. Use those converters within Toxcore to convert packets between legacy peers and the new RTP-based implementation

The feasibility of 2) is currently being researched/demonstrated within github.com/strfry/gotox
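
As a rough illustration of the ToxAV->RTP direction in step 2, the sketch below reassembles a legacy "Large Frame" from its fragments, so that the complete frame could then be handed to a standard RTP packetizer. The `legacy_fragment` and `frame_assembler` types are hypothetical and model only the data_length and offset fields mentioned above; they are not the actual toxcore header layout, and the sketch assumes fragments are neither duplicated nor overlapping.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical view of one legacy fragment (not the real RTPHeader layout). */
typedef struct {
    uint32_t data_length;   /* total size of the complete frame */
    uint32_t offset;        /* position of this fragment within the frame */
    const uint8_t *payload; /* fragment bytes */
    uint32_t payload_len;   /* number of fragment bytes */
} legacy_fragment;

typedef struct {
    uint8_t *data;          /* reassembly buffer of data_length bytes */
    uint32_t data_length;
    uint32_t received;      /* bytes copied so far (assumes no duplicates) */
} frame_assembler;

/* Returns 0 on success, -1 on allocation failure. */
static int assembler_init(frame_assembler *a, uint32_t data_length)
{
    a->data = malloc(data_length ? data_length : 1);
    a->data_length = data_length;
    a->received = 0;
    return a->data ? 0 : -1;
}

static void assembler_free(frame_assembler *a)
{
    free(a->data);
    a->data = NULL;
}

/* Returns 1 when the frame is complete, 0 when more fragments are needed,
 * and -1 when the fragment does not fit the frame being assembled. */
static int assembler_push(frame_assembler *a, const legacy_fragment *f)
{
    if (f->data_length != a->data_length) {
        return -1;
    }
    if (f->offset > a->data_length || f->payload_len > a->data_length - f->offset) {
        return -1;
    }
    memcpy(a->data + f->offset, f->payload, f->payload_len);
    a->received += f->payload_len;
    return a->received >= a->data_length;
}
```

Because fragments travel over a lossy channel, a real converter would also need a timeout or sequence-number check to discard frames that never complete.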

API Changes

Signalling

Signalling commands (Call, Answer, Hangup, etc.) are currently sent on a special comm channel, and internally handled by msi.c, where a callback-style interface is provided.
This seemed like a good spot to hook into, since it's exactly what toxav_new does, which I intend to replace.
In my prototype, I just expose this internal API by means of the FFI ( TokTok/go-toxcore-c@master...strfry:feature/msi ), so no direct action is needed, but I think it is debatable whether the API described in msi.h would be a good cutting point for a public API.

Packet ID filtering

The lossy_packet interface basically enables us to send and receive AV packets.
The only problem is that it explicitly checks for AV-related packet IDs, to multiplex between ToxAV and userspace.
This affects just a few lines that need to handle this condition in a different way: master...strfry:feature/pluggable_rtp
Of course, statically disabling the ToxAV codepath isn't an option because it would break existing AV clients. Another option would be to disable these checks when tox is built without AV support, but this isn't an ideal solution either.

Maybe this could be made dependent on whether a ToxAV object has previously been allocated?
I'm not sure whether there would be unexpected side effects for existing ToxAV clients, which might get confused by video packets coming in through the custom_lossy_packet interface.
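
A minimal sketch of that idea follows: AV packet IDs go to ToxAV only while a ToxAV object exists, and otherwise fall through to the user's lossy-packet callback. The numeric ranges are assumptions about toxcore's lossy packet ID layout and should be checked against Messenger.h; the enum and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed lossy packet ID ranges; verify against Messenger.h before use. */
enum {
    LOSSY_AV_START = 192,
    LOSSY_AV_END = 199,
    LOSSY_CUSTOM_START = 200,
    LOSSY_CUSTOM_END = 254
};

/* Hypothetical routing decision for an incoming lossy packet. */
typedef enum {
    ROUTE_DROP,  /* not a lossy packet ID */
    ROUTE_TOXAV, /* handled by the built-in AV stack */
    ROUTE_USER   /* delivered through the custom lossy packet callback */
} lossy_route;

static lossy_route route_lossy_packet(uint8_t packet_id, bool toxav_allocated)
{
    if (packet_id >= LOSSY_CUSTOM_START && packet_id <= LOSSY_CUSTOM_END) {
        return ROUTE_USER;
    }
    if (packet_id >= LOSSY_AV_START && packet_id <= LOSSY_AV_END) {
        /* Only claim AV packets while a ToxAV instance exists. */
        return toxav_allocated ? ROUTE_TOXAV : ROUTE_USER;
    }
    return ROUTE_DROP;
}
```

With this rule, existing AV clients keep their current behaviour because they always allocate a ToxAV object, while clients that bring their own RTP stack see the AV packets in userspace.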

Open Issues

The (Web)RTC stack uses the Session Description Protocol (SDP, RFC 4566) to negotiate various details, such as codecs, protocol extensions and ICE connection candidates.
SDP is a common point of criticism of the WebRTC spec, and was mostly accepted for easier compatibility with existing SIP networks.
Its complexity, and its integral support for features that are not necessary in the context of ToxAV, make it unfavorable for inclusion in Tox.
It is yet to be researched how the necessary options can be mapped to Tox capabilities, and how much can be skipped through higher base specifications within Tox. For example, we can assume support for the Opus audio codec, and don't need to negotiate about things like "PCMU/8000".
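
To illustrate how a fixed base profile could replace negotiation, the sketch below accepts an SDP `a=rtpmap` line only if it describes Opus at 48000 Hz with 2 channels (the parameters RFC 7587 requires for Opus) and rejects everything else, such as "PCMU/8000". The function names are hypothetical, and a bridge to WebRTC would need considerably more than this single check.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

/* Case-insensitive comparison against "opus" (SDP encoding names are
 * case-insensitive). */
static bool name_is_opus(const char *s)
{
    const char *ref = "opus";
    while (*s != '\0' && *ref != '\0') {
        if (tolower((unsigned char)*s) != *ref) {
            return false;
        }
        ++s;
        ++ref;
    }
    return *s == '\0' && *ref == '\0';
}

/* Returns true only for "a=rtpmap:<pt> opus/48000/2". */
static bool rtpmap_matches_fixed_profile(const char *line)
{
    unsigned pt, clock_rate, channels;
    char name[32];

    /* All four fields must be present; "PCMU/8000" has no channel count. */
    if (sscanf(line, "a=rtpmap:%u %31[^/]/%u/%u", &pt, name, &clock_rate, &channels) != 4) {
        return false;
    }
    return pt <= 127 && name_is_opus(name) && clock_rate == 48000 && channels == 2;
}
```

A gateway could use such a check to refuse offers it cannot map onto Tox capabilities instead of implementing full SDP offer/answer.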

Another topic of research is mapping the various RTCP commands to existing comm channel commands for bandwidth regulation.

Labels: P2 (Medium priority), enhancement (New feature for the user, not a new feature for build script), toxav (Audio/video)
