Forgive the Pied Piper Reference. The purpose of this issue is to initiate a design discussion around making MOQT efficient for many-to-one distribution.
Much of the preceding design of MOQT assumes either few-to-few (e.g. video conferencing) or one-to-many (live sports) distribution. We can leverage caching to enable scalable egress, combined with a fan-out architecture for non-cacheable real-time content.
But what about the use case of "fan-in": millions of sensors providing real-time telemetry to a single monitoring point? In this case, caching does no good from a scale perspective, since the content is only ever going to one subscriber. The fan-out architecture can, however, work in reverse, aggregating many upstream publishers onto shared downstream connections.
What features do we need in MOQT to make high-capacity ingest efficient?
- We want a way for a publisher to signal to all relays that a track should not be cached. Currently the MAX_CACHE_DURATION parameter can only be present in a SUBSCRIBE_OK, FETCH_OK, or TRACK_STATUS message; it cannot be sent by a publisher. Setting the DELIVERY_TIMEOUT parameter to zero is not equivalent to do-not-cache.
- Keep-alives for long-running sessions. Imagine a security camera that sends sparse data once every 5 minutes. Should we rely on application-level pings to keep the connection alive, or should we build a keep-alive mechanism into the protocol (with auth protection so untrusted clients cannot keep connections open forever)?
- Can the central collectors use the relays as configurable filters? What if we do want to leverage the cache to "batch" the collection of telemetry? For example, sensors output temperature data every second, but the collector only wants to retrieve the last 5 minutes of data every 5 minutes. We could create a FORWARD_FILTER parameter which the subscriber sends during subscription, specifying a key and a threshold value. For example, I subscribe only to objects for which the "foo" parameter is > 10.
- Much telemetry data is timeline-based. Can we have a standard mode in which a track is identified as a timeline, and make sure that mode works efficiently for our major actions (SUBSCRIBE, FETCH, etc.)? Is setting each group number to a millisecond epoch timestamp sufficiently robust?
- Publisher authentication and authorization: is our current token scheme sufficient for high-capacity ingest?
- End-to-end encryption: we allow the payload to be encrypted, but is this sufficient for sensitive IoT data at city-wide scale?
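To make the keep-alive question above concrete, here is a minimal sketch of the scheduling a sparse publisher would need if keep-alives stay at the application level. The intervals and the idea of a separate ping message are assumptions for illustration; MOQT has no keep-alive message today.

```python
# Sketch: a sparse publisher (camera publishing every 5 minutes) must
# interleave keep-alive pings so the connection's idle timeout never fires.
# KEEPALIVE_INTERVAL_S and PUBLISH_INTERVAL_S are assumed values.
KEEPALIVE_INTERVAL_S = 30    # well under typical transport idle timeouts
PUBLISH_INTERVAL_S = 300     # real data only every 5 minutes

def seconds_until_next_action(last_publish: float, last_ping: float,
                              now: float) -> float:
    """Seconds until the publisher must act again: whichever comes
    first, the next real publish or the next keep-alive ping."""
    next_publish = last_publish + PUBLISH_INTERVAL_S
    next_ping = last_ping + KEEPALIVE_INTERVAL_S
    return min(next_publish, next_ping) - now
```

A protocol-level keep-alive would move this bookkeeping (and the associated auth check) out of every sensor application and into the session layer.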
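The relay-side filtering idea can be sketched as follows. The FORWARD_FILTER parameter, its key/threshold fields, and the object-metadata layout are all assumptions taken from the example in the bullet above; nothing here is in the current MOQT drafts.

```python
# Hypothetical relay-side evaluation of a FORWARD_FILTER parameter:
# forward an object downstream only if its metadata passes the filter.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ForwardFilter:
    key: str          # metadata key to inspect, e.g. "foo"
    threshold: float  # forward only if the value is > threshold

def should_forward(obj_metadata: dict, filt: Optional[ForwardFilter]) -> bool:
    """Return True if the relay should forward this object downstream."""
    if filt is None:
        return True   # no filter installed: forward everything
    value = obj_metadata.get(filt.key)
    if value is None:
        return False  # missing key: drop (one possible policy choice)
    return value > filt.threshold

# Example: the subscriber asked only for objects where "foo" > 10.
filt = ForwardFilter(key="foo", threshold=10)
print(should_forward({"foo": 12}, filt))  # True
print(should_forward({"foo": 7}, filt))   # False
```

One open design question is where the key lives: object extension headers would let relays evaluate the filter without parsing (or decrypting) the payload.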
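For the timeline question, here is a minimal sketch of what "group number = millisecond epoch timestamp" buys a collector: a time range maps directly to a group-number range that a FETCH could request. The function names are illustrative, not MOQT wire messages.

```python
# Sketch: if each group number is a millisecond epoch timestamp, then a
# collector that wakes every 5 minutes can compute the FETCH range for
# "the last 5 minutes" with plain arithmetic, no index lookup needed.
MS_PER_MINUTE = 60 * 1000

def group_for_timestamp(ts_ms: int) -> int:
    # The group number *is* the timestamp under this scheme.
    return ts_ms

def last_n_minutes_range(now_ms: int, minutes: int) -> tuple:
    """Inclusive group-number range covering the last `minutes` minutes."""
    start_ms = now_ms - minutes * MS_PER_MINUTE
    return (group_for_timestamp(start_ms), group_for_timestamp(now_ms))

# A collector waking at now_ms would FETCH groups in this range:
print(last_n_minutes_range(1_000_000, 5))  # (700000, 1000000)
```

The robustness question then becomes one of clock skew across sensors and of sparse tracks, where most group numbers in the range have no object at all.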
What else might be important for high-capacity ingest that we can address now before we fail at scale?
Cheers
Will