-
Notifications
You must be signed in to change notification settings - Fork 293
Gossipsub: Extensions control Message #684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
pubsub/gossipsub/gossipsub-v1.3.md
Outdated
|
||
Authors: [@marcopolo] | ||
|
||
Interest Group: TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment/react here if you'd like to be included in this interest group
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✋🏽
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✋🏽
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
me too please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like to be added too!
I like the concept and the idea, I have some questions though. Thinking out-loud there is a bit of engineering complexity with this. The new control message MUST be sent as the first message and MUST only be sent once. As the protobuf allows these to be sent at any time, there is a bit of annoyance that we have to check its the first message, hasn't been sent a second time and reject all messages that may contain this extra control message. The reason I bring this up, is that in rust-libp2p, we already have all of this machinery for protocol-id. It can already only be sent once and we keep of track of each peer and what they support. Peers can send us a list of protocol-ids. Can't we have the same functionality where a peer supports say:
I guess the obvious downside is that multistream select chooses the "best" protocol-id to match on. So it would limit interactions between extensions, but a workaround would be choosing a different extension number that represents the interaction of multiple extensions, i.e extension 3 could mean support for extension 1 and 2. I only raise this, because I think this mentioned approach requires 0 engineering effort on our behalf and we dont have to add a control message. I'm not against adding this, just wanted to raise this as potentially a simpler possibility. |
Hey Age, thanks for the comments
The constraints should make it easier to implement. It is a bit more annoying to implement if you can expect an extensions message at any point. If the extensions message is sent a second time, you should disconnect as the peer is misbehaved. Making this be the first message and validating no future extension control messages arrive was <10 lines of code in the Go implementation.
Yes we could, but I think that's less ideal. For sake of argument, let's say we can encode extensions into the protocol ID in a way that extensions can be combined and we don't care about the encoding size. (We could for example give each extension a numeric ID, and attach them as a comma separated list after the If we rely purely on multistream semantics (no Identify to learn about a peers protocol list), this means we'd pay a round trip for each extension combination we'd want to support. If there are 3 extensions that can all be used independently, this would mean there could be 1 (all extensions) + 3 (only two) + 3 (only one) + 1 (no extensions)=8 round trips to negotiate the extensions with a single peer. This is pretty bad, so I won't dive further into this. If we use the Identify protocol (or similar) to first exchange a list of protocol IDs and maybe simply encode all supported extensions, instead of combinations. Then peers would know at the first multistream select round which protocol to offer, and the node would be fairly certain the other peer supports the offered extensions. This would work, but I'm not a fan of it for a couple reasons.
To address these issues I could imagine changes to core libp2p that:
This would be fine, but doesn't get us the 0 engineering effort we both would like. I'm not opposed to continue discussing using the protocol ID as maybe I've missed something. But I also want to hear more about what makes this proposal difficult to implement. |
We can put the extensions supported in the hello RPC, that should be backwards compatible (since it is protobuf, older clients can ignore). |
60fbe08
to
767eac2
Compare
I dont think this is difficult to implement. Just thought maybe we could get away with doing nothing by piggybacking the protocol-id. I've added a few features/things to rust-libp2p and I had some comments pushing back on new things saying that the protocol was getting too complex, so I guess now, i'm always looking for simplifications before adding more things. The complexity here is tiny (didn't mean to over-state it), seems like its necessary for what we're trying to do here. |
I think this is good. I still think we should stop adding version numbers in PR titles. It is confusing and versioning should not be done this way. Having said that, this would be good for 1.3, as it unlocks many other improvements. Finally, some nitpicking: |
Generally agree. I initially included it as a signal on how I think the various extensions should be prioritized. I've removed it to separate the versioning discussion.
What is the misunderstanding? I think the text is accurate. This is the same language as what is used in the HTTP/3 SETTINGS RFC (which I liberally copied from). That RFC states:
https://datatracker.ietf.org/doc/html/rfc9114#section-7.2.4-4 |
I see. You are right, it is hard to misunderstand (where the misunderstanding I wanted to avoid is to see this an an explicit ban of negotiating the use of extensions, which is clarified by the "however ... " part). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I think this is a good idea. It reduces frictions to propose new ideas or implement and deploy something not being agreed by all yet.
|
||
## The Extensions Control Message | ||
|
||
If a peer supports any extension, the Extensions control message MUST be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should future-proof this in case we move away from the single-stream (per direction) design. For context: we speculate that more granular streams on top of QUIC can alleviate transport HoL blocking and can enable more fluid cancellation. We've discussed:
- one stream per subscription (+ control stream)
- one stream per message (+ control stream)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need any text change here. We'll always have an initial stream and the extensions MUST be in the first message on that stream.
The best next optimization for multiple streams would be to open a stream per message. I don't think multiple control streams are worth it because it is simpler if control messages are serialized and control messages are small.
Aside, it would be great to use bidirectional streams as bidirectional. There is implementation complexity in correctly treating two separate streams as a single bidirectional stream (the current go implementation has bugs here).
syntax = "proto2"; | ||
|
||
message ControlExtensions { | ||
// Initially empty. Future extensions will be added here along with a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's describe field expectations and provide some examples. For example, some extensions may simply advertise themselves as boolean flags, while others may send preamble data/config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a real example with the TestExtension. We'll also have another one soon with Partial messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those won't be linked from here. What I'm saying is that this spec defines a framework for extensions, and it feels incomplete without an explanation of how extensions would actually represent themselves, other than "here's a schema where they can drop something". Some stereotypical examples would help frame this right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be linked here. As well as having the code points defined here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain your motivation for wanting to inventorize all eventual extensions here? The implicit convention is to have the spec define the mechanism for the extension point, while managing the assignment in a separate resource (registry).
This separation between mechanism and registry is true for: multicodec, multiaddr, and major industry standards like CBOR (tags), TLS (params), MIME (media types), etc. where the base spec is an IETF RFC, extensions are separate RFCs, and their ID assignment is tracked by IANA registries.
I'm happy to be convinced that deviating from convention is appropriate here, but if there isn't a clear case, I'd suggest we:
- have this spec establish the registry (e.g. a local csv file), together with
- the test extension (Gossipsub: test extension #686), and
- the v1 of the registry containing the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong preference here. We can create a separate file to serve as registry. (although it should be a .proto
rather than a .csv
, as it's a more direct mapping.)
My motivation to keep it in one file is to keep things simple. There are tradeoffs in maintaining a separate registry. IANA works well because an RFC is a published document that is more or less complete, and the considerations to adopting a new codepoint need to hashed out elsewhere (the WG might not even still exist). The downside is that there is more overhead to a separate registry (we can mitigate that with a file within this repo, such that changes can be atomic to a PR).
While I don't have a strong preference, I don't see much benefit in having a separate registry file for this use case. Can you elaborate on what you expect the benefits to be?
For some prior art in this repo, we have Noise Extensions that does not maintain a separate registry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on what you expect the benefits to be?
Both approaches can work. But given that most prior art uses a separate registry, I'd lean towards this approach. That way the asset is easily verifiable, hashable, independently versionable, with the guarantee that new versions only modify registry entries and no spec behaviour (which we lose with inlining).
For some prior art in this repo, we have Noise Extensions that does not maintain a separate registry.
This spec confused me. Those fields appear out of thin air without any traceability, explanation, nor link to their respective specs. I think this should be refactored to a detached registry too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed they would benefit from adding a link. The git blame is pretty straightforward though.
As mentioned, I don't have a strong preference here, so I'll create a new file.
need to make Gossipsub extensions follow a strict ordering. Finally, it allows | ||
extensions to iterate independently from Gossipsub's versioning. | ||
|
||
## The Extensions Control Message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not advertise the extensions during the HELLO
exchange?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello is an implementation detail, there's no "HELLO" protobuf message. It is shorthand for sending the topics we are subscribed to. If an implementation wants to include it in their "hello packet", that's fine. But there's no need to reference the hello packet here. The requirement is simply that this MUST be the first message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, that's why I said exchange and not RPC.
This hello dispatch isn't an implementation detail, it is defined here:
specs/pubsub/gossipsub/gossipsub-v1.0.md
Line 183 in e4e6eb7
Whenever a new peer is connected, the gossipsub implementation checks to see if |
FWIW, it's probably hierarchically misplaced there (an errata submission could move it to another clearer position in the doc).
I'm fairly confident that all implementations dispatch this hello, and given it's specified, we could simply refer to it explicitly here for more clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This core idea of sending your subscriptions on a new connection is part of the spec. I agree.
The rust-libp2p code has no explicit "hello packet" notion. It simply sends the topic subscriptions on a new connection. As a datapoint, @jxs didn't know what the hello packet was until I explained it.
I think the language here is clear without needing to reference the "hello packet." What do you think would be gained by referencing the hello packet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense; the current wording works for me. I think there's opportunity to formalize the initial exchange a bit better—but that's outside the scope of this PR.
496cf55
to
e01116b
Compare
This lets us try more extensions and changes without requiring a new Gossipsub version. Hopefully it makes future extensions easier to iterate on, use, and deploy. More details in the doc
There is a lot of prior art here as many modern protocols have something like this, QUIC's transport parameters and H{3,2} SETTINGS frames to name a few.