Skip to content

Gossipsub: Extensions control Message #684

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

MarcoPolo
Copy link
Contributor

This lets us try more extensions and changes without requiring a new Gossipsub version. Hopefully it makes future extensions easier to iterate on, use, and deploy. More details in the doc

There is a lot of prior art here as many modern protocols have something like this, QUIC's transport parameters and H{3,2} SETTINGS frames to name a few.


Authors: [@marcopolo]

Interest Group: TODO
Copy link
Contributor Author

@MarcoPolo MarcoPolo Jun 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment/react here if you'd like to be included in this interest group

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✋🏽

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✋🏽

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

me too please

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to be added too!

@AgeManning
Copy link
Contributor

I like the concept and the idea, I have some questions though.

Thinking out-loud there is a bit of engineering complexity with this. The new control message MUST be sent as the first message and MUST only be sent once. As the protobuf allows these to be sent at any time, there is a bit of annoyance that we have to check its the first message, hasn't been sent a second time and reject all messages that may contain this extra control message.

The reason I bring this up, is that in rust-libp2p, we already have all of this machinery for protocol-id. It can already only be sent once and we keep of track of each peer and what they support. Peers can send us a list of protocol-ids.

Can't we have the same functionality where a peer supports say:

  • /meshsub/1.1.0
  • /meshsub/1.0.0
  • /extension/17
  • /extension/52

I guess the obvious downside is that multistream select chooses the "best" protocol-id to match on. So it would limit interactions between extensions, but a workaround would be choosing a different extension number that represents the interaction of multiple extensions, i.e extension 3 could mean support for extension 1 and 2.

I only raise this, because I think this mentioned approach requires 0 engineering effort on our behalf and we dont have to add a control message.

I'm not against adding this, just wanted to raise this as potentially a simpler possibility.

@MarcoPolo
Copy link
Contributor Author

Hey Age, thanks for the comments

The new control message MUST be sent as the first message and MUST only be sent once. As the protobuf allows these to be sent at any time, there is a bit of annoyance that we have to check its the first message, hasn't been sent a second time and reject all messages that may contain this extra control message.

The constraints should make it easier to implement. It is a bit more annoying to implement if you can expect an extensions message at any point. If the extensions message is sent a second time, you should disconnect as the peer is misbehaved. Making this be the first message and validating no future extension control messages arrive was <10 lines of code in the Go implementation.

Can't we have the same functionality where a peer supports say: ...

Yes we could, but I think that's less ideal.

For sake of argument, let's say we can encode extensions into the protocol ID in a way that extensions can be combined and we don't care about the encoding size. (We could for example give each extension a numeric ID, and attach them as a comma separated list after the /meshsub/1.x/ prefix). We probably would not want to do this because:

If we rely purely on multistream semantics (no Identify to learn about a peers protocol list), this means we'd pay a round trip for each extension combination we'd want to support. If there are 3 extensions that can all be used independently, this would mean there could be 1 (all extensions) + 3 (only two) + 3 (only one) + 1 (no extensions)=8 round trips to negotiate the extensions with a single peer. This is pretty bad, so I won't dive further into this.

If we use the Identify protocol (or similar) to first exchange a list of protocol IDs and maybe simply encode all supported extensions, instead of combinations. Then peers would know at the first multistream select round which protocol to offer, and the node would be fairly certain the other peer supports the offered extensions. This would work, but I'm not a fan of it for a couple reasons.

  1. It uses the protocol ID as a mechanism to signal optional extensions, which is a bit strange. Are there examples of protocol identifiers encoding optional extensions? ALPNs for example do not replace the need for a SETTINGS frame.
  2. Extensions may need to encode more than just a boolean of being supported or not. e.g. a extension may also inform a peer of some limit (see SETTINGS_QPACK_MAX_TABLE_CAPACITY).
  3. It adds a round trip as you have to get the protocol list first. (Note that you don't need to receive a peer's extensions before being able to send them core Gossipsub messages in this proposal)
  4. It requires a separate protocol to learn about a peer's protocol list. We have this with Identify, but it seems weird to make Identify a pre-req for gossipsub extensions.

To address these issues I could imagine changes to core libp2p that:

  1. Defines a SETTINGS-like container that can be used by protocols. (which is essentially what we're trying to achieve by adding to the protocol ID).
  2. Can exchange Protocol IDs and SETTINGS-like container early in the connection (i.e. early data). This avoids the round trip penalty

This would be fine, but doesn't get us the 0 engineering effort we both would like.

I'm not opposed to continue discussing using the protocol ID as maybe I've missed something. But I also want to hear more about what makes this proposal difficult to implement.

@vyzo
Copy link
Contributor

vyzo commented Jun 25, 2025

We can put the extensions supported in the hello RPC, that should be backwards compatible (since it is protobuf, older clients can ignore).

@MarcoPolo MarcoPolo force-pushed the marco/gossipsub-extensions branch from 60fbe08 to 767eac2 Compare June 25, 2025 19:29
@AgeManning
Copy link
Contributor

AgeManning commented Jun 26, 2025

I'm not opposed to continue discussing using the protocol ID as maybe I've missed something. But I also want to hear more about what makes this proposal difficult to implement.

I dont think this is difficult to implement. Just thought maybe we could get away with doing nothing by piggybacking the protocol-id. I've added a few features/things to rust-libp2p and I had some comments pushing back on new things saying that the protocol was getting too complex, so I guess now, i'm always looking for simplifications before adding more things.

The complexity here is tiny (didn't mean to over-state it), seems like its necessary for what we're trying to do here.

@cskiraly
Copy link

cskiraly commented Jun 26, 2025

I think this is good.

I still think we should stop adding version numbers in PR titles. It is confusing and versioning should not be done this way. Having said that, this would be good for 1.3, as it unlocks many other improvements.

Finally, some nitpicking:
Extensions are not negotiated : I would phrase this differently to avoid misunderstanding. E.g. Extension negotiation is out of scope for this specification.

@MarcoPolo MarcoPolo changed the title Gossipsub v1.3: Extensions control Message Gossipsub: Extensions control Message Jun 26, 2025
@MarcoPolo
Copy link
Contributor Author

I still think we should stop adding version numbers in PR titles. It is confusing and versioning should not be done this way. Having said that, this would be good for 1.3, as it unlocks many other improvements.

Generally agree. I initially included it as a signal on how I think the various extensions should be prioritized. I've removed it to separate the versioning discussion.

Finally, some nitpicking: Extensions are not negotiated : I would phrase this differently to avoid misunderstanding. E.g. Extension negotiation is out of scope for this specification.

What is the misunderstanding? I think the text is accurate. This is the same language as what is used in the HTTP/3 SETTINGS RFC (which I liberally copied from). That RFC states:

SETTINGS parameters are not negotiated; they describe characteristics of the sending peer that can be used by the receiving peer. However, a negotiation can be implied by the use of SETTINGS: each peer uses SETTINGS to advertise a set of supported values. The definition of the setting would describe how each peer combines the two sets to conclude which choice will be used. SETTINGS does not provide a mechanism to identify when the choice takes effect.

https://datatracker.ietf.org/doc/html/rfc9114#section-7.2.4-4

@cskiraly
Copy link

What is the misunderstanding? I think the text is accurate. This is the same language as what is used in the HTTP/3 SETTINGS RFC (which I liberally copied from). That RFC states:

SETTINGS parameters are not negotiated; they describe characteristics of the sending peer that can be used by the receiving peer. However, a negotiation can be implied by the use of SETTINGS: each peer uses SETTINGS to advertise a set of supported values. The definition of the setting would describe how each peer combines the two sets to conclude which choice will be used. SETTINGS does not provide a mechanism to identify when the choice takes effect.

https://datatracker.ietf.org/doc/html/rfc9114#section-7.2.4-4

I see. You are right, it is hard to misunderstand (where the misunderstanding I wanted to avoid is to see this an an explicit ban of negotiating the use of extensions, which is clarified by the "however ... " part).

Copy link
Contributor

@ppopth ppopth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I think this is a good idea. It reduces frictions to propose new ideas or implement and deploy something not being agreed by all yet.


## The Extensions Control Message

If a peer supports any extension, the Extensions control message MUST be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should future-proof this in case we move away from the single-stream (per direction) design. For context: we speculate that more granular streams on top of QUIC can alleviate transport HoL blocking and can enable more fluid cancellation. We've discussed:

  • one stream per subscription (+ control stream)
  • one stream per message (+ control stream)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need any text change here. We'll always have an initial stream and the extensions MUST be in the first message on that stream.

The best next optimization for multiple streams would be to open a stream per message. I don't think multiple control streams are worth it because it is simpler if control messages are serialized and control messages are small.

Aside, it would be great to use bidirectional streams as bidirectional. There is implementation complexity in correctly treating two separate streams as a single bidirectional stream (the current go implementation has bugs here).

syntax = "proto2";

message ControlExtensions {
// Initially empty. Future extensions will be added here along with a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's describe field expectations and provide some examples. For example, some extensions may simply advertise themselves as boolean flags, while others may send preamble data/config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a real example with the TestExtension. We'll also have another one soon with Partial messages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those won't be linked from here. What I'm saying is that this spec defines a framework for extensions, and it feels incomplete without an explanation of how extensions would actually represent themselves, other than "here's a schema where they can drop something". Some stereotypical examples would help frame this right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be linked here. As well as having the code points defined here.

Copy link
Member

@raulk raulk Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain your motivation for wanting to inventorize all eventual extensions here? The implicit convention is to have the spec define the mechanism for the extension point, while managing the assignment in a separate resource (registry).

This separation between mechanism and registry is true for: multicodec, multiaddr, and major industry standards like CBOR (tags), TLS (params), MIME (media types), etc. where the base spec is an IETF RFC, extensions are separate RFCs, and their ID assignment is tracked by IANA registries.

I'm happy to be convinced that deviating from convention is appropriate here, but if there isn't a clear case, I'd suggest we:

  1. have this spec establish the registry (e.g. a local csv file), together with
  2. the test extension (Gossipsub: test extension #686), and
  3. the v1 of the registry containing the latter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference here. We can create a separate file to serve as registry. (although it should be a .proto rather than a .csv, as it's a more direct mapping.)

My motivation to keep it in one file is to keep things simple. There are tradeoffs in maintaining a separate registry. IANA works well because an RFC is a published document that is more or less complete, and the considerations to adopting a new codepoint need to hashed out elsewhere (the WG might not even still exist). The downside is that there is more overhead to a separate registry (we can mitigate that with a file within this repo, such that changes can be atomic to a PR).

While I don't have a strong preference, I don't see much benefit in having a separate registry file for this use case. Can you elaborate on what you expect the benefits to be?

For some prior art in this repo, we have Noise Extensions that does not maintain a separate registry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on what you expect the benefits to be?

Both approaches can work. But given that most prior art uses a separate registry, I'd lean towards this approach. That way the asset is easily verifiable, hashable, independently versionable, with the guarantee that new versions only modify registry entries and no spec behaviour (which we lose with inlining).

For some prior art in this repo, we have Noise Extensions that does not maintain a separate registry.

This spec confused me. Those fields appear out of thin air without any traceability, explanation, nor link to their respective specs. I think this should be refactored to a detached registry too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed they would benefit from adding a link. The git blame is pretty straightforward though.

As mentioned, I don't have a strong preference here, so I'll create a new file.

need to make Gossipsub extensions follow a strict ordering. Finally, it allows
extensions to iterate independently from Gossipsub's versioning.

## The Extensions Control Message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not advertise the extensions during the HELLO exchange?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello is an implementation detail, there's no "HELLO" protobuf message. It is shorthand for sending the topics we are subscribed to. If an implementation wants to include it in their "hello packet", that's fine. But there's no need to reference the hello packet here. The requirement is simply that this MUST be the first message.

Copy link
Member

@raulk raulk Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, that's why I said exchange and not RPC.

This hello dispatch isn't an implementation detail, it is defined here:

Whenever a new peer is connected, the gossipsub implementation checks to see if

FWIW, it's probably hierarchically misplaced there (an errata submission could move it to another clearer position in the doc).

I'm fairly confident that all implementations dispatch this hello, and given it's specified, we could simply refer to it explicitly here for more clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This core idea of sending your subscriptions on a new connection is part of the spec. I agree.

The rust-libp2p code has no explicit "hello packet" notion. It simply sends the topic subscriptions on a new connection. As a datapoint, @jxs didn't know what the hello packet was until I explained it.

I think the language here is clear without needing to reference the "hello packet." What do you think would be gained by referencing the hello packet?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense; the current wording works for me. I think there's opportunity to formalize the initial exchange a bit better—but that's outside the scope of this PR.

@MarcoPolo MarcoPolo force-pushed the marco/gossipsub-extensions branch from 496cf55 to e01116b Compare July 9, 2025 23:22
@MarcoPolo MarcoPolo mentioned this pull request Jul 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Triage
Development

Successfully merging this pull request may close these issues.

8 participants