-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Add support for IPv6 scoped addresses (RFC4007) #15263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for IPv6 scoped addresses (RFC4007) #15263
Conversation
Could you please open an issue as a feature request first? We ask this to separate high-level discussion of the feature (in the issue discussion) from details of this specific PR (PR discussion). Ref: https://github.com/crystal-lang/crystal/blob/master/CONTRIBUTING.md#using-the-issue-tracker I'm marking this PR as draft for now. |
ah well, shame on me. I've just created #15264 now to discuss the feature itself. |
12d0b01
to
e9e524e
Compare
In order to translate interface names in such scoped addresses the required `LibC` binding to `if_nametoindex()` has been added. This method obviously only works for interfaces (devices) that are actually present on the system. The binding for the reverse operation `if_indextoname()` has also been added, although its usage is a bit more cumbersome due to LibC::Char* buffer handling. The necessary buffer length has been placed into the constant `LibC::IF_NAMESIZE`, which appears to be `16u8` on unix-like systems and `257u16` on windows. This could potentially be reworked via a preprocessor block at compile-time as indicated by some folks over on discord, I currently do not know how to achieve that though. Scoped identifiers are only valid for link-local (`fe80::`) addresses, e.g. `fe80::1%eth0` References: - https://datatracker.ietf.org/doc/html/rfc4007 Fixes crystal-lang#15264
e9e524e
to
5b253ee
Compare
Alright, here we go. Not sure why, but something is not happy in the workflow runs in my fork.
Apart from that the remaining workflows appear to be happy with it. Let me know if you'd like more comments in the added code or want something changed. |
This should never have made it into the previous commit. If allowed, I could squash/fixup to get rid of this noise but as per the contributing guidelines amending/editing and force-pushing branches with active PRs is undesired.
I fully agree with all suggestions, so this commit implements them. As per the contributing guidelines this is a separate commit and not squashed.
Thanks for the detailed review and constructive feedback/suggestions. Following the contributing guidelines I have done so with separate commits but I'd be happy to rebase+squash/fixup if desired for a "cleaner" history before merging into |
This just implicitly assumes that `eth0` in the example got enumerated with interface index `2` by the kernel. Should be sufficient for an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't worry about squash/rebase. We do that when merging.
Commit history in the PR is best left as is.
Instead of using `#partition('%')` on the input string, which results in additional `String` allocations, we can stop consuming the slice within `#parse_v6_fields?` when we encounter the zone seprator character (`'%'`) and pass that index back to the constructor. However, this change to `#parse_v6_fields?` is a bit more intrusive and changes it return type from a simple `UInt16[8]?` to a `Tuple(UInt16[8]?, Int32?)` to account for the zone separator index. All references to that method and its specs were updated accordingly to adopt this change. Additionally the exception conditions for invalid v6 addresses have been simplified a bit and all of the converted to proper `Socket::Error` exceptions.
This is proposed as an alternate solution for the previous implementation. Returning the index of the zone separator (`'%'`) back to the constructor is a bit cumbersome for subsequent usage. Return the remaining subslice instead makes it much easier to use in the constructor for subsequent parsing and within the exception messages. I definitely prefer this approach over the previous one and consider it an improvement.
To avoid a breaking change only modify the private overload. Additionally remove unneeded counter variable for tracking the index of the zone separator as its value can be calculated when needed.
As per the discussion there is reasoning behind keeping stdlib integer types a simple `Int32` whenever possible.
Thanks for the explanation and dilligence in the reviews, that goes to everyone who has taken part here, much appreciated. I've pushed two more commits, one to switch the macro to use |
I'm not happy about the
IMO input and output types should always be identical, so |
That is definitely a good catch. I've adjusted |
Windows CI is failing, because Win32 presumably does not set
|
Unfortunately I can't seem to find any useful
|
Are we good to merge now? (I'd really like this to be part of the 1.17.0 release 🙂 ) |
# Scoped/Zoned IPv6 link-local addresses are supported per RFC4007, e.g. | ||
# `fe80::abcd%eth0` but will always use their numerical interface index | ||
# in the `#inspect` representation. The interface name can be retrieved later | ||
# using `#link_local_interface` on the `IPAddress` object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the post-merge comment, this is a fantastic PR I will get a bunch of use out of.
However, isn't it actually #to_s
which i more important to document the representation of? #to_s
is expected to be used in application code, whereas #inspect
is for debugging, and happens to use #to_s
here.
Additionally, would it make sense for #inspect
to show the interface by name, since performance is a non-issue? With fallback to zone number if #inspect
fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improving documentation sounds good for a follow-up 👌
I'm not sure about name vs. id. #to_s
is supposed to be a user-friendly format, so that would be an argument to show the name instead of the id there...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modifying the #to_s
output would be a breaking change. There are likely lots of things/applications/libs that occasionally convert an IPv4/IPv6 address to a string and handing them off to whatever. Each of these implementations would then potentially need adjusting if there was suddenly a %<zone_id>
suffix as part of the string.
I do think a separate #to_s_rfc4007
? method might be useful to get that exact format. With an optional boolean to chose between numerical zone_id
and link_local_interface
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think a separate
#to_s_rfc4007
? method might be useful to get that exact format. With an optional boolean to chose between numericalzone_id
andlink_local_interface
.
Sounds like a great idea. I'd suggest using a dedicated named argument in to_s
instead though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong, but the current state of this PR looks like to_s
is modified as well as inspect.
I'd prefer for inspect to by default include the zone, by default with name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this PR adds the zone ID to the output of both, #to_s
and #inspect
(the latter delegates to the former).
We missed adding a spec for that to make it explicit.
Considering https://github.com/crystal-lang/crystal/pull/15263/files#r2103365809 this might have been an unintended change and we should revert that.
I'd prefer for inspect to always include the zone.
Agreed. The information should be there. But should it be ID or name?
Maybe ID because it's sufficient and more succinct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After taking another look, the #to_s
always returned the address accompanied by the port. If one wanted to obtain just the address, there is an #address
method, which hasn't been modified by this PR, i.e. it doesn't include any %<zone_id>
, neither numerical nor string.
It is indeed true, that we've now implicitly changed the #to_s
behavior with a potentially breaking formatting change. This was definitely an oversight as there is also no spec for it.
I can follow up with another PR for adding an optional bool to #to_s
that controls whether or not to include the zone_id
.
Should we default to always include it or omit it in #to_s
to keep previous behavior and only always include it in #inspect
?
Furthermore, when including it, do we use numerical ID or interface name?
I'd be fine attempting #link_local_interface
and fallback to using numerical @zone_id
if it failed. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing worth noting:
Since zone_id
can only ever be set (i.e. .positive?
) on link-local
ipv6 addresses, the format change doesn't really affect anything.
If at all, it actually might fix certain implementations since link-local
ipv6 addresses might not even work properly without the zone_id
.
But I still think we can make improvements in both the comments/docs, default behavior and obviously also add some specs for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the effect is only for link-local addresses, I think including the zone ID by default makes sense for both to_s
and inspect
. However, looking up the interface name has a performance impact so I believe it should only be on by default for inspect
.
In order to translate interface names in such scoped addresses the
required
LibC
binding toif_nametoindex()
has been added.This method obviously only works for interfaces (devices) that are
actually present on the system.
The binding for the reverse operation
if_indextoname()
has also beenadded, although its usage is a bit more cumbersome due to
LibC::Char*
buffer handling. The necessary buffer length has been placed into the
constant
LibC::IF_NAMESIZE
, which appears to be16u8
on unix-likesystems and
257u16
on windows.This could potentially be reworked via a preprocessor block at
compile-time as indicated by some folks over on discord, I currently do
not know how to achieve that though.
Scoped identifiers are only valid for link-local (
fe80::
) addresses, e.g.fe80::1%eth0
References:
Fixes #15264 .