-
-
Notifications
You must be signed in to change notification settings - Fork 105
Dependency cli parsing #673
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
I think I'd still go the other way around: from a CLI arg that directly targets a resolver and have it parse the URL into a dependency, and fallback to iterate each resolver to parse the URL for known hosts when no CLI arg is specified, and fail asking for an explicit resolver when we can't know what the remote is. |
I would go the other way: encode the full information in a single URI. The implementation of different strategies is quite simple: We parse the value as a URI and route it to a resolver-specific parser based on the scheme (e.g. If the URI scheme does not indicate a specific resolver (e.g. |
For GitHub, from the clone menu we should support
I want to support the most common format the user can copy paste directly. And I want to preserve as much as possible user intent. That's why I don't want the I saw mentions of Thanks for the feedback. |
Oh and should I move parsing and rendering to a DependencyDefinition type to avoid having these changes in dependency but not used in all places? |
We could probably turn the named tuple return type from |
20d0567
to
eafca30
Compare
I think I incorporated all of the feedback. |
expect_parses("github:Foo/[email protected]", "github", "Foo/Bar", VersionReq.new("~> 1.2.3")) | ||
|
||
# GitHub urls | ||
expect_parses("https://github.com/foo/bar", "github", "foo/bar", Any) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: I'm wondering about why this gets normalized to github
resolver instead of "git", "https://github.com/foo/bar"
.
Both options are valid. So we only need to decide which one we want to pick.
I suppose the reason for github
is that it's more concise? That's nice but not strictly necessary.
git
would be closer to the original input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me closer to the intent is to have github: foo/bar
, because I'm assuming the user is copy-pasting the url in the browser. We do preserve the format with trailing .git as those are copied from the clone popup.
At some point maybe it's worth configuring shards to map all github source to be resolved in some specific way, but is a separate story.
expect_parses("..\\relative\\windows", "path", "../relative/windows", Any) | ||
{% end %} | ||
# Path file schema | ||
expect_parses("file://#{local_relative}", "path", local_relative, Any) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: This is technically incorrect. The URL file://a/releative/path
means path /relative/path
on host a
. The //
indicates an authority.
The correct URL for a local path would be file:a/relative/path
.
I believe some applications implicitly interpret file://a/b/c
as file:a/b/c
because this is a common mistake.
The file scheme is not intended for remote access. We can consider doing that, but I don't think it's a good idea as it would solidify invalid behaviour.
Instead, the best way to handle this is probably to error if the host of a file URI is anything else but empty or localhost
.
Following up on #673 (comment), I propose an alternative implementation, fully based on URI parsing in #678 IMO this makes it much easier to handle things. The different routes are clearly decided by the URI scheme. All parsing is safe and sound using stdlib's URI class. |
For reference, VCS dependencies in PIP: https://pip.pypa.io/en/stable/topics/vcs-support/ |
The alternative implementation seems good. We can close this PR then in favor of the other. Anything else that is missing?, because it's marked as draft. |
#678 is a patch against the branch of this PR. We should merge that in and then continue here. |
Commit pushed into this branch |
The windows test is failing because for relative URIs we expect |
Co-authored-by: Johannes Müller <[email protected]>
src/dependency_definition.cr
Outdated
source = uri.to_s | ||
# narrow down requirement | ||
requirement = Any | ||
if source.includes?("@") | ||
source, version = source.split("@") | ||
requirement = VersionReq.new("~> #{version}") | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Splitting on "@"
over the entire URL is error prone. It would also match the user info portion of a URI.
git:user@host:repo
parses into source user
with version requirement ~> host:repo
.
This mechanism must be restricted to the path of the URI.
Even then, there's a chance for misinterpretation because @
is a completely valid character in a URI path. Might not be likely to encounter shards at such a path, but I'm wondering if it's worth it. IMO the @
syntax is not essential.
We could consider a different format for expressing version constraints concisely in the URL.
Nix flake uses query parameters to encode version information (e.g. git+https://example.com/foo/bar?branch=master
)
https://nix.dev/manual/nix/2.28/command-ref/new-cli/nix3-flake#examples
I think this is not entirely elegant because query parameters have a different purpose. They're supposed to be resolved by the remote party.
The URI fragment is more elegant for this. It represents resource-internal information resolved by the local party.
Our URIs point to shard repositories, that's the resource. And navigating to a specific portion of a resource is the purpose of a fragment.
So URLs would look like this: github:Foo/Bar#1.2.3
, git+https://example.com/foo/bar#1.2.3
.
We can further enhance this by adding support for parameters to the fragment, thus being able to resolve fragments like #branch=master
, or #tag=1.2.3-rc1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that # force the need of quoting in the shell?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. You can use URLs with fragments perfectly fine in shell commands.
The shell considers #
as a comment only if it's preceded by whitespace.
$ echo foo#bar
foo#bar
$ echo foo #bar
foo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using fragments for version numbers with the same semantics currently have @ seems good to me.
Tag and branch (and revision/commit) can come later imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may work, but other package managers (cargo, bundler, ...) use @version
, so using something else like #version
is confusing, and #
has a special meaning in shells (comment), which only increases the confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like more @
than #
but I don't feel strongly. The only thing that makes me doubt is that @version
could be expected to be = version
instead of ~> version
, because the at read.
#
is more tidy for treating things as url. but then github:repo/owner
is not exactly a url. I find github:repo/owner@version
more intuitive than github:repo/owner#version
but we need to pick one symbol.
It seems that #
is a cleaner implementation than @
because we can rely on URI parsing, but there is no ambiguity, we can split @
in the last path component. I don't think we need to choose based on the implementation but rather by the UX, but again there is not much difference I think in that here.
Any final calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we need to fix this now. We could leave version restrictions out of this first step and continue the discussion on that separately, while moving forward with producing an MVP. Having support for restrictions is useful, but not essential for a working prototype.
The syntax for version restrictions is closely connected to information for other details such as commit, branch and tag which we had already agreed to defer. Let's talk about all this together and in full.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we do. I rather support github:owner/repo with version rather than all the other combinations we've been discussing. It seems that it will be more used IMO.
python pip uses @
for versioning also, and fragment for some additional thing.
I would stick with @
as the simplest syntax and if we need branch or other we can add #
as an alternative later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not happy about jumping to a syntax for one use case without proper consideration of other related applications. Even if it's for the most common use case.
There are many important details to consider.
For example, the format used in pip and others references a concrete version: [email protected]
means package foo
at exactly version 1.2.3. In this proposal, however, [email protected]
implicitly means package foo
with a restriction of ~>1.2.3
. Adopting an existing syntax but applying different semantics can be confusing to users.
I'd like us to be sure that's what we want, and we see no reasonable, better alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact shards encourage semver, calver seems good enough for me to make @ or whatever syntax that mentions just a version to mean ~>.
I would leave out for now any other version operator.
I don't mean to jump. I mean to discuss and decide. I don't want to not support ~> version restriction out in this first cut.
Made some changes to use |
Related to #672, this PR adds the logic to parse dependencies from the cli. Still not used, but worth discussing syntax and implementation.
The
Dependency
class unfortunately is not fit to generate theshard.yaml
. Resolver key and source are normalized. I created a separate type.The added spec shows the various syntax supported.
To create a proper dependency, with the shard name, we need to actually fetch the shard to resolve that. This is done in
DependencyDefinition.from_cli
while theDependencyDefinition.parts_from_cli
is just the parsing without side effects.In general
{resolver}:{source}
syntax is supported, but there are some shorthands for convenience.git@...
github:foo/bar
github:Foo/Bar#1.2.3
(which will mean~> 1.2.3
)https://github.com/foo/bar
and other known hosts like gitlab, bitbucket, and codeberghttps://github.com/foo/bar#1.2.3
and other known hosts like gitlab, bitbucket, and codeberghttps://github.com/Foo/Bar/commit/000000
https://github.com/Foo/Bar/tree/v1.2.3
(which is interpreted as a tag since it's prefixed withv
)https://github.com/Foo/Bar/tree/some/branch
Some of these are maybe controversial, feel free to discuss.
If you want to play around you can use the following code.
which will output