Commit 3c48165

committed
WIP: Cargo support
First iteration. Signed-off-by: Alexey Ovchinnikov <[email protected]>
1 parent ce4d758 commit 3c48165

File tree

1 file changed

+79
-52
lines changed

docs/designs/cargo-support.md

Lines changed: 79 additions & 52 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,12 @@
1-
# Cargo overview
1+
# Adding Cargo support to Cachi2
22

3-
## Main files
3+
## Background
4+
5+
[Cargo] is the package manager of choice for the [Rust] programming language.
6+
It handles building Rust projects as well as retrieving and building their
7+
dependencies. Cargo could be further extended with plugins.
8+
9+
A typical Cargo-managed project has the following structure:
410

511
```
612
├── .cargo
@@ -11,9 +17,9 @@
1117
└── main.rs (or lib.rs)
1218
```
1319

14-
- Cargo.toml: dependency listing and project configuration.
15-
- Cargo.lock: lockfile that contains the latest resolved dependencies.
16-
- .cargo/config.toml: package manager specific configuration.
20+
Here, Cargo.toml contains the dependency listing and project configuration,
21+
Cargo.lock is a lockfile that contains the latest resolved dependencies,
22+
and .cargo/config.toml holds package-manager-specific configuration.
1723

1824
### Glossary
1925

@@ -23,8 +29,11 @@ file.
2329

2430
## [Specifying dependencies](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html)
2531

26-
The examples below show what types of dependencies Cargo supports, and how they can be specified in the
27-
`Cargo.toml` file.
32+
Cargo supports several types of dependencies: crates distributed through registries,
33+
Git repositories, and filesystem paths.
34+
35+
The examples below show the different types of dependencies Cargo supports, and
36+
how they can be specified in the `Cargo.toml` file.
2837

2938
<details>
3039
<summary>default registry (crates.io)</summary>
@@ -80,10 +89,10 @@ The examples below show what types of dependencies Cargo supports, and how they
8089

8190
<details>
8291
<summary>platform specific</summary>
92+
Note: in cargo docs, "platform" refers interchangeably to both architecture and OS.
93+
Cargo supports specifying dependencies for a certain platform with Rust-like `#[cfg]`
94+
syntax:
8395

84-
TODO
85-
- note: in cargo docs, "platform" refers interchangeably to both architecture and OS
86-
- cargo has support for specifying dependencies under a certain platform, like
8796
```
8897
[target.'cfg(windows)'.dependencies]
8998
winhttp = "0.4.0"
@@ -98,10 +107,11 @@ The examples below show what types of dependencies Cargo supports, and how they
98107

99108
[target.i686-unknown-linux-gnu.dependencies]
100109
openssl = "1.0.1"
101-
``
102-
- Regardless, as far as we could tell from experimenting, cargo build requires ALL dependencies to be present - even if they won't be used.
103-
- as a potential optimization, [cargo-vendor-filterer](https://github.com/coreos/cargo-vendor-filterer/) can vendor cargo dependencies
104-
- if we adopt this approach, it might be limited to pure-rust builds
110+
```
111+
112+
`cargo build` apparently requires all dependencies to be present, even if they won't be used.
113+
This was determined experimentally and is in line with the behavior of other package
114+
managers (see, for example, the related section in the [Bundler] documentation).
105115
</details>
106116

107117
<details>
@@ -119,12 +129,6 @@ The examples below show what types of dependencies Cargo supports, and how they
119129
```
120130
</details>
121131

122-
<details>
123-
<summary>platform specific</summary>
124-
125-
TODO
126-
</details>
127-
128132
<details>
129133
<summary>alternative registry</summary>
130134

@@ -137,10 +141,13 @@ The examples below show what types of dependencies Cargo supports, and how they
137141
```
138142
</details>
139143

144+
All the dependency types mentioned above are supported by Cargo out of the
145+
box, with either no or minimal additional setup.
146+
140147
### Cargo.lock
141148

142-
The `Cargo.lock` file follows the toml format. Here are some examples of how dependencies are
143-
represented in it.
149+
The `Cargo.lock` file follows the TOML format. Below are some examples of how
150+
dependencies are represented in it.
144151

145152
<details>
146153
<summary>main or local package</summary>
@@ -248,22 +255,31 @@ file or via the `cargo metadata` command.
248255
```
249256
</details>
250257

251-
## Features
258+
## [Features](https://doc.rust-lang.org/cargo/reference/features.html)
252259

253-
*TODO*
260+
Features allow conditional compilation of projects. From Cachi2's perspective, the
261+
most important aspect of features is optional dependencies. An optional dependency
262+
is one that will not be processed unless explicitly requested.
263+
The safest way to deal with optional dependencies in the context of hermetic builds
264+
would be to use the `--all-features` flag with cargo commands when prefetching dependencies.
254265

255266
## [Build Scripts](https://doc.rust-lang.org/cargo/reference/build-scripts.html)
256267

257-
Any package that contains a `build.rs` file in it's root will have it executed during build-time.
258-
Note that this does not happen in any other stage, such as during vendoring or dependency fetching.
268+
Any package that contains a `build.rs` file in its root will have it executed
269+
during build-time. Note that this does not happen in any other stage, such as
270+
during vendoring or dependency fetching. The build script can contain
271+
arbitrary code, and not running it could result in a failed build. Moreover, a
272+
[plugin](https://embarkstudios.github.io/cargo-deny/) is necessary to skip
273+
build scripts.
259274

260275
## [Vendoring](https://doc.rust-lang.org/cargo/commands/cargo-vendor.html)
261276

262-
Cargo offers the option to vendor the dependencies by using `cargo vendor`. All dependencies
263-
(including git dependencies) are downloaded to the `./vendor` folder by default.
277+
Cargo offers the option to vendor the dependencies by using `cargo vendor`. All
278+
dependencies (including git dependencies) are downloaded to the `./vendor`
279+
folder by default.
264280

265-
The command also prints the required configuration that needs to be added to `.cargo/config.toml`
266-
in order for the offline compilation to work. Here's an example:
281+
The command also prints the required configuration that needs to be added to
282+
`.cargo/config.toml` in order for the offline compilation to work:
267283

268284
```toml
269285
[source.crates-io]
@@ -284,7 +300,7 @@ Also, vendoring does not trigger any builds scripts.
284300

285301
# Cargo support in Cachi2
286302

287-
## Approach 1: use cargo commands
303+
## Approach 1 (preferred): use cargo commands
288304

289305
### Identifying the dependencies
290306

@@ -453,7 +469,7 @@ This way, we can identify path and git dependencies, as well as the main package
453469
fetched from non-default registries.
454470

455471
Dev and build dependencies have respective `kind`s when listed in the nested `.dependencies` key.
456-
To indentify them and mark them as such in the SBOM, we'd need only to check all the times a single
472+
To identify them and mark them as such in the SBOM, we'd need only to check all the times a single
457473
package appears as a transitive dependency in this output.
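
The check described above could look roughly like the following sketch. It assumes the `cargo metadata` JSON has already been loaded into a dictionary, and it relies on each entry under `packages[].dependencies` carrying a `kind` that is `null` for normal dependencies and `"dev"` or `"build"` otherwise. The function name and the trimmed sample data are hypothetical.

```python
import json
from collections import defaultdict


def dependency_kinds(metadata: dict) -> dict[str, set[str]]:
    """Collect every kind under which each dependency name is requested."""
    kinds: dict[str, set[str]] = defaultdict(set)
    for package in metadata.get("packages", []):
        for dep in package.get("dependencies", []):
            # `kind` is null for normal dependencies; label those "normal".
            kinds[dep["name"]].add(dep.get("kind") or "normal")
    return dict(kinds)


# Hand-written, heavily trimmed stand-in for real `cargo metadata` output.
SAMPLE = json.loads("""
{"packages": [
  {"name": "app", "dependencies": [
     {"name": "serde", "kind": null},
     {"name": "cc", "kind": "build"},
     {"name": "tempfile", "kind": "dev"}]},
  {"name": "serde", "dependencies": [
     {"name": "tempfile", "kind": "dev"}]}
]}
""")

kinds = dependency_kinds(SAMPLE)
# A package is dev-only if every package that requests it does so as "dev".
dev_only = sorted(name for name, found in kinds.items() if found == {"dev"})
print(dev_only)
```

A package requested both as a normal and as a dev dependency ends up with two kinds, so it would not be marked as dev-only in the SBOM.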
458474

459475
### Prefetching
@@ -481,13 +497,13 @@ Cons:
481497
- Relying on a built-in command brings its own disadvantages:
482498
- We have less control on what will be executed when invoking `cargo` commands
483499
- We need to account for cargo behavior changes more closely
484-
- We need install cargo in the Cachi2 image and keep its version up to date
500+
- We need to install cargo in the Cachi2 image and keep its version up to date
485501
- Might make it harder to build Pip+Rust projects
486502
- Cargo will refuse to vendor an empty directory with a single `Cargo.toml` file, which
487503
means we'd need to provide at least a minimal `src/main.rs` file to it.
488504

489505

490-
## Approach 2: manually fetching the dependencies
506+
## Approach 2 (alternative): manually fetching the dependencies
491507

492508
### Identifying the dependencies
493509

@@ -592,18 +608,26 @@ Cons:
592608
- Checksum files need to be manually generated
593609
- Sub-packages in git dependencies need to moved to a flat structure
594610
- The "vendor" configuration needs to be generated manually
611+
- Extra maintenance burden for Cargo.lock parser
612+
613+
## Decision
595614

596-
# Caveats
615+
Given the rich set of features provided by Cargo for managing dependencies, it is more
616+
cost-effective to rely on Cargo for performing all the necessary parsing and fetching.
617+
This decision is in line with the current approach to other package managers (e.g. Bundler or
618+
Yarn).
597619

598-
## Crates with binaries
620+
621+
## Appendix A. Crates with binaries
599622

600623
Crates are supposed to contain only source code. However, crates.io doesn't seem to enforce any
601624
rule to prohibit crates being uploaded with binaries. This happened at least once with
602625
[serde][serde-with-binaries], one of the most popular rust libraries.
603626

604-
# Pip + Cargo support in Cachi2
627+
## Appendix B. Pip + Cargo support in Cachi2
628+
605629

606-
## Context
630+
### Context
607631

608632
Traditionally, performance bottlenecks in the python ecosystem are addressed with C extensions,
609633
which introduce their own complexities and safety concerns.
@@ -620,7 +644,15 @@ Addressing the integration challenges of Rust in Python projects is crucial to e
620644
performance, safety, and concurrency of Python applications. The "rustification" of Python libraries
621645
is here to stay.
622646

623-
## The challenge and cachi2 boundaries
647+
On the other hand, Cachi2 in its current shape does not mandate the presence of the sources for all
648+
dependencies. For example, both Bundler and Pip will ignore all binary dependencies unless
649+
requested otherwise. Once requested, only the binaries themselves will be collected
650+
during the prefetch phase. They will be reported in the SBOM as regular packages. Making
651+
fully self-contained builds is a larger topic and is out of scope for this document.
652+
A general description of how this could be achieved for Python packages depending on Rust
653+
is presented in this section.
654+
655+
### The challenge and cachi2 boundaries
624656

625657
Building projects that do DIRECTLY depend on both rust and python should be straightforward and
626658
similar to building with pip and cargo independently. The developers of those projects can easily
@@ -643,7 +675,7 @@ the package indirectly depends or a file format designed for this. Also [pybuild
643675
might evolve to help solving this problem, so it is not like we would waste any time understanding
644676
these problems.
645677

646-
## Build dependencies
678+
### Build dependencies
647679

648680
`maturin` and `setuptools-rust` are PEP517 compliant build backends for python packages with
649681
embedded rust code.
@@ -652,7 +684,7 @@ Under the hood, `maturin` relies exclusively on `PyO3` while `setuptools-rust` c
652684
or `Rust-CPython` (but newer projects are likely preferring the former, as
653685
`Rust-CPython` development is halted and its author recommends `PyO3`).
654686

655-
## Detecting python packages with rust dependencies
687+
### Detecting python packages with rust dependencies
656688

657689
We could use the presence of either `maturin` or `setuptools-rust` as build dependencies of a python
658690
package as a heuristic to determine if a package is a python+rust library. Alternatively, we could
@@ -669,7 +701,7 @@ Packages relying on `maturin` and `setuptools` have a default place to have thei
669701
Cargo.toml/lock stored. Also, by parsing the configuration, it is possible to know whether the path to those
670702
manifests was modified.
671703

672-
### maturin
704+
#### maturin
673705
Detecting `maturin` is easier because it only supports python packages that use `pyproject.toml`
674706
to configure it. So detecting its presence is only a matter of verifying if
675707
`[build-system].requires` contains `maturin`.
@@ -696,7 +728,7 @@ example:
696728
manifest-path = "Cargo.toml"
697729
```
698730

699-
### setuptools-rust
731+
#### setuptools-rust
700732

701733
Older versions of `setuptools-rust` exclusively support `setup.py`, but since version 1.7.0 it also
702734
supports `pyproject.toml`.
@@ -756,7 +788,7 @@ setup(
756788
)
757789
```
758790

759-
## Vendoring rust dependencies
791+
### Vendoring rust dependencies
760792

761793
Even though `cargo vendor` only requires `Cargo.toml` (and optionally, but ideally for reproducible
762794
builds, `Cargo.lock`), it will fail without source code present. If it wasn't for this, manifest
@@ -781,7 +813,7 @@ If we go in that direction, we could even go one step further and expect a speci
781813
python+rust dependencies. This allows customers to only need to include a file like
782814
`rust-requirements.txt/toml/json/etc`.
783815

784-
## Hermetically build python + rust libraries
816+
### Hermetically build python + rust libraries
785817

786818
Both `maturin` and `setuptools-rust` will, somehow, invoke cargo during the build process. For this
787819
reason, we can leverage the way cargo is configured to look for vendored packages.
@@ -790,11 +822,6 @@ In order to do that, we need:
790822
1. A folder with all vendored crates
791823
2. A .cargo/config.toml [[link to the section in the document where this is explained]] overriding
792824
crates.io source with the path to vendored dependencies.
793-
<!--
794-
# TODO: move this to a appropriate section, as this configuration is the same for pure rust
795-
# note, tho, that on pure rust we might need to ADD this info to a existing file instead of writing
796-
# one from scratch.
797-
-->
798825
The config file looks like the following:
799826
```toml
800827
[source.crates-io]
@@ -825,7 +852,7 @@ RUN source /tmp/cachi2.env && \
825852

826853
```
827854

828-
### Limitations
855+
#### Limitations
829856

830857
- The process likely won't work with python packages lacking Cargo.lock.
831858
- Interestingly, while inspecting some projects relying on maturin I saw many that didn't have a
@@ -846,4 +873,4 @@ created.
846873
[ccs-inline-github]: https://github.com/Stranger6667/css-inline/tree/wasm-v0.11.2/bindings/python
847874
[serde-with-binaries]: https://www.bleepingcomputer.com/news/security/rust-devs-push-back-as-serde-project-ships-precompiled-binaries/
848875
[pybuild-deps]: https://pybuild-deps.readthedocs.io/en/latest/
849-
[python-rust-research]: https://github.com/bruno-fs/python-rust-research/blob/afebfc7ab6ef55aa0db6879b0cda7760373b60cd/python-rusty-exploration.ipynb
876+
[python-rust-research]: https://github.com/bruno-fs/python-rust-research/blob/afebfc7ab6ef55aa0db6879b0cda7760373b60cd/python-rusty-exploration.ipynb
