Documentation: FAQ about reading files and alternatives #13300

Open
cvrebert opened this issue Apr 5, 2021 · 4 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Documentation Documentation improvements that cannot be directly linked to other team labels team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts type: documentation (cleanup)

Comments

@cvrebert
Contributor

cvrebert commented Apr 5, 2021

I fairly often see internal questions about reading files in Starlark or, equivalently, about defining targets (e.g. generating library targets or tests) based on files other than BUILD/.bzl files. As far as I can tell, the Bazel docs currently don't address this directly, so similar questions get asked and answered repeatedly, which wastes effort. Because these questions concern somewhat obscure corners of Starlark, they don't always get a swift answer, and the quality and detail of the example code in the answers varies. Hence these suggestions.

  1. The docs ought to clearly state that Starlark and build rules themselves cannot read files (they can only pass file paths along to binaries). Currently, this is mostly implicit, in that Starlark lacks open()/file() functions.
  2. The docs ought to explain the philosophical/design reason for this restriction. As I understand it, allowing file reads would mean that changes to arbitrary files (not just BUILD and .bzl files) in a source tree could affect the build graph, which would make detecting changes and recomputing the build graph more expensive. Perhaps there are additional reasons?
  3. The docs should cover follow-up questions about what users should do instead. I've seen a few patterns; these should be codified.
    • Move the canonical source of the data into a .bzl file instead; possibly writing a Starlark rule to generate the previous data file.
    • Rather than reading a file and automatically generating N targets from the file, write a rule for generating a single target and then explicitly write out N instances of that rule in the BUILD file. Possibly write a script or automation for generating that BUILD code. If there's concern about the file and the targets drifting out-of-sync as new entries are added to the file, write a test which takes the file and all the targets as inputs, and asserts that the targets cover all entries in the file.
      • Instead of a monolithic main file, consider splitting the dataset into separate files. (This isn't always feasible/desirable.)
    • (I'm probably missing a couple more)
    • There are generally similar alternatives when dealing with a binary which generates a directory containing N output files, where the user wants each of the files to have associated target(s).
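
The second pattern above (explicit targets, generated by a script, guarded by a drift test) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data file is assumed to hold one entry per line, and the rule name `my_data_test`, its `entry` attribute, and the label `//data:entries.txt` are hypothetical placeholders, not real Bazel APIs.

```python
"""Sketch: generate BUILD stanzas from a data file and detect drift."""
import re


def parse_entries(data_text):
    """Return entry names: one per non-empty, non-comment line (assumed format)."""
    entries = []
    for line in data_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def render_build(entries, rule_name="my_data_test", data_label="//data:entries.txt"):
    """Render one explicit target per entry as BUILD-file text.

    `rule_name`, the `entry` attribute, and `data_label` are hypothetical.
    """
    stanzas = []
    for entry in entries:
        stanzas.append(
            '{rule}(\n'
            '    name = "{name}_test",\n'
            '    entry = "{name}",\n'
            '    data = ["{data}"],\n'
            ')\n'.format(rule=rule_name, name=entry, data=data_label)
        )
    return "\n".join(stanzas)


def declared_entries(build_text):
    """Extract the `entry = "..."` values already present in BUILD text."""
    return re.findall(r'^\s*entry\s*=\s*"([^"]+)"', build_text, flags=re.MULTILINE)


def missing_entries(data_text, build_text):
    """Return entries found in the data file that have no matching target."""
    declared = set(declared_entries(build_text))
    return [e for e in parse_entries(data_text) if e not in declared]
```

A drift test would take the data file and the BUILD file as inputs and fail when `missing_entries(...)` returns a non-empty list. The regex-based scan of the BUILD text is a simplification; a real check might query the targets through Bazel instead.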

I have no opinion on whether (parts of) these answers belong in an actual "FAQs" page.

Have you found anything relevant by searching the web?

@brandjon brandjon added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed untriaged labels Jun 3, 2021
@yeswalrus

Strong upvote here. For things like FlatBuffers (or, I think, even protobuf) that generate one file per table/struct, this is a big issue. The best options I can think of at the moment are to (1) write a custom rule and declare the output as a directory using ctx.actions.declare_directory; (2) write a linter rule enforcing one struct per file with a matching name (which many devs oppose when there are several small dependent structs that aren't meant to be used separately); or (3) as mentioned, write a custom build script that generates BUILD code, which is a rather ugly hack IMO and invites someone to build a meta-build system by creating space 'above' Bazel.
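
For illustration, the "one struct per file with a matching name" linter mentioned above might look something like the sketch below. It is a rough approximation, not a real FlatBuffers tool: it assumes declarations start a line as `table Name` or `struct Name`, strips only `//` line comments, and requires an exact, case-sensitive match with the file stem. The function names are hypothetical.

```python
"""Sketch: lint a FlatBuffers-style schema for one table/struct per file."""
import os
import re

# Simplified pattern: `table Name` or `struct Name` at the start of a line.
_DECL = re.compile(r'^\s*(?:table|struct)\s+([A-Za-z_]\w*)', re.MULTILINE)


def strip_comments(schema_text):
    """Drop `//` line comments so commented-out declarations are ignored."""
    return re.sub(r'//[^\n]*', '', schema_text)


def lint_schema(path, schema_text):
    """Return a list of problem strings; an empty list means the file passes."""
    names = _DECL.findall(strip_comments(schema_text))
    stem = os.path.splitext(os.path.basename(path))[0]
    problems = []
    if len(names) != 1:
        problems.append(
            "%s: expected exactly 1 table/struct, found %d" % (path, len(names))
        )
    elif names[0] != stem:
        problems.append(
            "%s: declaration %r does not match file name %r" % (path, names[0], stem)
        )
    return problems
```

Such a check could be wrapped in a test target so that a schema with two declarations, or a mismatched name, fails the build's test phase rather than silently producing unexpected output files.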

@brandjon brandjon added untriaged team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts and removed team-Build-Language labels Nov 4, 2022
@ShreeM01 ShreeM01 added the team-Documentation Documentation improvements that cannot be directly linked to other team labels label Jan 10, 2023
@comius comius removed the untriaged label Aug 22, 2023

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Oct 26, 2024
@cvrebert
Contributor Author

I have re-skimmed the docs. This issue still seems relevant.

@github-actions github-actions bot removed the stale Issues or PRs that are stale (no activity for 30 days) label Oct 27, 2024
@cam-bond

cam-bond commented Dec 9, 2024

+1. I spent way too much time trying to figure this out when working with Starlark for the first time.

7 participants