Description
I not-infrequently see internal questions about reading files in Starlark or (equivalently) defining targets (e.g. generating library targets or generating tests) based on (non-BUILD/bzl) files. FWICT, the Bazel docs currently don't address this directly. Thus, similar questions get asked and answered, wasting effort. Since these questions are about somewhat obscure corners of Starlark, they don't always get a swift answer. And the quality/detail of example code in the answers varies. Hence these suggestions.
- The docs ought to clearly state that Starlark & build rules themselves cannot read files (they can only pass file paths along to binaries). Currently, this is mostly implicit, in that Starlark lacks open()/file() functions.
- The most explicit thing I've found is a short paragraph in https://bazel.build/rules/challenges#loading-outdated
- I think this should be briefly, clearly stated in:
- The docs ought to explain the philosophical/design reason for this restriction. AIUI, allowing Starlark to read files would mean that changes to arbitrary files (vs. just BUILD & .bzl files) in a source code tree could affect the build graph, which would make detecting changes & recomputing the build graph more expensive. Perhaps there are additional reasons?
- The docs should cover follow-up questions about what users should do instead. I've seen a few patterns; these should be codified.
- Move the canonical source of the data into a .bzl file instead; possibly writing a Starlark rule to generate the previous data file.
- Rather than reading a file and automatically generating N targets from the file, write a rule for generating a single target and then explicitly write out N instances of that rule in the BUILD file. Possibly write a script or automation for generating that BUILD code. If there's concern about the file and the targets drifting out-of-sync as new entries are added to the file, write a test which takes the file and all the targets as inputs, and asserts that the targets cover all entries in the file.
- Instead of a monolithic main file, consider splitting the dataset into separate files. (This isn't always feasible/desirable.)
- (I'm probably missing a couple more)
- There are generally similar alternatives when dealing with a binary which generates a directory containing N output files, where the user wants each of the files to have associated target(s).
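To make the "generate the BUILD code with a script, then test for drift" pattern above more concrete, here is a minimal Python sketch. Everything specific in it is an assumption for illustration: the one-entry-per-line data format, the file name entries.txt, the rule name my_rule, and its entry/src attributes are all hypothetical, and the BUILD text is matched with a simple regex rather than a real parser.

```python
import re


def parse_entries(text):
    """Return entry names from a data file with one entry per line.

    Blank lines and lines starting with '#' are skipped (assumed format).
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def render_build(entries, rule="my_rule", data_file="entries.txt"):
    """Render one explicit target per entry as BUILD-file text.

    `my_rule` and its `entry`/`src` attributes are hypothetical; entry names
    are assumed to be valid target names.
    """
    stanzas = []
    for entry in entries:
        stanzas.append(
            '{rule}(\n'
            '    name = "{name}",\n'
            '    entry = "{name}",\n'
            '    src = "{src}",\n'
            ')\n'.format(rule=rule, name=entry, src=data_file)
        )
    return "\n".join(stanzas)


def declared_names(build_text, rule="my_rule"):
    """Collect the names of `rule` targets found in BUILD text.

    Regex-based: assumes `name` is the first attribute, as render_build emits.
    """
    pattern = re.escape(rule) + r'\(\s*name\s*=\s*"([^"]+)"'
    return set(re.findall(pattern, build_text))


def missing_targets(data_text, build_text, rule="my_rule"):
    """Return entries that have no matching target; an empty list means in sync."""
    declared = declared_names(build_text, rule)
    return [e for e in parse_entries(data_text) if e not in declared]
```

The output of render_build would be checked into the BUILD file, and missing_targets could back the drift test: something like a py_test that takes the data file and the BUILD file as data inputs and fails when the returned list is non-empty.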
I have no opinion on whether (parts of) these answers belong in an actual "FAQs" page.
Have you found anything relevant by searching the web?
- Internal SO-ish platform: Several threads.
- StackOverflow
- GitHub issues
- Searching for "read[ing] files" and docs-related tags/terms didn't yield any open issues which looked relevant.
- Email threads on bazel-discuss