Documentation: FAQ about reading files and alternatives #13300

Open
@cvrebert

I fairly often see internal questions about reading files in Starlark or, equivalently, defining targets (e.g. generating library targets or tests) based on the contents of files other than BUILD/.bzl files. From what I can tell, the Bazel docs currently don't address this directly, so similar questions get asked and answered repeatedly, wasting effort. Since these questions are about somewhat obscure corners of Starlark, they don't always get a swift answer, and the quality/detail of example code in the answers varies. Hence these suggestions.

  1. The docs ought to clearly state that Starlark & build rules themselves cannot read files (they can only pass file paths along to binaries). Currently, this is mostly implicit, in that Starlark lacks open()/file() functions.
  2. The docs ought to explain the philosophical/design reason for this restriction. AIUI, allowing file reads would mean that changes to arbitrary files (vs. just BUILD & .bzl files) in a source tree could affect the build graph, which would make detecting changes & recomputing the build graph more expensive. Perhaps there are additional reasons?
  3. The docs should cover follow-up questions about what users should do instead. I've seen a few patterns; these should be codified.
    • Move the canonical source of the data into a .bzl file instead, possibly writing a Starlark rule to generate the original data file from it.
    • Rather than reading a file and automatically generating N targets from the file, write a rule for generating a single target and then explicitly write out N instances of that rule in the BUILD file. Possibly write a script or automation for generating that BUILD code. If there's concern about the file and the targets drifting out-of-sync as new entries are added to the file, write a test which takes the file and all the targets as inputs, and asserts that the targets cover all entries in the file.
      • Instead of a monolithic main file, consider splitting the dataset into separate files. (This isn't always feasible/desirable.)
    • (I'm probably missing a couple more)
    • There are generally similar alternatives when dealing with a binary which generates a directory containing N output files, where the user wants each of the files to have associated target(s).
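
To make the second pattern under item 3 a bit more concrete (explicit targets, a generator script, and a drift check), here is a minimal Python sketch. Everything in it is a hypothetical assumption rather than anything Bazel provides: the one-entry-per-line data file format, the `my_data_test` rule, the `//tools:defs.bzl` load label, and the function names. It only shows the shape of such a script, not a recommended implementation.

```python
"""Sketch: generate explicit BUILD targets from a data file and check for drift.

Assumptions (all hypothetical, not part of Bazel):
  * the data file lists one entry name per line; blank lines and
    lines starting with '#' are skipped
  * a Starlark rule named `my_data_test` is loadable from `//tools:defs.bzl`
"""


def parse_entries(text):
    """Return entry names from the data file text, in file order."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def render_build(entries):
    """Render BUILD file text with one explicit target per entry."""
    lines = ['load("//tools:defs.bzl", "my_data_test")', ""]
    for entry in entries:
        lines += [
            "my_data_test(",
            '    name = "%s_test",' % entry,
            '    entry = "%s",' % entry,
            ")",
            "",
        ]
    return "\n".join(lines)


def missing_targets(entries, target_names):
    """Return entries with no matching `<entry>_test` target, in file order."""
    known = set(target_names)
    return [e for e in entries if e + "_test" not in known]
```

A sync test built on this would take the data file and the list of target names as inputs and fail whenever `missing_targets` returns a non-empty list, which catches entries added to the file without a corresponding target in the BUILD file.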

I have no opinion on whether (parts of) these answers belong in an actual "FAQs" page.

    Labels

    • P3: We're not considering working on this, but happy to review a PR. (No assignee)
    • team-Documentation: Documentation improvements that cannot be directly linked to other team labels
    • team-Rules-API: API for writing rules/aspects: providers, runfiles, actions, artifacts
    • type: documentation (cleanup)