Spurious file-related failures on Windows runners #10483

Closed
@tgross35

Description

For the past few months, the rust-lang/rust project has had a lot of spurious failures on the Windows runners. These are typically either failure to open a file (mostly from link.exe) or failure to remove a file:

  • LINK : fatal error LNK1104: cannot open file ...
  • error: failed to remove file ..., Access is denied (os error 5)

Example run: https://github.com/rust-lang-ci/rust/actions/runs/10537107932/job/29198090275

Is it possible that something changed that would cause this? Even if not, and this turns out to be a problem with our tooling, we could use assistance debugging.

Further context, links to failed jobs, and attempts to reproduce are at rust-lang/rust#127883. Almost every PR showing up in the mentions list is from one of these failures. These errors are similar to what was reported in #4086.

Cc @ChrisDenton and @ehuss who have been working to reproduce this.

Platforms affected

  • Azure DevOps
  • GitHub Actions - Standard Runners
  • GitHub Actions - Larger Runners

Runner images affected

  • Ubuntu 20.04
  • Ubuntu 22.04
  • Ubuntu 24.04
  • macOS 12
  • macOS 13
  • macOS 13 Arm64
  • macOS 14
  • macOS 14 Arm64
  • Windows Server 2019
  • Windows Server 2022

Image version and build link

Current runner version: '2.319.1'
Runner name: 'windows-2022-8core-32gb_4d2ba789d359'
Runner group name: 'Default Larger Runners'
Machine name: 'runner'
Operating System
  Microsoft Windows Server 2022
  10.0.20348
  Datacenter
Runner Image
  Image: windows-2022
  Version: 20240811.1.0
  Included Software: https://github.com/actions/runner-images/blob/win22/20240811.1/images/windows/Windows2022-Readme.md
  Image Release: https://github.com/actions/runner-images/releases/tag/win22%2F20240811.1

Is it regression?

Yes, starting around 2024-06-27, though the exact start is unknown. It seems to have gotten significantly worse in the past week or so: that job has had at least a 25% failure rate from this issue over the past couple of days (probably close to 50%).

Expected behavior

Accessing or removing the files should succeed.

Actual behavior

The file operations are encountering spurious failures, as linked above.
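
To help tell a transient failure from a persistent one, a retry wrapper around file removal can record how many attempts were needed. The following is only a sketch under stated assumptions: `remove_file_with_retry` is a hypothetical helper, not something used by rust-lang/rust or the runner images, and it assumes the failure is short-lived, which has not been established here. It relies on the fact that Rust's standard library reports "Access is denied (os error 5)" as `io::ErrorKind::PermissionDenied`.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;
use std::time::Duration;

/// Hypothetical helper (not part of any existing tool): remove `path`,
/// retrying only when the OS reports `PermissionDenied`, which is how
/// "Access is denied (os error 5)" surfaces through `std::io`.
/// On success, returns the number of attempts that were needed.
pub fn remove_file_with_retry(
    path: &Path,
    max_attempts: u32,
    delay: Duration,
) -> io::Result<u32> {
    let mut attempt = 1;
    loop {
        match fs::remove_file(path) {
            Ok(()) => return Ok(attempt),
            // Retry only on access-denied, and only while attempts remain.
            Err(e) if e.kind() == io::ErrorKind::PermissionDenied && attempt < max_attempts => {
                eprintln!("attempt {attempt}: {e}; retrying in {delay:?}");
                thread::sleep(delay);
                attempt += 1;
            }
            // Any other error, or the final failed attempt, is returned as-is.
            Err(e) => return Err(e),
        }
    }
}
```

If a removal that fails at first succeeds on a later attempt, that would suggest something is holding the file open only briefly. If it fails on every attempt, the cause is more likely persistent. Either outcome is diagnostic data rather than a fix.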

Repro steps

No known consistent reproduction.

Metadata

Labels

  • Area: Rust
  • OS: Windows
  • bug report
  • investigate (Collect additional information, like space on disk, other tool incompatibilities etc.)