
Pass user-agent from DownloadConfig into fsspec storage_options #7631

Open. Wants to merge 1 commit into base: main.

ArjunJagdale
Contributor

Fixes part of issue #6046

Problem

The user-agent defined in DownloadConfig is not passed down to fsspec-based filesystems such as HfFileSystem, so requests made through those filesystems cannot be properly identified or tracked.

Solution

Added support for injecting the user-agent into storage_options["headers"] within _prepare_single_hop_path_and_storage_options() based on the protocol.

Now, when using hf://, http://, or https://, the custom user-agent is passed automatically.
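The injection described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from the PR: `inject_user_agent` and `HEADER_PROTOCOLS` are hypothetical names, and the choice to leave an existing `user-agent` header untouched is an assumption rather than something the description states.

```python
from typing import Optional

# Protocols for which the description says the user-agent is forwarded.
HEADER_PROTOCOLS = {"hf", "http", "https"}


def inject_user_agent(
    protocol: str,
    storage_options: Optional[dict],
    user_agent: Optional[str],
) -> dict:
    """Return a copy of storage_options with a user-agent header added.

    Hypothetical helper: it only mirrors the behaviour described in this PR.
    """
    options = dict(storage_options or {})
    if user_agent is None or protocol not in HEADER_PROTOCOLS:
        # Other protocols (e.g. s3, gcs) are returned unchanged.
        return options
    # Copy the headers so the caller's dict is never mutated.
    headers = dict(options.get("headers") or {})
    # Assumption: an existing "user-agent" key (exact-case match) wins.
    headers.setdefault("user-agent", user_agent)
    options["headers"] = headers
    return options
```

For example, `inject_user_agent("https", {"timeout": 10}, "my-app/1.0")` keeps `timeout` and adds a `headers` entry, while the same call with `"s3"` returns the options without any headers.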

Code Location

Modified:

  • src/datasets/utils/file_utils.py

Used get_datasets_user_agent(...) to ensure proper formatting and fallback logic.

@ArjunJagdale
Contributor Author

  • This PR assumes that HfFileSystem in huggingface_hub supports receiving headers in storage_options. If not, a follow-up PR can be opened to add this support to HfFileSystem.__init__.
  • No test was added for this since it’s a config passthrough. If needed, I’d be happy to add one.
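One way to probe the first assumption before relying on it is to inspect a class's constructor signature. The sketch below uses only the standard library; `accepts_keyword` and the two stand-in classes are hypothetical and say nothing about what HfFileSystem actually accepts. Note that a constructor taking `**kwargs` will pass this check even if it ignores the value.

```python
import inspect


def accepts_keyword(cls: type, name: str) -> bool:
    """Report whether cls.__init__ names `name` explicitly or takes **kwargs."""
    params = inspect.signature(cls.__init__).parameters.values()
    return any(
        p.name == name or p.kind is inspect.Parameter.VAR_KEYWORD
        for p in params
    )


class WithHeaders:  # hypothetical stand-in, not a real filesystem class
    def __init__(self, headers=None):
        self.headers = headers


class WithoutHeaders:  # hypothetical stand-in, not a real filesystem class
    def __init__(self, token=None):
        self.token = token


print(accepts_keyword(WithHeaders, "headers"))
print(accepts_keyword(WithoutHeaders, "headers"))
```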
