Add Configurable Dataset Download Timeout for load_dataset Calls #15

Open: wants to merge 2 commits into base: main

WnRock (Contributor) commented Jun 23, 2025

Addresses dataset downloads that time out on slow or unstable network connections by making the download timeout configurable.

  • Added a dataset_download_timeout field to the config.
  • Updated all relevant load_dataset calls to use the storage_options parameter with a custom aiohttp.ClientTimeout, as recommended in huggingface/datasets#7164.

Users can now set dataset_download_timeout in the config to control the maximum time allowed for a dataset download, which improves reliability when downloading large datasets.
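For readers unfamiliar with the pattern, here is a minimal sketch of how a config value can be turned into the storage_options argument of load_dataset. The names AppConfig, build_storage_options, and load_with_timeout are hypothetical and not taken from this PR, the 3600-second default and the choice of seconds as the unit are assumptions, and the {"client_kwargs": {"timeout": ...}} shape follows the workaround discussed in huggingface/datasets#7164.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class AppConfig:
    """Hypothetical config object; only the field name comes from the PR."""

    # Total time allowed for a dataset download, in seconds (assumed unit and default).
    dataset_download_timeout: float = 3600.0

    def __post_init__(self) -> None:
        # Reject zero or negative values early instead of failing mid-download.
        if self.dataset_download_timeout <= 0:
            raise ValueError("dataset_download_timeout must be positive")


def build_storage_options(config: AppConfig) -> Dict[str, Any]:
    """Build the storage_options dict that carries a custom aiohttp timeout."""
    # Imported lazily so the config class can be used without aiohttp installed.
    import aiohttp

    timeout = aiohttp.ClientTimeout(total=config.dataset_download_timeout)
    return {"client_kwargs": {"timeout": timeout}}


def load_with_timeout(path: str, config: AppConfig, **kwargs: Any):
    """Call load_dataset with the configured download timeout applied."""
    from datasets import load_dataset

    return load_dataset(
        path,
        storage_options=build_storage_options(config),
        **kwargs,
    )
```

A call such as load_with_timeout("some/dataset", AppConfig(dataset_download_timeout=7200), split="train") would then allow up to two hours for the download; the dataset name here is a placeholder.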

Closes #14

Development

Successfully merging this pull request may close these issues.

Default Timeout Too Short for Large Datasets with Unstable Network (FSTimeoutError)