Skip to content

terminate called without an active exception; Aborted (core dumped) #7566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
alexey-milovidov opened this issue May 11, 2025 · 0 comments
Open

Comments

@alexey-milovidov
Copy link

Describe the bug

I use it as in the tutorial here: https://huggingface.co/docs/datasets/stream, and it ends up with abort.

Steps to reproduce the bug

  1. pip install datasets
$ cat main.py 
#!/usr/bin/env python3

from datasets import load_dataset

dataset = load_dataset('HuggingFaceFW/fineweb', split='train', streaming=True)
print(next(iter(dataset)))
  1. chmod +x main.py
$ ./main.py 
README.md: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 43.1k/43.1k [00:00<00:00, 7.04MB/s]
Resolving data files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25868/25868 [00:05<00:00, 4859.26it/s]
Resolving data files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25868/25868 [00:00<00:00, 54773.56it/s]
{'text': "How AP reported in all formats from tornado-stricken regionsMarch 8, 2012\nWhen the first serious bout of tornadoes of 2012 blew through middle America in the middle of the night, they touched down in places hours from any AP bureau. Our closest video journalist was Chicago-based Robert Ray, who dropped his plans to travel to Georgia for Super Tuesday, booked several flights to the cities closest to the strikes and headed for the airport. He’d decide once there which flight to take.\nHe never got on board a plane. Instead, he ended up driving toward Harrisburg, Ill., where initial reports suggested a town was destroyed. That decision turned out to be a lucky break for the AP. Twice.\nRay was among the first journalists to arrive and he confirmed those reports -- in all formats. He shot powerful video, put victims on the phone with AP Radio and played back sound to an editor who transcribed the interviews and put the material on text wires. He then walked around the devastation with the Central Regional Desk on the line, talking to victims with the phone held so close that editors could transcribe his interviews in real time.\nRay also made a dramatic image of a young girl who found a man’s prosthetic leg in the rubble, propped it up next to her destroyed home and spray-painted an impromptu sign: “Found leg. Seriously.”\nThe following day, he was back on the road and headed for Georgia and a Super Tuesday date with Newt Gingrich’s campaign. The drive would take him through a stretch of the South that forecasters expected would suffer another wave of tornadoes.\nTo prevent running into THAT storm, Ray used his iPhone to monitor Doppler radar, zooming in on extreme cells and using Google maps to direct himself to safe routes. And then the journalist took over again.\n“When weather like that occurs, a reporter must seize the opportunity to get the news out and allow people to see, hear and read the power of nature so that they can take proper shelter,” Ray says.\nSo Ray now started to use his phone to follow the storms. He attached a small GoPro camera to his steering wheel in case a tornado dropped down in front of the car somewhere, and took video of heavy rain and hail with his iPhone. Soon, he spotted a tornado and the chase was on. He followed an unmarked emergency vehicle to Cleveland, Tenn., where he was first on the scene of the storm's aftermath.\nAgain, the tornadoes had struck in locations that were hours from the nearest AP bureau. Damage and debris, as well as a wickedly violent storm that made travel dangerous, slowed our efforts to get to the news. That wasn’t a problem in Tennessee, where our customers were well served by an all-formats report that included this text story.\n“CLEVELAND, Tenn. (AP) _ Fierce wind, hail and rain lashed Tennessee for the second time in three days, and at least 15 people were hospitalized Friday in the Chattanooga area.”\nThe byline? Robert Ray.\nFor being adept with technology, chasing after news as it literally dropped from the sky and setting a standard for all-formats reporting that put the AP ahead on the most competitive news story of the day, Ray wins this week’s $300 Best of the States prize.\n© 2013 The Associated Press. All rights reserved. Terms and conditions apply. See AP.org for details.", 'id': '<urn:uuid:d66bc6fe-8477-4adf-b430-f6a558ccc8ff>', 'dump': 'CC-MAIN-2013-20', 'url': 'http://%[email protected]/Content/Press-Release/2012/How-AP-reported-in-all-formats-from-tornado-stricken-regions', 'date': '2013-05-18T05:48:54Z', 'file_path': 's3://commoncrawl/crawl-data/CC-MAIN-2013-20/segments/1368696381249/warc/CC-MAIN-20130516092621-00000-ip-10-60-113-184.ec2.internal.warc.gz', 'language': 'en', 'language_score': 0.9721424579620361, 'token_count': 717}
terminate called without an active exception
Aborted (core dumped)

Expected behavior

I'm not a proficient Python user, so it might be my own error, but even in that case, the error message should be better.

Environment info

Successfully installed datasets-3.6.0 dill-0.3.8 hf-xet-1.1.0 huggingface-hub-0.31.1 multiprocess-0.70.16 requests-2.32.3 xxhash-3.5.0

$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant