Issues: EleutherAI/lm-evaluation-harness
Issues list
Passing sample based parameters for metric
feature request
A feature that isn't implemented yet.
#3038
opened Jun 3, 2025 by
elements72
Caught jinja2.exceptions.UndefinedError: 'context' is undefined when dealing with japanese_leaderboard
asking questions
For asking for clarification / support on library usage.
#3028
opened May 29, 2025 by
Lynnzake
Add Support Conditional Generation Models like Mistral3
feature request
A feature that isn't implemented yet.
#3027
opened May 29, 2025 by
KyleMylonakisProtopia
Issue with quantization_config argument
bug
Something isn't working.
#3026
opened May 28, 2025 by
shanhx2000
Support for using a remote /tokenize API endpoint as the tokenizer
feature request
A feature that isn't implemented yet.
#3017
opened May 24, 2025 by
furkancoskun
Couldn't find file squad-v1.1/train-v1.1.json when evaluate Qwen3-A3B with vllm pipeline
asking questions
For asking for clarification / support on library usage.
#3015
opened May 23, 2025 by
Lynnzake
Docker build fails due to missing pip module in Conda environment during setup.py develop on editable install
feature request
A feature that isn't implemented yet.
#3014
opened May 22, 2025 by
osmangoninahid
hellaswag not working: "no tasks specified" and "Keyerror: 'train'
asking questions
For asking for clarification / support on library usage.
#3010
opened May 22, 2025 by
matthijsvk
zeno_visualize.py can't parse model_args
bug
Something isn't working.
good first issue
Good for newcomers
#3005
opened May 21, 2025 by
login256
backward compatibility for unitxt (and others) after adding question_suffix to task.fewshot_context in #2876
bug
Something isn't working.
feature request
A feature that isn't implemented yet.
#3004
opened May 21, 2025 by
baberabb
unitxt with local-chat-completions gets stuck forever
bug
Something isn't working.
#2986
opened May 15, 2025 by
ivanbaldo
Longbench classification_score() missing 1 required positional argument: 'results'
#2976
opened May 12, 2025 by
sustcsonglin
Performance bottleneck: consider multiprocessing for cached request checking
#2964
opened May 9, 2025 by
justHungryMan
Ruler QA tasks do not work for max_seq_lengths < 4096
bug
Something isn't working.
#2963
opened May 9, 2025 by
sustcsonglin
Log truncation/max_length to logged samples
feature request
A feature that isn't implemented yet.
#2961
opened May 8, 2025 by
freshpearYoon
Generation length is limited to 2048 tokens. Qwen3 model accuracy is low
#2953
opened May 3, 2025 by
sravan500