Pull requests: huggingface/transformers

fix: typos in documentation files (#36122, opened Feb 10, 2025 by maximevtush)
update awesome-transformers.md (#36115, opened Feb 10, 2025 by zhanluxianshen; 2 of 5 tasks)
add DeepSpeed tensor parallel initialization (#36114, opened Feb 10, 2025 by inkcherry)
Proper performant flex attention implementation (#36103, opened Feb 8, 2025 by bursteratom; 3 of 15 tasks)
Fixup another model + encoder decoder (#36095, opened Feb 7, 2025 by muellerzr; draft; 5 tasks)
Fix: Llama - adjust the rotary embedding dimensions (#36090, opened Feb 7, 2025 by abdullahselek; 1 of 5 tasks)
fix: dtype might change during resize (#36089, opened Feb 7, 2025 by LarsHaalck; 2 of 5 tasks)
[docs] fix bug in deepspeed config (#36081, opened Feb 7, 2025 by faaany)
[docs] update awq doc (#36079, opened Feb 7, 2025 by faaany)
[Bugfix] Fix reloading of pixtral/llava configs (#36077, opened Feb 7, 2025 by kylesayrs)
Speedup modular conversion w/ multiproc (#36073, opened Feb 6, 2025 by muellerzr; draft; 1 of 5 tasks)
Optim: APOLLO optimizer integration (#36062, opened Feb 6, 2025 by zhuhanqing)
🚧 [WiP] Add Janus model (#36053, opened Feb 5, 2025 by yaswanth19; draft; 5 tasks)
Remove type hint Unpack[FlashAttentionKwargs] (#36049, opened Feb 5, 2025 by ydshieh)
Anole add model (#36047, opened Feb 5, 2025 by zucchini-nlp)
Add Phi-3.5-vision (#36036, opened Feb 4, 2025 by Dahlbomii; draft)
[core] Large/full refactor of from_pretrained (#36033, opened Feb 4, 2025 by Cyrilvallez)