Issues: thu-ml/SageAttention
#179 RuntimeError: Cannot find CUDA_HOME. CUDA must be available to build the package. (opened May 25, 2025 by wujpia)
#167 Why use different block sizes (128/64) in sageattn-v1 when quantizing q and k? (opened May 8, 2025 by ZJLi2013)
#165 How can I simply verify a successful installation of sageattention? (opened May 6, 2025 by aswordok)
#164 BUG: RTX 50XX: nan returned by _fused.mean_scale_fuse_quant_cuda and _fused.scale_fuse_quant_cuda (opened Apr 30, 2025 by deepbeepmeep)
#154 The accuracy loss in the CUDA version is much greater than in the Triton version for Llama-3.2 (opened Apr 7, 2025 by WanliZhong)
#152 K Sampler: [WinError 2] The system cannot find the file specified (opened Apr 2, 2025 by REG-0422)
#148 SageAttention fails to work on RTX 50xx series GPUs (e.g. 5090) despite a clean venv install (opened Mar 24, 2025 by Teskun)
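Several of these reports (notably the CUDA_HOME error in #179) come down to the build step not finding a CUDA toolkit. A minimal shell sketch of one common workaround, assuming a Linux system; the /usr/local/cuda fallback path is an assumption about a typical install, not something stated in the project docs:

```shell
# Sketch: derive CUDA_HOME from nvcc if it is on PATH, otherwise fall
# back to the conventional install location (assumption; adjust to your system).
if command -v nvcc >/dev/null 2>&1; then
    # nvcc normally lives in $CUDA_HOME/bin, so strip two path components
    export CUDA_HOME="$(dirname "$(dirname "$(command -v nvcc)")")"
else
    export CUDA_HOME=/usr/local/cuda
fi
echo "CUDA_HOME=$CUDA_HOME"
```

With CUDA_HOME exported in the same shell, re-running the package build should no longer raise the RuntimeError above, provided a matching CUDA toolkit is actually installed at that path.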