[GPU] Enable f4_e2m1 jit gemm #2442
// cmp (ge) t0:w, y:w, 31
// shr y:uw, 10
// csel (ge) y:fp16, 0x7bff, y:fp16, t0:fp16
// csel (ze) y:fp16, NaN:fp16, y:fp16, t1:fp16
Side note: there's a much faster sequence, though this is OK for now:
shl t0:ud x:ub 24
add t0:ud t0:ud 1
mov y:hf t0:f
make test
@kealan-barbieri Do we have f4_e2m1 coverage in the benchdnn input files? If it is missing, can you please add some? In the long term, #2434 should help with that.
@echeresh There is existing coverage: https://github.com/oneapi-src/oneDNN/blob/main/tests/benchdnn/inputs/matmul/test_matmul_fp4
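For readers checking that coverage by hand, here is a minimal reference decoder for a single f4_e2m1 code. It is a sketch only, and it assumes f4_e2m1 uses the OCP Microscaling FP4 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1, no Inf or NaN encodings). The name `decode_f4_e2m1` is hypothetical and is not a oneDNN or benchdnn function. How two codes are packed into a byte is not covered here.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of oneDNN: decodes the low 4 bits of `code`
// as an E2M1 value, assuming the layout [sign | exp1 exp0 | mantissa].
float decode_f4_e2m1(std::uint8_t code) {
    const unsigned sign = (code >> 3) & 0x1u;
    const unsigned exp = (code >> 1) & 0x3u;
    const unsigned man = code & 0x1u;

    float magnitude;
    if (exp == 0) {
        // Subnormal range: no implicit leading one, so the value is 0 or 0.5.
        magnitude = 0.5f * static_cast<float>(man);
    } else {
        // Normal range: (1 + man / 2) * 2^(exp - bias), with bias = 1.
        magnitude = std::ldexp(1.0f + 0.5f * static_cast<float>(man),
                static_cast<int>(exp) - 1);
    }
    return sign ? -magnitude : magnitude;
}
```

Under that assumption the eight non-negative codes map to 0, 0.5, 1, 1.5, 2, 3, 4 and 6, which is a small enough set to compare exhaustively against a kernel's upconversion.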
make test
make test
make test perf-gpu
make test
Update types with autoTypeConversions before counting outer product ops.
Description
Partially covers MFDNN-124711
Checklist
General
Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?