SYCL end-to-end test doesn't pass on AMD GPU. #16778

Open
bader opened this issue Jan 25, 2025 · 0 comments
Labels
bug (Something isn't working), hip (Issues related to execution on HIP backend)


Describe the bug

I made an innocuous change to code comments and, to my surprise, CI reports that my change does not pass pre-commit testing on the HIP platform.

-- Testing: 2268 tests, 24 workers --
FAIL: SYCL :: syclcompat/memory/memory_management_test2.cpp (2237 of 2268)
******************** TEST 'SYCL :: syclcompat/memory/memory_management_test2.cpp' FAILED ********************
Exit Code: -6

Command Output (stdout):
--
# RUN: at line 33
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -Wno-error=#warnings -Wno-error=deprecated-declarations -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031  /__w/llvm/llvm/llvm/sycl/test-e2e/syclcompat/memory/memory_management_test2.cpp -o /__w/llvm/llvm/build-e2e/syclcompat/memory/Output/memory_management_test2.cpp.tmp.out
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror '-Wno-error=#warnings' -Wno-error=deprecated-declarations -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 /__w/llvm/llvm/llvm/sycl/test-e2e/syclcompat/memory/memory_management_test2.cpp -o /__w/llvm/llvm/build-e2e/syclcompat/memory/Output/memory_management_test2.cpp.tmp.out
# .---command stderr------------
# | In file included from /__w/llvm/llvm/llvm/sycl/test-e2e/syclcompat/memory/memory_management_test2.cpp:38:
# | In file included from /__w/llvm/llvm/toolchain/bin/../include/syclcompat/memory.hpp:54:
# | /__w/llvm/llvm/toolchain/bin/../include/syclcompat/device.hpp:470:2: warning: "Querying the number of bytes of free memory is not supported" [-W#warnings]
# |   470 | #warning "Querying the number of bytes of free memory is not supported"
# |       |  ^
# | 1 warning generated.
# | In file included from /__w/llvm/llvm/llvm/sycl/test-e2e/syclcompat/memory/memory_management_test2.cpp:38:
# | In file included from /__w/llvm/llvm/toolchain/bin/../include/syclcompat/memory.hpp:54:
# | /__w/llvm/llvm/toolchain/bin/../include/syclcompat/device.hpp:470:2: warning: "Querying the number of bytes of free memory is not supported" [-W#warnings]
# |   470 | #warning "Querying the number of bytes of free memory is not supported"
# |       |  ^
# | 1 warning generated.
# `-----------------------------
# RUN: at line 34
env UR_HIP_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=hip:gpu  /__w/llvm/llvm/build-e2e/syclcompat/memory/Output/memory_management_test2.cpp.tmp.out
# executed command: env UR_HIP_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=hip:gpu /__w/llvm/llvm/build-e2e/syclcompat/memory/Output/memory_management_test2.cpp.tmp.out
# .---command stdout------------
# | void test_memcpy_kernel()
# `-----------------------------
# .---command stderr------------
# | Memory access fault by GPU node-1 (Agent handle: 0x243ec490) on address 0x7fa3d19bc000. Reason: Page not present or supervisor privilege.
# `-----------------------------
# error: command failed with exit status: -6

--

********************
TIMEOUT: SYCL :: WorkGroupMemory/basic_usage.cpp (2268 of 2268)
******************** TEST 'SYCL :: WorkGroupMemory/basic_usage.cpp' FAILED ********************
Exit Code: -9
Timeout: Reached timeout of 600 seconds

Command Output (stdout):
--
# RUN: at line 1
/__w/llvm/llvm/toolchain/bin//clang++  -Werror -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031  /__w/llvm/llvm/llvm/sycl/test-e2e/WorkGroupMemory/basic_usage.cpp -o /__w/llvm/llvm/build-e2e/WorkGroupMemory/Output/basic_usage.cpp.tmp.out
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1031 /__w/llvm/llvm/llvm/sycl/test-e2e/WorkGroupMemory/basic_usage.cpp -o /__w/llvm/llvm/build-e2e/WorkGroupMemory/Output/basic_usage.cpp.tmp.out
# note: command had no output on stdout or stderr
# RUN: at line 2
env UR_HIP_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=hip:gpu  /__w/llvm/llvm/build-e2e/WorkGroupMemory/Output/basic_usage.cpp.tmp.out
# executed command: env UR_HIP_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=hip:gpu /__w/llvm/llvm/build-e2e/WorkGroupMemory/Output/basic_usage.cpp.tmp.out
# note: command had no output on stdout or stderr
# error: command failed with exit status: -9
# error: command reached timeout: True

--
********************
Timed Out Tests (1):
  SYCL :: WorkGroupMemory/basic_usage.cpp

********************
Failed Tests (1):
  SYCL :: syclcompat/memory/memory_management_test2.cpp

The syclcompat/memory/memory_management_test2.cpp failure might be related to another issue I opened in 2023, #10460. @npmiller, @GeorgeWeb, FYI.

I searched for other issues with a similar error message and found #14404, reported in 2024. @JackAKirk, FYI.

Both of these issues have been referenced from multiple PRs.

Link to my change: #16777
Link to pre-commit results on AMD GPU: https://github.com/intel/llvm/actions/runs/12959719697/job/36152970827

To reproduce
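
The issue gives no explicit steps, so the following is only a minimal sketch of how the failing test could be rebuilt and rerun outside CI. The compiler flags and environment variables are copied from the "executed command" lines in the log above. The `CXX`, `LLVM_SRC`, `OUT_DIR`, `E2E_TEST`, `ARCH` and `EXECUTE` variables and the `run_cmd` helper are hypothetical names introduced for illustration, and the default paths are assumptions about a local setup. The WorkGroupMemory test was compiled in CI without the two `-Wno-error=...` flags, so they may be unnecessary for it.

```shell
#!/bin/sh
# Sketch: rebuild and rerun one failing SYCL e2e test locally.
# Assumed (not from the issue): CXX is a DPC++ clang++ build, LLVM_SRC is an
# intel/llvm checkout, OUT_DIR is a scratch directory.
CXX="${CXX:-clang++}"
LLVM_SRC="${LLVM_SRC:-./llvm}"
OUT_DIR="${OUT_DIR:-/tmp/sycl-e2e}"
E2E_TEST="${E2E_TEST:-syclcompat/memory/memory_management_test2.cpp}"
ARCH="${ARCH:-gfx1031}"   # offload arch used by the CI runner in the log

SRC="$LLVM_SRC/sycl/test-e2e/$E2E_TEST"
BIN="$OUT_DIR/$(basename "$E2E_TEST" .cpp).out"

# Print each command; run it only when EXECUTE=1 (the default is a dry run).
run_cmd() {
  echo "+ $*"
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; fi
}

run_cmd mkdir -p "$OUT_DIR"

# Compile step: flags taken from "RUN: at line 33" in the log.
run_cmd "$CXX" -Werror '-Wno-error=#warnings' -Wno-error=deprecated-declarations \
  -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
  -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch="$ARCH" \
  "$SRC" -o "$BIN"

# Run step: environment taken from "RUN: at line 34" in the log.
run_cmd env UR_HIP_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=hip:gpu "$BIN"
```

Setting `E2E_TEST=WorkGroupMemory/basic_usage.cpp` targets the test that timed out, and `EXECUTE=1` performs the build and run instead of only printing the commands.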

Environment

  • OS: Linux
  • Target device and vendor: AMD GPU

All environment information is available in the logs referenced in the description section.

Additional context

No response

bader added the bug and hip labels on Jan 25, 2025