Description
When -fno-builtin-printf is in effect (either passed directly or implied by -fno-builtin), compiling a device function that calls printf results in an error.
Impact
/Oi-, which is commonly used on Windows when debugging (it is the default setting in Visual Studio), is translated by clang-cl to -fno-builtin.
There's no option to re-enable a builtin that was disabled by a previous flag, so the only workarounds are to not use -fno-builtin at all, or to pass -fno-builtin-<name> for every builtin function except printf.
Environment
❯ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
❯ clang --version
Ubuntu clang version 17.0.2 (++20231003073124+b2417f51dbbd-1~exp1~20231003073217.50)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
❯ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
❯ hipconfig --version
5.4.22801-aaa1e3d8
Reproducer
Assuming a file named main.cu with the contents:
#ifdef __HIP__
#include <hip/hip_runtime.h>
#endif
__global__ void a() {
printf("my name: %s\n", "asd");
}
For CUDA:
❯ clang -x cu -fno-builtin-printf main.cu
ptxas fatal : Unresolved extern function 'printf'
clang-17: error: ptxas command failed with exit code 255 (use -v to see invocation)
Ubuntu clang version 17.0.2 (++20231003073124+b2417f51dbbd-1~exp1~20231003073217.50)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
For HIP:
❯ clang -x hip -fno-builtin-printf main.cu
main.cu:5:17: error: unsupported call to variadic function printf
__global__ void a() {
^
1 error generated when compiling for gfx906.
(This happens with both -mprintf-kind=buffered and -mprintf-kind=hostcall.)
OpenMP offloading is probably also affected, but I didn't try it.
Analysis
Here's what I think goes wrong:
Clang rewrites printf to vprintf for CUDA and directly emits runtime calls for AMDGPU, but this only happens when builtins are enabled. When builtins are not enabled, a normal function call to printf is emitted instead, and that call cannot be resolved because:
- The CUDA device libraries don't provide printf, as nvcc normally rewrites it to a special vprintf.
- AMDGPU doesn't support C-style variadic functions at all, so trying to codegen the call to printf fails earlier (at compile time instead of link time).
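To illustrate the rewrite described above, here is a rough host-side C++ sketch of how I understand the CUDA lowering: the variadic arguments are packed into a buffer, and a two-argument vprintf-style function receives the format string plus a pointer to that buffer. This is not Clang's actual code. The names fake_vprintf, PackedArgs and lowered_printf_call are hypothetical, and the stand-in only handles %s so that the example stays small.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical host-side stand-in for the device-side
// vprintf(format, packed_args). It understands only "%s" and
// returns the formatted text instead of printing it.
std::string fake_vprintf(const char *format, const void *packed_args) {
    assert(format != nullptr && packed_args != nullptr);
    const char *cursor = static_cast<const char *>(packed_args);
    std::string out;
    for (const char *p = format; *p != '\0'; ++p) {
        if (p[0] == '%' && p[1] == 's') {
            const char *s = nullptr;
            std::memcpy(&s, cursor, sizeof s);  // read the next packed argument
            cursor += sizeof s;
            out += s;
            ++p;  // skip the 's' of "%s"
        } else {
            out += *p;
        }
    }
    return out;
}

// Hypothetical argument buffer for printf("my name: %s\n", "asd").
struct PackedArgs {
    const char *arg0;
};

// Roughly what the printf call in the reproducer turns into when the
// builtin rewrite applies: no variadic call is left in the device code.
std::string lowered_printf_call() {
    PackedArgs args{"asd"};
    return fake_vprintf("my name: %s\n", &args);
}
```

Calling lowered_printf_call() yields the same text the kernel in the reproducer would print. With -fno-builtin-printf this rewrite is skipped, so the original variadic call to printf is what reaches the backend.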