I am facing a weird problem when compiling a Python extension featuring OpenMP with Clang.
Minimal Example
I managed to boil down my actual problem to the following code:
The Python extension could not be much simpler, while still featuring OpenMP.
Apart from the function bar, this is mostly standard boilerplate:
#include <Python.h>
/* METH_NOARGS functions still receive two arguments; the second is unused. */
static PyObject *bar(PyObject *self, PyObject *Py_UNUSED(ignored))
{
    #pragma omp parallel sections
    {
        #pragma omp section
        { float x = 42.0; x += 1; }
    }
    Py_RETURN_NONE;
}
static PyMethodDef foo_methods[] = {
    {"bar", (PyCFunction) bar, METH_NOARGS, NULL},
    {NULL, NULL, 0, NULL}
};
static struct PyModuleDef moduledef = {
    PyModuleDef_HEAD_INIT, "foo", NULL, -1,
    foo_methods, NULL, NULL, NULL, NULL
};
PyMODINIT_FUNC PyInit_foo(void)
{
    return PyModule_Create(&moduledef);
}
With the above being foo.c, I compile and load this with:
clang -fPIC -fopenmp -I/usr/include/python3.7m -c foo.c -o foo.o
clang -shared foo.o -o foo.so -lgomp
python3 -c "import foo"
The last line, i.e., the import of the module, throws the following error:
ImportError: /home/wrzlprmft/…/foo.so: undefined symbol: __kmpc_for_static_fini
What I found out so far
This does not happen when replacing Clang with GCC.
This does not happen with regular shared libraries (not involving Python).
Using Setuptools to compile the extension does not help.
(In fact, I obtained my compile commands by reducing what Setuptools does, to find out whether any of its non-essential compiler options cause this.)
All of this happens on Ubuntu 19.10 with Python 3.7, Clang 9.0.0-2, and GCC 9.2.1.
I can also replicate the problem on current Arch Linux with Python 3.8 and Clang 9.0.1.
This worked until a year ago, probably longer.
Using Python 3.6 does not help.
Using Clang 3.8, 4.0, 6.0, 7, or 8 does not help.
Here, somebody reports a similar problem when trying to compile TensorFlow.
That report is still unresolved.
Question
What is going wrong here and how can I fix this?
Right now, I do not even know whether the error is mine or lies in Clang, OpenMP, or Python.
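One observation that may narrow this down: symbols prefixed with __kmpc_ belong to the LLVM OpenMP runtime (libomp), whereas -lgomp links GCC's runtime, whose entry points are prefixed with GOMP_. The following is a minimal sketch for checking the link step without Python. The file name sections.c, the function run_sections, and the build commands in the comment are assumptions made for illustration, not part of the setup above.

```c
/* sections.c: Python-free variant of bar() for testing the OpenMP link step.
 *
 * Assumed build commands (hypothetical file names):
 *   clang -fPIC -fopenmp -c sections.c -o sections.o
 *   clang -shared -fopenmp sections.o -o libsections.so
 * Passing -fopenmp at the link step lets the compiler driver pick the
 * runtime that matches the code it generated, instead of hard-coding -lgomp.
 */

/* Runs two OpenMP sections and returns how many of them executed.
 * Without -fopenmp the pragmas are ignored and both blocks run serially,
 * so the result is 2 either way. */
int run_sections(void)
{
    int hits = 0;
    #pragma omp parallel sections reduction(+:hits)
    {
        #pragma omp section
        { hits += 1; }
        #pragma omp section
        { hits += 1; }
    }
    return hits;
}
```

If a library linked this way loads while the variant linked with -lgomp does not, a mismatch between the compiler and the OpenMP runtime is the likely cause. The same reasoning would suggest replacing -lgomp with -fopenmp in the second clang command for foo.so.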
The following short C example uses the standard C library and therefore requires the WASI SDK:
#include <stdio.h>
int main(void)
{
    puts("Hello");
    return 0;
}
Compiling the code directly to wasm with clang works without problems:
clang --target=wasm32-unknown-wasi -s -o example.wasm example.c
My understanding of the LLVM tool chain is that I could achieve the same result with either
clang -> LLVM IR (.ll) -> LLVM native object files (.o) -> convert to wasm
clang -> LLVM native object files (.o) -> convert to wasm
I am able to use the second approach with a simple C program that does not call the standard library. When I try it with the example above, I receive an undefined symbol error:
clang --target=wasm32-unknown-wasi -c example.c
wasm-ld example.o -o example.wasm --no-entry --export-all
wasm-ld: error: example.o: undefined symbol: puts
I do not know whether my problem is that I pass the wrong clang parameters and therefore do not export enough information, or whether the error is in the wasm-ld command.
I would be happy if someone could give me more insight into the toolchain. Thanks.
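For contrast, here is a sketch of the kind of program that the object-file route handles with the commands above: it calls nothing from libc, so wasm-ld has no external symbols to resolve. The file name checksum.c and the function hello_checksum are made up for illustration. As far as I understand, puts is defined in the WASI libc, which the clang driver adds to the link on its own, while a bare wasm-ld invocation is not told about it.

```c
/* checksum.c: freestanding example with no #include and no libc calls.
 *
 * Assumed build, mirroring the commands above (hypothetical file names):
 *   clang --target=wasm32-unknown-wasi -c checksum.c
 *   wasm-ld checksum.o -o checksum.wasm --no-entry --export-all
 */

/* Sums the bytes of "Hello" by hand instead of calling into libc. */
int hello_checksum(void)
{
    const char *s = "Hello";
    int sum = 0;
    while (*s != '\0') {
        sum += (unsigned char)*s;
        s++;
    }
    return sum;
}
```

A program that does call puts would additionally need the C library and startup object from the WASI sysroot on the wasm-ld command line, or the link step could be left to the clang driver.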
Setup
I have a simple helloworld program:
// content of main.c
#include <stdio.h>
#include <limits.h>
int main() {
    for (int i = 0; i < INT_MAX; ++i) {
        printf("simply helloworld!\n");
    }
    return 0;
}
I compile a baseline version with clang 13.0.0 using clang -flto=thin -fvisibility=hidden -fuse-ld=lld main.c
To experiment with CFI, I compile another version using clang -flto=thin -fsanitize=cfi -fsanitize-cfi-cross-dso -fno-sanitize-cfi-canonical-jump-tables -fsanitize-trap=cfi -fvisibility=hidden -fuse-ld=lld main.c
Expectation
I expect negligible performance overhead, as I am only calling into a shared library that I expect to run the same code for both binaries. The disassembly of the main function looks the same for both binaries.
Reality
The baseline version completes execution in ~27s while the CFI version completes execution in ~32s. Using perf stat -e instructions <binary>, I can see that the CFI version runs ~100,000,000,000 more instructions. With perf record followed by perf diff, I can see that the difference is primarily in two functions, _pthread_cleanup_push_defer and _pthread_cleanup_pop_restore, which the CFI version runs. According to gdb, these functions are called deeper in the call stack of printf.
Question
How do I begin to explain the performance difference between these two binaries? What makes a simple call to printf call two different versions of itself for two different binaries?
I am experimenting with the libFuzzer library and going through the toy example [1].
keep-learnings-MacBook-Pro:Ccodeanalysis keep_learning$ cat Fuzzme.cpp
#include <stdint.h>
#include <stddef.h>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size > 0 && data[0] == 'H')
        if (size > 1 && data[1] == 'I')
            if (size > 2 && data[2] == '!')
                __builtin_trap();
    return 0;
}
keep-learnings-MacBook-Pro:Ccodeanalysis keep_learning$ clang++ -fsanitize=address,fuzzer Fuzzme.cpp
ld: file not found: /Library/Developer/CommandLineTools/usr/lib/clang/10.0.1/lib/darwin/libclang_rt.fuzzer_osx.a
clang: error: linker command failed with exit code 1 (use -v to see invocation)
keep-learnings-MacBook-Pro:Ccodeanalysis keep_learning$ clang++ --version
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
A quick Google search turned up [2], but other than that I could not find any meaningful information to resolve this, hence my post here. Could someone please tell me how to solve this? Thanks in advance.
[1] http://llvm.org/docs/LibFuzzer.html#toy-example
[2] https://bugs.llvm.org/show_bug.cgi?id=39794
As you have noticed, there is no fuzzer runtime shipped with the Apple developer tools. So you can either report this issue to Apple or build the runtime library yourself from the sources (or both).
As Anton stated, the Apple Developer Tools do not include the fuzzer library, leaving you to compile it from source or ask Apple.
It turns out LLVM also hosts pre-compiled binaries for some releases on their downloads page:
https://releases.llvm.org/download.html.
On that page, find your LLVM version (e.g. "Download LLVM 10.0.0"), and scroll a bit further until you see Pre-Built Binaries. Don't see binaries for your LLVM version? Pick the nearest lower version. The OP and I both have clang++ 10.0.1, so we'd pick 10.0.0.
Click the macOS link to download, pop into the Terminal to untar and copy the libraries, and you're done. I did it with a few environment variables (those paths are killer!), and a cp -n to preserve existing files.
export CLANG_ROOT=clang+llvm-10.0.0-x86_64-apple-darwin/lib/clang/10.0.0
export XCODE_ROOT=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.1
tar xvf clang+llvm-10.0.0-x86_64-apple-darwin.tar.xz $CLANG_ROOT/include/fuzzer $CLANG_ROOT/lib/darwin
sudo cp -rn $CLANG_ROOT/include/fuzzer $XCODE_ROOT/include
sudo cp -n $CLANG_ROOT/lib/darwin/* $XCODE_ROOT/lib/darwin
I did exactly the above, and my code compiled and linked right away.
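Until the runtime library is in place, the fuzz target itself can still be compiled and exercised on fixed inputs without -fsanitize=fuzzer. The following is only a sketch in plain C: run_one_input is a hypothetical helper, and this does not replace the fuzzer, it merely calls the target once per input.

```c
#include <stddef.h>
#include <stdint.h>

/* Same logic as the toy target from the question, written as plain C
 * (so no extern "C" is needed). Traps only on an input starting with "HI!". */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    if (size > 0 && data[0] == 'H')
        if (size > 1 && data[1] == 'I')
            if (size > 2 && data[2] == '!')
                __builtin_trap();
    return 0;
}

/* Hypothetical helper: feeds one in-memory input to the target and
 * returns the target's result (0 for any input that does not trap). */
int run_one_input(const char *bytes, size_t len)
{
    return LLVMFuzzerTestOneInput((const uint8_t *)bytes, len);
}
```

Calling run_one_input("HI", 2) returns 0, whereas run_one_input("HI!", 3) aborts the process through __builtin_trap, which is the crash the fuzzer is meant to discover on its own.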
I am trying to compile two *.c files to LLVM bitcode via clang, link them together using llvm-link, and make a single *.wasm file out of the result. I built LLVM on my machine via the Makefile provided by https://github.com/yurydelendik/wasmception.
This works fine until I use memcpy in the C code. Then llvm-link stops with the error:
Intrinsic has incorrect argument type!
void (i8*, i8*, i32, i1)* @llvm.memcpy.p0i8.p0i8.i32
The following is a minimal example to reproduce the issue:
one.c
#define EXPORT __attribute__((visibility("default")))
#include <string.h>
char* some_str();
EXPORT void do_something() {
    char* cpy_src = some_str();
    char other_str[15];
    memcpy(other_str, cpy_src, strlen(cpy_src));
}
two.c
char* some_str() {
    return "Hello World";
}
Execute the following commands:
$ clang --target=wasm32-unknown-unknown-wasm --sysroot=../wasmception/sysroot -S -emit-llvm -nostartfiles -fvisibility=hidden one.c -o one.bc
[...]
$ clang --target=wasm32-unknown-unknown-wasm --sysroot=../wasmception/sysroot -S -emit-llvm -nostartfiles -fvisibility=hidden two.c -o two.bc
[...]
Note that no optimization is done because that would eliminate the unnecessary memcpy call here. As I said, this is a minimal example out of context to show the error.
$ llvm-link one.bc two.bc -o res.bc -v
Loading 'one.bc'
Linking in 'one.bc'
Loading 'two.bc'
Linking in 'two.bc'
Intrinsic has incorrect argument type!
void (i8*, i8*, i32, i1)* @llvm.memcpy.p0i8.p0i8.i32
llvm-link: error: linked module is broken!
When I comment out the memcpy call in the example file, the error is gone. Of course, this is not an option in the real project I am working on.
Am I doing something wrong? Is it a bad idea in general to use memcpy in a WebAssembly context? Can this be a bug in LLVM/Clang?
Reading through these GitHub issues, it seems the memcpy intrinsic is not currently supported by the WebAssembly backend:
https://github.com/WebAssembly/design/issues/236
https://github.com/WebAssembly/design/issues/1003
As a workaround, you could instruct clang to disable intrinsic expansion using -fno-builtin, so that the generated code will call the actual memcpy function.
Disclaimer: using -nostdinc isn't possible because it excludes needed system headers. Here, instead, the problem seems to be that the compiler ignores my -I directives.
I have installed a library (OpenCV) in ~/local on a remote machine, since I don't have sudo access there. Note that an older version of the same library is installed in /usr/local.
I'm trying to compile this code:
g++ -DCC_DISABLE_CUDA -I/home/spm1428/CloudCache -I/home/spm1428/local/include/opencv -I/home/spm1428/local/include/opencv2 -I/usr/include/boost -I/home/spm1428/vlfeat -O3 -g -Wall -c -fopenmp -std=c++11 -c -o Descriptor.o ../Descriptors/Descriptor.cpp
However, the returned error is:
In file included from /usr/local/include/opencv2/opencv.hpp:77:0,
from /home/spm1428/CloudCache/Utilities/Utility.hpp:11,
from ../Descriptors/Descriptor.cpp:17:
/usr/local/include/opencv2/highgui/highgui.hpp:165:25: error: redeclaration of ‘IMREAD_UNCHANGED’
IMREAD_UNCHANGED =-1,
^
In file included from ../Descriptors/Descriptor.cpp:13:0:
/home/spm1428/local/include/opencv2/imgcodecs.hpp:65:8: note: previous declaration ‘cv::ImreadModes IMREAD_UNCHANGED’
IMREAD_UNCHANGED = -1, //!< If set, return the loaded image as is (with alpha channel,
^
In file included from /usr/local/include/opencv2/opencv.hpp:77:0,
from /home/spm1428/CloudCache/Utilities/Utility.hpp:11,
from ../Descriptors/Descriptor.cpp:17:
/usr/local/include/opencv2/highgui/highgui.hpp:167:24: error: redeclaration of ‘IMREAD_GRAYSCALE’
IMREAD_GRAYSCALE =0,
I think that this happens because there is another version installed. How can I solve this?
I think this error happens for the same reason (the old version doesn't have cv::xfeatures2d::SURF).