I am trying to add labels to C source code (for instrumentation). I have only a little experience with assembly, and my compiler is clang. I am seeing strange behavior with __asm__ and labels inside case statements.
Here is what I have tried:
// Compiles successfully.
int main()
{
volatile unsigned long long a = 3;
switch(8UL)
{
case 1UL:
//lbl:;
__asm__ ("movb %%gs:%1,%0": "=q" (a): "m" (a));
a++;
}
return 0;
}
And this:
// Compiles successfully.
int main()
{
volatile unsigned long long a = 3;
switch(8UL)
{
case 1UL:
lbl:;
//__asm__ ("movb %%gs:%1,%0": "=q" (a): "m" (a));
a++;
}
return 0;
}
command:
clang -c examples/a.c
examples/a.c:5:14: warning: no case matching constant switch condition '8'
switch(8UL)
^~~
1 warning generated.
But this does not:
// Does not compile.
int main()
{
volatile unsigned long long a = 3;
switch(8UL)
{
case 1UL:
lbl:;
__asm__ ("movb %%gs:%1,%0": "=q" (a): "m" (a));
a++;
}
return 0;
}
the error:
^~~
examples/a.c:9:22: error: invalid operand for instruction
__asm__ ("movb %%gs:%1,%0": "=q" (a): "m" (a));
^
<inline asm>:1:21: note: instantiated into assembly here
movb %gs:-16(%rbp),%rax
^~~~
1 warning and 1 error generated.
I am using:
clang --version
clang version 9.0.0-2 (tags/RELEASE_900/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Important: this compiles successfully with gcc.
gcc --version
gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I am working on 64-bit Ubuntu 19.
Any help would be appreciated.
EDIT
Based on the accepted answer below:
The __asm__ statement itself causes the error (mismatched operand sizes).
The __asm__ statement is unreachable.
Adding a label to the statement makes it reachable.
GCC ignores it.
Clang does not ignore it.
movb has 8-bit operand size, but %rax is 64-bit because you used unsigned long long. Either use plain mov to do a load of the same width as the output variable, or use movzbl %%gs:%1, %k0 to zero-extend to 64-bit. (movzbl zero-extends explicitly to 32-bit, and writing the 32-bit low half of a 64-bit register implicitly zero-extends to 64-bit; the k modifier in %k0 selects that 32-bit low half.)
Surprised GCC doesn't reject that as well; maybe GCC removes it as dead code because of the unreachable case in switch(8). If you look at GCC's asm output, it probably doesn't contain that instruction.
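As a rough illustration of the movzbl / %k0 suggestion, here is a minimal sketch. The function name load_byte_zx is made up for this example, and the %gs segment override from the question is deliberately left out so the snippet can run in an ordinary process; on non-x86-64 targets it falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: load one byte from *p and zero-extend it to 64 bits.
   Assumption: the %gs: override from the question is omitted here. */
static uint64_t load_byte_zx(const uint8_t *p)
{
#if defined(__x86_64__)
    uint64_t out;
    /* %k0 names the 32-bit low half of the output register (e.g. %eax).
       Writing a 32-bit register zeroes the upper 32 bits, so the 8-bit
       load ends up zero-extended to the full 64-bit variable. */
    __asm__ ("movzbl %1, %k0" : "=r" (out) : "m" (*p));
    return out;
#else
    /* Portable fallback: the integer conversion zero-extends. */
    return *p;
#endif
}
```

With the operand widths matched this way, the assembler no longer sees an 8-bit mnemonic paired with a 64-bit register, which is what clang was rejecting.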
I need to create an OpenCL application that instruments the code of the OpenCL kernel that it receives as input, for some exotic profiling purposes (haven't found what I need, so I need/want to do it myself).
I want to compile the kernel to an intermediate representation (LLVM-IR right now), instrument it (using the LLVM C++ bindings), transpile the instrumented code to SPIR-V and then create a kernel in the hostcode with clCreateProgramWithIL().
For now, I am just compiling a simple OpenCL kernel that adds 2 vectors, without instrumentation:
__kernel void vadd(
__global float* a,
__global float* b,
__global float* c,
const unsigned int count)
{
int i = get_global_id(0);
if(i < count) c[i] = a[i] + b[i];
}
For compiling the above to LLVM IR, I use the following command:
clang -c -emit-llvm -include libclc/generic/include/clc/clc.h -I libclc/generic/include/ vadd.cl -o vadd.bc -emit-llvm -O0 -x cl
Afterwards, I transpile vadd.bc to vadd.spv with the llvm-spirv tool.
Finally, I try building a kernel from the C hostcode like this:
...
cl_program program = clCreateProgramWithIL(context, binary_data->data, binary_data->size, &err);
err = clBuildProgram(program, 1, &device_id, NULL, NULL, NULL);
...
After running the hostcode, I receive the following error from the clBuildProgram call:
CL_BUILD_PROGRAM_FAILURE
error: undefined reference to `get_global_id()'
error: backend compiler failed build.
It seems that the vadd.spv file is not linked against the OpenCL kernel library. Any idea how to achieve this?
I want to slice away the unused variables shown below using Frama-C, but I don't know which command line would remove all of them in a single invocation.
Last login: Thu Nov 9 20:48:42 on ttys000
Recep-MacBook-Pro:~ recepinanir$ cd desktop
Recep-MacBook-Pro:desktop recepinanir$ cat hw.c
#include <stdio.h>
int main()
{
int x= 10;
int y= 24;
int z;
printf("Hello World\n");
return 0;
}
Recep-MacBook-Pro:desktop recepinanir$ clang hw.c
Recep-MacBook-Pro:desktop recepinanir$ ./a.out
Hello World
Recep-MacBook-Pro:desktop recepinanir$ clang -Wall hw.c -o result
hw.c:5:9: warning: unused variable 'x' [-Wunused-variable]
int x= 10;
^
hw.c:6:9: warning: unused variable 'y' [-Wunused-variable]
int y= 24;
^
hw.c:7:9: warning: unused variable 'z' [-Wunused-variable]
int z;
^
3 warnings generated.
Recep-MacBook-Pro:desktop recepinanir$
As mentioned on https://frama-c.com/slicing.html, slicing is always relative to some criterion, and the goal is to produce a program that is smaller than the original one while presenting the same behavior with respect to that criterion. The Slicing plug-in itself gives several ways to build such criteria, but it seems that you are interested in the result of the Sparecode plug-in (https://frama-c.com/sparecode.html): this is a specialized version of slicing, where the criterion is the program state at the end of the entry point of your analysis (i.e. main in your case). In other words, Sparecode will remove everything that does not contribute to the final result of the code under analysis.
In your case, frama-c -sparecode-analysis hw.c gives the result below. Note that the call to printf has been modified by the Variadic plug-in, and that its argument is not considered useful for the final state of main. If this is an issue, you'd need to provide more specialized output functions, with an ACSL specification indicating that they have an impact on some global variable.
/* Generated by Frama-C */
#include "stdio.h"
/*@ assigns \result, __fc_stdout->__fc_FILE_data;
assigns \result
\from (indirect: __fc_stdout->__fc_FILE_id),
__fc_stdout->__fc_FILE_data;
assigns __fc_stdout->__fc_FILE_data
\from (indirect: __fc_stdout->__fc_FILE_id),
__fc_stdout->__fc_FILE_data;
*/
int printf_va_1(void);
int main(void)
{
int __retres;
printf_va_1();
__retres = 0;
return __retres;
}
Finally, note that in the general case, Slicing (hence Sparecode) gives an overapproximation: it will only remove statements for which it is certain that they have no impact on the criterion.
I have been writing a code in FORTRAN but I am having problems using the lapack dsyevr:
http://netlib.sandia.gov/lapack/double/dsyevr.f
The problems I am getting seem to be linked to memory allocation issues, specifically I believe to do with the output arrays the dsyevr produces (including A which is an input and an output).
I have tried to write a simplified code to demonstrate the issues I am seeing. Please let me know if any of it needs clarification. The code is named prog1.f90 and calls the dsyevr routine:
PROGRAM prog1
implicit none
real(kind=8), allocatable :: W(:)
real(kind=8), allocatable :: Z(:,:)
real(kind=8), allocatable :: A(:,:)
integer(kind=8) :: n, info, il, iu, m, lwork, liwork
integer(kind=8) :: i, k, p, q, nu
real(kind=8) :: abstol, vl, vu
real(kind=8), allocatable :: work(:)
integer, allocatable :: isuppz(:), iwork(:)
n = 3
allocate(W(3),Z(3,3),A(n,n),stat=info)
if (info .ne. 0) stop "error allocating arrays"
A(1,1)=3.78136524999999994E-003
A(1,2)=0.0000000000000000
A(1,3)=-7.92918150000000038E-004
A(2,1)=0.0000000000000000
A(2,2)=5.20293929999999984E-003
A(2,3)=0.0000000000000000
A(3,1)=-7.92918150000000038E-004
A(3,2)=0.0000000000000000
A(3,3)=3.78136524999999994E-003
vl = 1.06451084056294826E-313
vu = 0.0
il = 4294967297
iu = 8839891
m = 140733655445712
W(1) = 2.98844710000000001E-003
W(2) = 4.57428340000000030E-003
W(3) = 5.20293929999999984E-003
Z(1,1) = 8.65587596665713699E-317
Z(1,2) = 8.65587596665713699E-317
Z(1,3) = 1.58101006669198894E-322
Z(2,1) = 1.58101006669198894E-322
Z(2,2) = 0.0000000000000000
Z(2,3) = 8.65569415049946741E-317
Z(3,1) = 4.24400777097956191E-314
Z(3,2) = 4.79243676466009148E-322
Z(3,3) = 3.51391740150311405E-316
lwork = -1
liwork = -1
abstol = 1d-5
allocate(work(1),iwork(1),isuppz(6))
call dsyevr('V','A','U',n,A,n,vl,vu,il,iu,abstol,m,W,Z,n,isuppz,work,lwork,iwork,liwork,info)
if (info .ne. 0) stop "error obtaining work array dimensions"
lwork = work(1)
liwork = iwork(1)
deallocate(work,iwork)
allocate(work(lwork),iwork(liwork),stat=info)
if (info .ne. 0) stop "error allocating work arrays"
call dsyevr('V','A','U',n,A,n,vl,vu,il,iu,abstol,m,W,Z,n,isuppz,work,lwork,iwork,liwork,info)
if (info .ne. 0) stop "error diagonalizing the hamiltonian"
deallocate(A,work,iwork,isuppz)
END PROGRAM prog1
In the code above, dsyevr is called twice. The first call, which only queries the required dimensions of the work arrays, runs correctly; the second call fails with the following error:
*** glibc detected *** ./PROGRAM: munmap_chunk(): invalid pointer: 0x000000000134cc20 ***
There is also a Backtrace and MemoryMap which I can provide.
If it is useful the makefile I have been using is given below. The program is created using the line:
make PROGRAM
Makefile:
FC = gfortran
FCFLAGS = -g -fbounds-check
FCFLAGS = -O2
FCFLAGS += -I/usr/include
%: %.o
$(FC) $(FCFLAGS) -o $@ $^ $(LDFLAGS)
%.o: %.f90
$(FC) $(FCFLAGS) -c $< -fno-range-check
%.o: %.F90
$(FC) $(FCFLAGS) -c $<
.PHONY: clean veryclean
PROGRAM: prog1.f90 prog1.o
$(FC) $(FCFLAGS) -o $@ prog1.o $(LIBS) -Wl,--start-group -L$(MKLROOT)/lib/intel64 -lmkl_gf_ilp64 -lmkl_core -lmkl_sequential -Wl,--end-group -lpthread
clean:
rm -f *.o *.mod *.MOD
veryclean: clean
rm -f *~ $(PROGRAM)
where $MKLROOT is /opt/intel/composer_xe_2011_sp1.8.273/mkl
If it is useful I have used valgrind:
valgrind --tool=memcheck --db-attach=yes ./PROGRAM
And I have found the following error:
==31069==
==31069== Invalid write of size 8
==31069== at 0x57B9F92: mkl_lapack_dsyevr (in /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/libmkl_core.so)
==31069== by 0x4D9A580: DSYEVR (in /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/libmkl_gf_ilp64.so)
==31069== by 0x401163: MAIN__ (in /home/j/workbook/Test4/PROGRAM)
==31069== by 0x401F09: main (in /home/j/workbook/Test4/PROGRAM)
==31069== Address 0x6afe0b0 is 0 bytes inside a block of size 4 alloc'd
==31069== at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==31069== by 0x400FE4: MAIN__ (in /home/j/workbook/Test4/PROGRAM)
==31069== by 0x401F09: main (in /home/j/workbook/Test4/PROGRAM)
==31069==
I am not sure whether the line "Invalid write of size 8" refers to the size of some of the integers or reals (as mentioned by Jonathan Dursi) or to the size of an array being passed into dsyevr.
It's not obvious from the documentation, but isuppz has to be allocated, even for that initial call where lwork = -1. If you move the isuppz allocation to before the first call your code completes successfully.
After that, since you're using the ILP64 (8-byte integer) version of MKL, i.e. LAPACK with 8-byte integer indices, all of the integer arguments you pass to the LAPACK routines have to be of the same 8-byte kind, so you'll need
integer(kind=8), allocatable :: isuppz(:), iwork(:)
I'll note that kind=8 is not actually part of the standard for either real or integer, and you should really use either selected_int_kind/selected_real_kind or the iso_fortran_env module with int64, real64, etc.
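To make the size mismatch concrete, here is a small C sketch of the arithmetic behind the valgrind message ("Invalid write of size 8" into "a block of size 4"). It assumes gfortran's default integer is 4 bytes, which is why iwork(1) declared as plain integer occupies 4 bytes while the ILP64 library stores an 8-byte integer there. The helper names are invented for this illustration and are not part of MKL or LAPACK.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes occupied by n integers of the given width. */
static size_t int_array_bytes(size_t n, size_t int_width)
{
    return n * int_width;
}

/* Hypothetical helper: returns 1 if a buffer of n integers as the caller
   declared them is large enough for n integers as the library writes them,
   and 0 otherwise. */
static int buffer_fits(size_t n, size_t caller_width, size_t library_width)
{
    return int_array_bytes(n, caller_width) >= int_array_bytes(n, library_width);
}
```

For example, buffer_fits(1, sizeof(int32_t), sizeof(int64_t)) is 0, matching the failing case, while declaring the arrays with the 8-byte kind corresponds to buffer_fits(1, sizeof(int64_t), sizeof(int64_t)), which is 1.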
I have an AArch64 NEON function defined in an assembly .s file. It already compiles and runs fine, but to improve code readability I'd like to use register aliases with the .req assembler directive. However, when I try to do so, clang fails with error: unexpected token in argument list
To keep the example simple consider this code:
.section __TEXT,__text,regular,pure_instructions
.globl _foo
.align 2
_foo:
add w0, w0, w1
ret lr
This code compiles and runs, but if I try to use
myreg .req w0
before the add instruction, I get
/Users/dfcamara/Desktop/MyApp/MyApp/foo.s:5:16: error: unexpected token in argument list
myreg .req w0
^
Maybe I need some clang directive or compiler option that I'm not aware of; I can't find documentation about them. I just created a new iOS (iPad) project and added an assembly file.
Thanks