Is it possible to use clang to produce RISC V assembly without linking? - clang

I am trying to learn more about compilers, and RISC V assembly was specifically designed to be easy to learn and teach. I am interested in compiling some simple C code to assembly using clang in order to understand its semantics. I'm planning on using venus to step through the assembly, so the source code does NOT need to be fully compiled to machine code that runs on a real machine.
I want to avoid compiler optimizations so I can see what I've actually instructed the processor to do.
I don't actually need the program to compile to machine code; I just want the assembly.
I don't want to worry about linking to the system library, because this code doesn't actually need to run.
The code does not make any explicit use of system calls, so I think a standard library should not be required.
This answer seems to indicate that clang definitely can compile to RISC V targets, but it requires having a version of the OS's standard library built for RISC V.
This answer indicates that some form of cross-compiling is necessary, but again I don't need to fully compile the code to machine instructions so this should not apply if I'm understanding correctly.

Use clang -S to stop after generating an assembly file:
$ cat foo.c
int main() { return 2+2; }
$ clang -target riscv64 -S foo.c
$ cat foo.s
.text
.attribute 4, 16
.attribute 5, "rv64i2p0_m2p0_a2p0_c2p0"
.file "foo.c"
.globl main
.p2align 1
.type main,@function
main:
addi sp, sp, -32
sd ra, 24(sp)
sd s0, 16(sp)
addi s0, sp, 32
li a0, 0
sw a0, -20(s0)
li a0, 4
ld ra, 24(sp)
ld s0, 16(sp)
addi sp, sp, 32
ret
.Lfunc_end0:
.size main, .Lfunc_end0-main
.ident "Ubuntu clang version 14.0.0-1ubuntu1"
.section ".note.GNU-stack","",@progbits
.addrsig
You can also use Compiler Explorer conveniently online.
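As a slightly larger input for the same workflow, here is a sketch of a C file with a helper call and a loop. The file name sum.c and both function names are made up for illustration. The file has no main and includes no headers, so nothing from a standard library is needed when clang stops after -S. Assuming the same setup as above, something like clang -target riscv64 -S -O0 sum.c should write the unoptimized assembly to sum.s.

```c
/* sum.c: hypothetical example input, not part of the original answer. */

/* Adds two ints. Without optimization the arguments are typically
 * stored to the stack frame and reloaded before the add. */
int add(int a, int b) {
    return a + b;
}

/* Sums 1..n with a plain loop, so the loop's compare and branch
 * instructions are easy to find in the generated assembly. */
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total = add(total, i);
    }
    return total;
}
```

Keeping each function tiny makes it easier to match the prologue, body, and epilogue in the output to the source lines.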

Related

LLVM ARM64 assembly: Getting a symbol/label address?

I am porting some 32-bit ARM code to 64-bit and am having trouble determining the 64-bit version of this instruction:
ldr r1, =_fns
Where _fns is a symbol defined in some C source file elsewhere in the project.
I've tried the below but both give errors:
adr x1, _fns <== "error: unknown AArch64 fixup kind!"
adrl x1, _fns <== "error: unrecognized instruction mnemonic"
The assembler is LLVM in the iOS SDK (XCode 7.1).
I've noticed that if _fns is locally defined (i.e. in the same .S file) then "adr x1,_fns" works fine. However that's not a fix as _fns has to be in C code (i.e. in a different translation unit).
What is the right way to do this with LLVM ARM assembler?
If I feed
extern char ar[];
char *f()
{
return ar;
}
into the ELLCC (clang based) demo, I get:
Output from the compiler targeting ARM AArch64
.text
.file "/tmp/webcompile/_3793_0.c"
.globl f
.align 2
.type f,@function
f: // @f
// BB#0: // %entry
adrp x0, ar
add x0, x0, :lo12:ar
ret
.Lfunc_end0:
.size f, .Lfunc_end0-f
.ident "ecc 0.1.13 based on clang version 3.7.0 (trunk) (based on LLVM 3.7.0svn)"
.section ".note.GNU-stack","",@progbits
The adrp instruction gets the "page" address of ar into x0. The symbol argument to adrp translates into a 21 bit PC relative offset to the 4K page in which the symbol resides. That offset is added to the PC to get the actual start of the page. The add instruction adds the low 12 bits of the symbol address to get the actual symbol address.
This instruction sequence allows the address of a symbol within +/-4GB of the PC to be loaded.
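The arithmetic behind the adrp/add pair can be modeled in plain C. This is only a sketch of the address calculation described above, not of how an assembler or linker is implemented; the function names are invented for the example.

```c
#include <stdint.h>

/* Signed page distance from the page containing pc to the page containing sym.
 * This is the value that would be encoded in adrp's 21-bit immediate. */
int64_t page_delta_for(uint64_t pc, uint64_t sym) {
    return (int64_t)(sym >> 12) - (int64_t)(pc >> 12);
}

/* Models adrp: clear the low 12 bits of pc, then add page_delta * 4 KiB. */
uint64_t adrp_result(uint64_t pc, int64_t page_delta) {
    return (pc & ~UINT64_C(0xFFF)) + (uint64_t)(page_delta * 4096);
}

/* Models the :lo12: operator: the symbol's offset within its 4 KiB page. */
uint64_t lo12(uint64_t sym) {
    return sym & UINT64_C(0xFFF);
}

/* Models the full sequence: adrp x0, sym ; add x0, x0, :lo12:sym */
uint64_t materialize(uint64_t pc, uint64_t sym) {
    return adrp_result(pc, page_delta_for(pc, sym)) + lo12(sym);
}
```

For any pc and sym whose pages are within the 21-bit signed range of each other, materialize(pc, sym) gives back sym, which is why the two-instruction sequence is position independent.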
As far as I can tell, there doesn't seem to be a way of getting functionality similar to 32-bit ARM's "=ar" in C. In assembly language, it looks like this will work:
.text
.file "atest.s"
.globl f
.align 2
f:
ldr x0, p
ret
.align 3
p:
.xword _fns
This is very similar to what the 32 bit ARM does under the hood.
The only reason I started out with the C version was to show how I usually attack a problem like this, especially if I'm not that familiar with the target assembly language.
This works well for me with Xcode 7.1, LLVM 7.0, iOS 9.1.
In 32-bit ARM:
ldr r9,=JumpTab
Change to 64-bit ARM:
adrp x9,JumpTab@PAGE
add x9,x9,JumpTab@PAGEOFF
By the way, be careful which registers you use; some registers have special purposes on arm64.

Linking problems when creating a library for iOS 7

I get a linking problem when creating a library for iOS 7 on iPhone (ARM64).
The error message is:
ld: in /long_path/libHEVCCodec.a(inv_xforms_arm64.o), in section __TEXT,__text reloc 0:
ARM64_RELOC_SUBTRACTOR must have r_length of 2 or 3 for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
This error appears as a result of this code (it's some sort of switch):
adr addr, .L.dct_add_switch
ldrh offset, [addr, ta, lsl #1]
add addr, addr, offset, uxth
br addr
.L.dct_add_switch:
.hword .L.dct_add_4 - .L.dct_add_switch
.hword .L.dst_add_4 - .L.dct_add_switch
...
ta, addr, offset are general registers x3, x4, w5 respectively.
Does anybody know how to handle this situation?
PS: there are no problems with GNU GCC & Android.
EDIT1:
It seems that the problem is not in the linker itself but in the compiler.
I checked the object file (objdump), and instead of the difference constants there are just zeros.
.L.dct_add_switch:
0000000000000010 .long 0x00000000
0000000000000014 .long 0x00000000
0000000000000018 .long 0x00000000
000000000000001c nop
When I manually put in calculated constants instead of the ".L.dct_add_4 - .L.dct_add_switch", etc. expressions, everything works.
Maybe there are some compiler flags that will make the compiler do its job properly?
Thanks.
Well, there is a compiler and linker problem, and it depends on the size of the data used for the offsets. Clang is not very friendly to anything other than 4 bytes.
The discussion and possible solutions in other topic: creating constant jump table; xcode; clang; asm
The problem is the Mach-O object file format for ARM 64-bit targets doesn't support a relocation for the 16-bit difference between two symbols. It appears that the difference must be 32-bit or 64-bit. It doesn't seem to be a problem with the compiler or the linker. The assembly code you've quoted in your question looks like handcrafted assembly, not compiler output.
The solution would be to rewrite the assembly to use 32-bit difference values. Something like this:
adr addr, .L.dct_add_switch
ldr offset, [addr, ta, lsl #2]
add addr, addr, offset, uxtw
br addr
.L.dct_add_switch:
.word .L.dct_add_4 - .L.dct_add_switch
.word .L.dst_add_4 - .L.dct_add_switch
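The dispatch that this rewritten table performs can be sketched in C. The addresses below are hypothetical stand-ins for the labels, chosen only to make the arithmetic visible; nothing here comes from the original code.

```c
#include <stdint.h>

/* Hypothetical label addresses standing in for .L.dct_add_switch
 * and its two branch targets. */
enum { TABLE_BASE = 0x1000, TARGET_A = 0x1040, TARGET_B = 0x10A0 };

/* 32-bit differences, as the .word entries would hold them. */
static const uint32_t offsets[] = {
    TARGET_A - TABLE_BASE,
    TARGET_B - TABLE_BASE,
};

/* Mirrors: ldr offset, [addr, ta, lsl #2]  (index scaled by 4 bytes)
 *          add addr, addr, offset, uxtw    (zero-extend, then add)   */
uint64_t dispatch_target(uint64_t table_addr, uint32_t index) {
    uint32_t offset = offsets[index];      /* array indexing does the lsl #2 scaling */
    return table_addr + (uint64_t)offset;  /* the cast plays the role of uxtw */
}
```

Because uxtw zero-extends the loaded value, this form only works when every target lies after the table; a table with negative differences would need a sign-extending add (sxtw) instead.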

Rename ARM assembly register in XCode 5.1 / LLVM / clang

I have an AArch64 NEON function defined in an assembly .s file. It's already compiling and running fine, but to improve code readability I'd like to use register aliases with the .req assembler directive. However, when I try to do that, clang fails with "error: unexpected token in argument list".
To keep the example simple consider this code:
.section __TEXT,__text,regular,pure_instructions
.globl _foo
.align 2
_foo:
add w0, w0, w1
ret lr
This code compiles and runs, but if I try to use
myreg .req w0
before the add instruction, I get
/Users/dfcamara/Desktop/MyApp/MyApp/foo.s:5:16: error: unexpected token in argument list
myreg .req w0
^
Maybe I need some clang directive or compiler option that I'm not aware of; I can't find documentation about them. I just created a new iOS (iPad) project and added an assembly file.
Thanks

Compiling ARM .s for iPhone on Mac / Windows

I am new to this and here is what I want to do -
Create simple programs - loops, counters etc. using ARMv7 assembly
I want to be able to compile them on the Mac / Win / Linux for running on the iPhone
I have a jailbroken iPhone so I can upload the file there, sign it with ldid and run it
Can someone please point me to how I can do this with freely available tools?
Thanks!
I'm left wondering what you're trying to achieve here - is it to learn developing in ARM assembler? Do you want to write iOS applications?
No sane person writes complete applications in assembler these days - they use high level languages - and in a few restricted cases optimize in assembler. This is a very specialist and useful skill to have.
Using a complete C program as a surrogate host is a good way to start. Create yourself a simple Hello World program in C which calls an (almost) empty function.
You can (mostly) get this to work using Xcode (you need to install the optional command-line tools). All but the final linking stage for ARM works using clang. This is obviously Mac OS X only.
A better alternative for this kind of experimentation is an ARM Linux system where you're not fighting against the locked down environment of iOS. The Raspberry Pi is perfect for the job. You'll need a cross-compiling toolchain for ARMv7 - of which there are plenty. If using Ubuntu, there are pre-built packages readily available.
main.c
#include <stdio.h>
extern void func(void);
int main(void)
{
    printf("Hello World\n");
    func();
    return 0;
}
and func.c
#include <stdio.h>
void func()
{
printf("In func()\n");
}
Compile both for your host environment and run it to see it works:
gcc main.c func.c
./a.out
Now compile for your target environment. The precise name of the cross-compiling tools varies depending on what you installed (mine is arm-angstrom-linux-gnueabi-gcc).
arm-angstrom-linux-gnueabi-gcc main.c func.c
Copy to your target, and prove it works.
Now you can start to write some assembler. Get gcc to produce ARM assembler for our victim file func.c - this results in a file func.s
arm-angstrom-linux-gnueabi-gcc func.c -S
:
.cpu arm7tdmi-s
.fpu softvfp
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 6
.eabi_attribute 18, 4
.file "func.c"
.section .rodata
.align 2
.LC0:
.ascii "In func()\000"
.text
.align 2
.global func
.type func, %function
func:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
stmfd sp!, {fp, lr}
add fp, sp, #4
ldr r0, .L2
bl puts
sub sp, fp, #4
ldmfd sp!, {fp, lr}
bx lr
.L3:
.align 2
.L2:
.word .LC0
.size func, .-func
.ident "GCC: (GNU) 4.5.4 20120305 (prerelease)"
.section .note.GNU-stack,"",%progbits
You can see here that between label func: and .L3 is the business end of func() - and it's almost all function prologue and epilogue. You'll want to check out the ARM Procedure Call Standard to understand what these are and for guidance on which registers to use.
Once you've done your edits, compile the whole thing again with GCC
arm-angstrom-linux-gnueabi-gcc main.c func.s
...and test it.

How to Switch Xcode 4.2's iOS disassembly from Thumb to ARM?

My iOS app is built with the Apple LLVM 3.0 compiler in Thumb mode. For armv7, I'm pretty sure that's actually Thumb-2.
I'm reimplementing my two most time-consuming functions in ARM assembly code. The callers of these functions are Thumb, so I use Thumb to ARM interworking instructions to switch to ARM in the prologue of my functions so I have access to ARM's richer instruction set and greater number of registers. At function exit I use ARM to Thumb interworking to return to Thumb mode.
GDB's disassembly is correct for the Thumb code, but when I am in ARM mode, it disassembles the ARM instructions as if each one were a pair of completely nonsensical Thumb instructions. Is there some way I can tell GDB to switch to ARM disassembly, then upon returning to Thumb code, use the Thumb disassembler?
Google is no help. There are apparently other forks of GDB that can do that, but I haven't figured out a way to do it with GDB.
LLDB apparently supports ARM debugging, but it does not yet work on iOS devices in Xcode 4.2. When I choose the LLDB debugger in Product -> Edit Scheme, then set a breakpoint in my code, my App hangs before hitting the breakpoint.
It's been a long time since I've done any assembly of any sort, so I am brushing up on the ARM calling conventions by implementing functions that take various parameters and return various types of results in both C and ARM assembly. The lower_case functions are C and the CamelCase functions are assembly. I call abiTest the very first thing from main(), and use assert() to ensure that it returns YES
BOOL abiTest( void )
{
void_no_args();
VoidNoArgs();
if ( 42 != int_no_args() )
return NO;
if ( 42 != IntNoArgs() )
return NO;
return YES;
}
Here is the source for IntNoArgs. .thumb_func is a directive for the linker. My research seems to indicate that you want it even for ARM functions if you mix the two types of code:
.globl _IntNoArgs
.align 1
.code 16
.thumb_func _IntNoArgs
_IntNoArgs:
@ int IntNoArgs( void );
.loc 1 __LINE__ 0
adr r0, Larm1 @ Larm1 is a PC-relative address. r0's low bit will be cleared
bx r0 @ Switch to ARM mode then branch to Larm1. That's the next instruction
.align 2
.code 32
Larm1:
stmfd sp!, { r7, lr }
mov r0, #42
ldmfd sp!, { r7, lr }
bx lr
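The role of the low address bit in the bx above can be written out as a small C model. This is a sketch of the interworking rule only; the enum and function names are invented for illustration.

```c
#include <stdint.h>

enum isa { ISA_ARM = 0, ISA_THUMB = 1 };

/* bx picks the instruction set from bit 0 of the target register:
 * 0 selects ARM, 1 selects Thumb. */
enum isa bx_target_isa(uint32_t target) {
    return (target & 1u) ? ISA_THUMB : ISA_ARM;
}

/* The address execution actually continues at has bit 0 cleared. */
uint32_t bx_target_pc(uint32_t target) {
    return target & ~1u;
}
```

With the addresses from this question, a target of 0x172cc (low bit clear) continues in ARM state, while a return address such as 0x172c9 would resume Thumb code at 0x172c8.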
Here is how GDB disassembles _IntNoArgs. The first two lines are correct; the remainder are completely wrong:
0x000172c8 <+0000> add r0, pc, #0 (adr r0, 0x172cc <VoidNoArgs+4>)
0x000172ca <+0002> bx r0
0x000172cc <+0004> lsls r0, r0
0x000172ce <+0006> stmdb sp!, {r1, r3, r5}
0x000172d2 <+0010> b.n 0x17a16
0x000172d4 <+0012> lsls r0, r0
0x000172d6 <+0014> ldmia.w sp!, {r1, r2, r3, r4, r8, r9, r10, r11, r12, sp, lr, pc}
The disassembly stops here because the ldmia.w instruction appears to be putting a new value into the program counter after taking it from the stack, thereby returning from the subroutine. After I step over this instruction with "si", the disassembly pane shows:
0x000172d8 <+0016> vrhadd.u16 d14, d14, d31
The si command always does the right thing, advancing just one instruction whether we are in Thumb or ARM mode. So GDB must know the current instruction set; it's just that the disassembler is not getting that information.
There is a bit in one of the ARM's registers (the T bit in the CPSR) that indicates the current mode. Some forks of GDB have the ability to use the value of that bit when determining which ISA to disassemble as, but this is apparently not the case with Xcode 4.2's GDB.
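For reference, the mode bit in question is bit 5 (T) of the CPSR. A minimal C sketch of the check a debugger could make on a CPSR value, with a made-up helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 5 of the CPSR is the T (Thumb state) bit. */
#define CPSR_T_BIT (UINT32_C(1) << 5)

/* Returns true if the given CPSR value describes a core executing Thumb code. */
bool cpsr_is_thumb(uint32_t cpsr) {
    return (cpsr & CPSR_T_BIT) != 0;
}
```

Assuming the debugger exposes the register (for example as $cpsr in GDB), printing it in hex and checking whether 0x20 is set is a quick manual way to tell which state the core is in when the disassembly looks wrong.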
Xcode's GDB has a command "set arm disassembler" and its corresponding "show arm disassembler" that look like they would help, but they don't. They are apparently meant to support other kinds of ARM variants than the ones iOS devices use.
"set fallback-mode" can be set to arm, thumb or auto in other forks of GDB, but not in Xcode's. Ditto for "set disassembler-flavor".
What I would REALLY REALLY REALLY like is a machine debugger that worked just like MacsBug did on the Classic Mac OS. While GDB is generally capable of doing assembler debugging, it totally sucks for that purpose. That's not anyone's fault, really, because it is designed for source debugging. A good assembly debugger is designed to do it that way from the ground up.
The ABI function call guide states that switching between ARM and Thumb mode can be done only at function boundaries in iOS. Make sure your functions are either ARM or Thumb-only, and the debugger will work fine.

Resources