I am trying to mix ARM and THUMB instructions in my assembly code. For example, in the following code I try to use both modes:
.thumb @ .code 16
.section __TEXT,__text
.globl mySymbol1
mySymbol1:
....
.arm @ .code 32
.section __TEXT,__text
.globl mySymbol2
mySymbol2:
...
Now, as I understand it, when I compile this code into a library and run it through nm, mySymbol1 should show up as Thumb and mySymbol2 as ARM, i.e.,
0000xxxx (__TEXT,__text) external [Thumb] mySymbol1
0000yyyy (__TEXT,__text) external mySymbol2
But both show up as ARM. What am I missing here? My assembler command is:
as -arch armv7 -o a.o a.s
You need .thumb_func before each Thumb label for it to be a Thumb target; otherwise the GNU tools will treat it as an ARM target. (Yes, you need .thumb once AND .thumb_func for EVERY label you want to use as a Thumb target.) There are many examples at http://github.com/dwelch67
I am trying to learn more about compilers, and RISC-V assembly was specifically designed to be easy to learn and teach. I want to compile some simple C code to assembly using clang in order to understand the semantics. I plan to use Venus to step through the assembly, so the source does NOT need to be fully compiled to machine code that runs on a real machine.
I want to avoid compiler optimizations so I can see what I've actually instructed the processor to do.
I don't actually need the program to compile to machine code--I just want the assembly.
I don't want to worry about linking to the system library, because this code doesn't actually need to run.
The code does not make any explicit use of system calls, so I think a standard library should not be required.
This answer seems to indicate that clang definitely can compile to RISC-V targets, but it requires having a version of the OS's standard library built for RISC-V.
This answer indicates that some form of cross-compiling is necessary, but again, I don't need to fully compile the code to machine instructions, so this should not apply if I'm understanding correctly.
Use clang -S to stop after generating an assembly file:
$ cat foo.c
int main() { return 2+2; }
$ clang -target riscv64 -S foo.c
$ cat foo.s
.text
.attribute 4, 16
.attribute 5, "rv64i2p0_m2p0_a2p0_c2p0"
.file "foo.c"
.globl main
.p2align 1
.type main,@function
main:
addi sp, sp, -32
sd ra, 24(sp)
sd s0, 16(sp)
addi s0, sp, 32
li a0, 0
sw a0, -20(s0)
li a0, 4
ld ra, 24(sp)
ld s0, 16(sp)
addi sp, sp, 32
ret
.Lfunc_end0:
.size main, .Lfunc_end0-main
.ident "Ubuntu clang version 14.0.0-1ubuntu1"
.section ".note.GNU-stack","",@progbits
.addrsig
You can also use Compiler Explorer conveniently online.
I came across the size command, which gives the section sizes of an ELF file. While playing around with it, I created an output file for the simplest C++ program:
int main(){return 0;}
Clearly, I have not defined any initialized or uninitialized data, so why are my BSS and DATA sections 512 and 8 bytes in size?
I thought it might be because of int main(), so I tried creating an object file for the following C program:
void main(){}
I still don't get 0 for the BSS and DATA sections.
Is it because a certain minimum amount of memory is allocated to those sections?
EDIT: I thought it might be because of linked libraries, but my object is dynamically linked, so that probably shouldn't be the issue.
int main(){return 0;} puts data in .text only.
$ echo 'int main(){return 0;}' | gcc -xc - -c -o main.o && size main.o
text data bss dec hex filename
67 0 0 67 43 main.o
You're probably running size on a fully linked executable.
$ gcc main.o -o main && size main
text data bss dec hex filename
1415 544 8 1967 7af main
In fact, if you compile with the libc attached to the binary, functions are added that run before (and after) main(). They are there mostly to load dynamic libraries (even if you do not need them in your case) and to unload them properly once main() ends.
These functions have global variables that require storage: uninitialized (zero-initialized) global variables go in the BSS segment and initialized global variables go in the DATA segment.
This is why you will always see BSS and DATA in all binaries compiled with the libc. If you want to get rid of this, you should write your own assembly program, like this (asm.s):
.globl _start
_start:
mov %eax, %ebx
Then compile it without the libc:
$> gcc -nostdlib -o asm asm.s
This should reduce the BSS and DATA footprint of the resulting ELF binary.
I get a linking problem when creating a library for iOS 7 on iPhone (ARM64).
The error message is:
ld: in /long_path/libHEVCCodec.a(inv_xforms_arm64.o), in section __TEXT,__text reloc 0:
ARM64_RELOC_SUBTRACTOR must have r_length of 2 or 3 for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
This error is caused by the following code (it's a kind of switch):
adr addr, .L.dct_add_switch
ldrh offset, [addr, ta, lsl #1]
add addr, addr, offset, uxth
br addr
.L.dct_add_switch:
.hword .L.dct_add_4 - .L.dct_add_switch
.hword .L.dst_add_4 - .L.dct_add_switch
...
ta, addr, offset are general registers x3, x4, w5 respectively.
Does anybody know how to handle this situation?
PS: There are no such problems with GCC on Android.
EDIT1:
It seems that the problem is not in the linker itself but in the compiler.
I checked the object file (objdump), and instead of the difference constants there are just zeros.
.L.dct_add_switch:
0000000000000010 .long 0x00000000
0000000000000014 .long 0x00000000
0000000000000018 .long 0x00000000
000000000000001c nop
When I manually substitute calculated constants for the ".L.dct_add_4 - .L.dct_add_switch" expressions, everything works.
Maybe there are some compiler flags that would make the compiler do its job correctly?
Thanks.
Well, there is a compiler and linker problem, and it depends on the size of the data used for the offsets. Clang is not very friendly to anything other than 4 bytes.
The discussion and possible solutions are in another topic: creating constant jump table; xcode; clang; asm
The problem is the Mach-O object file format for ARM 64-bit targets doesn't support a relocation for the 16-bit difference between two symbols. It appears that the difference must be 32-bit or 64-bit. It doesn't seem to be a problem with the compiler or the linker. The assembly code you've quoted in your question looks like handcrafted assembly, not compiler output.
The solution would be to rewrite the assembly to use 32-bit difference values. Something like this:
adr addr, .L.dct_add_switch
ldr offset, [addr, ta, lsl #2]
add addr, addr, offset, uxtw
br addr
.L.dct_add_switch:
.word .L.dct_add_4 - .L.dct_add_switch
.word .L.dst_add_4 - .L.dct_add_switch
I have an AArch64 NEON function defined in an assembly .s file. It already compiles and runs fine, but to improve readability I'd like to use register aliases with the .req assembler directive. However, when I try, clang fails with error: unexpected token in argument list
To keep the example simple consider this code:
.section __TEXT,__text,regular,pure_instructions
.globl _foo
.align 2
_foo:
add w0, w0, w1
ret lr
This code compiles and runs, but if I try to use
myreg .req w0
before the add instruction, I get
/Users/dfcamara/Desktop/MyApp/MyApp/foo.s:5:16: error: unexpected token in argument list
myreg .req w0
^
Maybe I need some clang directive that I'm not aware of (I can't find documentation about them), or a compiler option. I just created a new iOS (iPad) project and added an assembly file.
Thanks
I am new to this, and here is what I want to do:
Create simple programs (loops, counters, etc.) using ARMv7 assembly.
Compile them on Mac / Windows / Linux to run on the iPhone.
I have a jailbroken iPhone, so I can upload the file there, sign it with ldid, and run it.
Can someone please point me to how I can do this with freely available tools?
Thanks!
I'm left wondering what you're trying to achieve here. Is it to learn to develop in ARM assembler? Do you want to write iOS applications?
No sane person writes complete applications in assembler these days. They use high-level languages and, in a few restricted cases, optimize in assembler. Being able to do that optimization is a specialist and useful skill.
Using a complete C program as a surrogate host is a good way to start. Create yourself a simple Hello World program in C which calls an (almost) empty function.
You can (mostly) get this to work using Xcode (you need to install the optional command-line tools). All but the final linking stage for ARM works using clang. This is obviously Mac OS X only.
A better alternative for this kind of experimentation is an ARM Linux system, where you're not fighting against the locked-down environment of iOS. The Raspberry Pi is perfect for the job. You'll need a cross-compiling toolchain for ARMv7, of which there are plenty. If you use Ubuntu, pre-built packages are readily available.
main.c
#include <stdio.h>
extern void func(void);
int main(void)
{
    printf("Hello World\n");
    func();
    return 0;
}
and func.c
#include <stdio.h>
void func()
{
printf("In func()\n");
}
Compile both for your host environment and run it to see it works:
gcc main.c func.c
./a.out
Now compile for your target environment. The precise name of the cross-compiling tools varies depending on what you installed (mine is arm-angstrom-linux-gnueabi-gcc):
arm-angstrom-linux-gnueabi-gcc main.c func.c
Copy to your target, and prove it works.
Now you can start to write some assembler. Get gcc to produce ARM assembler for our victim file func.c with -S (capital S; lowercase -s strips symbols instead). This results in a file func.s:
arm-angstrom-linux-gnueabi-gcc func.c -S
.cpu arm7tdmi-s
.fpu softvfp
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 6
.eabi_attribute 18, 4
.file "func.c"
.section .rodata
.align 2
.LC0:
.ascii "In func()\000"
.text
.align 2
.global func
.type func, %function
func:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
stmfd sp!, {fp, lr}
add fp, sp, #4
ldr r0, .L2
bl puts
sub sp, fp, #4
ldmfd sp!, {fp, lr}
bx lr
.L3:
.align 2
.L2:
.word .LC0
.size func, .-func
.ident "GCC: (GNU) 4.5.4 20120305 (prerelease)"
.section .note.GNU-stack,"",%progbits
You can see here that between label func: and .L3 is the business end of func() - and it's almost all function prologue and epilogue. You'll want to check out the ARM Procedure Call Standard to understand what these are and for guidance on which registers to use.
Once you've done your edits, compile the whole thing again with GCC:
arm-angstrom-linux-gnueabi-gcc main.c func.s
...and test it.