I am porting some 32-bit ARM code to 64-bit and am having trouble determining the 64-bit version of this instruction:
ldr r1, =_fns
Where _fns is a symbol defined in some C source file elsewhere in the project.
I've tried the below but both give errors:
adr x1, _fns <== "error: unknown AArch64 fixup kind!"
adrl x1, _fns <== "error: unrecognized instruction mnemonic"
The assembler is LLVM in the iOS SDK (XCode 7.1).
I've noticed that if _fns is locally defined (i.e. in the same .S file) then "adr x1,_fns" works fine. However that's not a fix as _fns has to be in C code (i.e. in a different translation unit).
What is the right way to do this with the LLVM ARM assembler?
If I feed
extern char ar[];
char *f()
{
return ar;
}
into the ELLCC (clang based) demo, I get:
Output from the compiler targeting ARM AArch64
.text
.file "/tmp/webcompile/_3793_0.c"
.globl f
.align 2
.type f,@function
f: // @f
// BB#0: // %entry
adrp x0, ar
add x0, x0, :lo12:ar
ret
.Lfunc_end0:
.size f, .Lfunc_end0-f
.ident "ecc 0.1.13 based on clang version 3.7.0 (trunk) (based on LLVM 3.7.0svn)"
.section ".note.GNU-stack","",@progbits
The adrp instruction gets the "page" address of ar into x0. The symbol argument to adrp translates into a 21 bit PC relative offset to the 4K page in which the symbol resides. That offset is added to the PC to get the actual start of the page. The add instruction adds the low 12 bits of the symbol address to get the actual symbol address.
This instruction sequence allows the address of a symbol within +/-4GB of the PC to be loaded.
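To make the page arithmetic concrete, here is a minimal sketch in C of what the adrp/add pair computes. The function names (page_of, adrp_imm, adrp_add) are made up for illustration, and the sketch models only the address arithmetic, not the instruction encoding.

```c
#include <stdint.h>

/* Base of the 4 KiB page containing addr (low 12 bits cleared). */
uint64_t page_of(uint64_t addr) {
    return addr & ~UINT64_C(0xFFF);
}

/* Signed page delta that adrp carries in its 21-bit immediate.
   Both operands are page-aligned, so the division is exact. */
int64_t adrp_imm(uint64_t pc, uint64_t sym) {
    return ((int64_t)page_of(sym) - (int64_t)page_of(pc)) / 4096;
}

/* Register value after "adrp x0, sym" followed by "add x0, x0, :lo12:sym". */
uint64_t adrp_add(uint64_t pc, uint64_t sym) {
    uint64_t page = page_of(pc) + (uint64_t)(adrp_imm(pc, sym) * 4096);
    return page + (sym & 0xFFF); /* add the low 12 bits of the symbol */
}
```

A signed 21-bit page count times 4096 bytes per page is where the +/-4GB reach comes from.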
As far as I can tell, there doesn't seem to be a way of getting functionality similar to 32-bit ARM's "=ar" from C. In assembly language, it looks like this will work:
.text
.file "atest.s"
.globl f
.align 2
f:
ldr x0, p
ret
.align 3
p:
.xword _fns
This is very similar to what the 32 bit ARM does under the hood.
The only reason I started out with the C version was to show how I usually attack a problem like this, especially if I'm not that familiar with the target assembly language.
This works well for me with Xcode 7.1, LLVM 7.0, and iOS 9.1.
In 32-bit ARM:
ldr r9,=JumpTab
In 64-bit ARM this becomes:
adrp x9,JumpTab@PAGE
add x9,x9,JumpTab@PAGEOFF
By the way, take care with register numbers: some registers have a specific purpose on arm64.
I am trying to learn more about compilers and RISC V assembly was specifically designed to be easy to learn and teach. I am interested in compiling some simple C code to assembly using clang for the purpose of understanding the semantics. I'm planning on using venus to step through the assembly and the source code does NOT actually need to be fully compiled to machine code in order to run on a real machine.
I want to avoid compiler optimizations so I can see what I've actually instructed the processor to do.
I don't actually need the program to compile to machine code--I just want the assembly.
I don't want to worry about linking to the system library because this code doesn't actually need to run
The code does not make any explicit use of system calls and so I think a std lib should not be required
This answer seems to indicate that clang definitely can compile to RISC V targets, but it requires having a version of the OS's standard library built for RISC V.
This answer indicates that some form of cross-compiling is necessary, but again I don't need to fully compile the code to machine instructions so this should not apply if I'm understanding correctly.
Use clang -S to stop after generating an assembly file:
$ cat foo.c
int main() { return 2+2; }
$ clang -target riscv64 -S foo.c
$ cat foo.s
.text
.attribute 4, 16
.attribute 5, "rv64i2p0_m2p0_a2p0_c2p0"
.file "foo.c"
.globl main
.p2align 1
.type main,@function
main:
addi sp, sp, -32
sd ra, 24(sp)
sd s0, 16(sp)
addi s0, sp, 32
li a0, 0
sw a0, -20(s0)
li a0, 4
ld ra, 24(sp)
ld s0, 16(sp)
addi sp, sp, 32
ret
.Lfunc_end0:
.size main, .Lfunc_end0-main
.ident "Ubuntu clang version 14.0.0-1ubuntu1"
.section ".note.GNU-stack","",@progbits
.addrsig
You can also use Compiler Explorer conveniently online.
I am learning more about shellcode and making syscalls in arm64 on iOS devices. The device I am testing on is iPhone 6S.
I got the list of syscalls from this link (https://github.com/radare/radare2/blob/master/libr/include/sflib/darwin-arm-64/ios-syscalls.txt).
I learnt that x8 is used for putting the syscall number for arm64 from here (http://arm.ninja/2016/03/07/decoding-syscalls-in-arm64/).
I figured the various registers used to pass in parameters for arm64 should be the same as arm so I referred to this link (https://w3challs.com/syscalls/?arch=arm_strong), taken from https://azeria-labs.com/writing-arm-shellcode/.
I wrote inline assembly in Xcode and here are some snippets
//exit syscall
__asm__ volatile("mov x8, #1");
__asm__ volatile("mov x0, #0");
__asm__ volatile("svc 0x80");
However, the application does not terminate when I step over this code.
char write_buffer[]="console_text";
int write_buffer_size = sizeof(write_buffer);
__asm__ volatile("mov x8,#4;" //arm64 uses x8 for syscall number
"mov x0,#1;" //1 for stdout file descriptor
"mov x1,%0;" //the buffer to display
"mov x2,%1;" //buffer size
"svc 0x80;"
:
:"r"(write_buffer),"r"(write_buffer_size)
:"x0","x1","x2","x8"
);
If this syscall works, it should print out some text in Xcode's console output screen. However, nothing gets printed.
There are many online articles for ARM assembly, some use svc 0x80 and some use svc 0 etc and so there can be a few variations. I tried various methods but I could not get the two code snippets to work.
Can someone provide some guidance?
EDIT:
This is what Xcode shows in its Assembly view when I call the C function syscall, as in int return_value=syscall(1,0);
mov x1, sp
mov x30, #0
str x30, [x1]
orr w8, wzr, #0x1
stur x0, [x29, #-32] ; 8-byte Folded Spill
mov x0, x8
bl _syscall
I am not sure why this code was emitted.
The registers used for syscalls are completely arbitrary, and the resources you've picked are certainly wrong for XNU.
As far as I'm aware, the XNU syscall ABI for arm64 is entirely private and subject to change without notice so there's no published standard that it follows, but you can scrape together how it works by getting a copy of the XNU source (as tarballs, or viewing it online if you prefer that), grep for the handle_svc function, and just following the code.
I'm not gonna go into detail on where exactly you find which bits, but the end result is:
The immediate passed to svc is ignored, but the standard library uses svc 0x80.
x16 holds the syscall number
x0 through x8 hold up to 9 arguments*
There are no arguments on the stack
x0 and x1 hold up to 2 return values (e.g. in the case of fork)
The carry bit is used to report an error, in which case x0 holds the error code
* This is used only in the case of an indirect syscall (x16 = 0) with 8 arguments.
* Comments in the XNU source also mention x9, but it seems the engineer who wrote that should brush up on off-by-one errors.
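As a rough sketch of the return convention listed above, the following C models how a caller might interpret the registers after svc returns. The struct and function names are hypothetical and are not part of any Apple header; the sketch only illustrates "carry set means x0 is an error code".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the registers after svc returns. */
struct svc_result {
    uint64_t x0;    /* first return value, or the error code */
    uint64_t x1;    /* second return value (e.g. for fork) */
    bool     carry; /* carry flag as left by the kernel */
};

/* Returns 0 and stores x0 in *value on success.
   Returns the error code taken from x0 when the carry flag is set,
   leaving *value untouched. */
int decode_svc_result(struct svc_result r, uint64_t *value) {
    if (r.carry) {
        return (int)r.x0;
    }
    *value = r.x0;
    return 0;
}
```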
And then it comes to the actual syscall numbers available:
The canonical source for "UNIX syscalls" is the file bsd/kern/syscalls.master in the XNU source tree. Those take syscall numbers from 0 up to about 540 in the latest iOS 13 beta.
The canonical source for "Mach syscalls" is the file osfmk/kern/syscall_sw.c in the XNU source tree. Those syscalls are invoked with negative numbers between -10 and -100 (e.g. -28 would be task_self_trap).
Unrelated to the last point, two syscalls mach_absolute_time and mach_continuous_time can be invoked with syscall numbers -3 and -4 respectively.
A few low-level operations are available through platform_syscall with the syscall number 0x80000000.
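The number ranges above can be summarised in a small C sketch. This is only an illustration of the list, not code from XNU: the enum and function names are invented, the bounds are the approximate ones quoted above, and it assumes the number is compared as a signed 32-bit value (so 0x80000000 appears as INT32_MIN).

```c
#include <stdint.h>

/* Hypothetical categories, following the list above. */
enum xnu_syscall_kind {
    XNU_UNIX,      /* bsd/kern/syscalls.master, numbers >= 0 */
    XNU_MACH_TRAP, /* osfmk/kern/syscall_sw.c, roughly -10 .. -100 */
    XNU_MACH_TIME, /* mach_absolute_time (-3), mach_continuous_time (-4) */
    XNU_PLATFORM,  /* platform_syscall, 0x80000000 */
    XNU_UNKNOWN
};

/* Assumption: the syscall number is treated as a signed 32-bit value. */
enum xnu_syscall_kind classify_syscall(int32_t number) {
    if (number == INT32_MIN) return XNU_PLATFORM; /* bit pattern 0x80000000 */
    if (number >= 0) return XNU_UNIX;             /* upper bound varies by release */
    if (number == -3 || number == -4) return XNU_MACH_TIME;
    if (number <= -10 && number >= -100) return XNU_MACH_TRAP;
    return XNU_UNKNOWN;
}
```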
This should get you going. As @Siguza mentioned, you must use x16, not x8, for the syscall number.
#import <sys/syscall.h>
#include <string.h> // strlen
#include <TargetConditionals.h> // TARGET_CPU_ARM64
char testStringGlobal[] = "helloWorld from global variable\n";
int main(int argc, char * argv[]) {
char testStringOnStack[] = "helloWorld from stack variable\n";
#if TARGET_CPU_ARM64
//VARIANT 1 suggested by @PeterCordes
//as an input it's a file descriptor set to STD_OUT 1 so the syscall write output appears in Xcode debug output
//as an output this will be used for returning syscall return value;
register long x0 asm("x0") = 1;
//as an input string to write
//as an output this will be used for returning syscall return value higher half (in this particular case 0)
register char *x1 asm("x1") = testStringOnStack;
//string length
register long x2 asm("x2") = strlen(testStringOnStack);
//syscall write is 4
register long x16 asm("x16") = SYS_write; //syscall write definition - see my footnote below
//full variant using stack local variables for register x0,x1,x2,x16 input
//syscall result collected in x0 & x1 using "semi" intrinsic assembler
asm volatile(//all args prepared, make the syscall
"svc #0x80"
:"=r"(x0),"=r"(x1) //mark x0 & x1 as syscall outputs
:"r"(x0), "r"(x1), "r"(x2), "r"(x16): //mark the inputs
//inform the compiler we read the memory
"memory",
//inform the compiler we clobber carry flag (during the syscall itself)
"cc");
//VARIANT 2
//syscall write for globals variable using "semi" intrinsic assembler
//args hardcoded
//output of syscall is ignored
asm volatile(//copy the string address from the compiler-chosen input register (%0) into x1
"mov x1, %0 \t\n"
//set file descriptor to STD_OUT 1 so it appears in Xcode debug output
"mov x0, #1 \t\n"
//hardcoded length
"mov x2, #32 \t\n"
//syscall write is 4
"mov x16, #0x4 \t\n"
//all args prepared, make the syscall
"svc #0x80"
::"r"(testStringGlobal):
//clobbered registers list
"x1","x0","x2","x16",
//inform the compiler we read the memory
"memory",
//inform the compiler we clobber carry flag (during the syscall itself)
"cc");
//VARIANT 3 - only applicable to global variables using "page" address
//which is PC-relative addressing to load addresses at a fixed offset from the current location (PIC code).
//syscall write for global variable using "semi" intrinsic assembler
asm volatile(//set x1 on proper PAGE
"adrp x1,_testStringGlobal@PAGE \t\n" //notice the underscore preceding variable name by convention
//add the offset of the testStringGlobal variable
"add x1,x1,_testStringGlobal@PAGEOFF \t\n"
//set file descriptor to STD_OUT 1 so it appears in Xcode debug output
"mov x0, #1 \t\n"
//hardcoded length
"mov x2, #32 \t\n"
//syscall write is 4
"mov x16, #0x4 \t\n"
//all args prepared, make the syscall
"svc #0x80"
:::
//clobbered registers list
"x1","x0","x2","x16",
//inform the compiler we read the memory
"memory",
//inform the compiler we clobber carry flag (during the syscall itself)
"cc");
#endif
@autoreleasepool {
return UIApplicationMain(argc, argv, nil, NSStringFromClass([AppDelegate class]));
}
}
EDIT
In reply to @PeterCordes' excellent comment: yes, there is a header with syscall number definitions, <sys/syscall.h>, which I included in the snippet above for Variant 1. But it's important to mention that inside it, Apple guards the definitions like this:
#ifdef __APPLE_API_PRIVATE
#define SYS_syscall 0
#define SYS_exit 1
#define SYS_fork 2
#define SYS_read 3
#define SYS_write 4
I haven't yet heard of an iOS app being rejected from the App Store for making a system call directly through svc 0x80. Nonetheless, it's definitely not public API.
As for the "=@ccc" suggested by @PeterCordes, i.e. the carry flag (set by the syscall on error) as an output constraint: that's not supported as of the latest Xcode 11 beta / LLVM 8.0.0 even for x86, and definitely not for ARM.
I get a linker error when building a library for iOS 7 on iPhone (ARM64).
The error message is:
ld: in /long_path/libHEVCCodec.a(inv_xforms_arm64.o), in section __TEXT,__text reloc 0:
ARM64_RELOC_SUBTRACTOR must have r_length of 2 or 3 for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
The error is caused by this code (a kind of switch):
adr addr, .L.dct_add_switch
ldrh offset, [addr, ta, lsl #1]
add addr, addr, offset, uxth
br addr
.L.dct_add_switch:
.hword .L.dct_add_4 - .L.dct_add_switch
.hword .L.dst_add_4 - .L.dct_add_switch
...
ta, addr, offset are general registers x3, x4, w5 respectively.
Does anybody know how to handle this situation?
PS: there are not any problems with GNU GCC & Android.
EDIT1:
It seems that the problem is not in the linker itself but in the compiler.
I checked the object file (with objdump), and instead of the difference constants there are just zeros.
.L.dct_add_switch:
0000000000000010 .long 0x00000000
0000000000000014 .long 0x00000000
0000000000000018 .long 0x00000000
000000000000001c nop
When I put manually calculated constants in place of the ".L.dct_add_4 - .L.dct_add_switch" etc. expressions, everything works fine.
Maybe there are some compiler options that will make the compiler do its job properly?
Thanks.
Well, there is a compiler and linker problem, and it depends on the size of the data used for the offsets. Clang is not very friendly to anything other than 4 bytes.
There is discussion and possible solutions in another topic: creating constant jump table; xcode; clang; asm
The problem is the Mach-O object file format for ARM 64-bit targets doesn't support a relocation for the 16-bit difference between two symbols. It appears that the difference must be 32-bit or 64-bit. It doesn't seem to be a problem with the compiler or the linker. The assembly code you've quoted in your question looks like handcrafted assembly, not compiler output.
The solution would be to rewrite the assembly to use 32-bit difference values. Something like this:
adr addr, .L.dct_add_switch
ldr offset, [addr, ta, lsl #2]
add addr, addr, offset, uxtw
br addr
.L.dct_add_switch:
.word .L.dct_add_4 - .L.dct_add_switch
.word .L.dst_add_4 - .L.dct_add_switch
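For anyone unsure what the rewritten sequence computes, here is a small C model of it. It is only a sketch: switch_target is a made-up name, and the table contents in the usage below are hypothetical offsets, not the real distances between the labels.

```c
#include <stdint.h>

/* Models:
     ldr w5, [x4, x3, lsl #2]   ; 32-bit load, index scaled by 4
     add x4, x4, w5, uxtw       ; zero-extend the offset and add to the base
   table_addr is the address of the table (what adr put in x4),
   table points at the 32-bit entries emitted by .word. */
uint64_t switch_target(uint64_t table_addr, const uint32_t *table, uint64_t index) {
    uint32_t offset = table[index];       /* entry = label - table start */
    return table_addr + (uint64_t)offset; /* absolute address passed to br */
}
```

With a table at address 0x1000 holding the offsets 8 and 24, index 1 yields 0x1018. Because of the zero extension, this form only works when every label lies after the table.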
I have an AArch64 NEON function defined in an assembly .s file. It already compiles and runs fine, but to improve code readability I'd like to use register aliases with the .req assembler directive. However, when I try, clang fails with error: unexpected token in argument list
To keep the example simple consider this code:
.section __TEXT,__text,regular,pure_instructions
.globl _foo
.align 2
_foo:
add w0, w0, w1
ret lr
This code compiles and runs, but if I try to use
myreg .req w0
before the add instruction, I get
/Users/dfcamara/Desktop/MyApp/MyApp/foo.s:5:16: error: unexpected token in argument list
myreg .req w0
^
Maybe I need some clang directive or compiler option that I'm not aware of; I can't find documentation about them. I just created a new iOS (iPad) project and added an assembly file.
Thanks
My iOS App is built with the Apple LLVM 3.0 compiler in Thumb mode. For armv7, I'm pretty sure that's actually Thumb-2.
I'm reimplementing my two most time-consuming functions in ARM assembly code. The callers of these functions are Thumb, so I use Thumb to ARM interworking instructions to switch to ARM in the prologue of my functions so I have access to ARM's richer instruction set and greater number of registers. At function exit I use ARM to Thumb interworking to return to Thumb mode.
GDB's disassembly is correct for the Thumb code, but when I am in ARM mode, it disassembles the ARM instructions as if each one were a pair of completely nonsensical Thumb instructions. Is there some way I can tell GDB to switch to ARM disassembly, then upon returning to Thumb code, use the Thumb disassembler?
Google is no help. There are apparently other forks of GDB that can do that, but I haven't figured out a way to do it with GDB.
LLDB apparently supports ARM debugging, but it does not yet work on iOS devices in Xcode 4.2. When I choose the LLDB debugger in Product -> Edit Scheme, then set a breakpoint in my code, my App hangs before hitting the breakpoint.
It's been a long time since I've done any assembly of any sort, so I am brushing up on the ARM calling conventions by implementing functions that take various parameters and return various types of results in both C and ARM assembly. The lower_case functions are C and the CamelCase functions are assembly. I call abiTest the very first thing from main(), and use assert() to ensure that it returns YES
BOOL abiTest( void )
{
void_no_args();
VoidNoArgs();
if ( 42 != int_no_args() )
return NO;
if ( 42 != IntNoArgs() )
return NO;
return YES;
}
Here is the source for IntNoArgs. .thumb_func is a directive for the linker. My research seems to indicate that you want it even for ARM functions, if one mixes the two types of code
.globl _IntNoArgs
.align 1
.code 16
.thumb_func _IntNoArgs
_IntNoArgs:
# int IntNoArgs( void );
.loc 1 __LINE__ 0
adr r0, Larm1 # Larm1 is a PC-relative address. r0's low bit will be cleared
bx r0 # Switch to ARM mode then branch to Larm1. That's the next instruction
.align 2
.code 32
Larm1:
stmfd sp!, { r7, lr }
mov r0, #42
ldmfd sp!, { r7, lr }
bx lr
Here is how GDB disassembles the _IntNoArgs. The first two lines are correct, the remainder are completely wrong
0x000172c8 <+0000> add r0, pc, #0 (adr r0, 0x172cc <VoidNoArgs+4>)
0x000172ca <+0002> bx r0
0x000172cc <+0004> lsls r0, r0
0x000172ce <+0006> stmdb sp!, {r1, r3, r5}
0x000172d2 <+0010> b.n 0x17a16
0x000172d4 <+0012> lsls r0, r0
0x000172d6 <+0014> ldmia.w sp!, {r1, r2, r3, r4, r8, r9, r10, r11, r12, sp, lr, pc}
The disassembly stops here because the ldmia.w instruction appears to be putting a new value into the program counter after taking it from the stack, thereby returning from the subroutine. After I step over this instruction with "si" the disassembly pane show:
0x000172d8 <+0016> vrhadd.u16 d14, d14, d31
The si instruction always does the right thing, by advancing just one instruction whether we are in Thumb or ARM mode. So GDB must know the current instruction set architecture, it's just that the disassembler is not getting that information.
There is a bit in one of the ARM's registers (the T bit in the CPSR) that indicates the current mode. Some forks of GDB have the ability to use the value of that bit when determining which ISA to disassemble as, but this is apparently not the case with Xcode 4.2's GDB.
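For reference, the mode bit in question is the T bit, bit 5 of the CPSR, and bx picks the new mode from bit 0 of the branch target. A minimal C sketch of both checks follows; the function names are made up for illustration and do not come from GDB.

```c
#include <stdbool.h>
#include <stdint.h>

/* T (Thumb state) bit of the CPSR on ARMv7. */
#define CPSR_T_BIT (UINT32_C(1) << 5)

/* True when a saved CPSR value says the core is executing Thumb code. */
bool cpsr_is_thumb(uint32_t cpsr) {
    return (cpsr & CPSR_T_BIT) != 0;
}

/* bx Rm: bit 0 of the target selects Thumb (1) or ARM (0). */
bool bx_target_is_thumb(uint32_t target) {
    return (target & UINT32_C(1)) != 0;
}

/* Address actually branched to once the mode bit is stripped. */
uint32_t bx_branch_address(uint32_t target) {
    return target & ~UINT32_C(1);
}
```

This is why the adr in the listing above matters: it yields an address with the low bit clear, so the following bx lands in ARM state.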
Xcode's GDB has a command "set arm disassembler" and its corresponding "show arm disassembler" that looks like it would help, but it doesn't. That apparently is meant to support other kinds of ARM variants than what the iOS devices use.
"set fallback-mode" can be set to arm, thumb or auto in other forks of GDB, but not in Xcode's. Ditto for "set disassembler-flavor".
What I would REALLY REALLY REALLY like is a machine debugger that worked just like MacsBug did on the Classic Mac OS. While GDB is generally capable of doing assembler debugging, it totally sucks for that purpose. That's not anyone's fault, really, because it is designed for source debugging. A good assembly debugger is designed to do it that way from the ground up.
The ABI function call guide states that switching between ARM and Thumb mode can be done only at function boundaries in iOS. Make sure your functions are either ARM or Thumb-only, and the debugger will work fine.