I'm getting a very unexpected result from my Pin tool. The tool looks for CALL/RET instructions and logs a message for each:
VOID CallBack(VOID * ip, ADDRINT esp)
{
UINT32 *RetAddrPtr = (UINT32 *)esp;
fprintf(log_info,"RET inst #%p ==> Return Address #%p.\n", ip, *RetAddrPtr);
}
// Pin calls this function every time a new trace is encountered
VOID Trace(TRACE trace, VOID *v)
{
ADDRINT insAddress = TRACE_Address(trace);
// Visit every basic block in the trace
for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl))
{
for(INS ins = BBL_InsHead(bbl); INS_Valid(ins); ins = INS_Next(ins))
{
ADDRINT instAddress = INS_Address(ins);
if( INS_IsCall(ins) )
{
ADDRINT nextInstAddress = (ADDRINT)( (USIZE)instAddress + INS_Size(ins) );
fprintf(log_info,"CALL inst #%p ==> CALL Return Address #%p.\n", instAddress, nextInstAddress);
}
if(INS_IsRet(ins))
{
INS_InsertCall( ins,
IPOINT_BEFORE,
(AFUNPTR)CallBack,
IARG_INST_PTR,
IARG_REG_VALUE,
REG_STACK_PTR,
IARG_END);
}
}
}
}
But the result is very unusual. Here is the log, starting from the program entry point:
CALL inst #0101247C ==> CALL Return Address #01012481.
RET inst #01012800 ==> Return Address #01012481.
CALL inst #0101248A ==> CALL Return Address #0101248C.
CALL inst #7C80B73F ==> CALL Return Address #7C80B744.
RET inst #7C80B751 ==> Return Address #0101248C.
CALL inst #010124E3 ==> CALL Return Address #010124E9.
RET inst #77C3538A ==> Return Address #010124E9.
CALL inst #010124F8 ==> CALL Return Address #010124FE.
RET inst #77C1F1E0 ==> Return Address #010124FE.
CALL inst #01012506 ==> CALL Return Address #0101250C.
RET inst #77C1F1A9 ==> Return Address #0101250C.
CALL inst #01012520 ==> CALL Return Address #01012525.
RET inst #010127C4 ==> Return Address #01012525.
CALL inst #01012532 ==> CALL Return Address #01012538.
CALL inst #01012539 ==> CALL Return Address #0101253E.
CALL inst #010127BA ==> CALL Return Address #010127BF.
CALL inst #77C4EE60 ==> CALL Return Address #77C4EE65.
RET inst #77C4ED04 ==> Return Address #77C4EE28. <=========
RET inst #77C4ED97 ==> Return Address #77C4EE3F. <=========
RET inst #77C4EE49 ==> Return Address #77C4EE65.
RET inst #77C4EE68 ==> Return Address #010127BF.
RET inst #010127C1 ==> Return Address #0101253E.
As you can see, there are two RET instructions that don't map to any CALL.
I then opened the program in a debugger and saw this:
77C4EE15 > 8BFF MOV EDI,EDI ; kernel32.GetModuleHandleA
77C4EE17 55 PUSH EBP
77C4EE18 8BEC MOV EBP,ESP
77C4EE1A 51 PUSH ECX
77C4EE1B 53 PUSH EBX
77C4EE1C 9B WAIT
77C4EE1D D97D FC FSTCW WORD PTR SS:[EBP-4]
77C4EE20 FF75 FC PUSH DWORD PTR SS:[EBP-4]
77C4EE23 E8 41FEFFFF CALL msvcrt.77C4EC69 <============
77C4EE28 8BD8 MOV EBX,EAX
77C4EE2A 8B45 0C MOV EAX,DWORD PTR SS:[EBP+C]
77C4EE2D F7D0 NOT EAX
77C4EE2F 23D8 AND EBX,EAX
77C4EE31 8B45 08 MOV EAX,DWORD PTR SS:[EBP+8]
77C4EE34 2345 0C AND EAX,DWORD PTR SS:[EBP+C]
77C4EE37 59 POP ECX ; msvcrt.77C4EE65
77C4EE38 0BD8 OR EBX,EAX
77C4EE3A E8 CBFEFFFF CALL msvcrt.77C4ED0A <==============
77C4EE3F 8945 0C MOV DWORD PTR SS:[EBP+C],EAX
77C4EE42 D96D 0C FLDCW WORD PTR SS:[EBP+C]
77C4EE45 8BC3 MOV EAX,EBX
77C4EE47 5B POP EBX ; msvcrt.77C4EE65
77C4EE48 C9 LEAVE
Can Pin not see these calls? Maybe I used the wrong sequence of API calls.
There is also another unexpected result: one function contains two different CALL instructions with a conditional jmp between them, which means only one of those CALLs should execute, yet Pin logs both of them!
Pin doesn't instrument the entire process execution: it starts a little after the real process start and stops a little before the real exit.
The call was probably executed before process execution was redirected to Pin, which is why you see the retn but not the call.
A little piece of pseudocode to explain it:
call start_instrumentation
label ret_call_start_ins:
[...]
function start_instrumentation:
_do_stuff_
* now the process is under Pin control *
_do_stuff_under_pin_control_
retn ; the return you see without any associated call
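To flag returns like the two marked in the log automatically, one option is a shadow stack: push the expected return address for every observed CALL and look it up on every RET. The following is a minimal, self-contained C sketch of that idea, not Pin code. The names ShadowStack, shadow_on_call and shadow_on_ret are hypothetical, and in a real tool they would have to be driven from analysis routines that run at execution time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_MAX 1024  /* arbitrary depth limit chosen for this sketch */

typedef struct {
    uint32_t ret_addr[SHADOW_MAX];  /* expected return addresses, oldest first */
    size_t depth;
} ShadowStack;

/* Record the return address of a CALL that was observed executing. */
static void shadow_on_call(ShadowStack *s, uint32_t return_addr)
{
    assert(s->depth < SHADOW_MAX);
    s->ret_addr[s->depth++] = return_addr;
}

/* Look up the target of a RET. Returns 1 and unwinds to the matching entry
 * if some observed CALL expects this address; returns 0 and leaves the
 * stack untouched if no observed CALL matches. */
static int shadow_on_ret(ShadowStack *s, uint32_t target)
{
    for (size_t i = s->depth; i > 0; --i) {
        if (s->ret_addr[i - 1] == target) {
            s->depth = i - 1;
            return 1;
        }
    }
    return 0;
}
```

Fed with the tail of the log above (calls expecting 0101253E, 010127BF and 77C4EE65), the returns to 77C4EE28 and 77C4EE3F come back as unmatched, while the return to 77C4EE65 matches.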
I've asked before why the stack doesn't start at 0x7fff...c, and was told that typically 0x800... onwards is for the kernel, and that the CLI args and environment variables live at the top of the user's stack, which is why it starts below 0x7fff...c. But I recently tried to examine all the strings with the following program:
#include <stdio.h>
#include <string.h>
int main(int argc, const char **argv) {
const char *ptr = argv[0];
while (1) {
printf("%p: %s\n", ptr, ptr);
size_t len = strlen(ptr);
ptr += len + 1; // advance past this string and its null terminator
}
}
However, after displaying all my environment variables, I see the following (I compiled the program to an executable called ./t):
0x7ffc19f84fa0: <final env variable string>
0x7ffc19f84fee: _=./t
0x7ffc19f84ff4: ./t
0x7ffc19f84ff8:
0x7ffc19f84ff9:
0x7ffc19f84ffa:
0x7ffc19f84ffb:
0x7ffc19f84ffc:
0x7ffc19f84ffd:
0x7ffc19f84ffe:
0x7ffc19f84fff:
So it appears there are eight extra zero bytes (0x7ffc19f84ff8..0x7ffc19f84fff) after the ./t string, which occupies bytes 0x7ffc19f84ff4..0x7ffc19f84ff7 including its null terminator. After that I segfault, so I guess that's the base of the stack. What actually lives in the remaining "empty" space before kernel memory starts?
Edit: I also tried the following:
global _start
extern print_hex, fgets, puts, print, exit
section .text
_start:
pop rdi
mov rcx, 0
_start_loop:
mov rdi, rsp
call print_hex
pop rdi
call puts
jmp _start_loop
mov rdi, 0
call exit
where print_hex is a routine I wrote elsewhere. This is all the output I get:
0x00007ffcd272de28
./bin/main
0x00007ffcd272de30
abc
0x00007ffcd272de38
def
0x00007ffcd272de40
ghi
0x00007ffcd272de48
make: *** [Makefile:47: run] Segmentation fault
So it seems that even in _start we don't begin at 0x7fff...
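A bounded variant of the C program above, as a sketch: instead of looping until the segfault, it stops at the end of the highest-addressed string referenced by a NULL-terminated vector such as argv or envp. The helper names vector_end and dump_strings are made up for illustration, and the three-argument main mentioned afterwards is a common extension rather than standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns a pointer one past the NUL terminator of the highest-addressed
 * string in a NULL-terminated vector (argv or envp style), or NULL if the
 * vector is empty. Assumes all strings sit in one contiguous region, as in
 * the output above; comparing unrelated pointers is not portable C. */
static const char *vector_end(char *const *vec)
{
    const char *end = NULL;
    assert(vec != NULL);
    for (; *vec != NULL; ++vec) {
        const char *candidate = *vec + strlen(*vec) + 1;
        if (end == NULL || candidate > end)
            end = candidate;
    }
    return end;
}

/* Prints every string in [first, limit) and returns how many were printed. */
static size_t dump_strings(const char *first, const char *limit)
{
    size_t count = 0;
    assert(first != NULL && limit != NULL);
    while (first < limit) {
        printf("%p: %s\n", (const void *)first, first);
        first += strlen(first) + 1;  /* skip the string and its NUL */
        ++count;
    }
    return count;
}
```

With int main(int argc, char **argv, char **envp) and a non-empty environment, calling dump_strings(argv[0], vector_end(envp)) would print the argument and environment strings and stop at the end of the last environment string instead of running off the stack.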
I have been trying to get the PRU to work in a way that makes sense to me, and at this point I am completely clueless. I can get the examples to work, but any time I make a change or try to write something from scratch I just beat my head against the wall. As a start, I just want to access any of the USR LEDs and turn them on and off at some speed, or, as a first pass, turn on an LED and leave it on. Here is some PASM code I got off the internet (will post the link when I find it):
.origin 0
.entrypoint START
#define PRU0_ARM_INTERRUPT 19
#define AM33XX
#define GPIO1 0x4804c000 //Trying to access the GPIO1
#define GPIO_CLEARDATAOUT 0x190 //writing 1 to the bit you want cleared in GPIO_DATAOUT register (what does that mean?)
#define GPIO_SETDATAOUT 0x194 //set a value for GPIO output pins, which pins am I even writing to? GPIO1?
#define GPIO_OE 0x134 //enable the pins output capabilities
START:
//clear that bit
lbco r0, c4, 4, 4 //This creates a constant offset and stores in c4, but why do you need that?
CLR r0, r0, 4 //if you copied the data why do you need to clear it?
SBCO r0, C4, 4, 4 //What is this for?
//MOV r1, 10
MOV r2, 0x00000000 //store address 0x00 into r2, why?
MOV r3, GPIO1 //Store GPIO1 address in r3
MOV r4, GPIO_OE //place address of GPIO_OE into r4
MOV r5, GPIO_SETDATAOUT //store address of GPIO_SETDATAOUT in r5
MOV r6, GPIO_CLEARDATAOUT //store address of GPIO_CLEARDATAOUT in r6
SBBO r2, r3, r4,4 //What is this even doing? Copying 4 bytes from r2 into r3+r4, but why do you want to copy that way and if not why not?
MOV r1, 10
MOV r2, 0xFFFFFFFF //Supposedly this turns the GPIO1 ON and OFF?
SBBO r2, r3, r6, 4 //and again the storage stuff?
HALT
I am also attaching the C code that I am using:
#include <stdio.h>
#include <pruss/prussdrv.h>
#include <pruss/pruss_intc_mapping.h>
#define PRU_NUM 0 //defining which PRU to use
int main() {
int ret;
tpruss_intc_initdata intc = PRUSS_INTC_INITDATA;
//initialize the PRU by using init command from prussdrv.h
ret = prussdrv_init();
if(ret != 0) {
printf("Error returned: %d\n",ret);
printf("PRU unable to be initialized");
return -1;
}
ret = prussdrv_open(PRU_EVTOUT_0);
if(ret != 0) {
printf("Error returned for prussdrv_open(): %d\n",ret);
printf("PRU can't open PRU_EVTOUT_0");
return -1;
}
//Map PRUS's INTC
ret = prussdrv_pruintc_init(&intc);
if (ret != 0) {
printf("Error returned for prussdrv_pruintc_init(): %d\n",ret);
printf("PRU doesn't work");
return -1;
}
//load and execute binary on PRU
prussdrv_exec_program(PRU_NUM, "./ashwini_test.bin");
prussdrv_pru_wait_event(PRU_EVTOUT_0);
prussdrv_pru_clear_event(PRU_EVTOUT_0,PRU0_ARM_INTERRUPT);
/*Disable PRU and close memory mappings*/
prussdrv_pru_disable(PRU_NUM);
prussdrv_exit();
//prussdrv_pru_wait_event(PRU_EVTOUT_0);
return 0;
}
I have gone through the TRM, https://groups.google.com/forum/#!topic/beaglebone/98eF1wQE_QA, elinux, and derekmolloy, but I feel like I am missing something very basic about how the addressing scheme works or how to think about these things. Thanks again for your help!
When you say that's your PASM code... do you mean it's some code you got from somewhere else that you're trying to use? Because the comments on most lines asking what they do makes it seem unlikely that it's actually your code...
Anyways, can't really answer unless you have a specific question, but there's plenty of info out there about how to use the GPIO subsystem on the BeagleBone's AM335x processor. I talked about it some in a post a while back here: https://graycat.io/tutorials/beaglebone-io-using-python-mmap/
I've also got a few documented PRU assembly examples here: https://github.com/alexanderhiam/PRU-stuffs
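For orientation, the address arithmetic behind the SBBO lines and the "write 1 to set/clear" behaviour of the data-out registers can be modelled in plain C. This is only a host-side sketch of the register semantics, using the base address and offsets from the question's #defines; it touches no hardware. The helper names are hypothetical, and the pin number 21 used below (commonly given for USR0 on GPIO1) is an assumption to check against the board documentation.

```c
#include <assert.h>
#include <stdint.h>

#define GPIO1             0x4804c000u  /* base address, from the question */
#define GPIO_OE           0x134u
#define GPIO_CLEARDATAOUT 0x190u
#define GPIO_SETDATAOUT   0x194u

/* Address that "SBBO rX, base, offset, 4" stores to: base + offset. */
static uint32_t gpio_reg_addr(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* One bit per pin: pin n of a bank is bit n of each register. */
static uint32_t gpio_pin_mask(unsigned pin)
{
    assert(pin < 32);
    return (uint32_t)1 << pin;
}

/* Model of a write to GPIO_SETDATAOUT: every 1 bit written drives that
 * pin high; 0 bits leave the other pins unchanged. */
static uint32_t write_setdataout(uint32_t dataout, uint32_t written)
{
    return dataout | written;
}

/* Model of a write to GPIO_CLEARDATAOUT: every 1 bit written drives that
 * pin low; 0 bits leave the other pins unchanged. */
static uint32_t write_cleardataout(uint32_t dataout, uint32_t written)
{
    return dataout & ~written;
}
```

Under this model, writing 0xFFFFFFFF to GPIO_CLEARDATAOUT, as the last SBBO in the question does, drives every output pin of the bank low, whereas writing gpio_pin_mask(21) to GPIO_SETDATAOUT would raise only pin 21.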
I am trying to use a recursive procedure to compute A-B in high-level assembly. After computing the difference, it is stored in the EAX register for display at the end of the program.
My problem: The values in registers EAX and EBX are correct before exiting the procedure, but I do not understand why EAX is always zero when A is greater than B.
Is there something about the ret() command that is causing this? What is wrong with my code? Someone please help me.
Here is the sample code:
program MainSubtractionFunction;
#include( "stdlib.hhf" );
static
iDataValue1 : int32 := 0;
iDataValue2 : int32 := 0;
DifferenceInt : int32 :=69;
procedure recursiveSubtraction( a: int32; b : int32 ); #nodisplay; #noframe;
static
returnAddress : dword;
value: int32;
begin recursiveSubtraction;
pop( returnAddress );
pop( b );
pop( a );
push( returnAddress );
mov (a, EAX);
mov (b, EBX);
CompareB:
cmp (EBX, 0);
je ExitSequence;
CompareA:
cmp (EAX, 0);
je AEqualsZero;
NeitherEqualZero:
sub (1, EAX);
sub (1, EBX);
push(EAX);
push(EBX);
call recursiveSubtraction;
AEqualsZero:
neg (EBX);
mov (EBX, EAX);
jmp ExitSequence;
BEqualsZero:
jmp ExitSequence;
ExitSequence:
ret();
end recursiveSubtraction;
begin MainSubtractionFunction;
stdout.put( "Feed Me A: " );
stdin.get( iDataValue1 );
stdout.put( "Feed Me B: " );
stdin.get( iDataValue2 );
push( iDataValue1 );
push( iDataValue2 );
call recursiveSubtraction;
mov(EAX, DifferenceInt);
stdout.put("RecursiveSubtraction of A-B = ",DifferenceInt, nl);
stdout.put("EAX = ",EAX, nl);
stdout.put("EBX = ",EBX, nl);
stdout.put("ECX = ",ECX, nl);
end MainSubtractionFunction;
ENVIRONMENT
HLA (High Level Assembler - HLABE back end, POLINK linker)
Version 2.16 build 4413 (prototype)
Windows 10
NOTE
The recursive algorithm in this question works for the following cases: positive - positive, positive - negative, and negative - positive but will not work for the case negative - negative.
SOLUTION
The problem is that, after returning from the recursive call to recursiveSubtraction, execution always falls through to the AEqualsZero label, which negates EBX and overwrites EAX with it.
NeitherEqualZero:
sub(1, EAX);
sub(1, EBX);
push(EAX);
push(EBX);
call recursiveSubtraction;
AEqualsZero:
neg(EBX);
mov(EBX, EAX);
jmp ExitSequence;
To solve this problem, add an additional jmp ExitSequence after the call to recursiveSubtraction:
NeitherEqualZero:
sub(1, EAX);
sub(1, EBX);
push(EAX);
push(EBX);
call recursiveSubtraction;
jmp ExitSequence;
AEqualsZero:
neg(EBX);
mov(EBX, EAX);
jmp ExitSequence;
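The effect of the missing jump can be reproduced with a small C model of the control flow. This is a sketch for small inputs only: the globals eax and ebx stand in for the registers, and the with_fix flag and the function names are hypothetical, introduced just to compare the two versions side by side.

```c
#include <assert.h>

static int eax, ebx;  /* stand-ins for the EAX and EBX registers */

static void recursive_subtraction(int a, int b, int with_fix)
{
    eax = a;
    ebx = b;
    if (ebx == 0)            /* CompareB: je ExitSequence */
        return;
    if (eax != 0) {          /* CompareA not taken, so NeitherEqualZero runs */
        recursive_subtraction(eax - 1, ebx - 1, with_fix);
        if (with_fix)        /* the added jmp ExitSequence */
            return;
        /* without the fix, execution falls through into AEqualsZero */
    }
    ebx = -ebx;              /* AEqualsZero: neg(EBX) */
    eax = ebx;               /*              mov(EBX, EAX) */
}

/* Returns the value left in the modelled EAX after the outermost call. */
static int subtract(int a, int b, int with_fix)
{
    assert(!(a < 0 && b < 0));  /* negative - negative is excluded, see NOTE */
    recursive_subtraction(a, b, with_fix);
    return eax;
}
```

In this model, subtract(5, 3, 0) ends with EAX at 0, because the innermost call returns with EBX equal to 0 and every outer frame then copies the negated EBX into EAX, while subtract(5, 3, 1) returns 2.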
EXAMPLE
program MainSubtractionFunction;
#include("stdlib.hhf");
procedure recursiveSubtraction(A: int32; B: int32); #nodisplay; #noframe;
begin recursiveSubtraction;
pop(EDX); // Return Address
pop(EBX);
pop(EAX);
push(EDX); // Return Address
CompareB:
cmp(EBX, 0);
je ExitSequence;
CompareA:
cmp(EAX, 0);
je AEqualsZero;
NeitherEqualZero:
dec(EAX);
dec(EBX);
push(EAX);
push(EBX);
call recursiveSubtraction;
jmp ExitSequence;
AEqualsZero:
neg(EBX);
mov(EBX, EAX);
jmp ExitSequence;
BEqualsZero:
jmp ExitSequence;
ExitSequence:
ret();
end recursiveSubtraction;
begin MainSubtractionFunction;
stdout.put("Feed Me A: ");
stdin.geti32();
push(EAX);
stdout.put("Feed Me B: ");
stdin.geti32();
push(EAX);
call recursiveSubtraction;
stdout.put("RecursiveSubtraction of A-B = ", (type int32 EAX), nl);
end MainSubtractionFunction;
I am new to gdb. I want to print the memory addresses used, in their actual order, during execution of a C program. Let me explain with an example. Assume we have the following C code with two functions, main() and test(). I know that, inside gdb, I can use "disassemble main" to disassemble main(), or "disassemble test" to disassemble test(), separately. My question is: how can I disassemble these two functions as a single listing, so that I can see all the memory addresses used during execution and the order in which they are accessed? To be specific, since main() calls test() and test() calls itself multiple times, I want to see something like Example 2. I am also wondering whether the addresses shown by the gdb disassembler are virtual or physical memory addresses. Any help or guidance will be appreciated.
Example 1:
#include <stdio.h>
int test(int q)
{
if(q<16)
test(q+5);
return q;
}
int main(void)
{
unsigned int a=5;
unsigned int b=5;
unsigned int c=5;
test(a);
}
Example 2:
<Memory Address> <assembly instruction> <c instructions>
0x12546a mov //for unsigned int a=5;
0x12546b mov //for unsigned int b=5;
0x12546c mov //for unsigned int c=5;
0x12546d jmp //for test(q=a=5);
0x12546e cmpl //for if(q<16)
0x12546f jmp //for test(q+5);
0x12546d jmp //for test(q=10);
0x12546e cmpl //for if(q<16)
0x12546f jmp //for test(q+5);
0x12547a jmp //for test(q=15);
0x12547b cmpl //for if(q<16)
0x12547c jmp //for test(q+5);
0x12547d jmp //for test(q=20);
0x12547e cmpl //for if(q<16)
0x12547f jmp //return q);
0x12548a jmp //return q);
0x12548b jmp //return q);
0x12548c jmp //return q);
There's really no pretty way to do this. You're just going to have to step through the code:
(gdb) stepi
(gdb) x/i $pc
(gdb) info registers
(gdb) stepi
(gdb) x/i $pc
(gdb) info registers
.....
You could script that up so that it does it quickly and dumps the data to a file, but that's about all.
I suppose you may have more luck with valgrind. If there's no existing tool to do so, it is possible to add your own instrumentation to report memory accesses (and not only that), or alter an existing one.
E.g. see http://valgrind.org/docs/manual/lk-manual.html
--trace-mem=<no|yes> [default: no]
When enabled, Lackey prints the size and address of almost every memory access made by the program.
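If changing the source is acceptable, a rough source-level alternative to single-stepping is to log each entry to and return from test() yourself. The sketch below records events in a fixed-size array; TraceEvent, trace_record and test_traced are hypothetical names, and this captures only the call order, not the instruction addresses that stepi or Lackey would show.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_MAX 64  /* arbitrary capacity for this sketch */

typedef struct {
    int q;          /* argument of the test() invocation */
    int is_return;  /* 0 = entering the call, 1 = returning from it */
} TraceEvent;

static TraceEvent trace_log[TRACE_MAX];
static size_t trace_len;

/* Append one event to the log. */
static void trace_record(int q, int is_return)
{
    assert(trace_len < TRACE_MAX);
    trace_log[trace_len].q = q;
    trace_log[trace_len].is_return = is_return;
    ++trace_len;
}

/* Same logic as test() in Example 1, with an event logged on entry and exit. */
static int test_traced(int q)
{
    trace_record(q, 0);
    if (q < 16)
        test_traced(q + 5);
    trace_record(q, 1);
    return q;
}
```

Calling test_traced(5) records entries for q = 5, 10, 15, 20 followed by returns for q = 20, 15, 10, 5.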
My understanding of Dart leads me to believe that this 'cast' should not affect run-time semantics, but just wanted to confirm:
(foo as Bar).fee();
(foo as Bar).fi();
(foo as Bar).fo();
Or is it "best practice" to cast once:
final bFoo = (foo as Bar);
bFoo.fee();
bFoo.fi();
bFoo.fo();
This is highly dependent on how the DartVM optimizer handles the case. Using the latest version of Dart I constructed two test functions:
void test1() {
Dynamic bar = makeAFoo();
for (int i = 0; i < 5000; i++) {
(bar as Foo).a();
(bar as Foo).b();
}
}
and
void test2() {
Dynamic bar = makeAFoo();
Foo f = bar as Foo;
for (int i = 0; i < 5000; i++) {
f.a();
f.b();
}
}
Looking at the optimized code for test1 you can see the loop looks like this:
00D09A3C bf813b9d00 mov edi,0x9d3b81 'instance of Class: SubtypeTestCache'
00D09A41 57 push edi
00D09A42 50 push eax
00D09A43 6811003400 push 0x340011
00D09A48 e8d36c83ff call 0x540720 [stub: Subtype1TestCache]
00D09A4D 58 pop eax
00D09A4E 58 pop eax
00D09A4F 5f pop edi
00D09A50 81f911003400 cmp ecx,0x340011
00D09A56 7411 jz 0xd09a69
00D09A58 81f9710f7c00 cmp ecx,0x7c0f71
00D09A5E 0f8437000000 jz 0xd09a9b
00D09A64 e900000000 jmp 0xd09a69
00D09A69 8b1424 mov edx,[esp]
00D09A6C 8b4c2404 mov ecx,[esp+0x4]
00D09A70 6811003400 push 0x340011
00D09A75 50 push eax
00D09A76 68b9229d00 push 0x9d22b9
00D09A7B 51 push ecx
00D09A7C 52 push edx
00D09A7D 6889289d00 push 0x9d2889
00D09A82 b8813b9d00 mov eax,0x9d3b81 'instance of Class: SubtypeTestCache'
00D09A87 50 push eax
00D09A88 b9b0d00b00 mov ecx,0xbd0b0
00D09A8D ba06000000 mov edx,0x6
00D09A92 e8896583ff call 0x540020 [stub: CallToRuntime]
00D09A97 83c418 add esp,0x18
00D09A9A 58 pop eax
00D09A9B 5a pop edx
00D09A9C 59 pop ecx
00D09A9D 50 push eax
00D09A9E a801 test al,0x1
00D09AA0 0f8450010000 jz 0xd09bf6
00D09AA6 0fb74801 movzx_w ecx,[eax+0x1]
00D09AAA 81f922020000 cmp ecx,0x222
00D09AB0 0f8540010000 jnz 0xd09bf6
00D09AB6 b9d1229d00 mov ecx,0x9d22d1 'Function 'a':.'
00D09ABB bae96ccb00 mov edx,0xcb6ce9 Array[1, 1, null]
00D09AC0 e82b6983ff call 0x5403f0 [stub: CallStaticFunction]
00D09AC5 83c404 add esp,0x4
00D09AC8 b911003400 mov ecx,0x340011
00D09ACD ba11003400 mov edx,0x340011
00D09AD2 8b45f4 mov eax,[ebp-0xc]
00D09AD5 51 push ecx
00D09AD6 52 push edx
00D09AD7 3d11003400 cmp eax, 0x340011
00D09ADC 0f849a000000 jz 0xd09b7c
00D09AE2 a801 test al,0x1
00D09AE4 7505 jnz 0xd09aeb
00D09AE6 e95f000000 jmp 0xd09b4a
00D09AEB 0fb74801 movzx_w ecx,[eax+0x1]
00D09AEF 81f922020000 cmp ecx,0x222
00D09AF5 0f8481000000 jz 0xd09b7c
00D09AFB 0fb77801 movzx_w edi,[eax+0x1]
00D09AFF 8b4e07 mov ecx,[esi+0x7]
00D09B02 8b891c100000 mov ecx,[ecx+0x101c]
00D09B08 8b0cb9 mov ecx,[ecx+edi*0x4]
00D09B0B 8b7927 mov edi,[ecx+0x27]
00D09B0E 8b7f03 mov edi,[edi+0x3]
00D09B11 81ff59229d00 cmp edi,0x9d2259
00D09B17 0f845f000000 jz 0xd09b7c
00D09B1D bfd13b9d00 mov edi,0x9d3bd1 'instance of Class: SubtypeTestCache'
00D09B22 57 push edi
00D09B23 50 push eax
00D09B24 6811003400 push 0x340011
00D09B29 e8f26b83ff call 0x540720 [stub: Subtype1TestCache]
00D09B2E 58 pop eax
00D09B2F 58 pop eax
00D09B30 5f pop edi
00D09B31 81f911003400 cmp ecx,0x340011
00D09B37 7411 jz 0xd09b4a
00D09B39 81f9710f7c00 cmp ecx,0x7c0f71
00D09B3F 0f8437000000 jz 0xd09b7c
00D09B45 e900000000 jmp 0xd09b4a
00D09B4A 8b1424 mov edx,[esp]
00D09B4D 8b4c2404 mov ecx,[esp+0x4]
00D09B51 6811003400 push 0x340011
00D09B56 50 push eax
00D09B57 68b9229d00 push 0x9d22b9
00D09B5C 51 push ecx
00D09B5D 52 push edx
00D09B5E 6889289d00 push 0x9d2889
00D09B63 b8d13b9d00 mov eax,0x9d3bd1 'instance of Class: SubtypeTestCache'
00D09B68 50 push eax
00D09B69 b9b0d00b00 mov ecx,0xbd0b0
00D09B6E ba06000000 mov edx,0x6
00D09B73 e8a86483ff call 0x540020 [stub: CallToRuntime]
00D09B78 83c418 add esp,0x18
00D09B7B 58 pop eax
00D09B7C 5a pop edx
00D09B7D 59 pop ecx
00D09B7E 50 push eax
00D09B7F a801 test al,0x1
00D09B81 0f8479000000 jz 0xd09c00
00D09B87 0fb74801 movzx_w ecx,[eax+0x1]
00D09B8B 81f922020000 cmp ecx,0x222
00D09B91 0f8569000000 jnz 0xd09c00
00D09B97 b961239d00 mov ecx,0x9d2361 'Function 'b':.'
00D09B9C bae96ccb00 mov edx,0xcb6ce9 Array[1, 1, null]
00D09BA1 e84a6883ff call 0x5403f0 [stub: CallStaticFunction]
00D09BA6 83c404 add esp,0x4
00D09BA9 8b4df8 mov ecx,[ebp-0x8]
00D09BAC 83c102 add ecx,0x2
00D09BAF 0f8055000000 jo 0xd09c0a
00D09BB5 89cf mov edi,ecx
00D09BB7 8b5df4 mov ebx,[ebp-0xc]
00D09BBA e90efeffff jmp 0xd099cd
In the optimized code for test2, the loop looks like this:
00D09F3D 894df4 mov [ebp-0xc],ecx
00D09F40 81f910270000 cmp ecx,0x2710
00D09F46 0f8d46000000 jnl 0xd09f92
00D09F4C 3b251c414700 cmp esp,[0x47411c]
00D09F52 0f8659000000 jna 0xd09fb1
00D09F58 50 push eax
00D09F59 b9d1229d00 mov ecx,0x9d22d1 'Function 'a':.'
00D09F5E bae96ccb00 mov edx,0xcb6ce9 Array[1, 1, null]
00D09F63 e8886483ff call 0x5403f0 [stub: CallStaticFunction]
00D09F68 83c404 add esp,0x4
00D09F6B 8b45f0 mov eax,[ebp-0x10]
00D09F6E 50 push eax
00D09F6F b961239d00 mov ecx,0x9d2361 'Function 'b':.'
00D09F74 bae96ccb00 mov edx,0xcb6ce9 Array[1, 1, null]
00D09F79 e8726483ff call 0x5403f0 [stub: CallStaticFunction]
00D09F7E 83c404 add esp,0x4
00D09F81 8b4df4 mov ecx,[ebp-0xc]
00D09F84 83c102 add ecx,0x2
00D09F87 0f8048000000 jo 0xd09fd5
00D09F8D 8b45f0 mov eax,[ebp-0x10]
00D09F90 ebab jmp 0xd09f3d
test2 makes only one set of calls to the Subtype1TestCache stub (outside the loop), instead of two inside the loop as in test1.
Today, it seems that doing the cast once is faster, but hoisting the cast out of the loop looks like a simple optimization that the VM may do in the future.
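As a language-neutral illustration of why the placement of the check matters, here is a C analogy that counts runtime type checks. It is not Dart and says nothing about what the Dart VM actually emits: the tagged Object struct, checked_cast_bar and the counter are all hypothetical stand-ins for an as cast that is not optimized away.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TYPE_FOO, TYPE_BAR } TypeTag;

typedef struct {
    TypeTag tag;  /* runtime type of the object */
    int calls;    /* how many times a method was invoked on it */
} Object;

static unsigned long checks_performed;

/* Stand-in for "obj as Bar": a runtime check that fails loudly otherwise. */
static Object *checked_cast_bar(Object *obj)
{
    ++checks_performed;
    assert(obj != NULL && obj->tag == TYPE_BAR);
    return obj;
}

static void method(Object *bar)
{
    bar->calls++;
}

/* Casts before every method call, like test1. Returns the number of checks. */
static unsigned long cast_every_call(Object *obj, int iterations)
{
    checks_performed = 0;
    for (int i = 0; i < iterations; i++) {
        method(checked_cast_bar(obj));
        method(checked_cast_bar(obj));
    }
    return checks_performed;
}

/* Casts once before the loop, like test2. Returns the number of checks. */
static unsigned long cast_once(Object *obj, int iterations)
{
    checks_performed = 0;
    Object *bar = checked_cast_bar(obj);
    for (int i = 0; i < iterations; i++) {
        method(bar);
        method(bar);
    }
    return checks_performed;
}
```

With 5000 iterations, the first version performs 10000 checks and the second performs 1, which mirrors the difference between the two listings above as long as the check is not hoisted by the compiler.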
Running (foo as Bar) has two effects:
It tells the editor that foo is a Bar which helps with static type analysis and lets the editor do code completion.
It checks that foo is a Bar (or a subtype of Bar), otherwise it'll throw a CastException.
Look for "Type Cast" in the language specification: http://www.dartlang.org/docs/spec/latest/dart-language-specification.pdf
Updated: I like John's answer, too, but I should add one more thing. I overlooked the fact that you were talking about doing the cast once versus three times. Looking at final bFoo = (foo as Bar);, there is a point to make about the language semantics.
It's true that Dart Editor, dart2js, and the VM could conceivably infer that foo is of type Bar, which would save additional checks, etc. However, the semantics of the language say something slightly different. "final bFoo" does not have a type annotation. So according to the language spec, bFoo is of type Dynamic.
Hence, when you write "(foo as Bar)" three times, each expression results in a Bar. But when you write bFoo, you have a Dynamic object.
It is not "best practice" to perform three as casts right in a row for the same variable.
An as cast is really a runtime check. I'm just guessing, but if you are trying to reduce warnings from the editor, there is probably a better way to do it.
For example, here's one scenario:
class Foo {
}
class Bar extends Foo {
m1() => print('m1');
}
doStuff(Foo foo) {
foo.m1(); // warning here
}
main() {
var foo = new Bar();
doStuff(foo);
}
The above code runs just fine, but the editor does show a warning. To eliminate the warning, it's better to refactor the code. You could remove the Foo annotation from doStuff, or you could consider moving m1() up to Foo, or you could do double-dispatch.