I am running an application on an embedded (PowerPC 32 bit) system where there is a stack size limitation of 64K. I am experiencing some occasional crashes because of stack overflow.
I can build the application also for a normal Linux system (with some minor little changes in the code), so I can run an emulation on my development environment.
I was wondering which is the best way to find the methods that exceed the stack size limitation and which is the stack frame when this happens (in order to perform some code refactoring).
I've already tried Callgrind (a Valgrind tool), but it seems not to be the right tool.
I'm looking more for a tool than changes in the code (since it's a 200K LOC and 100 files project).
The application is entirely written in C++03.
While it seems that there should be an existing tool for this, I would approach it by writing a small macro and adding it to the top of suspected functions:
char *__stack_root__;
#define GUARD_SIZE (64 * 1024 - 1024)
#define STACK_ROOT \
char __stack_frame__; \
__stack_root__ = &__stack_frame__;
#define STACK_GUARD \
char __stack_frame__; \
if (abs(&__stack_frame__ - __stack_root__) > GUARD_SIZE) { \
printf("stack is about to overflow in %s: at %d bytes\n", __FUNCTION__, abs(&__stack_frame__ - __stack_root__)); \
}
And here how to use it:
#include <stdio.h>
#include <stdlib.h>
void foo(int);
int main(int argc, char** argv) {
STACK_ROOT; // this macro records the top of the bottom of the thread's stack
foo(10000);
return 0;
}
void foo(int x) {
STACK_GUARD; // this macro checks if we're approaching the end of memory available for stack
if (x > 0) {
foo(x - 1);
}
}
couple notes here:
this code assumes single thread. If you have multiple threads, you need to keep track of individual __stack_frame__ variables, one per thread. Use thread local storage for this
using abs() to make sure the macro works both when PowerPC grows its stack up, as well as down (it can: depends on your setup)
adjust the GUARD_SIZE to your liking, but keep it smaller than the max size of your stack on the target
Related
I am working on dynamic memory analysis using stack painting/foot print analysis method.
dynamic-stack-depth-determination-using-footprint-analysis
basically the idea is to fill the entire amount of memory allocated to the stack area with a dedicated fill value, for example 0xABABABAB, before the application starts executing. Whenever the execution stops, the stack memory can be searched upwards from the end of the stack until a value that is not 0xABABABABis found, which is assumed to be how far the stack has been used. If the dedicated value cannot be found, the stack has consumed all stack space and most likely has overflowed.
I want a c code to fill the stack from top to bottom with a pattern.
void FillSystemStack()
{
extern char __stack_start,_Stack_bottom;
}
NOTE
I am using STM32F407VG board emulated with QEMU on eclipse.
stack is growing from higher address to lower address
start of the stack is 0x20020000
bottom of the stack is Ox2001fc00
You shouldn't completely fill the stack after main() begins, because the stack is in use once main() begins. Completely filling the stack would overwrite the bit of stack that has already been used and could lead to undefined behavior. I suppose you could fill a portion of the stack soon after main() begins as long as you're careful not to overwrite the portion that has been used already.
But a better plan is to fill the stack with a pattern before main() is called. Review the startup code for your tool chain. The startup code initializes variable values and sets the stack pointer before calling main(). The startup code may be in assembly depending on your tool chain. The code that initializes variables is probably a simple loop that copies bytes or words from the appropriate ROM to RAM sections. You can probably use this code as an example to write a new loop that will fill the stack memory range with a pattern.
This is a Cortex M so it gets the stack pointer set out of reset. Meaning it's pretty much instantly ready to go for C code. If your reset vector is written in C and performs stacking/C function calls, it will be too late to fill the stack at a very early stage. Meaning you shouldn't do it from application C code.
The normal way to do the trick you describe is through an in-circuit debugger. Download the program, hit reset, fill the stack with the help of the debugger. There will be some convenient debugger command available to do that. Execute the program, try to use as much of it as possible, observe the stack in the memory map of your debugger.
With the insights from #kkrambo answer, I tried to paint the stack just after the start of main by taking care that I do not overwrite the portion that has been used already.My stack paint and stack count functions are given below:
StackPaint
uint32_t sp;
#define STACK_CANARY 0xc5
#define WaterMark 0xc9
void StackPaint(void)
{
char *p = &__StackLimit; // __StackLimit macro defined in linker script
sp=__get_MSP();
PRINTF("stack pointer %08x \r\n\n",sp);
while((uint32_t)p < sp)
{
if(p==&__StackTop){ // __StackTop macro defined in linker script
*p = WaterMark;
p++;
}
*p = STACK_CANARY;
p++;
}
}
StackCount
uint16_t StackCount(void)
{
PRINTF("In the check address function in main file \r\n\n");
const char *p = &__StackLimit;
uint16_t c = 0;
while(*p == WaterMark || (*p == STACK_CANARY && p <= &__StackTop))
{
p++;
c++;
}
PRINTF("stack used:%d bytes \n",1024-c);
PRINTF("remaining stack :%d bytes\n",c);
return c;
}
void deal_msg(unsigned char * buf, int len)
{
unsigned char msg[1024];
strcpy(msg,buf);
//memcpy(msg, buf, len);
puts(msg);
}
void main()
{
// network operation
sock = create_server(port);
len = receive_data(sock, buf);
deal_msg(buf, len);
}
As the pseudocode shows above, the compile environment is vc6 and running environment is windows xp sp3 en. No other protection mechanisms are applied, that is stack can be executed, no ASLR.
The send data is 'A' * 1024 + addr_of_jmp_esp + shellcode.
My question is:
if strcpy is used, the shellcode is generated by msfvenom, msfvenom -p windows/exec cmd=calc.exe -a x86 -b "\x00" -f python,
msfvenom attempts to encode payload with 1 iterations of x86/shikata_ga_nai
after data is sent, no calc pops up, the exploit won't work.
But if memcpy is used, shellcode generated by msfvenom -p windows/exec cmd=calc.exe -a x86 -f python without encoding works.
How to avoid the original program's crash after calc pops up, how to keep stack balance to avoid crash?
Hard to say. I'd use a custom payload (just copy the windows/exec cmd=calc.exe) and put a 0xcc at the start and debug it (or something that will be easily recognizable under the debugger like a ud2 or \0xeb\0xfe). If your payload is executed, you'll see it. Bypass the added instruction (just NOP it) and try to see what can possibly go wrong with the remainder of the payload.
You'll need a custom payload ; Since you're on XP SP3 you don't need to do crazy things.
Don't try to do the overflow and smash the whole stack (given your overflow it seems to be perfect, just enough overflow to control rIP).
See how the target function (deal_msg in your example) behave under normal conditions. Note the stack address when the ret is executed (and if register need to have certain values, this depend on the caller).
Try to replicate that in your shellcode: you'll most probably to adjust the stack pointer a bit at the end of your shellcode.
Make sure the caller (main) stack hasn't been affected when executing the payload. This might happen, in this case reserve enough room on the stack (going to lower addresses), so the caller stack is far from the stack space needed by the payload and it doesn't get affected by the payload execution.
Finally return to the ret of the target or directly after the call of the deal_msg function (or anywhere you see fit, e.g. returning directly to ExitProcess(), but this might be more interesting to return close to the previous "normal" execution path).
All in all, returning somewhere after the payload execution is easy, just push <addr> and ret but you'll need to ensure that the stack is in good shape to continue execution and most of the registers are correctly set.
TL;DR;
How to get the macro name used for size of a constant size array declaration, from a callExpr -> arg_0 -> DeclRefExpr.
Detailed Problem statement:
Recently I started working on a challenge which requires source to source transformation tool for modifying
specific function calls with an additional argument. Reasearching about the ways i can acheive introduced me
to this amazing toolset Clang. I've been learning how to use different tools provided in libtooling to
acheive my goal. But now i'm stuck at a problem, seek your help here.
Considere the below program (dummy of my sources), my goal is to rewrite all calls to strcpy
function with a safe version of strcpy_s and add an additional parameter in the new function call
i.e - destination pointer maximum size. so, for the below program my refactored call would be like
strcpy_s(inStr, STR_MAX, argv[1]);
I wrote a RecursiveVisitor class and inspecting all function calls in VisitCallExpr method, to get max size
of the dest arg i'm getting VarDecl of the first agrument and trying to get the size (ConstArrayType). Since
the source file is already preprocessed i'm seeing 2049 as the size, but what i need is the macro STR_MAX in
this case. how can i get that?
(Creating replacements with this info and using RefactoringTool replacing them afterwards)
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define STR_MAX 2049
int main(int argc, char **argv){
char inStr[STR_MAX];
if(argc>1){
//Clang tool required to transaform the below call into strncpy_s(inStr, STR_MAX, argv[1], strlen(argv[1]));
strcpy(inStr, argv[1]);
} else {
printf("\n not enough args");
return -1;
}
printf("got [%s]", inStr);
return 0;
}
As you noticed correctly, the source code is already preprocessed and it has all the macros expanded. Thus, the AST will simply have an integer expression as the size of array.
A little bit of information on source locations
NOTE: you can skip it and proceed straight to the solution below
The information about expanded macros is contained in source locations of AST nodes and usually can be retrieved using Lexer (Clang's lexer and preprocessor are very tightly connected and can be even considered one entity). It's a bare minimum and not very obvious to work with, but it is what it is.
As you are looking for a way to get the original macro name for a replacement, you only need to get the spelling (i.e. the way it was written in the original source code) and you don't need to carry much about macro definitions, function-style macros and their arguments, etc.
Clang has two types of different locations: SourceLocation and CharSourceLocation. The first one can be found pretty much everywhere through the AST. It refers to a position in terms of tokens. This explains why begin and end positions can be somewhat counterintuitive:
// clang::DeclRefExpr
//
// ┌─ begin location
foo(VeryLongButDescriptiveVariableName);
// └─ end location
// clang::BinaryOperator
//
// ┌─ begin location
int Result = LHS + RHS;
// └─ end location
As you can see, this type of source location points to the beginning of the corresponding token. CharSourceLocation on the other hand, points directly to the characters.
So, in order to get the original text of the expression, we need to convert SourceLocation's to CharSourceLocation's and get the corresponding text from the source.
The solution
I've modified your example to show other cases of macro expansions as well:
#define STR_MAX 2049
#define BAR(X) X
int main() {
char inStrDef[STR_MAX];
char inStrFunc[BAR(2049)];
char inStrFuncNested[BAR(BAR(STR_MAX))];
}
The following code:
// clang::VarDecl *VD;
// clang::ASTContext *Context;
auto &SM = Context->getSourceManager();
auto &LO = Context->getLangOpts();
auto DeclarationType = VD->getTypeSourceInfo()->getTypeLoc();
if (auto ArrayType = DeclarationType.getAs<ConstantArrayTypeLoc>()) {
auto *Size = ArrayType.getSizeExpr();
auto CharRange = Lexer::getAsCharRange(Size->getSourceRange(), SM, LO);
// Lexer gets text for [start, end) and we want him to grab the end as well
CharRange.setEnd(CharRange.getEnd().getLocWithOffset(1));
auto StringRep = Lexer::getSourceText(CharRange, SM, LO);
llvm::errs() << StringRep << "\n";
}
produces this output for the snippet:
STR_MAX
BAR(2049)
BAR(BAR(STR_MAX))
I hope this information is helpful. Happy hacking with Clang!
I wrote a program which only purpose was to allocate a given amount of memory so that I could see its effect on the Ubuntu 12.04 System Monitor.
This is what I wrote
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char **argv)
{
if(argc<2)
exit(-1);
int *mem=0;
int kB = 1024*atoi(argv[1]);
mem = malloc(kB);
if(mem==NULL)
exit(-2);
sleep(3);
free(mem);
exit(0);
}
I used sleep() so that the program would not end immediately so that System Monitor would have time to show the change in memory.
What happens (and I'm puzzled about) is that even if I allocate 1GB, the System Monitor shows no memory change at all! Why is that so?
I thought the reason could be because the allocated memory was never accessed so I tried to insert the following before sleep()
int i;
for(i=0; i<kB/sizeof(int); i++)
mem[i]=i;
But this too had no effect on the graph in System Monitor.
Thanks.
I am trying to emulate grep pattern of UNIX using a C program( just for learning ). The code that i have written is giving me a run time error..
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#define MAXLENGTH 1000
char userBuf[MAXLENGTH];
int main ( int argc, char *argv[])
{
int numOfBytes,fd,i;
if (argc != 2)
printf("Supply correct number of arguments.\n");
//exit(1);
fd =open("pattern.txt",O_RDWR);
if ( fd == -1 )
printf("File does not exist.\n");
//exit(1);
while ( (numOfBytes = read(fd,userBuf,MAXLENGTH)) > 0 )
;
printf("NumOfBytes = %d\n",numOfBytes);
for(i=0;userBuf[i] != '\0'; ++i)
{
if ( strstr(userBuf,argv[1]) )
printf("%s\n",userBuf);
}
}
The program is printing infinitely, the lines containing the pattern . I tried debugging , but couldn't figure out the error. Please let me know where am i wrong.,
Thanks
Say the string is "fooPATTERN". Your first time through the loop, you check for the pattern in "fooPATTERN" and find it. Then your second time through the loop, you check for the pattern in "ooPATTERN" and find it again. Then your third time, you check for the pattern in "oPATTERN" and find it again.
Since you're doing this to learn, I won't tell you much more. You can decide how best to solve it. There are at least two fundamentally different ways you could solve it. One is to do less on each pass of the loop to ensure you only find it once. The other is to make sure your next pass of the loop is past any pattern that was found.
One thing to think about: If the pattern is 'oo' and the string is 'ooo', how many patterns should be found? 1 or 2?
The 'read' does not delimit the data with a null character.
The while loop should encompase the for loop - it doesn't
First, you shouldn't be using raw Unix i/o with open and read if you're just learning C. Start with standard C i/o with fopen and fread/fscanf/fgets and so forth.
Second, you're reading in successive pieces of the file into the same buffer, overwriting the buffer each time, and only ever processing the last contents of the buffer.
Third, nothing guarantees that your buffer will be zero-terminated when you read into it with read(). In fact, it usually won't be.
Fourth, you're not using the i variable in the body of your loop. I can't tell exactly what you were shooting for here, but doing the same thing on the same data umpteen thousand times surely wasn't it.
Fifth, always compile with the fullest warning settings you can abide -- at lest -Wall with GCC. It should have complained that you call read() without including <unistd.h>.