How is the run-time stack maintained after a buffer is injected with malicious code?

I was reading the following paper on buffer overflows: http://www1.telhai.ac.il/sources/private/academic/cs/557/2659/Materials/Smashing.pdf
According to the examples in the paper, the attacker injects code into the buffer that is overflowed and modifies the return address to point to the address of the buffer. Since the injected code and the local variables both lie on the stack, shouldn't the attacker make sure that enough space is left for local variables before overflowing the buffer with the code? More broadly, I'm confused about how the stack is maintained when both the code and the local variables are on it. Is there a chance of local variables overwriting the injected code?

One solution is to not use any local variables. That isn't necessarily a big constraint for code that is going to be small and do its work by calling into something else anyway. ("Not a big constraint" is relative, of course; writing this sort of code at all is beyond most of us, but since it isn't easy code to write in the first place, avoiding locals is a minor extra burden in that context.)
That's not even necessary though. A stack will look something like:
Function A Arg 2
Function A Arg 1
Function A Arg 0
Return Address into caller
Function A Local 1
Function A Local 0
Function B Arg 1
Function B Arg 0
Return Address into place in Function A's executable code.
Function B Local 2
Function B Local 1
Function B Local 0
Function C Arg 0
Return Address into place in Function B's executable code.
Function C Local 3
Function C Local 2
Function C Local 1
Function C Local 0
...
A buffer overflow tricks it into looking like:
Function A Arg 2
Function A Arg 1
Function A Arg 0
Return Address into caller
Function A Local 1
NastyCode bytes 0-3
NastyCode bytes 4-7
NastyCode bytes 8-B
NastyCode bytes C-F
NastyCode bytes 10-13
NastyCode bytes 14-17
NastyCode bytes 18-1B
NastyCode bytes 1C-1F
NastyCode bytes 20-23
...
NastyCode arg 0 (etc.)
Return Address (of NastyCode [may not even be valid, may return to original caller, or may jump into something it'll set up]).
NastyCode Local 2
NastyCode Local 1
NastyCode Local 0
Return Address (should be of Function F, into a point in Function E but it's changed to point to byte 0 of NastyCode)
Function F Local 2
Function F Local 1
Function F Local 0
So the executable code written onto the stack can be kept away from the local variables of that code.
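For context, the paper's examples target C code along the lines of the sketch below (a minimal illustration only; the function name, buffer size, and use of strcpy are assumptions, not taken from the paper). The fixed-size buffer is an ordinary local variable, so writing past its end walks over the rest of the stack frame, including the saved return address that the attacker redirects at the injected bytes.
#include <string.h>

/* A classic overflow target: `buffer` lives in this function's stack frame,
 * below the saved frame pointer and return address. strcpy() performs no
 * bounds checking, so input longer than 64 bytes overwrites whatever sits
 * above the buffer -- including the return address. */
void vulnerable(const char *input)
{
    char buffer[64];
    strcpy(buffer, input);   /* no length check: the overflow happens here */
}

int main(int argc, char **argv)
{
    if (argc > 1)
        vulnerable(argv[1]); /* attacker-controlled input */
    return 0;
}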

Related

Distinguish function vs closure

Lua will write the code of a function out as bytes using string.dump, but warns that this does not work if there are any upvalues. Various snippets online describe hacking around this with debug. It looks like 'closed over variables' are called 'upvalues', which seems clear enough. Code is not data etc.
I'd like to serialise functions and don't need them to have any upvalues. The serialised function can take a table as an argument that gets serialised separately.
How do I detect attempts to pass closures to string.dump, before calling the broken result later?
Current thought is debug.getupvalue at index 1 and treat nil as meaning function, as opposed to closure, but I'd rather not call into the debug interface if there's an alternative.
Thanks!
Even with the debug library it's very difficult to say whether a function has a non-trivial upvalue.
Here "non-trivial" means "any upvalue except _ENV".
When debug info is stripped from your program, all upvalues look almost the same :-)
local function test()
    local function f1()
        -- usual function without upvalues (except _ENV for accessing globals)
        print("Hello")
    end
    local upv = {}
    local function f2()
        -- this function does have an upvalue
        upv[#upv+1] = ""
    end
    -- display first upvalues
    print(debug.getupvalue(f1, 1))
    print(debug.getupvalue(f2, 1))
end

test()
local test_stripped = load(string.dump(test, true))
test_stripped()
Output:
_ENV table: 00000242bf521a80 -- test f1
upv table: 00000242bf529490 -- test f2
(no name) table: 00000242bf521a80 -- test_stripped f1
(no name) table: 00000242bf528e90 -- test_stripped f2
The first two lines of the output are printed by test(), the last two lines by test_stripped().
As you can see, inside test_stripped the functions f1 and f2 are almost indistinguishable.

Memory allocation when a pointer is assigned

I am trying to avoid memory allocation and local copies in my code. Below is a small example:
module test
    implicit none
    public
    integer, parameter :: nb = 1000

    type :: info
        integer n(nb)
        double precision d(nb)
    end type info

    type(info), save :: abc
    type(info), target, save :: def

contains

    subroutine test_copy(inf)
        implicit none
        type(info), optional :: inf
        type(info) :: local
        if (present(inf)) then
            local = inf
        else
            local = abc
        endif
        local%n = 1
        local%d = 1.d0
    end subroutine test_copy

    subroutine test_assoc(inf)
        implicit none
        type(info), target, optional :: inf
        type(info), pointer :: local
        if (present(inf)) then
            local => inf
        else
            local => def
        endif
        local%n = 1
        local%d = 1.d0
    end subroutine test_assoc

end module test
program run
    use test
    use caliper_mod
    implicit none
    type(ConfigManager), save :: mgr

    abc%n = 0
    abc%d = 0.d0
    def%n = 0
    def%d = 0.d0

    ! Init caliper profiling
    mgr = ConfigManager_new()
    call mgr%add("runtime-report(mem.highwatermark,output=stdout)")
    call mgr%start

    ! Call subroutine with copy
    call cali_begin_region("test_copy")
    call test_copy()
    call cali_end_region("test_copy")

    ! Call subroutine with pointer
    call cali_begin_region("test_assoc")
    call test_assoc()
    call cali_end_region("test_assoc")

    ! End caliper profiling
    call mgr%flush()
    call mgr%stop()
    call mgr%delete()
end program run
As far as I understand, the subroutine test_copy should produce a local copy, while the subroutine test_assoc should only associate a pointer with an existing object. However, memory profiling with Caliper gives:
$ ./a.out
Path        Min time/rank  Max time/rank  Avg time/rank  Time %    Allocated MB
test_assoc       0.000026       0.000026       0.000026  0.493827      0.000021
test_copy        0.000120       0.000120       0.000120  2.279202      0.000019
What looks odd is that Caliper reports exactly the same amount of memory allocated regardless of the value of the parameter nb. Am I using the right tool, in the right way, to track memory allocation and local copies?
Test performed with gfortran 11.2.0 and Caliper 2.8.0.
The local object is just a small scalar, albeit one containing array components, and it will very likely be placed on the stack.
A stack is a pre-allocated part of memory of fixed size. Stack "allocation" is actually just a change of value of the stack pointer which is just one integer value. No actual memory allocation from the operating system takes place during stack allocation. There will be no change in the memory the process occupies.
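For illustration only, here is a rough C analogue of what test_copy does (a sketch, not a translation of the Fortran above): the local object is an automatic variable, so reserving space for it is just a stack-pointer adjustment and never shows up as an allocation.
#include <string.h>

#define NB 1000                    /* counterpart of the Fortran parameter nb */

struct info {
    int    n[NB];
    double d[NB];
};

/* Counterpart of test_copy: `local` is an automatic (stack) variable.
 * The compiler reserves room for it by moving the stack pointer in the
 * function prologue (roughly "sub rsp, sizeof(struct info)"); no heap
 * request such as malloc() is made, so an allocation-tracking profiler
 * reports essentially nothing here no matter how large NB is, although
 * the copy itself still costs time. */
static void test_copy(const struct info *inf)
{
    struct info local;
    memcpy(&local, inf, sizeof local);
    local.n[0] = 1;                /* touch the copy so it is not optimised away */
}

int main(void)
{
    static struct info big = {0};  /* static, so not part of this stack frame */
    test_copy(&big);
    return 0;
}
That is consistent with the timings above: test_copy takes noticeably longer than test_assoc because the copy is real work, while the "Allocated MB" column barely moves because no heap allocation is involved.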

Do arrays get put into the activation stack instance?

Would the activation record instance (ARI) for main just have local variable a as one spot and local b as one spot, or would local b have three spots?
Would the parameters for Procedure X have one spot for var c and three spots for var d?

Strange behavior caused by debug.getinfo(1, "n").name

I learned that you can get the function name inside a function by using debug.getinfo(1, "n").name.
Using this feature, I found some strange behavior in Lua.
Here's my code:
function myFunc()
    local name = debug.getinfo(1, "n").name
    return name
end

function foo()
    return myFunc()
end

function boo()
    local name = myFunc()
    return name
end
print(foo())
print(boo())
Result:
nil
myFunc
As you can see, the functions foo() and boo() call the same function myFunc(), but they return different results.
If I replace debug.getinfo(1, "n").name with some other string, they return the same result as expected, but I don't understand the unexpected behavior caused by using debug.getinfo().
Is it possible to correct myFunc() function so calling both foo() and boo() functions return the same result?
Expected result:
myFunc
myFunc
In Lua, any return statement of the form return <expression_yielding_a_function>(...) is a "tail call". Tail calls essentially don't exist in the call stack, so they take up no additional space or resources. The function you call effectively gets erased from the debug information.
Is it possible to correct myFunc() function so calling both foo() and boo() functions return the same result?
Um... yes, but before I tell you how, allow me to try to convince you not to do this.
As previously mentioned, tail calls are part of the Lua language. The removal of tail calls from the stack is not an "optimization" any more than it is an "optimization" for a for loop to exit when you use break. It is a part of Lua's grammar, and Lua programmers have just as much a right to expect a tail call to be a tail call as they have the right to expect break to exit loops.
Lua, as a language, specifically states that this:
local function recursive(...)
    -- some terminating condition
    return recursive(modified_args)
end
will never, ever, run out of stack space. It will be just as stack space efficient as performing a loop. This is a part of the Lua language, just as much a part of it as the behavior of for and while.
If a user wants to call your function via a tail call, that is their right as the user of a language that makes tail calls a thing. Denying users of a language the right to use the features of that language is rude.
So don't do it.
Furthermore, your code suggests that you are attempting to rely on functions having names. That you're doing something significant and meaningful with those names.
Well, Lua is not Python; Lua functions do not have to have names, period. As such, you should not write code that meaningfully relies upon the name of a function. For debugging or logging purposes, fine. But you should not break user expectations just for debugging and logging. So if the user made a tail call, just accept that's what the user wanted and that your debugging/logging will suffer slightly.
OK, so, do we agree that you shouldn't do this? That Lua users have the right to tail calls, and you don't have the right to deny them? That Lua functions are not named and you shouldn't write code that requires them to maintain a name? OK?
What follows is terrible code that you should never use! (in Lua 5.3):
function bypass_tail_call(Func)
    local function tail_call_bypass(...)
        local rets = table.pack(Func(...))
        -- unpack all values, including any trailing nils recorded in rets.n
        return table.unpack(rets, 1, rets.n)
    end
    return tail_call_bypass
end
Then, simply replace your real function with the return of the bypass:
function myFunc()
    local name = debug.getinfo(1, "n").name
    return name
end

myFunc = bypass_tail_call(myFunc)
Note that the bypass function has to build an array to hold the return values, then unpack them into the final return statement. This obviously requires additional memory allocations that don't have to happen in regular code.
So there's another reason not to do this.
You can run your code through luac -l -p
...
function <stdin:6,8> (4 instructions at 0x555f561592a0)
0 params, 2 slots, 1 upvalue, 0 locals, 1 constant, 0 functions
1 [7] GETTABUP 0 0 -1 ; _ENV "myFunc"
2 [7] TAILCALL 0 1 0
3 [7] RETURN 0 0
4 [8] RETURN 0 1
function <stdin:10,13> (4 instructions at 0x555f561593b0)
0 params, 2 slots, 1 upvalue, 1 local, 1 constant, 0 functions
1 [11] GETTABUP 0 0 -1 ; _ENV "myFunc"
2 [11] CALL 0 1 2
3 [12] RETURN 0 2
4 [13] RETURN 0 1
Those are the two functions that are of interest to us: foo and boo.
As you can see, when boo calls myFunc, it's just a normal CALL, so nothing interesting there.
foo, however, does something called a tail call. That is, the return value of foo is the return value of myFunc.
What makes this kind of call special is that there is no need for the program to jump back into foo; once foo calls myFunc it can just hand over the keys and say "You know what to do"; myFunc then returns its results directly to where foo was called. This has two advantages:
- The stack frame of foo can be cleaned up before myFunc is called
- Once myFunc returns, it doesn't need two jumps to return to the main thread; only one
Both of those are insignificant in examples like yours, but once you have a chain of lots and lots of tail calls, it becomes significant.
The downside of this is that, once the stack of foo gets cleaned up, Lua also forgets all the debugging information associated with it; it only remembers that myFunc was called as a tail call, but not from where.
An interesting side note is that boo is also almost a tail call. If Lua didn't have multiple return values, it'd be exactly identical to foo, and a smarter compiler like LuaJIT might compile it to a tail call. PUC Lua won't, though, since it needs a literal return some_function() to recognize the tail call.
The difference is that boo only returns the first value returned by myFunc, and while in your example, there will only ever be one, the interpreter can't make that assumption (LuaJIT might make that assumption during JIT compilation, but that's beyond my understanding)
Also note that, technically, the word tail call just describes a function A directly returning the return value of another function B.
It often gets used interchangeably with tail call optimization, which is what the compiler does when it re-uses the stack frame and turns the function call into a jump.
Strictly speaking, C (for example) has tail calls, but it has no tail call optimization, meaning something like
int recursive(int n) { return recursive(n + 1); }
is valid C code, but will eventually cause a stack overflow, while in Lua
local function recursive(n) return recursive(n+1) end
will just run forever. Both are tail calls, but only the second gets optimized.
EDIT: As always with C, some compilers may, on their own, implement tail call optimisation, so don't go around telling everyone that "C never ever does it"; it's just not a required part of the language, while in Lua it's actually defined in the language specification, so it's not Lua until it has TCO.
This is a result of tail call optimisation, which Lua does.
In this case, Lua translates the function call into a "goto" statement, and does not use any extra stack frame to perform the tail call.
You can add a traceback statement to check it:
function myFunc()
    local name = debug.getinfo(1, "n").name
    print(debug.traceback("Stack trace"))
    return name
end
Tail call optimisation happens in Lua when you return with a function call:
-- Optimised
function good1()
    return test()
end

-- Optimised
function good2()
    return test(foo(), bar(5 + baz()))
end

-- Not optimised
function bad1()
    return test() + 1
end

-- Not optimised
function bad2()
    return test()[2] + foo()
end
You can refer to the following links for more information:
- Programming in Lua - 6.3: Proper Tail Calls
- What is tail call optimisation? - Stack Overflow

How to obtain the offset of a MASM local variable at assembly time

I want to obtain the offset into the stack frame of a MASM procedure's local variable at assembly time (not runtime, the LEA instruction isn't useful here). E.g., if the local variable is at RBP-8 at run time, I want to obtain the constant (-8) during assembly.
Is this possible?
The OFFSET operator only works on global (static) variables. Every suggestion I've found on the internet talks about using LEA, which is not what I want.
someProc proc
    local a:byte, b:word, c:dword
    .
    .
    .
    someConst = MagicOffset(a)   ; Magically sets the constant "someConst" to the offset of "a" (probably something like -8).
someProc endp
