I'm using Cheat Engine to find where in the game's memory certain properties are stored, for example my player's health. Ultimately I want to write a program that knows where in memory to look, so that it can make decisions based on the current game state. I can find, and have found, where in memory certain things are stored; the problem is that each time the game is opened the memory locations change. What do I need to do so that my program can work around the changing memory locations?
The issue is that your Cheat Engine table uses hard-coded addresses for these variables. The variables are either dynamically allocated or statically allocated relative to the base address of a module. To fix this, use pointers to the variables, where the pointers are either statically located or calculated at runtime using relative offsets from module base addresses. You can use "Find out what accesses this address" or the Pointer Scanner to find such pointers. You get the runtime base addresses of modules by using the CreateToolhelp32Snapshot Windows API function (together with Module32First and Module32Next). You can also use signature scanning to scan for an array of bytes that represents the instructions that access the variable at runtime, and then grab the address from the instruction operands.
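The signature-scanning idea can be sketched in a few lines of Python. This is only an illustration under assumptions: SNAPSHOT is a made-up byte string standing in for memory read from the game, PATTERN is a hypothetical signature for a 32-bit "mov eax, [address]" instruction rather than one taken from any real game, and the function names are invented for this example.

```python
import struct

def find_signature(memory, pattern):
    """Return the offset of the first match, or -1. None in pattern is a wildcard."""
    n = len(pattern)
    for start in range(len(memory) - n + 1):
        if all(p is None or memory[start + i] == p for i, p in enumerate(pattern)):
            return start
    return -1

def read_operand(memory, match_offset, operand_offset):
    """Read the 32-bit little-endian operand embedded in the matched instruction."""
    return struct.unpack_from("<I", memory, match_offset + operand_offset)[0]

# Made-up snapshot: two NOPs, then "mov eax, [0x00ABCDEF]" (opcode A1 + 4 address
# bytes), then "dec eax" (48) and "ret" (C3) in 32-bit x86 encoding.
SNAPSHOT = bytes([0x90, 0x90, 0xA1, 0xEF, 0xCD, 0xAB, 0x00, 0x48, 0xC3])

# Hypothetical signature: the opcode bytes are fixed, the four address bytes are
# wildcards because the address is exactly what changes between runs.
PATTERN = [0xA1, None, None, None, None, 0x48, 0xC3]

match = find_signature(SNAPSHOT, PATTERN)
health_address = read_operand(SNAPSHOT, match, 1)  # operand starts 1 byte in
print(hex(health_address))
```

In a real tool the snapshot would come from reading the target process's memory, and the pattern would be built from the instruction bytes that Cheat Engine shows around the access.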
The LD_PRELOAD technique allows us to supply our own custom standard library functions to an existing binary, overriding the standard ones or manipulating their behaviour, giving a fun way to experiment with a binary and understand its behaviour.
I've read that LD_PRELOAD can be used to "checkpoint" a program --- that is, to produce a record of the full memory state, call stack and instruction pointer at any given time --- allowing us to "reset" the program back to that previous state at will.
It's clear to me how we can record the state of the heap. Since we can provide our own version of malloc and related functions, our preloaded library can obviously gain perfect knowledge of the memory state.
What I can't work out is how our preloaded functions can determine the call stack and instruction pointer; and then reset them at a later time to the previously recorded value. Clearly this is necessary for checkpointing. Are there standard library functions that can do this? Or is a different technique required?
I've read that LD_PRELOAD can be used to "checkpoint" a program ... allowing us to "reset" the program back to that previous state at will.
That is a gross simplification. This "checkpoint" mechanism can not possibly restore any open file descriptors, or any mutexes, since the state of these is partially inside the kernel.
It's clear to me how we can record the memory state. ...
What I can't work out is how our preloaded functions can determine the call stack and instruction pointer;
The instruction pointer is inside the preloaded function, and is readily available, e.g. by reading %rip with a short piece of inline assembly (such as lea 0(%rip), %rax) on x86_64. But you (likely) don't care about that address -- you probably care about the caller of your function. That is also trivially available as __builtin_return_address(0) (at least when using GCC).
And the rest of the call stack is saved in memory (in the stack region to be more precise), so if you know the contents of memory, you know the call stack.
Indeed, when you use e.g. GDB where command with a core dump, that's exactly what GDB does -- it reads contents of memory from the core and recovers the call stack from it.
Update:
I wrote in my original post that I know how to inspect the memory, but in fact I only know how to inspect the heap. How can I view the full contents of all stack frames?
Inspecting memory works the same regardless of whether that memory "belongs" to heap, stack, or code. You simply dereference a pointer and voilà -- you get the contents of memory at that location.
What you probably mean is:
how to find location of stack and
how to decode it
The answer to the first question is OS-specific, and you didn't tag your question with any OS.
Assuming you are on Linux, one way to locate the stack is to parse the entries in /proc/self/maps, looking for an entry (a contiguous address range) which "covers" the current stack (i.e. "covers" the address of any local variable).
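The lookup itself is simple once the file is parsed. Here is a rough Python sketch of it, with some assumptions spelled out: SAMPLE_MAPS is an invented excerpt in the /proc/<pid>/maps format, the function names are made up for this example, and the address is passed in as a plain integer (in a preloaded C library it would be the address of a local variable).

```python
def parse_maps(text):
    """Yield (start, end, perms, name) for each line of /proc/<pid>/maps text."""
    for line in text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        name = fields[5].strip() if len(fields) > 5 else ""
        yield int(start_s, 16), int(end_s, 16), fields[1], name

def region_covering(text, address):
    """Return the mapping whose [start, end) range contains address, or None."""
    for start, end, perms, name in parse_maps(text):
        if start <= address < end:
            return start, end, perms, name
    return None

# Invented sample; real input would be open("/proc/self/maps").read() on Linux.
SAMPLE_MAPS = """\
55d0c7a00000-55d0c7a21000 r-xp 00000000 08:01 131090   /usr/bin/game
55d0c8f3a000-55d0c8f5b000 rw-p 00000000 00:00 0        [heap]
7ffc1a2b3000-7ffc1a2d4000 rw-p 00000000 00:00 0        [stack]
"""

# Hypothetical address of some local variable.
stack_region = region_covering(SAMPLE_MAPS, 0x7FFC1A2C0000)
print(stack_region)
```

Saving the stack then amounts to copying the bytes between the returned start and end addresses; no decoding of frames is needed for that step.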
For the second question, the answer is:
it's complicated [1] and
you don't actually need to decode it in order to save/restore its state.
[1] To figure out how to decode the stack, you could look at the sources for debuggers (such as GDB and LLDB).
This is also very OS and processor specific.
You would need to know calling conventions. On x86_64 you would need to know about unwind descriptors. To find local variables, you would need to know about DWARF debugging format.
Did I mention it's complicated?
First, to avoid making this seem like an XY problem, I'd like to give some context (note that I am not using Emscripten):
I am trying to see if I can implement a form of hot reloading for Wasm programs written in C++, hosted on the web. To do this, I want to have a section of memory that I call my "world state" (to anyone who has watched Handmade Hero ( https://handmadehero.org/ ), this will be familiar):
struct State {
// put everything here
} state;
Typically for a full C++ program with a platform layer, you'd allocate this struct on the platform side and feed a pointer to that memory through a function pointer in the reloadable/dll/dylib part of the code. The reloadable code puts EVERYTHING into this persistent memory so if the code needs to be recompiled and reloaded, all the state will continue to exist since the memory was allocated in the part of the program that wasn't reloaded. As far as I can tell, this is impossible in Wasm though.
Firstly, is my assumption correct that I have to use WebAssembly.Memory? Or can I allocate a Uint8Array in JS and use that for my persistent state, separate from the program memory? If so, is that slower?
So this will work as long as I don't use a dynamic allocator like WASI, and instead use a push allocator I can control. (I think this because, suppose I use malloc to get memory addresses and reload--malloc's internal state will reload and think all the heap memory is available when it's not, so future allocations might clobber previous ones.)
Upon reload, I can first copy the struct into a temporary buffer on the JS side, reload, get the memory location of the struct from Wasm (I will require that it exists), and copy the saved memory from JS back into position.
However this falls apart if I use pointers because if I change the program (which is the point) __data_end might change, which would offset all of the addresses! I checked the linker flags here https://lld.llvm.org/WebAssembly.html to see what I could control. I can specify that the stack comes before the data segment, but the heap would still come after that, which results in the same problem. I can also specify where the global data are located, but that's not the data segment I believe, so the variable-size data segment could still offset all of my addresses.
Here's a nice page that can help us visualize the Wasm memory: https://dassur.ma/things/c-to-webassembly/
Would anyone have any thoughts on how to achieve what I'd like? The only options I can think of involve somehow using memory outside the Wasm memory (possibly slower or impossible), using only stack memory and no pointers (unrealistic unless I can auto-recalculate all pointer offsets after a recompile, which would be painful and bug-prone), or finding a way to make the data segment come after the stack and heap at a fixed address, which would then guarantee that the stack and heap segments wouldn't get offset if the data segment needs to grow. Another option, if possible, would be to fix the max size of the data segment. The Wasm spec/documentation aren't really great when it comes to memory manipulation like this, so I'd appreciate some clarification about what's possible too. Lastly, maybe I could use two Wasm modules (but wouldn't that sort of indirection be slow)? I might be missing something crucial related to the memory layout.
Please let me know if you need more details. I've done something like this before in C, as I mentioned, and it's a common rapid iteration game-dev technique. Basically I'm trying to recreate it in Wasm.
EDIT: Apparently you can call Wasm functions from another module directly. Firstly, how do you do it, and secondly, what would the performance characteristics be for accessing the memory of another module?
EDIT2: Maybe some form of dynamic linking if that's supported? https://webassembly.org/docs/dynamic-linking/
WebAssembly modules hold variable state in three distinct places:
Linear memory
Local variables associated with the execution stack
Global variables
Of these, only global variables and linear memory are accessible to the host environment, and potentially serialisable in order to cache them as you hot-reload your module. There is of course no way to directly access and store the current call-stack.
If I were looking to achieve this, I'd create my own state machine within WebAssembly, storing this within a known location within linear memory.
Wasm is organised into modules, and modules define four relevant kinds of entities: functions, memories, tables, globals. The code is in the functions, while the other three represent a module's state.
Now, the interesting thing is that all four of these entity kinds can be imported and exported. Moreover, all of them can be created externally to the module, e.g., by the JS API.
Consequently, a way to emulate code swapping is to set up your module such that all three pieces of state are created externally and imported into the module. That way, you can keep them alive externally and pass them to the upgraded module once available. (You also need to make sure that the upgraded module doesn't use data/element segments or start functions in a way that paves over preexisting state.)
Of course, this only works if the shape of the module's state does not change between upgrades. E.g., no new globals, no new data layout in memory, otherwise the new code won't understand the old state. That is actually the hard part of the problem, but it's independent from Wasm specifics.
I'm in an assembly language course focusing on x86 Pentium processors, and am working on a Linux system. I understand that programs get loaded into memory and that you can perform operations directly within the registers but I'm not sure you can avoid creating a data segment altogether.
A yes or no, followed by a brief explanation as to why would be great.
It is not required. A data segment is simply a block of memory allocated for data, and thus can be written to and read from. Code segments are read-only: if you try to write to a code segment in protected mode, the hardware will raise an exception. However, assembly instructions can be given any address in memory, and if protected mode is disabled (real mode), the hardware won't raise one.
As an example, the boot sector loads into a very restricted space on launch, and it is quite common (because space is so restricted) to place variables among the code bytes. Once I even wrote a boot sector that adjusted its own byte-code to accommodate differences in booting from different disks. So this is a case of code using code addresses as variables.
However, while you definitely can avoid creating a data segment, 99.99% of the time you do separate out a data segment.
You may also want to read up on protected mode to understand this better.
Ok I have a bit of a noob student question.
So I'm familiar with the fact that stacks contain subroutine calls, heaps contain variable-length data structures, and global static variables are assigned to permanent memory locations.
But how does it all work on a less theoretical level?
Does the compiler just assume it's got an entire memory region to itself from address 0 to address infinity? And then just start assigning stuff?
And where does it layout the instructions, stack, and heap? At the top of the memory region, end of memory region?
And how does this then work with virtual memory? The virtual memory is transparent to the program?
Sorry for a bajilion questions but I'm taking programming language structures and it keeps referring to these regions and I want to understand them on a more practical level.
THANKS much in advance!
A comprehensive explanation is probably beyond the scope of this forum. Entire texts are devoted to the subject. However, at a simplistic level you can look at it this way.
The compiler does not lay out the code in memory. It does assume it has the entire memory region to itself. The compiler generates object files where the symbols in the object files typically begin at offset 0.
The linker is responsible for pulling the object files together, linking symbols to their new offset location within the linked object and generating the executable file format.
The linker doesn't lay out code in memory either. It packages code and data into sections typically labeled .text for the executable code instructions and .data for things like global variables and string constants. (and there are other sections as well for different purposes) The linker may provide a hint to the operating system loader where to relocate symbols but the loader doesn't have to oblige.
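As a toy model of the symbol-offset part of this, the following Python sketch concatenates a few "object files" and rebases their symbols. Everything in it is invented for illustration (the object names, sizes, symbols and the link function itself), and it ignores what a real linker also has to do, such as sections, relocations and alignment.

```python
def link(objects, base=0):
    """Concatenate objects and return {symbol: offset within the linked image}.

    objects is a list of (name, size, {symbol: offset_within_object}).
    Every object's own symbols start counting from offset 0.
    """
    symbol_table = {}
    cursor = base  # where the next object will be placed
    for name, size, symbols in objects:
        for symbol, offset in symbols.items():
            if symbol in symbol_table:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbol_table[symbol] = cursor + offset
        cursor += size
    return symbol_table

# Hypothetical inputs: two object files, each with offsets relative to itself.
objects = [
    ("main.o", 0x40, {"main": 0x00}),
    ("util.o", 0x30, {"helper": 0x00, "cleanup": 0x10}),
]

table = link(objects)
for symbol, offset in table.items():
    print(f"{symbol}: {offset:#x}")
```

Both main and helper sit at offset 0 in their own object file; only after linking do they get distinct offsets, and those are still offsets within the image, not final memory addresses.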
It is the operating system loader that parses the executable file and decides where code and data are laid out in memory. Where exactly depends entirely on the operating system. Typically the stack is located in a higher memory region than the program instructions and data, and grows downward.
Each program is compiled/linked with the assumption it has the entire address space to itself. This is where virtual memory comes in. It is completely transparent to the program and managed entirely by the operating system.
Virtual memory typically ranges from address 0 and up to the max address supported by the platform (not infinity). This virtual address space is partitioned off by the operating system into kernel addressable space and user addressable space. Say on a hypothetical 32-bit OS, the addresses above 0x80000000 are reserved for the operating system and the addresses below are for use by the program. If the program tries to access memory above this partition it will be aborted.
The operating system may decide the stack starts at the highest addressable user memory and grows down with the program code located at a much lower address.
The location of the heap is typically managed by the run-time library against which you've built your program. It could live beginning with the next available address after your program code and data.
This is a wide open question with lots of topics.
Assuming the typical compiler -> assembler -> linker toolchain: the compiler doesn't know a whole lot. It simply encodes stack-relative accesses and doesn't care how big the stack is or where it lives; that is the purpose and beauty of a stack. The compiler generates assembly, the assembler turns that into an object file, and then the linker takes a linker script of some flavor, or command-line arguments, that tell it the details of the memory space. When you run
gcc hello.c -o hello
your installation of binutils has a default linker script which is tailored to your target (Windows, Mac, Linux, whatever you are running on). That script contains the info about where the program space starts, and from there it knows where to start the heap (after the text, data and bss). The stack pointer is likely set either by that linker script and/or the OS manages it some other way. And that defines your stack.
For an operating system with an MMU, which is what your Windows, Linux, Mac and BSD laptop or desktop computers have, then yes, each program is compiled assuming it has its own address space starting at 0x0000. That doesn't mean that the program is linked to start running at 0x0000; that depends on the operating system's rules, and some start at 0x8000, for example.
For a desktop-like application, where it is more or less a single linear address space from your program's perspective, you will likely have .text first, then .data and .bss in either order, and then the heap aligned at some point after all of that. The stack, however it is set, is typically up high and works down, but that can be processor and operating system specific. The stack is typically, within the program's view of the world, at the top of its memory.
Virtual memory is invisible to all of this; the application normally doesn't know or care about virtual memory. If and when the application fetches an instruction or does a data transfer, it goes through hardware which is configured by the operating system and which converts between virtual and physical addresses. If the MMU indicates a fault, meaning that space has not been mapped to a physical address, that can sometimes be intentional, and then another use of the term "virtual memory" applies. Under this second definition the operating system can, for example, take some other chunk of memory, yours or someone else's, move it to hard disk, mark that other chunk as not being there, mark your chunk as having some RAM, and then let you execute without knowing you were interrupted, using RAM that you didn't know had to be taken from someone else. Your application by design doesn't want to know any of this; it just wants to run. The operating system takes care of managing physical memory and the MMU that gives you a virtual (zero-based) address space...
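To make the translate-or-fault step concrete, here is a very small Python model of it. It is a sketch under assumptions, not how any real MMU or kernel is implemented: ToyMMU, PageFault and access are invented names, the page table is a plain dictionary, and the "OS" simply hands out a free frame on a fault instead of evicting anything to disk.

```python
PAGE_SIZE = 4096

class PageFault(Exception):
    """Raised when a virtual page has no physical frame mapped."""

class ToyMMU:
    def __init__(self):
        self.page_table = {}  # virtual page number -> physical frame number

    def translate(self, vaddr):
        """Convert a virtual address to a physical one, or fault."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            raise PageFault(vpn)
        return self.page_table[vpn] * PAGE_SIZE + offset

def access(mmu, vaddr, free_frames):
    """Play the operating system: on a fault, map a free frame and retry."""
    try:
        return mmu.translate(vaddr)
    except PageFault as fault:
        mmu.page_table[fault.args[0]] = free_frames.pop()
        return mmu.translate(vaddr)

mmu = ToyMMU()
mmu.page_table[0] = 7              # virtual page 0 lives in physical frame 7
mapped = mmu.translate(0x10)       # already mapped, no fault
faulted = access(mmu, 0x2004, [3]) # page 2 is unmapped, gets frame 3 on demand
print(hex(mapped), hex(faulted))
```

The program only ever sees the virtual addresses 0x10 and 0x2004; which physical frames back them, and whether a fault happened along the way, stays hidden from it.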
If you were to do a little bit of bare-metal programming (microcontroller, QEMU, Raspberry Pi, BeagleBone, etc.), without MMU stuff at first and then later with it, you can get your hands dirty with the compiler, the linker script and configuring an MMU. I would use an ARM or MIPS for this, not x86, just to make your life easier; the overall big picture translates directly across targets.
It depends.
If you're compiling a bootloader, which has to start from scratch, you can assume you've got the entire memory for yourself.
On the other hand, if you're compiling an application, you can assume you've got the entire memory for yourself.
The minor difference is that in the first case, you have all physical memory for yourself. As a bootloader, there's nothing else in RAM yet. In the second case, there's an OS in memory, but it will (normally) set up virtual memory for you so that it appears you have the entire address space for yourself. Usually you still have to ask the OS for actual memory, though.
The latter does mean that the OS imposes some rules. E.g. the OS very much would like to know where the first instruction of your program is. A simple rule might be that your program always starts at address 0, so the C compiler could put int main() there. The OS typically would like to know where the stack is, but this is already a more flexible rule. As far as "the heap" is concerned, the OS really couldn't care.
Most game botting applications use a series of memory offsets they have found for that particular version of a game client to facilitate botting. They might have a memory offset for health, x/y position, etc. Every time the game releases an update the offsets for the various pieces of information the bot program uses must be re-found and updated as well.
I'm interested in writing a Solitaire bot as a pet project. If you look here, mmoglider (a commercial bot) has already accomplished this as a demo for their botting program (which normally is used to bot WoW): YouTube video of MMOGlider botting Vista Solitaire.
What is a common method of accurately locating various useful memory offsets? How might I go about locating the memory offset that points to the "deck" in the solitaire program and use that to determine what cards are on the stack? I know from experience with the glider guys that once they were able to locate the offsets for the deck itself they said that every card value for the entire deck was there.
So, does anyone have any experience with reverse engineering and pulling memory offsets out of existing programs? And once you have those offsets how to be able to pull and read the values from that "Deck" structure in memory?
Typically there are two approaches to such tasks. For simplicity, let us consider a game with an integer amount of "health" for the player.
The first is to manipulate the process memory while the program is running. This is good for finding known values. When you have 100 health in a game, search the memory space for 100 (most likely as an integer) and record every location where it is found. Then when your health changes to 99, re-check those same locations to see which have changed appropriately. Continue until you have narrowed down the precise location(s) of the health variable. In most modern games what you will actually find is a dynamically allocated memory address that is part of a struct. That struct will be referenced by a pointer within the program, so you then have to search the program memory for values that may be a pointer to the space near the health variable, and repeat the narrowing-down process over multiple game runs to establish the position of the pointer to the data that you want. This is the method most useful for classic PC and console games, particularly any game where the memory space is small and easy to manipulate.
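The narrowing-down loop and the final pointer lookup can be sketched in Python on a simulated memory image. All of this is assumed for illustration: the two bytearrays stand in for snapshots read from the game process, the layout (a static pointer at offset 4, a struct at 32, health at struct + 8) is invented, and the helper names are not from any real tool.

```python
import struct

def scan_for_value(memory, value):
    """Return every offset where value appears as a 32-bit little-endian int."""
    packed = struct.pack("<i", value)
    hits = set()
    pos = memory.find(packed)
    while pos != -1:
        hits.add(pos)
        pos = memory.find(packed, pos + 1)
    return hits

def narrow(candidates, memory, value):
    """Keep only the candidates that now hold the new value."""
    packed = struct.pack("<i", value)
    return {a for a in candidates if memory[a:a + 4] == packed}

def resolve_pointer_chain(memory, static_address, offsets):
    """Follow a pointer stored at static_address through the given offsets."""
    address = struct.unpack_from("<I", memory, static_address)[0]
    for off in offsets[:-1]:
        address = struct.unpack_from("<I", memory, address + off)[0]
    return address + offsets[-1]

# Simulated snapshot while health is 100.
before = bytearray(64)
struct.pack_into("<I", before, 4, 32)    # static pointer -> player struct at 32
struct.pack_into("<i", before, 16, 100)  # unrelated value that happens to be 100
struct.pack_into("<i", before, 40, 100)  # player health at struct + 8

# Simulated snapshot after taking one point of damage.
after = bytearray(before)
struct.pack_into("<i", after, 40, 99)

candidates = scan_for_value(before, 100)    # {16, 40}
candidates = narrow(candidates, after, 99)  # {40}
health_address = resolve_pointer_chain(after, 4, [8])
print(candidates, health_address)
```

Once the static pointer and offset are known, the value scan is no longer needed on later runs: resolving the chain gives the health address directly, even if the struct has moved.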
The second method requires you to disassemble the application binary (I use IDA Pro for this), then locate functions that are known to use the data that you want. For example, say you see "Health: 99" on the screen. Search the binary for the "Health: " string, then find references to that string (you will likely find a call to sprintf or similar) and see what other memory locations those same functions reference; this will usually lead you to the "health" variable or the struct containing it. This is the method most common in more modern games, with massive memory spaces and more advanced programming practices.
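A disassembler does the string and cross-reference search for you, but the underlying idea is small enough to sketch in Python. The example below makes strong simplifying assumptions: the image is a hand-built fake, it is loaded at a hypothetical fixed base of 0x400000, and the code refers to the string by its absolute 32-bit address (as in non-relocated 32-bit x86 code). Position-independent or 64-bit code would need real disassembly.

```python
import struct

IMAGE_BASE = 0x400000  # hypothetical load address of the fake image

def find_string(image, text):
    """Return the file offset of an ASCII string, or -1 if absent."""
    return image.find(text.encode("ascii"))

def find_references(image, image_base, target_offset):
    """Offsets where the target's absolute address appears as a 32-bit LE value."""
    needle = struct.pack("<I", image_base + target_offset)
    refs = []
    pos = image.find(needle)
    while pos != -1:
        refs.append(pos)
        pos = image.find(needle, pos + 1)
    return refs

# Fake image: "push 0x400020" (68 + imm32), a dummy "call" (E8 + rel32),
# NOP padding up to offset 0x20, then the format string itself.
code = bytes([0x68]) + struct.pack("<I", IMAGE_BASE + 0x20) + bytes([0xE8, 0, 0, 0, 0])
image = code.ljust(0x20, b"\x90") + b"Health: %d\x00"

string_offset = find_string(image, "Health: ")
references = find_references(image, IMAGE_BASE, string_offset)
print(hex(string_offset), references)
```

Each reference offset points into an instruction that uses the string; the function containing that instruction is the one to read for other memory accesses, such as the load of the health value passed to the formatting call.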