I've been getting into some assembly lately, and it's fun because it challenges everything I have learned. I was wondering if I could ask a few questions.
When running an executable, does the entire executable get loaded into memory?
From a bit of fiddling I've found that constants aren't really constants. Is it just a compiler thing?
const int i = 5;
_asm { mov i, 0 } // i is now 0 and compiles fine
So are all variables that are assigned a constant value embedded into the file as well?
Meaning:
int a = 1;
const int b = 2;
void something()
{
const int c = 3;
int d = 4;
}
Will I find all of these variables embedded in the file (in a hex editor or something)?
If the executable is loaded into memory, then "constants" are technically using memory? I've read people around the net saying that constants don't use memory; is this true?
Your executable's text (i.e. code) and data segments get mapped into the process's virtual address space when the executable starts up, but the bytes might not actually be copied from the disk until those memory locations are accessed. See http://en.wikipedia.org/wiki/Demand_paging
C-language constants actually exist in memory, because you have to be able to take their address (that is, &i). Constants are usually found in the .rdata segment of your executable image.
A constant is going to take up memory somewhere--if you have the constant number 42 in your program, there must be somewhere in memory where the 42 is stored, even if that means that it's stored as the argument of an immediate-mode instruction.
The OS loads the code and data segments in order to prepare them for execution.
If the executable has a resource segment, the application loads parts of it on demand.
It's true that const variables take memory space, but compilers are free to optimize
for memory usage and code size, and embed their values directly in the code
(in case they don't detect any address references for those variables).
const char * strings (C strings) are usually interned by compilers to save memory.
Related
When we declare a variable, say an int, I would like to know the steps involved in memory allocation and initialisation, and also for a pointer.
int x = 5;
Now during compile time, 4 bytes are allocated for the integer x. But when does the memory get filled with the value 5? Does the initialisation take place during compilation or at runtime?
similarly, consider
int x = 5;
int* p = &x;
in these 2 lines, what is the process of allocation and initialisation?
Variable initialization depends on the kind of variable. Global or static variables are initialised at compile time, while automatic variables are entirely managed at run time.
Global variables
At compile time, the value of all global variables is known. These values are written by the compiler to specific sections of an object file.
At link time, all the object files are gathered and memory locations are determined for each variable. This makes it possible to know the address of every variable, in case one of these addresses is assigned to another variable.
As a result, an executable file is generated that contains a description of the content of every section (text, data, rodata, etc.). The values of all initialized global variables are written in the data or rodata section.
At run time, the loader reads the description of the different sections and asks the OS for memory. It then copies the content of all sections to their respective memory locations.
This is the way variables are initialised with a value determined at compile or link time.
The only exception is variables that are initialized to zero (or not initialized at all). They are located in a special section (frequently named bss). To reduce the size of executable files, these zero values are not written in the executable. Instead, before executing main(), a runtime procedure memsets the whole content of the bss section to zero.
Automatic variables
The procedure is completely different. One does not know the location of these variables before the program runs, and the only way is to compute their values with machine instructions.
So the compiler first determines whether these variables will be located in registers or memory; when entering the function, the first instructions reserve stack space for the local variables and initialize their values. This is done by means of regular machine instructions.
In case the value is the address of another variable (say y = &x):
* if x is a local (automatic) variable, the address is computed by writing to y the sum of the stack pointer register and an offset determined by the compiler;
* if x is a global or static variable, then at link time, once the addresses of global variables are known, the linker modifies the instructions generated by the compiler to write the proper address into the register or stack location used to represent y.
There are situations where there is no way around assigning the value at runtime:
if user_input == "yes":
    my_var = 5
else:
    my_var = 7
But normally it depends on what the compiler's authors have implemented. If you use a different compiler or a different language, then things might be different.
#include <iostream>

int main(int argc, char** argv) {
    int* heap_var = new int[1];
    /*
     * Page size 4KB == 4*1024 == 4096
     */
    heap_var[1025] = 1;
    std::cout << heap_var[1025] << std::endl;
    return 0;
}
// Output: 1
In the above code, I allocated 4 bytes of space on the heap. Now, as the OS maps virtual memory to system memory in pages (4KB each), a 4KB block of my virtual memory's heap should get mapped to system memory. For testing, I decided to try accessing other addresses within my allocated page/heap block, and it worked; however, I shouldn't have been allowed to access more than 4096 bytes from the start (index 1025 is past that, since an int variable is 4 bytes).
I'm confused about why I am able to access 4*1025 bytes (more than the size of the page that has been allocated) from the start of the heap block and not get a seg fault.
Thanks.
The platform allocator likely allocated far more than the page size, since it plans to use that memory "bucket" for other allocations or keeps some internal state there; in release builds there is likely far more than just a page-sized chunk of virtual memory there. You also don't know where within that particular page the memory was allocated (you can find out by masking some bits), and without mentioning the platform/arch (I'm assuming x86_64) there is no telling that this page is even 4KB; it could be a 2MB "huge" page or anything alike.
But by accessing outside array bounds you're triggering undefined behavior: crashes in the case of reads, or data corruption in the case of writes.
Don't use memory that you don't own.
I should also mention that this is likely unrelated to C++, since the new[] operator usually just invokes malloc/calloc behind the scenes in the core platform library (be that libSystem on OSX, or glibc or musl or whatever else on Linux, or even an intercepting allocator). The segfaults you experience usually come from guard pages around heap blocks or, in the absence of guard pages, simply from touching unmapped memory.
NB: Don't try this at home: There are cases where you may intentionally trigger what would be considered undefined behavior in general, but on that specific platform you may know exactly what's there (a good example is abusing pthread_t opaque on Linux to get tid without an overhead of an extra syscall, but you have to make sure you're using the right libc, the right build type of that libc, the right version of that libc, the right compiler that it was built with etc).
I cannot find answer to this question. What is the size of data pushed on stack?
Let's say that you push some data on the stack, for example an int. Then the value of the stack pointer decreases by 4 bytes.
Until today I thought that the biggest data that can be pushed on stack cannot be larger than pointer size. But I did a little experiment. I wrote a simple app in C#:
int i = 0; //0x68 <-- address of variable
int j = 1; //0x64
ulong k = 2; //0x5C
int l = 3; //0x58
According to my prediction, the ulong should be allocated on the heap, because it needs 8 bytes, while I am working on a 32-bit system (so the pointer size is 4 bytes). And that would be very odd, because this is a simple local variable.
But it was pushed on stack.
So I think there is something wrong with the way I think of the stack. Because how would the stack pointer "know" whether it must be changed by 4 bytes (if we push an int), 8 bytes (if we push a long or a double), or just 1 byte?
Or maybe the values of local variables are on the heap and only the addresses of those variables are on the stack. But that would make no sense, because that's how objects are processed: when I create an object (using the new keyword), the object is allocated on the heap and the address of this object is pushed on the stack, right?
So could someone tell me (or give a link to article) how it really works?
The compiler or run-time environment knows the kind of data you are working with and knows how to structure it into units that will fit onto the stack. So, for example, on an architecture where the stack pointer only moves in 32-bit chunks, the compiler or run-time will know how to format a data element that is longer than 32 bits as multiple 32-bit chunks in order to push it. It will know, similarly, how to pop them off the stack and reconstruct a variable of the appropriate type.
The problem here is not really different for the stack than it is for any other kind of memory. The CPU architecture will be optimised to work with a small number of data elements of a particular size, but programming languages typically provide a much wider range of data types. The compiler or the run-time environment has to handle the packing and unpacking of data that this situation requires.
Quick curious question: are memory allocation addresses chosen by the language compiler, or is it the OS that chooses the addresses for the memory requested?
This comes from a doubt about virtual memory, which can be quickly explained as "let the process think it owns all the memory". But what happens on 64-bit architectures, where only 48 bits are used for memory addresses, if the process wants a higher address?
Let's say you do int *a = malloc(sizeof(int)); and you have no memory left from the previous system call, so you need to ask the OS for more memory. Is the compiler the one who determines the memory address for this variable, or does it just ask the OS for memory and allocate it at the address returned by it?
It would not be the compiler, especially since this is dynamic memory allocation. Compilation is done well before you actually execute your program.
Memory reservation for static variables happens at compile time, but the actual static memory allocation happens at start-up, before the user-defined main.
Static variables can be given space in the executable file itself; this is then memory-mapped into the process address space. This is one of the few times I can imagine the compiler actually "deciding" on an address.
During dynamic memory allocation your program would ask the OS for some memory and it is the OS that returns a memory address. This address is then stored in a pointer for example.
The dynamic memory allocation in C/C++ is simply done by runtime library functions. Those functions can do pretty much as they please as long as their behavior is standards-compliant. A trivial implementation of compliant but useless malloc() looks like this:
void * malloc(size_t size) {
    return NULL;
}
The requirements are fairly relaxed: the pointer has to be suitably aligned, and the pointers must be unique unless they've previously been free()d. You could have a rather silly but somewhat portable and absolutely not thread-safe memory allocator implemented as below. There, the addresses come from a pool that was decided upon by the compiler.
#include <stddef.h>
#include <stdint.h>

// 1/4 of the available address space, but at most 2^30.
#define HEAPSIZE (1UL << ( ((sizeof(void*)>4) ? 4 : sizeof(void*)) * 8 - 2 ))

// A pseudo-portable alignment size for pointer and arithmetic types. Breaks
// when faced with SIMD data types.
#define ALIGNMENT (sizeof(intptr_t) > sizeof(double) ? sizeof(intptr_t) : sizeof(double))

void * malloc(size_t size)
{
    static char buffer[HEAPSIZE];
    static char * next = NULL;
    void * result;

    if (next == NULL) {
        // Round the start of the pool up to the alignment boundary.
        uintptr_t ptr = (uintptr_t)buffer;
        ptr += (ALIGNMENT - ptr % ALIGNMENT) % ALIGNMENT;
        next = (char*)ptr;
    }
    if (size == 0) return NULL;
    if (size > (size_t)(HEAPSIZE - (next - buffer))) return NULL;
    result = next;
    next += size;
    // Keep the next pointer aligned for the following allocation.
    next += (ALIGNMENT - size % ALIGNMENT) % ALIGNMENT;
    return result;
}

void free(void * ptr)
{}
Practical memory allocators don't depend upon such static memory pools, but rather call the OS to provide them with newly mapped memory.
The proper way of thinking about it is: you don't know what particular pointer you are going to get from malloc(). You can only know that it's unique and points to properly aligned memory if you've called malloc() with a non-zero argument. That's all.
I've read that the STL vector does not work well with SYS V shared memory. But suppose I use POSIX shm_open and then mmap with NULL (mmap(NULL, LARGE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0)), give a much larger size than my object (which contains my vector), and after mapping add additional items to the vector. Can there be a problem other than exceeding the LARGE_SIZE space? A related question: is it guaranteed on a recent SUSE Linux that when the object is mapped to the same start address (using the above syntax) in unrelated processes, it will be mapped directly, with no (system) copy performed to propagate changed values between the processes (like what happens with normal files when mmap-ed)?
Thanks!
Edit:
Is this correct then?:
void* mem = allocate_memory_with_mmap(); // say from a shared region
MyType* ptr = new ( mem ) MyType( args );
ptr->~MyType(); // is this really needed?
now in an unrelated process:
MyType* myptr = (MyType*)fetch_address_from_mmap(...);
myptr->printHelloWorld();
myptr->myvalue = 1; //writes to shared memory
myptr->~MyType(); // is this really needed?
now if I want to free memory
munmap(address...); // but this is done only once, when none of the processes use it any more
You are missing the fact that the STL vector is usually just a tuple of (mem pointer, mem size, element count), where the actual memory for the contained objects is obtained from the allocator template parameter.
Placing an instance of std::vector in shared memory does not make any sense. You probably want to check out boost::interprocess library instead.
Edit 0:
Memory allocation and object construction are two distinct phases, though they are combined in a single statement like the one below (unless operator new is re-defined for MyType):
// allocates from process heap and constructs
MyType* ptr = new MyType( args );
You can split these two phases with placement new:
void* mem = allocate_memory_somehow(); // say from a shared region
MyType* ptr = new ( mem ) MyType( args );
Though now you will have to explicitly call the destructor and release the memory:
ptr->~MyType();
release_memory_back_to_where_it_came_from( ptr );
This is essentially how you can construct objects in shared memory in C++. Note though that types that store pointers are not suitable for shared memory, since a pointer into one process's memory space does not make any sense in the other. Use explicit sizes and offsets instead.
Hope this helps.