I have been experimented around accessing memory used by other programs and I've encountered a little bit strange (to me) results.
First I have created a variable in my first program and gave it the value of 10. Then I looked at the address of it and asigned it manualy to a pointer in my second program. After that i tried to derefrence the pointer and (to my surprise) the program didn't crash. Instead it printed derefrenced pointer's value as 0
Next I created a few other programs to experiment with this. In my first program I created a pointer and asigned 'new int' to it. Then I checked the address of the int and manually asigned it to another pointer in my second program. Now when i tried to derefrence the ptr of my second program it did crash.
Could someone explain why the difference happened? And why was the derefrenced pointer 0?
Sorry for a possibly stupid question :/
This is because the addresses that your program prints for you to see are virtual addresses. Virtual addresses are relative to the memory space of each individual program. They get converted to physical memory addresses by the operating system during runtime.
So you didn't really access the real (physical) memory address of one of your programs from another one. This is also why the pointer value was set to 0.
Related
why stack pointer is initialized to the maximum value?
I only knows that It is the tiny register which stores the last program request’s address in a stack. It is the particular kind of buffer that stores the information in the order of top-down. but can anyone explain me why initially it's having max value.
Without more detail on the actual microprocessor, there isn't a single exact answer; but in general, each architecture handles stack pointer initialization a bit differently. For example, the version of ARM used in many microprocessors initializes the stack pointer (also R13) from the vector table (the first entry). Other architectures either initialize the register to 0 or some other value; so it's not always all 1s as you mention in your question. If the hardware itself doesn't initialize the stack pointer so somewhere meaningful, some of the first instructions usually does. And usually this value is near or at the top of memory as you mention, the stack typically grows down from higher addresses to lower ones; so a value of all 1s or some other larger value might make sense depending on how memory is laid out and managed.
One thing also worth mentioning is that you say the stack stores the "last program request's address" which if I understand correctly is one of the things the stack stores. In more architectures, the stack can store much more than just the return address of a call but also local variables, saved context when a call is made or a context is swapped (either by an OS or by an exception/interrupt) and anything else the program might want to push onto it.
So the short answer is: it isn't always set to the maximum value but it's usually set to some high value as the stack will grow down to lower addresses as things like data and addresses are pushed on it.
I'm comparing two NSNumbers in my app and I've done it the wrong way:
if(max < selected)
And it should be:
if([max longValue] < [selected longValue])
So the first comparison is really comparing the two object memory addresses, the funny thing (at least for me) is that the values seems to be related with the addresses. For example, if I get the first number with value 5 its memory address is 0xb000000000000053 and if I get the second with 10 is 0xb0000000000000a3 (being "a" equivalent to 10 in hexadecimal).
For that reason the first comparison (wrong) was actually working. Now an user complaint about an error here and is obviously because of this but it has lead me to the next questions:
Does this only happen in simulators? Cause it's where I'm testing, and the user will have a real device. Or maybe this happen normally but it's not a rule always fulfilled?
This is a "tagged pointer," not an address. The value 5 is packed inside the pointer as you've seen. You can identify tagged pointers because they're odd (the last bit is 1). It's not possible to fetch odd addresses on any of Apple's hardware (the word size is 4 or 8 bytes), so that bit is never set for a real address.
Tagged pointers are only available on 64-bit platforms. If you run on a 32-bit platform then the values will be real pointers, and they may not be in any particular order, which will then lead to the kinds of bugs you're encountering. Unfortunately I don't believe there is any way to get a compiler warning or even a static analysis warning for this kind of misuse on NSNumber.
Mike Ash provides an in-depth discussion of the subject.
On a slightly related note, on 32-bit platforms, certain NSNumbers are singletons, particularly small values since they're used a lot (-1 through 12 as I recall, but I believe it's different on different platforms). This means that == may happen to work for some numbers, but not for others. It also means that without ARC, it was possible to over-release a specific value (for example, 4) such that your program would crash the next time it happened to use that value. True story.... very hard to debug.
Objects can be put on and removed only from the top of a stack. But what about reading and writing their values? Please correct me if I'm wrong, but I think process must be able to read from any part of the stack, since if only reading from the top was possible it would have to remove (and store somewhere) whole content of the stack above a variable it wants to examine. But in that case, how does the process know where exactly in the stack is a particular variable? I suspect it just holds a pointer to it, but where is that pointer stored?
Another thing - reading about stacks I often find phrases like "All memory allocated on the stack is known at compile time." Well, I probably misunderstand this, so please tell me where's the flaw in my logic:
Suppose a local variable is created when an if() statement is true, and isn't when it's false. Whether it's true will turn out at run time. So at compile time there's no way to know if it should be created, hence I wouldn't think memory for it is allocated at all, as it would be wasteful. Consequently, it isn't created/known at compile time.
At compile time, it's known how much space each type needs: An Integer, for instance, is 4 Bytes wide on 32 bit platforms, and a class with 2 Integers consumes 8 Bytes. Whether this space is allocated for a specific variable is not necessarily known (may depend on an if, as you stated).
When you invoke a method, all parameters and the return address are pushed onto the stack. To get one parameter, you walk up the stack up to its position, which is computed by the base pointer and the size of each parameter.
So it is not entirely true for this stack that you can access the top element only. It is, however, for the Stack data structure.
Taking my first course in assembly language, I am frustrated with cryptic error messages during debugging... I acknowledge that the following information will not be enough to find the cause of the problem (given my limited understanding of the assembly language, ColdFire(MCF5307, M68K family)), but I will gladly take any advice.
...
jsr out_string
Address Error (format 0x04 vector 0x03 fault status 0x1 status reg 0x2700)
I found a similar question on http://forums.freescale.com/freescale/board/message?board.id=CFCOMM&thread.id=271, regarding on ADDRESS ERROR in general.
The answer to the question states that the address error is because the code is "incorrectly" trying to execute on a non-aligned boundary (or accessing non-aligned memory).
So my questions will be:
What does it mean to "incorrectly" trying to execute a non-aligned boundary/memory? If there is an example, it would help a lot
What is non-aligned boundary/memory?
How would you approach fixing this problem, assuming you have little debugging technique(eg. using breakpoints and trace)
First of all, it is possible that isn't the instruction causing the error. Be sure to see if the previous or next instruction could have caused it. However, assuming that exception handlers and debuggers have improved:
An alignment exception is what occurs when, say 32 bit (4 byte) data is retrieved from an address which is not a multiple of 4 bytes. For example, variable x is 32 bits at address 2, then
const1: dc.w someconstant
x: dc.l someotherconstant
Then the instruction
mov.l x, %r0
would cause a data alignment fault on a 68000 (and 68010, IIRC). The 68020 eliminated this restriction and performs the unaligned access, but at the cost of decreased performance. I'm not aware of the jsr (jump to subroutine) instruction requiring alignment, but it's not unreasonable and it's easy to arrange—Before each function, insert the assembly language's macro for alignment:
.align long
func: ...
It has been a long time since I've used a 68K family processor, but I can give you some hints.
Trying to execute on an unaligned boundary means executing code at an odd address. If out_string were at an address with the low bit set for example.
The same holds true for a data access to memory of 2 or 4 byte data. I'm not sure if the Coldfire supports byte access to odd memory addresses, but the other 68K family members did.
The address error occurs on the instruction that causes the error in all cases.
Find out what instruction is there. If the pc matches (or is close) then it is an unaligned execution. If it is a memory access, e.g. move.w d0,(a0), then check to see what address is being read/written, in this case the one pointed at by a0.
I just wanted to add that this is very good stuff to figure out. I program high end medical imaging devices in my day job, but occasionally I need to get down to this level. I have found and fixed more than one COTS OS problem by being able to track down just this sort of problem.
How is a program (e.g. C or C++) arranged in computer memory? I kind of know a little about segments, variables etc, but basically I have no solid understanding of the entire structure.
Since the in-memory structure may differ, let's assume a C++ console application on Windows.
Some pointers to what I'm after specifically:
Outline of a function, and how is it called?
Each function has a stack frame, what does that contain and how is it arranged in memory?
Function arguments and return values
Global and local variables?
const static variables?
Thread local storage..
Links to tutorial-like material and such is welcome, but please no reference-style material assuming knowledge of assembler etc.
Might this be what you are looking for:
http://en.wikipedia.org/wiki/Portable_Executable
The PE file format is the binary file structure of windows binaries (.exe, .dll etc). Basically, they are mapped into memory like that. More details are described here with an explanation how you yourself can take a look at the binary representation of loaded dlls in memory:
http://msdn.microsoft.com/en-us/magazine/cc301805.aspx
Edit:
Now I understand that you want to learn how source code relates to the binary code in the PE file. That's a huge field.
First, you have to understand the basics about computer architecture which will involve learning the general basics of assembly code. Any "Introduction to Computer Architecture" college course will do. Literature includes e.g. "John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach" or "Andrew Tanenbaum, Structured Computer Organization".
After reading this, you should understand what a stack is and its difference to the heap. What the stack-pointer and the base pointer are and what the return address is, how many registers there are etc.
Once you've understood this, it is relatively easy to put the pieces together:
A C++ object contains code and data, i.e., member variables. A class
class SimpleClass {
int m_nInteger;
double m_fDouble;
double SomeFunction() { return m_nInteger + m_fDouble; }
}
will be 4 + 8 consecutives bytes in memory. What happens when you do:
SimpleClass c1;
c1.m_nInteger = 1;
c1.m_fDouble = 5.0;
c1.SomeFunction();
First, object c1 is created on the stack, i.e., the stack pointer esp is decreased by 12 bytes to make room. Then constant "1" is written to memory address esp-12 and constant "5.0" is written to esp-8.
Then we call a function that means two things.
The computer has to load the part of the binary PE file into memory that contains function SomeFunction(). SomeFunction will only be in memory once, no matter how many instances of SimpleClass you create.
The computer has to execute function SomeFunction(). That means several things:
Calling the function also implies passing all parameters, often this is done on the stack. SomeFunction has one (!) parameter, the this pointer, i.e., the pointer to the memory address on the stack where we have just written the values "1" and "5.0"
Save the current program state, i.e., the current instruction address which is the code address that will be executed if SomeFunction returns. Calling a function means pushing the return address on the stack and setting the instruction pointer (register eip) to the address of the function SomeFunction.
Inside function SomeFunction, the old stack is saved by storing the old base pointer (ebp) on the stack (push ebp) and making the stack pointer the new base pointer (mov ebp, esp).
The actual binary code of SomeFunction is executed which will call the machine instruction that converts m_nInteger to a double and adds it to m_fDouble. m_nInteger and m_fDouble are found on the stack, at ebp - x bytes.
The result of the addition is stored in a register and the function returns. That means the stack is discarded which means the stack pointer is set back to the base pointer. The base pointer is set back (next value on the stack) and then the instruction pointer is set to the return address (again next value on the stack). Now we're back in the original state but in some register lurks the result of the SomeFunction().
I suggest, you build yourself such a simple example and step through the disassembly. In debug build the code will be easy to understand and Visual Studio displays variable names in the disassembly view. See what the registers esp, ebp and eip do, where in memory your object is allocated, where the code is etc.
What a huge question!
First you want to learn about virtual memory. Without that, nothing else will make sense. In short, C/C++ pointers are not physical memory addresses. Pointers are virtual addresses. There's a special CPU feature (the MMU, memory management unit) that transparently maps them to physical memory. Only the operating system is allowed to configure the MMU.
This provides safety (there is no C/C++ pointer value you can possibly make that points into another process's virtual address space, unless that process is intentionally sharing memory with you) and lets the OS do some really magical things that we now take for granted (like transparently swap some of a process's memory to disk, then transparently load it back when the process tries to use it).
A process's address space (a.k.a. virtual address space, a.k.a. addressable memory) contains:
a huge region of memory that's reserved for the Windows kernel, which the process isn't allowed to touch;
regions of virtual memory that are "unmapped", i.e. nothing is loaded there, there's no physical memory assigned to those addresses, and the process will crash if it tries to access them;
parts the various modules (EXE and DLL files) that have been loaded (each of these contains machine code, string constants, and other data); and
whatever other memory the process has allocated from the system.
Now typically a process lets the C Runtime Library or the Win32 libraries do most of the super-low-level memory management, which includes setting up:
a stack (for each thread), where local variables and function arguments and return values are stored; and
a heap, where memory is allocated if the process calls malloc or does new X.
For more about the stack is structured, read about calling conventions. For more about how the heap is structured, read about malloc implementations. In general the stack really is a stack, a last-in-first-out data structure, containing arguments, local variables, and the occasional temporary result, and not much more. Since it is easy for a program to write straight past the end of the stack (the common C/C++ bug after which this site is named), the system libraries typically make sure that there is an unmapped page adjacent to the stack. This makes the process crash instantly when such a bug happens, so it's much easier to debug (and the process is killed before it can do any more damage).
The heap is not really a heap in the data structure sense. It's a data structure maintained by the CRT or Win32 library that takes pages of memory from the operating system and parcels them out whenever the process requests small pieces of memory via malloc and friends. (Note that the OS does not micromanage this; a process can to a large extent manage its address space however it wants, if it doesn't like the way the CRT does it.)
A process can also request pages directly from the operating system, using an API like VirtualAlloc or MapViewOfFile.
There's more, but I'd better stop!
For understanding stack frame structure you can refer to
http://en.wikipedia.org/wiki/Call_stack
It gives you information about structure of call stack, how locals , globals , return address is stored on call stack
Another good illustration
http://www.cs.uleth.ca/~holzmann/C/system/memorylayout.pdf
It might not be the most accurate information, but MS Press provides some sample chapters of of the book Inside Microsoft® Windows® 2000, Third Edition, containing information about processes and their creation along with images of some important data structures.
I also stumbled upon this PDF that summarizes some of the above information in an nice chart.
But all the provided information is more from the OS point of view and not to much detailed about the application aspects.
Actually - you won't get far in this matter with at least a little bit of knowledge in Assembler. I'd recoomend a reversing (tutorial) site, e.g. OpenRCE.org.