MIPS Stack - Erasing Contents Clarification

My Textbook says: Notice in this example, you aren't actually erasing the stack contents. If you're storing secret data, you should overwrite it with zeroes, or (ideally) garbage data. For
now, we can just leave our garbage lying around.
Aren't the contents that are saved on the stack removed after executing this?
add $30, $30, $4
Since it pops off all the saved contents.

Merely incrementing the stack pointer does not change the contents of memory below the stack. It's the same as working with a string variable in C:
char str[200] = { "Secret data" };
strcpy(str, "foo");
At this point the variable's useful value by C convention is "foo". But of course the array contains
foo\0et data\0< 188 more zero bytes >
(The strcpy only overwrote the first four bytes, so most of the secret is still sitting there just past the terminator.) Just as a function that cares to look can see the leftover secret in the string, malicious code running in the same process, knowing that your code doesn't erase its stack, could read old "popped" data there. If it's a password or another value conferring privilege, the bad guys are ahead.
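For completeness, here is a minimal C sketch of what "overwrite it with zeroes" looks like in practice. It is only an illustration: a plain memset can, in principle, be optimized away when the buffer is about to go out of scope, so where they exist, functions such as explicit_bzero() or memset_s() are the safer choice.

#include <string.h>

void use_secret(const char *secret)
{
    char buf[200];

    strncpy(buf, secret, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    /* ... work with the secret ... */

    /* Scrub the frame before "popping" it. A plain memset may be elided by
       an optimizing compiler; explicit_bzero() / memset_s() are meant to
       survive optimization on platforms that provide them. */
    memset(buf, 0, sizeof buf);
}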

Tracking address when writing to flash

My system needs to store data in an EEPROM flash. Strings of bytes will be written to the EEPROM one at a time, not all at once. The length of the strings may vary. I want the strings to be saved in order, without wasting any space, by continuing from the last write address. For example, if the first string of bytes was written at address 0x00~0x08, then I want the second string of bytes to be written starting at address 0x09.
How can this be achieved? I found that some EEPROMs' write commands do not require the address to be specified and just continue from the last written point. But the EEPROM I am using does not support that. (I am using Spansion's S25FL1-K.) I thought about allocating part of the memory to track the address and storing the address every time I write, but that might wear out the flash faster. What is a widely used method to handle such a case?
Thanks.
EDIT:
What I am asking is how to track/save the address in a non-volatile way so that when next write happens, I know what address to start.
I have never worked with this particular flash, but I've implemented something similar. Unfortunately, without knowing your constraints / priorities (memory- or CPU-efficient, how often writes happen, etc.) it is impossible to give a definite answer. Here are some techniques that you may want to consider. I don't know if they are widely used, though.
Option 1: Write X bytes containing the string length before the string. Then on initialization you can parse your flash: read the length n, jump n bytes forward, and read the next byte. If it's empty (all ones for your flash, according to the datasheet) then you've found your first empty byte. Otherwise you've just read the length of the next string, so do the same over again.
This method allows you to quickly search for the last used location, since the first byte of each used entry is guaranteed to have a value. The flip side is the overhead of X extra bytes (depending on the max string length) each time you write a string, and having to parse the flash to get the value (although this only needs to be done once, at boot).
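A rough C sketch of that boot-time scan, assuming a one-byte length prefix, an erased-byte value of 0xFF, and a hypothetical flash_read() helper (the names and sizes here are placeholders, not part of the S25FL1-K API). Note that with a one-byte prefix a length of 0xFF must be reserved, i.e. strings are capped at 254 bytes.

#include <stdint.h>

#define FLASH_SIZE 0x20000u            /* placeholder: adjust to the real device size */

/* Hypothetical helper: read one byte at 'addr' from the flash. */
extern uint8_t flash_read(uint32_t addr);

/* Returns the address where the next length byte should be written. */
uint32_t find_next_write_addr(void)
{
    uint32_t addr = 0;

    while (addr < FLASH_SIZE) {
        uint8_t len = flash_read(addr);
        if (len == 0xFF)               /* erased byte: nothing written here yet */
            return addr;
        addr += 1u + len;              /* skip the length byte plus the string */
    }
    return FLASH_SIZE;                 /* flash is full */
}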
Option 2: Instead of prepending the size, append a unique "end-of-string" sequence, and then parse on boot for the last such sequence before the ones that represent empty flash.
The disadvantage here is a longer parse, but you could possibly get away with just one byte of overhead per string.
Option 3 would be just what you already thought of: allocating a separate sector that contains the value you need. To reduce flash wear you could also write these values back-to-back and search for the last one each time you boot. Also, you might weigh the expected lifetime of the device you are programming against the 100,000 erases that your flash can sustain (again, according to the datasheet) - is wear even a problem? That of course depends on how often data will be saved.
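A sketch of that journal idea, again with a hypothetical flash_read() helper (placeholder name and addresses): the tracking sector is filled with 4-byte address records written back-to-back, and at boot you scan for the last record that is not still erased (all 0xFF).

#include <stdint.h>

#define JOURNAL_BASE  0x1F000u     /* placeholder: start of the dedicated tracking sector */
#define JOURNAL_SIZE  0x1000u      /* one 4 KiB sector */
#define ERASED_WORD   0xFFFFFFFFu

extern uint8_t flash_read(uint32_t addr);

static uint32_t read_word(uint32_t addr)
{
    return (uint32_t)flash_read(addr)
         | (uint32_t)flash_read(addr + 1) << 8
         | (uint32_t)flash_read(addr + 2) << 16
         | (uint32_t)flash_read(addr + 3) << 24;
}

/* Scan the journal for the most recent record; returns ERASED_WORD if none yet. */
uint32_t last_recorded_addr(void)
{
    uint32_t last = ERASED_WORD;

    for (uint32_t off = 0; off + 4 <= JOURNAL_SIZE; off += 4) {
        uint32_t w = read_word(JOURNAL_BASE + off);
        if (w == ERASED_WORD)
            break;                 /* first erased slot: the previous record was the last */
        last = w;
    }
    return last;
}

Recording a new address then just means writing the next free 4-byte slot; the sector only needs erasing once all 1024 slots are used, which spreads the wear considerably.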
Hope that helps.

Does the whole linkage section return?

In program a
EXEC CICS LINK
PROGRAM(PGMB)
COMMAREA(COMMA)
LENGTH(LENGTH OF COMMA)
RESP(CICS-RESP)
END-EXEC
In program b
EXEC CICS RETURN
END-EXEC
Does program b only return the commarea that program a passed? Or does it return the whole LINKAGE SECTION?
Program B returns neither the entire LINKAGE SECTION nor the commarea (COMMA in your example).
It returns nothing.
Why does it return nothing? Because nothing gets passed to it.
Or, rather, what gets passed to it is simply the address(es) of the parameter(s). Nothing else. That is all. Importantly, no length.
PROGA
01 some-stuff.
05 a-bit-of-stuff PIC X.
05 the-rest-of-the-stuff PIC X(99).
CALL .... USING a-bit-of-stuff
PROGB
LINKAGE SECTION.
01 stuff-that-is-somewhere-else PIC X(100).
PROCEDURE DIVISION USING stuff-that-is-somewhere-else.
a-bit-of-stuff is defined as only one byte. This makes no difference. What counts is the definition, in the LINKAGE SECTION, of the item on the PROCEDURE DIVISION USING ..., which is matched, in order of reference and nothing else, to the CALL ... USING ...
PROGB will be "passed" the address of a-bit-of-stuff. If that address is then mapped to 100 bytes in the LINKAGE SECTION of the CALLed program, COBOL does not mind at all.
If we change that example CALL to use instead some-stuff, since some-stuff has the same starting address as a-bit-of-stuff, there would be absolutely no change in the generated code, and no change in the execution of the two programs.
Defining different sizes of data "between" CALLer and CALLed is usually not done, because it makes things less clear to us, humans. The compiler cares not one jot.
You need to look at the 01s (or 77s, if that silly idea takes your fancy) as a REDEFINES. They are a REDEFINES, an implicit one, of data which is defined somewhere else. No data is defined for items in the LINKAGE SECTION (there is one exception to that on the Mainframe). The 01-levels in the LINKAGE SECTION are just redefining, or mapping, the address of the data that is passed to the program. The data does not "leave" the CALLing program, and the data is never "passed back".
Things can go wrong, of course, if you use different lengths for matching items on the USINGs. If the storage from the CALLer is "acquired" (like a GETMAIN in CICS) then attempting to reference data outside of that storage, even one byte further on, can get you an abend due to an Addressing Exception (an S0C4, which CICS will kindly rename for you as an AKEA).
Even without acquired storage, other fields after the one "passed" can be accidentally trashed, or the field itself may not get the expected amount of data MOVEd to it by the CALLed program, if the definition is short in the CALLed program.
There are actually two things which get "returned" from a CALLed program. They are the special-register RETURN-CODE, and the single item on the RETURNING of the PROCEDURE DIVISION (if used, likely not).
Even so, the mechanisms of how those are achieved are different from the normal misunderstanding of data "passed" between CALLing and CALLed programs.
I have not had the joy of programming CICS for some time now, so this answer is based on what knowledge I still remember.
The calling program gets back at most the amount of data it sent, i.e. the size of the data area in the calling program (or the size specified by the optional LENGTH parameter). Don't try to access data beyond what you have sent.
"So if program x LINKS to program y, any updates done to the COMMAREA in y will be visible in x."
Source: SOVF:How CICS Shared Memory Works.
"When a communication area is passed by way of an EXEC CICS LINK command, the invoked program is passed a pointer to the communication area itself. Any changes that are made to the contents of the data area in the invoked program are available to the invoking program, when control returns to it; to access any such changes, the program names the data area that is specified in the original COMMAREA option." Source: IBM-CICS-Ref.
So, does program b only return the commarea that program a passed?
I would answer the above as Yes.
Does it return the whole linkage section?
As for this one, it depends on the structure of the DFHCOMMAREA of the linked program. If it only contains one such area, then the answer is that it returns as many bytes as were sent on the LINK command from that area (either implicitly or explicitly). Remember that this area is outside your caller. So, if the caller sends 100 bytes and the linkage section has an area of 500 bytes, you only get 100 back at the most.
If you want to allow your linked program to modify data in the commarea, there are some very serious limitations.
Exec CICS
Return
End-Exec
Will expose changes in a commarea to the LINKing program, but only by accident and only if both tasks execute on the same CICS region. This is because the commarea is actually a pointer. On a distributed program link, the area is copied, but not copied back.

Are the name of the variable and the data itself, both of them stored in the stack?

My question is : Are the name of the variable and the data itself both stored in the stack?
I would like to know how the name of the variable is linked to the memory address on the stack (the data), and what does that linking.
Also, how does anything know how many bytes the variable's type is composed of, and how does it decide to read exactly that number of bytes from the stack?
Does all the data stored on the stack occupy the same space, no matter what type of data it is?
And the same questions for the heap?
Generally, I believe the following to be true in most practical implementations:
No, the name and actual data are not both stored on the stack.
The compiler keeps track of where the variable is on the stack, and when the compiler is done, all references to the variable (i.e. the name) have been substituted by a proper increase/decrease of the stack pointer to address the memory area where the data is stored.
No, they do not occupy the same space. A 4-byte variable takes up 4 bytes. A 1,000,000-byte variable takes up 1,000,000 bytes (but that's not recommended, usually).
The heap is a bit different... Maybe this page can answer your question a bit more: http://www.learncpp.com/cpp-tutorial/79-the-stack-and-the-heap
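To make the first and last points concrete, here is a small C illustration (the exact addresses and layout are up to the compiler and ABI): the names a and big only exist in the source; at run time all that is left are addresses and sizes.

#include <stdio.h>

int main(void)
{
    int  a = 42;         /* 4 bytes on typical platforms */
    char big[1000];      /* a 1,000-byte variable takes up 1,000 bytes */

    big[0] = 'x';

    /* The names are gone after compilation; only addresses remain. */
    printf("a   lives at %p and occupies %zu bytes\n", (void *)&a,  sizeof a);
    printf("big lives at %p and occupies %zu bytes\n", (void *)big, sizeof big);
    return 0;
}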

Reading from a stack and memory allocation at compile time

Objects can be put on and removed only from the top of a stack. But what about reading and writing their values? Please correct me if I'm wrong, but I think a process must be able to read from any part of the stack, since if only reading from the top were possible it would have to remove (and store somewhere) the whole content of the stack above a variable it wants to examine. But in that case, how does the process know where exactly on the stack a particular variable is? I suspect it just holds a pointer to it, but where is that pointer stored?
Another thing - reading about stacks I often find phrases like "All memory allocated on the stack is known at compile time." Well, I probably misunderstand this, so please tell me where's the flaw in my logic:
Suppose a local variable is created when an if() statement is true, and isn't when it's false. Whether it's true will turn out at run time. So at compile time there's no way to know if it should be created, hence I wouldn't think memory for it is allocated at all, as it would be wasteful. Consequently, it isn't created/known at compile time.
At compile time, it's known how much space each type needs: An Integer, for instance, is 4 Bytes wide on 32 bit platforms, and a class with 2 Integers consumes 8 Bytes. Whether this space is allocated for a specific variable is not necessarily known (may depend on an if, as you stated).
When you invoke a method, all parameters and the return address are pushed onto the stack. To get at one parameter, you walk up the stack to its position, which is computed from the base pointer and the size of each parameter.
So it is not entirely true for this stack that you can access the top element only. It is, however, for the Stack data structure.
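A small C example of the if() point: the size of tmp is known at compile time, and most compilers simply reserve the whole frame (including tmp's slot) on function entry whether or not the branch is taken - but that is an implementation detail, not something the language guarantees.

#include <stdio.h>

void f(int flag)
{
    int always = 1;            /* always part of the frame */

    if (flag) {
        int tmp = 99;          /* size known at compile time: sizeof(int) */
        printf("tmp = %d\n", tmp);
    }

    printf("always = %d\n", always);
}

int main(void)
{
    f(0);
    f(1);
    return 0;
}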

What is the purpose of each of the memory locations, stack, heap, etc? (lost in technicalities)

Ok, I asked about the difference between stack overflow and buffer overflow yesterday, almost got voted down to oblivion, and got no new information.
So it got me thinking and I decided to rephrase my question in the hopes that I get reply which actually solves my issue.
So here goes nothing.
I am aware of four memory segments (correct me if I am wrong): the code, data, stack and heap. Now AFAIK the code segment stores the code, while the data segment stores the data related to the program. What seriously confuses me is the purpose of the stack and the heap!
From what I have understood, when you run a function, all the data related to that function gets stored on the stack, and when you recursively call a function inside a function, inside a function... while each function is waiting on the output of the function it called, those functions and their data don't pop off the stack. So you end up with a stack overflow. (Again, please correct me if I am wrong.)
Also, I know what the heap is for. As I have read someplace, it's for dynamically allocating data while a program is executing. But this raises more questions than it solves. What happens when I initially initialize my variables in the code? Are they in the code segment, in the data segment, or in the heap? Where do arrays get stored? Is it that after my code executes, all that was in my heap gets erased? All in all, please tell me about the heap in a more simplified manner than just "it's for malloc and alloc", because I am not sure I completely understand what those terms are!
I hope people, when answering, don't get lost in the technicalities and can keep the terms simple for a layman to understand (even if the concept being described isn't laymanish), and keep educating us with the technical terms as we go along. I also hope this is not too big a question, because I seriously think these things could not be asked separately!
What is the stack for?
Every program is made up of functions / subroutines / whatever your language of choice calls them. Almost always, those functions have some local state. Even in a simple for loop, you need somewhere to keep track of the loop counter, right? That has to be stored in memory somewhere.
The other thing functions almost always do is call other functions. Those other functions have their own local state - their local variables. You don't want those local variables to interfere with the locals in your caller. The other thing that has to happen is, when FunctionA calls FunctionB and then has to do something else afterwards, you want the local variables in FunctionA to still be there, with their same values, when FunctionB is done.
Keeping track of these local variables is what the stack is for. Each function call is done by setting up what's called a stack frame. The stack frame typically includes the return address of the caller (for when the function is finished), the values for any method parameters, and storage for any local variables.
When a second function is called, then a new stack frame is created, pushed onto the top of the stack, and the call happens. The new function can happily work away on its stack frame. When that second function returns, its stack frame is popped (removed from the stack) and the caller's frame is back in place just like it was before.
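A tiny C illustration of frames being pushed and popped: every level of the recursion gets its own copy of n in its own frame, so when the deeper call returns, the caller's n is still there with its old value.

#include <stdio.h>

void countdown(int n)
{
    if (n == 0)
        return;
    countdown(n - 1);    /* pushes a new frame with its own copy of n */
    printf("back in the frame where n = %d\n", n);    /* unchanged after the return */
}

int main(void)
{
    countdown(3);
    return 0;
}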
So that's the stack. So what's the heap? It's got a similar use - a place to store data. However, there's often a need for data that lives longer than a single stack frame. It can't go on the stack, because when the function call returns, its stack frame is cleaned up and boom - there goes your data. So you put it on the heap instead. The heap is a basically unstructured chunk of memory. You ask for x number of bytes, and you get it, and can then party on it. In C / C++, heap memory stays allocated until you explicitly deallocate it. In garbage-collected languages (Java/C#/Python/etc.) heap memory will be freed when the objects on it aren't used anymore.
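A minimal C sketch of that difference: returning the address of a local is a bug because its frame is gone as soon as the function returns, while the heap allocation survives until the caller frees it.

#include <stdlib.h>

int *broken(void)
{
    int local = 5;
    return &local;                   /* wrong: the frame (and 'local') is gone after return */
}

int *fine(void)
{
    int *p = malloc(sizeof *p);      /* lives on the heap until explicitly freed */
    if (p)
        *p = 5;
    return p;                        /* the caller is responsible for calling free(p) */
}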
To tackle your specific questions from above:
What's the difference between a stack overflow and a buffer overflow?
They're both cases of running over a memory limit. A stack overflow is specific to the stack; you've written your code (recursion is a common, but not the only, cause) so that it has too many nested function calls, or you're storing a lot of large stuff on the stack, and it runs out of room. Most OS's put a limit on the maximum size the stack can reach, and when you hit that limit you get the stack overflow. Modern hardware can detect stack overflows and it's usually doom for your process.
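The usual way to hit that limit, as a deliberately broken C sketch: every call pushes another frame onto the stack, and since there is no base case the frames pile up until the OS's limit is exceeded and the process dies.

void recurse(int depth)
{
    char frame_filler[1024];         /* each call eats another chunk of stack */
    frame_filler[0] = (char)depth;
    recurse(depth + 1);              /* no base case: frames pile up until the stack overflows */
    frame_filler[0]++;               /* touching the buffer after the call keeps this from
                                        being turned into a tail call by the optimizer */
}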
A buffer overflow is a little different. So first question - what's a buffer? Well, it's a bounded chunk of memory. That memory could be on the heap, or it could be on the stack. But the important thing is you have X bytes that you know you have access to. You then write some code that writes X + more bytes into that space. The compiler has probably already used the space beyond your buffer for other things, and by writing too much, you've overwritten those other things. Buffer overruns are often not seen immediately, as you don't notice them until you try to do something with the other memory that's been trashed.
Also, remember how I mentioned that return addresses are stored on the stack too? This is the source of many security issues due to buffer overruns. You have code that uses a buffer on the stack and has an overflow vulnerability. A clever hacker can structure the data that overflows the buffer to overwrite that return address, to point to code in the buffer itself, and that's how they get code to execute. It's nasty.
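And a deliberately broken C sketch of a stack buffer overflow: the array can hold 8 bytes, the source string needs far more, and strcpy happily writes past the end, trashing whatever the compiler placed after the buffer (possibly including the saved return address). This is undefined behavior, shown only to illustrate the mechanism.

#include <string.h>

void vulnerable(const char *input)
{
    char buf[8];
    strcpy(buf, input);              /* no bounds check: anything past 7 characters plus the
                                        terminator overwrites neighbouring stack memory */
}

int main(void)
{
    vulnerable("way more than eight bytes of attacker-controlled data");
    return 0;
}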
What happens when I initially initialize my variables in the code.. Are they in the code segment or in the data segment or in the heap?
I'm going to talk from a C / C++ perspective here. Assuming you've got a variable declaration:
int i;
That reserves (typically) four bytes on the stack. If instead you have:
char *buffer = malloc(100);
That actually reserves two chunks of memory. The call to malloc allocates 100 bytes on the heap. But you also need storage for the pointer, buffer. That storage is, again, on the stack, and on a 32-bit machine will be 4 bytes (64-bit machine will use 8 bytes).
Where do arrays get stored...???
It depends on how you declare them. If you do a simple array:
char str[128];
for example, that'll reserve 128 bytes on the stack. C never hits the heap unless you explicitly ask it to by calling an allocation method like malloc.
If instead you declare a pointer (like buffer above), the storage for the pointer is on the stack, and the actual data for the array is on the heap.
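Side by side, as a small C sketch of those two cases:

#include <stdlib.h>

void arrays(void)
{
    char on_stack[128];            /* all 128 bytes live in this function's stack frame */
    char *on_heap = malloc(128);   /* the pointer lives on the stack, the 128 bytes it
                                      points to live on the heap */
    if (on_heap == NULL)
        return;

    on_stack[0] = 'a';
    on_heap[0]  = 'a';

    free(on_heap);                 /* the heap block must be released explicitly;
                                      on_stack simply vanishes when the function returns */
}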
Is it that after my code executes all that was in my heap gets erased...???
Basically, yes. The OS will clean up the memory used by a process after it exits. The heap is a chunk of memory in your process, so the OS will clean it up. Although it depends on what you mean by "clean it up": the OS marks those chunks of RAM as free, and will reuse them later. If you have explicit cleanup code (like C++ destructors) you'll need to make sure it gets called; the OS won't call it for you.
All in all, please tell me about the heap in a more simplified manner than just "it's for malloc and alloc"?
The heap is, much like its name suggests, a bunch of free bytes that you can grab a piece of at a time, do whatever you want with, and then throw back to be used for something else. You grab a chunk of bytes by calling malloc, and you throw it back by calling free.
Why would you do this? Well, there's a couple of common reasons:
You don't know how many of a thing you need until run time (based on user input, for example). So you dynamically allocate on the heap as you need them.
You need large data structures. On Windows, for example, a thread's stack is limited by default to 1 meg. If you're working with large bitmaps, for example, that'll be a fast way to blow your stack and get a stack overflow. So you grab that space from the heap, which is usually much, much larger than the stack.
The code, data, stack and heap?
Not really a question, but I wanted to clarify. The "code" segment contains the executable bytes for your application. Typically code segments are read-only in memory, to help prevent tampering. The data segment contains constants that are compiled into the code - things like strings in your code or array initializers need to be stored somewhere, and the data segment is where they go. Again, the data segment is typically read-only.
The stack is a writable section of memory, and usually has a limited size. The OS will initialize the stack and the C startup code calls your main() function for you. The heap is also a writable section of memory. It's reserved by the OS, and functions like malloc and free manage getting chunks out of it and putting them back.
So, that's the overview. I hope this helps.
With respect to the stack... This is precisely where the parameters and local variables of functions / procedures are stored. To be more precise, only the params and local variables of the currently executing function are accessible on the stack... Other variables, belonging to the chain of functions that were executing before it, will be on the stack but will not be accessible until the current function completes its operations.
With respect to global variables, I believe these are stored in the data segment and are always accessible from any function within the program.
With respect to the heap... This is additional memory that can be allotted to your program whenever you need it (malloc or new)... You need to know where the allocated memory is in the heap (its address / a pointer) so that you can access it when you need to. If you lose the address, the memory becomes inaccessible, but the data still remains there. Depending on the platform and language, it either has to be manually freed by your program (or a memory leak occurs) or needs to be garbage collected. The heap is huge compared to the stack and hence can be used to store large volumes of data (like files, streams, etc.)... That's why objects / files are created on the heap and a pointer to the object / file is stored on the stack.
In terms of C/C++ programs, the data segment stores static (global) variables, the stack stores local variables, and the heap stores dynamically allocated variables (anything you malloc or new to get a pointer to). The code segment only stores the machine code (the part of your program that gets executed by the CPU).
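Tying that summary to concrete C declarations (the exact placement is compiler- and platform-dependent, but this is the usual mapping):

#include <stdlib.h>

const char *greeting = "hello";    /* the string literal sits in the (read-only) data segment */
int counter = 0;                   /* global variable: data segment */

void example(void)                 /* the compiled body of example() lives in the code segment */
{
    int local = 1;                              /* stack: exists only while this call is active */
    int *dynamic = malloc(sizeof *dynamic);     /* heap: exists until free() is called */

    if (dynamic) {
        *dynamic = local + counter;
        free(dynamic);
    }
}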
