How do data types get allocated on the stack in the MIPS architecture?
If I have 2 chars and 1 int, will the stack allocate them in 8-byte form (the 2 chars packed into the same word and the int in another) or in 12-byte form (one word for each char and one word for the int)? I am trying to understand the 32-bit MIPS architecture.
To answer the question: it matters whether the data being allocated is for local variables or for parameter passing.
For locals you can allocate whatever layout you want, as long as the int is aligned on a 4-byte boundary. The total stack allocation is then rounded up to a multiple of 8 bytes (some code doesn't bother with this, e.g. for homework; it is only strictly necessary if your function calls other functions that rely on the expected 8-byte alignment of the stack).
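A minimal sketch of how a compiler might lay out those locals; the offsets are illustrative, not mandated by any ABI:

void locals(void) {
    char c1;  /* could sit at sp+0 */
    char c2;  /* could sit at sp+1; chars need no alignment */
    int  n;   /* must be 4-byte aligned, e.g. at sp+4 */
    /* 8 bytes total, already a multiple of 8, so the prologue
       could simply be: addiu $sp, $sp, -8 */
}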
For parameters you should follow the documented calling convention; there are several, so you have to know which one you are working with. See here for some of them; look for "MIPS EABI 32-bit Calling Convention" and compare it with "MIPS O32 32-bit Calling Convention".
What they have in common is that the first four parameters are passed in registers, which effectively means that a char takes a full 32-bit word; char parameters passed on the stack follow the same form, so each also takes a full 32-bit word.
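As a hedged illustration of that widening (the register assignments assume O32-style rules):

/* Each char argument occupies a full 32-bit register or stack word. */
void f(char a, char b, char c, char d, char e);
/* a..d would travel in $a0..$a3, one word each;
   e would get a full 4-byte stack slot of its own. */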
Suppose I have the following primitive stack implementation for a virtual machine:
unsigned long stack[512];   /* 512 slots of 8 bytes each */
unsigned short top = 0;     /* index of the next free slot */

void push(unsigned long qword) {
    stack[top] = qword;
    top++;
}

void pop(void) {
    top--;
}

unsigned long get(void) {
    return stack[top - 1];  /* value currently on top of the stack */
}
This stack actually works fine (except that it doesn't check for overflow), but I now have the following problem: it is quite inefficient.
Here is an example:
Let's say I want to push a single byte onto the stack. I would have to cast it to a long and then push it, and a whole 7 bytes would go unused. This feels kind of wrong.
So now I have the following question:
How do stack machines efficiently store data types of different sizes? Do they do the same as in this implementation?
There are different metrics of efficiency. Using an eight-byte long to store a single byte raises memory consumption. On the other hand, memory is not the major concern on most of today's machines. Further, a stack is typically a pre-allocated block of memory. So as long as the entire block has not been exhausted, it is entirely irrelevant whether the unused seven bytes sit within that long or beyond the location marked by top.
In terms of CPU time, you gain no advantage from transferring a quantity smaller than the hardware's bus size. In the best case, it makes no difference. In the worst case, transferring a single byte boils down to reading a long from memory, manipulating one byte of it, and writing the long back. In that case, it would be more efficient to expand the byte to a long and overwrite all eight bytes explicitly.
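As a sketch of that last point, reusing push() from the question's snippet (push_byte is a made-up helper): widening the byte up front writes the full slot once.

void push_byte(unsigned char b) {
    push((unsigned long)b);  /* zero-extend, then store all 8 bytes */
}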
This is reflected in the design of Java bytecode, for example. Not only does it drop support for pushing and popping quantities smaller than 32 bits, it doesn't even have arithmetic instructions for them¹. So in most cases you don't even know that a quantity could be a byte before pushing it. Only formal parameter types and array types may refer to byte.
But note that a JVM isn't even a stack engine in the narrowest sense. There is no support for pushing and popping arbitrary numbers of items. As explained in this answer, expressing the intent using a stack allows very compact instructions. But Java bytecode doesn't allow branching to a code location with a different number of items on the stack, so it doesn't support pushing or popping items in a loop. In other words, for each instruction, the actual offset into the stack is predictable, and the operand types are known. So it is always possible to transform Java bytecode straightforwardly into an IR that doesn't use a stack. Such transformed code could use instructions with arbitrary operand sizes, if that benefits the particular target architecture.
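A small sketch of that transformation: because the stack depth at each instruction is fixed, every slot can be renamed to a local variable (a virtual register). The opcode names in the comments are standard JVM bytecodes; the C function itself is only an illustration.

int add_stack_style(int a, int b) {
    int slot0, slot1;       /* virtual registers for the two stack slots */
    slot0 = a;              /* iload_0 : push a        */
    slot1 = b;              /* iload_1 : push b        */
    slot0 = slot0 + slot1;  /* iadd    : pop 2, push 1 */
    return slot0;           /* ireturn                 */
}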
¹ And that design was accounting for the hardware in use a quarter century ago.
There's no "one true" way of doing this, and the Java VM uses a few different strategies. All types less than 32-bits in size are widened to 32-bits. Pushing 1 byte to the stack effectively pushes 4 bytes to the stack. The benefit is simplicity when there are fewer native value sizes to deal with.
Another strategy is used for 64-bit values: they occupy two stack slots instead of one. The JVM has opcodes that indicate which type of value they expect on the stack, and the verifier ensures that no opcode attempts to access a value on the stack that doesn't match the type that should be there.
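A sketch of the two-slot scheme on a stack of 32-bit slots (push32 is a hypothetical helper, and the slot order is illustrative):

void push32(unsigned int w);  /* pushes one 32-bit slot */

void push_long(unsigned long long v) {
    push32((unsigned int)(v >> 32));  /* high word, first slot  */
    push32((unsigned int)v);          /* low word, second slot */
}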
A third strategy is used for object references. The actual pointer size can be 32 bits or 64 bits, depending on the CPU capabilities, whether the JVM is running in 64-bit mode, etc. The JVM has specific opcodes for handling object references, and the verifier checks this too.
Hi, I am reading a guide on x86 by the University of Virginia, and it states that pushing and popping the stack removes or adds a 4-byte data element.
Why is this set to 4 bytes? Can this be changed? Could you save stack memory by pushing smaller data elements?
The guide can be found here if anyone wishes to view it:
http://www.cs.virginia.edu/~evans/cs216/guides/x86.html
Short answer: Yes, 16 or 32 bits. And, for x86-64, 64 bits.
The primary reasons for a stack are returning from nested function calls and saving/restoring register values. It is also typically used to pass parameters and return function results. Except for the smallest parameters, these items usually have the same size by the design of the processor, namely the size of the instruction pointer register: 16 bits for the 8088/8086, and 32 bits for the 80386 and its successors. Therefore, there is little value in stack instructions that operate on other sizes.
There is also the size of the data on the memory bus to consider. It takes the same amount of time to retrieve or store a word as it does a byte (except on the 8088, which has 16-bit registers but an 8-bit data bus). Alignment also comes into play: the stack should be aligned on word boundaries so that each value can be retrieved in one memory operation. The trade-off is usually made to save time at the cost of memory. To pass one byte as a parameter, one word is usually used. (Or, depending on the optimizations available to the compiler, a word-sized register is used, avoiding the stack altogether.)
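To illustrate that last point with a sketch (the function names are made up): a one-byte argument still consumes a full word under a typical 32-bit stack-based calling convention.

void takes_byte(char c);  /* hypothetical callee */

void caller(void) {
    takes_byte(7);  /* typically compiled as a full 4-byte push,
                       e.g. "push 7" on 32-bit x86 */
}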
I have a 64-bit kernel and I run 32-bit processes in userland. In the user-process code, if I declare a 64-bit variable, how will it be accessed? Will it incur 2 memory reads?
Basically, the scenario is: I need to use a 64-bit mask in my user process.
Approach 1: use a 64-bit (u64) variable.
Approach 2: use an array of two 32-bit variables.
First off: the kernel has no bearing on the answer to this question.
Second, I assume this is x86 you're talking about. Where possible, the compiler will place 64-bit values across 2 32-bit registers. For example, if you return a uint64_t from a function, the low 32 bits will be stored in the eax register, and the high bits will be in edx.
The compiler will generally do the right thing for performance and correctness: using an array will likely just confuse it and lead to worse results.
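A minimal sketch of the straightforward approach (the function and mask value are invented for illustration):

#include <stdint.h>

uint64_t apply_mask(uint64_t value) {
    const uint64_t mask = 0xFFFF0000FFFF0000ULL;  /* arbitrary example */
    return value & mask;  /* on 32-bit x86: two 32-bit AND operations */
}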
By the way, x86-64 CPUs will normally perform reads of 2 adjacent 32-bit words at the same speed as a single 64-bit read. The advantages of 64-bit mode are that arithmetic can be done directly on 64-bit values (1 64x64 multiplication instruction vs 3-4 32x32 instructions), there is much more space available in registers (16 registers instead of 8, registers are twice as wide), and of course the larger possible virtual address space.
Why is the smallest value that can be stored in memory a byte (8 bits) and not a bit (1 bit)?
Even booleans are stored as bytes. Will we ever bump the smallest unit up to 32 or 64 bits, like the registers on a CPU?
EDIT: To clarify, since many answers seemed confused about the nature of the question: this is about why a byte isn't 7 bits, 1 bit, 32 bits, etc. (not about why smaller primitives must fit within the hardware's byte at minimum). Is the 8-bit byte simply historical, given that some hardware has had 10-bit bytes, for example? Or is there a mathematical reason why 8 bits is ideal versus, say, 10 bits for general processing?
The hardware is built to read data in blocks (bytes, and later words and dwords). This is more efficient than accessing individual bits, and it also offers a larger addressing range. So most data is aligned to at least a byte boundary. Encodings that operate on bit sequences rather than bytes do exist, but they are quite rare.
Nowadays data is most often aligned to a dword (32-bit) boundary anyway. Moreover, some hardware (ARM, for example) can't access misaligned multibyte variables: a 16-bit word can't "cross" a dword boundary, or an exception will be thrown.
Because computers address memory at the byte level, anything smaller than a byte is not addressable.
The underlying methods of processor access are limited to the size of the smallest usable register, which on most architectures is 8 bits. You can still use smaller portions: for instance, C has bit-fields in structs, which allow combining fields that only need certain bit lengths. Access still requires the whole byte to be read.
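A small bit-field sketch; the exact struct size is implementation-defined, but it can never be smaller than one byte:

#include <stdio.h>

struct flags {
    unsigned ready : 1;
    unsigned error : 1;
    unsigned mode  : 3;  /* three fields packed into 5 bits */
};

int main(void) {
    struct flags f = { 1, 0, 5 };
    printf("%zu\n", sizeof f);  /* at least 1; commonly 4 here */
    return 0;
}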
Some older, exotic architectures actually did have a different word size; on those machines, 10 bits might have been the common size.
Lastly, processors are almost always backwards compatible. Intel, for instance, has maintained complete instruction compatibility from the 386 onward: a program compiled for the 386 will still run on an i7 processor. Changing the word size would break that compatibility. So while it is possible, no manufacturer will ever do it.
Assume we have a language that consists of only 2 characters, such as a and b. To distinguish the two characters we need at least 1 bit: for example, 0 to represent a and 1 to represent b.
If we count the letters, special characters, and symbols, there are 128 characters in total; to distinguish one character from another you need log2(128) = 7 bits, with an 8th bit for transmission.
void function(int a, int b, int c) {
    char buffer1[5];
    char buffer2[10];
}
We must remember that memory can only be addressed in multiples of the word size. A word in our case is 4 bytes, or 32 bits. So our 5-byte buffer is really going to take 8 bytes (2 words) of memory, and our 10-byte buffer is going to take 12 bytes (3 words) of memory. That is why SP is being subtracted by 20.
Why is it not ceil((5+10)/4)*4 = 16?
Because individual variables should be aligned. With your proposed formula, you'd align only the first variable on the stack, leaving following variables unaligned, which is bad for performance.
This is also known as "packing" and can be done in C/C++ with pragmas, but is only useful in very specific cases and can be dangerous both for performance and as a cause of potential runtime traps. Some processors will generate faults on unaligned accesses at runtime, which will crash your program.
The variables on your architecture are aligned individually. buffer1 gets rounded up to 8 and buffer2 to 12 so that both of their starting addresses are 4-byte aligned. So 8+12 = 20.
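A quick sketch that reproduces that arithmetic (round_up is a made-up helper, not an ABI function):

#include <stdio.h>

static unsigned round_up(unsigned n, unsigned align) {
    return (n + align - 1) / align * align;
}

int main(void) {
    unsigned b1 = round_up(5, 4);   /* buffer1: 5 -> 8  */
    unsigned b2 = round_up(10, 4);  /* buffer2: 10 -> 12 */
    printf("%u\n", b1 + b2);        /* prints 20 */
    return 0;
}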