Understanding the memory layout of a 4-byte integer

Understanding the memory layout of a 4-byte integer - memory

I am working with the MIPS architecture (not sure if this is relevant since we are dealing with memory).
I am told that A 32-bit integer is in memory at physical address 0x00A0CE48.
I assume that number is 00000000111111110000000011111111.
The system is byte-addressable, what value would be at memory address 0x00A0?
I wasn't sure if the first 8 bits were at address 0x00, next 8 bits at 0x00A0, next 8 bits at 0x00A0CE, and the last 8 bits at 0x00A0CE48? I'm asking because I have to manipulate a value in 0x00A0, but im not sure what's there.
Part of the problem is to 1st assume big endian is used, then little endian.
A 32-bit integer resides in memory at physical address 0x00A0CE48. The bits within the 32-bit word are numbered 0 to 31 from least significant bit to most significant bit. The code below extracts a single bit from this 32 bit pattern and places the bit into $t4.
lui $t0,0x00A0
ori $t0,$t0,0xCE48
lbu $t4,2($t0)
srl $t4,$t4,5
andi $t4,$t4,1
The next question in my assignment is to indicate the number of the bit (0 through 31) within the 32-bit word that is left in $t4 if the memory order used is little-endian or big-endian.

On a big endian system, the most significant byte is stored first. So, assuming the value is 0x12345678, 0x12 will be stored at address 0x00A0CE48, 0x34 will be stored at address 0x00A0CE49, 0x56 will be stored at address 0x00A0CE4A, and 0x78 will be stored at address 0x00A0CE4B.
On the other hand, on a little endian system, the least significant byte will be stored first. So, 0x78 would be stored at 0x00A0CE48, and so on.
Note that if a 32-bit word is stored at address 0x00A0CE48, the next word will be four bytes later, at address 0x00A0CE4C. The arithmetic should be performed on the address as a whole. You cannot consider the bytes making up the address separately when reading from memory.
In the assembly you've posted, lui (which stands for "load upper immediate") will shift the immediate value 16 bits to the left and store it in $t0. After that instruction, the value in $t0 will be 0x00A00000. The next instruction will OR the contents of $t0 with 0xCE48 and store the results in $t0. After that, $t0 will contain your full address, 0x00A0CE48.

Related

how long is a memory address typically in bits

I am confused with so many terminologies that my instructor talks about such as word,byte addressing and memory location.
I was under the impression that for a 32-bit processor,
it can address upto 2^32 bits, which is 4.29 X 10^9 bits (NOT BYTES).
The way I think now is:
The memory is like an array of buckets each of 1 byte length.
when we say byte addressing (which I guess is the most common ones), each char is 1 byte and is retrieved from the first bucket (say for example).
for int the next 4 bytes are put together in little-endian ordering to compute the Integer value.
so each memory, I see it as, 8 bits or 1 byte, which can give upto 2^8 locations, this is far less than what cpu can address.
There is some very basic mis-understanding here on my part which if some experts can explain in simple terms that a prosepective CS-major student can it in once forever.
I have read various pages including this one on word and here the unit of address resolution is given as 8b for ARM, which adds more to my confusion.

The processor uses 32 bits to store an address. With 32 bits, you can store 2^32 distinct numbers, ranging from 0 to 2^32 - 1. "Byte addressing" means that each byte in memory is individually addressable, i.e. there is an address x which points to that specific byte. Since there are 2^32 different numbers you can put into a 32-bit address, we can address up to 2^32 bytes, or 4 GB.
It sounds like the key misconception is the meaning of "byte addressing." That only means that each individual byte has its own address. Addresses themselves are still composed of multiple bytes (4, in this case, since four 8-bit bytes are taken together and interpreted as a single 32-bit number).
I was under the impression that for a 32-bit processor, it can address upto 2^32 bits, which is 4.29 X 10^9 bits (NOT BYTES).
This is typically not the case -- bit-level addressing is quite rare. Byte addressing is far more common. You could design a CPU that worked this way, though. In that case as you said, you would be able to address up to 2^32 bits = 2^29 bytes (512 MiB).

For one bit, You would have 0 or 1 and For two bits, you would have 00, 01, 10, 11. For 8 bits, you would have 2^8 which is 256 address values. Address and Data are separate terms. Address is the location and Data is the content in that location. Data width(content) is how many bits you could store in one memory cell address.(Think like an apartment with bedrooms- each apartment in a building has two bedrooms)and Data depth(address) is how many addresses you would have(In a building how many apartments you would have #1 thru #1400 etc). One bit in the CPU register can reference an individual byte in memory like one number in apartment number can reference one apartment. SIMM module RAMs had 32 bit Data width and DIMM modules have 64 bit Data width. It means in one memory address in DIMM, It stores 64 bits data. How many addresses can be multiplexed by two wires (two bit processing), you could make 4 addresses. (Each of these addresses could hold 64 bits if it is DIMM module ). 32 bit processing means, 32 wires, 2^32 address options. Even though, 64 bit processing has 64 bit registers and internal bus (wires) as 64 bit, http://www.tech-faq.com/address-bus.html, address bus max is 44 bits. means 2^44 maximum addressing can be achieved by Intel Super Server CPU Itanium 2.

Interpretation of memory layout

On this page, there is an example image of part of memory layout for a program :
What does each line in such an image represent? Does each line represent a single line of physical memory ? Usually physical memory has 32-bit or 64 bit as each line, so in that case, does each line in the image cover several lines of physical memory ?
Thanks.

Each two digit group represents one octet (byte). They are often laid out in groups of 16 octets because it fits well on a printed page or terminal screen. The address of the leftmost octet is in the left column (e.g. 00430020).
This representation is used as a typographic convention and does not necessarily have any relation to the physical structure of memory.

One would hope it was patently obvious, but...
Each line in the above diagram represents 16 bytes (given that the address advances by hex 0x10 with each line and given that there are 16 bytes on each line).
"Usually physical memory has 32-bit or 64 bit as each line" -- Well, no. Physical memory is primarily divided into 8-bit bytes. The computer may have a 32 or 64-bit wide transfer path from memory to the CPU, but the width of that path is irrelevant to understanding diagrams such as the above. (The term "line" inside a computer basically applies only to "cache line", which is a group of bytes from maybe 16 bytes long to possibly 256 bytes long (depending on the design) which are together in a "cache" -- a high-speed "snapshot" of portions of memory. But such cache operates "transparently" so you can ignore its existence for most purposes.)
What you will observe in the above diagram is that the data in a 32-bit address is "little endian" -- the first "next" field is 30 00 43 00, while the location it's pointing to is 00430030. The bytes in memory are reversed from how you or I would "naturally" read them.
So this diagram is simply showing some simple structures in memory and how they address each other.

Endian-ness: Bits in a Byte vs. Bytes in Memory

When we say a specific architecture is either little-endian or big-endian, we are referring to the whether numerical significance is stored from left-to-right or right-to-left in memory. My question is: does this ordering refer to how bits or ordered in a byte, or how bytes are ordered in a memory?
For example, consider the number 6000=1770h=0001011101110000b. If both bits in a byte and byte in memory are little-endian, this would be stored as
00001110 11101000 = 0E E8,
if bits in a byte were big-endian, but bytes in memory were little-endian, this would be stored as (for what it's worth, this happens to be how Visual Studio seems to be telling me that memory is organized in x64 architecture)
01110000 00010111 = 70 17,
if bits were little-endian, but bytes were big-endian, this would be stored as
11101000 00001110 = 0E E8,
and finally, if bits were big-endian, but bytes were little-endian, this would be stored as
00010111 01110000 = 17 70
(Hopefully I did that right.)
So then, what do the terms "little-endian" and "big-endian" actually refer to? Do the terms refer to the ordering of bits in a byte, or the ordering of bytes in memory, or both? Furthermore, if VS tells me that, for example, 7C, is 'in' a given particular byte, do they mean that the bits that make up that byte in computer memory are literally 0111 1100, or do they just mean that the value stored in that byte is 7Ch=124, but or may not be actually represented as 7c=01111100 depending on whether or not the underlying architecture happens to be little-endian?

The ordering of bits in a byte is invisible. Since you can't address individual bits, there would be no difference between the two cases. However, you can address individual bytes, so there it does make a difference.
If we're expressing 6000 in byte-addressible memory, the high byte is 23 decimal (6000 divided by 256) and the low byte is 112 decimal (6000 mod 256). We could store this as 23,112 or 112,23. There are no other options. Only the ordering of bytes is an open choice, and this is what endianness refers to.

In memory, little-endian or big-endian is not so much left-to-right or right-to-left issue as one of addressing. In little endian, the least significant portion of data is stored in the lower addressed locations and the reverse with big endian.
The ordering occurs independently at 2 levels. As most machines address more than 1 bit at a time (recall some graphic CPUs that did have bit addresses), the address will locate a group of bits, typically 8 bits. So if the bits at address 10 are less significant than the bits at address 11, it is a little endian machine. This is generalized as byte endian-ness. The processor's characteristics define this.
The endian-ness of the group of bits, the bit endian-ness, is significant if there is a way to address them in some fashion. Some processors provide operations that use a bit level address within the byte (or word). If your programming language directly allows you to use that or hides that is another question.
In C there are bit fields such as
union u {
unsigned char uc;
struct s {
int a :1;
int b :7;
};
};
This code is non-portable because of bit endian-ness. u.uc = 7 may result in u.s.b also being 7 or something else. Typically the byte endian-ness and bit endian-ness are the same. But the compiler controls the above example.
Bit endian-ness is also significant in serial communication. As 1 bit is sent/received sequentially, its construction to/from memory needs endian-ness definition.
Conclude:
Big-endian and little endian most often refer to the "byte" level addressing. The endian-ness of the bit is typically either the same or of select importance to the programmer.
BTW, your example of "If both bits in a byte and byte in memory are little-endian, this would be stored as
00001110 11101000 = 0E E8
I would suggest is not correct as the left side and right side are using different endian-ness. Had you used the same endian-ness, you may conclude
00001110 11101000 = 07 71
For fun consider:
01000000 (big endian) has value "sixty-four" (A big endian word)
10110000 (little endian) has value "thirteen". (Thirteen is little endian word)

Why are memory addresses incremented by 4 in MIPS?

If something is stored at 0x1001 0000 the next thing is stored at 0x1001 0004. And if I'm correct the memory pieces in a 32-bit architecture are 32 bits each. So would 0x1001 0002 point to the second half of the 32 bits?

First of all, memory addresses in MIPS architecture are not incremented by 4. MIPS uses byte addressing, so you can address any byte from memory (see e.g. lb and lbu to read a single byte, lh and lhu to read a half-word).
The fact is that if you read words which are 32 bits length (4 bytes, lw), then two consecutive words will be 4 bytes away from each other. In this case, you would add 4 to the address of the first word to get the address of the next word.
Beside this, if you read words you have to align them in multiples of 4, otherwise you will get an alignment exception.
In your example, if the first word is stored in 0x10010000 then the next word will be in 0x10010004 and of course the first half/second half would be in 0x1001000 and 0x1001002 (the ordering will depend on the endianness of the architecture).

You seem to have answered this one yourself! 32 bits make 4 bytes, so if you're e.g. pushing to a stack, where all elements are pushed as the same size, each next item will be 4 bytes ahead (or before) the next.

Difference between word addressable and byte addressable

Can someone explain what's the different between Word and Byte addressable? How is it related to memory size etc.?

A byte is a memory unit for storage
A memory chip is full of such bytes.
Memory units are addressable. That is the only way we can use memory.
In reality, memory is only byte addressable. It means:
A binary address always points to a single byte only.
A word is just a group of bytes – 2, 4, 8 depending upon the data bus size of the CPU.
To understand the memory operation fully, you must be familiar with the various registers of the CPU and the memory ports of the RAM. I assume you know their meaning:
MAR(memory address register)
MDR(memory data register)
PC(program counter register)
MBR(memory buffer register)
RAM has two kinds of memory ports:
32-bits for data/addresses
8-bit for OPCODE.
Suppose CPU wants to read a word (say 4 bytes) from the address xyz onwards. CPU would put the address on the MAR, sends a memory read signal to the memory controller chip. On receiving the address and read signal, memory controller would connect the data bus to 32-bit port and 4 bytes starting from the address xyz would flow out of the port to the MDR.
If the CPU wants to fetch the next instruction, it would put the address onto the PC register and sends a fetch signal to the memory controller. On receiving the address and fetch signal, memory controller would connect the data bus to 8-bit port and a single byte long opcode located at the address received would flow out of the RAM into the CPU's MDR.
So that is what it means when we say a certain register is memory addressable or byte addressable. Now what will happen when you put, say decimal 2 in binary on the MAR with an intention to read the word 2, not (byte no 2)?
Word no 2 means bytes 4, 5, 6, 7 for 32-bit machine. In real physical memory is byte addressable only. So there is a trick to handle word addressing.
When MAR is placed on the address bus, its 32-bits do not map onto the 32 address lines(0-31 respectively). Instead, MAR bit 0 is wired to address bus line 2, MAR bit 1 is wired to address bus line 3 and so on. The upper 2 bits of MAR are discarded since they are only needed for word addresses above 2^32 none of which are legal for our 32 bit machine.
Using this mapping, when MAR is 1, address 4 is put on the bus, when MAR is 2, address 8 is put on the bus and so forth.
It is a bit difficult in the beginning to understand. I learnt it from Andrew Tanenbaums's structured computer organisation.

This image should make it easy to understand:
http://i.stack.imgur.com/rpB7N.png
Simply put,
• In the byte addressing scheme, the first word starts at address 0, and
the second word starts at address 4.
• In the word addressing scheme, all bytes of the first word are located
in address 0, and all bytes of the second word are located in address 1.
The advantage of byte-addressability are clear when we consider applications that process data one byte at a time. Access of a single byte in a byte-addressable system requires only the issuing of a single address. In a 16–bit word addressable system, it is necessary first to compute the address of the word containing the byte, fetch that word, and then extract the byte from the two-byte word. Although the processes for byte extraction are well understood, they are less efficient than directly accessing the byte. For this reason, many modern machines are byte addressable.

Addressability is the size of a unit of memory that has its own address. It's also the smallest chunk of memory that you can modify without affecting its neighbours.
For example: a machine where bytes are the normal 8 bits, and the word-size = 4 bytes. If it's a word-addressable machine, there's no such thing as the address of the second byte of an int. Dealing with strings (e.g. an array like char str[]) becomes inconvenient, because you still store characters packed together. Modifying just str[1] means loading the word that contains it, doing some shift/and/or operations to apply the change, then doing a word store.
Note that this is different from a machine that doesn't allow unaligned word load/stores (where the low 2 bits of a word address have to be 0). Such machines usually have a byte load/store instruction. We're talking about machines without even that.
CPU addresses might actually still include the low bits, but require them to always be zero (or ignore them). However, after checking that they're zero, the could be discarded, so the rest of the memory system only sees the word address, where two adjacent words have an address that differs by 1 (not 4). However, on a 16-bit CPU where a register can only hold 64k different addresses, you wouldn't likely do this. Each separate CPU address would refer to a different 2 bytes of memory, instead of discarding the low bit. 2B word-addressable memory would let you address 128kiB of memory, instead of just 64kiB with byte-addressable memory.
Fun fact: ARM used to use the low 2 bits of an address as a shuffle control for unaligned word loads. (But it always had byte load/store instructions.)
See also:
https://en.wikipedia.org/wiki/Word-addressable
https://en.wikipedia.org/wiki/Byte_addressing
Note that bit-addressable memory could exist, but doesn't. 8-bit bytes are nearly universally standard now. (Ancient computers sometimes had larger bytes, see the history section of wikipedia's Byte article.)

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart