C/C++: are objects binary compatible across platforms?

I would like to send some object data, in binary, between MCUs. I treat it as a cross-platform problem. The way I would like to implement it is something like this:
// MCU A
// someObj declared and initialized
Send((uint8_t*)&someObj, sizeof(someObj));

// MCU B
SomeClass someObj;
Read((uint8_t*)&someObj, sizeof(someObj));
Is there any guarantee in C/C++ that such a thing is possible?

There is no guarantee that it works. If your data is composed only of chars, it will probably work regardless of the platforms.
Otherwise, you will run into hardware and software problems.
Hardware problems include endianness and data alignment.
Endianness refers to the way multibyte data types are arranged in memory. For instance, an integer typically occupies 4 bytes, and some architectures store it in memory by writing the least significant byte at the lowest address (little endian, like the Pentium), while others store the most significant byte at the lowest address (big endian). If endianness differs, bytes must be swapped to ensure compatibility. Note that some platforms (ARM and MIPS, among others) can use either endianness, but it is generally selected at boot time. Also, some machines use different endianness for integers and floats.
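For illustration, a minimal sketch (the helper names here are mine, not from the answer) of checking the host's byte order and swapping a 32-bit value when sender and receiver disagree:

#include <stdint.h>

/* Returns 1 if the host is little-endian, 0 if big-endian. */
int is_little_endian(void) {
    uint16_t probe = 0x0102;
    return *(uint8_t *)&probe == 0x02;  /* lowest-addressed byte is the LSB? */
}

/* Swap the byte order of a 32-bit value, needed when the two MCUs
   disagree on endianness. */
uint32_t swap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u) | (v << 24);
}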
Alignment refers to the constraint, on many architectures, that a 2^k-byte datum must sit at an address that is a multiple of 2^k. Some architectures, like the Pentium, do not have this constraint and can manipulate unaligned data, but a compiler may still lay out data in an aligned way to improve performance. As a side effect of alignment constraints, a given object may not have the same size on different architectures, and sizeof() applied to a struct is not guaranteed to return the same value.
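As a concrete example (a sketch; the exact sizes depend on the compiler and target ABI), padding inserted for alignment means the same struct definition can have different sizes on different platforms:

#include <stdio.h>
#include <stdint.h>

struct Sample {
    uint8_t  flag;   /* 1 byte */
    uint32_t value;  /* usually 4-byte aligned, so 3 padding bytes may precede it */
};

int main(void) {
    /* Commonly prints 8 on targets that align uint32_t to 4 bytes,
       but could print 5 on a compiler/target without such padding. */
    printf("sizeof(struct Sample) = %zu\n", sizeof(struct Sample));
    return 0;
}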
Software problems are related to the nature of your data.
Obviously, if your data contains any kind of pointer, it is impossible to transfer it as-is across platforms.
If you have C++ objects with constructors/destructors, you will again run into problems when transferring raw binary data.
The process of converting data to allow a safe transfer across platforms is frequently called serialization or pickling. Many languages (Java, JavaScript, Python, R) have native support for it. In C/C++, there is no support for serialization in the language and custom serialization must be written, but frameworks like Boost or MFC provide serialization methods. You can also have a look at XDR (external data representation), a serialization standard supported by several libraries.
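A minimal hand-rolled sketch of that idea (the struct, field names, and wire format are made up for illustration): serialize each field explicitly into a byte buffer with a fixed byte order, and reverse the process on the receiving MCU. The two sides then agree only on the byte layout, never on a struct's in-memory representation.

#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint16_t id;
    int32_t  temperature;
} Measurement;

/* Encode into a fixed little-endian wire format; returns bytes written. */
size_t encode_measurement(uint8_t *buf, const Measurement *m) {
    uint32_t t = (uint32_t)m->temperature;
    buf[0] = (uint8_t)(m->id);
    buf[1] = (uint8_t)(m->id >> 8);
    buf[2] = (uint8_t)(t);
    buf[3] = (uint8_t)(t >> 8);
    buf[4] = (uint8_t)(t >> 16);
    buf[5] = (uint8_t)(t >> 24);
    return 6;
}

/* Decode on the receiving side, whatever its endianness, alignment, or padding rules. */
void decode_measurement(const uint8_t *buf, Measurement *m) {
    m->id = (uint16_t)(buf[0] | ((uint16_t)buf[1] << 8));
    m->temperature = (int32_t)((uint32_t)buf[2]
                   | ((uint32_t)buf[3] << 8)
                   | ((uint32_t)buf[4] << 16)
                   | ((uint32_t)buf[5] << 24));
}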

Related

How do I deal with numbers less than 32 bits in a 32-bit system?

I'm attempting to simulate a 32 bit computer under a very scuffed architecture I have come up with on my own. I am probably doing everything wrong but it's just a fun thing I'm doing to teach myself C.
I am encountering a slight issue where I have no idea how many bytes of a number I should save to memory.
At the moment I have an instruction that looks like this: CODE, (addressing info), add-a, add-b, add-c.
The opcode and addressing info are 4 bits long, and the addresses are 8 bits long. If I add two 32-bit numbers (b and c), they get saved at address a. The issue arises when I have a number that is less than 32 bits. For example, if I have an array of 1-byte chars and for whatever reason I want to add 1 to one of them, then when I save the 1-byte char back to that array, it would be written as a 32-bit number, thus overwriting the 3 subsequent chars.
I'm not really sure the best way to tackle this issue but I have a few ideas.
Idea 1:
Just do everything in 32 bit chunks. Let the programmer deal with the issue themselves. (do some funky bitwise manipulation to fit the 1 byte char back into the array. maybe with a mask)
I don't want to do this as it would make the code messy.
Idea 2:
Only allow addresses every 32 bits. If every number is 32 bits long, then no number will be overwritten.
This sucks because, as far as I can tell, nothing does this. It would make saving smaller numbers take up 4 times more memory than they need.
Idea 3:
Stop working with 32-bit numbers. Only ever add, subtract, store, and get 8-bit numbers. This would work and probably be less messy, but it would also be very annoying. Adding 32-bit numbers would suddenly take at least 4 lines of code, and the programs would then run slower. It would also mean moving lines of code around would take at least 4 lines of code, as each line of code is 4 bytes long.
Basically I have no idea what I'm doing and I can't find anyone online talking about this. I'm sure either there is something glaringly obvious, or I'm doing something stupid and I will need to redesign the whole system...
Also, side note: I'm not sure if this is the correct place to ask this kind of question, but if it isn't I would love to know where is.
All the ideas you mention seem to share a common concept, which is to limit what the hardware does and have software make up the rest of the requirements by (a) assembling larger items from smaller storage units, and, vice versa, (b) packing smaller items into larger storage units.
Generally speaking this is how computation works anyway: the hardware provides only limited capabilities, and software makes up any shortfall. The limited capabilities, ideally, are well matched to common software patterns, such as strings, integers of various sizes, floats, etc.
Where the line is drawn between hardware-built-in capabilities and software compensation has been changed many times by many processors over the years.
Software generally has to do both of these with any machine organization existing today.  If you want an array of Boolean values, then you would probably want to pack them into bytes (or words) and set/extract bits from them, which is (b).  On the other hand if you want long strings or multiword numeric data, then software assembles some larger number of storage units into a whole, which is (a).
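For the Boolean-array case, a small sketch of (b), packing bits into bytes (the helper names are mine):

#include <stdint.h>

/* Set or clear bit 'i' in a byte-backed bitmap: packing small items
   into larger storage units. */
void bitmap_set(uint8_t *bits, unsigned i, int value) {
    if (value)
        bits[i / 8] |= (uint8_t)(1u << (i % 8));
    else
        bits[i / 8] &= (uint8_t)~(1u << (i % 8));
}

/* Read bit 'i' back out of the bitmap. */
int bitmap_get(const uint8_t *bits, unsigned i) {
    return (bits[i / 8] >> (i % 8)) & 1;
}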
Modern 64-bit hardware offers at least 1-byte, 2-byte, 4-byte and 8-byte data (modulo vectors).  By offering these data sizes, we mean that it provides for instructions that directly operate on these sizes, i.e. single instructions that do useful things with them.
However, there are no modern bit-addressable machines, so if you want smaller than a byte (quite reasonable sometimes) you have to handle that with software.
Further, if you want 3-, 5-, 6-, or 7-byte data, the hardware doesn't necessarily provide that directly, though support for misaligned loads helps, since with that you can load a larger size and mask off the unwanted pieces; stores are similar, using read-modify-write.
If you want 9-byte or larger data, you'll have to use multiple load and store instructions, though again misaligned capabilities in the hardware help with odd sizes.
Some instruction sets have drawn a limited line by removing byte load & store instructions (while remaining byte addressable), but providing dedicated instructions to extract the proper byte from a word in a register, so as to still offer some hardware acceleration for byte operations on hardware that doesn't have misaligned loads. Without byte loads, misaligned capabilities, or special helper instructions, extracting the proper byte from a word can take multiple instructions and/or repeated loading of the same memory word for sequential access.
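To make that concrete, here is a C sketch of the software fallback (hypothetical helpers, not from any particular ISA): extracting or replacing one byte inside a 32-bit word with shifts and masks, i.e. a read-modify-write. Your emulator could do the same on its word-addressed memory.

#include <stdint.h>

/* Extract byte 'index' (0 = least significant byte) from a 32-bit word. */
uint8_t get_byte(uint32_t word, unsigned index) {
    return (uint8_t)(word >> (index * 8));
}

/* Replace byte 'index' inside 'word' without touching its neighbours. */
uint32_t set_byte(uint32_t word, unsigned index, uint8_t value) {
    uint32_t mask = 0xFFu << (index * 8);
    return (word & ~mask) | ((uint32_t)value << (index * 8));
}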
I advocate the load/store model. That means rich load & store instructions: load signed byte, load unsigned byte, signed half, unsigned half, word (32), double. Arithmetic in such a model is register to register, so you don't need smaller-than-word-sized addition. No mainstream programming language demands byte addition, and having byte arithmetic doesn't even offer an optimization in a load/store architecture.
However, you will want to take the architecture as a whole into account in designing the individual instructions.

Advantages of hex formats like SREC or Intel HEX

I want to ask if someone can explain to me the benefit of using hex formats (e.g. Motorola S-Record or Intel HEX) over using direct binary images, e.g. for firmware or memory dumps.
I understand that it is useful to have some meta-information about the binary file, like used memory areas, checksums for data integrity and so on…
However, the fact that the actual data size is doubled, because everything is saved in a hex-ASCII representation, is confusing me.
Is the reason for using a hex-ASCII representation only the portability, to prevent problems with systems that have a different byte endianness or are there other benefits?
For this topic I found many tutorials about how to convert binary to hex and back, and the specifications of the various formats, but no information about the advantages and disadvantages.
There are several advantages of hex-ASCII formats over a binary image. Here are a couple to start with.
Transport. An ASCII file can be transferred through most mediums. Binary files may not be transferred intact across some mediums.
Validity checks. The hex file has a checksum on each line. The binary may not have any checks.
Size. Data in the hex file is written to selected memory addresses. The binary image has limited addressing information and is likely to contain every memory location from the start address to the end of the file.
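As an example of the per-line validity check, here is a sketch (record parsing omitted) of the Intel HEX checksum, which is the two's complement of the low byte of the sum of all decoded bytes in the record (byte count, address, record type, and data):

#include <stdint.h>
#include <stddef.h>

/* Compute the Intel HEX checksum for the decoded bytes of one record. */
uint8_t ihex_checksum(const uint8_t *record_bytes, size_t len) {
    uint8_t sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum = (uint8_t)(sum + record_bytes[i]);
    return (uint8_t)(0x100 - sum);  /* two's complement of the low byte of the sum */
}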

Is unaligned memory access allowed on iOS devices?

I'm currently working on an app that loads a blob of tightly packed data which contains different integer types (sized from char to int) that might not be properly aligned.
So, can I use a simple *(short*)ptr or similar access to that data? A test on my iPhone 5 shows no problem with that, but I'm not sure about all cases on all newer processors.
I did find some related information, like this:
ARMv6 and later, except some microcontroller versions, support unaligned accesses for half-word and single-word load/store instructions with some limitations, such as no guaranteed atomicity.
but in the case of "words" it seems that on 32-bit and 64-bit ARM a word is 32 and 64 bits respectively, which would mean a short requires proper alignment on a 64-bit machine.
So, can I assume this is safe, or should I use some keywords like __packed?
Or should I rather avoid it completely and recreate my data so it always has proper alignment (or always use memmove when the data is from an external source and cannot be permanently modified)?
It's been ages since I tried it. It worked, but every single access to unaligned memory caused a trap, which took considerable time. I'd suggest you measure how long it takes to add a million aligned shorts vs a million unaligned shorts. If you only have a few hundred or thousand unaligned numbers, there's nothing to worry about.
__packed works reasonably fast. ARM has some clever instructions to do unaligned access with very few instructions. Again, I'd measure how long that takes. My experience with this is not current.
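If you would rather not depend on the hardware or on compiler extensions like __packed, a portable alternative (a sketch; the function names are mine) is to memcpy the bytes into a properly aligned local variable, which compilers typically turn into an efficient load anyway:

#include <stdint.h>
#include <string.h>

/* Read a 16-bit value from a possibly misaligned position in a packed blob. */
uint16_t read_u16_unaligned(const uint8_t *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);  /* well-defined regardless of p's alignment */
    return v;
}

/* Same idea for a 32-bit value. */
uint32_t read_u32_unaligned(const uint8_t *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}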

Are floating points deterministic across iOS devices?

I have several questions regarding floating points and iOS devices:
Are floating points deterministic on one given iOS device?
Are floating points deterministic across all iOS devices?
If not, is there a way to make them deterministic? What I am thinking of is: change of compiler settings, usage of reduced set of math operations, etc.
If there isn't any way to do so, what would be the best alternative?
Could I use fixed points instead? Would it mean using NSDecimalNumber?
Cheers.
The results from the same data size should be identical. The IEEE standard for floating point dictates the results for different float operations on different sizes of floating point numbers.
Where you might get into trouble is if the definition of different data types varies across architectures.
The size of "float" is not formally defined in ANSI C. It can be 4 bytes on some platforms and 8 on others. (NSInteger, for example, is 32 bits on a 32 bit device and 64 bits on a 64 bit device. I know it's an integer type, but its an example of a type that changes size based on the platform.)
I don't know if Apple changed the size of any of the data types between their 32 bit and 64 bit platforms. Perhaps somebody who knows conclusively can chime in here.
The newest iOS devices are 64-bit. I would suggest checking the size of float/CGFloat (using sizeof()). If you don't have access to one of the new 64-bit devices, you should be able to test it using the 64-bit simulator.
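A quick sketch of such a check (CGFloat needs Apple's CoreGraphics header, so that part is guarded and only an assumption outside that environment):

#include <stdio.h>
#if defined(__APPLE__)
#include <CoreGraphics/CGBase.h>  /* defines CGFloat on iOS/macOS */
#endif

int main(void) {
    printf("sizeof(float)  = %zu\n", sizeof(float));
    printf("sizeof(double) = %zu\n", sizeof(double));
#if defined(__APPLE__)
    /* CGFloat is float on 32-bit Apple platforms and double on 64-bit ones. */
    printf("sizeof(CGFloat) = %zu\n", sizeof(CGFloat));
#endif
    return 0;
}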

Why byte-addressable memory and not 4-byte-addressable memory?

Why do computers have byte-addressable memory, and not 4-byte-addressable memory (or 8-byte-addressable memory for 64-bit)? Yeah, I see how it could be useful sometimes, it just seems inelegant and excessive. Are the advantages substantial, or is it really just because of legacy?
Processors actually do access memory in 64-bit quantities (x86 has since the Pentium or so); 64-bit processors often have a 128-bit bus. Plus, accesses to main memory come in bursts that fill an entire cache line, which is an even larger unit of memory.
It's only the addressing that is byte-based; this adds little overhead and is not excessive at all.
Today, you absolutely need byte-based addressing for networking protocols. Implementing TCP with word-based addressing would be difficult: what do you want read() to return if what you received were 17 bytes? Likewise, higher layers are byte-based: HTTP would be fairly difficult to implement if a request line like "GET / HTTP/1.0" were presented in units of four bytes. You would essentially have to split the words back into bytes with shift operations and such (which the processors now do in hardware, thanks to byte-based addressing).
Largely historical reasons - it has become the standard that CPUs understand. Here is a good discussion on it:
Generally, a size has to be chosen to be convenient for both data and machine instructions. 8 bits (256 values) is enough to accommodate common characters in English and some other languages. Designers of 8-bit processors presumably found that being able to encode 256 common instructions as one byte was a "reasonable tradeoff". And at the time, 8 bits was also generally enough to encode other things such as a pixel colour or screen coordinate. Having a byte size that is a power of 2 may also have been felt to be a "neater" design. It is interesting to note that, for example, Marxer, E. (1974), Elements of Data Processing, describes a byte as being either 6-bit or 8-bit depending on whether the computer was of the "octal" or "hexadecimal" type.
Certainly, other sizes were used in the early days.
We needed to settle on some size for standardization. People chose the 8-bit size for the reasons mentioned by Shane above. Since then we have been stuck with byte-addressable memory. Now it is impossible to change due to various compatibility issues and the fact that opcodes are only a byte long. But using a trick, memory is easily made word-addressable to fetch/store data/addresses!
