Max length for a dynamic array in Delphi 64? - delphi

My current question is related to Max length for a dynamic array in Delphi?. That question was asked in 2009 when the 64 bit compiler was not available. I am preparing migration to Delphi XE2 (or whatever version is available for purchase not) or to Lazarus because I need 64 bit support.
I would like to know what changed (related to dynamic array max length) in Delphi 64bit. Can I create bigger arrays now?

Dynamic array lengths are, in modern Delphi, NativeInt.
This means that dynamic arrays are limited in theory to 32 bit lengths in 32 bit code, and 64 bit length in 64 bit code. Of course, practical considerations mean that the limits are somewhat lower. However it is possible to allocate dynamic arrays with more than 232 elements in 64 bit code.
On the other hand, strings are subject to a 32 bit limit on their length for all architectures. As I understand it the reasoning is that strings are simply not expected to hold such large amounts of text. And many of the text support library functions that strings rely on use 32 bit lengths. Whereas arrays are used for more general purpose computing and a 32 bit limit would greatly reduce their utility under 64 bits.

Related

How many values can be stored per physical address in Memory?

I've read that you can only store one value per physical address in Ram. Now this data could be an instruction or data. Is this due to when the CPU reads in a Word from Ram, it can only deal with one value at a time? be that an instruction, int or a string. Is there a technical reason you can't fit more than one value per index. I've read about Scalar Processors but aren't they really old. Couldn't you fit two or more values in the width of a 64 bit Word for example? Or am i missing something really obvious here. I guess i'm asking is this a programming concept or is there an actual technical/hardware reason the cpu can't deal with more than one value per read of a Word from Ram..
Thanks
Rob
Most recent computers use addresses that point to a "Byte" location in memory.
Each machine instruction that includes "load (or store) from memory" functionality includes either an implicit or explicit specification of the number of bytes to be loaded/stored, starting at the target byte address. Common sizes are 1, 2, 4, 8 Bytes (corresponding to single data items of the most commonly supported sizes).
It is up to the application program to decide how to interpret the bytes and what operations to perform on them. It is certainly common to store the characters of a string in consecutive byte memory locations and process 4 or 8 characters at a time using 32-bit (4-Byte) or 64-bit (8-Byte) load and store instructions. Operation on the individual bytes (characters) may involves masking, shifting, and copying within the processor's general-purpose registers, but since the late 1990's, many/most microprocessors have included instructions specifically designed to treat the contents of a register as multiple independent (smaller) values.
"Packing" multiple data items into consecutive bytes of memory need not be limited to the sizes of registers for supported arithmetic types (1, 2, 4, 8 Bytes). Since about 2000, many processors have also included "Single Instruction Multiple Data" (SIMD) instructions to load bigger payloads into a set of "SIMD registers". (Common sizes are 16 and 32 Bytes, but some processors support 64 Byte registers.) Systems that include these SIMD load and store instructions typically also include instructions to operate on the SIMD registers "in parallel" -- treating the register contents as multiple independent values. It is common to provide instructions to treat the contents of a 256-bit (32-Byte) register as 32 1-Byte values, 16 2-Byte values, 8 4-Byte values, or 4 8-Byte values. The details vary by processor architecture and generation.

JVM 64-bit different memory usages?

I've done some reading but I'm not entirely sure about one thing, for example how much memory would this use in JVM 64 bit(sorry if stupid question, but I'm a bit confused and don't know much about this):
MyObject[] myArray; - I know an array takes up 24 bytes, but how much will each element in this array take? is every element an object reference, meaning 8 byte per element? If not, how do I know how many bytes each element in this array needs?
Normally, that is when using heap sizes of less than 32 GB, the 64-bit JVM uses compressed oops which store object pointers as a 32-bit integer (scaled by three bits when used, since all objects are aligned to 8 bytes; see the link for details), so each element would actually only use 4 bytes.
If you use more than 32 GB of heap or otherwise turn off compressed oops, however, then each element will indeed use 8 bytes.
Also, I suspect that your statement on the array header being 24 bytes is wrong. To begin with, when compressing oops, the class reference in the header is also compressed, and the identity-hash-code and array length fields are 32-bit to begin with, so I suspect it is more likely to use 12 bytes. Even when using full-length oops, it should still only take 16 bytes. I can't find any hard source verifying either, however. In general, however, it should be said that Hotspot does not even use a fixed-size object header but one that varies in size depending on various circumstances of the object. This article describes some of those circumstances.
That is on the Hotspot JVM, at least. Since the JLS doesn't specify any primitive sizes, it could, theoretically, be anything on any given JVM, though 8 bytes are, of course, the most likely implementation choice.
Here is good information on how to calculate the memory usage of a Java array
For Example
let's consider a 10x10 int array. Firstly, the "outer" array has its 12-byte object header followed by space for the 10 elements. Those elements are object references to the 10 arrays making up the rows. That comes to 12+4*10=52 bytes, which must then be rounded up to the next multiple of 8, giving 56. Then, each of the 10 rows has its own 12-byte object header, 4*10=40 bytes for the actual row of ints, and again, 4 bytes of padding to bring the total for that row to a multiple of 8. So in total, that gives 11*56=616 bytes. That's a bit bigger than if you'd just counted on 10*10*4=400 bytes for the hundred "raw" ints themselves.

How will 64 bit variable be referenced in a 32 bit process?

I have a 64 bit kernel and i run 32 bit processes in userland.In the user process code ,if i declare a 64 bit variable ,how will it be referred.Will it incur 2 memory reads.?
basically the scenario is:
I need to use a 64 bit mask in my user process.
Approach 1 :
-> Use a u64bits variable.
Approach
-> Use a array of 2 32 bit variables.
First off: the kernel has no bearing on the answer to this question.
Second, I assume this is x86 you're talking about. Where possible, the compiler will place 64-bit values across 2 32-bit registers. For example, if you return a uint64_t from a function, the low 32 bits will be stored in the eax register, and the high bits will be in edx.
The compiler will generally do the right thing for performance and correctness: using an array will likely just confuse it and lead to worse results.
By the way, x86-64 CPUs will normally perform reads of 2 adjacent 32-bit words at the same speed as a single 64-bit read. The advantages of 64-bit mode are that arithmetic can be done directly on 64-bit values (1 64x64 multiplication instruction vs 3-4 32x32 instructions), there is much more space available in registers (16 registers instead of 8, registers are twice as wide), and of course the larger possible virtual address space.

16 bit Int vs 32 bit Int vs 64 bit Int

I've been wondering this for a long time since I've never had "formal" education on computer science (I'm in highschool), so please excuse my ignorance on the subject.
On a platform that supports the three types of integers listed in the title, which one's better and why? (I know that every kind of int has a different length in memory, but I'm not sure what that means or how it affects performance or, from a developer's view point, which one has more advantages over the other).
Thank you in advance for your help.
"Better" is a subjective term, but some integers are more performant on certain platforms.
For example, in a 32-bit computer (referenced by terms like 32-bit platform and Win32) the CPU is optimized to handle a 32-bit value at a time, and the 32 refers to the number of bits that the CPU can consume or produce in a single cycle. (This is a really simplistic explanation, but it gets the general idea across).
In a 64-bit computer (most recent AMD and Intel processors fall into this category), the CPU is optimized to handle 64-bit values at a time.
So, on a 32-bit platform, a 16-bit integer loaded into a 32-bit address would need to have 16 bits zeroed out so that the CPU could operate on it; a 32-bit integer would be immediately usable without any alteration, and a 64-bit integer would need to be operated on in two or more CPU cycles (once for the low 32-bits, and then again for the high 32-bits).
Conversely, on a 64-bit platform, 16-bit integers would need to have 48 bits zeroed, 32-bit integers would need to have 32 bits zeroed, and 64-bit integers could be operated on immediately.
Each platform and CPU has a 'native' bit-ness (like 32 or 64), and this usually limits some of the other resources that can be accessed by that CPU (for example, the 3GB/4GB memory limitation of 32-bit processors). The 80386 processor family (and later x86) processors made 32-bit the norm, but now companies like AMD and then Intel are currently making 64-bit the norm.
To answer your first question, the usage of a 16 bit vs a 32 bit vs a 64 bit integer depends on the context that it is used. Therefore, you really can't say one is better over the other, per say. However, depending on a situation, using one over another is preferable. Consider this example. Let's say you have a database with 10 million users and you want to store the year they were born. If you create a field in your database with a 64 bit integer then you have exhausted 80 megabytes of your storage; whereas, if you were to use a 16 bit field, only 20 megabytes of your storage will get used. You can use a 16 bit field here because the year people are born is smaller than the largest 16 bit number. In other words 1980, 1990, 1991 < 65535, assuming your field is unsigned. All in all, it depends on the context. I hope this helps.
A simple answer is to use the smallest one you KNOW will be safe for the range of possible values it will contain.
If you know the possible values are constrained to be smaller than a maximum-length 16-bit integer (e.g. the value corresponding to what day of the year it is - always <= 366) then use that. If you aren't sure (e.g. the record ID of a table in a database that can have any number of rows) then use Int32 or Int64 depending on your judgment.
Other can probably give you a better sense of of the performance advantages depending on what programming language you are using, but the smaller types use less memory and hence are 'better' to use if you don't need larger.
Just for reference, a 16-bit integer means there are 2^16 possible values - generally represented as between 0 and 65,535. 32-bit values range from 0 to 2^32 - 1, or just over 4.29 billion values.
This question On 32-bit CPUs, is an 'integer' type more efficient than a 'short' type? may add some more good information.
It depends on whether speed or storage should be optimized. If you are interested in speed and you are running SQL Server in 64 bit mode then 64 bit keys are what you need. A 64 bit processor running in 64 bit mode, is optimized to use 64 bit numbers and addresses. Likewise, a 64 bit processor running in 32 bit mode is optimized to use 32 bit numbers and addresses. For example, in 64 bit mode, all pushes and pops onto the stack are 8 bytes etc. Also fetch from cache and memory are again optimized for 64 bit numbers and addresses. The processor, running in 64 bit mode, may need more machine cycles to handle a 32 bit number just like a processor, running in 32 bit mode needs more machine cycles to handle a 16 bit number. The increases in processing time come for many reasons, but just think about the example of memory alignment: The 32 bit number may not be aligned on a 64 bit integral boundary which means loading the number requires shifting and masking the number after loading it into a register. At the very least, every 32 bit number must be masked before each operation. We are talking at least halving the processor's effective speed while handling 32 or 16 bit integers in 64 bit mode.
To provide a simple explanation to novice programmers. A bit is either a 0 or a 1.
a 16 bit Int is an integer represented by a string of 16 bits (16 0's and 1's)
a 32 bit Int is an integer represented by a string of 32 bits (32 0's and 1's)
a 64 bit Int is an integer represented by a string of 64 bits (64 0's and 1's)
Examples to drive those concepts home:
an example of a 16-bit integer would be 0000000000000110 which equals the int 6
an example of a 32-bit integer would be 00000000000000000100001000100110 which equals the int 16934.
an example of a 64-bit integer would be 0000100010000000010000100010011000000000000000000100001000100110 which equals the int 612562280298594854.
You can represent a larger number of integers with 64 bits than you can 32 bits than you can 16 bits. So the benefit of using fewer bits is you save space on the machine. The benefit of using more bits is you can represent more integers.

Why is the smallest value that can be stored is a Byte(8bit) & not a Bit(1bit)?

Why is the smallest value that can be stored a Byte(8bit) & not a Bit(1bit) in memory?
Even booleans are stored as Bytes. Will we ever bump the smallest number to 32 or 64bits like register's on the CPU?
EDIT: To clarify as many answers seemed confused about the nature of questing. This question is about why isn't a byte 7-bit, 1-bit, 32-bit, etc (not why lower bit primitives must fit within the hardware's byte at min). Is the 8-bit byte simply historical as some hardware has 10-bit bytes for example. Or is there a mathematical reason 8-bit is ideal vs say 10-bit for general processing?
The hardware is built to read data in blocks (bytes, later words and dwords). This provides greater efficiency, than accessing individual bits, and also offers more addressing range. So most data is aligned to at least byte boundary. There exist encodings that operate with bit sequences, rather than bytes, but they are quite rare.
Nowadays the data is most often aligned to dword (32-bits) boundary anyway. Moreover, some hardware (ARM, for example), can't access misaligned multibyte variables, i.e. 16-bit word can't "cross" dword boundary - exception will be thrown.
Because computers address memory at the byte level, so anything smaller than a byte is not addressable.
The underlying methods of processor access are limited to the size of the smallest usable register. On most architectures, that size is 8 bits. You can use smaller portions of these; for instance, C has the bitfield feature in structs that will allow combining fields that only need to be certain bit lengths. Access will still require that the whole byte be read.
Some older exotic architectures actually did have different a "word size." In these machines, 10 bits might be the common size.
Lastly, processors are almost always backwards compatible. Intel, for instance, has maintained complete instruction compatibility from the 386 on up. If you take a program compiled for the 386, it will still run on an i7 processor. Changing the word size would break compatibility. So while it is possible, no manufacturer will ever do it.
Assume that we have native language that consist of 2 character such as a , b
to distinguish two characters we need at least 1 bit for example 0 to represent char a and 1 to represent char b
so that if we count number of characters and special characters and symbols, there are 128 character and to distinguish one character from another, you need log2(128) = 7 bit and 8th bit for transmission

Resources