In LP64, the size of a long and the size of a long long are the same (Apple Docs, Unix Docs).
Is there any difference then, when limiting yourself to the understanding that you're running on an LP64 system (as XCode appears to when compiling for 64 bit), between a long and a long long? Is there any performance reason to use a long instead of a long long if your goal is a 64 bit integral?
Here's why I ask. In Objective C on Xcode, NSString's format (like printf) and NSNumber both use data types like int, long, long long and their unsigned variants when converting numbers and text and not specific bit length numbers like int16_t, int32. and int64_t. This would make it difficult to program things that require a certain minimum size (i.e. networking or currency applications) or times when you want to store specifically sized data into an NSNumber without typecasting.
Is it safe, limiting to any Intel Mac OS or iOS device, to use int for int32_t and long long for int64_t when interacting with things like NSString's format functions and NSNumber?
Is it safe, limiting to any Intel Mac OS or iOS device, to use int for int32_t and long long for int64_t when interacting with things like NSString's format functions and NSNumber?
According to the ILP32 & LP64 conventions yes, but you should really document that you are relying on these sizes.
One way to do that is to use a clever macro that originated (as I understand) in the Linux kernel:
#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
This macro will generate a compile time error if its condition argument is true, as in that case it attempts to determine the size of a negative-sized array. You can use it in the following simple function:
static __attribute__((unused)) void _compile_time_use_only_()
{
BUILD_BUG_ON( (sizeof(int) != 4) );
BUILD_BUG_ON( (sizeof(long long) != 8) );
}
Add that to your code and if you attempt to compile on any system where int is not 32-bits or long long is not 64-bits then you'll get a compile time error. There is essentially zero-cost at runtime (just a few bytes for the unused function).
Make sure you comment the function stating what it does!
You can of course assert the size of other types the same way.
HTH
Use either NSInteger or int64_t. NSInteger = fastest type with at least 32 bits, and compatible with sizes of arrays etc. int64_t = exactly 64 bit. This will also make the move to Swift easier.
Related
I am trying to comprehend how development is affected when developing for both 32-bit and 64-bit architectures. From what I have researched thus far, I understand an int is always 4 bytes regardless of the architecture of the device running the app. But an NSInteger is 4 bytes on a 32-bit device and 8 bytes on a 64-bit device. I get the impression NSInteger is "safer" and recommended but I'm not sure what the reasoning is for that.
My question is, if you know the possible value you're using is never going to be large (maybe you're using it to index into an array of 200 items or store the count of objects in an array), why define it as an NSInteger? That's just going to take up 8 bytes when you won't use it all. Is it better to define it as an int in those cases? If so, in what case would you want to use an NSInteger (as opposed to int or long etc)? Obviously if you needed to utilize larger numbers, you could with the 64-bit architecture. But if you needed it to also work on 32-bit devices, would you not use long long because it's 8 bytes on 32-bit devices as well? I don't see why one would use NSInteger, at least when creating an app that runs on both architectures.
Also I cannot think of a method which takes in or returns a primitive type - int, and instead utilizes NSInteger, and am wondering if there is more to it than just the size of the values. For example, (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section. I'd like to understand why this is the case. Assuming it's possible to have a table with 2,147,483,647 rows, what would occur on a 32-bit device when you add one more - does it wrap around to a -2,147,483,647? And on a 64-bit device it would be 2,147,483,648. (Why return a signed value? I'd think it should be unsigned since you can't have a negative number of rows.)
Ultimately, I'd like to obtain a better understanding of actual use of these number data types, perhaps some code examples would be great!
I personally think that, 64-bit is actually the reason for existence for NSInteger and NSUInteger; before 10.5, those did not exist. The two are simply defined as longs in 64-bit, and as ints in 32-bit.
NSInteger/NSUInteger are defined as *dynamic typedef*s to one of these types, and they are defined like this:
#if __LP64__ || NS_BUILD_32_LIKE_64
typedef long NSInteger;
typedef unsigned long NSUInteger;
#else
typedef int NSInteger;
typedef unsigned int NSUInteger;
#endif
Thus, using them in place of the more basic C types when you want the 'bit-native' size.
I suggest you to throughly read this link.
CocoaDev has some more info.
For proper format specifier you should use for each of these types, see the String Programming Guide's section on Platform Dependencies
I remember when attending iOS developer conference. you have to take a look on the data-type in iOS7. for example, you use NSInteger in 64-bit device and save it on iCloud. then you want to sync to lower device (say iPad 2nd gen), your app will not behave the same, because it recognizes NSInteger in 4 bytes not 8 bytes, then your calculation would be wrong.
But so far, I use NSInteger because mostly my app doesn't use iCloud or doesn't sync. and to avoid compiler warning.
Apple uses int because for a loop control variable (which is only used to control the loop iterations) int datatype is fine, both in datatype size and in the values it can hold for your loop. No need for platform dependent datatype here. For a loop control variable even a 16-bit int will do most of the time.
Apple uses NSInteger for a function return value or for a function argument because in this case datatype [size] matters, because what you are doing with a function is communicating/passing data with other programs or with other pieces of code.
Apple uses NSInteger (or NSUInteger) when passing a value as an
argument to a function or returning a value from a function.
The only thing I would use NSInteger for is passing values to and from an API that specifies it. Other than that it has no advantage over an int or a long. At least with an int or a long you know what format specifiers to use in a printf or similar statement.
As a continue to Irfan's answer:
sizeof(NSInteger)
equals a processor word's size. It is much more simple and faster for processor to operate with words
Newbie in IOS programming here.
I was looking at the Foundation Types Data Reference and have started to use the NSInteger typedef on the assumption that it will make my app more portable. However I often have a use for 16-bit and 8-bit integers and I don't see an NSShort or NSByte.
It seems wasteful to allocate a 32/64 bit variable for something that has a small range, say 0 to 12.
Are there any symbols that are defined for that?
Use uint8_t and uint16_t if you want types that are a specific size. There are also similar types for 32 and 64 bits values.
When should I be using NSInteger vs. int when developing for iOS? I see in the Apple sample code they use NSInteger (or NSUInteger) when passing a value as an argument to a function or returning a value from a function.
- (NSInteger)someFunc;...
- (void)someFuncWithInt:(NSInteger)value;...
But within a function they're just using int to track a value
for (int i; i < something; i++)
...
int something;
something += somethingElseThatsAnInt;
...
I've read (been told) that NSInteger is a safe way to reference an integer in either a 64-bit or 32-bit environment so why use int at all?
You usually want to use NSInteger when you don't know what kind of processor architecture your code might run on, so you may for some reason want the largest possible integer type, which on 32 bit systems is just an int, while on a 64-bit system it's a long.
I'd stick with using NSInteger instead of int/long unless you specifically require them.
NSInteger/NSUInteger are defined as *dynamic typedef*s to one of these types, and they are defined like this:
#if __LP64__ || TARGET_OS_EMBEDDED || TARGET_OS_IPHONE || TARGET_OS_WIN32 || NS_BUILD_32_LIKE_64
typedef long NSInteger;
typedef unsigned long NSUInteger;
#else
typedef int NSInteger;
typedef unsigned int NSUInteger;
#endif
With regard to the correct format specifier you should use for each of these types, see the String Programming Guide's section on Platform Dependencies
Why use int at all?
Apple uses int because for a loop control variable (which is only used to control the loop iterations) int datatype is fine, both in datatype size and in the values it can hold for your loop. No need for platform dependent datatype here. For a loop control variable even a 16-bit int will do most of the time.
Apple uses NSInteger for a function return value or for a function argument because in this case datatype [size] matters, because what you are doing with a function is communicating/passing data with other programs or with other pieces of code; see the answer to When should I be using NSInteger vs int? in your question itself...
they [Apple] use NSInteger (or NSUInteger) when passing a value as an
argument to a function or returning a value from a function.
OS X is "LP64". This means that:
int is always 32-bits.
long long is always 64-bits.
NSInteger and long are always pointer-sized. That means they're 32-bits on 32-bit systems, and 64 bits on 64-bit systems.
The reason NSInteger exists is because many legacy APIs incorrectly used int instead of long to hold pointer-sized variables, which meant that the APIs had to change from int to long in their 64-bit versions. In other words, an API would have different function signatures depending on whether you're compiling for 32-bit or 64-bit architectures. NSInteger intends to mask this problem with these legacy APIs.
In your new code, use int if you need a 32-bit variable, long long if you need a 64-bit integer, and long or NSInteger if you need a pointer-sized variable.
If you dig into NSInteger's implementation:
#if __LP64__
typedef long NSInteger;
#else
typedef int NSInteger;
#endif
Simply, the NSInteger typedef does a step for you: if the architecture is 32-bit, it uses int, if it is 64-bit, it uses long. Using NSInteger, you don't need to worry about the architecture that the program is running on.
You should use NSIntegers if you need to compare them against constant values such as NSNotFound or NSIntegerMax, as these values will differ on 32-bit and 64-bit systems, so index values, counts and the like: use NSInteger or NSUInteger.
It doesn't hurt to use NSInteger in most circumstances, excepting that it takes up twice as much memory. The memory impact is very small, but if you have a huge amount of numbers floating around at any one time, it might make a difference to use ints.
If you DO use NSInteger or NSUInteger, you will want to cast them into long integers or unsigned long integers when using format strings, as new Xcode feature returns a warning if you try and log out an NSInteger as if it had a known length. You should similarly be careful when sending them to variables or arguments that are typed as ints, since you may lose some precision in the process.
On the whole, if you're not expecting to have hundreds of thousands of them in memory at once, it's easier to use NSInteger than constantly worry about the difference between the two.
On iOS, it currently does not matter if you use int or NSInteger. It will matter more if/when iOS moves to 64-bits.
Simply put, NSIntegers are ints in 32-bit code (and thus 32-bit long) and longs on 64-bit code (longs in 64-bit code are 64-bit wide, but 32-bit in 32-bit code). The most likely reason for using NSInteger instead of long is to not break existing 32-bit code (which uses ints).
CGFloat has the same issue: on 32-bit (at least on OS X), it's float; on 64-bit, it's double.
Update: With the introduction of the iPhone 5s, iPad Air, iPad Mini with Retina, and iOS 7, you can now build 64-bit code on iOS.
Update 2: Also, using NSIntegers helps with Swift code interoperability.
As of currently (September 2014) I would recommend using NSInteger/CGFloat when interacting with iOS API's etc if you are also building your app for arm64.
This is because you will likely get unexpected results when you use the float, long and int types.
EXAMPLE: FLOAT/DOUBLE vs CGFLOAT
As an example we take the UITableView delegate method tableView:heightForRowAtIndexPath:.
In a 32-bit only application it will work fine if it is written like this:
-(float)tableView:(UITableView *)tableView heightForRowAtIndexPath:(NSIndexPath *)indexPath
{
return 44;
}
float is a 32-bit value and the 44 you are returning is a 32-bit value.
However, if we compile/run this same piece of code in a 64-bit arm64 architecture the 44 will be a 64-bit value. Returning a 64-bit value when a 32-bit value is expected will give an unexpected row height.
You can solve this issue by using the CGFloat type
-(CGFloat)tableView:(UITableView *)tableView heightForRowAtIndexPath:(NSIndexPath *)indexPath
{
return 44;
}
This type represents a 32-bit float in a 32-bit environment and a 64-bit double in a 64-bit environment. Therefore when using this type the method will always receive the expected type regardless of compile/runtime environment.
The same is true for methods that expect integers.
Such methods will expect a 32-bit int value in a 32-bit environment and a 64-bit long in a 64-bit environment. You can solve this case by using the type NSInteger which serves as an int or a long based on the compile/runtime environemnt.
int = 4 byte (fixed irrespective size of the architect)
NSInteger = depend upon size of the architect(e.g. for 4 byte architect = 4 byte NSInteger size)
How seriously do developers think about using a 16bit integer when writing code? I've been using 32bit integers ever since I've been programming and I don't really think about using 16bit.
Its so easy to declare a 32bit int because its the default for most languages.
Whats the upside of using a 16bit integer apart from a little memory saved?
Now that we have cars, we don't walk or ride horses as much, but we still do walk and ride horses.
There is less need to use shorts these days. In a lot of situations the cost of disk space and availability of RAM mean that we no longer need to squeeze every last bit of storage out of computers as we did 20 years ago, so we can sacrifice a bit of storage efficiency in order to save on development/maintenance costs.
However, where large amounts of data are used, or we are working with systems with small memories (e.g. embedded controllers) or when we are transmitting data over networks, using 32 or 64 bits to represent a 16-bit value is just a waste of memory/bandwidth. It doesn't matter how much memory you have, wasting half or three quarters of it would just be stupid.
APIs/interfaces (e.g. TCP/IP port numbers) and algorithms that require manipulation (e.g. rotation) of 16-bit values.
I was interested in the relative performance so I wrote this small test program to perform a very simple test of the speed of allocating, using, and freeing a significant amount of data in both int and short format.
I run the tests several times in case caching and so on are affected.
#include <iostream>
#include <windows.h>
using namespace std;
const int DATASIZE = 1000000;
template <typename DataType>
long long testCount()
{
long long t1, t2;
QueryPerformanceCounter((LARGE_INTEGER*)&t1);
DataType* data = new DataType[DATASIZE];
for(int i = 0; i < DATASIZE; i++) {
data[i] = 0;
}
delete[] data;
QueryPerformanceCounter((LARGE_INTEGER*)&t2);
return t2-t1;
}
int main()
{
cout << "Test using short : " << testCount<short>() << " ticks.\n";
cout << "Test using int : " << testCount<int>() << " ticks.\n";
cout << "Test using short : " << testCount<short>() << " ticks.\n";
cout << "Test using int : " << testCount<int>() << " ticks.\n";
cout << "Test using short : " << testCount<short>() << " ticks.\n";
cout << "Test using int : " << testCount<int>() << " ticks.\n";
cout << "Test using short : " << testCount<short>() << " ticks.\n";
}
and here are the results on my system (64 bit quad core system running windows7 64 bit, but the program is a 32 bit program built using VC++ express 2010 beta in release mode)
Test using short : 3672 ticks.
Test using int : 7903 ticks.
Test using short : 4321 ticks.
Test using int : 7936 ticks.
Test using short : 3697 ticks.
Test using int : 7701 ticks.
Test using short : 4222 ticks.
This seems to show that there are significant performance advantages at least in some cases to using short instead of int when there is a large amount of data. I realise that this is far from being a comprehensive test, but it's some evidence that not only do they use less space but they can be faster to process too at least in some applications.
when there is memory constraints short can help u lot. for e.g. while coding for embedded systems, u need to consider the memory.
16-bit values are still in great demand (though unsigned would do - don't really need signed).
For example,
16 bit Unicode - UTF-16/UCS-2.
16 bit graphics - especially for embedded devices.
16 bit checksums - for UDP headers and similar.
16 Bit devices - e.g. many norflash devices are 16 bit.
You might need to wrap at 65535.
You might need to work with a message sent from a device which includes fields which are 16 bit. Using 32 bit integers in this case would cause you to be accessing bits at the wrong offset in the message.
You might be working on an embedded 16 bit micro, or an embedded 8 bit micro. Hint: not all processors are x86, 32 bit.
This is really important in database development, because sometimes people are using a lot more space than is really needed (e.g. using int when small would have been sufficient). When you have tables with millions of rows this can be important factor in e.g. database size and queries. I would recommend people using always the appropriate datatype for columns.
I also try to use the correct datatype for other development, I know it can be a pain dealing with long and small (pretty convenient to have everyting int) but I think it pays off in the end, for example when serializing objects.
you ask: Any good reason to keep them around?
Since you say 'language-agnostic' the answer is a 'certainly yes'.
The computer CPU still works with bytes, words, full registers and whatnot, no matter how much these 'data types' are abstracted by some programming languages. There will always be situations where the code needs to 'touch the metal'.
It's hardly a little memory saved [read: 50%] when you allocate memory for a large number of numeric values. Common uses are:
COM and external device interop
Reducing memory consumption for large arrays where each number will never exceed a couple thousands in magnitude
Unique hashes for pairs of objects, where no more than ~65K objects are needed (hash values can only be 32-bit ints, but note that hash table types must transform the value for internal representations so collisions are still likely, but equality can be based on exact hash matches)
Speed up algorithms that rely on structs (smaller sized value types translates to increased performance when they are copied around in memory)
In large arrays, "little memory saved" could instead be "much memory saved".
The use of 16 bit integers is primarily for when you need to encode things for transmission over a network, for saving on hard disk, etc. without using up any more space than necessary. It might also occasionally be useful to save memory if you have a very large array of integers, or a lot of objects that contain integers.
Use of 16 bit integers without there being a good memory saving reason is pretty pointless. And 16 bit local variables are most often silently implemented with 32 or 64 bit integers anyway.
you have probably been using the 16 bit datatype more often than you knew. The char datatype in both C# and Java are 16 bit. Unicode is typically stored in a 16bit datatype.
The question should really be why we need a 16-bit primitive data type, and the answer would be that there is an awful lot of data out there which is naturally represented in 16 bits. One ubiquitous example is audio, e.g. CD audio is represented as streams of 16 bit signed integers.
16 bits is still plenty big enough to hold pixel channel values (e.g. R, G, or B). Most pixels only use 8 bits to store a channel, but Photoshop has a 16-bit mode that professionals use.
In other words, a pixel might be defined as struct Pixel16 { short R, G, B, A; } or an image might be defined as separate channels of struct Channel16 { short channel[]; }
I think most people use the default int on their platform. However there are times when you have to communicate with older systems or libraries that are expecting 16 bit or even eight bit integers (thank god we don't have to worry about 12 bit integers any more). This is especially true for databases. Also, if you're doing bit masking or bit shifting, you might have an algorithm that specifies the length of the integer. By default, and on platforms where memory is cheap, you should probably use integers sized to your processor.
Those 2 bytes add up. Your data types eventually become part of array or databases or messages, they go into data files. It adds up to a lot of wasted space and on embedded systems it can make a huge difference.
When we do peer review of our code at work, if something is sized incorrectly, it will be written as a discrepancy and must be corrected. If we find something that has a range of 1-1000 using an int32_t, it has to be corrected. The range must also be documented in a comment. Our department does not allow use of int, long, etc, we must use int32_t, int16_t, uint16_t, etc. so that the expected size is documented.
uint16_t conicAngle; // angle in tenths of a degree (range 0..3599)
or in Ada:
type Amplitude is range 0 .. 255; // signal amplitude from FPGA
Get in the habit of using what you need and no more and documenting what you need (if the language doesn't support it).
We are currently in the process of fixing a performance problem by resizing the data types in several messages, they have 32 bit fields that could be 8 or 16 bit. By resizing them appropriately we can reduce the message rate in half and improve our data throughput to meet the requirements.
Once upon a time, in the land of Earth, there existed devices called computers.
In the early days following the invention of "computers," there was limited storage in memory for fancy things like numbers and strings.
Billy, a programmer, was encouraged by the evil Wizard (his boss) to use the least amount of memory that he could!
Then one day, memory sizes got large enough that everyone could use 32-bit numbers if they wanted!
I could continue on, but all the other obvious things were already covered.
On an embedded system we have a setup that allows us to read arbitrary data over a command-line interface for diagnostic purposes. For most data, this works fine, we use memcpy() to copy data at the requested address and send it back across a serial connection.
However, for 16-bit hardware registers, memcpy() causes some problems. If I try to access a 16-bit hardware register using two 8-bit accesses, the high-order byte doesn't read correctly.
Has anyone encountered this issue? I'm a 'high-level' (C#/Java/Python/Ruby) guy that's moving closer to the hardware and this is alien territory.
What's the best way to deal with this? I see some info, specifically, a somewhat confusing [to me] post here. The author of this post has exactly the same issue I do but I hate to implement a solution without fully understanding what I'm doing.
Any light you can shed on this issue is much appreciated. Thanks!
In addition to what Eddie said, you typically need to use a volatile pointer to read a hardware register (assuming a memory mapped register, which is not the case for all systems, but it sounds like is true for yours). Something like:
// using types from stdint.h to ensure particular size values
// most systems that access hardware registers will have typedefs
// for something similar (for 16-bit values might be uint16_t, INT16U,
// or something)
uint16_t volatile* pReg = (int16_t volatile*) 0x1234abcd; // whatever the reg address is
uint16_t val = *pReg; // read the 16-bit wide register
Here's a series of articles by Dan Saks that should give you pretty much everything you need to know to be able to effectively use memory mapped registers in C/C++:
"Mapping memory"
"Mapping memory efficiently"
"More ways to map memory"
"Sizing and aligning device registers"
"Use volatile judiciously"
"Place volatile accurately"
"Volatile as a promise"
Each register in this hardware is exposed as a two-byte array, the first element is aligned at a two-byte boundary (its address is even). memcpy() runs a cycle and copies one byte at each iteration, so it copies from these registers this way (all loops unrolled, char is one byte):
*((char*)target) = *((char*)register);// evenly aligned - address is always even
*((char*)target + 1) = *((char*)register + 1);//oddly aligned - address is always odd
However the second line works incorrectly for some hardware specific reasons. If you copy two bytes at a time instead of one at a time, it is instead done this way (short int is two bytes):
*((short int*)target) = *((short*)register;// evenly aligned
Here you copy two bytes in one operation and the first byte is evenly aligned. Since there's no separate copying from an oddly aligned address, it works.
The modified memcpy checks whether the addresses are venely aligned and copies in tow bytes chunks if they are.
If you require access to hardware registers of a specific size, then you have two choices:
Understand how your C compiler generates code so you can use the appropriate integer type to access the memory, or
Embed some assembly to do the access with the correct byte or word size.
Reading hardware registers can have side affects, depending on the register and its function, of course, so it's important to access hardware registers with the proper sized access so you can read the entire register in one go.
Usually it's sufficient to use an integer type that is the same size as your register. On most compilers, a short is 16 bits.
void wordcpy(short *dest, const short *src, size_t bytecount)
{
int i;
for (i = 0; i < bytecount/2; ++i)
*dest++ = *src++;
}
I think all the detail is contained in that thread you posted so I'll try and break it down a little;
Specifically;
If you access a 16-bit hardware register using two 8-bit
accesses, the high-order byte doesn't read correctly (it
always read as 0xFF for me). This is fair enough since
TI's docs state that 16-bit hardware registers must be
read and written using 16-bit-wide instructions, and
normally would be, unless you're using memcpy() to
read them.
So the problem here is that the hardware registers only report the correct value if their values are read in a single 16-bit read. This would be equivalent to doing;
uint16 value = *(regAddress);
This reads from the address into the value register using a single 16-byte read. On the other hand you have memcpy which is copying data a single-byte at a time. Something like;
while (n--)
{
*(uint8*)pDest++ = *(uint8*)pSource++;
}
So this causes the registers to be read 8-bits (1 byte) at a time, resulting in the values being invalid.
The solution posted in that thread is to use a version of memcpy that will copy the data using 16-bit reads whereever the source and destination are a6-bit aligned.
What do you need to know? You've already found a separate post explaining it. Apparently the CPU documentation requires that 16-bit hardware registers are accessed with 16-bit reads and writes, but your implementation of memcpy uses 8-bit reads/writes. So they don't work together.
The solution is simply not to use memcpy to access this register.
Instead, write your own routine which copies 16-bit values.
Not sure exactly what the question is - I think that post has the right solution.
As you stated, the issue is that the standard memcpy() routine reads a byte at a time, which does not work correctly for memory mapped hardware registers. That is a limitation of the processor - there's simply no way to get a valid value reading a byte at at time.
The suggested solution is to write your own memcpy() which only works on word-aligned addresses, and reads 16-bit words at a time. This is fairly straightforward - the link gives both a c and an assembly version. The only gotcha is to make sure you always do the 16 bit copies from validly aligned address. You can do that in 2 ways: either use linker commands or pragmas to make sure things are aligned, or add a special case for the extra byte at the front of an unaligned buffer.