read()/write() calls on iOS seem to be limited to 2250 bytes

I am having a strange problem trying to read and write 9k bytes with open(), read() and write(). When I attempt to write 9k to a file and read it back, the data only goes up to index 2250; after that everything is zeros.
Here is my code (except for the filename, which isn't relevant; I'm just putting the file in NSDocumentDirectory):
int fp = open([appFile cStringUsingEncoding:NSASCIIStringEncoding], O_RDWR | O_CREAT, 0644);
[_detailViewController log:@"first open() returns %i (err: %i)", fp, errno];
int data2[10000];
int data3[10000];
for (int i=0;i<10000;i++) data2[i] = 1;
[_detailViewController log:@"resetting seek to 0"];
int seekPos = lseek(fp, 0, SEEK_SET);
int result = write(fp, data2, 9000);
[_detailViewController log:@"wrote 9k, result is %i", result];
[_detailViewController log:@"resetting seek to 0"];
seekPos = lseek(fp, 0, SEEK_SET);
result = read(fp, data3, 9000);
[_detailViewController log:@"read 9k, result is %i", result];
[_detailViewController log:@"values of data2[2248-2252] = 0x%x 0x%x 0x%x 0x%x 0x%x", data2[2248], data2[2249], data2[2250], data2[2251], data2[2252]];
[_detailViewController log:@"values of data3[2248-2252] = 0x%x 0x%x 0x%x 0x%x 0x%x", data3[2248], data3[2249], data3[2250], data3[2251], data3[2252]];
close(fp);
And here is the strange output:
2013-02-13 14:08:38.290 FileTester[2800:907] first open() returns 6 (err: 3)
2013-02-13 14:08:38.295 FileTester[2800:907] resetting seek to 0
2013-02-13 14:08:38.301 FileTester[2800:907] wrote 9k, result is 9000
2013-02-13 14:08:38.306 FileTester[2800:907] resetting seek to 0
2013-02-13 14:08:38.311 FileTester[2800:907] read 9k, result is 9000
2013-02-13 14:08:38.319 FileTester[2800:907] values of data2[2248-2252] = 0x1 0x1 0x1 0x1 0x1
2013-02-13 14:08:38.327 FileTester[2800:907] values of data3[2248-2252] = 0x1 0x1 0x0 0x0 0x0
As you can see on the last line, the data suddenly goes to zero.
Any ideas what I might be doing wrong? The thing that really gets me is that both the read() and write() return 9000.

As mentioned by ughoavgfhw (thanks!), the problem was that I was mixing up bytes and ints: 9000 bytes is only 2250 ints, since each int is 4 bytes, so the write and read only covered the first 2250 elements of the arrays.
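A minimal sketch of the corrected transfers, sizing them in bytes rather than in array elements (reusing fp, data2, and data3 from the question):
// write()/read() take a byte count: 9000 ints occupy 9000 * sizeof(int) = 36000 bytes.
ssize_t written = write(fp, data2, 9000 * sizeof(int));
lseek(fp, 0, SEEK_SET);
ssize_t readBytes = read(fp, data3, 9000 * sizeof(int));
// Alternatively, transfer the whole arrays with sizeof(data2) and sizeof(data3).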

Related

cJSON Memory leak when freeing cJSON object

I am facing an issue while using the cJSON Library. I am assuming that there is a memory leak that is breaking the code after a certain time (40 mins to 1 hr).
I have copied my code below:
void my_work_handler_5(struct k_work *work)
{
    char *ptr1[6];
    int y=0;
    static int counterdo = 0;
    char *desc6 = "RSRP";
    char *id6 = "dBm";
    char *type6 = "RSRP";
    char rsrp_str[100];
    snprintf(rsrp_str, sizeof(rsrp_str), "%d", rsrp_current);
    sensor5 = cJSON_CreateObject();
    cJSON_AddItemToObject(sensor5, "description", cJSON_CreateString(desc6));
    cJSON_AddItemToObject(sensor5, "Time", cJSON_CreateString(time_string));
    cJSON_AddItemToObject(sensor5, "value", cJSON_CreateNumber(rsrp_current));
    cJSON_AddItemToObject(sensor5, "unit", cJSON_CreateString(id6));
    cJSON_AddItemToObject(sensor5, "type", cJSON_CreateString(type6));
    /* print everything */
    ptr1[counterdo] = cJSON_Print(sensor5);
    printk("Counterdo value is : %d\n", counterdo);
    cJSON_Delete(sensor5);
    counterdo = counterdo + 1;
    if (counterdo==6){
        for(y=0;y<=counterdo;y++){
            free(ptr1[y]);
        }
        counterdo = 0;
    }
    return;
}
I read some other threads about freeing the memory and tried to do the same. Can anyone let me know if this is the right approach to free the space allocated to the cJSON object?
Regards,
Adeel.
Since cJSON is a portable library with no dependencies, it is better to look for a potential issue in your code on a PC: specialized tools are available in that environment to facilitate the investigation. I am assuming here that you have a Linux system, a Windows system with WSL or WSL2, or a Linux virtual machine available, with gcc and valgrind installed.
A minimal, self-contained, portable version of your code could be:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <cJSON.h>

static int rsrp_current = 1;
static char *time_string = NULL;

void
my_work_handler_5 ()
{
  char *ptr1[6];
  int y = 0;
  static int counterdo = 0;
  char *desc6 = "RSRP";
  char *id6 = "dBm";
  char *type6 = "RSRP";
  char rsrp_str[100];
  snprintf (rsrp_str, sizeof (rsrp_str), "%d", rsrp_current);
  cJSON *sensor5 = cJSON_CreateObject ();
  cJSON_AddItemToObject (sensor5, "description", cJSON_CreateString (desc6));
  cJSON_AddItemToObject (sensor5, "Time", cJSON_CreateString (time_string));
  cJSON_AddItemToObject (sensor5, "value", cJSON_CreateNumber (rsrp_current));
  cJSON_AddItemToObject (sensor5, "unit", cJSON_CreateString (id6));
  cJSON_AddItemToObject (sensor5, "type", cJSON_CreateString (type6));
  /* print everything */
  ptr1[counterdo] = cJSON_Print (sensor5);
  printf ("Counterdo value is : %d\n", counterdo);
  cJSON_Delete (sensor5);
  counterdo = counterdo + 1;
  if (counterdo == 6)
    {
      for (y = 0; y <= counterdo; y++)
        {
          free (ptr1[y]);
        }
      counterdo = 0;
    }
  return;
}

int
main (int argc, char **argv)
{
  time_t curtime;
  time (&curtime);
  for (int n = 0; n < 3 * 6; n++)
    {
      my_work_handler_5 ();
    }
}
Build procedure:
wget https://github.com/DaveGamble/cJSON/archive/v1.7.14.tar.gz
tar zxf v1.7.14.tar.gz
gcc -g -O0 -IcJSON-1.7.14 -o cjson cjson.c cJSON-1.7.14/cJSON.c
Running valgrind on the program:
valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes --verbose ./cjson
...indicates that some memory is being freed that was not previously allocated (Invalid free() / delete / delete[] / realloc()):
==6747==
==6747== HEAP SUMMARY:
==6747== in use at exit: 0 bytes in 0 blocks
==6747== total heap usage: 271 allocs, 274 frees, 14,614 bytes allocated
==6747==
==6747== All heap blocks were freed -- no leaks are possible
==6747==
==6747== ERROR SUMMARY: 21 errors from 2 contexts (suppressed: 0 from 0)
==6747==
==6747== 3 errors in context 1 of 2:
==6747== Invalid free() / delete / delete[] / realloc()
==6747== at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==6747== by 0x1094DA: my_work_handler_5 (cjson.c:42)
==6747== by 0x10955A: main (cjson.c:59)
==6747== Address 0x31 is not stack'd, malloc'd or (recently) free'd
==6747==
==6747==
==6747== 18 errors in context 2 of 2:
==6747== Conditional jump or move depends on uninitialised value(s)
==6747== at 0x483C9F5: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==6747== by 0x1094DA: my_work_handler_5 (cjson.c:42)
==6747== by 0x10955A: main (cjson.c:59)
==6747== Uninitialised value was created by a stack allocation
==6747== at 0x109312: my_work_handler_5 (cjson.c:11)
==6747==
==6747== ERROR SUMMARY: 21 errors from 2 contexts (suppressed: 0 from 0)
Replacing:
for (y = 0; y <= counterdo; y++)
  {
    free (ptr1[y]);
  }
by:
for (y = 0; y < counterdo; y++)
  {
    free (ptr1[y]);
  }
and executing valgrind again:
==6834==
==6834== HEAP SUMMARY:
==6834== in use at exit: 1,095 bytes in 15 blocks
==6834== total heap usage: 271 allocs, 256 frees, 14,614 bytes allocated
==6834==
==6834== Searching for pointers to 15 not-freed blocks
==6834== Checked 75,000 bytes
==6834==
==6834== 1,095 bytes in 15 blocks are definitely lost in loss record 1 of 1
==6834== at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==6834== by 0x10B161: print (cJSON.c:1209)
==6834== by 0x10B25F: cJSON_Print (cJSON.c:1248)
==6834== by 0x1094AB: my_work_handler_5 (cjson.c:30)
==6834== by 0x10959C: main (cjson.c:59)
==6834==
==6834== LEAK SUMMARY:
==6834== definitely lost: 1,095 bytes in 15 blocks
==6834== indirectly lost: 0 bytes in 0 blocks
==6834== possibly lost: 0 bytes in 0 blocks
==6834== still reachable: 0 bytes in 0 blocks
==6834== suppressed: 0 bytes in 0 blocks
==6834==
==6834== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Some memory is definitely being leaked.
The reason is that char *ptr1[6] is not static, and is therefore created on the stack every time my_work_handler_5() is called. The pointers returned by cJSON_Print() are therefore lost between two calls, and free() is called on arbitrary pointer values, since ptr1[] is not initialized as it could be:
char *ptr1[6] = { NULL, NULL, NULL, NULL, NULL, NULL };
Since you only free the pointers every 6 calls, and the earlier ones have already been lost by then, this causes the memory leak you were suspecting.
Replacing:
char *ptr1[6];
by:
static char *ptr1[6];
compiling, running valgrind again:
==6927==
==6927== HEAP SUMMARY:
==6927== in use at exit: 0 bytes in 0 blocks
==6927== total heap usage: 271 allocs, 271 frees, 14,614 bytes allocated
==6927==
==6927== All heap blocks were freed -- no leaks are possible
==6927==
==6927== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
The modified version of the program should now work on your bare-metal system.
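For reference, here is a sketch of the handler with both corrections applied (the strict < bound and a pointer array that survives between calls), with sensor5 made local as in the PC version above and the unused rsrp_str buffer omitted:
void my_work_handler_5(struct k_work *work)
{
    static char *ptr1[6];          /* static: the printed strings survive between calls */
    static int counterdo = 0;
    char *desc6 = "RSRP";
    char *id6 = "dBm";
    char *type6 = "RSRP";

    cJSON *sensor5 = cJSON_CreateObject();
    cJSON_AddItemToObject(sensor5, "description", cJSON_CreateString(desc6));
    cJSON_AddItemToObject(sensor5, "Time", cJSON_CreateString(time_string));
    cJSON_AddItemToObject(sensor5, "value", cJSON_CreateNumber(rsrp_current));
    cJSON_AddItemToObject(sensor5, "unit", cJSON_CreateString(id6));
    cJSON_AddItemToObject(sensor5, "type", cJSON_CreateString(type6));

    ptr1[counterdo] = cJSON_Print(sensor5);
    printk("Counterdo value is : %d\n", counterdo);
    cJSON_Delete(sensor5);

    counterdo = counterdo + 1;
    if (counterdo == 6) {
        for (int y = 0; y < counterdo; y++) {   /* strict <: only 6 valid slots */
            free(ptr1[y]);
        }
        counterdo = 0;
    }
}
Note that if custom allocation hooks are ever installed via cJSON_InitHooks(), the strings returned by cJSON_Print() should be released with cJSON_free() rather than free().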

mprotect errno 22 iOS

I'm developing a jailbroken app on iOS and getting errno 22 when calling
mprotect(p, 1024, PROT_READ | PROT_EXEC)
errno 22 means invalid arguments, but I can't figure out what's wrong. I've aligned p to a multiple of the page size, and I've malloc'd the memory before calling mprotect.
Here's my code and sample output:
#define PAGESIZE 4096
FILE *pFile;
pFile = fopen("log.txt", "w");
uint32_t code[] = {
    0xe2800001, // add r0, r0, #1
    0xe12fff1e, // bx lr
};
fprintf(pFile, "Before Execution\n");
uint32_t *p = (uint32_t *)malloc(1024 + PAGESIZE - 1);
if (!p) {
    fprintf(pFile, "Couldn't malloc(1024)");
    perror("Couldn't malloc(1024)");
    exit(errno);
}
fprintf(pFile, "Malloced to %p\n", p);
p = (uint32_t *)(((uintptr_t)p + PAGESIZE - 1) & ~(PAGESIZE - 1));
fprintf(pFile, "Moved pointer to %p\n", p);
fprintf(pFile, "Before Compiling\n");
// copy instructions to function
p[0] = code[0];
p[1] = code[1];
fprintf(pFile, "After Compiling\n");
if (mprotect(p, 1024, PROT_READ | PROT_EXEC)) {
    int err = errno;
    fprintf(pFile, "Couldn't mprotect2: %i\n", errno);
    perror("Couldn't mprotect");
    exit(errno);
}
And output:
Before Execution
Malloced to 0x13611ec00
Moved pointer 0x13611f000
Before Compiling
After Compiling
Couldn't mprotect2: 22
Fixed this by using posix_memalign(). It turns out I wasn't aligning my pointer to the page size correctly.
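A sketch of the posix_memalign() route (my own reconstruction, not the poster's final code). One likely reason the manual rounding produced EINVAL is that 64-bit iOS devices use 16 KB pages, so aligning to a hard-coded 4096 does not yield a page-aligned address; the logged address 0x13611f000 is 4 KB-aligned but not 16 KB-aligned. Querying the real page size avoids that assumption:
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static uint32_t *alloc_executable_page(void)
{
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);   /* 16384 on arm64 iOS devices */
    void *mem = NULL;
    if (posix_memalign(&mem, pagesize, pagesize) != 0)  /* page-aligned allocation */
        return NULL;

    uint32_t *p = (uint32_t *)mem;
    p[0] = 0xe2800001;  /* add r0, r0, #1 */
    p[1] = 0xe12fff1e;  /* bx lr */

    /* The address is now truly page-aligned; on a stock (non-jailbroken) system
       this call may still be refused by the W^X policy. */
    if (mprotect(mem, pagesize, PROT_READ | PROT_EXEC) != 0) {
        free(mem);
        return NULL;
    }
    return p;
}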

iOS 64bit ntpdate don't work properly with arm64 flag

I have a function (ANSI C) that retrieves the time from our NTP server.
This code works properly when compiled for 32-bit, but not when compiled for arm64.
It works properly on iPhone 4/4S/5 (32-bit) but not on iPhone 5s/6/6S (64-bit).
I think that the problem is:
tmit=ntohl((time_t)buf[10]); //# get transmit time
time_t is 8 bytes when compiled for arm64.
You can find the source code below.
Correct output with iPhone 5 Simulator (32-bit) ---------------------------
xxx.xxx.xxx.xxx PORT 123
sendto-->48
prima recv
recv-->48
tmit=-661900093
tmit=1424078403
1424078403-->Time: Mon Feb 16 10:20:03 2015
10:20:03 --> 37203
---------------------------------------------------------
Wrong output with iPhone 6 Simulator (64-bit) ---------------------------
xxx.xxx.xxx.xxx PORT 123
sendto-->48
prima recv
recv-->48
tmit=19612797
tmit=2105591293
2105591293-->Time: Tue Nov 19 00:47:09 38239
00:47:09 --> 2829
//---------------------------------------------------------------------------
long ntpdate(char *hostname) {
//ntp1.inrim.it (193.204.114.232)
//ntp2.inrim.it (193.204.114.233)
int portno=NTP_PORT; //NTP is port 123
int maxlen=1024; //check our buffers
int i=0; // misc var i
unsigned char msg[48]={010,0,0,0,0,0,0,0,0}; // the packet we send
unsigned long buf[maxlen]; // the buffer we get back
//struct in_addr ipaddr; //
struct protoent *proto; //
struct sockaddr_in server_addr;
int s; // socket
int tmit; // the time -- This is a time_t sort of
char ora[20]="";
//
//#we use the system call to open a UDP socket
//socket(SOCKET, PF_INET, SOCK_DGRAM, getprotobyname("udp")) or die "socket: $!";
proto=getprotobyname("udp");
s=socket(PF_INET, SOCK_DGRAM, proto->p_proto);
if(s==-1) {
//printf("ERROR socket=%d\n",s);
return -1;
}
//Set the receive timeout --------------------
struct timeval tv;
tv.tv_sec = TIMEOUT_NTP; //sec
tv.tv_usec = 0;
if (setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(struct timeval)) != 0)
{
//printf("Error assigning socket option");
return -1;
}
memset( &server_addr, 0, sizeof( server_addr ));
server_addr.sin_family=AF_INET;
//resolve the hostname to an IP address
struct hostent *hp = gethostbyname(hostname);
if (hp == NULL) {
return -1;
} else {
sprintf(hostname_ip, "%s", inet_ntoa( *( struct in_addr*)( hp -> h_addr_list[0])));
}
#ifdef LOG_NTP
printf("%s-->%s PORT %d\n",hostname,hostname_ip,portno);
#endif
server_addr.sin_addr.s_addr = inet_addr(hostname_ip);
server_addr.sin_port=htons(portno);
//printf("ipaddr (in hex): %x\n",server_addr.sin_addr);
/*
* build a message. Our message is all zeros except for a one in the
* protocol version field
* msg[] in binary is 00 001 000 00000000
* it should be a total of 48 bytes long
*/
// send the data
i=sendto(s,msg,sizeof(msg),0,(struct sockaddr *)&server_addr,sizeof(server_addr));
#ifdef LOG_NTP
printf("sendto-->%d\n",i);
#endif
if (i==-1)
return -1;
#ifdef LOG_NTP
printf("prima recv\n");
#endif
// get the data back
i=recv(s,buf,sizeof(buf),0);
#ifdef LOG_NTP
printf("recv-->%d\n",i);
#endif
if (i==-1)
{
#ifdef LOG_NTP
printf("Error: %s (%d)\n", strerror(errno), errno);
#endif
return -1;
}
//printf("recvfr: %d\n",i);
//We get 12 long words back in Network order
//for(i=0;i<12;i++)
//printf("%d\t%-8x\n",i,ntohl(buf[i]));
/*
* The high word of transmit time is the 10th word we get back
* tmit is the time in seconds not accounting for network delays which
* should be way less than a second if this is a local NTP server
*/
tmit=ntohl((time_t)buf[10]); //# get transmit time
#ifdef LOG_NTP
printf("tmit=%d\n",tmit);
#endif
/*
* Convert time to unix standard time NTP is number of seconds since 0000
* UT on 1 January 1900 unix time is seconds since 0000 UT on 1 January
* 1970. There has been a trend to add 2 leap seconds every 3 years.
* Leap seconds are only an issue in the last second of the month in June and
* December; if you don't try to set the clock then they can be ignored, but
* this is important to people who coordinate times with GPS clock sources.
*/
tmit-= 2208988800U;
#ifdef LOG_NTP
printf("tmit=%d\n",tmit);
#endif
/* use unix library function to show me the local time (it takes care
* of timezone issues for both north and south of the equator and places
* that do Summer time/ Daylight savings time.
*/
//#compare to system time
#ifdef LOG_NTP
//printf("%d-->Time: %s\n",tmit,ctime((const time_t)&tmit));
printf("%d-->Time: %s\n",tmit,ctime((const time_t)&tmit));
#endif
//i=time(0);
//printf("%d-%d=%d\n",i,tmit,i-tmit);
//printf("System time is %d seconds off\n",i-tmit);
//Get the time and convert it from HH:MM:SS --> seconds
strftime(ora, 20, "%T", localtime((const time_t)&tmit));
#ifdef LOG_NTP
printf("%s --> %ld\n",ora, C2TIME(ora));
#endif
return C2TIME(ora);
}
I solved the problem! I used:
uint32_t buf[maxlen];
uint32_t tmit;
instead of:
unsigned long buf[maxlen];
int tmit;
and defined a variable of type time_t for the libc calls:
time_t tmit_temp=tmit;
printf("%d-->Time: %s\n",tmit,ctime(&tmit_temp));
strftime(ora, 20, "%T", localtime(&tmit_temp));
This works properly!!! ;-)
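A sketch that restates why the 32-bit types fix it (my explanation, not part of the original answer): on LP64, unsigned long is 8 bytes, so buf[10] indexed byte offset 80 of a 48-byte reply and read uninitialized memory, and ntohl() operates on 32-bit values in any case. The helper below is hypothetical:
#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <arpa/inet.h>

/* Print the transmit timestamp of an NTP reply, viewed as twelve 32-bit words.
   With 'unsigned long buf[]' on LP64, each element was 8 bytes, so buf[10]
   pointed past the 48-byte reply; uint32_t restores the intended layout. */
static void print_transmit_time(const uint32_t reply[12])
{
    uint32_t tmit = ntohl(reply[10]);   /* transmit timestamp, seconds field */
    tmit -= 2208988800U;                /* NTP epoch (1900) -> Unix epoch (1970) */
    time_t t = (time_t)tmit;            /* widen only for the libc time functions */
    printf("%u --> %s", tmit, ctime(&t));
}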

Set data for each bit in NSData iOS

I created an NSData of length 2 bytes (16 bits). I want to set the first 12 bits to the binary value of (int)120, and then set the 13th, 14th, 15th, and 16th bits individually to 0 or 1.
This is what I want to do:
0000 0111 1000 --> 12 bits as 120 (int)
1 <-- 13th bit
0 <-- 14th bit
1 <-- 15th bit
1 <-- 16th bit
Expected output => 0000 0111 1000 1011: the final binary, which I then want to convert to NSData.
How can I do that? Please give me some advice. Thanks all.
0000011110001011 in bits is 0x07 0x8b in bytes.
unsigned char a[2];
a[0] = 0x07;
a[1] = 0x8b;
NSData *d = [NSData dataWithBytes:a length:2];
Recently I wrote this code for my own project; check whether it is helpful to you.
//Create a NSMutableData with particular num of data bytes
NSMutableData *dataBytes= [[NSMutableData alloc] initWithLength:numberOfDataBytes];
//get the byte in which you want to change bit.
char x;
[dataBytes getBytes:&x range:NSMakeRange(bytePos,1)];
//change the bit here by shift operation
x |= 1<< (bitNum%8);
//put byte back in NSMutableData
[dataBytes replaceBytesInRange:NSMakeRange(bytePos,1) withBytes:&x length:1];
Let me know if you need more help. :)
For changing the 13th bit based on string equality:
if([myString isEqualToString:@"hello"])
{
char x;
[dataBytes getBytes:&x range:NSMakeRange(0,1)];
//change the bit here by shift operation
x |= 1<< (4%8); // x |=1<<4;
//put byte back in NSMutableData
[dataBytes replaceBytesInRange:NSMakeRange(0,1) withBytes:&x length:1];
}
This is the exact code you might need:
uint16_t x= 120; //2 byte unsigned int (0000 0000 0111 1000)
//You need 13th bit to be 1
x<<=1; //Shift bits to left .(x= 0000 0000 1111 0000)
x|=1; //OR with 1 (x= 0000 0000 1111 0001)
//14th bit to be 0.
x<<=1; // (x=0000 0001 1110 0010)
//15th bit to be 1
x<<=1; //x= 0000 0011 1100 0100
x|=1; //x= 0000 0011 1100 0101
//16th bit to be 1
x<<=1; //x= 0000 0111 1000 1010
x|=1; //x= 0000 0111 1000 1011
//Now convert x into NSData
/** **** Replace this for Big Endian ***************/
NSMutableData *data = [[NSMutableData alloc] init];
int MSB = x/256;
int LSB = x%256;
[data appendBytes:&MSB length:1];
[data appendBytes:&LSB length:1];
/** **** Replace upto here.. :) ***************/
//replace with :
//NSMutableData *data = [[NSMutableData alloc] initWithBytes:&x length:sizeof(x)];
NSLog(@"%@", [data description]);
Output: <078b> //x= 0000 0111 1000 1011 //for Big Endian : <8b07> x= 1000 1011 0000 0111
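An alternative sketch of the same packing done arithmetically rather than bit by bit (my own variant): put 120 in the top 12 bits, OR in the four flag bits, and serialize in big-endian byte order.
uint16_t value = (120 << 4) | (1 << 3) | (0 << 2) | (1 << 1) | 1; // 0000 0111 1000 1011
uint16_t bigEndian = CFSwapInt16HostToBig(value);                 // force big-endian byte order
NSData *packed = [NSData dataWithBytes:&bigEndian length:sizeof(bigEndian)];
NSLog(@"%@", packed);                                             // <078b>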

How to read a NSInputStream with UTF-8?

I am trying to read a large file on iOS using NSInputStream, splitting the file into lines at newline characters (I don't want to use componentsSeparatedByCharactersInSet as it uses too much memory).
But since not all lines seem to be UTF-8 encoded (they can appear as plain ASCII, which uses the same bytes), I often get the warning: Incorrect NSStringEncoding value 0x0000 detected. Assuming NSASCIIStringEncoding. Will stop this compatiblity mapping behavior in the near future.
My question is: is there a way to suppress this warning, e.g. by setting a compiler flag?
Furthermore: is it safe to append/concatenate two buffer reads? Reading from the byte stream, converting each buffer to a string, and then appending the strings could corrupt the result.
Below is an example method that demonstrates how the byte-to-string conversion discards the first and second halves of the UTF-8 character as invalid.
- (void)NSInputStreamTest {
    uint8_t testString[] = {0xd0, 0x91}; // @"Б"

    // Test 1: Read max 1 byte at a time of UTF-8 string
    uint8_t buf1[1], buf2[1];
    NSString *s1, *s2, *s3;
    NSInteger c1, c2;
    NSInputStream *inStream = [[NSInputStream alloc] initWithData:[[NSData alloc] initWithBytes:testString length:2]];
    [inStream open];
    c1 = [inStream read:buf1 maxLength:1];
    s1 = [[NSString alloc] initWithBytes:buf1 length:1 encoding:NSUTF8StringEncoding];
    NSLog(@"Test 1: Read %d byte(s): %@", c1, s1);
    c2 = [inStream read:buf2 maxLength:1];
    s2 = [[NSString alloc] initWithBytes:buf2 length:1 encoding:NSUTF8StringEncoding];
    NSLog(@"Test 1: Read %d byte(s): %@", c2, s2);
    s3 = [s1 stringByAppendingString:s2];
    NSLog(@"Test 1: Concatenated: %@", s3);
    [inStream close];

    // Test 2: Read max 2 bytes at a time of UTF-8 string
    uint8_t buf4[2];
    NSString *s4;
    NSInteger c4;
    NSInputStream *inStream2 = [[NSInputStream alloc] initWithData:[[NSData alloc] initWithBytes:testString length:2]];
    [inStream2 open];
    c4 = [inStream2 read:buf4 maxLength:2];
    s4 = [[NSString alloc] initWithBytes:buf4 length:2 encoding:NSUTF8StringEncoding];
    NSLog(@"Test 2: Read %d byte(s): %@", c4, s4);
    [inStream2 close];
}
Output:
2013-02-10 21:16:23.412 Test[11144:c07] Test 1: Read 1 byte(s): (null)
2013-02-10 21:16:23.413 Test[11144:c07] Test 1: Read 1 byte(s): (null)
2013-02-10 21:16:23.413 Test[11144:c07] Test 1: Concatenated: (null)
2013-02-10 21:16:23.413 Test[11144:c07] Test 2: Read 2 byte(s): Б
First of all, in the line s3 = [s1 stringByAppendingString:s2]; you are trying to concatenate two 'nil' values, so the result is also 'nil'. You may want to concatenate the bytes instead of the strings:
uint8_t buf3[2];
buf3[0] = buf1[0];
buf3[1] = buf2[0];
s3 = [[NSString alloc] initWithBytes:buf3 length:2 encoding:NSUTF8StringEncoding];
Output:
2015-11-06 12:57:40.304 Test[10803:883182] Test 1: Read 1 byte(s): (null)
2015-11-06 12:57:40.305 Test[10803:883182] Test 1: Read 1 byte(s): (null)
2015-11-06 12:57:40.305 Test[10803:883182] Test 1: Concatenated: Б
Second, a UTF-8 character may occupy from 1 to 6 bytes:
(1 byte) 0aaa aaaa //if symbol lays in 0x00 .. 0x7F (ASCII)
(2 bytes) 110x xxxx 10xx xxxx
(3 bytes) 1110 xxxx 10xx xxxx 10xx xxxx
(4 bytes) 1111 0xxx 10xx xxxx 10xx xxxx 10xx xxxx
(5 bytes) 1111 10xx 10xx xxxx 10xx xxxx 10xx xxxx 10xx xxxx
(6 bytes) 1111 110x 10xx xxxx 10xx xxxx 10xx xxxx 10xx xxxx 10xx xxxx
So, if you intend to read raw bytes from the NSInputStream and then translate them into a UTF-8 NSString, you probably want to read byte by byte until you get a valid string:
#define MAX_UTF8_BYTES 6
NSString *utf8String;
NSMutableData *_data = [[NSMutableData alloc] init]; //for easy 'appending' of bytes
int bytes_read = 0;
while (!utf8String) {
    if (bytes_read > MAX_UTF8_BYTES) {
        NSLog(@"Can't decode input byte array into UTF8.");
        return;
    }
    else {
        uint8_t byte[1];
        if ([_inputStream read:byte maxLength:1] != 1) {
            NSLog(@"Read failed or the stream ended.");
            return;
        }
        [_data appendBytes:byte length:1];
        // initWithData:encoding: returns nil while the accumulated bytes are not yet
        // valid UTF-8 (and, unlike stringWithUTF8String:, it does not require a
        // NUL-terminated buffer).
        utf8String = [[NSString alloc] initWithData:_data encoding:NSUTF8StringEncoding];
        bytes_read++;
    }
}
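An alternative to the trial decoding above is to derive the expected sequence length from the lead byte (per the table above) and then read exactly that many continuation bytes. A hypothetical helper for the length calculation:
// Length in bytes of the UTF-8 sequence announced by its lead byte; 0 means the
// byte is a continuation byte or an invalid lead byte.
static int UTF8SequenceLength(uint8_t lead) {
    if ((lead & 0x80) == 0x00) return 1;   // 0xxxxxxx (ASCII)
    if ((lead & 0xE0) == 0xC0) return 2;   // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;   // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;   // 11110xxx
    if ((lead & 0xFC) == 0xF8) return 5;   // 111110xx (legacy form, rejected by modern decoders)
    if ((lead & 0xFE) == 0xFC) return 6;   // 1111110x (legacy form, rejected by modern decoders)
    return 0;
}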
ASCII (and hence the newline character) is a subset of UTF-8, so there should not be any conflict.
It should be possible to divide your stream at the newline characters, as you would in a simple ASCII stream. Then you can convert each chunk ("line") into an NSString using UTF-8.
Are you sure the encoding errors are not real, i.e., that your stream does not actually contain bytes that are invalid UTF-8?
Edited to add from the comments:
This presumes that the lines consist of sufficiently few characters to keep a whole line in memory before converting from UTF-8.
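A sketch of that newline-first approach, assuming each line fits in memory; inputStream and processLine: are placeholders for your own stream and handling code:
NSMutableData *pending = [NSMutableData data];
uint8_t chunk[4096];
NSInteger n;
while ((n = [inputStream read:chunk maxLength:sizeof(chunk)]) > 0) {
    [pending appendBytes:chunk length:(NSUInteger)n];
    // Split out every complete line; bytes after the last newline stay buffered.
    NSUInteger start = 0;
    const uint8_t *bytes = pending.bytes;
    for (NSUInteger i = 0; i < pending.length; i++) {
        if (bytes[i] == '\n') {
            NSData *lineData = [pending subdataWithRange:NSMakeRange(start, i - start)];
            NSString *line = [[NSString alloc] initWithData:lineData
                                                   encoding:NSUTF8StringEncoding];
            // line is nil only if the data is genuinely invalid UTF-8.
            [self processLine:line];
            start = i + 1;
        }
    }
    [pending replaceBytesInRange:NSMakeRange(0, start) withBytes:NULL length:0];
}
Splitting on the raw '\n' byte is safe because no UTF-8 continuation byte equals 0x0A, so multi-byte characters are never cut at a line boundary.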
