When I use os_proc_available_memory to get the available memory, it returns a very large number:
2199022788156 * 1024.
This appears on some devices (iPhone 14, 128 GB).
Not on every iPhone 14, though.
Does anyone know why? Have you ever encountered this, or have any ideas?
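For reference, this is roughly how I call it; the sanity check against physical RAM is just my own workaround idea, not an official recommendation:

#include <os/proc.h>              // os_proc_available_memory(), iOS 13+
#import <Foundation/Foundation.h>

// Query the remaining memory budget and sanity-check it against the
// device's physical RAM, since the raw value is sometimes absurd.
uint64_t availableMemoryBytes(void) {
    uint64_t available = (uint64_t)os_proc_available_memory();
    uint64_t physical  = [NSProcessInfo processInfo].physicalMemory;
    if (available == 0 || available > physical) {
        // Bogus reading (the ~2 TB case above): fall back to a
        // conservative fraction of physical RAM.
        available = physical / 2;
    }
    return available;
}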
I'm comparing two NSNumbers in my app and I've done it the wrong way:
if(max < selected)
And it should be:
if([max longValue] < [selected longValue])
So the first comparison is really comparing the two objects' memory addresses. The funny thing (at least for me) is that the values seem to be related to the addresses. For example, if I get the first number with value 5, its memory address is 0xb000000000000053, and if I get the second with 10, it is 0xb0000000000000a3 ("a" being 10 in hexadecimal).
For that reason the first (wrong) comparison was actually working. Now a user has complained about an error here, and it's obviously because of this, but it has led me to the following questions:
Does this only happen in the simulator? Because that's where I'm testing, and the user will have a real device. Or does this happen normally, but it's just not a rule that is always fulfilled?
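For the record, the value comparison can also be written with compare: instead of longValue:

// Compares the wrapped values, not the object pointers.
if ([max compare:selected] == NSOrderedAscending) {
    // max < selected
}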
This is a "tagged pointer," not an address. The value 5 is packed inside the pointer as you've seen. You can identify tagged pointers because they're odd (the last bit is 1). It's not possible to fetch odd addresses on any of Apple's hardware (the word size is 4 or 8 bytes), so that bit is never set for a real address.
Tagged pointers are only available on 64-bit platforms. If you run on a 32-bit platform, the values will be real pointers, and they may not be in any particular order, which will then lead to the kinds of bugs you're encountering. Unfortunately, I don't believe there is any way to get a compiler warning, or even a static analysis warning, for this kind of misuse of NSNumber.
Mike Ash provides an in-depth discussion of the subject.
On a slightly related note, on 32-bit platforms, certain NSNumbers are singletons, particularly small values since they're used a lot (-1 through 12 as I recall, but I believe it's different on different platforms). This means that == may happen to work for some numbers, but not for others. It also means that without ARC, it was possible to over-release a specific value (for example, 4) such that your program would crash the next time it happened to use that value. True story.... very hard to debug.
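If you want to see this for yourself, here is a quick experiment (it relies on implementation details, so don't depend on it in shipping code; the large constant is just an arbitrary value too big to fit in a tagged payload):

#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        NSNumber *small = @5;                        // typically a tagged pointer on 64-bit
        NSNumber *big   = @(9223372036854775807LL);  // usually a real heap-allocated object

        NSLog(@"small: %p", (__bridge void *)small); // value visibly packed into the pointer
        NSLog(@"big:   %p", (__bridge void *)big);   // looks like an ordinary heap address

        // Regardless of representation, always compare values, never pointers:
        NSLog(@"equal by value: %d", [small isEqualToNumber:@5]);
    }
    return 0;
}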
I'm just testing InfluxDB 1.3.5 for storing a small number (~30-300) of very long integer series (worst case per device: 86400 * (365*12) [sec/day * days over 12 years] = 378,432,000 points).
For example, for 320 devices the total number of points would be 86400 * (365*12) * 320 = 121,098,240,000.
The series cardinality is low; it equals the number of devices. I'm using second-precision timestamps (that mode is enabled when I commit to InfluxDB via the PHP API).
Yes, I really need to keep all the samples, so downsampling is not an option.
I'm inserting the samples as point-arrays of size 86400 per request sorted from oldest to newest. The behaviour is similar (OOM in both cases) for inmem and tsi1 indexing modes.
Despite all that, I'm not able to insert this number of points into the database without crashing it due to out-of-memory. The host VM has 8 GiB of RAM and 4 GiB of swap, which fill up completely. I can't find anything in the documentation saying this setup is problematic, nor any notice that it should result in high RAM usage at all...
Does anyone have a hint on what could be wrong here?
Thanks and all the best!
b-
[ I asked the same question here but received no replies, that's the reason for crossposting: https://community.influxdata.com/t/ever-increasing-ram-usage-with-low-series-cardinality/2555 ]
I found out what the issue most likely was:
I had a bug in my feeder that caused timestamps not to be updated, so lots of points with distinct values were written over and over again to the same timestamp/tag combination.
If you experience something similar, try double-checking each step in the pipeline for a timestamp-related error.
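To illustrate with hypothetical line-protocol points (measurement and tag names made up): when the timestamp never advances, every write targets the same series/timestamp and simply overwrites the previous field value instead of adding a new point:

power,device=dev1 value=17 1504224000
power,device=dev1 value=23 1504224000
power,device=dev1 value=42 1504224000

All three lines end up as a single stored point (the last value wins), while the feeder believes it has written three.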
This was not the issue, unfortunately; the RAM usage rises nevertheless when importing more points than before.
I have some places in my code which look like this:
var i = 0
for c in vertexStates[0] {
    // this operation is costly (it encapsulates 4 linear interpolations)
    currentVertexes.append(vertexStates[1][i].interpolateTo(c, alpha: factor))
    i += 1
}
And I know for sure that there are more than 1000 vertexes in the vertexStates[index] arrays (maybe up to 3000). What are the best practices for optimizing (vectorizing) such operations? Should I figure out how to spread the work across threads? Will the gains from multi-threading outweigh the overhead? Are there other ways of doing such operations faster?
I need a general approach to optimizing such operations (in my case, producing one array from two other arrays, where order matters for me), no matter whether 3000 counts as long or not. My iPhone 6 Plus CPU is at 65% load during this operation, so I can predict that the 4s will show very poor results, even though I haven't tested it yet.
100 isn't very long. 300 isn't very long. 100,000 is where we can start arguing whether something is very long.
Did you measure how long things take? What is the slowest device where your code could run? If you run on iOS 7, how well does it run on an iPhone 4? If you run on iOS 8 or 9 only, how well does it run on 4s or iPad 2?
The first step is measuring. Post with results.
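If measuring does show that the per-element interpolation dominates, the usual next step is to split the independent iterations across cores. A rough Objective-C sketch of the pattern (the same idea is available from Swift via DispatchQueue.concurrentPerform; the buffers and the lerp here are placeholders for the real vertex data):

#import <Foundation/Foundation.h>

// Each output element depends only on the inputs at the same index, so the
// iterations are independent. Writing results by index into a preallocated
// buffer keeps the order stable without any locking.
void interpolateAll(const float *a, const float *b, float *out,
                    size_t count, float factor) {
    dispatch_queue_t q = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_apply(count, q, ^(size_t i) {
        out[i] = a[i] + (b[i] - a[i]) * factor;   // placeholder for the costly 4-way lerp
    });
}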
I noticed that after increasing the number of arrays instantiated in memory from 8 to 23, my app just stops executing
[NSMutableArray addObject:obj]
at the 13th array, on 32-bit devices only. On an iPad Air 2 (device and sim) and iPhone 6 (device and sim), all 23 arrays are populated and the app functions as expected.
I understand there's a point at which a device will run out of available memory, and I noticed in Xcode that on a 32-bit device the app's memory hovers around 50-55 MB, but the app doesn't crash or give a memory warning in the console. On a 64-bit device or simulator, at the same point of interest, the app's memory is around 90-95 MB.
1) How is memory on 32-bit devices different from 64-bit devices when it comes to the amount of data that can be instantiated?
2) Is there a certain number of arrays that can be initialized in memory, unrelated to their size? Consider that I could populate 2 of the 23 arrays with a single small object each, and the first would have the right count while the second (any array with an ID > 13) would have a count of 0, like this:
if (obj.eventTypeID == [NSNumber numberWithInt:1]) {obj.color = [UIColor whiteColor];[array1 addObject:obj];}
//ALL ARRAYS ALWAYS POPULATE NO MATTER THE COUNT OR SIZE BETWEEN 1 AND 13
if (obj.eventTypeID == [NSNumber numberWithInt:13]) {obj.color = [UIColor greenColor];[array13 addObject:obj];}
//ALL ARRAYS ARE ALWAYS EMPTY BETWEEN 14 AND 23
if (obj.eventTypeID == [NSNumber numberWithInt:23]) {obj.color = [UIColor redColor];[array23 addObject:obj];}
Hopefully that's enough to go on. Just remember, the app works as expected on 64-bit, but not on 32-bit.
For what it's worth, I ended up fixing the problem by comparing with intValue in the conditional statements... genius, I know...
if ([obj.eventTypeID intValue] == 1)
Although this solved a huge bug, I still don't understand why the precision of doing it the other way is any different.
obj.eventTypeID is an NSInteger
So why is this any different? More importantly, why is this difference powerful enough to stop the thread from processing these conditions inside a loop if it's not an intValue? These are some of the unanswered parts of this question, but it's long been solved with the solution above.
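To put the difference in code (illustrative only):

NSNumber *a = [NSNumber numberWithInt:13];
NSNumber *b = [NSNumber numberWithInt:13];

// Pointer comparison: only true if both variables point at the very same object.
// Tagged pointers make this appear to work on 64-bit; 32-bit generally hands back
// distinct objects for values outside its small cached range, so it fails there.
BOOL samePointer = (a == b);

// Value comparison: always compares the wrapped numbers.
BOOL sameValue  = ([a intValue] == [b intValue]);   // the fix used above
BOOL sameValue2 = [a isEqualToNumber:b];            // equivalent, more idiomatic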
It seems like 2 million floats should be no big deal: only 8 MB out of 1 GB of GPU RAM. I am able to allocate that much at times, and sometimes more, with no trouble. I get CL_OUT_OF_RESOURCES when I do a clEnqueueReadBuffer, which seems odd. Am I able to sniff out where the trouble really started? OpenCL shouldn't be failing like this at clEnqueueReadBuffer, right? It should fail when I allocate the data, right? Is there some way to get more details than just the error code? It would be cool if I could see how much VRAM was allocated when OpenCL declared CL_OUT_OF_RESOURCES.
I just had the same problem you had (took me a whole day to fix).
I'm sure people with the same problem will stumble upon this, that's why I'm posting to this old question.
You probably didn't check for the maximum work-group size of the kernel.
This is how you do it:
size_t kernel_work_group_size;
// Query the largest work-group size this particular kernel can be launched
// with on this device (often smaller than the device-wide maximum).
clGetKernelWorkGroupInfo(kernel, device, CL_KERNEL_WORK_GROUP_SIZE,
                         sizeof(size_t), &kernel_work_group_size, NULL);
My devices (2x NVIDIA GTX 460 & Intel i7 CPU) support a maximum work group size of 1024, but the above code returns something around 500 when I pass my Path Tracing kernel.
When I used a workgroup size of 1024 it obviously failed and gave me the CL_OUT_OF_RESOURCES error.
The more complex your kernel becomes, the smaller its maximum work-group size will become (or at least that's what I experienced).
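Building on that query (queue, kernel and num_items are placeholders here, and error checking is omitted), you then clamp your requested local size before enqueueing:

size_t local_size = 1024;                       // what you originally wanted
if (local_size > kernel_work_group_size)
    local_size = kernel_work_group_size;        // respect the per-kernel limit

// Global size must be a multiple of the local size.
size_t global_size = ((num_items + local_size - 1) / local_size) * local_size;

cl_int err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                                    &global_size, &local_size, 0, NULL, NULL);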
Edit:
I just realized you said "clEnqueueReadBuffer" instead of "clEnqueueNDRangeKernel"...
My answer was related to clEnqueueNDRangeKernel.
Sorry for the mistake.
I hope this is still useful to other people.
From another source:
- calling clFinish() gets you the error status for the calculation (rather than getting it when you try to read data); see the sketch below.
- the "out of resources" error can also be caused by a 5s timeout if the (NVidia) card is also being used as a display
- it can also appear when you have pointer errors in your kernel.
A follow-up suggests running the kernel first on the CPU to ensure you're not making out-of-bounds memory accesses.
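Regarding the first point, a minimal sketch of how that check looks (queue and the launch parameters are assumed from your existing setup):

cl_int err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                                    &global_size, &local_size, 0, NULL, NULL);
// err here only says the kernel was queued successfully.

err = clFinish(queue);                // blocks until the kernel has actually run
if (err != CL_SUCCESS) {
    // e.g. CL_OUT_OF_RESOURCES: the failure happened during execution,
    // not at the later clEnqueueReadBuffer call.
    printf("kernel execution failed: %d\n", (int)err);
}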
Not all available memory can necessarily be supplied to a single allocation request. Read up on heap fragmentation to learn why the largest allocation that can succeed is the largest contiguous block of memory, and how blocks get divided into smaller pieces as a result of using the memory.
It's not that the resource is exhausted... It just can't find a single piece big enough to satisfy your request...
Out-of-bounds accesses in a kernel are typically silent (there is still no error at the kernel-queueing call).
However, if you later try to read the kernel result with clEnqueueReadBuffer(), this error will show up. It indicates something went wrong during kernel execution.
Check your kernel code for out-of-bounds read/writes.
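As an illustration (a made-up kernel, not from the original post), the usual guard looks like this:

// Each work-item handles one element; the bound check prevents out-of-range
// writes when the global size has been rounded up past the buffer length.
__kernel void scale(__global float *data, const unsigned int n, const float factor)
{
    size_t gid = get_global_id(0);
    if (gid < n) {
        data[gid] *= factor;
    }
}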