VM Tracker shows large Dirty Size - iOS

There is a part of my app where I perform operations concurrently. They consist of initializing many CALayers and rendering them to bitmaps.
Unfortunately, during these operations (each takes about 2 seconds to complete on an iPhone 4), the Dirty Size indicated by VM Tracker spikes to ~120 MB, while Allocations spike to ~12 MB (and do not accumulate). From my understanding, the dirty size is memory that cannot be freed, so often my app and all other apps in the background get killed.
Incident Identifier: 7E6CBE04-D965-470D-A532-ADBA007F3433
CrashReporter Key: bf1c73769925cbff86345a576ae1e576728e5a11
Hardware Model: iPhone3,1
OS Version: iPhone OS 5.1.1 (9B206)
Kernel Version: Darwin Kernel Version 11.0.0: Sun Apr 8 21:51:26 PDT 2012; root:xnu-1878.11.10~1/RELEASE_ARM_S5L8930X
Date: 2013-03-18 19:44:51 +0800
Time since snapshot: 38 ms
Free pages: 1209
Active pages: 3216
Inactive pages: 1766
Throttled pages: 106500
Purgeable pages: 0
Wired pages: 16245
Largest process: Deja Dev
Processes
Name UUID Count resident pages
geod <976e1080853233b1856b13cbd81fdcc3> 338
LinkedIn <24325ddfeed33d4fb643030edcb12548> 6666 (jettisoned)
Music~iphone <a3a7a86202c93a6ebc65b8e149324218> 935
WhatsApp <a24567991f613aaebf6837379bbf3904> 2509
MobileMail <eed7992f4c1d3050a7fb5d04f1534030> 945
Console <9925a5bd367a7697038ca5a581d6ebdf> 926 (jettisoned)
Test Dev <c9b1db19bcf63a71a048031ed3e9a3f8> 81683 (active)
MobilePhone <8f3f3e982d9235acbff1e33881b0eb13> 867
debugserver <2408bf4540f63c55b656243d522df7b2> 92
networkd <80ba40030462385085b5b7e47601d48d> 158
notifyd <f6a9aa19d33c3962aad3a77571017958> 234
aosnotifyd <8cf4ef51f0c635dc920be1d4ad81b322> 438
BTServer <31e82dfa7ccd364fb8fcc650f6194790> 275
CommCenterClassi <041d4491826e3c6b911943eddf6aaac9> 722
SpringBoard <c74dc89dec1c3392b3f7ac891869644a> 5062 (active)
aggregated <a12fa71e6997362c83e0c23d8b4eb5b7> 383
apsd <e7a29f2034083510b5439c0fb5de7ef1> 530
configd <ee72b01d85c33a24b3548fa40fbe519c> 465
dataaccessd <473ff40f3bfd3f71b5e3b4335b2011ee> 871
fairplayd.N90 <ba38f6bb2c993377a221350ad32a419b> 169
fseventsd <914b28fa8f8a362fabcc47294380c81c> 331
iapd <0a747292a113307abb17216274976be5> 323
imagent <9c3a4f75d1303349a53fc6555ea25cd7> 536
locationd <cf31b0cddd2d3791a2bfcd6033c99045> 1197
mDNSResponder <86ccd4633a6c3c7caf44f51ce4aca96d> 201
mediaremoted <327f00bfc10b3820b4a74b9666b0c758> 257
mediaserverd <f03b746f09293fd39a6079c135e7ed00> 1351
lockdownd <b06de06b9f6939d3afc607b968841ab9> 279
powerd <133b7397f5603cf8bef209d4172d6c39> 173
syslogd <7153b590e0353520a19b74a14654eaaa> 178
wifid <3001cd0a61fe357d95f170247e5458f5> 319
UserEventAgent <dc32e6824fd33bf189b266102751314f> 409
launchd <5fec01c378a030a8bd23062689abb07f> 126
**End**
On closer inspection, the dirty memory consists mostly of Image IO and Core Animation pages, with multiple entries consisting of hundreds to thousands of pages each. What do Image IO and Core Animation do exactly, and how can I reduce the dirty memory?
Edit: I tried doing this on a serial queue, with no improvement in the size of the dirty memory.
Another question: how large is too large for dirty memory and allocations?
Updated:
- (void)render
{
    for (id thing in mylist) {
        @autoreleasepool {
            CALayer *layer = createLayerFromThing(thing);
            UIImage *img = [self renderLayer:layer];
            [self writeToDisk:img];
        }
    }
}
In createLayerFromThing(thing) I am actually creating a layer with a huge number of sublayers.
UPDATED
The first screenshot is for maxConcurrentOperationCount = 4, the second for maxConcurrentOperationCount = 1.
Since bringing down the number of concurrent operations barely made a dent, I decided to try maxConcurrentOperationCount = 10.

It's difficult to say what's going wrong without any details, but here are a few ideas.
A. Use @autoreleasepool. CALayers generate bitmaps in the background, which in aggregate can take up lots of space if they are not freed in time. If you are creating and rendering many layers, I suggest adding an autorelease pool block inside your rendering loop. This won't help if ALL your layers are nested and needed at the same time for rendering.
- (void)render
{
    for (id thing in mylist) {
        @autoreleasepool {
            CALayer *layer = createLayerFromThing(thing);
            [self renderLayer:layer];
        }
    }
}
B. If you are using CGBitmapContextCreate for rendering, are you calling the matching CGContextRelease? The same goes for any CGColorRef you create (CGColorRelease). See the sketch below.
C. If you are allocating memory with malloc or calloc, are you freeing it when done? One way to ensure this is to pair each allocation with its matching free in the same scope.
Post the code for the rendering loop to provide more context.
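To illustrate point B, here is a minimal sketch of what a leak-free -renderLayer: might look like, with every Core Graphics object explicitly released. The method name, pixel format, and sizing are assumptions for illustration, not the asker's actual code:
- (UIImage *)renderLayer:(CALayer *)layer
{
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGContextRef context = CGBitmapContextCreate(NULL,
                                                 (size_t)layer.bounds.size.width,
                                                 (size_t)layer.bounds.size.height,
                                                 8,  // bits per component
                                                 0,  // bytes per row: let Core Graphics compute it
                                                 colorSpace,
                                                 kCGImageAlphaPremultipliedLast);
    CGColorSpaceRelease(colorSpace);   // matching release for the color space

    [layer renderInContext:context];

    CGImageRef cgImage = CGBitmapContextCreateImage(context);
    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);           // matching release for the image
    CGContextRelease(context);         // matching release for the context
    return image;
}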

I believe there are two possibilities here:
1. The items you create are not autoreleased.
2. The memory you are taking is what it is, due to the number of concurrent operations you are doing.
In the first case the solution is simple: send an autorelease message to the layers and images upon their creation.
In the second case, you could limit the number of concurrent operations by using an NSOperationQueue. Operation queues have a property called maxConcurrentOperationCount. I would try a value of 4 and see how the memory behavior changes from what you currently have. Of course, you might have to try different values to get the right memory vs. performance balance; a sketch follows below.
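A minimal sketch of that approach, reusing the asker's names (mylist, createLayerFromThing, renderLayer: and writeToDisk: are assumptions about the surrounding code):
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
queue.maxConcurrentOperationCount = 4; // lower for less peak memory, higher for more throughput

for (id thing in mylist) {
    [queue addOperationWithBlock:^{
        @autoreleasepool {
            CALayer *layer = createLayerFromThing(thing);
            UIImage *img = [self renderLayer:layer];
            [self writeToDisk:img];
        }
    }];
}
[queue waitUntilAllOperationsAreFinished];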

Autorelease will wait till the end of the run loop to clean up. If you release explicitly, the object can be taken off the heap immediately, without filling the pool.
- (void)render
{
    for (id thing in mylist) {
        CALayer *layer = createLayerFromThing(thing); // assuming this returns a retained layer
        [self renderLayer:layer];
        [layer release]; // this layer is no longer needed (manual reference counting only)
    }
}
Also run the build with Analyze and see if you have leaks, and fix them too.


Ceph cluster down, Reason OSD Full - not starting up

Cephadm Pacific v16.2.7
Our Ceph cluster is stuck: PGs are degraded and OSDs are down.
Reason: the OSDs got filled up.
Things we tried:
Changed the values to the maximum possible combination (not sure if done right?):
backfillfull < nearfull, nearfull < full, and full < failsafe_full
ceph-objectstore-tool: tried to delete some PGs to recover space
Tried to mount an OSD and delete PGs to recover some space, but not sure how to do it in BlueStore.
Global Recovery Event: stuck forever
ceph -s
cluster:
id: a089a4b8-2691-11ec-849f-07cde9cd0b53
health: HEALTH_WARN
6 failed cephadm daemon(s)
1 hosts fail cephadm check
Reduced data availability: 362 pgs inactive, 6 pgs down, 287 pgs peering, 48 pgs stale
Degraded data redundancy: 5756984/22174447 objects degraded (25.962%), 91 pgs degraded, 84 pgs undersized
13 daemons have recently crashed
3 slow ops, oldest one blocked for 31 sec, daemons [mon.raspi4-8g-18,mon.raspi4-8g-20] have slow ops.
services:
mon: 5 daemons, quorum raspi4-8g-20,raspi4-8g-25,raspi4-8g-18,raspi4-8g-10,raspi4-4g-23 (age 2s)
mgr: raspi4-8g-18.slyftn(active, since 3h), standbys: raspi4-8g-12.xuuxmp, raspi4-8g-10.udbcyy
osd: 19 osds: 15 up (since 2h), 15 in (since 2h); 6 remapped pgs
data:
pools: 40 pools, 636 pgs
objects: 4.28M objects, 4.9 TiB
usage: 6.1 TiB used, 45 TiB / 51 TiB avail
pgs: 56.918% pgs not active
5756984/22174447 objects degraded (25.962%)
2914/22174447 objects misplaced (0.013%)
253 peering
218 active+clean
57 undersized+degraded+peered
25 stale+peering
20 stale+active+clean
19 active+recovery_wait+undersized+degraded+remapped
10 active+recovery_wait+degraded
7 remapped+peering
7 activating
6 down
2 active+undersized+remapped
2 stale+remapped+peering
2 undersized+remapped+peered
2 activating+degraded
1 active+remapped+backfill_wait
1 active+recovering+undersized+degraded+remapped
1 undersized+peered
1 active+clean+scrubbing+deep
1 active+undersized+degraded+remapped+backfill_wait
1 stale+active+recovery_wait+undersized+degraded+remapped
progress:
Global Recovery Event (2h)
[==========..................] (remaining: 4h)
Some versions of BlueStore were susceptible to the BlueFS log growing extremely large, beyond the point of making booting the OSD possible. This state is indicated by booting that takes very long and fails in the _replay function.
It is advised to first check whether the rescue process would be successful:
ceph-bluestore-tool fsck --path <osd path> --bluefs_replay_recovery=true --bluefs_replay_recovery_disable_compact=true
If the above fsck is successful, the fix procedure can be applied:
ceph-bluestore-tool fsck --path <osd path> --bluefs_replay_recovery=true
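For example, with a traditional OSD data directory (the path below is an assumption; under cephadm the OSD directories live under /var/lib/ceph/<fsid>/ instead, and the OSD daemon must be stopped first):
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0 --bluefs_replay_recovery=true --bluefs_replay_recovery_disable_compact=true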
Special thank you: this has been solved with the help of a dewDrive Cloud backup faculty member.

How to debug leak in native memory on JVM?

We have a Java application running on Mule. We have the -Xmx value configured for 6144M, but we routinely see the overall memory usage climb and climb. It was getting close to 20 GB the other day before we proactively restarted it.
Thu Jun 30 03:05:57 CDT 2016
top - 03:05:58 up 149 days, 6:19, 0 users, load average: 0.04, 0.04, 0.00
Tasks: 164 total, 1 running, 163 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.2%us, 1.7%sy, 0.0%ni, 93.9%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 24600552k total, 21654876k used, 2945676k free, 440828k buffers
Swap: 2097144k total, 84256k used, 2012888k free, 1047316k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3840 myuser 20 0 23.9g 18g 53m S 0.0 79.9 375:30.02 java
The jps command shows:
10671 Jps
3840 MuleContainerBootstrap
The jstat command shows:
S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
37376.0 36864.0 16160.0 0.0 2022912.0 1941418.4 4194304.0 445432.2 78336.0 66776.7 232 7.044 17 17.403 24.447
The startup arguments are (sensitive bits have been changed):
3840 MuleContainerBootstrap -Dmule.home=/mule -Dmule.base=/mule -Djava.net.preferIPv4Stack=TRUE -XX:MaxPermSize=256m -Djava.endorsed.dirs=/mule/lib/endorsed -XX:+HeapDumpOnOutOfMemoryError -Dmyapp.lib.path=/datalake/app/ext_lib/ -DTARGET_ENV=prod -Djava.library.path=/opt/mapr/lib -DksPass=mypass -DsecretKey=aeskey -DencryptMode=AES -Dkeystore=/mule/myStore -DkeystoreInstance=JCEKS -Djava.security.auth.login.config=/opt/mapr/conf/mapr.login.conf -Dmule.mmc.bind.port=1521 -Xms6144m -Xmx6144m -Djava.library.path=%LD_LIBRARY_PATH%:/mule/lib/boot -Dwrapper.key=a_guid -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.disable_console_input=TRUE -Dwrapper.pid=10744 -Dwrapper.version=3.5.19-st -Dwrapper.native_library=wrapper -Dwrapper.arch=x86 -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 -Dwrapper.lang.domain=wrapper -Dwrapper.lang.folder=../lang
Adding up the "capacity" items from jstat shows that only my 6144m is being used for the Java heap. Where the heck is the rest of the memory being used? Stack memory? Native heap? I'm not even sure how to proceed.
If left to continue growing, it will consume all memory on the system and we will eventually see the system freeze up throwing swap space errors.
I have another process that is starting to grow. Currently at about 11g resident memory.
pmap 10746 > pmap_10746.txt
cat pmap_10746.txt | grep anon | cut -c18-25 | sort -h | uniq -c | sort -rn | less
Top 10 entries by count:
119 12K
112 1016K
56 4K
38 131072K
20 65532K
15 131068K
14 65536K
10 132K
8 65404K
7 128K
Top 10 entries by allocation size:
1 6291456K
1 205816K
1 155648K
38 131072K
15 131068K
1 108772K
1 71680K
14 65536K
20 65532K
1 65512K
And top 10 by total size:
Count Size Aggregate
1 6291456K 6291456K
38 131072K 4980736K
15 131068K 1966020K
20 65532K 1310640K
14 65536K 917504K
8 65404K 523232K
1 205816K 205816K
1 155648K 155648K
112 1016K 113792K
This seems to be telling me that because the Xmx and Xms are set to the same value, there is a single allocation of 6291456K for the java heap. Other allocations are NOT java heap memory. What are they? They are getting allocated in rather large chunks.
Expanding a bit more on Peter's answer.
You can take a binary heap dump from within VisualVM (right click on the process in the left-hand side list, and then on heap dump - it'll appear right below shortly after). If you can't attach VisualVM to your JVM, you can also generate the dump with this:
jmap -dump:format=b,file=heap.hprof $PID
Then copy the file and open it with Visual VM (File, Load, select type heap dump, find the file.)
As Peter notes, a likely cause for the leak may be non-collected DirectByteBuffers (e.g. some instance of another class is not properly de-referencing buffers, so they are never GC'd).
To identify where these references are coming from, you can use VisualVM to examine the heap and find all instances of DirectByteBuffer in the "Classes" tab: find the DBB class, right-click, and go to the instances view.
This will give you a list of instances. You can click on one and see who's keeping a reference to each one:
Note the bottom pane: we have a "referent" of type Cleaner and 2 "mybuffer". These would be properties in other classes that are referencing the instance of DirectByteBuffer we drilled into (it should be OK if you ignore the Cleaner and focus on the others).
From this point on you need to proceed based on your application.
Another equivalent way to get the list of DBB instances is from the OQL tab. This query:
select x from java.nio.DirectByteBuffer x
This gives us the same list as before. The benefit of using OQL is that you can execute more complex queries. For example, this gets all the instances that are keeping a reference to a DirectByteBuffer:
select referrers(x) from java.nio.DirectByteBuffer x
What you can do is take a heap dump and look for objects which are storing data off heap, such as ByteBuffers. Those objects will appear small but are a proxy for larger off-heap memory areas. See if you can determine why lots of those might be retained.
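For illustration, a minimal Java sketch of that retention pattern (the class names are hypothetical, not from the application in question). Each small on-heap object pins a large native allocation, so the heap looks modest while resident memory balloons:
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class DirectLeakDemo {
    // Each CachedBlock looks tiny in a heap dump, but pins 16 MB of
    // native (off-heap) memory through its direct buffer.
    static class CachedBlock {
        final ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
    }

    // While this list keeps growing, the DirectByteBuffers stay reachable
    // and their native memory is never released by the GC.
    static final List<CachedBlock> cache = new ArrayList<>();

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            cache.add(new CachedBlock()); // resident memory grows ~16 MB per pass
        }
        // Watch RSS in top/pmap grow well beyond the Java heap; with default
        // limits this eventually throws OutOfMemoryError: Direct buffer memory.
    }
}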

How to reduce Ipython parallel memory usage

I'm using IPython parallel in an optimisation algorithm that loops a large number of times. Parallelism is invoked in the loop using the map method of a LoadBalancedView (twice), a DirectView's dictionary interface and an invocation of a %px magic. I'm running the algorithm in an IPython notebook.
I find that the memory consumed by both the kernel running the algorithm and one of the controllers increases steadily over time, limiting the number of loops I can execute (since available memory is limited).
Using heapy, I profiled memory use after a run of about 38 thousand loops:
Partition of a set of 98385344 objects. Total size = 18016840352 bytes.
Index Count % Size % Cumulative % Kind (class / dict of class)
0 5059553 5 9269101096 51 9269101096 51 IPython.parallel.client.client.Metadata
1 19795077 20 2915510312 16 12184611408 68 list
2 24030949 24 1641114880 9 13825726288 77 str
3 5062764 5 1424092704 8 15249818992 85 dict (no owner)
4 20238219 21 971434512 5 16221253504 90 datetime.datetime
5 401177 0 426782056 2 16648035560 92 scipy.optimize.optimize.OptimizeResult
6 3 0 402654816 2 17050690376 95 collections.defaultdict
7 4359721 4 323814160 2 17374504536 96 tuple
8 8166865 8 196004760 1 17570509296 98 numpy.float64
9 5488027 6 131712648 1 17702221944 98 int
<1582 more rows. Type e.g. '_.more' to view.>
You can see that about half the memory is used by IPython.parallel.client.client.Metadata instances. A good indicator that results from the map invocations are being cached is the 401177 OptimizeResult instances, the same number as the number of optimize invocations via lbview.map; I am not caching them in my code.
Is there a way I can control this memory usage on both the kernel and the IPython parallel controller (whose memory consumption is comparable to the kernel's)?
IPython parallel clients and controllers store results and other metadata from past transactions.
The IPython.parallel.Client class provides a method for clearing this data:
Client.purge_everything()
documented here. There are also purge_results() and purge_local_results() methods that give you some control over what gets purged; a sketch follows below.
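A minimal sketch of purging periodically from inside the loop (my_objective is a hypothetical stand-in for the real optimisation function):
from IPython.parallel import Client

def my_objective(x):  # hypothetical stand-in, shipped to the engines
    return x * x

rc = Client()
lbview = rc.load_balanced_view()

results = []
for i in range(38000):  # stands in for the optimisation loop
    ar = lbview.map(my_objective, range(10))
    results.append(sum(ar.get()))

    # Every 100 iterations, drop the cached results and Metadata held by
    # the client and the controller so they do not accumulate.
    if i % 100 == 0:
        rc.purge_everything()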

Logical Addresses & Page numbers

I just started learning memory management and have an idea of pages, frames, virtual memory and so on, but I'm not understanding the procedure for changing logical addresses to their corresponding page numbers.
Here is the scenario:
Page size = 100 words / 8000 bits?
The process generates these logical addresses:
10 11 104 170 73 309 185 245 246 434 458 364
The process takes up two page frames, and none of its pages are resident (in page frames) when the process begins execution.
Determine the page number corresponding to each logical address and fill them into a table with one row and 12 columns.
I know the answer is:
0 0 1 1 0 3 1 2 2 4 4 3
But can someone explain how this is done? Is there an equation or something? I remember seeing something with a table and changing things to binary and putting them in the page table, like 00100 in page 1, but I am not really sure. Graphical representations of how this works would be more than appreciated. Thanks
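For reference, the page number is just the logical address divided by the page size using integer division (the remainder is the offset within the page). A quick Python sketch that reproduces the answer above:
page_size = 100  # words per page
addresses = [10, 11, 104, 170, 73, 309, 185, 245, 246, 434, 458, 364]

# page number = logical address // page size; addr % page_size is the offset
pages = [addr // page_size for addr in addresses]
print(pages)  # [0, 0, 1, 1, 0, 3, 1, 2, 2, 4, 4, 3]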

NSMutableArray causes memory buildup, not Autoreleasing with ARC

I created an array using the following code. After the 12 images are no longer needed, I set imageArray to nil and reload a new set of images into this array. When I run the app in Instruments I can see a memory buildup issue. I also ran heapshots, and they show 12 images still lingering even after I set the array to nil. I also tried to init this array in its own autorelease pool, thinking it was somehow created on a separate thread below the main thread. That did not work either. Any ideas?
ViewController.h
@property (strong, nonatomic) NSMutableArray *imageArray;
ViewController.m
- (void)firstSetOfImages {
    imageArray = [[NSMutableArray alloc] initWithObjects:wordImage1, wordImage2, wordImage3, wordImage4, wordImage5, wordImage6, wordImage7, wordImage8, wordImage9, wordImage10, wordImage11, wordImage12, nil];
}
- (void)clearImages {
    [self setImageArray:nil];
    [imageArray removeAllObjects];
    [self secondSetOfImages];
}
- (void)secondSetOfImages {
    imageArray = [[NSMutableArray alloc] initWithObjects:wordImage1, wordImage2, wordImage3, wordImage4, wordImage5, wordImage6, wordImage7, wordImage8, wordImage9, wordImage10, wordImage11, wordImage12, nil];
}
Here is an example of one heapshot, taken between the loading of the first set of 12 images and the second set of 12 images.
Snapshot Timestamp Heap Growth # Persistent
Heapshot 3 00:39.457.174 53.02 KB 800
< non-object > 26.05 KB 277
UIImageView 3.38 KB 36
CFDictionary (mutable) 3.38 KB 72
CFBasicHash (key-store) 2.83 KB 73
CFBasicHash (value-store) 2.83 KB 73
NSPathStore2 2.25 KB 12
CGImageReadRef 1.88 KB 12
CALayer 1.69 KB 36
CGImage 1.62 KB 13
CFNumber 1.31 KB 84
CGImagePlus 1.31 KB 12
CFData 1.12 KB 24
CGImageProvider 768 Bytes 12
CGDataProvider 720 Bytes 5
UIImage 576 Bytes 12
CFString (immutable) 416 Bytes 13
CFArray (mutable-variable) 384 Bytes 12
CGImageReadSessionRef 192 Bytes 12
_UIImageViewExtendedStorage 192 Bytes 4
__NSArrayM 160 Bytes 5
CFDictionary (immutable) 48 Bytes 1
EDIT:
I modified the code and made the array an ivar. I took another sample of Allocations in Instruments. Below is a more detailed display of my heapshots. I took a heapshot every time I reset my array with 12 new images. Every heapshot has a heap growth of about 35 KB.
Snapshot Timestamp Heap Growth # Persistent
Heapshot 4 00:58.581.296 35.63 KB 680
< non-object > 13.02 KB 220
CFDictionary (mutable) 3.38 KB 72
CFBasicHash (key-store) 2.81 KB 72
CFBasicHash (value-store) 2.81 KB 72
NSPathStore2 2.28 KB 12
CGImageReadRef 1.88 KB 12
CGImage 1.50 KB 12
CFNumber 1.31 KB 84
CGImagePlus 1.31 KB 12
CFData 1.12 KB 24
UIImageView 1.12 KB 12
CGImageProvider 768 Bytes 12
UIImage 576 Bytes 12
CALayer 576 Bytes 12
CFString (immutable) 384 Bytes 12
CFArray (mutable-variable) 384 Bytes 12
CGImageReadSessionRef 192 Bytes 12
CGDataProvider 144 Bytes 1
_UIImageViewExtendedStorage 96 Bytes 2
__NSArrayM 32 Bytes 1
Here is the stack trace of one of those persistent UIImage items. It doesn't point to a specific line of code that created it. Not sure where to go from here?
24 FourGameCenter 0x4b4bf
23 FourGameCenter 0x4b538
22 UIKit UIApplicationMain
21 GraphicsServices GSEventRunModal
20 CoreFoundation CFRunLoopRunInMode
19 CoreFoundation CFRunLoopRunSpecific
18 CoreFoundation __CFRunLoopRun
17 CoreFoundation __CFRunLoopDoSource1
16 CoreFoundation __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__
15 GraphicsServices PurpleEventCallback
14 GraphicsServices _PurpleEventCallback
13 UIKit _UIApplicationHandleEvent
12 UIKit -[UIApplication sendEvent:]
11 UIKit -[UIWindow _sendTouchesForEvent:]
10 UIKit -[UIControl touchesEnded:withEvent:]
9 UIKit -[UIControl(Internal) _sendActionsForEvents:withEvent:]
8 UIKit -[UIControl sendAction:to:forEvent:]
7 UIKit -[UIApplication sendAction:toTarget:fromSender:forEvent:]
6 UIKit -[UIApplication sendAction:to:from:forEvent:]
5 FourGameCenter 0x8d7fa
4 FourGameCenter 0x71830
3 FourGameCenter 0x797e6
2 libobjc.A.dylib _objc_rootAllocWithZone
1 libobjc.A.dylib class_createInstance
0 libsystem_c.dylib calloc
You can't do the following in clearImages:
[self setImageArray:nil];
[imageArray removeAllObjects];
In the snippet above, you've just set imageArray to nil. You can't then send the nil object a removeAllObjects message: it'll just silently do nothing.
You need to reorder your lines to:
[imageArray removeAllObjects];
[self setImageArray:nil];
[self setImageArray:nil]
will remove the image objects automatically, so there is no need to call removeAllObjects.
However, if you are using those images in UIImageViews elsewhere, and those UIImageViews are retained or used in a view elsewhere, you will need to release those/remove the UIImageViews from their superview as well:
[imageView removeFromSuperview];
Or if you have the UIImageViews in an array:
[imageViewArray makeObjectsPerformSelector:@selector(removeFromSuperview)];
You are mixing metaphors here: you declare a property but set it like an ivar, and it's possible that creates a leak. Also, if you release the array, you don't have to release every object; it will do that for you.
So, use an ivar:
@implementation MyClass
{
    NSMutableArray *imageArray;
}
Now you can set it to new arrays, and for sure the old array is released. If you want to release the array completely, just set it to nil.
Alternately, if you want to use properties, then when you change the object, use:
self.imageArray = ...;
Now, a second problem: your code accesses imageArray directly (no leading "_"). Did you synthesize the ivar using the same name? This is probably not a good idea, as it just confuses people when they read your code. If you just drop the @synthesize, you get the ivar "for free" as the property name with a "_" prepended. You can directly access that ivar, but anyone reading your code will know it's attached to a property.
EDIT: You obviously are not releasing the objects you think you are. Your problem is quite similar to this one. What you need to do is as follows:
1) Create a UIImage subclass, and add a log statement in the dealloc (see the sketch below). Instead of creating UIImages, you are going to create MyImages (the subclass), which means you'll need your subclass .h file anywhere you want to use it (or put it in the .pch file while debugging). Do this first to see if the images get released. If not, you know your problem.
2) Subclass NSArray (NOT NSMutableArray; note you don't really need mutable arrays in the code you show above). If there are reasons that you must use mutable arrays, then when you set the ivar, use [MyArray arrayWithArray:theMutableArray] first. Log the dealloc.
Now run your tests. Either the array or the images are not getting released. Once you know which, you can track down the retain issue.
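A minimal sketch of step 1; the class name MyImage comes from the steps above, and the log wording is just an illustration:
// MyImage.h
#import <UIKit/UIKit.h>

@interface MyImage : UIImage
@end

// MyImage.m
#import "MyImage.h"

@implementation MyImage
- (void)dealloc
{
    // Fires only when the image is actually released; if you never see
    // this log, something is still retaining the image. (Under ARC, do
    // not call [super dealloc]; it is invoked automatically.)
    NSLog(@"MyImage dealloc: %@", self);
}
@end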
There is an old saying (I'll butcher it), but it goes something like: when you try everything that can possibly be the cause of a problem and that doesn't fix it, then it has to be something you assumed is NOT the problem that is actually causing it.
Check to see if you have zombies turned on under the scheme's Diagnostics tab. Even though zombies are unlikely under ARC, the project might still have this turned on, which makes checking for memory leaks look like everything is leaking.
