Ideal time to invalidate cached data in an iOS app

So I'm working on an iOS app which uses Core Data as a local, offline store for data that lives on a remote server, much like the way Mail.app keeps your most recent n messages. Right now the app quite naively stores all of this data without ever deleting the old data.
My question is this: what's the best point in an iOS app's lifecycle to take care of tasks like removing cached data? I already know how I'm going to delete the old data, but doing so is an expensive operation, so what I really want to know is when the best time to perform it is.

Doing it when the application goes into the background is a good time, if that works for you. If the cleanup can take around 10 seconds or more, though, be sure to start a background task so the system gives you a bit more time to run before suspending the app.
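For illustration, here's a minimal sketch of that pattern, assuming a hypothetical purgeOldCachedData() routine that does the actual deletion; you would call it from applicationDidEnterBackground(_:) or the scene equivalent:

```swift
import UIKit

func purgeOldCachedData() { /* hypothetical: delete old rows from the local store */ }

// Wrapping the work in a background task asks the system for extra time after
// the app moves to the background; always end the task when the work finishes.
func purgeCacheInBackground(_ application: UIApplication = .shared) {
    var taskID: UIBackgroundTaskIdentifier = .invalid
    taskID = application.beginBackgroundTask(withName: "PurgeCache") {
        // Expiration handler: time is up, end the task so the app isn't terminated.
        application.endBackgroundTask(taskID)
        taskID = .invalid
    }

    DispatchQueue.global(qos: .utility).async {
        purgeOldCachedData()                    // hypothetical, expensive cleanup
        application.endBackgroundTask(taskID)   // always balance beginBackgroundTask
        taskID = .invalid
    }
}
```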

You can run this operation in the background with GCD or NSOperationQueue. I would do it after getting new data from the server: delete the old cache, then build the new one. If you move the expensive operation to the background (threads, blocks, NSOperation, or whatever you prefer), it's better to use a child NSManagedObjectContext and merge its changes back for synchronization.
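As a rough sketch of that idea (using NSPersistentContainer's performBackgroundTask and a batch delete rather than a manually created child context, and assuming a hypothetical Message entity with a cachedAt attribute):

```swift
import CoreData

// Minimal sketch: "Message" and "cachedAt" are assumed names for your cached entity.
func purgeOldMessages(olderThan cutoff: Date, in container: NSPersistentContainer) {
    container.performBackgroundTask { context in
        let fetch = NSFetchRequest<NSFetchRequestResult>(entityName: "Message")
        fetch.predicate = NSPredicate(format: "cachedAt < %@", cutoff as NSDate)

        let batchDelete = NSBatchDeleteRequest(fetchRequest: fetch)
        batchDelete.resultType = .resultTypeObjectIDs

        do {
            let result = try context.execute(batchDelete) as? NSBatchDeleteResult
            // Merge the deleted object IDs into the view context so any
            // fetched results controllers pick up the change.
            if let objectIDs = result?.result as? [NSManagedObjectID] {
                NSManagedObjectContext.mergeChanges(
                    fromRemoteContextSave: [NSDeletedObjectsKey: objectIDs],
                    into: [container.viewContext])
            }
        } catch {
            print("Cache purge failed: \(error)")
        }
    }
}
```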

Related

Delay in Core Data to NSPersistentCloudKitContainer synchronization

I am having an issue with my CoreData to iCloud synchronization with NSPersistentCloudKitContainer.
The synchronization works, but when I do a fresh install of the app there is an annoying delay of several seconds between app launch and the end of synchronization. I need to decide at launch whether to create new data entities or use the "old" data from iCloud.
I could live with the delay and wait for the sync to finish if there was a way to
a) determine at launch that there is data in iCloud to be synchronized and
b) get a notification when synchronization is finally done
Does anyone know of a solution to achieve this? Setting
NSPersistentStoreRemoteChangeNotificationPostOptionKey does not help much, as it is fired several times during sync and does not give any status information.
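For reference, here is a sketch of how that option is typically enabled (the model name and observer handling are placeholders); it also shows why the notification alone is of limited use, since it carries no progress information:

```swift
import CoreData

// Minimal sketch: "Model" is a placeholder for your data model name.
let container = NSPersistentCloudKitContainer(name: "Model")
container.persistentStoreDescriptions.first?.setOption(
    true as NSNumber,
    forKey: NSPersistentStoreRemoteChangeNotificationPostOptionKey)

container.loadPersistentStores { _, error in
    if let error = error { fatalError("Failed to load store: \(error)") }
}

// This fires repeatedly during a sync and provides no status or completion information.
let remoteChangeObserver = NotificationCenter.default.addObserver(
    forName: .NSPersistentStoreRemoteChange,
    object: container.persistentStoreCoordinator,
    queue: .main) { _ in
    // React to remote changes here; keep remoteChangeObserver alive while observing.
}
```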
At the risk of providing you the most unhelpful answer possible, I don't think it is possible and, even if it is, I think you are battling against the philosophy of NSPersistentCloudKitContainer.
With NSPersistentCloudKitContainer, you should assume that synchronisation happens at irregular, erratic intervals or not at all. It is supposed to operate seamlessly, in the background, with nothing for you to worry about. You shouldn't try to speculate in your code when it happens or if it happens.
It is very similar to taking a photo on your iPhone and then it taking several seconds for that photo to appear on your iMac. The iCloud sync decides if and when that sync will take place.
I know this is not helpful, but I thought you should be aware of this perspective.

Download multiple files with operation queue not stable in background mode

Currently what I want to achieve is to download files from an array, one file at a time, and have the downloads keep going even when the app goes into the background.
I'm using Rob's code as described here, but he's using URLSessionConfiguration.default, whereas I want to use URLSessionConfiguration.background(withIdentifier: "uniqueID") instead.
It did work on the first try, but once the app went to the background everything became chaotic: the operations started downloading more than one file at a time, and no longer in order.
Is there any solution to this, or what should I use instead to achieve what I want? (On Android we have a Service to handle this easily.)
The whole idea of wrapping requests in operations is only applicable while the app is active/running. It's great for things like constraining the degree of concurrency for foreground requests, managing dependencies, etc.
For background session that continues to proceed after the app has been suspended, though, none of that is relevant. You create your request, hand it to the background session to manage, and monitor the delegate methods called for your background session. No operations needed/desired. Remember, these requests will be handled by the background session daemon even if your app is suspended (or if it terminated in the course of its normal lifecycle, though not if you force quit it). So the whole idea of operations, operation queues, etc., just doesn’t make sense if the background URLSession daemon is handling the requests and your app isn’t active.
See https://stackoverflow.com/a/44140059/1271826 for an example of a background session.
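As a rough sketch of the shape this takes (the session identifier and class name are placeholders, and a complete implementation also needs the app delegate's background-session event handling):

```swift
import Foundation

// Minimal sketch: hand every request to a background URLSession and let the
// daemon schedule them; results arrive via delegate callbacks, possibly after
// the app has been relaunched in the background.
final class BackgroundDownloader: NSObject, URLSessionDownloadDelegate {
    static let shared = BackgroundDownloader()

    private lazy var session: URLSession = {
        let config = URLSessionConfiguration.background(withIdentifier: "com.example.downloads")
        return URLSession(configuration: config, delegate: self, delegateQueue: nil)
    }()

    // Enqueue everything up front; no operation queue involved.
    func enqueue(_ urls: [URL]) {
        for url in urls {
            session.downloadTask(with: url).resume()
        }
    }

    func urlSession(_ session: URLSession,
                    downloadTask: URLSessionDownloadTask,
                    didFinishDownloadingTo location: URL) {
        // Move the file out of its temporary location before returning.
        let destination = FileManager.default.temporaryDirectory
            .appendingPathComponent(location.lastPathComponent)
        try? FileManager.default.moveItem(at: location, to: destination)
    }
}
```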
By the way, true background sessions are really useful when downloading very large resources that might take a very long time. But they introduce all sorts of complexities (e.g., you often want to debug and diagnose while not connected to the Xcode debugger, since being attached changes your app lifecycle, so you have to resort to mechanisms like unified logging; you need to figure out how to restore the UI if the app was terminated between the time the requests were initiated and when they finished; etc.).
Because of this complexity, you might want to consider whether a background session is absolutely needed. Sometimes, if you need less than 30 seconds to complete the requests, it's easier to just ask the OS to keep your app running in the background for a little bit after the user leaves the app, and use a standard URLSession. For more information, see Extending Your App's Background Execution Time. It's a much easier solution, bypassing many background URLSession hassles, but it only works if you need 30 seconds or less. For larger requests that might exceed this small window, a true background URLSession is needed.
Below, you asked:
There are some downsides with [downloading multiple files in parallel], as I understand it.
No, it's always better to allow downloads to progress asynchronously and in parallel. It's much faster and more efficient. The only time you want to do requests consecutively, one after another, is when you need to parse the response of one request in order to prepare the next request. But that is not the case here.
The exception here is with the default, foreground URLSession. In that case you have to worry about later requests timing out while waiting for earlier ones. In that scenario you might bump up the timeout interval, or wrap the requests in an Operation subclass, which lets you constrain how many requests run concurrently and hold back subsequent requests until earlier ones finish. But even then we don't usually run them serially; we use a maxConcurrentOperationCount of 4 or something like that.
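A deliberately simplified sketch of that foreground approach (it parks each operation on a semaphore until its request finishes, rather than the asynchronous Operation subclass described in the linked answer):

```swift
import Foundation

// Simplified sketch: each BlockOperation blocks its worker thread until the
// request completes, which is acceptable for a modest number of requests.
let downloadQueue = OperationQueue()
downloadQueue.maxConcurrentOperationCount = 4    // at most 4 requests in flight

func enqueueForegroundDownloads(_ urls: [URL], session: URLSession = .shared) {
    for url in urls {
        downloadQueue.addOperation {
            let semaphore = DispatchSemaphore(value: 0)
            session.downloadTask(with: url) { location, response, error in
                // Handle the downloaded file or error here.
                semaphore.signal()
            }.resume()
            semaphore.wait()    // keep the operation alive until the task completes
        }
    }
}
```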
But for background sessions, requests don’t time out just because the background daemon hasn’t gotten around to them yet. Just add your requests to the background URLSession and let the OS handle this for you. You definitely don’t want to download images one at a time, with the background daemon relaunching your app in the background when one download is done so you can initiate the next one. That would be very inefficient (both in terms of the user’s battery as well as speed).
You need to loop over an array of files and add each one to the session to download, but they download asynchronously, so it's hard to keep track of them, especially since there are a lot of files.
Sure, you can’t do a naive “add to the end of array” if the requests are running in parallel, because you’re not guaranteed the order that they will complete. But it’s not hard to capture these responses as they come in. Just use a dictionary for example, perhaps keyed by the URL of the original request. Then you can easily look up in that dictionary to find the response associated with a particular request URL.
It's incredibly simple, and we can now perform requests in parallel, which is much faster and more efficient.
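A minimal sketch of that bookkeeping, keyed by request URL (shown with ordinary data tasks for brevity; with a background session you would do the same thing inside the delegate callbacks):

```swift
import Foundation

// Minimal sketch: results are stored in a dictionary keyed by the request URL,
// so the order in which downloads complete doesn't matter.
final class DownloadCollector {
    private var results: [URL: Data] = [:]                 // request URL -> response body
    private let queue = DispatchQueue(label: "collector")  // serializes dictionary access

    func download(_ urls: [URL], completion: @escaping ([URL: Data]) -> Void) {
        let group = DispatchGroup()
        for url in urls {
            group.enter()
            URLSession.shared.dataTask(with: url) { data, _, _ in
                if let data = data {
                    self.queue.sync { self.results[url] = data }
                }
                group.leave()
            }.resume()
        }
        group.notify(queue: .main) { completion(self.results) }
    }
}
```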
You go on to say:
[Downloading in parallel] could lead to high battery consumption with a lot of requests at the same time. That's why I tried to download each file one at a time.
No, you never need to perform downloads one at a time for the sake of power. If anything, downloading one at a time is slower, and will take more power.
Unrelated: if you're downloading 800+ files, you might want to avoid performing these requests when the user is in Low Data Mode. In iOS 13, for example, you might set allowsExpensiveNetworkAccess and allowsConstrainedNetworkAccess.
Regardless (and especially if you are supporting older iOS versions), you might also want to consider the appropriate isDiscretionary and allowsCellularAccess settings.
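For illustration, a sketch of where those settings live on URLSessionConfiguration (the identifier is a placeholder, and which flags you set depends on your app's needs):

```swift
import Foundation

// Minimal sketch of the configuration flags discussed above.
let config = URLSessionConfiguration.background(withIdentifier: "com.example.downloads")

if #available(iOS 13.0, *) {
    config.allowsExpensiveNetworkAccess = false     // avoid personal hotspots and similar
    config.allowsConstrainedNetworkAccess = false   // respect Low Data Mode
}
config.allowsCellularAccess = false                 // Wi-Fi only
config.isDiscretionary = true                       // let the system pick a good time

let session = URLSession(configuration: config)
```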
Bottom line, you want to be respectful of a user's limited cellular data plan, or of the fact that they may be on some expensive service (e.g., an airplane's data plan or a tethered local hotspot).
For more information on these considerations, see WWDC 2019 Advances in Networking, Part 1.

Realm file size in iOS app

I have an app that uses Realm as a staging database. It receives information from a bluetooth device, processes it, and sends the processed result to a server.
The incoming data from Bluetooth gets stored in a Realm table (table1). A separate thread reads data from the Realm database, processes it, and stores the result in a second table (table2) for uploading to a server. Once it has pulled that data and successfully processed it, it deletes it from table1.
A third thread pulls data from table2 and, when it has been successfully sent, removes it from table2.
I'm using a database here in case, for whatever reason, the app is killed - data won't be lost... it will just resume where it left off when the app is restarted. But as you can see, the database is not something that hangs around (it's not like an address book or something... it is just temporary staging)
What I notice is that no matter what the heck I do, the Realm database file just keeps growing over time. I'll end up with a database that, if I open it, has only one record in it, but the file on disk could be tens of MB in size if the app has been running long enough.
Data is being processed on different background queues so as not to block any UX (one of the reasons I'm using Realm instead of Core Data). But I'm using things like autoreleasepool blocks and invalidate() to avoid keeping copies of objects that have been read (as suggested by many Realm questions/answers).
What gives? I know I don't have a code sample here, but this just seems like a basic garbage collection problem in Realm. I've seen other questions related to this where people are like "why is my database so huge", and the answers suggest doing things like "writeCopyToPath", but that feels like an incredible hack, and regardless, it would be very difficult - this app is meant to be constantly connected and monitoring a bluetooth device, so to do this, it would mean stopping, making sure all threads that might alter the database are quiesced, doing the copy to compact the db, and then starting everything back up again. That just seems nonsensical to me. I might interrupt user operations for example. I don't want a user to not be able to do something because I decided it was time to do database maintenance.
I feel like I'm either missing some incredibly fundamental point in how to make Realm not keep junk around, or Realm is just the completely wrong solution for my problem. I've never seen this problem with databases - adding and deleting lots of records... quickly... seems like something a database should just be able to do without exploding in size.
Are you making sure that the background thread is not holding on to old versions of the Realm, preventing the space from being reused?
Quote from the docs (https://realm.io/docs/swift/latest/#seeing-changes-from-other-threads):
If a thread has no runloop (which is generally the case in a background thread), then Realm.refresh() must be called manually in order to advance the transaction to the most recent state.
Failing to refresh Realms on a regular basis could lead to some transaction versions becoming “pinned”, preventing Realm from reusing the disk space used by that version, leading to larger file sizes.
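As a rough illustration of what that looks like for a long-lived background worker (StagedRecord is a hypothetical model class standing in for table1/table2):

```swift
import RealmSwift

final class StagedRecord: Object {          // hypothetical staging model
    @objc dynamic var payload = Data()
}

let workerQueue = DispatchQueue(label: "staging.worker")

// Open the Realm inside an autoreleasepool for each batch of work and, because
// this queue has no run loop, advance it manually with refresh() so old
// transaction versions don't stay pinned and bloat the file.
func processPendingRecords() {
    workerQueue.async {
        autoreleasepool {
            let realm = try! Realm()
            realm.refresh()                                  // move to the latest version

            let pending = realm.objects(StagedRecord.self)
            try! realm.write {
                realm.delete(pending)                        // e.g. remove uploaded records
            }
        }   // pool ends here, releasing the Realm and its pinned version
    }
}
```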

How can I change background operation priority dynamically using Dispatch or Operation queues?

Here is the problem I have. I have several tasks to complete in the background while the application is running. When I run these tasks by pushing them onto a concurrent dispatch queue, it takes more than 10 seconds to complete all of them. They basically load data from disk, parse it, and present the result to the user; in other words, they build cached results that hugely improve the user experience.
These cached results are used by a particular feature inside the app. When that feature isn't used immediately after opening the application, the 10 seconds it takes to load its supporting data isn't a problem, because by the time the user gets to it the data will already be loaded.
But when the user enters that feature immediately after opening the app, it takes considerable time (from the user's point of view) to load the data. Also, the whole data set isn't needed at once, only the piece relevant at a given moment.
That's why we need to load the data concurrently and deliver results as soon as possible. So I decided to break the data into chunks: when the user requests data, we load the corresponding chunk on a background thread and give that thread the highest priority. I'll explain what I mean.
Imagine there are 100 pieces of data and it takes more than 10 seconds to load them all. Whenever the user queries the data for the first time, the app determines which chunk the user needs and starts loading that chunk. After that part is loaded, the remaining data is also loaded in the background, so that later queries are faster (without the lag of loading the cache).
But here a problem occurs: if the user changes the query immediately after entering one, say two seconds into the loading process (remember, it takes more than 10 seconds, so there are still more than 8 seconds to go), then in the worst case the user waits until all of the data has loaded. That's why I need to manage the execution of the background tasks somehow: when the user changes the input, I want to change the execution priorities and give the thread loading the corresponding chunk the highest priority, without stopping it, so that it gets more processor time, finishes sooner, and delivers results to the user faster than it would if the priorities stayed the same. I know I can assign priorities to queues, but is there a way to change them dynamically while they are still executing?
Or do I need to implement custom thread management to get this behaviour? I really don't want to dive into thread management, and would be glad if this can be done using only dispatch or operation queues.
I hope I've described the problem well. If not, please comment below on what is unclear and I'll explain.
Thank you so much for reading this far :) Special thanks to whoever provides an answer, and very special thanks to whoever gives me a solution using dispatch or operation queues :)))
I think you need to move away from thinking about the priority at which the queues are running (which actually doesn't sound very important for the scenario you are describing) and more towards how you can use Dispatch I/O, or an even simpler dispatch source, to control how the data is being read in. As you say, it takes 10 seconds to load the data, and if the user suddenly changes their query immediately after asking, you need to essentially stop reading the data for the previous request and do whatever needs to be done to fulfill the most recent query. Using Dispatch I/O to chunk the data (asynchronously) and update the UI, also asynchronously, will allow you to change your mind mid-stream (using some sort of semaphore or cancellation flag) and either continue to trickle the data in (you don't say whether that data remains useful if the user changes their mind), suspend the reading process, or cancel it altogether and start a new operation. Either way, being able to suspend/resume a source, and to have it fire callbacks for reasonably small chunks of data, will certainly enable you to make decisions at a much more granular level than 8 seconds!
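Here is a minimal sketch of that kind of chunked, cancellable read with Dispatch I/O (handleChunk(_:) is a hypothetical stand-in for your parsing/UI update):

```swift
import Foundation

func handleChunk(_ data: Data) { /* hypothetical: parse the chunk and update the UI */ }

let ioQueue = DispatchQueue(label: "chunked.reader")

// Reads the file in ~64 KB chunks; close(flags: .stop) cancels mid-stream when
// the user changes the query.
func startReading(path: String) -> DispatchIO? {
    guard let channel = DispatchIO(type: .stream,
                                   path: path,
                                   oflag: O_RDONLY,
                                   mode: 0,
                                   queue: ioQueue,
                                   cleanupHandler: { error in
        if error != 0 { print("Channel closed with error \(error)") }
    }) else { return nil }

    channel.setLimit(highWater: 64 * 1024)        // deliver callbacks per ~64 KB chunk

    channel.read(offset: 0, length: Int.max, queue: ioQueue) { done, data, error in
        if let data = data, !data.isEmpty {
            handleChunk(Data(data))
        }
        if done || error != 0 {
            channel.close()
        }
    }
    return channel
}

// To abandon an in-flight read when the query changes:
// channel.close(flags: .stop)
```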
I'm afraid the only way to do that is to cancel the running operation before starting a new one.
You cannot remove it from the queue until it's done or cancelled.
As an improvement, I would suggest loading things in the background even before the user needs them, so that later requests can be served from the cache.
You can create two NSOperationQueues with different priorities: download things in the background on the low-priority queue whenever the user is idle, and use the high-priority queue for important operations, cancelling it each time the search term changes.
On top of that, you just need to cache the results from both of those queues.
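A minimal sketch of that two-queue arrangement (loadChunk(_:) is a hypothetical routine that reads and parses one chunk from disk):

```swift
import Foundation

func loadChunk(_ index: Int) { /* hypothetical: read and parse one chunk from disk */ }

let lowPriorityQueue: OperationQueue = {
    let queue = OperationQueue()
    queue.qualityOfService = .utility          // background prefetching
    return queue
}()

let highPriorityQueue: OperationQueue = {
    let queue = OperationQueue()
    queue.qualityOfService = .userInitiated    // the chunk the user is waiting for
    return queue
}()

func prefetchAllChunks(_ chunks: [Int]) {
    for chunk in chunks {
        lowPriorityQueue.addOperation { loadChunk(chunk) }
    }
}

func userChangedQuery(to chunk: Int) {
    // Cancel pending high-priority work (operations already running keep going
    // unless they check isCancelled), then load the newly requested chunk.
    highPriorityQueue.cancelAllOperations()
    highPriorityQueue.addOperation { loadChunk(chunk) }
}
```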

Core Data delay when switching NSPersistentStore files

I'm developing an app with Core Data that periodically downloads all the data from a web service. Since the download can fail or be cancelled by the user, I want to be able to roll back to the previous state. I tried undoing the NSManagedObjectContext, but that seemed a bit slow (I have tens of thousands of entities). What I'm doing right now is making a backup of the persistent store file, downloading the data, and, if the download fails, replacing the store file with the backup. This seems to work correctly, except there seems to be a delay before I can fetch entities from the store: if I go immediately after the download to a UITableView that uses an NSFetchedResultsController, I find it empty. If I wait a few seconds, everything is OK.
So my question is: has anyone had this kind of delays too? Is there something that can be done to avoid this problem, something that forces everything to be ready, even if it blocks the thread?
I haven't used this setup, but I think the delay you are seeing is probably caused by Core Data having to clear all of its caching. If you use a cache with the fetched results controller, it will have to test and then delete its existing cache.
I think the best thing to do is to tear down your Core Data stack and rebuild it from scratch. That includes recreating a fresh fetched results controller.
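A rough sketch of that teardown/rebuild, assuming a hypothetical makeContainer() factory for your model and the cache name used by your fetched results controller:

```swift
import CoreData

func makeContainer() -> NSPersistentContainer {   // hypothetical factory for your model
    NSPersistentContainer(name: "Model")
}

var container = makeContainer()

func reloadStore(fetchedResultsCacheName cacheName: String) {
    // Throw away the old stack (along with anything still holding its contexts).
    container = makeContainer()
    container.loadPersistentStores { _, error in
        if let error = error { fatalError("Failed to reload store: \(error)") }
    }

    // Delete the fetched results controller's section cache before recreating it,
    // so it isn't reused against the swapped-in store file.
    NSFetchedResultsController<NSFetchRequestResult>.deleteCache(withName: cacheName)

    // Recreate your NSFetchedResultsController against container.viewContext here.
}
```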
