CloudKit: Preventing Duplicate Records - ios

I am working on an app that pulls data from an external web service into a private CloudKit database. The app is a single-user app; however, I am running into a race condition that I am not sure how to avoid.
Every record in my external data has a unique identifier that I map to my CKRecord instances. The general app startup flow is:
Fetch current CKRecords for the relevant record type.
Fetch external records.
For every external record, if it doesn't exist in CloudKit, create it via batch create (modification operation).
Now, the issue is that if this process is kicked off on two of a user's devices simultaneously, there is a strong possibility that I'll get duplicate records, since both the CloudKit fetch and the external fetch are async.
I know I can use zones to atomically commit all of my CKRecord instances, but I don't think that solves my issue: if all of these fetches happen at essentially the same time, the save is not really the problem.
My questions are:
Does anyone know of a way to "lock" the private database for writes across all of a user's devices?
Alternatively, is there a way to enforce uniqueness on any CKRecord field?
Or is there a way to use a custom value as the primary key? In that case I could use my external ID as the CloudKit record ID and let the system prevent duplicates itself.
Thanks for the help in advance!

Answers:
No, you cannot lock the private database.
CloudKit already enforces and assumes uniqueness of your record ID.
You can make the record ID anything you like (in the non-zone part of it).
Explanation:
Regarding your issue of duplication: if you are the one creating the record IDs (from the external records you mentioned, for example), then in a race condition the worst case is that one record overwrites the other with the same data. I do not think that is an issue, even in the extreme case where two devices kick off this process at the same time. Basically, your logic of first fetching the existing records and then saving the missing ones seems sound to me.
Code:
import CloudKit

// employeeID is a unique ID that identifies an employee.
let employeeID = "001"
// Remember that the record ID needs to be unique within the same database.
// If you have several record types, prefix the record name with the record type to keep it unique.
let recordName = "Employee-\(employeeID)"
// If you are using a custom zone:
let customZoneID = CKRecordZone.ID(zoneName: "SomeCustomZone", ownerName: CKCurrentUserDefaultName)
let recordIDInCustomZone = CKRecord.ID(recordName: recordName, zoneID: customZoneID)
// If you are using the default zone:
let recordIDInDefaultZone = CKRecord.ID(recordName: recordName)
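
The reasoning in the answer above (a record ID derived from the external ID means a second import can at worst rewrite the same record) can be sketched with a small in-memory model. This is not CloudKit code: `FakeRecord`, `FakeDatabase`, and `importExternal` are hypothetical stand-ins, and the model assumes a save with an existing ID simply replaces the stored record.

```swift
// Hypothetical stand-in for a CloudKit record; not the real CKRecord type.
struct FakeRecord: Equatable {
    let recordName: String
    var fields: [String: String]
}

// Hypothetical in-memory "database" keyed by record name, modelling the
// rule that one record ID identifies at most one record.
struct FakeDatabase {
    private(set) var records: [String: FakeRecord] = [:]

    // Assumption: saving under an existing name replaces the stored record.
    mutating func save(_ record: FakeRecord) {
        records[record.recordName] = record
    }
}

// Derive the record name from the external identifier, as in the answer.
func recordName(forEmployeeID id: String) -> String {
    "Employee-\(id)"
}

// One device's import pass over the external records.
func importExternal(_ external: [String: String], into db: inout FakeDatabase) {
    for (id, name) in external {
        let record = FakeRecord(recordName: recordName(forEmployeeID: id),
                                fields: ["name": name])
        db.save(record)
    }
}

let external = ["001": "Ada", "002": "Grace"]
var db = FakeDatabase()
importExternal(external, into: &db)  // device A
importExternal(external, into: &db)  // device B, racing with A
print(db.records.count)              // prints 2
```

Because both passes compute the same record names, the second pass cannot add a third or fourth entry, which is the property the answer relies on.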

I had a similar issue with duplicates being downloaded when I tried to read in a database of more than 100 records. The solution is found in Apple's Atlas example, which uses a Boolean to check whether the last process has finished before it launches the next. You will find a block much like this:
@synchronized (self)
{
    // Quickly returns if another loadNextBatch is running or we have the oldest post
    if (self.isLoadingBatch || self.haveOldestPost) return;
    else self.isLoadingBatch = YES;
}
Incidentally, here is the code to create your own record key.
CKRecordID *customID = [[CKRecordID alloc] initWithRecordName:[globalEOConfirmed returnEOKey:i]];
newrecord = [[CKRecord alloc] initWithRecordType:@"Blah" recordID:customID];

Related

Is there a way to access properties of an x-coredata:// object returned from an NSFetchRequest?

TL;DR: Is there a way to programmatically read/recall (NOT write!) an instance of a Core Data entity using the p-numbered "serial number" that's tacked on to the instance's x-coredata:// identifier? Is this a good/bad idea?
I'm using a method similar to the following to retrieve the instances of an Entity called TrackInfo from a Core Data data store:
var managedContext: NSManagedObjectContext!
let fetchRequest : NSFetchRequest<TrackInfo> = TrackInfo.fetchRequest()
fetchResults = try! managedContext.fetch(fetchRequest)
for (i, _) in Global.Vars.numberOfTrackButtons! {
    let workingTrackInfo = fetchResults.randomElement()!
    print("current track is: \(workingTrackInfo)")
}
The list of tracks comes back in fetchResults as an array, and I can select one of them at random (fetchResults.randomElement()). From there, I can examine the details of that one item by coercing it to a string and displaying it in the console (the print statement). I don't list the code below, but using workingTrackInfo I am able to see that instance, read its properties into other variables, etc.
In the console, iOS/Xcode lists the selected item as follows:
current track is: <MyProjectName.TrackInfo: 0x60000374c2d0> (entity:
TrackInfo; id: 0xa7dc809ab862d89d
<x-coredata://2B5DDCDB-0F2C-4CDF-A7B9-D4C43785FDE7/TrackInfo/p22>;
data: <fault>)
The line beginning with x-coredata: got my attention. It's formatted like a URL, consisting of what I assume is a UUID for the specific Core Data store associated with the current build of the app (i.e. not a stable address that you could hardcode; you'd need to programmatically look up the Core Data store, similar to the functions we use for programmatically locating the Documents Folder, App Bundle, etc.) The third item is the name of the Entity in my Core Data model -- easy enough.
But that last number is what I'm curious about. From examining the SQLite database associated with this data store, it appears to be a sort of "instance serial number" associated with the Z_PK field in the data model.
I AM NOT interested in trying to circumvent Core Data's normal mechanisms to modify the contents of a managed object. Apple is very clear about that being a bad idea.
What I AM interested in is whether it's possible to address a particular Core Data instance using this "serial number".
In my application, where I'm randomly selecting one track out of what might be hundreds or even thousands of tracks, I'd be interested in, among other things, the ability to select a single track on the basis of that p-number serial, where I simply ask for an individual instance by generating a random p-number, tack it on to a x-coredata:// statement formatted like the one listed above, and loading the result (on a read-only basis!) into a variable for further use elsewhere in the app.
For testing purposes, I've tried simply hardcoding x-coredata://2B5DDCDB-0F2C-4CDF-A7B9-D4C43785FDE7/TrackInfo/p22 as a URL, but Xcode doesn't seem to like it. Is there some other data type (e.g. an NSManagedObject?) that allows you to set an x-coredata:// "URL" as its contents?
QUESTIONS: Has anyone done anything like this; are there any memory/threading considerations why grabbing instance names in this manner is a bad idea (I'm an iOS/Core Data noob, so I don't know what I don't know; please humor me!); what would the syntax/method for these types of statements be?
Thanks!
You are quite close.
x-coredata://2B5DDCDB-0F2C-4CDF-A7B9-D4C43785FDE7/TrackInfo/p22
is the uriRepresentation() of the NSManagedObjectID of the record.
You get this URL from an NSManagedObject with
let workingTrackInfo = fetchResults.randomElement()!
let objectIDURL = workingTrackInfo.objectID.uriRepresentation()
With this URL you can get the managed object ID from the NSPersistentStoreCoordinator, which you get from the managed object context.
Then call object(with:) on the context to get the object.
let persistentStoreCoordinator = managedContext.persistentStoreCoordinator!
if let objectID = persistentStoreCoordinator.managedObjectID(forURIRepresentation: objectIDURL) {
    let object = managedContext.object(with: objectID) as! TrackInfo
    print(object)
}
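
To make the parts of that URI concrete, here is a minimal sketch that only inspects the string printed in the question using Foundation's `URL`. It is for reading the pieces, not for building object IDs by hand. The variable names are illustrative, and that the last component always has this "p" plus number shape is an assumption based on the question's output.

```swift
import Foundation

// The URI printed in the question, used here purely as sample input.
let uri = URL(string: "x-coredata://2B5DDCDB-0F2C-4CDF-A7B9-D4C43785FDE7/TrackInfo/p22")!

// pathComponents begins with "/" for an absolute path, so drop it.
let parts = uri.pathComponents.filter { $0 != "/" }
let entityName = parts[0]        // the entity name from the model
let primaryKeyToken = parts[1]   // the "p" token the question asks about

print(uri.scheme ?? "", entityName, primaryKeyToken)
// prints: x-coredata TrackInfo p22
```

To turn such a URI back into a live object, the route shown in the answer (managedObjectID(forURIRepresentation:) followed by object(with:)) is the one to use.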

Fetching CKRecordZone from a CKRecord

Scenario:
I have a CKRecord which I have fetched from the server. The record exists inside a custom zone for which I do not know the identifier and do not have a CKRecordZone object for.
I need to call CKDatabase.perform(_:inZoneWith:completionHandler:) to get the records in the database that are components of the root shared record (which requires such a call). However, without a CKRecordZoneID (from a CKRecordZone), I am forced to iterate through every CKRecordZone in the shared database and perform the query until a matching record is found.
In summary: I want to take a CKRecord and find the CKRecordZone it exists in. Is this possible? Or is my method flawed, and can I perform a query without the CKRecordZoneID?
To find the CKRecordZoneID of a given record, the recordID property is helpful:
(record).recordID.zoneID yields the CKRecordZoneID that the CKRecord exists in.

Only read new events from Firebase Database

I was halfway done with implementing Core Data in my iOS app when I realized that Firebase has offline capabilities that would pretty much mimic what I was trying to accomplish the whole time.
In my database which is structured as such:
- Users
  - user1
  - user2
- Groups
  - group1
    - members
      - user1
    - events
      - event1_By_Auto_Key
      - event2_By_Auto_Key
I wanted to locally store all the events that have already been fetched by a user so that I wouldn't have to read all of them every single time I need to get a group's events. Now that I think I'm just going to stick with Firebase's offline capabilities instead of using Core Data, I have a question regarding how to efficiently read events from the database.
As seen from my database's structure the events are stored using the childByAutoId().setValue(data) method, meaning the keys are unknown when inserted. So my console for a given group might look like this:
My question is: how can I only read the new events from a group? The reason I was implementing Core Data was so that I could cache already fetched events, but I'm not sure how I can make sure that I don't re-read data.
There are a few strategies you could use. Since the generated IDs are always lexically greater than any existing ones, you can use startAt() on your query with the newest record you already have; you just need to skip the record that matches the last ID you have. Alternatively, if you keep a timestamp in the events, you can use orderByChild() with the last timestamp incremented by one millisecond, so you don't get any records you already have. The first approach would be something like:
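function getNewEvents(group, arrayOfExistingIds) {
  // Auto-generated IDs sort lexically in creation order, so the last one is the newest.
  // Copy before sorting so the caller's array is not mutated.
  const lastId = [...arrayOfExistingIds].sort().pop();
  admin.database().ref('/Groups/' + group + '/events')
    .orderByKey().startAt(lastId)
    // 'child_added' fires once per event; with 'value', snap.key would be 'events'.
    .on('child_added', function (snap) {
      // startAt() is inclusive, so skip the record we already have.
      if (snap.key === lastId) return;
      console.log('New record: ' + snap.key);
    });
}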
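
The key-based strategy can also be checked without a database. The sketch below models it on plain arrays: `newEventKeys` is a hypothetical helper, the keys are made up and much shorter than real auto-generated IDs, and it assumes ASCII keys so that Swift's string ordering matches lexical key order.

```swift
// Hypothetical helper: returns the server keys that are newer than anything cached.
// Models orderByKey().startAt(lastKey), which is inclusive, followed by
// skipping the one record that is already cached.
func newEventKeys(serverKeys: [String], cachedKeys: [String]) -> [String] {
    guard let lastKey = cachedKeys.max() else {
        // Nothing cached yet, so every key is new.
        return serverKeys.sorted()
    }
    return serverKeys
        .sorted()
        .filter { $0 >= lastKey }   // what startAt(lastKey) would return
        .filter { $0 != lastKey }   // drop the record we already have
}

// Made-up keys for illustration only.
let cached = ["-Ka01", "-Ka03", "-Ka02"]
let server = ["-Ka01", "-Ka02", "-Ka03", "-Ka04", "-Ka05"]
print(newEventKeys(serverKeys: server, cachedKeys: cached))
// prints ["-Ka04", "-Ka05"]
```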
Firebase provides 10 MB of persistent storage to cache recently fetched records. In normal scenarios, 10 MB is enough space.
You need to enable offline capabilities.

Core Data Different Object ID For First Time

I am using an entity's object ID to uniquely identify local notifications and modify them. I observed that the first time I save my entity, it has the following object ID:
<x-coredata:///Task/tE1C5A230-A419-42D5-AF78-3327A09D13BD2>
If I don't exit my application and try to modify the notification, the object ID doesn't change and I can modify my notification.
Now, if I restart my app and try to access that entity again, it has a different object ID:
<x-coredata://D6703834-ECB4-487B-84F8-330A215E16B7/Task/p13>
So I can't modify the notification, as the object ID for the entity is different. Interestingly, whenever I access that entity afterwards, the object ID remains the same as the last one.
So my question is: why does Core Data show a different object ID the first time the entity is created? When I access the entity after reopening the app many times, the object ID (different from the first one) remains constant. I am curious to know why this happens.
Please note:
I know there are many posts on SO pointing out that using the object ID is not a reliable approach. Still, I want to know why two IDs are being shown.
The first object ID is a temporary one: a temporary ID denotes an object that has not been saved yet. The second ID is a permanent one and is assigned to a managed object after it has been saved.
So:
var objectID = object.objectID
if objectID.isTemporaryID {
    // Saving assigns the permanent ID (real error handling left out).
    try? object.managedObjectContext?.save()
}
objectID = object.objectID
assert(!objectID.isTemporaryID)

Why is Entity framework loading data from the db when I set a property?

I have two tables (there are more in the database but only two are involved here).
Account and AccountStatus, an account can have an AccountStatus (active,inactive etc).
I create a new Account and set a couple of properties but when I reach this code:
1. var status = db.AccountStatuses.SingleOrDefault(s => s.ID == (long)AccountStatusEnum.Active);
2. account.AccountStatus = status;
3. db.Accounts.AddObject(account);
The first line executes fine, but when I reach the second line it takes a REALLY long time, and when I step in to the code it seems that every single account is loaded from the database.
I don't see why it should even want to load all the accounts?
We use Entity Framework 4 and Poco and we have lazy loading enabled.
Any suggestions?
Cheers
/Jimmy
You have to be careful which constructs you use to fetch data, as some will pull in the whole set and filter afterward. (Aside: the long delay may be the database being created and seeded if there isn't one already; that happens the first time you touch it, likely with a query of some sort. Also remember that when you retrieve a whole dataset, you may in fact only have what amounts to a compiled query that won't be evaluated until you interact with it.)
Try this form instead and see if you have the same issue:
var status = db.AccountStatuses.Where(s => s.ID == (long)AccountStatusEnum.Active);
