Is it possible in realm to query for objects that have the same property value?
Imagine a list of contacts with firstname and lastname. I want to query all contacts that have the same name and may be duplicates in the db.
As far as I'm aware, there's no automatic way to do that with NSPredicate (which Realm's queries are built on); it would need to be done manually.
That being said, it should be relatively trivial to do manually; simply loop through each object, performing a query that searches for that object's name properties, and see if the number of results returned is greater than 1.
However, depending on how big your data set is, this could become a very slow operation very quickly. Ideally, you would ensure that duplicate entries don't occur in the first place or, if they do, index them in some way so they're easier to look up.
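As an alternative to running one query per object, the check can be done in a single pass by grouping on the name fields. This is only a minimal sketch using plain Swift values: `Contact`, `NameKey` and `possibleDuplicates(in:)` are hypothetical names, not Realm API, and it assumes the contacts have already been loaded (for example by iterating over a Realm query result).

```swift
import Foundation

// Hypothetical stand-in for a Realm object; only the name fields matter here.
struct Contact {
    let id: Int
    let firstName: String
    let lastName: String
}

// Hashable key so contacts with the same (case-insensitive) name group together.
struct NameKey: Hashable {
    let first: String
    let last: String
}

// Groups contacts by name in one pass and keeps only groups with more than one member.
func possibleDuplicates(in contacts: [Contact]) -> [[Contact]] {
    let groups = Dictionary(grouping: contacts) { contact in
        NameKey(first: contact.firstName.lowercased(),
                last: contact.lastName.lowercased())
    }
    return groups.values.filter { $0.count > 1 }
}

let contacts = [
    Contact(id: 1, firstName: "Ann", lastName: "Lee"),
    Contact(id: 2, firstName: "Bob", lastName: "Ray"),
    Contact(id: 3, firstName: "ann", lastName: "Lee"),
]
let duplicates = possibleDuplicates(in: contacts)
```

This visits each contact once instead of issuing one query per contact, at the cost of holding the name fields in memory while grouping.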
I have 20k unique labels that each have their own Entity with their own title.
What is the quickest way to get access to an Entity, given its title?
I know this can be done using a predicate, like so
fetch.predicate = NSPredicate(format: "title contains %@", "example title")
My issue with this approach is that it involves searching through every single one of the 20k Entities until the right one is found.
Is there a way to do this where all the titles are somehow indexed, and I can instantly get access to any label? Similar to how you can instantly get access to an item in an associative array, with array['item_name'].
Thanks.
Using a predicate is how you do it with Core Data.
Before you do anything, is this actually a problem? You don't mention that you've seen any performance issues. Are you having any, or is this still a theoretical problem?
You might improve performance by making sure that the "Indexed" box is checked for this attribute in the Core Data model editor. You might also consider adding a field to your entity that would contain a numeric hash of the title, and fetching based on the hash. You'd still be searching every entity but you'd be doing numeric comparisons instead of strings. You wouldn't be able to do substring searches (as your use of contains implies) but you would be doing the same thing as an associative array (which is also called a hash, for exactly this reason).
If none of that works well enough and you're searching strings very frequently, you'll need to investigate a different data model more suited to your needs-- like building a trie structure for fast searching.
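If the labels fit in memory, one way to get the associative-array behaviour the question asks about is to fetch them once and build a dictionary keyed by title. The sketch below uses plain Swift values rather than Core Data: `Label`, `payload` and `makeTitleIndex(_:)` are hypothetical names, and it assumes exact-title lookups only (it does not replace a `contains` search).

```swift
import Foundation

// Hypothetical stand-in for a fetched Core Data entity.
struct Label {
    let title: String
    let payload: String
}

// Builds the index once, e.g. right after a single fetch of all labels.
// If two labels share a title, the first one encountered is kept.
func makeTitleIndex(_ labels: [Label]) -> [String: Label] {
    return Dictionary(labels.map { ($0.title, $0) },
                      uniquingKeysWith: { first, _ in first })
}

let labels = (0..<20_000).map { Label(title: "label \($0)", payload: "data \($0)") }
let index = makeTitleIndex(labels)

// Hash-based lookup by exact title; no scan over all 20k labels.
let hit = index["label 12345"]
let miss = index["no such title"]
```

The trade-off is that the index has to be rebuilt or updated whenever labels are added, renamed or deleted, so it is mainly worth it when lookups are much more frequent than changes.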
There are at least 2 main collection types used in Realm:
List
Results
The relevant description from the documentation on a Results object says:
Results is an auto-updating container type in Realm returned from
object queries.
Because I want my UITableView to respond to any changes on the Realm Object Server, I really think I want my UITableView to be backed by a Results object. In fact, I think I would always want a Results object to back my UI for this reason. This is only reinforced by the description of a List object in the documentation:
List is the container type in Realm used to define to-many
relationships.
Sure seems like a List is focused on data modeling... So, being new to Realm and just reading the API, I'm thinking the answer is to use the Results object, but the tutorial (Step 5) uses the List object while the RealmExamples sample code uses Results.
What am I missing? Should I be using List objects to back my UITableViews? If so, what are the reasons?
Short answer: use a List if one already exists that closely matches what you want to display in your table view, otherwise use a Results.
If the data represented by a List that's already stored in your Realm corresponds to what you want to display in your table view, you should certainly use that to back it. Lists have an interesting property in that they are implicitly ordered, which can sometimes be helpful, like in the tutorial you linked to above, where a user can reorder tasks.
Results contain the results of a query in Realm. Running this query typically has a higher runtime overhead than accessing a List; how much higher depends on the complexity of the query and the number of items in the Realm.
That being said, mutating a List has performance implications too since it's writing to the file in an atomic fashion. So if this is something that will be changing frequently, a Results is likely a better fit.
You should use Results<> to back your UITableView, because a Results is auto-updating. A List is used to link child models in a Realm model, whereas a Results is used to query Realm objects. You should also add a Realm notification token so you know when the Results are updated and can take the necessary action (reload the table view, etc.). Look here for Realm notifications: https://realm.io/docs/swift/latest/#notifications
P.S. The data in that example is just static and no changes are observed
Sometimes my app will add many Realm records at once. I need to be able to consistently keep them in the same order.
The documentation recommends that I use NSDate:
Another common motivation for auto-incrementing properties is to preserve order of insertion. In some situations, this can be accomplished by appending objects to a List or by using a createdAt property with a default value of NSDate().
However, since records are added so quickly sometimes, the dates are not always unique, especially considering Realm stores NSDate only to the second accuracy.
Is there something I'm missing about the suggestion in the documentation? Maybe the documentation wasn't considering records added in quick succession? If so, would it be recommended to keep an Int position property and to always query for the last record at the moment when adding a new record, so as to ensure sequential positions? However, querying for the last record in such a case won't return the previous record unless you've also added and finalized a write, which is wasteful if you need to add a lot of records. Then, it would require batch create logic, which is unfortunate.
However, since records are added so quickly sometimes, the dates are not always unique, especially considering Realm stores NSDate only to the second accuracy.
The limitation on date precision was addressed back in Realm v0.101. Realm can now represent dates with greater precision than NSDate.
However, querying for the last record in such a case won't return the previous record unless you've also added and finalized a write, which is wasteful if you need to add a lot of records.
It's not necessary to commit a write transaction for queries on the same thread to see data that you've added during the write transaction.
Is there something I'm missing about the suggestion in the documentation?
You skipped over the first suggestion: appending objects to a List. Lists in Realm are inherently ordered, so you do not need to find a way to create unique, ordered values. Simply append the new object to the list, and rely on the list's order to determine the order in which the objects were added. This also has the advantage of being safe when using Realm Mobile Platform's synchronization features, as incrementing fields can generate duplicates on different devices and timestamps may not be reliable.
I am using Parse as the backend for an app I'm working on. I was wondering if there is an optimal algorithm to query for only 'unseen' new objects.
What I am planning on doing is something like adding a user to the viewed object's relation and later querying all objects to check for the absence of the user. This seems to be O(n* all users who have seen an 'n') complexity which is a bit too much.
Another way to do this is to add the object to a user's key 'seen' and to then query for all objects the user has not seen.
Maybe a much more efficient way (assuming I view these objects chronologically) would be to mark the first and last objects I see, and only show objects before or after those points using the createdAt key. Then I guess I would show new objects outward from those points, so that I don't have to split this into multiple queries.
Ideally I'd like to shuffle through the objects, but I also would like to keep this algorithm as efficient as possible.
Create a Class called View. Each View consists of a unique identifier from a viewable Object (CANNOT BE THE OBJECTID) and a pointer to the User that viewed the object.
Your query should be:
//(Simple equalTo query)
Query1 = All View Objects where User pointer = User
//(whereKey:that-unique-id-column doesNotMatchKey:that-unique-id-column inQuery:Query1)
Query2 = All viewable Objects where not included in pointers in Query1
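To make the intent of the two queries concrete, here is a minimal in-memory sketch of the same logic in plain Swift. It does not use the Parse SDK: `ViewRecord`, `Item` and `unseenItems(for:items:views:)` are hypothetical names, and on a real backend the filtering would be done server-side by the two queries described above.

```swift
import Foundation

// Hypothetical row of the "View" class: which user saw which viewable object.
struct ViewRecord {
    let viewableID: String
    let userID: String
}

// Hypothetical viewable object carrying its own unique identifier column.
struct Item {
    let viewableID: String
    let title: String
}

func unseenItems(for userID: String, items: [Item], views: [ViewRecord]) -> [Item] {
    // Query1: all View records whose user pointer is this user.
    let seenIDs = Set(views.filter { $0.userID == userID }.map { $0.viewableID })
    // Query2: all viewable objects whose unique id does not appear in Query1.
    return items.filter { !seenIDs.contains($0.viewableID) }
}

let items = [
    Item(viewableID: "a", title: "First"),
    Item(viewableID: "b", title: "Second"),
    Item(viewableID: "c", title: "Third"),
]
let views = [
    ViewRecord(viewableID: "a", userID: "u1"),
    ViewRecord(viewableID: "b", userID: "u2"),
]
let unseenForU1 = unseenItems(for: "u1", items: items, views: views)
```

The cost scales with what this one user has seen rather than with every user who has seen each object.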
I have a web service call that returns XML which I convert into domain objects, I then want to insert these domain objects into my Core Data store.
However, I really want to make sure that I don't insert duplicates (the objects have a date stamp which makes them unique, which I would hope to use for the uniqueness check). I really don't want to loop over each object, do a fetch, then insert if nothing is found, as that would be really poor on performance...
I am wondering if there is an easier way of doing it? Perhaps a "group by" on my objects in memory? Is that possible?
Your question already has the answer. You need to loop over them, look for them, if they exist update; otherwise insert. There is no other way.
Since you are uniquing off of a single value you can fetch all of the relevant objects at once by setting the predicate:
[myFetchRequest setPredicate:[NSPredicate predicateWithFormat:@"timestamp in %@", myArrayOfIncomingTimestamps]];
This will give you all of the objects that already exist in a faulted state. You can then run an in memory predicate against that array to retrieve the existing objects to update them.
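The overall shape of this approach (one batched lookup, then update-or-insert in memory) can be sketched in plain Swift as follows. This is an illustration only, not Core Data code: `Record`, `upsert(_:into:)` and the dictionary `store` are hypothetical stand-ins for the managed objects and the persistent store, and a string `uniqueID` is assumed in place of the timestamp.

```swift
import Foundation

// Hypothetical domain object identified by a unique key.
struct Record {
    let uniqueID: String
    var value: String
}

// `store` stands in for the persistent store, keyed by unique ID.
func upsert(_ incoming: [Record],
            into store: inout [String: Record]) -> (inserted: Int, updated: Int) {
    // One batched lookup: which incoming IDs already exist?
    // (The in-memory analogue of a single "in" predicate fetch.)
    let incomingIDs = Set(incoming.map { $0.uniqueID })
    var knownIDs = Set(store.keys.filter { incomingIDs.contains($0) })

    var inserted = 0
    var updated = 0
    for record in incoming {
        if knownIDs.contains(record.uniqueID) {
            updated += 1
        } else {
            inserted += 1
            // Remember it so a repeat within the same batch counts as an update.
            knownIDs.insert(record.uniqueID)
        }
        store[record.uniqueID] = record
    }
    return (inserted, updated)
}

var store = ["a": Record(uniqueID: "a", value: "old")]
let result = upsert([
    Record(uniqueID: "a", value: "new"),
    Record(uniqueID: "b", value: "x"),
    Record(uniqueID: "b", value: "y"),
], into: &store)
```

There is still a loop over the incoming objects, but only one lookup against the store per batch instead of one fetch per object.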
Also, a word of advice. A timestamp is a terrible unique ID. I would highly recommend that you reconsider that.
Timestamps are not unique. However, we'll assume that you have unique IDs (e.g. a UUID/GUID/whatever).
In normal SQL-land, you'd add an index on the GUID and search, or add a uniqueness constraint and then just attempt the insert (and if the insert fails, do an update), or do an update (and if the update fails, do an insert, and if the insert fails, do another update...). Note that the default transactions in many databases won't work here — they lock rows, but you can't lock rows that don't exist yet.
How do you know a record would be a duplicate? Do you have a primary key or some other unique key? (You should.) Check for that key -- if it already exists in an Entity in the store, then update it, else, insert it.