I've recently done some benchmarking, and it seems like looking up another object by primary key:
let foo = realm.object(ofType: Bar.self, forPrimaryKey: id)
is more efficient (and, in this specific case, more readable) than trying to set the property directly, as in:
class Other: Object {
@objc dynamic var relation: Bar? = nil
let list = List<Bar>()
}
My benchmarking wasn't too thorough though (used only one element in the list, etc.) and I'm wondering if this is actually the case.
Intuition makes me think that both primary key lookup and using the relation property above would be O(1) or O(log n). With 1,000,000 records and 1,000,000 lookups:
primary key: ~10s
relation property: ~12s
list property: ~14s
In summary: what is the performance of Realm's object(ofType:forPrimaryKey:) lookup?
Extra credit: when is it beneficial to use LinkingObjects, Lists, etc.? Assuming it's just a readability / convenience wrapper of some sort. In my case it has been more messy / bug prone, so I'm assuming I'm not using Realm in the way it was intended.
Realm isn't a relational database like SQLite. Instead, data is stored in B+ trees. All the data for a given property on a given model type is stored within a single tree, and all data retrieval (whether getting a property value or a linked object) involves traversing such a tree.
Furthermore, when a Realm is opened, the contents of the entire database file are mmaped into memory. When you use one of the Realm SDKs, the objects you create (e.g. Object instances) are actually thin wrappers that store a conceptual pointer to a location in the database file and provide methods to directly read from and write to the object at that location. Likewise, relationships (such as object properties on a model) are references to nodes elsewhere in the tree.
This means that retrieving an object requires the time it takes to traverse the database data structures to locate the required information, plus the time it takes to instantiate an object and initialize it. The latter is effectively a constant-time operation, so we want to look primarily at the former.
As for the situations you've outlined...
If you already know your primary key value, getting an object takes O(log n) time, where n is the number of objects of that particular type in the database. (The time it takes to retrieve a Dog is irrespective of the number of Cats the database contains.)
If you're naively implementing a relational-style foreign key pattern, where you model a link to an object of type U by storing a primary key value (like a string) on some object of type T, it will take O(log t) time to retrieve the primary key value (where t is the number of Ts), and O(log u) time to look up the destination object (as described in the previous bullet point; u = the number of Us).
If you're using an object property on your model type T to model a link to another object, it takes O(log t) time to retrieve the location of the destination object.
Using a list introduces another level of indirection, so retrieving the single object from a one-object list will be slower than retrieving an object directly from an object property.
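As a rough illustration of the difference between the foreign-key pattern and a direct link, here is a toy model in plain Java. It is not Realm code: the sorted maps merely stand in for per-type trees with O(log n) lookup, and every name in it (LookupModel, Owner, Cat, viaForeignKey, viaLink) is hypothetical.

```java
import java.util.TreeMap;

// Toy model only: TreeMap stands in for a per-type tree with O(log n) lookup.
// None of these names come from Realm.
final class LookupModel {
    static final class Owner {
        final String id;
        Owner(String id) { this.id = id; }
    }

    static final class Cat {
        final String ownerId; // foreign-key style: the owner's primary key
        final Owner owner;    // link style: a direct reference to the owner
        Cat(String ownerId, Owner owner) {
            this.ownerId = ownerId;
            this.owner = owner;
        }
    }

    final TreeMap<String, Owner> owners = new TreeMap<>();
    final TreeMap<String, Cat> cats = new TreeMap<>();

    void add(String catId, String ownerId) {
        Owner owner = owners.computeIfAbsent(ownerId, Owner::new);
        cats.put(catId, new Cat(ownerId, owner));
    }

    // Two tree lookups: one to read the stored key, one to find its target.
    Owner viaForeignKey(String catId) {
        Cat cat = cats.get(catId);        // O(log t)
        return owners.get(cat.ownerId);   // O(log u)
    }

    // One tree lookup: the link already says where the target is.
    Owner viaLink(String catId) {
        return cats.get(catId).owner;     // O(log t)
    }

    public static void main(String[] args) {
        LookupModel db = new LookupModel();
        db.add("tom", "alice");
        // Both paths reach the same object; only the number of lookups differs.
        System.out.println(db.viaForeignKey("tom") == db.viaLink("tom"));
    }
}
```

Both methods return the same object; the foreign-key version simply pays for a second lookup in the destination type's tree.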
Object, list, and linking objects properties are not intended to be an alternative to looking up objects via primary keys. Rather, they are intended to model many-to-one, many-to-many, and inverse relationships, respectively. For example, a Cat may have a single Owner, so it makes sense for a Cat model to have an object property pointing to its Owner. A Person may have multiple friends, so it makes sense for a Person model to have a list property containing all their friends (which may contain zero, one, or many other Persons).
Finally, if you're interested in learning more, the entire database stack is open source (except for the sync component, which is a strictly optional peripheral component). You can find the code for the core database engine here. We also have an older article that discusses the high-level design of the database engine; you can find that here.
Would it be possible to use a PersistentEntityStore and one or more plain Store instances in the same Environment instance? I was hoping to use transactions that cover changes on such a combination.
I see potential conflicts with store names that I would have to avoid. Anything else?
It's possible to mix code using different API layers inside a single transaction. The only requirement is that the data touched by the different APIs be isolated, i.e. disjoint sets of Store names should be used.
What are the names of Stores used by PersistentEntityStore? Every PersistentEntityStore has its own unique name, and the names of all Stores that map the entity store onto the key/value layer start with "${PersistentEntityStore name}.", as specified in the source code.
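One way to guard against such a naming conflict is to check every plain Store name against that prefix before opening it. The helper below is a minimal sketch in plain Java that relies only on the prefix rule stated above; StoreNames and collidesWithEntityStore are hypothetical names, not part of the Xodus API.

```java
// Hypothetical helper, not part of Xodus: it only encodes the naming rule
// that an entity store's internal Stores start with "<entity store name>.".
final class StoreNames {
    private StoreNames() {}

    static boolean collidesWithEntityStore(String entityStoreName, String storeName) {
        return storeName.startsWith(entityStoreName + ".");
    }

    public static void main(String[] args) {
        System.out.println(collidesWithEntityStore("app", "app.anything")); // true
        System.out.println(collidesWithEntityStore("app", "appCache"));     // false
    }
}
```

A plain Store whose name does not begin with the entity store's name followed by a dot stays outside the set of names the entity store uses.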
Another issue is that the API is not complete for such an approach. After a StoreTransaction is created against the PersistentEntityStore, it should be cast to PersistentStoreTransaction in order to call PersistentStoreTransaction#getEnvironmentTransaction() and get the underlying transaction:
final StoreTransaction txn = entityStore.beginTransaction();
// here is the underlying Transaction instance:
final Transaction envTxn = ((PersistentStoreTransaction) txn).getEnvironmentTransaction();
The Lua API has a function, lua_getmetatable, which fetches the table of metamethods if the value has one.
The Lua auxiliary library (which is part of the Lua API) has another function, luaL_getmetatable, which is a macro that fetches a value from the registry (LUA_REGISTRYINDEX).
But another function from this library with a similar name, luaL_getmetafield, does a completely different thing: it looks for a field in the metatable that lua_getmetatable would return.
Why are there two different locations?
When is each metatable used?
lua_getmetatable gets the metatable associated with the given object. This is a fundamental feature; if this function didn't exist, there would be no way to access the metatable for a given object.
luaL_getmetatable is part of a convention for giving types to userdata (C objects that can be accessed from Lua) or classes of tables. In this convention you add tables to the registry with luaL_newmetatable, and then use these tables to represent the metatables for different userdata/table types (when you need them you can read them from the registry and set them with luaL_setmetatable).
This is only a convenience feature, and you do not need to follow this convention if you don't want to. Everything will still work if you place the metatables somewhere other than the registry and bind them to your userdata with lua_setmetatable. That said, if the luaL_*metatable functions didn't exist, where would you put the tables that represent the different userdata/table types, and how would you find them again when you needed them a second time? You could definitely solve this problem in a different way, but why not use the pre-built convention if it works for you?
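To make the two locations concrete, here is a small conceptual model in plain Java. It is not the Lua C API: MetatableModel, Value, and the method names are hypothetical, and each method only loosely imitates the Lua function named in its comment. The point is that a value carries its own metatable slot, while the registry is just a shared, name-keyed place to keep those tables so they can be found again.

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual model in Java, not the Lua C API. Each method is commented with
// the Lua function whose behaviour it loosely imitates.
final class MetatableModel {
    // Stands in for a userdata or table: it carries its own metatable slot.
    static final class Value {
        Map<String, Object> metatable;
    }

    // Stands in for the registry: a shared table keyed by type name.
    final Map<String, Map<String, Object>> registry = new HashMap<>();

    // Like luaL_newmetatable: create and register a table unless the name is taken.
    boolean newMetatable(String name) {
        if (registry.containsKey(name)) return false;
        registry.put(name, new HashMap<>());
        return true;
    }

    // Like luaL_getmetatable: look a table up in the registry by name.
    Map<String, Object> getMetatableByName(String name) {
        return registry.get(name);
    }

    // Like lua_setmetatable: attach any table to a value.
    void setMetatable(Value v, Map<String, Object> mt) { v.metatable = mt; }

    // Like lua_getmetatable: read the table attached to a value.
    Map<String, Object> getMetatable(Value v) { return v.metatable; }

    public static void main(String[] args) {
        MetatableModel lua = new MetatableModel();
        lua.newMetatable("Point");
        Value p = new Value();
        lua.setMetatable(p, lua.getMetatableByName("Point"));
        // The registry entry and the value's metatable are the same table.
        System.out.println(lua.getMetatable(p) == lua.getMetatableByName("Point"));
    }
}
```

In this model the registry never decides which metatable a value has; it only stores the table under a name until something attaches it to a value.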
I'm trying to keep track of various variables in a big Lua code base, for logging and analytics purposes.
Ideally I want to create a register function that registers an existing global variable and keeps a reference to that variable (which can be anything: a number, a boolean, or another table) in a table.
This table will then be iterated over to output the values of the registered variables at certain points of the execution.
I won't have control over what types of variables are used, and I cannot change how those variables are set up. So I don't have the option of changing those variables to tables and such.
What would be the best approach in Lua?
In my code I want to take advantage of ETS's bag type that can store multiple values for single key. However, it would be very useful to know if insertion actually inserts a new value or not (i.e. if the inserted key with value was or was not present in the bag).
With the ETS set type I could use ets:insert_new, but the semantics are different for bag (emphasis mine):
This function works exactly like insert/2, with the exception that instead of overwriting objects with the same key (in the case of set or ordered_set) or adding more objects with keys already existing in the table (in the case of bag and duplicate_bag), it simply returns false.
Is there a way to achieve such functionality with one call? I understand it can be achieved by a lookup followed by an optional insert, but I am afraid it might hurt performance of concurrent access.