Spring Data Neo4j 4 and dynamic product properties - neo4j

In my Neo4j/SDN project I have to implement the following task: I have a parent node (let's say ParentNode) that must define a set of product characteristics, for example:
weight: Double
size: Integer
license: String
active: Boolean
Also, new product characteristics (with any new name and type) can be added at application runtime.
A ParentNode can have a set of child nodes (let's say ProductNode), and each of these nodes can provide its own value for a characteristic defined by the ParentNode.
I need to be able to filter ProductNode nodes by these characteristic names and values.
Previously I implemented this structure with SDN 3 DynamicProperties, but AFAIK DynamicProperties support has been removed from SDN 4.
So my question is: how can I implement this structure with SDN 4?
UPDATED
Also, how about defining every ParentNode characteristic as a separate node (let's say CharacteristicNode),
for example:
CharacteristicNode.name = weight
CharacteristicNode.type = Double
CharacteristicNode.name = license
CharacteristicNode.type = String
...
and every ProductNode can provide a value node (CharacteristicValueNode) associated with both the ProductNode and the CharacteristicNode.
The main question here is how to support different value types for CharacteristicValueNode, and how to filter ProductNode nodes by different characteristics and their values.

In SDN 4, you can model these properties as a Map of (property name, value) and write a custom converter. This will convert the Map to a String property on the node (JSON-style, perhaps), and then from the graph back to your Map.
The downside to this is that it is not easy to write a custom query for these dynamic properties because they're not really stored in the graph as independent properties; rather, your converter will squash them into a single one.
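A minimal sketch of what such a converter could look like. The local PropertyConverter interface is a hypothetical stand-in so the example compiles without SDN on the classpath; its method names are modelled on my understanding of the OGM attribute converter and should be checked against the version you use. The "name:Type=value;" encoding is also an assumption, chosen only to avoid a JSON dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the converter contract; in a real project this
// would be the library's attribute converter interface, not a local one.
interface PropertyConverter<T, F> {
    F toGraphProperty(T value);
    T toEntityAttribute(F value);
}

// Squashes a map of dynamic properties into one String and back.
// Assumed encoding: "name:Type=value;" per entry. Names containing ':' and
// values containing ';' are not supported by this sketch.
class DynamicPropertiesConverter implements PropertyConverter<Map<String, Object>, String> {

    @Override
    public String toGraphProperty(Map<String, Object> properties) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : properties.entrySet()) {
            Object v = e.getValue();
            sb.append(e.getKey()).append(':')
              .append(v.getClass().getSimpleName()).append('=')
              .append(v).append(';');
        }
        return sb.toString();
    }

    @Override
    public Map<String, Object> toEntityAttribute(String stored) {
        Map<String, Object> properties = new LinkedHashMap<>();
        if (stored == null || stored.isEmpty()) {
            return properties;
        }
        for (String entry : stored.split(";")) {
            int colon = entry.indexOf(':');
            int eq = entry.indexOf('=', colon);
            String name = entry.substring(0, colon);
            String type = entry.substring(colon + 1, eq);
            String raw = entry.substring(eq + 1);
            properties.put(name, parse(type, raw));
        }
        return properties;
    }

    // Restores the Java type recorded next to each value.
    private static Object parse(String type, String raw) {
        switch (type) {
            case "Double":  return Double.valueOf(raw);
            case "Integer": return Integer.valueOf(raw);
            case "Boolean": return Boolean.valueOf(raw);
            default:        return raw; // String and anything unrecognised
        }
    }
}
```

Note that everything ends up in one String, which is exactly why filtering on a single characteristic in a query becomes awkward.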
Update
If you were to define each Characteristic type as a node (in your example, you would have 4 nodes, one each for weight, size, active, and license), then you don't need an intermediate CharacteristicValueNode to relate the ProductNode and the CharacteristicNode. Instead, you can model the value of the product for the characteristic on the relationship between the ProductNode and CharacteristicNode.
For example, if the ProductNode had only weight and size, then you would have two relationships: one from the ProductNode to the weight CharacteristicNode with the weight value on the relationship, and another from the ProductNode to the size CharacteristicNode with the size value on that relationship.
In SDN 4, these would be modelled as RelationshipEntities. For example,
@RelationshipEntity(type="CHARACTERISTIC")
public class ProductCharacteristic {
    Long id;
    @StartNode ProductNode product;
    @EndNode CharacteristicNode characteristic;
    int value;
    ...
}
The ProductNode would contain a collection of these relationship entities:
Set<ProductCharacteristic> characteristics;
Then you can query for products related to characteristics with a certain name, or query for ProductCharacteristic with findByCharacteristicName.
I haven't really tried this approach out, but it is worth thinking about the change that this forces in your underlying graph model to support dynamic properties.
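To make the shape of this model concrete, here is a plain-Java, in-memory sketch with no SDN annotations. The title field, the with(...) helper and the ProductFilter class are hypothetical names for illustration. The value is typed as Object here so that Double, Integer, String and Boolean characteristics all fit; whether the mapping layer will persist an Object-typed relationship property is something I have not verified.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Plain-Java stand-ins for the graph entities (no SDN/OGM annotations here).
class CharacteristicNode {
    final String name;

    CharacteristicNode(String name) { this.name = name; }
}

// Stand-in for the relationship entity: the value lives on the relationship.
class ProductCharacteristic {
    final ProductNode product;
    final CharacteristicNode characteristic;
    final Object value; // Object so Double, Integer, String and Boolean all fit

    ProductCharacteristic(ProductNode product, CharacteristicNode characteristic, Object value) {
        this.product = product;
        this.characteristic = characteristic;
        this.value = value;
    }
}

class ProductNode {
    final String title; // hypothetical field, only used to tell products apart
    final Set<ProductCharacteristic> characteristics = new LinkedHashSet<>();

    ProductNode(String title) { this.title = title; }

    // Attaches a value for the given characteristic; returns this for chaining.
    ProductNode with(CharacteristicNode characteristic, Object value) {
        characteristics.add(new ProductCharacteristic(this, characteristic, value));
        return this;
    }
}

class ProductFilter {
    // Keeps the products that carry the named characteristic with exactly this value.
    // equals() is type-sensitive: Integer 10 does not match Double 10.0.
    static List<ProductNode> byCharacteristic(List<ProductNode> products, String name, Object expected) {
        List<ProductNode> result = new ArrayList<>();
        for (ProductNode p : products) {
            for (ProductCharacteristic pc : p.characteristics) {
                if (pc.characteristic.name.equals(name) && pc.value.equals(expected)) {
                    result.add(p);
                    break; // one match per product is enough
                }
            }
        }
        return result;
    }
}
```

In the graph, the same filter would be a match on the characteristic name plus a condition on the relationship's value property.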

Related

Neo4j OGM Filter query against array

I'm trying to build a small property search system. I'm using Spring Boot with a Neo4j database, and I'm trying to query the database using filters because I need the querying to be dynamic too. In my use case, properties have features such as electricity, tap water, tiled roof, etc. Property and Feature are nodes; a Feature node has an attribute named 'key', and a Property node is linked to many Feature nodes by rich relationships of type HAS_FEATURE. What I want to do is query properties for a given array of feature keys using Filters. The code follows.
featureKeys is a Java List here:
filter = new Filter("key", ComparisonOperator.IN, featureKeys);
filter.setRelationshipType("HAS_FEATURE");
filter.setNestedPropertyName("hasFeatures");
filter.setNestedPropertyType(Feature.class);
filters.add(filter);
SESSION.loadAll(Property.class, filters, new Pagination(pageNumber, pageSize));
The problem is that I want only the properties related to all the given feature keys to be returned, but properties related to just one element of the given feature key list are returned as well. What do I need to do to query only the properties that are linked to all the given list elements? I can change the rich relationship to a normal relationship if needed.

Performance of a primary key lookup in Realm?

I've recently done some benchmarking, and it seems like looking up another object by primary key:
let foo = realm.object(ofType: Bar.self, forPrimaryKey: id)
is more efficient (and in this specific case more readable), than trying to set the property directly as:
class Other: Object {
    @objc dynamic var relation: Bar? = nil
    let list = List<Bar>()
}
My benchmarking wasn't too thorough though (used only one element in the list, etc.) and I'm wondering if this is actually the case.
Intuition makes me think that both the primary key lookup and the relation property above would be O(1) or O(log n). With 1,000,000 records and 1,000,000 lookups:
primary key: ~10s
relation property: ~12s
list property: ~14s
In summary: what is the performance of Realm's object(ofType:forPrimaryKey:) lookup?
Extra credit: when is it beneficial to use LinkingObjects, Lists, etc.? Assuming it's just a readability / convenience wrapper of some sort. In my case it has been more messy / bug prone, so I'm assuming I'm not using Realm in the way it was intended.
Realm isn't a relational database like SQLite. Instead, data is stored in B+ trees. All the data for a given property on a given model type is stored within a single tree, and all data retrieval (whether getting a property value or a linked object) involves traversing such a tree.
Furthermore, when a Realm is opened, the contents of the entire database file are mmaped into memory. When you use one of the Realm SDKs, the objects you create (e.g. Object instances) are actually thin wrappers that store a conceptual pointer to a location in the database file and provide methods to directly read from and write to the object at that location. Likewise, relationships (such as object properties on a model) are references to nodes elsewhere in the tree.
This means that retrieving an object requires the time it takes to traverse the database data structures to locate the required information, plus the time it takes to instantiate an object and initialize it. The latter is effectively a constant-time operation, so we want to look primarily at the former.
As for the situations you've outlined...
- If you already know your primary key value, getting an object takes O(log n) time, where n is the number of objects of that particular type in the database. (The time it takes to retrieve a Dog is irrespective of the number of Cats the database contains.)
- If you're naively implementing a relational-style foreign key pattern, where you model a link to an object of type U by storing a primary key value (like a string) on some object of type T, it will take O(log t) time to retrieve the primary key value (where t is the number of Ts), and O(log u) time to look up the destination object (as described in the previous bullet point; u = the number of Us).
- If you're using an object property on your model type T to model a link to another object, it takes O(log t) time to retrieve the location of the destination object.
- Using a list introduces another level of indirection, so retrieving the single object from a one-object list will be slower than retrieving an object directly from an object property.
Object, list, and linking objects properties are not intended to be an alternative to looking up objects via primary keys. Rather, they are intended to model many-to-one, many-to-many, and inverse relationships, respectively. For example, a Cat may have a single Owner, so it makes sense for a Cat model to have an object property pointing to its Owner. A Person may have multiple friends, so it makes sense for a Person model to have a list property containing all their friends (which may contain zero, one, or many other Persons).
Finally, if you're interested in learning more, the entire database stack is open source (except for the sync component, which is a strictly optional peripheral component). You can find the code for the core database engine here. We also have an older article that discusses the high-level design of the database engine; you can find that here.

Versioning relationships between nodes in neo4j

Let's say there are two nodes A and B,
(A)-[r]-(B)
r has a property 'weight', that is a measure of dependency of A on B, let's say.
The value of weight frequently changes, and I wish to version the value of weight.
Is it feasible to make a new relationship between the two same nodes, and add a property ['valid': true] on the relationship created last?
I ask this question because I was told that if I need versioning on properties, they should definitely be nodes:
https://twitter.com/ikwattro/status/746997161645187072
But the weight property between the two nodes A and B naturally belongs to the relationship between them. How do I use a node to maintain the weight?
EDIT:
An example:
Let A be a node with label :FRUIT, and B be a node with label :PERSON.
Further, let r be a relationship between the two, of type :LIKING, and let the 'weight' property of r be a measure of how much person B likes fruit A.
The weight property of r keeps changing, and it is required to version this property over time.
I think this depends on two things: the frequency of weight updates and the queries you will run on the versioned weights:
1. If you expect a smallish number of updates and you only keep them for reference, you could use a single relationship and store the old values in a property (e.g. a map or even a string).
2. If you expect a smallish number of updates and you want to query the data regularly, it would be reasonable to use a new relationship for each update.
3. If the weight changes frequently and you actually need to access the data (i.e. collect millions of weight values for millions of fruits), I would not store it in Neo4j. Use a simple MySQL table with PersonID, FruitID, weight, timestamp or some other data store. Store only the latest value in Neo4j.
I use both 2. and 3. a lot, and even though 3. sounds like overkill, it's usually simple to implement as long as you only 'outsource' structured data with clear queries.
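As a rough illustration of the third option, here is an in-memory sketch of the split between an append-only history and a "latest value" store. Both are stand-ins: the list plays the role of the external table (PersonID, FruitID, weight, timestamp) and the map plays the role of the single weight kept on the relationship. The class and method names are hypothetical, and the sketch assumes updates arrive in timestamp order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One row of the history table: PersonID, FruitID, weight, timestamp.
class WeightVersion {
    final long personId;
    final long fruitId;
    final double weight;
    final long timestamp;

    WeightVersion(long personId, long fruitId, double weight, long timestamp) {
        this.personId = personId;
        this.fruitId = fruitId;
        this.weight = weight;
        this.timestamp = timestamp;
    }
}

class WeightHistory {
    // Stand-in for the external table holding every version.
    private final List<WeightVersion> rows = new ArrayList<>();
    // Stand-in for the latest weight stored on the relationship in the graph.
    private final Map<String, Double> latest = new HashMap<>();

    private static String key(long personId, long fruitId) {
        return personId + ":" + fruitId;
    }

    // Appends a new version and overwrites the latest value (assumes ordered updates).
    void record(long personId, long fruitId, double weight, long timestamp) {
        rows.add(new WeightVersion(personId, fruitId, weight, timestamp));
        latest.put(key(personId, fruitId), weight);
    }

    // Returns the current weight, or null if none was ever recorded.
    Double current(long personId, long fruitId) {
        return latest.get(key(personId, fruitId));
    }

    // Returns all recorded versions for one person/fruit pair, oldest first.
    List<WeightVersion> history(long personId, long fruitId) {
        List<WeightVersion> result = new ArrayList<>();
        for (WeightVersion v : rows) {
            if (v.personId == personId && v.fruitId == fruitId) {
                result.add(v);
            }
        }
        return result;
    }
}
```

Graph queries only ever touch the current value; anything historical goes to the other store.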

How to Model a relationship that adds a feature to a node?

This is a follow-up to this earlier question
How to model two nodes related through a third node in neo4j?
If the capabilities of a product are enhanced by a connects_to relationship with another product, how should that fact be captured?
Example: given
(shelf:Shelf {maxload:20}), if (node:L-bracket)-[connects-to]->(shelf), then shelf's maxload increases by 10. Now, if someone queries for a Shelf that supports maxload=30, I should be able to retrieve this combination of L-Bracket+Shelf as an option, in addition to the shelves that support maxload without L-bracket. This is one use-case.
The other is when the connects_to relationship adds an entirely new property to the Shelf node. The option I'm thinking of is adding a property to the relationship called 'provides feature' and then querying those as well when returning nodes, to see if a product has been enhanced by any of its connections.
Part 1 :
I should be able to retrieve this combination of L-Bracket+Shelf as
an option, in addition to the shelves that support maxload without
L-bracket.
This use case is handled with OPTIONAL MATCH :
MATCH (shelf:Shelf {maxload:30})
OPTIONAL MATCH (shelf)<-[:CONNECTS_TO]-(bracket:`L-Bracket`)
RETURN shelf, collect(bracket) as brackets
This returns a list of shelves and a collection of brackets for each of them (an empty collection if a shelf has no brackets). Note that the label L-Bracket contains a hyphen, so it has to be quoted with backticks.
Part 2 :
the other is when the connects_to relationship adds an entirely new
property to the Shelf node. The option i'm thinking of is adding a
property to the relationship called 'provides feature' and then query
those as well when returning nodes, to see if a product is been
enhanced by any of its connections
You can simply use a PROVIDES_FEATURE relationship type; there is no need for a property on it. You can query for these relationships in the same way as in part 1.
To be a bit more general, suppose everything that can be connected to a shelf (not just an L-Bracket) was represented by an Accessory node that has type and extraLoad properties, like this:
(:Accessory {type: 'L-Bracket', extraLoad: 10})
This would allow accessories of different types and with differing extra load capacities.
With this model, you could find all Shelf/Accessory combinations that can hold a load of at least 30 this way:
MATCH (shelf:Shelf)
OPTIONAL MATCH (shelf)<-[:CONNECTS_TO]-(x:Accessory)
WITH shelf, COLLECT(x) AS accessories, SUM(x.extraLoad) AS extra
WHERE shelf.maxload + extra >= 30
RETURN shelf, accessories;
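To spell out what this query computes, here is an in-memory Java sketch of the same aggregate-then-filter logic. The classes are plain stand-ins for the nodes, the name field and the ShelfSearch class are hypothetical, and the connected accessories are held in a list on the shelf instead of being reached through CONNECTS_TO.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for an (:Accessory {type, extraLoad}) node.
class Accessory {
    final String type;
    final int extraLoad;

    Accessory(String type, int extraLoad) {
        this.type = type;
        this.extraLoad = extraLoad;
    }
}

// Stand-in for a (:Shelf {maxload}) node plus its connected accessories.
class Shelf {
    final String name; // hypothetical field, only used to tell shelves apart
    final int maxload;
    final List<Accessory> accessories = new ArrayList<>();

    Shelf(String name, int maxload) {
        this.name = name;
        this.maxload = maxload;
    }
}

class ShelfSearch {
    // Returns the shelves whose own maxload plus the extra load of all
    // connected accessories reaches the required load.
    static List<Shelf> supporting(List<Shelf> shelves, int requiredLoad) {
        List<Shelf> result = new ArrayList<>();
        for (Shelf shelf : shelves) {
            int extra = 0; // a shelf without accessories contributes 0 extra load
            for (Accessory a : shelf.accessories) {
                extra += a.extraLoad;
            }
            if (shelf.maxload + extra >= requiredLoad) {
                result.add(shelf);
            }
        }
        return result;
    }
}
```

A shelf with maxload 20 and one L-Bracket of extraLoad 10 qualifies for a required load of 30, as does a bare shelf with maxload 30; a bare shelf with maxload 20 does not.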

Combining two neo4j cypher calls into one call

I have the following two cypher calls that I'd like to combine into one;
start r=relationship:link("key:\"foo\" and value:\"bar\"") return r.guid
This returns a relationship that contains a guid that I need based on a key value pair (in this case key:foo and value:bar).
Let's assume r.guid above returns 12345.
I then need all the property relationships for the object in question based on the returned guid and a property type key;
start r=relationship:properties("to:\"12345\" and key:\"baz\"") return r
This returns several relationships which have the values I need, in this case all property types baz that belong to guid 12345.
How do I combine these two calls into one? I'm sure it's simple, but I'm stumbling.
The answer I've gotten is that there is no way to perform an index lookup in the middle of a Cypher query, or to use a variable you have declared to perform the lookup.
Perhaps a later version of Cypher will support this, as the ability should be standard, especially given the dense node issue and the suggested solution of indexing.
