Neo4j Cypher query: perform calculations on relationship properties during traversal

I am working on an RPG game, specifically on the artifact exchange component. I am using the Neo4j graph database to store all artifacts and the players' exchange orders for these artifacts.
Graph diagram looks as follows:
The :Exchange relationships represent players' exchange orders. For example, Player B is exchanging 1 Mega Boots artifact for 10 gold, and Player C is exchanging 1 Mega Helmet for 2 pairs of Mega Boots.
I am now working on a Cypher query that should return different paths. Each path should reveal a sequence of artifact exchange orders such that, in the end, I have more gold than I had at the beginning.
For example, with an existing gold amount of 100:
Path1: Gold->MegaBoots->MegaHelmet->MegaSword->Gold, Number of gold after all exchanges 115
Path2: Gold->MegaBoots->MegaHelmet->Gold, Number of gold after all exchanges 111
Complexity: when moving between 2 adjacent nodes, the query should determine (by performing a calculation on the properties of the relationship that connects these nodes) whether I have enough resources to reach the end node.
For example:
Initially, the gold amount is 10 and the query starts moving from the start node :Artifact({name=gold}) to its adjacent node :Artifact({name=MegaBoots}). The query sees 2 :Exchange relationships and selects only the relationship with id=2, as its baseResourceAmount property is equal to the initial gold amount (the relationship with id=1 is not suitable, as its baseResourceAmount value is greater than the initial gold amount).
Now the query moves from node :Artifact({name=MegaBoots}) to the end node :Artifact({name=MegaHelmet}) using the :Exchange relationship with id=4, as after the 1st exchange our resource amount is 2, which is equal to the relationship's baseResourceAmount property value.
Eventually, the final path will be Gold--:Exchange(id=2)-->MegaBoots--:Exchange(id=4)-->MegaHelmet
So, does anyone know how to tell Cypher to perform specific calculations on the properties of the relationships that connect 2 adjacent nodes?
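To make the intended traversal rule concrete, here is a minimal in-memory sketch in Python rather than Cypher. It is not a Neo4j query and rests on several assumptions: the `Exchange` tuple, the `target_amount` field (the question only names baseResourceAmount), the values for the relationship with id=1, and the simplification that the amount held after an exchange is exactly `target_amount` are all hypothetical.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for an :Exchange relationship.
# base_amount mirrors baseResourceAmount; target_amount is an assumed
# property describing how much of the end artifact the order yields.
Exchange = namedtuple("Exchange", "id start end base_amount target_amount")


def affordable_paths(exchanges, start, amount, max_hops=4):
    """Return every sequence of exchange ids that can be followed from `start`.

    An exchange is followed only if the amount currently held covers its
    base_amount. After the exchange, the held amount becomes target_amount
    (a simplifying assumption: leftovers are ignored).
    """
    results = []

    def walk(node, held, path, used):
        for ex in exchanges:
            if ex.start != node or ex.id in used:
                continue  # wrong start node, or order already used in this path
            if ex.base_amount > held:
                continue  # not enough resources for this exchange
            new_path = path + [ex.id]
            results.append(new_path)
            if len(new_path) < max_hops:
                walk(ex.end, ex.target_amount, new_path, used | {ex.id})

    walk(start, amount, [], frozenset())
    return results


# Sample data loosely following the question; the amounts for id=1 are made up.
exchanges = [
    Exchange(1, "Gold", "MegaBoots", 20, 3),
    Exchange(2, "Gold", "MegaBoots", 10, 2),
    Exchange(4, "MegaBoots", "MegaHelmet", 2, 1),
]

paths = affordable_paths(exchanges, "Gold", 10)
print(paths)
```

With 10 gold, the relationship with id=1 is skipped and the longest path found is id=2 followed by id=4, matching the example above.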

Related

How to calculate custom degree based on the node label or other conditions?

I have a scenario where I need to calculate a custom degree starting from the first node (:employee), where the degree should only be incremented when moving to another node whose label is :natural or :relative, but not when it is :legal.
Example:
The thing is I'm having trouble generating this custom degree property as I needed it.
So far I've tried playing with FOREACH and CASE but had no luck. The closest I got to getting some sort of calculated custom degree is this:
match p = (:employee)-[*5..5]-()
WITH distinct nodes(p) AS nodes
FOREACH(i IN RANGE(0, size(nodes)) |
FOREACH(node IN [nodes[i]] |
SET node.degree = i
))
return *
limit 1
But even this isn't right, as despite having 5 distinct nodes, I get SIZE(nodes) = 6, as the :legal node is accounted for twice for some reason.
Does anyone know how to achieve my goal within a single cypher query?
Also, if you know why the :legal node is accounted for twice, please let me know. I suspect it is because it has 2 :natural nodes related to it, but I don't know the inner workings that make it appear twice.
More context:
:employee nodes are, well, employees of an organization
:relative nodes are relatives to an employee
:natural nodes are natural persons that may or may not be related to a :legal
:legal nodes are companies (legal persons) that may, or may not, be related to an :employee, :relative, :natural or another :legal on an IS_PARTNER relationship when, in real life, they are part of the board of directors or are shareholders of that company (:legal).
custom degree is what I aim to create and will define how close one node is to another given some conditions to this project (specified below).
All nodes have a total_contracts property that holds the total amount of money received through contracts.
The objective is to find any employees with relationships to another node that has total_contracts > 0 and are up to custom degree <= 3, as employees may be receiving money from external sources, when they shouldn't.
As for why I need this custom degree to ignore the distance when it is a :legal node, it is because we treat companies as being at the same distance as the natural person that is a partner.
In the illustrated example above, the employee has a son, DIEGO, who is a shareholder of a company (ALLURE) and has 2 other business partners (JOSE and ROSIEL). When I ask what the degree of the son to the employee is, I should get 1, as they are directly related; when I ask what the degree of JOSE to the employee is, I should get 2, as JOSE is related to DIEGO through ALLURE and we shouldn't increment the custom degree when it is a company, only when it's a person.
The trick with this type of graph is making sure we avoid paths that loop back to the same nodes (which is definitely going to happen quite a lot because you're using multiple relationships between nodes instead of just one...you may want to make sure this is necessary in your model).
The easiest way to do that is via APOC Procedures, as you can adjust the uniqueness of traversals so that nodes are unique in each path.
So for example, for a specific start node (let's say the :employee has empId:1, just for the sake of mocking up a lookup of the node), we'll calculate a degree for all nodes within 5 hops of the starting node. The idea here is that we'll take the length of the path (the number of hops) minus the number of :legal nodes in the path (by filtering the nodes in the path for just :legal nodes, then getting the size of that filtered list).
MATCH (e:employee {empId:1})
CALL apoc.path.expandConfig(e, {minLevel:1, maxLevel:5, uniqueness:'NODE_PATH'}) YIELD path
WITH e, last(nodes(path)) as endNode,
length(path) - size([x in nodes(path) WHERE x:legal]) as customDegree
RETURN e, endNode, customDegree
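The arithmetic behind customDegree can be checked by hand with a small Python sketch. It assumes each path is given simply as the list of node labels from the employee to the end node; the function name and the sample paths (taken from the DIEGO / ALLURE / JOSE example in the question) are illustrative only.

```python
def custom_degree(path_labels):
    """Number of hops in the path minus the number of 'legal' nodes on it.

    path_labels lists the label of every node on the path, starting with
    the employee, so a path with k hops has k + 1 entries.
    """
    hops = len(path_labels) - 1
    legal_nodes = sum(1 for label in path_labels if label == "legal")
    return hops - legal_nodes


# employee -> DIEGO (relative)
diego = custom_degree(["employee", "relative"])
# employee -> DIEGO (relative) -> ALLURE (legal) -> JOSE (natural)
jose = custom_degree(["employee", "relative", "legal", "natural"])
print(diego, jose)
```

This reproduces the degrees the question expects: 1 for DIEGO and 2 for JOSE, because the hop through the company is not counted.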

Neo4j: Pattern Simulation In Graph

I am trying to simulate patterns in a graph. My graph contains 20 million Persons and 4 million Organizations. Now I have to select nodes randomly and create patterns like this:
(n1:Person:Employee)-[:EMPLOYED_BY]->(m1:Organization:Seller)
(n2:Person:BuyerContact)-[:EMPLOYED_BY]->(m2:Organization:Buyer)
(n1)-[:P2P]-(n2)
Here, organizations m1 and m2 can each have more than one employee, sometimes over 100. This means we have to select some number n of people and create EMPLOYED_BY relationships following the pattern above.
Since picking random samples is a very tedious task in Neo4j, the operation takes quite a long time to pick nodes randomly. How can I speed up the pattern simulation?

Depth wise retrieval of nodes from neo4j

I have a science graph in neo4j which has names of some scientists as nodes and connected to nodes holding laws by relation has_discovered. The laws are then related to their application by relation has_application. I am new to cypher. I want to know what cql query will give me level 1 and level 2 nodes of the scientists nodes. Here level 1 will be the nodes holding laws and level 2 will be nodes holding their applications.
This query should probably take care of it, assuming your labels are :Scientist, :Law, and :Application.
MATCH (sci:Scientist)-[:has_discovered]->(law:Law)-[:has_application]->(app:Application)
RETURN sci, law, app
As long as your :has_discovered and :has_application relationships only connect those types of nodes, you can leave off the :Law and :Application labels (but you'll want to keep the :Scientist label so you begin your pattern match only at :Scientist nodes).
You can use COLLECT() as necessary to group results if you want.
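As a rough illustration of what grouping with COLLECT() gives you, the Python sketch below groups flat (scientist, law, application) rows into nested lists. The rows are made-up placeholders standing in for the rows the query above would return, and `group_by_scientist` is a hypothetical helper, not part of any Neo4j driver.

```python
from collections import defaultdict

# Placeholder rows shaped like the query result: one row per
# (scientist, law, application) combination.
rows = [
    ("Scientist A", "Law 1", "App X"),
    ("Scientist A", "Law 1", "App Y"),
    ("Scientist A", "Law 2", "App Z"),
    ("Scientist B", "Law 3", "App X"),
]


def group_by_scientist(rows):
    """Group level-2 nodes (applications) under level-1 nodes (laws) per scientist."""
    grouped = defaultdict(lambda: defaultdict(list))
    for scientist, law, application in rows:
        grouped[scientist][law].append(application)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {scientist: dict(laws) for scientist, laws in grouped.items()}


levels = group_by_scientist(rows)
print(levels)
```

Each scientist then maps to their laws (level 1), and each law to its applications (level 2).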

Fast search for unconnected nodes in big neo4j graph

So, I've created a Neo4j graph database out of a relational database. The graph database has about 7 million nodes and about 9 million relationships between the nodes.
I now want to find all nodes that are not connected to nodes with a certain label (let's call them unconnected nodes). For example, I have nodes with the labels "Customer" and "Order" (let's call them top-level nodes). I want to find all nodes that have no relationship from or to these top-level nodes. The relationship doesn't have to be direct; the nodes can be connected to the top-level nodes via other nodes.
I have a cypher query which would solve this problem:
MATCH (a) WHERE not ((a)-[*]-(:Customer)) AND not ((a)-[*]-(:Order)) RETURN a;
As you can imagine, the query takes a long time to execute; the performance is bad, most likely because the relationships are undirected and because a path may go through any number of nodes. However, the relationship directions don't matter, and I need to make sure that there is no path from any node to one of the top-level nodes.
Is there any way to find the unconnected nodes faster ? Note that the database is really big, and there are more than 2 labels which mark top-level-nodes.
You could try this approach, which does involve more operations, but can be run in batches for better performance (see apoc.periodic.commit() in the APOC procedures library).
The idea is to first apply a label (say, :Unconnected) to all nodes in your graph (batch execute with apoc.periodic.commit), and then, taking batches of top level nodes with that label, matching to all nodes in the subgraphs extending from them and removing that label.
When you finally have run out of top level nodes with the :Unconnected label (meaning all top level nodes and their subgraphs no longer have this label) then the only nodes remaining in your graph with the :Unconnected label are not connected to your top level nodes.
Any approach to this kind of operation will likely be slow, but the advantage again is that you can process this in batches, and if you get interrupted, you can resume. Once your queries are done, all the relevant unconnected nodes are now labeled for further processing at your convenience.
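The label-then-remove idea can be sketched in plain Python on a small in-memory graph. This is only a model of the logic, not APOC or Cypher: the adjacency dictionary, the sample node names, the :Item label and the `batch_size` parameter are assumptions made for illustration.

```python
from collections import deque


def find_unconnected(adjacency, labels, top_labels, batch_size=2):
    """Return the nodes that cannot reach any top-level node.

    adjacency maps each node to its neighbours (treated as undirected),
    labels maps each node to its set of labels.
    """
    # Step 1: mark every node as unconnected.
    unconnected = set(adjacency)
    while True:
        # Step 2: take a batch of top-level nodes that still carry the mark.
        batch = [n for n in sorted(unconnected) if labels[n] & top_labels][:batch_size]
        if not batch:
            break  # no marked top-level nodes are left
        unconnected.difference_update(batch)
        # Step 3: remove the mark from everything reachable from the batch.
        queue = deque(batch)
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour in unconnected:
                    unconnected.discard(neighbour)
                    queue.append(neighbour)
    return unconnected


# Made-up sample graph: c1 and o1 are top-level; x, y and z reach neither.
adjacency = {
    "c1": {"a"}, "a": {"c1", "b"}, "b": {"a"},
    "o1": {"c"}, "c": {"o1"},
    "x": {"y"}, "y": {"x"},
    "z": set(),
}
labels = {name: {"Item"} for name in adjacency}
labels["c1"] = {"Customer"}
labels["o1"] = {"Order"}

remaining = find_unconnected(adjacency, labels, {"Customer", "Order"})
print(sorted(remaining))
```

Because a node is only expanded while it still carries the mark, each node is visited at most once, which is what makes the batched approach resumable.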
Also, one last note, in Neo4j undirected relationships have no arrows in the syntax ()-[*]-().
MATCH (a)
WHERE
not (a:Customer OR a:Order)
AND shortestPath((a)-[*]-(:Customer)) IS NULL
AND shortestPath((a)-[*]-(:Order)) IS NULL
RETURN a;
If you could add rel-types it would be faster.
One further optimization could be to check the nodes of a :Customer path for an :Order node and vice versa, i.e.
NONE(n in nodes(path) WHERE n:Order)
In general, this might be rather a set operation, i.e.
expand around all order and customer nodes in parallel into two sets
and compute the overlap between the two sets.
Then remove the overlap from the total number of nodes.
I added an issue for APOC here to add such a function or procedure:
https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/223

neo4j - find child nodes with property value is slow

I have a graph with approximately 1 million nodes.
The graph represents a catalog tree (spare parts). The maximum depth is about 6.
A node has a filter property that can have any value, even empty. This filter property is used to filter the catalog for the user.
What I want is to ask a question like this when I click a node (any level):
"for each child node, tell me if any of its children (any level) has a filter attribute with a value of ...".
With my query it takes about 12 seconds per child to get the result. Shouldn't this scenario be an ideal use case for Neo4j? Shouldn't it be much faster?
I can send the nodes and relations as text files if you want the data.
My query is something like this:
start n=node(3)
match n-[:PARENT_ITEM*1..6]->x
where x.filter="something"
return count(x)
I'm running on a Windows Azure Large server (4 cores, 7 GB RAM) and I haven't changed any configuration after installing Neo4j.
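For reference, the question being asked of the graph can be written as a small Python sketch over an in-memory tree. The `children` and `filters` dictionaries and the sample node names are hypothetical, and the sketch assumes that a child whose own filter matches also counts, as it would in the query above.

```python
def children_with_match(children, filters, node, wanted, max_depth=6):
    """For each direct child of `node`, report whether the child or any of its
    descendants (up to max_depth hops below `node`) has filter == wanted."""

    def has_match(current, depth):
        if filters.get(current) == wanted:
            return True
        if depth >= max_depth:
            return False  # do not descend below the depth limit
        return any(has_match(child, depth + 1) for child in children.get(current, []))

    # Direct children sit at depth 1 relative to the clicked node.
    return {child: has_match(child, 1) for child in children.get(node, [])}


# Made-up catalog fragment.
children = {"root": ["a", "b"], "a": ["a1"], "a1": ["a2"], "b": ["b1"]}
filters = {"a2": "something", "b1": "other"}

result = children_with_match(children, filters, "root", "something")
print(result)
```

Stopping at the first match per child, instead of counting every matching descendant, is one way to do less work when only a yes/no answer per child is needed.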
