I am looking for a way to generate small graph queries in Neo4j.
Because the database is huge, can anyone suggest a procedure for generating small query graphs (3-5 nodes, e.g. a -> b -> c -> a)?
I can run a BFS from a node, but how can I find a small subgraph containing only a specific number of nodes, with a structure like this?
   a
  / \
b-----c----d
[UPDATED]
If you want to get a single arbitrary path of length 4 (having 4 relationships and 5 nodes), and you do not need the path to be unidirectional, then you can simply do this:
MATCH p=()-[*4]-()
RETURN p
LIMIT 1;
If you want the path to be unidirectional (where all relationships point in the same direction), then you just need to specify a direction:
MATCH p=()-[*4]->()
RETURN p
LIMIT 1;
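To address the "specific number of nodes" part of the question, here is a minimal sketch of the idea in plain Python rather than Cypher: a breadth-first search that stops once it has collected k nodes, followed by a lookup of the edges among those nodes. The adjacency dict and the function names are made up for illustration; in practice the neighbours would come from the database.

```python
from collections import deque

def sample_connected_nodes(adjacency, start, k):
    """Collect up to k nodes reachable from start, in breadth-first order."""
    seen = [start]
    seen_set = {start}
    queue = deque([start])
    while queue and len(seen) < k:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen_set:
                seen_set.add(neighbour)
                seen.append(neighbour)
                queue.append(neighbour)
                if len(seen) == k:
                    break
    return seen

def induced_edges(adjacency, nodes):
    """Undirected edges whose two endpoints are both in the sampled node set."""
    keep = set(nodes)
    return sorted({tuple(sorted((u, v)))
                   for u in keep
                   for v in adjacency.get(u, ())
                   if v in keep})

# Toy graph shaped like the sketch in the question (hypothetical data).
adjacency = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
nodes = sample_connected_nodes(adjacency, "a", 3)
print(nodes, induced_edges(adjacency, nodes))
```

Because the search only ever adds neighbours of nodes it has already visited, the sampled nodes always form a connected subgraph.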
Related
I have a cypher query that returns a series of paths, which are partly overlapping and result in a number of distinct clusters. In this case there will be a modest number of clusters (100 - 1000) of relatively small size (1-50 nodes). The complete dataset is typically a few million nodes (the query extracts a relatively small subset of the total nodes).
A simplified version of the query looks like this:
MATCH p=(a:M)-[:F2EDGE]-(b:M) WHERE a.prop > 90 AND b.prop > 90 RETURN p
The actual query would be a bit more complex than that with a variable number of intermediate nodes, but that should exemplify the problem.
Now I want to explore the different clusters that are generated by that query.
I have found the docs on the Connected Components algorithm which seems on the right lines, but I can't see how that can be applied to a list of paths that is the result of the query.
I would want to be able to:
get a list of the clusters and some basic properties for them (e.g. number of nodes)
fetch data that allows me to reproducibly fetch that cluster again in the future (maybe by fetching the node ids or by adding new "cluster" nodes that link to each cluster)
Can someone suggest how to achieve this?
You can use Cypher projections for that, something along these lines:
CALL algo.unionFind('
MATCH (a:M) WHERE a.prop > 90 RETURN id(a) as id
UNION
MATCH (b:M) WHERE b.prop > 90 RETURN id(b) as id
', '
MATCH p=(a:M)-[:F2EDGE]->(b:M) WHERE a.prop > 90 AND b.prop > 90 RETURN id(a) as source, id(b) as target
', {graph:"cypher",write:true, partitionProperty:"partition"})
Please note that in this case one of the node queries would have been enough, as they both have the same criteria; I just wanted to demonstrate how to combine queries on source and target nodes.
If you want to restrict the nodes to only the ones in your connected graph you can also use this as "node-query":
MATCH (a:M)-[:F2EDGE]-(b:M)
WHERE a.prop > 90 AND b.prop > 90
UNWIND [id(a), id(b)] as id
RETURN distinct id
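To make it clearer what the partition result gives you, here is a rough sketch of union-find in plain Python. It is not the library's implementation, just the same idea applied to a list of (source, target) id pairs such as the relationship query above would return. The function name and the sample ids are made up.

```python
def connected_components(edges):
    """Group node ids into clusters using union-find over (source, target) pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for source, target in edges:
        root_s, root_t = find(source), find(target)
        if root_s != root_t:
            parent[root_t] = root_s  # merge the two clusters

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), []).append(node)
    # Largest clusters first; members sorted so the output is reproducible.
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: (-len(members), members))

# Hypothetical node ids standing in for the rows of the relationship query.
edges = [(1, 2), (2, 3), (10, 11)]
for members in connected_components(edges):
    print(len(members), members)
```

Each cluster comes back as a sorted list of node ids, which covers both wishes in the question: the length of the list is the cluster size, and storing the ids lets you fetch the same cluster again later.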
I want to look up the top 5 (shortest) paths in my graph (Neo4j 3.0.4) from point A to point Z.
The graph consists of several nodes that are connected by the relation "CONNECTED_TO". This connection has a cost property that should be minimized.
I started with this:
MATCH p=(from:Stop{stopId:'A'}), (to:Stop{stopUri:'Z'}),
path = allShortestPaths((from)-[:CONNECTED_TO*]->(to))
WITH REDUCE (total = 0, r in relationships(p) | total + r.cost) as tt, path
RETURN path, tt
This query always returns the subgraph with the fewest hops; the cost property is not considered. There is another subgraph with more hops that has a lower total cost. What am I doing wrong?
Furthermore, I actually want to get the TOP 5 subgraphs. If I execute this query:
MATCH p=(from:Stop{stopUri:'A'})-[r:CONNECTED_TO*10]->(to:Stop{stopUri:'Z'}) RETURN p
I can see several paths, but the first query returns just one path.
The path should not contain loops etc. of course.
I want to execute this query via the REST API, so a REST call or Cypher query should do it.
EDIT1:
I want to execute this as a REST call, so I tried the Dijkstra algorithm. This seems to be a good way, but I have to calculate the weight by adding 3 different cost properties on the relation. How could this be achieved?
allShortestPaths will find the shortest path between two points and then match every path that has the same number of hops. If you want to minimize based on cost rather than traversal length, try something like this:
MATCH (from:Stop{stopId:'A'}), (to:Stop{stopUri:'Z'}),
path = (from)-[:CONNECTED_TO*]->(to)
WITH REDUCE (total = 0, r in relationships(path) | total + r.cost) as cost, path
ORDER BY cost
RETURN path LIMIT 5
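For the EDIT1 part (a weight made of three cost properties, plus the "no loops" requirement), here is a sketch of the idea in plain Python, outside the database. It sums three properties into one weight per relationship and then does a best-first search over loop-free partial paths, so complete paths come out cheapest first. The property names cost1, cost2 and cost3 and the sample data are made up, and it assumes the costs are non-negative. The search can get slow on large, dense graphs.

```python
import heapq

def cheapest_simple_paths(edges, start, goal, k=5,
                          cost_keys=("cost1", "cost2", "cost3")):
    """Return up to k loop-free paths from start to goal, cheapest first."""
    # node -> list of (neighbour, combined weight of the relationship)
    adjacency = {}
    for source, target, props in edges:
        weight = sum(props[key] for key in cost_keys)
        adjacency.setdefault(source, []).append((target, weight))

    results = []
    heap = [(0, [start])]  # partial paths ordered by accumulated cost
    while heap and len(results) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == goal:
            results.append((cost, path))
            continue
        for neighbour, weight in adjacency.get(node, ()):
            if neighbour not in path:  # skip anything that would form a loop
                heapq.heappush(heap, (cost + weight, path + [neighbour]))
    return results

# Hypothetical data: the direct hop A -> Z is the most expensive route.
edges = [
    ("A", "B", {"cost1": 1, "cost2": 1, "cost3": 1}),
    ("B", "Z", {"cost1": 1, "cost2": 1, "cost3": 1}),
    ("A", "Z", {"cost1": 5, "cost2": 3, "cost3": 2}),
    ("A", "C", {"cost1": 1, "cost2": 0, "cost3": 0}),
    ("C", "B", {"cost1": 1, "cost2": 0, "cost3": 0}),
]
for cost, path in cheapest_simple_paths(edges, "A", "Z"):
    print(cost, " -> ".join(path))
```

In this toy data the three-hop route is the cheapest and the single-hop route is the most expensive, which is exactly the case where a hop-based shortest path gives the wrong answer.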
In a graph where the following nodes
A,B,C,D
have a relationship with each node's successor
(A->B)
and
(B->C)
etc.
How do I make a query that starts with A and gives me all nodes (and relationships) from there outwards?
I do not know the end node (C).
All I know is to start from A and traverse the whole connected graph (with conditions on relationship and node type).
I think you need to use this pattern:
(n)-[*]->(m) - variable length path of any number of relationships from n to m. (see Refcard)
A sample query would be:
MATCH path = (a:A)-[*]->()
RETURN path
Also have a look at the path functions in the refcard to expand your Cypher query (I don't know what exact conditions you'll need to apply).
To get all the nodes / relationships starting at a node:
MATCH (a:A {id: "id"})-[r*]-(b)
RETURN a, r, b
This will return all the graphs originating with node A / Label A where id = "id".
One caveat - if this graph is large the query will take a long time to run.
I have a graph with about 800k nodes and I want to create random relationships among them, using Cypher.
Examples like the following didn't work because the cartesian product is too big:
match (u),(p)
with u,p
create (u)-[:LINKS]->(p);
For example I want 1 relationship for each node (800k), or 10 relationships for each node (8M).
In short, I need a Cypher query that creates relationships between nodes UNIFORMLY.
Does someone know the query to create relationships in this way?
So you want every node to have exactly x relationships? Try this in batches until no more relationships are updated:
MATCH (u),(p) WHERE size((u)-[:LINKS]->()) < {x}
WITH u,p LIMIT 10000 WHERE rand() < 0.2 // LIMIT to 10000 then sample
CREATE (u)-[:LINKS]->(p)
This should work (assuming your neo4j server has enough memory):
MATCH (n)
WITH COLLECT(n) AS ns, COUNT(n) AS len
FOREACH (i IN RANGE(1, {numLinks}) |
FOREACH (x IN ns |
FOREACH(y IN [ns[TOINT(RAND()*len)]] |
CREATE (x)-[:LINK]->(y) )));
This query collects all nodes, and uses nested loops to do the following {numLinks} times: create a LINK relationship between every node and a randomly chosen node.
The innermost FOREACH is used as a workaround for the current Cypher limitation that you cannot put an operation that returns a node inside a node pattern. To be specific, this is illegal: CREATE (x)-[:LINK]->(ns[TOINT(RAND()*len)]).
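If holding everything in one transaction is a problem, the same sampling can be done client-side. Here is a small sketch in plain Python that mirrors the nested loops above: for every node it picks the requested number of targets uniformly at random, without building the cartesian product. The function name, the seed parameter and the sample ids are made up. Like the query, it can produce self-links and duplicate pairs.

```python
import random

def random_links(node_ids, links_per_node, seed=None):
    """Pick links_per_node uniformly random targets for every node id."""
    rng = random.Random(seed)  # a seed makes the sample reproducible
    pairs = []
    for source in node_ids:
        for _ in range(links_per_node):
            pairs.append((source, rng.choice(node_ids)))
    return pairs

# Hypothetical ids; with 800k nodes and 10 links each this yields 8M pairs.
node_ids = list(range(1000))
pairs = random_links(node_ids, links_per_node=3, seed=42)
print(len(pairs), pairs[:5])
```

The resulting (source, target) pairs could then be sent to the database in batches of whatever size your server handles comfortably.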
I have created a graph db in Neo4j and want to use it for generalization purposes.
There are about 500,000 nodes (20 distinct labels) and 2.5 million relations (50 distinct types) between them.
In an example path : a -> b -> c-> d -> e
I want to find out the node without any incoming relations (which is 'a').
And I should do this for all the nodes (finding the nodes at the beginning of all possible paths that have no incoming relations).
I have tried several Cypher queries without any success:
match (a:type_A)-[r:is_a]->(b:type_A)
with a,count (r) as count
where count = 0
set a.isFirst = 'true'
or
match (a:type_A), (b:type_A)
where not (a)<-[:is_a*..]-(b)
set a.isFirst = 'true'
Where is the problem?!
I also have to write this query in Neo4jClient.
Your first query will only match paths where there is a relationship [r:is_a], so counting r can never be 0. Your second query will return any arbitrary pair of nodes labeled :type_A that aren't transitively related by [:is_a]. What you want is to filter on a path predicate. For the general case try
MATCH (a)
WHERE NOT ()-->(a)
RETURN a
This translates roughly to "any node that does not have incoming relationships". You can specify the pattern with types, properties or labels as needed, for instance
MATCH (a:type_A)
WHERE NOT ()-[:is_a]->(a)
RETURN a
If you want to find all nodes that have no incoming relationships, you can find them using OPTIONAL MATCH:
MATCH (n)
OPTIONAL MATCH (n)<-[r]-()
WITH n,r
WHERE r IS NULL
RETURN n
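The logic behind both approaches is a simple set difference: collect every node that appears as the target of a relationship, then keep the nodes that are not in that set. Here is a sketch in plain Python, with made-up data shaped like the a -> b -> c -> d -> e example from the question. The function name and the triple format are made up for illustration.

```python
def nodes_without_incoming(nodes, relationships, rel_type=None):
    """Return the nodes that are never the target of a (matching) relationship."""
    has_incoming = {target
                    for source, kind, target in relationships
                    if rel_type is None or kind == rel_type}
    return [node for node in nodes if node not in has_incoming]

# Hypothetical data: the chain from the question plus one isolated node "f".
nodes = ["a", "b", "c", "d", "e", "f"]
relationships = [("a", "is_a", "b"), ("b", "is_a", "c"),
                 ("c", "is_a", "d"), ("d", "is_a", "e")]
print(nodes_without_incoming(nodes, relationships, rel_type="is_a"))
```

Note that a completely isolated node also counts as having no incoming relationships, which is the same behaviour as the WHERE NOT pattern.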