py2neo: two ways to create a node?

I am confused about the difference between the two ways below to create a node. The result seems to be the same:
from py2neo import Graph, Node
graph = Graph()
graph.cypher.execute("CREATE (a:Person {name:{N}})", {"N": "Alice"}) # a
graph.create( Node("Person",name="Alice")) # b

Looking at the py2neo v3 documentation, it seems there's yet a third way to create a node.
First instantiate a Node object, as in
a = Node("Person",name="Alice")
then insert it in a subgraph (see py2neo types),
sg = Subgraph(a)
then create elements of this subgraph (Graph.create method):
graph.create(sg)
I understand, however, that subgraph creation should be preferred when creating numerous nodes and edges (a subgraph ...).

You're right, the result is exactly the same. Py2neo exposes two levels of API: a pure Cypher API (execute) and a simpler object-based API (Node). The latter is generally easier to get up and running with; the former is more comprehensive.

Related

Neo4j Cypher find all paths exploring sorted relationships

I've been struggling for days to find a way to find all paths (up to a maximum length) between two nodes while controlling Neo4j's path exploration by sorting the relationships to be explored (by one of their properties).
To be clear, let's say I want to find the K best paths between two nodes, up to a maximum length M. The query would look like:
match (source{name:"source"}), (target{name:"target"}),
p = (source)-[*..M]->(target)
return p order by length(p) limit K;
So far so good. But let's say the relationships in the path have a property called "priority". What I want is a query that tells Neo4j, at each step of path exploration, which relationships should be explored first.
I know this is possible with the Java libraries and an embedded database (by implementing the PathExpander interface and passing it to GraphAlgoFactory.allSimplePaths()).
But now I'm trying to find a way to do this when accessing the database in server mode over Bolt or the REST API.
Is there any way to do this in server mode? Or maybe by using the Java library functions while accessing the graph in server mode?
use labels and an index to find your two start-nodes
perhaps consider allShortestPaths to make it faster
try this:
match (source{name:"source"}), (target{name:"target"}),
p = (source)-[rels*..20]->(target)
return p, reduce(prio=0, r IN rels | prio + r.priority) as priority
order by priority ASC, length(p)
limit 100;
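To make the ordering in that query concrete, here is a rough plain-Python sketch of the same idea on a toy in-memory graph. Everything in it (the `ranked_paths` helper, the adjacency-dict layout, the sample data) is hypothetical and only for illustration; it is not Neo4j code. It also assumes simple paths (no repeated nodes), which is stricter than Cypher's rule of not repeating relationships.

```python
from typing import Dict, List, Tuple

# Hypothetical toy graph: node -> list of (neighbour, priority) pairs.
Graph = Dict[str, List[Tuple[str, int]]]


def ranked_paths(graph: Graph, source: str, target: str,
                 max_len: int, k: int) -> List[Tuple[int, List[str]]]:
    """Return up to k (total_priority, path) pairs, lowest total first."""
    found: List[Tuple[int, List[str]]] = []

    def walk(node: str, path: List[str], total: int) -> None:
        if node == target and len(path) > 1:
            found.append((total, list(path)))
            return
        if len(path) - 1 == max_len:  # path already has max_len relationships
            return
        # Explore relationships with the lowest priority value first.
        for nxt, prio in sorted(graph.get(node, []), key=lambda e: e[1]):
            if nxt not in path:  # assumption: keep paths simple
                path.append(nxt)
                walk(nxt, path, total + prio)
                path.pop()

    walk(source, [source], 0)
    # Same ordering as the query: summed priority, then path length.
    found.sort(key=lambda item: (item[0], len(item[1])))
    return found[:k]


toy = {
    "source": [("a", 1), ("b", 5)],
    "a": [("target", 1), ("b", 1)],
    "b": [("target", 1)],
}
best = ranked_paths(toy, "source", "target", max_len=3, k=2)
```

Note that, like the Cypher query, this still enumerates every path before ranking; sorting the relationships only changes the order of discovery, not the amount of work.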
I had a very similar problem. I was trying to find the shortest path from one node to all other nodes. I had written a query similar to the one in the answer above (https://stackoverflow.com/a/38030536/783836) and couldn't get it to perform in any reasonable time.
Asking "Can Graph DBs perform well with unspecified end nodes?" pointed me to the solution: the Single Source Shortest Path algorithm.
In Neo4j you need to install the Graph Data Science Library and use this function: gds.alpha.shortestPath.deltaStepping.stream
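For anyone unfamiliar with the single-source idea, here is a minimal plain-Python sketch. It uses Dijkstra's algorithm, not delta-stepping, and assumes non-negative weights; the function name and the toy adjacency dict are made up for illustration and have nothing to do with the GDS API.

```python
import heapq
from typing import Dict, List, Tuple


def single_source_shortest_paths(
    graph: Dict[str, List[Tuple[str, float]]], source: str
) -> Dict[str, float]:
    """Dijkstra: cheapest distance from source to every reachable node."""
    dist: Dict[str, float] = {source: 0.0}
    queue: List[Tuple[float, str]] = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for nxt, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                heapq.heappush(queue, (candidate, nxt))
    return dist


# Hypothetical toy graph: node -> list of (neighbour, weight) pairs.
toy = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 2.0)], "c": []}
distances = single_source_shortest_paths(toy, "a")
```

The point is that one pass from the start node yields the distance to every other node, instead of running one path query per target.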

Neo4j - is it possible to visualise a simple overview of my database?

I've got my graph database, populated with nodes, relationships, properties etc. I'd like to see an overview of how the whole database is connected, each relationship to each node, properties of a node etc.
I don't mean view each individual node, but rather something like an ERD from a relational database, something like this, with the node labels. Is this possible?
You can use the metadata by running the command call db.schema().
In Neo4j v4, call db.schema() is deprecated; you can now use call db.schema.visualization() instead.
As far as I know, there is no straight-forward way to get a nicely pictured diagram of a neo4j database structure.
There is a pre-defined query in the neo4j browser which finds all node types and their relationships. However, it traverses the complete graph and may fail due to memory errors if you have too much data.
Also, there is neoprofiler. It's a tool which claims to do what you ask. I never tried it and it hasn't had many updates lately. Still worth a try: https://github.com/moxious/neoprofiler
Even though this is not a graphical representation, this query will give you an idea on what type of nodes are connected to other nodes with what type of relationship.
MATCH (n)
OPTIONAL MATCH (n)-[r]->(x)
WITH DISTINCT {l1: labels(n), r: type(r), l2: labels(x)}
AS `first degree connection`
RETURN `first degree connection`;
You could use this query to then unwind the labels to write that next cypher query dynamically (via a scripting language and using the REST API) and then paste that query back into the neo4j browser to get an example set of the data.
But this should be good enough to get an overview of your graph. Expand from here.
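If it helps to see what that query boils down to, here is a small plain-Python sketch that does the same de-duplication on in-memory data. The `schema_overview` helper and the sample data are hypothetical, and the `(labels, None, None)` entries are my approximation of the rows OPTIONAL MATCH produces for nodes with no outgoing relationships.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Entry = Tuple[FrozenSet[str], Optional[str], Optional[FrozenSet[str]]]


def schema_overview(
    node_labels: Dict[int, List[str]],
    relationships: List[Tuple[int, str, int]],
) -> Set[Entry]:
    """Distinct (start labels, relationship type, end labels) triples."""
    overview: Set[Entry] = set()
    has_outgoing: Set[int] = set()
    for start, rel_type, end in relationships:
        has_outgoing.add(start)
        overview.add(
            (frozenset(node_labels[start]), rel_type, frozenset(node_labels[end]))
        )
    # Nodes without outgoing relationships still show up, with empty slots.
    for node, labels in node_labels.items():
        if node not in has_outgoing:
            overview.add((frozenset(labels), None, None))
    return overview


nodes = {1: ["Person"], 2: ["Person"], 3: ["Movie"]}
rels = [(1, "KNOWS", 2), (1, "ACTED_IN", 3), (2, "ACTED_IN", 3)]
overview = schema_overview(nodes, rels)
```

Three nodes and three relationships collapse to three schema entries here, which is why the result stays readable even when the data set is large.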

Defining label/s during create(node()) in batches

As I understand from the documentation of py2neo, the only way to add a label to a node is to use the add_labels() function, after the node is created. Is there any way to define label/s in the create(node()) function?
The only option right now is to use Cypher to create the node instead of the 'create' method. This is because the underlying REST resource does not support creation of nodes with labels. The next version of py2neo (currently in beta) will make the process slightly simpler, allowing labels to be used through the 'create' method by wrapping Cypher directly instead.
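As a rough sketch of the Cypher route: labels cannot be passed as query parameters, so they have to be written into the statement text, while properties can go in a parameter map. The `build_create` helper below is hypothetical (it is not part of py2neo), and it assumes the `$props` parameter syntax of newer Neo4j versions; older versions use `{props}` instead. It only builds the statement and parameters, which you would then hand to whatever execute call your py2neo version provides.

```python
import re
from typing import Any, Dict, Iterable, Tuple

# Assumption: only plain identifier-style labels are allowed, to avoid injection.
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_create(
    labels: Iterable[str], properties: Dict[str, Any]
) -> Tuple[str, Dict[str, Any]]:
    """Return a CREATE statement with literal labels and a parameter map."""
    checked = []
    for label in labels:
        if not _LABEL_RE.match(label):
            raise ValueError(f"unsafe label: {label!r}")
        checked.append(label)
    label_part = "".join(f":{label}" for label in checked)
    statement = f"CREATE (n{label_part} $props) RETURN n"
    return statement, {"props": dict(properties)}


statement, params = build_create(
    ["Fruit", "Food"], {"colour": "yellow", "tasty": True}
)
```

Validating the labels matters because they end up in the query text itself, unlike the property values.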
You can now add labels to nodes when you create them.
From the docs:
from py2neo import Node
alice = Node("Person", name="Alice")
banana = Node("Fruit", "Food", colour="yellow", tasty=True)

Clone nodes and relationships with Cypher

Is it possible to clone arbitrary nodes and relationships in a single Cypher neo4j 2.0 query?
'Arbitrary' reads 'without specifying their labels and relationship types'. Something like:
MATCH (node1:NodeType)-[e]->(n)
CREATE (clone: labels(n)) set clone=n set clone.prop=1
CREATE (node1)-[e1:type(e)]->(clone) set e1=e set e1.prop=2
is not valid in Cypher, so one cannot simply get labels from one node or relationship and assign them to another, because labels are compiled into the query literally.
Sure, labels and relationship types are important in MATCH and WHERE for producing an effective query plan, but isn't CREATE a different case?
The easiest way to clone parts of a graph is to use the dump command in Neo4j shell. dump generates cypher create statements from your return clauses. The result of dump can be applied to the graph database to create clones.
Today, April 2022, I believe the best approach might be using an APOC procedure
I had a similar requirement and this worked for me.
MATCH (rootA:Root{name:'A'}),
(rootB:Root{name:'B'})
MATCH path = (rootA)-[:LINK*]->(node)
WITH rootA, rootB, collect(path) as paths
CALL apoc.refactor.cloneSubgraphFromPaths(paths, {
standinNodes:[[rootA, rootB]]
})
YIELD input, output, error
RETURN input, output, error

Extract subgraph in neo4j

I have a large network stored in Neo4j. Based on a particular root node, I want to extract a subgraph around that node and store it somewhere else. So, what I need is the set of nodes and edges that match my filter criteria.
Afaik there is no out-of-the-box solution available. There is a graph matching component available, but it works only for perfect matches. The Neo4j API itself defines only graph traversal which I can use to define which nodes/edges should be visited:
Traverser exp = Traversal
    .description()
    .breadthFirst()
    .evaluator(Evaluators.toDepth(2))
    .traverse(root);
Now, I can add all nodes/edges to sets for all paths, but this is very inefficient. How would you do it? Thanks!
EDIT Would it make sense to add the last node and the last relationship of each traversal to the subgraph?
As for graph matching, that has been superseded by http://docs.neo4j.org/chunked/snapshot/cypher-query-lang.html which would fit nicely, and supports fuzzy matching with optional relationships.
For subgraph representation, I would use the Cypher output to maybe construct new Cypher statements for recreating the graph, much like a SQL export, something like
start n=node:node_auto_index(name='Neo')
match n-[r:KNOWS*]-m
return "create ({name:'"+m.name+"'});"
http://console.neo4j.org/r/pqf1rp for an example
I solved it by constructing the induced subgraph based on all traversal endpoints.
Building the subgraph from the set of last nodes and edges of every traversal does not work, because edges that are not part of any shortest paths would not be included.
The code snippet looks like this:
Set<Node> nodes = new HashSet<Node>();
Set<Relationship> edges = new HashSet<Relationship>();
for (Node n : traverser.nodes())
{
    nodes.add(n);
}
for (Node node : nodes)
{
    for (Relationship rel : node.getRelationships())
    {
        if (nodes.contains(rel.getOtherNode(node)))
            edges.add(rel);
    }
}
Every edge is added twice. One time for the outgoing node and one time for the incoming node. Using a Set, I can ensure that it's in the collection only once.
It is possible to iterate over only the incoming or only the outgoing edges, but it is unclear how loops (an edge from a node to itself) are handled. Which category do they belong to? This snippet does not have this issue.
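The same induced-subgraph idea, stripped of the Neo4j API, can be sketched in a few lines of Python. The `induced_subgraph` helper and the edge list are hypothetical; edges are plain (start, end) tuples, so this only illustrates the set logic, including the self-loop case.

```python
from typing import Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def induced_subgraph(
    nodes: Iterable[Hashable], edges: Iterable[Edge]
) -> Tuple[Set[Hashable], Set[Edge]]:
    """Keep every edge whose two endpoints are both in the node set."""
    kept_nodes = set(nodes)
    # A set comprehension stores each edge once, self-loops included.
    kept_edges = {
        (a, b) for a, b in edges if a in kept_nodes and b in kept_nodes
    }
    return kept_nodes, kept_edges


all_edges = [("root", "a"), ("a", "b"), ("b", "root"), ("a", "a"), ("b", "x")]
sub_nodes, sub_edges = induced_subgraph({"root", "a", "b"}, all_edges)
```

The edge ("b", "x") is dropped because "x" was never visited, while ("b", "root") is kept even though no traversal from the root would end on it.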
See dumping the database to cypher statements
dump START n=node({self}) MATCH p=(n)-[r:KNOWS*]->(m) RETURN n,r,m;
There's also an example for importing the subgraph of first database (db1) into a second (db2).
