Creating relationship queries in py2neo.ogm - neo4j

I am using the py2neo.ogm api to construct queries of my IssueOGM class based on its relationship to another class.
I can see why this fails:
>>> list(IssueOGM.select(graph).where(
... "_ -[:HAS_TAG]- (t:TagOGM {tag: 'critical'})"))
Traceback (most recent call last):
...
py2neo.database.status.CypherSyntaxError: Variable `t` not defined (line 1, column 42 (offset: 41))
"MATCH (_:IssueOGM) WHERE _ -[:HAS_TAG]- (t:TagOGM {tag: 'critical'}) RETURN _"
Is there a way using the OGM api to create a filter that is interpreted as this?
"MATCH (_:IssueOGM) -[:HAS_TAG]- (t:TagOGM {tag: 'critical'}) RETURN _"

Like an ORM, the OGM seems to be really good for quickly storing and/or retrieving nodes from your graph, and saving special methods and so forth to make each node 'work' nicely in your application. In this instance, you could use the RelatedFrom class on TagOGM to list all the issues tagged with a particular tag. However, this approach can sometimes lead to making lots of inadvertent db calls without realising (especially in a big application).
Often for cases like this (where you're looking for a pattern rather than a specific node), I'd recommend just writing a cypher query to get the job done. py2neo.ogm actually makes this remarkably simple, by allowing you to store it as a class method of the GraphObject. In your example, something like the following should work. Writing similar queries in the future will also allow you to search based on much more complex criteria and leverage the functionality of neo4j and cypher to make really complex queries quickly in a single transaction (rather than going back and forth to the db as you manipulate an OGM object).
from py2neo.ogm import GraphObject, Property
class TagOGM(GraphObject):
    name = Property()
class IssueOGM(GraphObject):
    name = Property()
    time = Property()
    description = Property()
    @classmethod
    def select_by_tag(cls, tag_name):
        '''
        Returns an OGM instance for every issue tagged a certain way
        '''
        # `graph` is assumed to be a py2neo Graph instance defined elsewhere
        q = 'MATCH (t:TagOGM { name: {tag_name} })<-[:HAS_TAG]-(i:IssueOGM) RETURN i'
        return [
            cls.wrap(row['i'])
            for row in graph.run(q, { 'tag_name': tag_name }).data()
        ]

Related

Neo4j SDN 4 emulate sequence object(not UUID)

Is it possible in Neo4j or SDN4 to create/emulate something similar to a PostgreSQL sequence database object?
I need this thread safe functionality in order to be able to ask it for next, unique Long value. I'm going to use this value as a surrogate key for my entities.
UPDATED
I don't want to go with UUID because I have to expose these IDs within my web application url parameters and in case of UUID my urls look awful. I want to go with a plain Long values for IDs like StackOverflow does, for example:
stackoverflow.com/questions/42228501/neo4j-sdn-4-emulate-sequence-objectnot-uuid
This can be done with user procedures and functions. As an example:
package sequence;
import org.neo4j.procedure.*;
import java.util.concurrent.atomic.AtomicInteger;
public class Next {
    private static AtomicInteger sequence = new AtomicInteger(0);
    @UserFunction
    public synchronized Number next() {
        return sequence.incrementAndGet();
    }
}
The problem with this example is that the counter is reset to zero whenever the server is restarted.
So the last value of the counter has to be persisted. This can be done using these examples:
https://maxdemarzi.com/2015/03/25/triggers-in-neo4j/
https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/master/src/main/java/apoc/trigger/Trigger.java
No. As far as I'm aware there isn't any similar functionality to sequences or auto increment identifiers in Neo4j. This question has also been asked a few times in the past.
The APOC project might be worth checking out for this though. There seems to be a request to add it.
If your main interest is in having a way to generate unique IDs, and you do not care if the unique IDs are strings, then you should consider using the APOC facilities for generating UUIDs.
There is an APOC function that generates a UUID, apoc.create.uuid. In older versions of APOC, this is a procedure that must be invoked using the CALL syntax. For example, to create and return a single Foo node with a new UUID:
CREATE (f:Foo {uuid: apoc.create.uuid()})
RETURN f;
There is also an APOC procedure, apoc.create.uuids(count), that generates a specified number of UUIDs. For example, to create and return 5 Foo nodes with new UUIDs:
CALL apoc.create.uuids(5) YIELD uuid
CREATE (f:Foo {uuid: uuid})
RETURN f;
The simplest way in Neo4j is to disable ID reuse and use the node's graph ID as a sequencer.
https://neo4j.com/docs/operations-manual/current/reference/configuration-settings/
Table A.83. dbms.ids.reuse.types.override
Description: Specified names of id types (comma separated) that should be reused. Currently only 'node' and 'relationship' types are supported.
Valid values: dbms.ids.reuse.types.override is a list separated by "," where items are one of NODE, RELATIONSHIP
Default value: [RELATIONSHIP, NODE]

Grails 3 - return list in query result from HQL query

I have a domain object:
class Business {
String name
List subUnits
static hasMany = [
subUnits : SubUnit,
]
}
I want to get name and subUnits using HQL, but I get an error
Exception: org.springframework.orm.hibernate4.HibernateQueryException: not an entity
when using:
List businesses = Business.executeQuery("select business.name, business.subUnits from Business as business")
Is there a way I can get subUnits returned in the query result as a List using HQL? When I use a left join, the query result is a flattened List that duplicates name. The actual query is more complicated - this is a simplified version, so I can't just use Business.list().
I thought I should add this as an answer, since I have been doing this sort of thing for a while and have some knowledge that I can share with others:
As per the suggestion from Yariash above:
This is the difference between walking forward through a domain object and grabbing the information as a flat list of maps. Loading an entire object and then asking it to loop through and return its many relations is more expensive than having everything in one self-contained list.
@anonymous1 that sounds correct with left join - you can take a look at 'group by name' added to the end of your query. Alternatively, when you have all the results you can use businesses.groupBy{it.name} (this is a cool Groovy feature); take a look at the output of the groupBy to understand what it has done.
But if you are attempting to grab the entire object and map it back, the cost is still very hefty: probably as costly as the suggestion by Yariash, and possibly worse.
List businesses = Business.executeQuery("select new map(b.name as name, su.field1 as field1, su.field2 as field2) from Business b left join b.subUnits su ")
The above is really what you should be trying to do: left join, then select each of the inner fields of the hasMany as part of the map that is returned for each row of the list.
Then, when you have your results:
def groupedBusinesses = businesses.groupBy { it.name }
where name is the property from the main class that has the hasMany relation.
If you then look at groupedBusinesses, you will see each name has its own list:
groupedBusinesses: [name1: [ [field1,field2,field3], [field1,field2,field3] ] ]
You can now do
groupedBusinesses.get(name) to get the entire list for that hasMany relation.
Enable SQL logging for above hql query then compare it to
List businesses = Business.executeQuery("select new map(b.name as name, su as subUnits) from Business b left join b.subUnits su ")
What you will see is that the second query generates huge SQL queries to get the data, since it attempts to map an entire entity per row.
I have tested this, and the second query tends to produce a full page of SQL, sometimes several pages, compared with the few lines produced by the first example.

Cypher query with literal map syntax & dynamic keys

I'd like to make a cypher query that generates a specific json output. Part of this output includes an object with a dynamic amount of keys relative to the children of a parent node:
{
...
"parent_keystring" : {
child_node_one.name : child_node_one.foo
child_node_two.name : child_node_two.foo
child_node_three.name : child_node_three.foo
child_node_four.name : child_node_four.foo
child_node_five.name : child_node_five.foo
}
}
I've tried to create a cypher query but I do not believe I am close to achieving the desired output mentioned above:
MATCH (n)-[relone:SPECIFIC_RELATIONSHIP]->(child_node)
WHERE n.id='839930493049039430'
RETURN n.id AS id,
n.name AS name,
labels(n)[0] AS type,
{
COLLECT({
child.name : children.foo
}) AS rel_two_representation
} AS parent_keystring
I had planned for children.foo to be a count of how many occurrences of each particular relationship/child of the parent. Is there a way to make use of the reduce function? Where a report would generate based on analyzing the array proposed below? ie report would be a json object where each key is a distinct RELATIONSHIP and the property value would be the amount of times that relationship stems from the parent node?
Thank you greatly in advance for guidance you can offer.
I'm not sure that Cypher will let you use a variable to determine an object's key. Would using an Array work for you?
COLLECT([child.name, children.foo]) AS rel_two_representation
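If the array form is acceptable, the keyed object can be rebuilt on the client side. A minimal Python sketch, assuming the driver hands back rel_two_representation as a list of [name, foo] pairs; the helper name pairs_to_object and the sample values are hypothetical:

```python
def pairs_to_object(pairs):
    """Turn [[key, value], ...] (the shape COLLECT([a, b]) returns) into a dict."""
    result = {}
    for key, value in pairs:
        # If a child name repeats, the later pair overwrites the earlier one.
        result[key] = value
    return result


# Hypothetical row shaped like the output of the query above.
row = {"rel_two_representation": [["child_one", 3], ["child_two", 1]]}
parent_keystring = pairs_to_object(row["rel_two_representation"])
print(parent_keystring)
```

This keeps the Cypher simple and leaves the dynamic keys to the application code.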
I think the Neo4j Server API output should be treated like any other database output (such as MySQL's). Even if the desired output can be achieved with the default functionality, it is not a natural thing to ask of a database.
Probably you should look into creating your own server plugin. This allows you to implement any custom logic, with desired output.

py2neo return number of nodes and relationships created

I need to create a python function such that it adds nodes and relationship to a graph and returns the number of created nodes and relationships.
I have added the nodes and relationship using graph.cypher.execute().
arr_len = len(dic_st[story_id]['PER'])
for j in dic_st[story_id]['PER']:
    graph.cypher.execute("MERGE (n:PER {name:{name}})",name = j[0].upper()) #creating the nodes of PER in the story
    print(j[0])
for j in range(0,arr_len):
    for k in range(j+1,arr_len):
        graph.cypher.execute("MATCH (p1:PER {name:{name1}}), (p2:PER {name:{name2}}) WHERE upper(p1.name)<>upper(p2.name) CREATE UNIQUE (p1)-[r:in_same_doc {st_id:{st_id}}]-(p2)", name1=dic_st[story_id]['PER'][j][0].upper(),name2=dic_st[story_id]['PER'][k][0].upper(),st_id=story_id) #linking the edges for PER nodes
What I need is to return the number of new nodes and relationships created.
From the Neo4j documentation I learned that MERGE in Cypher supports "ON CREATE" and "ON MATCH", but that has not been very useful here.
The Neo4j browser interface does actually show the number of nodes and relationships updated. This is what I need to return, but I cannot find a way to access it.
Any help please.
If you need the exact counts of properties either created or updated, then you have to use MATCH with CREATE or MATCH with SET and then count the size of the results. MERGE may not tell you which ones were updated and which ones were created.
When you post your query against the Cypher endpoint of the neo4j REST API without using py2neo, you can include the argument "includeStats": true in your post request to get the node/relationship statistics. See this question for an example.
As far as I can tell, py2neo currently does not support additional parameters for the Cypher query (even though it is using the same API endpoints under the hood).
In Python, you could do something like this (using the requests and json packages):
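import requests
import json
payload = {
    "statements": [{
        "statement": "CREATE (t:Test) RETURN t",
        "includeStats": True
    }]
}
# The endpoint expects a JSON body, so declare the content type explicitly
r = requests.post('http://your_server_host:7474/db/data/transaction/commit',
                  data=json.dumps(payload),
                  headers={'Content-Type': 'application/json'})
print(r.text)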
The response will include statistics about the number of nodes created etc.
{
    "stats":{
        "contains_updates":true,
        "nodes_created":1,
        "nodes_deleted":0,
        "properties_set":1,
        "relationships_created":0,
        "relationship_deleted":0,
        "labels_added":1,
        "labels_removed":0,
        "indexes_added":0,
        "indexes_removed":0,
        "constraints_added":0,
        "constraints_removed":0
    }
}
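To get a single total across several statements, the per-statement statistics can be added up after parsing the response. A minimal sketch, assuming each entry in the response's "results" list carries its own "stats" object when includeStats is set; the helper name count_created and the sample response are hypothetical:

```python
import json


def count_created(response_text):
    """Sum nodes_created and relationships_created over all statement results."""
    body = json.loads(response_text)
    nodes = 0
    relationships = 0
    # Assumption: each statement result carries its own "stats" object.
    for result in body.get("results", []):
        stats = result.get("stats", {})
        nodes += stats.get("nodes_created", 0)
        relationships += stats.get("relationships_created", 0)
    return nodes, relationships


# Hypothetical response text, trimmed to the fields used here.
sample = json.dumps({
    "results": [
        {"stats": {"nodes_created": 1, "relationships_created": 0}},
        {"stats": {"nodes_created": 2, "relationships_created": 3}},
    ],
    "errors": [],
})
print(count_created(sample))
```

In the requests example, r.text would take the place of sample.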
After executing your query using x = session.run(...) you can use x.summary.counters to get the statistics noted in Martin Perusse's answer. See the documentation here.
In older versions the counters are available as a "private" field under x._summary.counters.

NEO4J execute severals statement

How is it possible to run a collection of queries like this (copied from a spreadsheet) directly in one Cypher query? Running them one by one works, but that needs 100 copy/pastes.
*******************************
MATCH (c:`alpha`)
where c.name = "a-01"
SET c.CP_PRI=1, c.TO_PRI=1, c.TA_PRI=2
return c ;
MATCH (c:`beta`)
where c.name = "a-02"
SET c.CP_PRI=1, c.TO_PRI=1, c.TA_PRI=0
return c ;
and 100 other lines ...
*********************************
You may try the UNION clause, which joins the results of several queries into one big result set:
http://docs.neo4j.org/chunked/milestone/query-union.html
That said, more detail about what you are ultimately trying to do would help; there may be a better way to write the query. You could use Excel to build the unified query via calculations or macros, or you could possibly write a single query that combines the rules you are trying to follow. There are a lot of options, but it's hard to pick a starting direction without context.
Talking about the REST API you can use the transactional endpoint in Neo4J 2.0, or the batch endpoint in Neo4J 1.x.
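For the transactional endpoint, the spreadsheet rows can be turned into one request body with one statement per row. A minimal Python sketch, assuming the rows are exported as (label, name, CP_PRI, TO_PRI, TA_PRI) tuples; the helper name build_payload is hypothetical, and the {name} parameter syntax is the older form (newer servers use $name instead):

```python
import json


def build_payload(rows):
    """Build a transactional-endpoint body with one statement per row."""
    statements = []
    for label, name, cp_pri, to_pri, ta_pri in rows:
        # Labels cannot be passed as parameters, so the label is interpolated.
        # Reject backticks so the label stays a single quoted identifier.
        if "`" in label:
            raise ValueError("label must not contain a backtick: %r" % label)
        statements.append({
            "statement": (
                "MATCH (c:`%s`) WHERE c.name = {name} "
                "SET c.CP_PRI = {cp}, c.TO_PRI = {to}, c.TA_PRI = {ta}"
                % label
            ),
            "parameters": {"name": name, "cp": cp_pri, "to": to_pri, "ta": ta_pri},
        })
    return {"statements": statements}


rows = [("alpha", "a-01", 1, 1, 2), ("beta", "a-02", 1, 1, 0)]
payload = build_payload(rows)
print(json.dumps(payload, indent=2))
```

The resulting body can then be posted in a single request to /db/data/transaction/commit, so all the updates run in one transaction.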
If you want to use the shell, have a look at the import page, in particular the neo4j-shell-tools, which import massive quantities of data by batching multiple queries.

Resources