So here's the deal: I'm using Neo4j 3.01 and I have a graph with nodes labeled Action which have, amongst other relationships and properties, the following two properties:
type: one of Comment, Reply, Vote, etc.
date: an epoch timestamp
I am attempting to run a Cypher query that returns a list of nodes ordered by the date property, but collapsing (collect?) sequential items of the same type.
So for the following nodes:
{type:'Comment',date:1}
{type:'Comment',date:2}
{type:'Vote',date:3}
{type:'Comment',date:4}
{type:'Comment',date:5}
{type:'Reply',date:6}
{type:'Reply',date:7}
{type:'Vote',date:8}
{type:'Vote',date:9}
I would hope to get something like:
{type:'Comment',actions:[{type:'Comment',date:1},{type:'Comment',date:2}]}
{type:'Vote',actions:[{type:'Vote',date:3}]}
{type:'Comment',actions:[{type:'Comment',date:4},{type:'Comment',date:5}]}
{type:'Reply',actions:[{type:'Reply',date:6},{type:'Reply',date:7}]}
{type:'Vote',actions:[{type:'Vote',date:8},{type:'Vote',date:9}]}
I attempted a simple collect and order cypher query:
Match (a:Action)
with a.type as type, a order by a.date limit 20
return type, collect(a) as actions
but this seems to collect each type in its own group regardless of the sequence, so the actual result is something like:
{type:'Comment',actions:[{type:'Comment',date:1},{type:'Comment',date:2},{type:'Comment',date:4},{type:'Comment',date:5}]}
{type:'Vote',actions:[{type:'Vote',date:3},{type:'Vote',date:8},{type:'Vote',date:9}]}
{type:'Reply',actions:[{type:'Reply',date:6},{type:'Reply',date:7}]}
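The grouping above happens because every non-aggregated expression in a RETURN (here, type) acts as a grouping key, so collect(a) gathers all rows of a type no matter where they fall in the ordering. As a rough sketch of the desired behaviour, and assuming it is acceptable to return the rows ordered by date and collapse them on the client, adjacent rows of the same type can be grouped in plain Python. The sample rows and the collapse_runs helper are hypothetical, not part of the original question:

```python
from itertools import groupby

# Hypothetical rows, already sorted by date (as ORDER BY a.date would return them).
actions = [
    {"type": "Comment", "date": 1},
    {"type": "Comment", "date": 2},
    {"type": "Vote", "date": 3},
    {"type": "Comment", "date": 4},
    {"type": "Comment", "date": 5},
    {"type": "Reply", "date": 6},
    {"type": "Reply", "date": 7},
    {"type": "Vote", "date": 8},
    {"type": "Vote", "date": 9},
]


def collapse_runs(rows):
    """Group only adjacent rows that share a type, preserving their order."""
    return [
        {"type": key, "actions": list(run)}
        for key, run in groupby(rows, key=lambda row: row["type"])
    ]


collapsed = collapse_runs(actions)
for group in collapsed:
    # Print each run's type and the dates it contains.
    print(group["type"], [a["date"] for a in group["actions"]])
```

itertools.groupby starts a new group whenever the key changes, which is exactly the "sequential items" behaviour, in contrast to grouping by key across the whole result.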
match(m:master_node:Application)-[r]-(k:master_node:Server)-[r1]-(n:master_node)
where (m.name contains '' and (n:master_node:DeploymentUnit or n:master_node:Schema))
return distinct m.name,n.name
Hi, I am trying to get the total number of records for the above query. How do I change the query to use the count function so that I get the record count directly?
Thanks in advance
The following query uses the aggregating function COUNT. Distinct pairs of m.name, n.name values are used as the "grouping keys".
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
RETURN m.name, n.name, COUNT(*) AS cnt
I assume that m.name contains '' in your query was an attempt to test for the existence of m.name. This query uses the EXISTS() function to test that more efficiently.
[UPDATE]
To determine the number of distinct n and m pairs in the DB (instead of the number of times each pair appears in the DB):
MATCH (m:master_node:Application)--(:master_node:Server)--(n:master_node)
WHERE EXISTS(m.name) AND (n:DeploymentUnit OR n:Schema)
WITH DISTINCT m.name AS n1, n.name AS n2
RETURN COUNT(*) AS cnt
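To make the difference between the two queries concrete, here is a small Python model of the two aggregations. The rows are hypothetical (m.name, n.name) pairs, one per matched path; they are not taken from any real database:

```python
from collections import Counter

# Hypothetical (m.name, n.name) rows, one per path found by the MATCH.
rows = [
    ("app1", "unitA"),
    ("app1", "unitA"),   # the same pair reached through a second Server
    ("app1", "schemaB"),
    ("app2", "unitA"),
]

# Like RETURN m.name, n.name, COUNT(*): one count per distinct pair.
per_pair = Counter(rows)

# Like WITH DISTINCT ... RETURN COUNT(*): the number of distinct pairs.
distinct_pairs = len(set(rows))

print(per_pair)
print(distinct_pairs)
```

The first form answers "how often does each pair occur", the second "how many different pairs are there", which is the single record count the question asks for.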
Some things to consider for speeding up the query even further:
Remove unnecessary label tests from the MATCH pattern. For example, can we omit the master_node label test from any nodes? In fact, can we omit all label testing for any nodes without affecting the validity of the result? (You will likely need a label on at least one node, though, to avoid scanning all nodes when kicking off the query.)
Can you add a direction to each relationship (to avoid having to traverse relationships in both directions)?
Specify the relationship types in the MATCH pattern. This will filter out unwanted paths earlier. Once you do so, you may also be able to remove some node labels from the pattern as long as you can still get the same result.
Use the PROFILE clause to evaluate the number of DB hits needed by different Cypher queries.
You can find examples of how to use count in the Neo4j docs here
In your case, the first example there, which uses:
count(*)
to return a count for each returned item, should work.
In the Neo4j Browser, I performed one query as follows:
match (subject:User {name:{name}})
match (subject)-[:works_for]->(company:Company)<-[:works_for]-(person:User),
(subject)-[:interested_in]->(interest)<-[:interested_in]-(person)
return person.name as name, count(interest) as score,
collect(interest.name) as interests order by score DESC
The result only has the "table" and "text" views, without the "graph". Normally, a query can generate a subgraph. Right?
If you look at your return, you're not returning any actual nodes or relationships. You're returning property values (strings), counts (longs), and collections of strings. If you returned person instead you would probably be able to see a graphical result, as the return would include data (id, labels, properties) that could be used to display graph elements.
I have a bunch of venues in my Neo4j DB. Each venue object has the property 'catIds', which is an array containing the Ids for the types of venue it is. I want to query the database so that I get all venues, but ordered by how many of their catIds match a list of Ids that I give the query. I hope that makes sense :)
Please, could someone point me in the direction of how to write this query?
Since you're working in a graph database you could think about modeling your data in the graph, not in a property where it's hard to get at it. For example, in this case you might create a bunch of (v:venue) nodes and a bunch of (t:type) nodes, then link them by an [:is] relation. Each venue is linked to one or more type nodes. Each type node has an 'id' property: {id:'t1'}, {id:'t2'}, etc.
Then you could do a query like this:
match (v:venue)-[r:is]->(:type) return v, count(r) as n order by n desc;
This finds all your venues, along with ALL their type relations and returns them ordered by how many type-relations they have.
If you only want to get nodes of particular venue types on your list:
match (v:venue)-[r:is]->(t:type) where t.id in ['t1','t2'] return v, count(r) as n order by n desc;
And if you want ALL venues but rank ordered according to how well they fit your list, as I think you were looking for:
match (v:venue) optional match (v)-[r:is]->(t:type) where t.id in ['t1','t2'] return v, count(r) as n order by n desc;
The match will get all your venues; the optional match will find relations to types on your list if the venue has any. If a venue has no links to types on your list, the optional match yields null for r; count(r) ignores nulls, so it returns 0 for that venue, which sorts to the bottom.
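The ranking idea can also be sketched outside the database. This is only an illustration in Python, assuming each venue's type Ids are available as a list; the venue names, Ids, and the rank_venues helper are hypothetical:

```python
# Hypothetical venues mapped to their type Ids (the 'catIds' array from the question).
venues = {
    "v1": ["t1", "t2"],
    "v2": ["t2", "t3"],
    "v3": ["t4"],
}
wanted = {"t1", "t2"}  # the list of Ids given to the query


def rank_venues(venues, wanted):
    """Return (venue, score) pairs, best match first; score = number of wanted Ids."""
    scored = [
        (name, len(wanted.intersection(cat_ids)))
        for name, cat_ids in venues.items()
    ]
    # Sort by score, highest first, like ORDER BY n DESC.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_venues(venues, wanted)
print(ranked)
```

Every venue is kept, and a venue with no matching types simply gets a score of 0 and ends up last.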
I have no idea how to iterate over a list in Neo4j. Could someone please suggest an approach for the problem below?
Example:
I have some nodes in the graph.
Then I will give a few keywords (the number always varies, since this is user input) to search for the nodes that are common to all of these words. In my graph each word is a node.
Ex: Input: [Best sports car]
output: connected nodes for Best are [samsung,porshe,ambassdor,protein,puma]
connected nodes for sports are [cricket,racing,rugby,puma,porshe]
connected nodes for car are [porshe,ambassdor,benz,audi]
Common nodes to all words are : [porshe]
Result is : porshe
I don't have any idea how to iterate over each word and store the match results. Could someone please suggest an idea?
In order to test the following working query, I'll make some assumptions :
The words nodes have the label :Word and the name property.
The porsche, puma, etc.. nodes have the label :Item and a name property.
Item nodes have an outgoing CONNECT relationship to Word nodes
The query is the following (in order to simulate the given words as parameters, I added a WITH containing the words list in the beginning of the query)
WITH ["car","best","sports"] as words
MATCH (n:Word)<-[:CONNECT]-(i:Item)
WHERE n.name IN words
WITH i, count(*) as c, words
WHERE c = size(words)
RETURN i
And will return only the porsche Item node.
Logic explanation
The logic of the query is that if an Item node is connected to all of the given words, the first MATCH will find 3 paths to it, so count(*) will have a value of 3 for the porsche node.
This value is compared to the size of the words list.
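The same counting logic can be modelled in a few lines of Python. This is a sketch under the same assumptions as the query: each (item, word) connection exists at most once and the given words are unique. The edge list and the items_matching_all helper are hypothetical:

```python
from collections import Counter

# Hypothetical (item, word) pairs, one per CONNECT relationship.
edges = [
    ("porsche", "best"),
    ("porsche", "sports"),
    ("porsche", "car"),
    ("puma", "best"),
    ("puma", "sports"),
    ("audi", "car"),
    ("cricket", "sports"),
]


def items_matching_all(edges, words):
    """Return the items connected to every word in `words`."""
    wanted = set(words)
    # Count, per item, how many of the wanted words it is connected to
    # (the role of count(*) grouped by i).
    hits = Counter(item for item, word in edges if word in wanted)
    # Keep the items whose count equals the number of words
    # (the role of WHERE c = size(words)).
    return [item for item, count in hits.items() if count == len(words)]


matches = items_matching_all(edges, ["car", "best", "sports"])
print(matches)
```

No explicit loop over the words is needed: filtering by membership and comparing the per-item count with the list size does the "common to all words" check in one pass.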
More explanations
In the WITH clause, there are two expressions of interest: i and count(*) (words is only carried through for the later comparison).
i is not an aggregate function, so it will act as a grouping key.
count(*) is an aggregate function and will run on the i bucket, calculating the aggregate values.
For example, if you want to know how many words each Item is matching you can simply do :
WITH ["car","best","sports"] as words
MATCH (n:Word)<-[:CONNECT]-(i:Item) WHERE n.name IN words
RETURN i.name, count(*)
In the result, porsche matches 3 words, which is the size of the given words list, so you can simply compare the 3 from the count aggregation to this size.
In order to fully understand how aggregation works, you can refer to the manual : http://neo4j.com/docs/stable/query-aggregation.html
You can test the query here :
http://console.neo4j.org/r/e6bee0
If you pass the words as parameters, this will then be the corresponding query :
MATCH (n:Word)<-[:CONNECT]-(i:Item)
WHERE n.name IN {words}
WITH i, count(*) as c
WHERE c = size({words})
RETURN i
assuming {words} is the name of the given query parameter
Is something like this what you are after?
Start with a collection of words from the requested search.
Match each word against the graph.
Collect the connected words in a list.
with ['Best', 'sports', 'car'] as word_coll
unwind word_coll as word
match (:Word {name: word})--(conn_word:Word)
return word,collect(conn_word)
I have a Cypher query which I'd like to expand to be summed up over a list of matching nodes.
My query looks like this:
MATCH (u:User {name: {input} })-[r:USES]-(t) RETURN SUM(t.weight)
This matches one User node, and I'd like to adjust it to match a list of User nodes and then perform the aggregation.
My current implementation just calls the query in a loop and performs the aggregation outside of Cypher. This results in slightly inaccurate results and a lot of API calls.
Is there a way to evaluate the Cypher query against a list of elements (strings in my case)?
I'm using Neo4j 2.1 or 2.2.
Cheers
You can use the IN operator and pass in an array of usernames:
MATCH (u:User)-[r:USES]-(t)
WHERE u.name in ['John','Jim','Jack']
RETURN u.name, SUM(t.weight)
Instead of the literal array, you can also use a parameter holding an array value:
MATCH (u:User)-[r:USES]-(t)
WHERE u.name in {userNames}
RETURN u.name, SUM(t.weight)
userNames = ['John','Jim','Jack']
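Note that because u.name appears in the RETURN, it acts as a grouping key and the queries above give one sum per user; leaving u.name out of the RETURN would give a single sum over all listed users instead. The difference is sketched below in Python with hypothetical (user name, weight) rows, one per USES relationship:

```python
from collections import defaultdict

# Hypothetical (u.name, t.weight) rows, one per USES relationship.
uses = [
    ("John", 2.0),
    ("John", 1.5),
    ("Jim", 4.0),
    ("Ann", 9.0),   # not in the list, so filtered out
]
user_names = ["John", "Jim", "Jack"]

# Like RETURN u.name, SUM(t.weight): one sum per matching user.
per_user = defaultdict(float)
for name, weight in uses:
    if name in user_names:
        per_user[name] += weight

# Like RETURN SUM(t.weight) without u.name: one sum over all matching users.
total = sum(per_user.values())

print(dict(per_user))
print(total)
```

A listed user with no matching rows (Jack here) simply does not appear in the per-user result, just as a MATCH that finds nothing returns no row for that user.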