Cypher Query in Qlikview - neo4j

I have connected Neo4j with Qlikview and I'm firing the cypher query. Now I have two graphs with the same labels so when I make a report in Qlikview I'm finding the aggregate count value of a particular label retrieved from both the graphs. So in order to get the data from only one graph can I fire the query with graph name say 'emp' prefixed with the label as below.
SQL MATCH a=(emp.cname)-[:has]->(emp.clusters) RETURN cname.name as cName, clusters.name as ClustName; where emp is the name of the graph from which I want to get the data and cname and clusters are the labels.
Thanks in advance...

There is technically only one graph. It seems that you have create two sub-graphs in that space and need a way of disguising them. That is typically done by creating an anchor node that at least one of the sub-graph nodes is related to or to add a unique label to each set of nodes to distinguish them. If that is not an option then you could run a preliminary query to determine the two start nodes (distinguished by some property/label or that are not connected in any way) and using one or the other for the query of interest.

Related

Cypher Query filter on related Nodes

I am trying to apply filter on my Neo4j Graph DB which has 733922 nodes and 303378 relationships, the DB size is 913.56 MiB. I want to fetch all nodes which has related labels from a specific set, it works if I give upto three related nodes but takes indefinite time to process queries beyond that. If I remove order by then it works for upto five related labels. Is this the most optimised query or am I doing something wrong here? I have attached the PROFILE output for one related label.
MATCH p=(node1:label1{value:'a2cb2c487d9da6941f0f9c1692ed1a5a', source: 'a0b116e2-d9f5-40d4-97ff-6ca6f2ec6c9b'})-[r]-> (:a),(:b),(:c),(:d),(:e),(:f),(:g),(:h),(:i),(:j)
RETURN node1,r
ORDER BY node1.created DESC
LIMIT 25
The following
(:b),(:c),(:d),(:e),(:f),(:g),(:h),(:i),(:j)
creates a Cartesian product between all nodes with labels b,c,d etc. which will quickly grow.
If you want any of the labels a..b use following:
MATCH p=(node1:label1{value:'a2cb2c487d9da6941f0f9c1692ed1a5a', source: 'a0b116e2-d9f5-40d4-97ff-6ca6f2ec6c9b'})-[r]-> (related)
WHERE related:a OR related:b OR ....
If you want to pass labels as parameter labels you could use following in the WHERE clause
WHERE any(x in labels(n) WHERE x in $labels)

faster way to create relationship between existing nodes?

I am working on an application in which I have only "11263" number of nodes in my neo4j database.
I am using following cypher query to form the relationships between the nodes:
let CreateRelations(fromToList : FromToCount list)=
client.Cypher
.Unwind(fromToList, "fromToList")
.Match("(source)", "(target)")
.Where("source.Id= fromToList.SId and target.Id= fromToList.FId ")
.Merge("(source)-[relation:Fights_With]->(target)")
.OnCreate()
.Set("relation.Count= fromToList.Count,relation.Date= fromToList.Date")
.OnMatch()
.Set("relation.Count= (relation.Count+ fromToList.Count )")
.Set("relation.Date= fromToList.Date")
.ExecuteWithoutResults()
It is taking almost 47 to 50 sec to form say 1000 relations in a neo4j database.
I am new to the neo4j DB, Is there is any other efficient way to do it?
The big thing slowing you down is that you're not using an index to lookup your starting nodes. Your match to source is performing a scan of all nodes in your db to find possible matches, per row in your unwound list. Then it does the same thing with target.
You need to add labels on your nodes, and if they already have labels, use the labels in your query. You'll need either an index or unique constraint on the label and id property so the index will be used for lookup.
Best way to go about tuning your queries is to try them out in the browser, and use EXPLAIN to ensure you're using index lookups, and if it's still slow, use PROFILE on the query (it will execute the query) to see the rows generated and db hits as the query executes.

Most efficient way to get all connected nodes in neo4j

The answer to this question shows how to get a list of all nodes connected to a particular node via a path of known relationship types.
As a follow up to that question, I'm trying to determine if traversing the graph like this is the most efficient way to get all nodes connected to a particular node via any path.
My scenario: I have a tree of groups (group can have any number of children). This I model with IS_PARENT_OF relationships. Groups can also relate to any other groups via a special relationship called role playing. This I model with PLAYS_ROLE_IN relationships.
The most common question I want to ask is MATCH(n {name: "xxx") -[*]-> (o) RETURN o.name, but this seems to be extremely slow on even a small number of nodes (4000 nodes - takes 5s to return an answer). Note that the graph may contain cycles (n-IS_PARENT_OF->o, n<-PLAYS_ROLE_IN-o).
Is connectedness via any path not something that can be indexed?
As a first point, by not using labels and an indexed property for your starting node, this will already need to first find ALL the nodes in the graph and opening the PropertyContainer to see if the node has the property name with a value "xxx".
Secondly, if you now an approximate maximum depth of parentship, you may want to limit the depth of the search
I would suggest you add a label of your choice to your nodes and index the name property.
Use label, e.g. :Group for your starting point and an index for :Group(name)
Then Neo4j can quickly find your starting point without scanning the whole graph.
You can easily see where the time is spent by prefixing your query with PROFILE.
Do you really want all arbitrarily long paths from the starting point? Or just all pairs of connected nodes?
If the latter then this query would be more efficient.
MATCH (n:Group)-[:IS_PARENT_OF|:PLAYS_ROLE_IN]->(m:Group)
RETURN n,m

Is a DFS Cypher Query possible?

My database contains about 300k nodes and 350k relationships.
My current query is:
start n=node(3) match p=(n)-[r:move*1..2]->(m) where all(r2 in relationships(p) where r2.GameID = STR(id(n))) return m;
The nodes touched in this query are all of the same kind, they are different positions in a game. Each of the relationships contains a property "GameID", which is used to identify the right relationship if you want to pass the graph via a path. So if you start traversing the graph at a node and follow the relationship with the right GameID, there won't be another path starting at the first node with a relationship that fits the GameID.
There are nodes that have hundreds of in and outgoing relationships, some others only have a few.
The problem is, that I don't know how to tell Cypher how to do this. The above query works for a depth of 1 or 2, but it should look like [r:move*] to return the whole path, which is about 20-200 hops.
But if i raise the values, the querys won't finish. I think that Cypher looks at each outgoing relationship at every single path depth relating to the start node, but as I already explained, there is only one right path. So it should do some kind of a DFS search instead of a BFS search. Is there a way to do so?
I would consider configuring a relationship index for the GameID property. See http://docs.neo4j.org/chunked/milestone/auto-indexing.html#auto-indexing-config.
Once you have done that, you can try a query like the following (I have not tested this):
START n=node(3), r=relationship:rels(GameID = 3)
MATCH (n)-[r*1..]->(m)
RETURN m;
Such a query would limit the relationships considered by the MATCH cause to just the ones with the GameID you care about. And getting that initial collection of relationships would be fast, because of the indexing.
As an aside: since neo4j reuses its internally-generated IDs (for nodes that are deleted), storing those IDs as GameIDs will make your data unreliable (unless you never delete any such nodes). You may want to generate and use you own unique IDs, and store them in your nodes and use them for your GameIDs; and, if you do this, then you should also create a uniqueness constraint for your own IDs -- this will, as a nice side effect, automatically create an index for your IDs.

Neo4j: Java API to compute intersection multiple properties

I'm very new in using Neo4j and have a question regarding the computation of intersections of nodes.
Let's suppose, I have the three properties A,B,C and I want to select only the nodes that have all three properties.
I created an index for the properties and thus, I can get all nodes having one of the properties. However, afterwards I have to merge the IndexHits. Is there a way to select directly all nodes having the three properties?
My second idea was to create a node for each property and connect other nodes by relationships. I can then iterate over all relationships and get for each property a list of nodes which are connected. But again, I have to compute the intersection afterwards.
Is there a function I miss here, since I suppose it's a standard problem.
Thanks a lot,
Benny
Do you also have the values you look for? You would start with the property that limits the amount of found nodes most.
MATCH (a:Label {property1:{value1}})
WHERE a.property2 = {value2} AND a.property3 = {value3}
RETURN a
For the Java API and lucene indexes:
gdb.index().forNodes("foo").query("p1:value1 p2:value2 p3:value3")
Lucene query syntax

Resources