Neo4j: execute several statements - neo4j

How is it possible to run a collection of queries like this (copied from a spreadsheet) directly in one Cypher query? Running them one by one works, but that would need 100 copy/pastes.
*******************************
MATCH (c:`alpha`)
where c.name = "a-01"
SET c.CP_PRI=1, c.TO_PRI=1, c.TA_PRI=2
return c ;
MATCH (c:`beta`)
where c.name = "a-02"
SET c.CP_PRI=1, c.TO_PRI=1, c.TA_PRI=0
return c ;
and 100 other lines ...
*********************************

You may try the UNION clause, which joins the results of several queries into one big result set:
http://docs.neo4j.org/chunked/milestone/query-union.html
That said, the underlying goal of what you are trying to do could use some details, since there may be a better way to write the query. You could use Excel to build the unified query via calculations or macros, or you could possibly write a single query that combines the rules you are trying to follow. There are a lot of options, but it's hard to know a starting direction without context.

Regarding the REST API, you can use the transactional endpoint in Neo4j 2.0, or the batch endpoint in Neo4j 1.x.
If you want to use the shell, have a look at the import page, in particular the neo4j-shell-tools, which import massive quantities of data by batching multiple queries.
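As a rough illustration of the transactional-endpoint route, the Python sketch below turns spreadsheet rows into a single request body with one parameterized statement per row. The row layout, the helper name `build_payload`, and the `{name}`-style placeholders (Neo4j 2.x parameter syntax; newer versions use `$name`) are assumptions. Labels cannot be passed as parameters, so they are spliced into the statement text inside backticks.

```python
import json

# Hypothetical rows as they might come out of the spreadsheet:
# (label, name, CP_PRI, TO_PRI, TA_PRI)
ROWS = [
    ("alpha", "a-01", 1, 1, 2),
    ("beta", "a-02", 1, 1, 0),
]


def build_payload(rows):
    """Build a request body with one parameterized statement per row."""
    statements = []
    for label, name, cp, to, ta in rows:
        # Double any backtick so the label stays a single quoted identifier.
        safe_label = label.replace("`", "``")
        statements.append({
            "statement": (
                "MATCH (c:`%s`) WHERE c.name = {name} "
                "SET c.CP_PRI = {cp}, c.TO_PRI = {to}, c.TA_PRI = {ta}" % safe_label
            ),
            "parameters": {"name": name, "cp": cp, "to": to, "ta": ta},
        })
    return {"statements": statements}


payload = build_payload(ROWS)
print(json.dumps(payload, indent=2))
```

The resulting JSON could then be sent in one POST to the commit URL of the transactional endpoint (`/db/data/transaction/commit` on a default Neo4j 2.0 install), so all rows run in a single transaction.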

Related

How to express multiple property set criteria for node selection using gremlin query

Here is my simplified graph schema,
package:
property:
- name: str (indexed)
- version: str (indexed)
I want to query the version using multiple sets of property criteria within a single query. I can use within for a list of values of a single property, but how do I do it for multiple properties?
Consider I have 10 package nodes: (p1,v1), (p2,v2), (p3,v3), ... (p10,v10).
I want to select only the nodes which have (p1 with v1, p8 with v8, p10 with v10).
Is there a way to do this with a single Gremlin query?
Something equivalent to SELECT * from package WHERE (name, version) in ((p1,v1),(p8,v8),(p10,v10)).
It's always best to provide some sample data when asking questions about Gremlin. I assume that this is an approximation of what your model is:
g.addV('package').property('name','gremlin').property('version', '1.0').
addV('package').property('name','gremlin').property('version', '2.0').
addV('package').property('name','gremlin').property('version', '3.0').
addV('package').property('name','blueprints').property('version', '1.0').
addV('package').property('name','blueprints').property('version', '2.0').
addV('package').property('name','rexster').property('version', '1.0').
addV('package').property('name','rexster').property('version', '2.0').iterate()
I don't think that there is a way that you can compare pairs of inputs and expect an index hit. You therefore have to do what you normally do in graphs and choose the index to best narrow your results before you filter in memory. I would assume that in your case this would be the "name" property, therefore grab those first then filter the pairs:
gremlin> g.V().has('package','name', within('gremlin','blueprints')).
......1> elementMap().
......2> where(select('name','version').is(within([name:'gremlin',version:'2.0'], [name:'blueprints',version:'2.0'])))
==>[id:3,label:package,name:gremlin,version:2.0]
==>[id:12,label:package,name:blueprints,version:2.0]
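The two-step idea (narrow on the indexed property first, then compare the pairs in memory) can be illustrated without a graph database. The following is a plain-Python sketch over dictionaries, not Gremlin or any TinkerPop API; the names `PACKAGES`, `WANTED`, and `select_pairs` are made up for the illustration.

```python
# Sample data mirroring the vertices created above.
PACKAGES = [
    {"name": "gremlin", "version": "1.0"},
    {"name": "gremlin", "version": "2.0"},
    {"name": "gremlin", "version": "3.0"},
    {"name": "blueprints", "version": "1.0"},
    {"name": "blueprints", "version": "2.0"},
    {"name": "rexster", "version": "1.0"},
    {"name": "rexster", "version": "2.0"},
]

# Wanted (name, version) pairs, playing the role of the SQL tuple IN list.
WANTED = {("gremlin", "2.0"), ("blueprints", "2.0")}


def select_pairs(packages, wanted):
    """Narrow by name first, then keep only exact (name, version) matches."""
    # Step 1: the part an index on "name" could serve.
    names = {name for name, _ in wanted}
    candidates = [p for p in packages if p["name"] in names]
    # Step 2: in-memory filter on the full pair.
    return [p for p in candidates if (p["name"], p["version"]) in wanted]


matches = select_pairs(PACKAGES, WANTED)
print(matches)
```

Step 1 corresponds to the `has('package','name', within(...))` part of the traversal, and step 2 to the `where(...)` filter on the selected pair.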
This might not be the most "creative" way of doing it, but I think the easiest way would be to use or:
g.V().or(
hasLabel('v1').has('prop', 'p1'),
hasLabel('v8').has('prop', 'p8'),
hasLabel('v10').has('prop', 'p10')
)
example: https://gremlify.com/6s

how to create multiple relationships in a single cypher statement

I must create a set of relationships, all having the same source and type, like in the following sample:
create (_1)-[:`typ`]->(:`x` {`name`:"Mark"})
create (_1)-[:`typ`]->(:`y` {`name`:"Jane"})
create (_1)-[:`typ`]->(:`z` {`name`:"John"})
...
I'd like to have a shorter way to write those statements, like the following attempt:
create (_1)-[:`typ`]->[(:`x` {`name`:"Mark"}),
(:`y` {`name`:"Jane"}),
(:`z` {`name`:"John"})]
Any idea?
Thank you in advance.
Paolo
You could do it in a performant and easy way with this pattern:
{batch: [
{from:"alice@example.com",to:"bob@example.com",properties:{since:2012}},
{from:"alice@example.com",to:"charlie@example.com",properties:{since:2016}}]}
UNWIND {batch} as row
MATCH (from:Label {from:row.from})
MATCH (to:Label {to:row.to})
CREATE/MERGE (from)-[rel:KNOWS]->(to)
(ON CREATE) SET rel += row.properties
Taken with thanks from "5 Tips & Tricks for Fast Batched Updates of Graph Structures with Neo4j and Cypher" by @MichaelHunger.
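To show how the `batch` parameter for such an UNWIND statement might be assembled on the client side, here is a small Python sketch. The edge list, the chunk size, and the helper name `build_batches` are hypothetical; only the `from`/`to`/`properties` row shape comes from the pattern above.

```python
# Hypothetical edge list: (from_key, to_key, relationship properties).
EDGES = [
    ("alice@example.com", "bob@example.com", {"since": 2012}),
    ("alice@example.com", "charlie@example.com", {"since": 2016}),
    ("bob@example.com", "charlie@example.com", {"since": 2018}),
]


def build_batches(edges, size):
    """Yield one {"batch": [...]} parameter map per chunk of `size` edges."""
    for start in range(0, len(edges), size):
        chunk = edges[start:start + size]
        yield {
            "batch": [
                {"from": src, "to": dst, "properties": props}
                for src, dst, props in chunk
            ]
        }


batches = list(build_batches(EDGES, size=2))
for params in batches:
    print(len(params["batch"]), "rows")
```

Each yielded map would be passed as the parameters of one execution of the statement, so the Cypher text stays constant while only the data changes.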

How to set "resultDataContents" in Neo4jrb?

I want to visualise data from Neo4j with the frontend library D3.js in a Rails application, using Neo4jrb. For example, I could use the following query to get my graph data:
query = "MATCH path = (a)-[b]->(c) RETURN path"
result = Neo4j::Session.current.query(query)
But this query is not giving me the exact data I want.
According to the Neo4j data visualisation guide, it is possible to set the parameter resultDataContents to "graph" (see the Neo4j documentation for "resultDataContents").
This is exactly what I need for my application. Is there any possibility to set this parameter in Neo4jrb, or another idea how to achieve such a result?
Unfortunately not currently. The neo4j-core gem (which the neo4j gem uses) was built to abstract away the REST format. The "graph" format returns data in a different way.
You have a couple of options. You could make the JSON queries yourself or you could retrieve the nodes and relationships from the queries that you perform and then build your own nodes/relationships structure which is returned. This might be more future-proof anyway if you ever want to switch to Bolt.
A way that you might do this in your case:
query = "MATCH path = (a)-[b]->(c) RETURN nodes(path) AS nodes, rels(path) AS rels"
result = Neo4j::Session.current.query(query)
response = {nodes: [], rels: []}
result.each do |row|
  response[:nodes].concat(row.nodes)
  response[:rels].concat(row.rels)
end
response[:nodes].uniq!
response[:rels].uniq!
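The same collect-and-deduplicate step can be sketched in Python for illustration. The row shape, the `id`/`start`/`end` keys, and the `nodes`/`links` output layout are assumptions chosen for the example; they are not something Neo4jrb returns.

```python
# Hypothetical query rows: each row carries the nodes and relationships of one path.
ROWS = [
    {"nodes": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
     "rels": [{"id": 10, "start": 1, "end": 2}]},
    {"nodes": [{"id": 2, "name": "b"}, {"id": 3, "name": "c"}],
     "rels": [{"id": 11, "start": 2, "end": 3}]},
]


def to_graph(rows):
    """Merge all rows into one structure, keeping each node and relationship once."""
    nodes, links = {}, {}
    for row in rows:
        for node in row["nodes"]:
            nodes.setdefault(node["id"], node)  # first occurrence wins
        for rel in row["rels"]:
            links.setdefault(rel["id"], {"source": rel["start"], "target": rel["end"]})
    return {"nodes": list(nodes.values()), "links": list(links.values())}


graph = to_graph(ROWS)
print(graph)
```

Keying on an id rather than calling a generic uniqueness helper makes the deduplication independent of how node objects compare for equality.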

store temp variables in neo4j

I have some cypher queries that I execute against my neo4j database. The query is in this form
MATCH p=(j:JOB)-[r:HAS|STARTS]->(s:URL)-[r1:VISITED]->(t:URL)
WHERE j.job_id =5000 and r1.origin='iframe' and r1.job_id=5000 AND NOT (t.netloc =~ 'VERY_LONG_LIST')
RETURN count(r1) AS number_iframes;
In case it isn't clear what I am doing, here is a much simpler query:
MATCH (s:WORD)
WHERE NOT (s.text=~"badword1|badword2|badword3")
RETURN s
I am basically trying to match some words against a specific list.
The problem is that this list is very large. As you can see, this is for job_id=5000 and I have more than 20000 jobs, so if my whitelist is 1 MB long then I will end up with very large queries. I tried 500 jobs and ended up with a 200 MB query file.
I was trying to execute these queries using transactions from py2neo, but this won't be feasible because my POST request will be very large and it will time out. As a result, I thought of using
neo4j-shell -file <queries_file>
However, as you can see, the file size is very large because of the large whitelist. So my question is: is there any way that I can store this "whitelist" in a variable in Neo4j using Cypher?
I wish there were something similar to this:
SAVE $whitelist="word1,word2,word3,word4,word5...."
MATCH p=(j:JOB)-[r:HAS|STARTS]->(s:URL)-[r1:VISITED]->(t:URL)
WHERE j.job_id =5000 and r1.origin='iframe' and r1.job_id=5000 AND NOT (t.netloc =~ $whitelist)
RETURN count(r1) AS number_iframes;
What datatype is your netloc?
If you have an index on netloc, you can also use t.netloc IN {list}, where {list} is a parameter provided from the outside.
Such large regular expressions will not be fast.
What exactly do your regexp and netloc format look like? Perhaps you can change that into a split + index-list lookup?
In general, you can also provide an outside parameter for regexps.
You can also use "IN" + index for job_ids.
You can also run a separate job that tags the jobs within your whitelist with a label, and use that label for additional filtering, e.g. already in the match.
Why do you have to check this twice? Isn't it enough that the job has id=5000?
j.job_id =5000 and r1.job_id=5000
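A minimal Python sketch of the parameter idea follows: the whitelist and the job ids travel as parameters, so the statement text stays short no matter how long the list gets. It assumes one whitelist shared by all jobs and exact matches on netloc (IN compares whole values, unlike the regular expression in the question). The `{name}` placeholders are Neo4j 2.x syntax (newer versions use `$name`), and `build_query` is a hypothetical helper.

```python
# Hypothetical whitelist; in practice it would be loaded once, e.g. from a file.
WHITELIST = ["example.com", "example.org"]

# Fixed statement text: the list and the job ids are referenced by name only.
STATEMENT = (
    "MATCH (j:JOB)-[r:HAS|STARTS]->(s:URL)-[r1:VISITED]->(t:URL) "
    "WHERE j.job_id IN {job_ids} AND r1.origin = 'iframe' "
    "AND r1.job_id = j.job_id AND NOT t.netloc IN {whitelist} "
    "RETURN j.job_id AS job_id, count(r1) AS number_iframes"
)


def build_query(job_ids, whitelist):
    """Return the fixed statement text plus the parameter map that goes with it."""
    return STATEMENT, {"job_ids": list(job_ids), "whitelist": list(whitelist)}


statement, parameters = build_query([5000, 5001], WHITELIST)
print(len(statement), "characters of Cypher;",
      len(parameters["whitelist"]), "whitelist entries")
```

With this shape the whitelist is sent once per request instead of being repeated inside every query, and one request covers several jobs.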

Neo4j Cypher 2.0: Pass in params for batch match - relationships

This question is similar to these: "create relationships between nodes in parallel" and "Neo4j: Best way to batch relate nodes using Cypher?"
I would like to parameterize a batch for creating relationships using a Cypher query and Neo4jClient (a C# client for Neo4j).
How would I write this out (specifically focusing on performance), i.e. using only MATCH and CREATE statements and not MERGE, since MERGE ends up timing out for some reason?
I was thinking I could do something like this (as stated in that second SO link):
MATCH (s:ContactPlayer {ContactPrefixTypeId:{cptid}})
MATCH (c:ContactPrefixType {ContactPrefixTypeId:{cptid}})
CREATE c-[:CONTACT_PLAYER]->s
with params:
{
"query":...,
"params": {
"cptid":id1
}
}
But this doesn't work, because it's trying to match the property as an Array.
I modified it to use WHERE x.Y IN {params} but this was extremely slow. The second recommendation was to try to use the transactional endpoint for Neo4j but I'm unsure how to do that with Neo4jClient.
This was the recommendation from the 2nd SO link above:
{
"statements":[
{
"statement":...,
"parameters": {
"cptid":id1
}
},
{
"statement":...,
"parameters": {
"cptid":id2
}
}
]
}
I did see this pull request but did not see that it had been implemented yet: https://github.com/Readify/Neo4jClient/pull/26
Without transaction support, is there another way to do this?
What's the performance when you use the query below?
USING PERIODIC COMMIT 1000
MATCH (s:ContactPlayer), (c:ContactPrefixType)
WHERE s.ContactPrefixTypeId = c.ContactPrefixTypeId
CREATE (c)-[:CONTACT_PLAYER]->(s)
If you want to try out the periodic commit statement, you'll have to use version 2.1.0-M1 for now. Otherwise, you can leave it out.
