How to use "MATCH.. CREATE UNIQUE.." with neography - neo4j

I'm trying to write an importer script that will take a MySQL resultset and add nodes and relationships in my Neo4J DB. To simplify things, my resultset has the following fields:
application_id
amount
application_date
account_id
I want to create a node with Application label with the application_id, amount, application_date fields and another node with Account label with account_id field and a relationship between them.
My SQL query is from the applications table so I'm not afraid of dups there, but an account_id can appear more than once and I obviously don't want to create multiple nodes for that.
I'm using neography (but willing to switch if there is something simpler). What would be the best (easiest) way to achieve that?
My script will drop the database before it starts so no leftovers to take care of.
Should I create an index before and use create_unique_node?
I think I can do what I want in cypher using "MATCH .. CRAETE UNIQUE..", what's the equivalent of that in neography? I don't understand how index_name gets into the equation...
Should I or should I not define the constraint on Account?
Many thanks,
It's my first time with graph DBs so apologies if I miss a concept here..

From what you describe, it looks like you should use the constraints that come with Neo4j 2.0 : http://docs.neo4j.org/chunked/milestone/query-constraints.html#constraints-create-uniqueness-constraint
CREATE CONSTRAINT ON (account:ACCOUNT) ASSERT account.account_id IS UNIQUE
Then you can use the MATCH .. CREATE UNIQUE clauses for all your inserts. You can use neography to submit the cypher queries, see the examples here: https://github.com/maxdemarzi/neography/wiki/Scripts-and-queries

Related

Cypher preventing a relationship from node b to a if the same relation from a to b exist?

I want to prevent a relationship between two nodes in Neo4j if the same relation from the different side is already present i.e.
create (a)-[r:Variation]->(b)
if and only if (b)-[r:Variation]->(a) is not present in the database ?
If your query only does relationship creation (nothing else after this), then just add WHERE NOT (b)-[:Variation]->(a) before your CREATE (I'm assuming there's a MATCH to a and b above this you didn't provide).
But if there's additional logic afterward and you want the query to keep executing whether or not the conditional is met, you may want to take a look at conditional procs in APOC Procedures, specifically apoc.do.when().

How could i use this SQL on cypher(neo4j)

hi how can i transform this SQL Query as CYPHER Query ? :
SELECT n.enginetype, n.Rocket20, n.Yearlong, n.DistanceOn,
FROM TIMETAB AS n
JOIN PLANEAIR AS p ON (n.tailnum = p.tailNum)
If it is requisition before using that query to create any relationship or antyhing please write and help with that one too.. thanks
Here's a good guide for comparing SQL with Cypher and showing the equivalent Cypher for some SQL queries.
If we were to translate this directly, we'd use :PLANEAIR and :TIMETAB node labels (though I'd recommend using better names for these), and we'll need a relationship between them. Let's call it :RELATION.
Joins in SQL tend to be replaced with relationships between nodes, so we'll need to create these patterns in your graph:
(:PLANEAIR)-[:RELATION]->(:TIMETAB)
There are several ways to get your data into the graph, usually through LOAD CSV. The general approach is to MERGE your :PLANEAIR and :TIMETAB nodes with some id or unique property (maybe TailNum?, use ON CREATE SET ... after the MERGE to add the rest of the properties to the node when it's created, and then MERGE the relationship between the nodes.
The MERGE section of the developers manual should be helpful here, though I'd recommend reading through the entire dev manual anyway.
With this in place, the Cypher equivalent query is:
MATCH (p:PLANEAIR)-[:RELATION]->(n:TIMETAB)
RETURN n.Rocket20,p.enginetype, n.year, n.distance
Now this is just a literal translation of your SQL query. You may want to reconsider your model, however, as I'm not sure how much value there is in keeping time-related data for a plane separate from its node. You may just want to have all of the :TIMETAB properties on the :PLANEAIR node and do away with the :TIMETAB nodes completely. Of course your queries and use cases should guide how to model that data best.
EDIT
As far as creating the relationship between :PLANEAIR and :TIMETAB nodes (and again, I recommend using better labels for these, and maybe even keeping all time-related properties on a :Plane node instead of a separate one), provided you already have those nodes created, you'll need to do a joining match, but it will help to have a unique constraints on :PLANEAIR(tailnum) :TIMETAB(tailNum) (or an index, if this isn't supposed to be a unique property):
CREATE CONSTRAINT ON (p:PLANEAIR)
ASSERT p.tailNum IS UNIQUE
CREATE CONSTRAINT ON (n:TIMETAB)
ASSERT n.TailNum IS UNIQUE
Now we're ready to create the relationships
MATCH (p:PLANEAIR)
MATCH (n:TIMETAB)
WHERE p.tailNum = n.tailNum
CREATE (p)-[:RELATION]->(n)
REMOVE n.tailNum
Now that the relationships are created, and :TIMETAB tailNum property removed, we can drop the unique constraint on :TIMETAB(tailNum), since the relationship to :PLANEAIR is all we need.
DROP CONSTRAINT ON (n:TIMETAB)
ASSERT n.tailNum IS UNIQUE

How do I refactor data two neo4j nodes to a relationship?

I'm doing an experiment with using a graph database (neo4j). I have two csv's that I imported into a neo4j datastore. I'm a little shakey on the neo terminology; so forgive me. Lets say I have:
Customer (AccountNumber, CustomerName) and
CustomerGroups (AccountNumber, GroupName).
I would like to create a new Node called groups which is comprised of the distinct GroupName from CustomerGroups. I'll call it Group.
I then want to create relationships "HAS_GROUP" from Customer to Group using the common AccountNumber from CustomerGroups.
Once the above is completed, I could delete CustomerGroups as its no longer needed.
I'm just stuck at the syntax. I can get the distinct groups from CustomerGroups with:
MATCH (n:CustomerGroups) distinct n.GROUP_NAME
and I get back about 50 distinct groups, but can't figure how to add the create statement to the results and CREATE g:Group {GroupName: n.GROUP_NAME}
I then know my followup question is how to do the MATCH to the new group using the old table with common account numbers.
FYI: I've indexed the AccountNumber in both Nodes. Both Customer and CustomerGroups have over 5 Million nodes. Not bad for a laptop (2 min to import using neo4j-import). I was impressed!
Thanks for any help you can give!
Instead of creating a CustomerGroups label and creating nodes for that, you should be able to define relationships that you would like to create in your neo4j-import. It would certainly be a lot faster too. See:
http://neo4j.com/docs/stable/import-tool-header-format.html
To your question, you could probably do something like:
MATCH (cg:CustomerGroup)
MATCH (customer:Customer {AccountNumber: cg. AccountNumber}), (group:Group {GroupName: cg.GroupName})
CREATE (customer)-[:IN_GROUP]->(group)
You'd definitely want to make sure you have indexes on :Customer(AccountNumber) and :Group(GroupName) first. But even then it would still be much slower than doing it as part of your initial import.
Also, you may or may not want MERGE instead of CREATE

How to determine the Max property on a Relationship in Neo4j 2.2.3

How do you quickly get the maximum (or minimum) value for a property of all instances of a relationship? You can assume the machine I'm running this on is well within the recommended spec's for the cpu and memory size of graph and the heap size is set accordingly.
Facts:
Using Neo4j v2.2.3
Only have access to modify graph via Cypher query language which I'm hitting via PHP or in the web interfacxe--would love to avoid any solution that requires java coding.
I've got a relationship, call it likes that has a single property id that is an integer.
There's about 100 million of these relationships and growing
Every day I grab new likes from a MySQL table to add to the graph within in Neo4j
The relationship property id is actually the primary key (auto incrementing integer) from the raw MySQL table.
I only want to add new likes so before querying MySQL for the new entries I want to get the max id from the likes, so I can use it in my SQL query as SELECT * FROM likes_table WHERE id > max_neo4j_like_property_id
How can I accomplish getting the max id property from neo4j in a optimal way? Please indicate the create statement needed for any index as well as the query you'd used to get the final result.
I've tried creating an index as follows:
CREATE INDEX ON :likes(id);
After the index is online I've tried:
MATCH ()-[r:likes]-() RETURN r.i ORDER BY r.id DESC LIMIT 1
as well as:
MATCH ()-[r:likes]->() RETURN MAX(r.id)
They work but take freaking forever as the explain plan for both indicate no indexes being used.
UPDATE: Holy $?##$?!!!! It looks like the new schema indexes aren't functional for relationships even though you can create them and show them with :schema. It also looks as if there's no way with cypher directly to create Legacy Indexes which look like they might solve this issue.
If you need to query relationship properties, it is generally a sign of a model issue.
The need of this query reveals you that you would better extract these properties into a node, that you'll then be able to query faster.
I don't say it is 100% the case, but certainly 99% of the people seen so far with the same problem has been demonstrating this model concern.
What is your model right now ?
Also you don't use labels at all in your query, likes have a context bound to the nodes.

How can a cypher query be written to create hyperedges?

I have a graph with offers and customers. A customer can share an offer with another customer, so when this happens, I create a hyperedge.
(CustomerA)-[:SHARED_OFFER]->(newNode)
(newNode)-[:FOR_OFFER]->(offer)
(newNode)-[:SHARED_WITH]->(customers) (this can be many customers)
Now if another customer B shares the same offer with others, I would want a new node created to represent this relationship.
Is there a way to accomplish all this in one Cypher query?
I am using:
start c=node:node_auto_index(name="C1"), o=node:node_auto_index(name="Offer"), sharedCustomer=node:node_auto_index(name="C2")
create unique c-[:SHARED_OFFER]->(sharedOffer)-[:FOR_OFFER]->(o), (sharedOffer)-[:SHARED_WITH]->(sharedCustomer)
which works for the first time. See the console at: http://console.neo4j.org/r/76no2g
This query correctly created relationships when C1 shared Offer with C2.
Executing the query for the case when C2 now shares Offer with C3 causes the same node to be reused---that's not what I want. There should be a new node created from C2 with the SHARED_OFFER relationship. Here is the query:
start c=node:node_auto_index(name="C2"), o=node:node_auto_index(name="Offer"), sharedCustomer=node:node_auto_index(name="C3")
create unique c-[:SHARED_OFFER]->(sharedOffer)-[:FOR_OFFER]->(o), (sharedOffer)-[:SHARED_WITH]->(sharedCustomer)
Any help is appreciated.
Note: I'm using 1.8.1 REST, so trying to accomplish this all in one go rather than in parts.
simply don't use create unique but just create. but remember to first specify all nodes you got in the path in the start clause and leave the param sharedOffer always unspecified so the create command will create just the unspecified elements.
update use create instead of create unique and filter on existing relations (or use 2 queries - one to check whether an sharedOffer already exists with C1 and second to update sharedOffer with C3):
START c=node:node_auto_index(name="C1"), o=node:node_auto_index(name="Offer"), sharedCustomer=node:node_auto_index(name="C2")
WHERE not(c-[:SHARED_OFFER]->(sharedOffer)-[:FOR_OFFER]->(o))
CREATE c-[:SHARED_OFFER]->(sharedOffer)-[:FOR_OFFER]->(o), (sharedOffer)-[:SHARED_WITH]->(sharedCustomer)

Resources