Get the root parent node using Neo4j

I've imported a CSV file into Neo4j and created nodes and relationships for it.
In the above, the first two nodes come under db1 and the last four nodes come under db2.
How can I find out that the last four nodes belong to db2?
Below are the code and the CSV file:
columnname,tablename,databasename,systemname
abc,1a,db1,Finance
def,1a,db1,Finance
ghi,1a,db1,Finance
klm,1a,db1,Finance
abc,1a,db2,Medical
def,1a,db2,Medical
ghi,1a,db2,Medical
klm,1a,db2,Medical
nop,1a,db2,Medical
qrs,1a,db2,Medical
I've created nodes and relationships for the above csv file in neo4j
This is for getting unique values
CREATE CONSTRAINT ON (c:ColumnName) ASSERT c.ColumnName IS UNIQUE;
CREATE CONSTRAINT ON (c:TableName) ASSERT c.TableName IS UNIQUE;
CREATE CONSTRAINT ON (c:DatabaseName) ASSERT c.DatabaseName IS UNIQUE;
CREATE CONSTRAINT ON (c:SystemName) ASSERT c.SystemName IS UNIQUE;
This is for loading csv file and creating nodes and relationships
LOAD CSV WITH HEADERS FROM "file:///test.csv" AS line
MERGE (ColumnName:ColumnName {ColumnName: line.ColumnName})
MERGE (TableName:TableName {TableName:line.TableName})
MERGE (DatabaseName: DatabaseName {DatabaseName:line.DatabaseName})
MERGE (SystemName: SystemName {SystemName:line.SystemName})
This is creating relationships among the nodes
MERGE (ColumnName)-[:iscolumnof]->(TableName)
MERGE (TableName)-[:istableof]->(DatabaseName)
MERGE (DatabaseName)-[:isdatabaseof ]->(SystemName)
If I select the node 'nop' and expand it, I get the node 1a, and if I expand 1a, I get all the
column nodes. How can I find out that 'nop' belongs to 'db2'?

As far as I understand, you have a pattern
(:ColumnName)-[:iscolumnof]->(:TableName)-[:istableof]->(:DatabaseName)-[:isdatabaseof ]->(:SystemName)
If you want to test whether a certain :ColumnName belongs to a :DatabaseName:
WITH 'nop' AS columnName, 'db2' AS databaseName
MATCH (col:ColumnName {ColumnName:columnName}),(db:DatabaseName {DatabaseName:databaseName})
RETURN EXISTS((col)-[:iscolumnof]->(:TableName)-[:istableof]->(db)) AS result
If you want all the columns of db2
WITH 'db2' AS databaseName
MATCH (c:ColumnName)-[:iscolumnof]->(:TableName)-[:istableof]->(:DatabaseName {DatabaseName:databaseName})
RETURN c.ColumnName AS column
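To answer the question from the column's side, a minimal sketch is to walk the same pattern starting at 'nop' and collect every database it reaches. One caveat, assuming the data was imported exactly as in the question: the unique constraint on TableName means there is a single 1a node, and it is linked to both db1 and db2, so this query would probably list both databases for 'nop'.

```cypher
// Which database(s) can be reached from the column 'nop'?
MATCH (c:ColumnName {ColumnName: 'nop'})-[:iscolumnof]->(:TableName)-[:istableof]->(db:DatabaseName)
RETURN c.ColumnName AS column, collect(db.DatabaseName) AS databases
```

If each table should belong to exactly one database, one option (an assumption, not part of the original import) is to merge TableName nodes on a database-specific key, for example a hypothetical property that combines the database name and the table name.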

Related

How to combine similar nodes in neo4j

I have defined a few nodes and relationships in a Neo4j graph database, but the output is a bit different from what I expected: every row produces its own node with its own data and attributes. I want a single node per value that carries all of its relationships and attributes.
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line
CREATE(s:SourceID{Name:line.SourceID})
CREATE(t:Title{Name:line.Title})
CREATE(c:Coverage{Name:line.Coverage})
CREATE(p:Publisher{Name:line.Publisher})
MERGE (p)-[:PUBLISHES]->(t)
MERGE (p)-[:Coverage{covers:line.Coverage}]->(t)
MERGE (t)-[:BelongsTO]->(p)
MERGE (s)-[:SourceID]->(t)
In the given picture there are two nodes named Springer Nature. I want only one Springer Nature node, with all the data associated with both nodes attached to it.
First of all, I would recommend setting a CONSTRAINT before adding data.
The nodes are duplicated because each row runs CREATE for them, and nothing in the Cypher query says that a node with a given Name must be unique.
So in your case, try this first for each of the node labels:
CREATE CONSTRAINT publisherID IF NOT EXISTS FOR (n:Publisher) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT sourceID IF NOT EXISTS FOR (n:SourceID) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT titleID IF NOT EXISTS FOR (n:Title) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT coverageID IF NOT EXISTS FOR (n:Coverage) REQUIRE (n.Name) IS UNIQUE;
Even better would be to not use the name but a publisher ID. But this is your choice, and if there aren't thousands of publishers in the data, this will be no issue at all.
Also, I would use MERGE instead of CREATE for the nodes. The query is executed row by row, so with CREATE, as soon as a row refers to a node that already exists (which could happen on the second row or on the fiftieth), the query would fail because of the CONSTRAINT above.
And try everything on a blank database; for example, by deleting all nodes:
MATCH (n) DETACH DELETE n
To sum up, here are all the queries together; send each one separately:
CREATE CONSTRAINT publisherID IF NOT EXISTS FOR (n:Publisher) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT sourceID IF NOT EXISTS FOR (n:SourceID) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT titleID IF NOT EXISTS FOR (n:Title) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT coverageID IF NOT EXISTS FOR (n:Coverage) REQUIRE (n.Name) IS UNIQUE;
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line
MERGE(s:SourceID{Name:line.SourceID})
MERGE(t:Title{Name:line.Title})
MERGE(c:Coverage{Name:line.Coverage})
MERGE(p:Publisher{Name:line.Publisher})
MERGE (p)-[:PUBLISHES]->(t)
MERGE (p)-[:Coverage{covers:line.Coverage}]->(t)
MERGE (t)-[:BelongsTO]->(p)
MERGE (s)-[:SourceID]->(t)
RETURN count(p), count(t), count(c), count(s);
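Before wiping the database, it may help to confirm which names are actually duplicated, since creating a uniqueness constraint fails while duplicates still exist. A small sketch for the Publisher label from the question (the same pattern should work for the other labels):

```cypher
// List publisher names that are stored on more than one node
MATCH (p:Publisher)
WITH p.Name AS name, count(*) AS nodes
WHERE nodes > 1
RETURN name, nodes
ORDER BY nodes DESC
```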

Unable to link a node with itself using Neo4j

How can I create a relationship from a node to itself? I have one node label (p:person), and my CSV has two columns: name and vice. Each row in my CSV represents a person who was a CEO and their VP at the time. Sometimes a VP later became a CEO, so I want to show that relationship. Here is what I tried, with no luck. If I do not include the WITH, I get an error saying I need it, but when I add * or a property, it says it cannot find row. I'm stuck.
:auto USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///ceo_vp.csv' AS row
CREATE (p:person {name:coalesce(row.name,'UNK')})
MATCH (p:person {name:row.vice })
WITH *
CREATE (p)-[:was_vp_for]->(p)
The problem is the variable p: it is used for both the CEO and the VP, so you must assign a different variable name to the VP. Here is the script:
LOAD CSV WITH HEADERS FROM 'file:///ceo_vp.csv' AS row
MERGE (ceo:person {name:coalesce(row.name,'UNK')})
MERGE (vice:person {name:row.vice })
CREATE (vice)-[:was_vp_for]->(ceo)
Notice that I used MERGE because, as you said, a VP can be a former CEO (and vice versa), so MERGE is better than CREATE. MERGE reuses the person node if it already exists instead of creating a second one.
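A slightly more defensive variant of the script above, as a sketch: MERGE raises an error when the property it merges on is null, so this version assumes some rows may have an empty vice column and skips them. It also uses MERGE for the relationship, so a CEO/VP pair that appears twice in the file still gets only one was_vp_for relationship.

```cypher
LOAD CSV WITH HEADERS FROM 'file:///ceo_vp.csv' AS row
// Skip rows without a VP; MERGE cannot match on a null property value
WITH row WHERE row.vice IS NOT NULL
MERGE (ceo:person {name: coalesce(row.name, 'UNK')})
MERGE (vice:person {name: row.vice})
MERGE (vice)-[:was_vp_for]->(ceo)
```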

My match/merge process is not creating relationships in the Neo4J database

I am very new to Neo4j/cypher/graph databases, and have been trying to follow the Neo4j tutorial to import data I have in a csv and create relationships.
The following code does what I want in terms of reading in the data, creating nodes, and setting properties.
/* Importing data on seller-buyer relationships */
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///customer_rel_table.tsv' AS row
FIELDTERMINATOR '\t'
MERGE (seller:Seller {sellerID: row.seller})
ON CREATE SET seller += {name: row.seller_name,
root_eid: row.vendor_eid,
city: row.city}
MERGE (buyer:Buyer {buyerID: row.buyer})
ON CREATE SET buyer += {name: row.buyer_name};
/* Creating indices for the properties I might want to match on */
CREATE INDEX seller_id FOR (s:Seller) on (s.seller_name);
CREATE INDEX buyer_id FOR (b:Buyer) on (b.buyer_name);
/* Creating constraints to guarantee buyer-seller pairs are not duplicated */
CREATE CONSTRAINT sellerID ON (s:Seller) ASSERT s.sellerID IS UNIQUE;
CREATE CONSTRAINT buyerID on (b:Buyer) ASSERT b.buyerID IS UNIQUE;
Now I have the nodes (sellers and buyers) that I want, and I would like to link buyers and sellers. The code I have tried for this is:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///customer_rel_table.tsv' AS row
MATCH (s:Seller {sellerID: row.seller})
MATCH (b:Buyer {buyerID: row.buyer})
MERGE (s)-[st:SOLD_TO]->(b)
The query runs, but I don't get any relationships:
Query executed in 294ms. Query type: WRITE_ONLY.
No results.
Since I'm not asking it to RETURN anything, I think the "No results" comment is correct, but when I look at metadata for the DB, no relationships appear. Also, my data has ~220K rows, so 294ms seems fast.
EDIT: At @cybersam's prompting, I tried this query:
MATCH p=(:Seller)-[:SOLD_TO]->(:Buyer) RETURN p
which gives No results.
For clarity, there are two fields in my data that are the heart of the relationship:
seller and buyer, where the seller sells stuff to the buyer. The seller identifiers are repeated, but for each seller there are unique seller-buyer pairs.
What do I need to fix in my code to get relationships between the sellers and buyers? Thank you!
Your second query's LOAD CSV clause does not specify FIELDTERMINATOR '\t'. The default terminator is a comma (','). That is probably why it fails to MATCH anything.
Try adding FIELDTERMINATOR '\t' at the end of that clause.
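One way to check this, as a sketch that reuses the file name and column names from the question, is to look at how the first row is parsed. Without the terminator, the whole header line is likely read as a single column, so row.seller and row.buyer come back as null and the MATCH clauses find nothing.

```cypher
// Inspect the parsed columns of the first row
LOAD CSV WITH HEADERS FROM 'file:///customer_rel_table.tsv' AS row
FIELDTERMINATOR '\t'
RETURN keys(row) AS columns, row.seller AS seller, row.buyer AS buyer
LIMIT 1;
```

With the terminator in place, the relationship query from the question would then read:

```cypher
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///customer_rel_table.tsv' AS row
FIELDTERMINATOR '\t'
MATCH (s:Seller {sellerID: row.seller})
MATCH (b:Buyer {buyerID: row.buyer})
MERGE (s)-[:SOLD_TO]->(b);
```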

How to conditionally add property to relationship from CSV in Neo4j

I am making a Neo4j graph to show a network of music artists.
I have a CSV with a few columns. The first column is called Artist and is the person who made the song. The second and third columns are called Feature1 and Feature2, respectively, and represent the featured artists on a song (see example https://docs.google.com/spreadsheets/d/1TE8MtNy6XnR2_QE_0W8iwoWVifd6b7KXl20oCTVo5Ug/edit?usp=sharing)
I have merged so that any given artist has just a single node. Artists are connected by a FEATURED relationship with a strength property that represents the number of times someone has been featured. When the relationship is initialized, the relationship property strength is set to 1. For example, when (X)-[r:FEATURED]->(Y) occurs the first time r.strength = 1.
CREATE CONSTRAINT ON (a:artist) ASSERT a.artistName IS UNIQUE;
CREATE CONSTRAINT ON (f:feature) ASSERT f.artistName IS UNIQUE;
CREATE CONSTRAINT ON (f:feature1) ASSERT f.artistName IS UNIQUE;
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS from 'aws/artist-test.csv' as line
MERGE (artist:Artist {artistName: line.Artist})
MERGE (feature:Artist {artistName: line.Feature1})
MERGE (feature1:Artist {artistName: line.Feature2})
CREATE (artist)-[:FEATURES {strength:1}]->(feature)
CREATE (artist)-[:FEATURES {strength:1}]->(feature1)
Then I deleted the None node for songs that have no features
MATCH (artist:Artist {artistName:'None'})
OPTIONAL MATCH (artist)-[r]-()
DELETE artist, r
If X features Y on another song further down the CSV, the code currently creates another (duplicate) relationship with r.strength = 1. Rather than creating a new relationship, I'd like to have only the one (previously created) relationship and increase the value of r.strength by 1.
Any idea how can I do this? My current approach has been to just create a bunch of duplicate relationships, then go back through and count all duplicate relationships, and set
r.strength = #duplicate relationships. However, I haven't been able to get this to work, and before I waste more time on this, I figured there is a more efficient way to accomplish this.
Any help is greatly appreciated. Thanks!
You can use MERGE on the relationships together with ON CREATE SET and ON MATCH SET:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS from 'aws/artist-test.csv' as line
MERGE (artist:Artist {artistName: line.Artist})
MERGE (feature:Artist {artistName: line.Feature1})
MERGE (feature1:Artist {artistName: line.Feature2})
MERGE (artist)-[f1:FEATURES]->(feature)
ON CREATE SET f1.strength = 1
ON MATCH SET f1.strength = f1.strength + 1
MERGE (artist)-[f2:FEATURES]->(feature1)
ON CREATE SET f2.strength = 1
ON MATCH SET f2.strength = f2.strength + 1
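A possible variation, sketched under the assumption that songs without a featured artist carry the placeholder 'None' (as described in the question) or an empty cell: both feature columns are collected into a list, the placeholders are filtered out, and the same MERGE logic runs once per remaining name. This avoids creating the None node in the first place.

```cypher
LOAD CSV WITH HEADERS FROM 'aws/artist-test.csv' AS line
MERGE (artist:Artist {artistName: line.Artist})
// Keep only real feature names; drop nulls and the 'None' placeholder
WITH artist,
     [name IN [line.Feature1, line.Feature2]
      WHERE name IS NOT NULL AND name <> 'None'] AS features
UNWIND features AS featureName
MERGE (feature:Artist {artistName: featureName})
MERGE (artist)-[f:FEATURES]->(feature)
ON CREATE SET f.strength = 1
ON MATCH SET f.strength = f.strength + 1
```

Rows with no usable feature name produce an empty list, so UNWIND yields nothing for them and no relationship is touched.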

How does load csv create relations in neo4j?

I am using the following commands to load data from a csv file into Neo4j. The input file is large and there are millions of rows.
While this query is running, I can query for the number of nodes and check the progress. Once it stops creating nodes, I guess it moves on to creating relationships, but I am not able to check the progress of that step.
I have two questions:
Does it process the command for each line of the file, i.e. create the nodes and relationships for each source line?
Or does it create all the nodes in one shot and then create the relationships?
In any case, I want to monitor the progress of the following command. It seems to get stuck after creating the nodes, and when I query for the number of relationships, I get 0 as output.
I created a constraint on the key attribute.
CREATE CONSTRAINT ON (n:Node) ASSERT n.key is UNIQUE;
Here is the cypher that loads the file.
USING PERIODIC COMMIT
LOAD CSV FROM "file:///data/abc.csv" AS row
MERGE (u:Node {name:row[1],type:row[2],key:row[1]+"*"+row[2]})
MERGE (v:Node {name:row[4],type:row[5], key:row[4]+"*"+row[5]})
CREATE (u) - [r:relatedTo]-> (v)
SET r.type = row[3], r.frequency=toint(trim(row[6]));
For every row of your CSV file, Neo4j executes the whole Cypher script, i.e.:
MERGE (u:Node {name:row[1],type:row[2],key:row[1]+"*"+row[2]})
MERGE (v:Node {name:row[4],type:row[5], key:row[4]+"*"+row[5]})
CREATE (u) - [r:relatedTo]-> (v)
SET r.type = row[3], r.frequency=toint(trim(row[6]))
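Because of USING PERIODIC COMMIT, a commit is done every 1000 rows (the default value).
You can only see changes in your graph once Neo4j has finished processing and committing a batch of rows.
But your script is not optimized: the MERGE matches on name, type and key together, so it does not take advantage of the unique constraint on key.
You should consider this script instead, which merges on key alone and sets the other properties only on creation: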
USING PERIODIC COMMIT
LOAD CSV FROM "file:///data/abc.csv" AS row
MERGE (u:Node {key:row[1]+"*"+row[2]})
ON CREATE SET u.name = row[1],
u.type = row[2]
MERGE (v:Node {key:row[4]+"*"+row[5]})
ON CREATE SET v.name = row[4],
v.type = row[5]
CREATE (u)-[r:relatedTo]->(v)
SET r.type = row[3], r.frequency=toint(trim(row[6]));
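To follow the progress, a simple sketch is to run count queries from a second session while the import is still going. Only batches that have already been committed are visible there, so the numbers should grow in steps rather than row by row.

```cypher
// Run in a separate session during the import
MATCH (n:Node)
RETURN count(n) AS nodesSoFar;

MATCH (:Node)-[r:relatedTo]->(:Node)
RETURN count(r) AS relationshipsSoFar;
```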
Cheers
