I'm using Neo4j Community edition 2.1.4. I have hierarchy of 4 levels and each level names i have treated it as label name for that level.So in my graph i have totally 4 labels. Now for the first time I have loaded csv file into neo4j and using MERGE and CREATEkeywords created the nodes and relationships. In future the requirement is like
scenario 1:
if someone wants to rename the hierarchy level name to some new name, then I have to
change the label name to a new name.
Scenario 2:
if any of the property name of node changes to to new name
In the both the cases i wanted to track the history of the node. How can i do it? So that in future someone wants to see the history details , they can query and get the details.
So how can i track the history details of the nodes in neo4j?
My Approch:
For the first time i will load the csv file and create nodes and relationships. Then if someone wants to change the label name of node A(level name which is standard) which is having a properties like ID, name,start_date,end_date,Status.Then i will replicate the node A with all the properties and change the status to inactive and i will overwrite the old node with the new details. But i'm clueless whether this solution is going to work or not. Also i have more thane 10000 nodes in my db.
So please suggest me a better approach to track the nodes history.
Have a look at the GraphAware ChangeFeed Module. You can track history of changes and also configure which changes you would like to track and which to ignore.
You can use versionning. Examples in this blog post : neo4j.org/graphgist?608bf0701e3306a23e77 that you can adapt for your needs
Related
In early editions of Neo4j, Super nodes were typically seen as a bad thing for performance. I have not seen too much about that recently with the 2.X and 3.X releases so was wondering if that was still a problem.
The issue I have is I need to store a finite number of options for a specific Node type. For example, Person and favorite colors. I can store an array in the Person Node that stores the colors the user likes, or I can create a Node for each color and then create a relationship from the Person to the Color Node. It seems the super node option would be faster to query but am worried as super nodes were bad in the past.
If I am trying to look up people who like a specific color, what's the recommended way to store such data in Neo?
I think the major issue here will be that the Color node will become a very connected node.
Maybe you need an Options subgraph to have a template of these options and then :
copy template node option to link this copy with the main entity node
or
copy only the choosen option into a property of the main entity node, like your array proposition
or, if your option has no properties
add label onto your main entity node
I think, even with the increase in performance of hyperlinked nodes with newer Neo4j versions, the read/write time will be always more than the one who has less.
I hope this help a bit.
I know how this would be accomplished in SQL but having difficulty wrapping my brain around how to do this in cypher..
Basically working on a master data setup where a user has a master_id (node) and need to use an existing relationship property to determine the master_id in order to create a new relationship between the master_id node and a location node.
Currently have master users created as nodes that contain a master_id property. A relationship is created between the master user and a brand, and the relationship has a property of brand_user_id.
I now have another file I need to import that contains data at the brand_user level, but need to create the relationship between the master_id and a location node. In order to do this because the file does not contain the master_id property, I am attempting to use the new file to lookup master_id's based on the existing relationship with the brand, then use that master_id to create the new relationship with the location.
Have this relationship:
(m:Master{master_id:12345})-[:IS_BRAND_USER{brand_user_id:9876}]->(b:Brand{name:"Acme"})
Have this file:
brand_user_id,location_id
9876,6
Need this relationship:
(m:Master{master_id:12345})-[:HAS_LOCATION]->(l:Location{id:6})
My approach:
LOAD CSV WITH HEADERS FROM "file:///brand_user_ids.csv" as buid
MATCH (m:Master)-[r:IS_BRAND_USER{brand_user_id:buid.id}]->(b:Brand)
WITH m, buid.location_id AS location_id
MATCH (l:Location {id: location_id})
CREATE (m)-[:HAS_LOCATION {source: 'abcdef'}]->(l)
Seems to run for an extremely long time and not seeing any real progress after an hour so I'm wondering if this is fundamentally the correct approach or not, or if I have inadvertently created some horrific cross join equivalent.
The problem is that you are trying to enter a graph on a relationship. And that always requires lots of "scanning the graph".
Now, I'm not a specialist in your domain, but you might be missing a type of nodes here ... BrandUser. And there could be several reasons for that :
Based on your data a Master can have many BrandUser id's. Potentially even more than one per Brand ? Do you have other properties that make sense on the BrandUser level ?
That Location data is strange. Wouldn't you agree it's actually the BrandUser that has a location and that a Master may have many locations ?
The most important reason is however ... if you're going to enter the graph on the brand_user_id all the time (and judging from the location example that may be the case) ... you've got the reason to turn it into a node right there.
So ... it's a modeling issue really.
Hope this helps.
Regards,
Tom
I'm using ne04j 2.1.2 community edition.
I have a nodes with a label called Company and I created these nodes and label by loading CSV file along with the MERGE and CREATE commands.
So in future if my label names changes,say Company to Organization, I wanted to maintain the createddate, UpdatedDate, NewLabelName, OldLabelName values somewhere.
So in order to achieve that I thought of maintaining one master node which holds the label information i.e., it should have the properties like NewLabelName, OldLabelName, CreatedDate, UpdatedDate. So the label name should come from the Master Node to other nodes. Whenever we made any changes to label ,then the corresponding UpdatedDate property value should be updated in the master node and NewLabelName should come from the master node to other nodes (nodes for which that label belongs to) .
Hope you understand the scenario here.
But how can i achieve this ? is it possible to achieve ? if yes, then how can i define the relationship between master and other nodes?
(Here my other nodes are Name of the Companies like Google, Yahoo, Samsung etc.. and those will be having some other child nodes like location)
Please suggest the solution. (I wanted to achieve these using cypher not using java)
Thanks
Although labels can be changed, you should do that rarely (e.g., to recover from a mistake). Changing a large number of labels is very expensive and should never be done as a part of normal processing.
Also, like a Java class name, a label name is not something you'd normally show to end users. So, there is really no reason to ever change them. Just try to pick reasonable label names to start with, and don't plan to change them.
We are working on a system where users can define their own nodes and connections, and can query them with arbitrary queries. A user can create a "branch" much like in SCM systems and later can merge back changes into the main graph.
Is it possible to create an efficient data model for that in Neo4j? What would be the best approach? Of course we don't want to duplicate all the graph data for every branch as we have several million nodes in the DB.
I have read Ian Robinson's excellent article on Time-Based Versioned Graphs and Tom Zeppenfeldt's alternative approach with Network versioning using relationnodes but unfortunately they are solving a different problem.
I Would love to know what you guys think, any thoughts appreciated.
I'm not sure what your experience level is. Any insight into that would be helpful.
It would be my guess that this system would rely heavily on tags on the nodes. maybe come up with 5-20 node types that are very broad, including the names and a few key properties. Then you could allow the users to select from those base categories and create their own spin-offs by adding tags.
Say you had your basic categories of (:Thing{Name:"",Place:""}) and (:Object{Category:"",Count:4})
Your users would have a drop-down or something with "Thing" and "Object". They'd select "Thing" for instance, and type a new label (Say "Cool"), values for "Name" and "Place", and add any custom properties (IsAwesome:True).
So now you've got a new node (:Thing:Cool{Name:"Rock",Place:"Here",IsAwesome:True}) Which allows you to query by broad categories or a users created categories. Hopefully this would keep each broad category to a proportional fraction of your overall node count.
Not sure if this is exactly what you're asking for. Good luck!
Hmm. While this isn't insane, think about the type of system you're replacing first. SQL. In SQL databases you wouldn't use branches because it's data storage. If you're trying to get data from multiple sources into one DB, I'd suggest exporting them all to CSV files and using a MERGE statement in cypher to bring them all into your DB at once.
This could manifest similar to branching by having each person run a script on their own copy of the DB when you merge that takes all the nodes and edges in their copy and puts them all into a CSV. IE
MATCH (n)-[:e]-(n2)
RETURN n,e,n2
Then comparing these CSV's as you pull them into your final DB to see what's already there from the other copies.
IMPORT CSV WITH HEADERS FROM "file:\\YourFile.CSV" AS file
MERGE (N:Node{Property1:file.Property1, Property2:file.Property2})
MERGE (N2:Node{Property1:file.Property1, Property2:file.Property2})
MERGE (N)-[E:Edge]-(N2)
This will work, as long as you're using node types that you already know about and each person isn't creating new data structures that you don't know about until the merge.
I deleted the reference node. So I need to recreate the reference node.
Using cypher how to create a node with id 0?
thanks.
The short answer is you can't, and you don't need to. Do you have a specific problem without that node? If so, maybe you can elaborate, chances are there is something else that answers your problem better than trying to recreate a node with a specific id.
The long answer is you can't assign id:s to nodes with cypher. The id is an index or offset into the node storage on disk, so it makes sense to let Neo4j worry about it and not try to manipulate it or include it in any application logic. See Node identifiers in neo4j and Has anyone used Neo4j node IDs as foreign keys to other databases for large property sets?.
You also most likely don't need a reference node. It is created by default in a new database, but it's use is deprecated and it won't exist in future releases. See Is concept of reference node in neo4j still used or deprecated?.
If you still want to assign id to nodes you create, it is accidentally possible in a roundabout way with with the CSV batch importer (1,2) and, I believe, with the Java API batch inserter.
If you still want to recreate or simulate the reference node you can either delete the database data files and let Neo4j recreate the the database, or you can try what this person did: Recreate reference node in a Neo4j database. You can also force Neo4j to recycle the ids of deleted nodes faster, so that new nodes that you create receive those ids that have been freed up and not yet reassigned.