I'm trying to understand how to use linked lists to improve performance and build activity feeds in Neo4j. I'm still learning Cypher, so I have a question. I've found some examples of linked lists, but I need larger examples to finally put all the pieces together in my head.
I've used this code from grepcode and found it more helpful than the example in the Neo4j manual, yet I'm still a bit confused. Can someone modify it to have, say, seven nodes in the linked list, and then insert a node at the front of it?
Yes, I'm trying to put the latest status update at the top of the linked list. This example doesn't really do that, but it's close, so I'm looking for some modifications. No, I'm not really coding yet; I'm still trying to master Cypher first and will continue to study it for the next two weeks. I have the Ruby on Rails side working and just need to understand linked lists used with Cypher/Neo4j a bit better.
CREATE zero={name:0,value:0}, two={value:2,name:2}, zero-[:LINK]->two-[:LINK]->zero
==== zero ====
MATCH zero-[:LINK*0..]->before,
after-[:LINK*0..]->zero,
before-[old:LINK]->after
WHERE before.value? <= 1 AND
1 <= after.value?
CREATE newValue={name:1,value : 1},
before-[:LINK]->newValue,
newValue-[:LINK]->after
DELETE old
==== zero ====
MATCH p = zero-[:LINK*1..]->zero
RETURN length(p) as list_length
What I'm trying to do is understand what before, after and zero each match. I almost have it, but I want to see how it's done on a list with more than two starting nodes, to clear up any confusion.
Thank you!
The node in front is special, as it doesn't have an incoming LINK relationship. Usually you also keep a connection to the head node somewhere, so inserting at the front means replacing that link to the head node and moving the old head one step further away. Something like this:
start user=node:node_auto_index(user="me")
match user-[old:MESSAGES]->head
delete old
create new_head = { title: "Title", date : 2348972389, text: "Text" },
user-[:MESSAGES]->new_head-[:LINK]->head
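To see the same pointer swap outside the database, here is a minimal in-memory sketch in plain Java. It is only a model of the idea, not Neo4j code: the class and method names (FeedHeadInsert, User, Status, post, feed) are made up for illustration, the head field stands in for the MESSAGES relationship, and next stands in for LINK. It builds a list of seven statuses and then puts one more on the front.

```java
// Illustrative model only; none of these names come from Neo4j or the thread.
final class FeedHeadInsert {
    static final class Status {
        final String text;
        Status next;                    // stands in for the LINK relationship
        Status(String text) { this.text = text; }
    }

    static final class User {
        Status head;                    // stands in for the MESSAGES relationship

        /** Puts a new status in front of the current head. */
        void post(String text) {
            Status newHead = new Status(text);
            newHead.next = head;        // new_head-[:LINK]->head
            head = newHead;             // replace user-[:MESSAGES]->head with the new node
        }

        /** Walks the list from the head, newest first. */
        String feed() {
            StringBuilder sb = new StringBuilder();
            for (Status s = head; s != null; s = s.next) {
                if (sb.length() > 0) sb.append(" | ");
                sb.append(s.text);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        User me = new User();
        for (int i = 1; i <= 7; i++) me.post("status " + i);   // seven items
        me.post("latest status");                              // insert at the front
        System.out.println(me.feed());                         // newest entry comes first
    }
}
```

Only two references change per insert, no matter how long the list already is, which is why the head-pointer layout suits "latest update on top" feeds.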
Related
I created a simple csv that has some boxing matches. I'm trying to figure out how to model this in Neo4j.
The csv looks like this:
My interest in practicing using this small dataset in Neo4j was because it seems like Neo4j would be a good way to easily query who fought who, and who had common opponents, or whatever.
My first thought was that naturally, each boxer should be represented in a 'boxer' node, and each fight should be represented in a 'fight' node.
After modeling it this way, I realized that there isn't actually one node for each boxer, because the boxer's age changes over time. So each boxer would need a separate node for each fight. For example, Glass Joe has 2 fights and thus appears twice: once when he was 23, and once the next year, when he battled Sandman and was 24:
But this kinda defeats the purpose. Now, my graph will be made up of disconnected sets of 3 nodes, one for each fight in the csv. So what's the purpose?
My question is, how can I model such a simple yet complex situation like this: some type of tournament or game that changes over time, and the properties of the competitors' nodes change -- yet we want the graph to be connected:
(oops: Sandman should now be 51)
But then again, I don't think the above image is correct -- the edges shown are actually properties of the boxer node. If they are properties of the boxer...then they don't belong on the edge, right?
Here is my code so far (and the csv lives here):
LOAD CSV WITH HEADERS FROM
'file:///<grab it from dropbox please!>' AS line
CREATE (b:boxer {boxer_id: line.boxer_id, name: line.name})
SET b.age = TOINT(line.age);
LOAD CSV WITH HEADERS FROM
'file:///<grab it from dropbox please!>' AS line
MERGE (f:fight {fight_id: line.fight_id});
I end up with these nodes:
...but not sure how to connect them. Any advice or recommendations would be greatly appreciated.
Your first instinct was right. Ideally, if you had the boxer's birthday, that's what you would store. That would also help you tell apart boxers who have the same name/nickname. Storing the boxer's age on the relationship is a good idea, though.
If you really wanted to store each node for each boxer for each row you could do the following:
(:BoxerRecord)-[:FOUGHT_IN]->(:Fight)
(:BoxerRecord)-[:REPRESENTS]->(:Boxer)
So basically you use the CREATE clause for each BoxerRecord and MERGE for each Boxer, so that repeated rows for the same boxer resolve to a single Boxer node.
Then if you wanted to find all of the boxers that two people have fought in common (I'm making up an :
MATCH
(b1:Boxer {boxer_id: 100}),
(b2:Boxer {boxer_id: 101}),
(b1)<-[:REPRESENTS]-(:BoxerRecord)-[:FOUGHT_IN]->(:Fight)<-[:FOUGHT_IN]-(:BoxerRecord)-[:REPRESENTS]->(common_boxer:Boxer)<-[:REPRESENTS]-(:BoxerRecord)-[:FOUGHT_IN]->(:Fight)<-[:FOUGHT_IN]-(:BoxerRecord)-[:REPRESENTS]->(b2)
RETURN common_boxer, count(*)
I am looking for a template of sorts for merging two linked chains that have already been sorted. I'm still fairly new to Java, and this seems to be a pretty challenging task with the limited knowledge I have. I understand how to merge sort an array, but when it comes to linked lists I seem to be drawing a blank. Any help you could give me, be it actual code or simply advice on where to start, would be greatly appreciated.
Thank you for your time!
If the two linked lists are already sorted, merging them is easy. I'll describe the algorithm, but you should write the code yourself, since this looks like a school project. First create a new linked list and make its head the smaller of list1Head and list2Head. Then walk the two lists: at each step pick the smaller of the two current nodes, append it to the new list, and advance the current pointer of the list it came from to its .next. When one list runs out of nodes, append the rest of the other list directly to the new list. Done.
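For anyone who wants something to check their own attempt against, here is a minimal sketch of that walk in Java. It is one possible reading of the description, not the answerer's code: the names (SortedListMerge, Node, merge, fromValues, render) are hypothetical, values are plain ints, and it copies the leftover nodes rather than splicing them in, so the two input lists stay untouched.

```java
// Hypothetical names throughout; the asker's own node class may look different.
final class SortedListMerge {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    /** Merges two ascending lists into a newly allocated ascending list. */
    static Node merge(Node a, Node b) {
        Node dummy = new Node(0);           // placeholder so the first append needs no special case
        Node tail = dummy;
        while (a != null && b != null) {
            if (a.value <= b.value) {       // on ties, take the node from list a first
                tail.next = new Node(a.value);
                a = a.next;
            } else {
                tail.next = new Node(b.value);
                b = b.next;
            }
            tail = tail.next;
        }
        // One list is exhausted: copy whatever remains of the other.
        for (Node rest = (a != null) ? a : b; rest != null; rest = rest.next) {
            tail.next = new Node(rest.value);
            tail = tail.next;
        }
        return dummy.next;
    }

    /** Builds a list from values that are assumed to be in ascending order already. */
    static Node fromValues(int... values) {
        Node dummy = new Node(0);
        Node tail = dummy;
        for (int v : values) {
            tail.next = new Node(v);
            tail = tail.next;
        }
        return dummy.next;
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node merged = merge(fromValues(1, 4, 9), fromValues(2, 4, 10, 11));
        System.out.println(render(merged));   // prints "1 2 4 4 9 10 11"
    }
}
```

The dummy node is a design convenience: without it, choosing the head of the new list becomes a separate case before the loop, exactly as the description spells out.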
Can't you look at the first element of each list and take the smaller one? That is the start of the new list. Remove it from the front of whichever list it came from. Now look at the first elements again, take the smaller one, and make it the second element of the new list. Then just repeat this process, zipping the two lists together.
If you want to avoid creating a new list, then just find the smallest element, look at the node it points to and at the beginning of the other list, and see which is smaller. If you are not already pointing at the smaller one, update the pointer so that you are. Then rinse and repeat.
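A rough sketch of that in-place variant in Java, under the same assumptions as any toy example: int values, and hypothetical names (InPlaceMerge, Node, merge, fromValues, render). No nodes are allocated during the merge; existing next pointers are redirected, so both input lists are consumed.

```java
// Hypothetical names; a sketch of the pointer-redirecting idea, not a reference implementation.
final class InPlaceMerge {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    /** Merges two ascending lists by relinking their nodes. Equal values may come out in either order. */
    static Node merge(Node a, Node b) {
        if (a == null) return b;
        if (b == null) return a;
        // Start from whichever list holds the smallest element; that node is the merged head.
        if (b.value < a.value) { Node t = a; a = b; b = t; }
        Node head = a;
        Node current = a;      // last node already in its final position
        Node other = b;        // front of the list we are not currently walking
        while (current.next != null && other != null) {
            if (current.next.value <= other.value) {
                current = current.next;        // already pointing at the smaller one
            } else {
                Node skipped = current.next;   // redirect the pointer to the smaller node
                current.next = other;
                other = skipped;               // the bypassed chain becomes the "other" list
                current = current.next;
            }
        }
        if (current.next == null) current.next = other;   // attach whatever is left
        return head;
    }

    /** Builds a list from values that are assumed to be in ascending order already. */
    static Node fromValues(int... values) {
        Node dummy = new Node(0);
        Node tail = dummy;
        for (int v : values) {
            tail.next = new Node(v);
            tail = tail.next;
        }
        return dummy.next;
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node merged = merge(fromValues(1, 4, 9), fromValues(2, 4, 10, 11));
        System.out.println(render(merged));   // prints "1 2 4 4 9 10 11"
    }
}
```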
UPDATED: Wes hit a home run here! Thanks. I've added a Rails version I was developing using the neography gem. It accomplishes the same thing, but his version is much faster; see the comparison below.
I am using a linked list in Neo4j (1.9, REST, Cypher) to help keep the comments in proper order (Yes I know I can sort on the time etc).
(object node)---[:comment]--->(comment)--->(comment)--->(comment).... etc
Currently I have 900 comments and it's taking 7 seconds to get through the whole list, which is completely unacceptable. I'm just returning the ID of the node (I know, don't do this, but it's not the point of my post).
What I'm trying to do is find the IDs of users who commented so I can return a count (like "Joe and 405 others commented on your post"). I'm not even counting the unique nodes at this point; I'm just returning the author_id for each record. (I'll worry about counting later; first I want to take care of the basic performance issue.)
start object=node(15837) match object-[:COMMENTS*]->comments return comments.author_id
7 seconds is waaaay too long.
Instead of using a linked list, I could simply link all the comments directly to the object node, but that could lead to a supernode that gets bogged down, and then finding the most recent comments, even with skip and limit, would be dog slow.
Will relationship indexes help here? I've never used them other than to ensure a unique relationship, or to see if a relationship exists, but can I use them in a cypher query to help speed things up?
If not, what else can I do to decrease the time it takes to return the IDs?
COMPARISON: Here is the Rails version using "phase II" methods of the Neography gem:
next_node_id=18233
@neo=Neography::Rest.new
start_node = Neography::Node.load(next_node_id, @neo)
all_nodes=start_node.outgoing(:COMMENTS).depth(10000)
raise all_nodes.size.to_i
Result: 526 nodes found in 290 ms.
Wes' solution took 5 ms. :-)
Relationship indexes will not help. I'd suggest using an unmanaged extension and the traversal API--it will be a lot faster than Cypher for this particular query on long lists. This example should get you close:
https://github.com/wfreeman/linkedlistlength
I based it on Mark Needham's example here:
http://www.markhneedham.com/blog/2014/07/20/neo4j-2-1-2-finding-where-i-am-in-a-linked-list/
If you're only doing this to return a count, the best solution is not to compute it on every query, since it doesn't change that often. Cache the result on the node in a total_comments property, and update that count every time a relationship is added or removed. If you want to know whether any of the current user's friends commented on it, so you can say "Joe and 700 others commented on this," you could run a second query:
start joe=node(15830) object=node(15838) match joe-[:FRIENDS]->friend-[:POSTED_COMMENT]->comment<-[:COMMENTS]-object RETURN friend LIMIT 1
You limit it to 1 since you only need the name of one friend who commented. If it returns someone, reduce the displayed comment count by 1 and include that user's name. You could do that with JS so it doesn't delay your page load. Sorry if my Cypher is a little off; I'm not used to pre-2.0 syntax.
First of all, I bet there is an answer to this question somewhere in the docs, but since the 'Manual: Labels and Indexes' link here gives me a 404 error, I'm going to ask anyway.
Is it possible to create an index on some label and specify it as an automatic one (just like legacy indexes I'm currently using, but for labels)?
If someone from the Neo4j team is reading this post, please let me know if I'm looking for the documentation in the right place, 'cause I can't find anything more or less informative on labels and indexes (except a couple of posts on Michael Hunger's blog and, maybe, some presentations, which is obviously not enough).
This is a more technical one: is it possible to find an item in the index by regex? Suppose I have a node with property 'n' -> '/a/b/c', and another node with 'n' -> '/a/*/c'. Can I somehow match them?
I don't work for Neo4j but I'll answer anyway.
All label indexing is automatic. Once you've created the index it maintains itself, possibly with minimal delay.
The manual for the last stable release can always be found here. The chapter on indexing for the embedded Java API is here.
You cannot use regexp with label indices yet. It's said to be on the agenda, along with index support for array lookups, i.e. what in Cypher would be
MATCH (a:MyLabel) WHERE a.value IN ['val1', 'val2']
I'm very interested in building a data visualisation component and can
see how it could be done but would prefer not to reinvent something
which already exists. If this truly is a 'first' then I'm prepared to put my initial code
on Github for others to share [and hopefully improve !!]
Essentially I'd like to be able to do the following:
1) Access a table or tables within a database and create nodes based
on entries within them. Add nodes on create, remove them on delete.
2) Use the foreign keys and/or join tables [for many-many links] to
create edges. Add edge(s) when node created, remove edges when node
deleted, check and add/remove edges when node updated.
3) Pass the nodes and edges to Gephi for display
I can see how to do steps 1 and 2 quickly and easily -- what I haven't
been able to find (after much searching) is how to do step 3.
Has anyone had any success doing this? Is there any example code you're willing to share?
Thanks
We tried something similar once, but it may not help you that much. We wrote a Rake task that got data out of our DB, which we then fed into Gephi manually. That wasn't really satisfactory, and in the end I went with Rake task -> CSV -> R script for visualization (basically connections of users on a world map). If you are not dead set on using Gephi, I could show you some of the R code :-)