Understanding Neo4j, creating unique nodes - neo4j

I'm trying to wrap my head around how Neo4j works and how I can apply it to my problem. I thought it should be really easy and a matter of minutes, but I'm stuck.
I have data in MongoDB, say User and Item. What I want is connecting User and Item in a graph with a LIKE relationship (maybe with a score). Later I want to do things like recommending items based on connections, basic stuff.
But how do I get the data into Neo4j? Every document in MongoDB has an unique _id, so I though I could just throw both _ids into Neo4j and have them connected. What I found so far is that it's not even possible to have unique nodes based on the _id field (Neo4j has numeric incremented ids), which is only possible with some "hack" (https://github.com/jexp/app-net-graph/blob/master/lib/appnet.rb#L11) or using MERGE (I'm stuck on < 2.0). Even their examples on the website add the same node again if executed multiple times. I think I have a fundamental misunderstanding of how to use Neo4j. Maybe I'm too spoiled by redis, where I can put strings in and and it just works. Redis' sets aren't feasible though for complex graphs, only for simple connections.
Maybe someone can help me with a simple cypher example of how to add two nodes foo and bar and have them connected with a LIKE connection. And the operation should be idempotent, no matter if none or all of the nodes/relationships already existed before execution.
I'm accessing Neo4j via REST, in particular using this node module https://github.com/thingdom/node-neo4j

You could define your external ID as extra property on your nodes. Then depending on if your are using SpringData or not, you can insert the data.
If you are using SpringData, you can configure your external ID as unique index and then normally save you nodes(consider though, that inserting a duplicated ID will overwrite the existing one).
If you are using the plain java API, you can create unique nodes as described here:
http://docs.neo4j.org/chunked/stable/tutorials-java-embedded-unique-nodes.html#tutorials-java-embedded-unique-get-or-create
EDIT:
As for a sample query, does this help you?
http://console.neo4j.org/?id=b0z486
With the java api you would do it like this
firstNode = graphDb.createNode();
firstNode.setProperty( "externalID", "1" );
firstNode.setProperty( "name", "foo" );
secondNode = graphDb.createNode();
secondNode.setProperty( "externalID", "2" );
secondNode.setProperty( "name", "bar" );
relationship = firstNode.createRelationshipTo( secondNode, RelTypes.Likes );
I suggest you read some tutorials here: http://docs.neo4j.org/chunked/stable/tutorials-java-embedded-hello-world.html

Given you are using Neo4J1.9, have you tried creating a unique index on your _ID column?
Try this article from the docs
If you were using Neo4j2, then this article is helpful

Related

How do I find a string in an unknown neo4j database using Cypher?

TL,DR: I need a query which gives me all nodes/relationships which contain a certain value (in my case a string, so much I know), without knowing which property(key) contains the string. I am using neo4j(latest version), meteor (latest version) and the meteor neo4j driver, which is recommended on the neo4j website for meteor.
Currently I am working (as part of my bachelor thesis) on a tool to visualize the output of any Cypher query on any database, regardless of the database contents.
So far I've managed to correctly display nodes/relationships which are coming out. My problem now is to visualize (get nodes and relationships to feed into my frontend) textual queries like (taken from the neo4j movie database, which I am using for development)
MATCH (tom:Person {name:"Tom Hanks"})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors)
RETURN coActors.name
This kind of queries only returns an array of strings and not whole nodes or relationships. I now need some way (preferably a Cypher query) to get all nodes which contain for example the string "Audrey Tatou".
The problem I've now run into is that I didn't find a way to write a query which doesn't need something like
MATCH n
WHERE Person.name = "some name"
Since I don't know anything about the contents of the database I cannot use
WHERE propertyName = "propertyValue"
since I only know the value but not the name of the property.
The only solution here will be to get every nodes with your label and check properties and values using reflection on client side.
Using cypher, the solution would be to get all properties and their values and parse their values using a foreach loop. Maybe you can do this, but I'm really not sure, it's a recent feature but you can still give a try.
Here is what I found for the cypher solution: How can I return all properties for a node using Cypher?
So, you have query that returns array of string.
In fact - you can receive almost anything as result. Cypher is capable to return just bare strings, that are not related to anything.
Long story short - you can't vizualize this data, because of this data nature. Best you can do is to represent them as table (or similar), like Neo4j browser do this.
But, there is (probably) solution for you. Neo4j has feature called Legacy indexing. And there you can find full text indexes. Maybe this can help you.
You can just use a driver that returns nodes and rels, or if you do the queries manually add resultDataContents entry
{statements:[{statement:"MATCH ..","resultDataContents",["graph"]}]}
to your payload and you get nodes and relationships back.

How to determine the Max property on a Relationship in Neo4j 2.2.3

How do you quickly get the maximum (or minimum) value for a property of all instances of a relationship? You can assume the machine I'm running this on is well within the recommended spec's for the cpu and memory size of graph and the heap size is set accordingly.
Facts:
Using Neo4j v2.2.3
Only have access to modify graph via Cypher query language which I'm hitting via PHP or in the web interfacxe--would love to avoid any solution that requires java coding.
I've got a relationship, call it likes that has a single property id that is an integer.
There's about 100 million of these relationships and growing
Every day I grab new likes from a MySQL table to add to the graph within in Neo4j
The relationship property id is actually the primary key (auto incrementing integer) from the raw MySQL table.
I only want to add new likes so before querying MySQL for the new entries I want to get the max id from the likes, so I can use it in my SQL query as SELECT * FROM likes_table WHERE id > max_neo4j_like_property_id
How can I accomplish getting the max id property from neo4j in a optimal way? Please indicate the create statement needed for any index as well as the query you'd used to get the final result.
I've tried creating an index as follows:
CREATE INDEX ON :likes(id);
After the index is online I've tried:
MATCH ()-[r:likes]-() RETURN r.i ORDER BY r.id DESC LIMIT 1
as well as:
MATCH ()-[r:likes]->() RETURN MAX(r.id)
They work but take freaking forever as the explain plan for both indicate no indexes being used.
UPDATE: Holy $?##$?!!!! It looks like the new schema indexes aren't functional for relationships even though you can create them and show them with :schema. It also looks as if there's no way with cypher directly to create Legacy Indexes which look like they might solve this issue.
If you need to query relationship properties, it is generally a sign of a model issue.
The need of this query reveals you that you would better extract these properties into a node, that you'll then be able to query faster.
I don't say it is 100% the case, but certainly 99% of the people seen so far with the same problem has been demonstrating this model concern.
What is your model right now ?
Also you don't use labels at all in your query, likes have a context bound to the nodes.

Is node reference equality in embedded neo4j guaranteed?

I am using an embedded graph database as part of a java application. Suppose that I carry out some type of cypher query, and return an ExecutionResult which contains a collection of nodes.
These nodes may be assumed to form a connected graph.
Each of these nodes has some relationships, which I can access using node.getRelationships(Direction.OUTGOING). My question is, if the target of one of these relationships already occurs in the Execution result (i.e. the relationship is part of the query template), is it guaranteed that Relationship.getEndPoint == Node X.
I suppose that what I am really asking is, when a transaction in Neo4j returns a node, does it return just the one object, and different queries will just keep returning references to that one object, or does it keep producing new objects which happen to refer to the same data point? Since Node doesn't override the equalsTo method, I have been assuming the former, but I was hoping someone could tell me.
Nodes are not reference-equals. You'll only get NodeProxy objects which are created on the fly in different operations.
But the equals()-method does id-equality so you should use that.
n1.equals(n2)
or if you keep the node id around use
n1.getId() == n2.getId()
See when you create a node neo4j internally assigns it a node-id. All the relationships you create will have reference to the start node id and end node id.
For checking do this
First create a node and save its node id by calling method node.getId()
Now create a relationship to it from another node. And call your relationship.getEndNode().getId() .
You will see the node-ids are same.
It sounds like your asking - does Neo 'out of the box' give concurrency control of database entities, like n-hibernate or entity framework does for SQL.
The answer is no! You will have to manage it yourself. If you do delelop it though, could make you a few bob

Setting node labels with a parameter

I'm trying to pile a load of Twitter data into Neo4J using the .Net Neo4JClient. It's essentially the same type of Twitter user data for each node, but some of the nodes have a different significance to others, hence I would like to label them differently.
(I'm brand new both to Neo4J and the client, too).
So I've been trying to label them like so:
var query = _client.Cypher
.Create("(primaryNode:nodeLabel {twitterUser})")
.WithParams(new { nodeLabel = "nodeType", twitterUser } );
query.ExecuteWithoutResults();
Note: I split out the ExecuteWithoutResults so I could debug the query, and it is registering the parameters OK. The documentation here:
https://github.com/Readify/Neo4jClient/wiki/cypher#explicit-parameters
... suggests that parameters can be created "at any point in the fluent query" - but the Neo documentation about parameters here:
http://docs.neo4j.org/chunked/1.8.2/cypher-parameters.html
... kind of suggests otherwise, that parameters are specifically for things like WHERE clauses, indexes and relationship Ids.
Anyway - when I execute the above, I get a shiny new node with the label "nodeLabel" - so the parameter ain't working. Could somebody clarify whether or not I'm just making a dumb newbie mistake?
You can call WithParams whenever you want in the query. That's what the Neo4jClient doco means about "at any point in the fluent query".
However, Neo4j only supports parameters in certain parts of the Cypher text. If the parameter would affect the query plan, it's not allowed.
In this case, you cannot use parameters for labels. You will need to actually construct the query dynamically if you want to do that.
Edit: Even if this was a supported place for parameters, you'd at least have to write {nodeLabel} in your Cypher instead of just nodeLabel.

Cypher query return related nodes as children

I am using the Neo4j .NET Client ExecuteGetCypherResults to run cypher. It expects everything to come back in a single column. I have simple class JobType which contains a list of JobSpecialties on it. In the database this is modeled as the Types having a relationship to the Specialties.
I need a cypher query that returns the results as such, in a single column. The related Specialties should be a child property of the Type node I would expect the query to look like this:
start s=node:node_auto_index(StartType='JobTypes')
match s-[:starts]->t, t-[:SubTypes]->ts
return {Id: t.Id, Name: t.Name, JobSpecialties: ts}
But this doesn't work. I can't figure out from the docs if this is even possible. If there is a better way to get the result back to the .Net client, I am open to suggestions.
start s=node:node_auto_index(StartType='JobTypes')
match s-[:SubTypes]->js
return s.Id, s.Name, js;
ExecuteGetCypherResults does support multiple columns, you just need to kick our deserializer into a different mode. This is an implementation detail generally hidden behind our higher level APIs, which is why this isn't obvious.
When you call new CypherQuery, pass CypherResultMode.Projection instead of CypherResultMode.Set.
I actually can't remember why we have this. Sometime, I'll need to dig through the lower levels and try and kill it. Pull requests welcomed. :)
As a preference though, we always prefer people to use the higher level APIs (but we recognise there are some limitations).
It sounds like the .Net client needs some updating for cypher. Cypher doesn't support building maps on the fly yet, although it is something that is in the feature request list already...
You can create an array with your results (but as of 1.9.M04, they need to be the same type to be merged into the array):
http://console.neo4j.org/r/xo7voi
I've actually submitted a pull request (through back channels, since it broke some unit tests) to fix that (so you can have multiple types in an array built on the fly), but I think there are some concerns whether merging of different types is a good idea.
https://github.com/wfreeman/neo4j/commit/ca457ace0df4732376833b8694e4affac4143244
Update: This will be fixed in 1.9.M05/1.9.GA. Now you can build an array with any type mixed:
http://console.neo4j.org/r/vm4f83

Resources