How can I map a query result to domain objects? - neo4j

I'm developing a program that uses the neo4j-ogm library directly (i.e. I don't use any Spring components), and my Neo4j DB has these sorts of relationships:
PARAMETER<-[:HAS_PARAMETER]-TASK-[:HAS_STEP]->STEP-[:HAS_PARAMETER]->PARAMETER
STEP-[:HAS_STEP]->STEP
PARAMETER-[:INITIALIZES]->PARAMETER
I coded all my domain classes (PARAMETER, TASK and STEP).
I wrote a query like this (via a session.query method call):
MATCH (:TASK)-[r*]->() return r
Can I directly map the result from my query to domain objects?
EDIT:
To be more precise, I've got this Task class:
@NodeEntity
class Task {
    @Relationship(type = "HAS_STEP")
    Set<Step> steps;

    @Relationship(type = "HAS_PARAMETER")
    Set<Parameter> parameters;
}
I would like to fill a Task instance (with its steps and parameters) and have each step filled too.

You can use session.query() to return an org.neo4j.ogm.model.Result that contains the results. This is supported only as of Neo4j OGM 2.0.1.
Returning a path is not supported, so you must return the nodes and relationships that comprise the path, e.g.
MATCH p=(t:TASK)-[r*]->() return t, nodes(p), rels(p)
Then, you can access t from the Result and it will be a hydrated Task. Alternatively, you can access the nodes from the Result for all hydrated entities in the path.
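For illustration, here is a minimal sketch of reading the hydrated entities out of the Result (assuming an open org.neo4j.ogm.session.Session; the variable names are just examples):
// uses java.util.Collections, java.util.Map and org.neo4j.ogm.model.Result
Result result = session.query(
        "MATCH p=(t:TASK)-[r*]->() RETURN t, nodes(p), rels(p)",
        Collections.emptyMap());
for (Map<String, Object> row : result) {
    // "t" is the alias from the RETURN clause; it comes back as a hydrated Task
    Task task = (Task) row.get("t");
    // task.steps and task.parameters reachable via the path should be populated here
}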
More examples are in the test here: https://github.com/neo4j/neo4j-ogm/blob/2.0/core/src/test/java/org/neo4j/ogm/persistence/session/capability/QueryCapabilityTest.java
BTW the blog post that Christophe references is still valid for the OGM functionality as well if you need to understand what can be mapped.

Yes you can, there is a complete blog post about it:
http://graphaware.com/neo4j/2016/04/06/mapping-query-entities-sdn.html

Related

How to create queries using runtime managed labels using Neo4j-OGM?

Simple question: I'm using Neo4j-OGM (with Quarkus) to interact with my Neo4j DB (latest version).
I have an entity "Contact", and I added the @Labels annotation to be able to manage extra labels at runtime.
@NodeEntity
public class Contact {
    @Id
    @GeneratedValue(strategy = UuidStrategy.class)
    private String identifier;

    // some properties and relations...

    @Labels
    private List<String> labels;
}
This will work fine.
But now, I would like to query my DB using the loadAll methods with Filters instead of writing a Cypher query myself.
Unfortunately, I cannot see how I could get any equivalent of the following Cypher query:
MATCH (n:`Contact`:`Label_added_in_labels`) RETURN n
Is it supported? Or will I have to write the Cypher myself? (That's fine, but I don't want to write it if it's not needed.)
The Filters in Neo4j-OGM are property-based and sadly cannot help you with this.
But you could use the Neo4j Cypher-DSL if you do not want to write your own statements.
For this, add the following dependency to your project:
<dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-cypher-dsl</artifactId>
    <version>2021.3.0</version> <!-- currently the latest version -->
</dependency>
and use it, for example, like this in combination with a Neo4j-OGM Session:
Node node = Cypher.node("Contact", "Label_added_in_labels").named("n");
Statement statement = Cypher.match(node).returning(node).build();
Iterable<Contact> contacts = session.query(Contact.class, statement.getCypher(), Collections.emptyMap());
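For the query above, statement.getCypher() should render something close to MATCH (n:`Contact`:`Label_added_in_labels`) RETURN n, i.e. the same statement you would otherwise have written by hand.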

spring-data-neo4j query using dynamic key and dynamic value

spring-data-neo4j query using dynamic key and dynamic value,
like the following code:
public interface NodeReposity extends Neo4jRepository<Node, Long> {
    @Query("MATCH(n:Node{{key}={value}})return n")
    Iterable<Node> queryByProperty(@Param("key") String key, @Param("value") String value);
}
But it says that {key} must be something like a variable in the string, such as MATCH(n:Node{name={value}})return n; it can't be {key}. But my property's key is dynamic, just like the value. How can I implement this, and is it possible?
Short answer: the query will be sent "as is" to the database, and because Cypher only supports placeholders for values, this will cause an error.
Slightly longer answer: when the method is executed, Spring Data Neo4j checks whether it has already pre-processed the query, and either processes and caches it or simply loads it from the cache. This is done to reduce the time it takes to execute the method from the application.
Pre-processing means SDN knows which parameters are in the query and just inserts the values in the right places when the method is called.
If SDN offered more query features than Cypher itself does, the query would have to be re-processed on every call to build a new statement that Neo4j can execute.
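If you really need a dynamic property key, one possible workaround (just a sketch, assuming you can drop down to the Neo4j-OGM Session underneath SDN and that the key string is trusted or sanitized; the method name is made up) is to build the Cypher statement yourself and keep only the value as a parameter:
// only the value remains a real parameter; the key is concatenated (and backtick-escaped) into the statement
public Iterable<Node> findByDynamicProperty(String key, String value) {
    String cypher = "MATCH (n:Node) WHERE n.`" + key.replace("`", "``") + "` = $value RETURN n";
    return session.query(Node.class, cypher, Collections.singletonMap("value", value));
}
(Depending on your Neo4j version, the placeholder syntax may be {value} instead of $value.)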

Cypher 1.9.9, START by both relationship and node index

My Neo4j 1.9.9 entities are stored using Spring Data Neo4j. However, because many of the queries derived from repository methods are wrong, I've been forced to use Cypher directly.
Basically, I have two classes:
@NodeEntity
public class RecommenderMashup {
    @Indexed(indexType = IndexType.SIMPLE, indexName = "recommenderMashupIds")
    private String mashupId;
}

@RelationshipEntity(type = "MASHUP_TO_MASHUP_SIMILARITY")
public class MashupToMashupSimilarity {
    @StartNode
    private RecommenderMashup mashupFrom;

    @EndNode
    private RecommenderMashup mashupTo;
}
In addition to the indexes declared explicitly, as you know, Spring Data Neo4j adds two other indexes: __types__ for nodes and __rel_types__ for relationships; both of them have className as their key.
So, I've tried the query below to get all the MashupToMashupSimilarity objects related to a specific node:
START `mashupFrom`=node:`recommenderMashupIds`(`mashupId`='5367575248633856'),
`mashupTo`=node:__types__(className="package.RecommenderMashup"),
`mashupToMashupSimilarity`=rel:__rel_types__(className="package.MashupToMashupSimilarity")
MATCH `mashupFrom`-[:`mashupToMashupSimilarity`]->`mashupTo`
RETURN `mashupToMashupSimilarity`;
However, I always got empty results. I suspect that this is due to the fact that the START clause contains both nodes and relationships. Is this possible? Otherwise, what could be the problem here?
Additional info
The suspicion comes from the fact that
START `mashupToMashupSimilarity`=rel:__rel_types__(className='package.MashupToMashupSimilarity')
RETURN `mashupToMashupSimilarity`;
and
START `mashup`=node:__types__(className="package.RecommenderMashup")
RETURN `mashup`;
and other similar queries always return the right results.
The only working alternative at this point is
START `mashupFrom`=node:`recommenderMashupIds`(`mashupId`='6006582764634112'),
`mashupTo`=node:__types__(className="package.RecommenderMashup")
MATCH `mashupFrom`-[`similarity`:MASHUP_TO_MASHUP_SIMILARITY]->`mashupTo`
RETURN `similarity`;
but I don't know how it performs (the indexes should be faster). Also, I'm curious about what I've been doing wrong.
Did you try running your queries in the neo4j-browser or the shell? Did they work there?
This query is also wrong:
START `mashupFrom`=node:`recommenderMashupIds`(`mashupId`='5367575248633856'),
`mashupTo`=node:__types__(className="package.RecommenderMashup"),
`mashupToMashupSimilarity`=rel:__rel_types__(className="package.MashupToMashupSimilarity")
MATCH `mashupFrom`-[:`mashupToMashupSimilarity`]->`mashupTo`
RETURN `mashupToMashupSimilarity`;
you use mashupToMashupSimilarity as the identifier for the relationship,
but then you wrongly use it as the relationship type:
-[:mashupToMashupSimilarity]->
It should be: -[mashupToMashupSimilarity]->
But of course it is better to skip the rel-index check and use -[similarity:MASHUP_TO_MASHUP_SIMILARITY]-> instead.
You can simply leave off the relationship-index lookup, which doesn't make sense here at all, as you should already be filtering by the relationship type.
Update: Don't use index lookups for type check
START mashupFrom=node:recommenderMashupIds(mashupId='5367575248633856')
MATCH (mashupFrom)-[mashupToMashupSimilarity:MASHUP_TO_MASHUP_SIMILARITY]->(mashupTo)
WHERE mashupTo.__type__ = 'package.RecommenderMashup'
RETURN mashupToMashupSimilarity;
As the relationship type is already restricting the match, I think you don't even need the type check on the target node.

Why does spring-data require START in a Cypher query?

I have a User type in my Neo4j database with a 'registered' property that stores the timestamp (Long) of when the user joined the site. I want to find out how many users registered before a given date. I defined a query method on the Spring Data graph repository interface:
#Query("MATCH user=node:User WHERE user.registered < {0} RETURN count(*)")
def countUsersBefore(registered: java.lang.Long): java.lang.Long
I see a lot of queries in the Neo4j manual that just start with MATCH, but Spring Data doesn't seem to like that and requires a START clause. In my case I don't have an obvious node to start from, since my query doesn't follow any relationships; it's just a plain count-with-WHERE combination.
How can I fix this query? Do I need an index on the 'registered' property?
If you want to use this syntax, you have to use Spring Data Neo4j 3.0-M01, which works with Neo4j 2.0.0-M06.
You also need that version to be able to use labels.
But it is better to wait for the next milestone of SDN 3.0, which will work with Neo4j 2.0.0 final.
Update:
If you use the SDN types index:
START user=node:__types__(className="org.example.User")
WHERE user.registered < {0}
RETURN count(*)
or, in a repository, this derived finder method should work:
public interface UserRepository extends GraphRepository<User> {
    int countByRegisteredLessThan(int value);
}
Instead of MATCH user=node:User..., you want MATCH (user:User)...
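Putting both answers together: once you are on an SDN/Neo4j combination that understands labels, the counting method could look roughly like this (a sketch only, not verified against a specific milestone; the method name is made up):
public interface UserRepository extends GraphRepository<User> {
    // label-based match; {0} is the first positional parameter (the registration timestamp)
    @Query("MATCH (user:User) WHERE user.registered < {0} RETURN count(*)")
    Long countUsersRegisteredBefore(Long registered);
}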

Neo4j indexes and legacy data

I have a legacy dataset (ENRON data represented as GraphML) that I would like to query. In a comment on a related question, @StefanArmbruster suggests that I use Cypher to query the database. My query use case is simple: given a message id (a property of the Message node), retrieve the node that has that id, and also retrieve the sender and recipient nodes of that message.
It seems that to do this in Cypher, I first have to create an index on the nodes. Is there a way to do this automatically when the data is loaded from the GraphML file? (I had used Gremlin to load the data and create the database.)
I also have an external Lucene index of the data (I need it for other purposes). Does it make sense to have two indexes? I could, for example, index the Neo4J node ids into my external index, and then query the graph based on those ids. My concern is about the persistence of these ids. (By analogy, Lucene document ids should not be treated as persistent.)
So, should I:
Index the Neo4j graph internally to query on message ids using Cypher? (If so, what is the best way to do that: regenerate the database with some suitable incantation to get the index built? Build the index on the already-existing db?)
Store Neo4j node ids in my external Lucene index and retrieve nodes via these stored ids?
UPDATE
I have been trying to get auto-indexing to work with Gremlin and an embedded server, but with no luck. In the documentation it says
The underlying database is auto-indexed, see Section 14.12, “Automatic Indexing” so the script can return the imported node by index lookup.
But when I examine the graph after loading a new database, no indexes seem to exist.
The Neo4j documentation on auto indexing says that a bunch of configuration is required. In addition to setting node_auto_indexing = true, you have to configure it
To actually auto index something, you have to set which properties
should get indexed. You do this by listing the property keys to index
on. In the configuration file, use the node_keys_indexable and
relationship_keys_indexable configuration keys. When using embedded
mode, use the GraphDatabaseSettings.node_keys_indexable and
GraphDatabaseSettings.relationship_keys_indexable configuration keys.
In all cases, the value should be a comma separated list of property
keys to index on.
So is Gremlin supposed to set the GraphDatabaseSettings parameters? I tried passing in a map into the Neo4jGraph constructor like this:
Map<String,String> config = [
    'node_auto_indexing': 'true',
    'node_keys_indexable': 'emailID'
]
Neo4jGraph g = new Neo4jGraph(graphDB, config);
g.loadGraphML("../databases/data.graphml");
but that had no apparent effect on index creation.
UPDATE 2
Rather than configuring the database through Gremlin, I used the examples given in the Neo4j documentation so that my database creation was like this (in Groovy):
protected Neo4jGraph getGraph(String graphDBName, String databaseName) {
    boolean populateDB = !new File(graphDBName).exists();
    if (populateDB)
        println "creating database";
    else
        println "opening database";
    GraphDatabaseService graphDB = new GraphDatabaseFactory().
        newEmbeddedDatabaseBuilder( graphDBName ).
        setConfig( GraphDatabaseSettings.node_keys_indexable, "emailID" ).
        setConfig( GraphDatabaseSettings.node_auto_indexing, "true" ).
        setConfig( GraphDatabaseSettings.dump_configuration, "true" ).
        newGraphDatabase();
    Neo4jGraph g = new Neo4jGraph(graphDB);
    if (populateDB) {
        println "Populating graph"
        g.loadGraphML(databaseName);
    }
    return g;
}
and my retrieval was done like this:
ReadableIndex<Node> autoNodeIndex = graph.rawGraph.index()
    .getNodeAutoIndexer()
    .getAutoIndex();
def node = autoNodeIndex.get( "emailID", "<2614099.1075839927264.JavaMail.evans@thyme>" ).getSingle();
And this seemed to work. Note, however, that the getIndices() call on the Neo4jGraph object still returned an empty list. So the upshot is that I can exercise the Neo4j API correctly, but the Gremlin wrapper seems to be unable to reflect the indexing state. The expression g.idx('node_auto_index') (documented in Gremlin Methods) returns null.
The auto indexes are created lazily. That is, once you have enabled auto-indexing, the actual index is first created when you index your first property. Make sure you are inserting data before checking for the existence of the index; otherwise it might not show up.
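For example, a minimal sketch (assuming the embedded Neo4j 1.x setup from the question; the property value is made up) that writes one node before checking for the index:
// uses org.neo4j.graphdb.Transaction and org.neo4j.graphdb.Node
Transaction tx = graphDB.beginTx();
try {
    Node n = graphDB.createNode();
    n.setProperty("emailID", "example-id");   // the first auto-indexed property triggers creation of the index
    tx.success();
} finally {
    tx.finish();
}
// only now should the auto index be visible
boolean indexExists = graphDB.index().existsForNodes("node_auto_index");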
For some auto-indexing code (using programmatic configuration), see e.g. https://github.com/neo4j-contrib/rabbithole/blob/master/src/test/java/org/neo4j/community/console/IndexTest.java (this works with Neo4j 1.8).
/peter
Have you tried the automatic indexing feature? It's basically the use case you're looking for. Unfortunately, it needs to be enabled before you import the data. (Otherwise you have to remove and re-add the properties to reindex them.)
http://docs.neo4j.org/chunked/milestone/auto-indexing.html
