Neo4J, SDN and running Cypher spatial queries

I am new to Neo4J and I am trying to build a proof of concept for High Availability spatial/temporal querying.
I have a setup with 2 standalone Neo4J Enterprise servers and a single Java application running with an embedded HA Neo4J server.
Everything was simple to set up, basic queries are straightforward and efficient, and the queries derived from the Neo4J SpatialRepository work as expected.
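For context, the working spatial lookup goes roughly through a repository like this (a simplified sketch; the entity and repository names are placeholders, and the wkt field matches the @Indexed declaration shown further down):

// Hypothetical entity/repository names, simplified for this question.
@NodeEntity
public class Thing {
    @GraphId
    Long id;

    @Indexed(indexType = IndexType.POINT, indexName = "location")
    String wkt;
}

public interface ThingRepository extends GraphRepository<Thing>, SpatialRepository<Thing> {
}

// usage: lat, lon, distance in km against the "location" layer
// EndResult<Thing> nearby = repository.findWithinDistance("location", 51.526256, 0.0, 100.0);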
What I am struggling to understand is how to use SDN to combine a spatial query with any other WHERE clauses. As a trivial example: how could I find all places that a user called X has visited within Y miles of a given lat/lon? Because SpatialRepository is not part of the regular Spring repository class tree, I do not believe there are any naming conventions I can use. Is the intention that I perform the spatial query and then filter the results?
I have traced the code through to a LegacyIndexSearcher (which has a name that scares me!) and cannot see any mechanism for extending the search. I have also had a look at the IndexProviderTest on GitHub which could provide a manual mechanism for performing the query against the index, except that I think there may be two indexes in play.
It might be helpful if I understood how to construct a Cypher query that I could use within an @Query annotation. Whilst I have been able to use the console to perform a simple REST query using:
:POST /db/data/ext/SpatialPlugin/graphdb/findGeometriesWithinDistance
{
  "layer": "location",
  "pointX": 0.0,
  "pointY": 51.526256,
  "distanceInKm": 100
}
This does not work:
start n=node:location('withinDistance:[51.526256,0.0,100.0]') return n;
The error is:
Index `location` does not exist
Neo.ClientError.Schema.NoSuchIndex
The index was (possibly naively) created using Spring:
@Indexed(indexType = IndexType.POINT, indexName = "location")
String wkt;
If I run index --indexes in the console I can see that there is no index named location, but that there is one named location__neo4j-spatial__LayerNodeIndex__internal__spatialNodeLookup__.
Am I required to create the index manually? If so, could someone point me in the direction of the documentation and I'll get on with it.
Assuming that it is just ignorance that has stopped me getting the simple Cypher query to run, is it as simple as adding a regular Cypher WHERE clause to combine spatial and property-based querying?
Added more index detail
Having run :GET /db/data/index/node/ from the console I could see two possibly useful indexes (other indexes removed):
{
  "location__neo4j-spatial__LayerNodeIndex__internal__spatialNodeLookup__": {
    "template": "/db/data/index/node/location__neo4j-spatial__LayerNodeIndex__internal__spatialNodeLookup__/{key}/{value}",
    "provider": "lucene",
    "type": "exact"
  },
  "GeoTemporalThing": {
    "template": "/db/data/index/node/GeoTemporalThing/{key}/{value}",
    "provider": "lucene",
    "type": "exact"
  }
}
So perhaps this is the correct format for the query I was trying:
start n=node:GeoTemporalThing('withinDistance:[51.526256,0.0,100.0]') return n;
But that gives me this error (which I am now Googling):
org.apache.lucene.queryParser.ParseException: Cannot parse 'withinDistance:[51.526256,0.0,100.0]': Encountered " "]" "] "" at line 1, column 35.
Was expecting one of:
"TO" ...
...
...
Update
Having decided that my index didn't exist (and that it should), I used the REST interface to create an index with the name that I expected SDN to create, like this:
:POST /db/data/index/node
{
  "name" : "location",
  "config" : {
    "provider" : "spatial",
    "geometry_type" : "point",
    "wkt" : "wkt"
  }
}
And now everything seems to work just fine. So my question is: should I have to create that index manually? If I look at the code in org.springframework.data.neo4j.support.index.IndexType, it looks as if it should use exactly the settings I used above, but it had only created the long-named Lucene index:
public enum IndexType
{
    @Deprecated
    SIMPLE { public Map<String, String> getConfig() { return LuceneIndexImplementation.EXACT_CONFIG; } },
    LABEL { public Map<String, String> getConfig() { return null; } public boolean isLabelBased() { return true; } },
    FULLTEXT { public Map<String, String> getConfig() { return LuceneIndexImplementation.FULLTEXT_CONFIG; } },
    POINT { public Map<String, String> getConfig() { return MapUtil.stringMap(
            IndexManager.PROVIDER, "spatial", "geometry_type", "point", "wkt", "wkt"); } }
    ;
    public abstract Map<String, String> getConfig();
    public boolean isLabelBased() { return false; }
}
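For reference, creating that index by hand from the embedded API would look something like this (a sketch; db is the embedded GraphDatabaseService, and the config mirrors IndexType.POINT above):

try (Transaction tx = db.beginTx()) {
    // same provider/config as the REST call that created the working index
    db.index().forNodes("location", MapUtil.stringMap(
            IndexManager.PROVIDER, "spatial",
            "geometry_type", "point",
            "wkt", "wkt"));
    tx.success();
}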
I did clear down the system and the behaviour was the same; is there a step I have missed?
Software details:
Java:
neo4j 2.0.1
neo4j-ha 2.0.1
neo4j-spatial 0.12-neo4j-2.0.1
spring-data-neo4j 3.0.0.RELEASE
Standalone Servers:
neo4j-enterprise-2.0.1
neo4j-spatial-0.12-neo4j-2.0.1-server-plugin

I'm not sure if this is a bug in Spring Data when setting up the index, but manually creating the index using the REST interface worked:
:POST /db/data/index/node
{
  "name" : "location",
  "config" : {
    "provider" : "spatial",
    "geometry_type" : "point",
    "wkt" : "wkt"
  }
}
I can now perform queries with minimal effort using Cypher in an @Query annotation (more parameters coming, obviously):
@Query(value = "START n=node:location('withinDistance:[51.526256,0.0,100.0]') MATCH user-[wa:WAS_HERE]-n WHERE wa.ts > {ts} RETURN user")
Page findByTimeAtLocation(@Param("ts") long ts);
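And if the point/distance need to be dynamic too, something like this should work, assuming the legacy START index lookup accepts its query string as a parameter (the method name and Thing entity are placeholders):

@Query("START n=node:location({0}) MATCH user-[wa:WAS_HERE]-n WHERE wa.ts > {1} RETURN user")
Page<Thing> findRecentlyAt(String withinDistanceQuery, long ts, Pageable page);

// e.g. repository.findRecentlyAt("withinDistance:[51.526256,0.0,100.0]", ts, new PageRequest(0, 20))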

Related

Can Cypher do phonetic text search with only a part of the text, without using Elasticsearch?

Say I have a job as financial administrator (j:Job {name: 'financial administrator'}).
Many people use different titles for a 'financial administrator'. Therefore, I want the above-mentioned job to come up as a hit even if people type only 'financial' or 'administrator', and even if their input has typos (like 'fynancial').
CONTAINS only gives results when the match is 100%, i.e. without typos.
Thanks a lot!
First, you could try fuzzy matching with a full text index and see if it solves the issue.
An example would be:
Set up the index:
CALL db.index.fulltext.createNodeIndex('jobs', ['Job'], ['name'], {})
Query the index with fuzzy matching (note the ~):
CALL db.index.fulltext.queryNodes('jobs', 'fynancial~')
If you want to go further and use Lucene's phonetic searches, then you could write a little Java code to register a custom analyzer.
Include the lucene-analyzers-phonetic dependency like so:
<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-phonetic</artifactId>
    <version>8.5.1</version>
</dependency>
Then create a custom analyzer:
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.phonetic.DoubleMetaphoneFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.neo4j.annotations.service.ServiceProvider;
import org.neo4j.graphdb.schema.AnalyzerProvider;

@ServiceProvider
public class PhoneticAnalyzer extends AnalyzerProvider {

    public PhoneticAnalyzer() {
        super("phonetic"); // the analyzer name used when creating the index
    }

    @Override
    public Analyzer createAnalyzer() {
        return new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName) {
                Tokenizer tokenizer = new StandardTokenizer();
                // encode tokens with Double Metaphone (max code length 6),
                // injecting the encoded tokens alongside the originals
                TokenStream stream = new DoubleMetaphoneFilter(tokenizer, 6, true);
                return new TokenStreamComponents(tokenizer, stream);
            }
        };
    }
}
I used the DoubleMetaphoneFilter but you can experiment with others.
Package it as a jar, and put it into Neo4j's plugin directory along with the Lucene phonetic jar and restart the server.
Then, create a full text index using this analyzer:
CALL db.index.fulltext.createNodeIndex('jobs', ['Job'], ['name'], {analyzer:'phonetic'})
Querying the index looks the same:
CALL db.index.fulltext.queryNodes('jobs', 'fynancial')
It took a while, but this is how I solved my question:
MATCH (a)-[:IS]->(hs)
UNWIND a.naam AS namelist
CALL apoc.text.phonetic(namelist) YIELD value
WITH value AS search_str, SPLIT('INPUT FROM DATABASE', ' ') AS input, a
CALL apoc.text.phonetic(input) YIELD value
WITH value AS match_str, search_str, a
WHERE search_str CONTAINS match_str OR search_str = match_str
RETURN DISTINCT a.naam, labels(a)

typeorm table name specified more than once

I have the following query:
const foundDeal: any = await dealRepository.findOne({
  where: { id: dealId },
  relations: [
    'negotiationPointsDeals',
    'chosenInventoryToSubtractQuantity',
    'chosenInventoryToSubtractQuantity.inventoryItemType',
    'chosenInventoryToSubtractQuantity.inventoryItemType.quality',
    'negotiationPointsDeals.negotiationPointsTemplate',
    'chosenInventoryToSubtractQuantity.addressOfOriginId',
    'chosenInventoryToSubtractQuantity.currentLocationAddress',
    'chosenInventoryToSubtractQuantity.labAttestationDocs',
    'chosenInventoryToSubtractQuantity.labAttestationDocs.storage',
    'chosenInventoryToSubtractQuantity.proveDocuments',
    'chosenInventoryToSubtractQuantity.proveDocuments.storage',
    'chosenInventoryToSubtractQuantity.inventoryItemSavedFields',
    'chosenInventoryToSubtractQuantity.inventoryItemSavedFields.proveDocuments',
    'chosenInventoryToSubtractQuantity.inventoryItemSavedFields.proveDocuments.storage',
    'sellerBroker', 'sellerBroker.users',
    'seller', 'seller.users',
    'buyerBroker', 'buyerBroker.users',
    'buyer', 'buyer.users',
    'order', 'order.inventory', 'order.inventory.inventoryItemType',
    'order.inventory.inventoryItemType.quality',
    'order.inventory.addressOfOriginId', 'order.inventory.currentLocationAddress',
    'order.inventory.inventoryItemSavedFields',
    'order.inventory.inventoryItemSavedFields.proveDocuments',
    'order.inventory.inventoryItemSavedFields.proveDocuments.storage',
    'order.inventory.labAttestationDocs', 'order.inventory.labAttestationDocs.storage',
    // 'postTradeProcessingDeal', 'postTradeProcessingDeal.postTradeProcessingStepsDeal',
    'order.inventory.proveDocuments',
    'order.inventory.proveDocuments.storage',
    'negotiationPointsDeals.negotiationPointsTemplate.negotiationPointsTemplateChoices',
    'postTradeProcessing',
  ],
});
The error is:
error: table name "Deal__chosenInventoryToSubtractQuantity_Deal__chosenInventoryTo" specified more than once.
But I can't see any duplicates in the query.
I ran into this issue when I switched to the snake case naming strategy.
I think somehow the aliases that TypeORM generates by default do not collide if you "re-join" to existing eagerly-loaded relations.
However, under the new naming strategy it threw an error if I tried to add in a relation that was already eagerly loaded.
The solution for me was to find and eliminate places where I was doing relations: ["foo"] in a query where foo was already eagerly loaded by the entity.
The issue is documented in this TypeORM issue.
After some digging, I realized this error is due to TypeORM generating an alias for eager-loaded relations that can exceed the Postgres limit for identifier names.
For example, if you are eager loading products with customer, TypeORM will create something along the lines of customer_products, connecting the two. If that name is longer than 63 bytes (the Postgres limit), the query will crash.
Basically, it happens when your variable names are too long or there's too much nesting. Make your entity names shorter and it will work. Otherwise, you could join the tables manually using queryBuilder and assign aliases for them.
It looks like you are using NestJS, TypeORM, and the SnakeNamingStrategy as well, so I'll show how I fixed this in my system. I use the SnakeNamingStrategy for TypeORM, which might be creating more issues. Instead of removing it, I extended it and overrode the function that generates eager-join aliases.
Here is my solution:
// short-snake-naming.strategy.ts
import { SnakeNamingStrategy } from "typeorm-naming-strategies";
import { NamingStrategyInterface } from "typeorm";

export class ShortSnakeNamingStrategy
  extends SnakeNamingStrategy
  implements NamingStrategyInterface
{
  eagerJoinRelationAlias(alias: string, propertyPath: string): string {
    return `${alias.replace(
      /[a-zA-Z]+(_[a-zA-Z]+)*/g,
      (w) => `${w[0]}_`
    )}_${propertyPath}`;
  }
}
// read-database.configuration.ts
import { TypeOrmModuleOptions, TypeOrmOptionsFactory } from "@nestjs/typeorm";
import { ShortSnakeNamingStrategy } from "./short-snake-naming.strategy";

export class ReadDatabaseConfiguration implements TypeOrmOptionsFactory {
  createTypeOrmOptions(): TypeOrmModuleOptions | Promise<TypeOrmModuleOptions> {
    return {
      name: "read",
      type: "postgres",
      ...
      namingStrategy: new ShortSnakeNamingStrategy(),
    };
  }
}
The ShortSnakeNamingStrategy class takes each eager-loaded relationship and shortens its name, from Product__change_log___auth_user___roles__permissions to P_____c____a___r__permissions.
So far this has generated no collisions and kept the aliases below the 63-character limit.

How to make node changes in a TransactionEventHandler that are returned within the same CREATE query

I am trying to implement a plugin for neo4j to add an autoincrement ID using GraphAware library. To this end, I've written the following classes:
public class ModuleBootstrapper implements RuntimeModuleBootstrapper
{
    @Override
    public RuntimeModule bootstrapModule(String moduleId, Map<String, String> config, GraphDatabaseService database)
    {
        return new MyModule(moduleId, config, database);
    }
}
And:
public class MyModule extends BaseTxDrivenModule<Void>
{
    int counter = 0;

    public MyModule(String moduleId, Map<String, String> config, GraphDatabaseService database) {
        super(moduleId); // matches the bootstrapper above
    }

    @Override
    public Void beforeCommit(ImprovedTransactionData transactionData)
            throws DeliberateTransactionRollbackException
    {
        if (transactionData.mutationsOccurred()) {
            for (Node newNode : transactionData.getAllCreatedNodes()) {
                newNode.setProperty("id", counter++);
            }
        }
        return null;
    }
}
And to test it I can execute:
CREATE (n);
And then:
MATCH (n) RETURN n;
And I can see the effect of my plugin as some id property added to the node. But when I run:
CREATE (n) RETURN n;
The returned node does not have the mentioned id property, but again, when I match the node in a separate query, I see that things have worked out just fine. It's just that in the CREATE query, the returned node info is from before my plugin modified it.
The questions are: why is that? Didn't I modify the nodes through my plugin within the transaction? Shouldn't the returned nodes show the modifications I've made to them? Is there any way I can make this happen?
While you're still within the transaction, the Cypher result is already computed and there is no clean way to add additional information to it.
I guess a feature request on the neo4j repository could be cool, but in all honesty this would require a serious change to the neo4j core codebase.
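The generated value is visible as soon as the transaction has committed, e.g. with the embedded API (a sketch; db is a GraphDatabaseService):

long nodeId;
try (Transaction tx = db.beginTx()) {
    Node n = db.createNode();
    nodeId = n.getId();
    // beforeCommit has not run yet, so n has no "id" property at this point
    tx.success();
}
try (Transaction tx = db.beginTx()) {
    // after the commit, the property set by the module is visible
    Object generatedId = db.getNodeById(nodeId).getProperty("id");
    tx.success();
}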
BTW, the incremental ID is already implemented in the graphaware-uuid plugin: https://github.com/graphaware/neo4j-uuid#specifying-the-generator-through-configuration

XText cross-referencing: How to establish references between locally unique IDs, using their context as a qualifier?

I have a problem with cross-referencing terminals that are only locally unique (in their block/scope), but not globally. I found tutorials that describe how to use fully qualified names or package declarations, but my case is syntactically a little different from the examples, and I cannot change the DSL to support something like explicit fully qualified names or package declarations.
In my DSL I have two types of structured JSON resources:
The instance that contains my data.
A meta model, containing type information etc. for my data.
I can easily parse those two, and get an EMF model with the following Java snippet:
new MyDSLStandaloneSetup().createInjectorAndDoEMFRegistration();
ResourceSet rs = new ResourceSetImpl();
rs.getResource(URI.createPlatformResourceURI("/Foo/meta.json", true), true);
Resource instanceResource= rs.getResource(URI.createPlatformResourceURI("/Bar/instance.json", true), true);
EObject eobject = instanceResource.getContents().get(0);
Simplified example:
meta.json
{
  "toplevel_1": {
    "sublevels": {
      "sublevel_1": {
        "type": "int"
      },
      "sublevel_2": {
        "type": "long"
      }
    }
  },
  "toplevel_2": {
    "sublevels": {
      "sublevel_1": {
        "type": "float"
      },
      "sublevel_2": {
        "type": "double"
      }
    }
  }
}
instance.json
{
  "toplevel_1": {
    "sublevel_1": "1",
    "sublevel_2": "2"
  },
  "toplevel_2": {
    "sublevel_1": "3",
    "sublevel_2": "4"
  }
}
From this I want to infer that:
toplevel_1:sublevel_1 has type int and value 1
toplevel_1:sublevel_2 has type long and value 2
toplevel_2:sublevel_1 has type float and value 3
toplevel_2:sublevel_2 has type double and value 4
I was able to cross-reference the unique toplevel elements and iterate over all sublevels until I found the ones I was looking for, but for my use case that is quite inefficient and complicated. Also, I don't get the generated editor to link between the sublevels this way.
I played around with linking and scoping, but I'm unsure as to what I really need, and whether I have to extend the provider classes AbstractDeclarativeScopeProvider and/or DefaultDeclarativeQualifiedNameProvider.
What's the best way to go?
See also:
Xtext cross reference using custom terminal rule
http://www.eclipse.org/Xtext/documentation.html#scoping
http://www.eclipse.org/Xtext/documentation.html#linking
After some trial and error I solved my problem with a ScopeProvider.
The main issue was that I didn't really understand what a scope is in Xtext-terms, and what I have to provide it to.
Looking at the signature from the documentation:
IScope scope_<RefDeclaringEClass>_<Reference>(<ContextType> ctx, EReference ref)
In my example language:
RefDeclaringEClass would refer to the Sublevel from instance.json,
Reference to the cross-reference to the Sublevel from meta.json, and
ContextType would match the RefDeclaringEClass.
Using the eContainer of ctx I can get the Toplevel from instance.json.
This Toplevel already has a cross-reference to the matching Toplevel from meta.json, which I can use to get the Sublevels from meta.json. This collection of Sublevels is basically the scope within which the current Sublevel should be unique.
To get the IScope I used Scopes#scopeFor(Iterable).
I didn't post any code here because the actual grammar is bigger/different, and therefore doesn't really help the explanation.
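Still, the shape of the provider is roughly this (a sketch with hypothetical names: Sublevel, Toplevel, getRef(), getSublevels()):

public class MyDslScopeProvider extends AbstractDeclarativeScopeProvider {

    IScope scope_Sublevel_ref(Sublevel ctx, EReference ref) {
        // the instance-side Toplevel containing the Sublevel being linked
        Toplevel instanceToplevel = (Toplevel) ctx.eContainer();
        // its already-resolved cross-reference to the meta-side Toplevel
        Toplevel metaToplevel = instanceToplevel.getRef();
        // only that Toplevel's Sublevels are valid candidates
        return Scopes.scopeFor(metaToplevel.getSublevels());
    }
}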

Grails "max" subquery with an association, to get only the latest of a hasMany

The simplified domain model:
'Txn' (as in Transaction) hasMany 'TxnStatus'. TxnStatus has a dateTime
This is a legacy mapping, so I can't change the DB. The mapping on Txn:
static mapping = {
    txnStatus column: 'MessageID', ignoreNotFound: true, fetch: 'join'
}
I need to get Txns based on a number of dynamically built criteria. I am currently using GORM's 'where' queries, and they work well, but I also need to get only the latest txnStatus.
Tried:
def query = Txn.where {
    txnStatus { dateTime == max(dateTime) }
}
gives: java.lang.ClassCastException: org.hibernate.criterion.DetachedCriteria cannot be cast to java.util.Date
also tried:
def query = Txn.where {
    txnStatus.dateTime == max(txnStatus.dateTime)
}
which gives:
Compilation Error: ...
Cannot use aggregate function max on expressions "txnStatus.dateTime"
At this stage I am thinking of changing to HQL... any help appreciated!
There was a question a couple of days ago very similar to this. It appears that using where queries with a 'max' subquery doesn't work well with ==
The OP was able to get it to work with < and worked around it that way. Looking at the docs on where queries has not helped me figure this one out.
Here is a really wild guess:
Txn.where {
    txnStatus {
        dateTime == property(dateTime).of { max(dateTime) }
    }
}
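And if the where DSL keeps fighting back, the HQL fallback mentioned in the question could look roughly like this (a sketch in plain Hibernate; entity and property names are assumed from the domain model above, with sessionFactory injected by Grails):

// correlated subquery: keep only the status row with the latest dateTime per Txn
List<Txn> txns = sessionFactory.getCurrentSession()
        .createQuery("select distinct t from Txn t join t.txnStatus s "
                + "where s.dateTime = (select max(s2.dateTime) "
                + "from TxnStatus s2 where s2 in elements(t.txnStatus))")
        .list();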
