spring-data-neo4j query using dynamic key and dynamic value,
like following code:
public interface NodeReposity extends Neo4jRepository<Node,Long> {
#Query("MATCH(n:Node{{key}={value}})return n")
Iterable<Node> queryByProperty(#Param("key")String key,#Param("value") String value);
}
But it says the {key} must be something like variable in string, such as MATCH(n:Node{name={value}})return n.Can't be {key}. But My property's key is dynamic like the value, how to implement it and is it possible?
Short answer: The query will be send "as it is" to the database and because cypher does only support placeholders for values, this will cause an error.
Slightly longer answer: When it comes to executing the method Spring Data Neo4j will look if it has already pre-processed the query and either process and cache it or just load it from the cache. This is done to improve the time it takes to execute the method from the application.
Pre-processing means SDN knows what parameters are in there and just adds the values in the right place when the method is called.
If SDN would provide more features for the query than cypher, the query would have to be processed every time the method gets called to create a new query that can be used with Neo4j.
Related
I have a use case where in I have to pick up selected data from a spanner table and dump to BigQuery.
The catch here is that for the batch job the name of the table and the columns to select will only be known at runtime.
It seems that dataflow's SpannerIO doesn't accept the table and the columns at runtime. Please refer below code for better understanding:
p.apply(SpannerIO.read().withSpannerConfig(spannerConfig)
.withTable("tablename")
.withColumns(list or array of columns))
It only accepts string and not ValueProviders. How to make this work?
To access runtime values you need to use the ReadAll transform and build an instance of ReadOperation in the previous step.
See Reading data from all available tables from the examples.
Yes, the withColumn and withTable methods of SpannerIO.Read do not take in a ValueProvider by default.
Could you write code outside your function to get the table and column names and then pass them into withTable and withColumn as a list of strings at runtime?
If they can be passed in as commandline arguments, consider using PipelineOptions.
Here is a simple example. More docs on using dataflow connectors from Cloud Spanner can be found here.
I used read operation as suggested by Mairbek
p.apply(Create.ofProvider(options.getMyParamValueProvider(), StringUtf8Coder.of()))
.apply(MapElements.via(new SimpleFunction<String, ReadOperation>() {
#Override
public ReadOperation apply(String value) {
return ReadOperation.create()
.withTable("TableName").withColumns(value);
}
}))
.apply(SpannerIO.readAll().withSpannerConfig(spannerConfig));
I have the following code
PagedResultList res = myService.getPage(paginateParams, ...)
println res.size() // returns 2
println res.getTotalCount() // returns 1
getPage looks like:
def criteria = MyDomain.createCriteria()
criteria.list(max: paginateParams.max, offset: paginateParams.offset) { // max is 10, offset is 0, sortBy is updatedAt and sortOrder is desc
eq('org', org)
order(paginateParams.sortBy, paginateParams.sortOrder)
}
why do the two method return different values? The documentation doesn't explain the difference, but does mention that getTotalCount is for number of records
currently on grails 2.4.5
edits:
println on res prints out:
res: [
com.<hidden>.MyDomain: 41679f98-a7c5-4193-bba8-601725007c1a,
com.<hidden>.MyDomain: 41679f98-a7c5-4193-bba8-601725007c1a]
Yes, res has a SINGLE object twice - that's the bug I'm trying to fix. How do I know that? I have an primary key on MyDomain's ID, and when I inspect the database, it's also showing one record for this particular org (see my criteria)
edit 2: I found this comment (http://docs.grails.org/2.4.5/ref/Domain%20Classes/createCriteria.html)
listDistinct If subqueries or associations are used, one may end up
with the same row multiple times in the result set. In Hibernate one
would do a "CriteriaSpecification.DISTINCT_ROOT_ENTITY". In Grails one
can do it by just using this method.
Which, if I understand correctly, is their way of saying "list" method doesn't work in this scenario, use listDistinct instead but then they go on to warn:
The listDistinct() method does not work well with the pagination
options maxResult and firstResult. If you need distinct results with
pagination, we currently recommend that you use HQL. You can find out
more information from this blog post.
However, the blog post is a dead link.
Related: GORM createCriteria and list do not return the same results : what can I do?
Not related to actual problem after question edited but this quote seems useful
Generally PagedResultList .size() perform size() on resultList property (in-memory object represent database record), while .getTotalCount() do count query against database. If this two value didn't match your list may contain duplicate.
After viewing related issues (GORM createCriteria and list do not return the same results : what can I do?) I determined that there were several approaches:
Use grails projection groupBy('id') - doesn't work b/c i need the entire object
USe HSQL - Domain.executeQuery - actually this didn't work for my scenario very well because this returns a list, whereas criteria.list returns a PagedResultList from which I previously got totalCount. This solution had me learning HSQL and also made me break up my existing logic into two components - one that returned PagedResultList and one that didn't
Simply keep a set of IDs as I process my PagedResultList and make sure that I didn't have any duplicates.
I ended up going with option 3 because it was quick, didn't require me to learn a new language (HSQL) and I felt that I could easily write the code to do it and I'm not limited by the CPU to do such a unique ID check.
Hello I am trying to execute a query using breezejs 1.3.4 . My query is the following
function getContacts() {
var query = breeze.EntityQuery
.from("Contacts").where("Desc", "startsWith", "P");
return manager.executeQuery(query)
.then(getSucceeded).fail(getFailed);
}
"Desc" is a string property in my "Contacts" C# backend model. Tha problem is that the Query URL is formatted like this .../api/Application/Contacts?$filter=startswith(Desc%2Ctime'P')%20eq%20true
The word time is added before "P" and I get a this exception in the Response
{"$id":"1","$type":"System.Web.Http.HttpError, System.Web.Http","Message":"The query specified in the URI is not valid.","ExceptionMessage":"Unrecognized 'Edm.Time' literal 'time'P''
If in the comparison I use a lower case "p" then the Url is costructed as it should be like this
"$filter=startswith(Desc%2C'p')%20eq%20true` .
I don't have the same problem when using other UpperCase letters of the english alphabet.
Does anyone have an idea what am I missing, I can't figure out why the word "time" is added in that specific query?
Thank you.
We were able to reproduce the problem.
While the exception message is confusing, I believe you might be getting the error because you have not associated that the resource name 'Contacts' with an EntityType of presumably 'Contact'.
What is happening is that when Breeze tries to build the query it will usually use its metadata to correctly format the url query string to send to the server. The critical part of this process involves determining the EntityType associated with the resourceName given in the 'from' clause of the query ('Contacts' in your case). Breeze uses a resource name to entity type map to do this, but if it cannot find a mapping, it will still continue to build the url because we still need to support ad-hoc requests against endpoints for which we have no metadata.
You can update this map yourself via the MetadataStore.setEntityTypeForResourceName method. If you are using the Entity Framework, this map is initially populated based on the EntityType name/Entity Set name mapping that is part of your EDMX model. So in your case you can either modify your EDMX model or call the setEntityTypeForResourceName method directly.
Unfortunately, without the metadata Breeze has to infer the datatypes in the query. So in your case
"Desc", "startsWith", "P"
since Breeze can't determine that "Desc" is a 'string' property of your Contract EntityType it tries to infer the 'dataType' of "Desc" based on the value 'P'. Unfortunately, 'P' is a valid ISO8601 'duration' value ( a way to express a 'Time' datatype), so Breeze incorrectly tries to construct a query string with 'P' treated as a 'Time' constant. This is why your code works with a lower case 'p' ( not a valid duration value).
This inference logic can certainly be improved and we have a fix that will allow us to do exactly that coming out in the next release. However... the better and more general solution to this type of issue is to get the 'resourceName/entityType' mappings correct in the first place.
Hope this helps.
I have a legacy dataset (ENRON data represented as GraphML) that I would like to query. In an comment in a related question, #StefanArmbruster suggests that I use Cypher to query the database. My query use case is simple: given a message id (a property of the Message node), retrieve the node that has that id, and also retrieve the sender and recipient nodes of that message.
It seems that to do this in Cypher, I first have to create an index of the nodes. Is there a way to do this automatically when the data is loaded from the graphML file? (I had used Gremlin to load the data and create the database.)
I also have an external Lucene index of the data (I need it for other purposes). Does it make sense to have two indexes? I could, for example, index the Neo4J node ids into my external index, and then query the graph based on those ids. My concern is about the persistence of these ids. (By analogy, Lucene document ids should not be treated as persistent.)
So, should I:
Index the Neo4j graph internally to query on message ids using Cypher? (If so, what is the best way to do that: regenerate the database with some suitable incantation to get the index built? Build the index on the already-existing db?)
Store Neo4j node ids in my external Lucene index and retrieve nodes via these stored ids?
UPDATE
I have been trying to get auto-indexing to work with Gremlin and an embedded server, but with no luck. In the documentation it says
The underlying database is auto-indexed, see Section 14.12, “Automatic Indexing” so the script can return the imported node by index lookup.
But when I examine the graph after loading a new database, no indexes seem to exist.
The Neo4j documentation on auto indexing says that a bunch of configuration is required. In addition to setting node_auto_indexing = true, you have to configure it
To actually auto index something, you have to set which properties
should get indexed. You do this by listing the property keys to index
on. In the configuration file, use the node_keys_indexable and
relationship_keys_indexable configuration keys. When using embedded
mode, use the GraphDatabaseSettings.node_keys_indexable and
GraphDatabaseSettings.relationship_keys_indexable configuration keys.
In all cases, the value should be a comma separated list of property
keys to index on.
So is Gremlin supposed to set the GraphDatabaseSettings parameters? I tried passing in a map into the Neo4jGraph constructor like this:
Map<String,String> config = [
'node_auto_indexing':'true',
'node_keys_indexable': 'emailID'
]
Neo4jGraph g = new Neo4jGraph(graphDB, config);
g.loadGraphML("../databases/data.graphml");
but that had no apparent effect on index creation.
UPDATE 2
Rather than configuring the database through Gremlin, I used the examples given in the Neo4j documentation so that my database creation was like this (in Groovy):
protected Neo4jGraph getGraph(String graphDBname, String databaseName) {
boolean populateDB = !new File(graphDBName).exists();
if(populateDB)
println "creating database";
else
println "opening database";
GraphDatabaseService graphDB = new GraphDatabaseFactory().
newEmbeddedDatabaseBuilder( graphDBName ).
setConfig( GraphDatabaseSettings.node_keys_indexable, "emailID" ).
setConfig( GraphDatabaseSettings.node_auto_indexing, "true" ).
setConfig( GraphDatabaseSettings.dump_configuration, "true").
newGraphDatabase();
Neo4jGraph g = new Neo4jGraph(graphDB);
if (populateDB) {
println "Populating graph"
g.loadGraphML(databaseName);
}
return g;
}
and my retrieval was done like this:
ReadableIndex<Node> autoNodeIndex = graph.rawGraph.index()
.getNodeAutoIndexer()
.getAutoIndex();
def node = autoNodeIndex.get( "emailID", "<2614099.1075839927264.JavaMail.evans#thyme>" ).getSingle();
And this seemed to work. Note, however, that the getIndices() call on the Neo4jGraph object still returned an empty list. So the upshot is that I can exercise the Neo4j API correctly, but the Gremlin wrapper seems to be unable to reflect the indexing state. The expression g.idx('node_auto_index') (documented in Gremlin Methods) returns null.
the auto indexes are created lazily. That is - when you have enabled the auto-indexing, the actual index is first created when you index your first property. Make sure you are inserting data before checking the existence of the index, otherwise it might not show up.
For some auto-indexing code (using programmatic configuration), see e.g. https://github.com/neo4j-contrib/rabbithole/blob/master/src/test/java/org/neo4j/community/console/IndexTest.java (this is working with Neo4j 1.8
/peter
Have you tried the automatic index feature? It's basically the use case you're looking for--unfortunately it needs to be enabled before you import the data. (Otherwise you have to remove/add the properties to reindex them.)
http://docs.neo4j.org/chunked/milestone/auto-indexing.html
When I use criteria queries, the result contains array list of lazy initialized objects. that is, the list has values with handler org.codehaus.groovy.grails.orm.hibernate.proxy.GroovyAwareJavassistLazyInitializer.
This prevent me from doing any array operation (minus, remove etc) in it. When I use, GORM methods, I get array list of actual object types. How can I get the actual objects in criteria query?
The code is listed below.
availableTypes = Type.withCriteria() {
'in'("roleFrom", from)
'in'("roleTo", to)
}
availableTypes (an array list) has one value , but not actual object but value with a handler of GroovyAwareJavassistLazyInitializer
availableTypes (an array list) has values with type Type
availableTypes = Type.findByRoleFrom(from)
---------- Update ----------
I did further troubleshooting, and this is what I found. Probably the above description might be misleading, but I kept it in case it helps.
When using findAllBy for the first time, I get proxy objects rather than the actual instance. Then, I invoke the method through an ajax call, the actual instance is loaded (anything to do with cache loading??). When I refresh the page, it again loads the proxy
def typeFrom = Type.findAllByParty(partyFrom)
there is another use of findAllBy in the same method, which always returns actual instances.
def relFrom = Relation.findAllByParty(partyFrom)
When compared the two classes, the attribute 'party' of class Roles is part of a 1-m relation. like
class Role {
RoleType roleType
LocalDate validFrom
LocalDate validTo
static belongsTo = [party : Party ]
...
}
I know if I do statement like Party.findAll(), the role instances would be proxy till they access. But, when using gorm directly on the class (Role), why I am getting the proxy objects ???
thanks for the help.
thanks.
Turns out are a couple of possible solutions which I came across but didn't try, such as
Overloading the equals method so that the proxy and the domain
object use a primary key instead of the hashCode for equality
Using a join query so that you get actual instances back and not proxies
GrailsHibernateUtil.unwrapProxy(o)
HibernateProxyHelper.getClassWithoutInitializingProxy(object)
One solution that worked for me was to specify lazy loading to be false in the domain object mapping.
History of this problem seems to be discussed here: GRAILS-4614
See also: eager load