Mongo Template : Modifying Match Operation dynamically - mongotemplate

I have defined my match operation in mongo template as below.
MatchOperation match = Aggregation.match(new Criteria("workflow_stage_current_assignee").ne(null)
.andOperator(new Criteria("CreatedDate").gte(new Date(fromDate.getTimeInMillis()))
.andOperator(new Criteria("CreatedDate").lte(new Date(toDate.getTimeInMillis())))));
Everything is fine until this. However I can not modify this match operation using the reference match I have created. I was looking for List kind of functionality where in I could add multiple criteria clauses as and when they are needed to an already created reference. Something on the lines match.add(new Criteria)
However MatchOperation currently does not support any methods which would provide this functionality. Any help in this regard would be appreciated.

Criteria is where you add new criteria, which is backed by list.
Use static Criteria where(String key) method to create a initialize criteria object.
Something like
Criteria criteria = where("key1").is("value1");
Add more criteria's
criteria.and("key2").is("value2");
Create implicit $and criteria and chain to existing criteria chain.
criteria.and(where("key3).gt(value3).lte(value4))
When you are done, just pass it to match operation.
MatchOperation match = Aggregation.match(criteria);

Related

PagedResultList .size() and .getTotalCount() return different values in grails gorm

I have the following code
PagedResultList res = myService.getPage(paginateParams, ...)
println res.size() // returns 2
println res.getTotalCount() // returns 1
getPage looks like:
def criteria = MyDomain.createCriteria()
criteria.list(max: paginateParams.max, offset: paginateParams.offset) { // max is 10, offset is 0, sortBy is updatedAt and sortOrder is desc
eq('org', org)
order(paginateParams.sortBy, paginateParams.sortOrder)
}
why do the two method return different values? The documentation doesn't explain the difference, but does mention that getTotalCount is for number of records
currently on grails 2.4.5
edits:
println on res prints out:
res: [
com.<hidden>.MyDomain: 41679f98-a7c5-4193-bba8-601725007c1a,
com.<hidden>.MyDomain: 41679f98-a7c5-4193-bba8-601725007c1a]
Yes, res has a SINGLE object twice - that's the bug I'm trying to fix. How do I know that? I have an primary key on MyDomain's ID, and when I inspect the database, it's also showing one record for this particular org (see my criteria)
edit 2: I found this comment (http://docs.grails.org/2.4.5/ref/Domain%20Classes/createCriteria.html)
listDistinct If subqueries or associations are used, one may end up
with the same row multiple times in the result set. In Hibernate one
would do a "CriteriaSpecification.DISTINCT_ROOT_ENTITY". In Grails one
can do it by just using this method.
Which, if I understand correctly, is their way of saying "list" method doesn't work in this scenario, use listDistinct instead but then they go on to warn:
The listDistinct() method does not work well with the pagination
options maxResult and firstResult. If you need distinct results with
pagination, we currently recommend that you use HQL. You can find out
more information from this blog post.
However, the blog post is a dead link.
Related: GORM createCriteria and list do not return the same results : what can I do?
Not related to actual problem after question edited but this quote seems useful
Generally PagedResultList .size() perform size() on resultList property (in-memory object represent database record), while .getTotalCount() do count query against database. If this two value didn't match your list may contain duplicate.
After viewing related issues (GORM createCriteria and list do not return the same results : what can I do?) I determined that there were several approaches:
Use grails projection groupBy('id') - doesn't work b/c i need the entire object
USe HSQL - Domain.executeQuery - actually this didn't work for my scenario very well because this returns a list, whereas criteria.list returns a PagedResultList from which I previously got totalCount. This solution had me learning HSQL and also made me break up my existing logic into two components - one that returned PagedResultList and one that didn't
Simply keep a set of IDs as I process my PagedResultList and make sure that I didn't have any duplicates.
I ended up going with option 3 because it was quick, didn't require me to learn a new language (HSQL) and I felt that I could easily write the code to do it and I'm not limited by the CPU to do such a unique ID check.

Grails: query or criteria against a string/value pairs map property

Grails gives the possibility of creating simple string/value map properties section "Maps of Objects", first paragraph.
I was wondering, is there a way to later query the domain class (using Gorm dynamic finders, criterias or HQL) using the map property as part of the query (i.e adding a condition for the key X to have the value Y)?
After playing with it a bit and almost give up, I discovered the map syntax to (surprisingly) work in HQL. Assuming the class looks like:
class SomeClass {
Map pairKeyProperty
}
You can build queries that look like the following:
select * from SomeClass sc where sc.pairKeyProperty['someKey'] = 'someValue' and sc.pairKeyProperty['someOtherKey'] = 'someOtherValue'
Pretty neat! I still would prefer to use criterias as they are much cleaner to compose, but they seem to not support the same syntax (or I couldn't find it).
I created a sample app in GitHub:
https://github.com/deigote/grails-simple-map-of-string-value-pairs
It can be visisted at:
http://grails-map-of-string-pairs.herokuapp.com/
The form above uses a cross join. To enforce an inner join use
join sc.pairKeyProperty pk1 on index(pk1) = 'someKey'
where 'someValue' in elements(pk1)

Neo4j indexes and legacy data

I have a legacy dataset (ENRON data represented as GraphML) that I would like to query. In an comment in a related question, #StefanArmbruster suggests that I use Cypher to query the database. My query use case is simple: given a message id (a property of the Message node), retrieve the node that has that id, and also retrieve the sender and recipient nodes of that message.
It seems that to do this in Cypher, I first have to create an index of the nodes. Is there a way to do this automatically when the data is loaded from the graphML file? (I had used Gremlin to load the data and create the database.)
I also have an external Lucene index of the data (I need it for other purposes). Does it make sense to have two indexes? I could, for example, index the Neo4J node ids into my external index, and then query the graph based on those ids. My concern is about the persistence of these ids. (By analogy, Lucene document ids should not be treated as persistent.)
So, should I:
Index the Neo4j graph internally to query on message ids using Cypher? (If so, what is the best way to do that: regenerate the database with some suitable incantation to get the index built? Build the index on the already-existing db?)
Store Neo4j node ids in my external Lucene index and retrieve nodes via these stored ids?
UPDATE
I have been trying to get auto-indexing to work with Gremlin and an embedded server, but with no luck. In the documentation it says
The underlying database is auto-indexed, see Section 14.12, “Automatic Indexing” so the script can return the imported node by index lookup.
But when I examine the graph after loading a new database, no indexes seem to exist.
The Neo4j documentation on auto indexing says that a bunch of configuration is required. In addition to setting node_auto_indexing = true, you have to configure it
To actually auto index something, you have to set which properties
should get indexed. You do this by listing the property keys to index
on. In the configuration file, use the node_keys_indexable and
relationship_keys_indexable configuration keys. When using embedded
mode, use the GraphDatabaseSettings.node_keys_indexable and
GraphDatabaseSettings.relationship_keys_indexable configuration keys.
In all cases, the value should be a comma separated list of property
keys to index on.
So is Gremlin supposed to set the GraphDatabaseSettings parameters? I tried passing in a map into the Neo4jGraph constructor like this:
Map<String,String> config = [
'node_auto_indexing':'true',
'node_keys_indexable': 'emailID'
]
Neo4jGraph g = new Neo4jGraph(graphDB, config);
g.loadGraphML("../databases/data.graphml");
but that had no apparent effect on index creation.
UPDATE 2
Rather than configuring the database through Gremlin, I used the examples given in the Neo4j documentation so that my database creation was like this (in Groovy):
protected Neo4jGraph getGraph(String graphDBname, String databaseName) {
boolean populateDB = !new File(graphDBName).exists();
if(populateDB)
println "creating database";
else
println "opening database";
GraphDatabaseService graphDB = new GraphDatabaseFactory().
newEmbeddedDatabaseBuilder( graphDBName ).
setConfig( GraphDatabaseSettings.node_keys_indexable, "emailID" ).
setConfig( GraphDatabaseSettings.node_auto_indexing, "true" ).
setConfig( GraphDatabaseSettings.dump_configuration, "true").
newGraphDatabase();
Neo4jGraph g = new Neo4jGraph(graphDB);
if (populateDB) {
println "Populating graph"
g.loadGraphML(databaseName);
}
return g;
}
and my retrieval was done like this:
ReadableIndex<Node> autoNodeIndex = graph.rawGraph.index()
.getNodeAutoIndexer()
.getAutoIndex();
def node = autoNodeIndex.get( "emailID", "<2614099.1075839927264.JavaMail.evans#thyme>" ).getSingle();
And this seemed to work. Note, however, that the getIndices() call on the Neo4jGraph object still returned an empty list. So the upshot is that I can exercise the Neo4j API correctly, but the Gremlin wrapper seems to be unable to reflect the indexing state. The expression g.idx('node_auto_index') (documented in Gremlin Methods) returns null.
the auto indexes are created lazily. That is - when you have enabled the auto-indexing, the actual index is first created when you index your first property. Make sure you are inserting data before checking the existence of the index, otherwise it might not show up.
For some auto-indexing code (using programmatic configuration), see e.g. https://github.com/neo4j-contrib/rabbithole/blob/master/src/test/java/org/neo4j/community/console/IndexTest.java (this is working with Neo4j 1.8
/peter
Have you tried the automatic index feature? It's basically the use case you're looking for--unfortunately it needs to be enabled before you import the data. (Otherwise you have to remove/add the properties to reindex them.)
http://docs.neo4j.org/chunked/milestone/auto-indexing.html

Jena UpdateFactory

I was wondering if it was possible to create a SPARQL UpdateRequest in Jena by using ARQ Op objects. I would be interested to create programmatically updates like this:
DELETE {?s :predicate <http://example.org#old> }
INSERT {?s :predicate <http://example.org#toAdd>}
WHERE {?s :predicate <http://example.org#old> }
by creating the patterns in the DELETE, INSERT, and WHERE clauses from the ARQ API.
So far the only ways I have found to create SPARQL Update requests require to parse a SPARQL string or to create a com.hp.hpl.jena.update.Update object (which uses QuadAcc objects for which I couldn't find examples of use.
My fear is that the management of SPARQL UPDATE requests and the one of SPARQL SELECT queries are separated and that ARQ cannot be used to 'assemble' queries on the fly.
Thanks in advance
This question also burned me. I wanted to compose an UpdateRequest from ElementGroup objects and ElementTriplesBlock objects. This are the two main classes used to construct a Query. For example:
ElementGroup queryPattern = ...
ElementTriplesBlock constructTriples = ...
Query query = new Query();
query.setQueryConstructType();
// set CONSTRUCT clause
query.setConstructTemplate(new Template(constructTriples.getPattern()));
// set WHERE clause
query.setQueryPattern(queryPattern);
I tried the Jena mailing-list and received this answer:
The Update API is designed to deal with streaming arbitrarily large
unbounded INSERT and DELETE data hence the use of QuadAcc rather than
an Element for the INSERT/DELETE portion of the update.
Eventually I implemented this using a ParametrizedSparqlString:
ElementGroup queryPattern = ...
ElementTriplesBlock deleteTriples = ...
ElementTriplesBlock insertTriples = ...
ParameterizedSparqlString qstring = new ParameterizedSparqlString();
// Set DELETE clause
qstring.append("DELETE {");
qstring.append(deleteTriples.toString());
qstring.append("}");
// Set INSERT clause
qstring.append("INSERT {");
qstring.append(insertTriples.toString());
qstring.append("}");
// Set WHERE clause
qstring.append("WHERE {");
qstring.append(queryPattern.toString());
qstring.append("}");
// Construct an update query
UpdateRequest request = qstring.asUpdate();
I haven't tried this myself, but it looks like creating Update objects and assembling them into an UpdateRequest is indeed the way to go.
After a short look, QuadAcc doesn't seem particularly difficult, just use addTriple() with triples that contain variables.
The UpdateModify subclass of Update looks particularly interesting, it corresponds to the DELETE … INSERT … WHERE pattern in your example. Unfortunately the WHERE clause is initialised with an Element (syntactic representation of a query part) rather than an Op (algebraic representation).

Code re-use with Linq-to-Sql - Creating 'generic' look-up tables

I'm working on an application at the moment in ASP.NET MVC which has a number of look-up tables, all of the form
LookUp {
Id
Text
}
As you can see, this just maps the Id to a textual value. These are used for things such as Colours. I now have a number of these, currently 6 and probably soon to be more.
I'm trying to put together an API that can be used via AJAX to allow the user to add/list/remove values from these lookup tables, so for example I could have something like:
http://example.com/Attributes/Colours/[List/Add/Delete]
My current problem is that clearly, regardless of which lookup table I'm using, everything else happens exactly the same. So really there should be no repetition of code whatsoever.
I currently have a custom route which points to an 'AttributeController', which figures out the attribute/look-up table in question based upon the URL (ie http://example.com/Attributes/Colours/List would want the 'Colours' table). I pass the attribute (Colours - a string) and the operation (List/Add/Delete), as well as any other parameters required (say "Red" if I want to add red to the list) back to my repository where the actual work is performed.
Things start getting messy here, as at the moment I've resorted to doing a switch/case on the attribute string, which can then grab the Linq-to-Sql entity corresponding to the particular lookup table. I find this pretty dirty though as I find myself having to write the same operations on each of the look-up entities, ugh!
What I'd really like to do is have some sort of mapping, which I could simply pass in the attribute name and get out some form of generic lookup object, which I could perform the desired operations on without having to care about type.
Is there some way to do this to my Linq-To-Sql entities? I've tried making them implement a basic interface (IAttribute), which simply specifies the Id/Text properties, however doing things like this fails:
System.Data.Linq.Table<IAttribute> table = GetAttribute("Colours");
As I cannot convert System.Data.Linq.Table<Colour> to System.Data.Linq.Table<IAttribute>.
Is there a way to make these look-up tables 'generic'?
Apologies that this is a bit of a brain-dump. There's surely imformation missing here, so just let me know if you'd like any further details. Cheers!
You have 2 options.
Use Expression Trees to dynamically create your lambda expression
Use Dynamic LINQ as detailed on Scott Gu's blog
I've looked at both options and have successfully implemented Expression Trees as my preferred approach.
Here's an example function that i created: (NOT TESTED)
private static bool ValueExists<T>(String Value) where T : class
{
ParameterExpression pe = Expression.Parameter(typeof(T), "p");
Expression value = Expression.Equal(Expression.Property(pe, "ColumnName"), Expression.Constant(Value));
Expression<Func<T, bool>> predicate = Expression.Lambda<Func<T, bool>>(value, pe);
return MyDataContext.GetTable<T>().Where(predicate).Count() > 0;
}
Instead of using a switch statement, you can use a lookup dictionary. This is psuedocode-ish, but this is one way to get your table in question. You'll have to manually maintain the dictionary, but it should be much easier than a switch.
It looks like the DataContext.GetTable() method could be the answer to your problem. You can get a table if you know the type of the linq entity that you want to operate upon.
Dictionary<string, Type> lookupDict = new Dictionary<string, Type>
{
"Colour", typeof(MatchingLinqEntity)
...
}
Type entityType = lookupDict[AttributeFromRouteValue];
YourDataContext db = new YourDataContext();
var entityTable = db.GetTable(entityType);
var entity = entityTable.Single(x => x.Id == IdFromRouteValue);
// or whatever operations you need
db.SubmitChanges()
The Suteki Shop project has some very slick work in it. You could look into their implementation of IRepository<T> and IRepositoryResolver for a generic repository pattern. This really works well with an IoC container, but you could create them manually with reflection if the performance is acceptable. I'd use this route if you have or can add an IoC container to the project. You need to make sure your IoC container supports open generics if you go this route, but I'm pretty sure all the major players do.

Resources