Error using DseGraphFrame when querying a timestamp field - datastax-enterprise

I have a Person label with a created property defined:
schema.propertyKey("created").Timestamp().single().create()
I get the error below when trying to use DseGraphFrame in dse spark to filter Person vertices on the created property:
scala> g.V().hasLabel("Person").has("created",
P.gt("2018-10-07T14:46:26.790Z")).count().next()
org.apache.spark.sql.AnalysisException: cannot resolve '(created >
1538923586790L)' due to data type mismatch: differing types in
'(created > 1538923586790L)' (timestamp and bigint).;; 'Filter
((~label#270 = Person) && (created#280 > 1538923586790))…
Any idea why?

This was a defect in earlier DSE versions; it was resolved in DSE 5.1.8 and DSE 6.0.0.
See the DSE 5.1 release notes - https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/releaseNotes/RNdse.html - and look for DSP-15146.
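Depending on the version, it may also help to compare against a java.util.Date value instead of a raw string, so no string-to-timestamp coercion is involved. A minimal sketch, assuming the TinkerPop Java API and the traversal source g from the question:

import java.time.Instant;
import java.util.Date;
import org.apache.tinkerpop.gremlin.process.traversal.P;

// Hypothetical: build the predicate from a Date rather than an ISO-8601 string.
Date cutoff = Date.from(Instant.parse("2018-10-07T14:46:26.790Z"));
Long count = g.V().hasLabel("Person")
    .has("created", P.gt(cutoff))
    .count().next();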

Related

Failure in serializing optional date type field to Avro regardless of null or non-null value

We are using Avro 1.8.2 to serialize data with an optional date-type field, to be published to a Kafka topic.
record aRecord {
/** Variable: lastUpdate
* lastUpdate indicates the latest date and time the reference asset was updated
*/
union {null, date} lastUpdate = null;
/** Variable: businessDate
* businessDate indicates the business date of the reference asset price
*/
union {null, date} businessDate = null;
}
We ran into the following exception while using the Java class generated by the Avro tools to serialize the data:
Error serializing avro message
Caused by: org.apache.avro.AvroRuntimeException: Unknown datum type org.joda.time.LocalDate: 2021-09-17
at org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:772)
at org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:302)
at org.apache.avro.generic.GenericData.resolveUnion
Note that this happens regardless of whether the value is null or non-null (as shown, the non-null value 2021-09-17 also caused the exception).
We did the following investigation and experiments but could not figure out why:
Making the date field mandatory resolves the issue.
This is because DATE_CONVERSION is added to the corresponding field in the Java class generated by the Avro tools.
If the field is defined as optional with a default value of null, DATE_CONVERSION is not added to the generated Java file.
Using Avro 1.9.1 resolves the issue; unfortunately, we must use Avro 1.8.2.
We also tried a few other versions of kafka-avro-serializer and the Spring Boot Kafka framework. Nothing worked for us.
Other projects that depend on Avro 1.8.2 seem to be able to handle this. We checked all the places we considered relevant, and all the code is the same, except that somehow those projects have DATE_CONVERSION in place in the Java file generated by the Avro tools (although the fields are defined in the .avdl file exactly the same way).
Debugging into GenericData.java, we found that if DATE_CONVERSION is in place for the optional date field, getSchemaName is not called at all.
getSchemaName basically checks the type of the object: whether it's an Int, Record, String, etc.
The date is a joda LocalDate carrying a logical type; its underlying Avro type is int, as far as we understand.
So our questions are:
How can we make the Avro tools enable DATE_CONVERSION for an optional "date"-type field using Avro 1.8.2?
If DATE_CONVERSION is not the key to resolving the issue, what is the best practice for serializing a date-type field that can be null (the default) or non-null using Avro 1.8.2?
Thanks.
// Register the joda-time date conversion on the (global) SpecificData model;
// in Avro 1.8 the class is org.apache.avro.data.TimeConversions.DateConversion.
SpecificData specificData = SpecificData.get();
specificData.addLogicalTypeConversion(new DateConversion());
// Hand the model to the writer so the optional date field is converted on write.
DatumWriter<MessageClass> dw = new SpecificDatumWriter<>(message.getSchema(), specificData);
DataFileWriter<MessageClass> dfw = new DataFileWriter<>(dw);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
dfw.create(message.getSchema(), outputStream);
dfw.append(message);
dfw.close();
// Publish the serialized container bytes to Kafka (Callback is our own implementation).
ProducerRecord<String, byte[]> record = new ProducerRecord<>(topic, key, outputStream.toByteArray());
return kafkaProducer.send(record, new Callback());
The above code fixed the issue. MessageClass is the Java class generated by the Avro tools.
The message is written through a SpecificData model on which a new DateConversion() has been registered;
that DATE_CONVERSION is exactly what is needed for the optional date field during serialization.
Note that this solution is only needed as a workaround for Avro 1.8.
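For completeness, the same conversion has to be registered on the reading side as well, or the field comes back as a raw int. A minimal sketch under the same assumptions (MessageClass is the generated class from the question; the bytes variable stands for the payload produced above):

import java.io.ByteArrayInputStream;
import org.apache.avro.data.TimeConversions.DateConversion;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.io.DatumReader;
import org.apache.avro.specific.SpecificData;
import org.apache.avro.specific.SpecificDatumReader;

// Register the conversion on the reader's data model as well.
SpecificData readData = SpecificData.get();
readData.addLogicalTypeConversion(new DateConversion());
DatumReader<MessageClass> reader = new SpecificDatumReader<>(
    MessageClass.getClassSchema(), MessageClass.getClassSchema(), readData);
try (DataFileStream<MessageClass> stream =
        new DataFileStream<>(new ByteArrayInputStream(bytes), reader)) {
    while (stream.hasNext()) {
        MessageClass m = stream.next();
        // m.getLastUpdate() is materialized as a joda LocalDate again (or null).
    }
}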

gds.beta.pipeline.nodeClassification.train: Target property not found

Using Neo4j v4.4 and GDS 2.0. I'm trying to train a model. When I type:
CALL gds.beta.pipeline.nodeClassification.train('individual-graph', {
pipeline: 'pipe',
nodeLabels: ['PERSON'],
modelName: 'xmen-model-fastRP',
targetProperty: 'is_risky',
metrics: ['F1_WEIGHTED','ACCURACY'],
randomSeed: 2
}) YIELD modelInfo
RETURN
modelInfo.bestParameters AS winningModel,
modelInfo.metrics.F1_WEIGHTED.outerTrain AS trainGraphScore,
modelInfo.metrics.F1_WEIGHTED.test AS testGraphScore
I get the following error message:
Failed to invoke procedure gds.beta.pipeline.nodeClassification.train: Caused by: java.lang.IllegalArgumentException: Target property is_risky not found in graph with node properties: [[embedding]]
What am I doing wrong? Can you help please?
It means that the property 'is_risky' was not found on the PERSON nodes of the projected graph; the only existing node property is embedding.
Going through the example in the Neo4j documentation (https://neo4j.com/docs/graph-data-science/current/machine-learning/nodeclassification-pipelines/) will give you an idea of what the error means. Below is an example of an issue similar to the one you are getting:
Target property `my_class` not found in graph
with node properties: [[sizePerStory, class], [sizePerStory]]
As you can see, the error message gives you the list of node properties that are actually available in the projected graph.
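If is_risky does exist on the PERSON nodes in the database, the usual cause is that it was not included when the in-memory graph was projected. A rough sketch of a projection that carries the target property along, issued here through the Neo4j Java driver (connection details are placeholders; the fastRP embedding is typically added later by the pipeline's addNodeProperty step):

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;

try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
        AuthTokens.basic("neo4j", "secret"));
     Session session = driver.session()) {
    // Project PERSON nodes together with the is_risky target property,
    // so the training procedure can find it in the in-memory graph.
    session.run("CALL gds.graph.project('individual-graph', "
        + "{ PERSON: { properties: ['is_risky'] } }, '*')");
}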

Spatial querydsl - GeometryExpressions: find entities whose range contains a given "Point"

Used tools and their versions:
I am using:
spring boot 2.2.6
hibernate/hibernate-spatial 5.3.10 with dialect set to: org.hibernate.spatial.dialect.mysql.MySQL56SpatialDialect
querydsl-spatial 4.2.1
com.vividsolutions.jts 1.13
jscience 4.3.1
Problem description:
I have an entity that represents a medical clinic:
import com.vividsolutions.jts.geom.Polygon;

@Entity
public class Clinic {
    @Column(name = "range", columnDefinition = "Polygon")
    private Polygon range;
}
The range is a circle calculated earlier from the clinic's GPS location and a radius. It represents the operating area of the clinic: it treats only patients whose home address lies within that circle. Let's assume that the circle is correct.
My goal (question):
I have a GPS point with a patient location: 45.7602322 4.8444941. I would like to find all clinics that are able to treat that patient, that is, all clinics whose range field contains the point 45.7602322 4.8444941.
My solution (partially correct, I think)
To get it done, I have created a simple "Predicate"/"BooleanExpression":
GeometryExpressions.asGeometry(QClinic.clinic.range)
.contains(Wkt.fromWkt("Point(45.7602322 4.8444941)"))
and it actually works, because I can see the proper SQL query in the console:
select (...) where
ST_Contains(clinic0_.range, ?)=1 limit ?
first binding param: POINT(45.7602322 4.8444941)
But I have two problems with that:
QClinic.clinic.range is marked with a warning in IntelliJ: "Unchecked assignment: 'com.querydsl.spatial.jts.JTSPolygonPath' to 'com.querydsl.core.types.Expression<org.geolatte.geom.Geometry>'". Yes, in QClinic, range is a com.querydsl.spatial.jts.JTSPolygonPath.
Using the debugger and IntelliJ's "evaluate" on the line above (the one that creates the expression), I can see an error message: "unknown operation with operator CONTAINS and args [clinic.range, POINT(45.7602322 4.8444941)]"
You can ignore the second warning. The spatial-specific operations are simply not registered in the serializer used for the toString of the operation; they reside in their own module.
The first warning indicates that you're mixing up Geolatte and JTS expressions. From your mapping it seems you intend to use JTS. In that case you need to use com.querydsl.spatial.jts.JTSGeometryExpressions instead of com.querydsl.spatial.GeometryExpressions in order to get rid of that warning.
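Put together, the JTS variant of the predicate might look roughly like this; a sketch only, assuming the question's QClinic metamodel and querydsl-spatial's JTS module:

import com.querydsl.core.types.dsl.BooleanExpression;
import com.querydsl.spatial.jts.JTSGeometryExpressions;
import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.io.WKTReader;

// Parse the patient's location as a JTS geometry rather than a Geolatte one.
// WKTReader.read throws a checked ParseException.
Geometry patientLocation = new WKTReader().read("POINT(45.7602322 4.8444941)");
BooleanExpression inRange = JTSGeometryExpressions
    .asJTSGeometry(QClinic.clinic.range)
    .contains(patientLocation);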

How to pass Collection Parameters to Repository Queries for Neo4J

Using Spring Data for Neo4J I want to pass a collection as a parameter to a repository query:
@Query("MATCH (product:Product) WHERE ANY(c IN product.categories WHERE c IN {categories}) RETURN product")
Iterable<Product> findAllWithCategories(@Param("categories") List<String> categories);
On the command line the corresponding query runs successfully and delivers the right results:
MATCH (product:Product) WHERE ANY(c IN product.categories WHERE c IN ["Märklin","Fleischmann"]) RETURN product
But from within Java, no results are returned when the findAllWithCategories method is invoked with a list of categories. The strange thing is that the correct HTTP request appears to be sent to the DB:
request: {"statements":[{"statement":"MATCH (product:Product) WHERE ANY(c IN product.categories WHERE c IN {categories}) RETURN product","parameters":{"categories":["Märklin","Fleischmann"]},"resultDataContents":["graph"],"includeStats":false}]}
Any idea what goes wrong here? In general how can I pass collections as parameters to a repository query to Neo4J?
Edit
The same query run without the Spring Data repository, but with the lower-level Neo4jTemplate, gives the same result, which is really strange, as the query on the command line does what it should:
private final String FIND_PRODUCTS_WITH_CATEGORIES =
    "MATCH (product:Product) WHERE ANY(c IN product.categories WHERE c IN {categories}) RETURN product";

String[] categories = ...
// Bind the collection under the name used in the query.
Map<String, Object> map = new HashMap<>();
map.put("categories", categories);
products = neo4j.queryForObjects(Product.class, FIND_PRODUCTS_WITH_CATEGORIES, map);
I don't think there is anything wrong with the query statement, but rather with the list-typed parameter.
Edit
After half a day I tried the Bolt driver instead of the HTTP driver, and everything was okay (using version 2.0.6 of the driver; version 2.1.0 threw a strange exception).
The queries are all right, and handing over arrays or lists as query parameters works. The problem is the driver: no success with the HTTP driver, and the Bolt driver seems to be buggy in the newest version, 2.1.0. But with Bolt 2.0.6 I got it running.
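For reference, switching OGM from the HTTP driver to Bolt is a configuration change. A minimal Java sketch, assuming neo4j-ogm 2.x (the URI and credentials are placeholders; a property-file configuration works the same way):

import org.neo4j.ogm.config.Configuration;

// Point OGM at the Bolt driver instead of the default HTTP driver.
Configuration configuration = new Configuration();
configuration.driverConfiguration()
    .setDriverClassName("org.neo4j.ogm.drivers.bolt.driver.BoltDriver")
    .setURI("bolt://user:password@localhost:7687");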

Binary expression in return statement - Neo4jClient (neo4j)

I am trying to get the following Cypher query written out in GraphClient:
START agency=node(12345)
MATCH agency
-[:AGENCY_HAS_PEOPLE]-()
-[:AGENCY_HAS_PERSON]-person
,person-[?:PERSON_IS_CARER]-carer
,person-[?:PERSON_IS_CLIENT]-client
WHERE (person.UniqueId! = 18989)
RETURN person, carer is not null as IsCarer, client is not null as IsClient
The query works fine in the console and returns the results I expect:
person IsCarer IsClient
Node(1545421) True False
When I try to write that query using Neo4jClient, it throws the following exception.
Expression of type System.Linq.Expressions.LogicalBinaryExpression is not supported.
It is mainly due to the expression in the return statement:
.Start(...)
.Match(...)
.Where(...)
.Return((person, client, carer) => new
{
    Person = person.As<Person>(),
    IsClient = client != null,
    IsCarer = carer != null
});
Is anyone already working on a solution for this?
Is there a workaround for it?
Is there any other way to write this query to get the same result?
If I were to implement a solution for this, is there anything related to the internals of Neo4jClient (limitation, pitfalls) that I should know before I begin?
Thanks.
It's a bug in https://github.com/Readify/Neo4jClient/blob/master/Neo4jClient/Cypher/CypherReturnExpressionBuilder.cs
Fork the project
Clone it locally
Write a failing test that demonstrates the problem, in https://github.com/Readify/Neo4jClient/blob/master/Test/Cypher/CypherFluentQueryReturnTests.cs
Fix the issue
Commit and push your fix
Open a pull request
When I'm happy with the quality of the solution, I'll accept and merge the pull request. The fix will then be published to NuGet within a few minutes.
