Building APIs to access Neo4j data - neo4j

I have a huge Neo4j database that I created using the batch import tool. Now I want to expose certain parts of the data to my users via APIs that run a query in the backend. My requirements are pretty general:
1. Latency should be minimal.
2. It should support about 10-20 QPS.
Can someone recommend what I should use for this, and point me to any documentation on how to go about it? I see several Ruby/Rails and REST API examples; however, they are specific to exposing the data as is, without any complex queries in the backend. I am not sure how to translate that into the specific APIs that I want. Any help would be appreciated.
Thanks.

I wrote a simple Flask API example that interfaces with Neo4j for a simple demo (backend for a messaging iOS app).
You might find it a helpful reference: https://github.com/johnymontana/messages-api
There are also a few resources online for using Flask with Neo4j:
http://nicolewhite.github.io/neo4j-flask/
http://neo4j.com/blog/building-python-web-application-using-flask-neo4j/
https://github.com/nicolewhite/neo4j-flask
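To make the "API that runs a query in the backend" idea concrete, here is a minimal, framework-agnostic sketch of one endpoint backed by a parameterized Cypher query. The Person/WORKS_FOR/Company model, the endpoint path, and the names employers_endpoint, QueryRunner and fake_runner are all assumptions for illustration; a real app would pass in a runner backed by a Neo4j driver session instead of the stand-in.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Assumed data model for illustration: (:Person)-[:WORKS_FOR]->(:Company).
# The name is passed as a query parameter rather than concatenated into the
# string, which lets Neo4j reuse the query plan and avoids Cypher injection.
EMPLOYERS_QUERY = (
    "MATCH (p:Person {name: $name})-[:WORKS_FOR]->(c:Company) "
    "RETURN c.name AS company"
)

# Hypothetical seam: any callable that runs Cypher and returns rows as dicts.
QueryRunner = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]


def employers_endpoint(run_query: QueryRunner, name: str) -> Tuple[int, str]:
    """Handle GET /person/<name>/employers; return (HTTP status, JSON body)."""
    rows = run_query(EMPLOYERS_QUERY, {"name": name})
    if not rows:
        return 404, json.dumps({"error": "no employers found for " + name})
    body = {"name": name, "employers": [row["company"] for row in rows]}
    return 200, json.dumps(body)


def fake_runner(query: str, params: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Stand-in for a real driver session, so the sketch runs without Neo4j."""
    data = {"Michal": [{"company": "GraphAware"}]}
    return data.get(params["name"], [])


if __name__ == "__main__":
    print(employers_endpoint(fake_runner, "Michal"))
```

In a Flask app, a route function would call employers_endpoint with a runner that executes the query against Neo4j and would turn the returned status and body into the HTTP response. Keeping the Cypher in one constant per endpoint also makes it easy to profile and tune each query separately.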

Check out the GraphAware Framework. You can build the APIs directly on top of Neo4j (same JVM) but you have to use Cypher, Java, or Scala.
I'd start with Cypher, because you can write it very quickly, then optimise for performance, and finally, if all else fails and your latency is still too high, convert to Java.
You can expose subgraphs (or even partially hydrated nodes and relationships, i.e. only certain properties) very easily. Check out the stuff in the api package. Example code:
You'd write a controller to return a person's graph, but only include nodes' names (not ages or anything else):
@RestController
public class ApiExample {
    private final GraphDatabaseService database;
    @Autowired
    public ApiExample(GraphDatabaseService database) {
        this.database = database;
    }
    @RequestMapping(path = "person/{name}")
    public JsonGraph getPersonGraph(@PathVariable(value = "name") String name) {
        JsonGraph<?> result = new JsonGraph() {
            @Override
            protected JsonGraph self() {
                return this;
            }
        };
        try (Transaction tx = database.beginTx()) {
            Node person = database.findNode(label("Person"), "name", name);
            if (person == null) {
                throw new NotFoundException(); //eventually translate to 404
            }
            result.addNode(person, IncludeOnlyNameNodeTransformer.INSTANCE);
            for (Relationship worksFor : person.getRelationships(withName("WORKS_FOR"), Direction.OUTGOING)) {
                result.addRelationship(worksFor);
                result.addNode(worksFor.getEndNode(), IncludeOnlyNameNodeTransformer.INSTANCE);
            }
            tx.success();
        }
        return result;
    }
    private static final class IncludeOnlyNameNodeTransformer implements NodeTransformer<LongIdJsonNode> {
        private static final IncludeOnlyNameNodeTransformer INSTANCE = new IncludeOnlyNameNodeTransformer();
        private IncludeOnlyNameNodeTransformer() {
        }
        @Override
        public LongIdJsonNode transform(Node node) {
            return new LongIdJsonNode(node, new String[]{"name"});
        }
    }
}
Running this test
public class ApiExampleTest extends GraphAwareApiTest {
    @Override
    protected void populateDatabase(GraphDatabaseService database) {
        database.execute("CREATE INDEX ON :Person(name)");
        database.execute("CREATE (:Person {name:'Michal', age:32})-[:WORKS_FOR {since:2013}]->(:Company {name:'GraphAware', est:2013})");
    }
    @Test
    public void testExample() {
        System.out.println(httpClient.get(baseUrl() + "/person/Michal/", 200));
    }
}
would return the following JSON
{
  "nodes": [
    {
      "properties": {
        "name": "GraphAware"
      },
      "labels": [
        "Company"
      ],
      "id": 1
    },
    {
      "properties": {
        "name": "Michal"
      },
      "labels": [
        "Person"
      ],
      "id": 0
    }
  ],
  "relationships": [
    {
      "properties": {
        "since": 2013
      },
      "type": "WORKS_FOR",
      "id": 0,
      "startNodeId": 0,
      "endNodeId": 1
    }
  ]
}

Obviously you can roll your own using frameworks like Rails / Sinatra. If you want a standard for the way that your API is formatted I quite like the JSON API standard:
http://jsonapi.org/
Here is an episode of The Changelog podcast talking about it:
https://changelog.com/189/
There's also a gem for creating resource objects which determine what is exposed and what is not:
https://github.com/cerebris/jsonapi-resources
I tried it out a bit with the neo4j gem and it works at a basic level, though once you start getting into includes there seem to be some dependencies on ActiveRecord. I'd love to see issues like that worked out, though.
You might also check out the GraphQL standard which was created by Facebook:
https://github.com/facebook/graphql
There's a Ruby gem for it:
https://github.com/rmosolgo/graphql-ruby
And, of course, another episode of The Changelog ;)
http://5by5.tv/changelog/149
Various other API resources for Ruby:
https://github.com/webmachine/webmachine-ruby
https://github.com/ruby-grape/grape

Use grest.
You can simply define your primary model(s) and its relation(s) (as secondary) and build an API with minimal coding and as quickly as possible!


Modeling sub-collections in MongoDB Realm Sync

I'm new to MongoDB as well as to MongoDB Realm Sync. I was following the Realm Sync tutorial and the Realm data model docs, but I wanted to learn more, so I tweaked the Atlas collection structure as follows.
Projects > Tasks // i.e. tasks is a sub-collection in each project.
What I don't know is how to come up with a Realm Sync schema that supports Atlas sub-collections.
The best I came up with is a schema where Tasks are modelled as an array within the Project. But I'm worried that this can hit the 16MB document limit (although that is a lot!) for projects with a lot of tasks.
{
  "bsonType": "object",
  "properties": {
    "_id": {
      "bsonType": "objectId"
    },
    "_partition": {
      "bsonType": "string"
    },
    "name": {
      "bsonType": "string"
    },
    "tasks": {
      "bsonType": "array",
      "items": {
        "bsonType": "object",
        "title": "Task",
        "properties": {
          "name": {
            "bsonType": "string"
          },
          "status": {
            "bsonType": "string"
          }
        }
      }
    }
  },
  "required": [
    "_id",
    "_partition",
    "name"
  ],
  "title": "Project"
}
I'm looking forward to learning how to model sub-collections the right way.
Edit
Here are my client-side Realm models.
import Foundation
import RealmSwift
class Project: Object {
    @objc dynamic var _id: String = ObjectId.generate().stringValue
    @objc dynamic var _partition: String = "" // user.id
    @objc dynamic var name: String = ""
    var tasks = RealmSwift.List<Task>()
    override static func primaryKey() -> String? {
        return "_id"
    }
}
class Task: EmbeddedObject {
    @objc dynamic var name: String = ""
    @objc dynamic var status: String = "Pending"
}
As far as the CRUD operations are concerned, I only create a new project and read existing projects as follows.
// Read projects
realm.objects(Project.self).forEach { (project) in
    // Access fields
}
// Create a new project
try! realm.write {
    realm.add(project)
}
Your code looks great and you're heading in the right direction, so this answer is more explanation and suggestions on modeling than hard code.
First, Realm objects are lazily loaded, which means they are only loaded when used. Tens of thousands of objects will have very little impact on a device's memory. So suppose you have 10,000 users and you 'load them all in':
let myTenThousandUsers = realm.objects(UserClass.self)
Meh, no big deal. However, doing this
let someFilteredUsers = myTenThousandUsers.filter { $0.blah == "blah" }
will (could) create a problem: if that returns 10,000 users, they are all loaded into memory, possibly overwhelming the device. That's a Swift function, and 'converting' Realm's lazy data using Swift should generally be avoided (use case dependent).
The same applies to this code from the question, which uses Swift's .forEach:
realm.objects(Project.self).forEach { (project) in
    // Access fields
}
It could cause issues depending on what's being done with those project objects; using them as a tableView dataSource could be trouble if there are a lot of them.
The second thing is the question about the 16MB limit per document. For clarity, an Atlas document is this
{
    field1: value1,
    field2: value2,
    field3: value3,
    ...
    fieldN: valueN
}
where a value can be any of the BSON data types, such as other documents, arrays, and arrays of documents.
In your structure, tasks is declared as var tasks = RealmSwift.List<Task>(), where Task is an embedded object. While conceptually embedded objects are objects, I believe they count toward the single-document limit because they are embedded (correct me if I am wrong); as the number of them grows, the size of the enclosing document grows. Keep in mind that 16MB of text is an ENORMOUS amount of text, so that could equate to hundreds of thousands of small tasks per project.
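As a rough check on how many embedded tasks fit under the limit, the sketch below estimates the BSON size of one embedded task by hand from the BSON layout (type byte, key, length-prefixed string). It is written in Python only to do the arithmetic; the sample task values and the six-digit array index are assumptions, and the other fields of the project document are ignored.

```python
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's per-document limit


def string_element_size(key: str, value: str) -> int:
    """BSON string element: type byte, key + NUL, int32 length, bytes + NUL."""
    return 1 + len(key.encode("utf-8")) + 1 + 4 + len(value.encode("utf-8")) + 1


def embedded_task_size(name: str, status: str, index: int) -> int:
    """Size of one {name, status} task stored as an item of the tasks array."""
    # Embedded document: int32 total length, its elements, trailing NUL.
    document = (
        4
        + string_element_size("name", name)
        + string_element_size("status", status)
        + 1
    )
    # Array items are keyed by their decimal index ("0", "1", ...).
    return 1 + len(str(index)) + 1 + document


if __name__ == "__main__":
    # Assumed sample task; index 100000 stands for a typical six-digit position.
    per_task = embedded_task_size("Write report", "Pending", 100000)
    print(per_task, MAX_BSON_DOCUMENT_BYTES // per_task)
```

With a 12-character name and the "Pending" status, each task costs a few dozen bytes, so a single project document has room for on the order of 300,000 tasks of this size. Longer names or extra fields per task lower that number proportionally.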
The simple solution is to not embed them and have them stand on their own.
class Task: Object {
    @objc dynamic var _id: String = ObjectId.generate().stringValue
    @objc dynamic var _partition: String = ""
    @objc dynamic var name: String = ""
    @objc dynamic var status: String = "Pending"
    override static func primaryKey() -> String? {
        return "_id"
    }
}
Then each task can be up to 16MB, and an 'unlimited number' can be associated with a single project. One advantage of embedded objects is a type of cascade delete: when the parent object is deleted, the child objects are deleted as well. With a 1-many relationship from Project to Tasks, though, deleting a bunch of tasks belonging to a parent is still easy.
Oh, another case for not using embedded objects, especially for this use case, is that they cannot have indexed properties. Indexing can greatly speed up some queries.

Can't count the occurrences of the entity with a field of particular value inside a nested property using Spring Data ElasticSearch Repository

I have the Article entity and inside it there is a nested property, let's say Metadata.
I need to count all articles, which have a particular field inside this nested property, let's say indexed, assigned to e.g. 1.
Java Document Snippet:
@Document(indexName = "article", type = "article", useServerConfiguration = true, createIndex = false)
@Setting(settingPath = "/mappings/settings.json")
@Mapping(mappingPath = "/mappings/articles.json")
public class Article {
    // getters and setters, empty constructor are omitted for brevity
    @Id
    private String id;
    private Metadata metadata;
    // remainder of the body is omitted
}
Metadata.class snippet
public class Metadata {
    // getters and setters, empty constructor are omitted for brevity
    private Integer indexed;
    // remainder of the body is omitted
}
The query I use to retrieve articles that satisfy the given criteria, and which I put as the value of @org.springframework.data.elasticsearch.annotations.Query on top of the custom method:
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "metadata",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "metadata.indexed": 1
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
My custom Spring Data ElasticSearch repository snippet with a custom method:
public interface CustomSpringDataElasticsearchRepository extends ElasticsearchRepository<Article, String> {
    @Query("The query from above")
    Long countByMetadata_Indexed(int value);
}
When I use the repository method shown above, I get java.lang.IllegalArgumentException: Expected 1 but found n results.
The custom Spring Data Elasticsearch repository method without @Query returns 0 (the version without the underscore returns 0 as well), though it should return the correct count.
How do I get the correct results using a Spring Data ElasticSearch repository? Why doesn't the custom method without @Query work either?
UPD: The version of spring-data-elasticsearch used is 3.1.1.RELEASE.
Repository query methods currently (3.2.4.RELEASE) don't support counting by fields inside nested fields.
As was mentioned previously, the @Query annotation doesn't support custom count queries as of the latest version (3.2.4.RELEASE).
In other words, currently the only way to run this query through Spring Data ElasticSearch is to use the ElasticsearchTemplate bean or the ElasticsearchOperations bean.
Credit: P.J.Meisch

How can we use InstancePerMatchingLifetimeScope in Autofac configuration Json/XML file

How can I register a component as InstancePerMatchingLifetimeScope in the configuration file (JSON/XML) supported by Autofac? As of now it throws an exception saying Invalid Scope.
Autofac configuration does not currently support instance per matching lifetime scope. You can see the code here where lifetime scope values are parsed and there is a table in the documentation showing what is currently supported - be sure to scroll to the right in the table so you can see the list of valid values.
I have not tested this solution.
Make sure you used the proper string in the configuration; try the following:
{
  "instanceScope": "per-matching-lifetime"
}
I guess you could also do it by creating a custom module:
public class CustomModule : Module
{
    // The tag is set from the "TagName" property in the configuration file.
    public string TagName { get; set; }
    protected override void Load(ContainerBuilder builder)
    {
        builder.RegisterType<CustomType>().InstancePerMatchingLifetimeScope(TagName);
    }
}
Configuration:
{
  "modules": [{
    "type": "MyNamespace.CustomModule, MyAssembly",
    "properties": {
      "TagName": "customRequest"
    }
  }]
}
If this does not help, please provide additional details about the project type (ASP.NET Core / ASP.NET MVC), the exception details, and the stack trace.

How to correctly wrap a Flux inside a Mono object

I have a web-service which returns student and enrolled class details.
{
  "name": "student-name",
  "classes": [
    {
      "className": "reactor-101",
      "day": "Tuesday"
    },
    {
      "className": "reactor-102",
      "day": "Friday"
    }
  ]
}
The DTO for this class is as below:
public class Student {
    private String name;
    private Flux<StudentClass> classes;
    @Data
    @AllArgsConstructor
    @JsonInclude(JsonInclude.Include.NON_DEFAULT)
    public static class StudentClass {
        private String className;
        private String day;
    }
}
The main REST controller logic to fetch the student is as follows:
Flux<StudentClass> studentClassFlux = studentClassRepository.getStudentClass(studentName);
return Mono.just(new Student(studentName, studentClassFlux));
The problem with this is, I get the following output after making the REST call:
{
  "name": "student-name",
  "classes": {
    "prefetch": 32,
    "scanAvailable": true
  }
}
I can achieve the desired output by blocking until the Flux completes and then converting the output to a list.
List<StudentClass> studentClassList = studentClassRepository.getStudentClass(studentName).toStream().collect(Collectors.toList());
return Mono.just(new Student(studentName, studentClassList)); // Change the Student#classes from flux to list
I am new to reactive-programming.
What is the correct way of using the Flux & Mono here to get the desired output?
Reactive types aren't meant to be serialized when wrapped in each other.
In that particular case, you probably want your Student object to contain a List<StudentClass>. You can achieve that like this:
public Mono<Student> findStudent(String studentName) {
    return studentClassRepository
        .getStudentClass(studentName)
        .collectList()
        .map(studentClasses -> new Student(studentName, studentClasses));
}
I think, in the case that you really need a Flux in your result, you would want to break down the API so that you have separate methods to retrieve the entities.
One for student properties, and another for their classes. The student GET method could be a Mono, while the classes would return a Flux.

Xamarin.Forms deserialising json to an object

var myObjectList = (List<MyObject>)JsonConvert.DeserializeObject(strResponseMessage, typeof(List<MyObject>));
The above works to deserialise a JSON string to a list of custom objects when the JSON has the following format:
[
  {
    "Name": "Value"
  },
  {
    "Name": "Value"
  },
  {
    "Name": "Value"
  },
  {
    "Name": "Value"
  }
]
I don't know how to do the same when the format is like this:
{
  "ReturnedData" : [
    {
      "Name": "Value"
    },
    {
      "Name": "Value"
    },
    {
      "Name": "Value"
    },
    {
      "Name": "Value"
    }
  ]
}
I can get the data like this
JObject information = JObject.Parse(strResponseMessage);
foreach (dynamic data in information)
{
    //convert to object here
}
and that works for Android but it seems that you cannot use a type of 'dynamic' for iOS as I get the error:
Object type Microsoft.CSharp.RuntimeBinder.CSharpInvokeMemberBinder cannot be converted to target type: System.Object[]
What step am I missing to convert the second JSON string to the first?
If JsonConvert is Json.NET, just deserialize into a wrapper class like this instead of a List:
public class MyClass {
    public List<MyObject> ReturnedData { get; set; }
}
You can't use the dynamic keyword on iOS, as it's forbidden to generate code, as stated in this link.
Quote:-
No Dynamic Code Generation
Since the iPhone's kernel prevents an application from generating code dynamically Mono on the iPhone does not support any form of dynamic code generation.
These include:
The System.Reflection.Emit is not available.
Quote:-
System.Reflection.Emit
The lack of System.Reflection.Emit means that no code that depends on runtime code generation will work. This includes things like:
The Dynamic Language Runtime.
Any languages built on top of the Dynamic Language Runtime.
Apparently there is some support creeping in from v7.2, as can be seen in this link (see @Rodja's answer); however, it's very experimental and has flaws preventing this from fully working.
Your best approach would be to process the JObject without relying on the dynamic keyword, and you will be alright on both Android and iOS.
Thanks for your replies. I was able to solve it as follows:
JObject information = JObject.Parse(strResponseMessage);
string json = JsonConvert.SerializeObject(information["ReturnedData"]);
var myObjectList = (List<MyObject>)JsonConvert.DeserializeObject(json, typeof(List<MyObject>));
Works perfectly!
