Importing data from JSON - neo4j

I'm facing an issue importing a JSON file using the apoc.load.json procedure.
The expected relationship I'm trying to capture:
University --child--> Class --child--> Student
Output:
Neo.ClientError.Statement.SyntaxError: Variable `value` not defined (line 1, column 8 (offset: 7))
"UNWIND value.university AS university"
Here is the performed command sequence:
CALL apoc.load.json("FILE:///C:/tmp/input.json") YIELD value
UNWIND value.university AS university
UNWIND university.class AS class
UNWIND class.student AS student
MERGE (u:UniversityCategory {name:university.name})
MERGE (c:ClassCategory {name:class.name})
MERGE (s:StudentCategory {name:student.name})
ON CREATE SET s.ID = student.ID
ON CREATE SET s.GPA = student.GPA
MERGE (u)-[:CHILD]->(c)
MERGE (c)-[:CHILD]->(s)
Here is the JSON file structure:
{
"university": [{
"name": "universityA",
"class": [{
"name": "class_1",
"student": [{
"name": "student_1",
"ID": "1234",
"GPA": "3.8"
},
{
"name": "student_2",
"ID": "12345",
"GPA": "3.4"
}
]
},
{
"name": "class_2",
"student": [{
"name": "student_3",
"ID": "14",
"GPA": "3.0"
}]
}
]
}]
}
My apoc.load.json command appears to work, because I can see the structured JSON in the browser window. The next steps are the suspect part, but I think I'm close to defining the relationships.

Resolved my issue.
The commands I expressed above are all correct, but the apoc.load.json command has to be run in the SAME query as the rest, rather than as a separate query run beforehand.

Related

How to get only matched data from nested class using query builder in elastic search?

I am trying to get only the matched data from a nested array in an Elasticsearch document, but I am not able to: the whole nested array is being returned as output.
This is my query:
QueryBuilders.nestedQuery("questions",
QueryBuilders.boolQuery()
.must(QueryBuilders.matchQuery("questions.questionTypeId", quesTypeId)), ScoreMode.None)
.innerHit(new InnerHitBuilder());
I am using QueryBuilders to get data from the nested class. It is working fine, but I am not able to get only the matched data.
Request body:
{
"questionTypeId" : "MCMC"
}
When questionTypeId = "MCMC", this is the output I am getting. I want to exclude the entry for which questionTypeId = "SCMC".
Output:
{
"id": "46",
"subjectId": 1,
"topicId": 1,
"subtopicId": 1,
"languageId": 1,
"difficultyId": 4,
"isConceptual": false,
"examCatId": 3,
"examId": 1,
"usedIn": 1,
"questions": [
{
"id": "46_31",
"pid": 31,
"questionId": "QID41336691",
"childId": "CID1",
"questionTypeId": "MCMC",
"instruction": "This is a single correct multiple choice question.",
"question": "Who holds the most english premier league titles?",
"solution": "Manchester United",
"status": 1000,
"questionTranslation": []
},
{
"id": "46_33",
"pid": 33,
"questionId": "QID41336677",
"childId": "CID1",
"questionTypeId": "SCMC",
"instruction": "This is a single correct multiple choice question.",
"question": "Who holds the most english premier league titles?",
"solution": "Manchester United",
"status": 1000,
"questionTranslation": []
}
]
}
As you have tagged this with spring-data-elasticsearch:
Support for returning inner hits was recently added in version 4.1.M1 and so will be included in the next released version. Then, in a SearchHit, you will get the complete top-level document, but the innerHits property will contain only the matching inner hits.
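A rough sketch of what consuming that might look like once you are on a version with inner-hit support (the ParentDocument entity class, the operations bean, and the getInnerHits accessor are assumptions on my part; check the 4.1 API docs for the exact signatures):
// Sketch only: assumes spring-data-elasticsearch 4.1+ with an ElasticsearchOperations bean;
// ParentDocument is a hypothetical entity class mapped to this index.
NativeSearchQuery query = new NativeSearchQueryBuilder()
    .withQuery(QueryBuilders.nestedQuery("questions",
            QueryBuilders.boolQuery()
                .must(QueryBuilders.matchQuery("questions.questionTypeId", quesTypeId)),
            ScoreMode.None)
        .innerHit(new InnerHitBuilder()))
    .build();

SearchHits<ParentDocument> hits = elasticsearchOperations.search(query, ParentDocument.class);
for (SearchHit<ParentDocument> hit : hits) {
    // hit.getContent() is the complete top-level document;
    // the inner hits registered under "questions" should contain only the matching nested entries.
    SearchHits<?> matchedQuestions = hit.getInnerHits().get("questions");
}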

Return values for dimensions in timeseries query Druid

I have Druid timeseries query:
{
"queryType": "timeseries",
"dataSource": {
"type": "union",
"dataSources": [
"ds1",
"ds2"
]
},
"dimensions":["dim1"],
"aggregations": [
{
"name": "y1",
"type": "doubleMax",
"fieldName": "value1"
}
],
"granularity": {
"period": "PT10S",
"type": "period"
},
"postAggregations": [],
"intervals": "2017-06-09T13:05:46.000Z/2017-06-09T13:06:46.000Z"
}
I want the values of the dimensions to be returned as well, not just the aggregations, which currently come back like this:
{
"timestamp": "2017-06-09T13:05:40.000Z",
"result": {
"y1": 28.724306106567383
}
},
{
"timestamp": "2017-06-09T13:05:50.000Z",
"result": {
"y1": 28.724306106567383
}
},
How do I have to change the query? Thanks in advance!
If your requirement is to use a dimension column in a timeseries query, that means you are using aggregated data together with a non-aggregated column, which calls for a topN or groupBy query instead.
The groupBy query is probably one of the most powerful queries Druid currently supports, but it also has poor performance; you can use a topN query instead for your purpose.
The topN documentation and examples can be found here:
http://druid.io/docs/latest/querying/topnquery.html
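For example, a topN version of the original query might look roughly like this (a sketch; the metric to sort by and the threshold of 100 are assumptions you would adjust):
{
  "queryType": "topN",
  "dataSource": {
    "type": "union",
    "dataSources": ["ds1", "ds2"]
  },
  "dimension": "dim1",
  "metric": "y1",
  "threshold": 100,
  "granularity": {
    "type": "period",
    "period": "PT10S"
  },
  "aggregations": [
    {
      "name": "y1",
      "type": "doubleMax",
      "fieldName": "value1"
    }
  ],
  "intervals": "2017-06-09T13:05:46.000Z/2017-06-09T13:06:46.000Z"
}
Each result row then carries the dim1 value alongside the y1 aggregate.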
Does the timeseries query not support dimensions?
I tried it in my project but it is not working.
Here is the error:
TypeError: queryRep.dataSource(...).dimension is not a function
2|DSP-api | at dimensionData (/home/ec2-user/reports/dsp_reports/controllers/ReportController.js:228:22)
Let me know if anyone has a solution for this.
TY.

How to create Neo4J relationship between nodes yelp dataset

I am new to Neo4j. I am trying to load the Yelp dataset into Neo4j. Basically, I am interested in three JSON files provided by them, i.e.
user.json
{
"user_id": "-lGwMGHMC_XihFJNKCJNRg",
"name": "Gabe",
"review_count": 277,
"yelping_since": "2014-10-31",
"friends": ["Oa84FFGBw1axX8O6uDkmqg", "SRcWERSl4rhm-Bz9zN_J8g", "VMVGukgapRtx3MIydAibkQ", "8sLNQ3dAV35VBCnPaMh1Lw", "87LhHHXbQYWr5wlo5W7_QQ"],
"useful": 45,
"funny": 4,
"cool": 55,
"fans": 17,
"elite": [],
"average_stars": 4.72,
"compliment_hot": 5,
"compliment_more": 1,
"compliment_profile": 0,
"compliment_cute": 1,
"compliment_list": 0,
"compliment_note": 11,
"compliment_plain": 20,
"compliment_cool": 15,
"compliment_funny": 15,
"compliment_writer": 1,
"compliment_photos": 8
}
(I have omitted several entries from the friends array to keep the output readable.)
business.json
{
"business_id": "YDf95gJZaq05wvo7hTQbbQ",
"name": "Richmond Town Square",
"neighborhood": "",
"address": "691 Richmond Rd",
"city": "Richmond Heights",
"state": "OH",
"postal_code": "44143",
"latitude": 41.5417162,
"longitude": -81.4931165,
"stars": 2.0,
"review_count": 17,
"is_open": 1,
"attributes": {
"RestaurantsPriceRange2": 2,
"BusinessParking": {
"garage": false,
"street": false,
"validated": false,
"lot": true,
"valet": false
},
"BikeParking": true,
"WheelchairAccessible": true
},
"categories": ["Shopping", "Shopping Centers"],
"hours": {
"Monday": "10:00-21:00",
"Tuesday": "10:00-21:00",
"Friday": "10:00-21:00",
"Wednesday": "10:00-21:00",
"Thursday": "10:00-21:00",
"Sunday": "11:00-18:00",
"Saturday": "10:00-21:00"
}
}
review.json
{
"review_id": "VfBHSwC5Vz_pbFluy07i9Q",
"user_id": "-lGwMGHMC_XihFJNKCJNRg",
"business_id": "YDf95gJZaq05wvo7hTQbbQ",
"stars": 5,
"date": "2016-07-12",
"text": "My girlfriend and I stayed here for 3 nights and loved it.",
"useful": 0,
"funny": 0,
"cool": 0
}
As we can see in the sample files, the relationship between a user and a business is established via the review.json file. How can I create a relationship edge between a user and a business using the review.json file?
I have also seen Mark Needham's tutorial where he shows loading Stack Overflow data, but in that case a relationship file was already present with the sample data. Do I need to build a similar file? If yes, how should I approach this problem? Or is there another way to build the relationship between user and business?
It very much depends on your model as to what you want to do, but you could do 3 imports:
//Create Users - does assume the data is unique
CALL apoc.load.json('file:///c://temp//SO//user.json') YIELD value AS user
CREATE (u:User)
SET u = user
then add the businesses:
CALL apoc.load.json('file:///c://temp//SO//business.json') YIELD value AS business
CREATE (b:Business {
business_id : business.business_id,
name : business.name,
neighborhood : business.neighborhood,
address : business.address,
city : business.city,
state : business.state,
postal_code : business.postal_code,
latitude : business.latitude,
longitude : business.longitude,
stars : business.stars,
review_count : business.review_count,
is_open : business.is_open,
categories : business.categories
})
For the businesses, we can't just do SET b = business because the JSON has nested maps. So you might want to decide whether you want them, and you might have to go down a different route.
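If you do want some of the nested attributes, one possible route is to flatten just the ones you care about onto the node in a follow-up pass (a sketch; the target property names are made up, and the source keys come from the business.json sample above):
CALL apoc.load.json('file:///c://temp//SO//business.json') YIELD value AS business
MATCH (b:Business {business_id: business.business_id})
// flatten selected nested attributes onto plain properties
SET b.price_range = business.attributes.RestaurantsPriceRange2,
    b.bike_parking = business.attributes.BikeParking,
    b.parking_lot = business.attributes.BusinessParking.lot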
Lastly, the reviews, which is where we join it all up.
CALL apoc.load.json('file:///c://temp//SO//review.json') YIELD value AS review
CREATE (r:Review)
SET r = review
WITH r
//Match user to a review
MATCH (u:User {user_id: r.user_id})
CREATE (u)-[:HAS_REVIEW]->(r)
WITH r, u
//Match business to a review, and a user to a business
MATCH (b:Business {business_id: r.business_id})
//Merge here in case of multiple reviews
MERGE (u)-[:HAS_REVIEWED]->(b)
CREATE (b)-[:HAS_REVIEW]->(r)
Obviously, change the labels/relationship types to whatever you want, and it might need tuning depending on the size of the data, etc., so you might need to use apoc.periodic.iterate to work through it (see the sketch below).
Apoc is here if you need it (and you should use it!)
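For large files, here is a sketch of how the review step could be batched with apoc.periodic.iterate (the batch size is an assumption to tune; parallel is kept off because the MERGEs touch shared nodes):
CALL apoc.periodic.iterate(
  "CALL apoc.load.json('file:///c://temp//SO//review.json') YIELD value AS review RETURN review",
  "CREATE (r:Review) SET r = review
   WITH r, review
   MATCH (u:User {user_id: review.user_id})
   CREATE (u)-[:HAS_REVIEW]->(r)
   WITH r, u, review
   MATCH (b:Business {business_id: review.business_id})
   MERGE (u)-[:HAS_REVIEWED]->(b)
   CREATE (b)-[:HAS_REVIEW]->(r)",
  {batchSize: 10000, parallel: false})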

Neo4j return a node with an array of nodes as property or separate array

I have four nodes that -[belongTo]-> (ContainerNode).
I want the JSON to be returned as a single container node which contains an array of all the nodes that link to it. For example:
"nodes": [
{
"id": "240",
"name":"MyNodeContainer",
"Type": "ContainerNode"
"SubNodes": [
{
"id": "1",
"name":"MyNodeA",
"Type": "node"
},
{
"id": "2",
"name":"MyNodeB",
"Type": "node"
}
]
},
It seems simple, but all I can get is the default of all nodes being returned as equals. I want the result to make it clear that the container node is separate from the rest. An array property seems most intuitive, but I would also be content with two lists: one for the single container node and one for the contained nodes.
Does something like this steer you towards your end goal? It builds a collection of the contained nodes and then returns it as a property of the ContainerNode.
MATCH (c:ContainerNode)<-[:BELONGS_TO]-(n:Node)
WITH c, collect({ id: id(n), name: n.name, type: labels(n)[0] }) AS nodes
WITH { id: id(c), name: c.name, type: labels(c)[0], SubNodes: nodes } AS containerNode
RETURN {nodes: collect(containerNode) }

Importing relationships with Core Data and Magical Record

I am getting JSON data from a web service and trying to store it in Core Data with MagicalRecord. I read the great post (and only documentation?) "Importing data made easy" by Saul Mora, but I still do not really understand what I need to do to get all the data into my entities.
Here is the JSON the web service returns:
{
"ApiVersion": 4,
"AvailableFileSystemLibraries": [
{
"Id": 10,
"Name": "Movie Shares",
"Version": "0.5.4.0"
},
{
"Id": 11,
"Name": "Picture Shares",
"Version": "0.5.4.0"
},
{
"Id": 5,
"Name": "Shares",
"Version": "0.5.4.0"
},
{
"Id": 9,
"Name": "Music Shares",
"Version": "0.5.4.0"
}
],
"AvailableMovieLibraries": [
{
"Id": 3,
"Name": "Moving Pictures",
"Version": "0.5.4.0"
},
{
"Id": 7,
"Name": "MyVideo",
"Version": "0.5.4.0"
}
],
"AvailableMusicLibraries": [
{
"Id": 4,
"Name": "MyMusic",
"Version": "0.5.4.0"
}
],
"AvailablePictureLibraries": [
{
"Id": 8,
"Name": "Picture Shares",
"Version": "0.5.4.0"
}
],
"AvailableTvShowLibraries": [
{
"Id": 6,
"Name": "MP-TVSeries",
"Version": "0.5.4.0"
}
],
"DefaultFileSystemLibrary": 5,
"DefaultMovieLibrary": 3,
"DefaultMusicLibrary": 4,
"DefaultPictureLibrary": 0,
"DefaultTvShowLibrary": 6,
"ServiceVersion": "0.5.4"
}
The entities I want to store that data in look like this:
There is also a Server entity with a 1:1 relationship to ServerInfo.
What I want to do:
Store basic data (ApiVersion, ...) in ServerInfo. This I already got to work.
Store each object in AvailableXYLibraries in BackendLibrary (1:n relationship from ServerInfo).
Set type based on the XY part of AvailableXYLibraries, for example "movie" for AvailableMovieLibraries.
Set defaultLibrary to true if this library is referenced by DefaultXYLibrary.
Set providerId to servername + LibraryId as there are multiple servers that can have BackendLibraries with the same numeric ID.
Is this possible with MagicalRecord? I guess I need to implement some of the import hooks and set some user info keys, but everything I read doesn't really tell me where to set which user info key, or which method to implement where and how.
I hope this made sense and that you can give me some hints :) Thanks!
The structure of this data is quite a bit different from your Core Data model. What you'll most likely have to do is iterate a bit over the dictionary. That is, there are various collections of library data, e.g. AvailableFileSystemLibraries, AvailableMovieLibraries, etc. You'll have to get the array out of those keys, and then map your entities as I described in the article. In order to launch the process, you'll have to call
[BackendLibrary importFromArray:arrayFromDownloadedDictionary];
where arrayFromDownloadedDictionary is each of the arrays in the example dictionary you've posted. Once you give the array to MagicalRecord, and provided you have set up the proper field mapping, MagicalRecord will import and create all the entities for you.
Make sure you map "Id" to BackendLibrary.id, "Name" to BackendLibrary.name, and "Version" to BackendLibrary.version.
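A rough sketch of that iteration (the method name and the hard-coded key list are mine; importFromArray: is the call from the answer above, and the keys come from the JSON you posted):
// Sketch: walk each AvailableXYLibraries array from the deserialized response
// and hand it to MagicalRecord for import.
- (void)importLibrariesFromResponse:(NSDictionary *)response {
    NSArray *libraryKeys = @[ @"AvailableFileSystemLibraries", @"AvailableMovieLibraries",
                              @"AvailableMusicLibraries", @"AvailablePictureLibraries",
                              @"AvailableTvShowLibraries" ];
    for (NSString *key in libraryKeys) {
        NSArray *libraries = response[key];
        [BackendLibrary importFromArray:libraries];
    }
}
The type and defaultLibrary values you describe would still have to be set in the import hooks on BackendLibrary (or in a pass after the import), since they depend on which key the array came from.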
