How to create dynamic node relation in neo4j for dynamic data? - neo4j

I was able to create author nodes directly from the json file . But the challenge is on what basis or how we have to link the data. Linking "Author" to "organization". since the data is dynamic we cannot generalize it. I have tried with using csv file but, it fails the conditions when dynamic data is coming. For example one json record contain 2 organization and 3 authors, next record will be different. Different json record have different author and organization to link. organization/1 represent organization1 and organization/2 represents organization 2. Any help or hint will be great. Thank you. Please find the json file below.
"Author": [
{
"seq": "3",
"type": "abc",
"identifier": [
{
"idtype:auid": "10000000"
}
],
"familyName": "xyz",
"indexedName": "MI",
"givenName": "T",
"preferredName": {
"familyName": "xyz1",
"givenName": "a",
"initials": "T.",
"indexedName": "bT."
},
"emailAddressList": [],
"degrees": [],
"#id": "https:abc/2009127993/author/person/3",
"hasAffiliation": [
"https:abc/author/organization/1"
],
"organization": [
[
{
"identifier": [
{
"#type": "idtype:uuid",
"#subtype": "idsubtype:affiliationInstanceId",
"#value": "aff2"
},
{
"#type": "idtype:OrgDB",
"#subtype": "idsubtype:afid",
"#value": "12345"
},
{
"#type": "idtype:OrgDB",
"#subtype": "idsubtype:dptid"
}
],
"organizations": [],
"addressParts": [],
"sourceText": "",
"text": " Medical University School of Medicine",
"#id": "https:abc/author/organization/1"
}
],
[
{
"identifier": [
{
"#type": "idtype:uuid",
"#subtype": "idsubtype:affiliationInstanceId",
"#value": "aff1"
},
{
"#type": "idtype:OrgDB",
"#subtype": "idsubtype:afid",
"#value": "7890"
},
{
"#type": "idtype:OrgDB",
"#subtype": "idsubtype:dptid"
}
],
"organizations": [],
"addressParts": [],
"sourceText": "",
"text": "K University",
"#id": "https:efg/author/organization/2"
}
]

Hi I see that Organisation is part of the Author data, so you have to model it like wise. So for instance (Author)-[:AFFILIATED_WITH]->(Organisation)
When you use apoc.load.json which supports a stream of author objects you can load the data.
I did some checks on your JSON structure with this cypher query:
call apoc.load.json("file:///Users/keesv/work/check.json") yield value
unwind value as record
WITH record.Author as author
WITH author.identifier[0].`idtype:auid` as authorId,author, author.organization[0] as organizations
return authorId, author, organizations
To get this working you will need to create include apoc in the plugins directory, and add the following two lines in the apoc.conf file (create one if it is not there) in the 'conf' directory.
apoc.import.file.enabled=true
apoc.import.file.use_neo4j_config=false
I also see a nested array for the organisations in the output why is that and what is the meaning of that?
And finally I see also in the JSON that an organisation can have a reference to other organisations.
explanation
In my query I use UNWIND to unwind the base Author array. This means you get for every author a 'record' to work with.
With a MERGE or CREATE statement you can now create an Author Node with the correct properties. With the FOREACH construct you can walk over all the Organization entry and create/merge an Organization node and create the relation between the Author and the Organization.
here an 'psuedo' example
call apoc.load.json("file:///Users/keesv/work/check.json") yield value
unwind value as record
WITH record.Author as author
WITH author.identifier[0].`idtype:auid` as authorId,author, author.organization[0] as organizations
// creating the Author node
MERGE (a:Author { id: authorId })
SET a.familyName = author.familyName
...
// walk over the organizations
// determine
FOREACH (org in organizations |
MERGE (o:Organization { id: ... })
SET o.name = org.text
...
MERGE (a)-[:AFFILIATED_WITH]->(o)
// if needed you can also do a nested FOREACH here to process the Org Org relationship
)
Here is the JSON file I used I had to change something at the start and the end
[
{
"Author":{
"seq":"3",
"type":"abc",
"identifier":[
{
"idtype:auid":"10000000"
}
],
"familyName":"xyz",
"indexedName":"MI",
"givenName":"T",
"preferredName":{
"familyName":"xyz1",
"givenName":"a",
"initials":"T.",
"indexedName":"bT."
},
"emailAddressList":[
],
"degrees":[
],
"#id":"https:abc/2009127993/author/person/3",
"hasAffiliation":[
"https:abc/author/organization/1"
],
"organization":[
[
{
"identifier":[
{
"#type":"idtype:uuid",
"#subtype":"idsubtype:affiliationInstanceId",
"#value":"aff2"
},
{
"#type":"idtype:OrgDB",
"#subtype":"idsubtype:afid",
"#value":"12345"
},
{
"#type":"idtype:OrgDB",
"#subtype":"idsubtype:dptid"
}
],
"organizations":[
],
"addressParts":[
],
"sourceText":"",
"text":" Medical University School of Medicine",
"#id":"https:abc/author/organization/1"
}
],
[
{
"identifier":[
{
"#type":"idtype:uuid",
"#subtype":"idsubtype:affiliationInstanceId",
"#value":"aff1"
},
{
"#type":"idtype:OrgDB",
"#subtype":"idsubtype:afid",
"#value":"7890"
},
{
"#type":"idtype:OrgDB",
"#subtype":"idsubtype:dptid"
}
],
"organizations":[
],
"addressParts":[
],
"sourceText":"",
"text":"K University",
"#id":"https:efg/author/organization/2"
}
]
]
}
}
]
IMPORTANT create unique constraints for Author.id and Organization.id!!
In this way you can process any json file with an unknown number of author elements and an unknown number of affiliated organisations

Related

Does TypeORM have a way to search for all documents with an array containing a value?

My goal is to use an input array of strings (fake emails) as a search query for documents in my MongoDB database, which I am powering using TypeORM. This way if I want to search for documents using more than one email at a time, I can do that. Meaning I want to be able to feed in:
query = ["kim#gmail.com", "jim#gmail.com", "sarah#gmail.com"] and get 3 different documents where document one has kim#gmail.com as the attendee, jim#gmail.com is another document's attendee field, and sarah#gmail.com is the third document's attendee (or is among them).
I want to use an email as a query to search for and return all documents where the array field has the email in the array.
So as an example here is the results for the "get all documents" endpoint right now:
[
{
"_id": "6283d7ad706445dc33319bcb",
"hostUsername": "jack",
"hostEmail": "jack#outlook.com",
"meetingName": "nervous-fish-hautily-vetting",
"startTime": "2022-12-12T08:00:00.000Z",
"attendees": [
"kate#gmail.com",
"sawyer#gmail.com"
]
},
{
"_id": "6284235e662f7dfb073e2cbc",
"hostUsername": "jacob",
"hostEmail": "jacob#gmail.com",
"meetingName": "eager-fish-hautily-vetting",
"startTime": "2022-12-12T08:00:00.000Z",
"attendees": [
"kate#gmail.com",
"benjaminlinus#gmail.com"
]
},
{
"_id": "6283d7c3706445dc33319bcc",
"hostUsername": "richard",
"hostEmail": "richard#outlook.com",
"meetingName": "eager-cat-hautily-subtracting",
"startTime": "2022-12-12T08:00:00.000Z",
"attendees": [
"johnlocke#gmail.com",
"hurley#gmail.com"
]
},
{
"_id": "6283d82b706445dc33319bcd",
"hostUsername": null,
"hostEmail": "richard#outlook.com",
"meetingName": "nervous-cat-hautily-jumping",
"startTime": "1970-01-01T00:00:00.000Z",
"attendees": null
},
{
"_id": "6283d8af706445dc33319bce",
"hostUsername": null,
"hostEmail": "richard#outlook.com",
"meetingName": "eager-plant-ignorantly-jumping",
"startTime": "1970-01-01T00:00:00.000Z",
"attendees": null
}
]
I want to query the database with ["kate#gmail.com"] and get back the two results that have "kate#gmail.com" in the attendees field.
The closest solution (that doesn't work) is the one I found in this GitHub issue and also another close solution (that doesn't work) in this StackOverflow question
Here is me implementing those two suggestions:
import { In } from "typeorm";
async searchMeetingsByDetails(
attendees?: string[]
): Promise<IMeeting[]> {
console.log(attendees, 39);
const meetingsByAttendees = attendees
? await this.meetingRepository.find({
where: {
attendees: In([...attendees]),
},
})
: [];
return [
meetingsByAttendees,
].flat();
}
This gives me an empty array [] when the input is ["kate#gmail.com"] so if the In() thing worked, it would give results.
const meetings = await this.meetingRepository
.createQueryBuilder("meeting")
.where("meeting.attendees IN (:attendees)", {
attendees: [...attendees],
});
This one gives ERROR [ExceptionsHandler] Query Builder is not supported by MongoDB. TypeORMError: Query Builder is not supported by MongoDB.

Renaming type for FSharp.Data JsonProvider

I have a JSON that looks something like this:
{
...
"names": [
{
"value": "Name",
"language": "en"
}
],
"descriptions": [
{
"value": "Sample description",
"language" "en"
}
],
...
}
When using JsonProvider from the FSharp.Data library, it maps both fields as the same type MyJsonProvider.Name. This is a little confusing when working with the code. Is there any way how to rename the type to MyJsonProvider.NameOrDescription? I have read that this is possible for the CsvProvider, but typing
JsonProvider<"./Resources/sample.json", Schema="Name->NameOrDescription">
results in an error.
Also, is it possible to define that the Description field is actually an Option<MyJsonProvider.NameOrDescription>? Or do I just have to define the JSON twice, once with all possible values and the second time just with mandatory values?
[
{
...
"names": [
{
"value": "Name",
"language": "en"
}
],
"descriptions": [
{
"value": "Sample description",
"language" "en"
}
],
...
},
{
...
"names": [
{
"value": "Name",
"language": "en"
}
],
...
}
]
To answer your first question, I do not think there is a way of specifying such renaming. It would be quite reasonable option, but the JSON provider could also be more clever when generating names here (it knows that the type can represent Name or Description, so it could generate a name with Or based on those).
As a hack, you could add an unusued field with the right name:
type A = JsonProvider<"""{
"do not use": { "value_with_langauge": {"value":"A", "language":"A"} },
"names": [ {"value":"A", "language":"A"} ],
"descriptions": [ {"value":"A", "language":"A"} ]
}""">
To answer your second question - your names and descriptions fields are already arrays, i.e. ValueWithLanguge[]. For this, you do not need an optional value. If they are not present, the provider will simply give you an empty array.

Openlayers bbox strategy

I have bbox strategy for one source of data. Code looks like this:
bbox: function newBboxFeatureSource(url, typename) {
return new ol.source.Vector({
loader: function (extent) {
let u = `${url}&TYPENAME=${typename}&bbox=${extent.join(",")}`;
$.ajax(u).then((response) => {
this.addFeatures(
geoJsonFormat.readFeatures(response)
);
});
},
strategy: ol.loadingstrategy.bbox
});
},
I works fine but... When I pan/move the map then this loader is calling again and add another features which fit to new box. But there is a lot of duplicates then because some of new features are just the same as old.
So I wanted first clear all features using this.clear() before add new features but when I add this command then loader is running all the time and I have "infinitive loop". Do you know why? How can I disable loading new features after calling this.clear()?
edit:
my response with features looks like this:
{ "type": "FeatureCollection", "crs": { "type": "name", "properties":
{ "name": "urn:ogc:def:crs:EPSG::3857" } },
"features": [ { "type": "Feature", "properties": { "ogc_fid": "2",
"name": "AL" }, "geometry": { "type": "MultiPolygon" , "coordinates":
[ [ [ ... ] ] ] } }, { "type": "Feature", "properties": { "ogc_fid":
"3", "name": "B" }, "geometry": { "type": "MultiPolygon" ,
"coordinates": [ [ [ ...] ] ] } } ..... and so on
I've removed coordinates because there was too many of them.
My features are generated by mapserver and are configured in .map file which looks like this:
LAYER
NAME "postcode_area_boundaries"
METADATA
"wfs_title" "Postcode area boundaries"
"wfs_srs" "EPSG:3857"
"wfs_enable_request" "*"
"wfs_getfeature_formatlist" "json"
"wfs_geomtype" "multipolygon"
"wfs_typename" "postcode_area_boundaries"
"wms_context_fid" "id"
"wfs_featureid" "id"
"gml_featureid" "id"
"gml_include_items" "id,postarea,wkb_geometry"
"gml_postarea_alias" "name"
"ows_featureid" "id"
"tinyows_table" "postcode_area_boundaries"
"tinyows_retrievable" "1"
"tinyows_include_items" "id,postarea,wkb_geometry"
END
TYPE POLYGON
STATUS ON
CONNECTIONTYPE POSTGIS
CONNECTION "..."
DATA "wkb_geometry FROM postcode_area_boundaries USING UNIQUE id"
DUMP TRUE
END
To summarize the discussion and answer the initial question:
The features sent by the server need an attribute called id, which must be unique and the same for the feature on every request.
{type: "Feature", id: "some-wfs.1234", properties: { "ogc_fid": 2, ...
See this GitHub Issue for the original comment of ahocevar.
In GeoServer this can be achieved if you set an identifier in your layer.
I guess there is something similar to set in MapServer.
Following up on #bennos comment: you can expose the fid in Mapserver using the FORMATOPTION USE_FEATUREID=true as described in: https://mapserver.org/output/ogr_output.html
USE_FEATUREID=true/false
Starting from MapServer v7.0.2. Defaults to false. Include feature ids in the generated output, if the ows_featureid metadata key is set at the layer level. The featureid column to use should be an integer column. Useful if you need to include an “id” attribute to your geojson output. Use with caution as some OGR output drivers may behave strangely when fed with random FIDs.
OUTPUTFORMAT
NAME "application/json"
DRIVER "OGR/GEOJSON"
MIMETYPE "application/json"
FORMATOPTION "FORM=SIMPLE"
FORMATOPTION "FILENAME=ol-query.json"
FORMATOPTION "STORAGE=memory"
FORMATOPTION "USE_FEATUREID=true"
END

How to use parameters correctly in Transactional Cypher HTTP

I'm a bit stuck. I'm trying to use parameters in my http request in order to reduce the overhead but I just can't figure out why this won't work:
{
"statements" : [ {
"statement" : "MATCH (n:Person) WHERE n.name = {name} SET n.dogs={dogs} RETURN n",
"parameters" : [{
"name" : "Andres",
"dogs":5
},{
"name" : "Michael",
"dogs":3
},{
"name" : "Someone",
"dogs":2
}
]
}]
}
I've tried just opening a transaction with a STATEMENT and feeding the separate 'rows' in as PARAMETERS on subsequent transactions before I /COMMIT, but no joy.
I know that multiple nodes can be created in a similar from the examples in the manual.
What am I missing?
I've since modified an answer from this post, which seems to work by using a FOREACH statement to allow for multplie 'sets' of parameters.
{
"statements" : [
{
"parameters": {
"props": [
{
"userid": "177032492760",
"username": "John"
},
{
"userid": "177032492760",
"username": "Mike"
},
{
"userid": "100007496328",
"username": "Wilber"
}
]
},
"statement": "FOREACH (p in {props} | MERGE (user:People {id:p.userid}) ON CREATE SET user.name = p.username) "
}
]
}
You can also use UNWIND for this case :
The statement :
UNWIND props as prop
MERGE (user:People {id: {prop}.id}) // and the rest of your query
The parameters :
{"props":[ {"id": 1234, "name": "John"},{"id": 4567, "name": "Chris"}]}
This is what is used on Graphgen to load the generated graphs in a local database from the webapp. http://graphgen.neoxygen.io

Restkit: Nested relationship mapping

I have the following JSON:
{
"votecategory": [
{
"id": "nlvfl2",
"title": "Best Song",
"pollQuestion": {
"id": "nbprqp",
"title": "best-song",
"displayText": "Best Song",
"answer": [
{
"id": "qylaw4",
"title": "Bruno Mars – Locked Out Of Heaven",
"relatedItems": [
{
"Name": "Bruno Mars",
"id": "sljkur",
"Bio": "Bio info here"
},
{} //Sometimes there's an empty object
],
"winner": "true"
},
{
"id": "q05sb3",
"title": "Daft Punk – Get Lucky (ft. Pharrell Williams)",
"displayText": "Daft Punk – Get Lucky (ft. Pharrell Williams)",
"relatedItems": [
{
"Name": "Daft Punk",
"id": "d9sd84",
"Bio": "Bio info here"
}
]
},
...
]
}
},
...
]
}
Which maps to the following entities:
Category (votecategory values)
Nomination (answer values)
Artist (relatiedItems values)
Ive managed to setup object and relationship mappings for votecategory (category) -> answer (nomination), however I'm having a problem mapping nomination to artist.
What I need to do is have a 1:1 core data relationship setup between nomination and artist, and 1:N relationship setup between artists and nomination (one artist can have multiple nominations).
The problem is that "relatedItems" is an array, but in reality only contains 1 usable value, the related artist. This "should" be a 1:1 relationship from a data perspective, however the JSON maps it as a 1:N relationship, this confuses restkit (rightfully so).
How can I store the single item in the JSON relatedItems response as a single 1:1 relationship?
Thanks
Oli
You could look at using a custom value transformer on that mapping which converts the array into a single object. Check out this question for some additional details.

Resources