create if not exists... with multiple properties (and unique ID) - neo4j

First, sorry for my English; I'm French :p
I'm currently switching from MySQL to Neo4j and I have a question about my queries.
I have artists and music albums, linked where needed as (artist)-[:OWNS]->(album).
I'm now developing the API that updates this information, and I've run into a problem:
how can I fetch an existing node, or create it if it doesn't exist?
For another part of the model, I do it like this:
MATCH (u:User) WHERE u.id='83cac821-1607-49a3-e124-07431ef375ce' MERGE (c:Country {name:'France'}) CREATE UNIQUE (u)-[:FROM]->(c) RETURN u,c;
So if the country "France" already exists, Neo4j will not create a second one. That's perfect, because my countries don't have IDs.
But artists and albums need a unique identifier, and I can't work out how to write the query:
MATCH (ar:Artist) WHERE ar.id='83cac821-1607-49a3-e124-07431ef375ce' MERGE (al:Album {name:'Title01', id:'31efc821-1607-49a3-e124-074383ca75ce'}) CREATE UNIQUE (ar)-[:OWNS]->(al) RETURN ar,al;
Written this way, I need to know the album's ID, and in my API I don't. What I need is for Neo4j to return the album "Title01" if it exists, and to create it with a fresh new ID if it doesn't. In my example, if I leave out the ID, the query finds the album when it exists, but otherwise creates a new one without an ID. And if I do send an ID, Neo4j never matches the existing album, because the title exists but not with that particular ID.
(Previously, in MySQL, I used several queries: 1) check whether the album exists; if so, return its ID, otherwise create it with a new ID and return that; 2) the same for the artist; 3) create the link between them.)
Thanks for your help!

The MERGE command can be extended with ON MATCH and ON CREATE, see http://docs.neo4j.org/chunked/stable/query-merge.html#_use_on_create_and_on_match. I guess you need something like this:
MATCH (ar:Artist) WHERE ar.id='83cac821-1607-49a3-e124-07431ef375ce'
MERGE (al:Album {name:'Title01'})
ON CREATE SET al.id = '31efc821-1607-49a3-e124-074383ca75ce'
CREATE UNIQUE (ar)-[:OWNS]->(al) RETURN ar,al
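Since the API does not know the album ID up front, one option is to generate a candidate ID on the client for every request and let ON CREATE SET apply it only when the album is actually new. Here is a minimal Python sketch of that idea. The helper name build_album_upsert and the parameter names are made up for illustration, the $param syntax assumes a reasonably recent Neo4j, and the final MERGE on the relationship is my assumption for a stand-in for CREATE UNIQUE on newer versions. It only builds the statement; sending it to the server is left to whatever driver you use.

```python
import uuid


def build_album_upsert(artist_id, album_name):
    """Return a (query, params) pair for a get-or-create of an album.

    Hypothetical helper: it only builds the statement, it does not talk to Neo4j.
    """
    query = (
        "MATCH (ar:Artist {id: $artistId}) "
        "MERGE (al:Album {name: $albumName}) "
        # ON CREATE SET only runs when MERGE had to create the node,
        # so an existing album keeps the ID it already has.
        "ON CREATE SET al.id = $newAlbumId "
        # Assumption: MERGE on the relationship replaces CREATE UNIQUE.
        "MERGE (ar)-[:OWNS]->(al) "
        "RETURN ar, al"
    )
    params = {
        "artistId": artist_id,
        "albumName": album_name,
        # Fresh candidate ID; ignored by the query if the album already exists.
        "newAlbumId": str(uuid.uuid4()),
    }
    return query, params


query, params = build_album_upsert(
    "83cac821-1607-49a3-e124-07431ef375ce", "Title01"
)
```

Note that merging on the name alone treats two albums with the same title as one node, so this only fits if titles are unique in your data.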

Here's a page that shows how to create a node if it doesn't exist: Link

Related

Dexie eachUniqueKey and Where Clause

I'm developing an application in Quasar/Electron and using Dexie/IndexedDB for my database. I want to find all distinct records in the database that contain both my Event ID and a Dog ID (both key indexed fields). I am able to do this with the following code:
await myDB.runTable
.orderBy('[fk_event+fk_dog]')
.eachUniqueKey((theDuo) => {
this.runsArray.push({eventID: theDuo[0], dogID: theDuo[1]})
})
I'm using a combined key which is working well. However, I need to have more of the records than just the keys. I need a few more fields, is this possible?
I was trying to get records with the unique key function while also using the where function, but that doesn't seem to work.
I need to get all the unique (distinct?) dogs in the table that are in a particular event. And also get their corresponding information. I'm not sure if there is a better, more efficient way to do this? I can always pull out all the records and loop through them to build a custom array, I was just hoping to do this at the table read level. (yeah I'm still in tables/records even though these are collections etc. :p ).
Even the above code gives me all the events, and I can pull out what I need with a filter. I just was thinking it would be faster and more efficient to do it at the read level.
this.enteredRuns = this.runsArray.filter((theEvent) => {
return ( (theEvent.eventID == this.currentEventID) )
})
Try
await myDB.runTable
.orderBy('[fk_event+fk_dog]')
.clone({unique: "unique"})
.toArray()
I know this isn't documented, but it should do the job: it uses a unique cursor while still extracting the whole objects and not just the keys. You cannot combine it with where, but you can use .filter. Just be aware that not all records will be scanned, as the cursor jumps over records with the same key and selects only the first record visited for each key.
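To make the "first visited record only" behaviour concrete, here is a small Python sketch that emulates it on plain dictionaries. It is not Dexie code: first_per_key, the sample rows and the id and time fields are invented for illustration, and it assumes that records sharing a key are visited in their stored order.

```python
def first_per_key(records, key_fields):
    """Keep the first record for each distinct compound key, in key order."""
    seen = set()
    result = []
    # sorted() is stable, so records with equal keys keep their input order.
    for rec in sorted(records, key=lambda r: tuple(r[f] for f in key_fields)):
        key = tuple(rec[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            result.append(rec)
    return result


# Hypothetical rows: dog 7 ran twice in event 10.
runs = [
    {"id": 1, "fk_event": 10, "fk_dog": 7, "time": 31.2},
    {"id": 2, "fk_event": 10, "fk_dog": 7, "time": 29.8},
    {"id": 3, "fk_event": 10, "fk_dog": 8, "time": 40.1},
    {"id": 4, "fk_event": 11, "fk_dog": 7, "time": 33.0},
]

unique_runs = first_per_key(runs, ["fk_event", "fk_dog"])
# Equivalent of the follow-up .filter: keep only the current event.
entered_runs = [r for r in unique_runs if r["fk_event"] == 10]
```

The second run of dog 7 in event 10 is skipped entirely, so any per-run fields you read come from whichever record happens to be first for that key.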

beatbox bulk delete: Getting MALFORMED_ID

Just as with upsert, I want to bulk delete records by a particular custom index using beatbox. Is there a way to do this?
I get MALFORMED_ID when I try.
The delete command in beatbox relies on the delete() SOAP API call. That call requires the primary-key Id of each object to delete; an external ID cannot be used, because exactly what is being deleted must be known beforehand. So look the Ids up first (example for the Contact object):
sql = "SELECT Id FROM Contact WHERE my_external_id__c in ({})".format(
    ', '.join("'{}'".format(x) for x in external_ids)
)
# Look up the record Ids first, then pass those Ids to delete()
svc.delete([x['Id'] for x in svc.query(sql)])
You can see in the docs nearby that update() and upsert() calls support external IDs.
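For larger sets, the lookup-then-delete approach usually needs two extra precautions: quoting the external IDs safely and sending the Ids in batches. The sketch below covers only those two pure-Python pieces. The helper names are hypothetical, and the batch size of 200 is my understanding of the per-call limit of the SOAP delete() call, so treat it as an assumption and check it against your API version.

```python
def soql_quote(value):
    """Quote a string as a SOQL literal, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_id_query(object_name, external_field, external_ids):
    """Build the SELECT that maps external IDs to record Ids."""
    values = ", ".join(soql_quote(x) for x in external_ids)
    return "SELECT Id FROM {} WHERE {} IN ({})".format(
        object_name, external_field, values
    )


def chunks(items, size=200):
    """Yield successive slices of at most `size` items (200 is an assumed limit)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


sql = build_id_query("Contact", "my_external_id__c", ["A-1", "O'Brien"])

# Not executed here; with a connected beatbox client `svc` it would look like:
#   record_ids = [x['Id'] for x in svc.query(sql)]
#   for batch in chunks(record_ids):
#       svc.delete(batch)
```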

Listing appointments of a contact

I'd like to list the appointments in which a contact is set as a required participating party. I'm running the following query in my organization.
The service end point is set up like so.
https://bazinga/XRMServices/2011/OrganizationData.svc/AppointmentSet?
And then I concatenate that with the following.
$select=
ScheduledStart,
ModifiedOn,
appointment_activity_parties/PartyId
&$top=1000
&$filter=
ModifiedOn gt DateTime'2014-08-21'
&$expand=appointment_activity_parties
This gives me a list of a few appointments and when I investigate the contents, I can clearly see a tag called PartyId with Id, Name etc. in it. The information is there. Then I look up by hand the GUID of the contact I'm curious about and add another condition to the filter specifying it.
$select=
ScheduledStart,
ModifiedOn,
appointment_activity_parties/PartyId
&$top=1000
&$filter=
(ModifiedOn gt DateTime'2014-08-21')
and (appointment_activity_parties/PartyId eq guid'...')
&$expand=appointment_activity_parties
However, the stupid organization service says that no property PartyId exists. When I try adding /Id, it says something about no support for complex data types when querying. I'm sure it's just a small syntax thing, but after a few hours I'm sick and tired of it. What am I missing?
The clause (appointment_activity_parties/PartyId eq guid'...') is meant to find the appointments in AppointmentSet whose appointment_activity_parties have a PartyId equal to a certain GUID. Because appointment_activity_parties is a collection, you need to specify whether an appointment should be returned when the PartyId of any of its appointment_activity_parties equals that GUID, or only when the PartyId of all of them equals it.
This difference between any and all is specified in section 5.1.1.2 of OData V4 Protocol part 2: URL Conventions.
Thus, you can rewrite your query as follows:
$select=
ScheduledStart,
ModifiedOn,
appointment_activity_parties/PartyId
&$top=1000
&$filter=
(ModifiedOn gt DateTime'2014-08-21')
and (appointment_activity_parties/any(a:a/PartyId eq guid'...'))
&$expand=appointment_activity_parties
You can change any to all according to your actual needs.
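If you assemble the URL in code, it helps to build the filter in one place and percent-encode it. Here is a rough Python sketch that mirrors the query above verbatim. The function name is made up, the base URL is the one from the question, and it assumes the endpoint accepts the any lambda operator at all, which older OData versions do not have.

```python
from urllib.parse import quote


def build_appointment_query(base_url, modified_after, party_guid):
    """Build the AppointmentSet URL with an any() filter on the parties."""
    filter_expr = (
        "(ModifiedOn gt DateTime'{}') and "
        "(appointment_activity_parties/any(a:a/PartyId eq guid'{}'))"
    ).format(modified_after, party_guid)
    options = [
        ("$select", "ScheduledStart,ModifiedOn,appointment_activity_parties/PartyId"),
        ("$top", "1000"),
        ("$filter", filter_expr),
        ("$expand", "appointment_activity_parties"),
    ]
    # Encode spaces and other unsafe characters, but keep the OData punctuation readable.
    query = "&".join(
        "{}={}".format(name, quote(value, safe="(),/:'")) for name, value in options
    )
    return base_url + "?" + query


url = build_appointment_query(
    "https://bazinga/XRMServices/2011/OrganizationData.svc/AppointmentSet",
    "2014-08-21",
    "00000000-0000-0000-0000-000000000000",  # placeholder GUID
)
```

Switching to all is then a one-word change in filter_expr.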

Neo4jClient: doubts about CRUD API

My persistence layer essentially uses Neo4jClient to access a Neo4j 1.9.4 database. More specifically, to create nodes I use IGraphClient#Create() in Neo4jClient's CRUD API and to query the graph I use Neo4jClient's Cypher support.
All was well until a friend of mine pointed out that for every query, I essentially did two HTTP requests:
one request to get a node reference from a legacy index by the node's unique ID (not its node ID! but a unique ID generated by SnowMaker)
one Cypher query that started from this node reference that does the actual work.
For read operations, I did the obvious thing and moved the index lookup into my Start() call, i.e.:
GraphClient.Cypher
.Start(new { user = Node.ByIndexLookup("User", "Id", userId) })
// ... the rest of the query ...
For create operations, on the other hand, I don't think this is actually possible. What I mean is: the Create() method takes a POCO, a couple of relationship instances and a couple of index entries in order to create a node, its relationships and its index entries in one transaction/HTTP request. The problem is the node references that you pass to the relationship instances: where do they come from? From previous HTTP requests, right?
My questions:
Can I use the CRUD API to look up node A by its ID, create node B from a POCO, create a relationship between A and B and add B's ID to a legacy index in one request?
If not, what is the alternative? Is the CRUD API considered legacy code and should we move towards a Cypher-based Neo4j 2.0 approach?
Does this Cypher-based approach mean that we lose POCO-to-node translation for create operations? That was very convenient.
Also, can Neo4jClient's documentation be updated because it is, frankly, quite poor. I do realize that Readify also offers commercial support so that might explain things.
Thanks!
I'm the author of Neo4jClient. (The guy who gives his software away for free.)
Q1a:
"Can I use the CRUD API to look up node A by its ID, create node B from a POCO, create a relationship between A and B"
Cypher is the way of not just the future, but also the 'now'.
Start with the Cypher (lots of resources for that):
START user=node:user(Id: 1234)
CREATE user-[:INVITED]->(user2 { Id: 4567, Name: "Jim" })
RETURN user2
Then convert it to C#:
graphClient.Cypher
.Start(new { user = Node.ByIndexLookup("User", "Id", userId) })
.Create("user-[:INVITED]->(user2 {newUser})")
.WithParam("newUser", new User { Id = 4567, Name = "Jim" })
.Return(user2 => user2.Node<User>())
.Results;
There are lots more similar examples here: https://github.com/Readify/Neo4jClient/wiki/cypher-examples
Q1b:
" and add B's ID to a legacy index in one request?"
No, legacy indexes are not supported in Cypher. If you really want to keep using them, then you should stick with the CRUD API. That's ok: if you want to use legacy indexes, use the legacy API.
Q2.
"If not, what is the alternative? Is the CRUD API considered legacy code and should we move towards a Cypher-based Neo4j 2.0 approach?"
That's exactly what you want to do. Cypher, with labels and automated indexes:
// One time op to create the index
// Yes, this syntax is a bit clunky in C# for now
graphClient.Cypher
.Create("INDEX ON :User(Id)")
.ExecuteWithoutResults();
// Find an existing user, create a new one, relate them,
// and index them, all in a single HTTP call
graphClient.Cypher
.Match("(user:User)")
.Where((User user) => user.Id == userId)
.Create("user-[:INVITED]->(user2 {newUser})")
.WithParam("newUser", new User { Id = 4567, Name = "Jim" })
.ExecuteWithoutResults();
More examples here: https://github.com/Readify/Neo4jClient/wiki/cypher-examples
Q3.
"Does this Cypher-based approach mean that we lose POCO-to-node translation for create operations? That was very convenient."
Correct. But that's what we collectively all want to do, where Neo4j is going, and where Neo4jClient is going too.
Think about SQL for a second (something that I assume you are familiar with). Do you run a query to find the internal identifier of a node, including its file offset on disk, then use this internal identifier in a second query to manipulate it? No. You run a single query that does all that in one hit.
Now, a common use case for why people like passing around Node<T> or NodeReference instances is to reduce repetition in queries. This is a legitimate concern, however because the fluent queries in .NET are immutable, we can just construct a base query:
public ICypherFluentQuery FindUserById(long userId)
{
return graphClient.Cypher
.Match("(user:User)")
.Where((User user) => user.Id == userId);
// Nothing has been executed here: we've just built a query object
}
Then use it like so:
public void DeleteUser(long userId)
{
FindUserById(userId)
.Delete("user")
.ExecuteWithoutResults();
}
Or, add even more Cypher logic to delete all the relationships too:
public void DeleteUser(long userId)
{
FindUserById(userId)
.OptionalMatch("(user)-[rel]-()")
.Delete("rel, user")
.ExecuteWithoutResults();
}
This way, you can effectively reuse references, but without ever having to pull them back across the wire in the first place.
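The reason this reuse is safe is that each fluent call returns a new query object instead of changing the one it was called on. The Python sketch below illustrates that pattern in isolation. It is not Neo4jClient and not a real Cypher client: the Query class and its methods are invented for the illustration, and it inlines the user ID where a real client would pass a parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Immutable list of clauses; every builder method returns a new Query."""
    clauses: tuple = ()

    def _with(self, clause):
        return Query(self.clauses + (clause,))

    def match(self, pattern):
        return self._with("MATCH " + pattern)

    def where(self, condition):
        return self._with("WHERE " + condition)

    def delete(self, identifiers):
        return self._with("DELETE " + identifiers)

    def text(self):
        return "\n".join(self.clauses)


def find_user_by_id(user_id):
    # Nothing is executed here: this only builds a reusable base query.
    return Query().match("(user:User)").where("user.Id = {}".format(int(user_id)))


base = find_user_by_id(1234)
delete_user = base.delete("user")  # base itself is left untouched
```

Because base never changes, it can be handed to several callers that each extend it differently.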

Google Contacts: Unique Contacts?

I am building an application in which I need to distinguish Google Contacts from one another. Since Google sends contacts as First Name/Last Name/mail etc. (Example) without a unique ID, what would be the best approach to distinguish each contact?
1) Should I create an ID based on all of the contact's fields? A minimal change to any field would break it.
2) Should I create an ID based on First Name + Last Name? Many people have duplicate contacts, and contacts who marry may change their name, which could create a mess.
The reason I am asking is that I am trying to create relations and need to store data such as [person=Darth Vader, subject=Luke Skywalker, type=father(or son)], so I need a fast way to map each contact and retrieve the related contacts quickly.
I believe they do send back an ID. From the return schema:
<link rel='self' type='application/atom+xml' href='https://www.google.com/m8/feeds/contacts/userEmail/full/contactId'/>
You could use the full href value as the ID, or parse the contactId off the end of the URL, whichever you prefer.
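A rough Python sketch of the second option, together with a simple lookup table for the relations mentioned in the question. The function names, the relations structure and the sample IDs are all hypothetical; the only thing taken from the feed is the shape of the self link.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def contact_id_from_href(href):
    """Return the last path segment of a contact's 'self' link."""
    path = urlsplit(href).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


# Relations keyed by contact ID: person_id -> list of (relation_type, subject_id)
relations = defaultdict(list)


def add_relation(person_id, relation_type, subject_id):
    relations[person_id].append((relation_type, subject_id))


# Made-up hrefs in the shape shown above
vader = contact_id_from_href(
    "https://www.google.com/m8/feeds/contacts/userEmail/full/vaderId"
)
luke = contact_id_from_href(
    "https://www.google.com/m8/feeds/contacts/userEmail/full/lukeId"
)
add_relation(vader, "father", luke)
```

Looking up the relations of a contact is then a single dictionary access on its ID, regardless of how its name or e-mail changes.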

Resources