I am trying to analyze user behaviour on clickstream data using Neo4j and my use case is to create Flow-Chart/Sankey chart on user's journey from a specific node that user start browsing to N next clicks; my data model and query pattern is similar to this article.
MATCH p=((s:Page{type:"Home"})-[:NEXT*..10]->(d:Page))
where length(p)>3
RETURN p
I am unable to figure out approach for-
How to extract all pairs of nodes within p
I need count of each pairs from step1 to create Flow chart .
I tried various functions and able to retrieve above pairs but there few (<1% anomalies) circular references such that user1 and user2 reach same destination with exact reverse flow that break flow chart as flow charts are directional. Any suggestions on how to filter these anomalies.
Any suggestions on how to implement above would be appreciated. It need not be exact answer but pseudo code or reference articles would help. Thanks
Related
I am working on a dating app where users can "like" or "dislike" other users and get matched.
As you can imagine the most important query of the app would be:
Give me a stack of nearby user profiles that I have NOT liked/disliked before.
I tried to work on this with a document database (Firestore) and figured it's simply not suitable for such kind of application and hence landed in the graph database world which is new and fascinating to me.
I understand that by nature a graph database retrieves data by tracing through the relationships and make relationships first-class citizens. My question now is that what if the nodes that I am trying to get are those with no relationship from the given node? What would the query look like? Can anyone provide an example query?
Edit:
- added nearby criteria to the query statement
This is definitely possible, here is a query example :
MATCH (me:Profile {name: "Chris"})
MATCH (other:Profile) WHERE NOT (other)-[:LIKES]->(me)
As stated in the comments of your original question, on a large dataset it might not scale well, that said it is pretty uncommon that you would use only one criteria for matching, for example, the list of possible profiles to match from can be grouped by :
geolocation
profiles in depth 2 ( who is liking me, then find who other people they like, do those people like me ? )
shared interests
age group
skin color
...
I've got my graph database, populated with nodes, relationships, properties etc. I'd like to see an overview of how the whole database is connected, each relationship to each node, properties of a node etc.
I don't mean view each individual node, but rather something like an ERD from a relational database, something like this, with the node labels. Is this possible?
You can use the metadata by running the command call db.schema().
In Neo4j v4 call db.schema() is deprecated, you can now use call db.schema.visualization()
As far as I know, there is no straight-forward way to get a nicely pictured diagram of a neo4j database structure.
There is a pre-defined query in the neo4j browser which finds all node types and their relationships. However, it traverses the complete graph and may fail due to memory errors if you have to much data.
Also, there is neoprofiler. It's a tool which claims to so what you ask. I never tried and it didn't get too many updates lately. Still worth a try: https://github.com/moxious/neoprofiler
Even though this is not a graphical representation, this query will give you an idea on what type of nodes are connected to other nodes with what type of relationship.
MATCH (n)
OPTIONAL MATCH (n)-[r]->(x)
WITH DISTINCT {l1: labels(n), r: type(r), l2: labels(x)}
AS `first degree connection`
RETURN `first degree connection`;
You could use this query to then unwind the labels to write that next cypher query dynamically (via a scripting language and using the REST API) and then paste that query back into the neo4j browser to get an example set of the data.
But this should be good enough to get an overview of your graph. Expand from here.
I want to find the spanning tree from graph with loops. I cannot use regular bfs traversal here. so I check the allsimplepaths java function api, It seems find loop between two nodes. right now i select a random root, but don't know the end points. so i just want to get the spanning tree from graph while the it has many loops maybe. so it should convert to DAG and then give the tree structures. The graph may have more than one spanning tree.
how to do this? can allsimplepaths applied here?
Look at TraversalDescription with an appropriate uniqueness (NODE_GLOBAL) and Path-Expanders that follow the interesting relationships.
I'm trying to query Book nodes for recommendation by Cypher.
I want to recommend A:Book and C:Book for A:User.
i'm sorry I need some graph to explain this question, but I could't up graph image because my lepletion lacks for upload function.
I wrote query below.
match (u1:User{uid:'1003'})-->(o1:Order)-->(b1:Book)<--(o2:Order)
<--(u2:User)-->(o3:Order)-->(b2:Book)
return b2
This query return all Books(A,B,C,D) dispite cypher's Uniqueness.
I expect to only return A:Book and C:Book.
Is this behavior Neo4j' specification?
How do I get expected return? Thanks, everyone.
environment:
Neo4j ver.v2.0.0-RC1
Using Neo4j Server with REST API
Without the sample graph its hard to say why you get something back when you expected something else. You can share a sample graph by including a create statement that would generate said graph, or by creating it in Neo4j console and putting the link in your question. Here is an example of the latter: console.neo4j.org/r/fnnz6b
In the meantime, you probably want to declare the type of the relationships in your pattern. If a :User has more than one type of outgoing relationships you will be excluding those other paths based on the labels of the nodes on the other end, which is much less efficient than to only traverse the right relationships to begin with.
To my mind its not clear whether (u:User)-->(o:Order)-->(b:Book) means that a user has one or more orders, and each order consists of one or more books; or if it means only that a user ordered a book. If you can share a sample, hopefully that will be clear too.
Edit:
Great, so looking at the graph: You get B and D back because others who bought B also bought D, and others who bought D also bought B, which is your criterion for recommendation. You can add a filter in the WHERE clause to exclude those books that the user has already bought, something like
WHERE NOT (u1)-[:BUY]->()-[:CONTAINS]->(b2)
This will give you A, C, C back, since there are two matching paths to C. It's probably not important to get two result items for C, so you can either limit the return to give only distinct values
RETURN DISTINCT(b2)
or group the return values by counting the matching paths for each result as a 'recommendation score'
RETURN b2, COUNT(b2) as score
Also, if each order only [CONTAINS] one book, you could try modelling without order, just (:User)-[:BOUGHT]->(:Book).
I'm pretty new to neo4j and the spatial plugin for it so bear with me.
I've used the OSM importer to import the whole of Ireland into the db and now I'm able to query it with the rest API to find nodes within X km of a point. (side note: I am unable to get the Cypher query to return any results? Does the OSMImporter add the data to an index for querying or must I loop through it all and add to an index myself now?)
What I actually want is a rudimental reverse geocoder style query. I want to query the graph for the geometry that contains a geo coordinate, check if this is a town/city/village etc, if not check its ancestors until it is able to tell me what town/county/state I am inside.
Unfortunately I'm quite lost and I've tried, unsuccessfully, looking through the neo4j-spatial code and examples for a start point.