I want to do the following: run an index search, collect all the resulting nodes, paths, etc., store them as a new subgraph, and then run another search on that subgraph.
For example:
First Search
CALL apoc.index.search("cat", "Category.name:fashion") YIELD node AS catg
Second search
CALL apoc.index.search("cat", "Category.name:dresses") on the new resultant graph
The data is very similar to Amazon's taxonomy tree, where fashion is at the top with a tree of subcategories below it. There are multiple root nodes.
Any help or pointers would be appreciated.
I'd recommend altering your data model. Rather than having categories as properties or list items in properties, they could be modeled as :Category nodes. That way a product's categories are defined by the relationships they have to :Category nodes, which also allows easier queries based on category: match on the desired categories, then match on products that have relationships with those categories.
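As a rough sketch of what that could look like for the fashion/dresses example: the :SUBCATEGORY_OF and :IN_CATEGORY relationship types and the :Product label below are assumptions made for illustration, not names from the question.

```cypher
// Assumed schema (hypothetical names):
//   (:Category)-[:SUBCATEGORY_OF]->(:Category)
//   (:Product)-[:IN_CATEGORY]->(:Category)

// First "search": anchor on the root category
MATCH (root:Category {name: 'fashion'})
// Second "search": only look at categories underneath that root
MATCH (sub:Category {name: 'dresses'})-[:SUBCATEGORY_OF*1..]->(root)
// Then collect the products attached to the narrower category
MATCH (p:Product)-[:IN_CATEGORY]->(sub)
RETURN sub, p
```

Because the second MATCH is tied to root, it only ever looks below fashion, so there is no need to store an intermediate subgraph.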
I'm currently ramping up on graph databases and to do that am working through a set of questions to learn Cypher. However, I'm not 100% happy with the design I've chosen since I have to match relationships to nodes to make some of the queries work.
I found "Neo4j: Suggestions for ways to model a graph with shared nodes but has a unique path based on some property", which has some relevant suggestions, but they involve copying (repeating) nodes that in fact represent the same thing. That seems like an update issue waiting to happen.
My design currently has
(:Dept {name,floor})-[:SOLD {quantity}]->(:Item {name,type})<-[:SUPPLIES {dept,volume}]-(:Company {name,address})
As you can see, to figure out which department a company supplied an item to, I have to check the :SUPPLIES dept property. This leads to somewhat awkward queries - it feels that way to me, anyway.
I've tried other relationships, like having (:Company)-[:SUPPLIES {item,vol}]->(:Dept) but then the problem just shifts to matching :SUPPLIES relationship properties to :Item nodes.
The types of queries I am building are of the nature: Find departments that sell all of the items they are supplied.
Is there some other way to model this that I am overlooking? Or is this sort of relationship, where a supplier is related to two things, an item and a department, just something that doesn't fit the graph model very well?
You want to store and query a triangular relationship between :Dept, :Item, and :Company. This can't be accomplished by a linear relationship pattern. Comparing IDs of entities is not the Neo4j way; doing so neglects the strengths of a graph database.
(Assuming that I understood your use case scenario) I would introduce an additional node of type :SupplyEvent that has relationships to :Dept, :Item, and :Company. You could also split up the :SOLD relationship in a similar way, if you want relations between a department, an item, and, e.g., a customer.
Now you can query which companies supplied which items to which departments (without comparing any IDs):
MATCH (company:Company)<-[:SUPPLIED_FROM]-(se:SupplyEvent)-[:SUPPLIED_TO]->(dept:Dept),
(se)-[:SUPPLIED]->(item:Item)
RETURN company, item, dept
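With the same model, the question's example ("find departments that sell all of the items they are supplied") could be sketched roughly as follows. This assumes the original :SOLD relationship from :Dept to :Item is kept as it is in the question.

```cypher
// Collect every item supplied to each department via SupplyEvent nodes
MATCH (dept:Dept)<-[:SUPPLIED_TO]-(:SupplyEvent)-[:SUPPLIED]->(item:Item)
WITH dept, collect(DISTINCT item) AS suppliedItems
// Keep only departments that have a :SOLD relationship to every supplied item
WHERE all(i IN suppliedItems WHERE (dept)-[:SOLD]->(i))
RETURN dept.name
```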
(p:Product {name:"xxx"})-[r:Has_Attribute]->(a:Attributes{name:"xx", price:"xx", shop:"xxx"})
Does it help a lot if I create an index on name, price and shop for the Attributes label?
If my query starts from the Product node, creating these indexes on the Attributes nodes probably doesn't speed it up, since for a given product there is a very limited search space of Attributes nodes.
If my query starts from the Attributes, i.e. the Product is unknown, the index on Attributes will be helpful, since otherwise it needs to search through all the Attributes nodes.
So it depends on my query. Is my understanding right?
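One way to check this is to profile both query shapes and compare the plans. The sketch below assumes Neo4j 4.x or later index syntax, and assumes Product.name is itself indexed so the first query has a cheap starting point; the parameter names are placeholders.

```cypher
// Index syntax for Neo4j 4.x and later; 3.x used CREATE INDEX ON :Attributes(shop)
CREATE INDEX FOR (p:Product) ON (p.name);
CREATE INDEX FOR (a:Attributes) ON (a.shop);

// Product known: the planner can start from the Product index and
// filter the few attached Attributes nodes without needing their index
PROFILE MATCH (p:Product {name: $productName})-[:Has_Attribute]->(a:Attributes)
WHERE a.shop = $shop
RETURN a;

// Product unknown: the planner can seek the Attributes index to find starting nodes
PROFILE MATCH (a:Attributes {shop: $shop})<-[:Has_Attribute]-(p:Product)
RETURN p;
```

The operator at the start of each plan shows whether an index seek or a label scan was used to find the starting nodes.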
I have a database which will store a number of users and the items belonging to those users. Users and items will be stored as nodes. My initial approach was to have a user node with properties username and email, and an item node with properties name and category, with the relationship between them being:
(item)-[:BELONGS_TO]->(user)
After reading an article in the neo4j blog, I moved the category property into a separate node, as it may belong to multiple items.
What I am concerned about is that now in a scenario of thousands of items, category nodes would have thousands of relationships. How would that affect the overall performance if I were to search for a single item and the categories it belongs to?
Dense nodes are indeed an issue (and there's quite a few approaches to increase the performance / solve the issue). Having said that, the denseness here is on the side of the category (1 category having thousands of relationships with items). If your entry point into the graph is the item however ... getting all the categories it belongs to (just a few I imagine) should not cause any problems whatsoever.
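A minimal sketch of that lookup, assuming a hypothetical :IN_CATEGORY relationship from items to categories (the question does not name it):

```cypher
// Start from the single item, then follow its few category relationships.
// The thousands of other relationships on each Category node are never traversed.
MATCH (i:Item {name: $itemName})-[:IN_CATEGORY]->(c:Category)
RETURN c.name
```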
Hope this helps,
Tom
You can avoid having to create Category nodes and the relationships to them by indexing the category property of Item nodes. This would allow you to quickly find all the Items that belong to a single category.
So here's the problem I have with my data model
I have artists, users and tags
Tags are unique data objects that I am storing in nodes.
Users can tag artists with certain tags
I started with the following relationship
(user)-[:tags]->(tag)-[:on]->(artist)
Of course, this fails to identify the user who tagged the artist.
Then I thought of trying the following approach
(user)-[:tags]->(artist)-[:with]->(tag)
Here, I can identify the artist, but cannot identify what the tag for the artist was.
I am a little lost here. I know I could simply go
(user)-[:tags {tagname}]->(artist)
But is there any way of representing the tag as an independent entity while still maintaining the data associated on both ends?
You want a hypergraph with edges connecting more than 2 vertices (the user, the tag, the artist).
However, Neo4j is not a hypergraph implementation, so you'll need to introduce a node representing the "user tag", connected to the 3 nodes with regular relationships:
MATCH (user:User {uuid: {userId}}),
(tag:Tag {uuid: {tagId}}),
(artist:Artist {uuid: {artistId}})
CREATE (user)-[:USER_TAGS]->(userTag:UserTag)-[:USES_TAG]->(tag),
(userTag)-[:TAGS_ARTIST]->(artist)
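Reading the data back could then look something like this sketch, which lists who tagged a given artist and with which tag. It uses the newer $param parameter syntax rather than the {param} form above.

```cypher
// Each UserTag node ties one user, one tag and one artist together
MATCH (user:User)-[:USER_TAGS]->(userTag:UserTag)-[:TAGS_ARTIST]->(artist:Artist {uuid: $artistId}),
      (userTag)-[:USES_TAG]->(tag:Tag)
RETURN user, tag
```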
Neo4j uses the property graph model. Generally:
Because hyperedges are multidimensional, hypergraph models are more generalized than property graphs. Yet, the two are isomorphic, so you can always represent a hypergraph as a property graph (albeit with more relationships and nodes) – whereas you can’t do the reverse.
https://neo4j.com/blog/other-graph-database-technologies/
One option is to break this up into two discrete pieces: the tags that are applied to the artist (assuming a unique constraint on the tag name), and the tagging relationship from the tagger to the taggee. I'm unsure whether your system only allows use of predefined tags, or if users are allowed to dynamically create them.
Let's assume for the moment that tags are predefined: You have nodes with the :Tag label, and you might use queries on that label to generate lists of tags users can use, or to use for autocompletion of tags as the user types.
So say a user wants to tag an artist with a tag. This will trigger an operation that first tags the artist with the tag, and then creates the :tags relationship between the user and the artist.
MATCH (t:Tag {name: $tagname}), (a:Artist {id: $artistID}), (u:User {id: $userID})
MERGE (t)<-[:taggedAs]-(a)
MERGE (u)-[:tags {tagname: $tagname}]->(a)
The advantage of this approach is that you preserve both pieces of information (that a user has tagged an artist with specific tags, and that the artist is tagged as certain tags) in such a way that it's easy to query both pieces of information: given a user and an artist, we can quickly figure out the :tags relationships between them and get the tags with those names. We can also easily query what tags apply to the artist from all users without the expense of having to iterate through every single :tags relationship from all users.
The downside is that tag removal by a user is a more complex operation, possibly with a race condition: the :tags relationship between the user and artist has to be deleted, then all other :tags relationships from other users to that artist need to be checked to see if that tag still applies to the artist, or if we need to remove it. You may need locks on this operation to prevent a race condition. If tag removal by a user is not allowed or is rare, then this could be an acceptable solution.
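A rough sketch of that removal, using the same labels and relationship types as the query above. It does not deal with the race condition itself, so concurrent removals would still need locking or a retry strategy.

```cypher
// Find the tagging relationship to remove
MATCH (u:User {id: $userID})-[r:tags {tagname: $tagname}]->(a:Artist {id: $artistID})
// Count how many other users applied the same tag to this artist
OPTIONAL MATCH (other:User)-[:tags {tagname: $tagname}]->(a)
WHERE other <> u
WITH r, a, count(other) AS otherTaggers
DELETE r
// Only if nobody else uses the tag, drop the artist-level :taggedAs relationship
WITH a, otherTaggers
WHERE otherTaggers = 0
MATCH (a)-[ta:taggedAs]->(:Tag {name: $tagname})
DELETE ta
```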
You can tag the artist and use a tagId to identify which tag was applied:
(user)-[:tags {tagId}]->(artist)-[:with]->(tag {tagId})
Maybe it is a long shot but worth trying...
I have the following relation: User1-[:MATCHED]-User2. I want to allow other users to give feedback (Like) on that relation. I am guessing that the obvious answer is to define a new node of type Match, created for every two matched users, and then relate to that node with a LIKE relation from each user who liked the match.
I am trying to think of another way to model that in the graph without the overhead of creating a new node for each match...
Can a relation relate to other nodes except the start/end nodes?
Any help will be appreciated, thanks.
Neo4j does not support hypergraphs or relationships to relationships. Modelling your MATCHED relationship with a node is probably the way to go.
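A minimal sketch of the node-based approach. The :IN_MATCH relationship type, the id property and the parameter names are assumptions made for illustration.

```cypher
MATCH (u1:User {id: $user1Id}), (u2:User {id: $user2Id}), (liker:User {id: $likerId})
// Reuse the Match node shared by both users, or create it with both relationships
MERGE (u1)-[:IN_MATCH]->(m:Match)<-[:IN_MATCH]-(u2)
// Any other user can now like the match itself
MERGE (liker)-[:LIKES]->(m)
```

Counting likes for a match is then an ordinary traversal from the Match node.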
An alternative is to reference the relationship id from another node:
User1-[MATCHED]->User2 (where MATCHED has the id xyz)
User3-[LIKES]->Relationship(relId = xyz)
The "Relationship" node would contain the id of the MATCHED relationship as a property. This relId property would need to be indexed to find all LIKES of a given MATCHED relationship.
This solution is not well suited for traversals though.