I am building a database of genes in Neo4j, with the main gene name as each node's unique identifier. However, each gene can go by other names, and I want those names to be searchable in the database as well.
I saw there might be a way to do this by indexing the alternate names so that they relate to the primary name, but I am not sure how to go about this. The two options I have seen are:
Create new :Alias nodes for each alternate gene name, relate them to the primary node, then index that relationship.
Add an array of the alternate names as a property of the primary node, and index it that way.
Is there a correct method for doing this? Thanks
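To make the goal concrete, here is a minimal in-memory sketch of what either option has to provide: a lookup from any searchable name to the primary gene name. It is plain Python rather than Neo4j, the class and method names are hypothetical, and the sample genes are only illustrative. It also assumes an alias never points to two different genes, which may not hold for real gene data.

```python
class GeneCatalog:
    """Toy stand-in for the graph: primary gene names plus an alias index."""

    def __init__(self):
        self.genes = {}        # primary name -> set of alternate names
        self.alias_index = {}  # any searchable name (lowercased) -> primary name

    def add_gene(self, primary, aliases=()):
        self.genes[primary] = set(aliases)
        # Index the primary name and every alias under the same primary key.
        for name in (primary, *aliases):
            self.alias_index[name.lower()] = primary

    def find(self, name):
        """Return the primary name for a primary or alternate name, else None."""
        return self.alias_index.get(name.lower())


catalog = GeneCatalog()
catalog.add_gene("TP53", ["p53", "LFS1"])
catalog.add_gene("ERBB2", ["HER2", "NEU"])
```

Searching by an alias, for example catalog.find("her2"), resolves to the primary name, which is the behaviour both the :Alias-node option and the array-property option are trying to achieve.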
(p:Product {name:"xxx"})-[r:Has_Attribute]->(a:Attributes{name:"xx", price:"xx", shop:"xxx"})
Does it help much if I create indexes on name, price and shop for the Attributes label?
If my query starts from the Product node, creating these indexes on the Attributes nodes probably doesn't speed it up, since for a given product there is a very limited search space of Attributes nodes.
If my query starts from the Attributes, i.e. the Product is unknown, then the index on Attributes will be helpful, since the query otherwise has to search all the Attributes nodes.
So it depends on my query. Is my understanding right?
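The reasoning above can be illustrated with a small sketch in plain Python (not Neo4j). The data and function names are hypothetical. It contrasts a lookup anchored at a known product, an unanchored scan, and an unanchored lookup through an index on the attribute name.

```python
from collections import defaultdict

# Hypothetical data: each product points at a handful of attribute records.
product_attributes = {
    "laptop": [{"name": "color", "shop": "A"}, {"name": "ram", "shop": "B"}],
    "phone": [{"name": "color", "shop": "B"}],
    "desk": [{"name": "width", "shop": "A"}],
}


def attributes_of(product, name):
    """Anchored at a known product: only that product's few attributes are checked."""
    return [a for a in product_attributes[product] if a["name"] == name]


def scan_products_with(name):
    """No anchor and no index: every attribute of every product is examined."""
    return sorted(
        p for p, attrs in product_attributes.items()
        if any(a["name"] == name for a in attrs)
    )


# An index on the attribute name turns the unanchored lookup into a direct hit.
name_index = defaultdict(set)
for product, attrs in product_attributes.items():
    for attr in attrs:
        name_index[attr["name"]].add(product)


def indexed_products_with(name):
    """Same answer as the scan, but without touching unrelated attributes."""
    return sorted(name_index.get(name, ()))
```

The anchored lookup never benefits from the index because its search space is already tiny, while the unanchored one does.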
I want to run an index search, collect all the resulting nodes, paths etc., store them as a new subgraph, and then run another search on that new subgraph.
For example:
First Search
CALL apoc.index.search("cat", "Category.name:fashion") YIELD node AS catg
Second search
CALL apoc.index.search("cat", "Category.name:dresses") on the new resultant graph
The data is very similar to Amazon's taxonomy tree, where the top is fashion and there is a tree below it. There are multiple root nodes.
Any help or pointers would be appreciated.
I'd recommend altering your data model. Rather than having categories as properties or list items in properties, they could be modeled as :Category nodes. That way a product's categories are defined by the relationships they have to :Category nodes, which also allows easier queries based on category: match on the desired categories, then match on products that have relationships with those categories.
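As a rough illustration of why category nodes make the "search within a search" easier, here is a sketch in plain Python (not Cypher). The taxonomy and function names are hypothetical. The first step selects a root category, and the second step searches only among its descendants.

```python
# Hypothetical taxonomy: parent category -> child categories (multiple roots allowed).
children = {
    "fashion": ["women", "men"],
    "women": ["dresses", "skirts"],
    "men": ["suits"],
    "toys": ["dolls", "doll dresses"],
}


def descendants(root):
    """Collect every category reachable below root (iterative depth-first walk)."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def search_within(root, term):
    """Second search, restricted to the subtree found by the first."""
    return sorted(c for c in descendants(root) if term in c)
```

Here search_within("fashion", "dresses") only sees categories under fashion, so a similarly named category under another root is not returned.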
Consider Person nodes and Item nodes.
What is the best way to prevent having both 'Purchased' type relationships and 'Bought' type relationships in the graph that have the same meaning, but are simply named differently?
E.g. if we end up with our graph in a state like:
(Alice)-[:Bought]->(Pickles)
(Bob)-[:Purchased]->(Pickles)
and I want to know everyone who has bought a jar of pickles. Clearly someone made a mistake when creating one of these relationships. How do I prevent that class of mistake?
Limit the relationships a user can create to a specific set of names, and don't allow any other relationship names.
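One way to apply this at the application layer is to route every relationship creation through a single function that checks the type against a fixed set. The sketch below is plain Python with hypothetical names (RelType, Graph, relate). It is not a Neo4j API, and the allowed set is only an example.

```python
from enum import Enum


class RelType(Enum):
    """The only relationship names the application may create."""
    PURCHASED = "PURCHASED"
    REVIEWED = "REVIEWED"


class Graph:
    def __init__(self):
        self.relationships = []  # (start, RelType, end) triples

    def relate(self, start, rel_type, end):
        # RelType(...) raises ValueError for any name outside the allowed set.
        checked = rel_type if isinstance(rel_type, RelType) else RelType(rel_type)
        self.relationships.append((start, checked, end))

    def who(self, rel_type, end):
        """Everyone with the given relationship type to the end node."""
        return sorted(s for s, t, e in self.relationships if t is rel_type and e == end)
```

With this in place, an attempt such as relate("Carol", "BOUGHT", "Pickles") fails immediately instead of silently creating a second relationship type with the same meaning.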
Maybe it is a long shot but worth trying...
I have the following relationship: User1-[:MATCHED]-User2. I want to allow other users to give feedback (a Like) on that relationship. I am guessing the obvious answer is to define a new node of type Match, created for every two matched users, and then have each user who liked the match relate to that node with a LIKE relationship.
I am trying to think of another way to model this in the graph without the overhead of creating a new node for each match...
Can a relationship relate to nodes other than its start/end nodes?
Any help will be appreciated thanks.
Neo4j does not support hypergraphs or relationships to relationships. Modelling your MATCHED relationship with a node is probably the way to go.
An alternative is to reference the relationship id from another node:
User1-[MATCHED]->User2 (where MATCHED has the id xyz)
User3-[LIKES]->Relationship(relId = xyz)
The "Relationship" node would contain the id of the MATCHED relationship as a property. This relId property would need to be indexed to find all LIKES of a given MATCHED relationship.
This solution is not well suited for traversals though.
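The first suggestion, modelling the match as a node, can be sketched as follows. This is plain Python with hypothetical names (Match, match_users, like), not Neo4j code. The match is treated as undirected, so the pair of users forms the key.

```python
from dataclasses import dataclass, field


@dataclass
class Match:
    """Intermediate node standing in for the MATCHED relationship."""
    user_a: str
    user_b: str
    liked_by: set = field(default_factory=set)


matches = {}  # frozenset of the two users -> Match node


def match_users(a, b):
    """Create the Match node for a pair of users, or return the existing one."""
    key = frozenset((a, b))  # undirected: (a, b) and (b, a) are the same match
    if key not in matches:
        matches[key] = Match(*sorted((a, b)))
    return matches[key]


def like(user, a, b):
    """Record that user likes the match between a and b."""
    matches[frozenset((a, b))].liked_by.add(user)
```

Because each like hangs off the Match node, finding everyone who liked a given match is a direct lookup on that node rather than a search by stored relationship id.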
I need help creating an appropriate database structure that will allow me to dynamically create "fields" and "values". I plan on using the following 5 tables.
TraitCategories
Groups
TraitGroupings
People
TraitValues
TraitCategories table holds only categories (i.e. "fields") of traits -- i.e. hair color, height, etc. -- and the categories can be added/removed as desired.
Groups table holds ad hoc/dynamic group labels -- i.e. Asian, South American, etc.
TraitGroupings is the join table for TraitCategories and Groups
The People table will be linked to the Groups table via a foreign key and thus will be assigned various categories (fields) of traits by leveraging the relationship between the Groups and TraitCategories tables.
But the question is, how do I assign per person values to the trait categories/fields?
I was thinking of having each row in the TraitValues table contain person_id and trait_category_id so that there will be a relationship between the TraitValues table and both the People and TraitCategories tables. Does this approach make sense? Will this approach allow me to get trait categories and values via the People table?
You are describing a form of EAV.
I'm not sure how practical this is going to be for representing in Ruby, but in your case, the database model would look similar to this:
(Most non-key fields omitted, for brevity.)
Note how we abundantly use the identifying relationships. This is what lets us propagate GroupId down both sides of the "diamond-shaped" dependency, and merge it into a single field at the bottom, in TraitValue.
This is what ensures a person cannot have a trait, unless it is also listed for that person's group. For example, a person can have a "hair color" only if the person's group has the "hair color" as well.
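As a rough sketch of how such a model could be declared, the following uses Python's built-in sqlite3 module. The column names (group_id, person_no, trait_category_id, value) and the sample rows are assumptions made for illustration and do not come from the original model. The key point is that TraitValues carries a single group_id that takes part in both composite foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Groups (group_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE TraitCategories (trait_category_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE TraitGroupings (
    group_id INTEGER NOT NULL REFERENCES Groups (group_id),
    trait_category_id INTEGER NOT NULL REFERENCES TraitCategories (trait_category_id),
    PRIMARY KEY (group_id, trait_category_id)
);
CREATE TABLE People (
    group_id INTEGER NOT NULL REFERENCES Groups (group_id),
    person_no INTEGER NOT NULL,
    PRIMARY KEY (group_id, person_no)
);
CREATE TABLE TraitValues (
    group_id INTEGER NOT NULL,          -- shared by both foreign keys below
    person_no INTEGER NOT NULL,
    trait_category_id INTEGER NOT NULL,
    value TEXT,
    PRIMARY KEY (group_id, person_no, trait_category_id),
    FOREIGN KEY (group_id, person_no) REFERENCES People (group_id, person_no),
    FOREIGN KEY (group_id, trait_category_id)
        REFERENCES TraitGroupings (group_id, trait_category_id)
);
INSERT INTO Groups VALUES (1, 'Group A'), (2, 'Group B');
INSERT INTO TraitCategories VALUES (1, 'hair color'), (2, 'height');
INSERT INTO TraitGroupings VALUES (1, 1), (2, 2);  -- only group 1 lists hair color
INSERT INTO People VALUES (1, 1), (2, 1);
""")


def set_trait(group_id, person_no, trait_category_id, value):
    """Insert a trait value; the foreign keys reject traits not listed for the group."""
    conn.execute(
        "INSERT INTO TraitValues VALUES (?, ?, ?, ?)",
        (group_id, person_no, trait_category_id, value),
    )


set_trait(1, 1, 1, "brown")  # accepted: group 1 lists hair color
```

Calling set_trait(2, 1, 1, "black") raises sqlite3.IntegrityError, because group 2 has no TraitGroupings row for hair color.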
BTW...
The People table will be linked to the TraitGroupings via a foreign key -- and thus will be assigned various categories (fields) of traits.
If People has a FK that directly references TraitGroupings, then a person can have at most one trait grouping and therefore at most one trait category. From the wording of your question, that doesn't appear to be what you want.