(p:Product {name:"xxx"})-[r:Has_Attribute]->(a:Attributes{name:"xx", price:"xx", shop:"xxx"})
Would it help much to create indexes on name, price, and shop for the Attributes label?
If my query starts from the Product node, indexes on the Attributes nodes probably won't speed it up, since a given product has only a small set of Attributes nodes to search.
If my query starts from the Attributes, i.e. the Product is unknown, the indexes on Attributes will help, since otherwise every Attributes node has to be searched.
So it depends on my query. Is my understanding right?
I'm currently ramping up on graph databases and to do that am working through a set of questions to learn Cypher. However, I'm not 100% happy with the design I've chosen since I have to match relationships to nodes to make some of the queries work.
I found Neo4j: Suggestions for ways to model a graph with shared nodes but has a unique path based on some property with some suggestions that are relevant, but they involve copying nodes (repeating them) when in fact they do represent the same thing. That seems like an update issue waiting to happen.
My design currently has
(:Dept {name,floor})-[:SOLD {quantity}]->(:Item {name,type})<-[:SUPPLIES {dept,volume}]-(:Company {name,address})
As you can see, to figure out which department a company supplied an item to, I have to check the :SUPPLIES dept property. This leads to somewhat awkward queries - it feels that way to me, anyway.
I've tried other relationships, like having (:Company)-[:SUPPLIES {item,vol}]->(:Dept) but then the problem just shifts to matching :SUPPLIES relationship properties to :Item nodes.
The types of queries I am building are of the nature: Find departments that sell all of the items they are supplied.
Is there some other way to model this that I am overlooking? Or is this sort of relationship, where a supplier is related to two things, an item and a department, just something that doesn't fit the graph model very well?
You want to store and query a triangular relationship between :Dept, :Item, and :Company. This can't be accomplished by a linear relationship pattern. Comparing IDs of entities is not the Neo4j way; doing so neglects the strengths of a graph database.
(Assuming that I understood your use case scenario) I would introduce an additional node of type :SupplyEvent that has relationships to :Dept, :Item, and :Company. You could also split up :SOLD relationship in a similar way, if you want relations between department, item, and, e.g., a customer.
Now, you can query all companies that supplied which items to which departments (without comparing any IDs):
MATCH (company:Company)<-[:SUPPLIED_FROM]-(se:SupplyEvent)-[:SUPPLIED_TO]->(dept:Dept),
(se)-[:SUPPLIED]->(item:Item)
RETURN company, item, dept
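To make the shape of this model concrete outside of Neo4j, here is a minimal in-memory sketch in Python. It treats each :SupplyEvent as a record linking a company, an item, and a department, and answers the original question ("find departments that sell all of the items they are supplied"). The sample data, the `SupplyEvent` tuple, and the function name are hypothetical and only illustrate the idea; it is not a substitute for the Cypher query.

```python
from collections import defaultdict
from typing import NamedTuple


class SupplyEvent(NamedTuple):
    """Stand-in for a :SupplyEvent node and its three relationships."""
    company: str
    item: str
    dept: str


# Hypothetical sample data, invented for illustration.
supply_events = [
    SupplyEvent("Acme", "Tent", "Outdoors"),
    SupplyEvent("Acme", "Stove", "Outdoors"),
    SupplyEvent("Bolt", "Lamp", "Home"),
]
# (dept, item) pairs standing in for the :SOLD relationships.
sold = {("Outdoors", "Tent"), ("Home", "Lamp")}


def depts_selling_all_supplied(events, sold_pairs):
    """Return departments whose supplied items are a subset of their sold items."""
    supplied = defaultdict(set)
    for ev in events:
        supplied[ev.dept].add(ev.item)
    sold_by = defaultdict(set)
    for dept, item in sold_pairs:
        sold_by[dept].add(item)
    return sorted(d for d, items in supplied.items() if items <= sold_by[d])


print(depts_selling_all_supplied(supply_events, sold))
```

Because the event record carries all three endpoints, no property on one relationship has to be matched against a node elsewhere in the pattern.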
I have a database which will store a number of users and the items belonging to those users. Users and items will be stored as nodes. My initial approach was to have a user node with the properties username and email, and an item node with the properties name and category, with the relationship between them being:
(item)-[:BELONGS_TO]->(user)
After reading an article in the neo4j blog, I moved the category property into a separate node, as it may belong to multiple items.
What I am concerned about is that now in a scenario of thousands of items, category nodes would have thousands of relationships. How would that affect the overall performance if I were to search for a single item and the categories it belongs to?
Dense nodes are indeed an issue (and there are quite a few approaches to increase performance or work around the problem). Having said that, the denseness here is on the side of the category (one category having thousands of relationships with items). If your entry point into the graph is the item, however, getting all the categories it belongs to (just a few, I imagine) should not cause any problems whatsoever.
Hope this helps,
Tom
You can avoid having to create Category nodes and the relationships to them by indexing the category property of Item nodes. This would allow you to quickly find all the Items that belong to a single category.
Hi guys, I would like to know: if I create a unique index on two columns in PostgreSQL, does that unique index also work as a normal index for each of the two columns, or do I have to create the unique index plus two more indexes, one per column, as shown in the code below? I want a unique index on talent_id, job_id, and both columns should also be indexed separately. I have read many resources but have not found a clear answer.
add_index :talent_actions, [:talent_id, :job_id], unique: true
Does the code above also cover the indexes below, or do I have to add them separately?
add_index :talent_actions, :talent_id
add_index :talent_actions, :job_id
Thank you.
An index is an object in the database, which can be used to look up data faster, if the query planner decides it will be appropriate. So the trivial answer to your question is "no", creating one index will not result in the same structures in the database as creating three different indexes.
I think what you actually want to know is this:
Do I need all three indexes, or will the unique index already optimise all queries?
This, as with any database optimisation, depends on the queries you run, and the data you have.
Here are some considerations:
The order of columns in a multi-column index matters. If you have an index of people sorted by surname then first name, then you can use it to search for everybody with the same surname; but you probably can't use it to search for somebody when you only know their first name.
Data distribution matters. If everyone in your list has the surname "Smith" or "Jones", then you can use a surname-first index to search for a first name fairly easily (just look up under Jones, then under Smith).
Index size matters. The fewer columns an index has, the more of it fits in memory at once, so the faster it will be to use.
Often, there are multiple indexes the query planner could use, and its job is to estimate the cost of the above factors for the query you've written.
Usually, it doesn't hurt to create multiple indexes which you think might help, but it does use up disk space, and occasionally can cause the query planner to pick a worse plan. So the best approach is always to populate a database with some real data, and look at the query plans for some real queries.
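As a minimal sketch of "look at the query plans", the snippet below builds the table with only the two-column unique index and asks the planner how it would run a query on each column alone. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in, which is an assumption made so the example runs anywhere: PostgreSQL's planner is a different piece of software and may choose differently, so the same check should be repeated there with EXPLAIN on real data. The `action` column and the index name `idx_talent_job` are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE talent_actions ("
    "id INTEGER PRIMARY KEY, talent_id INTEGER, job_id INTEGER, action TEXT)"
)
# Only the composite unique index exists; no single-column indexes.
conn.execute(
    "CREATE UNIQUE INDEX idx_talent_job ON talent_actions (talent_id, job_id)"
)


def plan(sql):
    """Return the planner's description of how it would execute sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)  # column 3 holds the detail text


# Filter on the leading column of the index.
leading_plan = plan("SELECT * FROM talent_actions WHERE talent_id = 1")
# Filter on the second column only.
trailing_plan = plan("SELECT * FROM talent_actions WHERE job_id = 1")

print("talent_id only:", leading_plan)
print("job_id only:   ", trailing_plan)
```

In this setup the filter on the leading column is reported as an index search, while the filter on job_id alone is reported as a scan. That matches the column-order point above: a separate index on job_id is worth considering if queries filter on it by itself, whereas a separate index on talent_id is less likely to add much.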
I want to do the following: run an index search and collect all the nodes, paths, etc., store the resulting subgraph, and then run another search on that new subgraph.
For example:
First Search
CALL apoc.index.search("cat", "Category.name:fashion") YIELD node AS catg
Second search
CALL apoc.index.search("cat", "Category.name:dresses") on the new resultant graph
The data is very similar to Amazon's Taxonomy tree, where the top is fashion and then it has tree below it. So there are multiple Root nodes.
Any help or pointers would be appreciated.
I'd recommend altering your data model. Rather than having categories as properties or list items in properties, they could be modeled as :Category nodes. That way a product's categories are defined by the relationships they have to :Category nodes, which also allows easier queries based on category: match on the desired categories, then match on products that have relationships with those categories.
In Xcode you can add "Indexes" for an entity in the data model inspector.
For the screenshot I did hit "add" twice so "comma,separated,properties" is just the default value.
What exactly are those indexes?
Do they have anything to do with indexed attributes? And if they have what is the difference between specifying the Indexes in this inspector and selecting "Indexed" for the individual attribute?
Optimizing Core Data searches and sorts
As the title says, indexing speeds up searching and sorting in your database. However, it slows down saving changes to the persistent store. It matters when your queries use NSPredicate and NSSortDescriptor objects.
Let's say you have two entities: PBOUser and PBOLocation (many to many). You can see its properties at the image below:
Suppose that the database holds 10,000 users and 50,000 locations. Now we need to find every user whose email starts with a. If we run such a query without indexing, Core Data must check every record (all 10,000).
But what if it is indexed (in other words, kept sorted by email)? Then Core Data checks only those records that start with a. Once Core Data reaches b, it stops searching, because the sort order guarantees that no further records have an email starting with a.
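The idea of stopping early can be sketched with a plain sorted list in Python. This is only an illustration of the principle: a real index is typically a tree structure maintained by the store, and the email addresses and function name below are hypothetical.

```python
from bisect import bisect_left

# Hypothetical data; the sorted list plays the role of the index.
emails = sorted([
    "bob@example.com",
    "alice@example.com",
    "carol@example.com",
    "adam@example.com",
])


def emails_with_prefix(sorted_emails, prefix):
    """Collect entries starting with prefix, stopping at the first non-match."""
    start = bisect_left(sorted_emails, prefix)  # jump to the first candidate
    matches = []
    for email in sorted_emails[start:]:
        if not email.startswith(prefix):
            break  # sorted order: nothing later can match
        matches.append(email)
    return matches


print(emails_with_prefix(emails, "a"))
```

Without the sort order, the loop would have to visit every entry, which is the unindexed case described above.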
How to enable indexing on a Core Data model from within Xcode:
or:
Hopefully they are equivalent:-)
But what if you wanted emails starting with a and names starting with b? You can do this by checking Indexed for the name property of the PBOUser entity, or:
This is how you can optimise your database:-)
Use the Indexes list to add compound indexes to the entity. A compound index is an index that spans multiple attributes or relationships. A compound index can make searching faster. The names of attributes and relationships in your data model are the most common indexes. You must use the SQLite store to use compound indexes.
Adding a row with a single attribute to the Indexes list is equivalent to selecting Indexed for that attribute: It creates an index for the attribute to speed up searches in query statements.
The Indexes list is meant for compound indexes. Compound indexes are useful when you know that you will be searching for values of these attributes combined in the WHERE clause of a query:
SELECT * FROM customer WHERE surname = 'Doe' AND firstname = 'Joe';
This statement could make use of a compound index surname, firstname. That index would also be useful if you just search for surname, but not if you only search for firstname. Think of the index as if it were a phone book: It is sorted by surname first, then by first name. So the order of attributes is important.
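The phone book analogy can be sketched in a few lines of Python, with a list of (surname, firstname) tuples sorted in that order standing in for the compound index. The names and helper functions are hypothetical, and a real store uses its own index structure, so this only shows why attribute order matters.

```python
from bisect import bisect_left, bisect_right

# Hypothetical phone book: sorted by surname first, then by first name.
index = sorted([
    ("Doe", "Joe"),
    ("Doe", "Ann"),
    ("Smith", "Joe"),
    ("Adams", "Zoe"),
])
surnames = [surname for surname, _ in index]  # leading key, same order as index


def find_by_surname(surname):
    """Binary search works because surname is the leading sort key."""
    lo = bisect_left(surnames, surname)
    hi = bisect_right(surnames, surname)
    return index[lo:hi]


def find_by_both(surname, firstname):
    """The full key is also sorted, so an exact lookup is a binary search."""
    pos = bisect_left(index, (surname, firstname))
    return index[pos:pos + 1] if index[pos:pos + 1] == [(surname, firstname)] else []


def find_by_firstname(firstname):
    """First names are not contiguous in this order: every entry is inspected."""
    return [entry for entry in index if entry[1] == firstname]


print(find_by_surname("Doe"))
print(find_by_both("Doe", "Joe"))
print(find_by_firstname("Joe"))
```

The first two lookups narrow the search by bisection, while the first-name lookup degrades to a full pass over the list, which is the behavior the paragraph above describes.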