Lazy propagation range update - segment-tree

I was reading about lazy propagation on GFG, and it says the following about range updates:
For example consider the node with value 27 in above diagram, this node stores sum of values at indexes from 3 to 5. If our update query is for range 2 to 5, then we need to update this node and all descendants of this node
[Image: segment tree diagram]
I don't understand: if the range is 2 to 5, why are we supposed to update only the node with value 27, and not the other nodes that also contain index 2 in their ranges?
Link to the article

First, could you provide a link to the article?
With lazy propagation you would have to update the node with value 3 and the node with value 27 to process the update query for the range [2, 5]. However, you update the subtree of the node with value 27 only implicitly, by updating the lazy value of the node with value 27. I assume the text just doesn't mention the node with value 2; from what you have quoted, updates to the other nodes that contain index 2 are not explicitly excluded.
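To make the point concrete, here is a minimal sketch of a sum segment tree with range-add updates and lazy propagation. The class name, the `touched` list, and the input array `[1, 3, 5, 7, 9, 11]` are my own assumptions for illustration (the array is chosen so that indexes 3 to 5 sum to 27, as in the quoted text); it is not the article's code.

```python
class LazySegmentTree:
    """Sum segment tree with range-add updates and lazy propagation (sketch)."""

    def __init__(self, values):
        self.n = len(values)
        self.tree = [0] * (4 * self.n)  # tree[node] = sum of the node's range
        self.lazy = [0] * (4 * self.n)  # pending addition for the node's children
        self.touched = []               # ranges fully covered by the last update
        self._build(1, 0, self.n - 1, values)

    def _build(self, node, lo, hi, values):
        if lo == hi:
            self.tree[node] = values[lo]
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid, values)
        self._build(2 * node + 1, mid + 1, hi, values)
        self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def _push(self, node, lo, hi):
        # Hand a pending addition down to the two children, one level only.
        if self.lazy[node] and lo != hi:
            mid = (lo + hi) // 2
            for child, c_lo, c_hi in ((2 * node, lo, mid), (2 * node + 1, mid + 1, hi)):
                self.tree[child] += self.lazy[node] * (c_hi - c_lo + 1)
                self.lazy[child] += self.lazy[node]
        self.lazy[node] = 0

    def update(self, left, right, delta):
        """Add delta to every index in [left, right]."""
        self.touched = []
        self._update(1, 0, self.n - 1, left, right, delta)

    def _update(self, node, lo, hi, left, right, delta):
        if right < lo or hi < left:
            return  # no overlap
        if left <= lo and hi <= right:
            # Fully covered: fix this node's sum, remember the pending
            # addition, and do NOT descend into the subtree.
            self.tree[node] += delta * (hi - lo + 1)
            self.lazy[node] += delta
            self.touched.append((lo, hi))
            return
        self._push(node, lo, hi)
        mid = (lo + hi) // 2
        self._update(2 * node, lo, mid, left, right, delta)
        self._update(2 * node + 1, mid + 1, hi, left, right, delta)
        self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def query(self, left, right):
        """Return the sum over [left, right]."""
        return self._query(1, 0, self.n - 1, left, right)

    def _query(self, node, lo, hi, left, right):
        if right < lo or hi < left:
            return 0
        if left <= lo and hi <= right:
            return self.tree[node]
        self._push(node, lo, hi)
        mid = (lo + hi) // 2
        return (self._query(2 * node, lo, mid, left, right)
                + self._query(2 * node + 1, mid + 1, hi, left, right))


tree = LazySegmentTree([1, 3, 5, 7, 9, 11])
tree.update(2, 5, 10)
print(tree.touched)      # [(2, 2), (3, 5)]
print(tree.query(3, 5))  # 57
```

In this sketch the update on [2, 5] stops at two nodes: the leaf for index 2 and the node covering [3, 5]. Their ancestors get their sums recomputed on the way back up, while the descendants of the [3, 5] node are only updated later, when a query or update has to pass through them.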

Related

Find nodes with 3+ occurrences in a 10 minute period

I have a list of nodes with a startTime property. I need to determine if the list contains a clump of 3 or more nodes with a startTime within 10 minutes of each other. I don't need to get the nodes that are in the clump, I just need a boolean indicating the existence of such a clump.
I am at a loss; everything I have tried fails so badly that it is not worth posting.
I feel that I am missing something easy.
This should be doable.
First you'll need to match the nodes, order them by startTime, and collect the start times into a list.
From there, you'll need to get the relevant pairings (each entry, and the entry 2 indices ahead of it, which marks the end of a group of 3), then see if the start times of that pair occur within 10 minutes of each other. Because the list is sorted, the entry in between is then within the same window as well.
Assuming, for the sake of example, :Event nodes with a startTime property, you might use this query to get the result you want:
MATCH (e:Event)
WITH e
ORDER BY e.startTime ASC
WITH collect(e.startTime) AS times
WITH times, range(0, size(times) - 3) as indices
RETURN any(index in indices WHERE times[index + 2] <= times[index] + duration({minutes:10}))
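The same sliding-window idea, outside of Cypher, might look like the sketch below. The function name `has_clump` and its parameters are hypothetical; it assumes the start times are available as Python `datetime` values.

```python
from datetime import datetime, timedelta


def has_clump(start_times, size=3, window=timedelta(minutes=10)):
    """Return True if at least `size` start times fall within `window` of each other."""
    times = sorted(start_times)
    # In a sorted list it is enough to compare each entry with the entry
    # `size - 1` positions ahead; everything in between lies inside the span.
    return any(
        times[i + size - 1] - times[i] <= window
        for i in range(len(times) - size + 1)
    )


base = datetime(2020, 1, 1, 12, 0)
print(has_clump([base, base + timedelta(minutes=4), base + timedelta(minutes=9)]))
print(has_clump([base, base + timedelta(minutes=6), base + timedelta(minutes=12)]))
```

With fewer than `size` entries the range is empty, so the function returns False, which mirrors how `range(0, size(times) - 3)` yields an empty list in the query.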

LOAD CSV in Neo4j does not create all the relationships

Hello everyone, please help me with this problem :D
When I execute my query:
USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM "file:///Create_all.csv" AS row
MATCH(x:Category{uuid:row.uuid_category})
MERGE (t:Subscriber{name:row.name_subscriber, uuid:row.uuid_subscriber})
CREATE (n:Product{name: row.name_product, uuid: row.uuid_product}),
(Price:AttributeValue{name:'Price', value: row.price_product}),
(Stock:AttributeValue{name:'Stock', value: row.stock_product }),
(Style:AttributeValue{name:'Style', value: 'Pop Art'}),
(Subject:AttributeValue{name:'Subject', value: 'Portrait'}),
(Originality:AttributeValue{name:'Originality', value: 'Reproduction'}),
(Region:AttributeValue{name:'Region', value: 'Japan'}),
(Price)-[:IS_ATTRIBUTEVALUE_OF]->(n),
(Stock)-[:IS_ATTRIBUTEVALUE_OF]->(n),
(Style)-[:IS_ATTRIBUTEVALUE_OF]->(n),
(Subject)-[:IS_ATTRIBUTEVALUE_OF]->(n),
(Originality)-[:IS_ATTRIBUTEVALUE_OF]->(n),
(Region)-[:IS_ATTRIBUTEVALUE_OF]->(n)
WITH (n),(t),(x)
create (n)-[:OF_CATEGORY]->(x)
create (t)-[:SELLS]->(n)
The format of my csv is as follows:
I have 4 categories, 30 products and 10 subscribers. The query reports:
Added 164 labels, created 164 nodes, set 328 properties, created 184
relationships, completed after 254 ms.
I verify the result with:
MATCH p=()-[r:OF_CATEGORY]->() RETURN count(r)
Only 23 relationships were created; the remaining 7 were not.
Please guide me on how the query should be written so that all relationships are created. In this case there should be 30 relationships between products and categories.
The critical part is MATCH(x:Category{uuid:row.uuid_category})
If that match fails for a row, the row will be wiped out and none of the other operations for that row will execute.
Your input cycles through the same 4 categories (let's call them 1, 2, 3, and 4) 7 times, for 28 rows so far, and then two of them occur one more time each, for your total of 30 rows. So it would make sense if some of your matches are failing because a :Category node with one of those uuid_category values is not actually present in the graph.
Of those uuids (1, 2, 3, and 4), only 1 and 2 occur at the end, so these two appear in 8 rows each, as opposed to 7 rows each for uuids 3 and 4. It would make sense if either uuid 3 or uuid 4 doesn't have a corresponding node in the graph. That leaves one 7-row category and the two 8-row categories: 1 * 7 + 2 * 8 = 23, which is the number of OF_CATEGORY relationships that your query is creating.
So there is no :Category node for the uuid_category ending with either 3 or 4.
Check your graph against your data to confirm.
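The arithmetic can be checked with a small sketch that mimics how a failed MATCH drops a row. The function name and the category labels "1" to "4" are placeholders, not values from your file.

```python
from collections import Counter


def count_surviving_rows(row_categories, existing_categories):
    """Count the rows whose category would be found by the MATCH."""
    per_category = Counter(row_categories)
    return sum(
        count
        for category, count in per_category.items()
        if category in existing_categories
    )


# 4 categories repeated 7 times, then categories "1" and "2" once more: 30 rows.
rows = ["1", "2", "3", "4"] * 7 + ["1", "2"]

print(count_surviving_rows(rows, {"1", "2", "3", "4"}))  # 30
print(count_surviving_rows(rows, {"1", "2", "3"}))       # 23
print(set(rows) - {"1", "2", "3"})                       # {'4'}
```

The last line shows one way to confirm the diagnosis: compare the set of uuid_category values in the CSV with the uuids of the :Category nodes that actually exist.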

Neo4j 3.0: retrieve data within a certain period

I'm using Neo4j 3.0 and it seems that the HAS function was removed.
The type of my relationship is a date, and I'm looking to retrieve all relationships between two dates such that my property "a" is greater than a certain value.
How can I test this using Neo4j?
Thank you
[EDITED to add info from comments]
I have tried this:
MATCH p=(n:origin)-[r]->()
WHERE r>'2015-01'
RETURN AVG(r.amount) as totalamout;
I created one relationship per date, each with a property, amount, and I am looking to compute the average amount for a certain period. For example, the average amount since 2015-04.
To answer the issue raised by your first sentence: in neo4j 3.x, the HAS() function was replaced by EXISTS().
[UPDATE 1]
This version of your query should work:
MATCH p=(n:origin)-[r]->()
WHERE TYPE(r) > '2015-01'
RETURN AVG(r.amount) as totalamout;
However, it is a bad idea to give your relationships different types based on a date. It is better to just use a date property.
[UPDATE 2]
If you changed your data model to add a date property to your relationships (to which I will give the type FOO), then the following query will find the average amount, per p, of all the relationships whose date is after 2015-01 (assuming that all your dates follow the same strict YYYY-MM pattern):
MATCH p=(n:origin)-[r:FOO]->()
WHERE r.date > '2015-01'
RETURN p, AVG(r.amount) as avg_amout;
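The reason the strict YYYY-MM pattern matters is that the comparison is a plain string comparison, which only agrees with chronological order when every date is zero-padded. A small sketch of the same filter-and-average logic, with made-up records and a hypothetical function name:

```python
def average_amount_since(records, since):
    """Average the amounts of all (date, amount) records dated after `since`.

    Dates are compared as strings, so they must all follow the same
    zero-padded YYYY-MM pattern.
    """
    amounts = [amount for date, amount in records if date > since]
    return sum(amounts) / len(amounts) if amounts else None


# Made-up sample data for illustration only.
records = [
    ("2014-12", 10.0),
    ("2015-01", 20.0),
    ("2015-04", 30.0),
    ("2015-11", 50.0),
]

print(average_amount_since(records, "2015-01"))  # 40.0

# Without zero-padding the ordering breaks: April would sort after November.
print("2015-4" > "2015-11")  # True
```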

Using labels in Batch import

In the new 2.0 branch of the Neo4j batch-importer, I believe that to specify a label one must declare it in the header.
Using the example from the readme.md and wiki:
name      l:label         age   works_on
Michael   Person,Father   37    neo4j
Selina    Person,Child    14
Rana      Person,Child    6
Selma     Person,Child    4
Does the header always have to follow the format l:label?
What does the comma do, and is it optional?
I.e., what does Person,Father represent? label,???
I believe in this case Person is the label, but I'm curious how I can query (in Cypher) the other value, which in this case is either Father or Child.
I think if you wanted to explicitly assign the type of some property you would use colon + type for that, i.e.
name:String age:int
and :label is used after the fashion of a data type to signal that the value(s) in a field is a node label or a relationship type. Since labels are not name/value pairs like properties, I would think the l in l:label doesn't really do anything.

multiple loads in neo4j

I have loaded some data into a Neo4j graph database using the batch importer. Now, if I have to load more data, do I have to keep track externally of what was inserted, or are there standard Neo4j features that can be used to:
1) get the id of the last node inserted, so that I know the id for the next node that needs to be inserted and can index accordingly.
2) get the list of nodes already present in the database, so that I can check the uniqueness of the nodes that are going to be inserted. If a node already exists in the database, I will just use the same id and won't create a new node.
3) check the uniqueness of the triplets. Suppose a triplet "January Month is_a" is already present in the Neo4j database and the new data that I want to insert contains the same triplet; I would like to not insert it, as it will give me duplicate results.
For example, suppose you add the following data to a Neo4j graph database using the batch-importer: https://github.com/jexp/batch-import
$ cat nodes.csv
name      age   works_on
Michael   37    neo4j
Selina    14
Rana      6
Selma     4
$ cat nodes_index.csv
0   name      age   works_on
1   Michael   37    neo4j
2   Selina    14
3   Rana      6
4   Selma     4
$ cat rels.csv
start   end   type        since        counter:int
1       2     FATHER_OF   1998-07-10   1
1       3     FATHER_OF   2007-09-15   2
1       4     FATHER_OF   2008-05-03   3
3       4     SISTER_OF   2008-05-03   5
2       3     SISTER_OF   2007-09-15   7
Now, if you have to add more data to the same database, you will need to know the following things:
1) if the nodes already exist, what their ids are, so that you can use them while creating a triplet; if not, create a list of such nodes (not yet in the database), then start from an id that was not used in the last import and use it as the starting id for a new nodes_index.csv.
2) if a triplet already exists in the database, don't create it again, as it will produce duplicate results when running Cypher queries against the database.
It seems the same issue has been raised here as well: https://github.com/jexp/batch-import/issues/27
Thanks!
1) Why do you need to know the last node id? You don't need to know the id to insert a new node; it will be added automatically at the first free id in the graph.
2) For uniqueness, why don't you use a CREATE UNIQUE query, for nodes and relationships as well?
You can check the reference here: http://docs.neo4j.org/chunked/1.8/cypher-query-lang.html
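If you do end up preparing the next batch of CSV files outside the database, the bookkeeping described in the question could be sketched roughly as below. This is only an illustration under assumptions: the function name, the in-memory dict and set, and the extra name "Anna" are hypothetical, and nothing here is part of the batch-importer itself.

```python
def plan_import(known_ids, known_triplets, new_triplets):
    """Work out which nodes and relationships still need to be created.

    known_ids:      dict mapping node name -> id from earlier imports
    known_triplets: set of (start_id, end_id, type) already in the database
    new_triplets:   iterable of (start_name, end_name, type) to be loaded
    Both known_ids and known_triplets are updated in place.
    """
    next_id = max(known_ids.values(), default=0) + 1
    new_nodes, new_rels = {}, []
    for start, end, rel_type in new_triplets:
        for name in (start, end):
            if name not in known_ids:
                # Unknown node: give it the next unused id.
                known_ids[name] = next_id
                new_nodes[name] = next_id
                next_id += 1
        triplet = (known_ids[start], known_ids[end], rel_type)
        if triplet not in known_triplets:
            # Only keep triplets that are not in the database yet.
            known_triplets.add(triplet)
            new_rels.append(triplet)
    return new_nodes, new_rels


known_ids = {"Michael": 1, "Selina": 2, "Rana": 3, "Selma": 4}
known_triplets = {(1, 2, "FATHER_OF"), (1, 3, "FATHER_OF")}
incoming = [
    ("Michael", "Selina", "FATHER_OF"),  # already present, skipped
    ("Michael", "Anna", "FATHER_OF"),    # new node and new relationship
    ("Michael", "Anna", "FATHER_OF"),    # duplicate within the batch, skipped
]
print(plan_import(known_ids, known_triplets, incoming))
```

The returned dict and list would then be written out as the next nodes and rels files; how the known ids and triplets are first read back from the database is left open here.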
