.indexOf() equivalent in Neo4j Cypher - neo4j

No matter how I swing it, I need some kind of function to find the index of an item in an array supplied as a parameter.
I am trying to simply update items in a collection based on the index of one of their properties in an array, and have been poring through Cypher docs for nearly 2 hours...
It would also be acceptable to order the items by that array, and then run a foreach on the ordered list...

Following @stefan-armbruster's answer and great blog post, a slow but simple index_of can be done with:
reduce(x=[-1,0], i IN [1,2,7,5,21,5,1,435] |
CASE WHEN i = 21 THEN [x[1], x[1]+1] ELSE [x[0], x[1]+1] END
)[0]
Here the reduce function works on a two-element array: the position of the match found so far (initially -1) and the current index. Whenever an element matches the condition, the first element of the array is replaced with the current index, so the final result is the index of the last match, or -1 if nothing matches.
I put an example on neo4j console http://console.neo4j.org/?id=34byv
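The same state-carrying fold can be sketched outside Cypher. The following is a minimal Python illustration, assuming the same two-element state (match position, current index); the function name index_of is hypothetical and chosen only for this example.

```python
from functools import reduce


def index_of(items, target):
    """Return the 0-based index of the last element equal to target, or -1."""
    def step(state, item):
        found, current = state
        # On a match, record the current index; otherwise keep the earlier result.
        return [current if item == target else found, current + 1]

    # The state starts as [-1, 0]: "nothing found yet" and "index 0".
    return reduce(step, items, [-1, 0])[0]


print(index_of([1, 2, 7, 5, 21, 5, 1, 435], 21))  # 4
```

Because a later match overwrites an earlier one, a repeated value yields the index of its last occurrence.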

I've blogged about that recently. You can use the reduce function with a three-element array as state variable containing:
the index of the highest occupation so far
the current index (aka iteration number)
the value of highest occupation so far
As an example to find the index of max element in an array:
RETURN reduce(x=[0,0,0], i IN [1,2,2,5,2,1] |
CASE WHEN i>x[2] THEN [x[1],x[1]+1,i] ELSE [x[0], x[1]+1,x[2]] END
)[0]
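As a rough cross-check of the three-element state described above, here is a Python sketch of the same fold. The name index_of_max is hypothetical, and the zero seed carries the same assumption as the Cypher version: all values are greater than 0.

```python
from functools import reduce


def index_of_max(items):
    """Return the 0-based index of the first maximum, assuming positive values."""
    def step(state, item):
        best_index, current, best_value = state
        if item > best_value:
            # New maximum: remember where it was seen and what it was.
            return [current, current + 1, item]
        return [best_index, current + 1, best_value]

    return reduce(step, items, [0, 0, 0])[0]


print(index_of_max([1, 2, 2, 5, 2, 1]))  # 3
```

The strict comparison means ties keep the earliest index.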

Related

Active Record Array array query - to check records that are present in an array

I have an Objective model which has an attribute called as labels whose values are array data type. I need to query all the Objectives whose labels attribute has values that are present in some particular array.
For Example:
I have an array
a = ["textile", "blazer"]
the Objective.labels may have values such as ["textile", "ramen"]
I need to return all objectives that might have either "textile" or "blazer" as one of their labels array values
I tried the following:
Objective.where("labels @> ARRAY[?]::varchar[]", ["textile"])
This returns some records. Now when I try
Objective.where("labels @> ARRAY[?]::varchar[]", ["textile", "Blazer"])
I expect it to return all Objectives whose labels array contains at least one of the values textile or blazer.
However, it returns an empty array. Any solutions?
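Try the && (overlap) operator. It is true when the two arrays have at least one element in common, whereas a contains test requires every element of the right-hand array to be present in labels.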
Objective.where("labels && ARRAY[?]::varchar[]", ["textile", "Blazer"])
If you have many rows, a GIN index can speed it up.
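To make the difference between a contains-all test and the && overlap test concrete, here is a small Python sketch using sets. It only models the semantics; the helper names contains_all and overlaps are made up for this illustration and are not part of Rails or PostgreSQL.

```python
def contains_all(labels, wanted):
    """True when every wanted value is present in labels."""
    return set(wanted) <= set(labels)


def overlaps(labels, wanted):
    """True when labels and wanted share at least one value."""
    return not set(labels).isdisjoint(wanted)


labels = ["textile", "ramen"]
print(contains_all(labels, ["textile"]))            # True
print(contains_all(labels, ["textile", "blazer"]))  # False
print(overlaps(labels, ["textile", "blazer"]))      # True
```

The second call is the situation from the question: one missing value is enough to make the contains-all test fail.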

How to remove all elements found in GROUP B to GROUP A in Lua

Let's say I have two groups defined in my Lua script
groupA = {"donkey", "goat", "eagle", "whale", "dolphine", "dog", "mosquito", ...}
groupB = {"goat", "mosquito", "donkey"}
After the remove operation, groupA should no longer contain the elements "goat", "mosquito", and "donkey".
How do I remove all items in groupA that are found in groupB? I know we can loop through the items and compare each one, but I would prefer an API or a simple built-in statement that solves this type of problem. The elements could also be of any type, such as a record.
There are no built-in operators that calculate set difference in Lua. You can do what you described and to speed up this process you can build a hash of elements from the second table and then iterate over the elements in the first table and check if they are present in the hash (of the elements in the second table).
If you end up using table.remove to remove elements from the first table while iterating, you need to be careful to iterate from the end, otherwise you may end up skipping elements you need to remove.
You can also check if some of the suggestions in this thread about set operators work for you.
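The advice about iterating from the end applies to in-place removal in most languages. Below is a minimal Python sketch of that idea, assuming hashable elements; remove_in_place is a hypothetical name used only here.

```python
def remove_in_place(items, unwanted):
    """Delete every element of items that appears in unwanted, modifying items."""
    lookup = set(unwanted)  # hash of the second group for fast membership tests
    # Walk backwards so that deleting an element never shifts one we still have to visit.
    for i in range(len(items) - 1, -1, -1):
        if items[i] in lookup:
            del items[i]
    return items


group_a = ["donkey", "goat", "eagle", "whale", "dolphine", "dog", "mosquito"]
remove_in_place(group_a, ["goat", "mosquito", "donkey"])
print(group_a)  # ['eagle', 'whale', 'dolphine', 'dog']
```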
local lookup = {}
for i, v in ipairs(groupB) do
  lookup[v] = true
end
local answer = {}
for i, v in ipairs(groupA) do
  if not lookup[v] then
    table.insert(answer, v)
  end
end
create a lookup table for the unique items in groupB
traverse groupA and lookup each item in the lookup table
add items from groupA not found in the lookup table to answer table
Note: this approach doesn't account for duplicates. For example, if groupB contains "goat" three times, and groupA contains "goat" four times, then answer will contain "goat" zero times.
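If duplicates should cancel one for one instead, a counting lookup can replace the boolean one. The following Python sketch shows that variant under the assumption that elements are hashable; multiset_difference is a hypothetical name.

```python
from collections import Counter


def multiset_difference(group_a, group_b):
    """Remove one occurrence from group_a for each occurrence in group_b."""
    remaining = Counter(group_b)  # how many copies of each item may still be cancelled
    answer = []
    for item in group_a:
        if remaining[item] > 0:
            remaining[item] -= 1  # cancel this occurrence
        else:
            answer.append(item)
    return answer


print(multiset_difference(["goat"] * 4, ["goat"] * 3))  # ['goat']
```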
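After some time researching, I found that a simple subtraction can remove the elements of one group (table) from another. Note that plain Lua tables do not support the - operator: this only works if the tables have a metatable that defines the __sub metamethod.
Ex.
groupA = groupA - groupB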

Neo4j: Sum relationship properties where node properties equal Value A and Value B (intersection)

Basically my question is: how do I sum relationship properties where the related nodes have properties equal to Value A and Value B?
For example:
I have a simple DB with the following relationship:
(site)-[:HAS_MEMBER]->(user)-[:POSTED]->(status)-[:TAGGED_WITH]->(tag)
On [:TAGGED_WITH] I have a property called "TimeSpent". I can easily SUM up all the time spent for a particular day and user by using the following query:
MATCH (user)-[:POSTED]->(updates)-[r:TAGGED_WITH]->(tags)
WHERE user.name = "Josh Barker" AND updates.date = 20141120
RETURN tags.name, SUM(r.TimeSpent) as totalTimeSpent;
This returns to me a nice table with tags and associated time spent on each. (i.e. #Meeting 4.5). However, the question arises if I want to do some advanced searches and say "Show me all the meetings for ProjectA" (i.e. #Meeting #ProjectA). Basically, I am looking for a query that I can get all of the relationships where a single status has BOTH tags (and only if it has both). Then I can SUM that number up to get a count for how many meetings I spent in #ProjectA.
How do I do this?
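MATCH (updates)-[r1:TAGGED_WITH]->(tag1 {name: 'Meeting'}),
(updates)-[r2:TAGGED_WITH]->(tag2 {name: 'ProjectA'})
RETURN SUM(r1.TimeSpent) as totalTimeSpent, count(updates);
This should find all updates tagged with both of those things. The two relationships need distinct variables (r1 and r2), since a single relationship cannot point to both tags. The query sums the time recorded on the Meeting relationship across all of those updates; sum r2.TimeSpent instead if the ProjectA relationship holds the figure you want.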
To create a generic solution where you may want one or more tags, you could use something like this, passing in the array of tags as a parameter (and using the length of the array instead of the hard-coded 2).
MATCH (user)-[:POSTED]->(update)-[r:TAGGED_WITH]->(tag)
WHERE user.name = "Josh Barker" AND update.date = 20141120 AND tag.name IN ['Meeting', 'ProjectA']
WITH update, SUM(r.TimeSpent) AS totalTimeSpent, COLLECT(tag) AS tags
WHERE LENGTH(tags) = 2
RETURN update, totalTimeSpent
As long as tag.name is indexed, this should be fast.
Edit - Remove User constraint
MATCH (update)-[r:TAGGED_WITH]->(tag)
WHERE tag.name IN ['Meeting', 'ProjectA']
WITH update, SUM(r.TimeSpent) AS totalTimeSpent, COLLECT(tag) AS tags
WHERE LENGTH(tags) = 2
RETURN update, totalTimeSpent
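The "collect the matching tags per update, then keep only updates that have all of them" pattern can be sketched in Python as follows. The row tuples are made-up sample data standing in for (update, tag name, TimeSpent) triples, and the function name is hypothetical.

```python
from collections import defaultdict


def time_for_updates_with_all_tags(rows, required_tags):
    """Sum time per update, keeping only updates that carry every required tag."""
    required = set(required_tags)
    tags_seen = defaultdict(set)
    time_spent = defaultdict(float)
    for update_id, tag_name, spent in rows:
        if tag_name in required:  # corresponds to: tag.name IN [...]
            tags_seen[update_id].add(tag_name)
            time_spent[update_id] += spent
    # Corresponds to: WHERE the number of collected tags equals the number required.
    return {u: time_spent[u] for u, tags in tags_seen.items() if tags == required}


rows = [
    ("u1", "Meeting", 1.5), ("u1", "ProjectA", 1.5),
    ("u2", "Meeting", 2.0),
    ("u3", "ProjectA", 0.5), ("u3", "Lunch", 1.0),
]
print(time_for_updates_with_all_tags(rows, ["Meeting", "ProjectA"]))  # {'u1': 3.0}
```

Note that the total adds the time from every matching relationship, so an update that records the same duration on both tags is counted twice.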

Neo4j - Cypher / strange output when dealing with an array as node property

Using Cypher (Neo4j 2.1.2), it seems that array properties do not work well with aggregate functions.
For instance, I can have a RETURN clause like this:
RETURN meeting.title, count(participant) as number_part
Output: MyTitle 2
It correctly returns each meeting's title along with its number of participants.
However, with an array as property rather than simple one like title, the output is strange:
RETURN meeting.arrayProperty, count(participant) as number_part
Output:
MyTitle [1,2,3] 1
MyTitle [1,2,3] 1 //not grouped by ...
Better than text, here's a graphgist I made to explain the issue, the workaround I found and what I really expect.
Does anyone know the reason? (maybe obvious...)
Just tried the following workaround: rebuild the array property as a collection:
RETURN extract(x in meeting.arrayProperty | x), count(participant) as number_part
Theory: the array property is handled as a native Java array, whereas extract returns a collection (in the Java sense). Comparing collections works by comparing their elements, whereas comparing native arrays compares memory addresses, which are different.
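The effect described by this theory can be imitated in Python, purely as an analogy and not as a statement about Neo4j internals: grouping by object identity keeps equal arrays apart, while grouping by their elements merges them. The sample rows are invented for the illustration.

```python
from collections import Counter

# Two meetings' rows, each holding its own (separate) array object with equal contents.
rows = [([1, 2, 3], "alice"), ([1, 2, 3], "bob")]

# Grouping key based on identity: each array object is its own group.
by_identity = Counter(id(arr) for arr, _ in rows)

# Grouping key based on the elements: equal arrays fall into one group.
by_value = Counter(tuple(arr) for arr, _ in rows)

print(len(by_identity))  # 2
print(by_value)          # Counter({(1, 2, 3): 2})
```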

How do I collect and combine multiple arrays for calculation?

I am collecting the values for a specific column from a named_scope as follows:
a = survey_job.survey_responses.collect(&:base_pay)
This gives me a numeric array, for example (1,2,3,4,5). I can then pass this array into various functions I have created to retrieve the mean, median, and standard deviation of the number set. This all works fine; however, I now need to start combining multiple columns of data to carry out the same types of calculation.
I need to collect the details of perhaps three fields as follows:
survey_job.survey_responses.collect(&:base_pay)
survey_job.survey_responses.collect(&:bonus_pay)
survey_job.survey_responses.collect(&:overtime_pay)
This will give me 3 arrays. I then need to combine these into a single array by adding each of the matching values together - i.e. add the first result from each array, the second result from each array and so on so I have an array of the totals.
How do I create a method which will collect all of this data together and how do I call it from the view template?
Really appreciate any help on this one...
Thanks
Simon
s = survey_job.survey_responses
pay = s.collect(&:base_pay).zip(s.collect(&:bonus_pay), s.collect(&:overtime_pay))
pay.map{|i| i.compact.inject(&:+) }
Do that, but with meaningful variable names and I think it will work.
Define a normal method in app/helpers/_helper.rb and it will work in the view
Edit: now it works if they contain nil or are of different sizes (as long as the longest array is the one on which zip is called).
Here's a method that will combine an arbitrary number of arrays by taking the sum at each index. It'll allow each array to be of different length, too.
def combine(*arrays)
  # Get the length of the largest array; that is the number of iterations needed
  maxlen = arrays.map(&:length).max
  out = []
  maxlen.times do |i|
    # Push the sum of all array elements at a given index to the result array
    # (nil.to_i is 0, so missing values count as zero)
    out.push(arrays.map { |a| a[i] }.inject(0) { |memo, value| memo + value.to_i })
  end
  out
end
Then, in the controller, you could do
base_pay = survey_job.survey_responses.collect(&:base_pay)
bonus_pay = survey_job.survey_responses.collect(&:bonus_pay)
overtime_pay = survey_job.survey_responses.collect(&:overtime_pay)
@total_pay = combine(base_pay, bonus_pay, overtime_pay)
And then refer to @total_pay as needed in your view.
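For comparison, the same index-wise sum over arrays of unequal length can be sketched in Python with itertools.zip_longest. This is only an illustration of the technique; missing or None entries are assumed to count as zero, mirroring the nil handling above.

```python
from itertools import zip_longest


def combine(*arrays):
    """Sum the arrays index by index; short arrays and None entries count as 0."""
    # zip_longest pads the shorter arrays with None, which "v or 0" turns into 0.
    return [sum(v or 0 for v in column) for column in zip_longest(*arrays)]


print(combine([1, 2, 3], [10, None], [100]))  # [111, 2, 3]
```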
