Active Record Array array query - to check records that are present in an array - ruby-on-rails

I have an Objective model which has an attribute called as labels whose values are array data type. I need to query all the Objectives whose labels attribute has values that are present in some particular array.
For Example:
I have an array
a = ["textile", "blazer"]
the Objective.labels may have values as ["textile, "ramen"]
I need to return all objectives that might have either "textile" or "blazer" as one of their labels array values
I tried the following:
Objective.where("labels #> ARRAY[?]::varchar[]", ["textile"])
This returns some records.Now when I try
Objective.where("labels #> ARRAY[?]::varchar[]", ["textile", "Blazer"])
I expect it to return all Objectives which contains at-least one of the labels array value as textile or blazer.
However, it returns an empty array. Any Solutions?

Try && overlap operator.
overlap (have elements in common)
Objective.where("labels && ARRAY[?]::varchar[]", ["textile", "Blazer"])
If you have many rows, a GIN index can speed it up.

Related

Writing variable-length sequence to a compound array

I am using compound datatypes with h5py, with some elements being variable-length arrays. I can't find a way to set the item. The following MWE shows 6 various ways to do that (sequential indexing — which would not work in h5py anyway, fused indexing, read-modify-commit for columns/rows), neither of which works.
What is the correct way? Why is h5py saying Cannot change data-type for object array when writing integer list to int32 list?
with h5py.File('/tmp/test-vla.h5','w') as h5:
dt=np.dtype([('a',h5py.vlen_dtype(np.dtype('int32')))])
dset=h5.create_dataset('test',(5,),dtype=dt)
dset['a'][2]=[1,2,3] # does not write the value back
dset[2]['a']=[1,2,3] # does not write the value back
dset['a',2]=[1,2,3] # Cannot change data-type for object array
dset[2,'a']=[1,2,3] # Cannot change data-type for object array
tmp=dset['a']; tmp[2]=[1,2,3]; dset['a']=tmp # Cannot change data-type for object array
tmp=dset[2]; tmp['a']=[1,2,3]; dset[2]=tmp # 'list' object has no attribute 'dtype'
When working with compound datasets, I've discovered it's best to add all row data in a single statement. I tweaked your code and to show how add 3 rows of data (each of different length). Note how I: 1) define the row of data with a tuple; 2) define the list of integers with np.array(); and 3) don't reference the field name ['a'].
with h5py.File('test-vla.h5','w') as h5:
dt=np.dtype([('a',h5py.vlen_dtype(np.dtype('int32')))])
dset=h5.create_dataset('test',(5,),dtype=dt)
print (dset.dtype, dset.shape)
dset[0] = ( np.array([0,1,2]), )
dset[1] = ( np.array([1,2,3,4]), )
dset[2] = ( np.array([0,1,2,3,4]), )
For more info, take a look at this post on the HDF Group Forum under HDF5 Ancillary Tools / h5py:
Compound datatype with int, float and array of floats

How do I extract a field from an array of my models in order to form a single query?

I’m using Rails 4.2.7. I have an array of my model objects and currently I’m iterating through that array to find matching entries in the database based on a field my each object …
my_object_times.each_with_index do |my_object_time, index|
found_my_object_time = MyObjectTime.find_by_my_object_id_and_overall_rank(my_object_id, my_object_time.overall_rank)
My question is, how can I rewrite the above to run one query instead of N queries, if N is the size of the array. What I wanted was to force my underlying database (PostGres 9.5) to do a “IF VALUE IN (…)” type of query but I’m not sure how to extract all the attributes from my array and then pass them in appropriately to a query.
I would do something like this:
found_my_object_times = MyObjectTime.where(
object_id: my_object_id,
overall_rank: my_object_times.map(&:overall_rank)
)

mongoid: select elements that have at least n elements in array

In mongoid you can query items that have at least one element in array:
Item.any_in(tag_ids: [id1,id2,id3])
You can also select elements that have all elements in array:
Item.all_in(tag_ids: [id1,id2,id3])
My Question: Is there any way to query elements that have at least n elements in array ?
I'd like to query something like Item.at_least(tag_ids: [id1,id2,id3], n: 2) to return any Item that share at least two ids with [id1,id2,id3]
Thanks !
I don't know a pure Mongoid-solution. I also haven't found such query in the MongoDB manual: http://docs.mongodb.org/manual/reference/operator/query-array/
I would use some mix of Mongoid and array operations.
The disadvantage of it is, that all items which have at least 1 of these tags will be loaded.
searched_tag_ids = ['54253ad452656b1d25000000','54253adc52656b1d25010000','54253ae352656b1d25020000']
items_with_min_1_searched_tag = Item.any_in(tag_ids: searched_tag_ids).to_a
items_with_min_2_searched_tag = items_with_min_1_searched_tag.select{|item| (item.tag_ids.collect{|tag_id| tag_id.to_s} & searched_tag_ids).size >=2}

Querying array element in node property

I have nodes within a the graph with property pathway storing an array of of values ranging from
path:ko00030
path:ko00010
.
.
path:koXXXXX
As an example, (i'm going to post in the batch import format:https://github.com/jexp/batch-import/tree/20)
ko:string:koid name definition l:label pathway:string_array pathway.name:string_array
ko:K00001 E1.1.1.1, adh alcohol dehydrogenase [EC:1.1.1.1] ko path:ko00010|path:ko00071|path:ko00350|path:ko00625|path:ko00626|path:ko00641|path:ko00830
the subsequent nodes might have a different combination of pathway values.
How do i query using CYPHER to retrieve all nodes with path:ko00010 in pathway
the closest i've gotten is using the solution provided for a different problem:
How to check array property in neo4j?
match (n:ko)--cpd
Where has(n.pathway) and all ( m in n.pathway where m in ["path:ko00010"])
return n,cpd;
but here only nodes with pathways matching exactly to the list provided are returned.
ie. if i were to query path:ko00010 like in the example above, I'll only be able to retrieve nodes holding path:ko00010 as the only element in the pathway property and not nodes containing path:ko00010 as well as other path:koXXXXX
In your query the extension of the predicate ALL is all the values in the property array, meaning that your query will only return those cases where every value in the pathway property array are found in the literal array ["path:ko00010"]. If I understand you right you want the opposite, you want to test that all values in the literal array ["path:ko00010"] are found in the property array pathway. If that's indeed what you want you can just switch their places, your WHERE clause will then be
WHERE HAS(n.pathway) AND ALL (m IN ["path:ko00010"] WHERE m IN n.pathway)
It is not strictly correct to say that your query only matches cases where the array you ask for and the property array are exactly the same. You could have had more than one value in the literal array, something like ["path:ko00010","path:ko00020"], and nodes with only one one of those values in their pathway array would also have matched–as long as all values in the property array could be found in the literal array. Conversely, with the altered WHERE filter that I've suggested, the query will match any node that has all of the values of the literal array in their pathway property.
If you want to filter the matched patterns with an array of values where all of them have to be present, this is good. In your example you only use one value, however, and for those queries there is no reason to use an array and the ALL predicate. You can simply do
WHERE HAS(n.pathway) and "path:ko00010" IN n.pathway
If in some context you want to include results where any of a set of values are found in the pathway property array you can just switch from ALL to ANY
WHERE HAS(n.pathway) AND ANY (m IN ["path:ko00010","path:ko00020"] WHERE m IN n.pathway)
Also, you probably don't need to check for the presence of the pathway property, unless you have some special use for it you should be fine without the HAS(n.pathway).
And once you've got the queries working right, try to switch out literal strings and arrays for parameters!
WHERE {value} IN n.pathway
// or
WHERE ALL (m IN {value_array} WHERE m IN n.pathway)
// or
WHERE ANY (m IN {value_array} WHERE m IN n.pathway)

How do I collect and combine multiple arrays for calculation?

I am collecting the values for a specific column from a named_scope as follows:
a = survey_job.survey_responses.collect(&:base_pay)
This gives me a numeric array for example (1,2,3,4,5). I can then pass this array into various functions I have created to retrieve the mean, median, standard deviation of the number set. This all works fine however I now need to start combining multiple columns of data to carry out the same types of calculation.
I need to collect the details of perhaps three fields as follows:
survey_job.survey_responses.collect(&:base_pay)
survey_job.survey_responses.collect(&:bonus_pay)
survey_job.survey_responses.collect(&:overtime_pay)
This will give me 3 arrays. I then need to combine these into a single array by adding each of the matching values together - i.e. add the first result from each array, the second result from each array and so on so I have an array of the totals.
How do I create a method which will collect all of this data together and how do I call it from the view template?
Really appreciate any help on this one...
Thanks
Simon
s = survey_job.survey_responses
pay = s.collect(&:base_pay).zip(s.collect(&:bonus_pay), s.collect(&:overtime_pay))
pay.map{|i| i.compact.inject(&:+) }
Do that, but with meaningful variable names and I think it will work.
Define a normal method in app/helpers/_helper.rb and it will work in the view
Edit: now it works if they contain nil or are of different sizes (as long as the longest array is the one on which zip is called.
Here's a method that will combine an arbitrary number of arrays by taking the sum at each index. It'll allow each array to be of different length, too.
def combine(*arrays)
# Get the length of the largest array, that'll be the number of iterations needed
maxlen = arrays.map(&:length).max
out = []
maxlen.times do |i|
# Push the sum of all array elements at a given index to the result array
out.push( arrays.map{|a| a[i]}.inject(0) { |memo, value| memo += value.to_i } )
end
out
end
Then, in the controller, you could do
base_pay = survey_job.survey_responses.collect(&:base_pay)
bonus_pay = survey_job.survey_responses.collect(&:bonus_pay)
overtime_pay = survey_job.survey_responses.collect(&:overtime_pay)
#total_pay = combine(base_pay, bonus_pay, overtime_pay)
And then refer to #total_pay as needed in your view.

Resources