Neo4j gem - Distinct Query with Paginate - ruby-on-rails

How can I get my plucked array to work with the paginate method (which I believe only works on QueryProxy objects)?
My results include some of the same nodes more than once because there are multiple paths to them, so I added pluck and distinct like so:
current_user.friends....where()...params().pluck('DISTINCT e').paginate..
Is there another way around it, or would a change have to be made in the neo4j paginate gem?

Right now, this isn't doable using the paginate method. WillPaginate returns WillPaginate::Collection objects that are already populated from the database. We might be able to make it return something else and evaluate lazily but I'd have to play around with it more.
You can create Neo4j::Paginated objects directly, but these are just plucked results from QP.
# match stupid friends with awful events, return distinct events
query = current_user.friends(:f).where(stupid: true).events(:e).rel_where(expected_attendees: 0)
@bad_events = Neo4j::Paginated.create_from(query, 1, 15).pluck('distinct e')
create_from returns a Neo4j::Paginated object that delegates each and pluck to the QueryProxy object fed to it. Note that it paginates based on the end of the chain, so here it returns the first page, at 15 per page, of events. Also note that you can't do a distinct count.
Check https://github.com/neo4jrb/neo4j/blob/master/lib/neo4j/paginated.rb for more. It's pretty easy to read.
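If the distinct result set is small enough to hold in memory, one possible workaround is to pluck everything, de-duplicate in Ruby, and slice out a page by hand. The sketch below uses plain Ruby only. paginate_distinct is a hypothetical helper, not part of the neo4j gem, and the array of strings merely stands in for plucked event nodes.

```ruby
# Hypothetical helper: de-duplicate an in-memory list, then return one page.
# page is 1-based, matching the create_from(query, 1, 15) call above.
def paginate_distinct(items, page, per_page)
  raise ArgumentError, "page must be >= 1" if page < 1
  raise ArgumentError, "per_page must be >= 1" if per_page < 1

  unique = items.uniq              # keeps the first occurrence, preserves order
  offset = (page - 1) * per_page
  unique[offset, per_page] || []   # Array#[] returns nil past the end
end

# Stand-in for plucked results that contain duplicates from multiple paths.
events = %w[picnic gala picnic quiz gala hike]

puts paginate_distinct(events, 1, 3).inspect
puts paginate_distinct(events, 2, 3).inspect
```

Unlike the lazy approach discussed above, this loads every row before paginating, so it only makes sense when the full distinct set is known to be small.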

Related

ActiveRecord Get Max Value Without Loading

I need to get the last record with a certain value, but without loading it.
I have to do something like:
Thing.where(cool: true).where(created_at: Thing.where(cool: true).maximum(:created_at))
The above is a way to do it, but it does a 2nd query to get the maximum value first. I want to do it all in one query and get the SQL equivalent of something = max(etc).
Just before someone mentions it: .last doesn't work because it returns an object not a relation.
In other words, I need to do .last, but returning a relation instead of an object.
Try this way:
Thing.where("cool=1 AND created_at=(SELECT MAX(created_at) FROM things WHERE cool=1)")
On Rails 7, when I tested your query, the database is queried only once.

Ruby's .where vs. detect

I'm looking for a method that is faster and uses less server processing. In my application, I can use both .where and .detect:
Where:
User.where(id: 1)
# User Load (0.5ms)
Detect:
User.all.detect{ |u| u.id == 1 }
# User Load (0.7ms). Sometimes increases more than .where
I understand that .detect returns the first item in the list for which the block returns true, but how does it compare with .where if I have thousands of Users?
Edited for clarity.
.where is used in this example because I may not query for the id alone. What if I have a table column called "name"?
In this example
User.find(1) # or
User.find_by(id: 1)
will be the fastest solutions, because both queries tell the database to return exactly one record with a matching id. As soon as the database finds a matching record, it doesn't look further but returns that one record immediately.
Whereas
User.where(id: 1)
would return a collection of all objects matching the condition. That means that after a matching record is found, the database keeps looking for other records that match the query, which on a column without an index means scanning the whole table. In this case, since id is very likely a column with unique values, it would return a collection with only one instance.
In contrast,
User.all.detect { |u| u.id == 1 }
would load all users from the database. This results in loading thousands of users into memory, building ActiveRecord instances, iterating over that array and then throwing away all records that do not match the condition. This will be very slow compared to just loading matching records from the database.
Database management systems are optimized to run selection queries and you can improve their ability to do so by designing a useful schema and adding appropriate indexes. Every record loaded from the database will need to be translated into an instance of ActiveRecord and will consume memory - both operations are not for free. Therefore the rule of thumb should be: Whenever possible run queries directly in the database instead of in Ruby.
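The cost of instantiating every record can be imitated without a database. The following is a rough plain-Ruby sketch, not a benchmark of ActiveRecord: FakeUser and ROWS are hypothetical stand-ins for a model and its table, and the counter only tracks how many objects each strategy builds.

```ruby
# Plain-Ruby imitation of the two strategies; no database is involved.
# FakeUser and ROWS are hypothetical stand-ins for a model and its table.
class FakeUser
  @built = 0
  class << self
    attr_accessor :built   # counts how many objects were instantiated
  end

  attr_reader :id, :name

  def initialize(row)
    self.class.built += 1
    @id = row[:id]
    @name = row[:name]
  end
end

ROWS = (1..10_000).map { |i| { id: i, name: "user#{i}" } }

# Strategy A: build every object, then search in Ruby (like User.all.detect).
def load_all_then_detect(rows, wanted_id)
  rows.map { |row| FakeUser.new(row) }.detect { |u| u.id == wanted_id }
end

# Strategy B: filter the raw rows first, then build only the match
# (roughly what happens when the database applies the condition).
def filter_then_build(rows, wanted_id)
  row = rows.find { |r| r[:id] == wanted_id }
  row && FakeUser.new(row)
end

FakeUser.built = 0
load_all_then_detect(ROWS, 1)
puts "all-then-detect built #{FakeUser.built} objects"

FakeUser.built = 0
filter_then_build(ROWS, 1)
puts "filter-first built #{FakeUser.built} objects"
```

Strategy A builds one object per row regardless of which id is requested, while strategy B builds only the match. A real database would also avoid the linear search in strategy B when the column is indexed.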
NB: One should use ActiveRecord's find in this particular case; please refer to the answer by @spickermann instead.
User.where is executed on DB level, returning one record.
User.all.detect will return all the records to the application, and only then iterate through on ruby level.
That said, one should use where. The former is largely insensitive to the number of records: there might be billions, and the execution time and memory consumption would be nearly the same (O(1)). The latter might even fail on billions of records.
Here's a general guide:
Use .find(id) whenever you are looking for a unique record. You can use something like .find_by_email(email) or .find_by_name(name) or similar (these finder methods are automatically generated) when searching non-ID fields, as long as there is only one record with that particular value.
Use .where(...).limit(1) if your query is too complex for a .find_by query or you need to use ordering but you are still certain that you only want one record to be returned.
Use .where(...) when retrieving multiple records.
Use .detect only if you cannot avoid it. Typical use cases for .detect are on non-ActiveRecord enumerables, or when you have a set of records but are unable to write the matching condition in SQL (e.g. if it involves a complex function). As .detect is the slowest, make sure that before calling .detect you have used SQL to narrow down the query as much as possible. Ditto for .any? and other enumerable methods. Just because they are available for ActiveRecord objects doesn't mean that they are a good idea to use ;)
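For the plain-enumerable case mentioned in the guide above, here is a small sketch of how detect differs from select once the data is already in memory. The examine helper is hypothetical and exists only to count how many elements the block is called on.

```ruby
# Hypothetical helper: run an Enumerable finder (:detect, :select, ...) and
# report how many elements the block was called on.
def examine(collection, finder)
  checks = 0
  result = collection.public_send(finder) do |item|
    checks += 1
    yield item
  end
  [result, checks]
end

numbers = (1..1_000).to_a

first_even, detect_checks = examine(numbers, :detect) { |n| n.even? }
all_even, select_checks   = examine(numbers, :select) { |n| n.even? }

puts "detect found #{first_even} after #{detect_checks} checks"
puts "select found #{all_even.size} matches after #{select_checks} checks"
```

detect stops at the first match, but that saving only applies after the collection has been loaded. With ActiveRecord, the expensive part (fetching and instantiating every row) has already happened by the time detect runs.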

rails 4 activerecord relation pagination after filtering by permission

I'm using Thinking Sphinx as my search database, and after I search, I need to filter the results based on whether a user has access to see each result. I have a method like current_user.can_see? that returns true/false. This works fine; however, no matter how I try to loop over the relation, it has to be turned into an array in order to filter/remove the results. This essentially breaks the pagination, total count, total pages, etc.
Does anyone know of a way to do this, or a different approach to paginating a filtered result set?
EDIT: Result set is coming back from a ThinkingSphinx search.
I assume you are using will_paginate.
When you have an array rather than a relation, all you have to do is load will_paginate's array support in your app.
You need to provide an initializer (for instance config/initializers/will_paginate.rb) that will only contain this line:
require 'will_paginate/array'
And there you go. It is working.
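To see why filtering has to happen before pagination for the totals to come out right, here is a plain-Ruby sketch of the idea. It is not will_paginate's implementation: Page and paginate_visible are hypothetical names, and the block stands in for a check such as current_user.can_see?.

```ruby
# Hypothetical page object; will_paginate's own collection class is richer.
Page = Struct.new(:items, :current_page, :per_page, :total_entries) do
  def total_pages
    (total_entries.to_f / per_page).ceil
  end
end

# Filter first, then paginate, so the totals reflect only visible records.
# page is 1-based.
def paginate_visible(records, page:, per_page:)
  visible = records.select { |record| yield record }
  offset = (page - 1) * per_page
  Page.new(visible[offset, per_page] || [], page, per_page, visible.size)
end

results = (1..25).to_a   # stand-in for search results

# The block is a stand-in for the permission check.
second = paginate_visible(results, page: 2, per_page: 5) { |id| id.odd? }

puts second.items.inspect   # => [11, 13, 15, 17, 19]
puts second.total_entries   # => 13
puts second.total_pages     # => 3
```

The trade-off is that every search result is loaded and checked on each request, which may be acceptable for small result sets but not for large ones.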

Ohm, find all records from array of ids

I am looking for a way to find all Ohm affiliated objects with one query, by feeding it an array of attributes that are indexed. In Mongoid, this is done with something like:
Foo.any_in(:some_id => [list_of_ids])
ActiveRecord has the find_all family of methods.
I essentially want to be able to pull N records from the data store without calling find() 30 times individually.
You can pass find an array or list of IDs:
Foo.find(1,2,3) or Foo.find([1,2,3])
This does not seem to work with the latest Ohm (1.1.1). I looked through the source and it seems you need to do something like Model.all.send(:fetch, [1,2,3]). Problem is... you have to call a private method.
I created an issue to see if this is the right approach.
UPDATE: It was just made public!

why is Model.all different to Model.where('true') in rails 3

I have a query, which works fine:
ModelName.where('true')
I can chain this with other AR calls such as where, order etc. However when I use:
ModelName.all
I receive the "same" response but can't chain a where or order to it, as it's an array rather than an AR collection.
While I have no practical problem using the first method, it seems a bit ugly/unnecessary. Is there a cleaner way of doing this, maybe a .to_active_record_collection or something?
There is an easy solution. Instead of using
ModelName.where('true')
Use:
ModelName.scoped
As you said:
ModelName.where('true').class #=> ActiveRecord::Relation
ModelName.all.class #=> Array
So you can chain as many lazily evaluated calls as you like, as long as you don't use all, first or last, which trigger the query.
It's important to catch these differences when you consider caching.
Still I can't understand what kind of situation could lead you to something like:
ModelName.all.where(foobar)
... Unless you need the whole bunch of assets for one purpose, get it loaded from the database, and need a subset of it for other purposes. For this kind of situation, you'd need to use Ruby's Array filtering methods.
Sidenote:
ModelName.all
should never be used; it's an anti-pattern since you don't control how many items you'll retrieve. And hopefully:
ModelName.limit(20).class #=> ActiveRecord::Relation
As you said, the latter returns an array of elements, while the former is an ActiveRecord::Relation. You can order and filter array using Ruby methods. For example, to sort by id you can call sort_by(&:id). To filter elements you can call select or reject. For ActiveRecord::Relation you can chain where or order to it, as you said.
The difference is where the sorting and processing happens. For an Array, it is done by the application; for a Relation, by the database. The latter is usually faster when there are more records. It is also more memory efficient.
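The Array-side methods mentioned above look like this in practice. Item is a hypothetical Struct standing in for records that have already been loaded into memory.

```ruby
# Item is a hypothetical stand-in for records already loaded into an Array.
Item = Struct.new(:id, :name, :active)

ITEMS = [
  Item.new(3, "gamma", true),
  Item.new(1, "alpha", false),
  Item.new(2, "beta", true)
].freeze

sorted   = ITEMS.sort_by(&:id)      # ordering done by the application
active   = ITEMS.select(&:active)   # keep the elements that match
inactive = ITEMS.reject(&:active)   # drop the elements that match

puts sorted.map(&:id).inspect       # => [1, 2, 3]
puts active.map(&:name).inspect     # => ["gamma", "beta"]
puts inactive.map(&:name).inspect   # => ["alpha"]
```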
