Application level filtering/data manipulation - ruby-on-rails

I have a model with many different children (has_many).
My app needs to do lots of different manipulations on the whole set of data. Getting the data from the database is therefore pretty simple, so I won't really need to use scopes and finders, BUT I want to do things on the data that are equivalent to, say:
named_scope :red, :conditions => { :colour => 'red' }
named_scope :since, lambda {|time| {:conditions => ["created_at > ?", time] }}
Should I be writing the equivalent methods that just manipulate the already served data? Or in a helper?
Just need a little help as most things I see all relate to querying the actual database for a subset of data, when I will require all the children of this one model, but do many different visualisations on it.

So, if I understand right, you'd like to query the whole set of data once, then select different sets of rows from it for different uses.
Named scopes won't do any caching, as they are building separate queries for each variation.
If you want a simple solution, you can just query all rows (within a single request, ActiveRecord's query cache will reuse the result for the same query), then use select to filter the rows:
Article.all.select{|a| a.colour == 'red'}
Or, one step further, you can create a general method that filters the rows based on parameters:
def self.search(options)
articles = Article.all
articles = articles.select{|a| a.colour == 'red'} if options[:red]
articles = articles.select{|a| a.created_at > options[:since]} if options[:since]
articles
end
Article.search(:red => true, :since => 2.days.ago)
Or, if you really want to keep the chainable method syntax provided by scopes, then add your filter methods to the Array class:
class Array
def red
select{|a| a.colour == 'red'}
end
end
Or, if you just don't want to add all that garbage to every Array object, you can just add them to the objects, but you'll need to override the all method and add the methods every time you're creating a subset of the rows:
def self.with_filters(articles)
def articles.red
Article.with_filters(select{|a| a.colour == 'red'})
end
articles
end
def self.all
self.with_filters(super)
end
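A less invasive variation on the last two ideas is a small wrapper object that holds the already loaded rows and returns a new wrapper from each filter, so calls can be chained without touching Array or overriding all. This is only a sketch under assumptions: Article is a plain Struct standing in for the ActiveRecord model, and ArticleSet is a hypothetical name, not something Rails provides.

```ruby
# Stand-in for the ActiveRecord model; in the app the rows would come from Article.all.
Article = Struct.new(:title, :colour, :created_at)

# Hypothetical wrapper: holds rows that were loaded once and filters them in memory.
class ArticleSet
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  def each(&block)
    @rows.each(&block)
  end

  # Each filter returns a new ArticleSet, so filters can be chained.
  def red
    self.class.new(@rows.select { |a| a.colour == 'red' })
  end

  def since(time)
    self.class.new(@rows.select { |a| a.created_at > time })
  end
end

day = 86_400 # seconds
now = Time.now
articles = ArticleSet.new([
  Article.new('old red',  'red',  now - 5 * day),
  Article.new('new red',  'red',  now - 1 * day),
  Article.new('new blue', 'blue', now - 1 * day)
])

recent_red = articles.red.since(now - 2 * day)
puts recent_red.map(&:title).inspect
```

Because ArticleSet includes Enumerable, the usual map, count and group_by calls still work on the filtered result.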

Related

Rails the best way to scope vars

I have a 'Course' model that has the following attributes:
Course
Price - float
Featured - boolean
My question is the following: I need 4 lists in my controller: recent courses, paid courses, free courses and featured courses.
Would it be good practice to write my controller as follows?
def index
@courses = Course.order(created_at: :desc)
@free_courses = []
@courses.map {|c| @free_courses << c if c.price == 0}
@premium_courses = []
@courses.map {|c| @premium_courses << c if c.price > 0}
@featured_courses = []
@courses.map {|c| @featured_courses << c if c.featured}
end
Or should I run the queries separately?
def index
@courses = Course.order(created_at: :desc)
@free_courses = Course.where("price == 0")
@premium_courses = Course.where("price > 0")
@featured_courses = Course.where(featured: true)
end
I checked the logs and the first option performs better, but I am unsure whether it is an anti-pattern.
Thanks for all!
The second approach will become faster than the first as the size of the Course table increases. The first approach has to iterate over every record in the table 4 times. The second approach creates a Relation of only the records that match the where clause, so it does less work.
Also, the second approach has the advantage of laziness. Each query is only run at the time it is used, so it can be changed further along the code path. It's more flexible.
Note that it would be an improvement to the second approach to create scopes on the Course model that handle the logic. For example, one each for courses, free_courses, premium_courses and featured_courses. This has the advantage of putting database logic in the model instead of the controller, where it can more easily be reused and maintained.
The second approach is better because when you use the .where() method, you are arranging the query in database itself rather than by the controller.
It is generally bad practice to iterate over all records in the database in Rails (i.e. Course.all.map or Course.all.each), both for performance and memory usage. As your database grows this becomes increasingly problematic. It's much better to use Course.where() methods. You'll probably want a default sort order, which you can add with one line in your model:
default_scope { order(created_at: :desc) }
Then you can just do this in controller and they'll have the sort by default:
@courses = Course.all
I would also suggest adding scopes to your model for easier access.
So in your course.rb file
scope :free, -> { where(price: 0) }
scope :premium, -> { where("price > 0") }
scope :featured, -> { where(featured: true) }
Then in your controller you can just do:
@courses = Course.all
@free_courses = Course.free
@premium_courses = Course.premium
@featured_courses = Course.featured
These scopes can also be chained if you need to combine them, so you could do things like:
@mixed_courses = Course.premium.featured
As others have explained, Model.where() performs the selection in the database by building SQL (you can even pass a raw SQL fragment, as in where("price > 0")), whereas regular Ruby Enumerable methods (.map, .select) iterate over an array whose elements must first be instantiated as Ruby objects. That's where the memory and performance hit comes from. It's OK if you're working with small data sets, but it gets ugly with any real data volume.
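For the small-data case where the courses really are already loaded, the in-memory version reads better with partition and select than with map used for its side effects. This is a sketch only: Course is a plain Struct standing in for the model, and it assumes prices are never negative, so "not free" means "premium".

```ruby
# Stand-in for the ActiveRecord model, with the attributes from the question.
Course = Struct.new(:name, :price, :featured)

courses = [
  Course.new('Intro',    0.0,  false),
  Course.new('Advanced', 49.0, true),
  Course.new('Basics',   0.0,  true)
]

# partition walks the list once: the first array holds the matches,
# the second holds everything else.
free_courses, premium_courses = courses.partition { |c| c.price == 0 }

# select returns a new array and leaves `courses` untouched.
featured_courses = courses.select(&:featured)

puts free_courses.map(&:name).inspect
puts premium_courses.map(&:name).inspect
puts featured_courses.map(&:name).inspect
```

This keeps the controller free of manually built accumulator arrays, but it still loads every row, so the scopes above remain the better choice once the table grows.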

Rails: how and where to add this method

I have an app where I retrieve a list of users from a specific country.
I did this in the UsersController:
@fromcanada = User.find(:all, :conditions => { :country => 'canada' })
and then turned it into a scope on the User model
scope :canada, where(:country => 'Canada').order('created_at DESC')
but I also want to be able to retrieve a random person or multiple persons from the country. I found this method that's supposed to be an efficient way to retrieve a random user from the database.
module ActiveRecord
class Base
def self.random
if (c = count) != 0
find(:first, :offset =>rand(c))
end
end
end
end
However, I have a few questions about how to add it, and how the syntax works.
1. Where would I put that code? Direct in the User model?
2. Syntax: so that I don't use code that I don't understand, can you explain how the syntax is working? I don't get (c = count). What is count counting? What is rand(c) doing? Is it finding the first one starting at the offset? If rand is an expensive method (hence the need to create a different more efficient random method), why use the expensive 'rand' in this new more efficient random method?
3. How could I add the call to random on my find method in the UsersController? How to add it to the scope in the model?
4. Building on question 3, is there a way to get two or three random users?
I wouldn't monkey patch that (or anything else!) into ActiveRecord, putting that into your User would make more sense.
The count is counting how many elements there are in your table and storing that number in c. Then rand(c) gives you a random integer in the interval [0,c) (i.e. 0 <= rand(c) < c). The :offset works the way you think it does.
rand isn't terribly expensive but doing order by random() inside the database can be very expensive. The random method that you're looking at is just a convenient way to get a random record/object from the database.
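The count / rand / offset mechanics can be tried out without a database. In this sketch a plain array stands in for the table rows (an assumption made purely for illustration), and drop plays the role of :offset.

```ruby
rows = %w[alice bob carol dave]    # stand-in for the rows of the users table

c      = rows.size                 # what count would return
offset = rand(c)                   # random integer with 0 <= offset < c
picked = rows.drop(offset).first   # skip `offset` rows, then take the first one

puts "offset=#{offset} picked=#{picked}"
```

Because the offset is always strictly less than the count, there is always a row left to take as long as the table is not empty.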
Adding it to your own User would look something like this:
def self.random
n = scoped.count
scoped.offset(rand(n)).first
end
That would allow you to chain random after a bunch of scopes:
u = User.canadians_eh.some_other_scope.random
but the result of random would be a single user so your chaining would stop there.
If you wanted multiple users you'd want to call random multiple times until you got the number of users you wanted. You could try this:
def self.random
n = scoped.count
scoped.offset(rand(n))
end
us = User.canadians_eh.random.limit(3)
to get three random users but the users would be clustered together in whatever order the database ended up with after your other scopes and that's probably not what you're after. If you want three you'd be better off with something like this:
# In User...
def self.random
n = scoped.count
scoped.offset(rand(n)).first
end
# Somewhere else...
scopes = User.canadians_eh.some_other_scope
users = 3.times.each_with_object([]) do |_, users|
users << scopes.random
scopes = scopes.where('id != :latest', :latest => users.last.id)
end
You'd just grab a random user, update your scope chain to exclude them, and repeat until you're done. You would, of course, want to make sure you had three users first.
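The "pick one, exclude it, repeat" loop looks like this when stripped of ActiveRecord. It is a sketch under assumptions: pick_distinct is a hypothetical helper, the array of ids stands in for the scope chain, and removing an element mirrors the extra where('id != ...') condition.

```ruby
# Hypothetical helper: picks up to `how_many` distinct elements at random.
def pick_distinct(rows, how_many, rng = Random.new)
  remaining = rows.dup                    # leave the caller's array alone
  picked = []
  how_many.times do
    break if remaining.empty?             # fewer rows than requested
    index = rng.rand(remaining.size)      # same role as offset(rand(n))
    picked << remaining.delete_at(index)  # excluded from all later picks
  end
  picked
end

user_ids = (1..10).to_a
sample = pick_distinct(user_ids, 3)
puts sample.inspect
```

The break guard covers the "make sure you had three users first" caveat by returning fewer items instead of failing. For rows that are already in memory, Ruby's built-in Array#sample(n) gives the same kind of result.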
You might want to move the ordering out of your canada scope: one scope, one task.
That code is injecting a new method into ActiveRecord::Base. I would put it in lib/ext/activerecord/base.rb. But you can put it anywhere you want.
count is a method being called on self. self will be some class inheriting from ActiveRecord::Base, eg. User. User.count returns the number of user records (sql: SELECT count(*) from users;). rand is a ruby stdlib method Kernel#rand. rand(c) returns a random integer in the Range 0...c and c was previously computed by calling #count. rand is not expensive.
You don't call random with find; User.random is itself a find: it returns one random record from all User records. In your controller you say User.random and it returns a single random record (or nil if there are no user records at all).
modify the AR::Base::random method like so:
module ActiveRecord
class Base
def self.random( how_many = 1 )
if (c = count) != 0
res = (0...how_many).inject([]) do |m,i| # exclusive range: exactly how_many picks
m << find(:first, :offset =>rand(c))
end
how_many == 1 ? res.first : res
end
end
end
end
User.random(3) # => [<User Rand1>,<User Rand2>,<User Rand3>]

Select the complement of a set

I am using Rails 3.0. I have two tables: Listings and Offers. A Listing has-many Offers. An offer can have accepted be true or false.
I want to select every Listing that does not have an Offer with accepted being true. I tried
Listing.joins(:offers).where('offers.accepted' => false)
However, since a Listing can have many Offers, this selects every listing that has non-accepted Offers, even if there is an accepted Offer for that Listing.
In case that isn't clear, what I want is the complement of the set:
Listing.joins(:offers).where('offers.accepted' => true)
My current temporary solution is to grab all of them and then do a filter on the array, like so:
class Listing < ActiveRecord::Base
...
def self.open
Listing.all.find_all {|l| l.open? }
end
def open?
!offers.exists?(:accepted => true)
end
end
I would prefer if the solution ran the filtering on the database side.
The first thing that comes to mind is to do essentially the same thing you're doing now, but in the database.
scope :accepted, lambda {
joins(:offers).where('offers.accepted' => true)
}
scope :open, lambda {
# take your accepted scope, but just use it to get at the "accepted" ids
relation = accepted.select("listings.id")
# then use select values to get at those initial ids
ids = connection.select_values(relation.to_sql)
# exclude the "accepted" records, or return an unchanged scope if there are none
ids.empty? ? scoped : where(arel_table[:id].not_in(ids))
}
I'm sure this could be done more cleanly using an outer join and grouping, but it's not coming to me immediately :-)
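The set logic behind the open scope can be checked in plain Ruby before moving it into SQL. This is only an illustration with assumed data: Listing and Offer are Structs standing in for the models, and a listing with no offers at all is treated as open.

```ruby
require 'set'

# Stand-ins for the two models.
Listing = Struct.new(:id)
Offer   = Struct.new(:listing_id, :accepted)

listings = [Listing.new(1), Listing.new(2), Listing.new(3)]
offers = [
  Offer.new(1, false),
  Offer.new(1, true),   # listing 1 has an accepted offer
  Offer.new(2, false)   # listing 2 only has a non-accepted offer
]                       # listing 3 has no offers at all

# Ids of listings with at least one accepted offer (the "accepted" scope).
accepted_ids = offers.select(&:accepted).map(&:listing_id).to_set

# The complement: every listing whose id is not in that set.
open_listings = listings.reject { |l| accepted_ids.include?(l.id) }

puts open_listings.map(&:id).inspect
```

Note that listing 1 is excluded even though it also has a non-accepted offer, which is exactly what the plain joins(:offers).where('offers.accepted' => false) query gets wrong.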

How do I calculate the most popular combination of a order lines? (or any similar order/order lines db arrangement)

I'm using Ruby on Rails. I have a couple of models which fit the normal order/order lines arrangement, i.e.
class Order
has_many :order_lines
end
class OrderLine
belongs_to :order
belongs_to :product
end
class Product
has_many :order_lines
end
(greatly simplified from my real model!)
It's fairly straightforward to work out the most popular individual products via order line, but what magical ruby-fu could I use to calculate the most popular combination(s) of products ordered.
Cheers,
Graeme
My suggestion is to create an array a of Product.id numbers for each order and then do the equivalent of
h = Hash.new(0)
# for each a
h[a.sort.hash] += 1
You will naturally need to consider the scale of your operation and how much you are willing to approximate the results.
External Solution
Create a "Combination" model and index the table by the hash, then each order could increment a counter field. Another field would record exactly which combination that hash value referred to.
In-memory Solution
Look at the last 100 orders and recompute the order popularity in memory when you need it. Sorting the hash by its values (for example h.sort_by { |_, count| -count }) will give you the hash values ordered by popularity; a plain Hash#sort would order the pairs by key instead. You could either make a composite object that remembered what order combination was being counted, or just scan the original data looking for the hash value.
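A minimal sketch of the in-memory count follows. It deliberately varies the suggestion above: the sorted array of product ids is used directly as the hash key instead of its .hash value, since arrays with equal contents are equal hash keys and the combination then does not have to be looked up again. The orders data is invented for illustration.

```ruby
# order id => product ids on that order (made-up sample data)
orders = {
  1 => [3, 1],
  2 => [1, 3],
  3 => [2],
  4 => [1, 3, 2]
}

counts = Hash.new(0)
orders.each_value do |product_ids|
  # uniq + sort normalises the basket, so [3, 1] and [1, 3] count as one combination
  counts[product_ids.uniq.sort] += 1
end

# Most popular combinations first.
ranked = counts.sort_by { |_combo, count| -count }
top_combo, top_count = ranked.first

puts "#{top_combo.inspect} ordered #{top_count} times"
```

Whether a basket should also count towards its sub-combinations (here [1, 3, 2] does not add to [1, 3]) is a design decision this sketch leaves open.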
Thanks for the tip digitalross. I followed the external solution idea and did the following. It varies slightly from the suggestion as it keeps a record of individual order_combos, rather than storing a counter so it's possible to query by date as well e.g. most popular top 10 orders in the last week.
I created a method in my order which converts the list of order items to a comma separated string.
def to_s
# use product ids (not order line ids) so identical baskets give identical strings
order_lines.map { |ol| ol.product_id }.sort.join(",")
end
I then added a callback so the combo is created every time an order is saved.
after_save :create_order_combo
def create_order_combo
oc = OrderCombo.create(:user => user, :combo => self.to_s)
end
And finally my OrderCombo class looks something like below. I've also included a cached version of the method.
class OrderCombo
belongs_to :user
scope :by_user, lambda{ |user| where(:user_id => user.id) }
def self.top_n_orders_by_user(user,count=10)
OrderCombo.by_user(user).count(:group => :combo).sort { |a,b| a[1] <=> b[1] }.reverse[0..count-1]
end
def self.cached_top_orders_by_user(user,count=10)
Rails.cache.fetch("order_combo_#{user.id}_#{count}", :expires_in => 10.minutes) { OrderCombo.top_n_orders_by_user(user, count) }
end
end
It's not perfect as it doesn't take into account increased popularity when someone orders more of one item in an order.

Paginate through a randomized list of blog posts using will_paginate

I want to give users the ability to page through my blog posts in random order.
I can't implement it like this:
@posts = Post.paginate :page => params[:page], :order => 'RANDOM()'
since the :order parameter is called with every query, and therefore I risk repeating blog posts.
What's the best way to do this?
RAND accepts a seed in MySQL:
RAND(N)
From the MySQL docs:
RAND(), RAND(N)
Returns a random floating-point value v in the range 0 <= v < 1.0. If a constant integer argument N is specified, it is used as the seed value, which produces a repeatable sequence of column values. In the following example, note that the sequences of values produced by RAND(3) is the same both places where it occurs.
Other databases should have similar functionality.
If you use the SAME seed each time you call RAND, the order will be consistent across requests and you can paginate accordingly.
You can then store the seed in the user's session - so each user will see a set of results unique to them.
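The same idea can be illustrated in plain Ruby, independent of the database: a fixed seed gives a repeatable shuffle, so each page is a stable slice of one ordering. This is a sketch under assumptions: shuffled_page is a hypothetical helper, the id list stands in for the posts table, and the seed is the value that would be generated once and kept in the session.

```ruby
# Hypothetical helper: returns the ids for one page of a seeded random ordering.
# Pages are numbered from 1.
def shuffled_page(ids, seed, page, per_page)
  order = ids.shuffle(random: Random.new(seed))      # same seed, same order
  order.slice((page - 1) * per_page, per_page) || [] # offset/limit on that order
end

post_ids = (1..10).to_a
seed     = 12_345   # would live in the user's session

page1 = shuffled_page(post_ids, seed, 1, 3)
page2 = shuffled_page(post_ids, seed, 2, 3)

puts page1.inspect
puts page2.inspect
```

Since every request rebuilds the same ordering from the seed, no post appears on two pages; a different seed per user gives each user their own order.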
To avoid each page (generated from a new request) potentially having a repeated post you'll need to store the order of posts somewhere for retrieval over multiple requests.
If you want each user to have a unique random order then save the order in a session array of IDs.
If you don't mind all users having the same random order then have a position column in the posts table.
You could use :order => 'RANDOM()' on the original query that populates @posts, and then not specify the order when you paginate.
Create a named scope on your Post model that encapsulates the random behaviour:
class Post < ActiveRecord::Base
named_scope :random, :order => 'RANDOM()'
.
.
.
end
Your PostsController code then becomes:
@posts = Post.random.paginate :page => params[:page]

Resources