Random record in Rails - ruby-on-rails

I am building a web application (online word learning) that lets the user choose the correct meaning of a word. When they click Start, it selects one word at random from the database and shows it to the user. After the user chooses an answer, it moves on to the next question.
If I use Word.order("rand()").limit(1), can the word that was just shown come up again?
Is there a better way to solve this for an app like this?

I would add the following scopes to the model (depends on the database you are using):
# in app/models/word.rb
# 'RANDOM' works with postgresql and sqlite, whereas mysql uses 'RAND'
scope :random, -> { order('RAND()') }
scope :without, ->(ids) { where.not(id: ids) }
With those scopes you can write the following query in your controller:
@word = Word.random.without(params[:last_ids]).first
When you want to load new random elements in the view, just add the ids of the current words to the request. This ensures that those ids (params[:last_ids]) are not chosen again.

Long story short: to avoid repeating words, you have to store them somewhere, either the ones that are yet to be shown or the ones that have already been displayed. If I were you, I would go one of the following routes:
Fetch all the words before starting the quiz and randomize them. This could be something like:
session[:words] = Word.order("RAND()").limit(10).pluck(:id)
Or even better by defining a scope for your random words:
class Word < ActiveRecord::Base
  # ...
  # a scope should return a relation, so the ids are plucked by the caller
  scope :random_quiz, -> { order("RAND()").limit(10) }
  # ...
end
# ... in the controller when the quiz is getting started:
session[:words] = Word.random_quiz.pluck(:id)
# ... in the controller when you want to show the word:
new_word = Word.find(session[:words].pop)
As ORDER BY RAND() is a very expensive operation, it makes sense to run it once up front rather than once per question. You then pop the word IDs one by one with session[:words].pop and present the questions.
This guarantees that you won't repeat words within the quiz, and it performs well.
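To make the pop-based flow concrete, here is a minimal plain-Ruby sketch. It assumes a Hash standing in for the Rails session and a hard-coded list of ids in place of the database query; start_quiz and next_word_id are hypothetical helper names, not part of Rails.

```ruby
# Hypothetical helpers; `session` is a plain Hash standing in for the Rails session.
def start_quiz(session, word_ids)
  # Shuffle once up front so later questions need no random query.
  session[:words] = word_ids.shuffle
end

def next_word_id(session)
  # pop removes and returns the last id, or nil once the quiz is exhausted.
  session[:words].pop
end

session = {}
start_quiz(session, [11, 12, 13, 14, 15]) # ids assumed to come from the database

asked = []
while (id = next_word_id(session))
  asked << id
end
puts asked.inspect # every id appears exactly once, in random order
```

Because each id is removed from the list as it is used, a repeat is impossible until the quiz is restarted.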
Fetch words one by one as you're progressing with giving out the questions and save the ones you've already asked about.
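class Word < ActiveRecord::Base
  # ...
  def self.random_word(exclusions)
    # where.not copes with an empty array; 'id NOT IN (NULL)' would match nothing
    eligible = where.not(id: exclusions)
    count = eligible.count
    return nil if count.zero? # every word has already been shown
    # rand(count) returns 0...count, so the offset always points at an existing row
    eligible.offset(rand(count)).first
  end
  # ...
end
# ... in the controller when you need a new word:
session[:words_shown] ||= []
new_word = Word.random_word(session[:words_shown])
# mark the word as shown:
session[:words_shown].push(new_word.id)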
You might have noticed the unusual way of getting a random record in the second example. It turns out to be more efficient, as it generates the following query:
SELECT * FROM words LIMIT 1 OFFSET _random_number_
Instead of:
SELECT * FROM words ORDER BY RAND() LIMIT 1
The first one is just an ordinary select, while the second one requires an unindexed sort of the entire table by RAND() before giving you that random result. The former turned out to be almost tenfold faster than the latter.
Hope that makes sense!
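The same "pick a random offset among the rows not yet shown" idea can be sketched in plain Ruby. This is only an in-memory illustration that assumes the ids are already in an array; random_unseen is a hypothetical helper name.

```ruby
# Pick one id at random from the ids that have not been shown yet.
def random_unseen(all_ids, shown_ids)
  eligible = all_ids - shown_ids
  return nil if eligible.empty?
  # rand(n) returns an integer in 0...n, so the index is always valid.
  # rand(0..n) could return n, which points one past the last element.
  eligible[rand(eligible.length)]
end

all_ids = [1, 2, 3, 4]
shown = []
while (id = random_unseen(all_ids, shown))
  shown << id # mark as shown so it cannot be picked again
end
puts shown.inspect
```

The index range is the detail to watch: an offset equal to the row count selects nothing.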

Related

Ruby on Rails - User input for sort ordering

In order to learn Ruby on Rails I am writing a web app that will be used to sort teams within a tournament given their performance to date.
The complication is that I want each tournament organiser (system user) to be able to use a variety of metrics in an arbitrary order.
Expressed as SQL (my background) I want User 1 to be able to choose:
ORDER BY
METRIC1
,METRIC2
,METRIC3
Whilst User 2 could choose:
ORDER BY
METRIC2
,METRIC3
,METRIC1
How would I accept this user input and use it to create a query on the Team table?
Edit 1 Neglected to mention (sorry) that the metrics themselves are calculated on the fly. Currently they are instance methods (e.g. @team.metric1 etc). The abortive attempts I have made so far all involve trying to convert user strings to method names, which just seems wrong (and I haven't been able to get it to work).
Edit 2 some example code in teams_controller.rb:
class Team < ApplicationRecord
belongs_to :tournament
has_many :matches
def score_for
matches.sum(:score_for)
end
def score_diff
matches.sum(:score_for) - matches.sum(:score_against)
end
end

ActiveRecord allows multiple arguments to be passed to the order method. So you could do something like:
Team.order(:metric2, :metric3, metric1: :desc)
Another option is to use ActiveRecord to construct the query dynamically. ActiveRecord queries are lazily evaluated, so the SQL won't be executed until you call an operation that requires loading the records.
For example you could construct a scope on Team like this:
class Team < ApplicationRecord
  # Chain one order call per metric; each call returns a new relation,
  # so the result has to be carried forward rather than discarded
  scope :custom_order, lambda { |sorting_order|
    sorting_order.inject(all) { |relation, metric| relation.order(metric) }
  }
end
You would then just need to pass a collection of attributes in the order you want the ORDER BY clauses to be applied. For example:
Team.custom_order([:metric2, :metric3, :metric1])

A working but probably awful solution:
class Tournament < ApplicationRecord
has_many :teams
serialize :tiebreaker, Array
TIEBREAKER_WHITELIST = %w[score opponent_score possession].freeze
def sorted_teams
list = teams.shuffle
(tiebreaker & TIEBREAKER_WHITELIST).reverse.each do |metric|
list = list.sort_by { |team| [team.send(metric), list.find_index(team)] }
end
list.reverse
end
end
Each tournament has many teams. A tournament instance has a serialized field called tiebreaker. This contains an array of strings, something like ["score", "possession"], where each string matches the name of a public instance method on team. Each of these methods returns a number.
The tiebreaker field is in descending order of precedence, so for the above example I would only expect possession to affect sorting for teams with an equal score.
list = teams.shuffle - this randomises the list to start with, in case teams are tied for all of the following tiebreakers.
(tiebreaker & TIEBREAKER_WHITELIST) - this returns only strings that appear in both the tiebreaker field and the whitelist constant, to protect against end users running arbitrary methods. Array#& keeps the order of its left-hand operand, so tiebreaker goes first to preserve the organiser's precedence.
.reverse.each do |metric| - this reverses the array of metrics so that the list is sorted by the lowest precedence metric first.
[team.send(metric), list.find_index(team)] - this is the sort for each metric. send turns the string into a method call. I found find_index was necessary to preserve the sort order from previous sorts, i.e. if I had first sorted for possession this would preserve the order for teams with the same score.
list.reverse - reverse the list then return it. This was because I wanted higher scoring/possession teams first on my list and sort_by sorts ascending.
I wanted some metrics sorted ascending (opponent_score) and others descending (score) so I handled this in the respective methods, returning negative values for opponent_score for example.
I'm not entirely happy with the solution as is but it does seem to work!
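A possibly simpler variant, offered as a sketch rather than a drop-in replacement: sort_by compares array keys element by element, so one pass with an array of metric values does the same job as the repeated sorts. The Team struct, ALLOWED_METRICS and sort_teams are hypothetical names for illustration, and every metric is negated here so that higher values come first.

```ruby
# Stand-in for the model; each metric is a public method returning a number.
Team = Struct.new(:name, :score, :opponent_score, :possession)

ALLOWED_METRICS = %w[score opponent_score possession].freeze

# Sort descending on each allowed metric, in the order the user gave.
def sort_teams(teams, tiebreaker)
  # Array#& keeps the order of the left operand and drops unknown names.
  metrics = tiebreaker & ALLOWED_METRICS
  # shuffle first so teams tied on every metric end up in random order
  teams.shuffle.sort_by { |team| metrics.map { |m| -team.public_send(m) } }
end

teams = [
  Team.new("A", 3, 1, 50),
  Team.new("B", 3, 1, 60),
  Team.new("C", 5, 9, 10)
]
puts sort_teams(teams, %w[score possession destroy]).map(&:name).inspect
```

public_send only calls public methods, which narrows what a user-supplied string can reach even before the whitelist is applied.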

Extract records which satisfy a model function in Rails

I have following method in a model named CashTransaction.
def is_refundable?
self.amount > self.total_refunded_amount
end
def total_refunded_amount
self.refunds.sum(:amount)
end
Now I need to extract all the records which satisfy the above function i.e records which return true.
I got that working by using following statement:
CashTransaction.all.map { |x| x if x.is_refundable? }
But the result is an Array. I am looking for ActiveRecord_Relation object as I need to perform join on the result.
I feel I am missing something here as it doesn't look that difficult. Anyways, it got me stuck. Constructive suggestions would be great.
Note: only amount is a column on CashTransaction.
EDIT
Following SQL does the job. If I can change that to ORM, it will still do the job.
SELECT `cash_transactions`.* FROM `cash_transactions` INNER JOIN `refunds` ON `refunds`.`cash_transaction_id` = `cash_transactions`.`id` WHERE (cash_transactions.amount > (SELECT SUM(`amount`) FROM `refunds` WHERE refunds.cash_transaction_id = cash_transactions.id GROUP BY `cash_transaction_id`));
Sharing Progress
I managed to get it work by following ORM:
CashTransaction
.joins(:refunds)
.group('cash_transactions.id')
.having('cash_transactions.amount > sum(refunds.amount)')
But what I was actually looking for was something like:
CashTransaction.joins(:refunds).where(is_refundable? : true)
where is_refundable? is a model function. Initially I thought declaring is_refundable? with attr_accessor would work, but I was wrong.
Just a thought: can the problem be solved in an elegant way using Arel?

There are two options.
1) Finish what you have started (which is extremely inefficient with larger amounts of data, since everything is loaded into memory before processing):
CashTransaction.all.select(&:is_refundable?) # like what you've written, but shorter and without the nil entries
So get the ids:
ids = CashTransaction.all.select(&:is_refundable?).map(&:id)
And now, to get an ActiveRecord relation:
CashTransaction.where(id: ids) # will return a relation
2) Move the calculation to SQL:
CashTransaction.where('amount > total_refunded_amount')
The second option is faster and more efficient in every way.
When you deal with a database, try to do the processing at the database level, with as little Ruby involvement as possible.
EDIT
According to edited question here is how you would achieve the desired result:
CashTransaction.joins(:refunds).group('cash_transactions.id').having('cash_transactions.amount > SUM(refunds.amount)') # aggregates belong in HAVING, not WHERE
EDIT #2
As to the updates in your question: I don't really understand why you have latched onto using is_refundable?, an instance method, in a query, which is basically not possible in AR, but...
My suggestion is to create a scope is_refundable:
scope :is_refundable, -> {
  joins(:refunds)
  .group('cash_transactions.id')
  .having('cash_transactions.amount > sum(refunds.amount)')
}
Now it is available with a notation as short as
CashTransaction.is_refundable
which is shorter and clearer than what you were aiming for:
CashTransaction.where('is_refundable = ?', true)
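One thing worth checking with the join-based versions: an inner join only returns transactions that have at least one refund, whereas is_refundable? treats a transaction with no refunds as refundable (the sum of no refunds is 0). The plain-Ruby sketch below mirrors the grouped comparison in memory to show the intended result; the structs and refundable_ids are hypothetical stand-ins for the models, not ActiveRecord code.

```ruby
# In-memory stand-ins for the two tables.
CashTransaction = Struct.new(:id, :amount)
Refund = Struct.new(:cash_transaction_id, :amount)

# Ids of transactions whose amount exceeds the total refunded so far.
def refundable_ids(transactions, refunds)
  # Hash.new(0) makes the refunded total default to 0 for transactions
  # without refunds, which is what a LEFT JOIN with COALESCE would give.
  refunded = Hash.new(0)
  refunds.each { |r| refunded[r.cash_transaction_id] += r.amount }
  transactions.select { |t| t.amount > refunded[t.id] }.map(&:id)
end

transactions = [
  CashTransaction.new(1, 100), # fully refunded
  CashTransaction.new(2, 50),  # partly refunded
  CashTransaction.new(3, 10)   # never refunded
]
refunds = [Refund.new(1, 30), Refund.new(1, 70), Refund.new(2, 20)]
puts refundable_ids(transactions, refunds).inspect
```

If never-refunded transactions should be included, the SQL version would need an outer join rather than joins(:refunds).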

You can do it this way:
cash_transactions = CashTransaction.all.select(&:is_refundable?) # Array, without nil entries
CashTransaction.where(id: cash_transactions.map(&:id)) # ActiveRecord_Relation
But this is an inefficient way of doing it, as the other answerers also mentioned.
If amount and total_refunded_amount are both columns of the cash_transactions table, you can do it in SQL, which will be much more efficient:
CashTransaction.where('amount > total_refunded_amount')
But if amount or total_refunded_amount is not an actual column in the database, you can't do it this way. Then I guess you have to do it the other way, which is less efficient than using raw SQL.

I think you should pre-compute the is_refundable result (in a new column) whenever a CashTransaction or one of its refunds (presumably has_many?) is updated, by using callbacks:
class CashTransaction
  before_save :update_is_refundable
  def update_is_refundable
    # self. is required; a bare `is_refundable = ...` would only set a local variable
    self.is_refundable = amount > total_refunded_amount
  end
  def total_refunded_amount
    self.refunds.sum(:amount)
  end
end
class Refund
  belongs_to :cash_transaction
  after_save :update_cash_transaction_is_refundable
  def update_cash_transaction_is_refundable
    cash_transaction.update_is_refundable
    cash_transaction.save!
  end
end
Note: the above code could certainly be optimized to avoid some queries.
Then you can query the is_refundable column:
CashTransaction.where(is_refundable: true)

I think it's not bad to do this on two queries instead of a join table, something like this
def refundable
where('amount < ?', total_refunded_amount)
end
This will do a single sum query then use the sum in the second query, when the tables grow larger you might find that this is faster than doing a join in the database.

Ruby on Rails: Return name if table value is between a specified range

I have three tables (that are relevant to this problem). One table is called organizations.
I also have a table called organization_details, which contains organization_id and multi-row information about the organization.
I work in the event industry, so the organization_details table contains a column called total_attendance, where a person can input an integer of the org's attendance for a certain year.
The third table is called divisions. This has five rows total, with columns division_smallest and division_largest (referring to the attendance range). Each row has a range to separate which division an organization should belong to according to their most recent attendance record.
For example, one row in the division table shows a division_smallest equal to 1 and a division_largest equal to 100000 (again, referring to attendance). Finally, the division table also has a name column (e.g. "Division 1").
I want the app to automatically figure out which division an organization belongs to according to their most recent total_attendance. Ideally, the division's name would display in the organization index and show pages.
I'd like to make a custom method for this, but am unsure how best to tackle it. I've read a little bit about .between? as in (possibly) .between?(division.division_smallest, division.division_largest) return "#{division.name}"
...But I am not sure how the entire method would work or if I need to steer away from that entirely. I would greatly appreciate any insight into this!

My suggestion is to add the following method to organization.rb
def division_name
last_details = organization_details.order('created_at DESC').first
if last_details.present?
Division.where(':attendance >= division_smallest AND :attendance <= division_largest', attendance: last_details.total_attendance).first.name
else
"None"
end
end
The code first grabs the organization details that have been created most recently. If the organization has organization details it uses the attendance value to select the appropriate division and it returns that division's name. If the organization doesn't have any organization_details it returns the string "None". You may also want to handle the case where the attendance isn't inside of the range on any of the divisions you have defined.
I hope this points you in the right direction.
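Since the question mentions .between?, here is a minimal plain-Ruby sketch of the same lookup, including the case where no division matches. It assumes the five division rows are already loaded into memory; the Division struct and division_name_for are hypothetical names for illustration.

```ruby
# In-memory stand-in for a row of the divisions table.
Division = Struct.new(:name, :division_smallest, :division_largest)

# Name of the division whose range contains the attendance, or "None".
def division_name_for(attendance, divisions)
  return "None" if attendance.nil? # organization has no details yet
  match = divisions.find do |d|
    # between? is inclusive at both ends, like SQL BETWEEN
    attendance.between?(d.division_smallest, d.division_largest)
  end
  match ? match.name : "None"
end

divisions = [
  Division.new("Division 1", 1, 100_000),
  Division.new("Division 2", 100_001, 500_000)
]
puts division_name_for(100_000, divisions)
puts division_name_for(750_000, divisions)
```

With only five rows, loading the divisions once and matching in Ruby avoids a query per organization on the index page.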

A naive implementation might look something like this:
class Division
  def self.for_attendance(total)
    # first does not take conditions, so filter with where and then take the first match
    where('? BETWEEN divisions.division_smallest AND divisions.division_largest', total).first
  end
end
class Organization
  def latest_division
    Division.for_attendance(organization_details.last.try(:total_attendance))
  end
end
Now calling some_organization.latest_division will pull the latest division for that organization. This is great for a 'show' page, but will run you into trouble when you have an 'index' with many Organizations - these 2 queries will need to run for each Organization (an N+1 problem). Instead use this:
class Division
  def self.merge_latest!(organizations)
    # od2 is a later detail row for the same organization; none exists for the latest row
    left_join = "LEFT JOIN organization_details od2 ON organization_details.organization_id = od2.organization_id AND organization_details.created_at < od2.created_at"
    # assumes the model for organization_details is named OrganizationDetail
    subquery = OrganizationDetail.where(organization_id: organizations.map(&:id)).
      joins(left_join).
      where(od2: {id: nil}).to_sql
    divisions = joins("INNER JOIN (#{subquery}) AS t ON t.total_attendance BETWEEN divisions.division_smallest AND divisions.division_largest").
      select('divisions.*, t.organization_id')
    organizations.each { |org| org.latest_division = divisions.detect { |d| d.organization_id == org.id } }
  end
end
class Organization
  attr_accessor :latest_division
end
Now you can call Division.merge_latest!(organizations) to collect the latest division for all the organizations in a single query, addressable via an organization's :latest_division attribute.

Why is Foo.first returning the last record?

I have 2 records in Foo, with ids 1 and 2, created in that order. Bear in mind that in Postgres, records have no inherent order.
In the Rails console, Foo.first and Foo.last both return the last record. I was under the impression that Foo.first would return the first record.
Here's the catch. The SQL queries look like:
SELECT "foos".* FROM "foos" LIMIT 1
SELECT "foos".* FROM "foos" ORDER BY "foos"."id" DESC LIMIT 1
The second query (Foo.last) has an ORDER BY DESC. So why doesn't AR have an ORDER BY ASC for .first? What's the logic behind this? It seems a bit "inconsistent".
I can easily solve this by doing Foo.order('id ASC').first instead, but I am looking for an explanation.

There isn't any logic to it, if there was any sense to first (or last for that matter), then it would raise an exception if you neglected to specify an explicit order either as an argument to first or as part of the current scope chain. Neither first nor last make any sense whatsoever in the context of a relational database unless there is an explicit ordering specified.
My guess is that whoever wrote first assumed that order by whatever_the_pk_is was implicit if there was no explicit order by. Then they probably did some experiments to empirically verify their assumption and it just happened to work as they expected with the particular tables and databases that they checked with (mini-rant: this is why you never ever assume unspecified behavior; if a particular behavior isn't explicitly specified, don't assume it even if the current implementation behaves that way or if empirical evidence suggests that it behaves that way).
If you trace through a simple M.first, you'll find that it does this:
limit(1).to_a[0]
No explicit ordering so you get whatever random ordering the database feels like using, that could be order by pk or it could be the table's block order on disk. If you trace through M.last, you'll get to find_last:
def find_last
#...
reverse_order.limit(1).to_a[0]
#...
end
And reverse_order:
def reverse_order
relation = clone
relation.reverse_order_value = !relation.reverse_order_value
relation
end
The @reverse_order_value instance variable isn't initialized so it will start out as nil and a ! will turn it into a true. And if you poke around for how @reverse_order_value is used, you'll get to reverse_sql_order:
def reverse_sql_order(order_query)
order_query = ["#{quoted_table_name}.#{quoted_primary_key} ASC"] if order_query.empty?
#...
and there's the author's invalid assumption about ordering laid bare for all to see. That line should probably be:
raise 'Specify an order you foolish person!' if order_query.empty?
I'd recommend that you always use .order(...).limit(1).first instead of first or last so that everything is nice and explicit; of course, if you wanted last you'd reverse the .order condition. Or you could always say .first(:order => :whatever) and .last(:order => :whatever) to again make everything explicit.

In Rails 4+, if you don't define any order, the records are sorted by primary key.
# Find the first record (or first N records if a parameter is supplied).
# If no order is defined it will order by primary key.
#
# Person.first # returns the first object fetched by SELECT * FROM people
# Person.where(["user_name = ?", user_name]).first
# Person.where(["user_name = :u", { u: user_name }]).first
# Person.order("created_on DESC").offset(5).first
# Person.first(3) # returns the first three objects fetched by SELECT * FROM people LIMIT 3
def first(limit = nil)
if limit
if order_values.empty? && primary_key
order(arel_table[primary_key].asc).limit(limit).to_a
else
limit(limit).to_a
end
else
find_first
end
end
Source: https://github.com/rails/rails/blob/4-0-stable/activerecord/lib/active_record/relation/finder_methods.rb#L75-L82

Subquery in Rails report generation

I'm building a report in a Ruby on Rails application and I'm struggling to understand how to use a subquery.
Each 'Survey' has_many 'SurveyResponses' and it is simple enough to retrieve these however I need to group them according to one of the fields, 'jobcode', as I only want to report the information relating to a single jobcode in one line in the report.
However I also need to know the constituent data that makes up the totals for that jobcode. The reason for this is that I need to calculate data such as medians and standard deviations and so need to know the values that make the total.
My thinking is that I retrieve the distinct jobcodes that were reported on for the survey and then as I loop through these I retrieve the individual responses for each jobcode.
Is this the correct way to do this or should I follow a different method?

You could use a named scope to simplify getting the groups of responses:
named_scope :job_group, lambda{|job_code| {:conditions => ["job_code = ?", job_code]}}
Put that in your response model, and use it like this:
job.responses.job_group('some job code')
and you'll get an array of responses. If you're looking to get the mean of the values of one of the attributes on the responses, you can use map:
r = job.responses.job_group('some job code')
r.map(&:total)
=> [1, 5, 3, 8]
Alternatively, you might find it quicker to write custom SQL in order to get the mean / average / sum of groups of attributes. Going through rails for this sort of work may cause significant lag.
ActiveRecord::Base.connection.execute("Custom SQL here")
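For the medians and standard deviations the question asks about, the per-jobcode values can be grouped and summarised in plain Ruby once they are loaded. This is a sketch under assumptions: the Response struct stands in for the response model, total is the attribute being reported on, and the standard deviation is the population form (dividing by n).

```ruby
# Stand-in for a survey response row.
Response = Struct.new(:jobcode, :total)

def median(values)
  sorted = values.sort
  mid = sorted.length / 2
  # odd count: middle value; even count: mean of the two middle values
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Population standard deviation (divides by n, not n - 1).
def std_dev(values)
  mean = values.sum.to_f / values.length
  Math.sqrt(values.sum { |v| (v - mean)**2 } / values.length)
end

# One report line per jobcode, built from the individual responses.
def report(responses)
  responses.group_by(&:jobcode).map do |jobcode, group|
    totals = group.map(&:total)
    { jobcode: jobcode, sum: totals.sum, median: median(totals), std_dev: std_dev(totals) }
  end
end

responses = [
  Response.new("A", 1), Response.new("A", 5),
  Response.new("A", 3), Response.new("A", 8),
  Response.new("B", 2), Response.new("B", 2)
]
report(responses).each { |line| puts line.inspect }
```

group_by loads everything in one query's worth of data and avoids a separate query per jobcode inside the loop.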

You can also use Model.find_by_sql()
For example:
class User < ActiveRecord::Base
  # Your usual AR model
end
...
def index
  @users = User.find_by_sql "select * from users"
  # etc
end