Ruby on Rails Demographic Data - ruby-on-rails

I made a site for a PS3 game and I have quite a lot of users. I want to run tournaments based on people's locations and would also like to target age groups. When users sign up they enter their date of birth in the format YYYY-MM-DD. I am pulling the data and turning it into a hash like so:
# Site.rb
has_many :members
def ages
ages = {"Under 18" => 0, "19-24" => 0, "25-35" => 0, "36-50" => 0, "51-69" => 0,"70+" => 0}
ages_results = self.members.count("DATE_FORMAT(dob, '%Y')", :group =>"DATE_FORMAT(dob, '%Y')")
ages_results.each do |k,v|
k = k.to_i
if k.between?(18.years.ago.strftime("%Y").to_i, 0.years.ago.strftime("%Y").to_i)
ages["Under 18"] += v
elsif k.between?(24.years.ago.strftime("%Y").to_i, 19.years.ago.strftime("%Y").to_i)
ages["19-24"] += v
elsif k.between?(35.years.ago.strftime("%Y").to_i, 25.years.ago.strftime("%Y").to_i)
ages["25-35"] += v
elsif k.between?(50.years.ago.strftime("%Y").to_i, 36.years.ago.strftime("%Y").to_i)
ages["36-50"] += v
elsif k.between?(69.years.ago.strftime("%Y").to_i, 51.years.ago.strftime("%Y").to_i)
ages["51-69"] += v
elsif k <= 70.years.ago.strftime("%Y").to_i # born 70 or more years ago
ages["70+"] += v
end
end
ages
end
I am not an expert Ruby developer and I'm not sure whether the above approach is good or whether it can be done in a much better way. Could anyone give me some advice about this?
Cheers

A couple of things to note in your code:
you disregard the month and day on which a user was born
you convert to and from strings unnecessarily:
50.years.ago.strftime("%Y").to_i
could be written as
50.years.ago.year
hard-coded values are scattered all over the code
I would start rewriting by finding an adequate method for calculating exact age. This one seems to be ok:
require 'date'
def age(dob)
now = Time.now.utc.to_date
now.year - dob.year - ((now.month > dob.month || (now.month == dob.month && now.day >= dob.day)) ? 0 : 1)
end
Then I would extract age table to a separate structure, to be able to change it easily, if needed, and have it visually together:
INF = 1/0.0 # convenient infinity
age_groups = {
(0..18) => 'Under 18',
(19..24) => '19-24',
(25..35) => '25-35',
(36..50) => '36-50',
(51..69) => '51-69',
(70..INF) => '70+'
}
Next you can take as the input the array of users' birth dates:
users_dobs = [Date.new(1978,4,16), Date.new(2001,6,13), Date.new(1980,10,22)]
Then find a suitable way to group them based on your map, say using each_with_object:
p users_dobs.each_with_object({}) {|dob, result|
age_group = age_groups.keys.find{|ag| ag === age(dob)}
result[age_group] ||= 0
result[age_group] += 1
}
#=>{25..35=>2, 0..18=>1}
or, perhaps, using group_by
p users_dobs.group_by{|dob|
age_groups.keys.find{|ag| ag === age(dob)}
}.map{|k,v| [age_groups[k], v.count]}
#=>[["25-35", 2], ["Under 18", 1]]
etc.
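Putting the pieces above together, here is a minimal sketch in plain Ruby. The names AGE_GROUPS, age_on and age_histogram are made up for illustration, and the reference date is passed in as a parameter (an assumption, not something from the question) so that the result does not depend on the day the code is run.

```ruby
require 'date'

# Age brackets kept in one place; bounds are inclusive and the last one is open-ended.
AGE_GROUPS = {
  (0..18)               => 'Under 18',
  (19..24)              => '19-24',
  (25..35)              => '25-35',
  (36..50)              => '36-50',
  (51..69)              => '51-69',
  (70..Float::INFINITY) => '70+'
}.freeze

# Exact age in whole years on the given day (hypothetical helper).
def age_on(dob, today = Date.today)
  had_birthday = today.month > dob.month ||
                 (today.month == dob.month && today.day >= dob.day)
  today.year - dob.year - (had_birthday ? 0 : 1)
end

# Counts birth dates per bracket label (hypothetical helper).
def age_histogram(dobs, today = Date.today)
  dobs.each_with_object(Hash.new(0)) do |dob, counts|
    age = age_on(dob, today)
    _range, label = AGE_GROUPS.find { |range, _| range.cover?(age) }
    counts[label] += 1
  end
end
```

In the Rails model you could feed it something like members.pluck(:dob), assuming dob is stored as a date column.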

Related

Does there exist a gem to parse human numbers?

There is a helper #number_to_human to print large numbers, but is there an opposite helper to parse large numbers, similar to strtotime()?
I found no specific search results, and Ruby Toolbox is dead.
A bonus would be to accept a locale, to handle "," and "." as thousands and decimal separators.
I would like to parse things like
$1m
$15 million
999 thousand
$999k
$111 M
1,234,567.89
€987.654,00
$1.1 billion
I found something and customized it.
def human_to_number(human)
return human unless human.is_a? String
return human if human.blank? # leave '' as is
human = human.downcase # work on a copy; downcase! would mutate the caller's string
if human.index('k') || human.index('thousand')
multiplier = 1000
elsif human.index('m')
multiplier = 1_000_000
elsif human.index('b')
multiplier = 1_000_000_000
elsif human.index('t')
multiplier = 1_000_000_000_000
else
multiplier = 1
end
number = human.gsub(/[^0-9\.]/,'').to_f
number = number * multiplier
end
irb(main):003:0> d.human_to_number '$1.2 million'
=> 1200000.0
irb(main):004:0> d.human_to_number '$1.2 billion'
=> 1200000000.0
irb(main):005:0> d.human_to_number '$1.2k'
=> 1200.0
irb(main):006:0> d.human_to_number '1.2k'
=> 1200.0
irb(main):007:0> d.human_to_number '555.66k'
=> 555660.0
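For the locale bonus, here is a rough sketch of one way to handle the separators. It is not from any gem: parse_human_number, MULTIPLIERS and the decimal_mark keyword are all names invented for this example, and it only understands a number followed by an optional suffix word.

```ruby
# Hypothetical suffix table; extend it as needed.
MULTIPLIERS = {
  'k' => 1_000,             'thousand' => 1_000,
  'm' => 1_000_000,         'million'  => 1_000_000,
  'b' => 1_000_000_000,     'billion'  => 1_000_000_000,
  't' => 1_000_000_000_000, 'trillion' => 1_000_000_000_000
}.freeze

# Returns a Float, or nil when the text cannot be understood.
# decimal_mark is '.' (1,234.5) or ',' (1.234,5); the other one is
# treated as the thousands separator and dropped.
def parse_human_number(text, decimal_mark: '.')
  match = text.strip.downcase.match(/\A[^\d\-]*(-?[\d.,]+)\s*([a-z]*)\z/)
  return nil unless match

  digits, suffix = match[1], match[2]
  multiplier = suffix.empty? ? 1 : MULTIPLIERS[suffix]
  return nil unless multiplier

  thousands_mark = decimal_mark == '.' ? ',' : '.'
  normalized = digits.delete(thousands_mark).tr(decimal_mark, '.')
  Float(normalized) * multiplier
rescue ArgumentError
  nil # e.g. "1.2.3" is not a valid number
end
```

Unlike the index-based version, an unknown suffix gives nil here instead of being silently ignored, which seemed the safer default for user input.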

How to combine ActiveRecord Relations? Merge not working?

Initially when I was trying to build a histogram of all Items that have an Order start between a given set of dates based on exactly what the item was (:name_id) and the frequency of that :name_id, I was using the following code:
dates = ["May 27, 2016", "May 30, 2016"]
items = Item.joins(:order).where("orders.start >= ?", dates.first).where("orders.start <= ?", dates.last)
histogram = {}
items.pluck(:name_id).uniq.each do |name_id|
histogram[name_id] = items.where(name_id:name_id).count
end
This code worked FINE.
Now, however, I'm trying to build a histogram that's more expansive. I still want to capture frequency of :name_id over a period of time, but now I want to bound that time by Order start and end. I'm having trouble however, combining the ActiveRecord Relations that follow the queries. Specifically, if my queries are as follows:
items_a = Item.joins(:order).where("orders.start >= ?", dates.first).where("orders.start <= ?", dates.last)
items_b = Item.joins(:order).where("orders.end >= ?", dates.first).where("orders.end <= ?", dates.last)
How do I join the 2 queries so that my code below that acts on query objects still works?
items.pluck(:name_id).each do |name_id|
histogram[name_id] = items.where(name_id:name_id).count
end
What I've tried:
+, but of course that doesn't work because it turns the result into an Array where methods like pluck don't work:
(items_a + items_b).pluck(:name_id)
=> error
merge, this is what all the SO answers seem to say... but it doesn't work for me because, as the docs say, merge figures out the intersection, so my result is like this:
items_a.count
=> 100
items_b.count
=> 30
items_a.merge(items_b)
=> 15
FYI currently, I've monkey-patched this with the below, but it's not very ideal. Thanks for the help!
name_ids = (items_a.pluck(:name_id) + items_b.pluck(:name_id)).uniq
name_ids.each do |name_id|
# from each query object, return the ids of the item objects that meet the name_id criterion
item_object_ids = items_a.where(name_id:name_id).pluck(:id) + items_b.where(name_id:name_id).pluck(:id) + items_c.where(name_id:name_id).pluck(:id)
# then check the item objects for duplicates and then count up. btw I realize that with the uniq here I'm SOMEWHAT doing an intersection of the objects, but it's nowhere near as severe... the above example where merge yielded a count of 15 is not that off from the truth, when the count should be maybe -5 from the addition of the 2 queries
histogram[name_id] = item_object_ids.uniq.count
end
You can combine your two queries into one:
items = Item.joins(:order).where(
"(orders.start >= ? AND orders.start <= ?) OR (orders.end >= ? AND orders.end <= ?)",
dates.first, dates.last, dates.first, dates.last
)
This might be a little more readable:
items = Item.joins(:order).where(
"(orders.start >= :first AND orders.start <= :last) OR (orders.end >= :first AND orders.end <= :last)",
{ first: dates.first, last: dates.last }
)
Rails 5 adds an or method that might make this a little nicer. Note that or takes another relation with the same joins, not a condition string:
items = Item.joins(:order).where(
  "orders.start >= :first AND orders.start <= :last",
  { first: dates.first, last: dates.last }
).or(
  Item.joins(:order).where(
    "orders.end >= :first AND orders.end <= :last",
    { first: dates.first, last: dates.last }
  )
)
Or maybe not any nicer in this case.
Maybe this will be a bit cleaner:
date_range = "May 27, 2016".to_date.."May 30, 2016".to_date
items = Item.joins(:order).where('orders.start' => date_range).or(Item.joins(:order).where('orders.end' => date_range))
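To make the difference between the intersection that merge produced and the union that was actually wanted concrete, here is a plain-Ruby stand-in with no database involved. ItemRow and histogram_by_name are hypothetical names, and the rows are made-up data; the point is only that an item matching both conditions is counted once.

```ruby
require 'date'

# Stand-in for one joined items/orders row (hypothetical structure).
ItemRow = Struct.new(:id, :name_id, :order_start, :order_end)

# Counts items per name_id whose order starts OR ends inside the range.
def histogram_by_name(items, range)
  started = items.select { |item| range.cover?(item.order_start) }
  ended   = items.select { |item| range.cover?(item.order_end) }
  union   = started | ended # set union: rows in both lists appear once
  union.each_with_object(Hash.new(0)) { |item, counts| counts[item.name_id] += 1 }
end
```

With a single combined relation in ActiveRecord, the per-name loop should not be needed either: items.group(:name_id).count returns the whole histogram as a hash from one query.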

Incrementing iteration through a hash in Ruby

What is the best way to incrementally iterate through a pair of hashes in Ruby? Should I convert them to arrays? Should I go in an entirely different direction? I am working on a problem where the code is supposed to determine what to bake, and in what quantities, for a bakery given two inputs: the number of people to be fed and their favorite food. They bake 3 things (keys in my_list) and each baked item feeds a set number of people (value in my_list).
def bakery_num(num_of_people, fav_food)
my_list = {"pie" => 8, "cake" => 6, "cookie" => 1}
bake_qty = {"pie_qty" => 0, "cake_qty" => 0, "cookie_qty" => 0}
if my_list.has_key?(fav_food) == false
raise ArgumentError.new("You can't make that food")
end
index = my_list.key_at(fav_food)
until num_of_people == 0
bake_qty[index] = (num_of_people / my_list[index])
num_of_people = num_of_people - bake_qty[index]
index += 1
end
return "You need to make #{pie_qty} pie(s), #{cake_qty} cake(s), and #{cookie_qty} cookie(s)."
end
The goal is to output a list for the bakery that will result in no uneaten food. When doing the math, the modulo would then be divided into the next food item.
Thanks for the help.
What is the best way to incrementally iterate through a pair of hashes in Ruby?
Since the keys of bake_qty conveniently have a '_qty' appended to them from their corresponding keys in my_list, you can use this to your advantage:
max_value = my_list[fav_food]
my_list.each do |key,value|
next if max_value < value
qty = bake_qty[key+'_qty']
...
end
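You could use the inject method, carrying the number of people still unfed from one food to the next. Because a cookie feeds exactly one person, a single pass leaves nobody unfed, so no outer loop is needed:
remaining = my_list.inject(num_of_people) do |left, (food, feeds)|
  bake_qty["#{food}_qty"] += left / feeds # whole items of this food
  left % feeds # people left over for the next food
end
You can sort your hash at the beginning to ensure that your first food is the fav food.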
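For reference, here is one possible complete version that combines both suggestions. It is a sketch under one assumption about the rules, namely that the bakery starts with the favorite food and then only uses foods that feed fewer people, largest first.

```ruby
def bakery_num(num_of_people, fav_food)
  servings = { "pie" => 8, "cake" => 6, "cookie" => 1 }
  raise ArgumentError, "You can't make that food" unless servings.key?(fav_food)

  # Favorite food first, then every smaller food in descending size.
  candidates = servings
               .select { |_food, feeds| feeds <= servings[fav_food] }
               .sort_by { |_food, feeds| -feeds }

  quantities = Hash.new(0)
  remaining = num_of_people
  candidates.each do |food, feeds|
    quantities[food] = remaining / feeds # whole items, so nothing is left uneaten
    remaining %= feeds                   # people passed on to the next food
  end

  "You need to make #{quantities['pie']} pie(s), " \
    "#{quantities['cake']} cake(s), and #{quantities['cookie']} cookie(s)."
end
```

Iterating over the pairs of a single hash avoids the index arithmetic from the question entirely; hashes are not indexed by position, which is why index += 1 could not work there.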

Using scope to return results within multiple DateTime ranges in ActiveRecord

I've got a Session model that has a :created_at date and a :start_time date, both stored in the database as :time. I'm currently spitting out a bunch of results on an enormous table and allowing users to filter results by a single date and an optional range of time using scopes, like so:
class Session < ActiveRecord::Base
...
scope :filter_by_date, lambda { |date|
date = date.split(",")[0]
where(:created_at =>
DateTime.strptime(date, '%m/%d/%Y')..DateTime.strptime(date, '%m/%d/%Y').end_of_day
)
}
scope :filter_by_time, lambda { |date, time|
to = time[:to]
from = time[:from]
where(:start_time =>
DateTime.strptime("#{date} #{from[:digits]} #{from[:meridian]}", '%m/%d/%Y %r')..
DateTime.strptime("#{date} #{to[:digits]} #{to[:meridian]}", '%m/%d/%Y %r')
)
}
end
The controller looks more or less like this:
class SessionController < ApplicationController
def index
if params.include?(:date) ||
params.include?(:time) &&
( params[:time][:from][:digits].present? && params[:time][:to][:digits].present? )
i = Session.scoped
i = i.filter_by_date(params[:date]) unless params[:date].blank?
i = i.filter_by_time(params[:date], params[:time]) unless params[:time].blank? || params[:time][:from][:digits].blank? || params[:time][:to][:digits].blank?
@items = i
@items.sort_by! &params[:sort].to_sym if params[:sort].present?
else
@items = Session.find(:all, :order => :created_at)
end
end
end
I need to allow users to filter results using multiple dates. I'm receiving the params as a comma-separated list in string format, e.g. "07/12/2012,07/13/2012,07/17/2012", and need to be able to query the database for several different date ranges, and time ranges within those date ranges, and merge those results, so for example all of the sessions on 7/12, 7/13 and 7/17 between 6:30 pm and 7:30 pm.
I have been looking everywhere and have tried several different things but I can't figure out how to actually do this. Is this possible using scopes? If not what's the best way to do this?
My closest guess looks like this but it's not returning anything so I know it's wrong.
scope :filter_by_date, lambda { |date|
date = date.split(",")
date.each do |i|
where(:created_at =>
DateTime.strptime(i, '%m/%d/%Y')..DateTime.strptime(i, '%m/%d/%Y').end_of_day
)
end
}
scope :filter_by_time, lambda { |date, time|
date = date.split(",")
to = time[:to]
from = time[:from]
date.each do |i|
where(:start_time =>
DateTime.strptime("#{i} #{from[:digits]} #{from[:meridian]}", '%m/%d/%Y %r')..
DateTime.strptime("#{i} #{to[:digits]} #{to[:meridian]}", '%m/%d/%Y %r')
)
end
}
Another complication is that the start times are all stored as DateTime objects so they already include a fixed date, so if I want to return all sessions started between 6:30 pm and 7:30 pm on any date I need to figure something else out too. A third party is responsible for the data so I can't change how it's structured or stored, I just need to figure out how to do all these complex queries. Please help!
EDIT:
Here's the solution I've come up with by combining the advice of Kenichi and Chuck Vose below:
scope :filter_by_date, lambda { |dates|
clauses = []
args = []
dates.split(',').each do |date|
m, d, y = date.split '/'
b = "#{y}-#{m}-#{d} 00:00:00"
e = "#{y}-#{m}-#{d} 23:59:59"
clauses << '(created_at >= ? AND created_at <= ?)'
args.push b, e
end
where clauses.join(' OR '), *args
}
scope :filter_by_time, lambda { |times|
args = []
[times[:from], times[:to]].each do |time|
h, m, s = time[:digits].split(':')
h = (h.to_i % 12 + (time[:meridian] == 'pm' ? 12 : 0)).to_s # 12 am -> 0, 12 pm -> 12
h = '0' + h if h.length == 1
s = '00' if s.nil?
args.push "#{h}:#{m}:#{s}"
end
where("CAST(start_time AS TIME) >= ? AND
CAST(start_time AS TIME) <= ?", *args)
}
This solution allows me to return sessions from multiple non-consecutive dates OR return any sessions within a range of time without relying on dates at all, OR combine the two scopes to filter by non-consecutive dates and times within those dates. Yay!
An important point I overlooked is that the where statement must come last -- keeping it inside of an each loop returns nothing. Thanks to both of you for all your help! I feel smarter now.
something like:
scope :filter_by_date, lambda { |dates|
clauses = []
args = []
dates.split(',').each do |date|
m, d, y = date.split '/'
b = "#{y}-#{m}-#{d} 00:00:00"
e = "#{y}-#{m}-#{d} 23:59:59"
clauses << '(start_time >= ? AND start_time <= ?)'
args.push b, e
end
where clauses.join(' OR '), *args
}
and
scope :filter_by_time, lambda { |dates, time|
clauses = []
args = []
dates.split(',').each do |date|
m, d, y = date.split '/'
f = time[:from] # convert to '%H:%M:%S'
t = time[:to] # again, same
b = "#{y}-#{m}-#{d} #{f}"
e = "#{y}-#{m}-#{d} #{t}"
clauses << '(start_time >= ? AND start_time <= ?)'
args.push b, e
end
where clauses.join(' OR '), *args
}
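The "convert to '%H:%M:%S'" step left as a comment above could look roughly like this. to_24h is a hypothetical helper name, and it assumes the digits arrive as "h:mm" or "h:mm:ss" with a separate am/pm marker, as in the params shown in the question.

```ruby
# Converts 12-hour clock digits plus a meridian into an 'HH:MM:SS' string
# without going through strptime.
def to_24h(digits, meridian)
  hour, minute, second = digits.split(':')
  hour = hour.to_i % 12                      # 12 am becomes 0, 12 pm stays 12 below
  hour += 12 if meridian.to_s.downcase == 'pm'
  format('%02d:%02d:%02d', hour, minute.to_i, second.to_i) # missing seconds become 00
end
```

For example, f = to_24h(time[:from][:digits], time[:from][:meridian]) would then slot into the scope above.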
So, the easy part of the question is what to do about datetimes. The nice thing about DateTimes is that they can be cast to times really easily with this:
CAST(datetime_col AS TIME)
So you can do things like:
i.where("CAST(start_time AS TIME) IN (?)", times) # pass the array itself; ActiveRecord expands it into a list
Now, the harder part: why aren't you getting any results? The first thing to try is to use i.to_sql to check whether the scoped query looks reasonable. My guess is that when you print it out you'll find that all those where clauses are chained together with AND. So you're asking for objects with a date that is on 7/12, 7/13, and 7/17 at the same time.
The last part here is that you've got a couple things that are concerning: sql injections and some overeager strptimes.
When you do a where you should never use #{} in the query. Even if you know where that input is coming from your coworkers may not. So make sure you're using ? like in the where I did above.
Secondly, strptime is relatively expensive. You wouldn't necessarily know this, but it is. If at all possible, avoid parsing dates: in this case you can split the string on / and rearrange the parts, because MySQL expects date literals in YYYY-MM-DD form. If you still have trouble and you really need a date object, you can just as easily build one with Date.new(2012, 7, 12) without eating your CPU.
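As a sketch of both points (placeholders instead of interpolated values, and no strptime), the clause building can live in a small plain-Ruby helper. date_range_conditions is a hypothetical name, and the half-open range (from midnight up to, but not including, the next midnight) is a choice made here rather than something from the question.

```ruby
require 'date'

# Turns "07/12/2012,07/13/2012" into an SQL fragment plus bind values.
# The column name must come from your own code, never from user input,
# because it is interpolated into the SQL string.
def date_range_conditions(dates_param, column = 'created_at')
  clauses = []
  args = []
  dates_param.split(',').each do |text|
    month, day, year = text.strip.split('/').map { |part| Integer(part, 10) }
    date = Date.new(year, month, day) # raises on impossible dates
    clauses << "(#{column} >= ? AND #{column} < ?)"
    args.push(date.to_s, (date + 1).to_s) # 'YYYY-MM-DD' strings
  end
  [clauses.join(' OR '), args]
end
```

Inside a scope it would be used as sql, args = date_range_conditions(dates) followed by where(sql, *args), so the where call comes once, after the loop.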

Are there any model analytics gems?

I'm working on allowing clients to view analytics per day, week, month, in a period of time, grouped by hours or days or months, etc... All of that is based on the created_at attribute.
Is there any gem out there that already does this? Something like:
Posts.analytics(:by => :day, :period => :this_week, :column => :created_at)
Would return:
{
'2012-06-19' => 14,
'2012-06-20' => 0, # Empty rows padding support*
'2012-06-21' => 3
}
I'm trying to make it from scratch but it seems like a lot of unnecessary work if there's already a gem to do the job.
Update
I tried to make an analytics module that gets included into all models for easy analytics generation, but it's really unreliable: sometimes I get more days than I need, and it's really messy. Could anyone collaborate and rewrite or improve on this:
# Usage:
# include Analytics::Timeline
# Model.timeline(:period => :last_24_hours, :time_by => :hour)
module Analytics
module Timeline
def self.included(base)
base.class_eval {
def self.timeline(*filters)
filters = filters[0]
period = filters[:period] || :this_week
time_by = filters[:time_by] || :days
date_column = filters[:date_column] || :created_at
# Named periods conventions
period_range = case period
when :last_12_hours
[Time.now-12.hours, Time.now]
when :last_24_hours
[Time.now-24.hours, Time.now]
when :last_7_days
[Time.now-7.days, Time.now]
when :last_30_days
[Time.now-30.days, Time.now]
when :this_week
[Time.now.beginning_of_week, Time.now.end_of_week]
when :past_week
[(Time.now - 1.week).beginning_of_week, (Time.now - 1.week).end_of_week]
when :this_month
[Time.now.beginning_of_month, Time.now.end_of_month]
when :past_month
[(Time.now-1.month).beginning_of_month, (Time.now - 1.month).end_of_month]
when :this_year
[Time.now.beginning_of_year, Time.now.end_of_year]
end
period_range = period if period.kind_of?(Array)
period_range = [period, Time.now] if period.is_a?(String)
# determine the SQL group method
group_column = case time_by
when :months
time_suffix = "-01 00:00:00"
records = where("#{table_name}.#{date_column} > ? AND #{table_name}.#{date_column} <= ?", period_range[0].to_date, period_range[1].to_date)
"DATE_FORMAT(#{table_name}.#{date_column.to_s}, '%Y-%m')"
when :days
time_suffix = " 00:00:00"
records = where("#{table_name}.#{date_column} > ? AND #{table_name}.#{date_column} <= ?", period_range[0].to_date, period_range[1].to_date)
"DATE(#{table_name}.#{date_column.to_s})"
when :hours
time_suffix = ":00:00"
records = where("#{table_name}.#{date_column} > ? AND #{table_name}.#{date_column} <= ?", period_range[0], period_range[1])
"DATE_FORMAT(#{table_name}.#{date_column.to_s}, '%Y-%m-%d %H')"
when :minutes
time_suffix = ":00"
records = where("#{table_name}.#{date_column} > ? AND #{table_name}.#{date_column} <= ?", period_range[0], period_range[1])
"DATE_FORMAT(#{table_name}.#{date_column.to_s}, '%Y-%m-%d %H:%M')"
end
# Get counts per cycle
records = records.group(group_column).select("*, count(*) AS series_count, #{group_column} AS series_time")
series = {}
# Generate placeholder series
time_table = { :days => 60*60*24, :hours => 60*60, :minutes => 60, :seconds => 0 }
if time_by == :months
ticks = 12 * (period_range[1].year - period_range[0].year) + (period_range[1].month + 1) - period_range[0].month
else
ticks = (period_range[1] - period_range[0] + 1) / time_table[time_by]
end
ticks.to_i.times do |i|
time = period_range[1]-i.send(time_by)
time = case time_by
when :minutes
time.change(:sec => 0)
when :hours
time.change(:min => 0)
when :days
time.change(:hour => 0)
when :months
time.change(:day => 1, :hour => 0)
end
series[time.to_s(:db)] = 0
end
# Merge real counts with placeholder series
to_merge = {}
records.each do |r|
to_merge[r.series_time.to_s+time_suffix] = r.series_count
end
series.merge!(to_merge)
end
}
end
end
end
The ActiveRecord statistics gem seems like it could be really useful to you.
If the statistics gem doesn't help, the admin_data gem has some analytics built in. Check out the demo. Use of the entire admin system might be overkill but you could at least try to browse the source to mimic the analytics feature.
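Whichever gem or query ends up producing the counts, the "empty rows padding" part can be done separately in plain Ruby. This is only a sketch for the daily case; pad_daily_series is a made-up name, and it assumes the counts hash is keyed by 'YYYY-MM-DD' strings as in the example output from the question.

```ruby
require 'date'

# Returns one entry per day in from..to, using 0 where the query had no row.
def pad_daily_series(counts, from, to)
  (from..to).each_with_object({}) do |day, series|
    key = day.to_s # Date#to_s gives 'YYYY-MM-DD'
    series[key] = counts.fetch(key, 0)
  end
end
```

Building the full list of days first and then looking up counts avoids the tick arithmetic in the module above, which is where the extra days seem most likely to come from.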
