Better way to query instead of looping through all data - ruby-on-rails

I have the following method that takes a 10 digit number and searches for matches:
def self.get_agents_by_phone_number(phone_number)
all_agents = Agent.all
agents = Array.new
all_agents.each do |a|
if "1" + a.phone_number.scan(/\d/).join('').last(10) == phone_number
agents << a
end
end
return agents
end
The catch is that the phone_number field in the DB has not been scrubbed and may be in a few different formats. Here is a sample:
2.5.3 :089 > Agent.all.pluck :phone_number
(0.5ms) SELECT "agents"."phone_number" FROM "agents"
=> ["1-214-496-5089", "193.539.7577", "557-095-1452", "(734) 535-5668", "(279) 691-4148", "(474) 777-3615", "137.158.9465", "(280) 680-8618", "296.094.7455", "1-500-079-7285", "1-246-171-1355", "1-444-626-9429", "(614) 603-6276", "594.170.4795", "1-535-859-1377", "676.706.4384", "256-312-4417", "1-592-904-2339", "174.912.8838", "677.137.7019", "319-013-7526", "(200) 790-1698", "576-106-0746", "(214) 042-9715", "(312) 188-5862", "1-823-392-9020", "663.331.4191", "237-101-0271", "1-836-465-1204", "394-499-0004", "713-068-1726", "1-223-484-7856"]
The method I shared above works but feels pretty inefficient. Any better ways of doing this without touching the data in the DB?

What about:
# Sample, assuming the incoming phone_num is "333.111.4444".
# Build every format the number might be stored in and push each onto the array.
options = Array.new
options << "1.333.111.4444"
options << "333.111.4444"
options << "1-333-111-4444"
options << "333-111-4444"
# and so on ...
# then query in where
Agent.where("phone_number IN (?)", options)
This is much better than looping, as long as you generate every format the number might be stored in. I don't know how big your data is, but since you are fetching all agents it could be huge :)
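As a rough sketch of how the options array could be built instead of typed by hand: the helper below is hypothetical (phone_number_variants is not from the original answer) and only covers the four formats visible in the sample data, so treat the list as an assumption to extend.

```ruby
# Hypothetical helper: expands a 10-digit string into the formats seen in the
# sample data. Other formats in the real table would need to be added here.
def phone_number_variants(digits)
  area   = digits[0, 3]
  prefix = digits[3, 3]
  line   = digits[6, 4]
  [
    "#{area}-#{prefix}-#{line}",
    "#{area}.#{prefix}.#{line}",
    "(#{area}) #{prefix}-#{line}",
    "1-#{area}-#{prefix}-#{line}"
  ]
end

# With Rails loaded, the list can be handed to the query from the answer:
#   Agent.where(phone_number: phone_number_variants("2144965089"))
```

Passing an array as the hash value makes ActiveRecord generate the same IN (...) clause as the string form above.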

I guess you can use the LIKE operator with '%' wildcards between the digit groups.
phone_number = '1234567890'
# split the number in groups
last4 = phone_number[-4..-1] # 7890
middle3 = phone_number[-7..-5] # 456
first3or4 = phone_number[0..-8] # 123
# build the pattern
pattern = "%#{first3or4}%#{middle3}%#{last4}" # %123%456%7890 (the leading % allows prefixes such as "(" or "1-")
# use it for LIKE query
Agent.where('phone_number LIKE ?', pattern)
It won't be a fast query with all those wildcards.
https://dev.mysql.com/doc/refman/5.7/en/string-comparison-functions.html#operator_like
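Since % matches any run of characters, a LIKE pattern of this shape can also match stored values that contain extra digits (for example "1234-456-7890"). One way to handle that, sketched here with hypothetical helper names, is to treat the LIKE result as a candidate list and confirm each candidate in Ruby by comparing digits only.

```ruby
# Hypothetical helper: keeps only the digits and returns the last ten of them.
def normalize_phone(raw)
  raw.gsub(/\D/, "").chars.last(10).join
end

# Filters candidate strings (e.g. phone numbers returned by the LIKE query)
# down to those whose last ten digits equal the 10-digit search number.
def exact_matches(candidates, phone_number)
  candidates.select { |raw| normalize_phone(raw) == phone_number }
end

candidates = ["(123) 456-7890", "1-123-456-7890", "1234-456-7890"]
exact_matches(candidates, "1234567890")
# keeps the first two entries; the third has an extra digit
```

The loop now runs only over the rows the database returned rather than the whole table.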

Related

Generate array of daily avg values from db table (Rails)

Context:
Trying to generate an array with one element for each created_at day in a db table. Each element is the average of the points (integer) column from records with that created_at day.
This will later be graphed to display the avg number of points on each day.
Result:
I've been successful in doing this, but it feels like an unnecessary amount of code to generate the desired result.
Code:
def daily_avg
# get all data for current user
records = current_user.rounds
# make array of long dates
long_date_array = records.pluck(:created_at)
# create array to store short dates
short_date_array = []
# remove time of day
long_date_array.each do |date|
short_date_array << date.strftime('%Y%m%d')
end
# remove duplicate dates
short_date_array.uniq!
# array of avg by date
array_of_avg_values = []
# iterate through each day
short_date_array.each do |date|
temp_array = []
# make array of records with this day
records.each do |record|
if date === record.created_at.strftime('%Y%m%d')
temp_array << record.audio_points
end
end
# calc avg by day and append to array_of_avg_values
array_of_avg_values << temp_array.inject(0.0) { |sum, el| sum + el } / temp_array.size
end
render json: array_of_avg_values
end
Question:
I think this is a common extraction problem needing to be solved by lots of applications, so I'm wondering if there's a known repeatable pattern for solving something like this?
Or a more optimal way to solve this?
(I'm barely a junior developer so any advice you can share would be appreciated!)
Yes, that's a lot of unnecessary stuff when you can just go down to SQL to do it (I'm assuming you have a class called Round in your app):
class Round
DAILY_AVERAGE_SELECT = "SELECT
DATE(rounds.created_at) AS day_date,
AVG(rounds.audio_points) AS audio_points
FROM rounds
WHERE rounds.user_id = ?
GROUP BY DATE(rounds.created_at)
"
def self.daily_average(user_id)
connection.select_all(sanitize_sql_array([DAILY_AVERAGE_SELECT, user_id]), "daily-average")
end
end
Doing this directly in the database will be faster (and involve less code) than doing it in Ruby as you're doing now.
I advise you to do something like this:
grouped =
records.order(:created_at).group_by do |r|
r.created_at.strftime('%Y%m%d')
end
Here you first let ActiveRecord generate a simple ordered SQL query, then group the resulting records by the created_at field converted to a date string.
points =
grouped.map do |(date, values)|
[ date, values.sum(&:audio_points).to_f / values.size ]
end.to_h
# => { "19700101" => 155.0, ... }
Then you map the grouped hash to pairs of date and average audio_points, and convert the result back to a hash.
You can use group and calculations methods built in AR: http://guides.rubyonrails.org/active_record_querying.html#group
http://guides.rubyonrails.org/active_record_querying.html#calculations
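To make the grouping idea concrete, here is a plain-Ruby sketch that runs without Rails. RoundRecord is a stand-in Struct I made up for the Round model, and daily_averages is a hypothetical name; only created_at and audio_points come from the question.

```ruby
# Stand-in for the ActiveRecord model, so the example runs without Rails.
RoundRecord = Struct.new(:created_at, :audio_points)

# Groups records by calendar day and averages audio_points within each day.
def daily_averages(records)
  records
    .group_by { |r| r.created_at.strftime("%Y-%m-%d") }
    .transform_values { |rs| rs.sum(&:audio_points).to_f / rs.size }
end

records = [
  RoundRecord.new(Time.new(2024, 1, 1, 9), 10),
  RoundRecord.new(Time.new(2024, 1, 1, 18), 20),
  RoundRecord.new(Time.new(2024, 1, 2, 9), 5)
]
daily_averages(records)

# With ActiveRecord, the group and average methods from the guides linked
# above can push the same work into SQL (assuming the database has DATE()):
#   current_user.rounds.group("DATE(created_at)").average(:audio_points)
```

The ActiveRecord form returns a hash keyed by day, so the controller can render its values directly.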

Sending array of values to a sql query in ruby?

I'm struggling with what seems to be a Ruby semantics issue. I'm writing a method that takes a variable number of params from a form and creates a PostgreSQL query.
def self.search(params)
counter = 0
query = ""
params.each do |key,value|
if key =~ /^field[0-9]+$/
query << "name LIKE ? OR "
counter += 1
end
end
query = query[0..-4] #remove extra OR and spacing from last
params_list = []
(1..counter).each do |i|
field = ""
field << '"%#{params[:field'
field << i.to_s
field << ']}%", '
params_list << field
end
last_item = params_list[-1]
last_item = last_item[0..-3] #remove trailing comma and spacing
params_list[-1] = last_item
if params
joins(:ingredients).where(query, params_list)
else
all
end
end
Even though params_list is an array of values that match in number to the "name LIKE ?" parts in query, I'm getting an error: wrong number of bind variables (1 for 2) in: name LIKE ? OR name LIKE ? I tried with params_list as a string and that didn't work any better either.
I'm pretty new to ruby.
I had this working for 2 params with the following code, but want to allow the user to submit up to 5 ( :field1, :field2, :field3 ...)
def self.search(params)
if params
joins(:ingredients).where(['name LIKE ? OR name LIKE ?',
"%#{params[:field1]}%", "%#{params[:field2]}%"]).group(:id)
else
all
end
end
Could someone shed some light on how I should really be programming this?
PostgreSQL supports standard SQL arrays and the standard any op (...) syntax:
9.23.3. ANY/SOME (array)
expression operator ANY (array expression)
expression operator SOME (array expression)
The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expression is evaluated and compared to each element of the array using the given operator, which must yield a Boolean result. The result of ANY is "true" if any true result is obtained. The result is "false" if no true result is found (including the case where the array has zero elements).
That means that you can build SQL like this:
where name ilike any (array['%Richard%', '%Feynman%'])
That's nice and succinct so how do we get Rails to build this? That's actually pretty easy:
Model.where('name ilike any (array[?])', names.map { |s| "%#{s}%" })
No manual quoting needed, ActiveRecord will convert the array to a properly quoted/escaped list when it fills the ? placeholder in.
Now you just have to build the names array. Something simple like this should do:
fields = params.keys.select { |k| k.to_s =~ /\Afield\d+\z/ }
names = params.values_at(*fields).select(&:present?)
You could also convert single 'a b' inputs into 'a', 'b' by tossing a split and flatten into the mix:
names = params.values_at(*fields)
.select(&:present?)
.map(&:split)
.flatten
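If you would rather keep the OR-chained LIKE clauses from the question, the bind error comes from where(query, params_list) passing the whole array as a single bind value, so one value arrives for two placeholders. A minimal sketch of the fix follows; build_name_conditions is a hypothetical name, and it assumes params behaves like a plain Hash.

```ruby
# Hypothetical helper: returns [sql, value1, value2, ...] with one bind value
# per "?" placeholder, or [] when no fieldN params were filled in.
def build_name_conditions(params)
  terms = params.select { |key, _| key.to_s.match?(/\Afield\d+\z/) }
                .values
                .reject { |value| value.to_s.strip.empty? }
  return [] if terms.empty?

  sql = Array.new(terms.size, "name LIKE ?").join(" OR ")
  [sql, *terms.map { |term| "%#{term}%" }]
end

# With Rails loaded, the array form of where takes the result as is:
#   conditions = build_name_conditions(params)
#   conditions.empty? ? all : joins(:ingredients).where(conditions).group(:id)
```

The values stay real strings the whole way through, so there is no need to build text that merely looks like interpolation.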
You can achieve this easily:
def self.search(string)
terms = string.split(' ') # split the string on each space
conditions = terms.map{ |term| "name ILIKE #{sanitize("%#{term}%")}" }.join(' OR ')
return self.where(conditions)
end
This should be flexible: whatever the number of terms in your string, it should return objects matching at least one of the terms.
Explanation:
The condition uses "ILIKE", not "LIKE":
"ILIKE" is case-insensitive
"LIKE" is case-sensitive.
The purpose of the sanitize("%#{term}%") part is the following:
sanitize() quotes and escapes the value, which prevents SQL injection such as passing '; DROP TABLE users;' as the search input. Because it adds the surrounding quotes itself, the term is passed in without them.
Usage:
User.search('Michael Mich Mickey')
# can return
<User: Michael>
<User: Juan-Michael>
<User: Jean michel>
<User: MickeyMouse>

Get columns names with ActiveRecord

Is there a way to get the actual columns name with ActiveRecord?
When I call find_by_sql or select_all with a join, if there are columns with the same name, the first one gets overridden:
select locations.*, s3_images.* from locations left join s3_images on s3_images.imageable_id = locations.id and s3_images.imageable_type = 'Location' limit 1
In the example above, I get the following:
#<Location id: 22, name: ...
>
Where id is that of the last s3_image. select_rows is the only thing that worked as expected:
Model.connection.select_rows("SELECT id,name FROM users") => [["1","amy"],["2","bob"],["3","cam"]]
I need to get the field names for the rows above.
This post gets close to what I want but looks outdated (fetch_fields doesn't seem to exist anymore): How do you get the rows and the columns in the result of a query with ActiveRecord?
The ActiveRecord join method creates multiple objects. I'm trying to achieve the same result "includes" would return but with a left join.
I am attempting to return a whole lot of results (and sometimes whole tables) this is why includes does not suit my needs.
Active Record provides a #column_names method that returns an array of column names.
Usage example: User.column_names
two options
Model.column_names
or
Model.columns.map(&:name)
Example
Model named Rabbit with columns name, age, on_facebook
Rabbit.column_names
Rabbit.columns.map(&:name)
returns
["id", "name", "age", "on_facebook", "created_at", "updated_at"]
This is just the way Active Record's inspect method works: it only lists the columns from the model's table. The attributes are still there, though.
record.blah
will return the blah attribute, even if it is from another table. You can also use
record.attributes
to get a hash with all the attributes.
However, if you have multiple columns with the same name (e.g. both tables have an id column) then Active Record just mashes things together, ignoring the table name. You'll have to alias the column names to make them unique.
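One way the aliasing could be done without typing every column is to prefix each column with its table name. This is only a sketch: aliased_select is a hypothetical helper, and the table_column naming scheme is my assumption, not something Active Record requires.

```ruby
# Hypothetical helper: builds "table.col AS table_col" for every column so
# that same-named columns from different tables stay distinct in the result.
def aliased_select(table, columns)
  columns.map { |col| "#{table}.#{col} AS #{table}_#{col}" }.join(", ")
end

aliased_select("locations", %w[id name])
# => "locations.id AS locations_id, locations.name AS locations_name"

# With Rails loaded, column_names can supply the lists:
#   select_sql = [
#     aliased_select("locations", Location.column_names),
#     aliased_select("s3_images", S3Image.column_names)
#   ].join(", ")
```

The resulting string can replace the locations.*, s3_images.* part of the query from the question.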
Okay I have been wanting to do something that's more efficient for a while.
Please note that for very few results, include works just fine. The code below works better when you have a lot of columns you'd like to join.
In order to make it easier to understand the code, I worked out an easy version first and expanded on it.
First method:
# takes a main array of ActiveRecord::Base objects
# converts it into a hash with the key being that object's id method call
# loops through the second array (arr)
# and calls lamb (a lambda { |itm, hash| ... }) for each item in it, passing
# that item and the main hash
# e.g. you have Users who have multiple Pets
# You can call merge(User.all, Pet.all, lambda { |pet, hash| hash[pet.owner_id].pets << pet })
def merge(mainarray, arr, lamb)
hash = {}
mainarray.each do |i|
hash[i.id] = i.dup
end
arr.each do |i|
lamb.call(i, hash)
end
return hash.values
end
I then noticed that we can have "through" tables (n-to-m relationships).
merge_through! addresses this issue:
# this works for tables that have the equivalent of
# :through =>
# an example would be a location with keywords
# through locations_keywords
#
# the middletable should return as id an array of the left and right ids
# the left table is the main table
# the lambda fn should store in the lefthash the value from the righthash
#
# if an array is passed instead of a lefthash or a righthash, they'll be conveniently converted
def merge_through!(lefthash, righthash, middletable, lamb)
if (lefthash.class == Array)
lhash = {}
lefthash.each do |i|
lhash[i.id] = i.dup
end
lefthash = lhash
end
if (righthash.class == Array)
rhash = {}
righthash.each do |i|
rhash[i.id] = i.dup
end
righthash = rhash
end
middletable.each do |i|
lamb.call(lefthash, righthash, i.id[0], i.id[1])
end
return lefthash
end
This is how I call it:
lambmerge = lambda do |lhash, rhash, lid, rid|
lhash[lid].keywords << rhash[rid]
end
Location.merge_through!(Location.all, Keyword.all, LocationsKeyword.all, lambmerge)
Now for the complete method (which makes use of merge_through!):
# merges multiple arrays (or hashes) with the main array (or hash)
# each arr in the arrs is a hash, each must have
# a :value and a :proc
# the procs will be called on values and main hash
#
# :middletable will merge through the middle table if provided
# :value will contain the right table when :middletable is provided
#
def merge_multi!(mainarray, arrs)
hash = {}
if (mainarray.class == Hash)
hash = mainarray
elsif (mainarray.class == Array)
mainarray.each do |i|
hash[i.id] = i.dup
end
end
arrs.each do |h|
arr = h[:value]
proc = h[:proc]
if (h[:middletable])
middletable = h[:middletable]
merge_through!(hash, arr, middletable, proc)
else
arr.each do |i|
proc.call(i, hash)
end
end
end
return hash.values
end
Here's how I use my code:
def merge_multi_test()
merge_multi!(Location.all,
[
# each one location has many s3_images (one to many)
{ :value => S3Image.all,
:proc => lambda do |img, hash|
if (img.imageable_type == 'Location')
hash[img.imageable_id].s3_images << img
end
end
},
# each location has many LocationsKeywords. Keywords is the right table and LocationsKeyword is the middletable.
# (many to many)
{ :value => Keyword.all,
:middletable => LocationsKeyword.all,
:proc => lambda do |lhash, rhash, lid, rid|
lhash[lid].keywords << rhash[rid]
end
}
])
end
You can modify the code if you wish to lazy load attributes on the "one" side of a one-to-many relationship (such as the City a Location belongs to). The code above won't handle that case as is, because you'd have to loop through the main hash and set the city from the second hash (there is no "city_id, location_id" table). You could swap City and Location to collect all the locations into the city hash, then extract them back. I don't need that code yet so I skipped it =)

Sanitizing User Regexp

I want to write a function that allows users to match data based on a regexp, but I am concerned about sanitation of the user strings. I know with SQL queries you can use bind variables to avoid SQL injection attacks, but I am not sure if there's such a mechanism for regexps. I see that there's Regexp.escape, but I want to allow valid regexps.
Here is the sample function:
def tagged?(text)
tags.each do |tag|
return true if text =~ /#{tag.name}/i
end
return false
end
Since I am just matching directly on tag.name is there a chance that someone could insert a Proc call or something to break out of the regexp and cause havoc?
Any advice on best practice would be appreciated.
Interpolated strings in a Regexp are not executed, but do generate annoying warnings:
/#{exit -3}/.match('test')
# => exits
foo = '#{exit -3}'
/#{foo}/.match('test')
# => warning: regexp has invalid interval
# => warning: regexp has `}' without escape
The two warnings seem to pertain to the opening #{ and the closing } respectively, and are independent.
As a strategy that's more efficient, you might want to sanitize the list of tags into a combined regexp you can run once. It is generally far less efficient to construct and test against N regular expressions than 1 with N parts.
Perhaps something along the lines of this:
class Taggable
def tags
@tags
end
def tags=(value)
@tags = value
@tag_regexp = Regexp.new(
[
'^(?:',
@tags.collect do |tag|
'(?:' + tag.sub(/\#\{/, '\\#\\{').sub(/([^\\])\}/, '\1\\}') + ')'
end.join('|'),
')$'
].join,
Regexp::IGNORECASE
)
end
def tagged?(text)
!!text.match(@tag_regexp)
end
end
This can be used like this:
e = Taggable.new
e.tags = %w[ #{exit-3} .*\.gif .*\.png .*\.jpe?g ]
puts e.tagged?('foo.gif').inspect
If the exit call was executed, the program would halt there, but it just interprets that as a literal string. To avoid warnings it is escaped with backslashes.
You should probably create an instance of the Regexp class instead.
def tagged?(text)
return tags.any? { |tag| text =~ Regexp.new(tag.name, Regexp::IGNORECASE) }
end
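Building on the Regexp.new suggestion, a user can still submit a pattern that does not compile, and that raises RegexpError. A small sketch of guarding against it is below; compile_user_regexp is a hypothetical helper name, and this version of tagged? takes the pattern strings as an argument rather than reading tags. It does not protect against patterns that are valid but very slow.

```ruby
# Hypothetical helper: compiles a user-supplied pattern, returning nil when
# the pattern is not a valid regular expression. Regexp.new only parses the
# string as a pattern; it never evaluates it as Ruby code.
def compile_user_regexp(source)
  Regexp.new(source, Regexp::IGNORECASE)
rescue RegexpError
  nil
end

# True when any valid pattern matches the text; invalid patterns are skipped.
def tagged?(text, patterns)
  patterns.any? do |source|
    regexp = compile_user_regexp(source)
    regexp && regexp.match?(text)
  end
end

tagged?("foo.gif", [".*\\.gif"]) # => true
tagged?("foo.gif", ["("])        # => false, the pattern fails to compile
```

Validating the pattern when the tag is saved, rather than on every match, would keep bad patterns out of the data altogether.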

Verifying if an object is in an array of objects in Rails

I'm doing this:
@snippets = Snippet.find :all, :conditions => { :user_id => session[:user_id] }
@snippets.each do |snippet|
snippet.tags.each do |tag|
@tags.push tag
end
end
But if a snippet has the same tag twice, it'll push the object twice.
I want to do something like if @tags.in_object(tag)[...]
Would it be possible? Thanks!
I think there are 2 ways to go about it to get a faster result.
1) Add a condition to your find statement (DISTINCT in MySQL). This will return only unique results. Databases generally do a much better job than application code at filtering results.
2) Instead of testing each time with include?, why not call uniq after you populate your array?
Here is some example code:
ar = []
data = []
# get some random sample data
100.times do
data << ((rand*10).to_i)
end
# populate your result array
# 3 ways to do it.
# 1) you can modify your original array with
data.uniq!
# 2) you can populate another array with your unique data
# this doesn't modify your original array
ar.concat(data.uniq)
# 3) you can run a loop if you want to do some sort of additional processing
data.each do |i|
i = i.to_s + "some text" # do whatever you need here
ar << i
end
Depending on the situation you may use either.
But running include? on each item in the loop is not the fastest approach, IMHO.
Good luck
Another way would be to simply concat the @tags and snippet.tags arrays and then strip it of duplicates.
@snippets.each do |snippet|
@tags.concat(snippet.tags)
end
@tags.uniq!
I'm assuming @tags is an Array instance.
Array#include? tests if an object is already included in an array. This uses the == operator, which in ActiveRecord tests for the same instance or another instance of the same type having the same id.
Alternatively, you may be able to use a Set instead of an Array. This will guarantee that no duplicates get added, but is unordered.
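Here is a small sketch of the Set approach that runs without Rails. Tag is a stand-in Struct I made up, and unique_tags is a hypothetical name. Struct instances with equal members count as duplicates, which mirrors how ActiveRecord compares records of the same class by id.

```ruby
require "set"

# Stand-in for the ActiveRecord Tag model, so the example runs without Rails.
Tag = Struct.new(:id, :name)

# Collects tags from several snippets; the Set silently drops duplicates.
def unique_tags(tag_lists)
  seen = Set.new
  tag_lists.each { |tags| seen.merge(tags) }
  seen.to_a
end

ruby  = Tag.new(1, "ruby")
rails = Tag.new(2, "rails")
unique_tags([[ruby, rails], [Tag.new(1, "ruby")]]).size # => 2
```

In the controller this would be roughly Set.new followed by merge(snippet.tags) inside the loop.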
You can probably add a group to the query:
Snippet.find :all, :conditions => { :user_id => session[:user_id] }, :group => "tag.name"
Group will depend on how your tag data works, of course.
Or use uniq:
@tags = (@tags + snippet.tags).uniq
