How to slow down code - ruby-on-rails

I was given a task to make this code slower. I can only change code inside the method. The reason I'm doing this is to try out Ruby profiling. How or where can I change the code to make it slower?
class FibonacciSequence
  def next_fib
    @index += 1
    if @seq[@index].nil?
      f = @seq[@index - 1] + @seq[@index - 2]
      @seq[@index] = f
      return f
    else
      return @seq[@index]
    end
  end
  def current_fib
    return @index >= 0 ? @seq[@index] : nil
  end
  def current_index
    return @index >= 0 ? @index : nil
  end
  def [](n)
    return nil if n < 0
    return @seq[n] if n <= @index
    while @index < n
      self.next_fib
    end
    return self.current_fib
  end
end

sleep(num_secs) is the simplest way.
Other options are calling a function multiple times, iterating through loops, or building an array or hash with, say, 1000 elements and applying methods like sort or map to it.
You can also read a remote file, or read and process a large amount of data (for example, fetch 1000 user names and convert them all to uppercase). Reading a row from the database, updating some data, cloning it, and saving it back also helps; if your database is remote, this adds even more lag :)
But sleep is the best way, since you can comment out just one line to get the original code back, or change the time parameter as you need.
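As a rough illustration of the sleep approach, here is a minimal sketch. The class name, the initializer, and the DELAY constant are assumptions made for the example (the question does not show how @seq and @index are set up); only the body of next_fib is changed, and Benchmark from the standard library is used to show the effect.

```ruby
require "benchmark"

# Hypothetical stand-in for the class in the question; the initializer
# is an assumption, since the original does not show one.
class SlowFibonacciSequence
  DELAY = 0.001 # seconds; an arbitrary value chosen for illustration

  def initialize
    @seq = [0, 1]
    @index = -1
  end

  def next_fib
    sleep(DELAY) # artificial slowdown; comment out this line to restore speed
    @index += 1
    if @seq[@index].nil?
      @seq[@index] = @seq[@index - 1] + @seq[@index - 2]
    end
    @seq[@index]
  end
end

fib = SlowFibonacciSequence.new
values = nil
elapsed = Benchmark.realtime { values = Array.new(10) { fib.next_fib } }
puts values.inspect
puts format("10 calls took %.4f s", elapsed)
```

A profiler should then attribute most of the time in next_fib to the sleep call, which makes it easy to spot in the report.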

Related

Getting all the pages from an API

This is something I struggle with, or whenever I do it it seems to be messy.
I'm going to ask the question in a very generic way as it's not a single problem I'm really trying to solve.
I have an API that I want to consume some data from, e.g. via:
def get_api_results(page)
  results = HTTParty.get("api.api.com?page=#{page}")
end
When I call it I can retrieve a total.
results["total"] # => 237
The API limits the number of records I can retrieve in one call, say 20. So I need to call it a few more times.
I want to do something like the following, ideally breaking it into pieces so I can use things like delayed_job, etc.
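def get_all_api_pages
  first_page = get_api_results(1)
  total = first_page["total"]
  results = [first_page]
  page = 2
  # 20 records per page: keep going until every record has been covered
  while (page - 1) * 20 < total
    results << get_api_results(page)
    page += 1
  end
  results
end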
I always feel like I'm writing rubbish whenever I try and solve this (and I've tried to solve it in a number of ways).
The above, for example, leaves me at the mercy of an error with the API, which knocks out all my collected results if I hit an error at any point.
Wondering if there is just a generally good, clean way of dealing with this situation.
I don't think it can get much cleaner, because you only receive the total once you have called the API.
Have you tried building your own enumerator for this? It encapsulates the ugly part. Here is a bit of sample code with a "mocked" API:
class AllRecords
  PER_PAGE = 50
  def each
    return enum_for(:each) unless block_given?
    current_page = 0
    total = nil
    while total.nil? || current_page * PER_PAGE < total
      current_page += 1
      page = load_page(current_page)
      total = page[:total]
      page[:items].each do |item|
        yield(item)
      end
    end
  end
  private
  def load_page(page)
    if page == 5
      {items: Array.new(37) { rand(100) }, total: 237}
    else
      {items: Array.new(50) { rand(100) }, total: 237}
    end
  end
end
AllRecords.new.each.each_with_index do |item, index|
  p index
end
You can surely clean that up a bit, but I think this is nice because it does not collect all the items first.
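The question also worries about a single API error wiping out everything collected so far. One possible extension of the enumerator idea, sketched below under assumptions, retries a failing page a few times before giving up. The PagedRecords class, its per_page and max_retries parameters, and the mocked fetch block are all hypothetical; a real version would call the API inside the block.

```ruby
# Hypothetical helper: walks pages lazily, retrying a failed page a few
# times before giving up, so one bad request does not discard earlier pages.
class PagedRecords
  include Enumerable

  def initialize(per_page:, max_retries: 2, &fetch_page)
    @per_page = per_page
    @max_retries = max_retries
    @fetch_page = fetch_page # block returning { items: [...], total: Integer }
  end

  def each
    return enum_for(:each) unless block_given?
    page_number = 0
    total = nil
    while total.nil? || page_number * @per_page < total
      page_number += 1
      page = fetch_with_retries(page_number)
      total = page[:total]
      page[:items].each { |item| yield item }
    end
  end

  private

  def fetch_with_retries(page_number)
    attempts = 0
    begin
      attempts += 1
      @fetch_page.call(page_number)
    rescue StandardError
      retry if attempts <= @max_retries
      raise
    end
  end
end

# Mocked API: 45 records, 20 per page, and page 2 fails on its first call.
failures = { 2 => 1 }
records = PagedRecords.new(per_page: 20) do |page_number|
  if failures.fetch(page_number, 0) > 0
    failures[page_number] -= 1
    raise IOError, "simulated API error"
  end
  first = (page_number - 1) * 20
  last = [first + 20, 45].min
  { items: (first...last).to_a, total: 45 }
end

collected = records.to_a
puts collected.size
```

Because items are yielded as each page arrives, a caller that processes them one by one keeps whatever it has already handled even if a later page fails for good.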

Retrieving only unique records with multiple requests

I have this "heavy_rotation" filter I'm working on. Basically it grabs tracks from our database based on certain parameters (a mixture of listens_count, staff_pick, purchase_count, to name a few).
An XHR request is made to the filter_tracks controller action. In there I have a flag to check if it's "heavy_rotation". I will likely move this to the model (because this controller is getting fat). Anyway, how can I ensure, in an efficient way, that it does not pull the same records? I've considered an offset, but then I have to keep track of the offset for every query. Or maybe store track IDs to compare against for each query? Any ideas? I'm having trouble thinking of an elegant way to do this.
Maybe it should be noted that a limit of 14 is set via JavaScript, and when a user hits "view more" to paginate, it sends another request to filter_tracks.
Any help appreciated! Thanks!
def filter_tracks
  params[:limit] ||= 50
  params[:offset] ||= 0
  params[:order] ||= 'heavy_rotation'
  # heavy rotation filter flag
  heavy_rotation ||= (params[:order] == 'heavy_rotation')
  @result_offset = params[:offset]
  @tracks = Track.ready.with_artist
  params[:order] = "tracks.#{params[:order]}" unless heavy_rotation
  if params[:order]
    order = params[:order]
    order.match(/artist.*/){|m|
      params[:order] = params[:order].sub /tracks\./, ''
    }
    order.match(/title.*/){|m|
      params[:order] = params[:order].sub /tracks.(title)(.*)/i, 'LOWER(\1)\2'
    }
  end
  searched = params[:q] && params[:q][:search].present?
  @tracks = parse_params(params[:q], @tracks)
  @tracks = @tracks.offset(params[:offset])
  @result_count = @tracks.count
  @tracks = @tracks.order(params[:order], 'tracks.updated_at DESC').limit(params[:limit]) unless heavy_rotation
  # structure heavy rotation results
  if heavy_rotation
    puts "*" * 300
    week_ago = Time.now - 7.days
    two_weeks_ago = Time.now - 14.days
    three_months_ago = Time.now - 3.months
    # mix in top licensed tracks within last 3 months
    t = Track.top_licensed
    tracks_top_licensed = t.where(
      "tracks.updated_at >= :top",
      top: three_months_ago).limit(5)
    # mix top listened to tracks within last two weeks
    tracks_top_listens = @tracks.order('tracks.listens_count DESC').where(
      "tracks.updated_at >= :top",
      top: two_weeks_ago)
      .limit(3)
    # mix top downloaded tracks within last two weeks
    tracks_top_downloaded = @tracks.order("tracks.downloads_count DESC").where(
      "tracks.updated_at >= :top",
      top: two_weeks_ago)
      .limit(2)
    # mix in 25% of staff picks added within 3 months
    tracks_staff_picks = Track.ready.staff_picks.
      includes(:artist).order("tracks.created_at DESC").where(
      "tracks.updated_at >= :top",
      top: three_months_ago)
      .limit(4)
    @tracks = tracks_top_licensed + tracks_top_listens + tracks_top_downloaded + tracks_staff_picks
  end
  render partial: "shared/results"
end
I think seeking an "elegant" solution is going to yield many diverse opinions, so I'll offer one approach and my reasoning. In this case, I feel it's optimal and elegant to enforce uniqueness on query intersections by filtering the returned record objects instead of trying to restrict the queries to yield only unique results. As for getting contiguous results for pagination, I would store the offset from each query and use it as the starting point for the next query, using instance variables or sessions depending on how the data needs to be persisted.
Here's a gist to my refactored version of your code with a solution implemented and comments explaining why I chose to use certain logic or data structures: https://gist.github.com/femmestem/2b539abe92e9813c02da
#filter_tracks holds a hash map @tracks_offset which the other methods can access and update; each of the query methods is responsible for adding its own offset key to @tracks_offset.
#filter_tracks also holds a collection of track IDs, @result_track_ids, for tracks that already appear in the results.
If you need persistence, store @tracks_offset and @result_track_ids in the session or cookies instead of instance variables. The logic should be the same. If you use sessions to store the offsets and IDs from results, remember to clear them when your user is done interacting with this feature.
See below. Note, I refactored your #filter_tracks method to separate the responsibilities into 9 different methods: #filter_tracks, #heavy_rotation, #order_by_params, #heavy_rotation?, #validate_and_return_top_results, and #tracks_top_licensed... #tracks_top_<whatever>. This will make my notes easier to follow and your code more maintainable.
def filter_tracks
  # Does this need to be so high when JavaScript limits display to 14?
  @limit ||= 50
  @tracks_offset ||= {}
  @tracks_offset[:default] ||= 0
  @result_track_ids ||= []
  @order ||= params[:order] || 'heavy_rotation'
  tracks = Track.ready.with_artist
  tracks = parse_params(params[:q], tracks)
  @result_count = tracks.count
  # Checks for heavy_rotation filter flag
  if heavy_rotation? @order
    @tracks = heavy_rotation
  else
    @tracks = order_by_params
  end
  render partial: "shared/results"
end
All #heavy_rotation does is call the various query methods. This makes it easy to add, modify, or delete any one of the query methods as criteria change, without affecting any other method.
def heavy_rotation
  week_ago = Time.now - 7.days
  two_weeks_ago = Time.now - 14.days
  three_months_ago = Time.now - 3.months
  tracks_top_licensed(date_range: three_months_ago, max_results: 5) +
    tracks_top_listens(date_range: two_weeks_ago, max_results: 3) +
    tracks_top_downloaded(date_range: two_weeks_ago, max_results: 2) +
    tracks_staff_picks(date_range: three_months_ago, max_results: 4)
end
Here's what one of the query methods looks like. They're all basically the same, but with custom SQL/ORM queries. You'll notice that I'm not setting the :limit parameter to the number of results that I want the query method to return. Doing so would create a problem if one of the records returned is duplicated by another query method, for example if the same track was returned by staff_picks and top_downloaded. Then I would have to make an additional query to get another record. That's not a wrong approach, just not the one I chose.
def tracks_top_licensed(args = {})
  args = @default.merge args
  max = args[:max_results]
  date_range = args[:date_range]
  # Adds own offset key to #filter_tracks hash map => @tracks_offset
  @tracks_offset[:top_licensed] ||= 0
  unfiltered_results = Track.top_licensed
    .where("tracks.updated_at >= :date_range", date_range: date_range)
    .limit(@limit)
    .offset(@tracks_offset[:top_licensed])
  top_tracks = validate_and_return_top_results(unfiltered_results, max)
  # Add offset of your most recent query to the cumulative offset
  # so triggering 'view more'/pagination returns contiguous results
  @tracks_offset[:top_licensed] += top_tracks[:offset]
  top_tracks[:top_results]
end
In each query method, I'm cleaning the record objects through a custom method #validate_and_return_top_results. My validator checks the record objects for duplicates against the @result_track_ids collection set up in #filter_tracks. It then returns the number of records specified by its caller.
def validate_and_return_top_results(collection, max = 1)
  top_results = []
  i = 0 # offset incrementer
  # Stop once enough results are found or the collection is exhausted
  until top_results.count >= max || i >= collection.size do
    # Checks if track has already appeared in the results
    unless @result_track_ids.include? collection[i].id
      # this will be returned to the caller
      top_results << collection[i]
      # this is the point of reference to validate your query method results
      @result_track_ids << collection[i].id
    end
    i += 1
  end
  { top_results: top_results, offset: i }
end
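To see the de-duplication idea in isolation, here is a small sketch that runs without Rails. The Track struct, the take_unique method, and the sample data are hypothetical stand-ins for the ActiveRecord model and query methods in the answer; the logic mirrors the validator described above, with a Set used for the seen IDs.

```ruby
require "set"

# Hypothetical stand-in for an ActiveRecord model, for illustration only.
Track = Struct.new(:id, :title)

# Takes up to `max` records from `collection` whose ids are not in `seen_ids`.
# Returns the picked records and how many records were scanned (the offset).
def take_unique(collection, seen_ids, max)
  picked = []
  scanned = 0
  collection.each do |track|
    break if picked.size >= max
    scanned += 1
    next if seen_ids.include?(track.id)
    seen_ids << track.id
    picked << track
  end
  { top_results: picked, offset: scanned }
end

seen = Set.new
top_licensed = [Track.new(1, "A"), Track.new(2, "B"), Track.new(3, "C")]
staff_picks  = [Track.new(2, "B"), Track.new(3, "C"), Track.new(4, "D")]

first  = take_unique(top_licensed, seen, 2)
second = take_unique(staff_picks, seen, 2)

puts first[:top_results].map(&:id).inspect
puts second[:top_results].map(&:id).inspect
```

The second call skips track 2 because the first call already returned it, and its offset counts the skipped record too, so the next "view more" request would start after everything that was scanned.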

Inconsistent read in database

I'm seeing inconsistent behavior, and am wondering what I could be doing wrong here.
I have subscription objects whose state is defined by their cycle and con attributes, which are integers. months_passed returns an integer that counts how many FULL months have passed between the start_date of the subscription and Time.current.
def update_state
  update_cycle
  update_con
  self.save
end
def update_cycle
  self.cycle = if months_passed > 0
    (months_passed - 1)/3 + 1
  else
    0
  end
end
def update_con
  self.con = if months_passed > 0
    (months_passed - 1) % 3 + 1
  else
    0
  end
end
def in_con1?
  update_state
  con == 1
end
However, when I call in_con1? several times in quick succession, I inconsistently get true or false.
Do I need to reload the object? Is something stale?
Argh, sorry guys. I found the culprit, and it had nothing to do with inconsistent database reads. The issue was the time at which months_passed was being evaluated, which was an hour earlier than I expected.
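The cycle and con arithmetic from the question can be checked on its own, which helps show why a small shift in months_passed changes the answer. The sketch below is plain Ruby; the state_for method name is hypothetical, and the formulas are copied from update_cycle and update_con.

```ruby
# Maps a count of full months to [cycle, con], mirroring update_cycle and
# update_con from the question. The method name is hypothetical.
def state_for(months_passed)
  return [0, 0] unless months_passed > 0
  [(months_passed - 1) / 3 + 1, (months_passed - 1) % 3 + 1]
end

(0..7).each do |months|
  cycle, con = state_for(months)
  puts "months_passed=#{months} cycle=#{cycle} con=#{con}"
end
```

con equals 1 only when months_passed is 1, 4, 7, and so on, so in_con1? flips exactly at a month boundary. If the clock behind months_passed is an hour off near that boundary, two calls that look identical can land on different sides of it.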

optimize memory usage in rails loop

I'm developing a Heroku Rails application on the Cedar stack, and this is the bottleneck.
def self.to_csvAlt(options = {})
  CSV.generate(options) do |csv|
    column_headers = ["user_id", "session_id", "survey_id"]
    pages = PageEvent.order(:page).select(:page).map(&:page).uniq
    page_attributes = ["a", "b", "c", "d", "e"]
    pages.each do |p|
      page_attributes.each do |pa|
        column_headers << p + "_" + pa
      end
    end
    csv << column_headers
    session_ids = PageEvent.order(:session_id).select(:session_id).map(&:session_id).uniq
    session_ids.each do |si|
      session_user = PageEvent.find(:first, :conditions => ["session_id = ? AND page != ?", si, 'none']);
      if session_user.nil?
        row = [si, nil, nil, nil]
      else
        row = [session_user.username, si, session_user.survey_name]
      end
      pages.each do |p|
        a = 0
        b = 0
        c = 0
        d = 0
        e = 0
        allpages = PageEvent.where(:page => p, :session_id => si)
        allpages.each do |ap|
          a += ap.a
          b += ap.b
          c += ap.c
          d += ap.d
          e += ap.e
        end
        index = pages.index p
        end_index = (index + 1)*5 + 2
        if !p.nil?
          row[end_index] = a
          row[end_index-1] = b
          row[end_index-2] = c
          row[end_index-3] = d
          row[end_index-4] = e
        else
          row[end_index] = nil
          row[end_index-1] = nil
          row[end_index-2] = nil
          row[end_index-3] = nil
          row[end_index-4] = nil
        end
      end
      csv << row
    end
  end
end
As you can see, it generates a CSV file from a table that contains data on each individual page taken from a group of surveys. The problem is that there are ~50,000 individual pages in the table, and the Heroku app keeps giving me R14 errors (out of memory, 512 MB) and eventually dies when the dyno goes to sleep after an hour.
That being said, I really don't care how long it takes to run; I just need it to complete. I am waiting on approval to add a worker dyno to run the CSV generation, which I know will help, but in the meantime I would still like to optimize this code. There is potential for over 100,000 pages to be processed at a time. I realize this is incredibly memory heavy, and I really need to cut back its memory usage as much as possible. Thank you for your time.
You can split it up into batches so that the work is completed in sensible chunks.
Try something like this:
def self.to_csvAlt(options = {})
  # ...
  # find_each is defined on ActiveRecord relations (not on Arrays),
  # so it is called on the model's query here
  PageEvent.find_each(:batch_size => 5000) do |page_event|
    # ...
Using find_each with a batch_size, you won't do one huge lookup for your loop. Instead it will fetch 5000 rows, run your loop, fetch another 5000, loop again, and so on, until no more records are returned.
The other key thing to note here is that rather than Rails trying to instantiate all of the objects returned from the database at the same time, it will only instantiate those returned in your current batch. This can save a huge memory overhead if you have a giant dataset.
UPDATE:
Using #map to restrict your results to a single attribute of your model is highly inefficient. You should instead use the Active Record method pluck to pull back just the data you want from the DB directly, rather than manipulating the results with Ruby, like this:
# Instead of this:
pages = PageEvent.order(:page).select(:page).map(&:page).uniq
# Use this:
pages = PageEvent.order(:page).pluck(:page).uniq
I also personally prefer to use .distinct on the relation rather than .uniq, as I feel it sits more in line with the DB query rather than confusing things with what seems more like an array function. Note that it has to be called before pluck, because pluck returns a plain Array:
pages = PageEvent.order(:page).distinct.pluck(:page)
Use CSV.open instead of CSV.generate:
CSV.open("path/to/file.csv", "wb")
This will stream the CSV into the file.
CSV.generate creates one huge string, which will end up exhausting memory if it gets too large.
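A minimal sketch of the CSV.open approach, using only the standard library, might look like the following. The sample rows and the temporary file are assumptions made so the example runs on its own; in the question the rows would come from the PageEvent queries and the path would be wherever the export should live.

```ruby
require "csv"
require "tempfile"

# Hypothetical rows; in the question these would come from PageEvent queries.
rows = (1..1000).lazy.map { |i| ["user#{i}", "session#{i}", i * 2] }

file = Tempfile.new(["page_events", ".csv"])
path = file.path
file.close

# CSV.open yields a writer; each << appends one row to the file,
# so the full CSV text is never held in a single Ruby string.
CSV.open(path, "wb") do |csv|
  csv << ["user_id", "session_id", "total"]
  rows.each { |row| csv << row }
end

line_count = File.foreach(path).count
puts line_count
```

Combined with batched reads, this keeps memory use roughly proportional to one batch rather than to the whole export.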

Create and use a function in Ruby on Rails

I have a few lines of code that I would rather call as a function than paste in every time I need them. I know how to do this in C++, but I'm not sure whether it works the same way in Ruby.
if op2_health < 0
  op2_health = 0
  @result = op1_name + ' Wins!'
elsif op1_health < 0
  op1_health = 0
  @result = op2_name + ' Wins'
end
That is the code I want to use as a function but I don't know the function syntax in Ruby. Thanks!
Define the function with def:
def my_function
  if @op2_health < 0
    @op2_health = 0
    @op1_name + ' Wins!'
  elsif @op1_health < 0
    @op1_health = 0
    @op2_name + ' Wins'
  end
end
Then call it like this:
@result = my_function
I don't know your business rules, but are you sure, for example, that those conditions are mutually exclusive? Regardless, this should be the way to express your code as a reusable function--perhaps with some minor tweaks though for your situation.
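If the health values and names are local variables rather than instance variables, as in the question's snippet, another option is to pass them in as arguments. The sketch below is one way to do that; the resolve_round name, its parameter order, and the sample values are hypothetical.

```ruby
# Hypothetical variant that takes the values as arguments instead of
# reading instance variables, and returns the clamped values plus a result.
def resolve_round(op1_name, op1_health, op2_name, op2_health)
  if op2_health < 0
    [op1_health, 0, op1_name + ' Wins!']
  elsif op1_health < 0
    [0, op2_health, op2_name + ' Wins']
  else
    [op1_health, op2_health, nil]
  end
end

op1_health, op2_health, result = resolve_round('Ann', 12, 'Bob', -3)
puts result
```

Returning the values and destructuring them at the call site keeps the method free of side effects, at the cost of a slightly longer call.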
I have identified the problem: Elsif isn't a word.

Resources