Slow cache read on first cache fetch in Rails

Slow cache read on first cache fetch in Rails - ruby-on-rails

I am seeing some very slow cache reads in my rails app. Both redis (redis-rails) and memcached (dalli) produced the same results.
It looks like it is only the first call to Rails.cache that causes the slowness (averaging 500ms).
I am using skylight to instrument my app and see a graph like:
I have a Rails.cache.fetch call in this code, but when I benchmark it I see it average around 8ms, which matches what memcache-top shows for my average call time.
I thought this might be dalli connections opening slowly, but benchmarking that didnt show anything slow either. I'm at a loss for what else to check into.
Does anyone have any good techniques for tracking this sort of thing down in a rails app?
Edit #1
Memcache server is stored in ENV['MEMCACHE_SERVERS'], all the servers are in the us-east-1 datacenter.
Cache config looks like:
config.cache_store = :dalli_store, nil, { expires_in: 1.day, compress: true }
I ran something like:
100000.times { Rails.cache.fetch('something') }
and calculated the average timings and got something on the order of 8ms when running on one of my webservers.
Testing my theory of the first request is slow, I opened a console on my web server and ran the following as the first command.
irb(main):002:0> Benchmark.ms { Rails.cache.fetch('someth') { 1 } }
Dalli::Server#connect my-cache.begfpc.0001.use1.cache.amazonaws.com:11211
=> 12.043342
Edit #2
Ok, I split out the fetch into a read and write, and tracked them independently with statsd. It looks like the averages sit around what I would expect, but the max times on the read are very spiky and get up into the 500ms range.
http://s16.postimg.org/5xlmihs79/Screen_Shot_2014_12_19_at_6_51_16_PM.png

Related

What makes heroku response time too much slow

I am making a rails application to crawl the flight information from specific website. This app can be found here https://vemaybay.herokuapp.com/.
It only took around 4-5 seconds to response locally, but it took 15-20 seconds when running on heroku.
Is there anyway to speed up this response time?
I have already changed the free to hobby dyno type to avoid DB spin-up costs but I believe DB connection and query is not the root cause.
Is it related to the hosting problem? So can think about buying a host.
Below is my example code:
FlightService
def crawl(from, to, date)
return if flight_not_available?(from, to)
begin
selected_day = date.day - 1
browser = ::Ferrum::Browser.new
browser.headers.set({ "User-Agent" => "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" })
browser.goto("https://www.abay.vn/")
browser.at_css("input#cphMain_ctl00_btnSearch").click
browser.back
browser.execute("document.getElementById('cphMain_ctl00_txtFrom').setAttribute('value','#{from}')")
browser.execute("document.getElementById('cphMain_ctl00_txtTo').setAttribute('value','#{to}')")
browser.execute("document.getElementById('cphMain_ctl00_cboDepartureDay').selectedIndex = #{selected_day}")
browser.at_css("input#cphMain_ctl00_btnSearch").click
# browser.execute("document.querySelectorAll('a.linkViewFlightDetail').forEach(btn=> btn.click())")
sleep(1)
body = Nokogiri::HTML(browser.body)
flight_numbers = body.css("table.f-result > tbody > tr.i-result > td.f-number").map(&:text)
depart_times = body.css("table.f-result > tbody > tr.i-result > td.f-time").map { |i| i.text.split(" - ").first }
arrival_times = body.css("table.f-result > tbody > tr.i-result > td.f-time").map { |i| i.text.split(" - ").second }
base_prices = body.css("table.f-result > tbody > tr.i-result > td.f-price").map(&:text)
prices = base_prices
store_flight(flight_numbers, from, to, date, depart_times, arrival_times, base_prices, prices)
browser.quit
rescue StandardError => e
Rails.logger.error e.message
fail_with_message(e.message)
browser.quit
end
end
Then in my controller i just call the crawl method to fetch data:
service = FlightService.new(from: #from, to: #to, departure_date: #departure_date, return_date: #return_date)
service.crawl_go_flights
#go_flights = service.go_flights

I would try to add NewRelic Heroku add-on, it will show you what takes the most time, most likely it will be your Ruby code doing HTTP requests in a controller action to crawl a page.
Heroku tends to be slower than running code on your own development machine because Heroku resources are shared across users unless you bought expensive M/L dynos.
Without you sharing the code for crawling we don't know much how it work and where is the bottleneck. Do you crawl the single page or many pages (then this might be slow).
You can try moving crawl logic to the background worker, for instance, use Sidekiq gem. You could crawl page from time to time and store results in your DB then your controller action would only ask for data from your DB instead of crawling page every time. You can also use a rake task every 10 minutes defined in Heroku Scheduler to crawl page instead of Sidekiq (this might be faster to do). I don't know if having data up to date every 10 minutes is good enough for your use case. You need to pick a tech solution for your business use case needs. With Sidekiq you could run jobs more often by starting them every 1 minute using clockwork gem.

Eager loading taking more time to load page in Rails?

I am trying to play around with Eager Loading in a nonperformant app that has a n+1 of 5000 objects. This is the query with eager loading:
2855, 2856, 2857, 2858, 2859, 2860, 2861, 2862, 2863, 2864, 2865, 2866, 2867, 2868, 2869, 2870, 2871, 2872, 2873, 2874, 2875, 2876, 2877, 2878, 2879, 2880, 2881, 2882, 2883, 2884, 2885, 2886, 2887, 2888, 2889, 2890, 2891, 2892, 2893, 2894, 2895, 2896, 2897, 2898, 2899, 2900, 2901, 2902, 2903, 2904, 2905, 2906, 2907, 2908, 2909, 2910, 2911, 2912, 2913, 2914, 2915, 2916, 2917, 2918, 2919, 2920, 2921, 2922, 2923, 2924, 2925, 2926, 2927, 2928, 2929, 2930, 2931, 2932, 2933, 2934, 2935, 2936, 2937, 2938, 2939, 2940, 2941, 2942, 2943, 2944, 2945, 2946, 2947, 2948, 2949, 2950, 2951, 2952, 2953, 2954, 2955, 2956, 2957, 2958, 2959, 2960, 2961, 2962, 2963, 2964, 2965, 2966, 2967, 2968, 2969, 2970, 2971, 2972, 2973, 2974, 2975, 2976, 2977, 2978, 2979, 2980, 2981, 2982, 2983, 2984, 2985, 2986, 2987, 2988, 2989, 2990, 2991, 2992, 2993, 2994, 2995, 2996, 2997, 2998, 2999, 3000, 3001)
Rendered author/index.html.erb within layouts/application (2373.7ms)
Completed 200 OK in 2398ms (Views: 2357.5ms | ActiveRecord: 40.3ms)
It's way faster in view load time and AR load time than lazy loading (if I used .all, it took like 10 seconds to load), but for some reason the page is hanging for longer than 10 seconds now even though the view is taking longer to load. Any idea why?

Since you're talking about eager loading, it means there's a sql JOIN happening under the hood, and if, for example, each author has many posts, that means that 1 query will have to load 5000 authors with all their posts, which is a very big number, in opposite to lazy loading, where you have LOTS of queries, but they're quite easy and fast (if you have indexes on right columns), but since there's a lot of them, it's still slow.
Nevertheless, it's not performance wise to load 5000 objects at once, this makes a very heavy load on your database, and none of your users doesn't need all that data at once. Imagine, what if Google would give you 5000 pages on your search query - that would just stopped the search, since a) it would create a heavy load on Google servers, and b) the page would be very heavy to load it fast that would cause a lot of frustration to users.
Consider integrating some pagination gems, such as Kaminari or will_paginate.
UPDATE: There's a wonderful article about different types of eager loading in Rails (yeah, there are many of them), it's a very interesting reading.
Have fun!

How can I prevent database connections from timing out in Rails?

I have a Rails system in which every half hour, the following is done:
There are 15 clients somewhere else on the network
The server creates a record called Measurement for each of these clients
The measurement records are configured, and then they are run asynchronously via Sidekiq, using MeasurementWorker.perform_async(m.id)
The connection to the client is done with Celluloid actors and a WebSocket client
Each measurement, when run, creates a number of event records that are stored in the database
The system has been running well with 5 clients, but now I am at 15, and many of the measurements don't run anymore when I start them at the same time, with the following error:
2015-02-04T07:30:10.410Z 35519 TID-owd4683iw MeasurementWorker JID-15f6b396ae9e3e3cb2ee3f66 INFO: fail: 5.001 sec
2015-02-04T07:30:10.412Z 35519 TID-owd4683iw WARN: {"retry"=>false, "queue"=>"default", "backtrace"=>true, "class"=>"MeasurementWorker", "ar
gs"=>[6504], "jid"=>"15f6b396ae9e3e3cb2ee3f66", "enqueued_at"=>1423035005.4078047}
2015-02-04T07:30:10.412Z 35519 TID-owd4683iw WARN: could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)
2015-02-04T07:30:10.412Z 35519 TID-owd4683iw WARN: /home/webtv/.rbenv/versions/2.1.2/lib/ruby/gems/2.1.0/gems/activerecord-4.1.4/lib/active_
record/connection_adapters/abstract/connection_pool.rb:190:in `block in wait_poll'
....
Now, my production environment looks like this:
config/sidekiq.yml
production:
:verbose: false
:logfile: ./log/sidekiq.log
:poll_interval: 5
:concurrency: 50
config/unicorn.rb
...
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
timeout 60
...
config/database.yml
production:
adapter: postgresql
database: ***
username: ***
password: ***
host: 127.0.0.1
pool: 50
postgresql.conf
max_connections = 100 # default
As you see, I've already increased the concurrency of Sidekiq to 50, to cater for a high number of possible concurrent measurements. I've set the database pool to 50, which already looks like overkill to me.
I should add that the server itself is quite powerful, with 8 GB RAM and a quad-core Xeon E5-2403 1.8 GHz.
What should these values ideally be set to? What formula can I use to calculate them? (E.g. number of maximum DB connections = Unicorn workers × Sidekiq concurrency × N)

It looks to me like your pool configuration of 100 is not taking affect. Each process will need a max of 50 so change 100 to 50. I don't know if you are using Heroku but it is notoriously tough to configure the pool size.
Inside mysql, your max connection count should look like this:
((Unicorn processes) * 1) + ((sidekiq processes) * 50)
Unicorn is single threaded and never needs more than one connection unless you are spinning up your own threads in your Rails app for some reason.

I'm sure the creator of sidekiq #MikePerham is more than suited to the task of fixing your sidekiq issues but as a ruby dev two things stand out.
If you're doing a lot of database operations via ruby can you push some of them into the database as triggers? You could still start them on the appside with a sidekiq process of course. :)
Second every half hour screams to me of a rake task run via cron. Hope you're doing that too. FWIW I usually use the Whenever gem to create the cron line I have to drop into the crontab of the user running the app. Note its designed to autocreate the crontask in a scripted deploy but in a non-scripted one you can still leverage it to give you the lines you have to paste into your crontab though via the whenever command.
Also you mention this is for measurements.
Have you considered leveraging something like elasticsearch and the searchkick gem? This is a little more of a complex setup, be sure to firewall the server you install ES on. But this might make your code a lot more manageable as you grow. Also it gives you a good search mechanism almost for free and its distributed and more language agnostic, e.g. Bloodhound, Java. :) Plus kibana gives you a nice window into the ES records

How to update expiration time in MemCached using Dalli?

I'm using Ruby on Rails (v3.2.13), Dalli (v2.6.4) and MemCached (v1.4.13).
I do caching like this:
result = Rails.cache.fetch("test_key", :expires_in => 1.week) do
get_data() # slow call, result of which should be cached
end
I want to update cache expiration date based on the data, since some of my data can be kept longer.
Right now the following code does the job:
if keep_longer(result)
Rails.cache.write("test_key", result, :expires_in => 6.months)
end
I know that MemCached supports "touch" command that allows to update expiration date without sending the value. And I don't see how to use it through the Dalli gem. Is there a way to update expiration date without resending the result?
UPDATE:
Rails.cache.dalli.touch('some_key', 24.hours)
This should work, but for me it doesn't. Does it work for you?
Here is small example you can try. After execution of the following code in the IRB
dc = Dalli::Client.new("localhost:11211")
dc.set("test_key", "test_value", 5.minutes)
dc.set( "key", "value", 5.minutes)
dc.touch( "key", 10.minutes)
I'm checking the expiration dates using telnet:
telnet localhost 11211
Then given the correct slab_id and using "stats cachedump" command I obtain expiration times in seconds:
stats cachedump 1 0
ITEM key [9 b; 1375733492 s]
ITEM test_key [14 b; 1375905957 s]
Note that the expiration time of the key "key" points to the past. When I expect it to be 300 seconds later than "test_key" expiration time. Also I noticed that "key" expiration time is approximately 1 second before the MemCached server has started. Which probably indicates that this key has no expiration time. And in fact "key" doesn't get deleted in the near future.
Am I doing something wrong or it is a bug of Dalli/MemCached?

Dalli does support this - there's a touch method on Dalli::Client that does exactly what it says on the tin. Rails.cache returns a cache store rather than the underlying Dalli object so you need to do
Rails.cache.dalli.touch('some_key', 24.hours)
To bump the cache entry's expiry time by 24 hours (and of course memcache may decide to drop the entry anyway)

I found that my version of MemCached (v1.4.13) has the bug: binary touch operation did not update expiration time correctly. This bug was fixed in v1.4.14 (release notes):
Fixed issue with invalid binary protocol touch command expiration time
The problem now: as of today versions v1.4.14 and later cannot be installed using apt-get.

Why is my Ruby script utilizing 90% of my CPU?

I wrote a admin script that tails a heroku log and every n seconds, it summarizes averages and notifies me if i cross a certain threshold (yes I know and love new relic -- but I want to do custom stuff).
Here is the entire script.
I have never been a master of IO and threads, I wonder if I am making a silly mistake. I have a couple of daemon threads that have while(true){} which could be the culprit. For example:
# read new lines
f = File.open(file, "r")
f.seek(0, IO::SEEK_END)
while true do
select([f])
line = f.gets
parse_heroku_line(line)
end
I use one daemon to watch for new lines of a log, and the other to periodically summarize.
Does someone see a way to make it less processor-intensive?

This probably runs hot because you never really block while reading from the temporary file. IO::select is a thin layer over POSIX select(2). It looks like you're trying to block until the file is ready for reading, but select(2) considers EOF to be ready ("a file descriptor is also ready on end-of-file"), so you always return right away from select then call gets which returns nil at EOF.
You can get a truer EOF reading and nice blocking behavior by avoiding the thread which writes to the temp file and instead using IO::popen to fork the %x[heroku logs --ps router --tail --app pipewave-cedar] log tailer, connected to a ruby IO object on which you can loop over gets, exiting when gets returns nil (indicating the log tailer finished). gets on the pipe from the tailer will block when there's nothing to read and your script will only run as hot as it takes to do your line parsing and reporting.
EDIT: I'm not set up to actually try your code, but you should be able to replace the log tailer thread and your temp file read loop with this code to get the behavior described above:
IO.popen( %w{ heroku logs --ps router --tail --app my-heroku-app } ) do |logf|
while line = logf.gets
parse_heroku_line(line) if line =~ /^/
end
end
I also notice your reporting thread does not do anything to synchronize access to #total_lines, #total_errors, etc. So, you have some minor race conditions where you can get inconsistent values from the instance vars that parse_heroku_line method updates.

select is about whether a read would block. f is just a plain old file, so you when get to the end reads don't block, they just return nil instantly. As a result select returns instantly rather than waiting for something to be appending to the file as I assume you're expecting. Because of this you're sitting in a tight busy loop, so high cpu is to be expected.
If you are at eof (you could either check f.eof? or whether gets returns nil), then you could either start sleeping (perhaps with some sort of back off) or use something like listen to be notified of filesystem changes

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Slow cache read on first cache fetch in Rails - ruby-on-rails

Related

What makes heroku response time too much slow

Eager loading taking more time to load page in Rails?

How can I prevent database connections from timing out in Rails?

How to update expiration time in MemCached using Dalli?

Why is my Ruby script utilizing 90% of my CPU?

Categories

Resources