How to enable caching of model data?

How do I enable caching across requests/actions?
My stack:
Rails 4.2.7
Postgres 9.5
I notice the following in my Rails logs:
Country Load (0.4ms) SELECT "countries".* FROM "countries" WHERE "countries"."iso" = $1 LIMIT 1 [["iso", "US"]]
State Load (1.4ms) SELECT "states".* FROM "states" WHERE "states"."iso" = $1 AND "states"."country_id" = $2 LIMIT 1 [["iso", "OH"], ["country_id", 233]]
Country Load (0.3ms) SELECT "countries".* FROM "countries" WHERE "countries"."iso" = $1 LIMIT 1 [["iso", "US"]]
State Load (1.1ms) SELECT "states".* FROM "states" WHERE "states"."iso" = $1 AND "states"."country_id" = $2 LIMIT 1 [["iso", "OH"], ["country_id", 233]]
Country Load (3.6ms) SELECT "countries".* FROM "countries" WHERE "countries"."iso" = $1 LIMIT 1 [["iso", "US"]]
State Load (1.2ms) SELECT "states".* FROM "states" WHERE "states"."iso" = $1 AND "states"."country_id" = $2 LIMIT 1 [["iso", "OH"], ["country_id", 233]]
Note that the same queries are being run multiple times against my database in rapid succession. Is there some way I can indicate to Rails that certain tables/models are very unlikely to change, and so certain lookups can be cached on the app server?

Is there some way I can indicate to Rails that certain tables/models are very unlikely to change, and so certain lookups can be cached on the app server?
Absolutely, model caching seems to be a perfect fit here.
Here's an article that will give you a good overview of setting up different types of caching.
Also, check out the official Rails guides on caching.
Basically, you want to look into model caching.
You can write an extension and include it in the models that are to be cached:
module ModelCachingExtension
  extend ActiveSupport::Concern

  included do
    class << self
      # The first time you call Model.all_cached it will cache the collection;
      # each subsequent call will not fire the DB query.
      def all_cached
        Rails.cache.fetch(['cached_', name.underscore.to_s, 's']) { all.load }
      end
    end

    after_commit :clear_cache

    private

    # Keep the data in a consistent state by removing the cache every time
    # the table is touched (e.g. a record is created/edited/destroyed).
    def clear_cache
      Rails.cache.delete(['cached_', self.class.name.underscore.to_s, 's'])
    end
  end
end
Then include it in the model:
class Country < ActiveRecord::Base
  include ModelCachingExtension
end
Now, when using Country.all_cached, you will have the cached collection returned with zero DB queries (once it is cached).
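For example, the behaviour looks like this (a quick sketch, assuming the concern above is included in Country):
Country.all_cached          # first call: one DB query, result stored in Rails.cache
Country.all_cached          # subsequent calls: served from the cache, no DB query
Country.create!(iso: 'FR')  # after_commit fires clear_cache, so the next all_cached call rebuilds the cache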

Within a single request, queries are already cached by Rails. You will see this in your log file with the prefix CACHE. Some operations, like inserting a new record within the request, trigger a clear_query_cache, and all cache entries are gone.
If you want your query cache to have a longer lifetime, you have to handle it yourself. You can use the Rails caching features for this:
Rails.cache.fetch("countries iso: #{iso}", expires_in: 1.hour) do
Country.where(iso: iso).all
end
You can also use Memcached. With Memcached you can share cached data between multiple servers.
You have to set it up within your environment-specific config:
config.cache_store = :mem_cache_store, "cache-1.example.com", "cache-2.example.com"

Related

How to attach raw SQL to an existing Rails ActiveRecord chain?

I have a rule builder that ultimately builds up ActiveRecord queries by chaining multiple where calls, like so:
Track.where("tracks.popularity < ?", 1).where("(audio_features ->> 'valence')::numeric between ? and ?", 2, 5)
Then, if someone wants to sort the results randomly, it would append order("random()").
However, given the table size, random() is extremely inefficient for ordering, so I need to use Postgres TABLESAMPLE-ing.
In a raw SQL query, that looks like this:
SELECT * FROM "tracks" TABLESAMPLE SYSTEM(0.1) LIMIT 250;
Is there some way to add that TABLESAMPLE SYSTEM(0.1) to the existing chain of ActiveRecord calls? Putting it inside a where() or order() doesn't work, since it's neither a WHERE nor an ORDER BY clause.
irb(main):004:0> Track.from('"tracks" TABLESAMPLE SYSTEM(0.1)')
Track Load (0.7ms) SELECT "tracks".* FROM "tracks" TABLESAMPLE SYSTEM(0.1) LIMIT $1 [["LIMIT", 11]]
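Since from returns a relation, it composes with the rest of the chain, so the rule builder can keep appending conditions (a sketch reusing the conditions from the question):
Track.from('"tracks" TABLESAMPLE SYSTEM(0.1)')
     .where("tracks.popularity < ?", 1)
     .where("(audio_features ->> 'valence')::numeric between ? and ?", 2, 5)
     .limit(250)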

Does “#includes” position matter in Rails?

I find my query is taking too long to load, so I'm wondering if the position of the includes call matters.
Example A:
people = Person.where(name: 'guillaume').includes(:jobs)
Example B:
people = Person.includes(:jobs).where(name: 'guillaume')
Is example A faster because I should have fewer people's jobs to load?
Short answer: no.
ActiveRecord builds your query lazily: as long as you don't need the records, it won't send the final SQL query to the database to fetch them. The two relations you pasted are identical.
Whenever in doubt, you can always open up a Rails console, write your queries there, and observe the SQL printed out. In your example it would be something like:
SELECT "people".* FROM "people" WHERE "people"."name" = $1 LIMIT $2 [["name", "guillaume"], ["LIMIT", 11]]
SELECT "jobs".* FROM "jobs" WHERE "jobs"."person_id" = 1
in both cases.
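One quick way to convince yourself is to compare the generated SQL directly (a sketch; to_sql shows only the primary query, but that is exactly the part the where/includes order could have affected):
Person.where(name: 'guillaume').includes(:jobs).to_sql ==
  Person.includes(:jobs).where(name: 'guillaume').to_sql
# => true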

Rails order user based on number of badges with Gem Merit

I have a problem ordering users by the number of badges they have collected. I'm using the Merit gem to build the badge system.
So if I want to fetch a user's badges, the code would be user.badges, which translates to SQL as:
Merit::Sash Load (1.6ms) SELECT "sashes".* FROM "sashes" WHERE "sashes"."id" = ? LIMIT ? [["id", 5], ["LIMIT", 1]]
Merit::BadgesSash Load (1.4ms) SELECT "badges_sashes".* FROM "badges_sashes" WHERE "badges_sashes"."sash_id" = ? [["sash_id", 5]]
In the User model I include Merit and has_merit as the Merit documentation says, and when I write users = User.where(id: user_ids).left_joins(:merit).order(is_verified: :desc).group(:id).order("count(badges.id) desc") it raises an error because it cannot recognize either merit or badges.id.
How do I actually order the users by who has the most badges? Thank you!
I've never tried Merit, but if you are able to get badges from a user, then the following query should give you USER_ID => BADGES_COUNT pairs in descending order:
User.joins(:badges).group('users.id').order("count(badges.id) DESC").count
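If you need the User records themselves rather than the counts, the same join can drive the ordering (a sketch under the same assumption that a joinable :badges association exists):
# Users sorted by badge count, most-decorated first
User.joins(:badges)
    .group('users.id')
    .order('COUNT(badges.id) DESC')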

Rails: ActiveRecord preload eludes me

I am not able to grasp how the ActiveRecord preload method is useful.
When I do, for example, User.preload(:posts), it does run two queries, but what is returned is just the same as with User.all. The second query does not seem to affect the result.
User Load (3.2ms) SELECT "users".* FROM "users"
Post Load (1.2ms) SELECT "posts".* FROM "posts" WHERE "posts"."user_id" IN (1, 2, 3)
Can someone explain?
Thanks!
The output is the same, but when you later call user.posts, Rails will not load the posts from the database again:
users = User.preload(:posts).limit(5) # users collection, 2 queries to the database
# User Load ...
# Post Load ...
users.flat_map(&:posts) # users posts array, no loads
users.flat_map(&:posts) # users posts array, no loads
You can do this as many times as you want; Rails just 'remembers' the posts in RAM. The idea is that you go to the database only once.
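The benefit becomes visible when you compare it with the non-preloaded version (a sketch; the query counts assume 5 users):
# Without preload: 1 query for the users + 1 query per user (the classic N+1)
User.limit(5).each { |user| user.posts.to_a }
# With preload: 2 queries total, no matter how many users there are
User.preload(:posts).limit(5).each { |user| user.posts.to_a }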

PostgreSQL & ActiveRecord - slow query

I run a simple query:
History.where(channel_id: 1).order('histories.id DESC').first
Result:
History Load (808.8ms) SELECT "histories".* FROM "histories" WHERE "histories"."channel_id" = 1 ORDER BY histories.id DESC LIMIT 1 [["channel_id", 1]]
808.8ms for 1 of 7 records with channel_id = 1. Total histories count is 2,110,443.
If I select all histories for channel_id = 1:
History.where(channel_id: 1)
History Load (0.5ms) SELECT "histories".* FROM "histories" WHERE "histories"."channel_id" = 1 [["channel_id", 1]]
It took only 0.5ms.
And if we try to take one record with the help of a Ruby Array:
History.where(channel_id: 1).order('histories.id DESC').to_a.first
History Load (0.5ms) SELECT "histories".* FROM "histories" WHERE "histories"."channel_id" = 1 ORDER BY id DESC [["channel_id", 1]]
Where should I look for the problem?
PS: I already have an index on the channel_id field.
UPD:
History.where(channel_id: 1).order('histories.id DESC').limit(1).explain
History Load (848.9ms) SELECT "histories".* FROM "histories" WHERE "histories"."channel_id" = 1 ORDER BY histories.id DESC LIMIT 1 [["channel_id", 1]]
=> EXPLAIN for: SELECT "histories".* FROM "histories" WHERE "histories"."channel_id" = 1 ORDER BY histories.id DESC LIMIT 1 [["channel_id", 1]]
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Limit  (cost=0.43..13.52 rows=1 width=42)
  ->  Index Scan Backward using histories_pkey on histories  (cost=0.43..76590.07 rows=5849 width=42)
        Filter: (channel_id = 1)
(3 rows)
There are two ways PostgreSQL can handle your query (with the ORDER BY and LIMIT clauses):
It can scan the table, order the matching tuples, then apply the limit. PostgreSQL will choose this plan if it thinks your table has a really low number of tuples, or if it thinks the index will be of no use;
It can use the index.
It seems that PostgreSQL chose the first option, which, in my humble opinion, can occur for only two reasons:
Your table statistics are not accurate. I recommend running ANALYZE (or VACUUM ANALYZE) on the table, then trying the query again;
The channel_id values are unevenly distributed (for example, almost all of the tuples have channel_id = 2), which is why PostgreSQL thinks the index is of no use. Here I recommend using a partial index.
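For illustration, the suggested partial index could be created in a migration like this (a hypothetical sketch: the index name and the hard-coded predicate are assumptions, and a composite index on [:channel_id, :id] would be a more general fix for this ORDER BY ... LIMIT 1 pattern):
class AddPartialIndexToHistories < ActiveRecord::Migration
  def change
    # Index only the rows of the rare channel, so the planner can find
    # them directly instead of walking histories_pkey backwards.
    add_index :histories, :id,
              where: 'channel_id = 1',
              name:  'index_histories_on_id_where_channel_1'
  end
end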
