Rails 6 with multiple databases, auto-switching the connection based on read or create queries - ruby-on-rails

The question might be silly and it's not practiced in the real world. Anyway, kindly give your thoughts/pros/cons.
Let's say I have two databases: a read replica database and a master database.
Scenario 1:
Model.all # It should query the read replica database
Scenario 2:
Model.create(attributes) # It should create data in the master database
Scenario 3:
Model.where(condition: :some_condition).update(attributes) # It should read data from the replica database and update the data in the master database
Note: At runtime, the query should be detected and routed to the right database in the above 3 scenarios.
Questions:
Is this a valid expectation?
If yes, how can this be achieved completely or partially?
If no, what is wrong with this case and what issues will we face?

Rails 6 provides a framework for auto-routing incoming requests to either the primary database connection or a read replica.
By default, this new functionality allows your app to automatically route read requests (GET, HEAD) to a read-replica database if it has been at least 2 seconds since the last write request (any request that is not a GET or HEAD request) was made.
The logic that decides when a read request should be routed to a replica lives in a resolver class, ActiveRecord::Middleware::DatabaseSelector::Resolver by default, which you would override if you wanted custom behavior.
The middleware also provides a session class, ActiveRecord::Middleware::DatabaseSelector::Resolver::Session, that is tasked with keeping track of when the last write request was made. Like the resolver, this class can also be overridden.
To enable the default behavior, you would add the following configuration options to one of your app's environment files - config/environments/production.rb for example:
config.active_record.database_selector = { delay: 2.seconds }
config.active_record.database_resolver =
  ActiveRecord::Middleware::DatabaseSelector::Resolver
config.active_record.database_operations =
  ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
If you decide to override the default functionality, you can use these configuration options to specify the delay you'd like to use, the name of your custom resolver class, and the name of your custom session class, both of which should be descendants of the default classes.
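For illustration, a custom resolver might look something like the sketch below. The hook it overrides (time_since_last_write_ok?) and the context.last_write_timestamp call are internal details of the Rails 6 default classes, so verify them against your Rails version's source before relying on this:
# config/initializers/database_selector.rb
# Sketch only: keep requests pinned to the primary for a longer window
# after the last write than the configured delay.
class StickyPrimaryResolver < ActiveRecord::Middleware::DatabaseSelector::Resolver
  private

  # The default implementation compares against the configured delay;
  # here we hard-code a hypothetical 10-second window instead.
  def time_since_last_write_ok?
    Time.now - context.last_write_timestamp >= 10.seconds
  end
end

# Then, in config/environments/production.rb:
# config.active_record.database_resolver = StickyPrimaryResolver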

Use other database connection and execute query

In our app, we need to switch to a read replica database and read from it for some read-only APIs.
We decided to use an around_action filter for that:
Switch the DB to the read_replica before the action
Yield
Switch back to master.
We used establish_connection for the switching, which did the job, but later we noticed that it's not thread-safe: it causes our other threads to hit "#<ActiveRecord::ConnectionNotEstablished: No connection pool with 'primary' found.>" errors. So this solution would only have worked on single-threaded servers. A sketch of the filter is below.
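For reference, this is roughly what it looked like (a reconstruction sketch; the controller name and the :read_replica/:primary configuration keys are stand-ins for the app's actual names):
class ReadOnlyApiController < ApplicationController
  around_action :use_read_replica

  private

  # WARNING: not thread-safe. establish_connection swaps the connection
  # pool for the whole class, so every thread in the process is affected.
  def use_read_replica
    ActiveRecord::Base.establish_connection(:read_replica)
    yield
  ensure
    ActiveRecord::Base.establish_connection(:primary)
  end
end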
Later we tried to create a new connection pool, as below, which is thread-safe:
databases = Rails.configuration.database_configuration
resolver = ActiveRecord::ConnectionAdapters::ConnectionSpecification::Resolver.new(databases)
spec = resolver.spec(:read_replica)
pool = ActiveRecord::ConnectionAdapters::ConnectionPool.new(spec)
pool.with_connection do |conn|
  conn.execute(sql_query) # raw SQL works here; AR ORM calls do not (see below)
end
The only problem with the above approach is that we can only execute queries using the execute method, like conn.execute(sql_query); any AR ORM query we run inside the with_connection block runs on the original DB and not on the read_replica.
It seems ActiveRecord has its own default connection and uses it when we run AR ORM queries.
Not sure how we can execute an AR ORM query like User.where(id: 1..10) inside the with_connection block.
Please note:
I am aware that we can do this natively in Rails 6; I need to skip that for now.
I am also aware of the Octopus gem; again, I need to skip that.
Appreciate any help, Thanks.
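One classic pre-Rails-6 pattern that makes AR ORM queries hit the replica is an abstract base class with its own connection. A sketch, assuming :read_replica is defined in database.yml (the class names here are hypothetical):
class ReplicaRecord < ActiveRecord::Base
  self.abstract_class = true
  # Each abstract class owns its own connection pool, which is what makes
  # this thread-safe, unlike re-pointing ActiveRecord::Base itself.
  establish_connection :read_replica
end

class ReplicaUser < ReplicaRecord
  self.table_name = 'users'
end

# ReplicaUser.where(id: 1..10) now runs against the read replica.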

Putting a class instance into a class constant in initializers

In one of my old apps, I'm using several API connectors - like AWS or Mandrill, for example.
For some reason (maybe I saw it somewhere, I don't remember), I use a class constant to initialize these objects at the init stage of the application.
As example:
/initializers/mandrill.rb:
require 'mandrill'
MANDRILL = Mandrill::API.new ENV['MANDRILL_APIKEY']
Now I can access the MANDRILL constant anywhere in my application and use it (full path MyApplication::Application::MANDRILL, or just MANDRILL). All working fine, example:
def update_mandrill
  result = MANDRILL.inbound.update_route id, pattern, url
end
The question is: is it good practice to use such class constants? Or is it better to create a new class instance in every method that uses the object, as in this example:
def update_mandrill
  require 'mandrill'
  mandrill = Mandrill::API.new ENV['MANDRILL_APIKEY']
  result = mandrill.inbound.update_route id, pattern, url
end
Interesting question.
It's a very handy approach but it may have cons in some scenarios.
Imagine you have a constant that either takes a lot of time to initialize or loads a lot of data into memory. When its initialization takes long, you essentially degrade app boot time (which may or may not be a problem; it usually will be in development).
If it loads a lot of data into memory, it may turn out to be a problem when running rake tasks, for example, which load the entire environment. You may hit memory limits in use cases where you essentially do not need this data at all.
I know one application which loads a lot of data during boot - and it's done very deliberately. Sure, the use case is a bit uncommon, but still.
Another thing to consider: imagine you're trying to establish a connection to an external service like Mongo or anything else. If this service is unavailable (which happens), your application won't be able to boot. Maybe this service is essential for the app to work, and without it the app would be "useless" anyway, but it's also possible that you bring everything down just because the storage in which you keep logs is down.
I'm not saying you shouldn't use it as you suggested - I do it also in my apps, but you should be aware of potential drawbacks.
Yes, pre-creating a pseudo-constant object (like that API client) is usually a good idea. However, there are approximately a thousand ways to go about it, and the constant is not at the top of my personal list.
These days I usually go with setting it in the env files.
# config/environments/production.rb
config.email_client = Mandrill::API.new ENV['MANDRILL_APIKEY'] # the real thing
# config/environments/test.rb
config.email_client = a_null_object # something that conforms to the same api, but does absolutely nothing
# config/environments/development.rb
config.email_client = a_dev_object # post to local smtp, or something
Then you refer to the client like this:
Rails.application.configuration.email_client
And the correct behaviour will be picked up in each env.
If I don't need this per-env variation, then I either use some kind of singleton object (EmailClient.get) or a global variable in the initializer ($email_client). It can be argued that a constant is better than a global variable, semantically and because it raises a warning when you try to re-assign it. But I like that a global variable stands out more: you see right away that it's something special. (Then again, it's only #3 on my list, so I don't do it very often.)
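For completeness, a minimal sketch of the singleton-style option mentioned above (EmailClient is a hypothetical name; the point is the lazy, memoized construction):
# app/lib/email_client.rb
class EmailClient
  # Build the client on first use and memoize it for the process lifetime.
  # Lazy construction also avoids the boot-time drawbacks discussed above:
  # an unreachable external service no longer prevents the app from booting.
  def self.get
    @client ||= Mandrill::API.new(ENV['MANDRILL_APIKEY'])
  end
end

# Usage: EmailClient.get.inbound.update_route id, pattern, url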

How do I bypass Rails.cache for a single request or code block?

I have an API endpoint that aggregates a bunch of data from code that leverages Rails.cache for small pieces of data here and there. There are times, however, when I want 100% up-to-date data, as if Rails.cache were empty. Obviously I could clear the cache prior to aggregating the data, but that would affect unrelated data and requests.
Is there a way for me to have a request in Rails act as if Rails.cache were empty, similar to how it would behave if Rails.cache were configured with :null_store?
The query cache in ActiveRecord has something like this - an "uncached" method that you can pass a block to, where the block runs with the query cache disabled. I need something similar, but for Rails.cache in general.
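(For reference, the ActiveRecord query-cache version referred to above looks like this:)
User.uncached do
  User.find(1) # hits the database even if the query cache holds this query
end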
Since it does not appear there is a solution to this out of the box, I coded a solution of my own by adding the following code as config/initializers/rails_cache.rb
module Rails
  class << self
    alias :default_rails_cache :cache

    def cache
      # Allow any thread to override Rails.cache with its own cache implementation.
      RequestStore.store[:rails_cache] || default_rails_cache
    end
  end
end
This allows any thread to specify its own cache store, which will then be used for all fetches, reads, and writes. As such, it will not read from the default Rails.cache, nor will its values be written to the default Rails.cache.
If the thread is long-running and benefits from having caching enabled, you can easily set this to its own MemoryStore instance:
RequestStore.store[:rails_cache] = ActiveSupport::Cache.lookup_store(:memory_store)
And if you want caching completely off for this thread, you can use :null_store instead of :memory_store.
If you are not using the request_store gem, RequestStore.store can be replaced with Thread.current for the same effect - you just have to be more careful about thread reuse across requests.
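Building on this, a small helper can scope the override to a block and restore the previous store afterwards. A sketch; Rails.uncached is a made-up name here, not a Rails API:
module Rails
  # Run the block with Rails.cache reading nothing and discarding writes,
  # then restore whatever per-thread store (if any) was set before.
  def self.uncached
    previous = RequestStore.store[:rails_cache]
    RequestStore.store[:rails_cache] = ActiveSupport::Cache.lookup_store(:null_store)
    yield
  ensure
    RequestStore.store[:rails_cache] = previous
  end
end

# Usage in the aggregating endpoint (build_aggregate_report is hypothetical):
# Rails.uncached { build_aggregate_report }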

Rails - how to cache data for server use, serving multiple users

I have a class method (placed in /app/lib/) which performs some heavy calculations and sub-HTTP requests until a result is received.
The result isn't too dynamic, and it is requested by multiple users accessing a specific view in the app.
So I want to schedule a periodic run of the method (using cron and the Whenever gem), store the results somewhere on the server in JSON format and, on demand, read only the stored results into the view.
How can this be achieved? What would be the correct way of doing this?
What I currently have:
def heavy_method
  response = {}
  # some calculations, eventually building the response
  File.open(File.expand_path('../../../tmp/cache/tests_queue.json', __FILE__), "w") do |f|
    f.write(response.to_json)
  end
end
and also a corresponding method to read this file.
I searched but couldn't find an example of achieving this using the Rails cache conventions (rather than private code that I wrote) for data which isn't related to ActiveRecord.
Thanks!
Your solution should work fine, but using Rails.cache would be cleaner and a bit faster. The Rails guides provide enough information about Rails.cache and how to get it working with memcached; let me summarize how I would use it in your case.
Heavy method
def heavy_method
  response = {}
  # some calculations, eventually building the response
  Rails.cache.write("heavy_method_response", response)
end
Request
response = Rails.cache.fetch("heavy_method_response")
The only problem here is that when your server starts for the first time, the cache will be empty. The same happens if/when memcached restarts.
One advantage is that somewhere along the flow, the data you pass in is marshalled into storage and then unmarshalled on the way out. That means you can pass in complex data structures and don't need to serialize to JSON manually.
Edit: memcached will clear your item if it runs out of memory. This will be very rare, since it uses an LRU (I think) algorithm to expire things, and I presume you will read this key often.
To prevent this (a combined sketch follows the list):
set expires_in larger than your cron period,
change your fetch code to call heavy_method if the fetch fails (like Rails.cache.fetch("heavy_method_response") { heavy_method }), and change heavy_method to just return the object, or
use something like Redis, which will not delete items.
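Putting those suggestions together, the read path could look like this (a sketch; the 2-hour expires_in is an assumed value, to be set comfortably larger than your cron period):
# Cron (via Whenever) is assumed to call heavy_method periodically to
# refresh the cache; the block is only a fallback for a cold or evicted cache.
def cached_heavy_response
  Rails.cache.fetch("heavy_method_response", expires_in: 2.hours) do
    heavy_method # now returns the response object instead of writing a file
  end
end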

django.db.utils.IntegrityError: (1062, "Duplicate entry '22-add_' for key 'content_type_id'")

I am using Django's multiple-DB router concept, with multiple sites using different DBs. A base-database user will log in on all the other sub-sites.
When I run syncdb on the base site it works properly (at any time), but running syncdb on the other sites works only the first time; from then on it throws an integrity error like the one below:
django.db.utils.IntegrityError: (1062, "Duplicate entry '22-add_somesame' for key 'content_type_id'")
Once I removed the multiple-DB router settings from the project, syncdb worked properly (at any time).
So is this related to the multiple-DB router? Or what else?
Please, can anyone advise on this? Thanks.
The problem here is with the DB router and Django system objects. I've experienced the same issue with multiple DBs and routers. As I remember, the problem is with the auth.permission content types, which get mixed up between databases. The syncdb script tries to create these in all databases, and then it creates a permission content type for some object whose id is already reserved for a local model.
I have the following:
BASE_DB_TYPES = (
    'auth.user',
    'auth.group',
    'auth.permission',
    'sessions.session',
)
and then in the db router:
def db_for_read(self, model, **hints):
    if hasattr(model, '_meta') and str(model._meta) in BASE_DB_TYPES:
        return 'base_db'  # the alias of the base db that will store users
    return None  # the default database, or some custom mapping
EDIT:
Also, the exception might be saying that you're declaring a permission 'add_somesame' for your model 'somesame', while Django automatically creates add_, change_, and delete_ permissions for all objects.
