How to set transaction isolation level using ActiveRecord connection? - ruby-on-rails

I need to manage transaction isolation level on a per-transaction basis in a way portable across databases (SQLite, PostgreSQL, MySQL at least).
I know I can do it manually, like this:
User.connection.execute('SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE')
...but I would expect something like:
User.isolation_level(:serializable) do
  # ...
end

This functionality is supported by ActiveRecord itself:
MyRecord.transaction(isolation: :read_committed) do
  # do your transaction work
end
It supports the ANSI SQL isolation levels:
:read_uncommitted
:read_committed
:repeatable_read
:serializable
This method has been available since Rails 4; it was unavailable when the OP asked the question. For any reasonably modern Rails application this should be the way to go.
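For Rails versions before 4, where transaction(isolation: ...) is not available, the manual approach from the question can at least be kept in one place. Below is a minimal sketch, assuming a hypothetical helper isolation_sql (not part of ActiveRecord) whose result you would pass to connection.execute. PostgreSQL and MySQL accept this statement, though they differ on whether it has to be issued before or inside the transaction; SQLite, as far as I know, has no equivalent statement.

```ruby
# Hypothetical helper, not part of ActiveRecord: maps an isolation-level
# symbol to the ANSI SQL statement you would pass to connection.execute.
ISOLATION_LEVELS = {
  read_uncommitted: "READ UNCOMMITTED",
  read_committed:   "READ COMMITTED",
  repeatable_read:  "REPEATABLE READ",
  serializable:     "SERIALIZABLE"
}.freeze

def isolation_sql(level)
  # Reject anything that is not one of the four ANSI levels.
  sql_level = ISOLATION_LEVELS.fetch(level) do
    raise ArgumentError, "unknown isolation level: #{level.inspect}"
  end
  "SET TRANSACTION ISOLATION LEVEL #{sql_level}"
end

puts isolation_sql(:read_committed)
```

Rejecting unknown symbols up front keeps a typo from reaching the database as malformed SQL.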

There was no gem available so I developed one (MIT): https://github.com/qertoip/transaction_isolation

It looks like Rails 4 will have this feature out of the box:
https://github.com/rails/rails/commit/392eeecc11a291e406db927a18b75f41b2658253

Related

How do I set the isolation level of all my rails transactions

I'm using RoR version 4.2.3, and I understand I can set the isolation level of my transactions. However, where do I set the isolation level for all transactions, so that I only have to define it once and then not worry about it?
I'm using postgreSQL as my database
There does not seem to be a global isolation option, so you are left with four options:
1. Monkeypatch the existing transaction implementation so that it picks your desired isolation level. (Monkeypatching is not desirable.)
2. Use the correct isolation level explicitly throughout your application:
SomeModel.transaction(isolation: :read_committed)
3. Extend ActiveRecord and create your own custom transaction method.
4. As commented, you may be able to change the default isolation level in the DB configuration. For Postgres, that is the default_transaction_isolation setting.
Example code:
# lib/active_record_extension.rb
module ActiveRecordExtension
  extend ActiveSupport::Concern
  module ClassMethods
    # Wraps transaction so the isolation level is chosen in one place.
    def custom_transaction(&block)
      transaction(isolation: :read_committed, &block)
    end
  end
end
ActiveRecord::Base.send(:include, ActiveRecordExtension)
Then in your initializers:
#config/initializers/activerecord_extension.rb
require "active_record_extension"
Afterwards you can use:
MyModel.custom_transaction do
  <...>
end
And in the future, this will allow you to change the isolation level in one place.
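If some call sites occasionally need a stricter level, the wrapper can take the level as an optional argument while still keeping the default in one place. This is only a sketch: DefaultIsolation, LEVEL and FakeModel are hypothetical names, and FakeModel is a stand-in for a real model so the example runs without a database.

```ruby
# Hypothetical module; DefaultIsolation and LEVEL are illustrative names,
# not ActiveRecord API.
module DefaultIsolation
  # The single place where the application-wide level is chosen.
  LEVEL = :read_committed

  # Delegates to the host class's transaction method; callers may override
  # the level for an individual transaction.
  def custom_transaction(isolation: LEVEL, &block)
    transaction(isolation: isolation, &block)
  end
end

# Stand-in for a model class so the sketch runs without a database.
class FakeModel
  extend DefaultIsolation

  # Records the levels it was asked for, in call order.
  def self.levels
    @levels ||= []
  end

  def self.transaction(isolation:)
    levels << isolation
    yield
  end
end

FakeModel.custom_transaction { puts "regular work" }
FakeModel.custom_transaction(isolation: :serializable) { puts "critical work" }
puts FakeModel.levels.inspect
```

With a real model you would extend ActiveRecord::Base (or ApplicationRecord) instead of FakeModel.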
Rails doesn't support setting a global isolation level, but Postgres lets you set one for a session. You can hook into Rails' connection establishment to run a command every time a connection is made, though the techniques for this all rely on monkeypatching and may be questionable.
Run Raw SQL in Rails after connecting to Database
Can I hook into ActiveRecord connection establishment?
Then configure your isolation level with:
SET SESSION CHARACTERISTICS AS TRANSACTION transaction_mode
Though this is interesting, I'd go with something more like Magnuss's answer for maintainability and sanity.

Is it possible to duck type an ActiveRecord ORM?

In an app I'm working on, the production database is Oracle and the development db is sqlite.
Most of the app code is high level ActiveRecord, but there is some custom sql for reporting. This sql varies depending on the backend db.
Rather than extending the ORM and adapters, or writing if statements throughout the application, is it possible to duck type the connection such that something like the below code is possible:
if Archive.connection.supports_function?("EXTRACT")
  Archive.select("extract(year from created_at)")...
else
  Archive.select("strftime('%Y', created_at)")...
end
I might be completely misunderstanding your requirement but you can check the adapter and change the code used for a method easily enough.
If you want to add a new extract method to ActiveRecord that behaves in two ways, for example:
# config/initializers/active_record_extract.rb
class ActiveRecord::Base
  # Builds the date-part select in whichever SQL dialect the adapter speaks.
  def self.extract_agnostic(oracle_column, default_column)
    if connection.adapter_name.downcase.include?('oracle')
      return select("extract(#{oracle_column} from created_at)")
    end
    # strftime expects its format as a quoted SQL string.
    select("strftime('#{default_column}', created_at)")
  end
end
# Usage:
Archive.extract_agnostic("year", "%Y")
Obviously this isn't perfect but should get you started?
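Once there are more than one or two such fragments, a lookup table keyed by adapter may be easier to maintain than if statements. Here is a rough sketch in plain Ruby: YEAR_OF_CREATED_AT and year_expression are hypothetical names, and the adapter strings in the usage lines are assumptions about what connection.adapter_name returns for your adapters.

```ruby
# Hypothetical lookup table: SQL fragment for "year of created_at" per
# database family.
YEAR_OF_CREATED_AT = {
  "oracle" => "extract(year from created_at)",
  "sqlite" => "strftime('%Y', created_at)"
}.freeze

# adapter_name is expected to be a string like the one returned by
# connection.adapter_name; matching is case-insensitive on a substring.
def year_expression(adapter_name)
  key = YEAR_OF_CREATED_AT.keys.find { |k| adapter_name.downcase.include?(k) }
  raise ArgumentError, "no SQL variant for adapter #{adapter_name}" unless key
  YEAR_OF_CREATED_AT[key]
end

puts year_expression("SQLite")
puts year_expression("OracleEnhanced")
```

In the application you would then write something like Archive.select(year_expression(Archive.connection.adapter_name)).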
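I don't think Rails can tell you whether your adapter understands a command, but you could always try wrapping the query you want in a begin/rescue. Note that in Rails 3 and later a query built with select is only sent to the database when its results are loaded, so the records have to be loaded (for example with to_a) inside the begin block for the rescue to fire: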
begin
  self.select("extract(year from created_at)")...
rescue # the above failed, try something else
  self.select("strftime('%Y', created_at)")...
end
Why can't you run an Oracle database for your development environment..? Scratch that - I don't want to know.
Use create_function to plug an extract() method into your SQLite:
http://rdoc.info/github/luislavena/sqlite3-ruby/SQLite3/Database#create_function-instance_method
(And good luck doing THAT in Oracle!;)

Rails 3 : Anticipating migration for 2.3 beginners

I am a beginner in Rails. I use 2.3.X.
I just saw Rails 3 is pre-released [edit: now in release candidate!]. I will most probably eventually switch to it.
What are the common coding habits in 2.3 that I should avoid picking up, so that the switch is as smooth as possible?
Edit:
I've done my homework and read the release notes. But they are far from clear on the most crucial points, for example:
1.5 New APIs
Both the router and query interface have seen significant, breaking changes. There is a backwards compatibility layer that is in place and will be supported until the 3.1 release.
This is not comprehensive enough for a beginner like me. What will break? What could I already do in 2.3.X to avoid trouble later?
Looking at my personal coding habits (I have been using Rails since 1.2.x), here's a list of API changes you can anticipate according to Rails 3 release notes.
find(:all)
Avoid the usage of:
Model.find(:all)
Model.find(:first)
Model.find(:last)
in favour of:
Model.all
Model.first
Model.last
Complex queries
Avoid the composition of complex queries in favor of named scopes.
Anticipate Arel
Rails 3 offers a much cleaner approach for dealing with ActiveRecord conditions and options. You can anticipate it by creating custom named scopes.
class Model < ActiveRecord::Base
  named_scope :limit, lambda { |value| { :limit => value } }
end
# old way
records = Model.all(:limit => 3)
# new way
records = Model.limit(3).all
# you can also take advantage of lazy evaluation
records = Model.limit(3)
# then in your view
records.each { ... }
When upgrading to Rails 3, simply drop the named scope definition.
Constants
Avoid the usage of the following constants in favour of the corresponding Rails.x methods, already available in Rails 2.x.
RAILS_ROOT in favour of Rails.root,
RAILS_ENV in favour of Rails.env, and
RAILS_DEFAULT_LOGGER in favour of Rails.logger.
Unobtrusive Javascript
Avoid heavy JavaScript helpers in favour of unobtrusive JavaScript.
Gem dependencies
Keep your environment.rb as clean as possible to make the migration to Bundler easier. You can also anticipate the migration by using Bundler today, without Rails 3.
The release notes are the most important thing to keep an eye on. Other than that Jeremy McAnally has some neat blog posts about the whole Rails 3 thing (and has just released a gem to help you with the migration).
I'd say read the Rails release notes and see for yourself what seems most surprising to you. A lot of stuff changed, so reading them is definitely very important.

Model-specific SQL logging in rails

In my rails application, I have a background process runner, model name Worker, that checks for new tasks to run every 10 seconds. This check generates two SQL queries each time - one to look for new jobs, one to delete old completed ones.
The problem with this - the main log file gets spammed for each of those queries.
Can I direct the SQL queries spawned by the Worker model into a separate log file, or at least silence them? Overwriting Worker.logger does not work - it redirects only the messages that explicitly call logger.debug("something").
The simplest and most idiomatic solution
logger.silence do
  do_something
end
See Logger#silence
Queries are logged at Adapter level as I demonstrated here.
How do I get the last SQL query performed by ActiveRecord in Ruby on Rails?
You can't change the behavior unless tweaking the Adapter behavior with some really really horrible hacks.
class Worker < ActiveRecord::Base
  def run
    old_level, self.class.logger.level = self.class.logger.level, Logger::WARN
    run_outstanding_jobs
    remove_obsolete_jobs
  ensure
    self.class.logger.level = old_level
  end
end
This is a fairly familiar idiom. I've seen it many times, in different situations. Of course, if you didn't know that ActiveRecord::Base.logger can be changed like that, it would have been hard to guess.
One caveat of this solution: this changes the logger level for all of ActiveRecord, ActionController, ActionView, ActionMailer and ActiveResource. This is because there is a single Logger instance shared by all modules.
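The save-and-restore idiom can be pulled out into a small helper so that it is not repeated in every method. A minimal sketch using only Ruby's standard Logger; with_log_level is a hypothetical name, and in a Rails app you would pass in ActiveRecord::Base.logger rather than the StringIO-backed logger used here for demonstration.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: runs the block with the logger temporarily set to the
# given level and restores the previous level even if the block raises.
def with_log_level(logger, level)
  old_level = logger.level
  logger.level = level
  yield
ensure
  logger.level = old_level
end

io = StringIO.new
logger = Logger.new(io)
logger.level = Logger::DEBUG

with_log_level(logger, Logger::WARN) do
  logger.debug("SELECT * FROM jobs") # suppressed: below WARN
  logger.warn("job 42 failed")       # still logged
end
logger.debug("back to normal")       # logged again at DEBUG

puts io.string
```

The same caveat applies: while the block runs, everything that shares this logger instance is quieter too.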

Database sharding and Rails

What's the best way to deal with a sharded database in Rails? Should the sharding be handled at the application layer, the active record layer, the database driver layer, a proxy layer, or something else altogether? What are the pros and cons of each?
FiveRuns have a gem named DataFabric that does application-level sharding and master/slave replication. It might be worth checking out.
I assume with shards we're talking about horizontal partitioning and not vertical partitioning (here are the differences on Wikipedia).
First off, stretch vertical partitioning as far as you can take it before you consider horizontal partitioning. It's easy in Rails to have different models point to different machines and for most Rails sites, this will bring you far enough.
For horizontal partitioning, in an ideal world, this would be handled at the application layer in Rails. But while it's not hard, it's not trivial in Rails, and by the time you need it, usually your application has grown beyond the point where this is feasible since you have ActiveRecord calls sprinkled all over the place. And no one, developers or management, likes working on it before you need it since everyone would rather work on features users will use now rather than on partitioning which may not come into play for years after your traffic has exploded.
ActiveRecord layer... not easy from what I can see. Would require lots of monkey patching into Rails internals.
At Spock we ended up handling this using a custom MySQL proxy and open sourced it on SourceForge as Spock Proxy. ActiveRecord thinks it's talking to one MySQL database machine when in reality it's talking to the proxy, which then talks to one or more MySQL databases, merges/sorts the results, and returns them to ActiveRecord. It requires only a few changes to your Rails code. Take a look at the Spock Proxy SourceForge page for more details and for our reasons for going this route.
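To make the merge/sort step concrete, here is a toy illustration in plain Ruby of what such a proxy does conceptually. It is not Spock Proxy's actual implementation: SHARDS and query_all_shards are hypothetical names, and each shard is just an in-memory array standing in for a MySQL server.

```ruby
# Each "shard" is a plain array of rows standing in for one MySQL server.
SHARDS = [
  [{ id: 1, name: "alice" }, { id: 4, name: "dave" }],
  [{ id: 2, name: "bob" },   { id: 3, name: "carol" }]
].freeze

# Hypothetical scatter-gather: run the same query (the block) against every
# shard, then merge and sort so the caller sees a single ordered result set.
def query_all_shards(shards, order_by:)
  shards.flat_map { |rows| yield(rows) }.sort_by { |row| row[order_by] }
end

rows = query_all_shards(SHARDS, order_by: :id) do |shard|
  shard.select { |row| row[:id] > 1 }
end

puts rows.map { |row| row[:name] }.inspect
# prints ["bob", "carol", "dave"]
```

The caller never learns which shard a row came from, which is why the application code needs so few changes.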
For those of you like me who hadn't heard of sharding:
http://highscalability.com/unorthodox-approach-database-design-coming-shard
Rails 6.1 provides the ability to switch connections per database, which makes horizontal partitioning possible.
Shards are declared in the three-tier config like this:
production:
  primary:
    database: my_primary_database
    adapter: mysql2
  primary_replica:
    database: my_primary_database
    adapter: mysql2
    replica: true
  primary_shard_one:
    database: my_primary_shard_one
    adapter: mysql2
  primary_shard_one_replica:
    database: my_primary_shard_one
    adapter: mysql2
    replica: true
Models are then connected with the connects_to API via the shards key:
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  connects_to shards: {
    default: { writing: :primary, reading: :primary_replica },
    shard_one: { writing: :primary_shard_one, reading: :primary_shard_one_replica }
  }
end
Then models can swap connections manually via the connected_to API. If using sharding, both a role and a shard must be passed:
ActiveRecord::Base.connected_to(role: :writing, shard: :default) do
  @id = Person.create! # Creates a record in the default shard
end
ActiveRecord::Base.connected_to(role: :writing, shard: :shard_one) do
  Person.find(@id) # Can't find record, doesn't exist because it was created
                   # in the default shard
end
reference:
https://edgeguides.rubyonrails.org/active_record_multiple_databases.html#horizontal-sharding
https://dev.to/ritikesh/multitenant-architecture-on-rails-6-1-27c7
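Rails leaves it to the application to decide which shard a given request or record belongs to. One simple possibility is a deterministic hash of a tenant key, sketched below. SHARD_NAMES and shard_for are hypothetical names, and the shard symbols are assumed to match the keys declared in connects_to.

```ruby
require "zlib"

# Hypothetical list of shard names; assumed to match the connects_to keys.
SHARD_NAMES = [:default, :shard_one].freeze

# Maps a tenant key to a shard deterministically. CRC32 is used because it
# gives the same value in every process, unlike Object#hash for strings.
def shard_for(tenant_key)
  SHARD_NAMES[Zlib.crc32(tenant_key.to_s) % SHARD_NAMES.size]
end

puts shard_for("tenant-42").inspect

# In a Rails 6.1+ app the result would then be passed to connected_to:
#   ActiveRecord::Base.connected_to(role: :writing, shard: shard_for(tenant_key)) do
#     Person.create!
#   end
```

Keep in mind that plain modulo hashing reassigns most tenants when a shard is added, so a stored tenant-to-shard mapping may be preferable if the number of shards is expected to change.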
Connecting Rails to multiple databases is not a big deal: you simply have an ActiveRecord subclass for each shard that overrides the connection property. That makes it pretty simple if you need to make cross-shard calls. You then just have to write a little code when you need to make calls between the shards.
I don't like Hank's idea of splitting the rails instances, because it seems challenging to call the code between the instances unless you have a big shared library.
Also you should look at doing something like Masochism before you start sharding.
For Rails to work in a replicated environment, I would suggest using the my_replication plugin, which helps switch the database connection to one of the slaves at run time:
https://github.com/minhnghivn/my_replication
To my mind, the simplest way is to maintain a 1:1 mapping between Rails instances and DB shards.
A proxy layer is better because it can support all programming languages.
For example: Apache ShardingSphere's proxy.
Apache ShardingSphere has two different products: ShardingSphere-JDBC for the application layer, which supports Java only, and ShardingSphere-Proxy for the proxy layer, which works with all programming languages.
FYI: https://shardingsphere.apache.org/document/current/en/user-manual/shardingsphere-proxy/
It depends on the Rails version. Newer Rails versions provide support for sharding, as said by @Oshan. But if you can't update to a newer version, you can use the octopus gem.
Gem Link
https://github.com/thiagopradi/octopus
