Commit a nested transaction - ruby-on-rails

Let's say I have a method that provides access to an API client in the scope of a user, and the API client will automatically update the user's OAuth tokens when they expire.
class User < ActiveRecord::Base
  def api
    ApiClient.new access_token: oauth_access_token,
                  refresh_token: oauth_refresh_token,
                  on_oauth_refresh: -> (tokens) {
                    # This proc will be called by the API client when an
                    # OAuth refresh occurs
                    update_attributes({
                      oauth_access_token: tokens[:access_token],
                      oauth_refresh_token: tokens[:refresh_token]
                    })
                  }
  end
end
If I consume this API within a Rails transaction and a refresh occurs and then an error occurs - I can't persist the new OAuth tokens (because the proc above is also treated as part of the transaction):
u = User.first
User.transaction {
local_info = Info.create!
# My tokens are expired so the client automatically
# refreshes them and calls the proc that updates them locally.
external_info = u.api.get_external_info(local_info.id)
# Now when I try to locally save the info returned by the API an exception
# occurs (for example due to validation). This rolls back the entire
# transaction (including the update of the user's new tokens.)
local_info.info = external_info
local_info.save!
}
I'm simplifying the example, but basically the consuming of the API and the persistence of data returned by the API need to happen within a transaction. How can I ensure the update to the user's tokens gets committed even if the parent transaction fails?

Have you tried opening a new db connection inside a new thread, and executing the update in that thread?
u = User.first
User.transaction {
local_info = Info.create!
# My tokens are expired so the client automatically
# refreshes them and calls the proc that updates them locally.
external_info = u.api.get_external_info(local_info.id)
# Now when I try to locally save the info returned by the API an exception
# occurs (for example due to validation). This rolls back the entire
# transaction (including the update of the user's new tokens.)
local_info.info = external_info
local_info.save!
# Now open new thread
# In the new thread open new db connection, separate from the one already opened
# In the new connection execute update only for the tokens
# Close new connection and new thread
Thread.new do
ActiveRecord::Base.connection_pool.with_connection do |connection|
connection.execute("Your SQL statement that will update the user tokens")
end
end.join
}
I hope this helps
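One thing to watch in the snippet above: if local_info.save! raises, the code after it, including the Thread.new block, never runs. A minimal sketch of the same idea moved into the refresh callback follows. The helper name on_separate_connection is hypothetical; it only assumes a pool object that responds to with_connection, which ActiveRecord::Base.connection_pool does.

```ruby
# Hypothetical helper: run a block on another thread with its own pooled
# connection, so its statements are not part of the caller's open transaction.
def on_separate_connection(pool, &block)
  Thread.new do
    # with_connection checks a connection out for this thread and returns it
    # to the pool when the block finishes.
    pool.with_connection { |conn| block.call(conn) }
  end.value # waits for the thread and re-raises any exception it raised
end

# Assumed usage inside the User model from the question:
#
#   on_oauth_refresh: ->(tokens) {
#     on_separate_connection(ActiveRecord::Base.connection_pool) do |conn|
#       conn.execute("Your SQL statement that will update the user tokens")
#     end
#   }
```

Because the statement commits on its own connection, it survives a rollback of the outer transaction. If the outer transaction has already written to the same users row, the second connection will wait on that row lock, so this is only safe when the user row has not been touched earlier in the transaction.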

Nermin's (the accepted) answer is correct. Here's an update for Rails >= 5.0
Thread.new do
Rails.application.executor.wrap do
record.save
end
# Note: record probably won't be updated here yet since it's async
end
Documented here: Rails guides threading and concurrency

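This discussion from a previous question might help you. It looks like you can pass requires_new: true to mark the child transaction as a sub-transaction (Rails implements it as a savepoint). Note that a savepoint is still rolled back when the outer transaction rolls back, so on its own this does not persist the tokens if the parent fails.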
User.transaction {
  User.transaction(requires_new: true) {
    u.update_attribute(:name, 'test')
  }
  u.update_attribute(:name, 'test2')
  raise 'boom'
}

Related

Rails ActionCable and Ember CLI app - Resource Bottlenecks

We've successfully implemented real time updates in our app using ActionCable in Rails and implemented the consumer as a client service in Ember CLI, but we are looking for a better, less expensive approach.
app/models/myobj.rb
has_many :child_objs
def after_commit
ActionCable.server.broadcast("obj_#{self.id}", model: "myobj", id: self.id)
self.child_objs.update_all foo: bar
end
app/models/child_obj.rb
belongs_to :myobj
def change_job
self.job = 'foo'
self.save
ActionCable.server.broadcast("obj_#{self.myobj.id}", model: "child_obj", id: self.id)
end
frontend/app/services/stream.js
Here we're taking the model and id data from the broadcast and using it to reload from the server.
import Ember from 'ember';
export default Ember.Service.extend({
store: Ember.inject.service(),
subscribe(objId) {
let store = this.get("store")
MyActionCable.cable.subscriptions.create(
{channel: "ObjChannel", id: objId}, {
received(data) {
store.findRecord(data.model, data.id, {reload: true});
}
}
);
},
});
This approach "works" but feels naïve and is resource intensive, hitting our server again for each update, which requires re-authenticating the request, grabbing data from the database, re-serializing the object (which could have additional database pulls), and sending it across the wire. This does in fact cause pool and throttling issues if the number of requests is high.
I'm thinking we could potentially send the model, id, and changeset (self.changes) in the Rails broadcast, and have the Ember side handle setting the appropriate model properties. Is this the correct approach, or is there something else anyone recommends?
You should be fine sending the whole entity payload with your change event over the socket. You can then push the payload into the store, creating new records or updating existing ones. This way you avoid the additional server requests.
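A rough sketch of the server side of this suggestion, in plain Ruby. The helper name change_payload and the attribute names are made up for illustration; in the models above, its result would be passed to ActionCable.server.broadcast in place of the hash that carries only model and id.

```ruby
require 'json'

# Hypothetical helper: build a broadcast message that carries the record's
# attributes, so the client can update its store without another request.
def change_payload(model:, id:, attributes:)
  { model: model, id: id, attributes: attributes }
end

payload = change_payload(model: "child_obj", id: 7, attributes: { job: "foo" })

# In the Rails model this would be (assumed usage):
#   ActionCable.server.broadcast("obj_#{myobj.id}", payload)
puts JSON.generate(payload)
```

On the Ember side, the received handler would then push this data into the store instead of calling findRecord with reload: true.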

The strategy for building a talk-to-talk system using em-websocket in Rails?

Maybe it is a good example of a server push system. There are many users in the system, and users can talk with each other. It can be accomplished like this: one user sends a message (through a websocket) to the server, then the server forwards the message to the other user. The key is to find the binding between the ws (websocket object) and the user. Example code is below:
EM.run {
EM::WebSocket.run(:host => "0.0.0.0", :port => 8080, :debug => false) do |ws|
ws.onopen { |handshake|
# extract the user id from handshake and store the binding between user and ws
}
ws.onmessage { |msg|
# extract the text and receiver id from msg
# extract the ws_receiver from the binding
ws_receiver.send(text)
}
end
}
I want to figure out the following issues:
Can the ws object be serialized so it can be stored on disk or in a database? Otherwise I can only store the binding in memory.
What are the differences between em-websocket and websocket-rails?
Which gem do you recommend for websocket?
You're approaching a use case that websockets are pretty good for, so you're on the right track.
You could serialize the ws object with Marshal, but think of websocket objects as being a bit like http request objects in that they are abstractions for a type of communication. You are probably best off marshaling/storing the data.
em-websocket is a lower(ish) level websocket library built more or less directly on EventMachine. websocket-rails is a higher level abstraction on websockets, with a lot of nice tools built in and pretty ok docs. It is built on top of faye-websocket-rails, which is itself built on EventMachine. *Note, Action Cable, which is the new websocket library for Rails 5, is built on faye.
I've used websocket-rails in the past and rather like it. It will take care of a lot for you. However, if you can use Rails 5 and Action Cable, do that; it's the future.
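Since the socket object itself is tied to a live connection, the binding between users and open sockets is typically kept in memory. A minimal sketch, assuming a hypothetical ConnectionRegistry class and any connection object that responds to send (as the ws objects in the question do):

```ruby
# Hypothetical in-memory registry mapping user ids to open connections.
class ConnectionRegistry
  def initialize
    @connections = Hash.new { |hash, user_id| hash[user_id] = [] }
    @lock = Mutex.new
  end

  # Call from ws.onopen once the user id is known.
  def add(user_id, ws)
    @lock.synchronize { @connections[user_id] << ws }
  end

  # Call from ws.onclose so dead sockets are not kept around.
  def remove(user_id, ws)
    @lock.synchronize do
      @connections[user_id].delete(ws)
      @connections.delete(user_id) if @connections[user_id].empty?
    end
  end

  # Forward text to every connection of the receiver.
  # Returns false when the receiver has no open connection.
  def deliver(user_id, text)
    targets = @lock.synchronize { @connections.fetch(user_id, []).dup }
    targets.each { |ws| ws.send(text) }
    !targets.empty?
  end
end
```

In the question's skeleton, ws.onopen would call add, and ws.onmessage would call deliver with the receiver id extracted from the message. A false return is the point where an offline message could be stored instead.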
The following is in addition to Chase Gilliam's succinct answer which included references to em-websocket, websocket-rails (which hadn't been maintained in a long while), faye-websocket-rails and ActionCable.
I would recommend the Plezi framework. It works both as an independent application framework as well as a Rails Websocket enhancement.
I would consider the following points as well:
Do you need the message to persist between connections (i.e. if the other user is offline, should the message wait in a "message box"? For how long should the message wait?)
Do you wish to preserve message history?
These points would help you decide whether to use persistent storage (i.e. a database) for the messages or not.
For example, to use Plezi with Rails, create an init_plezi.rb in your application's config/initializers folder. Use (as an example) the following code:
class ChatDemo
  # use JSON events instead of raw websockets
  @auto_dispatch = true

  protected # protected functions are hidden from regular HTTP requests

  def auth msg
    @user = User.auth_token(msg['token'])
    return close unless @user
    # creates a websocket "mailbox" that will remain open for 9 hours.
    register_as @user.id, lifetime: 60*60*9, max_connections: 5
  end

  def chat msg, received = false
    unless @user # require authentication first
      close
      return false
    end
    if received
      # this is only true when we sent the message
      # using the `broadcast` or `notify` methods
      write msg # writes to the client websocket
    end
    msg['from'] = @user.id
    msg['time'] = Plezi.time # an existing time object
    unless msg['to'] && registered?(msg['to'])
      # send an error message event
      return {event: :err, data: 'No recipient or recipient invalid'}.to_json
    end
    # everything was good, let's send the message and inform
    # this will invoke the `chat` event on the other websocket
    # notice the `true` is setting the `received` flag.
    notify msg['to'], :chat, msg, true
    # returning a String will send it to the client
    # when using the auto-dispatch feature
    {event: 'message_sent', msg: msg}.to_json
  end
end
# remember our route for websocket connections.
route '/ws_chat', ChatDemo
# a route to the Javascript client (optional)
route '/ws/client.js', :client
Plezi sets up its own server (Iodine, a Ruby server), so remember to remove from your application any references to puma, thin or any other custom server.
On the client side you might want to use the Javascript helper provided by Plezi (it's optional)... add:
<script src='/ws/client.js'></script>
<script>
TOKEN = "<%= @user.token %>";
c = new PleziClient(PleziClient.origin + "/ws_chat") // the client helper
c.log_events = true // debug
c.chat = function(event) {
// do what you need to print a received message to the screen
// `event` is the JSON data. i.e.: event.event == 'chat'
}
c.error = function(event) {
// do what you need to print a received message to the screen
alert(event.data);
}
c.message_sent = function(event) {
// invoked after the message was sent
}
// authenticate once connection is established
c.onopen = function(event) {
c.emit({event: 'auth', token: TOKEN});
}
// // to send a chat message:
// c.emit({event: 'chat', to: 8, data: "my chat message"})
</script>
I didn't test the actual message code because it's just a skeleton and also it requires a Rails app with a User model and a token that I didn't want to edit just to answer a question (no offense).

Refresh OAuth client token in a thread-safe way

My app is consuming an OAuth resource and, from time to time, an access token must be refreshed using its refresh token. To this end, I'm doing something like:
record = MyClientModel.find(...)
client = OAuthClient.new(record.access_token)
begin
tries ||= 2
client.do_something
rescue ExpiredOAuthToken => e
new_access_token, new_refresh_token =
client.refresh_token(record.refresh_token)
client.access_token = new_access_token
record.access_token = new_access_token
record.refresh_token = new_refresh_token
record.save
retry unless (tries -= 1).zero?
raise e
end
This code is designed to be run simultaneously in web requests and in worker processes but, obviously, it is not thread-safe, e.g.:
record.access_token expires
Thread A encounters ExpiredOAuthToken
Thread A calls #refresh_token
Thread B encounters ExpiredOAuthToken
Thread B calls #refresh_token but fails because record.refresh_token is now invalid
Thread A persists new token
Thread A continues
I've never really had to think about thread safety before so I'm looking for suggestions on how I might go about improving this code.
I have recently come across this issue - in my case, I am storing my access tokens in an ActiveRecord model, so I can use the ActiveRecord with_lock method (with MySQL / PostgreSQL).
# ...inside Authorization model instance methods
def refresh
# do not allow more than one refresh to occur simultaneously
with_lock do
# if the token was just refreshed, return (there may be threads waiting on this token)
return oauth_token if updated_at > 5.seconds.ago
oauth = OmniAuth::Strategies::Custom.new(nil)
token =
OAuth2::AccessToken.new(
oauth.client,
oauth_token,
refresh_token: oauth_refresh_token
)
new_token = token.refresh!
return unless new_token
update!(
oauth_token: new_token.token,
oauth_refresh_token: new_token.refresh_token,
oauth_expires_at: Time.at(new_token.expires_at)
) && new_token.token
end
end
When my threads encounter an expired token, they all call refresh, but only the first refresh is executed; the other threads wait and immediately get the new token once it has been updated.
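with_lock takes a row-level database lock, so it coordinates across processes. The pattern it relies on (take the lock, then check again whether someone else already refreshed) can be sketched in plain Ruby with a Mutex, which only protects threads inside one process. TokenStore and the refresher block are hypothetical stand-ins for the model and the OAuth client.

```ruby
# Hypothetical in-process token holder. A Mutex only serializes threads in
# one process; across processes a database lock such as with_lock is needed.
class TokenStore
  attr_reader :access_token

  def initialize(access_token, &refresher)
    @access_token = access_token
    @refresher = refresher # block that returns a fresh access token
    @lock = Mutex.new
  end

  # stale_token is the token the caller saw fail. If another thread already
  # replaced it while we waited for the lock, reuse the new one.
  def refresh(stale_token)
    @lock.synchronize do
      @access_token = @refresher.call if @access_token == stale_token
      @access_token
    end
  end
end

calls = 0
store = TokenStore.new("expired") { calls += 1; "fresh-#{calls}" }

# Ten threads all notice the same expired token at once.
tokens = Array.new(10) { Thread.new { store.refresh("expired") } }.map(&:value)

puts "refreshes performed: #{calls}"
puts "tokens seen: #{tokens.uniq.inspect}"
```

Comparing against the stale token avoids the time-based check (updated_at > 5.seconds.ago) used in the answer, at the cost of each caller having to pass in the token it tried.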

Thread running in Middleware is using old version of parent's instance variable

I've used the Heroku tutorial to implement websockets.
It works properly with Thin, but does not work with Unicorn and Puma.
Also there's an echo message implemented, which responds to client's message. It works properly on each server, so there are no problems with websockets implementation.
Redis setup is also correct (it catches all messages, and executes the code inside subscribe block).
How does it work now:
On server start, an empty @clients array is initialized. Then a new Thread is started, which listens to Redis and is intended to send each message to the corresponding user from the @clients array.
On page load, a new websocket connection is created and stored in the @clients array.
If we receive a message from the browser, we send it back to all clients connected with the same user (that part is working properly on both Thin and Puma).
If we receive a message from Redis, we also look up all of the user's connections stored in the @clients array.
This is where the weird thing happens:
If running with Thin, it finds connections in the @clients array and sends the message to them.
If running with Puma/Unicorn, the @clients array is always empty, even if we try it in this order (without page reload or anything):
Send message from browser -> @clients.length is 1, message is delivered
Send message via Redis -> @clients.length is 0, message is lost
Send message from browser -> @clients.length is still 1, message is delivered
Could someone please clarify what I am missing?
Related config of Puma server:
workers 1
threads_count = 1
threads threads_count, threads_count
Related middleware code:
require 'faye/websocket'

class NotificationsBackend
  def initialize(app)
    @app = app
    @clients = []
    Thread.new do
      redis_sub = Redis.new
      redis_sub.subscribe(CHANNEL) do |on|
        on.message do |channel, msg|
          # logging @clients.length from here will always return 0
          # [..] retrieve user
          send_message(user.id, { message: "ECHO: #{event.data}"} )
        end
      end
    end
  end

  def call(env)
    if Faye::WebSocket.websocket?(env)
      ws = Faye::WebSocket.new(env, nil, {ping: KEEPALIVE_TIME })
      ws.on :open do |event|
        # [..] retrieve current user
        if user
          # add ws connection to @clients array
        else
          # close ws
        end
      end
      ws.on :message do |event|
        # [..] retrieve current user
        # publish takes a channel and a string message
        Redis.current.publish(CHANNEL, { user_id: user.id, message: "ECHO: #{event.data}" }.to_json)
      end
      ws.rack_response
    else
      @app.call(env)
    end
  end

  def send_message user_id, message
    # logging @clients.length here will always return correct result
    # cs = all connections which belong to that client
    cs.each { |c| c.send(message.to_json) }
  end
end
Unicorn (and apparently puma) both start up a master process and then fork one or more workers. fork copies (or at least presents the illusion of copying - an actual copy usually only happens as you write to pages) your entire process but only the thread that called fork exists in the new process.
Clearly your app is being initialised before being forked - this is normally done so that workers can start quickly and benefit from copy on write memory savings. As a consequence your redis checking thread is only running in the master process whereas @clients is being modified in the child process.
You can probably work around this by either deferring the creation of your redis thread or disabling app preloading. However, you should be aware that your setup will prevent you from scaling beyond a single worker process (which with puma and a thread friendly runtime like JRuby would be less of a constraint).
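The fork behaviour described above can be seen in a few lines of plain Ruby (Unix-like systems only, since it relies on fork). This is a sketch; the helper name thread_count_in_child is made up, and the sleeping thread stands in for the Redis listener.

```ruby
# Returns how many Ruby threads exist inside a freshly forked child process.
def thread_count_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.puts Thread.list.size # only the thread that called fork survives
    writer.close
    exit! # leave immediately, skipping at_exit hooks inherited from the parent
  end
  writer.close
  count = reader.read.to_i
  reader.close
  Process.wait(pid)
  count
end

# Stand-in for the Redis listener started in the middleware's initialize.
listener = Thread.new { sleep }

puts "listener alive in parent: #{listener.alive?}"
puts "threads in child: #{thread_count_in_child}"

listener.kill
```

The child sees a copy of every object (including an empty @clients array in the middleware case) but none of the parent's background threads, which is why the listener has to be started after the fork.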
Just in case somebody will face the same problem, here are two solutions I have come up with:
1. Disable app preloading (this was the first solution I have come up with)
Simply remove preload_app! from the puma.rb file. Each worker then initializes the app itself, so the Redis thread and the @clients variable live in the same process, and @clients is accessible by the other middleware methods (like call etc.)
Drawback: you will lose all benefits of app preloading. It is OK if you have only 1 or 2 workers with a couple of threads, but if you need a lot of them, then it's better to have app preloading. So I continued my research, and here is another solution:
2. Move thread initialization out of initialize method (this is what I use now)
For example, I moved it to the call method, so this is how the middleware class code looks:
attr_accessor :subscriber

def call(env)
  @subscriber ||= Thread.new do # if no subscriber present, init new one
    redis_sub = Redis.new(url: ENV['REDISCLOUD_URL'])
    redis_sub.subscribe(CHANNEL) do |on|
      on.message do |_, msg|
        # parsing message code here, retrieve user
        send_message(user.id, { message: "ECHO: #{event.data}"} )
      end
    end
  end
  # other code from method
end
Both solutions solve the same problem: Redis-listening thread will be initialized for each Puma worker/thread, not for main process (which is actually not serving requests).
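One caveat with @subscriber ||= Thread.new is that it is not atomic: two request threads arriving at the same moment could each start a subscriber. A sketch of a guarded variant follows, using a hypothetical LazySubscriber class that also starts a new thread if the process has been forked since the last one was created.

```ruby
# Hypothetical helper that keeps at most one background thread per process.
class LazySubscriber
  def initialize(&work)
    @work = work
    @lock = Mutex.new
    @thread = nil
    @pid = nil
  end

  # Safe to call on every request; cheap once the thread is running.
  def ensure_started
    @lock.synchronize do
      # A thread created before a fork does not run in the child, so compare
      # the pid it was started under with the current one.
      if @thread.nil? || @pid != Process.pid || !@thread.alive?
        @pid = Process.pid
        @thread = Thread.new(&@work)
      end
      @thread
    end
  end
end
```

In the middleware, the Redis subscribe loop would be the block given to LazySubscriber.new in initialize, and call would begin with a call to ensure_started.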

Optimal way to structure polling external service (RoR)

I have a Rails application that has a Document with the flag available. The document is uploaded to an external server where it is not immediately available (it takes time to propagate). What I'd like to do is poll the availability and update the model when available.
I'm looking for the most performant solution for this process (service does not offer callbacks):
Document is uploaded to app
app uploads to external server
app polls url (http://external.server.com/document.pdf) until available
app updates model Document.available = true
I'm stuck on 3. I'm already using sidekiq in my project. Is that an option, or should I use a completely different approach (cron job)?
Documents will be uploaded all the time and so it seems relevant to first poll the database/redis to check for Documents which are not available.
See this answer: Making HTTP HEAD request with timeout in Ruby
Basically you set up a HEAD request for the known url and then asynchronously loop until you get a 200 back (with a 5 second delay between iterations, or whatever).
Do this from your controller after the document is uploaded:
Document.delay.poll_for_finished(@document.id)
And then in your document model:
def self.poll_for_finished(document_id)
  document = Document.find(document_id)
  # make sure the document exists and should be polled for
  return unless document.continue_polling?
  if document.remote_document_exists?
    document.available = true
  else
    document.poll_attempts += 1 # assumes you care how many times you've checked, could be ignored.
    Document.delay_for(5.seconds).poll_for_finished(document.id)
  end
  document.save
end

def continue_polling?
  # this can be more or less sophisticated:
  # keep polling only while the document is unavailable and attempts remain
  !available && poll_attempts < 5
end

def remote_document_exists?
  # Net::HTTP.start expects a host name (not a URL), and the timeouts
  # have to be set before the connection is opened
  Net::HTTP.start('external.server.com', 80, open_timeout: 2, read_timeout: 2) do |http|
    "200" == http.head(path).code
  end
end
This is still a blocking operation. Opening the Net::HTTP connection will block if the server you're trying to contact is slow or unresponsive. If you're worried about it use Typhoeus. See this answer for details: What is the preferred way of performing non blocking I/O in Ruby?
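A sketch of the HEAD check and the retry loop in plain Ruby, outside of Sidekiq. The names head_ok? and poll_until are hypothetical; the timeouts are passed to Net::HTTP.start so that they also apply to the connection attempt.

```ruby
require 'net/http'
require 'uri'

# True when a HEAD request to url answers 200 within the timeouts.
def head_ok?(url, timeout: 2)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https',
                  open_timeout: timeout, read_timeout: timeout) do |http|
    http.head(uri.request_uri).code == '200'
  end
rescue Net::OpenTimeout, Net::ReadTimeout, SystemCallError, SocketError
  false # treat an unreachable or slow server as "not available yet"
end

# Calls the block up to max_attempts times, sleeping delay seconds between
# tries. Returns true as soon as the block returns a truthy value.
def poll_until(max_attempts:, delay:)
  max_attempts.times do |attempt|
    return true if yield
    sleep delay if attempt < max_attempts - 1
  end
  false
end

# Assumed usage with the URL from the question:
#   available = poll_until(max_attempts: 5, delay: 5) do
#     head_ok?('http://external.server.com/document.pdf')
#   end
```

Inside a Sidekiq job, rescheduling the job (as delay_for does in the answer) is usually preferable to sleeping, because a sleeping job keeps a worker thread occupied.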
