Thread-safe way of changing the connection search_paths - ruby-on-rails

I want to be able to switch between different DB schemas in a Rails 4 app.
The plan is to add a new middleware in the very beginning of the stack that will do that for me.
The only way to do it is by setting ActiveRecord::Base.connection.schema_search_path = '"$user",my_schema'.
The problem I have with this is that this connection will go to the pool and all the following requests will use the schema that was set in the first one (basically leaking it through).
So the solution I see is to always reset the search path to what it was before and always set it on each request.
But I don't want to do this because:
99% of the requests will go to the default (public) schema, so executing SET search_path TO "$user",my_schema on every request would be an additional query that could have been avoided
higher risk of leaking (other middleware may establish the connection earlier, or some changes to Rails or gems outside of my control)
All that especially applies to threaded servers, like Puma.
So are there any better alternatives to my solution with a middleware?
Thanks.

When you return connections to the pool, you must ensure the pool runs DISCARD ALL; to reset the connection state.
That will clear any SET ROLE, SET SESSION AUTHORIZATION, session variables, search_path setting, etc.
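A minimal sketch of that idea, assuming a pool object that responds to checkout and checkin(conn) and a connection that responds to execute(sql), as ActiveRecord's pool and adapters do. The ResettingCheckout class and its reset_sql parameter are hypothetical names made up for this illustration, not an existing Rails API:

```ruby
# Hypothetical helper, not part of ActiveRecord: borrows a connection from a
# pool and resets its session state before handing it back.
class ResettingCheckout
  def initialize(pool, reset_sql: "DISCARD ALL")
    @pool = pool
    @reset_sql = reset_sql
  end

  # Yields a connection and returns the block's value.
  def with_connection
    conn = @pool.checkout
    yield conn
  ensure
    if conn
      # If the reset raises, the connection is deliberately not checked in,
      # so a session with leftover state never re-enters the pool.
      conn.execute(@reset_sql)
      @pool.checkin(conn)
    end
  end
end
```

Note that DISCARD ALL cannot run inside a transaction block and also deallocates prepared statements, so a narrower statement such as RESET search_path can be passed as reset_sql if only the search path needs clearing.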

Related

How to configure PostgreSQL client_min_messages on Heroku & Rails

I am trying to reduce some logging noise I am getting from PostgreSQL on my Heroku/Rails application. Specifically, I am trying to configure the client_min_messages setting to warning instead of the default notice.
I followed the steps in this post and specified min_messages: warning in my database.yml file but that doesn't seem to have any effect on my Heroku PostgreSQL instance. I'm still seeing NOTICE messages in my logs and when I run SHOW client_min_messages on the database it still returns notice.
Here is a redacted example of the logs I'm seeing in Papertrail:
Nov 23 15:04:51 my-app-name-production app/postgres.123467 [COLOR] [1234-5] sql_error_code = 00000 log_line="5733" application_name="puma: cluster worker 0: 4 [app]" NOTICE: text-search query contains only stop words or doesn't contain lexemes, ignored
I can also confirm that the setting does seem to be in the Rails configuration - Rails.application.config.database_configuration[Rails.env] in a production console does show a hash containing "min_messages"=>"warning"
I also tried manually updating that via the PostgreSQL console - so SET client_min_messages TO WARNING; - but that setting doesn't 'stick'. It seems to be reset on the next session.
How do I configure client_min_messages to be warning on Heroku/Rails?
If all else fails and your log is flooded by the server logs you can't control or client logs you can't trace and switch off, Papertrail lets you filter out anything you don't want. The database/client still generate them, Heroku still passes them to Papertrail, but Papertrail discards them once they come in.
Shotgun method, PostgreSQL-specific
REVOKE SET ON PARAMETER client_min_messages,log_min_messages FROM your_app_user;
REVOKE GRANT OPTION FOR SET ON PARAMETER client_min_messages,log_min_messages FROM your_app_user;
ALTER SYSTEM SET client_min_messages=WARNING;
ALTER SYSTEM SET log_min_messages =WARNING;
ALTER DATABASE db_used_by_your_app SET client_min_messages=WARNING;
ALTER DATABASE db_used_by_your_app SET log_min_messages =WARNING;
ALTER ROLE your_app_user SET client_min_messages=WARNING;
ALTER ROLE your_app_user SET log_min_messages =WARNING;
And then you need to either wait, restart the app, force it to re-connect or restart the db/instance/server/cluster it connects to.
If your app opens and closes connections - just wait and with time old connections will be replaced by new ones using the new settings.
If it uses a pool, it'll keep re-using connections it already has, so you'll have to force it to re-open them for the change to propagate. You might need to restart the app, or they can be force-closed:
SELECT pg_terminate_backend(pid) from pg_stat_activity where pid<>pg_backend_pid();
The reason is that there's no way for you to alter session-level settings on the fly, from the outside - and all of the above only affects defaults for new sessions. The REVOKE will prevent the app user from changing the setting but it'll also throw an error if they actually try. I'm leaving this in for future reference, keeping in mind that at the moment Heroku supports PostgreSQL versions up to 14, and REVOKE SET ON PARAMETER was added in version 15.
To need all these at once, you'd have to be seeing logs from both ends of each connection in your Papertrail, connecting to multiple databases, using different users, who also can keep changing the settings. Check one by one to isolate the root cause.
Context
There's one log written to each client, one or more written on the server.
client_min_messages applies to the client log, sent back in each connection.
log_min_messages applies to the server log(s).
Depending on what feeds the log into your Papertrail, you might need to change only one of these. Manipulating settings can always be tricky because of how and when they apply. You have multiple levels where parameters can be specified, then overridden:
system-level parameters, loaded from postgresql.conf, then postgres -c/pg_ctl -o and postgresql.auto.conf, which reflects changes applied using ALTER SYSTEM SET ... or directly.
database overrides system. Applied with ALTER DATABASE db SET...
user/role overrides database. ALTER ROLE user SET...
session overrides user/role. Changed with SET..., which clients also issue when initializing a connection. If min_messages (the Rails option that sets client_min_messages) is specified both in database.yml and in ENV['DATABASE_URL'], Rails uses the value from the environment variable, overriding the one found in the .yml file. For example: DATABASE_URL=postgres://localhost/rails_event_store_active_record?min_messages=warning
transaction-level parameters are the most specific, overriding all else - they are basically session-level parameters that will change back to their initial setting at the end of transaction. SET LOCAL...
When a new session opens, it loads the system defaults, overrides them with the database-level, then role-level, after which clients typically issue their own SETs.
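The precedence described above can be sketched in a few lines of Ruby. This is only an illustration of the lookup order, not how PostgreSQL implements it; the LEVELS constant and the effective_setting function are hypothetical names:

```ruby
# Levels from least to most specific, as listed above.
LEVELS = %i[system database role session transaction].freeze

# Returns [value, level] for the most specific level that defines the
# setting, or nil when no level defines it.
def effective_setting(values_by_level)
  LEVELS.reverse_each do |level|
    value = values_by_level[level]
    return [value, level] unless value.nil?
  end
  nil
end
```

For instance, with a system default of notice and ALTER ROLE ... SET client_min_messages=WARNING applied, a fresh session resolves to warning at the role level until the client issues its own SET.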
It might be a good idea to make sure you're using the settings in database.yml that you think you're using, since it's possible to have multiple sets of them. There can be some logic in your app that keeps altering the setting.
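One way to reason about which value wins is to reproduce the merge by hand. The sketch below only mimics the precedence described above (URL parameters override the .yml values). It is not Rails's own resolution code, and merged_db_settings is a hypothetical helper:

```ruby
require "uri"

# Merges query parameters from a DATABASE_URL-style string over the settings
# hash read from database.yml. URL values win on conflict.
def merged_db_settings(yml_settings, database_url)
  return yml_settings.dup if database_url.nil? || database_url.empty?

  uri = URI.parse(database_url)
  url_params = URI.decode_www_form(uri.query.to_s).to_h
  yml_settings.merge(url_params)
end
```

Comparing the result with Rails.application.config.database_configuration[Rails.env] in a production console is a quick way to spot a setting that is coming from somewhere other than the file you are editing.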
I think you want log_min_messages, not client_min_messages:
Controls which message levels are written to the server log. Valid values are DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1, INFO, NOTICE, WARNING, ERROR, LOG, FATAL, and PANIC. Each level includes all the levels that follow it. The later the level, the fewer messages are sent to the log. The default is WARNING. Note that LOG has a different rank here than in client_min_messages. Only superusers and users with the appropriate SET privilege can change this setting.
I'm not sure if your database user will be allowed to set it, but you can try doing so at the database level:
ALTER DATABASE your_database
SET log_min_messages TO 'warning';
If this doesn't work, and setting at the role or connection level doesn't work, and heroku pg:settings doesn't work (confirmed via other answers and comments), the answer might unfortunately be that this isn't possible on Heroku.
Heroku Postgres is a managed service, so the vendor makes certain decisions that aren't configurable. This might be one of those situations.

Do constants stay the same for ALL users?

I have a web app that I built. It communicates with the Salesforce API. I have users and administrators. All connections to the API use the same credentials.
I am concerned that my API connection is going to be created multiple times because each admin that is logged in has their own instance of the connection.
If I hold the API connection in a constant, do all other sessions/users have access to that exact connection or do I have to connect for each user, or how can I share one single API connection for ALL users?
A stateless API has no persistent session to hold on to, so there's little use in keeping a connection in a constant. Each HTTP request stands on its own, and although keep-alive can reuse a TCP connection for a while, you cannot rely on it staying open.
It's only things like database or Websocket connections that persist and if you need to manage those you need a connection pool, not a simple constant. If the connection ever fails it needs to be replaced, and if more than one thread potentially requires it you have to handle acquisition and locking properly.
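A minimal sketch of such a pool, using only Ruby's built-in Queue for locking. SimplePool is a hypothetical class written for illustration. For real use, an established gem is a better choice than hand-rolled code:

```ruby
# Hypothetical fixed-size pool. Slots start empty (nil) and connections are
# created lazily by the factory block on first use.
class SimplePool
  def initialize(size:, &factory)
    @factory = factory
    @slots = Queue.new
    size.times { @slots << nil }
  end

  # Blocks until a slot is free, yields a connection, then frees the slot.
  def with
    conn = @slots.pop
    begin
      conn ||= @factory.call
      result = yield conn
    rescue StandardError
      conn = nil # drop a possibly broken connection; the slot is refilled lazily
      raise
    ensure
      @slots << conn
    end
    result
  end
end
```

Queue#pop blocks when every slot is in use, which handles acquisition across threads, and discarding the connection after an error is a deliberately blunt way of replacing one that may have failed.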
Create your API connectors as necessary. Unless you have a measurable performance problem don't worry about it.
A Ruby constant is like a variable, except that its value is supposed to remain constant for the duration of the program. The Ruby interpreter does not actually enforce the constancy of constants, but it does issue a warning if a program changes the value of a constant.
Reference: http://rubylearning.com/satishtalim/ruby_constants.html
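To the original question: within one Ruby process a constant refers to a single object that every thread sees, as this small sketch shows. FakeApiClient and SHARED_CLIENT are stand-in names for illustration only:

```ruby
# Stand-in for a real API client; any object assigned to a constant behaves
# the same way.
class FakeApiClient; end

SHARED_CLIENT = FakeApiClient.new

# Every thread in this process resolves the constant to the same object.
object_ids = Array.new(4) { Thread.new { SHARED_CLIENT.object_id } }.map(&:value)
all_same = object_ids.uniq.size == 1
```

This sharing stops at the process boundary: a server that forks several worker processes gives each worker its own copy of the constant.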

Re-initialize ActiveRecord after rails startup

I'm building a system which allows the user to modify the database.yml contents via an admin frontend interface.
Those changes to database.yml obviously don't have any effect until the application is restarted. Rather than forcing the user (who may not have SSH access to the physical box) to restart the application manually, I'd like a way to force ActiveRecord to reload the config post startup.
Rather than requiring them to restart the server, is there a way to force ActiveRecord to re-initialize after initial startup?
Rationale
There are two use cases for this - a) initial setup wizard b) moving from sqlite evaluation database to production supported database.
Initial Setup Wizard
After installing our application and starting it for the first time the user will be presented with a setup wizard, which amongst other things, allows the user to choose between the built in evaluation database using sqlite or to specify a production supported database. They need to specify the database properties. Rather than asking users to edit yml files on the server we wish to present a frontend to do so during initial setup.
Moving from sqlite evaluation database to production supported database
If the user opted to go with the built in evaluation database, but later wants to migrate to a production database they need to update the database settings to reflect this change. Same reasons as above, a front end rather than requiring the user to edit config files is desired, especially since we can validate connectivity, permissions etc from the application rather than the user finding out it didn't work when they try to start the application and they get an activerecord exception.
Restart your Rails stack on the next request just as you would if you had access to the server.
system("touch #{File.join(Rails.root, 'tmp', 'restart.txt')}")
Building on @wless1's answer, to get ERB-style ENV vars (e.g. <%= ENV['DB_HOST'] %>) working, try this:
config = YAML::load(ERB.new(File.read(File.join(Rails.root, "config/database.yml"))).result)
ActiveRecord::Base.establish_connection config[Rails.env]
Theoretically, you could achieve this with the following:
config = YAML::load File.read(File.join(Rails.root, "config/database.yml"))
ActiveRecord::Base.establish_connection config[Rails.env]
You'd have to execute the reconnection code in every rails process you're running for it to be effective. There's a bunch of dangerous things to worry about here though - race conditions when changing the database config, syntax problems in the new database.yml, etc.
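To guard against syntax problems in the new file, the text can be rendered and checked before anything reconnects. A minimal sketch using only the standard library; load_database_config is a hypothetical helper, and the checks it performs are assumptions about what a usable config needs:

```ruby
require "erb"
require "yaml"

# Renders ERB, parses the YAML and returns the settings hash for env_name.
# Raises Psych::SyntaxError on malformed YAML and ArgumentError when the
# environment or its adapter is missing.
def load_database_config(yaml_text, env_name)
  rendered = ERB.new(yaml_text).result
  # aliases: true because database.yml files often share defaults via anchors.
  config = YAML.safe_load(rendered, aliases: true)
  raise ArgumentError, "database.yml must contain a mapping" unless config.is_a?(Hash)

  settings = config[env_name]
  raise ArgumentError, "no settings for environment #{env_name.inspect}" unless settings.is_a?(Hash)
  raise ArgumentError, "missing adapter for #{env_name.inspect}" unless settings["adapter"]

  settings
end
```

Only when this returns without raising would the result be handed to ActiveRecord::Base.establish_connection, which leaves the existing connection in place if the uploaded file is broken.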

BoneCP does not supply a valid connection if it was created when the database was down

I have a use case where the database (sybase) may be unavailable when bonecp (0.7.1.RELEASE) creates a connection pool. When it is later available however, if my app requests a connection the call never returns.
I'm using out of the box config bonecp defaults, but I've tested this with transactionRecoveryEnabled set to true too.
Considering that c3p0 and dbcp both have this functionality, is there something I'm doing wrong?
Try the lazyInit config setting; it is meant for exactly this case.

Rails+PostgreSQL: search_path depending on subdomain

In our rails 2.x application the search_path of the database connection depends on the subdomain through which the application is contacted (basically search_path = "production_"+subdomain). Because the search_path is defined per connection and database connections are shared over requests, even concurrently, this is a problem. I would rather not change concurrency to only serve one request at a time for obvious reasons.
So is there a way to group the database connections in the connection pool and set some kind of policy that only a fitting connection is used for the request? Or is there a way to use one connection pool per subdomain (where the pools are automatically discarded after a timeout)? Starting a rails instance for each subdomain is no option because there might be many idling subdomains (it's some kind of pro-account where you get a subdomain and your own "world" that differs from the rest of the site in some tables).
What would be the best solution for this problem?
You can just set connection.search_path at the beginning of the request, before any objects are loaded, and you'll be fine. In our case we have a Rack app that wraps our rails app and does this setup for us based on the incoming domain.
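A rough sketch of such a Rack wrapper follows. The SubdomainSearchPath class, its connection_provider parameter and the rule that the first host label names the tenant are all assumptions made for illustration. The "production_" prefix comes from the question, and schema_search_path= is the setter used in the first question above:

```ruby
# Hypothetical Rack middleware: sets the search path from the subdomain before
# the app runs and restores the default afterwards.
class SubdomainSearchPath
  SAFE_NAME = /\A[a-z0-9_]+\z/

  def initialize(app, connection_provider:, default_path: "public")
    @app = app
    @connection_provider = connection_provider
    @default_path = default_path
  end

  def call(env)
    host = env["HTTP_HOST"].to_s.split(":").first.to_s
    subdomain = host.split(".").first.to_s
    conn = @connection_provider.call
    # Only accept plain identifiers so the host header cannot inject SQL.
    conn.schema_search_path =
      subdomain.match?(SAFE_NAME) ? "production_#{subdomain}" : @default_path
    @app.call(env)
  ensure
    # Restore the default so the next request on this connection starts clean.
    conn.schema_search_path = @default_path if conn
  end
end
```

In a Rails app the provider would be something like -> { ActiveRecord::Base.connection }, so the middleware touches the same connection the current thread uses for the rest of the request.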
