Discourse server fails to start with errors related to redis - ruby-on-rails

The Rails server fails to start in the Discourse project in either development or production. Below are the logs from trying to start the server in dev mode. The application was installed and had been working; it is deployed on AWS in production mode, and restarting Unicorn loads the application for a while, but then the URL stops responding and returns error messages.
Development logs from $ rails s
root@ip-XXX-XX-XX-XX-app:/var/www/discourse# vi config/environments/development.rb
root@ip-172-31-25-46-app:/var/www/discourse# rails s
=> Booting Puma
=> Rails 5.1.4 application starting in production
=> Run `rails server -h` for more startup options
Exiting
bundler: failed to load command: script/rails (script/rails)
Redis::CommandError: ERR Error running script (call to f_b06356ba4628144e123b652c99605b873107c9be): #user_script:14: #user_script: 14: -MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis/client.rb:121:in `call'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2399:in `block in _eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:58:in `block in synchronize'
/usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:58:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2398:in `_eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2450:in `evalsha'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus/backends/redis.rb:380:in `cached_eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus/backends/redis.rb:140:in `publish'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus.rb:248:in `publish'
/var/www/discourse/lib/distributed_cache.rb:72:in `publish'
Production logs
/var/www/discourse/lib/demon/base.rb:109:in `ensure_running'
/var/www/discourse/lib/demon/base.rb:34:in `block in ensure_running'
/var/www/discourse/lib/demon/base.rb:33:in `each'
/var/www/discourse/lib/demon/base.rb:33:in `ensure_running'
config/unicorn.conf.rb:145:in `master_sleep'
/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:284:in `join'
/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/bin/unicorn:126:in `<top (required)>'
/var/www/discourse/vendor/bundle/ruby/2.3.0/bin/unicorn:23:in `load'
/var/www/discourse/vendor/bundle/ruby/2.3.0/bin/unicorn:23:in `<main>'
E, [2018-01-04T08:43:37.949928 #60] ERROR -- : reaped #<Process::Status: pid 5870 exit 1> worker=unknown
Detected dead worker 5870, restarting...
Loading Sidekiq in process id 5883
Failed to report error: MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error. 4 Redis::CommandError (MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.)
/var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:121:in `call' web-exception
Redis logs
47:M 17 Jan 09:38:01.070 # Can't save in background: fork: Cannot allocate memory
47:M 17 Jan 09:38:07.087 * 10000 changes in 60 seconds. Saving...

The issue has been fixed. I edited /etc/sysctl.conf and added the following line at the end:
vm.overcommit_memory=1
After this, reloaded the sysctl settings:
$ sudo sysctl -p /etc/sysctl.conf
Redis doesn't need as much memory as the OS assumes; a value of 1 means "always overcommit, never check".
More details can be found in the Redis docs.
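To apply the setting immediately without a reboot and confirm it took effect, something like this should work:
$ sudo sysctl -w vm.overcommit_memory=1
$ cat /proc/sys/vm/overcommit_memory    # should print 1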

Related

After restarting the server instance, Capistrano throws Net::SSH::AuthenticationFailed: Authentication failed for user

This is a maintenance project. We received the production server's private key (id_rsa.txt) from the client to sign in.
When we want to deploy the application, we just add the private key file to the local ssh-agent:
ssh-add id_rsa.txt
then run Capistrano's deployment command, which succeeds:
bundle exec cap deploy
Here are the SSH settings in deploy.rb:
server 'example.com', user: 'app', roles: %w[app db web sidekiq]
set :ssh_options, { forward_agent: true, user: "app", keys: %w(/home/user/id_rsa.txt) }
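A quick way to sanity-check the agent setup before deploying (a rough sketch using the host and user from the snippet above):
ssh-add -l                         # the key should be listed here
ssh -A app@example.com 'echo ok'   # agent-forwarded login should succeed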
Problem: Everything was fine until the instance was restarted via AWS's web console. Since then we have been unable to deploy and get the following error:
/home/rubx/.rvm/gems/ruby-2.7.2@glamz-web/gems/net-ssh-6.1.0/lib/net/ssh.rb:268:in `start': Authentication failed for user user@example.com (Net::SSH::AuthenticationFailed)
1: from /home/rubx/.rvm/gems/ruby-2.7.2@glamz-web/gems/sshkit-1.21.2/lib/sshkit/runners/parallel.rb:11:in `block (2 levels) in execute'
/home/rubx/.rvm/gems/ruby-2.7.2@glamz-web/gems/sshkit-1.21.2/lib/sshkit/runners/parallel.rb:15:in `rescue in block (2 levels) in execute': Exception while executing as app@52.58.220.92: Authentication failed for user user@example.com (SSHKit::Runner::ExecuteError)
cap aborted!
SSHKit::Runner::ExecuteError: Exception while executing as user@example.com: Authentication failed for user user@example.com
/home/rubx/.rvm/gems/ruby-2.7.2@glamz-web/gems/sshkit-1.21.2/lib/sshkit/runners/parallel.rb:15:in `rescue in block (2 levels) in execute'
/home/rubx/.rvm/gems/ruby-2.7.2@glamz-web/gems/sshkit-1.21.2/lib/sshkit/runners/parallel.rb:11:in `block (2 levels) in execute'
Caused by:
Net::SSH::AuthenticationFailed: Authentication failed for user user@example.com
After some googling, I tried the following steps, without success:
Added my public key (.ssh/id_rsa.pub) to authorized_keys on the target server.
Allowed my IP address in the server's inbound rules.
Tried putting only the server key in my ssh-agent.
Note: We can successfully log in to the server using the same key file.
Do I need to configure the server specifically for Capistrano deployment?
Thanks in advance
We first checked syslog (/var/log/syslog) for the log messages, and after some googling learned about /var/log/auth.log.
The following error messages were found in auth.log while the script was establishing a connection to the server:
Jun 11 05:01:36 ip- sshd[1757827]: userauth_pubkey: key type ssh-rsa not in PubkeyAcceptedAlgorithms [preauth]
Jun 11 05:01:36 ip- sshd[1757827]: Connection closed by authenticating user app 15.4.81.22 port 47652 [preauth]
Solution
The problem was solved by adding the ssh-rsa algorithm to PubkeyAcceptedKeyTypes in the sshd configuration:
sudo vi /etc/ssh/sshd_config
PubkeyAcceptedKeyTypes=+ssh-rsa
Restart sshd to apply the changes:
sudo systemctl restart sshd
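As a rough way to verify the change took effect (the key path and user come from the example above):
sudo sshd -T | grep -i pubkeyaccepted
ssh -v -i /home/user/id_rsa.txt app@example.com    # the verbose output should show the key being accepted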

Gitlab upgrade issue in Docker

I am trying to upgrade my GitLab CE instance, which is running in a Docker container, from version 11.9.1 to 14.2.1. I am following the required upgrade path from the official GitLab documentation, which is:
11.9.1->11.11.8->12.0.12->12.1.17->12.10.14->13.0.4->13.1.11->13.8.8->13.12.9->14.0.7->14.2.1
The last version that works is 14.0.7; I can also run the latest 14.1.x version, but during the migration to 14.2.x the following error appears and some migrations fail.
There was an error running gitlab-ctl reconfigure:
rails_migration[gitlab-rails] (gitlab::database_migrations line 51) had an error: Mixlib::ShellOut::ShellCommandFailed: bash[migrate gitlab-rails database] (/opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/resources/rails_migration.rb line 16) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of "bash" "/tmp/chef-script20210903-28-4vm0c2" ----
STDOUT: rake aborted!
StandardError: An error has occurred, all later migrations canceled:
Expected batched background migration for the given configuration to be marked as 'finished', but it is 'active': {:job_class_name=>"CopyColumnUsingBackgroundMigrationJob", :table_name=>"ci_job_artifacts", :column_name=>"id", :job_arguments=>[["id", "job_id"], ["id_convert_to_bigint", "job_id_convert_to_bigint"]]}
Finalize it manualy by running
sudo gitlab-rake gitlab:background_migrations:finalize[CopyColumnUsingBackgroundMigrationJob,ci_job_artifacts,id,'[["id"\, "job_id"]\, ["id_convert_to_bigint"\, "job_id_convert_to_bigint"]]']
For more information, check the documentation
https://docs.gitlab.com/ee/user/admin_area/monitoring/background_migrations.html#database-migrations-failing-because-of-batched-background-migration-not-finished
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers.rb:1129:in `ensure_batched_background_migration_is_finished'
/opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20210706212710_finalize_ci_job_artifacts_bigint_conversion.rb:14:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:61:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Caused by:
Expected batched background migration for the given configuration to be marked as 'finished', but it is 'active': {:job_class_name=>"CopyColumnUsingBackgroundMigrationJob", :table_name=>"ci_job_artifacts", :column_name=>"id", :job_arguments=>[["id", "job_id"], ["id_convert_to_bigint", "job_id_convert_to_bigint"]]}
Finalize it manualy by running
sudo gitlab-rake gitlab:background_migrations:finalize[CopyColumnUsingBackgroundMigrationJob,ci_job_artifacts,id,'[["id"\, "job_id"]\, ["id_convert_to_bigint"\, "job_id_convert_to_bigint"]]']
For more information, check the documentation
https://docs.gitlab.com/ee/user/admin_area/monitoring/background_migrations.html#database-migrations-failing-because-of-batched-background-migration-not-finished
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers.rb:1129:in `ensure_batched_background_migration_is_finished'
/opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20210706212710_finalize_ci_job_artifacts_bigint_conversion.rb:14:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:61:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Tasks: TOP => db:migrate
(See full trace by running task with --trace)
== 20210706212710 FinalizeCiJobArtifactsBigintConversion: migrating ===========
STDERR:
---- End output of "bash" "/tmp/chef-script20210903-28-4vm0c2" ----
Ran "bash" "/tmp/chef-script20210903-28-4vm0c2" returned 1
Running handlers:
Running handlers complete
Chef Infra Client failed. 11 resources updated in 16 seconds
Thank you for using GitLab Docker Image!
Current version: gitlab-ce=14.2.0-ce.0
Configure GitLab for your system by editing /etc/gitlab/gitlab.rb file
And restart this container to reload settings.
To do it use docker exec:
docker exec -it gitlab editor /etc/gitlab/gitlab.rb
docker restart gitlab
For a comprehensive list of configuration options please see the Omnibus GitLab readme
https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/README.md
If this container fails to start due to permission problems try to fix it by executing:
docker exec -it gitlab update-permissions
docker restart gitlab
Cleaning stale PIDs & sockets
Preparing services...
Starting services...
Configuring GitLab...
/opt/gitlab/embedded/bin/runsvdir-start: line 24: ulimit: pending signals: cannot modify limit: Operation not permitted
/opt/gitlab/embedded/bin/runsvdir-start: line 37: /proc/sys/fs/file-max: Read-only file system
I have tried executing the migrations by hand and all the other fixes the logs propose, but none of them worked.
I use Ubuntu 20.04 LTS and Docker version 20.10.7, build 20.10.7-0ubuntu1~20.04.1
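For reference, each step of the upgrade path is applied by pulling the next image tag and recreating the container, roughly like this (a sketch only; the container name and volume paths are assumptions, adjust to your own docker run options):
docker stop gitlab && docker rm gitlab
docker run -d --name gitlab \
  -v /srv/gitlab/config:/etc/gitlab \
  -v /srv/gitlab/logs:/var/log/gitlab \
  -v /srv/gitlab/data:/var/opt/gitlab \
  gitlab/gitlab-ce:12.0.12-ce.0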
OK, it turned out that I first had to update to 14.0.5 and wait for some background migrations to complete (you can see them in Menu -> Admin -> Monitoring -> Background Migrations).
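If the migration error still appears after that, the finalize task suggested in the log can be run inside the container (a sketch; the container name gitlab matches the snippets above):
docker exec -it gitlab bash
# inside the container, run the command from the error message:
gitlab-rake gitlab:background_migrations:finalize[CopyColumnUsingBackgroundMigrationJob,ci_job_artifacts,id,'[["id"\, "job_id"]\, ["id_convert_to_bigint"\, "job_id_convert_to_bigint"]]']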

Rails.application.database_configuration error with Rails 5.1 + Nginx + Phusion_Passenger

I have a server with Rails 5.1, Phusion_Passenger and Nginx.
When I start the server with just Phusion_Passenger, all is good:
=============== Phusion Passenger Standalone web server started ===============
PID file: /project/tmp/pids/passenger.3000.pid
Log file: /project/log/passenger.3000.log
Environment: development
Accessible via: http://0.0.0.0:3000/
You can stop Phusion Passenger Standalone by pressing Ctrl-C.
Problems? Check https://www.phusionpassenger.com/library/admin/standalone/troubleshooting/
===============================================================================
[ N 2017-09-26 15:13:06.4195 8753/T5 age/Cor/SecurityUpdateChecker.h:374 ]: Security update check: no update found (next check in 24 hours)
When I try to start and access the same instance with Nginx in front, I get the following error:
App 8129 stdout:
App 8129 stdout:
[ E 2017-09-26 15:06:26.4848 1774/T1l age/Cor/App/Implementation.cpp:304 ]: Could not spawn process for application /project: An error occurred while starting up the preloader.
Error ID: e18b79ab
Error details saved to: /tmp/passenger-error-YkowRo.html
Message from application: Cannot load `Rails.application.database_configuration`:
undefined method `[]' for nil:NilClass (NoMethodError)
(erb):13:in `<main>'
It seems that when you load the Rails app through Nginx, it cannot access the Rails object.
Passenger did not have a default environment set in passenger.conf. Adding this directive fixed the problem:
passenger_app_env development;
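For context, the directive goes in the Nginx server block for the app, along these lines (a sketch; server_name and paths are placeholders):
server {
    listen 80;
    server_name example.com;
    root /project/public;
    passenger_enabled on;
    passenger_app_env development;
}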

Resque job returning with error: "No such file or directory - getcwd"

I have a really simple job:
class MyJob
  @queue = :high

  def self.perform(user_id)
    user = User.find(user_id)
    MyMailer.send_email(user).deliver
  end
end
If I run it manually with MyJob.perform(some_id) it works perfectly. However, when Resque executes it, it returns this error:
Exception
Errno::ENOENT Error
No such file or directory - getcwd
shared/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/action_view/template/resolver.rb:221:in `expand_path'
shared/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/action_view/template/resolver.rb:221:in `initialize'
shared/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/action_view/template/resolver.rb:251:in `new'
shared/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/action_view/template/resolver.rb:251:in `instances'
shared/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/action_view/lookup_context.rb:16:in `<class:LookupContext>'
shared/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/action_view/lookup_context.rb:12:in `<module:ActionView>'
shared/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/action_view/lookup_context.rb:5:in `<top (required)>'
shared/bundle/ruby/1.9.1/gems/actionpack-3.2.12/lib/abstract_controller/view_paths.rb:45:in `lookup_context'
shared/bundle/ruby/1.9.1/gems/actionmailer-3.2.12/lib/action_mailer/base.rb:456:in `process'
shared/bundle/ruby/1.9.1/gems/actionmailer-3.2.12/lib/action_mailer/base.rb:452:in `initialize'
shared/bundle/ruby/1.9.1/gems/actionmailer-3.2.12/lib/action_mailer/base.rb:439:in `new'
shared/bundle/ruby/1.9.1/gems/actionmailer-3.2.12/lib/action_mailer/base.rb:439:in `method_missing'
releases/1111111111111/app/jobs/my_job.rb:6:in `perform'
Any ideas why this might be happening?
Thanks!
Yes Sky. You are right about it needing to be restarted.
Some people received this error after trying to run from an already deleted directory.
I received this error after switching databases and leaving the server running. The old server info was still showing up, but I was getting this error. Restarting my Rails server fixed it, and everything works fine with the new DB.
Basically it means that there is a significant state change on the server, and your environment needs to be reset/restarted.
I started having this same issue in my production environment. After some investigation, I found it was caused by my Resque workers not being properly restarted on each Capistrano deployment.
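In case it helps, here is a rough sketch of restarting the workers by hand after a deploy (assuming they were started with rake resque:work; queue names and paths are examples only):
pkill -QUIT -f 'resque.*work'    # QUIT lets each worker finish its current job and exit
cd /var/www/app/current
RAILS_ENV=production nohup bundle exec rake resque:work QUEUE='*' >> log/resque.log 2>&1 &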

Spring will not start

I am receiving the following error when trying to start Spring (https://github.com/jonleighton/spring). I am running it in a Vagrant box with Ubuntu 12.04 LTS.
vagrant@rails-starter-box:/vagrant/ticketee$ spring start
/usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:85:in `unlink': Text file busy - /vagrant/ticketee/tmp/spring/spring.pid (Errno::ETXTBSY)
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:85:in `unlink'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:85:in `block (2 levels) in set_exit_hook'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:84:in `each'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:84:in `block in set_exit_hook'
/usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:34:in `initialize': Operation not permitted - /vagrant/ticketee/tmp/spring/spring (Errno::EPERM)
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:34:in `open'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:34:in `boot'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/server.rb:15:in `boot'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/client/start.rb:13:in `call'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/client/command.rb:7:in `call'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/lib/spring/client.rb:23:in `run'
from /usr/local/rvm/gems/ruby-1.9.3-p194/gems/spring-0.0.8/bin/spring:4:in `<top (required)>'
from /usr/local/rvm/gems/ruby-1.9.3-p194/bin/spring:19:in `load'
from /usr/local/rvm/gems/ruby-1.9.3-p194/bin/spring:19:in `<main>'
from /usr/local/rvm/gems/ruby-1.9.3-p194/bin/ruby_noexec_wrapper:14:in `eval'
from /usr/local/rvm/gems/ruby-1.9.3-p194/bin/ruby_noexec_wrapper:14:in `<main>'
Because of Vagrant's read-only file system, you need to set the environment variable SPRING_TMP_PATH to somewhere outside of the /vagrant directory.
Run this at the command line:
mkdir ~/spring_tmp; export SPRING_TMP_PATH=/home/vagrant/spring_tmp
spring start
then run:
spring status
If Spring is now running, add the following line to ~/.bashrc:
export SPRING_TMP_PATH="/home/vagrant/spring_tmp" # Temp PATH for spring
Credit for this goes to George Brocklehurst
I got this error with RSpec. I had to mount /windows via SMB (cifs) from a shared folder. I also had to change the permissions on the shared folder to grant write access. You can either grant Full Control to Everyone, or Full Control to your Windows user, but then you have to mount with permissions.
mount -t cifs //10.0.2.2/aidc /windows -o credentials=/etc/samba/credentials,uid=500,gid=500
You also have to install the Samba/CIFS utilities in the VirtualBox VM.
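If you want the share mounted automatically at boot, the same options can go in /etc/fstab (a sketch mirroring the mount command above):
//10.0.2.2/aidc  /windows  cifs  credentials=/etc/samba/credentials,uid=500,gid=500  0  0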
I heard vboxfs is really bad and doesn't handle a large number of files.
