Rails/Capistrano: check if sidekiq is running on an EC2 instance

I'm using an EC2 instance to host a Rails application. I deploy with Capistrano and have already included Sidekiq, which works fine. However, sometimes on deploy, and sometimes sporadically, Sidekiq stops running, and I don't notice until some tasks that depend on Sidekiq don't run.
I could do something on deploy to check for that, but if it stops working some time after the deploy, that would still be a problem.
I would like to know the best way, in this scenario, to check periodically whether Sidekiq is running and, if it isn't, to restart it.
I thought of writing a bash script for that, but apparently, when I run sidekiq from the command line, it creates another process with a pid different from the one originally launched... so I think it could get messy.
Any help is appreciated. Thanks!

Learn and use systemd to manage the service.
https://github.com/mperham/sidekiq/wiki/Deployment#running-your-own-process
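As a sketch, a minimal unit file might look like the following (the paths, user, and service name are assumptions to adapt; Restart=on-failure is what gives you the "restart it if it dies" behavior the question asks about):

    # /etc/systemd/system/sidekiq.service - minimal sketch, adjust paths and user
    [Unit]
    Description=sidekiq
    After=syslog.target network.target

    [Service]
    Type=simple
    # Assumed Capistrano-style layout; point this at your app's current release
    WorkingDirectory=/var/www/myapp/current
    ExecStart=/usr/local/bin/bundle exec sidekiq -e production
    User=deploy
    Group=deploy
    # systemd restarts the process automatically if it crashes
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

Enable it with systemctl enable sidekiq and start it with systemctl start sidekiq; systemd then supervises the process itself, so there is no pid-file bookkeeping to get messy.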

Related

How to detect orphaned sidekiq process after capistrano deploy?

We have a Rails 4.2 application that runs alongside a Sidekiq process for long tasks.
Somehow, in a deploy a few weeks ago, something went south (the Capistrano deploy process didn't effectively stop it; I wasn't able to figure out why), and an orphaned Sidekiq process was left running, competing with the current one for jobs on the Redis queue. Because that process's code had become outdated, it started producing inconsistent results in our application (depending on which process picked up the job), and we had a very hard time until we figured this out.
How can I stop this from ever happening again? I mean, I could ssh into the VPS after every deploy and run ps aux | grep sidekiq to check whether there is more than one... but that's not practical.
Use your init system to manage Sidekiq, not Capistrano.
https://github.com/mperham/sidekiq/wiki/Deployment#running-your-own-process
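For example, here is a sketch of a Capistrano 3 hook that delegates the restart to systemd (the service name, role, and passwordless sudo for the deploy user are assumptions). Because systemd tracks the whole process group, a restart cannot leave an orphaned worker behind:

    # config/deploy.rb - hypothetical hook; assumes a sidekiq.service unit exists
    namespace :sidekiq do
      desc "Restart Sidekiq via systemd"
      task :restart do
        on roles(:app) do
          execute :sudo, :systemctl, :restart, "sidekiq"
        end
      end
    end

    after "deploy:published", "sidekiq:restart"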

Where should I run scheduled background jobs?

Here at my company we have our regular application on AWS Elastic Beanstalk, along with some background jobs. The problem is that these jobs are starting to get heavier, and we were thinking of separating them from the application. The question is: where should we run them?
We were thinking of AWS Lambda, but then we would have to port our Rails code to Python, Node, or Java, which seems like a lot of work. What are the other options? Should we just create another EC2 environment for the jobs? Thanks in advance.
Edit: I'm using the shoryuken gem (http://github.com/phstc/shoryuken) integrated with SQS. But it currently has a memory leak, and my application sometimes goes down; I don't know whether the memory leak is the cause, though. We have already separated the application into an API part on Elastic Beanstalk and a front-end part on S3.
Normally, just another EC2 instance with a copy of your Rails app, where instead of rails s to start the web server, you run rake resque:work or whatever your job runner's start command is. Both share the same Redis instance and database, so your web server writes jobs to the queue and the worker picks them up and runs them.
If you need more workers, just add more EC2 instances pointing at the same Redis instance. I would advise separating your jobs by queue name, so that one worker can process just the fast stuff (e.g., email sending) while others handle long-running or slow jobs.
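As a sketch of that queue split, assuming Resque and a shared REDIS_URL (the queue names are illustrative):

    # On the "fast" worker instance:
    QUEUE=mailers rake resque:work

    # On the "slow" worker instance:
    QUEUE=reports,imports rake resque:work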
We had a similar requirement. For us it was the Sidekiq background jobs: they started to get very heavy, so we split them into a separate OpsWorks stack, with a simple recipe to install the machine dependencies (Ruby, MySQL, etc.), and since we don't have to worry about load balancers and requests timing out, it's fine for all machines to deploy at the same time.
Another thing you could use in OpsWorks is time-based (scheduled) instances, if the jobs are only needed at certain times of day: the machine gets provisioned a few minutes before the task is due, and you can have it shut down automatically after the task is done, which reduces your cost.
Elastic Beanstalk also has a different application type, the worker environment; you could check that out as well, but honestly I haven't looked into it, so I can't tell you its pros and cons.
We recently went down that route. I dockerized our Rails app and wrote a custom entrypoint for the Docker container. In summary, the entrypoint parses the commands passed after docker run IMAGE_NAME.
For example, if you run docker run IMAGE_NAME sb rake do-something-magical, the entrypoint understands that it should run the rake job with the sandbox environment config; if you run only docker run IMAGE_NAME, it defaults to rails s -b 0.0.0.0.
PS: I wrote a custom entrypoint because we have three different environments; the entrypoint downloads environment-specific config from S3. A sketch of the idea is shown below.
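Here is a hedged sketch of such an entrypoint (the sb environment name, bucket path, and config file are illustrative, not the poster's actual script):

    #!/bin/bash
    # entrypoint.sh - hypothetical sketch of the approach described above
    set -e

    case "$1" in
      sb|staging|production)
        ENV_NAME="$1"; shift
        # Fetch environment-specific config before running the given command
        aws s3 cp "s3://my-config-bucket/${ENV_NAME}/application.yml" config/application.yml
        exec bundle exec "$@"
        ;;
      "")
        # No command given: default to the web server
        exec bundle exec rails s -b 0.0.0.0
        ;;
      *)
        exec "$@"
        ;;
    esac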
I also set up an ECS cluster and wrote a task-runner function on Lambda; the Lambda function schedules a task on the ECS cluster, and we trigger that Lambda from CloudWatch Events. You can send custom payloads to the Lambda when using CloudWatch Events.
It sounds complicated, but the implementation is really simple.
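For instance, a minimal task-runner Lambda in Ruby might look like this (the cluster and task-definition names are assumptions; the CloudWatch Events rule supplies the payload):

    # Hypothetical Lambda handler (Ruby runtime) that starts an ECS task
    require 'aws-sdk-ecs'

    def handler(event:, context:)
      ecs = Aws::ECS::Client.new
      # 'task' comes from the custom payload on the CloudWatch Events rule
      ecs.run_task(
        cluster: 'background-jobs',
        task_definition: "rails-#{event['task']}",
        count: 1
      )
    end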
You may consider submitting your tasks to AWS SQS and then using an Elastic Beanstalk worker environment to process your background tasks.
Elastic Beanstalk supports Rails applications:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_Ruby_rails.html
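Since the question already uses the shoryuken gem with SQS, a minimal worker sketch could look like this (the queue name and class are assumptions):

    # app/workers/heavy_job_worker.rb - minimal Shoryuken worker sketch
    class HeavyJobWorker
      include Shoryuken::Worker
      shoryuken_options queue: 'heavy-jobs', auto_delete: true

      def perform(sqs_msg, body)
        # Do the heavy background work here
      end
    end

On the worker machine you would then run something like bundle exec shoryuken -q heavy-jobs -r ./app/workers/heavy_job_worker.rb.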
Depending on what kind of work these background jobs perform, you might want to think about extracting those functions into microservices, if you are running your jobs on a different instance anyway.
Here is a good Codeship blog post on how to approach this.
For simple mailer-type stuff, this definitely feels a little heavy-handed, but if the functionality is more complex (e.g., general notification of different clients), it might well be worth the overhead.

Rails Resque keeps running old code even after server restart

I've got the resque gem running, and for some reason it seems to be caching the job code. Rails is currently in the dev environment, and this persists even after a server restart.
I've tried changing the queue name, but the same code continues to run.
The only thing that works is creating an entirely new class with a different name and then calling that.
Is there a cache that can be cleared?
There are Resque worker processes running in the background that load your code once at startup, so restarting the Rails server is not enough. You need to restart your Resque worker(s) for your changes to take effect.
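In development, one hedged way to do that (assuming the workers were started with rake resque:work) is:

    # Stop the old workers; Resque workers show up as "resque-..." processes
    pkill -f resque
    # Start a fresh worker that loads the current code
    QUEUE=* bundle exec rake resque:work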

Keeping rake jobs:work running

I'm using delayed_job to run jobs, with new jobs being added every minute by a cronjob.
Currently I have an issue where the rake jobs:work task, which I start manually with nohup rake jobs:work &, randomly exits.
While God seems to be a solution for some people, the extra memory overhead is rather annoying, and I'd prefer a simpler solution that can be restarted by the deployment script (Capistrano).
Is there some bash/Ruby magic to make this happen, or am I destined to run a monitoring service on my server, with some horrid hacks to give the unprivileged account the site deploys as the ability to restart it?
I'd suggest you use foreman. It allows you to start any number of jobs in development with foreman start, and then export your configuration (number of processes per type, limits, etc.) as Upstart scripts, making them available to Ubuntu's Upstart (why invoke God when the operating system already gives you this for free?).
The configuration file, Procfile, is also exactly the same file Heroku uses for process configuration, so with just one file you get three process management systems covered.
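A sketch of what that looks like (the app name, user, and process count are assumptions):

    # Procfile
    worker: bundle exec rake jobs:work

    # Run in development:
    foreman start

    # Export Upstart scripts so the OS supervises and restarts the worker:
    sudo foreman export upstart /etc/init -a myapp -u deploy -c worker=1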

Running delayed_job worker on Heroku?

So right now I have an implementation of delayed_job that works perfectly in my local development environment. To start the worker on my machine, I just run rake jobs:work and it works perfectly.
To get delayed_job to work on Heroku, I've been using pretty much the same command: heroku run rake jobs:work. This works without me having to pay Heroku anything for worker costs, but I have to keep my command prompt window open, or else the delayed_job worker stops when I close it. Is there a way to keep this delayed_job worker running permanently, even when I close the command window? Or is there a better way to go about this?
I recommend the workless gem for running delayed jobs on Heroku. I use it now; it works perfectly for me, with zero hassle and at zero cost.
I have also used hirefireapp, which gives a much finer degree of control over scaling workers. It costs money, but less than a single Heroku worker over a month. I don't use it now, but I have, and it worked very well.
Add the following line to your Procfile:

    worker: rake jobs:work
EDIT:
Even if you run it from your console, you 'buy' a worker dyno, but Heroku bills per second. So you don't pay, because you have 750 free dyno-hours and the longest month has only 744 hours, which leaves 6 free hours for your extra dynos, scheduler tasks, and so on.
I haven't tried it personally yet, but you might find nohup useful. It allows your process to keep running even after you close your terminal window. Link: http://linux.101hacks.com/unix/nohup-command/
Using the Heroku console to put workers onto the jobs will only create a temporary dyno for the job. To keep the jobs running without the CLI, you need to put the command into the Procfile, as @Lucaksz suggested.
After deployment, you also need to scale the dyno formation, as Heroku needs to know how many dynos to put onto each process type, like this:
heroku ps:scale worker=1
More details can be found here: https://devcenter.heroku.com/articles/scaling
