How can I run Rails background jobs on AWS Elastic Beanstalk?

I just started to use AWS Elastic Beanstalk with my Rails app and I need to use the Resque gem for background jobs. However, despite searching extensively for how to run a Resque worker on Elastic Beanstalk, I haven't been able to figure it out.
The question How can I run Rails background jobs with Resque on AWS Elastic Beanstalk? talks about running workers as services in Elastic Beanstalk containers, but it is still very confusing to me.
Here is my .ebextensions resque.config file:
services:
  sysvinit:
    resque_worker:
      enabled: true
      ensureRunning: true
      commands:
        resque_starter:
          rake resque:work QUEUE='*'
EDIT
Now my resque.config file looks like this:
container_commands:
  resque_starter: "rake resque:work QUEUE='*'"
services:
  sysvinit:
    resque_worker:
      enabled: true
      ensureRunning: true
      commands:
        resque_starter
And it is still not working.
EDIT 2
container_commands:
  resque_starter:
    command: "rake resque:work QUEUE=sqs_message_sender_queue"
    cwd: /var/app/current/
    ignoreErrors: true
It still shows 0 workers.

I think it is suboptimal to run queues like Resque inside Elastic Beanstalk web environments. A web environment is intended to host web applications and spawns new instances when traffic and load increase. It would not make sense to end up with multiple Resque workers, one running on each of those instances.
Elastic Beanstalk offers worker environments which are intended to host code that executes background tasks. These worker environments pull jobs from an Amazon SQS queue (which makes an additional queue solution, like Resque, obsolete). An Amazon SQS queue scales easily and is easier to maintain (AWS just does it for you).
Using worker environments, which come with Amazon SQS queues, makes more sense, as they are supported out of the box and fit nicely into the Elastic Beanstalk landscape. There is also a gem, Active Elastic Job, which makes it simple for Rails >= 4.2 applications to run background tasks in worker environments.
Disclaimer: I'm the author of Active Elastic Job.
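For illustration, here is a minimal sketch of how this can look with Active Job and Active Elastic Job (the job class is a placeholder, the queue name must match the SQS queue your worker environment polls, and the gem's README covers credentials and region setup):

# Gemfile
gem 'active_elastic_job'

# config/environments/production.rb
Rails.application.configure do
  config.active_job.queue_adapter = :active_elastic_job
end

# app/jobs/sqs_message_sender_job.rb
class SqsMessageSenderJob < ActiveJob::Base
  # must match the name of the SQS queue the worker environment polls
  queue_as :sqs_message_sender_queue

  def perform(message)
    # do the background work here
  end
end

# enqueue from the web environment:
SqsMessageSenderJob.perform_later('hello')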

First of all, I would recommend running Resque under supervisord; this will make sure the worker is restarted if the process dies.
As for running a command on every deploy:
Log in to your Beanstalk instance via SSH and go to the folder /opt/elasticbeanstalk/hooks/appdeploy/.
There you will find the hooks that execute on every deploy, and you can add your own script to be executed at deploy time as well. With the same approach you can add a script to the hooks responsible for restarting the application server, so your background job can be restarted without connecting over SSH.
Another option is to put the command that starts your background worker in container_commands instead of commands.
Also, have a look at the best articles I have found about customizing Beanstalk: http://www.hudku.com/blog/tag/elastic-beanstalk/. It is a good starting point for customizing a Beanstalk environment to your needs.
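As a rough sketch of combining the two suggestions above (the hook path, script name, and supervisord program name are assumptions, and newer platform versions use a different hook layout), a post-deploy hook could simply ask supervisord to restart the worker:

#!/usr/bin/env bash
# Place at /opt/elasticbeanstalk/hooks/appdeploy/post/99_restart_resque.sh, mode 0755.
# Assumes supervisord already manages a [program:resque_worker] whose command is
# something like: bundle exec rake resque:work QUEUE='*' with directory=/var/app/current
supervisorctl restart resque_worker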

Related

What is the best approach to create a schedule job on AWS Elastic Beanstalk - Ruby on Rails?

Recently I moved my personal project from Heroku to AWS Elastic Beanstalk because of Heroku's new pricing. On Heroku I had a scheduled job using Sidekiq, which depended on Redis and a worker dyno.
I deployed my project on AWS without the scheduled job, and now I am having trouble creating this cron job on AWS.
What I've tried:
Creating a cron job in my EC2 environment using a cron.config in the zip's .ebextensions. I can run some simple cron jobs, but I couldn't run a Ruby on Rails script because some configuration is necessary and the documentation is not clear.
Using the active-elastic-job gem, but it raised a lot of gem problems, which made it impossible to deploy.
Using AWS Lambda, but I did not understand how to do it; I saw many examples in Python and other languages, but not in Ruby.
What do you suggest I do?
My next approach would be to create a cron job on my EC2 instance that makes an HTTP request to a controller containing the task I need ...
I ended up using the rufus-scheduler gem. It works fine with AWS Elastic Beanstalk and its EC2 instances.
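For reference, a minimal rufus-scheduler setup looks roughly like this (the initializer path and job class are illustrative; keep in mind that in a load-balanced environment every instance runs its own scheduler, so a job may fire once per instance):

# config/initializers/scheduler.rb
require 'rufus-scheduler'

scheduler = Rufus::Scheduler.new

# run every day at 03:00 (cron syntax); replace with whatever task you need
scheduler.cron '0 3 * * *' do
  NightlyCleanupJob.perform_now  # hypothetical job class
end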

best approach to deploy dockerized rails app on AWS?

I dockerized an existing Rails application and it runs properly in development. I want to deploy the app to a production environment. I used docker-compose locally.
The application stack is as follows:
Rails app
Background Workers for mails and cleanup
Relational DB - Postgres
NoSQL DB - DynamoDB
SQS queues
Action Cable - Redis
Caching - Memcached
As far as I know, the options for deployment are as follows:
ECS (trying this, but having difficulty relating concepts such as Tasks and Task Definitions to docker-compose concepts)
ECS with Elastic Beanstalk
With Docker Machine according to this docker documentation: https://docs.docker.com/machine/drivers/aws/
I do not have experience with Capistrano and haven't used it on this project, so I am not aiming to use it for Docker deployment either. I am planning to use a CI/CD solution for easy deployment. I would like advice on the available options, and on how to deploy the stack in a way that is easy to maintain and lets me push updates with minimal deployment effort.

Where should I run scheduled background jobs?

Here at my company we have our regular application on AWS Elastic Beanstalk along with some background jobs. The problem is that these jobs are starting to get heavier, and we are thinking of separating them from the application. The question is: where should we run them?
We were thinking of doing it in AWS Lambda, but then we would have to port our Rails code to Python, Node, or Java, which seems like a lot of work. What are the other options? Should we just create another EC2 environment for the jobs? Thanks in advance.
Edit: I'm using the shoryuken gem (http://github.com/phstc/shoryuken) integrated with SQS, but it currently has a memory leak and my application goes down sometimes; I don't know if the memory leak is the cause though. We have already separated the application into an API part on Elastic Beanstalk and a front-end part on S3.
Normally, just another EC2 instance with a copy of your Rails app, where instead of rails s to start the web server, you run rake resque:work or whatever your job runner start command is. Both would share the same Redis instance and database so that your web server writes the jobs to the queue and the worker picks them up and runs them.
If you need more workers, just add more EC2 instances pointing at the same Redis instance. I would advise separating your jobs by queue name, so that one worker can process fast stuff (e.g. email sending) and others can handle long-running or slow jobs.
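For example, roughly (assuming both instances' Resque initializers point at the same shared Redis):

# on the "fast" worker instance
QUEUE=mailers bundle exec rake resque:work

# on the "slow" worker instance
QUEUE=long_running,default bundle exec rake resque:work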
We had a similar requirement. For us it was the Sidekiq background jobs; they started to get very heavy, so we split them into a separate OpsWorks stack, with a simple recipe to build the machine dependencies (Ruby, MySQL, etc.). Since we don't have to worry about load balancers and requests timing out, it's fine for all machines to deploy at the same time.
Another thing you could use in OpsWorks is scheduled machines (if the jobs are needed at certain times during the day): have the machine provisioned a few minutes before the task is due, and once the task is done have it shut down automatically. That would reduce your cost.
EB also has a different type of application, the worker application, which you could check out as well, but honestly I haven't looked into it, so I can't tell you its pros and cons.
We recently went down that route. I dockerized our Rails app and wrote a custom entrypoint for the Docker container. In summary, the entrypoint parses the commands passed after docker run IMAGE_NAME.
For example, if you run docker run IMAGE_NAME sb rake do-something-magical, the entrypoint understands that it should run the rake task with the sandbox environment config. If you only run docker run IMAGE_NAME, it runs rails s -b 0.0.0.0.
PS: I wrote a custom entrypoint because we have 3 different environments; the entrypoint downloads environment-specific config from S3.
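Such an entrypoint might look roughly like this (the sb flag, bucket name, and config paths are specific to that setup and shown only for illustration):

#!/usr/bin/env bash
# docker-entrypoint.sh (sketch)
set -e

if [ "$1" = "sb" ]; then
  shift
  # hypothetical: pull sandbox-specific config from S3 before running the given command
  aws s3 cp s3://my-config-bucket/sandbox/application.yml config/application.yml
  exec bundle exec "$@"
else
  # default: start the Rails server
  exec bundle exec rails s -b 0.0.0.0
fi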
I also set up an ECS cluster and wrote a task-runner function on Lambda. This Lambda function schedules a task on the ECS cluster, and we trigger the Lambda from CloudWatch Events. You can send custom payloads to Lambda when using CloudWatch Events.
It sounds complicated, but the implementation is really simple.
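A sketch of that Lambda task runner, written here in Ruby with the aws-sdk-ecs gem (the original may well use a different runtime; cluster, task definition, and container names are placeholders):

require 'aws-sdk-ecs'

# Triggered by a CloudWatch Events rule; the event payload can carry the command to run.
def handler(event:, context:)
  ecs = Aws::ECS::Client.new
  ecs.run_task(
    cluster: 'background-jobs',         # placeholder cluster name
    task_definition: 'rails-app-task',  # placeholder task definition
    overrides: {
      container_overrides: [
        { name: 'rails-app', command: event.fetch('command', ['rake', 'do-something-magical']) }
      ]
    }
  )
end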
You may consider submitting your tasks to AWS SQS and then using an Elastic Beanstalk worker environment to process your background tasks.
Elastic Beanstalk supports Rails applications:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_Ruby_rails.html
Depending on what kind of work these background jobs perform, you might want to consider extracting those functions into microservices if you are running your jobs on a different instance anyway.
Here is a good Codeship blog post on how to approach this.
For simple mailer-type stuff, this definitely feels a little heavy-handed, but if the functionality is more complex, e.g. general notification of different clients, it might well be worth the overhead.

How can I run Rails background jobs with Resque on AWS Elastic Beanstalk?

I am running my Rails app on the AWS Elastic Beanstalk platform, which is running a single EC2 instance with Auto Scaling & Elastic Load Balancing.
I'm wondering how to run Resque, delayed_job, Sidekiq, or some other solution for background jobs on Elastic Beanstalk.
What are the possible options for background jobs on Elastic Beanstalk?
I created a gem, Active Elastic Job, as a solution for background jobs of Rails applications running on Elastic Beanstalk.
It makes use of Elastic Beanstalk worker environments, which are intended to be used for background tasks of Elastic Beanstalk applications.
Advantages are:
You can use the same code base for executing background jobs; there is no need to branch off a dedicated version of your application to run in worker environments,
you make use of Elastic Beanstalk's autoscaling capabilities,
there is no need to set up external EC2 instances or services to run a queueing backend like Resque or Sidekiq,
there is no need to customize Elastic Beanstalk containers,
and you keep the simplicity of Elastic Beanstalk's predefined infrastructure.
However, this gem is only compatible with Rails >= 4.2 applications.
The best way to start/stop/restart background jobs could be via init scripts for these tasks. You could have these init scripts triggered as services when instances are launched. There is more about customizing Elastic Beanstalk containers for services here.
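A rough sketch of that pattern in .ebextensions (paths, the webapp user, and the worker command are assumptions; a production init script would also need restart/status handling):

files:
  "/etc/init.d/resque_worker":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      # Minimal start/stop wrapper for a Resque worker (sketch)
      case "$1" in
        start)
          cd /var/app/current
          su -s /bin/bash -c "nohup bundle exec rake resque:work QUEUE='*' >> log/resque_worker.log 2>&1 &" webapp
          ;;
        stop)
          pkill -f 'resque:work' || true
          ;;
      esac

services:
  sysvinit:
    resque_worker:
      enabled: true
      ensureRunning: true
      files:
        - "/etc/init.d/resque_worker"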
Once done, you could freeze your init scripts by creating an AMI of your instance and then launching instances out of this custom AMI with auto-scaling.
Hope this helps.

Keeping rake jobs:work running

I'm using delayed_job to run jobs, with new jobs being added every minute by a cronjob.
I have an issue where the rake jobs:work task, currently started manually with 'nohup rake jobs:work &', randomly exits.
While God seems to be a solution for some people, the extra memory overhead is rather annoying and I'd prefer a simpler solution that can be restarted by the deployment script (Capistrano).
Is there some bash/Ruby magic to make this happen, or am I destined to run a monitoring service on my server with some horrid hacks to give the unprivileged account the site deploys to the ability to restart it?
I'd suggest you use foreman. It allows you to start any number of jobs in development using foreman run, and then export your configuration (number of processes per type, limits, etc.) as upstart scripts, making them available to Ubuntu's upstart (why invoke God when the operating system already gives you this for free?).
The configuration file, Procfile, is also exactly the same file Heroku uses for process configuration, so with just one file you get three process management systems covered.
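For reference, a minimal sketch (the app name, user, and export location are illustrative):

# Procfile
worker: bundle exec rake jobs:work

# Export upstart scripts so the OS supervises and restarts the worker:
#   foreman export upstart /etc/init -a myapp -u deploy -c worker=1
# Then manage it with:
#   sudo start myapp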
