Sustainable Solution To Configuring Rails, Sidekiq, Redis All On AWS Elastic Beanstalk

We have an AWS Elastic Beanstalk Rails app that needs Sidekiq worker processes running alongside Puma/Passenger. Getting the Sidekiq process to run has cost us hours of failed attempts. Also, getting the Rails app and Sidekiq to talk to our AWS ElastiCache cluster apparently needs some security rule changes.
Background
We started out with an extremely simple Rails app that was easily deployed to AWS Elastic Beanstalk. Since those early days we've evolved the app to use the worker framework Sidekiq. Sidekiq in turn uses Redis to pull its jobs. Getting all these puzzle pieces assembled in the AWS world is a little challenging.

Solutions From The Web...with some sustainability problems
The AWS ecosystem goes through updates and upgrades, many of which aren't documented clearly. For example, environment settings change regularly, so a script you have written may break in subsequent versions.
I used the following smattering of solutions to try to solve this:
http://blog.noizeramp.com/2013/04/21/using-sidekiq-with-elastic-beanstalk/ (please note that the comments on this blog post contain a number of helpful gists). Many thanks to the contributor and commenters on this post.
http://qiita.com/sawanoboly/items/d28a05d3445901cf1b25 (starting Sidekiq with upstart/initctl seems like the simplest and most sustainable approach). This page is in Japanese, but the Sidekiq startup code makes complete sense. Thanks!
Use AWS's ElastiCache for Redis. Make sure to configure your security groups accordingly: this AWS document was helpful...
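Once the security groups allow traffic, pointing Sidekiq at the cluster comes down to a Redis URL. A minimal sketch of an initializer, assuming a standard Sidekiq setup (the endpoint hostname and env var default below are placeholders, not real values):

```ruby
# config/initializers/sidekiq.rb -- point both the Sidekiq server process
# and the Rails client at the ElastiCache primary endpoint.
# The hostname below is a placeholder for your cluster's endpoint.
redis_url = ENV.fetch("REDIS_URL", "redis://my-cluster.abc123.0001.use1.cache.amazonaws.com:6379/0")

Sidekiq.configure_server do |config|
  config.redis = { url: redis_url }
end

Sidekiq.configure_client do |config|
  config.redis = { url: redis_url }
end
```

Reading the URL from an environment variable keeps the endpoint out of the repo and lets the same code run against a local Redis in development.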

Related

What is the best approach to create a scheduled job on AWS Elastic Beanstalk - Ruby on Rails?

Recently I moved my personal project from Heroku to AWS Elastic Beanstalk because of Heroku's new pricing. On Heroku I had a scheduled job using Sidekiq, which depended on Redis and a worker dyno.
I deployed my project at AWS without the scheduled job, and now I am facing some problems creating this cron job at AWS.
What have I tried?
Creating a cron job in my EC2 environment using a cron.config in the zip's .ebextensions. I can run some simple cron jobs, but I couldn't run a Ruby on Rails script, because some configuration is necessary and the documentation is not clear.
Trying the active-elastic-job gem, but it raises a lot of gem problems, which made it impossible to deploy.
Trying AWS Lambda, but I did not understand how to do it; I saw many examples in Python and other languages, but not in Ruby.
What do you suggest I do?
My next approach would be to create a cron job in my EC2 instance that makes an HTTP request to a controller containing the task I need ...
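For reference, the cron.config approach from the first bullet usually looks something like this on the Ruby platform. This is a sketch; the task name, schedule, and log path are illustrative, and the usual sticking point is that cron does not see Elastic Beanstalk's environment variables, so the command may need to load them before Rails will boot:

```yaml
# .ebextensions/cron.config -- install a system crontab entry that runs a
# Rails task inside the deployed app directory.
files:
  "/etc/cron.d/my_scheduled_job":
    mode: "000644"
    owner: root
    group: root
    content: |
      # minute hour day month weekday user command
      0 2 * * * webapp cd /var/app/current && bundle exec rake my_job:run >> /var/log/my_job.log 2>&1
```

Files in /etc/cron.d take a user column (webapp is the app user on Elastic Beanstalk instances), which is an easy detail to miss coming from a personal crontab.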
I ended up using the rufus-scheduler gem. It works fine with AWS Elastic Beanstalk and its EC2 instances.
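For anyone following the same route, a minimal rufus-scheduler setup is a single initializer. This is a sketch that requires the rufus-scheduler gem; MyNightlyJob is a placeholder for whatever the scheduled Sidekiq job used to be on Heroku, and note that an in-process scheduler runs once per web process, so you need to guard against duplicates if you scale out:

```ruby
# config/initializers/scheduler.rb -- requires the rufus-scheduler gem.
require "rufus-scheduler"

scheduler = Rufus::Scheduler.new

# Cron syntax: every day at 02:00 server time.
scheduler.cron "0 2 * * *" do
  MyNightlyJob.perform_now
end
```

The scheduler runs its jobs on background threads inside the Rails process, which is exactly why it works on a plain Elastic Beanstalk EC2 instance with no extra infrastructure.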

Replicating Heroku's Review Apps on AWS

I'm currently working for a client that is using Heroku and migrating to AWS. However, we're having trouble understanding how the Review Apps feature can be replicated on AWS.
Specifically, we want a Jenkins job that will allow us to specify a branch name and a set of environment variables. That job will then spin up our entire stack, so that the developer can test their changes in isolation, before moving to staging.
Our stack is 5 different Ruby on Rails applications, all of which must know each other's URLs, which does complicate things.
I'm told that tools like AWS Fargate or EKS might be suitable, but I'm not sure.

Best approach to deploy a dockerized Rails app on AWS?

I dockerized an existing Rails application and it runs properly in development. I want to deploy the app to a production environment. I used docker-compose locally.
Application Stack is as follows:
Rails app
Background Workers for mails and cleanup
Relational DB - Postgres
NoSQL DB - DynamoDB
SQS queues
Action Cable - Redis
Caching - Memcached
As far as I know, options for deployment are as below:
ECS (trying this, but having difficulties relating concepts such as Tasks and Task Definitions to docker-compose concepts)
ECS with Elastic Beanstalk
With Docker Machine according to this docker documentation: https://docs.docker.com/machine/drivers/aws/
I do not have experience with Capistrano and haven't used it on this project, so I am not aiming to use it for the Docker deployment either. I am planning to use a CI/CD solution for easy deployment. I want advice on the options available and on how to deploy the stack in a way that is easy to maintain and allows pushing updates with minimal effort.
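On the ECS option: the rough mapping is that a docker-compose service becomes a container definition, a task definition groups one or more container definitions, and an ECS service keeps N copies of a task running. A hedged fragment of a task definition, with illustrative names and values only:

```json
{
  "family": "rails-app",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "myrepo/rails-app:latest",
      "portMappings": [{ "containerPort": 3000 }],
      "environment": [{ "name": "RAILS_ENV", "value": "production" }]
    },
    {
      "name": "worker",
      "image": "myrepo/rails-app:latest",
      "command": ["bundle", "exec", "sidekiq"]
    }
  ]
}
```

The "worker" container reusing the same image with a different command mirrors how a compose file would define web and worker services.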

Where should I run scheduled background jobs?

Here in my company we have our regular application on AWS EBS with some background jobs. The problem is, these jobs are starting to get heavier and we were thinking of separating them from the application. The question is: where should we run them?
We were thinking of doing it in AWS Lambda, but then we would have to port our Rails code to Python, Node, or Java, which seems to be a lot of work. What are the other options? Should we just create another EC2 environment for the jobs? Thanks in advance.
Edit: I'm using the shoryuken gem (http://github.com/phstc/shoryuken) integrated with SQS. But it currently has a memory leak and my application sometimes goes down; I don't know if the memory leak is the cause, though. We have already separated the application into an API part on EBS and a front-end part on S3.
Normally, just another EC2 instance with a copy of your Rails app, where instead of rails s to start the web server, you run rake resque:work or whatever your job runner's start command is. Both share the same Redis instance and database, so your web server writes jobs to the queue and the worker picks them up and runs them.
If you need more workers, just add more EC2 instances pointing to the same Redis instance. I would advise separating your jobs by queue name, so that one worker can process just the fast stuff, e.g. email sending, while others handle long-running or slow jobs.
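The web/worker split described above can be sketched with an in-process queue standing in for Redis. This is purely illustrative: in production the shared queue is Redis and the worker loop is Resque or Sidekiq, and the class names here are made up.

```ruby
require "json"

# Shared queue: in production this is Redis; a thread-safe Queue stands in
# here so the flow between the two sides is visible.
JOBS = Queue.new

# Web-server side: serialize the job and push it, tagged with a queue name
# so fast and slow work can be picked up by different workers.
def enqueue(job_class, *args, queue: "default")
  JOBS << JSON.generate("class" => job_class, "args" => args, "queue" => queue)
end

# Worker side: what `rake resque:work` conceptually does -- pop a payload,
# decode it, and dispatch to the named job.
def work_one
  payload = JSON.parse(JOBS.pop)
  "#{payload['queue']}: ran #{payload['class']}(#{payload['args'].join(', ')})"
end

enqueue("EmailJob", "user@example.com", queue: "fast")
enqueue("ReportJob", "2016", queue: "slow")
puts work_one  # fast: ran EmailJob(user@example.com)
puts work_one  # slow: ran ReportJob(2016)
```

Because both sides only agree on the payload format and the queue location, adding worker instances is just running more copies of the pop-and-run loop against the same Redis.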
We had a similar requirement; for us it was the Sidekiq background jobs. They started to get very heavy, so we split them into a separate OpsWorks stack, with a simple recipe to build the machine dependencies (Ruby, MySQL, etc.). Since we don't have to worry about load balancers and requests timing out, it's fine for all machines to deploy at the same time.
Another thing you could use in OpsWorks is scheduled machines (if the jobs are needed at certain times during the day): have the machine provisioned a few minutes before the task, and once the task is done you could make it shut down automatically, which would reduce your cost.
EB also has a different application type, the worker application. You could check that out as well, but honestly I haven't looked into it, so I can't tell you its pros and cons.
We recently went down that route. I dockerized our Rails app and wrote a custom entrypoint for that Docker container. In summary, the entrypoint parses the command you pass after docker run IMAGE_NAME.
For example, if you run docker run IMAGE_NAME sb rake do-something-magical, the entrypoint understands that it should run the rake job with the sandbox environment config. If you only run docker run IMAGE_NAME, it will do rails s -b 0.0.0.0.
PS: I wrote a custom entrypoint because we have 3 different environments; the entrypoint downloads environment-specific config from S3.
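The dispatch described above can be sketched as a small Ruby entrypoint. This is a guess at the logic, since the original script isn't shown: "sb", the sandbox environment, and the default server command come from the answer, while the S3 config download step is omitted.

```ruby
# entrypoint.rb -- decide what to exec based on the arguments passed to
# `docker run IMAGE_NAME ...`.
def command_for(argv)
  case argv.first
  when "sb"
    # `sb` selects the sandbox environment, then runs the rest of the args.
    ["env", "RAILS_ENV=sandbox", "bundle", "exec", *argv.drop(1)]
  when nil
    # No arguments: start the app server.
    ["bundle", "exec", "rails", "s", "-b", "0.0.0.0"]
  else
    # Anything else passes through untouched.
    argv
  end
end

# A real entrypoint would finish with: exec(*command_for(ARGV))
p command_for(["sb", "rake", "do-something-magical"])
```

Keeping the dispatch in a pure function like command_for makes the routing easy to test outside the container; only the final exec touches the real environment.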
I also set up an ECS cluster and wrote a task-runner job on Lambda. This Lambda function schedules a task on the ECS cluster, and we trigger that Lambda from CloudWatch Events. You can send custom payloads to the Lambda when using CloudWatch Events.
It sounds complicated, but the implementation is really simple.
You may consider submitting your tasks to AWS SQS, then using an Elastic Beanstalk worker environment to process your background tasks.
Elastic Beanstalk supports Rails applications:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_Ruby_rails.html
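In a worker environment, the SQS daemon on the instance POSTs each queue message to your app over local HTTP, so the "worker" is just a controller action. A sketch with assumed names (the route and controller are illustrative, not prescribed by the docs):

```ruby
# app/controllers/worker_controller.rb -- receives POSTs from the worker
# environment's SQS daemon. Route, e.g.: post "/worker", to: "worker#perform"
class WorkerController < ApplicationController
  # The daemon posts from localhost without a CSRF token.
  skip_before_action :verify_authenticity_token

  def perform
    payload = JSON.parse(request.raw_post)
    # ... do the background work described by the payload here ...
    head :ok  # a 2xx response tells the daemon to delete the SQS message
  end
end
```

Returning a non-2xx status leaves the message on the queue for redelivery, which is how retries work in this model.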
Depending on what kind of work these background jobs perform, you might want to think about extracting those functions into microservices if you are running your jobs on a different instance anyway.
Here is a good codeship blog post on how to approach this.
For simple mailer-type stuff this definitely feels a little heavy-handed, but if the functionality is more complex, e.g. general notification of different clients, it might well be worth the overhead.

Architectural overview for Resque on Heroku?

tldr; What pieces do you need to make a web app with a resque+resque_web dashboard?
I've seen the Heroku tutorial, and plenty of configuration examples, but it seems like there's a lot of complexity being glossed over:
Dynos don't have stable IP addresses, so how does the communication work between the web process, a resque process, and redis?
The Heroku docs imply that an additional communication service is necessary to coordinate between dynos; am I reading this right?
How many dynos and services are required to make a "basic" web app which:
hands off some long-running jobs to resque, which
saves its results in the web app's database, and
is accessible by resque_web (mounted w/in the web app or standalone)?
Honestly, if someone could sketch a diagram, that'd be great.
Disclaimer: I haven't actually deployed a Heroku app with Resque, so this is information gleaned from https://devcenter.heroku.com/articles/queuing-ruby-resque and from checking the example app.
The web dyno and the worker dyno will not communicate directly with each other. They communicate via Redis, which is provisioned at a specific DNS name (which you can find on your app's resources page on Heroku after adding a Redis plugin). These settings can be transferred into an .env file (via the heroku config toolbelt plugin). That env file can be used by foreman to set up the ENV variables, and you use those ENV variables in your application to configure Redis.
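Concretely, that "specific DNS" lives in a single environment variable shared by both dynos. A sketch using only the stdlib (the URL below is made up; with the resque gem you would hand the same value to Resque.redis):

```ruby
require "uri"

# Heroku's Redis add-on sets REDIS_URL; both the web and worker dynos read
# the same variable, which is how they end up talking to the same Redis.
ENV["REDIS_URL"] ||= "redis://user:secret@ec2-12-34-56-78.compute-1.amazonaws.com:6379"

uri = URI.parse(ENV["REDIS_URL"])
puts "host=#{uri.host} port=#{uri.port}"
# Resque.redis = ENV["REDIS_URL"]  # what the app itself would do
```

This is why unstable dyno IPs don't matter: nobody connects dyno-to-dyno, everyone connects outward to the add-on's stable hostname.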
Not sure, but the example app does not imply any such service is necessary.
Two: 1 web dyno and 1 worker dyno.
