Worker "dyno" in AWS Elastic Beanstalk - ruby-on-rails

Amazon Web Service has now a worker tiers in their Elastic Beanstalk. But, it nevertheless confuse us who come from the days of Worker dyno.
As a comparison, in Heroku, one can configure two dynos (something like processor?) each for web and worker. The web will work for any request, and will timeout normally at 15 secs. Thus, if you have a request that last more than that, your request will simply timed-out although not terminated per se. In that case, you should use worker and your web dyno should visit the endpoint several times per minutes (maybe) to check if there is any result to be brought back to the user. To make either worker or web dyno, what you need is just slide the slider and you are good to go. Sometimes, you may need a Procfile. But there is nothing fancy, or something really difficult, or confusing about it.
In AWS EBS (Elastic Beanstalk), since day 1 you hit eb init, you will be asked whether it is a Standard or Worker. When you hit Standard, it seems there is no way to make it as worker as well.
In our situation, the worker and standard web is located under one application. So, how could we use an EBS instance both for worker and standard. Our worker is using sidekiq, and redis. Please, point to any guidance or help us in this matter.

AWS Elastic Beanstalk has two types of Environments - Web tier and Worker tier.
Web tier environments are meant for web applications - http/https request processing. You get one or more EC2 instances behind a load balancer. You can get other resources like database per your requirement. You can choose the platform you wish e.g. Ruby, Python, Java, Node.js, PHP, Docker.
Worker environments are meant for asynchronous message processing. When you create a worker environment you do not have a load balancer. All your EC2 instances are in an autoscaling group. All these instances are running a daemon which is polling a single SQS queue for messages. When a message is pulled by the daemon from the SQS queue, the daemon sends a HTTP Post request on localhost:80. You can configure the port but the important thing is that the daemon posts the message as an HTTP request on localhost. Your worker application is actually a web application that receives the post request and processes the message. After the message is successfully processed the worker daemon expects that your web application running on localhost returns a HTTP 200 OK response. The daemon then deletes the message from SQS queue. You can write your worker application for any platform just like standard web server applications - Ruby, Python, Java, Node.js, PHP, Docker.
Based on my understanding of your usecase I would recommend creating two Elastic Beanstalk environments - one Standard and one Worker environment. The Standard web server receives HTTP requests and processes them synchronously. This environment puts the relevant data in an SQS queue. The second environment is a worker and the daemon running on this environment polls this SQS queue for messages. Your second environment is a web application that is NOT open to the internet. The worker daemon posts the messages as HTTP requests to your worker environment. Thus you can process long running workloads asynchronously using this second worker environment.
With worker environments you can use your own queues or Elastic Beanstalk can generate a queue for you. You can configure parameters like message visibility timeout, http connections based on your requirements or you can use the defaults.
Below are some links that may be useful for you:
http://aws.amazon.com/blogs/aws/background-task-handling-for-aws-elastic-beanstalk/
http://blogs.aws.amazon.com/application-management/post/Tx1Y8QSQRL1KQZC/Elastic-Beanstalk-Video-Tutorial-Worker-Tier
https://stackoverflow.com/a/23942498/161628
Does this meet your requirements? Please let me know if you have further questions.
Update
You need to upload your source code at two places - once for the worker environment and once for the web server environment. If someone was starting from scratch then they might have two separate code bases. But I think in your case I think it should be perfectly fine to have a single code base shared between the two environments. Suppose your web request arrives at '/register', then the register() method in your application can post messages to an SQS queue and be done with the HTTP request. Now your worker environment will poll the SQS queue and post messages over HTTP on localhost to the URL '/async_register' which will invoke a method async_register() in your application and do the asynchronous processing. These two methods can live in the same source code bundle which can be shared by both the worker and web server environment. The code path taken by worker and web server will be different so that web server environments will invoke register() and worker environments will invoke the async_register() method.
Another caveat is that HTTP requests sent by the worker daemon on localhost will contain an HTTP header - "User-Agent": "aws-sqsd/1.1". Read more here. So in your web application you can have a single listener to post requests on "/register" and depending on whether this header is present or not you invoke register() or async_register() methods internally.
Also I think if you want to share the code base between the two environments you can upload the code base at only one place. Your environments are logically grouped into applications. So you can have a single application. You upload your source code to this application using the "CreateApplicationVersion" API call. Suppose you upload an application version with label 'v1'. You can now create a worker environment and a web server environment under the same application. When you create an environment you need to provide a version to deploy to your enviroment. In this case you can deploy v1 to both environments. So you will be sharing the same source code for both environments. When you have a new version "v2". You upload this version and then perform an update on both environments changing their version to "v2".
The same version of the source code can be deployed to both environments. They will be running on different EC2 instances because one environment is dedicated for responding to web requests and one environment is dedicated for responding to asynchronous web requests (worker).

Related

AWS ELB/ECS Http response headers changed

Some context here:
An old Symfony app is used in multiple EC2 instances. Handles millions of requests each day without issues.
For dev purposes, the app was added to a container and that container is used locally by the developers without having to install all the requirements. The dockerized app uses the same nginx/supervisor/php-fpm configs that productive ec2 instances.
To make easier some dev processes, it was decided to create multiple dev environments using AWS Fargate, instead of EC2 instances.
The image is pushed to ECR and is deployed using FARGATE strategy to clusters.
The approach perhaps is too much, since we have 1 Cluster running 1 service only with 1 task. That Service uses an ELB -> Target group.
The application is working fine, but after some time (hours, or days), some requests are returned with different headers. The response is a JSON, but the content type is returned as HTML, other headers are dropped from the request like access-control-allow-headers, access-control-allow-credentials, access-control-allow-methods, triggering a CORS error in the client's browser.
The weird part is that if 1 page creates 10 requests to this service, 9 will work correctly, but 1 request will return 200 with different headers. That endpoint consistently will behave in the same way to any user until the task is restarted.
The response headers are returned by the Symfony app. I also tried to force those headers including those in nginx config by default for every response, and the result is the same.
The docker image exposes port 80 to the service.
The load balancer has the rule to forward HTTPS (443) traffic to port 80, so traffic can reach the container.
The load balancer has enabled the use of HTTP/2
The only notable difference besides EC2/Fargate implementations is the load balancer. The production load balancer is an old class load balancer with only HTTP/1 enabled and the new ones are Applications load balancers using HTTP/2.
This is driving me crazy. Has anyone experienced something like this?
Incorrect headers
Correct headers

Is there a way to start a Rails puma webserver within rails without a db connection?

I have a microservice that is taking in webhooks to process but it is currently getting pounded by the sender of said webhooks. Right now I am taking them and inserting the webhooks into the db for processing but the data is so bursty at times that I don't have enough bandwidth to manage the flood of requests and I cannot scale anymore as I'm out of db connections. The current thought is to just take the webhooks and throw them into a Kafka queue for processing; using Kafka I can scale up the number of frontend workers to whatever I need to handle the deluge of requests and I have the replayability of Kafka. By throwing the webhooks into Kafka, the frontend web server no longer needs a pool of db connections as it literally is just taking the request and throwing into the queue for processing. Does anyone have any knowledge on removing the db connectivity from Puma or have an alternative to do what's being asked?
Currently running
ruby 2.6.3
rails 6.0.1
puma 3.11
Ended up using Puma's before fork and on_worker_boot methods to not re-establish the database connection for those particular web workers within the config

How to deploy frontend and backend in same heroku applications but different docker images

I'm creating a web application that has Angular for front end and Flask for back end. I already 'dockered' my applications but I'm not sure on how to get them in Heroku as the same application
I've been reading that some people has used a reverse proxy server (this means that both applications are in different heroku app and they connect them using a proxy like traeffik or haproxy). But I don't want to do this, I want them to be in the same application (Example: grupo-camporota.herokuapp.com)
I was thinking that I should push both images, one as web dyno (front) and the other one as a worker (backend) but I've read that the worked dyno it's not for this, but for EXTERNAL apis. I would like to upload both image into heroku and make them communicate between them.
I would like to know how to get this done (I'm pretty sure it's possible), since i'm kinda lost
Your backend can't be a worker dyno: only web dynos can receive traffic from the internet. And one app will only get a single port to listen on, so you can't run two services on a single app.
You could serve your front-end up from your back-end as static files, but I don't think that will work with Docker. Also, Flask doesn't like to serve static files itself, so that may not be a good fit either.
It also looks like you can't communicate between Docker containers using a private network on Heroku. You may just have to deploy two apps (or host your front-end on a more appropriate static host).

Rails 5 - Running Resque on a Separate Machine

I understand how to setup a machine to run Resque with the environment, etc. What I cannot find, is now to point a Web (etc) server to (use) a queue on a separate machine.
For example, I am using Resque to handle mail (with the resque_mailer gem). Resque::Mailer is included in my ApplicationMailer setup, and everything works fine all running on one machine. But, I do not wish to use the resources of the Web machine for the background-processes.
How do I tell the ApplicationMailer, or configure Resque::Mailer, so that the request is routed to another machine on the internal network?
The answer was - you don't have to do anything other than ensure the system that is generating the work (the 1st machine) can reach the Redis instance, so it can update the queue there.
The 2nd machine, which carries-out the jobs, also only needs to be able to find the Redis server, which it will check to find new jobs to do.
The Redis server can be anywhere reachable by the two machines - on the 1st, 2nd, or a 3rd machine - in the context of this example.

Does application server workers spawn threads?

The application servers used by Ruby web applications that I know have the concept of worker processes. For example, Unicorn has this on the unicorn.rb configuration file, and for mongrel it is called servers, set usually on your mongrel_cluster.yml file.
My two questions about it:
1) Does every worker/server works as a web server and spam a processes/threads/fiber each time it receives a request, or it blocks when a new request is done if there is already other running?
2) Is this different from application server to application server? (Like unicorn, mongrel, thin, webrick...)
This is different from app server to app server.
Mongrel (at least as of a few years ago) would have several worker processes, and you would use something like Apache to load balance between the worker processes; each would listen on a different port. And each mongrel worker had its own queue of requests, so if it was busy when apache gave it a new request, the new request would go in the queue until that worker finished its request. Occasionally, we would see problems where a very long request (generating a report) would have other requests pile up behind it, even if other mongrel workers were much less busy.
Unicorn has a master process and just needs to listen on one port, or a unix socket, and uses only one request queue. That master process only assigns requests to worker processes as they become available, so the problem we had with Mongrel is much less of an issue. If one worker takes a really long time, it won't have requests backing up behind it specifically, it just won't be available to help with the master queue of requests until it finishes its report or whatever the big request is.
Webrick shouldn't even be considered, it's designed to run as just one worker in development, reloading everything all the time.
off the top of my head, so don't take this as "truth"
ruby (MRI) servers:
unicorn, passenger and mongrel all use 'workers' which are separate processes, all of these workers are started when you launch the master process and they persist until the master process exits. If you have 10 workers and they are all handling requests, then request 11 will be blocked waiting for one of them to complete.
webrick only runs a single process as far as I know, so request 2 would be blocked until request 1 finishes
thin: I believe it uses 'event I/O' to handle http, but is still a single process server
jruby servers:
trinidad, torquebox are multi-threaded and run on the JVM
see also puma: multi-threaded for use with jruby or rubinious
I think GitHub best explains unicorn in their (old, but valid) blog post https://github.com/blog/517-unicorn.
I think it puts backlog requests in a queue.

Resources