migrate AWS Lambda rails code to AWS Fargate

migrate AWS Lambda rails code to AWS Fargate - ruby-on-rails

I have AWS Lambda function written in ruby and due to the larger and larger amounts of data it should process (maths) it timeouts. Due to the nature of the job it is timing out due to very extensive maths and optimization of maths is out of review - it is way more complex and not an option as sits in a proprietary library written in ruby.
To avoid this I'm searching the way to migrate whole lambda code to AWS Fargate so it will be able to instantiate a task and let it run for a long time (probably around 30-35 minutes). I was not able to find a guide on how to convert lambda code to aws fargate deployment.
So I have an AWS lambda zip file that contains rails lambda code, proprietary ruby library, and mysql2 gem with native *.so libraries (results of calculations are stored in db).
Is there any tool or guide how to migrate such or similar lambda functions to fargate? Btw, lambda is invoked from SQS as far as I understood in case of Fargate SQS will invoke lambda which will run fargate task. But task configuration and images and rest are not clear.

#sngsnd Thanks for asking the question.
As per the definition,
AWS Fargate is a serverless compute engine for containers.
Below steps may help you.
Containerized your Ruby code.
Put image in repository like (DockerHub, ECR etc)
Create task definition in ECS
Create ECS cluster.
Create Service to run your task.
This article may help you.
serverless-application-for-long-running-process-fargate-lambda

Related

Can I run aws cdk deploy as an aws lambda service?

Can I cdk deploy from an aws lambda (preferably python)?
Example workflow:
Lambda is triggered by e.g. S3 File upload
Lambda runs cdk deploy

I would recommend to create a CodeBuild that runs the command and invoke that from the Lambda instead.
Lambda has a 15 minute timeout limit whereas CodeBuild is much more relaxed in that regard and, being a full blown virtual machine, you get all the benefits of being able to utilise other OS level dependencies like Docker for asset bundling.
Trying to do it inside a Lambda you will either hit timeouts or inability to use os dependencies or disk space issues somewhere down the road inevitably.

Nothing officially supported but worth checking out this:
https://github.com/misterjoshua/cdk-lambda-deploy

It's possible. I use lambda container image to daily deploy/destroy test environments.

Reuse Airflow hooks with kubernetes operator

Is possible to import the hooks that airflow provided in the code (snowflake hook, aws hook, etc) in a kubernetes operator that run a python script?
I may have the wrong idea of how to work with airflow.
Example. I have Airflow running with a kubernetes cluster. Tasks of the dag are inside containers. One task takes some data from SQL and upload it to S3. If I want to do this in a python script inside container I need to code all this stuff (with the possible error and expend time) if I can reuse the libraries (hooks) that airflow has in the code I can save time and also, I'm supported for a lot of developers.
Thank you

Start EC2 with Docker, run script and shut down

Hi Stackoverflow community, I have a question regarding using Docker with AWS EC2. I am comfortable with EC2 but am very new to Docker. I code in Python 3.6 and would like to automate the following process:
1: start an EC2 instance with Docker (Docker image stored in ECR)
2: run a one-off process and return results (let's call it "T") in a CSV format
3: store "T" in AWS S3
4: Shut down the EC2
The reason for using an EC2 instance is because the process is quite computationally intensive and is not feasible for my local computer. The reason for Docker is to ensure the development environment is the same across the team and the CI facility (currently using circle.ci). I understand that interactions with AWS can mostly be done using Boto3.
I have been reading about AWS's own ECS and I have a feeling that it's geared more towards deploying a web-app with Docker rather than running a one-off process. However, when I searched around EC2 + Docker nothing else but ECS came up. I have also done the tutorial in AWS but it doesn't help much.
I have also considered running EC2 with a shell script (i.e. downloading docker, pulling the image, building the container etc)but it feels a bit hacky? Therefore my questions here are:
1: Is ECS really the most appropriate solution in his scenario? (or in other words is ECS designed for such operations?)
2: If so are there any examples of people setting-up and running a one-off process using ECS? (I find the set-up really confusing especially the terminologies used)
3: What are the other alternatives (if any)?
Thank you so much for the help!

Without knowing more about your process; I'd like to pose 2 alternatives for you.
Use Lambda
Pending just how compute intensive your process is, this may not be a viable option. However, if it something that can be distributed, Lambda is awesome. You can find more information about the resource limitations here. This route, you would simply write Python 3.6 code to perform your task and write "T" to S3.
Use Data Pipeline
With Data Pipeline, you can build a custom AMI (EC2) and use that as your image. You can then specify the size of the EC2 resource that you need to run this process. It sounds like your process would be pretty simple. You would need to define:
EC2resource
Specify AMI, Role, Security Group, Instance Type, etc.
ShellActivity
Bootstrap the EC2 instance as needed
Grab your code form S3, GitHub, etc
Execute your code (Include in your code writing "T" to S3)
You can also schedule the pipeline to run at an interval/schedule or call it directly from boto3.

Where should I run scheduled background jobs?

Here in my company we have our regular application in aws ebs with some background jobs. The problem is, these jobs are starting to get heavier and we were thinking in separate them from the application. The question is: Where should we do it?
We were thinking in doing it in aws lambda, but then we would have to port our rails code to python, node or java, which seems to be a lot of work. What are other options for this? Should we just create another ec2 environment for the jobs? Thanks in advance.
Edit: I'm using shoryuken gem: http://github.com/phstc/shoryuken integrated with SQS. But its currently with some memory leak and my application is going down sometimes, I dont know if the memory leak is the cause tough. We already separated the application between an API part in EBS and a front-end part in S3.

Normally, just another EC2 instance with a copy of your Rails app, where instead of rails s to start the web server, you run rake resque:work or whatever your job runner start command is. Both would share the same Redis instance and database so that your web server writes the jobs to the queue and the worker picks them up and runs them.
If you need more workers, just add more EC2 instances pointing to the same Redis instance. I would advise separating your jobs by queue name, so that one worker can just process fast stuff e.g. email sending, and others can do long running or slow jobs.

We had a similar requirement, for us it was the sidekiq background jobs, they started to get very heavy, so we split it into a separate opsworks stack, with a simple recipe to build the machine dependencies ( ruby, mysql, etc ), and since we don't have to worry about load balancers and requests timing out, it's ok for all machines to deploy at the same time.
Also another thing you could use in opsworks is using scheduled machines ( if the jobs are needed at certain times during the day ), having the machine get provisioned few minutes before the time of the task, and then after the task is done you could make it shutdown automatically, that would reduce your cost.
EB also has a different type of application, which is the worker application, you could also check that out, but honestly I haven't looked into it so I can't tell you what are the pros and cons of that.

We recently passed on that route. I dockerized our rails app, and wrote a custom entrypoint to that docker container. In summary that entrypoint parses commands after you run docker run IMAGE_NAME
For example: If you run: docker run IMAGE_NAME sb rake do-something-magical entrypoint understands that it will run rake job with sandbox envrionment config. if you only run: docker run IMAGE_NAME it will do rails s -b 0.0.0.0
PS: I wrote custom entrypoint because we have 3 different environments, that entrypoint downloads environment specific config from s3.
And I set up an ECS Cluster, wrote an task-runner job on Lambda this lambda function schedules a task on ecs cluster, and we trigger that lambda from CloudWatch Events. You can send custom payloads to lambda when using CloudWatch Events.
It sounds complicated but implementation is really simple.

You may consider to submit your task to AWS SQS services, then you may use elasticbeantaslk worker enviroment to process your backgrown task.
Elasticbeanstalk supports rail application:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_Ruby_rails.html

Depending on what kind of work these background jobs perform, you might want to think about maybe extracting those functions into microservices if you are running your jobs on a difference instance anyway.
Here is a good codeship blog post on how to approach this.
For simple mailer type stuff, this definitely feels a little heavy handed, but if the functionality is more complex, e.g. general notification of different clients, it might well be worth the overhead.

AWS OpsWorks vs AWS Beanstalk vs AWS CloudFormation? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed last month.
The community reviewed whether to reopen this question last month and left it closed:
Original close reason(s) were not resolved
Improve this question
I would like to know what are the advantages and disadvantages of using AWS OpsWorks vs AWS Beanstalk and AWS CloudFormation?
I am interested in a system that can be auto scaled to handle any high number of simultaneous web requests (From 1000 requests per minute to 10 million rpm.), including a database layer that can be auto scalable as well.
Instead of having a separate instance for each app, Ideally I would like to share some hardware resources efficiently. In the past I have used mostly an EC2 instance + RDS + Cloudfront + S3
The stack system will host some high traffic ruby on rails apps that we are migrating from Heroku, also some python/django apps and some PHP apps as well.

I would like to know what are the advantages and disadvantages of using AWS OpsWorks vs AWS Beanstalk and AWS CLoudFormation?
The answer is: it depends.
AWS OpsWorks and AWS Beanstalk are (I've been told) simply different ways of managing your infrastructure, depending on how you think about it. CloudFormation is simply a way of templatizing your infrastructure.
Personally, I'm more familiar with Elastic Beanstalk, but to each their own. I prefer it because it can do deployments via Git. It is public information that Elastic Beanstalk uses CloudFormation under the hood to launch its environments.
For my projects, I use both in tandem. I use CloudFormation to construct a custom-configured VPC environment, S3 buckets and DynamoDB tables that I use for my app. Then I launch an Elastic Beanstalk environment inside of the custom VPC which knows how to speak to the S3/DynamoDB resources.
I am interested in a system that can be auto scaled to handle any high number of simultaneous web requests (From 1000 requests per minute to 10 million rpm.), including a database layer that can be auto scalable as well.
Under the hood, OpsWorks and Elastic Beanstalk use EC2 + CloudWatch + Auto Scaling, which is capable of handling the loads you're talking about. RDS provides support for scalable SQL-based databases.
Instead of having a separate instance for each app, Ideally I would like to share some hardware resources efficiently. In the past I have used mostly an EC2 instance + RDS + Cloudfront + S3
Depending on what you mean by "some hardware resources", you can always launch standalone EC2 instances alongside OpsWorks or Elastic Beanstalk environments. At present, Elastic Beanstalk supports one webapp per environment. I don't recall what OpsWorks supports.
The stack system will host some high traffic ruby on rails apps that we are migrating from Heroku, also some python/django apps and some PHP apps as well.
All of this is fully supported by AWS. OpsWorks and Elastic Beanstalk have optimized themselves for an array of development environments (Ruby, Python and PHP are all on the list), while EC2 provides raw servers where you can install anything you'd like.

OpsWorks is an orchestration tool like Chef - in fact, it's derived from Chef - Puppet, Ansible or Saltstalk. You use Opsworks to specify the state that you want your network to be in by specifying the state that you want each resource - server instances, applications, storage - to be in. And you specify the state that you want each resource to be in by specifying the value that you want for each attribute of that state. For example, you might want the Apache service to be always up and running and start on boot-up with Apache as the user and Apache as the Linux group.
CloudFormation is a json template (**) that specifies the state of the resource(s) that you want to deploy i.e. you want to deploy an AWS EC2 micro t2 instance in us-east-1 as part of VPC 192.168.1.0/24. In the case of an EC2 instance, you can specify what should run on that resource through your custom bash script in the user-data section of the EC2 resource. CloudFormation is just a template. The template gets fleshed ourt as a running resource only if you run it either through the AWS Management Console for CloudFormation or if you run the aws cli command for Cloudformation i.e. aws cloudformation ...
ElasticBeanstalk is a PAAS- you can upload the specifically Ruby/Rails, node.js or Python/django or Python/Flask apps. If you're running anything else like Scala, Haskell or anything else, create a Docker image for it and upload that Docker image into Elastic Beanstalk (*).
You can do the uploading of your app into Elastic Beanstalk by either running the aws cli for CloudFormation or you create a recipe for Opsworks to upload your app into Elastic Beanstalk. You can also run the aws cli for Cloudformation through Opsworks.
(*) In fact, AWS's documentation on its Ruby app example was so poor that I lost patience and embedded the example app into a Docker image and uploaded the Docker image into Elastic Beanstalk.
(**) As of Sep 2016, Cloudformation also supports YAML templates.

AWS Beanstalk:
It is Deploy and manage applications in the AWS cloud without worrying about the infrastructure that runs yor web applications with Elastic Beanstalk.
No need to worry about EC2 or else installations.
AWS OpsWorks
AWS OpsWorks is nothing but an application management service that makes it easy for the new DevOps users to model & manage the entire their application

In Opsworks you can share "roles" of layers across a stack to use less resources by combining the specific jobs an underlying instance maybe doing.
Layer Compatibility List (as long as security groups are properly set):
HA Proxy : custom, db-master, and memcached.
MySQL : custom, lb, memcached, monitoring-master, nodejs-app, php-app, rails-app, and web.
Java : custom, db-master, and memcached.
Node.js : custom, db-master, memcached, and monitoring-master
PHP : custom, db-master, memcached, monitoring-master, and rails-app.
Rails : custom, db-master, memcached, monitoring-master, php-app.
Static : custom, db-master, memcached.
Custom : custom, db-master, lb, memcached, monitoring-master, nodejs-app, php-app, rails-app, and web
Ganglia : custom, db-master, memcached, php-app, rails-app.
Memcached : custom, db-master, lb, monitoring-master, nodejs-app, php-app, rails-app, and web.
reference : http://docs.aws.amazon.com/opsworks/latest/userguide/layers.html

AWS CloudFormation - Create and Update your environments.
AWS Opsworks - Manage your systems inside that environments like we do with Chef or Puppet
AWS Beanstalk - Create, Manage and Deploy.
But personally I like CloudFormation and OpsWorks both by using its full power for what they are meant for.
Use CloudFormation to create your environment then you can call Opsworks from cloud formation scripts to launch your machine. Then you will have Opsworks stack to manage it. For example add a user in linux box by using Opsworks or do patching of your boxes using chef recipes. You can write down chef recipes for deployment also. Otherwise you can use CodeDeploy specifically build for deployment.

AWS OpsWorks - This is a part of AWS management service. It helps to configure the application using scripting. It uses Chef as the devops framework for this application management and operation.
There are templates which can be used for configuration of server, database, storage. The templates can also be customized to perform any other task. DevOps Engineers have control on application's dependencies and infrastructure.
AWS Beanstalk - It provides the environment for language like Java, Node Js, Python, Ruby Go. Elastic Bean stalk provide the resource to run the application. Developers not to worry about the infrastructure and they don't have control on infrastructure.
AWS CloudFormation - CloudFormation has sample templates to manage the AWS resources in order.

As many others have commented AWS Beanstalk, AWS OpsWorks and AWS Cloud Formation offers different solutions for different problems.
In order to acomplish with
I am interested in a system that can be auto scaled to handle any high number of simultaneous web requests (From 1000 requests per minute to 10 million rpm.), including a database layer that can be auto scalable as well.
And taking into consideration you are in migration process I strongly recommend you to start taking a look at AWS Lambda & AWS DynamoDB solution (or hybrid one).
Both two are designed for auto scaling in a simple way and may be a very cheap solution.

You should use OpsWorks in place of CloudFormation if you need to deploy an application that requires updates to its EC2 instances. If your application uses a lot of AWS resources and services, including EC2, use a combination of CloudFormation and OpsWorks
If your application will need other AWS resources, such as database or storage service. In this scenario, use CloudFormation to deploy Elastic Beanstalk along with the other resources.

Just use terraform and ECS or EKS.
opsworks, elastic beanstalk and cloudformation old tech now. -)

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart