I want to test an AWS CDK construct/template that involves creating a Lambda function with a Docker image asset. This causes the unit tests to take a while to execute. Is there any way to tell the CDK not to physically build these Docker images for testing purposes?
Have you got esbuild installed? Without access to esbuild, the bundling part of generating the Lambda falls back to Docker.
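For what it's worth, here is a minimal sketch (Python CDK v2) of the situation this applies to, assuming the construct in question uses a NodejsFunction rather than a true Docker image function; the stack name, entry path and handler are placeholders. With esbuild available on the machine running the tests, synth bundles locally instead of launching a Docker container:

    import aws_cdk as cdk
    from aws_cdk import aws_lambda as lambda_, aws_lambda_nodejs as nodejs
    from aws_cdk.assertions import Template

    def test_lambda_is_defined():
        app = cdk.App()
        stack = cdk.Stack(app, "TestStack")

        # If esbuild is installed locally, this bundles with esbuild
        # instead of spinning up a Docker container during synth.
        nodejs.NodejsFunction(
            stack, "Handler",
            entry="handler/index.ts",          # placeholder path
            handler="main",
            runtime=lambda_.Runtime.NODEJS_18_X,
        )

        template = Template.from_stack(stack)
        template.resource_count_is("AWS::Lambda::Function", 1)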
I have an AWS Lambda function written in Ruby, and due to the ever-growing amount of data it has to process (maths), it times out. Given the nature of the job, it is timing out because of very extensive maths, and optimizing that maths is not up for review: it is far too complex and not an option, as it sits in a proprietary library written in Ruby.
To avoid this, I'm looking for a way to migrate the whole Lambda code to AWS Fargate, so it can start a task and let it run for a long time (probably around 30-35 minutes). I was not able to find a guide on how to convert Lambda code to an AWS Fargate deployment.
So I have an AWS Lambda zip file that contains the Rails Lambda code, the proprietary Ruby library, and the mysql2 gem with native *.so libraries (the results of the calculations are stored in a database).
Is there any tool or guide for migrating such (or similar) Lambda functions to Fargate? By the way, the Lambda is invoked from SQS; as far as I understand, in the Fargate case SQS would invoke a Lambda, which would then run a Fargate task. But the task configuration, the images and the rest are not clear to me.
#sngsnd Thanks for asking the question.
As per the definition,
AWS Fargate is a serverless compute engine for containers.
The steps below may help you.
Containerize your Ruby code.
Push the image to a repository (DockerHub, ECR, etc.).
Create a task definition in ECS.
Create an ECS cluster.
Create a service to run your task.
This article may help you.
serverless-application-for-long-running-process-fargate-lambda
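To make the SQS part concrete: a common pattern is to keep SQS as the trigger for a small Lambda whose only job is to start the Fargate task. A rough boto3 sketch of that task-runner Lambda (the cluster, task definition, subnet, security group and container names are placeholders you would replace with your own) could look like this:

    import json
    import boto3

    ecs = boto3.client("ecs")

    def handler(event, context):
        # One SQS record per long-running calculation job.
        for record in event["Records"]:
            ecs.run_task(
                cluster="calc-cluster",                  # placeholder
                launchType="FARGATE",
                taskDefinition="ruby-calc-task",         # placeholder task definition family
                count=1,
                networkConfiguration={
                    "awsvpcConfiguration": {
                        "subnets": ["subnet-0123456789abcdef0"],     # placeholder
                        "securityGroups": ["sg-0123456789abcdef0"],  # placeholder
                        "assignPublicIp": "ENABLED",
                    }
                },
                overrides={
                    "containerOverrides": [{
                        "name": "ruby-calc",             # container name in the task definition
                        "environment": [
                            {"name": "JOB_PAYLOAD", "value": record["body"]},
                        ],
                    }]
                },
            )
        return {"started": len(event["Records"])}

The container image referenced by the task definition would be your Ruby code (proprietary library, mysql2 gem and all) packaged as a Docker image and pushed to ECR.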
Hi Stackoverflow community, I have a question regarding using Docker with AWS EC2. I am comfortable with EC2 but am very new to Docker. I code in Python 3.6 and would like to automate the following process:
1: start an EC2 instance with Docker (Docker image stored in ECR)
2: run a one-off process and return results (let's call it "T") in a CSV format
3: store "T" in AWS S3
4: Shut down the EC2
The reason for using an EC2 instance is because the process is quite computationally intensive and is not feasible for my local computer. The reason for Docker is to ensure the development environment is the same across the team and the CI facility (currently using circle.ci). I understand that interactions with AWS can mostly be done using Boto3.
I have been reading about AWS's own ECS and I have a feeling that it's geared more towards deploying a web-app with Docker rather than running a one-off process. However, when I searched around EC2 + Docker nothing else but ECS came up. I have also done the tutorial in AWS but it doesn't help much.
I have also considered running EC2 with a shell script (i.e. downloading Docker, pulling the image, building the container, etc.), but it feels a bit hacky. Therefore my questions here are:
1: Is ECS really the most appropriate solution in this scenario? (Or in other words, is ECS designed for such operations?)
2: If so, are there any examples of people setting up and running a one-off process using ECS? (I find the setup really confusing, especially the terminology used.)
3: What are the other alternatives (if any)?
Thank you so much for the help!
Without knowing more about your process, I'd like to pose two alternatives for you.
Use Lambda
Depending on just how compute-intensive your process is, this may not be a viable option. However, if it is something that can be distributed, Lambda is awesome. You can find more information about the resource limitations here. Going this route, you would simply write Python 3.6 code to perform your task and write "T" to S3.
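For illustration, a minimal sketch of that route, assuming the process can be wrapped in a Python handler; the bucket and key names are made up:

    import csv
    import io
    import boto3

    s3 = boto3.client("s3")

    def run_process():
        # Stand-in for your computationally intensive step.
        return [["id", "value"], [1, 42.0]]

    def handler(event, context):
        rows = run_process()
        buf = io.StringIO()
        csv.writer(buf).writerows(rows)
        s3.put_object(
            Bucket="my-results-bucket",   # placeholder
            Key="results/T.csv",          # placeholder
            Body=buf.getvalue(),
        )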
Use Data Pipeline
With Data Pipeline, you can build a custom AMI (EC2) and use that as your image. You can then specify the size of the EC2 resource that you need to run this process. It sounds like your process would be pretty simple. You would need to define:
Ec2Resource
Specify AMI, Role, Security Group, Instance Type, etc.
ShellCommandActivity
Bootstrap the EC2 instance as needed
Grab your code from S3, GitHub, etc.
Execute your code (include writing "T" to S3 in your code)
You can also schedule the pipeline to run at an interval/schedule or call it directly from boto3.
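Either way, once the pipeline exists, kicking it off from code is a single boto3 call; a sketch with a made-up pipeline id (the Ec2Resource and ShellCommandActivity objects live in the pipeline definition itself):

    import boto3

    datapipeline = boto3.client("datapipeline")

    def trigger_pipeline(pipeline_id="df-0123456789ABCDEFGHIJ"):   # placeholder id
        # Activates an already-defined pipeline; the pipeline then provisions
        # the EC2 resource, runs the shell activity and tears everything down.
        datapipeline.activate_pipeline(pipelineId=pipeline_id)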
We have a team of J2EE/Spring and Angular developers. We develop small applications in short time spans. As of now we don't have the luxury of a DevOps team to maintain staging and QA environments.
I am checking the feasibility of letting a developer who wants to get their application tested build a Docker image and float it on an on-premise central Docker server (at times they work from remote locations as well). We are in the process of adopting CI, but it may take some time.
Due to cost pressure we cannot use AWS for anything except production.
Any pointers will be helpful.
Thanks in advance.
Since you plan on using Docker, you can in fact set up a simple build flow which makes life easier in the long run.
Use DockerHub for building and storing Docker images (this saves build time and also gives you an easy way of rolling back, shipping and doing DevOps). It takes a few minutes to connect your GitHub/Bitbucket repository to DockerHub and configure it to build an image for each branch/tag upon a PR merge or push. The cost for the service is also minimal.
Use these images for your local environment as well as the production environment (guaranteeing that you are referring to the correct versions).
For production, use AWS Elastic Beanstalk or AWS ECS (I prefer ECS due to its container orchestration capabilities) to simplify deployments and DevOps, where most of the configuration can be done from the AWS web console. The cost is only for the underlying EC2 instances.
For Dockerizing your Java application, this video might be helpful for getting insights into the JVM.
Note: Later on you can connect these dots using your CI environment, reducing further effort.
Here in my company we have our regular application running on AWS Elastic Beanstalk, with some background jobs. The problem is that these jobs are starting to get heavier, and we were thinking of separating them from the application. The question is: where should we do it?
We were thinking of doing it in AWS Lambda, but then we would have to port our Rails code to Python, Node or Java, which seems like a lot of work. What other options are there for this? Should we just create another EC2 environment for the jobs? Thanks in advance.
Edit: I'm using the shoryuken gem (http://github.com/phstc/shoryuken) integrated with SQS, but it currently has a memory leak and my application is going down sometimes; I don't know if the memory leak is the cause, though. We have already separated the application into an API part in Elastic Beanstalk and a front-end part in S3.
Normally, just another EC2 instance with a copy of your Rails app, where instead of rails s to start the web server, you run rake resque:work or whatever your job runner start command is. Both would share the same Redis instance and database so that your web server writes the jobs to the queue and the worker picks them up and runs them.
If you need more workers, just add more EC2 instances pointing to the same Redis instance. I would advise separating your jobs by queue name, so that one worker can just process fast stuff e.g. email sending, and others can do long running or slow jobs.
We had a similar requirement; for us it was the Sidekiq background jobs. They started to get very heavy, so we split them into a separate OpsWorks stack, with a simple recipe to build the machine dependencies (Ruby, MySQL, etc.). Since we don't have to worry about load balancers and requests timing out, it's fine for all machines to deploy at the same time.
Another thing you could use in OpsWorks is scheduled machines (if the jobs are only needed at certain times during the day): the machine gets provisioned a few minutes before the task is due, and after the task is done you can have it shut down automatically, which would reduce your cost.
EB also has a different type of application, the worker environment; you could check that out as well, but honestly I haven't looked into it, so I can't tell you what its pros and cons are.
We recently went down that route. I dockerized our Rails app and wrote a custom entrypoint for the Docker container. In summary, the entrypoint parses the commands you pass after docker run IMAGE_NAME.
For example, if you run docker run IMAGE_NAME sb rake do-something-magical, the entrypoint understands that it should run the rake job with the sandbox environment config. If you only run docker run IMAGE_NAME, it will do rails s -b 0.0.0.0.
PS: I wrote a custom entrypoint because we have 3 different environments; the entrypoint downloads the environment-specific config from S3.
I also set up an ECS cluster and wrote a task-runner function on Lambda; this Lambda function schedules a task on the ECS cluster, and we trigger that Lambda from CloudWatch Events. You can send custom payloads to the Lambda when using CloudWatch Events.
It sounds complicated, but the implementation is really simple.
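To make the CloudWatch Events part concrete, here is a rough boto3 sketch; the rule name, schedule, Lambda ARN and payload are placeholders, and you would also need to grant CloudWatch Events permission to invoke the function:

    import json
    import boto3

    events = boto3.client("events")

    def schedule_job(lambda_arn="arn:aws:lambda:us-east-1:123456789012:function:task-runner"):
        # Scheduled rule that fires once a day at 03:00 UTC.
        events.put_rule(
            Name="nightly-report-job",                 # placeholder
            ScheduleExpression="cron(0 3 * * ? *)",
            State="ENABLED",
        )
        events.put_targets(
            Rule="nightly-report-job",
            Targets=[{
                "Id": "task-runner",
                "Arn": lambda_arn,
                # This JSON is the custom payload the Lambda receives as its event.
                "Input": json.dumps({"command": ["rake", "do-something-magical"]}),
            }],
        )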
You may consider submitting your task to the AWS SQS service, and then using an Elastic Beanstalk worker environment to process your background task.
Elastic Beanstalk supports Rails applications:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_Ruby_rails.html
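On the application side, submitting a task is just a plain SQS send (sketch below with a placeholder queue URL and payload); the worker environment's SQS daemon then POSTs each message to your Rails app's configured endpoint.

    import json
    import boto3

    sqs = boto3.client("sqs")

    def enqueue_job(payload):
        # The worker environment polls this queue and delivers the message
        # to your application over HTTP.
        sqs.send_message(
            QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/worker-queue",  # placeholder
            MessageBody=json.dumps(payload),
        )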
Depending on what kind of work these background jobs perform, you might want to think about extracting those functions into microservices if you are running your jobs on a different instance anyway.
Here is a good codeship blog post on how to approach this.
For simple mailer type stuff, this definitely feels a little heavy handed, but if the functionality is more complex, e.g. general notification of different clients, it might well be worth the overhead.