Terraform: AWS Lambda with Image not updating - docker

We have a new Terraform script that pushes a Docker image to an AWS Lambda. The script works well and correctly connects the fresh image to the Lambda. I can confirm this by checking the image URL shown in the AWS console for the Lambda, and it is the newly pushed and connected image. However, when testing the Lambda it is clearly running the prior code. It seems as though the Lambda has been updated, but the running in-memory instances didn't get the message.
Question: is there a way to force the in-memory Lambdas to be cycled to the new image?
Here is our TF code for the Lambda:
resource "aws_lambda_function" "my_lambda" {
function_name = "MyLambda_${var.environment}"
role = data.aws_iam_role.iam_for_lambda.arn
image_uri = "${data.aws_ecr_repository.my_image.repository_url}:latest"
memory_size = 512
timeout = 300
architectures = ["x86_64"]
package_type = "Image"
environment {variables = {stage = var.environment, commit_hash=var.commit_hash}}
}

After more searching I found some discussions (here) that mention the source_code_hash option in Terraform for the Lambda resource block (docs here). It is mostly used with a SHA hash of the zip file when pushing code from an S3 bucket, but in our case we are using a container image, so there is no file to hash. However, it turns out that it is just a string that Terraform compares between runs: if the string changes, Terraform updates the function code, and since our image URI is always ":latest" nothing else ever changes. So we added the following:
resource "aws_lambda_function" "my_lambda" {
function_name = "MyLambda_${var.environment}"
role = data.aws_iam_role.iam_for_lambda.arn
image_uri = "${data.aws_ecr_repository.my_image.repository_url}:latest"
memory_size = 512
timeout = 300
architectures = ["x86_64"]
package_type = "Image"
environment {variables = {stage = var.environment, commit_hash=var.commit_hash}}
source_code_hash = var.commit_hash << New line
}
We use a Bitbucket pipeline to inject the git commit hash into the terraform apply operation. This fix allowed the Lambda to correctly update the running version.
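The injection itself is just a variable flag on the apply step; a minimal sketch, assuming the Terraform variable is named commit_hash and the pipeline exposes the commit as BITBUCKET_COMMIT (adjust to your setup):

# Pass the current commit hash so source_code_hash changes on every build
terraform apply -auto-approve -var "commit_hash=${BITBUCKET_COMMIT}"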

Alternatively, if you don't want to depend on Bitbucket for this, you can add a data source for the ECR image:
data "aws_ecr_image" "repo_image" {
repository_name = "repo-name"
image_tag = "tag"
}
And then use its id as a source code hash like this:
source_code_hash = trimprefix(data.aws_ecr_image.repo_image.id, "sha256:")
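
Putting the two pieces together, a minimal sketch of the data-source variant (repository and tag names are placeholders for your own):

data "aws_ecr_image" "repo_image" {
  repository_name = "repo-name"
  image_tag       = "latest"
}

resource "aws_lambda_function" "my_lambda" {
  function_name = "MyLambda_${var.environment}"
  role          = data.aws_iam_role.iam_for_lambda.arn
  package_type  = "Image"
  image_uri     = "${data.aws_ecr_repository.my_image.repository_url}:latest"

  # The data source's id is the image digest, so this value changes whenever a new
  # image is pushed under the tag, which forces Terraform to update the function code.
  source_code_hash = trimprefix(data.aws_ecr_image.repo_image.id, "sha256:")
}

If it fits your workflow, another option along the same lines is to pin image_uri to the digest itself, e.g. "${data.aws_ecr_repository.my_image.repository_url}@${data.aws_ecr_image.repo_image.image_digest}", so the function never depends on the mutable :latest tag.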

Related

Accessing the return value of a Lambda Step in Sagemaker pipeline

I've added a Lambda Step as the first step in my SageMaker Pipeline. It processes some data and creates 2 files as part of its output, like so:
from sagemaker.workflow.lambda_step import LambdaStep, Lambda, LambdaOutput, LambdaOutputTypeEnum

# lamb_preprocess = LambdaStep(func_arn="")
output_param_1 = LambdaOutput(output_name="status", output_type=LambdaOutputTypeEnum.Integer)
output_param_2 = LambdaOutput(output_name="file_name_a_c_drop", output_type=LambdaOutputTypeEnum.String)
output_param_3 = LambdaOutput(output_name="file_name_q_c_drop", output_type=LambdaOutputTypeEnum.String)

step_lambda = LambdaStep(
    name="ProcessingLambda",
    lambda_func=Lambda(
        function_arn="arn:aws:lambda:us-east-1:xxxxxxxx:function:xxxxx"
    ),
    inputs={
        "input_data": input_data,
        "input_file": trigger_file,
        "input_bucket": trigger_bucket,
    },
    outputs=[
        output_param_1, output_param_2, output_param_3,
    ],
)
In my next step, I want to trigger a Processing Job, for which I need to pass in the above Lambda function's outputs as its inputs. I'm trying to do it like so:
inputs = [
    ProcessingInput(source=step_lambda.properties.Outputs["file_name_q_c_drop"], destination="/opt/ml/processing/input"),
    ProcessingInput(source=step_lambda.properties.Outputs["file_name_a_c_drop"], destination="/opt/ml/processing/input"),
]
However, when the processing step is being created, I get a validation error saying:
Object of type Properties is not JSON serializable
I followed the data dependency docs here: https://sagemaker.readthedocs.io/en/stable/amazon_sagemaker_model_building_pipeline.html#lambdastep and tried accessing step_lambda.OutputParameters["file_name_a_c_drop"] too but it errored out saying 'LambdaStep' object has no attribute 'OutputParameters'
How do I properly access the return value of a LambdaStep in a SageMaker pipeline?
You can access the output as follows - step_lambda.OutputParameters["output1"]. You don't need to add .properties
To access a LambdaStep output in another step you can do this:
step_lambda.properties.Outputs["file_name_a_c_drop"]
Try this:
step_lambda.properties.ProcessingOutputConfig.Outputs["file_name_q_c_drop"].S3Output.S3Uri
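
For completeness, a rough sketch of how the step_lambda.properties.Outputs[...] references typically feed a downstream ProcessingStep; the processor object and step name here are placeholders, not taken from the question:

from sagemaker.processing import ProcessingInput
from sagemaker.workflow.steps import ProcessingStep

# The Lambda step's declared outputs are referenced lazily via .properties.Outputs;
# they are resolved when the pipeline executes, not when it is defined.
step_process = ProcessingStep(
    name="ProcessAfterLambda",
    processor=processor,  # e.g. an SKLearnProcessor defined elsewhere (placeholder)
    inputs=[
        ProcessingInput(
            source=step_lambda.properties.Outputs["file_name_q_c_drop"],
            destination="/opt/ml/processing/input/qc",
        ),
        ProcessingInput(
            source=step_lambda.properties.Outputs["file_name_a_c_drop"],
            destination="/opt/ml/processing/input/ac",
        ),
    ],
)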

Is there a way to specify number of AZs to use when creating a vpc?

When instantiating the Vpc object within a stack using the CDK, there is a parameter max_azs which supposedly defaults to 3. However, no matter what I set that number to, I only ever get 2 AZs.
from aws_cdk import (
    core,
    aws_ec2 as ec2
)

app = core.App()

subnets = []
subnets.append(ec2.SubnetConfiguration(name="public", subnet_type=ec2.SubnetType.PUBLIC, cidr_mask=20))
subnets.append(ec2.SubnetConfiguration(name="private", subnet_type=ec2.SubnetType.PRIVATE, cidr_mask=20))
subnets.append(ec2.SubnetConfiguration(name="isolated", subnet_type=ec2.SubnetType.ISOLATED, cidr_mask=20))

vpc = ec2.Vpc(app, "MyVpc", subnet_configuration=subnets, max_azs=3)

print(vpc.availability_zones)

app.synth()
I would expect to see 3 AZs used here, but I only ever get 2, even if I set the value to 99, which should mean all AZs.
Ah yes, I came across the same issue myself. What solved it for me was specifying the region and account when creating the stack.
The following examples are for TypeScript, but I'm sure you can write the corresponding Python.
new MyStack(app, 'MyStack', {
  env: {
    region: 'us-east-1',
    account: '1234567890',
  }
});
In the case of TypeScript you need to rebuild and synth before you deploy.
$ npm run build
$ cdk synth
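
A rough Python equivalent of the fix, assuming CDK v1-style imports as in the question (account and region values are placeholders); without an explicit env the stack is environment-agnostic and the CDK falls back to two AZs:

from aws_cdk import core, aws_ec2 as ec2

app = core.App()

# Specifying account and region makes the stack environment-aware,
# so the CDK can look up the real AZ list instead of assuming two.
stack = core.Stack(
    app, "MyStack",
    env=core.Environment(account="1234567890", region="us-east-1"),
)

subnets = [
    ec2.SubnetConfiguration(name="public", subnet_type=ec2.SubnetType.PUBLIC, cidr_mask=20),
    ec2.SubnetConfiguration(name="private", subnet_type=ec2.SubnetType.PRIVATE, cidr_mask=20),
    ec2.SubnetConfiguration(name="isolated", subnet_type=ec2.SubnetType.ISOLATED, cidr_mask=20),
]

vpc = ec2.Vpc(stack, "MyVpc", subnet_configuration=subnets, max_azs=3)

app.synth()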

How to inject variables from a file in Jenkins Declarative Pipeline?

I have a text file:
export URL = "useful url"
export NAME = "some name"
What I do is execute this file with the command source var_file.txt.
But when I do echo $URL or env.URL it returns nothing.
Please note that I can't change the file var_file.txt: it will always be in the form export VAR = value.
I know that it is possible to use the load file.groovy step in a pipeline to load variables, but the file would have to be a list of env.URL = 'url' lines, and I can't use that because I can't change the file.
We could also work with withEnv(["URL=url"]), but then I would first have to get the values from another script, which would make for a complicated solution.
So is there a way to use a file containing a list of export VAR = value lines in a Jenkins Pipeline?
What I have done is:
def varsFile = "var_file.txt"
def content = readFile varsFile
Then I go through the content line by line and turn each line into an env.variable = value statement:
def lines = content.split("\n")
for (l in lines) {
    String variable = "${l.split(" ")[1].split("=")[0]}"
    String value = l.split(" ")[1].split("=")[1]
    sh("echo env.$variable = \\\"$value\\\" >> var_to_exp.groovy")
}
And then I load the generated Groovy file with the load step in the pipeline:
load "var_to_exp.groovy"
Alternative suggestion: embed scripted pipeline (not sure if there is a genuine "declarative" way of doing this -- at least I haven't found it so far):
stage('MyStage') {
    steps {
        script {
            // <extract your variables using some Groovy>
            env.myvar = 'myvalue'
        }
        echo env.myvar
    }
}
I'm not entirely sure how much modification you are allowed to do on your input (e.g. get rid of the export etc.), or whether that has to remain an executable shell script.
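
To make the placeholder above concrete, one way the extraction could look inside the script block, parsing lines of the form export VAR = "value" directly in Groovy instead of shelling out (a sketch; the regex assumes that exact layout):

script {
    // Read the file from the workspace and parse 'export VAR = "value"' lines
    readFile('var_file.txt').readLines().each { line ->
        def matcher = line =~ /^export\s+(\w+)\s*=\s*"?([^"]*)"?\s*$/
        if (matcher.matches()) {
            // Assign each parsed variable into the pipeline environment
            env[matcher.group(1)] = matcher.group(2)
        }
    }
}
echo env.URL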

Access managed files, local files and remote files ssh during Jenkins pre-build stages

I have a parameterized build and I'd like to populate parameter values based on the contents of files/directories on the local slave and/or on a remote box accessible via SSH.
It's not a problem to access local and remote files during build stages, but I need to make it work in an Active Choices parameter (or something similar).
Apparently the sh step doesn't work there, but some Java-like Groovy API is still available (as described here: https://wiki.jenkins.io/display/JENKINS/Active+Choices+Plugin):
jenkinsURL = jenkins.model.Jenkins.instance.getRootUrl()
def propFile = vPropFile             // name of properties file
def propKey = vPropKey               // name of properties key
def relPropFileUrl = vRelPropFileUrl // userContent/properties/
def propAddress = "${jenkinsURL}${relPropFileUrl}$propFile"

def props = new Properties()
props.load(new URL(propAddress).openStream())

def choices = []
props.get(propKey.toString()).split(",").each {
    choices.add(it)
}
return choices
I wonder if it's possible to access managed files the same way or better yet to access something remotely using SSH.
Is there an API for that?
I couldn't find a solution that would allow SSHing during Active Choices parameter script execution.
However, I was able to use configuration file(s) managed by Jenkins. Here's the code that can be run from the Active Choices parameter script:
def gcf = org.jenkinsci.plugins.configfiles.GlobalConfigFiles.get()

// Read a different file based on the referenced parameter ENVIRONMENT
def deploymentFileName = 'deployment.' + ENVIRONMENT + '.properties'
def deploymentFile = gcf.getById(deploymentFileName)

def deploymentProperties = new Properties()
deploymentProperties.load(new java.io.StringReader(deploymentFile.content))

def choices = []
// Make use of the Properties object here to build the list of choices
return choices
Later, in the main Groovy script of the pipeline, it's possible to update the file the same way, but it has to be read/loaded again because the script context is different:
def gcf = org.jenkinsci.plugins.configfiles.GlobalConfigFiles.get()
def deploymentFile = gcf.getById(deploymentFileName)

def deploymentProperties = new Properties()
deploymentProperties.load(new java.io.StringReader(deploymentFile.content))

// Update deploymentProperties as necessary here.
def stringWriter = new java.io.StringWriter()
deploymentProperties.store(stringWriter, "comments")

// The content of the deploymentFile object is immutable,
// so create a new instance and reuse the same file id to overwrite the file.
def newDeploymentFile = deploymentFile.getDescriptor().newConfig(
    deploymentFile.id, deploymentFile.name, deploymentFile.comment, stringWriter.toString())
gcf.save(newDeploymentFile)
Of course, all necessary API permissions have to be granted in Jenkins.

How to get the most recent ebs snapshot using terraform datasource?

I am trying to get the most recently created snapshot using Terraform, but I don't know how to do it. According to Terraform's documentation, for an AWS AMI it can be done like this:
data "aws_ami" "web" {
filter {
name = "state"
values = ["available"]
}
filter {
name = "tag:Component"
values = ["web"]
}
most_recent = true
}
I was expecting something similar for EBS snapshots, like:
data "aws_ebs_snapshot" "latest_snapshot" {
filter {
name = "state"
values = ["available"]
}
most_recent = true
}
But there is no "most_recent" argument on the reference page for the aws_ebs_snapshot data source (here), so how can I get the most recently created snapshot using Terraform? And why can't we use the same syntax as aws_ami?
This is not available in the latest release of Terraform (v0.8.2), but the feature was merged into Terraform's master branch just a few days ago:
https://github.com/hashicorp/terraform/pull/10986
It is also listed in the CHANGELOG for the next release, v0.8.3, so it will be available soon.
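
Once you are on a release that includes it, usage should mirror the aws_ami example. A sketch of what that looks like (the volume id is illustrative; note that snapshot filters use "status" rather than "state"):

data "aws_ebs_snapshot" "latest_snapshot" {
  most_recent = true

  filter {
    name   = "status"
    values = ["completed"]
  }

  filter {
    name   = "volume-id"
    values = ["vol-0123456789abcdef0"]
  }
}

The resulting snapshot can then be referenced elsewhere, e.g. snapshot_id = data.aws_ebs_snapshot.latest_snapshot.id.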
