How to set up AWS CDK app execution in AWS CodeBuild? - aws-cdk

I want to run AWS CDK synthesis from a Git repository using AWS CodeBuild, i.e. if I update the CDK app code in the repo, I want the CloudFormation stacks to be updated automatically. What are the best practices for setting up build role permissions?

For a GitHub repository, your CodeBuild role doesn't need additional permissions, but the project needs access to an oauthToken to access GitHub.
For a CodeCommit repository, create or import a codecommit.Repository object and use a CodeCommitSource object for your source parameter; the build role permissions will then be set up automatically (in particular, the role will be granted codecommit:GitPull on the indicated repository).
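Either way, the CodeBuild project itself just needs a buildspec that installs the CDK and runs it. A minimal sketch (the runtime version and commands are illustrative, and the build role still needs permission to deploy the CloudFormation stacks and publish assets):

```yaml
# buildspec.yml (illustrative sketch, adjust to your project)
version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: 14
    commands:
      - npm install -g aws-cdk
      - npm ci
  build:
    commands:
      # Synthesize and deploy all stacks without an interactive approval prompt
      - npx cdk deploy --all --require-approval never
```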
You might also be interested in CDK's app-delivery package. It doesn't just create a CodeBuild project, though: it uses CodePipeline to fetch, build and deploy a CDK application, so it might be more than you are looking for.

A month ago, AWS released a new module in the CDK suite called pipelines, which includes several utilities that ease the job of setting up self-mutating pipelines. In addition, there's codepipeline-actions, which includes constructs to hook your pipeline up to CodeCommit, GitHub, BitBucket, etc.
Here's a complete example (verbatim from the linked blog post), using GitHub as a source, that deploys a Lambda through CodePipeline:
Create a stage with your stack
import { CfnOutput, Construct, Stage, StageProps } from '@aws-cdk/core';
import { CdkpipelinesDemoStack } from './cdkpipelines-demo-stack';

/**
 * Deployable unit of web service app
 */
export class CdkpipelinesDemoStage extends Stage {
  public readonly urlOutput: CfnOutput;

  constructor(scope: Construct, id: string, props?: StageProps) {
    super(scope, id, props);

    const service = new CdkpipelinesDemoStack(this, 'WebService');

    // Expose CdkpipelinesDemoStack's output one level higher
    this.urlOutput = service.urlOutput;
  }
}
Create a stack with your pipeline
import * as codepipeline from '@aws-cdk/aws-codepipeline';
import * as codepipeline_actions from '@aws-cdk/aws-codepipeline-actions';
import { Construct, SecretValue, Stack, StackProps } from '@aws-cdk/core';
import { CdkPipeline, SimpleSynthAction } from "@aws-cdk/pipelines";

/**
 * The stack that defines the application pipeline
 */
export class CdkpipelinesDemoPipelineStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const sourceArtifact = new codepipeline.Artifact();
    const cloudAssemblyArtifact = new codepipeline.Artifact();

    const pipeline = new CdkPipeline(this, 'Pipeline', {
      // The pipeline name
      pipelineName: 'MyServicePipeline',
      cloudAssemblyArtifact,

      // Where the source can be found
      sourceAction: new codepipeline_actions.GitHubSourceAction({
        actionName: 'GitHub',
        output: sourceArtifact,
        oauthToken: SecretValue.secretsManager('github-token'),
        owner: 'OWNER',
        repo: 'REPO',
      }),

      // How it will be built and synthesized
      synthAction: SimpleSynthAction.standardNpmSynth({
        sourceArtifact,
        cloudAssemblyArtifact,

        // We need a build step to compile the TypeScript Lambda
        buildCommand: 'npm run build',
      }),
    });

    // This is where we add the application stages
    // ...
  }
}

Related

How can I reference my constant within a Jenkins Parameter?

I have the following code in a Pipelineconstant.groovy file:
public static final list ACTION_CHOICES = [
    N_A,
    FULL_BLUE_GREEN,
    STAGE,
    FLIP,
    CLEANUP
]
and this parameters block in my Jenkins wrapper file:
parameters {
    string (name: 'ChangeTicket', defaultValue: '000000', description : 'Prod change ticket otherwise 000000')
    choice (name: 'AssetAreaName', choices: ['fpukviewwholeof', 'fpukdocrhs', 'fpuklegstatus', 'fpukbooksandjournals', 'fpukleglinks', 'fpukcasesoverview'], description: 'Select the AssetAreaName.')
    /* groovylint-disable-next-line DuplicateStringLiteral */
    choice (name: 'AssetGroup', choices: ['pdc1c', 'pdc2c'])
}
I would like to reference ACTION_CHOICES in the parameters like this:
choice (name: 'Action', choices: constants.ACTION_CHOICES, description: 'Multi Version deployment actions')
but it doesn't work for me.
You're almost there! A Jenkinsfile can be extended with variables/constants defined either directly in the file or, better, in a Jenkins shared library (your scenario).
The parameter syntax in your pipeline was fine, as was the idea of a list of constants; what was missing was a proper link between those parts: the library import. See the example below. The names used in the example are not carved in stone and can of course be changed, but watch out: Jenkins is quite sensitive about filenames and paths, especially in shared libraries.
Pipelineconstant.groovy should be placed in src/org/pipelines of your Jenkins shared library.
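That is, a typical shared-library repository layout looks roughly like this (the library name jsl-constants matches the import used below; the vars/ directory is only needed if you also define global steps):

```
jsl-constants/                  (shared library repository)
├── src/
│   └── org/
│       └── pipelines/
│           └── Pipelineconstant.groovy
└── vars/
```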
Pipelineconstant.groovy
package org.pipelines

class Pipelineconstant {
    public static final List<String> ACTION_CHOICES = ["N_A", "FULL_BLUE_GREEN", "STAGE", "FLIP", "CLEANUP"]
}
and then you can reference this list of constants within your Jenkinsfile pipeline.
Jenkinsfile
@Library('jsl-constants') _
import org.pipelines.Pipelineconstant

pipeline {
    agent any
    parameters {
        choice (name: 'Action', choices: Pipelineconstant.ACTION_CHOICES, description: 'Multi Version deployment actions')
    }
    // rest of your pipeline code
}
The first two lines of the pipeline are important: the first loads the shared library itself, which is what makes the import on the second line work (otherwise Jenkins would not know where to find Pipelineconstant.groovy).
Alternatively, without a Jenkins shared library (files in one repo):
I've found this topic discussed and solved for scripted pipeline here: Load jenkins parameters from external groovy file

AWS CDK Pipelines using with an existing codepipeline

The documentation of @aws-cdk/pipelines seems to suggest that a CDK pipeline can be added to an existing @aws-cdk/aws-codepipeline Pipeline, using the codePipeline prop: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_pipelines.CodePipeline.html
codePipeline?: Pipeline (an existing Pipeline to be reused and built upon)
However, I am not able to get this to work and am experiencing multiple errors at the cdk synth step, depending on how I try to set it up. As far as I can tell there isn't really any documentation yet to cover this scenario.
Essentially, we are trying to create a pipeline that runs something like:
clone
lint / typecheck / unit test
cdk deploy to test environment
integration tests
deploy to preprod
smoke test
manual approval
deploy to prod
I guess it's just not clear to me what the difference is between this CodePipeline pipeline and the CDK pipeline. Also, the naming convention for stages seems a little unclear; see this issue: https://github.com/aws/aws-cdk/issues/15945
See: https://github.com/ChrisSargent/cdk-issues/blob/pipelines/lib/cdk-test-stack.ts and below:
import * as cdk from "@aws-cdk/core";
import * as pipelines from "@aws-cdk/pipelines";
import * as codepipeline from "@aws-cdk/aws-codepipeline";
import * as codepipeline_actions from "@aws-cdk/aws-codepipeline-actions";

export class CdkTestStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const cdkInput = pipelines.CodePipelineSource.gitHub(
      "ChrisSargent/cdk-issues",
      "pipelines"
    );

    // Setup the code source action
    const sourceOutput = new codepipeline.Artifact();
    const sourceAction = new codepipeline_actions.GitHubSourceAction({
      owner: "ChrisSargent",
      repo: "cdk-issues",
      branch: "pipelines",
      actionName: "SourceAction",
      output: sourceOutput,
      oauthToken: cdk.SecretValue.secretsManager("git/ChrisSargent"),
    });

    const pipeline = new codepipeline.Pipeline(this, "Pipeline", {
      stages: [
        {
          actions: [sourceAction],
          stageName: "GitSource",
        },
      ],
    });

    const cdkPipeline = new pipelines.CodePipeline(this, "CDKPipeline", {
      codePipeline: pipeline,
      synth: new pipelines.ShellStep("Synth", {
        // Without input, we get: Error: CodeBuild action 'Synth' requires an input (and the pipeline doesn't have a Source to fall back to). Add an input or a pipeline source.
        // With input, we get: Error: Validation failed with the following errors: Source actions may only occur in first stage
        input: cdkInput,
        commands: ["yarn install --frozen-lockfile", "npx cdk synth"],
      }),
    });

    // Produces: Stage 'PreProd' must have at least one action
    // pipeline.addStage(new MyApplication(this, "PreProd"));

    // Produces: The given Stage construct ('CdkTestStack/PreProd') should contain at least one Stack
    cdkPipeline.addStage(new MyApplication(this, "PreProd"));
  }
}

class MyApplication extends cdk.Stage {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StageProps) {
    super(scope, id, props);
    console.log("Nothing to deploy");
  }
}
Any guidance or experience with this would be much appreciated.
I'm able to achieve something similar by adding waves/stages with only pre and post steps to the CDK pipeline. Sample code is listed below, amending your original snippet:
import * as cdk from "@aws-cdk/core";
import * as pipelines from "@aws-cdk/pipelines";
import * as codepipeline from "@aws-cdk/aws-codepipeline";
import * as codepipeline_actions from "@aws-cdk/aws-codepipeline-actions";

export class CdkTestStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const cdkInput = pipelines.CodePipelineSource.gitHub(
      "ChrisSargent/cdk-issues",
      "pipelines"
    );

    const cdkPipeline = new pipelines.CodePipeline(this, "CDKPipeline", {
      selfMutation: true,
      crossAccountKeys: true, // can be false if you don't need to deploy to a different account.
      pipelineName: "my-pipeline",
      synth: new pipelines.ShellStep("Synth", {
        input: cdkInput,
        commands: ["yarn install --frozen-lockfile", "npx cdk synth"],
        primaryOutputDirectory: 'cdk.out'
      }),
    });

    // add any additional test steps here; they will run in parallel within a wave
    cdkPipeline.addWave('test', { post: [provideUnitTestStep('unitTest')] });

    // add a manual approval step if needed.
    cdkPipeline.addWave('promotion', { post: [new pipelines.ManualApprovalStep('PromoteToUat')] });

    cdkPipeline.addStage(new MyApplication(this, "PreProd"));
  }
}

class MyApplication extends cdk.Stage {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StageProps) {
    super(scope, id, props);
    console.log("Nothing to deploy");
  }
}
Worth noting is that you might need to convert your CodeBuild action to the new CDK CodeBuildStep. A sample unit test step looks like this:
import * as codebuild from "@aws-cdk/aws-codebuild";
import * as pipelines from "@aws-cdk/pipelines";

const provideUnitTestStep = (
  id: string
): pipelines.CodeBuildStep => {
  const props: pipelines.CodeBuildStepProps = {
    partialBuildSpec: codebuild.BuildSpec.fromObject({
      version: '0.2',
      env: {
        variables: {
          DEFINE_VARIABLES: 'someVariables'
        }
      },
      phases: {
        install: {
          commands: [
            'install some dependencies',
          ]
        },
        build: {
          commands: [
            'run some test!'
          ]
        }
      }
    }),
    commands: [],
    buildEnvironment: {
      buildImage: codebuild.LinuxBuildImage.STANDARD_5_0
    }
  };
  return new pipelines.CodeBuildStep(id, props);
};
It's not trivial to retrieve the underlying CodeBuild project role; you will need to pass the rolePolicyStatements property in the CodeBuildStep props to grant any extra permissions needed for your tests.
First of all, the error "Pipeline must have at least two stages" is correct: you only have the GitHub checkout/clone as a single stage.
For a second stage, you could use a CodeBuild project to compile/lint/unit test... as you mentioned.
However, what would you like to do with your compiled artifacts then?
Build containers to deploy them later?
If so, there are better ways with CDK of doing this (DockerImageAsset).
This also could save up your preexisting pipeline and you can use the CDK Pipeline directly.
Could you try setting the property restartExecutionOnUpdate: true on your regular Pipeline, as in the following snippet?
const pipeline = new codepipeline.Pipeline(this, "Pipeline", {
  restartExecutionOnUpdate: true,
  stages: [
    {
      actions: [sourceAction],
      stageName: "GitSource",
    },
  ],
});
This is needed for the self-mutation capability of the CDK pipeline.
This happened to me when I was creating a pipeline in a stack without an explicitly defined account and region.
Check that you have env set, like this:
new CdkStack(app, 'CdkStack', {
  env: {
    account: awsProdAccount,
    region: defaultRegion,
  }
});

How to run database migrations within a CDK Pipeline

Is there a good pattern for running database migrations within a CDK Pipeline?
Normally (without a CDK Pipeline) I would achieve this with a deploy script that:
deploys the database stack
waits for the database stack to complete
runs the db migrations
deploys the API stack
Is there any way to do this in a CDK Pipeline app (run migrations after the Database stack is deployed but before the API stack is)?
export class MyStage extends Stage {
  constructor(scope: Construct, id: string, props?: StageProps) {
    super(scope, id, props);

    const dbStack = new DatabaseStack(this, 'Database');
    const apiStack = new ApiStack(this, 'Api', {
      dbUrl: dbStack.dbUrl
    });
  }
}
Things like this I would put in a custom resource: https://docs.aws.amazon.com/cdk/api/latest/docs/custom-resources-readme.html
Basically, you write a Lambda that handles the CREATE event and takes the database as a property; its code is what would normally be your migration script. You can ignore the update/delete events, or perhaps do a data backup on delete. Just remember that the events are for the custom resource, not necessarily the database (even though they may coincide).
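As a rough sketch of such a handler (the DbUrl property, run_migrations, and the returned labels are illustrative, not a real API; a complete handler must also report SUCCESS/FAILED back to the pre-signed ResponseURL in the event so CloudFormation doesn't hang waiting):

```python
def run_migrations(db_url):
    # Placeholder: invoke your real migration tool here (Flyway, Alembic, Knex, ...).
    print(f"running migrations against {db_url}")

def dispatch(event):
    """Decide what to do for a custom-resource lifecycle event.

    Returns a label describing what was done (handy for logging and tests).
    """
    request_type = event["RequestType"]  # "Create", "Update" or "Delete"
    if request_type == "Create":
        run_migrations(event["ResourceProperties"]["DbUrl"])
        return "migrated"
    # Updates and deletes are acknowledged but ignored here; a real handler
    # might re-run migrations on Update or back up data on Delete.
    return "ignored"

def handler(event, context):
    # Lambda entry point. In production, wrap dispatch() in try/except and
    # PUT a SUCCESS/FAILED response body to event["ResponseURL"].
    return dispatch(event)
```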

How to create VPC that can be shared across stacks?

I am trying to wrap my head around how to create a reusable VPC that can be shared across multiple stacks with the AWS CDK. I want to create a separate stack per project and then import the VPC that should be assigned to each of them. I also want a structure where I can deploy different stacks at different times (meaning: I do not want to deploy all stacks at once).
I have tried the following approach, but it creates a new VPC per stack, which is not what I want: instead, the VPC should be created once and then simply reused if it already exists.
app.ts
import cdk = require('@aws-cdk/core');
import { Stack1 } from '../lib/stack1';
import { Stack2 } from '../lib/stack2';

const app = new cdk.App();
new Stack1(app, "Stack1");
new Stack2(app, "Stack2");
stack1.ts
import cdk = require('@aws-cdk/core');
import { Configurator } from './configurators/configurator'

export class Stack1 extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const configurator = new Configurator(scope, "Stack1");
    // later reuse vpc from configurator using configurator.vpc
  }
}
stack2.ts
import cdk = require('@aws-cdk/core');
import { Configurator } from './configurators/configurator'

export class Stack2 extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const configurator = new Configurator(scope, "Stack2");
    // later reuse vpc from configurator using configurator.vpc
  }
}
configurator.ts
import cdk = require('@aws-cdk/core');
import ec2 = require("@aws-cdk/aws-ec2");

export class Configurator {
  vpc: ec2.Vpc;

  constructor(scope: cdk.Construct, name: string) {
    this.vpc = new ec2.Vpc(scope, "MyVPC", {
      maxAzs: 3
    });
  }
}
After running
cdk synth
cdk deploy Stack1
cdk deploy Stack2
this creates two VPCs instead of reusing one, as I would like. I will deploy the stacks to the same account and region.
How can I change my approach to achieve the output I am looking for? I want to be able to deploy my stacks independently of each other.
If you intend to reuse the VPC in different stacks, I'd recommend placing it in a separate stack, since your VPC stack will have a different lifecycle than your application stacks.
Here's what I'd do. I hope you don't mind a bit of Python :)
First, define your VPC in VpcStack:
class VpcStack(core.Stack):
    def __init__(self, app: core.App, id: str, **kwargs) -> None:
        super().__init__(app, id, **kwargs)
        aws_ec2.Vpc(self, 'MyVPC', max_azs=3)
Then look it up in another stack:
class Stack1(core.Stack):
    def __init__(self, app: core.App, id: str, **kwargs) -> None:
        super().__init__(app, id, **kwargs)
        # Lookup the VPC named 'MyVPC' created in stack 'vpc-stack'
        my_vpc = aws_ec2.Vpc.from_lookup(self, 'MyVPC', vpc_name='vpc-stack/MyVPC')
        # You can now use the VPC in an ECS cluster, etc.
And this would be your cdk_app.py (note that Vpc.from_lookup requires the stacks to have an explicit account and region in their env):
app = core.App()
vpc = VpcStack(app, 'vpc-stack')
stack1 = Stack1(app, 'stack1')
I tried 0x32e0edfb's answer and ran into a problem, so I fixed it like this.
VPC Stack
class VpcStack(core.Stack):
    def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        self.eks_vpc = ec2.Vpc(self, 'eks-vpc',
            cidr='10.1.0.0/16',
            max_azs=2
        )
share VPC to other Stack
class EksClusterStack(core.Stack):
    def __init__(self, scope: core.Construct, id: str, props: ec2.Vpc, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        cluster = eks.Cluster(self, 'eks-control-plane',
            vpc=props,
            default_capacity=0
        )
and then the app.py file:
app = core.App()
vpc_stack = VpcStack(app, 'vpc-stack')
eks_cluster_stack = EksClusterStack(app, 'eks-cluster', vpc_stack.eks_vpc)
eks_cluster_stack.add_dependency(vpc_stack)
app.synth()
from_lookup is better used on an already existing VPC, so I chose to share the VPC object between stacks instead.
from_lookup only makes the API call once; the data is then cached in the cdk.context.json file, which should be committed to source control.
The problem showed up when I recreated the same VPC: cdk.context.json was not updated, so from_lookup kept returning the old VPC ID.
I had to run cdk context --clear and deploy again; cdk.context.json then picked up the latest VPC ID, and from_lookup worked properly.
ref:
https://github.com/aws/aws-cdk/blob/master/packages/%40aws-cdk/aws-eks/test/integ.eks-kubectl.lit.ts
https://docs.aws.amazon.com/cdk/latest/guide/context.html

How do I integration test a Dataflow pipeline writing to Bigtable?

According to the Beam website,
Often it is faster and simpler to perform local unit testing on your pipeline code than to debug a pipeline’s remote execution.
I want to use test-driven development for my Beam/Dataflow app that writes to Bigtable for this reason.
However, following the Beam testing documentation I reach an impasse: PAssert isn't useful, because the output PCollection contains org.apache.hadoop.hbase.client.Put objects, which don't override the equals method.
I can't get the contents of the PCollection to do validation on them either, since:
It is not possible to get the contents of a PCollection directly - an Apache Beam or Dataflow pipeline is more like a query plan of what processing should be done, with PCollection being a logical intermediate node in the plan, rather than containing the data.
So how can I test this pipeline, other than manually running it? I'm using Maven and JUnit (in Java since that's all the Dataflow Bigtable Connector seems to support).
The Bigtable Emulator Maven plugin can be used to write integration tests for this:
Configure the Maven Failsafe plugin and change your test case's ending from *Test to *IT to run as an integration test.
Install the Bigtable Emulator in the gcloud sdk on command line:
gcloud components install bigtable
Note that this required step is going to reduce code portability (e.g. will it run on your build system? On other devs' machines?) so I'm going to containerize it using Docker before deploying to the build system.
Add the emulator plugin to the pom per the README
Use the HBase Client API and see the example Bigtable Emulator integration test to set up your session and table(s).
Write your test as normal per the Beam documentation, except instead of using PAssert actually call CloudBigtableIO.writeToTable and then use the HBase Client to read the data from the table to verify it.
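For the pom change mentioned above, the plugin declaration looks roughly like this (coordinates, version, and goal bindings are assumptions from memory; take the exact values from the emulator plugin's README):

```xml
<!-- Hypothetical/abbreviated: verify coordinates and version against the README. -->
<plugin>
  <groupId>com.google.cloud.bigtable</groupId>
  <artifactId>bigtable-emulator-maven-plugin</artifactId>
  <version>...</version>
  <executions>
    <execution>
      <goals>
        <goal>start</goal> <!-- launches the emulator before integration tests -->
        <goal>stop</goal>  <!-- shuts it down afterwards -->
      </goals>
    </execution>
  </executions>
</plugin>
```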
Here's an example integration test:
package adair.example;

import static org.apache.hadoop.hbase.util.Bytes.toBytes;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.hamcrest.collection.IsIterableContainingInAnyOrder;
import org.junit.Assert;
import org.junit.Test;

import com.google.cloud.bigtable.beam.CloudBigtableIO;
import com.google.cloud.bigtable.beam.CloudBigtableTableConfiguration;
import com.google.cloud.bigtable.hbase.BigtableConfiguration;

/**
 * A simple integration test example for use with the Bigtable Emulator maven plugin.
 */
public class DataflowWriteExampleIT {

  private static final String PROJECT_ID = "fake";
  private static final String INSTANCE_ID = "fakeinstance";
  private static final String TABLE_ID = "example_table";
  private static final String COLUMN_FAMILY = "cf";
  private static final String COLUMN_QUALIFIER = "cq";

  private static final CloudBigtableTableConfiguration TABLE_CONFIG =
      new CloudBigtableTableConfiguration.Builder()
          .withProjectId(PROJECT_ID)
          .withInstanceId(INSTANCE_ID)
          .withTableId(TABLE_ID)
          .build();

  public static final List<String> VALUES_TO_PUT = Arrays
      .asList("hello", "world", "introducing", "Bigtable", "plus", "Dataflow", "IT");

  @Test
  public void testPipelineWrite() throws IOException {
    try (Connection connection = BigtableConfiguration.connect(PROJECT_ID, INSTANCE_ID)) {
      Admin admin = connection.getAdmin();
      createTable(admin);

      List<Mutation> puts = createTestPuts();

      // Use Dataflow to write the data--this is where you'd call the pipeline you want to test.
      Pipeline p = Pipeline.create();
      p.apply(Create.of(puts)).apply(CloudBigtableIO.writeToTable(TABLE_CONFIG));
      p.run().waitUntilFinish();

      // Read the data from the table using the regular hbase api for validation
      ResultScanner scanner = getTableScanner(connection);
      List<String> resultValues = new ArrayList<>();
      for (Result row : scanner) {
        String cellValue = getRowValue(row);
        System.out.println("Found value in table: " + cellValue);
        resultValues.add(cellValue);
      }

      Assert.assertThat(resultValues,
          IsIterableContainingInAnyOrder.containsInAnyOrder(VALUES_TO_PUT.toArray()));
    }
  }

  private void createTable(Admin admin) throws IOException {
    HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf(TABLE_ID));
    tableDesc.addFamily(new HColumnDescriptor(COLUMN_FAMILY));
    admin.createTable(tableDesc);
  }

  private ResultScanner getTableScanner(Connection connection) throws IOException {
    Scan scan = new Scan();
    Table table = connection.getTable(TableName.valueOf(TABLE_ID));
    return table.getScanner(scan);
  }

  private String getRowValue(Result row) {
    return Bytes.toString(row.getValue(toBytes(COLUMN_FAMILY), toBytes(COLUMN_QUALIFIER)));
  }

  private List<Mutation> createTestPuts() {
    return VALUES_TO_PUT
        .stream()
        .map(this::stringToPut)
        .collect(Collectors.toList());
  }

  private Mutation stringToPut(String cellValue) {
    String key = UUID.randomUUID().toString();
    Put put = new Put(toBytes(key));
    put.addColumn(toBytes(COLUMN_FAMILY), toBytes(COLUMN_QUALIFIER), toBytes(cellValue));
    return put;
  }
}
In Google Cloud you can easily do e2e testing of your Dataflow pipeline using real cloud resources like Pub/Sub topics and BigQuery tables.
By using the JUnit 5 extension model (https://junit.org/junit5/docs/current/user-guide/#extensions) you can create custom classes that handle the creation and deletion of the resources your pipeline requires.
You can find a demo/seed project here https://github.com/gabihodoroaga/dataflow-e2e-demo and a blog post here https://hodo.dev/posts/post-31-gcp-dataflow-e2e-tests/.
