How to fetch SSM Parameters from two different accounts using AWS CDK - aws-cdk

I have a scenario where I'm using CodePipeline to deploy my cdk project from a tools account to several environment accounts.
The way my pipeline is deploying is by running cdk deploy from within a CodeBuild job.
My team has decided to use SSM Parameter Store to store configuration and we ended up with some parameters living in the environment account, for example the VPC_ID (resources/vpc/id) that I can read in deployment time => ssm.StringParameter.valueForStringParameter.
However, other parameters are living in the tools account, such as the Account Ids from my environment accounts (environment/nonprod/account/id) and other Global Config. I'm having trouble fetching those values.
At the moment, the only way I could think of was by using a step to read all those values in a previous step and loaded them into the context values.
Is there a more elegant approach for this problem? I was hoping I could specify in which account to get the SSM values from. Any ideas?
Thank you.

As you already stated there is no native support for that. I am also using CodePipeline in cross-account deployments, so all the automation parameters or product specified parameters are stored in a secured account and CodePipeline deploys the resources using CloudFormation as an action provider.
Cross account resolution of SSM parameters isn't supported, so in the end, I had added an extra step (stage) in my CodePipeline, which is nothing else but a CodeBuild project, which runs a script in a containerized environment and scripts then "syncs" the parameters from the automation account to the destination account.

As part of your pipeline, I would add a preliminary step to execute a Lambda. That Lambda can then execute whatever queries you wish to obtain whatever metadata/config that is required. The output from that Lambda can then be passed in to the CodeBuild step.
e.g. within the Lambda:
export class ConfigFetcher {
codepipeline = new AWS.CodePipeline();
async fetchConfig(event: CodePipelineEvent, context : Context) : Promise<void> {
// Retrieve the Job ID from the Lambda action
const jobId = event['CodePipeline.job'].id;
// now get your config by executing whatever queries you need, even cross-account, via the SDK
// we assume that the answer is in the variable someValue
const params = {
jobId: jobId,
outputVariables: {
MY_CONFIG: someValue,
},
};
// now tell CodePipeline you're done
await this.codepipeline.putJobSuccessResult(params).promise().catch(err => {
console.error('Error reporting build success to CodePipeline: ' + err);
throw err;
});
// make sure you have some sort of catch wrapping the above to post a failure to CodePipeline
// ...
}
}
const configFetcher = new ConfigFetcher();
exports.handler = async function fetchConfigMetadata(event: CodePipelineEvent, context : Context): Promise<void> {
return configFetcher.fetchConfig(event, context);
};
Assuming that you create your pipeline using CDK, then your Lambda step will be created using something like this:
const fetcherAction = new LambdaInvokeAction({
actionName: 'FetchConfigMetadata',
lambda: configFetcher,
variablesNamespace: 'ConfigMetadata',
});
Note the use of variablesNamespace: we need to refer to this later in order to retrieve the values from the Lambda's output and insert them as env variables into the CodeBuild environment.
Now our CodeBuild definition, again assuming we create using CDK:
new CodeBuildAction({
// ...
environmentVariables: {
MY_CONFIG: {
type: BuildEnvironmentVariableType.PLAINTEXT,
value: '#{ConfigMetadata.MY_CONFIG}',
},
},
We can call the variable whatever we want within CodeBuild, but note that ConfigMetadata.MY_CONFIG needs to match the namespace and output value of the Lambda.
You can have your lambda do anything you want to retrieve whatever data it needs - it's just going to need to be given appropriate permissions to reach across into other AWS accounts if required, which you can do using role assumption. Using a Lambda as a pipeline step will be a LOT faster than using a CodeBuild step in the pipeline, plus it's easier to change: if you write your Lambda code in Typescript/JS or Python, you can even use the AWS console to do in-place edits whilst you test that it executes correctly.

AFAIK there is no native way to achieve what you described. If there is way I'd like to know too. I believe you can use the CloudFormation custom resource baked by lambda for this purpose.
You can pass parameters to the lambda request and get information back from the lambda response.
See https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-custom-resources-lambda.html, https://www.2ndwatch.com/blog/a-step-by-step-guide-on-using-aws-lambda-backed-custom-resources-with-amazon-cfts/ and https://docs.aws.amazon.com/cdk/api/latest/docs/custom-resources-readme.html for more information.

This question is a year old, but a simpler method I found for retrieving parameters from your tools/deployment account is to specify them as env variables in your buildspec file. CodeBuild will always pull these from whatever account your job is running in (which in this question's scenario would be the tools account).
To pull parameters from your target environment accounts, it's best to use the CDK SSM approach suggested by the question author.

Related

Access context object when defining a Step Function workflow in the CDK?

I would like to pass in the Step Function Execution ID of the current workflow to my Lambda function when it is executed.
I see in the documentation that to "access the context object, first specify the parameter name by appending .$ to the end" however, I cannot do this as I am defining my workflow using the CDK, which does not have access to parameters. I have to use InputPath.
Despite not being able to follow the instructions I tried a few things anyway. I have tried:
new tasks.LambdaInvoke(this, "InvokeLambdaTask", {
lambdaFunction: myLambda
inputPath: "$$.Execution.id"
})
and got the following error: Invalid path '$.Execution.id' : No results for path: $['Execution']['id']
I also tried a single dollar sign:
new tasks.LambdaInvoke(this, "InvokeLambdaTask", {
lambdaFunction: myLambda
inputPath: "$.Execution.id"
})
and got Invalid path '$.Execution.id' : Missing property in path $['Execution']
Is there any way to achieve this? I've seen a few other questions asking the more or less same thing, however I cannot really make use of these answers using the CDK.
I would like to pass in the Step Function Execution ID of the current workflow to my Lambda function
The LambdaInvoke task's payload is supplied to the Lambda function as input. Note the two equivalent ways of referencing the JSONPath.
new tasks.LambdaInvoke(this, "InvokeLambdaTask", {
lambdaFunction: myLambda,
payload: sfn.TaskInput.fromObject({
executionId: sfn.JsonPath.stringAt("$$.Execution.Id"),
"alsoExecutionId.$": "$$.Execution.Id",
}),
});
The Lambda receives the Execution Id in the event payload:
{
executionId: "arn:aws:states:us-east-1:123456789012:execution:StateMachine2E01...",
alsoExecutionId: "arn:aws:states:us-east-1:123456789012:execution:StateMachine2E01..."
}
The CDK ... does not have access to parameters.
It does. The CDK renders the payload arg into the State Machine definition's Parameters:
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Lambda...",
"Parameters": {
"executionId.$": "$$.Execution.Id",
"alsoExecutionId.$": "$$.Execution.Id"
}
You have to use stepfunctions.JsonPath.stringAt("$$.Execution.id"). I'm not 100% sure why, but CDK requires you to use JsonPath.xAt() for any references.

How to use a Google Secret in a deployed Cloud Run Service (managed)?

I have a running cloud run service user-service. For test purposes I passed client secrets via environment variables as plain text. Now since everything is working fine I'd like to use a secret instead.
In the "Variables" tab of the "Edit Revision" option I can declare environment variables but I have no idea how to pass in a secret? Do I just need to pass the secret name like ${my-secret-id} in the value field of the variable? There is not documentation on how to use secrets in this tab only a hint at the top:
Store and consume secrets using Secret Manager
Which is not very helpful in this case.
You can now read secrets from Secret Manager as environment variables in Cloud Run. This means you can audit your secrets, set permissions per secret, version secrets, etc, and your code doesn't have to change.
You can point to the secrets through the Cloud Console GUI (console.cloud.google.com) or make the configuration when you deploy your Cloud Run service from the command-line:
gcloud beta run deploy SERVICE --image IMAGE_URL --update-secrets=ENV_VAR_NAME=SECRET_NAME:VERSION
Six-minute video overview: https://youtu.be/JIE89dneaGo
Detailed docs: https://cloud.google.com/run/docs/configuring/secrets
UPDATE 2021: There is now a Cloud Run preview for loading secrets to an environment variable or a volume. https://cloud.google.com/run/docs/configuring/secrets
The question is now answered however I have been experiencing a similar problem using Cloud Run with Java & Quarkus and a native image created using GraalVM.
While Cloud Run is a really interesting technology at the time of writing it lacks the ability to load secrets through the Cloud Run configuration. This has certainly added complexity in my app when doing local development.
Additionally Google's documentation is really quite poor. The quick-start lacks a clear Java example for getting a secret[1] without it being set in the same method - I'd expect this to have been the most common use case!
The javadoc itself seems to be largely autogenerated with protobuf language everywhere. There are various similarly named methods like getSecret, getSecretVersion and accessSecretVersion
I'd really like to see some improvment from Google around this. I don't think it is asking too much for dedicated teams to make libraries for common languages with proper documentation.
Here is a snippet that I'm using to load this information. It requires the GCP Secret library and also the GCP Cloud Core library for loading the project ID.
public String getSecret(final String secretName) {
LOGGER.info("Going to load secret {}", secretName);
// SecretManagerServiceClient should be closed after request
try (SecretManagerServiceClient client = buildClient()) {
// Latest is an alias to the latest version of a secret
final SecretVersionName name = SecretVersionName.of(getProjectId(), secretName, "latest");
return client.accessSecretVersion(name).getPayload().getData().toStringUtf8();
}
}
private String getProjectId() {
if (projectId == null) {
projectId = ServiceOptions.getDefaultProjectId();
}
return projectId;
}
private SecretManagerServiceClient buildClient() {
try {
return SecretManagerServiceClient.create();
} catch(final IOException e) {
throw new RuntimeException(e);
}
}
[1] - https://cloud.google.com/secret-manager/docs/reference/libraries
Google have documentation for the Secret manager client libraries that you can use in your api.
This should help you do what you want
https://cloud.google.com/secret-manager/docs/reference/libraries
Since you haven't specified a language I have a nodejs example of how to access the latest version of your secret using your project id and secret name. The reason I add this is because the documentation is not clear on the string you need to provide as the name.
const [version] = await this.secretClient.accessSecretVersion({
name: `projects/${process.env.project_id}/secrets/${secretName}/versions/latest`,
});
return version.payload.data.toString()
Be sure to allow secret manager access in your IAM settings for the service account that your api uses within GCP.
I kinda found a way to use secrets as environment variables.
The following doc (https://cloud.google.com/sdk/gcloud/reference/run/deploy) states:
Specify secrets to mount or provide as environment variables. Keys
starting with a forward slash '/' are mount paths. All other keys
correspond to environment variables. The values associated with each
of these should be in the form SECRET_NAME:KEY_IN_SECRET; you may omit
the key within the secret to specify a mount of all keys within the
secret. For example:
'--update-secrets=/my/path=mysecret,ENV=othersecret:key.json' will
create a volume with secret 'mysecret' and mount that volume at
'/my/path'. Because no secret key was specified, all keys in
'mysecret' will be included. An environment variable named ENV will
also be created whose value is the value of 'key.json' in
'othersecret'. At most one of these may be specified
Here is a snippet of Java code to get all secrets of your Cloud Run project. It requires the com.google.cloud/google-cloud-secretmanager artifact.
Map<String, String> secrets = new HashMap<>();
String projectId;
String url = "http://metadata.google.internal/computeMetadata/v1/project/project-id";
HttpURLConnection conn = (HttpURLConnection)(new URL(url).openConnection());
conn.setRequestProperty("Metadata-Flavor", "Google");
try {
InputStream in = conn.getInputStream();
projectId = new String(in.readAllBytes(), StandardCharsets.UTF_8);
} finally {
conn.disconnect();
}
Set<String> names = new HashSet<>();
try (SecretManagerServiceClient client = SecretManagerServiceClient.create()) {
ProjectName projectName = ProjectName.of(projectId);
ListSecretsPagedResponse pagedResponse = client.listSecrets(projectName);
pagedResponse
.iterateAll()
.forEach(secret -> { names.add(secret.getName()); });
for (String secretName : names) {
String name = secretName.substring(secretName.lastIndexOf("/") + 1);
SecretVersionName nameParam = SecretVersionName.of(projectId, name, "latest");
String secretValue = client.accessSecretVersion(nameParam).getPayload().getData().toStringUtf8();
secrets.put(secretName, secretValue);
}
}
Cloud Run support for referencing Secret Manager Secrets is now at general availability (GA).
https://cloud.google.com/run/docs/release-notes#November_09_2021

Failing to create services on Google Cloud Run with API using Java SDK

I create a Cloud Run client, however, couldn't find a way to list a service that is deployed with Cloud Run on GKE (for Anthos).
Create the client:
HttpTransport httpTransport = new NetHttpTransport();
JsonFactory jsonFactory = new JacksonFactory();
GoogleCredentials credential = GoogleCredentials.getApplicationDefault();
credential.createScoped("https://www.googleapis.com/auth/cloud-platform");
HttpRequestInitializer requestInitializer = new HttpCredentialsAdapter(credential);
CloudRun.Builder builder = new CloudRun.Builder(httpTransport, jsonFactory, requestInitializer);
return builder.setApplicationName(applicationName)
.setRootUrl(cloudRunRootUrl)
.build();
} catch (IOException e) {
e.printStackTrace();
}
try to list services:
services = cloudRun.namespaces().services()
.list("namespaces/default")
.execute()
.getItems();
My "hello" service is deploy on a GKE cluster under the namespace default. The above code doesn't work because the client always see "default" as project_id and complains about permission stuff. If I put the project_id rather than "default", permission errors are gone, but no services will be found.
I tried another project that does have Google fully-managed cloud run services, the same code returns result (with .list("namespaces/")).
How to access the service on GKE?
And my next question would be, how to programmatically create Cloud Run services on GKE?
Edit - for creating a service
As I couldn't figure out how to interact with Cloud Run on GKE, I took a step back to try fully managed one. The following code to create a service fails, and the error message just doesn't provide much useful insight, how to make it work?
Service deployedService = null;
// Map<String,String> annotations = new HashMap<>();
// annotations.put("client.knative.dev/user-image","gcr.io/cloudrun/hello");
ServiceSpec spec = new ServiceSpec();
List<Container> containers = new ArrayList<>();
containers.add(new Container().setImage("gcr.io/cloudrun/hello"));
spec.setTemplate(new RevisionTemplate().setMetadata(new ObjectMeta().setName("hello-fully-managed-v0.1.0"))
.setSpec(new RevisionSpec().setContainerConcurrency(20)
.setContainers(containers)
.setTimeoutSeconds(100)
)
);
helloService.setApiVersion("serving.knative.dev/v1")
.setMetadata(new ObjectMeta().setName("hello-fully-managed")
.setNamespace("data-infrastructure-test-env")
// .setAnnotations(annotations)
)
.setSpec(spec)
.setKind("Service");
try {
deployedService = cloudRun.namespaces().services()
.create("namespaces/data-infrastructure-test-env",helloService)
.execute();
} catch (IOException e) {
e.printStackTrace();
response.add(e.toString());
return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(response);
}
Error message I got:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "The request has errors",
"reason" : "badRequest"
} ],
"message" : "The request has errors",
"status" : "INVALID_ARGUMENT"
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
And the base_url is: https://europe-west1-run.googleapis.com
Your question is quite detailed (and is about Java which I am no expert in) and there are actually too many questions in there (ideally, please ask only 1 question here). However, I'll try to answer a few things you asked:
First, Cloud Run (managed, and on GKE) both implement the Knative Serving API. I've explained this at https://ahmet.im/blog/cloud-run-is-a-knative/ In fact, Cloud Run on GKE is just the open source Knative components installed to your cluster.
And my next question would be, how to programmatically create Cloud Run services on GKE?
You will have a very hard time (if possible at all) using the Cloud Run API client libraries (e.g. new CloudRun above) because these are designed for *.googleapis.com endpoints.
The Knative API part of "Cloud Run on GKE" is actually just your Kubernetes (GKE) master API endpoint (which runs on an IP address, with a TLS certificate that isn't trusted by root CAs, but you can find the CA cert in GKE GetCluster API call to verify the cert.) The TLS is part is why it's so hard to use the API Client libraries.
Knative APIs are just Kubernetes objects. So your best bet is one of these:
See Kubernetes java client (https://github.com/kubernetes-client/java) actually allows dynamic objects. (Go implementation does) and try to use that to create Knative CRDs.
Use kubectl apply.
Ask Knative Serving open source repository for help (they should be providing client libraries, maybe they're already there I'm not sure)
To program Cloud Run (managed) with the API Client Libraries, you need to explicitly override the API endpoint to the region e.g. us-central1-run.googleapis.com. (This is documented on each API call's REST API reference documentation.)
I have written a blog post in detail (with sample code in Go) on how to create/update services on Cloud Run (managed) using the Knative Serving API here: https://ahmet.im/blog/gcloud-run-deploy/
If you want to see how gcloud run deploy works, and which APIs it calls, you can pass --log-http option to observe the request/response traffic.
As for the error you got, it seems like the error message isn't helpful, but it might be coming from anywhere (as you're trying to imitate Knative API in GCP client libraries). I recommend reading my blog posts and sample code in depth.
UPDATES: Our engineering team's looking at the issue, it appears that there's currently a bug not adding the "details" field to the error. That's being worked on.
In your case, we see the following errors from requests:
field: "spec.template.spec"
description: "Missing template spec."
Means you are not properly filling up the spec field as I shown in my blog post and sample code.
field: "metadata.name"
description: "The revision name must be prefixed by the name of the enclosing Service or Configuration with a trailing -"
Make sure the name you are specifying adheres the patterns specified in API docs. Try to create that name manually perhaps in the UI or gcloud CLI.
field: "api_version"
description: "Unsupported API version \'serving.knative.dev/v1\'. Expected \'serving.knative.dev/v1alpha1\'"
Do not use v1alpha1 API, use v1 directly.
We'll try to get the details to the error message, however it appears that you need to study the sample code I linked in my blog post more in detail:
https://github.com/GoogleCloudPlatform/cloud-run-button/blob/a52c7fbaae33a3e06c112206c7227a0ef9649647/cmd/cloudshell_open/deploy.go#L26-L112
The Java SDK is automatically generated from the fact that the Cloud Run (fully managed) API is public. It does not support Cloud Run for Anthos.
(gcloud.run.deploy) The revision name must be prefixed by the name of the enclosing Service or Configuration with a trailing -revision name
revision name name should be 65 character then problem will be resolved in Automation pipeline with GCP revision suffix should be less revision name is the combination of (service name +revision suffix) will automatically created by GCP.

Is passing environment variables to sapper's client side secure with Rollup Replace?

I am using replace in my rollup configuration for sapper and sapper-environment to pass environment variables to the client side in sapper - is this secure? Is there a better/safer way to approach this?
Using this config below:
rollup.config.js
const sapperEnv = require('sapper-environment');
export default {
client: {
input: config.client.input(),
output: config.client.output(),
plugins: [
replace({
...sapperEnv(),
'process.browser': true,
'process.env.NODE_ENV': JSON.stringify(mode)
})
...
And then this allows me to use the variables in stores.js:
import { writable } from 'svelte/store';
import Client from 'shopify-buy';
const key = process.env.SAPPER_APP_SHOPIFY_KEY;
const domain = process.env.SAPPER_APP_SHOPIFY_DOMAIN;
// Initialize a client
const client = Client.buildClient({
domain: domain,
storefrontAccessToken: key
});
export { key, domain, client };
I have tried running this in server,js and passing the variables through the session data, but client side no matter what I do they always seems to return 'undefined'.
There are two questions here — a) is it secure, and b) why are the values undefined?
The answer to the first question is 'no'. Any time you include credentials in JavaScript that gets served to the client (or in session data), you're making those credentials available to anyone who knows how to look for them. If you need to avoid that, you'll need your server (or another server) to make requests on behalf of authenticated clients.
As for the second part, it's very hard to tell without a reproduction unfortunately!

easiest way to schedule a Google Cloud Dataflow job

I just need to run a dataflow pipeline on a daily basis, but it seems to me that suggested solutions like App Engine Cron Service, which requires building a whole web app, seems a bit too much.
I was thinking about just running the pipeline from a cron job in a Compute Engine Linux VM, but maybe that's far too simple :). What's the problem with doing it that way, why isn't anybody (besides me I guess) suggesting it?
This is how I did it using Cloud Functions, PubSub, and Cloud Scheduler
(this assumes you've already created a Dataflow template and it exists in your GCS bucket somewhere)
Create a new topic in PubSub. this will be used to trigger the Cloud Function
Create a Cloud Function that launches a Dataflow job from a template. I find it easiest to just create this from the CF Console. Make sure the service account you choose has permission to create a dataflow job. the function's index.js looks something like:
const google = require('googleapis');
exports.triggerTemplate = (event, context) => {
// in this case the PubSub message payload and attributes are not used
// but can be used to pass parameters needed by the Dataflow template
const pubsubMessage = event.data;
console.log(Buffer.from(pubsubMessage, 'base64').toString());
console.log(event.attributes);
google.google.auth.getApplicationDefault(function (err, authClient, projectId) {
if (err) {
console.error('Error occurred: ' + err.toString());
throw new Error(err);
}
const dataflow = google.google.dataflow({ version: 'v1b3', auth: authClient });
dataflow.projects.templates.create({
projectId: projectId,
resource: {
parameters: {},
jobName: 'SOME-DATAFLOW-JOB-NAME',
gcsPath: 'gs://PATH-TO-YOUR-TEMPLATE'
}
}, function(err, response) {
if (err) {
console.error("Problem running dataflow template, error was: ", err);
}
console.log("Dataflow template response: ", response);
});
});
};
The package.json looks like
{
"name": "pubsub-trigger-template",
"version": "0.0.1",
"dependencies": {
"googleapis": "37.1.0",
"#google-cloud/pubsub": "^0.18.0"
}
}
Go to PubSub and the topic you created, manually publish a message. this should trigger the Cloud Function and start a Dataflow job
Use Cloud Scheduler to publish a PubSub message on schedule
https://cloud.google.com/scheduler/docs/tut-pub-sub
There's absolutely nothing wrong with using a cron job to kick off your Dataflow pipelines. We do it all the time for our production systems, whether it be our Java or Python developed pipelines.
That said however, we are trying to wean ourselves off cron jobs, and move more toward using either AWS Lambdas (we run multi cloud) or Cloud Functions. Unfortunately, Cloud Functions don't have scheduling yet. AWS Lambdas do.
There is a FAQ answer to that question:
https://cloud.google.com/dataflow/docs/resources/faq#is_there_a_built-in_scheduling_mechanism_to_execute_pipelines_at_given_time_or_interval
You can automate pipeline execution by using Google App Engine (Flexible Environment only) or Cloud Functions.
You can use Apache Airflow's Dataflow Operator, one of several Google Cloud Platform Operators in a Cloud Composer workflow.
You can use custom (cron) job processes on Compute Engine.
The Cloud Function approach is described as "Alpha" and it's still true that they don't have scheduling (no equivalent to AWS cloudwatch scheduling event), only Pub/Sub messages, Cloud Storage changes, HTTP invocations.
Cloud composer looks like a good option. Effectively a re-badged Apache Airflow, which is itself a great orchestration tool. Definitely not "too simple" like cron :)
You can use cloud scheduler to schedule your job as well. See my post
https://medium.com/#zhongchen/schedule-your-dataflow-batch-jobs-with-cloud-scheduler-8390e0e958eb
Terraform script
data "google_project" "project" {}
resource "google_cloud_scheduler_job" "scheduler" {
name = "scheduler-demo"
schedule = "0 0 * * *"
# This needs to be us-central1 even if the app engine is in us-central.
# You will get a resource not found error if just using us-central.
region = "us-central1"
http_target {
http_method = "POST"
uri = "https://dataflow.googleapis.com/v1b3/projects/${var.project_id}/locations/${var.region}/templates:launch?gcsPath=gs://zhong-gcp/templates/dataflow-demo-template"
oauth_token {
service_account_email = google_service_account.cloud-scheduler-demo.email
}
# need to encode the string
body = base64encode(<<-EOT
{
"jobName": "test-cloud-scheduler",
"parameters": {
"region": "${var.region}",
"autoscalingAlgorithm": "THROUGHPUT_BASED",
},
"environment": {
"maxWorkers": "10",
"tempLocation": "gs://zhong-gcp/temp",
"zone": "us-west1-a"
}
}
EOT
)
}
}

Resources