Currently I see that we can run tasks across multiple platforms in SCDF, and I see samples for Kubernetes and CF.
https://dataflow.spring.io/docs/recipes/multi-platform-deployment/multi-platform-task/
Do we have a similar example for local deployment? I am looking at deploying tasks across different physical servers using the local deployer (we are not using containers as of now).
I think you can do it by defining spring.cloud.dataflow.task.platform.local.accounts entries (cf. https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#configuration-local-tasks)
You also have a sample in the mentioned article (https://dataflow.spring.io/docs/recipes/multi-platform-deployment/multi-platform-task/#configuring-spring-cloud-data-flow-1)
export SPRING_APPLICATION_JSON="{\"spring.cloud.dataflow.task\":{\"platform.kubernetes.accounts\":{\"kzone\":{\"namespace\":\"default\"}},\"platform.cloudfoundry.accounts\":{\"cfzone\":{\"connection\":{\"url\":\"https://myconnection\",\"domain\":\"mydomain\",\"org\":\"myorg\",\"space\":\"myspace\",\"username\":\"admin\",\"password\":\"password\",\"skipSslValidation\":true},\"deployment\":{\"deleteRoutes\":false,\"services\":\"garsql,atscheduler\",\"enableRandomAppNamePrefix\":false,\"memory\":3072},\"schedulerProperties\":{\"schedulerUrl\":\"https://scheduler.cf.navy.springapps.io\"}}},\"platform.local.accounts\":{\"local\":{\"timeout\":\"60\"}}}}"
Here is the interesting part:
{
  "spring.cloud.dataflow.task": {
    "platform.local.accounts": {
      "local": {
        "timeout": "60"
      }
    }
  }
}
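If you need more than one local target, I would expect (not verified) that you can declare several entries under platform.local.accounts; the account names and the second timeout value below are purely illustrative:
{
  "spring.cloud.dataflow.task": {
    "platform.local.accounts": {
      "localAccountA": {
        "timeout": "60"
      },
      "localAccountB": {
        "timeout": "120"
      }
    }
  }
}
You then pick the account when launching the task; the recipe linked above shows how the platform is selected at launch time.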
I am learning AWS Cloud Development Kit (CDK).
As part of this learning, I am trying to understand how I am supposed to correctly handle production and development environments.
I know AWS CDK provides the environment parameter to allow deploying stacks to a specific account.
But then, how do I have specific options for development versus production stacks? It does not seem to be provided by default by AWS CDK, or am I missing/misunderstanding something?
A very simple example could be that I want an S3 bucket called my-s3-bucket-dev for my development account and one named my-s3-bucket-prod for my production account. But then how do I have, e.g., a stage variable correctly handled in AWS CDK?
I know I can add parameters in the cdk.json file, but again, I don't know how to correctly use this file to depend upon the deployed stack, i.e. production vs. development.
Thanks for the support.
Welcome to AWS CDK.
Enjoy the ride. ;)
Actually, there are no semantics (in your case, the stage) attached to an account itself.
This has nothing to do with CDK or CloudFormation.
You need to take care of this yourself.
You're right that you could use the CDK context in cdk.json.
There's no schema enforcement in the context, except for some variables used internally by CDK.
You could define your dev and prod objects within.
There are other ways of defining the context.
Here is an example of what it could look like:
{
  "app": "node app",
  // usually there's some internal definition for your CDK project
  "context": {
    "dev": {
      "accountId": "dev_account",
      "accountRegion": "us-east-1",
      "name": "dev",
      "resourceConfig": {
        // here you could differentiate the config per AWS resource type
        // e.g. dev has lower hardware specs
      }
    },
    "prod": {
      "accountId": "prod_account",
      "accountRegion": "us-east-1",
      "name": "prod",
      "resourceConfig": {
        // here you could differentiate the config per AWS resource type
        // prod has higher hardware specs or more cluster nodes
      }
    }
  }
}
With this defined, you need to run your CDK application with the -c flag to specify which configuration object (dev or prod) you want to use.
For instance, you could run it with cdk synth -c stage=prod.
This sets the stage variable in your context and makes it available.
Once that succeeds, you can access the context again and fetch the appropriate config object.
import * as cdk from '@aws-cdk/core';

const app = new cdk.App();
const stage = app.node.tryGetContext('stage');
// the following step is only needed if you have a different config per account
const stageConfig = app.node.tryGetContext(stage);
// ... do some validation and pass the config to the stacks as constructor argument
As I said, the context is one way of doing this.
However, there are drawbacks to it.
It's JSON, not code.
What I prefer is to have TypeScript types per resource configuration (e.g. S3) and wire them all together as a plain object.
The object maps the account/region information to the corresponding resource configurations.
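As a minimal sketch of that typed approach (the type names, account ids, bucket names and MyStack are made up for illustration, not part of the CDK API):

import * as cdk from '@aws-cdk/core';

// illustrative per-stage configuration type
interface StageConfig {
  accountId: string;
  region: string;
  name: string;
  s3BucketName: string;
}

const stages: { [stage: string]: StageConfig } = {
  dev:  { accountId: 'dev_account',  region: 'us-east-1', name: 'dev',  s3BucketName: 'my-s3-bucket-dev' },
  prod: { accountId: 'prod_account', region: 'us-east-1', name: 'prod', s3BucketName: 'my-s3-bucket-prod' },
};

const app = new cdk.App();
// fall back to 'dev' if no -c stage=... was given
const stageName: string = app.node.tryGetContext('stage') ?? 'dev';
const config = stages[stageName];
// pass `config` into your stacks as a constructor argument, e.g.
// new MyStack(app, `my-stack-${config.name}`, { env: { account: config.accountId, region: config.region }, config });

That way the compiler checks the configuration, instead of you validating raw JSON from the context at runtime.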
Context:
Projen is an awesome tool to generate and manage (JSII-built) AWS CDK projects.
Background:
Previously I have managed CDK dependencies with RenovateBot's group:aws-cdkMonorepo preset. This results in RenovateBot creating a single GitHub pull request for AWS CDK dependency updates.
Question:
With Projen, one controls the CDK version in .projenrc.js:
const { AwsCdkConstructLibrary } = require('projen');
const project = new AwsCdkConstructLibrary({
authorName: "Example",
authorAddress: "contact#example.com",
cdkVersion: "1.64.0",
name: "#example/project",
repository: "https://github.com/example/project.git",
});
project.synth();
So how can one manage that cdkVersion value with tooling such as Dependabot or RenovateBot?
Keeping one's CDK constructs up to date with the current CDK version is critical, and with multiple CDK constructs, doing it by hand becomes painful.
Whether you need central version management depends on your requirements; if you are using a centralized construct library, it is a must-have.
To manage the dependencies centrally in a single configuration, add the following snippet to .projenrc.js:
cdkDependencies: [
  '@aws-cdk/core'
]
Now, whenever you run projen, the CDK app is managed centrally and uses the latest version.
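For reference, here is roughly how that would sit in the .projenrc.js from the question (a sketch based on the 1.x-era AwsCdkConstructLibrary options; adjust to your projen version):

const { AwsCdkConstructLibrary } = require('projen');

const project = new AwsCdkConstructLibrary({
  authorName: "Example",
  authorAddress: "contact@example.com",
  cdkVersion: "1.64.0",
  name: "@example/project",
  repository: "https://github.com/example/project.git",
  // every @aws-cdk module the library depends on, resolved against cdkVersion
  cdkDependencies: [
    '@aws-cdk/core'
  ],
});

project.synth();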
Is it possible to configure the Core Web Vitals metrics thresholds at build time, or as part of your CI process?
The Core Web Vitals prescribe a specific set of metrics thresholds and percentiles we believe correspond well to user expectations across a range of devices.
We encourage using our official thresholds as much as possible. If, however, you would like to set custom targets for thresholds (e.g. a Largest Contentful Paint performance budget of < 3s), this is possible using Lighthouse CI and LightWallet. Metric targets can be set using assertions and a performance budgets file.
An example of such assertions can be found below:
{
  "ci": {
    "assert": {
      "assertions": {
        "largest-contentful-paint": ["warn", {"maxNumericValue": 3000}],
        "viewport": "error",
        "resource-summary:document:size": ["error", {"maxNumericValue": 14000}],
        "resource-summary:font:count": ["warn", {"maxNumericValue": 1}],
        "resource-summary:third-party:count": ["warn", {"maxNumericValue": 5}]
      }
    }
  }
}
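For the budgets-file route, a minimal budget.json mirroring the assertions above might look roughly like this (paths and numbers are placeholders; Lighthouse budgets express timings in milliseconds and resource sizes in kilobytes). Lighthouse CI should be able to pick such a file up via the assert configuration's budgetsFile option, though check the LHCI docs for the exact key in your version:

[
  {
    "path": "/*",
    "timings": [
      { "metric": "largest-contentful-paint", "budget": 3000 }
    ],
    "resourceSizes": [
      { "resourceType": "document", "budget": 14 }
    ],
    "resourceCounts": [
      { "resourceType": "font", "budget": 1 },
      { "resourceType": "third-party", "budget": 5 }
    ]
  }
]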
I am trying to run a job using the Google Cloud Machine Learning REST API ml.projects.jobs.create.
The latest job that I submitted has the job id 'drivermonitoring20180109335'. On completion of the job, the message 'job completed successfully' is displayed, but I cannot see any of the desired output files in the specified location. The output logs can be seen in fig1.
I would also like to share a few observations from running this job:
i) Running the job took much less time than any other job that I executed before.
ii) Every job that I ran earlier was executed via two different tasks, viz. a) master-replica-0 and b) service (refer fig2), but this job did not have a master-replica-0 task (refer fig3). I tried to Google the issue, but was unable to find any related solution.
So I can infer that the job I am trying to run is being scheduled, but the Python script I am trying to run is never scheduled to be executed.
Kindly let me know if you require more screenshots or if you want to have a look at the project structure to help with the issue.
Thanks in advance.
EDIT 1: Added the JSON used when making the API call
POST https://ml.googleapis.com/v1/projects/drivermonitoringsystem/jobs?key={YOUR_API_KEY}
{
  "trainingInput": {
    "pythonModule": "trainer.retrain",
    "args": [
      "--bottleneck_dir=ModelTraining/tf_files/bottlenecks \
      --model_dir=ModelTraining/tf_files/models/ \
      --architecture=mobilenet_0.50_224 \
      --output_graph=gs://<BUCKET_NAME>/tf_files/retrained_graph.pb \
      --output_labels=gs://<BUCKET_NAME>/tf_files/retrained_labels.txt \
      --image_dir=gs://<BUCKET_NAME>/dataset224x224/"
    ],
    "region": "us-central1",
    "packageUris": [
      "gs://<BUCKET_NAME>/ModelTraining4.tar.gz"
    ],
    "jobDir": "gs://<BUCKET_NAME>/tf_files/",
    "runtimeVersion": "1.4"
  },
  "jobId": "job_id201801101535"
}
I have just run some sample jobs myself using both the gcloud command and the REST API, and everything worked fine in both cases. It looks like, in your case, the job was never executed, as no cluster was created for processing the job itself (that is why master-replica-0 is missing).
Were the jobs that you ran previously, and which worked, also launched using the REST API, or with gcloud or a client library instead?
Here is an example JSON I used when making the API call to ml.projects.jobs.create through the API Explorer link you shared. I suggest you try adapting it to your requirements and check whether any field is missing:
POST https://ml.googleapis.com/v1/projects/<YOUR_PROJECT>/jobs?key={YOUR_API_KEY}
{
  "jobId": "<JOB_ID>",
  "trainingInput": {
    "jobDir": "gs://<LOCATION_TO_STORE_OUTPUTS>",
    "runtimeVersion": "1.4",
    "region": "<REGION>",
    "packageUris": [
      "gs://<PATH_TO_YOUR_TRAINER>/trainer-0.0.0.tar.gz"
    ],
    "pythonModule": "<PYTHON_MODULE_TO_RUN>",
    "args": [
      "--train-files",
      "gs://<PATH_TO_YOUR_TRAINING_DATA>/data.csv",
      "--eval-files",
      "gs://<PATH_TO_YOUR_TEST_DATA>/test.csv",
      "--train-steps",
      "100",
      "--eval-steps",
      "10",
      "--verbosity",
      "DEBUG"
    ]
  }
}
Change TrainingInput to PredictionInput (and the appropriate child fields) if you are trying to run a prediction job instead of a training one, as in this example.
I had to implement a Pipeline and I am trying to find a way to publish Robot Framework results in a Jenkins Pipeline.
I found multiple questions about integrating the Robot Framework plugin into Pipeline, and also found this question, which seems to be a solution. However, I have tried this approach and the results are still missing.
Is there any workaround or functional example?
[Edited to reflect successful workaround]
This comment on the issue tracker shows a workaround that seems to work:
step([
  $class : 'RobotPublisher',
  outputPath : outputDirectory,
  outputFileName : "*.xml",
  disableArchiveOutput : false,
  passThreshold : 100,
  unstableThreshold : 95.0,
  otherFiles : "*.png"
])
However, the Robot Framework plugin does not currently seem to be fully compatible with Pipeline: https://issues.jenkins-ci.org/browse/JENKINS-34469
This is common with many plugins in the Jenkins ecosystem that have not yet been updated to be compatible with the new Jenkins Pipeline. You could potentially create the full compatibility yourself, though, if you're motivated enough.
I used the workaround mentioned in the other answer, but it wouldn't display the results with the job like in non-pipeline jobs, so I made a freestyle project that is triggered by the pipeline job and just copies the result files across, then runs the analysis. This is crufty and won't be portable across nodes, and the job numbers might get confusing over time, so the correlations might be tricky. At that point I will investigate using generic artifact storage or just getting rid of Robot altogether.
I had trouble using the answer given above, resulting in errors; but I was able to figure it out and add it to the Pipeline. Here is how I fixed it in case anyone else has come across the same issues:
stage('Tests') {
  steps {
    echo 'Testing...'
    script {
      step(
        [
          $class : 'RobotPublisher',
          outputPath : '<insert/the/output/path>',
          outputFileName : "*.xml",
          reportFileName : "report.html",
          logFileName : "log.html",
          disableArchiveOutput : false,
          passThreshold : 100,
          unstableThreshold : 95.0,
          otherFiles : "*.png"
        ]
      )
    }
  }
}