In Apache Beam, what are the Control service and Provision service? - google-cloud-dataflow

In the Apache Beam Fn API, what are the responsibilities of the Control service and the Provision service? How do they interact with the SDK Harness Container? Does the Control/Provision service contact the SDK Harness Container (to control or provision it), or is it the other way around?
For context, I am building a Dataflow custom container: https://cloud.google.com/dataflow/docs/guides/using-custom-containers#create_and_building_the_container_image and I am trying to understand the boot entrypoint, which references the control and provision services: https://github.com/apache/beam/blob/master/sdks/java/container/boot.go#L49

Related

Can dapr self-hosted apps invoke a remote dapr service hosted on kubernetes?

My distributed dapr.io application is growing very quickly and contains several dapr app-ids, and running all of the applications locally for development purposes is becoming difficult.
Is it possible for a local self-hosted app in development to invoke a production app running on an AKS cluster?
When you want to mix local development with existing services on AKS, check out Bridge to Kubernetes:
https://devblogs.microsoft.com/visualstudio/bridge-to-kubernetes-ga/
https://channel9.msdn.com/Shows/Visual-Studio-Toolbox/Bridge-to-Kubernetes
This information is preliminary; I will try to put together a Dapr sample scenario.

Airflow on Google Cloud Composer vs Docker

I can't find much information on what the differences are in running Airflow on Google Cloud Composer vs Docker. I am trying to switch our data pipelines that are currently on Google Cloud Composer onto Docker to just run locally but am trying to conceptualize what the difference is.
Cloud Composer is a GCP-managed service for Airflow. Composer runs in something known as a Composer environment, which runs on a Google Kubernetes Engine cluster. It also makes use of various other GCP services, such as:
Cloud SQL - stores the metadata associated with Airflow,
App Engine Flex - the Airflow web server runs as an App Engine Flex application, which is protected by an Identity-Aware Proxy,
GCS bucket - to submit a pipeline to be scheduled and run on Composer, all we need to do is copy our Python code into a GCS bucket. Within it there is a folder called dags/; any Python code uploaded into that folder is automatically picked up and processed by Composer (see the minimal DAG sketch after this list).
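For illustration, here is a minimal sketch of the kind of DAG file you would copy into that dags/ folder. It assumes Airflow 2.x; the DAG id, schedule, and task are arbitrary examples, not something prescribed by Composer.

```python
# dags/hello_composer.py - a minimal, illustrative DAG file.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="hello_composer",           # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # A single task; Composer picks this file up automatically once it is
    # copied into the environment bucket's dags/ folder.
    say_hello = BashOperator(
        task_id="say_hello",
        bash_command="echo 'Hello from Composer'",
    )
```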
How does Cloud Composer benefit you?
Focus on your workflows, and let Composer manage the infrastructure (creating the workers, setting up the web server, the message brokers),
One-click creation of a new Airflow environment,
Easy and controlled access to the Airflow Web UI,
Provides logging and monitoring metrics, and alerts when your workflow is not running,
Integrates with all Google Cloud services: Big Data, Machine Learning, and so on. You can also run jobs elsewhere, e.g. on another cloud provider (Amazon).
Of course you have to pay for the hosting service, but the cost is low compared to what it would take to host a production Airflow server on your own.
Airflow on-premise
DevOps work that needs to be done: create a new server, manage the Airflow installation, take care of dependency and package management, check server health, and handle scaling and security.
Pull an Airflow image from a registry and create the container,
Create a volume that maps the directory on the local machine where DAGs are held to the location where Airflow reads them in the container,
Whenever you want to submit a DAG that needs to access a GCP service, you need to take care of setting up credentials. A service account should be created and its key downloaded as a JSON file that contains the credentials. This JSON file must be mounted into your Docker container, and the GOOGLE_APPLICATION_CREDENTIALS environment variable must contain the path to the JSON file inside the container (see the sketch after this list).
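As a rough sketch of how that last point plays out inside the container: the snippet below assumes the key file is mounted at /secrets/key.json, that GOOGLE_APPLICATION_CREDENTIALS points to it, and that the google-cloud-storage package is installed in the image; none of these names come from the original answer.

```python
import os

from google.cloud import storage  # assumes google-cloud-storage is installed in the image

# Inside the container, GOOGLE_APPLICATION_CREDENTIALS should point at the
# mounted service-account key, e.g. /secrets/key.json (hypothetical path).
print("Using credentials:", os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"))

# GCP client libraries pick the credentials up automatically from that variable,
# so no explicit key path has to be passed in code.
client = storage.Client()
for bucket in client.list_buckets():
    print(bucket.name)
```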
To sum up, if you don't want to deal with all of those DevOps problems, and instead just want to focus on your workflow, then Google Cloud Composer is a great solution for you.
Additionally, I would like to share with you tutorials that set up Airflow with Docker and on GCP Cloud Composer.

Best strategy to dockerize Java with Angular JS applications

I have a Java/AngularJS project that needs to be dockerized for a CI/CD process. My project is as below:
Project:
UI - Angular/Node JS
Java - Project ABC:
-- Branch: Master
-- Service 1 (.jar/war)
-- Service 2 (.jar)
-- Service 3 (.jar)
Should I put all jar/war files into one container/volume? I would like to automate the process as much as possible using CI/CD tools. Any suggestions would be appreciated. Thanks.
A service is called a microservice only if it can run as a standalone process, meaning it can communicate with other services via sockets, file descriptors, pipes, etc. (the most common and easiest option is a socket, often with a higher-level protocol, i.e. HTTP).
If your services meet this criterion, then each one of them should be in a separate Docker container. You can expose any port on any container, and because Docker maintains a host and DNS system, you can access each one via name_of_container:port (see the sketch below). You should look in the Docker Compose docs for more info.
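A small sketch of what name_of_container:port means from inside another container, assuming a Compose setup with a service named service2 listening on port 8080 and the requests package installed in the calling container (all of these names are hypothetical):

```python
import requests  # assumes the requests package is installed in the calling container

# Docker's embedded DNS resolves the Compose service name "service2" to that
# container's IP, so any container on the same network can reach it by name.
response = requests.get("http://service2:8080/health", timeout=5)
print(response.status_code, response.text)
```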

How to deploy docker app using docker-compose.yml in cloud foundry

I have a docker-compose.yml file which has environment variables and certificates. I would like to deploy these to a Cloud Foundry dev environment.
I want to deploy Microgateway on Cloud Foundry; the link for Microgateway is below:
https://github.com/CAAPIM/Microgateway
In the cloud-native world, you instantiate the services on your foundation beforehand. You can use prebuilt services (e.g. auto-scaler) available from the marketplace.
If the service you want is not available, you can install a tile (e.g. Redis, MySQL, RabbitMQ), which will add services to the marketplace. A lot of vendors provide tiles that can be installed on PCF (check network.pivotal.io for the full list).
If you have services that are outside of Cloud Foundry (e.g. Oracle, Mongo, or MS SQL Server) and you wish to inject them into your Cloud Foundry foundation, you can do that by creating User-Provided Services (cups).
Once you have a service, you have to create a service instance. Think of it as provisioning the service for you. After you have provisioned, i.e. created, a service instance, you can bind it to one or more apps.
A service instance is scoped to an org and a space. All apps within that org and space can be bound to that service instance.
You deploy your app individually, by itself, to Cloud Foundry (jar, war, zip). You then bind any needed services to your app (e.g. db, scaling, caching, etc.).
Use a manifest file to do all these steps in one deployment.
PCF 2.0 is introducing PKS - Pivotal Container Service. It is an implementation of Kubo within PCF. It is still not GA.
Kubo, Kubernetes, and PKS allow you to deploy your containerized applications.
I have played with MiniKube and little bit of Kubo. Still getting my hands wet on PKS.
Hope this helps!

Is there a commercially supported option for a standalone Spring Cloud Data Flow?

We're looking at using Spring Cloud Task / Spring Cloud Data Flow for our batch processing needs as we're modernising from a legacy system. We don't want or need the whole microservices offering ... we want to be able to deploy jobs/tasks, kick off batch processes, have them log to a log file, and share a database connection pool and message queue. We don't need the whole PaaS that's provided by Spring Cloud Foundry, and we don't want to pay for that, but we do want the Data Flow / Task framework to be commercially supported. Is such an option available?
Spring Cloud Data Flow (SCDF) builds upon the spring-cloud-deployer abstraction to deploy stream/task workloads to a variety of runtimes including Cloud Foundry, Kubernetes, Mesos, and YARN - see this visual.
You'd need a runtime for SCDF to orchestrate these workloads in a production setting. If there's no scope for cloud infrastructure, the YARN-based deployment could be a viable option for a standalone bare-metal installation. Please review the reference guide and Apache Ambari provisioning tools for more details. There's a separate commercial support option available for this type of installation.
