How to point streaming Dataflow at an internal service? - google-cloud-dataflow

I am running a private service (e.g. Redis) on my Cloud Network, and I would like to access it from my streaming Dataflow job. Is there a good way to configure my job so that, if I need to update the IP address(es) for the private service, I don't need to modify the Dataflow job?

You can add a layer of indirection to this setup using Google Cloud Load Balancing with a single anycast IP (https://cloud.google.com/load-balancing/). With internal load balancing, you can configure this target without exposing it to the Internet, and the Dataflow job only ever needs to know the load balancer's address.
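For illustration, here is a minimal sketch of a streaming DoFn that reaches Redis through a stable frontend address rather than a hard-coded instance IP. The hostname redis.internal.example is a placeholder for your internal load balancer's forwarding-rule IP or a DNS record you control:

```python
# Sketch: a streaming DoFn that connects to Redis through a stable
# frontend address instead of a hard-coded instance IP. The hostname
# below is hypothetical -- point it at your internal load balancer's
# forwarding-rule IP or a Cloud DNS record you manage.
import apache_beam as beam
import redis

REDIS_HOST = "redis.internal.example"  # assumption: resolves to the ILB frontend
REDIS_PORT = 6379

class WriteToRedis(beam.DoFn):
    def setup(self):
        # One connection per worker; the load balancer hides backend changes.
        self._client = redis.Redis(host=REDIS_HOST, port=REDIS_PORT)

    def process(self, element):
        key, value = element
        self._client.set(key, value)
        yield element
```

Because each worker resolves the address at connection time, moving the Redis backends only requires updating the load balancer or DNS record; the Dataflow job itself doesn't change.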

Related

VPC access connector on GCP - Cloud Run services and AlloyDB in different regions

Quick question: I am trying to configure a Cloud Run service to connect to AlloyDB on GCP.
The problem here is that AlloyDB is in a different region than the other services; in this case, AlloyDB is in central1 and the services are in east1.
Is there any way to do the pairing between them?
Thanks in advance.
There is no connectivity issue. You use a serverless VPC connector to bridge the serverless world (where your Cloud Run services live) with your VPC. Therefore, with the default configuration, all traffic going to a private IP will arrive in your VPC.
Your AlloyDB instance is also peered with your VPC. Because the VPC is global, as long as a service is in the VPC (AlloyDB or Cloud Run), it can reach any resource, whatever its location.
In fact, your main concerns should be the latency and the egress cost.
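As an illustration (not part of the original answer), here is a minimal sketch of a Cloud Run handler reaching AlloyDB over its private IP through the serverless VPC connector; the environment variable names and credentials are placeholders:

```python
# Sketch: a Cloud Run handler connecting to AlloyDB over its private IP.
# The env var names are placeholders; the connection works because the
# Cloud Run service egresses through the VPC connector into the (global)
# VPC that AlloyDB is peered with, regardless of region.
import os
import psycopg2
from flask import Flask

app = Flask(__name__)

def get_connection():
    return psycopg2.connect(
        host=os.environ["ALLOYDB_PRIVATE_IP"],  # e.g. a 10.x.x.x address
        dbname=os.environ.get("DB_NAME", "postgres"),
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASS"],
    )

@app.route("/")
def ping():
    # Trivial round trip to confirm cross-region private connectivity.
    with get_connection() as conn, conn.cursor() as cur:
        cur.execute("SELECT 1")
        return str(cur.fetchone()[0])
```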

How to expose a service from minikube so it can be accessed from another device on the same network?

I've created a service inside minikube (an Express.js API) running on my local machine, so when I launch the service using minikube service wedeliverapi --url, I can access it from my browser at localhost:port/api.
But I also want to access that service from another device so that I can use my API from a Flutter mobile application. How can I achieve this goal?
Due to the small amount of information, and to clarify everything, I am posting a general Community wiki answer.
The solution to this problem was to use a reverse proxy server. This documentation defines what exactly a reverse proxy server is:
A proxy server is a go‑between or intermediary server that forwards requests for content from multiple clients to different servers across the Internet. A reverse proxy server is a type of proxy server that typically sits behind the firewall in a private network and directs client requests to the appropriate backend server. A reverse proxy provides an additional level of abstraction and control to ensure the smooth flow of network traffic between clients and servers.
Common uses for a reverse proxy server include:
Load balancing
Web acceleration
Security and anonymity
This guide covers the basic configuration of a proxy server.
See also this article.
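To make the idea concrete, here is a toy reverse proxy in Python that listens on all interfaces and forwards requests to the URL printed by minikube service wedeliverapi --url; the upstream address is a placeholder, and a real setup would use nginx or similar:

```python
# Toy reverse proxy: listen on all interfaces and forward each GET to the
# minikube service URL. UPSTREAM is a placeholder -- substitute the URL
# printed by `minikube service wedeliverapi --url`. No error handling;
# demo only.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

UPSTREAM = "http://127.0.0.1:30080"  # assumption: minikube service URL

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Forward the request path to the minikube service, relay the reply.
        with urlopen(UPSTREAM + self.path) as resp:
            body = resp.read()
            self.send_response(resp.status)
            self.send_header(
                "Content-Type",
                resp.headers.get("Content-Type", "application/octet-stream"),
            )
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

if __name__ == "__main__":
    # Binding to 0.0.0.0 makes the proxy reachable from other devices on
    # the LAN, e.g. the Flutter app pointing at http://<laptop-ip>:8080/api.
    HTTPServer(("0.0.0.0", 8080), ProxyHandler).serve_forever()
```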

How can I integrate my application with Kubernetes cluster running Docker containers?

This is more of a research question. If it does not meet the standards of SO, please let me know and I will ask elsewhere.
I am new to Kubernetes and have a few basic questions. I have read a lot of documentation on the internet and was hoping someone could help answer a few basic questions.
I am trying to create an integration between Kubernetes (user applications running inside Docker containers, to be precise) and my application, which would act as a backup for certain data in the containers.
1. My application currently runs in AWS. Would the Kube cluster need to run in AWS as well, or can it run in any cloud service, or even on-prem, as long as the APIs are available?
2. Does my application need to know anything other than the IP of the Master node API server to do POST/GET requests?
3. For authentication, can I use AD (my application uses AD today for a few things)? That would also give me role-based policies for each user. Or do I always have to use the Kube Token Reviewer API for authentication?
4. Would the applications running in Kubernetes use the APIs I provide to communicate with my application?
5. Would my application use POST/GET to communicate with the Kube Master API server? Do I need to use kubectl for this and for #4 above?
Thanks for your help.
1. Your application doesn't need to run on the same servers as k8s. There are several ways to connect to a k8s cluster, depending on your use case: you can expose the built-in k8s API using kubectl proxy, connect directly to the k8s API on the master, or expose services via a load balancer or node port.
2. You would only need to know the IP of the master node if you're connecting to the cluster directly through the built-in k8s API, but in most cases you should only use that API to administer your cluster internally. The preferred way of accessing k8s pods is to expose them via a load balancer, which lets you reach a service on any node from a single IP. k8s also lets you access a service from any k8s node (except the master) through a preassigned nodePort.
3. TokenReview is only one of the k8s auth strategies. I don't know much about Active Directory auth, but at a glance OpenID Connect tokens seem to support it. You should also review whether you need to give users direct access to the k8s API at all; consider exposing services via a LoadBalancer instead.
4. I'm not sure what you mean by this, but if you deploy your APIs as k8s deployments, you can expose their endpoints through services and communicate with your external application however you like.
5. Again, the preferred way to communicate with k8s pods from external applications is through services exposed as load balancers, not through the built-in API on the k8s master. In the case of services, it's up to the underlying API to decide which kinds of requests it wants to accept.
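To illustrate points 2 and 5, here is a sketch of an external application talking to the cluster with the official Python client; the API server address, token, and namespace are placeholders, and in practice the token would come from whichever auth strategy you pick (e.g. OIDC backed by AD):

```python
# Sketch: an external backup application talking to a k8s cluster with the
# official Python client (pip install kubernetes). MASTER_IP and MY_TOKEN
# are placeholders for your API server endpoint and bearer token.
from kubernetes import client

configuration = client.Configuration()
configuration.host = "https://MASTER_IP:6443"              # API server endpoint
configuration.api_key = {"authorization": "Bearer MY_TOKEN"}
configuration.verify_ssl = False  # demo only; pin the cluster CA cert in real use

api = client.CoreV1Api(client.ApiClient(configuration))

# Plain HTTPS GETs against the API server -- no kubectl required.
for pod in api.list_namespaced_pod(namespace="default").items:
    print(pod.metadata.name, pod.status.pod_ip)
```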

Cloud dataflow job using Internal IP?

How do I configure my Cloud Dataflow job to run using internal IPs only?
Our policy doesn't allow using external IPs to spawn the workers, so I'm looking for an option that disallows external IPs. I ran the job and got the error below.
Startup of the worker pool in zone XXX failed to bring up any of the desired 1 workers. Please check for errors in your job parameters, check quota, and retry later, or please try in a different zone/region.
Add instance projects to use external IP with it.
You can use the --usePublicIps=false flag. Here you can look at some examples.
It looks like they updated the flags. For Python, it's now --no_use_public_ips or --use_public_ips.
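For example, a Python pipeline can pass the flag through its PipelineOptions; the project, region, and bucket names below are placeholders, and the subnetwork must have Private Google Access enabled so workers can still reach Google APIs without external IPs:

```python
# Sketch: launching a Python Dataflow job whose workers get no external
# IPs. All project/region/bucket names are placeholders. The worker
# subnetwork needs Private Google Access enabled, since the VMs have no
# route to the public Internet.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",              # placeholder
    "--region=us-central1",              # placeholder
    "--temp_location=gs://my-bucket/tmp",  # placeholder
    "--no_use_public_ips",               # workers get internal IPs only
])

with beam.Pipeline(options=options) as p:
    _ = p | beam.Create([1, 2, 3]) | beam.Map(print)
```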

Google Cloud Platform DataFlow workers IP addresses

Is it possible to know what range of external IPs the Dataflow workers on GCP are using? The goal is to set up some kind of IP filtering on an external service, so that only our Dataflow jobs running on GCP can access the service.
The best solution would be to upgrade the external service so that you can use SSL or other mechanisms of strong authentication, rather than IP filtering.
You can use the --network= option to control the GCE network that the worker VMs are assigned to. Take a look at the GCE docs on networking for details on how to set up a VPN (as the comment from Elmar suggested). You could also set up a single machine in the network with a static external IP and use it as a proxy for the other VMs in the network.
This is not a usage pattern we have tested, so there may be latency or throughput issues when sending traffic through the proxy/VPN. You will likely need to be careful to send only your own traffic through the proxy so that you don't accidentally hijack the traffic each worker uses to communicate with the Dataflow service.
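As a sketch of the --network approach, a Python pipeline could pin its workers to a specific network and subnetwork so that their egress is routed through a VPN or a proxy VM with a known static IP; all names below are placeholders:

```python
# Sketch: pinning Dataflow workers to a specific GCE network/subnetwork so
# their egress can be routed through a VPN or a NAT/proxy instance with a
# known static IP. Every name here is a placeholder.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                            # placeholder
    "--region=us-central1",                            # placeholder
    "--network=my-custom-network",                     # workers attach here
    "--subnetwork=regions/us-central1/subnetworks/my-subnet",
    "--temp_location=gs://my-bucket/tmp",              # placeholder
])
```

The external service would then whitelist the static IP of that proxy or VPN egress point, rather than the ephemeral worker IPs.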
