Access HTTP service running in GKE from Google Dataflow - google-cloud-dataflow

I have an HTTP service running on a Google Container Engine cluster (behind a kubernetes service).
My goal is to access that service from a Dataflow job running on the same GCP project using a fixed name (in the same way services can be reached from inside GKE using DNS). Any idea?
Most solutions I have read on stackoverflow relies on having kube-proxy installed on the machines trying to reach the service. As far as I know, it is not possible to reliably set up that service on every worker instance created by Dataflow.
One option is to create an external balancer and create an A record in the public DNS. Although it works, I would rather not have an entry in my public DNS records pointing to that service.

EDIT:
this is now supported on GKE (now known as Kubernetes Engine): https://cloud.google.com/kubernetes-engine/docs/how-to/internal-load-balancing
I have implemented this in a pretty smooth way IMHO. I will try to walk through briefly how it works:
Remember that when you create a container cluster (or nodepool), it will consist of a set of GCE instances in an instance group that is a part of the default network. NB: add a specific GCE network tag(s) so that you can add only those instances to a firewall rule later for letting load balancer check instance health.
This instance group is just a regular instance group.
Now, remember that kubernetes has something called NodePort, which will expose the service at this port on all nodes, i.e all GCE instances in your cluster. This is what we want!
Now that we know that we have a set of GCE instances in an instance group we can then add this instance group to an internal load balancer in your default network without it needing to know anything about kubernetes internals or DNS.
The Guide which you can follow, skipping many of the initial steps is here: https://cloud.google.com/compute/docs/load-balancing/internal/
Remember that this works for regions, so dataflow and everything else must be in the same region.
See this spec for the service:
kind: Service
apiVersion: v1
metadata:
name: name
labels:
app: app
spec:
selector:
name: name
app: app
tier: backend
ports:
- name: health
protocol: TCP
enter code here port: 8081
nodePort: 30081
- name: api
protocol: TCP
port: 8080
nodePort: 30080
type: NodePort
This is the code for setting up the load balancer with health checks, forwarding rules and firewall that it needs to work:
_region=<THE_REGION>
_instance_group=<THE_NODE_POOL_INSTANCE_GROUP_NAME>
#Can be different for your case
_healtcheck_path=/liveness
_healtcheck_port=30081
_healtcheck_name=<THE_HEALTCHECK_NAME>
_port=30080
_tags=<TAGS>
_loadbalancer_name=internal-loadbalancer-$_region
_loadbalancer_ip=10.240.0.200
gcloud compute health-checks create http $_healtcheck_name \
--port $_healtcheck_port \
--request-path $_healtcheck_path
gcloud compute backend-services create $_loadbalancer_name \
--load-balancing-scheme internal \
--region $_region \
--health-checks $_healtcheck_name
gcloud compute backend-services add-backend $_loadbalancer_name \
--instance-group $_instance_group \
--instance-group-zone $_region-a \
--region $_region
gcloud compute forwarding-rules create $_loadbalancer_name-forwarding-rule \
--load-balancing-scheme internal \
--ports $_port \
--region $_region \
--backend-service $_loadbalancer_name \
--address $_loadbalancer_ip
#Allow google cloud to healthcheck your instance
gcloud compute firewall-rules create allow-$_healtcheck_name \
--source-ranges 130.211.0.0/22,35.191.0.0/16 \
--target-tags $_tags \
--allow tcp

Lukasz's answer is probably the most straightforward way to expose your service to dataflow. But, if you really don't want a public IP and DNS record, you can use a GCE route to deliver traffic to your cluster's private IP range (something like option 1 in this answer).
This would let you hit your service's stable IP. I'm not sure how to get Kubernetes' internal DNS to resolve from Dataflow.

The Dataflow job running on GCP will not be part of the Google Container Engine cluster, so it will not have access to the internal cluster DNS by default.
Try setting up a load balancer for the service that you want to expose which knows how to route the "external" traffic to it. This will allow you to connect to the IP address directly from a Dataflow job executing on GCP.

Related

Have one container access another with helm/Docker

The context
Let me know if I've gone down a rabbit hole here.
I have a simple web app with a frontend and backend component, deployed using Docker/Helm inside a Kubernetes cluster. The frontend is servable via nginx, and the backend component will be running a NodeJS microservice.
I had been thinking to have both run on the same pod inside Docker, but ran into some problems getting both nginx and Node to run in the background. I could try having a startup script that runs both, but the Internet says it's a best practice to have different containers each be responsible for only running one service - so one container to run nginx and another to run the microservice.
The problem
That's fine, but then say the nginx server's HTML pages need to know what to send a POST request to in the backend - how can the HTML pages know what IP to hit for the backend's Docker container? Articles like this one come up talking about manually creating a Docker network for the two containers to speak to one another, but how can I configure this with Helm so that the frontend container knows how to hit the backend container each time a new container is deployed, without having to manually configure any network service each time? I want the deployments to be automated.
You mention that your frontend is based on Nginx.
Accordingly,Frontend must hit the public URL of backend.
Thus, backend must be exposed by choosing the service type, whether:
NodePort -> Frontend will communicate to backend with http://<any-node-ip>:<node-port>
or LoadBalancer -> Frontend will communicate to backend with the http://loadbalancer-external-IP:service-port of the service.
or, keep it ClusterIP, but add Ingress resource on top of it -> Frontend will communicate to backend with its ingress host http://ingress.host.com.
We recommended the last way, but it requires to have ingress controller.
Once you tested one of them and it works, then, you can extend your helm chart to update the service and add the ingress resource if needed
You may try to setup two containers in one pod and then communicate between containers via localhost (but on different ports!). Good example is here - Kubernetes multi-container pods and container communication.
Another option is to create two separate deployments and for each create service. Instead of using IP addresses (won't be the same for every re-deployment of your app) use a DNS name for connecting to them.
Example - two NGINX services communication.
First create two NGINX deplyoments:
kubectl create deployment nginx-one --image=nginx --replicas=3
kubectl create deployment nginx-two --image=nginx --replicas=3
Let's expose them using the kubectl expose command. It's the same if I had created a service from a yaml file:
kubectl expose deployment nginx-one --name=my-service-one --port=80
kubectl expose deployment nginx-two --name=my-service-two --port=80
Now let's check services - as you can see both of them are ClusterIP type:
user#shell:~$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.36.0.1 <none> 443/TCP 66d
my-service-one ClusterIP 10.36.6.59 <none> 80/TCP 60s
my-service-two ClusterIP 10.36.15.120 <none> 80/TCP 59s
I will exec into pod from nginx-one deployment and curl the second service:
user#shell:~$ kubectl exec -it nginx-one-5869965455-44cwm -- sh
# curl my-service-two
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
If you have problems, make sure you have a proper CNI plugin installed for your cluster - also check this article - Cluster Networking for more details.
Also check these:
My similar answer but with a wider explanation + example of communication between two namespaces.
Access Services Running on Clusters | Kubernetes
Service | Kubernetes
Debug Services | Kubernetes
DNS for Services and Pods | Kubernetes

Connection refused when trying to connect to services in Kubernetes

I'm trying to create a Kubernetes cluster for learning purposes. So, I created 3 virtual machines with Vagrant where the master has IP address of 172.17.8.101 and the other two are 172.17.8.102 and 172.17.8.103.
It's clear that we need Flannel so that our containers in different machines can connect to each other without port mapping. And for Flannel to work, we need Etcd, because flannel uses this Datastore to put and get its data.
I installed Etcd on master node and put Flannel network address on it with command etcdctl set /coreos.com/network/config '{"Network": "10.33.0.0/16"}'
To enable ip masquerading and also using the private network interface in the virtual machine, I added --ip-masq --iface=enp0s8 to FLANNEL_OPTIONS in /etc/sysconfig/flannel file.
In order to make Docker use Flannel network, I added --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU}' to OPTIONS variable in /etc/sysconfig/docker file. Note that the values for FLANNEL_SUBNET and FLANNEL_MTU variables are the ones set by Flannel in /run/flannel/subnet.env file.
After all these settings, I installed kubernetes-master and kubernetes-client on the master node and kubernetes-node on all the nodes. For the final configurations, I changed KUBE_SERVICE_ADDRESSES value in /etc/kubernetes/apiserver file to --service-cluster-ip-range=10.33.0.0/16
and KUBELET_API_SERVER value in /etc/kubernetes/kubelet file to --api-servers=http://172.17.8.101:8080.
This is the link to k8s-tutorial project repository with the complete files.
After all these efforts, all the services start successfully and work fine. It's clear that there are 3 nodes running when I use the command kubectl get nodes. I can successfully create a nginx pod with command kubectl run nginx-pod --image=nginx --port=80 --labels="app=nginx" and create a service with kubectl expose pod nginx-pod --port=8000 --target-port=80 --name="service-pod" command.
The command kubectl describe service service-pod outputs the following results:
Name: service-pod
Namespace: default
Labels: app=nginx
Selector: app=nginx
Type: ClusterIP
IP: 10.33.39.222
Port: <unset> 8000/TCP
Endpoints: 10.33.72.2:80
Session Affinity: None
No events.
The challenge is that when I try to connect to the created service with curl 10.33.79.222:8000 I get curl: (7) Failed connect to 10.33.72.2:8000; Connection refused but if I try curl 10.33.72.2:80 I get the default nginx page. Also, I can't ping to 10.33.79.222 and all the packets get lost.
Some suggested to stop and disable Firewalld, but it wasn't running at all on the nodes. As Docker changed FORWARD chain policy to DROP in Iptables after version 1.13 I changed it back to ACCEPT but it didn't help either. I eventually tried to change the CIDR and use different IP/subnets but no luck.
Does anybody know where am I going wrong or how to figure out what's the problem that I can't connect to the created service?
The only thing I can see that you have that is conflicting is the PodCidr with Cidr that you are using for the services.
The Flannel network: '{"Network": "10.33.0.0/16"}'. Then on the kube-apiserver --service-cluster-ip-range=10.33.0.0/16. That's the same range and it should be different so you have your kube-proxy setting up services for 10.33.0.0/16 and then you have your overlay thinking it needs to route to the pods running on 10.33.0.0/16. I would start by choosing a completely non-overlapping Cidrs for both your pods and services.
For example on my cluster (I'm using Calico) I have a podCidr of 192.168.0.0/16 and I have a service Cidr of 10.96.0.0/12
Note: you wouldn't be able to ping 10.33.79.222 since ICMP is not allowed in this case.
Your service is of type ClusterIP which means it can only be accessed by other Kubernetes pods. To achieve what you are trying to do consider switching to a service of type NodePort. You can then connect to it using the command curl <Kubernetes-IP-address>:<exposedServicePort>
See https://kubernetes.io/docs/tasks/access-application-cluster/service-access-application-cluster/ for an example of using NodePort.

How to get kubernetes pods or deployment IP to be used inside application

I just tried setting up kubernetes on my bare server,
Previously I had successfully create my docker compose
There are several apps :
Apps A (docker image name : a-service)
Apps B (docker image name : b-service)
Inside Application A and B there are configs (actually there are apps A,B,C,D,etc lots of em)
The config file is something like this
IPFORSERVICEA=http://a-service:port-number/path/to/something
IPFORSERVICEB=http://b-service:port-number/path/to/something
At least above config work in docker compose (the config is inside app level, which require to access another apps). Is there any chance for me to access another Kubernetes Service from another service ? As I am planning to create 1 app inside 1 deployment, and 1 service for each deployment.
Something like:
App -> Deployment -> Service(i.e: NodePort,ClusterIP)
Thanks !
Is there any chance for me to access another Kubernetes Service from
another service ?
Yes, you just need to specify DNS name of service (type: ClusterIP works fine for this) you need to connect to as:
<service_name>.<namespace>.svc.cluster.local
In this case such domain name will be correctly resolved into internal IP address of the service you need to connect to using built-in DNS.
For example:
nginx-service.web.svc.cluster.local
where nginx-service - name of your service and web - is apps's namespace, so service yml definition can look like:
apiVersion: v1
kind: Service
metadata:
name: nginx-service
namespace: web
spec:
ports:
- name: http
protocol: TCP
port: 80
selector:
app: nginx
type: ClusterIP
See official docs to get more information.
Use Kubernetes service discovery.
Service discovery is the process of figuring out how to connect to a
service. While there is a service discovery option based on
environment variables available, the DNS-based service discovery is
preferable. Note that DNS is a cluster add-on so make sure your
Kubernetes distribution provides for one or install it yourself.
Service dicovery by example

Kubernetes using Gitlab installing Ingress returns "?" as external IP

I have successfully connect my Kubernetes-Cluster with Gitlab. Also I was able to install Helm through the Gitlab UI (Operations->Kubernetes)
My Problem is that if I click on the "Install"-Button of Ingress Gitlab will create all the nessecary stuff that is needed for the Ingress-Controller. But one thing will be missed : external IP. External IP will mark as "?".
And If I run this command:
kubectl get svc --namespace=gitlab-managed-apps ingress-nginx-ingress- controller -o jsonpath='{.status.loadBalancer.ingress[0].ip}'; echo
It will show nothing. Like I won´t have a Loadbalancer that exposes an external IP.
Kubernetes Cluster
I installed Kubernetes through kubeadm, using flannel as CNI
kubectl version:
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2"}
Is there something that I have to configure before installing Ingress. Do I need an external Loadbalancer(my thought: Gitlab will create that service for me)?
One more hint: After installation, the state of the Nginx-Ingress-Controller Service will be stay on pending. The reason for that it is not able to detect external IP. I also modified the yaml-File of the service and I manually put the "externalIPs : -External-IP line. The output of this was that it was not pending anymore. But still I couldn't find an external IP by typing the above command and Gitlab also couldn´t find any external IP
EDIT:
This happens after installation:
see picture
EDIT2:
By running the following command:
kubectl describe svc ingress-nginx-ingress-controller -n gitlab-managed-apps
I get the following result:
see picture
In Event log you will see that I switch the type to "NodePort" once and then back to "LoadBalancer" and I added the "externalIPs: -192.168.50.235" line in the yaml file. As you can see there is an externalIP but Git is not detecting it.
Btw. Im not using any of these cloud providers like AWS or GCE and I found out that LoadBalancer is not working that way. But there must be a solution for this without LoadBalancer.
I would consider to look at MetalLB as for the main provisioner of Load balancing service in your cluster. If you don't use any of Cloud providers in order to obtain the entry point (External IP) for Ingress resource, there is option for Bare-metal environments to switch to MetalLB solution which will create Kubernetes services of type LoadBalancer in the clusters that don’t run on a cloud provider, therefore it can be also implemented for NGINX Ingress Controller.
Generally, MetalLB can be installed via Kubernetes manifest file or using Helm package manager as described here.
MetalLB deploys it's own services across Kubernetes cluster and it might require to reserve pool of IP addresses in order to be able to take ownership of the ingress-nginx service. This pool can be defined in a ConfigMap called config located in the same namespace as the MetalLB controller:
apiVersion: v1
kind: ConfigMap
metadata:
namespace: metallb-system
name: config
data:
config: |
address-pools:
- name: default
protocol: layer2
addresses:
- 203.0.113.2-203.0.113.3
External IP would be assigned to your LoadBalancer once ingress service obtains IP address from this address pool.
Find more details about MetalLB implementation for NGINX Ingress Controller in official documentation.
After some research I found out that this is an Gitlab issue. As I said above, I successfully build a connection to my cluster. Since Im using Kubernetes without cloud providers it is not possible to use the type "LoadBalancer". Therefore you need to add an external IP or change the type to "NodePort". This way you can make your Ingress-Controller accessible outside.
Check this out: kubernetes service external ip pending
I just continued the Gitlab tutorial and it worked.

Connect non-dockerised application to kubenetes pos

I have non deckerised application that needs to connect to dockerised application running inside kubernetes pod.
Given that pods may died and came again with different ip address, how my application can detect this? any way to assign a hostname that redirect to whatever existing pods?
You will have to use kubernetes service. Service gives you a way to talk to your pods with static Ip and dns (if you're client app is inside the cluster).
https://kubernetes.io/docs/concepts/services-networking/service/
You can do it in several ways:
Easiest: Use kubernetes service with type: NodePort. Then you can access the pod using http://[nodehost]:[nodeport]
Use kubernetes ingress. See this link for more details (https://kubernetes.io/docs/concepts/services-networking/ingress/)
If you are running in the cloud like aws, azure or gce, you can use kubernetes service type LoadBalancer.
In addition to Bal Chua’s work and suggestions from silverfox, I would like to show you the method
I used for Kubernetes to expose and manage incoming traffic from the outside:
Step 1: Deploy an application
In this example, Kubernetes sample hello application will run on port 8080/tcp
kubectl run web --image=gcr.io/google-samples/hello-app:1.0 --port=8080
Step 2: Expose your Deployment as a Service internally
This command tells Kubernetes to expose port 8080/tcp to interact with the world outside:
kubectl expose deployment web --target-port=8080 --type=NodePort
After, please check if it exposed running command:
kubectl get service web
Step 3: Manage Ingress resource
Ingress sends traffic to a proper service working inside Kubernetes.
Open a text editor and then create a file basic-ingress.yaml
with content:
apiVersion:
extensions/v1beta1
kind: Ingress
metadata:
name: basic-ingress
spec:
backend:
serviceName: web
servicePort: 8080
Apply the configuration:
kubectl apply -f basic-ingress.yaml
and that's all. It is time to test. Get the external IP address of Kubernetes installation:
kubectl get ingress basic-ingress
and run web browser with this address to see hello application working.

Resources