How can I integrate my application with Kubernetes cluster running Docker containers? - docker

This is more of a research question. If it does not meet the standards of SO, please let me know and I will ask elsewhere.
I am new to Kubernetes and have a few basic questions. I have read a lot of doc on the internet and was hoping someone can help answer few basic questions.
I am trying to create an integration with Kubernetes (user applications running inside Docker containers to be precise) and my application that would act as a backup for certain data in the containers.
My application currently runs in AWS. Would the Kube cluster need to run in AWS as well ? Or can it run in any cloud service or even on-prem as long as the APIs are available ?
My application needs to know the IP of the Master node API server to do POST/GET requests and nothing else ?
For authentication, can I use AD (my application uses AD today for a few things). That would also give me Role based policies for each user. Or do I have to use the Kube Token Reviewer API for authentication always ?
Would the applications running in Kubernetes use the APIs I provide to communicate with my application ?
Would my application use POST/GET to communicate with the Kube Master API server ? Do I need to use kubectl for this and above #4 ?
Thanks for your help.

Your application needn't exist on the same server as k8s. There are several ways to connect to k8s cluster, depending on your use case. Either you can expose the built-in k8s API using kubectl proxy, connect directly to the k8s API on the master, or you can expose services via load balancer or node port.
You would only need to know the IP for the master node if you're connecting to the cluster directly through the built-in k8s API, but in most cases you should only be using this API to internally administer your cluster. The preferred way of accessing k8s pods is to expose them via load balancer, which allows you to access a service on any node from a single IP. k8s also allows you to access a service with a nodePort from any k8s node (except the master) through a preassigned port.
TokenReview is only one of the k8s auth strategies. I don't know anything about Active Directory auth, but at a glance OpenID connect tokens seem to support it. You should review whether or not you need to allow users direct access to the k8s API at all. Consider exposing services via LoadBalancer instead.
I'm not sure what you mean by this, but if you deploy your APIs as k8s deployments you can expose their endpoints through services to communicate with your external application however you like.
Again, the preferred way to communicate with k8s pods from external applications is through services exposed as load balancers, not through the built-in API on the k8s master. In the case of services, it's up to the underlying API to decide which kinds of requests it wants to accept.

Related

How to expose a service from minikube to be able to access it from another device in the same network?

I've created a service inside minikube (expressjs API) running on my local machine,
so when I launch the service using minikube service wedeliverapi --url I can access it from my browser with localhost:port/api
But I also want to access that service from another device so I can use my API from a flutter mobile application. How can I achieve this goal?
Due to small amount of information and to clarify everything- I am posting a general Community wiki answer.
The solution to solve this problem was to use reverse proxy server. In this documentation is definiton what exactly is reverse proxy server .
A proxy server is a go‑between or intermediary server that forwards requests for content from multiple clients to different servers across the Internet. A reverse proxy server is a type of proxy server that typically sits behind the firewall in a private network and directs client requests to the appropriate backend server. A reverse proxy provides an additional level of abstraction and control to ensure the smooth flow of network traffic between clients and servers
Common uses for a reverse proxy server include:
Load balancing
Web acceleration
Security and anonymity
This is the guide where one can find basic configuration of a proxy server.
See also this article.

How to add a reverse proxy for authentication & load balancing in Kuberenetes (GKE)?

Okay, I have a DB consisting of several nodes deployed to GKE.
The deployment.yaml adds each node as ClusterIP, which makes sense. Here is the complete deployment file:
https://github.com/dgraph-io/dgraph/blob/master/contrib/config/kubernetes/dgraph-ha/dgraph-ha.yaml
For whatever reason, the DB has zero security functionality, so I cannot expose any part using a LoadBalancer service because doing so would give unsecured access to the entire DB. The vendor argues that security is solely the user's problem. The AlphaNode comes with an API endpoint, which is also unsecured, but I actually want to connect to that API endpoint from an external IP.
So, the best I can do is adding an NGNIX as a (reverse) proxy with authentication to secure access to the API endpoint of the Alpha node(s). Practically, I have three alpha nodes so adding load balancing makes sense. I found a config that does load balancing to three alpha nodes in Docker Compose although, without authenication.:
https://gist.github.com/MichelDiz/42954e321620159c872c35c20e9d85c6
Now, the million-dollar question I have is, how do I add an NGNIX load balance to Kubernetes that authenticates and load balances incoming traffic to my (ClusterIP) alpha nodes?
Any pointers?
Any help?
If you want to do it that hard way, you can deploy your own nginx deployment and expose it as LoadBalancer Service. You can configure it with different authentication mechanisms that nginx support.
Instead, you can use Ingress resource backed by an IngressController that supports authentication. Check if your kubernetes distribution provides an IngressController and if it is supports auth. If not, you can install nginx or Traefik IngressControllers which supports authentication.
Looks like GKE ingress has recently added support for IAP bassed authentication which is still in beta - https://cloud.google.com/iap/docs/enabling-kubernetes-howto
If you are looking for more traditional type of authentication with ingress, install nginx or traefik and use the kubernetes.io/ingress.class annotation so that only IngressController claims your ingress resource - https://kubernetes.github.io/ingress-nginx/user-guide/multiple-ingress/

Using traefik for docker internal traffic via websockets

I'm using docker in swarm mode for the services in my application and traefik to handle, well, the traffic. My goal is to make a separate service for each API section my application has (so for example requests on domain.com/api/foo_api go to the foo_api service and requests on domain.com/api/bar_api go to the bar_api service.
Now all this is pretty straightforward with traefik. However, I'm also using the API services with other internal services not related to the API. They use a websocket connection to the internal docker URL, so currently it's ws://api:api_port/ws. However, if I split up the API part I'd need something like ws://foo_api:foo_api_port/ws which obviously leaves the service only access to the foo_api, not every other one.
So my question is: Can I route this websocket traffic with traefik similiar to how I do it externally, but internally in the docker net?
Traefik is a north-south reverse proxy. Most people historically in traditional infrastructure would use NGINX or Apache to address inbound - good to see you using a more modern tool. What you are describing is an east-west pattern of communication inside your firewall behind traefik (assuming you control all ingress through traefik).
Have you considered using service discovery and registry capabilities with tools like Hashicorp Consul - https://consul.io?
The idea of having service discovery is so that your containers / services inside the swarm can be discovered and made available through the registry and referenced in proximation to each other by name without the pains of manual labor in building and maintaining complicated name-IP-lookups. Most understand this historically in a more persistent model behind DNS SRV which requires external query. Consul can still support that legacy reference integration as well.
This site might help you along: https://attx-project.github.io/Consul-for-Service-Discovery-on-Docker-Swarm.html
They appear to have addressed a similar case to yours. And the work is likely reusable with a few tweaks.

deploying Spring Boot Rest Service with https enabled, in kubernetes

I have developed a spring boot based REST API service and enabled https on it by using a self signed cert keystore (to test locally), and it works well.
server.ssl.key-store=classpath:certs/keystore.jks
server.ssl.key-store-password=keystore
server.ssl.key-store-type=PKCS12
server.ssl.key-alias=tomcat
Now, I want to package a docker image and deploy this service in a kubernetes cluster. I know I can expose the service as a NodePort and access it externally.
What I want to know is, I doubt that my self signed cert generated in local machine will work when deployed in kubernetes cluster. I researched and found a couple of solutions using kubernetes ingress, kubernetes secrets, etc. I am confused as to what will be the best way to go about doing this, so that I can access my service running in kubernetes through https. What changes will I need to do to my REST API code?
UPDATED NOTE : Though I have used a self signed cert for testing purposes, I can obtain a CA signed cert from my company and use it for production. My question is more on the lines of, For a REST API service which already uses a SSL/TLS based connection, what are some of the better ways to deploy and access the cert in kubernetes cluster , eg: package in the application itself, use Secrets, or scrap the application's SSL configuration and use Ingres instead, etc. Hope my question makes sense :)
Thanks for any suggestions.
Well it depends on the way you want to expose your service. Basically you have either an ingress, an external load balancer (only in certain cloud evironments available) or a Service thats routed to a Port (either via NodePort or HostPort) as options.
Attention: Our K8S Cluster is self hosted so I have no reliable information about external load balancers in K8S and will therefore omit that option.
If you want to expose your service directly behind one of your domains on port 80 (e.g. https://app.myorg.org) you'll want to use ingress. But if you don't need that and you can live with a specific port the NodePort approach should do the trick (e.g. https://one.ofyourcluster.servers:30000/).
Let's assume you want to try the ingress approach than you need to add the certificates to the ingress definition in K8S instead of the spring boot application or you must additionally specify that the service is reachable via https itself in the ingress. The way to do it may differ from ingress controller to ingress controller.
For the NodePort/HostPort you just need to enable SSL in your application.
Despite that you also need a valid certificate e.g. issued by https://letsencrypt.org/
Actually for K8S there are some projects that can fetch you a letsencrypt certificate automatically if you to use ingresses. (e.g. https://github.com/jetstack/cert-manager/)

Google Cloud Platform DataFlow workers IP addresses

Is it possible to know what range of external IP the DataFlow workers on GCP are using? The goal is to set-up some kind of IP filtering on an external service, so that only our DataFlow jobs running on GCP can access the service.
The best solution would be to upgrade so that you can use SSL or other mechanisms of strong authentication.
You can use the --network= option to control the GCE Network that the worker VMs are assigned to. Take a look at the GCE docs on networking for details on how to set up a VPN (like the comment from Elmar suggested). You could also look at setting up a single machine in the network with a static, external IP and using it as a proxy for the other VMs in the network.
This is not a use pattern we have tested, so there may be issues with latency or throughput of traffic through the proxy/VPN. You will likely need to be careful to only send your traffic through this proxy so that you don’t accidentally hijack the traffic used by each worker to communicate with the Dataflow service.

Resources