Metrics to monitor ALB quotas

Is there a way to monitor the maximum number of target groups per ALB? The limit is 100 and can easily be reached when using the ALB Ingress Controller in Kubernetes.

Related

Kubernetes is affecting CPU usage of pods

In my environment, one Kubernetes pod, let's call it P1, is connected to the outside of the cluster via a message-oriented middleware (MOM). The latter is publicly exposed through the following Service:
apiVersion: v1
kind: Service
metadata:
  name: my-mom-svc
spec:
  externalIPs:
  - aaa.bbb.ccc.ddd
  selector:
    app: my-mom
  ports:
  - port: pppp
    name: my-port-name
Clients are outside the k8s cluster and connect to the MOM through this Service. P1 processes the messages coming from the MOM and sent by the clients. My goal is to maximize the CPU used by P1.
I defined a LimitRange so that P1 can use all the available CPUs on a worker node.
However, in my test environment it does not use all of them, and in fact the more pods like P1 I create, the less CPU each of them uses (note that there is only one P1-like pod per worker node).
I tried to define a ResourceQuota with a huge max CPU number, but the result does not change.
In desperation I exec'd into the pod and ran 'stress --cpu x', and there the pod used all x CPUs.
I tried the same test using 'raw' Docker containers, that is, running my environment without Kubernetes and using only Docker containers. In this case the containers use all the available CPUs.
Are there any default Kubernetes limitations or behaviours restricting something? How can I modify them?
Thanks!
A few things to note:
The fact that you were able to stress the CPU fully with stress --cpu x after logging into your pod's container is evidence that Kubernetes is functioning correctly when the pod requests resources on the worker node. So, resource requests and limits are working.
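For reference, here is a minimal sketch of how explicit CPU requests and limits are declared on a pod like P1 (the pod name, image, and CPU figures below are made-up placeholders, not taken from your setup). If no limit is set at all, the container may use whatever CPU is idle on the node, which matches what you observed with stress:
apiVersion: v1
kind: Pod
metadata:
  name: p1                              # hypothetical name for your worker pod
spec:
  containers:
  - name: mom-consumer
    image: my-mom-consumer:latest       # placeholder image
    resources:
      requests:
        cpu: "2"                        # the scheduler reserves 2 vCPUs on the node
      limits:
        cpu: "4"                        # the container may use at most 4 vCPUs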
You should consider whether the network traffic that one P1 pod is handling is actually enough to generate high CPU utilisation. Typically you need to generate a VERY HIGH amount of network traffic to get a service to use a lot of CPU, since such a workload is network-latency bound rather than compute bound.
You describe that when you increase the number of P1 pods, the load per pod decreases; that is because your Service object is doing its job. Service objects are responsible for load balancing incoming traffic equally across all the pods serving it. The fact that CPU load per pod drops is evidence that, with more pods available to serve the incoming traffic, the load is naturally distributed across them by the Service abstraction.
When you define a very large number for your request quota, two things can happen:
a. If there is no admission control in your cluster (an application that processes all incoming API requests and performs actions on them, like validation/compliance/security checks), your pod will be stuck in the Pending state, since there will be no node big enough for the scheduler to fit your pod.
b. If there is an admission controller set up, it will try to enforce a maximum allowable quota by overriding the value in your manifest before it is accepted by the API server. So even if you specify 10 vCPUs in your request, if the admission controller has a rule which doesn't allow more than 2 vCPUs, the value will be changed to 2 by the controller. You can verify this by printing your Pod and checking its resource fields: if they are the same as the ones you specified when applying, you probably don't have such an admission controller in your cluster (a sketch of such a policy follows below).
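As an illustration of point b., a LimitRange enforced by the built-in LimitRanger admission controller could look roughly like this (the name, namespace and CPU numbers are hypothetical). A container asking for more CPU than max is rejected, and a container that specifies nothing gets the defaults applied:
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-limit-range        # hypothetical name
  namespace: default
spec:
  limits:
  - type: Container
    max:
      cpu: "2"                 # requests/limits above this are rejected
    default:
      cpu: "1"                 # applied when a container sets no limit
    defaultRequest:
      cpu: "500m"              # applied when a container sets no request
You can check whether such a policy exists with kubectl get limitrange --all-namespaces, and compare the values on the running Pod with kubectl get pod <pod-name> -o yaml.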
I would suggest that a better way to approach the problem is to test your Pod with a reasonable/realistic maximum amount of traffic that you expect on one node, and record the CPU and memory usage. Then, instead of trying to get the Pod to use more CPU, you can resize your node to a smaller instance; this way your pod will have less CPU available and hence better utilisation :)
This is a very common design pattern (especially for scenarios like yours where you have 1 pod per worker node). It allows light-weight, easily scaled-out architectures which can perform really well along with autoscaling of nodes.

How to collect messages (total number and size) between microservices?

I have a microservices-based software architecture.
There is a PHP application which orchestrates the communication among the microservices and the application's whole logic.
I need to model the communication between the microservices as a graph.
There will be weighted edges, which will represent the affinities between the microservices.
I am searching for a tool to collect all messages and their sizes.
I have read that there are distributed tracing systems like Zipkin, which I have already deployed, that could accomplish this task.
But I cannot find how to collect the messages I want.
This is the PHP library I used for the instrumentation of my app:
https://github.com/openzipkin/zipkin-php
Any ideas about other tools or how to use Zipkin differently to achieve my goal?
Let me add my three bits to this thread. Speaking of Envoy: yes, when attached to your application it adds a lot of useful features from the observability bucket, e.g. network-level statistics and tracing.
Here is the question: have you considered running your legacy apps inside a service mesh, like Istio?
Istio simplifies the deployment and configuration of Envoy for you. It injects a sidecar container (istio-proxy, in fact an Envoy instance) into your application Pods, and gives you extra features like a set of service metrics out of the box*.
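For example, enabling automatic sidecar injection for a namespace is just a label (the namespace name below is a placeholder); once it is set, Istio adds the istio-proxy container to every new Pod created in that namespace:
apiVersion: v1
kind: Namespace
metadata:
  name: my-microservices          # placeholder namespace
  labels:
    istio-injection: enabled      # tells Istio to inject the Envoy sidecar automatically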
Example: stats produced by Envoy in Prometheus format, like istio_request_bytes, are visualized in the Kiali Metrics dashboard for inbound traffic as request_size.
*As mentioned by @David Kruk, you still need to have a Prometheus server deployed in your cluster to be able to pull these metrics into the Kiali dashboards.
You can learn more about Istio in its documentation. There is also a dedicated section on how to visualize the metrics collected by Istio (e.g. request size).

AWS EKS cluster auto scaling

I have an AWS EKS cluster (version 1.12) for my applications. We have deployed 6 apps in the cluster and everything is working fine. While creating nodes I added an autoscaling node group which spans availability zones, with a minimum of 3 and a maximum of 6 nodes, so the desired 3 nodes are running fine.
My scenario is this:
when a memory spike happens I need to get more nodes, up to the maximum I set in the auto scaling group, but at cluster set-up time I did not add the Cluster Autoscaler.
Can somebody please address the following doubts:
As per the AWS documentation, the cluster autoscaler won't support a node group that spans multiple AZs.
If we do have to create multiple node groups as per the AWS doc, how do we specify min/max nodes? Is it for the entire cluster?
How can I achieve autoscaling on a memory metric, since this doesn't come by default like the CPU metric?
You should create one node group for every AZ. So if your cluster size is 6 nodes, create node groups of 2 instances, one in each AZ. You can also spread the pods across AZs for high availability. If you look at the cluster autoscaler documentation, it recommends:
Cluster autoscaler does not support Auto Scaling Groups which span multiple Availability Zones; instead you should use an Auto Scaling Group for each Availability Zone and enable the --balance-similar-node-groups feature. If you do use a single Auto Scaling Group that spans multiple Availability Zones you will find that AWS unexpectedly terminates nodes without them being drained because of the rebalancing feature.
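For reference, a rough sketch of how this looks in the cluster-autoscaler Deployment args (the image tag, ASG names and node counts are placeholders; RBAC, service account permissions and AWS region configuration are omitted):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler          # needs permissions to manage the ASGs (not shown)
      containers:
      - name: cluster-autoscaler
        image: k8s.gcr.io/cluster-autoscaler:v1.12.8  # placeholder tag; match your Kubernetes minor version
        command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        - --balance-similar-node-groups               # keep the per-AZ groups the same size
        - --nodes=1:2:eks-nodes-az-a                  # min:max:ASG-name, one flag per AZ
        - --nodes=1:2:eks-nodes-az-b
        - --nodes=1:2:eks-nodes-az-c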
I am assuming you want to scale the pods based on memory. For that you will have to use the metrics server or Prometheus and create an HPA which scales based on memory (a minimal sketch follows below).
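This is what a memory-based HorizontalPodAutoscaler might look like (the deployment name, replica bounds and target utilisation are made-up; metrics-server must be installed, and the autoscaling/v2beta2 API used here is what was available around Kubernetes 1.12):
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa                   # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                     # placeholder: the Deployment to scale
  minReplicas: 3
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80       # scale out when average memory usage exceeds 80% of requests
The HPA adds Pods when memory utilisation rises; the cluster autoscaler then adds nodes when those extra Pods no longer fit, which together gives the memory-driven node scaling you describe.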

How to get Max Egress Flows to match Configured Max Egress Flows in Solace?

In the Solace CLI, I type in the following command:
solace> show message-spool message-vpn Solace_VPN
The result output contains a difference between "actual" versus "configured":
Flows
Max Egress Flows: 100
Configured Max Egress Flows: 1000
Current Egress Flows: 60
Max Ingress Flows: 100
Configured Max Ingress Flows: 1000
Current Ingress Flows: 22
How do I get "Max Egress Flows" and "Configured Max Egress Flows" to align?
Is it as easy as restarting my Message VPN (but this will disconnect all my existing clients)?
Is this just a limitation of the community edition?
From the output, it would appear that your message-broker is configured to only operate at the default 100 connection scaling tier.
There are two options to get the limits to align:
1. If you meet the system requirements for the 1000-connection scaling tier, you can increase your connection scaling tier to 1000 using the documented procedure.
2. Manually lower the VPN limits. This can easily be done by going to the "Message VPNs, ACLs & Bridges" tab in SolAdmin, clicking "Edit Message VPN", and adjusting the limits in the "Advanced Properties" tab.

Ingress and Egress Rate of Solace Appliance

I want to set the ingress and egress rate of a Solace appliance at the 3 levels below:
1. Appliance level
2. Message VPN level
3. Queue level
Please let me know whether this is possible and share the CLI commands for the same.
I assume you're asking if there is a way to set rate limits on ingress or egress traffic.
There is an egress traffic shaping facility available on some NABs (e.g. NAB-0610EM, NAB-0210EM-04, NAB-0401ET-04, and NAB-0801ET-04), but it is per physical interface (port), e.g.:
solace(configure/interface)# traffic-shaping
solace(configure/interface/traffic-shaping)# egress
solace(configure/interface/traffic-shaping/egress)# rate-limit <number-in-MBPS>
If your network RTT is significant enough, you might be able to do some rate limiting by constraining the TCP maximum send window size to a desired bandwidth-delay product. Note that this is not a generic solution and will only work in certain cases, depending on the network environment. It can be set on the client profile for egress traffic:
solace(configure)# client-profile <name> message-vpn <vpn-name>
solace(configure/client-profile)# tcp max-wnd <num-kilo-bytes>
There is nothing you can do for rate-limiting ingress traffic on the appliance.
I don't believe SolOS-TR supports rate limiting on the appliance side.
https://sftp.solacesystems.com/Portal_Docs/#page/Solace_Router_Feature_Provisioning/B_Managing_Consumption_of_Router_Resources.html
