How to know if load balancing works in Docker Swarm?

I created a service called accountservice and then replicated it 3 times. In my service I get the IP address of the producing service instance and include it in the JSON response. The problem is that every time I run curl $manager-ip:6767/accounts/10000 the returned IP is the same as before (I tried 100 times).
manager-ip environment variable:
set -x manager-ip (docker-machine ip swarm-manager-1)
Here's my Dockerfile:
FROM iron/base
EXPOSE 6767
ADD accountservice-linux-amd64 /
ADD healthchecker-linux-amd64 /
HEALTHCHECK --interval=3s --timeout=3s CMD ["./healthchecker-linux-amd64", "-port=6767"] || exit 1
ENTRYPOINT ["./accountservice-linux-amd64"]
And here's my automation script to build and run service:
#!/usr/bin/env fish
set -x GOOS linux
set -x CGO_ENABLED 0
set -x GOBIN ""
eval (docker-machine env swarm-manager-1)
go get
go build -o accountservice-linux-amd64 .
pushd ./healthchecker
go get
go build -o ../healthchecker-linux-amd64 .
popd
docker build -t azbshiri/accountservice .
docker service rm accountservice
docker service create \
--name accountservice \
--network my_network \
--replicas=1 \
-p 6767:6767 \
-p 6767:6767/udp \
azbshiri/accountservice
And here's the function I call to get the IP:
package common
import "net"
func GetIP() string {
    addrs, err := net.InterfaceAddrs()
    if err != nil {
        return "error"
    }
    for _, addr := range addrs {
        ipnet, ok := addr.(*net.IPNet)
        if ok && !ipnet.IP.IsLoopback() {
            if ipnet.IP.To4() != nil {
                return ipnet.IP.String()
            }
        }
    }
    panic("Unable to determine local IP address (non loopback). Exiting.")
}
And I scale the service using the command below:
docker service scale accountservice=3

A few things:
Your results are normal. By default, a Swarm service has a VIP (virtual IP) in front of the service tasks to act as a load balancer. Trying to reach that service from inside the virtual network will only show that IP.
If you want to use a round-robin approach and skip the VIP, you could create the service with --endpoint-mode=dnsrr; that would then return a different service task for each DNS request (but your client might be caching DNS names, causing it to show the same IP, which is why the VIP is usually better).
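For example, recreating the service from the question in DNS round-robin mode might look roughly like this (a sketch reusing the names above; note that --endpoint-mode dnsrr cannot be combined with ingress-mode port publishing, so the -p flags are dropped here):
docker service rm accountservice
docker service create \
    --name accountservice \
    --network my_network \
    --endpoint-mode dnsrr \
    --replicas=3 \
    azbshiri/accountservice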
If you wanted to get a list of IPs for the task replicas, do a dig tasks.<servicename> from inside the service's network.
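For instance, from any container attached to the same network you could resolve the task list with something like the following (nslookup is used because it ships with busybox/alpine; dig works the same way if installed, and if my_network is an overlay network it must have been created with --attachable for a standalone container to join it):
docker run --rm --network my_network alpine nslookup tasks.accountservice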
If you wanted to test something easy, have your service create a random string, or use the hostname, on startup and return that so you can tell the different replicas apart when accessing them. An easy example is to run one service using the image elasticsearch:2, which will return JSON on port 9200 with a different random name per container.
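In the Go service from the question, the hostname approach could be a sketch as small as the following (GetHostname is a hypothetical helper, not part of the original code; inside a container os.Hostname() returns the container ID by default, so each replica reports a different value):
package common

import "os"

// GetHostname returns the container's hostname (the container ID by default),
// which makes it easy to tell replicas apart in a JSON response.
func GetHostname() string {
    hostname, err := os.Hostname()
    if err != nil {
        return "unknown"
    }
    return hostname
}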

Related

Expose port using DockerOperator

I am using DockerOperator to run a container, but I do not see any option to publish the required port. I need to publish a webserver port when the task is triggered. Any help or guidance would be appreciated. Thank you!
First, don't forget that docker_operator is deprecated, replaced (now) by airflow.providers.docker.operators.docker.DockerOperator.
Second, I don't know of a command to expose a port in a live (running) Docker container.
As described in this article from Sidhartha Mani:
Specifically, I needed access to the filled mysql database.
I could think of a few ways to do this:
Stop the container and start a new one with the added port exposure. docker run -p 3306:3306 -p 8080:8080 -d java/server.
The second option is to start another container that links to this, and knows how to port forward.
Set up iptables rules to forward a host port into the container.
So:
Following the existing rules, I created my own rule to forward to the container:
iptables -t nat -A DOCKER ! -i docker0 -p tcp --dport 3306 -j DNAT \
    --to-destination 172.17.0.2:3306
This just says that whenever a packet is destined to port 3306 on the host, forward it to the container with ip 172.17.0.2, and its port 3306.
Once I did this, I could connect to the container using host port 3306.
I wanted to make it easier for others to expose ports on live containers.
So, I created a small repository and a corresponding docker image (named wlan0/redirect).
The same effect as exposing host port 3306 to container 172.17.0.2:3306 can be achieved using this command.
This command saves the trouble of learning how to use iptables.
docker run --privileged -v /proc:/host/proc \
-e HOST_PORT=3306 -e DEST_IP=172.17.0.2 -e DEST_PORT=3306 \
wlan0/redirect:latest
In other words, this kind of solution is not something you can implement from a command run in the container, or through an Airflow operator.
As per my understanding, DockerOperator will create a new container, so why is there no way of exposing ports while creating that new container?
First, the EXPOSE part is, as I mentioned here, just metadata added to the image. It is not mandatory.
The runtime (docker run) -p option is about publishing, not exposing: publishing a port and mapping it to a host port (see above) or another container port.
That might not be needed in an Airflow environment, where there is a default network, and even the possibility to set up a custom network or subnetwork.
Which means other (Airflow) containers attached to the same network should be able to access the ports of any container in said network, without needing any -p (publish) option or EXPOSE directive.
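To illustrate that point with plain Docker (hypothetical names, outside Airflow): two containers attached to the same user-defined network can reach each other by container name, with no EXPOSE or -p involved:
docker network create demo_net
docker run -d --name web --network demo_net nginx
docker run --rm --network demo_net curlimages/curl -s http://web:80/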
In order to accomplish this, you will need to subclass the DockerOperator and override the initializer and _run_image_with_mounts method, which uses the API client to create a container with the specified host configuration.
# Imports needed to make this snippet self-contained; adjust to your Airflow and
# docker provider versions (`stringify` is a helper defined alongside DockerOperator
# in recent versions of the provider).
from typing import List, Optional, Union

from airflow.exceptions import AirflowException
from airflow.providers.docker.operators.docker import DockerOperator, stringify


class DockerOperatorWithExposedPorts(DockerOperator):
    def __init__(self, *args, **kwargs):
        self.port_bindings = kwargs.pop("port_bindings", {})
        if self.port_bindings and kwargs.get("network_mode") == "host":
            self.log.warning("`port_bindings` is not supported in `host` network mode.")
            self.port_bindings = {}
        super().__init__(*args, **kwargs)

    def _run_image_with_mounts(
        self, target_mounts, add_tmp_variable: bool
    ) -> Optional[Union[List[str], str]]:
        """
        NOTE: This method was copied entirely from the base class `DockerOperator`, for the capability
        of performing port publishing.
        """
        if add_tmp_variable:
            self.environment['AIRFLOW_TMP_DIR'] = self.tmp_dir
        else:
            self.environment.pop('AIRFLOW_TMP_DIR', None)
        if not self.cli:
            raise Exception("The 'cli' should be initialized before!")
        self.container = self.cli.create_container(
            command=self.format_command(self.command),
            name=self.container_name,
            environment={**self.environment, **self._private_environment},
            ports=list(self.port_bindings.keys()) if self.port_bindings else None,
            host_config=self.cli.create_host_config(
                auto_remove=False,
                mounts=target_mounts,
                network_mode=self.network_mode,
                shm_size=self.shm_size,
                dns=self.dns,
                dns_search=self.dns_search,
                cpu_shares=int(round(self.cpus * 1024)),
                port_bindings=self.port_bindings if self.port_bindings else None,
                mem_limit=self.mem_limit,
                cap_add=self.cap_add,
                extra_hosts=self.extra_hosts,
                privileged=self.privileged,
                device_requests=self.device_requests,
            ),
            image=self.image,
            user=self.user,
            entrypoint=self.format_command(self.entrypoint),
            working_dir=self.working_dir,
            tty=self.tty,
        )
        logstream = self.cli.attach(container=self.container['Id'], stdout=True, stderr=True, stream=True)
        try:
            self.cli.start(self.container['Id'])
            log_lines = []
            for log_chunk in logstream:
                log_chunk = stringify(log_chunk).strip()
                log_lines.append(log_chunk)
                self.log.info("%s", log_chunk)
            result = self.cli.wait(self.container['Id'])
            if result['StatusCode'] != 0:
                joined_log_lines = "\n".join(log_lines)
                raise AirflowException(f'Docker container failed: {repr(result)} lines {joined_log_lines}')
            if self.retrieve_output:
                return self._attempt_to_retrieve_result()
            elif self.do_xcom_push:
                if len(log_lines) == 0:
                    return None
                try:
                    if self.xcom_all:
                        return log_lines
                    else:
                        return log_lines[-1]
                except StopIteration:
                    # handle the case when there is not a single line to iterate on
                    return None
            return None
        finally:
            if self.auto_remove == "success":
                self.cli.remove_container(self.container['Id'])
            elif self.auto_remove == "force":
                self.cli.remove_container(self.container['Id'], force=True)
Explanation: The create_host_config method of the APIClient has an optional port_bindings keyword argument, and the create_container method has an optional ports argument. These aren't exposed by the DockerOperator, so you have to override _run_image_with_mounts with a copy of the base implementation that supplies those arguments from the port_bindings field set in the initializer. You can then pass the ports to publish as a keyword argument. Note that in this implementation, the argument is expected to be a dictionary:
t1 = DockerOperatorWithExposedPorts(image=..., task_id=..., port_bindings={5000: 5000, 8080:8080, ...})

Unable to run couchbase container with non-standard ports

Update 04/09/20 (dd/mm/yyyy)
I tried running my couchbase instance on custom ports by playing with couchbase configuration files.
docker run -d --privileged --memory 3200M --name bumblebase \
-v '$PWD/static_config:/opt/couchbase/etc/couchbase/static_config' \
-p 3456-3461:3456-3461 -p 6575-6576:6575-6576 \
couchbase:community-6.6.0
static_config is a file I created with the following content, following the instructions on this page (bottom section):
{rest_port, 3456}.
{query_port, 3458}.
{fts_http_port, 3459}.
{cbas_http_port, 3460}.
{eventing_http_port, 3461}.
{memcached_port, 6575}.
But then I cannot access my couchbase instance at all (either the web UI or the REST API). I tried to point my local ports to both the custom and the default ports (-p 3456-3461:8091-8096) but neither worked, and the problem disappears only if I remove the -v option, which brings me back to the original post scenario.
On a side note, I'm still trying to play with setting-alternate-address without any success so far. For some reason, when I set the alternate hostname (which seems to be required for this to work), accessing the alternate address takes a long time to load and eventually fails with a timeout error.
Original post
Since I develop many applications running different couchbase clusters, I have to run them in containers on different ports. I created my container with the following command:
docker run -d --memory 2048M --name "my-database" \
-p 3456-3461:8091-8096 \
-p 6210-6211:11210-11211 \
couchbase
I then added buckets, and going to the web UI at localhost:3456 works fine.
Here is my server info:
In my code I have the following connect function:
var cluster *gocb.Cluster

func Cluster() *gocb.Cluster {
    return cluster
}

func init() {
    c, err := gocb.Connect(
        "couchbase://127.0.0.1:3456",
        gocb.ClusterOptions{
            Username: "Administrator",
            Password: "password",
            TimeoutsConfig: gocb.TimeoutsConfig{
                ConnectTimeout: 30 * time.Second,
            },
        },
    )
    if err != nil {
        panic(err)
    }
    cluster = c
}
The panic doesn't trigger, but whenever I try to perform a KV operation, it fails with the following error:
ambiguous timeout | {"InnerError":{"InnerError":{"InnerError":{},"Message":"ambiguous timeout"}},"OperationID":"Add","Opaque":"0x0","TimeObserved":2501547947,"RetryReasons":null,"RetryAttempts":0,"LastDispatchedTo":"","LastDispatchedFrom":"","LastConnectionID":""}
And this error only comes when I try to connect to my Docker couchbase instance. If I run the Couchbase Server application and connect to the appropriate port (8091), it works perfectly fine:
var cluster *gocb.Cluster

func Cluster() *gocb.Cluster {
    return cluster
}

func init() {
    c, err := gocb.Connect(
        "couchbase://localhost",
        gocb.ClusterOptions{
            Username: "Administrator",
            Password: "password",
            TimeoutsConfig: gocb.TimeoutsConfig{
                ConnectTimeout: 30 * time.Second,
            },
        },
    )
    if err != nil {
        panic(err)
    }
    cluster = c
}
I checked the credentials and they are correct; also, replacing localhost with 0.0.0.0 or 127.0.0.1 didn't help at all.

Vault Docker Image - Can't get REST Response

I am deploying the Vault Docker image on Ubuntu 16.04. I can successfully initialize it from inside the container itself, but I can't get any REST responses, and even curl does not work.
I am doing the following:
Create a config file local.json:
{
    "listener": [{
        "tcp": {
            "address": "127.0.0.1:8200",
            "tls_disable": 1
        }
    }],
    "storage": {
        "file": {
            "path": "/vault/data"
        }
    },
    "max_lease_ttl": "10h",
    "default_lease_ttl": "10h"
}
under the /vault/config directory.
Run the command to start the image:
docker run -d -p 8200:8200 -v /home/vault:/vault --cap-add=IPC_LOCK vault server
Enter the bash terminal of the image:
docker exec -it containerId /bin/sh
Run the following commands inside:
export VAULT_ADDR='http://127.0.0.1:8200' and then vault init
This works fine, but when I try to send a REST request to check whether Vault is initialized:
GET request to the following URL: http://Ip-of-the-docker-host:8200/v1/sys/init
I get no response.
Even the curl command fails:
curl http://127.0.0.1:8200/v1/sys/init
curl: (56) Recv failure: Connection reset by peer
I didn't find a proper explanation anywhere online of what the problem is, or whether I am doing something wrong.
Any ideas?
If a server running in a Docker container binds to 127.0.0.1, it's unreachable from anything outside that specific container (and since containers usually only run a single process, that means it's unreachable by anyone). Change the listener address to 0.0.0.0:8200; if you need to restrict access to the Vault server, bind it to a specific host address in the docker run -p option.
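For instance, the listener block of the local.json above could be changed to something like this sketch (the rest of the file stays as it is):
"listener": [{
    "tcp": {
        "address": "0.0.0.0:8200",
        "tls_disable": 1
    }
}],
After recreating the container, curl http://Ip-of-the-docker-host:8200/v1/sys/init should return a response.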

Couldn't connect to Docker Aerospike from host

I'm running the Aerospike server in Docker.
$ docker run -d --name aerospike aerospike/aerospike-server
0ad3b2df67bd17f896e87ed119758d9af7fcdd9b82a8632828e01072e2c5673f
It started successfully.
$ docker ps
CONTAINER ID   IMAGE                        COMMAND                CREATED         STATUS         PORTS           NAMES
0ad3b2df67bd   aerospike/aerospike-server   "/entrypoint.sh asd"   4 seconds ago   Up 2 seconds   3000-3003/tcp   aerospike
I found the IP address of the container using the command below.
$ docker inspect -f '{{.NetworkSettings.IPAddress }}' aerospike
172.17.0.2
When I try to connect with aql using the command below, it is successful as well.
$ docker run -it aerospike/aerospike-tools aql -h $(docker inspect -f
'{{.NetworkSettings.IPAddress }}' aerospike)
Aerospike Query Client
Version 3.15.0.3
C Client Version 4.2.0
Copyright 2012-2017 Aerospike. All rights reserved.
aql> select * from test.person
0 rows in set (0.002 secs)
Now I am trying to connect to the Aerospike server in Docker using the Java client on the host machine.
public class AerospikeDemo {
    public static void main(String[] args) {
        AerospikeClient client = new AerospikeClient("172.17.0.2", 3000);
        Key key = new Key("test", "demo", "putgetkey");
        //Key key2 = new Key("1", "2", "3");
        Bin bin1 = new Bin("bin1", "value1");
        Bin bin2 = new Bin("bin2", "value2");
        Bin bin3 = new Bin("bin2", "value3");
        // Write a record
        client.put(null, key, bin1, bin2, bin3);
        // Read a record
        Record record = client.get(null, key);
        System.out.println("record is " + record);
        System.out.println("record bins is " + record.bins);
        client.close();
    }
}
When I run the above program, I get the error below:
objc[3446]: Class JavaLaunchHelper is implemented in both
/Library/Java/JavaVirtualMachines/jdk1.8.0_144.jdk/Contents/Home/bin/java (0x10f7b14c0) and /Library/Java/JavaVirtualMachines/jdk1.8.0_144.jdk/Contents/Home/jre/lib/libinstrument.dylib (0x10f8794e0). One of the two will be used. Which one is undefined.
Exception in thread "main" com.aerospike.client.AerospikeException$Connection:
Error Code 11: Failed to connect to host(s): 172.17.0.2 3000 Error Code 11: java.net.SocketTimeoutException: connect timed out
at com.aerospike.client.cluster.Cluster.seedNodes(Cluster.java:413)
at com.aerospike.client.cluster.Cluster.tend(Cluster.java:306)
at com.aerospike.client.cluster.Cluster.waitTillStabilized(Cluster.java:271)
at com.aerospike.client.cluster.Cluster.initTendThread(Cluster.java:181)
at com.aerospike.client.AerospikeClient.<init>(AerospikeClient.java:210)
at com.aerospike.client.AerospikeClient.<init>(AerospikeClient.java:151)
at com.demo.aerospike.AerospikeDemo.main(AerospikeDemo.java:12)
I've tried both AerospikeClient("172.17.0.2", 3000) and AerospikeClient("localhost", 3000).
I see in the Dockerfile that port 3000 is exposed to the host, but I'm not sure why I'm not able to use the Aerospike server in Docker.
The IP 172.17.0.2 is only accessible within Docker (which is why you can use another container to connect). If you want to connect from your host, you need to map the respective port.
docker run -d --name aerospike -p 3000:3000 aerospike/aerospike-server
Afterwards you can use:
AerospikeClient client = new AerospikeClient("localhost", 3000);

check_disk not generating alerts: nagios

I am new to Nagios.
I am trying to configure the "check_disk" service for one host but I am not getting the expected results.
I should get emails when disk usage goes beyond 80%.
There is already a service defined for this task with multiple hosts, as below:
define service{
    use                   local-service         ; Name of service template to use
    host_name             localhost, host1, host2, host3, host4, host5, host6
    service_description   Root Partition
    check_command         check_local_disk!20%!10%!/
    contact_groups        unix-admins,db-admins
}
The Issue:
Further, I tried to test a single host, i.e. "host2". The current usage of host2 is as follows:
# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rootvg-rootvol 94G 45G 45G 50% /
So, to get instant emails, I wrote another service as below, with warning set to <60% and critical set to <40% free space.
define service{
    use                   local-service
    host_name             host2
    service_description   Root Partition again
    check_command         check_local_disk!60%!40%!/
    contact_groups        dev-admins
}
But I still do not receive any emails for it.
Where is it going wrong?
The "check_local_disk" command is defined as below:
define command{
    command_name   check_local_disk
    command_line   $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}
Your command definition is currently set up to only check your Nagios server's own disk, not the remote hosts (such as host2). You need to define a new command that executes check_disk on the remote host via NRPE (Nagios Remote Plugin Executor).
On Nagios server, define the following:
define command {
    command_name   check_remote_disk
    command_line   $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_disk -a $ARG1$ $ARG2$ $ARG3$
    register       1
}
define service{
    use                   generic-service
    host_name             host1, host2, host3, host4, host5, host6
    service_description   Root Partition
    check_command         check_remote_disk!20%!10%!/
    contact_groups        unix-admins,db-admins
}
Restart the Nagios service.
On the remote host:
Ensure you have NRPE plugin installed.
Instructions for Ubuntu: http://tecadmin.net/install-nrpe-on-ubuntu/
Instructions for CentOS / RHEL: http://sharadchhetri.com/2013/03/02/how-to-install-and-configure-nagios-nrpe-in-centos-and-red-hat/
Ensure there is a command defined for check_disk on the remote host. This is usually included in nrpe.cfg, but commented out; you'd have to un-comment that line (see the sketch after this list).
Ensure you have the check_disk plugin installed on the remote host. Mine is located at: /usr/lib64/nagios/plugins/check_disk
Ensure that the allowed_hosts field of nrpe.cfg includes the IP address / hostname of your Nagios server.
Ensure that the dont_blame_nrpe field of nrpe.cfg is set to 1 to allow command-line arguments to NRPE commands: dont_blame_nrpe=1
If you made any changes, restart the nrpe service.
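For reference, the relevant nrpe.cfg entries mentioned above might look something like the sketch below (the plugin path and the Nagios server IP are assumptions; adjust them to your setup). The check_disk command accepts the warning, critical, and partition values that check_remote_disk passes through -a:
allowed_hosts=127.0.0.1,<nagios-server-ip>
dont_blame_nrpe=1
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$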
