I am experiencing a network issue with Docker that I haven't seen before. Could it be related to my Ubuntu network configuration, or to my Docker setup?
Sending build context to Docker daemon 36.86kB
Step 1/2 : FROM centos:centos7
---> 5e35e350aded
Step 2/2 : RUN curl https://google.com/
---> Running in d65fe6ad9d57
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to 2a00:1450:4007:808::200e: Cannot assign requested address
regards
The problem was the network I was using, which was provided by my cellphone (an Android hotspot on a Xiaomi Redmi 7). I still cannot figure out the real cause of the malfunction.
I have a medium-sized Zabbix setup: one central Zabbix server and multiple Zabbix proxies, one at each site I'm monitoring. All of them are set up with the official Docker containers. The main server uses:
postgres:11-alpine
zabbix/zabbix-web-nginx-pgsql:alpine-4.0-latest
zabbix/zabbix-snmptraps:alpine-4.0-latest
zabbix/zabbix-server-pgsql:alpine-4.0-latest
The proxies each run just a single Docker image:
zabbix/zabbix-proxy-sqlite3:ubuntu-4.0-latest
The proxies mostly monitor other VMs in the same VMware vCenter.
The problem is that in the proxy logs I see a very large number of network errors that all look somewhat like this:
Zabbix agent item "some.item" on host "SOME HOST" failed: first network error, wait for 15 seconds
As a result, there is a large number of false-positive problems in Zabbix. Mostly it is "Zabbix agent on SOME HOST is unreachable for 5 minutes", but sometimes also other problems that are triggered by .nodata().
There is also a large amount of missing item data, since the hosts with network errors are considered "offline" for a while and no items from them are checked.
I've also tried to investigate it a bit and found the source code that produces this error: https://github.com/zabbix/zabbix/blob/135111a0fd1f16f203226f8632881ac0a8bf541a/src/zabbix_server/poller/poller.c#L302
Unfortunately, the same message seems to be triggered in 3 different failure cases: https://github.com/zabbix/zabbix/blob/135111a0fd1f16f203226f8632881ac0a8bf541a/src/zabbix_server/poller/poller.c#L749
Therefore I couldn't really find out anything that way. Of course, I also looked at CPU, RAM, disk and network usage on the proxies and couldn't find anything that looked out of the ordinary to me.
How should I proceed to find out the cause of these errors? Has anyone else had this happen to them?
Check the network stats for errors in the RX-ERR and TX-ERR columns on some of your hosts and proxy servers:
$ ifconfig -s
Iface      MTU     RX-OK RX-ERR RX-DRP RX-OVR     TX-OK TX-ERR TX-DRP TX-OVR Flg
docker0   1500    953410      0      0      0   1691396      0      0      0 BMRU
enp0s31f  1500 300757888      4      0      0 192733308      0      0      0 BMRU
lo       65536  23324198      0      0      0  23324198      0      0      0 LRU
tap0      1500   1317609      0      0      0   4530946      0      9      0 BMPU
tun0      1500         0      0      0      0       589      0      0      0 MOPRU
vboxnet0  1500         0      0      0      0      2324      0      0      0 BMU
veth20f8  1500     11690      0      0      0    538547      0      0      0 BMRU
veth2238  1500         0      0      0      0        76      0      0      0 BMRU
virbr0    1500   1317609      0      0      0   4519309      0      0      0 BMU
wlp2s0    1500   5584389      0      0      0   4387278      0      0      0 BMU
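If there are many hosts to check, scanning that table by eye gets tedious. The following is a minimal sketch, assuming the output has the column layout shown above; the function name interfaces_with_errors and the SAMPLE text are illustrative only. It reports every interface whose error, drop or overrun counters are non-zero.

```python
def interfaces_with_errors(ifconfig_output):
    """Return {iface: {column: count}} for every non-zero ERR/DRP/OVR counter."""
    lines = [ln for ln in ifconfig_output.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    # Only the error-like columns are of interest, not RX-OK / TX-OK.
    watched = [col for col in header if col.endswith(("-ERR", "-DRP", "-OVR"))]
    problems = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        bad = {col: int(row[col]) for col in watched if int(row[col]) != 0}
        if bad:
            problems[row["Iface"]] = bad
    return problems


# Three rows copied from the output above (whitespace between columns is irrelevant).
SAMPLE = """\
Iface MTU RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
docker0 1500 953410 0 0 0 1691396 0 0 0 BMRU
enp0s31f 1500 300757888 4 0 0 192733308 0 0 0 BMRU
tap0 1500 1317609 0 0 0 4530946 0 9 0 BMPU
"""

if __name__ == "__main__":
    print(interfaces_with_errors(SAMPLE))
```

On a real host the text could come from subprocess.run(["ifconfig", "-s"], capture_output=True, text=True).stdout, provided ifconfig is installed there.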
I did a lot more digging. I also posted this question, along with my discoveries on the Zabbix forum: https://www.zabbix.com/forum/zabbix-troubleshooting-and-problems/393381-getting-a-high-amount-of-unreachable-hosts-and-network-errors-in-local-network
I solved the problem for myself, but one error simply vanished after I didn't touch the system for two weeks; I'm not sure what exactly happened.
The other problem I encountered arose because I am fairly new to Linux in some ways and didn't quite grasp how systemd works.
Systemd looks for a PID file; in the case of Zabbix it looks for it at /run/zabbix/zabbix_agentd.pid, but I had not told Zabbix where to write that file. In the end the fix was to insert PidFile=/run/zabbix/zabbix_agentd.pid in /etc/zabbix/zabbix_agentd.conf.
Before that, the Zabbix agent started and was happy, but it didn't tell systemd about it, so after the timeout that systemd allows daemons for startup, systemd just restarted the Zabbix agent. If the proxy tried to connect to the agent while it was not running, it produced network errors.
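To catch this kind of mismatch early, the two files can be compared directly. This is a rough sketch, assuming the unit file declares the path with systemd's PIDFile= key and the agent config with Zabbix's PidFile= parameter; the helper names and the sample file contents are made up for illustration and are not taken from my actual system.

```python
def read_setting(text, key):
    """Return the last value assigned to `key` in a Key=Value style file, or None."""
    value = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and section headers such as [Service].
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, val = line.partition("=")
        if name.strip() == key:
            value = val.strip()
    return value


def pid_files_match(unit_text, agent_conf_text):
    """True only if both files name the same PID file."""
    unit_pid = read_setting(unit_text, "PIDFile")
    agent_pid = read_setting(agent_conf_text, "PidFile")
    return unit_pid is not None and unit_pid == agent_pid


# Hypothetical file contents, for illustration only.
UNIT = "[Service]\nType=forking\nPIDFile=/run/zabbix/zabbix_agentd.pid\n"
CONF_FIXED = "# agent config\nPidFile=/run/zabbix/zabbix_agentd.pid\nServer=127.0.0.1\n"
CONF_UNSET = "# agent config\nServer=127.0.0.1\n"

if __name__ == "__main__":
    print(pid_files_match(UNIT, CONF_FIXED))
    print(pid_files_match(UNIT, CONF_UNSET))
```

A config without an explicit PidFile line is reported as a mismatch, which is the situation I was in.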
I have an ansible playbook as follows (with irrelevant info stripped):
tasks:
  - name: get public_ip4 output
    shell: curl http://169.254.169.254/latest/meta-data/public-ipv4
    register: public_ip4
  - debug: var=public_ipv4.stdout
  - name: Create docker_pull
    template: <SNIP>
  - name: Pull containers
    command: "sh /root/pull_agent.sh"
  - name: (re)-create the agent
    docker_container:
      name: agent
      image: registry.gitlab.com/project_agent
      state: started
      exposed_ports: 8889
      published_ports: 8889:8889
      recreate: yes
      env:
        host_machine: public_ipv4.stdout
The target machine is an AWS EC2 instance. The purpose is to get its public IPv4 address and pass it as an environment variable to a container. The container runs a Python program, Agent, that will use os.environ.get("host_machine") to access the IPv4 address of the EC2 instance.
The output from the debug logs is (with irrelevant info removed and the IPv4 address replaced with <HIDDEN>):
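For reference, the reading side in the container might look roughly like the sketch below. The function name read_host_ip is hypothetical, and the validation step is an addition of this sketch rather than something the Agent is known to do; it simply rejects any value that is not a real IP address.

```python
import ipaddress
import os


def read_host_ip(env=os.environ):
    """Read host_machine from the environment and check that it is an IP address."""
    value = env.get("host_machine")
    if value is None:
        raise ValueError("environment variable host_machine is not set")
    # ip_address() raises ValueError for anything that is not an IPv4/IPv6 address.
    return str(ipaddress.ip_address(value))


if __name__ == "__main__":
    # A plain dict stands in for the container environment here.
    print(read_host_ip({"host_machine": "52.210.80.33"}))
```

With a check like this, a value that was never filled in would fail loudly at startup instead of being used as an address.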
PLAY [swarm_manager_main] ******************************************************
TASK [get public_ip4 output] ***************************************************
[WARNING]: Consider using the get_url or uri module rather than running
'curl'. If you need to use command because get_url or uri is insufficient you
can add 'warn: false' to this command task or set 'command_warnings=False' in
ansible.cfg to get rid of this message.
changed: [tm001.stackl.io] => {"changed": true, "cmd": "curl http://169.254.169.254/latest/meta-data/public-ipv4", "delta": "0:00:00.013260", "end": "2019-08-02 08:36:26.649844", "rc": 0, "start": "2019-08-02 08:36:26.636584", "stderr": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 12 100 12 0 0 2600 0 --:--:-- --:--:-- --:--:-- 3000", "stderr_lines": [" % Total % Received % Xferd Average Speed Time Time Time Current", " Dload Upload Total Spent Left Speed", "", " 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0", "100 12 100 12 0 0 2600 0 --:--:-- --:--:-- --:--:-- 3000"], "stdout": "<HIDDEN>", "stdout_lines": ["52.210.80.33"]}
TASK [debug] *******************************************************************
ok: [tm001.stackl.io] => {
"public_ipv4.stdout": "VARIABLE IS NOT DEFINED!: 'public_ipv4' is undefined"
}
TASK [Create docker_pull] ******************************************************
<SNIP>
TASK [Pull containers] *********************************************************
<SNIP>
TASK [(re)-create the agent] ********************************************
changed: <SNIP> ["host_machine=public_ipv4.stdout", <SNIP>
I don't understand why the public_ipv4 variable is not used correctly. I've tried multiple things (including using set_fact or setting a new variable), but to no avail.
What am I doing wrong?
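There is a typo in your playbook: the variable is registered as public_ip4 but referenced as public_ipv4:
register: public_ip4
debug: var=public_ipv4.stdout
Note also that the env value is passed to the container as a literal string (your log shows "host_machine=public_ipv4.stdout"), so once the name is fixed it has to be written as a Jinja2 expression: host_machine: "{{ public_ip4.stdout }}".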
I'm quite new to Docker and Apache Superset and am trying to run a container (using Docker) from a container image. I loaded the .tar file with
docker load --input ./inc_superset.tar
That went as expected. I then tried running a container from this image with
docker run --cidfile ./cid.txt <IMAGE_ID>
This starts my container, but it has an unhealthy status. Upon inspecting the container (with docker inspect) I get a huge JSON document; below is a snippet of the health log (I can post the entire log on request).
"Health": {
    "Status": "unhealthy",
    "FailingStreak": 5,
    "Log": [
        {
            "Start": "2019-01-22T19:59:00.8036984+05:30",
            "End": "2019-01-22T19:59:01.5698797+05:30",
            "ExitCode": 7,
            "Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 8088: Connection refused\n"
        },
        ...
        ...
        {
            "Start": "2019-01-22T20:01:02.589517677+05:30",
            "End": "2019-01-22T20:01:02.794486003+05:30",
            "ExitCode": 7,
            "Output": " % Total % Received % Xferd Average Speed Time Time Time Current\n Dload Upload Total Spent Left Speed\n\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 8088: Connection refused\n"
        }
    ]
}
Am I making a mistake or missing something? Any troubleshooting help would be appreciated.
Thanks
The problem was that the webserver within the Superset container doesn't run by default, at least with the configuration available on apache.org as of 7 September 2019.
Solved it as follows:
#Go into the container
docker-compose exec superset bash
#Start the webserver that is exposed on all interfaces so that we can access it from docker host
superset run -p 8088 --host 0.0.0.0
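The failing health check is just a connection attempt to localhost:8088, so the same condition can be tested without curl. Here is a small sketch using only the Python standard library; the function name port_is_open is my own, and it checks only that something accepts TCP connections, not that Superset answers HTTP correctly.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Run inside the container, this mirrors what the health check probes.
    print(port_is_open("localhost", 8088))
```

If this prints False inside the container, the webserver is not listening yet, which matches the "Connection refused" entries in the health log.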
I was facing the same issue. I ran it via docker-compose using the instructions on https://superset.incubator.apache.org/installation.html but got no response from localhost:8088.
docker inspect reports State.Health.Status = "unhealthy", and the log shows several entries with "curl: (7) Failed to connect to localhost port 8088: Connection refused".
docker ps shows that the container is exposed on 0.0.0.0:8088.
I am trying to run the following command in my Dockerfile to download the pandas package:
RUN curl -x proxy.temp.com:8080 -U myid_9076:pwd123* -L -O https://files.pythonhosted.org/packages/b4/31/bbd2c915aad67c7cb572b7c6ca8f645fcb112064ef6774436d4f65acd5a1/pandas-0.20.3-cp27-cp27m-manylinux1_x86_64.whl
But I get the following error while building the Docker image from that Dockerfile:
---> Running in c8ddcdcdb155
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0curl: (7) Failed to connect to proxy.temp.com port 8080: No route to host
The same command runs and downloads the package on the bare-metal node on which Docker is installed, but while building the Docker image it gives the above error. What is the issue, and how can I download this package, which is required for my image?
I also tried setting environment variables in my Dockerfile before downloading the pandas package, as below,
ENV http_proxy 'http://myid_9076:pwd123*#proxy.temp.com:8080/'
ENV https_proxy 'http://myid_9076:pwd123*#proxy.temp.com:8080/'
but the same error is seen in this case too.
It was a network conflict: my network was conflicting with the Docker network. So I set the "bip" parameter in /etc/docker/daemon.json as shown below:
{
"bip" : "12.12.0.1/24"
}
I got the answer from the linked question "Unable to access local network IP from docker container".
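Whether a given bridge address collides with another network can be checked with Python's ipaddress module. This is only a sketch: the function name bridge_conflicts and the example networks are assumptions for illustration (172.17.0.1/16 is Docker's usual default bridge address, and 172.17.5.0/24 is a made-up corporate subnet, not my real one).

```python
import ipaddress


def bridge_conflicts(bip, other_networks):
    """Return the networks from other_networks that overlap the bridge subnet of `bip`."""
    bridge_net = ipaddress.ip_interface(bip).network
    return [
        net for net in other_networks
        if bridge_net.overlaps(ipaddress.ip_network(net))
    ]


if __name__ == "__main__":
    # Hypothetical networks reachable from the host, e.g. the one the proxy lives on.
    networks = ["172.17.5.0/24", "10.0.0.0/8"]
    print(bridge_conflicts("172.17.0.1/16", networks))  # default bridge
    print(bridge_conflicts("12.12.0.1/24", networks))   # the "bip" value set above
```

An empty list for the chosen "bip" value means the bridge subnet no longer shadows any of the listed networks.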
[!] /usr/bin/curl -f -L -o /var/folders/6f/wy97bnq96f30fl3qlgfmn0tc0000gn/T/d20160826-2616-gnrjr/file.tgz https://www.gstatic.com/cpdc/cc5f7aac07ccdd0a/Firebase-3.5.0.tar.gz --create-dirs --netrc-optional
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (35) SSL peer handshake failed, the server most likely requires a client certificate to connect
This is the error I get when I install Firebase. Can anyone help?