ERROR: traefik, xdbautomationworker, Container is unhealthy - docker

I am trying to create a Sitecore 10 image using Docker on Windows 10 Enterprise locally, but the containers come up unhealthy. Please help me out, as I have already tried the various steps suggested in the forums.
I get the errors below:
Creating network "sitecore-xp0_default" with the default driver
Creating sitecore-xp0_solr_1 ... done
Creating sitecore-xp0_mssql_1 ... done
Creating sitecore-xp0_id_1 ... done
Creating sitecore-xp0_solr-init_1 ... done
Creating sitecore-xp0_xconnect_1 ... done
Creating sitecore-xp0_cm_1 ... done
ERROR: for cortexprocessingworker Container "992574e988e3" is unhealthy.
ERROR: for xdbautomationworker Container "992574e988e3" is unhealthy.
ERROR: for xdbsearchworker Container "992574e988e3" is unhealthy.
ERROR: for traefik Container "933b548fc2f9" is unhealthy.
ERROR: Encountered errors while bringing up the project.
I have tried the following, all in PowerShell:
docker-compose stop
docker-compose down
iisreset /stop, to make sure that the required ports are free
docker-compose up -d
I also stopped and removed the containers and ran docker-compose.exe up --detach several times, but no luck.

Check the .env file and make sure SITECORE_LICENSE has a value.
You may need to run the init.ps1 file.
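A quick way to confirm the first point is to parse the .env file and list any required keys that are empty. This is only a minimal sketch: the sample text and the helper names (read_env, missing_keys) are made up for illustration, and only SITECORE_LICENSE comes from the answer above.

```python
def read_env(text):
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(env, required=("SITECORE_LICENSE",)):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not env.get(key)]


# Illustrative .env content (not taken from the question).
sample = "COMPOSE_PROJECT_NAME=sitecore-xp0\nSITECORE_LICENSE=\n"
print(missing_keys(read_env(sample)))
```

To check a real file, pass its contents instead of the sample, for example read_env(open(".env").read()).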

Based on the logs now provided in the comments above, my suggestion would be to check the collection SQL connection string, the one that points to the shardsmanager database.
You can inspect the SQL container in Docker for Windows to find the IP address of the SQL Server. Connect to that address using SSMS and try logging in with the credentials from the current connection string.
Edit: looking again at the exception, it looks like it can't find the SQL Server. Yet the CM server appears to have no problem finding the same server. So compare the web/master/core connection strings to the collection one. I'm guessing the SQL Server portion will be different?
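The comparison suggested above could be sketched roughly like this. The two connection strings are hypothetical placeholders, not values from the question, and the parser is deliberately naive (it assumes no ';' or '=' inside values).

```python
def parse_connection_string(conn):
    """Split a 'Key=Value;Key=Value' connection string into a dict (keys lowercased)."""
    parts = {}
    for item in conn.split(";"):
        if "=" in item:
            key, _, value = item.partition("=")
            parts[key.strip().lower()] = value.strip()
    return parts


def server_of(conn):
    """Return the server portion, checking the common SQL Server keyword synonyms."""
    parts = parse_connection_string(conn)
    for key in ("data source", "server", "address", "addr", "network address"):
        if key in parts:
            return parts[key]
    return None


# Hypothetical connection strings, for illustration only.
core = "Data Source=mssql;Initial Catalog=Sitecore.Core;User ID=sa;Password=example"
collection = "Data Source=localhost;Initial Catalog=Xdb.Collection;User ID=sa;Password=example"

print(server_of(core), server_of(collection))
print("same server:", server_of(core) == server_of(collection))
```

If the two server values differ, that would be consistent with the CM container reaching SQL Server while the xConnect side cannot.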

Related

foundationdb running docker image macos database unavailable

I am trying to run FoundationDB using a Docker image on macOS, as below.
docker run --init --rm --name=fdb-0 foundationdb/foundationdb:6.2.22
Starting FDB server on 172.17.0.2:4500
This seems to start. But when I connect with fdbcli after logging into the container, I get the following error statuses.
docker exec -it fdb-0 /bin/bash
root@9e8bb6985be5:/var/fdb# fdbcli
Using cluster file `/var/fdb/fdb.cluster'.
The database is unavailable; type `status' for more information.
Welcome to the fdbcli. For help, type `help'.
fdb> status
Using cluster file `/var/fdb/fdb.cluster'.
The coordinator(s) have no record of this database. Either the coordinator
addresses are incorrect, the coordination state on those machines is missing, or
no database has been created.
172.17.0.2:4500 (reachable)
Unable to locate the data distributor worker.
Unable to locate the ratekeeper worker.
I saw this issue: https://forums.foundationdb.org/t/fdbcli-access-external-docker/1069. But I could not get it to run with host networking either. Any help would be appreciated.
Try running fdbcli as fdbcli --exec "configure new single memory ; status". This creates a new database with single redundancy and the memory storage engine.
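If you would rather run that from the host than from a shell inside the container, the command can be assembled as an argument list so the quoting of the --exec string is not an issue. A minimal sketch, assuming the container is named fdb-0 as in the question; the helper name fdbcli_exec_args is made up.

```python
import shlex


def fdbcli_exec_args(container, commands):
    """Build a `docker exec` argv that runs fdbcli commands non-interactively."""
    joined = " ; ".join(commands)
    return ["docker", "exec", container, "fdbcli", "--exec", joined]


args = fdbcli_exec_args("fdb-0", ["configure new single memory", "status"])
# Print a copy-pasteable shell form of the command (shlex.join needs Python 3.8+).
print(shlex.join(args))
```

To actually execute it, pass args to subprocess.run(args, check=True) on a machine where the container is running.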

VSCode: Cannot connect to Docker container using Remote Development extension

I'm trying to set up VSCode so I can work on a project which resides inside a docker container. There's a recently published extension Remote Development which seems to enable just that.
I followed detailed official instructions on creating .devcontainer/devcontainer.json and setting up remote by running Remote-Containers: Reopen Folder in Container, however, even with official/provided containers and settings I get the error:
Setting up container for folder: /home/ilijas/<path_to>/workspace
Error: (HTTP code 500) server error - linux spec user: unable to find user ilijas: no matching entries in passwd file
at /home/ilijas/.vscode-insiders/extensions/ms-vscode-remote.remote-containers-0.53.0/dist/extension.js:1:151013
at /home/ilijas/.vscode-insiders/extensions/ms-vscode-remote.remote-containers-0.53.0/dist/extension.js:1:150976
at m.buildPayload (/home/ilijas/.vscode-insiders/extensions/ms-vscode-remote.remote-containers-0.53.0/dist/extension.js:1:150986)
at IncomingMessage.<anonymous> (/home/ilijas/.vscode-insiders/extensions/ms-vscode-remote.remote-containers-0.53.0/dist/extension.js:1:150486)
at IncomingMessage.emit (events.js:187:15)
at endReadableNT (_stream_readable.js:1090:12)
at process._tickCallback (internal/process/next_tick.js:63:19)
In my first attempts I tried to mount a local workspace to the remote one. However, since I couldn't resolve this user-not-found error, I removed all of the user-related arguments from the Docker settings, just to get one dummy container working. I had no success. I know this is a fresh extension, but still, I hope someone can help.
Essentially, removing all the previous docker containers solved the issue.
Reference GitHub issue:
The container has a label with the folder as the value, so it can be found again. When you close the window, the container is only stopped, not removed, for later use. (You could have some changes inside the container you want to keep. Also: Reusing an existing container is slightly faster.)

Docker image fails to create netlink handle

Can anyone help me make sense of the below error and others like it? I've Googled around, but nothing makes sense for my context. I download my Docker Image, but the container refuses to start. The namespace referenced is not always 26, but could be anything from 20-29. I am launching my Docker container onto an EC2 instance and pulling the image from AWS ECR. The error is persistent no matter if I re-launch the instance completely or restart docker.
docker: Error response from daemon: oci runtime error:
container_linux.go:247: starting container process caused
"process_linux.go:334: running prestart hook 0 caused \"error running
hook: exit status 1, stdout: , stderr: time=\\\"2017-05-
11T21:00:18Z\\\" level=fatal msg=\\\"failed to create a netlink handle:
failed to set into network namespace 26 while creating netlink socket:
invalid argument\\\" \\n\"".
Update from my Github issue: https://github.com/moby/moby/issues/33656
It seems that the DeepSecurity agent (ds_agent) running on a box with Docker can consistently cause this issue. A number of other users reported the same problem, which prompted me to investigate. I had previously installed ds_agent on these boxes and later replaced it with other software as a business decision, which is when the problem went away. If you are having this problem, it may be worthwhile to check whether you are running the ds_agent process, or a similar service that could be causing a conflict, using 'htop' as the user in the issue above did.
Did you try running it with the --privileged option?
If it still doesn't run, try adding --security-opt seccomp=unconfined and either --security-opt apparmor=unconfined or --security-opt label=disable, depending on whether you're running Ubuntu or a distribution with SELinux enabled, respectively.
If it works, try replacing the --privileged option with --cap-add=NET_ADMIN, as running containers in privileged mode is discouraged for security reasons.
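The escalation steps in this answer can be written out as argument lists, which makes it easier to try them one at a time. This is a sketch only: the image name my-image and the helper docker_run_args are hypothetical.

```python
def docker_run_args(image, privileged=False, unconfined=False,
                    selinux=False, net_admin=False):
    """Build a `docker run` argv with the requested relaxations applied."""
    args = ["docker", "run"]
    if privileged:
        args.append("--privileged")
    if unconfined:
        # The Docker CLI flag is --security-opt (singular).
        args += ["--security-opt", "seccomp=unconfined"]
        # label=disable turns off SELinux labeling; apparmor=unconfined is the
        # AppArmor counterpart (e.g. on Ubuntu).
        lsm = "label=disable" if selinux else "apparmor=unconfined"
        args += ["--security-opt", lsm]
    if net_admin:
        args.append("--cap-add=NET_ADMIN")
    args.append(image)
    return args


# Step 1: privileged. Step 2: privileged plus unconfined profiles.
# Step 3: the narrower replacement once the cause is known.
print(" ".join(docker_run_args("my-image", privileged=True)))
print(" ".join(docker_run_args("my-image", privileged=True, unconfined=True)))
print(" ".join(docker_run_args("my-image", net_admin=True)))
```

The last form is the one to keep if it works, since it grants a single capability instead of full privileged mode.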

Can't pull image from private docker registry

I'm trying to get a private repo running on my EC2 instance so that my other Docker hosts, created by docker-machine, can pull from it. I've disabled SSL and, to compensate, put up a firewall that allows my test server (the one I'm trying to pull on) to connect to my main EC2 instance (the private repo). So far I can push to the private repo on the main EC2 instance where it's hosted (I was getting an EOF error before disabling SSL), but I get the following error when I run this on my test server:
docker pull ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:5000/scoredeploy
this is the error it spits out:
Error response from daemon: Get https://ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:5000/v1/_ping: EOF
Googling this error yields results from people having similar issues, but without any fixes.
Anybody have any idea of what's going on here?
You might need to set the --insecure-registry <registry-ip>:5000 flag on the docker daemon's startup command on your non-docker-registry machine. In your case: --insecure-registry ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:5000
If you want to use your already-running docker machine, this should help you out setting the flag: https://docs.docker.com/registry/insecure/#/deploying-a-plain-http-registry
If you're using boot2docker, the file location and format is slightly different. Give this a shot if this is the case: http://www.developmentalmadness.com/2016/03/09/docker-configure-insecure-registry-in-boot2docker/
I've had issues with my docker machines not saving this setting on reboots. If you run into that issue, I'd recommend you make a new machine including the flag --engine-insecure-registry <registry-ip>:5000 in the docker-machine create command.
Best of luck!
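On hosts where the daemon is configured through a daemon.json file (for example /etc/docker/daemon.json on Linux) rather than startup flags, the same setting lives under the "insecure-registries" key. The sketch below only edits JSON text in memory; whether your host uses daemon.json at all is an assumption, and the helper name add_insecure_registry is made up.

```python
import json


def add_insecure_registry(config_text, registry):
    """Return daemon.json text with `registry` listed under insecure-registries."""
    config = json.loads(config_text) if config_text.strip() else {}
    registries = config.setdefault("insecure-registries", [])
    if registry not in registries:
        registries.append(registry)
    return json.dumps(config, indent=2)


# Start from an empty config; the host name is the placeholder from the question.
updated = add_insecure_registry(
    "{}", "ec2-xx-xx-xxx-xxx.us-west-2.compute.amazonaws.com:5000"
)
print(updated)
```

The Docker daemon has to be restarted after the file changes before the pull will behave differently.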

Bamboo: docker task "An error occurred trying to connect: Post http://127.0.0.1:2375/v1.22/build"

I've been trying to set up a Continuous Delivery server with Bamboo. I've got everything going nicely up to the deployment. Bamboo builds and tests my C# project as it should.
Then I created a "deployment plan", installed docker and added the server capability to use docker, set up the docker tasks to build and deploy to dockerHub.
When I try to deploy, I get this error:
An error occurred trying to connect: Post http: //127.0.0.1:2375/v1.22/build ?buildargs=%7B%7D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&forcerm=1&memory=0&memswap=0&rm=1&shmsize=0&t=srgskiri%2Fresttest&ulimits=null : dial tcp 127.0.0.1:2375: connectex: No connection could be made because the target machine actively refused it.
01-mrt-2016 13:19:03 Failing task since return code of [C:\Program Files\Docker Toolbox\docker.exe build --force-rm=true --tag="srgskiri/resttest" C:\Users\Srg\bamboo-home\xml-data\build-dir\2129921-2195457] was 1 while expected 0
Now I think that it means that the bamboo 'object' that is calling the command to build, can't communicate with my docker engine/container.
First I thought it was because I didn't have docker-machine running, so I started it and ran the deploy, and still got this error.
This is what I have:
Server capability: path to docker
Docker task: building into an Image
Is there something I'm missing?
PS: Docker works perfectly on its own, both with docker UI or docker terminal. It's bamboo that can't interact with docker.
UPDATE: I didn't mention this, but I ran Bamboo in a console, not as a service. Maybe that's the problem: Bamboo can't access Docker from the console. I can't try this myself right now because I can't install Bamboo as a service. It keeps hanging when I try to start it as a service.
Will ask the bamboo support about it.
I figured it out... If you work on Windows, Bamboo has to start the docker-machine itself.
So you have to add Command tasks to:
1) create a docker-machine (if you don't have one yet)
2) start it (if you start Docker in Bamboo, you can't access it in Windows, and vice versa)
Only then are you able to use Docker in Bamboo on Windows.
I feel silly now
-EDIT- To use the Docker tasks after starting the docker-machine, you must also specify the Environment variables for the tasks (like DOCKER_TLS_VERIFY=1)
Otherwise you'll get the error mentioned above.
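The variables in question are the ones that docker-machine env prints for a machine. A small parser like the sketch below can turn that output into key/value pairs to paste into the Bamboo task's environment settings. The sample output and its values (IP address, paths, machine name) are illustrative, not taken from the question, and the helper name parse_machine_env is made up.

```python
import re


def parse_machine_env(output):
    """Extract NAME=value pairs from `docker-machine env` output (bash or cmd style)."""
    env = {}
    pattern = re.compile(r"^(?:export|SET)\s+(\w+)=(.*)$", re.IGNORECASE)
    for line in output.splitlines():
        match = pattern.match(line.strip())
        if match:
            # Drop the surrounding double quotes that the bash-style output uses.
            env[match.group(1)] = match.group(2).strip('"')
    return env


# Illustrative bash-style output; real values depend on your machine.
sample = '''export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.100:2376"
export DOCKER_CERT_PATH="/home/user/.docker/machine/machines/default"
export DOCKER_MACHINE_NAME="default"
# Run this command to configure your shell:
# eval $(docker-machine env default)
'''

for name, value in parse_machine_env(sample).items():
    print(f"{name}={value}")
```

Without DOCKER_HOST set, the client presumably falls back to a local address, which would match the 127.0.0.1:2375 connection refused in the error above.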
