I have created a cluster on DigitalOcean (DC/OS 1.9) using Terraform, following these instructions here.
Everything seems to have installed correctly. To pull from a private Docker repo, I need to add a compressed .docker archive to /home/core/ and fetch it during deployment by including it in my JSON:
"fetch":[
{
"uri":"file:///home/core/docker.tar.gz"
}
]
Based on these instructions: https://docs.mesosphere.com/1.9/deploying-services/momee/docker-creds-agent/
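For reference, I created the archive roughly the way the linked docs describe (a sketch; run on each agent node as the user that owns the Docker credentials):
# archive the ~/.docker directory (contains the registry credentials)
cd /home/core
tar czf docker.tar.gz .docker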
And I'm still getting errors:
Failed to launch container:
Failed to fetch all URIs for container 'abc123-xxxxx' with exit status: 256
Upon looking at the logs of one of the agents:
Starting container '123-abc-xxx' for task 'my-docker-image-service.321-dfg-xxx' (and executor 'my-docker-image-service.397d20cb-1
Begin fetcher log (stderr in sandbox) for container 123-abc-xxx from running command: /opt/mesosphere/packages/mesos--aaedd03eee0d57f5c0d49c
Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/94af100c-4dc2-416d-b6d7-eec0d947a1a6-S11","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"file:\/\/\/home\/core\/docker.tar.gz"}}],"sandbox_directory":"\/var\/lib\/mesos\/slave\/slaves\/94af100c-4dc2-416d-b6d7-eec0d947a1a6-S11\/frameworks\/94af100c-4dc2-416...
Fetching URI 'file:///home/core/docker.tar.gz'
Fetching directly into the sandbox directory
Fetching URI 'file:///home/core/docker.tar.gz'
Copied resource '/home/core/docker.tar.gz' to '/var/lib/mesos/slave/slaves/94af100c-4dc2-416d-b6d7-eec0d947a1a6-S11/frameworks/94af100c-4dc2-416d-b6d7-eec0d947a1a6-0
Failed to obtain the IP address for 'digitalocean-dcos-agent-20'; the DNS service may not be able to resolve it: Name or service not known
End fetcher log for container 123-abc-xxx
Failed to run mesos-fetcher: Failed to fetch all URIs for container '123-abc-xxx' with exit status: 256
You are missing the extract instruction:
"fetch":[
{
"uri":"file:///home/core/docker.tar.gz",
"extract":true
}
]
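For context, a minimal Marathon app definition using this fetch block might look like the sketch below (the app id and image name are hypothetical):
{
  "id": "/my-docker-image-service",
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "myprivaterepo.example.com/my-image:latest"
    }
  },
  "fetch": [
    {
      "uri": "file:///home/core/docker.tar.gz",
      "extract": true
    }
  ]
}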
I am trying to build my own Linux image using Buildroot in Docker with GitLab CI. Everything goes fine until the build starts downloading the linux repository; then I get an error like the one below.
>>> linux d0f5c460aac292d2942b23dd6199fe23021212ad Downloading
Doing full clone
Cloning into bare repository 'linux-d0f5c460aac292d2942b23dd6199fe23021212ad'...
Looking up git.ti.com ... done.
Connecting to git.ti.com (port 9418) ... 198.47.28.207 done.
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
--2023-01-05 11:53:37-- http://sources.buildroot.net/linux-d0f5c460aac292d2942b23dd6199fe23021212ad.tar.gz
Resolving sources.buildroot.net (sources.buildroot.net)... 104.26.1.37, 172.67.72.56, 104.26.0.37, ...
Connecting to sources.buildroot.net (sources.buildroot.net)|104.26.1.37|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2023-01-05 11:53:37 ERROR 404: Not Found.
package/pkg-generic.mk:73: recipe for target '/builds/XXX/XXX/output/build/linux-d0f5c460aac292d2942b23dd6199fe23021212ad/.stamp_downloaded' failed
make: *** [/builds/XXX/XXX/output/build/linux-d0f5c460aac292d2942b23dd6199fe23021212ad/.stamp_downloaded] Error 1
Cleaning up project directory and file based variables
ERROR: Job failed: exit code 1
When the image is built without Docker, there is no problem downloading this repository, and when I built this image in Docker a while ago there was no problem downloading it either. Could it be a problem of a poorer network connection? This package is bigger than the others.
You are using a custom git repo (git.ti.com), which is not working and which Buildroot doesn't know anything about.
For this reason, you cannot expect a mirror copy to be available on sources.buildroot.net: Buildroot only has copies of the packages distributed within it.
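If git.ti.com stays unreachable from your CI network, one workaround (a sketch, assuming you host a mirror yourself) is Buildroot's primary-site option, which makes the download step try your mirror before the package's upstream URL:
# in the Buildroot .config (Build options -> Mirrors and Download locations);
# the mirror URL is hypothetical
BR2_PRIMARY_SITE="http://mirror.example.com/buildroot-sources"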
Locally, I am able to successfully authenticate and pull modules from my private Nexus registry (via an .npmrc file).
However, on Jenkins I get:
error An unexpected error occurred: "https://myprivaterepo.com/myprivatemodule.tgz: Request failed \"401 Unauthorized\"".
When I run npm whoami on Jenkins it returns a valid user. npm config ls prints valid configuration as well.
The problem started when we changed the myprivaterepo URL (we migrated it). Is there something I don't know (i.e., do I have to log out and log in again, or is there some cache in Jenkins)?
Thanks in advance!
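For reference, my working local ~/.npmrc looks roughly like this (hostnames and paths anonymized). Note that npm keys the auth token to the registry host and path, so after a migration the //host/:_authToken line has to reference the new URL, not the old one:
registry=https://myprivaterepo.com/repository/npm-group/
//myprivaterepo.com/repository/npm-group/:_authToken=${NPM_TOKEN}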
I am trying to encrypt Docker images using this tutorial.
I have images stored in Azure Container Registry and I want to encrypt them.
Since images from Azure CR are not supported in the ctr-enc environment, I am pulling the image from Azure CR, tagging it into a local registry (sudo docker tag "azure-cr-image-name" localhost:5000/test:0.1), pushing it (sudo docker push localhost:5000/test:0.1), and then pulling it into ctr-enc from the local registry.
All of these steps work fine. The image runs successfully, so I exported it to a tar file.
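Roughly, the flow looks like this (image names are examples; the key pair follows the imgcrypt docs):
# re-tag the Azure CR image for the local registry and push it
sudo docker pull myregistry.azurecr.io/test:0.1
sudo docker tag myregistry.azurecr.io/test:0.1 localhost:5000/test:0.1
sudo docker push localhost:5000/test:0.1
# pull into ctr-enc, encrypt the layers, and export to a tar file
sudo ctr-enc images pull --plain-http=true localhost:5000/test:0.1
sudo ctr-enc images encrypt --recipient jwe:mypubkey.pem localhost:5000/test:0.1 localhost:5000/test:0.1-enc
sudo ctr-enc images export test-enc.tar localhost:5000/test:0.1-enc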
The error comes when I try to import the tar file on any other device. The error is as follows:
unpacking localhost:5000/test:0.1 (sha256:7b60c337c1d319c682369508673f8da65ce027cd95917d80abec71c753f90341)...INFO[0119] apply failure, attempting cleanup error="failed to extract layer sha256:0447c1aa276497ad5424dd1f8597b7f667126d868489277bab7aea547a4aa982: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount138280154: archive/tar: invalid tar header: unknown" key="extract-395814385-sMwu sha256:0447c1aa276497ad5424dd1f8597b7f667126d868489277bab7aea547a4aa982"
ctr: failed to extract layer sha256:0447c1aa276497ad5424dd1f8597b7f667126d868489277bab7aea547a4aa982: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount138280154: archive/tar: invalid tar header: unknown
All I want to know is whether this flow will work and I am missing something, or whether the entire flow is wrong. I don't have much experience with this, so any help will be appreciated.
I resolved the error with the help of the following steps.
I re-created the flow, using this link to set up my local registry.
I pulled the image from ACR, encrypted it, pushed it to the local registry, and then started pulling the encrypted image from the local registry onto the other devices.
I got the following error on the device:
unpacking linux/arm64/v8 sha256:cfd940f7d5d6a6817e8d4f4a811a27263fa11dc00507ebf638ff24be703e5320...
INFO[0293] apply failure, attempting cleanup error="failed to extract layer sha256:0447c1aa276497ad5424dd1f8597b7f667126d868489277bab7aea547a4aa982: call to DecryptLayer failed: missing private key needed for decryption\n: unknown" key="extract-20510027-zCdy sha256:0447c1aa276497ad5424dd1f8597b7f667126d868489277bab7aea547a4aa982"
ctr: failed to extract layer sha256:0447c1aa276497ad5424dd1f8597b7f667126d868489277bab7aea547a4aa982: call to DecryptLayer failed: missing private key needed for decryption
: unknown
After providing the private key in the pull command itself, the image got downloaded and it ran without any errors.
Pull command example: sudo ctr-enc images pull --plain-http=true --key mykey.pem registry.local.com:5000/encrypted-image/test:0.1
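For context, the key pair referenced by --key mykey.pem (and by --recipient jwe:mypubkey.pem at encryption time) can be generated with OpenSSL, as in the imgcrypt docs:
# private key (kept on the devices that pull and decrypt the image)
openssl genrsa -out mykey.pem
# matching public key (used when encrypting the image)
openssl rsa -in mykey.pem -pubout -out mypubkey.pem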
Key Points:
1] Add the following line to the hosts file on the device where you set up the registry, as well as on every device where you want to run the encrypted image. Replace the IP with your actual one.
Ex:- 192.168.0.1 registry.local.com
2] Add the following line to the /etc/docker/daemon.json file on the devices where you plan to run the encrypted image:
"insecure-registries":["registry.local.com:5000"]
I'm using spark-submit in "cluster" mode with a Python script against a Spark cluster running on Mesos, and a custom Docker image for the executor set in spark.mesos.executor.docker.image.
My script file is already baked into the Docker image (let's say at path /app/script.py), so I don't want to use spark-submit's feature to download the script via HTTP before executing.
Per https://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management this seems possible by specifying the application script as a local: URL, e.g. spark-submit [...options...] local:/app/script.py. However, this didn't work; I'm seeing errors like the following on Mesos (stderr of the Spark driver task, scheduled by the spark-dispatcher framework):
I0727 20:31:50.164263 9207 fetcher.cpp:533] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/root","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"extract":true,"value":"\/app\/script.py"}}],"sandbox_directory":"\/data\/mesos\/slaves\/GUID\/frameworks\/GUID\/executors\/driver-TIMESTAMP\/runs\/GUID","user":"root"}
I0727 20:31:50.170289 9207 fetcher.cpp:444] Fetching URI '/app/script.py'
I0727 20:31:50.170361 9207 fetcher.cpp:285] Fetching directly into the sandbox directory
I0727 20:31:50.170413 9207 fetcher.cpp:222] Fetching URI '/app/script.py'
cp: cannot stat ‘/app/script.py’: No such file or directory
E0727 20:31:50.174051 9207 fetcher.cpp:579] EXIT with status 1: Failed to fetch '/app/script.py': Failed to copy '/app/script.py': exited with status 1
After browsing through https://spark.apache.org/docs/latest/running-on-mesos.html, my guess is that the local: path is interpreted by a "MesosClusterDispatcher", which is a daemon that spins up a container for the Spark driver process (using my custom spark executor Docker image). Since this dispatcher doesn't itself run in the custom Docker image/container, it can't find the file.
Is there any other way to tell spark-submit to not download the application script and just use the script already present in the Docker image?
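For reference, the submit command looks roughly like this (the dispatcher URL and image name are placeholders):
spark-submit \
  --master mesos://spark-dispatcher.example.com:7077 \
  --deploy-mode cluster \
  --conf spark.mesos.executor.docker.image=myrepo/my-spark-image:latest \
  local:/app/script.py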
I am following this tutorial at https://gettech1.wordpress.com/2016/05/26/setting-up-kubernetes-cluster-on-ubuntu-14-04-lts/ to set up a Kubernetes multi-node cluster with 2 minions and 1 master node on remote Ubuntu machines. Following all the steps goes OK, but when I try to run the ./kube-up.sh bash file, it returns the following errors:
ubuntu@ip-XXX-YYY-ZZZ-AAA:~/kubernetes/cluster
$ ./kube-up.sh
Starting cluster in us-central1-b using provider gce
... calling verify-prereqs
Can't find gcloud in PATH, please fix and retry. The Google Cloud SDK can be downloaded from https://cloud.google.com/sdk/.
Edit: I have fixed the above issue after exporting different environment variables, like
$ export KUBE_VERSION=2.2.1
$ export FLANNEL_VERSION=0.5.5
$ export ETCD_VERSION=1.1.8
but after that it generates this issue:
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
The command you should be executing is KUBERNETES_PROVIDER=ubuntu ./kube-up.sh
Without setting that environment variable, kube-up.sh tries to deploy VMs on Google Compute Engine, and to do so it needs the gcloud binary, which you don't have installed.
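In other words (assuming the tutorial's checkout under ~/kubernetes):
cd ~/kubernetes/cluster
KUBERNETES_PROVIDER=ubuntu ./kube-up.sh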