I would like to copy a zipped file from the host machine into the container that runs my Go code; the Go code runs in a container with docker.sock mounted. The path parameter refers to a location on the host machine. On the host machine the equivalent command line looks like this:
docker cp hostFile.zip myContainer:/tmp/
The documentation for the docker-client CopyToContainer looks like this:
func (cli *Client) CopyToContainer(ctx context.Context, containerID, dstPath string, content io.Reader, options types.CopyToContainerOptions) error
How do I create the content io.Reader argument?
cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
if err != nil {
	panic(err)
}

// TODO
// reader := io.Reader()
// reader := file.NewReader()
// tar.NewReader()

cli.CopyToContainer(context.Background(), containerID, dst, reader, types.CopyToContainerOptions{
	AllowOverwriteDirWithFile: true,
	CopyUIDGID:                true,
})
A huge variety of things implement io.Reader. In this case the normal way would be to open a file with os.Open; the resulting *os.File pointer is an io.Reader.
As you note in the comments, though, this only helps you read and write things on your "local" filesystem. Having access to the host's Docker socket is super powerful, but it doesn't directly give you read and write access to the host filesystem. (As @mkopriva suggests in a comment, launching your container with a docker run -v /host/path:/container/path bind mount is much simpler and avoids the massive security problem I'm about to discuss.)
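If you do go the bind-mount route, note that CopyToContainer expects its content argument to be a tar archive stream rather than a raw file, so the usual approach is to wrap the file in an in-memory tar with archive/tar. Here is a minimal sketch; the copyFileToContainer helper and the /data/hostFile.zip path are my own illustrative names, and it assumes the host file has been bind-mounted into the container running this code:

package main

import (
	"archive/tar"
	"bytes"
	"context"
	"io"
	"os"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

// copyFileToContainer tars up a single local file and streams it to dstDir
// inside the target container; srcPath must be visible to this process,
// e.g. via a bind mount.
func copyFileToContainer(ctx context.Context, cli *client.Client, containerID, srcPath, dstDir string) error {
	f, err := os.Open(srcPath)
	if err != nil {
		return err
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		return err
	}

	// CopyToContainer wants a tar stream, so wrap the single file in one.
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr, err := tar.FileInfoHeader(fi, "")
	if err != nil {
		return err
	}
	hdr.Name = fi.Name() // path of the file inside dstDir
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	if _, err := io.Copy(tw, f); err != nil {
		return err
	}
	if err := tw.Close(); err != nil {
		return err
	}

	return cli.CopyToContainer(ctx, containerID, dstDir, &buf, types.CopyToContainerOptions{
		AllowOverwriteDirWithFile: true,
		CopyUIDGID:                true,
	})
}

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}
	// "myContainer" and "/data/hostFile.zip" are placeholders for this sketch.
	if err := copyFileToContainer(context.Background(), cli, "myContainer", "/data/hostFile.zip", "/tmp"); err != nil {
		panic(err)
	}
}

Buffering the tar in memory is fine for a single zip file; for very large files an io.Pipe with a goroutine writing the tar would avoid holding the whole archive in memory.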
Without such a bind mount, what you need to do instead is launch a second container that bind-mounts the content you need, and read the file out of that container. It sounds like you're trying to write it into your own local filesystem, which simplifies things. From a docker exec shell prompt inside the container you might do something like:
docker run --rm -v /:/host busybox cat /host/some/path/hostFile.zip \
> /tmp/hostFile.zip
In Go it's more involved but still very doable (untested, imports omitted)
ctx := context.Background()
cid, err := client.ContainerCreate(
	ctx,
	&container.Config{
		Image: "docker.io/library/busybox:latest",
		Cmd:   strslice.StrSlice{"cat", "/host/etc/shadow"},
	},
	&container.HostConfig{
		Mounts: []mount.Mount{
			{
				Type:   mount.TypeBind,
				Source: "/",
				Target: "/host",
			},
		},
	},
	nil,
	nil,
	"",
)
if err != nil {
	return err
}
defer client.ContainerRemove(ctx, cid.ID, types.ContainerRemoveOptions{})

// The container must actually be started, or `cat` never runs and no output appears.
if err := client.ContainerStart(ctx, cid.ID, types.ContainerStartOptions{}); err != nil {
	return err
}

// Follow the log stream until the container exits.
rawLogs, err := client.ContainerLogs(
	ctx,
	cid.ID,
	types.ContainerLogsOptions{ShowStdout: true, Follow: true},
)
if err != nil {
	return err
}
defer rawLogs.Close()

go func() {
	of, err := os.Create("/tmp/host-shadow")
	if err != nil {
		panic(err)
	}
	defer of.Close()
	// The log stream multiplexes stdout and stderr; demultiplex it and keep only stdout.
	_ = stdcopy.StdCopy(of, io.Discard, rawLogs)
}()

done, cerr := client.ContainerWait(ctx, cid.ID, container.WaitConditionNotRunning)
for {
	select {
	case err := <-cerr:
		return err
	case waited := <-done:
		if waited.Error != nil {
			return errors.New(waited.Error.Message)
		} else if waited.StatusCode != 0 {
			return fmt.Errorf("cat container exited with status code %v", waited.StatusCode)
		} else {
			return nil
		}
	}
}
As I hinted earlier and showed in the code, this approach bypasses all controls on the host; I've decided to read back the host's /etc/shadow encrypted password file because I can, and nothing would stop me from writing it back with my choice of root password using basically the same approach. Owners, permissions, and anything else don't matter: the Docker daemon runs as root and most containers run as root by default (and you can explicitly request it if not).
I have a multi-module SBT project from which I'm building a Docker image. One module depends on all the other modules, and that is the module I'm trying to build the Docker image for. Here is a snippet from my build.sbt:
lazy val impute = (project in file(MODULE_NAME_IMPUTE)).dependsOn(core % "compile->compile;test->test", config)
  .settings(
    commonSettings,
    enablingCoverageSettings,
    name := MODULE_NAME_IMPUTE,
    description := "Impute the training data"
  )
  .enablePlugins(JavaAppPackaging, DockerPlugin)

lazy val split = (project in file(MODULE_NAME_SPLIT)).dependsOn(core % "compile->compile;test->test", config)
  .settings(
    commonSettings,
    enablingCoverageSettings,
    dockerSettings("split"),
    name := MODULE_NAME_SPLIT,
    description := "Split the dataset into train and test"
  )
  .enablePlugins(JavaAppPackaging, DockerPlugin)

lazy val run = (project in file(MODULE_NAME_RUN)).dependsOn(core % "compile->compile;test->test", config, cleanse, encode, feature, impute, split)
  .settings(
    commonSettings,
    dockerSettings("run"),
    enablingCoverageSettings,
    name := MODULE_NAME_RUN,
    description := "To run the whole setup as a pipeline locally"
  )
  .enablePlugins(JavaAppPackaging, DockerPlugin)
As you can see, the run module depends on all the other modules, and it is the run module that I'm building the Docker image for, using the following command:
sbt run/docker:publishLocal
This works fine and the Docker image is built, but when I inspect the image, in particular its ENTRYPOINT, I see the following:
"Entrypoint": [
"/opt/docker/bin/run"
],
But instead, I would have expected to see something like this:
"Entrypoint": [
"java",
"-cp",
"com.mypackage.housingml.run.Main"
],
Is there anything else that I'm missing? Here is my dockerSettings() function from my build.sbt:
def dockerSettings(name: String) = {
  Seq(
    // Always use latest tag
    dockerUpdateLatest := true,
    maintainer := s"$projectMaintainer",
    // https://hub.docker.com/r/adoptopenjdk/openjdk13
    // Remember to use AshScriptPlugin if you are using an alpine based image
    dockerBaseImage := "adoptopenjdk/openjdk13:alpine-slim",
    // If you want to publish to a remote docker repository, uncomment the following:
    //dockerRepository := Some("remote-docker-hostname"),
    Docker / packageName := s"joesan/$projectName-$name",
    // If we're running in a docker container, then export logging volume.
    Docker / defaultLinuxLogsLocation := "/opt/docker/logs",
    dockerExposedVolumes := Seq((Docker / defaultLinuxLogsLocation).value),
    dockerEnvVars := Map(
      "LOG_DIR" -> (Docker / defaultLinuxLogsLocation).value
    )
  )
}
sbt-native-packager generates an executable bash script to run your code, and the Docker ENTRYPOINT is by default set to that script, which is what you are seeing.
This is customisable; see the docs for reference.
Also, the bash script itself is customisable - docs.
I am using the Go exec package to execute a docker pull debian command:
package main

import (
	"bufio"
	"fmt"
	"os/exec"
)

func main() {
	cmd := exec.Command("docker", "pull", "debian")
	stdout, _ := cmd.StdoutPipe()
	cmd.Start()

	scanner := bufio.NewScanner(stdout)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	cmd.Wait()
}
But it never shows me the progress bar; it only shows an update when a step is fully complete. For larger images (over a GB) it is hard to see whether any progress is being made. This is what it shows:
e9afc4f90ab0: Pulling fs layer
e9afc4f90ab0: Verifying Checksum
e9afc4f90ab0: Download complete
e9afc4f90ab0: Pull complete
Is it possible to get output similar to what I see when I run docker pull debian in the terminal, or something I can use to show progress?
e9afc4f90ab0: Downloading [==========> ] 10.73MB/50.39MB
As David mentioned, you would be better off using the official Docker Engine SDK to interact with Docker.
Initialize the docker client
cli, _ := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
Pull the image
reader, _ := cli.ImagePull(context.Background(), "hello-world", types.ImagePullOptions{})
Parse the json stream
id, isTerm := term.GetFdInfo(os.Stdout)
_ = jsonmessage.DisplayJSONMessagesStream(reader, os.Stdout, id, isTerm, nil)
You will get the same output as the docker CLI provides when you run docker pull hello-world.
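For reference, the pieces above might fit together into a complete program roughly like this. It is a sketch that assumes the term helper comes from github.com/moby/term and that ImagePullOptions still lives in api/types, as in the snippets above; older codebases import github.com/docker/docker/pkg/term instead:

package main

import (
	"context"
	"os"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
	"github.com/docker/docker/pkg/jsonmessage"
	"github.com/moby/term"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}

	// ImagePull returns a stream of JSON progress messages, not plain text.
	reader, err := cli.ImagePull(context.Background(), "hello-world", types.ImagePullOptions{})
	if err != nil {
		panic(err)
	}
	defer reader.Close()

	// Render the JSON stream the same way the docker CLI does.
	fd, isTerm := term.GetFdInfo(os.Stdout)
	if err := jsonmessage.DisplayJSONMessagesStream(reader, os.Stdout, fd, isTerm, nil); err != nil {
		panic(err)
	}
}

When stdout is a real terminal this renders the per-layer progress bars; when it is not (for example, when piped to a file), it falls back to plain status lines, just like the CLI.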
I have the following Dockerfile
ARG DEV_USER=dev
# Other stuff ...
USER $DEV_USER
# Other stuff ...
WORKDIR /home/$DEV_USER/Projects
When I start a container and execute ls /home/dev, the Projects folder is owned by root. Does WORKDIR ignore the fact that USER was invoked earlier?
I couldn't find detailed documentation for this, but I was curious, so I had a look at the Docker source code. I think we can get the answer from the source:
moby/builder/dockerfile/dispatcher.go (Line 299):
// Set the working directory for future RUN/CMD/etc statements.
//
func dispatchWorkdir(d dispatchRequest, c *instructions.WorkdirCommand) error {
	......
	if err := d.builder.docker.ContainerCreateWorkdir(containerID); err != nil {
		return err
	}
	return d.builder.commitContainer(d.state, containerID, runConfigWithCommentCmd)
}
Above, we can see it calls ContainerCreateWorkdir; here is that code:
moby/daemon/workdir.go:
func (daemon *Daemon) ContainerCreateWorkdir(cID string) error {
	......
	return container.SetupWorkingDirectory(daemon.idMapping.RootPair())
}
Above, we can see it calls SetupWorkingDirectory; here is that code:
moby/container/container.go (Line 259):
func (container *Container) SetupWorkingDirectory(rootIdentity idtools.Identity) error {
	......
	if err := idtools.MkdirAllAndChownNew(pth, 0755, rootIdentity); err != nil {
		pthInfo, err2 := os.Stat(pth)
		if err2 == nil && pthInfo != nil && !pthInfo.IsDir() {
			return errors.Errorf("Cannot mkdir: %s is not a directory", container.Config.WorkingDir)
		}
		return err
	}
	return nil
}
Above, we can see it calls MkdirAllAndChownNew(pth, 0755, rootIdentity); here is that code:
moby/pkg/idtools/idtools.go (Line 54):
// MkdirAllAndChownNew creates a directory (include any along the path) and then modifies
// ownership ONLY of newly created directories to the requested uid/gid. If the
// directories along the path exist, no change of ownership will be performed
func MkdirAllAndChownNew(path string, mode os.FileMode, owner Identity) error {
	return mkdirAs(path, mode, owner, true, false)
}
The code above creates the folder in the intermediate build container and also changes the ownership of the folder to rootIdentity.
Finally, what is rootIdentity here?
It is passed in as daemon.idMapping.RootPair(); here is its declaration:
moby/pkg/idtools/idtools.go (Line 151):
// RootPair returns a uid and gid pair for the root user. The error is ignored
// because a root user always exists, and the defaults are correct when the uid
// and gid maps are empty.
func (i *IdentityMapping) RootPair() Identity {
	uid, gid, _ := GetRootUIDGID(i.uids, i.gids)
	return Identity{UID: uid, GID: gid}
}
See the function desc:
RootPair returns a uid and gid pair for the root user
You could continue digging into what GetRootUIDGID does, but I think the function description is enough: the ownership of WORKDIR will ultimately be set to root.
And, in addition, what does USER do?
moby/builder/dockerfile/dispatcher.go (Line 543):
// USER foo
//
// Set the user to 'foo' for future commands and when running the
// ENTRYPOINT/CMD at container run time.
//
func dispatchUser(d dispatchRequest, c *instructions.UserCommand) error {
	d.state.runConfig.User = c.User
	return d.builder.commit(d.state, fmt.Sprintf("USER %v", c.User))
}
Above, it just sets the user on the run config and commits it for subsequent commands; it does nothing related to the WORKDIR setup.
And if you want to change the ownership, I guess you will have to do it yourself with chown, either in a RUN instruction or in your ENTRYPOINT/CMD.
Docker runs as the root user in its environment by default, so instead we tell it to use another user. Before running anything we can add a user to the image and then specify that user to perform actions in the Dockerfile.
This example adds a www-data user:
ARG USER_ID=1000
ARG GROUP_ID=1000
RUN groupadd -g ${GROUP_ID} www-data &&\
    useradd -l -u ${USER_ID} -g www-data www-data
I had some issues with /dev/random and I replaced it with /dev/urandom on some of my servers:
lrwxrwxrwx 1 root root 12 Jul 22 21:04 /dev/random -> /dev/urandom
I have since moved some of my infrastructure to Docker. I thought it would be sufficient to replace /dev/random on my host machine, but when starting the service I quickly noticed that some RNG operations blocked because the container still used the original implementation of /dev/random.
I wrote a little test program to prove this behavior:
package main

import (
	"fmt"
	"os"
)

func main() {
	f, err := os.Open("/dev/random")
	if err != nil {
		panic(err)
	}
	chunk := make([]byte, 1024)
	index := 0
	for {
		index = index + 1
		_, err := f.Read(chunk)
		if err != nil {
			panic(err)
		}
		fmt.Println("iteration ", index)
	}
}
Executing this program on the host machine (which has the symlink) works as expected: it does not block and runs until I shut it down.
When running this in a container, it will block after the first iteration (at least on my machine).
I can obviously fix this problem by mounting my random file into the container:
docker run -it --rm -v /dev/random:/dev/random bla
But this is not the point of this question. I want to know the following:
How does Docker set up the devices listed in /dev?
Why does it not just use (some of) the device files of the host machine?
Docker never uses any of the host system's filesystem unless you explicitly instruct it to. The device files in /dev are whatever (if anything) is baked into the image.
Also note that the conventional Unix device model is that a device "file" is either a character- or block-special device, and a pair of major and minor device numbers, and any actual handling is done in the kernel. Relatedly, the host's kernel is shared across all Docker containers. So if you fix your host's broken implementation of /dev/random then your containers will inherit this fix too; but merely changing what's in the host's /dev will have no effect.
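To make the device model concrete, a small Linux-only Go sketch (purely illustrative, not part of the question) can report whether /dev/random in the current environment is a symlink or a real character device, and which major/minor numbers the shared kernel sees:

package main

import (
	"fmt"
	"os"
	"syscall"

	"golang.org/x/sys/unix"
)

func main() {
	const path = "/dev/random"

	// Lstat so a symlink (as on the patched host) is reported as a symlink
	// instead of being followed to its target.
	fi, err := os.Lstat(path)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s mode: %v\n", path, fi.Mode())

	if fi.Mode()&os.ModeSymlink != 0 {
		target, _ := os.Readlink(path)
		fmt.Println("symlink ->", target)
	}

	// For a real device node, Rdev carries the major/minor pair that the
	// shared host kernel uses to decide which driver handles reads.
	if st, ok := fi.Sys().(*syscall.Stat_t); ok && fi.Mode()&os.ModeCharDevice != 0 {
		rdev := uint64(st.Rdev)
		fmt.Printf("char device major=%d minor=%d\n", unix.Major(rdev), unix.Minor(rdev))
	}
}

On a typical Linux system /dev/random is character device 1:8 and /dev/urandom is 1:9, so a container whose /dev/random node still carries minor number 8 keeps the blocking behaviour no matter what the symlink in the host's /dev points to.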
The -i flag is described as "Keep STDIN open even if not attached", but Docker run reference also says:
If you do not specify -a then Docker will attach all standard streams.
So, by default, stdin is attached but not opened? It doesn't make sense for STDIN to be attached but not opened, right?
The exact code associated with that documentation is:
// If neither -d or -a are set, attach to everything by default
if len(flAttach) == 0 && !*flDetach {
	if !*flDetach {
		flAttach.Set("stdout")
		flAttach.Set("stderr")
		if *flStdin {
			flAttach.Set("stdin")
		}
	}
}
With:
flStdin := cmd.Bool("i", false, "Keep stdin open even if not attached")
In other words, stdin is attached only if -i is set.
if *flStdin {
	flAttach.Set("stdin")
}
In that sense, "all" standard streams isn't accurate.
As commented below, that code (referenced by the doc) has since changed to:
cmd.Var(&flAttach, []string{"a", "-attach"}, "Attach to STDIN, STDOUT or STDERR")
-a no longer means "attach all streams" but rather "specify which streams you want attached".
var (
	attachStdin  = flAttach.Get("stdin")
	attachStdout = flAttach.Get("stdout")
	attachStderr = flAttach.Get("stderr")
)
-i remains a valid option:
if *flStdin {
	attachStdin = true
}