distributed tensorflow using docker

distributed tensorflow using docker - docker

I am experimenting with distributed tensorflow and an example project.
Running the project on the same docker container seems to work well. As soon as you run the application on different conatiners, they cannot connect to eachother.
I don't really know the problem, but I think this is because docker and tensorflow open ports which have to be concatenated to connect to the application like localhost:[docker-port]:[tf-port]
Do you think my guess is correct? And how can I solve this problem?

Related

How to develop and test Docker images remotely that are going to be deployed into K8s cluster that has different hardware than my local computer?

What are the "best practices" workflow for developing and testing an image (locally I guess) that is going to be deployed into a K8s cluster, and that has different hardware than my laptop?
To explain the context a bit, I'm running some deep learning code that needs gpus and my laptop doesn't have any so I launch a "training job" into the K8s cluster (K8s is probably not meant to be used this way, but is the way that we use it where I work) and I'm not sure how I should be developing and testing my Docker images.
At the moment I'm creating a container that has the desired gpu and manually running a bunch of commands till I can make the code work. Then, once I got the code running, I manually copy all the commands from history that made the code work and then copy them to a local docker file on my computer, compile it and push it to a docker hub, from which the docker image is going to be pulled the next time I launch a training job into the cluster, that will create a container from it and train the model.
The problem with this approach is that if there's a bug in the image, I have to wait until the deployment to the container to realize that my Docker file is wrong and I have to start the process all over again to change it. Also finding bugs from the output of kubectl logs is very cumbersome.
Is it a better way to do this?
I was thinking of installing docker into the docker container and use IntelliJ (or any other IDE) to attach it to the container via SSH and develop and test the image remotely; but I read in many places that this is not a good idea.
What would you recommend then instead?
Many thanks!!

Windows 10 Docker Network DNS doesn't work after reboot

I'm not sure if this is an issue with the current version of Windows Docker network or poor configuration and misunderstanding on my part, but I have the following setup:
2 Docker containers (built using the Microsoft/ASP.NET image as a base) running a .NET MVC application in each.
1 Docker container running SQL server (built using the Microsoft/mssql-server-windows image)
When I create all 3 containers everything works great, I can attach and ping all other the other containers using their names without any issue. The applications run and can communicate with each other as I hoped.
However, when I reboot my machine and start all the containers again they can no longer ping/communicate with each other using their names (using IP addresses is fine).
I've tried this on the default NAT network and also tried replacing the NAT network with my own custom NAT network.
To resolve the issue I have to run the force network disconnect command for each container as such:
docker network disconnect nat <containername> --force
And then I have to reconnect each container to the network before starting them up. All containers can then ping/communicate with each other using their names as well as their IP addresses.
FYI, this is a development environment but I was hoping to do something similar in Azure using a Windows Server 2016 VM, although I don't quite know what the best network configuration is for live production yet as I need to have multiple applications (in separate containers) on the same node accessed via their own subdomains.
Any help or guidance would be great.

I'm not sure, in part because this question was asked several months before any other example I've run into, but this sounds very similar to the problem described at https://github.com/docker/for-win/issues/1038.
Basically, there appears to be a problem introduced with the 1709 update to Windows 10 which results in a scenario where Hyper-V networking doesn't work the way it ought to.
There appear to be two common ways of working around this problem: Turning off "Fast Start" in the Control Panel => Power Options => System Settings, or restarting Docker for Windows and any containers after booting. I also thought I saw something on a Microsoft blog post indicating that the underlying problem has now been resolved and will be included in an update to Windows 10, but alas I can no longer find that information or the specific version number in which the problem was (theoretically) resolved. It may well be the delayed 1803 "Spring Creators Update" release.

Docker swarm for usb devices

I'm trying to build a distributed python application that connects several hosts with android devices over usb. These hosts then connect over TCP to a central broker for job disbursement. I'm currently tackling the problem of supporting multiple python builds for developers (linux/windows) as well as production (runs an older OS which requires it's own build of python). On the surface, docker seems like a good fit here as it would allow me to support a single python build.
However, docker doesn't seem suited well to working with external hardware. There is the --device option to pass a specific device, but that requires that the device be present before the docker run command and it doesn't persist across device reboots. I can get around that problem with --privileged but docker swarm currently does not support that (see issue 24862) so I'd have to manually setup the service on each of the hosts, which would not only be a pain, but I'd lose the niceness of swarm's automatic deployment and rollout.
Does anyone have any suggestions on how to make something like this work with docker, or am I just barking up the wrong tree here?

you can try developing on docker source code, and build docker from source code to support your requirement.
There is a hack, how to do that. In the end of this issue:
https://github.com/docker/swarmkit/issues/1244

Using docker to run a distributed computation

I just came across docker, and was looking through its docs to figure out how to use this to distribute a java project across multiple nodes, while making this distribution platform independent i.e the nodes can be running any platform. Currently i'm sending classes to different nodes and running it on them with the assumption that these nodes have the same environment as the client. I couldn't quite figure out how to do this, any suggestions wouldbe greatly appreciated.

I do something similar. In my humble opinion Docker or not is not your biggest problem. However, using Docker images for this purpose can and will save you a lot of headaches.
We have a build pipeline where a very large Java project is built using Maven. The outcome of this is a single large JAR file that contains the software we need to run on our nodes.
But some of our nods also need to run some 3rd party software such as Zookeeper and Cassandra. So after the Maven build we use packer.io to create a Docker image that contains all needed components which ends up on a web server that can be reached only from within our private cloud infrastructure.
If we want to roll out our system we use a combination of Python scripts that talk with the OpenStack API and create virtual machines on our cloud, and Puppet which performs the actual software provisioning inside of the VMs. Our VMs are CentOS 7 images, so what Puppet actually does is to add the Docker yum repos. Then installs Docker through yum, pulls in the Docker image from our repository server and finally uses a custom bash script to launch our Docker image.
For each of these steps there are certainly even more elegant ways of doing it.

How Docker can be used for multi-layered application?

I am trying to understand how docker can be used to dockerize multilayered application.
My tomcat application needs mongodb, mysql, redis, solr and rabbitmq. I am playing with Docker for couple of weeks now. I am able to install and use mongo/mysql containers. But I am not getting how can I completely ship application using Docker. I have few questions.
How should the images be. Should I have one image that has all the components installed or have separate images (like one for tomcat, one for mongo, one for mysql etc) and start those containers using a bash script outside of docker.
What is the docker way of maintaining multiple containers at once. Meaning say I have multiple containers (like mongo, mysql, tomcat etc...) that needs to be worked together to run my application, Is there any inbuilt way of dealing this so that one command/script does this?
Suppose I dockerize my application, how can i manage various routine tasks that need to be performed like incremental code deployment, database patches etc. Currently we are using vagrant, we also use fabric along with vagrant for various tasks.Like after vagrant up we use fab tasks for all kind of routine things like code deployment, db refresh, adding volumes, start/stop services etc. What would be the docker's way of doing this?
With Vagrant if VM crashes due to High CPU etc. host system is not affected. But I see docker is eating up lot of host resources. Can we put limits for that say not more than one cpu core for that container etc..?
Because we use vagrant, most of the questions above are in that context. When started with docker I thought docker as a kind of visualization technology that can be a replacement for our huge Vagrant based infra. Please correct me if I am wrong?

I advise you to look at docker-compose:
you'll be able to define an architecture of your application
you can then easily build it and run it (with one command)
pretty much same setup for dev and prod

For microservices, composition etc I won't repost on this.
For containet resource allocation:
Docker run has various resource control options (using google cgroups) see my gist here
https://gist.github.com/afolarin/15d12a476e40c173bf5f

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart