My .dockerignore is set up to ignore busy directories, but altering a single file seems to have a huge impact on the run performance.
If I make a change to a single, non-dependent file (for example .php or .jpg) in the origin directory, the performance of the next request is really slow.
Subsequent requests are fast, until I make a change to any file in the origin directory and then request times return to ~10s.
Neither :cached nor :delegated makes any difference.
Is there any way to speed this up? It seems like Docker is doing a lot in the background, considering only one file has been changed.
The .dockerignore file does not affect volume mounts. It is only used when sending context to the Docker daemon during image builds. So that is not a factor here.
Poor performance in some situations is a longstanding known issue in Docker for Mac. They discuss this topic in the documentation. In my experience, the worst performance happens with fs event scanners, i.e. you are watching some directory for changes and reloading an app server in response. My way of dealing with that is to disable the fs event watcher and restart the app server manually when that's needed. (May or may not be practical for your situation.)
The short answer is that you can try third party solutions, or you can accept the poor performance in development, realizing it won't follow you to production (which is probably not going to be on the Mac platform).
I ran into a similar issue, but on Windows. The way I got around it was to use Vagrant, which has great support for provisioning using Docker. In your Vagrantfile, set up the shared directory to use rsync. This copies the directories over to the VM, where Docker can access them quickly because they are local to the VM.
This is a great article that helped me come to this conclusion: http://blog.zenika.com/2014/10/07/setting-up-a-development-environment-using-docker-and-vagrant/
More information on provisioning vagrant using docker: https://www.vagrantup.com/docs/provisioning/docker.html
More information on vagrant rsync: https://www.vagrantup.com/docs/synced-folders/rsync.html
I hope this helps you as much as it did me.
While testing new Docker builds (modifying Dockerfile) it can take quite some time for the image to rebuild due to huge download size (either direct by wget, or indirect using apt, pip, etc)
One way around this that I personally use often is to split the commands I plan to modify into their own RUN instructions. This avoids re-downloading some parts because previous layers are cached. This, however, doesn't cut it if the command that requires "tuning" is early on in the Dockerfile.
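Why an early change is so costly can be illustrated with a toy model of the layer cache. The sketch below is a simplification, not Docker's actual implementation: it assumes each layer's cache key depends only on the instruction text and the key of the layer before it (real builds also take the contents of copied files into account). The function names and the Dockerfile instructions are hypothetical.

```python
import hashlib


def layer_keys(instructions):
    """Return one cache key per instruction; each key also depends on all earlier ones."""
    keys = []
    parent = ""
    for instruction in instructions:
        digest = hashlib.sha256((parent + "\n" + instruction).encode("utf-8")).hexdigest()
        keys.append(digest)
        parent = digest
    return keys


def reusable_layers(old, new):
    """Count how many leading layers of `new` could be reused from a build of `old`."""
    count = 0
    for old_key, new_key in zip(layer_keys(old), layer_keys(new)):
        if old_key != new_key:
            break
        count += 1
    return count


# Hypothetical Dockerfile instructions, used only as strings for illustration.
before = [
    "FROM debian:stable",
    "RUN apt-get update && apt-get install -y python3-pip",
    "RUN pip3 install requests",
]
late_change = before[:2] + ["RUN pip3 install requests flask"]
early_change = [before[0], before[1] + " curl", before[2]]

print(reusable_layers(before, late_change))   # 2: only the last layer is rebuilt
print(reusable_layers(before, early_change))  # 1: everything after FROM is rebuilt
```

In this model, editing the last instruction leaves the two earlier layers reusable, while editing the second instruction invalidates it and everything after it, which is the situation described above.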
Another solution is to use an image that already contains most of the required packages so that it would just be pulled once and cached, but this can come with unnecessary "baggage".
So is there a straightforward way to cache all downloads done by Docker while building/running? I'm thinking of having Memcached on the host machine, but it seems like overkill. Any suggestions?
I'm also aware that I can test in an interactive shell first, but sometimes you need to test the Dockerfile and make sure it's production-ready (including arguments and defaults), especially if the only way you will ever see what's going on after that point is ELK or cluster crash logs.
This question:
https://superuser.com/questions/303621/cache-for-apt-packages-in-local-network
is the same question, but for a local network instead of a single machine. The answer can still be used in this scenario; it's actually a simpler case than a network with multiple machines.
If you install Squid locally you can use it to cache all your downloads including your host-side downloads.
But more specifically, there's also a Squid Docker image!
Heads-up: if you run Squid as a service in a docker-compose file, don't forget to use the Squid service name instead of Docker's subnet gateway: 172.17.0.1:3128 becomes squid:3128.
The way I did this was:
1) Used the new RUN --mount=type=cache,target=/home_folder/.cache/curl (a BuildKit feature).
2) Wrote a script which looks into the cache before calling curl (a wrapper over curl with a cache).
3) Called the script in the Dockerfile during the build.
It is more of a hack, but it works.
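The original wrapper script is not shown, so the following is only a minimal sketch of the "look into the cache before downloading" idea, written in Python rather than as a shell wrapper around curl. The names cached_fetch and fake_fetch are hypothetical, and the downloader is passed in as a function so that the example runs without network access; in a real build the cache directory would be the cache mount target.

```python
import hashlib
import os
import tempfile


def cached_fetch(url, cache_dir, fetch):
    """Return the bytes for `url`, calling `fetch(url)` only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    # Use a hash of the URL as the file name so any URL maps to a safe path.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return handle.read()
    data = fetch(url)
    # Write to a temporary file first so an interrupted write never leaves
    # a partial entry under the final cache name.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        handle.write(data)
    os.replace(tmp_path, path)
    return data


# Demonstration with a stand-in downloader (an assumption, not a real HTTP call).
calls = []


def fake_fetch(url):
    calls.append(url)
    return b"payload for " + url.encode("utf-8")


demo_dir = tempfile.mkdtemp()
first = cached_fetch("https://example.com/pkg.tar.gz", demo_dir, fake_fetch)
second = cached_fetch("https://example.com/pkg.tar.gz", demo_dir, fake_fetch)
print(first == second, len(calls))  # True 1
```

The second call is served from the cache directory, so the downloader runs only once. This sketch never expires entries, which may or may not be acceptable for a given build.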
I'm using Django but I guess the question is applicable to any web project.
In our case, there are two types of code: the first is Python code (run in Django), and the other is static files (HTML/JS/CSS).
I could publish new image when there is a change in any of the code.
Or I could use bind mounts for the code. (For django, we could bind-mount the project root and static directory)
If I use bind mounts for code, I could just update the production machine (probably with git pull) when there's code change.
Then the Docker image will handle updates that are not strictly our own code changes (such as a library update, or new setup such as setting up Elasticsearch).
Does this approach imply any obvious drawback?
For security reasons, it is advisable to keep an operating system up to date with the latest security patches. Docker images, however, are meant to be released immutably so that production issues can always be reproduced outside production; the OS inside the image will therefore not update itself when security patches are released. This means we need to rebuild and deploy our Docker image frequently in order to stay on the safe side.
So I would prefer to release a new Docker image containing my code and static files. Because they change more often, they force frequent releases, which keeps the OS more up to date in terms of security patches without needing to rebuild Docker images in production just to keep the OS up to date.
Note that I assume here that you release new code or static files on at least a weekly basis; otherwise I still recommend rebuilding the Docker images at least once a week in order to get the latest security patches for all the software being used.
Generally, the more Docker-oriented solutions I've seen to this problem lean towards packaging the entire application in the Docker image. That especially includes application code.
I'd suggest three good reasons to do it this way:
If you have a reproducible path to docker build a self-contained image, anyone can build and reproduce it. That includes your developers, who can test a near-exact copy of the production system before it actually goes to production. If it's a Docker image, plus this code from this place, plus these static files from this other place, it's harder to be sure you've got a perfect setup matching what goes to production.
Some of the more advanced Docker-oriented tools (Kubernetes, Amazon ECS, Docker Swarm, Hashicorp Nomad, ...) make it fairly straightforward to deal with containers and images as first-class objects, but trickier to say "this image plus this glop of additional files".
If you're using a server automation tool (Ansible, Salt Stack, Chef, ...) to push your code out, then it's straightforward to also use those to push out the correct runtime environment. Using Docker to just package the runtime environment doesn't really give you much beyond a layer of complexity and some security risks. (You could use Packer or Vagrant with this tool set to simulate the deploy sequence in a VM for pre-production testing.)
You'll also see a sequence in many SO questions where a Dockerfile COPYs application code to some directory, and then a docker-compose.yml bind-mounts the current host directory over that same directory. In this setup the container environment reflects the developer's desktop environment and doesn't really test what's getting built into the Docker image.
("Static files" wind up in a gray zone between "is it the application or is it data?" Within the context of this question I'd lean towards packaging them into the image, especially if they come out of your normal build process. That especially includes the primary UI to the application you're running. If it's things like large image or video assets that you could reasonably host on a totally separate server, it may make more sense to serve those separately.)
Docker seems to be the incredible new tool to solve all developer headaches when it comes to packaging and releasing an application, yet I'm unable to find simple solutions for just upgrading an existing application without having to build or buy into whole "cloud" systems.
I don't want any kubernetes cluster or docker-swarm to deploy hundreds of microservices. Just simply replace an existing deployment process with a container for better encapsulation and upgradability.
Then maybe upgrade this in the future, if the need for more containers increases to the point where manual handling no longer makes sense.
Essentially the direct app dependencies (Language and Runtime, dependencies) should be bundled up without the need to "litter" the host server with them.
Lower-level static services, like the database, should still be in the host system, as well as an entry router/load balancer (a simple nginx proxy).
Does it even make sense to use it this way? And if so, is there any "best practice" for doing something like this?
Update:
For the application I want to use it on, I'm already using GitLab CI.
Tests are already run inside a Docker environment via GitLab CI, but deployment still happens the "old way" (syncing the git repo to the server and automatically restarting the app, etc.).
Containerizing the application itself is not an issue, and I've also used full Docker deployments via cloud services (mostly Heroku), but for this project something like that is overkill. There is no point in paying hundreds of $$ for a cloud server environment if I need pretty much none of its advantages.
I've found several "install your own Heroku" kinds of systems, but I don't need or want to manage the complexity of a dynamic system.
I suppose a couple of remote bash commands for updating and restarting a Docker container on the server (after the image has been pushed to a registry by the CI) could already do the job, though probably pretty unreliably compared to the current way.
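As a rough illustration of what "a couple of remote commands" might look like, the sketch below only builds the command lines and prints them; it does not run anything or connect to a server. The image name, container name, port mapping, and the choice of restart policy are all assumptions made for the example, and the helper names are hypothetical.

```python
import shlex


def redeploy_commands(image, name, port_mapping):
    """Build the docker CLI calls that replace a running container with a fresh image."""
    return [
        ["docker", "pull", image],   # fetch the image the CI pushed to the registry
        ["docker", "stop", name],    # stop the currently running container
        ["docker", "rm", name],      # remove it so the name can be reused
        ["docker", "run", "-d", "--name", name,
         "--restart", "unless-stopped", "-p", port_mapping, image],
    ]


def as_shell_script(commands):
    """Render the commands as shell lines, quoting every argument."""
    return "\n".join(
        " ".join(shlex.quote(part) for part in command) for command in commands
    )


# Hypothetical values, for illustration only.
commands = redeploy_commands(
    "registry.example.com/myapp:latest", "myapp", "127.0.0.1:8000:8000"
)
print(as_shell_script(commands))
```

The printed lines could then be sent to the server from a CI job, for example over SSH. Note that on a first deployment the stop and rm steps would fail because no container with that name exists yet, and the application is unavailable between stop and run, which is part of the unreliability mentioned above.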
Unfortunately, the "best practice" is highly subjective, as it depends entirely on your setup and your organization.
It seems like you're looking for an extremely minimalist approach to Docker containers. You want to simply put source code and dependencies into a container and push that out to a system. This is definitely possible with Docker, but the manner of doing this is going to require research from you to see what fits best.
Here are the questions I think you should be asking to get started:
1) Is there a CI tool that will help me package together these containers, possibly something I'm already using? (Jenkins, GitLab CI, CircleCI, TravisCI, etc...)
2) Can I use the official Docker images available at Dockerhub (https://hub.docker.com/), or do I need to make my own?
3) How am I going to store Docker images? Will I host a basic Docker registry (https://hub.docker.com/_/registry/), or do I want something with a bit more access control (GitLab Container Registry, Harbor, etc...)?
That really only focuses on the Continuous Integration part of your question. Once you figure this out, then you can start to think about how you want to deploy those images (Possibly even using one of the tools above).
Note: Also, Docker doesn't eliminate all developer headaches. Does it solve some of the problems? Absolutely. But what Docker, and the accompanying Container mindset, does best is shift many of those issues to the left. What this means is that you see many of the problems in your processes early, instead of those problems appearing when you're pushing to prod and you suddenly have a fire drill. Again, Docker should not be seen as a solve-all. If you go into Docker thinking it will be a solve-all, then you're setting yourself up for failure.
So I have noticed that on Mac there is a huge problem with sync while developing a PHP app. It can take up to 60 seconds before a page loads.
Since Docker on Mac uses an additional virtual machine, I have used http://docker-sync.io to fix it. But I wonder, are you guys having similar issues? Yesterday I noticed that there is something called File Sharing in the Docker settings.
As I've put my code at /Volumes/Documents/wwwdata, do I have to add it there as well?
As the author of docker-sync, I might be able to give you a comprehensive answer.
As of now, under macOS there is no solution using native Docker for Mac tools that gives a somewhat acceptable development environment, meaning one where source code is shared into the container during its lifetime.
The main reason is that read and write speed on mounted volumes in Docker for Mac is extremely slow; see the performance comparison. That said, you could mount a volume using -v or volumes into a normal container, but this will be extremely slow. VirtualBox or Fusion shares are slow for the same reasons; OSXFS currently performs better than those, but is still horribly slow.
Docker-sync tries to decouple the container from the slow read/write speed of OSXFS by using unison to sync files instead of a direct mount.
Long story short:
Docker for Mac is still (very) slow, and this holds even for High Sierra with APFS: it is unusable for development purposes.
The folders you are looking at in that settings pane are nothing more than OSXFS-based mounts into the HyperKit container, i.e. the same mechanism that has been used in the past; the only change is that you can now configure folders other than the default ones to be OSXFS-synced and available for mounting. So this will not help you at all either.
To make this answer more balanced towards the general case, you can find alternatives to docker-sync here. The number of alternatives also tells you that there is (still) a huge issue in Docker for Mac; it is not something docker-sync made up.
I have noticed that asset requests are very, very slow on my Rails app. When the files are inside the Docker image, it takes around 20 ms to get an asset file. When I start the container and mount the files, it takes around 400 ms to fetch them!
The Docker filesystem is slow, but the Rails app boot time is pretty much the same in both cases, so that's not necessarily the reason. Do you have any idea what the reason could be here?
I had the same issue, and it was nearly impossible to work in a development environment with a Dockerized Rails application because it is terribly slow on Mac.
This is a known issue: Docker is very slow on Mac and Windows, in particular due to volume mounting.
First of all we took some precautions:
Be sure that you are not mounting big files or folders. For example my log directory size was 10GB! You can install ncdu to find big files/folders, follow this: https://maketips.net/tip/461/docker-build-is-slow-or-docker-compose-build-is-slow
Check if you are facing this network known issue: https://github.com/docker/compose/issues/3419#issuecomment-221793401
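If installing ncdu is not an option, a rough equivalent of the "find big folders" check can be sketched in a few lines of Python. This is only an illustration with hypothetical function names; it sums file sizes per top-level entry of a directory and does not follow symlinks. The demo below runs against a throwaway directory rather than a real project.

```python
import os
import tempfile


def directory_sizes(root):
    """Return {top-level entry name: total bytes} for everything directly under `root`."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
        elif entry.is_dir(follow_symlinks=False):
            total = 0
            # os.walk does not descend into symlinked directories by default.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for filename in filenames:
                    path = os.path.join(dirpath, filename)
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
            sizes[entry.name] = total
    return sizes


def largest(root, limit=5):
    """Return the `limit` biggest top-level entries as (name, bytes), largest first."""
    sizes = directory_sizes(root)
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)[:limit]


# Demo on a temporary directory that imitates a project with a large log folder.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "log"))
with open(os.path.join(demo_root, "log", "development.log"), "wb") as handle:
    handle.write(b"x" * 5000)
with open(os.path.join(demo_root, "Gemfile"), "wb") as handle:
    handle.write(b"y" * 100)

print(largest(demo_root))  # [('log', 5000), ('Gemfile', 100)]
```

Running largest on the directory that gets mounted into the container shows which entries are worth excluding from the mount, such as a log directory that has grown very large.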
However, the above precautions didn't help much.
The big improvement was adding docker-sync gem!
Check-out this: http://docker-sync.io/
Basically with this gem you are using a different approach to sync folders between your machine and app container. This works very well and now everything is very fast, almost similar to Linux performance!