I have a .NET Core 2.0 web API that runs in a Docker container behind a load balancer (also a Docker container). I want to scale the web API by having multiple containers, one per customer. With that in mind, I had to change the configuration to separate and abstract the customer details so I can have app settings per customer. That way I would have appsettings.CustomerA.config, appsettings.CustomerB.config, etc.
My general Dockerfile is:
FROM microsoft/aspnetcore-build AS builder
WORKDIR /app
# copy csproj and restore as distinct layers
COPY ./Project/*.csproj ./
RUN dotnet restore
# copy everything else and build
COPY ./Project ./
RUN dotnet publish -c Release -o out
# build runtime image
FROM microsoft/aspnetcore
WORKDIR /app
COPY --from=builder /app/out ./
ENV ASPNETCORE_ENVIRONMENT Production
ENTRYPOINT ["dotnet", "Project.dll"]
That's all fine, but what I don't know is whether I can have different Dockerfiles, one per customer, where I specify the customer (I'm not sure whether that's good practice, because I would be mixing environments with customers, but at the same time the environment is for a given customer, if that makes sense), or whether I should create a different config file that I copy across depending on the customer.
I took the docker template from the Microsoft docker documentation https://docs.docker.com/engine/examples/dotnetcore/#create-a-dockerfile-for-an-aspnet-core-application and added the environment myself.
Thanks for the help
It is not recommended to have a separate Dockerfile, or even a separate image, for each customer. That quickly leads to a large number of Dockerfiles to manage and introduces a lot of complexity, especially when a new feature requires you to upgrade all of the files/images.
It is preferable to have one Docker image and externalize all customer specific configuration to the outside of the image.
There are multiple ways to do that. In Docker, you can pass environment variables to the container to configure it, and you can also collect those variables in a file and pass it as an env file.
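For example, a minimal sketch (the file name, keys, and image tag are made up for illustration; ASP.NET Core maps environment variables such as ConnectionStrings__Default onto its configuration sections):
# customerA.env -- one env file per customer (the keys are hypothetical)
ConnectionStrings__Default=Server=db-customer-a;Database=app
Customer__Name=CustomerA

# run the same image for customer A, injecting that file
docker run -d --env-file ./customerA.env -p 8080:80 myregistry/project:latest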
Another "Docker way", to externalize stuff outside of the image is to use bind mounts. You can bind a directory from the host onto the container when running the container. In the directory, you can have config files ... that the container can pick up when starting up.
I have hosted a Docker image on the GitLab repo.
I have some sensitive data in one of the image layers.
Now if someone pulls the image, can he see the sensitive data in the intermediate layer?
Also, can he know the Dockerfile commands I used for the image?
I want the end user to only have the image and not have any other info about its Dockerfile.
But at least I don't want him to see the intermediate files.
If someone pulls the image, can he see the sensitive data in the intermediate layer?
Yes.
Also, can he know the Dockerfile commands I used for the image?
Yes.
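For example, anyone who can pull the image can run something like the following (the image name is a placeholder):
# show the instruction each layer was built from
docker history --no-trunc your-image:tag

# export the image and inspect the layer contents directly
docker save -o image.tar your-image:tag
tar -tvf image.tar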
You should take these into account when designing your image build system. For example, they mean you should never put any sort of credentials into your Dockerfile, because anyone who has the image can easily retrieve them. You can mitigate this somewhat with @RavindraBagale's suggestion to use a multi-stage build, but even so it's risky. Run commands like git clone that need real credentials outside the Dockerfile.
The further corollary to this is that, if you think your code is sensitive, Docker on its own is not going to protect it. Using a compiled language (Go, C++, Rust) means you can limit yourself to distributing the compiled binary, which is harder to reverse-engineer; with JVM languages (Java, Scala, Kotlin) you can distribute only the jar file, though in my experience the bytecode is relatively readable; for interpreted languages (Python, Ruby, JavaScript) you must distribute human-readable code. This may influence your initial language choice.
You can use multi-stage builds: manage secrets in an intermediate stage that is later discarded, so that no sensitive data reaches the final image, such as in the following example:
FROM ubuntu AS intermediate
WORKDIR /app
COPY secret/key /tmp/
# the key only ever exists in this intermediate stage
RUN scp -i /tmp/key build@acme:files .

FROM ubuntu
WORKDIR /app
# copy only the fetched files; the key stays behind in the discarded stage
COPY --from=intermediate /app .
Other options to manage secrets are:
docker secret: you can use Docker secrets if you are using Docker swarm
secrets in a docker-compose file (without swarm):
version: "3.6"
services:
my_service:
image: centos:7
entrypoint: "cat /run/secrets/my_secret"
secrets:
- my_secret
secrets:
my_secret:
file: ./super_duper_secret.txt
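With swarm, the docker secret route mentioned above would look roughly like this (a sketch; the secret and service names reuse the example values):
# create the secret on a swarm manager
docker secret create my_secret ./super_duper_secret.txt

# attach it to a service; it appears inside the container as /run/secrets/my_secret
docker service create --name my_service --secret my_secret centos:7 cat /run/secrets/my_secret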
I'm trying to understand the pros and cons of these four methods of packaging an application using Docker after development:
Use a very light-weight image (such as Alpine) as the base of the image containing the main artifact, then update the original docker compose file to use it along with other services when creating and deploying the final containers.
Something else I could do is run docker commit first, then use the resulting image as the base image of my artifact image.
One other method can be using a single FROM only, to base my image on one of the required services, and then use RUN commands to install the other required services as "Linux packages"(e.g. apt-get another-service) inside the container when it's run.
Should I use multiple FROMs for those images? Wouldn't that be complicated and only needed in more complex projects? Also, it seems unclear how to decide in what order those FROMs should be written if none of them is more important than the others as far as my application is concerned.
In the development phase, I used a "docker compose file" to run multiple docker containers. Then I used these containers and developed a web application (accessing files on the host machine using a bind volume). Now I want to write a Dockerfile to create an image that will contain my application's artifact, plus those services present in the initial docker compose file.
I'd suggest these rules of thumb:
A container only runs one program. If you need multiple programs (or services) run multiple containers.
An image contains the minimum necessary to run its application, and no more (and no less -- do not depend on bind mounts for the application to be functional).
I think these best match your first option. Your image is built FROM a language runtime, COPYs its code in, and does not include any other services. You can then use Compose or another orchestrator to run multiple containers in parallel.
Using Node as an example, a super-generic Dockerfile for almost any Node application could look like:
# Build the image FROM an appropriate language runtime
FROM node:16
# Install any OS-level packages, if necessary.
# RUN apt-get update \
# && DEBIAN_FRONTEND=noninteractive \
# apt-get install --no-install-recommends --assume-yes \
# alphabetical \
# order \
# packages
# Set (and create) the application directory.
WORKDIR /app
# Install the application's library dependencies.
COPY package.json package-lock.json ./
RUN npm ci
# Install the rest of the application.
COPY . .
# RUN npm run build
# Set metadata for when the application is run.
EXPOSE 3000
CMD npm run start
A matching Compose setup that includes a PostgreSQL database could look like:
version: '3.8'
services:
  app:
    build: .
    ports: ['3000:3000']
    environment:
      PGHOST: db
  db:
    image: postgres:14
    volumes:
      - dbdata:/var/lib/postgresql/data
    # environment: { ... }
volumes:
  dbdata:
Do not try to (3) run multiple services in a container. This is complex to set up, it's harder to manage if one of the components fails, and it makes it difficult to scale the application under load (you can usually run multiple application containers against a single database).
Option (2) suggests doing setup interactively and then docker commit an image from it. You should almost never run docker commit, except maybe in an emergency when you haven't configured persistent storage on a running container; it's not part of your normal workflow at all. (Similarly, minimize use of docker exec and other interactive commands, since their work will be lost as soon as the container exits.) You mention docker save; that's only useful to move built images from one place to another in environments where you can't run a Docker registry.
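For what it's worth, that docker save workflow is just a tarball round-trip (the image name and file name here are placeholders):
# on the machine with the built image
docker save -o myapp.tar myapp:1.0

# copy myapp.tar to the target machine by whatever means, then
docker load -i myapp.tar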
Finally, option (4) discusses multi-stage builds. The most obvious use of these is to remove build tools from the final image; for example, in our Node example above, we could RUN npm run build, but then have a final stage, also FROM node, that runs NODE_ENV=production npm ci to skip the devDependencies from package.json and COPY --from=build-stage the built application. This is also useful with compiled languages, where a first stage contains the (very large) toolchain and the final stage only contains the compiled executable. It is largely orthogonal to the other parts of the question; you could update the Dockerfile I show above to use a multi-stage build without changing the Compose setup at all.
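A rough multi-stage sketch of the Node Dockerfile above, assuming the build step writes its output to dist/ (the directory name and base image tag are assumptions):
# first stage: install everything and run the build
FROM node:16 AS build-stage
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# final stage: production dependencies plus the built output only
FROM node:16
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN npm ci
COPY --from=build-stage /app/dist ./dist
EXPOSE 3000
CMD npm run start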
Do not bind-mount your application code into the container. This hides the work that the Dockerfile does, and it's possible the host filesystem will have a different layout from the image (possibly due to misconfiguration). It means you're "running in Docker", with the complexities that entails, but it's not actually the image you'll deploy. I'd recommend using a local development environment (try running docker-compose up -d db to get a database) and then using this Docker setup for final integration testing.
So I have a Dockerfile with the following build steps:
FROM composer:latest AS composer
COPY ./src/composer.json ./src/composer.lock ./
RUN composer install --prefer-dist
FROM node:latest AS node
COPY ./src ./
RUN yarn install
RUN yarn encore prod
FROM <company image>/php74 as base
COPY --from=composer --chown=www-data:www-data /app/vendor /var/www/vendor
COPY --from=node --chown=www-data:www-data ./public/build /var/www/public/build
# does the rest of the build...
and in my docker-compose file, I've got a volume for local changes
volumes:
  - ./src:/var/www
The container runs fine in the CI/CD pipeline and deploys just fine; it grabs everything it needs and COPYs the correct files from the src directory. The problem is when we use a local volume for the code (for working in development): we have to run composer/yarn install on the host because the src folder does not contain node_modules/ or vendor/.
Is there a way to publish the node_modules/vendor directory back to the volume?
My attempts so far have been within the Dockerfile, declaring node_modules and vendor as volumes, and that didn't work. Maybe it's not possible to publish a volume inside another volume? i.e. within the Dockerfile: VOLUME /vendor
The only other way I can think of to solve this would be to write a bash script that runs composer via docker run on docker-compose up, but then that would make the build step pointless.
Hopefully I've explained what I'm trying to achieve here. Thanks.
You should delete that volumes: block, especially in a production environment.
A typical workflow here is that your CI system produces a self-contained Docker image. You can run it in a pre-production environment, test it, and promote that exact image to production. You do not separately need to copy the code around to run the image in various environments.
What that volumes: declaration says is to replace the entire /var/www directory – everything the Dockerfile copies into the image – with whatever happens to be in ./src on the local system. If you move the image between systems you could potentially be running completely different code with a different filesystem layout and different installed packages. That's not what you really want. Instead of trying to sync the host content from the image, it's better to take the host filesystem out of the picture entirely.
Especially if your developers already need Composer and Node installed on their host system, they can just use that set of host tools for day-to-day development, setting environment variables to point at data stores in containers as required. If it's important to do live development inside a container, you can put the volumes: block (only) in a docker-compose.override.yml file that isn't deployed to the production systems; but you still need to be aware that, while you're "inside a container", you're not really running the system in the form it would take in production.
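A minimal sketch of that override file (the service name app is an assumption about your main docker-compose.yml); docker-compose merges docker-compose.override.yml automatically when it sits next to docker-compose.yml:
# docker-compose.override.yml -- not deployed to production
version: "3.8"   # match the version declared in your main docker-compose.yml
services:
  app:           # assumed service name from the main file
    volumes:
      - ./src:/var/www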
You definitely do not want a Dockerfile VOLUME for your libraries or code. This has few obvious effects; its most notable are to prevent later RUN commands from changing that directory, and (if you're running in Compose) to cause changes in the underlying image to be ignored. Its actual effect is to cause Docker to create an anonymous volume for that directory if nothing else is already mounted there, which then generally behaves like a named volume. Declaring a VOLUME or not isn't necessary to mount content into the container and doesn't affect the semantics if you do so.
Before docker:
I used to build the Release bits and run a set of tests on them. Those bits were tested with everything from unit tests to component, functional, E2E, etc.
With docker:
1) I have a CI pipeline to test the bits, but ...
2) With the Dockerfile I build the bits and push them into the image at the same time. So, considering a non-deterministic build system, this is a risk. Is there any way I can write the Dockerfile to address this concern, or what is your approach?
Dockerfile for .NET Core that I am using as a sample:
COPY . .
COPY /service /src/service
RUN dotnet build "src/service/ConsoleApp1/ConsoleApp1.csproj" -c release -o /app
FROM build AS publish
RUN dotnet publish "src/service/ConsoleApp1/ConsoleApp1.csproj" -c release -o /app
WORKDIR /app
FROM runtime AS final
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "ConsoleApp1.dll"]
First of all, your build system should be deterministic - if it is not, you have a problem.
As for your Dockerfile: the file shown really defines three images - the first one builds the code, the second one publishes it, and the last one merely executes it.
A good pipeline usually looks like this:
Build AND unit test - if the unit tests fail, abort the build.
If the unit tests are green, publish the generated image to a Docker registry of your choosing.
Use that image in a dev environment, e.g. a Kubernetes cluster, Azure Container Instances, etc. Now run any E2E, integration, or other tests you need. If and only if all of your tests come back green, promote the image to the production environment, i.e. move it from dev to prod. Your dev environment can look totally different depending on the solution - if you have a single service you want to deploy, run the E2E tests directly, but maybe you have a complex cluster you need to set up, so run the tests on such a test cluster.
Since the image cannot change, due to the nature of Docker, you can safely promote it from dev to production. But you need to make sure that you never overwrite an image in your registry, e.g. avoid the latest tag and use explicit version tags or even sha256 digests.
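A sketch of what that can look like on the command line (the registry, image name, and tags are placeholders):
# build once and tag with something immutable, e.g. the git commit SHA
docker build -t registry.example.com/myservice:3f9c2d1 .
docker push registry.example.com/myservice:3f9c2d1

# "promoting" to production is just adding another tag to the exact same image
docker pull registry.example.com/myservice:3f9c2d1
docker tag registry.example.com/myservice:3f9c2d1 registry.example.com/myservice:prod
docker push registry.example.com/myservice:prod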
The general practice I follow is:
First stage:
once the feature branch is merged to master, run tests in CI/CD (e.g. Jenkins)
once the tests succeed, build the versioned artifact (e.g. a .jar for Java, a .dll for .NET)
publish the versioned artifact to an artifact repository (e.g. JFrog Artifactory or Nexus) if needed
create a git tag
Second stage:
use the versioned artifact created above and create a versioned container image, copying only the artifact. If you don't have an artifact repository yet, you can simply copy the local artifact.
e.g. (warning: not tested):
FROM microsoft/dotnet:latest
RUN mkdir -p /usr/local/app
COPY your-service/artifact-location/artifact-name.dll /usr/local/app
ENTRYPOINT ["dotnet", "/usr/local/app/artifact-name.dll"]
tag the versioned container image and push it to a container registry (e.g. Elastic Container Registry (ECR) from Amazon)
Third stage:
update the Kubernetes deployment manifest with the new version of the container image
apply the Kubernetes deployment manifest (a sketch follows below)
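A rough sketch; the deployment name, container name, registry, and image tag are all placeholders:
# point the existing deployment at the new image version
kubectl set image deployment/my-service my-service=123456789.dkr.ecr.us-east-1.amazonaws.com/my-service:1.4.2

# or edit the image tag in the manifest and re-apply it
kubectl apply -f k8s/deployment.yaml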
Here is an example for Java if that helps - https://github.com/prayagupd/eccount-rest/blob/master/Dockerfile - and a CI/CD pipeline for Jenkins - https://github.com/prayagupd/eccount-rest/blob/master/Jenkinsfile
I linked my hub.docker.com account with bitbucket.org for automated builds. In the core folder of my repository there is a Dockerfile which contains two image build stages. If I build images from the same Dockerfile locally (I mean on Windows), I get two different images. But if I use hub.docker.com for building, only the last image is saved and tagged as "latest".
Dockerfile:
#-------------First image ----------
FROM nginx
#-------Adding html files
RUN mkdir /usr/share/nginx/s1
COPY content/s1 /usr/share/nginx/s1
RUN mkdir /usr/share/nginx/s2
COPY content/s2 /usr/share/nginx/s2
# ----------Adding conf file
RUN rm -v /etc/nginx/nginx.conf
COPY conf/nginx.conf /etc/nginx/nginx.conf
RUN service nginx start
# ------Second image -----------------
# Start with a base image containing Java runtime
FROM openjdk:8-jdk-alpine
# Add a volume pointing to /tmp
VOLUME /tmp
# Make port 8080 available to the world outside this container
EXPOSE 8080
# The application's jar file
ARG JAR_FILE=jar/testbootstap-0.0.1-SNAPSHOT.jar
# Add the application's jar to the container
ADD ${JAR_FILE} test.jar
# Run the jar file
ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/test.jar"]
Has anybody done this before, or is it not possible?
PS: There is only one private repository for free use; maybe this is the main reason.
Whenever you specify a second FROM in your Dockerfile, you start creating a new image. That's the reason why you see only the last image being saved and tagged.
You can accomplish what you want by creating multiple Dockerfiles, i.e. by building the first image from its own Dockerfile and the second image from another, using docker-compose to coordinate the containers.
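A minimal compose sketch of that idea (the Dockerfile names, service names, and ports are made up for illustration):
# docker-compose.yml
version: "3"
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile.nginx   # the nginx half of the original file
    ports:
      - "80:80"
  app:
    build:
      context: .
      dockerfile: Dockerfile.app     # the Java half of the original file
    ports:
      - "8080:8080"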
I found a workaround for this problem.
I separated the Dockerfile into two files:
1. a Dockerfile for nginx
2. a Dockerfile for the Java app
In the build settings, set these two files as Dockerfiles and tag them with different tags.
After building you have one image repository, but the versions are distinguished by the tag. For example you can use:
for the nginx server: youraccount/test:nginx
for the app image: youraccount/test:java
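Locally, the equivalent builds would look something like this (the Dockerfile names are assumptions; in the hub.docker.com build settings you point each build rule at its own Dockerfile and tag instead):
docker build -f Dockerfile.nginx -t youraccount/test:nginx .
docker build -f Dockerfile.java -t youraccount/test:java .
docker push youraccount/test:nginx
docker push youraccount/test:java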
I hope this will not cause problems in future processes.