I use gitlab.com and CI with the shared docker runner that runs tests for my Ruby on Rails project on every commit to master. I noticed that about 90% of the build time is spent on 'bundle install'. Is it possible to somehow cache the installed gems between commits to speed up the 'bundle install'?
UPDATE:
To be more specific, below is the content of my .gitlab-ci.yml. The first 3 lines of the 'test' script take about 90% of the time making the build run for 4-5 minutes.
image: ruby:2.2.4
services:
- postgres
test:
script:
- apt-get update -qy
- apt-get install -y nodejs
- bundle install --path /cache
- bundle exec rake db:drop db:create db:schema:load RAILS_ENV=test
- bundle exec rspec
I don't know if you have a special requirement for running apt-get on every build; if not, create your own Dockerfile with those commands in it, so that your base image already contains those updates and the nodejs package. If you need newer packages later on, you can always update your Dockerfile again.
As for your gems, you can cache them between builds too to speed things up. Normally this is done per job and per branch. See the example at http://doc.gitlab.com/ee/ci/yaml/README.html#cache
cache:
paths:
- /cache
I prefer to add key: "$CI_BUILD_REF_NAME" so that the files are cached per branch. See the documentation on variables for other keys you can use.
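For example, a sketch of that cache section with a per-branch key added:

```yaml
cache:
  key: "$CI_BUILD_REF_NAME"   # separate gem cache per branch
  paths:
    - /cache
```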
You can set the BUNDLE_PATH environment variable and point it to a folder where you want your gems to be installed. The first time you run bundle install it will install all gems, and subsequent runs will only check whether there are any new gems and install only those.
Note: that's supposed to be the default behavior. Check the value of your BUNDLE_PATH env var. Is it being changed to a temporary per-commit folder or something? Or is 90% of the build time being spent downloading gem metadata from rubygems.org? In that case you might want to consider the --local flag (but I'm not sure this is a good idea on a CI server).
Fetching source index for https://rubygems.org/
After looking at your .gitlab-ci.yml I noticed that your --path option is missing an =. I think it is supposed to be:
- bundle install --path=/cache
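Putting those pieces together, a sketch of the whole .gitlab-ci.yml (values taken from the question; whether the shared runner actually persists /cache between builds is an assumption worth verifying):

```yaml
image: ruby:2.2.4
services:
  - postgres
cache:
  key: "$CI_BUILD_REF_NAME"   # per-branch gem cache
  paths:
    - /cache
test:
  script:
    - apt-get update -qy
    - apt-get install -y nodejs
    - bundle install --path=/cache
    - bundle exec rake db:drop db:create db:schema:load RAILS_ENV=test
    - bundle exec rspec
```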
Related
Before I begin: This is not a post about speeding up bundle install that runs when I build the container.
I am building a Docker application that needs to run bundle install during runtime. It may take a while to explain this specific use-case, but the important component is: my running container will download rails projects, and run bundle install. Currently, this takes an extremely long time (likely because of nokogiri).
Is there a way to build my container, such that anytime my script runs bundle install during runtime, it uses cached gems?
I am using:
Docker Compose Version 3
Fargate
ECS
Set your BUNDLE_PATH env var to vendor/bundle
Mount a volume in Fargate to the bundle path
The first run will be slow since it has to build up the bundle cache, but after that it should only update gems if necessary.
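A minimal sketch of those two steps in a version 3 compose file (the service name, image name, and the /app working directory are assumptions; on Fargate the named volume would map to whatever persistent storage you mount):

```yaml
version: "3"
services:
  web:
    image: my-rails-app            # placeholder image name
    environment:
      BUNDLE_PATH: vendor/bundle   # gems land under the app directory
    volumes:
      - bundle_cache:/app/vendor/bundle   # assumes the app lives in /app
volumes:
  bundle_cache:
```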
When using Rails inside a Docker container, several posts (including one on docker.com) use the following pattern:
In Dockerfile do ADD Gemfile and ADD Gemfile.lock, then RUN bundle install.
Create a new Rails app with docker-compose run web rails new.
Since we RUN bundle install to build the image, it seems appropriate to docker-compose build web after updating the Gemfile.
This works insofar as the gemset is updated inside the image, but:
The Gemfile.lock on the Docker host will not be updated to reflect the changes to the Gemfile. This is a problem because:
Gemfile.lock should be in your repository, and:
It should be consistent with your current Gemfile.
So:
How can one update the Gemfile.lock on the host, so it may be checked in to version control?
Executing the bundle inside run does update the Gemfile.lock on the host:
docker-compose run web bundle
However: You must still also build the image again.
Just to be clear, the commands to run are:
docker-compose run web bundle
docker-compose up --build
where web is the name of your Dockerized Rails app.
TL;DR - make the changes on the container, run bundle on the container and restart for good measure. Locally these changes will be reflected in your app and are ready to test/push out to git, and your production server will use it to rebuild.
Long; Read:
docker exec -it name_of_app_1 bash
vim Gemfile and add a line like gem 'sorcery', '0.9.0'; pinning the version ensures you get the one you're looking for
bundle to install just that version and record it in the current container's Gemfile and Gemfile.lock
So far this is fairly normal Rails workflow; you are just doing it on the running container. You don't have to worry about separately getting these changes into your git repo, because they are all happening on your local copy: open another terminal tab, go into your app, less Gemfile, and you should see the changes.
Now you can restart your running container. The image will get rebuilt (in my case by docker-compose up locally), tests should pass, and you can test in the browser at will.
Commit your changes to git and use your deploy process to check it out on staging.
Notes:
Check your Dockerfile, as the posts the OP links to describe. I'm assuming you have some kind of bundle or bundle install line in your Dockerfile.
Run bundle update inside the container, then rebuild.
$ docker-compose run web bundle update
$ docker-compose build
I'm experimenting with more cost effective ways to deploy my Rails apps, and went through the Ruby Starter Projects to get a feel for Google Cloud Platform.
It's almost perfect, and certainly competitive on price, but the deployments are incredibly slow.
When I run the deployment command from the sample Bookshelf app:
$ gcloud preview app deploy app.yaml worker.yaml --promote
I can see a new gae-builder-vm instance on the Compute Engine/VM Instances page and I get the familiar Docker build output - this takes about ten minutes to finish.
If I immediately redeploy, though, I get a new gae-builder-vm spun up that goes through the exact same ten-minute build process with no apparent caching from the first time the image was built.
In both cases, the second module (worker.yaml) gets cached and goes really quickly:
Building and pushing image for module [worker]
---------------------------------------- DOCKER BUILD OUTPUT ----------------------------------------
Step 0 : FROM gcr.io/google_appengine/ruby
---> 3e8b286df835
Step 1 : RUN rbenv install -s 2.2.3 && rbenv global 2.2.3 && gem install -q --no-rdoc --no-ri bundler --version 1.10.6 && gem install -q --no-rdoc --no-ri foreman --version 0.78.0
---> Using cache
---> efdafde40bf8
Step 2 : ENV RBENV_VERSION 2.2.3
---> Using cache
---> 49534db5b7eb
Step 3 : COPY Gemfile Gemfile.lock /app/
---> Using cache
---> d8c2f1c5a44b
Step 4 : RUN bundle install && rbenv rehash
---> Using cache
---> d9f9b57ccbad
Step 5 : COPY . /app/
---> Using cache
---> 503904327f13
Step 6 : ENTRYPOINT bundle exec foreman start --formation "$FORMATION"
---> Using cache
---> af547f521411
Successfully built af547f521411
but it doesn't make sense to me that these versions couldn't be cached between deployments if nothing has changed.
Ideally, I'm thinking this would go faster if I triggered a rebuild on a dedicated build server (which could remember Docker images between builds), which would then update a public image and ask Google to redeploy with the prebuilt image.
Here's the Docker file that was generated by gcloud:
# This Dockerfile for a Ruby application was generated by gcloud with:
# gcloud preview app gen-config --custom
# The base Dockerfile installs:
# * A number of packages needed by the Ruby runtime and by gems
# commonly used in Ruby web apps (such as libsqlite3)
# * A recent version of NodeJS
# * A recent version of the standard Ruby runtime to use by default
# * The bundler and foreman gems
FROM gcr.io/google_appengine/ruby
# Install ruby 2.2.3 if not already preinstalled by the base image
# base image: https://github.com/GoogleCloudPlatform/ruby-docker/blob/master/appengine/Dockerfile
# preinstalled ruby versions: 2.0.0-p647 2.1.7 2.2.3
RUN rbenv install -s 2.2.3 && \
rbenv global 2.2.3 && \
gem install -q --no-rdoc --no-ri bundler --version 1.10.6 && \
gem install -q --no-rdoc --no-ri foreman --version 0.78.0
ENV RBENV_VERSION 2.2.3
# To install additional packages needed by your gems, uncomment
# the "RUN apt-get update" and "RUN apt-get install" lines below
# and specify your packages.
# RUN apt-get update
# RUN apt-get install -y -q (your packages here)
# Install required gems.
COPY Gemfile Gemfile.lock /app/
RUN bundle install && rbenv rehash
# Start application on port 8080.
COPY . /app/
ENTRYPOINT bundle exec foreman start --formation "$FORMATION"
How can I make this process faster?
Well, you're kinda mixing up 2 different cases:
re-deploying the exact same app code: indeed, Google doesn't check whether there was any change in the app being deployed, in which case the entire Docker image could be re-used. But you already have that image, so effectively you don't even need to re-deploy, unless you suspect something went wrong and really insist on re-building the image (which is exactly what the deployment utility does). A rather academic case with little bearing on the cost-effectiveness of real-life app deployments :)
you're deploying different app code (it doesn't matter how different): aside from re-using cached artifacts during the image build (which happens, according to your build logs), the final image still needs to be built to incorporate the new app code. That is unavoidable; re-using the previously built image is not really possible.
Update: I missed your point earlier, upon a closer look at both your logs I agree with your observation that the cache seems to be local to each build VM (explained by the cache hits only during building the worker modules, each on the same VM where the corresponding default module was built beforehand) and thus not re-used across deployments.
Another Update: there might be a way to get cache hits across deployments...
The gcloud preview app deploy DESCRIPTION indicates that the hosted build could also be done using the Container Builder API (which appears to be the default setting!) in addition to a temporary VM:
To use a temporary VM (with the default --docker-build=remote
setting), rather than the Container Builder API to perform docker
builds, run:
$ gcloud config set app/use_cloud_build false
Builds done using the Container Builder API might use a shared storage, which might allow cache hits across deployments. IMHO it's worth a try.
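If I read that help text correctly, switching to the Container Builder API would then be a matter of flipping the same setting (the config key is taken from the help text above; whether its builds actually share a cache across deployments is unverified):

```shell
$ gcloud config set app/use_cloud_build true
```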
I'm trying to understand some of the details behind bundling for deployment that I can't wrap my head around. I've read a few posts on here such as this one:
What does Rails 3's Bundler "bundle install --deployment" exactly do?
and I feel I understand what it should do. On my computer, I ran bundle install initially and have been developing a project. However, I wanted to run it in deployment mode just to get a feel for how a production server like Heroku sets up the application.
Therefore, I started by running bundle install --deployment, which correctly installs all my gems into the local vendor/bundle local directory. However, when I run bundle show [GEM], I'm still seeing the path to my system gem. I feel it should be showing a path to the local folder, but it's not.
Can someone clear up on what my misconception is?
Have a look at the description of the two on Bundler's site.
bundle install --deployment is meant to be run in the production environment, but it still fetches the gems from rubygems when run. Read more here under the 'Deploying Your Application' heading for the purpose of the --deployment flag.
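One thing worth checking in the asker's situation: --deployment records its settings in .bundle/config, and bundle show resolves paths from there. A sketch of what that file normally contains after bundle install --deployment (the exact keys and value formats vary by Bundler version):

```yaml
BUNDLE_FROZEN: "true"
BUNDLE_PATH: "vendor/bundle"
```

If BUNDLE_PATH is missing there, or overridden by an environment variable, that would explain bundle show still printing system gem paths.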
Does anybody know how to make bundle install caching work in the latest Docker release?
I've tried so far:
1.
WORKDIR /tmp
ADD ./Gemfile Gemfile
ADD ./Gemfile.lock Gemfile.lock
RUN bundle install
2.
ADD . opt/railsapp/
WORKDIR opt/railsapp
RUN bundle install
Neither of them works; it still runs bundle install from scratch every time, even though the Gemfile hasn't changed.
Does anyone know how to make caching for bundle install work correctly?
Cheers,
Andrew
Each time you change any file in your local app directory, the cache will be invalidated at the ADD . step, forcing every step after it to be re-run, including the final bundle install.
The solution is don't run bundle install in step 2. You have already installed your gems in step 1 and there is little chance the Gemfile will change between step 1 and step 2 ;-).
The whole point of step 1 is to add your Gemfile, which should not change too often, so you can cache it and the subsequent bundle command before adding the rest of your app, which will probably change very often if you are still developing it.
Here's what the Dockerfile could look like:
1.
WORKDIR /tmp
ADD ./Gemfile Gemfile
ADD ./Gemfile.lock Gemfile.lock
RUN bundle install
2.
ADD . opt/railsapp/
WORKDIR opt/railsapp
Versions of Docker before 0.9.1 did not cache ADD instructions. Can you check that you're running a version of Docker 0.9.1 or greater?
Also, which installation of Docker are you using? According to this GitHub issue, some users have experienced cache-busting ADD behavior when using unsupported Docker builds. Make sure you're using an official Docker build.
ADD caching is based on all the metadata of the file, not just the contents.
If you are running docker build in a CI-like environment with a fresh checkout, it is possible the timestamps of the files are being updated, which would invalidate the cache.
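One way to work around that in a CI checkout is to pin the mtimes of the files ADDed before bundle install, so they hash identically on every build. A sketch with stand-in files (the timestamp value is arbitrary; it just has to be the same on every run):

```shell
# Give the bundle-related files a fixed mtime so Docker's ADD cache key,
# which includes file metadata, is identical on every fresh checkout.
cd "$(mktemp -d)"                          # stand-in for the CI workspace
touch Gemfile Gemfile.lock                 # stand-ins for the real files
touch -t 201601010000 Gemfile Gemfile.lock # pin both mtimes
stat -c %Y Gemfile Gemfile.lock            # identical mtimes on every run
```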