What would be the best way for all Dask (distributed) workers / schedulers to understand a custom git repository's python modules?
It would be a plus if the new commits to the git repository are reflected on the Dask workers / schedulers the same way.
I have tried the following things:
(1) Using the client.upload_file API to copy the files from the master node to the worker nodes. Copying the files individually loses the module / directory structure, so zipping the files and then uploading the archive could work (see the sketch after this list). But updates to the git repository wouldn't be reflected in the zipped copies on the master and worker nodes.
(2) (From Amazon EMR) In the bootstrap script, I included "pip install git+https://github.com/my_repo.git", so that all nodes would have the repository upon cluster instantiation. But, as in (1), updates to the git repository wouldn't be reflected in the packages installed under site-packages/.
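For concreteness, here is roughly what approach (1) looks like; the scheduler address, repository, and package names below are placeholders:

```python
import shutil
from dask.distributed import Client

client = Client("tcp://scheduler-address:8786")  # placeholder scheduler address

# Zip the package directory of the git checkout so the module structure survives.
# "my_repo" / "my_pkg" are placeholders for the cloned repo and the package inside it.
archive = shutil.make_archive("my_pkg", "zip", root_dir="my_repo", base_dir="my_pkg")

# upload_file ships the archive to every worker and adds it to their sys.path,
# so `import my_pkg` works inside tasks. After new commits, the zip has to be
# rebuilt and re-uploaded (and workers may need a restart to drop cached modules).
client.upload_file(archive)
```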
Dask does not manage user software environments. Typically people handle this with Docker images or Network File Systems (NFS)
Related
I have learnt the basics of GitHub and Docker, and both work well in my environment. On my server, I have project directories, each with a docker-compose.yml to run the necessary containers. These project directories also contain the actual source files for that particular app, which are mapped to virtual locations inside the containers upon startup.
My question now is: how do I create a professional workflow to encapsulate all of this? Should the whole directory (including the docker-compose files) live on GitHub? Thus each time changes are made I push the code to my remote, SSH into the server, pull the latest files, and rebuild the container. This rebuilding of course means pulling the required images from Docker Hub each time.
Should the whole directory (including the docker-compose files) live on GitHub?
It is best practice to keep all source code, including Dockerfiles and configuration, versioned. Thus you should put all of the source code, the Dockerfile, and the docker-compose file in a git repository. This is very common for projects on GitHub that ship a Docker image.
Thus each time changes are made I push the code to my remote, SSH to the server, pull the latest files and rebuild the container
Ideally this process should be encapsulated in a CI workflow using a tool like Jenkins. You basically push the code to the git repository, which triggers a Jenkins job that compiles the code, builds the image, and pushes the image to a Docker registry.
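As a rough illustration of what that job's build step might do (the registry host and image name here are made-up placeholders, and a real setup would usually express this as a Jenkins pipeline rather than a Python script):

```python
import subprocess

# Placeholders: the registry host and image name are assumptions for this sketch.
REGISTRY = "registry.example.com"
IMAGE = f"{REGISTRY}/myapp"

# Tag the image with the current commit so every build is traceable.
commit = subprocess.check_output(["git", "rev-parse", "--short", "HEAD"], text=True).strip()
tag = f"{IMAGE}:{commit}"

# Build from the Dockerfile in the repository root, then push to the registry.
subprocess.run(["docker", "build", "-t", tag, "."], check=True)
subprocess.run(["docker", "push", tag], check=True)
```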
This rebuilding of course means pulling the required images from dockerhub each time.
Docker is smart enough to cache the base images that have been previously pulled. Thus it will only pull the base images once on the first build.
I am new to Kubernetes, so I'm wondering: what are the best practices when it comes to putting your app's source code into a container run in Kubernetes or a similar environment?
My app is written in PHP, so I have PHP (FPM) and Nginx containers (running on Google Container Engine).
At first, I had a git volume, but there was no way of changing app versions like this, so I switched to an emptyDir and kept my source code in a zip archive in one of the images, which would unzip it into this volume upon start. Now I have the source code separately in both images via git, with a separate git directory, so I have /app and /app-git.
This is good because I do not need to share or configure volumes (less resources and configuration), the app's layer is reused in both images so there is no impact on space, and since it is git the "base" is built in, so I can simply adjust my Dockerfile command at the end and switch to a different branch or tag easily.
I wanted to download an archive with the source code directly from the repository by providing credentials as arguments during the build process, but that did not work because my repository host, Bitbucket, creates archives with the last commit ID appended to the directory name, so there was no way of knowing what unpacking the archive would produce. So I got stuck with git itself.
What are your ways of handling the source code?
Ideally, you would use continuous delivery patterns, which means using Travis CI, Bitbucket Pipelines, or Jenkins to build the image on every code change.
That is, every time your code changes, your automated build gets triggered and builds a new Docker image, which will contain your source code. Then you can trigger a Deployment rolling update to update the Pods with the new image.
If you have dynamic content, you would likely put this in persistent storage, which will be re-mounted on Pod update.
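For example, a minimal sketch of triggering that rolling update with the official Kubernetes Python client, assuming a Deployment and container both named myapp in the default namespace (names and image tag are placeholders):

```python
from kubernetes import client, config

# Load credentials from ~/.kube/config (in-cluster config would also work).
config.load_kube_config()
apps = client.AppsV1Api()

# The image tag would come from the CI build; this one is a placeholder.
new_image = "registry.example.com/myapp:abc1234"
patch = {"spec": {"template": {"spec": {"containers": [{"name": "myapp", "image": new_image}]}}}}

# Changing the Pod template triggers a rolling update of the Deployment's Pods.
apps.patch_namespaced_deployment(name="myapp", namespace="default", body=patch)
```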
What we've done traditionally with PHP is an overlay at runtime. Basically the container will have a volume mounted to it with deploy keys to your git repo. This allows you to perform git pull operations.
The more buttoned-up approach is to have custom, tagged images of your code, extended from fpm or whatever image you're using. That way you would run version 1.3 of YourImage, where YourImage would contain version 1.3 of your application's code.
Try to leverage continuous integration and continuous deployment. You can use Jenkins as a CI/CD server and create jobs for building the image, pushing the image, and deploying the image.
I recommend putting your source code into the Docker image instead of pulling it from the git repo. You can also extract configuration files from the Docker image. Kubernetes v1.2 introduced a new feature, ConfigMap, so you can put configuration files into a ConfigMap. When a Pod runs, the configuration files are mounted automatically. It's very convenient.
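As a rough sketch of that idea (the file path, ConfigMap name, and namespace are placeholders), a configuration file can be stored in a ConfigMap with the Python client and then mounted into the Pod:

```python
from kubernetes import client, config

# Load credentials from ~/.kube/config (in-cluster config would also work).
config.load_kube_config()
core = client.CoreV1Api()

# Read an application config file (placeholder path) and store it in a ConfigMap.
with open("config/app.ini") as f:
    body = client.V1ConfigMap(
        metadata=client.V1ObjectMeta(name="myapp-config"),
        data={"app.ini": f.read()},
    )
core.create_namespaced_config_map(namespace="default", body=body)

# The Pod spec would then mount this ConfigMap as a volume, e.g. via
# V1Volume(config_map=V1ConfigMapVolumeSource(name="myapp-config")) mounted at /etc/myapp.
```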
I don't like the way I release my projects to the production server. Maybe I just don't have enough experience; nobody taught me how to do this the right way.
For now I have several repos with Scala (on top of Spray). I have everything needed to build and run these projects on my local machine (of course, since I develop them). So I installed Jenkins on my production server in order to sync from git, build, and run. It works for now, but I don't like it, because I need to install Jenkins on every machine where I want to run my projects. What if I want to show my project to a friend in a cafe?
So I've come up with an idea: what if I run the tests before building the app, make a portable build (e.g. with sbt-native-packager), and save it on a remote "release server"? That server just keeps these ready-to-launch apps.
Then I go to the production server and run a bash script that downloads the executables from the release server and runs my project on that machine.
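To make the idea concrete, here is a rough Python equivalent of such a download-and-run script; the release-server URL, archive name, and launcher path are made up for this sketch:

```python
import subprocess
import tarfile
import urllib.request

# Hypothetical release-server URL and archive name for this sketch.
URL = "https://releases.example.com/myapp/myapp-1.0.0.tgz"
ARCHIVE = "myapp-1.0.0.tgz"

# Download the packaged build produced by sbt-native-packager.
urllib.request.urlretrieve(URL, ARCHIVE)

# Unpack it; the packager puts a launcher script under <app>-<version>/bin/.
with tarfile.open(ARCHIVE) as tar:
    tar.extractall(".")
subprocess.run(["./myapp-1.0.0/bin/myapp"], check=True)
```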
In the future I want to:
download and run projects inside Docker containers.
keep ready-to-serve static files for the frontend, and run a Docker container with nginx and a linked volume containing the static files.
I have heard about Nexus (http://www.sonatype.org/nexus/), which artists use to save their songs, images, and so on. I believe there should be open-source projects built around an idea like mine.
Any help is appreciated!
A common anti-pattern, in my opinion, is to build the software every time you perform a deployment. You are best advised to separate the build process from the act of deployment by introducing a binary repository manager (you've mentioned one such example, Nexus).
Best Practice - Using a Repository Manager
Binary repository manager
How can I automatically deploy a war from Nexus to Tomcat?
Only successfully tested builds get pushed to the repository, so you can treat each successful build as a mini-release. A by-product of this is that your production server does not have to have all the build software pre-installed (like Jenkins, Ant, Maven, etc.).
It should be noted that modern repository managers like Nexus and Artifactory now support Docker registries too, so you can use these for deploying Docker images as well.
Update
A related Chef question (Chef being a technology where there is no intermediate binary file, like a jar): in this case the software is still "released" by creating a tar distribution that is stored in the repository.
chef cookbook delivery - chef server vs. artifactory + berkshelf
A web application typically consists of code, config, and data. The code can often be made open source on GitHub, but per-instance config and data may contain secrets and are therefore inappropriate to store on GitHub. The data can be imported into persistent storage, so disregard it for now.
Assuming the configs are file-based and are kept in another private, secured SVN repo, in order to deploy the web app to OpenShift and implement CI, I need to merge the config files with the code prior to running the build scripts. In addition, the build strategy should support GitHub webhooks for automated builds.
My questions are, to be more specific:
Does the OpenShift BuildConfig support multiple data sources, especially from SVN?
If not, how can such a web app be deployed to OpenShift?
The solution I came up with so far:
Instead of relying on OpenShift for CI, use Jenkins.
Merge config files with code using Jenkins.
Instead of using the Git source type in the BuildConfig, use the binary source type.
Let Jenkins run
oc start-build --from-dir=<directory>
where <directory> contains the merged code and config.
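A rough sketch of what that Jenkins step could do, assuming hypothetical repository URLs, paths, and build name:

```python
import subprocess

# Placeholder locations for this sketch.
CODE_REPO = "https://github.com/example/webapp.git"
CONFIG_REPO = "https://svn.example.com/config/webapp"
WORKDIR = "merged"

# Check out the code from GitHub and export the config from the private SVN repo.
subprocess.run(["git", "clone", "--depth", "1", CODE_REPO, WORKDIR], check=True)
subprocess.run(["svn", "export", "--force", CONFIG_REPO, f"{WORKDIR}/config"], check=True)

# Feed the merged directory to the binary build as a single build source.
subprocess.run(["oc", "start-build", "webapp", f"--from-dir={WORKDIR}", "--follow"], check=True)
```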
I have 2 rails applications that live inside the same git repo.
There is a shared folder where common logic lives.
- app_1
- shared
- app_2
The shared folder is really just a symlink to the appropriate places inside the app_1 folder. There is also a shared_public folder that is symlinked to app_1/public/files and app_2/public/files.
How can I do this? I'm open to anything; it's a clean slate. The project was never deployed previously, so I don't have an existing infrastructure to rely on. And splitting the shared logic out is (unfortunately) not an option currently, because of the timeframe I have to work with.
Git
When you mention that the shared folder is a symlink: the symlink only exists at the operating-system level, not in git.
Since git is just a deployment mechanism in this instance (i.e. it will place the files from your repo onto your server), you'll probably be able to do the following:
Initialize a git repo on your server ($ git init on your server)
Clone your GitHub repo locally (git clone https://github.com.... on your local box)
cd into your new folder and add the server's repo as a remote
$ git push [server repo name] master
This isn't what you want, I know.
It will push your files onto your server, so you'll get the following folder structure:
app1
app2
The shared folder could then be created on your server itself.
If you have an appropriate server setup, you should be able to get this running by performing these steps.
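For example, the shared symlink could be recreated on the server after the push, roughly like this (the deploy path and the link target are assumptions):

```python
import os

# Hypothetical deploy path; the exact link target depends on the original layout.
DEPLOY_ROOT = "/var/www/myproject"

# Recreate the `shared` symlink pointing into app_1, mirroring the local checkout.
link = os.path.join(DEPLOY_ROOT, "shared")
target = os.path.join(DEPLOY_ROOT, "app_1", "shared")  # assumed target
if not os.path.islink(link):
    os.symlink(target, link)
```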
Capistrano
If you want to use Capistrano, you'll have to do something a little more involved, as it does more than just push your files to your server.
You'll have to split your app1 and app2 into separate applications and deploy them individually. This will still allow you to create a symlink on your server, except you'll have a slightly different structure to your directories.