yarn workspaces & monorepos & monobuilds: how to build only what changed, plus its dependents?

There are a few things commonly done in monorepos/monobuilds (you can have a monorepo with no monobuild) that make things very nice, but I don't see how yarn workspaces solves them just yet. The main one is this part of a mono build process (very typical at scale):
git status to figure out which files changed
map those files to the projects that changed
build those projects, plus the projects that depend on them, and the projects that depend on those
I am a little confused there. As a monobuild scales up, we really want the build time for a server change to be under 3 minutes, while a change to a library that may affect all projects will take a long time since it builds the entire repo (unless we split the work across machines, in which case build times come way down again).

Don't think there is necessarily one answer here but a number of things to consider in the context of your project:
If your project is truly enormous, consider something like Bazel, which is a bit complex but allows for incremental building and testing.
There are some specific tools to help with building large projects quickly. For instance, for JavaScript, there are Turborepo and Nx.
Yarn Workspaces or npm workspaces can generally help enable better monorepo build processes by allowing us to run build scripts for only a subset of workspaces. They won't solve the problem of figuring out what to build when, though; they just provide the basic building block of running scripts selectively.
Finally, a bit of Bash/Git/Makefile magic will probably be required. The following Git command, for instance, can tell us whether files under particular paths have changed since the last commit: git diff --quiet HEAD~1 HEAD -- [paths]. Note that this can create a few annoying edge cases, especially when builds fail, and we risk missing out on building projects that we should build (see the sketch below).
There are plugins for some CI/CD platforms that wrap the Git commands in a somewhat easier-to-use way. For instance, I have used the GitHub Action has-changed-path, and I think there was a plugin for Buildkite too, but I cannot find the link to it.
Generally, I think it will be challenging to have a monorepo setup that avoids installing dependencies for all modules/workspaces and compiling all code. But I think it is possible to scale up to a few hundred thousand lines of code and hundreds of dependencies while keeping install and compile times under 2-3 minutes with TypeScript and Yarn, when making good use of TypeScript project references and something like Yarn Zero-Installs.
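To make that concrete, below is a minimal sketch of the change-detection step discussed above. Everything in it is an assumption for illustration: one workspace per packages/* directory, a build script in each workspace runnable via yarn workspace <name> run build (Yarn 1 syntax), and comparing against HEAD~1 rather than against the last successfully built commit. It also stops short of walking the dependency graph to rebuild dependents, which a real monobuild would need:

    #!/usr/bin/env bash
    set -euo pipefail

    # Hypothetical layout: every directory under packages/ is a workspace
    # whose package.json "name" matches its directory name.
    for dir in packages/*/; do
      name=$(basename "$dir")
      # git diff --quiet exits non-zero when files under $dir changed
      # between the previous commit and HEAD.
      if ! git diff --quiet HEAD~1 HEAD -- "$dir"; then
        echo "changed: $name - rebuilding"
        yarn workspace "$name" run build
        # A real monobuild would now also rebuild every workspace that
        # depends on $name, transitively (e.g. by reading each
        # package.json to construct a reverse dependency graph).
      fi
    done

If the workspaces are wired up as TypeScript project references, the per-workspace build script can simply be tsc --build, which is itself incremental and will rebuild downstream referenced projects as needed.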

Related

Does Jenkins support incremental pipeline builds?

I have been searching far and wide to see if I can find information on Jenkins incremental pipeline builds that does not involve Maven.
The general idea is that I want to build a generic project and run specific steps of the pipeline if the underlying code has changed. If the code did not change, I want to re-use the results from a previous build.
The reason why I want to do this, is to drastically reduce build times for huge projects.
Imagine that you only need to fix 1 line in an SCSS file, but the whole project needs to be rebuilt, repackaged, etc. because of this. In the meantime, the site is live and broken, waiting 15 minutes for a fix.
Can someone give a basic example of how such a build can be created or where I can find more information on incremental building?
The only thing I have been able to find is incremental building for Maven projects, but this is not applicable for me.
The standard solution is to create modules that depend on each other.
Publish the built artifacts of your modules to a binary repository like Sonatype Nexus (you can easily create a private npm repo as well as a proxy npm repo).
During the build, download the dependencies instead of building them.
If this solution is not the one you want to take, you will have a hard time hacking one together. To persist the state of your steps, an easy solution is to create files in the job workspace and read them in the next build.
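As an illustration of that workspace-file approach, here is a minimal sketch; the directory name, state-file name, and build command are all hypothetical. It fingerprints the tracked sources of one part of the project and skips the rebuild when the fingerprint matches the one recorded by the previous build:

    #!/usr/bin/env bash
    set -euo pipefail

    SRC_DIR=frontend            # assumed source directory
    STAMP=.last-build-hash      # state file persisted in the job workspace

    # Hash all tracked file contents under SRC_DIR into one fingerprint.
    current=$(git ls-files -s -- "$SRC_DIR" | git hash-object --stdin)

    if [ -f "$STAMP" ] && [ "$(cat "$STAMP")" = "$current" ]; then
      echo "No changes in $SRC_DIR - reusing previous build output"
    else
      echo "Changes detected - rebuilding $SRC_DIR"
      npm --prefix "$SRC_DIR" run build   # assumed build command
      echo "$current" > "$STAMP"
    fi

The fragile part is the state file: it must only be written after a successful build (as done here), otherwise a failed build would be wrongly skipped on the next run.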

NuGet restore packages crashes when two parallel threads attempt to restore same package

I have a build where multiple parallel stages each start out with a NuGet restore, before doing different stuff (build and run tests, build for iOS, build for Android). The restore is executed in each stage, since they can run on different build agents. However, since our CI setup has two executors per agent, they can also end up being executed on the same agent, and this is where my issue occurs.
When NuGet comes across a package that is not in the global packages directory (~/.nuget/packages, since I'm building on Macs), it will attempt to install it, and this tends to happen concurrently in the two parallel stages, causing an error to occur in either or both stages. The error message will be along the lines of:
[Stage1] Installing BtDriver 1.0.0.
[Stage1] WARNING: Error downloading 'BtDriver.1.0.0' from 'https://MyArtifactory/api/nuget/BtDriver/1.0.0'.
[Stage1] Directory /Users/MyUser/.nuget/packages/btdriver/1.0.0/lib is not empty
Or from the other stage:
[Stage2] Installing BtDriver 1.0.0.
[Stage2] WARNING: Error downloading 'BtDriver.1.0.0' from 'https://MyArtifactory/api/nuget/BtDriver/1.0.0'.
[Stage2] /Users/MyUser/.nuget/packages/btdriver/1.0.0/g45y07q7.6ap does not exist
I have looked far and wide for a solution to this, but so far I have been unable to find anyone running into the same issue, leading me to believe that I may have missed something obvious, so I hope someone can lead me in the right direction.
Bonus info: I'm using Jenkins to assign the agents and orchestrate the build and NuGet Restore is invoked using Cake's NuGetRestore() method, but I'm able to reproduce using only 'nuget restore' from two separate terminals at the same time, so I'm assuming the error does not lie with Jenkins or Cake, although solutions involving either will be welcome.
I have considered adding a small delay to one of the stages, so there is a smaller chance of the two restores executing concurrently, but I would prefer a more robust solution. Limiting the number of executors to one per agent is not feasible either.
We have many CI agents running on the same hosts.
We eventually got rid of this problem with a custom MSBuildWithMutex task:
https://gist.github.com/dedale/675ec80313f2a70266deb0ab78a0e2c6
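A shell-level variant of the same idea, for anyone not invoking MSBuild directly, is to serialize the restores behind a file lock. This is only a sketch: it assumes a Linux agent with the flock utility from util-linux (macOS, which the question targets, does not ship flock by default), and the solution file name is purely illustrative:

    # Take an exclusive lock on a host-wide lock file for the duration
    # of the restore, so two executors on the same host never restore
    # into the global packages directory concurrently.
    flock /tmp/nuget-restore.lock nuget restore MySolution.sln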
For what it's worth, I've had partial success telling nuget/msbuild to use different paths for its caches. See https://learn.microsoft.com/en-us/nuget/consume-packages/managing-the-global-packages-and-cache-folders for environment variables.
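Concretely, the per-run cache split can be done with the documented NUGET_PACKAGES and NUGET_HTTP_CACHE_PATH environment variables; EXECUTOR_NUMBER below is Jenkins-specific and merely illustrates deriving a unique path per executor:

    # Give this executor private caches so parallel restores on the
    # same host never write to the same directories.
    export NUGET_PACKAGES="$HOME/.nuget/packages-$EXECUTOR_NUMBER"
    export NUGET_HTTP_CACHE_PATH="$HOME/.nuget/http-cache-$EXECUTOR_NUMBER"
    nuget restore MySolution.sln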
However, once past that issue I sometimes see socket errors with nuget restores running in parallel. I believe that each nuget process is fighting for access to the remote server and causing timeouts.
I am looking for a clean approach - perhaps convincing Jenkins to use a different home directory for each executor, so that I don't have to run extra code in each job, but I haven't figured out how to do that yet.

How to install (complex) dependencies in Travis-CI?

I would like to set up a documentation CI build, i.e. a build that requires nothing more than AsciiDoc, TeX, XSLT (Saxon), et cetera.
Now I am aware of [1], which states that regular apt commands can be used to install, hopefully, any of these dependencies.
But how to do so? It appears cumbersome to change .travis.yml, push a build, and start again if there was a typo or other error in the install command.
Thus I was looking into 'travis console' to (somehow) interactively test the dependency setup process - with no luck.
What is the recommended way of setting up dependencies (packages)?
Edit:
The document generation process is driven by a simple hand-crafted Makefile. The Makefile invokes various programs, especially asciidoc, Python, TeX, DBLaTeX, libxslt, and Saxon. Basic TeX is not enough, as some fancy TeX packages are required as well. The installation of DBLaTeX is naturally cumbersome.
[1] http://docs.travis-ci.com/user/installing-dependencies
If you want to run Travis locally on your own virtual machine, you may want to look at Travis Build. Travis Build allows you to generate the shell script that performs the Travis build. Setting this up is a bit cumbersome and may not be worth it unless you have a very complicated build.
The documentation build that you're describing seems relatively straightforward (although you're not giving us many details). I'd say you should be able to put those dependencies together by trial and error.
There's also a middle ground between Travis Build and pure trial-and-error. Use Vagrant to set up a virtual machine with Ubuntu Precise (same version as Travis is using). Then figure out which packages you need to install (apt-get install ...) to get your build running on the virtual machine. Then replicate those steps in your .travis.yml and you should be good to go.
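The interactive loop on the Vagrant VM might start out like the commands below; the Ubuntu package names are educated guesses based on the tools listed in the question and may need adjusting (dblatex in particular pulls in a large TeX installation):

    # Run these by hand inside the Vagrant VM until the Makefile
    # succeeds, then copy the working lines into the install/
    # before_install section of .travis.yml.
    sudo apt-get update
    sudo apt-get install -y asciidoc dblatex libxslt1-dev \
        texlive texlive-latex-extra python
    make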

Build Rails+SPA as a distribution prior to deployment

I have a small but growing Angular application that runs with a Rails backend.
Deployment right now takes a very long time because all of the devDependencies have to be installed and then everything has to be compiled.
What I'd like to do is have a distribution created for my application, and then deploy from that. I think I'll want this anyhow, since the application is going to be downloaded by users, and I'd rather they not have to deal with installing the mass of npm modules and gems that should only be needed for development.
Is Jenkins going to fit the bill here? The tasks I see that need to be accomplished to create a new distribution are:
Lint the code with JSHint, Rubocop, Brakeman, ...
Compile and/or compress the JavaScript, Sass, and images
Run the Karma and RSpec tests
Rev the files
Clean up any temporary or unnecessary files
Create git tag
Commit build
Also, is it odd to want to commit the builds to a {original}-dist repository?
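Committing builds to a separate {original}-dist repository is a fairly common pattern, since it keeps generated artifacts out of the source repo's history. A rough sketch of the tagging and publishing steps at the end of such a Jenkins job, with every name and path hypothetical, might be:

    #!/usr/bin/env bash
    set -euo pipefail

    # Assumed: the lint/compile/test/rev steps above have already
    # written the final output into ./dist
    VERSION=$(git rev-parse --short HEAD)

    git clone git@example.com:me/myapp-dist.git /tmp/myapp-dist
    rm -rf /tmp/myapp-dist/*          # keeps .git, clears old artifacts
    cp -R dist/. /tmp/myapp-dist/

    cd /tmp/myapp-dist
    git add -A
    git commit -m "Build of myapp@$VERSION"
    git tag "build-$VERSION"
    git push origin master --tags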

Automated Deployment in Rails

I'm working on my first Rails app and am struggling to find an efficient and clean solution for automated checkouts and deployments.
So far I've looked at both CruiseControl.rb (having been familiar with CruiseControl.NET) and Capistrano. Unfortunately, unless I'm missing something, each one of them only does about half of what I want (with each one doing a different half).
From what I've seen so far:
CruiseControl
Strengths
Automated builds on repository checkouts upon commit
Also runs unit/functional tests and reports back
Weaknesses
No built-in deployment mechanism (the best I can find so far is writing your own Bash scripts)
Capistrano
Strengths
Built for deployments
Weaknesses
Has to be kicked off via a command (i.e. doesn't do automated checkouts upon commit)
I've found ways that I can string the two together -- i.e. have CruiseControl ping the repository for changes, do a checkout upon commit, run the tests, etc. and then make a call to Capistrano when finished to do the deployment (even though Capistrano is also going to do a repository checkout).
Basically, when all is said and done, I'd like to have three projects set up:
Dev: Checkout/Deployment is entirely no touch. When someone commits a file, something checks it out, runs the tests, deploys the changes, and reports back
Stage: Checkout/Deployment requires a button click
Prod: Button click does either a tagged check out or moves the files from stage
I have this working with a combination of CruiseControl.NET and MSBuild in the .NET world, and it was fairly straightforward. I would guess this is also a common pattern in the ruby deployment world, but I could easily be mistaken.
I would give Hudson a try (free and open source). I started off using CruiseControl but got sick of having to relearn the XML configuration every time I needed to change a setting or add a project. Then I started using Hudson and never looked back. Hudson is more or less completely configurable over the web. It was initially a continuous integration tool for Java but has plugins for other development stacks such as .NET and Ruby on Rails. There's a Rake plugin. If that doesn't work, you can configure it to execute any arbitrary command line after running your Rake builds/tests.
I should also add it's extremely easy to get Hudson going:
java -jar hudson.war
Or you can drop the war in any servlet container.
I would use two systems to build and deploy anyway, for at least two reasons: you should be able to run them separately, and you should have two config files, one for deploy and one for build. But you can easily glue the two systems together.
Just create a simple Capistrano task that tests and reports back to you. You can use the "run" command to do anything you want.
If you don't want any command-line tool, there was Webistrano a couple of years ago.
You could use something like http://github.com/benschwarz/gitnotify/tree/master to trigger the build/deploy if you use Git as your repository.
At least for automated development deployments, check out the hook scripts available in Git:
http://git-scm.com/docs/githooks
I think you'll want to focus on the post-receive hook script, since this runs after a push to a remote server.
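A bare-bones post-receive hook for such a no-touch dev deployment could look like the sketch below; the deploy path, branch, and Passenger-style restart are all assumptions to adapt to your stack:

    #!/usr/bin/env bash
    set -euo pipefail

    # post-receive runs inside the bare repo after a push: check the
    # deploy branch out into the live directory, then restart the app.
    DEPLOY_DIR=/var/www/myapp          # assumed deploy location
    BRANCH=master                      # assumed deploy branch

    GIT_WORK_TREE="$DEPLOY_DIR" git checkout -f "$BRANCH"

    cd "$DEPLOY_DIR"
    bundle install --deployment --without development test
    bundle exec rake db:migrate RAILS_ENV=production
    touch tmp/restart.txt              # Passenger restarts on this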
Mislav's git-deploy on GitHub is also worth checking out; it makes managing deployments pretty clean.
http://github.com/mislav/git-deploy
