Crucible - Smart commits do not work for large repositories

Smart commits either do not work or work with a lag on large repositories. Reviews are created sporadically, and smart commits work on certain branches but not others. Have you experienced such issues, and if so, what would be an option to try?

Large repositories take a long time to index, especially if you have a lot of feature branches. The lag occurs because Crucible needs time to index all that data.
You can try two things. First, you can force Crucible to run an index of the large repository ahead of time.
Second, if your repository is large because you have too many feature branches, try cleaning up your feature branches after a release. We ran into this issue ourselves: once we started cleaning up old feature branches, indexing ran a lot faster and the lag dropped significantly.
I have not seen smart commits fail on specific branches. However, if Crucible is configured to monitor a specific branching scheme and your new branch does not match that pattern, Crucible will not monitor it. For example, if Crucible is set up to monitor all branches whose names start with "feature" and your branch name does not start with "feature", it will not be monitored.
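To check which of your branches would fall outside such a naming pattern, a minimal Python sketch follows. It is not Crucible's own matching logic; the pattern list and branch names are hypothetical, and matching is assumed to be case-sensitive.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern list: branches whose names start with "feature".
MONITORED_PATTERNS = ["feature*"]


def is_monitored(branch, patterns=MONITORED_PATTERNS):
    """Return True if the branch name matches any monitored pattern (case-sensitive)."""
    return any(fnmatchcase(branch, pattern) for pattern in patterns)


# Hypothetical branch names, for illustration only.
branches = ["feature/login", "feature-123", "bugfix/typo", "Feature/caps"]
unmonitored = [b for b in branches if not is_monitored(b)]
```

Any branch that ends up in unmonitored would need either a rename or a broader pattern before commits on it are picked up.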

Related

How do I structure Jobs in Jenkins?

I have been tasked with setting up automated deployment and, after some research, settled on Jenkins to get the job done. Prior to this I had approximately zero knowledge of Jenkins, beyond hearing the name. I have no real knowledge of DevOps beyond what I have learnt in the last couple of weeks; no formal training, no actual books, just Google searches.
We are not running a full-blown/classic CI/CD process; this is a business decision. The basic requirements are:
Source code will be stored in GitHub.
Pull requests must be peer approved.
Pull requests must pass build/unit/db deploy tests.
Commits to specific branches must trigger a deployment to a related specific environment (Production, Staging or Development).
The basic functionality that I am attempting to support covers (what I currently see as) two separate processes:
On creation of a pull request, application is built, unit tests run, and db deploy tested. Status info must be passed to GitHub.
On commit to one of three specific branches (master, staging and dev) the application should be built, and deployed to one of three environments (production, staging and dev).
I have managed to cobble together a pipeline that does the first task rather well. I am using the generic web hook trigger, and manually handling all steps using a declarative pipeline stored in source control. This works rather well so far and, after much hacking, I am quite happy with the shape of it.
I am now starting work on the next bit, automated deployment.
On to my actual question(s).
In short, how do I split this up into Jobs in Jenkins?
To my mind, there are 1, 2 or 4 Jobs to be created:
One Job to Rule them All
This seems sub-optimal to me, as the pipeline will include relatively complex conditional logic and, depending on whether the Job is triggered by a Pull Request or a Commit, different stages will be run. The historical data will be so polluted as to be near useless.
OR
One job for handling pull requests
One job for handling commits
Historical data for deployments across all environments will be intermixed. I am a little concerned that I will end up with >1 Jenkinsfile in my repository. Although I see no technical reason why I can't have >1 Jenkinsfile, every example I see uses a single file. Is it OK to have >1 Jenkinsfile (Jenkinsfile_Test & Jenkinsfile_Deploy) in the repository?
OR
One job for handling pull requests
One job for handling commits to Development
One job for handling commits to Staging
One job for handling commits to Production
This seems to have some benefit over the previous option, because historical data for deployments into each environment will not be cross polluting each other. But now we're well over the >1 Jenkinsfile (perceived) limit, and I will end up with (Jenkinsfile_Test, Jenkinsfile_Deploy_Development, Jenkinsfile_Deploy_Staging and Jenkinsfile_Deploy_Production). This method also brings either extra complexity (common code in a shared library) or copy/paste code reuse, which I certainly want to avoid.
My primary objective is for this to be maintainable by someone other than myself, because Bus Factor. A real Devops/Jenkins person will have to update/manage all of this one day, and I would strongly prefer them not to suffer from my ignorance.
I have done countless searches, but I haven't found anything that provides the direction I need here. Searches for best practices make no mention of handling >1 Jenkinsfile, instead focusing on the contents of a single pipeline.
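For illustration, the routing described in the question (pull requests go to a test job; commits to master, staging and dev go to a deployment for the matching environment) can be sketched in Python. This is not Jenkins configuration: the event labels and job names are hypothetical, and only the branch-to-environment mapping comes from the question.

```python
# Branch-to-environment mapping taken from the question.
BRANCH_ENVIRONMENTS = {
    "master": "production",
    "staging": "staging",
    "dev": "development",
}


def select_job(event, branch):
    """Return the (hypothetical) job name for a webhook event, or None to ignore it."""
    if event == "pull_request":
        return "test"
    if event == "push" and branch in BRANCH_ENVIRONMENTS:
        return f"deploy-{BRANCH_ENVIRONMENTS[branch]}"
    return None
```

Keeping the mapping in one table means the four-job layout differs only in data, which is the part that could later move into shared pipeline code.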

After further research, I have found an answer to my core question. This might not be the absolute correct answer, but it makes sense to me, and serves my needs.
While it is technically possible to have >1 Jenkinsfile in a project, that does not appear to align with best practices.
The best practice appears to be to create a separate repository for each Jenkinsfile, which maps 1:1 with a Job in Jenkins.
To support my specific use case I have removed the Jenkinsfile from my main source code repository. I then created four new repositories:
Project_Jenkinsfile_Test
Project_Jenkinsfile_Deploy_Development
Project_Jenkinsfile_Deploy_Staging
Project_Jenkinsfile_Deploy_Production
Each repository contains a single Jenkinsfile and a readme.md that, in theory, contains useful information.
This separation gives me a nice view of the historical success/failure of the Test runs as a whole, and Deployments to each environment separately.
It is highly likely that I will later create a fifth repository:
Project_Jenkinsfile_Deploy_SharedLibrary
This last repository would contain pipeline code that is shared amongst the four 'core' pipelines. Once I have the 'core' pipelines up and running properly, I will consider refactoring what I can into this shared library.
I will not accept my own answer at this point, in the hope that more answers are forthcoming.

Here's a proposal I would try for your requirements based on the experience at my last job.
Job1: builds and runs unit tests on every commit on master or whatever your main dev branch is (checks every 20 minutes or whatever suits you); this job usually finds compile and unit test issues very fast
Job2 (optional): runs integration tests and various static code checks (e.g. clang-tidy, valgrind, cppcheck, etc.) every night, if the last run of Job1 was successful; this job usually finds lots of things, but probably takes a long time, so let it run only at night
Job3: builds and tests every pull request for release branches; so you get some information in your pull requests about whether the change is mature enough to be merged into the release branches
Job4: deploys to the appropriate environment on every commit on a release branch; on dev and staging you could probably trigger some more tests, if you have them
So Job1, Job2 and Job3 should run all the time. If pull requests to your release branches are approved by QA (i.e. reviews OK and tests successful) and merged to release branches, the deployment is done by Job4 automatically.
Whether you would rather trigger Job4 only manually depends on your requirements and your dev process.

Custom Jenkins scheduler

We're seeing a problem with Jenkins and the scheduling of builds. Specifically, we trigger Jenkins to build a pipeline of work with every push to every branch of our git repo. On its own, the whole pipeline can take from 10 to 20 minutes to build. This causes a problem if multiple pushes to a branch happen faster than the builds complete, and the problem is multiplied by the twenty or thirty branches that are in development.
So, I'd like to be able to automatically deprioritise any scheduled builds on Jenkins if they are triggered on a Git commit sha that is no longer the tip of its branch. This is just one example of a factor that might indicate a desired priority. Others would be that branches with open pull requests should have higher priority than those without; or manual input in order to prioritise a PR or branch that needs feedback immediately.
Is there any way to programmatically interact with the queue of jobs on Jenkins and reorder it?
There is the Priority Sorter Plugin, but as far as I know this assigns each build a static priority. I would like to dynamically reprioritise items in the queue based on external info (e.g. from git).
I've found references to two other plugins whose names indicate that they might do what I want, but I can't find any meaningful documentation on them. The former doesn't provide the options it claims to, and the latter doesn't even exist in the plugins repository. Neither seems to be maintained.
My alternatives seem to be
write my own implementation of hudson.model.Queue, which seems like overkill
maintain a separate queueing service that triggers individual jobs on Jenkins, in which case what is Jenkins even for?
Am I missing something obvious? I can't be the only person who wants more fine-grained control of Jenkins build ordering.
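To make the desired ordering concrete, here is a minimal Python sketch of the prioritisation rules described above: builds for a sha that is no longer the branch tip sink to the bottom, open pull requests get a bump, and a manual boost overrides both. It does not use any Jenkins API; the QueuedBuild type, the weights and the branch_tips mapping are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueuedBuild:
    branch: str
    sha: str
    has_open_pr: bool = False
    manual_boost: int = 0  # hypothetical manual override


def priority(build, branch_tips):
    """Higher score means the build should run sooner. Weights are arbitrary."""
    score = build.manual_boost
    if build.has_open_pr:
        score += 10
    if branch_tips.get(build.branch) != build.sha:
        score -= 100  # stale: the commit is no longer the tip of its branch
    return score


def reorder(queue, branch_tips):
    """Return the queue sorted by descending priority; ties keep their original order."""
    return sorted(queue, key=lambda b: priority(b, branch_tips), reverse=True)
```

A scoring function like this would have to be re-evaluated whenever a push, a PR change or a manual request arrives, since the inputs change while builds are waiting.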

travis/coverity: automatically re-schedule a build after given time

I'm using GitHub's integration of Travis CI with Coverity Scan (the free versions of all these services) to test my FLOSS code.
The problem I'm facing is that when working continuously on the code, I hit the Coverity quota pretty soon.
Since I'm working on multiple projects simultaneously, it may well be that I switch away from a given project before I'm allowed to submit to Coverity again, thus potentially leaving flaws in the code for weeks although Coverity would have caught them easily.
I would like to avoid this.
The first measure to avoid hitting the quota too frequently is to use a dedicated branch (usually coverity_scan) that does not receive pushes as often as the master and/or feature branches.
However, this puts cognitive load on the user (me), which I would also like to avoid.
Also, sometimes I still hit the quota (some of my projects are in the 100k-500k lines-of-code range, so they have a lower threshold than usual).
What I would like is to be able to automatically re-trigger a Coverity scan once the quota has expired, if (and only if) the current build hit the quota.
Is something like this possible with plain Travis CI/Coverity features?
Or would I have to set up a separate hook that monitors the Coverity quota and Travis CI builds?
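If a separate hook turns out to be necessary, its core decision is when the next submission is allowed. The Python sketch below assumes the quota behaves like "at most limit submissions per rolling window"; that model, the function name and the parameters are assumptions for illustration, not documented Coverity or Travis CI behaviour.

```python
from datetime import datetime, timedelta


def next_allowed_submission(submissions, limit, window, now):
    """Earliest time a new submission fits under `limit` per rolling `window`.

    `submissions` is an iterable of datetimes of past submissions.
    """
    recent = sorted(t for t in submissions if t > now - window)
    if len(recent) < limit:
        return now  # quota not exhausted, submit right away
    # The submission that must age out of the window to free a slot.
    blocking = recent[len(recent) - limit]
    return blocking + window
```

A hook could store this timestamp when a build hits the quota and re-trigger the scan once it has passed.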

You don't need to run Coverity on every check-in. It's just too slow.
You should configure your (coverity build) system to poll your repo for changes, but have them checked infrequently. Something like a few times per day.
This will trigger the build when things change, but not on every change that is detected.

How to cherry pick after having merged several changesets into one

We are using TFS 2010 with the Basic Branch Plan outlined in the Branching Guide on CodePlex for an internal web application. We have just the three basic branches: Dev, Main (QA/Testing), and Release (Production).
Because the app is an internal web application, we only support the single production release.
We basically develop locally and once we complete a task (a bug fix or enhancement), we commit it to Dev. We also generally do a Get Latest from Dev every day when we start work to pull down anything checked in by the other developers. After some period of time (usually a week or two), we'll decide we have enough changes to justify updating the QA site and do a Merge All from Dev to Main and then deploy the merged Main branch to a QA server for testing.
QA will then start testing the site, and after they're satisfied, we'll do a Merge All from Main to Release and deploy the merged Release branch to our production server. Sometimes we even wind up doing multiple Dev-to-Main merges before actually merging everything on up to Release.
Anyway, we've been using this strategy for a couple of months now and until recently everything was looking great. We were able to hotfix Release if we ran into some critical problem in production and then just merge it backwards. All was looking good.
Then we ran into something we didn't know how to deal with. We were given the directive of merging ONLY a single code fix on up from Main to Release (without merging everything else in Main). Since we didn't know this was coming, when the original changeset was merged from Dev to Main, it was merged along with several other changesets. So when I went to merge from Main to Release, the only option I had was the entire merged changeset. I couldn't "drill down" into the merged changeset and pick just the one original changeset from Dev that I really wanted.
I wound up manually applying the change like a hotfix in Release just to get it out there. But now I'm trying to understand how you prevent a situation like this.
I've read several articles on merging strategy and everything seems to recommend NOT cherry-picking changesets when you go to merge, and to simply merge everything available, which makes sense. But if you always merge multiple changesets (and they become one changeset in the destination branch), then how do you potentially merge only one of the original changesets on up to production if the need arises?
For example, if merging Dev (C1, C2, C3) to Main (becomes C4) - then how to merge only C1 from 'within' C4 on up to Release?
It makes me think we'd be better off merging every single changeset individually from Dev to Main instead of doing several at once. At least then we could easily just take one on up from Main to Release if the need arises.
Any recommendations/life lessons/etc. on handling branching/merging for this specific scenario would be greatly appreciated.

In your scenario you could have done the following:
Roll back C4 in Main (this becomes C5, because rollbacks are changesets themselves, which apply the inverse changes).
Merge from Dev to Main again, but this time select only C1 (this becomes C6 in Main).
Now roll back changesets C5 and C6, so that Main has all the changes it had before (this becomes C7 in Main).
After this you have the same code base in Main as before, and you can now merge C6 (which contains only the changes from C1) from Main to Release.
However, to prevent such trouble in future you should really consider merging every single changeset from Dev to Main separately.
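To see why Main ends up unchanged, the sequence can be simulated with a toy model in Python. This is not TFS: each changeset is simply modelled as a set of changes added and removed, and the class and function names are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Changeset:
    name: str
    added: frozenset = frozenset()
    removed: frozenset = frozenset()

    def inverse(self, name):
        # A rollback is itself a changeset that applies the inverse changes.
        return Changeset(name, added=self.removed, removed=self.added)


def apply(state, changeset):
    """Return the branch content after applying one changeset."""
    return (state - changeset.removed) | changeset.added


def simulate():
    # C4: the original merge of C1, C2 and C3 from Dev into Main.
    c4 = Changeset("C4", added=frozenset({"C1", "C2", "C3"}))
    main = apply(frozenset(), c4)
    before = main

    c5 = c4.inverse("C5")  # roll back C4
    main = apply(main, c5)

    c6 = Changeset("C6", added=frozenset({"C1"}))  # merge only C1
    main = apply(main, c6)

    # C7: roll back C6 and C5, newest first.
    for changeset in (c6, c5):
        main = apply(main, changeset.inverse("C7"))

    # Merging C6 into an empty Release carries only C1.
    release = apply(frozenset(), c6)
    return before, main, release
```

In this model simulate() returns the same content for Main before and after the detour, while Release receives only C1.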

I would not recommend merging every single change-set from dev to main; that would be a bad idea with much additional risk!
"but if you always merge multiple changesets (and they become one changeset in the destination branch), then how do you potentially merge only one of the original changesets on up to production if the need arises?"
You don't and should not let the need arise.
This is probably not the easy answer that you are looking for, but there really is no easy answer. Merging every single change-set creates a massive amount of effort to prepare for something that should not be happening anyway. Indeed, the process of merging individual change-sets introduces yet more complexity that will, in the end, bite you in the ass when you can't figure out why your software is not working... "damn, I missed change-set 43 out of 50"...
If the result of a bug:
In your scenario it may have been better if you manually re-applied the "fix" to either a "hotfix" branch off of Release or directly to the Release line.
That is just the cost of having bugs slip through to production, and I would spend a little time figuring out why this problem got past QA and how to prevent it in the future.
If the result of an enhancement:
Did your financial (CFO) guys authorise the reduction in quality in production that is a direct result of shipping untested code? I hope that they did as they effectively own the balance statements upon which that software is listed as an organisational asset!
It is not viable to ship only one feature, built and tested with other features, to production without completing your entire regression cycle again.
Conclusion
I would not recommend merging every single change-set or feature from dev to main; that would be a bad idea with much additional risk that should be highlighted to the appropriate people!

TFS 2010: Rolling CI Builds

I've been looking around online at ways of improving our build time (which is currently ~30-40 minutes, depending on which build agent gets the task), and one common theme I've seen is to use CI builds.
I understand the logic behind this, and it makes sense that it would reduce the time each build takes. Our problem, however, is that building on every check-in is a pointless use of our resources, because in our development branch, we only keep the latest successful build. This means that if 2 people check-in in a short space of time, whoever checked-in last will be the one whose build is kept.
It's for this reason (along with disk space limitations) that we changed to using Rolling Builds, so that we build the development branch a maximum of once every 45 minutes (obviously we can manually trigger builds on top of that).
What I want to know (and haven't been able to find anywhere) is whether there's a way of combining rolling builds AND continuous integration. So keep building only once every 45 minutes, but only get and build files that have changed.
I'm not even sure it's possible, and if not then I'll look into other ways, but this seems like something that should be possible.
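To pin down the behaviour I'm after, here is a rough Python sketch: check-ins accumulate a set of changed files, and a build starts at most once per interval and receives only those files. It does not use any TFS API, and the class and method names are made up for the illustration.

```python
from datetime import datetime, timedelta


class RollingBuilder:
    """Builds at most once per `min_interval`, and only the files changed since the last build."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last_build = None
        self.pending = set()

    def record_checkin(self, files):
        """Remember the files touched by a check-in until the next build."""
        self.pending.update(files)

    def poll(self, now):
        """Return the sorted list of files to build, or [] if no build is due."""
        if not self.pending:
            return []
        if self.last_build is not None and now - self.last_build < self.min_interval:
            return []
        files = sorted(self.pending)
        self.pending.clear()
        self.last_build = now
        return files
```

With a 45-minute interval, two check-ins ten minutes apart would be folded into a single later build covering the files from both.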
