Libgit2sharp equivalent of 'git gc'?

We have 100+ remote repositories that we are looking to operate on exclusively via libgit2sharp, but we need to keep the repositories as small as possible. We were intending to just set gc.auto low and let git run garbage collection whenever a repository got too big, but after some tests we noticed that libgit2sharp doesn't honour that config setting. Upon further investigation I found that someone had already asked about libgit2sharp's support for gc.auto here:
is libgit2 automatically packing repositories
While I understand the reasoning in that response, I was wondering: is there a way to manually force a garbage collection on a repository via libgit2sharp?

There's no way to request a garbage collection at this moment.
Some required low level functions already exist at the libgit2 level, but most of the logic has yet to be implemented.
No entry exists yet in the issue tracker for a git gc-like API. The best way to be kept updated on this topic would be to log a new feature request.
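Until such an API exists, one pragmatic workaround (outside libgit2sharp itself) is to have the process that manages the repositories shell out to the regular git command-line client. A minimal sketch, where the repository path is just a placeholder:

    # Run a full garbage collection on one repository via the git CLI
    cd /path/to/repo.git && git gc --aggressive --prune=now

    # Optional: check how much loose-object data remains afterwards
    cd /path/to/repo.git && git count-objects -v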

Related

Complex/Orchestrated CD with AWS CodePipeline or others

Building an AWS serverless solution (Lambda, S3, CloudFormation etc.) I need an automated build solution. The application should be stored in a Git repository (pref. Bitbucket or CodeCommit). I looked at Bitbucket Pipelines, AWS CodePipeline, CodeDeploy and hosted CI/CD solutions, but it seems that all of these react to a dumb signal that something changed and then rebuild the whole environment, as if it were one app rather than a distributed application.
I want to define ordered steps of what to do depending on the filetype per change.
E.g.
1. every updated .js file containing lambda code should first be used to update the existing lambda
2. after that, every new or changed CloudFormation file/stack should be used to update or create the existing ones; there may be a required order (they import values from each other)
3. after that, code for new lambdas in .js files should be used to update the code of the lambdas created in the previous step.
Non updated resources should NOT be updated or recreated!
It seems that my pipelines should be ordered AND have the ability to filter input (e.g. only .js files from a certain path) and receive as input also what the name of the changed resource(s) is(are).
I don't seem to find this functionality within AWS, hosted Git solutions like Bitbucket, or CI/CD pipelines like CircleCI, Codeship, AWS CodePipeline, CodeDeploy etc.
How come? Doesn't anyone need this? Seems like a basic requirement in my eyes....
I looked again at the available AWS tooling and came to the following conclusions:
When coupling CodePipeline to a CodeCommit repository, every commit puts a whole package of the repository on S3 as input for the pipeline - so not only the changes, but everything.
CodePipeline does have the orchestration functionality I was looking for: you can have actions for every component, such as create-change-set and execute-change-set for a SAM component, and you have control over the order of them all.
But:
Since all the code is given as input, I assume all actions in the pipeline will be triggered even for a small change that does not affect 99% of the resources. Under the hood, SAM or CloudFormation will determine for themselves what did or did not change, but it is not very efficient. See my post here.
I cannot see in the pipeline overview which one was run last and what its status was.
I cannot temporarily disable a pipeline or trigger it with custom input.
In the end I think I will make a main pipeline with custom Lambda code that determines what actually changed using the CodeCommit API, and split all the actions into sub-pipelines. From the main pipeline I will push the input they need to S3 and execute them.
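As a rough sketch of that change detection (shown here with the AWS CLI rather than Lambda code; the repository name, commit IDs and the .js filter are placeholders):

    # List the files that changed between two commits in a CodeCommit repository
    aws codecommit get-differences \
        --repository-name my-repo \
        --before-commit-specifier "$OLD_COMMIT" \
        --after-commit-specifier "$NEW_COMMIT" \
        --query 'differences[].afterBlob.path' \
        --output text |
      tr '\t' '\n' |
      grep '\.js$'        # keep only the Lambda sources, for example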
(I'm not allowed to comment, so I'll try to provide an answer instead - probably not the one you were hoping for :) )
There is definitely a need and at Codeship we're looking into how best to support FaaS/Serverless workflows. It's been a bit of a moving target over the last years, but more common practices etc. are starting to emerge/mature to a point where it makes more sense to start codifying them.
For now, it seems most people working in this space have resorted to scripting (either the Serverless framework, or directly against the FaaS providers), but everyone is struggling with the issue of deploying just what changed vs. deploying everything, as you point out. Adding further complexity with sequencing obviously just makes things harder.
Most services (Codeship included) will allow you some form of sequenced/stepped approach to deploying, but you'll have to do all the heavy lifting of working out what has changed etc.
As to your question of "How come?", I think it's purely down to how fast the tooling has been changing lately, combined with how few teams are really doing it. There's a huge push for larger companies to move to K8s, and I think that has basically drowned out the FaaS adopters. Not that it should be like that, or that we at Codeship don't want to change that; it's just how I personally see things.

Centreon/Icinga: command by services

I was wondering if it is possible to find out which services use a given command in Centreon - for example, which services contain the 'check_uptime' command?
Maybe it's possible with some SQL magic against the database, but I've never done that.
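(For reference, a query along these lines against the Centreon configuration database should get close; the table and column names are assumptions based on a typical Centreon schema, and services that inherit the command from a template won't show up, so treat it only as a starting point.)

    mysql -u centreon -p centreon -e "
      SELECT s.service_description
      FROM   service s
      JOIN   command c ON c.command_id = s.command_command_id
      WHERE  c.command_name = 'check_uptime';"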
Your question reminded me of this week's Icinga 2 API development, where dependency tracking for objects has been implemented - this is important in case someone deletes a check command at runtime while many hosts/services depend on it. By default the API will deny the removal, but a cascading delete would cause the entire dependency tree to be deleted.
A side-effect from that development is the output of such object dependencies inside the status query for these objects.
Check the screenshot over here: https://twitter.com/dnsmichi/status/637586226711764992
It may not help you right now, but 2.4 is being released in November.

Coding on multiple machines

What methods would you use to securely use multiple machines to work on code in active development?
My Ideal Situation
Sharing development code securely among multiple machines (at least two)
Automatic synchronization (think Google docs whereby any user's changes update all the others immediately). The reason for this is that I'd like to be able to use these computers interchangeably without having to commit / clone every time I switch. My understanding is that automatic synchronization would make it possible to switch machines seamlessly without having to commit a bunch of files each time.
The location of the development code is such that it can be accessed by a local Rails server and rendered on localhost:3000.
The solution works for Apple machines (both my computers are Apple).
I'm not sure if this question is a 'reasonable' question in terms of its specificity but it's the best first attempt I have. Thanks!
If you are the only person working on this project, then a service like Dropbox would work and provide you with the automatic synchronisation you're after.
However, if you're working with someone else on this project, or you're likely to do so in the future, then it's worth learning the basics of Git (or some other distributed version control system). It's probably not as hard as you expect:
You can get by with a few basic commands (see Everyday Git with 20 commands or so).
You can simplify things even further with git-up (this isn't perfect, but in most cases it makes fetching remote changes into a single command).
There are various OS X GUIs available to help you, including GitHub for Mac and GitX
You can get private repository hosting from GitHub (for a small fee) or from Bitbucket (for free).
Syncing with Git won't be automatic, but it does give you a lot of flexibility.
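In practice the manual routine when switching machines boils down to a handful of commands (the remote and branch names below are just the usual defaults):

    # On the machine you are leaving
    git add -A
    git commit -m "WIP: switching machines"
    git push origin master

    # On the machine you are switching to
    git pull origin master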
Use any version control system (I guess you do already), and if you really need automatic synchronisation, put something like this in your crontab (assuming svn and Growl for notifications):
* * * * * cd /path/to/checkout && svn update && svn commit -m 'auto-sync' || (echo "sync failed" | growlnotify)
I think version control is the safest best practice for code sharing.
If it is only you working on the code, you can certainly set up another mode of syncing, like an auto-syncing Dropbox folder.
Also, if you always have your laptop accessible when working on the desktop, you could just mount the laptop's code location as a shared network folder on your desktop, so no syncing is needed :)
Dropbox is the best free and easy solution; I use it for my personal projects as well. Just beware of running Dropbox on two computers at once: when one PC is sending changes and the second one is only running Dropbox, you can lose your work (that was my case).

How should a team push their changes to git master?

In the past we (a coworker and I) would push our changes directly to master. And then inform each other that changes need to be pulled.
A new coworker suggests forking the git repo; when he makes changes, he opens a pull request, and I, staying on the main repo, accept the pull request.
Which is the traditional / common approach when working together as a team? Or is there a better approach?
It depends on whether you want to have one central repository or not. Many organizations have been using a central repository and continue to do so when they switch to git. It also depends on access, trust and how many developers there are. If you are only a few devs and you all trust each other, I'd go with a central bare repository that everyone pushes to and pulls from. Keep it simple.
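A minimal sketch of that central-bare-repository setup, assuming the repo lives on a shared host reachable over SSH (host and paths are placeholders):

    # On the shared host: create the central bare repository once
    git init --bare /srv/git/project.git

    # On each developer machine
    git clone ssh://user@githost/srv/git/project.git
    cd project
    # ...edit, git add, git commit...
    git pull --rebase origin master   # pick up everyone else's changes
    git push origin master            # publish your own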
If you are 100 developers and perhaps also external developers that you don't trust using your central repo and want to restrict access for some other reasons then pull requests might be the solution.
The important thing is to look at what kind of workflow YOU want and keep in mind that git will not get in your way and will let you decide that for yourself.
The traditional way is fork-and-pull, i.e. the Linus fork of the Linux kernel is the official line. The difference to your current approach is the amount of control you have over the changes. If you don't need this control, or if you can't check the changes anyway because you don't have the time to do so, there's no advantage in pulling manually. Git handles resets/deletes very well and you can always go back in history.
Forking a git repo is a much better approach when working in a team, as it makes sure your repo and code never hit an inconsistent state. However, this practice is not very common. Most teams have made it a tradition that everybody works on the same branch at the same time; this is a bit dangerous, but it can be made efficient by adding email alerts in the post-receive hook of the repo, so that the other team-mates can pull the changes as soon as they get the alert.
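A rough illustration of such a hook; the recipient address and the use of the mail command are assumptions, and a real hook would also need to handle newly created branches (all-zero old revision):

    #!/bin/sh
    # hooks/post-receive on the shared repository:
    # git feeds "<oldrev> <newrev> <refname>" on stdin for every updated ref
    while read oldrev newrev refname; do
        git log --oneline "$oldrev..$newrev" |
            mail -s "New commits pushed to $refname" team@example.com
    done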
I hope this helps.
I would say there are two main workflows:
As Magnus said everybody is pushing to the "blessed" bare repo, while working on the local clones of it.
A more restricted workflow may have a limited number of people with push access to the blessed repo; all other contributors either send pull requests or, if a pull is technically difficult, provide patches. They still pull from the blessed repo to keep their own repos in sync. This workflow implies code review by "lieutenants" before a change goes into the blessed repo.
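In command-line terms, that second workflow looks roughly like this for a contributor (URLs and branch names are placeholders):

    # One-time setup: clone your fork and point "upstream" at the blessed repo
    git clone git@githost:me/project.git
    cd project
    git remote add upstream git@githost:team/project.git

    # For each change: branch, commit, stay in sync, then ask for a pull
    git checkout -b my-feature
    # ...commit work...
    git fetch upstream
    git rebase upstream/master
    git push origin my-feature
    # then open a pull request, or send patches with git format-patch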

Deploy tracking with Ruby on Rails and Capistrano

Just as every commit has a reason and purpose, I think each deploy has a purpose and reason too. Source code commits have a message, but deploys don't have anything.
How do I record a reason and purpose for each deploy automatically?
I need to keep a record of:
Who deployed, to where, and at what time.
Why deployed? Bug fixes? Feature update? Emergency fix not on iteration plan?
Which git or svn ref was used?
Has anybody felt the need for this kind of system? How do you feel about my approach?
How can I achieve my goal? I'm currently using Capistrano for deployment.
A bounty has been added. I'd like to hear more stories from different developers who are doing "continuous deployment".
I found two services that do deploy tracking:
Codebase
Hoptoad
Webistrano - https://github.com/peritor/webistrano/wiki - is a web interface to capistrano, that also tracks who's deployed what and when, so that could be worth investigating.
My current project uses a modified version of apinsein's git-deployment recipe, which (when you tell cap to do a deploy) tags the current HEAD with a Git tag (which gives you all the benefits of normal Git commits).
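If you only need something lightweight, plain annotated tags can already carry most of what the question asks for (who, when, why, which ref); the naming scheme below is just one possibility:

    # Tag the revision being deployed, recording who deployed it and why
    git tag -a "deploy-production-$(date +%Y%m%d-%H%M%S)" \
        -m "Deployed by $(git config user.name): emergency fix, not on the iteration plan"
    git push origin --tags

    # Later: list past production deploys and inspect any of them
    git tag -l "deploy-production-*"
    git show deploy-production-20140101-120000   # placeholder tag name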
I've built a web service for this exact problem, http://deploytracking.com; it hooks into Capistrano and records the time, user, branch, ref, environment and repo involved in the deployment.
Strano - the GitHub-backed Capistrano deployment management UI.
Regarding continuous deployment, I also submitted a pull request there which introduces automatic deployments for GitHub projects; for now it simply triggers the deploy task when somebody pushes to the master branch.
I don't know if it is still relevant but I would like to come up with a different solution. I am building a new deployment tool that does just what you are looking for.
I do not intend to spam my stuff here but since I am building something that could help you...
Anyway, have a look here https://alessiosantocs.github.io/Captain. I'm gathering feedback so if you have any please let me know.
Update
As suggested, I'm giving an explanation :)
I have also felt this need. I work in a digital startup and we're constantly deploying, 5 days a week, on different Ruby on Rails applications with Capistrano.
What we noticed was that for every single deployment, we needed to do several things:
Keep track of which pull requests and commits went online at that exact moment
Give some sort of name to the deploy so we could recognize it
Alert our team members so that everyone could be on the same page (without having to ask us about the deployment)
Keep track of every deployment, for bugs and errors we might find at some point in the future (which happened often)
So for this reason we started developing a custom solution that integrates with Capistrano and our SCM (Bitbucket) and keeps track of every change we make to our master branch. This is what it does right now.
We are currently tracking deployment environment, repo source, deployment branch and revision. Mainly we manage pull requests, because we found that pull requests, better than commits, solved an organizational issue in our team (it was difficult to approve other team members' code without a rigid system like PRs).
I would like to explain more about Captain and about our personal dev management strategy with you guys if you want.
Thanks #thirumalaimurugan for asking for clarification!
Update 2
We tried git tagging too. It was good and fun at the beginning, but we couldn't manage the tags very well.
A tag is basically a bookmark to a specific revision, so we're talking about commits; a tag keeps no track of pull requests. It was quite a mess for us.
I don't think tags are bad for what you're trying to achieve, but I think there must be other solutions that fit your (and our) problem more exactly.
