Performance issue while iterating through repo's commits? - libgit2sharp

I want to display just the first 100 commits of a repository. I used the linux-repo to test this:
const int maxSize = 100;
Stopwatch sw = new Stopwatch();
Console.WriteLine( "Getting Commits in own Thread" );
sw.Start();
using( Repository repo = new Repository( path_to_linux_repo ) )
{
    ICommitLog commits = repo.Commits.QueryBy( new CommitFilter { Since = "HEAD" } );
    int index = 0;
    foreach( Commit commit in commits )
    {
        // Stop once exactly maxSize commits have been visited.
        if( ++index >= maxSize ) break;
    }
}
sw.Stop();
Console.WriteLine( "Took {0}ms for {1} entries", sw.ElapsedMilliseconds, maxSize );
This simple loop takes over 9000ms on my machine. It's far faster when using a repo with fewer commits, but why is it so slow in repos with a lot of commits?
Another question: is it possible to retrieve just a given number of commits, e.g. to page through all commits?

I can reproduce here. It's definitely far too long to take. It looks as though libgit2 is enqueueing the full graph before returning, which would be a bug with the given settings. Would you mind opening an issue?
As for retrieving a number of commits, the iteration is pull-based, so you will only grab as many out of the repository as you ask for. The commit log implements IEnumerable, so you can use whatever LINQ methods you like (or do it manually as in this example).
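Because the enumeration is pull-based, the LINQ operators Take and Skip can express both the 100-commit limit and simple paging. The following is a minimal sketch, not LibGit2Sharp code: FakeCommitLog, GetPage and the Produced counter are hypothetical stand-ins for a lazily walked commit log, used only to show that the source is not asked for the whole history.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Counts how many items the source was actually asked to produce.
    public static int Produced;

    // Hypothetical stand-in for a lazily walked commit log.
    public static IEnumerable<int> FakeCommitLog(int total)
    {
        for (int i = 0; i < total; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Returns one page. Skip and Take are lazy, so enumeration stops
    // once the requested page has been filled.
    public static List<T> GetPage<T>(IEnumerable<T> source, int pageIndex, int pageSize)
    {
        return source.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }

    public static void Main()
    {
        Produced = 0;
        List<int> firstPage = GetPage(FakeCommitLog(1000000), 0, 100);
        Console.WriteLine("{0} items returned, {1} produced", firstPage.Count, Produced);
    }
}
```

With a real repository the same shape would presumably be commits.Take(100), or commits.Skip(page * size).Take(size) for paging. Note that Skip still has to walk past the skipped items, so later pages cost more than earlier ones.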
UPDATE:
The bug was quite embarrassing, but there's a PR to fix it in libgit2 which will make its way into libgit2sharp releases in due course. With the fix, this test now takes ~80ms. Thanks for bringing it up.
UPDATE 2:
The fix is now available in the vNext branch of LibGit2Sharp.

Related

How to Shrink the TFS 2017 Fast Growing tbl_content Table

We use TFS 2017 CI / CD pipelines, and they work great. However, the TFS 2017 databases grow by around 1GB per day on average. One database has grown from 10GB to 44GB as of 10/23/2018. The growth is slowly becoming unsustainable for us. We have already adjusted the retention policy to the minimum.
I researched and read at least 30 articles. Here are some relevant ones:
TFS tbl_Content started growing very fast after using VNext build
https://mattyrowan.com/2014/04/02/need-help-tfs-tbl_content-table-and-database-growth-out-of-control/
https://developercommunity.visualstudio.com/content/problem/63712/tfs-database-size.html
Here is what I have done so far:
Reviewed the retention policies again and again, and reduced them to the minimum (1 day, 1 copy). Adjusted 'Keep Deleted' to 10 days.
Unchecked the 'Retain Build' box in the release definition
Ran the scripts from the three articles mentioned above, and found:
a) FileContainer has 149176 files, 43GB (34GB compressed)
b) FileContainerOwner: Build, 29GB
So the main cause of the growth is Build (and artifacts).
My question is: how can I shrink the database?
I looked at the 'History' and 'Deleted' tabs under build definitions.
Some records in 'History' are locked with 'Retained by release'. I can click on records and delete them, but it doesn't do anything. The records are still there.
All records in 'Deleted' are still there.
So back to my question again, how do I delete these records so that the space can be reclaimed?
Thanks.
After I reset RetainedByRelease to false and waited for at least 24 hours, the growth spurt stopped and entries in tbl_content were removed daily.
So in summary, I also did this:
Reset RetainedByRelease to false using the TFS REST API, after installing the NuGet package Microsoft.VisualStudio.Services.Client
Special thanks to these two threads:
https://social.msdn.microsoft.com/Forums/vstudio/en-US/5f649821-b1bf-4008-bba9-0c960e124abb/tfs-releasemanagement-vnext-quotthis-build-has-been-retained-by-a-releasequot-issue?forum=tfsbuild
Trying to get list of TFS users trough client library
The full source code to help fellow developers:
using Microsoft.TeamFoundation.Build.WebApi;
using Microsoft.TeamFoundation.Core.WebApi;
using Microsoft.VisualStudio.Services.Client;
using Microsoft.VisualStudio.Services.Common;
using Microsoft.VisualStudio.Services.WebApi;
using System;

namespace TfsRestAPIs
{
    public class RestAPI
    {
        public static void UpdateRetainedByReleaseToFalse()
        {
            Uri tfsURI = new Uri("http://TFS2017:8080/tfs/YourProjectCollection");
            VssCredentials creds = new VssClientCredentials();
            creds.Storage = new VssClientCredentialStorage();
            VssConnection connection = new VssConnection(tfsURI, creds);
            var projectClient = connection.GetClient<ProjectHttpClient>();
            var projects = projectClient.GetProjects().Result;
            var buildClient = connection.GetClient<BuildHttpClient>();
            foreach (var project in projects)
            {
                Log(project.Name);
                if (project.Name == "YourProjectName")
                {
                    var builds = buildClient.GetBuildsAsync(project.Id).Result;
                    foreach (Build build in builds)
                    {
                        if (build.BuildNumber.StartsWith("YourSearchCondition"))
                        {
                            try
                            {
                                // RetainedByRelease is nullable; only act when it is explicitly true.
                                if (build.RetainedByRelease == true)
                                {
                                    Log(build.BuildNumber + "'s RetainedByRelease=true");
                                    build.RetainedByRelease = false;
                                    var res = buildClient.UpdateBuildAsync(build, build.Id).Result;
                                    Log(" --> RetainedByRelease is set to " + res.RetainedByRelease);
                                }
                            }
                            catch (Exception e)
                            {
                                Log(build.BuildNumber + ":" + e.Message);
                            }
                        }
                    }
                }
            }
        }

        private static void Log(string msg)
        {
            Console.WriteLine(msg);
        }
    }
}

In Jenkins Declarative Pipeline, how to evaluate whether a specific directory is changed

The conditionals of Jenkins Declarative Pipeline are built mainly around branches.
I would like to evaluate whether a specific folder is changed within any branch and then run the stages.
Something like:
stage {
    when {
        folderX changed
    }
}
To give you a better idea of why I need this feature, let me elaborate.
The project I am working on consists of a few modules (let's say microservices). Although every module could have its own branch or even repository, we have chosen to put them together in their own folders, so we can always keep everything clean in the master.
Our Jenkins pipeline has a stage for every module. However, we do not want to rebuild every module if nothing has changed in that folder.
I finally solved the problem with a script block which I found here:
when {
    expression {
        def changeLogSets = currentBuild.changeSets
        for (int i = 0; i < changeLogSets.size(); i++) {
            def entries = changeLogSets[i].items
            for (int j = 0; j < entries.length; j++) {
                def entry = entries[j]
                def files = new ArrayList(entry.affectedFiles)
                for (int k = 0; k < files.size(); k++) {
                    def file = files[k]
                    if (file.path.contains("FolderOfInterest")) {
                        return true
                    }
                }
            }
        }
        return false
    }
}
There are a couple of problems here:
The feature you're proposing does not exist. You could submit a ticket in the Jenkins issue tracker for that.
Also, I think you're suggesting that you want to look at code changes across branches. This is not a standard use case for pipelines. Usually, you want to build a specific branch of a project (when it changes, for example), so you are going to feel moderate to intense pain trying to do what you're suggesting.
If you want to see what has changed on the current branch from within a Jenkinsfile, you can use currentBuild.changeSets (docs). You could combine that with if statements and whatnot, within a script block, if you want to use a declarative pipeline as you seem to be suggesting.
Keep at it and you'll figure it out. Good luck!

Programmatically delete a TFS branch

I want to programmatically delete a branch in TFS that was created automatically.
There is an existing method "ICommonStructureService.DeleteBranches" that should do the work.
My problem is that the method requires a parameter "string[] nodeUris" that specifies the branch to delete using a "vstfs://... " URI and I just don't know how to get that for my branch.
What I need is something like:
var projectCollection = TfsTeamProjectCollectionFactory.GetTeamProjectCollection(new Uri(<myCollectionUrl>));
var cssService = projectCollection.GetService<ICommonStructureService3>();
var project = cssService.GetProjectFromName(<myProjectName>);
But how can I get the Branch Uri from there?
Meanwhile I found a solution. For deleting the branches I am using
versionControl.Destroy(new ItemSpec(myBranchPath, RecursionType.Full), VersionSpec.Latest, null, DestroyFlags.KeepHistory);
This does exactly what I needed.
versionControl is of type VersionControlServer and must be initialized using the Team Collection
Deleting a branch in version control is like deleting any other version control item. You will need to pend a delete with Workspace.PendDelete on the Item.
The method you reference is wholly unrelated to version control, it's part of the TFS common structure service, which controls the "areas and iterations" that TFS work items can be assigned to.
In short, there's no way to perform any sort of version control operations against the common structure service. You delete a branch by creating a Workspace against a VersionControlServer, pending a delete and then checking in your pending changes.
I agree with Edward Thomson about using the Destroy command, so I followed his advice and came up with the following:
public void DeleteBranch(string path)
{
    var vcs = GetVersionControlServer();
    var itemSpec = new ItemSpec(path, RecursionType.Full);
    var itemSpecs = new[] {itemSpec};
    var workSpace = GetOrCreateWorkSpace(vcs);
    try
    {
        // Map the branch to a temporary local folder and fetch it,
        // then pend the delete and check it in.
        workSpace.Map(path, @"c:\Temp\tfs");
        var request = new GetRequest(itemSpec, VersionSpec.Latest);
        workSpace.Get(request, GetOptions.GetAll | GetOptions.Overwrite);
        workSpace.PendDelete(path, RecursionType.Full);
        var pendingchanges = workSpace.GetPendingChanges(itemSpecs);
        workSpace.CheckIn(pendingchanges, "Deleting The Branch");
    }
    finally
    {
        if (workSpace != null)
        {
            workSpace.Delete();
        }
    }
}
If there is a neater way to do the same, I am looking forward to it. This is a bit slow as it does too many things:
Creates a temp workspace
Gets all changes into it
Pends a delete on the whole change set
Checks it in
Cleans up the workspace

TFS2010 - Wrong changeset appearing at SourceGetVersion

I am currently setting up a Team Foundation Server 2010 and I found a very strange behavior when performing a build:
The situation explained:
We have 2 Branches
Development
Main
All developers check in code into the Development branch only. Once per day, the build manager merges some changesets over to the Main branch. On the Development branch, a continuous build at each check in is running. On the Main branch, once per day (in the night) a build is triggered.
Now suppose that the changesets 1-100 are being merged into the Main branch at 5pm, giving changeset 101 as the merge operation. Some developers check in changesets 102-106 after 5 o'clock into the Development branch. Now at 11pm the daily build is automatically triggered and runs on the Main branch. The last changeset of the Main branch is changeset 101. However, the Build details shows changeset 106:
I could imagine that this behavior is intended, because if you check out changeset 106 on the Main branch, you will in fact get the content of changeset 101. But it would be much more readable if this Build summary showed the correct number.
Question 1: Is there a way of manipulating the output of the SourceGetVersion information? Maybe through the Build Process Template?
The second scenario where TFS behaves strangely is even worse:
When queuing a new build, there is the option of entering the "Get Version" Parameter, as shown in the following picture:
If I now click on "queue", the build is triggered and AGAIN the build detail outputs the changeset 106 although I specifically set it to get changeset 76.
Question 2: Is this a bug? Is there a hotfix or something to fix this? Or is there any option flag that has to be set?
I hope someone knows more about this. I don't really believe that this is a bug, because it is such a vital functionality that other people must have encountered it before.
Thanks for any help!!
Christian
EDIT 1
The folder structure of the Team Project is:
$ProjectName
    BuildProcessTemplates
    Documentation
    SourceCode
        Development <-- this is a branch
            3rdParty
            Source
        Main <-- this is a branch
            3rdParty
            Source
The build only pulls the Main branch and everything below it.
EDIT 2
Here is a picture of the Workspace tab in the build definition:
Finally I found out what is going on:
Basically, the changeset that can be seen in my picture 1 is always the latest changeset of the entire Team Project Collection. It is the property "SourceGetVersion" on the object "BuildDetail" of type "IBuildDetail".
I think this is a bug which can be worked around:
If you change BuildDetail.SourceGetVersion (which is a string) to some other value, then the build summary will show the updated string. Furthermore, it is then saved correctly to the collection database.
To add the correct changeset number, I created a custom build activity that takes the branch which should be built as an input parameter and outputs the correct changeset. The activity finds the correct changeset by connecting to TFS and downloading the history of that branch. It then looks at all the items in the history and outputs the largest changeset number. Here is the code of that activity:
using System.Activities;
using System.Collections;
using System.Net;
using Microsoft.TeamFoundation.Client;
using Microsoft.TeamFoundation.VersionControl.Client;
using Microsoft.TeamFoundation.Build.Client;

namespace SourceGetVersionActivity
{
    [BuildActivity(HostEnvironmentOption.All)]
    public sealed class SourceGetVersionActivity : CodeActivity<string>
    {
        // Define an activity input argument of type string
        public InArgument<string> Branch { get; set; }

        // If your activity returns a value, derive from CodeActivity<TResult>
        // and return the value from the Execute method.
        protected override string Execute(CodeActivityContext context)
        {
            // Obtain the runtime value of the Branch input argument
            string branch = context.GetValue(this.Branch);
            ICredentials account = new NetworkCredential("Username", "password", "domain");
            // connect / authenticate with tfs
            TeamFoundationServer tfs = new TeamFoundationServer("http://tfs:8080/tfs/CollectionName", account);
            tfs.Authenticate();
            // get the version control service
            VersionControlServer versionControl = (VersionControlServer)tfs.GetService(typeof(VersionControlServer));
            // download the full history of the branch
            IEnumerable changesets = versionControl.QueryHistory(branch, VersionSpec.Latest, 0, RecursionType.Full,
                null, null, null, int.MaxValue, false, false, false, false);
            // find the largest changeset id in that history
            int maxVersion = 0;
            foreach (Changeset c in changesets)
            {
                if (c.ChangesetId > maxVersion)
                {
                    maxVersion = c.ChangesetId;
                }
            }
            // "C" marks the version as a changeset
            return string.Concat('C', maxVersion.ToString());
        }
    }
}
I call this activity as soon as possible (after the GetBuild activity).
Basically, in the BuildProcessTemplate I have added an Argument (string) "Branch" which needs to be filled with a string that points to the top folder that is being built. The custom activity takes that as input and outputs a string which is the correct changeset id. The BuildDetail.SourceGetVersion property will then be overridden by the correct changeset id.
I find it really strange that no-one else seems to have encountered this problem. I could not find any person on the internet with the same problem. Anyway, I hope this answer helps someone else in the future as well.
EDIT - Writing the above code directly in Workflow Foundation:
To get the correct changeset using more compact code and avoiding custom activities, it is also possible to use Workflow Foundation directly. Below is the "code" (doing exactly what is done in the above C# code):
(1) The GetTeamProjectCollection activity gets the current collection. I am saving it inside the TeamProjectCollection variable (see bottom of the picture). Important: The variable needs to be defined inside this sequence. If you define it in an outer scope, an error will occur: "Unable to serialize type 'Microsoft.TeamFoundation.Client.TfsTeamProjectCollection'. Verify that the type is public and either has a default constructor or an instance descriptor."
(2) Foreach "changeset" in "TeamProjectCollection.GetService(Of VersionControlServer).QueryHistory(Branch, VersionSpec.Latest, 0, RecursionType.Full, Nothing, Nothing, Nothing, Integer.MaxValue, False, False, False).Cast(Of Changeset)()"
The TypeArgument of the Foreach loop is "Microsoft.TeamFoundation.VersionControl.Client.Changeset".
This expression gets the version control object from the collection and calls its "QueryHistory" method, which returns an IEnumerable with all changesets.
(3) So we are iterating over all changesets and looking at the ChangesetId. Then saving the maximum ChangesetId to the variable "maxId".
(4) At the end, BuildDetail.SourceGetVersion = "C" + maxId.ToString(). The "C" indicates that the version is a changeset.
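Steps (3) and (4) boil down to "take the largest changeset id and prefix it with C". As a minimal sketch of just that step, assuming the ids have already been pulled out of the history (for instance via Cast and the ChangesetId property): VersionSpecSketch and FromChangesetIds are hypothetical names, not part of the TFS API, and returning null for an empty history is a choice made for this sketch (the activity above would return "C0" in that case).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VersionSpecSketch
{
    // Builds a changeset version string such as "C101" from the largest id.
    // Returns null when the history is empty (an assumption of this sketch).
    public static string FromChangesetIds(IEnumerable<int> changesetIds)
    {
        int max = changesetIds.DefaultIfEmpty(0).Max();
        return max > 0 ? "C" + max : null;
    }

    public static void Main()
    {
        // Ids taken from the scenario in the question: the merge was changeset 101.
        Console.WriteLine(FromChangesetIds(new[] { 76, 101, 98 }));  // prints C101
    }
}
```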
I hope someone finds this piece of "Code" useful!
Christian

TFS2010 Custom Build Activity : to Merge branches

I'm working on customizing our build activity. I'd like to have your help for an issue.
Following is our version control hierarchy.
Main
|- Dev
|- QA
We are working on the Dev branch, and while taking the build we need to merge the Dev branch to Main and then to QA.
Main is the root branch as you might know.
In our build template, I've added two custom activities to merge one from Dev to Main and another one to merge from Main to QA. Following is the code for the custom activity.
protected override string Execute(CodeActivityContext context)
{
    string lstrStatus = string.Empty;
    // Obtain the runtime value of the input arguments
    string lstrSourceBranchPath = context.GetValue(this.SourceBranchPath);
    string lstrTargetBranchPath = context.GetValue(this.TargetBranchPath);
    Workspace workspace = context.GetValue(this.Workspace);
    GetStatus status = workspace.Merge(lstrSourceBranchPath,
        lstrTargetBranchPath,
        null,
        null,
        LockLevel.None,
        RecursionType.Full,
        MergeOptions.None);
    // resolve the conflicts, if any
    if (status.NumConflicts > 0)
    {
        Conflict[] conflicts = workspace.QueryConflicts(new string[]
            { lstrTargetBranchPath }, true);
        foreach (Conflict conflict in conflicts)
        {
            conflict.Resolution = Resolution.AcceptTheirs;
            workspace.ResolveConflict(conflict);
        }
    }
    // checkin the changes
    PendingChange[] pendingChanges = workspace.GetPendingChanges();
    if (pendingChanges != null && pendingChanges.Length > 0)
    {
        workspace.CheckIn(pendingChanges, "Merged by MERGE BRANCHES activity");
    }
    return lstrStatus;
}
The problem is that the merge happens perfectly on the server, but it is not reflected in the local folder. I tried adding a SyncWorkspace activity after each Merge custom activity. It still doesn't work.
My guess was that a SyncWorkspace should be the only thing to do.
You could try doing a RevertWorkspace before that.
EDIT
Since you have now stated that even this doesn't work, I would file a bug with Microsoft, at least to get an official answer.
In the meantime you can try the following method, which I absolutely see as overkill: Once you have checked in, redo all the steps within the sequence Initialize Workspace.
If even that doesn't work I'd consider two different builds, one that does your merge & one that does the actual build. You can then organize a scheme where your first build, once it's done, triggers the second one. Here is a good resource for that.

Resources