I'd like to extract task (issue) cycle times from my project and explore how they can be visualized in meaningful and helpful ways. Is it possible to retrieve this information from the GitHub API? After spending some time in the docs, I can't find it. Here are the available endpoints: https://docs.github.com/en/rest/overview/endpoints-available-for-github-apps
It seems there isn't a cycle time (process time) value; how can I derive it?
I think you've hit the nail on the head by wanting to do this with the GitHub API, instead of relying on potentially inaccurate metrics from project management tools like Jira. It's definitely possible to extract meaningful software engineering metrics from GitHub, but the easiest way is to use a commercial service.
Haystack Analytics offers a commercial service that extracts Cycle Time for you. That Cycle Time can then be broken down into Development Time and Review Time (which in turn can be broken down into First Response Time, Rework Time and Idle Completion Time). You can also track the other DevOps Four Key Metrics (as described in the book Accelerate) in Haystack.
Google Cloud Platform has also open-sourced a self-hosted tool on GitHub that extracts the Four Key Metrics from Git data. Do note that this tool has a number of limitations, and the metrics might not work with your use case out of the box.
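If you'd rather compute a rough cycle time yourself, the REST API does expose enough to approximate it: closed issues carry created_at and closed_at timestamps, so created-to-closed time is straightforward to pull (a true "in progress"-to-done cycle time would need the issue timeline events or your project board's status changes). Here is a minimal sketch using Python and the requests library; the owner, repo and token are placeholders:

```python
import os
from datetime import datetime, timezone

import requests

OWNER, REPO = "your-org", "your-repo"   # placeholders
TOKEN = os.environ["GITHUB_TOKEN"]      # a personal access token with repo read access

def fetch_closed_issues():
    """Yield closed issues page by page (the endpoint also returns PRs, which we skip)."""
    url = f"https://api.github.com/repos/{OWNER}/{REPO}/issues"
    params = {"state": "closed", "per_page": 100, "page": 1}
    headers = {"Authorization": f"token {TOKEN}",
               "Accept": "application/vnd.github+json"}
    while True:
        resp = requests.get(url, params=params, headers=headers)
        resp.raise_for_status()
        page = resp.json()
        if not page:
            break
        for issue in page:
            if "pull_request" not in issue:
                yield issue
        params["page"] += 1

def parse_ts(value):
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

# Created-to-closed duration in days, used here as a rough proxy for cycle time.
cycle_times = [
    (parse_ts(i["closed_at"]) - parse_ts(i["created_at"])).total_seconds() / 86400
    for i in fetch_closed_issues()
]
if cycle_times:
    mean_days = sum(cycle_times) / len(cycle_times)
    print(f"{len(cycle_times)} closed issues, mean cycle time: {mean_days:.1f} days")
```

From there, a histogram of the durations or a rolling median over time tends to be the most useful visualization.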
Can anyone point me to a tool that could be used to measure the time spent on individual components of a gcloud Dataflow pipeline? If such a tool exists, could you please share a link too? Thanks
I don't know of any external tool for measuring a Dataflow job's components; however, I think this information can be retrieved using the Monitoring UI, which contains detailed information about the overall pipeline job execution as well as the total execution time per step. Additionally, you can take a look at this link, which contains an "Understanding timing" guide, in case you want a deeper understanding of this feature.
What are people's opinions on JIRA Studio, i.e. using the hosted product for a large company? Especially regarding hosted source control and the reliability of the service.
Is this product up to large-scale implementations yet?
I've been using JIRA Studio (hosted) extensively over the last few weeks with a Java project. So far my experience has been resoundingly positive, with the following caveats:
Setting up Elastic Bamboo requires filing a support ticket. While admittedly the process is fully automated and very easy, it can still take a day or two before you can begin setting up your builds;
In my opinion, SVN hosting is limiting. I've been very much looking forward to working with git or Mercurial, but I'm not aware of any plans to add support for those. You can certainly find a separate host for your sources, but you'd be losing out on ease of use and on out-of-the-box integration with issue tracking and the JIRA dashboard (which I've grown to absolutely love), and you'd have to sign up with a second provider.
I would rate the primary advantages as:
Very low integration cost (compared to e.g. setting up your own Bugzilla+Mediawiki+Hudson setup);
Relatively low TCO, particularly if you have a small staff and no Linux hackers to get you started up;
Very smooth administration and usage experience. I've very rarely had to look in the documentation, and when I did, it was usually clear and informative.
A recent announcement by Google about the Google Prediction API sounded very interesting. It could be useful for a project that is coming up, and would probably do a better job than some custom code I was considering.
However, there is some vendor lock-in. Google retain the trained model, and could later choose to overcharge me for it. It occurred to me that there are probably open-source equivalents, if I was willing to host the training myself (I am) and live without their ability to throw hardware at the problem at a moment's notice.
The last time I looked at third-party machine-learning code was many years ago, and there were a lot of details that needed to be carefully considered and customised for your project. Google appear to have hidden those decisions and take care of them for you. To me, this is still indistinguishable from magic, but I would like to hear whether others can do the same.
So my question is:
What alternatives to the Google Prediction API exist which:
categorise data with supervised machine learning,
can be easily configured (or need no configuration) for different kinds and scales of data set, and
are open source and self-hosted (or at the very least provide royalty-free use of your model, without a dependence on a third party)?
(See the sketch just below for the sort of train/predict workflow I mean.)
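```python
# A toy illustration of the supervised-categorisation workflow I mean.
# scikit-learn is used here purely as one example of an open-source, self-hosted
# library; the training data below is made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Labelled training examples: texts with known categories.
train_texts = [
    "great product, works perfectly",
    "terrible, broke after a day",
    "love it, highly recommend",
    "waste of money, very disappointed",
]
train_labels = ["positive", "negative", "positive", "negative"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Categorise new, unseen data -- the equivalent of the Prediction API's predict call.
print(model.predict(["really pleased with this purchase"]))  # e.g. ['positive']
```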
Maybe Apache Mahout?
PredictionIO is an open source machine learning server for software developers to create predictive features, such as personalization, recommendation and content discovery.
I've recently been looking at tools like the Google Prediction API. One of the first I was pointed to was the Weka machine learning toolkit, which could be worth checking out for anyone looking.
I'm not sure if it's relevant, but directededge seems to be doing exactly that :)
There is a good free-to-use service, Yandex Predictor, with a quota of 100,000 requests per day. It works on text only, and supports several languages and spell correction.
You need to get a free API key; then you can use its simple RESTful API. The API supports JSON, XML and JSONP as output.
Unfortunately I cannot find documentation in English. You can use Google Translate.
I can translate the docs if there is some demand.
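A quick sketch of what a call looks like. I'm reconstructing the endpoint and parameter names from memory of the Russian-language docs, so treat them as assumptions and check the current documentation before relying on them:

```python
# Assumed endpoint and parameter names, reconstructed from the Russian-language docs.
# Verify against the official documentation before use.
import requests

API_KEY = "your-free-api-key"  # obtained from the Yandex developer pages
url = "https://predictor.yandex.net/api/v1/predict.json/complete"
params = {
    "key": API_KEY,
    "lang": "en",          # language of the text
    "q": "machine lear",   # the text fragment to complete / correct
}
resp = requests.get(url, params=params)
resp.raise_for_status()
print(resp.json())  # JSON output; XML and JSONP variants are also offered
```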
What would be a proper way to simulate a large number of requests to test if my web application can handle it?
You could try using Microsoft's WCAT tools. Look here: http://support.microsoft.com/kb/231282
They're free, too. That's always nice.
Depending on your budget, you may be interested in some load testing software designed for this. A Google search brings up all sorts of alternatives. This is probably the best way to do it.
This one has a free trial version and isn't too pricey, but I would recommend shopping around first.
I've used JMeter in the past, and I find it to be very useful for stress/load testing websites, even ones written in ASP.NET (with or without MVC).
In general you would want to (with any tool) write a script of what an average user of your site would do. You may even end up creating several such scripts. Tools like JMeter even allow a random element to be added to a script. With these scripts created, a load testing tool can then simulate as many users as you desire hitting your site.
I would recommend allowing JMeter to slowly ramp up the number of concurrent users and watching the response time graph. The point where the response time starts to climb sharply is the point where you've hit the maximum number of users (given your scripts) that your site can handle.
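If you just want to see the ramp-up idea in action without installing anything, the same concept can be sketched in a few lines of Python. This is only an illustration of the technique, not a replacement for JMeter, and the URL is a placeholder:

```python
# Minimal illustration of ramping up concurrent users and watching response times.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://your-app.example.com/"  # placeholder

def one_request(_):
    start = time.perf_counter()
    try:
        requests.get(URL, timeout=30)
    except requests.RequestException:
        pass  # count failures however you like
    return time.perf_counter() - start

for users in (5, 10, 20, 40, 80):          # ramp up concurrency step by step
    with ThreadPoolExecutor(max_workers=users) as pool:
        timings = list(pool.map(one_request, range(users * 10)))
    avg = sum(timings) / len(timings)
    print(f"{users:3d} concurrent users -> avg response {avg:.3f}s")
```

When the average (or better, the 95th percentile) starts climbing steeply between steps, you've found roughly the concurrency your site can sustain.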
ab and httperf are two more unix-y options, if you don't mind delving in that direction.
There's a nice screencast for using httperf by peepcode.
Use the load testing tools from Visual Studio Team System, the 2010 version if you can get it.
The tools are great to use and provide wonderful instrumentation. There is also a programming model to go with the tools, allowing you to build some very complex testing scenarios.
Post the URL on stackoverflow.
Make it sound like a challenge, so lots of people come check it out: "Can you find the hidden performance problem in this app?"
Any suggestions for an accurate Web Log analysis tool to generate reports on the IIS logs? We used WebTrends, but I don't feel it was accurate.
To analyze weblogs, I don't think you can go wrong with Analog: http://www.analog.cx/
If you are analyzing your own logs, which are often huge files, you will want the fastest analyzer you can find. Analog is fast.
You'll want one that's been around a while and is still supported. Analog just celebrated its 10th birthday.
Analog claims to be the most popular logfile analyser in the world.
It supports multiple languages.
Did I mention it's free and open source?
As far as accuracy goes, no tool gives perfect results. JavaScript-based tracking often fails to catch hits. Trying to track individual people's paths through a website (i.e. for analytics purposes) is fraught with problems. And even trying to differentiate hits from visits and screening out the bots is more of a black art than a science.
What is best is simply to have a tool that gives decent basic statistics that tell you what you need to know.
I've looked at other tools, such as Deep Log Analyzer: http://www.deep-software.com/, which attempts to do analytics from your weblogs. But speed was a problem. They claim their new version 3.5 (April 2008), which I didn't try, has improved performance. The big advantage of a program like this is the advanced reporting you can do, including custom SQL queries. You have to purchase their professional version ($200) to do most of the analytics and custom queries. If Analog is too simple for you, then try the free version of Deep Log Analyzer.
And you can also try Microsoft's own Log Parser, as was the recommended answer in: https://stackoverflow.com/questions/157677/a-good-iis-log-viewer-for-large-log-files.
But you will need some extra skills to use it.
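If all you need is a handful of basic numbers and you don't mind a little scripting, a quick pass over the raw W3C-format logs works too. A rough sketch in Python, assuming the default IIS W3C fields; the log path is a placeholder:

```python
# Rough sketch: count hits and status codes per day from IIS W3C logs.
# Assumes the default W3C format with a "#Fields:" header line.
import glob
from collections import Counter

hits_per_day = Counter()
status_codes = Counter()

for path in glob.glob(r"C:\inetpub\logs\LogFiles\W3SVC1\*.log"):  # placeholder path
    fields = []
    with open(path, encoding="ascii", errors="replace") as f:
        for line in f:
            if line.startswith("#Fields:"):
                fields = line.split()[1:]          # e.g. date time s-ip cs-method ...
                continue
            if line.startswith("#") or not line.strip():
                continue
            row = dict(zip(fields, line.split()))
            hits_per_day[row.get("date", "?")] += 1
            status_codes[row.get("sc-status", "?")] += 1

for day, hits in sorted(hits_per_day.items()):
    print(day, hits)
print("Status codes:", dict(status_codes.most_common(10)))
```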
What do you want to analyze from your logs? There are a bunch of tools out there, free or paid, that will go through the logs and spit out a great variety of figures. Some have real meaning, others are best taken with a grain of salt.
What none will show you is "How many people are actually reading my wonderful web pages". Those that attempt to show "distinct site visitors" or any detailed metrics are at best a rough approximation to an indication of a vague trend...
But for what it's worth, we use Analog.
SHORT ANSWER:
You are correct to question the results; log analysis is not adequate to report actual traffic.
LONGER ANSWER:
WebTrends is a great tool for what it delivers. But as a previous administrator of a WebTrends installation, I found that web logs are notoriously bad at capturing metrics of interest.
For instance, if there is any caching in your web delivery stack (or on the consumer's side; I'm shaking my fist at YOU, AOL!), then your web logs instantly stop reflecting your site's actual activity. This is because log analysis assumes that all user consumption translates into an HTTP request back to the web server, and is thus recorded in the IIS logs. With a cache in the way, that is not the case.
If you want more reliable results in the future, you ultimately need to ensure there is a way to bust any caching strategy. The obvious answer is dynamic content. But if you do not want to rewrite all of your content in such a fashion, just ensure your web traffic analysis uses a dynamic call.
WebTrends actually offers a solution to this problem, called the SDC server. This is exactly what Google Analytics offers as well: it's a JavaScript call back to the analysis server.
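To make that concrete, the "dynamic call" can be as small as an uncacheable tracking endpoint that every page references (via a script or image tag) and that writes its own log. This is only a sketch of the idea, not how SDC or Google Analytics are actually implemented; the framework (Flask) and all names are my own:

```python
# Sketch of an uncacheable tracking endpoint: the idea behind SDC/GA-style collection.
import logging
from flask import Flask, request, make_response

app = Flask(__name__)
logging.basicConfig(filename="pageviews.log", level=logging.INFO,
                    format="%(asctime)s %(message)s")

@app.route("/track.gif")
def track():
    # Every real page view triggers this request, so it is recorded even if the
    # page itself was served from a proxy or browser cache.
    logging.info("page=%s ua=%s ip=%s",
                 request.args.get("page", "?"),
                 request.user_agent.string,
                 request.remote_addr)
    resp = make_response("", 204)  # no body needed; browsers still fire the request
    resp.headers["Cache-Control"] = "no-cache, no-store, must-revalidate"
    return resp
```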
...I could go on for days about this. If you want more specific information, comment back. ;)
EDIT: With WebTrends specifically, it is quite important to configure session tracking beyond their default IP/userAgent configuration. If your web server assigns a session cookie, you will find this increases reliability, especially for differentiating between users who may sit behind the same NAT.
I have had really good luck with SmarterStats, from SmarterTools.
There is a free logging package from MSFT for viewing this information using SQL Reporting Services. Google it.
Doing it with the logs is only a good idea if it's internal - I'd use Google Analytics for anything on the internet.
I have been using Summary, which is paid software, for years, and I love it. The cost of updates is getting to me, though, and paying for an update just to get user-agent string updates is getting bothersome. Not that there aren't other fixes; I just tend not to need them.
Anyone care to share how Summary compares to Analog, if they have used both?
Look at the XpoLog log analysis platform for web application servers and web server logs. It is a log management and analysis platform that integrates with web server logs, creates reports, provides search and a log viewer, and also monitors for problems.