Understanding EclEmma results

I am trying to understand the concept of code coverage, and I am a complete novice to this topic.
I am using EclEmma to measure the code coverage of an open-source project. Can somebody help me understand what important insights I should draw from the snapshot below?

Code coverage is a metric that expresses which portion of the (application) code gets executed when you run your test cases. However, it is just a measure of completeness; it provides no information about how thoroughly the executed code was tested by those test cases.
In your screenshot, the third line of the table (src/main/java) is the relevant one. It shows that the application code consists of 3,846 (bytecode) instructions; of these, roughly 67% were executed (presumably by the automated test cases residing in src/test/java). This means that the test cases cannot reveal any fault in one third of the application, because they do not touch that code at all. The remaining two thirds of the code are executed by at least one test case. Test cases can reveal faults in this code; how effectively they do so depends on their input data and oracles.
Note that it is often not possible or sensible to achieve 100% coverage.
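To make that concrete, here is a minimal, hypothetical Java example (class and test names are made up). The test exercises only one branch of the method, so a coverage run with EclEmma would report the other branch's instructions as missed, even though the test itself passes:

    // Discount.java -- hypothetical production code under src/main/java
    public class Discount {
        // Orders of 10 or more items get 10% off.
        public double apply(double price, int quantity) {
            if (quantity >= 10) {
                return price * 0.9;   // never executed by the test below: reported as missed
            }
            return price;             // executed: reported as covered
        }
    }

    // DiscountTest.java -- hypothetical test under src/test/java
    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class DiscountTest {
        @Test
        public void smallOrderIsNotDiscounted() {
            assertEquals(5.0, new Discount().apply(5.0, 1), 0.0001);
        }
    }

Whether a line is covered or not says nothing about whether the assertion checking it is a good one; that is the "thoroughness" caveat mentioned above.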

Related

How to shift development of an existing MVC3 app to a TDD approach?

I have a fairly large MVC3 application, of which I have developed a small first phase without writing any unit tests, especially ones targeted at detecting things like regressions caused by refactoring. I know it's a bit irresponsible to say this, but tests haven't really been necessary so far, with very simple CRUD operations; however, I would like to move toward a TDD approach going forward.
I have basically completed phase 1, where I have written actions and views where members can register as authors and create course modules. Now I have more complex phases to implement where consumers of the courses and their trainees must register and complete courses, with academic progress tracking, author feedback and financial implications. I feel it would be unwise to proceed without a solid unit testing strategy, and based on past experience I feel TDD would be quite suitable to my future development efforts here.
Are there any known procedures for 'converting' a development effort to TDD, and for introducing unit tests to already written code? I don't need kindergarten level step by step stuff, but general strategic guidance.
BTW, I have included the web-development and MVC tags on this question because I believe these fields of development can have a significant influence on the unit testing requirements of project artefacts. If you disagree and wish to remove any of them, please be so kind as to leave a comment saying why.
I don't know of any existing procedures, but I can highlight what I usually do.
My approach for an existing system would be to attempt to write tests first to reproduce defects, and then modify the code to fix them. I say attempt, because not everything is reproducible in a cost-effective manner. For example, trying to write a test to reproduce an issue related to CSS3 transitions on a very specific version of IE may be cool, but it's not a good use of your time. I usually give myself a deadline for writing such tests. The only exception may be features that are highly valued or difficult to test manually (like an API).
For any new feature you add, first write the test (as if the class under test were an API), verify that the test fails, and then implement the feature to satisfy the test. Repeat. When you're done with the feature, run it through something like Pex. It will often highlight things you never thought of. Be sensible about which issues to fix.
For existing code, I use code coverage to help me find features I do not have tests for. I comment out the code, write the test (which fails), uncomment the code, verify the test passes, and repeat. If needed, I'll refactor the code to simplify testing. Pex can also help here.
Pay close attention to pain points, as they highlight areas that should be refactored. For example, if you have a controller that uses ObjectContext/IDbCommand/IDbConnection directly for data access, you may find that you need a database to be configured just to test business conditions. That is my hint that I need an interface to a data access layer so I can mock it and simulate those business conditions in my controller. The same goes for registry access and so forth.
But be sensible about what you write tests for. The value of TDD diminishes at some point, and it may actually cost more to write those tests than to hand the work to someone in India to test manually.
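The original context is ASP.NET MVC, but the "interface to a data access layer" idea is language-agnostic. Here is a minimal Java-flavoured sketch (all names are invented) of what that seam buys you in a unit test:

    // Hypothetical data access seam, extracted so the controller no longer
    // talks to the database directly.
    interface CourseRepository {
        boolean hasAuthor(String memberId);
    }

    // The controller depends only on the interface.
    class CourseController {
        private final CourseRepository repository;

        CourseController(CourseRepository repository) {
            this.repository = repository;
        }

        String register(String memberId) {
            // Business rule under test: an existing author cannot register twice.
            return repository.hasAuthor(memberId) ? "already-registered" : "registered";
        }
    }

    // The business condition is simulated with a hand-rolled fake;
    // no database needs to be configured.
    public class CourseControllerTest {
        @org.junit.Test
        public void existingAuthorCannotRegisterTwice() {
            CourseRepository fake = memberId -> true;   // pretend the member is already an author
            org.junit.Assert.assertEquals("already-registered",
                    new CourseController(fake).register("m-1"));
        }
    }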

How to assure I am testing everything, and that I have all the features and only those features, for old code?

We are running a pretty big website, and we have some critical legacy code there that we want to have well covered.
At the same time, we would like to have a report of the features we currently support and cover. We also want to be sure we really cover every possible corner case. Some code paths are critical and will need many more tests even after achieving 100% coverage.
As we are already using RSpec, and RSpec has "feature" and "scenario" keywords, we tried to make the list using RSpec rather than going for Cucumber, but I think this question applies to any testing tool.
We want something like this:
feature "each advertisement will be shown a specified % of impressions"
scenario "As ..."
This feature is minimal from the point of view of managers but huge in the code. It involves a backend tool, a periodic task, logic in the models and views in backend and front end.
We tried to divide it like this:
feature "each creative will be shown a specified % of impressions"
context "configuration"
context "display"
scenario "..."
context "models"
it "should ..."
context "frontend"
context "display"
scenario "..."
context "models"
it "should ..."
Configuration takes place in another tool, display would contain integration tests, and models would contain unit tests.
I repeat myself, but the idea is to ensure that the feature is really finished (including building the configuration tool) and 100% tested.
But looking at this file, it is neither an integration test nor a unit test, and it does not even belong to any particular project.
There should definitely be a better way of managing this.
Any experiences, resources, or ideas you can share to guide us?
The scenario you're describing is a huge reason why BDD is so popular. It forces you to write code in a way that's easy to test. Having said that, you're obviously not going to go back and rewrite the entire legacy application. There are a few things you should consider though:
As you go through each section of the application, you should ask yourself 'Will it be harder to refactor than to write tests for this?'. Sometimes refactoring before writing tests just cannot be avoided.
Testing isn't about 100% coverage, it's about 100% confidence. As you mentioned, you plan on writing more tests even when you have 100% coverage. This is because you're going for confidence. Once you're confident in a piece of code, move on. You can always come back to it at a later time.
From my experience, Cucumber is easier for tests that cover a large portion of the application. I think the reason for this is that writing out the tests in plain English makes you think of things you wouldn't have otherwise. It also allows you to focus on the behavior instead of the code and can make refactoring a less daunting task.
You don't really get much out of adding tests to existing code if you never touch that code again. Start by testing the code you want to change (i.e. refactor) first.
I also recommend the book Rails Test Prescriptions, specifically one of the last chapters called "Testing a Legacy Application".

Overall code coverage for specific developers in TFS

I was just wondering if there is a way to analyse TFS to figure out code coverage results for a specific developer.
Say I want to determine the code coverage statistics for one of the developers on my team.
It’s not supported out of the box.
I don't think it will be in the future, since it is a very hard measurement:
Developers write source code, and source code within a project is usually divided by functionality, not by developer. A code coverage result is the coverage of a given assembly as exercised in a test run. To attribute it to people, we would therefore need to analyse coverage per code line and relate each code line back to a given developer's changesets. And regardless of whether the unit test DLL is instrumented, both the code in the unit tests and the code being exercised are involved in the coverage result. So which covered code line counts towards a specific developer: the lines in the unit test, a line in a shared library, a line that was changed by four developers (is the coverage shared)? And there are other issues besides.
But why are you asking this question? If you are trying to improve the quality of a specific individual's work, code reviews and pair programming would be a more effective approach. Even if this measurement were possible, terrorizing individuals about their code coverage would only result in a dysfunctional measurement. The source code of a given product is shared by a team, therefore the team is responsible for the coverage. Require your team to take that responsibility.
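To make the attribution problem concrete, here is a naive, hypothetical sketch (this is not a TFS API; both inputs are data TFS does not hand you in this form). It has to decide arbitrarily that the last author "owns" a line and that test code counts like production code, which is exactly the ambiguity described above:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    // Naive per-developer coverage attribution, for illustration only.
    public class PerDeveloperCoverage {
        public static Map<String, Double> attribute(Map<Integer, String> lastAuthorByLine,
                                                    Set<Integer> coveredLines) {
            Map<String, Integer> total = new HashMap<>();
            Map<String, Integer> covered = new HashMap<>();
            for (Map.Entry<Integer, String> entry : lastAuthorByLine.entrySet()) {
                String author = entry.getValue();        // ignores earlier authors of the same line
                total.merge(author, 1, Integer::sum);
                if (coveredLines.contains(entry.getKey())) {
                    covered.merge(author, 1, Integer::sum);
                }
            }
            Map<String, Double> percentByAuthor = new HashMap<>();
            total.forEach((author, lines) ->
                    percentByAuthor.put(author, 100.0 * covered.getOrDefault(author, 0) / lines));
            return percentByAuthor;
        }
    }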

Minimum Acceptable Code Coverage Numbers in the real world [duplicate]

Possible Duplicate:
What is a reasonable code coverage % for unit tests (and why)?
I am in the middle of putting together some guidelines around unit test code coverage, and I want to specify a number that really makes sense. It's easy to repeat the 100% mantra that I see all over the internet without considering the cost-benefit analysis and the point at which diminishing returns set in.
I am soliciting comments from people who have actually reported code coverage on real-life, medium/large-sized projects. What percentages were you seeing? How much is too much? I really want a balanced figure that will help developers produce high-quality code. Is 65% coverage too low to expect? Is 80% too high?
When you mix code coverage with cyclomatic complexity, you can use the CRAP metric.
From artima.com:
Individual Method Interpretation:
Bob Evans and I have looked at a lot of examples (using our code and many open source projects) and listened to a LOT of opinions. After much debate, we decided to INITIALLY use a CRAP score of 30 as the threshold for crappiness. Below is a table that shows the amount of test coverage required to stay below the CRAP threshold based on the complexity of a method:
    Method’s Cyclomatic Complexity    % of coverage required to be below CRAPpy threshold
    ------------------------------    ---------------------------------------------------
    0 – 5                             0%
    10                                42%
    15                                57%
    20                                71%
    25                                80%
    30                                100%
    31+                               No amount of testing will keep methods this complex
                                      out of CRAP territory.
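For reference, the formula behind that table is CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m), where comp(m) is the method's cyclomatic complexity and cov(m) its test coverage in percent. A minimal Java translation, purely for illustration (the class name is made up):

    // Direct translation of the CRAP formula, for illustration only.
    public final class Crap {
        /**
         * @param complexity      cyclomatic complexity of the method (comp(m))
         * @param coveragePercent test coverage of the method in percent (cov(m), 0..100)
         * @return the CRAP score; values above 30 count as "crappy" per the authors
         */
        public static double score(int complexity, double coveragePercent) {
            double uncovered = 1.0 - coveragePercent / 100.0;
            return complexity * complexity * Math.pow(uncovered, 3) + complexity;
        }

        public static void main(String[] args) {
            System.out.println(score(25, 80));   // ~30: right at the threshold, matching the table
            System.out.println(score(31, 100));  // 31: above 30 even with full coverage
        }
    }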
No amount of code coverage is going to guarantee "high quality code" by itself.
From the comments...
It's definitely too lax to give simple methods a pass on coverage. What you will likely find when implementing this on existing code is that coverage rises as you refactor those ugly methods (coverage should rise; otherwise you're refactoring dangerously).
The 0-5's are essentially low-hanging fruit and the ROI isn't all that great. That being said, those methods are wonderful for learning TDD because they're often very easy to test.
Personally I would go for 80% coverage, but of course this is only relative... I haven't achieved it myself yet either.
Currently we have very high coverage (99%) on our utility classes, which is good because bugs in there will haunt you throughout your whole application.
Coverage is mediocre for most GUIs, because writing tests for them is hard and time-consuming, so we often settle for opening the GUI in a unit test and, if no error occurs, closing it automatically.
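That kind of GUI smoke test can be as small as this hypothetical JUnit/Swing example (MainFrame stands in for whatever top-level window the application has):

    import javax.swing.JFrame;
    import javax.swing.SwingUtilities;
    import org.junit.Test;

    public class MainFrameSmokeTest {
        @Test
        public void frameOpensAndClosesWithoutThrowing() throws Exception {
            SwingUtilities.invokeAndWait(() -> {
                JFrame frame = new MainFrame();   // MainFrame is a made-up application window class
                frame.pack();
                frame.setVisible(true);           // any exception during construction or layout fails the test
                frame.dispose();                  // close it again automatically
            });
        }
    }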
I don't think you can really have too much code coverage. I think you need to determine what code runs the "regular course of business" in the application and have that completely covered. For the remaining code that isn't in the normal course of business, start whittling it down by covering the most critical parts first. Abnormal paths that aren't terribly important offer little gain from good code coverage.
The only correct answer is that you test as much as you can afford. Obviously, this is an axiom across every engineering project.
Beyond that, it's all subjective and highly dependent on the project at hand. For example, the flight control systems Lockheed puts out had better be tested to more than 80%, but 80% may suffice for my GUI front end to an XML viewer.
Typically, you break down the cost of running tests with your team. In theory, the answer to "how much testing can we afford?" is customarily expressed in man-hours.
After this, you examine your modules and determine which parts of the code have the most time spent in them. Each critical module should be covered at least once. From there, you allocate tests in proportion to how much time specific modules spend executing. So in the end, there is no hard "X% covered" number.
John Musa has a really interesting book on the subject.
On the program that I'm on (~500k SLOC), we use 100%. That is a program requirement to proceed to the next phase of testing. Here are the reasons behind it:
1. The program is used in some safety-critical situations, and you don't want any off-nominal conditions to go untested.
2. If you aren't hitting 100%, then you either wrote code that isn't necessary, and are hence wasting money, or you aren't testing your off-nominal paths completely. See #1.
3. Your unit test scenarios should naturally get you close to 100%, regardless of the actual code coverage metric you're using. If someone is at 95% based solely on their off-nominal scenarios, requiring 100% isn't onerous (and, again, you should be asking why you aren't at 100% then. See #2).
Your mileage will certainly vary. If you aren't working on a mission- or safety-critical application, then you probably don't need to worry about your code coverage as much - however, I'd have to ask again: why are you writing code that you don't need?
[Edit]
Based on the comments I've received below, I should clarify. The program guideline is 100% code coverage for unit tests. That development process requirement can be waived if, for a technical reason, a branch of code cannot be reached (a protected default constructor that is never called, etc.). Approval is usually granted by an external, independent part of the organization (go go SQA).
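A classic example of a branch that gets waived (hypothetical, but representative) is a constructor that exists only to prevent instantiation; no test can legitimately reach its body, so a coverage tool will always report it as missed:

    // Utility class: the private constructor is unreachable by design,
    // which is exactly the kind of line a coverage waiver documents.
    public final class MathUtils {
        private MathUtils() {
            throw new AssertionError("not meant to be instantiated");
        }

        public static int clamp(int value, int min, int max) {
            return Math.max(min, Math.min(max, value));
        }
    }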
At the integration / system test level, code coverage becomes moot, as you start looking at requirements coverage. That's a different ball of yarn altogether.
The original question was looking for real-world situations: I agree that not all (perhaps not even most) real-world situations warrant 100% code coverage at the unit test level, but there are certainly cases and programs that do. And it is a habit of some developers to write code that they don't need, which then ends up untested. This becomes a maintenance nightmare, as a later developer will call methods that were never "meant" to be used (or that were included because someone thought they were a "good" idea). Shooting for 100% coverage forces you to answer the question "why did I write this?"
It really depends. I know a lot of software with 0% coverage, and a lot with single-digit coverage. The main question is what is really needed, and wanted, in financial terms.

Is there a way to determine code coverage without running the code?

I am not asking about the static code analysis provided by StyleCop or FxCop. Both have a different purpose and serve it well. I am asking whether there is a way to find the code coverage of your user control or sub-module. For example, you have an application which uses helper classes in a separate assembly. To measure unit testing code coverage, we need to run the application and check it with NCover or a similar tool.
My requirement is: without running the application, is there any way to find the code coverage of the helper classes or similar assemblies?
See Static Estimation for Test Coverage for a technique that estimates coverage without executing the source code.
The basic idea is to compute a program slice for each test case, and then "count" what the slice enumerates. A (forward) slice is effectively that part of a program that you can reach from a specific starting point in the code, in this case, the test code.
While the technical paper above is hard to get if you're not an ACM member [or you didn't attend the conference where it was presented :], there's a slide presentation here.
Of course, running this static estimator only tells you (roughly) what code will be exercised. It doesn't substitute for actually running the tests, and verifying that they pass!
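As a toy illustration of the idea (this is not the tool from the paper), a forward slice rooted at a test can only contain code that is statically reachable from it:

    // Hypothetical code under test.
    class OrderService {
        public int totalWithTax(int total) {
            return applyTax(total);        // statically reachable from the test below
        }
        private int applyTax(int total) {
            return total + total / 10;     // also in the slice: called transitively by totalWithTax
        }
        public int totalWithDiscount(int total) {
            return total - 5;              // not reachable from the test, so estimated as uncovered
        }
    }

    public class OrderServiceTest {
        @org.junit.Test
        public void addsTenPercentTax() {
            // The slice rooted here contains totalWithTax and applyTax, so a static
            // estimator would count those as exercised without running anything.
            org.junit.Assert.assertEquals(110, new OrderService().totalWithTax(100));
        }
    }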
In general, the answer is no. This is equivalent to the halting problem, which is undecidable.
There are (research) tools based on abstract interpretation or model checking that can show coverage properties without execution, for subsets of a language. See, e.g.,
"Analyzing Functional Coverage in Bounded Model Checking", Grosse, D. Kuhne, U. Drechsler, R. 2008
In general, yes, there are approaches, but they're specialized, and may require some formal methods experience. This kind of stuff is still cutting edge research.
I would say no, with the exception of 'dead code', which a compiler can determine.
My definition of code coverage is a result which indicates how many times each line of code is run in your program: which, of course, means running the program. The determining factor here is usually the data values passing through the program, which determine the paths of execution taken at conditionals. A static analysis, like a compiler, could deduce lines of code that cannot run under any conditions.
An example here is if your program uses a third-party library, but there is a bug in the library. If your program never uses those parts of the library, or the data you send to the library causes it to avoid the bug, then you won't be affected.
You could write a program that, by reflection, assumes that all conditionals will be taken, and follows all function calls, through all derived classes, but I'm not sure what this will tell you. It certainly can't tell you whether or not there are any bugs in the lines of code covered.
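A tiny illustration of that difference (hypothetical code): the first branch below can be ruled out statically, but whether the second is ever covered depends entirely on the data callers pass in at runtime:

    public class DeadCodeExample {
        static final boolean FEATURE_ENABLED = false;

        int compute(int input) {
            if (FEATURE_ENABLED) {
                return -1;            // provably dead: a compiler or static analyser can flag this
            }
            if (input > 100) {
                return input / 2;     // reachable in principle, but covered only if some test passes input > 100
            }
            return input;
        }
    }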
Coverity Static Analysis is a tool that can identify many security flaws in a program. It can also identify dead code and can be used to help satisfy testing regulations such as DO-178B, which requires developers to demonstrate that all code can be executed.
If you are using Visual Studio, you can first run 'Analyze Code Coverage', then export the code coverage results from the Code Coverage Results window.
Later you can import the coverage result file back into Visual Studio.
