How is coverage percentage calculated when branch coverage is enabled in coverage.py

I am using the coverage.py tool to get coverage of Python code. If I use the command without the --branch flag, like below,
coverage run test_cmd
I get the coverage report like this,
Name            Stmts   Miss  Cover
------------------------------------
/path/file.py       9      2    78%
From this I understand that the Cover percentage is derived like this:
cover = (covered statements / total statements) * 100 = ((9 - 2) / 9) * 100 = 77.77%
But when I run coverage with the --branch flag enabled, like this:
coverage run --branch test_cmd
I get the coverage report like this,
Name            Stmts   Miss  Branch  BrPart  Cover
----------------------------------------------------
/path/file.py       9      2       2       1    73%
From this report I am not able to understand the formula used to get Cover = 73%.
How is this number derived, and is it the correct value for code coverage?

While you probably shouldn't worry too much about the exact numbers, here is how they are calculated:
Reading from the output you gave, we have:
n_statements = 9
n_missing = 2
n_branches = 2
n_missing_branches = 1
Then it is calculated as follows (I have simplified the syntax a bit; in the real code everything is hidden behind properties):
https://github.com/nedbat/coveragepy/blob/a9d582a47b41068d2a08ecf6c46bb10218cb5eec/coverage/results.py#L205-L207
n_executed = n_statements - n_missing
https://github.com/nedbat/coveragepy/blob/8a5049e0fe6b717396f2af14b38d67555f9c51ba/coverage/results.py#L210-L212
n_executed_branches = n_branches - n_missing_branches
https://github.com/nedbat/coveragepy/blob/8a5049e0fe6b717396f2af14b38d67555f9c51ba/coverage/results.py#L259-L263
numerator = n_executed + n_executed_branches
denominator = n_statements + n_branches
https://github.com/nedbat/coveragepy/blob/master/coverage/results.py#L215-L222
pc_covered = (100.0 * numerator) / denominator
Now put all of this in a script, add print(pc_covered) and run it:
72.72727272727273
It is then rounded of course, so there you have your 73%.
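Spelled out as a small standalone script (using the numbers from the report above; the real code hides these values behind properties, this is just the arithmetic):

n_statements = 9
n_missing = 2
n_branches = 2
n_missing_branches = 1

# statements and branch destinations that were actually executed
n_executed = n_statements - n_missing                  # 7
n_executed_branches = n_branches - n_missing_branches  # 1

numerator = n_executed + n_executed_branches           # 8
denominator = n_statements + n_branches                # 11

pc_covered = (100.0 * numerator) / denominator
print(pc_covered)  # 72.72727272727273, reported as 73%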

Related

JQL for this condition

How would the JQL look for this condition?
Generate a report of all HIGH severity JIRA tickets for a project with key X (or name X) that were created from 9 PM EST to 12 AM EST, from the start of the year.
I tried something like:
Project = X AND Severity = "HIGH" AND created > "2015/01/01 21:00" and created < "2015/09/09"
but I need only those issues that are created between 9 PM and 12 AM every day, from the beginning of the year.
Any ideas would be greatly appreciated.
Unfortunately, there seems to be no way to get the hour from the created date in plain JQL, but you can work around it.
My two ideas are:
Find these tickets directly in the Jira database (if you have access) - it should be very easy, since there are functions like hour() (MySQL) or truncate (Postgres).
Prepare the JQL filter using a generator script - it is definitely less convenient, but possible even when you have no access to the database. The worst part is that Jira's filter field accepts only 2000 characters, so you would need to run the generated filter in several chunks.
A little crazy, but OK - it works. So what's the idea? The idea is to use the startOfYear() JQL function and its offset version. For example:
created >= startOfYear(21h) and created < startOfYear(24h)
will give you all tickets created from 1 Jan 21:00 to 2 Jan 00:00.
Then you can use this Python script:
# Generates a set of JQL queries, each ORing together the 21:00-24:00
# window of consecutive days, in chunks small enough for Jira's filter field.
step = 27
maxDay = 1
while maxDay <= 365 + step:
    maxDay += step
    output = "project = X and Severity = HIGH and ("
    for i in range(maxDay - step, maxDay):
        output += " (created >= startOfYear(" + str((i - 1) * 24 + 21) + "h) and created < startOfYear(" + str(i * 24) + "h)) or"
    output = output[:-3]  # drop the trailing " or"
    output += ")"
    print(output)
    print()
which will generate a set of JQL queries for you to copy-paste and execute (there are actually 15 of them). Each query covers a chunk of 27 days, because of the 2000-character limit on Jira's filter input.
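For illustration, the first query generated by the script (with the placeholder project key X from the question) begins like this, truncated:
project = X and Severity = HIGH and ( (created >= startOfYear(21h) and created < startOfYear(24h)) or (created >= startOfYear(45h) and created < startOfYear(48h)) or (created >= startOfYear(69h) and created < startOfYear(72h)) or ... )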
I fixed this issue by writing a custom JQL function and then using that function in a JQL query, which fits our requirements well:
created >= "2015-01-01" and created <= "2015-12-31" and issue in getIssuesForTimeRange("21", "24")

Lcov inconsistent coverage

I started using lcov about a month back. The coverage count seems inconsistent: the first run reported about 75% line coverage, whereas the second run reported only 19%. The test suite used was the same for both runs. I see the following warning during lcov --remove. Any suggestions?
lcov: WARNING: negative counts found in tracefile all.info
Is this something to worry about?
The same known issue is reported on GitHub.
Replacing all counts of -1 in the output with 0 (e.g. with sed -i -e 's/,-1$/,0/g' <outputfile>) causes the warning to disappear from the lcov and genhtml output while still producing the correct coverage report.
More importantly (at least for me), submitting the file with the counts set to 0 instead of -1 to codecov.io results in the results being parsed correctly and the coverage information being available through codecov.io.
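If you prefer not to use sed, a minimal Python sketch of the same fix (assuming the tracefile is called all.info, as in the warning above) would be:

import re

# Rewrite all.info, turning every trailing ",-1" count into ",0"
# (the same effect as the sed one-liner above).
path = "all.info"
with open(path) as fh:
    lines = fh.read().splitlines()

fixed = [re.sub(r",-1$", ",0", line) for line in lines]

with open(path, "w") as fh:
    fh.write("\n".join(fixed) + "\n")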
Codecov also handles this kind of value error:
# Fix negative counts
$count = $2 < 0 ? 0 : $2;
if ($2 < 0)
{
    $negative = 1;
}
Here are some other related fixes:
Fix Undocumented Value
Remove fix for negative coverage counts
See this bug report: https://github.com/psycofdj/coverxygen/issues/6

Bad Result And Evaluation From Giza++

I have tried to work with GIZA++ on Windows (using the Cygwin compiler).
I used these commands:
//Suppose source language is French and target language is English
plain2snt.out FrenchCorpus.f EnglishCorpus.e
mkcls -c30 -n20 -pFrenchCorpus.f -VFrenchCorpus.f.vcb.classes opt
mkcls -c30 -n20 -pEnglishCorpus.e -VEnglishCorpus.e.vcb.classes opt
snt2cooc.out FrenchCorpus.f.vcb EnglishCorpus.e.vcb FrenchCorpus.f_EnglishCorpus.e.snt >courpuscooc.cooc
GIZA++ -S FrenchCorpus.f.vcb -T EnglishCorpus.e.vcb -C FrenchCorpus.f_EnglishCorpus.e.snt -m1 100 -m2 30 -mh 30 -m3 30 -m4 30 -m5 30 -p1 0.95 -CoocurrenceFile courpuscooc.cooc -o dictionary
But after getting the output files from GIZA++ and evaluating the output, I observed that the results were very bad.
My evaluation result was:
RECALL = 0.0889
PRECISION = 0.0990
F_MEASURE = 0.0937
AER = 0.9035
Does anybody know the reason? Could it be that I have forgotten some parameters, or that I should change some of them?
In other words: first I wanted to train GIZA++ on a huge amount of data, then test it on a small corpus and compare its result to the desired alignment (gold standard), but I couldn't find any document or useful page on the web.
Can you point me to a useful document?
Therefore I ran it on a small corpus (447 sentences) and compared the result to the desired alignment. Do you think this is the right way?
I also changed my command as follows and got a better result, but it's still not good:
GIZA++ -S testlowsf.f.vcb -T testlowde.e.vcb -C testlowsf.f_testlowde.e.snt -m1 5 -m2 0 -mh 5 -m3 5 -m4 0 -CoocurrenceFile inputcooc.cooc -o dictionary -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 0 -nsmooth 4 -onlyaldumps 1 -p0 0.999 -diagonal yes -final yes
Result of evaluation:
// Suppose A is the result of GIZA++ and G is the gold standard. As and Gs are the S (sure) links in the A and G files; Ap and Gp are the P (possible) links in the A and G files.
RECALL = |As ∩ Gs| / |Gs| = 0.6295
PRECISION = |Ap ∩ Gp| / |A| = 0.1090
F_MEASURE = (2 * PRECISION * RECALL) / (RECALL + PRECISION) = 0.1859
AER = 1 - ((|As ∩ Gs| + |Ap ∩ Gp|) / (|A| + |Gs|)) = 0.7425
Do you know the reason?
Where did you get those parameters? 100 iterations of model1?! Well, if you actually manage to run this, I strongly suspect that you have a very small parallel corpus. If so, you should consider adding more parallel data in training. And how exactly do you calculate the recall and precision measures?
EDIT:
With less than 500 sentences you're unlikely to get any reasonable performance. The usual way to do it is to find a larger (unaligned) parallel corpus, run GIZA++ on both together, and then evaluate the small part for which you have the manual alignments. Check Europarl or MultiUN; these are freely available corpora, and both contain a relatively large amount of English-French parallel data. Instructions on preparing the data can be found on their websites.
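For reference, here is a minimal Python sketch of the alignment metrics discussed above, assuming the system output A, the sure gold links S and the possible gold links P are each given as sets of (source index, target index) pairs; the example data at the bottom is made up purely for illustration:

def alignment_scores(A, S, P):
    # A: system links, S: sure gold links, P: possible gold links (S is a subset of P)
    a_and_s = len(A & S)
    a_and_p = len(A & P)
    recall = a_and_s / len(S)
    precision = a_and_p / len(A)
    f_measure = 2 * precision * recall / (precision + recall)
    aer = 1 - (a_and_s + a_and_p) / (len(A) + len(S))
    return recall, precision, f_measure, aer

# Tiny made-up example
S = {(1, 1), (2, 2)}
P = S | {(3, 2)}
A = {(1, 1), (3, 2), (4, 4)}
print(alignment_scores(A, S, P))  # (0.5, 0.667, 0.571, 0.4)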

What is the difference between code coverage and line coverage in sonar

I know what the difference is between line and branch coverage, but what is the difference between code coverage and line coverage? Is the former instruction coverage?
Coverage is a subtle ;-) mix of line and branch coverage.
You can find the formula on our metric description page:
coverage = (CT + CF + LC)/(2*B + EL)
where
CT - branches that evaluated to "true" at least once
CF - branches that evaluated to "false" at least once
LC - lines covered (lines_to_cover - uncovered_lines)
B - total number of branches (2*B = conditions_to_cover)
EL - total number of executable lines (lines_to_cover)
To expand on the answer, you can only query Sonar for these metrics:
conditions_to_cover
uncovered_conditions
lines_to_cover
uncovered_lines
And then you can convert to the terms above using these equations:
CT + CF = conditions_to_cover - uncovered_conditions
2*B = conditions_to_cover
LC = lines_to_cover - uncovered_lines
EL = lines_to_cover
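Putting the formula and the conversion together, a minimal sketch in Python (the metric values below are made-up example numbers, not taken from a real project):

# Metrics as queried from Sonar (made-up example values)
conditions_to_cover = 10    # 2*B
uncovered_conditions = 3
lines_to_cover = 100        # EL
uncovered_lines = 20

covered_conditions = conditions_to_cover - uncovered_conditions  # CT + CF
covered_lines = lines_to_cover - uncovered_lines                 # LC

coverage = 100.0 * (covered_conditions + covered_lines) / (conditions_to_cover + lines_to_cover)
print(coverage)  # 79.09...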
You can use the Sonar Drilldown or REST API to get the metric values above:
http://my.sonar.com/drilldown/measures/My-Project-Name?metric=line_coverage
http://my.sonar.com/api/resources?resource=55555&metrics=ncloc,conditions_to_cover,uncovered_conditions,lines_to_cover,uncovered_lines,coverage,line_coverage,branch_coverage,it_conditions_to_cover,it_uncovered_conditions,it_lines_to_cover,it_uncovered_lines,it_coverage,it_line_coverage,it_branch_coverage,overall_conditions_to_cover,overall_uncovered_conditions,overall_lines_to_cover,overall_uncovered_lines,overall_coverage,overall_line_coverage,overall_branch_coverage
This blog post has additional information: http://sizustech.blogspot.com/2015/10/making-sense-of-sonar-qube-stats-like.html

How do I get a random number in template toolkit?

I want to get a random number using template toolkit. It doesn't have to be particularly random. How do I do it?
Hmm, you might have issues if you don't have (or cannot import) Slash::Test.
From a "vanilla" installation of TT, you can simply use the Math plugin:
USE Math;
GET Math.rand; # outputs a random number from 0 to 1
See the Math plugin page in the Template Toolkit manual for more information on the plugin and its various methods.
Update: Math.rand requires a parameter. Therefore to get a random number from 0 to 1, use:
GET Math.rand(1);
From this post at Slashcode:
[slash#yaz slash]$ perl -MSlash::Test -leDisplay
[%
digits = [ 0 .. 9 ];
anumber = digits.rand _ digits.rand _ digits.rand;
anumber;
%]
^D
769
