I know what the difference is between line and branch coverage, but what is the difference between code coverage and line coverage? Is the former instruction coverage?
Coverage is a subtle ;-) mix of line and branch coverage.
You can find the formula on our metric description page:
coverage = (CT + CF + LC)/(2*B + EL)
where
CT - branches that evaluated to "true" at least once
CF - branches that evaluated to "false" at least once
LC - lines covered (lines_to_cover - uncovered_lines)
B - total number of branches (2*B = conditions_to_cover)
EL - total number of executable lines (lines_to_cover)
To expand on the answer, you can only query sonar for these terms:
conditions_to_cover
uncovered_conditions
lines_to_cover
uncovered_lines
And then you can convert to the terms above using these equations:
CT + CF = conditions_to_cover - uncovered_conditions
2*B = conditions_to_cover
LC = lines_to_cover - uncovered_lines
EL = lines_to_cover
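For example, here is a minimal Python sketch of that conversion (the four metric values are hypothetical placeholders; in practice you would fetch them, e.g. via the REST API shown below):

# Hypothetical values queried from Sonar; substitute your own.
conditions_to_cover = 10
uncovered_conditions = 3
lines_to_cover = 50
uncovered_lines = 5

# Convert to the terms of the coverage formula above.
ct_plus_cf = conditions_to_cover - uncovered_conditions  # CT + CF
two_b = conditions_to_cover                              # 2*B
lc = lines_to_cover - uncovered_lines                    # LC
el = lines_to_cover                                      # EL

coverage = 100.0 * (ct_plus_cf + lc) / (two_b + el)
print(round(coverage, 1))  # 86.7 for these sample numbers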
You can use the Sonar Drilldown or REST API to get the metric values above:
http://my.sonar.com/drilldown/measures/My-Project-Name?metric=line_coverage
http://my.sonar.com/api/resources?resource=55555&metrics=ncloc,conditions_to_cover,uncovered_conditions,lines_to_cover,uncovered_lines,coverage,line_coverage,branch_coverage,it_conditions_to_cover,it_uncovered_conditions,it_lines_to_cover,it_uncovered_lines,it_coverage,it_line_coverage,it_branch_coverage,overall_conditions_to_cover,overall_uncovered_conditions,overall_lines_to_cover,overall_uncovered_lines,overall_coverage,overall_line_coverage,overall_branch_coverage
This blog post has additional information: http://sizustech.blogspot.com/2015/10/making-sense-of-sonar-qube-stats-like.html
I'm trying to refactor if-else branches and am unable to generate accurate Replacements for it. I try to fetch nodes using:
clang::IfStmt *ifstmt = const_cast<clang::IfStmt *>(
    result.Nodes.getNodeAs<clang::IfStmt>("if_simple_else_bind_name"));
and then calculate its source range using
ifstmt->getSourceRange()
However, I find that using this SourceRange to calculate the start and end of the branch sometimes misses the trailing semicolon in cases like this:
if (cond) {
    return a;
}
else
    return b; <-- here
How do I find the correct source range, and then generate the correct replacement that rewrites the whole branch, regardless of whether it contains braces or a single statement?
Any help would be highly appreciated!
I am using the coverage.py tool to get coverage of Python code. If I use the command without the --branch flag, like below,
coverage run test_cmd
I get the coverage report like this,
Name            Stmts   Miss  Cover
-----------------------------------
/path/file.py       9      2    78%
From this I understand that the Cover percentage is derived as follows:
cover = (statements covered / total statements) * 100 = ((9 - 2) / 9) * 100 = 77.77%
But when I run coverage with the --branch flag enabled, like this,
coverage run --branch test_cmd
I get the coverage report like this,
Name            Stmts   Miss  Branch  BrPart  Cover
---------------------------------------------------
/path/file.py       9      2       2       1    73%
From this report I am not able to understand the formula used to get Cover = 73%.
Where does this number come from, and is it the correct value for code coverage?
While you probably shouldn't worry too much about the exact numbers, here is how they are calculated:
Reading from the output you gave, we have:
n_statements = 9
n_missing = 2
n_branches = 2
n_missing_branches = 1
Then the coverage is calculated as follows (I simplified the syntax a bit; in the real code everything is hidden behind properties):
https://github.com/nedbat/coveragepy/blob/a9d582a47b41068d2a08ecf6c46bb10218cb5eec/coverage/results.py#L205-L207
n_executed = n_statements - n_missing
https://github.com/nedbat/coveragepy/blob/8a5049e0fe6b717396f2af14b38d67555f9c51ba/coverage/results.py#L210-L212
n_executed_branches = n_branches - n_missing_branches
https://github.com/nedbat/coveragepy/blob/8a5049e0fe6b717396f2af14b38d67555f9c51ba/coverage/results.py#L259-L263
numerator = n_executed + n_executed_branches
denominator = n_statements + n_branches
https://github.com/nedbat/coveragepy/blob/master/coverage/results.py#L215-L222
pc_covered = (100.0 * numerator) / denominator
Now put all of this in a script, add print(pc_covered) and run it:
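A minimal version of that script, assembled from the lines above (the four input values are read straight off the report):

# Values from the coverage report above.
n_statements = 9
n_missing = 2
n_branches = 2
n_missing_branches = 1

n_executed = n_statements - n_missing                  # 7
n_executed_branches = n_branches - n_missing_branches  # 1

numerator = n_executed + n_executed_branches           # 8
denominator = n_statements + n_branches                # 11

pc_covered = (100.0 * numerator) / denominator
print(pc_covered)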
72.72727272727273
It is then rounded of course, so there you have your 73%.
So I would like to print polynomials in one variable (s) with one parameter (a), say
a·s^3 − s^2 - a^2·s − a + 1.
Sage always displays it in order of decreasing degree, but I would like to get something like
1 - a - a^2·s - s^2 + a·s^3
to export it to LaTeX. I can't figure out how to do this... Thanks in advance.
As an alternative to string manipulation, one can use the series expansion.
F = a*s^3 - s^2 - a^2*s - a + 1
F.series(s, F.degree(s)+1)
returns
(-a + 1) + (-a^2)*s + (-1)*s^2 + (a)*s^3
which appears to be what you wanted, save for some redundant parentheses.
This works because (a) a power series is printed from lowest-order to highest-order terms; (b) making the order of the remainder greater than the degree of the polynomial ensures that the series is just the polynomial itself.
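If the end goal is LaTeX export, one possible follow-up (a hedged sketch: I am assuming latex() renders the series object with the same ascending term order that str() shows above) is:

# Apply latex() to the series object rather than to F itself.
latex(F.series(s, F.degree(s) + 1))

The redundant parentheses noted above may still appear in the LaTeX output and would need to be cleaned up by hand.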
This is not easy, because the sort order is defined in Pynac, a fork of GiNaC, which Sage uses for its basic symbolic manipulation. However, depending on what you need, it is possible to do this programmatically:
sage: F = 1 + x + x^2
sage: "+".join(map(str,sorted([f for f in F.operands()],key=lambda exp:exp.degree(x))))
'1+x+x^2'
I don't know whether this sort of thing is powerful enough for your needs, though. You may have to traverse the "expression tree" quite a bit but at least your sort of example seems to work.
sage: F = a + a^2*x + x^2 - a*x^2
sage: "+".join(map(str,sorted([f for f in F.operands()],key=lambda exp:exp.degree(x))))
'a+a^2*x+-a*x^2+x^2'
Doing this in a short statement requires a number of Python tricks like this, which are very well worth learning if you are going to use Sage (or Numpy, or pandas, or ...) a fair amount.
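The same sorted-operands trick can also be aimed at LaTeX export by mapping latex instead of str (a sketch only; note that negative terms still produce the same "+-" artifact seen above):

sage: F = a + a^2*x + x^2 - a*x^2
sage: "+".join(map(latex, sorted(F.operands(), key=lambda exp: exp.degree(x))))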
How would the JQL look for this condition?
Generate a report of all HIGH severity JIRA tickets for a project with key X (or name X) that were created from 9 PM EST to 12 AM EST from the start of the year?
I tried something like :
Project = X AND Severity = "HIGH" AND created > "2015/01/01 21:00" and created < "2015/09/09",
but I need only those issues that were created between 9 PM and 12 AM every day, from the beginning of the year.
Any ideas would be greatly appreciated.
Unfortunately there seems to be no way to extract the hour from the created date, but you can work around it.
My two ideas are:
find these tickets directly in the Jira database (if you have access) - this should be very easy, since there are functions like HOUR() (MySQL) or date_trunc() (PostgreSQL)
prepare a JQL filter using a generator script - it is definitely less convenient, but possible even when you have no access to the database. The worst part is that the Jira filter field accepts only a 2000-character string, so you would need to run the generated filter in several chunks.
A little crazy, but OK - it works. So what's the idea? The idea is to use the startOfYear() JQL function and its offset version. For example:
created >= startOfYear(21h) and created < startOfYear(24h)
will give you all tickets from 1 Jan 21:00 to 2 Jan 00:00.
then you can use this Python script:
# Generates batched JQL queries, each covering 27 days of 21:00-24:00
# windows, to stay under Jira's 2000-character filter limit.
step = 27
maxDay = 1
while maxDay <= 365 + step:
    maxDay += step
    output = "project = X and Severity = HIGH and ("
    for i in range(maxDay - step, maxDay):
        # Window for day i: from (i-1)*24 + 21 hours to i*24 hours after the start of the year.
        output += " (created >= startOfYear(" + str((i - 1) * 24 + 21) + "h) and created < startOfYear(" + str(i * 24) + "h)) or"
    output = output[:-3]  # strip the trailing " or"
    output += ")"
    print(output)
    print()
which will generate a set of JQL queries to copy-paste and execute (it is actually 15 of them). Each query covers 27 days because of the 2000-character limit on filter input in Jira.
I fixed this issue by writing a custom JQL function and then using that function in a JQL query, which fits our requirements well:
created >= "2015-01-01" and created <= "2015-12-31" and issue in getIssuesForTimeRange("21", "24")
I have tried to work with GIZA++ on Windows (using the Cygwin compiler).
I used this code:
//Suppose source language is French and target language is English
plain2snt.out FrenchCorpus.f EnglishCorpus.e
mkcls -c30 -n20 -pFrenchCorpus.f -VFrenchCorpus.f.vcb.classes opt
mkcls -c30 -n20 -pEnglishCorpus.e -VEnglishCorpus.e.vcb.classes opt
snt2cooc.out FrenchCorpus.f.vcb EnglishCorpus.e.vcb FrenchCorpus.f_EnglishCorpus.e.snt >courpuscooc.cooc
GIZA++ -S FrenchCorpus.f.vcb -T EnglishCorpus.e.vcb -C FrenchCorpus.f_EnglishCorpus.e.snt -m1 100 -m2 30 -mh 30 -m3 30 -m4 30 -m5 30 -p1 0.95 -CoocurrenceFile courpuscooc.cooc -o dictionary
But after getting the output files from GIZA++ and evaluating the output, I observed that the results were very bad.
My evaluation result was:
RECALL = 0.0889
PRECISION = 0.0990
F_MEASURE = 0.0937
AER = 0.9035
Does anybody know the reason? Could it be that I have forgotten some parameters, or that I should change some of them?
In other words: first I wanted to train GIZA++ on a huge amount of data and then test it on a small corpus, comparing its result to the desired alignment (gold standard), but I couldn't find any document or useful page on the web about this.
Can you recommend a useful document?
Therefore I ran it on a small corpus (447 sentences) and compared the result to the desired alignment. Do you think this is the right way?
Also, I changed my command as follows and got a better result, but it's still not good:
GIZA++ -S testlowsf.f.vcb -T testlowde.e.vcb -C testlowsf.f_testlowde.e.snt -m1 5 -m2 0 -mh 5 -m3 5 -m4 0 -CoocurrenceFile inputcooc.cooc -o dictionary -model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 0 -nsmooth 4 -onlyaldumps 1 -p0 0.999 -diagonal yes -final yes
result of evaluation :
// Suppose A is the result of GIZA++ and G is the gold standard. As and Gs are the S (sure) links in the A and G files; Ap and Gp are the P (possible) links in the A and G files.
RECALL = |As ∩ Gs| / |Gs| = 0.6295
PRECISION = |Ap ∩ Gp| / |A| = 0.1090
FMEASURE = (2 * PRECISION * RECALL) / (RECALL + PRECISION) = 0.1859
AER = 1 - ((|As ∩ Gs| + |Ap ∩ Gp|) / (|A| + |S|)) = 0.7425
Do you know the reason?
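As a side note on the measures themselves, here is a small Python sketch of the sure/possible definitions those formulas follow. The link sets are hypothetical placeholders; the real ones come from the GIZA++ output file and the gold-standard file.

# Hypothetical alignment links as (source_position, target_position) pairs.
A = {(1, 1), (2, 3), (3, 2), (4, 4)}  # system alignment from GIZA++
S = {(1, 1), (2, 3), (4, 5)}          # gold-standard sure (S) links
P = S | {(3, 2), (4, 4)}              # gold-standard possible (P) links; P includes S

recall = len(A & S) / len(S)        # |A ∩ S| / |S|
precision = len(A & P) / len(A)     # |A ∩ P| / |A|
f_measure = 2 * precision * recall / (precision + recall)
aer = 1 - (len(A & S) + len(A & P)) / (len(A) + len(S))

print(recall, precision, f_measure, aer)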
Where did you get those parameters? 100 iterations of model1?! Well, if you actually manage to run this, I strongly suspect that you have a very small parallel corpus. If so, you should consider adding more parallel data in training. And how exactly do you calculate the recall and precision measures?
EDIT:
With less than 500 sentences you're unlikely to get any reasonable performance. The usual way to do it is to find a larger (unaligned) parallel corpus, run GIZA++ on the combined data, and then evaluate the small part for which you have the manual alignments. Check Europarl or MultiUN; these are freely available corpora, and both contain a relatively large amount of English-French parallel data. Instructions on preparing the data can be found on their websites.