llvm-cov: statistics for uninstantiated functions - code-coverage

I'm starting to work with llvm-cov to produce coverage statistics for my project. llvm-cov reports several categories: line coverage, function coverage, and region coverage. However, they all consider only instantiated functions; functions that are never instantiated are simply ignored. As a result, it is easy to get close to 100% coverage for files in which only a small fraction of the functions are actually instantiated, which is not what I want. Is it possible to make llvm-cov take uninstantiated functions into account, or to have it produce separate statistics for them?

At the moment, unfortunately not. This is a missing capability in llvm-cov.
The reason for this is that clang does not emit any code for uninstantiated templates, and the coverage generation logic depends on clang emitting code for a function. This is an odd limitation, since the compiler does have enough information to describe these templates.
Edit: Another point to consider is that C++ translation units tend to contain an enormous number of unspecialized/uninstantiated templates, and if the compiler were to emit coverage mapping regions for each of them, compile time and binary size would likely regress massively.
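To make the limitation concrete, here is a small made-up example (the file name and the commands in the comments are just the usual llvm-cov workflow, shown for illustration):

// cover.cpp -- hypothetical example; built and reported the usual way:
//   clang++ -fprofile-instr-generate -fcoverage-mapping cover.cpp -o cover
//   ./cover && llvm-profdata merge default.profraw -o cover.profdata
//   llvm-cov report ./cover -instr-profile=cover.profdata

// Never instantiated anywhere: clang emits no code for it, so llvm-cov has
// no line/function/region counters for this template at all.
template <typename T>
T clamp_to_zero(T value) {
  if (value < T{})
    return T{};
  return value;
}

// Ordinary function: instrumented and counted as usual.
int twice(int x) { return 2 * x; }

int main() { return twice(21) == 42 ? 0 : 1; }

Because the template contributes nothing to the coverage mapping, the report for this file is computed from twice() and main() alone, and the file shows up as fully covered even though most of its code was never exercised.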

Related

Does clang static analyzer core support multi-threaded programs?

I couldn't find any documentation on the behavior of the Clang Static Analyzer core when it encounters multi-threaded programming constructs. Does the core identify them and create separate paths for each thread?
No, the Clang Static Analyzer does not attempt to directly analyze the simultaneous execution of multiple threads. Instead, it analyzes one path at a time. Quoting the developer manual:
The analyzer core performs symbolic execution of the given program. All the input values are represented with symbolic values; further, the engine deduces the values of all the expressions in the program based on the input symbols and the path. The execution is path sensitive and every possible path through the program is explored.
The thread-related checks that Clang performs are done within the context of a single path. Skimming the list of available checkers, I found three that relate to C/C++ threading (there are also some for Objective C):
alpha.core.C11Lock: Checks use of the mtx_t API, particularly double-locking and (I assume) failing to release locks.
alpha.unix.PthreadLock: Similar checking for pthread_mutex_lock and related functions (a minimal sketch of this kind of misuse follows this list).
optin.mpi.MPI-Checker: Checks use of the MPI API, for example that the code waits for a message receive operation to complete before using the data.
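For illustration, here is a minimal sketch of the kind of single-path misuse the PthreadLock checker looks for (an invented example; the exact invocation and diagnostic wording vary between Clang versions):

// double_lock.cpp -- analyze with something like:
//   clang++ --analyze -Xanalyzer -analyzer-checker=alpha.unix.PthreadLock double_lock.cpp
#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

void doubleLock() {
  pthread_mutex_lock(&m);
  pthread_mutex_lock(&m);    // flagged: this (non-recursive) mutex is already held on this path
  pthread_mutex_unlock(&m);
}

Note that the defect is visible on a single execution path; no reasoning about other threads is needed, which is the point made in the paragraphs below.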
In addition to these checkers, there is the Clang Thread Safety Analysis system, which relies on programmer-provided annotations to (mainly) enforce correct usage of mutexes to protect shared data.
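As a rough sketch of the annotation style (adapted from the pattern in the Clang Thread Safety Analysis documentation; real code normally uses the macro header from those docs, and the Mutex wrapper here is declaration-only, purely for illustration):

// Compile with -Wthread-safety to get the warnings.
class __attribute__((capability("mutex"))) Mutex {
public:
  void Lock() __attribute__((acquire_capability()));
  void Unlock() __attribute__((release_capability()));
};

class Counter {
public:
  void Increment() {
    mu_.Lock();
    ++value_;          // fine: mu_ is held here
    mu_.Unlock();
  }
  void UnsafeIncrement() {
    ++value_;          // -Wthread-safety warning: writing value_ requires holding mu_
  }
private:
  Mutex mu_;
  int value_ __attribute__((guarded_by(mu_))) = 0;
};

Again, the checking is per-function and per-path: the analysis verifies that each access to value_ happens while mu_ is held, without modeling any actual thread interleavings.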
What all of these have in common is that the properties being enforced entail the correct usage of an API within the context of a single thread. The analyzer does not need to consider what other threads might be doing to diagnose these issues.
Some commercial static analysis tools have more numerous and sophisticated analyses for detecting issues with threaded code, and may consider what happens along multiple independent (and potentially concurrent) paths, but they also do not directly analyze the interleaving possibilities.
There are techniques that directly consider concurrent execution and interleaving, usually with some variant of a model checking algorithm, but getting such techniques to scale to programs larger than a few tens of lines of code is an open area of research.

Which of the three mutually exclusive Clang sanitizers should I default to?

Clang has a number of sanitizers that enable runtime checks for questionable behavior. Unfortunately, they can't all be enabled at once.
It is not possible to combine more than one of the -fsanitize=address, -fsanitize=thread, and -fsanitize=memory checkers in the same program.
To make things worse, each of those three seems too useful to leave out. AddressSanitizer checks for memory errors, ThreadSanitizer checks for race conditions and MemorySanitizer checks for uninitialized reads. I'm worried about all of those things!
Obviously, if I have a hunch about where a bug lies, I can choose a sanitizer according to that. But what if I don't? Going further, what if I want to use the sanitizers as a preventative tool rather than a diagnostic one, to point out bugs that I didn't even know about?
In other words, given that I'm not looking for anything in particular, which sanitizer should I compile with by default? Am I just expected to compile and test the whole program three times, once for each sanitizer?
As you pointed out, sanitizers are typically mutually exclusive (you can combine only Asan+UBsan+Lsan via -fsanitize=address,undefined,leak, and maybe also add Isan via -fsanitize=...,integer if your program does not contain intentional unsigned overflows), so the only way to ensure complete coverage is to do separate QA runs with each of them (which implies rebuilding the software for every run). BTW, doing yet another run under Valgrind is also recommended.
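For concreteness, a tiny invented example of why the separate builds matter: each sanitizer only reports the class of bug it instruments for, so the same program has to be built (and its test suite run) once per sanitizer:

// app.cpp -- contains a use-after-free that only the ASan build will report.
// Typical builds (flags as documented by Clang; adapt to your build system):
//   clang++ -g -fsanitize=address,undefined,leak app.cpp -o app_asan
//   clang++ -g -fsanitize=memory app.cpp -o app_msan
//   clang++ -g -fsanitize=thread app.cpp -o app_tsan
#include <iostream>

int main() {
  int* p = new int(42);
  delete p;
  std::cout << *p << "\n";   // heap-use-after-free: caught at runtime by the app_asan build
  return 0;
}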
Using Asan in production has two aspects. On the one hand, common experience is that some bugs can only be detected in production, so you do want to occasionally run sanitized builds there to increase test coverage [*]. On the other hand, Asan has been reported to increase the attack surface in some cases (see e.g. this oss-security report), so using it as a hardening solution (to prevent bugs rather than detect them) has been discouraged.
[*] As a side note, Asan developers also strongly suggest using fuzzing to increase coverage (see e.g. the CppCon 2015 and CppCon 2017 talks).
[**] See Asan FAQ for ways to make AddressSanitizer more strict (look for "aggressive diagnostics")

Is there a way to preprocess Ruby code and find errors that would occur at runtime?

We have a huge code base, and we keep creating issues that would have been caught at compile time in typed languages such as Java, but that we don't catch until runtime in Ruby. This is bad, since most of these bugs are typos or refactorings that leave some invalid code behind.
Example:
def mysuperfunc
  # some code goes here
  # this was a valid call but not anymore since enforcesecurity
  # signature changed
  #system.enforcesecurity
end
I mean, IDEs can do it, but some people use Atom or Sublime, so we need something to "compile" and report those kinds of issues so they don't reach deployment. What have you been using?
This only generates a small percentage of our bug reports, but since we are forced to produce at a ridiculous pace we don't have 100% code coverage. If there is no tool to help, I'll just make sure everybody uses an IDE and run the reports with tools such as RubyMine.
Our stack includes RSpec, Minitest, and SimpleCov. We enforce code reviews and multi-stage deployments (dev, qa, pre-prod, sandbox, prod), and still some issues reach the higher levels and make us programmers look bad. I'm not looking for magic, just a little automation that might help a bit.
Unfortunately, the Halting Problem, Rice's Theorem, and all the other Undecidability and Uncomputability Results tell us that it is simply impossible in the general case to statically determine any "interesting" property about the runtime behavior of a program. We cannot even statically determine something as simple as "will it halt", so how are we going to determine "is bug-free"?
There are certain things that can be statically determined, and there are certain restricted programs for which some interesting properties can be statically determined, but largely, this is not possible. And even to the small extent that it is possible, it generally requires the language to be specifically designed to be easy to statically analyze (which Ruby isn't).
That being said, there are tools that use heuristics to point out code that may have problems, there are coding standards that may help avoid bugs, and there are tools to enforce those coding standards. Keywords to search for are "code quality tools", "linter", "static analyzer", etc. You have already been given examples in the other answers and comments, and given those examples and these keywords, you'll likely find more.
However, I also wanted to discuss something you wrote:
we are forced to produce at a ridiculous pace we don't have 100% code coverage
That's a problem, which has to be approached from two sides:
Practice, practice, practice. You need to practice testing and writing high-quality code until it is so natural to you that not doing it actually ends up being harder and slower. It should become second nature, so that under pressure, when your mind goes blank, the only thing you know how to do is write tests and well-designed, well-factored, high-quality code. Note: I'm talking about deliberate practice, which means setting time aside to really practice … and practice is practice: it's not work, it's not fun, it's not a hobby. If you don't delete the code immediately after you have written it, you are not practicing, you are working.
Sustainable Pace. You should never develop faster than the pace you could sustain indefinitely while still producing well-tested, well-designed, well-factored, high-quality code, having a fulfilling social life, no stress, plenty of free time, etc. This is something that has to be backed and supported and understood by management.
I'm unaware of anything exactly like you want. However, there are a few gems that will analyze code and warn you about some errors and/or bad practices. Try these:
https://github.com/bbatsov/rubocop
https://github.com/railsbp/rails_best_practices
FLAY
https://rubygems.org/gems/flay
Via the repo https://github.com/seattlerb/flay:
DESCRIPTION:
Flay analyzes code for structural similarities. Differences in literal values, variable, class, method names, whitespace, programming style, braces vs do/end, etc are all ignored. Making this totally rad.
[FEATURES:]
Reports differences at any level of code.
Adds a score multiplier to identical nodes.
Differences in literal values, variable, class, and method names are ignored.
Differences in whitespace, programming style, braces vs do/end, etc are ignored.
Works across files.
Add the flay-persistent plugin to work across large/many projects.
Run --diff to see an N-way diff of the code.
Provides conservative (default) and --liberal pruning options.
Provides --fuzzy duplication detection.
Language independent: Plugin system allows other languages to be flayed.
Ships with .rb and .erb. javascript and others will be available separately.
Includes FlayTask for Rakefiles.
Uses path_expander, so you can use:
dir_arg -- expand a directory automatically
#file_of_args -- persist arguments in a file
-path_to_subtract -- ignore intersecting subsets of files/directories
Skips files matched via patterns in .flayignore (subset format of .gitignore).
Totally rad.
FLOG
https://rubygems.org/gems/flog
Via the repo https://github.com/seattlerb/flog:
DESCRIPTION:
Flog reports the most tortured code in an easy to read pain report.
The higher the score, the more pain the code is in.
[FEATURES:]
Easy to read reporting of complexity/pain.
Uses path_expander, so you can use:
dir_arg -- expand a directory automatically
#file_of_args -- persist arguments in a file
-path_to_subtract -- ignore intersecting subsets of files/directories
SYNOPSIS:
% ./bin/flog -g lib
Total Flog = 1097.2 (17.4 flog / method)
323.8: Flog total
85.3: Flog#output_details
61.9: Flog#process_iter
53.7: Flog#parse_options
...
There is a Ruby gem called Guard that does automated testing. You can set your own custom rules.
For example, you can make it so that any time you modify certain files, the test framework runs automatically.
Here is the link for Guard.

Generating intermediate code in a compiler. Is an AST or parse tree always necessary when dealing with conditionals?

I'm taking a compiler-design class where we have to implement our own compiler (using flex and bison). I have had experience in parsing (writing EBNF's and recursive-descent parsers), but this is my first time writing a compiler.
The language design is pretty open-ended (the professor has left it up to us). In class, the professor went over generating intermediate code. He said that it is not necessary for us to construct an Abstract Syntax Tree or a parse tree while parsing, and that we can generate the intermediate code as we go.
I found this confusing for two reasons:
What if you are calling a function before it is defined? How can you resolve the branch target? I guess you would have to make it a rule that functions must be defined before they are used, or maybe forward-declare them (as C does)?
How would you deal with conditionals? If you have an if-else or even just an if, how can you resolve the branch target for the if when the condition is false (if you're generating code as you go)?
I planned on generating an AST and then walking the tree after I create it, to resolve the addresses of functions and branch targets. Is this correct or am I missing something?
The general solution to both of your issues is to keep a list of addresses that need to be "patched." You generate the code and leave holes for the missing addresses or offsets. At the end of the compilation unit, you go through the list of holes and fill them in.
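As a rough sketch of that idea for an if/else (all names and the bytecode-style instruction format are invented for illustration):

#include <cstdint>
#include <vector>

struct Instr { int opcode; int32_t operand; };   // operand = jump target, if any
enum { OP_JUMP_IF_FALSE = 1, OP_JUMP = 2 /* ... */ };

std::vector<Instr> code;

// Emit a branch whose target is not known yet and remember where the hole is.
size_t emitBranchWithHole(int opcode) {
  code.push_back({opcode, -1});        // -1 marks an unresolved target
  return code.size() - 1;
}

void patch(size_t holeIndex, int32_t target) {
  code[holeIndex].operand = target;    // fill the hole once the label is known
}

void genIfElse(/* parsed condition, then-part, else-part */) {
  // ... emit code for the condition ...
  size_t elseHole = emitBranchWithHole(OP_JUMP_IF_FALSE);
  // ... emit code for the 'then' branch ...
  size_t endHole = emitBranchWithHole(OP_JUMP);
  patch(elseHole, static_cast<int32_t>(code.size()));  // the 'else' branch starts here
  // ... emit code for the 'else' branch ...
  patch(endHole, static_cast<int32_t>(code.size()));   // first instruction after the if/else
}

Forward calls work the same way: record each call site that refers to a not-yet-seen function in a list keyed by name, and patch all of them once the function's address is known (or at the end of the compilation unit).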
In FORTH the "list" of patches is kept on the control stack and is unwound as each control structure terminates. See FORTH Dimensions.
Anecdote: an early compiler (I believe it was a Lisp compiler) generated a list of machine-code instructions in symbolic form, with forward references to the list of machine code for each branch of a conditional. It then generated the binary code by walking the list backwards, so the code location of every forward branch was known by the time the branch instruction had to be emitted.
The Crenshaw tutorial is a concrete example of not using an AST of any kind. It builds a working compiler (including conditionals, obviously) with immediate code generation targeting m68k assembly.
You can read through the document in an afternoon, and it is worth it.

Is there a way to determine code coverage without running the code?

I am not asking about the static code analysis provided by StyleCop or FxCop; both have a different purpose and serve it well. I am asking whether there is a way to find the code coverage of your user control or sub-module. For example, you have an application which uses helper classes in a separate assembly. To measure unit-testing code coverage, we need to run the application and check it with NCover or a similar tool.
My requirement is: without running the application, is it possible to find the code coverage of the helper classes or similar assemblies?
See Static Estimation for Test Coverage for a technique that estimates coverage without executing the source code.
The basic idea is to compute a program slice for each test case, and then "count" what the slice enumerates. A (forward) slice is effectively that part of a program that you can reach from a specific starting point in the code, in this case, the test code.
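As a toy illustration of the idea (invented names; the real technique works on the actual test suite and the program's dependence graph):

// A forward slice from a test entry point is roughly the code reachable
// from it; a static estimator counts that as what the test would cover.
int helperUsed(int x)   { return x + 1; }   // reachable from the test -> estimated covered
int helperUnused(int x) { return x - 1; }   // unreachable from the test -> estimated uncovered

int testAddOne() {                          // the "test case" used as the starting point
  return helperUsed(41) == 42 ? 0 : 1;
}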
While the technical paper above is hard to get if you're not an ACM member [or you didn't attend the conference where it was presented :], there's a slide presentation here.
Of course, running this static estimator only tells you (roughly) what code will be exercised. It doesn't substitute for actually running the tests, and verifying that they pass!
In general, the answer is no; this is equivalent to the halting problem, which is undecidable.
There are (research) tools based on abstract interpretation or model checking that can show coverage properties without execution, for restricted subsets of a language. See, e.g.
"Analyzing Functional Coverage in Bounded Model Checking", Grosse, D. Kuhne, U. Drechsler, R. 2008
In general, yes, there are approaches, but they're specialized, and may require some formal methods experience. This kind of stuff is still cutting edge research.
I would say no, with the exception of 'dead code', which a compiler can determine.
My definition of code coverage is a result which indicates how many times each line of code is run in your program, which, of course, means running the program. The determining factor here is usually the values of the data passing through the program, which determine the execution paths taken at conditionals. A static analysis, like a compiler, could deduce lines of code that cannot run under any conditions.
An example here is if your program uses a third-party library, but there is a bug in the library. If your program never uses those parts of the library, or the data you send to the library causes it to avoid the bug, then you won't be affected.
You could write a program that, by reflection, assumes that all conditionals will be taken, and follows all function calls, through all derived classes, but I'm not sure what this will tell you. It certainly can't tell you whether or not there are any bugs in the lines of code covered.
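To make the dead-code exception above concrete, here is the sort of fact a compiler or static analyzer can establish without ever running the program (an invented C++ snippet; the same idea applies to .NET assemblies):

int f(int x) {
  if (x > 0 && x < 0)   // can never be true, regardless of input data
    return 1;           // provably unreachable: guaranteed 0% coverage
  return 0;
}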
Coverity Static Analysis is a tool that can identify many security flaws in a program. It can also identify dead code, and it can be used to help satisfy testing regulations such as DO-178B, which requires developers to demonstrate that all code can be executed.
If you are using Visual Studio, you can first run 'Analyze Code Coverage', then export the code coverage results from the Code Coverage Results window in Visual Studio. Later you can import the coverage result file back into Visual Studio.
