Flex: multiple .l file arguments don't work? (e.g. "flex a.l b.l")

I finally have a comfortable-enough workflow for writing my flex programs, and I'll work bison into it soon (I dabbled with it before but I restarted my project entirely).
flex yy.l; flex flip.l will generate a lex.yy.c and a lex.flip.c correctly, since I use the prefix option. But I am curious why flex yy.l flip.l or flex *.l does not.
gcc lex* works perfectly fine when all the .c files have been generated correctly, as by the first command, but trying the same shortcut with flex produces a single lex.yy.c file, which looks valid right up to the point where the unprocessed contents of flip.l are pasted onto the end, preventing gcc from compiling it.
Is this just flex telling me my workflow is dumb and I should use more start conditions in a big file? I'd prefer not to, at least until I have a more complete program to tweak for speed.
My workflow is:
fg 1; fg 2; fg 3; fg 4; flex a.l; flex flip.l; flex rot.l; gcc -g lex*; ./a.out < in
With nano editors as jobs 1, 2, 3, 4 to fg out of the background.
I'm lexing the file in this order: flip, rot, a, rot, flip. And it works; I can even use preprocessor definitions (gcc -DALONE) to compile my .c files correctly on their own, for testing.

I think what flex is telling you, if anything, is to learn how to use make rather than trying to put together massive build commands.
It's true that flex will only process one file per invocation. On the other hand, both gcc and clang are simply drivers which invoke the actual compiler(s) and linker(s) so that you don't have to write more complicated build recipes. You could easily write a little driver program which invoked flex multiple times, once per argument, but it would be even simpler to use make, with the additional advantage that flex would only be invoked as necessary.
In fact, most large C projects do not use gcc's ability to compile multiple files in a single invocation. Instead, they let make figure out which object files need to be rebuilt (because the corresponding source file changed), thereby considerably speeding up the debug/edit/build cycle.
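For illustration, here is the kind of one-off driver mentioned above: a minimal sketch, assuming flex is on your PATH and that the file names contain no shell metacharacters, that simply invokes flex once per argument.

/* flexall.c -- hypothetical helper: run "flex <file>" once for each
 * .l file given on the command line. A makefile rule does the same
 * job, and only for the files that actually changed. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        char cmd[4096];
        snprintf(cmd, sizeof cmd, "flex %s", argv[i]);   /* build the command line */
        if (system(cmd) != 0) {                          /* run it via the shell */
            fprintf(stderr, "flex failed on %s\n", argv[i]);
            return EXIT_FAILURE;
        }
    }
    return EXIT_SUCCESS;
}

Compiled with gcc -o flexall flexall.c, running ./flexall a.l flip.l rot.l would regenerate all three lexers in one command, but make still has the edge because it skips the .l files that haven't changed.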

Related

What does .text.unlikely mean in ELF object files?

In my objdump -t output, I see the following two lines:
00000000000004d2 l F .text.unlikely 00000000000000ec function-signature-goes-here [clone .cold.427]
and
00000000000018e0 g F .text 0000000000000690 function-signature-goes-here
I know l means local and g means global. I also know that .text is a section, or a type of section, in an object file, containing compiled program instructions. But what is .text.unlikely? Assuming it's a different section (or type-of-section) from .text - what's the difference?
In my GCC v5.4.0 manpage, I found the following switch:
-freorder-functions
which says:
Reorder functions in the object file in order to improve code
locality. This is implemented by using special subsections
".text.hot" for most frequently executed functions and
".text.unlikely" for unlikely executed functions. Reordering is done
by the linker so object file format must support named sections and
linker must place them in a reasonable way.
Also profile feedback must be available to make this option effective.
See -fprofile-arcs for details.
Enabled at levels -O2, -O3, -Os.
It looks like this binary was compiled at one of those optimization levels (or with that switch explicitly), and its functions were organized into subsections to improve spatial locality.
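If you want to see this effect without profile feedback, GCC's cold function attribute gives an easy way to reproduce it: a minimal sketch, assuming a reasonably recent GCC on an ELF target, where the cold function should end up in .text.unlikely (verify with objdump -t).

/* cold.c -- marking a function "cold" asks GCC to optimize it for size
 * and place it in a separate text subsection (typically .text.unlikely),
 * the same partitioning that -freorder-functions performs using
 * profile data. */
#include <stdio.h>

__attribute__((cold)) static void report_fatal(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
}

int main(void)
{
    if (!stdout)                 /* essentially never true */
        report_fatal("no stdout");
    puts("ok");
    return 0;
}

Building with gcc -O2 -c cold.c and running objdump -t cold.o should list report_fatal under .text.unlikely while main stays in .text.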

More metrics for CodeCoverage Elixir

Background
I have a test suite and I need to know the coverage of the project.
I have played around with mix test --cover, but I find the native Erlang coverage analysis tool insufficient at best.
The native coverage tool doesn't tell you about branch coverage or function coverage. Its only metric seems to be "relevant lines", and I have no idea how those are calculated. For all I know, this is just the most basic form of test coverage: checking whether a given line of text was executed.
What have you tried?
I have tried Coverex, but the result was disastrous. Not only does it suffer from the same issues as the native tool, it also doesn't seem to produce correct results, as it counts imported modules as untested.
Or maybe it is doing a great job and my code is poorly tested, but I can't know for sure, because it doesn't tell me how it is evaluating my code. Have 40% coverage in a file? What am I missing? I can't know; the tool won't tell me.
I am now using ExCoveralls. It is considerably better than the previous options and lets me easily configure which folders to ignore, but it uses the native coverage tool, so it suffers from largely the same issues.
What do you want?
I was hoping to find something along the lines of Istanbul, or in this case nyc:
https://github.com/istanbuljs/nyc
Its test coverage analysis tells me everything I need to know, metrics and all:
Branches, Functions, Lines, Statements, everything you need to know is there.
Questions
Is there any tool that uses Istanbul for code coverage metrics with Elixir instead of the native erlang one?
If not, is there a way to configure the native coverage tool to give me more information?
Which metrics does the native coverage tool use?
The native coverage tool inserts "bump" calls on every line of the source code, recording module, function, arity, clause number and line number:
bump_call(Vars, Line) ->
    A = erl_anno:new(0),
    {call,A,{remote,A,{atom,A,ets},{atom,A,update_counter}},
     [{atom,A,?COVER_TABLE},
      {tuple,A,[{atom,A,?BUMP_REC_NAME},
                {atom,A,Vars#vars.module},
                {atom,A,Vars#vars.function},
                {integer,A,Vars#vars.arity},
                {integer,A,Vars#vars.clause},
                {integer,A,Line}]},
      {integer,A,1}]}.
(from cover.erl)
The code inserted by the function above is:
ets:update_counter(?COVER_TABLE,
                   {?BUMP_REC_NAME, Module, Function, Arity, Clause, Line}, 1)
That is, increment the entry for the given module / function / line in question by 1. After all tests have finished, cover will use the data in this table and show how many times a given line was executed.
As mentioned in the cover documentation, you can get coverage for modules, functions, function clauses and lines. It looks like ExCoveralls only uses line coverage in its reports, but there is no reason it couldn't do all four types of coverage.
Branch coverage is not supported. Seems like supporting branch coverage would require expanding the "bump" record and updating cover.erl to record that information. Until someone does that, coverage information is only accurate when branches appear on different lines. For example:
case always_false() of
    true ->
        %% this line shows up as not covered
        do_something();
    false ->
        ok
end.

%% this line shows up as covered, even though do_something is never called
always_false() andalso do_something()
To add to @legoscia's excellent answer, I also want to clarify why cover does not do statement-level evaluation. According to this discussion on the official forum:
https://elixirforum.com/t/code-coverage-tools-for-elixir/18102/10
The code is first compiled into Erlang and then from Erlang into a modified binary (no .beam file is created) that is automatically loaded into memory and executed.
Because of the way Erlang code works, a single statement can correspond to several instructions, and a single line can result in multiple VM "statements". For example:
Integer.to_string(a + 1)
will result in two instructions:
{line,[{location,"lib/tasks.ex",6}]}.
{gc_bif,'+',{f,0},1,[{x,0},{integer,1}],{x,0}}.
{line,[{location,"lib/tasks.ex",6}]}.
{call_ext_only,1,{extfunc,erlang,integer_to_binary,1}}.
Therefore it is rather tricky for an automatic analysis tool to provide statement coverage because it is hard to match statements to instructions, especially as in theory a compiler is free to reorder commands as it pleases as long as the result is the same.

Disassembly of Forth code words with 'see'

I am preparing overall knowledge on building a Forth interpreter and want to disassemble some of the generic Forth code words such as +, -, *, etc.
My Gforth (I currently have version 0.7.3, installed on Ubuntu Linux) will allow me to disassemble colon definitions that I make, using the command see, as well as the single code word . (dot). But when I try it with other code words, such as see + or see /, I get an error that says Code +, and then I'm not able to type in my terminal anymore, even when I press Ctrl-C.
I should be able to decompile/disassemble the code words, as shown by the Gforth manual: https://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Decompilation-Tutorial.html
Has anyone else had this issue, and do you know how to fix it?
Reverting to the old ptrace method did it for me.
First, from the command line as user root run:
echo 0 >/proc/sys/kernel/yama/ptrace_scope
After which see should disassemble whatever it can't decompile. Command line example (need not be root):
gforth -e "see + bye"
Output:
Code +
0x000055a9bf6dad66 <gforth_engine+2454>: mov %r14,0x21abf3(%rip) # 0x55a9bf8f5960 <saved_ip>
0x000055a9bf6dad6d <gforth_engine+2461>: lea 0x8(%r13),%rax
0x000055a9bf6dad71 <gforth_engine+2465>: mov 0x0(%r13),%rdx
0x000055a9bf6dad75 <gforth_engine+2469>: add $0x8,%r14
0x000055a9bf6dad79 <gforth_engine+2473>: add %rdx,(%rax)
0x000055a9bf6dad7c <gforth_engine+2476>: mov %rax,%r13
0x000055a9bf6dad7f <gforth_engine+2479>: mov -0x8(%r14),%rcx
0x000055a9bf6dad83 <gforth_engine+2483>: jmpq *%rcx
end-code
Credit: Anton Ertl
Most versions of SEE that I've seen are meant only for decompiling colon definitions. + and / and other arithmetic operations are usually written in assembly code, and SEE doesn't know what to do with them. That's why you were getting the Code + message: they're written in machine code, not Forth. There are several Forth implementations I've seen that have built-in assemblers, but I don't think I've ever seen a disassembler. Your best bet for seeing the inner workings of + or / or other such words might be to use DUMP or another such word to get a list of the bytes in the word, and then either disassemble the word by hand or feed the data into an external disassembler. Or see if you can find the source code for your implementation or a similar one.
SEE is a word whose behaviour is not tightly controlled. It is a kind of best effort to show the code of a word X when invoked as
SEE X
It behaves slightly differently according to how difficult this is. If you defined the word yourself in the session, you're pretty much guaranteed to get your code back. If it is a built-in word, especially a very elementary word like +, it is harder: the result may look nothing like the original definition, because of optimisation or compilation into machine code.
Specifically for gforth, when it gets hard, gforth invokes the standard tools present on the system to analyse object files. So it may be necessary to install gdb and/or to investigate how gforth tries to connect to it. For the concrete example of Ubuntu and gforth 0.7.3, Lutz Mueller gives a recipe.
I think SEE does its job as designed.
There are words in Forth defined in machine code (often called primitives), and there is also the possibility for the user to define machine code via the assembler, i.e.:
: MYCODE assembler mnemonics ;CODE
So the output of SEE is not showing a Code error; it is showing that (for instance) the word + was defined in machine code, and one can see the disassembled mnemonics to the right in its output.

meaning of llvm[n] when compiling llvm, where n is an integer

I'm compiling LLVM as well as clang. I noticed that the output of compilation has llvm[1]: or llvm[2]: or llvm[3]: prefixed to each line. What do those integers in brackets mean?
Apparently, it's not connected to the number of the compilation job (this is easily checked via make -j 1). The autoconf-based build system indicates the "level" of the makefile inside the source tree. To be precise, it's the value of make's MAKELEVEL variable.
The currently accepted answer is not correct. Furthermore, this is really a GNU Make question, not a LLVM question.
What you're seeing is the current value of the MAKELEVEL variable echoed by make to the command line. This value is set as a result of recursive execution. From the GNU make manual:
As a special feature, the variable MAKELEVEL is changed when it is passed down from level to level. This variable’s value is a string which is the depth of the level as a decimal number. The value is ‘0’ for the top-level make; ‘1’ for a sub-make, ‘2’ for a sub-sub-make, and so on. The incrementation happens when make sets up the environment for a recipe.
If you have a copy of the GNU Make source code on hand, you can see your output message being generated in output.c with the void message(...) function. In GNU make v4.2, this happens on line 626. Specifically the program argument is set to the string "llvm" and the makelevel argument is set as noted above.
Since it was erroneously brought up, it is not the number of the compilation job. The -j [jobs] or --jobs[=jobs] options enable parallel execution of up to jobs number of recipes simultaneously. If -j or --jobs is selected, but jobs is not set, GNU Make attempts to execute as many recipes simultaneously as possible. See this section of the GNU Make manual for more information.
It is possible to have recursive execution without parallel execution, and parallel execution without recursive execution. This is the main reason that the currently accepted answer is not correct.
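If you want to observe the mechanism directly, MAKELEVEL is passed to recipe commands through the environment, so any child process can read it. A minimal sketch (hypothetical file name, assuming GNU Make):

/* makelevel.c -- print the MAKELEVEL value inherited from the
 * environment. Run it from a recipe in a top-level make and again from
 * a sub-make to watch the number go up; run it from a plain shell and
 * it is simply not set. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *level = getenv("MAKELEVEL");
    printf("MAKELEVEL=%s\n", level ? level : "(not set)");
    return 0;
}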
It's the number of the compilation job (make -j). Helpful to trace compilation errors.

Automatically create Delphi/Freepascal interface unit from C header file

Is it possible to automatically generate interface units from C header files? In particular, I want to wrap the HDF5 library, and it would be great if I could avoid writing the interface unit manually.
Free Pascal includes the h2pas tool.
h2pas attempts to convert a C header file to a pascal unit. it can
handle most C constructs that one finds in a C header file, and
attempts to translate them to their pascal counterparts.
Bob Swart (Dr. Bob) has a utility called HeaderConvert which will convert a lot of header files (although there's usually some manual work involved as well). I've never compared it to the tool @RRUZ links to, but it's another option.
Project JEDI has one as well; I've never tested it. You can find it here.
In general, fully automated translation of C headers to something else (something that isn't an effective superset of the needed C functionality) ranges from hard to impossible.
This is because of macros: you can't tell how to translate them just by looking at them, since they often only get their meaning from context. Example:
#define uglymacro 1,2,3,4
but also (and this one is more common):
SCARYAPIMACRO void func(int c);
SCARYAPIMACRO is then often a macro that tests OS defines to select the right calling convention for the right OS/architecture.
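As an illustration (hypothetical macro body, not taken from any particular library), such a macro often looks something like this, and its meaning is only decided by the preprocessor context for the target platform:

/* A purely textual header translator only ever sees the name
 * SCARYAPIMACRO in front of each declaration; what it stands for
 * depends on which platform/compiler macros are defined. */
#if defined(_WIN32)
#  if defined(BUILDING_MYLIB)      /* building the DLL itself */
#    define SCARYAPIMACRO __declspec(dllexport)
#  else                            /* consuming the DLL */
#    define SCARYAPIMACRO __declspec(dllimport)
#  endif
#elif defined(__GNUC__)
#  define SCARYAPIMACRO __attribute__((visibility("default")))
#else
#  define SCARYAPIMACRO
#endif

SCARYAPIMACRO void func(int c);    /* one line, several possible meanings */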
Still, that doesn't mean the tools are not real timesavers; the result is just more semi-automatic. I've had the most, and the best, experience with h2pas.
I've translated a lot of Windows headers (including FPC's commctrl, which has a SendMessage macro every few lines).
What I usually do is craft a small Pascal program that scans the source line by line and uses heuristics to split it into parts that are mostly homogeneous (all structs, or constants, macros, procedure declarations, etc.). Then I look at the source and often do some global substitutions.
Only then do I run it through the translator. The process is often iterative (refine the separation, do global substitutions, try to translate; if it fails, try again, etc.).
The process unfortunately does require a good grasp of C, pragma stuff included.
You can download HDF5 API header Delphi translations, Delphi XE2 HDF5 table test program with source code and somewhat modified hdf5dll from my page:
http://www.astro.ff.vu.lt/index.php?option=com_content&task=view&id=46&Itemid=63
