Flex-lexer: Is using `%option debug` the same as defining `FLEX_DEBUG` - flex-lexer

When running flex on my *.l file with %option debug i see #define FLEX_DEBUG in the generated scanner file.
Is there any difference between using %option debug and just defining FLEX_DEBUG by passing -DFLEX_DEBUG to gcc?

There is no difference with the current version of flex. The only thing that %option debug does is insert #define FLEX_DEBUG into the output file. This is also what is done if -d or --debug is specified on the flex command-line.
However, there is one important difference: The FLEX_DEBUG macro is not documented, so there is no guarantee that the behaviour will not change with future versions. (This is different from the yacc/bison YYDEBUG macro, which is documented.)

Related

How to get automake to recognize the non-default filename produced by flex when %option prefix= is set

In general, automake's capability to properly invoke flex and bison when building parsers and scanners is incredibly useful. However, I'm running up against an issue that I can't seem to resolve.
I have a lex file, trigraphs.l, which performs the text replacement of C trigraph sequences. There will be multiple lexers included in the final executable, which will be a pre-preprocessor that prepares the source file for the C preprocessor proper, so a subsequent lexer will take care of joining logical lines that span multiple physical lines, another one will strip out comments, etc. Per the C standard, there is a specific order these transformations are supposed to take place in, so to avoid issues I'm using three separate lexers that will run in the proper sequence.
Anyway, to avoid symbol name conflicts I'm using %option prefix="blah," so my trigraph scanner is as follows:
%option noyywrap
%option prefix="trigraphs_"
%%
"??<" { printf("{"); }
"??>" { printf("}"); }
"??(" { printf("["); }
"??)" { printf("]"); }
"??=" { printf("#"); }
"??/" { printf("\\"); }
"??'" { printf("^"); }
"??!" { printf("|"); }
"??-" { printf("~"); }
. { printf("%c", *yytext); }
%%
The issue seems to be that automake->ylwrap is expecting an output from flex with the traditional lex.yy.c filename, to rename to trigraphs.c. However, because the %option prefix also changes the output filename (to lex.trigraphs_.c), it doesn't get renamed to the trigraphs.c that the Makefile is expecting. This causes the obvious compilation error when the Makefile wants to compile trigraphs.c and it doesn't exist.
One solution that occurred to me is to use %option outfile="lex.yy.c" to sidestep around that. It seems to work so far; however, it feels a bit "hackish" so I wonder if there's a more idiomatic or canonical way to handle this. When searching for solutions others had found, I did come across this question on StackExchange, but it's eight years old now so it's possible that there have been developments in the interim.
There is no more idiomatic way to handle this because Automake rules know nothing about flex. Automake expects to work with project that are compatible with POSIX lex, and expects any lex implementation to follow the POSIX convention of producing a lex.yy.c file.

Flex dropping predefined macro

I have the following flex source:
%{
#if !defined(__linux__) && !defined(__unix__)
/* Maybe on windows */
#endif
int num_chars = 0;
%}
%%
. ++num_chars;
%%
int main()
{
yylex();
printf("%d chars\n", num_chars);
return 0;
}
int yywrap()
{
return 1;
}
I generate a C file by the command flex flextest.l and compile the result with gcc -o fltest lex.yy.c
To my surprise, I get the following output:
flextest.l:2:37: error: operator "defined" requires an identifier
#if !defined(__linux__) && !defined(__unix__)
After further checking, the issue seems to be that flex has actually replaced __unix__ with the empty string, as shown by:
$ grep __linux_ lex.yy.c
#if !defined(__linux__) && !defined()
Why does this happen, and is it possible to avoid it?
It's actually m4 (the macro processor which is used by current versions of flex) which is expanding __unix__ to the empty string. The Gnu implementation of m4 defines certain symbols to empty macros so that they can be tested with ifdef.
Of course, it's (better said, it was) a bug in flex. Flex shouldn't allow m4 to expand macros within user content copied from the scanner definition file, and the current version of flex correctly arranges for the text included from the scanner description file to be quoted so that it will pass through m4 unmodified even if it happens to include a string which could be interpreted by m4 as a macro expansion.
The bug is certainly present in v2.5.39 and v2.6.1 of flex. I didn't test all previous versions, but I suppose it was introduced when flex was modified to use m4, which was v2.5.30 according to the NEWS file.
This particular quoting issue was fixed in v2.6.2 but the current version of flex (2.6.4) contains various other bug fixes, so I'd recommend you upgrade to the latest version.
If you really need a version which could work with both the buggy and the more recent versions of flex, you could use one of the two following hacks:
Find some other way to write __unix__. One possibility is the following
#define C(x,y) x##y
#define UNIX_ C(__un,ix__)
#if !defined(__linux__) && !UNIX_
That hack won't work with defined, since defined(UNIX_) tests whether UNIX_ is defined, not whether what it expands to is defined. But normally built-in symbols like __unix__ are actually defined to be 1, if they are defined, and the #if directive treats any identifier which is not #define'd as though it were 0, which means that you can usually leave use x instead of defined(x). (However, it will produce different results if there were a #define x 0 in effect, so it's not quite a perfect substitute.)
Flex, like many m4 applications, redefines m4's quote marks to be [[ and ]]. Both the buggy flex and the corrected versions substitute these quote marks with a rather elaborate sequence which effectively quotes the quote marks. However, the buggy version does not otherwise quote user-defined text, so macro substitutions will be performed in user text. (As mentioned, this is why __unix__ becomes the empty string.
In flex versions in which user-defined text is not quoted, it is possible to invoke the m4 macro which redefines quote marks. These new quote marks can then be used to quote the #if line, preventing macro substitution of __unix__. However, the quote definition must be restored, or it will completely wreck macro processing of the rest of the file. That's a bit tricky because it is impossible to write [[. (Flex will substitute it with a different string.)
The following seems to do the trick. Note that the macro invocations are placed inside C comments. The changequote macros will expand to an empty string, if they are expanded. But in flex versions since v2.6.2, user-supplied text is quoted, so the changequote macros will not be expanded. Putting them inside comments hides them from the C compiler.
%{
/*m4_changequote(<<,>>)<<*/
#if !defined(__linux__) && !defined(__unix__)
/*>>m4_changequote(<<[>><<[>>,<<]>><<]>>)*/
/* Maybe on windows */
#endif
(The m4 macro which changes quote marks is changequote but flex invokes m4 with the -P flag which changes builtins like changequote to m4_changequote. In the second call to changequote, the two [ which make up the [[ sign are individually quoted with the temporary << quote marks, which hides them from the code in flex which modifies use of [[.)
I don't know how reliable this hack is but it worked on the versions of flex which I had kicking around on my machine, including 2.5.4 (pre-M4) 2.5.39 (buggy), 2.6.1 (buggy), 2.6.2 (somewhat debugged) and 2.6.4 (more debugged).

How to output variable contents to "LogCat" window in Android-ndk

I am using Android-sdk-ndk in an Eclipse+ADT environment. In Android-sdk Java development, I could use "Log.i", "Log.w", ... statements to output messages and variable contents to the "LogCat" window. However, in Android-ndk C/C++ development, is there any similar C/C++ "print-like" statement that outputs messages / variable contents from a JNI C/C++ module to the "LogCat" window so that I could have some debug informations for my program.
Thanks for any suggestion.
Lawrence
From this guide: http://www.srombauts.fr/2011/03/06/standalone-toolchain/
You can #define the logging methods like this:
#define LOGI(...) ((void)__android_log_print(ANDROID_LOG_INFO, "hello-ndk", __VA_ARGS__))
And you need to make sure you're linking to liblog by compiling similar to this (just add -l log):
arm-linux-androideabi-gcc hello-ndk.c -l log -o hello-ndk

What's the macro to distinguish ifort from other fortran compilers?

I'm working with Fortran code that has to work with various Fortran compilers (and is interacting with both C++ and Java code). Currently, we have it working with gfortran and g95, but I'm researching what it would take to get it working with ifort, and the first problem I'm having is figuring out how to determine in the source code whether it's using ifort or not.
For example, I currently have this piece of code:
#if defined(__GFORTRAN__)
// Macro to add name-mangling bits to fortran symbols. Currently for gfortran only
#define MODFUNCNAME(mod,fname) __ ## mod ## _MOD_ ## fname
#else
// Macro to add name-mangling bits to fortran symbols. Currently for g95 only
#define MODFUNCNAME(mod,fname) mod ## _MP_ ## fname
#endif // if __GFORTRAN__
What's the macro for ifort? I tried IFORT, but that wasn't right, and further guessing doesn't seem productive. I also tried reading the man page, using ifort -help, and Google.
You're after __INTEL_COMPILER, as defined in http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/win/compiler_f/bldaps_for/common/bldaps_use_presym.htm
According to their docs, they define __INTEL_COMPILER=910 . The 910 may be a version number, but you can probably just #ifdef on it.
I should note that ifort doesn't allow macros unless you explicity turn it on with the /fpp flag.

#line and jump to line

Do any editors honer C #line directives with regards to goto line features?
Context:
I'm working on a code generator and need to jump to a line of the output but the line is specified relative to the the #line directives I'm adding.
I can drop them but then finding the input line is even a worse pain
If the editor is scriptable it should be possible to write a script to do the navigation. There might even be a Vim or Emacs script that already does something similar.
FWIW when I writing a lot of Bison/Flexx I wrote a Zeus Lua macro script that attempted to do something similar (i.e. move from input file to the corresponding line of the output file by search for the #line marker).
For any one that might be interested here is that particular macro script.
#line directives are normally inserted by the precompiler, not into source code, so editors won't usually honor that if the file extension is .c.
However, the normal file extension for post-compiled files is .i or .gch, so you might try using that and see what happens.
I've used the following in a header file occasionally to produce clickable items in
the VC6 and recent VS(2003+) compiler ouptut window.
Basically, this exploits the fact that items output in the compiler output
are essentially being parsed for "PATH(LINENUM): message".
This presumes on the Microsoft compiler's treatment of "pragma remind".
This isn't quite exactly what you asked... but it might be generally helpful
in arriving at something you can get the compiler to emit that some editors might honor.
// The following definitions will allow you to insert
// clickable items in the output stream of the Microsoft compiler.
// The error and warning variants will be reported by the
// IDE as actual warnings and errors... which means you can make
// them occur in the task list.
// In theory, the coding standards could be checked to some extent
// in this way and reminders that show up as warnings or even
// errors inserted...
#define strify0(X) #X
#define strify(X) strify0(X)
#define remind(S) message(__FILE__ "(" strify( __LINE__ ) ") : " S)
// example usage
#pragma remind("warning: fake warning")
#pragma remind("error: fake error")
I haven't tried it in a while but it should still work.
Use sed or a similar tool to translate the #lines to something else not interpreted by the compiler, so you get C error messages on the real line, but have a reference to the original input file nearby.

Resources