How to get automake to recognize the non-default filename produced by flex when %option prefix= is set - flex-lexer

In general, automake's capability to properly invoke flex and bison when building parsers and scanners is incredibly useful. However, I'm running up against an issue that I can't seem to resolve.
I have a lex file, trigraphs.l, which performs the text replacement of C trigraph sequences. There will be multiple lexers included in the final executable, which will be a pre-preprocessor that prepares the source file for the C preprocessor proper, so a subsequent lexer will take care of joining logical lines that span multiple physical lines, another one will strip out comments, etc. Per the C standard, there is a specific order these transformations are supposed to take place in, so to avoid issues I'm using three separate lexers that will run in the proper sequence.
Anyway, to avoid symbol name conflicts I'm using %option prefix="blah," so my trigraph scanner is as follows:
%option noyywrap
%option prefix="trigraphs_"
%%
"??<" { printf("{"); }
"??>" { printf("}"); }
"??(" { printf("["); }
"??)" { printf("]"); }
"??=" { printf("#"); }
"??/" { printf("\\"); }
"??'" { printf("^"); }
"??!" { printf("|"); }
"??-" { printf("~"); }
. { printf("%c", *yytext); }
%%
The issue seems to be that automake->ylwrap is expecting an output from flex with the traditional lex.yy.c filename, to rename to trigraphs.c. However, because the %option prefix also changes the output filename (to lex.trigraphs_.c), it doesn't get renamed to the trigraphs.c that the Makefile is expecting. This causes the obvious compilation error when the Makefile wants to compile trigraphs.c and it doesn't exist.
One solution that occurred to me is to use %option outfile="lex.yy.c" to sidestep around that. It seems to work so far; however, it feels a bit "hackish" so I wonder if there's a more idiomatic or canonical way to handle this. When searching for solutions others had found, I did come across this question on StackExchange, but it's eight years old now so it's possible that there have been developments in the interim.

There is no more idiomatic way to handle this because Automake rules know nothing about flex. Automake expects to work with project that are compatible with POSIX lex, and expects any lex implementation to follow the POSIX convention of producing a lex.yy.c file.

Related

Flex dropping predefined macro

I have the following flex source:
%{
#if !defined(__linux__) && !defined(__unix__)
/* Maybe on windows */
#endif
int num_chars = 0;
%}
%%
. ++num_chars;
%%
int main()
{
yylex();
printf("%d chars\n", num_chars);
return 0;
}
int yywrap()
{
return 1;
}
I generate a C file by the command flex flextest.l and compile the result with gcc -o fltest lex.yy.c
To my surprise, I get the following output:
flextest.l:2:37: error: operator "defined" requires an identifier
#if !defined(__linux__) && !defined(__unix__)
After further checking, the issue seems to be that flex has actually replaced __unix__ with the empty string, as shown by:
$ grep __linux_ lex.yy.c
#if !defined(__linux__) && !defined()
Why does this happen, and is it possible to avoid it?
It's actually m4 (the macro processor which is used by current versions of flex) which is expanding __unix__ to the empty string. The Gnu implementation of m4 defines certain symbols to empty macros so that they can be tested with ifdef.
Of course, it's (better said, it was) a bug in flex. Flex shouldn't allow m4 to expand macros within user content copied from the scanner definition file, and the current version of flex correctly arranges for the text included from the scanner description file to be quoted so that it will pass through m4 unmodified even if it happens to include a string which could be interpreted by m4 as a macro expansion.
The bug is certainly present in v2.5.39 and v2.6.1 of flex. I didn't test all previous versions, but I suppose it was introduced when flex was modified to use m4, which was v2.5.30 according to the NEWS file.
This particular quoting issue was fixed in v2.6.2 but the current version of flex (2.6.4) contains various other bug fixes, so I'd recommend you upgrade to the latest version.
If you really need a version which could work with both the buggy and the more recent versions of flex, you could use one of the two following hacks:
Find some other way to write __unix__. One possibility is the following
#define C(x,y) x##y
#define UNIX_ C(__un,ix__)
#if !defined(__linux__) && !UNIX_
That hack won't work with defined, since defined(UNIX_) tests whether UNIX_ is defined, not whether what it expands to is defined. But normally built-in symbols like __unix__ are actually defined to be 1, if they are defined, and the #if directive treats any identifier which is not #define'd as though it were 0, which means that you can usually leave use x instead of defined(x). (However, it will produce different results if there were a #define x 0 in effect, so it's not quite a perfect substitute.)
Flex, like many m4 applications, redefines m4's quote marks to be [[ and ]]. Both the buggy flex and the corrected versions substitute these quote marks with a rather elaborate sequence which effectively quotes the quote marks. However, the buggy version does not otherwise quote user-defined text, so macro substitutions will be performed in user text. (As mentioned, this is why __unix__ becomes the empty string.
In flex versions in which user-defined text is not quoted, it is possible to invoke the m4 macro which redefines quote marks. These new quote marks can then be used to quote the #if line, preventing macro substitution of __unix__. However, the quote definition must be restored, or it will completely wreck macro processing of the rest of the file. That's a bit tricky because it is impossible to write [[. (Flex will substitute it with a different string.)
The following seems to do the trick. Note that the macro invocations are placed inside C comments. The changequote macros will expand to an empty string, if they are expanded. But in flex versions since v2.6.2, user-supplied text is quoted, so the changequote macros will not be expanded. Putting them inside comments hides them from the C compiler.
%{
/*m4_changequote(<<,>>)<<*/
#if !defined(__linux__) && !defined(__unix__)
/*>>m4_changequote(<<[>><<[>>,<<]>><<]>>)*/
/* Maybe on windows */
#endif
(The m4 macro which changes quote marks is changequote but flex invokes m4 with the -P flag which changes builtins like changequote to m4_changequote. In the second call to changequote, the two [ which make up the [[ sign are individually quoted with the temporary << quote marks, which hides them from the code in flex which modifies use of [[.)
I don't know how reliable this hack is but it worked on the versions of flex which I had kicking around on my machine, including 2.5.4 (pre-M4) 2.5.39 (buggy), 2.6.1 (buggy), 2.6.2 (somewhat debugged) and 2.6.4 (more debugged).

FAKE Fsc task is writing build products to wrong directory

I'm just learning F#, and setting up a FAKE build harness for a hello-world-like application. (Though the phrase "Hell world" does occasionally come to mind... :-) I'm using a Mac and emacs (generally trying to avoid GUI IDEs by preference).
After a bit of fiddling about with documentation, here's how I'm invoking the F# compiler via FAKE:
let buildDir = #"./build-app/" // Where application build products go
Target "CompileApp" (fun _ -> // Compile application source code
!! #"src/app/**/*.fs" // Look for F# source files
|> Seq.toList // Convert FileIncludes to string list
|> Fsc (fun p -> // which is what the Fsc task wants
{p with //
FscTarget = Exe //
Platform = AnyCpu //
Output = (buildDir + "hello-fsharp.exe") }) // *** Writing to . instead of buildDir?
) //
That uses !! to make a FileIncludes of all the sources in the usual way, then uses Seq.toList to change that to a string list of filenames, which is then handed off to the Fsc task. Simple enough, and it even seems to work:
...
Starting Target: CompileApp (==> SetVersions)
FSC with args:[|"-o"; "./build-app/hello-fsharp.exe"; "--target:exe"; "--platform:anycpu";
"/Users/sgr/Documents/laboratory/hello-fsharp/src/app/hello-fsharp.fs"|]
Finished Target: CompileApp
...
However, despite what the console output above says, the actual build products go to the top-level directory, not the build directory. The message above looks like the -o argument is being passed to the compiler with an appropriate filename, but the executable gets put in . instead of ./build-app/.
So, 2 questions:
Is this a reasonable way to be invoking the F# compiler in a FAKE build harness?
What am I misunderstanding that is causing the build products to go to the wrong place?
This, or a very similar problem, was reported in FAKE issue #521 and seems to have been fixed in FAKE pull request #601, which see.
Explanation of the Problem
As is apparently well-known to everyone but me, the F# compiler as implemented in FSharp.Compiler.Service has a practice of skipping its first argument. See FSharp.Compiler.Service/tests/service/FscTests.fs around line 127, where we see the following nicely informative comment:
// fsc parser skips the first argument by default;
// perhaps this shouldn't happen in library code.
Whether it should or should not happen, it's what does happen. Since the -o came first in the arguments generated by FscHelper, it was dutifully ignored (along with its argument, apparently). Thus the assembly went to the default place, not the place specified.
Solutions
The temporary workaround was to specify --out:destinationFile in the OtherParams field of the FscParams setter in addition to the Output field; the latter is the sacrificial lamb to be ignored while the former gets the job done.
The longer term solution is to fix the arguments generated by FscHelper to have an extra throwaway argument at the front; then these 2 problems will annihilate in a puff of greasy black smoke. (It's kind of balletic in its beauty, when you think about it.) This is exactly what was just merged into the master by #forki23:
// Always prepend "fsc.exe" since fsc compiler skips the first argument
let optsArr = Array.append [|"fsc.exe"|] optsArr
So that solution should be in the newest version of FAKE (3.11.0).
The answers to my 2 questions are thus:
Yes, this appears to be a reasonable way to invoke the F# compiler.
I didn't misunderstand anything; it was just a bug and a fix is in the pipeline.
More to the point: the actual misunderstanding was that I should have checked the FAKE issues and pull requests to see if anybody else had reported this sort of thing, and that's what I'll do next time.

DART string constant set to timestamp as at compilation

How can I get a string constant automatically set to the datestamp as at compile time?
Something like:
const String COMPILE_DATESTAMP = eval_static(DateTime.now().toString());
...
String s = "This program was compiled $COMPILE_DATESTAMP";
where s would then be for e.g.
"This program was compiled 1971-02-03 04:05:06"
Thanks for the question!
There's no required compile step in Dart. (We do have an optional Dart-to-JavaScript compiler, or even a Dart-to-Dart processor that does tree shaking.) Dart's VM accepts input as text files. Similar to Ruby or Python, it runs text-based scripts.
As others have mentioned, this is a job for some sort of build step.
I'm new to Dart, but I haven't seen anything in the documentation to suggest that such a thing is possible. I strongly suspect that it isn't.
If you really need functionality like you describe, I think your best bet is to roll your own build script. Something simple like:
#!/bin/bash
sed -ri "s/INSERT_DATETIME_HERE/`date`/" $1
dart2js $1 -o$1.js
could be modified to suit your needs. (I'd want some sanity checks in there if it were me; I'm just suggesting a starting point.) Your code would become:
const String COMPILE_DATESTAMP = "INSERT_DATETIME_HERE";
...
String s = "This program was compiled $COMPILE_DATESTAMP";
You must write another dart program that could examine the actual compiled program. Then is simple as:
File compiledApp = new File('path/to/compiled/app.dart');
compiledApp.lastModified().then(
(modifiedDate)
{
print("This program was compiled $modifiedDate");
},
onError: (exp)
{
// File doesn't exist ?
}
);
This trick build on knowledge that compiler will modify the 'last modify date' of the file

How to distinguish ERTS versions at pre-processing time?

In the recent Erlang R14, inets' file httpd.hrl has been moved from:
src/httpd.hrl
to:
src/http_server/httpd.hrl
The Erlang Web framework includes the file in several places using the following directive:
-include_lib("inets/src/httpd.hrl").
Since I'd love the Erlang Web to compile with both versions of Erlang (R13 and R14), what I'd need ideally is:
-ifdef(OLD_ERTS_VERSION).
-include_lib("inets/src/httpd.hrl").
-else.
-include_lib("inets/src/http_server/httpd.hrl").
-endif.
Even if it possible to retrieve the ERTS version via:
erlang:system_info(version).
That's indeed not possible at pre-processing time.
How to deal with these situations? Is the parse transform the only way to go? Are there better alternatives?
Not sure if you'll like this hackish trick, but you could use a parse transform.
Let's first define a basic parse transform module:
-module(erts_v).
-export([parse_transform/2]).
parse_transform(AST, _Opts) ->
io:format("~p~n", [AST]).
Compile it, then you can include both headers in the module you want this to work for. This should give the following:
-module(test).
-compile({parse_transform, erts_v}).
-include_lib("inets/src/httpd.hrl").
-include_lib("inets/src/http_server/httpd.hrl").
-export([fake_fun/1]).
fake_fun(A) -> A.
If you're on R14B and compile it, you should have the abstract format of the module looking like this:
[{attribute,1,file,{"./test.erl",1}},
{attribute,1,module,test},
{error,{3,epp,{include,lib,"inets/src/httpd.hrl"}}},
{attribute,1,file,
{"/usr/local/lib/erlang/lib/inets-5.5/src/http_server/httpd.hrl",1}},
{attribute,1,file,
{"/usr/local/lib/erlang/lib/kernel-2.14.1/include/file.hrl",1}},
{attribute,24,record,
{file_info,
[{record_field,25,{atom,25,size}},
{record_field,26,{atom,26,type}},
...
What this tells us is that we can use both headers, and the valid one will automatically be included while the other will error out. All we need to do is remove the {error,...} tuple and get a working compilation. To do this, fix the parse_transform module so it looks like this:
-module(erts_v).
-export([parse_transform/2]).
parse_transform(AST, _Opts) ->
walk_ast(AST).
walk_ast([{error,{_,epp,{include,lib,"inets/src/httpd.hrl"}}}|AST]) ->
AST;
walk_ast([{error,{_,epp,{include,lib,"inets/src/http_server/httpd.hrl"}}}|AST]) ->
AST;
walk_ast([H|T]) ->
[H|walk_ast(T)].
This will then remove the error include, only if it's on the precise module you wanted. Other messy includes should fail as usual.
I haven't tested this on all versions, so if the behaviour changed between them, this won't work. On the other hand, if it stayed the same, this parse_transform will be version independent, at the cost of needing to order the compiling order of your modules, which is simple enough with Emakefiles and rebar.
If you are using makefiles, you can do something like
ERTS_VER=$(shell erl +V 2>&1 | egrep -o '[0-9]+.[0-9]+.[0-9]+')
than match the string and define macro in erlc arguments or in Emakefile.
There is no other way, AFAIK.

#line and jump to line

Do any editors honer C #line directives with regards to goto line features?
Context:
I'm working on a code generator and need to jump to a line of the output but the line is specified relative to the the #line directives I'm adding.
I can drop them but then finding the input line is even a worse pain
If the editor is scriptable it should be possible to write a script to do the navigation. There might even be a Vim or Emacs script that already does something similar.
FWIW when I writing a lot of Bison/Flexx I wrote a Zeus Lua macro script that attempted to do something similar (i.e. move from input file to the corresponding line of the output file by search for the #line marker).
For any one that might be interested here is that particular macro script.
#line directives are normally inserted by the precompiler, not into source code, so editors won't usually honor that if the file extension is .c.
However, the normal file extension for post-compiled files is .i or .gch, so you might try using that and see what happens.
I've used the following in a header file occasionally to produce clickable items in
the VC6 and recent VS(2003+) compiler ouptut window.
Basically, this exploits the fact that items output in the compiler output
are essentially being parsed for "PATH(LINENUM): message".
This presumes on the Microsoft compiler's treatment of "pragma remind".
This isn't quite exactly what you asked... but it might be generally helpful
in arriving at something you can get the compiler to emit that some editors might honor.
// The following definitions will allow you to insert
// clickable items in the output stream of the Microsoft compiler.
// The error and warning variants will be reported by the
// IDE as actual warnings and errors... which means you can make
// them occur in the task list.
// In theory, the coding standards could be checked to some extent
// in this way and reminders that show up as warnings or even
// errors inserted...
#define strify0(X) #X
#define strify(X) strify0(X)
#define remind(S) message(__FILE__ "(" strify( __LINE__ ) ") : " S)
// example usage
#pragma remind("warning: fake warning")
#pragma remind("error: fake error")
I haven't tried it in a while but it should still work.
Use sed or a similar tool to translate the #lines to something else not interpreted by the compiler, so you get C error messages on the real line, but have a reference to the original input file nearby.

Resources