How to explicitly write the default rule of GNU flex? - flex-lexer

The flex manual says the following:
By default, any text not matched by a flex scanner is copied to the output
I want to understand how to write it explicitly. Is it something like this?
%%
. ECHO;
Also, how to disable the default rule?

The default rule is:
.|\n ECHO;
(In every start condition)
Remember that . in (f)lex does not match a newline.
To disable the default rule, use the declaration
%option nodefault
Once you do that, you will get a warning if your rules do not cover every eventuality. If you ignore the warning and use the generated scanner, it will stop with a fatal error if the input does not match any pattern.
Since you hardly ever want the default rule, I recommend always using the above %option.
If you have some default rule of your own in mind, you can place it as the last rule in your file:
<*>.|\n /* default action here */
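Putting the pieces above together, a minimal sketch of a complete scanner file might look like the following. The token rules (numbers, whitespace) and the error message are assumptions made up for illustration; only %option nodefault and the <*>.|\n catch-all come from the answer above.

```lex
%option nodefault noyywrap
%%
[0-9]+    { printf("NUMBER(%s)\n", yytext); }
[ \t\n]+  { /* skip whitespace */ }
<*>.|\n   { /* explicit fallback, active in every start condition */
            fprintf(stderr, "unexpected character 0x%02x\n",
                    (unsigned char)yytext[0]); }
%%
/* yylex() returns 0 at end of input, so the exit status is 0. */
int main(void) { return yylex(); }
```

Assuming the file is saved as scanner.l, it can be built with flex scanner.l followed by cc lex.yy.c -o scanner. Replacing the fprintf action with ECHO; reproduces the behaviour of the built-in default rule.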

Related

clang-format giving inconsistent results

I'm working on a program that does a clang-format check of any newly submitted git files and shows a diff of what the user needs to change to match the correct format.
The issue here is that I've already run clang-format on the current git repo, and when I run it again (on the already clang-formatted file), it decides to continue to make changes. One of the examples is:
1294c1294
< // 3)
---
> //3)
clang-format can't seem to decide whether or not it wants a space here. Another example:
229,230c229
< * testCodeUnknown OBJECT IDENTIFIER ::=
< *{
---
> * testCodeUnknown OBJECT IDENTIFIER ::= {
This is within a comment, so it really doesn't matter, but it's making a mess of the 1600 files I have to check and diff when clang-format doesn't produce consistent results.
I am calling clang-format the same way both times, other than the fact that the actual format includes -i and the check pipes stdout to a temp file.
Is this expected behavior? Is there a way to get clang-format to make up its mind (do I have to explicitly set these values in the .clang-format file?)
Edit
As per this question (clang-format makes changes to an already formatted file), this is an idempotency bug in clang-format. I will look further into reproduction cases, but I need some help figuring out which rule this is so I can potentially just not implement that rule as a workaround.

Swig-template delete Whitespaces per default

I am using Swig as the template engine in my project to create XML.
To make the XML output look nice I need to add a "-" every time I use the template functions ({% -%}, {{ -}}, {# -#}).
It would be nice to be able to change the default behavior to always strip whitespace before and after. Is there a setting for this already?
No, there isn't.
The stripping is done in line 624 in parser.js:
https://github.com/paularmstrong/swig/blob/2e0e135ac04da5bf75f79cf8d4498094b3b49d35/lib/parser.js#L624
The variables stripNext and stripPrev are set to true only if a tag or a variable expression includes this -. If not, no stripping is done. There is no other way.

How to simulate a user's input (only internally)?

I need to parse something during the runtime of my Eclipse plugin (created with Xtext) so that it is treated as if the user had typed it in, but without it actually showing up and being visible to the user as their own input.
For example, I have this input:
for "i" from 1 to 3 do {};
My problem with this input is that the variable i is not declared as a normal declaration with a "=", but I need the parser to treat it as one. So I want to let the parser parse
i = 1;
so it recognizes it as a normal declaration and allows cross-references to it.
Greeting Krzmbrzl
EDIT:
All in all, the thing I want is to add a statement i=1; to the AST.
I just want to have Eclipse support for an existing language, so I'm writing neither an interpreter nor a generator. The problem is that when I have a for-loop like the one above, the actual interpreter of that language declares a variable i (or whatever it's named in the loop header), and therefore this variable is available in the loop body. In my case my parser doesn't recognise i as a variable because it only knows that a declaration is done via "=", so I can't use i in the loop body (if I try, I get the error that the declaration i cannot be resolved). This is why I want to add this declaration manually when such a loop is created. I don't need to worry about any compiling or interpreting difficulties because I don't do this myself. As I already said, I just want to have all the cool Eclipse features for this language.
OK, now I understand your problem. It is still not a good idea to add any element to the AST to resolve a cross-reference. Don't do that! Instead, you should try to refactor your grammar so that "i" in for "i" from ... is a compatible declaration of a variable. There are several tricks to do that. Have you read the Xtext documentation completely? Have you also read the Xtext book? Both documents tell a lot about how to make Xtext do things you would not expect.
Anyway, two tricks I often use are:
Introduce an unused, abstract Parser Rule which you can then use as destination of a cross reference, but which is never used as an attribute (containment reference).
AbstractDecl:
VarDecl | ForVarDecl;
VarDecl:
name=ID ...;
ForVarDecl:
'"' name=ID '"';
For:
'for' decl=ForVarDecl 'from' from=INT 'to' to=INT 'do' block=Block;
...
StatementWithCR:
ref=[AbstractDecl] ...;
Define any ParserRule which returns another type.
ForDecl returns VarDecl:
'"' name=ID '"';
If you post the grammar that corresponds to this specific problem, we could develop a solution which is safe. Adding anything to the AST while the editor is live-processing its content will lead to a faulty state which can destroy your document.

Print definition of a function in Forth

When a word is already defined in Forth, is there a way to print its definition?
I've heard that many of Forth's built-in functions such as emit, drop, etc. are defined in terms of the language itself, and I'd like to be able to look at their definitions.
You can usually use
see emit
Which in Gforth gives you something like:
: (emit)
outfile-id emit-file drop ;
latestxt
Defer emit
IS emit
ok

Can I get lex to put out a yylex() function with a different name?

I want to have two lexers in one project, and I don't want to run into problems with having multiple yylex functions in the build. Can I make lex output with a different prefix?
You can use the -Pprefix parameter for flex in your makefile. Using flex -Pfoo you would effectively prefix all yy generated functions. Have a look at the manual page for further details.
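As a rough sketch of the prefix approach, the same effect as -P can be requested from inside the scanner file with %option prefix. The prefix ints_ and the rules are hypothetical names chosen for this example; a second scanner in the same build would use a different prefix.

```lex
/* ints.l: hypothetical first scanner; the entry point becomes ints_lex() */
%option prefix="ints_" noyywrap nodefault
%%
[0-9]+   { printf("int: %s\n", yytext); }
.|\n     { /* ignore everything else */ }
%%
/* Code in other files would declare and call: int ints_lex(void); */
```

Inside the .l file the usual yy names such as yytext still work, because the generated file maps them onto the prefixed names; only code outside the scanner has to use ints_lex, ints_text and so on.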
flex lets you do that. Just define the YY_DECL macro. Dunno about actual Unix(tm) lex(1) though.
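A minimal sketch of the YY_DECL approach, assuming a hypothetical function name words_lex: the macro is defined in the definitions section so that the generated scanning routine gets that name instead of yylex.

```lex
%{
/* Assumption: words_lex is a made-up name for this sketch. */
#define YY_DECL int words_lex(void)
%}
%option noyywrap nodefault
%%
[a-zA-Z]+  { printf("word: %s\n", yytext); }
.|\n       { /* ignore everything else */ }
%%
int main(void) { return words_lex(); }
```

Note that YY_DECL only renames the scanning function itself. Other generated globals such as yytext and yyin keep their names, so two scanners linked into one program would probably still need the prefix option to avoid clashes.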
You could build a C++ lexer. This means all the state information is held in an object.
Then it is just a matter of using the correct object!
