Flex input buffer reset after error - buffer

I'm using flex & bison to parse a custom language and I'm in the situation described here: http://www.gnu.org/software/bison/manual/html_node/How-Can-I-Reset-the-Parser.html.
To be more precise:
I invoke yyparse several times, and on correct input it works
properly; but when a parse error is found, all the other calls fail
too. How can I reset the error flag of yyparse?
My parser and scanner run inside a separate thread, but only one thread works with the input file, so as I understand it I don't need to write a reentrant scanner. On that page the problem is clearly explained, but the solution is not clear to me.
It says:
Therefore, whenever you change yyin, you must tell the Lex-generated
scanner to discard its current buffer and switch to the new one. This
depends upon your implementation of Lex; see its documentation for
more. For Flex, it suffices to call ‘YY_FLUSH_BUFFER’ after each
change to yyin. If your Flex-generated scanner needs to read from
several input streams to handle features like include files, you might
consider using Flex functions like ‘yy_switch_to_buffer’ that
manipulate multiple input buffers
My parser thread calls yyparse in order to build my AST. What is not clear to me is when and where I have to call yy_flush_buffer to fix the problem. As I understand it, the scanner code (generated by Flex) is called by the parser code (generated by Bison), and the parser code is generated from the grammar, so it is not under my direct control. I cannot put the call to yy_flush_buffer into the parser code, since it would be overwritten every time I regenerate the parser from the grammar. That means the call has to go somewhere in the grammar file. But where?

I fixed the problem by doing:
...
FILE *f = fopen(_filename, "r");
yyrestart(f);
yyparse();
...
I am leaving the question up since it could be useful to other people.
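For readers who want the fix above in a reusable form, here is a minimal sketch. It assumes the default non-reentrant Flex/Bison interface, where the generated functions are void yyrestart(FILE *) and int yyparse(void). The helper name parse_one_file and its two callback parameters are hypothetical; they are used only so the sketch compiles without the generated scanner and parser.

```c
#include <assert.h>
#include <stdio.h>

/* Callback types matching the default (non-reentrant) generated functions:
 *   void yyrestart(FILE *input_file);   from the Flex scanner
 *   int  yyparse(void);                 from the Bison parser
 */
typedef void (*restart_fn)(FILE *);
typedef int (*parse_fn)(void);

/* Hypothetical helper: parse one file starting from a clean scanner buffer.
 * Returns -1 if the file cannot be opened, otherwise the parser's result
 * (yyparse returns 0 on success and nonzero on failure). */
int parse_one_file(const char *path, restart_fn restart, parse_fn parse)
{
    assert(path != NULL && restart != NULL && parse != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    restart(f);          /* discard whatever the previous parse left buffered */
    int rc = parse();    /* run the parser on the fresh input */

    fclose(f);
    return rc;
}
```

In a real build the call would look like parse_one_file(_filename, yyrestart, yyparse). One caveat worth checking against the Flex manual: yyrestart resets the input buffer but not the scanner's start condition, so a scanner that uses start conditions may also need BEGIN(INITIAL) after a failed parse.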

Related

Add #include's to the headers of a program using llvm clang

I need to add headers to an already existing program by transforming it with LLVM and Clang.
I have used clang's rewriter to accomplish a similar thing when changing function names, arguments, etc.
But the header files aren't present in clang's AST. I already know we need to use PPCallbacks (https://clang.llvm.org/doxygen/classclang_1_1PPCallbacks.html) but I am in dire need of some examples on how to make it work with the rewriter if at all possible.
Alternatively, adding a #include statement just before the first
using namespace <namespace>;
also works. I would like to see an example of this as well.
Any help would be appreciated.
There is a bit of confusion in your question. You need to understand in detail how the preprocessor works. Be aware that most of C++ compilation happens after the preprocessing phase (so most C++ static analyzers work after that phase).
In other words, the C++ specification (and also the C specification) first defines what preprocessing is, and then the syntax and semantics of the preprocessed form.
So when compiling foo.cc, your compiler sees the preprocessed form foo.ii, which you could obtain with clang++ -C -E foo.cc > foo.ii
In the 1980s the preprocessor /lib/cpp was a separate program forked by the compiler (and a temporary foo.ii sat on the disk and was removed at the end of compilation). Today, for performance reasons, preprocessing is an initial step done inside the compiler. But you can reason as if it were still separate.
Either you want to alter the Clang compiler itself, which (like every other C++ compiler or C++ static analyzer) deals mostly with the preprocessed form. In that case you don't want to add new #include-s; you want to alter the stream of AST given to the compiler after preprocessing, and that is a different question: you then want to add some AST between existing AST elements, independently of any preprocessor directives.
Or you want to automatically change the C++ source code. The hard part is determining what you want to change and at what place. I suppose that you have used complex stuff to determine that a #include <vector> has to be inserted after line 34 of file foo.cc. Once you've got that information (and getting it is the hard thing), doing the insertion is pretty trivial. For example, you could read every C++ source line, and insert your line when you have read enough lines.
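The line-by-line insertion described above could be sketched as follows. This is only an illustration of that suggestion in plain C, not a Clang-based solution; the function name insert_line_after and its parameters are hypothetical, and it assumes you already know the target line number.

```c
#include <assert.h>
#include <stdio.h>

/* Copy `in` to `out`, adding `text` as a new line directly after line
 * `after_line` (1-based; 0 means "before the first line").
 * Returns 0 if the line was inserted, -1 if the input was too short. */
int insert_line_after(FILE *in, FILE *out, unsigned after_line, const char *text)
{
    assert(in != NULL && out != NULL && text != NULL);

    unsigned lines_seen = 0;   /* complete lines copied so far */
    int last = '\n';           /* last character copied */
    int inserted = 0;
    int c;

    if (after_line == 0) {
        fprintf(out, "%s\n", text);
        inserted = 1;
    }

    while ((c = fgetc(in)) != EOF) {
        fputc(c, out);
        last = c;
        if (c == '\n' && ++lines_seen == after_line) {
            fprintf(out, "%s\n", text);
            inserted = 1;
        }
    }

    /* The target may be a final line that lacks a trailing newline. */
    if (!inserted && last != '\n' && lines_seen + 1 == after_line) {
        fprintf(out, "\n%s\n", text);
        inserted = 1;
    }

    return inserted ? 0 : -1;
}
```

For the example in the answer, the call would be insert_line_after(in, out, 34, "#include <vector>"), with in reading foo.cc and out writing a new file that replaces it afterwards.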

flex yy_fatal_error exits just like that. I want control handed back to the application

flex's yy_fatal_error simply calls exit, but I want control handed back to my application. How can I avoid the exit call in yy_fatal_error? Has this problem been addressed in any version of flex? Any suggestions would be highly appreciated.
You can override the function by #define-ing your own. Note that in the generated code there is
/* Report a fatal error. */
#ifndef YY_FATAL_ERROR
#define YY_FATAL_ERROR(msg) yy_fatal_error( msg )
#endif
If you #define the macro YY_FATAL_ERROR(msg) to call your own function, the lexer will call that function rather than the one from the template.
However, the lexer template is written to assume that this function does not return. You can satisfy that by using setjmp and longjmp: prepare a predictable place to return to in your application, and jump back to it (from your own yy_fatal_error function) when a "fatal" error occurs.
vile (vi like emacs) does this, for instance, because it uses lexers for syntax highlighting. If a fatal error is generated by the lexer, you would not want the editor to stop.
Here are a few links discussing setjmp and longjmp:
Practical usage of setjmp and longjmp in C
setjmp and longjmp - understanding with examples
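The setjmp/longjmp approach described in this answer might look roughly like the sketch below. The names my_fatal_error, run_scanner, last_fatal_message and fake_yylex are hypothetical; fake_yylex only stands in for the flex-generated yylex so that the sketch compiles on its own. In a real scanner the #define would go in the definitions section of the .l file, so that it is seen before the #ifndef YY_FATAL_ERROR default.

```c
#include <assert.h>
#include <setjmp.h>
#include <string.h>

static jmp_buf fatal_jump;        /* where to resume after a fatal error */
static char fatal_message[128];   /* copy of the last fatal error message */

/* Hypothetical replacement for flex's yy_fatal_error: it records the
 * message and jumps back to the application instead of calling exit(). */
static void my_fatal_error(const char *msg)
{
    assert(msg != NULL);
    strncpy(fatal_message, msg, sizeof fatal_message - 1);
    fatal_message[sizeof fatal_message - 1] = '\0';
    longjmp(fatal_jump, 1);       /* never returns, as the template expects */
}

/* Defined before the generated code, so flex's default is skipped. */
#define YY_FATAL_ERROR(msg) my_fatal_error(msg)

/* Stand-in for the generated yylex(); `fail` simulates a fatal condition. */
static int fake_yylex(int fail)
{
    if (fail)
        YY_FATAL_ERROR("input buffer overflow");
    return 0;
}

/* Returns the scanner's result, or -1 if it reported a fatal error. */
int run_scanner(int fail)
{
    if (setjmp(fatal_jump) != 0)
        return -1;                /* arrived here via longjmp */
    return fake_yylex(fail);
}

/* Message recorded by the most recent fatal error. */
const char *last_fatal_message(void)
{
    return fatal_message;
}
```

After such a jump the scanner's internal state should probably not be trusted; resetting it (for example with yyrestart) before scanning again seems the safer assumption.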

Steps to follow while parsing second file through yyparse

I want to parse two files. I have Yacc/lex code which generates the parser.
It works fine when I parse the first file (a.txt), but when I then parse the second file (b.txt) it returns an error (syntax error). If I parse the second file (b.txt) first, it parses without problems.
My guess is that after reading the first file, some buffers or states are not cleared when it starts reading the second file. So my question is: do I have to reset some buffers or states that the parser maintains before parsing the second file?
I cannot paste my code over here as it is too large.
Thanks in advance.
You want a reentrant parser. Bison at least supports this, I'm not really sure if yacc does this, but switching to bison should be effectively painless.
Add %pure-parser in your grammar file.
http://www.delorie.com/gnu/docs/bison/bison_66.html
Actually I found the answer to this through some other question. The problem was in clearing the buffer so if you add a
YY_FLUSH_BUFFER
before opening a new file, it solves the problem.

Can I get lex to put out a yylex() function with a different name?

I want to have two lexers in one project, and I don't want to run into problems with having multiple yylex functions in the build. Can I make lex output with a different prefix?
You can use the -Pprefix parameter for flex in your makefile. Using flex -Pfoo you would effectively prefix all yy generated functions. Have a look at the manual page for further details.
flex lets you do that. Just define the YY_DECL macro. Dunno about actual Unix(tm) lex(1) though.
You could build a C++ lexer. This means all the state information is held in an object.
Then it is just a matter of using the correct object!

lua script error checking

Is it possible to check if a Lua script contains errors without executing it? I have the following code:
if(luaL_loadbuffer(L, data, size, name))
{
fprintf (stderr, "%s", lua_tostring (L, -1));
lua_pop (L, 1);
}
if(lua_pcall(L, 0, 0, 0))
{
fprintf (stderr, "%s", lua_tostring (L, -1));
lua_pop (L, 1);
}
But if the script contains errors, it passes the first if and is executed. I want to know whether it contains errors when I load it, not when I execute it. Is this possible?
You can use the Lua compiler, luac. It will only compile your file to bytecode without executing it.
A precompiled chunk also loads faster, although it does not execute any faster, since Lua always compiles source to bytecode before running it.
You can even use the -p option to perform only a syntax check, according to the luac man page:
-p load files but do not generate any output file. Used mainly for syntax checking or testing precompiled chunks: corrupted files will probably generate errors when loaded. For a thorough integrity test, use -t.
(This was originally meant as a reply to the first comment to Krtek's question, but I ran out of space there and to be honest it works as an answer just fine.)
Functions are essentially values, and thus a named function is actually a variable of that name. Variables, by their very definition, can change as a script is executed. Hell, someone might accidentally redefine one of those functions. Is that bad? To sum my thoughts up: depending on the script, parameters passed and/or actual implementations of those pre-defined functions you speak of (one might unset itself or others, for example), it is not possible to guarantee things work unless you are willing to narrow down some of your demands. Lua is too dynamic for what you are looking for. :)
If you want a flawless test: create a dummy environment with all bells and whistles in place, and see if it crashes anywhere along the way (loading, executing, etc). This is basically a sort of unit test, and as such would be pretty heavy.
If you want a basic check to see if a script has a valid syntax: Krtek gave an answer for that already. I am quite sure (but not 100%) that the lua equivalent is to loadfile or loadstring, and the respective C equivalent is to try and lua_load() the code, each of which convert readable script to bytecode which you would already need to do before you could actually execute the code in your normal all-is-well usecase. (And if that contained function definitions, those would need to be executed later on for the code inside those to execute.)
However, these are the extent of your options with regards to pre-empting errors before they actually happen. Lua is a very dynamic language, and what is a great strength also makes for a weakness when you want to prove correctness. There are simply too many variables involved for a perfect solution.
In general it is not possible, as Lua is a dynamic language and most errors happen at runtime.
If you want to check for syntax errors, use the luac -p option. I use it as part of my pre-commit hook, for example.
Other common errors are triggered by misuse of global variables. You can analyze the output of luac -l to catch these cases. See here: http://lua-users.org/wiki/DetectingUndefinedVariables.
If you want something more advanced, there are several more-or-less functional static analysis tools for Lua code. Start with LuaInspect.
In any case, you are advised to write unit tests instead of just relying on static code checks. Less pain, more gain.

Resources