flex 2.5.35 gives error when ctrl-M used in lex file

I have a simple lex file.
%{
#include <stdio.h>
%}
space_char [ \t\^M]
space {space_char}+
%%
%%
int yywrap(void) {
    return 1;
}
int main(void) {
    yylex();
    return 0;
}
When I compile this file with flex-2.5.35, it gives the following errors:
lex.l:5: bad character:
lex.l:5: name defined twice
But with flex-2.5.4 it runs fine.
I understand this error is caused by the special character ctrl-M (carriage return). Does flex-2.5.35 not support special characters such as ctrl-L and ctrl-M? If not, what is the alternative? Please note that I am restricted to using 2.5.35.
Thanks.

As in C, you can use \r for the carriage return character.

Related

How to use myname2.lex in the flex examples?

I see the following example in the flex examples. I am able to compile it, but I am not sure what input I should give it. Could anybody let me know? Thanks.
/*
* myname2.lex : A sample Flex program
* that does token replacement.
*/
%{
#include <stdio.h>
#include <stdlib.h> /* getenv */
%}
%x STRING
%%
\" ECHO; BEGIN(STRING);
<STRING>[^\"\n]* ECHO;
<STRING>\" ECHO; BEGIN(INITIAL);
%NAME { printf("%s",getenv("LOGNAME")); }
%HOST { printf("%s",getenv("HOST")); }
%HOSTTYPE { printf("%s",getenv("HOSTTYPE"));}
%HOME { printf("%s",getenv("HOME")); }
The rules %NAME, %HOST, %HOSTTYPE and %HOME match those exact strings respectively. So you could enter those and see their corresponding actions execute.
You could also enter one of them surrounded by quotes (e.g. "%HOST") and observe that its action will not be executed because the whole thing was seen as a string literal.

Use yylex() to get the list of token types from an input string

I have a CLI that was made using Bison and Flex which has grown large and complicated, and I'm trying to get the complete sequence of tokens (yytokentype or the corresponding yytranslate Bison symbol numbers) for a given input string to the parser.
Ideally, every time yyerror() is called I want to store the sequence of tokens that were identified during parse. I don't need to know the yylval's, states, actions, etc, just the token list resulting from the string input to the buffer.
If a straightforward way of doing this doesn't exist, then just a stand-alone way of going from string --> yytokentypes will work.
The below code just has debugging printouts, which I'll change to storing it in the place I want as soon as I figure out how to get the tokens.
// When an error condition is reached, yylex() to get the yytokentypes
void yyerror(const char *s)
{
    std::cerr << "LEX\n";
    int tok; // yytokentype
    do
    {
        tok = yylex();
        std::cerr << tok << ",";
    } while (tok);
    std::cerr << "LEX\n";
}
A simpler solution is to just change the name of the lexer using the YY_DECL macro and then add a definition of yylex at the end:
%{
// ...
#include "parser.tab.h"
#define YY_DECL static int wrapped_lexer(void)
%}
%%
/* rules */
%%
int yylex(void) {
    int token = wrapped_lexer();
    /* do something with the token */
    return token;
}
Having said that, unless the source can only be read once for some reason, it's probably faster on the whole to rescan the input only when an error is encountered rather than saving the token list in case an error occurs. Lexing is really pretty fast, and in many use cases syntactically correct inputs are more common than erroneous ones.
OK, I figured out a way to do this without having to re-tokenize the input string. Flex allows you to define YY_DECL, which by default is defined in the generated lexer file to produce the yylex() declaration:
#ifndef YY_DECL
//some other stuff
#define YY_DECL int yylex (void)
#endif /* !YY_DECL */
The macro is then used in place of the function header:
/** The main scanner function which does all the work.
*/
YY_DECL
{
// Body of yylex() which returns the yytokentype
}
A tricky thing that I'm able to do is re-define yylex() via YY_DECL to capture every token before it gets returned to the caller. This allows me to store the yytokentype for every call without changing the parser's behavior one bit. Below I'm just printing it out here for testing:
#define YY_DECL \
    int yylex2(void); \
    int yylex(void) \
    { \
        int ret; \
        ret = yylex2(); \
        std::cerr << "yylex2 returns: " << ret << "\n"; \
        return ret; \
    } \
    int yylex2(void)

Why call yylex() only once in main()

When I write a yylex() for a yacc parser, yylex() usually returns one symbol at a time, so it must be called multiple times until the end of the file is reached.
But when I write a main function for a lex scanner, I call yylex() just once and the whole file is still fully scanned.
int main(int argc, char* argv[]) {
    printf("start\n");
    yyin = fopen(argv[1], "r");
    yylex();
    printf("word count: %d\n", wordCount);
    fclose(yyin);
    return 0;
}
Why?
Sorry for asking a silly question. I have read the C file generated by lex and found that the action code is pasted into a switch/case block. So, as @rici said, it depends very much on what you write in the action: since the code in my actions does not return, a single call to yylex() goes through the whole stream. When an action does return, I should call yylex() in a while() loop.

luaL_dostring() crashes when given script has syntax error

I am trying to integrate Lua into an embedded project using GCC on a Cortex-M4. I am able to load and run a Lua script, call Lua functions from C, and call C functions from Lua. But the C program crashes (the HardFault_Handler trap is raised) when the script passed to luaL_dostring() contains any Lua syntax error.
Here is the relevant C code that crashes due to the syntax error in Lua:
//create Lua VM...
luaVm = lua_newstate(luaAlloc, NULL);
//load libraries...
luaopen_base(luaVm);
luaopen_math(luaVm);
luaopen_table(luaVm);
luaopen_string(luaVm);
//launch script...
luaL_dostring(luaVm, "function onTick()\n"
" locaal x = 7\n" //syntax error
"end\n"
"\n" );
When doing the same with correct Lua syntax, it works:
luaL_dostring(luaVm, "function onTick()\n"
" local x = 7\n"
"end\n"
"\n" );
When debugging and stepping through luaL_dostring(), I can follow the Lua parsing line by line; when it reaches the line with the syntax error, the C program crashes.
Can anybody help? Thanks.
I have disabled setjmp/longjmp in the Lua source code in the following way:
//#define LUAI_THROW(L,c) longjmp((c)->b, 1) //TODO oli4 orig
//#define LUAI_TRY(L,c,a) if (setjmp((c)->b) == 0) { a } //TODO oli4 orig
#define LUAI_THROW(L,c) while(1) //TODO oli4 special
#define LUAI_TRY(L,c,a) { a } //TODO oli4 special
...so setjmp/longjmp is no longer used, but I still get the crash :-(
must have another cause???
Found the problem: it is the sprintf function called on a Lua syntax error. On my platform sprintf does not seem to support floating-point formatting, so I changed luaconf.h as follows, limiting the format to integers.
//#define LUA_NUMBER_FMT "%.14g"
#define LUA_NUMBER_FMT "%d"
must have another cause???
Yes: you can't use Lua here.
Lua's error handling system is built on a framework of setjmp/longjmp. You can't just make LUAI_THROW and LUAI_TRY do nothing. That means lua_error and all internal error handling stop working. Syntax errors are part of Lua's internal error handling.
If your C compiler doesn't provide proper support for the C standard library, then Lua is simply not going to be functional in that environment. You might try LuaJIT, but I doubt that will be any better.
#define LUAI_THROW(L,c) (c)->throwed = true
#define LUAI_TRY(L,c,a) \
__try { a } __except(filter()) { if ((c)->status == 0 && ((c)->throwed)) (c)->status = -1; }
#define luai_jmpbuf int /* dummy variable */
struct lua_longjmp {
struct lua_longjmp *previous;
luai_jmpbuf b;
volatile int status; /* error code */
bool throwed;
};
Works as expected even if you build without C++ exceptions.

Complete URL encoding

Anyone know of a tool to completely encode a string to URL encoding? Best known example is something to convert space character to %20. I want to do this for every single character. What's a good tool for this (linux)?
Thanks everyone for downvoting; if I cared what language, I would have specified. I couldn't find anything useful in the other post linked below, so I wrote this. It is good enough for me and might be good enough for you.
#include <stdio.h>

// Treats all args as one big string. Inserts implicit spaces between args.
int main(int argc, char *argv[])
{
    if (argc == 1)
    {
        printf("Need something to encode.\n");
        return 1;
    }
    for (int count = 1; count < argc; count++)
    {
        const char *input = argv[count];
        if (count > 1)
            printf("%%20"); // implicit space between args, not after the last one
        while (*input != '\0')
        {
            // Cast so bytes above 0x7F are not sign-extended; always two hex digits.
            printf("%%%02X", (unsigned char)*input);
            input++;
        }
    }
    printf("\n");
    return 0;
}
Take a look at this SO question:
How to urlencode data for curl command?
Which programming language? You can even do something client-side...
I modified this from the other link:
perl -p -e 's/(.)/sprintf("%%%02X", ord($1))/seg'
It works nicely enough.
Run this and type in what you want to convert (or pipe it through), and it will output everything percent-encoded.
