nodemcu string.format odd results - lua

I need a specific format for a float number: (sign)xx.dd
When I try to build a string.format for this format I get odd results.
h = 5.127 -- (it should be converted to +05.13)
print(string.format("%+05.2f",h))
--> 05.13
print(string.format("%+06.2f",h))
--> 005.13
h = -5.127 -- (it should be converted to -05.13)
print(string.format("%05.2f",h))
--> -5.13
print(string.format("%06.2f",h))
--> 0-5.13
Of course, I have an easy workaround, but I think that there is something wrong in this build.
build created on 2018-04-09 15:12
powered by Lua 5.1.4 on SDK 2.2.1(cfd48f3)
BR,
eHc

This is a bug (or undocumented deficiency) in NodeMCU.
Lua implements most of the handling of string.format format specifiers by handing them off to the C standard library's sprintf function. (There are a few things sprintf allows that Lua doesn't, but + ought to work fine.)
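For comparison, here is a quick check against a desktop C library (not NodeMCU); with a standard sprintf/printf the + and 0 flags combine exactly the way the question expects:

/* sanity check with a standard C library, not NodeMCU */
#include <stdio.h>

int main(void) {
    printf("%+06.2f\n",  5.127);   /* prints +05.13 */
    printf("%+06.2f\n", -5.127);   /* prints -05.13 */
    return 0;
}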
NodeMCU has modified Lua to replace most (or all) of the standard library calls with calls to replacement functions defined by NodeMCU (which is normally crazy, but maybe okay in the embedded systems domain). NodeMCU's sprintf implementation doesn't support +.
This is the relevant code from NodeMCU's source (c_stdio.c). Notice that unknown characters in the format specifier are silently ignored:
for (; *s; s++) {
    if (strchr("bcdefgilopPrRsuxX%", *s))
        break;
    else if (*s == '-')
        fmt = FMT_LJUST;
    else if (*s == '0')
        fmt = FMT_RJUST0;
    else if (*s == '~')
        fmt = FMT_CENTER;
    else if (*s == '*') {
        // [snip]
        // ...
    } else if (*s >= '1' && *s <= '9') {
        // [snip]
        // ...
    } else if (*s == '.')
        haddot = 1;
}
Similarly, the 0 flag is not implemented correctly for numbers -- as you have noticed, it just pads on the left regardless of sign.

Related

How to add local variables to yylex function in flex lexer?

I was writing a lexer file that matches simple custom delimited strings of the form xyz$this is stringxyz. This is nearly how I did it:
%{
#include <stdint.h>
#include <string.h>

char delim[16];
uint8_t dlen;
%}

%x STRING

%%

.*\$        {
                dlen = yyleng - 1;
                strncpy(delim, yytext, dlen);
                BEGIN(STRING);
            }

<STRING>.   {
                if (yyleng >= dlen) {
                    if (strncmp(delim, &yytext[yyleng - dlen], dlen) == 0) {
                        BEGIN(INITIAL);
                        return STR;
                    }
                }
                yymore();
            }

%%
Now I want to convert this to a reentrant lexer, but I don't know how to make delim and dlen local variables inside yylex without modifying the generated lexer. Can someone please help me figure out how to do this?
I don't want to store these in yyextra because these variables need not persist across multiple calls to yylex. Hence I would prefer an answer that guides me towards declaring them as local variables.
In the (f)lex file, any indented lines between the %% and the first rule are copied verbatim into yylex() prior to the first statement, precisely to allow you to declare and initialize local variables.
This behaviour is guaranteed by the Posix specification; it is not a flex extension: (emphasis added)
Any such input (beginning with a <blank> or within "%{" and "%}" delimiter lines) appearing at the beginning of the Rules section before any rules are specified shall be written to lex.yy.c after the declarations of variables for the yylex() function and before the first line of code in yylex(). Thus, user variables local to yylex() can be declared here, as well as application code to execute upon entry to yylex().
A similar statement is in the Flex manual section 5.2, Format of the Rules Section
The strategy you propose will work, certainly, but it's not very efficient. You might want to consider using input() to read characters one at a time, although that's not terribly efficient either. In any event, delim is unnecessary:
%%
    int dlen;

[^$\n]{1,16}\$    {
                      dlen = yyleng - 1;
                      yymore();
                      BEGIN(STRING);
                  }

<STRING>.         {
                      if (yyleng > dlen * 2) {
                          if (memcmp(yytext, yytext + yyleng - dlen, dlen) == 0) {
                              /* Remove the delimiter from the reported value of yytext. */
                              yytext += dlen + 1;
                              yyleng -= 2 * dlen + 1;
                              yytext[yyleng] = 0;
                              return STR;
                          }
                      }
                      yymore();
                  }
%%

Any suggestions about how to implement a BASIC language parser/interpreter?

I've been trying to implement a BASIC language interpreter (in C/C++) but I haven't found any book or (thorough) article which explains the process of parsing the language constructs. Some commands are rather complex and hard to parse, especially conditionals and loops, such as IF-THEN-ELSE and FOR-STEP-NEXT, because they can mix variables with constants and entire expressions and code and everything else, for example:
10 IF X = Y + Z THEN GOTO 20 ELSE GOSUB P
20 FOR A = 10 TO B STEP -C : PRINT C$ : PRINT WHATEVER
30 NEXT A
It seems like a nightmare to be able to parse something like that and make it work. And to make things worse, programs written in BASIC can easily be a tangled mess. That's why I need some advice, read some book or whatever to make my mind clear about this subject. What can you suggest?
You've picked a great project - writing interpreters can be lots of fun!
But first, what do we even mean by an interpreter? There are different types of interpreters.
There is the pure interpreter, where you simply interpret each language element as you find it. These are the easiest to write, and the slowest.
A step up would be to convert each language element into some sort of internal form, and then interpret that. Still pretty easy to write.
The next step would be to actually parse the language, generate a syntax tree, and then interpret that. This is somewhat harder to write, but once you've done it a few times, it becomes pretty easy.
Once you have a syntax tree, you can fairly easily generate code for a custom stack virtual machine. A much harder project is to generate code for an existing virtual machine, such as the JVM or CLR.
In programming, like most engineering endeavors, careful planning greatly helps, especially with complicated projects.
So the first step is to decide which type of interpreter you wish to write. If you have not read any of a number of compiler books (e.g., I always recommend Niklaus Wirth's "Compiler Construction" as one of the best introductions to the subject; it is now freely available on the web in PDF form), I would recommend that you go with the pure interpreter.
But you still need to do some additional planning. You need to rigorously define what it is you are going to be interpreting. EBNF is great for this. For a gentle introduction to EBNF, read the first three parts of A Simple Compiler at http://www.semware.com/html/compiler.html (it is written at the high school level and should be easy to digest). Yes, I tried it on my kids first :-)
Once you have defined what it is you want to be interpreting, you are ready to write your interpreter.
Abstractly, your simple interpreter will be divided into a scanner (technically, a lexical analyzer), a parser, and an evaluator. In the simple pure interpreter case, the parser and evaluator will be combined.
Scanners are easy to write, and easy to test, so we won't spend any time on them. See the aforementioned link for info on crafting a simple scanner.
Let's (for example) define your goto statement:
gotostmt -> 'goto' integer
integer -> [0-9]+
This tells us that when we see the token 'goto' (as delivered by the scanner), the only thing that can follow is an integer. And an integer is simply a string of digits.
In pseudo code, we might handle this as so:
(token is the current token, i.e., the element most recently returned by the scanner)

loop
    if token == "goto"
        goto_stmt()
    elseif token == "gosub"
        gosub_stmt()
    elseif token == .....
endloop

proc goto_stmt()
    expect("goto")  -- redundant, but used to skip over goto
    if is_numeric(token)
        -- now, somehow set the instruction pointer at the requested line
    else
        error("expecting a line number, found '%s'\n", token)
    end
end

proc expect(s)
    if s == token
        getsym()
        return true
    end
    error("Expecting '%s', found: '%s'\n", s, token)
end
See how simple it is? Really, the only hard thing to figure out in a simple interpreter is the handling of expressions. A good recipe for handling those is at http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm and, combined with the aforementioned references, you should have enough to handle the sort of expressions you would encounter in BASIC.
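To make that concrete, here is a minimal sketch of the precedence-climbing technique that article describes, applied to plain integer expressions with + - * / and parentheses. It is illustrative only (no variables, comparisons, or error handling), not code taken from the article:

/* Minimal precedence-climbing expression evaluator (sketch only). */
#include <stdio.h>
#include <ctype.h>

static const char *src;                  /* cursor into the expression text */

static int parse_expr(int min_prec);

static void skip_ws(void) { while (*src == ' ') src++; }

/* primary := number | '(' expr ')' */
static int parse_primary(void) {
    skip_ws();
    if (*src == '(') {
        src++;
        int v = parse_expr(0);
        skip_ws();
        if (*src == ')') src++;
        return v;
    }
    int v = 0;
    while (isdigit((unsigned char)*src))
        v = v * 10 + (*src++ - '0');
    return v;
}

static int prec_of(char op) {
    if (op == '+' || op == '-') return 1;
    if (op == '*' || op == '/') return 2;
    return 0;                            /* not a binary operator */
}

static int parse_expr(int min_prec) {
    int lhs = parse_primary();
    for (;;) {
        skip_ws();
        char op = *src;
        int prec = prec_of(op);
        if (prec == 0 || prec < min_prec)
            return lhs;
        src++;
        int rhs = parse_expr(prec + 1);  /* +1 gives left associativity */
        switch (op) {
            case '+': lhs += rhs; break;
            case '-': lhs -= rhs; break;
            case '*': lhs *= rhs; break;
            case '/': lhs /= rhs; break;
        }
    }
}

int main(void) {
    src = "1 + 2 * (3 + 4) - 6 / 2";
    printf("%d\n", parse_expr(0));       /* prints 12 */
    return 0;
}

In a BASIC interpreter the same routine would be called wherever an expression is allowed, with parse_primary extended to handle variables and string constants.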
Ok, time for a concrete example. This is from a larger 'pure interpreter' that handles an enhanced version of Tiny BASIC (but one big enough to run Tiny Star Trek :-) )
/*------------------------------------------------------------------------
  Simple example, pure interpreter, only supports 'goto'
------------------------------------------------------------------------*/
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <string.h>
#include <setjmp.h>
#include <ctype.h>

enum {False = 0, True = 1, Max_Lines = 300, Max_Len = 130};

char *text[Max_Lines + 1];  /* array of program lines */
int textp;                  /* used by scanner - ptr in current line */
char tok[Max_Len + 1];      /* the current token */
int cur_line;               /* the current line number */
int ch;                     /* current character */
int num;                    /* populated if token is an integer */
jmp_buf restart;

int error(const char *fmt, ...) {
    va_list ap;
    char buf[200];

    va_start(ap, fmt);
    vsprintf(buf, fmt, ap);
    va_end(ap);
    printf("%s\n", buf);
    longjmp(restart, 1);
    return 0;
}

int is_eol(void) {
    return ch == '\0' || ch == '\n';
}

void get_ch(void) {
    ch = text[cur_line][textp];
    if (!is_eol())
        textp++;
}

void getsym(void) {
    char *cp = tok;

    while (ch <= ' ') {
        if (is_eol()) {
            *cp = '\0';
            return;
        }
        get_ch();
    }
    if (isalpha(ch)) {
        for (; !is_eol() && isalpha(ch); get_ch()) {
            *cp++ = (char)ch;
        }
        *cp = '\0';
    } else if (isdigit(ch)) {
        for (; !is_eol() && isdigit(ch); get_ch()) {
            *cp++ = (char)ch;
        }
        *cp = '\0';
        num = atoi(tok);
    } else
        error("What? '%c'", ch);
}

void init_getsym(const int n) {
    cur_line = n;
    textp = 0;
    ch = ' ';
    getsym();
}

void skip_to_eol(void) {
    tok[0] = '\0';
    while (!is_eol())
        get_ch();
}

int accept(const char s[]) {
    if (strcmp(tok, s) == 0) {
        getsym();
        return True;
    }
    return False;
}

int expect(const char s[]) {
    return accept(s) ? True : error("Expecting '%s', found: %s", s, tok);
}

int valid_line_num(void) {
    if (num > 0 && num <= Max_Lines)
        return True;
    return error("Line number must be between 1 and %d", Max_Lines);
}

void goto_line(void) {
    if (valid_line_num())
        init_getsym(num);
}

void goto_stmt(void) {
    if (isdigit(tok[0]))
        goto_line();
    else
        error("Expecting line number, found: '%s'", tok);
}

void do_cmd(void) {
    for (;;) {
        while (tok[0] == '\0') {
            if (cur_line == 0 || cur_line >= Max_Lines)
                return;
            init_getsym(cur_line + 1);
        }
        if (accept("bye")) {
            printf("That's all folks!\n");
            exit(0);
        } else if (accept("run")) {
            init_getsym(1);
        } else if (accept("goto")) {
            goto_stmt();
        } else {
            error("Unknown token '%s' at line %d", tok, cur_line);
            return;
        }
    }
}

int main() {
    int i;

    for (i = 0; i <= Max_Lines; i++) {
        text[i] = calloc(sizeof(char), (Max_Len + 1));
    }
    setjmp(restart);
    for (;;) {
        printf("> ");
        while (fgets(text[0], Max_Len, stdin) == NULL)
            ;
        if (text[0][0] != '\0') {
            init_getsym(0);
            if (isdigit(tok[0])) {
                if (valid_line_num())
                    strcpy(text[num], &text[0][textp]);
            } else
                do_cmd();
        }
    }
}
Hopefully, that will be enough to get you started. Have fun!
I will certainly get beaten for saying this... but:
First, I am actually working on a standalone library (as a hobby) that is made of:
a tokenizer, building a linear (flat) list of tokens from the source text, in the same order as they appear in the text (lexemes created from the text flow);
a hand-written parser (syntax analysis; a pseudo-compiler).
There is no "pseudo-code" nor "virtual CPU/machine".
Instructions (such as 'return', 'if', 'for', 'while', ... and arithmetic expressions) are represented by a base C++ struct/class and are the objects themselves. The base object, which I call an atom, has a virtual method called eval(), among other common members, and that eval() is also the "execution/branch" itself. So no matter whether I have an 'if' statement with its possible branches (a single statement or a block of statements/instructions) for the true or false condition, it will be invoked through the base virtual atom::eval()... and so on for everything that is an atom.
Even 'objects' such as variables are atoms. eval() will simply return the variable's value from a variant container held by the atom itself (a pointer referring either to the 'local' variant instance held by the atom, or to another variant held by an atom created in a given block/stack). So atoms are 'in-place' instructions/objects.
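The author's actual C++ classes are not shown here, but the dispatch idea can be sketched in plain C, with a function pointer standing in for the virtual eval() method; every name below is invented purely for illustration:

/* Loose sketch of the "everything is an atom with eval()" idea. */
#include <stdio.h>
#include <stdlib.h>

typedef struct atom atom;
struct atom {
    double (*eval)(const atom *self);   /* plays the role of the virtual eval() */
    double value;                       /* used by literal/variable atoms */
    const atom *lhs, *rhs;              /* used by operator atoms */
};

static double eval_literal(const atom *a) { return a->value; }
static double eval_add(const atom *a) { return a->lhs->eval(a->lhs) + a->rhs->eval(a->rhs); }
static double eval_mul(const atom *a) { return a->lhs->eval(a->lhs) * a->rhs->eval(a->rhs); }

static atom *literal(double v) {
    atom *a = calloc(1, sizeof *a);
    a->eval = eval_literal;
    a->value = v;
    return a;
}

static atom *binary(double (*op)(const atom *), const atom *l, const atom *r) {
    atom *a = calloc(1, sizeof *a);
    a->eval = op;
    a->lhs = l;
    a->rhs = r;
    return a;
}

int main(void) {
    /* Tree for 1 + 2 * 3; the '+' node is the tree-entry atom, as described above. */
    atom *tree = binary(eval_add, literal(1), binary(eval_mul, literal(2), literal(3)));
    printf("%g\n", tree->eval(tree));   /* prints 7 */
    return 0;
}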
As of now, as an example, a chunk of not really meaningful 'code' like the one below just works:

r = 5!;               // 5! : factorial of 5
Response = 1 + 4 - 6 * --r * ((3+5)*(3-4) * 78);
if (Response != 1){   /* '<>' is also accepted as the not-equal op. */
    return r^3;
}
else{
    return 0;
}
Arithmetic expressions are built into a binary expression tree:

A = b+c;   =>

        =
       / \
      A   +
         / \
        b   c
So the 'instruction'/statement for an expression like the one above is the tree-entry atom, which in this case is the '=' (binary) operator.
The tree is built with atom::r0, r1, r2:

atom 'A':
        r0
        |
        A
       / \
      r1  r2
Regarding the 'full-duplex' mechanism between the C++ runtime and the 'script' library, I've made class_adaptor and adaptor<>:
ex.:
template<typename R, typename ...Args> adaptor_t<T,R, Args...>& import_method(const lstring& mname, R (T::*prop)(Args...)) { ... }
template<typename R, typename ...Args> adaptor_t<T,R, Args...>& import_property(const lstring& mname, R (T::*prop)(Args...)) { ... }
Second: I know there are plenty of tools and libs out there such as Lua, boost::bind<*>, QML, JSON, etc. But in my situation, I need to create my very own 'independent' lib for "live scripting". I was scared that my 'interpreter' could take a huge amount of RAM, but I am surprised that it is not as big as using QML, jscript or even Lua :-)
Thank you :-)
Don't bother with hacking a parser together by hand. Use a parser generator. lex + yacc is the classic lexer/parser generator combination, but a Google search will reveal plenty of others.
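For a flavor of what that looks like, here is a tiny, hypothetical yacc grammar for a two-statement BASIC subset with a hand-written yylex() in the epilogue; the token names and the chosen subset are made up for illustration, not taken from any real BASIC implementation:

/* basic.y -- minimal sketch of a yacc grammar for a toy BASIC subset.
 * Build, roughly: yacc basic.y && cc y.tab.c -o basic                */
%{
#include <stdio.h>
#include <ctype.h>
#include <string.h>

int yylex(void);
void yyerror(const char *s);
%}

%token NUMBER GOTO PRINT END

%%

program : /* empty */
        | program line
        ;

line    : NUMBER stmt '\n'      { printf("ok: line %d\n", $1); }
        | '\n'                  /* blank line */
        | error '\n'            { yyerrok; }
        ;

stmt    : GOTO NUMBER           { printf("  goto %d\n", $2); }
        | PRINT NUMBER          { printf("  print %d\n", $2); }
        | END                   { printf("  end\n"); }
        ;

%%

/* A hand-written lexer, just enough for the grammar above. */
int yylex(void) {
    int c = getchar();
    while (c == ' ' || c == '\t') c = getchar();
    if (c == EOF) return 0;
    if (c == '\n') return '\n';
    if (isdigit(c)) {
        int n = 0;
        while (isdigit(c)) { n = n * 10 + (c - '0'); c = getchar(); }
        ungetc(c, stdin);
        yylval = n;
        return NUMBER;
    }
    if (isalpha(c)) {
        char buf[32];
        int i = 0;
        while (isalpha(c) && i < 31) { buf[i++] = toupper(c); c = getchar(); }
        buf[i] = '\0';
        ungetc(c, stdin);
        if (strcmp(buf, "GOTO") == 0) return GOTO;
        if (strcmp(buf, "PRINT") == 0) return PRINT;
        if (strcmp(buf, "END") == 0) return END;
        return buf[0];   /* unknown keyword: let the parser report an error */
    }
    return c;            /* any other single character */
}

void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }

int main(void) { return yyparse(); }

A real grammar would of course grow expression rules, string handling, and evaluation or code-generation actions per statement instead of printf calls.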

how do i decode, change, then re-encode a CORBA IOR file (Visibroker) in my Java client code?

I am writing code to ingest the IOR file generated by the team responsible for the server and use it to bind my client to their object. Sounds easy, right?
For some reason a bit beyond my grasp (having to do with firewalls, DMZs, etc.), the value for the server inside the IOR file is not something we can use. We have to modify it. However, the IOR string is encoded.
What does Visibroker provide that will let me decode the IOR string, change one or more values, then re-encode it and continue on as normal?
I've already looked into IORInterceptors and URL Naming but I don't think either will do the trick.
Thanks in advance!
When you feel like you need to hack an IOR, resist the urge to do so by writing code and whatnot to mangle it to your liking. IORs are meant to be created and dictated by the server that contains the referenced objects, so the moment you start mucking around in there, you're kinda "voiding your warranty".
Instead, spend your time finding the right way to make the IOR usable in your environment by having the server use an alternative hostname when it generates them. Most ORBs offer such a feature. I don't know Visibroker's particular configuration options at all, but a quick Google search revealed this page that shows a promising value:
vbroker.se.iiop_ts.host
Specifies the host name used by this server engine.
The default value, null, means use the host name from the system.
Hope that helps.
A long time ago I wrote IorParser for GNU Classpath; the code is available. It is a normal parser written with awareness of the format, so it should not "void the warranty", I think. An IOR contains multiple tagged profiles that are encapsulated very much like XML, so we can parse/modify the profiles we need and understand and leave the rest untouched.
The profile we need to parse is TAG_INTERNET_IOP. It contains the version number, host, port and object key. Code that reads and writes this profile can be found in the gnu.IOR class. I am sorry this is part of the system library and not a nice piece of code to copy-paste here, but it should not be very difficult to rip it out with a couple of dependent classes.
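If you just want to see what is inside a stringified IOR before deciding how to patch it, a plain hex decode is often enough to spot the embedded host and port by eye. A minimal, self-contained sketch (no ORB API involved; it does not parse the CDR structure, it only undoes the hex encoding):

/* Decode the hex part of an "IOR:..." string and show printable bytes. */
#include <stdio.h>
#include <string.h>
#include <ctype.h>

static int hexval(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

static void dump_ior(const char *ior) {
    if (strncmp(ior, "IOR:", 4) != 0) {
        fprintf(stderr, "not a stringified IOR\n");
        return;
    }
    const char *hex = ior + 4;
    size_t n = strlen(hex) / 2;
    for (size_t i = 0; i < n; i++) {
        int hi = hexval(hex[2 * i]);
        int lo = hexval(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) {
            fprintf(stderr, "invalid octet at position %zu\n", i);
            return;
        }
        int byte = (hi << 4) | lo;
        putchar(isprint(byte) ? byte : '.');   /* host names show up as plain text */
    }
    putchar('\n');
}

int main(int argc, char **argv) {
    if (argc > 1)
        dump_ior(argv[1]);   /* pass the stringified IOR as the first argument */
    return 0;
}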
This question has been asked repeatedly, for example as "CORBA :: Get the client ORB address and port with use of IIOP".
Use the FixIOR tool (binary) from jacORB to patch the address and port of an IOR. Download the binary (unzip it) and run:
fixior <new-address> <new-port> <ior-file>
The tool will overwrite the content of the IOR file with the 'patched' IOR.
You can use IOR Parser to check the resulting IOR and compare it to your original IOR.
Use this function to change the IOR; pass the stringified IOR as the first argument.
#include <cstring>
#include <iostream>
using std::cout;
using std::endl;

void hackIOR(const char* str, char* newIOR)
{
    size_t s = (str ? strlen(str) : 0);
    char temp[1000];
    strcpy(newIOR, "IOR:");
    const char *p = str;
    s = (s - 4) / 2;            // how many octets are there in the string
    p += 4;
    int i;
    for (i = 0; i < (int)s; i++) {
        int j = i * 2;
        char v = 0;
        if (p[j] >= '0' && p[j] <= '9') {
            v = ((p[j] - '0') << 4);
        }
        else if (p[j] >= 'a' && p[j] <= 'f') {
            v = ((p[j] - 'a' + 10) << 4);
        }
        else if (p[j] >= 'A' && p[j] <= 'F') {
            v = ((p[j] - 'A' + 10) << 4);
        }
        else
            cout << "invalid octet" << endl;
        if (p[j+1] >= '0' && p[j+1] <= '9') {
            v += (p[j+1] - '0');
        }
        else if (p[j+1] >= 'a' && p[j+1] <= 'f') {
            v += (p[j+1] - 'a' + 10);
        }
        else if (p[j+1] >= 'A' && p[j+1] <= 'F') {
            v += (p[j+1] - 'A' + 10);
        }
        else
            cout << "invalid octet" << endl;
        temp[i] = v;
    }
    temp[i] = 0;
    // Now temp has the decoded IOR string. Print it.
    // Replace the object ID in temp.
    // Encode it back, with the following code.
    int temp1, temp2;
    int l, k;
    for (k = 0, l = 4; k < s; k++)
    {
        temp1 = temp2 = temp[k];
        temp1 &= 0x0F;
        temp2 = temp2 & 0xF0;
        temp2 = temp2 >> 4;
        if (temp2 >= 0 && temp2 <= 9)
        {
            newIOR[l++] = temp2 + '0';
        }
        else if (temp2 >= 10 && temp2 <= 15)
        {
            newIOR[l++] = temp2 + 'A' - 10;
        }
        if (temp1 >= 0 && temp1 <= 9)
        {
            newIOR[l++] = temp1 + '0';
        }
        else if (temp1 >= 10 && temp1 <= 15)
        {
            newIOR[l++] = temp1 + 'A' - 10;
        }
    }
    newIOR[l] = 0;
    // The new IOR is present in the newIOR output parameter.
}
Hope this works for you.

How to parse template languages in Ragel?

I've been working on a parser for simple template language. I'm using Ragel.
The requirements are modest. I'm trying to find [[tags]] that can be embedded anywhere in the input string.
I'm trying to parse a simple template language, something that can have tags such as {{foo}} embedded within HTML. I tried several approaches to parsing this, but had to resort to using a Ragel scanner with the inefficient approach of matching only a single character as a "catch all". I feel this is the wrong way to go about it: I'm essentially abusing the longest-match bias of the scanner to implement my default rule (it can only be 1 character long, so it should always be the last resort).
%%{
    machine parser;

    action start     { tokstart = p; }
    action on_tag    { results << [:tag, data[tokstart..p]] }
    action on_static { results << [:static, data[p..p]] }

    tag = ('[[' lower+ ']]') >start @on_tag;

    main := |*
        tag;
        any => on_static;
    *|;
}%%
(The actions are written in Ruby, but should be easy to understand.)
How would you go about writing a parser for such a simple language? Is Ragel maybe not the right tool? It seems you have to fight Ragel tooth and nail when the syntax is unpredictable like this.
Ragel works fine. You just need to be careful about what you're matching. Your question uses both [[tag]] and {{tag}}, but your example uses [[tag]], so I figure that's what you're trying to treat as special.
What you want to do is eat text until you hit an open-bracket. If that bracket is followed by another bracket, then it's time to start eating lowercase characters till you hit a close-bracket. Since the text in the tag cannot include any bracket, you know that the only non-error character that can follow that close-bracket is another close-bracket. At that point, you're back where you started.
Well, that's a verbatim description of this machine:
tag = '[[' lower+ ']]';

main := (
    (any - '[')*     # eat text
    ('[' ^'[' | tag) # try to eat a tag
)*;
The tricky part is, where do you call your actions? I don't claim to have the best answer to that, but here's what I came up with:
static char *text_start;

%%{
    machine parser;

    action MarkStart { text_start = fpc; }

    action PrintTextNode {
        int text_len = fpc - text_start;
        if (text_len > 0) {
            printf("TEXT(%.*s)\n", text_len, text_start);
        }
    }

    action PrintTagNode {
        int text_len = fpc - text_start - 1; /* drop closing bracket */
        printf("TAG(%.*s)\n", text_len, text_start);
    }

    tag = '[[' (lower+ >MarkStart) ']]' @PrintTagNode;

    main := (
        (any - '[')* >MarkStart %PrintTextNode
        ('[' ^'[' %PrintTextNode | tag) >MarkStart
    )* $eof(PrintTextNode);
}%%
There are a few non-obvious things:
The eof action is needed because %PrintTextNode is only ever invoked on leaving a machine. If the input ends with normal text, there will be no input to make it leave that state. Because it will also be called when the input ends with a tag, and there is no final, unprinted text node, PrintTextNode tests that it has some text to print.
The %PrintTextNode action nestled in after the ^'[' is needed because, though we marked the start when we hit the [, after we hit a non-[, we'll start trying to parse anything again and remark the start point. We need to flush those two characters before that happens, hence that action invocation.
The full parser follows. I did it in C because that's what I know, but you should be able to turn it into whatever language you need pretty readily:
/* ragel so_tag.rl && gcc so_tag.c -o so_tag */
#include <stdio.h>
#include <string.h>

static char *text_start;

%%{
    machine parser;

    action MarkStart { text_start = fpc; }

    action PrintTextNode {
        int text_len = fpc - text_start;
        if (text_len > 0) {
            printf("TEXT(%.*s)\n", text_len, text_start);
        }
    }

    action PrintTagNode {
        int text_len = fpc - text_start - 1; /* drop closing bracket */
        printf("TAG(%.*s)\n", text_len, text_start);
    }

    tag = '[[' (lower+ >MarkStart) ']]' @PrintTagNode;

    main := (
        (any - '[')* >MarkStart %PrintTextNode
        ('[' ^'[' %PrintTextNode | tag) >MarkStart
    )* $eof(PrintTextNode);
}%%

%% write data;

int
main(void) {
    char buffer[4096];
    int cs;
    char *p = NULL;
    char *pe = NULL;
    char *eof = NULL;

    %% write init;

    do {
        size_t nread = fread(buffer, 1, sizeof(buffer), stdin);
        p = buffer;
        pe = p + nread;
        if (nread < sizeof(buffer) && feof(stdin)) eof = pe;

        %% write exec;

        if (eof || cs == %%{ write error; }%%) break;
    } while (1);
    return 0;
}
Here's some test input:
[[header]]
<html>
<head><title>title</title></head>
<body>
<h1>[[headertext]]</h1>
<p>I am feeling very [[emotion]].</p>
<p>I like brackets: [ is cool. ] is cool. [] are cool. But [[tag]] is special.</p>
</body>
</html>
[[footer]]
And here's the output from the parser:
TAG(header)
TEXT(
<html>
<head><title>title</title></head>
<body>
<h1>)
TAG(headertext)
TEXT(</h1>
<p>I am feeling very )
TAG(emotion)
TEXT(.</p>
<p>I like brackets: )
TEXT([ )
TEXT(is cool. ] is cool. )
TEXT([])
TEXT( are cool. But )
TAG(tag)
TEXT( is special.</p>
</body>
</html>
)
TAG(footer)
TEXT(
)
The final text node contains only the newline at the end of the file.

Lua = operator as print

In Lua, using the = operator without an l-value seems to be equivalent to a print(r-value), here are a few examples run in the Lua standalone interpreter:
> = a
nil
> a = 8
> = a
8
> = 'hello'
hello
> = print
function: 003657C8
And so on...
My question is : where can I find a detailed description of this use for the = operator? How does it work? Is it by implying a special default l-value? I guess the root of my problem is that I have no clue what to type in Google to find info about it :-)
edit:
Thanks for the answers, you are right: it's a feature of the interpreter. Silly question; for some reason I completely overlooked the obvious. I should avoid posting before the morning coffee :-) For completeness, here is the code dealing with this in the interpreter:
while ((status = loadline(L)) != -1) {
    if (status == 0) status = docall(L, 0, 0);
    report(L, status);
    if (status == 0 && lua_gettop(L) > 0) {  /* any result to print? */
        lua_getglobal(L, "print");
        lua_insert(L, 1);
        if (lua_pcall(L, lua_gettop(L)-1, 0, 0) != 0)
            l_message(progname, lua_pushfstring(L,
                "error calling " LUA_QL("print") " (%s)",
                lua_tostring(L, -1)));
    }
}
edit2:
To be really complete, the whole trick about pushing values on the stack is in the "pushline" function:
if (firstline && b[0] == '=')              /* first line starts with `=' ? */
    lua_pushfstring(L, "return %s", b+1);  /* change it to `return' */
Quoting the man page:
In interactive mode ... If a line starts with '=', then lua displays the values of all the expressions in the remainder of the line. The expressions must be separated by commas.
I think that must be a feature of the standalone interpreter. I can't make that work in anything I have compiled Lua into.
I wouldn't call it a feature - the interpreter just prints the result of the statement. That's its job, isn't it?
Assignment isn't an expression that returns something in Lua like it is in C.
