String literal support in LuaJIT's C parser

On the official LuaJIT website one can read the following:
The C parser complies to the C99 language standard plus the following extensions:
- The '\e' escape in character and string literals.
This should mean that LuaJIT's C parser supports string literals!
Let's try:
ffi.cdef[[static const char * s = "foo";]]
ffi.cdef[[static const char s[] = "foo";]]
ffi.cdef[[static const char s[4] = "foo";]]
Unfortunately, none of the above works: the error "invalid C type" is raised on each ffi.cdef invocation.
Can you provide an example of how to feed a string literal to the C parser integrated into LuaJIT?
P.S.
I should mention that I can successfully declare int and char variables the same way, but not strings.
ffi.cdef[[static const int k = 42;]]
print(ffi.C.k) --> 42
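For what it's worth, a common workaround (my assumption, not something stated in the LuaJIT docs): keep the string data on the Lua side, since cdef parses declarations and scalar constant initializers but apparently not string initializers. A minimal LuaJIT sketch:

```lua
local ffi = require("ffi")

-- cdef accepts the declaration but not a string initializer;
-- build the char array on the Lua side instead:
local s = ffi.new("const char[4]", "foo")
print(ffi.string(s)) --> foo
```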

Related

Flex: How to define a term to be the first one at the beginning of a line (exclusively)

I need some help regarding a problem I face in my flex code.
My task: to write a Flex specification that recognizes the declaration part of a programming language, described below.
Consider a programming language PL. Its variable-definition part is described as follows:
The block starts with the keyword "var", written exactly once. After this keyword we write one or more variable names separated by commas ",", then a colon ":", and then the variable type (say real, boolean, integer, or char in my example), followed by a semicolon ";". After that, further declarations of the same shape (variable names separated by commas, a colon, a variable type, a semicolon) may follow on new lines, but the "var" keyword must not be repeated at the beginning of those lines (it is written only once!).
E.g.
var number_of_attendants, sum: integer;
ticket_price: real;
symbols: char;
Concretely, I do not know how to enforce that each and every declaration part starts only with the 'var' keyword. Currently, if I begin a declaration part by directly declaring a variable, say x (without having written "var" at the beginning of the line), no error occurs, which is not what I want.
My current flex code below:
%{
#include <stdio.h>
%}
VAR_DEFINER "var"
VAR_NAME [a-zA-Z][a-zA-Z0-9_]*
VAR_TYPE "real"|"boolean"|"integer"|"char"
SUBEXPRESSION [{VAR_NAME}[","{VAR_NAME}]*":"[ \t\n]*{VAR_TYPE}";"]+
EXPRESSION {VAR_DEFINER}{SUBEXPRESSION}
%%
^{EXPRESSION} {
printf("This is not a well-syntaxed expression!\n");
return 0;
}
{EXPRESSION} printf("This is a well-syntaxed expression!\n");
";"[ \t\n]*{VAR_DEFINER} {
printf("The keyword 'var' is defined once at the beginning of a new line. You can not use it again\n");
return 0;
}
{VAR_DEFINER} printf("A keyword: %s\n", yytext);
^{VAR_DEFINER} printf("Each and every declaration part must start with the 'var' keyword.\n");
{VAR_TYPE}";" printf("The variable type is: %s\n", yytext);
{VAR_NAME} printf("A variable name: %s\n", yytext);
","/[ \t\n]*{VAR_NAME} /* eat up commas */
":"/[ \t\n]*{VAR_TYPE}";" /* eat up single colon */
[ \t\n]+ /* eat up whitespace */
. {
printf("Unrecognized character: %s\n", yytext);
return 0;
}
%%
int main(int argc, char **argv)
{
++argv, --argc;
if (argc > 0)
yyin = fopen(argv[0],"r");
else
yyin = stdin;
yylex();
}
I hope I have made this as clear as possible.
I am looking forward to reading your answers!
You seem to be trying to do too much in the scanner. Do you really have to do everything in Flex? In other words, is this an exercise to learn advanced use of Flex, or is it a problem that may be solved using more appropriate tools?
I've read that the first Fortran compiler took 18 staff-years to create, back in the 1950s. Today, "a substantial compiler can be implemented even as a student project in a one-semester compiler design course", as the Dragon Book from 1986 says. One of the main reasons for this increased efficiency is that we have learned how to divide the compiler into modules that can be constructed separately. The first two such parts, or phases, of a typical compiler are the scanner and the parser.
The scanner, or lexical analyzer, can be generated by Flex from a specification file, or constructed otherwise. Its job is to read the input, which consists of a sequence of characters, and split it into a sequence of tokens. A token is the smallest meaningful part of the input language, such as a semicolon, the keyword var, the identifier number_of_attendants, or the operator <=. You should not use the scanner to do more than that.
Here is how I would write a simplified Flex specification for your tokens:
[ \t\n] { /* Ignore all whitespace */ }
var { return VAR; }
real { return REAL; }
boolean { return BOOLEAN; }
integer { return INTEGER; }
char { return CHAR; }
[a-zA-Z][a-zA-Z0-9_]* { return VAR_NAME; }
. { return yytext[0]; }
The sequence of tokens is then passed on to the parser, or syntactical analyzer. The parser compares the token sequence with the grammar for the language. For example, the input var number_of_attendants, sum : integer; consists of the keyword var, a comma-separated list of variables, a colon, a data type keyword, and a semicolon. If I understand what your input is supposed to look like, perhaps this grammar would be correct:
program : VAR typedecls ;
typedecls : typedecl | typedecls typedecl ;
typedecl : varlist ':' var_type ';' ;
varlist : VAR_NAME | varlist ',' VAR_NAME ;
var_type : REAL | BOOLEAN | INTEGER | CHAR ;
This grammar happens to be written in a format that Bison, a parser-generator that often is used together with Flex, can understand.
If you separate your solution into a lexical part, using Flex, and a grammar part, using Bison, your life is likely to be much simpler and happier.
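To make that division concrete, the grammar above could be dropped into a minimal Bison file along the following lines (the token declarations and the surrounding boilerplate are my assumptions, not part of the answer):

```yacc
%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }
%}

%token VAR REAL BOOLEAN INTEGER CHAR VAR_NAME

%%
program   : VAR typedecls ;
typedecls : typedecl | typedecls typedecl ;
typedecl  : varlist ':' var_type ';' ;
varlist   : VAR_NAME | varlist ',' VAR_NAME ;
var_type  : REAL | BOOLEAN | INTEGER | CHAR ;
%%
```

The Flex rules would then return these token names, which Bison declares in the header it generates when run with the -d option.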

F# Literal/constant can be composed with strings, but not int?

Why is this OK:
let [<Literal>] hi = "hi"
let [<Literal>] bye = "bye"
let [<Literal>] shortMeeting = hi + bye
...but this is not?
let [<Literal>] me = 1
let [<Literal>] you = 1
let [<Literal>] we = me + you
The third line gives the error:
This is not a valid constant expression
What's up with that?
So the spec / docs are a little unclear, but provide hints.
From the spec (for F# 3.0):
A value that has the Literal attribute is subject to the following restrictions:
- It may not be marked mutable or inline.
- It may not also have the ThreadStatic or ContextStatic attributes.
- The right-hand side expression must be a literal constant expression that is made up of either:
  - a simple constant expression, with the exception of (), native integer literals, unsigned native integer literals, byte array literals, BigInteger literals, and user-defined numeric literals; OR
  - a reference to another literal.
This seems to suggest that even the combination of strings isn't allowed.
The documentation states that this changed in F# 3.1:
https://msdn.microsoft.com/en-us/library/dd233193.aspx
As of F# 3.1, you can use the + sign to combine string literals. You
can also use the bitwise or (|||) operator to combine enum flags. For
example, the following code is legal in F# 3.1: [the code example from the documentation is not reproduced here]
Note that integer addition is not on that list.

How to print %lld in Lua 5.3

string.format (formatstring, ···)
Returns a formatted version of its variable number of arguments following the description given in its first argument (which must be a string). The format string follows the same rules as the ISO C function sprintf. The only differences are that the options/modifiers *, h, L, l, n, and p are not supported and that there is an extra option, q.
Lua 5.3's string.format doesn't support %lld, so how can I print a long long value in Lua 5.3?
Short answer: use %d.
In C sprintf, %lld is used to format a long long type, which is an integer type at least 64 bit.
In Lua 5.3, the type number has two internal representations, integer and float. Integer representation is 64-bit in standard Lua. You can use %d to print it no matter its internal representation:
print(string.format("%d", 2^62))
Output: 4611686018427387904
In the Lua source file luaconf.h, you can see that Lua converts %d to the appropriate format:
#define LUA_INTEGER_FMT "%" LUA_INTEGER_FRMLEN "d"
and LUA_INTEGER_FRMLEN is defined as "", "l", or "ll", depending on which internal representation is used for integers:
#if defined(LLONG_MAX) /* { */
/* use ISO C99 stuff */
#define LUA_INTEGER long long
#define LUA_INTEGER_FRMLEN "ll"
//...

Evaluating constant expressions in clang tools

I'm writing a Clang tool and I'm trying to figure out how to evaluate a string literal given access to the program's AST. Given the following program:
class DHolder {
public:
DHolder(std::string s) {}
};
DHolder x("foo");
I have the following code in the Clang tool:
const CXXConstructExpr *ctor = ... // constructs `x` above
const Expr *expr = ctor->getArg(0); // the "foo" expression
???
How can I get from the Expr representing the "foo" string literal to an actual C++ string in my tool? I've tried to do something like:
// From ExprConstant.cpp
Evaluate(result, info, expr);
but I don't know how to initialize the result and info parameters.
Any clues?
I realize this is an old question, but I ran into this a moment ago when I could not use stringLiteral() to bind to any arguments (the code is not C++11). For example, I have a CXXMemberCallExpr:
addProperty(object, char*, char*, ...); // has 7 arguments, N=[0,6]
The AST dump shows that ahead of the StringLiteral is a CXXBindTemporaryExpr. So in order for my memberCallExpr query to bind using hasArgument(N,expr()), I wrapped my query with bindTemporaryExpr() (shown here on separate lines for readability):
memberCallExpr(
hasArgument(6, bindTemporaryExpr(
hasDescendant(stringLiteral().bind("argument"))
)
)
)
The proper way to do this is to use the AST matchers to match the string literal and bind a name to it so it can be later referenced, like this:
StatementMatcher m =
constructExpr(hasArgument(0, stringLiteral().bind("myLiteral"))).bind("myCtor");
and then in the match callback do this:
const CXXConstructExpr *ctor =
    result.Nodes.getNodeAs<CXXConstructExpr>("myCtor");
const StringLiteral *literal =
    result.Nodes.getNodeAs<StringLiteral>("myLiteral");
The literal can then be accessed through
literal->getString().str();

What does this return parameter mean?

+(const char /*wchar_t*/ *)wcharFromString:(NSString *)string
{
return [string cStringUsingEncoding:NSUTF8StringEncoding];
}
Does it return char or wchar_t?
From the method name, it looks like it should return wchar_t, but then why is wchar_t commented out in the return type?
Source is here:
How to convert wchar_t to NSString?
That code just looks incorrect. They're claiming it does one thing, but it actually does another. The return type is const char *.
This method is not correct. It returns a const char *, encoded as a UTF8 string. That is a perfectly sensible way of getting a C string from an NSString, but nowhere here is anyone actually doing anything with wchar_ts.
wchar_t is a "wide char", and a pointer to it would be a "wide string" (represented by const wchar_t *). These are designed to precisely represent larger character sets, and can be two-byte wide character strings; they use a whole different variant set of string manipulation functions to do things with them. (Strings like this are very rarely seen in iOS development, for what it's worth.)
