What does the "*all" format in Lua file:read() mean? - lua

I maintain some old code written in Lua, and there is a snippet I could not understand:
local f = io.open("someFile.lua", "r");
local szFileContent = "return {};";
if f then
szFileContent = f:read("*all");
f:close();
end
The format passed to the read function looks odd. The Lua 5.1 manual (https://www.lua.org/manual/5.1/manual.html#pdf-file:read) documents the formats *a and *l, but not *all.

In the read function in liolib.c, only the first two characters ('*' and 'a') are checked; the rest of the string is ignored, so "*all" behaves exactly like "*a":
// ...
const char *p = lua_tostring(L, n);
luaL_argcheck(L, p && p[0] == '*', n, "invalid option");
switch (p[1]) {
case 'n': /* number */
success = read_number(L, f);
break;
case 'l': /* line */
success = read_line(L, f);
break;
case 'a': /* file */
read_chars(L, f, ~((size_t)0)); /* read MAX_SIZE_T chars */
success = 1; /* always success */
break;
default:
return luaL_argerror(L, n, "invalid format");
//...
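The dispatch above can be isolated in a small standalone C function to show which format strings are accepted. This is only a sketch: read_option is a hypothetical name and is not part of liolib.c.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the dispatch shown above: returns the
   option letter that would be acted on, or 0 if the format is rejected. */
char read_option(const char *p)
{
    if (p == NULL || p[0] != '*')
        return 0;               /* corresponds to "invalid option" */
    switch (p[1]) {             /* nothing after the second character is examined */
    case 'n':                   /* number */
    case 'l':                   /* line */
    case 'a':                   /* whole file */
        return p[1];
    default:
        return 0;               /* corresponds to "invalid format" */
    }
}
```

Under this logic "*a", "*all" and even "*anything" all select the same branch, while "all" (no leading '*') is rejected.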

Related

How do I pass a "C" string from a "C" routine to a GO function (and convert it to a GO string?)

This must be something really silly and basic, but the cgo docs (and google fu) have left me stranded. Here's what I am trying to do: I want a GO function to call a "C" function using 'import "C"'. Said "C" function needs to store the address of a "C" string (malloc or constant - neither has worked for me) into an argument passed to it as *C.char. The GO function then needs to convert this to a GO string. It actually does work, except I get this:
panic: runtime error: cgo argument has Go pointer to Go pointer
If I run with GODEBUG=cgocheck=0, it all works fine. If I leave as default:
strptr = 4e1cbf ('this is a C string!')
main: yylex returned token 1
yylval.tstrptr 4e1cbf
stringval token "this is a C string!"
The problematic line seems to be:
yylval.stringval = C.GoString(yylval.tstrptr)
What little I could find about C.GoString left me with the impression that it allocates a GO string and fills it in from the "C" string provided, but that seems not to be the case; otherwise, why am I getting a complaint about 'Go pointer to Go pointer'? I've tried a number of other approaches, like having the "C" function malloc the buffer and the GO function do C.free() on it. Nothing has worked (where worked == avoiding this runtime panic).
The GO source:
package main
import (
"fmt"
"unsafe"
)
// #include <stdio.h>
// int yylex (void * foo, void *tp);
import "C"
type foo_t struct {
i int32
s string
}
var foo foo_t
func main() {
var retval int
var s string
var tp *C.char
for i := 0; i < 2; i++ {
retval = int(C.yylex(unsafe.Pointer(&foo), unsafe.Pointer(&tp)))
fmt.Printf("main: yylex returned %d\n", retval)
fmt.Printf("tp = %x\n", tp)
if retval == 0 {
s = C.GoString(tp)
fmt.Printf("foo.i = %d s = %q\n", foo.i, s)
} else {
foo.s = C.GoString(tp)
fmt.Printf("foo.i = %d foo.s = %q\n", foo.i, foo.s)
}
}
}
The "C" source
#include <stdio.h>
int yylex (int * foo, char ** tp)
{
static int num;
*foo = 666;
*tp = "this is a C string!";
printf ("strptr = %x ('%s')\n", *tp, *tp);
return (num++);
}
What's interesting is that if the GO func stores into foo.s first, the 2nd call to yylex bombs with the panic. If I do s and then foo.s (depending on whether I check retval as 0 or non-zero), it doesn't fail, but I'm guessing that is because the GO function exits right away and there are no subsequent calls to yylex.

Parsing an integer and HEX value in Ragel

I am trying to design a parser using Ragel with C++ as the host language.
There is a particular case where a parameter can be defined in two formats:
a. Integer : eg. SignalValue = 24
b. Hexadecimal : eg. SignalValue = 0x18
I have the below code to parse such a parameter :
INT = ((digit+)$incr_Count) %get_int >!(int_error); #[0-9]
HEX = (([0].'x'.[0-9A-F]+)$incr_Count) %get_hex >!(hex_error); #[hexadecimal]
SIGNAL_VAL = ( INT | HEX ) %/getSignalValue;
However, with the parser defined above, only integer values (format a) are recognized and parsed correctly.
If a hexadecimal number (e.g. 0x24) is provided, the number is stored as '0'. No error action is called for the hexadecimal number: the parser recognizes it, but the value stored is '0'.
I seem to be missing some minor detail with Ragel. Has anyone faced a similar situation?
The remaining part of the code:
//Global
int lInt = -1;
action incr_Count {
iGenrlCount++;
}
action get_int {
int channel = 0xFF;
std::stringstream str;
while(iGenrlCount > 0)
{
str << *(p - iGenrlCount);
iGenrlCount--;
}
str >> lInt; //push the values
str.clear();
}
action get_hex {
std::stringstream str;
while(iGenrlCount > 0)
{
str << std::hex << *(p - iGenrlCount);
iGenrlCount--;
}
str >> lInt; //push the values
}
action getSignalValue {
cout << "lInt = " << lInt << endl;
}
It's not a problem with your FSM (which looks fine for the task you have), it's more of a C++ coding issue. Try this implementation of get_hex():
action get_hex {
std::stringstream str;
cout << "get_hex()" << endl;
while(iGenrlCount > 0)
{
str << *(p - iGenrlCount);
iGenrlCount--;
}
str >> std::hex >> lInt; //push the values
}
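Notice that it uses str purely as a string buffer and applies std::hex to the >> extraction from the std::stringstream into the int. In the original, std::hex was applied to the << insertion of each character, where it has no effect on how the text is later parsed. So in the end you get: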
$ ./a.out 245
lInt = 245
$ ./a.out 0x245
lInt = 581
Which probably is what you want.
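The same idea, choosing the base when the text is converted rather than when it is buffered, can be sketched in plain C with strtol. The name parse_signal_value is hypothetical, and the sketch assumes the matched text has already been copied into a NUL-terminated string.

```c
#include <stdlib.h>

/* Hypothetical helper: convert "24" or "0x18" to a long.
   Returns 0 on success, -1 if the text is not a complete number.
   Overflow (errno == ERANGE) is deliberately not checked in this sketch. */
int parse_signal_value(const char *text, long *out)
{
    char *end = NULL;
    /* Pick base 16 only when the text starts with "0x" or "0X". */
    int base = (text[0] == '0' && (text[1] == 'x' || text[1] == 'X')) ? 16 : 10;
    long value = strtol(text, &end, base);  /* base 16 accepts the 0x prefix */

    if (end == text || *end != '\0')
        return -1;                          /* nothing parsed, or trailing junk */
    *out = value;
    return 0;
}
```

With this helper, "245" yields 245 and "0x245" yields 581, matching the output shown above.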

How to read H5T_STRING from hdf5 using C

So I have an HDF5 file which contains a dataset:
DATASET "updateDateTime" {DATATYPE H5T_STRING{
STRSIZE 24;
STRPAD H5T_STR_NULLPAD;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE{ (5) / (5) }
DATA{
(0) : "2015-05-12\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
(1) : "2015-05-13\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
(2) : "2015-05-14\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
(3) : "2015-05-15\000\000\000\000\000\000\000\000\000\000\000\000\000\000",
(4) : "2015-05-16\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
}
I want to read this dataset using C, but I can't find a proper example (I am new to HDF5). Specifically, I can't figure out which H5T_NATIVE_* type to use when reading. Here is the code I have right now:
hid_t time_ds = H5Dopen(grp, "updateDateTime", H5P_DEFAULT);
auto time_shape = get_dataset_shape(time_ds);
char** time_str = (char **)malloc(time_shape[0] * sizeof(char *)); // TODO: memory allocation correct??
status = H5Dread(time_ds, H5T_NATIVE_CHAR, H5S_ALL, H5S_ALL, H5P_DEFAULT,
time_str);
/*do my stuff*/
free(time_str);
status = H5Dclose(time_ds);
Try
char* time_str = (char*) malloc(time_shape[0] * sizeof(char));
status = H5Dread(time_ds, H5T_NATIVE_CHAR, H5S_ALL, H5S_ALL, H5P_DEFAULT, &time_str);
After digging into the source code of h5dump (the tool that ships with the HDF5 package), I finally got it to work. I can't say this is a good solution, but I hope it helps other people who encounter a similar problem.
It turns out the native type can be determined with this function:
hid_t h5tools_get_native_type(hid_t type)
{
hid_t p_type;
H5T_class_t type_class;
type_class = H5Tget_class(type);
if (type_class == H5T_BITFIELD)
p_type = H5Tcopy(type);
else
p_type = H5Tget_native_type(type, H5T_DIR_DEFAULT);
return(p_type);
}
Then, read the dataset like this:
type = H5Dget_type(dset);
native_type = h5tools_get_native_type(type);
auto shape = get_dataset_shape(dset);
n_element = std::accumulate(shape.begin(), shape.end(), 1ull, std::multiplies<size_t>());
type_size = std::max(H5Tget_size(type), H5Tget_size(native_type));
size_t alloc_size = n_element * type_size;
char * buf = BAT_NEW char[alloc_size];
status = H5Dread(dset, native_type, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
/*do my stuff*/
H5Tclose(native_type);
H5Tclose(type);
delete[] buf;
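After H5Dread fills buf, each element occupies type_size consecutive bytes. With STRPAD H5T_STR_NULLPAD an element that uses the full width has no terminating NUL, so it is safer to copy each one out before treating it as a C string. A minimal sketch in plain C; copy_fixed_string is a hypothetical helper, not part of the HDF5 API.

```c
#include <string.h>

/* Hypothetical helper: copy the idx-th fixed-width element of buf into out
   and NUL-terminate it. out must have room for width + 1 bytes. */
void copy_fixed_string(const char *buf, size_t width, size_t idx, char *out)
{
    memcpy(out, buf + idx * width, width);  /* element idx starts at idx * width */
    out[width] = '\0';                      /* guarantee termination */
}
```

For the dataset above, width would be the 24-byte string size and idx would run from 0 to 4.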
Alternatively, you could read the dataset (of datatype H5T_STRING) using HDFql in C like this:
hdfql_execute("SELECT FROM updateDateTime");
hdfql_cursor_first(NULL);
printf("Dataset value is %s\n", hdfql_cursor_get_char(NULL));
In case the dataset stores more than one string (which seems to be your case by looking at the result of h5dump posted above), you can retrieve these by looping the result set:
hdfql_execute("SELECT FROM updateDateTime");
while(hdfql_cursor_next(NULL) == HDFQL_SUCCESS)
{
printf("Dataset value is %s\n", hdfql_cursor_get_char(NULL));
}

Nom Parser To Unescape String

I'm writing a Nom parser for RCS. RCS Files tend to be ISO-8859-1 encoded. One of the grammar productions is for a String. This is #-delimited and literal # symbols are escaped as ##.
#A String# -> A String
#A ## String# -> A # String
I have a working function (see end). IResult is from Nom, you either return the parsed thing, plus the rest of the unparsed input, or an Error/Incomplete. Cow is used to return a reference built on the original input slice if no unescaping was required, or an owned string if it was.
Are there any built in Nom macros that could have helped with this parse?
#[macro_use]
extern crate nom;
use std::str;
use std::borrow::Cow;
use nom::*;
/// Parse an RCS String
fn string<'a>(input: &'a[u8]) -> IResult<&'a[u8], Cow<'a, str>> {
let len = input.len();
if len < 1 {
return IResult::Incomplete(Needed::Unknown);
}
if input[0] != b'#' {
return IResult::Error(Err::Code(ErrorKind::Custom(0)));
}
// start of current chunk. Chunk is a piece of unescaped input
let mut start = 1;
// current char index in input
let mut i = start;
// FIXME only need to allocate if input turned out to need unescaping
let mut s: String = String::new();
// Was the input escaped?
let mut escaped = false;
while i < len {
// Check for end delimiter
if input[i] == b'#' {
// if there's another # then it is an escape sequence
if i + 1 < len && input[i + 1] == b'#' {
// escaped #
i += 1; // want to include the first # in the output
s.push_str(str::from_utf8(&input[start .. i]).unwrap());
start = i + 1;
escaped = true;
} else {
// end of string
let result = if escaped {
s.push_str(str::from_utf8(&input[start .. i]).unwrap());
Cow::Owned(s)
} else {
Cow::Borrowed(str::from_utf8(&input[1 .. i]).unwrap())
};
return IResult::Done(&input[i + 1 ..], result);
}
}
i += 1;
}
IResult::Incomplete(Needed::Unknown)
}
It looks like the way to use the nom library is using the macro combinators. A quick browse of the source code gives some nice examples of parsers, including parsing of strings with escape characters. This is what I came up with:
#[macro_use]
extern crate nom;
use nom::*;
named!(string< Vec<u8> >, delimited!(
tag!("#"),
fold_many0!(
alt!(
is_not!(b"#") |
map!(
complete!(tag!("##")),
|_| &b"#"[..]
)
),
Vec::new(),
|mut acc: Vec<u8>, bytes: &[u8]| {
acc.extend(bytes);
acc
}
),
tag!("#")
));
#[test]
fn it_works() {
assert_eq!(string(b"#string#"), IResult::Done(&b""[..], b"string".to_vec()));
assert_eq!(string(b"#string with ## escapes#"), IResult::Done(&b""[..], b"string with # escapes".to_vec()));
assert_eq!(string(b"#invalid string"), IResult::Incomplete(Needed::Size(16)));
}
As you can see, I simply copy the bytes into a vector using Vec::extend - you could be more sophisticated here and return a Cow byte slice if you wanted.
The escaped! macro does not appear to be of use in this case unfortunately, as it can't seem to work when the terminator is the same as the escape character (which is actually a pretty common case).
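For comparison, the unescaping rule itself (independent of nom) can be sketched in a few lines of C. The function name unescape_hash_string is hypothetical, and the sketch assumes a NUL-terminated input buffer.

```c
#include <stddef.h>

/* Hypothetical helper: unescape a '#'-delimited string in which "##" stands
   for a literal '#'. Writes the NUL-terminated result to out, which needs at
   least strlen(in) bytes. Returns the number of input bytes consumed, or 0
   if the input is not a complete, well-formed string. */
size_t unescape_hash_string(const char *in, char *out)
{
    size_t i = 1, o = 0;

    if (in[0] != '#')
        return 0;                       /* missing opening delimiter */
    while (in[i] != '\0') {
        if (in[i] == '#') {
            if (in[i + 1] != '#') {     /* single '#': closing delimiter */
                out[o] = '\0';
                return i + 1;
            }
            i++;                        /* skip the first '#' of the pair */
        }
        out[o++] = in[i++];
    }
    return 0;                           /* input ended before the closing '#' */
}
```

As in the nom version, the ambiguity between the terminator and the escape is resolved by looking one byte ahead.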

Ask user for path for fopen in C?

This is my function. It's working absolutely fine; I just can't get one more thing working.
Instead of the static fopen paths, I need the user to write the path for the files. I tried several things but I can't get it working. Please help
int FileToFile() {
FILE *fp;
FILE *fp_write;
char line[128];
int max=0;
int countFor=0;
int countWhile=0;
int countDo = 0;
fp = fopen("d:\\text.txt", "r+");
fp_write = fopen("d:\\results.txt", "w+");
if (!fp) {
perror("Greshka");
}
else {
while (fgets(line, sizeof line, fp) != NULL) {
countFor = 0;
countWhile = 0;
countDo = 0;
fputs(line, stdout);
size_t len = strlen(line);
if (len > 0 && line[len - 1] == '\n') len--; /* do not count the trailing newline */
if (max < (int)len) max = (int)len;
char *tmp = line;
while (tmp = strstr(tmp, "for")){
countFor++;
tmp++;
}
tmp = line;
while (tmp = strstr(tmp, "while")){
countWhile++;
tmp++;
}
tmp = line;
while (tmp = strstr(tmp, "do")){
countDo++;
tmp++;
}
fprintf(fp_write, "Na tozi red operatora for go ima: %d pyti\n", countFor);
fprintf(fp_write, "Na tozi red operatora for/while go ima: %d pyti\n", countWhile - countDo);
fprintf(fp_write, "Na tozi red operatora do go ima: %d pyti\n", countDo);
}
fprintf(fp_write, "Maximalen broi simvoli e:%d\n", max);
fclose(fp_write);
fclose(fp);
}
}
Have a look at argc and argv. They are used for command-line arguments passed to a program. This requires that your main function be revised as follows:
int main(int argc, char *argv[])
argc is an integer that holds the number of command-line arguments, and argv is an array of char* holding the arguments themselves. Note that for both, the program name itself counts as an argument.
So if you invoke your program like this:
myprog c:\temp
Then argc will be 2, argv[0] will be myprog, and argv[1] will be c:\temp. Now you can just pass the strings to your function. If you pass more arguments, they will be argv[2], etc.
Keep in mind if your path contains spaces, you must enclose it in double quotes for it to be considered one argument, because space is used as a delimiter:
myprog "c:\path with spaces"
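A minimal sketch of how the two paths could be taken from argv before calling fopen. The helper name paths_from_args and the choice of argv[1] for the input file and argv[2] for the output file are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: take the input and output paths from the command
   line (argv[1] and argv[2]). Returns 0 on success, -1 if either is missing. */
int paths_from_args(int argc, char *argv[],
                    const char **in_path, const char **out_path)
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s <input file> <output file>\n",
                argc > 0 ? argv[0] : "myprog");
        return -1;
    }
    *in_path = argv[1];
    *out_path = argv[2];
    return 0;
}
```

In main, the two resulting strings would then replace the hard-coded "d:\\text.txt" and "d:\\results.txt" in the fopen calls, for example by passing them to FileToFile as parameters.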
