Read binary data as Lua number with FFI - lua

I have a file I opened as binary like this: local dem = io.open("testdem.dem", "rb")
I can read strings out of it just fine: print(dem:read(8)) -> HL2DEMO. Afterwards, however, there is a 4-byte little-endian integer and a 4-byte float (the docs for the file format don't specify the float's endianness, but since they don't say little-endian as they do for the integer, I'll have to assume big-endian).
Neither value can be read out as a number with read.
I am new to the LuaJIT FFI and not sure how to read these. Frankly, I find the documentation on this specific aspect of the FFI underwhelming, although I'm just a Lua programmer and don't have much experience with C. One thing I have tried is creating a cdata, but I don't think I understand it:
local ffi = require("ffi")
local dem = io.open("testdem.dem", "rb")
print(dem:read(8))
local cd = ffi.new("int", 4)
ffi.copy(cd, dem:read(4), 4)
print(tostring(cd))
--[[Output
HL2DEMO
luajit: bad argument #1 to 'copy' (cannot convert 'int' to 'void *')
]]--
Summary:
Goal: read integers and floats from binary data.
Expected output: a Lua integer or float I can then convert to a string.

string.unpack does this for Lua 5.3, but there are some alternatives for LuaJIT as well. For example, see this answer (and other answers to the same question).
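For LuaJIT specifically, one approach is to cast the bytes of the string returned by read into a typed pointer and dereference it. A minimal sketch, assuming a little-endian host (e.g. x86) and that both values in the file are little-endian:

```lua
local ffi = require("ffi")

-- Interpret the first 4 bytes of a string as a little-endian int32.
local function read_int32(s)
  local p = ffi.cast("const char *", s)  -- Lua strings convert to const char *
  return tonumber(ffi.cast("const int32_t *", p)[0])
end

-- Interpret the first 4 bytes of a string as a float.
local function read_float(s)
  local p = ffi.cast("const char *", s)
  return tonumber(ffi.cast("const float *", p)[0])
end

-- Usage against the demo file:
local dem = io.open("testdem.dem", "rb")
if dem then
  print(dem:read(8))              -- "HL2DEMO\0"
  print(read_int32(dem:read(4)))  -- the 4-byte integer
  print(read_float(dem:read(4)))  -- the 4-byte float
  dem:close()
end
```

If the float turns out to be big-endian, the four bytes would need to be reversed before the cast.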

Related

Segmentation fault while using ffi to convert lua strings to C strings

I met a strange problem when trying to convert a Lua string to a C char array.
local str = "1234567890abcdef"
local ffi = require "ffi"
ffi.cdef[[
int printf(const char *fmt, ...);
]]
print(#str)
print(str)
local cstr = ffi.new("unsigned char[?]", #str, str)
Running this code gives:
[root#origin ~]# luajit test.lua
16
1234567890abcdef
Segmentation fault
I know ffi.new("unsigned char[?]", #str + 1, str) will solve this, but I don't know why.
I don't think it is a \0 problem, because I found some strange points:
if str is not 16 bytes, this does not happen;
if I delete the ffi.cdef, which I didn't even use, this does not happen;
if I put the ffi.cdef after the ffi.new, this does not happen.
[root#origin ~]# luajit test.lua
17
1234567890abcdefg
// the result after appending a single 'g' to `str`
I tried LuaJIT 2.0.5 and LuaJIT 2.1.0-beta3 with default compiler arguments.
Does anyone know why this happens? Thanks.
It is exactly because the copy is 17 bytes (the 16-byte string plus its terminating NUL) while the array was allocated with only 16 bytes. https://github.com/LuaJIT/LuaJIT/blob/0c0e7b168ea147866835954267c151ef789f64fb/src/lj_cconv.c#L582 is the code that copies the string into the resulting array. As you can see, if the target is an array and its size is smaller than the string length, the string is truncated; however, your type is a VLA (variable-length array), and for a VLA the size is not specified (it is 2^32-1, actually, which is way bigger than 17).
"I don't get an error if..." is not an argument here - you are stomping on memory that is used for something else. Sometimes stomping an extra byte with 0 is not fatal (e.g. due to alignment that byte wasn't used anyway, or it just happened to be 0 already) or doesn't result in a hard crash - that doesn't make it correct.

What does 16#4000000 mean in Erlang?

I'm reading ejabberd source, specifically ejabberd_http.erl.
In the code below,
...
case (State#state.sockmod):recv(State#state.socket,
min(Len, 16#4000000), 300000)
of
{ok, Data} ->
recv_data(State, Len - byte_size(Data), <<Acc/binary, Data/binary>>);
...
What does 16#4000000 mean?
I've tested this in the Erlang shell.
%%erlang shell
...
7>16#4000000.
67108864
8>is_integer(16#4000000).
true
I know it's just an integer value.
Is there any advantage to writing 16#4000000 instead of 67108864?
In Erlang, the number before the # is the integer's base. In your example, 16#4000000 is the hexadecimal (base-16) representation of 67108864. In other languages it is often written as 0x4000000.
One reason for using the hex representation is that each digit represents exactly 4 bits; for example, 16#F is 15 in decimal, or 1111 in binary. When working with binary data, base 16 makes bit patterns much easier for a human reader to see.

Read first bytes of lrange results using Lua scripting

I want to read and filter data from a list in Redis. I want to inspect the first 4 bytes (an int32) of the data in each blob and compare it to an int32 I will pass in via ARGV.
I have a script started, but how can I check the first 4 bytes?
local updates = redis.call('LRANGE', KEYS[1], 0, -1)
local ret = {}
for i=1,#updates do
-- read int32 header
-- if header > ARGV[1]
ret[#ret+1] = updates[i]
end
return ret
Also, I see there is a limited set of libraries: http://redis.io/commands/EVAL#available-libraries
EDIT: After some more poking around, I'm running into issues due to how Lua stores numbers - ARGV[1] is an 8-byte string and cannot safely be converted into a 64-bit number. I think this is because Lua stores everything as doubles, which only have a 52-bit mantissa (53 bits of integer precision).
EDIT: I'm accepting the answer below, but changing the question to int32. The int64 part of the problem I put into another question: Comparing signed 64 bit number using 32 bit bitwise operations in Lua
The Redis Lua interpreter loads the struct library, so try:
if struct.unpack("I4", updates[i]) > tonumber(ARGV[1]) then

Lua true Binary I/O

I read this question, and I checked it myself.
I use the following snippet:
f = io.open("file.file", "wb")
f:write(1.34)
f:close()
This creates the file, into which 1.34 is written as text. That is the same as: 00110001 00101110 00110011 00110100, i.e. the ASCII codes for the digit 1, the decimal point, then 3 and finally 4.
However, I would like it to write 00111111 10101100 11001100 11001101, which is the true (IEEE-754 single-precision) float representation. How do I do that?
You may need to convert it to binary representation, using something similar to this answer. This discussion on serialization of lua numbers may also be useful.
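With LuaJIT, one way is to take the raw bytes of a C float and write those. A sketch (on Lua 5.3+, string.pack("f", x) achieves the same thing); the bytes come out in host order, little-endian on x86:

```lua
local ffi = require("ffi")

-- Return the 4 raw IEEE-754 bytes of x as a string (host byte order).
local function float_bytes(x)
  local buf = ffi.new("float[1]", x)
  return ffi.string(buf, 4)
end

local f = io.open("file.file", "wb")
f:write(float_bytes(1.34))  -- writes 0xCD 0xCC 0xAC 0x3F on x86
f:close()
```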

Is there a built in routine to convert lightuserdata to int?

Lightuserdata is different from userdata, so what can I do with it? I mean, what operations does lightuserdata support in Lua? It looks like I cannot convert it to any other data type.
One of my case:
My C library returns a C pointer named 'c_pointer' (i.e. a lightuserdata) to Lua, and then I want:
my_pointer = c_pointer + 4
and then to pass 'my_pointer' back to the C library. Since I cannot do anything with 'c_pointer', the expression 'c_pointer + 4' is invalid.
I am wondering are there some practical solutions to this?
Lightuserdata are created by C libraries. They are simply C pointers.
For example, you can use them to refer to data you allocate with malloc, or statically allocate in your module. Your C library can transfer these pointers to the Lua side as a lightuserdata using lua_pushlightuserdata, and later Lua can give it back to your library (or other C code) on the stack. Lua code can use the lightuserdata as any other value, storing it in a table, for example, even as a table key.
ADDENDUM
To answer your revised question: if you want to add an offset to the pointer, do it on the C side. Pass the lightuserdata and the integer offset to C, and let C apply the offset using ptr[n]:
void *ptr = lua_touserdata(L, idx1);
lua_Integer n = lua_tointeger(L, idx2);
// do something with
((char *)ptr)[n];
Plain Lua has no pointer arithmetic, so as Doug Currie stated you would need to do the pointer arithmetic on the C side.
LuaJIT on the other hand can do pointer arithmetic (via the FFI library), so consider using that instead.
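A sketch of the LuaJIT route; the buffer here is a stand-in for the pointer your C library would actually return:

```lua
local ffi = require("ffi")

-- Stand-in for the lightuserdata/pointer coming from a C library.
local buf = ffi.new("unsigned char[16]")
local c_pointer = ffi.cast("void *", buf)

-- void * has no arithmetic; cast to char * so + 4 advances 4 bytes.
local my_pointer = ffi.cast("char *", c_pointer) + 4
```

my_pointer can then be passed to any FFI-declared C function that expects a pointer argument.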
