Lua string.format %c versus string.char - lua

Should Lua string.format( "%c", value ) be equivalent to string.char( value )?
It seems not when character value is zero.
string.format( "%c", 0 ):len()
returns 0
string.char( 0 ):len()
returns 1
Even stranger,
string.format( "%c%s", 0, "abc" ):len()
returns 3, whereas any other value for %c (non-zero modulo 256) returns 4. So string.format is not truncating the whole string at the null byte like C sprintf; it is just collapsing the %c field to an empty string instead of a one-byte string. Note that C sprintf writes the zero byte followed by the abc bytes in this case.
I couldn't find anything in the Lua docs describing expected behavior in this case. Most other string handling in Lua seems to treat zero byte as valid string character.
This is on Lua 5.1.4-8 on OpenWrt.
Idiosyncrasy or bug?

I think this is a bug.
In Lua 5.1 and LuaJIT 2.0, string.format formats one item at a time (using the sprintf provided by the host C runtime). It then calls strlen to find the length of the formatted item before appending it to the output string. Since strlen stops at the null character, the zero byte is never counted and is dropped from the output.
This is documented behaviour for %s, but probably unintentional for %c.
This is fixed in Lua 5.2. I wouldn't expect any more updates to 5.1.
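For code that has to run on 5.1, a minimal sketch of a workaround is to build the zero byte with string.char and concatenate it, rather than going through %c. The variable names here are only illustrative.

```lua
-- string.char yields exactly one byte per argument, including zero.
local zero = string.char(0)
local s = zero .. "abc"

print(#zero)       -- 1
print(#s)          -- 4

-- The embedded zero survives ordinary string operations.
print(s:byte(1))   -- 0
print(s:sub(2))    -- abc
```

This avoids string.format for the zero byte entirely, so the result does not depend on which Lua version is installed.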

The book "Programming in Lua", 2nd edition, chapter 2.4, contains the following passage:
"Strings in Lua have the usual meaning: a sequence of characters. Lua is
eight-bit clean and its strings may contain characters with any numeric code,
including embedded zeros. This means that you can store any binary data into
a string."
So this is not a bug.

Related

What's the Lua equivalent of Python's endswith()?

I want to convert this Python code to Lua.
for i in range(1000,9999):
    if str(i).endswith('9'):
        print(i)
I've come this far:
for var=1000,9000 then
if tostring(var).endswith('9') then
print (var)
end
end
but I don't know what the Lua equivalent of endswith() is. I'm writing an nmap script.
This is my first time working with Lua, so please let me know if there are any errors in my current code.
The Python code is not great; you can get the last digit by using the modulo operator %.
# python code using modulo
for i in range(1000,9999):
    if i % 10 == 9:
        print(i)
This also works in Lua. However, Lua includes the last number in the loop, unlike Python.
-- lua code to do this
for i=1000, 9998 do
    if i % 10 == 9 then
        print(i)
    end
end
However, in both languages you could start at 1009 and step by 10 each time:
# python
for i in range(1009, 9999, 10):
    print(i)
-- lua
for i=1009, 9998, 10 do
    print(i)
end
for var = 1000, 9000 do
if string.sub(var, -1) == "9" then
-- do your stuff
end
end
XY-Problem
The X problem of how to best port your code to Lua has been answered by quantumpro already, who optimized it & cleaned it up.
I'll focus on your Y problem:
What's the Lua equivalent of Python endswith?
Calling string functions, OOP-style
In Lua, strings have a metatable that indexes the global string library table. String functions are called using str:func(...) in Lua rather than str.func(...) to pass the string str as first "self" argument (see "Difference between . and : in Lua").
Furthermore, if the argument to the call is a single string, you can omit the parentheses, turning str:func("...") into str:func"...".
Constant suffix: Pattern Matching
Lua provides a more powerful pattern matching function that can be used to check whether a string ends with a suffix: string.match. str.endswith("9") in Python is equivalent to str:match"9$" in Lua: $ anchors the pattern at the end of the string and 9 matches the literal character 9.
Be careful though: This approach doesn't work with arbitrary, possibly variable suffixes since certain characters - such as $ - are magic characters in Lua patterns and thus have a special meaning. Consider str.endswith("."); this is not equivalent to str:match".$" in Lua, since . matches any character.
I'd say that this is the lua-esque way of checking whether a string ends with a constant suffix. Note that it does not return a boolean, but rather a match (the suffix, a truthy value) if successful or nil (a falsey value) if unsuccessful; it can thus safely be used in ifs. To convert the result into a boolean, you could use not not str:match"9$".
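As a small illustration of the points above (the sample values are made up for the sketch), the anchored pattern returns the matched suffix or nil, and a magic character such as . must be escaped with % to be taken literally:

```lua
local s = tostring(1009)

-- A constant suffix: the match itself is returned on success, nil on failure.
local hit = s:match("9$")
local miss = ("1000"):match("9$")
print(hit, miss)                  -- 9    nil

-- "." is magic and matches any character, so ".$" matches every non-empty string.
local any_char = ("abc"):match(".$")
-- "%." matches a literal dot only.
local literal_dot = ("abc"):match("%.$")
print(any_char, literal_dot)      -- c    nil

-- Converting the result to a real boolean.
local as_bool = not not s:match("9$")
print(as_bool)                    -- true
```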
Variable suffix: Rolling your own
Lua's standard library is very minimalistic; as such, you often need to roll your own functions even for basic things. There are two possible implementations for endswith, one using pattern matching and another one using substrings; the latter approach is preferable because it's shorter, possibly faster (Lua uses a naive pattern matching engine) and doesn't have to take care of pattern escaping:
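function string:endswith(suffix)
    -- self:sub(-0) would return the whole string, so treat "" separately
    return suffix == "" or self:sub(-#suffix) == suffix
end
Explanation: self:sub(-#suffix) returns the last #suffix bytes of self, the first argument. This is compared against the suffix. The empty suffix is handled separately because self:sub(-0) returns the whole string rather than an empty one; like Python, the function then reports that every string ends with "".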
You can then call this function using the colon (:) syntax:
str = "prefixsuffix"
assert(str:endswith"suffix")
assert(not str:endswith"prefix")
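For comparison, the pattern-matching variant mentioned above might look roughly like this. It is a sketch, not part of any standard library: endswith_pattern is a hypothetical name, and the escaping relies on the rule that % followed by any punctuation character matches that character literally.

```lua
-- Hypothetical helper: checks a suffix via an end-anchored pattern.
local function endswith_pattern(str, suffix)
    -- Prefix every punctuation character with % so it loses its magic meaning.
    local escaped = suffix:gsub("%p", "%%%0")
    return str:find(escaped .. "$") ~= nil
end

print(endswith_pattern("file.txt", ".txt"))   -- true
print(endswith_pattern("filextxt", ".txt"))   -- false
print(endswith_pattern("cost$", "$"))         -- true
```

The extra escaping step is the main reason the substring version is usually the simpler choice.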

Is there a way to convert an integer to always be a 4-digit hex number using Lua?

I'm creating a Lua script which will calculate a temperature value then format this value as a 4 digit hex number which must always be 4 digits. Having the answer as a string is fine.
Previously in C I have been able to use
data_hex=string.format('%h04x', -21)
which would return ffeb
However, the 'h' string formatter is not available to me in Lua.
Dropping the 'h' doesn't cater for negative answers, i.e.
data_hex=string.format('%04x', -21)
print(data_hex)
which returns ffffffeb
data_hex=string.format('%04x', 21)
print(data_hex)
which returns 0015
Is there a convenient and portable equivalent to the 'h' string formatter?
I suggest you try using a bitwise AND to truncate any leading hex digits for the value being printed.
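If you have a variable temp that you are going to print then you would use something like data_hex=string.format("%04x",temp & 0xffff) which would remove the leading hex digits leaving only the least significant 4 hex digits. Note that the & operator requires Lua 5.3 or later; on older versions, temp % 0x10000 gives the same result for integer values, because Lua's % always returns a non-negative result for a positive divisor.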
I like this approach as there is less string manipulation and it is congruent with the actual data type of a signed 16 bit number. Whether reducing string manipulation is a concern would depend on the rate at which the temperature is polled.
For further information on the format function see The String Library article.
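A minimal sketch of the same truncation that avoids the & operator, so that it should also run on Lua 5.1 and 5.2. The helper name to_hex16 is made up for this example, and rounding to the nearest integer is an assumption about how a fractional temperature should be handled.

```lua
-- Hypothetical helper: format n as exactly four hex digits (16-bit two's complement).
local function to_hex16(n)
    -- Assumption: round to the nearest integer first, since %x needs an integer value.
    local i = math.floor(n + 0.5)
    -- Lua's % is floored, so a negative value wraps into 0..65535.
    return string.format("%04x", i % 0x10000)
end

print(to_hex16(-21))   -- ffeb
print(to_hex16(21))    -- 0015
```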

GSub with a plus/minus character

I am trying to convert a text source into an HTML readable page.
The code I have tried:
local newstr=string.gsub(str,"±", "±")
local newstr=string.gsub(str,"%±", "±")
However, the character shows up as a garbled character in the output.
I can't seem to find any other documentation on how to handle this specific special character. How do I handle this character when reading in so that it will output properly?
Edit: After trying suggestions I'm able to determine this:
local function sanitizeheader(str)
if not(str)then return "" end
str2 = "Depth ±"
local newstr=string.gsub(str2, string.char(177), "±")
return newstr
end
In testing, if I use str2, ± does show up in the output. However, when I use str as it is passed in from reading the Excel file, it doesn't pick up the character and still returns the garbled character.
Lua treats strings as sequences of bytes, and ± is a multi-byte character in UTF-8. The code you are trying should work, as it just replaces one sequence of bytes with another. However, Lua 5.3 has a utf8 library for handling Unicode characters:
local str="±®ª"
for code in str:gmatch(utf8.charpattern) do
print("&#" .. utf8.codepoint(code) .. ";")
end
Output:
&#177;
&#174;
&#170;
Check Lua Reference Manual for more info.
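If the utf8 library is not available, a byte-level sketch along the following lines may help. It assumes the input is either UTF-8 (where ± is the two bytes 194, 177) or a single-byte encoding such as Latin-1 (where ± is the byte 177); the function name and the choice of &plusmn; as the replacement entity are assumptions made for this example.

```lua
-- Hypothetical helper: replace the plus/minus sign with an HTML entity.
local function escape_plus_minus(s)
    -- UTF-8 form first (0xC2 0xB1), so its trailing byte is not handled twice.
    s = s:gsub("\194\177", "&plusmn;")
    -- Then any remaining lone byte 177 (the Latin-1 form).
    s = s:gsub("\177", "&plusmn;")
    return s
end

print(escape_plus_minus("Depth \194\177"))   -- Depth &plusmn;
print(escape_plus_minus("Depth \177"))       -- Depth &plusmn;
```

Printing the input with str:byte(1, -1) is a quick way to see which of the two byte sequences the file actually contains.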

string.sub in Corona Lua crashes with ÅÄÖ

This snippet crashes my simulator badly.
s = "stämma"
s1 = string.sub(s,3,3)
print(s1)
It seems like it handles my character as nil, any ideas?
Joakim
I assume you are using UTF-8 encoding.
In UTF-8, a character can have a variable number of bytes, between 1 and 4. The "ä" character (228) is encoded with the two bytes 0xC3 0xA4.
The instruction string.sub(s, 3, 3) returns the third byte from the string (0xC3), and not the third character. As this byte alone is invalid UTF-8, Corona can't display the character.
See also Extract the first letter of a UTF-8 string with Lua
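One way to index by character rather than by byte is to split the string into UTF-8 sequences first. This is a sketch that assumes valid UTF-8 input without embedded zero bytes; utf8_chars is a hypothetical helper name, and the string is written with byte escapes so the example does not depend on the source file's encoding.

```lua
-- Hypothetical helper: split a UTF-8 string into a table of characters.
local function utf8_chars(s)
    local chars = {}
    -- A lead byte (ASCII or 0xC2..0xF4) followed by any continuation bytes (0x80..0xBF).
    for ch in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
        chars[#chars + 1] = ch
    end
    return chars
end

local s = "st\195\164mma"          -- "stämma" encoded as UTF-8
local chars = utf8_chars(s)

print(#s, #chars)                  -- 7    6
print(s:sub(3, 3) == "\195")       -- true: only the first byte of "ä"
print(chars[3] == "\195\164")      -- true: the complete character
```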

In Erlang how do I convert a String to a binary value?

In Erlang how do I convert a string to a binary value?
String = "Hello"
%% should be
Binary = <<"Hello">>
In Erlang strings are represented as a list of integers. You can therefore use the list_to_binary (built-in-function, aka BIF). Here is an example I ran in the Erlang console (started with erl):
1> list_to_binary("hello world").
<<"hello world">>
Unicode encodings (UTF-8/16/32) need more than one byte to express characters outside the single-byte range.
This is why list_to_binary fails for any list element > 255 (the limit of what a byte can hold, which is sufficient for ISO-8859-1/ASCII/Latin-1).
To handle Unicode characters correctly you need to use
unicode:characters_to_binary() R1[(N>3)]
instead, which can handle both Latin-1 and Unicode encoding.
HTH ...

Resources