hdf5dump of H5T_STRING - hdf5

I'm trying to figure out how to dump a text block from an HDF5 file (a Bathymetric Attributed Grid / BAG). With h5dump -d /BAG_root/metadata H11703_Office_5m.bag, and with everything else I've tried, I always get the data with each character of the XML quoted separately. Is there an "easy" option to dump the raw data contents to a file or the terminal?
DATASET "/BAG_root/metadata" {
DATATYPE H5T_STRING {
STRSIZE 1;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 5097 ) / ( H5S_UNLIMITED ) }
DATA {
(0): "<", "?", "x", "m", "l", " ", "v", "e", "r", "s", "i", "o", "n", "=",
(14): """, "1", ".", "0", """, "?", ">", "
", "<", "s", "m",
(25): "X", "M", "L", ":", "M", "D", "_", "M", "e", "t", "a", "d", "a",

Marcus Cole emailed me this solution after I brought up the topic on the OpenNavSurf mailing list:
h5dump -b FILE -o H12279_VB_4m_MLLW_1of1.xml -d BAG_root/metadata H12279_VB_4m_MLLW_1of1.bag
This writes out a clean XML file.

Re: Python & BAG, GDAL 1.7.0+ supports the BAG format; e.g.:
from osgeo import gdal
bag = gdal.OpenShared(r"C:\DATA\NGDC\H11555_2m_1.bag")
bagmetadata = bag.GetMetadata("xml:BAG")[0]

The data is stored as an array of 5097 single-character strings (STRSIZE 1). To be dumped as text, it would have had to be stored as a real string (e.g. in a scalar dataspace).
So I think you cannot do it with h5dump alone; you probably have to process the dump with sed or your favorite regexp tool.
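If post-processing is the route taken, a minimal sketch in Python might look like the following. It assumes the DATA block has the shape shown in the question (each element a double-quoted single character, optionally backslash-escaped). The function name join_dumped_chars and the escape table are illustrative and not part of any HDF5 tool.

```python
import re

# One dumped element: a double-quoted single character, optionally backslash-escaped.
# DOTALL lets "." match a raw newline that was dumped between the quotes.
_ELEMENT = re.compile(r'"(\\.|.)"', re.DOTALL)

# Assumed escape sequences; extend this table if your dump uses others.
_ESCAPES = {"\\n": "\n", "\\t": "\t", '\\"': '"', "\\\\": "\\"}


def join_dumped_chars(data_block):
    """Concatenate the quoted one-character elements of an h5dump DATA block."""
    chars = []
    for token in _ELEMENT.findall(data_block):
        # Known escapes are translated; anything else keeps its last character.
        chars.append(_ESCAPES.get(token, token[-1]))
    return "".join(chars)


# The first dumped lines from the question, index prefixes included.
sample = '''(0): "<", "?", "x", "m", "l", " ", "v", "e", "r", "s", "i", "o", "n", "=",
(14): """, "1", ".", "0", """, "?", ">", "
", "<", "s", "m",'''

print(join_dumped_chars(sample))
```

The index prefixes such as (14): contain no quotes, so the scan skips them without special handling.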

Related

gsub doesn't work without spaces

I was trying to make a simple text encryptor, but the script only works if I put a space after every symbol.
code:
local text = ""
local tdext = text:gsub("%S+", {["+"] = "a", ["×"] = "b", ["÷"] = "c", ["="] = "d", ["/"] = "e", ["_"] = "f", ["€"] = "g", ["¥"] = "h", ["₩"] = "i", ["!"] = "j", ["#"] = "k", ["#"] = "l", ["$"] = "m", ["%"] = "n", ["^"] = "o", ["&"] = "p", ["*"] = "q", ["("] = "r", [")"] = "s", ["-"] = "t", ["'"] = "u", [":"] = "v", [";"] = "w", [","] = "x", ["?"] = "y", ["."] = "z", [" "] = " "})
print(tdext)
I tried fixing it, but it doesn't do what it should.
If I put "÷ =" in the text variable, it outputs "c d", but if I put "÷=" in the variable, it outputs "÷=".
Let's take a closer look at your substitutions:
local subs = {
["+"] = "a", ["×"] = "b", ["÷"] = "c", ["="] = "d", ["/"] = "e",
["_"] = "f", ["€"] = "g", ["¥"] = "h", ["₩"] = "i", ["!"] = "j",
["#"] = "k", ["#"] = "l", ["$"] = "m", ["%"] = "n", ["^"] = "o",
["&"] = "p", ["*"] = "q", ["("] = "r", [")"] = "s", ["-"] = "t",
["'"] = "u", [":"] = "v", [";"] = "w", [","] = "x", ["?"] = "y",
["."] = "z", [" "] = " "
}
local tdext = text:gsub("%S+", subs)
%S+ matches a sequence of one or more non-space bytes, and the whole matched sequence is used as the lookup key. If each sequence is a single character - multi-byte (UTF-8) or single-byte (ASCII) - this works fine. However, if you have a sequence of multiple characters (say, +-), no replacement is performed, since the key +- isn't found in your lookup table and gsub then keeps the original match. The same is the case for the multi-byte ÷=: ÷ = works, because your characters are separated by spaces; ÷= doesn't, because the pattern greedily matches the whole sequence.
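The difference between looking up whole runs and looking up single characters can be reproduced outside Lua. The Python sketch below is only an analogy: re.sub with a lookup function stands in for gsub with a table, the function names are made up for illustration, and the table is a shortened copy of the one in the question.

```python
import re

# Shortened copy of the substitution table from the question.
subs = {"+": "a", "×": "b", "÷": "c", "=": "d", "-": "t"}


def replace_runs(text):
    # Analogue of gsub("%S+", subs): each whole run of non-space characters is the key.
    return re.sub(r"\S+", lambda m: subs.get(m.group(0), m.group(0)), text)


def replace_chars(text):
    # Character-wise lookup: every single character is its own match and key.
    return re.sub(r".", lambda m: subs.get(m.group(0), m.group(0)), text, flags=re.DOTALL)


print(replace_runs("÷ ="))   # c d
print(replace_runs("÷="))    # ÷=  (the key "÷=" is not in the table)
print(replace_chars("÷="))   # cd
```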
If this is supposed to be a character-wise substitution, you'll need to match characters (UTF-8 sequences, which includes ASCII). Lua 5.3 and later provide the "constant" utf8.charpattern, a pattern string that matches exactly one UTF-8 byte sequence. If you have a recent Lua version, the fix becomes trivial: just replace "%S+" with utf8.charpattern:
local tdext = text:gsub(utf8.charpattern, subs)
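In older Lua versions (up to and including 5.2), you'll have to write this pattern yourself, using decimal escapes. %z stands for the zero byte, which a Lua 5.1 pattern cannot contain literally; it is listed on its own because the interaction between ranges and character classes inside a set is not defined:
local charpattern = "[%z\1-\127\194-\244][\128-\191]*"
local tdext = text:gsub(charpattern, subs)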
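To see what such a pattern does at the byte level, here is a small Python sketch that applies the same byte ranges to UTF-8 encoded data. The names CHARPATTERN and utf8_chars are illustrative, and the input is assumed to be valid UTF-8.

```python
import re

# Byte-level counterpart of the pattern above: one lead byte (ASCII or 0xC2-0xF4)
# followed by any number of continuation bytes (0x80-0xBF).
CHARPATTERN = re.compile(rb"[\x00-\x7F\xC2-\xF4][\x80-\xBF]*")


def utf8_chars(data):
    """Split valid UTF-8 bytes into a list of single-character strings."""
    return [m.group(0).decode("utf-8") for m in CHARPATTERN.finditer(data)]


print(utf8_chars("÷=€".encode("utf-8")))  # ['÷', '=', '€']
```

Here ÷ occupies two bytes, = one, and € three, yet each comes back as a single match.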
Alternatively, if you also want to support multi-character substitutions, you can simply apply the substitutions one by one (which is, however, less efficient by a factor linear in the number of entries in the subs table):
-- Escape every non-alphanumeric byte so that Lua treats the string as a literal pattern
local function escape_pattern(str)
  return (str:gsub("%W", "%%%0")) -- the parentheses drop gsub's second return value (the match count)
end
-- In a replacement string only "%" is special
local function escape_replacement(str)
  return (str:gsub("%%", "%%%%"))
end
local tdext = text
for from, to in pairs(subs) do
  tdext = tdext:gsub(escape_pattern(from), escape_replacement(to))
end
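For comparison, the one-by-one approach might be sketched in Python as follows. str.replace works on literal strings, so no escaping is needed there. Trying longer keys first is an assumption added here so that a multi-character key is not broken up by a shorter one; the function name substitute_all and the key "+-" are hypothetical.

```python
def substitute_all(text, subs):
    """Apply every substitution in turn, longest keys first."""
    for old in sorted(subs, key=len, reverse=True):
        text = text.replace(old, subs[old])
    return text


# "+-" is a made-up multi-character key used only for this illustration.
print(substitute_all("+-+", {"+": "a", "-": "t", "+-": "x"}))  # xa
```

This only behaves predictably as long as no replacement value contains one of the keys, which holds for the table in the question (symbols in, letters out).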

regex for matching a string into words but leaving multiple spaces

Here's what I expect. I have a string of numbers that need to be changed into letters (a kind of cipher); a single space separates the numbers of consecutive letters, and a triple space represents a space in the output. For example, the string "394 29 44 44   141 6" should be decrypted into "Hell No".
function string.decrypt(self)
  local output = ""
  for i in self:gmatch("%S+") do
    for j, k in pairs(CODE) do
      output = output .. (i == j and k or "")
    end
  end
  return output
end
Even though it decrypts the numbers correctly, it doesn't work with the spaces. So the string I used above decrypts into "HellNo" instead of the expected "Hell No". How can I fix this?
You can use
CODE = {["394"] = "H", ["29"] = "e", ["44"] = "l", ["141"] = "N", ["6"] = "o"}
function replace(match)
  local ret = nil
  for i, v in pairs(CODE) do
    if i == match then
      ret = v
    end
  end
  return ret
end
function decrypt(s)
  return s:gsub("(%d+)%s?", replace):gsub("  ", " ")
end
print (decrypt("394 29 44 44   141 6"))
Output will contain Hell No (print also shows the substitution count that gsub returns as a second value). See the Lua demo online.
Here, (%d+)%s? in s:gsub("(%d+)%s?", replace) matches and captures one or more digits and then matches an optional whitespace character (%s?). The captured value is passed to the replace function, where it is mapped to the character in CODE. Since %s? consumes only one of the three spaces that separate the words, two spaces remain, and gsub("  ", " ") then replaces every double space with a single one.
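The same two-step idea can be sketched in Python, shown here only as a cross-check of the logic and not as part of the Lua answer. The pattern (\d+)\s? consumes a number and at most one following space, so a triple space leaves two spaces behind, which the second step collapses into one.

```python
import re

CODE = {"394": "H", "29": "e", "44": "l", "141": "N", "6": "o"}


def decrypt(s):
    # Map each number to its letter; unknown numbers are left as they are.
    decoded = re.sub(r"(\d+)\s?", lambda m: CODE.get(m.group(1), m.group(0)), s)
    # Two spaces are what is left of a triple-space word separator.
    return decoded.replace("  ", " ")


print(decrypt("394 29 44 44   141 6"))  # Hell No
```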

Wrap add_header_above in Rmd pdf output

Question
How do I wrap the header added by add_header_above()?
There is a simple way to do it for a single-layer header, but it doesn't work when there is a second (or third) header layer.
Reproducible example
library(kableExtra)
names(iris) <- c("L", "W", "L", "W", " ")
iris[1:2, ] %>%
kable("latex") %>%
add_header_above(
c(
"Sepal is great" = 2,
"Petal is better, (in fac my favorite)" = 2,
"nc" = 1)
) %>%
column_spec(2:ncol(iris), width = "0.3in")
Current output: [image]
Expected output from R code (roughly): [image]
As I said in Best Practice for newline in LaTeX table, if you need newlines inside any kableExtra function, just use \n. Otherwise, you can try out the linebreak function.
library(kableExtra)
names(iris) <- c("L", "W", "L", "W", " ")
iris[1:2, ] %>%
kable("latex") %>%
add_header_above(
c(
"Sepal\nis great" = 2,
"Petal is better,\n(in fac my favorite)" = 2,
"nc" = 1)
) %>%
column_spec(2:ncol(iris), width = "0.3in")

Lua-error driving me crazy

I keep getting this error and I cannot find it. Please help.
LUA ERROR: Cannot load buffer.
[string "LuaMacros script"]:191: '}' expected (to close '{' at line 85) near '['
Here is the script:
--Start Script
sendToAHK = function (key)
--print('It was assigned string: ' .. key)
local file = io.open("C:\\Users\\TaranWORK\\Documents\\GitHub\\2nd-keyboard-master\\LUAMACROS\\keypressed.txt", "w") -- writing this string to a text file on disk is probably NOT the best method. Feel free to program something better!
--Make sure to substitute the path that leads to your own "keypressed.txt" file, using the double backslashes.
--print("we are inside the text file")
file:write(key)
file:flush() --"flush" means "save"
file:close()
lmc_send_keys('{F24}') -- This presses F24. Using the F24 key to trigger AutoHotKey is probably NOT the best method. Feel free to program something better!
end
local config = { -- this is line 85
[45] = "insert",
[36] = "home",
[33] = "pageup",
[46] = "delete",
[35] = "end",
[34] = "pagedown",
[27] = "escape",
[112] = "F1",
[113] = "F2",
[114] = "F3",
[115] = "F4",
[116] = "F5",
[117] = "F6",
[118] = "F7",
[119] = "F8",
[120] = "F9",
[121] = "F10",
[122] = "F11",
[123] = "F12",
[8] = "backspace",
[220] = "backslash",
[13] = "enter",
[16] = "rShift",
[17] = "rCtrl",
[38] = "up",
[37] = "left",
[40] = "down",
[39] = "right",
[32] = "space",
[186] = "semicolon",
[222] = "singlequote",
[190] = "period",
[191] = "slash",
[188] = "comma",
[219] = "leftbracket",
[221] = "rightbracket",
[189] = "minus",
[187] = "equals",
[96] = "num0",
[97] = "num1",
[98] = "num2",
[99] = "num3",
[100] = "num4",
[101] = "num5",
[102] = "num6",
[103] = "num7",
[104] = "num8",
[105] = "num9",
[106] = "numMult",
[107] = "numPlus",
[108] = "numEnter" --sometimes this is different, check your keyboard
[109] = "numMinus",
[110] = "numDelete",
[111] = "numDiv",
[144] = "numLock", --probably it is best to avoid this key. I keep numlock ON, or it has unexpected effects
[192] = "`", --this is the tilde key just before the number row
[9] = "tab",
[20] = "capslock",
[18] = "alt",
[string.byte('Q')] = "q",
[string.byte('W')] = "w",
[string.byte('E')] = "e",
[string.byte('R')] = "r",
[string.byte('T')] = "t",
[string.byte('Y')] = "y",
[string.byte('U')] = "u",
[string.byte('I')] = "i",
[string.byte('O')] = "o",
[string.byte('P')] = "p",
[string.byte('A')] = "a",
[string.byte('S')] = "s",
[string.byte('D')] = "d",
[string.byte('F')] = "f",
[string.byte('G')] = "g",
[string.byte('H')] = "h",
[string.byte('J')] = "j",
[string.byte('K')] = "k",
[string.byte('L')] = "l",
[string.byte('Z')] = "z",
[string.byte('X')] = "x",
[string.byte('C')] = "c",
[string.byte('V')] = "v",
[string.byte('B')] = "b",
[string.byte('N')] = "n",
[string.byte('M')] = "m",
[string.byte('0')] = "0",
[string.byte('1')] = "1",
[string.byte('2')] = "2",
[string.byte('3')] = "3",
[string.byte('4')] = "4",
[string.byte('5')] = "5",
[string.byte('6')] = "6",
[string.byte('7')] = "7",
[string.byte('8')] = "8",
[string.byte('9')] = "9",
--[255] = "printscreen" --these keys do not work
}
-- define callback for whole device
lmc_set_handler('MACROS', function(button, direction)
--Ignoring upstrokes ensures keystrokes are not registered twice, but activates faster than ignoring downstrokes. It also allows press and hold behaviour
if (direction == 0) then return end -- ignore key upstrokes.
if type(config[button]) == "string" then
print(' ')
print('Your key ID number is: ' .. button)
print('It was assigned string: ' .. config[button])
sendToAHK(config[button])
else
print(' ')
print('Not yet assigned: ' .. button)
end
end)
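There's a comma missing after the string here:
[108] = "numEnter" --sometimes this is different, check your keyboard
The parser only notices the problem when it reaches the [ that starts the next entry, which is why the error points at line 191, the line after the one that needs the comma.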
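In a table this long the missing separator is hard to spot by eye. As a rough aid, the following Python sketch flags entry lines that are directly followed by another entry but do not end in a comma. The helper name find_missing_commas and its line-based heuristics are assumptions for illustration. It is not a Lua parser: it only handles one [key] = value entry per line and assumes that -- never occurs inside a string.

```python
import re

# A line of the form: [key] = value   (optionally followed by "," or ";")
ENTRY = re.compile(r"^\s*\[.+?\]\s*=\s*\S.*$")


def strip_comment(line):
    # Naive: cuts at the first "--", so it breaks if "--" appears inside a string.
    return line.split("--", 1)[0].rstrip()


def find_missing_commas(lua_source):
    """Return 1-based line numbers of entries that lack a trailing separator."""
    lines = [strip_comment(line) for line in lua_source.splitlines()]
    missing = []
    for i in range(len(lines) - 1):
        current, following = lines[i], lines[i + 1]
        if ENTRY.match(current) and ENTRY.match(following) and not current.endswith((",", ";")):
            missing.append(i + 1)
    return missing


snippet = '''[107] = "numPlus",
[108] = "numEnter" --sometimes this is different, check your keyboard
[109] = "numMinus",
'''
print(find_missing_commas(snippet))  # [2]
```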

Expected any character but end of input found

My input is a recursive structure that looks like this (notice the blank 2nd line):
xxx #{} yyy #{ zzz #{} wwww }
The grammar that I think should read it looks like this:
start = item+
item = thing / space
thing = '#{' item* '}'
space = (!'#' .)+
But what I get is:
Line 2, column 1: Expected "#{", "}", or any character but end of input found.
What am I doing wrong?
I do not know PEG at all, but a quick look at the docs suggests that the dot in the 4th rule is the problem. The online parser succeeds with:
start = item+
item = thing / space
thing = '#{' item* '}'
space = [ a-z]+
This produces:
[
[
"x",
"x",
"x",
" "
],
[
"#{",
[],
"}"
],
[
" ",
"y",
"y",
"y",
" "
],
[
"#{",
[
[
" ",
"z",
"z",
"z",
" "
],
[
"#{",
[],
"}"
],
[
" ",
"w",
"w",
"w",
"w",
" "
]
],
"}"
]
]
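To make it run, I changed the space rule so that it stops at } as well as at #. The original (!'#' .)+ consumes everything up to the next #, including the closing }, so thing never finds its '}' and the parser runs into the end of the input:
start = item+
item = thing / space
thing = '#{' item* '}'
space = [^#}]+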
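To make the behaviour of that last grammar concrete, here is a hand-written recursive-descent sketch of the same four rules in Python. It is an illustration under assumptions and not output of any PEG tool: the function names are made up, and space returns the matched text as one string instead of a list of single characters.

```python
def parse(text):
    """Parse text with: start = item+ ; item = thing / space ;
    thing = '#{' item* '}' ; space = [^#}]+"""
    pos = 0

    def parse_space():
        nonlocal pos
        start = pos
        # Stop at "#" and at "}" so the closing brace stays available for thing.
        while pos < len(text) and text[pos] not in "#}":
            pos += 1
        return text[start:pos] if pos > start else None

    def parse_thing():
        nonlocal pos
        if not text.startswith("#{", pos):
            return None
        saved = pos
        pos += 2
        items = parse_items()
        if pos < len(text) and text[pos] == "}":
            pos += 1
            return ["#{", items, "}"]
        pos = saved  # backtrack, as a PEG would
        return None

    def parse_items():
        items = []
        while True:
            item = parse_thing()
            if item is None:
                item = parse_space()
            if item is None:
                return items
            items.append(item)

    result = parse_items()
    if not result or pos != len(text):
        raise ValueError("parse error at offset %d" % pos)
    return result


print(parse("xxx #{} yyy #{ zzz #{} wwww }\n"))
```

With space stopping at }, the trailing newline of the input is simply parsed as one more space item, so the blank second line no longer causes an error.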
