hdf5dump of H5T_STRING

I'm trying to figure out how to dump a text block from an HDF5 file (a Bathymetric Attributed Grid, or BAG). Whether I run h5dump -d /BAG_root/metadata H11703_Office_5m.bag or any of the other variations I've tried, I always get the data with each character of the XML quoted separately. Is there an "easy" option to dump the raw data contents to a file or the terminal?
DATASET "/BAG_root/metadata" {
DATATYPE H5T_STRING {
STRSIZE 1;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 5097 ) / ( H5S_UNLIMITED ) }
DATA {
(0): "<", "?", "x", "m", "l", " ", "v", "e", "r", "s", "i", "o", "n", "=",
(14): """, "1", ".", "0", """, "?", ">", "
", "<", "s", "m",
(25): "X", "M", "L", ":", "M", "D", "_", "M", "e", "t", "a", "d", "a",

Marcus Cole emailed me this solution after I brought up the topic on the OpenNavSurf mailing list:
h5dump -b FILE -o H12279_VB_4m_MLLW_1of1.xml -d BAG_root/metadata H12279_VB_4m_MLLW_1of1.bag
This writes out a clean XML file.

Re: Python & BAG, GDAL 1.7.0+ supports the BAG format; e.g.:
from osgeo import gdal
bag = gdal.OpenShared(r"C:\DATA\NGDC\H11555_2m_1.bag")
bagmetadata = bag.GetMetadata("xml:BAG")[0]

The data is stored as an array of 5097 single-character strings (STRSIZE 1). For h5dump to print it as text, it would have to be stored as a real string (e.g. in a scalar dataspace).
So I don't think h5dump's default text output can do it; you will probably have to process the dump with sed or your favorite regexp tool.
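As a rough illustration of that post-processing route, here is a minimal Python sketch. It assumes the dump looks like the excerpt in the question: every element is a one-character string in double quotes, a quote character is printed as """, and nothing else inside the DATA block contains a double quote. The function name join_h5dump_chars and the sample text are made up for this example; other h5dump versions may escape characters differently.

```python
import re

def join_h5dump_chars(dump_text):
    """Rebuild the text from an h5dump listing of one-character strings."""
    # Only look at what follows "DATA {"; the header lines contain quotes too.
    _, _, body = dump_text.partition("DATA {")
    # Each element is a quote, exactly one character (possibly a newline
    # or a quote itself), and a closing quote.
    return "".join(re.findall(r'"(.)"', body, flags=re.DOTALL))

# Hypothetical, shortened dump in the same layout as the question's output
sample = '''DATASET "/BAG_root/metadata" {
DATA {
(0): "<", "?", "x", "m", "l", " ", "v", "=",
(8): """, "1", """, "?", ">"
}
}'''
print(join_h5dump_chars(sample))  # <?xml v="1"?>
```

The scan works left to right and never overlaps, so it stays aligned as long as the separators between elements (commas, line breaks, the "(14):" index prefixes) contain no double quotes.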

Related

gsub doesn't work without spaces

I was trying to make a simple text encryptor, but the script only works if I put a space after every symbol.
Code:
local text = ""
local tdext = text:gsub("%S+", {["+"] = "a", ["×"] = "b", ["÷"] = "c", ["="] = "d", ["/"] = "e", ["_"] = "f", ["€"] = "g", ["¥"] = "h", ["₩"] = "i", ["!"] = "j", ["#"] = "k", ["#"] = "l", ["$"] = "m", ["%"] = "n", ["^"] = "o", ["&"] = "p", ["*"] = "q", ["("] = "r", [")"] = "s", ["-"] = "t", ["'"] = "u", [":"] = "v", [";"] = "w", [","] = "x", ["?"] = "y", ["."] = "z", [" "] = " "})
print(tdext)
I tried fixing it, but it still doesn't do what it should.
If I set the text variable to "÷ =", it outputs "c d", but if I set it to "÷=", it outputs "÷=".

Let's take a closer look at your substitutions:
local subs = {
["+"] = "a", ["×"] = "b", ["÷"] = "c", ["="] = "d", ["/"] = "e",
["_"] = "f", ["€"] = "g", ["¥"] = "h", ["₩"] = "i", ["!"] = "j",
["#"] = "k", ["#"] = "l", ["$"] = "m", ["%"] = "n", ["^"] = "o",
["&"] = "p", ["*"] = "q", ["("] = "r", [")"] = "s", ["-"] = "t",
["'"] = "u", [":"] = "v", [";"] = "w", [","] = "x", ["?"] = "y",
["."] = "z", [" "] = " "
}
local tdext = text:gsub("%S+", subs)
%S+ matches a sequence of one or more non-space bytes, and gsub looks up each whole match in the table. If every symbol stands alone between spaces, whether multi-byte (UTF-8) or single-byte (ASCII), this works fine. However, a run of several characters (say, +-) is matched as one unit, and the replacement isn't performed because the string "+-" is not a key in your lookup table, even though + and - individually are. The same is the case for ÷=: ÷ = works because the characters are separated by a space; ÷= doesn't, because the pattern greedily matches the whole sequence.
If this is supposed to be a character-wise substitution, you'll need to match single characters (UTF-8 sequences, which include ASCII). Lua 5.3 and later provide the "constant" utf8.charpattern, a pattern string that matches exactly one UTF-8 byte sequence. If you have a recent Lua version, the fix becomes trivial: just replace "%S+" with utf8.charpattern:
local tdext = text:gsub(utf8.charpattern, subs)
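In older Lua versions (up to and including 5.2), you'll have to write this pattern yourself, using decimal escapes. %z stands for the zero byte, which a Lua 5.1 pattern cannot contain literally; it must not be used as the start of a range, so the ASCII range begins at \1:
local charpattern = "[%z\1-\127\194-\244][\128-\191]*"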
local tdext = text:gsub(charpattern, subs)
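The difference between looking up whole tokens and looking up single characters can be sketched outside Lua as well. The following Python snippet is only an illustration, not the answer's code: the table is a shortened, hypothetical version of the one in the question, and the two function names are invented. Python strings iterate by code point, so no byte-level pattern is needed here.

```python
import re

# Hypothetical, shortened version of the lookup table from the question
subs = {"+": "a", "×": "b", "÷": "c", "=": "d", "-": "t"}

def substitute_tokens(text):
    # Mirrors gsub("%S+", subs): each run of non-space characters
    # is looked up as a whole; unknown runs are left unchanged.
    return re.sub(r"\S+", lambda m: subs.get(m.group(), m.group()), text)

def substitute_chars(text):
    # Mirrors gsub(utf8.charpattern, subs): every character is
    # looked up on its own; unknown characters are left unchanged.
    return "".join(subs.get(ch, ch) for ch in text)

print(substitute_tokens("÷ ="))  # c d
print(substitute_chars("÷="))    # cd
```

With the token-wise version, "÷=" comes back unchanged because the two-character string is not a key, which is the behaviour the question describes.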
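Alternatively, if you also want to support multi-character substitutions, you can simply apply the substitutions one by one (which is, however, slower by a factor linear in the number of entries in the subs table). Note that pairs iterates in an unspecified order, so this is only safe when no key overlaps another key or a replacement value:
-- Escape every non-alphanumeric character so Lua treats the key as a literal pattern
-- (the outer parentheses discard the match count that gsub also returns)
local function escape_pattern(str)
    return (str:gsub("%W", "%%%0"))
end
-- In a replacement string, only "%" has a special meaning
local function escape_replacement(str)
    return (str:gsub("%%", "%%%%"))
end
local tdext = text
for from, to in pairs(subs) do
    tdext = tdext:gsub(escape_pattern(from), escape_replacement(to))
end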
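A different way to handle multi-character keys is a single pass over the text, which avoids both the per-entry loop and any dependence on iteration order. The Python sketch below is an alternative technique, not what the answer above does: it builds one alternation of all keys, longest first, and replaces each match through the table. The helper name build_substituter and the sample table are hypothetical, and the table is assumed to be non-empty.

```python
import re

def build_substituter(table):
    # Longest keys first, so a multi-character key wins over its own prefix.
    keys = sorted(table, key=len, reverse=True)
    # re.escape makes every key a literal, whatever symbols it contains.
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # Each match is replaced exactly once; output is never re-scanned.
    return lambda text: pattern.sub(lambda m: table[m.group()], text)

# Hypothetical table mixing single- and multi-character keys
encode = build_substituter({"+": "a", "+-": "ab", "-": "t", "=": "d"})
print(encode("+-=+"))  # abda
```

Because replaced text is not scanned again, a replacement value that happens to contain another key is left alone, which the one-by-one loop cannot guarantee.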

regex for matching a string into words but leaving multiple spaces

Here's what I expect. I have a string of numbers that need to be changed into letters (a kind of cipher). A single space separates the numbers, and a triple space represents a space in the output. For example, the string "394 29 44 44   141 6" should be decrypted into "Hell No".
function string.decrypt(self)
local output = ""
for i in self:gmatch("%S+") do
for j, k in pairs(CODE) do
output = output .. (i == j and k or "")
end
end
return output
end
Even though it decrypts the numbers correctly, it doesn't handle the spaces. The string above decrypts into "HellNo" instead of the expected "Hell No". How can I fix this?

You can use
CODE = {["394"] = "H", ["29"] = "e", ["44"] = "l", ["141"] = "N", ["6"] = "o"}
-- Map a captured number to its letter; returning nil keeps the original match
function replace(match)
    return CODE[match]
end
function decrypt(s)
    -- The parentheses drop the substitution count that gsub returns as a second value
    return (s:gsub("(%d+)%s?", replace):gsub("  ", " "))
end
print(decrypt("394 29 44 44   141 6"))
The output is Hell No.
Here, (%d+)%s? in s:gsub("(%d+)%s?", replace) matches and captures one or more digits and then matches, without capturing, one optional whitespace character (%s?). The captured value is passed to the replace function, where it is mapped to the character stored in CODE. A triple space therefore loses one space to the preceding number, and the remaining double space is replaced with a single space by gsub("  ", " ").
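For comparison, the same two-step idea can be sketched in Python with re.sub. This is an illustrative port, not part of the original answer; it assumes, as the question describes, single spaces between numbers and a triple space between words.

```python
import re

CODE = {"394": "H", "29": "e", "44": "l", "141": "N", "6": "o"}

def decrypt(s):
    # Replace each number together with at most one following whitespace
    # character; unknown numbers are left exactly as they were matched.
    decoded = re.sub(r"(\d+)\s?", lambda m: CODE.get(m.group(1), m.group(0)), s)
    # What is left of a triple space is a double space; collapse it to one.
    return decoded.replace("  ", " ")

print(decrypt("394 29 44 44   141 6"))  # Hell No
```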

Wrap add_header_above in Rmd pdf output

Question
How can I wrap the header text inserted by add_header_above()?
There is a simple way to do this for a single-layer header, but it doesn't work when there is a second (or third) layer of headers.
Reproducible example
library(kableExtra)
names(iris) <- c("L", "W", "L", "W", " ")
iris[1:2, ] %>%
kable("latex") %>%
add_header_above(
c(
"Sepal is great" = 2,
"Petal is better, (in fac my favorite)" = 2,
"nc" = 1)
) %>%
column_spec(2:ncol(iris), width = "0.3in")
[Image: current output]
[Image: expected output from the R code (roughly)]

As I said in Best Practice for newline in LaTeX table, if you need newlines inside all kableExtra functions, just use \n. Otherwise, you can try out the linebreak function.
library(kableExtra)
names(iris) <- c("L", "W", "L", "W", " ")
iris[1:2, ] %>%
kable("latex") %>%
add_header_above(
c(
"Sepal\nis great" = 2,
"Petal is better,\n(in fac my favorite)" = 2,
"nc" = 1)
) %>%
column_spec(2:ncol(iris), width = "0.3in")

Lua-error driving me crazy

I keep getting this error and I cannot find it. Please help.
LUA ERROR: Cannot load buffer.
[string "LuaMacros script"]:191: '}' expected (to close '{' at line 85) near '['
Here is the script:
--Start Script
sendToAHK = function (key)
--print('It was assigned string: ' .. key)
local file = io.open("C:\\Users\\TaranWORK\\Documents\\GitHub\\2nd-keyboard-master\\LUAMACROS\\keypressed.txt", "w") -- writing this string to a text file on disk is probably NOT the best method. Feel free to program something better!
--Make sure to substitute the path that leads to your own "keypressed.txt" file, using the double backslashes.
--print("we are inside the text file")
file:write(key)
file:flush() --"flush" means "save"
file:close()
lmc_send_keys('{F24}') -- This presses F24. Using the F24 key to trigger AutoHotKey is probably NOT the best method. Feel free to program something better!
end
local config = { -- this is line 85
[45] = "insert",
[36] = "home",
[33] = "pageup",
[46] = "delete",
[35] = "end",
[34] = "pagedown",
[27] = "escape",
[112] = "F1",
[113] = "F2",
[114] = "F3",
[115] = "F4",
[116] = "F5",
[117] = "F6",
[118] = "F7",
[119] = "F8",
[120] = "F9",
[121] = "F10",
[122] = "F11",
[123] = "F12",
[8] = "backspace",
[220] = "backslash",
[13] = "enter",
[16] = "rShift",
[17] = "rCtrl",
[38] = "up",
[37] = "left",
[40] = "down",
[39] = "right",
[32] = "space",
[186] = "semicolon",
[222] = "singlequote",
[190] = "period",
[191] = "slash",
[188] = "comma",
[219] = "leftbracket",
[221] = "rightbracket",
[189] = "minus",
[187] = "equals",
[96] = "num0",
[97] = "num1",
[98] = "num2",
[99] = "num3",
[100] = "num4",
[101] = "num5",
[102] = "num6",
[103] = "num7",
[104] = "num8",
[105] = "num9",
[106] = "numMult",
[107] = "numPlus",
[108] = "numEnter" --sometimes this is different, check your keyboard
[109] = "numMinus",
[110] = "numDelete",
[111] = "numDiv",
[144] = "numLock", --probably it is best to avoid this key. I keep numlock ON, or it has unexpected effects
[192] = "`", --this is the tilde key just before the number row
[9] = "tab",
[20] = "capslock",
[18] = "alt",
[string.byte('Q')] = "q",
[string.byte('W')] = "w",
[string.byte('E')] = "e",
[string.byte('R')] = "r",
[string.byte('T')] = "t",
[string.byte('Y')] = "y",
[string.byte('U')] = "u",
[string.byte('I')] = "i",
[string.byte('O')] = "o",
[string.byte('P')] = "p",
[string.byte('A')] = "a",
[string.byte('S')] = "s",
[string.byte('D')] = "d",
[string.byte('F')] = "f",
[string.byte('G')] = "g",
[string.byte('H')] = "h",
[string.byte('J')] = "j",
[string.byte('K')] = "k",
[string.byte('L')] = "l",
[string.byte('Z')] = "z",
[string.byte('X')] = "x",
[string.byte('C')] = "c",
[string.byte('V')] = "v",
[string.byte('B')] = "b",
[string.byte('N')] = "n",
[string.byte('M')] = "m",
[string.byte('0')] = "0",
[string.byte('1')] = "1",
[string.byte('2')] = "2",
[string.byte('3')] = "3",
[string.byte('4')] = "4",
[string.byte('5')] = "5",
[string.byte('6')] = "6",
[string.byte('7')] = "7",
[string.byte('8')] = "8",
[string.byte('9')] = "9",
--[255] = "printscreen" --these keys do not work
}
-- define callback for whole device
lmc_set_handler('MACROS', function(button, direction)
--Ignoring upstrokes ensures keystrokes are not registered twice, but activates faster than ignoring downstrokes. It also allows press and hold behaviour
if (direction == 0) then return end -- ignore key upstrokes.
if type(config[button]) == "string" then
print(' ')
print('Your key ID number is: ' .. button)
print('It was assigned string: ' .. config[button])
sendToAHK(config[button])
else
print(' ')
print('Not yet assigned: ' .. button)
end
end)

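There's a comma missing after the string here:
[108] = "numEnter" --sometimes this is different, check your keyboard
The parser only notices the problem when it reaches the '[' of the next entry, which is why the error message points at the following line. With the comma added, the line reads:
[108] = "numEnter", --sometimes this is different, check your keyboard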

Expected any character but end of input found

My input is a recursive structure that looks like this (notice the blank 2nd line):
xxx #{} yyy #{ zzz #{} wwww }
The grammar that I think should read it looks like this:
start = item+
item = thing / space
thing = '#{' item* '}'
space = (!'#' .)+
But what I get is:
Line 2, column 1: Expected "#{", "}", or any character but end of input found.
What am I doing wrong?

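I don't know PEG at all, but a quick look at the docs suggests that the dot in the 4th rule is the problem. The dot matches any character, including '}', so space swallows the closing brace and thing never finds it. The online parser succeeds with: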
start = item+
item = thing / space
thing = '#{' item* '}'
space = [ a-z]+
This produces:
[
[
"x",
"x",
"x",
" "
],
[
"#{",
[],
"}"
],
[
" ",
"y",
"y",
"y",
" "
],
[
"#{",
[
[
" ",
"z",
"z",
"z",
" "
],
[
"#{",
[],
"}"
],
[
" ",
"w",
"w",
"w",
"w",
" "
]
],
"}"
]
]

To make it run, I changed the space rule so that it stops at both '#' and '}':
start = item+
item = thing / space
thing = '#{' item* '}'
space = [^#}]+
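To see why excluding '}' from space is what makes the grammar work, here is a hand-written recursive-descent sketch of the same four rules in Python. It is an illustration under assumptions, not PEG.js output: the function name parse is made up, and each space run is returned as one string rather than as an array of single characters.

```python
def parse(text):
    """start = item+ ; item = thing / space ;
    thing = '#{' item* '}' ; space = [^#}]+"""
    pos = 0

    def parse_items():
        nonlocal pos
        items = []
        while pos < len(text):
            if text.startswith("#{", pos):        # thing
                pos += 2
                inner = parse_items()
                if not text.startswith("}", pos):
                    raise SyntaxError(f"expected '}}' at offset {pos}")
                pos += 1
                items.append(["#{", inner, "}"])
            elif text[pos] not in "#}":           # space
                start = pos
                while pos < len(text) and text[pos] not in "#}":
                    pos += 1
                items.append(text[start:pos])
            else:
                break  # a '}' or a lone '#': leave it to the caller
        return items

    result = parse_items()
    if pos != len(text) or not result:
        raise SyntaxError(f"unexpected input at offset {pos}")
    return result

print(parse("xxx #{} yyy #{ zzz #{} wwww }\n\n"))
```

If the space branch also accepted '}', the inner loop would run past the closing brace to the end of the input, and the check for '}' would then fail at end of input, which matches the error reported in the question.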
