I have a nested JSONB hash that I need to display in date order.
The hash is stored like so:
hash =
{"residential_la"=>
{"current"=>{"periods"=>{"1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0", "7/2023"=>"0", "8/2023"=>"0", "9/2023"=>"0", "10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0"}},
"original"=>{"periods"=>{"1/2023"=>"901", "2/2023"=>"1315", "3/2023"=>"4377", "4/2023"=>"1815", "5/2023"=>"1835", "6/2023"=>"896", "7/2023"=>"1996", "8/2023"=>"4219", "9/2023"=>"3369", "10/2022"=>"3335", "11/2022"=>"4198", "12/2022"=>"3127"}},
"NominalCode"=>"500"},
"residential_private"=>
{"current"=>{"periods"=>{"1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0", "7/2023"=>"0", "8/2023"=>"0", "9/2023"=>"0", "10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0"}},
"original"=>{"periods"=>{"1/2023"=>"4389", "2/2023"=>"1265", "3/2023"=>"4496", "4/2023"=>"980", "5/2023"=>"1617", "6/2023"=>"1396", "7/2023"=>"4839", "8/2023"=>"4248", "9/2023"=>"1770", "10/2022"=>"3513", "11/2022"=>"1294", "12/2022"=>"4240"}},
"NominalCode"=>"520"}}
As you can see, because it is JSONB, the keys are stored in length order (shorter keys first). I need to sort the periods by date when reading the data so that they display correctly.
I know for sure I can sort the periods nested hash with the following:
{ periods: periods_hash[:periods].sort_by { |k, _v| Date.strptime(k, "%d/%Y") }.to_h }
This creates a new hash of periods and sorts it correctly.
My problem is getting to this nested hash and returning the full hash with no changes but the order of those periods.
My current efforts produced this:
def order_data
  data.each do |_k, v|
    v.each do |sk, sv|
      next if sk.include? "NominalCode"
      sv.each do |x, _y|
        { periods: x[:periods].sort_by { |k, _v| Date.strptime(k, "%d/%Y") }.to_h }
      end
    end
  end
end
The above isn't working and gives me "no implicit conversion of Symbol into Integer". I'm assuming that, because I'm iterating with .each, there's a problem with Enumerable and the symbols of the hash.
What I'm expecting in return is:
{"residential_la"=>
{"current"=>{"periods"=>{"10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0", "1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0", "7/2022"=>"0", "8/2022"=>"0", "9/2022"=>"0"}},
"original"=>{"periods"=>{"10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0", "1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0", "7/2022"=>"0", "8/2022"=>"0", "9/2022"=>"0"}},
"NominalCode"=>"500"},
"residential_private"=>
{"current"=>{"periods"=>{"10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0", "1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0", "7/2022"=>"0", "8/2022"=>"0", "9/2022"=>"0"}},
"original"=>{"periods"=>{"10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0", "1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0", "7/2022"=>"0", "8/2022"=>"0", "9/2022"=>"0"}},
"NominalCode"=>"500"}
I'm unsure whether I should be mapping to create a new hash or just iterating with each to modify only the sub-hash. Ultimately there's a lack of understanding on my part, so any help and guidance is appreciated.
Thanks.
Given the following task definition:
we need to sort any hash under key periods...
regardless of its (periods) depth
and keep everything else intact
one could go with the simple recursive algorithm (let's name it deep_sort_periods):
1. Start with a new hash (n).
2. Check the next key/value pair (k/v) of the existing hash.
3. If v is not a hash, leave it as is (set n[k] = v).
4. If v is a hash and k is periods, sort the value (n[k] = <your sorting logic>).
5. Otherwise (v is a hash but k is something other than periods), repeat steps 1-4 for the nested hash and assign the result to n[k] (n[k] = deep_sort_periods(v)).
The implementation can follow the description almost literally:
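require "date"

def deep_sort_periods(hash)
  hash.each_with_object({}) do |(k, v), new_hash|
    if !v.is_a?(Hash)
      # Plain values (e.g. "NominalCode") are copied unchanged.
      new_hash[k] = v
    elsif k == "periods"
      # Keys look like "month/year", so parse them with %m rather than %d.
      new_hash[k] = v.sort_by { |date, _| Date.strptime(date, "%m/%Y") }.to_h
    else
      # Any other nested hash is processed recursively.
      new_hash[k] = deep_sort_periods(v)
    end
  end
end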
and then
pry(main)> deep_sort_periods(hash)
=> {"residential_la"=>
{"current"=>{"periods"=>{"7/2022"=>"0", "8/2022"=>"0", "9/2022"=>"0", "10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0", "1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0"}},
"original"=>{"periods"=>{"7/2022"=>"0", "8/2022"=>"0", "9/2022"=>"0", "10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0", "1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0"}},
"NominalCode"=>"500"},
"residential_private"=>
{"current"=>{"periods"=>{"7/2022"=>"0", "8/2022"=>"0", "9/2022"=>"0", "10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0", "1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0"}},
"original"=>{"periods"=>{"7/2022"=>"0", "8/2022"=>"0", "9/2022"=>"0", "10/2022"=>"0", "11/2022"=>"0", "12/2022"=>"0", "1/2023"=>"0", "2/2023"=>"0", "3/2023"=>"0", "4/2023"=>"0", "5/2023"=>"0", "6/2023"=>"0"}},
"NominalCode"=>"500"}}
Disclaimer: if there are more restrictions for the task definition - for example, periods are not arbitrarily nested but rather sit at some fixed depth - the task most probably can be solved in a more concise and performant way with no recursion at all...
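As a rough illustration of the non-recursive variant mentioned in the disclaimer, here is a minimal sketch. It assumes the fixed shape shown in the question (section => entry => "periods") and that the keys are "month/year" strings; the method name sort_periods_fixed_depth and the shortened sample data are made up for this example.

```ruby
require "date"

# Hypothetical helper: assumes "periods" always sits exactly two levels down,
# as in the question's data. Anything that is not a hash with a "periods"
# hash inside (e.g. "NominalCode") is passed through untouched.
def sort_periods_fixed_depth(data)
  data.transform_values do |section|
    section.transform_values do |entry|
      next entry unless entry.is_a?(Hash) && entry["periods"].is_a?(Hash)

      sorted = entry["periods"].sort_by { |key, _| Date.strptime(key, "%m/%Y") }.to_h
      entry.merge("periods" => sorted)
    end
  end
end

# Shortened sample in the same shape as the question's hash.
sample = {
  "residential_la" => {
    "current" => { "periods" => { "1/2023" => "0", "10/2022" => "0", "12/2022" => "0" } },
    "NominalCode" => "500"
  }
}

sorted = sort_periods_fixed_depth(sample)
puts sorted["residential_la"]["current"]["periods"].keys.inspect
```

Both transform_values and merge return new hashes, so the original data is left unmodified.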
I would like to create a custom scanner for i18n-tasks that can detect enums declared as hashes in models.
My enum declaration pattern will always be like this:
class Conversation < ActiveRecord::Base
enum status: { active: 0, archived: 1}, _prefix: true
enum subject: { science: 0, literature: 1, music: 2, art: 3 }, _prefix: true
end
The enums will always be declared as hashes, and will always have a numerical hash value, and will always have the option _prefix: true at the end of the declaration. There can be any number of values in the hash.
My custom scanner currently looks like this:
require 'i18n/tasks/scanners/file_scanner'
class ScanModelEnums < I18n::Tasks::Scanners::FileScanner
include I18n::Tasks::Scanners::OccurrenceFromPosition
# @return [Array<[absolute key, Results::Occurrence]>]
def scan_file(path)
text = read_file(path)
text.scan(/enum\s([a-zA-Z]*?):\s\{.*\W(\w+):.*\}, _prefix: true$/).map do |prefix, attribute|
occurrence = occurrence_from_position(
path, text, Regexp.last_match.offset(0).first)
model = File.basename(path, ".rb") #.split('/').last
name = prefix + "_" + attribute
["activerecord.attributes.%s.%s" % [model, name], occurrence]
end
end
end
I18n::Tasks.add_scanner 'ScanModelEnums'
However this is only returning the very last element of each hash:
activerecord.attributes.conversation.status_archived
activerecord.attributes.conversation.subject_art
How can I return all the elements of each hash? I am wanting to see a result like this:
activerecord.attributes.conversation.status_active
activerecord.attributes.conversation.status_archived
activerecord.attributes.conversation.subject_science
activerecord.attributes.conversation.subject_literature
activerecord.attributes.conversation.subject_music
activerecord.attributes.conversation.subject_art
For reference, the i18n-tasks github repo offers an example of a custom scanner.
The file scanner class that it uses can be found here.
This works:
def scan_file(path)
  result = []
  text = read_file(path)
  # Pass the block to scan itself so that Regexp.last_match refers to the
  # enum declaration currently being processed.
  text.scan(/enum\s([a-zA-Z]*?):\s\{(.*)}, _prefix: true$/) do |prefix, body|
    occurrence = occurrence_from_position(path, text,
                                          Regexp.last_match.offset(0).first)
    model = File.basename(path, ".rb")
    # Collect every key name inside the enum's hash body.
    body.scan(/(\w+):/).flatten.each do |attr|
      name = "#{prefix}_#{attr}"
      result << ["activerecord.attributes.#{model}.#{name}", occurrence]
    end
  end
  result
end
It's similar to your 'answer' approach, but uses the regex to get all the contents between '{...}', and then uses another regex to grab each enum key name.
The probable reason your 'answer' version raises an error is that it is actually returning a three-dimensional array, not two:
The outer .map is an array of all iterations.
Each iteration returns retval, which is an array.
Each element of retval is an array of ['key', occurrence] pairs.
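The nesting described above can be reproduced without i18n-tasks. The following is only a sketch on a sample string: it builds plain key names rather than [key, occurrence] pairs, and the variable names (source, pattern, nested, flat) are made up for illustration. It compares map, which keeps one inner array per enum declaration, with flat_map, which concatenates them into a single list.

```ruby
# Sample model source, copied from the question's enum declarations.
source = <<~MODEL
  enum status: { active: 0, archived: 1}, _prefix: true
  enum subject: { science: 0, literature: 1, music: 2, art: 3 }, _prefix: true
MODEL

pattern = /enum\s([a-zA-Z]*?):\s\{(.*)\}, _prefix: true$/

# map keeps one inner array per enum declaration (an extra level of nesting).
nested = source.scan(pattern).map do |prefix, body|
  body.scan(/(\w+):/).flatten.map { |attr| "#{prefix}_#{attr}" }
end

# flat_map concatenates the inner arrays into a single flat list.
flat = source.scan(pattern).flat_map do |prefix, body|
  body.scan(/(\w+):/).flatten.map { |attr| "#{prefix}_#{attr}" }
end

puts nested.inspect
puts flat.inspect
```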
This isn't the answer, this is just the other attempt I made, which outputs a two dimensional array instead of a single array:
require 'i18n/tasks/scanners/file_scanner'
class ScanModelEnums < I18n::Tasks::Scanners::FileScanner
include I18n::Tasks::Scanners::OccurrenceFromPosition
# @return [Array<[absolute key, Results::Occurrence]>]
def scan_file(path)
text = read_file(path)
text.scan(/enum\s([a-zA-Z]*?):\s\{(.*)\}, _prefix: true/).map do |prefix, attributes|
retval = []
model = File.basename(path, ".rb")
names = attributes.split(",").map!{ |e| e.strip; e.split(":").first.strip }
names.each do |attribute|
pos = (Regexp.last_match.offset(0).first + 8 + prefix.length + attributes.index(attribute))
occurrence = occurrence_from_position(
path, text, pos)
name = prefix + "_" + attribute
# p "================"
# p type
# p message
# p ["activerecord.attributes.%s.%s" % [model, name], occurrence]
# p "================"
retval.push(["activerecord.attributes.%s.%s" % [model, name], occurrence])
end
retval
end
end
end
I18n::Tasks.add_scanner 'ScanModelEnums'
This however gives me an error for the second detected attribute:
gems/i18n-tasks-0.9.34/lib/i18n/tasks/scanners/results/key_occurrences.rb:48:in `each': undefined method `path' for ["activerecord.attributes.conversation.status_archived", Occurrence(app/models/project.rb:3:32:98::)]:Array (NoMethodError)
I'm currently working on an XML parser and I'm trying to use Lua's pattern matching tools but I'm not getting the desired result. Let's say I have this XML snippet:
<Parent>
<Child>
<Details>Text in Parent tag and Details child tag</Details>
<Division>Text in Parent tag and Division child tag</Division>
</Child>
</Parent>
I need to pull the Parent tag out into a table, followed by any child tags, and their corresponding text data. I already have the pattern for pulling the data figured out:
DATA = "<.->(.-)<"
Likewise for pulling tags individually:
TAGS ="<(%w+)>"
However like I mentioned, I need to differentiate between tags that are nested and tags that aren't. Currently the pattern that's getting the closest result I need is:
CHILDTAG= "<%w->.-<(%w-)>"
This should print only "Child", but it also prints "Division" for a reason I can't work out. The idea behind the CHILDTAG pattern is that it captures a tag if and only if it has an enclosing tag; the ".-" is there to allow for an optional newline between the two tags. However, I think that's wrong, because "\n-", which should match an optional newline, doesn't work. I referred to the documentation and to:
https://www.fhug.org.uk/wiki/wiki/doku.php?id=plugins:understanding_lua_patterns
I use Lua 5.1. I want to parse an XML file of the following pattern. How should I go about it?
Simple XML parser (Named entities in XML are not supported)
local symbols = {lt = '<', gt = '>', amp = '&', quot = '"', apos = "'", nbsp = ' ', euro = '€', copy = '©', reg = '®'}
local function unicode_to_utf8(codepoint)
-- converts numeric unicode to string containing single UTF-8 character
local t, h = {}, 127
while codepoint > h do
local low6 = codepoint % 64
codepoint = (codepoint - low6) / 64
t[#t+1] = 128 + low6
h = 288067 % h
end
t[#t+1] = 254 - 2*h + codepoint
return string.char((table.unpack or unpack)(t)):reverse()
end
local function unescape(text)
return (
(text..'<![CDATA[]]>'):gsub('(.-)<!%[CDATA%[(.-)]]>',
function(not_cdata, cdata)
return
not_cdata
:gsub('%s', ' ')
--:gsub(' +', ' ') -- only for html
:gsub('^ +', '')
:gsub(' +$', '')
:gsub('&(%w+);', symbols)
:gsub('&#(%d+);', function(u) return unicode_to_utf8(tonumber(u)) end)
:gsub('&#[xX](%x+);', function(u) return unicode_to_utf8(tonumber(u, 16)) end)
..cdata
end
)
)
end
function parse_xml(xml)
local tag_stack = {}
local result = {find_child_by_tag = {}}
for text_before_tag, closer, tag, attrs, self_closer in xml
:gsub('^%s*<?xml.-?>', '') -- remove prolog
:gsub('^%s*<!DOCTYPE[^[>]+%[.-]>', '')
:gsub('^%s*<!DOCTYPE.->', '')
:gsub('<!%-%-.-%-%->', '') -- remove comments
:gmatch'([^<]*)<(/?)([%w_]+)(.-)(/?)>'
do
table.insert(result, unescape(text_before_tag))
if result[#result] == '' then
result[#result] = nil
end
if closer ~= '' then
local parent_pos, parent
repeat
parent_pos = table.remove(tag_stack)
if not parent_pos then
error("Closing unopened tag: "..tag)
end
parent = result[parent_pos]
until parent.tag == tag
local elems = parent.elems
for pos = parent_pos + 1, #result do
local child = result[pos]
table.insert(elems, child)
if type(child) == 'table' then
--child.find_parent = parent
parent.find_child_by_tag[child.tag] = child
end
result[pos] = nil
end
else
local attrs_dict = {}
for names, value in ('\0'..attrs:gsub('%s*=%s*([\'"])(.-)%1', '\0%2\0')..'\0')
:gsub('%z%Z*%z', function(unquoted) return unquoted:gsub('%s*=%s*([%w_]+)', '\0%1\0') end)
:gmatch'%z(%Z*)%z(%Z*)'
do
local last_attr_name
for name in names:gmatch'[%w_]+' do
name = unescape(name)
if last_attr_name then
attrs_dict[last_attr_name] = '' -- boolean attributes (such as "disabled" in html) are converted to empty strings
end
last_attr_name = name
end
if last_attr_name then
attrs_dict[last_attr_name] = unescape(value)
end
end
table.insert(result, {tag = tag, attrs = attrs_dict, elems = {}, find_child_by_tag = {}})
if self_closer == '' then
table.insert(tag_stack, #result)
end
end
end
for _, child in ipairs(result) do
if type(child) == 'table' then
result.find_child_by_tag[child.tag] = child
end
end
-- Now result is a sequence of upper-level tags
-- each tag is a table containing fields: tag (string), attrs (dictionary, may be empty), elems (array, may be empty) and find_child_by_tag (dictionary, may be empty)
-- attrs is a dictionary of attributes
-- elems is a sequence of elements (with preserving their order): tables (nested tags) or strings (text between <tag> and </tag>)
return result
end
Usage example:
local xml= [[
<Parent>
<Child>
<Details>Text in Parent tag and Details child tag</Details>
<Division>Text in Parent tag and Division child tag</Division>
</Child>
</Parent>
]]
xml = parse_xml(xml)
--> both these lines print "Text in Parent tag and Division child tag"
print(xml[1].elems[1].elems[2].elems[1])
print(xml.find_child_by_tag.Parent.find_child_by_tag.Child.find_child_by_tag.Division.elems[1])
What parsed xml looks like:
xml = {
find_child_by_tag = {Parent = ...},
[1] = {
tag = "Parent",
attrs = {},
find_child_by_tag = {Child = ...},
elems = {
[1] = {
tag = "Child",
attrs = {},
find_child_by_tag = {Details = ..., Division = ...},
elems = {
[1] = {
tag = "Details",
attrs = {},
find_child_by_tag = {},
elems = {[1] = "Text in Parent tag and Details child tag"}
},
[2] = {
tag = "Division",
attrs = {},
find_child_by_tag = {},
elems = {[1] = "Text in Parent tag and Division child tag"}
}
}
}
}
}
}
I am trying to modify the example code in pyparsing to handle operands that are key value pairs, like:
(Region:US and Region:EU) or (Region:Asia)
This is a boolean expression with three operands - Region:US, Region:EU and Region:Asia. If they were simple operands like x, y and z, I'd be good to go. I don't need to do any special processing on them to break up the key-value pairs. I need to treat the operand in its entirety as though it might have just been x, and need to assign truth values to it and evaluate the full expression.
How might I modify the following code to handle this:
#
# simpleBool.py
#
# Example of defining a boolean logic parser using
# the operatorGrammar helper method in pyparsing.
#
# In this example, parse actions associated with each
# operator expression will "compile" the expression
# into BoolXXX class instances, which can then
# later be evaluated for their boolean value.
#
# Copyright 2006, by Paul McGuire
# Updated 2013-Sep-14 - improved Python 2/3 cross-compatibility
#
from pyparsing import infixNotation, opAssoc, Keyword, Word, alphas
# define classes to be built at parse time, as each matching
# expression type is parsed
class BoolOperand(object):
    def __init__(self,t):
        self.label = t[0]
        self.value = eval(t[0])
    def __bool__(self):
        return self.value
    def __str__(self):
        return self.label
    __repr__ = __str__
    __nonzero__ = __bool__

class BoolBinOp(object):
    def __init__(self,t):
        self.args = t[0][0::2]
    def __str__(self):
        sep = " %s " % self.reprsymbol
        return "(" + sep.join(map(str,self.args)) + ")"
    def __bool__(self):
        return self.evalop(bool(a) for a in self.args)
    __nonzero__ = __bool__
    __repr__ = __str__

class BoolAnd(BoolBinOp):
    reprsymbol = '&'
    evalop = all

class BoolOr(BoolBinOp):
    reprsymbol = '|'
    evalop = any

class BoolNot(object):
    def __init__(self,t):
        self.arg = t[0][1]
    def __bool__(self):
        v = bool(self.arg)
        return not v
    def __str__(self):
        return "~" + str(self.arg)
    __repr__ = __str__
    __nonzero__ = __bool__

TRUE = Keyword("True")
FALSE = Keyword("False")
boolOperand = TRUE | FALSE | Word(alphas,max=1)
boolOperand.setParseAction(BoolOperand)
# define expression, based on expression operand and
# list of operations in precedence order
boolExpr = infixNotation( boolOperand,
[
("not", 1, opAssoc.RIGHT, BoolNot),
("and", 2, opAssoc.LEFT, BoolAnd),
("or", 2, opAssoc.LEFT, BoolOr),
])
if __name__ == "__main__":
    p = True
    q = False
    r = True
    tests = [("p", True),
             ("q", False),
             ("p and q", False),
             ("p and not q", True),
             ("not not p", True),
             ("not(p and q)", True),
             ("q or not p and r", False),
             ("q or not p or not r", False),
             ("q or not (p and r)", False),
             ("p or q or r", True),
             ("p or q or r and False", True),
             ("(p or q or r) and False", False),
            ]
    print("p =", p)
    print("q =", q)
    print("r =", r)
    print()
    for t,expected in tests:
        res = boolExpr.parseString(t)[0]
        success = "PASS" if bool(res) == expected else "FAIL"
        print (t,'\n', res, '=', bool(res),'\n', success, '\n')
Instead of p, q, r, I'd like to use "Region:US", "Region:EU" and "Region:Asia." Any ideas?
EDIT: Using Paul McGuire's suggestion, I tried writing the following code which breaks on parsing:
#
# simpleBool.py
#
# Example of defining a boolean logic parser using
# the operatorGrammar helper method in pyparsing.
#
# In this example, parse actions associated with each
# operator expression will "compile" the expression
# into BoolXXX class instances, which can then
# later be evaluated for their boolean value.
#
# Copyright 2006, by Paul McGuire
# Updated 2013-Sep-14 - improved Python 2/3 cross-compatibility
#
from pyparsing import infixNotation, opAssoc, Keyword, Word, alphas
# define classes to be built at parse time, as each matching
# expression type is parsed
class BoolOperand(object):
    def __init__(self,t):
        self.label = t[0]
        self.value = validValues[t[0]]
    def __bool__(self):
        return self.value
    def __str__(self):
        return self.label
    __repr__ = __str__
    __nonzero__ = __bool__

class BoolBinOp(object):
    def __init__(self,t):
        self.args = t[0][0::2]
    def __str__(self):
        sep = " %s " % self.reprsymbol
        return "(" + sep.join(map(str,self.args)) + ")"
    def __bool__(self):
        return self.evalop(bool(a) for a in self.args)
    __nonzero__ = __bool__
    __repr__ = __str__

class BoolAnd(BoolBinOp):
    reprsymbol = '&'
    evalop = all

class BoolOr(BoolBinOp):
    reprsymbol = '|'
    evalop = any

class BoolNot(object):
    def __init__(self,t):
        self.arg = t[0][1]
    def __bool__(self):
        v = bool(self.arg)
        return not v
    def __str__(self):
        return "~" + str(self.arg)
    __repr__ = __str__
    __nonzero__ = __bool__

TRUE = Keyword("True")
FALSE = Keyword("False")
boolOperand = TRUE | FALSE | Word(alphas+":",max=1)
boolOperand.setParseAction(BoolOperand)
# define expression, based on expression operand and
# list of operations in precedence order
boolExpr = infixNotation( boolOperand,
[
("not", 1, opAssoc.RIGHT, BoolNot),
("and", 2, opAssoc.LEFT, BoolAnd),
("or", 2, opAssoc.LEFT, BoolOr),
])
if __name__ == "__main__":
    validValues = {
        "Region:US": False,
        "Region:EU": True,
        "Type:Global Assets>24": True
    }
    tests = [("Region:US", True),
             ("Region:EU", False),
             ("Region:US and Region:EU", False),
             ("Region:US and not Region:EU", True),
             ("not not Region:US", True),
             ("not(Region:US and Region:EU)", True),
             ("Region:EU or not Region:US and Type:Global Assets>24", False),
             ("Region:EU or not Region:US or not Type:Global Assets>24", False),
             ("Region:EU or not (Region:US and Type:Global Assets>24)", False),
             ("Region:US or Region:EU or Type:Global Assets>24", True),
             ("Region:US or Region:EU or Type:Global Assets>24 and False", True),
             ("(Region:US or Region:EU or Type:Global Assets>24) and False", False),
            ]
    print("Region:US =", validValues["Region:US"])
    print("Region:EU =", validValues["Region:EU"])
    print("Type:Global Assets>24 =", validValues["Type:Global Assets>24"])
    print()
    for t,expected in tests:
        res = boolExpr.parseString(t)[0]
        success = "PASS" if bool(res) == expected else "FAIL"
        print (t,'\n', res, '=', bool(res),'\n', success, '\n')
Thanks to Paul McGuire's help, here is the solution:
boolOperand = TRUE | FALSE | Combine(Word(alphas)+":"+quotedString) | Word(alphas+":<>")
This does the parsing as I wanted it.
There are two parts to making this change: changing the parser, and then changing the post-parsing behavior to accommodate these new values.
To parse operands that are not just simple 1-character names, change this line in the parser:
boolOperand = TRUE | FALSE | Word(alphas,max=1)
The simplest (but not the strictest) change would be:
boolOperand = TRUE | FALSE | Word(alphas+":")
But this would accept, in addition to your valid values of "Region:US" or "TimeZone:UTC", presumably invalid values like "XouEWRL:sdlkfj", ":sldjf:ljsdf:sdljf", and even ":::::::". If you want to tighten up the parser, you could restrict the key to a fixed set of names:
from pyparsing import oneOf, Combine  # needed in addition to the imports above
valid_key = oneOf("Region Country City State ZIP")
valid_value = Word(alphas+"_")
valid_kv = Combine(valid_key + ":" + valid_value)
boolOperand = TRUE | FALSE | valid_kv
That should take care of the parser.
Second, you will need to change how this entry is evaluated after the parsing is done. In my example, I was emphasizing the parsing part, not the evaluating part, so I left this to simply call the eval() builtin. In your case, you will probably need to initialize a dict of valid values for each acceptable key-value pair, and then change the code in BoolOperand to do a dict lookup instead of calling eval. (This has the added benefit of not calling eval() with user-entered data, which has all kinds of potential for security problems.)
I encounter a strange problem when trying to alter values from a Hash. I have the following setup:
myHash = {
company_name:"MyCompany",
street:"Mainstreet",
postcode:"1234",
city:"MyCity",
free_seats:"3"
}
def cleanup string
string.titleize
end
def format
output = Hash.new
myHash.each do |item|
item[:company_name] = cleanup(item[:company_name])
item[:street] = cleanup(item[:street])
output << item
end
end
When I execute this code I get: "TypeError: no implicit conversion of Symbol into Integer" although the output of item[:company_name] is the expected string. What am I doing wrong?
Your item variable holds an Array instance (in [hash_key, hash_value] format), so its [] method doesn't accept a Symbol.
This is how you could do it using Hash#each:
def format(hash)
  output = Hash.new
  hash.each do |key, value|
    output[key] = cleanup(value)
  end
  output
end
or, without iterating, by updating only the two keys you care about:
def format(hash)
  output = hash.dup
  output[:company_name] = cleanup(output[:company_name])
  output[:street] = cleanup(output[:street])
  output
end
This error shows up when you are treating an array or string as a Hash. In this line myHash.each do |item| you are assigning item to a two-element array [key, value], so item[:symbol] throws an error.
You probably meant this:
require 'active_support/core_ext' # for titleize
myHash = {company_name:"MyCompany", street:"Mainstreet", postcode:"1234", city:"MyCity", free_seats:"3"}
def cleanup string
  string.titleize
end
def format(hash)
  output = {}
  output[:company_name] = cleanup(hash[:company_name])
  output[:street] = cleanup(hash[:street])
  output
end
format(myHash) # => {:company_name=>"My Company", :street=>"Mainstreet"}
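To see where the TypeError comes from, here is a small self-contained sketch (plain Ruby, no Rails needed). The shortened hash and the variable names my_hash, pair and error_message are only for illustration.

```ruby
my_hash = { company_name: "MyCompany", street: "Mainstreet" }

# With a single block parameter, Hash#each yields each entry as a
# two-element [key, value] array.
my_hash.each do |item|
  puts item.inspect
end

pair = my_hash.first # the first entry, as a [key, value] array

error_message = nil
begin
  pair[:company_name] # Array#[] expects an Integer index, not a Symbol
rescue TypeError => e
  error_message = e.message
end
puts error_message
```

Destructuring the block parameters (|key, value|) avoids the problem, because the key and value are then available directly.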
Please read the documentation on Hash#each.
myHash.each{|item|..} yields an Array object for the item block variable on each iteration, like the following:
[:company_name, "MyCompany"]
[:street, "Mainstreet"]
[:postcode, "1234"]
[:city, "MyCity"]
[:free_seats, "3"]
You should do this instead (passing the hash in, since a method cannot see the myHash local variable):
def format(hash)
  output = Hash.new
  hash.each do |k, v|
    output[k] = cleanup(v)
  end
  output
end
I've come across this many times in my work. An easy workaround that I found is to check whether the array element is a Hash before indexing it:
if i.is_a?(Hash)
  # notation like i[:label] will work in this block and not throw that error
end
When I convert an XML structure to a hash with Hash.from_xml(@xml) in Rails, the parser does not distinguish between empty arrays and nil values. The XML, however, uses self-closing nodes (terminated with />, e.g. <audio_languages/>) for empty arrays, and nodes with the attribute nil="true" for nil values.
The XML structure (which I have control over on how to generate) looks like this:
<response>
<medias>
<media>
<id>1</id>
<name>Media-1</name>
<audio_languages/>
<avg_rating nil="true"></avg_rating>
</media>
<media>
<id>2</id>
<name>Media-2</name>
<audio_languages/>
<avg_rating nil="true"></avg_rating>
</media>
</medias>
</response>
The expected output from Hash.from_xml(@xml) would be:
{"response"=>{"medias"=>{"media"=>[{"id"=>"1", "name"=>"Media-1", "audio_languages"=>[], "avg_rating"=>nil}, {"id"=>"2", "name"=>"Media-2", "audio_languages"=>[], "avg_rating"=>nil}]}}}
instead, I get nil values for audio_languages and avg_rating:
{"response"=>{"medias"=>{"media"=>[{"id"=>"1", "name"=>"Media-1", "audio_languages"=>nil, "avg_rating"=>nil}, {"id"=>"2", "name"=>"Media-2", "audio_languages"=>nil, "avg_rating"=>nil}]}}}
I ended up parsing the nodes using libxml and checking whether each node has the signature I am looking for, in order to decide whether to convert it to an empty array or a nil value.
# Usage: Hash.from_xml_with_libxml(xml)
require 'xml/libxml'
# adapted from
# http://movesonrails.com/articles/2008/02/25/libxml-for-active-resource-2-0
class Hash
class << self
def from_xml_with_libxml(xml, strict=true)
LibXML::XML.default_load_external_dtd = false
LibXML::XML.default_pedantic_parser = strict
result = LibXML::XML::Parser.string(xml).parse
return { result.root.name.to_s => xml_node_to_hash_with_libxml(result.root)}
end
def xml_node_to_hash_with_libxml(node)
# If we are at the root of the document, start the hash
if node.element?
if node.children?
result_hash = {}
node.each_child do |child|
result = xml_node_to_hash_with_libxml(child)
if child.name == "text"
if !child.next? and !child.prev?
return result
end
elsif result_hash[child.name]
if result_hash[child.name].is_a?(Object::Array)
result_hash[child.name] << result
else
result_hash[child.name] = [result_hash[child.name]] << result
end
else
result_hash[child.name] = result
end
end
return result_hash
else
# Self-closing nodes such as <audio_languages/> are empty arrays,
# and nodes such as <avg_rating nil="true"/> are nil values.
if node.to_s.match(/^\<(.+)\/\>$/) && nil == node.attributes["nil"]
return []
end
return nil
end
else
return node.content.to_s
end
end
end
end