I have been hacking together a couple of libraries, and ran into an issue where a string was getting 'double escaped'.
For example:
Fixed example
> x = ['a']
=> ["a"]
> x.to_s
=> "[\"a\"]"
>
Then it gets escaped again to:
\"\[\\\"s\\\"\]\"
This was happening while dealing with HTTP headers. I have a header which will be an array, but the HTTP library is doing its own character escaping on the array.to_s value.
The workaround I found was to convert the array to a string myself, and then 'undo' the to_s. Like so:
formatted_value = value.to_s
if value.instance_of?(Array)
  formatted_value = formatted_value.gsub(/\\/, "") # remove backslashes
  formatted_value = formatted_value.gsub(/"/, "")  # remove double quotes
  formatted_value = formatted_value.gsub(/\[/, "") # remove [
  formatted_value = formatted_value.gsub(/\]/, "") # remove ]
end
value = formatted_value
... There's gotta be a better way ... (without needing to monkey-patch the gems I'm using). (Yeah, this breaks if my string actually contains those characters.)
Suggestions?
** UPDATE 2 **
Okay. Still having troubles in this neighborhood, but now I think I've figured out the core issue. It's serializing my array to json after a to_s call. At least, that seems to be reproducing what I'm seeing.
['a'].to_s.to_json
I'm calling a method in a gem that is returning the results of a to_s, and then I'm calling to_json on it.
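For anyone trying to reproduce this, here is a minimal sketch of the difference between encoding the array once and encoding its string form. It assumes Ruby 1.9 or later, where Array#to_s is an alias of inspect; the variable names are only for illustration.

```ruby
require 'json'

values = ['a']

# Encoding the array directly yields a JSON array.
encoded_once = values.to_json

# Stringifying first and then encoding yields a JSON *string* whose
# content merely looks like an array, with its inner quotes escaped.
encoded_twice = values.to_s.to_json

puts encoded_once
puts encoded_twice
```

If you control the call site, handing the array itself to to_json (rather than the result of to_s) avoids the second layer of escaping.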
I've edited my answer due to your edited question:
I still can't duplicate your results!
>> x = ['a']
=> ["a"]
>> x.to_s
=> "a"
But when I change the last call to this:
>> x.inspect
=> "[\"a\"]"
So I'll assume that's what you're doing?
It's not necessarily escaping the values per se. It's storing the string like this:
%{["a"]}
or rather:
'["a"]'
In any case. This should work to un-stringify it:
>> x = ['a']
=> ["a"]
>> y = x.inspect
=> "[\"a\"]"
>> z = Array.class_eval(y)
=> ["a"]
>> x == z
=> true
I'm skeptical about the safety of using class_eval though. Be wary of user input, because it may produce unintended side effects (and by that I mean code injection attacks) unless you're very sure you know where the original data came from, or what was allowed through to it.
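If the stringified array only ever holds plain strings or integers, its inspect output is also valid JSON, so a parser can rebuild it without evaluating anything. This is a sketch under that assumption; it will not work for arrays containing symbols, nil, or other objects whose inspect form is not JSON.

```ruby
require 'json'

original = ['a', 'b']
stringified = original.inspect # the text form, e.g. what a to_s call handed back

# JSON.parse only builds data (arrays, hashes, strings, numbers);
# it never executes the text it is given.
restored = JSON.parse(stringified)

puts restored == original
```

Malformed input raises JSON::ParserError instead of running as code, which is the main reason to prefer this over class_eval for untrusted data.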
I'm new to Ruby and I am building a web scraper. I have a variable that is assigned a value if a conditional is true.
The problem is that the value of the variable is really long and I'd like to avoid repeating myself with these long values.
I am using conditionals because the number of items on the page is not fixed.
#Grab the top 3 comps if they exist
#comp1
if b.element(:xpath => '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table/tbody/tr[1]/td[13]/span').exists?
comp1 = b.element(:xpath => '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table/tbody/tr[1]/td[13]/span')
end
#comp2
if b.element(:xpath => '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table/tbody/tr[2]/td[13]/span').exists?
comp2 = b.element(:xpath => '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table/tbody/tr[2]/td[13]/span')
end
#comp3
if b.element(:xpath => '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table/tbody/tr[3]/td[13]/span').exists?
comp3 = b.element(:xpath => '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table/tbody/tr[3]/td[13]/span')
end
Is there a way to shorten this, such as:
if "element with really long xpath location on the webpage that we are checking to see if it is true".exists?
x = "That conditional referenced above"
end
Since you're just replacing a single number in that long xpath selector you can use a template string:
elements = (1..3).map do |x|
b.element(
xpath: '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table/tbody/tr[%d]/td[13]/span' % x
)
end.select(&:exists?)
See Kernel#sprintf for the options which are pretty much identical to the venerable C sprintf function.
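To see what the template substitution produces without a browser in the loop, here is a small sketch using plain strings. The shortened XPath and the constant name are made up for illustration; only the %d placeholder mechanism is the point.

```ruby
# Hypothetical, shortened XPath template; %d marks where the row number goes.
ROW_XPATH = '//table/tbody/tr[%d]/td[13]/span'

# Kernel#format is an alias of Kernel#sprintf.
xpaths = (1..3).map { |row| format(ROW_XPATH, row) }

puts xpaths
```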
Break up the string, either literally, or logically:
# literally
table_xpath = '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table'
if b.element(:xpath => "#{table_xpath}/tbody/tr[1]/td[13]/span").exists?
#...
end
# logically
table = b.element(xpath: '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table')
if table.element(xpath: "tbody/tr[1]/td[13]/span").exists?
end
Break it up as many or as few times as you feel like to make the code read well.
You can write this directly in Watir as shown below; you have to use elements instead of element:
b.elements(:xpath => '/html/body/form/div[3]/div[6]/table/tbody/tr/td/div[2]/div[3]/div[3]/div/div/div[1]/table/tbody/tr')
.take(3)
.map{|tr|tr.element(xpath: "./td[13]/span")}
The code above still isn't optimal, though. Once you have located the table, you can write the following instead (I assume here that the table index is 2):
b.table(index: 2)
.rows
.to_enum
.take(3)
.map{|row| row.cell(index: 12).span} # Watir indexes are zero-based, so td[13] is index 12
I have a string param, whose value is "1" or "['1','2','3','4']". By using the eval method, I can get the result 1 or [1,2,3,4], but I need the result [1] or [1,2,3,4].
params[:city_id] = eval(params[:city_id])
scope :city, -> (params) { params[:city_id].present? ? where(city_id: (params[:city_id].is_a?(String) ? eval(params[:city_id]) : params[:city_id])) : all }
I don't want to use eval here.
scope :city, -> (params) { params[:city_id].present? ? where(city_id: params[:city_id]) : all }
params[:city_id] #should be array values e.g [1] or [1,2,3,4] instead of string
Your strings look very close to JSON, so probably the safest thing you can do is parse the string as JSON. In fact:
JSON.parse("1") => 1
JSON.parse('["1","2","3","4"]') => ["1","2","3","4"]
Your array uses single quotes, though, which JSON does not accept. So I would suggest:
Array(JSON.parse(string.gsub("'", '"'))).map(&:to_i)
So, replace the single quotes with doubles, parse as JSON, make sure it's wrapped in an array and convert possible strings in the array to integers.
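Wrapped up as a method, that could look like the sketch below. The name parse_city_ids is hypothetical, and it assumes a json library version that accepts a bare scalar such as "1" at the top level (true of the json 2.x shipped with current Rubies).

```ruby
require 'json'

# Hypothetical helper; accepts "1" or "['1','2','3','4']".
def parse_city_ids(raw)
  json_text = raw.tr("'", '"') # single-quoted strings are not valid JSON
  Array(JSON.parse(json_text)).map(&:to_i)
end

p parse_city_ids("1")                 # => [1]
p parse_city_ids("['1','2','3','4']") # => [1, 2, 3, 4]
```

Anything that is not parseable raises JSON::ParserError, so you may want to rescue that and fall back to an empty array.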
I've left a comment for what would be my preferred approach: it's unusual to get your params through as you are, and the ideal approach would be to address this. Using eval is definitely a no go - there are some big security concerns to doing so (e.g. imagine someone submitting "City.delete_all" as the param).
As a solution to your immediate problem, you can do this using a regex, scanning for digits:
str = "['1','2','3','4']"
str.scan(/\d+/)
# => ["1", "2", "3", "4"]
str = '1'
str.scan(/\d+/)
# => ["1"]
# In your case:
params[:city_id].scan(/\d+/)
In very simple terms, this looks through the given string for any digits that are in there. Here's a simple Regex101 with results / an explanation: https://regex101.com/r/41yw9C/1.
Rails should take care of converting the fields in your subsequent query (where(city_id: params[:city_id])), though if you explicitly want an array of integers, you can append the following (thanks @SergioTulentsev):
params[:city_id].scan(/\d+/).map(&:to_i)
# or in a single loop, though slightly less readable:
[].tap { |result| str.scan(/\d+/) { |match| result << match.to_i } }
# => [1, 2, 3, 4]
Hope that's useful, let me know how you get on or if you have any questions.
I've written a function within a model to scrape a site and store certain attributes within a separate model (story):
def get_content
  request = HTTParty.get("#{url}")
  doc = Nokogiri::HTML(request.body)
  doc.css("#{anchor}")["#{range}"].each do |entry|
    story = self.stories.new
    story.title = entry.text
    story.url = entry[:href]
    story.save
  end
end
This uses the url, anchor, and range attributes of a Sections variable. The range attribute is stored as an array range - i.e. 0..2 or 11..13 - however, I'm being told that it can't convert a string into a variable. I've tried storing range as an integer and as a string, but both fail.
I realise I could input the beginning and end of the range as two separate integers in my db, and put ["#{beginrange}".."#{endrange}"] but this seems a messy way of doing it.
Any other ideas? Many thanks in advance
Hmm, if you are sure that the range is always a string like '1..2' ('<Integer>..<Integer>'), you can use the eval method:
In my IRB console:
1.9.3p0 :032 > (eval "1..2").each { |l| puts l }
1
2
=> 1..2
1.9.3p0 :033 > (eval "1..2").inspect
=> "1..2"
1.9.3p0 :034 > (eval "1..2").class
=> Range
In your case:
doc.css("#{anchor}")[eval(range)].each do |entry|
#...
end
But eval is kind of dangerous. If you are sure that the range attribute is a Range as a String (validations and Regex are here to help), you can use eval without risk.
There's a couple things I see wrong.
["#{beginrange}".."#{endrange}"] creates a range of characters, not a range of integers, which Array[] needs:
beginrange = 1
endrange = 2
["#{beginrange}".."#{endrange}"]
=> ["1".."2"]
[beginrange..endrange]
=> [1..2]
But, you're storing the representation of the array range you need as a string. If I had a string representation of a range, I'd use this:
range_value = '1..2'
[Range.new(*range_value.scan(/\d+/).map(&:to_i))]
=> [1..2]
Or, if there was a chance I'd encounter an exclusive-range:
[Range.new(*range_value.scan(/\d+/).map(&:to_i), range_value['...'])]
=> [1..2]
range_value = '1...2'
[Range.new(*range_value.scan(/\d+/).map(&:to_i), range_value['...'])]
=> [1...2]
Those are all good when you can't trust your Range string representation's source, i.e., the value is coming from a form or a file someone else created. If you own the incoming value, or, for convenience, stored it as a string in a database, you can easily recreate the range using eval:
eval('1..2').class
=> Range
eval('1..2')
=> 1..2
eval('1...2')
=> 1...2
People are afraid of eval, because, used unwisely, it is dangerous. That doesn't mean we should avoid using it, instead, we should use it when it's safe.
You could use a regex to check the format of the string, raise an exception if it's not acceptable, then continue:
raise "Invalid range value received" if (!range_value[/\A\d+\s*\.{2,3}\s*\d+\z/])
[eval(range_value)]
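If you would rather skip eval altogether, the same validation regex can capture the pieces and build the Range directly. This is only a sketch; parse_range and RANGE_PATTERN are names I made up, and it handles non-negative integer endpoints only.

```ruby
# Same shape as the validation regex, with capture groups added.
RANGE_PATTERN = /\A(\d+)\s*(\.{2,3})\s*(\d+)\z/

# Hypothetical helper: turns "1..2" or "1...2" into a Range, or raises.
def parse_range(text)
  match = RANGE_PATTERN.match(text)
  raise ArgumentError, "Invalid range value: #{text.inspect}" unless match

  first = match[1].to_i
  last = match[3].to_i
  exclusive = match[2] == '...' # three dots exclude the end value
  Range.new(first, last, exclusive)
end

items = %w[a b c d]
p items[parse_range('0..2')]
```

Because nothing is evaluated, a string that fails the pattern can only raise ArgumentError; it never runs.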
Is there a short hand or best practice for assigning things to a hash when they are nil in ruby? For example, my problem is that I am using another hash to build this and if something in it is nil, it assigns nil to that key, rather than just leaving it alone. I understand why this happens so my solution was:
hash1[:key] = hash2[:key] unless hash2[:key].nil?
Because I cannot have a value in the hash where the key actually points to nil. (I would rather have an empty hash than one that has {:key => nil}; that can't happen.)
My question would be is there a better way to do this? I don't want to do a delete_if at the end of the assignments.
A little bit shorter if you turn the "unless" into an "if":
hash1[:key] = hash2[:key] if hash2[:key] # like `unless hash2[:key].nil?`, but also skips false
you could also do the comparison in a && statement as suggested in other answers by Michael or Marc-Andre
It's really up to you, what you feel is most readable for you. By design, there are always multiple ways in Ruby to solve a problem.
You could also filter out the nil values:
hash1 = hash2.reject{|k,v| v.nil?} # builds a new hash, leaving hash2 untouched
hash2.reject!{|k,v| v.nil?} # even shorter, if editing hash2 in place is acceptable
This removes the :key => nil pairs (from hash2 itself, if you use reject!).
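On Ruby 2.4 or later, Hash#compact does the same nil filtering and combines nicely with merge!, which keeps whatever hash1 already holds. A small sketch with made-up data:

```ruby
hash1 = { name: 'default', size: 10 }
hash2 = { name: 'custom', size: nil, flag: false }

# Hash#compact returns a copy without nil values; false is kept.
hash1.merge!(hash2.compact)

p hash1
```

Note that hash2 itself is left unchanged, because compact (without the bang) works on a copy.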
I like this the best, loop and conditional overriding all in one line!
h1 = {:foo => 'foo', :bar => 'bar'}
h2 = {:foo => 'oof', :bar => nil}
h1.merge!(h2) { |key, old_val, new_val| new_val.nil? ? old_val : new_val }
#=> {:foo => 'oof', :bar => 'bar'}
This will replace every value in h1 with the value of h2 where the keys are the same and the h2 value is not nil.
I'm not sure if that's really any better, but
hash2[:key] && hash1[:key] = hash2[:key]
could work. Note that this behaves the same way for false and nil. If that's not what you want,
!hash2[:key].nil? && hash1[:key] = hash2[:key]
would be better. All of this assumes that :key could be an arbitrary value that you may not have control over.
How about something like this?
hash2.each_pair do |key, value|
next if value.nil?
hash1[key] = value
end
If you are doing just a single assignment, this could shave a few characters:
hash2[:key] && hash1[:key] = hash2[:key]
My first example could also be shaved a bit further:
hash2.each_pair{ |k,v| v && hash1[k] = v }
I think the first is the easiest to read/understand. Also, examples 2 and 3 will skip anything that evaluates false (nil or false). This final example is one line and won't skip false values:
hash2.each_pair{ |k,v| v.nil? || hash1[k] = v }
I believe the best practice is to copy the nil value over to the hash. If one passes an option :foo => nil, it can mean something and should override a default :foo of 42, for example. This also makes it easier to have options which should default to true, although one should use fetch in those cases:
opt = hash.fetch(:do_cool_treatment, true) # => will be true if key is not present
There are many ways to copy over values, including nil or false.
For a single key, you can use has_key? instead of the lookup:
hash1[:key] = hash2[:key] if hash2.has_key? :key
For all (or many) keys, use merge!:
hash1.merge!(hash2)
If you only want to do this for a couple of keys of hash2, you can slice it:
hash1.merge!(hash2.slice(:key, ...))
OK, so if the merge doesn't work because you want more control:
hash1[:key] = hash2.fetch(:key, hash1[:key])
This will set hash1's :key to hash2's value for :key, unless hash2 has no such key. In that case, fetch returns its default (the second argument), which is hash1's current value.
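A quick sketch of how fetch treats a key that is present with a nil value versus a key that is missing altogether (the data is invented for illustration):

```ruby
hash1 = { a: 1, b: 2 }
hash2 = { a: nil }

# :a exists in hash2, so its nil is copied over.
hash1[:a] = hash2.fetch(:a, hash1[:a])

# :b is missing from hash2, so the default (hash1's own value) is used.
hash1[:b] = hash2.fetch(:b, hash1[:b])

p hash1
```

So this approach fits the "nil can be meaningful" view: only an absent key leaves the original value alone.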
Add this to an initializer, e.g. hash.rb:
class Hash
def set_safe(key,val)
if val && key
self[key] = val
end
end
end
Usage:
hash = {}
hash.set_safe 'key', value_or_nil
My objective is to take form input, like "100 megabytes" or "1 gigabyte", and convert it to a file size in kilobytes that I can store in the database. Currently, I have this:
def quota_convert
  @regex = /([0-9]+) (.*)s/
  @sizes = %w{kilobyte megabyte gigabyte}
  m = self.quota.match(@regex)
  if @sizes.include? m[2]
    eval("self.quota = #{m[1]}.#{m[2]}")
  end
end
This works, but only if the input is plural ("gigabytes", but not "gigabyte") and seems insanely unsafe due to the use of eval. So, functional, but I won't sleep well tonight.
Any guidance?
EDIT: ------
All right. For some reason, the regex with (.*?) isn't working correctly on my setup, but I've worked around it with Rails stuff. Also, I've realized that bytes would work better for me.
def quota_convert
  @regex = /^([0-9]+\.?[0-9]*?) (.*)/
  @sizes = { 'kilobyte' => 1024, 'megabyte' => 1048576, 'gigabyte' => 1073741824 }
  m = self.quota.match(@regex)
  if @sizes.include? m[2].singularize
    self.quota = m[1].to_f * @sizes[m[2].singularize]
  end
end
This catches "1 megabyte", "1.5 megabytes", and most other things (I hope). It then makes it the singular version regardless. Then it does the multiplication and spits out magic answers.
Is this legit?
EDIT AGAIN: See answer below. Much cleaner than my nonsense.
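You can use the Rails helper number_to_human_size (in ActionView::Helpers::NumberHelper) to turn a byte count back into readable text. For the conversion itself: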
def quota_convert
  @regex = /([0-9]+) (.*)s?/
  @sizes = "kilobytes megabytes gigabytes"
  m = self.quota.match(@regex)
  if @sizes.include? m[2]
    m[1].to_f.send(m[2])
  end
end
Added ? for the optional plural in the regex.
Changed @sizes to a string of plurals.
Converted m[1] (the number) to a float.
Sent the message m[2] directly.
Why don't you simply create a hash that contains the various spellings of the multiplier as keys and the numerical values as values? No eval necessary, and no regexes either!
First of all, changing your regex to @regex = /([0-9]+) (.*?)s?/ will fix the plural issue. The ? after the s makes the 's' optional (zero or one occurrence), and the ? after .* makes it match in a non-greedy manner (as few characters as possible).
As for the size, you could have a hash like this:
@hash = { 'kilobyte' => 1, 'megabyte' => 1024, 'gigabyte' => 1024*1024 }
and then your calculation is just self.quota = m[1].to_i * @hash[m[2]]
EDIT: Changed values to base 2
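Putting the hash idea together without eval or Rails, a rough sketch could look like this. UNIT_BYTES, QUOTA_PATTERN and quota_to_bytes are names I invented; the multipliers are in bytes (base 2), and the regex is anchored at both ends so the lazy unit group cannot match an empty string.

```ruby
# Multipliers in bytes, base 2 (hypothetical constant name).
UNIT_BYTES = {
  'kilobyte' => 1024,
  'megabyte' => 1024**2,
  'gigabyte' => 1024**3
}.freeze

# number, whitespace, unit name, optional plural "s"
QUOTA_PATTERN = /\A(\d+(?:\.\d+)?)\s+([a-z]+?)s?\z/i

# Hypothetical helper: returns a byte count, or nil for unrecognised input.
def quota_to_bytes(text)
  match = QUOTA_PATTERN.match(text.strip)
  return nil unless match

  multiplier = UNIT_BYTES[match[2].downcase]
  return nil unless multiplier

  (match[1].to_f * multiplier).round
end

p quota_to_bytes('100 megabytes')
p quota_to_bytes('1 gigabyte')
```

Unknown units simply fall through to nil, so the only thing user input can do is pick a key in the hash.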