Regex in Ruby: expression not found - ruby-on-rails

I'm having trouble with a regex in Ruby (on Rails). I'm relatively new to this.
The test string is:
http://www.xyz.com/017010830343?$ProdLarge$
I am trying to remove "$ProdLarge$". In other words, the $ signs and anything between.
My regular expression is:
\$\w+\$
Rubular says my expression is ok. http://rubular.com/r/NDDQxKVraK
But when I run my code, the app says it isn't finding a match. Code below:
some_array.each do |x|
logger.debug "scan #{x.scan('\$\w+\$')}"
logger.debug "String? #{x.instance_of?(String)}"
x.gsub!('\$\w+\$','scl=1')
...
My logger debug line shows a result of "[]". String is confirmed as being true. And the gsub line has no effect.
What do I need to correct?

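Use /regex/ instead of 'regex'. When gsub or scan is given a String as the pattern, it matches that text literally instead of interpreting it as a regular expression: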
> "http://www.xyz.com/017010830343?$ProdLarge$".gsub(/\$\w+\$/, 'scl=1')
=> "http://www.xyz.com/017010830343?scl=1"
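To see the difference side by side, here is a minimal sketch using the URL from the question. The Regexp.new line covers a hypothetical case that is not in the question: the pattern arrives as a string (say, from configuration) and has to be converted before use.

```ruby
url = 'http://www.xyz.com/017010830343?$ProdLarge$'

# A String pattern is matched literally: Ruby searches for the characters
# backslash, "$", backslash, "w", "+", backslash, "$" and finds nothing.
literal_result = url.gsub('\$\w+\$', 'scl=1')

# A Regexp literal is interpreted as a regular expression.
regex_result = url.gsub(/\$\w+\$/, 'scl=1')

# Hypothetical case: the pattern is held in a string, so convert it first.
pattern_from_string = Regexp.new('\$\w+\$')
converted_result = url.gsub(pattern_from_string, 'scl=1')

puts literal_result    # unchanged URL
puts regex_result      # query replaced
puts converted_result  # query replaced
```

The same rule applies to scan, which is why the debug line in the question logged an empty array.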

Don't use a regex for this task, use a tool designed for it, URI. To remove the query:
require 'uri'
url = URI.parse('http://www.xyz.com/017010830343?$ProdLarge$')
url.query = nil
puts url.to_s
=> http://www.xyz.com/017010830343
To change to a different query use this instead of url.query = nil:
url.query = 'scl=1'
puts url.to_s
=> http://www.xyz.com/017010830343?scl=1
URI will automatically encode values if necessary, saving you the trouble. If you need even more URL management power, look at Addressable::URI.

Related

How to combine Ruby regexp conditions

I need to check if a string is valid image url.
I want to check beginning of string and end of string as follows:
Must start with http(s):
Must end by .jpg|.png|.gif|.jpeg
So far I have:
(https?:)
I can't seem to indicate beginning of string \A, combine patterns, and test end of string.
Test strings:
"http://image.com/a.jpg"
"https://image.com/a.jpg"
"ssh://image.com/a.jpg"
"http://image.com/a.jpeg"
"https://image.com/a.png"
"ssh://image.com/a.jpeg"
Please see http://rubular.com/r/PqERRim5RQ
Using Ruby 2.5
Using your very own demo, you could use
^https?:\/\/.*(?:\.jpg|\.png|\.gif|\.jpeg)$
See the modified demo.
One could even simplify it to:
^https?:\/\/.*\.(?:jpe?g|png|gif)$
See a demo for the latter as well.
This basically uses anchors (^ and $) on both sides, indicating the start/end of the string. Additionally, please remember that you need to escape the dot (\.) if you want to match a literal dot.
There's quite some ambiguity going on in the comments section, so let me clarify this:
^ - matches at the start of a line
(in Ruby, ^ and $ always match at line boundaries; there is no flag that makes them match only at the start or end of the whole string)
$ - matches at the end of a line
\A - is the very start of a string (irrespective of line breaks)
\z - is the very end of a string (irrespective of line breaks)
You may use
reg = %r{\Ahttps?://.*\.(?:png|gif|jpe?g)\z}
The point is:
When testing at online regex testers, you are testing a single multiline string, but in real life, you will validate lines as separate strings. So, in those testers, use ^ and $ and in real code, use \A and \z.
To match a whole string rather than a line, you need the \A and \z anchors.
Use the %r{pat} syntax if you have many / characters in your pattern; it is cleaner.
Online Ruby test:
urls = ['http://image.com/a.jpg',
'https://image.com/a.jpg',
'ssh://image.com/a.jpg',
'http://image.com/a.jpeg',
'https://image.com/a.png',
'ssh://image.com/a.jpeg']
reg = %r{\Ahttps?://.*\.(?:png|gif|jpe?g)\z}
urls.each { |url|
puts "#{url}: #{(reg =~ url) == 0}"
}
Output:
http://image.com/a.jpg: true
https://image.com/a.jpg: true
ssh://image.com/a.jpg: false
http://image.com/a.jpeg: true
https://image.com/a.png: true
ssh://image.com/a.jpeg: false
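The practical difference between the two anchor styles shows up as soon as the input contains a line break. The following is a small sketch; the two-line input string is made up for illustration.

```ruby
# Made-up input: an unwanted first line followed by a valid-looking URL.
input = "ssh://evil.example/x\nhttp://image.com/a.jpg"

line_anchored   = /^https?:\/\/.*\.(?:jpe?g|png|gif)$/
string_anchored = %r{\Ahttps?://.*\.(?:jpe?g|png|gif)\z}

# ^ and $ match at line boundaries, so the second line alone satisfies the pattern.
line_result = line_anchored.match?(input)

# \A and \z only match at the very start and very end of the string.
string_result = string_anchored.match?(input)

# \z also rejects a trailing newline, which \Z or $ would tolerate.
trailing_newline_result = string_anchored.match?("http://image.com/a.jpg\n")

puts line_result             # true
puts string_result           # false
puts trailing_newline_result # false
```

This is why ^ and $ are risky for validation in Ruby: a multi-line value can pass the check even though only one of its lines is acceptable.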
The answers here are quite good, but if you wanted to avoid using a complicated regex and communicate your intent more clearly to a reader, you could let URI and File do the heavy lifting for you.
(And since you're using 2.5, let's use #match? instead of other regex-matching methods.)
require 'uri'

def valid_url?(url)
  # Let URI parse the URL.
  uri = URI.parse(url)
  # Is the scheme exactly http or https, and is the extension one of the expected formats?
  # Both patterns are anchored so that e.g. ".jpgx" is rejected; to_s guards against a nil scheme or path.
  uri.scheme.to_s.match?(/\Ahttps?\z/i) && File.extname(uri.path.to_s).match?(/\A\.(png|jpe?g|gif)\z/i)
rescue URI::InvalidURIError
  # If it's an invalid URL, URI will raise this error.
  # We'll return `false`, because a URL that can't be parsed by URI isn't valid.
  false
end
urls.map { |url| [url, valid_url?(url)] }
#=> Results in:
'http://image.com/a.jpg', true
'https://image.com/a.jpg', true
'ssh://image.com/a.jpg', false
'http://image.com/a.jpeg', true
'https://image.com/a.png', true
'ssh://image.com/a.jpeg', false
'https://image.com/a.tif', false
'http://t.co.uk/proposal.docx', false
'not a url', false

Tidy long string in Ruby

I have a method in Ruby, which needs an API URL:
request_url = "http://api.abc.com/v3/avail?rev=#{ENV['REV']}&key=#{ENV['KEY']}&locale=en_US&currencyCode=#{currency}&arrivalDate=#{check_in}&departureDate=#{check_out}&includeDetails=true&includeRoomImages=true&room1=#{total_guests}"
I want to format it to be more readable. It should take arguments.
request_url = "http://api.abc.com/v3/avail?
&rev=#{ENV['REV']}
&key=#{ENV['KEY']}
&locale=en_US
&currencyCode=#{currency}
&arrivalDate=#{check_in}
&departureDate=#{check_out}
&includeDetails=true
&includeRoomImages=true
&room1=#{total_guests}"
But of course that introduces line breaks. I tried a heredoc, but I want the result to be on one line.
I would prefer to not build URI queries by joining strings, because that might lead to URLs that are not correctly encoded (see a list of characters that need to be encoded in URIs).
Ruby on Rails provides the Hash#to_query method, which does exactly what you need and ensures that the parameters are correctly URI encoded:
base_url = 'http://api.abc.com/v3/avail'
arguments = {
rev: ENV['REV'],
key: ENV['KEY'],
locale: 'en_US',
currencyCode: currency,
arrivalDate: check_in,
departureDate: check_out,
includeDetails: true,
includeRoomImages: true,
room1: total_guests
}
request_url = "#{base_url}?#{arguments.to_query}"
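Outside of Rails, the standard library's URI.encode_www_form does a similar job. This is only a sketch: the values below are hypothetical stand-ins for the question's ENV entries and local variables, and the parameter list is shortened.

```ruby
require 'uri'

# Hypothetical stand-ins for ENV['REV'], ENV['KEY'], currency and check_in.
arguments = {
  rev: '20',
  key: 'secret key',
  locale: 'en_US',
  currencyCode: 'USD',
  arrivalDate: '05/01/2024',
  includeDetails: true
}

base_url = 'http://api.abc.com/v3/avail'

# encode_www_form escapes each key and value (space becomes "+", "/" becomes "%2F").
request_url = "#{base_url}?#{URI.encode_www_form(arguments)}"

puts request_url
# http://api.abc.com/v3/avail?rev=20&key=secret+key&locale=en_US&currencyCode=USD&arrivalDate=05%2F01%2F2024&includeDetails=true
```

One difference worth knowing: encode_www_form keeps the hash's insertion order, whereas to_query sorts the parameters by key.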
You could use an array and join the strings:
request_url = [
"http://api.abc.com/v3/avail?",
"&rev=#{ENV['REV']}",
"&key=#{ENV['KEY']}",
"&locale=en_US",
"&currencyCode=#{currency}",
"&arrivalDate=#{check_in}",
"&departureDate=#{check_out}",
"&includeDetails=true",
"&includeRoomImages=true",
"&room1=#{total_guests}",
].join('')
Even easier, you can use the %W array shorthand notation so you don't have to write out all the quotes and commas:
request_url = %W(
http://api.abc.com/v3/avail?
&rev=#{ENV['REV']}
&key=#{ENV['KEY']}
&locale=en_US
&currencyCode=#{currency}
&arrivalDate=#{check_in}
&departureDate=#{check_out}
&includeDetails=true
&includeRoomImages=true
&room1=#{total_guests}
).join('')
Edit: Of course, spickermann makes a very good point above on better ways to accomplish this specifically for URLs. However, if you're not constructing a URL and just working with strings, the above methods should work fine.
You can split a string across several lines with a trailing backslash (line continuation); adjacent string literals are then concatenated into a single string. Example:
request_url = "http://api.abc.com/v3/avail?" \
"&rev=#{ENV['REV']}" \
"&key=#{ENV['KEY']}"
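For plain strings, two ways of keeping the source readable while still ending up with a single line can be sketched as follows. The rev and key values are hypothetical stand-ins for the ENV entries, and the squiggly heredoc needs Ruby 2.3 or later.

```ruby
# Hypothetical stand-ins for ENV['REV'] and ENV['KEY'].
rev = '20'
key = 'abc123'

# Adjacent string literals joined by a trailing backslash become one string.
continued = "http://api.abc.com/v3/avail" \
            "?rev=#{rev}" \
            "&key=#{key}"

# A squiggly heredoc with all whitespace removed afterwards.
# Caveat: this also strips whitespace that comes from interpolated values.
from_heredoc = <<~URL.gsub(/\s+/, '')
  http://api.abc.com/v3/avail
  ?rev=#{rev}
  &key=#{key}
URL

puts continued
puts from_heredoc
```

Both variables end up holding the same one-line string. The heredoc variant is only safe when none of the interpolated values may legitimately contain whitespace.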

How to make Rails.logger.debug print hash more readable

I'm using Rails.logger.debug to print variables for debugging purposes. The issue is that it prints hashes in an impossible-to-read format (I can't distinguish keys from values). For example, I add the following lines to my code base:
#code_base.rb
my_hash = {'a' => 'alligator', 'b'=>'baboon'}
Rails.logger.debug my_hash
Then I launch my rails app and type
tail -f log/development.log
But when my_hash gets printed, it looks like
bbaboonaalligator
The key and values are scrunched up, making it impossible to parse. Do you guys know what I should do to fix this?
Never mind, I found the answer to my own question. I need to use
my_hash = {'a' => 'alligator', 'b'=>'baboon'}
Rails.logger.debug "#{my_hash.inspect}"
Then, it looks like
{"b"=>"baboon", "a"=>"alligator"}
It's even easier to read when you use to_yaml, e.g.:
logger.debug my_hash.to_yaml
This produces an easy-to-read format spread over multiple lines, whereas the inspect method simply spews out a single-line string.
my_hash = {'a' => 'alligator', 'b'=>'baboon'}
logger.debug "#{my_hash}"
Then, it looks like
{"b"=>"baboon", "a"=>"alligator"}
You do not need inspect here; string interpolation already produces this format.
There is another way to do this. Ruby's built-in pp library (pp.rb) is a pretty-printer for Ruby objects.
non-pretty-printed output by p is:
#<PP:0x81fedf0 @genspace=#<Proc:0x81feda0>, @group_queue=#<PrettyPrint::GroupQueue:0x81fed3c @queue=[[#<PrettyPrint::Group:0x81fed78 @breakables=[], @depth=0, @break=false>], []]>, @buffer=[], @newline="\n", @group_stack=[#<PrettyPrint::Group:0x81fed78 @breakables=[], @depth=0, @break=false>], @buffer_width=0, @indent=0, @maxwidth=79, @output_width=2, @output=#<IO:0x8114ee4>>
pretty-printed output by pp is:
#<PP:0x81fedf0
 @buffer=[],
 @buffer_width=0,
 @genspace=#<Proc:0x81feda0>,
 @group_queue=
  #<PrettyPrint::GroupQueue:0x81fed3c
   @queue=
    [[#<PrettyPrint::Group:0x81fed78 @break=false, @breakables=[], @depth=0>],
     []]>,
 @group_stack=
  [#<PrettyPrint::Group:0x81fed78 @break=false, @breakables=[], @depth=0>],
 @indent=0,
 @maxwidth=79,
 @newline="\n",
 @output=#<IO:0x8114ee4>,
 @output_width=2>
For more complex objects, including ActiveRecord models, the same can be achieved with JSON.pretty_generate:
Rails.logger.debug JSON.pretty_generate(my_hash.as_json)
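The options mentioned in these answers can be compared outside of Rails with the standard library's Logger. This is a rough sketch: the StringIO target and the bare formatter are assumptions made only so that the example is self-contained, and Rails.logger would be used the same way.

```ruby
require 'logger'
require 'stringio'
require 'pp'
require 'json'

my_hash = { 'a' => 'alligator', 'b' => 'baboon' }

# Log into a StringIO so the result can be inspected; the formatter drops
# the usual severity and timestamp prefix to keep the output short.
io = StringIO.new
logger = Logger.new(io)
logger.formatter = proc { |_severity, _time, _progname, msg| "#{msg}\n" }

logger.debug(my_hash.inspect)               # single-line Ruby syntax
logger.debug(my_hash.pretty_inspect)        # what pp would print, as a string
logger.debug(JSON.pretty_generate(my_hash)) # indented JSON, one key per line

puts io.string
```

For a hash this small, inspect and pretty_inspect look alike, because pretty_inspect only wraps once a line exceeds its width limit; the JSON form is always spread over several lines.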

Removing a part of a URL with Ruby

Removing the query string from a URL in Ruby could be done like this:
url.split('?')[0]
Where url is the complete URL including the query string (e.g. url = http://www.domain.extension/folder?schnoo=schnok&foo=bar).
Is there a faster way to do this, i.e. without using split, but rather using Rails?
edit: The goal is to redirect from http://www.domain.extension/folder?schnoo=schnok&foo=bar to http://www.domain.extension/folder.
EDIT: I used:
url = 'http://www.domain.extension/folder?schnoo=schnok&foo=bar'
parsed_url = URI.parse(url)
new_url = parsed_url.scheme+"://"+parsed_url.host+parsed_url.path
It is easier to read and harder to get wrong if you parse the URL and set the fragment and query to nil instead of rebuilding the URL:
parsed = URI::parse("http://www.domain.extension/folder?schnoo=schnok&foo=bar#frag")
parsed.fragment = parsed.query = nil
parsed.to_s
# => "http://www.domain.extension/folder"
url = 'http://www.domain.extension/folder?schnoo=schnok&foo=bar'
u = URI.parse(url)
p = CGI.parse(u.query)
# p is now {"schnoo"=>["schnok"], "foo"=>["bar"]}
Take a look at: How to get query string from passed URL in Ruby on Rails
You may gain some performance by using a regex:
'http://www.domain.extension/folder?schnoo=schnok&foo=bar'[/[^\?]+/]
#=> "http://www.domain.extension/folder"
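The string-based approaches and the URI-based one agree on the example URL but diverge once a fragment is involved. The sketch below illustrates this; the "#frag" suffix is added here for illustration and is not part of the original question.

```ruby
require 'uri'

with_query    = 'http://www.domain.extension/folder?schnoo=schnok&foo=bar#frag'
fragment_only = 'http://www.domain.extension/folder#frag'

# Parse the URL, then drop the query and the fragment.
def strip_with_uri(url)
  uri = URI.parse(url)
  uri.query = nil
  uri.fragment = nil
  uri.to_s
end

by_split = with_query.split('?').first
by_regex = with_query[/[^?]+/]
by_uri   = strip_with_uri(with_query)

# Without a "?", the string-based approaches keep the fragment.
regex_on_fragment = fragment_only[/[^?]+/]
uri_on_fragment   = strip_with_uri(fragment_only)

puts by_split, by_regex, by_uri   # all three: http://www.domain.extension/folder
puts regex_on_fragment            # http://www.domain.extension/folder#frag
puts uri_on_fragment              # http://www.domain.extension/folder
```

Whether any speed difference matters is best measured in the application itself; the behavioural difference above is usually the more important consideration.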
There is probably no need to split the URL. When you visit this link, you pass two parameters to the back end:
http://www.domain.extension/folder?schnoo=schnok&foo=bar
params[:schnoo]=schnok
params[:foo]=bar
Monitor your log and you will see them; you can then use them directly in the controller.

Ruby/Rails 3.1: Given a URL string, remove path

Given any valid HTTP/HTTPS string, I would like to parse/transform it such that the end result is exactly the root of the string.
So given URLs:
http://foo.example.com:8080/whatsit/foo.bar?x=y
https://example.net/
I would like the results:
http://foo.example.com:8080/
https://example.net/
I found the documentation for URI::Parser not super approachable.
My initial, naïve solution would be a simple regex like:
/\A(https?:\/\/[^\/]+\/)/
(That is: Match up to the first slash after the protocol.)
Thoughts & solutions welcome. And apologies if this is a duplicate, but my search results weren't relevant.
With URI::join:
require 'uri'
url = "http://foo.example.com:8080/whatsit/foo.bar?x=y"
baseurl = URI.join(url, "/").to_s
#=> "http://foo.example.com:8080/"
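Applied to both URLs from the question, the URI.join approach can be wrapped in a small helper. This is a sketch only: root_url is a hypothetical name, and returning nil for non-HTTP or unparseable input is an assumption about the desired behaviour.

```ruby
require 'uri'

# Hypothetical helper: returns the root of an HTTP/HTTPS URL, or nil otherwise.
def root_url(url)
  uri = URI.parse(url)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  return nil unless uri.is_a?(URI::HTTP)

  URI.join(url, '/').to_s
rescue URI::InvalidURIError
  nil
end

urls = [
  'http://foo.example.com:8080/whatsit/foo.bar?x=y',
  'https://example.net/'
]

roots = urls.map { |url| root_url(url) }
puts roots
# http://foo.example.com:8080/
# https://example.net/
```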
Use URI.parse and then set the path to an empty string and the query to nil:
require 'uri'
uri = URI.parse('http://foo.example.com:8080/whatsit/foo.bar?x=y')
uri.path = ''
uri.query = nil
cleaned = uri.to_s # http://foo.example.com:8080
Now you have your cleaned up version in cleaned. Taking out what you don't want is sometimes easier than only grabbing what you need.
If you only do uri.query = '' you'll end up with http://foo.example.com:8080? which probably isn't what you want.
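You could use URI.split and then put the parts back together...
WARNING: It's a little sloppy (for example, it prints a trailing colon when the URL has no explicit port).
require 'uri'
url = "http://example.com:9001/over-nine-thousand"
# URI.split returns [scheme, userinfo, host, port, registry, path, opaque, query, fragment]
parts = URI.split(url)
puts "%s://%s:%s" % [parts[0], parts[2], parts[3]]
=> http://example.com:9001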
