Regular expression not working completely in Ruby on Rails - ruby-on-rails

So, I am trying to apply a regular expression to email addresses coming into a site I am working on, to verify that they are mostly valid. The regular expression is the one below.
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@
(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])
When I put it into Ruby as below:
if email =~ [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@
(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])
  #logic here if regex passes
end
The problem is that the regular expression above contains '#' characters, which Ruby treats as the start of a comment. So, is there a way to use the regular expression without the '#' being interpreted as a comment? Can regular expressions be stored as strings or something similar?

You have to use Ruby's regex literal syntax, /regex/, or build a new Regexp with Regexp.new(string). Inside /.../ the # is not a comment marker, but each / in the pattern has to be escaped as \/:
regexp = /[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])/
if email =~ regexp
  #logic here if regex passes
end
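As a hedged sketch of how the answer above might be used in practice, the pattern can be kept in a constant and wrapped in a small predicate. The names EMAIL_PATTERN and plausible_email? are assumptions for illustration. The \A and \z anchors, the /i flag, and the \# escape (which rules out any accidental interpolation after the hash sign) are additions that the original answer does not include.

```ruby
# Assumed constant name. "\#" keeps the hash sign literal and "\/" escapes the
# regex delimiter. \A and \z (not in the original answer) anchor the whole
# string, and /i makes the match case-insensitive.
EMAIL_PATTERN = /\A[a-z0-9!\#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!\#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])\z/i

# Hypothetical helper: true when the value looks like an email address.
def plausible_email?(email)
  EMAIL_PATTERN.match?(email.to_s)
end

# The same pattern rebuilt from a plain string, e.g. one loaded from a config
# file or a database column.
EMAIL_PATTERN_FROM_STRING = Regexp.new(EMAIL_PATTERN.source, Regexp::IGNORECASE)
```

Regexp#match? needs Ruby 2.4 or later; on older versions `!!(email =~ EMAIL_PATTERN)` gives the same answer.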

Related

Rails bad URI error

How can I get rid of special characters with some regular expression?
I have a search_controller in which I do this with CGI.escape.
def index
  @page = search_params[:page] || 1
  @per_page = search_params[:per_page] || 20
  @query = URI.parse(CGI.escape(search_params[:query]).gsub("%40", "@").gsub("%C3%9F", "ß").gsub("%C3%BC", "ü"))
end
But I want to do it with some kind of regexp instead of white listing all the other characters I need besides the ones which break the search.
As it looks like you are doing some sort of pagination, I highly recommend using a pagination gem like kaminari or will_paginate.
If you are escaping the query for security reasons, note that what you are doing here is blacklisting some special characters, which is not recommended. The better approach is to whitelist the permitted characters.
http://guides.rubyonrails.org/security.html#whitelists-versus-blacklists
A possible regular expression that strips everything except word characters could be
query.gsub(/([\W]+)/, '')
This removes every character that is not a-z, A-Z, 0-9 or _ (underscore). Note that it also removes @, ß and ü, because Ruby's \W treats them as non-word characters.
But I wouldn't recommend using a simple regex like this in production.
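As a rough sketch of the whitelisting idea, the helper below keeps letters, digits and a handful of extra characters and drops everything else. The names DISALLOWED_QUERY_CHARS and sanitize_query, and the exact set of permitted punctuation, are assumptions for illustration and not part of the original controller. \p{L} and \p{N} are used instead of \w so that characters such as ß and ü survive.

```ruby
# Hypothetical names; the permitted punctuation (@, space, ., _, -) is an assumption.
# \p{L} matches any Unicode letter, \p{N} any Unicode digit.
# The leading ^ negates the class, so the pattern matches what should be removed.
DISALLOWED_QUERY_CHARS = /[^\p{L}\p{N}@ ._-]/

def sanitize_query(query)
  query.to_s.gsub(DISALLOWED_QUERY_CHARS, '')
end
```

Whether a whitelist like this is sufficient depends on what the search backend does with the query, so treat it as a starting point rather than a complete defence.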

Rails strip all except numbers commas and decimal points

Hi I've been struggling with this for the last hour and am no closer. How exactly do I strip everything except numbers, commas and decimal points from a rails string? The closest I have so far is:-
rate = rate.gsub!(/[^0-9]/i, '')
This strips everything but the numbers. When I try to add commas to the expression, everything gets stripped. I got the above from somewhere else, and as far as I can gather:
^ = not
Everything to the left of the comma gets replaced by what's in the '' on the right
No idea what the /i does
I'm very new to gsub. Does anyone know of a good tutorial on building expressions?
Thanks
Try:
rate = rate.gsub(/[^0-9,\.]/, '')
Basically, ^ means "not" when it is the first character inside the character class brackets [], which you are already using, so you can just add the comma and the period to the list. Outside a character class an unescaped period is a special character that means "match any character"; inside the brackets it is literal, so the backslash is optional but harmless.
Also, be aware of whether you are using gsub or gsub!
gsub! has the bang, so it edits the string you call it on in place rather than returning a new one. It returns nil when nothing was replaced, so don't assign its result back to the variable.
So if using gsub! it would be:
rate.gsub!(/[^0-9,\.]/, '')
And rate would be altered.
If you do not want to alter the original variable, use the version without the bang and assign the result to a different variable:
cleaned_rate = rate.gsub(/[^0-9,\.]/, '')
I'd just google for tutorials. I haven't used one. Regexes are a LOT of time and trial and error (and table-flipping).
This is a cool tool to use with a mini cheat-sheet on it for ruby that allows you to quickly edit and test your expression:
http://rubular.com/
You can just add the comma and period in the square-bracketed expression:
rate.gsub(/[^0-9,.]/, '')
You don't need the i for case-insensitivity for numbers and symbols.
There's lots of info on regular expressions, regex, etc. Maybe search for those instead of gsub.
You can use this:
rate = rate.gsub(/[^0-9\.\,]/, '')
Also check this out to learn more about regular expressions:
http://www.regexr.com/
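To make the gsub versus gsub! distinction from the answers above concrete, here is a minimal sketch. The helper name clean_rate and the sample values are assumptions for illustration.

```ruby
# Hypothetical helper: keeps digits, commas and periods, drops everything else.
def clean_rate(rate)
  rate.gsub(/[^0-9,.]/, '')
end

original = '$1,234.50 USD'
cleaned  = clean_rate(original)    # non-destructive: original is left untouched

mutable = '$1,234.50 USD'.dup
mutable.gsub!(/[^0-9,.]/, '')      # destructive: mutable itself is changed

already_clean = '42'.dup
result = already_clean.gsub!(/[^0-9,.]/, '')  # nothing to replace, so result is nil
```

The last case is why `rate = rate.gsub!(...)` is risky: when the string is already clean, rate ends up as nil.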

regex basic url expression

Hi I'm creating a regular expression (ruby) to test the beginning and end of string. I have both parts but can't join them.
Beginning of string
\A(http:\/\/+)
End of string
(.pdf)\z
How to join?
Bonus if it could validate in-between and accept anything (to avoid http://.pdf)
By the way, rubular http://rubular.com is a neat place to validate expressions
Put .+ between the two parts; it matches any character except \n, one or more times. Also escape the period in .pdf so that it only matches a literal dot.
\A(http:\/\/+).+(\.pdf)\z
Should match http://www.stackoverflow.com/bestbook.pdf but not http://.pdf
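A small sketch for checking the combined expression from the answer above. The constant and method names are assumptions for illustration; the pattern itself is the one given in the answer.

```ruby
# Assumed names; the pattern is copied from the answer above.
PDF_URL_PATTERN = /\A(http:\/\/+).+(\.pdf)\z/

# Hypothetical helper: true when the string starts with http:// and ends in .pdf
# with at least one character in between.
def pdf_url?(url)
  PDF_URL_PATTERN.match?(url.to_s)
end
```

Because \z only matches at the very end of the string, a trailing newline makes the check fail; \Z would tolerate one.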

How do you store a Ruby regex via a Rails controller?

For an admin function in a Rails app, I want to be able to store regexes in the DB (as strings), and add them via a standard controller action.
I've run into 2 issues:
1) The Rails parameter filters seem to be automatically escaping backslashes (escape characters), which messes up the regex. For instance:
\s{1,2}(foo)
becomes:
\\s{1,2}(foo)
2) So then I tried to use a write_attribute to gsub instances of double backslashes with single backslashes (essentially unescaping them). This proved to be much trickier than expected. (I'm using Ruby 1.9.2 if it matters). Some things I've found:
"hello\\world".gsub(/\\/, ' ') #=> "hello world"
"hello\\world".gsub(/\\/, "\\") #=> "hello\\world"
"hello\\world".gsub(/\\/, '\\') #=> "hello\\world"
What I'm trying to do is:
"hello\\world".gsub(/\\/, something) #=> "hello\world"
I'd love to know both solutions.
1) How can you safely pass and store regexes as params to a Rails controller action?
2) How can you substitute double backslashes with a single backslash?
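In short, there is nothing to substitute: "hello\\world" in Ruby source (and in inspect output) is a string that contains a single backslash, and the doubling is only the escape notation for it. That is why your gsub calls appear to change nothing. What you can do is the following: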
Regexp.new("hello\\world") #=> /hello\world/
This will convert your string into a regular expression. So that means: store your regular expressions as strings (with the escaped characters) and convert them into regular expressions when you want to compare against them:
regexp = "\\s{1,2}(foo)"
reg = Regexp.new(regexp) #=> /\s{1,2}(foo)/
" foo" =~ reg #=> 0

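The following sketch illustrates the answer above: the stored string holds one backslash, inspect merely displays it escaped, and Regexp.new turns it into a usable pattern. The helper name compile_pattern and its nil-on-error behaviour are assumptions for illustration.

```ruby
# What a form field containing \s{1,2}(foo) delivers: ONE backslash.
# Single quotes keep the backslash literal in Ruby source.
stored = '\s{1,2}(foo)'

puts stored          # prints \s{1,2}(foo)
puts stored.inspect  # prints "\\s{1,2}(foo)" (the escaped display form)

# Hypothetical helper: compile a stored pattern, or return nil if it is invalid,
# so that a bad admin entry does not raise in the middle of a request.
def compile_pattern(source)
  Regexp.new(source)
rescue RegexpError
  nil
end

pattern = compile_pattern(stored)
```

Patterns entered by users are compiled and run as-is, so it is probably wise to restrict this feature to trusted admins.
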
Assistance with Some Interesting Syntax in Some Ruby Code I've Found

I'm currently reading Agile Web Development With Rails, 3rd edition. On page 672, I came across this method:
def capitalize_words(string)
  string.gsub(/\b\w/) { $&.upcase }
end
What is the code in the block doing? I have never seen that syntax. Is it similar to the array.map(&:some_method) syntax?
It's title-casing the input. Inside the block, $& is a built-in variable holding the current match (\b\w, i.e. the first letter of each word), which is then uppercased.
You've touched on one of the few things I don't like about Ruby :)
The magic variable $& contains the matched string from the previous successful pattern match. So in this case, it'll be the first character of each word.
This is mentioned in the RDoc for String#gsub:
http://ruby-doc.org/core/classes/String.html#M000817
gsub replaces everything that matched the regex with the result of the block. So yes, in this case you're matching the first letter of each word, then replacing it with the upcased version.
As to the slightly bizarre syntax inside the block, this is equivalent (and perhaps easier to understand):
def capitalize_words(string)
  string.gsub(/\b\w/) {|x| x.upcase}
end
Or even slicker:
def capitalize_words(string)
  string.gsub(/\b\w/, &:upcase)
end
As to the regex (courtesy of the pickaxe book), \b matches a word boundary and \w matches any 'word character' (alphanumerics and underscore), so \b\w matches the first character of each word.
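For comparison, here is a short sketch of two ways to write the block without the $& magic variable. The second method name is an assumption for illustration; Regexp.last_match(0) is the spelled-out equivalent of $&.

```ruby
# Block-parameter form: gsub passes the matched text into the block.
def capitalize_words(string)
  string.gsub(/\b\w/) { |first_letter| first_letter.upcase }
end

# Same behaviour using Regexp.last_match(0) in place of $&.
def capitalize_words_last_match(string)
  string.gsub(/\b\w/) { Regexp.last_match(0).upcase }
end
```

One caveat: \b also sees a boundary after an apostrophe, so "it's" becomes "It'S" with this pattern.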
