ruby regex match - ruby-on-rails

For my ruby on rails project, I have a model called message which has a to field. I want to implement a wildcard search so that, for example, %545 will bring up all messages ending with 545, 545% will bring up all numbers starting with 545, %545% will bring up all messages including 545.
I have a query like Message.where("to like ?", str) where str is the string to match, e.g. %545, %545%, 545%...etc.
Everything works, but I'm concerned about SQL injection attacks. So I want to match str against a regex that only lets % and digits through. Strings like %545, %545%, 545% should pass, but abc, %545a, a545%, %54a5% should not.
I've tried str.scan(/.*?(\d+%)/) but that doesn't work.
Thanks.

You are correctly using placeholders, so you are protected from SQL injection attacks already. Rails will escape any unsafe characters in the pattern; you don't need to take any further action.
If you still want to strip characters other than digits and %, you can use Ruby's String#delete method:
str.delete('^0-9%')
The '^0-9%' argument means "Delete every character that is not 0 to 9 or %". (n.b. you cannot use \d here, because #delete doesn't understand regular expression metacharacters.)
See https://ruby-doc.org/core-2.5.3/String.html#method-i-delete.
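If you would rather reject invalid input than silently strip characters from it, an anchored regex is one option. The following is a minimal sketch, not part of the original answer: the method name valid_wildcard? is made up for illustration, and it assumes a valid pattern is a run of digits with an optional % on either side.

```ruby
# Hypothetical helper: true only for digits with an optional leading
# and/or trailing % (e.g. "%545", "545%", "%545%", "545").
def valid_wildcard?(str)
  # \A and \z anchor the match to the whole string, so a stray
  # letter anywhere makes the check fail.
  str.match?(/\A%?\d+%?\z/)
end

%w[%545 545% %545% abc %545a a545% %54a5%].each do |s|
  puts "#{s.inspect}: #{valid_wildcard?(s)}"
end
```

The query would then only be run when valid_wildcard?(str) returns true.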

Related

How to remove from string before __

I am building a Rails 5.2 app.
In this app I got outputs from different suppliers (I am building a webshop).
The name of the shipping provider is in this format:
dhl_freight__233433
It could also be in this format:
postal__US-320202
How can I remove everything before (and including) the __ so that all that remains is what comes after the __, for example 233433?
Perhaps some sort of RegEx.
A very simple approach would be to use String#split and then pick the second part that is the last part in this example:
"dhl_freight__233433".split('__').last
#=> "233433"
"postal__US-320202".split('__').last
#=> "US-320202"
You can use a very simple Regexp and ask the resulting MatchData for the post_match part:
p "dhl_freight__233433".match(/__/).post_match
# another (magic) way to access the post_match part:
p $'
Postscript: Learnt something from this question myself: you don't even have to use a RegExp for this to work. Just "asddfg__qwer".match("__").post_match does the trick (it does the conversion to regexp for you)
r = /[^_]+\z/
"dhl_freight__233433"[r] #=> "233433"
"postal__US-320202"[r] #=> "US-320202"
The regular expression matches one or more characters other than an underscore, followed by the end of the string (\z). The ^ at the beginning of the character class reads, "other than any of the characters that follow".
See String#[].
This assumes that the last underscore in the string is part of a double underscore. If it might be a single underscore, in which case there should be no match, add a positive lookbehind:
r = /(?<=__)[^_]+\z/
This requires the match to be immediately preceded by two underscores.
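To see how the lookbehind version behaves on inputs with and without a double underscore, here is a small sketch. The method names and the sample strings other than the two from the question are invented, and String#partition is an additional option that the answers above do not mention.

```ruby
# Lookbehind version: only matches when "__" directly precedes the tail.
TAIL = /(?<=__)[^_]+\z/

# Returns the part after the last "__", or nil when there is none.
def after_separator(name)
  name[TAIL]
end

# partition splits at the first "__"; its last element is "" when
# the separator does not occur at all.
def after_first_separator(name)
  name.partition('__').last
end

p after_separator("dhl_freight__233433")
p after_separator("postal__US-320202")
p after_separator("single_underscore")
p after_first_separator("a__b__c")
```

Which variant fits depends on whether a missing separator should yield nil, an empty string, or the whole input.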
There are many ways to extract numbers from a string in Ruby. I assume you're trying to fetch the numbers out of a string; here are some of the ways to do so.
Ref- http://www.ruby-forum.com/topic/125709
line.delete("^0-9")
line.scan(/\d/).join('')
line.tr("^0-9", '')
Of the above, delete is the fastest way to pull the digits out of a string.
All of the above extract the digits from the string and join them. For a string like "String-with-67829___numbers-09764" the output would be "6782909764".
If you want the numbers split into groups, like ["67829", "09764"], use:
line.split(/[^\d]/).reject { |c| c.empty? }
Hope these answers help you! Happy coding :-)
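For reference, the variants from this answer can be run side by side on the sample string. This is only a sketch: the variable names are invented, and String#scan with \d+ is an additional way to get the grouped result that the answer does not mention.

```ruby
line = "String-with-67829___numbers-09764"

# Three ways to keep only the digits, joined into one string.
joined_delete = line.delete("^0-9")
joined_scan   = line.scan(/\d/).join
joined_tr     = line.tr("^0-9", "")

# Two ways to keep the digit runs as separate groups.
groups_split = line.split(/[^\d]/).reject(&:empty?)
groups_scan  = line.scan(/\d+/)

p joined_delete, joined_scan, joined_tr
p groups_split, groups_scan
```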

Escape only some characters in where clause

I want to fetch records that have some string field start with a given prefix and end on any one character. Basically:
Model.where('field LIKE ?', "#{prefix}_").count
The problem is that the prefix itself might contain special characters (like % or _).
Is there a way to escape the prefix, but not the trailing _ without rolling my own sanitizer with a bunch of #gsubs?
There is no better solution than replacing every _ with \_ and every % with \% to remove their special meaning (and, if backslash is your escape character, every \ with \\ as well).
Model.where("field LIKE ? || '_'", escapeDataFunction(prefix)).count
The idea is to escape what needs escaping and hard-code the rest in the where condition; escapeDataFunction here stands for whatever escaping helper you use. Also note that data passed through substitution variables (? or :1) generally does not need escaping at all, but LIKE patterns are an exception: there you should still escape the characters that have a special meaning to the LIKE operator.
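A minimal sketch of such an escaping helper in plain Ruby follows. The name escape_like is made up for illustration, and using backslash as the escape character is an assumption (it is the default in MySQL and PostgreSQL, but not in every database). ActiveRecord also provides sanitize_sql_like, which does essentially the same job.

```ruby
# Hypothetical helper: escapes the LIKE wildcards % and _ (and the
# escape character itself) so that they match literally.
def escape_like(value, escape_char = "\\")
  special = Regexp.union(escape_char, "%", "_")
  # The block form avoids backslash handling in replacement strings.
  value.gsub(special) { |ch| escape_char + ch }
end

prefix  = "50%_off"
pattern = "#{escape_like(prefix)}_" # the trailing _ stays a wildcard
puts pattern
```

The resulting pattern would then be bound to the placeholder, as in Model.where('field LIKE ?', pattern), so the single-character wildcard is appended in Ruby rather than concatenated in SQL.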

Rails strip all except numbers commas and decimal points

Hi I've been struggling with this for the last hour and am no closer. How exactly do I strip everything except numbers, commas and decimal points from a rails string? The closest I have so far is:-
rate = rate.gsub!(/[^0-9]/i, '')
This strips everything but the numbers. When I try add commas to the expression, everything is getting stripped. I got the aboves from somewhere else and as far as I can gather:
^ = not
Everything to the left of the comma gets replaced by what's in the '' on the right
No idea what the /i does
I'm very new to gsub. Does anyone know of a good tutorial on building expressions?
Thanks
Try:
rate = rate.gsub(/[^0-9,\.]/, '')
Basically, you know the ^ means "not" when it comes first inside the character class brackets [] you are using, so you can just add the comma to the list. Outside a character class the dot is a special character that means "match any character", which is why it is escaped with a backslash here; inside [] the escape is harmless but optional.
Also, be aware of whether you are using gsub or gsub!
gsub! has the bang, so it edits the instance of the string you're passing in, rather than returning another one.
So if using gsub! it would be:
rate.gsub!(/[^0-9,\.]/, '')
And rate would be altered.
If you do not want to alter the original variable, then you can use the version without the bang (and assign it to a different var):
cleaned_rate = rate.gsub(/[^0-9,\.]/, '')
I'd just google for tutorials. I haven't used one. Regexes are a LOT of time and trial and error (and table-flipping).
This is a cool tool to use with a mini cheat-sheet on it for ruby that allows you to quickly edit and test your expression:
http://rubular.com/
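The difference between gsub and gsub! is easy to check in irb. Below is a small sketch with invented sample values; note in particular that gsub! returns nil when nothing was replaced, which is why assigning its result to a variable is risky.

```ruby
rate = "$1,234.50 per unit"

# gsub returns a new string and leaves rate untouched.
cleaned = rate.gsub(/[^0-9,.]/, '')

# gsub! modifies the receiver in place.
in_place = rate.dup
in_place.gsub!(/[^0-9,.]/, '')

# gsub! returns nil when there was nothing to replace.
already_clean = "1,234.50"
result = already_clean.gsub!(/[^0-9,.]/, '')

p cleaned, rate, in_place, result
```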
You can just add the comma and period in the square-bracketed expression:
rate.gsub(/[^0-9,.]/, '')
You don't need the i for case-insensitivity for numbers and symbols.
There's lots of info on regular expressions, regex, etc. Maybe search for those instead of gsub.
You can use this:
rate = rate.gsub(/[^0-9\.\,]/, '')
Also check this out to learn more about regular expressions:
http://www.regexr.com/

regex basic url expression

Hi I'm creating a regular expression (ruby) to test the beginning and end of string. I have both parts but can't join them.
Beginning of string
\A(http:\/\/+)
End of string
(.pdf)\z
How to join?
Bonus if it could validate in-between and accept anything (to avoid http://.pdf)
By the way, rubular http://rubular.com is a neat place to validate expressions
Use .+ to match any character except \n one or more times.
\A(http:\/\/+).+(\.pdf)\z
Should match http://www.stackoverflow.com/bestbook.pdf but not http://.pdf
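In Ruby the combined pattern can be checked as follows. This is a sketch only: the names PDF_URL and pdf_url? are made up, %r{} is used just to avoid escaping the slashes, and the + after the second slash from the original pattern is dropped here for simplicity.

```ruby
# Starts with "http://", ends with ".pdf", with at least one
# character (other than a newline) in between.
PDF_URL = %r{\Ahttp://.+\.pdf\z}

# Hypothetical helper wrapping the check.
def pdf_url?(url)
  url.match?(PDF_URL)
end

p pdf_url?("http://www.stackoverflow.com/bestbook.pdf")
p pdf_url?("http://.pdf")
```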

Matching function in erlang based on string format

I have user information coming in from an outside source and I need to check if that user is active. Sometimes I have a User and a Server and other times I have User@Server. The former case is no problem, I just have:
active(User, Server) ->
    do whatever.
What I would like to do with the User@Server case is something like:
active([User, "@", Server]) ->
    active(User, Server).
Doesn't seem to work. When calling active in the erlang terminal with a@b for example, I get an error that there is no match. Any help would be appreciated!
You can tokenize the string to get the result:
active(UserString) ->
    [User,Server] = string:tokens(UserString,"@"),
    active(User,Server).
If you need something more elaborate, or with better handling of something like email addresses, it might then be time to delve into using regular expressions with the re module.
active(UserString) ->
    RegEx = "^([\\w\\.-]+)@([\\w\\.-]+)$",
    {match, [User,Server]} = re:run(UserString,RegEx,[{capture,all_but_first,list}]),
    active(User,Server).
Note: The supplied regex is hardly sufficient for email address validation; it's just an example that allows all alphanumeric characters including underscores (\\w), dots (\\.), and dashes (-) separated by an at symbol. It will also fail if the match doesn't stretch the whole length of the string (^ to $).
A note on the pattern matching: for the real solution to your problem I think @chops' suggestions should be used.
When matching patterns against strings I think it's useful to keep in mind that erlang strings are really lists of integers. So the string "@" is actually the same as [64] (64 being the ASCII code for @).
This means that your match pattern [User, "@", Server] will match lists like [97,[64],98], but not "a@b" (which in list form is [97,64,98]).
To match the string you need to do [User,$@,Server]. The $ operator gives you the ASCII value of the character.
However, this match pattern limits the matching string to one character followed by @ and then one more character...
It can be improved by doing [User, $@ | Server], which allows the server part to have arbitrary length, but the User variable will still only match one single character (and I don't see a way around that).

Resources