regular expression form to match string - ios

I am using the iOS regular expression engine to match any text of the form:
"[h1]test text[/h1]"
I wrote: #"\\[h1]([^.]*)[/h1\\]]"
to match this form. It works sometimes, but at other times it matches text beyond the closing bracket. Is this the best pattern for matching these strings, or what would you suggest?

I would recommend using (.*?) instead of ([^.]*).
It looks like what you want is "between [h1] and [/h1], match anything." That would be (.*?).
What you have is "between [h1] and [/h1], match anything that is not a period (.)."
In addition, there is a problem with your ending: [/h1\\]] is a character class, so it means "end with a single /, h, 1, or ]". I think you want \\[/h1], which means "end with the string [/h1]".
The final regex would be #"\\[h1](.*?)\\[/h1]".
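As a rough check of the difference, here is a minimal sketch in Ruby rather than with the iOS engine (an assumption made only so the example is easy to run; the escaping inside an iOS string literal differs, and both patterns are written here with every literal bracket escaped). The variable names and the sample text are made up for illustration.

```ruby
# Pattern from the question: [^.]* is greedy and the ending is a character class.
broken = /\[h1\]([^.]*)[\/h1\]]/
# Pattern from the answer: lazy capture, then the literal closing tag.
fixed = /\[h1\](.*?)\[\/h1\]/

text = "[h1]test text[/h1] and more [text]"

# String#[] with a regex and a group index returns that capture group.
broken_capture = text[broken, 1]
fixed_capture  = text[fixed, 1]

puts "broken: #{broken_capture.inspect}"
puts "fixed:  #{fixed_capture.inspect}"
```

With the broken pattern the capture runs on to the last character that happens to be in the class, which is the "out of bound" behaviour described in the question; the lazy version stops at the first closing tag.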

Related

Rails, Postgres 12, Query where pattern matches regex, and contains substring

I have a field in the database that contains strings like 58XBF2022L1001390. I need to query for records that match the last letter (in this case 'L') and match or resemble the last four digits.
The regular expression I've been using to find records with this structure is \d{2}[A-Z]{3}\d{4}[A-Z]\d{7}. So far I've tried using a scope to refine the results, but I'm not getting any. Here's my scope:
def self.filter_by_shortcode(input)
  q = input
  starting = q.slice!(0)
  ending = q
  where("field ~* ?", "\d{2}[A-Z]{3}\d{4}/[#{starting}]/\d{3}[#{ending}]\g")
end
Here are some more example strings, and the substring that we would be looking for. Not every string stored in this database field matches this format, so we would need to be able to first match the string using the regex provided, then search by substring.
36GOD8837G6154231
G4231
13WLF8997V2119371
V9371
78FCY5027V4561374
V1374
06RNW7194P2075353
P5353
57RQN0368Y9090704
Y0704
edit: added some more examples as well as substrings that we would need to search by.
I do not know Rails, but the SQL for what you want is relatively simple. Since your string has a fixed format, once that format is validated, simple concatenation of substrings gives the desired result.
with base(target, goal) as
( values ('36GOD8837G6154231', 'G4231')
, ('13WLF8997V2119371', 'V9371')
, ('78FCY5027V4561374', 'V1374')
, ('06RNW7194P2075353', 'P5353')
, ('57RQN0368Y9090704', 'Y0704')
)
select substr(target,10,1) || substr(target,14,4) target, goal
from base
where target ~ '^\d{2}[A-Z]{3}\d{4}[A-Z]\d{7}$';
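The same validate-then-slice idea can be sketched in plain Ruby. This is only an illustration under assumptions: `shortcode_for` is a hypothetical helper name, the third sample is a made-up non-matching value, and in a real app the filtering would more likely stay in SQL as above.

```ruby
# Returns the letter plus the last four digits, or nil when the string
# does not follow the fixed format. (Hypothetical helper name.)
def shortcode_for(target)
  # Fixed format from the question: 2 digits, 3 letters, 4 digits, 1 letter, 7 digits.
  format = /\A\d{2}[A-Z]{3}\d{4}[A-Z]\d{7}\z/
  return nil unless format.match?(target)

  # Ruby indexes from 0, so SQL's substr(target, 10, 1) is target[9]
  # and substr(target, 14, 4) is target[13, 4].
  target[9] + target[13, 4]
end

samples = %w[36GOD8837G6154231 13WLF8997V2119371 not-a-code]
codes = samples.map { |sample| shortcode_for(sample) }
p codes
```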

Rails - Detecting keywords in a string with exact match

This one is tricky, at least for me, as I am new to Rails.
soccer = ["football pitch", "soccer", "free kick", "penalty"]
string = "Did anyone see that free kick last night, let me get my pen!!!"
What I want to do is search for instances of keywords, with 2 main rules:
1 - Don't do partial matches, i.e. it should not match "pen" with "penalty"; it has to be a full match.
2 - Match multi-word phrases like "nice day", "sweet tooth", "three's a crowd" (3 words at most).
This code works perfectly for scenario 1:
def self.check_for_keyword_match?(string, keyword_array)
  string.split.any? { |word| keyword_array.include?(word) }
end

if check_for_keyword_match?(string, soccer)
  soccer.to_set.freeze
  keywords_found.push('soccer')
  # send a response saying Hey, I see you are interested in soccer.
end
In that example it would not match "pen", but it would match "penalty", which is perfect.
But I also want it to match keywords of 2-3 words, i.e. "free kick" should match as a phrase, whereas with the code above only "free" and "kick" would match, and only if they were listed as separate single-word keywords. "Free" is too broad, and so is "kick", but "free kick" is not, so it works much better at deciphering the poster's interests.
I can change the format of the soccer array, but the string being submitted comes from a Slack post, so I can't control how it is formatted. In the actual program I have 20 or so of these keyword arrays, but once I figure out how to do one, I can handle the rest.
For manipulating strings, regular expressions are useful.
The following code should fix your issue:
def self.check_for_keyword_match?(string, keyword_array)
  # Regexp.escape keeps special characters in a keyword from being read as regex syntax
  keyword_array.any? { |word| Regexp.new('\b' + Regexp.escape(word) + '\b').match(string) }
end
Instead of splitting string, go through keyword_array and search the entire string for each keyword.
The regex adds a word boundary anchor \b on each side of the keyword so that it only matches entire words (Rule 1). If you used include? here instead, a keyword of "pen" would match "penalty".
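Here is a minimal, self-contained sketch of the word-boundary approach applied to the data from the question. The helper name `matched_keywords` is hypothetical, and the case-insensitive flag is an assumption added here because Slack posts are unlikely to be consistently cased.

```ruby
# Hypothetical helper: returns every keyword or phrase that appears in
# the text as whole words, ignoring case.
def matched_keywords(text, keywords)
  keywords.select do |keyword|
    # Regexp.escape guards against keywords containing regex metacharacters.
    /\b#{Regexp.escape(keyword)}\b/i.match?(text)
  end
end

soccer = ["football pitch", "soccer", "free kick", "penalty"]
post = "Did anyone see that free kick last night, let me get my pen!!!"

found = matched_keywords(post, soccer)
p found
```

Returning the matched keywords instead of a boolean is a design choice for the sketch; calling `.any?` on the result gives the same yes/no answer as the method above.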

How to use regex in active record queries?

There are values in a specific column of my database that end with a number, and others that do not.
I'm trying to select only the rows whose value does not end with a number.
I tried these queries, but they do not work:
User.where.not("spec like ?", "%\d")
User.where.not("spec ~ ?", "%\d")
How can I find this data?
Use SIMILAR TO with %[0-9] pattern:
User.where.not("spec SIMILAR TO ?", "%[0-9]")
The SIMILAR TO operator is similar to regex, but allows the use of wildcards as with LIKE and some "light" regex constructs, e.g. bracket expressions like [0-9] or [A-Z]. The pattern should match the whole input as with LIKE.
So, the %[0-9] pattern will match any string that starts with any text (the % wildcard does that) and ends with an ASCII digit (due to the [0-9] at the end).
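To see which values such a pattern keeps and drops, here is a small in-memory sketch in plain Ruby. It does not touch the database: the `specs` array is made-up sample data, and the Ruby regex `[0-9]\z` is only assumed to mirror what `%[0-9]` expresses in SIMILAR TO.

```ruby
# Made-up sample values standing in for the spec column.
specs = ["alpha", "beta2", "gamma 10", "delta-x", "7up"]

# Keep only the values that do NOT end with an ASCII digit.
without_trailing_digit = specs.reject { |spec| spec.match?(/[0-9]\z/) }
p without_trailing_digit
```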

Rails query by number of digits in field

I have a Rails app with a table, "clients". The clients table has a field, phone, whose data type is string. I'm using PostgreSQL. I would like to write a query that selects all clients whose phone value contains more than 10 digits. phone does not have a specific format:
+1 781-658-2687
+1 (207) 846-3332
2067891111
(345)222-777
123.234.3443
etc.
I've been trying variations of the following:
Client.where("LENGTH(REGEXP_REPLACE(phone,'[^\d]', '')) > 10")
Any help would be great.
You almost have it but you're missing the 'g' option to regexp_replace, from the fine manual:
The regexp_replace function provides substitution of new text for substrings that match POSIX regular expression patterns. [...] The flags parameter is an optional text string containing zero or more single-letter flags that change the function's behavior. Flag i specifies case-insensitive matching, while flag g specifies replacement of each matching substring rather than only the first one.
So regexp_replace(string, pattern, replacement) behaves like Ruby's String#sub whereas regexp_replace(string, pattern, replacement, 'g') behaves like Ruby's String#gsub.
You'll also need to get a \d through your double-quoted Ruby string all the way down to PostgreSQL so you'll need to say \\d in your Ruby. Things tend to get messy when everyone wants to use the same escape character.
This should do what you want:
Client.where("LENGTH(REGEXP_REPLACE(phone, '[^\\d]', '', 'g')) > 10")
# --------------------------------------------^^---------^^^
Try this:
phone_number.gsub(/[^\d]/, '').length
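The difference between replacing the first match and replacing every match, and the effect of the escaping, can be checked in plain Ruby without a database. This is a sketch using the sample numbers from the question; the variable names are made up.

```ruby
phones = ["+1 781-658-2687", "+1 (207) 846-3332", "2067891111",
          "(345)222-777", "123.234.3443"]

# sub removes only the first non-digit, like regexp_replace without 'g'.
first_only = phones.first.sub(/[^\d]/, '')

# gsub removes every non-digit, like regexp_replace with 'g'.
digit_counts = phones.map { |phone| phone.gsub(/[^\d]/, '').length }
more_than_ten = phones.select { |phone| phone.gsub(/[^\d]/, '').length > 10 }

# In a double-quoted Ruby string "\d" collapses to "d"; "\\d" keeps the backslash.
lost_backslash = "[^\d]"
kept_backslash = "[^\\d]"

p first_only, digit_counts, more_than_ten, lost_backslash, kept_backslash
```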

Rails validates_format_of

I want to use validates_format_of to validate a comma-separated string containing only letters (lowercase and uppercase) and numbers.
So this should pass:
example1, example2, 22example44, ex24
but not this:
^&*, <> , asfasfsdafas<#%$#
Basically, I want users to enter comma-separated words (including numbers) without special characters.
I'll use it to validate tags from acts_as_taggable_on. (i don't want to be a valid tag for example.
Thanks in advance.
You can always test regular expressions at Rubular. You would find that both tiftik's and Tim's regular expressions work, albeit with some strange edge cases around whitespace.
Tim's solution can be extended to allow leading and trailing whitespace, which should then do what you want:
^\s*[A-Za-z0-9]+(\s*,\s*[A-Za-z0-9]+)*\s*$
Presumably, once you have validated the input string, you will want to turn it into an array of tags to iterate over. You can do this as follows:
array_var = string_var.delete(' ').split(',')
^([a-zA-Z0-9]+,\s*)*[a-zA-Z0-9]+$
Note that this regex doesn't match values that contain whitespace, so it won't match multi-word values like "abc xyz, fgh qwe". It does allow any amount of whitespace after commas. You might not need ^ or $ if validates_format_of tries to match the whole string; I've never used Rails, so I don't know about that.
^[A-Za-z0-9]+([ \t]*,[ \t]*[A-Za-z0-9]+)*$
should match a CSV line that only contains those characters, whether it's just one value or many.
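The validate-then-split steps above can be tried out in plain Ruby. This is a sketch under assumptions: `parse_tags` is a hypothetical helper, not part of Rails or acts_as_taggable_on, and it uses \A and \z instead of ^ and $ because in Ruby the latter match at the start and end of each line rather than of the whole string.

```ruby
# Hypothetical helper: returns the tags as an array, or nil when the
# input contains anything other than letters, digits, commas and spacing.
def parse_tags(input)
  # \A and \z anchor to the whole string; in Ruby, ^ and $ match per line.
  pattern = /\A\s*[A-Za-z0-9]+(\s*,\s*[A-Za-z0-9]+)*\s*\z/
  return nil unless pattern.match?(input)

  # Split on commas, then trim the whitespace around each tag.
  input.split(',').map(&:strip)
end

good = parse_tags("example1, example2, 22example44, ex24")
bad  = parse_tags("^&*, <> , asfasfsdafas<#%$#")
p good, bad
```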
