I want to achieve this: retrieve a word from a CSV file, then check whether a post contains a hashtag made from that word. The problem is that I was unable to perform the concatenation.
The "Type mismatch" error could be solved by enclosing the concatenation in parentheses, as in:
WHERE line[0] =~ (".*#" + line[0] + ".*")
However, logically, that WHERE clause can never be true. A string cannot be equal to a longer string (itself, preceded by an extra character).
If you are just trying to see whether a word starts with a hashtag, this should work:
WHERE line[0] STARTS WITH "#"
Or, if you want to see if there is a hashtag in the string:
WHERE line[0] CONTAINS "#"
Related
I would like to match instances of a word in string, as long as the word is not in a URL.
An example would be find the instances of 'hello' in the following:
hello this is a regex problem http://geocities.com/hello/index.html?hello! Hello how are you!
The simplest regex for this problem is:
/\bhello\b/i
However this returns all four instances of 'hello', including the two contained within the URL string.
I have experimented with negative look-behinds for 'http' but so far nothing has worked. Any ideas?
Here are several solutions based on The Best Regex Trick Ever for 1) counting matches outside of a URL, 2) removing matches not in a URL, and 3) wrapping the matches with a tag outside of a URL:
s = "hello this is a regex problem http:"+"//geocities.com/hello/index.html?hello! Hello how are you!"
# Counting
p s.scan(/https?:\/\/\S*|(hello)/i).flatten.compact.count
## => 2
# Removing
p s.gsub(/(https?:\/\/\S*)|hello/i, '\1')
## => " this is a regex problem http://geocities.com/hello/index.html?hello!  how are you!"
# Wrapping with a tag
p s.gsub(/(https?:\/\/\S*)|(hello)/i) { $1 || "<span>#{$2}</span>" }
## => "<span>hello</span> this is a regex problem http://geocities.com/hello/index.html?hello! <span>Hello</span> how are you!"
You may wrap the hello pattern with word boundaries if you need to match a whole word: \bhello\b.
Notes
.scan(/https?:\/\/\S*|(hello)/i).flatten.compact.count matches a URL starting with http or https, or matches and captures hello in Group 1. .scan returns only the captured substrings, so each URL match contributes a nil; .compact removes those nil items from the flattened array, and .count returns the number of items left.
.gsub(/(https?:\/\/\S*)|hello/i, '\1') matches and captures URLs into Group 1, and also matches every hello outside a URL. Each match is replaced with \1, the backreference to Group 1, which is an empty string when only hello was matched.
s.gsub(/(https?:\/\/\S*)|(hello)/i) { $1 || "<span>#{$2}</span>" } matches and captures URLs into Group 1 and hellos into Group 2. If Group 1 matched, $1 puts the URL back into the string; otherwise Group 2 is wrapped in tags and inserted back into the string.
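The counting approach and the word-boundary advice can be combined into one small helper. This is only a sketch: the method name count_outside_urls is made up for illustration, and it assumes that a URL is anything starting with http:// or https:// and running up to the next whitespace.

```ruby
# Hypothetical helper (not from the original answer): counts whole-word,
# case-insensitive occurrences of `word` that are not inside a URL.
def count_outside_urls(text, word)
  # Regexp.escape keeps characters such as "+" or "." in `word` literal.
  pattern = /https?:\/\/\S*|\b(#{Regexp.escape(word)})\b/i
  # URL matches yield [nil]; word matches yield [word]; compact drops the nils.
  text.scan(pattern).flatten.compact.count
end

s = "hello this is a regex problem http://geocities.com/hello/index.html?hello! Hello how are you!"
puts count_outside_urls(s, "hello")                       # 2
puts count_outside_urls("say hellos, not hello", "hello") # 1
```

The URL alternative comes first so that it consumes the whole URL before the word alternative gets a chance to match inside it.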
If I'm correct, you need to get the words after the URL. You can just use whitespace (\s) as the delimiter of your string:
"http://geocities.com/hello/index.html?hello! Hello how are you!".scan(/\s(\w+)/i)
=> [["Hello"], ["how"], ["are"], ["you"]]
Or
"http://geocities.com/hello/index.html?hello! Hello how are you!".scan(/\s(hello)/i)
=> [["Hello"]]
Here, we can first match the URLs, alternated with our desired words in a capturing group, using an expression similar to:
http[^\s]+|(hello|you)
Advice
The fourth bird advises that:
I would go for the word boundaries and only hello in the group: \bhttp\S+|\b(hello)\b
I am trying to get all records where my phone field starts with a '+'
Company.where("phone LIKE ?", "+%") # returns 0 results
and for some reason it's listing zero results, even though there are records that start with '+'.
I also tried to use a \ to escape any special meaning of '+', to no avail.
However, if I try to match strings that start with +1, for example, it works as expected:
Company.where("phone LIKE ?", "+1%") # works fine
It's the quotation marks!
Company.where("phone LIKE ?", '+%')
Using ' single quotes instead of " double quotes fixed the query.
In my program, I am trying to match a string that is enclosed by two exclamation marks with a few words between them, such as "! hello my name !". The text "hello my name" can vary in the number of words: it could be just "hello", or many more words. How can I match the string between the exclamation marks? The main problem is that I cannot figure out which expression to use in the string match to represent multiple words of unknown length.
Use the pattern !([^!]*)!, in which [^!]* matches zero or more characters that aren't !.
print(string.match("! hello my name !","!([^!]*)!"))
Try also the pattern "!(.-)!".
This matches the shortest string of this form, unlike "!(.*)!", which matches the longest one.
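For comparison, the same shortest-versus-longest behaviour can be sketched in Ruby, where the lazy quantifier is written .*? rather than Lua's .- (the question itself is about Lua patterns). The sample string below is made up for illustration.

```ruby
# Made-up sample with two "!...!" groups, to show where each pattern stops.
text = "!first! middle !second!"

shortest = text[/!(.*?)!/, 1]   # lazy: stops at the first closing "!"
longest  = text[/!(.*)!/, 1]    # greedy: runs to the last "!"
negated  = text[/!([^!]*)!/, 1] # negated class: cannot cross a "!"

puts shortest # first
puts longest  # first! middle !second
puts negated  # first
```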
I have a long list of information stored in a variable and I need to run some regex expressions against that variable and get various pieces of information from what is found.
How can you store the line that matches a regex expression in a variable?
How can you get the line number of the line that matches a regex expression?
Here is an example of what I'm talking about.
body = "service timestamps log datetime msec localtime show-timezone
service password-encryption
!
hostname switch01
!
boot-start-marker"
If I search for the line that contains "hostname" I need the line number, in this case it would be 4. I also need to store the line "hostname switch01" as another variable.
Any ideas?
Thanks!
First you'd want to split the string into lines: body.split("\n") (double quotes are needed, because '\n' in single quotes is a literal backslash followed by n). Then add line numbers to the lines: .each_with_index. Then select the matching lines: .select {|line, line_nr| line =~ your_regex }. Putting it all together:
body.split("\n").each_with_index
  .select {|line, line_nr| line =~ your_regex }
  .map {|line, line_nr| line_nr }
This will give you the zero-based line numbers of all the lines matching your_regex; drop the final .map to keep the matching lines as well.
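Applied to the sample from the question, the idea might look like the sketch below. It swaps split for each_line.with_index(1) so that the numbering starts at 1 and matches the "line 4" the question expects; the variable names are illustrative only.

```ruby
# The sample text from the question, written as a heredoc.
body = <<~CONFIG
  service timestamps log datetime msec localtime show-timezone
  service password-encryption
  !
  hostname switch01
  !
  boot-start-marker
CONFIG

# each_line yields one line at a time; with_index(1) numbers them from 1.
matches = body.each_line.with_index(1).select { |line, _n| line =~ /hostname/ }

# Take the first match and strip its trailing newline.
line, line_number = matches.first
matched_line = line.chomp

puts line_number  # 4
puts matched_line # hostname switch01
```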
Let's say you have an object file that provides a #lines method:
lines = file.lines.each_with_index.select {|line, i| line =~ /regex/ }
If you already have a list of lines you can leave out the call to #lines. If you have a string you can use string.split("\n").
This will result in the variable lines containing an array of 2-element arrays with the line that matched your RegEx and the index of the line in the original file.
Breakdown
file.lines gets the lines - of course the other methods I mentioned might also apply here for you. We then add the index to each element with #each_with_index, because you want to store these as well. This has the same effect as #map.with_index {|e, i| [e, i]}, i.e. map every element to [element, index]. We then use the #select method to get all lines that do match your RegEx (FYI, =~ is the matching operator in Ruby, Perl and other languages - in case you didn't already know). We're done after that, but you might need to further transform the data so you can process it.
I have an array of email addresses and I want to delete everything after the ";".
I tried chomp, but that didn't seem to work.
How would I loop through the array and remove everything after the ";" in my view?
Try:
["aaaa;bbb"].map { |e| e.gsub(/;.*/, ';') }
From the documentation: gsub returns a copy of str with all occurrences of pattern substituted for the second argument.
This regexp matches ; and every character after it, so you have to pass ; as the second argument to keep the semicolon itself.
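A small sketch of both variants, keeping or dropping the semicolon. The addresses are made-up sample data, and sub is used instead of gsub on the assumption that only the first ; in each entry matters.

```ruby
# Made-up sample data; the second entry has no semicolon at all.
emails = ["alice@example.com;Alice", "bob@example.com"]

# Keep the semicolon, as in the answer above.
with_semicolon = emails.map { |e| e.sub(/;.*/, ';') }

# Drop the semicolon together with everything after it.
without_semicolon = emails.map { |e| e.sub(/;.*/, '') }

p with_semicolon    # ["alice@example.com;", "bob@example.com"]
p without_semicolon # ["alice@example.com", "bob@example.com"]
```

Entries without a ; pass through unchanged, because the pattern simply does not match.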