How can I identify and process all URLs in a text string?

I would like to enumerate all the URLs in a text string, for example:
text = "fasòls http://george.it sdafsda"
For each URL found, I want to invoke a function method(...) that transforms the string.
Right now I'm using a method like this:
msg = ""
for i in text.split
if (i =~ URI::regexp).nil?
msg += " " + i
else
msg+= " " + method(i)
end
end
text = msg
This works, but it's slow for long strings. How can I speed this up?

I think "gsub" is your friend here:
class UrlParser
attr_accessor :text, :url_counter, :urls
def initialize(text)
@text = parse(text)
end
private
def parse(text)
@counter = 0
@urls = []
text.gsub(%r{(\A|\s+)(http://[^\s]+)}) do
@urls << $2
"#{$1}#{replace_url($2)}"
end
end
def replace_url(url)
@counter += 1
"[#{@counter}]"
end
end
parsed_url = UrlParser.new("one http://x.com/url two")
puts parsed_url.text
puts parsed_url.urls
If you really need extra-fast parsing of long strings, you should build a Ruby C extension with Ragel.
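A shorter variant of the same gsub idea, as a sketch only. Here `shorten` is a hypothetical stand-in for the asker's method(...), and `URL_PATTERN` is a deliberately loose assumption (everything from http:// or https:// up to the next whitespace), not a full URL grammar. The point is that gsub with a block scans the string once instead of splitting it and rebuilding it word by word:

```ruby
# Hypothetical transformation; replace with whatever method(...) really does.
def shorten(url)
  "<#{url}>"
end

# Assumption: a URL is "http(s)://" followed by non-whitespace characters.
URL_PATTERN = %r{https?://\S+}

# gsub with a block calls the block once per match and substitutes its result.
def process_urls(text)
  text.gsub(URL_PATTERN) { |url| shorten(url) }
end

puts process_urls("one http://x.com/url two")
```

Unlike the split-and-join loop in the question, this keeps the original whitespace between words intact.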

Related

Finding letters that are near, exact or not in a user input string

I am currently developing a small modified version of Hangman in Rails for children. The game starts by randomly picking a word from a text file, and the user has to guess the word by entering a four-letter word. Each word is then split into its characters, for example "r", "e", "a", "l", and the game returns a message saying how close each letter is to the word.
Random Generated word is "real"
Input
rlax
Output
Correct, Close, Correct, Incorrect
I have tried other things which I found online, but they haven't worked, and I am fairly new to Ruby and Rails. Hopefully someone can guide me in the right direction.
Here is some code
def letterCheck(lookAtLetter)
lookAHead = lookAtLetter =~ /[[:alpha:]]/
end
def displayWord
$ranWordBool.each_index do |i|
if($ranWordBool[i])
print $ranWordArray[i]
$isWin += 1
else
print "_"
end
end
end
def gameLoop
turns = 10
turnsLeft = 0
lettersUsed = []
while(turnsLeft < turns)
$isWin = 0
displayWord
if($isWin == $ranWordBool.length)
system "cls"
puts "1: Quit"
puts "The word is #{$ranWord} and You Win"
puts "Press any key to continue"
return
end
print "\n" + "Words Used: "
lettersUsed.each_index do |looper|
print " #{lettersUsed[looper]} "
end
puts "\n" + "Turns left: #{turns - turnsLeft}"
puts "Enter a word"
input = gets.chomp
system "cls"
if(input.length != 4)
puts "Please enter 4 lettered word"
elsif(letterCheck(input))
if(lettersUsed.include?(input))
puts "#{input} already chosen"
elsif($ranWordArray.include?(input))
puts "Close"
$ranWordArray.each_index do |i|
if(input == $ranWordArray[i])
$ranWordBool[i] = true
end
if($ranWordBool[i] = true)
puts "Correct"
else
puts "Incorrect"
end
end
else
lettersUsed << input
turnsLeft += 1
end
else
puts "Not a letter"
end
end
puts "You lose"
puts "The word was #{$ranWord}"
puts "Press any key to continue"
end
words = []
File.foreach('words.txt') do |line|
words << line.chomp
end
while(true)
$ranWord = words[rand(words.length) + 1]
$ranWordArray = $ranWord.chars
$ranWordBool = []
$ranWordArray.each_index do |i|
$ranWordBool[i] = false
end
system "cls"
gameLoop
input = gets.chomp
shouldQuit(input)
end
Something like this:
# Picking random word to guess
word = ['open', 'real', 'hang', 'mice'].sample
loop do
puts "So, guess the word:"
input_word = gets.strip
if word == input_word
puts("You are right, the word is: #{input_word}")
break
end
puts "You typed: #{input_word}"
# Split both the word to guess and the suggested word into array of letters
word_in_letters = word.split('')
input_in_letters = input_word.split('')
result = []
# Iterate over each letter in the word to guess
word_in_letters.each_with_index do |letter, index|
# Pick the corresponding letter in the entered word
letter_from_input = input_in_letters[index]
if letter == letter_from_input
result << "#{letter_from_input} - Correct"
next
end
# Take nearby letters by nearby indexes
# `reject` is here to skip negative indexes
# ie: letter 'i' in a word "mice"
# this will return 'm' and 'c'
# ie: letter 'm' in a word "mice"
# this will return 'i'
letters_around =
[index - 1, index + 1]
.reject { |i| i < 0 }
.map { |i| word_in_letters[i] }
if letters_around.include?(letter_from_input)
result << "#{letter_from_input} - Close"
next
end
result << "#{letter_from_input} - Incorrect"
end
puts result.join("\n")
end
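Note that the answer above treats "Close" as "matches a neighbouring letter". The example in the question (rlax against real giving Correct, Close, Correct, Incorrect) seems to mean "the letter occurs somewhere else in the word" instead. Under that assumption, a minimal sketch could look like this (`letter_feedback` is a name made up for illustration):

```ruby
# Returns one verdict per guessed letter.
# Assumption: "Close" means the letter is in the word, but at another position.
def letter_feedback(word, guess)
  guess.chars.each_with_index.map do |letter, index|
    if word[index] == letter
      "Correct"
    elsif word.include?(letter)
      "Close"
    else
      "Incorrect"
    end
  end
end

puts letter_feedback("real", "rlax").join(", ")
```

This does not handle repeated letters specially; a guess with a doubled letter would report "Close" for each occurrence.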

Test-first-ruby 13_xml_document

I am working on test-first-ruby-master (you can find it at https://github.com/appacademy/test-first-ruby).
The 13_xml_document_spec.rb is the Rspec test that my code must pass. This test has several tasks, but it is the last one (called "indents") that my code doesn't accomplish.
Here is the Rspec test:
require "13_xml_document"
describe XmlDocument do
before do
@xml = XmlDocument.new
end
it "renders an empty tag" do
expect(@xml.hello).to eq("<hello/>")
end
it "renders a tag with attributes" do
expect(@xml.hello(:name => "dolly")).to eq('<hello name="dolly"/>')
end
it "renders a randomly named tag" do
tag_name = (1..8).map{|i| ("a".."z").to_a[rand(26)]}.join
expect(@xml.send(tag_name)).to eq("<#{tag_name}/>")
end
it "renders block with text inside" do
expect(@xml.hello { "dolly" }).to eq("<hello>dolly</hello>")
end
it "nests one level" do
expect(@xml.hello { @xml.goodbye }).to eq("<hello><goodbye/></hello>")
end
it "nests several levels" do
xml = XmlDocument.new
xml_string = xml.hello do
xml.goodbye do
xml.come_back do
xml.ok_fine(:be => "that_way")
end
end
end
expect(xml_string).to eq('<hello><goodbye><come_back><ok_fine be="that_way"/></come_back></goodbye></hello>')
end
it "indents" do
@xml = XmlDocument.new(true)
xml_string = @xml.hello do
@xml.goodbye do
@xml.come_back do
@xml.ok_fine(:be => "that_way")
end
end
end
expect(xml_string).to eq(
"<hello>\n" +
" <goodbye>\n" +
" <come_back>\n" +
" <ok_fine be=\"that_way\"/>\n" +
" </come_back>\n" +
" </goodbye>\n" +
"</hello>\n"
)
end
end
And here is my code:
class XmlDocument
def initialize(indentation = false)
@indentation = indentation
@counter = 0
end
def method_missing(method, *args, &block)
hash = {}
if block
if @indentation == false
"<#{method}>#{yield}</#{method}>"
elsif @indentation == true
string = ""
string << indent1
string << "<#{method}>\n"
(###)
add_indent
string << indent1
string << yield + "\n"
sub_indent
string << indent2
string << "</#{method}\>"
string
end
elsif args[0].is_a?(Hash)
args[0].map { |key,value| "<#{method.to_s} #{key.to_s}=\"#{value.to_s}\"/>" }.join(" ")
elsif hash.empty?
"<#{method.to_s}/>"
end
end
def indent1
" " * @counter
end
def indent2
" " * @counter
end
def add_indent
@counter += 1
end
def sub_indent
@counter -= 1
end
end
This is the output I get for the "indents" part:
<hello>
<goodbye>
<come_back>
+ <ok_fine be="that_way"/>
</come_back>
</goodbye>
</hello>
Contrary to the expected output, the 4th line ('ok_fine be="that_way"/') seems to be two indents further to the left than it is supposed to be. Unlike the rest of the lines, the 4th line does not come from a block: it is a method call with an argument, made inside the block passed to 'come_back'.
I cannot see where my mistake is. Even writing an exception in the code (where the (###) is in my code) doesn't seem to have any effect on the 4th line.
Here is the exception (###):
if args[0].is_a?(Hash)
add_indent
string << indent
arg[0].map{|key, value| string << "<#{method.to_s} #{key.to_s}=\"#{value.to_s}\"/>"}
end
NOTE: I assume that if I manage to give the 4th line the right numbers of indents, that also will increase the number of indents of the lines after it, so the method 'indent2' will need to be modified.
I figured out what the problem was. As I said in my question, in the Rspec test they have the following input:
xml_string = xml.hello do
xml.goodbye do
xml.come_back do
xml.ok_fine(:be => "that_way")
end
end
end
where the 4th line (xml.ok_fine(:be => "that_way")) doesn't have a nested block, but an argument. In my code I established a condition (if block) for when a block is present and, inside this first condition, a second condition (if @indentation == true) for when @indentation is true:
if block
if @indentation == false
"<#{method}>#{yield}</#{method}>"
elsif @indentation == true
...
It is inside this second condition that I create the variable 'string' where I shovel in the different parts:
elsif @indentation == true
string = ""
string << indent1
string << "<#{method}>\n"
(###)
add_indent
string << indent1
string << yield + "\n"
sub_indent
string << indent2
string << "</#{method}\>"
string
end
But because the 4th line doesn't carry a block, the first condition (if block) doesn't return true for it and therefore this 4th line is skipped.
I've re-written my code so now it passes the Rspec test:
class XmlDocument
def initialize(indentation = false)
@indentation = indentation
@counter = 0
end
def method_missing(method, args = nil, &block)
string = ""
arguments = args
if @indentation == false
if (arguments == nil) && (block == nil)
"<#{method.to_s}/>"
elsif arguments.is_a?(Hash)
arguments.map { |key,value| "<#{method.to_s} #{key.to_s}=\"#{value.to_s}\"/>" }.join(" ")
elsif block
"<#{method}>#{yield}</#{method}>"
end
elsif @indentation == true
if (block) || (arguments.is_a?(Hash))
string << indent1
string << "<#{method}>\n" unless !block
add_indent
string << indent1 unless !block
if block
string << yield + "\n"
elsif arguments.is_a?(Hash)
arguments.map { |key,value| string << "<#{method.to_s} #{key.to_s}=\"#{value.to_s}\"/>" }
end
sub_indent
string << indent2 unless !block
string << "</#{method}\>" unless !block
if indent2 == ""
string << "\n"
end
end
string
end
end
def indent1
" " * @counter
end
def indent2
" " * @counter
end
def add_indent
@counter += 1
end
def sub_indent
@counter -= 1
end
end
In contrast to the code in my question, the two main conditions here are @indentation == false and @indentation == true, and inside each of them I handle the different cases (block or no block, argument or no argument...). Specifically, for elsif @indentation == true I created a condition that accepts the 4th line: if (block) || (arguments.is_a?(Hash)). In other words, it accepts method calls that have a block or an argument (specifically a hash).
Now, I shovel in the different parts in 'string', and when I reach a block to yield there is a bifurcation:
if block
string << yield + "\n"
elsif arguments.is_a?(Hash)
arguments.map { |key,value| string << "<#{method.to_s} #{key.to_s}=\"#{value.to_s}\"/>" }
If there is a block I yield it, and if there is an argument that is a hash I shovel it into 'string'.
Also, the unless !block guard appears wherever I indent or shovel in a tag, because otherwise a method call without a block (like the 4th line) would introduce unwanted indents and '\n'.
Finally, I had to add at the end
if indent2 == ""
string << "\n"
end
because the solution requires a '\n' at the end.
I hope this answer can help others.
NOTE: In my question I assumed I would have to modify 'indent2'. That turned out to be unnecessary: the output I was getting simply did not include the 4th line (because it doesn't have a block), so the bigger indentation (" ") of 'indent2' is all right.
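For comparison, here is a more compact sketch that treats "tag with a block" and "tag with only attributes" uniformly, so the attribute-only line is indented like every other line. This is a different structure from the answer's code, not a cleaned-up copy of it. The class name `IndentingXml`, the `@depth` counter and the two-space indent width are all assumptions made for illustration:

```ruby
class IndentingXml
  def initialize(indent = false)
    @indent = indent
    @depth = 0 # current nesting level; only used when @indent is true
  end

  # Every unknown method name becomes a tag; an optional hash becomes attributes.
  def method_missing(name, attrs = {}, &block)
    attr_text = attrs.map { |key, value| " #{key}=\"#{value}\"" }.join
    pad = @indent ? "  " * @depth : "" # assumed width: two spaces per level
    newline = @indent ? "\n" : ""

    # No block: self-closing tag, padded exactly like a block tag would be.
    return "#{pad}<#{name}#{attr_text}/>#{newline}" unless block

    @depth += 1
    inner = block.call
    @depth -= 1
    "#{pad}<#{name}#{attr_text}>#{newline}#{inner}#{pad}</#{name}>#{newline}"
  end
end

doc = IndentingXml.new(true)
puts doc.hello { doc.goodbye { doc.ok_fine(:be => "that_way") } }
```

The padding is computed before the depth is incremented, so each tag is written at its own level and its children one level deeper. Plain text returned from a block (as in hello { "dolly" }) is not padded in indent mode; the sketch only covers nested tags.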

Ruby function for finding longest_word

I have written a Ruby function that checks each word in a sentence and returns the longest word together with its length.
My question is: I only want it to return the word. I do not want it to return both the word and its length.
Please help and explain how I can make it better.
Thank you
def longest_word(sentence)
words = sentence.split(" ")
frequencies = Hash.new(0)
words.each {|x| frequencies[x] = x.length}
frequencies.max_by{|k,v| v}
end
puts longest_word("short longest")
max_by is the correct idea, but your method can be simpler:
def longest_word(sentence)
sentence.split(/\s+/).max_by(&:size)
end
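To see why the original version returns both values: calling max_by on a Hash iterates over [key, value] pairs and returns the winning pair, whereas calling it on the array of words returns just the word. A small sketch (the name `longest_word_pair` is made up here to show the original behaviour side by side):

```ruby
# Mirrors the question's approach: max_by on a Hash yields [word, length] pairs,
# so the result is a two-element array.
def longest_word_pair(sentence)
  lengths = sentence.split.to_h { |word| [word, word.length] }
  lengths.max_by { |_word, length| length }
end

# max_by on the word array itself returns only the word.
def longest_word(sentence)
  sentence.split.max_by(&:length)
end

p longest_word_pair("short longest") # => ["longest", 7]
p longest_word("short longest")      # => "longest"
```

Appending .first to the hash version would also give just the word, but the intermediate hash is not needed at all.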
def longest_word(sentence)
sentence.split.max{|a,b| a.length <=> b.length }
end
puts longest_word("short longest longerer longer")
def longest_word(sentence)
longest_word = ''
sentence.split(' ').each do |word|
if word.length > longest_word.length
longest_word = word
end
end
longest_word
end
def longest_word(sentence)
words = sentence.split(" ").sort_by(&:length)[-1]
end
sentence = %w{I have longest word}
def longest_word(sentence)
longest_word = ''
sentence.each do |word|
longest_word = word if longest_word.length < word.length
end
puts longest_word
end
longest_word(sentence)
=> longest

Escape non HTML tags in plain text (convert plain text to HTML)

Using Rails, I need to take plain text and show it as HTML, but I don't want to use the <pre> tag, as it changes the format.
I needed to subclass HTML::WhiteListSanitizer to escape non-whitelisted tags (by changing process_node), monkey patch HTML::Node so that it doesn't downcase tag names, and monkey patch HTML::Text to apply <wbr /> word splitting:
class Text2HTML
def self.convert text
text = simple_format text
text = auto_link text, :all, :target => '_blank'
text = NonHTMLEscaper.sanitize text
text
end
# based on http://www.ruby-forum.com/topic/87492
def self.wbr_split str, len = 10
fragment = /.{#{len}}/
str.split(/(\s+)/).map! { |word|
(/\s/ === word) ? word : word.gsub(fragment, '\0<wbr />')
}.join
end
protected
extend ActionView::Helpers::TagHelper
extend ActionView::Helpers::TextHelper
extend ActionView::Helpers::UrlHelper
class NonHTMLEscaper < HTML::WhiteListSanitizer
self.allowed_tags << 'wbr'
def self.sanitize *args
self.new.sanitize *args
end
protected
# Copy, just to reference this Node definition
def tokenize(text, options)
options[:parent] = []
options[:attributes] ||= allowed_attributes
options[:tags] ||= allowed_tags
tokenizer = HTML::Tokenizer.new(text)
result = []
while token = tokenizer.next
node = Node.parse(nil, 0, 0, token, false)
process_node node, result, options
end
result
end
# gsub <> instead of returning nil
def process_node(node, result, options)
result << case node
when HTML::Tag
if node.closing == :close
options[:parent].shift
else
options[:parent].unshift node.name
end
process_attributes_for node, options
options[:tags].include?(node.name) ? node : node.to_s.gsub(/</, "&lt;").gsub(/>/, "&gt;")
else
bad_tags.include?(options[:parent].first) ? nil : node.to_s
end
end
class Text < HTML::Text
def initialize(parent, line, pos, content)
super parent, line, pos, content
@content = Text2HTML.wbr_split content
end
end
# remove tag/attributes downcases and reference this Text
class Node < HTML::Node
def self.parse parent, line, pos, content, strict=true
if content !~ /^<\S/
Text.new(parent, line, pos, content)
else
scanner = StringScanner.new(content)
unless scanner.skip(/</)
if strict
raise "expected <"
else
return Text.new(parent, line, pos, content)
end
end
if scanner.skip(/!\[CDATA\[/)
unless scanner.skip_until(/\]\]>/)
if strict
raise "expected ]]> (got #{scanner.rest.inspect} for #{content})"
else
scanner.skip_until(/\Z/)
end
end
return HTML::CDATA.new(parent, line, pos, scanner.pre_match.gsub(/<!\[CDATA\[/, ''))
end
closing = ( scanner.scan(/\//) ? :close : nil )
return Text.new(parent, line, pos, content) unless name = scanner.scan(/[^\s!>\/]+/)
unless closing
scanner.skip(/\s*/)
attributes = {}
while attr = scanner.scan(/[-\w:]+/)
value = true
if scanner.scan(/\s*=\s*/)
if delim = scanner.scan(/['"]/)
value = ""
while text = scanner.scan(/[^#{delim}\\]+|./)
case text
when "\\" then
value << text
value << scanner.getch
when delim
break
else value << text
end
end
else
value = scanner.scan(/[^\s>\/]+/)
end
end
attributes[attr] = value
scanner.skip(/\s*/)
end
closing = ( scanner.scan(/\//) ? :self : nil )
end
unless scanner.scan(/\s*>/)
if strict
raise "expected > (got #{scanner.rest.inspect} for #{content}, #{attributes.inspect})"
else
# throw away all text until we find what we're looking for
scanner.skip_until(/>/) or scanner.terminate
end
end
HTML::Tag.new(parent, line, pos, name, attributes, closing)
end
end
end
end
end
end
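The word-splitting helper is the one piece that can be tried without Rails. The following standalone copy of wbr_split (same logic as above, lifted out of the class purely for illustration) shows what it does to a long word:

```ruby
# Inserts <wbr /> after every `len` characters of each non-whitespace run,
# leaving the whitespace between words untouched.
def wbr_split(str, len = 10)
  fragment = /.{#{len}}/
  # The capturing group in split keeps the whitespace separators in the array.
  str.split(/(\s+)/).map { |word|
    word.match?(/\s/) ? word : word.gsub(fragment, '\0<wbr />')
  }.join
end

puts wbr_split("abcdefghijklmnop qr", 5)
# prints: abcde<wbr />fghij<wbr />klmno<wbr />p qr
```

Words shorter than len pass through unchanged, which is why "qr" is untouched here.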

Ruby way to Check for string palindrome

I want to check whether a string is a palindrome using Ruby code.
I am new to Ruby, so I am not too well acquainted with its string methods.
If you are not acquainted with Ruby's String methods, you should have a look at the documentation, it's very good. Mithun's answer already showed you the basic principle, but since you are new to Ruby, there's a couple more things to keep in mind:
*) If you have a predicate method, it's customary to name it with a trailing question mark, e.g. palindrome?.
*) Boolean expressions evaluate to a boolean, so you don't need to explicitly return true or false. Hence a short idiomatic version would be
def palindrome?(str)
str == str.reverse
end
*) Since Ruby's classes are open, you could add this to the string class:
class String
def palindrome?
self == self.reverse
end
end
*) If you don't want to monkey-patch String, you can directly define the method on single object (or use a module and Object#extend):
foo = "racecar"
def foo.palindrome?
self == self.reverse
end
*) You might want to make the palindrome check a bit more complex, e.g. when it comes to case or whitespace, so you are also able to detect palindromic sentences, capitalized words like "Racecar" etc.
pal = "Never a foot too far, even."
class String
def palindrome?
letters = self.downcase.scan(/\w/)
letters == letters.reverse
end
end
pal.palindrome? #=> true
def check_palindromic(variable)
if variable.reverse == variable #Check if string same when reversed
puts "#{ variable } is a palindrome."
else # If string is not the same when reversed
puts "#{ variable } is not a palindrome."
end
end
The recursive solution shows how strings can be indexed in Ruby:
def palindrome?(string)
if string.length == 1 || string.length == 0
true
else
if string[0] == string[-1]
palindrome?(string[1..-2])
else
false
end
end
end
If reading the Ruby string documentation is too boring for you, try playing around with the Ruby practice questions on CodeQuizzes and you will pick up most of the important methods.
def is_palindrome(value)
value.downcase!
# Reverse the string
reversed = ""
count = value.length
while count > 0
count -= 1
reversed += value[count]
end
# Instead of writing codes for reverse string
# we can also use reverse ruby method
# something like this value == value.reverse
if value == reversed
return "#{value} is a palindrome"
else
return "#{value} is not a palindrome"
end
end
puts "Enter a Word"
a = gets.chomp
p is_palindrome(a)
class String
def palindrome?
self.downcase == self.reverse.downcase
end
end
puts "racecar".palindrome? # true
puts "Racecar".palindrome? # true
puts "mississippi".palindrome? # false
str= gets.chomp
str_rev=""
n=1
while str.length >=n
str_rev+=str[-n]
n+=1
end
if str_rev==str
puts "YES"
else
puts "NO"
end
> first method
a= "malayalam"
if a == a.reverse
puts "a is true"
else
puts "false"
end
> second one
a= "malayalam"
a=a.split("")
i=0
ans=[]
a.count.times do
i=i+1
k=a[-(i)]
ans << k
end
if a== ans
puts "true"
else
puts "false"
end
def palindrome?(string)
string[0] == string[-1] && (string.length <= 2 || palindrome?(string[1..-2]))
end
**Solution 1** Time complexity = O(n), Space complexity = O(n)
This solution does not use the reverse method of the String class. It uses a stack (an array that only allows elements to enter and leave at one end can mimic a stack).
def is_palindrome(str)
stack = []
reversed_str = ''
str.each_char do |char|
stack << char
end
until stack.empty?
reversed_str += stack.pop
end
if reversed_str == str
return true
else
return false
end
end
**Solution 2** Time complexity = O(n), Space complexity = O(1)
def inplace_reversal!(str)
i =0
j = str.length - 1
while i < j
temp = str[i]
str[i] = str[j]
str[j] = temp
i+=1
j-=1
end
return str
end
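Solution 2 above only shows the in-place reversal, which changes the string it is given. If the goal is just the palindrome check with no extra copy, the same two-index walk can compare characters instead of swapping them. A sketch, with `palindrome_two_pointer?` as a made-up name:

```ruby
# Walks inward from both ends and stops at the first mismatch.
# Does not modify the string and does not build a reversed copy.
def palindrome_two_pointer?(str)
  i = 0
  j = str.length - 1
  while i < j
    return false unless str[i] == str[j]
    i += 1
    j -= 1
  end
  true
end

puts palindrome_two_pointer?("racecar") # true
puts palindrome_two_pointer?("ruby")    # false
```

Like the other simple versions here, this is case-sensitive and treats spaces and punctuation as ordinary characters.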
def palindrome?(str)
return "Please pass the string" if str.nil?
str = str.downcase
str_array = str.split('')
reverse_string = str_array.each_index.map { |index| str_array[str_array.count - index - 1] }
return ("String #{str} is not a palindrome") unless str == reverse_string.join('')
"String #{str} is palindrome"
end
