elaborate urls with regex

I have a string that may contain text with links.
I use this code to process it:
message = message.gsub(/http[s]?:\/\/[^\s]+/) do |m|
  replace_url(m)
end
If the string is "http://www.youtube.com/watch?v=6zToqLlM8ms&playnext_from=TL&videos=qpCvM5Ocr3M&feature=sub"
the code works.
But if the string is "hi my video is http://www.youtube.com/watch?v=6zToqLlM8ms&playnext_from=TL&videos=qpCvM5Ocr3M&feature=sub"
it doesn't work.
Why?
What can I do?
Thanks

Because the string does not match the pattern. Add an expression in front of the URL part to catch whatever text comes before the URL.
You can also try matching a regex pattern against every word, because a URL is always a single word separated from the rest by whitespace.
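The per-word idea can be sketched as follows. This is a minimal illustration, not the asker's actual code: replace_url here is a hypothetical stand-in for the helper from the question, and link_urls is an assumed name.

```ruby
# Hypothetical stand-in for the question's replace_url helper.
def replace_url(url)
  %(<a href="#{url}">#{url}</a>)
end

def link_urls(message)
  url_word = /\Ahttps?:\/\/\S+\z/ # the whole word must be a URL
  # Splitting on a captured group keeps the whitespace tokens,
  # so join puts the original spacing back.
  message.split(/(\s+)/).map { |word|
    word.match?(url_word) ? replace_url(word) : word
  }.join
end

puts link_urls("hi my video is http://www.youtube.com/watch?v=6zToqLlM8ms&feature=sub")
```

Words that are not URLs pass through unchanged, so the surrounding text is left as it was.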

Related

Clean up & style characters from text

I am getting text from a feed that has a lot of characters like:
Insignia&#153; 2.0 Stereo Computer Speaker System (2-Piece) - Black
4th-Generation Apple® iPod® touch
Is there an easy way to get rid of these, or do I have to anticipate which characters I want to delete and use the delete method to remove them? Also, when I try to remove
&amp;
with
str.delete("&")
it leaves behind "amp;". Is there a better way to delete this type of character? Do I need to re-encode the text?

String#delete is certainly not what you want, as it works on individual characters, not on the string as a whole.
Try
str.gsub(/&amp;/, "")
You may also want to try replacing the &amp; with a literal ampersand, such as:
str.gsub(/&amp;/, "&")
If this is closer to what you really want, you may get the best results by unescaping the HTML string. If so, try this:
CGI::unescapeHTML(str)
Details of the unescapeHTML method are in the Ruby CGI documentation.
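A small sketch of the difference between the two approaches, using a made-up sample string (raw is an assumption, not text from the feed in the question):

```ruby
require 'cgi'

raw = "Speakers &amp; Stands &lt;new&gt;" # invented sample input

# String#delete removes every "&" character and leaves the rest of the entity behind.
deleted = raw.delete("&")

# CGI.unescapeHTML turns the entities back into the characters they stand for.
unescaped = CGI.unescapeHTML(raw)

puts deleted
puts unescaped
```

Unescaping keeps the ampersand that the author of the feed intended, instead of dropping it.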

If you are getting data from a 'feed', aka RSS XML, then you should be using an XML parser like Nokogiri to process the XML. This will automatically unescape HTML entities and allow you to get the proper string representation directly.

To remove it, try the gsub method, something like this:
text = "foo&bar"
text.gsub(/\b&\b/, "") #=> "foobar"

How to use the rails match method with a string?

Is there a way to use the rails match method with a simple string, rather than a regular expression?
I'm trying to match a url as such
sometext.match('http://www.example.com')
The problem is, this is still getting evaluated as a regular expression, so I have to escape all the special characters for this to work properly, as such:
sometext.match('http:\/\/www.example\.com\?foo=bar')
If there's no way to match just a string, is there a way to escape it automatically for regular expressions?
Thanks!

If you just want to know if a string is part of another, use include?.
sometext.include?("http://www.example.com/")

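You could try something like the following. The %r{} form saves you from escaping the slashes, but the . and ? still need a backslash:
sometext =~ %r{http://example\.com\?foo=bar}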
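For the "escape it automatically" part of the question, Ruby's Regexp.escape does this. A minimal sketch, with url and sometext as made-up sample values:

```ruby
url = "http://www.example.com?foo=bar"

# Regexp.escape backslash-escapes the characters that are special in a regex,
# so the resulting pattern matches the URL literally.
pattern = Regexp.new(Regexp.escape(url))

sometext = "visit http://www.example.com?foo=bar today"
literal_hit = sometext.match?(pattern)

# With the dot escaped, an arbitrary character in its place no longer matches.
loose_hit = "http://wwwXexample.com?foo=bar".match?(pattern)

puts literal_hit
puts loose_hit
```

If only a yes/no answer is needed, include? remains the simpler choice; the escaped pattern is useful when the literal has to be embedded in a larger regex.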

Regular expression not working when put in an object

I'm trying to store regexes in a database but they're not working when used in a .sub(), even though the same regex works when used directly in .sub() as a string.
regex = Class.object.field # Class.object is an active record containing "\w*\s\/\s"
mystring = "first / second"
mystring.sub(/#{regex}/, '')
# => nil
mystring.sub(/\w*\s\/\s/, '')
# => second
Any insight appreciated!
Thanks,
Matt.
Editing to correct class/object terminology (thanks) & correcting my 2nd example as I had shown #{} wrapped around the working regex (cut & paste SNAFU).

To answer your question: It is not quite clear what kind of thing your Class.object is. If it's an ActiveRecord, it won't work.
Edit: You obviously found that the problem is Rails escaping the regexp.
An ActiveRecord cannot "contain" your regular expression directly; the regexp will be in one of the fields of your record. In which case you'd want to do something like regexp = Class.object.field_containing_the_regexp.
Even if that is not the case, I suspect that the problem is that your regexp is something other than a string. You can quickly test this by using
puts "My regexp: #{regexp}"
The string that you will see in the output will be the one that is used for the regexp.

A String is not a Regexp. You have to create a Regexp object first. Use single quotes so that the backslashes reach the regex engine untouched:
regex = Regexp.new('\w*\s\/\s')
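A minimal sketch of building the Regexp from a stored string, assuming the database column comes back as a plain Ruby String (stored_pattern is a hypothetical stand-in for the record's field):

```ruby
# Single quotes keep the backslashes, which is what a value read from a database looks like.
stored_pattern = '\w*\s\/\s'

regex = Regexp.new(stored_pattern)
result = "first / second".sub(regex, '')
puts result

# In a double-quoted literal Ruby interprets the escapes itself,
# so the regex engine never sees \w or \s.
mangled = "\w*\s\/\s"
puts mangled.inspect
```

The same stored string can also be interpolated, as in /#{stored_pattern}/, which builds an equivalent pattern.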

Turns out my regexp didn't cater for all cases: \w didn't account for symbols. After checking in the Rails console and seeing the screwy escaping, I was already halfway down the wrong track.
Thanks for the help.

How to get regex to ignore URL strings?

I have the following Regexp to create a hash of values by separating a string at a colon:
Hash["photo:chase jarvis".scan(/(.*)\:(.*)/)]
# {'photo' => 'chase jarvis'}
But I also want to be able to have URLs in the string and recognize them, so that the URL stays whole on the value side of the hash, i.e.:
Hash["photo:http://www.chasejarvis.com".scan(/(.*)\:(.*)/)]
# Results in {'photo:http' => '//www.chasejarvis.com'}
I want, of course:
Hash["photo:http://www.chasejarvis.com".scan(/ ... /)]
# {'photo' => 'http://www.chasejarvis.com'}

If you only want to match up to first colon you could change (.*)\:(.*) to ([^:]*)\:(.*).
Alternatively, you could make it a non-greedy match, but I prefer saying "not colon".
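Both variants can be sketched like this, using the input string from the question (the variable names are only for illustration):

```ruby
input = "photo:http://www.chasejarvis.com"

# "Not colon": the key stops at the first colon because [^:] cannot cross it.
not_colon = Hash[input.scan(/([^:]*):(.*)/)]

# Non-greedy: (.*?) takes as little as possible, so it also stops at the first colon.
non_greedy = Hash[input.scan(/(.*?):(.*)/)]

puts not_colon.inspect
puts non_greedy.inspect
```

In both cases the second group is still greedy, so everything after the first colon, including the colon inside the URL, ends up in the value.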

How do you figure out a person's family name and first name?
Changing chasejarvis to chase and jarvis might not be possible unless you have a solution for that.
Do you already know everyone's name in your project? Does nobody have a middle initial, as in charvisdjarvis (assuming the name is "Charvis D. Jarvis")?

I am creating a Twitter clone in Ruby On Rails, how do I code it so that the '#...''s in the 'tweets' turn into links?

I am somewhat of a Rails newbie so bear with me, I have most of the application figured out except for this one part.

def linkup_mentions_and_hashtags(text)
  text.gsub!(/@([\w]+)(\W)?/, '<a href="http://twitter.com/\1">@\1</a>\2')
  text.gsub!(/#([\w]+)(\W)?/, '<a href="http://twitter.com/search?q=%23\1">#\1</a>\2')
  text
end
I found this example here: http://github.com/jnunemaker/twitter-app
The link to the helper method: http://github.com/jnunemaker/twitter-app/blob/master/app/helpers/statuses_helper.rb

Perhaps you could use Regular Expressions to look for "#..." and then replace the matches with the corresponding link?

You could use a regular expression to search for #sometext{whitespace_or_endofstring}

You can use regular expressions. I don't know Ruby, but the code should be almost exactly like my example:
Regex.Replace("this is an example @AlbertEin",
    "(?<type>[@#])(?<nick>\\w{1,}[^ ])",
    "<a href=\"http://twitter.com/${nick}\">${type}${nick}</a>");
This example would return
this is an example <a href="http://twitter.com/AlbertEin">@AlbertEin</a>
if you run it on .NET.
The regex (?<type>[@#])(?<nick>\\w{1,}[^ ]) means: capture the leading @ or # and name it type, then capture and name it nick the text that follows, which must contain at least one word character, until you find a white space.

Perhaps you can use a regular expression to parse out the words starting with #, then update the string at that location with the proper link.
This regular expression will give you words starting with the # symbol, but you might have to tweak it:
#\S+

You would use a regular expression to search for @username and then turn that into the corresponding link.
I use the following for the @ in PHP:
$ret = preg_replace("#(^|[\n ])@([^ \"\t\n\r<]*)#ise",
    "'\\1<a href=\"http://www.twitter.com/\\2\" >@\\2</a>'",
    $ret);

I've also been working on this. I'm not sure that it's 100% perfect, but it seems to work:
def auto_link_twitter(txt, options = {:target => "_blank"})
  txt.scan(/(^|\W|\s+)(@|#)(\w{1,25})/).each do |match|
    if match[1] == "#"
      txt.gsub!(/##{match.last}/, link_to("##{match.last}", "http://twitter.com/search/?q=##{match.last}", options))
    elsif match[1] == "@"
      txt.gsub!(/@#{match.last}/, link_to("@#{match.last}", "http://twitter.com/#{match.last}", options))
    end
  end
  txt
end
I pieced it together with some Google searching and some reading up on String#scan in the API docs.
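The same idea can be sketched in plain Ruby with a single gsub and a block, so it runs outside Rails. This is only an illustration under assumptions: link_mentions_and_hashtags is a hypothetical name, the URL shapes are modeled on the ones used in this thread, and HTML-escaping of the surrounding text is left out.

```ruby
def link_mentions_and_hashtags(text)
  # Group 1: start of string or a non-word character, so "a@b.com" is not treated as a mention.
  # Group 2: the sigil (@ or #). Group 3: the name, 1 to 25 word characters.
  text.gsub(/(^|\W)([@#])(\w{1,25})/) do
    prefix, sigil, name = $1, $2, $3
    url = if sigil == '@'
            "http://twitter.com/#{name}"              # profile link for a mention
          else
            "http://twitter.com/search?q=%23#{name}"  # search link for a hashtag
          end
    %(#{prefix}<a href="#{url}">#{sigil}#{name}</a>)
  end
end

puts link_mentions_and_hashtags("hi @matz see #ruby")
```

Doing the replacement in one pass avoids rescanning text that has already been turned into a link, and the non-mutating gsub leaves the caller's string untouched.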
