So, I have a string like this:
str1 = "blablablabla... original_url=\"https://facebook.com/125642\"> ... blablablabla..."
what is the best approach to extract this original_url?
what I have done so far is this:
original_url = str1['content'][str1['content'].index('original_url')+12..str1['content'].index('>')-2]
it works, but it seems such like a poor solution, mostly I'm stuggling to find this substring /">
here's what I have tried so far
str1.index('\">')
str1.index('\\">') # escaping only one backslach
str1.index('\\\">') # escaping both back slash and "
str1.index("\\\">") # was just without idea over here
I'm not a ruby programmer, so I'm kinda lost here
The best approach to parse xml namespaces is to use Nokogiri as suggested by #spickermann.
Quick but not elegant and not even efficient solutions:
str1 = "blablablabla... original_url=\"https://facebook.com/125642\"> ... blablablabla..."
original_url=str1[str1.index("original_url")+14...str1.index("\">")]
# => "https://facebook.com/125642"
original_url=str1.split(/original_url=\"/)[1].split(/">/).first
# => "https://facebook.com/125642"
Related
due to using nginx lua (kong gateway)
would like to replace
from
{"body":"\r\n \"username\": \"sampleUser\",\r\n \"password\": \"samplePassword\",\r\n"}
to
{"body":"\r\n \"username\": \"sampleUser\",\r\n \"password\": \"***REDACTED***\",\r\n"}
from
username=sampleUser&password=samplePassword
to
username=sampleUser&password=***REDACTED***
from
"password" : "krghfkghkfghf"
to
"password" : "***REDACTED***"
i did try on sample https://stackoverflow.com/a/16638937/712063
local function replacePass(configstr, newpass)
return configstr:gsub("(%[\"password\"%]%s*=%s*)%b\"\"", "%1\"" .. newpass .. "\"")
end
it does not work, any recommended reference ?
maybe something like this:
local t = {
[[{"body":"\r\n \"username\": \"sampleUser\",\r\n \"password\": \"samplePassword\",\r\n"} ]],
[[username=sampleUser&password=samplePassword]],
[["password" : "krghfkghkfghf"]]
}
local newpass ="***REDACTED***"
for k,example in pairs(t) do
res = example:gsub('(password[\\" :]+)(.-)([\\"]+)',"%1"..newpass.."%3")
res = res:gsub("(password=).+","%1"..newpass)
print(res)
end
out:
{"body":"\r\n \"username\": \"sampleUser\",\r\n \"password\": \"***REDACTED***\",\r\n"}
username=sampleUser&password=***REDACTED***
"password" : "***REDACTED***"
There's several things wrong with that at first sight.
1. You only check for =
In your test data you seem to have cases with key = value as well as key : value but your pattern only checks for =; replace that with [=:] for a simple fix.
2. You can't balance quotation marks
I don't know how you expect this to work, but [[%b""]] just finds the shortest possible string between two " characters, just as [[".-"]] would. If that's your intention, then there's nothing wrong with writing it using %b though.
3. Just don't
As I don't know the context, I can't say if this really is a bad idea or not, but it does seem like a very brittle solution. If this is an option, I would recommend considering the alternative and going with something more robust.
As for what a better alternative could look like, I can't say without knowledge of your requirements. Maybe normalizing the data into a Lua table and replacing the password key in a uniform way? This would make sure that the data is either sanitized, or errors during parsing.
Beyond that, it would help if you told us how it doesn't work. It's easy to miss bugs when reading someone elses code, but knowing how the code misbehaves can help a lot with actually spotting the problem.
EDIT:
4. You didn't even remove the brackets
You didn't even remove the [] from that other stack overflow answer. Obviously that doesn't work without any modifications.
A while ago I created a function in PHP to "twitterize" the text of tweets pulled via Twitter's API.
Here's what it looked like:
function twitterize($tweet){
$patterns = array ( "/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+#)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%#.\w_]*)#?(?:[\w]*))?)/",
"/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/",
"/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/");
$replacements = array ("<a href='\\0' target='_blank'>\\0</a>", "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>", "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>");
return preg_replace($patterns, $replacements, $tweet);
}
Now I'm a little stuck with Ruby's gsub, I tried:
def twitterize(text)
patterns = ["/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+#)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%#.\w_]*)#?(?:[\w]*))?)/", "/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/", "/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/"]
replacements = ["<a href='\\0' target='_blank'>\\0</a>",
"<a href='http://twitter.com/\\1' target='_blank'>\\0</a>",
"<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]
return text.gsub(patterns, replacements)
end
Which obviously didn't work and returned an error:
No implicit conversion of Array into String
And after looking at the Ruby documentation for gsub and exploring a few of the examples they were providing, I still couldn't find a solution to my problem: How can I have gsub handle multiple patterns and multiple replacements at once?
Well, as you can read from the docs, gsub does not handle multiple patterns and replacements at once. That's what causing your error, quite explicit otherwise (you can read that as "give me a String, not an Array!!1").
You can write that like this:
def twitterize(text)
patterns = [/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+#)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%#.\w_]*)#?(?:[\w]*))?)/, /(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/, /(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/]
replacements = ["<a href='\\0' target='_blank'>\\0</a>",
"<a href='http://twitter.com/\\1' target='_blank'>\\0</a>",
"<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]
patterns.each_with_index do |pattern, i|
text.gsub!(pattern, replacements[i])
end
text
end
This can be refactored into more elegant rubyish code, but I think it'll do the job.
The error was because you tried to use an array of replacements in the place of a string in the gsub function. Its syntax is:
text.gsub(matching_pattern,replacement_text)
You need to do something like this:
replaced_text = text.gsub(pattern1, replacement1)
replaced_text = replaced_text.gsub(pattern2, replacement2)
and so on, where the pattern 1 is one of your matching patterns and replacement is the replacement text you would like.
I'm trying to pass more than one regex parameter for parts of a string that needs to be replaced. Here's the string:
str = "stands in hall "Let's go get to first period everyone" Students continue moving to seats."
Here is the expected string:
str = "stands in hall "Let's go get to first period everyone" Students continue moving to seats."
This is what I tried:
str.gsub(/'|"/, "'" => "\'", """ => "\"")
This is what I got:
"stands in hall \"Let's go get to first period everyone\" Students continue moving to seats."
How do I get the quotes in while sending in two regex parameters using gsub?
This is an HTML unescaping problem.
require 'cgi'
CGI.unescape_html(str)
This gives you the correct answer.
From my comments on this question:
Your updated version is correct. The only reason the slashes are in your final line of code is that it's an escape sequence so that you don't mistakenly think the first slash is used to terminate the string. Try assigning your output and printing it:
str1 = str.gsub(/'|"/, "'" => "\'", """ => "\"")
puts str1
and you'll see that the slashes are gone when str1 is printed using puts.
The difference is that autoevaluating variables within irb (which is what I assume you're doing to execute this sample code) automatically calls the inspect method, which for string variables shows the string in its entirety.
Because I did not understand unescaping characters I found an alternative solution that might be the "rails-way"
Can you use <%= raw 'some_html' %>
My final solution ended up being this instead of messy regex and requiring CGI
<%= raw evidence_score.description %>
Unescaping HTML string in Rails
I'm having trouble with a regex in Ruby (on Rails). I'm relatively new to this.
The test string is:
http://www.xyz.com/017010830343?$ProdLarge$
I am trying to remove "$ProdLarge$". In other words, the $ signs and anything between.
My regular expression is:
\$\w+\$
Rubular says my expression is ok. http://rubular.com/r/NDDQxKVraK
But when I run my code, the app says it isn't finding a match. Code below:
some_array.each do |x|
logger.debug "scan #{x.scan('\$\w+\$')}"
logger.debug "String? #{x.instance_of?(String)}"
x.gsub!('\$\w+\$','scl=1')
...
My logger debug line shows a result of "[]". String is confirmed as being true. And the gsub line has no effect.
What do I need to correct?
Use /regex/ instead of 'regex':
> "http://www.xyz.com/017010830343?$ProdLarge$".gsub(/\$\w+\$/, 'scl=1')
=> "http://www.xyz.com/017010830343?scl=1"
Don't use a regex for this task, use a tool designed for it, URI. To remove the query:
require 'uri'
url = URI.parse('http://www.xyz.com/017010830343?$ProdLarge$')
url.query = nil
puts url.to_s
=> http://www.xyz.com/017010830343
To change to a different query use this instead of url.query = nil:
url.query = 'scl=1'
puts url.to_s
=> http://www.xyz.com/017010830343?scl=1
URI will automatically encode values if necessary, saving you the trouble. If you need even more URL management power, look at Addressable::URI.
I have a string of the format,
/d.phpsoft_id=369242&url=http://f.1mobile.com/mobile_software/finance/com.mshift.android.achieva_2.apk
and i need to edit this string using regular expression that the result string should start from http: ie the resultatnt string should be
http://f.1mobile.com/mobile_software/finance/com.mshift.android.achieva_2.apk
please help
For these types of situations, I prefer to go with readily available tools that will help provide a solution or at the very least will point me in the right direction. My favourite for regex is txt2re because it will output example code in many languages, including ruby.
After running your string through the parser and selecting httpurl for matching, it output:
txt='/d.phpsoft_id=369242&url=http://f.1mobile.com/mobile_software/finance/com.mshift.android.achieva_2.apk'
re1='.*?' # Non-greedy match on filler
re2='((?:http|https)(?::\\/{2}[\\w]+)(?:[\\/|\\.]?)(?:[^\\s"]*))' # HTTP URL 1
re=(re1+re2)
m=Regexp.new(re,Regexp::IGNORECASE);
if m.match(txt)
httpurl1=m.match(txt)[1];
puts "("<<httpurl1<<")"<< "\n"
end
str = "/d.phpsoft_id=369242&url=http://f.1mobile.com/mobile_software/finance/com.mshift.android.achieva_2.apk"
str.split("url=")[1]
Simple Answer
You need to do following
str = "/d.phpsoft_id=369242&url=http://f.1mobile.com/mobile_software/finance/com.mshift.android.achieva_2.apk"
start=str.index('http://')
resultant=str[start,str.length]