Jenkins/Groovy: How to pull specific part of a string - jenkins

I have a string that looks like:
data = ABSIFHIEHFINE -2938 NODFNJN {[somedate]} oiejfoen
I need to pull {[somedate]} only with {[]} included.
I tried data.substring(0, data.indexOf(']}')) to remove the end of the string, but it also removes the symbols that I need to keep.

def data = 'ABSIFHIEHFINE -2938 NODFNJN {[somedate]} oiejfoen'
// you could do error checking on these to ensure
// >= 0 and end > start and handle that however
// is appropriate for your requirements...
def start = data.indexOf '{['
def end = data.indexOf ']}'
def result = data[start..(end+1)]
assert result == '{[somedate]}'

You can do it using regular expression search:
data = "ABSIFHIEHFINE -2938 NODFNJN {[somedate]} oiejfoen"
def matcher = data =~ /\{\[.+?\]\}/
if( matcher ) {
echo matcher[0]
}
else {
echo "no match"
}
Output:
{[somedate]}
Explanations:
=~ is the find operator. It creates a java.util.regex.Matcher.
The string between the forward slashes (which is just another way to define a string literal), is the regular expression: \{\[.+?\]\}
RegEx breakdown:
\{\[ - literal { and [ which must be escaped because they have special meaning in RegEx
.+? - any character, at least one, as few as possible (to support finding multiple substrings enclosed in {[]})
\]\} - literal ] and } which must be escaped because they have special meaning in RegEx
You can test the RegEx on its own, or use a Groovy IDE to test the full sample code (replace echo with println).

Related

Remove all indents and spaces from JSON string except inside its value in Ruby

My problematic string is like this:
'{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'
I want to parse it as a JSON object (Hash) with JSON.parse(jsonstring).
The expected result is:
{ "test": "AAAA", "test2": "BBB\nB"}
However, I get the error:
JSON::ParserError: 809
I happened to learn that line-break codes in the JSON string need to be escaped,
so I tried this:
escaped_jsonstring = '{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'.gsub(/\R/, '\\n')
JSON.parse(escaped_jsonstring)
I still get JSON::ParserError.
Line breaks outside the keys or values may be causing this error.
How can I remove \n (or \r, or any other line-break code) only outside the keys and values in Ruby?
That is,
'{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'
↓
'{"test":"AAAA","test2":"BBB\n\n\nBBB"}'
Try this:
'{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}'.gsub(/\B(\\n)+/, "")
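A \n that follows a non-word character (such as {, a comma, or ") sits at a non-word-boundary position, which \B matches, so it is removed. A \n that follows a word character (as in BBB\n) sits at a word boundary (\b) and is kept. (\\n)+ handles runs such as '...,\n\n\n"test2":...'.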
Update
It turns out that a \n preceded by whitespace (\s\n) is also at a \B position, so it gets removed even inside a value. I'm not sure whether there are other such cases.
For now, here is the updated version:
'{\n"test":"AAAA",\n"test2":"BBB \n\n\n BBB"\n}'
.gsub(/([{,\"]\s*)\B(\\n)+/) { $1 }
A better way
I found another way to solve your problem, also using a regexp: scan the input text (valid JSON or not), pick out the pairs that follow the pattern "<key>":"<value>", ignore everything outside those pairs, and finally output the hash.
def format(json)
matches = json.scan(/\"(?<key>[^\"]+)\":\"(?<val>[^\"]+)\",*/)
matches&.to_h
end
format('{\n "test\n parser":"AA\nAA", \n\n"test2":"BBB ? ;\n\n\n BBB" \n}')
# {"test\n parser"=>"AA\nAA", "test2"=>"BBB ? ;\n\n\n BBB"}
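An alternative to boundary-based patterns is to walk the string once and track whether the current position is inside a double-quoted JSON string. The sketch below is one way to do that, not a hardened parser. strip_outer_newlines is a hypothetical helper name, and the code assumes the input contains literal backslash-n sequences (as a single-quoted Ruby string does) rather than real newline characters.

```ruby
require 'json'

# Hypothetical helper: drops literal \n and \r escape sequences that sit
# outside double-quoted JSON strings and keeps everything else untouched.
def strip_outer_newlines(raw)
  out = +''
  in_string = false
  i = 0
  while i < raw.length
    ch = raw[i]
    if ch == '\\' && i + 1 < raw.length
      pair = raw[i, 2]
      # Inside a string every escape (including \" and \n) is kept;
      # outside, only the line-break escapes are dropped.
      out << pair if in_string || !['\n', '\r'].include?(pair)
      i += 2
      next
    end
    # An unescaped double quote opens or closes a JSON string.
    in_string = !in_string if ch == '"'
    out << ch
    i += 1
  end
  out
end

cleaned = strip_outer_newlines('{\n"test":"AAAA",\n"test2":"BBB\n\n\nBBB"\n}')
puts cleaned
parsed = JSON.parse(cleaned)
p parsed
```

Because the escapes inside the value survive, JSON.parse turns them into real newlines in the resulting Hash.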

How to combine Ruby regexp conditions

I need to check if a string is valid image url.
I want to check beginning of string and end of string as follows:
Must start with http(s):
Must end by .jpg|.png|.gif|.jpeg
So far I have:
(https?:)
I can't seem to indicate beginning of string \A, combine patterns, and test end of string.
Test strings:
"http://image.com/a.jpg"
"https://image.com/a.jpg"
"ssh://image.com/a.jpg"
"http://image.com/a.jpeg"
"https://image.com/a.png"
"ssh://image.com/a.jpeg"
Please see http://rubular.com/r/PqERRim5RQ
Using Ruby 2.5
Using your very own demo, you could use
^https?:\/\/.*(?:\.jpg|\.png|\.gif|\.jpeg)$
See the modified demo.
One could even simplify it to:
^https?:\/\/.*\.(?:jpe?g|png|gif)$
See a demo for the latter as well.
This basically uses anchors (^ and $) on both sides, indicating the start and end of the string. Additionally, please remember that you need to escape the dot (\.) if you want to match a literal . character.
There's quite some ambiguity going on in the comments section, so let me clarify this:
^ - matches at the start of a string
(or of a line in multiline mode; in Ruby, ^ and $ always behave this way)
$ - matches at the end of a string / line
\A - is the very start of a string (irrespective of line breaks)
\z - is the very end of a string (irrespective of line breaks)
You may use
reg = %r{\Ahttps?://.*\.(?:png|gif|jpe?g)\z}
The point is:
When testing in online regex testers, you are testing a single multiline string, but in real life you will validate each line as a separate string. So in those testers use ^ and $, and in real code use \A and \z.
To match a whole string rather than a line, you need the \A and \z anchors.
Use the %r{pat} syntax if you have many / characters in your pattern; it is cleaner.
Online Ruby test:
urls = ['http://image.com/a.jpg',
'https://image.com/a.jpg',
'ssh://image.com/a.jpg',
'http://image.com/a.jpeg',
'https://image.com/a.png',
'ssh://image.com/a.jpeg']
reg = %r{\Ahttps?://.*\.(?:png|gif|jpe?g)\z}
urls.each { |url|
puts "#{url}: #{(reg =~ url) == 0}"
}
Output:
http://image.com/a.jpg: true
https://image.com/a.jpg: true
ssh://image.com/a.jpg: false
http://image.com/a.jpeg: true
https://image.com/a.png: true
ssh://image.com/a.jpeg: false
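To see why the choice of anchors matters in real code, the following sketch runs the line-anchored and the string-anchored pattern against input that contains a line break. The sample strings are made up for illustration.

```ruby
line_anchored   = /^https?:\/\/.*\.(?:jpe?g|png|gif)$/
string_anchored = %r{\Ahttps?://.*\.(?:png|gif|jpe?g)\z}

# Made-up input: a valid-looking URL hidden on the second line.
two_lines = "ssh://image.com/a.sh\nhttp://image.com/a.jpg"

puts line_anchored.match?(two_lines)    # true: ^ and $ match around the second line
puts string_anchored.match?(two_lines)  # false: \A and \z span the whole string

# A trailing newline is also rejected by \z.
puts string_anchored.match?("http://image.com/a.jpg\n")  # false
```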
The answers here are quite good, but if you wanted to avoid using a complicated regex and communicate your intent more clearly to a reader, you could let URI and File do the heavy lifting for you.
(And since you're using 2.5, let's use #match? instead of other regex-matching methods.)
require 'uri'

def valid_url?(url)
  # Let URI parse the URL.
  uri = URI.parse(url)
  # Is the scheme http or https, and does the extension match expected formats?
  # Both patterns are anchored so that e.g. a ".pngx" extension is rejected;
  # to_s guards against a missing scheme or path.
  uri.scheme.to_s.match?(/\Ahttps?\z/i) && File.extname(uri.path.to_s).match?(/\A\.(png|jpe?g|gif)\z/i)
rescue URI::InvalidURIError
  # If it's an invalid URL, URI will throw this error.
  # We'll return `false`, because a URL that can't be parsed by URI isn't valid.
  false
end
urls.map { |url| [url, valid_url?(url)] }
#=> Results in:
'http://image.com/a.jpg', true
'https://image.com/a.jpg', true
'ssh://image.com/a.jpg', false
'http://image.com/a.jpeg', true
'https://image.com/a.png', true
'ssh://image.com/a.jpeg', false
'https://image.com/a.tif', false
'http://t.co.uk/proposal.docx', false
'not a url', false
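One practical difference from the pure-regex answers is that File.extname only sees the path, so a query string after the file name does not affect the check. A small sketch of that behaviour, using a hypothetical helper name (image_url?) and a made-up URL:

```ruby
require 'uri'

# Hypothetical helper in the spirit of the answer above.
def image_url?(url)
  uri = URI.parse(url)
  %w[http https].include?(uri.scheme.to_s.downcase) &&
    %w[.png .jpg .jpeg .gif].include?(File.extname(uri.path.to_s).downcase)
rescue URI::InvalidURIError
  false
end

with_query = 'http://image.com/a.jpg?size=large'
string_anchored = %r{\Ahttps?://.*\.(?:png|gif|jpe?g)\z}

puts string_anchored.match?(with_query)  # false: the string ends with "large"
puts image_url?(with_query)              # true: only "/a.jpg" is inspected
```

Whether URLs with query strings should count as valid image URLs depends on the requirements; the point is only that the two approaches answer that question differently.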

How to concatenate API request URL safely

Let's imagine I have the following parts of a URL:
val url_start = "http://example.com"
val url_part_1 = "&fields[...]&" //This part of url can be in the middle of url or in the end
val url_part_2 = "&include..."
And then I try to concatenate the resulting URL like this:
val complete_url = url_start + url_part_2 + url_part_1
In this case I'd get http://example.com&include...&fields[...]& (ignore the syntax here), with a single & between the URL parts, which means the concatenation was successful. But if I concatenate in a different order in another request, like this:
val complete_url = url_start + url_part_1 + url_part_2
I'd get http://example.com&fields[...]&&include..., that is, a double && in this case. Is there a way to make the concatenation safer?
To keep your code clean, use an array or object to hold your params, and don't keep "?" or "&" as part of urlStart or the params. Add these at the end, e.g.:
var urlStart = "http://example.com"
var params=[]
params.push ('a=1')
params.push ('b=2')
params.push ('c=3', 'd=4')
var url = urlStart + '?' + params.join('&')
console.log (url) // http://example.com?a=1&b=2&c=3&d=4
First, note that the query string must be introduced by ?, not &, so the URL should look something like http://example.com/?include...&fields[...] (note the /? part; you could replace it with / to make it a path parameter, but it's unlikely that the website's router supports parameters like this). See, for example, this article: https://www.talisman.org/~erlkonig/misc/lunatech%5Ewhat-every-webdev-must-know-about-url-encoding/ to learn more about which URLs are valid.
For the simple abstract approach, you can use Kotlin's joinToString():
val query_part = arrayOf(
"fields[...]",
"include..."
).joinToString("&")
val whole_url = "http://example.com/?" + query_part
print(whole_url) // http://example.com/?fields[...]&include...
This approach is abstract because you can use joinToString() not only for URLs, but for whatever strings you want. This also means that if one of the input strings itself contains an & symbol, it will become two parameters in the output string. This is not a problem when you, as a programmer, know which strings will be joined, but if the strings are provided by a user, it can become one.
For a URL-aware approach, you can use URIBuilder from the Apache HttpComponents library, but you'll need to add this library to your project first.
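The URL-aware idea is not specific to Kotlin or to URIBuilder. As a rough sketch of the same principle in Ruby (a swapped-in language and library, not the Apache API mentioned above), the standard library's URI.encode_www_form joins parameters with & and percent-encodes each key and value. The parameter names and values here are placeholders, not the elided ones from the question.

```ruby
require 'uri'

# Placeholder parameters; the second value deliberately contains an "&".
params = { 'fields' => 'id,name', 'include' => 'a&b' }

uri = URI.parse('http://example.com/')
# encode_www_form inserts the "&" separators and escapes the values.
uri.query = URI.encode_www_form(params)

puts uri.to_s
# prints http://example.com/?fields=id%2Cname&include=a%26b
```

Since the separators are added by the library, the order in which parameters are supplied can no longer produce a doubled &&, and an & inside a value stays part of that value.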

How to have gsub handle multiple patterns and replacements

A while ago I created a function in PHP to "twitterize" the text of tweets pulled via Twitter's API.
Here's what it looked like:
function twitterize($tweet){
$patterns = array ( "/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+#)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%#.\w_]*)#?(?:[\w]*))?)/",
"/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/",
"/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/");
$replacements = array ("<a href='\\0' target='_blank'>\\0</a>", "<a href='http://twitter.com/\\1' target='_blank'>\\0</a>", "<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>");
return preg_replace($patterns, $replacements, $tweet);
}
Now I'm a little stuck with Ruby's gsub, I tried:
def twitterize(text)
patterns = ["/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+#)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%#.\w_]*)#?(?:[\w]*))?)/", "/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/", "/(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/"]
replacements = ["<a href='\\0' target='_blank'>\\0</a>",
"<a href='http://twitter.com/\\1' target='_blank'>\\0</a>",
"<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]
return text.gsub(patterns, replacements)
end
Which obviously didn't work and returned an error:
No implicit conversion of Array into String
And after looking at the Ruby documentation for gsub and exploring a few of the examples they were providing, I still couldn't find a solution to my problem: How can I have gsub handle multiple patterns and multiple replacements at once?
Well, as you can read in the docs, gsub does not handle multiple patterns and replacements at once. That is what is causing your error, which is otherwise quite explicit (you can read it as "give me a String, not an Array!").
You can write that like this:
def twitterize(text)
patterns = [/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+#)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%#.\w_]*)#?(?:[\w]*))?)/, /(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/, /(?<=^|(?<=[^a-zA-Z0-9-\.]))#([A-Za-z_]+[A-Za-z0-9_]+)/]
replacements = ["<a href='\\0' target='_blank'>\\0</a>",
"<a href='http://twitter.com/\\1' target='_blank'>\\0</a>",
"<a href='http://twitter.com/search?q=\\1&src=hash' target='_blank'>\\0</a>"]
patterns.each_with_index do |pattern, i|
text.gsub!(pattern, replacements[i])
end
text
end
This can be refactored into more elegant rubyish code, but I think it'll do the job.
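As one possible refactoring along those lines, the patterns and replacements can be kept together as pairs and folded over the text with the non-mutating gsub, so the caller's string is left unchanged. The two short patterns below are simplified stand-ins for illustration, not the patterns from the question, and apply_rules is a hypothetical name.

```ruby
# Applies each [pattern, replacement] pair in order and returns a new string.
def apply_rules(text, rules)
  rules.reduce(text) do |result, (pattern, replacement)|
    result.gsub(pattern, replacement)
  end
end

# Simplified stand-in rules: mentions first, then hashtags.
rules = [
  [/@(\w+)/, "<a href='http://twitter.com/\\1'>\\0</a>"],
  [/#(\w+)/, "<a href='http://twitter.com/search?q=\\1&src=hash'>\\0</a>"]
]

tweet = 'hi @bob #ruby'
puts apply_rules(tweet, rules)
puts tweet  # the original string is not modified
```

Note that the rules run sequentially, so a later pattern also sees the markup produced by earlier ones; the order of the pairs matters.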
The error occurred because you passed arrays where gsub expects a single pattern and a single replacement string. Its syntax is:
text.gsub(matching_pattern,replacement_text)
You need to do something like this:
replaced_text = text.gsub(pattern1, replacement1)
replaced_text = replaced_text.gsub(pattern2, replacement2)
and so on, where pattern1 is one of your matching patterns and replacement1 is the replacement text you want for it.

Regex in Ruby: expression not found

I'm having trouble with a regex in Ruby (on Rails). I'm relatively new to this.
The test string is:
http://www.xyz.com/017010830343?$ProdLarge$
I am trying to remove "$ProdLarge$". In other words, the $ signs and anything between.
My regular expression is:
\$\w+\$
Rubular says my expression is ok. http://rubular.com/r/NDDQxKVraK
But when I run my code, the app says it isn't finding a match. Code below:
some_array.each do |x|
logger.debug "scan #{x.scan('\$\w+\$')}"
logger.debug "String? #{x.instance_of?(String)}"
x.gsub!('\$\w+\$','scl=1')
...
My logger debug line shows a result of "[]". String is confirmed as being true. And the gsub line has no effect.
What do I need to correct?
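Use /regex/ instead of 'regex'. When the pattern is a String, gsub and scan look for that literal text rather than treating it as a regular expression: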
> "http://www.xyz.com/017010830343?$ProdLarge$".gsub(/\$\w+\$/, 'scl=1')
=> "http://www.xyz.com/017010830343?scl=1"
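The difference is easy to check in isolation: when scan or gsub receives a String, it searches for those exact characters, backslashes included. A short sketch using the URL from the question:

```ruby
url = 'http://www.xyz.com/017010830343?$ProdLarge$'

p url.scan('\$\w+\$')            # [] (looks for the literal text \$\w+\$)
p url.scan(/\$\w+\$/)            # ["$ProdLarge$"]
p url.gsub(/\$\w+\$/, 'scl=1')   # "http://www.xyz.com/017010830343?scl=1"

# If the pattern only exists as a String, it can be compiled explicitly.
pattern = Regexp.new('\$\w+\$')
p url.sub(pattern, 'scl=1')      # "http://www.xyz.com/017010830343?scl=1"
```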
Don't use a regex for this task; use a tool designed for it: URI. To remove the query:
require 'uri'
url = URI.parse('http://www.xyz.com/017010830343?$ProdLarge$')
url.query = nil
puts url.to_s
=> http://www.xyz.com/017010830343
To change to a different query use this instead of url.query = nil:
url.query = 'scl=1'
puts url.to_s
=> http://www.xyz.com/017010830343?scl=1
URI will automatically encode values if necessary, saving you the trouble. If you need even more URL management power, look at Addressable::URI.
