How to replace string after extraction from WebHarvest? - webharvest

I want to insert records extracted from a website into a database, but the extracted text contains apostrophes, which cause a syntax error in the SQL insert. How can I replace each apostrophe with "’" in WebHarvest?
Thanks in advance!

I normally use the script element to work on strings, and then output to a new webharvest variable. For example:
<var-def name="r_output">
A long string with lots of funny characters
new lines and & and ' single and " double quotes
</var-def>
<var-def name="r_output2">
<script return="r_output2">
<![CDATA[
String r_output2 = "\n" + r_output.toString().replaceAll("&", "&amp;").replaceAll("\\t","").replaceAll("\\n","").replaceAll("\\r","");
]]>
</script>
</var-def>
<var name="r_output2"/>
As a side note, instead of wrapping each apostrophe in its own quotes, it is much better to quote the whole chunk of data, e.g. "a string with a ' single quote" rather than a string with a "'" single quote.

Related

Rails remove '\' from json response

json response
{"skill"=>"{\"dept_id\"=>\"01\", \"user_id\"=>\"001\", \"level_cd\"=>\"04_swim\", \"first_name\"=>\"rohit\", \"last_name\"=>\"patel\", \"dept_full_name\"=>\"swiming\", \"rank\"=>\"04_swim\"}, {\"dept_id\"=>\"02\", \"user_id\"=>\"002\", \"level_cd\"=>\"04_swim\", \"first_name\"=>\"ranjit\", \"last_name\"=>\"shinde\", \"dept_full_name\"=>\"running\", \"rank\"=>\"03_run\"}, {\"dept_id\"=>\"04\", \"user_id\"=>\"004\", \"level_cd\"=>\"02_jump\", \"first_name\"=>\"kedar\", \"last_name\"=>\"patil\", \"dept_full_name\"=>\"jumping\", \"rank\"=>\"02_jump\"}, {\"dept_id\"=>\"05\", \"user_id\"=>\"005\", \"level_cd\"=>\"03_run\", \"first_name\"=>\"kapil\", \"last_name\"=>\"bote\", \"dept_full_name\"=>\"Hammer\", \"rank\"=>\"03_run\"}"
How to remove only \ from this response
expected output is
"skill"=>{"dept_id"=>"01", "user_id"=>"001", "level_cd"=>"04_swim", "first_name"=>"rohit", "last_name"=>"patel", "dept_full_name"=>"swiming", "rank"=>"04_swim"}, {"dept_id"=>"02", "user_id"=>"002", "level_cd"=>"04_swim", "first_name"=>"ranjit", "last_name"=>"shinde", "dept_full_name"=>"running", "rank"=>"03_run"}, {"dept_id"=>"04", "user_id"=>"004", "level_cd"=>"02_jump", "first_name"=>"kedar", "last_name"=>"patil", "dept_full_name"=>"jumping", "rank"=>"02_jump"}, {"dept_id"=>"05", "user_id"=>"005", "level_cd"=>"03_run", "first_name"=>"kapil", "last_name"=>"bote", "dept_full_name"=>"Hammer", "rank"=>"03_run"}
The string does not actually contain any backslashes. They appear only because the value is displayed as a double-quoted string literal.
If you want to use a double quote inside a double-quoted string, you need to escape it with a backslash; otherwise the parser treats it as the end of the string.
"John Doe said: "Hello World!""
The above is not valid. The " before Hello World! ends the string, so Hello World! is no longer inside the string and Ruby tries to parse Hello and World as constants.
To prevent this, escape the " with a backslash \.
"John Doe said: \"Hello World!\""
\" is interpreted as a single " character. There is no backslash in the resulting string. See the Ruby literals documentation.
When using single quotes as string delimiters there is no need to escape double quotes (but you do need to escape single quotes). The above could also be written as:
'John Doe said: "Hello World!"'
Similarly your data can be written as:
{"skill"=>'{"dept_id"=>"01", "user_id"=>"001", "level_cd"=>"04_swim", "first_name"=>"rohit", "last_name"=>"patel", "dept_full_name"=>"swiming", "rank"=>"04_swim"}, {"dept_id"=>"02", "user_id"=>"002", "level_cd"=>"04_swim", "first_name"=>"ranjit", "last_name"=>"shinde", "dept_full_name"=>"running", "rank"=>"03_run"}, {"dept_id"=>"04", "user_id"=>"004", "level_cd"=>"02_jump", "first_name"=>"kedar", "last_name"=>"patil", "dept_full_name"=>"jumping", "rank"=>"02_jump"}, {"dept_id"=>"05", "user_id"=>"005", "level_cd"=>"03_run", "first_name"=>"kapil", "last_name"=>"bote", "dept_full_name"=>"Hammer", "rank"=>"03_run"}'
The above clearly demonstrates that there are no backslash characters present in the string.
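A quick way to check this claim is to compare the two literal forms directly. This is a minimal sketch; the variable names are made up for illustration:

```ruby
# Same text written with two different delimiters.
escaped = "John Doe said: \"Hello World!\""   # double-quoted, inner quotes escaped
plain   = 'John Doe said: "Hello World!"'     # single-quoted, no escaping needed

# Both literals produce the same characters.
same_text = (escaped == plain)

# The escaped form contains no actual backslash character.
has_backslash = escaped.include?("\\")

puts same_text
puts has_backslash
```

Both literals build the same string, and neither holds a backslash; the backslash exists only in the source code.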
However, the string is not JSON. I suggest changing the server response if possible. You can eval the current response, but I would advise against ever using eval (eval is evil). If the server sends malicious Ruby code, eval will execute it without complaint and may compromise your machine.
It looks like the hash example needs to end with a } to be valid, so I added one in my example. Furthermore, it appears to be a collection of records that is missing its enclosing list. Inside a list it would be valid, but as the example stands it is not a valid hash.
But let's say just for fun, I did want to take the string and put it inside an array. Maybe something like this:
data = {"skill"=>"{\"dept_id\"=>\"01\", \"user_id\"=>\"001\", \"level_cd\"=>\"04_swim\", \"first_name\"=>\"rohit\", \"last_name\"=>\"patel\", \"dept_full_name\"=>\"swiming\", \"rank\"=>\"04_swim\"}, {\"dept_id\"=>\"02\", \"user_id\"=>\"002\", \"level_cd\"=>\"04_swim\", \"first_name\"=>\"ranjit\", \"last_name\"=>\"shinde\", \"dept_full_name\"=>\"running\", \"rank\"=>\"03_run\"}, {\"dept_id\"=>\"04\", \"user_id\"=>\"004\", \"level_cd\"=>\"02_jump\", \"first_name\"=>\"kedar\", \"last_name\"=>\"patil\", \"dept_full_name\"=>\"jumping\", \"rank\"=>\"02_jump\"}, {\"dept_id\"=>\"05\", \"user_id\"=>\"005\", \"level_cd\"=>\"03_run\", \"first_name\"=>\"kapil\", \"last_name\"=>\"bote\", \"dept_full_name\"=>\"Hammer\", \"rank\"=>\"03_run\"}"}
parsed_data = data["skill"].split("}, ").map{|x| x.end_with?("\"") ? x + '}' : x}.map{|x| eval(x)}
puts parsed_data
{"dept_id"=>"01", "user_id"=>"001", "level_cd"=>"04_swim", "first_name"=>"rohit", "last_name"=>"patel", "dept_full_name"=>"swiming", "rank"=>"04_swim"}
{"dept_id"=>"02", "user_id"=>"002", "level_cd"=>"04_swim", "first_name"=>"ranjit", "last_name"=>"shinde", "dept_full_name"=>"running", "rank"=>"03_run"}
{"dept_id"=>"04", "user_id"=>"004", "level_cd"=>"02_jump", "first_name"=>"kedar", "last_name"=>"patil", "dept_full_name"=>"jumping", "rank"=>"02_jump"}
{"dept_id"=>"05", "user_id"=>"005", "level_cd"=>"03_run", "first_name"=>"kapil", "last_name"=>"bote", "dept_full_name"=>"Hammer", "rank"=>"03_run"}
Now with the data in an array you can convert it to json if you'd like
require 'json'
2.6.5 :007 > parsed_data.to_json
=> "[{\"dept_id\":\"01\",\"user_id\":\"001\",\"level_cd\":\"04_swim\",\"first_name\":\"rohit\",\"last_name\":\"patel\",\"dept_full_name\":\"swiming\",\"rank\":\"04_swim\"},{\"dept_id\":\"02\",\"user_id\":\"002\",\"level_cd\":\"04_swim\",\"first_name\":\"ranjit\",\"last_name\":\"shinde\",\"dept_full_name\":\"running\",\"rank\":\"03_run\"},{\"dept_id\":\"04\",\"user_id\":\"004\",\"level_cd\":\"02_jump\",\"first_name\":\"kedar\",\"last_name\":\"patil\",\"dept_full_name\":\"jumping\",\"rank\":\"02_jump\"},{\"dept_id\":\"05\",\"user_id\":\"005\",\"level_cd\":\"03_run\",\"first_name\":\"kapil\",\"last_name\":\"bote\",\"dept_full_name\":\"Hammer\",\"rank\":\"03_run\
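If eval is to be avoided, as advised earlier in this thread, one possible alternative is to rewrite the hash-rocket text as JSON and hand it to a real parser. The following is only a sketch: it assumes that the text => never occurs inside a key or value, and the shortened sample string and variable names are hypothetical.

```ruby
require 'json'

# Shortened, hypothetical sample in the same shape as the "skill" value above.
skill = '{"dept_id"=>"01", "first_name"=>"rohit"}, {"dept_id"=>"02", "first_name"=>"ranjit"}'

# Turn Ruby's hash-rocket notation into JSON's colon notation and wrap the
# comma-separated records in [ ] so the text becomes a single JSON array.
# Assumption: "=>" never appears inside a key or a value.
json_text = "[" + skill.gsub("=>", ":") + "]"

# JSON.parse only builds data structures; it never executes code.
records = JSON.parse(json_text)

puts records.length
puts records.first["first_name"]
```

Unlike eval, a malformed or hostile response here raises JSON::ParserError instead of running as Ruby code.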

@HttpContext.Current.User.Identity.Name not showing backslash

Super simple. The only issue I find other people reporting is getting null, which I have obviously fixed. But where is the backslash?
params.me = '@HttpContext.Current.User.Identity.Name';
This returns
"domainUserName" <- Browser
"domain\\UserName" <- Debugging
What I expect is
"domain\UserName" <- Browser
Any ideas?
Based on your comments, you are using the following code to show the user name:
alert('@HttpContext.Current.User.Identity.Name');
@HttpContext.Current.User.Identity.Name is a string that can contain a "\" backslash character. This character is an escape character in JavaScript, as it is in C#.
You need to escape the "\" character in the string before passing it to JavaScript, like this:
alert('@HttpContext.Current.User.Identity.Name.Replace("\\", "\\\\")')
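The same idea can be sketched outside Razor. The Ruby snippet below is an illustration only, not the ASP.NET API: it doubles each backslash before the value is pasted into a JavaScript string literal. The user name and variable names are hypothetical.

```ruby
# Hypothetical user name containing one real backslash
# (in a single-quoted Ruby literal, \U is a backslash followed by U).
name = 'domain\UserName'

# Double every backslash. The block form of gsub is used so the replacement
# is taken literally rather than parsed for back-references.
escaped = name.gsub('\\') { '\\\\' }

# The JavaScript source now contains \\, which JavaScript reads as one backslash.
js_line = "alert('#{escaped}');"

puts js_line
```

Without the doubling, JavaScript would read \U as an escape sequence and drop the backslash, which matches the "domainUserName" seen in the browser.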

namevaluecollection removes "+" characters from querystring

I have the following URL: localhost.dev?q=dyYJDXWoTKjj9Za6Enzg4lB+NHJsrZQehfY1dqbU1fc= and extract the query string as follows:
NameValueCollection query = HttpUtility.ParseQueryString(actionContext.Request.RequestUri.Query);
string str1 = query[0];
If I call query.ToString() it shows the correct query string. However, when I access the value through the NameValueCollection (query[0]), the "+" is replaced by a space " ", i.e. dyYJDXWoTKjj9Za6Enzg4lB NHJsrZQehfY1dqbU1fc=
I've tried specifying different encodings and using the Get method of the NameValueCollection. I've also tried splitting the string, but the "+" is removed each time. Has anyone got any ideas? Many thanks
You can't use these characters unencoded in URL query values: a "+" in a query string is decoded as a space. Use the UrlEncode and UrlDecode methods of the HttpUtility class so the value is sent as valid URL-encoded text.
I hope this helps.
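The underlying rule is not specific to .NET: in form-style URL encoding, "+" stands for a space. The sketch below shows the same behaviour with Ruby's standard uri library, as an illustration of the encoding rule rather than of HttpUtility itself. The token is the one from the question.

```ruby
require 'uri'

token = "dyYJDXWoTKjj9Za6Enzg4lB+NHJsrZQehfY1dqbU1fc="

# Decoding the raw value treats "+" as an encoded space, so the plus is lost.
decoded_raw = URI.decode_www_form_component(token)

# Encoding first turns "+" into %2B (and "=" into %3D) ...
encoded = URI.encode_www_form_component(token)

# ... so decoding the encoded value gives back the original token.
round_trip = URI.decode_www_form_component(encoded)

puts decoded_raw
puts encoded
puts round_trip == token
```

The fix therefore belongs on the side that builds the URL: the value has to be encoded before it is placed in the query string.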

ruby regex with quotes

I'm trying to pass more than one regex parameter for parts of a string that needs to be replaced. Here's the string:
str = "stands in hall &quot;Let&#39;s go get to first period everyone&quot; Students continue moving to seats."
Here is the expected string:
str = "stands in hall "Let's go get to first period everyone" Students continue moving to seats."
This is what I tried:
str.gsub(/&#39;|&quot;/, "&#39;" => "\'", "&quot;" => "\"")
This is what I got:
"stands in hall \"Let's go get to first period everyone\" Students continue moving to seats."
How do I get the quotes in while sending in two regex parameters using gsub?
This is an HTML unescaping problem.
require 'cgi'
CGI.unescape_html(str)
This gives you the correct answer.
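A runnable sketch of that suggestion follows. It assumes the original string carried the HTML entities &quot; and &#39; in place of the quote characters, which is a guess based on the answer. CGI.unescapeHTML is the method that unescape_html aliases.

```ruby
require 'cgi'

# Assumed input: the question's string with its quotes written as HTML entities.
str = "stands in hall &quot;Let&#39;s go get to first period everyone&quot; Students continue moving to seats."

# Convert the entities back into the characters they stand for.
unescaped = CGI.unescapeHTML(str)

puts unescaped
```

This handles both entities in one call, so no regular expression or replacement hash is needed.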
From my comments on this question:
Your updated version is correct. The only reason the backslashes appear in your final line is that they are escape sequences, there so that the inner quote is not mistaken for the end of the string. Try assigning your output and printing it:
str1 = str.gsub(/&#39;|&quot;/, "&#39;" => "\'", "&quot;" => "\"")
puts str1
and you'll see that the backslashes are gone when str1 is printed using puts.
The difference is that irb (which I assume you're using to run this sample code) automatically calls inspect on the value of each expression, and for strings inspect shows the value as an escaped, double-quoted literal.
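The difference between the two views can be seen without irb. This is a small sketch with a made-up sample string:

```ruby
# A string containing real double quotes and an apostrophe, but no backslash.
quoted = 'hall "Let\'s go" seats'

# inspect returns the literal-style view that irb echoes:
# wrapped in double quotes, with inner double quotes escaped.
shown_by_irb = quoted.inspect

# to_s is what puts writes: the raw characters.
shown_by_puts = quoted.to_s

puts shown_by_irb
puts shown_by_puts
```

Only the inspect view contains backslashes; the string itself does not.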
Because I did not understand unescaping characters I found an alternative solution that might be the "rails-way"
Can you use <%= raw 'some_html' %>
My final solution ended up being this instead of messy regex and requiring CGI
<%= raw evidence_score.description %>
Unescaping HTML string in Rails

xml markupbuilder in grails changing single quotes in attribute value to &apos;

I am using Groovy's XML MarkupBuilder to generate my XML. One tag has an attribute whose value contains a single quote ('), and when I set it in code and print the result, the generated XML has the single quote changed to &apos;
Is this automatically converted back to a single quote when I render this XML string in a GSP?
If not, how do I retain the single quote in the attribute value?
I tried escaping the single quote with \ but it still shows &apos; in the output log.
Here is the MarkupBuilder code I have:
xml.map(id:"worldmap",name:"worldmap"){
res_row.each{
area(shape:"circle",alt:it.key,title:it.key,onclick:"loadActivity(\'"+it.key+"\')")
}
}
The final attribute should be onclick="loadActivity('New York')"
Thanks
You can configure the MarkupBuilder to use double quotes:
xml.setDoubleQuotes(true)
complete example:
import groovy.xml.MarkupBuilder
def xml = new MarkupBuilder()
xml.setDoubleQuotes(true)
def res_row = [a:1, b:2]
def text= xml.map(id:"worldmap",name:"worldmap"){
res_row.each{
area(shape:"circle",alt:it.key,title:it.key,onclick:"loadActivity('${it.key}')")
}
}
println text
prints:
<map id="worldmap" name="worldmap">
<area shape="circle" alt="a" title="a" onclick="loadActivity('a')" />
<area shape="circle" alt="b" title="b" onclick="loadActivity('b')" />

Resources