"&lt;" issue in AXIOM parser - xml-parsing

The Axiom parser converts "&lt;" to "<" when it comes right after an empty element. This makes the XML content invalid.
XML Input:
case 1: <A> test <B></B> &lt; test1 </A>
case 2: <A> test <B>ear</B> &lt; test1 </A>
XML Output:
case 1: <A> test <B/> < test1 </A> [Incorrect]
case 2: <A> test <B>ear</B> &lt; test1 </A> [Correct]
Axiom Code:
InputStream ina = new FileInputStream(fileName);
OMElement root = OMXMLBuilderFactory.createOMBuilder(ina).getDocumentElement();
Is there any way to handle this scenario?

This is AXIOM-509. A fix for that issue was released in Axiom 1.4.0.

Related

Separate link_to with comma

I want to separate this block with commas :
- game_publication.groups.each_with_index do |group, index|
  = link_to store_group_path(current_store, group) do
    %span= @groups.find(group).name.to_s + (index > 0 ? ', ' : '')
But for the moment it returns something like
<label>Groups :</label>
<a href="/66-store/groups/4594?locale=en">
<span>party hard</span>
</a>
<a href="/66-store/groups/5063?locale=en">
<span>b0m,</span>
</a>
<a href="/66-store/groups/5066?locale=en">
<span>test,</span>
</a>
</label>
It doesn't seem to be a situation where I can use any Rails helpers.
I would like something like group1, group2, group3.
<label>Groups :</label>
<a href="/66-store/groups/4594?locale=en">
<span>party hard,</span>
</a>
<a href="/66-store/groups/5063?locale=en">
<span>b0m,</span>
</a>
<a href="/66-store/groups/5066?locale=en">
<span>test</span>
</a>
</label>
First, are you sure that you pasted here the exact code which gave the posted result? In your code, you have
(index > 0 ? '' : ',')
which means: do not add a comma unless we are on the first element. The result you posted has the comma the other way round: there is a comma everywhere except on the first element. In other words, the code you posted can't produce the output you posted.
Now for your problem: You want to add a comma on every element, except the last. This means that you need to know the highest (last) index value:
last_index = game_publication.groups.size - 1
With this, you can write your expression as
(index == last_index ? '' : ',')
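As a minimal sketch of the last-index check outside of Rails and HAML, here is the same idea on a plain array. The group_names array is a hypothetical stand-in for the names of game_publication.groups:

```ruby
# Hypothetical stand-in for the names of game_publication.groups.
group_names = ["party hard", "b0m", "test"]

last_index = group_names.size - 1

# Append a comma to every element except the last one.
labels = group_names.each_with_index.map do |name, index|
  name + (index == last_index ? '' : ',')
end

puts labels.inspect
```

In the template, the same conditional would go inside the %span expression in place of the index > 0 check.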

How to apply additional inline style to html tags in ruby?

I have an HTML string. In that string I want to parse all <p> tags and apply additional inline style.
Additional Style: style="margin:0px;padding:0px;" or it could be something else
Case1:
input string: <p>some string</p>
output string: <p style="margin:0px;padding:0px;">some string</p>
Case2:
input string: <p style="text-align:right;" >some string</p>
output string: <p style="text-align:right;margin:0px;padding:0px;">some string</p>
Case3:
input string: <p align="justify">some string</p>
output string: <p style="margin:0px;padding:0px;" align="justify">some string</p>
Right now I am using regex like this
myHtmlString.gsub("<p", "<p style = \"margin:0px;padding:0px\"")
Which works fine except it removes previous styling. I am using Ruby (ROR).
I need help to tweak this a bit.
You can do this using Nokogiri, by setting [:style] on the relevant Nodes.
require "nokogiri"
inputs = [
  '<p>some string</p>',
  '<p style="text-align:right;" >some string</p>',
  '<p align="justify">some string</p>'
]
inputs.each do |input|
  noko = Nokogiri::HTML::fragment(input)
  noko.css("p").each do |tag|
    tag[:style] = (tag[:style] || "") + "margin:0px;padding:0px;"
  end
  puts noko.to_html
end
This will loop through all elements matching the css selector p, and set the style attribute like you want.
Output:
<p style="margin:0px;padding:0px;">some string</p>
<p style="text-align:right;margin:0px;padding:0px;">some string</p>
<p align="justify" style="margin:0px;padding:0px;">some string</p>
I recommend against using regex for this, as in general HTML can't be properly parsed by regex. That said, as long as your input data is consistent, regex will still work. You want to match whatever content is already in a p element's style attribute using parentheses, then insert it in the substitution string:
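# Block form, so that $2 refers to the current match (a string argument would be interpolated before gsub runs)
myHtmlString.gsub(/<p( style="(.*)")?/) { "<p style=\"#{$2};margin:0px;padding:0px\"" }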
Here's how the match pattern works:
/ #regex delimiter
<p #match start of p tag
( #open paren used to group, everything in this group gets saved in $1
style=" #open style attribute
(.*) #group contents of style attribute, gets saved to $2
" #close style attribute
)? #question mark makes everything in the paren group optional
/ #regex delimiter
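A minimal sketch of a regex-based variant, checked against the three cases from the question. It assumes attributes are always double-quoted and that an existing style value already ends in a semicolon; add_p_style and EXTRA_STYLE are hypothetical names, not part of any library. It uses the block form of sub, in which $1 refers to the current match:

```ruby
# Hypothetical constant: the declarations to append to every <p> tag.
EXTRA_STYLE = "margin:0px;padding:0px;"

# Hypothetical helper: returns a copy of html with EXTRA_STYLE added to each <p>.
def add_p_style(html, extra = EXTRA_STYLE)
  # Match each opening <p ...> tag; \b keeps <pre> and similar tags out.
  html.gsub(/<p\b[^>]*>/) do |tag|
    if tag.include?('style="')
      # Existing style attribute: append to its current value.
      tag.sub(/style="([^"]*)"/) { "style=\"#{$1}#{extra}\"" }
    else
      # No style attribute yet: add one right after the tag name.
      tag.sub("<p", "<p style=\"#{extra}\"")
    end
  end
end

puts add_p_style('<p>some string</p>')
puts add_p_style('<p style="text-align:right;" >some string</p>')
puts add_p_style('<p align="justify">some string</p>')
```

This is still string matching rather than HTML parsing, so it will misbehave on input that breaks the assumptions above (single-quoted attributes, or another attribute whose name ends in style).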
I ended up doing something like this; I had to do it just before sending the email. I know this is not the best way to do it, but it is worth sharing here. The solutions given by @sgroves and @Dobert are really good and helpful.
But I didn't want to include Nokogiri, though I picked up the idea from the two solutions above. Thanks.
Here is my code (I am new to RoR, so nothing fancy here; I used it in a HAML block):
myString.gsub!(/<p[^>]*>/) do |match|
  match1 = match
  # [^"]* stops at the closing quote, so other attributes are left alone
  style1_arr = match1.scan(/style="[^"]*"/)
  unless style1_arr.blank? # blank? comes from ActiveSupport (Rails)
    style1 = style1_arr.first.sub("style=", "").gsub(/\"/, "").to_s
    style2 = style1 + "margin:0px;padding:0px;"
    match2 = match1.sub(/style="[^"]*"/, "style=\"#{style2.to_s}\"")
  else
    match2 = match1.sub(/<p/, "<p style = \"margin:0px;padding:0px;\"")
  end
end
myString now holds the updated string (note the ! after gsub, which modifies the string in place).

Capybara: page_find doesn't seem to be working with double slash (//)

Trying to validate the 2nd link in the following HTML:
<div id="navigation">
<ul>
<li>
TV
</li>
<li>
Radio
</li>
with the following expression:
page.find(:xpath, "//div[@id='navigation']//a").should have_content('Radio')
and I'm getting the following error:
expected there to be content "Radio" in "TV"
Shouldn't the find method search all the a elements inside the div node, since I'm using a double slash? Could this be a bug or am I doing something wrong?
And is there any other way to be able to validate the 2nd link?
Thanks for the help guys!
In your case, find returns the first matching a element in Capybara < 2.0 and raises an Ambiguous match exception in Capybara 2.0, because more than one element matches the locator.
I suggest doing the following instead:
page.should have_selector('#navigation a', text: 'Radio')

How do I trim all whitespace in HAML?

In my example I want to individually markup letters in the word "word"
%span.word
  %span.w W
  %span.o O
  %span.r R
  %span.d D
As it is, this produces html like
<span class="word">
<span class="w">W</span>
<span class="o">O</span>
<span class="r">R</span>
<span class="d">D</span>
</span>
As you'd expect this displays as
W O R D
But I want it to display as
WORD
How can I tell Haml to remove all whitespace within the %span.word block?
%span.word
  %span.w> W
  %span.o> O
  %span.r> R
  %span.d> D
> (for whitespaces around a tag) and < (for whitespaces inside a tag) are used for whitespace removal.
http://haml.info/docs/yardoc/Haml/Options.html#remove_whitespace-instance_method
HAML allows you to set a remove_whitespace option, which removes whitespace from all tags, if you don't want to litter your templates with < and > everywhere.

sanitize gem issue with &lt; and &gt;

I am using the sanitize gem https://github.com/rgrove/sanitize to remove some HTML tags from a string.
However, before sanitizing the string in my controller, the string is being set as follows:
&lt;p&gt;This is &lt;b&gt;bold&lt;/b&gt; and this &lt;span style="text-decoration: underline;"&gt;is&lt;/span&gt; &lt;i&gt;italics&lt;/i&gt; ok? This &lt;em&gt;is not &lt;/em&gt;a problem.&lt;/p&gt;
meaning that < and > are being replaced by &lt; and &gt;.
How can I use the sanitize gem to remove, for example, <i> and </i> when these tags are being represented as &lt;i&gt; and &lt;/i&gt; in the controller?
If you want the escaped HTML tags (&lt; and &gt;) to be treated as HTML for the purposes of sanitizing, then you'll have to unescape them first:
require 'cgi'
Sanitize.clean(CGI.unescapeHTML(your_string))
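To see what the unescaping step does on its own, here is a small sketch that uses only the standard library; the escaped string is a hypothetical, shortened version of the one in the question:

```ruby
require 'cgi'

# Hypothetical input: markup whose angle brackets arrived HTML-escaped.
escaped = "&lt;p&gt;This is &lt;b&gt;bold&lt;/b&gt; and &lt;i&gt;italics&lt;/i&gt;&lt;/p&gt;"

# Turn the entities back into real tags so that a sanitizer can see them.
unescaped = CGI.unescapeHTML(escaped)

puts unescaped
```

The unescaped string contains real tags again, so it is the value to pass to the sanitizer rather than the original one.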
