How do I trim all whitespace in HAML? - ruby-on-rails

In my example I want to mark up each letter of the word "word" individually:
%span.word
  %span.w W
  %span.o O
  %span.r R
  %span.d D
As it is, this produces html like
<span class="word">
<span class="w">W</span>
<span class="o">O</span>
<span class="r">R</span>
<span class="d">D</span>
</span>
As you'd expect this displays as
W O R D
But I want it to display as
WORD
How can I tell Haml to remove all whitespace within the %span.word block?

%span.word
  %span.w> W
  %span.o> O
  %span.r> R
  %span.d> D
Haml's whitespace-removal modifiers do this: > removes the whitespace around a tag, and < removes the whitespace immediately inside a tag.

http://haml.info/docs/yardoc/Haml/Options.html#remove_whitespace-instance_method
Haml allows you to set a remove_whitespace option, which removes whitespace from all tags, if you don't want to litter your templates with < and > everywhere.

Related

SLIM h3 inside p

If I put a heading tag (h1..h5) inside a p, the HTML renders incorrectly:
slim
p
  h3 Header here
  span Just text
expected
<p>
<h3>Header here</h3>
<span>Just text</span>
</p>
but it actually renders
<p></p>
<h3>Header here</h3>
<span>Just text</span>
<p></p>
This looks harmless, but I can't apply my CSS styles because the structure is rendered in a broken way.
Is this a bug, or can I work around it somehow?
You cannot put a heading tag like h3 inside a p tag: a p element may only contain phrasing content, so span is allowed but headings are not.
See the HTML5 specification for more information.
Slim itself doesn’t care about the tags, and renders your code as you expect:
$ slimrb
p
  h3 Header here
  span Just text
produces:
<p><h3>Header here</h3><span>Just text</span></p>
which matches your expected output, except for whitespace.
This isn’t valid HTML though, so when the browser parses it it will correct it to something valid. For example the Chrome inspector will show:
<p></p>
<h3>Header here</h3>
<span>Just text</span>
<p></p>
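Here the browser has closed the p element before the h3 element. The leftover </p> end tag no longer has an open p to close, so the browser turns it into a second, empty p element.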
If you want this to work in the browser, make sure you are generating valid HTML. Perhaps use a div instead of a p?

How to get text from <li> elements

I have:
<ul>
<li>text1</li>
<li>text2 </li>
</ul>
Right now I get the text from <li> like this:
result = page.css(' ul li').text
The problem is, as a result I get a string with no spaces like
text1text2
I want it to be divided with <br>, like text1<br>text2<br>.
How do I do this?
From "Searching a XML/HTML Document":
methods xpath and css actually return a NodeSet, which acts very much like an array, and contains matching nodes from the document.
So, if you want to concatenate the texts of all <li> tags, treat the result of the css method as a collection:
page.css('ul li') # selects all li tags and returns collection of Node objects
.map(&:text) # maps collection of li nodes into array of corresponding texts
.join('<br>') # concatenates all nodes texts into a single string with <br> separator
See: http://ruby.bastardsbook.com/chapters/html-parsing/
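One detail worth noting: join puts the separator only between items, so the chain above yields text1<br>text2 rather than the text1<br>text2<br> shown in the question. A minimal sketch of the difference, using a plain array as a stand-in for page.css('ul li').map(&:text) so that it runs without Nokogiri (the variable names are illustrative, not from the original answer):

```ruby
# Stand-in for page.css('ul li').map(&:text); note the trailing space in "text2 ".
texts = ["text1", "text2 "]

# Separator between items only, with surrounding whitespace stripped.
separated = texts.map(&:strip).join("<br>")

# A <br> after every item, matching the output asked for in the question.
terminated = texts.map { |t| "#{t.strip}<br>" }.join

puts separated   # text1<br>text2
puts terminated  # text1<br>text2<br>
```

With a real NodeSet the same two chains apply after .map(&:text).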

Thymeleaf-Not able to append '/' in html

<span th:each="entry: ${productOptionDisplayValues}">
<span th:if="${entry?.key != 'OfferStatus'}">
<span th:if="${entry?.key=='color_'}" th:text="${entry.value}+ ' / '"/>
<span th:unless="${entry?.key=='color_'}" th:text="${entry.value}"/>
</span>
</span>
I have the span tag above in my HTML, where I am iterating over a map and printing its values.
At the end of the first if block, after printing the value, I want to append a '/', so I am appending + ' / ' there.
But it appears as '아이보리 T38': instead of '/', a 'T' is appended after the color variant.
It should appear as '아이보리 / 38'.
If you want Thymeleaf to respect your XHTML or HTML5 tags and not escape them, you will have to use a different attribute: th:utext (for "unescaped text").

How to apply additional inline style to html tags in ruby?

I have a html string. In that string I want to parse all <p> tags and apply additional inline style.
Additional Style: style="margin:0px;padding:0px;" or it could be something else
Case1:
input string: <p>some string</p>
output string: <p style="margin:0px;padding:0px;">some string</p>
Case2:
input string: <p style="text-align:right;" >some string</p>
output string: <p style="text-align:right;margin:0px;padding:0px;">some string</p>
Case3:
input string: <p align="justify">some string</p>
output string: <p style="margin:0px;padding:0px;" align="justify">some string</p>
Right now I am using regex like this
myHtmlString.gsub("<p", "<p style = \"margin:0px;padding:0px\"")
Which works fine except it removes previous styling. I am using Ruby (ROR).
I need help to tweak this a bit.
You can do this using Nokogiri, by setting [:style] on the relevant Nodes.
require "nokogiri"
inputs = [
  '<p>some string</p>',
  '<p style="text-align:right;" >some string</p>',
  '<p align="justify">some string</p>'
]
inputs.each do |input|
  noko = Nokogiri::HTML.fragment(input)
  noko.css("p").each do |tag|
    # Append to any existing style instead of overwriting it
    tag[:style] = (tag[:style] || "") + "margin:0px;padding:0px;"
  end
  puts noko.to_html
end
This will loop through all elements matching the css selector p, and set the style attribute like you want.
Output:
<p style="margin:0px;padding:0px;">some string</p>
<p style="text-align:right;margin:0px;padding:0px;">some string</p>
<p align="justify" style="margin:0px;padding:0px;">some string</p>
I recommend against using regex for this, as in general HTML can't be properly parsed by regex. That said, as long as your input data is consistent, regex will still work. You want to match whatever content is already in a p element's style attribute using parentheses, then insert it in the replacement. Use the block form of gsub, so that $2 refers to the current match rather than being interpolated before gsub runs:
myHtmlString.gsub(/<p( style="([^"]*)")?/) do
  "<p style=\"#{$2};margin:0px;padding:0px\""
end
Here's how the match pattern works:
/          #regex delimiter
<p         #match start of p tag
(          #open paren used to group, everything in this group gets saved in $1
 style="   #open style attribute
([^"]*)    #group contents of style attribute (everything up to the closing quote), gets saved to $2
"          #close style attribute
)?         #question mark makes everything in the paren group optional
/          #regex delimiter
I ended up doing something like this; I had to do it just before sending the email. I know this is not the best way to do it, but it is worth sharing here. The solutions given by @sgroves and @Dobert are really good and helpful.
But I didn't want to include Nokogiri, though I picked up the idea from the two solutions above. Thanks.
Here is my code (I am new to ROR, so nothing fancy here; I used it in a HAML block):
myString.gsub!(/<p[^>]*>/) do |match|
  match1 = match
  style1_arr = match1.scan(/style=".*"/)
  unless style1_arr.blank?
    style1 = style1_arr.first.sub("style=", "").gsub(/\"/, "").to_s
    style2 = style1 + "margin:0px;padding:0px;"
    match2 = match1.sub(/style=".*"/, "style=\"#{style2.to_s}\"")
  else
    match2 = match1.sub(/<p/, "<p style = \"margin:0px;padding:0px;\"")
  end
end
Now myString holds the updated string (notice the ! after gsub, which modifies it in place).
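The same block-based gsub idea can be sketched in plain Ruby, without ActiveSupport's blank?. This is a minimal sketch, not a robust HTML rewriter: add_p_style and EXTRA_STYLE are hypothetical names introduced for illustration, and it assumes style attributes use double quotes, as in the three cases from the question.

```ruby
# Hypothetical constant and helper (not from the original answers).
EXTRA_STYLE = "margin:0px;padding:0px;"

def add_p_style(html, extra = EXTRA_STYLE)
  # Match each <p ...> opening tag; \b keeps <pre> and <param> from matching.
  html.gsub(/<p\b[^>]*>/) do |tag|
    if tag =~ /style\s*=\s*"([^"]*)"/
      existing = Regexp.last_match(1)
      # Make sure the existing declarations end with a semicolon before appending.
      existing += ";" unless existing.empty? || existing.strip.end_with?(";")
      tag.sub(/style\s*=\s*"[^"]*"/) { %(style="#{existing}#{extra}") }
    else
      # No style attribute yet: add one right after the tag name.
      tag.sub(/<p\b/) { %(<p style="#{extra}") }
    end
  end
end

puts add_p_style('<p>some string</p>')
puts add_p_style('<p style="text-align:right;" >some string</p>')
puts add_p_style('<p align="justify">some string</p>')
```

Matching [^"]* instead of .* keeps the match inside a single attribute value, which matters when a tag carries more than one quoted attribute.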

How do I remove white space between HTML nodes?

I'm trying to remove whitespace from an HTML fragment between <p> tags
<p>Foo Bar</p> <p>bar bar bar</p> <p>bla</p>
as you can see, there always is a blank space between the <p> </p> tags.
The problem is that the blank spaces create <br> tags when saving the string into my database.
Methods like strip or gsub only remove the whitespace in the nodes, resulting in:
<p>FooBar</p> <p>barbarbar</p> <p>bla</p>
whereas I'd like to have:
<p>Foo Bar</p><p>bar bar bar</p><p>bla</p>
I'm using:
Nokogiri 1.5.6
Ruby 1.9.3
Rails
UPDATE:
Occasionally there are child nodes of the <p> tags that cause the same problem: white space between them.
Sample Code
Note: the code is normally on one line; I reformatted it because it would be unreadable otherwise...
<p>
<p>
<strong>Selling an Appartment</strong>
</p>
<ul>
<li>
<p>beautiful apartment!</p>
</li>
<li>
<p>near the train station</p>
</li>
.
.
.
</ul>
<ul>
<li>
<p>10 minutes away from a shopping mall </p>
</li>
<li>
<p>nice view</p>
</li>
</ul>
.
.
.
</p>
How would I strip those white spaces as well?
SOLUTION
It turns out that I messed up using the gsub method and didn't further investigate the possibility of using gsub with regex...
The simple solution was adding
data = data.gsub(/>\s+</, "><")
It deleted whitespace between all different kinds of nodes... Regex ftw!
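A minimal sketch of that substitution on a nested fragment like the one in the question. The helper name strip_inter_tag_whitespace and the sample string are illustrative assumptions; the regex is the one quoted above.

```ruby
# Hypothetical helper wrapping the gsub from the solution.
def strip_inter_tag_whitespace(html)
  # Remove whitespace only when it sits between a closing ">" and the next "<";
  # spaces inside text such as "Foo Bar" are left alone.
  html.gsub(/>\s+</, "><")
end

fragment = "<p>Foo Bar</p> <p>bar bar bar</p>\n<ul>\n  <li>\n    <p>nice view</p>\n  </li>\n</ul>"
puts strip_inter_tag_whitespace(fragment)
# <p>Foo Bar</p><p>bar bar bar</p><ul><li><p>nice view</p></li></ul>
```

Note that this also removes the space between inline elements, so <b>a</b> <i>b</i> would render as "ab".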
This is how I'd write the code:
require 'nokogiri'
doc = Nokogiri::HTML::DocumentFragment.parse(<<EOT)
<p>Foo Bar</p> <p>bar bar bar</p> <p>bla</p>
EOT
doc.search('p, ul, li').each { |node|
  next_node = node.next_sibling
  next_node.remove if next_node && next_node.text.strip == ''
}
puts doc.to_html
It results in:
<p>Foo Bar</p><p>bar bar bar</p><p>bla</p>
Breaking it down:
doc.search('p, ul, li')
looks for the <p>, <ul> and <li> nodes in the document. Nokogiri returns a NodeSet from search, which is empty if nothing matched. The code loops over the NodeSet, looking at each node in turn.
next_node = node.next_sibling
gets the pointer to the node following the current node.
next_node.remove if next_node && next_node.text.strip == ''
next_node.remove removes next_node from the DOM if it isn't nil and its text is empty when stripped, in other words, if the node contains only whitespace.
There are other techniques to locate only the TextNodes if all of them should be stripped from the document. That's risky, because it can end up deleting all blanks between tags, causing run-on sentences and joined words, which probably isn't what you want.
A first solution is to remove the empty text nodes. A quick way to do this for your exact case:
require 'nokogiri'
doc = Nokogiri::HTML("<p>Foo Bar</p> <p>bar bar bar</p> <p>bla</p>")
doc.css('body').first.children.map{|node| node.to_s.strip}.compact.join
This won't work for nested elements as-is, but it should give you a good starting point.
UPDATE:
You can actually optimise a little with:
require 'nokogiri'
doc = Nokogiri::HTML::DocumentFragment.parse("<p>Foo Bar</p> <p>bar bar bar</p> <p>bla</p>")
doc.children.map{|node| node.to_s.strip}.compact.join
Here is a way to handle the whitespace clean-up tasks you might be looking for (including Unicode whitespace) when processing parsed output.
html = "<p>A paragraph.<em> </em> <br><br><em>
</em></p><p><em> </em>
</p><p><em>
</em><strong><em>\" Quoted Text \" </em></strong></p>
<ul><li><p>List 1</p></li><li><p>List 2</p></li><li><p>List 3 </p>
<p><br></p><p><br><em> </em><br>
A text content.<br><em><br>
</em></p></li></ul>"
doc = Nokogiri::HTML.fragment(html)
doc.traverse { |node|
  # remove any node whose text is empty or whitespace-only (this includes <br>)
  node.remove if node.text.gsub(/[[:space:]]/, '') == ''
  # replace multiple consecutive spaces with a single space
  node.content = node.text.gsub(/[[:space:]]{2,}/, ' ') if node.text?
}
# Prints the HTML with whitespace-only nodes (including <br>) removed and runs of spaces collapsed
puts doc.to_html
# Gives text of html, concatenating li items with a space between them
# By default li items text are concatenated without the space
Nokogiri::HTML(doc.to_html).xpath('//text()').map(&:text).join(' ')
#Output
# "A paragraph. \" Quoted Text \" \n List 1 \n List 2 \n \n List 3 \n A text content. \n \n"
# To Remove newline character '\n'
Nokogiri::HTML(doc.to_html).xpath('//text()').map(&:text).join(' ').gsub(/\n+/,'')
#Output
# "A paragraph. \" Quoted Text \" List 1 List 2 List 3 A text content."
Note: if you parse a complete HTML document rather than a fragment, you might have to replace traverse with another method such as search.
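If you are on Rails, data.squish (from ActiveSupport) is a more readable way to collapse runs of whitespace. Note, however, that it leaves a single space between adjacent tags rather than removing it.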
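For comparison, here is a plain-Ruby sketch of what squish does, so it can be tried without loading ActiveSupport. squish_like is a hypothetical stand-in written for illustration, and the sample string is made up.

```ruby
# Hypothetical stand-in for ActiveSupport's String#squish, in plain Ruby.
def squish_like(str)
  # Collapse every run of whitespace (including Unicode spaces) into one
  # space, then trim both ends.
  str.gsub(/[[:space:]]+/, " ").strip
end

data = "<p>Foo   Bar</p>  \n <p>bla</p> "
puts squish_like(data)
# <p>Foo Bar</p> <p>bla</p>
puts squish_like(data).gsub(/>\s+</, "><")
# <p>Foo Bar</p><p>bla</p>
```

The first result still has one space between the two paragraphs, so the >\s+< substitution is still needed to join the tags.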
