Traverse HTML with no CSS class using Nokogiri? - ruby-on-rails

I've got the following HTML:
<table width="100%" border="0" cellpadding="6" cellspacing="1">
<tbody>
<tr>
<td bgcolor="#ffd204" width="40%" nowrap=""><b>Tracking Number:</b></td>
<td bgcolor="#ffffff" width="60%" nowrap="">C123456789012345</td>
</tr>
<!-- ...there could be additional table rows here... -->
<tr>
<td bgcolor="#ffd204" width="40%" nowrap=""><b>Deliver To:</b></td>
<td bgcolor="#ffffff" width="60%" nowrap="">ANYWHERE, NY</td>
</tr>
</tbody>
</table>
Say, for instance, I need to pull the ANYWHERE, NY data. How would I do that using Nokogiri? Or is there something better for traversing this sort of markup, where there aren't any CSS classes or ids to select on?

Since we don't have a CSS class, id attribute, or other semantic markup to use, we instead look for something that is likely to not change in this document to anchor our search to. In this case, I suspect that the "Deliver To:" label will always come right before the td we want. So:
require 'nokogiri'
html = # Fetch either from http via open-uri's open() or from file via IO.read()
doc = Nokogiri.HTML(html)
delivery = doc.at_xpath '//td[preceding-sibling::td[b="Deliver To:"]]/text()'
p delivery.content
#=> "ANYWHERE, NY"
That XPath expression says:
// — at any level,
td — find me an element named td
[…] — but only if…
preceding-sibling:: — it has a preceding sibling
td — that is an element named td
[…] — but only if…
b — it has a child element named b
="Deliver To:" — whose text content equals this string
/text() — and then find me the child text node(s) of that td.
Because we used at_xpath instead of xpath, Nokogiri returns the first matching node it can find (which in this case happens to be the only child text node of that td) instead of a NodeSet, an array-like collection of nodes.
In case that <td> can have markup, such as <td…>ANYWHERE,<br>NY</td> you can modify the expression to omit the trailing /text() (so that you select only the <td> itself) and then use the text method to fetch the combined visible text inside there.

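Given that you don't mind some preprocessing, you could do:
require 'nokogiri'
require 'open-uri'
lookup = {}
# URI.open comes from open-uri; a bare open("http://...") no longer fetches URLs on Ruby 3+
c = Nokogiri::HTML(URI.open("http://..."))
c.search("tr").each do |tr|
  cells = tr.search("td")
  # key: label cell text without the colon; value: text of the last cell
  lookup[cells.first.text.gsub(':', '')] = cells.last.text
end
puts lookup["Tracking Number"]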
I didn't test that code so there might be some syntax issues.

Related

insert line feed inside mat-table cell or replace html tags in text

Our old website pulled text from a database, and that text contained HTML tags that formatted it on the page. Let me explain.
When I queried the database for the INFO field, the text was:
"List of available city:<br>1-Boston<br>2-Washington<br>3-Miami... etc"
In the old site, my table looked like this:
<table>...
<td><%= INFO %> </td>
So the site rendered the HTML inside the TD and displayed the list of cities in the table cell like this:
List of available city:
1-Boston
2-Washington
3-Miami
Now we're using Angular and Mat-Table to create a new site. The information comes from the same database, so the text is already formatted with some HTML tags.
If I use this code:
<table>
<tr *ngFor="let rule of element.Reglements">
<td class="line_rules">{{ rule.Reglement }}</td>
</tr>
</table>
Unfortunately, the cell shows the raw text, including the HTML tags, like this:
"List of available city:<br>1-Boston<br>2-Washington<br>3-Miami... etc"
Question: How can I make my table cell render the HTML tags?
Do I need to replace the HTML tags with something else? (I can modify the database directly, or loop through the fields and replace the tags before passing the data to the table.)
Is there a way to insert a line feed in a text cell?
Thanks
I found something to do this.
My code was:
<table>
<tr *ngFor="let rule of element.Reglements">
<td class="line_rules">{{ rule.Reglement }}</td>
</tr>
</table>
I do it like this:
<table>
<tr *ngFor="let rule of element.Reglements">
<td class="line_rules" [innerHTML]="rule.Reglement"></td>
</tr>
</table>

Convert thymeleaf to freemarker

Please help, I can't find in the FreeMarker guide how to convert this from Thymeleaf:
lists.isEmpty and for each
<th:block th:if="${#lists.isEmpty(employees)}">
<h3>No employee</h3>
</th:block>
<th:block th:unless="${#lists.isEmpty(employees)}">
<tr th:each="contact,iterStat : ${employees}">
<td th:text="${iterStat.count}"></td>
<td th:text="${contact.name}"></td>
<td th:text="${contact.phone}"></td>
Thanks!
Maybe something like this? (sketch, not tested)
<#list employees as contact>
<tr>
<td>${contact?index}</td>
<td>${contact.name}</td>
<td>${contact.phone}</td>
</tr>
<#else>
<h3>No employee</h3>
</#list>
Notes
<#list> will generate a <tr> element for each item in the employees sequence, containing a <td> for each field.
If the employees sequence is empty, it will generate the <h3> element instead.
See List Directive Doc
It gets the zero-based index of the item using the built-in ?index. See built-ins and loop variables in the help (Freemarker built-ins Doc). If you want a one-based index, as Thymeleaf's iterStat.count gives you, add one to it.
It works:
<#list employees as contact>
<tr>
<td>${contact?index}</td>
<td>${contact.name}</td>
<td>${contact.phone}</td>
</tr>
<#else>
<h3>No employee</h3>
</#list>

Capybara: Pulling in multiple Radio button options

I'm running into an issue, and I think it is with how my page.all is pulling in the radio button questions.
So here is the HTML for the table itself (multiple questions with 5 radio button choices apiece):
<table class="table table-striped table-stuff table-collapsible">
<colgroup>
<thead>
<tbody>
<input id="0_answer_question_id" value="9966" name="response[answers][0][answer_id]" type="hidden">
<tr>
<td class="heading">
<td class="option">
<div class="radio-inline radio-inline--empty">
<input id="question_1_1" value="1" name="response[answers_attributes][0][answer_opinion]" type="radio">
<label for="question_1_1">Strongly Disagree</label>
</div>
</td>
<td class="option">
<td class="option">
<td class="option">
<td class="option">
</tr>
<input id="response_1_question_id" value="9966" name="response[answers_attributes][1][answer_question_id]" type="hidden">
<tr>
<input id="response_1_id" value="<a number>" name="response[answers_attributes][1][id]" type="hidden">
<Same as above repeated 5 times with numbers changed>
</tbody>
</table>
I'm using:
page.all('table.table-stuff tbody tr', minimum: 6).each do |row|
row.all("td label").sample.trigger('click')
end
to get each row and select one option from it. However, I notice that sometimes a row ends up with nothing selected. My theory is that the "heading" cell (which has a <label> itself) is receiving one of the clicks. From my understanding of how page.all works, it grabs every tbody tr within the table, so row.all("td label") may be picking up the heading's label too.
Also, when a table has a class attribute like table table-striped table-stuff table-collapsible, how can you tell what the actual table "name" is, i.e. what to put in page.all('table.<etc>')? (I didn't write this website; I'm just writing tests for it.)
If the heading td (it's not expanded in your example) also contains a label element (so it would be included in the results of your all call) then you just need to change the CSS selector so it wouldn't be included - something like
row.all("td.option label").sample.trigger('click') # only choose labels contained in tds with the class of 'option'
or
row.all("td:not(.heading) label").sample.trigger('click') # choose labels contained in tds without the class of 'heading'
On your second question about table names, I don't really understand what you're asking. Tables don't have name attributes. They can have an id attribute or a caption containing some text, either of which can be used to find them with Capybara via find(:table, 'id or caption text') or within_table('id or caption text') { code to execute within scope of the table }. What you seem to be talking about instead are the classes on the element, each of which is written in a CSS selector with a leading '.'. A CSS selector matching a table element with all the classes you listed would therefore be 'table.table.table-striped.table-stuff.table-collapsible'.
Note: If you're sure there are always exactly 5 choices, you could add the :count option to your all call to make sure your selector finds only those items:
row.all("td.option label", count: 5).sample.trigger('click')

parsing the id value by giving a known td?

With the help of FirePath, I got this:
.//*[@id='#table-row-51535240d7037e70b9000062']/td[1]
Part of my HTML looks like this:
<table class="table table-bordered table-striped">
<tbody>
<tr>
<tr>
<tr id="#table-row-51535240d7037e70b9000062"> #this is the id that i want to get
<td> 54 </td> #this is the td that i know
<td>
<td>
<td>Open</td>
<td/>
What I really want to do here is, given the td value (54), get the id of its row (parse the id). Any hints on how I can achieve that?
Thanks in advance.
PS: sorry for my English, and for my lack of knowledge :)
First of all, your HTML is invalid (because it contains nested <tr> nodes). Nokogiri may be able to parse it, but you should fix it first if you can.
You can fetch that id by the following ruby code:
doc.at_xpath("//td[contains(text(), '54')]/..")['id']
//td[contains(text(), '54')] will grab all the <td> nodes which contain 54, /.. will go to their parents.
Document#at_xpath will fetch only the first matching item
['id'] will get the attribute of the matching node.
Using jquery
$(function(){
// (i dont know if you have id for that td or not, it will be more easy if u do have id for that td)
console.log($('table tbody tr td:first').closest('tr').attr('id')); // you can remove :first if you want to.
});
Oops, I misread your question, and one more thing, there is a problem in your tr tag.

Struts2 tags: get a request attribute whose name is itself a value of another request attribute

I have a list of table names in a request attribute, "BillSummaryTables". I am iterating through the list and I want to use each table name to get a request attribute for that particular table name. Corresponding to each table name I have another list in request attribute and I want to iterate through that.
This is what I am doing.
<s:iterator value='#request.BillSummaryTables' var="tableName" status="itStatus">
<div class="contentbox" role="content">
<table class="rpt">
<s:iterator value="#request.get('%{#tableName}').getData()" var="ocRow" status="itStatus">
<tr style="border:1px solid #CCCCCC">
<s:iterator value='#ocRow' var="cell" status="itStatus2">
<td>
<s:property value="#cell.getValue()"/>
</td>
</s:iterator>
</tr>
</s:iterator>
<tr>
<s:iterator value="#request.get('%{#tableName}').getData()" var="ocTotal">
<td>
<s:property value="#ocTotal"/>
</td>
</s:iterator>
</tr>
</table>
</div>
</s:iterator>
I have also tried
#request[<s:property value="#tableName" />].getData()
and
#request['<s:property value="#tableName" />'].getData()
and
#request.%{#tableName}
But nothing is returned in any case.
However, this code works fine if I hard code the values.
i.e. if i use: #request['other_charges'].getData()
Note: I am able to retrieve the list of tableName (#request.BillSummaryTables).
#1) You are using three nested iterators, but both the first and the second have an instance of IteratorStatus called itStatus; they must have different names to work.
#2) If the object stored under each table name is itself a List, then you should iterate over that list directly, not over getData() (what is that?).
#3) Why use the request at all? Why not simply use a HashMap on the Action (with a getter), adding elements dynamically with the table names as keys?
#4) This #request[<s:property value="#tableName" />].getData() will obviously not work if put inside another Struts2 tag, like an Iterator (you cannot nest Struts2 tags inside tag attributes).
However, try something like this (I stripped out the second iterator; get this running first, then add the rest back), and see if it works (and what it prints):
<s:iterator value='#request.BillSummaryTables' var="tableName" status="statusAllTables">
<div class="contentbox" role="content">
<br/>==== START DEBUG ====
<br/>Current table name: [<s:property value="#tableName"/>]
<br/>Corresponding request object: [<s:property value="#request['%{#tableName}']"/>]
<br/>getData on that object: [<s:property value="#request['%{#tableName}'].getData()"/>]
<br/>===== END DEBUG =====
<table class="rpt">
<s:iterator value="#request['%{#tableName}'].getData()" var="ocRow" status="statusThisTable">
<tr style="border:1px solid #CCCCCC">
<s:iterator value='#ocRow' var="cell" status="statusThisField">
<td>
<s:property value="#cell.getValue()"/>
</td>
</s:iterator>
</tr>
</s:iterator>
</table>
</div>
</s:iterator>
EDIT
OK, but then why are you using request.setAttribute? Actions are created per request... just use a private List<MyObject> myObjects with its getter (public List<MyObject> getMyObjects()), and call it from the JSP with <s:iterator value="myObjects"> (in your case, <s:iterator value="myObjects.data">).
Please note that .getData() in OGNL should become .data (I didn't notice it before): remove the get, lowercase the first letter of the method name, and drop the parentheses.
Retry and let us know.
