Create a table in Prawn by its html representation - ruby-on-rails

Earlier versions of the Prawn gem allowed creating a table from its HTML representation (taking an HTML table string such as <table class="abc"> .... </table> as an input argument). I can't find this facility in the current manual.
So is it still possible? If not, is there any other option?

TL;DR: if your use case is 1) generating both HTML and PDF output (online invoices and the like), and 2) making sure both look the same, then Prawn is not really the best solution (the Prawn README makes the same suggestion).
In your case, you could parse the HTML using Nokogiri or Upton, extract the data from the HTML table, and then use it to generate the PDF representation via Prawn. The HTML styles may not translate directly into the ones used by Prawn, so even with a lot of code-wrangling you might not achieve consistent styling, which I assume, from the comments on the answer by royalGhost, is the result you want. Also, a simple Nokogiri parsing solution won't work if your HTML table is nested and the parsing code does not cater for that. For example, consider this:
<table>
<tr>
<td>First Column, First Row</td>
<td>Second Column, First Row</td>
</tr>
<tr>
<table>
<tr>
<td>First Column, Second Row</td>
<td>Second Column, Second Row</td>
<td>Third Column, Second Row</td>
</tr>
</table>
</tr>
</table>
Then, in the Ruby parsing snippet, you should ensure that the inner <table>...</table> is parsed into a Prawn::Table object and not a row of Prawn::Table::Cell objects.
Any wkhtmltopdf-based option, such as WickedPDF or PDFKit, offers a much cleaner way of converting HTML to PDF.
You have two options:
Ditch Prawn entirely and use one of the wkhtmltopdf-based tools above.
Keep Prawn, extract the data from the HTML via Nokogiri/Upton, generate the PDF from that data, and don't worry about the styling matching the HTML representation.
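The second option (extract the data, then hand it to Prawn) mostly comes down to the nested-table bookkeeping mentioned earlier. Here is a minimal sketch of that bookkeeping, written in Python with only the standard library so it can be run as-is; in a Rails app the same traversal would be written against Nokogiri. The names TableExtractor and extract_table are made up for this illustration, and the sketch assumes a nested table sits directly inside a row, as in the example above.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect <table> markup into nested lists: table -> rows -> cells."""

    def __init__(self):
        super().__init__()
        self.root = None   # outermost table, set once it is closed
        self.tables = []   # stack of tables that are currently open
        self.rows = []     # per open table: its open row, or None
        self.cell = None   # text pieces of the open <td>/<th>, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            table = []
            # A table opened inside a row becomes one nested element of that row.
            if self.tables and self.rows[-1] is not None:
                self.rows[-1].append(table)
            self.tables.append(table)
            self.rows.append(None)
        elif tag == "tr" and self.tables:
            row = []
            self.tables[-1].append(row)
            self.rows[-1] = row
        elif tag in ("td", "th") and self.tables and self.rows[-1] is not None:
            self.cell = []

    def handle_data(self, data):
        if self.cell is not None:
            self.cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell is not None:
            self.rows[-1].append("".join(self.cell).strip())
            self.cell = None
        elif tag == "tr" and self.tables:
            self.rows[-1] = None
            self.cell = None
        elif tag == "table" and self.tables:
            finished = self.tables.pop()
            self.rows.pop()
            if not self.tables:
                self.root = finished


def extract_table(html):
    """Return the outermost table as a list of rows (None if there is no table)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.root
```

For the nested example above, the first row comes back as a flat list of two strings and the second row as a list holding one nested table (itself a list of rows). As far as I know Prawn's table method accepts an array inside a row and renders it as a subtable, but check that against the Prawn::Table manual before relying on it.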

You can also use the prawnto gem, which provides templates for building tables with Prawn.
For example, the following template draws a table with 3 columns of widths x, y and z:
data = [ ["Column 1", "Column 2", "Column 3"] ]
table(data, :column_widths => [x,y,z], :cell_style => { :inline_format => true })

Related

Shopware 5 - Looping through all attributes

Hello, I need some help with Shopware.
My question is probably pretty basic, but I cannot get it done. I want to print out attributes of an article. The Shopware documentation calls them {$sArticle.attr1} to {$sArticle.attr20}, but they can also have different names, so I cannot refer to them directly by name. I also want only a few of the attributes to be printed.
So far I know that all attributes are stored in the s_articles_attributes database table, and I only want to print those columns whose name contains 'artikelattribut_'.
The code is going to go into a table in frontend/detail/tabs --> description.tpl.
The existing table already uses $sArticle.sProperties, and the code looks like this:
{if $sArticle.sProperties}
<div class="product--properties panel has--border">
<table class="product--properties-table">
{foreach $sArticle.sProperties as $sProperty}
<tr class="product--properties-row">
{* Property label *}
{block name='frontend_detail_description_properties_label'}
<td class="product--properties-label is--bold">{$sProperty.name|escape}:</td>
{/block}
{* Property content *}
{block name='frontend_detail_description_properties_content'}
<td class="product--properties-value">{$sProperty.value|escape}</td>
{/block}
</tr>
{/foreach}
</table>
</div>
{/if}
The thing is that $sArticle.sProperties and {$sArticle.attr1} to {$sArticle.attr20} are different. All I want is a second {foreach} that loops through all article attributes. Maybe this pseudocode makes the idea clearer:
{foreach $sArticle.attr FROM s_articles_attributes WHERE name contains='artikelattribut_'}
I hope somebody understands my problem. Thankful for any advice.
Thanks
First, keep in mind that "properties" and "attributes" mean something different in Shopware from what you might know from other shop systems.
"Properties" are used for characteristics of a product, like its taste or colour.
"Attributes" in Shopware have nothing to do with attributes in the usual sense. You can find those *_attributes tables for almost every entity, and they work more like custom fields or columns that you can add to an entity to extend it with custom data.
Now back to your problem. Try this:
{foreach $sArticle.attributes.core->toArray() as $attributeName => $attribute}
{$attributeName|var_dump}
{$attribute|var_dump}
{/foreach}
There are two ways to access the attributes of a product.
All attributes are directly assigned to the $sArticle variable and you can use them, as you already described in your text.
Attributes are also stored in $sArticle.attributes where you can find different types of attributes. By default those are core and marketing for products on the detail page. Be aware that the values of those keys are objects of type Shopware\Bundle\StoreFrontBundle\Struct\Attribute. That's why we need to call the toArray method, to get an array which we can iterate.

How to access an element with two classes divided by space and drop all the others (nokogiri, ruby on rails)

I'm writing a parser and want to select only the element whose class is exactly "row1 processed"
<tbody class = "processed"> some data1 </tbody>
<tbody class = "row1 props processed"> some data2 </tbody>
<tbody class = "row1 processed"> some data3 </tbody>
via the nokogiri gem.
I can select by row1, processed, or props individually, but I need only "row1 processed":
test = el.css('tbody.row1')
test = el.css('tbody.processed')
How can I do this?
I'm using Ruby on Rails 5.2.2.
Update
When I typed el.css('tbody.row1.props') it displayed information from this element:
<tbody class = "row1 props processed"> some data2 </tbody>
but when I added the "processed" class I got nothing...
Separate multiple classes with dots:
el.css('tbody.row1.processed')
As mentioned on the Ruby developers Slack channel, the underlying issue here is that Nokogiri can only see the HTML as initially loaded (what you see when you click View Source), before JavaScript has modified it. The processed class has not been added at that point, which is why a selector that includes it finds nothing. The other answer here works if the class is already in the HTML on page load.
If you need the page as it looks after JavaScript has modified it, you have two choices: either use JavaScript to access the newly modified DOM elements, or rethink how you parse the web page to get what you want.
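One more detail, for the case where the classes really are in the loaded HTML: by normal CSS rules a selector like tbody.row1.processed matches every tbody that carries both classes, so it would also pick up the "row1 props processed" element from the question. Standard CSS such as tbody.row1.processed:not(.props) or the attribute form tbody[class="row1 processed"] should express the stricter case in Nokogiri. The difference between "has both classes" and "has exactly these classes" can be sketched like this in plain Python (standard library only; TbodyCollector and select_tbody are hypothetical names for this illustration, not part of any library):

```python
from html.parser import HTMLParser


class TbodyCollector(HTMLParser):
    """Record the class set and text of every <tbody> element."""

    def __init__(self):
        super().__init__()
        self.found = []        # (set of class names, stripped text) per tbody
        self._classes = None   # classes of the tbody being read, else None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self._classes = set((dict(attrs).get("class") or "").split())
            self._text = []

    def handle_data(self, data):
        if self._classes is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "tbody" and self._classes is not None:
            self.found.append((self._classes, "".join(self._text).strip()))
            self._classes = None


def select_tbody(html, wanted, exact=False):
    """Return the text of each tbody whose classes include (or equal) wanted."""
    collector = TbodyCollector()
    collector.feed(html)
    collector.close()
    wanted = set(wanted)
    if exact:
        # Same idea as requiring the class list to be these names and nothing else.
        return [text for classes, text in collector.found if classes == wanted]
    # Same idea as the CSS selector tbody.row1.processed: a subset test.
    return [text for classes, text in collector.found if wanted <= classes]
```

With the three tbody elements from the question, the subset test returns both "some data2" and "some data3", while exact=True returns only "some data3".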

Can axlsx convert html format to work in excel?

My DB stores text from a WYSIWYG editor that looks something like this:
<p><s>Hi!</s></p>
<p>My Name is Bob's.</p>
<p> </p>
<p>I like to eat these things:</p>
<ul>
<li>Candy</li>
<li>Veggies</li>
<li>Everything</li>
</ul>
<p>Enjoy<sup>2</sup></p>
In my view I have something like:
sheet.add_row [@event.text], style: font_format
where @event.text is the above HTML.
Is there a way to make this formatting work in excel using axlsx?
I don't think there is any automatic conversion of HTML to styles; you'd have to write one yourself. I would use the rich text example as a guide.
I believe it handles any normal Axlsx style on a chunk of text. At the very least it handles bold, italic, and strikethrough.
For a forced line feed (the breaks between paragraphs), use "\x0A".
But that means you'll have to parse the HTML.
Yes, to_spreadsheet is just what you need. I have just finished a Rails app that uses it to generate xlsx files for download. I followed the instructions and created a view 'show.xlsx.erb' in the view directory, and it was done!

How to display some html text from a database in a VIew within a TD cell, but only first 300 chars.....?

Not sure if this is possible, but I have some HTML in a DB column which I want to display in a Table TD cell in a Razor View. However the issue is that I only want the first 300 chars followed by "..."
ie:
<h2>My Test</h2>
<p>My Test description is very long</p>
So if I return the first 25 chars for the purpose of this question plus "...", I would get:
<h2>My Test</h2>
<p>My Tes ...
Which would then upset the containing page, due to the invalid HTML
ie
<table>
<tr>
<td>
<h2>My Test</h2>
<p>My Tes ...
</td>
</tr>
</table>
Is there a way round this?
At the moment I am using:
@Html.Raw(Model.myTestHtml)
to display the test HTML.
Perhaps I could strip just the text out of the HTML and then .substring that.
Thanks, any help appreciated.
If the HTML is not dynamic and always follows the same pattern, you may:
Parse the HTML into XML content
Use LINQ2XML to find the node you want
Edit that text and replace the excess with (...)
Convert it back to HTML
Render it
LINQ2XML is very reliable. I am not sure you can find a library that does the same work with the same accuracy and performance, but if you do, you would not need the conversion steps (to XML and back).
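The general idea (walk the markup, count only the text, stop at the limit, then close whatever is still open) does not depend on LINQ2XML. Below is a minimal sketch of it in Python with only the standard library; the question is about Razor/C#, so this would need porting, and truncate_html with its parameters is a hypothetical name invented for this illustration. It counts every text character, including whitespace between tags, and it normalizes the markup slightly (attributes are re-quoted).

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class _Truncator(HTMLParser):
    def __init__(self, limit, ellipsis):
        super().__init__(convert_charrefs=True)
        self.remaining = limit   # text characters still allowed
        self.ellipsis = ellipsis
        self.out = []            # pieces of the output markup
        self.open_tags = []      # stack of elements not yet closed
        self.done = False        # True once the limit has been hit

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        rendered = "".join(f' {k}="{escape(v or "")}"' for k, v in attrs)
        self.out.append(f"<{tag}{rendered}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close elements until the matching one has been closed.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        if len(data) <= self.remaining:
            self.out.append(escape(data, quote=False))
            self.remaining -= len(data)
            return
        # Limit reached inside this text run: cut it and stop emitting.
        self.out.append(escape(data[:self.remaining], quote=False) + self.ellipsis)
        self.done = True


def truncate_html(html, limit, ellipsis="..."):
    """Keep at most `limit` text characters and close any tags left open."""
    truncator = _Truncator(limit, ellipsis)
    truncator.feed(html)
    truncator.close()
    closing = "".join(f"</{tag}>" for tag in reversed(truncator.open_tags))
    return "".join(truncator.out) + closing
```

For example, truncate_html('<h2>My Test</h2><p>My Test description is very long</p>', 10) gives '<h2>My Test</h2><p>My ...</p>', which stays valid inside the surrounding table cell.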

Extract text from specific HTML location across multiple pages

I have been experimenting with Jericho HTML Parser and Selenium IDE for the purpose of extracting text from a specific location inside HTML across multiple pages.
I have not found a simple example of how to do this and I don't know java.
For every HTML page in a folder, I would like to find whatever text is in the 1st table, 4th row, 1st div:
<table>
<tr class="abc"><td class="xyz"><div align="center">The Text I don't want</div></td></tr>
<tr class="abc"><td class="xyz"><div align="center">The Text I don't want</div></td></tr>
<tr class="abc"><td class="xyz"><div align="center">The Text I don't want</div></td></tr>
<tr class="abc"><td class="xyz"><div align="center">The Text I want</div></td></tr>
</table>
And print the selected text to a txt file in a list like this:
The Text I want
Another Text I want
All the source files are stored locally and may contain bad HTML, so figured Jericho might be best for this purpose. However I'm happy to learn any method to achieve the desired result.
Well in the end I went with beautifulsoup and used a python script with something like this:
from bs4 import BeautifulSoup

# html_pathname and result_file are set up elsewhere in the script
# open source html file
with open(html_pathname, 'r') as html_file:
    # parse the file and walk the tag tree with BeautifulSoup
    soup = BeautifulSoup(html_file)
    # find according to your criteria "1st table, 6th tr, 1st td, 1st div"
    # (the sibling search skips the first tr, so index 4 is the 6th row)
    div = soup.html.body.table.tr.find_next_siblings('tr')[4].td.div
    # write found text to result txt
    print(' - writing to result txt')
    result_file.write(div.get_text() + '\n')
    print(' - ok!')
