Generate table of contents like on Wikipedia, without JavaScript - ruby-on-rails

I have a page that is formatted like so:
<h1>Header</h1>
<h2>Subheader</h2>
<h3>Subsubheader</h3>
<h1>Another header</h1>
Is it possible to server-side generate a table of contents / outline at the start of the page, like Wikipedia does in its articles? I use Ruby on Rails.
EDIT: WITHOUT JavaScript!

I created a class for this purpose today. It depends on http://www.nokogiri.org/, but that gem comes with Rails already.
Put this in app/models/toc.rb:
class Toc
attr_accessor :html
TOC_CLASS = "toc".freeze
TOC_ELEMENT = "p".freeze
TOC_ITEMS = "h1 | h2 | h3 | h4 | h5".freeze
UNIQUEABLE_ELEMENTS = "h1 | h2 | h3 | h4 | h5 | p".freeze
def initialize(content)
@html = Nokogiri::HTML.fragment content
end
def generate
clear
set_uniq_ids
toc = create_container
html.xpath(TOC_ITEMS).each { |node| toc << toc_item_tag(node) }
html.prepend_child toc
return html.to_s
end
private
def clear
html.search(".#{TOC_CLASS}").remove
end
def set_uniq_ids
html.xpath(UNIQUEABLE_ELEMENTS).
each { |node| node["id"] = rand_id }
end
def rand_id
(0...8).map { ('a'..'z').to_a[rand(26)] }.join
end
def create_container
toc = Nokogiri::XML::Node.new TOC_ELEMENT, html
toc["class"] = TOC_CLASS
return toc
end
def toc_item_tag(node)
"<a data-turbolinks='false' class=\"toc-link toc-link-#{node.name}\" href=\"##{node["id"]}\">#{node.text}</a>"
end
end
Use it like
toc = Toc.new article.body
body_with_toc = toc.generate
article.update body: body_with_toc

You need to generate a data source from your hierarchy, something like this:
@toc = [ ['header', 0], ['subheader', 1], ['subsubheader', 2],
['header2', 0], ['header3', 0], ['subheader2', 1]
]
Then it is easy to render it in a template, for example:
<%- @toc.each do |item, distance| %>
<%= ('&nbsp;' * distance * 5).html_safe %>
<%= item %>
<br/>
<%- end %>
Would give you:
header
     subheader
          subsubheader
header2
header3
     subheader2
Of course you can also use 'distance' to determine the style (for example the font size) instead of the indentation depth, but I hope you get the main idea.
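One way to build such a list of [title, depth] pairs from the page itself is sketched below. This is only an illustration: the helper name toc_entries is hypothetical, and it uses a plain regular expression rather than an HTML parser, so it assumes well-formed h1 to h6 tags with no nested markup inside them.

```ruby
# Sketch: derive [title, depth] pairs from heading tags in an HTML string.
# `toc_entries` is a hypothetical helper name, not part of Rails.
# Assumes simple, well-formed <h1>..<h6> tags without nested markup.
def toc_entries(html)
  html.scan(%r{<h([1-6])[^>]*>(.*?)</h\1>}m).map do |level, title|
    # h1 becomes depth 0, h2 depth 1, and so on.
    [title.strip, level.to_i - 1]
  end
end

html = "<h1>Header</h1>\n<h2>Subheader</h2>\n<h3>Subsubheader</h3>\n<h1>Another header</h1>"
toc = toc_entries(html)
toc.each { |title, depth| puts "#{'  ' * depth}#{title}" }
```

For real-world content a parser such as Nokogiri is the more robust choice; the regular expression only keeps the sketch dependency-free.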

Yes, it is possible. You don't really need Rails for this; you can also use JavaScript to generate a table of contents.
Here is an example library that you can use:
http://www.kryogenix.org/code/browser/generated-toc/
Alternatively, you could create your anchor links as you loop through the elements in your Rails ERB/Haml views.

Related

Nokogiri displaying data in view

I'm trying to figure out how to display the text and images I have scraped in my application's HTML.
Here is my app/scrape2.rb file
require 'nokogiri'
require 'open-uri'
url = "https://marketplace.asos.com/boutiques/independent-label"
doc = Nokogiri::HTML(open(url))
label = doc.css('#boutiqueList')
@label = label.css('#boutiqueList img').map { |l| p l.attr('src') }
@title = label.css("#boutiqueList .notranslate").map { |o| p o.text }
Here is the controller:
class PagesController < ApplicationController
def about
#used to change the routing to /about
end
def index
@label = label.css('#boutiqueList img').map { |l| p l.attr('src') }
@title = label.css("#boutiqueList .notranslate").map { |o| p o.text }
end
end
And finally, the label.html.erb page:
<% @label.each do |image| %>
<%= image_tag image %>
<% end %>
Do I need some other method? Am I not storing the arrays properly?
Your controller needs to load the data itself, or somehow pull the data from scrape2.rb. Controllers do not have access to other files unless specified (include, extend, etc).
require 'nokogiri'
require 'open-uri'
class PagesController < ApplicationController
def index
# Call these in your controller:
url = "https://marketplace.asos.com/boutiques/independent-label"
doc = Nokogiri::HTML(open(url))
label = doc.css('#boutiqueList')
@label = label.css('#boutiqueList img').map { |l| p l.attr('src') }
@title = label.css("#boutiqueList .notranslate").map { |o| p o.text }
end
end
You're not parsing the data correctly.
label = doc.css('#boutiqueList')
should be:
label = doc.at('#boutiqueList')
#boutiqueList is an ID, of which only one can exist in a document at a time. css returns a NodeSet, which is like an Array, but you really want to point to the Node itself, which is what at would do. at is equivalent to search('...').first.
Then you use:
label.css('#boutiqueList img')
which is also wrong. label is supposed to already point to the node containing #boutiqueList, but then you want Nokogiri to look inside that node and find additional nodes with id="boutiqueList" and that contain <img> tags. But, again, because #boutiqueList is an ID and it can't occur more than once in the document, Nokogiri can't find any nodes:
label.css('#boutiqueList img').size # => 0
whereas using label.css correctly finds <img> nodes:
label.css('img').size # => 48
Then you use map to print out values, but map is used to modify the contents of an Array as it iterates over it. p will return the value it outputs, but it's bad form to rely on the returned value of p in a map. Instead you should map to convert the values, then puts the result if you need to see it:
@label = label.css('#boutiqueList img').map { |l| l.attr('src') }
puts @label
Instead of using attr('src'), I'd write the first line as:
@label = label.css('img').map { |l| l['src'] }
The same is true of:
@title = label.css("#boutiqueList .notranslate").map { |o| p o.text }

Render a template inside wysiwyg text

I have a @page.content that is stored in the database as a text column. Is there an easy way to embed a render tag inside that HTML content?
<div>Lorem ipsum</div>
<%= render 'image_slider' %>
<div>Lorem ipsum</div>
I chose the Nokogiri way and ended up with two ugly helpers:
def print_content_start( page, shift=4 )
result = ''
doc = Nokogiri::HTML( page.content )
doc.css('div,p').each_with_index do |node, i|
break if i == shift
result += node.to_s
end
result
end
def print_content_end( page, shift=4 )
result = ''
doc = Nokogiri::HTML( page.content )
doc.css('div,p').drop( shift ).each do |node|
result += node.to_s
end
result
end
If anyone knows a better way, please let me know!

Write simple rails code better

I'm a newbie with Rails.
In my form I get a string like "123, xxx_new item, 132, xxx_test ".
If an item starts with "xxx_", it means that I should add the item to the DB; otherwise I use the value as it is.
This is my code, and I'm sure there is a better way to write it:
tags = params[:station][:tag_ids].split(",")
params[:station][:tag_ids] = []
tags.each do |tag|
if tag[0,4] =="xxx_"
params[:station][:tag_ids] << Tag.create(:name => tag.gsub('xxx_', '')).id
else
params[:station][:tag_ids]<< tag
end
end
I'm looking for how to improve my code syntax
What about:
tags = params[:station][:tag_ids].split(',')
params[:station][:tag_ids] = tags.each_with_object([]) do |tag, array|
# Parentheses are required: << binds more tightly than the ternary operator
array << (tag.start_with?('xxx_') ? Tag.create(name: tag[4..-1]).id : tag)
end
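Note that the sample input has spaces after the commas, so an item such as " xxx_new item" does not actually start with "xxx_" until it is stripped. A minimal sketch of the same idea with stripping added follows; resolve_tag_ids and the create_tag callable are hypothetical names, and the lambda only stands in for Tag.create(name: ...).id so that the example runs without a database.

```ruby
# Sketch: split the raw string, strip whitespace, and create tags for "xxx_" items.
# `resolve_tag_ids` and `create_tag` are hypothetical names for illustration.
def resolve_tag_ids(raw, create_tag)
  raw.split(',').map(&:strip).reject(&:empty?).map do |tag|
    # New items are created and replaced by their id; existing ids pass through.
    tag.start_with?('xxx_') ? create_tag.call(tag.delete_prefix('xxx_')) : tag
  end
end

# Stand-in for Tag.create(name: name).id, so the sketch needs no database.
created = []
fake_create = ->(name) { created << name; 1000 + created.size }

ids = resolve_tag_ids("123, xxx_new item, 132, xxx_test ", fake_create)
puts ids.inspect
```

In a controller, the callable could be something like ->(name) { Tag.create(name: name).id }, assuming a Tag model with a name column as in the question.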

Prawn: Table of content with page numbers

I need to create a table of contents with Prawn. I have add_dest function calls in my code and the right links in the table of contents:
add_dest('Komplett', dest_fit(page_count - 1))
and
text "* <link anchor='Komplett'> Vollstaendiges Mitgliederverzeichnis </link>", :inline_format => true
This works and I get clickable links which forward me to the right pages. However, I need to have page numbers in the table of content. How do I get it printed out?
I would suggest a much simpler solution.
Use pdf.page_number to store the page number of all your sections in a hash as you populate the pages
In the code, output the table of contents after populating the rest of your pages. Insert the TOC into the document in the right spot by navigating within the PDF using pdf.go_to_page(page_num).
For example:
render "pdf/frontpage", p: p
toc.merge!(p.page_number => "Section_Title")
p.start_new_page
toc.merge!(p.page_number => "Section_Title")
render "pdf/calendar"
p.start_new_page
toc.merge!(p.page_number => "Section_Title")
render "pdf/another_section"
p.go_to_page(1)
p.start_new_page
toc.merge!(p.page_number => "Table of Contents")
render "pdf/table_of_contents", table_of_contents: toc
You should read the chapter on Outline in this document: http://prawn.majesticseacreature.com/manual.pdf, p. 96. It explains, with examples, how to create a TOC.
UPDATE
destinations, page_references = {}, {}
page_count.downto(1).each {|num| page_references[num] = state.store.object_id_for_page(num)}
dests.data.to_hash.each_value do |values|
values.each do |value|
value_array = value.to_s.split(":")
dest_name = value_array[0]
dest_id = value_array[1].split[0]
destinations[dest_name] = Integer(dest_id)
end
end
state.store.each do |reference|
if !(dest_name = destinations.key(reference.identifier)).nil?
puts "Destination - #{dest_name} is on Page #{page_references.key(Integer(reference.data[0].to_s.split[0]))}"
end
end
I also needed to create a dynamic TOC. I put together a quick spike that needs some clean-up but does pretty much what I want. I didn't include click-able links but they could easily be added. The example also assumes the TOC is being placed on the 2nd page of the document.
The basic strategy I used was to store the TOC in a hash. Each time I add a new section to the document that I want to appear in the TOC I add it to the hash, i.e.
@toc[pdf.page_count] = "the toc text for this section"
Then prior to adding the page numbers to the document I iterate thru the hash:
number_of_toc_entries_per_page = 10
offset = (@toc.count.to_f / number_of_toc_entries_per_page).ceil
@toc.each_with_index do |(key, value), index|
pdf.start_new_page if index % number_of_toc_entries_per_page == 0
pdf.text "#{value}.... page #{key + offset}", size: 38
end
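The page-number arithmetic can be checked on its own, without Prawn. The sketch below uses hypothetical helper names (toc_page_offset, toc_lines) and assumes, as in this answer, that the TOC pages are inserted in front of the recorded sections, so every recorded page number shifts by the number of TOC pages.

```ruby
# Sketch of the offset calculation only; no PDF is generated here.
# `toc_page_offset` and `toc_lines` are hypothetical helper names.

# Number of pages the TOC itself will occupy.
def toc_page_offset(entry_count, entries_per_page)
  (entry_count.to_f / entries_per_page).ceil
end

# Build the TOC lines, shifting each recorded page by the TOC's own length.
def toc_lines(toc, entries_per_page: 10)
  offset = toc_page_offset(toc.size, entries_per_page)
  toc.map { |page, title| "#{title}.... page #{page + offset}" }
end

# Keys are the page counts recorded before the TOC pages were inserted.
toc = { 2 => "Section 1", 3 => "Section 2", 5 => "Section 3" }
lines = toc_lines(toc)
puts lines
```

With 21 entries and 10 entries per page, the TOC needs 3 pages, so every section moves back by 3 pages.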
Anyway, the full example is below, hope it helps.
require 'prawn'
class TocTest
def self.create
@toc = Hash.new
@current_section_header_number = 0 # used to fake up section headers
pdf = Prawn::Document.new
add_title_page(pdf)
21.times { add_a_content_page(pdf) }
fill_in_toc(pdf)
add_page_numbers(pdf)
pdf.render_file './output/test.pdf'
end
def self.add_title_page(pdf)
pdf.move_down 200
pdf.text "This is my title page", size: 38, style: :bold, align: :center
end
def self.fill_in_toc(pdf)
pdf.go_to_page(1)
number_of_toc_entries_per_page = 10
offset = (@toc.count.to_f / number_of_toc_entries_per_page).ceil
@toc.each_with_index do |(key, value), index|
pdf.start_new_page if index % number_of_toc_entries_per_page == 0
pdf.text "#{value}.... page #{key + offset}", size: 38
end
end
def self.add_a_content_page(pdf)
pdf.start_new_page
toc_heading = grab_some_section_header_text
@toc[pdf.page_count] = toc_heading
pdf.text toc_heading, size: 38, style: :bold
pdf.text "Here is the content for this section"
# randomly span a section over 2 pages
if [true, false].sample
pdf.start_new_page
pdf.text "The content for this section spans 2 pages"
end
end
def self.add_page_numbers(pdf)
page_number_string = 'page <page> of <total>'
options = {
at: [pdf.bounds.right - 175, 9],
width: 150,
align: :right,
size: 10,
page_filter: lambda { |pg| pg > 1 },
start_count_at: 2,
}
pdf.number_pages(page_number_string, options)
end
def self.grab_some_section_header_text
"Section #{@current_section_header_number += 1}"
end
end
I built a report generator featuring a clickable table of contents using code and ideas gathered from this discussion. Here are the relevant parts of the code, in case somebody else needs to do the same.
What it does:
include Prawn::View to use Prawn's methods without having to prefix them with pdf
insert a blank page where the table of contents will be displayed
add the document contents, using h1 and h2 helpers for titles
the h1 and h2 helpers store the position of headings in the document
rewind and generate the actual table of contents
indent subsections in the table of contents
right-align the dots between toc entry and page number for visual consistency
if the table doesn't fit on one page, it adds new pages and increments the relevant page numbers
add a PDF outline with the section and subsection titles for bonus points.
Enjoy!
PDF generator
class ReportPdf
include Prawn::View
COLOR_GRAY = 'BBBBBB' # Color used for the dots in the table of contents
def initialize(report)
@toc = []
@report = report
generate_report
end
private
def generate_report
add_table_of_contents
add_contents
update_table_of_contents
add_outline
end
def add_table_of_contents
# Insert a blank page, which will be filled in later using update_table_of_contents
start_new_page
end
def add_contents
@report.sections.each do |section|
h1(section.title, section.anchor)
section.subsections.each do |subsection|
h2(subsection.title, subsection.anchor)
# subsection contents
end
end
end
def update_table_of_contents
go_to_page(1) # Rewind to where the table needs to be displayed
text 'Table of contents', styles_for(:toc_title)
move_down 20
added_pages = 0
@toc.each do |entry|
unless fits_on_current_page?(entry[:name])
added_pages += 1
start_new_page
end
entry[:page] += added_pages
add_toc_line(entry)
entry[:subsections].each do |subsection_entry|
unless fits_on_current_page?(subsection_entry[:name])
added_pages += 1
start_new_page
end
subsection_entry[:page] += added_pages
add_toc_line(subsection_entry, true)
end
end
end
def add_outline
outline.section 'Table of contents', destination: 2
@toc.each do |entry|
outline.section entry[:name], destination: entry[:page] do
entry[:subsections].each do |subsection|
outline.page title: subsection[:name], destination: subsection[:page]
end
end
end
end
def h1(name, anchor)
add_anchor(anchor, name)
text name, styles_for(:h1)
end
def h2(name, anchor)
add_anchor(anchor, name, true)
text name, styles_for(:h2)
end
def styles_for(element = :p)
case element
when :toc_title then { size: 24, align: :center }
when :h1 then { size: 20, align: :left }
when :h2 then { size: 16, align: :left }
when :p then { size: 12, align: :justify }
end
end
# Parameter order matches the callers in h1 and h2: add_anchor(anchor, name)
def add_anchor(anchor, name, is_subsection = false)
add_dest anchor, dest_xyz(bounds.absolute_left, y + 20)
if is_subsection
@toc.last[:subsections] << { anchor: anchor, name: name, page: page_count }
else
@toc << { anchor: anchor, name: name, page: page_count, subsections: [] }
end
end
def add_toc_line(entry, is_subsection = false)
anchor = entry[:anchor]
name = entry[:name]
name = "#{Prawn::Text::NBSP * 5}#{name}" if is_subsection
page_number = entry[:page].to_s
dots_info = dots_for(name + ' ' + page_number)
float do
text "<link anchor='#{anchor}'>#{name}</link>", inline_format: true
end
float do
indent(dots_info[:dots_start], dots_info[:right_margin]) do
text "<color rgb='#{COLOR_GRAY}'>#{dots_info[:dots]}</color>", inline_format: true, align: :right
end
end
indent(dots_info[:dots_end]) do
text "<link anchor='#{anchor}'>#{page_number}</link>", inline_format: true, align: :right
end
end
def dots_for(text)
dot_width = text_width('.')
dots_start = text_width(text)
right_margin = text_width(' ') * 6
space_for_dots = bounds.width - dots_start - right_margin
dots = space_for_dots.negative? ? '' : '.' * (space_for_dots / dot_width)
dots_end = space_for_dots - right_margin
{
dots: dots,
dots_start: dots_start,
dots_end: dots_end,
right_margin: right_margin
}
end
def fits_on_current_page?(str)
remaining_height = bounds.top - bounds.absolute_top + y
height_of(str) < remaining_height
end
def text_width(str, size = 12)
font(current_font).compute_width_of(str, size: size)
end
def current_font
@current_font ||= font.inspect.split('<')[1].split(':')[0].strip
end
end
Using the generator
Using Rails, I generate PDFs from a report using the following code:
# app/models/report.rb
class Report < ApplicationRecord
# Additional methods
def pdf
@pdf ||= ReportPdf.new(self)
end
end
# app/controllers/reports_controller.rb
class ReportsController < ApplicationController
def show
respond_to do |format|
format.html
format.pdf do
doc = @report.pdf
send_data doc.render, filename: doc.filename, disposition: :inline, type: Mime::Type.lookup_by_extension(:pdf)
end
end
end
end

Split @blogs into three divs using size of description field as weight

I have a collection of Blog items.
@blogs = Blog.find(:all)
Each blog has a description text field with some text. What I would like to do is split the @blogs objects into 3 divs, with roughly the same number of characters in each column.
<div id="left">
@blog1 (653 characters)
</div>
<div id="center">
@blog2 (200 characters)
@blog5 (451 characters)
</div>
<div id="right">
@blog3 (157 characters)
@blog4 (358 characters)
@blog6 (155 characters)
</div>
I can't figure out how to do that without getting really complicated and probably inefficient.
So far I have thought about converting the description field (size) to a percentage of the total characters in the @blogs collection, but how do I match/split the elements so that I get as close as possible to 33% in each column, like a super simple Tetris game? :)
Any thoughts?
Here's a quick hack that isn't perfect, but might get you pretty close. The algorithm is simple:
Sort items by size.
Partition items into N bins.
Re-sort each bin by date (or other field, per your desired presentation order).
Here's a quick proof of concept:
#!/usr/bin/env ruby
# mock out some simple Blog class for this example
class Blog
attr_accessor :size, :date
def initialize
@size = rand(700) + 100
@date = Time.now + rand(1000)
end
end
# create some mocked data for this example
@blogs = Array.new(10) { Blog.new }
# sort by size
sorted = @blogs.sort_by { |b| b.size }
# bin into NumBins, dealing the size-sorted items out round-robin
NumBins = 3
bins = Array.new(NumBins) { Array.new }
sorted.each_slice(NumBins) do |b|
b.each_with_index { |x,i| bins[i] << x }
end
# sort each bin by date
bins.each do |bloglist|
bloglist.sort_by! { |b| b.date }
end
# output
bins.each_with_index do |bloglist,column|
puts
puts "Column Number: #{column+1}"
bloglist.each do |b|
puts "Blog: Size = #{b.size}, Date = #{b.date}"
end
total = bloglist.inject(0) { |sum,b| sum + b.size }
puts "TOTAL SIZE: #{total}"
end
For more ideas, look up the multiprocessor scheduling problem.
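One simple heuristic from that problem family is the greedy "largest first" rule: sort the items by descending size and always drop the next item into the column with the smallest running total. The sketch below is a different technique from the round-robin binning above, not a refinement of it; balance_into_bins is a hypothetical name, and the sizes are the six character counts from the question.

```ruby
# Sketch of greedy "largest first" balancing (a heuristic, not an optimal solver).
# `balance_into_bins` is a hypothetical helper; the block returns an item's weight.
def balance_into_bins(items, bin_count)
  bins = Array.new(bin_count) { { items: [], total: 0 } }
  # Place the heaviest items first, each into the currently lightest bin.
  items.sort_by { |item| -yield(item) }.each do |item|
    lightest = bins.min_by { |bin| bin[:total] }
    lightest[:items] << item
    lightest[:total] += yield(item)
  end
  bins
end

# Character counts taken from the question's example.
sizes = [653, 200, 451, 157, 358, 155]
bins = balance_into_bins(sizes, 3) { |size| size }
bins.each_with_index do |bin, i|
  puts "Column #{i + 1}: #{bin[:items].inspect} (total #{bin[:total]})"
end
```

With real records the block could return something like blog.description.size, and each bin can still be re-sorted by date afterwards for presentation. The heuristic does not guarantee the best possible split, only a reasonably even one.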
