My problem is similar to this post, but not identical. I somehow can't figure out the correct pandoc command line parameters for maintaining/resolving cross-document links when using a couple of interlinked HTML files as the input.
Let's say I have two files, chapter1.xhtml and chapter2.xhtml located in the /home/user/Documents folder with the following contents:
<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<h3>Chapter 1</h3>
<p><a href="/home/user/Documents/chapter2.xhtml">Next chapter</a><br /></p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</body>
</html>
which contains a link to the next document.
and
<?xml version="1.0" encoding="utf-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<h3>Chapter 2</h3>
<p><a href="/home/user/Documents/chapter1.xhtml">Previous chapter</a><br /></p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
</body>
</html>
which contains a link to the previous document.
I used the following command line parameters:
$ pandoc -s --toc --verbose -o /home/user/Documents/output.markdown /home/user/Documents/chapter1.xhtml /home/user/Documents/chapter2.xhtml
And I got the following output:
---
---
- [Chapter 1](#chapter-1)
- [Chapter 2](#chapter-2)
### Chapter 1
[Next chapter](/home/user/Documents/chapter2.xhtml)\
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
### Chapter 2
[Previous chapter](/home/user/Documents/chapter1.xhtml)\
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
This problem also occurs when I select docx or latex/pdf as the output format. I also tried to use relative links, but nothing worked.
What are the correct parameters for resolving cross-document links?
tl;dr: I don't want links that keep the original file paths; I want them to point to the right place in the new output document.
The problem is that your links contain absolute paths (/home/user/Documents/chapter1.xhtml) instead of relative ones (chapter1.xhtml). I cannot imagine the ePUB file containing absolute paths, and if it does, the links in the file will only ever work correctly on your computer. So the solution will have to be fixing those ePUB files before feeding them to pandoc.
Note that round-tripping with pandoc from markdown to epub and back to HTML works as expected:
$ pandoc -o foo.epub
# foo
adfs
# bar
go [to foo](#foo)
$ unzip foo.epub
$ cat ch002.xhtml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title>bar</title>
<link rel="stylesheet" type="text/css" href="stylesheet.css" />
</head>
<body>
<div id="bar" class="section level1">
<h1>bar</h1>
<p>go to foo</p>
</div>
</body>
</html>
$ pandoc foo.epub
<p><span id="ch001.xhtml"></span></p>
<div id="ch001.xhtml#foo" class="section level1">
<h1>foo</h1>
<p>adfs</p>
</div>
<p><span id="ch002.xhtml"></span></p>
<div id="ch002.xhtml#bar" class="section level1">
<h1>bar</h1>
<p>go to foo</p>
</div>
P.S.
Using two input documents like:
pandoc -o output.md chapter1.xhtml chapter2.xhtml
works as the pandoc README states:
If multiple input files are given, pandoc will concatenate them all (with blank lines between them) before parsing.
So pandoc parses the inputs as one single document, and it is no wonder that cross-file links don't work.
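One possible workaround is to rewrite the cross-file links into in-document anchors before running pandoc. The sketch below is only an illustration: `rewrite_links` and the `ANCHORS` mapping are hypothetical names, not part of pandoc, and it assumes each chapter file corresponds to one heading whose anchor you already know (here `#chapter-1` and `#chapter-2`, as in the generated table of contents).

```python
import posixpath
import re

# Hypothetical mapping from input file name to the anchor pandoc
# generates for that chapter's heading in the combined output.
ANCHORS = {
    "chapter1.xhtml": "#chapter-1",
    "chapter2.xhtml": "#chapter-2",
}

# Matches href="path" or href="path#fragment" (pure "#fragment" links are skipped).
HREF = re.compile(r'href="([^"#]+)(#[^"]*)?"')


def rewrite_links(html, anchors=ANCHORS):
    """Replace links to known chapter files with in-document anchors."""

    def replace(match):
        target = posixpath.basename(match.group(1))
        if target not in anchors:
            return match.group(0)  # leave external or unknown links untouched
        # Keep an explicit fragment if the link already had one.
        return 'href="{}"'.format(match.group(2) or anchors[target])

    return HREF.sub(replace, html)


if __name__ == "__main__":
    sample = '<p><a href="/home/user/Documents/chapter2.xhtml">Next chapter</a></p>'
    print(rewrite_links(sample))
```

The rewritten files can then be passed to pandoc in the usual way; links to anything not listed in the mapping are left as they were.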
Any ideas why the RSS feed (http://facebook.moiraecreative.co.uk/facebook-test) is not populating either my production or development feed with any articles? All help is greatly appreciated.
Your feed is not formatted according to Facebook's guidelines.
Have a look at the reference: https://developers.facebook.com/docs/instant-articles/reference/
Per those guidelines, each article in the feed should contain markup like the following:
<content:encoded>
<![CDATA[
<!doctype html>
<html lang="en" prefix="op: http://media.facebook.com/op#">
<head>
<meta charset="utf-8">
<link rel="canonical" href="http://example.com/article.html">
<meta property="op:markup_version" content="v1.0">
</head>
<body>
<article>
<header>
<!-- Article header goes here -->
</header>
<!-- Article body goes here -->
<footer>
<!-- Article footer goes here -->
</footer>
</article>
</body>
</html>
]]>
</content:encoded>
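As a quick sanity check, a feed can be scanned for items that lack this element. The following is a minimal sketch, not an official Facebook tool: `items_missing_content` is a hypothetical helper, and the sample feed is made up. It relies only on the standard RSS content module namespace that the `content:` prefix is normally bound to.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module that defines <content:encoded>.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"


def items_missing_content(feed_xml):
    """Return titles of <item> elements without a non-empty <content:encoded>."""
    root = ET.fromstring(feed_xml)
    missing = []
    for item in root.iter("item"):
        encoded = item.find("{%s}encoded" % CONTENT_NS)
        if encoded is None or not (encoded.text or "").strip():
            missing.append(item.findtext("title", default="(untitled)"))
    return missing


if __name__ == "__main__":
    # Hypothetical two-item feed: only the first item carries article markup.
    feed = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>With markup</title>
      <content:encoded><![CDATA[<!doctype html><html><body><article></article></body></html>]]></content:encoded>
    </item>
    <item>
      <title>Without markup</title>
    </item>
  </channel>
</rss>"""
    print(items_missing_content(feed))
```

An empty result only shows that the element is present; it does not validate the article markup inside it.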
I'm trying to convert a latex table made with pgfplotstable typeset to html with pandoc, for example:
\begin{table}
\centering
\pgfplotstableset{
every head row/.style={before row=\toprule,after row=\midrule},
every last row/.style={after row=\bottomrule}}
\pgfplotstabletypeset[
fixed zerofill,
precision=2,
display columns/0/.style={string type},
col sep=comma]{images/prvsflow.txt}
\caption{Variation of pressure drop with flow rate (m/s)}
\label{tab:pvv}
\end{table}
If I convert it directly with
pandoc -s example.tex -o example.html
then it gives
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="pandoc" />
<title></title>
<style type="text/css">code{white-space: pre;}</style>
</head>
<body>
<p>[tab:pvv]</p>
</body>
</html>
Has anyone had any experience with this?
I just found htlatex, which seems to do a much better job with tables than pandoc and also works quite nicely with tables made with \pgfplotstabletypeset.
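If neither tool fits, another option (not taken from the answer above) is to build the HTML table straight from the comma-separated data file. The sketch below is only an illustration under assumptions: `csv_to_html_table` is a hypothetical helper, the sample data is invented, and it mimics just three of the pgfplotstable settings from the question (comma separator, first column kept as a string, two fixed decimals).

```python
import csv
import html
import io


def csv_to_html_table(csv_text, caption, precision=2):
    """Render comma-separated text as an HTML table.

    The first row is the header, the first column is kept as a string,
    and the remaining cells are formatted as fixed-point numbers.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    out = ["<table>", "<caption>%s</caption>" % html.escape(caption)]
    out.append("<tr>" + "".join("<th>%s</th>" % html.escape(h) for h in header) + "</tr>")
    for row in body:
        cells = [html.escape(row[0])]
        cells += ["%.*f" % (precision, float(value)) for value in row[1:]]
        out.append("<tr>" + "".join("<td>%s</td>" % c for c in cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


if __name__ == "__main__":
    # Made-up sample data; the real contents of images/prvsflow.txt are unknown.
    sample = "flow,dp\n0.5,1.234\n1.0,4.871\n"
    print(csv_to_html_table(sample, "Variation of pressure drop with flow rate (m/s)"))
```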
I'm working with Sendgrid's template system and need to manually inline some css for content that will be included in the Sendgrid smtpapi call.
Premailer doesn't seem to actually inline the CSS styles. I can inspect the result of calling Premailer.new, but neither processed_doc nor doc has the styles inlined.
Different methods I've tried:
Including the css file directly:
header = <<-HTML
<div class="preview-content">
#{data["content"]}
</div>
HTML
p header
=> "<div class=\"preview-content\">\n<p>Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. </p>\n</div>\n"
premailer = Premailer.new(header, with_html_string: true, adapter: :nokogiri, css: [Rails.root.join('app', 'assets', 'stylesheets', 'email_base.css').to_s], input_encoding: "UTF-8", verbose: true)
p premailer.processed_doc.to_html
=> "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\" \"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body>\n<div class=\"preview-content\">\n<p>Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean.</p>\n</div>\n</body></html>\n"
premailer.doc.to_html returns the same thing with no inlined css.
I checked that the css file is accessible and that the styles apply to .preview-content p.
Adding a header to the document
header = <<-HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8"> <!-- utf-8 works for most cases -->
<meta name="viewport" content="width=device-width"> <!-- Forcing initial-scale shouldn't be necessary -->
<meta http-equiv="X-UA-Compatible" content="IE=edge"> <!-- Use the latest (edge) version of IE rendering engine -->
<title></title> <!-- The title tag shows in email notifications, like Android 4.4. -->
<link rel="stylesheet" href="/asset/email_base.css" media="all">
</head>
<body width="100%" height="100%" bgcolor="#ffffff" style="margin: 0; padding: 0 20px;">
<div class="preview-content">
#{data["content"]}
</div>
</body>
</html>
HTML
p header
=> '<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n <html xmlns=\"http://www.w3.org/1999/xhtml\">\n <head>\n <meta charset=\"utf-8\"> <!-- utf-8 works for most cases -->\n <meta name=\"viewport\" content=\"width=device-width\"> <!-- Forcing initial-scale shouldn't be necessary -->\n <meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\"> <!-- Use the latest (edge) version of IE rendering engine -->\n <title></title> <!-- The title tag shows in email notifications, like Android 4.4. -->\n <link rel=\"stylesheet\" href=\"/asset/email_base.css\" media=\"all\">\n </head>\n <body width=\"100%\" height=\"100%\" bgcolor=\"#ffffff\" style=\"margin: 0; padding: 0 20px;\">\n <div class=\"preview-content\">\n<p>Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean.</p></div>\n </body>\n </html>\n'
The desired css to be inline:
.the-excerpt,
.the-excerpt p,
.preview-content p
// +responsive-text(18px, 30px)
line-height: 1.8 !important
font-size: 18px
Is there something I'm missing to inline CSS manually? Neither approach seems to yield a different result.
I'm trying out Premailer right now and having some problems of my own, but the way I see Premailer actually process the content and return something different (at least, it removes the classes with the proper configuration setting) is with this method:
premailer = Premailer.new(html, { :with_html_string=>true, :verbose=>true, :remove_classes=>true })
return premailer.to_inline_css
Help. I am learning HTML5/CSS. Things were going spiffy until I found I could not debug my HTML/CSS markup.
I am using WeBuilder, which auto-completes and has links to standard tools like Tidy and others.
Here is what I’ve tried:
used an internal CSS link in my HTML: it works;
put styles.css in the same folder, and in a css folder: BUMMER
relocated both files to another HDD: BUMMER
both files validate with my available tools
I am sure the problem is in the HTML file, and I have tried every modification I could find suggestions for. I have rewritten the HTML using WeBuilder’s auto-complete but have not tried writing it in Notepad. I understand the basics of HTML and CSS and am very familiar with files and folders, so I believe the href is correct (even so, I have tried several ideas from W3C).
NOTE: I see in the "publish" here, it picks up the Arial font where mine has times. If Arial is not your default, I'm at a loss because the color doesn't show. Neither shows the color. If I can be of further help please advise. I really thank you for any help.
Here is my HTML markup:
<!DOCTYPE html5>
<html>
<head>
<title>A Simple Page</title>
<meta http-equiv="content-type" content="text/htm; charset=utf-8">
<meta name="description" content="">
<meta name="keywords" content="">
<style type="text/css">
<link rel="stylesheet" href="styles.css" />
</style>
</head>
<body>
<h1>First Title</h1>
<p>A paragraph of interesting content</p>
<h2>Second Title</h2>
<p>A paragraph of interesting content</p>
<h2>Third Title</h2>
<p>A paragraph of interesting content</p>
</body>
</html>
Here is the CSS:
h1, h2 {
color: #3366CC;
font-family: "Arial", sans-serif;
}
This makes no sense:
<style type="text/css">
<link rel="stylesheet" href="styles.css" />
</style>
It should simply be:
<link rel="stylesheet" href="styles.css" type="text/css" />
The <style> tags are only for CSS rules embedded directly in a page; they cannot contain other HTML tags such as <link>. So if you wanted to, you could do this:
<style type="text/css">
h1, h2 {
color: #3366CC;
font-family: "Arial", sans-serif;
}
</style>
But it is really better to keep CSS in a separate file.
Also, there is a minor issue with your DOCTYPE at the top of your HTML file. An HTML5 DOCTYPE is simply:
<!DOCTYPE html>
And not:
<!DOCTYPE html5>
One purpose of HTML5, among other things, is to simplify document formatting and readability. So there is no such thing as <!DOCTYPE html5>; it is simply <!DOCTYPE html>.
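To see why the nested version is ignored, it can help to look at how an HTML parser treats the contents of <style>: as raw text, not as markup. The sketch below uses Python's standard `html.parser` module as a stand-in for a browser (an assumption made for illustration; real browsers have their own parsers), and `StylesheetFinder` and `find_stylesheets` are hypothetical names.

```python
from html.parser import HTMLParser


class StylesheetFinder(HTMLParser):
    """Collect stylesheet hrefs seen as real tags, plus raw <style> text."""

    def __init__(self):
        super().__init__()
        self.stylesheets = []  # hrefs from <link rel="stylesheet"> tags
        self.style_text = []   # raw text found inside <style> elements
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "style":
            self._in_style = True
        elif tag == "link" and attrs.get("rel") == "stylesheet":
            self.stylesheets.append(attrs.get("href"))

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:
            self.style_text.append(data)


def find_stylesheets(markup):
    """Return (list of stylesheet hrefs, concatenated <style> text)."""
    parser = StylesheetFinder()
    parser.feed(markup)
    parser.close()
    return parser.stylesheets, "".join(parser.style_text)


if __name__ == "__main__":
    broken = '<head><style type="text/css"><link rel="stylesheet" href="styles.css" /></style></head>'
    fixed = '<head><link rel="stylesheet" href="styles.css" type="text/css" /></head>'
    print(find_stylesheets(broken))
    print(find_stylesheets(fixed))
```

In the broken markup the <link> never shows up as a tag; it ends up as text inside the style block, where it is not valid CSS. In the fixed markup it is recognised as a stylesheet link.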
I'm experienced with .NET MVC and want to learn a Python framework. I chose Pyramid.
.NET MVC has the concept of a master page, views and partial views. A master page would look something like:
<%@ Master Language="C#" Inherits="System.Web.Mvc.ViewMasterPage" %>
<!DOCTYPE html>
<html>
<head runat="server">
<title><asp:ContentPlaceHolder ID="TitleContent" runat="server" /></title>
</head>
<body>
<div>
<asp:ContentPlaceHolder ID="MainContent" runat="server" />
</div>
</body>
</html>
I can then create a view that would fill in the space identified by MainContent in the master page.
Going through the Pyramid wiki tutorial here, I see the author has repeated much of the same content in each of his templates--content that would normally be defined in a master page--and totally violated DRY.
Is there a concept of a master page in Pyramid?
Just like MVC.NET, Pyramid can use any number of templating languages, and almost all of them support concepts similar to master pages. None of them call them that though ;-)
Chameleon is probably the most far out there: the tools that you use to define slots in master pages (ContentPlaceHolder, etc.) are called macros in Chameleon and referred to by the rather heavy acronym METAL (Macro Expansion Template Attribute Language).
In Jinja2 and Mako they are called blocks and Breve calls them slots.
Here is what a master page might look like in each of them:
Chameleon:
<!-- Caveat Emptor - I have never used Chameleon in anger -->
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:tal="http://xml.zope.org/namespaces/tal"
xmlns:metal="http://xml.zope.org/namespaces/metal"
xmlns:i18n="http://xml.zope.org/namespaces/i18n">
<!-- We don't *need* all of this in Chameleon, but it's worth
remembering that it adds it for us -->
<head>
<title metal:define-macro="title"><span metal:define-slot="title"></span></title>
</head>
<body metal:define-macro="content">
<div metal:define-slot="content"></div>
</body>
</html>
Jinja2:
<!DOCTYPE html>
<html>
<head>
<title>{% block title %}{% endblock %}</title>
</head>
<body>
{% block content %}{% endblock %}
</body>
</html>
Mako:
<!DOCTYPE html>
<html>
<head>
<title><%block name="title" /></title>
</head>
<body>
<%block name="content" />
</body>
</html>
Breve:
html [
head [
title [ slot("title") ]
]
body [
slot("content")
]
]