Carrying style IDs/names from HTML to .docx? - lua

Is it possible to somehow tell pandoc to carry the names of styles from original HTML to .docx?
I understand that in order to tune the actual styles, I should be using reference.docx file generated by pandoc. However, reference.docx is limited to what styles it has to: headings, body text, block text, etc.
I'd like to:
specify "myStyle" style in the input HTML (via a "class" attribute, via any other HTML attribute or even via a filter code written in Lua),
<html>
<body>
<p>Hello</p>
<p class="myStyle">World!</p>
</body>
</html>
add a custom "myStyle" to reference.docx using Word,
run a html->docx conversion an expect pandoc generate a paragraph element with "myStyle" (instead of BodyText, which I believe it sets by default), so the end result looks like this (contents of word/document.xml inside the resulting output.docx was cut for brevity):
<w:p>
<w:pPr>
<w:pStyle w:val="BodyText" />
</w:pPr>
<w:r>
<w:txml:space="preserve">Hello</w:t>
</w:r>
</w:p>
<w:p>
<w:pPr>
<w:pStyle w:val="myStyle" />
</w:pPr>
<w:r>
<w:txml:space="preserve">World!</w:t>
</w:r>
</w:p>
There's some evidence styleId can be passed around, but I don't really understand it and am unable to find any documentation about it.
Doc on filtering in Lua states you can access attrs when manipulating a pandoc.div, but it says nothing about whether any of the attrs will be interpreted by pandoc in any meaningful way.

Finally, found what I needed – Custom styles. It's limited, but better than what I arrived earlier, and of course much better than nothing at all :)
I'll leave a step-by-step guide here in case anyone stumbles upon a similar question.
First, generate a reference.docx file like this:
pandoc --print-default-data-file reference.docx > styles.docx
Then open the file in MS Word (I was using a macOS version) you'll see this:
Click the "New style..." button on the right, and create a style to your liking. In my case I made change the style of text to be bold, in blue color:
Since I am converting from HTML to DOCX, here's my input.html:
<html>
<body>
<div>Page 1</div>
<div custom-style="eugene-is-testing">Page 2</div>
<div>Page 3</div>
</body>
</html>
Run:
pandoc --standalone --reference-doc styles.docx --output output.docx input.html
Finally, enjoy the result:

Related

How to rename meta data entries in Quarto Book

I am using Quarto Book for writing an online Book for my students on scientific writing skills (see www.politik-wissenschaft.at).
I want to add the bibliographic entry of this book at the start site (index.qmd / index.html). I used the abstract option for doing that. This works fine, but the title of this meta data entry is now called "Zusammenfassung" (i.e., abstract).
Is there a way to change this title to "How to cite", for example?
Furthermore, after updating Quarto CLI to 1.2.313 this information ("Zusammenfassung") is not only printed on the index.html file, but on the head of all other sites as well. How can I switched that off, so that these meta data information is only shown at the start site (i.e., index.html).
You can change the meta data using template-partials. Add the following to your _quarto.yml file:
template-partials:
- title-block.html
And create the title-block.html file (see here the original file) and change the abstract-title to your liking:
<header id="title-block-header">
<h1 class="title">$title$</h1>
$if(subtitle)$
<p class="subtitle">$subtitle$</p>
$endif$
$for(author)$
<p class="author">$author$</p>
$endfor$
$if(date)$
<p class="date">$date$</p>
$endif$
$if(abstract)$
<div class="abstract">
<div class="abstract-title">How to Cite </div>
$abstract$
</div>
$endif$
</header>
Different Option:
add a css file and change the text via css (usually not recommended):
styles.css
div.abstract-title {
visibility: hidden;
}
div.abstract-title:before {
content: "How to cite";
visibility: visible;
}

How do I escape HTML for th:errors?

Let's say I have:
<span th:if="${#fields.hasErrors('firstName')}" class="color--error" th:errors="*{firstName}"></span>
How do I escape the text if the error text contains HTML? I know for normal text, we can use th:utext.
As of 3.0.8-SNAPSHOT, Thymeleaf-Spring has th:uerrors.
See this GitHub issue for the discussion: https://github.com/thymeleaf/thymeleaf-spring/issues/153
And this change log for 3.0.8: http://forum.thymeleaf.org/Thymeleaf-3-0-8-JUST-PUBLISHED-td4030687.html
th:errors is just a shortcut. You still use th:utext for this, you just have to manually output your errors. In your case, the code could look something like:
<div th:if="${#fields.hasErrors('firstName')}" th:each="err: ${#fields.errors('firstName')}" th:utext="${err}" class="color--error" />

google translate misses up the coding of my file

i am trying to use google translate for localization of an XML file, it has near 350K lines, but some of them contain coding for in-game font size and color, like so:
<replacement><p horizontalalignment="center"><br/><image enablescale="false" imagesetpath="00015590.InterD_Jeryoung_3"/><br/><image enablescale="true" imagesetpath="00015590.Tag_Dungeon_Six_Superior" scalerate="1.5"/><image enablescale="true" imagesetpath="00015590.Tag_Dungeon_Four_Superior" scalerate="1.5"/><br/><image enablescale="true" imagesetpath="00009499.Field_Boss" scalerate="1.4"/>Хмельной лик<br/><br/></p>Уничтожить зараженных насекомых<br/>возле мест обитания их королевы。<br/></replacement>
now for god knows what reason, google translate alters that code in the process of translation into some unacceptable coding, like so:
<replacement> <p horizontalalignment="center"> <br/> <image enablescale="false" imagesetpath="00015590.InterD_Jeryoung_3"/> <br/> <image enablescale = "true "imagesetpath =" 00015590.Tag_Dungeon_Six_Superior "scalerate =" 1.5 "/> <image enablescale="true" imagesetpath="00015590.Tag_Dungeon_Four_Superior" scalerate="1.5"/> <br/> <image enablescale = "true" imagesetpath = "00009499.Field_Boss" scalerate = "1.4" /> Intoxicated face <br/> <br/> </ p> Destroy infected insects <br/> habitats near their queen. <br/> </ replacement>
is there any way to avoid that, why is it happening exactly? anyhelp is appreciated on that matter,thanks
EDIT : i am also looking for a way to input my text and have it out in the same exact language with only the coding mishaps changing, so i can isolate those,build a comparison table and then use that to fix the errors after the actual translation is done, but i don't see a way for selecting the same language as input AND output in google translate, it always forces me choose a different one in input or output, kind of makes sense but if there is a way to do that, i might be able to work around it..
Do not feed Google translate with your Xml file, as far as I know it doesn't understand Xml.
Extract the text from the Xml file.
Feed the text to translate.
Transform the text back to Xml.
You could simply transform the Xml to a text document with a single line per Xml element so it would be easier to turn it back into Xml.
More detail
According to the Toolkit you can upload:
HTML (.HTML)
Microsoft Word (.DOC/.DOCX)
OpenDocument Text (.ODT)
Plain Text (.TXT)
Rich Text (.RTF)
Wikipedia URLs
And a couple of extras such as JSON. So no Xml.
The best way I see is to transform your Xml document into one of these types (I would probably use JSON) and transform it is such a way that it can easily be transformed back again by using either position (1 line in the text file is the first element in the Xml document) or by an id (add the Id or position of the element in the xml hierarchy to the JSON element)
My guess is that the toolkit recognizes the html tags in the xml and escapes them. So another option might be to un-escape the > to > and &lt to <

Input code in TinyMCE

I'm using TinyMCE on my blog and it seems to be removing the code I'm trying to paste.
I want to add this:
<Files somefile.png>
DefaultType application/x-httpd-php
</Files>
(it's a .htaccess directive)
This gets saved ok (as < and > in the html), but when I reopen my form for editing, it gets transformed as :
DefaultType application/x-httpd-php
Edit : I'm using TinyMCE in a Symfony form, using sfFormExtraPlugin.
Edit 2 : I tried verify_html: false ....
now my code gets transformed as :
<p><files exec="" jpg=""><br /> DefaultType application/x-httpd-php<br /></files></p>
Edit 3: My tinyMCE config is :
tinyMCE.init({
mode: "exact",
elements: "content_contents",
theme: "advanced",
width: "500px",
height: "400px",
theme_advanced_toolbar_location: "top",
theme_advanced_toolbar_align: "left",
theme_advanced_statusbar_location: "bottom",
theme_advanced_resizing: true
,
language : "fr",
convert_urls : false,
verify_html : false
});
I responded to this on the TinyMCE MoxieCode forums topic that was also opened by #Manu however I wanted to update this topic with my thoughts as well.
If I understand #Manu correctly, the problem is that the HTML source, while saving with &lt and &gt correctly is being interpreted as < and > when reloaded into TinyMCE.
If this is the case, then I believe the problem is that the Symphony plugin isn't encoding the HTML content prior to populating the TextArea that TinyMCE replaces. In other words, it leaves &lt when it should be loading &amplt; so tinyMCE receives &lt
What are you doing to the input when putting it back in to TinyMCE? If you're converting it to HTML or anything TinyMCE will clean it up as it's invalid HTML.
As a work around/experiment you could add File in the custom_elements option in your init.
Update As you are accepting all sorts of code, you will probably have to turn off clean up altogether. Put cleanup: false in the config. If I were you I would implement your own custom formatting (like Stack/overflow does) and generate bold, underline, links etc formatting because it will give you a lot more control over the HTML generation, ie you could just print out everything exactly how it is (with escaping), and then turn the pre-defined symbols to <strong> tags, or what ever. This is be far the easiest way of generating safe, accurate HTML output, and in your case, probably the only way.
You would not want to use TinyMCE is this case...
That is because the invalid HTML gets removed (the tinymce cleanup functionality).
A workaround could be to initialize tinymce using the cleanup paramter:
cleanup: false,
I suggest you have a closer look at the tinymce initialization parameters
custom_elements
valid_elements
and
cleanup
replace your < and >
< becomes: <
> becomes: >
Try including
extended_valid_elements : "Files[]",
In your config. It's used to unlock certain html tags like iframe. In the brackets you usually put the allowed options for the tag (like [src|alt|id]) so I'm not sure what to put there for your example ...
the correct answer to your problem, tested by me and 100% working is to wrap your variable into htmlspecialchars in php like this example:
htmlspecialchars($myText)

Sphinx, reStructuredText show/hide code snippets

I've been documenting a software package using Sphinx and reStructuredText.
Within my documents, there are some long code snippets. I want to be able to have them hidden as default, with a little "Show/Hide" button that would expand them (Example).
Is there a standard way to do that?
You don't need a custom theme. Use the built-in directive container that allows you to add custom css-classes to blocks and override the existsting theme to add some javascript to add the show/hide-functionality.
This is _templates/page.html:
{% extends "!page.html" %}
{% block footer %}
<script type="text/javascript">
$(document).ready(function() {
$(".toggle > *").hide();
$(".toggle .header").show();
$(".toggle .header").click(function() {
$(this).parent().children().not(".header").toggle(400);
$(this).parent().children(".header").toggleClass("open");
})
});
</script>
{% endblock %}
This is _static/custom.css:
.toggle .header {
display: block;
clear: both;
}
.toggle .header:after {
content: " ▶";
}
.toggle .header.open:after {
content: " ▼";
}
This is added to conf.py:
def setup(app):
app.add_css_file('custom.css')
Now you can show/hide a block of code.
.. container:: toggle
.. container:: header
**Show/Hide Code**
.. code-block:: xml
:linenos:
from plone import api
...
I use something very similar for exercises here: https://training.plone.org/5/mastering-plone/about_mastering.html#exercises
You can use the built-in HTML collapsible details tag by wrapping the code in two raw HTML directives
.. raw:: html
<details>
<summary><a>big code</a></summary>
.. code-block:: python
lots_of_code = "this text block"
.. raw:: html
</details>
Produces:
<details>
<summary><a>big code</a></summary>
<pre>lots_of_code = "this text block"</pre>
</details>
I think the easiest way to do this would be to create a custom Sphinx theme in which you tell certain html elements to have this functionality. A little JQuery would go a long way here.
If, however you want to be able to specify this in your reStructuredText markup, you would need to either
get such a thing included in Sphinx itself or
implement it in a Sphinx/docutils extension...and then create a Sphinx theme which knew about this functionality.
This would be a bit more work, but would give you more flexibility.
There is a very simplistic extension providing exactly that feature: https://github.com/scopatz/hiddencode
It works rather well for me.
The cloud sphinx theme has custom directive html-toggle that provides toggleable sections. To quote from their web page:
You can mark sections with .. rst-class:: html-toggle, which will make the section default to being collapsed under html, with a “show section” toggle link to the right of the title.
Here is a link to their test demonstration page.
sphinx-togglebutton
Looks like a new sphinx extension has been made to do just this since this question has been answered.
Run: pip install sphinx-togglebutton
Add to conf.py
extensions = [
...
'sphinx_togglebutton'
...
]
In rst source file:
.. admonition:: Show/Hide
:class: dropdown
hidden message
since none of the above methods seem to work for me, here's how I solved it in the end:
create a file substitutions.rst in your source-directory with the following content:
.. |toggleStart| raw:: html
<details>
<summary><a> the title of the collapse-block </a></summary>
.. |toggleEnd| raw:: html
</details>
<br/>
add the following line at the beginning of every file you want to use add collapsible blocks
..include:: substitutions.rst
now, to make a part of the code collapsible simply use:
|toggleStart|
the text you want to collapse
..code-block:: python
x=1
|toggleEnd|
Another option is the dropdown directive in the sphinx-design extension. From the docs:
Install sphinx-design
pip install sphinx-design
Add the extension to conf.py in the extensions list
extensions = ["sphinx_design"]
Use the dropdown directive in your rst file:
.. dropdown::
Dropdown content

Resources