I'm trying to convert a Jupyter notebook that uses RISE to display the slides as a slideshow in the browser into a PDF file. The PDF file should have all pages in landscape mode and resemble the view in the browser. Of course, animations are not possible, but it should be possible to have fragments either combined into a single PDF slide or spread across multiple sort of "accumulating" slides (i.e. building upon their forerunner slides).
I've been trying to create my own Jinja template that generates a LaTeX document utilizing the beamer document class, with not much success so far.
Do you know of any tools, templates, exporters, or anything else that can help me with this process? Preferably something automatic, e.g. using nbconvert?
Figured it out myself. Take these steps:
launch jupyter nbconvert --to slides --post serve the_notebook.ipynb; the browser will open the locally served the_notebook.slides.html
replace the # after the_notebook.slides.html in the browser URL with ?print-pdf, so that the URL looks something like http://127.0.0.1:8000/the_notebook.slides.html?print-pdf
print the page to a PDF file from the browser
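If you would rather launch the first step from Python than from a terminal, here is a minimal sketch (the notebook name is taken from the question; nbconvert keeps serving until you interrupt it):
import subprocess

# build the reveal.js slides and serve them locally (Ctrl-C to stop serving)
subprocess.run(
    ["jupyter", "nbconvert", "--to", "slides", "--post", "serve", "the_notebook.ipynb"],
    check=True,
)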
Some time ago, I needed to programmatically convert Jupyter Notebook presentations to PDF slides. I did some research, and you can use pyppeteer (a Python port of Puppeteer) to automate the process. You need a simple Python script for this:
import asyncio
import concurrent.futures

from pyppeteer import launch


async def html_to_pdf(html_file, pdf_file, pyppeteer_args=None):
    """Convert an HTML file to a PDF."""
    browser = await launch(
        handleSIGINT=False,
        handleSIGTERM=False,
        handleSIGHUP=False,
        headless=True,
        args=["--no-sandbox"],
    )
    page = await browser.newPage()
    await page.setViewport(dict(width=994, height=768))
    await page.emulateMedia("screen")
    await page.goto(f"file://{html_file}", {"waitUntil": ["networkidle2"]})

    page_margins = {
        "left": "20px",
        "right": "20px",
        "top": "30px",
        "bottom": "30px",
    }

    # The measured dimensions are not used by the A4 export below,
    # but they are handy if you want to size the PDF to the slides instead.
    dimensions = await page.evaluate(
        """() => {
            return {
                width: document.body.scrollWidth,
                height: document.body.scrollHeight,
                offsetWidth: document.body.offsetWidth,
                offsetHeight: document.body.offsetHeight,
                deviceScaleFactor: window.devicePixelRatio,
            }
        }"""
    )
    width = dimensions["width"]
    height = dimensions["height"]

    await page.pdf(
        {
            "path": pdf_file,
            "format": "A4",
            "printBackground": True,
            "margin": page_margins,
        }
    )
    await browser.close()


if __name__ == "__main__":
    html_input_file = "/you/need/full/path/here/presentation.slides.html?print-pdf"
    pdf_output_file = "slides.pdf"

    # Run the coroutine in a worker thread so it also works in environments
    # that already have a running event loop (e.g. inside Jupyter).
    pool = concurrent.futures.ThreadPoolExecutor()
    pool.submit(
        asyncio.run,
        html_to_pdf(html_input_file, pdf_output_file),
    ).result()
The script takes the HTML slides as input and produces the PDF slides as output. Please note that you need to provide the full path to the HTML file. I wrote an article on how to convert notebook presentations to PDF slides. If you would like to apply styling, here is a longer version of the script.
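Since the script expects an absolute path, one way to build the input argument is sketched below (it assumes the slides HTML sits in the current working directory; adjust the file name to yours):
import os

# the script needs a full path, plus reveal.js's print stylesheet query
html_input_file = os.path.abspath("presentation.slides.html") + "?print-pdf"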
I guess jupyter nbconvert --to pdf the_notebook.ipynb should work fine.
You do need to install LaTeX, though.
I am just starting to use jsPDF and I think it may actually work (after attempting a zillion different ways to produce PDFs in my Quasar/Electron desktop application that have not worked).
Is there a way to display the PDF in the application window?
this.doc = new jsPDF({
  orientation: "landscape",
  unit: "in",
  format: [4, 2]
})
this.doc.text(this.dogArray[0].dogCallName, 1, 1)
this.doc.save("test.pdf")
That works and I can save the PDF, but I'd also like to be able to display the generated PDF in the Electron browser window. I can console.log out this.doc, and I can display it on the window, but it's just a bunch of string info.
Is there something like doc.view("file.pdf") that can be used? I'm looking through the jsPDF documentation but I'm not seeing what I'm looking for.
I want to be able to see the PDF like the author shows on his Demo Website
Another learning project in the works... I am trying to use paperjs in an electron app.
According to the instructions, I think I should be using paper-jsdom (please correct me if I'm wrong). BTW, I am using TypeScript if that makes a difference. I have an HTML document with nothing but an empty <canvas> and a <script> tag referencing this:
import paper, {Color, Point, Path} from 'paper-jsdom'

window.onload = (): void => {
    let canvas = document.getElementById("workspace") as HTMLCanvasElement;
    paper.setup(canvas);

    let path = new Path();
    path.strokeColor = Color.random();
    let start = new Point(100, 100);
    path.moveTo(start);
    path.lineTo(start.add(new Point(200, -50)));
    paper.view.update();
};
So right off the bat I get:
Uncaught TypeError: paper_jsdom_1.Path is not a constructor
Ugh... So I tried a few random things (it's late, I'm tired...) and changing my import to:
import paper from 'paper'
import {Color, Point, Path} from 'paper-jsdom'
works, or at least the code above works.
Am I supposed to be importing some things from 'paper' and others from 'paper-jsdom'? What is the correct way to use paperjs in an electron app?
Unfortunately paper-jsdom doesn't seem to have any type info for TS.
Thanks!!
Since you are using Paper.js in the renderer process of Electron, you are using it in a browser context, not a Node.js context, so you should use the common paper package, which relies on the browser Canvas API (and not paper-jsdom, which targets browserless usage).
So you should be able to use Paper.js as you would for a website.
From your code example, I see that you are using TypeScript so you can have a look at this simple quickstart project that I made to play with Paper.js and TypeScript.
It uses this kind of import:
import * as paper from 'paper';
And then access Paper.js classes through the imported paper object:
new paper.Path.Circle({
    center: paper.view.center,
    radius: 50,
    fillColor: 'orange',
});
Edit
Here is a repository showing the simplest way of using Paper.js in an Electron app.
I am using Google's Colaboratory platform to run python in a Jupyter notebook. In standard Jupyter notebooks, the output of sympy functions is correctly typeset Latex, but the Colaboratory notebook just outputs the Latex, as in the following code snippet:
import numpy as np
import sympy as sp
sp.init_printing(use_unicode=True)
x=sp.symbols('x')
a=sp.Integral(sp.sin(x)*sp.exp(x),x);a
results in Latex output like this:
$$\int e^{x} \sin{\left (x \right )}\, dx$$
The answer cited in these questions (Rendering LaTeX in output cells in Colaboratory and LaTeX equations do not render in google Colaboratory when using IPython.display.Latex) doesn't fix the problem. While it provides a method to display LaTeX expressions in the output of a code cell, it doesn't fix the output from the built-in sympy functions.
Any suggestions on how to get sympy output to properly render? Or is this a problem with the Colaboratory notebook?
I have just made this code snippet to make sympy work like a charm in colab.research.google.com!
def custom_latex_printer(exp, **options):
    from google.colab.output._publish import javascript
    url = "https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.3/latest.js?config=default"
    javascript(url=url)
    return sympy.printing.latex(exp, **options)

sympy.init_printing(use_latex="mathjax", latex_printer=custom_latex_printer)
Put it after you have imported sympy (the snippet assumes a plain import sympy, not an alias).
This basically tells sympy to embed the MathJax library via the Colab API before it outputs any LaTeX.
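For completeness, a minimal usage sketch (it reuses the integral from the question and assumes the snippet above has already been run after a plain import sympy):
import sympy

x = sympy.symbols('x')
a = sympy.Integral(sympy.sin(x) * sympy.exp(x), x)
a  # now renders as typeset math instead of raw $$...$$ source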
You need to include the MathJax library before displaying. Set it up in a cell like this first.
from google.colab.output._publish import javascript
url = "https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.3/latest.js?config=default"
Later, you include javascript(url=url) before displaying:
x=sp.symbols('x')
a=sp.Integral(sp.sin(x)*sp.exp(x),x)
javascript(url=url)
a
Then, it will display correctly.
Using colab's mathjax and setting the configuration file to TeX-MML-AM_HTMLorMML worked for me. Below is the code:
from sympy import init_printing
from sympy.printing import latex

def colab_LaTeX_printer(exp, **options):
    from google.colab.output._publish import javascript
    url_ = "https://colab.research.google.com/static/mathjax/MathJax.js?"
    cfg_ = "config=TeX-MML-AM_HTMLorMML"  # "config=default"
    javascript(url=url_ + cfg_)
    return latex(exp, **options)
# end of def

init_printing(use_latex="mathjax", latex_printer=colab_LaTeX_printer)
I want to process images before I send them to Tesseract for OCR.
For example:
Resize the image
Change the resolution to 300 dpi
Threshold (B&W image)
Sharpen image
How can I automate this process?
I've just put together an answer on Graphic Design Stack Exchange (https://graphicdesign.stackexchange.com/questions/53919/editing-several-hundred-images-gimp/53965#53965), which is intended as a GIMP automation primer for people with no programming skills; it should be good for understanding Python-Fu as well.
In that same answer there are links to the official documentation and an example of how to create a small script. You should then browse GIMP's PDB (Procedure Database) to find out the exact procedures you want.
But, all in all, you can create a Python file like this:
from gimpfu import *
import os
from glob import glob

def auto(source_folder, dest_folder):
    for filepath in glob(os.path.join(source_folder, "*.png")):
        filename = os.path.basename(filepath)
        img = pdb.gimp_file_load(filepath, filename)
        # place the PDB calls that process the image here, before the merge/save below
        # disp = pdb.gimp_display_new(img)
        pdb.gimp_image_merge_visible_layers(img, CLIP_TO_IMAGE)
        out_path = os.path.join(dest_folder, filename)
        pdb.gimp_file_save(img, img.layers[0], out_path, filename)
        # pdb.gimp_display_delete(disp)
        pdb.gimp_image_delete(img)  # drops the image from GIMP's memory

register("batch_process_for_blah",
    "Batch process for blah (short description)",
    "<Extended description text>",
    "author name",
    "copyright note",
    "date",
    "menu label for plug-in",
    "",  # image types the plug-in applies to - "*" for all, blank for plug-ins that open images themselves
    [(PF_DIRNAME, "source_folder", "Source Folder", None),
     (PF_DIRNAME, "dest_folder", "Dest Folder", None)],  # input parameters
    [],  # output parameters
    auto,  # the function to run
    menu="<Image>/File",  # location of the entry in the menus
)

main()
To find the operations you want for the inside of the for loop, go to Help->Procedure Browser - or, better yet, Filters->Python-Fu->Console and hit Browse - it is almost the same, but with an "Apply" button that makes it easy to test the call and copy it over to your plug-in code.
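For the specific preprocessing steps listed in the question, something along these lines could replace the placeholder comment inside the loop. This is only a sketch: the procedure names come from GIMP's PDB, but the scale factor, threshold, and sharpening values are placeholder assumptions you should tune in the Python-Fu console.
# inside the for loop, before merging and saving:
pdb.gimp_image_scale(img, 2 * img.width, 2 * img.height)  # resize (placeholder factor)
pdb.gimp_image_set_resolution(img, 300, 300)              # tag the image as 300 dpi
if pdb.gimp_image_base_type(img) == RGB:
    pdb.gimp_image_convert_grayscale(img)                 # needed before thresholding
drawable = pdb.gimp_image_get_active_drawable(img)
pdb.plug_in_unsharp_mask(img, drawable, 2.0, 0.5, 0)      # sharpen (placeholder values)
pdb.gimp_threshold(drawable, 127, 255)                    # binarize to black & white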
I followed the instructions provided here (How to create a shortcut for user's build system in Sublime Text?) to compile LaTeX documents with xelatex. On top of that, I would also like it to automatically open the PDF after compiling, just like latexmk does. How can I achieve that? The document builds just fine, but I have to open it manually each time.
Here's an extension to the CompileWithXelatexCommand implementation that successfully opens the PDF in my default PDF viewer.
import sublime, sublime_plugin
import os
import time

class CompileWithXelatexCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        if '/usr/texbin' not in os.environ['PATH']:
            os.environ['PATH'] += ':/usr/texbin'

        base_fname = self.view.file_name()[:-4]
        pdf_fname = base_fname + ".pdf"

        self.view.window().run_command('exec', {'cmd': ['xelatex', '-synctex=1', '-interaction=nonstopmode', base_fname]})

        # wait (with exponential backoff) until xelatex has produced the PDF
        tries = 5
        seconds_to_wait = 1
        while tries > 0:
            if os.path.isfile(pdf_fname):
                break
            time.sleep(seconds_to_wait)
            seconds_to_wait *= 2
            tries -= 1

        os.system("open " + pdf_fname)
The polling loop is required; otherwise, the open call may happen before the PDF has been generated. There may be a cleaner way to synchronously exec a sequence of commands via run_command.
I don't have access to Windows now, but from this post you'll probably just need to change "open " to "start ". The PATH initialization logic will either need to be eliminated or adjusted.
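If you want it to work across platforms without editing the command by hand, here is a small sketch of a platform-aware replacement for the os.system("open " + pdf_fname) line (the launchers used are the usual platform defaults; adjust if yours differ):
import os
import subprocess
import sys

def open_pdf(pdf_fname):
    # pick the platform's default "open this file" mechanism
    if sys.platform.startswith("win"):
        os.startfile(pdf_fname)                   # Windows
    elif sys.platform == "darwin":
        subprocess.call(["open", pdf_fname])      # macOS
    else:
        subprocess.call(["xdg-open", pdf_fname])  # Linux/BSD desktops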