TCPDF does not preserve tab spacing inside a <pre> tag

I'm using TCPDF to render a simple HTML document whose data sits mostly inside a "pre" tag. Everything works except when the text is separated by tab characters: the columns are then misaligned in the PDF.
I'm testing with the first sample code:
https://tcpdf.org/examples/example_001/
In it I replace the $html assignment with:
$html = <<<EOD
<pre>
RESEARCH LIST
XX-Name: Notes:
10-Progress Get +20
20-Research Get +2
27-Advanced Research Get +5
28-Atmospheric Converters Terraforming +1
29-Comet Ds Terraforming +4
</pre>
EOD;
I get the PDF, but the text isn't aligned in columns.
If I open the same content as an .html file, it is aligned correctly:
<HTML>
<HEAD>
<TITLE>XYZ</TITLE>
</HEAD>
<body>
<PRE>
DC-Name: Notes:
10-Progress Get +20
20-Research Get +2
27-Advanced Research Get +5
28-Atmospheric Converters Terraforming +1
29-Comet Ds Terraforming +4
</PRE>
</body>
</HTML>
What is wrong?
Regards,
Josep M

Related

Why do some characters such as "ç" look different from other characters?

I've got a French text on a website using "Nunito" from Google Fonts.
On Safari, I found that my text had bolder letters for characters such as "ç" or "é". Looking again, I realized they also differ in other browsers, just not as much.
I've tried including the font in different ways (link, font-face), but nothing does the trick.
<html>
<head>
<meta charset="utf-8">
<link href="https://fonts.googleapis.com/css?family=Nunito:700&display=swap" rel="stylesheet">
<style>
body {
font-size:20px;
font-family: 'Nunito', Arial, sans-serif;
}
</style>
</head>
<body>
comment ça marche ?
</body>
</html>
In the example, the "ç" looks off.
At some point I typed some text directly on Google Fonts, and it looked right.
That got me thinking, and I went back to my example.
Bingo!
The text I had was copied and pasted from what marketing sent me. That text didn't work, while typed text did.
The "ç" in the pasted text was char code 99 ("c") followed by 807 (the combining cedilla below it). Chrome and Firefox attached the two in an odd way that more or less worked, but Safari ignored the font and took the whole glyph from Arial.
The "ç" I typed on Google Fonts was char code 231, a single precomposed character in the Latin-1 range.
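One way to guard against this kind of pasted text is to normalize it to the composed form (Unicode NFC) before it reaches the page. The following is a minimal Python sketch of that idea, not something the original poster used; the helper name normalize_text is hypothetical, while unicodedata.normalize is part of the standard library.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Return text in NFC form, merging base letters with combining marks."""
    return unicodedata.normalize("NFC", text)


# Pasted text: "c" (99) followed by the combining cedilla (807).
pasted = "comment c\u0327a marche ?"
fixed = normalize_text(pasted)

print([ord(ch) for ch in pasted[8:10]])  # [99, 807]
print(ord(fixed[8]))                     # 231
```

After normalization the string contains the single code point 231, which is the character the font actually ships a glyph for.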

Mathjax formula is not showing properly on show page

I'm using CKEditor as a rich text editor, and I included the MathJax plugin for mathematical formulas. When I insert a formula in the editor's panel it renders properly, but on the show page only the raw source appears, for example: \(x = {-b \pm \sqrt{b^2-4ac} \over 2a}\). What should I do?
The MathJax plugin for CKEditor only renders a TeX string as a readable equation inside the editor. The stored source is still the TeX string, so if you output that content on your page as-is you will see the TeX string: \(x = {-b \pm \sqrt{b^2-4ac} \over 2a}\) (as in your example).
The easiest way to solve this is to add the MathJax script to your page (not just the editor), which will translate the TeX parts of your HTML into readable equations:
<script type="text/javascript" async
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_CHTML">
</script>
Here is a working example:
<script type="text/javascript" async
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_CHTML">
</script>
<div>The next line should be displayed as equation (and not just string) if the MathJax was loaded correctly:</div>
<div style="margin: 20px;">
\(x = {-b \pm \sqrt{b^2-4ac} \over 2a}\)
</div>

How to Capture html tags using lua pattern

This is what the text I'm trying to extract from looks like: http://pastebin.com/VD0K3ZcN
lines:match([[title="(value here)">]])
How can I get the "value here" part? It contains no digits and no ">" symbol, only letters, spaces, ', - and .
I have tried
lines:match([[title="(.+)">]])
but the capture simply ran on through the rest of the line.
The problem with your pattern is this:
title=" -- This is fine, but you probably want to find out what tag title is in.
(.+) -- Problem: Greedy match. I'll illustrate this later.
"> -- Will match a closing tag with a double quote.
Now, if I have this HTML:
<html>
<head title="Foobar">
</head>
<body onload="somejs();">
</body>
</html>
Your pattern will match:
Foobar"></head><body onload="somejs();
You can fix this by using (.-). This is the non-greedy version: it matches as little as possible, so the capture stops at the first "> that follows instead of the last one.
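The same greedy versus non-greedy behaviour can be reproduced outside Lua. The sketch below uses Python's re module as an analogue, not Lua itself: Python's lazy quantifier .*? plays the role of Lua's .- here. The sample string is the HTML from above joined onto one line, and the variable names are only illustrative.

```python
import re

html = '<html><head title="Foobar"></head><body onload="somejs();"></body></html>'

# Greedy: the capture runs to the LAST '">' in the string.
greedy = re.search(r'title="(.+)">', html).group(1)

# Non-greedy: the capture stops at the FIRST '">' after the opening quote.
lazy = re.search(r'title="(.*?)">', html).group(1)

# Alternative: forbid the closing quote inside the capture altogether.
negated = re.search(r'title="([^"]*)">', html).group(1)

print(greedy)   # Foobar"></head><body onload="somejs();
print(lazy)     # Foobar
print(negated)  # Foobar
```

The negated class is often the sturdier choice when the value is known never to contain a double quote, since it cannot run past the attribute no matter what follows.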

Writing Pure HTML in Jade

Jade appears to choke on WebGL shader/fragment blocks when they are written in Jade syntax, so I would like to write them as plain HTML while still writing Jade around them. Is this possible?
This:
html
  body
    | <h1>Title</h1>
    | <p>foo bar baz</p>
compiles into:
<!DOCTYPE html>
<html>
<body>
<h1>Title</h1>
<p>foo bar baz</p>
</body>
</html>
Hope it helps.

HTML parsing and extracting text

There are a number of libraries for parsing HTML pages and extracting their textual content; Jsoup is one example. In my case, I would like to extract the textual content together with the HTML tags that enclose each sentence. For example, take this page:
<html>
<head><title>Test Page</title>
<body>
<h1>This is a test page</h1>
<p> The goal is to extract <b>textual content <em> with html tags</em> </b> from html pages.
</body>
</html>
I'm expecting the output to be like this:
<h1>This is a test page</h1>
<p> The goal is to extract <b>textual content <em> with html tags</em> </b> from html pages.
In other words, I want to include specific html tags within the textual content of the page.
To get your result you can use this:
// Needs imports: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.select.Elements
final String html = "<html>"
+ "<head><title>Test Page</title>"
+ "<body>"
+ "<h1>This is a test page</h1>"
+ "<p> The goal is to extract <b>textual content <em> with html tags</em> </b> from html pages."
+ "</body>"
+ "</html>";
// Parse the String into a Jsoup Document
Document doc = Jsoup.parse(html);
Elements body = doc.body().children();
// Do further things here ...
System.out.println(body);
Instead of the String html you can also load a file or a website; Jsoup provides all of this.
In this example, body contains the HTML you posted as the expected result.
Or do you need to select something like "an h1 followed by a p tag"?
In that case, take a look at the Jsoup Selector API.
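For comparison, the same idea (keep everything inside body, tags included) can be sketched with only Python's standard library. This is not Jsoup and does not repair broken markup the way Jsoup does; the class name BodyMarkup and the function extract_body_markup are hypothetical names chosen for this illustration.

```python
from html.parser import HTMLParser


class BodyMarkup(HTMLParser):
    """Collects the tags and text that appear between <body> and </body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            # Re-emit the start tag exactly as it was written in the source.
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif self.in_body:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.in_body:
            self.parts.append(data)


def extract_body_markup(html: str) -> str:
    parser = BodyMarkup()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


page = ("<html><head><title>Test Page</title><body>"
        "<h1>This is a test page</h1>"
        "<p> The goal is to extract <b>textual content <em> with html tags</em> </b> from html pages."
        "</body></html>")
print(extract_body_markup(page))
```

To keep only specific tags, the two tag handlers could check the tag name against an allow-list before appending, which is roughly the job a selector does in Jsoup.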
You can do it in two steps. First, as you described, create a DOM tree using Jsoup. Then process it with an XSL filter, in which you can extract only the tags you are interested in.

Resources