dustjs html escape output from dusthelper - dust.js

I want to HTML-escape the content from #pre. The output of {#pre key="property-key" type="content"/} is: this contains " and ' in sentence. I want it escaped as: this contains &#34; and &#39; in sentence.
I tried {#pre key="property-key" type="content" filters="h"/}, but it does not work.

Because {#pre} (I assume this one) returns a Chunk, you can't directly modify the output.
You could wrap this helper in your own helper that HTML-escapes the return value.
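dust.helpers.escapePre = function(chunk, context, bodies, params) {
  // Tap the chunk so everything the pre helper writes is HTML-escaped,
  // then remove the tap so later output is not escaped as well.
  return chunk.tap(function(data) {
    return dust.escapeHtml(data);
  }).helper('pre', context, bodies, params).untap();
};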
{#escapePre key="property-key" type="content" /}
This example helper just invokes the pre helper, and taps the output through a function that escapes all data passed to it.
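To see what the escaping step does to the sample sentence without loading Dust, here is a minimal standalone sketch. It is not Dust's own escapeHtml; escapeQuotes and ENTITIES are hypothetical names, and the numeric entities are chosen only to match the output asked for in the question.

```javascript
// Hypothetical stand-in for the escaping step; not Dust's implementation.
// Maps each HTML-sensitive character to an entity, using numeric
// entities for the two quote characters as requested in the question.
const ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&#34;',
  "'": '&#39;'
};

function escapeQuotes(text) {
  // Coerce to string first so numbers and other values are handled too.
  return String(text).replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

console.log(escapeQuotes(`this contains " and ' in sentence`));
```

A function like this could be passed to chunk.tap in place of dust.escapeHtml if the exact entity form matters.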

Related

Custom pandoc writer in lua: attempt to call nil value

I'm trying to set up a simple custom writer going from pandoc's markdown to latex. Here's what I have so far:
test.md
# A section
## A subsection
Heres a paragraph.
Heres another
custom_writer.lua
function Header(lev, s, attr)
  level_sequences = {
    "section",
    "subsection",
    "subsubsection",
    "subsubsubsection"
  }
  return string.format("\\%s{%s}", level_sequences[lev], s)
end
function Para(s)
  return s.."\\parskip"
end
function Str(s)
  return s
end
function Space()
  return " "
end
Question
As far as I understand from the docs
A writer using the classic style defines rendering functions for each element of the pandoc AST
I checked the resulting JSON from my markdown file, and only the following elements occur:
Header
Para
Str
Space
It seems to me that I've covered all the necessary elements in the AST, so I'm not sure why pandoc complains with Error running lua: attempt to call a nil value when I do the following:
pandoc test.md -t custom_writer.lua
Does anyone know what I'm missing in custom_writer.lua?
I was missing two functions, Blocksep and Doc, which are not documented:
function Header(lev, s, attr)
  level_sequences = {
    "section",
    "subsection",
    "subsubsection",
    "subsubsubsection"
  }
  return string.format("\\%s{%s}", level_sequences[lev], s)
end
function Blocksep()
  return "\\parskip"
end
function Para(s)
  return s.."\\parskip"
end
function Str(s)
  return s
end
function Space()
  return " "
end
function Doc(body, metadata, variables)
  return body
end

Replace console.log(), how to input multiple arguments? (NodeJS)

I want to make a new function to "replace" console.log(). Basically a print() function that lets you add a timestamp at the beginning of the text and also change the colors of the timestamp, followed by the text you want to print.
Example:
const colors = require('colors');
function debug_print(str) {
  // colors adds .brightBlue to strings, so convert the timestamp to a string first
  console.log(String(new Date().getTime()).brightBlue + ": " + str);
}
It works, but it doesn't accept multiple arguments, and it can't print an object the way console.log does:
const myObject = {"hello": "hey"};
console.log("myObject", myObject); // works, prints: myObject { hello: 'hey' }
debug_print("myObject", myObject); // doesn't work, only the first argument is used
How do I change my function to both allow multiple arguments and also print out objects the same way console.log does, all in one print line?
You can use rest parameters (the ... syntax) when defining the function, and spread them when calling console.log. You should not use arguments itself (unless you really know what you are doing).
function debug_print(...msg) {
  console.log('whatever: ', ...msg);
}
You can use the arguments object to do so.
In Javascript, arguments is a local JavaScript object variable that is available in all non-arrow functions.
It is an Array-like object, containing all the arguments passed.
function my_function() {
  // convert the arguments object to a normal Array
  var args = Array.from(arguments);
  console.log.apply(console, args);
}
my_function(1, 2, "custom text"); // prints: 1 2 custom text
As you want to add a text at the beginning of the message, you can simply add an element at the beginning of the Array
args.unshift(timestamp + ": ");
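Putting the pieces together, a minimal sketch might look like the following. It assumes Node's built-in util.format is an acceptable way to render the arguments (it is what console.log uses for formatting), and it leaves out the colors dependency so the snippet runs on its own. formatDebugLine and debugPrint are hypothetical names.

```javascript
const util = require('util');

// Hypothetical helper: builds the single line that will be printed.
// util.format joins its arguments with spaces and inspects objects,
// which is the same formatting console.log applies.
function formatDebugLine(timestamp, ...args) {
  return `${timestamp}: ${util.format(...args)}`;
}

// Accepts any number of arguments, like console.log.
function debugPrint(...args) {
  console.log(formatDebugLine(Date.now(), ...args));
}

debugPrint("myObject", { hello: "hey" });
```

Keeping the formatting in its own function makes it easy to add color to the timestamp later without touching the argument handling.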

Implement heredocs with trim indent using PEG.js

I'm working on a language similar to Ruby called Gaiman, and I'm using PEG.js to generate the parser.
Do you know if there is a way to implement heredocs with proper indentation?
xxx = <<<END
      hello
      world
      END
the output should be:
"hello
world"
I need this because this code doesn't look very nice:
def foo(arg) {
  if arg == "here" then
    return <<<END
xxx
xxx
END
  end
end
this is a function where the user wants to return:
"xxx
xxx"
I would prefer the code to look like this:
def foo(arg) {
  if arg == "here" then
    return <<<END
           xxx
           xxx
           END
  end
end
If I trim all the lines, the user will not be able to use a string with leading spaces when they want to. Does anyone know if PEG.js allows this?
I don't have any code yet for heredocs, just want to be sure if something that I want is possible.
EDIT:
So I've tried to implement heredocs and the problem is that PEG doesn't allow back-references.
heredoc = "<<<" marker:[\w]+ "\n" text:[\s\S]+ marker {
    return text.join('');
}
It says that the marker is not defined. As for trimming, I think I can use the location() function.
I don't think that's a reasonable expectation for a parser generator; few if any would be equal to the challenge.
For a start, recognising the here-string syntax is inherently context-sensitive, since the end-delimiter must be a precise copy of the delimiter provided after the <<< token. So you would need a custom lexical analyser, and that means that you need a parser generator which allows you to use a custom lexical analyser. (So a parser generator which assumes you want a scannerless parser might not be the optimal choice.)
Recognising the end of the here-string token shouldn't be too difficult, although you can't do it with a single regular expression. My approach would be to use a custom scanning function which breaks the here-string into a series of lines, concatenating them as it goes until it reaches a line containing only the end-delimiter.
Once you've recognised the text of the literal, all you need to normalise the spaces in the way you want is the column number at which the <<< starts. With that, you can trim each line in the string literal. So you only need a lexical scanner which accurately reports token position. Trimming wouldn't normally be done inside the generated lexical scanner; rather, it would be the associated semantic action. (Equally, it could be a semantic action in the grammar. But it's always going to be code that you write.)
When you trim the literal, you'll need to deal with the cases in which it is impossible, because the user has not respected the indentation requirement. And you'll need to do something with tab characters; getting those right probably means that you'll want a lexical scanner which computes visible column positions rather than character offsets.
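As a rough illustration of the scanning and trimming described above, here is a minimal hand-written sketch in JavaScript. It is not tied to PEG.js or any other parser generator; scanHeredoc is a hypothetical name, the column is a 0-based character offset, and tabs are not expanded to visible columns, so it only covers the space-indented case.

```javascript
// Hypothetical scanner, independent of any parser generator.
// lines:     the source text split on "\n"
// lineIndex: index of the line holding the "<<<" token
// column:    0-based character offset of "<<<" on that line
function scanHeredoc(lines, lineIndex, column) {
  // The delimiter is whatever follows "<<<" on the opening line.
  const marker = lines[lineIndex].slice(column + 3).trim();
  const body = [];
  for (let i = lineIndex + 1; i < lines.length; i++) {
    // A line containing only the end-delimiter closes the literal.
    if (lines[i].trim() === marker) {
      return { text: body.join('\n'), endLine: i };
    }
    // Every body line must be blank up to the column of "<<<";
    // otherwise the indentation requirement was not respected.
    if (lines[i].slice(0, column).trim() !== '') {
      throw new Error(`line ${i + 1}: heredoc body starts before column ${column}`);
    }
    body.push(lines[i].slice(column));
  }
  throw new Error(`unterminated heredoc: marker "${marker}" not found`);
}

const source = ['xxx = <<<END', '      hello', '        world', '      END', 'y = 1'];
console.log(scanHeredoc(source, 0, 6).text);
```

Raising an error for under-indented lines is only one possible policy; a real implementation might instead fall back to leaving such lines untrimmed.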
I don't know if peg.js corresponds with those requirements, since I don't use it. (I did look at the documentation, and failed to see any indication as to how you might incorporate a custom scanner function. But that doesn't mean there isn't a way to do it.) I hope that the discussion above at least lets you check the detailed documentation for the parser generator you want to use, and otherwise find a different parser generator which will work for you in this use case.
Here is an implementation of heredocs in Peggy, the successor to PEG.js (which is no longer maintained). This code is based on a GitHub issue.
heredoc = "<<<" begin:marker "\n" text:($any_char+ "\n")+ _ end:marker (
    &{ return begin === end; }
    / '' { error(`Expected matched marker "${begin}", but marker "${end}" was found`); }
) {
    const loc = location();
    const min = loc.start.column - 1;
    const re = new RegExp(`\\s{${min}}`);
    return text.map(line => {
        return line[0].replace(re, '');
    }).join('\n');
}
any_char = (!"\n" .)
marker_char = (!" " !"\n" .)
marker "Marker" = $marker_char+
_ "whitespace"
  = [ \t\n\r]* { return []; }
EDIT: the grammar above didn't work when more code followed the heredoc. Here is a better grammar:
{ let heredoc_begin = null; }
heredoc = "<<<" beginMarker "\n" text:content endMarker {
    const loc = location();
    const min = loc.start.column - 1;
    const re = new RegExp(`^\\s{${min}}`, 'mg');
    return {
        type: 'Literal',
        value: text.replace(re, '')
    };
}
__ = (!"\n" !" " .)
marker 'Marker' = $__+
beginMarker = m:marker { heredoc_begin = m; }
endMarker = "\n" " "* end:marker &{ return heredoc_begin === end; }
content = $(!endMarker .)*
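The trimming step in the action can also be tried outside the grammar. The sketch below is an assumption-laden variant, not the grammar's exact code: trimHeredoc is a hypothetical name, column stands in for location().start.column (1-based), and the character class is [ \t] rather than \s, because \s also matches line breaks and can therefore swallow a blank line inside the body.

```javascript
// Hypothetical standalone version of the trimming step.
// column is the 1-based column of the "<<<" token, as reported
// by location().start.column in the grammar action.
function trimHeredoc(text, column) {
  const min = column - 1;
  // With the 'm' flag, ^ matches at the start of every line.
  // [ \t] is used instead of \s so that newlines are never consumed.
  const re = new RegExp(`^[ \\t]{${min}}`, 'mg');
  return text.replace(re, '');
}

console.log(trimHeredoc('      hello\n        world', 7));
```

Lines indented less than the required amount are left untouched by this version, since the pattern simply fails to match them.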

TCPDF - superscript without HTML

I'm using TCPDF to create PDF documents and need to render a superscript character without using HTML as the multicell option. I can't use HTML because I need to align the words vertically at the bottom of the cell, which doesn't work when the cell has HTML enabled.
Any ideas?
[Edit]
According to Jakuje's hint, I'm using this code to convert the unicode-characters:
$unicodeTable = array('<sup>1</sup>'=>'U+00B9', '<sup>2</sup>'=>'U+00B2', '<sup>3</sup>'=>'U+00B3', '<sup>4</sup>'=>'U+2074', '<sup>5</sup>'=>'U+2075');
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}
function unicode_chr ($chr) {
    $x = explode("+", $chr);
    $str = "\u".end($x);
    return preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $str);
}
foreach($unicodeTable as $uKey=>$uValue){
    $text = str_replace($uKey, unicode_chr($uValue), $text);
}
This works in pure PHP/HTML, but when I use this code with TCPDF, all I get is the Unicode escape code (e.g. \u00B9).
You can use a UTF-8 superscript character directly if it is a common one, such as x² or xⁿ.
I found that the following works with TCPDF:
json_decode('"\u00B3"') // for PHP 5.x
"\u{00B2}" // for PHP 7.x
This is based on the Stack Overflow question "Unicode character in PHP string".
Tested with TCPDF 6.2.13 and PHP 7.1.4.

Getting cleaned HTML in text from HtmlCleaner

I want to see the cleaned HTML that we get from HTMLCleaner.
I see there is a method called serialize on TagNode, but I don't know how to use it.
Does anybody have any sample code for it?
Thanks
Nayn
Here's the sample code:
HtmlCleaner htmlCleaner = new HtmlCleaner();
TagNode root = htmlCleaner.clean(url);
// getInnerHtml returns only the children, so wrap the result in the root tag
String html = "<" + root.getName() + ">" + htmlCleaner.getInnerHtml(root) + "</" + root.getName() + ">";
Use a subclass of org.htmlcleaner.XmlSerializer, for example:
// get the element you want to serialize
HtmlCleaner cleaner = new HtmlCleaner();
TagNode rootTagNode = cleaner.clean(url);
// set up properties for the serializer (optional, see online docs)
CleanerProperties cleanerProperties = cleaner.getProperties();
cleanerProperties.setOmitXmlDeclaration(true);
// use the getAsString method on an XmlSerializer class
XmlSerializer xmlSerializer = new PrettyXmlSerializer(cleanerProperties);
String html = xmlSerializer.getAsString(rootTagNode);
XmlSerializer xmlSerializer = new PrettyXmlSerializer(cleanerProperties);
String html = xmlSerializer.getAsString(rootTagNode);
The method above has a problem: it trims the content inside HTML tags. For example,
<p> this is paragraph1. </p>
will become
<p>this is paragraph1.</p>
It is the getSingleLineOfChildren function that does the trim operation. So this is a problem if we fetch data from a website and want to keep the format, like tuckunder.
PS: if an HTML tag has child tags, the parent tag's content will not be trimmed;
for example, <p> this is paragraph1. <a>www.xxxxx.com</a> </p> will keep the whitespace before "this is paragraph1".
