How do I convert richtext pasteboard content to plaintext with Hammerspoon? - lua

I am looking for a way to automatically convert rich text copied to the clipboard (pasteboard) to plain text in Hammerspoon (Lua code).
I know how to access the pasteboard in Lua, but I have no idea how to bind this action to the copy or paste event in order to automate it (nor how to convert the content to plain text).
local pasteboard = require("hs.pasteboard")

The easiest method is to use the answer described here to fetch the RTF data from the pasteboard and pipe it to the built-in textutil command, which converts it to plain text on stdout:
osascript -e 'the clipboard as «class RTF »' | \
perl -ne 'print chr foreach unpack("C*",pack("H*",substr($_,11,-3)))' | \
textutil -stdin -stdout -convert txt
In the Hammerspoon environment we can then use hs.execute to run the shell command and return the converted value, so in your Lua code it's as simple as:
local text = hs.execute([[
osascript -e 'the clipboard as «class RTF »' | \
perl -ne 'print chr foreach unpack("C*",pack("H*",substr($_,11,-3)))' | \
textutil -stdin -stdout -convert txt
]])
FYI, the Hammerspoon API does allow you to retrieve RTF data from the pasteboard with hs.pasteboard.readDataForUTI and the "public.rtf" UTI, so technically you could do all this in Lua, but you would have to convert the RTF data yourself.


How bobril translation import/export works in bobril-build?

How do the bb translation commands work together?
Running bb b -l 1 worked fine, but all strings still need to be rewritten for the other languages.
bb t -a adds a new language, e.g. "cs-CZ", and creates a JSON file named after the language code.
The question is: how can I export and import all strings to and from a JSON file for translation?
bb t -e: is fileName a JSON file or a JS file in dist? Export doesn't work in my case; no strings are exported.
bb t -e filename.txt -l cs-CZ is the correct way to export untranslated strings to a text file with a very simple structure. After it comes back from the translation agency, you can import it with bb t -i filename.txt -l cs-CZ.
Before exporting, always update the translation files with bb b -l 1 -u 1, as you already found out. The actual JSON files in the translations directory contain an array of arrays of 3 or 4 items [original, hint, 0/1 - with/out parameters, translation]. So you can translate them directly if you create an editor for them...
Also, please update bobril-build to 0.56.1; I just fixed a wrong error message that export printed even though everything was OK. Maybe that confused you and made you ask, sorry for that.

How to extract hyperlinks from office documents using tika

I'm using Apache Tika to extract raw text from various document formats including office.
When extracting text from word documents that include hyperlinks, then only the text is extracted and the information about the hyperlink is lost.
Is there a way to configure the parser so that the underlying link is also extracted?
ParseContext context = new ParseContext();
Detector detector = new DefaultDetector();
Parser parser = new AutoDetectParser(detector);
context.set(Parser.class, parser);
Metadata metadata = new Metadata();
try (TikaInputStream input = TikaInputStream.get(new File(fileName))) {
    BodyContentHandler handler = new BodyContentHandler();
    parser.parse(input, handler, metadata, context);
    String rawText = handler.toString();
    // no explicit input.close() needed: try-with-resources closes the stream
}
I'm using tika-app to extract hyperlinks from office documents in bash. I'm using the --html option to output the HTML content of files. I'm then using sed and grep to filter the HTML to just the contents of href attributes in that HTML. The result I get is the content of each href, one per line.
java -jar /root/tika-app-1.20.jar --html TEST.docx 2>/dev/null | sed 's/href/\nhref/g' | grep '^href' | sed 's/href="//' | sed 's/".*//'
I know that OP is not using tika-app, but the general approach can be applied using Tika from Java too.
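The filtering half of that pipeline can be tried without Tika. The sketch below is a variant, not the command above: it uses grep -o instead of inserting newlines with sed, which avoids relying on "\n" in a sed replacement. The function name extract_hrefs and the sample HTML are assumptions for illustration, and it assumes double-quoted href attributes, as in the original command.

```sh
# Hypothetical helper: read HTML on stdin, print the value of each
# double-quoted href attribute, one per line.
extract_hrefs() {
    grep -o 'href="[^"]*"' | sed -e 's/^href="//' -e 's/"$//'
}

# Sample HTML standing in for the output of tika-app --html.
printf '%s\n' '<p><a href="https://example.com/a">A</a> and <a href="https://example.com/b">B</a></p>' | extract_hrefs
```

With the real tool this would be used as java -jar /root/tika-app-1.20.jar --html TEST.docx 2>/dev/null | extract_hrefs.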

zsh script to encode full file path

I want to be able to encode a path for use as a URL, i.e. change spaces to %20. I found this function which does the encoding:
urlencode() {
    setopt localoptions extendedglob
    input=( ${(s::)1} )
    print ${(j::)input/(#b)([^A-Za-z0-9_.\!~*\'\(\)- ])/%${(l:2::0:)$(([##16]#match))}}
}
and want to be able to pass the results of this:
print -l $PWD/* | tail -1
to the function, i.e. get the last full path in the file list and encode it.
I thought that something like this:
print -l $PWD/* | tail -1 | urlencode
or
print -l $PWD/* | tail -1 > urlencode
would work but they don't.
Does anyone know how to accomplish it?
Many Thanks
You need to get your input from stdin rather than from the first argument.
Here is one way to adapt the function to do this:
urlencode() {
    setopt localoptions extendedglob
    stdin=`while read line; do echo $line ;done`
    input=( ${(s::)stdin} )
    print ${(j::)input/(#b)([^A-Za-z0-9_.\!~*\'\(\)- ])/%${(l:2::0:)$(([##16]#match))}}
}
I tested it in my terminal and it works.

How do I create a batch file that reads a file and writes a substring into another file?

I currently have an exported text file (output.txt) from a ClearCase command that I need to parse. It looks like this:
Comparing the following:
R1.PROD.V1#\VOB_pvob
R2.PROD.V1#\VOB_pvob
Differences:
>> M:\ACME_PROD\src\ACME##
>> M:\ACME_PROD\src\ACME\file 2.txt##
>> M:\ACME_PROD\src\ACME\file 1.txt##
>> M:\ACME_PROD\src\ACME\file 3.txt##
What I would like to do is use the findstr command to filter the strings that are contained between the ">> " and "##" strings to get an output file that looks like this (with quotes if possible):
"M:\ACME_PROD\src\ACME"
"M:\ACME_PROD\src\ACME\file 2.txt"
"M:\ACME_PROD\src\ACME\file 1.txt"
"M:\ACME_PROD\src\ACME\file 3.txt"
I am new to writing batch files and so I don't exactly know where to start. I have managed to find code that can loop through the lines of a text file and separate code for the findstr command, but I get stuck trying to put it all together!
Best regards,
Andrew
Here you go
setlocal enabledelayedexpansion
for /f "skip=1 tokens=* delims=> " %%a in ('"findstr /r [\w^>*] output.txt"') do (
    set line=%%a
    set line=!line:#=!
    echo "!line!" >>new.txt
)
The filtered strings will be written to new.txt.

Reading a file line by line using bash, extracting some data. How?

I want to read a file and extract information from it based on a certain tag. For example:
SCRIPT_NAME:mySimpleShell.sh
This is a simple shell. I would like to have this as
Description. I also want to create a txt file our of this.
SCRIPT_NAME:myComplexShell.sh
This is a complex shell. I would like to have this as
Description. I also want to create a txt file our of this.
So when I pass this file to my shell script, the script will read it line by line, and
when it gets to SCRIPT_NAME, it extracts the name and saves it in $FILE_NAME, then starts writing
the description to a file on disk named $FILE_NAME.txt. It does this until it reaches the end of the file. If there are 3 SCRIPT_NAME tags, then it creates 3 description files.
Thanks for helping me in advance :)
Read the lines using a while loop. Use a regex to check if a line has SCRIPT_NAME and if so, extract the filename. This is shown below:
#! /bin/bash
while IFS= read -r line
do
    if [[ $line =~ SCRIPT_NAME:(.*$) ]]
    then
        FILENAME="${BASH_REMATCH[1]}"
        echo "Writing to $FILENAME.txt"
    else
        echo "$line" >> "$FILENAME.txt"
    fi
done < inputFile
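#!/bin/sh
# Quote the file argument and parenthesize the output name for portability across awk versions.
awk '/^SCRIPT_NAME:/ { split( $0, a, ":" ); name=a[2]; next }
    name { print > (name ".txt") }' "${1?No input file specified}"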
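As a further sketch along the same lines, the awk approach can be wrapped in a function that also copes with script names containing a colon and closes each output file before opening the next. The function name split_descriptions is an illustrative assumption; like the scripts above, it writes the .txt files into the current directory.

```sh
# Sketch: split a tagged file into one <name>.txt per SCRIPT_NAME section.
# $1 is the input file.
split_descriptions() {
    awk '
        /^SCRIPT_NAME:/ {
            if (name != "") close(name ".txt")   # finish the previous file
            name = substr($0, 13)                # text after "SCRIPT_NAME:"
            next
        }
        name != "" { print > (name ".txt") }     # skip lines before the first tag
    ' "$1"
}
```

Called as split_descriptions inputFile, it would produce mySimpleShell.sh.txt and myComplexShell.sh.txt for the sample in the question.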
