Adobe InDesign GREP over multiple lines and paragraphs - grep

I am trying to create an InDesign syntax highlighting script for Lua code.
Something very simple and basic. I decided to go with this native approach. I've got strings, numbers and single-line comments covered already, but I'm having trouble with multi-line strings and multi-line comments.
Multi-line strings begin with [[ and end with ]] (possibly spanning multiple paragraphs).
Multi-line comments are the same, but begin with --[[ and end with (at least) ]]. Some examples are shown below.
The question is: how can I apply a GREP search across multiple paragraphs in InDesign? I've already tried all kinds of variants with lookbehind and lookahead, and also the (?s) modifier, which seems to work only up to the next line break.
local text = [[ Lua
multi
line
string]]
--[[
multiline comment
--]]
--[[ alternative
multi-
line
comment
]]
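To make the intended matches concrete, here is a minimal sketch of the two patterns in Python's re module, where re.DOTALL lets . cross line breaks. This is not InDesign GREP and does not by itself answer the InDesign question; the names LONG_COMMENT, LONG_STRING and sample are made up for illustration, and the lookbehind is a simple assumption for telling strings apart from comments.

```python
import re

# "--[[ ... ]]" : long comment, lazily matched up to the first "]]"
LONG_COMMENT = re.compile(r"--\[\[.*?\]\]", re.DOTALL)
# "[[ ... ]]" not directly preceded by "--" : long string
LONG_STRING = re.compile(r"(?<!--)\[\[.*?\]\]", re.DOTALL)

sample = (
    "local text = [[ Lua\n"
    "multi\n"
    "line\n"
    "string]]\n"
    "--[[\n"
    "multiline comment\n"
    "--]]\n"
    "--[[ alternative\n"
    "multi-\n"
    "line\n"
    "comment\n"
    "]]\n"
)

comments = LONG_COMMENT.findall(sample)
strings = LONG_STRING.findall(sample)

for found in comments:
    print("comment:", repr(found))
for found in strings:
    print("string: ", repr(found))
```

On the sample above this yields two comments and one string, each spanning several lines.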

Related

Does [:space:] in a grep command not include newlines and carriage returns? [duplicate]

I'm currently writing a simple Bash script. The idea is to use grep to find the lines where a certain pattern occurs within some files. The pattern consists of 3 capital letters followed by 6 digits, so the regex is [A-Z]{3}[0-9]{6}.
However, I need to only include the lines where this pattern is not concatenated with other strings, or in other words, if such a pattern is found, it has to be separated from other strings with spaces.
So if the string which matches the pattern is ABC123456 for example, the line something ABC123456 something should be fine, but somethingABC123456something should fail.
I've extended my regex using the [:space:] character class, like so:
[[:space:]][A-Z]{3}[0-9]{6}[[:space:]]
And this seems to work, except for when the string which matches the pattern is the first or last one in the line.
So, the line something ABC123456 something will match correctly;
The line ABC123456 something won't;
And the line something ABC123456 won't either.
I believe this has something to do with [:space:] not counting new lines and carriage returns as whitespace characters, even though it should from my understanding. Could anyone spot if I'm doing something wrong here?
A common solution to your problem is to normalize the input so that there is a space at the beginning and at the end of each line.
sed 's/^/ /;s/$/ /' file |
grep -oE '[[:space:]][A-Z]{3}[0-9]{6}[[:space:]]'
Your question assumes that the newlines are part of what grep sees, but that is not true (or at least not how grep is commonly implemented). Instead, it reads just the contents of each new line into a memory buffer, and then applies the regular expression to that buffer.
A similar but different solution is to specify beginning of line or space, and correspondingly space or end of line:
grep -oE '(^|[[:space:]])[A-Z]{3}[0-9]{6}([[:space:]]|$)' file
but this might not be entirely portable.
You might want to postprocess the results to trim any spaces from the extracted strings, too; but I have already had to guess several things about what you are actually trying to accomplish, so I'll stop here.
(Of course, sed can do everything grep can do, and then some, so perhaps switch to sed or Awk entirely rather than build an elaborate normalization pipeline around grep.)
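The "beginning of line or space ... space or end of line" idea can also be sketched in Python, as a rough illustration rather than a drop-in replacement for grep. Python's \s stands in for [[:space:]] here, and the names PATTERN and find_codes are made up. Like grep, the sketch looks at one line at a time, so the newline is never part of what the regex sees.

```python
import re

# Code must start at line start or after whitespace,
# and must be followed by whitespace or the end of the line.
PATTERN = re.compile(r"(?:^|\s)([A-Z]{3}[0-9]{6})(?=\s|$)")


def find_codes(text):
    """Return every standalone code, scanning the text line by line."""
    hits = []
    for line in text.splitlines():  # splitlines() drops the line terminators
        hits.extend(PATTERN.findall(line))
    return hits


if __name__ == "__main__":
    sample = (
        "something ABC123456 something\n"
        "ABC123456 something\n"
        "something ABC123456\n"
        "somethingABC123456something\n"
    )
    print(find_codes(sample))
```

The capture group returns only the code itself, so no trimming of surrounding spaces is needed afterwards.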

Snippets for Gedit: how to change the text in a placeholder to make the letters uppercase?

I’m trying to improve a snippet for Gedit that helps me write shell scripts.
Currently, the snippet encloses the name of a variable into double quotes surrounding curly brackets preceded with a dollar sign. But to make the letters uppercase, I have to switch to the caps-lock mode or hold down a shift key when entering the words. Here is the code of the snippet:
"\${$1}"
I would like that the snippet makes the letters uppercase for me. To do that, I need to know how to make text uppercase and change the content of a placeholder.
I have carefully read the following articles:
https://wiki.gnome.org/Apps/Gedit/Plugins/Snippets
https://blogs.gnome.org/jessevdk/2009/12/06/about-snippets/
https://www.marxists.org/admin/volunteers/gedit-sed.htm
How do you create a date snippet in gedit?
But I still have no idea how to achieve what I want — to make the letters uppercase. I tried to use the output of shell programs, a Python script, the regular expressions — the initial text in the placeholder is not changed. The last attempt was the following (for clarity, I removed the surrounding double-quotes and the curly brackets with the dollar — working just on the letter case):
${1}$<[1]: return $1.upper()>
But instead of MY_VARIABLE I get my_variableMY_VARIABLE.
Perhaps, the solution is obvious, but I cannot get it.
I did it! I found a solution!
First of all, I have to say that I don’t consider the solution correct or in keeping with the ideas of the Gedit editor. It’s a dirty hack, of course. But, strangely, there seems to be no way to change the initial content of placeholders in snippets. Or have I just not found the standard way to do that?
So. If they don’t allow us to change the text in placeholders, let’s ask the system to do that.
The first thought that struck me was to print backspace characters. There are two ways to do that: a shell script and a Python script. The first approach might look like:
$1$(printf '\b')
The second one should do the same:
$1$<[1]: return '\b'>
But neither of them works: Gedit prints surrogate squares instead of real backspace characters.
Thus, xdotool is our friend! So is metaprogramming! You will not believe — metaprogramming in a shell script inside a snippet — sed will be writing the scenario for xdotool. Also, I’ve added a feature that changes spaces to underscores for easier typing. Here is the final code of the snippet:
$1$(
eval \
xdotool key \
--delay 5 \
`echo "${1}" | sed "s/./ BackSpace/g;"`
echo "\"\${${1}}\"" \
| tr '[a-z ]' '[A-Z_]'
)$0
Here are some explanations.
Usually, I never use backticks in my scripts because of some troubles and incompatibilities. But now is not the case! It seems Gedit cannot interpret the $(...) constructions correctly when they are nested, so I use the backticks here.
A couple of words about using the xdotool command. The most critical part is the --delay option. By default, it’s 12 milliseconds. If I leave it as is, there will be an error when the length of the text in the placeholder is quite long. Not to mention the snippet processing becomes slow. But if I set the time interval too small, some of the emulated keystrokes sometimes will be swallowed somewhere. So, five milliseconds is the delay that turns out optimal for my system.
At last, as I use backspaces to erase the typed text, I cannot use template parts outside the placeholder. Thus, such transformations must be inside the script. The complex heap after the echo command is what the template parts are.
What the last tr command does is the motivator of all this activity.
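For readers who find the shell pipeline hard to follow, here is a plain Python sketch of the two text transformations it performs. It is not Gedit snippet code and does not call xdotool; the function names backspace_keys and quoted_variable are made up for illustration. Note that str.upper() also uppercases non-ASCII letters, which the tr range a-z does not.

```python
def backspace_keys(typed):
    """One 'BackSpace' key name per typed character,
    mirroring sed "s/./ BackSpace/g" in the snippet."""
    return ["BackSpace"] * len(typed)


def quoted_variable(typed):
    """Uppercase the name and turn spaces into underscores,
    mirroring tr '[a-z ]' '[A-Z_]', then wrap it as "${NAME}"."""
    name = typed.upper().replace(" ", "_")
    return '"${' + name + '}"'


if __name__ == "__main__":
    print(" ".join(backspace_keys("my variable")))
    print(quoted_variable("my variable"))
```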
It turns out that Gedit snippets can be a powerful tool. Good luck!

Multiline comments in Dockerfiles

Is there a fast way to comment out multiple lines in a Dockerfile?
I know that I can add # at the beginning of each line. But if there are many lines this is too much work. In some languages there are multiline comments such as /* ... */, which makes commenting out large parts of a file very fast.
As of today, no.
According to Dockerfile reference documentation:
Docker treats lines that begin with # as a comment, unless the line is
a valid parser directive. A # marker anywhere else in a line is
treated as an argument.
There are no further details on how to comment out lines.
As some comments have already said, most IDEs let you comment out multiple lines easily (for example with CTRL + / in IntelliJ).
There is no mention of multiline comments in the Docker documentation.
I also paste here the relevant part for simplicity:
Docker treats lines that begin with # as a comment, unless the line is
a valid parser directive.
A # marker anywhere else in a line is treated as an argument.
This allows statements like:
# Comment
RUN echo 'we are running some # of cool things'
Line continuation characters are not supported in comments.
On the other hand you can achieve the requested result easily with any modern IDE / Text Editor.
This is an example using Sublime Text (Select text and then control + /).
You can achieve the same result with VsCode, Notepad++, JetBrains products (IntelliJ, PyCharm, PHPStorm etc.) and almost 100% of the IDEs / Text Editors I know and use.
A good solution in VSCode (and many other IDEs) would be:
Select all the lines that you want to comment out and hit TAB three times. Now press CTRL+F, search for three TABs and replace all of them with '# '. Every line that had three TABs in front of it now has a '# ' in front of it.
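If no editor shortcut is at hand, the same per-line prefixing can be scripted. The following is a minimal sketch, assuming the Dockerfile has already been read into a list of lines; the function name comment_out and its 1-based inclusive range are choices made for this example, not part of any Docker tooling.

```python
def comment_out(lines, start, end):
    """Prefix lines start..end (1-based, inclusive) with '# '.
    Lines that are already comments are left unchanged."""
    result = []
    for number, line in enumerate(lines, start=1):
        in_range = start <= number <= end
        if in_range and not line.lstrip().startswith("#"):
            result.append("# " + line)
        else:
            result.append(line)
    return result


if __name__ == "__main__":
    dockerfile = ["FROM alpine", "RUN echo a", "RUN echo b", "CMD sh"]
    print("\n".join(comment_out(dockerfile, 2, 3)))
```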

Atom editor: indentation adding extra unwanted spaces

When I hit Enter in the Atom editor while editing Python code, the cursor on the new line is indented by two extra spaces. So it doesn't end up where it should, i.e. two spaces in, as in the rest of the code.
I have already followed the suggestions mentioned here, i.e. tablength=2, softtab with auto mode: How to change new line indentation in atom editor?
In vim it seems this is set by the shiftwidth keyword (i.e. =4 means same problem as above, =2 means it works). I couldn't find this keyword for Atom though.

LaTeX - Listings - Code Indention

I'm working on a kind of homework paper about git and I want to insert some console output examples. I'm working with TextMate.
I have my LaTeX code indented like any other source code, to make it more readable.
My question now is: why are the listings indented in my output PDF, and how do I prevent that?
Some example code:
\begin{lstlisting}
$ git ls-files
README
TU_Logo_SW.pdf
beleg.pdf
beleg.tex
\end{lstlisting}
In my file there is one tab in front of \begin and two in the lines following.
When I run pdflatex, the code is indented by two tab stops in the PDF. A quick fix is to format all the listings without indentation in my tex file, but that's pretty ugly ;-(
lstlisting has a key that lets you remove spaces:
\begin{lstlisting}[gobble=4] will remove the first four characters from every input line in the environment. (I think a tab should still count as one character at that point.)
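As a rough mental model only (this is Python, not how the listings package is implemented), gobbling amounts to dropping a fixed number of leading characters from every line. In this model a tab counts as one character, which matches the guess above but is not verified against listings; the function name gobble is borrowed from the key purely for illustration.

```python
def gobble(lines, count):
    """Drop the first `count` characters from every line."""
    return [line[count:] for line in lines]


if __name__ == "__main__":
    # Two leading tabs per line, as described in the question.
    listing = ["\t\t$ git ls-files", "\t\tREADME", "\t\tbeleg.tex"]
    print("\n".join(gobble(listing, 2)))
```

With two leading tabs per line, a count of 2 would be the value to try under this assumption.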

Resources