GNU-M4: Strip empty lines - preprocessor

How can I strip empty lines (surplus empty lines) from an input file using M4?
I know I can append dnl to the end of each line of my script to suppress the newline output, but the blank lines I mean are not in my script. They are in a data file that is included (where I am not supposed to put dnl's).
I tried something like this:
define(`
',`')
(replace a new-line by nothing)
But it didn't work.
Thanks.

I use divert() around my definitions:
divert(-1) will suppress the output
divert(0) will restore the output
E.g.:
divert(-1)dnl output suppressed starting here
define(..)
define(..)
divert(0)dnl normal output starting here
use_my_definitions()...

I understand your problem to be a data file with extra line breaks, meaning that where you want to have the pattern data<NL>moredata you have things like data<NL><NL>moredata.
Here's a sample to cut/paste onto your command line that uses here documents to generate a data set and runs an m4 script to remove the breaks in the data set. You can see the patsubst command replaces every instance of one or more newlines in sequence (<NL><NL>*) with exactly one newline.
cat > data << -----
1, 2
3, 4
5, 6
7, 8
9, 10
11, 12
e
-----
m4 << "-----"
define(`rmbreaks', `patsubst(`$*', `

*', `
')')dnl
rmbreaks(include(data))dnl
-----
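For comparison, the same "one or more newlines become exactly one" substitution can be sketched in Python. This is only an illustration of the regular expression idea, not part of the m4 solution; the function name collapse_blank_lines and the sample text are made up for the example. Lines that contain only spaces or tabs are not treated as empty here.

```python
import re

def collapse_blank_lines(text):
    # Replace every run of one or more newlines with a single newline,
    # mirroring the <NL><NL>* pattern used with patsubst.
    return re.sub(r"\n+", "\n", text)

sample = "1, 2\n\n3, 4\n\n\n5, 6\n"
print(collapse_blank_lines(sample), end="")
```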

Related

Remove two lines using sed

I'm writing a script that parses an HTML document. I would like to remove two lines. How does sed work with newlines? I tried
sed 's/<!DOCTYPE.*\n<h1.*/<newstring>/g'
which didn't work. I tried this statement but it removes the whole document because it seems to remove all newlines:
sed ':a;N;$!ba;s/<!DOCTYPE.*\n<h1.*\n<b.*/<newstring>/g'
Any ideas? Maybe I should work with awk?
For the simple task of removing two lines if each matches some pattern, all you need to do is:
sed '/<!DOCTYPE.*/{N;/\n<h1.*/d}'
This uses an address matching the first line you want to delete. When the address matches, it executes:
Next - append the next line to the current pattern-space (including \n)
Then, it matches on an address for the contents of the second line (following \n). If that works it executes:
delete - discard current input and start reading next unread line
If d isn't executed, then both lines will print by default and execution will continue as normal.
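The match-append-delete logic described above can also be sketched in Python. This is a rough illustration of the idea rather than a line-for-line emulation of sed's pattern space (for instance, a second line that is not deleted is tested again as a possible first line here). The function name drop_pair and its default patterns are hypothetical.

```python
import re

def drop_pair(lines, first=r"<!DOCTYPE", second=r"<h1"):
    """Drop a line matching `first` together with the line after it,
    but only when that following line starts with `second`."""
    kept = []
    i = 0
    while i < len(lines):
        is_pair = (
            re.search(first, lines[i]) is not None
            and i + 1 < len(lines)
            and re.match(second, lines[i + 1]) is not None
        )
        if is_pair:
            i += 2  # comparable to d: discard both lines
        else:
            kept.append(lines[i])
            i += 1
    return kept

doc = ["<!DOCTYPE html>", "<h1>Title</h1>", "<b>kept</b>"]
print(drop_pair(doc))
```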
To adjust this for three lines, you need only use N again. If you want to pull in multiple lines until some delimiter is reached, you can use a line-pump, which looks something like this:
/<!DOCTYPE.*/{
:pump
N
/some-regex-to-stop-pump/!b pump
/regex-which-indicates-we-should-delete/d
}
However, writing a full XML parser in sed or awk is a Herculean task and you're likely better off using an existing solution.
If an XML parsing tool is definitely not an option, awk may be an option:
awk '/<!DOCTYPE/ { lne=NR+1;next } NR==lne && /<h1/ { next }1' file
When we encounter a line with "<!DOCTYPE" set the variable lne to the line number + 1 (NR+1) and then skip to the next line. Then when the line is equal to lne (NR==lne) and the line contains "<h1", skip to the next line. Print all other lines by using 1.
My solution for a document like this:
<b>...
<first...
<second...
<third...
<a ...
this awk command works well (a regular expression as RS needs an awk that supports it, such as GNU awk):
awk -v RS='<first[^\n]*\n<second[^\n]*\n<third[^\n]*\n' '{printf "%s", $0}'
that's all.
This might work for you (GNU sed):
sed 'N;/<!DOCTYPE.*\n<h1.*/d;P;D' file
Append the following line and if the pattern matches both lines in the pattern space delete them.
Otherwise, print then delete the first of the two lines and repeat.
To replace the two lines with another string, use:
sed 'N;s/<!DOCTYPE.*\n<h1.*/another string/;P;D'

SPSS print a statement to the output window

What syntax do I use to print "hello world" in the output window?
I simply want to specify the text in the syntax and have it appear in the output.
You need the title command:
title 'this is my text'.
Note that the title can be up to 256 bytes long.
Alternatively, you could use ECHO. ECHO is useful for debugging macro variable assignments, whereas TITLE is useful for a neat, organised presentation of your tables, perhaps with the intention of exporting the output.
If you want to write arbitrary text in its own block in the Viewer rather than having it stuck in a log block, use the TEXT extension command (Utilities > Create text output). You can even include HTML markup in the text.
If you don't have this extension installed, you can install it from the Utilities menu in Statistics 22 or 23 or the Extensions menu in V24.
example:
TEXT "The following output is very important!"
/OUTLINE HEADING="Comment" TITLE="Comment".
Outfile here is used in an ambiguous way. The prior two answers (the TITLE and ECHO commands) simply print something to the output window. One additional way to print to the output window is the PRINT command.
DATA LIST FREE / X.
BEGIN DATA
1
2
END DATA.
PRINT /'Hello World'.
EXECUTE.
If you run that syntax, you will see that 'Hello World' is printed twice, once for each record in the dataset. One way to print only one line is to wrap it in a DO IF block that selects only the first case.
DO IF $casenum=1.
PRINT /'Hello World'.
END IF.
EXECUTE.
Now how is this any different than the prior two commands? Besides aesthetic looks in the output window, PRINT allows you to save an actual text file of the results via the OUTFILE parameter, which is something neither of the prior two commands allows.
DO IF $casenum=1.
PRINT OUTFILE='C:\Users\Your Name\Desktop\Hello.txt' /'Hello World'.
END IF.
EXECUTE.

FastqGeneralIterator Output

I'm using FastqGeneralIterator, but I find that it removes the @ from the 1st line of a FASTQ file and also the information on the 3rd line (it removes the entire 3rd line).
I added the @ back to the 1st line in the following way:
for line in open("prova_FiltraN_CE_filt.fastq"):
    fout.write(line.replace('SEQ', '@SEQ'))
I also want to add the 3rd line, which starts with + and has nothing after it. For example:
@SEQILMN0
TCATCGTA....
+
#<BBBFFF.....
Can someone help me?
You can use the string formatting operator %:
from Bio.SeqIO.QualityIO import FastqGeneralIterator
with open("prova_FiltraN_CE_filt.fastq") as handle:
    for (title, sequence, quality) in FastqGeneralIterator(handle):
        print("@%s\n%s\n+\n%s" % (title, sequence, quality))
This prints each record in FASTQ format:
@SEQILMN0
TCATCGTA....
+
#<BBBFFF....
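Since the question writes to a file handle rather than printing, the same formatting can be wrapped in a small helper. The sketch below is a minimal illustration that works on plain (title, sequence, quality) tuples, which is the shape FastqGeneralIterator yields; the helper name write_fastq and the sample record are made up, and an in-memory io.StringIO stands in for the real output file. FASTQ title lines start with @.

```python
import io

def write_fastq(records, handle):
    """Write (title, sequence, quality) tuples as four-line FASTQ records."""
    for title, sequence, quality in records:
        # Restore the leading @ and the bare + separator line.
        handle.write("@%s\n%s\n+\n%s\n" % (title, sequence, quality))

# Made-up sample record; real tuples would come from FastqGeneralIterator.
records = [("SEQILMN0", "TCATCGTA", "#<BBBFFF")]
out = io.StringIO()
write_fastq(records, out)
print(out.getvalue(), end="")
```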

How to make the output of Maxima cleaner?

I want to make use of Maxima as the backend to solve some computations used in my LaTeX input file.
I did the following steps.
Step 1
Download and install Maxima.
Step 2
Create a batch file named cas.bat (for example) as follows.
rem cas.bat
echo off
set PATH=%PATH%;"C:\Program Files (x86)\Maxima-5.31.2\bin"
maxima --very-quiet -r %1 > solution.tex
Save the batch in the same directory in which your input file below exists. It is just for the sake of simplicity.
Step 3
Create the input file named main.tex (for example) as follows.
% main.tex
\documentclass[preview,border=12pt,12pt]{standalone}
\usepackage{amsmath}
\def\f(#1){(#1)^2-5*(#1)+6}
\begin{document}
\section{Problem}
Evaluate $\f(x)$ for $x=\frac 1 2$.
\section{Solution}
\immediate\write18{cas "x: 1/2;tex(\f(x));"}
\input{solution}
\end{document}
Step 4
Compile the input file with pdflatex -shell-escape main and you will get a nicely typeset output.
Step 5
Done.
Questions
Apparently the output of Maxima is as follows. I don't know how to make it cleaner.
solution.tex
1
-
2
$${{15}\over{4}}$$
false
My questions are:
How can I remove the extra text?
How can I obtain just \frac{15}{4} without $$...$$?
(1) To suppress output, terminate input expressions with dollar sign (i.e. $) instead of semicolon (i.e. ;).
(2) To get just the TeX-ified expression sans the environment delimiters (i.e. $$), call tex1 instead of tex. Note that tex1 returns a string, which you have to print yourself (while tex prints it for you).
Combining these ideas with the stuff you showed, I think your program could look like this:
"x: 1/2$ print(tex1(\f(x)))$"
I think you might find the Maxima mailing list helpful. I'm pretty sure there have been several attempts to create a system such as the one you describe. You can also look at the documentation.
I couldn't find any way to completely clean up Maxima's output within Maxima itself. It always echoes the input line, and always writes some whitespace after the output. The following is an example of a perl script that accomplishes the cleanup.
#!/usr/bin/perl
use strict;
my $var = $ARGV[0];
my $expr = $ARGV[1];
sub do_maxima_to_tex {
  my $m = shift;
  my $c = "maxima --batch-string='exptdispflag:false; print(tex1($m))\$'";
  my $e = `$c`;
  my @x = split(/\(%i\d+\)/,$e); # output contains stuff like (%i1)
  my $f = pop @x; # remove everything before the echo of the last input
  while ($f=~/\A /) {$f=~s/\A .*\n//} # remove echo of input, which may be more than one line
  $f =~ s/\\\n//g; # maxima breaks latex tokens in the middle at end of line; fix this
  $f =~ s/\n/ /g; # if multiple lines, get it into one line
  $f =~ s/\s+\Z//; # get rid of final whitespace
  return $f;
}
my $e1 = do_maxima_to_tex("diff($expr,$var,1)");
my $e2 = do_maxima_to_tex("diff($expr,$var,2)");
print <<TEX;
The first derivative is \$$e1\$. Differentiating a second time,
we get \$$e2\$.
TEX
If you name this script a.pl, then doing
a.pl z 3*z^4
outputs this:
The first derivative is $12\,z^3$. Differentiating a second time,
we get $36\,z^2$.
For the OP's application, a script like this one could be what is invoked by the write18 in the latex file.
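A simpler kind of post-processing can be sketched in Python for output that has already been captured, such as the solution.tex content shown in the question. This is only an illustration under that assumption: it does not call Maxima, the function name extract_tex is hypothetical, and the sample string reproduces the question's output without its original spacing.

```python
import re

def extract_tex(maxima_output):
    """Return the text between the first $$ ... $$ pair, or None if absent."""
    match = re.search(r"\$\$(.*?)\$\$", maxima_output, flags=re.DOTALL)
    return match.group(1).strip() if match else None

# Captured output as shown in the question (echoed value, tex() result, "false").
sample = "1\n-\n2\n$${{15}\\over{4}}$$\nfalse\n"
print(extract_tex(sample))
```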
If you really want to use LaTeX then the maxiplot package is the answer. It provides a maxima environment inside of which you enter Maxima commands. When you process your LaTeX file a Maxima batch file is generated. Process this file with Maxima and process your LaTeX file again to typeset the equations generated by Maxima.
If you would rather have 2D math input with live typesetting then use TeXmacs. It is a cross-platform document authoring environment (a word processor on steroids if you like) that includes plugins for Maxima, Mathematica and many more scientific computing tools. If you need to or are not satisfied with the typesetting, you can export your document to LaTeX.
This is an old post with excellent answers, but note that the --very-quiet and -r command-line options, which I had used for a long time just as the OP does, behave differently in Maxima 5.43.2. See "maxima command line v5.43 is behaving differently than v5.41". If you incorporate these answers into your own solution, account for the changed behavior of those flags.

What is >> for after print command in python 2?

import cStringIO
output = cStringIO.StringIO()
output.write('First line.\n')
print >>output, 'Second line.'
# Retrieve file contents -- this will be
# 'First line.\nSecond line.\n'
contents = output.getvalue()
What does >>output in the print statement do?
It redirects the print statement output to an open file-like object. See the print statement documentation:
print also has an extended form, defined by the second portion of the syntax described above. This form is sometimes referred to as “print chevron.” In this form, the first expression after the >> must evaluate to a “file-like” object, specifically an object that has a write() method as described above. With this extended form, the subsequent expressions are printed to this file object. If the first expression evaluates to None, then sys.stdout is used as the file for output.
Essentially, the line is translated to output.write('Second line.' + '\n'), as print adds a newline to its output unless the statement ends with a comma.
The syntax is based on the bash append >> syntax (which also influenced C++ << and >> I/O operators); see PEP 214 for a full motivation for why this was chosen.
In Python 3, where print() is a function, you'd write:
print('Second line.', file=output)
instead.
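A minimal Python 3 version of the snippet from the question might look like this. It assumes io.StringIO as the stand-in for cStringIO, which does not exist in Python 3.

```python
import io

output = io.StringIO()
output.write('First line.\n')
# Python 3 form of: print >>output, 'Second line.'
print('Second line.', file=output)

# Retrieve the buffer contents written by both calls.
contents = output.getvalue()
print(repr(contents))
```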

Resources