I frequently use the vi command
:set number
I recently was trying to align some data that had a zero based index with the line numbers in vim, which has a 1 based index. Is there a way to have vi start numbering lines from 0, or more broadly from any other starting point?
#David's comment led me on a bit of a goose chase eventually finding this link: Add line numbers to source code in vim
For my purposes what I needed was not necessarily a persistent numbering, but just a temporary one. Quickly insert/remove line numbers and then resort to the standard numbering. For this case, I hadn't thought of simply inserting then deleting my own line numbering. The below works well. And the possibilities are endless for numbering (see for example https://stackoverflow.com/a/252774/2816571)
%!awk '{print NR,$0}'
For a native vi solution to adding numbers then (from https://stackoverflow.com/a/253041/2816571) the below will insert line numbers then the space character:
:%s/^/\=line('.').' '/
Then (thanks Lance) https://stackoverflow.com/a/4674574/2816571 when done with the numbering, do something like this, but you might need to tweak the search pattern depending if you put a space, comma separator
:%s/^[0-9]* //
If there is something I am missing inside vim that can do the persistent 0 based numbering, please let me know.
Related
I am currently playing around with parsing diff files, and have yet to come across a solid documentation on diff files.
I am especially interested in specifications. E.g. I don't really understand the lines that look like this (at the beginning of each changed code block):
## -296,7 +296,8 ##
I know they have to do with line numbers, and how much lines have changed, but I wasn't really able to figure out the details so far.
What is the syntax of the output diff files (at least, the main parts)?
Check out the documentation for GNU diffutils. There you'll find this section:
Next come one or more hunks of differences; each hunk shows one area where the files differ. Unified format hunks look like this:
## from-file-line-numbers to-file-line-numbers ##
line-from-either-file
line-from-either-file...
If a hunk contains just one line, only its start line number appears. Otherwise its line numbers look like ‘start,count’. An empty hunk is considered to start at the line that follows the hunk.
If a hunk and its context contain two or more lines, its line numbers look like ‘start,count’. Otherwise only its end line number appears. An empty hunk is considered to end at the line that precedes the hunk.
The lines common to both files begin with a space character. The lines that actually differ between the two files have one of the following indicator characters in the left print column:
‘+’
A line was added here to the first file.
‘-’
A line was removed here from the first file.
The Wikipedia page on the diff utility describes the format pretty well.
I know one of LaTeX's bragging points is that it doesn't have this Microsoftish behavior. Nevertheless, it's sometimes useful.
LaTeX already adds an extra space after you type a (non-backslashed) period, so it should be possible to make it automatically capitalize the following letter as well.
Is there an obvious way to write a macro that does this, or is there a LaTeX package that does it already?
The following code solves the problem.
\let\period.
\catcode`\.\active
\def\uppercasesingleletter#1{\uppercase{#1}}
\def.{\period\afterassignment\periodx\let\next= }
\def \periodx{\ifcat\space\next \next\expandafter\uppercasesingleletter \else\expandafter\next\fi}
First. second.third. relax.relax. up
\let\period. save period
\catcode\.\active make all periods to be active symbol (like macro).
\def\uppercasesingleletter#1{\uppercase{#1}} defines macro \uppercasesingleletter to make automatically capitalize the following letter.
\def.{\period\afterassignment\periodx\let\next= } writes saved period and checkes the next symbol.
\def \periodx{\ifcat\space\next \next\expandafter\uppercasesingleletter \else\expandafter\next\fi} If the next letter is a space then \uppercasesingleletter is inserted.
ages ago there was discussion of this idea on comp.text.tex, and the general conclusion was you can't do it satisfactorily. satisfactory, in my book, involves not making characters active, but i can't see how that could work at all.
personally, i would want to make space active, and have it then look at \spacefactor and \MakeUppercase the following character if the factor is 3000.
something like
\catcode\ \active % latex already has a saved space character -- \space
\def {\ifhmode% \spacefactor is invalid
% (or something) in vertical mode
\ifnum\spacefactor<3000\else% note: with space active,
% even cs-ended lines need %-termination
\expandafter\gobbleandupper\fi}%
\def\gobbleandupper#1{\def\tempa{#1}\def\tempb{ }%
\ifx\tempa\tempb% can''t indent the code, either :-(
% here, we have another space
\expandafter\gobbleandupper% try again
\else\space% insert a "real" space to soak up the
% space factor
\expandafter\MakeUppercase\fi}%
this doesn't really do the job -- there are enough loose ends to knit a fairisle jumper. for example, given that we can't rely on \everypar in latex, how do you uppercase the first letter of a paragraph?
no ... however much it hurts (which is why i avoid unnecessary key operations) we need to type latex "properly" :-(
I decided to solve it in the following way:
Since I always compile the LaTeX code three times before i okular the result (to get pagination and references right), I decided to build the capitalization of sentences into that process.
Thus, I now have a shell script that calls my capitalization script (written in CRM114) first, then pdflatex three times, and then okular. This way, all the stuff happens as the result of a single command.
I am using LaTeX and I have a problem concerning string manipulation.
I want to have an operation applied to every character of a string, specifically
I want to replace every character "x" with "\discretionary{}{}{}x". I want to do
this because I have a long string (DNA) which I want to be able to separate at
any point without hyphenation.
Thus I would like to have a command called "myDNA" that will do this for me instead of
inserting manually \discretionary{}{}{} after every character.
Is this possible? I have looked around the web and there wasnt much helpful
information on this topic (at least not any I could understand) and I hoped
that you could help.
--edit
To clarify:
What I want to see in the finished document is something like this:
the dna sequence is CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCA
AATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG
ATCAGTGGGATACAACAGGCCTTTACAGCTTCTCTGAACAAACCAGGTCTCTTGATGGTCGTCTCCAGGT
ATCCCATCGAAAAGGATTGCCACATGTTATATATTGCCGATTATGGCGCTGGCCTGATCTTCACAGTCAT
CATGAACTCAAGGCAATTGAAAACTGCGAATATGCTTTTAATCTTAAAAAGGATGAAGTATGTGTAAACC
CTTACCACTATCAGAGAGTTGAGACACCAGTTTTGCCTCCAGTATTAGTGCCCCGACACACCGAGATCCT
AACAGAACTTCCGCCTCTGGATGACTATACTCACTCCATTCCAGAAAACACTAACTTCCCAGCAGGAATT
just plain linebreaks, without any hyphens. The DNA sequence will be one
long string without any spaces or anything but it can break at any point.
This is why my idea was to inesert a "\discretionary{}{}{}" after every
character, so that it can break at any point without inserting any hyphens.
This takes a string as an argument and calls \discretionary{}{}{} after each character. The input string stops at the first dollar sign, so you should not use that.
\def\hyphenateWholeString #1{\xHyphenate#1$\wholeString}
\def\xHyphenate#1#2\wholeString {\if#1$%
\else\say{#1}\discretionary{}{}{}%
\takeTheRest#2\ofTheString
\fi}
\def\takeTheRest#1\ofTheString\fi
{\fi \xHyphenate#1\wholeString}
\def\say#1{#1}
You’d call it like \hyphenateWholeString{CTAAAGAAAACAGGACG}.
Instead of \discretionary{}{}{} you can also try \hspace{0pt}, if you like that more (and are in a latex environment). In order to align the right margin, I think you’d need to do some more fine tuning (but see below). The effect is of course minimised by using a font of fixed width.
Revision:
\def\hyphenateWholeString #1{\xHyphenate#1$\wholeString\unskip}
\def\xHyphenate#1#2\wholeString {\if#1$%
\else\transform{#1}%
\takeTheRest#2\ofTheString\fi}
\def\takeTheRest#1\ofTheString\fi
{\fi \xHyphenate#1\wholeString}
\def\transform#1{#1\hskip 0pt plus 1pt}
Steve’s suggestion of using \hskip sounds like a very good idea to me, so I made a few corrections. Note that I’ve renamed the \say macro and made it more useful in that it now actually does the transformation. (However, if you remove the \hskip from \transform, you’ll also need to remove the \unskip in the main macro definition.
Edit:
There is also the seqsplit package which seems to be made for printing DNA data or long numbers. They also bring a few options for nicer output, so maybe that is what you’re looking for…
Debilski's post is definitely a solid way to do it, although the \say is not necessary. Here's a shorter way that makes use of some LaTeX internal shortcuts (\#gobble and \#ifnextchar):
\makeatletter
\def\hyphenatestring#1{\xHyphen#te#1$\unskip}
\def\xHyphen#te{\#ifnextchar${\#gobble}{\sw#p{\hskip 0pt plus 1pt\xHyphen#te}}}
\def\sw#p#1#2{#2#1}
\makeatother
Note the use of \hskip 0pt plus 1pt instead of \discretionary - when I tried your example I ended up with a ragged margin because there's no stretchability. The \hskip adds some stretchable glue in between each character (and the \unskip afterwards cancels the extra one we added). Also note the LaTeX style convention that "end user" macros are all lowercase, while internal macros have an # in them somewhere so that users don't accidentally call them.
If you want to figure out how this works, \#gobble just eats whatever's in front of it (in this case the $, since that branch is only run when a $ is the next char). The main point is that \sw#p is only given one argument in the "else" branch, so it swaps that argument with the next char (that isn't a $). We could just as well have written \def\hyphenate#next#1{#1\hskip...\xHyphen#te} and put that with no args in the "else" branch, but (in my opinion) \sw#p is more general (and I'm surprised it's not in standard LaTeX already).
There is a contrib package on CTAN that deals with typesetting DNA sequences. It does a little more than just line-breaking, for example, it also supports colouring. I'm not sure if it is possible to get the output you are after though, and I have no experience in the DNA-sequence-typesetting area, but is one long string the most readable representation?
Assuming your string is the same, in your preamble, use the \newcommand{}{}. Like this:
\newcommand{\myDNA}{blah blah blah}
if that doesn't satisfy your requirements, I suggest:
2. Break the strings down to the smallest portion, then use the \newcommand and then call the new commands in sequence: \myDNA1 \myDNA2.
If that still doesn't work, you might want to look at writing a perl script to satisfy your string replacement needs.
LaTeX tries to guess whether a period ends a sentence, in which case it puts extra space after it.
Here are two examples where it guesses wrong:
I watched Superman III. Then I went home.
(Too little space after "Superman III.".)
After brushing teeth etc. I went to bed.
(Too much space after "etc.".)
Note that it doesn't matter how much whitespace you use in the LaTeX source since LaTeX ignores that.
I found the answer here: http://john.regehr.org/latex/. Excerpt:
When a non-sentence-ending period is to be followed by a space, the space must be an explicit blank.
So the second example should be:
After brushing teeth etc.\ I went to bed.
The converse of this problem happens when a capital letter precedes a sentence-ending period in the input, as in the first example.
In this case LaTeX assumes that the period terminates an abbreviation and follows it with inter-word space rather than inter-sentence space.
The fix is to put "\#" before the period.
So the first example should be
I watched Superman III\#. Then I went home.
A handy way to find this error is:
grep '[A-Z]\.' *.tex
You can sidestep the spacing issue if you prefer single spaces at the end of sentences: put \frenchspacing on (for older versions of Latex this was a fragile command). Knuth was following the traditional naming in calling it French spacing, although calling double spacing after sentences French spacing has become dominant in publishing.
Dirk Margulis wrote a nice post summarising some of the reasons for the prevalance of single spacing: Space between sentences.
I like the answer from dreeves and the handy search he suggests too. I don't have the Stackoverflow "rep" points to comment, but...
Since lines in raw *.tex tend to be very long, the output from grep can be overwhelming (i.e., entire paragraphs); I suggest a variation to only display the words ending in '[A-Z].' (followed by one or more space, followed by a new capitalized word). It is,
grep -o -E '[A-Z]+\. +[A-Z]+[A-Za-z]+' *.tex
How can I prevent LaTeX from inserting linebreaks in my \texttt{...} or \url{...} text regions? There's no spaces inside I can replace with ~, it's just breaking on symbols.
Update: I don't want to cause line overflows, I'd just rather LaTeX insert linebreaks before these regions rather than inside them.
\mbox is the simplest answer. Regarding the update:
TeX prefers overlong lines to adding too much space between words on a line; I think the idea is that you will notice the lines that extend into the margin (and the black boxes it inserts after such lines), and will have a chance to revise the contents, whereas if there was too much space, you might not notice it.
Use \sloppy or \begin{sloppypar}...\end{sloppypar} to adjust this behavior, at least a little. Another possibility is \raggedright (or \begin{raggedright}...\end{raggedright}).
Surround it with an \mbox{}
Also, if you have two subsequent words in regular text and you want to avoid a line break between them, you can use the ~ character.
For example:
As we can see in Fig.~\ref{BlaBla}, there is nothing interesting to see. A~better place..
This can ensure that you don't have a line starting with a figure number (without the Fig. part) or with an uppercase A.
Use \nolinebreak
\nolinebreak[number]
The \nolinebreak command prevents LaTeX from
breaking the current line at the point of the command. With the
optional argument, number, you can convert the \nolinebreak command
from a demand to a request. The number must be a number from 0 to 4.
The higher the number, the more insistent the request is.
Source: http://www.personal.ceu.hu/tex/breaking.htm#nolinebreak
Define myurl command:
\def\myurl{\hfil\penalty 100 \hfilneg \hbox}
I don't want to cause line overflows,
I'd just rather LaTeX insert linebreaks before
\myurl{\tt http://stackoverflow.com/questions/1012799/}
regions rather than inside them.