LaTeX - Define a list of words to use a certain font - latex

Problem
I'm writing an essay/documentation about an application. Within this doc there are a lot of code-words which I want to be highlighted using a different font. Currently I work with:
{\fontfamily{cmtt}\selectfont SOME-KEY-WORD}
Which is a bit of work to use every time.
Question
I'm looking for a way to declare a list of words to use a specific font within the text.
I know that I can use the listings package and define morekeywords which will be highlighted within the listings-environment but I need it in the text.
I thought of something like this:
\defineList{\fontfamily{cmtt}}{
SOME-KEY-WORD-1,
SOME-Key-word-2,
...
}
EDIT
I forgot to mention that I already tried something like:
\def\somekeyword{\fontfamily{cmtt}\selectfont some\_key\_word\normalfont}
which is a little bit better then the first attempt but I still need to use \somekeyword in the text.
EDIT 2
I came upon a workaround:
\newcommand{\cmtt}[1]{{\fontfamily{cmtt}\selectfont #1\normalfont}}
It's a little better then EDIT but still not the perfect solution.

Substitution every time a word occurs, without providing any clues to TeX, might be difficult and is beyond my skills (though I'd be interested to see someone come up with a solution).
But why not simply create a macro for each of those words?
\newcommand\somekeyword{\fontfamily{cmtt}\selectfont SOME-KEY-WORD}
Use like this:
Hello, \somekeyword{} is the magic word!
The trailing {} are unfortunately necessary to prevent eating the subsequent whitespace; even the built-in \LaTeX command requires them.
If you have very many of these words and are worried about maintainability, you can even create a macro to create the macros:
\newcommand\declareword[2][]{%
\expandafter\newcommand%
\csname\if\relax#1\relax#2\else#1\fi\endcsname%
{{\fontfamily{cmtt}\selectfont #2}}%
}
\declareword{oneword} % defines \oneword
\declareword{otherword} % defines \otherword
\declareword[urlspy]{urls.py} % defines \urlspy
...
The optional argument indicates the name of the command, in case the word itself contains characters like . which cannot be used in the name of a command.

Related

Latex - Apply an operation to every character in a string

I am using LaTeX and I have a problem concerning string manipulation.
I want to have an operation applied to every character of a string, specifically
I want to replace every character "x" with "\discretionary{}{}{}x". I want to do
this because I have a long string (DNA) which I want to be able to separate at
any point without hyphenation.
Thus I would like to have a command called "myDNA" that will do this for me instead of
inserting manually \discretionary{}{}{} after every character.
Is this possible? I have looked around the web and there wasnt much helpful
information on this topic (at least not any I could understand) and I hoped
that you could help.
--edit
To clarify:
What I want to see in the finished document is something like this:
the dna sequence is CTAAAGAAAACAGGACGATTAGATGAGCTTGAGAAAGCCATCACCACTCA
AATACTAAATGTGTTACCATACCAAGCACTTGCTCTGAAATTTGGGGACTGAGTACACCAAATACGATAG
ATCAGTGGGATACAACAGGCCTTTACAGCTTCTCTGAACAAACCAGGTCTCTTGATGGTCGTCTCCAGGT
ATCCCATCGAAAAGGATTGCCACATGTTATATATTGCCGATTATGGCGCTGGCCTGATCTTCACAGTCAT
CATGAACTCAAGGCAATTGAAAACTGCGAATATGCTTTTAATCTTAAAAAGGATGAAGTATGTGTAAACC
CTTACCACTATCAGAGAGTTGAGACACCAGTTTTGCCTCCAGTATTAGTGCCCCGACACACCGAGATCCT
AACAGAACTTCCGCCTCTGGATGACTATACTCACTCCATTCCAGAAAACACTAACTTCCCAGCAGGAATT
just plain linebreaks, without any hyphens. The DNA sequence will be one
long string without any spaces or anything but it can break at any point.
This is why my idea was to inesert a "\discretionary{}{}{}" after every
character, so that it can break at any point without inserting any hyphens.
This takes a string as an argument and calls \discretionary{}{}{} after each character. The input string stops at the first dollar sign, so you should not use that.
\def\hyphenateWholeString #1{\xHyphenate#1$\wholeString}
\def\xHyphenate#1#2\wholeString {\if#1$%
\else\say{#1}\discretionary{}{}{}%
\takeTheRest#2\ofTheString
\fi}
\def\takeTheRest#1\ofTheString\fi
{\fi \xHyphenate#1\wholeString}
\def\say#1{#1}
You’d call it like \hyphenateWholeString{CTAAAGAAAACAGGACG}.
Instead of \discretionary{}{}{} you can also try \hspace{0pt}, if you like that more (and are in a latex environment). In order to align the right margin, I think you’d need to do some more fine tuning (but see below). The effect is of course minimised by using a font of fixed width.
Revision:
\def\hyphenateWholeString #1{\xHyphenate#1$\wholeString\unskip}
\def\xHyphenate#1#2\wholeString {\if#1$%
\else\transform{#1}%
\takeTheRest#2\ofTheString\fi}
\def\takeTheRest#1\ofTheString\fi
{\fi \xHyphenate#1\wholeString}
\def\transform#1{#1\hskip 0pt plus 1pt}
Steve’s suggestion of using \hskip sounds like a very good idea to me, so I made a few corrections. Note that I’ve renamed the \say macro and made it more useful in that it now actually does the transformation. (However, if you remove the \hskip from \transform, you’ll also need to remove the \unskip in the main macro definition.
Edit:
There is also the seqsplit package which seems to be made for printing DNA data or long numbers. They also bring a few options for nicer output, so maybe that is what you’re looking for…
Debilski's post is definitely a solid way to do it, although the \say is not necessary. Here's a shorter way that makes use of some LaTeX internal shortcuts (\#gobble and \#ifnextchar):
\makeatletter
\def\hyphenatestring#1{\xHyphen#te#1$\unskip}
\def\xHyphen#te{\#ifnextchar${\#gobble}{\sw#p{\hskip 0pt plus 1pt\xHyphen#te}}}
\def\sw#p#1#2{#2#1}
\makeatother
Note the use of \hskip 0pt plus 1pt instead of \discretionary - when I tried your example I ended up with a ragged margin because there's no stretchability. The \hskip adds some stretchable glue in between each character (and the \unskip afterwards cancels the extra one we added). Also note the LaTeX style convention that "end user" macros are all lowercase, while internal macros have an # in them somewhere so that users don't accidentally call them.
If you want to figure out how this works, \#gobble just eats whatever's in front of it (in this case the $, since that branch is only run when a $ is the next char). The main point is that \sw#p is only given one argument in the "else" branch, so it swaps that argument with the next char (that isn't a $). We could just as well have written \def\hyphenate#next#1{#1\hskip...\xHyphen#te} and put that with no args in the "else" branch, but (in my opinion) \sw#p is more general (and I'm surprised it's not in standard LaTeX already).
There is a contrib package on CTAN that deals with typesetting DNA sequences. It does a little more than just line-breaking, for example, it also supports colouring. I'm not sure if it is possible to get the output you are after though, and I have no experience in the DNA-sequence-typesetting area, but is one long string the most readable representation?
Assuming your string is the same, in your preamble, use the \newcommand{}{}. Like this:
\newcommand{\myDNA}{blah blah blah}
if that doesn't satisfy your requirements, I suggest:
2. Break the strings down to the smallest portion, then use the \newcommand and then call the new commands in sequence: \myDNA1 \myDNA2.
If that still doesn't work, you might want to look at writing a perl script to satisfy your string replacement needs.

Tex command which affects the next complete word

Is it possible to have a TeX command which will take the whole next word (or the next letters up to but not including the next punctuation symbol) as an argument and not only the next letter or {} group?
I’d like to have a \caps command on certain acronyms but don’t want to type curly brackets over and over.
First of all create your command, for example
\def\capsimpl#1{{\sc #1}}% Your main macro
The solution to catch a space or punctuation:
\catcode`\#=11
\def\addtopunct#1{\expandafter\let\csname punct#\meaning#1\endcsname\let}
\addtopunct{ }
\addtopunct{.} \addtopunct{,} \addtopunct{?}
\addtopunct{!} \addtopunct{;} \addtopunct{:}
\newtoks\capsarg
\def\caps{\capsarg{}\futurelet\punctlet\capsx}
\def\capsx{\expandafter\ifx\csname punct#\meaning\punctlet\endcsname\let
\expandafter\capsend
\else \expandafter\continuecaps\fi}
\def\capsend{\expandafter\capsimpl\expandafter{\the\capsarg}}
\def\continuecaps#1{\capsarg=\expandafter{\the\capsarg#1}\futurelet\punctlet\capsx}
\catcode`\#=12
#Debilski - I wrote something similar to your active * code for the acronyms in my thesis. I activated < and then \def<#1> to print the acronym, as well as the expansion if it's the first time it's encountered. I also went a bit off the deep end by allowing defining the expansions in-line and using the .aux files to send the expansions "back in time" if they're used before they're declared, or to report errors if an acronym is never declared.
Overall, it seemed like it would be a good idea at the time - I rarely needed < to be catcode 12 in my actual text (since all my macros were in a separate .sty file), and I made it behave in math mode, so I couldn't foresee any difficulties. But boy was it brittle... I don't know how many times I accidentally broke my build by changing something seemingly unrelated. So all that to say, be very careful activating characters that are even remotely commonly-used.
On the other hand, with XeTeX and higher unicode characters, it's probably a lot safer, and there are generally easy ways to type these extra characters, such as making a multi (or compose) key (I usually map either numlock or one of the windows keys to this), so that e.g. multi-!-! produces ¡). Or if you're running in emacs, you can use C-\ to switch into TeX input mode briefly to insert unicode by typing the TeX command for it (though this is a pain for actually typing TeX documents, since it intercepts your actual \'s, and please please don't try defining your own escape character!)
Regarding whitespace after commands: see package xspace, and TeX FAQ item Commands gobble following space.
Now why this is very difficult: as you noted yourself, things like that can only be done by changing catcodes, it seems. Catcodes are assigned to characters when TeX reads them, and TeX reads one line at a time, so you can not do anything with other spaces on the same line, IMHO. There might be a way around this, but I do not see it.
Dangerous code below!
This code will do what you want only at the end of the line, so if what you want is more "fluent" typing without brackets, but you are willing to hit 'return' after each acronym (and not run any auto-indent later), you can use this:
\def\caps{\begingroup\catcode`^^20 =11\mcaps}
\def\mcaps#1{\def\next##1 {\sc #1##1\catcode`^^20 =10\endgroup\ }\next}
One solution might be setting another character as active and using this one for escaping. This does not remove the need for a closing character but avoids typing the \caps macro, thus making it overall easier to type.
Therefore under very special circumstances, the following works.
\catcode`\*=\active
\def*#1*{\textsc{\MakeTextLowercase{#1}}}
Now follows an *Acronym*.
Unfortunately, this makes uses of \section*{} impossible without additional macro definitions.
In Xetex, it seems to be possible to exploit unicode characters for this, so one could define
\catcode`\•=\active
\def•#1•{\textsc{\MakeTextLowercase{#1}}}
Now follows an •Acronym•.
Which should reduce the effects on other commands but of course needs to have the character ‘•’ mapped to the keyboard somewhere to be of use.

Define every symbol as a command in LaTeX

I'm working on a large project involving multiple documents typeset in LaTeX. I want to be consistent in my use of symbols, so it might be a nice idea to define a command for every symbol that has a specific meaning throughout the project. Does anyone have any experience with this? Are there issues I should pay attention to?
A little more specific. Say that, throughout the document I would denote something called permability by a script P, would it be an idea to define
\providecommand{\permeability}{\mathscr{P}}
or would this be more like the case "defining a command for $n$"?
A few tips:
Using \providecommand will define that command only if it's not been previously defined. So if you're not getting the results you expected, you may be trying to define a command that's been defined elsewhere.
If you wrap the math in your commands with \ensuremath, it will do the right thing regardless of whether you're in math mode when you issue the command:
\providecommand{\permeability}{\ensuremath{\mathscr{P}}}
Now I can easily use \permeability in text or $\permeability$ in math mode.
Using your own commands allows you to easily change the typographical representation of something later. For instance:
\newcommand{\vect}[1]{\ensuremath{\mathbf{#1}}}
would print \vect{x} as a boldfaced x. If you later decide you prefer arrows above your vectors, you could change the command to:
\newcommand{\vect}[1]{\ensuremath{\vec{#1}}}
I have been doing this for anything that has a specific meaning and is longer than a single symbol, mostly to save typing:
\newcommand{\objId}{\mbox{$\mathit{objId}$}\xspace}
\newcommand{\insOp}[1]{#1\mbox{$^+$}\xspace}
\newcommand{\delOp}[1]{#1\mbox{$^-$}\xspace}
However then I noticed that I stopped making inconsistency errors (objId vs ObjId vs ObjID), so I agree that it is a nice idea.
However I am not sure if it is a good idea in case symbols in the output are, well, single Latin symbols, as in:
\newcommand{\numOfObjs}{$n$}
It is too easy to type a single symbol and forget about it even though a command was defined for it.
EDIT: using your example IMHO it'd be a good idea to define \permeability because it is more than a single P that you have to type in without the command. But it's a close call.

Replace strings in LaTeX

I want LaTeX to automatically replace strings like " a ", " s ", " z " with " a~", " s~", " z~", because they can't be at line end. Any suggestions?
For Czech typographic rules, there is a preprocessor called Vlna" by Petr Olšák - download . The set of (usually prepositions in czech) is customizable - so it might be usable for other languages as well.
You can use \StrSubstitute from xstring package.
e.g.
\StrSubstitute{change ME}{ ME}{d}
will convert change ME into changed.
Although, nesting is not possible, so to make another substitution you must use an intermediate variable in this way
\StrSubstitute{change ME}{ ME}{d}[\mystring]
\StrSubstitute{\mystring}{ed}{ing}
Finally, your solution would be
\usepackage{xstring}
\def\mystring{...source string here...}
\begin{document}
\StrSubstitute{\mystring}{ a }{a~{}}[\mystring]
\StrSubstitute{\mystring}{ s }{s~{}}[\mystring]
\StrSubstitute{\mystring}{ z }{z~{}}[\mystring]
\mystring
\end{document}
Note the use of the empty string {} to avoid the sequence ~}.
I'm afraid (to the best of my knowledge) this is basically impossible with LaTeX. A LuaTeX-based solution might be possible, though.
It's not actually clear to me, however, that " a ", for example, shouldn't appear at the end of a
line. Although I might be used to different typographic rules.
(Is there anything wrong with the line break in the last paragraph? :))
As far as I know there is no way to do this in LaTeX itself. I'd go for automating this with some external tools, as my typical setup involves a Makefile handling the LaTeX run by itself. This makes it rather easy to run tools like sed on the sources and do some replacements using regular expressions, and a simple rule would do this for your case.
If you use some LaTeX editor that does everything for you you should check the editors regular expression search and replace functionality.
Yes, this is the age old argument of data processing vs. data composition. We have always done these things in a pre-processor environment responsible for extracting the information from its source environment, SQL or plain-text, and created the contents of a \input(file.tex).
But yes, it is possible (TeX is after all a programming language) but you will have to become a wizard. Get the 4 volume set TeX in Practice by Stephan von Bechtolsheim.
The approach would be to begin an environment (execute a macro) whose ''argument'' was all text down to the end of the environment. Then just munge though the tokens fixing the ones you want.
Still, I don't think any of us are advocating this approach.
If you are using TeXmaker to write your LaTeX file, then you may click on the Edit button on the toolbar, then click on Replace.
A dialogue box will come up, and you can enter your strings one after the other.
You put the strings to be changed in the Find text input and what you want it to be changed to in the Replace text input.
You can also specify where you want the replacement to start from.
Click Find and Replace (or similiar) in the menu of your text editor and do it.

How do I emit the text content of a reference in LaTeX?

I have a section:
\section{Introduction} \label{sec:introduction}
I'd like a link to the section where the link text is the name of the section. I can use hyperref:
The \hyperrf[sec:introduction]{Introduction} introduces the paper.
But that requires repeating the section title ("Introduction"). Is there a way to grab that? ref yields the section number, which isn't right. autoref yields "section " and then the section number, which isn't right, either.
There are a couple of packages that provide this for you. nameref is distributed as part of hyperref to do this:
http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=nameref
There is a more general package for cross-referencing basically anything, called zref:
http://tug.ctan.org/cgi-bin/ctanPackageInformation.py?id=zref
It's by the same author as hyperref, Heiko Oberdiek; it's the one that I would choose. Here's an example:
\documentclass[oneside,12pt]{article}
\usepackage[user,titleref]{zref}
\begin{document}
\section{Introduction of sorts.}\zlabel{sec:intro}
Hello
\subsection{Structure}
We begin in `\ztitleref{sec:intro}'.
\end{document}
Note that it even removes the trailing period in the section title.
As far as I know, there's no standard way to do this. Simply put, the sectioning commands don't store the names of the sections anywhere they can be easily retrieved. Yes, they're inserted into the Table of Contents (and associated auxiliary file) and marks are set, but access to those is unreliable at best and usually impossible without additional context, which is almost always unavailable by the time you need to refer back to the section.
The code sample you posted looks like what I would write. There might be a package to automate this, but if one exists it's probably pretty hairy code since this is really not a particularly common use case. Actually, to go all grammar nazi on you the final text you're creating is incorrect; the word "introduction" should be lowercase inside the sentence, and this can't be achieved (in general) with backreferences to the actual section titles.
I'd just suck it up and write out references like this manually. There won't be enough of them to justify automation. Of course, if you're doing something more involved than your example suggests (many auto-generated sections or something) things might be different, but if that's the case it's really a different question entirely.
You could try using
\newsavebox
\savebox
\usebox
which won't save you any typeing but will give you a single authoritative source for each title
And you might search ctan.org, I suspect this has been done already.

Resources