LaTeX/Plain TeX
While you play with LaTeX macros, you will notice that they are quite limited. You may wonder how all the packages you use every day have been implemented with so little. In fact, LaTeX is a set of Plain TeX macros and most packages use Plain TeX code. Plain TeX is much lower-level: it has many more capabilities, at the cost of a steep learning curve and complex programming.
With a few exceptions, you can use the full Plain TeX language within a valid LaTeX document, whereas the opposite is false.
Vocabulary
To avoid confusion it seems necessary to explain some terms.
- A group is everything after an opening brace and before the matching closing brace.
- A token is a character (together with its category code) or a control sequence.
- A control sequence is anything that begins with a \. It is not printed as is; it is expanded or executed by the TeX engine according to its type.
- A command (or function or macro) is a control sequence that may expand to text, to (re)definition of control sequences, etc.
- A primitive is a command that is hard coded in the TeX engine, i.e. it is not written in Plain TeX.
- A register is the TeX way to handle variables. They are limited in number (256 for each type of register in classic TeX, 32768 in e-TeX).
- A length is a control sequence that contains a length (a number followed by a unit). See Lengths.
- A font is a control sequence that refers to a font file. See Fonts.
- A box is an object that is made for printing. Anything that ends on the paper is a box: letters, paragraphs, pages... See Boxes.
- A glue is a certain amount of space that is put between boxes when they are being concatenated.
- A counter is a register containing a number. See Counters.
There may be more terms, but we hope that this will do for now.
Catcodes
In TeX some characters have a special meaning and do not simply print the associated glyph. For example, \ is used to introduce a control sequence, and will not print a backslash by default.
To distinguish between different meanings of the characters, TeX splits them into category codes, or catcodes for short. There are 16 category codes in TeX.
A powerful feature of TeX is its ability to redefine the language itself: the \catcode primitive lets you change the category code of any character.
However, this is not recommended, as it can make code difficult to read. Should you redefine any catcode in a class or in a style file, make sure to revert it back at the end of your file.
If you redefine catcodes in your document, make sure to do it after the preamble to prevent clashes with package loading.
Code | Description | Default set |
---|---|---|
0 | Escape character and control sequences | \ |
1 | Beginning of group | { |
2 | End of group | } |
3 | Math shift | $ |
4 | Alignment tab | & |
5 | End of line | ^^M (ASCII return) |
6 | Macro parameter | # |
7 | Superscript | ^ and ^^K |
8 | Subscript | _ and ^^A |
9 | Ignored character | ^^@ (ASCII null) |
10 | Space | ␣ and ^^I (ASCII horizontal tab) |
11 | Letter | A...Z and a...z |
12 | Other character | everything not listed in the other catcodes. Most notably, @. |
13 | Active character | ~ and ^^L (ASCII form feed) |
14 | Comment character | % |
15 | Invalid character | ^^? (ASCII delete) |
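The advice about reverting catcode changes can be sketched as follows. This is only an illustration, assuming the file is compiled with plain `tex`: it prints the catcode of '@', changes it inside a group so that the change is undone automatically when the group ends, and prints it again.

```tex
% Plain TeX sketch: a catcode change confined to a group.
Catcode of @ before: \the\catcode`\@.

\begingroup
  \catcode`\@=11 % '@' is a letter inside this group only
  Catcode of @ inside the group: \the\catcode`\@.
\endgroup

Catcode of @ after the group: \the\catcode`\@.
\bye
```

Using a group avoids having to remember the previous value in order to restore it by hand.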
Active characters
Active characters resemble macros: they are single characters that can be given a definition and are then expanded like a control sequence.
\catcode`| = 13
\def|{\TeX}
...
This is a stupid example of |.
Result: This is a stupid example of TeX.
Note that an active character must be given a definition before it is used, otherwise the compilation will fail.
Examples
- Texinfo
Texinfo uses a syntax similar to TeX with one major difference: all functions are introduced with a @ instead of a \. This is not by chance: it actually uses TeX to print the PDF version of the files.
Basically, it inputs texinfo.tex, which redefines the escape character. Possible implementation:
\catcode`\@=0
@def@@{@char64} % To write '@' character.
\catcode`\\=13 @def\{{@tt @char92}}
The @TeX command was previously written '\TeX'. It is now written '@@TeX'.
Result: The TeX command was previously written '\TeX'. It is now written '@TeX'.
With this redefinition, the '@' should now introduce every command, while the '\' will actually print a backslash character.
- Itemize
Some may find the LaTeX syntax of list environments a bit cumbersome. Here is a quick way to define a wiki-like itemize:
\catcode`| = 13
\def|{\item {--}}
\def\itemize#1{{\leftskip = 40 pt #1 \par}}
\itemize{
| First item
| Second item
}
- Dollar and math
If you have many 'dollar' symbols to print, you may be better off changing the math shift character.
\catcode`$ = 11
\catcode`| = 3
It costs $100.
Let's do the math: |50+50=100|. Let's highlight it:
||50+50=100||
\makeatletter and \makeatother
If you have done a bit of LaTeX hacking, you must have encountered these two commands, \makeatletter and \makeatother.
In TeX the '@' character belongs to catcode 12 (other) by default, which means it cannot be part of a multi-letter macro name. While the kernel, classes and packages are being loaded, LaTeX sets the catcode of '@' to 11 (letter) and uses it to enforce a rule: all non-public, internal macros that are not supposed to be accessed by the end user contain at least one '@' character in their name. In the document, the catcode of '@' is 12 again.
That's why, when you need to access LaTeX internals, you must enclose all the commands accessing private functions between \makeatletter and \makeatother. All they do is change the catcode:
\def\makeatletter{\catcode`@ = 11}
\def\makeatother{\catcode`@ = 12}
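As an illustration, here is a minimal LaTeX sketch of the usual pattern. The command name \currentsize is made up for this example; \f@size is the internal kernel macro that holds the current font size, and it can only be typed while '@' is a letter.

```tex
\documentclass{article}

\makeatletter
% \currentsize is a hypothetical name; \f@size is a LaTeX internal.
\newcommand{\currentsize}{\f@size\,pt}
\makeatother

\begin{document}
The current font size is \currentsize.
\end{document}
```

The catcode only matters when the definition is read: once \currentsize is defined, it can be used anywhere in the document.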
Plain TeX macros
\newcommand and \renewcommand are LaTeX-specific control sequences. \newcommand checks that no existing command gets shadowed by the new definition, and \renewcommand checks that the command already exists.
In Plain TeX, the primitives for macro definition make no such check. It's up to you to make sure you are not breaking anything.
The syntax is
\def<macroname>#1<sep1>#2<sep2>{macro content, use of argument #1, blah, #2 ...}
You can use (almost) any sequence of characters between arguments. For instance let's write a simple macro that will convert the decimal separator from point to comma. First try:
\def\pointtocomma #1.#2{(#1,#2)}
%%...
\pointtocomma 123.456
This will print (123,4)56. We added the parentheses just to highlight the issue here. Each parameter is the shortest possible input sequence that matches the macro definition, separators included. Thus #1 matches all characters up to the first point, and #2 matches the first token only, i.e. the first character, since there is no separator after it.
Solution: add a second separator. A space may seem convenient:
\def\pointtocomma #1.#2 {(#1,#2)}
As a general rule, every time you expect several parameters with specific separators, think about the last separator. If you do not want to play with separators, then Plain TeX macros are used just as LaTeX macros (without default parameters):
\def\mymacro#1#2#3{{\bf #1}#2{\bf #3}}
%% ...
\mymacro{word1}{word2 word3}{!!!}
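To illustrate separators further, here is a small sketch with three delimited parameters. The macro name \showdate and the choice of \relax as the final separator are assumptions made for this example only.

```tex
% Hypothetical macro: split an ISO-style date at the hyphens.
% #1 ends at the first '-', #2 at the second '-', #3 at \relax.
\def\showdate #1-#2-#3\relax{day #3, month #2, year #1}

\showdate 2024-05-17\relax
\bye
```

Ending the parameter list with an explicit token such as \relax avoids depending on a trailing space, which is easy to lose at the end of a line or after a control word.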
Expanded definitions
TeX has another definition command: \edef, which stands for expanded def. The syntax remains the same:
\edef<macroname><argumentslist>{<expanded content>}
The content gets expanded (but not executed, i.e. printed) at the point where \edef is used, instead of where the defined macro is used. Macro expansion is not always obvious...
Example:
\def\intro{Example}
\edef\example#1{\intro~---~#1}
\def\intro{Exercise}
\example{This is an example}
Here the redefinition of \intro will have no effect on \example.
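One way to see the difference is to print what each macro actually contains with the \meaning primitive. The following is a sketch in Plain TeX; the names \lazy and \eager are made up for the example.

```tex
\def\intro{Example}
\def\lazy{\intro}   % body stored as the token \intro
\edef\eager{\intro} % body expanded right now
\def\intro{Exercise}

{\tt\meaning\lazy}  % shows that \lazy still contains \intro

{\tt\meaning\eager} % shows that \eager contains the word Example
\bye
```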
Global definitions
Definitions are limited to their scope. However it might be convenient sometimes to define a macro inside a group that remains valid outside the group, and until the end of the document. This is what we call a global definition.
{
\def\LocalTeX{Local\TeX}
\global\def\GlobalTeX{Global\TeX}
}
I can still access the \GlobalTeX{} macro here.
You can also use the \global prefix with \edef.
Both commands have a shortcut:
- \gdef for \global\def
- \xdef for \global\edef
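A short sketch of \xdef in Plain TeX, with the made-up macro names \name and \saved: the content is expanded inside the group, and the resulting definition survives the end of the group.

```tex
\def\name{World}
{%
  \def\name{Group}%          local redefinition
  \xdef\saved{Hello \name}%  expanded now, defined globally
}
\saved % still defined here, and prints "Hello Group"
\bye
```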
Long definitions
Macros defined with the previous commands do not accept arguments spanning multiple paragraphs, i.e. text containing the \par command or a blank line.
You can prefix the definition with the \long command to allow multi-paragraph arguments.
Example:
\long\def\dummy#1{#1}
\dummy{First paragraph\par Second paragraph}
Outer definitions
The \outer prefix prevents a macro from being used in some contexts. It is useful to make macros more robust by catching uses in the wrong context. Outer macros are meant to be used outside of any context, hence the name.
For instance the following code will fail:
\outer\def\test{a test}
\def\failure{\test}
Outer macros are not allowed to appear in:
- macro parameters
- skipped conditional
- ...
let and futurelet
\let<csname><token> gives <csname> the current meaning of <token>. When <token> is a macro without parameters, this is similar to \expandafter\def\expandafter<csname>\expandafter{<token>}. It defines a new control sequence name which is equivalent to the specified token. The token is usually another control sequence.
Note that \let looks only one level deep into the token, contrary to \edef, which will expand recursively until no further expansion is possible.
Example[1]:
Using let:\par
\def\txt{a}
\def\foo{\txt}
\let\bar\foo
\bar % Prints a
\def\txt{b}
\bar % Prints b
Using edef:\par
\def\txt{a}
\def\foo{\txt}
\edef\bar{\foo}
\bar % Prints a
\def\txt{b}
\bar % Prints a
\futurelet<csname><token1><token2>... works a bit differently. First, token2 is assigned to csname, then TeX processes the <token1><token2>... sequence. So \futurelet allows you to assign a token while using it right after.
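A typical use of \futurelet is to look at the next token without consuming it. The following Plain TeX sketch uses two made-up macros, \peek and \checknext, to test whether an opening bracket follows.

```tex
% \peek and \checknext are hypothetical names.
% \futurelet stores the token after \checknext in \next,
% then \checknext runs with that token still in the input.
\def\peek{\futurelet\next\checknext}
\def\checknext{\ifx\next[Bracket ahead: \else No bracket ahead: \fi}

\peek[abc]

\peek abc
\bye
```

In both cases the text after \peek is still typeset, because the token was only inspected, not removed from the input.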
Special control sequence name
Some macros may have a name that is not directly writable as is. This is the case of macros whose name is built from the expansion of other macros. Example:
\def\status{full}
\def\varempty{This is empty}
\def\varfull{This is full}
\csname var\status \endcsname
The last line will print a sentence depending on the \status.
This command actually does the opposite of \string, which prints a control sequence name without expanding it:
{\tt \string\TeX}
Result: \TeX
Controlling expansion
\expandafter<token1><token2> will expand token2 before token1. It is sometimes needed when the expansion of token2 is desired but cannot happen because of token1.
{\tt \expandafter\string\csname TeX\endcsname}
Result: \TeX
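Another common use of \expandafter is appending to an existing macro. The following is a sketch in Plain TeX with a made-up macro \items; the chain of \expandafter makes TeX expand the inner \items once before \def reads the braces.

```tex
\def\items{a, b}
% Each \expandafter skips one token and expands the next one,
% so the \items inside the braces is replaced by "a, b" first.
\expandafter\def\expandafter\items\expandafter{\items, c}
\items % prints "a, b, c"
\bye
```

Without the \expandafter chain, \def\items{\items, c} would define a macro that calls itself forever.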
\noexpand is useful to have fine-grained control over what gets expanded in an \edef. Example:
\def\intro{Example}
\def\separator{~---~}
\edef\example#1{\intro\noexpand\separator#1}
\example{no expand makes the separator dynamic in an {\tt \string\edef}.}
\def\intro{For instance}
\def\separator{~:~}
\example{the separator changed, but not the first word.}
The \the control sequence will let you see the content of various TeX types:
- catcodes
- chardef
- font parameters
- internal parameters
- lengths
- registers
- ...
Example:
Text dimensions: $ \the\hsize \times \the\vsize $
Registers
Registers are a kind of typed variable. They are limited in number, ranging from 0 to 255. There are 6 different types:
Type | Description |
---|---|
box | one box |
count | an integer |
dimen | a length |
muskip | a glue (in mu unit) |
skip | a glue |
toks | a sequence of tokens |
TeX uses some registers internally, so you would be better off not using them.
List of reserved registers:
- \box255 is used for the contents of a page
- \count0-\count9 are used for page numbering
Scratch registers (freely available):
- \box0-\box254
- \count255
- \dimen0-\dimen9
- \muskip0-\muskip9
- \skip0-\skip9
Assign a value to a register with '='. For box registers, use the \setbox command instead.
\count255=17
\setbox0=\hbox{blah}
You may use one of the following reservation macros to prevent any clash:
\newbox
\newcount
\newdimen
\newmuskip
\newskip
\newtoks
These macros use the following syntax: \new*<csname>.
Example:
\newbox\mybox
\setbox\mybox=\hbox{blah}
In Plain TeX these commands are declared \outer, so they cannot be used inside macros; otherwise every call to the macro would reserve another register.
You can print a register using the \the command. For counters you can also use the \number command. For boxes use the \box command.
\the\hsize
\number\count255
\box\mybox
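Putting allocation, assignment and printing together, a minimal Plain TeX sketch could look like this. The register names \chapters and \gap are made up for the example.

```tex
\newcount\chapters % hypothetical register names
\newdimen\gap
\chapters=3
\gap=1.5cm
There are \the\chapters\ chapters, separated by \the\gap.
\bye
```

Note that \the prints a dimension in points, whatever unit was used in the assignment.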
Arithmetic
The arithmetic capabilities of TeX are very limited, although this base suffices to build some interesting features on top of it. The three main functions:
\advance <register> by <number>
\multiply <register> by <number>
\divide <register> by <number>
register may be of type count, dimen, muskip or skip. It does not make sense for box or toks.
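A short worked example in Plain TeX, using the scratch register \count255, shows the three operations in sequence:

```tex
\count255=7
\multiply\count255 by 6 % 42
\advance\count255 by -2 % 40
\divide\count255 by 3   % 13: integer division discards the remainder
Result: \number\count255
\bye
```

There is no subtraction command: subtracting is done by advancing by a negative amount.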
Conditionals
The base syntax is
\if* <test><true action>\fi
\if* <test><true action>\else<false action>\fi
where \if* is one command among the following.
Control sequence | Description |
---|---|
\if <a><b> | True if two character codes are equal. |
\ifcat <a><b> | True if two category codes are equal. |
\ifdim <a><rel><b> | Dimension relation, either <, > or =. |
\ifeof | True if End-Of-File or non-existent file. |
\iffalse | Always false. |
\ifhbox <reg> | True if box register contains a horizontal box. |
\ifhmode | True if in horizontal mode. |
\ifinner | True if in internal mode. |
\ifmmode | True if in math mode. |
\ifnum <a><rel><b> | Number relation, either <, > or =. |
\ifodd <num> | True if number is odd. |
\iftrue | Always true. |
\ifvbox <reg> | True if box register contains a vertical box. |
\ifvmode | True if in vertical mode. |
\ifvoid <reg> | True if box register is empty. |
\ifx <a><b> | True if two macros expand to the same text, or if two character codes are equal, or if two category codes are equal. |
Example:
\ifnum 5>6
This is true
\else
This is false
\fi
Result: This is false
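As a further illustration, \ifx is often used to test whether a macro is empty by comparing it with \empty. The macro names \maybe and \other below are made up for this sketch.

```tex
\def\maybe{}      % hypothetical macro with an empty body
\def\other{text}  % hypothetical macro with a non-empty body
\ifx\maybe\empty  The first macro is empty.\else The first macro is not empty.\fi

\ifx\other\empty  The second macro is empty.\else The second macro is not empty.\fi
\bye
```

\ifx compares the definitions themselves without expanding them, which is why it is the usual choice for this kind of test.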
Self defined conditionals
You can create new conditionals (as a kind of boolean variable) with the \newif command. With these self-defined conditionals you can control the output of your code in an elegant way.
The best way to illustrate the use of conditionals is through an example.
Two versions of a document must be generated. One version for group A the other one for the rest of people (i.e. not belonging to group A):
1. We use \newif to define our conditional (i.e. boolean variable).
\newif\ifgroupA
2. In the following way we set a value (true or false) for our conditional:
\groupAtrue % or
\groupAfalse
that is:
\<conditionalsname>true
\<conditionalsname>false
depending on which value we want to set in our conditional.
3. Now we can use our conditional anywhere afterwards in an if control structure.
\ifgroupA
% Here we write the code of the document that is
% intended for the group A
\else
% Here we write the code of the document that is
% intended for the rest of the people
\fi
A full example is:
\newif\ifdirector
%I set the conditional to false
\directorfalse
\ifdirector
I write something for the director.
\else
I write something for common people.
\fi
Result: I write something for common people.
Case statement
The syntax is \ifcase <number><case0>\or<case1>\or...\else<defaultcase>\fi. If number is equal to the case number, its content will be printed. Note that it starts at 0.
\ifcase 2 a\or b\or c\or d\else e\fi
Result: c
\else is used to specify the default case (whenever none of the previous cases have matched).
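A classic application of \ifcase is turning the \month parameter into a month name. The macro name \monthname is an assumption for this sketch; case 0 is left empty because months are numbered from 1.

```tex
% Hypothetical macro: case 0 is empty, cases 1 to 12 hold the names.
\def\monthname{\ifcase\month\or January\or February\or March\or
  April\or May\or June\or July\or August\or September\or
  October\or November\or December\fi}

Today is \monthname\ \number\day, \number\year.
\bye
```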
Loops
The base syntax is
\loop <content> \if*<condition><true action>\repeat
As always, content and true action are arbitrary TeX contents. \if* refers to any of the conditionals. Note that there is no false action: you cannot put an \else between \if* and \repeat. In some cases this will be the opposite of what you want; you then have to change the condition or define a new conditional using \newif.
Example:
\count255 = 1
\loop
\TeX
\ifnum\count255 < 10
\advance\count255 by 1
\repeat
The above code will print TeX ten times.
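Loops combine naturally with the arithmetic commands. The following Plain TeX sketch adds up the integers from 1 to 10; the register names \idx and \total are made up for the example.

```tex
\newcount\idx \newcount\total % hypothetical register names
\idx=1 \total=0
\loop
  \advance\total by \idx
  \advance\idx by 1
\ifnum\idx<11 \repeat
Sum: \number\total % 55
\bye
```

Here the whole body comes before the test, so the true action is empty and the loop simply repeats while the condition holds.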
Doing nothing
Sometimes it may be useful to tell TeX that you want to do nothing.
There are two commands for that: \relax and \empty.
Classic example:
\def\myspace{\hskip 25pt\relax}
\myspace{} plus 10pt
The \relax prevents undesired behaviour if a plus or a minus is encountered after the command.
The difference between \empty and \relax lies in the expansion: \empty disappears after macro expansion.
TeX characters
char
We can print all characters using the \char<charcode> command. The charcode is actually the byte value.
For example
\char65 = \char `A = \char `\A
Most characters correspond to the ASCII value (e.g. A-Za-z), some replace the non-printable characters from ASCII.
chardef and mathchardef
You can define a control sequence to expand to a specific char. The syntax is \chardef<control sequence>=<charcode>.
The following sequences do the same thing.
\chardef\myA=65
\chardef\myA=`A
\chardef\myA=`\A
Example:
\mathchardef\alphachar = "010B
$\alphachar$
Font encoding map
We can use the above primitive to print the font encoding map.
\count255 = 0
\loop
[\number\count255 =\char\number\count255]
\ifnum\count255 < 127
\advance\count255 by 1
\repeat
Another version, with different fonts, one entry per line:
\count255 = 0
\loop
[\number\count255 =
\char\number\count255 \
{\tt \char\number\count255}
{\it \char\number\count255}
]
\hfil\break
\ifnum\count255 < 127
\advance\count255 by 1
\repeat
Verbatim lines and spaces
It is rather confusing to discover that (La)TeX treats all whitespace as the same type of spacing glue. Plain TeX provides some commands to preserve the spacing and newlines as you wrote them:
\begingroup
\obeylines
\obeyspaces
Relevant text here
\endgroup
which means that you will probably need to combine your own verbatim environment and your command:
\newenvironment{myverbatim}{\begingroup \obeylines \obeyspaces}{\endgroup}
\newcommand{\mycommand}[n]{do something with #1 .. #n}
and then in your tex file:
\begin{myverbatim}
\mycommand{
whichever text it is important you
preserve the spacing and newlines
for, like when you want to generate
a verbatim block later on.
}
\end{myverbatim}
Macros defining macros
This is useful in some cases, for example to define language commands as explained in Multilingual versions, where the end user can write
\en{some english text}
\de{etwas deutscher Text}
and make sure it switches to the appropriate Babel language.
Let's define a macro that will define language commands, for instance. These commands are simple: if the argument is the value of the \locale variable, then the corresponding macro prints its content directly. Otherwise, it does nothing.
Basically, what we want to do is extremely simple: define a bunch of macros like this:
\newcommand{\de}[1]{#1}
\newcommand{\en}[1]{}
\newcommand{\fr}[1]{}
In the previous snippet of code, only the \de command is going to output its content; \en and \fr will print nothing at all. That's what we want. The problem arises when you want to automate the task, or if you have a lot of languages, and you want to change the language selection. You just have to move the #1, but that's not convenient and it makes it impossible to choose the Babel language from the command line. Think this out...
What we are going to do is define the language commands dynamically following the value of the \locale variable (or any variable of your choice). Hence the use of the \equal command from the ifthen package.
Since it is hardly possible to write it in LaTeX, we will use some Plain TeX.
\def\locale{de}
\def\localedef#1{
\ifthenelse{ \equal{\locale}{#1} }{
%% Set the Babel language.
%% Define the command to print the content.
}{
%% Define the command to print nothing.
}
}
Another problem arises: how to define a command whose name is a variable? In most programming languages that's not possible at all. What we could be tempted to write is
\def\#1 #1{#1}
It will fail for two reasons.
- The two last '#1' are supposed to refer to the arguments of the new macro, but they get replaced by the first argument of the \localedef macro because they are in the body of that macro.
- \#1 is read as two tokens, the control sequence \# and the character '1', so the \def command will not define the control sequence we want.
The solution to problem 1 is simple: use '##1', which will expand to '#1' when the macro is executed.
For problem 2, it is a little bit tricky. It is possible to tell TeX that a specific token is a control sequence. This is what \csname...\endcsname is used for. However
\def\csname#1\endcsname ##1{##1}
will fail because it will redefine \csname to '#1', which is not what we want, then TeX will encounter \endcsname, which will result in an error.
We need to delay the expansion of \def, i.e. to tell TeX to expand the \csname stuff first, then to apply \def on it. There is a command for that: \expandafter<token1><token2>. It will expand <token2> before <token1>.
Finally, if we want to set the language from the command line, we must be able to set the \locale variable so that the one in the source code is the default value that can be overridden by the one in the command line. This can be done with \providecommand:
\providecommand\locale{fr}
The final code is
%% Required package.
\usepackage{ifthen}
%% TeX function that generates the language commands.
\def\localedef#1#2{
\ifthenelse{ \equal{\locale}{#1} }{
\selectlanguage{#2}
\expandafter\def\csname#1\endcsname ##1{##1}
}{
\expandafter\def\csname#1\endcsname ##1{}
}
}
%% Selected language. Can be placed anywhere before the language commands.
\providecommand\locale{fr}
%% Language commands.
\localedef{de}{ngerman}
\localedef{en}{english}
\localedef{fr}{frenchb}
%% ...
And you can compile with
latex '\providecommand\locale{en}\input{mydocument.tex}'
Notes and References
- [1] From tex.stackexchange.com: What is the difference between \let and \edef?
- Further reading
- The TeXbook, Donald Knuth
- TeX by Topic, Victor Eijkhout
- TeX for the Impatient, Paul W. Abrahams, Karl Berry and Kathryn A. Hargreaves
- TeX command reference in wikibooks