% This is epodd.tex, the description of the treetex macro package as it will
% appear in EP-ODD in summer 89. It is in some aspects more general
% than tree_doc.tex and corrects an error in the computation of
% the number of registers used by treetex. The user interface of
% treetex is explained in more detail in tree_doc.tex.
\documentstyle[12pt,fullpage]{article}
\clubpenalty=10000
\widowpenalty=10000
\def\addcontentsline#1#2#3{\relax}% Some captions are too long for some
% TeX installations (buffer size too small)
\newenvironment{lemma}{\begingroup\samepage\begin{lemmma}\ }{\end{lemmma}%
\endgroup}
\newtheorem{lemmma}{Lemma}[section]
\newenvironment{proof}{\begin{prooof}\rm\ \nopagebreak}{\end{prooof}}
\newcommand{\proofend}{\qquad\ifmmode\Box\else$\Box$\fi}
\newtheorem{prooof}{Proof}
\renewcommand{\theprooof}{} % makes sure that prooof doesn't get numbers
\newenvironment{Figure}{\begin{figure}\vspace{1\baselineskip}}%
{\vspace{1\baselineskip}\end{figure}}
\newlength{\figspace} % space between figures in a single
\setlength{\figspace}{30pt} % Figure environment
\newcommand{\var}[1]{{\it #1\/}} % use it for names of variables
\newcommand{\emph}[1]{{\em #1\/}} % use it for emphasized text
% (This notion sticks to the
% applicative style of markup.)
\renewcommand{\O}{{\rm O}} % O-notation, also for math mode
\newcommand{\T}{{\cal T}} % the set T in math mode
\newcommand{\TreeTeX}{Tree\TeX}
\newcommand{\fig}[1]{Figure~\ref{#1}}
\let\p\par
\input treetex
\Treestyle{\vdist{20pt}\minsep{16pt}}
\dummyhalfcenterdim@n=2pt
% \def\Tree#1\end#2{\end{Tree}} % Trees are not processed
% \let\endTree\relax %
\def\Node(#1,#2){\put(#1,#2){\circle*{4}}}
\def\Edge(#1,#2,#3,#4,#5){\put(#1,#2){\line(#3,#4){#5}}}
\def\enode{\node{\external\type{dot}}}
\def\inode{\node{\type{dot}}}
\def\e{\node{\external\type{dot}}}
\def\i{\node{\type{dot}}}
\def\il{\node{\type{dot}\leftonly}}
\def\ir{\node{\type{dot}\rightonly}}
\newcommand{\stack}[3]{%
\vtop{\settowidth{\hsize}{#1}%
\setlength{\leftskip}{0pt plus 1fill}%
\setlength{\baselineskip}{#2}#3}}
\let\multic\multicolumn
\newlength{\hd} % hidden digit
\setbox0\hbox{1}
\settowidth{\hd}{\usebox{0}}
\newcommand{\ds}{\hspace{\hd}} % digit space
\newcommand{\ccol}[1]{\multicolumn{1}{c}{#1}}
\hyphenation{post-or-der sym-bol Karls-ruhe bool-ean}
\begin{document}
\bibliographystyle{plain}
\title{Drawing Trees Nicely with \TeX\thanks{This work was supported by
a Natural Sciences and Engineering Research Council of Canada
Grant~A-5692, a Deutsche Forschungsgemeinschaft Grant~Sto167/1-1,
and a grant from the Information Technology Research Centre.
It was begun during the first author's stay with
the Data Structuring Group in Waterloo.}}
\author{Anne Br\"uggemann-Klein\thanks{Institut f\"ur Informatik,
Universit\"at Freiburg, Rheinstr.~10--12, 7800~Freiburg,
West~Germany}\ \and Derick Wood\thanks{Data
Structuring Group, Department of Computer Science, University of
Waterloo, Waterloo, Ontario N2L~3G1, Canada}}
\date{}
\maketitle
\begin{abstract}
We present a new solution to the tree drawing problem that
integrates an excellent tree drawing algorithm into one of the best text
processing systems available. More precisely, we present a \TeX{} macro package
called \TreeTeX{} that produces drawings of trees from a purely logical
description. Our approach has three advantages: Labels
for nodes can be handled in a reasonable way; porting
\TreeTeX{} to any site running \TeX{} is a trivial operation; and
modularity in the description of a tree and \TeX{}'s macro capabilities
allow for libraries of subtrees and tree classes.
In addition, \TreeTeX{} has an option that produces
drawings that make the
\emph{structure} of the trees more obvious to the human eye,
even though they may not be as aesthetically pleasing.
\end{abstract}
\section{Introduction}
The problem of successfully integrating pictures and text in a
document processing environment is tantalizing and difficult.
Although there are systems available that allow such integration, they
fall short in many ways, usually in document quality. Furthermore,
most authors using document preparation systems are neither book
designers nor graphic artists. Just as modern document preparation
systems do not expect an author to be a book designer, so we would
prefer that they do not expect an author to be a graphic artist. The
second author, Wood, needed to draw many trees in a series of papers
on trees and in a projected book on trees. This need gave us the
opportunity to tackle the integration issue for one subarea of graphics, namely,
tree drawing. We had the decided advantage that there already existed good
algorithms to draw trees {\em without any author intervention}.
Previous experience of the integration of pictures and text had been
uninspiring; the systems expected the author to prepare each picture
in total. For example, a tree could be built up from smaller
subtrees but the relative placement of them was left to the author.
This situation continues to hold today with the drawing facilities
available on most personal computers, and, because of this, the
resulting figures still appear to be ``hand-drawn.'' Additionally,
they are of inferior quality when compared with the quality of
the surrounding text.
In this paper we present an entirely new solution that integrates
a tree drawing algorithm into one of the best text processing
systems available. More precisely, we describe \TreeTeX{}, a
\TeX{} macro package that produces an aesthetically pleasing
drawing of a tree from a purely logical description.
We made two fundamental design
decisions that heavily influenced the method of implementation.
First, we wanted to allow an author to label the nodes of a tree.
This decision means that the tree drawing package must be able to
typeset labels exactly as they would be typeset by the typesetting
program. There are two reasons for this: text should be typeset
consistently wherever it appears in a document, and the tree
drawing program needs to know the dimensions of the typeset labels.
Second, we wanted to ensure that the program could be ported
easily to other installations and sites, so that other, putative
users would be able to use it easily.
Indeed, \TreeTeX{} has been used successfully to typeset trees in
\cite{BaezaTrees}, \cite{KWIFIP}, and \cite{OAPD}.
By basing our package on \TeX{}, which for more subjective
reasons we preferred over other typesetting systems such as
troff, we could ensure wide interest
in the package. By implementing it as a \TeX{} macro package
instead of a preprocessor
we made porting trivial and, furthermore, this also ensured
consistency of typeset text within a document.
The downside of this decision is that we had to program with
\TeX{} macros, not an experience to be recommended, and we had to live
with the inherent register limitations of \TeX{}.
This paper consists of nine further sections. In Sections~2, 3, and~4,
we discuss the aesthetics of tree drawing and the algorithm of
Reingold and Tilford~\cite{TidierTrees}. In Sections~5, 6, and~7, we
describe our method of incorporating tree drawing into \TeX{}. Then,
in the last three short sections, we consider the expected number of
registers \TeX{} needs to draw a tree, the user interface (and three
\TreeTeX{} examples), and a discussion of, among other things, the
performance of \TreeTeX{}.
\section{Aesthetical criteria for drawing trees}
In this paper, we are dealing with ordered
trees in the sense of~\cite{ACP}, specifically binary and unary-binary
trees. A {\em binary tree\/} is a finite set of nodes that either
is empty, or consists of a root and two disjoint binary trees called
the left and right subtrees of the root. A {\em unary-binary tree\/} is
a finite set of nodes that either is empty, or consists of a root and
two disjoint unary-binary trees, or consists of a root and one
nonempty unary-binary tree. An {\em extended binary tree\/} is a binary tree
in which each node has either two nonempty subtrees or two
empty subtrees.
There are some basic agreements on how such trees should be drawn, reflecting
the top-down and left-right ordering of nodes in a tree.
In \cite{TidierTrees} and \cite{TidyTrees} these basic agreements were
formalized as the following axioms.
\begin{enumerate}
\item[1.] Trees impose a distance on the nodes; no node
should be closer to the root than any of its
ancestors.
\item[2.] Nodes on the same level should lie on a straight
line, and the straight lines defining the levels should be
parallel.
\item[3.] The relative order of nodes on any level should be the same
as in the level order traversal of the tree.
\end{enumerate}
These axioms guarantee that trees are drawn as planar graphs: edges do
not intersect except at nodes. Two further axioms improve the aesthetical
appearance of trees.
\begin{enumerate}
\item[4.] In a unary-binary tree, each left child should be positioned
to the left of its parent, each
right child to the right of its parent, and each unary child
should be positioned below its parent.
\item[5.] A parent should be centered over its children.
\end{enumerate}
An additional axiom deals with the problem of tree drawings becoming too wide
and therefore exceeding the physical limit of the output medium:
\begin{enumerate}
\item[6.] Tree drawings should occupy as little width as possible without
violating the other axioms.
\end{enumerate}
In \cite{TidyTrees}, Wetherell and Shannon introduce two algorithms for
tree drawings, the first of which fulfills axioms~1--5, and the second
1--6. However, as Reingold and Tilford~\cite{TidierTrees}
point out, there is a lack of symmetry in the algorithms of
Wetherell and Shannon which may lead to unpleasant results;
therefore, Reingold and Tilford introduce a new structural
axiom.
\begin{enumerate}
\item[7.] A subtree of a given tree should be
drawn the same way regardless of where it occurs in the tree.
\end{enumerate}
Axiom~7 allows the same tree to be drawn differently only when it occurs as
a subtree in different trees.
Reingold and Tilford give an algorithm which fulfills axioms~1--5
and~7. Although
this algorithm doesn't fulfill axiom~6,
the aesthetical improvements are well worth the additional space.
\fig{algorithms} illustrates the benefits of axiom~7, and \fig{narrowtrees}
shows that the algorithm of Reingold and Tilford violates axiom~6.
\begin{Figure}
\centering
\leavevmode\noindent
\begin{Tree}
\enode
\enode\enode\inode\enode\enode\inode\inode\inode
\node{\external\type{dot}\rght{\unskip\hskip2\mins@p\hskip2\dotw@dth}}
\enode\enode\inode\enode\enode\inode\inode\inode
\inode
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree}
\enode
\enode\enode\inode\enode\enode\inode\inode\inode
\enode
\enode\enode\inode\enode\enode\inode\inode\inode
\inode
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\
\caption{The left tree is drawn by the algorithm of Wetherell and Shannon,
and the tidier right one is drawn by the algorithm of Reingold and Tilford.}
\label{algorithms}
\vspace{\figspace}
\centering
\leavevmode\noindent
\begin{Tree}
\enode\enode\enode\enode\enode\enode\enode\enode\enode
\enode\inode\inode\inode
\enode\inode\inode\inode
\enode\inode\inode\inode
\enode\inode\inode\inode
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree}
\enode\enode\enode\enode\enode\enode\enode\enode
\node{\external\type{dot}\rght{\unskip\hskip\mins@p\hskip\dotw@dth}}
\enode\inode\inode\node{\type{dot}\rght{\unskip\hskip\mins@p\hskip\dotw@dth}}
\enode\inode\inode\node{\type{dot}\rght{\unskip\hskip\mins@p\hskip\dotw@dth}}
\enode\inode\inode\node{\type{dot}\rght{\unskip\hskip\mins@p\hskip\dotw@dth}}
\enode\inode\inode\inode
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\
\caption{The left tree is drawn by the algorithm of Reingold and Tilford, but
the right tree shows that narrower drawings fulfilling all aesthetic axioms
are possible.}
\label{narrowtrees}
\end{Figure}
\section{The algorithm of Reingold and Tilford}
The algorithm of Reingold and Tilford (hereafter called ``the RT~algorithm'')
takes a modular approach to the
positioning of nodes. The relative positions of the nodes in a subtree
are calculated independently of the rest of the tree. After the
relative positions of two subtrees have been calculated, they can be
joined as siblings in a larger tree by placing them as close
together as possible and centering the parent node above them.
Incidentally, this modular approach is the reason that the
algorithm fails to fulfill axiom~6; see~\cite{Complexity}.
Two sibling subtrees are placed as close together as possible,
during a postorder traversal, as follows.
Imagine that the two subtrees of a binary node
have been drawn and cut out of paper along
their contours. Then, starting with the two subtrees superimposed at their
roots, move them apart until an agreed-upon minimum distance
between the trees is obtained at each level. This can be done gradually.
Initially, their roots are separated by the agreed-upon minimum
distance; then, at the next level, they are pushed
apart until the minimum separation is established there.
This process is continued at successively lower levels until the
last level of the shorter subtree is reached. At some levels no movement may be
necessary, but at no level are the two subtrees moved closer
together. When the process is complete, the position of the
subtrees is fixed relative to their parent, which is centered over them.
Assured that the subtrees will never be placed closer together,
the postorder traversal is continued.
Reingold and Tilford~\cite{TidierTrees} give a nontrivial implementation of
this algorithm
that runs in time $\O(N)$, where $N$ is the number of
nodes of the tree to be drawn.
Their crucial idea is to keep track of the contour of the subtrees
by special pointers, called threads, such that whenever
two subtrees are joined, only the
top part of the trees down to the lowest level of the
smaller tree needs to be taken into account.
The nodes are positioned on a fixed grid and are
considered to have zero width; labeling is not provided.
Although the algorithm only draws binary trees, it is easily
extended to multiway trees.
\section{Improving human perception of trees}
It is commonly understood in book design that aesthetics and readability
don't necessarily coincide, and---as Lamport~\cite{LaTeX} puts it---%
``documents are meant to be read, not hung in museums.''
Therefore, readability is more important than aesthetics.
When it comes to tree drawings, readability means that the structure of
a tree must be easily recognizable. This criterion is not always met
by the RT~algorithm. As an example, there are trees whose structure is
different even though they have the same number
of nodes on each level. The RT~algorithm might assign identical positions to
these nodes making it very hard to perceive the structural differences.
Hence, we have modified the RT~algorithm such that additional white space
is inserted between subtrees of
\emph{significant} nodes. Here a binary node
is called significant if the minimum distance
between its two subtrees is achieved \emph{below} their root level.
Setting the amount of additional white space to zero retains the original RT~%
placement. The effect of having nonzero additional white space between
the subtrees of significant
nodes is illustrated in \fig{addspace}.
Another feature we have added to the RT~algorithm is the option of drawing
an unextended binary tree with the same placement of nodes as its
associated extended version;
this makes the structure of a tree more prominent; see \fig{extended}.
We define the \emph{associated extended version}
of a binary tree to be the binary tree obtained by replacing each empty subtree
having a nonempty sibling with a subtree consisting of one node.
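
As a small illustration, written with the abbreviations \verb|\e|, \verb|\i|,
and \verb|\il| that the source of this paper defines for external, binary,
and left-only nodes, the first description below contains one node with an
empty right subtree; its associated extended version supplies a single
external node in that place. The comments are our annotation, and the
fragment is a sketch rather than a complete input file.
\begin{verbatim}
% unextended: the \il node has only a left subtree
\e\e\i\il\e\e\i\i
% associated extended version: \il is replaced by \e\i
\e\e\i\e\i\e\e\i\i
\end{verbatim}
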
\begin{Figure}
\centering
\leavevmode\noindent
\begin{Tree}
\e\il\e\e\i\i\il % the left subtree
\e\ir\il % the right subtree
\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree}
\e\il\il\il % the left subtree
\e\e\i\e\i\il % the right subtree
\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\adds@p10pt
\begin{Tree}
\e\il\e\e\i\node{\type{dot}\lft{$\longrightarrow$}}\il % the left subtree
\e\ir\il % the right subtree
\node{\type{dot}\lft{$\longrightarrow$}}
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\qquad
\begin{Tree}
\e\il\il\il % the left subtree
\e\e\i\e\i\il % the right subtree
\node{\type{dot}\lft{$\longrightarrow$}}
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\
\adds@p0pt
\caption{The nodes of the first two trees are placed in the same positions
by the RT~algorithm, although the structure of the two trees is different.
The alternative drawings highlight the structural differences
of the trees by inserting additional white space between the subtrees of
($\longrightarrow$) significant nodes.}
\label{addspace}
\end{Figure}
\begin{Figure}
\centering
\leavevmode\noindent
\begin{Tree}
\e\e\i\il\e\e\i\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\hbox{}\qquad
\begin{Tree}
\e\e\i\e\i\e\ir\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\hbox{}\\
\extended
\begin{Tree}
\e\e\i\il\e\e\i\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\hbox{}\qquad
\begin{Tree}
\e\e\i\e\i\e\ir\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\hbox{}\\
\noextended
\begin{Tree}
\e\e\i\e\i\e\e\i\i
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist\
\caption{As in the previous figure, the nodes of the first two trees
are placed in the same position by the RT~algorithm,
although their structure is different. The modified
RT~algorithm highlights the structural differences of the trees by
drawing them like their identical extended
version (given in the third row), but suppressing the additional nodes.}
\label{extended}
\end{Figure}
\section{Trees in a document preparation environment}
Drawings of trees do not usually appear by themselves,
but are included in some text
that is itself typeset by a text processing system. Therefore, a typical
scenario is a pipe of three stages. First, we have a tree drawing
program that calculates the positioning of the nodes of the tree to
be drawn and outputs a description of the tree drawing in
some graphics language; this is followed by
a graphics system that transforms this
description into an intermediate language that can be interpreted by the output
device; and, finally, we have the
text processing system that integrates the output of the
graphics system into the text.
This scenario loses its linear structure once nodes have to be labeled, since
the labeling influences the positioning of the nodes. Labels usually occur
inside, to the left of, to the right of, or beneath nodes (the latter only for
external nodes). Their widths should certainly be taken into account
by the tree drawing algorithm. But the labels have to be typeset first
to determine their dimensions,
preferably by the typesetting program that
is used for the regular text, because this ensures uniformity in the textual
parts of the document and provides the author with the full power of a
text processing system for composing the labels. Hence, a more complex
communication scheme than a simple pipe is required.
Although a system of two processes running simultaneously might be the most
elegant solution, we wanted a system that is easily portable to
widely different machines at our sites
including personal computers with single process
operating systems.
Therefore, we decided to use a text processing system
having programming facilities powerful enough to
program a tree drawing algorithm and graphics facilities powerful enough
to draw a tree. One text processing system
rendering outstanding typographic quality and satisfactory programming
facilities is \TeX, developed by Knuth at Stanford University;
see~\cite{TeXbook}.
The \TeX{} system includes the following programming facilities.
\begin{enumerate}
\item[1.] Datatypes:\\
integers~(256), dimensions\footnote{The term \emph{dimension} is used
in \TeX\ to describe physical measurements of typographical objects;
for example, the length of a word.}~(512),
boxes~(256), tokenlists~(256), and
boolean variables~(unrestricted).
\item[2.] Elementary statements:\\
$a:=\rm const$, $a:=b$ (all types);\\
$a:=a+b$, $a:=a*b$, $a:=a/b$ (integers and dimensions); and\\
horizontal and vertical nesting of boxes.
\item[3.] Control constructs:\\
if-then-else statements testing relations between integers,
dimensions, boxes, or boolean variables.
\item[4.] Modularization constructs:\\
macros with up to 9~parameters (can be viewed as procedures without
the concept of local variables).
\end{enumerate}
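
The following fragment is a minimal sketch of how these facilities look in
practice. All register and macro names (\verb|\nodecount|, \verb|\nodesep|,
\verb|\labelbox|, \verb|\shape|, \verb|\iffits|, \verb|\widen|) are
hypothetical and are not part of \TreeTeX{}.
\begin{verbatim}
\newcount\nodecount  \newdimen\nodesep   % integer, dimension
\newbox\labelbox     \newtoks\shape      % box, token list
\newif\iffits                            % boolean variable
\nodecount=3  \nodesep=16pt  \shape={dot}
\advance\nodesep by 4pt         % a := a + b, now 20pt
\multiply\nodesep by \nodecount % a := a * b, now 60pt
\divide\nodesep by 2            % a := a / b, now 30pt
\setbox\labelbox=\hbox{label}   % typeset a label into a box
\ifdim\wd\labelbox<\nodesep     % if-then-else on dimensions
  \fitstrue
\else
  \fitsfalse
\fi
\def\widen#1#2{\advance#1 by #2\relax} % macro, two parameters
\widen\nodesep{4pt}             % now 34pt
\end{verbatim}
Note that \verb|\multiply| and \verb|\divide| take an integer as their
second operand, so a product of two dimensions is not directly available.
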
Although the programming
facilities of \TeX{} hardly exceed the abilities of a Turing machine,
they are sufficient to
handle small programs. What about the graphics facilities?
Although \TeX{} has no built-in graphics facilities, it
allows the placement of characters in arbitrary positions on
the page. Therefore, complex pictures can be synthesized from elementary
picture elements treated as characters. Lamport has included such
a picture drawing environment in his macro package \LaTeX, using
quarter circles of different sizes and line segments (with and without
arrow heads) of different slopes as basic elements; see~\cite{LaTeX}.
These elements are sufficient for drawing trees.
This survey of \TeX's capabilities implies that \TeX{} may be a suitable
text processing system to implement a tree drawing algorithm directly.
We base our algorithm on the RT~algorithm, because this algorithm
gives, aesthetically, the most pleasing results. In the first version
presented here, we
restrict ourselves to unary-binary trees, although our method is
applicable to arbitrary multiway trees. But to take advantage
of the text processing environment, we expand the algorithm to allow
labeled nodes.
In contrast to previous tree drawing programs, we feel no necessity to
position the nodes of a tree on a fixed grid. While this may be
reasonable for a plotter with a coarse resolution, it is certainly not
necessary for \TeX, a system that is capable of handling
arbitrary dimensions
and producing device \emph{independent} output.
\section{A representation method for \TeX{}trees}
The first problem to be solved in implementing our tree drawing algorithm
is how to choose a good internal representation
for trees. A straightforward adaptation
of the implementation by Reingold and Tilford requires, for each node,
at least:
%
\begin{enumerate}
\item two pointers to the children of the node,
\item two dimensions for the offset to the left and the right child (these
may be different once there are labels of different widths to the
left and right of the nodes),
\item two dimensions for the $x$- and $y$-coordinates of the final
position of the nodes,
\item three or four labels, and
\item one token to store the geometric shape (circle, square, framed text, etc.)
of the node.
\end{enumerate}
%
Because these data are used frequently in calculations, they should be
stored in registers (that's what variables are called in \TeX)
rather than being recomputed, to obtain
reasonably fast performance. This gives a total of $10N$ registers for
a tree with $N$ nodes, which quickly exceeds
\TeX's limited supply of registers. Therefore, we present a
modified algorithm hand-tailored to the abilities of \TeX{}.
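
To make the count explicit, here is a rough tally under the assumption
(ours, for illustration only) that pointers are kept in integer registers,
labels in box registers, and the shape in a token list register, with
three labels per node:
\[
\underbrace{2N}_{\rm pointers}+\underbrace{4N}_{\rm dimensions}
+\underbrace{3N}_{\rm labels}+\underbrace{N}_{\rm shape}=10N .
\]
With the register supply listed in Section~5, the $3N$ label boxes alone
would require $3N\le256$, that is, $N\le85$, even before any registers
are set aside for other purposes.
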
We start with the following observation.
Suppose a unary-binary tree is built bottom-up, using a postorder
traversal. This can be done by repeating the following three steps in
an order determined by the tree to be built.
\begin{enumerate}
\item Create a new subtree consisting of one external node.
\item Create a new subtree by appending the two subtrees last created
to a new binary node; see \fig{Construct}.
\item Create a new subtree by appending the subtree created last as a left,
right, or unary subtree of a new node; see \fig{Construct}.
\end{enumerate}
(A pointer to) each subtree that has been
created in steps 1--3 is pushed onto a stack, and
steps 2 and 3 remove two trees or one tree, respectively,
from the stack before the push
operation is carried out. The tree to be built is
the tree remaining on the stack.
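
As an illustration, the following description builds a small tree in
postorder. It uses the node macros as they appear in the source of this
paper; the stack contents given in the comments are our annotation,
with a and b denoting the two external nodes.
\begin{verbatim}
\begin{Tree}
\node{\external\type{dot}} % step 1, stack: a
\node{\external\type{dot}} % step 1, stack: a b
\node{\type{dot}}          % step 2, stack: (a b)
\node{\type{dot}\leftonly} % step 3, stack: ((a b) -)
\end{Tree}
\hskip\leftdist\box\TeXTree\hskip\rightdist
\end{verbatim}
After the last step a single tree remains on the stack; the final line
places its drawing in the way the figures of this paper do.
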
\begin{Figure}
\centering
\begin{Tree}
\treesymbol{\lvls{2}}%
\hspace{-\l@stlmoff}\usebox{\l@sttreebox}\hspace{\l@strmoff}
$+$
\treesymbol{\lvls{2}}%
\hspace{-\l@stlmoff}\usebox{\l@sttreebox}\hspace{\l@strmoff}\quad
$\Longrightarrow$\quad
\treesymbol{\lvls{2}}%
\treesymbol{\lvls{2}}%
\node{\type{dot}}%
\hspace{-\l@stlmoff}\raisebox{\vd@st}{\usebox\l@sttreebox}\hspace{\l@strmoff}%
\end{Tree}
\vskip\baselineskip
\begin{Tree}
\treesymbol{\lvls{2}}%
\hspace{-\l@stlmoff}\usebox{\l@sttreebox}\hspace{\l@strmoff}\quad
$\Longrightarrow$\quad
\treesymbol{\lvls{2}}%
\node{\leftonly\type{dot}}%
\hspace{-\l@stlmoff}\raisebox{\vd@st}{\usebox\l@sttreebox}\hspace{\l@strmoff}%
\quad or\quad
\treesymbol{\lvls{2}}%
\node{\unary\type{dot}}%
\hspace{-\l@stlmoff}\raisebox{\vd@st}{\usebox\l@sttreebox}\hspace{\l@strmoff}%
\quad or\quad
\treesymbol{\lvls{2}}%
\node{\rightonly\type{dot}}%
\hspace{-\l@stlmoff}\raisebox{\vd@st}{\usebox\l@sttreebox}\hspace{\l@strmoff}%
\end{Tree}
\caption{Construction steps 2 and 3}
\label{Construct}
\end{Figure}
This tree traversal is performed twice in the RT~algorithm.
During the first pass,
at each execution of step 2 or~3, the relative positions of the
subtree(s) and of the new node are computed.
A closer examination of the RT~algorithm reveals that information about the
subtree's coordinates is not needed during this pass; the contour information
alone is sufficient. Complete information is only needed in the second
traversal, when the tree is really drawn. This is where we can use
a special feature of \TeX{} that allows us to save registers.
Unlike Pascal, \TeX{} has the capability of
storing a drawing in a single box register that can be positioned freely in
later drawings. This means that in our implementation the two passes
of the original RT~algorithm can be woven into a single pass,
storing the contour and drawing of each subtree on the stack.
Although the latter is a complex object, it takes only one of
\TeX's precious registers.
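
A minimal sketch of this feature, using only plain \TeX{} primitives, is
given below; the register name \verb|\subtree| is hypothetical and the
fragment is not taken from \TreeTeX{}.
\begin{verbatim}
\newbox\subtree    % hypothetical register name
% store a "drawing" once
\setbox\subtree=\hbox{$\bullet$}
% reuse it twice: as is, then 20pt further right and 10pt higher
\leavevmode
\hbox{\copy\subtree\kern20pt\raise10pt\copy\subtree}
\end{verbatim}
Here \verb|\copy| leaves the register intact, whereas \verb|\box| would
empty it after use.
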
\section{The internal representation}
Given a tree, the corresponding \TeX{}tree is a box containing
the ``drawing'' of the tree, together with some additional
information about the contour of the tree.
The reference point of a \TeX{}tree-box is always in the root of the
tree. The height, depth, and width of the box of a \TeX{}tree are
of no importance in this context.
The additional information about the contour of the tree is stored in some
registers for numbers and dimensions and
is needed in order to put subtrees together to form a larger tree.
An array \var{loff} of dimensions contains for each
level of the tree the horizontal offset between the
left end of the leftmost node at the current level and the
left end of the leftmost node at the next level.
The horizontal offset between the root
and the leftmost node of the whole tree is held in \var{lmoff}, and
the horizontal offset between the root and the leftmost node at
the bottom level of the tree is held in \var{lboff}.
Finally, \var{ltop} holds the distance between the reference point
of the tree and the leftmost end of the root.
We use
\var{roff}, \var{rmoff}, \var{rboff}, and \var{rtop}
as the corresponding variables with ``left'' replaced by ``right.'' In addition,
\var{height} holds the height of the tree, and \var{type} holds the
geometric shape of the root of the tree. \fig{TeXtree} shows an example
\TeX{}tree, that is, a tree drawing and the corresponding additional information.
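
For the tree of \fig{TeXtree}, the summary values can be recovered from
the arrays \var{loff} and \var{roff} by accumulating the offsets level by
level. This is our reading of the figure, not a description of the macros
themselves:
\begin{eqnarray*}
\var{lboff}&=&-10+10+10=10\,{\rm pt},\\
\var{rboff}&=&15+5-10=10\,{\rm pt},\\
\var{lmoff}&=&\min(-10,\;-10+10,\;-10+10+10)=-10\,{\rm pt},\\
\var{rmoff}&=&\max(15,\;15+5,\;15+5-10)=20\,{\rm pt}.
\end{eqnarray*}
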
\begin{Figure}
\centering
\begin{Tree}
\e\ir\ir\e
\node{\type{dot}\rightonly\rght{\unskip\vrule height.8pt width5pt depth0pt}}%
\i % A
\end{Tree}
\leavevmode
\stack{-10pt}{\vd@st}{%
-10pt\\10pt\\10pt\\\var{loff}}%
\hspace{1em}%
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}%
\hspace{1em}%
\stack{-10pt}{\vd@st}{%
15pt\\5pt\\-10pt\\\var{roff}}%
\vskip\baselineskip\raggedright
height:~3, type:~dot, ltop:~2pt, rtop:~2pt, lmoff:~-10pt, rmoff:~20pt, lboff:~10pt,
rboff:~10pt.
\caption{A \TeX{}tree consists of the drawing of the tree and the additional
information. The width of the dots is 4pt, the minimal separation between
adjacent nodes is 16pt, making for a distance of 20pt center to center.
The length of the small rule labeling
one of the nodes is 5pt. The column left (right)
of the tree drawing is the array \var{loff} (\var{roff}),
describing the left (right) contour of the tree. At each level,
the dimension given is the horizontal
offset between the border at the current and
at the next level. The offset between
the left border of the root node and the leftmost node at level~1 is -10pt,
the offset between the right border of the root node and the rightmost node at
level~1 is 15pt, etc.}
\label{TeXtree}
\end{Figure}
Given two \TeX{}trees \var{A} and \var{B},
how can a new \TeX{}tree \var{C} be built that
consists of a new root and has \var{A} and \var{B} as subtrees?
An example is given in \fig{AddInfo}.
First we determine which tree is higher; this is
\var{B} in the example.
Then we have to compute the minimal distance
between the roots of \var{A} and \var{B}, such that at all levels
of the trees there is free space of at least \var{minsep} between
the trees when they are drawn side by side.
For this purpose we keep track of two values, \var{totsep} and
\var{currsep}. The variable \var{totsep} holds the total distance
between the roots, and \var{currsep} holds the distance
between the rightmost node of \var{A} and the leftmost node
of \var{B} at the current level. To calculate
\var{totsep} and \var{currsep}, we start at level 0 and
visit each level of the trees until we reach the bottommost level
of the smaller tree; this is \var{A} in our example.
\begin{Figure}
\centering
\begin{Tree}
\e\ir\ir\e
\node{\type{dot}\rightonly\rght{\unskip\vrule height.8pt width5pt depth0pt}}%
\i % A
\end{Tree}
\leavevmode
A: \stack{-10pt}{\vd@st}{%
-10pt\\10pt\\10pt\\\ \\\var{loff}(\var{A})}%
\hspace{1em}%
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}%
\hspace{1em}%
\stack{-10pt}{\vd@st}{%
15pt\\5pt\\-10pt\\\ \\\var{roff}(\var{A})}%
\qquad
\begin{Tree}
\e\il\e\i\il\il\ir % B
\end{Tree}
\leavevmode
B: \stack{-10pt}{\vd@st}{%
10pt\\-10pt\\-10pt\\-10pt\\-10pt\\\ \\\var{loff}(\var{B})}%
\hspace{1em}%
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}%
\hspace{1em}%
\stack{-10pt}{\vd@st}{%
10pt\\-10pt\\-10pt\\10pt\\-30pt\\\ \\\var{roff}(\var{B})}%
\\[\figspace]
\begin{Tree}
\e\ir\ir\e
\node{\type{dot}\rightonly\rght{\unskip\vrule height.8pt width5pt depth0pt}}%
\i % A
\e\il\e\i\il\il\ir % B
\i % C
\end{Tree}
\leavevmode
C: \stack{-10pt}{\vd@st}{%
-20pt\\-10pt\\%
\makebox[0pt][r]{\var{loff}(\var{A})$\smash{\left\{\vrule height\vd@st
depth\vd@st width0pt\right.}$ }%
10pt\\10pt\\%
\makebox[0pt][r]{$\longrightarrow$ }%
10pt\\%
\makebox[0pt][r]{\raisebox{-.5\vd@st}{\var{loff}(\var{B})$\smash
{\left\{\vrule height.5\vd@st
depth.5\vd@st width0pt\right.}$ }}%
\makebox[0pt][r]{-}10pt\\\ \\\var{loff}(\var{C})}%
\hspace{1em}%
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}%
\hspace{1em}%
\stack{-10pt}{\vd@st}{%
20pt\\10pt\\-10pt\\-10pt%
\makebox[0pt][l]{\raisebox{-.5\vd@st}{
$\smash{\left\}\vrule height2.5\vd@st
depth2.5\vd@st width0pt\right.}$\var{roff}(\var{B})}}%
\\10pt\\-30pt\\\ \\\var{roff}(\var{C})}%
\vspace{\figspace}
\centering
\begin{tabular}{|l|r|r|r|}
\hline
&\multic{1}{c|}{\var{A}}&\multic{1}{c|}{\var{B}}&\multic{1}{c|}{\var{C}}\\
\hline
height&\multic{1}{c|}{3}& \multic{1}{c|}{5}& \multic{1}{c|}{6}\\
type& \multic{1}{c|}{dot}&\multic{1}{c|}{dot}&\multic{1}{c|}{dot}\\
ltop& 2pt& 2pt& 2pt\\
rtop& 2pt& 2pt& 2pt\\
lmoff& -10pt& -30pt& -30pt\\
rmoff& 20pt& 10pt& 30pt\\
lboff& 10pt& -30pt& -10pt\\
rboff& 10pt& -30pt& -10pt\\
\hline
\end{tabular}\qquad
\begin{tabular}{|c|r|r|}
\hline
\multic{1}{|c|}{level}&\multic{1}{c|}{\var{totsep}}&
\multic{1}{c|}{\var{currsep}}\\
\hline
0&20pt&0/16pt\\
1&25pt&11/16pt\\
2&40pt&1/16pt\\
3&40pt&16pt\\
\hline
\end{tabular}
\caption{The \TeX{}trees \var{A} and~\var{B} are combined to form the
larger \TeX{}\-tree~\var{C}. The first table gives the additional
information of the three \TeX{}trees,
and the second table gives the
history of the computation for \var{totsep} and \var{currsep}.}
\label{AddInfo}
\end{Figure}
At level 0, the distance between the roots of \var{A} and \var{B}
should be at least \var{minsep}. Therefore, we set
$\var{totsep}:=\var{minsep} + \var{rtop}(\var{A})
+ \var{ltop}(\var{B})$ and $\var{currsep}:=\var{minsep}$.
Using $\var{roff}(\var{A})$ and $\var{loff}(\var{B})$, we can
calculate \var{currsep} for the next level.
If $\var{currsep} < \var{minsep}$, we have to increase \var{totsep} by
the difference and update \var{currsep}. This process is
repeated until we reach the lowest level of \var{A},
at which point \var{totsep} holds the final distance between the
nodes of \var{A} and \var{B}, as calculated by the RT~algorithm.
If the root of \var{C} is a significant node, then the additional space,
which is 0pt by default, is added to \var{totsep}.
However, the approach of synthesizing
drawings from simple graphics characters allows only a finite
number of orientations for the tree edges; therefore, \var{totsep}
must be increased slightly to fit the next orientation
available.
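The level-by-level computation can be traced with the values of
\fig{AddInfo}. The following sketch assumes that moving down one level
changes \var{currsep} by the next component of $\var{loff}(\var{B})$ minus
the next component of $\var{roff}(\var{A})$; this update rule is an
assumption inferred from the second table of the figure and is not stated
explicitly above. With $\var{minsep}=16$pt and
$\var{rtop}(\var{A})=\var{ltop}(\var{B})=2$pt, we obtain
\begin{eqnarray*}
\mbox{level 0:}&&\var{totsep}=16+2+2=20,\quad\var{currsep}=16;\\
\mbox{level 1:}&&\var{currsep}=16+10-15=11<16,\quad
\var{totsep}=20+5=25,\quad\var{currsep}=16;\\
\mbox{level 2:}&&\var{currsep}=16+(-10)-5=1<16,\quad
\var{totsep}=25+15=40,\quad\var{currsep}=16;\\
\mbox{level 3:}&&\var{currsep}=16+(-10)-(-10)=16,\quad
\var{totsep}=40
\end{eqnarray*}
(all values in points), which reproduces the \var{totsep} column of the
table.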
Now we are ready to build the box of \TeX{}tree~\var{C}.
Simply put \var{A} and~\var{B} side by side, with the reference
points \var{totsep}~units apart, insert a new node
above them, and connect the parent and children by edges.
Next, we compute the additional information
for \var{C}. This can be done by using the additional information
for \var{A} and~\var{B}.
Note that most components of $\var{roff}(\var{C})$ and
$\var{loff}(\var{C})$ are the same as in the higher tree, which
is \var{B} in our case.
So, if we can avoid moving this information around,
the number of counters we have to access to update the additional information
for \var{C} is within a small constant of the height of~\var{A}.
Hence, we can apply the same argument as
in~\cite{TidierTrees}, which gives
us a running time of $\O(N)$ for drawing a tree with $N$ nodes.
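As an illustration, the entries for \var{C} in the first table of
\fig{AddInfo} can be recomputed from those for \var{A} and \var{B}.
The sketch assumes that the roots of \var{A} and \var{B} lie
symmetrically at $-\var{totsep}/2=-20$pt and $+\var{totsep}/2=+20$pt
relative to the root of \var{C}, that \var{lmoff} and \var{rmoff} denote
the offsets of the leftmost and rightmost points of a tree from its root,
and that \var{lboff} and \var{rboff} denote the offsets of the two ends of
its bottom level; these readings are assumptions that are consistent with
the table:
\begin{eqnarray*}
\var{lmoff}(\var{C})&=&\min(-20+\var{lmoff}(\var{A}),\,20+\var{lmoff}(\var{B}))
=\min(-30,-10)=-30,\\
\var{rmoff}(\var{C})&=&\max(-20+\var{rmoff}(\var{A}),\,20+\var{rmoff}(\var{B}))
=\max(0,30)=30,\\
\var{lboff}(\var{C})&=&20+\var{lboff}(\var{B})=20-30=-10,\\
\var{rboff}(\var{C})&=&20+\var{rboff}(\var{B})=20-30=-10
\end{eqnarray*}
(all values in points). The bottom-level entries come from \var{B} alone
because \var{B} is the higher subtree.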
We must design the allocation of storage registers for
the additional information of \TeX{}trees carefully to fulfill the
following requirement. If a new tree is built from
two subtrees, the additional information of the new tree
shares storage with its larger subtree.
Organizational overhead, that is,
pointers that keep track of the locations of different parts of additional
information, must be avoided.
This means that the additional information
for one \TeX{}tree should be stored in a sequence
of consecutive dimension registers
such that only one pointer for access to the first element
in this sequence is needed. On the other hand, each parent
tree is higher and, therefore, needs more storage than its subtrees.
So we must ensure that there is always enough space in the sequence
for more information.
The obvious way to fulfill these requirements is to use a stack and to
allow only the topmost \TeX{}trees of this stack to be
combined into a larger tree at any time.
This leads to the following allocation of registers: A contiguous sequence of
box registers contains the treeboxes of the subtrees in the stack. A
contiguous sequence of token registers contains the type information for the
nodes of the subtrees in the stack. For each subtree in the stack,
a contiguous sequence of dimension registers contains the contour
information of the subtree. The ordering of these groups of dimension
registers reflects the ordering of the subtrees in the
stack. Finally, a contiguous sequence of counter registers contains
the height and the address of the first dimension register for
each subtree in the stack. Four address counters store the addresses
of the last treebox, type information, height, and address of contour
information. A sketch of the register organization for a stack of \TeX{}trees
is provided in \fig{Registers}.
\begin{Figure}
Dimension registers\\
\var{lmoff}(1) \var{rmoff}(1) \var{lboff}(1) \var{rboff}(1) \var{ltop}(1)
\var{rtop}(1)\\
\var{loff}($h_1$) \var{roff}($h_1$) \dots\ \var{loff}(1) \var{roff}(1)\\
\dots\\
\var{lmoff}($n$) \var{rmoff}($n$) \var{lboff}($n$) \var{rboff}($n$)
\var{ltop}($n$) \var{rtop}($n$)\\
\var{loff}($h_n$) \var{roff}($h_n$) \dots\ \var{loff}(1) \var{roff}(1)\\
\mbox{}\\
Counter registers\\
\var{lasttreebox} \var{lasttreeheight} \var{lasttreeinfo} \var{lasttreetype}\\
\var{treeheight}(1) \var{diminfo}(1) \dots\ \var{treeheight}($n$)
\var{diminfo}($n$)\\
\mbox{}\\
Box registers\\
\var{treebox}(1) \dots\ \var{treebox}($n$)\\
\mbox{}\\
Token registers\\
\var{type}(1) \dots\ \var{type}($n$)
\caption{\var{lasttreebox}, \var{lasttreeheight}, \var{lasttreeinfo}, and
\var{lasttreetype} contain pointers to \var{treebox}($n$),
\var{treeheight}($n$), \var{lmoff}($n$), and \var{type}($n$), respectively;
\var{diminfo}($i$) contains a pointer to
\var{lmoff}($i$). Unused dimension registers are
allowed between the dimension registers of subsequent trees. The counter
registers \var{lasttreebox},\ldots,\var{diminfo}($n$) serve as a directory
mechanism to access the \TeX{}trees on the stack.}
\label{Registers}
\end{Figure}
When a new node is pushed onto the stack, the treebox, type information,
height, address of contour information, and contour information are
stored in the next free registers of the appropriate type, and the
four address counters are updated accordingly.
When a new tree is formed from the topmost subtrees on the stack,
the treebox, type information, height, and address of contour information
of the new tree are stored in the registers formerly used by the bottommost
subtree involved in the construction step,
and the four address registers are
updated accordingly. This means that this information for the subtrees
is no longer accessible. The contour information of the new subtree
is stored in the same registers as the contour information of the larger
subtree used in the construction, apart from the left and right offset
of the root to the left and right child, which are stored in the
following dimension registers. This means that gaps can occur
between the contour information of subtrees in the
stack, namely when the right subtree, which is in a higher position in the
stack, is higher than the left one. To avoid these
gaps, the user can specify an option \verb.\lefttop. when entering a
binary node, which makes the topmost tree in the stack the
left subtree of the node.
This stack concept also has consequences for the design of the user interface,
which is discussed in Section~\ref{Interface}.
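A minimal sketch of how \verb.\lefttop. might be used follows. It assumes
that \verb.\lefttop. is given inside \verb.\node. like the other node
options, which is not confirmed above. The higher right subtree is
described first, so that it lies lower in the stack, and the parent is
entered with \verb.\lefttop., so that the leaf entered last becomes its
left child:
\begin{verbatim}
\begin{Tree}
\node{\external\cntr{3}}
\node{\external\cntr{5}}
\node{\cntr{4}}           % right subtree, height 1, described first
\node{\external\cntr{1}}  % left subtree, height 0, topmost on the stack
\node{\lefttop\cntr{2}}   % topmost tree becomes the left child
\end{Tree}
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}
\end{verbatim}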
\section{Space cost analysis}
Suppose we want to draw a unary-binary tree $T$ of height $h$ having
$N$ nodes\footnote{The height $h$ and the number of nodes $N$ refer to the
drawing of the tree. $N$ is the number of circles, squares,~etc., actually
drawn, and $h$ is the number of levels in the drawing minus 1.}.
According to our internal representation,
for each subtree in the stack we need:
\begin{enumerate}
\item one box register to store the box of the \TeX{}tree;
\item one token register to store the type of the root of the subtree;
\item $2h^\prime+6$ dimension registers to store the additional
information, where $h^\prime$ is the height of the
subtree; and
\item three counter registers to store the register numbers of the
box register, the token register, and the first dimension register above.
\end{enumerate}
\begin{lemma}
Let $T$ be a unary-binary tree of height~$h$ and size~$N$; then:
\begin{enumerate}
\item at any time, there are at most $h+1$ subtrees of $T$ on the
stack; and
\item for each set $\T$ of subtrees of $T$ that are on the stack
simultaneously, we have
$$\sum_{T^\prime\in \T}({\rm ht}(T^\prime)+1) \le N.$$
\end{enumerate}
\end{lemma}
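To illustrate the lemma, consider the tree of Figure~\ref{firstex}, which
has $N=8$ nodes and height $h=3$. The trace below assumes that an
\verb.\external. node pushes a new tree onto the stack, that a node with
\verb.\leftonly. replaces the topmost tree, and that any other node
replaces the two topmost trees; trees are named by the center labels of
their roots.
\begin{center}
\begin{tabular}{|l|l|c|}
\hline
node entered&stack (bottom to top)&$\sum({\rm ht}+1)$\\
\hline
1 (external)&1&1\\
3 (external)&1, 3&2\\
2&2&2\\
6 (external)&2, 6&3\\
8 (external)&2, 6, 8&4\\
7&2, 7&4\\
5 (leftonly)&2, 5&5\\
4&4&4\\
\hline
\end{tabular}
\end{center}
At most three trees are on the stack at any time, which is within the
bound $h+1=4$, and the sum never exceeds $N=8$.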
The lemma implies that our implementation
uses at most $9h+2N$~registers.
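The bound can be derived as follows; the sketch counts only the registers
listed per subtree above. A subtree of height $h^\prime$ occupies
$1+1+(2h^\prime+6)+3=2(h^\prime+1)+9$ registers, so for a set $\T$ of
subtrees that are on the stack simultaneously the lemma gives
$$\sum_{T^\prime\in\T}\bigl(2({\rm ht}(T^\prime)+1)+9\bigr)
\le 2N+9\,|\T|\le 2N+9(h+1),$$
which is $9h+2N$ up to an additive constant.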
To compare this with the
$10N$ registers used in the straightforward implementation,
an estimate of the average height of a tree with $N$ nodes is
needed. Several results, depending on the type of trees and on the
randomization model, are cited in \fig{Stat}, which
compares the number of registers used in a straightforward
implementation with the average number of registers used in our
implementation. The table shows that our implementation needs fewer
registers on average once the tree has about 20~nodes, and that the
advantage grows with $N$.
\begin{Figure}
\centering
\begin{tabular}{|c|c|c|c|c|}
\hline
&registers&\multicolumn{3}{c|}{average registers}\\
\cline{3-5}
nodes&(straight-&&unary-binary&binary\\
&forward)&binary trees&trees&search trees\\
$N$&&($2\sqrt{\pi N}$)~\cite{BinaryTrees}&
($\sqrt{3\pi N}$)~\cite{BinaryTrees}&
($4.311\log N$)~\cite{BinarySearchTrees}\\
\hline
\ds10 & \ds100 & 120.89 & 107.37 & 109.34 \\
\ds20 & \ds200 & 182.68 & 163.56 & 156.23 \\
\ds30 & \ds300 & 234.75 & 211.33 & 191.96 \\
\ds40 & \ds400 & 281.78 & 254.75 & 223.12 \\
\ds50 & \ds500 & 325.60 & 295.37 & 251.78 \\
\ds60 & \ds600 & 367.13 & 334.02 & 278.86 \\
\ds70 & \ds700 & 406.93 & 371.17 & 304.84 \\
\ds80 & \ds800 & 445.36 & 407.13 & 330.02 \\
\ds90 & \ds900 & 482.67 & 442.12 & 354.59 \\
100 & 1000 & 519.04 & 476.30 & 378.68 \\
\hline
\end{tabular}
\caption{The numbers of registers used by a straightforward implementation
(second column) and by our modified implementation (third to fifth columns)
of the RT~algorithm are
given for different types of trees and randomization models.
The formulas in parentheses indicate the average height of the respective
classes of trees.}
\label{Stat}
\end{Figure}
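As a check on the first row of the table in \fig{Stat}, take $N=10$ and
insert the average heights into the bound $9h+2N$; here $\log$ is read as
the natural logarithm, an assumption that reproduces the tabulated values:
\begin{eqnarray*}
9\cdot 2\sqrt{10\pi}+20&\approx&9\cdot 11.210+20\approx 120.89,\\
9\cdot\sqrt{30\pi}+20&\approx&9\cdot 9.7081+20\approx 107.37,\\
9\cdot 4.311\ln 10+20&\approx&9\cdot 9.9264+20\approx 109.34.
\end{eqnarray*}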
\section{The user interface}\label{Interface}
The user interface of \TreeTeX{} has been designed in the spirit of
the thorough separation of the logical description of document components
and their layout; see~\cite{DocumentFormatting,GML}. This concept
ensures both uniformity and flexibility of document layout and frees
authors from layout problems that have nothing to do with the
substance of their work. For some powerful implementations and projects,
see~\cite{Tables,Karlsruhe,LaTeX,Grif,Scribe}.
The description of a tree consists of a description of its nodes
in postorder. Each description of a node, in turn, has to specify
the outdegree, the geometric shape, and the labels of the node.
Defaults are provided for all specifications,
thereby allowing the user to omit many definitions
if the defaults match what he or she wants.
A separate style command defines layout parameters for tree drawings
that are valid for all trees of a document.
Layout parameters include the font to be used for labels, the diameter
of circle nodes, the vertical distance between two subsequent levels
of the tree, and the minimal horizontal distance between nodes.
Standard versions of \TeX{} provide only a limited number of
font and circle sizes. Hence, the user of the style command must make
sure that the specified sizes can be realized. This is especially
cumbersome when everything has to be magnified for later reproduction
with reduction. But the style variables can be made parametric for
installations that provide scalable fonts and replace \LaTeX{}'s
circle- and line-drawing commands with routines that provide arbitrary
diameters and slopes.
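For instance, the style command that precedes the tree of
Figure~\ref{lastex} reads as follows; judging by their names, the four
parameters presumably set the label font, the node diameter, the vertical
distance between levels, and the minimal horizontal distance between
nodes:
\begin{verbatim}
\Treestyle{\treefonts{\small\it}\nodesize{16pt}\vdist{40pt}\minsep{16pt}}
\end{verbatim}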
Three examples of tree descriptions are given in
Figures~\ref{firstex}--\ref{lastex}.
A more detailed description of the user interface is
given in~\cite{Exeter}.
\section{Conclusions}
We hope that, by now, we have convinced the reader of the main advantages
of \TreeTeX{}: It integrates graphics and text; it is portable to all
sites running \TeX{};
and it is easy to use for the author, because it derives the drawing
of a tree from a purely structural description. But our decision to
implement \TreeTeX{} as a \TeX{} macro package also has some
drawbacks, both for the programmer and for the user of the system.
From the programmer's point of view, \TeX{}'s macro language is
a low-level programming language. Hence, maintaining and extending
the package is a more tedious task than it would be if we had used
a higher-level language with better support for modularization.
From the author's point of view, \TreeTeX{}'s limitations lie in
speed, size of trees, and graphical primitives.
Typesetting all the trees in this article takes about two~minutes on
a VAX~750, and typesetting a complete binary tree with 63~internal
and 64~external nodes takes about one~minute on the same machine.
The size of the trees is limited by three factors, namely,
the number of registers, the complexity of the nested boxes that
contain the drawing of a tree, and the limited number of slopes
that are available for the edges, the latter being the most severe
problem at present. Hence, the main area of application for
\TreeTeX{} is modest use such as in textbooks; displaying
large amounts of statistical data, for example, is out of the question.
Currently, edges and circular nodes are drawn from \LaTeX{}'s set of
predefined graphical characters. Hence, \TreeTeX{} cannot draw
arbitrarily wide trees or large circular nodes. We consider
this restriction, however, to be a temporary one, since a committee inside
the \TeX{} Users Group is working on standard graphic
extensions to \TeX{} that will remove these limitations.
As to further developments of \TreeTeX{}, it would be desirable to
draw larger classes of trees, for example multiway trees, and to allow
labels not only for nodes, but also for edges and whole subtrees.
\Treestyle{\vdist{60pt}}
\dummyhalfcenterdim@n=10pt
\begin{Figure}
\centering
\begin{Tree}
\node{\external\bnth{first}\cntr{1}\lft{Beeton}}
\node{\external\cntr{3}\rght{Kellermann}}
\node{\cntr{2}\lft{Carnes}}
\node{\external\cntr{6}\lft{Plass}}
\node{\external\bnth{last}\cntr{8}\rght{Tobin}}
\node{\cntr{7}\rght{Spivak}}
\node{\leftonly\cntr{5}\rght{Lamport}}
\node{\cntr{4}\rght{Knuth}}
\end{Tree}
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}\
\begin{verbatim}
\begin{Tree}
\node{\external\bnth{first}\cntr{1}\lft{Beeton}}
\node{\external\cntr{3}\rght{Kellermann}}
\node{\cntr{2}\lft{Carnes}}
\node{\external\cntr{6}\lft{Plass}}
\node{\external\bnth{last}\cntr{8}\rght{Tobin}}
\node{\cntr{7}\rght{Spivak}}
\node{\leftonly\cntr{5}\rght{Lamport}}
\node{\cntr{4}\rght{Knuth}}
\end{Tree}
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}
\end{verbatim}
\caption{This is an example of a tree that includes labels.}
\label{firstex}
\end{Figure}
\begin{Figure}
\centering
\begin{Tree}
\node{\external\type{frame}\bnth{first}\cntr{Beeton}}
\node{\external\type{frame}\cntr{Kellermann}}
\node{\type{frame}\cntr{Carnes}}
\node{\external\type{frame}\cntr{Plass}}
\node{\external\type{frame}\bnth{last}\cntr{Tobin}}
\node{\type{frame}\cntr{Spivak}}
\node{\leftonly\type{frame}\cntr{Lamport}}
\node{\type{frame}\cntr{Knuth}}
\end{Tree}
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}\
\begin{verbatim}
\begin{Tree}
\node{\external\type{frame}\bnth{first}\cntr{Beeton}}
\node{\external\type{frame}\cntr{Kellermann}}
\node{\type{frame}\cntr{Carnes}}
\node{\external\type{frame}\cntr{Plass}}
\node{\external\type{frame}\bnth{last}\cntr{Tobin}}
\node{\type{frame}\cntr{Spivak}}
\node{\leftonly\type{frame}\cntr{Lamport}}
\node{\type{frame}\cntr{Knuth}}
\end{Tree}
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}
\end{verbatim}
\caption{This is an example of a tree with framed center labels.}
\end{Figure}
\begin{Figure}
\Treestyle{\treefonts{\small\it}\nodesize{16pt}\vdist{40pt}\minsep{16pt}}
\centering
\begin{Tree}
\node{\external\bnth{first}\cntr{1}\lft{Beeton}}
\node{\external\cntr{3}\rght{Kellermann}}
\node{\cntr{2}\lft{Carnes}}
\node{\external\cntr{6}\lft{Plass}}
\node{\external\bnth{last}\cntr{8}\rght{Tobin}}
\node{\cntr{7}\rght{Spivak}}
\node{\leftonly\cntr{5}\rght{Lamport}}
\node{\cntr{4}\rght{Knuth}}
\end{Tree}
\hspace{\leftdist}\usebox{\TeXTree}\hspace{\rightdist}\
\caption{This tree was produced from the same logical description as in
Figure~\ref{firstex}, but with different style parameters.}
\label{lastex}
\end{Figure}
\clearpage
\bibliography{trees}
\end{document}