LaTeX/Collaborative Writing of LaTeX Documents


Note: Parts of this Wikibook (those about Subversion) are based on the article Tools for Collaborative Writing of Scientific LaTeX Documents by Arne Henningsen, published in The PracTeX Journal 2007, number 3 (http://www.tug.org/pracjourn/).

Abstract

Collaborative writing of documents requires close synchronisation among authors. This Wikibook describes several ways to organise the collaborative preparation of LaTeX documents.

  1. First, several methods are presented that are not based on a version control system.
  2. Then several LaTeX style files are discussed that are suited for collaboration.
  3. This is followed by a solution based on the version control system Subversion (http://subversion.apache.org/). The Wikibook describes how Subversion can be used together with several other software tools and LaTeX packages to organise the collaborative preparation of LaTeX documents.
  4. Another approach is to use Mercurial (https://www.mercurial-scm.org/) and Bitbucket (https://bitbucket.org/).

Other Methods (not version control based)

  • You can use one of the online solutions listed in the Installation chapter. Most of them have collaboration features.
  • Another option for collaboration is Dropbox. It offers 2 GB of free storage and a simple versioning system. It works like SVN but is more automated, and is therefore especially useful for beginning LaTeX users. However, Dropbox is not a true version control system, and as such it does not allow you to roll the article back to previous versions. The web-based editor LaTeX Base supports syncing to Dropbox.
  • You can use an online collaborative tool built on top of a version control system, such as Authorea or ShareLaTeX. Authorea performs most of the actions described in this document, but in the background (it is built on Git). It allows authors to enter LaTeX or Markdown via a GUI with mathematical notation, figures, d3.js plots, IPython notebooks, data, and tables. All content is rendered to HTML5. Authorea also features a commenting system and article-based chat to ease collaboration and review.
  • As the LaTeX system uses plain text, you can use synchronous collaborative editors like Gobby. In Gobby you can write your documents in collaboration with anyone in real time. It is strongly recommended that you use UTF-8 encoding (especially if users on multiple operating systems are collaborating) and a stable network (typically a wired network).
  • You can use an instance of EtherPad (the Wikimedia etherpad in this example). To compile, use the command:
    wget -O filename.tex "https://etherpad.wikimedia.org/ep/pad/export/xxxx/latest?format=txt" && (latex filename.tex)
    where 'xxxx' should be replaced by the pad number (something like 'z7rSrfrYcH').
  • With a dedicated Linux box running LaTeX and Dropbox, it is possible to use Google Docs and some scripting to get PDFs generated automatically on Dropbox from updates on Google Docs.
  • You can use a distributed version control system such as Fossil, Mercurial or Git. This is the definitive solution for users looking for control and advanced features like branching and merging. The learning curve will be steeper than that for a web-based solution.
  • Syncthing is another open-source alternative for synchronizing files across machines.

Visualizing diffs in LaTeX: latexdiff and changebar

The tools latexdiff and changebar can visualize the differences between two LaTeX files inside a generated document. This makes it easier to see the impact of certain changes or to discuss changes with people who are not accustomed to LaTeX. Changebar comes with a script chbar.sh which inserts a bar in the margin indicating parts that have changed. Latexdiff allows different styles of visualization. By default, discarded text is marked in red and added text is marked in blue. It also supports a mode similar to Changebar which adds a bar in the margin. Latexdiff comes with a script latexrevise which can be used to accept or decline changes. It also has a wrapper script to support version control systems such as Subversion, which is discussed below.

An example of how to use latexdiff in the terminal:

   latexdiff old.tex new.tex > diff.tex               # Files old.tex and new.tex are compared and the file visualizing the changes is written to diff.tex
   pdflatex diff.tex                                  # Create a PDF showing the changes

If you use Mercurial, the syntax is as follows:

   latexdiff-vc --hg test.tex -r revnumber

where revnumber is the (local) revnumber of the relevant changeset.

Important advice:

Sometimes latexdiff runs into problems (the resulting LaTeX code cannot be compiled). This can happen if there are major changes to mathematical equations. To deal with this difficulty, latexdiff offers the option --math-markup:

  1. --math-markup=3: very sensitive to changes in mathematical equations.
  2. --math-markup=0: changes in mathematical equations are ignored.

See the documentation for details.
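As a minimal sketch of the fallback described above, the following script (meant to be saved to a file and run with sh) builds two tiny versions of a document and runs latexdiff with equation markup switched off. The directory latexdiff-demo and the file contents are made up for illustration, and the script assumes latexdiff is on the PATH.

```shell
# Stop early if latexdiff is not installed (assumption: it is on the PATH).
command -v latexdiff >/dev/null 2>&1 || { echo "latexdiff not found"; exit 0; }

# Scratch directory with a hypothetical name; removed first so reruns work.
rm -rf latexdiff-demo && mkdir latexdiff-demo

# Two versions that differ both in the text and inside the equation.
cat > latexdiff-demo/old.tex <<'EOF'
\documentclass{article}
\begin{document}
The integral is zero.
\begin{equation}
\int f \, dx = 0
\end{equation}
\end{document}
EOF

cat > latexdiff-demo/new.tex <<'EOF'
\documentclass{article}
\begin{document}
The integral is positive.
\begin{equation}
\int_a^b f \, dx > 0
\end{equation}
\end{document}
EOF

# With --math-markup=0 equations are left unmarked, so only the change
# in the running text is highlighted in diff.tex.
latexdiff --math-markup=0 latexdiff-demo/old.tex latexdiff-demo/new.tex \
  > latexdiff-demo/diff.tex
```

If diff.tex still fails to compile, the problem probably lies outside the equations, and the latexdiff documentation lists further options worth trying.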

The program DiffPDF can be used to compare two existing PDFs visually. There is also a command line tool comparepdf based on DiffPDF.

Useful LaTeX styles

Subfiles

In general, a modular approach is recommended for collaboration, since it minimizes the danger of conflicting edits. See https://en.wikibooks.org/w/index.php?title=LaTeX/Modular_Documents&stable=0 for details. The subfiles package is recommended, but it is not the only solution.

Todonotes

The todonotes package (todonotes.sty) allows you to insert todo items, either in the margin or inline, and adds a (hyperref-supported) list of todos at the beginning of the document. What makes this package special (compared, say, to fixme.sty) is that the margin todo items point with a line to the text in question. This feature is familiar from LibreOffice, MS Office, etc. See the manual for details: https://ctan.org/pkg/todonotes?lang=en.

Here is an example:

\documentclass{article}
\usepackage[colorinlistoftodos, textwidth=3cm, shadow]{todonotes}
\newcounter{ubcomment}
\newcommand{\ubcomment}[2][]{%
\refstepcounter{ubcomment}%
{%
\todo[linecolor=black,backgroundcolor={green!40!},size=\footnotesize]{%
\textbf{Fixme: UB [\uppercase{#1}\theubcomment]:}~#2}%
}}
\newcommand{\ubcommentinline}[2][]{%
\refstepcounter{ubcomment}%
{%
\todo[linecolor=black,inline,backgroundcolor={green!40!},size=\footnotesize]{%
\textbf{Fixme: UB [\uppercase{#1}\theubcomment]:}~#2}%
}}

\newcommand{\ubcommentmultiline}[2]{%
\refstepcounter{ubcomment}%
{%
\todo[linecolor=black,inline,caption={\textbf{{Fixme: UB}
    [\theubcomment] #1}} ,backgroundcolor={green!40!},size=\footnotesize]{%
\textbf{Fixme: UB [\theubcomment]:}~#2}%
}}

% add support for todo in equations
\usepackage{marginnote}
\makeatletter
\renewcommand{\@todonotes@drawMarginNoteWithLine}{%
\begin{tikzpicture}[remember picture, overlay, baseline=-0.75ex]%
    \node [coordinate] (inText) {};%
\end{tikzpicture}%
\marginnote[{% Draw note in left margin
    \@todonotes@drawMarginNote%
    \@todonotes@drawLineToLeftMargin%
}]{% Draw note in right margin
    \@todonotes@drawMarginNote%
    \@todonotes@drawLineToRightMargin%
}%
}
\makeatother

\begin{document}

\listoftodos

This is one example \ubcomment{Comment 1}

Now more text and an inline comment: \ubcommentinline{This is an
  inline comment}.

Now even more text and a comment with an enumerate list.
\ubcommentmultiline{Third comment}{
this is not true because
\begin{enumerate}
  \item Reason 1
  \item Reason 2
\end{enumerate}
}

Finally a comment inside a math environment.
\begin{equation}
\label{eq:todo-example:1}
\int f dx =0 \ubcomment{are you sure this integral is zero???}
\end{equation}
\end{document}

Please note: you need to use pdflatex (xelatex also works) and run it several times, that is, three or four times.

rcsinfo (and rcs-multi)

The package rcsinfo (https://ctan.org/pkg/rcsinfo?lang=en), and similarly rcs-multi (https://ctan.org/pkg/rcs-multi), automatically inserts the current version, date, and owner of a file that is under the control of a version control system compatible with the RCS syntax, such as CVS, Subversion, and Mercurial (see below for how to set up Mercurial). Here is an example:

\documentclass[12pt]{article}
\usepackage[scrpage2]{rcsinfo}
\makeatletter \def\@rcsInfoFancyInfo{{\footnotesize%
      \emph{ \fcolorbox{black}{green}{Rev: \rcsInfoRevision,}
        \fcolorbox{black}{yellow}{\rcsInfoOwner,} \rcsInfoLongDate,
        \rcsInfoTime}}} \makeatother
\rcsInfo $Id: main.tex,v [Hg:291] 2018/08/08 16:36:51 oub Exp oub $
\begin{document}
This  is a test.
\end{document}

Version control systems

Independent of collaboration, version control systems are useful even for single-author documents, since they make it possible to keep track of changes and to restore older versions if necessary. The oldest version control system still in use is RCS, but it is oriented towards single files and is not based on a server model. CVS is based on RCS, but handles multiple files and also includes a server model. CVS was superseded, to a certain extent, by Subversion (see below). A different approach is taken by the so-called decentralized version control systems, the most popular of which are git and mercurial (see below).

Requirements for collaboration using a version control system

Multi user

The system should allow several users to have read/write access to the files, on some sort of «server».

Versions should be saved

The point is not to overwrite a file (or several files) with a newer version; instead, the system should keep the different versions in some form. The reasons are:

  1. One can compare changes by running an appropriate diff program, for example latexdiff, on different versions of the file.
  2. Older versions can be restored if necessary.

Non-sequential contributions should be managed

Sequential collaboration is collaboration in which one user works while the others do nothing; only after the others have obtained that user's changes does the next user start.

Let us consider a collaboration that is not sequential. (To make the point, the example uses a single file. In principle the same problem occurs for different files in a directory, but it is less intuitive.)

  1. User1 wants to make changes in section 1 of file1 and needs 2 weeks for this modification. The result is file1-modified-by-user1.
  2. Meanwhile, User2 modifies section 2 of file1, which results in file1-modified-by-user2.
  3. And User3 modifies section 3 of file1, which results in file1-modified-by-user3.

A version control system must be able to «merge» these three changes without problems.

Conflicting editing should be detected

Conflicting editing is, by definition, editing by different users that affects the same line of a file. The system should detect this, warn the users, and provide means to resolve it.
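The merging and conflict-detection requirements above can be tried out without any version control system, using the diff3 tool from GNU diffutils, which performs the same kind of three-way merge. The sketch below is only an illustration: the directory merge-demo, the file names, and the fourth author (user4, who edits the same line as user1) are made up.

```shell
# Scratch directory with a hypothetical name; removed first so reruns work.
d=merge-demo
rm -rf "$d" && mkdir "$d"

# Common starting version: three "sections" separated by unchanged lines.
cat > "$d/base.tex" <<'EOF'
Section 1: old text
unchanged line A
unchanged line B
Section 2: old text
unchanged line C
unchanged line D
Section 3: old text
EOF

# Each author edits a private copy of the same starting version.
sed 's/^Section 1: old text$/Section 1: revised by user1/' "$d/base.tex" > "$d/user1.tex"
sed 's/^Section 2: old text$/Section 2: revised by user2/' "$d/base.tex" > "$d/user2.tex"
sed 's/^Section 3: old text$/Section 3: revised by user3/' "$d/base.tex" > "$d/user3.tex"
sed 's/^Section 1: old text$/Section 1: revised by user4/' "$d/base.tex" > "$d/user4.tex"

# Non-conflicting edits: merge user1 and user2 first, then add user3.
# diff3 -m takes the arguments MINE, COMMON ANCESTOR, YOURS.
diff3 -m "$d/user1.tex" "$d/base.tex" "$d/user2.tex" > "$d/step1.tex"
diff3 -m "$d/step1.tex" "$d/base.tex" "$d/user3.tex" > "$d/merged.tex"
cat "$d/merged.tex"

# Conflicting edits: user1 and user4 changed the same line, so diff3
# exits with a non-zero status and writes conflict markers instead.
if diff3 -m "$d/user1.tex" "$d/base.tex" "$d/user4.tex" > "$d/conflict.tex"; then
  echo "merged without conflict"
else
  echo "conflict detected; see the markers in $d/conflict.tex"
fi
```

The file merged.tex contains all three revisions, while conflict.tex shows both competing versions of the first line between <<<<<<< and >>>>>>> markers, which is essentially what a version control system presents to the user who has to resolve the conflict.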

Interchanging Documents using a version control system

The collaborative preparation of documents requires a considerable amount of coordination among the authors. This coordination can be organised in many different ways, where the best way depends on the specific circumstances.

There are many ways to interchange documents among authors. One possibility is to compose documents by exchanging e-mail messages. This method has the advantage that authors generally do not have to install and learn any extra software, because virtually all authors have an e-mail account. Furthermore, the author who has modified the document can easily attach the document and explain the changes in the same e-mail. Unfortunately, there is a problem when two or more authors work on the same document at the same time: how can they synchronise their files? Besides this difficulty, the e-mail-based method easily becomes cumbersome if several files are involved. Another method is to use a distributed or server-based version control system; the basic requirements for such an approach were outlined in the previous section.

A second possibility is to provide the document on a common file server, which is available in most departments. The risk of overwriting each other's modifications can be eliminated by locking files that are currently being edited. However, the file server can generally only be accessed from within the department. Hence, authors who are out of the building cannot use this method to update or commit their changes, and they have to find another way to access the files.

A third possibility is to use a version control system. A comprehensive list of version control systems can be found at Wikipedia. Version control systems keep track of all changes to the files in a project. If several authors modify a document at the same time, the version control system tries to merge all modifications automatically. However, if multiple authors have modified the same line, the modifications cannot be merged automatically, and the user has to resolve the conflict by deciding manually which of the changes should be kept. Authors can also comment on their modifications so that the co-authors can easily follow the history of the file. As version control systems generally communicate over the Internet (e.g. through TCP/IP connections), they can be used from any computer with an Internet connection. A restrictive firewall policy might prevent the version control system from connecting to the Internet. In this case, the network administrator has to be asked to open the appropriate port. The Internet is only used for synchronising the files. Hence, a permanent Internet connection is not required. The only drawback of a version control system is that it has to be installed and configured.

Moreover, a version control system is useful even if a single user is working on a project. First, the user can track (and possibly revoke) all previous modifications. Second, this is a convenient way to have a backup of the files on other computers (e.g. on the version control server). Third, this allows the user to easily switch between different computers (e.g. office, laptop, home).

The Version Control System Subversion

Subversion (SVN) comes as a successor to the popular version control system CVS. SVN operates on a client-server model in which a central server hosts a project repository that users copy and modify locally. A repository functions similarly to a library in that it permits users to check out the current project, make changes, and then check it back in. The server records all changes a user checks in (usually with a message summarizing what changes the user made) so that other users can easily apply those changes to their own local files.

Each user has a local working copy of a remote repository. For instance, users can update changes from the repository to their working copy, commit changes from their own working copy to the repository, or (re)view the differences between working copy and repository.
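The update, diff, and commit cycle described above might look as follows on the command line. This is a sketch under stated assumptions: a throw-away repository created with svnadmin and reached through a file:// URL stands in for the real server, and the directory svn-demo, the file paper.tex, and the two working copies named alice and bob are made up. It is meant to be saved to a file and run with sh.

```shell
# Needs the Subversion command-line tools; stop early if they are missing.
command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1 \
  || { echo "Subversion tools not found"; exit 0; }

# A local repository stands in for the central server (hypothetical paths).
rm -rf svn-demo && mkdir svn-demo
svnadmin create svn-demo/repo
repo_url="file://$(pwd)/svn-demo/repo"

# Two authors each check out their own working copy.
svn checkout -q "$repo_url" svn-demo/alice
svn checkout -q "$repo_url" svn-demo/bob

# The first author adds a file and commits it with a log message.
printf 'First sentence.\n' > svn-demo/alice/paper.tex
svn add -q svn-demo/alice/paper.tex
svn commit -q -m "Add first draft of paper.tex" svn-demo/alice

# The second author updates, edits, reviews the change, and commits.
svn update -q svn-demo/bob
printf 'Second sentence.\n' >> svn-demo/bob/paper.tex
svn diff svn-demo/bob
svn commit -q -m "Add second sentence" svn-demo/bob

# The first author picks up the change and inspects the history.
svn update -q svn-demo/alice
svn log svn-demo/alice
```

With a real server only the repository URL changes (for example an http:// or svn:// address supplied by the administrator); the client commands stay the same.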

To set up an SVN version control system, the SVN server software has to be installed on a (single) computer with permanent Internet access. (If this computer has no static IP address, one can use a service like DynDNS to be able to access the server with a static hostname.) It can run on many Unix, modern MS Windows, and Mac OS X platforms.

Users do not have to install the SVN server software, only SVN "client" software. A client is the only way to access the repositories on the server. Besides the basic SVN command-line client, there are several graphical user interface tools (GUIs) and plug-ins for accessing the SVN server (see http://subversion.tigris.org/links.html). Additionally, there are very good manuals about SVN freely available on the Internet (e.g. http://svnbook.red-bean.com).

At our department, we run the SVN server on a GNU/Linux system, because most Linux distributions include it. As a result, installing, configuring, and maintaining SVN is a very simple task.

Most MS Windows users access the SVN server with the TortoiseSVN client, because it provides the most familiar interface for common users. Linux users usually use the SVN utilities from the command line, or eSvn (a GUI frontend) together with KDiff3 for showing complex differences.

Hosting LaTeX files in Subversion

Figure 1: Common texmf tree shown in eSvn's Repository Browser

On our Subversion server, we have one repository for a common texmf tree. Its structure complies with the TeX Directory Structure guidelines (TDS, http://www.tug.org/tds/tds.html, see figure 1). This repository provides LaTeX classes, LaTeX styles, and BibTeX styles that are not available in the LaTeX distributions of the users, e.g. because they were bought or developed for the internal use at our department. All users have a working copy of this repository and have configured LaTeX to use this as their personal texmf tree. For instance, teTeX (http://www.tug.org/tetex/) users can edit their TeX configuration file (e.g. /etc/texmf/web2c/texmf.cnf) and set the variable TEXMFHOME to the path of the working copy of the common texmf tree (e.g. by TEXMFHOME = $HOME/texmf); MiKTeX (http://www.miktex.org/) users can add the path of the working copy of the common texmf tree in the 'Roots' tab of the MiKTeX Options.

If a new class or style file has been added (but not if these files have been modified), the users have to update their 'file name data base' (FNDB) before they can use these classes and styles. For instance, teTeX users have to execute texhash; MiKTeX users have to click on the button 'Refresh FNDB' in the 'General' tab of the MiKTeX Options.

Furthermore, the repository contains manuals explaining the specific LaTeX software solution at our department (e.g. this document).

The Subversion server hosts a separate repository for each project of our department. Although branching, merging, and tagging are less important for writing text documents than for writing source code for software, our repository layouts follow the recommendations of the 'Subversion book' (http://svnbook.red-bean.com). Accordingly, each repository has the three directories /trunk, /branches, and /tags.

The most important directory is /trunk. If a single text document belongs to the project, all files and subdirectories of this text document are in /trunk. If the project yields two or more different text documents, /trunk contains a subdirectory for each text document. A slightly different version (a branch) of a text document (e.g. for presentation at a conference) can be prepared either in an additional subdirectory of /trunk or in a new subdirectory of /branches. When a text document is submitted to a journal or a conference, we create a tag in the directory /tags so that it is easy to identify the submitted version of the document at a later date. This feature has proven very useful. When creating branches and tags, it is important always to use the Subversion client (and not the tools of the local file system) for these actions, because this saves disk space on the server and preserves the information that these documents share a common history.
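Creating the layout and a tag with the Subversion client can be sketched as follows. The local repository, the directory svn-tag-demo, the file paper.tex, and the tag name submitted-2007-03 are all made up for illustration; on a real server the URL would point to the project repository.

```shell
# Needs the Subversion command-line tools; stop early if they are missing.
command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1 \
  || { echo "Subversion tools not found"; exit 0; }

# Throw-away local repository (hypothetical paths).
rm -rf svn-tag-demo && mkdir svn-tag-demo
svnadmin create svn-tag-demo/repo
repo_url="file://$(pwd)/svn-tag-demo/repo"

# Create the conventional layout directly in the repository.
svn mkdir -q -m "Create standard layout" \
  "$repo_url/trunk" "$repo_url/branches" "$repo_url/tags"

# Put a document into /trunk.
svn checkout -q "$repo_url/trunk" svn-tag-demo/wc
printf 'Submitted text.\n' > svn-tag-demo/wc/paper.tex
svn add -q svn-tag-demo/wc/paper.tex
svn commit -q -m "Add paper.tex" svn-tag-demo/wc

# Tag the submitted version with "svn copy": a cheap copy made inside
# the repository that keeps the link to the history of /trunk.
svn copy -q -m "Tag version submitted to the journal" \
  "$repo_url/trunk" "$repo_url/tags/submitted-2007-03"

# The tag now lists the same files as /trunk did at that moment.
svn list "$repo_url/tags/submitted-2007-03"
```

A branch is created in the same way, with a target below /branches instead of /tags.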

Often the question arises which files should be put under version control. Generally, all files that are directly modified by the user and that are necessary for compiling the document should be included in the version control system. Typically, these are the LaTeX source code (*.tex) files (the main document and possibly some subdocuments) and all pictures that are inserted in the document (*.eps, *.jpg, *.png, and *.pdf files). All LaTeX classes (*.cls), LaTeX styles (*.sty), BibTeX data bases (*.bib), and BibTeX styles (*.bst) should generally be hosted in the repository of the common texmf tree, but they could be included in the respective project repository if some (external) co-authors do not have access to the common texmf tree. On the other hand, all files that are automatically created or modified during the compilation process (e.g. *.aut, *.aux, *.bbl, *.bix, *.blg, *.dvi, *.glo, *.gls, *.idx, *.ilg, *.ind, *.ist, *.lof, *.log, *.lot, *.nav, *.nlo, *.out, *.pdf, *.ps, *.snm, and *.toc files) or by the (LaTeX or BibTeX) editor (e.g. *.bak, *.bib~, *.kilepr, *.prj, *.sav, *.tcp, *.tmp, *.tps, and *.tex~ files) should generally not be under version control, because these files are not necessary for compilation and generally do not include additional information. Furthermore, these files are regularly modified, so that conflicts are very likely.
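One way to keep generated files out of the repository, and out of the output of svn status, is the svn:ignore property. The sketch below assumes a throw-away local repository; the directory svn-ignore-demo and the shortened pattern list are made up, and the list would normally be extended with the other extensions named above.

```shell
# Needs the Subversion command-line tools; stop early if they are missing.
command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1 \
  || { echo "Subversion tools not found"; exit 0; }

# Throw-away local repository and working copy (hypothetical paths).
rm -rf svn-ignore-demo && mkdir svn-ignore-demo
svnadmin create svn-ignore-demo/repo
svn checkout -q "file://$(pwd)/svn-ignore-demo/repo" svn-ignore-demo/wc

# One source file (to be versioned) and two by-products of a LaTeX run.
printf '\\documentclass{article}\n' > svn-ignore-demo/wc/main.tex
: > svn-ignore-demo/wc/main.aux
: > svn-ignore-demo/wc/main.log
svn add -q svn-ignore-demo/wc/main.tex

# One pattern per line; the property applies to this directory only.
printf '%s\n' '*.aux' '*.bbl' '*.blg' '*.log' '*.out' '*.toc' \
  > svn-ignore-demo/ignore-patterns.txt
svn propset -q svn:ignore -F svn-ignore-demo/ignore-patterns.txt svn-ignore-demo/wc
svn commit -q -m "Add main.tex and ignore generated files" svn-ignore-demo/wc

# main.aux and main.log are no longer reported as unversioned ("?") files.
svn status svn-ignore-demo/wc
```

Because svn:ignore is set per directory, the property has to be repeated in subdirectories that also collect generated files.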

The features of Subversion and its workflow

A great feature of a version control system is that all authors can easily trace the workflow of a project by viewing the differences between arbitrary versions of the files. Authors are primarily interested in 'effective' modifications of the source code that change the compiled document, but not in 'ineffective' modifications that have no impact on the compiled document (e.g. the position of line breaks). Software tools for comparing text documents ('diff tools') generally cannot differentiate between 'effective' and 'ineffective' modifications; they highlight both types of modifications. This considerably increases the effort to find and review the 'effective' modifications. Therefore, 'ineffective' modifications should be avoided.

In this sense, it is very important not to change the positions of line breaks without cause. Hence, automatic line wrapping of the users' LaTeX editors should be turned off and line breaks should be added manually. Otherwise, if a single word in the beginning of a paragraph is added or removed, all line breaks of this paragraph might change so that most diff tools indicate the entire paragraph as modified, because they compare the files line by line. The diff tools wdiff (http://www.gnu.org/software/wdiff/) and dwdiff (http://os.ghalkes.nl/dwdiff.html) are not affected by the positions of line breaks, because they compare documents word by word. However, their output is less clear so that modifications are more difficult to track. Moreover, these tools cannot be used directly with the Subversion command-line switch --diff-cmd, but a small wrapper script has to be used (http://textsnippets.com/posts/show/1033).

A reasonable convention is to add a line break after each sentence and start each new sentence in a new line. Note that this has an advantage also beyond version control: if you want to find a sentence in your LaTeX code that you have seen in a compiled (DVI, PS, or PDF) file or on a printout, you can easily identify the first few words of this sentence and screen for these words on the left border of your editor window.

Furthermore, we split long sentences into several lines so that each line has at most about 80 characters, because it is rather inconvenient to search for (small) differences in long lines. (Note: For instance, the LaTeX editor Kile (http://kile.sourceforge.net/) can assist the user in this task when it is configured to add a vertical line that marks the 80th column.) We find it very useful to introduce the additional line breaks at logical breaks of the sentence, e.g. before a relative clause or a new part of the sentence starts. An example LaTeX code that is formatted according to these guidelines is the source code of the article Tools for Collaborative Writing of Scientific LaTeX Documents by Arne Henningsen that is published (including the source code) in The PracTeX Journal 2007, Number 3 (http://www.tug.org/pracjourn/2007-3/henningsen/).
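The effect of these line-break conventions on a line-based diff can be checked with a small experiment. The sketch below uses made-up text in a hypothetical directory linebreak-demo: the same one-word insertion is made once in a paragraph that the editor re-wrapped and once in a paragraph written with one sentence per line. It only needs the standard diff and grep tools.

```shell
# Scratch directory with a hypothetical name; removed first so reruns work.
d=linebreak-demo
rm -rf "$d" && mkdir "$d"

# Paragraph with automatic line wrapping, before and after adding one word.
cat > "$d/wrapped-old.tex" <<'EOF'
Version control helps authors. It records
every change. Conflicts are detected early.
EOF
cat > "$d/wrapped-new.tex" <<'EOF'
Version control greatly helps authors. It
records every change. Conflicts are
detected early.
EOF

# The same paragraph and the same edit, with one sentence per line.
cat > "$d/sentence-old.tex" <<'EOF'
Version control helps authors.
It records every change.
Conflicts are detected early.
EOF
cat > "$d/sentence-new.tex" <<'EOF'
Version control greatly helps authors.
It records every change.
Conflicts are detected early.
EOF

# Count the lines diff reports as removed ("<") or added (">").
# diff exits with status 1 when the files differ, hence the "|| true".
wrapped=$( { diff "$d/wrapped-old.tex" "$d/wrapped-new.tex" || true; } | grep -c '^[<>]' )
sentence=$( { diff "$d/sentence-old.tex" "$d/sentence-new.tex" || true; } | grep -c '^[<>]' )
echo "re-wrapped paragraph:  $wrapped lines reported"
echo "one sentence per line: $sentence lines reported"
```

For this input the re-wrapped paragraph is reported with 5 changed lines (every line of both versions), whereas the one-sentence-per-line version is reported with 2 (the old and the new form of the single edited sentence).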

If the authors work on different operating systems, their LaTeX editors will probably save the files with different newline (end-of-line) characters (see the Wikipedia article Newline). To avoid this type of 'ineffective' modification, all users can agree on a specific newline character and configure their editors to use it. An alternative is to add the Subversion property 'svn:eol-style' and set it to 'native'. In this case, Subversion automatically converts all newline characters of this file to the native newline character of the author's operating system (http://svnbook.red-bean.com/en/1.4/svn.advanced.props.file-portability.html#svn.advanced.props.special.eol-style).
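Setting the property is a one-line command per file. The following sketch shows it in context, again with a throw-away local repository; the directory svn-eol-demo and the file paper.tex are made up.

```shell
# Needs the Subversion command-line tools; stop early if they are missing.
command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1 \
  || { echo "Subversion tools not found"; exit 0; }

# Throw-away local repository and working copy (hypothetical paths).
rm -rf svn-eol-demo && mkdir svn-eol-demo
svnadmin create svn-eol-demo/repo
svn checkout -q "file://$(pwd)/svn-eol-demo/repo" svn-eol-demo/wc

printf 'A sentence.\nAnother sentence.\n' > svn-eol-demo/wc/paper.tex
svn add -q svn-eol-demo/wc/paper.tex

# Store the file with normalised line endings and hand it out with the
# native line ending of each author's operating system.
svn propset -q svn:eol-style native svn-eol-demo/wc/paper.tex
svn commit -q -m "Add paper.tex with svn:eol-style native" svn-eol-demo/wc

# Show the property that is now recorded for the file.
svn propget svn:eol-style svn-eol-demo/wc/paper.tex
```

The property is stored per file, so it has to be set for every *.tex file that is added to the repository.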

There is also another important reason for reducing the number of 'ineffective' modifications: if several authors work on the same file, the probability that the same line is modified by two or more authors at the same time increases with the number of modified lines. Hence, 'ineffective' modifications unnecessarily increase the risk of conflicts (see section Interchanging Documents).

Figure 2: Reviewing modifications in KDiff3

Furthermore, version control systems allow a very effective quality assurance measure: all authors should critically review their own modifications before they commit them to the repository (see figure 2). The differences between the user's working copy and the repository can be easily inspected with a single Subversion command or with one or two clicks in a graphical Subversion client. Furthermore, authors should verify that their code can be compiled flawlessly before they commit their modifications to the repository. Otherwise, the co-authors have to pay for these mistakes when they want to compile the document. However, this directive is not only reasonable for version control systems but also for all other ways to interchange documents among authors.

Subversion has a feature called 'Keyword Substitution' that includes dynamic version information about a file (e.g. the revision number or the last author) in the contents of the file itself (see e.g. http://svnbook.red-bean.com, chapter 3). Sometimes it is useful to include this information not only as a comment in the LaTeX source code, but also in the (compiled) DVI, PS, or PDF document. This can be achieved with the LaTeX packages svn (http://www.ctan.org/tex-archive/macros/latex/contrib/svn/), svninfo (http://www.ctan.org/tex-archive/macros/latex/contrib/svninfo/), or (preferably) svn-multi (http://www.ctan.org/tex-archive/macros/latex/contrib/svn-multi/).
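Keyword substitution has to be switched on per file with the svn:keywords property. The sketch below assumes a throw-away local repository in a hypothetical directory svn-keyword-demo. The LaTeX file uses \svnid, \svnrev and \svnauthor as they are understood from the svn-multi documentation, so the package manual should be checked before relying on these names.

```shell
# Needs the Subversion command-line tools; stop early if they are missing.
command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1 \
  || { echo "Subversion tools not found"; exit 0; }

# Throw-away local repository and working copy (hypothetical paths).
rm -rf svn-keyword-demo && mkdir svn-keyword-demo
svnadmin create svn-keyword-demo/repo
svn checkout -q "file://$(pwd)/svn-keyword-demo/repo" svn-keyword-demo/wc

# The quoted 'EOF' keeps the shell from touching the $Id$ placeholder.
cat > svn-keyword-demo/wc/main.tex <<'EOF'
\documentclass{article}
\usepackage{svn-multi}
\svnid{$Id$}
\begin{document}
Revision \svnrev, last changed by \svnauthor.
\end{document}
EOF
svn add -q svn-keyword-demo/wc/main.tex

# Keyword substitution is off by default; enable it for the Id keyword.
svn propset -q svn:keywords "Id" svn-keyword-demo/wc/main.tex
svn commit -q -m "Add main.tex with keyword substitution" svn-keyword-demo/wc

# After the commit, the placeholder in the working copy is expanded to
# the file name, revision, date, and author.
grep 'Id:' svn-keyword-demo/wc/main.tex
```

Subversion only rewrites the text between the dollar signs, so the file remains valid LaTeX before and after each commit.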

The most important directives for collaborative writing of LaTeX documents with version control systems are summarised in the following box.

Directives for using LaTeX with version control systems

  1. Avoid 'ineffective' modifications.
  2. Do not change line breaks without good reason.
  3. Turn off automatic line wrapping of your LaTeX editor.
  4. Start each new sentence in a new line.
  5. Split long sentences into several lines so that each line has at most about 80 characters.
  6. Put only those files under version control that are directly modified by the user.
  7. Verify that your code can be compiled flawlessly before committing your modifications to the repository.
  8. Use Subversion's diff feature to critically review your modifications before committing them to the repository.
  9. Add a meaningful and descriptive comment when committing your modifications to the repository.
  10. Use the Subversion client for copying, moving, or renaming files and folders that are under revision control.

If the users are willing to let go of the built-in diff utility of SVN and use diff tools installed on their workstations, they can use tools that are better tailored to text documents. The diff tool that comes with SVN was designed with source code in mind. As such, it is most useful for files with short lines. Other tools, such as Compare It!, make it convenient to compare text files in which each line can span hundreds of characters (such as when each line represents a paragraph). When using a diff tool that allows convenient views of files with long lines, the users can write the TeX files without a strict line-breaking policy.

Distributed revision-control mercurial (and git)

As stated at https://en.wikipedia.org/wiki/Distributed_version_control: "In software development, distributed version control (also known as distributed revision control) is a form of version control where the complete codebase – including its full history – is mirrored on every developer's computer. This allows branching and merging to be managed automatically, increases speeds of most operations (except for pushing and pulling), improves the ability to work offline, and does not rely on a single location for backups."

Software development author Joel Spolsky described DVCS as "possibly the biggest advance in software development technology in the [past] ten years."

The most popular DVCS are currently

  1. git https://git-scm.com/ and
  2. mercurial https://www.mercurial-scm.org/

Although git is more popular and might be better suited to very large software projects, mercurial has a couple of features which make it especially suited to scientific collaboration based on LaTeX.

  1. Besides the changeset hash, it has a local revision number, which makes it easier to refer to different changesets.
  2. It has keyword expansion and is therefore suited to rcs-multi.sty and other LaTeX styles which add a version number to the footer or header of the document.
  3. It has named branches, besides bookmarks, which are more intuitive if several branches are desired.
  4. It runs on all major operating systems.
  5. The GUI TortoiseHg (https://www.mercurial-scm.org/wiki/TortoiseHg) provides an intuitive interface.

Setting up mercurial (git)

The following instructions describe how to set up mercurial. The git set up would be very similar, almost identical, see for example the git/mercurial rosetta stone: https://github.com/sympy/sympy/wiki/Git-hg-rosetta-stone

For installation of mercurial in different OS see: https://www.mercurial-scm.org/downloads

Mercurial is written in Python and ships with a wide variety of so-called extensions, which have to be enabled separately in the global .hgrc configuration file.

Here is an example:

# example config (see "hg help config" for more info)

[ui]
username = Joe Doe <user@gmail.com>

[extensions]
churn =
# Read the documentation of how to set up the notify extension, this
# extension is not needed if you use the bitbucket notify system
# notify = 
strip =
share = 
progress =
eol =
hgk =
hgext.bookmarks =
interhg =
rebase =
shelve =
purge =
record =
color =
keyword = 
hgext.fetch=
histedit = 

# Advanced extensions
# evolve = 
# hggit = 

# Minimal setting for keyword expansion.
[keyword]
**.tex =
[keywordmaps]
Author = {author|user}
Date = {date|utcdate}
Header = {root}/{file},v {node|short} {date|utcdate} {author|user}
Id = {file|basename},v {rev} {date|utcdate} {author|user} Exp {author|user}
# Other options
# Id = {file|basename},v {rev}{latesttag}.{latesttagdistance} {date|utcdate} {author|user} Exp {author|user}
# the problem with this is that adding a tag increases the rev number so that
# in the latex document the string is always v4.2 or 5.2 but never v4.1.
# Id = {file|basename},v {latesttag}.{latesttagdistance}[Hg:{rev}] {date|utcdate} {author|user} Exp {author|user}
# simplified version
# Id = {file|basename},v |Brch:{branches}|{latesttag}[Hg:{rev}] {date|utcdate} {author|user} Exp {author|user}
RCSFile = {file|basename},v
RCSfile = {file|basename},v
Revision = {node|short}
Source = {root}/{file},v

[hostfingerprints]
bitbucket.org.fingerprints=sha256:ae:ca:bf:83:41:14:55:8d:ea:70:ae:06:7d:ad:c0:44:77:6f:81:1a:c9:1e:d3:ab:f5:38:98:2b:07:4b:d4:70

[color]
custom.rev = red
custom.decorate = yellow
custom.date = green
custom.author = blue bold

[eol]
native = LF

Setting up a template LaTeX directory

As said before, it is advisable to use a modular approach for collaboration with LaTeX, using the subfiles package. A typical LaTeX template directory could look like this:

 -rw-r--r--  1 oub oub 254K Aug  7 21:15 bibfile.bib
 -rw-r--r--  1 oub oub  637 Aug 14 09:21 main.tex
 -rw-rw-r--  1 oub oub  576 Aug 14 09:21 sec1-main-result.tex
 -rw-rw-r--  1 oub oub  576 Aug 14 09:21 sec2-proof.tex
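
A minimal sketch of how such a template directory could be created from the shell, assuming the subfiles package is used. The directory name project1 and the file contents are illustrative assumptions, not the original template:

```shell
# Create a hypothetical template directory (names follow the listing above).
mkdir -p project1

# Main file: loads subfiles and pulls in the two sections.
cat > project1/main.tex <<'EOF'
\documentclass{article}
\usepackage{subfiles}
\begin{document}
\subfile{sec1-main-result}
\subfile{sec2-proof}
\bibliographystyle{plain}
\bibliography{bibfile}
\end{document}
EOF

# Section file: borrows the preamble of main.tex, so it also compiles on its own.
cat > project1/sec1-main-result.tex <<'EOF'
\documentclass[main.tex]{subfiles}
\begin{document}
% Section text goes here.
\end{document}
EOF

# The second section starts from the same skeleton.
cp project1/sec1-main-result.tex project1/sec2-proof.tex

# Empty placeholder for the shared bibliography.
touch project1/bibfile.bib
```

With this layout each collaborator can mostly work in a separate section file, which tends to keep later merges simple.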

Creating a local mercurial repository

In that directory, run the following commands (shown for Linux/macOS; MS Windows is similar):

hg init

hg addremove

hg commit -m "First commit"

Optionally one could add a .hgignore file, which for LaTeX could look like this:

syntax: glob
*.aux
*.toc
*.mw
*.backup
*.brf
*.tdo
*.bbl
*.blg
*.bib
*.el
*.log
*.dvi
*.nav
*.pdf
*.glo
*.idx
*.ilg
*.ind
*.nlo
*.nls
*.out
*.synctex.gz

hg addremove

hg commit -m "Added .hgignore"

So the directory would look like:

 drwxrwxr-x 11 oub oub 4.0K Aug 14 09:15 ..
 -rw-r--r--  1 oub oub 254K Aug  7 21:15 bibfile.bib
 drwxr-xr-x  4 oub oub 4.0K Aug 14 09:34 .hg
 -rw-r--r--  1 oub oub  215 Aug 14 09:31 .hgignore
 -rw-r--r--  1 oub oub  630 Aug 14 09:34 main.tex
 -rw-rw-r--  1 oub oub  568 Aug 14 09:34 sec1-main-result.tex
 -rw-rw-r--  1 oub oub  562 Aug 14 09:34 sec2-proof.tex

Now one can set up an empty bitbucket repository and invite the collaborators.

Bitbucket and Helix

I recommend using a hosted mercurial repository, such as bitbucket. For academic users bitbucket offers more generous terms (for example, no restriction on the number of collaborators). The recommended method is therefore that all collaborators open a bitbucket account.

  1. Then one of the authors, who will work as a maintainer, creates a repository and pushes a template as a first version. The bitbucket website gives a nice explanation, but see also below for details.
  2. The maintainer shares the repository with the other collaborators.
  3. It is recommended that all collaborators set up the bitbucket notify system.
  4. The collaborators clone the repository: for example hg clone https://bitbucket.org/user/project1.
  5. The collaborators edit the files and commit, for example: hg commit -m "Add Introduction"
  6. Then they push: hg push
  7. The other collaborators are informed about the push and pull: hg pull -u

However, as of 1 August 2020, bitbucket has shut down its access via mercurial. There are two alternatives:

  1. Stay with mercurial but use the hg-git plugin. That is possible even for named branches, but requires some fiddling.
  2. Switch to Helix. This service is a bit less generous but still offers free repositories (1 GB free space and 5 collaborators for each account). (https://info.perforce.com/try-perforce-helix-teamhub-free)

Get your local Mercurial repository on Bitbucket

This is copied from the bitbucket website.

  • Step 1: Switch to your repository's directory

cd /path/to/your/repo

  • Step 2: Connect your existing repository to Bitbucket

hg push https://user@bitbucket.org/user/test

  • Step 3: Update the default field of the repository's .hg/hgrc file with its new url

[paths]
default = https://user@bitbucket.org/user/test
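
Step 3 can also be done from the shell. The following is a minimal sketch; the helper name set_default_path is hypothetical, and it assumes the repository's .hg/hgrc does not yet contain a [paths] section:

```shell
# Hypothetical helper: record the default push/pull URL of a repository.
# $1 = repository directory, $2 = URL of the hosted repository.
set_default_path() {
    repo_dir=$1
    url=$2
    # Append a [paths] section to the per-repository configuration file.
    printf '[paths]\ndefault = %s\n' "$url" >> "$repo_dir/.hg/hgrc"
}
```

For example, set_default_path /path/to/your/repo https://user@bitbucket.org/user/test would let a plain hg push and hg pull use that URL afterwards.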

Workflow

In a nutshell, the commands that most collaborators (all but the maintainer) need to learn are:

  1. hg clone https://bitbucket.org/user/project1
  2. hg commit -m "Commit message"
  3. hg push
  4. hg pull -u

It is also recommended always to pull before a push.

Then one of two things can happen.

  • Either nothing happens, because there is no change upstream.

In that case

hg log -G

looks like

@  changeset:   1:cd6ae8660e6f
|  tag:         tip
|  user:        Joe Doe <user@gmail.com>
|  date:        Thu Aug 09 22:17:47 2018 +0200
|  summary:     My second commit
|
o  changeset:   0:cb0b44f99bd2

  • Or there is a change upstream which you pulled, in which case you now have a new head, which you should merge.

So in the second case

hg log -G

looks like:

o  changeset:   2:05548327a272
|  tag:         tip
|  parent:      0:cb0b44f99bd2
|  user:        John Smith <user2@gmail.com>
|  date:        Thu Aug 09 22:17:22 2018 +0200
|  summary:     My second commit
|
| @  changeset:   1:cd6ae8660e6f
|/   user:        Joe Doe <user@gmail.com>
|    date:        Thu Aug 09 22:17:47 2018 +0200
|    summary:     My second commit
|
o  changeset:   0:cb0b44f99bd2

So you should merge:

hg merge -r 2

and obtain

1 files updated, 0 files merged, 0 files removed, 0 files unresolved
(branch merge, don't forget to commit)

After committing the merge, for example with hg commit -m "Merged successfully",

hg log -G

gives

@    changeset:   3:df2f1f46a80c
|\   tag:         tip
| |  parent:      1:cd6ae8660e6f
| |  parent:      2:05548327a272
| |  user:        Joe Doe <user@gmail.com>
| |  date:        Thu Aug 09 22:20:44 2018 +0200
| |  summary:     Merged successfully
| |
| o  changeset:   2:05548327a272
| |  parent:      0:cb0b44f99bd2
| |  user:        John Smith <user2@gmail.com>
| |  date:        Thu Aug 09 22:17:22 2018 +0200
| |  summary:     My second commit
| |
o |  changeset:   1:cd6ae8660e6f
|/   user:        Joe Doe <user@gmail.com>
|    date:        Thu Aug 09 22:17:47 2018 +0200
|    summary:     My second commit
|
o  changeset:   0:cb0b44f99bd2

Merging of conflicting changesets should be left to the maintainer.
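
The pull, merge and push cycle described above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name sync_with_upstream and the merge commit message are hypothetical, and all local changes are assumed to be committed already:

```shell
# Hypothetical helper: pull, merge if a second head appeared, then push.
sync_with_upstream() {
    hg pull -u || return 1
    # Count the heads of the current branch; more than one means that
    # somebody else pushed changes in the meantime.
    heads=$(hg heads --template '{node}\n' . | wc -l | tr -d ' ')
    if [ "$heads" -gt 1 ]; then
        # hg merge stops with an error if there are conflicts;
        # those are best left to the maintainer.
        hg merge || return 1
        hg commit -m "Merge upstream changes" || return 1
    fi
    hg push
}
```

Calling sync_with_upstream inside a clone runs the same commands one would otherwise type by hand; if the merge reports conflicts, the function returns without committing or pushing.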

Real-Time Collaborative Writing

Several options are available:

  • Overleaf (and ShareLaTeX, now part of Overleaf) is a web-based real-time collaborative editor
  • Bluelatex is a web-based real-time collaborative editor written in Scala
  • Cocalc (formerly SageMathCloud) has a collaborative LaTeX editor
  • Authorea is a web-based real-time collaborative editor
  • Papeeria is a web-based real-time collaborative editor

The advantage of this approach is that no additional software is needed; the disadvantage is that one cannot easily use one's favorite LaTeX editor. That is why almost all of these services allow access via git (for which git, or mercurial with the hg-git plugin, needs to be installed).

These services were almost free a couple of years ago; now, however, only very basic features are free of charge. The following table gives some details.

Name collaborators documents access via git (or mercurial)
authorea no restrictions free of charge only for 3 documents yes
overleaf only one author no restrictions yes
papeeria no restrictions no restrictions only for public repositories
in the free version

Managing collaborative bibliographies

Writing of scientific articles, reports, and books requires the citation of all relevant sources. BibTeX is an excellent tool for citing references and creating bibliographies (Markey 2005, Fenn 2006). Many different BibTeX styles can be found on CTAN (http://www.ctan.org) and on the LaTeX Bibliography Styles Database (http://jo.irisson.free.fr/bstdatabase/). If no suitable BibTeX style can be found, most desired styles can be conveniently assembled with custom-bib/makebst (http://www.ctan.org/tex-archive/macros/latex/contrib/custom-bib/). Furthermore, BibTeX style files can be created or modified manually; however, this requires knowledge of the (unnamed) postfix stack language that is used in BibTeX style files (Patashnik 1988).

At our department, we have a common bibliographic data base in the BibTeX format (.bib file). It resides in our common texmf tree (see section 'Hosting LaTeX files in Subversion') in the subdirectory /bibtex/bib/ (see figure 1). Hence, all users can specify this bibliography by only using the file name (without the full path) --- no matter where the user's working copy of the common texmf tree is located.

All users edit our bibliographic data base with the graphical BibTeX editor JabRef (http://www.jabref.org). As JabRef is written in Java, it runs on all major operating systems. As different versions of JabRef generally save files in a slightly different way (e.g. by introducing line breaks at different positions), all users should use the same (e.g. the latest stable) version of JabRef. Otherwise, there would be many differences between different versions of .bib files that solely originate from using different versions of JabRef. Hence, it would be hard to find the real differences between the compared documents. Furthermore, the probability of conflicts would be much higher (see section 'Subversion really makes the difference'). As JabRef saves the BibTeX data base with the native newline character of the author's operating system, it is recommended to add the Subversion property 'svn:eol-style' and set it to 'native' (see section 'Subversion really makes the difference').

Figure 3: Specify default key pattern in JabRef

JabRef is highly flexible and can be configured in many details. We make the following changes to the default configuration of JabRef to simplify our work. First, we specify the default pattern for BibTeX keys so that JabRef can automatically generate keys in our desired format. This can be done by selecting Options → Preferences → Key pattern and modifying the desired pattern in the field Default pattern. For instance, we use [auth:lower][shortyear] to get the last name of the first author in lower case and the last two digits of the year of the publication (see figure 3).
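
To illustrate what the pattern [auth:lower][shortyear] produces, here is a rough shell sketch. It is not JabRef's implementation: the function name bibkey is hypothetical, and it assumes a plain ASCII last name and a four-digit year:

```shell
# Hypothetical illustration of the key pattern [auth:lower][shortyear].
# $1 = last name of the first author, $2 = four-digit year of publication.
bibkey() {
    lastname=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    shortyear=$(printf '%s' "$2" | cut -c3-4)
    printf '%s%s\n' "$lastname" "$shortyear"
}

bibkey Henningsen 2007   # prints "henningsen07"
```

JabRef itself also has to deal with cases this sketch ignores, such as name prefixes, accented characters and duplicate keys.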

Figure 4: Set up general fields in JabRef

Second, we add the BibTeX field location for information about where the publication is available as a hard copy (e.g. a book or a copy of an article). This field can contain the name of the user who has the hard copy and where they keep it, or the name of a library and the shelf-mark. This field can be added in JabRef by selecting Options → Set up general fields and adding the word location (using the semicolon (;) as delimiter) somewhere in the line that starts with General: (see figure 4).

Figure 5: Specify 'Main PDF directory' in JabRef

Third, we put all PDF files of publications in a specific subdirectory on our file server, where we use the BibTeX key as file name. We inform JabRef about this subdirectory by selecting Options → Preferences → External programs and adding the path of this subdirectory in the field Main PDF directory (see figure 5). If a PDF file of a publication is available, the user can click the Auto button to the left of JabRef's Pdf field to automatically add the file name of the PDF file. Now, all users who have access to the file server can open the PDF file of a publication by simply clicking on JabRef's PDF icon.

If we send the LaTeX source code of a project to a journal, publisher, or somebody else who has no access to our common texmf tree, we do not include our entire bibliographic data base, but extract the relevant entries with the Perl script aux2bib (http://www.ctan.org/tex-archive/biblio/bibtex/utils/bibtools/aux2bib).
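
Before extracting entries it can help to see which keys a document actually cites. The following sketch is not aux2bib itself; it only lists the cited keys, and it assumes a classic BibTeX .aux file in which citations appear as \citation{...} lines. The function name cited_keys is hypothetical:

```shell
# Hypothetical helper: list the BibTeX keys cited in an .aux file, one per line.
cited_keys() {
    grep -o '\\citation{[^}]*}' "$1" \
        | sed -e 's/^\\citation{//' -e 's/}$//' \
        | tr ',' '\n' \
        | sort -u
}
```

Running cited_keys main.aux after a LaTeX run gives a sorted list without duplicates, which can be compared with the entries of the extracted .bib file.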

Conclusion

This wikibook describes a possible way to efficiently organise the collaborative preparation of LaTeX documents. The presented solution is based on the Subversion version control system and several other software tools and LaTeX packages. However, there are still a few issues that can be improved.

First, we plan that all users install the same LaTeX distribution. As the TeX Live distribution (http://www.tug.org/texlive/) is available for both Unix and MS Windows operating systems, we might recommend that our users switch to this LaTeX distribution in the future. (Currently, our users have different LaTeX distributions that provide different selections of LaTeX packages and different versions of some packages. We solve this problem by providing some packages in our common texmf tree.)

Second, we are considering simplifying the solution for a common bibliographic data base. Currently it is based on the version control system Subversion, the graphical BibTeX editor JabRef, and a file server for the PDF files of publications in the data base. Using three different tools for one task is rather challenging for infrequent users and for users who are not familiar with these tools. Furthermore, the file server can only be accessed by local users. Therefore, we are considering implementing an integrated server solution like WIKINDX (http://wikindx.sourceforge.net/), Aigaion (http://www.aigaion.nl/), or refBASE (http://refbase.sourceforge.net/). Using such a solution only requires a computer with internet access and a web browser, which makes using our data base considerably easier for infrequent users. Moreover, the stored PDF files are available not only from within the department, but throughout the world. (Depending on the copyrights of the stored PDF files, access to the server, or at least to the PDF files, has to be restricted to members of the department.) Even non-LaTeX users of our department might benefit from a server-based solution, because these servers provide the data not only in BibTeX format but also in other formats, which should make it easier to use the bibliographic data base in other word processing software.

All readers are encouraged to contribute to this wikibook by adding further hints or ideas or by providing further solutions to the problem of collaborative writing of LaTeX documents.

Acknowledgements

Arne Henningsen thanks Francisco Reinaldo and Géraldine Henningsen for comments and suggestions that helped him to improve and clarify this paper, Karsten Heymann for many hints and much advice regarding LaTeX, BibTeX, and Subversion, and Christian Henning as well as his colleagues for supporting his intention to establish LaTeX and Subversion at their department.

References

