Temporary Base Class

Intent

Reduce the cost of creating temporary objects.

Motivation

Temporary objects are often created during execution of a C++ program. Result of C++ operators (unary, binary, logical, etc.) and return-by-value functions always give rise to temporary objects. For built-in types, the cost of creating temporaries is minimal because compilers often use CPU registers to manipulate them. However, creation of temporary objects can be quite expensive for user-defined classes that allocate memory in the constructor and may require expensive copy operations in the copy-constructor. Temporaries are often wasteful because their lifespan is usually very short and they exist only to be assigned to a named object (lvalue). Even though their lifespan is short, the C++ language rules require the compilers to create and destroy temporaries to maintain correctness of the program. (In reality, RVO and NRVO optimizations are allowed to eliminate some temporaries). The cost of creating and destroying temporaries can adversely affect the performance of programs that use heavy objects.

Consider, for instance, a Matrix class that uses heap memory to store an array of integers. This class uses the usual RAII idiom to manage the resources: allocation in the constructor and de-allocation in the destructor. The copy constructor and the copy-assignment operator take care of maintaining exclusive ownership of the allocated memory.

void do_addition(int * dest, const int * src1, const int * src2, size_t dim)
{
  for(size_t i = 0; i < dim * dim; ++i, ++dest, ++src1, ++src2)
    *dest = *src1 + *src2;
}

class Matrix
{
  size_t dim;
  int * data;

public:

  explicit Matrix(size_t d)
    : dim(d), data(new int[dim*dim]())
  {
    for(size_t i = 0;i < dim * dim; ++i)
      data[i] = i*i;  
  }
  
  Matrix(const Matrix & m)
    : dim(m.dim), data(new int[dim*dim]())
  {
    std::copy(m.data, m.data + (dim*dim), data);
  }
  
  Matrix & operator = (const Matrix & m)
  {
    Matrix(m).swap(*this);
    return *this;
  }
  
  void swap(Matrix & m)
  {
    std::swap(dim, m.dim);
    std::swap(data, m.data);
  }
  
  Matrix operator + (const Matrix & m) const
  {
    Matrix result(this->dim);
    do_addition(result.data, this->data, m.data, dim);
    return result;
  }

  ~Matrix()
  {
    delete [] data;
  }
};

Using objects of this class in expression such as below have several performance issues.

Matrix m1(3), m2(3), m3(3), m4(3);
Matrix result = m1 + m2 + m3 + m4;

Temporary Matrix objects are created as a result of every summation and they are destroyed immediately. For every pair of parenthesis in (((m1 + m2) + m3) + m4), a temporary object is needed. Creation and destruction of each temporary requires memory allocation and de-allocation, which is quite wasteful.

Temporary Base Class idiom is a way of reducing the overhead of temporaries in arithmetic expression such as the above. However, this idiom has major drawbacks as described towards the end.

Solution and Sample Code

Temporary Base Class idiom (TBCI) does not change the fact that temporary objects are created but it reduces (substantially) the cost of creating them. This is achieved by recognizing the places where temporaries are created and by using a type that is lightweight to create and destroy. Unlike C++11, C++03 does not have a language supported way of distinguishing temporary objects (rvalues) from named objects (lvalues). Const reference is the only way available in C++03 for binding a temporary object to a reference.

In TBCI idiom, each class for heavy objects is represented by two classes. One class, D (for derived), is meant to represent the named objects whereas another class, B (for base), is used to represent temporary objects. Users are expected to use only D class in variable/function declaration because objects of type D behave just like regular objects. For instance, deep copies are performed when copying D objects.

The B class is used for intermediate temporary objects created in arithmetic expressions. Creation and destruction of B type objects is transparent to the user provided all the results computed by D objects are assigned to another D object. The key difference between B and D classes is the copying behavior. Whenever a B or D object is created from a D object, a deep copy (i.e., new memory allocation and copying data) is performed. On the other hand, whenever a B or D object is created from a B object, a shallow copy (i.e., just copy pointers) is performed. These rules are also applicable to the assignment operators with an exception that existing memory may have to be deleted in the left hand side object (e.g, an assignment from a B to B or D). Moreover, unlike D objects, B objects do not have exclusive ownership of the data and therefore do not destroy it when a destructor is called.

Construction and Assignment
Deep Copy	Shallow Copy
From D to B, D	From B to B, D

Interfaces of B and D classes are designed carefully to fully support conversion from one another with respect to the rules of memory management given above. The following example is a TBCI version of the Matrix class shown above. The Matrix class is the main class whereas TMatrix class is meant to represent temporary matrices.

class Matrix;
class TMatrix 
{
  size_t dim;
  int * data;
  bool freeable;
  void real_destroy();

 public:

  explicit TMatrix(size_t d);
  TMatrix(const TMatrix & tm);
  TMatrix(const Matrix & m);
  TMatrix & operator = (const Matrix & m);
  TMatrix & operator = (const TMatrix & tm);
  TMatrix & operator + (const Matrix & m);
  TMatrix & operator + (TMatrix tm);
  ~TMatrix();
  void swap(TMatrix &) throw();

  friend class Matrix;
};

class Matrix : public TMatrix 
{
public:
  explicit Matrix(size_t dim);
  Matrix(const Matrix & tm);
  Matrix(const TMatrix & tm);
  Matrix & operator = (const Matrix & m);
  Matrix & operator = (const TMatrix & tm);
  TMatrix operator + (const Matrix & m) const;
  TMatrix operator + (const TMatrix & tm) const;
  ~Matrix();
};

The interfaces of the above two classes reveal several things. Not only the classes have doubled, number of member function have also (nearly) doubled. In particular, copy-constructor, copy-assignment operator, overloaded operator + are declared for both Matrix and TMatrix. This is necessary to ensure that in all possible cases where Matrix and TMatrix object come together, the behavior is well-defined.

The only way TMatrix objects are created are through the overloaded operator + in the Matrix class. Whenever two Matrix classes are added, the result is returned as a TMatrix object. The result of any subsequent additions of Matrix objects are stored in the same TMatrix object that is a result of the first addition. This eliminates the need to allocate and de-allocate memory for temporary matrices. For instance, TBCI achieves efficiency by executing addition of 4 matrices in the following way.

Matrix result = (((m1 + m2) + m3) + m4);
...
Matrix result = (((T1) + m3) + m4);
...
Matrix result = ((T1) + m4);
...
Matrix result = (T1);

New memory is allocated only when the TMatrix object is created the first time. The results of additions are stored in the temporary TMatrix object shown as T1. Finally, the result must be assigned to a Matrix object to ensure that memory resources do not leak.

Note that other combinations of the arithmetic operations are also possible. In particular, when other higher precedence operators, such as binary multiplication and division are used, TMatrixobjects may be created in different order. For the sake of simplicity, only binary addition is shown in the examples and parentheses are used to enforce precedence. For instance, consider the following order of execution.

((m1 + m2) + (m3 + m4))
...
((T1) + (T2))
...
(T1)

To obtain the above behavior, both the classes (Matrix and TMatrix) are implemented in an idiomatic way as shown below.

/***** Implementation *****/
void do_addition(int * dest, const int * src1, const int * src2, size_t dim)
{
  for(size_t i = 0; i < dim * dim; ++i, ++dest, ++src1, ++src2)
    *dest = *src1 + *src2;
}

void do_self_addition(int * dest, const int * src, size_t dim)
{
  for(size_t i = 0; i < dim * dim; ++i, ++dest, ++src)
    *dest += *src;
}

void populate(int *data, size_t dim)
{
  for(size_t i = 0;i < dim * dim; ++i)
    data[i] = i*i;
}

TMatrix::TMatrix(size_t d) 
: dim(d), data (new int[dim*dim]()), freeable(0)
{
  populate(data, dim);
}

TMatrix::TMatrix(const TMatrix & tm)
: dim(tm.dim), data(tm.data), freeable(0)
{}

TMatrix::TMatrix(const Matrix & m)
: dim(m.dim), data(new int[dim*dim]), freeable(0)
{
  std::copy(data, data + dim*dim, m.data);
}

TMatrix & TMatrix::operator = (const Matrix & m)
{
  std::copy(m.data, m.data + (m.dim * m.dim), data);
  return *this;
}

TMatrix & TMatrix::operator = (const TMatrix & tm)
{
  real_destroy();
  dim = tm.dim;
  data = tm.data;
  freeable = 0;
  return *this;
}

TMatrix & TMatrix::operator + (const Matrix & m)
{
  do_self_addition(this->data, m.data, dim);
  return *this;
}

TMatrix & TMatrix::operator + (TMatrix tm)
{
  do_self_addition(this->data, tm.data, dim);
  tm.real_destroy();
  return *this;
}

TMatrix::~TMatrix() 
{
  if(freeable) real_destroy();
}

void TMatrix::swap(TMatrix & tm) throw()
{
  std::swap(dim, tm.dim);
  std::swap(data, tm.data);
  std::swap(freeable, tm.freeable);
}

void TMatrix::real_destroy()
{
  delete [] data;
}

Matrix::Matrix(size_t dim) 
: TMatrix(dim) 
{}

Matrix::Matrix(const TMatrix & tm)
: TMatrix(tm)
{}

Matrix::Matrix(const Matrix & m)
: TMatrix(m)
{}

Matrix & Matrix::operator = (const Matrix &m)
{
  Matrix temp(m);
  temp.swap(*this);
  return *this;
}

Matrix & Matrix::operator = (const TMatrix & tm)
{
  real_destroy();
  dim = tm.dim;
  data = tm.data;
  freeable = 0;
  return *this;
}

TMatrix Matrix::operator + (const Matrix & m) const
{
  TMatrix temp_result(this->dim);
  do_addition(temp_result.data, this->data, m.data, dim);
  return temp_result;
}

TMatrix Matrix::operator + (const TMatrix & tm) const 
{
  TMatrix temp_result(tm);
  do_addition(temp_result.data, this->data, tm.data, dim);
  return temp_result;
}

Matrix::~Matrix() 
{
  freeable = 1;
}

Some highlights of the above code are given here. TMatrix objects are created only as a result of Matrix additions. Addition of two Matrix objects results in a freshly allocated TMatrix object. On the other hand, when any one of the objects being added are of TMatrix type, the result is stored in the memory referenced by the temporary.

TMatrix constructor allocates memory but destroys it only when freeable is true. Matrix destructor responsible for freeing memory. Unlike copy-constructor of Matrix, TMatrix copy-constructor does a shallow copy. However, as mentioned before, construction of TMatrix from Matrix does a deep copy with its exclusive memory.

TMatrix::operator + is a key function in this idiom. Note that the result of the addition is stored in the temporary object itself (do_self_addition) thereby avoiding another allocation. The TMatrix::operator + that takes a TMatrix object as a parameter is interesting because it accepts TMatrix object by-value. This is necessary because an addition of two temporary objects should result into only one temporary object and the other one must be destroyed. TMatrix objects do manage their memory, therefore, one of the objects is explicitly destroyed using TMatrix::real_destroy. This would not have been possible if right-hand-side TMatrix is bound to a const-reference.

The Matrix class is implemented in terms of TMatrix. Construction of Matrix from another Matrix does an allocation and copy. On the other hand, construction of Matrix from a TMatrix simply copies the pointers because TMatrix does not delete data. The same rules are applicable to the assignment operators of the Matrix class. Finally, Matrix destructor simply changes the freeable flag to true resulting into deleting the memory. To handle some rarely occurring cases correctly, assignments to a TMatrix object are also handled. An assignment from a Matrix is simply a copy operation whereas an assignment from a TMatrix class requires destroying one of them and doing a shallow copy of the pointers.

Caveats

This idiom has some serious drawbacks.

The result of computation must be assigned to a derived class object (Matrix object). Otherwise there is definite resource leak in the program. Such a convention is usually very hard to maintain.
As a consequence, this idiom is not even basic exception safe. If any intermediate memory allocation operation throws an exception, it is very likely that there would be a resource leak.

Known Uses

The TBCI Library

Related Idioms

References

K. Spanderen und Y. Xylander: Gut kalkuliert, "Effiziente Numerik mit C++", iX Seiten 166–173, November 1996
Documentation of the Numerical classes for Plasma simulation, Kurt Garloff