Optimizing C++/Code optimization/Instruction count

From Wikibooks, open books for an open world
Jump to navigation Jump to search

Even the language features that generate inlined code may have a significant cost, as such instruction are anyway to be executed. In this section some techniques are presented to decrease the total number of machine instructions that the processor will have to execute to perform a given operation.

Cases order in switch statement[edit | edit source]

In switch statements, sort the cases by decreasing probability.

In the guideline "Cases order in switch statement" in section 3.1, it was already suggested to put before the most typical cases, that is those that were presumed to be more probable. As further optimization, you can count, in typical runs, the actual number of times every case is chosen, and sort the cases from the most frequent to the less frequent.[dubious ]

Template integer parameters[edit | edit source]

If an integer value is a constant in the application code, but is a variable in library code, make it a template parameter.

Let's assume you are writing the following library function, in which both x and y do not have a known value when the library is developed:

int f1(int x, int y) { return x * y; }

Such function may be called from the following application code, in which x does not have a constant value, but y is the constant 4:

int a = f1(b, 4);

If, when you write the library, you know that the caller will surely pass a constant for the argument y, you can transform your function into the following function template:

template <int Y> int f2(int x) { return x * Y; }

Such function may be called from the following application code:

int a = f2<4>(b);

Such a call instantiates automatically the following function:

int f2(int x) { return x * 4; }

The latter function is faster than the former function f1, for the following reasons:

  • Only one argument is passed to the function (x) instead of two (x and y).
  • The multiplication by an integer constant (4) is always faster than a multiplication by an integer variable (y).
  • As the constant value (4) is a power of two, the compiler, instead of performing an integer multiplication, performs a bit shift.

In general, the integer template parameters are constants for those who instantiate the template and therefore for the compiler, and constants are handled more efficiently than variables. In addition, some operations involving constants are pre-computed at compilation-time.

If, instead of a normal function, you already have a function template, it is enough to add a further parameter to that template.

The Curiously Recurring Template Pattern[edit | edit source]

If you have to write a library abstract base class such that in every algorithm in the application code only one class derived from such base class will be used, use the Curiously Recurring Template Pattern.

Let's assume you are writing the following library base class:

class Base {
public:
    void g() { f(); }
private:
    virtual void f() = 0;
};

In this class, the function g performs an algorithm that calls the function f as an abstract operation for the algorithm. In design patterns terminology, g is a template method design pattern. The purpose of such class is to allow to write the following application code:

class Derived1: public Base {
private:
    virtual void f() { ... }
};
...
Base* p1 = new Derived1;
p1->g();

In such a case, it is possible to transform the previous library code into the following:

template <class Derived> class Base {
public:
    void g() { f(); }
private:
    void f() { static_cast<Derived*>(this)->f(); }
};

As a consequence, the application code will become the following:

class Derived1: public Base<Derived1> {
private:
    void f() { ... }
};
...
Derived1* p1 = new Derived1;
p1->g();

In such a way, the call to f in the function Base<Derived1>::g is statically bound to the member function Derived1::f, that is the call to such function is no more virtual, and can be inlined.

Though, let's assume you want to add the following definition:

class Derived2: public Base<Derived2> {
protected:
    void f() { ... }
};

With this technique it wouldn't be possible to define a pointer or a reference to a base class that is common to both Derived1 and Derived2, as such base classes are two unrelated types; as a consequence, this technique is not applicable when you want to allow the application code to define a container of arbitrary objects derived from the class Base.

Other limitations are:

  • Base is necessarily an abstract type;
  • an object of type Derived1 cannot be converted into an object of type Derived2 or vice versa;
  • for every derivation of Base, all the machine code generated for Base is duplicated.

The Strategy design pattern[edit | edit source]

If an object that implements the Strategy design pattern (aka Policy) is a constant in every algorithm of the application code, eliminate such an object, make static all its members, and add its class as a template parameter.

Let's assume you are writing the following library code, that implements the Strategy design pattern:

class C;
class Strategy {
public:
    virtual bool is_valid(const C&) const = 0;
    virtual void run(C&) const = 0;
};

class C {
public:
    void set_strategy(const Strategy& s) { s_ = s; }
    void f() { if (s_.is_valid(*this)) s_.run(*this); }
private:
    Strategy s_;
};

This library code has the purpose to allow the following application code:

class MyStrategy: public Strategy {
public:
    virtual bool is_valid(const C& c) const { ... }
    virtual void run(C& c) const { ... }
};
...
MyStrategy s; // Object representing my strategy.
C c; // Object containing an algorithm with customizable strategy.
c.set_strategy(s); // Assignment of the custom strategy.
c.f(); // Execution of the algorithm with assigned strategy.

In such a case, it's possible to convert the previous library code into the following:

template <class Strategy>
class C {
public:
    void f() {
        if (Strategy::is_valid(*this)) Strategy::run(*this);
    }
};

As a consequence, the application code will become the following:

class MyStrategy {
public:
    static bool is_valid(const C<MyStrategy>& c) { ... }
    static void run(C<MyStrategy>& c) { ... }
};
...

C<MyStrategy> c; // Object with statically assigned strategy.
c.f(); // Execution with statically assigned strategy.

In such a way, the object-strategy is avoided, and the member functions MyStrategy::is_valid and MyStrategy::run are statically bound, that is calls to virtual functions are avoided.

Though, such solution does not allow to choose the strategy at run-time, and of course neither to change it during the object life. In addition, the algorithm code is duplicated for every instantiation of its class.

Bitwise operators[edit | edit source]

If you have to perform boolean operations on a set of bits, put those bits in an unsigned int object, and use bitwise operators on it.

The bitwise operators (&, |, ^, <<, and >>) are translated in single fast instructions, and operate on all the bits of a register in a single instruction.