Optimizing C++/Writing efficient code/Memory access

From Wikibooks, open books for an open world

This section presents guidelines for improving main-memory access performance by exploiting the behavior of the processor caches and of the swapping to secondary memory performed by the operating system's virtual memory manager.

Memory access order

Access memory in order of increasing address. In particular:

  • scan arrays in increasing order;
  • scan multi-dimensional arrays using the rightmost index for innermost loops;
  • in class constructors and in assignment operators (operator=), access member variables in the order of declaration.

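Data caches are optimized for sequential access at increasing addresses: memory is fetched one whole cache line at a time, and hardware prefetchers typically recognize such sequential access patterns.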
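The third guideline in the list above can be sketched as follows. This is only an illustration: the `Particle` structure and its members are hypothetical names, not taken from any particular code base. Both the constructor and the assignment operator touch the members in the order in which they are declared, which is also the order in which they are laid out in the object.

```cpp
// Hypothetical example type: members are declared as x, y, z, id.
struct Particle {
    double x;
    double y;
    double z;
    int id;

    // The initializer list follows the declaration order: x, y, z, id.
    Particle(double px, double py, double pz, int pid)
        : x(px), y(py), z(pz), id(pid) {}

    Particle(const Particle&) = default;

    // The assignment operator also copies the members in declaration order,
    // so the object is written at increasing addresses.
    Particle& operator=(const Particle& other) {
        x = other.x;
        y = other.y;
        z = other.z;
        id = other.id;
        return *this;
    }
};
```

Note that C++ always initializes members in declaration order, whatever the order of the initializer list, so writing the list in the same order mainly keeps the source consistent with what actually happens.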

When a multi-dimensional array is scanned, the innermost loop should iterate on the last index, the innermost-but-one loop should iterate on the last-but-one index, and so on. Because C++ stores multi-dimensional arrays with the last index varying fastest, this guarantees that array cells are processed in the same order in which they are arranged in memory. For example, the following code visits the cells in memory order:

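constexpr int num_levels = 4, num_rows = 8, num_columns = 16;  // illustrative sizes
float a[num_levels][num_rows][num_columns] = {};  // zero-initialized
// The loop nesting follows the index order, so cells are visited in memory order.
for (int lev = 0; lev < num_levels; ++lev) {
    for (int r = 0; r < num_rows; ++r) {
        for (int c = 0; c < num_columns; ++c) {
            a[lev][r][c] += 1;
        }
    }
}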
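For contrast, here is a minimal sketch that places the recommended traversal next to the discouraged one on a two-dimensional grid. The names `Grid`, `increment_row_major` and `increment_column_major`, as well as the sizes, are assumptions made for this illustration. Both functions produce the same result; only the order in which memory is touched differs.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t num_rows = 64;     // illustrative sizes
constexpr std::size_t num_columns = 64;

using Grid = std::array<std::array<float, num_columns>, num_rows>;

// Recommended: the innermost loop runs over the rightmost index,
// so consecutive iterations touch adjacent addresses.
inline void increment_row_major(Grid& g) {
    for (std::size_t r = 0; r < num_rows; ++r) {
        for (std::size_t c = 0; c < num_columns; ++c) {
            g[r][c] += 1;
        }
    }
}

// Discouraged: the innermost loop runs over the leftmost index, so
// consecutive iterations are num_columns * sizeof(float) bytes apart.
inline void increment_column_major(Grid& g) {
    for (std::size_t c = 0; c < num_columns; ++c) {
        for (std::size_t r = 0; r < num_rows; ++r) {
            g[r][c] += 1;
        }
    }
}
```

On a grid this small the difference is unlikely to be measurable; it tends to show up once the array no longer fits in the data cache.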

Memory alignment

Keep the compiler's default memory alignment.

By default, compilers align objects of fundamental types, so that each object may only have a memory address that is a multiple of a particular factor for its type. This criterion generally gives the best access performance, but it may add padding (holes) between successive objects.

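If it is necessary to avoid such padding for some structures, use a packing directive (for example #pragma pack, a compiler extension supported by GCC, Clang and Microsoft Visual C++) only around the definitions of those structures, and restore the default alignment immediately afterwards.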
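As a sketch of how a packing directive can be confined to a single structure, the following assumes a compiler that supports the `#pragma pack(push, n)` and `#pragma pack(pop)` extension. The structure names are hypothetical.

```cpp
#include <cstdint>

// Default layout: the compiler may insert padding after `tag`, so that
// `value` starts at an address that is a multiple of its alignment.
struct DefaultRecord {
    char tag;
    std::int32_t value;
};

#pragma pack(push, 1)   // compiler extension: pack members on 1-byte boundaries
struct PackedRecord {
    char tag;
    std::int32_t value;
};
#pragma pack(pop)       // restore the previous alignment for all later definitions

// The packed structure has no holes; the default one is at least as large.
static_assert(sizeof(PackedRecord) == sizeof(char) + sizeof(std::int32_t),
              "PackedRecord should contain no padding");
static_assert(sizeof(DefaultRecord) >= sizeof(PackedRecord),
              "default layout is never smaller than the packed one");
```

The push/pop pair keeps the directive from leaking into definitions that follow, including those in headers included later.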

Grouping functions in compilation units

Define in the same compilation unit all the member functions of a class, all the friend functions of the class, and all the member functions of friend classes of the class, except when the resulting file becomes unwieldy because of its size.

In this way, both the machine code resulting from the compilation of the functions and the static data defined in the classes and functions will typically have adjacent addresses, although the final placement is up to the linker. In addition, even compilers that do not perform whole program optimization may optimize calls among these functions.
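A minimal sketch of this layout follows. The file name `temperature.cpp` and the `Temperature` class are hypothetical; in a real project the class definition would normally live in a header included by this file.

```cpp
// temperature.cpp (hypothetical): a single compilation unit that holds
// the member functions of the class together with its friend function.

class Temperature {
public:
    explicit Temperature(double celsius) : celsius_(celsius) {}
    double fahrenheit() const;
    friend bool is_freezing(const Temperature& t);
private:
    double celsius_;
};

// Member function, defined in the same compilation unit as the friend below.
double Temperature::fahrenheit() const {
    return celsius_ * 9.0 / 5.0 + 32.0;
}

// Friend function, defined next to the member functions it works with.
bool is_freezing(const Temperature& t) {
    return t.celsius_ <= 0.0;
}
```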

Grouping variables in compilation units

Define every global variable in the compilation unit in which it is used most often.

In this way, such variables are likely to have addresses close to each other and to the static variables defined in the same compilation unit. In addition, even compilers that do not perform whole program optimization may optimize accesses to such variables from the functions that use them most often.
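The guideline can be sketched as follows, with the contents of two hypothetical files shown in one listing. The names `request_count` and `handle_request` are assumptions made for this illustration.

```cpp
// --- counters.h (hypothetical header, included wherever the variable is used) ---
extern int request_count;   // declaration only; no storage is allocated here

// --- server.cpp (hypothetical): the compilation unit that uses the variable
// --- most often, so the definition is placed here
int request_count = 0;      // definition: storage belongs to this compilation unit

void handle_request() {
    ++request_count;        // the most frequent access sits next to the definition
}
```

Other compilation units would include the header and use `request_count` through the `extern` declaration.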

Private functions and variables in compilation units

Declare in an anonymous namespace the variables and functions that are global to a compilation unit, but that are not used by other compilation units.

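In the C language and also in C++, such variables and functions may instead be declared static, which likewise gives them internal linkage. C++03 deprecated the use of static for objects declared at namespace scope, but C++11 removed that deprecation, so both forms are valid. An anonymous namespace is nevertheless commonly preferred in C++, because it also applies to type definitions such as classes, which cannot be declared static.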

In both cases, the compiler is notified that such identifiers will never be used by other compilation units. This allows compilers that do not perform whole program optimization to optimize the use of such variables and functions.
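A minimal sketch of the anonymous-namespace form follows. The names `call_count`, `decorate` and `label` are hypothetical.

```cpp
#include <string>

namespace {
    // Visible only inside this compilation unit (internal linkage).
    int call_count = 0;

    std::string decorate(const std::string& text) {
        ++call_count;
        return "[" + text + "]";
    }
}  // anonymous namespace

// The only function of this file that other compilation units can call.
std::string label(const std::string& text) {
    return decorate(text);
}
```

Because `decorate` cannot be referenced from elsewhere, the compiler is free to inline it into `label` and to omit a standalone copy of it.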