# Optimizing C++/General optimization techniques/Sorting

The C++ Standard Template Library (STL) provides the template function `sort` that implements a comparison sort algorithm. Because `sort` is templatized, it can be used for various types of sequences holding any type of key, as long as the keys are comparable (implement the < operator). A good compiler can generate code optimized for the various kinds of sequence/key combinations.

The reference implementation of the STL uses the introsort algorithm (since the 2000 release; the GNU C++ library uses the reference implementation). This algorithm is a very fast combination of quicksort and heapsort with a specially designed selection algorithm.

The `sort` template function is not guaranteed to be stable. When a stable sort is required, use the `stable_sort` template function instead.

This section suggests alternatives to the `sort` and `stable_sort` template functions that may be faster in specific cases.

## Sorting with small ranges of keys

To sort a data set according an integer key having a small range, use the counting sort algorithm.

The counting sort algorithm has O(N+M) complexity, where N is the number of elements to sort and M is the range of the sort keys, that is the difference between the highest key and the lowest key. In case N elements are to be sorted whose key is an integer number belonging to an interval containing no more than two times N values (i.e when `M <= 2 * N` holds), this algorithm may be quite faster than `sort`. In some cases it is faster even with larger ranges.

This algorithm may be used also for a partial ordering; for example, if the keys are integers between zero and one billion, you can still sort them using only the most significant byte, so to get an order for which the formula $a_n < a_{n + 1} + 256*256*256$ holds.

### Example: sorting 8-bit integers

Say you want to sort an array of arbitrary `unsigned char` elements. `<climits>` defines constant limits for integer and char types for a specific implementation. CHAR_BIT is the number of bits in a `char` object. ISO C++ requires CHAR_BIT to be 8 or greater. An `unsigned char` may have a value in the range between 0 and UCHAR_MAX. ISO C++ requires UCHAR_MAX to be 255 (2^8-1) or greater.

Note: `unsigned char` must be used because `char` can be signed or unsigned depending on the implementation.

```#include <climits>

void count_sort(unsigned char *a, unsigned char *const end)
{
unsigned char freq[UCHAR_MAX+1] = {0};
unsigned char *p, c;

for (p = a; p < end; ++p) {
freq[*p] += 1;
}
for (c = 0, p = a; c < UCHAR_MAX; ++c) {
while (freq[c]-- > 0) {
*p++ = c;
}
}
while (freq[c]-- > 0) {
*p++ = c;
}
}
```

The `counting_sort` function implements the pigeonhole sort algorithm. It takes a pointer to the first element of the input array and a pointer that points one element beyond the end of the array. Why? Because we don't have to stop here.

We can generalize `counting_sort` to a template function that also works for `string`, `vector<unsigned char>` and other sequence types, without loss of efficiency. When doing so, we need to work with iterators rather than pointers.

```#include <iterator>
#include <limits>

template <typename iterator>
void counting_sort(iterator const &begin, iterator const &end)
{
typedef std::iterator_traits<iterator>::value_type T;
T max = std::numeric_limits<T>::max();
T freq[max+1] = {0};
iterator it;
T c;

for (it = begin; it < end; ++it) {
freq[*it] += 1;
}
for (c = 0, it = begin; c < max; ++c)
while (freq[c]-- > 0) {
*it++ = c;
}
}
while (freq[c]-- > 0) {
*it++ = c;
}
}
```

## Partial sorting

### Partitioning

If you have to split a sequence according a criterion, use a partitioning algorithm, instead of a sorting one.

In STL there is the `std::partition` algorithm, that is faster than the `std::sort` algorithm, as it has O(N) complexity, instead of O(N log(N)).

### Stable partitioning and sorting

If you have to partition or sort a sequence for which equivalent entities may be swapped, don't use a stable algorithm.

In STL there is the `std::stable_partition` partitioning algorithm, that is slightly slower than the `std::partition` algorithm; and there is the `std::stable_sort` sorting algorithm, that is slightly slower than the `std::sort` algorithm.

### Order partitioning

If you have to pick out the first N elements from a sequence, or the Nth element in a sequence, use an order partitioning algorithm, instead of a sorting one.

In STL there is the `std::nth_element` algorithm, that, although slightly slower than the `std::stable_partition` algorithm, is quite faster then the `std::sort` algorithm, as it has O(N) complexity, instead of O(N log(N)).

### Sorting only the first N elements

If you have to sort the first N elements of a much longer sequence, use an order statistic algorithm, instead of a sorting one.

In STL there are the `std::partial_sort` and `std::partial_sort_copy` algorithms, that, although slower than the `std::nth_element` algorithm, are so much faster than the `std::sort` algorithm as the partial sequence to sort is shorter than the whole sequence.