C++ Programming/Chapter Advanced Features Print version
From Wikibooks, the open-content textbooks collection
Authors
- The following people are authors to this book
- Panic
- Acknowledgment is given for using some contents from other works like Wikipedia, the Wikibooks Java Programming, C Programming and C++ Exercises for beginners, the C/C++ Reference Web Site, and from Wikisource, as from the authors Scott Wheeler, Stephen Ferg and Ivor Horton.
There are many other contributors/editors to the book; a verifiable list of all contributions exist as History Logs at Wikibooks (http://en.wikibooks.org/).
Advanced Features
I/O
Also commonly referred to as C++ I/O, to distinguish it from the I/O facilities of the C Standard Library, which the C++ standard also includes (as seen before in the Standard C I/O section).
The <iostream> library automatically defines a few standard objects:
- cout, an object of the ostream class, which displays data to the standard output device.
- cerr, another object of the ostream class that writes unbuffered output to the standard error device.
- clog, like cerr, but uses buffered output.
- cin, an object of the istream class that reads data from the standard input device.
The <fstream> library allows programmers to do file input and output with the ifstream and ofstream classes.
The string class
The string class is a part of the C++ standard library, used for convenient manipulation of sequences of characters, to replace the static, unsafe C method of handling strings. To use the string class in a program, the <string> header must be included. The standard library string class can be accessed through the std namespace.
The basic template class is basic_string<> and its standard specializations are string and wstring.
Basic usage
A std::string can be declared in one of these two ways:
    using namespace std;
    string std_string;
or
    std::string std_string;
Text I/O
This section deals only with text input from the keyboard. There are many other kinds of input that can be read (mouse movements, button clicks, etc.), but they will not be covered in this section; even reading the special keys of the keyboard is excluded.
Perhaps the most basic use of the string class is for reading text from the user and writing it to the screen. In the header file iostream, C++ defines an object named cin that handles input in much the same way that cout handles output.
    // snippet designed to get an integer value from the user
    int x;
    std::cin >> x;
The >> operator will cause the execution to stop and will wait for the user to type something. If the user types a valid integer, it will be converted into an integer value and stored in x.
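If the user types something other than an integer, the program will not report an error. Instead, the extraction fails and the stream is put into a failed state. Before C++11, x keeps its old content (an uninitialized, meaningless value); since C++11, x is set to 0. Either way, execution continues.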
This can then be extended into the following program:
    #include <iostream>
    #include <string>

    int main()
    {
        std::string name;
        std::cout << "Please enter your first name: ";
        std::cin >> name;
        std::cout << "Welcome " << name << "!" << std::endl;
        return 0;
    }
Although a string may hold a sequence containing any character--including spaces and nulls--when reading into a string using cin and the extraction operator (>>) only the characters before the first space will be stored. Alternatively, if an entire line of text is desired, the getline function may be used:
std::getline(std::cin, name);
Getting user input
Fortunately, there is a way to check and see if an input statement succeeds. We can invoke the good function on cin to check what is called the stream state. good returns a bool: if true, then the last input statement succeeded. If not, we know that some previous operation failed, and also that the next operation will fail.
Thus, getting input from the user might look like this:
    #include <iostream>

    int main ()
    {
        using namespace std;  // pull in the std namespace
        int x;
        // prompt the user for input
        cout << "Enter an integer: ";
        // get input
        cin >> x;
        // check and see if the input statement succeeded
        if (cin.good() == false) {
            cout << "That was not an integer." << endl;
            return -1;
        }
        // print the value we got from the user
        cout << x << endl;
        return 0;
    }
cin can also be used to input a string:
    string name;
    cout << "What is your name? ";
    cin >> name;
    cout << name << endl;
As with the scanf() function from the Standard C Library, this statement only takes the first word of input, and leaves the rest for the next input statement. So, if you run this program and type your full name, it will only output your first name.
You may also notice the >> operator doesn't handle errors as expected (for example, if you accidentally typed your name in a prompt for a number). Because of these issues, it may be more suitable to read a whole line of text and then use that line as the input; this is done with the function getline.
    string name;
    cout << "What is your name? ";
    getline (cin, name);
    cout << name << endl;
The first argument to getline is cin, which is where the input is coming from. The second argument is the name of the string variable where you want the result to be stored.
getline reads the entire line until the user hits Return or Enter. This is useful for inputting strings that contain spaces.
In fact, getline is generally useful for getting input of any kind. For example, if you wanted the user to type an integer, you could input a string and then check to see if it is a valid integer. If so, you can convert it to an integer value. If not, you can print an error message and ask the user to try again.
To convert a string to an integer you can use the strtol function defined in the header file cstdlib. (Note that the older function atoi is less safe than strtol, as well as being less capable.)
If you still need the features of the >> operator, you will need to create a string stream as available from <sstream>. The use of this stream will be discussed in a later chapter.
More advanced string manipulation
We will be using this dummy string for some of our examples.
string str("Hello World!");
This invokes the constructor that takes a const char* argument. By contrast, the default constructor creates an empty string that contains no characters at all. (A std::string stores its length explicitly, so it does not rely on a terminating '\0' and may even contain null characters.)
string str2(str);
Will trigger the copy constructor. std::string knows enough to make a deep copy of the characters it stores.
string str2 = str;
Despite the = sign, this is an initialization rather than an assignment, so it also invokes the copy constructor. The effect of this code is the same as in the example above.
Size
    string::size_type string::size() const;
    string::size_type string::length() const;
So for example one might do:
    string::size_type strSize = str.size();
    string::size_type strSize2 = str2.length();
The methods size() and length() both return the number of characters in the string object; they are synonyms. Remember that the last character in the string is at index size() - 1 and not size(). Like C-style strings, and arrays in general, std::string starts counting from 0.
I/O
    ostream& operator<<(ostream& out, const string& str);
    istream& operator>>(istream& in, string& str);
The shift operators (>> and <<) have been overloaded so you can perform I/O operations on istream and ostream objects, most notably cout, cin, and file streams. Thus you could just do console I/O like this:
    std::cout << str << std::endl;
    std::cin >> str;

    istream& getline(istream& in, string& str, char delim = '\n');
Alternatively, if you want to read entire lines at a time, use getline(). Note that this is not a member function. getline() will retrieve characters from input stream in and assign them to str until EOF is reached or delim is encountered. getline will reset the input string before appending data to it. delim can be set to any char value and acts as a general delimiter. Here is some example usage:
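    #include <fstream>
    #include <iostream>
    #include <string>

    int main()
    {
        // open a file
        std::ifstream file("somefile.cpp");
        std::string data, temp;
        while (getline(file, temp, '#'))  // while data left in file
        {
            // append data (the '#' delimiters themselves are discarded)
            data += temp;
        }
        std::cout << data;
        return 0;
    }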
Because of the way getline works (ie it returns the input stream), you can nest multiple getline() calls to get multiple strings; however this may significantly reduce readability.
Operators
char& string::operator[](string::size_type pos);
Chars in strings can be accessed directly using the overloaded subscript ([]) operator, like in char arrays:
std::cout << str[0] << str[2];
prints "Hl".
std::string supports implicit conversion from the older C string type const char*. You can also assign or append a single char to a string. Assigning a C string to a std::string is as simple as
str = "Hello World!";
If you want to do it character by character, you can also use
str = 'H';
Not surprisingly, operator+ and operator+= are also defined! You can append another string, a const char* or a char to any string.
The comparison operators >, <, ==, >=, <=, != all perform comparison operations on strings, similar to the C strcmp() function. These return a true/false value.
if(str == "Hello World!") { std::cout << "Strings are equal!"; }
Searching strings
    string::size_type string::find(const string& needle, string::size_type pos = 0) const;
You can use the find() member function to find the first occurrence of a string inside another. find() will look for needle inside the string it is called on, starting from position pos, and return the position of the first occurrence of needle. For example:
    std::string haystack = "Hello World!";
    std::string needle = "o";
    std::cout << haystack.find(needle);
This will simply print "4", which is the index of the first occurrence of "o" in haystack. If we want the "o" in "World", we need to set pos past the first occurrence: haystack.find(needle, 4) would return 4, while haystack.find(needle, 5) would give 7. If the substring isn't found, find() returns std::string::npos. This simple code searches a string for all occurrences of "wiki" and prints their positions:
    std::string wikistr = "wikipedia is full of wikis (wiki-wiki means fast)";
    for (std::string::size_type i = 0, tfind;
         (tfind = wikistr.find("wiki", i)) != std::string::npos;
         i = tfind + 1)
    {
        std::cout << "Found occurrence of 'wiki' at position " << tfind << std::endl;
    }

    string::size_type string::rfind(const string& needle, string::size_type pos = string::npos) const;
The function rfind() works similarly, except that it searches backwards from pos and returns the position of the last occurrence of the passed string.
Inserting/erasing
string& string::insert(size_type pos, const string& str);
You can use the insert() member function to insert another string into a string. For example:
    string newstr = " Human";
    str.insert (5,newstr);
After this call, str contains "Hello Human World!".
string& string::erase(size_type pos, size_type n);
You can use erase() to remove a substring from a string. For example:
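    str.erase (6,11);
Applied to the original "Hello World!", this removes everything from index 6 onward (the count is clamped to the end of the string), leaving "Hello " with a trailing space. To obtain "Hello!" instead, use str.erase(5, 6).
    string string::substr(size_type pos, size_type n) const;
You can use substr() to extract a substring from a string. Unlike insert() and erase(), it does not modify the string but returns a new one. For example:
    string str = "Hello World!";
    string part = str.substr(6,5);
After this call, part contains "World".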
Backwards compatibility
    const char* string::c_str() const;
    const char* string::data() const;
For backwards compatibility with C/C++ functions which only accept char* parameters, you can use the member functions string::c_str() and string::data() to obtain a const char* that you can pass to a function. The pointer is only valid until the string is next modified or destroyed. Before C++11, the difference between these two functions was that c_str() returned a null-terminated string while data() did not necessarily do so; since C++11, both return a null-terminated array. So, if your legacy function requires a null-terminated string, use c_str(); otherwise use data() (and presumably pass the length of the string in as well).
String Formatting
Strings can only be appended to other strings, not to numbers or other data types, so something like std::string("Foo") + 5 does not result in a string with the content "Foo5". To convert other data types into strings there exists the class std::ostringstream, found in the header <sstream>. std::ostringstream behaves like std::cout; the only difference is that the output doesn't go to the current standard output as provided by the operating system, but into an internal buffer. That buffer can be converted into a std::string via the std::ostringstream::str() method.
Example
    #include <iostream>
    #include <sstream>
    #include <string>

    int main()
    {
        std::ostringstream buffer;
        // Use the std::ostringstream just like std::cout or other iostreams
        buffer << "You have: " << 5 << " Helloworlds in your inbox";
        // Convert the std::ostringstream to a normal string
        std::string text = buffer.str();
        std::cout << text << std::endl;
        return 0;
    }
Advanced use
Further reading
Streams
Input and output are essential for any computer software, as these are the only means by which the program can communicate with the user. The simplest form of input/output is pure textual, i.e. the application displays in console form, using simple ASCII characters to prompt the user for inputs, which are supplied using the keyboard.
    // 'Hello World!' program
    #include <iostream>

    int main()
    {
        std::cout << "Hello World!" << std::endl;
        return 0;
    }
demonstrates the use of the std::cout stream, known as the standard output stream.
IOStreams are part of the C++ Standard Library that we saw earlier, not to be confused with the Standard Template Library (STL), which we have yet to introduce.
A stream is a type of object from which we can take values, or to which we can pass values. This is done transparently in terms of the underlying code.
Stream classes
Input and output is critical to computers. Every program you write will handle i/o in some form to communicate with the user. As this is a very common operation, programming languages like C++ are designed to make i/o as powerful yet painless as possible.
There are many ways for a program to gain input and output, including
- File i/o, that is, reading and writing to files
- Console i/o, reading and writing to a console window, such as a terminal in UNIX-based operating systems or a DOS prompt in Windows.
- Network i/o, reading and writing from a network device
- String i/o, reading and writing treating a string as if it were the input or output device
While these may seem unrelated, they work very similarly. In fact, operating systems that follow the POSIX specification deal with files, devices, network sockets, consoles, and many other things all with one type of handle, a file descriptor. However, low-level interfaces provided by the operating system tend to be difficult to use, so C++, like other languages, provide an abstraction to make programming easier. This abstraction is the stream.
Almost all input and output one ever does can be modeled very effectively as a stream. Having one common model means that one only has to learn it once. If you understand streams, you know the basics of how to output to files, the screen, sockets, pipes, and anything else that may come up.
A stream is an object that allows one to push data in or out of a medium, in order. Usually a stream can only output or can only input. It is possible to have a stream that does both, but this is rare. One can think of a stream as a car driving along a one-way street of information. An output stream can insert data and move on. It (usually) cannot go back and adjust something it has already written. Similarly, an input stream can read the next bit of data and then wait for the one that comes after it. It does not skip data or rewind and see what it had read 5 minutes ago.
The semantics of what a stream's read and write operations do depend on the type of stream. In the case of a file, an input file stream reads the file's contents in order without rewinding, and an output file stream writes to the file in order. For a console stream, output means displaying text, and input means getting input from the user via the console. If the user has not inputted anything, then the program blocks, or waits, for the user to enter in something.
Standard input, output, and error
The most common streams one uses are cout, cin, and cerr (pronounced "c out", "c in", and "c err(or)", respectively). They are defined in the header <iostream>. Usually, these streams read from and write to a console or terminal. In UNIX-based operating systems, such as Linux and Mac OS X, the user can redirect them to other files, or even other programs, for logging or other purposes. They are analogous to stdout, stdin, and stderr found in C. cout is used for generic output, cin is used for input, and cerr is used for printing errors. (cerr typically goes to the same place as cout, unless one or both are redirected, but it is not buffered and allows the user to fine-tune which parts of the program's output are redirected where.)
The standard syntax for outputting to a stream, in this case, cout, is
cout << some_data << some_more_data;
Example
    #include <iostream>
    using namespace std;

    int main()
    {
        int a = 1;
        cout << "Hello world! " << a << '\n';
        return 0;
    }
Result of Execution
Hello world! 1
To add a line break, send a newline character, \n or use std::endl, which writes a newline and flushes the stream's buffer.
Example
    #include <iostream>
    #include <ostream>
    using namespace std;

    int main()
    {
        int a = 1;
        char x = 13;
        cout << "Hello world!" << "\n" << a << endl << x << endl;
        return 0;
    }
Execution
    Hello world!
    1
The character x has the value 13, a carriage return, so the final line of output contains nothing visible.
It is always a good idea to end your output with a newline, so as not to mess up the user's terminal.
Files
With cout and cin, we can do basic communication with the user. For more complex I/O, we would like to read from and write to files. This is done with a file stream, defined in the header <fstream>. ofstream is an output file stream, and ifstream is an input file stream.
To open a file, one can either call open on the file stream or, more commonly, use the constructor. One can also supply an open mode to further control the file stream. Open modes include
- ios::app Leaves the file's original contents and appends new data to the end.
- ios::out Outputs new data in the file, removing the old contents. (default for ofstream)
- ios::in Reads data from the file. (default for ifstream)
Example
    // append a line to file1.txt and replace the contents of file2.txt
    #include <fstream>
    using namespace std;

    int main()
    {
        ofstream file1;
        file1.open("file1.txt", ios::app);
        file1 << "This data will be appended to the file file1.txt\n";
        file1.close();

        ofstream file2("file2.txt");
        file2 << "This data will replace the contents of file2.txt\n";
        return 0;
    }
The call to close() can be omitted if you do not care about the return value (whether it succeeded); the destructors will call close when the object goes out of scope.
If an operation (e.g. opening a file) was unsuccessful, a flag is set in the stream object. You can check the flags' status using the bad() or fail() member functions, which return a boolean value. By default the stream object doesn't throw any exceptions in such a situation; hence a manual status check is required. See the reference for details on bad() and fail().
Manipulators
A manipulator is a function that can be passed as an argument to a stream in different circumstances. For example, the manipulator 'hex' will cause the stream object to format subsequent integers written to the stream in hexadecimal instead of decimal. Likewise, 'oct' results in integers displaying in octal, and 'dec' reverts back to decimal.
Example
    #include <iostream>
    using namespace std;

    int main()
    {
        cout << dec << 16 << ' ' << 10 << endl;
        cout << oct << 16 << ' ' << 10 << endl;
        cout << hex << 16 << ' ' << 10 << endl;
        return 0;
    }
Execution
    16 10
    20 12
    10 a
There are many manipulators which can be used in conjunction with streams to simplify the formatting of output. For example, 'setw()' sets the field width of the data item next displayed. Used in conjunction with 'left' and 'right' (which set the justification of the data), 'setw' can easily be used to create columns of data.
Example
    #include <iostream>
    #include <iomanip>
    using namespace std;

    int main()
    {
        cout << setw(10) << right << 90 << setw(8) << "Help!" << endl;
        cout << setw(10) << left << 45 << setw(8) << "Hi!" << endl;
        return 0;
    }
Execution
            90   Help!
    45        Hi!
The data in the top row are displayed at the right of the columns created by 'setw', while in the next row the data are left-justified in the column. Please note the inclusion of a new header, 'iomanip'. Manipulators that take an argument, such as 'setw', require this header.
Here are some other manipulators and their uses:
| Manipulator | Function |
|---|---|
| boolalpha | displays boolean values as 'true' and 'false' instead of as integers. |
| noboolalpha | forces bools to display as integer values |
| uppercase | displays the letters in hexadecimal digits and in the exponent of scientific notation in uppercase |
| nouppercase | displays those letters in lowercase (the default) |
| fixed | forces floating point numbers to display with a fixed number of decimal places |
| scientific | displays floating point numbers in scientific notation |
Buffers
Most stream objects, including 'cout' and 'cin', have an area in memory where the information they are transferring sits until it is asked for. This is called a 'buffer'. Understanding the function of buffers is essential to mastering streams and their use.
Example
    #include <iostream>
    using namespace std;

    int main()
    {
        int num1, num2;
        cin >> num1;
        cin >> num2;
        cout << "Number1: " << num1 << endl
             << "Number2: " << num2 << endl;
        return 0;
    }
Execution 1
    >74
    >27
    Number1: 74
    Number2: 27
The inputs are given separately, with a hard return between them. '>' denotes user input.
Execution 2
    >74 27
    Number1: 74
    Number2: 27
The inputs are entered on the same line. They both go into the 'cin' stream buffer, where they are stored until needed. As 'cin' statements are executed, the contents of the buffer are read into the appropriate variables.
Execution 3
    >74 27 56
    Number1: 74
    Number2: 27
In this example, 'cin' received more input than it asked for. The third number it read in, 56, was never inserted into a variable. It would have stayed in the buffer until 'cin' was called again. The use of buffers can explain many strange behaviors that streams can exhibit.
Example
    #include <iostream>
    using namespace std;

    int main()
    {
        int num1, num2, num3;
        cin >> num1 >> num2;
        cout << "Number1: " << num1 << endl
             << "Number2: " << num2 << endl;
        cin >> num3;
        cout << "Number3: " << num3 << endl;
        return 0;
    }
Execution
    >45 89 37
    Number1: 45
    Number2: 89
    Number3: 37
Notice how all three numbers were entered at the same time in one line, but the stream only pulled them out of the buffer when they were asked for. This can cause unexpected output, since the user might accidentally type an extra value or space in their input. A well-written program will test for this type of unexpected input and handle it gracefully.
"safe bool" idiom
or Why define an operator void*() cast operator rather than an operator bool()?
This is so that we cannot write things like:
int foo = std::cin;
or, more importantly,
int bah; std::cin << bah; // observe: << instead of >>
by mistake. However, it is not perfect, as it allows other mistakes such as
delete std::cin;
though fortunately such errors are less likely, as delete should be used carefully in any case.
An alternative technique defines a private nested class dummy within std::ios and returns a pointer-to-member-function of dummy, which allows implicit conversion from that to bool but prevents many other operations. This is sometimes referred to as the "safe bool" idiom, and is motivated by the fact that C++'s bool type has implicit conversions both to and from int as a result of the standardization process. Since C++11, the standard streams instead declare an explicit operator bool, which permits tests such as if (std::cin) while rejecting the mistakes shown above.
Output
As seen in the "Hello World!" program, we direct the output to std::cout. The std:: prefix means that cout belongs to the standard library. For now, don't worry about what this means; we will cover the library and namespaces in later chapters.
What you do need to remember is that, in order to use the output stream, you must include the standard IO library header, as shown here:
#include <iostream>
This opens up a number of streams, functions and other programming devices which we can now use. For this section, we are interested in two of these; std::cout and std::endl.
Once we have referenced the standard IO library, we can use the output stream very simply. To use a stream, give its name, then pipe something in or out of it, as shown:
std::cout << "Hello, world!";
The << operator feeds everything to the right of it into the stream. We have essentially fed a text object into the stream. That's as far as our work goes; the stream now decides what to do with that object. In the case of the output stream, it's printed on-screen.
We're not limited to only sending a single object type to the stream, nor indeed are we limited to one object a time. Consider the examples below:
    std::cout << "Hello, " << userName << std::endl;
    std::cout << "The answer to life, the universe and everything is " << 42 << std::endl;
As can be seen, we feed in various values, separated by the << operator. The result comes out something like:
    Hello, Joe
    The answer to life, the universe and everything is 42
(The name will of course vary; we will discuss variables a little later.)
You will have noticed the use of std::endl throughout some of the examples so far. This is a stream manipulator. It is part of the standard IO library, and comes "free" when we include that library in order to use the output stream. When the output stream receives it, it starts a new line in the console and flushes the stream's buffer.
And of course, we're not limited to sending only ONE newline, either:
    std::cout << "Hello, " << userName << std::endl << std::endl;
    std::cout << "How old are you?";
Which produces something like:
    Hello, Joe

    How old are you?
Input
What would be the use of an application that only ever outputted information, but didn't care about what its users wanted? Minimal to none. Fortunately, inputting is as easy as outputting when you're using the stream.
The standard input stream is called std::cin and is used very similarly to the output stream. Once again, we include the standard IO library:
#include <iostream>
This gives us access to std::cin (and the rest of that header). Now, we give the name of the stream as usual, and extract input from it into a variable. A number of things have to happen here, demonstrated in the example below:
    #include <iostream>

    int main(int argc, char* argv[])
    {
        int a;
        std::cout << "Hello! How old are you? ";
        std::cin >> a;
        std::cout << "You're really " << a << " years old?" << std::endl;
        return 0;
    }
We include the standard IO library as usual, and define our main function in the normal way. Now we need to consider where the user's input goes. This calls for a variable (discussed in a later chapter) which we declare as being called a.
Next, we send some output, asking the user for their age. The real input happens now; everything the user types until they hit Enter is going to be stored in the input stream. We pull this out of the input stream and save it in our variable.
Finally, we output the user's age, piping the contents of our variable into the output stream.
Note: You will notice that if anything other than a whole number is entered, the input fails and the program prints a meaningless age instead of crashing. This is due to the way in which we set up our variable, as an int. Don't worry about this for now; we will cover variables later on.
Text input until EOF/error/invalid input
Input from the stream infile to a variable data until one of the following:
- EOF reached on infile.
- An error occurs while reading from infile (e.g., connection closed while reading from a remote file).
- The input item is invalid, e.g. non-numeric characters, when data is of type int.
    #include <iostream>
    // ...
    while (infile >> data)
    {
        // manipulate data here
    }
Note that the following is not correct:
    #include <iostream>
    // ...
    while (!infile.eof())
    {
        infile >> data; // wrong!
        // manipulate data here
    }
This will cause the last item in the input file to be processed twice, because eof() does not return true until input fails due to EOF.
Making user-created classes compatible with the stream library
It is often useful to have your own classes' instances compatible with the stream framework. For instance, if you defined the class Foo like this:
    class Foo
    {
    public:
        Foo() : x(1), y(2) { }
        int x, y;
    };
You will not be able to pass its instance to cout directly using the '<<' operator, because it is not defined for these two objects (Foo and ostream). What needs to be done is to define this operator and thus bind the user-defined class with the stream class.
    ostream& operator<<(ostream& output, const Foo& arg)
    {
        output << arg.x << "," << arg.y;
        return output;
    }
Taking the second argument by const reference lets the operator print constant and temporary objects as well.
Now this is possible:
    Foo my_object;
    cout << "my_object's values are: " << my_object << endl;
The operator function needs to have 'ostream&' as its return type, so chaining output works as usual between the stream and objects of type Foo:
    Foo my1, my2, my3;
    cout << my1 << my2 << my3;
This is because (cout << my1) is of type ostream&, so the next argument (my2) can be appended to it in the same expression, which again gives an ostream& so my3 can be appended and so on.
If you decided to restrict access to the member variables x and y (which is probably a good idea) within the class Foo, i.e.:
    class Foo
    {
    public:
        Foo() : x(1), y(2) { }
    private:
        int x, y;
    };
you will have trouble, because the global operator<< function doesn't have access to the private variables of its second argument. There are two possible solutions to this problem:
1. Within the class Foo, declare the operator<< function as the class's friend, which grants it access to private members, i.e. add the following line to the class declaration:
    friend ostream& operator<<(ostream& output, const Foo& arg);
Then define the operator<< function as you normally would (note that the declared function is not a member of Foo, just its friend, so don't try defining it as Foo::operator<<).
2. Add publicly available functions for accessing the member variables and make the operator<< function use these instead. The accessors are declared const so that they can be called on the const reference:
    class Foo
    {
    public:
        Foo() : x(1), y(2) { }
        int get_x() const { return x; }
        int get_y() const { return y; }
    private:
        int x, y;
    };

    ostream& operator<<(ostream& output, const Foo& arg)
    {
        output << arg.get_x() << "," << arg.get_y();
        return output;
    }
Standard Template Library (STL)
The C++ Standard Library incorporates much of the STL, a software library published by SGI, which was partially included in the C++ Standard. ISO C++ does not specify header content, and allows implementation either in the headers or in a true library. There are many different implementations of the STL, all based on the language standard but nevertheless differing from each other. These differences are transparent to the programmer, while enabling specialization and rapid evolution of the code base.
The Standard Template Library (STL) offers easy-to-use containers, data types and functions making programming easier. Instead of wondering if your array would ever need to hold 257 records or having nightmares of string buffer overflows, you can enjoy vector and string that automatically extend to contain more records or characters. For example, vector is just like an array, except that vector's size can expand to hold more cells or shrink when fewer will suffice. One must keep in mind that the STL does not conflict with OOP but in itself is not object oriented -- in particular it makes no use of runtime polymorphism (i.e., has no virtual functions).
The true power of the STL lies not in its container classes, but in the fact that it is a framework, combining algorithms with data structures using indirection through iterators to allow generic implementations of algorithms to work efficiently on varied forms of data. To give a simple example, the same std::copy function can be used to copy elements from one array to another, or to copy the bytes of a file, or to copy the whitespace-separated words in "text like this" into a container such as std::vector<std::string>.
    #include <algorithm>
    #include <cassert>
    #include <iostream>
    #include <iterator>
    #include <sstream>
    #include <string>
    #include <vector>

    // std::copy from input stream a to an arbitrary OutputIterator
    template <typename OutputIterator>
    void f(std::istream &a, OutputIterator destination)
    {
        std::copy(std::istreambuf_iterator<char>(a),
                  std::istreambuf_iterator<char>(),
                  destination);
    }

    int main()
    {
        // std::copy from array a to array b
        int a[10] = { 3,1,4,1,5,9,2,6,5,4 };
        int b[10];
        std::copy(a, a + 10, b);

        // std::copy from a buffer containing text, inserting items in
        // order at the back of the container called words.
        std::istringstream buffer("text like this");
        std::vector<std::string> words;
        std::copy(std::istream_iterator<std::string>(buffer),
                  std::istream_iterator<std::string>(),
                  std::back_inserter(words));
        assert(words[0] == "text");
        assert(words[1] == "like");
        assert(words[2] == "this");
        return 0;
    }
Containers
The containers we will discuss in this section of the book are part of the standard namespace (std::). They all originated in the STL.
Sequence Containers
- Sequences - easier than arrays
Sequences are similar to C arrays, but they are easier to use. Vector is usually the first sequence to be learned. Other sequences, list and double-ended queues, are similar to vector but more efficient in some special cases. (Their behavior is also different in important ways concerning validity of iterators when the container is changed; iterator validity is an important, though somewhat advanced, concept when using containers in C++.)
- vector - "an easy-to-use array"
- list - in effect, a doubly-linked list
- deque - double-ended queue (properly pronounced "deck", often mispronounced as "dee-queue")
vector
The vector is a class template. It is a Sequence Container that allows you to easily create a dynamic array of elements (one type per instance) of almost any data type or object within a program. The vector class handles most of the memory management for you.
Since a vector contains contiguous elements, it is an ideal choice to replace the old C-style array wherever you need to store data, and especially where the array must change in size during the program's execution (old C-style arrays can't do that). However, vectors do incur a very small overhead compared to static arrays (depending on the quality of your compiler), and before C++11 they could not be initialized through an initialization list.
Accessing an element of a vector by index takes a fixed amount of time no matter how large the vector is, and appending an element takes amortized constant time. Locating a specific value in a vector, or inserting elements into the middle of it, takes time proportional to the number of elements that must be examined or shifted.
- Example
/* David Cary 2009-03-04 quick demo for wikibooks */
#include <iostream>
#include <vector>
using namespace std;

vector<int> pick_vector_with_biggest_fifth_element( vector<int> left, vector<int> right )
{
  if( (left[5]) < (right[5]) ){
    return( right );
  }
  // else
  return( left );
}

int * pick_array_with_biggest_fifth_element( int * left, int * right )
{
  if( (left[5]) < (right[5]) ){
    return( right );
  }
  // else
  return( left );
}

int vector_demo(void)
{
  cout << "vector demo" << endl;
  vector<int> left(7);
  vector<int> right(7);
  left[5] = 7;
  right[5] = 8;
  cout << left[5] << endl;
  cout << right[5] << endl;
  vector<int> biggest( pick_vector_with_biggest_fifth_element( left, right ) );
  cout << biggest[5] << endl;
  return 0;
}

int array_demo(void)
{
  cout << "array demo" << endl;
  int left[7];
  int right[7];
  left[5] = 7;
  right[5] = 8;
  cout << left[5] << endl;
  cout << right[5] << endl;
  int * biggest = pick_array_with_biggest_fifth_element( left, right );
  cout << biggest[5] << endl;
  return 0;
}

int main(void)
{
  vector_demo();
  array_demo();
}
- Member Functions
The vector class models the Container concept, which means it has begin(), end(), size(), max_size(), empty(), and swap() methods.
- informative
vector::front - Returns a reference to the first element of the vector.
vector::back - Returns a reference to the last element of the vector.
vector::size - Returns the number of elements in the vector.
vector::empty - Returns true if the vector has no elements.
- standard operations
vector::insert - Inserts elements into a vector (single & range), shifts later elements up. Takes time proportional to the number of elements shifted.
vector::push_back - Appends (inserts) an element to the end of a vector, allocating memory for it if necessary. Amortized O(1) time.
vector::erase - Deletes elements from a vector (single & range), shifts later elements down. Takes time proportional to the number of elements shifted.
vector::pop_back - Erases the last element of the vector. The capacity is not reduced. O(1) time.
vector::clear - Erases all of the elements. Note however that if the data elements are pointers to memory that was created dynamically (e.g., the new operator was used), that memory will not be freed.
- allocation/size modification
vector::assign - Replaces the contents of the vector with copies of the specified elements.
vector::reserve - Increases the capacity (allocates more memory) of the vector, if needed. A call to reserve never reduces the capacity.
vector::capacity - Returns the current capacity (allocated memory) of the vector.
vector::resize - Changes the vector size.
- iteration
vector::begin - Returns an iterator to start traversal of the vector.
vector::end - Returns an iterator that points just beyond the end of the vector.
vector::at - Returns a reference to the data element at the specified location in the vector, with bounds checking.
vector<int> v;
for (vector<int>::iterator it = v.begin(); it != v.end(); ++it /* the increment operator moves to the next element */)
{
  cout << *it << endl;
}
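The difference between size and capacity, and the bounds checking performed by at(), can be sketched as follows. This is only an illustration; the helper names capacity_after_reserve and at_throws_out_of_range are made up for this example and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reserve room for n elements and report the capacity.
// reserve() only changes the capacity; the vector still holds no elements.
std::size_t capacity_after_reserve(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);
    assert(v.size() == 0);
    return v.capacity();
}

// Hypothetical helper: returns true if v.at(index) throws std::out_of_range.
// Unlike operator[], at() checks the index against size().
bool at_throws_out_of_range(const std::vector<int>& v, std::size_t index)
{
    try {
        (void)v.at(index);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

For a vector holding three elements, at(2) succeeds while at(3) throws; operator[] with index 3 would instead be undefined behavior.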
vector::Iterators
std::vector<T> provides Random Access Iterators; as with all containers, the primary access to iterators is via begin() and end() member functions. These are overloaded for const- and non-const containers, returning iterators of types std::vector<T>::const_iterator and std::vector<T>::iterator respectively.
vector examples
/* Vector sort example */
#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
  using namespace std;

  cout << "Sorting STL vector, \"the easier array\"... " << endl;
  cout << "Enter numbers, one per line. Press ctrl-D to quit." << endl;

  vector<int> vec;
  int tmp;
  while (cin >> tmp) {
    vec.push_back(tmp);
  }

  cout << "Sorted: " << endl;
  sort(vec.begin(), vec.end());
  for (vector<int>::size_type i = 0; i < vec.size(); i++) {
    cout << vec[i] << endl;
  }
  return 0;
}
The call to sort above actually calls an instantiation of the function template std::sort, which will work on any half-open range specified by two random access iterators.
If you would like to make the code above more "STLish" you can write this program in the following way:
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>

int main()
{
  using namespace std;

  cout << "Sorting STL vector, \"the easier array\"... " << endl;
  cout << "Enter numbers, one per line. Press ctrl-D to quit." << endl;

  // The extra parentheses around the first argument keep the compiler
  // from parsing this line as a function declaration.
  vector<int> vec((istream_iterator<int>(cin)), istream_iterator<int>());

  sort(vec.begin(), vec.end());

  cout << "Sorted: " << endl;
  copy(vec.begin(), vec.end(), ostream_iterator<int>(cout, "\n"));
  return 0;
}
Linked lists
The STL provides a class template called list (part of the standard namespace (std::)) which implements a non-intrusive doubly-linked list. Linked lists can insert or remove elements in the middle in constant time, but do not have random access. One useful feature of std::list is that references, pointers and iterators to items inserted into a list remain valid so long as that item remains in the list.
list examples
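A minimal sketch of the two properties described above: insertion in the middle of a list, and iterators that stay valid while the list changes. The function name insert_in_middle is hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <list>

// Hypothetical helper: builds the list {1, 2, 4}, inserts elements, and
// then writes through an iterator that was obtained before the insertions.
std::list<int> insert_in_middle()
{
    std::list<int> numbers;
    numbers.push_back(1);
    numbers.push_back(2);
    numbers.push_back(4);

    std::list<int>::iterator two = numbers.begin();
    ++two;                        // refers to the element 2
    std::list<int>::iterator four = two;
    ++four;                       // refers to the element 4

    numbers.insert(four, 3);      // inserts before 4; no elements are moved
    numbers.push_front(0);        // a list can also grow at the front

    *two = 20;                    // 'two' is still valid after both insertions
    assert(numbers.size() == 5);
    return numbers;               // 0, 1, 20, 3, 4
}
```

With a vector, the same insertions could reallocate the storage and invalidate the iterator two.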
Associative Containers (key and value)
This type of container associates each element with a key value, thus simplifying searching for the programmer. Instead of iterating through an array or vector element by element to find a specific one, you can simply ask for people["tero"]. Just like vectors and other containers, associative containers can expand to hold any number of elements.
Maps and Multimaps
map and multimap are associative containers that manage key/value pairs as elements, as seen above. The elements of each container are sorted automatically, using the key as the sorting criterion. The difference between the two is that maps do not allow duplicate keys, whereas multimaps do.
- map - unique keys
- multimap - same key can be used many times
- set - unique key is the value
- multiset - key is the value, same key can be used many times
/* Map example - character distribution */
#include <iostream>
#include <map>
#include <string>
#include <cctype>
using namespace std;

int main()
{
  /* Character counts are stored in a map, so that
   * character is the key.
   * Count of char a is chars['a']. */
  map<char, long> chars;

  cout << "chardist - Count character distributions" << endl;
  cout << "Type some text. Press ctrl-D to quit." << endl;
  char c;
  while (cin.get(c)) {
    // Upper A and lower a are considered the same
    c=tolower(static_cast<unsigned char>(c));
    chars[c]=chars[c]+1; // Could be written as ++chars[c];
  }

  cout << "Character distribution: " << endl;
  string alphabet("abcdefghijklmnopqrstuvwxyz");
  for (string::iterator letter_index=alphabet.begin(); letter_index != alphabet.end(); letter_index++) {
    if (chars[*letter_index] != 0) {
      cout << char(toupper(*letter_index))
           << ":" << chars[*letter_index]
           << "\t" << endl;
    }
  }
  return 0;
}
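The example above covers map. As a rough sketch of the other containers in the list, the following shows a multimap keeping several values under one key and a set discarding duplicates. The helper names and the sample phone numbers are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical helper: a multimap keeps every value stored under a key.
std::size_t count_phone_numbers(const std::string& name)
{
    std::multimap<std::string, std::string> phones;
    phones.insert(std::make_pair(std::string("tero"), std::string("555-0100")));
    phones.insert(std::make_pair(std::string("tero"), std::string("555-0101")));
    phones.insert(std::make_pair(std::string("anna"), std::string("555-0102")));
    assert(phones.size() == 3);   // both "tero" entries were kept
    return phones.count(name);
}

// Hypothetical helper: a set stores each distinct value only once.
std::size_t count_distinct(const int* first, const int* last)
{
    std::set<int> unique_values(first, last);
    return unique_values.size();
}
```

A map given the same three insertions would hold only two elements, because the second "tero" entry would be rejected.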
Container Adapters
- stack - last in, first out (LIFO)
- queue - first in, first out (FIFO)
- priority queue
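The three adapters differ only in the order in which they hand elements back. The following sketch pushes the same values into each one; the function names are hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <stack>
#include <vector>

// Each helper pushes 3, 1, 2 (in that order) and returns the order
// in which the adapter releases the elements again.

std::vector<int> stack_order()            // last in, first out
{
    std::stack<int> s;
    s.push(3); s.push(1); s.push(2);
    assert(s.size() == 3);
    std::vector<int> out;
    while (!s.empty()) { out.push_back(s.top()); s.pop(); }
    return out;                           // 2, 1, 3
}

std::vector<int> queue_order()            // first in, first out
{
    std::queue<int> q;
    q.push(3); q.push(1); q.push(2);
    std::vector<int> out;
    while (!q.empty()) { out.push_back(q.front()); q.pop(); }
    return out;                           // 3, 1, 2
}

std::vector<int> priority_queue_order()   // largest element first
{
    std::priority_queue<int> pq;
    pq.push(3); pq.push(1); pq.push(2);
    std::vector<int> out;
    while (!pq.empty()) { out.push_back(pq.top()); pq.pop(); }
    return out;                           // 3, 2, 1
}
```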
Iterators
C++'s iterators are the foundation of the STL, now largely incorporated into the standard library part of C++.
The basic idea of an iterator is to provide a way to navigate over some collection of objects. Iterators exist in languages other than C++, but C++ uses an unusual form of iterators, with pros and cons.
In C++, an iterator is a concept rather than a specific type. Iterators are further divided based on properties such as traversal properties.
Some (overlapping) categories of iterators are:
- Singular iterators
- Invalid iterators
- Random access iterators
- Bidirectional iterators
- Forward iterators
- Input iterators
- Output iterators
- Mutable iterators
A pair of iterators [begin, end) is used to define a half-open range, which includes every element from the one identified by begin up to, but not including, the one identified by end. As a special case, the half-open range [x, x) is empty, for any valid iterator x.
The most primitive examples of iterators in C++ (and likely the inspiration for their syntax) are the built-in pointers, which are commonly used to iterate over elements within arrays.
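As a small sketch of pointers acting as iterators over a half-open range, consider the following. The function names sum_range and sort_array are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: sums the half-open range [begin, end).
// Plain pointers play the role of the iterators.
int sum_range(const int* begin, const int* end)
{
    assert(begin <= end);
    int total = 0;
    for (const int* p = begin; p != end; ++p) {
        total += *p;
    }
    return total;
}

// Hypothetical helper: sorts a built-in array by passing the same kind
// of pointer range, [data, data + count), to std::sort.
void sort_array(int* data, std::size_t count)
{
    std::sort(data, data + count);
}
```

Calling sum_range(a, a) for any array a yields 0, because the range [a, a) is empty.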
Iteration over a Container
Accessing (but not modifying) each element of a container group of type C<T> using an iterator.
for (
     typename C<T>::const_iterator iter = group.begin();
     iter != group.end();
     ++iter
    )
{
  T const &element = *iter;
  // access element here
}
Note the usage of typename. It informs the compiler that 'const_iterator' is a type as opposed to a static member variable. (It is only necessary inside templated code. In C++98 it is invalid in regular, non-template code; since C++11 typename is permitted there as well.)
Modifying each element of a container group of type C<T> using an iterator.
for (
     typename C<T>::iterator iter = group.begin();
     iter != group.end();
     ++iter
    )
{
  T &element = *iter;
  // modify element here
}
When modifying the container itself while iterating over it, some containers (such as vector) require care that the iterator doesn't become invalidated, and end up pointing to an invalid element. For example, instead of:
for (i = v.begin(); i != v.end(); ++i) {
  ...
  if (erase_required) {
    v.erase(i);
  }
}
Do:
for (i = v.begin(); i != v.end(); ) {
  ...
  if (erase_required) {
    i = v.erase(i);
  } else {
    ++i;
  }
}
The erase() member function returns the next valid iterator, or end(), thus ending the loop. Note that ++i is not executed when erase() has been called on an element.
Functors
A functor, or function object, is an object that defines operator(). The importance of functors is that they can be used in many contexts in which C++ functions can be used, whilst also having the ability to maintain state information. Next to iterators, functors are one of the most fundamental ideas exploited by the STL.
The STL provides a number of pre-built functor classes; std::less, for example, is often used to specify a default comparison function for algorithms that need to determine which of two objects comes "before" the other.
/* Example demonstrating declaring a functor and using it to
   calculate the sum of squares of the numbers in a vector */
#include <algorithm>
#include <iostream>
#include <vector>

// Define the Functor for AccumulateSquareValues
template<typename T>
struct AccumulateSquareValues
{
  AccumulateSquareValues() : sumOfSquares() { }
  void operator()(const T& value) { sumOfSquares += value*value; }
  T Result() const { return sumOfSquares; }
  T sumOfSquares;
};

int main()
{
  std::vector<int> intVec;
  intVec.reserve(10);
  for( int idx = 0; idx < 10; ++idx )
  {
    intVec.push_back(idx);
  }
  // std::for_each returns a copy of the functor, which carries the accumulated state
  AccumulateSquareValues<int> sumOfSquare = std::for_each(intVec.begin(), intVec.end(), AccumulateSquareValues<int>() );
  std::cout << "The sum of squares for 0-9 is " << sumOfSquare.Result() << std::endl;
  return 0;
}
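Functors are also commonly passed to algorithms as comparison criteria. The sketch below uses the pre-built std::greater and a small hand-written comparison functor; ShorterThan and the two helper functions are hypothetical names used only here.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical comparison functor: orders strings by length, shortest first.
struct ShorterThan
{
    bool operator()(const std::string& a, const std::string& b) const
    {
        return a.size() < b.size();
    }
};

// std::greater replaces the default std::less, so the result is descending.
std::vector<int> sorted_descending(std::vector<int> values)
{
    std::sort(values.begin(), values.end(), std::greater<int>());
    assert(values.empty() || values.front() >= values.back());
    return values;
}

// stable_sort keeps words of equal length in their original relative order.
std::vector<std::string> sorted_by_length(std::vector<std::string> words)
{
    std::stable_sort(words.begin(), words.end(), ShorterThan());
    return words;
}
```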
STL Algorithms
The STL algorithms are there mainly to help the programmer manipulate collections, sets, or the elements in containers.
- The _if suffix
- The _copy suffix
- Non-modifying algorithms
- Modifying algorithms
- Removing algorithms
- Mutating algorithms
- Sorting algorithms
- Sorted range algorithms
- Numeric algorithms
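A brief sketch of what the _if and _copy suffixes mean in practice, together with one numeric algorithm. The helper names (is_even, count_even, without_even, sum_of) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

bool is_even(int n) { return n % 2 == 0; }

// _if suffix: the algorithm takes a predicate instead of a value to compare with.
std::ptrdiff_t count_even(const std::vector<int>& v)
{
    return std::count_if(v.begin(), v.end(), is_even);
}

// _copy suffix: the source range is left untouched and the result is
// written to a separate destination.
std::vector<int> without_even(const std::vector<int>& v)
{
    std::vector<int> out;
    std::remove_copy_if(v.begin(), v.end(), std::back_inserter(out), is_even);
    assert(out.size() <= v.size());
    return out;
}

// A numeric algorithm from <numeric>: adds up all elements, starting from 0.
int sum_of(const std::vector<int>& v)
{
    return std::accumulate(v.begin(), v.end(), 0);
}
```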
Allocators
Allocators are used by the Standard C++ Library (and particularly by the STL) to allow parameterization of memory allocation strategies.
The subject of allocators is somewhat obscure, and can safely be ignored by most application software developers. All standard library constructs that allow for specification of an allocator have a default allocator which is used if none is given by the user.
Custom allocators can be useful if the memory use of a piece of code is unusual in a way that leads to performance problems if used with the general-purpose default allocator. There are also other cases in which the default allocator is inappropriate, such as when using standard containers within an implementation of replacements for global operators new and delete.
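To give a rough idea of what a custom allocator looks like, here is a sketch of one that merely counts how often it is asked for memory. It assumes a C++11 or later compiler, where this reduced interface is sufficient; CountingAllocator and allocations_for_reserve are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical allocator that counts allocation requests and otherwise
// forwards to the global operators new and delete.
template <typename T>
struct CountingAllocator
{
    typedef T value_type;

    CountingAllocator() {}
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n)
    {
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    // One counter per element type T.
    static std::size_t& allocation_count()
    {
        static std::size_t count = 0;
        return count;
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Hypothetical helper: how many allocations does reserve(n) trigger?
std::size_t allocations_for_reserve(std::size_t n)
{
    const std::size_t before = CountingAllocator<int>::allocation_count();
    std::vector<int, CountingAllocator<int> > v;   // allocator passed as second template argument
    v.reserve(n);
    assert(v.capacity() >= n);
    return CountingAllocator<int>::allocation_count() - before;
}
```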
STL implementations
One widely used STL implementation is the Dinkumware STL library, which ships with Microsoft Visual Studio.
- SGI STL library (http://www.sgi.com/tech/stl/) free STL implementation.
- STLport STL library (http://www.stlport.com/) free and highly cross-platform implementation based on the SGI implementation.
- Dinkumware STL library (http://www.dinkumware.com/) commercial STL implementation.
Smart Pointers
Using raw pointers to store allocated data and then cleaning them up in the destructor is generally a very bad idea. Even temporarily storing allocated data in a raw pointer and then deleting it when done with it should be avoided. If your code throws an exception, it can be cumbersome to properly catch the exception and delete all allocated objects. Smart pointers can alleviate this headache by automatically deleting their contents when they go out of scope.
#include <iostream>
#include <memory>

class A
{
public:
  virtual ~A() {} // needed so that deleting a B through an A* is well defined
  virtual char val() = 0;
};

class B : public A
{
public:
  virtual char val() { return 'B'; }
};

A* get_a_new_b()
{
  return new B();
}

bool some_func()
{
  bool rval = true;
  std::auto_ptr<A> a( get_a_new_b() );
  try {
    std::cout << a->val();
  } catch(...) {
    if( !a.get() ) {
      throw "Memory allocation failure!";
    }
    rval = false;
  }
  return rval; // the auto_ptr deletes the B object when a goes out of scope
}
Semantics
The auto_ptr has semantics of strict ownership, meaning that the auto_ptr instance is the sole entity responsible for the object's lifetime. If an auto_ptr is copied, the source loses the reference. (This unusual copy behavior is one reason auto_ptr was deprecated in C++11 and removed in C++17.) For example:
#include <iostream>
#include <memory>
using namespace std;

int main(int argc, char **argv)
{
  int *i = new int;
  auto_ptr<int> x(i);
  auto_ptr<int> y;

  y = x;

  cout << x.get() << endl;
  cout << y.get() << endl;
}
This code will print a NULL address for the first auto_ptr object and some non-NULL address for the second, showing that the source object lost the reference during the assignment (=). The raw pointer i in the example should not be deleted, as it will be deleted by the auto_ptr that owns the reference. In fact, new int could be passed directly into x, eliminating the need for i.
Notice that the object pointed to by an auto_ptr is destroyed using operator delete; this means that you should only use auto_ptr for pointers obtained with operator new. This excludes pointers returned by malloc(), calloc() or realloc() and operator new[].
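In C++11 and later, std::unique_ptr provides the same strict ownership, but makes the transfer explicit and also handles arrays. The following is a sketch assuming a C++11 compiler; the function names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical helper: returns true if, after the move, the source is
// empty and the destination owns the int.
bool ownership_moves()
{
    std::unique_ptr<int> x(new int(42));
    std::unique_ptr<int> y;
    assert(x.get() != nullptr);

    // y = x;           // would not compile: a unique_ptr cannot be copied
    y = std::move(x);   // ownership is transferred explicitly

    return x.get() == nullptr && y.get() != nullptr && *y == 42;
}

// The array form releases its memory with delete[], which makes it
// suitable for memory obtained from operator new[].
int last_of_array()
{
    std::unique_ptr<int[]> values(new int[3]);
    values[0] = 1;
    values[1] = 2;
    values[2] = 3;
    return values[2];
}
```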
Exception Handling
When designing a programming task (a class or even a function), one cannot always assume that the task will run or complete correctly (exit with the result it was intended to produce). It may also be inappropriate for that task to report an error message, return an error code, or simply exit. To handle these cases, C++ provides language constructs that separate error handling and reporting code from ordinary code, that is, constructs that deal with these exceptions (errors and abnormalities). This uniform approach to program design is called exception handling.
An exception is said to be "thrown" at the place where some error or abnormal condition is detected. The throwing will cause the normal program flow to be aborted. Instead, execution of the program will resume at a designated block of code, called a "catch block", which encloses the point of throwing in terms of program execution. The catch block can be, and usually is, located in a different function/method than the point of throwing. In this way, C++ supports non-local error handling. Along with altering the program flow, throwing an exception passes an object to the catch block. This object can provide data that the handling code needs in order to decide how it should react to the exception.
Consider a code example for clarification:
void a_function()
{
// This function does not return normally,
// instead execution will resume at a catch block.
// The thrown object is in this case of the type char const*,
// i.e. it is a C-style string. More usually, exception
// objects are of class type.
throw "This is an exception!";
}
void another_function()
{
// To catch exceptions, you first have to introduce
// a try block via " try { ... } ". Then multiple catch
// blocks can follow the try block.
// " try { ... } catch(type 1) { ... } catch(type 2) { ... }"
try
{
a_function();
// Because the function throws an exception,
// the rest of the code in this block will not
// be executed
}
catch(char const* p_string) // This catch block
// will react on exceptions
// of type char const*
{
// Execution will resume here.
// You can handle the exception here.
}
// As can be seen
catch(...) // The ellipsis indicates that this
// block will catch exceptions of any type.
{
// In this example, this block will not be executed,
// because the preceding catch block is chosen to
// handle the exception.
}
}
try and catch block combination
Partial handling
Consider the following case:
void g()
{
throw "Exception";
}
void f() {
int* i = new int(0);
g();
delete i;
}
int main() {
f();
return 0;
}
Can you see the problem in this code? If g throws an exception, the variable i is never deleted and we have a memory leak.
To prevent the memory leak, f() must catch the exception and delete i. But f() can't handle the exception; it doesn't know how!
What is the solution then? f() should catch the exception, and then rethrow it:
void g()
{
throw "Exception";
}
void f() {
int* i = new int(0);
try
{
g();
}
catch (...)
{
delete i;
throw; // This empty throw rethrows the exception we caught
// An empty throw can only exist in a catch block
}
delete i;
}
int main() {
f();
return 0;
}
There's a better way though; see "Writing Exception-Safe Code" below for information on using "RAII" classes to avoid the need to write catch, which also explains why C++ can do better than "finally".
Exception hierarchy
You may throw as an exception an object (such as an instance of a class or a string), a pointer (like char*), or a primitive (like int). So which should you choose? You should throw objects, as they ease the handling of exceptions for the programmer. It is common to create a class hierarchy of exception classes:
- class MyApplicationException {};
- class MathematicalException : public MyApplicationException {};
- class DivisionByZeroException : public MathematicalException {};
- class InvalidArgumentException : public MyApplicationException {};
An example:
float divide(float numerator, float denominator)
{
if(denominator == 0.0)
throw DivisionByZeroException();
return numerator / denominator;
}
enum MathOperators {DIVISION, PRODUCT};
float operate(int action, float argLeft, float argRight)
{
if(action == DIVISION)
{
return divide(argLeft, argRight);
}
else if(action == PRODUCT)
{
// call the product function; a plain multiplication stands in for it here
return argLeft * argRight;
}
// No match for the action! action is an invalid argument
throw InvalidArgumentException();
}
int main(int argc, char* argv[]) {
try
{
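// argv[0] is the program name, so the arguments start at argv[1]
// (this assumes that three arguments were given)
operate(atoi(argv[1]), atof(argv[2]), atof(argv[3]));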
}
catch(MathematicalException& )
{
// Handle Error
}
catch(MyApplicationException& )
{
// This will catch InvalidArgumentException too.
// Display help to the user, and explain about the arguments.
}
return 0;
}
Throwing Objects
There are several ways to throw an exception object. Let's review them.
Throw a pointer to the object:
void foo()
{
throw new MyApplicationException();
}
void bar()
{
try
{
foo();
}
catch(MyApplicationException* e)
{
// Handle exception
}
}
But now, who is responsible for deleting the exception? The handler? This makes code uglier. There must be a better way!
How about this:
void foo()
{
throw MyApplicationException();
}
void bar()
{
try
{
foo();
}
catch(MyApplicationException e)
{
// Handle exception
}
}
Looks better! But now the catch handler catches the exception by value, meaning that a copy constructor is called. This can cause trouble if copying the exception itself fails, for example when the exception was raised because memory is exhausted. In such a situation, seemingly safe code that is assumed to handle memory allocation problems results in the program crashing with a failure of the exception handler. Catching by value also slices the object: if the thrown exception is of a class derived from MyApplicationException, the handler receives only a copy of the base class part. The correct approach is:
void foo()
{
throw MyApplicationException();
}
void bar()
{
try
{
foo();
}
catch(MyApplicationException const& e)
{
// Handle exception
}
}
This method has all the advantages - the compiler is responsible for destroying the object, and no copying is done at catch time!
The conclusion is that exceptions should be thrown by value, and caught by (usually const) reference.
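One practical difference between the two catch styles is slicing: a handler that catches a base class by value receives a copy of only the base part of the thrown object. The sketch below makes this visible; BaseError and DerivedError are hypothetical classes for this illustration. (Some compilers warn about the by-value handler, which is exactly the point.)

```cpp
#include <cassert>
#include <string>

// Hypothetical exception classes, used only in this illustration.
class BaseError
{
public:
    virtual ~BaseError() {}
    virtual std::string name() const { return "BaseError"; }
};

class DerivedError : public BaseError
{
public:
    virtual std::string name() const { return "DerivedError"; }
};

std::string caught_by_value()
{
    std::string result;
    try {
        throw DerivedError();
    } catch (BaseError e) {          // the copy is sliced down to a BaseError
        result = e.name();
    }
    assert(!result.empty());
    return result;                   // "BaseError"
}

std::string caught_by_reference()
{
    std::string result;
    try {
        throw DerivedError();
    } catch (BaseError const& e) {   // refers to the original DerivedError
        result = e.name();
    }
    assert(!result.empty());
    return result;                   // "DerivedError"
}
```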
Stack unwinding
Consider the following code
void g()
{
throw std::exception();
}
void f()
{
std::string str = "Hello"; // This string is newly allocated
g();
}
int main()
{
try
{
f();
}
catch(...)
{ }
}
The flow of the program:
- main() calls f()
- f() creates a local variable named str
- str constructor allocates a memory chunk to hold the string "Hello"
- f() calls g()
- g() throws an exception
- f() does not catch the exception.
- Because the exception was not caught, we now need to exit f() in a clean fashion.
- At this point, the destructors of all local variables constructed before the throw are called. This is called 'stack unwinding'.
- The destructor of str is called, which releases the memory occupied by it.
- As you can see, the mechanism of 'stack unwinding' is essential to prevent resource leaks - without it, str would never be destroyed, and the memory it used would be lost forever.
- main() catches the exception
- The program continues.
'Stack unwinding' guarantees that the destructors of local variables (stack variables) will be called when we leave their scope.
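The order of events during stack unwinding can be observed with a small tracing class. This is a sketch; Tracer, thrower and unwinding_log are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical tracing class: records its construction and destruction
// by appending to a log string owned by the caller.
class Tracer
{
public:
    Tracer(std::string& log, char id) : log_(log), id_(id) { log_ += id_; }
    ~Tracer() { log_ += '~'; log_ += id_; }
private:
    std::string& log_;
    char id_;
};

void thrower(std::string& log)
{
    Tracer a(log, 'a');
    Tracer b(log, 'b');
    throw std::runtime_error("unwind");
    // b and then a are destroyed while the exception leaves this function
}

std::string unwinding_log()
{
    std::string log;
    try {
        thrower(log);
    } catch (const std::exception&) {
        log += '!';                  // runs only after unwinding has finished
    }
    assert(!log.empty());
    return log;                      // "ab~b~a!"
}
```

The log shows that local objects are destroyed in reverse order of construction, and that all of this happens before the catch block starts executing.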
Writing exception safe code
Guards
If you plan to use exceptions in your code (and you should), you must always try to write your code in an exception safe manner. Let's see some of the problems that can occur:
Consider the following code:
void g()
{
  throw std::exception();
}

void f()
{
  int *i = new int(2);
  *i = 3;
  g();
  // Oops, if an exception is thrown, i is never deleted
  // and we have a memory leak
  delete i;
}

int main()
{
  try
  {
    f();
  }
  catch(...)
  { }
  return 0;
}
Can you see the problem in this code? When an exception is thrown, we will never run the line that deletes i!
What's the solution to this? Earlier we saw a solution based on f()'s ability to catch and re-throw. But there is a neater solution using the 'stack unwinding' mechanism. 'Stack unwinding' only applies to destructors of objects, so how can we use it?
We can write a simple wrapper class:
// Note: This type of class is best implemented using templates, discussed in the next chapter.
class IntDeleter {
public:
  IntDeleter(int* value)
  {
    m_value = value;
  }

  ~IntDeleter()
  {
    delete m_value;
  }

  // operator *, enables us to dereference the object and use it
  // like a regular pointer.
  int& operator *()
  {
    return *m_value;
  }

private:
  int *m_value;
};
The new version of f():
void f()
{
  IntDeleter i(new int(2));
  *i = 3;
  g();
  // No need to delete i, this will be done in destruction.
  // This code is also exception safe.
}
The pattern presented here is called a guard. A guard is very useful in other cases, and it can also help us make our code more exception safe. The guard pattern is similar to a finally block in other languages, like Java.
Note that the C++ Standard Library provides a templated guard by the name of auto_ptr (replaced by unique_ptr in C++11 and later).
Guide-lines
Because it is hard to write exception safe code, you should only use an exception when you have to - when an error has occurred which you can not handle. Do not use exceptions for the normal flow of the program. This example is WRONG.
void sum(int a, int b)
{
  throw a+b;
}

int main()
{
  int result;
  try
  {
    sum(2,3);
  }
  catch(int tmpResult)
  {
    // Here the exception is used instead of a return value!
    // This is wrong!
    result = tmpResult;
  }
  return 0;
}
Exceptions in constructors and destructors
When an exception is thrown from a constructor, the object is not considered instantiated, and therefore its destructor will not be called.
What happens when we allocate this object with new ?
- Memory for the object is allocated
- The object's constructor throws an exception
- The object was not instantiated due to the exception
- The memory occupied by the object is deleted
- The exception is propagated, until it is caught
The main purpose of throwing an exception from a constructor is to inform the program/user that the creation and initialization of the object did not finish correctly. This is a very clean way of providing this important information, as constructors do not return a separate value containing some error code (as an initialization function would).
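The sequence above can be traced with a class whose constructor always throws. In this sketch (Part, Whole and construction_log are hypothetical names) the destructor of the partially built object never runs, but the destructor of its already constructed member does.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical member class that records its lifetime in a log.
class Part
{
public:
    explicit Part(std::string& log) : log_(log) { log_ += "Part "; }
    ~Part() { log_ += "~Part "; }
private:
    std::string& log_;
};

// Hypothetical class whose constructor fails after its member was built.
class Whole
{
public:
    explicit Whole(std::string& log) : log_(log), part_(log)
    {
        throw std::runtime_error("constructor failed");
    }
    ~Whole() { log_ += "~Whole "; }   // never runs in this example
private:
    std::string& log_;
    Part part_;
};

std::string construction_log()
{
    std::string log;
    try {
        Whole* w = new Whole(log);    // the memory is released automatically
        delete w;                     // never reached
    } catch (const std::runtime_error&) {
        log += "caught";
    }
    assert(!log.empty());
    return log;                       // "Part ~Part caught"
}
```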
In contrast, it is strongly recommended not to throw exceptions inside a destructor. It is important to note when a destructor is called:
- as part of a normal deallocation (exit from a scope, delete)
- as part of a stack unwinding that handles a previously thrown exception.
In the former case, throwing an exception inside a destructor can cause memory leaks because the object is not completely deallocated. In the latter, the code must be more clever. If an exception is thrown during the stack unwinding caused by another exception, there is no way to choose which exception to handle first. This is interpreted as a failure of the exception handling mechanism, and the program calls the function terminate.
To address this problem, it is possible to test whether the destructor was called as part of an exception handling process. To this end, one can use the standard library function uncaught_exception, which returns true if an exception has been thrown but has not been caught yet. (This function was deprecated in C++17 and removed in C++20 in favor of uncaught_exceptions.) All code executed in such a situation must not throw another exception.
Situations where such careful coding is necessary are extremely rare. It is far safer and easier to debug if the code was written in such a way that destructors did not throw exceptions at all.
Templates
Templates are a way to make code more reusable. Trivial examples include creating generic data structures which can store arbitrary data types. Templates are of great utility to programmers, especially when combined with multiple inheritance and operator overloading. The Standard Template Library (STL) provides many useful functions within a framework of connected templates.
As the templates are very expressive they may be used for things other than generic programming. One such use is called template metaprogramming, which is a way of pre-evaluating some of the code at compile-time rather than run-time. Further discussion here only relates to templates as a method of generic programming.
By now you should have noticed that functions that perform the same tasks tend to look similar. For example, a function that prints an int must declare its parameter as an int. This strictness reduces the possibility of error in your code; however, it gets somewhat annoying to have to create different versions of functions just to handle all the different data types you use. For example, you may want the function to simply print the input variable, regardless of what type that variable is. Writing a different function for every possible input type (double, char *, etc.) would be extremely cumbersome. That's where templates come in.
Templates solve some of the same problems as macros, generate "optimized" code at compile time, but are subject to C++'s strict type checking.
Parameterized types, better known as templates, allow the programmer to create one function that can handle many different types. Instead of having to take into account every data type, you have one arbitrary parameter name that the compiler then replaces with the different data types that you wish the function to use, manipulate, etc.
- Templates are instantiated at compile-time with the source code.
- Templates are type safe.
- Templates allow user-defined specialization.
- Templates allow non-type parameters.
- Templates use “lazy structural constraints”.
- Templates support mix-ins.
- Syntax for Templates
Templates are pretty easy to use, just look at the syntax:
template <class TYPEPARAMETER>
(or, equivalently, and preferred by some)
template <typename TYPEPARAMETER>
- Function Template
There are two kinds of templates: function templates and class templates. A function template behaves like a function that can accept arguments of many different types. For example, the Standard Template Library contains the function template max(x, y), which returns either x or y, whichever is larger. max() could be defined like this:
template <typename TYPEPARAMETER>
TYPEPARAMETER max(TYPEPARAMETER x, TYPEPARAMETER y)
{
if (x < y)
return y;
else
return x;
}
This template can be called just like a function:
std::cout << max(3, 7); // outputs 7
The compiler determines by examining the arguments that this is a call to max(int, int) and instantiates a version of the function where the type TYPEPARAMETER is int.
This works whether the arguments x and y are integers, strings, or any other type for which it makes sense to say "x < y". If you have defined your own data type, you can use operator overloading to define the meaning of < for your type, thus allowing you to use the max() function. While this may seem a minor benefit in this isolated example, in the context of a comprehensive library like the STL it allows the programmer to get extensive functionality for a new data type, just by defining a few operators for it. Merely defining < allows a type to be used with the standard sort(), stable_sort(), and binary_search() algorithms; data structures such as sets, heaps, and associative arrays; and more.
As a counterexample, the standard type complex does not define the < operator, because there is no strict order on complex numbers. Therefore max(x, y) will fail with a compile error if x and y are complex values. Likewise, other templates that rely on < cannot be applied to complex data. Unfortunately, compilers historically generate somewhat esoteric and unhelpful error messages for this sort of error. Ensuring that a certain object adheres to a method protocol can alleviate this issue.
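As a sketch of how much a single operator< buys for a user-defined type, consider the following. Version, larger and sorted_versions are hypothetical names; larger has the same shape as the max() template shown earlier and is renamed only to avoid clashing with the library's own max.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical user-defined type.
struct Version
{
    int major_no;
    int minor_no;
};

// The only operator defined for Version: compare major first, then minor.
bool operator<(const Version& a, const Version& b)
{
    if (a.major_no != b.major_no) return a.major_no < b.major_no;
    return a.minor_no < b.minor_no;
}

// Same shape as the max() function template in the text.
template <typename T>
T larger(T x, T y)
{
    return (x < y) ? y : x;
}

// std::sort uses operator< by default, so it works for Version as well.
std::vector<Version> sorted_versions(std::vector<Version> v)
{
    std::sort(v.begin(), v.end());
    assert(v.empty() || !(v.back() < v.front()));
    return v;
}
```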
TYPEPARAMETER is just an arbitrary name for the type parameter that you want to use in your function. Some programmers prefer using just T in place of TYPEPARAMETER.
Let's say you want to create a swap function that can handle more than one data type...something that looks like this:
template <class SOMETYPE>
void swap (SOMETYPE &x, SOMETYPE &y)
{
  SOMETYPE temp = x;
  x = y;
  y = temp;
}
The function you see above looks really similar to any other swap function, with the differences being the template <class SOMETYPE> line before the function definition and the instances of SOMETYPE in the code. Everywhere you would normally need to have the name or class of the datatype that you're using, you now replace with the arbitrary name that you used in the template <class SOMETYPE>. For example, if you had SUPERDUPERTYPE instead of SOMETYPE, the code would look something like this:
template <class SUPERDUPERTYPE>
void swap(SUPERDUPERTYPE &x, SUPERDUPERTYPE &y)
{
    SUPERDUPERTYPE temp = x;
    x = y;
    y = temp;
}
As you can see, you can use whatever label you wish for the template type parameter, as long as it is not a reserved word.
- Class Template
A class template extends the same concept to classes. Class templates are often used to make generic containers. For example, the STL has a linked list container. To make a linked list of integers, one writes list<int>. A list of strings is denoted list<string>. A list has a set of standard functions associated with it, which work no matter what you put between the brackets.
If you want to have more than one template type parameter, then the syntax would be:
template <class SOMETYPE1, class SOMETYPE2, ...>
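A small sketch of a template with two type parameters, assuming a made-up Pair class and a helper called make_pair_of (both hypothetical; the standard library offers std::pair in <utility> for real code):

```cpp
#include <string>

// Hypothetical class template with two independent type parameters.
template <class FIRST, class SECOND>
struct Pair {
    FIRST first;
    SECOND second;

    Pair(const FIRST& f, const SECOND& s) : first(f), second(s) {}
};

// Function template with two type parameters. Both are deduced from the
// arguments, so the caller does not have to spell them out.
template <class FIRST, class SECOND>
Pair<FIRST, SECOND> make_pair_of(const FIRST& f, const SECOND& s) {
    return Pair<FIRST, SECOND>(f, s);
}
```

With these definitions, Pair<int, std::string> p(1, "one"); names both types explicitly, while make_pair_of(2, 2.5) lets the compiler deduce Pair<int, double>.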
- Templates and Classes
Let's say that rather than create a simple templated function, you would like to use templates for a class, so that the class may handle more than one datatype. You may have noticed that some classes are able to accept a type as a parameter and create variations of an object based on that type (for example the classes of the STL container class hierarchy). This is because they are declared as templates using syntax not unlike the one presented below:
template <class T>
class Foo
{
public:
    Foo();
    void some_function();
    T some_other_function();
private:
    int member_variable;
    T parametrized_variable;
};
Defining member functions of a class template is somewhat like defining a function template, except that you use the scope resolution operator to indicate that this is a member function of the class template. The one important and non-obvious detail is that the template parameter list, containing the parametrized type name, must follow the class name (Foo<T>::, not Foo::).
The following example describes the required syntax by defining functions from the example class above.
template <class T>
Foo<T>::Foo()
{
    member_variable = 0;
}
template <class T>
void Foo<T>::some_function()
{
    std::cout << "member_variable = " << member_variable << std::endl;
}
template <class T>
T Foo<T>::some_other_function()
{
    return parametrized_variable;
}
As you may have noticed, if you want to declare a function that will return an object of the parametrized type, you just have to use the name of that parameter as the function's return type.
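The same declaration-then-definition pattern can be sketched with a container-like class. The Stack class below is a hypothetical example written for this section, not a standard library type, and it assumes pop() is never called on an empty stack.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal stack: declared here, members defined below.
template <class T>
class Stack {
public:
    void push(const T& value);
    T pop();  // returns an object of the parametrized type
    std::size_t size() const;

private:
    std::vector<T> items;
};

// Each definition repeats "template <class T>" and names the class as
// Stack<T>, not just Stack.
template <class T>
void Stack<T>::push(const T& value) {
    items.push_back(value);
}

template <class T>
T Stack<T>::pop() {
    // Precondition (assumed, not checked): the stack is not empty.
    T top = items.back();
    items.pop_back();
    return top;
}

template <class T>
std::size_t Stack<T>::size() const {
    return items.size();
}
```

A Stack<int> and a Stack<double> created from this template are separate classes generated from the same source, each with its own push(), pop() and size().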
Advantages and disadvantages
Some uses of templates, such as the max() function, were previously filled by function-like preprocessor macros.
// a max() macro
#define max(a,b) ((a) < (b) ? (b) : (a))
Both macros and templates are expanded at compile time. Macros are always expanded inline; templates can also be expanded as inline functions when the compiler deems it appropriate. Thus both function-like macros and function templates have no run-time overhead.
However, templates are generally considered an improvement over macros for these purposes. Templates are type-safe. Templates avoid some of the common errors found in code that makes heavy use of function-like macros. Perhaps most importantly, templates were designed to be applicable to much larger problems than macros. The definition of a function-like macro must fit on a single logical line of code.
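One of the common macro errors alluded to above is double evaluation of an argument. The sketch below contrasts the two approaches; MAX_MACRO and max_template are names chosen here to avoid clashing with max, and the two helper functions exist only to make the effect observable.

```cpp
// Function-like macro with the same body as the max() macro in the text.
#define MAX_MACRO(a, b) ((a) < (b) ? (b) : (a))

// Equivalent function template: each argument is evaluated exactly once.
template <typename T>
T max_template(T x, T y) {
    return x < y ? y : x;
}

// Returns the final value of i after MAX_MACRO(i++, 3), starting from 5.
int increments_with_macro() {
    int i = 5;
    // Expands to ((i++) < (3) ? (3) : (i++)), so i++ runs twice.
    int result = MAX_MACRO(i++, 3);
    (void)result;  // result is 6 here, not the expected 5
    return i;
}

// Returns the final value of i after max_template(i++, 3), starting from 5.
int increments_with_template() {
    int i = 5;
    int result = max_template(i++, 3);  // i++ runs once
    (void)result;                       // result is 5
    return i;
}
```

increments_with_macro() returns 7 because the macro pastes i++ into the expression twice, while increments_with_template() returns 6.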
There are three primary drawbacks to the use of templates. First, many compilers historically have very poor support for templates, so the use of templates can make code somewhat less portable. Second, almost all compilers produce confusing, unhelpful error messages when errors are detected in template code. This can make templates difficult to develop. Third, each use of a template may cause the compiler to generate extra code (an instantiation of the template), so the indiscriminate use of templates can lead to code bloat, resulting in excessively large executables.
Another disadvantage of templates is that it is impossible to write a template that exactly replaces a #define like max, which acts identically with dissimilar types or function calls. Templates have replaced #defines for complex functions but not for simple things like max(a,b). For a full discussion of trying to create a template for the #define max, see the paper "Min, Max and More" that Scott Meyers wrote for C++ Report in January 1995.
The biggest advantage of using templates is that a complex algorithm can have a simple interface that the compiler then uses to choose the correct implementation based on the type of the arguments. For instance, a searching algorithm can take advantage of the properties of the container being searched. This technique is used throughout the C++ standard library.
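One common way to let the compiler pick an implementation from the argument type is tag dispatch on the iterator category. The sketch below is an illustration under that assumption; steps_between and steps_impl are hypothetical names, and the standard library already provides std::distance for this job.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Implementation for random-access iterators: a single subtraction.
template <class Iter>
std::ptrdiff_t steps_impl(Iter first, Iter last,
                          std::random_access_iterator_tag) {
    return last - first;
}

// Fallback for weaker iterators: walk the range one step at a time.
template <class Iter>
std::ptrdiff_t steps_impl(Iter first, Iter last, std::input_iterator_tag) {
    std::ptrdiff_t n = 0;
    for (; first != last; ++first)
        ++n;
    return n;
}

// One simple interface; the iterator category selects the overload.
template <class Iter>
std::ptrdiff_t steps_between(Iter first, Iter last) {
    typedef typename std::iterator_traits<Iter>::iterator_category category;
    return steps_impl(first, last, category());
}
```

Called with std::vector iterators, steps_between() resolves to the subtraction version; with std::list iterators, whose category only converts to std::input_iterator_tag, it resolves to the loop. The caller writes the same expression in both cases.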
Linkage problems
When linking a template-based program consisting of several modules spread over a couple of files, it is a frequent and mystifying situation to find that the object code of the modules won't link due to 'unresolved reference to (insert template member function name here) in (...)'. The offending function's implementation is there, so why is it missing from the object code? Let's stop a moment and consider how this can be possible.
Assume you have created a template based class called Foo and put its declaration in the file Util.hpp along with some other regular class called Bar:
template <class T>
class Foo
{
public:
    Foo();
    T some_function();
    T some_other_function();
    T some_yet_other_function();
    T member;
};
class Bar
{
    Bar();
    void do_something();
};
Now, to adhere to all the rules of the art, you create a file called Util.cc, where you put all the function definitions, template or otherwise:
#include "Util.hpp"
template <class T>
T Foo<T>::some_function()
{
    ...
}
template <class T>
T Foo<T>::some_other_function()
{
    ...
}
template <class T>
T Foo<T>::some_yet_other_function()
{
    ...
}
and, finally:
void Bar::do_something()
{
    Foo<int> my_foo;
    int x = my_foo.some_function();
    int y = my_foo.some_other_function();
}
Next, you compile the module; there are no errors and you are happy. But suppose there's another (main) module in the program, which resides in MyProg.cc:
#include "Util.hpp" // imports our utility classes' declarations, including the template
int main()
{
    Foo<int> main_foo;
    int z = main_foo.some_yet_other_function();
    return 0;
}
This also compiles cleanly to object code. Yet when you try to link the two modules together, you get an error saying there's an undefined reference to Foo<int>::some_yet_other_function() in MyProg.cc. You defined the template member function correctly, so what is the problem?
As you remember, templates are instantiated at compile-time. This helps avoid code bloat, which would be the result of generating all the template class and function variants for all possible types as its parameters. So, when the compiler processed the Util.cc code, it saw that the only variant of the Foo class was Foo<int>, and the only needed functions were:
int Foo<int>::some_function();
int Foo<int>::some_other_function();
No code in Util.cc required any other variants of Foo or its methods to exist, so the compiler generated no code other than that. There's no implementation of some_yet_other_function() in the object code, just as there's no implementation for
double Foo<double>::some_function();
or
string Foo<string>::some_function();
The MyProg.cc code compiled without errors, because the member function of Foo it uses is correctly declared in the Util.hpp header, and it is expected that it will be available upon linking. But it's not and hence the error, and a lot of nuisance if you are new to templates and start looking for errors in your code, which ironically is perfectly correct.
The solution is somewhat compiler dependent. For the GNU compiler, older releases offer the -frepo flag (it was removed in GCC 10), and reading the template-related section of 'info gcc' (node "Template Instantiation": "Where's the Template?") may prove enlightening. In Borland, supposedly, there's a selection in the linker options which activates 'smart' templates just for this kind of problem.
The other thing you may try is to force the instantiations yourself, loosely called explicit instantiation here. What you do is create some dummy code in the module with the templates, which creates all variants of the template class and calls all variants of its member functions that you know are needed elsewhere. Obviously, this requires you to know a lot about what variants you need throughout your code. Standard C++ also provides a true explicit instantiation directive, such as template class Foo<int>;, which achieves the same effect without dummy code. In our simple example the dummy-code approach would go like this:
1. Add the following class declaration to Util.hpp:
class Instantiations
{
private:
    void Instantiate();
};
2. Add the following member function definition to Util.cc:
void Instantiations::Instantiate()
{
    Foo<int> my_foo;
    my_foo.some_yet_other_function();
    // other explicit instantiations may follow
}
Of course, you never need to actually instantiate the Instantiations class, or call any of its methods. The fact that they just exist in the code makes the compiler generate all the template variations which are required. Now the object code will link without problems.
There's still one more solution, if not an elegant one: just move all the template functions' definition code to the Util.hpp header file. This is not pretty, because header files are for declarations and the implementation is supposed to be defined elsewhere, but it does the trick in this situation, and in practice it is the most common way to organize template code. While compiling MyProg.cc (and any other modules which include Util.hpp), the compiler will generate all the template variants which are needed, because the definitions are readily available.
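Standard C++ also has a dedicated explicit instantiation directive that does this job without dummy code. The following is a minimal single-file sketch, assuming the Foo layout from the example; the member function bodies are invented here purely for illustration, since the text leaves them elided, and the comments mark which file each part would live in.

```cpp
// --- would live in Util.hpp (layout assumed from the example) ---
template <class T>
class Foo {
public:
    Foo();
    T some_function();
    T some_yet_other_function();
    T member;
};

// --- would live in Util.cc ---
// Placeholder bodies, invented for this sketch.
template <class T>
Foo<T>::Foo() : member() {}

template <class T>
T Foo<T>::some_function() {
    return member;
}

template <class T>
T Foo<T>::some_yet_other_function() {
    return member + 1;
}

// Explicit instantiation definition: asks the compiler to emit every
// member of Foo<int> into this object file, whether or not this file
// happens to use them.
template class Foo<int>;
```

With the last line present in Util.cc, another module that only sees the declarations in Util.hpp can call Foo<int>::some_yet_other_function() and the linker will find it. One such line is needed per type that the rest of the program uses.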
Template Metaprogramming Overview
Template metaprogramming (TMP) refers to uses of the C++ template system to perform computation at compile-time within the code. It can, for the most part, be considered to be "programming with types" -- in that, largely, the "values" that TMP works with are specific C++ types.
Because template metaprogramming is something of an unintended use of C++'s template system, it is frequently somewhat cumbersome, though powerful. It also challenges the capabilities of older compilers; generally speaking, compilers from around the year 2000 and later are able to deal with much practical TMP code.
Traits classes could also be considered a primitive form of template metaprogramming; given input of a type, they compute as output properties associated with that type (for example, std::iterator_traits<> takes an iterator type as input, and computes properties such as the iterator's difference_type, value_type and so on). More sophisticated TMP came later.
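As a small sketch of a traits class in use, the function below asks std::iterator_traits for the element type of whatever iterator it is given. The name sum_range is hypothetical, and the sketch assumes the element type can be value-initialized and supports +=.

```cpp
#include <iterator>
#include <list>

// Sums a range without being told the element type: std::iterator_traits
// computes value_type from the iterator type at compile time.
template <class Iter>
typename std::iterator_traits<Iter>::value_type
sum_range(Iter first, Iter last) {
    typedef typename std::iterator_traits<Iter>::value_type value_type;
    value_type total = value_type();  // value-initialized: 0 for numbers
    for (; first != last; ++first)
        total += *first;
    return total;
}
```

The same function works for plain pointers into an array of int and for std::list<double> iterators; in each case the return type is computed from the iterator type rather than written by hand.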
History of TMP
Historically TMP is something of an accident; it was discovered during the process of standardizing the C++ language that its template system happens to be Turing-complete, i.e., capable in principle of computing anything that is computable. The first concrete demonstration of this was a program written by Erwin Unruh which computed prime numbers although it did not actually finish compiling: the list of prime numbers was part of an error message generated by the compiler on attempting to compile the code. TMP has since advanced considerably, and is now a practical tool for library builders in C++, though its complications mean that it is not generally appropriate for the majority of applications or systems programming contexts.
template <int p, int i>
class is_prime {
public:
    enum { prim = (p==2) || (p%i) && is_prime<(i>2?p:0),i-1>::prim };
};
template<>
class is_prime<0,0> {
public:
    enum {prim=1};
};
template<>
class is_prime<0,1> {
public:
    enum {prim=1};
};
template <int i>
class D {
public:
    D(void*);
};
template <int i>
class Prime_print { // primary template for loop to print prime numbers
public:
    Prime_print<i-1> a;
    enum { prim = is_prime<i,i-1>::prim };
    void f() {
        D<i> d = prim ? 1 : 0;
        a.f();
    }
};
template<>
class Prime_print<1> { // full specialization to end the loop
public:
    enum {prim=0};
    void f() {
        D<1> d = prim ? 1 : 0;
    };
};
#ifndef LAST
#define LAST 18
#endif
int main() {
    Prime_print<LAST> a;
    a.f();
}
Example: Compile-time Factorial
Calculating factorials is naturally done recursively: 0! = 1, and for n>0, n! = n*(n-1)!. In C++ TMP, this corresponds to a class template "factorial" whose general form uses the recurrence relation, and a specialization of which terminates the recursion.
First, the general (unspecialized) template says that factorial<n>::value is given by n*factorial<n-1>::value:
template <unsigned n>
struct factorial
{
    enum { value = n * factorial<n-1>::value };
};
Next, the specialization for zero says that factorial<0>::value evaluates to 1:
template <>
struct factorial<0>
{
    enum { value = 1 };
};
And now some code that "calls" the factorial template at compile-time:
int main()
{
    // Because calculations are done at compile-time, they can be
    // used for things such as array sizes.
    int array[ factorial<7>::value ];
}
Example: Compile-time "If"
The following code defines a meta-function called "if_"; this is a class template that can be used to choose between two types based on a compile-time constant, as demonstrated in main below:
template <bool Condition, typename TrueResult, typename FalseResult>
class if_;
template <typename TrueResult, typename FalseResult>
struct if_<true, TrueResult, FalseResult>
{
    typedef TrueResult result;
};
template <typename TrueResult, typename FalseResult>
struct if_<false, TrueResult, FalseResult>
{
    typedef FalseResult result;
};
int main()
{
    if_<true, int, void*>::result number(3);
    if_<false, int, void*>::result pointer(&number);
}
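A slightly more practical sketch of the same meta-function selects a type from a compile-time size test. The if_ template is repeated so the fragment stands alone, and storage_for is a hypothetical name introduced here. Since C++11 the standard library offers std::conditional in <type_traits> for the same purpose.

```cpp
// if_ as defined in the text, repeated so this sketch is self-contained.
template <bool Condition, typename TrueResult, typename FalseResult>
struct if_;

template <typename TrueResult, typename FalseResult>
struct if_<true, TrueResult, FalseResult> {
    typedef TrueResult result;
};

template <typename TrueResult, typename FalseResult>
struct if_<false, TrueResult, FalseResult> {
    typedef FalseResult result;
};

// Hypothetical meta-function: picks unsigned char when Bytes fits in one,
// and unsigned long otherwise. The choice is made entirely at compile time.
template <unsigned Bytes>
struct storage_for {
    typedef typename if_<(Bytes <= sizeof(unsigned char)),
                         unsigned char,
                         unsigned long>::result type;
};
```

Here storage_for<1>::type is unsigned char and storage_for<2>::type is unsigned long; no run-time branch is involved in either case.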
Building Blocks
Conventions for "Structured" TMP
Run-Time Type Information (RTTI)
RTTI refers to the ability of the system to report on the dynamic type of an object and to provide information about that type at runtime (as opposed to at compile time).
dynamic_cast
dynamic_cast allows down-casts of polymorphic types—in other words casting a base type to a type lower in the hierarchy.
target_type* ptr = dynamic_cast<target_type*>(pointer_expression);
target_type& ref = dynamic_cast<target_type&>(reference_expression);
Let's say that we have the following class hierarchy:
class Interface
{
public:
    virtual void GenericOp() = 0;
};
class SpecificClass : public Interface
{
public:
    virtual void GenericOp();
    virtual void SpecificOp();
};
Let's say that we also have a pointer of type Interface*, like so:
Interface* ptr_interface;
But suppose that we know for sure that this pointer points to an object of type SpecificClass, and we would like to call the member SpecificOp() of that class. To dynamically convert to a derived type we can use dynamic_cast, like so:
SpecificClass* ptr_specific = dynamic_cast<SpecificClass*>(ptr_interface);
With this statement, the program converts the base class pointer to a derived class pointer and allows the derived class members to be called. Be very careful, however: if the object that the pointer refers to is not actually of the target type (or of a type derived from it), then dynamic_cast will return a null pointer, so check the result before using it.
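A minimal sketch of checking the result of a pointer dynamic_cast follows. The class hierarchy has the shape used in the text, but the function bodies, the specific_calls counter, OtherClass and try_specific_op are assumptions added for illustration.

```cpp
// Hierarchy in the shape used in the text; the bodies are assumed.
class Interface {
public:
    virtual ~Interface() {}
    virtual void GenericOp() = 0;
};

class SpecificClass : public Interface {
public:
    SpecificClass() : specific_calls(0) {}
    virtual void GenericOp() {}
    virtual void SpecificOp() { ++specific_calls; }
    int specific_calls;  // counts calls, only to make the effect visible
};

// Hypothetical second implementation, used to make the cast fail.
class OtherClass : public Interface {
public:
    virtual void GenericOp() {}
};

// Calls SpecificOp() only when the object really is a SpecificClass.
// Returns true if the call was made.
bool try_specific_op(Interface* ptr_interface) {
    SpecificClass* ptr_specific = dynamic_cast<SpecificClass*>(ptr_interface);
    if (ptr_specific == 0)  // cast failed: not a SpecificClass
        return false;
    ptr_specific->SpecificOp();
    return true;
}
```

Passing the address of a SpecificClass object returns true and calls SpecificOp(); passing the address of an OtherClass object returns false without dereferencing anything.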
We can also use dynamic_cast with references.
SpecificClass& ref_specific = dynamic_cast<SpecificClass&>(ref_interface);
This works almost in the same way as pointers. However, if the real type of the object being cast is not correct then dynamic_cast will not return null (there's no such thing as a null reference). Instead, it will throw a std::bad_cast exception.
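The reference form can be sketched the same way, with the failure reported through an exception instead of a null result. As before, the class bodies, OtherClass and the helper is_specific are assumptions made for this illustration.

```cpp
#include <typeinfo>  // std::bad_cast

// Hierarchy in the shape used in the text; the bodies are assumed.
class Interface {
public:
    virtual ~Interface() {}
    virtual void GenericOp() = 0;
};

class SpecificClass : public Interface {
public:
    virtual void GenericOp() {}
    virtual void SpecificOp() {}
};

// Hypothetical second implementation, used to make the cast fail.
class OtherClass : public Interface {
public:
    virtual void GenericOp() {}
};

// Returns true if the referenced object is a SpecificClass.
bool is_specific(Interface& ref_interface) {
    try {
        SpecificClass& ref_specific =
            dynamic_cast<SpecificClass&>(ref_interface);
        ref_specific.SpecificOp();
        return true;
    } catch (const std::bad_cast&) {
        return false;  // the object has some other dynamic type
    }
}
```

When the failed cast is an expected outcome rather than an error, the pointer form with a null check is usually the simpler choice, since it avoids the exception path.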
typeid
The typeid operator returns information about a specific type. Its RTTI use looks like
const std::type_info& info = typeid(object_expression);
Sometimes we need to know the exact type of an object. The typeid operator returns a reference to a standard class std::type_info that contains information about the type. This class provides some useful members including the == and != operators. The most interesting method is probably:
const char* std::type_info::name() const;
This member function returns a pointer to a C-style string with the name of the object type. For example, using the classes from our earlier example:
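const std::type_info &info = typeid(*ptr_interface);
std::cout << info.name() << std::endl;
This program would print a name identifying SpecificClass, because that is the dynamic type of the object that ptr_interface points to. The exact text is implementation-defined; some compilers print a decorated form of the name rather than SpecificClass itself. Note the dereference: typeid(ptr_interface) would instead describe the static type of the pointer, Interface*.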
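Because the text returned by name() varies between compilers, comparing type_info objects is a more portable check. The sketch below shows the difference between applying typeid to a pointer and to the object it points to; the class bodies and both helper functions are assumptions added for illustration.

```cpp
#include <typeinfo>

// Hierarchy in the shape used in the text; the bodies are assumed.
class Interface {
public:
    virtual ~Interface() {}
    virtual void GenericOp() = 0;
};

class SpecificClass : public Interface {
public:
    virtual void GenericOp() {}
};

// True when the object behind the pointer is exactly a SpecificClass.
// Precondition (assumed): ptr_interface is not null.
bool points_to_specific(Interface* ptr_interface) {
    return typeid(*ptr_interface) == typeid(SpecificClass);
}

// typeid applied to the pointer itself yields the static type Interface*,
// no matter what the pointer refers to.
bool pointer_type_is_static(Interface* ptr_interface) {
    return typeid(ptr_interface) == typeid(Interface*);
}
```

Both functions return true for a pointer to a SpecificClass object: the dereferenced expression reports the dynamic type, while the pointer expression reports only its declared type.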
typeid is actually an operator rather than a function, as it can also act on types:
const std::type_info& info = typeid(type);
for example (and somewhat circularly)
const std::type_info& info = typeid(std::type_info);
will give a type_info object which describes type_info objects. This latter use is not RTTI, but rather CTTI (compile-time type identification).
Limitations
There are some limitations to RTTI. First, dynamic_cast down-casts and the dynamic type lookup of typeid only work with polymorphic types. That means that your classes must have at least one virtual function, either directly or through inheritance. Second, because of the additional information required to store types, some compilers require a special switch to enable RTTI.
Note that typeid ignores references, so a reference to a pointer is reported as the pointer type:
void example( int*& refptrTest )
{
    std::cout << "What type is *&refptrTest : " << typeid( refptrTest ).name() << std::endl;
}
This will report a name for int*, because typeid applied to a reference describes the referenced type.
Misuses of RTTI
RTTI should only be used sparingly in C++ programs. There are several reasons for this. Most importantly, other language mechanisms such as polymorphism and templates are almost always superior to RTTI. As with everything, there are exceptions, but the usual rule concerning RTTI is more or less the same as with goto statements. Do not use it as a shortcut around proper, more robust design. Only use RTTI if you have a very good reason to do so and only use it if you know what you are doing.
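As a sketch of the alternative recommended above, the following lets each class describe itself through a virtual function, so the caller never has to test the dynamic type with typeid or dynamic_cast. Shape, Circle, Square and describe are hypothetical names chosen for this example.

```cpp
#include <string>

// Hypothetical hierarchy: the type-specific behaviour lives in a virtual
// function instead of a chain of type tests in the caller.
class Shape {
public:
    virtual ~Shape() {}
    virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
    virtual std::string name() const { return "circle"; }
};

class Square : public Shape {
public:
    virtual std::string name() const { return "square"; }
};

// Works for every present and future Shape without modification.
std::string describe(const Shape& shape) {
    return "a " + shape.name();
}
```

Adding a new shape only requires a new derived class; describe() and every other caller stay untouched, which is not the case for code that enumerates types with RTTI.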
Chapter Summary
- I/O
- Standard Template Library (STL)
- Smart Pointers
- Exception Handling
- Templates
- Run-Time Type Information (RTTI)