C++ Programming/Variables
From Wikibooks, the open-content textbooks collection
Variables
Much like a person has a name that distinguishes him or her from other people, a variable gives a particular instance of an object type a name or label by which that instance can be referred to. Depending on how it is used in the code, a variable also has a specific locality in relation to the hardware and, based on the structure of the code, a specific scope in which the compiler will recognize it as valid. All these characteristics are defined by the programmer.
Locality (hardware)
Variables fall into two distinct categories: those that are created on the stack (local variables), and those that are accessed via a hard-coded memory address (global variables).
Globals
Typically a variable is bound to a particular address in computer memory that is assigned automatically at runtime, with a fixed number of bytes determined by the size of the variable's object type. Any operation performed on the variable affects one or more values stored in that memory location.
The only scope that can be defined for a global variable is a namespace. This deals with the visibility of the variable, not its validity.
All globally defined variables have static lifetime. Only those not defined as const have external linkage by default.
Locals
If the size and location of a variable are unknown beforehand, the location in memory of that variable is stored in another variable instead, and the size of the original variable is determined by the type that the second variable is declared to point to. This is called referencing, and the variable holding the other variable's memory location is called a pointer.
Scope (code)
Variables also reside in a specific scope. The scope of a variable determines its lifetime: entering the scope begins the life of the variable and leaving the scope ends it. This becomes important later, as the constructors of variables are called when entering scope and the destructors are called when leaving scope. A variable is visible while in scope unless it is hidden by a variable with the same name inside an enclosed scope. A variable can be in global scope, namespace scope, file scope or block scope.
Definition vs. Declaration
There is an important distinction between the declaration of a variable and its definition. The declaration announces the properties of the variable (its type, size, etc.); the definition, on the other hand, causes storage to be allocated in accordance with the declaration.
Type
Just as there are different types of values (integer, character, etc.), there are different types of variables. A variable can refer to a simple value such as an integer or a character, called a primitive type, or to a set of values, called a composite type, that is made up of primitive types and other composite types. A type consists of a set of valid values and a set of valid operations that can be performed on those values. A variable must declare its type before it can be used, both to enforce value and operation safety and to tell the compiler how much space is needed to store a value.
Major functions that type systems provide are:
- Safety - types make it impossible to code some operations which cannot be valid in a certain context. This mechanism effectively catches the majority of common mistakes made by programmers. For example, an expression "Hello, Wikipedia"/1 is invalid because a string literal cannot be divided by an integer in the usual sense. As discussed below, strong typing offers more safety, but it does not necessarily guarantee complete safety (see type-safety for more information).
- Optimization - static type checking might provide useful information to a compiler. For example, if a type says a value is aligned at a multiple of 4, the memory access can be optimized.
- Documentation - using types in languages also improves documentation of code. For example, the declaration of a variable as being of a specific type documents how the variable is used. In fact, many languages allow programmers to define semantic types derived from primitive types; either composed of elements of one or more primitive types, or simply as aliases for names of primitive types.
- Abstraction - types allow programmers to think about programs at a higher level, without bothering with low-level implementation details. For example, programmers can think of strings as values instead of as mere arrays of bytes.
- Modularity - types allow programmers to express the interface between two subsystems. This localizes the definitions required for interoperability of the subsystems and prevents inconsistencies when those subsystems communicate.
Data Types Table
| Type | Size in Bits | Comments | Alternate Names |
|---|---|---|---|
| Primitive Types | | | |
| char | ≥ 8 | | — |
| signed char | same as char | | — |
| unsigned char | same as char | | — |
| short | ≥ 16, ≥ size of char | | short int, signed short, signed short int |
| unsigned short | same as short | | unsigned short int |
| int | ≥ 16, ≥ size of short | | signed, signed int |
| unsigned int | same as int | | unsigned |
| long | ≥ 32, ≥ size of int | | long int, signed long, signed long int |
| unsigned long | same as long | | unsigned long int |
| bool | ≥ size of char, ≤ size of long | | — |
| wchar_t | ≥ size of char, ≤ size of long | | — |
| float | ≥ size of char | | — |
| double | ≥ size of float | | — |
| long double | ≥ size of double | | — |
| User Defined Types | | | |
| struct or class | ≥ sum of size of each member | | — |
| union | ≥ size of the largest member | | — |
| enum | ≥ size of char | | — |
| typedef | same as the type being given a name | | — |
| template | ≥ size of char | — | — |
| Derived Types[4] | | | |
| type& (reference) | ≥ size of char | | — |
| type* (pointer) | ≥ size of char | | — |
| type [integer] (array) | ≥ integer × size of type | | — |
| type (comma-delimited list of types/declarations) (function) | — | | — |
| type aggregate_type::* (member pointer) | ≥ size of char | | — |

[1] -128 can be stored in two's-complement machines (i.e. most machines in existence).
[2] -32768 can be stored in two's-complement machines (i.e. most machines in existence).
[3] -2147483648 can be stored in two's-complement machines (i.e. most machines in existence).
[4] The precedences in a declaration are: [], () (left associative), highest; &, *, ::* (right associative), lowest.
Standard Types
C++ has five basic primitive types called standard types, specified by particular keywords, that store a single value.
The type of a variable determines what kind of values it can store:
- bool - a boolean value: true; false
- int - Integer: -5; 10; 100
- char - a character in some encoding, often something like ASCII, ISO-8859-1 ("Latin 1") or ISO-8859-15: 'a', '=', 'G', '2'.
- float - floating-point number: 1.25; -2.35*10^23
- double - double-precision floating-point number: like float but more decimals
The float and double primitive data types are called 'floating point' types and are used to represent real numbers (numbers with decimal places, like 1.435324 and 853.562). Floating point numbers and floating point arithmetic can be very tricky, due to the nature of how a computer calculates floating point numbers.
Declaration
C++ is a statically typed language. Hence, no variable can be used without its type being specified, which is why the type appears in the declaration. This way the compiler can protect you from trying to store a value of an incompatible type into a variable, e.g. storing a string in an integer variable. Declaring variables before use also allows spelling errors to be detected easily. Consider a variable used in many statements but misspelled in one of them. Without declarations, the compiler would silently assume that the misspelled name refers to some other variable. With declarations, an "undeclared variable" error is flagged. Another reason for specifying the type of a variable is so that the compiler knows how much space in memory must be allocated for it.
The simplest variable declarations look like this (the parts in []s are optional):
[specifier(s)] type variable_name [ = initial_value];
To create an integer variable for example, the syntax is
int sum;
where sum is the name you made up for the variable. This kind of statement is called a declaration. It declares sum as a variable of type int, so that sum can store an integer value. Every variable has to be declared before use, and it is common practice to declare variables as close as possible to the point where they are needed. This is unlike older versions of C (before C99), where all declarations in a block must precede all other statements and expressions.
In general, you will want to make up variable names that indicate what you plan to do with the variable. For example, if you saw these variable declarations:
char firstLetter;
char lastLetter;
int hour, minute;
you could probably make a good guess at what values would be stored in them. This example also demonstrates the syntax for declaring multiple variables with the same type in the same statement: hour and minute are both integers (int type). Notice how a comma separates the variable names.
int a = 123;
int b (456);
Those lines also declare variables, but this time the variables are initialized to some value. This means that not only is space allocated for the variables, but the space is also filled with the given value. The two lines illustrate two different but equivalent ways to initialize a variable. The '=' in a declaration is subtly different from the assignment operator: it gives the variable its initial value instead of replacing an existing value with a new one. The distinction becomes important when the values we are dealing with are not of simple types like integers but more complex objects, like the input and output streams provided by the iostream library.
The expression used to initialize a variable need not be constant. So the lines:
int sum;
sum = a + b;
can be combined as:
int sum = a + b;
or:
int sum (a + b);
Declare a floating point variable 'f' with an initial value of 1.5:
float f = 1.5;
Floating point constants should always have a '.' (decimal point) or an exponent somewhere in them. Any number that has neither is interpreted as an integer, which then must be converted to a floating point value before it is used.
For example:
double a = 5 / 2;
will not set a to 2.5 because 5 and 2 are integers and integer arithmetic will apply for the division, cutting off the fractional part. A correct way to do this would be:
double a = 5.0 / 2.0;
You can also declare floating point values using scientific notation. The constant .05 in scientific notation would be 5 × 10^-2. The syntax for this is the significand, followed by an e, followed by the exponent. For example, to use .05 as a scientific notation constant:
double a = 5e-2;
Below is a program storing two values in integer variables, adding them and displaying the result:
// This program adds two numbers and prints their sum.
#include <iostream>
int main()
{
    int a = 123;
    int b (456);
    int sum;
    sum = a + b;
    std::cout << "The sum of " << a << " and " << b << " is " << sum << "\n";
    return 0;
}
Or, if you would like to save some space, the same program can be written as:
// This program adds two numbers and prints their sum, variation 1
#include <iostream>
#include <ostream>
using namespace std;
int main()
{
    int a = 123, b (456), sum = a + b;
    cout << "The sum of " << a << " and " << b << " is " << sum << endl;
    return 0;
}
Modifiers
There are several modifiers that can be applied to data types to change the range of numbers they can represent, or to change how a variable may be accessed and stored.
const
A variable declared with this specifier cannot be changed (it is read-only). Both local and class-level variables may be declared const, indicating that you do not intend to change their value after they are initialized. You declare a variable as constant using the const keyword. Global const variables have internal linkage by default. If you need to use a global constant across multiple files, the best option is to put it in a header file that can be included across the project.
const unsigned int DAYS_IN_WEEK = 7;
declares an unsigned integer constant, called DAYS_IN_WEEK, with the value 7. Because this value cannot be changed, you must give it a value when you declare it. If you later try to assign another value to a constant variable, the compiler will report an error.
int main()
{
    const int i = 10;
    i = 3;            // ERROR - we can't change "i"
    int &j = i;       // ERROR - we promised not to change "i",
                      // so we can't create a non-const reference to it
    const int &x = i; // fine - "x" is a const reference to "i"
    return 0;
}
The full meaning of const is more complicated than this; when working through pointers or references, const can be applied to mean that the object pointed (or referred) to will not be changed via that pointer or reference. There may be other names for the object, and it may still be changed using one of those names so long as it was not originally defined as being truly const.
const has an advantage for programmers over the #define directive because it is understood by the compiler, not just substituted into the program text by the preprocessor, so any error messages can be much more helpful.
With pointers it can get messy:
T const *p;        // p is a pointer to a const T
T *const p;        // p is a const pointer to T
T const *const p;  // p is a const pointer to a const T
If the pointer is a local variable, making the pointer itself const is of little use. The order of T and const can be reversed:
const T *p;
is the same as
T const *p;
volatile
The volatile specifier is a hint to the compiler that a variable's value can be changed externally; the compiler must therefore avoid aggressive optimization of any code that uses the variable.
Unlike in Java, C++'s volatile specifier has no meaning in relation to multi-threading. Standard C++ had no support for multi-threading before C++11 (though it was a common extension); since C++11 the standard library provides threads, mutexes and atomic types. Variables that need to be synchronized between threads require a synchronization mechanism such as a mutex. Keep in mind that volatile implies only safety in the presence of implicit or unpredictable actions by the same thread (or by a signal handler, in the case of a volatile sig_atomic_t object). Some compilers treat accesses to volatile variables as synchronization operations, so that ordinary memory operations are not reordered with respect to a volatile access, which would make such accesses sequentially consistent. This behavior is not part of the standard and should not be relied upon.
mutable
This specifier may only be applied to non-static, non-const member variables. It allows the variable to be modified within const member functions.
mutable is usually used when an object might be logically constant, i.e. no outside observable behavior changes, but not bitwise const, i.e. some internal member might change state.
The canonical example is the proxy pattern. Suppose you have created an image catalog application that shows all images in a long, scrolling list. This list could be modeled as:
class image {
public:
    // construct an image by loading from disk
    image(const char* const filename);
    // get the image data
    char const * data() const;
private:
    // The image data
    char* m_data;
};
class scrolling_images {
    image const* images[1000];
};
Note that for the image class, bitwise const and logically const are the same: if m_data changes, the public function data() returns different output.
At a given time, most of those images will not be shown, and might never be needed. To avoid having the user wait for a lot of data being loaded which might never be needed, the proxy pattern might be invoked:
class image_proxy {
public:
    image_proxy( char const * const filename )
        : m_filename( filename ), m_image( 0 )
    {}
    ~image_proxy() { delete m_image; }
    char const * data() const {
        if ( !m_image ) {
            m_image = new image( m_filename );
        }
        return m_image->data();
    }
private:
    char const* m_filename;
    mutable image* m_image;
};
class scrolling_images {
    image_proxy const* images[1000];
};
Note that the image_proxy does not change observable state when data() is invoked: it is logically constant. However, it is not bitwise constant since m_image changes the first time data() is invoked. This is made possible by declaring m_image mutable. If it had not been declared mutable, the image_proxy::data() would not compile, since m_image is assigned to within a constant function.
short
The short specifier can be applied to the int data type. It can decrease the number of bytes used by the variable, which decreases the range of numbers that the variable can represent. Typically, a short int is half the size of a regular int -- but this will be different depending on the compiler and the system that you use. When you use the short specifier, the int type is implicit. For example:
short a;
is equivalent to:
short int a;
long
The long specifier can be applied to the int and double data types. It can increase the number of bytes used by the variable, which increases the range of numbers that the variable can represent. A long int is at least as large as an int and on many systems is twice the size, and a long double can represent larger numbers more precisely. When you use long by itself, the int type is implied. For example:
long a;
is equivalent to:
long int a;
The shorter form, with the int implied rather than stated, is more idiomatic (i.e., seems more natural to experienced C++ programmers).
Use the long specifier when you need to store larger numbers in your variables. Be aware, however, that on some compilers and systems the long specifier may not increase the size of a variable. Indeed, most common 32-bit platforms (and one 64-bit platform, 64-bit Windows) use 32 bits for int and also 32 bits for long int.
unsigned
The unsigned specifier makes a variable represent only non-negative numbers (positive numbers and zero). It can be applied only to the char, short, int and long types. For example, if an int is 16 bits wide and holds values from -32768 to 32767, an unsigned int will hold values from 0 to 65535. You can use this specifier when you know that your variable will never need to be negative. For example, if you declared a variable 'myHeight' to hold your height, you could make it unsigned because you know that you would never be negative inches tall.
signed
The signed specifier makes a variable represent both positive and negative numbers. It can be applied only to the char, short, int and long data types. The signed specifier is applied by default for short, int and long, so you will typically never use it with them in your code.
static
The static modifier gives a variable static lifetime and, on global variables, gives them internal linkage (the variables will not be accessible from code of the same project that resides in other files).
- static lifetime
- Means that a static variable is initialized only once and, at run time, exists and keeps any changes made to it until the program's process ends. The relative order in which static variables defined in different source files are initialized and destroyed is not specified.
The static keyword can also be used on functions, inside functions, on classes, on class members (data and functions), and in structs and unions (but not on a union's members). We will cover each use separately.
Enumerated Data Types
In programming it is often necessary to deal with data types that describe a fixed set of alternatives. For example, when designing a program to play a card game it is necessary to keep track of the suit of an individual card.
One method for doing this may be to create unique constants to keep track of the suit. For example one could define
const int Clubs=0;
const int Diamonds=1;
const int Hearts=2;
const int Spades=3;
. . .
int current_card_suit=Diamonds;
Unfortunately there are several problems with this method. The most minor problem is that it can be a bit cumbersome to write. A more serious problem is that this data is indistinguishable from integers. It becomes very easy to start using the associated numbers instead of the suits themselves, such as:
int current_card_suit=1;
...and worse to make mistakes that may be very difficult to catch such as a typo...
current_card_suit=11;
...which produces a valid expression in C++, but would be meaningless in representing the card's suit.
One way around these difficulties is to create a new data type specifically designed to keep track of the suit of the card, one that restricts you to using only valid possibilities. We can accomplish this with an enumerated data type, using the C++ "enum" keyword. In this case we could create the desired data type with the code:
enum card_suit {Clubs,Diamonds,Hearts,Spades};
card_suit first_cards_suit=Diamonds;
card_suit second_cards_suit=Hearts;
card_suit third_cards_suit=0; //Would cause an error, 0 is an "integer" not a "card_suit"
card_suit fourth_cards_suit=first_cards_suit; //OK, they both have the same type.
The first line of code creates a new data type "card_suit" that may take on only one of four possible values: "Clubs", "Diamonds", "Hearts", and "Spades". In general the enum declaration takes the form
- enum new_type_name {possible_value_1, possible_value_2, ..., possible_value_n} Optional_Variable_With_This_Type;
The second line of code creates a new variable with this data type and initializes it to the value "Diamonds". The other lines create new variables of this new type and show some initializations that are (and are not) possible.
Internally, enumerated types are stored as integers that begin with 0 and increment by 1 for each new possible value of the data type.
enum apples { Fuji, Macintosh, GrannySmith };
enum oranges { Blood, Navel, Persian };
apples pie_filling = Navel; //error can't make an apple pie with oranges.
apples my_fav_apple = Macintosh;
oranges my_fav_orange = Navel; //This has the same internal integer value as my_fav_apple
if(my_fav_apple == my_fav_orange) //Many compilers will produce an error or warning letting you know you're comparing two different quantities.
    std::cout << "You shouldn't compare apples and oranges" << std::endl;
While enumerated types are not integers, they are in some cases converted into integers, for example when we send an enumerated type to standard output:
enum color {Red, Green, Blue};
color hair=Red;
color eyes=Blue;
color skin=Green;
std::cout << "My hair color is " << hair << std::endl;
std::cout << "My eye color is " << eyes << std::endl;
std::cout << "My skin color is " << skin << std::endl;
if (skin==Green)
    std::cout << "I am seasick!" << std::endl;
Will produce the output:
My hair color is 0
My eye color is 2
My skin color is 1
I am seasick!
We could improve this example by introducing an array that holds the names of our enumerated type such as:
std::string color_names[3]={"Red", "Green", "Blue"};
enum color {Red, Green, Blue};
color hair=Red;
color eyes=Blue;
color skin=Green;
std::cout << "My hair color is " << color_names[hair] << std::endl;
std::cout << "My eye color is " << color_names[eyes] << std::endl;
std::cout << "My skin color is " << color_names[skin] << std::endl;
In this case hair is automatically converted to an integer when it is used to index the array. This technique is intimately tied to the fact that the color Red is internally stored as "0", Green is internally stored as "1", and Blue is internally stored as "2". Be careful! One may override these default choices for the internal values of the enumerated types.
This is done by simply setting the value in the "enum" such as:
enum color {Red=2, Green=4, Blue=6};
In fact it is not necessary to specify an integer for every value of an enumerated type. When a value is omitted, the compiler will simply take the value of the previous possible value and increase it by one.
Consider the following example:
enum colour {Red=2, Green, Blue=6, Orange};
Here the internal value of "Red" is 2, "Green" is 3, "Blue" is 6 and "Orange" is 7. Keep in mind when using this that the internal values do not need to be unique.
Enumerated types are also automatically converted into integers in arithmetic expressions, which makes it useful to be able to choose particular integers for the internal representations of an enumerated type.
One may have enumerated types for the width and height of a standard computer screen. This may allow a program to do meaningful calculations, while still maintaining the benefits of an enumerated type.
enum screen_width {SMALL_WIDTH=800, MEDIUM_WIDTH=1280};
enum screen_height {SMALL_HEIGHT=600, MEDIUM_HEIGHT=768}; // enumerator names must be unique within a scope
screen_width MyScreenW=SMALL_WIDTH;
screen_height MyScreenH=SMALL_HEIGHT;
std::cout << "The number of pixels on my screen is " << MyScreenW*MyScreenH << std::endl;
It should be noted that the internal values used in an enumerated type are constant, and cannot be changed during the execution of the program.
It is perhaps useful to notice that while enumerated types can be converted to integers for the purpose of arithmetic, they cannot be iterated through.
For example:
enum month { JANUARY=1, FEBRUARY, MARCH, APRIL, MAY, JUNE, JULY, AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, DECEMBER};
for( month cur_month = JANUARY; cur_month <= DECEMBER; cur_month=cur_month+1)
{
    std::cout << cur_month << std::endl;
}
This will fail to compile. The problem is with the for loop. The first two statements in the loop are fine. We may certainly create a new month variable and initialize it. We may also compare two months, where they will be compared as integers. We may not, however, increment the cur_month variable: "cur_month+1" evaluates to an integer, which may not be stored into a "month" data type.
In the code above we might try to fix this by replacing the for loop with:
for( int monthcount = JANUARY; monthcount <= DECEMBER; monthcount++)
{
    std::cout << monthcount << std::endl;
}
This will work because we can increment the integer "monthcount".
Derived Types
Type conversion
Type conversion or typecasting refers to changing an entity of one data type into another.
Implicit type conversion
Implicit type conversion, also known as coercion, is an automatic and temporary type conversion by the compiler. In a mixed-type expression, data of one or more subtypes can be converted to a supertype as needed at runtime so that the program will run correctly.
For example:
double d = 2.5;
long l = 3;
int i = 2;
if (d > i) d = i;
if (i > l) l = i;
if (d == l) d *= 2;
As you can see, d, l and i belong to different data types. The compiler will automatically and temporarily convert the original types to a common data type each time a comparison or assignment is executed.
Explicit Type Conversion
Explicit type conversion manually converts one type into another, and is used in cases where automatic type casting doesn't occur.
double d = 1.0;
printf ("%d\n", (int)d);
In this example, d is a double and, without the cast, would be passed to the printf function as such. This would result in undefined behavior, since printf would look for an int. The typecast in the example corrects this and passes an integer to printf as expected.
typedef
typedef is a language keyword used to give a data type a new name. The intent is to make the source easier to comprehend. Most of the time this occurs in old external libraries. The Style Conventions section of this book also mentions this keyword.
typedef int Apples;
typedef int Oranges;
Apples coxes;
Oranges jaffa;