C++ Programming/Code/Style Conventions
From Wikibooks, the open-content textbooks collection
Coding Style Conventions
A style guide, or code convention, gives programmers a set of rules for code normalization: how to format code, name variables, place comments, and make any other structural decision that the language itself does not dictate. This is very important when you share a project with others. Agreeing on a set of rules establishes common coding standards and recommendations that make the code base easier to understand, provide common ground for undocumented structures, make debugging easier, and increase maintainability. These rules are also referred to as Source Code Style.
A list of different approaches can be found in the Reference Section. One of the most widely used styles for C++ (and C) is the Kernighan and Ritchie (K&R) style. Be warned that choosing a style is one of the first decisions you take on a project, and in a democratic environment a consensus can be very hard to achieve. Programmers tend to stick to a coding style: they have it automated, and any deviation can be very hard to conform to. If you don't have a favorite style, try to use the smallest possible variation of a common one, or get as broad a view as you can, so that you can adapt easily to changes or defend your approach. There is software that can format or beautify code, but automation has its drawbacks. As seen earlier, indentation and the choice between spaces and tabs are ignored by the compiler, so these decisions exist purely for human readers. A coding style should cover only the lowest common denominator of what needs to be standardized.
Fields that are impacted by the selection of a code style:
- Reusability
- Self documenting code
- Internationalization
- Maintainability
- Portability
- Optimization
- Build process
- Error avoidance
- Security
Standardization is Important
It does not matter which particular coding style you pick. However, once a coding style is selected, it should be kept throughout the same project. Reading code that follows different styles can become very difficult. In the next sections we try to explain why some of the options are common practice without forcing you to adopt a specific style.
Identifier Naming
Identifiers are names given to variables, functions, objects, etc. to refer to them in the program. C++ identifiers must start with a letter or an underscore character (_), possibly followed by a series of letters, underscores or digits. None of the C++ keywords can be used as identifiers. Identifiers that contain two successive underscores, or that begin with an underscore followed by an upper case letter, are reserved for use by the compiler and its header files for special purposes, e.g. name mangling.
This leaves a lot of freedom in naming: one could use specific prefixes or suffixes, start a name with an upper or lower case letter, keep the name all in a single case, or use a word separator such as "_" or a change of case. We will suggest some rules that will make your choice more informed.
Leading underscores
In most contexts, leading underscores are better avoided. They are reserved for the compiler or for internal variables of a library, and using them can make your code less portable and more difficult to maintain. Such variables can also be stripped from a library (i.e. the variable is no longer accessible, it is hidden from the external world), so unless you want to override an internal variable of a library, don't use them.
Reusing existing names
Do not use the names of standard library functions and objects for your identifiers. Although most of these names are not reserved words, programs may become difficult to understand when familiar names are used in unexpected ways.
Sensible names
Always use good, unabbreviated, correctly-spelled meaningful names.[1] Prefer the English language (since C++ and most libraries already use English), and avoid short cryptic names. This will make it easier to read and to type a name without having to look it up.
Names indicate purpose
An identifier should indicate the function of the variable/function/etc. that it represents, e.g. foobar is probably not a good name for a variable storing the age of a person.
Identifier names should also be descriptive. n might not be a good name for a global variable representing the number of employees. However, a good balance between descriptive names and lots of typing has to be found. Therefore, this rule can be relaxed for variables that are used in a small scope or context. Many programmers prefer short names (such as i) for loop iterators.
Capitalization
Conventionally, variable names start with a lower case character. In identifiers that contain more than one natural language word, either underscores or capitalization is used to delimit the words, e.g. num_chars (K&R style) or numChars (Java style). It is recommended that you pick one notation and do not mix them within one project.
Constants
When naming #defines, constant variables, enum constants and macros, put them in all uppercase using '_' separators; this makes it very clear that the value is not alterable and, in the case of macros, makes it clear that you are using a construct that requires care.
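As a minimal sketch of the capitalization and constant conventions above (all names here are hypothetical and chosen only for illustration), upper case names with '_' separators mark values that cannot change, while ordinary variables and functions start with a lower case character:

```cpp
#include <cstddef>

// Macro: all uppercase with '_' separators, a construct that requires care.
#define MAX_LINE_LENGTH 80

// Constant variable: all uppercase, the value is not alterable.
const int DEFAULT_INDENT_SIZE = 4;

// Enum constants: all uppercase as well.
enum Color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

// Ordinary function and variable names start with a lower case character
// and use one word separator consistently (underscores here).
int count_long_lines(const int* line_lengths, std::size_t num_lines)
{
    int num_long_lines = 0;
    for (std::size_t i = 0; i < num_lines; ++i) {
        if (line_lengths[i] > MAX_LINE_LENGTH) {
            ++num_long_lines;
        }
    }
    return num_long_lines;
}
```

A reader who meets MAX_LINE_LENGTH in the middle of a function can tell at a glance that it is not a local variable.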
Functions and Member Functions
The name given to a function or member function should be descriptive and make clear what it does. Since functions and member functions usually perform actions, the best names typically combine a verb and a noun, such as CheckForErrors() instead of ErrorCheck() and dump_data_to_file() instead of data_file(). Clear and descriptive names make it easier to guess correctly what functions and member functions do, which helps make code self documenting. By following this and other naming conventions, programs can be read more naturally.
People seem to have very different intuitions about names containing abbreviations. It's best to settle on one strategy so the names are absolutely predictable. Take for example NetworkABCKey. Notice how the C from ABC and the K from Key run together. Some people don't mind this and others hate it, so you'll find different policies in different code and never know what to call something.
Prefixes and suffixes are sometimes useful:
- Min - to mean the minimum value something can have.
- Max - to mean the maximum value something can have.
- Cnt - the current count of something.
- Count - the current count of something.
- Num - the current number of something.
- Key - key value.
- Hash - hash value.
- Size - the current size of something.
- Len - the current length of something.
- Pos - the current position of something.
- Limit - the current limit of something.
- Is - asking if something is true.
- Not - asking if something is not true.
- Has - asking if something has a specific value, attribute or property.
- Can - asking if something can be done.
- Get - get a value.
- Set - set a value.
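The following sketch shows how a few of these prefixes and suffixes might read in practice. MessageQueue and its members are hypothetical names invented for this illustration, not part of any library:

```cpp
#include <cstddef>

// Hypothetical class; the names only illustrate the prefixes and suffixes.
class MessageQueue
{
public:
    explicit MessageQueue(std::size_t initial_limit)
        : message_count(0), size_limit(initial_limit) {}

    // Is: asking if something is true.
    bool is_empty() const { return message_count == 0; }

    // Can: asking if something can be done.
    bool can_accept() const { return message_count < size_limit; }

    // Get and Count: read the current count of something.
    std::size_t get_count() const { return message_count; }

    // Set and Limit: change the current limit of something.
    void set_limit(std::size_t new_limit) { size_limit = new_limit; }

    // Verb plus noun: returns false instead of adding when the limit is reached.
    bool add_message()
    {
        if (!can_accept()) {
            return false;
        }
        ++message_count;
        return true;
    }

private:
    std::size_t message_count;
    std::size_t size_limit;
};
```

Code that calls these functions reads almost like a sentence, e.g. if (queue.can_accept()) queue.add_message();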
Examples
For example, these are valid identifiers:
- i: loop value
- numberOfCharacters: number of characters
- number_of_chars: number of characters
- num_chars: number of characters
- get_number_of_characters(): get the number of characters
- get_number_of_chars(): get the number of characters
- is_character_limit(): is this the character limit?
- is_char_limit(): is this the character limit?
- character_max(): maximum number of characters
- charMax(): maximum number of characters
- CharMin(): minimum number of characters
These are also valid identifiers, but can you tell what they mean?
- num1
- do_this()
- g()
- hxq
The following are valid identifiers but better avoided:
- _num, as it could be used by the compiler/system headers
- num__chars, as it could be used by the compiler/system headers
- main, as there is potential for confusion
- cout, as there is potential for confusion
The following are not valid identifiers:
- if, as it is a keyword
- 4nums, as it starts with a digit
- number of characters, as spaces are not allowed within an identifier
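To see some of these identifiers at work, here is a small sketch that reuses names from the lists above. The function bodies are our own assumption, written only so that the names have something to do:

```cpp
#include <string>

// Counts the characters in text that are not spaces.
int get_number_of_chars(const std::string& text)
{
    int num_chars = 0;
    // i is acceptable here because its scope is only this short loop.
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] != ' ') {
            ++num_chars;
        }
    }
    return num_chars;
}

// Is this the character limit?
bool is_char_limit(int num_chars, int char_max)
{
    return num_chars == char_max;
}

// The declarations below would not compile if uncommented:
// int if;        // a keyword cannot be used as an identifier
// int 4nums;     // an identifier cannot start with a digit
```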
Hungarian Notation
Hungarian notation, in the form that would now be called Apps Hungarian, was invented by Charles Simonyi, a programmer who worked at Xerox PARC circa 1972-1981 and who later became Chief Architect at Microsoft. Until recently it was the preeminent naming convention in most Microsoft code. It uses prefixes: "m_" indicates a member variable, "p" indicates a pointer, and the rest of the name is normally written out with a capital first letter. We mention this convention because you will very probably find it in use, even more so if you do any programming in Windows. If you are interested in learning more, you can check Wikipedia's entry on this notation.
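A rough sketch of what code using the two prefixes mentioned above can look like. The Window class is hypothetical and is not taken from any Microsoft API:

```cpp
// Hypothetical class, for illustration of the "m_" and "p" prefixes only.
class Window
{
public:
    explicit Window(int width) : m_Width(width), m_pParent(nullptr) {}

    // pParent: "p" marks the parameter as a pointer.
    void SetParent(Window* pParent) { m_pParent = pParent; }

    bool HasParent() const { return m_pParent != nullptr; }
    int GetWidth() const { return m_Width; }

private:
    int m_Width;        // "m_" marks a member variable
    Window* m_pParent;  // "m_" plus "p": a member variable that is a pointer
};
```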
Reducing or abusing the use of keywords
This can be defended both ways. Using fewer keywords means less typing, but can make the reader and the compiler (depending on the situation) do extra work. On the other hand, writing more keywords makes the resulting code clearer, more explicitly defined (self documented) and less error-prone, but it can also add limitations to the code's evolution. This is a thin line where an equilibrium must be reached in accordance with the project's nature. The important thing, as with any other rule, is to be consistent.
inline
Using inline explicitly when the member function is already implicitly inlined, as is the case for a member function defined inside the class body.
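A minimal sketch of the three situations, using a hypothetical Counter class: the keyword omitted where it is implied, written redundantly, and written where it is actually needed:

```cpp
// Hypothetical class used only to show where inline is implied.
class Counter
{
public:
    Counter() : count(0) {}

    // Defined inside the class body: implicitly inline, keyword omitted.
    void increment() { ++count; }

    // Same situation with the keyword spelled out: legal, but redundant.
    inline int get_count() const { return count; }

    // Only declared here; the definition follows the class.
    void reset();

private:
    int count;
};

// Defined outside the class body: inline is not implied here, so it has to
// be written if this definition lives in a header included by several files.
inline void Counter::reset() { count = 0; }
```

Whichever form a project prefers for functions defined in the class body, using it consistently matters more than the choice itself.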
const
typedef
It is common practice to avoid using this keyword, since it can obfuscate code if not properly used, or it can cause programmers to accidentally misuse large structures, thinking them to be simple types. If used, define a set of rules for the types you rename and be sure to document them.
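The risk with large structures can be sketched as follows. FrameBuffer, Frame and PixelCount are hypothetical names made up for this example:

```cpp
#include <cstddef>

// A deliberately large structure (hypothetical, for illustration only).
struct FrameBuffer
{
    unsigned char pixels[1024];
};

// The alias makes the type look as simple as an int, but copying a Frame
// still copies all 1024 bytes.
typedef FrameBuffer Frame;

// A documented alias for a simple type is less likely to surprise anyone.
typedef unsigned int PixelCount;

// Taking the Frame by const reference avoids the accidental copy that
// passing it by value would cause.
PixelCount count_lit_pixels(const Frame& frame)
{
    PixelCount num_lit = 0;
    for (std::size_t i = 0; i < sizeof(frame.pixels); ++i) {
        if (frame.pixels[i] != 0) {
            ++num_lit;
        }
    }
    return num_lit;
}
```

Someone who sees only the name Frame in a function signature has no hint of its size, which is why the renamed types should be documented.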
volatile
25 lines 80 columns
This is a commonly recommended but often inapplicable rule. Many people say it is outdated, dating from a time when terminals could only display 25 lines of 80 columns.
This rule means that if you are writing code that goes beyond 80 columns or 25 lines, it's time to think about splitting the code into functions. This recommended practice also relates to the 0 means success convention for functions, which we will cover in the Functions Section of this book.
This practice will save you precious time when you have to return to a project you haven't worked on for 6 months.
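As a small sketch of the idea, assuming hypothetical validation functions and error codes of our own choosing, one long function can be split into short ones that each fit easily on a screen and that return 0 on success:

```cpp
#include <string>

// Each helper returns 0 on success and a non-zero error code otherwise.
// The specific codes (1 and 2) are arbitrary choices for this example.
int validate_name(const std::string& name)
{
    return name.empty() ? 1 : 0;
}

int validate_age(int age)
{
    return (age < 0 || age > 150) ? 2 : 0;
}

// The caller stays short because each check lives in its own function.
int validate_person(const std::string& name, int age)
{
    const int error = validate_name(name);
    if (error != 0) {
        return error;
    }
    return validate_age(age);
}
```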
Whitespace and Indentation
Conventions followed when using whitespace to improve the readability of code are called an indentation style. Every block of code and every definition should follow a consistent indentation style. This usually means everything within { and }. However, the same goes for one-line code blocks.
Use a fixed number of spaces for indentation. Recommendations vary; 2, 3, 4 and 8 are all common numbers. If you use tabs for indentation, be aware that editors and printers may deal with, and expand, tabs differently. The K&R standard recommends an indentation size of 4 spaces.
For example, a program could just as well be written as follows:
// No indentation: the whole statement is on one line
if ( a > 5 ) { b=a; a++; }
However, the same code could be made much more readable with proper indentation:
// Using an indentation size of 2
if ( a > 5 )
{
  b = a;
  a++;
}

// Using an indentation size of 4
if ( a > 5 )
{
    b = a;
    a++;
}
Placement of braces (curly brackets)
As we saw earlier in the Statements Section, compound statements are very important in C++. They are also the subject of different coding styles, which recommend different placements of the opening and closing braces ({ and }). Some recommend putting the opening brace at the end of the line with the statement (K&R). Others recommend putting each brace on a line by itself, but not indented (ANSI C++). GNU recommends putting each brace on a line by itself and indenting it half-way. We recommend picking one brace-placement style and sticking with it.
Examples:
if (a > 5) {
    // This is K&R style
}

if (a > 5)
{
    // This is ANSI C++ style
}

if (a > 5)
  {
    // This is GNU style
  }
Comments
Comments are portions of the code ignored by the compiler which allow the programmer to make simple notes in the relevant areas of the source code. Comments come either in block form or as single lines.
- Single-line comments (informally, C++ style) start with // and continue until the end of the line. If the last character in a comment line is a \, the comment will continue on the next line.
- Multi-line comments (informally, C style) start with /* and end with */.
NOTE: We will now describe how a comment can be added to the source code, but not where, how and when to comment; we will get into that later.
C style Comments
If you use this kind of comment, try to use it like this. Commented:
/*void EventLoop(); /**/
or for multiple lines
/*
void EventLoop();
void EventLoop();
/**/
This gives you the option to do the following. Uncommented:
void EventLoop(); /**/
or for multiple lines
void EventLoop();
void EventLoop();
/**/
By removing only the start of the first comment, you re-activate the commented code, and the trailing /**/ becomes a complete, empty comment. This works because a comment, once started, stays in effect until the first close of comment */ is found.
NOTE: C style comments do not nest. For example,
int function() /* This is a comment /* { return 0; } and this is the same comment */ so this isn't in the comment, and will give an error*/
fails to compile because the text "so this isn't in the comment, and will give an error*/" at the end of the line is not inside the comment; the comment ends at the first */ pair it finds, ignoring any interim /* pairs which might look to human readers like the start of a nested comment.
C++ style Comments
Examples:
// This is a single one line comment
or
if (expression) // This needs a comment
{
    statements;
}
else
{
    statements;
}
The backslash is a continuation character and will continue the comment onto the following line:
// This comment will also comment the following line \
std::cout << "This line will not print" << std::endl;
- Using comments to temporarily ignore code
Comments are also sometimes used to enclose code that we temporarily want the compiler to ignore. This can be useful in finding errors in the program. If a program does not give the desired result, it might be possible to track which particular statement contains the error by commenting out code.
- Example with C style comments
/* This is a single line comment */
or
/*
This is a multiple line comment
*/
- C and C++ style
Combining multi-line comments (/* */) with C++ comments (//) to comment out multiple lines of code:
Commenting out the code:
/*
void EventLoop();
void EventLoop();
void EventLoop();
void EventLoop();
void EventLoop();
//*/
Uncommenting the code chunk:
//*
void EventLoop();
void EventLoop();
void EventLoop();
void EventLoop();
void EventLoop();
//*/
This works because //* is still a C++ comment, while //*/ acts as a C++ comment when it stands alone and as a multi-line comment terminator when a multi-line comment is open. However, this doesn't work if multi-line comments are used inside the block, for example for function descriptions.
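Here is a small sketch of the toggle inside a function, so that the effect can be observed. The function count_enabled_steps is a hypothetical name used only for this demonstration:

```cpp
// Returns how many steps were executed, so the toggle can be observed.
int count_enabled_steps()
{
    int steps = 0;

    //*  Two slashes: this block is active. Delete one slash to disable it.
    steps += 1;
    steps += 1;
    //*/

    /*   One slash: this block is commented out. Add a slash to enable it.
    steps += 10;
    //*/

    return steps;
}
```

As written, only the first block is compiled, so the function returns 2. Adding a single slash in front of the second opener would enable that block as well.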
- Note on doing it with preprocessor statements
Another way (considered bad practice) is to selectively enable or disable sections of code with the preprocessor:
#if(0) // Change this to 1 to uncomment.
void EventLoop();
#endif
This is considered bad practice because the code often becomes illegible when several #if are mixed. If you use them, don't forget to add a comment at the #endif saying which #if it corresponds to:
#if (FEATURE_1 == 1)
do_something;
#endif //FEATURE_1 == 1
You can prevent illegibility by using inline functions (often considered better than macros for legibility, with no performance cost) containing only 2 sections in #if #else #endif:
inline void do_test()
{
#if (FEATURE_1 == 1)
    do_something;
#endif //FEATURE_1 == 1
}
and call
do_test();
in the program
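A compilable sketch of this pattern follows. FEATURE_1 is treated as a hypothetical build flag that defaults to 1 when the build does not define it, and the function bodies are placeholders of our own:

```cpp
// FEATURE_1 is a hypothetical build flag; default it to 1 if the build
// system did not define it.
#ifndef FEATURE_1
#define FEATURE_1 1
#endif

// The conditional code is confined to one small inline function with only
// two sections, so the callers contain no preprocessor directives at all.
inline int do_test(int value)
{
#if (FEATURE_1 == 1)
    return value * 2;   // feature enabled: placeholder behaviour
#else
    return value;       // feature disabled: value passes through unchanged
#endif // FEATURE_1 == 1
}
```

The rest of the program simply calls do_test(value) and stays readable regardless of how FEATURE_1 is set.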
NOTE: If your comment is on the same line as code, use C++ style.

