C++ Programming: Programming languages, an introduction

From Wikibooks, open books for an open world

What is a programming language?

In the most basic terms, a "programming language" is a means of communication between a human being (programmer) and a computer. A programmer uses this means of communication in order to give the computer instructions. These instructions are called "programs".

Just as there are many natural languages we use to communicate with each other, there are many languages that a programmer can use to communicate with a computer. Each programming language has its own set of words and rules, called the syntax of that language. If you're going to write a program, you have to follow the syntax of the language you're using; otherwise you won't be understood.

Programming languages can generally be divided into two categories: low-level and high-level. Both concepts are introduced below, along with their relevance to C++.

Low-level

Figure: most programming languages and their relations, from the mid-1800s up to 2003.

The low-level computer "languages" are:

Machine code (also called binary) is the lowest form of a low-level language. It consists of strings of 0s and 1s, which combine to form meaningful instructions that computers can act on. A glance at a page of binary makes it apparent why it is rarely a practical choice for writing programs: few people could remember what all those strings of 0s and 1s mean.

Assembly language (also called ASM) is just above machine code on the scale from low level to high level. It is a human-readable translation of the machine language instructions the computer executes. For example, instead of referring to processor instructions by their binary representation (0s and 1s), the programmer refers to those instructions using a more memorable (mnemonic) form. These mnemonics are usually short collections of letters that symbolize the action of the respective instruction, such as "ADD" for addition and "MOV" for moving values from one place to another.

Note:
Assembly language is processor specific. This means that a program written in assembly language will not work on computers with different processor architectures.
Using ASM to optimize certain tasks is common among C++ programmers, but it requires special care, because ASM is far less portable than C++.

You do not have to understand assembly language to program in C++, but it does help to have an idea of what's going on "behind-the-scenes". Learning about assembly language will also allow you to have more control as a programmer and help you in debugging and understanding code.
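To give a rough feel for what sits "behind the scenes", here is a minimal sketch of a tiny C++ function, with comments showing the kind of assembly a compiler might produce for it. The function name `add` is made up for illustration. The mnemonics in the comments assume an x86-64 processor and a common Linux calling convention, and they are only indicative: the actual output depends on the processor, the compiler and its settings.

```cpp
// A one-line C++ function that adds two whole numbers.
int add(int a, int b) {
    return a + b;
}

// For an x86-64 processor, a compiler might translate the function into
// assembly along these lines (illustrative only, not exact compiler output):
//
//     mov eax, edi    ; copy the first argument into register eax
//     add eax, esi    ; add the second argument to it
//     ret             ; return to the caller, with the result in eax
```

The C++ version would compile unchanged for a different processor, while the assembly in the comments would have to be rewritten.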

Given the size and complexity of most programming tasks, the advantages of writing in a high-level language far outweigh any drawbacks. Those advantages include:

  • Advanced program structure: loops, functions, and objects are of limited use in low-level languages, because they are themselves "high-level" features; each such structure must be translated further into low-level instructions.
  • Portability: high-level programs can run on different kinds of computers with few or no modifications. Low-level programs often use specialized functions available on only certain processors, and have to be rewritten to run on another computer.
  • Ease of use: many tasks that would take many lines of code in assembly can be simplified to several function calls from libraries in high-level programming languages. For example, Java, a high-level programming language, is capable of painting a functional window with about five lines of code, while the equivalent assembly language would take at least four times that amount.
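As a small illustration of the structure and ease-of-use points above, the following sketch adds up a list of numbers in C++ twice: once with an explicit loop and once with a single call to the standard library function `std::accumulate`. The function names `sum_with_loop` and `sum_with_library` are invented for this example.

```cpp
#include <numeric>
#include <vector>

// Adds the numbers one by one with an explicit loop.
int sum_with_loop(const std::vector<int>& numbers) {
    int total = 0;
    for (int n : numbers) {
        total += n;
    }
    return total;
}

// Does the same job with one call to a standard library function.
int sum_with_library(const std::vector<int>& numbers) {
    return std::accumulate(numbers.begin(), numbers.end(), 0);
}
```

Neither version mentions registers or processor instructions; the compiler takes care of that translation, which is also what keeps the code portable.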

High-level

High-level languages do more with less code, although there is sometimes a loss in performance and less freedom for the programmer. They also attempt to use English language words in a form which can be read and generally interpreted by the average person with little to no programming experience. A program written in one of these languages is sometimes referred to as "human-readable code". In general, abstraction makes learning a programming language easier.

No programming language is written in what one might call a natural language such as plain English, although BASIC and COBOL come close. One attempt at it is the Osmosian Order's Plain English compiler and integrated development environment, which is itself written entirely in Plain English (whether that really counts as plain English is open to debate). Because programming languages are constructed, formal languages that restrict and control written expression, the text of a program is sometimes referred to as "code" or, more specifically, as "source code". This is discussed in more detail in The Code Section of the book.

The important point to retain is that while some words (instructions) are in English, mostly for ease of use, a programming language is not English, and generally for good reasons; when those reasons stop holding, someone creates a new programming language. Beyond that, the rest of the paragraph above may only matter once you start building parsers, languages and compilers. The higher-level a language is, the more it abstracts away the hardware (CPU, co-processors, number of registers, etc.), supporting code portability and greater human intelligibility at the cost of added complexity in its expressions and constructs.

Keep in mind that this classification scheme is evolving. C++ is still considered a high-level language, but with the appearance of newer languages (Java, C#, Ruby, etc.), C++ is beginning to be grouped with lower-level languages like C.

Translating programming languages

Since a computer is only capable of understanding machine code, human-readable code must be either interpreted or translated into machine code.

An interpreter is a program (often written in a lower-level language) that reads the instructions of a program one at a time and carries each one out as it is read. Typically each instruction consists of one line of text, or is otherwise clearly delimited from the next, and the program must be interpreted again each time it is run.

A compiler is a program that translates the source code into machine code before the program is run. The translation may involve splitting one instruction understood by the compiler into multiple machine instructions. The instructions are translated only once; after that, the machine can follow them directly whenever it is told to run the program. A complete examination of the C++ compiler is given in the Compiler Section of the book.

The tools used to instruct a computer may differ, but no matter which statements are used, just about every programming language supports constructs that accomplish the following:

Input
Input is the act of getting information from a device such as a keyboard or mouse, or sometimes another program.
Output
Output is the opposite of input; it gives information to the computer monitor or another display device or program.
Math/Algorithm
All computer processors (the brain of the computer) have the ability to perform basic mathematical computation, and every programming language has some way of telling the processor to do so.
Testing
Testing involves telling the computer to check for a certain condition and to do something when that condition is true or false. Conditionals are one of the most important concepts in programming, and all languages have some method of testing conditions.
Repetition
Repetition is performing some action repeatedly, usually with some variation.
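The five constructs above can be seen together in one short C++ sketch. It reads whole numbers, adds them up, and reports the result. The names `summarize` and `summarize_text` are hypothetical, chosen only for this illustration, and the exact wording of the messages is arbitrary.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Reads whole numbers from `in` until none are left, then reports on them.
void summarize(std::istream& in, std::ostream& out) {
    int value = 0;
    int count = 0;
    int sum = 0;

    while (in >> value) {   // repetition, with input on every pass
        sum += value;       // math
        ++count;
    }

    if (count == 0) {       // testing
        out << "no numbers read\n";                           // output
    } else {
        out << "count=" << count << " sum=" << sum << '\n';   // output
    }
}

// Runs summarize on a fixed piece of text instead of the keyboard,
// which makes the result easy to check.
std::string summarize_text(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    summarize(in, out);
    return out.str();
}
```

In a complete program, calling `summarize(std::cin, std::cout)` would read from the keyboard and write to the screen, while `summarize_text("3 4 5")` produces the text `count=3 sum=12` followed by a newline.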

Further examination and analysis of C++ language constructs is provided in the Statements Section of the book.

Believe it or not, that's pretty much all there is to it. Every program you have ever used, no matter how simple or complex, is made up of operations that work more or less like these. Therefore, one way to describe computer programming is as the process of breaking a large, complex task up into smaller and smaller sub-tasks until each sub-task is simple enough to be performed with one of these operations.
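As a rough sketch of that kind of decomposition, the task "decide whether a list of scores is a pass" can be split into three small C++ functions, each built from the basic constructs. All names here are hypothetical, and both the pass mark of 50 and the treatment of an empty list are assumptions made only for this example.

```cpp
#include <string>
#include <vector>

// Sub-task 1: add up all the scores (repetition and math).
int total(const std::vector<int>& scores) {
    int sum = 0;
    for (int s : scores) {
        sum += s;
    }
    return sum;
}

// Sub-task 2: turn the total into an average (testing and math).
double average(const std::vector<int>& scores) {
    if (scores.empty()) {
        return 0.0;  // assumption for this sketch: an empty list averages to 0
    }
    return static_cast<double>(total(scores)) / scores.size();
}

// Sub-task 3: decide what to report (testing).
// The pass mark of 50 is an arbitrary choice for illustration.
std::string verdict(const std::vector<int>& scores) {
    return average(scores) >= 50.0 ? "pass" : "fail";
}
```

Each function is simple enough to read at a glance, and the larger task is solved by combining them.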

C++ is mostly compiled rather than interpreted (although some C++ interpreters exist): a program is first compiled and then "executed" later. As complicated as this may seem, you will see further on how easy it can be.

As we saw in the Introducing C++ Section, C++ evolved from C by adding levels of abstraction, so we can correctly state that C++ is a higher-level language than C. We will learn the particulars of those differences in the Programming Paradigms Section of the book; those of you who already know other languages should also look into the Programming Languages Comparisons Section.