C Programming/Print version
From Wikibooks, the open-content textbooks collection
The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
http://en.wikibooks.org/wiki/C_Programming
Table of contents
Introduction
Beginning C
- Basics of Compilation
- Programming Structure and Style
- Error Handling
- Variables
- Simple Input and Output
- Simple Math in C
- Further Math in C
- Program Flow
- Procedures and Functions
- The Preprocessor
- Libraries
- Standard libraries
- File I/O
- Exercises
In-depth C ideas
- Arrays & Strings
- Pointers and relationship to arrays
- Memory Management
- String Manipulation
- C complex types
- Sockets and Networking (UNIX)
- Common Practices
- Serialization and X-Macros
- Coroutines
- Typecasting
C and beyond
- Weaknesses of C
- Language Overloading and Extensions
- Combining Languages
- Commented Source Code Library
- Further reading for Windows game programming
Computer Programming
The following articles are C adaptations of articles from the Computer programming book.
C Reference Tables
This section has some tables and lists of C entities.
Introduction
Why Learn C?
C is the most commonly used programming language for writing operating systems. Unix was the first operating system written in C. Later, Microsoft Windows, Mac OS X, and GNU/Linux were all written largely in C.
Not only is C the language of operating systems, it is the precursor and inspiration for almost all of the most popular high-level languages available today. In fact, the standard implementations of Perl, PHP, Python, and Ruby are all written in C.
By way of analogy, let's say that you were going to be learning Spanish, Italian, French, or Portuguese. Do you think knowing Latin would be helpful? Just as Latin was the basis of all of those languages, knowing C will enable you to understand and appreciate an entire family of programming languages built upon the traditions of C. Knowledge of C enables freedom.
Why C, and not assembly language?
Assembly language, while extremely powerful, makes large applications too difficult to write, and assembly code is hard to read or interpret in a logical way. Assembly language is not compiled; instead, a program called an assembler translates each source file almost directly into machine code, and the resulting files are then linked together into one binary. Many C compilers also give the programmer the ability to embed assembly right inside the C code, giving programmers the option to optimize a very important or heavily used piece of code in assembly.
Why C, and not Java or Basic or Perl?
Mostly because of memory allocation. Unlike most computer languages, C allows the programmer to address memory the way he/she would using assembly language. Languages like Java and Perl shield the programmer from having to worry about memory allocation and pointers. This is usually a good thing. It's quite tedious to deal with memory allocation when building a high-level program like a quarterly income statement report. However, when dealing with low level code such as that part of the OS that moves the string of bytes that makes up that quarterly income report from the computer's memory to the network card's buffer so they can be shipped to the network printer, direct access to memory is critical -- something you just can't do with Java. C can be compiled into fast and efficient machine code.
So is it any wonder that C is such a popular language?
Like toppling dominoes, the next generation of programs follows the trend of its ancestors. Operating systems designed in C always have system libraries designed in C. Those system libraries are in turn used to create higher-level libraries (like OpenGL, or GTK), and the designers of those libraries often decide to use the language the system libraries used. Application developers use the higher-level libraries to design word processors, games, media players, and the like. Many of them will choose to program in the language that higher-level library uses. And the pattern continues on and on and on...
History of the C Programming Language
The field of computing as we know it today started in 1947 with three scientists at Bell Telephone Laboratories (William Shockley, Walter Brattain, and John Bardeen) and their groundbreaking invention of the transistor. In 1956, the first fully transistor-based computer, the TX-0, was completed at MIT. The first integrated circuit was created in 1958 by Jack Kilby at Texas Instruments, but the first high-level programming language existed even before then.
In 1954, the Fortran project, named for its role as the Formula Translator, began. Fortran begot Algol 58, the Algorithmic Language, in 1958. Algol 58 begot Algol 60 in 1960. Algol 60 begot CPL, the Combined Programming Language, in 1963. CPL begot BCPL, Basic CPL, in 1967. BCPL begot B in 1969. B begot C in 1971.
B was the direct predecessor of C. It was created by Ken Thompson at Bell Labs and was an interpreted language used in early internal versions of the UNIX operating system. Thompson and Dennis Ritchie, also of Bell Labs, improved B and called it NB. Further extensions to NB created its logical successor, C, a compiled language. Most of UNIX was then rewritten in NB and then C, which led to a more portable operating system.
The portability of UNIX was the main reason for the initial popularity of both UNIX and C. Rather than creating a new operating system for each new machine, system programmers could simply write the few system-dependent parts required for the machine, and write a C compiler for the new system. Thereafter, since most of the system utilities were written in C, it simply made sense to also write new utilities in that language.
Getting Started
The goal of this book is to introduce you to the C programming language. Basic computer literacy is assumed, but no special knowledge is needed.
The minimum software requirement to program in C is a text editor, as opposed to a word processor. There are many text editors (see List of Text Editors), the most popular being vi, its clones (such as Vim), and Emacs. A text editor with syntax highlighting is recommended, as it can make code easier to read at a glance. Highlighting can also make it easy to spot syntax errors. Most programmers' text editors on Windows and Unix systems can do this.
To turn the code you write in a text editor into a program, you will also need a C compiler. A compiler is a program that converts C code into executable machine code. [1]
Popular C compilers include:
| Name | Platform | License | Extra |
|---|---|---|---|
| OpenWatcom | DOS, Windows, Netware, OS/2 | Open source | |
| Borland C Compiler | Windows | Freeware | |
| Microsoft Visual Studio Express | Windows | Freeware | Lightweight, powerful, and student-friendly version of the industry-standard compiler. |
| Tiny C Compiler (TCC) | GNU/Linux, Windows | LGPL | Small, fast, newcomer-friendly compiler. |
| GNU C Compiler | DOS, Cygwin (w32), MinGW (w32), OS/2, Mac OS X, Unix | GPL | De facto standard. Ships with most Unix systems. |
Though not absolutely needed, many programmers both prefer and recommend using an Integrated Development Environment (IDE) over a separate text editor and compiler. An IDE is a program that combines a set of programs that developers need into one convenient package, usually with a graphical user interface. These programs include a compiler, linker, and text editor. They also typically include a debugger, a tool that relates the compiled program back to your C source code and enables you to do such things as step manually through it or alter data in an attempt to uncover errors.
Popular IDEs include:
| Name | Platform | License | Extra |
|---|---|---|---|
| CDT | Windows, Mac OS X, Unix | Open source | A C/C++ plug-in for Eclipse, a popular open source IDE. |
| Anjuta | Unix | GPL | A GTK+2 IDE for the GNOME desktop environment. |
| Little C Compiler (LCC) | Windows | Free for non-commercial use | |
| Xcode | Mac OS X | Free | Available on the "Developer Tools" disc with most recent-model Apple computers, or as a download when registered (free) at Apple Developer Connection. |
| Pelles C | Windows, Pocket PC | "free" | |
| Dev C++ | Windows, Mac OS X, Unix | GPL | |
| Emacs | Windows, Mac OS X, Unix | GPL | Powerful programmable editor. Both graphic and text use. Does everything you need. |
| Microsoft Visual Studio Express | Windows | Free | Lightweight, powerful, and student-friendly version of the industry-standard compiler. |
On GNU/Linux, GCC is almost always included automatically.
On Microsoft Windows, Dev-C++ is recommended for beginners because it is easy to use, free, and simple to install. However, Dev-C++ has not been updated since February 22, 2005.[2]
On Mac OS X, the Xcode IDE provides the compilers needed to compile various source files. Installing Xcode installs both the command-line compilers and the graphical IDE.
Footnotes
- [1] Strictly speaking, the cc (C compiler) program in GCC translates the input .c file into assembly for the target CPU and writes the output to an .s file. The assembler, as, then generates a machine code file from the .s file. Preprocessing is done by another subprogram, cpp (the C preprocessor).
- [2] http://sourceforge.net/news/?group_id=10639
Dev-C++
Dev-C++, as mentioned before, is an Integrated Development Environment (IDE) for the C++ programming language, available from Bloodshed Software.
C++ is a programming language which contains within itself most of the C language, plus a few extensions - as such, most C++ compilers also compile C programs, sometimes with a few adjustments (like invoking it with a different name or commandline switch). Therefore, you can use Dev C++ for C development.
However, Dev C++ is not the compiler: It is designed to use the MinGW or Cygwin versions of GCC - both of which can be downloaded as part of the Dev C++ package, although they are completely different projects.
Dev C++ simply provides an editor, syntax highlighting, some facilities for the visualisation of code (like class and package browsing) and a graphical interface to the chosen compiler. Because Dev C++ analyses the error messages produced by the compiler and attempts to distinguish the line numbers from the errors themselves, the use of other compiler software is discouraged since the format of their error messages is likely to be different.
The latest version of Dev-C++ is a beta for version 5, so it still has a significant number of bugs. However, all the features are there and it is quite usable, and it is still considered one of the best free software C IDEs available for Windows.
A version of Dev-C++ for Linux is in the pipeline, but it is not quite usable yet. However, Linux users already have a wealth of IDEs available to them (for example, KDevelop and Anjuta). Also, almost all the graphical text editors, and other common editors such as Emacs and vi(m), support syntax highlighting.
GCC
The GNU Compiler Collection (GCC) is a free set of compilers developed by the Free Software Foundation, with Richard Stallman as one of the main architects.
- Steps for Obtaining the GCC Compiler if You're on GNU/Linux
On GNU/Linux, the method for installing the GNU C Compiler varies from distribution to distribution.
- For Redhat, get a GCC RPM, e.g. using Rpmfind and then install (as root) using rpm -ivh gcc-version-release.arch.rpm
- For Fedora Core, install the GCC compiler (as root) by using yum install gcc.
- For Mandrake, install the GCC compiler (as root) by using urpmi gcc
- For Debian, install the GCC compiler (as root) by using apt-get install gcc.
- For Ubuntu, install the GCC compiler (along with other necessary tools) by using sudo aptitude install build-essential, or by using Synaptic. You do not need Universe enabled.
- For Slackware, the package is available on their website - simply download, and type installpkg gcc-xxxxx.tgz
- For Gentoo, you should already have GCC installed as it will have been used when you first installed. To update it run (as root) emerge -uav gcc
- For Arch GNU/Linux, install the GCC compiler (as root) by using pacman -Sy gcc.
- For FreeBSD, NetBSD, OpenBSD, DragonFly BSD, Darwin the port of GNU gcc is available in the base system, or it could be obtained using the ports collection or pkgsrc.
- If you cannot become root, get the GCC tarball from ftp://ftp.gnu.org/ and follow the instructions in it to compile and install in your home directory. Be warned though, you need a C compiler to do that - yes, GCC itself is written in C.
- You can use some commercial C compiler/IDE.
- Steps for Obtaining the GCC Compiler if You're on Windows
- Go to http://www.cygwin.com and click on the "Install Cygwin Now" button in the upper right corner of the page.
- Click "run" in the window that pops up, and click "next" several times, accepting all the default settings.
- Choose any of the Download sites ("ftp.easynet.be", etc.) when that window comes up; press "next" and the Cygwin installer should start downloading.
- When the "Select Packages" window appears, scroll down to the heading "Devel" and click on the "+" by it. In the list of packages that now displays, scroll down and find the "gcc-core" package; this is the compiler. Click once on the word "Skip", and it should change to some number like "3.4" etc. (the version number), and an "X" will appear next to "gcc-core" and several other related packages that will now be downloaded.
- Click "next" and the compiler as well as the Cygwin tools should start downloading; this could take a while. While you're waiting, go to http://www.crimsoneditor.com and download that free programmer's editor; it's powerful yet easy to use for beginners.
- Once the Cygwin downloads are finished and you have clicked "next", etc. to finish the installation, double-click the Cygwin icon on your desktop to begin the Cygwin "command prompt". Your home directory will automatically be set up in the Cygwin folder, which now should be at "C:\cygwin" (the Cygwin folder is in some ways like a small unix/linux computer on your Windows machine -- not technically of course, but it may be helpful to think of it that way).
- Type "gcc" at the Cygwin prompt and press "enter"; if "gcc: no input files" or something like it appears you have succeeded and now have the gcc compiler on your computer (and congratulations -- you have also just received your first error message!).
The current stable (usable) version is 4.2.1, published on 2007-07-21, which supports several platforms. In fact, GCC is not only a C compiler, but a family of compilers for several languages, such as C++, Ada, Java, and Fortran.
As in almost every other book about learning a programming language, we use the Hello world program to introduce you to C.
/*1*/ #include <stdio.h>
/*2*/
/*3*/ int main(void)
/*4*/ {
/*5*/     printf("Hello, world!\n");
/*6*/     return 0;
/*7*/ }
/*8*/
This program prints "Hello, world!" and then exits. The numbers are added for our benefit to refer to certain lines and would not be part of the real program.
To create a Hello World program, enter the code into either your IDE or your text editor, to be compiled later.
Line-by-Line Explanation
Line 1 tells the C compiler to find a file called stdio.h and add the contents of that file to this program. In C, you often have to pull in extra optional components when you need them. stdio.h contains descriptions of standard input/output functions; in other words, stuff you can use to send messages to a user, or to read input from a user.
Line 3 is something you'll find in every C program. Every program has a main function. Generally, the main function is where a program begins. However, one C program can be scattered across multiple files, so you won't always find a main function in every file. The int at the beginning means that main will return an integer to whatever made it run when it is finished and void in the parentheses means that main takes no parameters (parameters to main typically come from a shell when the program is invoked).
Line 5 is the statement that actually sends the message to the screen. printf is a function that is declared in the file stdio.h - which is why you had to #include that at the start of the program. \n is an escape sequence which adds a new line at the end of the printed text.
Line 6 will return zero (which is the integer referred to on line 3) to the operating system. When a program runs successfully its return value is zero (GCC4 complains if it doesn't when compiling). A non-zero value is returned to indicate a warning or error.
Line 8 is there because it is (at least on UNIX) considered good practice to end a file with a new line. In gcc using the -Wall -pedantic -ansi options, if the file does not end with a new line this message is displayed: "warning: no newline at end of file".
Introductory Exercises
On GCC
If you are using a Unix(-like) system, such as GNU/Linux, Mac OS X, or Solaris, it will probably have GCC installed. Type the hello world program into a file called first.c and then compile it with gcc. Just type:
gcc first.c
Then run the program by typing:
./a.out
or, if you are using Cygwin:
a.exe
You should now see the output of your very first C program.
There are a lot of options you can use with the gcc compiler. For example, if you want the output to have a name other than a.out, you can use the -o option. The following list describes a few of them:
- -c indicates that the compiler is supposed to generate an object file, which can later be linked to other files to form a final program.
- -o indicates that the next parameter is the name of the resulting program (or library). If this option is not specified, the compiled program will, for historic reasons, end up in a file called "a.out" or "a.exe" (for Cygwin users).
- -g3 indicates that debugging information should be added to the results of compilation.
- -O2 -ffast-math indicates that the compilation should be optimized (-ffast-math additionally relaxes strict floating-point rules in exchange for speed).
- -W -Wall -fno-common -Wcast-align -Wredundant-decls -Wbad-function-cast -Wwrite-strings -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes indicates that gcc should warn about many types of suspicious code that are likely to be incorrect.
- -E indicates that gcc should only preprocess the code; this is useful when you are having trouble understanding what gcc is doing with #include and #define, among other things.
All the options are well documented in the manual page for GCC.
On IDEs
If you are using a commercial IDE you may have to select console project, and to compile you just select build from the menu or the toolbar. The executable will appear inside the project folder, but you should have a menu button so you can just run the executable from the IDE.
Beginning C
Basic Concepts
Before one gets too deep into learning C syntax and programming constructs, it is beneficial to learn the meaning of a few key terms that are central to a thorough understanding of C.
Block Structure, Statements, Whitespace, and Scope
Now we discuss the basic structure of a C program. If you're familiar with Pascal, you may have heard it referred to as a block structured language. C does not have complete block structure (and you'll find out why when you go over functions in detail) but it is still very important to understand what blocks are and how to use them.
So what's in a block? Generally, a block consists of executable statements, or the text the compiler will attempt to turn into executable instructions, and the whitespace that surrounds them.
In C, blocks begin with a "left curly" ("{") and end with a "right curly" ("}"). Blocks can contain sub-blocks, which can contain their own sub-blocks, and so on. Simple statements end with a semicolon (;) character. Multiple statements can share a single line in the source file. There are several kinds of statements, including assignment, conditional and flow-control. A good portion of this book deals with statement construction.
Whitespace refers to the tab, space and newline/EOL (End Of Line) characters that separate the text characters that make up source code lines. Like many things in life, it's hard to appreciate whitespace until it's gone. To a C compiler, the source code
puts("Hello world"); return 0;
is the same as
puts("Hello world");
return 0;
is the same as
puts ( "Hello world") ;
return 0;
The compiler simply skips over whitespace. However, it is common practice to use spaces (and tabs) to organize source code for human readability.
In C, most of the time we do not want other functions or other programmer's routines accessing data that we are currently manipulating. This is why it is important to understand the concept of scope.
Scope describes the level at which a piece of data or a function can be seen or manipulated. There are two kinds of scope in C, local and global. When we speak of something being global, we speak of something that can be seen or manipulated from anywhere in the program. When we speak of something being local, we speak of something that can be seen or manipulated only within the block in which it was declared, or its sub-blocks. (Sub-blocks can access data local to their encompassing block, but a block cannot access data local to a sub-block.)
TIP: You can use a block on its own, without attaching it to an if statement, a loop, or a similar construct, to organize your code.
Basics of Using Functions
Functions are a big part of programming. A function is a special kind of block that performs a well-defined task. If a function is well-designed, it can enable a programmer to perform a task without knowing anything about how the function works. The act of requesting a function to perform its task is called a function call. Many functions require a caller to hand them certain pieces of data needed to perform their task; these are called arguments. Many functions also return a value to the caller when they're finished; this is called a return value.
- The things you need to know before calling a function are:
- What the function does
- The data type (discussed later) of the arguments and what they mean
- The data type of the return value and what it means
All code other than global data definitions and declarations needs to be a part of a function.
Every executable program needs to have one, and only one, main function, which is where the program begins executing.
The Standard Library
In 1983, when C was in the process of becoming standardized, the American National Standards Institute (ANSI) formed a committee to establish a standard specification of C known as "ANSI C". One goal was that there would be a basic set of functions common to each implementation of C. This set is called the Standard Library. The Standard Library provides functions for tasks such as input/output, string manipulation, mathematics, files, memory allocation, and more. The Standard Library does NOT provide functions for anything that might be dependent on hardware or operating system, like graphics, sound, or networking. In the "Hello, World" program, a Standard Library function is used: printf, which writes formatted text to the standard output stream.
Comments and Coding Style
Comments are text inserted into the source code of a program that serve no purpose other than documenting the code. In C, they begin with /* and end with */. Good commenting is considered essential to software development, not just because others may need to read your code, but you may need to come back to your code a long time after writing it and immediately understand how it works. In general, it is a good idea to comment anything that is not immediately obvious to a competent programmer. However, it is not a good idea to comment every line. This may actually make your code more difficult to read and it will waste space.
Good coding style habits are important to adopt for the simple reason that code should be intuitive and readable, which is, after all, the purpose of a high-level programming language like C. In general, provide ample white space, indent so that the opening brace of a block and the closing brace of a block are vertically aligned, and provide intuitive names for your functions and variables. Throughout this text we will be providing more style and coding style tips for you. Do try and follow these tips: they will make your code easier for you and others to read and understand.
Having covered the basic concepts of C programming, we can now briefly discuss the process of compilation.
Like any programming language, C by itself is completely incomprehensible to a microprocessor. Its purpose is to provide an intuitive way for humans to provide instructions that can be easily converted into machine code, and therefore something that is comprehensible to the microprocessor. The compiler is what takes this code, and translates it into the machine code.
To those new to programming, this seems fairly simple. A naive compiler might read in every source file, translate everything into machine code, and write out an executable. This could work, but has two serious problems. First, for a large project, the computer may not have enough memory to read all of the source code at once. Second, if you make a change to a single source file, you would rather not have to recompile the entire application.
To deal with these problems, compilers break their job down into steps; for each source file (each .c file), the compiler reads the file, reads the files it references with #include, and translates it to machine code. The result of this is an "object file" (.o). Once every object file is made, a "linker" collects all of the object files and writes the actual program. This way, if you change one source file, only that file needs to be recompiled and then the application needs to be re-linked.
Without going into the painful details, it can be beneficial to have a superficial understanding of the compilation process. In brief, here it is:
Preprocessor
Many times you will need to give special instructions to your compiler. This is done by inserting preprocessor directives into your code. When you begin compiling your code, a subprogram called the preprocessor scans the source code and performs simple substitution of tokenized strings for others according to predefined rules.
In the C language, all preprocessor directives begin with the hash character (#). You can see one preprocessor directive in "Hello, World!": #include.
#include <stdio.h>
This directive replaces the #include line with the actual source code in the file stdio.h. Other directives include #pragma, for compiler settings, and #define, for macros. The result of the preprocessing stage is a single stream of text that is handed to the compiler proper.
One thing to remember is that these directives are NOT compiled as part of your source code: the preprocessor acts on them before the compiler proper translates the program.
Syntax Checking
This step ensures that the code is valid and can be translated into an executable program.
Object Code
The compiler produces a machine code equivalent of the source code that can then be linked into the final program.
Linking
Linking combines the separate object files into one complete program by integrating libraries and the code and producing either an executable program or a library. Linking is performed by a linker, which is often shipped with, and invoked by, the compiler.
It's important to note, after discussing the basics, that compilation is a "one way street". That is, compiling a C source file into machine code is easy, but "decompiling" (turning machine code into the C source that creates it) is not. Decompilers for C do exist, but they rarely create useful code.
C Structure and Style
This is a basic introduction to producing effective code structure in the C Programming Language. It is designed to provide information on how to effectively use indentations, comments, and other elements that will make your C code more readable. It is not a tutorial on actually programming in C.
New programmers will often not see the point of creating structure in their programs' code, because they often think that code is designed purely for a compiler to read. This is often not the case, because well-written code that follows a well-designed structure is usually much easier for programmers (who haven't worked on the code for months) to read, and edit.
In the following sections, we will attempt to explain good programming techniques that will in turn make your programs more effective.
Introduction
The following two blocks of code are essentially the same: both of them contain exactly the same code, and will compile and execute with the same result. However, there is one essential difference.
Which of the following programs do you think is easier to read?
#include <stdio.h>
int main(void) {printf("Hello, World!\n");return 0;}
or
#include <stdio.h>

int main(void)
{
    printf("Hello, World!\n");
    return 0;
}
The simple use of indents and line breaks can greatly improve the readability of the code without making any impact whatsoever on how the code performs. By having readable code, it is much easier to see where functions and procedures end, and which lines are part of which loops and procedures.
This chapter is going to focus on the above piece of code, and how to improve it. Please note that during the course of the tutorial, there will be many (apparently) redundant pieces of code added. These are only added to provide examples of techniques that we will be explaining, without breaking the overall flow of code that the program achieves.
Line Breaks and Indentation
The addition of white space inside your code is arguably the most important part of good code structure. Effective use of it can create a visual gauge of how your code flows, which can be very important when returning to your code when you want to maintain it.
Line Breaks
10 #include <stdio.h>
20 int main(void){ int i=0; printf("Hello, World!"); for (i=0; i<1; i++){ printf("\n"); break; } return 0; }
Rather than putting everything on one line, it is much more readable to break up long lines so that each statement and each declaration goes on its own line.
Blank Lines
Blank lines should be used in three main parts of your code:
- After preprocessor directives.
- After new variables are declared.
- Between new paths of code (i.e., before the declaration of a function or loop, and after the closing '}' bracket).
Note that we have added line numbers to the start of the lines. Using these in actual code will make your compiler fail; they are only there for reference in this book.
10 #include <stdio.h>
20 int main(void)
30 {
40 int i=0;
50 printf("Hello, World!");
60 for (i=0; i<1; i++)
70 {
80 printf("\n");
90 break;
100 }
110 return 0;
120 }
130
Based on the rules we established earlier, there should now be four blank lines added:
- Between lines 10 and 20, as line 10 is a preprocessor directive
- Between lines 40 and 50, as the block above it contains variable declarations
- Between lines 50 and 60, as it is the beginning of a new path (the 'for' loop)
- Between lines 100 and 110, as it is the end of a path of code
This makes the code much more readable than it was before.
The following lines of code have blank lines between their parts, but no indentation yet.
10 #include <stdio.h>
11
20 int main(void)
30 {
40 int i=0;
41
50 printf("Hello, World!");
51
60 for (i=0; i<1; i++)
70 {
80 printf("\n");
90 break;
100 }
101
110 return 0;
120 }
But this still isn't as readable as it can be.
[edit] Indentation
Although adding simple line breaks between key blocks of code can make code marginally easier to read, it provides no gauge of the flow of the program. The use of your tab key can be very useful now: indentation visually separates paths of code by moving their starting points to a new column in the line. This simple practice will make it much easier to read code. Indentation follows a fairly simple rule:
- All code inside a new path (i.e. between the opening '{' and closing '}' brackets) should be indented by one tab more than the code in the previous path.
So, based on our code from the previous section, there are two paths that require indentation (the line numbers are those of the listing that follows):
- Lines 40 to 100
- Lines 70 and 80
10  #include <stdio.h>
11
20  int main(void)
30  {
40      int i=0;
41
50      printf("Hello, World!");
51
60      for (i=0; i<1; i++)
61      {
70          printf("\n");
80          break;
90      }
91
100     return 0;
110 }
120
It is now fairly obvious which parts of the program fit inside which paths of code. You can tell which parts of the program will loop, and which ones will not. Although the benefit might not be immediately noticeable in a program this small, indentation becomes very important once many nested loops and paths are added to the structure of the program.
NOTE: Many text editors automatically indent appropriately when you hit the enter/return key.
[edit] Comments
Comments in code can be useful for a variety of purposes. They provide the easiest way to point out specific parts of code (and their purpose), and they provide a visual "split" between various parts of your code. Having good commentary throughout your code will make it much easier to remember what specific parts of your code do.
Comments in modern flavours of C (and many other languages) can come in two forms:
//Single Line Comments
and
/*Multi-Line Comments*/
This section is going to focus on the various uses of each form of commentary.
[edit] Single-line Comments
Single-line comments are most useful for simple 'side' notes that explain what certain parts of the code do. The best places to put these comments are next to variable declarations, and next to pieces of code that may need explanation.
Based on our previous program, there are two good places to put comments:
- Line 40, to explain what 'int i' is going to do
- Line 80, to explain why there is a 'break' keyword.
This will make our program look something like this:
10  #include <stdio.h>
11
20  int main(void)
30  {
40      int i=0; //Temporary variable used for 'for' loop.
41
50      printf("Hello, World!");
51
60      for (i=0; i<1; i++)
61      {
70          printf("\n");
80          break; //Exits 'for' loop.
90      }
91
100     return 0;
110 }
[edit] Multi-line Comments
Multi-line comments are most useful for long explanations of code. They can be used as copyright/licensing notices, and they can also be used to explain the purpose of a path of code. This is useful in two ways: they make your functions easier to understand, and they make it easier to spot errors in code (if you know what a path is supposed to do, then it is much easier to find the piece of code that is responsible).
As an example, suppose we had a program that was designed to print "Hello, World! " a certain number of times, on a certain number of lines. There would be many for loops in this program. For this example, we shall call the number of lines i, and the number of strings per line j.
A good example of a multi-line comment that describes 'for' loop i's purpose would be:
/* For Loop (int i)
   Loops the following procedure i times (for number of lines).
   Performs 'for' loop j on each loop, and prints a new line
   at end of each loop. */
This provides a good explanation of what 'i's purpose is, whilst not going into detail of what 'j' does. By going into detail over what the specific path does (and not ones inside it), it will be easier to troubleshoot the path.
Similarly, you should always include a multi-line comment as the first thing inside a function, to state the name of the function, the input that it will take and how it takes that input, the output, and the overall procedure that the function is designed to perform. Always leave the technical details to the individual code paths inside your program - this makes it easier to troubleshoot.
A function descriptor should look something like:
/* Function : int hworld (int i,int j)
   Input    : int i (Number of lines), int j (Number of instances per line)
   Output   : 0 (on success)
   Procedure: Prints "Hello, World!" j times, and a new line to standard
              output over i lines. */
This system allows for an at-a-glance explanation of what the function should do. You can then go into detail over how each aspect of the program is achieved later on in the program.
Finally, if you like to have aesthetically-pleasing source code, the multi-line comment system allows for the easy addition of starry borders to your comment. These make the comments stand out much more than they would without the border (especially buried deep in source code). They should take a format similar to:
/***************************************
 * This is a multi line comment        *
 * That is surrounded by a             *
 * Cool, starry border!                *
 ***************************************/
Applied to our original program, these comments give us much more descriptive and readable source code:
10  #include <stdio.h>
11
20  int main(void)
30  {
31      /***********************************************************************************
32       * Function : int main(void)                                                      *
33       * Input    : none                                                                *
34       * Output   : Returns 0 on success                                                *
35       * Procedure: Prints "Hello, World!" and a new line to standard output then exits. *
36       ***********************************************************************************/
40      int i=0; //Temporary variable used for 'for' loop.
41
50      printf("Hello, World!");
51
52      /* FOR LOOP (int i)
53         Prints a new line to standard output, and exits */
60      for (i=0; i<1; i++)
61      {
70          printf("\n");
80          break; //Exits 'for' loop.
90      }
91
100     return 0;
110 }
This will allow any outside users of the program an easy way to understand what the code does, and how it works. It also prevents confusion with other like-named functions.
[edit] Examples
[edit] Links
- Aladdin's C coding guidelines - A more definitive C coding guideline.
- C/C++ Programming Styles GNU Coding styles & Linux Kernel Coding style
C has no native support for exception handling, the structured form of error handling found in many other languages. The programmer must instead anticipate errors, most often by testing the return values of functions. Several functions return -1 or NULL to indicate a problem that the programmer should be aware of: socket() (Unix socket programming) returns -1 and malloc() returns NULL, for example. In a worst case scenario where there is an unavoidable error and no way to recover from it, a C programmer usually tries to log the error and "gracefully" terminate the program.
There is an external variable called "errno", accessible to programs after including <errno.h>. That header defines the error codes that can occur when a program asks the operating system for resources (on Linux, for example, the generic definitions are in include/asm-generic/errno.h). The value of errno identifies an error description, which can be obtained by calling 'strerror( errno )'.
The following code tests the return value from the library function malloc to see if dynamic memory allocation completed properly:
#include <stdio.h>  /* fprintf, stderr */
#include <errno.h>  /* errno */
#include <stdlib.h> /* malloc, free, exit */
#include <string.h> /* strerror */

int main( void )
{
    /* pointer to char, requesting dynamic allocation of 2,000,000,000
     * storage elements (declared as an integer constant of type
     * unsigned long int). (If your system cannot provide 2GB of
     * memory, then this call to malloc will fail) */
    char *ptr = malloc( 2000000000UL );

    if ( ptr == NULL )
    {
        /* POSIX systems set errno when malloc fails; ISO C does not
         * require it, so the description may be uninformative elsewhere */
        fprintf( stderr, "malloc failed: %s\n", strerror( errno ) );
        exit( EXIT_FAILURE );
    }

    /* the rest of the code hereafter can assume that 2,000,000,000
     * chars were successfully allocated... */
    free( ptr );

    exit( EXIT_SUCCESS ); /* exiting program */
}
The code snippet above shows the use of the return value of the library function malloc to check for errors. Many library functions have return values that flag errors, and thus should be checked by the astute programmer. In the snippet above, a NULL pointer returned from malloc signals an error in allocation, so the program exits. In more complicated implementations, the program might try to handle the error and try to recover from the failed memory allocation.
[edit] Handling divide by zero errors
A common pitfall for C programmers is not checking whether a divisor is zero before dividing. Integer division by zero is undefined in C; the following code will produce a runtime error and, in most cases, terminate the program.
int dividend = 50;
int divisor = 0;
int quotient;

quotient = (dividend/divisor); /* This will produce a runtime error! */
For reasons beyond the scope of this document, you must check or otherwise make sure that a divisor is never zero. Alternatively, for *nix processes, you can stop the OS from terminating your process by blocking the SIGFPE signal, although POSIX leaves the behaviour of a process undefined once a hardware-generated SIGFPE has been blocked or ignored, so this is not a substitute for the check.
The code below fixes this by checking if the divisor is zero before dividing.
#include <stdio.h>  /* for fprintf and stderr */
#include <stdlib.h> /* for exit */

int main(void)
{
    int dividend = 50;
    int divisor = 0;
    int quotient;

    if (divisor == 0)
    {
        /* Example handling of this error. Writing a message to stderr, and
         * exiting with failure. */
        fprintf(stderr, "Division by zero! Aborting...\n");
        exit(EXIT_FAILURE); /* indicate failure.*/
    }

    quotient = (dividend/divisor);
    printf("%d\n", quotient);

    return 0;
}
[edit] Variables
Like most programming languages, C is able to use and process named variables and their contents. Variables are simply names used to refer to some location in memory - a location that holds a value with which we are working.
It may help to think of variables as a placeholder for a value. You can think of a variable as being equivalent to its assigned value. So, if you have a variable i that is initialized (set equal) to 4, then it follows that i+1 will equal 5.
Since C is a relatively low-level programming language, a C program must claim the memory needed to store a variable's value before it can use that variable. This is done by declaring variables. Declaring variables is the way in which a C program states how many variables it needs, what they are going to be named, and how much memory they will need.
Within the C programming language, when managing and working with variables, it is important to know the type of variables and the size of these types. Since C is a fairly low-level programming language, these aspects of its working can be hardware specific - that is, how the language is made to work on one type of machine can be different from how it is made to work on another.
All variables in C are "typed". That is, every variable must be declared with a specific type.
[edit] Declaring, Initializing, and Assigning Variables
Here is an example of declaring an integer, which we've called some_number. (Note the semicolon at the end of the line; that is how your compiler separates one program statement from another.)
int some_number;
This statement means we're declaring some space for a variable called some_number, which will be used to store integer data. Note that we must specify the type of data that a variable will store. There are specific keywords to do this - we'll look at them in the next section.
Multiple variables can be declared with one statement, like this:
int anumber, anothernumber, yetanothernumber;
We can also declare and assign some content to a variable at the same time. This is called initialization because it is the "initial" time a value has been assigned to the variable:
int some_number=3;
In the original C standard (C89), all variable declarations (except for globals) must be placed at the beginning of a block, and compilers following that standard do not let you declare your variables, insert some other statements, and then declare more variables. C99 and later allow declarations anywhere in a block, but declaring variables at the start of a block remains a common and portable convention.
After declaring variables, you can assign a value to a variable later on using a statement like this:
some_number=3;
You can also assign a variable the value of another variable, like so:
anumber = anothernumber;
Or assign multiple variables the same value with one statement:
anumber = anothernumber = yetanothernumber = 3;
This works because an assignment expression such as x = y has a value of its own: the value that was assigned to x. Assignment groups from right to left, so x = y = z is really shorthand for x = (y = z).
[edit] Naming Variables
Variable names in C are made up of letters (upper and lower case) and digits. The underscore character ("_") is also permitted. Names must not begin with a digit. Unlike some languages (such as Perl and some BASIC dialects), C does not use any special prefix characters on variable names.
Some examples of valid (but not very descriptive) C variable names:
foo
Bar
BAZ
foo_bar
_foo42
_
QuUx
Some examples of invalid C variable names:
2foo (must not begin with a digit)
my foo (spaces not allowed in names)
$foo ($ not allowed -- only letters, digits, and _)
while (language keywords cannot be used as names)
As the last example suggests, certain words are reserved as keywords in the language, and these cannot be used as variable names.
In addition there are certain sets of names that, while not language keywords, are reserved for one reason or another. For example, a C compiler might use certain names "behind the scenes", and this might cause problems for a program that attempts to use them. Also, some names are reserved for possible future use in the C standard library. The rules for determining exactly what names are reserved (and in what contexts they are reserved) are too complicated to describe here, and as a beginner you don't need to worry about them much anyway. For now, just avoid using names that begin with an underscore character.
The naming rules for C variables also apply to other language constructs such as function names, struct tags, and macros, all of which will be covered later.
[edit] Literals
Any time you specify a value explicitly in a program, instead of referring to a variable or some other form of data, that value is referred to as a literal. In the initialization example above, 3 is a literal. Literals take a form determined by their type (more on that soon). Integer literals can also be written in hexadecimal (hex) notation, which is always preceded with 0x; for example, 0xFF is the same value as 255. For now, though, you probably shouldn't be too concerned with hex.
[edit] The Four Basic Types
In Standard C there are four basic data types. They are int, char, float, and double.
[edit] The int type
The int type stores integers in the form of "whole numbers". Its size depends on the machine: the C standard only guarantees at least 16 bits, but on most modern home PCs an int is 32 bits (4 octets). Examples of literals are whole numbers (integers) such as 1, 2, 3, 10, 100... When int is 32 bits (4 octets), it can store any whole number (integer) between -2147483648 and 2147483647. A 32 bit word (number) can represent 4294967296 distinct values (2 to the power of 32).
If you want to declare a new int variable, use the int keyword. For example:
int numberOfStudents, i, j=5;
This declaration declares 3 variables: numberOfStudents, i and j. Only j is initialized, here with the literal 5; numberOfStudents and i hold no meaningful value until something is assigned to them.
[edit] The char type
The char type is capable of holding any member of the execution character set. It stores the same kind of data as an int (i.e. integers), but always has a size of one byte. The size of a byte is specified by the macro CHAR_BIT which specifies the number of bits in a char (byte). In standard C it never can be less than 8 bits. A variable of type char is most often used to store character data, hence its name. Most implementations use the ASCII character set as the execution character set, but it's best not to know or care about that unless the actual values are important.
Examples of character literals are 'a', 'b', '1', etc., as well as some special characters such as '\0' (the null character) and '\n' (newline, recall "Hello, World"). Note that the char value must be enclosed within single quotations.
When we initialize a character variable, we can do it two ways. One is preferred, the other way is bad programming practice.
The first way is to write
char letter1 = 'a';
This is good programming practice in that it allows a person reading your code to see at once that letter1 is being initialized with the letter 'a'.
The second way, which should not be used when you are coding letter characters, is to write
char letter2 = 97; /* in ASCII, 97 = 'a' */
This is considered by some to be extremely bad practice when the variable is meant to store a character rather than a small number, because most readers are forced to look up which character corresponds to the number 97 in the encoding scheme. In the end, letter1 and letter2 both store the same thing -- the letter "a" -- but the first method is clearer, easier to debug, and much more straightforward.
One important thing to mention is that characters for numerals are represented differently from their corresponding number, i.e. '1' is not equal to 1.
There is one more kind of literal that needs to be explained in connection with chars: the string literal. A string is a series of characters, usually intended to be displayed. They are surrounded by double quotes (" ", not ' '). An example of a string literal is the "Hello, world!\n" in the "Hello, World" example.
[edit] The float type
float is short for floating point. It stores real numbers, both integer and non-integer values, in single precision; its size is typically 4 bytes. It is used when less precision than a double provides is sufficient. float literals must be suffixed with F or f, otherwise they will be interpreted as doubles. Examples are: 3.1415926f, 4.0f, 6.022e+23f. float variables can be declared using the float keyword.
[edit] The double type
The double and float types are very similar. The float type allows you to store single-precision floating point numbers, while the double type allows you to store double-precision floating point numbers, which carry more significant digits. Its size is typically 8 bytes on most machines. Examples of double literals are 3.1415926535897932, 4.0, 6.022e+23 (scientific notation). If you use 4 instead of 4.0, the 4 will be interpreted as an int.
The distinction between floats and doubles was made because of the differing sizes of the two types. When C was first used, space was at a minimum and so the judicious use of a float instead of a double saved some memory. Nowadays, with memory more freely available, you do not really need to conserve memory like this - it may be better to use doubles consistently. Indeed, C itself sometimes converts a float to a double behind the scenes: a float passed to a function such as printf() is promoted to double.
If you want to use a double variable, use the double keyword.
[edit] sizeof
If you have any doubts as to the amount of memory actually used by any type (and this goes for types we'll discuss later, also), you can use the sizeof operator to find out for sure. (For completeness, it is important to mention that sizeof is a compile-time unary operator, not a function.) Its syntax is:
sizeof object
sizeof(type)
The two expressions above return the size of the object and type specified, in bytes. The return type is size_t (defined in the header <stddef.h>) which is an unsigned value. Here's an example usage:
size_t size;
int i;

size = sizeof(i);
size will be set to 4, assuming CHAR_BIT is defined as 8, and an integer is 32 bits wide. The value of sizeof's result is the number of bytes.
Note that when sizeof is applied to a char, the result is 1; that is:
sizeof(char)
always returns 1. (A character literal is a different matter: in C, 'a' has type int, so sizeof('a') equals sizeof(int), whereas in C++ it is 1.)
[edit] Data type modifiers
One can alter the data storage of any data type by preceding it with certain modifiers.
long and short are modifiers that make it possible for a data type to use either more or less memory. The int keyword need not follow the short and long keywords. This is most commonly the case. A short can be used where the values fall within a lesser range than that of an int, typically -32768 to 32767. A long can be used to contain an extended range of values. It is not guaranteed that a short uses less memory than an int, nor is it guaranteed that a long takes up more memory than an int. It is only guaranteed that sizeof(short) <= sizeof(int) <= sizeof(long). Typically a short is 2 bytes, an int is 4 bytes, and a long either 4 or 8 bytes.
In all of the integer types described above, one bit is used to indicate the sign (positive or negative) of a value. If you decide that a variable will never hold a negative value, you may use the unsigned modifier to use that one bit for storing other data, roughly doubling the largest value that can be stored while ruling out negative values. The unsigned specifier also may be used without a trailing int, in which case the size defaults to that of an int. There is also a signed modifier which is the opposite, but it is seldom used, since all integer types except char are signed by default; whether a plain char is signed depends on the implementation, which is why signed is occasionally needed with char.
To use a modifier, just declare a variable with the data type and relevant modifiers:
unsigned short int usi; /* fully qualified - unsigned short int */
short si;               /* short int */
unsigned long uli;      /* unsigned long int */
[edit] const qualifier
When the const qualifier is used, the declared variable should be initialized at declaration, because it is not allowed to be changed afterwards.
While the idea of a variable that never changes may not seem useful, there are good reasons to use const. For one thing, many compilers can perform some small optimizations on data when they know that the data will never change. For another, const protects a value from accidental modification: if you need the value of π in your calculations, you can declare a const variable pi, so that a program or another function written by someone else cannot change the value of pi.
Note that a Standard conforming compiler must issue a warning if an attempt is made to change a const variable - but after doing so the compiler is free to ignore the const qualifier.
[edit] Magic numbers
When you write C programs, you may be tempted to write code that will depend on certain numbers. For example, you may be writing a program for a grocery store. This complex program has thousands upon thousands of lines of code. The programmer decides to represent the cost of a can of corn, currently 99 cents, as a literal throughout the code. Now, assume the cost of a can of corn changes to 89 cents. The programmer must now go in and manually change each entry of 99 cents to 89. While this is not that big of a problem, considering the "global find-replace" function of many text editors, consider another problem: the cost of a can of green beans is also initially 99 cents. To reliably change the price, you have to look at every occurrence of the number 99.
C offers two features that avoid this. They are approximately equivalent, though each can be more useful than the other in particular circumstances.
[edit] Using the const keyword
The const keyword helps eradicate magic numbers. By declaring a variable const corn at the beginning of a block, a programmer can simply change that const and not have to worry about setting the value elsewhere.
There is also another method for avoiding magic numbers. It is much more flexible than const, and also much more problematic in many ways. It also involves the preprocessor, as opposed to the compiler. Behold...
[edit] #define
When you write programs, you can create what is known as a macro, so when the computer is reading your code, it will replace all instances of a word with the specified expression.
Here's an example. If you write
#define PRICE_OF_CORN 0.99
when you want to, for example, print the price of corn, you use the word PRICE_OF_CORN instead of the number 0.99 - the preprocessor will replace all instances of PRICE_OF_CORN with 0.99, which the compiler will interpret as the literal double 0.99. The preprocessor performs substitution, that is, PRICE_OF_CORN is replaced by 0.99 so this means there is no need for a semicolon.
It is important to note that #define has basically the same functionality as the "find-and-replace" function in a lot of text editors/word processors.
#define can also do harm when misused, and it is usually preferable to use const where #define is unnecessary. It is possible, for instance, to #define a macro DOG as the number 3, but if you try to print the macro thinking that DOG represents a string that you can show on the screen, the program will have an error. #define also has no regard for type. It disregards the structure of your program, replacing the text everywhere after the directive (in effect, disregarding scope), which could be advantageous in some circumstances, but can be the source of problematic bugs.
You will see further instances of the #define directive later in the text. It is good convention to write #defined words in all capitals, so a programmer will know that this is not a variable that you have declared but a #defined macro.
[edit] Scope
In the Basic Concepts section, the concept of scope was introduced. It is important to revisit the distinction between local and global variables, and how to declare each. To declare a local variable, you place the declaration at the beginning (i.e. before any non-declarative statements) of the block to which the variable is intended to be local. To declare a global variable, declare the variable outside of any block. If a variable is global, it can be read, and written, from anywhere in your program.
Global variables are not considered good programming practice, and should be avoided whenever possible. They inhibit code readability, create naming conflicts, waste memory, and can create difficult-to-trace bugs. Excessive usage of globals is usually a sign of laziness and/or poor design. However, if there is a situation where local variables would make the code more obscure and unreadable, there's no shame in using globals. (Implementing malloc, which is a function discussed later, is one example of something that is much more difficult to write without at least one global variable.)
[edit] Other Modifiers
Included here, for completeness, are more of the modifiers that standard C provides. For the beginning programmer, static and extern may be useful. volatile is more of interest to advanced programmers. register and auto are largely deprecated and are generally not of interest to either beginning or advanced programmers.
static is sometimes a useful keyword. It is a common misbelief that its only purpose is to make a variable stay in memory.
When you declare a function or global variable as static it will become internal. You cannot access the function or variable through the extern (see below) keyword from other files in your project.
When you declare a local variable as static, it is created just like any other variable. However, when the variable goes out of scope (i.e. the block it was local to is finished) the variable stays in memory, retaining its value. The variable stays in memory until the program ends. While this behaviour resembles that of global variables, static variables still obey scope rules and therefore cannot be accessed outside of their scope.
Variables declared static are initialized to zero (or for pointers, NULL) by default.
You can use static in (at least) two different ways. Consider this code, and imagine it is in a file called jfile.c:
#include <stdio.h>

static int j = 0;

void up(void)
{
    /* k is set to 0 when the program starts. The line is then "ignored"
     * for the rest of the program (i.e. k is not set to 0 every time up()
     * is called) */
    static int k = 0;
    j++;
    k++;
    printf("up() called. k= %2d, j= %2d\n", k , j);
}

void down(void)
{
    static int k = 0;
    j--;
    k--;
    printf("down() called. k= %2d, j= %2d\n", k , j);
}

int main(void)
{
    int i;

    /* call the up function 3 times, then the down function 2 times */
    for (i= 0; i < 3; i++)
        up();
    for (i= 0; i < 2; i++)
        down();

    return 0;
}
The variable j is accessible by both up and down and retains its value. The two k variables also retain their values, but they are separate variables, each confined to the scope of its own function. static variables are a good way to implement encapsulation, a term from the object-oriented way of thinking that effectively means not allowing changes to be made to a variable except through function calls.
Running the program above will produce the following output:
up() called. k=  1, j=  1
up() called. k=  2, j=  2
up() called. k=  3, j=  3
down() called. k= -1, j=  2
down() called. k= -2, j=  1
extern is used when a file needs to access a variable that is defined in another file. An extern declaration does not actually carve out space for a new variable; it just provides the compiler with sufficient information to access the variable defined elsewhere.
volatile is a special type modifier which informs the compiler that the value of the variable may be changed by external entities other than the program itself. This is necessary for certain programs compiled with optimizations - if a variable were not defined volatile then the compiler may assume that certain operations involving the variable are safe to optimize away when in fact they aren't. volatile is particularly relevant when working with embedded systems (where a program may not have complete control of a variable) and when variables are shared with signal handlers or other threads, although volatile by itself does not make access from several threads safe.
auto is a modifier which specifies an "automatic" variable that is automatically created when in scope and destroyed when out of scope. If you think this sounds like pretty much what you've been doing all along when you declare a variable, you're right: all declared items within a block are implicitly "automatic". For this reason, the auto keyword is more like the answer to a trivia question than a useful modifier, and there are lots of very competent programmers that are unaware of its existence.
register is a hint to the compiler to attempt to optimize the storage of the given variable by storing it in a register of the computer's CPU when the program is run. Most optimizing compilers do this anyway, so use of this keyword is often unnecessary. In fact, ANSI C states that a compiler can ignore this keyword if it so desires -- and many do. Microsoft Visual C++ is an example of an implementation that completely ignores the register keyword.
[edit] Concepts
[edit] In this section
[edit] Simple Input and Output
When you take time to consider it, a computer would be pretty useless without some way to talk to the people who use it. Just like we need information in order to accomplish tasks, so do computers. And just as we supply information to others so that they can do tasks, so do computers.
These supplies and returns of information to a computer are called input and output. 'Input' is information supplied to a computer or program. 'Output' is information provided by a computer or program. Frequently, computer programmers will lump the discussion in the more general term input/output or simply, I/O.
In C, there are many different ways for a program to communicate with the user. Amazingly, the most simple methods usually taught to beginning programmers may also be the most powerful. In the "Hello, World" example at the beginning of this text, we were introduced to a Standard Library file stdio.h, and one of its functions, printf(). Here we discuss more of the functions that stdio.h gives us.
[edit] Output using printf()
Recall from the beginning of this text the demonstration program duplicated below:
#include <stdio.h>

int main(void)
{
    printf("Hello, world!\n");
    return 0;
}
If you compile and run this program, you will see the sentence below show up on your screen:
Hello, world!
This amazing accomplishment was achieved by using the function printf(). A function is like a "black box" that does something for you without exposing the internals inside. We can write functions ourselves in C, but we will cover that later.
You have seen that to use printf() one puts text, surrounded by quotes, in between the parentheses. We call the text surrounded by quotes a literal string (or just a string), and we call that string an argument to printf.
As a note of explanation, it is sometimes convenient to include the open and closing parentheses after a function name to remind us that it is, indeed, a function. However usually when the name of the function we are talking about is understood, it is not necessary.
As you can see in the example above, using printf() can be as simple as typing in some text, surrounded by double quotes (