User:Jimregan/C Primer chapter 1

Introduction to C; Functions; Control Constructs

The C programming language is sometimes referred to as a "middle-level" language. It provides low-level programming capability at the expense of some user-friendliness. Some cynics claim that "C combines the flexibility and power of assembly language with the user-friendliness of assembly language."

The original implementations of C were defined as described in the classic reference, The C Programming Language, by Brian Kernighan and Dennis Ritchie. This definition left a few things to be desired, and the American National Standards Institute (ANSI) formed a group in the 1980s to tighten up the spec. The result was "ANSI C", which is the focus of this document.

An Introductory C Program

Here's a simple C program to calculate the volume of a sphere:

  /* sphere.c */
  #include <stdio.h>                /* Include header file for printf. */
  #define PI 3.141592654            /* Define a constant. */
  float sphere( int rad );          /* Function prototype. */
  int main( void )                  /* Main program. */
  {
    float volume;                   /* Declare variable. */
    int radius = 3;                 /* Declare and initialize variable. */
    volume = sphere( radius );      /* Call function, get result, print it. */
    printf( "Volume: %f\n", volume );
    return 0;                       /* Report success to the system. */
  }
  float sphere( int rad )           /* Function. */
  { 
    float result;                   /* Local variable declaration. */
    result = rad * rad * rad;
    result = 4 * PI * result / 3;
    return( result );               /* Result returned to main program. */
  }

The first thing you'll notice about this program is that comments are enclosed by "/*" and "*/". Comments can go almost anywhere, since the compiler ignores them. You'll also notice that most lines of statements in the program end with a ";" and that some do not. Either forgetting a ";" or adding one where it isn't needed is a common C programming bug. Lines of code are grouped by using curly brackets ("{" and "}").

C is case-sensitive. All C keywords are in lower case. You can declare program variable and function names in whatever case you want, but by convention they should be in lower case. Constants (more on these momentarily) are upper case by convention.

Not all the lines in a C program are executable statements. Some of the statements shown above are "preprocessor directives".

C compilation is a multi-pass process. To create a program, a C compiler system performs the following steps:

  • It runs the source file text through a "C preprocessor". All this does is perform various text manipulations on the source file, such as macro expansion, constant expansion, file inclusion, and conditional compilation (more on all these things later). The output of the preprocessor is a second-level source file for actual compilation. You can think of the C preprocessor as a sort of specialized "text editor".
  • Next, it runs the second-level source file through the compiler proper, which actually converts the source code statements into their binary equivalents. That is, it creates an "object file" from the source file.

The object file still cannot be executed, however. If it makes use of C library functions, such as "printf()" in the example above, the binary code for the library functions has to be merged, or "linked", with the program object file. Furthermore, some addressing information needs to be linked to the object file so it can actually be loaded and run on the target system.

These linking tasks are performed by a "linker", which takes one or more object files and links them to binary library files to create an "executable" file that can actually be run. Note that you can create programs from a set of multiple object files using the linker.

C has a large number of libraries and library functions. C by itself has few statements, so much of its functionality is implemented as library calls.

Commands intended for the C preprocessor, rather than for the C compiler itself, start with a "#" and are known as "preprocessor directives" or "metacommands". The example program above has two such metacommands:

 #include <stdio.h>
 #define PI 3.141592654

The first statement, "#include <stdio.h>", simply merges the contents of the file "stdio.h" into the current program file before compilation. The "stdio.h" file contains declarations required for use of the standard-I/O library, which provides the "printf()" function.

Incidentally, the "stdio.h" file, or "header file", only contains declarations. The actual code for the library is contained in separate library files that are added at link time. You can create your own header files with your own declarations if you like, and include them as follows:

  #include "mydefs.h"

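Angle brackets are used for standard header files, which the C preprocessor looks for in the system's default directories. Double quotes are used for your own header files, which are normally looked for first in the directory containing the source file.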

A C program is built up of one or more functions. The program above contains two user-defined functions, "main()" and "sphere()", as well as the "printf()" library function.

The "main()" function is mandatory when writing a self-contained program. It defines the function that is automatically executed when the program is run. All other functions will be directly or indirectly called by "main()".

You call a C function simply by specifying its name, with any arguments enclosed in following parentheses, with commas separating the arguments. In the program above, the "printf()" function is called as follows:

  printf( "Volume: %f\n", volume );

This invocation provides two arguments. The first -- "Volume: %f\n" -- supplies text and some formatting information. The second, "volume", supplies a numeric value.

A function may or may not return a value. The "sphere()" function does, and so is invoked as follows:

  volume = sphere( radius );

A function uses the "return" keyword to return a value. In the case of "sphere", it returns the volume of the sphere with the statement:

  return( result );

All variables in a C program must be "declared" by specifying their name and type. The example program declares two variables for the "main" routine:

  float volume;
  int radius = 3;

-- and one in the "sphere" routine:

  float result;

The declarations of "volume" and "result" specify a floating-point, or real, variable. The declaration of "radius" specifies an integer variable. The declaration allows variables to be initialized when declared if need be, in this case declaring "radius" and assigning it a value of "3".

All three of these declarations define "local" variables. Local variables exist only within the functions that declare them. You could declare variables of the same name in different functions, and they would still remain distinct variables. You can also declare "global" variables that can be shared by all functions by declaring them outside the program's functions and then using the "extern" keyword within the functions to allow access to them. When the global variable is defined earlier in the same source file, as in the example below, the "extern" declarations are optional:

  /* global.c */
  #include <stdio.h>
  void somefunc( void );
  int globalvar;
  int main( void )
  {
    extern int globalvar;
    globalvar = 42;
    somefunc();
    printf( "%d\n", globalvar );
    return 0;
  }
  void somefunc( void )
  {
    extern int globalvar;
    printf( "%d\n", globalvar );
    globalvar = 13;
  }

You'll notice that besides the variable declarations, there is also a function declaration, or "function prototype", that allows the C compiler to check that any calls to the function are correct:

  float sphere( int rad );

A function prototype declares the type of value the function returns (the type will be "void" if it does not return a value) and the types of the arguments that are to be passed to the function.

Finally, the "printf()" library function provides text output capabilities for the program. You can use "printf()" to print a simple message as follows:

  printf( "Hello, world!" );

-- displays the text:

  Hello, world!

Remember that "printf()" doesn't automatically add a "newline" to allow following "printf()"s to print on the next display line. If you execute:

  printf( "Twas bryllig " );
  printf( "and the slithy toves" );

-- you get the text:

  Twas bryllig and the slithy toves

You must add a newline character ("\n") to force a newline. For example:

  printf( "Hello,\nworld!" );

-- gives:

  Hello,
  world!

These examples only print a predefined text constant. You can also include "format codes" in the string and then follow the string with one or more variables to print the values they contain:

  printf( "Result = %f\n", result );

This would print something like:

  Result = 0.500000

The "%f" is the format code that tells "printf" to print a floating-point number. For another example, consider:

  printf( "%d times %d = %d\n", a, b, a * b );

-- which would print something like:

  4 times 10 = 40

The "%d" prints an integer quantity. Arithmetic expressions and function calls can be included in the argument list.

If you simply want to print a string of text, you can use a simpler function, "puts()", that displays the specified text and automatically appends a newline:

  puts( "Hello, world!" );

Just for fun, let's take a look at what our example program would be like in the earlier versions of C:

  /* oldspher.c */
  #include <stdio.h>
  #define PI 3.141592654
  float sphere();        /* Parameters not defined in function prototype. */
  main()
  {
    float volume;
    int radius = 3;
    volume = sphere( radius );
    printf( "Volume: %f\n", volume );
  }
  float sphere( rad )
  int rad;          /* Parameter type not specified in function header. */
  { 
    float result;
    result = rad * rad * rad;
    result = 4 * PI * result / 3;
    return result;
  }

The following sections elaborate on the principles outlined in this section. They may repeat information presented above for the sake of completeness.

C Functions in Detail

As noted previously, any C program must have a "main()" function to contain the code executed by default when the program is run.

There can be as many functions as you like in the program, and all functions are "visible" to all other functions. For example, if you have:

  /* fdomain.c */
  #include <stdio.h>
  void func1( void );
  void func2( void );
  
  int main( void )
  {
    puts( "MAIN" );
    func1();
    func2();
    return 0;
  }
  void func1( void )
  {
    puts( "FUNC1" );
  }
  void func2( void )
  {
    puts( "FUNC2" );
    func1();
  }

-- then "main()" can call "func1()" and "func2()"; "func1()" could call "func2()"; and "func2()" can call "func1()". In principle, even "main()" could be called by other functions, but there's no intelligent reason to do so. Although "main()" is the first function in the listing above, there's no particular requirement that it be so, but by convention it always is.

Functions can call themselves recursively. For example, "func1()" can call "func1()" indefinitely, or at least until a stack overflow occurs. You cannot define functions inside other functions.

Functions are defined as follows:

  float sphere( int rad )
  { 
     ...
  }

They begin with a function header that starts with a return value type declaration ("float" in this case), then the function name ("sphere"), and finally the arguments required ("int rad").

ANSI C provides function prototypes to allow the compiler to perform better checking on function calls:

  float sphere( int rad );

For an example, consider a simple program that "fires" a weapon (simply by printing "BANG!"):

  /* bango.c */
  #include <stdio.h>
  void fire( void );
  int main( void )
  {
    printf( "Firing!\n" );
    fire();
    printf( "Fired!\n" );
    return 0;
  }
  void fire( void )
  {
    printf( "BANG!\n" );
  }

Since "fire()" does not return a value and does not accept any arguments, both the return value and the argument are declared as "void"; "fire()" also does not use a "return" statement and simply returns automatically when completed.

Let's modify this example to allow "fire()" to accept an argument that defines a number of shots. This gives the program:

  /* fire.c */
  #include <stdio.h>
  void fire( int n );
  int main( void )
  {
    printf( "Firing!\n" );
    fire( 5 );
    printf( "Fired!\n" );
    return 0;
  }
  void fire( int n )
  {
    int i;
    for ( i = 1; i <= n ; ++i )
    {
      printf( "BANG!\n" );
    }
  }

This program passes a single parameter, an integer, to the "fire()" function. The function uses a "for" loop to print "BANG!" the specified number of times (more on "for" later).

If a function requires multiple arguments, they can be separated by commas:

  printf( "%d times %d = %d\n", a, b, a * b );

The word "parameter" is sometimes used in place of "argument". There is actually a fine distinction between these two terms: the calling routine specifies "arguments" to the called function, while the called function receives the "parameters" from the calling routine.

When you list a parameter in the function header, it becomes a local variable to that function. It is initialized to the value provided as an argument by the calling routine. If a variable is used as an argument, there is no need for it to have the same name as the parameter specified in the function header.

For example:

 fire( shots );
 ...
 void fire( int n )
 ... 

The integer variable passed to "fire()" has the name "shots", but "fire()" accepts the value of "shots" in a local variable named "n". The argument and the parameter could also have the same name, but even then they would remain distinct variables.

Parameters are matched with arguments in the order in which they are sent:

  /* pmmatch.c */
  #include <stdio.h>
  void showme( int a, int b );
  int main( void )
  {
    int x = 1, y = 100;
    showme( x, y );
    return 0;
  }
  void showme( int a, int b )
  {
    printf( "a=%d  b=%d\n", a, b );
  }

This prints:

  a=1  b=100

You can also modify this program to show that the arguments are not affected by any operations the function performs on the parameters, as follows:

  /* noside.c */
  #include <stdio.h>
  void showmore( int a, int b );
  int main( void )
  {
     int x = 1, y = 100;
     showmore( x, y );
     printf( "x=%d  y=%d\n", x, y );
     return 0;
  }
  void showmore( int a, int b )
  {
     printf( "a=%d  b=%d\n", a, b );
     a = 42;
     b = 666;
     printf( "a=%d  b=%d\n", a, b );
  }

This prints:

  a=1  b=100
  a=42  b=666
  x=1  y=100

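You can pass arrays to functions using much the same syntax as for any other type of variable. Unlike other arguments, however, an array is not copied; the function works on the caller's array, so any changes it makes to the elements are visible to the caller: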

  /* fnarray.c */
  #include <stdio.h>
  #define SIZE 10
  
  void testfunc( int a[] );
  
  int main( void )
  {
    int ctr, a[SIZE];
    for( ctr = 0; ctr < SIZE; ++ctr )
    {
      a[ctr] = ctr * ctr;
    }
    testfunc( a );
    return 0;
  }
  
  void testfunc( int a[] )
  {
    int n;
    for( n = 0; n < SIZE; ++n )
    {
      printf( "%d\n", a[n] );
    }
  }

Although a novice programmer would not want to deal with such complications, it is possible to define functions with a variable number of parameters. In fact, "printf()" is such a function. We won't worry about this issue further in this document.

The normal way to get a value out of a function is simply to provide it as a return value. This neatly encapsulates the function and isolates it from the calling routine. In the example in the first section, the function "sphere()" returned a "float" value with the statement:

  return( result );

The calling routine accepted the return value as follows:

  volume = sphere( radius );

The return value can be used directly as an argument to other functions:

  printf( "Volume: %f\n", sphere( radius ) );

The return value does not have to be used; "printf()", for example, returns the number of characters it prints, but few programs bother to check.

A function can contain more than one "return" statement:

  if( error == 0 )
  {
    return( 0 );
  }
  else
  {
    return( 1 );
  }

You can place "return" anywhere in a function, and it does not have to return a value. Without a value, "return" simply causes the function to return control to the calling routine. This of course implies that the data type of the function be declared as "void":

  void ftest( int somevar )
  {
     ...
     if( error == 0 )
     {
       return;
     }
     ...
  }

If there's no "return" in a function, the function returns after it executes its last statement. Again, this means the function type must be declared "void".

The "return" statement can only return a single value, but this value can be a "pointer" to an array or a data structure. Pointers are a complicated subject and will be discussed in detail later. They can also be used to return values through an argument list.

C Control Constructs

C contains a number of looping constructs, such as the "while" loop:

  /* while.c */
  #include <stdio.h>
  int main( void )
  {
    int test = 10;
    while( test > 0 )
    {
      printf( "test = %d\n", test );
      test = test - 2;
    }
    return 0;
  }

This loop may not execute at all, if "test" starts with an initial value less than or equal to 0. There is a variant, "do", that will always execute at least once:

  /* do.c */
  #include <stdio.h>
  int main( void )
  {
    int test = 10;
    do 
    {
      printf( "test = %d\n", test );
      test = test - 2;
    }
    while( test > 0 );
    return 0;
  }

The most common looping construct, however, is the "for" loop, which creates a loop much like the "while" loop but in a more compact form:

  /* for.c */
  #include <stdio.h>
  
  int main( void )
  {
    int test;
    for( test = 10; test > 0; test = test - 2 )
    {
      printf( "test = %d\n", test );
    }
    return 0;
  }

Notice that in the "while" and "for" loops, the initial loop statement does not end with a ";" (the "while" that closes a "do" loop is the exception). If you added a ";" after the "for" statement above, the "for" loop would run to completion with an empty body, and the bracketed statements would then be executed just once, after the loop had finished.

The "for" loop has the syntax:

  for( <initialization>; <operating test>; <modifying expression> )

All the elements in parentheses are optional. You could actually run a "for" loop indefinitely with:

  for( ; ; )
  {
    ...
  }

-- although using an indefinite "while" is cleaner:

  while( 1 )
  {
    ...
  }

You can use multiple expressions in either the initialization or the modifying expression with the "," operator:

  /* formax.c */
  #include <stdio.h>
  int main( void )
  {
    int a, b;
    for ( a = 256, b = 1; b < 512 ; a = a / 2, b = b * 2 )
    {
      printf( "a = %d  b = %d\n", a, b );
    }
    return 0;
  }

The conditional tests available to C are as follows:

  a == b:   equals                       
  a != b:   not equals
  a < b:    less than                    
  a > b:    greater than
  a <= b:   less than or equals          
  a >= b:   greater than or equals

The fact that "==" is used to perform the "equals" test, while "=" is used as the assignment operator, often causes confusion and is a common bug in C programming:

  a == b:   Is "a" equal to "b"?
  a = b:    Assign value of "b" to "a".

C also contains decision-making statements, such as "if":

  /* if.c */
  #include <stdio.h>
  #define MISSILE 1
  void fire( int weapon );
  int main( void )
  {
    fire( MISSILE );
    return 0;
  }
  void fire( int weapon )
  {
    if( weapon == MISSILE )
    {
      printf( "Fired missile!\n" );
    }
    if( weapon != MISSILE )
    {
      printf( "Unknown weapon!\n");
    }
  }

This example can be more easily implemented using an "else" clause:

  /* ifelse.c */
  void fire( int weapon )
  {
    if( weapon == MISSILE )
    {
      printf( "Fired missile!\n" );
    }
    else
    {
      printf( "Unknown weapon!\n");
    }
  }

Since there is only one statement in each clause, the curly brackets aren't really necessary. This would work just as well:

  void fire( int weapon )
  {
    if( weapon == MISSILE )
      printf( "Fired missile!\n" );
    else
      printf( "Unknown weapon!\n" );
  }

However, the brackets make the structure more obvious and prevent errors if you add statements to the conditional clauses. The compiler doesn't care one way or another; it generates the same code.

There is no "elseif" keyword, but you can nest "if" statements:

  /* nestif.c */
  #include <stdio.h>
  #define MISSILE 1
  #define LASER 2
  void fire( int weapon );
  int main( void )
  {
    fire( LASER );
    return 0;
  }
  void fire( int weapon )
  {
    if( weapon == MISSILE )
    {
      printf( "Fired missile!\n" );
    }
    else
    {
      if( weapon == LASER )
      {
        printf( "Fired laser!\n" );
      }
      else
      {
        printf( "Unknown weapon!\n");
      }
    }
  }

This is somewhat clumsy, however, and the "switch" statement does a cleaner job:

  /* switch.c */
  void fire( int weapon )
  {
    switch( weapon )
    {
    case MISSILE:
      printf( "Fired missile!\n" );
      break;
    case LASER:
      printf( "Fired laser!\n" );
      break;
    default:
      printf( "Unknown weapon!\n");
      break;
    }
  }

The "switch" statement tests a single integer value against a list of constants, which means that if you are testing multiple variables, or are testing for anything but equality to one of a list of constant values, you'll still have to use the "if" statement. The optional "default" clause is used to handle conditions not covered by the other cases.

Each clause ends in a "break", which causes execution to break out of the "switch". Leaving out a "break" can be another subtle error in a C program, since if it isn't there, execution flows right through to the next clause. However, this can be used to advantage. Suppose in our example the routine can also be asked to fire a ROCKET, which is the same as a MISSILE:

  void fire( int weapon )
  {
    switch( weapon )
    {
    case ROCKET:
    case MISSILE:
      printf( "Fired missile!\n" );
      break;
    case LASER:
      printf( "Fired laser!\n" );
      break;
    default:
      printf( "Unknown weapon!\n");
      break;
    }
  }

The "break" statement is not specific to "switch" statements. It can be used to break out of other control structures, though good program design tends to avoid such improvisations:

  /* break.c */
  #include <stdio.h>
  int main( void )
  {
    int n;
    for( n = 0; n < 10; n = n + 1 )
    {
      if( n == 5 )
      {
        break;  /* Punch out of loop at value 5. */
      }
      else
      {
        printf( "%d\n", n );
      }
    }
    return 0;
  }

If the "for" loop were nested inside a "while" loop, a "break" out of the "for" loop would still leave you stuck in the "while" loop. The "break" keyword only applies to the innermost loop or "switch" that contains it.

There is also a "continue" statement that allows you to skip to the end of the loop body and continue with the next iteration of the loop. For example:

  /* continue.c */
  #include <stdio.h>
  int main( void )
  {
    int n;
    for( n = 0; n < 10; n = n + 1 )
    {
      if( n == 5 )
      {
        continue;
      }
      else
      {
        printf( "%d\n", n );
      }
    }
    return 0;
  }

Finally, there is a "goto" statement:

  goto punchout;
  ...
  punchout:

-- that allows you to jump to an arbitrary label within a function, but the use of this statement is generally discouraged.

While these are the lot of C's true control structures, there is also a special "conditional operator" that allows you to perform simple conditional assignment of the form:

  if( a == 5 )
  {
    b = -10;
  }
  else
  {
    b = 255;
  }

-- using a much tidier, if more cryptic, format:

  b = ( a == 5 ) ? -10 : 255 ;

v2.0.7 / 1 of 7 / 01 feb 02 / greg goebel / public domain