User:Jimregan/C Primer chapter 3

C Input & Output

This chapter covers console (keyboard/display) and file I/O. You've already seen one console-I/O function, "printf()", and there are several others. C has two separate approaches toward file I/O: one based on library functions that resembles console I/O, and a second that uses "system calls". These topics are discussed in detail below.

C Console I/O

Console I/O in general means communications with the computer's keyboard and display. However, in most modern operating systems the keyboard and display are simply the default input and output devices, and a user can easily redirect input from, say, a file or other program and redirect output to, say, a serial I/O port:

  type infile | myprog > com

The program itself, "myprog", doesn't know the difference. The program uses console I/O to simply read its "standard input (stdin)" -- which might be the keyboard, a file dump, or the output of some other program -- and print to its "standard output (stdout)" -- which might be the display or printer or another program or a file. The program itself neither knows nor cares.

Console I/O requires the declaration:

  #include <stdio.h>

Useful functions are:

  printf():     Print a formatted string to stdout.
  scanf():      Read formatted data from stdin.
  putchar():    Print a single character to stdout.
  getchar():    Read a single character from stdin.
  puts():       Print a string to stdout.
  gets():       Read a line from stdin.

PC-based compilers also have an alternative library of console I/O functions that you will see on occasion. These functions require the declaration:

  #include <conio.h>

The three most useful PC console I/O functions are:

   * getch():
     Get a character from the keyboard (no need to press Enter).
   * getche():
     Get a character from the keyboard and echo it.
   * kbhit():
     Check to see if a key has been pressed. 

The "printf()" function, as you remember, prints a string that may include formatted data:

  printf( "This is a test!\n" );

-- which can include the contents of variables:

  printf( "Value1:  %d   Value2:  %f\n", intval, floatval );

The available format codes are:

  %d:   decimal integer
  %ld:  long decimal integer
  %c:   character
  %s:   string
  %e:   floating-point number in exponential notation
  %f:   floating-point number in decimal notation
  %g:   use %e and %f, whichever is shorter
  %u:   unsigned decimal integer
  %o:   unsigned octal integer
  %x:   unsigned hex integer

Using the wrong format code for a particular data type can lead to bizarre output.

You can obtain further control by using modifiers. For example, you can add a numeric prefix to specify the minimum field width:

  %10d

This specifies a minimum field width of ten characters. If the field width is too small, a wider field will be used. Adding a minus sign:

  %-10d

-- causes the text to be left-justified. You can also add a numeric precision:

  %6.3f

This specifies three digits of precision in a field six characters wide. You can specify a precision for strings as well, in which case it indicates the maximum number of characters to be printed. For example:

  /* prtint.c */
  #include <stdio.h>
  int main( void )
  {
    printf( "<%d>\n", 336 );
    printf( "<%2d>\n", 336 );
    printf( "<%10d>\n", 336 );
    printf( "<%-10d>\n", 336 );
  }

This prints:

  <336>
  <336>
  <       336>
  <336       >

Similarly:

  /* prfloat.c */
  #include <stdio.h>
  int main( void )
  {
    printf( "<%f>\n", 1234.56 );
    printf( "<%e>\n", 1234.56 );
    printf( "<%4.2f>\n", 1234.56 );
    printf( "<%3.1f>\n", 1234.56 );
    printf( "<%10.3f>\n", 1234.56 );
    printf( "<%10.3e>\n", 1234.56 );
  }

-- prints:

  <1234.560000>
  <1.234560e+03>
  <1234.56>
  <1234.6>
  <  1234.560>
  < 1.235e+03>

And finally:

  /* prtstr.c */
  #include <stdio.h>
  int main( void )
  {
    printf( "<%2s>\n", "Barney must die!" );
    printf( "<%22s>\n", "Barney must die!" );
    printf( "<%22.5s>\n", "Barney must die!" );
    printf( "<%-22.5s>\n", "Barney must die!" );
  }

-- prints:

  <Barney must die!>
  <      Barney must die!>
  <                 Barne>
  <Barne                 >
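The field width doesn't have to be hard-wired into the format string. As a small sketch (the helper name "format_int" is made up for illustration, and it relies on "snprintf()", a C99 relative of "printf()" that writes into a character array instead of stdout), a "*" in place of the number takes the width from the argument list:

```c
#include <stdio.h>

/* Hypothetical helper: format "value" between angle brackets in a field
   of "width" characters, left-justified if "left" is nonzero.
   Returns what snprintf() returns: the length of the full result. */
int format_int( char *out, size_t size, int width, int left, int value )
{
  if( left )
  {
    return snprintf( out, size, "<%-*d>", width, value );
  }
  return snprintf( out, size, "<%*d>", width, value );
}
```

Called with a width of 10 and a value of 336, this should build the same two strings that "%10d" and "%-10d" produced in "prtint.c".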

Just for convenience, the table of special characters listed in chapter 2 is repeated here. These characters can be embedded in "printf" strings:

  '\a':    alarm (beep) character
  '\b':    backspace
  '\f':    formfeed
  '\n':    newline
  '\r':    carriage return
  '\t':    horizontal tab
  '\v':    vertical tab
  '\\':    backslash
  '\?':    question mark
  '\'':    single quote
  '\"':    double quote
  '\0NN':  character code in octal
  '\xNN':  character code in hex
  '\0':    NULL character

The "scanf()" function reads formatted data using a syntax similar to that of "printf", except that it requires pointers as parameters, since it has to return values. For example:

  /* cscanf.c */
  #include <stdio.h>
  int main( void )
  {
    int val;
    char name[256];
  
    printf( "Enter your age and name.\n" );
    scanf( "%d %s", &val, name ); 
    printf( "Your name is: %s -- and your age is: %d\n", name, val );
  }

There is no "&" in front of "name" since the name of a character array is already treated as a pointer to its first element. Input fields are separated by whitespace (space, tab, or newline), though you can use a count ("%10d") to define a specific field width. Formatting codes are the same as for "printf()", except:

   * The "%e", "%f", and "%g" format codes all work the same.
   * There is an "h" modifier ("%hd") for reading short integers, and an "l" modifier ("%lf") is needed to read a "double".

If you include characters in the format code, "scanf()" will read in that character and discard it. For example, if the example above were modified as follows:

  scanf( "%d,%s", &val, name );

-- then "scanf()" will assume that the two input values are comma-separated and swallow the comma when it is encountered.

If you precede a format code with an asterisk, the data will be read and discarded. For example, if the example were changed to:

  scanf( "%d%*c%s", &val, name );

-- then if the two fields were separated by a ":", that character would be read in and discarded.

The "scanf()" function will return the value EOF (an "int"), defined in "stdio.h", when its input is terminated.
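Besides EOF, "scanf()" returns the number of items it actually converted, and checking that count is a cheap way to catch bad input. The sketch below is an illustration only: the function name "parse_age_name" is hypothetical, and it uses "sscanf()", the standard variant that reads from a string rather than stdin, so that it can be tried without typing anything. The "%255s" width keeps the name from overflowing a 256-byte array.

```c
#include <stdio.h>

/* Hypothetical helper: parse "<age> <name>" from a line of text.
   "name" must point to at least 256 bytes.
   Returns 1 if both fields were converted, 0 otherwise. */
int parse_age_name( const char *line, int *age, char *name )
{
  /* sscanf() returns the number of items assigned, or EOF if the
     input ran out before the first conversion. */
  int got = sscanf( line, "%d %255s", age, name );
  return got == 2;
}
```

The same test, "scanf( ... ) == 2", could be applied to the "cscanf.c" example to reject input such as a name typed before the age.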

The "putchar()" and "getchar()" functions handle single character I/O. For example, the following program accepts characters from standard input one at a time:

  /* inout.c */
  #include <stdio.h>
  int main( void )
  {
    int ch; 
  
    while ((ch = getchar()) != EOF)
    {
      putchar( ch ); 
    }
  }

The "getchar()" function returns an "int", and returns EOF when its input is terminated. Notice the neat way C allows you to get a value and then test it in the same expression, a particularly useful feature for handling loops.

One word of warning on single-character I/O: if you are entering characters from the keyboard, most operating systems won't send the characters to the program until you press the Enter key, meaning that you can't really do single-character keyboard I/O this way.

The little program above is the essential core of a character-mode text "filter", a program that can perform some transformation between standard input and standard output. Such a filter can be used as an element to construct more sophisticated applications:

  type file.txt | filter1 | filter2 > outfile.txt

The following filter capitalizes the first character in each word in the input. The program operates as a "state machine", using a variable that can be set to different values, or "states", to control its operating mode. It has two states: SEEK, in which it is looking for the first character, and REPLACE, in which it is looking for the end of a word.

In SEEK state, it scans through whitespace (space, tab, or newline), echoing characters. If it finds a printing character, it converts it to uppercase and goes to REPLACE state. In REPLACE state, it converts characters to lowercase until it hits whitespace, and then goes back to SEEK state.

The program uses the "tolower()" and "toupper()" functions to make case conversions. These two functions will be discussed in the next chapter.

  /* caps.c */
  #include <stdio.h>
  #include <ctype.h>
  #define SEEK 0
  #define REPLACE 1
  int main( void )
  {
    int ch, state = SEEK;
    while(( ch = getchar() ) != EOF )
    {
      switch( state )
      {
      case REPLACE:
        switch( ch )
        {
        case ' ':
        case '\t':
        case '\n':
          state = SEEK;
          break;
        default:
          ch = tolower( ch );
          break;
        }
        break;
      case SEEK:
        switch( ch )
        {
        case ' ':
        case '\t':
        case '\n':
          break;
        default:
          ch = toupper( ch );
          state = REPLACE;
          break;
        }
      }
      putchar( ch );
    }
  }

The "puts()" function is like a simplified version of "printf()" without format codes. It allows you to print a string that is automatically terminated with a newline:

  puts( "Hello world!" );

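The "gets()" function reads a line of text (terminated by a newline) from standard input. Be warned that it cannot check the size of the array it fills, so an overlong line overruns the array; it was dropped from the C standard in 2011, and "fgets()", described later in this chapter, is the safer choice in new code. Unlike "scanf()", "gets()" takes a whole line at a time and doesn't store the newline in the string, making it much less finicky about its inputs: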

  /* cgets.c */
  #include <stdio.h>
  #include <string.h>
  #include <stdlib.h>
  int main( void )
  {
    char word[256], 
         *guess = "blue";
    puts( "Guess a color (use lower case please):" );
    while( gets( word ) != NULL )
    {
      if( strcmp( word, guess ) == 0 )
      {
         puts( "You win!" );
         exit( 0 );
      }
      else
      {
         puts( "No, try again." );
      }
    }
  }

This program includes the "strcmp" function, which performs string comparisons and returns 0 on a match. This function is described in more detail in the next chapter.

You can use these functions to implement filters that operate on lines of text, rather than characters. A core program for such filters follows:

  /* lfilter.c */
  #include <stdio.h>
  int main( void )
  {
    char b[256];
    while (( gets( b ) ) != NULL )
    {
      puts( b ); 
    }
  }

The "gets()" function returns a NULL, defined in "stdio.h", on input termination or error.
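Because "gets()" cannot be told how big the array is, a long input line overruns it. A common replacement, sketched here on the assumption that the "fgets()" function described later in this chapter is available (the helper name "read_line" is made up for this example), reads at most "size - 1" characters and then strips the newline that "fgets()" keeps:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from "in" into "buf" (of "size" bytes)
   and remove the trailing newline, if any.
   Returns buf, or NULL on end of input or error. */
char *read_line( char *buf, int size, FILE *in )
{
  size_t len;

  if( fgets( buf, size, in ) == NULL )
  {
    return NULL;
  }
  len = strlen( buf );
  if( len > 0 && buf[len - 1] == '\n' )
  {
    buf[len - 1] = '\0';
  }
  return buf;
}
```

A loop such as "while( read_line( b, sizeof b, stdin ) != NULL )" could then stand in for the "gets()" loops above. A line longer than the array simply arrives in several pieces instead of overwriting memory.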

The PC-based console-I/O functions "getch()" and "getche()" operate much as "getchar()" does, except that "getche()" echoes the character automatically.

The "kbhit()" function is very different in that it only indicates if a key has been pressed or not. It returns a nonzero value if a key has been pressed, and zero if it hasn't. This allows a program to poll the keyboard for input rather than hanging on keyboard input, and waiting for something to happen. These functions require the "conio.h" header file, not the "stdio.h" header file.

C File-I/O Through Library Functions

The file-I/O library functions are much like the console-I/O functions. In fact, most of the console-I/O functions can be thought of as special cases of the file-I/O functions. The library functions include:

  fopen():     Create or open a file for reading or writing.
  fclose():    Close a file after reading or writing it.
  fseek():     Seek to a certain location in a file.
  rewind():    Rewind a file back to its beginning and leave it open.
  rename():    Rename a file.
  remove():    Delete a file.
  fprintf():   Formatted write.
  fscanf():    Formatted read.
  fwrite():    Unformatted write.
  fread():     Unformatted read.
  putc():      Write a single byte to a file.
  getc():      Read a single byte from a file.
  fputs():     Write a string to a file.
  fgets():     Read a string from a file.

All these library functions depend on definitions made in the "stdio.h" header file, and so require the declaration:

  #include <stdio.h>

C documentation normally refers to these functions as performing "stream I/O", not "file I/O". The distinction is that they could just as well handle data being transferred through a modem as a file, and so the more general term "data stream" is used rather than "file". However, we'll stay with the "file" terminology in this discussion for the sake of simplicity.

The "fopen()" function opens and, if need be, creates a file. Its syntax is:

  <file pointer> = fopen( <filename>, <access mode> );

The "fopen()" function returns a "file pointer", declared as follows:

  FILE *<file pointer>;

The file pointer will be returned with the value "NULL", defined in "stdio.h", if there is an error. The "access modes" are defined as follows:

  r:    Open for reading.
  w:    Open and wipe (or create) for writing.
  a:    Append -- open (or create) to write to end of file.
  r+:   Open a file for reading and writing.
  w+:   Open and wipe (or create) for reading and writing.
  a+:   Open a file for reading and appending.

The "filename" is simply a string of characters.

It is often useful to use the same statements to communicate either with files or with standard I/O. For this reason, the "stdio.h" header file includes predefined file pointers with the names "stdin" and "stdout". You don't need to do an "fopen()" on them; you can just assign them to a file pointer:

  fpin = stdin;
  fpout = stdout;

-- and following file-I/O functions won't know the difference.
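One common use of this, sketched below with made-up names ("open_output" and "close_output" are not library functions), is a program that writes to a named file if one is given and to standard output otherwise. The rest of the program only ever sees a file pointer:

```c
#include <stdio.h>

/* Hypothetical helper: return a file pointer for "filename", or stdout
   if no filename is given. Returns NULL if the file can't be opened. */
FILE *open_output( const char *filename )
{
  if( filename == NULL )
  {
    return stdout;
  }
  return fopen( filename, "w" );
}

/* Close the stream unless it is stdout, which the program didn't open. */
void close_output( FILE *fp )
{
  if( fp != NULL && fp != stdout )
  {
    fclose( fp );
  }
}
```

The only special handling needed is at the end: the program should close what it opened itself and leave "stdout" alone.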

The "fclose()" function simply closes the file given by its file pointer parameter. It has the syntax:

  fclose( fp );

The "fseek()" function call allows you to select any byte location in a file for reading or writing. It has the syntax:

  fseek( <file_pointer>, <offset>, <origin> );

The offset is a "long" and specifies the offset into the file, in bytes. The "origin" is an "int" and is one of three standard values, defined in "stdio.h":

  SEEK_SET:   Start of file.
  SEEK_CUR:   Current location.
  SEEK_END:   End of file.

The "fseek()" function returns 0 on success and non-zero on failure.
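As a minimal sketch of random access, the function below fetches a single byte from a given position. The name "byte_at" is invented for this example, and it uses "getc()", which is described later in this chapter, to do the actual read:

```c
#include <stdio.h>

/* Hypothetical helper: return the byte stored "offset" bytes from the
   start of an open file, or EOF if the seek or the read fails. */
int byte_at( FILE *fp, long offset )
{
  if( fseek( fp, offset, SEEK_SET ) != 0 )
  {
    return EOF;
  }
  return getc( fp );
}
```

Seeking to the very end of the file succeeds, but the following read then returns EOF, so both failure cases look the same to the caller.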

The "rewind()", "rename()", and "remove()" functions are straightforward. The "rewind()" function resets an open file back to its beginning. It has the syntax:

  rewind( <file_pointer> );

The "rename()" function changes the name of a file:

  rename( <old_file_name_string>, <new_file_name_string> );

The "remove()" function deletes a file:

  remove( <file_name_string> );

The "fprintf()" function allows formatted ASCII data output to a file, and has the syntax:

  fprintf( <file pointer>, <string>, <variable list> );

The "fprintf()" function is identical in syntax to "printf()", except for the addition of a file pointer parameter. For example, the following "fprintf()" call:

  /* fprpi.c */
  #include <stdio.h>
  int main( void )
  {
    int n1 = 16;
    float n2 = 3.141592654f;
    FILE *fp;
    fp = fopen( "data", "w" );
    fprintf( fp, "  %d   %f", n1, n2 ); 
    fclose( fp );
  }

-- stores the following ASCII data:

    16   3.141593

The formatting codes are exactly the same as for "printf()":

  %d:   decimal integer
  %ld:  long decimal integer
  %c:   character
  %s:   string
  %e:   floating-point number in exponential notation
  %f:   floating-point number in decimal notation
  %g:   use %e and %f, whichever is shorter
  %u:   unsigned decimal integer
  %o:   unsigned octal integer
  %x:   unsigned hex integer

Field-width specifiers can be used as well. The "fprintf()" function returns the number of characters it dumps to the file, or a negative number if it terminates with an error.

The "fscanf()" function is to "fprintf()" what "scanf()" is to "printf()": it reads ASCII-formatted data into a list of variables. It has the syntax:

  fscanf( <file pointer>, <string>, <variable list> );

However, the "string" contains only format codes, no text, and the "variable list" contains the addresses of the variables, not the variables themselves. For example, the program below reads back the two numbers that were stored with "fprintf()" in the last example:

  /* frdata.c */
  #include <stdio.h>
  int main( void )
  {
    int n1;
    float n2;
    FILE *fp;
    fp = fopen( "data", "r" );
    fscanf( fp, "%d %f", &n1, &n2 );
    printf( "%d %f", n1, n2 );
    fclose( fp );
  }

The "fscanf()" function uses the same format codes as "fprintf()", with the familiar exceptions:

   * The "%e", "%f", and "%g" format codes all work the same.
   * There is an "h" modifier ("%hd") for reading short integers, and an "l" modifier ("%lf") is needed to read a "double".

Numeric modifiers can be used, of course. The "fscanf()" function returns the number of items that it successfully read, or the EOF code, an "int", if it encounters the end of the file or an error.
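The return value makes a natural loop condition. The following sketch (the function name "sum_ints" is hypothetical) keeps reading integers for as long as "fscanf()" reports exactly one successful conversion, and so stops cleanly at the end of the file or at the first field that isn't a number:

```c
#include <stdio.h>

/* Hypothetical helper: add up whitespace-separated integers read from "fp".
   Stops at end of file or at the first field that isn't an integer.
   Stores the number of values read in *count. */
long sum_ints( FILE *fp, int *count )
{
  int value;
  long total = 0;

  *count = 0;
  /* fscanf() returns 1 for each converted item, then 0 or EOF. */
  while( fscanf( fp, "%d", &value ) == 1 )
  {
    total += value;
    ++*count;
  }
  return total;
}
```

Comparing against 1 rather than against EOF matters: on a bad field "fscanf()" returns 0 without consuming the offending text, and a loop that only tested for EOF would spin forever.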

The following program demonstrates the use of "fprintf()" and "fscanf()":

  /* fprsc.c */
  #include <stdio.h>
  int main( void )
  {
    int ctr, i[3], n1 = 16, n2 = 256;
    float f[4], n3 = 3.141592654f;
    FILE *fp;
    fp = fopen( "data", "w+" );
    /* Write data in:   decimal integer formats
                        decimal, octal, hex integer formats
                        floating-point formats  */
    fprintf( fp, "%d %10d %-10d \n", n1, n1, n1 );   
    fprintf( fp, "%d %o %x \n", n2, n2, n2 );
    fprintf( fp, "%f %10.10f %e %5.4e \n", n3, n3, n3, n3 );
    /* Rewind file. */
    rewind( fp );
    /* Read back data. */
    puts( "" );
    fscanf( fp, "%d %d %d", &i[0], &i[1], &i[2] );
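    printf( "   %d\t%d\t%d\n", i[0], i[1], i[2] );
    fscanf( fp, "%d %o %x", &i[0], &i[1], &i[2] );
    printf( "   %d\t%d\t%d\n", i[0], i[1], i[2] );
    fscanf( fp, "%f %f %f %f", &f[0], &f[1], &f[2], &f[3] );
    printf( "   %f\t%f\t%f\t%f\n", f[0], f[1], f[2], f[3] );
    fclose( fp );
  }

The program generates the output: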

  16         16         16
  256        256        256
  3.141593   3.141593   3.141593   3.141600

The "fwrite()" and "fread()" functions are used for binary file I/O. The syntax of "fwrite()" is as follows:

  fwrite( <array_pointer>, <element_size>, <count>, <file_pointer> );

The array pointer is of type "void *", and so the array can be of any type. The element size and count, which give the number of bytes in each array element and the number of elements in the array, are of type "size_t", an unsigned integer type defined in "stdio.h". The "fwrite()" function returns the number of elements it actually wrote.

The "fread()" function similarly has the syntax:

  fread( <array_pointer>, <element_size>, <count>, <file_pointer> );

The "fread()" function returns the number of items it actually read.

The following program stores an array of data to a file and then reads it back using "fwrite()" and "fread()":


  /* fwrrd.c */
  #include <stdio.h>
  #include <math.h>
  
  #define SIZE 20
  
  int main( void )
  {
    int n;
    float d[SIZE];
    FILE *fp;
  
    for( n = 0; n < SIZE; ++n )                 /* Fill array with roots. */
    {
      d[n] = (float)sqrt( (double)n );
    }
    fp = fopen( "data", "w+" );                 /* Open file. */
    fwrite( d, sizeof( float ), SIZE, fp );     /* Write it to file. */
    rewind( fp );                               /* Rewind file. */
    fread( d, sizeof( float ), SIZE, fp );      /* Read back data. */
    for( n = 0; n < SIZE; ++n )                 /* Print array. */
    {
      printf( "%d: %7.3f\n", n, d[n] );
    }
    fclose( fp );                               /* Close file. */
  }
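The program above ignores the values returned by "fwrite()" and "fread()". A slightly more careful version, sketched here with invented helper names ("save_floats" and "load_floats"), compares the returned element count with the count requested, since a short count is the only sign of a full disk or a truncated file:

```c
#include <stdio.h>

/* Hypothetical helpers: write "count" floats to an open file and read
   them back, reporting success (1) only if every element was transferred. */
int save_floats( FILE *fp, const float *data, size_t count )
{
  return fwrite( data, sizeof( float ), count, fp ) == count;
}

int load_floats( FILE *fp, float *data, size_t count )
{
  return fread( data, sizeof( float ), count, fp ) == count;
}
```

Note that the file written this way holds the raw bytes of each "float", so it can only be relied on to read back correctly on the same kind of machine that wrote it.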

The "putc()" function is used to write a single character to an open file. It has the syntax:

  putc( <character>, <file pointer> );

The "getc()" function similarly gets a single character from an open file. It has the syntax:

  <character variable> = getc( <file pointer> );

The "getc()" function returns "EOF" at the end of the file or on error. The console I/O functions "putchar()" and "getchar()" are really only special cases of "putc()" and "getc()" that use standard output and input.
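As an illustration, the character filter shown earlier in this chapter can be rewritten to work on any pair of open files. This is only a sketch, and the name "copy_stream" is made up for the example:

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from "in" to "out" one character
   at a time. Returns the number of characters copied, or -1 on a
   write error. */
long copy_stream( FILE *in, FILE *out )
{
  int ch;            /* int, not char, so EOF can be told apart from data */
  long copied = 0;

  while( ( ch = getc( in ) ) != EOF )
  {
    if( putc( ch, out ) == EOF )
    {
      return -1;
    }
    ++copied;
  }
  return copied;
}
```

Calling "copy_stream( stdin, stdout )" should behave like the "inout.c" program, while passing two pointers obtained from "fopen()" copies one file to another.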

The "fputs()" function writes a string to a file. It has the syntax:

  fputs( <string / character array>, <file pointer> );

The "fputs()" function will return an EOF value on error. For example:

  fputs( "This is a test", fptr );

The "fgets()" function reads a string of characters from a file. It has the syntax:

  fgets( <string>, <max_string_length>, <file_pointer> );

The "fgets()" function reads a string from a file until it finds a newline or has read <max_string_length> - 1 characters. The newline, if one was read, is kept in the string. It will return the value NULL at the end of the file or on an error.

The following example program simply opens a file and copies it to another file, using "fgets()" and "fputs()":

  /* fcopy.c */
  #include <stdio.h>
  #include <stdlib.h>
  
  #define MAX 256
  
  int main( void )
  {
    FILE *src, *dst;
    char b[MAX];
  
    /* Try to open source and destination files. */
  
    if ( ( src = fopen( "infile.txt", "r" )) == NULL )
    {
       puts( "Can't open input file." );
       exit( EXIT_FAILURE );
    }
    if ( (dst = fopen( "outfile.txt", "w" )) == NULL )
    {
       puts( "Can't open output file." );
       fclose( src );
       exit( EXIT_FAILURE );
    }
  
    /* Copy one file to the next. */
  
    while( ( fgets( b, MAX, src ) ) != NULL )
    {
       fputs( b, dst );
    }
  
    /* All done, close up shop. */
  
    fclose( src );
    fclose( dst );
  }

C File-I/O Through System Calls

File-I/O through system calls is simpler and operates at a lower level than making calls to the C file-I/O library. This section covers six fundamental file-I/O system calls:

  creat():    Create a file for reading or writing.
  open():     Open a file for reading or writing.
  close():    Close a file after reading or writing.
  unlink():   Delete a file.
  write():    Write bytes to file.
  read():     Read bytes from file.

These calls were devised for the UN*X operating system and are not part of the ANSI C spec.

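Use of these system calls requires a header file named "fcntl.h". On most current UN*X systems, "close()", "unlink()", "write()", and "read()" are declared in a second header file, "unistd.h":

  #include <fcntl.h>
  #include <unistd.h>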

The "creat()" system call, of course, creates a file. It has the syntax:

  <file descriptor variable> = creat( <filename>, <permission bits> );

This system call returns an integer, called a "file descriptor", which is a number that identifies the file generated by "creat()". This number is used by other system calls in the program to access the file. Should the "creat()" call encounter an error, it will return a file descriptor value of -1.

The "filename" parameter gives the desired filename for the new file. The "permission bits" give the "access rights" to the file. A file has three "permissions" associated with it:

   * Write permission:
     Allows data to be written to the file.
   * Read permission:
     Allows data to be read from the file.
   * Execute permission:
     Designates that the file is a program that can be run. 

These permissions can be set for three different levels:

   * User level:
     Permissions apply to individual user.
   * Group level:
     Permissions apply to members of user's defined "group".
   * System level:
     Permissions apply to everyone on the system. 

For the "creat()" system call, the permissions are expressed in octal, with an octal digit giving the three permission bits for each level of permissions. In octal, the permission settings:

  0644

-- grant read and write permissions for the user, but only read permissions for group and system. The following octal number gives all permissions to everyone:

  0777
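To see how each octal digit maps onto three permission bits, here is a small sketch that converts a permission value into the "rwx" form used in UN*X directory listings. The function "mode_string" is invented for this example and is not a system call:

```c
/* Hypothetical helper: turn the low nine permission bits into the
   familiar "rwxrwxrwx" form. "out" must hold at least 10 bytes. */
void mode_string( unsigned int mode, char *out )
{
  static const char letters[] = "rwxrwxrwx";
  int i;

  for( i = 0; i < 9; ++i )
  {
    /* Bit 8 is user-read; bit 0 is system-execute. */
    out[i] = ( mode & ( 1u << ( 8 - i ) ) ) ? letters[i] : '-';
  }
  out[9] = '\0';
}
```

Given 0644, the function should produce "rw-r--r--": the 6 (binary 110) is read and write for the user, and each 4 (binary 100) is read only for group and system.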

Should you attempt to "creat()" an existing file (for which you have write permission), "creat()" will not return an error. It will instead wipe the contents of the file and return a file descriptor for it.

For example, to create a file named "data" with read and write permission for everyone on the system, you would write:

  #define RD_WR 0666
  ...
  int fd;                               /* Define file descriptor. */
  fd = creat( "data", RD_WR );

The "open()" system call opens an existing file for reading or writing. It has the syntax:

  <file descriptor variable> = open( <filename>, <access mode> );

The "open()" call is similar to the "creat()" call in that it returns a file descriptor for the given file, and returns a file descriptor of -1 if it encounters an error. However, the second parameter is an "access mode", not a permission code. There are three modes (defined in the "fcntl.h" header file):

  O_RDONLY:   Open for reading only.
  O_WRONLY:   Open for writing only.
  O_RDWR:     Open for reading and writing.

For example, to open "data" for writing, assuming that the file had been created by another program, you would write:

  int fd;
  fd = open( "data", O_WRONLY );

A few additional comments before proceeding:

   * A "creat()" call implies an "open()". There is no need to "creat()" a file and then "open()" it.
   * There is an operating-system-dependent limit on the number of files that a program can have open at any one time.
   * The file descriptor is no more than an arbitrary number that a program uses to distinguish one open file from another. When a file is closed, re-opening it will probably not give it the same file descriptor. 

The "close()" system call is very simple. All it does is "close()" an open file when there is no further need to access it. The "close()" system call has the syntax:

  close( <file descriptor> );

The "close()" call returns a value of 0 if it succeeds, and returns -1 if it encounters an error.

The "unlink()" system call deletes a file. It has the syntax:

  unlink( <file_name_string> );

It returns 0 on success and -1 on failure.

The "write()" system call writes data to an open file. It has the syntax:

  write( <file descriptor>, <buffer>, <buffer length> );

The file descriptor is returned by a "creat()" or "open()" system call. The "buffer" is a pointer to a variable or an array that contains the data; and the "buffer length" gives the number of bytes to be written into the file.

While different data types may have different byte lengths on different systems, the "sizeof" operator can be used to provide the proper buffer length in bytes. A "write()" call could be specified as follows:

  float array[10];
  ...
  write( fd, array, sizeof( array ) );

The "write()" function returns the number of bytes it actually writes. It will return -1 on an error.

The "read()" system call reads data from an open file. Its syntax is exactly the same as that of the "write()" call:

  read( <file descriptor>, <buffer>, <buffer length> );

The "read()" function returns the number of bytes it actually reads. At the end of file it returns 0, and on error it returns -1.
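Putting the calls together, the sketch below writes a block of bytes to a new file and reads it back. It assumes a UN*X-like system where "unistd.h" declares "write()", "read()", and "close()"; the helper names "save_bytes" and "load_bytes" are made up for the example. Since both "write()" and "read()" may transfer fewer bytes than requested, each is called in a loop:

```c
#include <fcntl.h>     /* creat(), open(), O_RDONLY */
#include <unistd.h>    /* write(), read(), close() on UN*X systems */

/* Hypothetical helper: create "path" and write "len" bytes to it.
   Returns 0 on success, -1 on error. */
int save_bytes( const char *path, const char *buf, size_t len )
{
  size_t done = 0;
  int fd = creat( path, 0644 );

  if( fd == -1 )
  {
    return -1;
  }
  while( done < len )
  {
    /* write() may write fewer bytes than requested, so keep going. */
    ssize_t n = write( fd, buf + done, len - done );
    if( n <= 0 )
    {
      close( fd );
      return -1;
    }
    done += (size_t)n;
  }
  return close( fd );
}

/* Hypothetical helper: read up to "max" bytes of "path" into "buf".
   Returns the number of bytes read, or -1 on error. */
long load_bytes( const char *path, char *buf, size_t max )
{
  size_t done = 0;
  int fd = open( path, O_RDONLY );

  if( fd == -1 )
  {
    return -1;
  }
  while( done < max )
  {
    ssize_t n = read( fd, buf + done, max - done );
    if( n == -1 )
    {
      close( fd );
      return -1;
    }
    if( n == 0 )       /* End of file. */
    {
      break;
    }
    done += (size_t)n;
  }
  close( fd );
  return (long)done;
}
```

Every call is checked against -1, which is the one error convention all of these system calls share.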

v2.0.7 / 3 of 7 / 01 feb 02 / greg goebel / public domain