Fortran/io

From Wikibooks, open books for an open world

It is often useful, in Fortran and other languages, to specify where and how something is to be printed or read. Fortran offers many statements and formatting specifications that serve these purposes. In the following sections we consider the I/O operations (OPEN, CLOSE, INQUIRE, REWIND, BACKSPACE, ENDFILE, FLUSH, PRINT, READ, WRITE and NAMELIST) and I/O formatting (FORMAT). Since Fortran 2003, a significant addition has been the ability to extend the basic facilities with user-defined routines to read and write derived types, including those that are themselves composed of other derived types.

Together these commands form a very powerful set of facilities for reading and writing formatted and unformatted files with sequential, direct or stream access, synchronously or asynchronously. The options can look bewildering at first, but the basic operations are simple, and they are backed up by the power and flexibility to read or write almost any file.

It is worth mentioning, at this stage, what Fortran cannot do simply and directly from a language-defined perspective. Fortran will not address a file defined by a URL; files must be available locally or on a mapped network drive or equivalent. Similarly, Fortran does not natively support XML. XML is plain text, so it can be read easily, but parsing it is down to the programmer. The Fortran language knows nothing about computer graphics; you are not going to find a language-defined draw command. Fortran will happily open and read a JPEG file, but there is no language-defined method of displaying the file as a picture; a suitable external library will have to be used. Finally, Fortran does not have any language-defined mouse operations or touch screen gestures.

I/O Operations

Introduction

Modern Fortran has a rich vocabulary for I/O operations. These operations can generally be used on the screen and keyboard, external files and internal files. In recent versions of Fortran, the syntax of these commands has been rationalised but most of the original syntax has been retained for backwards compatibility. I/O operations are notoriously error-prone and Fortran now supports a unified mechanism for identifying and processing errors.

Simple I/O Operations

PRINT

This is the classic "Hello World" operation, but it is rarely used in production code. PRINT is one of two formatted output operations and it is much simpler than the WRITE statement. The main purpose of the PRINT statement is to print to the screen (the standard output unit) and it has no options for file output. The general form is:

PRINT fmt [, list]

The fmt is required and can be an explicit format or list-directed (as indicated by *); unlike READ and WRITE, PRINT does not accept the name=value form fmt=. The list is optional and is a comma-separated list of expressions such as literals and variables. So here are some examples:

program hello
implicit none
integer :: i
! List-directed fmt
PRINT *, "Hello World"
do i = 1, 10
    ! An explicit fmt and a two element list
    PRINT '(A,I0)', "Hello ", i
enddo
! An explicit fmt and a one element list
PRINT '(A)', 'Goodbye'
end program hello

Note that PRINT is just about the only I/O operation that still does not support IOSTAT and IOMSG clauses, and for this reason alone it should not be used in new code except for temporary output and debugging. PRINT can also only write to the standard output unit, which is another reason for not using it in production code.

I/O Channels & Files

In the example on PRINT shown above, the PRINT statement is automatically pre-connected to the standard output device, usually the computer screen. In general, however, Fortran requires a two stage process to connect code to external files. First we connect the file to a Fortran channel (manually identified by a positive integer or automatically assigned a negative value) with the OPEN command, and then we can READ and WRITE to the now open channel. A READ or WRITE operation on a channel that is not open generally results in an error; there is no language-specified buffering until the relevant channel is opened. Once an I/O operation is complete we can CLOSE the connection between the file and the Fortran channel. If a Fortran program terminates before a channel is closed, Fortran will usually close the channel without significant loss of data.

The state of availability of Fortran channels can be ascertained through one form of the INQUIRE command. The INQUIRE command can also be used to determine the existence and other properties of a file before it is connected to a Fortran channel.

Fortran I/O to internal files does not require a pre-connection process. Fortran input from the keyboard and output to the screen is automatically pre-connected on a special channel (*). Compiler vendors are free to assign a channel number to these standard I/O devices and the user can determine which channel numbers have been used via the intrinsic module ISO_FORTRAN_ENV.
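As a minimal sketch of the last point, the following program prints the channel numbers that the processor has assigned to the standard devices. The program name is purely illustrative, and the values printed are processor dependent.

```fortran
program show_standard_units
    use, intrinsic :: iso_fortran_env, only: input_unit, output_unit, error_unit
    implicit none

    ! The values are processor dependent; do not rely on particular numbers.
    write(output_unit, '(A,I0)') 'input_unit  = ', input_unit
    write(output_unit, '(A,I0)') 'output_unit = ', output_unit
    write(output_unit, '(A,I0)') 'error_unit  = ', error_unit

    ! Writing to output_unit addresses the same device as the * channel.
    write(*, '(A)') 'Written via *'
    write(output_unit, '(A)') 'Written via output_unit'
end program show_standard_units
```

Using the named constants rather than hard-coded numbers keeps the code portable between compilers.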

OPEN

This is the command required to establish a connection between an external file and a Fortran channel. The OPEN command can be used to create a new file or connect to existing files. Subsequent I/O to the file, once open, is made via this channel number. The OPEN command has options to ensure that the file already exists or does not already exist, to ensure that it is used only for input, or only for output, or for both. The expected format of the file can be specified in the OPEN command and errors can be trapped. The full syntax of the command can appear to be rather complicated, but generally any one call to OPEN uses only a small subset of all the available options.

The OPEN command originated when external files were usually card images. OPEN can now specify that the connection to a file be fixed-format, asynchronous, a binary stream and many combinations of these. It is worth remembering that I/O is a major source of potential coding errors and where critical data are read they should be written again to confirm correct processing.

The value of the Fortran channel number has global scope within any one image of a Fortran program. Even if an integer variable is used to open a channel and that variable has very limited scope, the actual channel number is effectively ubiquitous. This needs to be considered at design time because one module can open, say, channel 10, and another module can close channel 10 without any use association between them. For this reason, in large programs, it is often the case that a single module is used to control all file i/o operations so that a clear and obvious "open - read/write - close" chain can be maintained.
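The global scope of channel numbers can be illustrated with a small sketch. The module and procedure names below (opener_mod, closer_mod and so on) are hypothetical, and channel 10 is assumed to be free when the program starts.

```fortran
module opener_mod
    implicit none
contains
    subroutine open_scratch_on_10()
        ! Hypothetical helper: connects a scratch file to channel 10.
        open(unit=10, status='scratch', action='readwrite')
    end subroutine open_scratch_on_10
end module opener_mod

module closer_mod
    implicit none
contains
    subroutine close_10()
        ! Knows nothing about opener_mod, yet can close its channel.
        close(unit=10)
    end subroutine close_10
end module closer_mod

program unit_scope
    use opener_mod, only: open_scratch_on_10
    use closer_mod, only: close_10
    implicit none
    logical :: is_open

    call open_scratch_on_10()
    inquire(unit=10, opened=is_open)
    print *, 'After open_scratch_on_10, channel 10 open: ', is_open

    call close_10()
    inquire(unit=10, opened=is_open)
    print *, 'After close_10, channel 10 open: ', is_open
end program unit_scope
```

The two modules share nothing except the literal value 10, which is exactly the kind of hidden coupling that a single I/O control module is meant to avoid.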

Finally, as usual with Fortran, there are options and clauses which are retained for legacy purposes which should not be used in new code.

OPEN Command Syntax

OPEN([unit=]u [, olist])

Here [] indicates an optional item and [, olist] is a comma-separated list of options. In the above, u is a scalar integer expression or equivalent and is required unless the newunit option is specified. (Bad luck: we cannot open more than one file in one OPEN statement.) Technically, u is called the external file unit number, or channel number for short.

Common Options

newunit=nu where nu is a default integer variable. This allows the processor to select the channel number and, to avoid conflicts with legacy code, a negative value (not -1) will be selected that does not conflict with any current unit number in use. This is the form that should be used in all new code.

iostat = ios where ios is a default integer variable which will be set to zero if the open statement does not detect an error, but will be set to a positive value if an error does occur; the exact value is vendor dependent. Although technically an option, this is a highly recommended option for all OPEN commands. If this option is not present (and the err= option is not present, see below) the program will stop if there is an error. The presence of this option confers on the programmer the responsibility to check the value returned and have the program act accordingly.

iomsg = iom where iom is a scalar character variable of default kind. Again, although technically an option, this is a highly recommended option for all OPEN commands in new code. The length of the message is error and vendor specific and may require some trial and error.

file = fln where fln is a default character variable, literal or expression which specifies the name of the external file. The file name can be a fully qualified path or a local filename. If the path points to a file on a network drive, the drive must be preconnected, and there is no language-defined way of making this connection (except that we can always resort to EXECUTE_COMMAND_LINE).

status = stn where stn is also a default character variable, literal or expression which must evaluate (case independently) to one of 'old', 'new', 'replace', 'scratch' or 'unknown'. 'old' requires the file to exist and is typically used when the purpose of the open statement is to allow a file to be read. 'new' and 'replace' require the presence of the file= option described above; 'new' requires the file not to exist, while 'replace' allows the file to exist already, in which case it will be overwritten. 'scratch' is special in that the file= option must not be used and the file created cannot be kept on subsequent execution of a CLOSE command. 'scratch' is typically used for the temporary warehousing of large data structures on a hard disk or similar mass storage. 'unknown' is the default if the status= option is not given; with 'unknown' the file status is vendor and system dependent, i.e. a manual will have to be consulted.

action = act where act is a default kind character expression, variable or value that evaluates to 'READ', 'WRITE' or 'READWRITE'. Somewhat amazingly, the default is processor dependent, so the manual will have to be consulted. If 'READ' is specified, the file is to be regarded as read only and attempts to execute WRITE, PRINT or ENDFILE statements on this channel will result in errors. Similarly, if 'WRITE' is specified, the file is to be regarded as write only and attempts to execute a READ statement will result in an error. When 'WRITE' is specified some other statements may result in an error in a processor dependent manner (e.g. BACKSPACE).
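The interplay of the status= and action= options can be sketched as follows. This is only an illustration: the file name demo_status.txt and the program name are hypothetical, and the text of any error message is vendor dependent.

```fortran
program status_action_demo
    implicit none
    integer :: u, ios
    character(len=256) :: msg
    character(len=32) :: line

    ! 'replace' creates the file, or overwrites it if it already exists.
    open(newunit=u, file='demo_status.txt', status='replace', action='write', &
         iostat=ios, iomsg=msg)
    if (ios /= 0) then
        write(*,'(A)') 'Open for write failed: '//trim(msg)
        stop 1
    end if
    write(u,'(A)') 'first line'
    close(u)

    ! 'old' insists that the file exists; 'read' makes the connection read only.
    open(newunit=u, file='demo_status.txt', status='old', action='read', &
         iostat=ios, iomsg=msg)
    if (ios /= 0) then
        write(*,'(A)') 'Open for read failed: '//trim(msg)
        stop 1
    end if
    read(u,'(A)') line
    write(*,'(A)') 'Read back: '//trim(line)

    ! Writing to a read-only connection is an error, reported through iostat.
    write(u,'(A)', iostat=ios, iomsg=msg) 'not allowed'
    if (ios /= 0) write(*,'(A)') 'Write refused: '//trim(msg)

    close(u)   ! the demonstration file is left on disk
end program status_action_demo
```

Specifying action= explicitly avoids depending on the processor-dependent default.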

Simple Example of Open

PROGRAM OPENA
IMPLICIT NONE
INTEGER :: nout !channel number
INTEGER :: my_iostat !integer scalar to catch error status
CHARACTER(256) :: my_iomsg !Default-kind character variable to catch error msg
OPEN(newunit=nout, file="local.dat", iostat=my_iostat, iomsg=my_iomsg)
IF(my_iostat /= 0) THEN
    WRITE(*,*) 'Failed to open local.dat, iomsg='//trim(my_iomsg)
    STOP
ENDIF
WRITE(nout,*) 
...

Less Common Options

access=acl where acl is a character expression, variable or literal that evaluates to either 'sequential', 'direct' or 'stream'. When opening a file that already exists, this value must correspond to an allowed value which is usually the value given when the file was created. The default for a new file is 'sequential'. 'stream' access was introduced in Fortran 2003 and provides some compatibility with C binary stream files. The other really important feature of 'stream' access is that a file can be positioned for write and part of the file overwritten without changing the rest of the file. For formatted stream files the intrinsic function NEW_LINE(a) returns the newline character of the same kind as its character argument a.

recl=rcl where rcl is an integer expression, variable or literal that must evaluate to a positive value. For a file to be opened for direct access this 'option' is required and must specify the length of each record. For sequential files it is optional and can be used to specify the maximum length of a record. For a file that already exists, the value of rcl must correspond to the value used to create the file. In any case, the value of rcl must also be allowed by the underlying operating system.

form=frm where frm is a character expression, variable or literal that evaluates to either 'formatted' or 'unformatted'. This option can often be omitted since the default is 'formatted' for sequential access and 'unformatted' for direct access.

blank=blk where blk is a character expression, variable or literal that provides the value 'null' or 'zero' for formatted i/o only. See bn and bz formats below.

position=psn where psn is a character expression, variable or literal that evaluates to 'asis', 'rewind' or 'append' and applies only when the access method is sequential. The default value is 'asis'. When opened, a new file is always positioned at its initial point, but for existing files the user has the option of choosing where the file is positioned.

delim=

pad=
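Of the options above, access='stream' is perhaps the easiest to demonstrate. The following minimal sketch overwrites part of a file in place, as described under access=. The file name demo_stream.bin and the program name are hypothetical, and the sketch assumes that one character occupies one file storage unit (one byte), which is the case on common processors.

```fortran
program stream_overwrite
    implicit none
    integer :: u
    character(len=11) :: text

    ! Unformatted stream access: the file is a plain sequence of bytes.
    open(newunit=u, file='demo_stream.bin', access='stream', form='unformatted', &
         status='replace', action='readwrite')
    write(u) 'HELLO WORLD'

    ! Overwrite five bytes starting at byte 7; the rest of the file is untouched.
    write(u, pos=7) 'THERE'

    ! Read the whole file back from the first byte.
    read(u, pos=1) text
    print '(A)', text
    if (text /= 'HELLO THERE') error stop 'unexpected file contents'

    close(u, status='delete')
end program stream_overwrite
```

The pos= specifier is only available for stream access; sequential files are positioned with position=, REWIND and BACKSPACE instead.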

Options to Avoid

There is a lot of legacy code out there, and this section describes features that are still legal but which should be considered for replacement and certainly not used in new code.

err=eno where eno is a literal integer label number. If an error occurs in the processing of the OPEN statement, the program will transfer control to the statement with label number eno. The presence of the err= option causes the program to continue if there is an error. This option should now be replaced with iostat= and iomsg=. (It is legal to specify both err= and iostat=, but without either the program will stop if an error occurs processing the OPEN statement.)

unit=nu where nu is a default integer expression, variable or literal value which must be positive and which must not coincide with any unit already in use. If this option is placed first in the list of options the "unit=" can be omitted. This was very widely used and should now be replaced with newunit=. In very old code, nu was a fixed value and the programmer had to ensure that it did not clash with other channels in use at the same time. More recently, the INQUIRE statement can be used to select a unit number not already in use, but this could not guard against a subsequent OPEN statement trying to use a fixed value already in use. This is why the newunit option, and only the newunit option, is allowed to specify a negative value for the channel number.

READ

The READ statement reads a value from the specified input, in the specified format, into a variable. For example,

PROGRAM READA
IMPLICIT NONE
INTEGER :: A
READ(*,*) A
END PROGRAM

declares an integer variable A, then reads a value with the default formatting from the default input and stores it in A. The first * in (*,*) specifies where the value should be read from. The second * specifies the format with which the value is to be read.

Let us first discuss the format strings available. Fortran provides many edit descriptors with which we may specify how numbers or character strings are read or appear on the screen. For fixed point reals the descriptor is Fw.d, where w is the total number of character positions allotted to the number and d is the number of decimal places. The decimal point always takes up one position. For example,

PROGRAM READA
IMPLICIT NONE
REAL :: A
READ(*,'(F5.2)') A
END PROGRAM

The details of the I/O formatting, e.g. '(F5.2)', will be described below.
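To see what F5.2 does on input without typing at the keyboard, a value can be read from an internal file, i.e. a character variable. The following is a minimal sketch with hypothetical variable names. It shows that an explicit decimal point in the input field takes precedence, while a field without one has its last d digits treated as the fractional part.

```fortran
program read_fixed_point
    implicit none
    real :: a, b
    character(len=5) :: with_point    = '12.34'
    character(len=5) :: without_point = ' 1234'

    ! The field is 5 characters wide and contains an explicit decimal point,
    ! so the value is taken as written.
    read(with_point, '(F5.2)') a

    ! No decimal point in the field: d = 2 means the last two digits
    ! are treated as the fractional part.
    read(without_point, '(F5.2)') b

    print '(A,F6.2)', 'a = ', a
    print '(A,F6.2)', 'b = ', b
    if (abs(a - 12.34) > 1.0e-4 .or. abs(b - 12.34) > 1.0e-4) error stop 'unexpected value'
end program read_fixed_point
```

Both reads are expected to store 12.34, which is why fixed-width data files written without decimal points need the correct d on input.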

WRITE

INQUIRE

The INQUIRE statement has two basic forms: "INQUIRE by unit" and "INQUIRE by file" and both are very useful and well worth getting to know. There is a more obscure form called "INQUIRE by length" which is useful for checking the unformatted record length of potential output in order to decide what record length may be required, or whether a file with an already defined record length can cope with a given output.
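The two basic forms can be sketched as follows. The file name demo_inquire.txt and the program name are hypothetical, and only a few of the many available INQUIRE specifiers are shown.

```fortran
program inquire_demo
    implicit none
    integer :: u, ios, connected_unit
    logical :: file_exists, unit_open
    character(len=256) :: msg, connected_name

    ! INQUIRE by file: does the file exist before we try to open it?
    inquire(file='demo_inquire.txt', exist=file_exists, iostat=ios, iomsg=msg)
    if (ios /= 0) then
        write(*,'(A)') 'INQUIRE by file failed: '//trim(msg)
        stop 1
    end if
    print *, 'Exists before OPEN: ', file_exists

    open(newunit=u, file='demo_inquire.txt', status='replace', action='write')

    ! INQUIRE by unit: is the channel connected, and to which file?
    inquire(unit=u, opened=unit_open, name=connected_name)
    print *, 'Channel open: ', unit_open, ' name: ', trim(connected_name)

    ! INQUIRE by file can also report the channel a file is connected to.
    inquire(file='demo_inquire.txt', number=connected_unit)
    print *, 'Same channel: ', connected_unit == u

    close(u, status='delete')
end program inquire_demo
```

The exact form of the name returned by name= is processor dependent; it may or may not include a full path.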

INQUIRE by unit

INQUIRE by file

INQUIRE by length

This rather more obscure version of the INQUIRE command is used to obtain the length of an unformatted record required to contain a given form of output and hence allow the user to either check or specify the length of a record required.
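A minimal sketch of this form, assuming a record that holds one integer and four reals (the variable and program names are hypothetical): the length reported by INQUIRE is passed straight to recl= when opening a direct access file.

```fortran
program inquire_length
    implicit none
    integer :: rec_len, u, i, id
    real    :: samples(4), check(4)

    id = 0
    samples = 0.0

    ! Ask the processor how long an unformatted record holding this list is.
    ! The units of rec_len are processor dependent, but the value is
    ! suitable for use as recl= in OPEN.
    inquire(iolength=rec_len) id, samples
    print '(A,I0)', 'Record length reported: ', rec_len

    open(newunit=u, status='scratch', access='direct', form='unformatted', &
         action='readwrite', recl=rec_len)

    ! Write three records, then read the second one back directly.
    do i = 1, 3
        samples = real(i)
        write(u, rec=i) i, samples
    end do
    read(u, rec=2) id, check
    if (id /= 2 .or. any(check /= 2.0)) error stop 'unexpected record contents'
    print '(A,I0)', 'Read back record ', id

    close(u)
end program inquire_length
```

Letting the processor compute the record length avoids hard-coding a byte count that may differ between compilers.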

INQUIRE Errors

CLOSE

The CLOSE command releases the connection between a file and a Fortran channel. In the process, and depending on how the file was opened, the file can be saved or discarded. The general form of the CLOSE command is as follows:

CLOSE([unit=]u [, olist])

Where u is a default integer expression, variable or literal value that evaluates to the number of the channel to close, and "unit=" is optional. The options available in the option list are as follows:

iostat=ios where ios is a default integer variable which will return with the value 0 if the CLOSE command is executed correctly. If an error occurs the return value will be positive and a message describing the error will be provided via the iomsg option described below.

err=eno where eno is a literal integer label number. If an error occurs in the processing of the CLOSE statement, the program will transfer control to the statement with label number eno. The presence of the err= option causes the program to continue if there is an error. This option should now be replaced with iostat= and iomsg=. (It is legal to specify both err= and iostat=, but without either the program will stop if an error occurs processing the CLOSE statement.)

iomsg = iom where iom is a scalar character variable of default kind. Again, although technically an option, this is a highly recommended option for all CLOSE commands in new code. The length of the message is error and vendor specific and may require some trial and error.

status=st where st is a character expression, variable or literal that evaluates to 'keep' or 'delete'.

CLOSE Errors

It is perhaps counter intuitive, but Fortran does not consider attempting to CLOSE a channel that is already closed to be an error. Like INQUIRE and OPEN, the errors CLOSE will report are effectively operating system errors. For example, CLOSE with STATUS="DELETE" on a channel which is already closed is an error especially if the file no longer exists. Similarly, Fortran will report an error if a file created as SCRATCH is closed with STATUS="KEEP" because the only option for scratch files is STATUS='DELETE'. These limitations are predicated on the IOSTAT (and IOMSG) clause being used to allow the user to program for graceful termination when necessary, and not to obtain a full report on the performance of a CLOSE command.
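The behaviour described above can be explored with a short sketch. It assumes that channel 99 is not connected when the program starts, and the program name is hypothetical. Closing a scratch file with STATUS='KEEP' is not permitted by the language, so whether and how that mistake is reported depends on the processor; the sketch only prints what it is given.

```fortran
program close_demo
    implicit none
    integer :: u, ios
    character(len=256) :: msg

    ! Closing a channel that is not connected is permitted and is not an error.
    close(unit=99, iostat=ios, iomsg=msg)
    if (ios /= 0) error stop 'closing an unconnected channel reported an error'
    print '(A,I0)', 'CLOSE on unconnected channel 99, iostat = ', ios

    ! A scratch file cannot be kept; a processor may report this via iostat.
    open(newunit=u, status='scratch', action='readwrite')
    msg = ''
    close(u, status='keep', iostat=ios, iomsg=msg)
    print '(A,I0)', 'CLOSE scratch with status=keep, iostat = ', ios
    if (ios /= 0) then
        print '(A)', 'Message: '//trim(msg)
        close(u, iostat=ios)   ! make sure the scratch channel is released
    end if
end program close_demo
```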

I/O Formatting

List-Directed Formatting

We describe explicit formatting below, but it is immediately clear that it is a whole "language within a language", so Fortran provides a short cut: language-defined format guessing. It turns out that, for input, this default formatting is very close to a comma-separated values (CSV) processor. List-directed I/O is specified by a fmt=* clause, but the fmt= keyword is optional and the clause can be written as just *.
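The CSV-like behaviour on input can be seen in the following minimal sketch, which reads from an internal file so that no external data file is needed. The variable and program names are hypothetical.

```fortran
program list_directed_read
    implicit none
    character(len=40) :: line
    integer :: n
    real :: x
    character(len=10) :: name

    ! A comma-separated record held in an internal file (a character variable).
    line = '42, 3.5, "widget"'

    ! List-directed input splits on commas (or blanks) and converts each
    ! value according to the type of the corresponding variable.
    read(line, *) n, x, name

    print *, n, x, trim(name)
    if (n /= 42 .or. abs(x - 3.5) > 1.0e-6 .or. name /= 'widget') error stop 'unexpected values'
end program list_directed_read
```

The layout of list-directed output, by contrast, is largely left to the processor, so the PRINT line above may look different from one compiler to another.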

Explicit Formatting

Fortran has a rich, but very terse, language for controlling the formatting of I/O operations. The format commands can be placed in an explicit FORMAT statement or they can be placed within a clause of the relevant READ or WRITE statement either literally or stored in a CHARACTER variable.
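The three placements just mentioned can be sketched side by side. The format itself, (A,I3,A,F8.3), is only an illustration, as are the names used.

```fortran
program format_three_ways
    implicit none
    real :: x = 3.14159
    integer :: n = 7
    character(len=*), parameter :: fmt_var = '(A,I3,A,F8.3)'

    ! 1. Format given literally in the WRITE statement.
    write(*, '(A,I3,A,F8.3)') 'n =', n, ', x =', x

    ! 2. Format stored in a character constant or variable.
    write(*, fmt_var) 'n =', n, ', x =', x

    ! 3. Format in a labelled FORMAT statement.
    write(*, 100) 'n =', n, ', x =', x
100 format (A,I3,A,F8.3)
end program format_three_ways
```

All three WRITE statements use the same edit descriptors and so produce the same line. Holding the format in a character variable has the added benefit that it can be built at run time.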

There are limitations: Fortran is concerned with character I/O and not with presentational properties such as the size or font. Fortran has very limited capabilities for random access especially on output. However, it is very simple for the Fortran programmer to output files that conform to modern CSS and HTML standards and to control output appearance accordingly when the file is accessed via a browser.