Pascal Programming

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Pascal_Programming

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

Pascal is an influential computer programming language named after the mathematician Blaise Pascal. It was invented by Niklaus Wirth in 1968 as a research project into the nascent field of compiler theory. The backronym PASCAL, standing for primary algorithmic scientific commercial application language, highlights its suitability for computing tasks in science, yet it is certainly usable for general programming as well.

Target demographic
Scope
Standard Pascal (ISO standard 7185) and selected modern extensions.
Description
This book will teach high-level programming, using the programming language Pascal.
Learning objectives
You can analyze trivial to moderately difficult programming problems, take general software engineering principles into consideration, and write proof-of-concept (POC) implementations in Pascal, using its strengths and knowing its limits. This book will not make a senior-level programmer out of you, but you will definitely pass any college-level introductory CS classes.
Not covered here, but (possibly) in other Wikibooks
Computer architecture, low-level OS interactions, specific usage of high-level libraries such as nCurses.
Guidelines for co‑authors
• American English spelling. Mathematical vocabulary, but explain words if they mean something special in mathematics.
• Use Unextended Pascal, ISO 7185, as your base, and go from there.
• Every example program is regarded on its own and has to be complete by itself.
Responsible authors
These authors ensure the book follows a more or less uniform style and can be read from stem to stern with an acceptable degree of repetition. You are welcome to contribute individual chapters or sections without feeling responsible for the entire book.
Structure
See, think, do: Expose the reader to beautiful code and challenge them.


Part Ⅰ

Standard Pascal

# Beginning Pascal

Welcome to the Wikibook Pascal Programming! This book will teach you to program in Pascal, a high-level, human-readable programming language. High-level means the language provides abstract concepts, such as data types or control structures, that the microprocessor itself does not know. Human-readable refers to the fact that a program written in Pascal can be read like (very simple, “Neanderthalian”) English phrases. This makes Pascal particularly suitable for beginners and we hope you will appreciate this.

## Prerequisites

In order to successfully use this book you need to already know a few things:

Covering these topics would be out of this book’s scope. Pascal only assumes there is some user interface (i. e. a console) and there are external entities (this usually refers to “files”). Every system, however, implements them differently, so we cannot explain them to you, nor can we say at what point you have learned enough to continue with this book.

## Required software

Pascal is a compiled language. That means you need a tool, a computer program, that “translates” the human-readable Pascal source code into a sequence of bytes the microprocessor understands. This work is done by a compiler.

Prior to the 2000s there were many different compilers, but (as of 2020) there are primarily three Pascal compilers:

• Delphi,
• Free Pascal Compiler (FPC), and
• GNU Pascal Compiler (GPC).

The authors suggest FPC, due to its availability (on many platforms, and free of charge) and continuous progress in development. This table provides more information about each compiler:

| Compiler | Website | Platforms | License | Remarks |
|---|---|---|---|---|
| Delphi | Embarcadero.com | Windows | proprietary | commercial product, with IDE |
| Free Pascal | FreePascal.org | many | GPL | supports multiple dialects |
| GNU Pascal | GNU-Pascal.de | all that GCC supports | GPL | considered abandoned since the 2010s |
| Pascal-P | SourceForge | | public domain | ISO 7185 Level 0 only, must be compiled manually |

Comparison of current Pascal compilers (current means since 2000)

Furthermore, you will need a program you can edit source code files with. This can be any editor (that can edit and save plain text files), but there are also dedicated suites available for programming purposes. These are called integrated development environments, in short IDE. Such IDEs provide means to write, compile, and run programs, and possibly find programming mistakes, all in one single program. Some IDEs are:

• Delphi
• fp(1), a text-mode IDE that is shipped with the FPC
• Lazarus, which is related to the FPC, but more colorful

An IDE may be overwhelming if you are just starting to program. In this case we suggest sticking to simple editors, such as nano(1). It has an easy-to-understand user guidance system, allowing you to delve into programming right away.

A temporary alternative for your first steps may also be websites:

All of these are powered by the FPC. Be aware of what you enter on those sites.

## Working with this book

We suggest creating a dedicated folder for your programming exercises. Keep your source code files until you have finished with this book. If your folder becomes cluttered with all kinds of files, the FPC comes with the tool delp(1) that can delete all (Pascal-related) files other than source code files.


# Starting up

In this chapter you will learn:

• How a Pascal source code file is structured
• Basic terminology

## Programs

All your programming tasks require one source code file, which in Pascal is called a program. A program source code file is translated by the compiler into an executable application which you can run. Let’s look at a minimal program source code file:

program nop;
begin
{ intentionally empty }
end.

1. The first line program nop; indicates that this file is a Pascal source code file for a program.
2. The begin and end mark a frame. We will explain this in detail as we move on.
3. { intentionally empty } is a comment. Comments are ignored by the compiler, thus they do not contribute in any way to how the executable program looks or behaves.
4. And the final dot . after the final end informs the compiler about the program source code file’s end.
 If you feel overwhelmed by the pace of this book, the Wikibook Programming Basics might be more suitable for you.

## Compilation

In order to start your program you need to compile it.

First, copy the program shown above. We advise you to actually type out the examples and not to copy and paste code. Name the file nop.pas. nop is the program’s name, and the filename extension .pas helps you to identify the source code file.

Once you are finished, tell the compiler you have chosen to compile the program:

If you are using the FPC, type into a console fpc followed by a (relative or absolute) file name path to the source code file:
FPC Input:
fpc nop.pas

Result:
Target OS: Linux for x86-64
Compiling nop.pas
4 lines compiled, 0.1 sec

If there are no typing errors, successful compilation looks like this (some data may differ). In the current directory there will be a new file called nop. This is the executable program you can start.

If you are using the GPC, type into a console gpc followed by a (relative or absolute) file name path to the source code file:
GPC Input:
gpc nop.pas

Result:
If there are no typing mistakes, gpc will not report any errors, but there will be a new file (by default) called a.out.

Finally, you can execute the program by one of the methods your OS provides. For example, on a console you simply type out the file name of the executable file: ./nop (where ./ refers to the current working directory in Unix-like environments). As this program does (intentionally) nothing, you will not notice any (notable) changes. After all, the program’s name nop is short for no operation.

 Programs need to be compiled for every single platform. Platform refers to OS, OS version, and the utilized microprocessor architecture and make. Only if all of these metrics match, you can copy an executable file to a different computer and run it there, too. Otherwise it may fail. Modern OSs can prevent you from running non-compatible programs (to some degree of precision).

## The computer speaks

Congratulations on your first Pascal program! To be fair, though, the program is not of much use, right? As a small step forward, let’s make the computer speak (metaphorically) and introduce itself to the world:

program helloWorld(output);
begin
writeLn('Hello world!');
end.


The first difference you will notice is in the first line. Not only the program name changed, but there is (output). This is a program parameter. In fact, it is a list. Here, it only contains one item, but the general form is (a, b, c, d, e, …) and so on. A program parameter designates an external entity the OS needs to supply the program with, so it can run as expected. We will go into detail later on, but for now we need to know there are two special program parameters: input and output. These parameters symbolize the default means of interacting with the OS. Usually, if you run a program on a console, output is the console’s display.

### Writing to the console

The next difference is writeLn('Hello world!'). This is a statement. The statement is a routine invocation. The routine is called writeLn. WriteLn has (optional) parameters. The parameters are, again, a comma-separated list surrounded by parentheses.

#### Routines

Routines are reusable pieces of code. The routine writeLn, short for write line, writes all supplied parameters to the destination followed by a “newline character” (some magic that will move the cursor to the next line). Here, however, the destination is invisible: because it is optional, it can be left out. If it is left out, the destination becomes output, that is, our console output. If we want to name the destination explicitly, we have to write writeLn(output, 'Hello world!'). WriteLn(output, 'Hello world!') and writeLn('Hello world!') are identical. The missing optional parameter is inserted automatically, which relieves the programmer from typing it out.

In order to use a routine, we write its name, as a statement, followed by the list of parameters. We did that in the third line of the program above.

 Routines need to be defined before they can be used. The routine writeLn, however, is defined as an integral part of the Pascal language. In one of the following chapters we will learn to define our own routines.
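
The equivalence of the two spellings can be tried out with a small sketch. The program name destinationDemo is merely chosen for illustration; both statements are expected to print the same line.

```pascal
program destinationDemo(output);
begin
  { destination left out: output is assumed }
  writeLn('Hello world!');
  { destination named explicitly: same effect }
  writeLn(output, 'Hello world!');
end.
```

When run, the program should show Hello world! on two consecutive lines, once per statement.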

#### String literals

The parameter 'Hello world!' is a so-called string literal. Literal means, your program will take this sequence of characters as it is, not interpret it in any way, and pass it to the routine. A string literal is delimited by typewriter (straight) apostrophes.

### Reserved words

In contrast to that, the words program, begin and end (and many more you will see in the code examples) are so-called reserved words. They convey special meaning with regard to how the executable program is interpreted and constructed. You are only allowed to write them at particular places.

 Nevertheless, you can write the string literal 'program'. The string delimiters “disable” interpretation.
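
As a small illustration, the following sketch (with the arbitrarily chosen program name reservedWordDemo) passes reserved words to writeLn as ordinary text. Inside the apostrophes they are just sequences of characters.

```pascal
program reservedWordDemo(output);
begin
  { Inside apostrophes, reserved words are plain characters. }
  writeLn('program');
  writeLn('begin and end');
end.
```

The compiler should accept this program, even though program, begin and end appear in places where the reserved words themselves would not be allowed.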

### Behavior

Now that we know what the source code contains, create a new file helloWorld.pas, copy the source code (by typing it manually), compile and run it:

Code:

program helloWorld(output);
begin
writeLn('Hello world!');
end.


Output:

Hello world!

The program will print Hello world!, without the straight quotation marks, on an individual line to the console. Isn’t that great?
“Help! I only see the terminal window opening and closing again!”
In this case, try this program:
program helloWorld(input, output);
begin
writeLn('Hello world!');
readLn();
end.

Compared to the previous version, the program parameter list now also contains input, and the statement readLn() has been added. The extra readLn() will make your program stall, so the program is not considered done. After you hit ↵ Enter the terminal window should close again.

This type of program, by the way, is an example of a class of “Hello world” programs. They serve the purpose of demonstrating the minimal requirements a source code file in any programming language needs to fulfill. For more examples see Hello world in the Wikibook “Computer Programming” (and appreciate Pascal’s simplicity compared to other programming languages).

## Comments

We already saw the option to write comments. The purpose of comments is to serve the programmer as a reminder.

### Comment syntax

Pascal defines curly braces as comment delimiting characters: { comment } (spaces are for visual guidance and have no significance). The left brace opens or starts a comment, and the right brace closes a comment.

 “Inside” a comment you cannot use the comment closing character as part of your text. The first occurrence of the proper closing character(s) will be the end of the comment.

However, when Pascal was developed not all computer systems had curly braces on their keyboards. Therefore bigrams (pairs of characters) using parentheses and asterisks were made legal, too: (* comment *).

Such comments are called block comments. They can span multiple lines. Delphi introduced yet another style of comment, line comments. They start with two slashes // and comprise everything until the end of the current line.

Delphi, the FPC as well as GPC support all three styles of comments.
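
A minimal sketch showing all three comment styles side by side follows. The program name commentStyles is made up for this example, and the line comment assumes a compiler that supports this non-standard style, such as the ones named above.

```pascal
program commentStyles(output);
begin
  { a comment delimited by curly braces }
  writeLn('first');
  (* a comment delimited by parenthesis-asterisk pairs *)
  writeLn('second');
  // a line comment: an extension, not part of ISO 7185
  writeLn('third');
end.
```

None of the comments should affect the result: the program is expected to print first, second and third on separate lines.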

There is an “art” of writing good comments.

 Comments should not repeat what can be deduced from the source code itself:
program helloWorld(output);
begin { This is where the program begins }
writeLn('Hello world!');
end. (* This is where the program ends. *)

 Comments should explain information that is not apparent:
program nop;
begin
{ intentionally empty }
end.

When writing a comment, stick to one natural language. In the chapters to come you will read many “good” comments (unless they clearly demonstrate something like below).

## Terminology

Familiarize yourself with the following terminology (that is, the terms printed as comments on the right):

program demo(input, output);     // program header
                                 // ───────────────────────────────────┐
const                            // ────────────────────┐              │
  answer = 42;                   // constant definition ┝ const-section│
                                 // ────────────────────┘              │
type                             // ────────────────────┐              │
  employee = record              // ─┐                  │              │
      number: integer;           //  │                  │              │
      firstName: string;         //  ┝ type definition  │              │
      lastName: string;          //  │                  ┝ type-section │
    end;                         // ─┘                  │              │
                                 //                     │              │
  employeeReference = ^employee; // another type def.   │              │
                                 // ────────────────────┘              ┝ block
                                 //                                    │
var                              // ────────────────────┐              │
  boss: employeeReference;       // variable declaration┝ var-section  │
                                 // ────────────────────┘              │
                                 //                                    │
begin                            // ────────────────────┐              │
  boss := nil;                   // statement           │              │
  writeLn('No boss yet.');       // another statement   ┝ sequence     │
  readLn();                      // another statement   │              │
end.                             // ────────────────────┘              │
                                 // ───────────────────────────────────┘


Note how every constant definition, type definition, and variable declaration goes into a dedicated section. The reserved words const, type, and var serve as headings.

A sequence is also called a compound statement. The combination of definitions, declarations and a sequence is called a block. Definitions and declarations are optional, but a sequence is required. The sequence may be empty, as we already demonstrated above, but this is usually not the case.

Do not worry, the difference between definition and declaration will be explained later. For now you should know and recognize sections and blocks.

Can a comment contain a comment? Try and write a test program to find it out! Mix various comment delimiters and see what happens if you mix them up.
Yes/no. While you can begin another comment inside a comment, the terminating character(s) will mark the end of a comment in general. The following situations do not cause a problem:
program commentDemo;
begin
{ (* Hello { { { }
(* (* { (* Foo }
{ (* Bar *)


The first comment-ending character(s) demarcate the end of the entire comment, regardless of whether it started with { or (*. That means the compiler will complain here:

	{ start (* again? } *)


Line comments are immune to this, since they do not have an explicit end delimiter. This will compile without errors:

	// *) } { (*
end.


What does writeLn (note the lack of a parameter list) do?
WriteLn without any supplied parameters prints an empty line to the default destination, i. e. output.
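
This can be checked with a short sketch (the program name emptyLineDemo is chosen freely), in which a parameterless writeLn sits between two ordinary ones.

```pascal
program emptyLineDemo(output);
begin
  writeLn('above');
  { no parameter list: only the line is terminated }
  writeLn;
  writeLn('below');
end.
```

The words above and below should appear separated by one empty line.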

Write a program that shows this (or similar):
     ####     ####
   ######## ########
  ##     #####     ##
  ##       #       ##
  ##      ILY      ##
   ##   sweetie   ##
    ###         ###
      ###     ###
        ### ###
          ###
           #

An acceptable implementation could look like this:
program valentine(output);
begin
writeLn('     ####     ####');
writeLn('   ######## ########');
writeLn('  ##     #####     ##');
writeLn('  ##       #       ##');
writeLn('  ##      ILY      ##');
writeLn('   ##   sweetie   ##');
writeLn('    ###         ###');
writeLn('      ###     ###');
writeLn('        ### ###');
writeLn('          ###');
writeLn('           #');
end.


Note, the program parameter list (first line) only lists output. Beware, while the exact number of spaces does not matter in your code, it does matter in string literals.

Wikipedia has more on ASCII art.

# Variables and Constants

Like all programming languages, Pascal provides some means to modify memory. This concept is known as variables. Variables are named chunks of memory. You can use them to store data you cannot predict.

Constants, on the other hand, are named pieces of data. You cannot alter them during run-time; instead, they are hard-coded into the compiled executable program. Constants do not necessarily occupy any dedicated unique space in memory, but facilitate writing clean and understandable source code.

## Declaration

In Pascal, before you are even allowed to use any variable or constant you have to declare them, like virtually any symbol in Pascal. A declaration makes a certain symbol known to the compiler and possibly instructs it to make the necessary provisions for their effective usage, that means – in the context of variables – earmark some piece of memory.

A declaration is always a two-tuple $(\text{identifier}, \text{definition})$. To be more specific, variables are declared as $(\text{identifier}, \text{data type})$ tuples and constant declarations are $(\text{identifier}, \text{literal})$ tuples. A tuple is an ordered collection: you may not reverse or rearrange its items without obtaining a different tuple.

 After you have declared an identifier to refer to one thing, you may not re-declare the same identifier to refer to another (or same) thing (“shadowing” may apply, but more on that later).

## Identifiers

### Structure

Identifiers are names denoting constants, types, bounds, variables, procedures, and functions. They must begin with a letter, which may be followed by any combination and numbers of letters and digits. The spelling of an identifier is significant over its whole length. Corresponding upper-case and lower-case letters are considered equivalent.[1]

Letters refers to the modern Latin alphabet, that is all letters you use in writing English words, and digits are Western Arabic digits.

### Usage

As you infer from the quote’s last sentence, the casing of letters does not matter: Foo and fOO are both the same identifier, just different representations.

Identifiers are used simply by writing them out at a suitable position.
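
The following sketch illustrates that differently cased spellings denote one and the same identifier. The program name caseDemo and the variable total are chosen for illustration only, and the assignment statement (:=) used here is explained in the chapter “Input and Output”.

```pascal
program caseDemo(output);
var
  total: integer;
begin
  total := 3;
  { TOTAL and Total denote the very same variable as total }
  TOTAL := Total + 4;
  writeLn(total);
end.
```

Since all three spellings refer to a single variable, the program should print the value 7.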

### Significant characters

In the age Pascal was developed in, computer memory was a precious resource. In order to build a working compiler, however, the notion of significant characters was introduced. A significant character of an identifier is a character that contributes to distinguishing two identifiers from one another.

Some programming languages had a limit of 8 (eight) characters. This led to very cryptic identifiers. Today, however, the limit of significant characters is primarily governed by usability: The programmer eventually has to type them out if no IDE supports some auto-completion mechanism. The FPC, for example, has a limit of 127 characters:

Identifiers consist of between 1 and 127 significant characters (letters, digits and the underscore character), of which the first must be a letter (a–z or A–Z), or an underscore (_).[2]

 You are still allowed to write identifiers longer than 127 characters, however, the compiler only looks at the first 127 characters and discards the remaining characters as irrelevant.

Note, allowing _, too, is an ISO 10206 (“Extended Pascal”) extension, but – unlike the FPC – it imposes the restriction that an underscore may neither begin nor end an identifier, nor may two underscores appear next to one another.

## Variables

### Variable section

Variables are declared in a dedicated section, the var-section.

program varDemo(input, output);
var
number: integer;
begin
write('Enter a number: ');
readLn(number);
writeLn('Great choice! ', number, ' is awesome.');
end.


When the compiler processes the var-section it will set as much memory aside as is required by its associated data type. Here, we instruct the compiler to reserve space for an integer. An integer is a data type that is part of the programming language, thus it is guaranteed to be present regardless of the used compiler. It stores a subset of ℤ, the set of integers, like for example 42, 1337 or -1.

### Data type

Data type refers to the combination of a permissible range of values and permissible operations on this range of values. Pascal defines some basic data types as part of the language. Apart from integer there are also:

char
A character, like a Latin letter or Western Arabic digit, but also spaces and other characters.
real
A subset of ℚ, that is – due to the computer’s binary nature – the set of rational numbers. Examples are 0.015625 ($2^{-6}$) or 73728.5 ($2^{16} + 2^{13} + 2^{-1}$).
Boolean
A Boolean value, that is false or true.

Each data type defines how data are laid out in memory. In a high-level language, such as Pascal, it is not the programmer’s concern how exactly the data are stored, but the processor (i. e. in most cases a compiler) has to define it.

We will revisit all data types later on.
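
To give a first impression, this sketch declares one variable of each basic data type and prints it. All names (typeDemo, count, initial, ratio, done) are invented for the example, and the assignment statement (:=) is explained in the chapter “Input and Output”.

```pascal
program typeDemo(output);
var
  count: integer;
  initial: char;
  ratio: real;
  done: Boolean;
begin
  count := 42;
  initial := 'P';
  ratio := 0.015625;
  done := true;
  writeLn(count);
  writeLn(initial);
  writeLn(ratio);
  writeLn(done);
end.
```

How exactly each value appears on the screen, in particular the real value, may differ between compilers.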

As you may have noticed, the example above contains readLn(number) and the program header also lists input. ReadLn will (try to) read data from the (optionally named) source and store the (interpreted) values into the supplied parameters, discarding any line-end characters. If the source is not specified, as is the case here, input is assumed, thus readLn(number) is equivalent to readLn(input, number), but shorter.

When the program is run, it will stop and wait for the user to input a number, that is a literal that can be converted into the argument’s data type.

If you do not enter a literal that is compatible with the type of the supplied argument, something like this happens:
Enter a number: I want cookies!
./a.out: sign or digit expected (error #552 at 402ac3)

And then the program aborted. The following writeLn was not executed. Now obviously I want cookies! is not a literal that can be converted into an integer value (i. e. the data type of number). For reference, this error message was generated with the program compiled using the GPC. Programs compiled with different compilers may emit different error messages.

You have to indicate in your program’s accompanying documents – the user manual – how and when the user needs to input data. Later we will learn how to treat erroneous input, but this is too complex for now.

### More variables

There can be as many var-sections as necessary, but they may not be empty. There is also a shorthand syntax for declaring many variables of the same type:

var
foo, bar, x: integer;


This will declare three independent variables, all of the integer data type. Nonetheless, different types have to appear in different declarations:

var
x: integer;
letter: char;


## Constants

### Constant section

program constDemo(output);
const
answer = 42;
begin
writeLn('The answer to the Ultimate Question of ',
'Life, the Universe, and Everything, is: ',
answer);
end.


### Usage

As already mentioned in the introduction, a constant never changes its value while the program is running; to change it, you have to modify the source code. Consequently, the name of a constant cannot appear on the left-hand side of an assignment.
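
A minimal sketch of this rule, using the invented names constantUsage, answer and guess: the constant may be read wherever a value is expected, but it may not be assigned to. The assignment statement (:=) is explained in the chapter “Input and Output”.

```pascal
program constantUsage(output);
const
  answer = 42;
var
  guess: integer;
begin
  { a constant may appear wherever a value is expected }
  guess := answer;
  writeLn('guess = ', guess);
  writeLn('answer = ', answer);
  { answer := 43; is not allowed, because answer is a constant }
end.
```

If you remove the braces around the last line, the compiler is expected to refuse the program.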

### Pre‑defined constants

There are some already predefined constants:

maxInt
This is the maximum integer value an integer variable could assume. There is no minimum integer constant, but it is guaranteed that an integer variable can at least store the value -maxInt.
maxChar
Likewise, this is the maximum char value a char variable could assume, where maximum refers to the ordinal value using the built-in ord function.
maxReal, minReal and epsReal
Are defined by the “Extended Pascal” standard.
false and true
Refer to Boolean values.
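
The constants available in Standard Pascal can be inspected with a sketch like the following (the program name predefinedConstants is chosen freely). The Extended Pascal constants are left out here, since not every compiler provides them.

```pascal
program predefinedConstants(output);
begin
  writeLn('maxInt  = ', maxInt);
  writeLn('-maxInt = ', -maxInt);
  writeLn('true    = ', true);
  writeLn('false   = ', false);
end.
```

The value printed for maxInt depends on the compiler and the platform, so do not expect the same number everywhere.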

### Rationale

Pascal was designed so that – among other considerations – it could be compiled in one pass, from top to bottom, in order to make compiling fast and simple. Distinguishing between variables and constants allows the processor to simply replace any occurrence of a constant identifier by its value. Thus, a constant does not need any special treatment like a variable, yet allows the programmer to reuse reappearing data.

Does the German word Zähler (meaning “counter” / “enumerator”) constitute a valid identifier?
No, because the letter Umlaut-a (ä) is not a letter in the English alphabet. Identifiers may only consist of (English alphabet) letters and (Western Arabic) digits. As a word of advice, stick to English words as identifiers even though your native language is a different one.

Is 1direction (1D) a permissible identifier?
This is also not a valid identifier. All identifiers have to start with a letter. This restriction allows compiler vendors to assume, once a digit is encountered, that a number literal follows.

What is the difference between write and writeLn?
They are the same except that, as the name already indicates, writeLn puts the cursor into the next line after it has printed all its parameters.
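
The difference becomes visible in a short sketch (the program name writeDifference is made up for this example):

```pascal
program writeDifference(output);
begin
  { write leaves the cursor on the current line }
  write('one, ');
  write('two, ');
  { writeLn terminates the line after printing }
  writeLn('three');
  writeLn('next line');
end.
```

The first three statements should produce the single line one, two, three, and next line should follow on a line of its own.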

References:

1. Jensen, Kathleen; Wirth, Niklaus. Pascal – user manual and report (4th revised ed.). doi:10.1007/978-1-4612-4450-9. ISBN 978-0-387-97649-5.
2. Michaël Van Canneyt (September 2017). "§1.4". Free Pascal Reference guide. version 3.0.4. p. 15. Retrieved 2019-12-14.


# Input and Output

We have already been using I/O since the first chapter, but only to get going. It is time to dig a little bit deeper, so we can write nicer programs.

## Interface

In its heyday, Pascal was smart enough to define a minimal common, yet convenient interface for I/O. Despite various standardization efforts, I/O operations differ among operating systems, yet – as part of the language – Pascal defines a set of operations to be present, regardless of the utilized compiler or OS.

### Special files

In the first chapter it was already mentioned that input and output are special program parameters. If you list them in the program parameter list, you can use these identifiers to write to and read from the terminal, the CLI you are using.

### Text files

In fact, input and output are variables. Their data type is text. We call a variable that has the data type text a text file.

The data of a text file are composed of lines. A line is a (possibly empty) sequence of characters (e. g. letters, digits, spaces or punctuation) until and including a terminating “newline character”.

### Files

A file – in general – has the following properties:

• It can be associated with an external entity. External means “outside” of your program. A suitable entity can be, for instance, your console window, a device such as your keyboard, or a file that resides in your file system.
• If a file is associated with an external entity, it is considered bound.
• A file has a mode. Every file can be in generation or inspection mode, none or both. If a file is in generation and inspection mode at the same time, this can also be called update mode.[fn 1]
• Every file has a buffer. This buffer is a temporary storage for writing or reading data, so virtually another variable. This buffer variable exists due to reasons how I/O on computers works.

All this information is implicitly available to you, you do not need to take care of it. You can query and alter some information in predefined ways.

All you have to keep in mind in order to successfully use files is that a file has a mode. The text files input and output are, once they are listed in the program parameter list, in inspection and generation mode respectively. You can only read data from files that are in inspection mode, and it is only possible to write data to files that are in generation mode.

Note, due to their special nature the mode of input and output cannot be changed.
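
The two modes can be seen at work in the following sketch, which names input and output explicitly. The program name echoNumber and the variable number are chosen for illustration.

```pascal
program echoNumber(input, output);
var
  number: integer;
begin
  { output is in generation mode: writing is possible }
  write(output, 'Enter a number: ');
  { input is in inspection mode: reading is possible }
  readLn(input, number);
  writeLn(output, 'You entered ', number);
  { Swapping the files, e. g. readLn(output, number), would not work, }
  { because the modes do not match the operation. }
end.
```

The program prompts for a number, waits for your input, and repeats the value it has read.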

### Routines

Pascal defines the following routines to read and write to files:

• get,
• put,
• read/readLn, and
• write/writeLn.

The routines readLn and writeLn can only be used in conjunction with text files, whereas all other routines work with any kind of file. In the following sections we will focus on read and write. These routines build upon the “low-level” get and put. In the chapter “Files” we will take a look at them, though.

## Writing data

Let’s look at a simple program:

program writeDemo(output);
var
    x: integer;
begin
    x := 10;
    writeLn(output, x:20);
end.


Copy the program and see what it does.

### Assignment

First, we will learn a new statement, the assignment. Colon equals (:=) is read as “becomes”. In the line x := 10 the variable’s value becomes ten. On the left-hand side you write a variable name. On the right-hand side you put a value. The value has to be valid for the variable’s data type. For instance, you could not assign 'Hello world!' to the variable x, because it is not a valid integer, i. e. not a value of the data type x has.
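To see “becomes” at work, here is a minimal sketch that assigns two values in succession to the same variable. The program name assignmentDemo is made up for this illustration, and the exact spacing of the printed numbers depends on your compiler.

```pascal
program assignmentDemo(output);
var
    x: integer;
begin
    x := 10;             { x becomes ten }
    writeLn(output, x);
    x := 42;             { x becomes forty-two; the ten is gone }
    writeLn(output, x);
end.
```

The second assignment simply replaces the first value. A variable holds only one value at a time.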

### Converting output

The power of write/writeLn is that – for text files – it converts the parameters into a human-readable form. On modern computers the integer value ten is stored in a particular binary form. 00001010 is a visual representation of the bits set (1) and unset (0) for storing “ten”. Yet, despite the binary storage the characters you see on the screen are 10. This conversion, from zeroes and ones into a human-readable representation, the character sequence “10”, is done automatically.

 If the destination of write/writeLn is a text file, all parameters are converted into a human-readable form provided such conversion is necessary and makes sense.

### Formatting output

Furthermore, after the parameter x comes :20. As you might have noticed, when you run the program the value ten is printed right-aligned making the 0 in 10 appear at the 20th column (position from the left margin).

The :20 is a format specifier. It ensures that the given parameter has a minimum width of that many characters. If the value is shorter, the missing “width” is filled with spaces to the left.

 Format specifiers in a write/writeLn call can only be specified where a human-readable representation is necessary, in other words if the destination is a text file.
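The following sketch may help to see that the width is a minimum, not a fixed size. The program name formatDemo and the square brackets (used only to make the padding visible) are choices made for this illustration.

```pascal
program formatDemo(output);
var
    x: integer;
begin
    x := 10;
    writeLn('[', x:1, ']');      { prints [10]: the value needs two columns anyway }
    writeLn('[', x:5, ']');      { prints [   10]: three spaces of padding }
    writeLn('[', 'a':3, ']');    { prints [  a]: characters can be padded, too }
end.
```

Note that one writeLn call may take several parameters, each with its own (optional) format specifier.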

## Reading data

Look at this program:

program iceCream(input, output);
var
    response: char;
begin
    writeLn('Do you like ice cream?');
    writeLn('Type “y” for “yes” (“Yummy!”) and “n” for “no”.');
    writeLn('Confirm your selection by hitting Enter.');
    readLn(input, response);
    if response = 'y' then
    begin
        writeLn('Awesome!');
    end;
end.


### Requirements

All parameters given to read/readLn have to be variables. The first parameter, the source, has to be a file variable which is currently in inspection mode. We ensure that by putting input into the program parameter list. If the source parameter is input, you are allowed to omit it, thus readLn(response) is equivalent to readLn(input, response).

 If the source is a text file, you can only read values for variables having the data type char, integer, real, or “string types”. Variables that are not compatible with these types cannot be read from a text file. (The term “compatible” will be explained later.)
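As a sketch of these rules, the following program reads an integer and a char from input, omitting the source parameter in both calls. The program name readDemo and the variable names age and initial are invented for this example.

```pascal
program readDemo(input, output);
var
    age: integer;
    initial: char;
begin
    writeLn('Enter your age and press Enter.');
    readLn(age);         { same as readLn(input, age) }
    writeLn('Enter the first letter of your name and press Enter.');
    readLn(initial);     { same as readLn(input, initial) }
    writeLn('Age: ', age:1, ', initial: ', initial);
end.
```

Both parameters handed to readLn are variables. Writing readLn(42) would be rejected, because 42 is not a variable.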

### Branching

A new language construct which we will cover in detail in the next chapter is the if-then-branch. The code after then that is surrounded by begin and end; is only executed if response equals the character value 'y'. Otherwise, we are polite and do not express our strong disagreement.

Can you write to input? Why does / should it work, or not?
You cannot write to input. The text file input is, provided it is listed in the program parameter list, in inspection mode. That means you can only read data from this text file, never write.

Can you read to a constant?
No, all parameters to read/readLn have to be variables. A constant, per definition, does not change its value during run-time. That means, also the user cannot assign values to a constant.

Take your program valentine from the first chapter and improve it with knowledge you have learned in this chapter: Make the heart ideogram appear (sort of) centered. Assume a console window width of 80 characters, or any reasonable width.
An improved version may look like this:
program valentine(output);
const
    width = 49;
begin
    writeLn('   ####     ####   ':width);
    writeLn(' ######## ######## ':width);
    writeLn('##     ####      ##':width);
    writeLn('##       #       ##':width);
    writeLn('##      ILY      ##':width);
    writeLn(' ##   sweetie   ## ':width);
    writeLn('  ###         ###  ':width);
    writeLn('    ###     ###    ':width);
    writeLn('      ### ###      ':width);
    writeLn('        ###        ':width);
    writeLn('         #         ':width);
end.

Note the usage of a constant for the formatting width. Use constants whenever you are otherwise repeating values. Do not worry if you did not do that. You will get a sense for that. Also, the string literals can be shorter on the left side if the longest string literal is shorter than width (otherwise it does not resemble a heart ideogram anymore):
program valentine(output);
const
    width = 49;
begin
    writeLn(   '####     ####   ':width);
    writeLn( '######## ######## ':width);
    writeLn('##     ####      ##':width);
    writeLn('##       #       ##':width);
    writeLn('##      ILY      ##':width);
    writeLn( '##   sweetie   ## ':width);
    writeLn(  '###         ###  ':width);
    writeLn(    '###     ###    ':width);
    writeLn(      '### ###      ':width);
    writeLn(        '###        ':width);
    writeLn(         '#         ':width);
end.

Note that the “opening” typewriter apostrophe sits right before the first hash mark. The indentation, though, has been preserved, so you can still recognize the heart shape in your source code and do not need to run the program to see it.

Create a program that draws a 40 by 6 box like the one shown below, but (for educational purposes) we do not want to enter four times 38 spaces in our source code.
o--------------------------------------o
|                                      |
|                                      |
|                                      |
|                                      |
o--------------------------------------o

If you are stuck, here is a hint.
This is an empty string literal: ''. It is two straight typewriter’s apostrophes back-to-back. You can use it in your solution.
An acceptable implementation could look like this:
program box(output);
const
    space = 38;
begin
    writeLn('o--------------------------------------o');
    writeLn('|', '':space, '|');
    writeLn('|', '':space, '|');
    writeLn('|', '':space, '|');
    writeLn('|', '':space, '|');
    writeLn('o--------------------------------------o');
end.

The '':space will generate 38 (that is the value of the constant space) spaces. If you are really smart, you have noticed that the top and bottom edges of the box are the same literal twice. We can shorten our program even further:
program box(output);
const
    space = 38;
    topAndBottomEdge = 'o--------------------------------------o';
begin
    writeLn(topAndBottomEdge);
    writeLn('|', '':space, '|');
    writeLn('|', '':space, '|');
    writeLn('|', '':space, '|');
    writeLn('|', '':space, '|');
    writeLn(topAndBottomEdge);
end.


Notes:

1. “Update” mode is only available in Extended Pascal (ISO standard 10206). In Standard (unextended) Pascal (laid out in ISO standard 7185) a file can be either in inspection or generation mode, or none.


# Expressions and Branches

In this chapter you will learn

• to distinguish between statements and expressions, and
• how to program branches.

## Statement

Before we get to know “expressions”, let’s define “statements” more precisely: A statement tells the computer to change something. All statements in some way or other change the program state. Program state refers to a whole conglomerate of individual states, including but not limited to:

• the values variables have, or
• in general the program’s designated memory contents, but also
• (implicitly) which statement is currently processed.

The last item is stored in an invisible variable, the program counter (PC). The PC always points to the currently processed statement. Imagine pointing with your finger to one source code line (or, more precisely, statement): “Here we are!” After a statement has successfully been executed, the PC advances so that it points to the next statement.[fn 1] The PC cannot be altered directly, but only implicitly. In this chapter we will learn how.

### Classification

Statements can be categorized into two groups: Elementary and complex statements. Elementary statements are the minimal building blocks of high-level programming languages. In Pascal they are:[fn 2][fn 3]

• Assignments (:=), and
• Routine[fn 4] invocations (such as readLn(x) and writeLn('Hi!')).

“Complex” statements are:

• Sequences (surrounded by begin and end),
• branches, and
• loops.

### Semicolon

Unlike many other programming languages, in Pascal the semicolon ; separates two statements. Lots of programming languages use some symbol to terminate a statement, e. g. the semicolon. Pascal, however, recognized that an extra symbol should not be part of a statement in order to make it an actual statement. The helloWorld program from the second chapter could be written without a semicolon after the writeLn(…), because there is no following statement:

program helloWorld(output);
begin
    writeLn('Hello world!')
end.


We, however, recommend that you put a semicolon there anyway, even though it is not required. Later in this chapter you will learn about one place where you most probably (that means not necessarily always) do not want to put a semicolon.

Although a semicolon does not terminate a statement, the program header, constant definitions, variable declarations and some other language constructs are terminated by this symbol. You cannot omit a semicolon at these locations.
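The separator role of the semicolon can be sketched with a two-statement program. The name semicolonDemo is made up for this example.

```pascal
program semicolonDemo(output);
begin
    writeLn('first');    { this semicolon separates the two statements }
    writeLn('second')    { nothing follows, so no semicolon is required }
end.
```

The semicolon after the program header in the first line, on the other hand, cannot be left out.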

## Expressions

Expressions, in contrast to statements, do not change the program state. They are transient values that can be used as part of statements. Examples of expressions are:

• 42,
• 'D’oh!', or
• x (where x is the name of a previously declared variable).

Every expression has a type: When an expression is evaluated it results in a value of a certain data type. The expression 42 has the data type integer, 'D’oh!' is a “string type” and the expression merely consisting of a variable’s name, such as x, evaluates to the data type of that variable. Because the data type of an expression is so important, expressions are named after their type. The expression true is a Boolean expression, as is false.

### Using expressions

Expressions appear at many places:

• In the assignment statement (:=) you write an expression on the RHS. This expression has to have the data type of the variable on the LHS.[fn 5] An assignment makes the transient value of an expression “permanent” by storing it into the variable’s memory block.
• The parameter lists of routine invocations consist of expressions. In order to invoke a routine all the parameters have to be stored in memory. Think of a routine invocation as a sequence of assignments to invisible variables before the routine is actually called. Thus writeLn(output, 'Hi!') can be understood as
1. destination becomes output
2. “first parameter” becomes 'Hi!'
3. call the routine writeLn with the invisible “variables” destination and “first parameter”
For the first two pseudo-assignments the value/expression on the RHS had to be assignment-compatible with the LHS.
• In a constant definition the RHS is also an expression, although – hence their name – it has to be constant. You could not use, for instance, a variable as part of that expression.

The power of expressions lies in their capability to link with other expressions. This is done by using special symbols called operators. In the previous chapter we already saw one operator, the equals operator =. Now we can break up such an expression:

     response         =           'y'       {
│ └──────┰─────┘      ┃     └──────┰──────┘ │
│ sub-expression  operator  sub-expression  │
│                                           │
└─────────────────────┰─────────────────────┘
                 expression                 }


As you can see in the diagram, an expression can be part of a larger expression. The sub-expressions are linked using the operator symbol =. Sub-expressions that are linked via, or associated with, an operator symbol are also called operands.

### Comparisons

Linking expressions via an operator “creates” a new expression which has a data type of its own. While response and 'y' in the example above were both char-expressions, the overall data type of the whole expression is Boolean, because the linking operator is the equal comparison. An equal comparison yields a Boolean expression. Here is a table of relational operators which we can already use with our knowledge:

| name | source code symbol |
|---|---|
| =, equals | = |
| ≠, unequal | <> |
| <, less than | < |
| >, greater than | > |
| ≤, less than or equal to | <= |
| ≥, greater than or equal to | >= |

relational operators (excerpt)

Using these symbols yields Boolean expressions. The value of the expression will be either true or false depending on the operator’s definition.

All those relational operators require operands on both sides to be of the same data type.[fn 6] Although we can say '$' = 1337 is wrong, that means it should evaluate to the value false, it is nevertheless illegal, because '$' is a char‑expression and 1337 is an integer‑expression. Pascal forbids you to compare things/objects that differ in their data type. So, I guess, y’can’t compare apples ’n’ oranges after all. (Note, a few conversion routines will allow you to do some comparisons that are not allowed directly, but by taking a detour. In the next chapter we will see some of them.)

 The imposed data type restrictions are installed for good cause. You might get the impression Pascal was being fussy about all those data types, but this so-called strong type safety is actually an advantage. It prevents you, the programmer, from inadvertent programming mistakes.
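As a small sketch, the following program prints the value of a few comparisons. The program name compareDemo is invented for this illustration. How exactly a Boolean value is spelled in the output (for example in capital letters) depends on your compiler, so the comments only state the value.

```pascal
program compareDemo(output);
var
    response: char;
begin
    response := 'n';
    writeLn(response = 'y');     { false }
    writeLn(response <> 'y');    { true }
    writeLn(3 < 5);              { true }
    writeLn(5 <= 5);             { true }
    writeLn(3 >= 5);             { false }
    { writeLn('$' = 1337) would be rejected: char versus integer }
end.
```

In every accepted comparison both operands have the same data type: two char-expressions or two integer-expressions.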

### Calculations

Expressions are also used for calculations. The machine you are using is not called “computer” for no reason. In Standard Pascal you can add, subtract, multiply and divide two numbers, i. e. integer‑ and real‑expressions and any combination thereof. The symbols that work for all combinations are:

| name | source code symbol |
|---|---|
| +, plus | + |
| −, minus | - |
| ×, times | * |

arithmetic operators (excerpt)

The division operation has been omitted as it is tricky, and will be explained in a following chapter.

 If at least one of the operands is a real‑expression, the entire expression is of type real, even if the exact value could be represented by an integer.

Note, unlike in mathematics, there is no invisible times assumed between two “operands”: You always need to write the “times”, meaning the asterisk * explicitly.

The operator symbols + and - can also appear with one number expression only. It then indicates the positive or negative sign, or – more formally – sign identity or sign inversion respectively.
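The three operators and the sign inversion can be tried out with a sketch like the following. The program name calcDemo and the variable names are made up for this example. The layout in which the real value is printed depends on your compiler.

```pascal
program calcDemo(output);
var
    a, b: integer;
    r: real;
begin
    a := 7;
    b := 3;
    writeLn(a + b:1);    { prints 10 }
    writeLn(a - b:1);    { prints 4 }
    writeLn(a * b:1);    { prints 21 }
    writeLn(-a:1);       { prints -7: sign inversion }
    r := a * 1.5;        { integer times real is a real-expression }
    writeLn(r);          { the value 10.5 in floating-point notation }
end.
```

The format specifier :1 applies to the whole expression in front of it, so a + b:1 prints the sum with a minimum width of one character.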

### Operator precedence

Just like in mathematics, operators have a certain “force” associated with them, in CS we call this operator precedence. You may recall from your primary or secondary education, school or homeschooling, the acronym PEMDAS: It is a mnemonic standing for the initial letters of

1. parentheses
2. exponents
3. multiplication / division
4. addition / subtraction

giving us the correct order to evaluate an arithmetic expression in mathematics. Luckily, Pascal’s operator precedence is just the same, although – to be fair – technically not defined by the word “PEMDAS”.[fn 7]

 Standard Pascal does not define any exponent / power operator, thus e. g. $x^{2}$ has to be written as x * x and cannot be abbreviated any further. The Extended Pascal standard, however, does define the pow operator.

As you might have guessed, operator precedence can be overridden on a per-expression basis by using parentheses: In order to evaluate 5 * (x + 7), the sub-expression x + 7 is evaluated first and that value is then multiplied by 5, even though multiplication is generally evaluated prior to sums or differences.
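A quick sketch of the difference parentheses make, assuming x has the value 3. The program name precedenceDemo is made up for this example.

```pascal
program precedenceDemo(output);
const
    x = 3;
begin
    writeLn(5 * x + 7:1);      { prints 22: 5 * 3 is evaluated first, then 7 is added }
    writeLn(5 * (x + 7):1);    { prints 50: 3 + 7 is evaluated first, then times 5 }
end.
```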

## Branches

Branches are complex statements. Up to this point all programs we wrote were linear: They started at the top and the computer (ideally) executed them line-by-line until the final end.. Branches allow you to choose alternative paths, like at a T‑bone intersection: “Do I turn left or do I turn right?” The general tendency to process the program “downward” remains, but there is (in principle) a choice.

### Conditional statement

Let’s review the program iceCream from the previous chapter and take a closer look at its conditional statement, the part starting with if:

program iceCream(input, output);
var
    response: char;
begin
    writeLn('Do you like ice cream?');
    writeLn('Type “y” for “yes” (“Yummy!”) and “n” for “no”.');
    writeLn('Confirm your selection by hitting Enter.');
    readLn(input, response);
    if response = 'y' then
    begin
        writeLn('Awesome!');
    end;
end.


Now we can say that response = 'y' is a Boolean expression. The words if and then are part of the language construct we call conditional statement. After then comes a statement, in this case a complex statement: begin … end is a sequence and considered to be one statement.

As you may remember or can infer from the source code, the statement between begin and end, the writeLn('Awesome!'), is only executed if the expression response = 'y' evaluates to true. Otherwise, it is skipped as if it were not there.

Due to this binary nature – yes / no, execute the code or skip it – the expression between if and then has to be a Boolean expression. You cannot write if 1 * 1 then …, since 1 * 1 is an integer-expression. The computer cannot decide based on an integer-expression, whether it shall take a route or not.
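Any Boolean expression can serve as the condition, not only an equal comparison. Here is a sketch using the less-than operator. The program name conditionDemo and the variable n are invented for this illustration.

```pascal
program conditionDemo(input, output);
var
    n: integer;
begin
    writeLn('Enter an integer and press Enter.');
    readLn(n);
    if n < 0 then
    begin
        writeLn('That is a negative number.');
    end;
    writeLn('Done.');    { executed in any case }
end.
```

The last writeLn is not part of the conditional statement, so it is executed regardless of the value of n.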

### Alternative statement

Let’s expand the program iceCream by giving an alternative response if the user says they do not like ice cream. We could do this with another if‑statement, yet there is a smarter solution for this frequently occurring situation:

    if response = 'y' then
    begin
        writeLn('Awesome!');
    end
    else
    begin
        writeLn('That’s a pity!');
    end;


The alternative, the else‑branch, will only be executed if the supplied Boolean expression evaluates to false. In either case, regardless of whether the then‑branch or the else‑branch was taken, program execution resumes after the else‑statement (in this case after the end; in the last line).

 There is no semicolon after the end preceding the else. A semicolon separates statements, but the entire construct if … then … else … is one (complex) statement. You are not allowed to divide parts of a statement by a semicolon. Note, there are situations though, where you put a semicolon at this position anyway. We will elaborate on that in detail in one of the advanced chapters.

### Relevance

Branches and (soon explained) loops are the only method of modifying the PC, “your finger” pointing to the currently executed statement, based on data, an expression, and thus a means of responding to user input. Without them, your programs would be static and do the same over and over again, so pretty boring. Utilizing branches and loops will make your program way more responsive to the given input.

Which relational operators can you use to compare char-expressions? All? None?
You can use all relational operators presented in this chapter. Pascal, the ISO standard 7185, defines that the characters '0' through '9', 'A' through 'Z', and 'a' through 'z' are sorted in the order you are familiar with from the English alphabet, or – with respect to the digits – by their numeral value in ascending order. Because of that you are allowed to make a comparison such as 'A' <= 'F' (which will evaluate to true).


Notes:

1. This paragraph intentionally uses imprecise terminology to keep things simple. The PC is in fact a processor register (e. g. %eip, extended instruction pointer) and points to the following instruction (not current statement). See Subject: Assembly languages for more details.
2. Jumps (goto) have been deliberately banished to the appendix, and are not covered here, yet goto is also an elementary statement.
3. Exception extensions also define raise as an elementary statement.
4. More correctly: procedure calls.
5. Or, a “compatible” data type, e. g. an integer expression can be stored into a variable of the data type real, but not the other way round. As we progress we will learn more about “compatible” types.
6. In the chapter on sets we will expand this statement.
7. To read the technical definition, see § Expressions, subsection 1 “General” in the ISO standard 7185.


# Routines

In the opening chapter routines were already mentioned. Routines are, as it was described before, reusable pieces of code that can be used over and over again. Examples of routines are read / readLn and write / writeLn. You can invoke, call, these routines as many times as you want. In this chapter you will learn

• how to define your own routines,
• the difference between a definition and declaration, and
• the difference between functions and procedures.

## Different routines for different occasions

Routines come in two flavors. In Pascal, routines can either replace statements, or they replace a (sub‑)expression. A routine that can be used where statements are allowed is called a procedure. A routine that is called as part of an expression is a function.

### Functions

A function is a routine that returns a value. Pascal defines, among others, a function odd. The function odd takes one integer-expression as a parameter and returns false or true, depending on the parity of the supplied parameter (in layman’s terms that means whether it is divisible by 2). Let’s see the function odd in action:

program functionDemo(input, output);
var
    x: integer;
begin
    write('Enter an integer: ');
    readLn(x);
    if odd(x) then
    begin
        writeLn('Now this is an odd number.');
    end
    else
    begin
        writeLn('Boring!');
    end;
end.


Odd(x) is pronounced “odd of x”. First, the expression in parentheses is evaluated. Here it is simply x, the variable’s value to be precise, but a more complex expression is allowed too, as long as it eventually evaluates to an integer-expression. The value of this expression, the actual parameter, is then handed to a (in this case invisible) block of code that processes the input, performs some calculations on it, and returns false or true according to the calculation’s findings. The function’s returned value is ultimately filled in in place of the function call. You can, in your mind, read false / true in place of odd(x), although this is dynamic depending on the given input.

 You are only allowed to call functions where you can put an expression. The following program is wrong:

program lostFunction;
begin
    odd(42);
end.

Calling a function results in an expression, in this particular case (the value) false. But false is not a statement. You can only put statements between begin and end, no expressions.[fn 1]
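For contrast, here is a sketch in which odd appears only at places where an expression is allowed: as the condition of an if‑statement and as a parameter of writeLn. The program name foundFunction is made up for this example, and the spelling of the printed Boolean value depends on your compiler.

```pascal
program foundFunction(output);
begin
    if odd(42) then
    begin
        writeLn('42 is odd.');    { never printed: odd(42) is false }
    end;
    writeLn(odd(7));              { prints the Boolean value true }
end.
```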

### Procedures

Procedures on the other hand cannot be used as part of an expression. You can only call procedures where statements are allowed.

 A routine can either be a function or a procedure. In some programming languages the routine used to retrieve data from the console can be used like a function, but this is not the case in Pascal. The following program will not compile:

program strayProcedure(input, output);
begin
    if readLn(input) = '' then
    begin
        writeLn('Error: No input supplied.');
    end;
end.

ReadLn refers to a procedure, thus it does not return anything, yet at this specific position a value has to be inserted so the if‑branch language construct and the equal comparison make sense.

### Effects

A procedure may use functions, and the other way around. Do not understand a function as a mere substitute for an expression. In the following section we will learn why.

### Rationale

The dichotomy of routines, distinguishing between a procedure and a function, is meant to gently push the programmer to write “clean” programs. Doing so, a routine does not conceal whether it is just a replacement for a sequence of statements or shorthand for a complex, difficult to write out expression. This kind of notation works without introducing nasty pseudo types like, for example, void in the C programming language where every routine is a function, but the “invalid” data type void will allow you to make it (in part) behave like a procedure.

## Definition

Defining routines follows a pattern you are already familiar with since your very first program. A program is, in some regards, like a special routine: You can run it as many times as you want through OS-defined means. A program’s definition looks almost just like a routine’s.

A routine is defined by

1. a header, and
2. a block

in that order. The routine header shows a couple of differences depending on whether it is a function or a procedure. We will first take a look at blocks, since these are the same for both types of routines.

### Block

A block is the synthesis of a productive part (statements) and (optional) declarations and definitions. In Standard Pascal (as laid out by the ISO standard 7185) a block has a fixed order:[fn 2]

1. constant definitions (the const-section)
2. type definitions (the type-section)
3. variable declarations (the var-section)
4. routine declarations and definitions
5. sequence (begin … end, possibly empty)

All items but the last one, the productive part, are optional.

 Sections (const, type, or var-section) may not be empty. Once you specify a section heading, you have to define/declare at least one symbol in the just started section.

In Extended Pascal (EP), the fixed order restriction has been lifted. There, sections and routine declarations and definitions may occur as many times as needed and do not necessarily have to adhere to a particular order. The consequences are detailed in the chapter “Scopes”. For the remainder of this book we will refer to EP’s definition of block, because all major compilers support this. Nevertheless, the order defined by Standard Pascal is a good guideline: It makes sense to define types before there is a section that may use those types (i. e. the var-section).
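The Standard Pascal order of a block can be sketched with a program whose block contains every kind of item once. All names (blockDemo, width, count, n, show) are invented for this illustration, and the type definition merely gives the data type integer a second name.

```pascal
program blockDemo(output);
const
    width = 8;                      { 1. constant definitions }
type
    count = integer;                { 2. type definitions }
var
    n: count;                       { 3. variable declarations }

procedure show(value: count);       { 4. routine declarations and definitions }
begin
    writeLn(value:width);
end;

begin                               { 5. sequence }
    n := 3;
    show(n);
end.
```

The procedure show has a block of its own, which in this case consists of a sequence only.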

### Header

A routine header consists of

1. the word function or procedure,
2. an identifier identifying this routine,
3. possibly a parameter list, and,
4. lastly, in the case of functions, the data type of an expression a call to this function results in, the result data type.

The parameter list for routines also defines the data type of every single parameter. Thus, the header of the function odd could look like this:

function odd(x: integer): Boolean;


Take notice of the colon (:) after the parameter list separating the function’s result data type. You can view a function as a sort of special variable declaration, which also separates an identifier and a data type with a colon, except in the case of a function the “variable’s” value is computed dynamically.

Formal parameters, i. e. parameters in the context of a routine header, are separated by a semicolon. Consider the following procedure header:

procedure printAligned(x: integer; tabstop: integer);


Note that every routine header is terminated with a semicolon.

### Body

While the routine header tells the processor (usually a compiler), “Hey, there’s a routine with the following properties: […]”, it is not enough. You have to “flesh out”, give the routine a body. This is done in the subsequent block.

Inside the block all parameters can be read as if they were variables.
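To illustrate, here is one possible body for the procedure header printAligned shown earlier. The body is an assumption made for this sketch (the header alone does not say what the procedure does), as is the program name bodyDemo.

```pascal
program bodyDemo(output);

procedure printAligned(x: integer; tabstop: integer);
begin
    { both parameters are read like variables }
    writeLn(x:tabstop);
end;

begin
    printAligned(42, 10);    { prints 42 right-aligned in ten columns }
    printAligned(7, 10);     { prints 7 right-aligned in ten columns }
end.
```

Each call supplies two integer-expressions, one for every formal parameter and in the same order.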

### Function result

In the sequence of the block defining a function there is automatically a variable bearing the function’s name. You have to assign a value to it at least once before the function ends, so the function, mathematically speaking, becomes defined. Consider this example:

function getRandomNumber: integer;
begin
    { chosen by fair dice roll, }
    { guaranteed to be random }
    getRandomNumber := 4;
end;


Note that the block did not contain a var-section declaring the variable getRandomNumber, but it is already implicitly declared by the function’s header: Both the name and the data type are part of the function header.
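A function with a parameter works the same way. The following sketch defines a function square, a name chosen for this example, that computes x * x, and uses it as part of expressions.

```pascal
program squareDemo(output);

function square(x: integer): integer;
begin
    { assigning to the function's name defines the result }
    square := x * x;
end;

begin
    writeLn(square(5):1);        { prints 25 }
    writeLn(square(2) + 1:1);    { prints 5: the call is a sub-expression }
end.
```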

## Declaration

A routine declaration happens most of the time implicitly. Declaring a routine, or in general any identifier, refers to the process of giving the processor (i. e. usually a compiler) information in order to correctly interpret your program source code. This information is not directly encoded in your executable program, but it is implicitly there. Examples are:

• A variable declaration tells the processor to install proper provisions in order to reserve some memory space. This chunk of memory will be interpreted according to its associated data type. However, neither the variable’s name, nor the data type are in any way stored in your program. Only the processor knows about this information as it is reading your source code file.
• A routine header constitutes a routine declaration (which is usually directly followed by its definition[fn 3]). Here again, the information given in a routine header is not stored directly in the executable file, but it ensures the processor (the compiler) will correctly transform your source code.
• Likewise, type declarations merely serve the purpose of clean and abstract programming, but those declarations do not end up in the executable program file.[fn 4]

Declarations make an identifier known to denote a certain object (“object” mathematically speaking). Definitions, on the other hand, will, hence their name, define what exactly this object is. Whether it is the value of a constant, the value of a variable, or the steps taken in a routine (the statement sequence), data defined through definitions will result in specific code in your executable file. This code may vary according to the information given in related declarations: Writing a variable possessing the data type integer is fundamentally different from writing a value of the type real. The code for properly storing, calculating and retrieving integer and real values differs, but the computer is not aware of that. It just performs the given instructions. The circumstance that a certain set of instructions resembles operations on Pascal’s data type real, for instance, is, so to speak, a “coincidence”.

## Calling routines

### Routing

Routines are selected based on their signature. A routine signature consists of

1. the routine’s name,
2. the data types of all arguments, and
3. (implicitly) their correct order.

Thus the signature of the function odd reads odd(integer). The function named odd accepts one integer value as the first (and only) argument.

 In some other programming languages the data type of the returned value also belongs to a routine’s signature. Remember differing definitions of the term signature should you ever switch between programming languages.

Pascal allows you to declare and define routines of the same name, but differing formal parameters. This is usually called overloading. When calling a routine there must be exactly one routine of that name that accepts parameters with their corresponding data types.
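A minimal sketch of overloading, assuming a compiler that supports it, such as the FPC in its default mode. The routine name show is hypothetical. The two definitions have the signatures show(integer) and show(Boolean), so every call matches exactly one of them.

```pascal
program overloadDemo(output);

{ signature: show(integer) }
procedure show(x: integer);
begin
	writeLn('integer: ', x);
end;

{ signature: show(Boolean) }
procedure show(x: Boolean);
begin
	writeLn('Boolean: ', x);
end;

begin
	show(42);   { selects show(integer) }
	show(true); { selects show(Boolean) }
end.
```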

## Pre-defined routines

Pascal’s pre-defined functions (excerpt)

| signature | description | returned value’s type |
|---|---|---|
| abs(integer) | absolute value of argument | integer |
| odd(integer) | parity (true if the given value is not divisible by two) | Boolean |
| sqr(integer) | the value squared | integer |
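As a quick illustration, the three functions listed above can be tried out like this. The program name is arbitrary, and the exact way a Boolean value is printed may differ between compilers.

```pascal
program predefinedDemo(output);
begin
	writeLn(abs(-5)); { absolute value: 5 }
	writeLn(odd(5));  { 5 is not divisible by two: the value true }
	writeLn(sqr(5));  { 5 squared: 25 }
end.
```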

## Persistent variables

Some compilers, such as the FPC, allow you to use constants as if they were variables, but with a different lifetime. In the following example the “constant” numberOfInvocations exists for the entire duration of program execution, but is only accessible in the scope it was declared in.

program persistentVariableDemo(output);
{$ifDef FPC}
	// allow assignments to _typed_ “constants”
	{$writeableConst on}
{$endIf}

procedure foo;
const
	numberOfInvocations: integer = 0;
begin
	numberOfInvocations := numberOfInvocations + 1;
	writeLn(numberOfInvocations);
end;

begin
	foo;
	foo;
	foo;
end.

The program will print 1, 2, and 3, one number per call. Lines 2, 4, and 5 contain specially crafted comments that instruct the compiler to support persistent variables. These comments are non-standard, yet some are explained in the appendix, chapter “Preprocessor Functionality”.

Note, the concept of typed “constants” is not standardized. Some object-oriented programming extensions will give nicer tools to implement such behavior as demonstrated above. We primarily explained the concept of persistent variables to you, so you can read and understand source code by other people.

## Benefit

Routines can be used as many times as you want. They are no tools of mere “text substitution”: The definition of a routine is not “copied” to the place where it is called, the call site. The size of the executable program file remains about the same.

Utilizing routines can also be and usually is beneficial to the development progress of a program. By splitting up a programming project into smaller understandable problems you can focus on solving isolated issues as part of the big task. This approach is known as divide and conquer. We now ask you to slowly shift toward thinking more about your programming tasks before you start typing anything. You may need to spend more time on thinking about, for example, how to structure a routine’s parameter list. What information, what parameters, does this routine require? Where and how can a recurring pattern be generalized through a routine definition? Identifying such questions needs time and expertise, so do not be discouraged if you are not seeing everything the task’s sample answers show. You will learn through your mistakes.

Keep in mind, though, routines are no panacea. There are situations, very specific situations, where you do not want to use routines.
Recognizing those, however, is outside this book’s scope. For the sake of this textbook, and in 99% of all your programming projects, you want to use routines if possible. Modern compilers can even recognize some situations where a routine was “unnecessary”, yet the only gain is that your source code becomes more structured and thus readable, albeit at the expense of being more abstract and therefore complex.[fn 5]

## Tasks

Write a (now infamous) program that writes the word “Mississippi” in large (spanning at least three lines) capital letters.

It should become apparent that writing four routines, printM, printI, printS, printP, will significantly speed up development. An acceptable answer could look like this:

program mississippi(output);

const
	width = 8;

procedure printI;
begin
	writeLn('# ':width);
	writeLn('# ':width);
	writeLn('# ':width);
	writeLn('# ':width);
	writeLn('# ':width);
	writeLn;
end;

procedure printM;
begin
	writeLn('#    #':width);
	writeLn('##  ##':width);
	writeLn('# ## #':width);
	writeLn('#    #':width);
	writeLn('#    #':width);
	writeLn;
end;

procedure printP;
begin
	writeLn('### ':width);
	writeLn('# # ':width);
	writeLn('### ':width);
	writeLn('#   ':width);
	writeLn('#   ':width);
	writeLn;
end;

procedure printS;
begin
	writeLn(' ### ':width);
	writeLn(' # # ':width);
	writeLn(' ##  ':width);
	writeLn('# #  ':width);
	writeLn(' ### ':width);
	writeLn;
end;

begin
	printM;
	printI;
	printS;
	printS;
	printI;
	printS;
	printS;
	printI;
	printP;
	printP;
	printI;
end.
Notes:

1. Some dialects of Pascal are not so strict about that: The FPC has the option {$extendedSyntax on} which will allow the program above to compile anyway.
2. The label-section has intentionally been omitted.
3. The Extended Pascal standard allows so-called “forward declarations” [remote directive]. A forward declaration of a routine is just the declaration, no definition.
4. Some compilers support the generation of non-standardized “run-time type information” (RTTI). By enabling RTTI, type declarations do produce data that is stored in your program.
5. One such compiler optimization is called inlining. This will effectively copy a routine definition to the call site. Pure functions even stand to benefit by being defined as isolated functions, provided the compiler does support appropriate optimizations.


# Enumerations

One powerful notational as well as syntactical tool of Pascal is the declaration of custom enumeration data types.

## Handling

### Notion

An enumeration data type is a finite list of named discrete values. Enumerations virtually give names to individual integer values. However, you cannot (directly) perform arithmetic operations on them.

### Declaration

An enumeration data type is declared by following the data type identifier with an equal sign and a parenthesized, non-empty comma-separated list of (new, not previously used) identifiers.

type
weekday = (Monday, Tuesday, Wednesday, Thursday, Friday,
Saturday, Sunday);


The individual list items refer to specific values the data type may assume. The data type identifier identifies the data type as a whole.

### Operations

Once an enumeration data type has been declared, you can use it like any other data type:

var
startOfWeek: weekday;
begin
startOfWeek := Sunday;
end.


The variable startOfWeek is restricted to assume only legal values of the data type weekday. Note that Sunday is not enclosed by typewriter quotation marks (') which usually indicate a string literal. The identifier Sunday indicates a value in its own right.

### Ordinal values

#### Automatism

Every enumeration data type declaration implicitly defines an order. The comma-separated list is by definition a sorted list. The built‑in function ord, short for ordinal value, lets you obtain the ordinal value of an enumeration element, that is an integer value unique to that enumeration member.

The first element of an enumeration is numbered as 0. The second, if applicable, has the number 1, and so forth.
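A short sketch of the automatic numbering, using the weekday type declared earlier. The program name is arbitrary.

```pascal
program ordDemo(output);
type
	weekday = (Monday, Tuesday, Wednesday, Thursday, Friday,
		Saturday, Sunday);
begin
	writeLn(ord(Monday));  { first element: 0 }
	writeLn(ord(Tuesday)); { second element: 1 }
	writeLn(ord(Sunday));  { seventh element: 6 }
end.
```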

#### Override

Some compilers, such as the FPC, allow you to specify explicit indexes for some, or even all elements of an enumeration:

type
month = (January := 1, February, March, April, May, June,
July, August, September, October, November, December);


Here, January will have the ordinal value 1. And all following items have an ordinal value greater than 1. The automatic assignment of numbers still ensures every enumeration member has a unique number among the entire enumeration data type. February will have the ordinal value 2, March the value 3, and so on. The value 0, however, is not assigned to any element of that enumeration.

 If you specify explicit indices for particular elements, you need to ensure all numbers are in ascending order. You cannot assign the same number twice. If you skip giving numbers, the automatic numbering system will assign and “reserve” numbers. You cannot use numbers the automatism uses.
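The effect of an explicit index can be checked with ord. This sketch assumes the FPC in its default mode, since the := syntax inside an enumeration declaration is a non-standard extension; the program name is arbitrary.

```pascal
program monthOrdDemo(output);
type
	{ explicit index for the first element, automatic numbering afterwards }
	month = (January := 1, February, March, April, May, June,
		July, August, September, October, November, December);
begin
	writeLn(ord(January));  { 1, as specified explicitly }
	writeLn(ord(February)); { 2, assigned automatically }
	writeLn(ord(December)); { 12 }
end.
```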

Specifying explicit indices is a non-standard extension. In FPC’s {$mode Delphi} you need to use a plain equal sign (=) instead of :=. This is also referred to as “C‑style enumeration declaration”, since the programming language C uses that syntax.

#### Inverse

Pascal does not provide a generic function that lets you determine the enumeration element based on a number. There is no function returning January, for instance, if it is supplied with the integer value 1.[fn 1]

### Neighbors

The standard functions pred and succ, short for predecessor and successor respectively, are automatically defined for every enumeration data type. These functions return the previous or next value of an enumeration value. For example succ(January) will return February, as it is the successor of the value January.

However, pred(January) will fail as there is technically no member prior to January. An enumeration list is not cyclical. Although in real life January follows December, the enumeration data type month does not “know” that.

The EP standard allows a second optional integer parameter to be supplied to either pred or succ. succ(January, 2) is identical to succ(succ(January)), yet more convenient and shorter. pred(January, -2) returns the same value, too.

Utilizing this functionality you can obtain an enumeration value given its index. succ(Monday, 3) evaluates to the weekday value that has the ordinal value 3, thus virtually providing a means for an inverse ord function. However, it is necessary to know the first element of the enumeration, and the enumeration may not use any explicit indices in its declaration (unless all indices coincide with the automatic numbering pattern).

### Operators

Enumeration data type values are automatically eligible to be used with several operators. Since every enumeration value has an ordinal value, they can be ordered and you can test for that.
The relational operators

• <
• >
• <=
• >=
• =
• <>

work in conjunction with enumeration values. For example, January < February will evaluate to true, because January has a smaller ordinal value than February.

Although technically you can compare apples and oranges (spoiler alert: they are unequal), all relational operators only work in conjunction with two values of the same kind. In Pascal, you cannot compare a weekday value with a month value. Nonetheless, something like ord(August) > ord(Monday) is legal, since you are then in fact comparing integer values.

Note, arithmetic operators (+, -, and so on) do not work with enumeration data types, despite their ordinal values.

## Boolean as an enumeration data type

### Definition

The data type Boolean is a built‑in special enumeration data type. It is guaranteed that

• ord(false) = 0,
• ord(true) = 1, and, in consequence,
• pred(true) = false.

### Logical operators

Boolean is the only enumeration data type on which operations can be performed directly using logical operators.

#### Negation

The most basic operator is the negation. It is a unary operator, which means it expects only one operand. In Pascal it uses the keyword not. By preceding a Boolean expression with not (and some separator such as a space character), the expression is negated.

| expression | result |
|---|---|
| not true | false |
| not false | true |

#### Conjunction

While this may be pretty straightforward, the so-called logical conjunction, indicated by and, might not be. The truth table for it looks like this:

| value of tired | value of intoxicated | result of tired and intoxicated |
|---|---|---|
| false | false | false |
| false | true | false |
| true | false | false |
| true | true | true |

conjunction truth table

In EE this is frequently written as $\cdot$ (“times”) or even omitted, because (like in mathematics) an invisible “times” is assumed. Given that the ordinal values of false and true are as defined above, you could calculate the and result by multiplying them.
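The multiplication analogy can be verified with a small sketch. Each line compares the product of the two ordinal values with the ordinal value of the and expression; the program name is arbitrary.

```pascal
program conjunctionDemo(output);
begin
	{ all four comparisons yield the value true }
	writeLn(ord(false) * ord(false) = ord(false and false));
	writeLn(ord(false) * ord(true) = ord(false and true));
	writeLn(ord(true) * ord(false) = ord(true and false));
	writeLn(ord(true) * ord(true) = ord(true and true));
end.
```

This is meant as an illustration of the analogy only. In actual programs you would simply write and.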
#### Disjunction

A little more confusing, because it may be contradictory to someone’s natural language, is the word or. If either operand is true, the overall expression’s result becomes true.

| value of raining | value of snowing | result of raining or snowing |
|---|---|---|
| false | false | false |
| false | true | true |
| true | false | true |
| true | true | true |

disjunction truth table

Electrical engineers frequently use the $+$ symbol to denote this operation. With respect to Boolean’s ordinal value, though, you must “define” that $1+1$ is still $1$.

#### Precedence

Like the usual rule in mathematics “multiplication and division first, then addition and subtraction”, a conjunction is evaluated before a disjunction is. However, since the negation is a unary operator, it is evaluated first in any case. That means you must be really careful not to forget placing parentheses. The expression

not hungry or thirsty

is fundamentally different from

not (hungry or thirsty)

## Ranges

### Ordinal types

Enumeration data types belong to the category of ordinal data types. The ordinal data types are:

• integer,
• char,
• and all enumeration data types, including Boolean.

They all have in common that each of their values can be mapped to a distinct integer value. The ord function lets you retrieve that value.

### Intervals

Sometimes, it makes sense to restrict a set of values to a certain range. For instance, the hours on a military time clock may show values from 0 up to and including 23. Yet the data type integer will permit other values too.

Pascal allows you to declare (sub‑)range data types. A (sub‑)range data type has a host data type, e. g. integer, and two limits, one lower and one upper limit.
A range is specified by giving the limits in ascending order, separated by two periods back-to-back (..):

type
	majuscule = 'A'..'Z';

The limits may be given as any computable expression, as long as it does not depend on run-time data.[fn 2] For example constants (that have already been defined) may be used:

type
	integerNonNegative = 0..maxInt;

Note, we named this range integerNonNegative and not nonNegativeInteger, because this will facilitate alphabetical sorting in some documentation tools or in IDEs.

### Restriction

A variable possessing a (sub‑)range data type may only assume values within the range. If the variable exceeds its legal range, the program aborts. The following error message may appear (memory address at the end can vary):

./a.out: value out of range (error #300 at 402a54)

The corresponding test program has been compiled with GPC. Other compilers may emit different messages.

The default configuration of the FPC, however, ignores this. Assigning out-of-range values to variables will not yield an error (if it depends on run-time data). The developers of the FPC cite compatibility reasons to other compilers, which decided to ignore out-of-range values for speed reasons.[fn 3] You need to specifically request that illegal values cannot be assigned to ordinal type variables. This can be done by placing a specially crafted comment prior to any (crucial) assignments: {$rangeChecks on} (case-insensitive) or {$R+} for short (case-sensitive) will ensure illegal values are not assigned and the program aborts if any attempts are made anyway. Specifying this compiler switch once in your source code file is sufficient. FPC’s ‑Cr command-line switch has the same effect.

 The range restriction has to be met when storing the value, when saving the value to a variable for instance. Intermediate calculations, such as when evaluating an expression, may fall out of range though.
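A minimal sketch of the restriction, with range checking requested explicitly for the FPC as described above. The names hour, h and n as well as the program name are made up for this illustration. If you enter a number outside 0..23, the program is expected to abort at the assignment; the exact error message depends on your compiler.

```pascal
program rangeDemo(input, output);
{$ifDef FPC}
	{ request run-time range checks; FPC ignores violations by default }
	{$rangeChecks on}
{$endIf}
type
	hour = 0..23;
var
	h: hour;
	n: integer;
begin
	write('Enter an hour: ');
	readLn(n);
	{ the range restriction has to be met here, when storing the value }
	h := n;
	writeLn('Stored hour: ', h);
end.
```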
## Selections

With the advent of enumeration data types, it may become cumbersome and tedious to check for values just using if‑branches.

### Explanation

The case selection statement unites multiple exclusive if‑branches in one language construct.[fn 4]

case sign(x) of
	-1:
	begin
		writeLn('You have entered a negative number.');
	end;
	0:
	begin
		writeLn('The number you have entered is sign-less.');
	end;
	1:
	begin
		writeLn('That is a positive number.');
	end;
end;

Between case and of any expression that evaluates to an ordinal value may appear. After that, -1:, 0: and 1: are case labels. These case labels mark the start of alternatives. After a case label follows a statement.

-1, 0 and 1 denote case values. Every case label consists of a non-empty comma-separated list of case values, followed by a colon (:). All case values have to be legal constant values, constant expressions, that are compatible with the comparison expression above, that is what is written between case and of. Every specified case value needs to appear exclusively in one case label. No case value can appear twice. It is not necessary to put them in order according to their ordinal value, although it can make your source code more readable.

### Shorthand for many cases

In EP case labels may contain ranges.

program letterReport(input, output);
var
	c: char;
begin
	write('Give me a letter: ');
	readLn(c);
	case c of
		'A'..'Z':
		begin
			writeLn('Wow! That’s a big letter!');
		end;
		'a'..'z':
		begin
			writeLn('What a nice small letter.');
		end;
	end;
end.

This shorthand notation allows you to catch many cases. The case label 'A'..'Z': includes all upper-case letters, without requiring you to list them all individually.

Take care that no range overlaps with other case label values. This is forbidden. Good processors will complain about such a mistake though.
The GPC yields the error message duplicate case-constant in `case' statement, the FPC reports just duplicate case label[fn 5], both telling you some information about the location in your source code.

### Fall-back

It is important that any (expected) value of the comparison expression matches one case label. If the comparison expression evaluates to a value that no case label contains, the program aborts.[fn 6] If this is not desired, the “Extended Pascal” standard allows a special case label called otherwise (note, without a colon). This case treats all values that have no explicit case label associated with them.

program asciiTest(input, output);
var
	c: char;
begin
	write('Supply any character: ');
	readLn(c);
	case c of
		// empty statement, so the control characters are not
		// considered by the otherwise-branch as non-ASCII characters
		#0..#31, #127:
			;
		#32..#126:
		begin
			writeLn('You entered an ASCII printable character.');
		end;
		otherwise
		begin
			writeLn('You entered a non-ASCII character.');
		end;
	end;
end.

otherwise may only appear at the end. There must be at least one case label beforehand, otherwise (no pun intended) the otherwise case is always taken, rendering the entire case-statement useless.

BP, that is Delphi, re-uses the word else having the same semantics, the same meaning as otherwise. The FPC and GPC support both, although GPC can be instructed to only accept otherwise.

## Tasks

Write a concise function that returns the successor of month, but for December the value January is returned.
Using knowledge you have learned in this unit, a case statement is just perfect:

function successor(start: month): month;
begin
	case start of
		January..November:
		begin
			successor := succ(start);
		end;
		December:
		begin
			successor := January;
		end;
	end;
end;

For the purposes of this exercise (demonstrating that relational operators such as < are automatically defined for enumeration data types) the following is acceptable too:

function successor(start: month): month;
begin
	if start < December then
	begin
		successor := succ(start);
	end
	else
	begin
		successor := January;
	end;
end;

Yet the case-implementation is, mathematically speaking, more precise. In the first implementation, if the parameter is wrong, out of range, the program aborts. The latter implementation with if … then … else will be “wrongly” defined for illegal values too.

Notes:

1. Some compilers, such as the FPC, allow “typecasting”, effectively transforming 1 into January. However, this – typecasting – is not a function. In particular, typecasting does not work properly if the values are out of range (no RTE generation whatsoever).
2. This is an “Extended Pascal” (ISO 10206) extension. In Standard “unextended” Pascal (ISO 7185) only constants are allowed.
3. The Pascal ISO standards do allow this. It is at the compiler programmer’s discretion to ignore such errors. Nonetheless, accompanying documents (manuals, etc.) are meant to point that out.
4. This is an analogy. case-statements are usually not translated into a series of if‑branches.
5. This error message is imprecise. The error message of GPC is more correct. The problem is that a certain value, “case-constant”, appears multiple times.
6. Many compilers do not respect this requirement in their default configuration. The GPC needs to be instructed to be “completely” ISO-compliant (‑‑classic‑pascal, ‑‑extended‑pascal, or just ‑‑case‑value‑checking). In BP, Delphi will just continue, leaving a missing case unnoticed. As of version 3.2.0 the FPC does not regard this requirement at all.

# Sets

This chapter introduces you to a new custom data type. Sets are one of the basic structured data types. When programming you will frequently find that some logic can be modeled with sets. Learning and mastering usage of sets is a key skill, since you will encounter them a lot in Pascal.

## Basics

### Notion

Sets are (possibly empty) aggregations of distinguishable objects. Either a set contains an object, or it does not. An object being part of a set is also referred to as an element of that set.

Let's say we know the objects “apple”, “banana” and “pencil”. The set Fruit ≔ {“apple”, “banana”} contains the objects “apple” and “banana”. “Pencil” is not a member of the set Fruit.

### Digitization

When a computer is supposed to store and process a set, it actually handles a series of Boolean values.[fn 1] Every one of those Boolean values tells us whether a certain element is part of a set.

 A computer does also store a Boolean value for every object that is not part of a certain set.
### Sets in Pascal

The computer needs to know how many Boolean values it needs to set aside. In order to achieve this, a set in Pascal requires an ordinal type as the set’s base type. An ordinal type always has a finite range of permissible discrete values, thus the computer knows beforehand how many Boolean values to reserve, how many elements we can expect a set to contain at most.

In consequence, a valid set type declaration is:

type
	characters = set of char;

A variable of the data type characters can only contain char values. This set cannot contain, for instance, 42, that is an integer value, nor is this information stored in any way.

 A data type declaration for a set of real is illegal, since the set’s base type, the data type real, is not an ordinal data type. Remember, in order to qualify as an ordinal data type there must be a means to assign every legal value an integer value.[fn 2]

Sets are particularly useful in conjunction with enumeration data types, which you just learned about in the previous chapter. Let’s consider an example in Pascal:

program setDemo;
type
	skill = (cooking, cleaning, driving, videogames, eating);
	skills = set of skill;
var
	slob: skills;
begin
	slob := [videogames, eating];
end.

Here, we have declared a variable slob, which represents a set of the skill enumeration data type values. In the penultimate line we populate our set slob with two objects, videogames and eating. The brackets indicate a set literal. [videogames, eating] is a set expression which we are assigning to the slob variable.

The set variable slob contains no other objects. However, the computer still stores five Boolean values, one for every potential member of that set. The number five is the number of elements in skill, the set’s base type. The information that cooking, cleaning and driving are not part of the set slob is stored explicitly (by the proper Boolean value false).
### Inspecting a set

If we want to learn whether a certain object is part of a set, the set operator in yields the corresponding Boolean value the computer uses to store that information.

program setInspection(output);
type
	skill = (cooking, cleaning, driving, videogames, eating);
	skills = set of skill;
var
	slob: skills;
begin
	slob := [videogames, eating];
	if videogames in slob then
	begin
		writeLn('Is she a level 45 dungeon master?');
	end;
end.

The in operator is one of Pascal’s non-commutative operators. This means you cannot swap the operands. On the RHS you always need to write an expression evaluating to a set value, whereas the LHS has to be an expression evaluating to this set’s base type.

Even though we, as humans, can say that 42 in slob is wrong, i. e. false, such a comparison is illegal. By definition, the slob set can only contain skill values.

## Operations

So far, sets probably seemed like a really complicated way of using Boolean values. The true power of sets lies in a number of distinct operations, making sets an easier, and thus better alternative to handling two or more individual (but related) Boolean values directly.

### Combinations

In Pascal, two sets of the same kind, the same data type, can be combined forming a new set of the respective data type. The following operators are available:

| name | mathematical symbol | source code symbol |
|---|---|---|
| union | ∪ | + |
| difference | ∖ | - |
| intersection | ∩ | * |
| symmetric difference | △ | >< |

set operators in Pascal

The symmetric difference operator is only defined in EP.

#### Union

Figure: In a Venn diagram both circles represent a (positive) number of objects belonging to either or both sets. The red area represents the result of a union.

The result of unifying two sets into one is called union. Let’s say, recently our slob has learned how to drive and does that now too. This can be written as:

slob := slob + [driving];

Now, slob contains all objects it previously held, plus all objects from the other set, [driving].
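The effect of the union can be observed with the in operator. The following sketch reuses the skill and skills declarations from above; only the program name is new.

```pascal
program unionDemo(output);
type
	skill = (cooking, cleaning, driving, videogames, eating);
	skills = set of skill;
var
	slob: skills;
begin
	slob := [videogames, eating];
	writeLn(driving in slob);    { not a member yet: the value false }

	slob := slob + [driving];
	writeLn(driving in slob);    { added by the union: the value true }
	writeLn(videogames in slob); { previous members are kept: the value true }
end.
```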
#### Difference

Figure: difference as a Venn diagram: right circle “minus” left circle

Of course sets can be deprived of a set of elements by using the difference operator, in source code written as -.

slob := slob - [];

This removes all objects present in the second set from the first set. Here, the empty set ([]) does not contain any objects, thus removing no objects has virtually no effect on slob.

#### Intersection

Figure: intersection as a Venn diagram: the overlapping area, if any, here colored in red, represents the intersection

Furthermore you can intersect sets. The intersection of two sets is defined as the set of elements both operands contain.

program intersectionDemo;
type
	skill = (cooking, cleaning, driving, videogames, eating);
	skills = set of skill;
var
	slob, blueCollar, common: skills;
begin
	blueCollar := [cooking, cleaning, driving, eating];
	slob := [driving, videogames, eating];
	common := blueCollar * slob;
end.

The set common now (only) contains driving and eating, because those are the objects that are members of both operands, of both given sets.

#### Symmetric difference

Figure: symmetric difference as a Venn diagram

The symmetric difference is the counterpart of the intersection: It is the union of the operands without the elements contained in both sets.

unique := blueCollar >< slob;

Now unique is [cooking, cleaning, videogames], because those are the values from either set, but not both.

### Comparisons

Two sets of the same kind, the same data type, can be compared by looking at each element in both sets.

| name | mathematical symbol | source code symbol |
|---|---|---|
| equality | = | = |
| inequality | ≠ | <> |
| inclusion | ⊆ | <= |
| inclusion | ⊇ | >= |
| element | ∈ | in |

comparison operators for sets in Pascal

All comparison operators, as before, evaluate to a Boolean value.

#### Inclusion

The inclusion of two sets means that all objects one set contains are present in another set. If the expression A <= B evaluates to true, all objects present in the set A are also present in B.
In a Venn diagram you will notice that one circle’s area is completely surrounded by another circle, if not identical to the other circle.

 Expressions about empty sets are always true, despite not having any objects to check for. The expression [] <= someSet always evaluates to true, regardless of someSet’s value (it may even be another empty set).

#### Equality and inequality

The equality of two sets is defined as A <= B and B <= A. All objects contained in the left-hand set are present in the right-hand set and vice versa. In other words, there is not a single object that is present in just one of the sets. The inequality is just the negation thereof.

#### Element of

The in operator is the only set operator that does not act on two sets but on one potential set member candidate and a set. It has been introduced above. With respect to Venn diagrams, though, you can say that the in operator is “like” pointing with your index finger to a point inside a circle, or outside of it.

## Pre-defined set routines

### Cardinality

(After initialization) at any time a set contains a certain number of elements. In mathematics the number of objects being part of a set is called cardinality. The cardinality of a set can be retrieved using the function card, an EP extension.

emptySet := [];
writeLn(card(emptySet));

This will print 0 as there are no elements in an empty set.

Unfortunately, not all compilers implement the card function. The FPC does not have one. The GPC does supply one, though.

### Universe

Originally, Wirth proposed a function all:

 all(T) is the set of all values of type T [1]

For example:

superwoman := all(skill);

The set superwoman would contain all available skill values, cooking, cleaning, driving, videogames, eating.

Unfortunately, this proposal never made it into the ISO standards, nor do the FPC or GPC support that function, or provide an equivalent.
The only alternative is to use an appropriate set constructor (an EP extension): [cooking..eating] is equivalent to all(skill), provided that cooking is the first skill and eating the last skill value (referring to the order these items were listed during the data type declaration of skill).

### Inclusion and exclusion

Not standardized, but convenient, is BP’s definition of the include and exclude procedures. These are shorthand for very frequent set manipulations. The procedures allow you to quickly add or remove one object from one set.

	include(recognizedLetters, 'Z');

is identical to

	recognizedLetters := recognizedLetters + ['Z'];

but you do not need to type out the set name twice, thus reducing the chance of typing mistakes. Likewise,

	exclude(primeSieve, 4);

will do the same as

	primeSieve := primeSieve - [4];

Both the FPC and the GPC support these handy routines, which are in fact in all cases implemented as compiler intrinsics, not actual procedures.

## Intermediate usage

### Set literals

Effectively stating sets is a required skill when handling sets. It is important to understand that sets merely store the information that an object is a member of a set, or not. The set ['A', 'A', 'A'] is identical to ['A']. Specifying 'A' multiple times does not make it “more” part of that set.

Also, it is not necessary to list all members in any particular order. ['X', 'Z', 'Y'] is just as acceptable as ['X', 'Y', 'Z'] is. Mathematically speaking, sets are not ordered. Pascal’s requirement that a set’s base data type has to be an ordinal type is purely a technical requirement. For readability reasons it is usually sensible, though, to list elements in ascending order.

The EP standard gives you a nice short notation for set literals containing a continuous series of values. Instead of writing [7, 8, 9, 11, 12, 13] you can also write ranges like [7..9, 11..13], evaluating to the very same value. Of course, all numbers could also be variables, or expressions in general.
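The two properties described above, that duplicates and listing order are irrelevant and that ranges abbreviate enumerated members, can be checked with a minimal sketch. The identifiers smallNumber, smallNumbers, a and b are made up for this illustration, and the base type is kept small on the assumption that the compiler limits set sizes.

```pascal
program setLiteralDemo(output);
type
	{ hypothetical small base type, chosen for this sketch only }
	smallNumber = 0..15;
	smallNumbers = set of smallNumber;
var
	a, b: smallNumbers;
begin
	{ listing a member several times does not change the set }
	a := [7, 7, 7];
	b := [7];
	writeLn(a = b);

	{ neither the order nor the range notation changes the value }
	a := [13, 11, 12, 9, 8, 7];
	b := [7..9, 11..13];
	writeLn(a = b);
end.
```

Both comparisons evaluate to true. How exactly a Boolean value is spelled in the output depends on the compiler.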
Set literals are always a positive statement which objects are in a set. If you want a set of integer values between 0 and 10 without 3, 5 and 7, but do not want to write this set out entirely (i. e. as [0, 1, 2, 4, 6, 8, 9, 10]), you can either write [0..2, 4, 6, 8..10] or the expression [0..10] - [3, 5, 7]. The latter probably makes it a little easier to grasp which objects are in the final set and which are not.

### Memory restrictions

Although a set of integer is legal and complies with all Pascal standards, many compilers do not support such large sets. Per definition, a set of integer can contain (at most) all values in the range -maxInt..maxInt. That is a lot (try writeLn(maxInt) or read your compiler’s documentation to find out this value). On a 64‑bit platform this value (usually) is 2⁶³ − 1, i. e. 9,223,372,036,854,775,807. As of the year 2020 many computers would quickly run out of main memory if they attempted to hold that many Boolean values.

As a consequence, BP restricts the permissible base types of sets. In BP the ordinal values of the base type’s largest and smallest values have to be in the range 0..255. The value 255 is 2⁸ − 1. As of version 3.2.0, the FPC sets the same limitations.

The GPC allows set definitions beyond 2⁸ elements, although some configuration is required: You need to specify the ‑‑setlimit command-line parameter or a specially crafted comment in your source code:

program largeSetDemo;
{$setLimit=2147483647} // this is 2³¹ − 1
type
// 1073741823 is 2³⁰ − 1
characteristicInteger = -1073741823..1073741823;
integers = set of characteristicInteger;
var
manyManyIntegers: integers;
begin
manyManyIntegers := [];
include(manyManyIntegers, 1073741823);
end.

This will instruct the GPC that a set of characteristicInteger can only store up to this many characteristicInteger values.

## Loops

Now that you have made the acquaintance of enumeration data types and sets, you find yourself dealing with a growing amount of data. Pascal, like many other programming languages, supports a language construct called loops.

### Characteristics

Loops are (possibly empty) sequences of statements that are repeated over and over again, or even never, based on a Boolean value. The sequence of statements is termed loop body. The loop head contains (possibly implicitly) a Boolean expression determining whether the loop body is executed. Every time the loop body is run, an iteration is in progress.

The term loop originates from the circumstance that some early models of computers required programs to be fed (“loaded”) via punched paper tape. If a portion of that paper tape was meant to be processed multiple times, that piece of paper tape was cut, bent and temporarily fixated so it formed a physical loop. Thankfully, advancements in computer technology have made it far more convenient to handle repeating code.

Pascal (and many other programming languages) differentiates between two groups of loops:

• counting loops, presented here, and
• conditional loops, presented in a chapter to come.

Counting loops have in common that, before running the first iteration, it can already be determined how many times the loop body will be executed just by evaluating the loop head.[fn 3] Conditional loops, on the other hand, are based on an abort condition, i. e. a Boolean expression. Except for infinite loops, there is no way to tell in advance how many iterations a conditional loop will have without thoroughly (mathematically) analyzing the loop body and loop head, and possibly even considering circumstances beyond the loop.

### Counting loops

Counting loops do not necessarily count a quantity. They are named after the fact that they employ a variable, a counting variable. This variable of any ordinal data type (de facto) assigns every iteration a number.

A counting loop is introduced by the reserved word for:

program forLoopDemo(output);
var
i: integer;
begin
for i := 1 to 10 do // loop head
begin               // ┐
writeLn(i:3);   // ┝ loop body
end;                // ┘
end.

After for follows a specially crafted assignment to the counting variable.

#### Range of counting variable

1 to 10 (with the auxiliary reserved word to) denotes a range of values the counting variable i will assume while executing the loop body. 1 and 10 are both expressions possessing the counting variable’s data type, which means that variables or more complex expressions could appear there, too, not just constant literals as shown.

This range is like a set. It may possibly be empty: The range 5 to 4 is an empty range, since there are no values from 5 up to and including 4. In consequence, the counting variable will not be assigned any value out of this empty range, as there simply are none available, and the loop body is never executed. Nevertheless, the range 8 to 8 contains exactly one value, i. e. 8.
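The two edge cases just mentioned can be tried out with a small sketch. The program name rangeDemo is an arbitrary choice for this illustration.

```pascal
program rangeDemo(output);
var
	i: integer;
begin
	{ empty range: the loop body is never executed }
	for i := 5 to 4 do
	begin
		writeLn('never printed');
	end;

	{ range with exactly one value: a single iteration with i = 8 }
	for i := 8 to 8 do
	begin
		writeLn(i:1);
	end;
end.
```

This program prints just one line containing 8.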

During the first iteration the corresponding counting variable, here i, will have the first value out of the given range, the start value, in the example above this is the value 1. In the successive iteration the variable i has the value 2, and so forth up to and including the final value of the given range, here 10.

#### Immutability of counting variable

It is not necessary to actually utilize the counting variable inside the loop body, but you may use it as long as you are just obtaining its current value: Inside the loop body of for‑loops it is forbidden to assign any values to the counting variable. Forbidden assignments include, but are not limited to, putting the counting variable on the LHS of :=; read/readLn may not use the counting variable either. Tampering with the counting variable is forbidden, because the loop head will effectively employ succ(i) to obtain the next iteration’s value. The loop head implicitly contains the Boolean expression counting variable = final value. If the counting variable was manipulated this condition might never be met, thus destroying the characteristics of a for‑loop. Preventing the programmer from making any assignments preemptively ensures that such an infinite loop is not created, neither accidentally nor deliberately.
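As a minimal sketch of a loop body that only reads its counting variable, the following program iterates over an enumeration data type. The type skill is borrowed from the set examples in this chapter; the names s and count are made up for this illustration.

```pascal
program enumerationLoopDemo(output);
type
	skill = (cooking, cleaning, driving, videogames, eating);
var
	s: skill;
	count: integer;
begin
	count := 0;
	for s := cooking to eating do
	begin
		{ reading s is fine, assigning a value to s here would be illegal }
		writeLn(ord(s):1);
		count := count + 1;
	end;
	writeLn(count:1, ' values');
end.
```

The program prints the ordinal values 0 through 4, one per line, followed by the line 5 values.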

#### Reverse direction

Pascal also allows for‑loops in a reversed direction using the reserved word downto instead of to:

program downtoDemo(output);
var
c: char;
begin
for c := 'Z' downto 'B' do
begin
write(c, ' ');
end;
writeLn('A');
end.

Here, the range is 'Z' down to and including 'B'. The loop’s terminating condition is still counting variable = final value, but in this case the counting variable c becomes pred(c) (not succ(c)) at the end of each iteration, after the loop body has been executed.

### Loops on collections

EP allows iterating over discrete aggregations, such as sets. This is particularly useful if you have a routine that needs to be applied to every value of an aggregation. Here is an example to demonstrate the principle:

program forInDemo(output);
var
c: char;
vowels: set of char;
begin
vowels := ['A', 'E', 'I', 'O', 'U'];

for c in vowels do
begin
writeLn(c);
end;
end.

Now you see the word in again, but in this context c in vowels is not an expression. The data type restrictions for in are still in effect: On the RHS an aggregation expression is given, whereas the LHS is in this case a variable that has the aggregation’s base data type. In each iteration this variable is assigned another value out of the given aggregation, until every value has been processed.

Since the RHS just needs an expression, not necessarily a variable, you can shorten the example even further to:

    for c in ['A', 'E', 'I', 'O', 'U'] do

Note, unlike the counting loops above, you are not supposed to make any assumptions about the order in which values are assigned to the loop variable. It may be in ascending, descending, or completely mixed up “order”, but the specific order is “implementation defined”, i. e. it depends on the compiler used. The compiler’s accompanying documentation explains in which order the for … in loop is processed.
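A typical task that does not depend on the processing order is counting members. The following sketch counts the members of a set with a for … in loop, which may serve as a stand-in where the card function is not available. It assumes a compiler that supports for … in on sets; the names countMembersDemo and n are made up for this illustration.

```pascal
program countMembersDemo(output);
var
	c: char;
	vowels: set of char;
	n: integer;
begin
	vowels := ['A', 'E', 'I', 'O', 'U'];
	n := 0;

	{ every member is visited exactly once, in whatever order }
	for c in vowels do
	begin
		n := n + 1;
	end;

	writeLn(n:1);
end.
```

Regardless of the order the compiler chooses, the program prints 5.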

What value does the set expression fruit - fruit evaluate to?
This expression will evaluate to an empty set, because the difference operator “removes” all objects the latter set contains from the former. Here, we are referring to the same set twice, so both operands are equal, effectively turning the expression into “remove all objects in this very same set”. Needless to say, writing such an expression is pointless, since we already know its result, but it is allowed anyway.

Sources:

1. Wirth, Niklaus. The Programming Language Pascal (Revised Report ed.). p. 38.

Notes:

1. This is an analogy for explanation. The ISO standards do not set this requirement. Compiler developers are free to choose whatever implementation of the set data type they think is suitable. It would be perfectly OK to just store a list of elements that are present in a set and nothing else. Nevertheless, Delphi, the FPC, and the GPC all store sets as a series of bits (“Boolean values”) as it is explained here.
2. The data type real does not qualify as an ordinal data type. Although it stores a finite subset of ℚ, the set of rational numbers, so there is a map T ↦ ℕ, this depends on the real data type’s precision (T), thus there is no one standard way of defining ord for real values, but many.
3. This statement ignores the goto statement as well as many dialects’ extensions break/leave/continue/cycle, which neither the ISO standard 7185 (Standard Pascal) nor 10206 (Extended Pascal) defines.

# Arrays

An array is a structure concept for custom data types. It groups elements of the same data type. You will use arrays a lot if you are dealing with lots of data of the same data type.

## Lists

### Notion

In general, an array is a limited and arranged aggregation of objects, all of which have the same data type, called base type or component type.[1] An array has at least one discrete, bounded dimension, continuously enumerating all its objects. Each object can be uniquely identified by one or more scalar values, called indices, along those dimensions.

### Declaration

In Pascal an array data type is declared using the reserved word array in combination with the auxiliary reserved word of, followed by the array’s base type:

program arrayDemo(output);
type
dimension = 1..5;
integerList = array[dimension] of integer;

Behind the word array follows a non-empty comma-separated list of dimensions surrounded by brackets.[fn 1] All of an array’s dimensions have to be ordinal data types, yet the base type can be of any kind. If an array has just one dimension, like the one above, we may also call it a list.

### Access

A variable of the data type integerList as declared above, holds five independent integer values. Accessing them follows a specific scheme:

var
powerN: integerList;
begin
powerN[1] := 5;
powerN[2] := 25;
powerN[3] := 125;
powerN[4] := 625;
powerN[5] := 3125;

writeLn(powerN[3]);
end.

This program will print 125, since it is the value of powerN that has the index value 3. Arrays are like a series of “buckets” each holding one of the base data type’s values. Every bucket can be identified by a value according to the dimension specifications. When referring to one of the buckets, we have to name the group, that is the variable name (in this case powerN), and a proper index surrounded by brackets.
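Since an index may be any expression of the dimension’s data type, the individual assignments can also be written as a counting loop. The following is a minimal sketch of that idea; the program name powerLoopDemo and the counting variable i are made up for this illustration.

```pascal
program powerLoopDemo(output);
type
	dimension = 1..5;
	integerList = array[dimension] of integer;
var
	powerN: integerList;
	i: integer;
begin
	powerN[1] := 5;

	{ every further bucket holds five times the previous bucket's value }
	for i := 2 to 5 do
	begin
		powerN[i] := powerN[i - 1] * 5;
	end;

	for i := 1 to 5 do
	begin
		writeLn(powerN[i]:1);
	end;
end.
```

This prints 5, 25, 125, 625 and 3125, one value per line.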

## Character array

Lists of characters frequently have and had special support with respect to I/O and manipulation. This section is primarily about understanding, as in the next chapter we will get to know a more sophisticated data type called string.

### Direct assignment

String literals can be assigned to array[…] of char variables using an assignment statement, thus instead of writing:

program stringAssignmentDemo;
type
fileReferenceDimension = 1..4;
fileReference = array[fileReferenceDimension] of char;
var
currentFileReference: fileReference;
begin
currentFileReference[1] := 'A';
currentFileReference[2] := 'Z';
currentFileReference[3] := '0';
currentFileReference[4] := '9';

You can simply write:

	currentFileReference := 'AZ09';
end.

Note that you do not need to specify any index anymore. You are referring to the entire array variable on the LHS of the assignment symbol (:=). This only works for overwriting the values of the whole array. There are extensions allowing you to overwrite merely parts of a char array, but more on that in the next chapter.

Most implementations of string literal to char array assignments will pad the given string literal with insignificant char values if it is shorter than the variable’s capacity. Padding a string means to fill the remaining positions with other characters in order to meet a certain size. The GPC uses space characters (' '), whereas the FPC uses char values whose ordinal value is zero (#0).

Although not standardized,[fn 2] read/readLn and write/writeLn usually support reading into and writing from array[…] of char variables.

Very early Pascal compilers supported only this kind of human-readable IO:

Code:

program interactiveGreeting(input, output);
type
nameCharacterIndex = 1..20;
name = array[nameCharacterIndex] of char;
var
host, user: name;
begin
host := 'linuxnotebook'; { in lieu of getHostName }

writeLn('Hello! What is your name?');
readLn(user);

write('Hello, ', user, '. ');
writeLn('My name is ', host, '.');
end.

Output:

Hello! What is your name?
Aïssata
Hello, Aïssata. My name is linuxnotebook.
Note, display will vary according to the padding algorithm explained above and longer inputs will be silently clipped to the maximum capacity of user. The second line (Aïssata) is the user input, and the program was compiled with the FPC.

This works because text files, like the standard files input and output, are understood to be infinite sequences of char values.[fn 3] Since our array[…] of char is also a sequence of char values, although a finite one, individual values can be copied pretty much directly to and from text files, not requiring any kind of conversion.

### Primitive comparison

Unlike other arrays, array[…] of char variables can be compared using = and <>.

	if user <> host then
begin
write('Hello, ', user, '. ');
writeLn('My name is ', host, '.');
end
else
begin
writeLn('No. That is my name.');
end;

This kind of comparison only works as expected for identical data types. It is a primitive character-by-character comparison. If either array is longer or shorter, an = comparison will always fail, because not all characters can be compared to each other. Correspondingly, an <> comparison will always succeed for array[…] of char values that differ in length.
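The character-by-character comparison can be observed in a small self-contained sketch. The type name code and the variables a and b are made up for this illustration. The array is declared packed here, because strict ISO 7185 compilers require that for this kind of comparison and for assigning string literals; the compilers discussed in this book are assumed to accept the declaration either way.

```pascal
program compareDemo(output);
type
	{ hypothetical type, same capacity for both operands }
	code = packed array[1..4] of char;
var
	a, b: code;
begin
	a := 'AZ09';
	b := 'AZ09';
	if a = b then
	begin
		writeLn('same');
	end;

	{ changing a single character makes the arrays unequal }
	b[4] := '8';
	if a <> b then
	begin
		writeLn('different');
	end;
end.
```

This prints same followed by different.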

Note, the EP standard also defines the EQ and NE functions, among many others. The difference from = and <> is that blank padding (i. e. #0 in FPC or ' ' in GPC) has no significance in EQ and NE. In consequence, using these functions you can compare strings and char arrays regardless of their respective maximum capacity and still get the naturally expected result. The = and <> comparisons, on the other hand, look at the memory’s internal representation.

## Matrices

An array’s base type can be any data type,[fn 4] even another array. If we want to declare an “array of arrays” there is a short syntax for that:

program matrixDemo(output);
const
columnMinimum = -30;
columnMaximum =  30;
rowMaximum =  10;
rowMinimum = -10;
type
columnIndex = columnMinimum..columnMaximum;
rowIndex = rowMinimum..rowMaximum;
plot = array[rowIndex, columnIndex] of char;

This has already been described above. The last line is identical to:

	plot = array[rowIndex] of array[columnIndex] of char;

It can be expanded to two separate declarations allowing us to “re-use” the “inner” array data type:

	row = array[columnIndex] of char;
plot = array[rowIndex] of row;

Note that in the latter case plot uses row as the base type which is an array by itself. Yet in the short notation we specify char as the base type, not a row but its base type.

When an array was declared to contain another array, there is a short notation for accessing individual array items, too:

var
curve: plot;
x: columnIndex;
y: rowIndex;
v: integer;
begin
// initialize
for y := rowMinimum to rowMaximum do
begin
for x := columnMinimum to columnMaximum do
begin
curve[y, x] := ' ';
end;
end;

// graph something
for x := columnMinimum to columnMaximum do
begin
v := abs(x) - rowMaximum;

if (v >= rowMinimum) and (v <= rowMaximum) then
begin
curve[v, x] := '*';
end;
end;

This corresponds to the array’s declaration. It is vital to ensure the indices you are specifying are indeed valid. In the latter loop the if branch checks for that. Attempting to access non-existent array values, i. e. by supplying illegal indices, may crash the program, or worse, remain undetected, thus causing mistakes that are difficult to find.
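That the short and the long index notation refer to the very same array item can be seen in a small separate sketch. The type grid and the variable g are made up for this illustration and are not part of matrixDemo.

```pascal
program matrixAccessDemo(output);
type
	{ hypothetical two-dimensional array, short declaration syntax }
	grid = array[1..2, 1..3] of char;
var
	g: grid;
begin
	{ write with the short notation, read with the long notation }
	g[2, 3] := '*';
	writeLn(g[2][3]);

	{ and the other way around }
	g[1][1] := '+';
	writeLn(g[1, 1]);
end.
```

This prints * and then +.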

A program compiled with the GPC will terminate with

./a.out: value out of range (error #300 at 402a76)

if you are attempting to access a non-existent array item (the final hexadecimal number may vary). The FPC, however, by default ignores this. You need to specifically request errors to be detected either by specifying the ‑Cr command-line switch, or placing the compiler directive comment {$rangeChecks on} in your source code (before you are accessing any arrays, writing this once in your source code is enough). Confer also chapter “Enumerations”, subsection “Restriction”.

Note, the “unusual” order of x and y has been chosen to facilitate drawing an upright graph:

	// print graph, note reverse downto direction
	for y := rowMaximum downto rowMinimum do
	begin
		writeLn(curve[y]);
	end;
end.

That means it is still possible to refer to entire “sub”-arrays as a whole. You are not forced to write all dimensions an array value has, given it makes sense.

Array data types that have exactly two dimensions are also called matrices, singular matrix.

In mathematics a matrix does not necessarily have to be homogeneous, but could contain different “data types”.

## Real values

As introduced in one of the first chapters, the data type real is part of the Pascal programming language. It is used to store integer values in combination with an optional fractional part.

### Real literals

In order to distinguish integer literals from real literals, specifying real values in your source code (and also for read/readLn) differs slightly. The following source code excerpt shows some real literals:

program realDemo;
var
	x: real;
begin
	x := 1.61803398;
	x := 1E9;
	x := 500e-12;
	x := -1.7320508;
	x := +0.0;
end.

To summarize the rules:

• A real value literal always contains either a . as a radix mark, or an E/e to separate an exponent (the $x$ in $\ldots \times 10^{x}$), or both.
• There is at least one Western-Arabic decimal digit before and one after the . (if there is any).
• The entire number and exponent are preceded by signs, yet a positive sign is optional.

0.0 accepts a sign, but is in fact (in mathematics) a sign-less number. -0.0 and +0.0 denote the same value.

As always, number literals cannot contain spaces.

### Limitations

The real data type has many limitations you have to be aware of in order to effectively use it. First of all, we want to re-emphasize an issue that was mentioned when data types were introduced: In computing, real variables can only store a subset of rational numbers (ℚ).[2] That means, for example, you cannot store the (mathematically speaking) real number (ℝ) $\sqrt{2}$. This number is an irrational number (i. e. not a rational number). If you cannot write a number as a finite real literal, it is impossible to store it in a system using a finite amount of memory, such as computer systems do.

Fortunately, in EP three constants aid your usage of real values. minReal is the smallest positive real value. In conjunction with the constant maxReal, it is guaranteed that all arithmetic operations in $\left[-\texttt{maxReal}, \texttt{maxReal}\right] \setminus \left(-\texttt{minReal}, \texttt{minReal}\right) \cup \left\{0\right\}$ produce, quote, reasonable approximations.[3] It is not specified what exactly constitutes a “reasonable” approximation, thus it can, for example, mean that maxReal - 1 yields as “an approximation” maxReal.[fn 5] Also, it is quite possible that real variables can store larger values than maxReal.

epsReal is short for “epsilon real”. The small Greek letter ε (epsilon) frequently denotes in mathematics an arbitrarily small (positive) value, yet not zero.
According to the ISO standard 10206 (“Extended Pascal”), epsReal is the result of subtracting 1.0 from the smallest value that is greater than 1.0.[3] No other value can be represented between this value and 1.0, thus epsReal represents the highest precision available, but just at that point.[fn 6] Most implementations of the real data type will show a significantly varying degree of precision. Most notably, the precision of real data type implementations complying with the IEEE standard 754 format decays toward the extremes, when approaching (and going beyond) -maxReal and maxReal. Therefore you usually use, if at all, a reasonable multiple of epsReal that fits the given situation.

### Transfer functions

Pascal’s strong typing system prevents you from assigning real values to integer variables. The real value may contain a fractional part that an integer variable cannot store.

Pascal defines, as part of the language, two standard functions addressing this issue in a well-defined manner.

• The function trunc, short for “truncation”, simply discards any fractional part and returns, as an integer, the integer part. As a consequence this is effectively rounding a number toward zero. trunc(-1.999) will return the value -1.
• If this feels “unnatural”, the round function rounds a real number to its closest integer value. round(x) is (regarding the resulting value) equivalent to trunc(x + 0.5) for non-negative values, and equivalent to trunc(x - 0.5) for negative x values.[fn 7] In other words, this is what you were probably taught in grade school, or the first rounding method you learned in homeschooling. It is commonly referred to as “commercial rounding”.

Both functions will fail if there is no such integer value fulfilling the function’s respective definition.

There is no function if you want to (explicitly) “transfer” an integer value to a real value.
Instead, one uses arithmetically neutral operations:

• integerValue * 1.0 (using the multiplicative neutral element), or
• integerValue + 0.0 (using the additive neutral element)

These expressions make use of the fact, as it was mentioned earlier as a passing remark in the chapter on expressions, that the expression’s overall data type will become real as soon as one real value is involved.[4]

It is not guaranteed that all possible integer values can be stored as real variables.[fn 8] This primarily concerns large values, but it is important to understand that the data type integer is the best choice to accurately store integral, whole-numbered values nonetheless.

### Printing real values

By default write/writeLn print real values using “scientific notation”.

Borland Pascal defines a real value constant pi. It was adopted by many compilers.

Code:

program printPi(output);
begin
	writeLn(pi);
end.

Output:

 3.141592653589793e+00

This output has been generated by a program compiled with the GPC (without forcing ISO standard compliance). The width may vary, but the general style

1. sign, where a positive sign is replaced with a blank
2. one digit
3. a dot
4. a positive number of post-decimal digits
5. an E (uppercase or lowercase)
6. the sign of the exponent, but this time a positive sign is always written and a zero exponent is preceded by a plus sign, too
7. the exponent value, with a fixed minimum width (here 2) and leading zeros

is always the same.

While this style is very universal, it may also be unusual for some readers. Particularly the E notation is something now rather archaic, usually only seen on pocket calculators, i. e. devices lacking sufficient display space.

Luckily, write/writeLn allow us to customize the displayed style. For starters, specifying a minimum total width is legal for real parameters, too, but this also shows more digits:

Code:

program printPiDigits(output);
begin
	writeLn(pi:40);
end.
Output:

 3.141592653589793238512808959406186e+00

Note that 40 refers to the entire width including a sign, the radix mark, the e and the exponent representation.

The procedures write/writeLn accept for real type values (and only for real values) another colon-separated format specifier. This second number determines the (exact) number of post-decimal digits, the “fraction part”. Supplying two format specifiers also disables scientific notation. All real values are printed using regular positional notation. That may mean for “large” numbers such as 1e100 printing a one followed by a hundred zeros (just for the integer part). Let’s look at an example:

Code:

program realFormat(output);
var
	x: real;
begin
	x := 248e9 + 500e-6;
	writeLn(x:32:8);
end.

Output:

           248000000000.00048828

Note, the entire width, including . and in the case of negative numbers -, is 32 characters. After the . follow 8 digits. Bear in mind the precise number, especially the fractional part, may vary.

In some regions and languages it is customary to use a , (comma) or other character instead of a dot as a radix mark. Pascal’s on-board write/writeLn procedures will, on the other hand, always print a dot, and for that matter read/readLn will always accept dots as radix marks only. Nevertheless, all current Pascal programming suites, Delphi, FPC and GPC, provide appropriate utilities to overcome this restriction. For further details we refer to their manuals. This issue should not keep us from continuing learning Pascal.

The output write/writeLn (and in EP also writeStr) generate will be rounded with respect to the last printed digit.

Code:

program roundedWriteDemo(output);
begin
	writeLn(3.75:4:1);
end.

Output:

 3.8

Note, this rounding has to be distinguished from approximations arising from real’s limitations. It was verified that the computer used for this demonstration could indeed store precisely the specified value.
The rounding you see in this particular case is not due to any technical circumstances.

### Comparisons

First of all, all (arithmetic) comparison operators do work with real values. The operators = and <>, though, are particularly tricky to handle.

In most applications you do not compare real values to each other when checking for equality or inequality. The problem is that numbers such as ⅓ cannot be stored exactly with a finite series of binary digits, only approximated, yet there is not one valid approximation for ⅓ but many legitimate ones. The = and <> operators, however, compare, so to speak, for specific bit patterns. This is usually not desired (for values that cannot be represented exactly). Instead, you want to ensure the values you are comparing are within a certain range, like:

function equal(x, y, delta: real): Boolean;
begin
	equal := abs(x - y) <= delta;
end;

Delphi and the FPC’s standard RTS provide the function sameValue as part of the math unit. There is no need to program something other people have already programmed for you, i. e. use the available resources.

## Division

Now that we know the data type used for storing (a subset of) rational numbers, in Pascal known as real, we can perform and use the result of another arithmetic operation: the division.

### Flavors

Pascal defines two different division operators:

• The div operator performs an integer division and discards, if applicable, any remainder. The expression’s resulting data type is always integer. The div operator only works if both operands are integer expressions.
• The, probably more familiar, operator / (a forward slash) divides the LHS number (the dividend) by the RHS number (the divisor), too, but a /-division always yields a real type value.[4] This is also the case if the fractional part is zero. A “remainder” does not exist for the / operation.
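The difference between the two division operators can be made visible with a minimal sketch. The program name divisionDemo is an arbitrary choice, and the format specifiers are only there to keep the output compact.

```pascal
program divisionDemo(output);
var
	dividend, divisor: integer;
begin
	dividend := 7;
	divisor := 2;

	{ integer division, the remainder 1 is discarded }
	writeLn(dividend div divisor:1);

	{ real division, the result has the data type real }
	writeLn(dividend / divisor:4:1);

	{ still a real value, although the fractional part is zero }
	writeLn(6 / divisor:4:1);
end.
```

The first line of output is the integer 3, whereas the following two lines show the real values 3.5 and 3.0.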
The div operation can be put in terms of /:

	dividend div divisor = trunc(dividend / divisor)

However, this is only a semantic equivalent,[fn 9] it is not how it is actually calculated. The reason is, the / operator would first convert both operands to real values and since, as explained above, not all integer values can necessarily be represented exactly as real values, this would produce results potentially suffering from rounding imprecisions. The div operator, on the other hand, is a pure integer operator and works with “integer precision”, that means no rounding is involved in actually calculating the div result.

### Off limits divisor

Note, since there is no generally accepted definition for division by zero, a zero divisor is illegal and will cause your program to abort. If your divisor is not a constant and depends on run-time data (such as a variable read from user input), you should check that it is non-zero before doing a division:

	if divisor <> 0 then
	begin
		x := dividend / divisor;
		// ...
	end
	else
	begin
		writeLn('Error: zero divisor encountered when calculating rainbow');
	end;

Alternatively, you can declare data types excluding zero, so any assignment of a zero value will be detected:

type
	natural = 1..maxInt;
var
	divisor: natural;
begin
	write('Enter a positive integer: ');
	readLn(divisor); // entering a non-positive value will fail

Some Pascal dialects introduce the concept of “exceptions” that can be used to identify such problems. Exceptions may be mentioned again in the “Extensions” part of this Wikibook.

## Arrays as parameters

Arrays can be copied with one simple assignment dataOut := dataIn;.
This requires, however, as it is customary with Pascal’s strong type safety, that both arrays are assignment-compatible: That means their base type and dimension specifications are the same.[fn 10] Because calling a routine involves invisible assignments, writing general-purpose code dealing with lots of different situations would be virtually impossible if the entire program had to use one array type only. In order to mitigate this situation, conformant array type parameters allow writing routines accepting differing array dimension lengths. Array dimension data types still have to match. Let’s look at an example program using this feature: program tableDemo(output); procedure printTableRows(data: array[minimum..maximum: integer] of integer); var i: integer; // or in EP i: type of minimum; [preferred alternative] begin for i := minimum to maximum do begin writeLn(i:20, ' | ', data[i]:20); end; end; A conformant-array parameter looks pretty similar to a regular array variable declaration, but the dimensions are specified differently. Usually, when declaring new arrays you provide constant values as dimension limits. Since we do not want constants, though, we name placeholder identifiers for the actual dimension limits of any array printTableRows will receive. In this case they are named minimum and maximum, joined by .. inbetween indicating a range. minimum and maximum become variables inside the definition of printTableRows, but you are not allowed to assign any values to them.[fn 11] You are not allowed to declare new identifiers bearing the same name as the array boundary variables. In Pascal all constants implicitly have an unambiguously determinable data type. Since our array limit identifiers are in fact variables they require a data type. The : integer indicates both minimum and maximum have the data type integer.  
In a conformant-array parameter, the short notation for nested arrays uses a ; to separate multiple dimensions, thus resembling a regular parameter list.

Once we have declared and defined printTableRows we can use it with lots of differently sized arrays:

    var
        table: array[0..3] of integer;
        nein: array[9..9] of integer;
    begin
        table[0] := 1;
        table[1] := 2;
        table[2] := 4;
        table[3] := 8;
        printTableRows(table);
        nein[9] := 9;
        printTableRows(nein);
    end.

Delphi and the FPC (as of version 3.2.0 released in 2020) do not support conformant-array parameters, but the GPC does.

## Logarithms

### Special support

Prior to the 21st century, logarithm tables and slide rules were heavily utilized tools in manual calculations, so much so that it led to the inclusion of two basic functions in Pascal.

• The function exp exponentiates a number to the base ${\displaystyle e}$, Euler’s number. The value of exp(x) is equivalent to the mathematical term ${\displaystyle e^{\texttt {x}}}$.
• The function ln, short for Latin “logarithmus naturalis”, takes the natural logarithm of a positive number. “Natural” refers to ${\displaystyle e}$ again.

Both functions always return real values.

### Introduction

Since the use of logarithms is not necessarily taught in all curricula, or you might just need a refresher, here is a short primer: The basic idea of logarithms is that all operations are lifted one level.

| logarithm level | real level |
|---|---|
| ${\displaystyle \ln x+\ln y}$ | ${\displaystyle x\times y}$ |
| ${\displaystyle \ln x-\ln y}$ | ${\displaystyle x/y}$ |
| ${\displaystyle y\times \ln x}$ | ${\displaystyle x^{y}}$ |

On the lifted level many operations become simpler, especially if the numbers are large. For instance, you can perform a rather easy addition if you actually mean to take the product of two numbers. For this you have to “lift” all operands one level up: This is done by taking the logarithm.
Pay particular attention to ${\displaystyle x^{y}}$: On the logarithm level ${\displaystyle y}$ is a non-“logarithmized” factor; you only take the logarithm of ${\displaystyle x}$.

Once you are done, you have to descend one level again to get the actual “real” result of the intended operation (on the underlying level). The reverse operation of ln is exp.[fn 12] To put this principle in Pascal terms:

    x * y = exp(ln(x) + ln(y))

Remember, x and y have to be positive in order to be valid parameters to ln.

### Application

Taking the logarithm and then exponentiating values again are considered “slow” operations and introduce a certain overhead. In programming, overhead means taking steps that are not directly related to the actual underlying problem, but only facilitate solving it. In general, overhead is avoided, since (at first) it takes us even farther away from the solution.

Nevertheless, logarithms can be used to overcome range limitations of the real data type if intermediate results are out of its range, but it is known that the final result will definitely be within the range of real again.

Code:

    program logarithmApplicationDemo(output);
    const
        operand = maxReal;
    var
        result: real;
    begin
        // for comparison
        writeLn(maxReal:40);

        result := sqr(operand);
        result := sqrt(result);
        writeLn(result:40);

        // “lift” one level
        result := ln(operand);
        result := 2.0 * result; // corresponds to sqr(…)
        result := 0.5 * result; // corresponds to sqrt(…)
        // reverse ln: bring result “back down”
        result := exp(result);
        writeLn(result:40);
    end.

Output:

     1.7976930000000000495189307440746532950903200318892038e+308
    Inf
     1.7976929999999315921963138504476453672053533033977331e+308

The intermediate result of sqr in line 10 exceeds the range of real rendering any subsequent results invalid. In this particular implementation this situation is displayed as Inf (infinity).
Since we know that reversing this operation by taking the principal square root results in a storable result again, we can perform the same operation with logarithms instead.

The shown output was generated by the program being compiled with the GPC. The program was executed on a 64-bit platform with an FPU using 80-bit numbers.

As you can see, this goes to the detriment of precision. It is a compromise between fast operations and “accurate enough” results.

The best solution is, of course, finding a better algorithm. The above demonstration ${\displaystyle {\sqrt {x^{2}}}}$ is effectively ${\displaystyle \left|x\right|}$, that is abs(x) (remember, squaring a number always yields a non-negative number). This operation’s result will stay in the range of real.

## Tasks

All tasks, including those in the following chapters, can be solved without conformant-array parameters. This takes account of the fact that not all major compilers support them.[fn 13] Nonetheless, using routines with conformant-array parameters is often the most elegant solution.

Write a program that prints the following multiplication table:

        1    2    3    4    5    6    7    8    9   10
        2    4    6    8   10   12   14   16   18   20
        3    6    9   12   15   18   21   24   27   30
        4    8   12   16   20   24   28   32   36   40
        5   10   15   20   25   30   35   40   45   50
        6   12   18   24   30   36   42   48   54   60
        7   14   21   28   35   42   49   56   63   70
        8   16   24   32   40   48   56   64   72   80
        9   18   27   36   45   54   63   72   81   90
       10   20   30   40   50   60   70   80   90  100

The program may not contain any string literals (i. e. you may not just use ten writeLn). The generation of data and the printing of data shall be implemented by separate routines (these routines may not call each other).

The following is a straightforward implementation.

    program multiplicationTable(output);
    const
        xMinimum = abs(1);
        xMaximum = abs(10);
        yMinimum = abs(1);
        yMaximum = abs(10);
        // NB: Only Extended Pascal allows constant definitions
        //     based on expressions. The following two definitions
        //     are illegal in Standard Pascal (ISO 7185).
        zMinimum = xMinimum * yMinimum;
        zMaximum = xMaximum * yMaximum;

Calculating the maximum and minimum expected value now (as constants) has the advantage that the compiler will emit a warning during compilation if any value exceeds maxInt. The abs were inserted as a means of documentation: The program only works properly for non-negative values.

    type
        x = xMinimum..xMaximum;
        y = yMinimum..yMaximum;
        z = zMinimum..zMaximum;
        table = array[x, y] of z;

Using z as the table array’s base type (and not just integer) has the advantage that if we accidentally implement the multiplication incorrectly, assigning out-of-range values will fail. For a trivial task like this one it is of course irrelevant, but for more difficult situations deliberately constricting the allowed range can thwart programming mistakes. Do not worry if you just used a plain integer.

    var
        product: table;

Note, the product variable has to be declared outside and before populateTable and printTable are defined. This way both routines refer to the same product variable.[fn 14]

    procedure populateTable;
    var
        factorX: x;
        factorY: y;
    begin
        for factorX := xMinimum to xMaximum do
        begin
            for factorY := yMinimum to yMaximum do
            begin
                product[factorX, factorY] := factorX * factorY;
            end;
        end;
    end;

It is also possible to reuse previously calculated values, make use of the fact that the table can be mirrored along the diagonal axis, or do other “optimization stunts”. The important thing for this task, though, is to correctly nest the for loops.

    procedure printTable;
    var
        factorX: x;
        factorY: y;
    begin
        for factorY := yMinimum to yMaximum do
        begin
            for factorX := xMinimum to xMaximum do
            begin
                write(product[factorX, factorY]:5);
            end;
            writeLn;
        end;
    end;

An advanced implementation would, of course, first determine the maximum expected length and store it as a variable, instead of using the hardcoded format specifier :5.
This, though, is out of this task’s scope.[fn 15] It should just be mentioned that hardcoded values like this one are considered bad style.

    begin
        populateTable;
        printTable;
    end.

Think of as many different ways as possible to specify real value literals shorter than five characters, all denoting the value “positive one”. What are they?

Regardless of their usability, all the following are legal real value literals denoting the value “positive one”:

1. 1.0
2. +1.0
3. 1.00
4. 1E0
5. 1E+0
6. 1E-0
7. 1E00

In real life you primarily use, of course, 1.0. This exercise is meant to sensitize you to the fact that (unlike integer values) real number values can have many valid representations. Note, some compilers will accept literals such as 1., too, but this is non-standard. The GPC will only accept it in non-ISO-compliant modes, but still emit a warning.

Write a binary (= accepting/taking two parameters) integer function that calculates the n-th integer power of a positive number. Restrict the parameters’ data types as much as possible.
The function should return 0 if the result is invalid (i. e. out of range).

Calculate means that we want the exact result. Although you have learned logarithms in this lesson, it is always best to stay within the realm of integers. Logarithms potentially yield approximations. The following is an acceptable implementation:

    type
        naturalNumber = 1..maxInt;
        wholeNumber = 0..maxInt;

    {**
        \brief iteratively calculates the integer power of a number

        \param base the (positive) base in x^n
        \param exponent the (non-negative) exponent in x^n
        \return \param base to the power of \param exponent,
                or zero in the case of an error
    **}
    function power(base: naturalNumber; exponent: wholeNumber): wholeNumber;
    var
        accumulator: wholeNumber;
    begin
        { anything [except zero] to the power of zero is defined as one }
        accumulator := 1;

        for exponent := exponent downto 1 do
        begin
            { if another “times base” would exceed the limits of integer }
            { we invalidate the entire result }
            if accumulator > maxInt div base then
            begin
                accumulator := 0;
            end;

            accumulator := accumulator * base;
        end;

        power := accumulator;
    end;

As you can see, it is perfectly valid to write such null statements as exponent := exponent just to satisfy Pascal’s syntax requirements. A good compiler will optimize that away. Note that the EP standard provides the integer power operator pow.[fn 16]

Improve your previous solution and make it work for negative base values as well. If your compiler supports the EP procedure halt, your function should print an error message and terminate the program if ${\displaystyle {\text{base}}={\text{exponent}}=0}$, because there is no universally agreed definition for ${\displaystyle 0^{0}}$.

    {**
        \brief iteratively calculates the integer power of a number

        \param base the non-zero base in x^n
        \param exponent the (non-negative) exponent in x^n
        \return \param base to the power of \param exponent,
                or zero in the case of an error

        The program aborts if base = 0 = exponent.
    **}
    function power(base: integer; exponent: wholeNumber): integer;
    var
        accumulator: integer;
        negativeResult: Boolean;

Remember to update all data types and keep the source code documentation in sync.
    begin
        if [base, exponent] = [0] then
        begin
            writeLn('Error in power: base = exponent = 0, but 0^0 is undefined');
            halt;
        end;

As per task specifications this part was optional. Here, a set was chosen to sensitize you to that possibility. You will, nevertheless, usually and most probably write (base = 0) and (base = exponent) or similar, which is just as valid.

        { anything [except zero] to the power of zero is defined as one }
        accumulator := 1;
        negativeResult := (base < 0) and odd(exponent);

        { calculating the _positive_ power of base^exponent }
        { simplifies the invalidation condition in the loop below }
        base := abs(base);

        if base > 1 then
        begin
            for exponent := exponent downto 1 do
            begin
                { if another “times base” would exceed the limits of integer }
                { we invalidate the entire result }
                if accumulator > maxInt div base then
                begin
                    accumulator := 0;
                end;

                accumulator := accumulator * base;
            end;
        end;

The necessity of this new if branch may not be as apparent as it should be: Because we earlier extended the range of possible base values to all integer values, it has also become possible to specify 0. However, remember that division by zero is illegal. Since our invalidation condition relies on div base, we need to take precautionary steps.

        if negativeResult then
        begin
            accumulator := -1 * accumulator;
        end;

        power := accumulator;
    end;

Write a program that accepts “chirps”, that is microblogging messages consisting of up to 320 characters.

• The user will terminate his input with an empty line. Print this instruction beforehand.
• When done, print the message.
• When printing, a line may be at most 80 characters long (or whatever is reasonable for you). You are allowed to presume the user’s input lines are at most 80 characters long.
• Ensure you only wrap lines at space characters (unless there are no space characters in a line).

Hint: For testing purposes you may want to scale down your input sizes.

More tasks you can solve can be found on the following Wikibook pages:

Sources:

1.
Jensen, Kathleen; Wirth, Niklaus. Pascal – user manual and report (4th revised ed.). p. 56. doi:10.1007/978-1-4612-4450-9. ISBN 978-0-387-97649-5. "An array type consists of a fixed number of components (defined when the array type is introduced) all having the same type, called the component type." 2. This limitation comes from ﻿Pascal: ISO 7185:1990﻿ (Report). ISO/IEC. 1991. p. 16. "real-type. The required type-identifier real shall denote the real-type. […] The values shall be an implementation-defined subset of the real numbers, denoted as specified in 6.1.5 by signed-real." The end of the last sentence implies only writable, those you can specify in your source code, are legal real values. For example, the value ${\displaystyle 3.1415}$ is different from ${\displaystyle \pi }$, the decimial representation of which would be infinitely long (a. k. a. irrational number), thus the actual, correct value of ${\displaystyle \pi }$ cannot appear in source code as a “real” value. 3. a b Joslin, David A. (1989-06-01). "Extended Pascal – Numerical Features". Sigplan Notices 24 (6): 77–80. doi:10.1145/71052.71063. "The programmer can obtain some idea of the real range and precision from the (positive) implementation-defined constants MINREAL, MAXREAL and EPSREAL. Arithmetic in the ranges [-maxreal,-minreal], 0, and [minreal,maxreal] “can be expected to work with reasonable approximations”, whereas outside those ranges it cannot. As what constitutes a “reasonable approximation” is a matter of opinion, however, and is not defined in the standard, this statement may be less useful than it appears at first sight. The measure of precision is on somewhat firmer ground: EPSREAL is the commonly employed measure of (typically floating-point) accuracy, i.e the smallest value such that 1.0 + epsreal > 1.0.". 4. a b Jensen, Kathleen; Wirth, Niklaus. Pascal – user manual and report (4th revised ed.). p. 20. doi:10.1007/978-1-4612-4450-9. ISBN 978-0-387-97649-5. 
"As long as at least one of the operands is of type Real (the other possibly being of type Integer) the following operators yield a real value:  * multiply / divide (both operands may be integers, but the result is always real) + add - subtract " Notes: 1. Some (old) computers did not know the bracket characters. Seriously, that’s not a joke. Instead, a substitute bigram was used: var signCharacter: array(.Boolean.) of char;, and signCharacter(.true.) := '-';. You may encounter this kind of notation in some (old) textbooks on Pascal. Anyway, using brackets is still the preferred method. 2. Only I/O concerning a packed array[1..n] of componentType, where n is greater than 1 and componentType is or is a subrange of char, is standardized. However, in this part of the book you are not introduced to the concept of packing, the packed keyword. Therefore, the shown behavior is non-standard. 3. More precisely, text files are (possibly empty) sequences of lines, each line consisting of a (possibly empty) sequence of char values. 4. Some compilers (such as the FPC) allow zero-sized data types [not allowed in any ISO standard]. If that is the case, an array that has a zero-size base type will be rendered ineffective, virtually forfeiting all characteristics of arrays. 5. Modern real arithmetic processors can indicate a precision loss, i. e. when the result of an arithmetic operation had to be “approximated”. However, there is no standardized way to access this kind of information from your Pascal source code, and usually this kind of signaling is also not favorable, since the tiniest precision loss will set off the alarm, thus slowing down your program. Instead, if it matters, one uses software that allows arbitrary precision arithmetics, like for example the GNU Multiple Precision Arithmetic Library. 6. This number is not completely arbitrary. The most prevalent real number implementation IEEE 754 uses a “hidden bit”, making the value 1.0 special. 7. 
Not all compilers comply with this definition of the standard. The FPC’s standard round implementation will round in the case of equidistance toward even numbers. Knowing this is relevant for statistical applications. The GPC uses for its round implementation functionality provided by the hardware. As such, the implementation is hardware-dependent, on its specific configuration, and may deviate from the ISO 7185 standard definition. 8. Given the most prevalent implementations Two’s complement for integer values and IEEE 754 for real values, you have to consider the fact that (virtually) all bits in an integer contribute to its (mathematical) value, whereas a real number stores values for the expression mantissa * base pow exponent. In very simple terms, the mantissa stores the integer part of a value, but the problem is that it occupies fewer bits than an integer would use, thus there is (for values that require more bits) a loss in information (i. e. a loss in precision). 9. The exact technical definition reads like: The value of dividend div divisor shall be such that abs(dividend) - abs(divisor) < abs((dividend div divisor) * divisor) <= abs(dividend) where the value shall be zero if abs(dividend) < abs(divisor); otherwise, […] 10. Furthermore, both arrays have to be either packed or “unpacked”. 11. The EP standard calls this characteristic protected. 12. It is important that both functions use one common base, in this case it is ${\displaystyle e}$. 13. The ISO standard 7185 (“Standard Pascal”) calls this, lack of conformant-array parameters, “Level 0 compliance”. “Level 1 compliance” includes support for conformant array parameters. Of the compilers presented in Getting started only the GPC is a “Level 1”-compliant compiler. 14. This style of programming is slightly disfavored, keyword “global variables”, but as for now we do not know appropriate syntax (var parameters) to not do that. 15. 
For extra credit: You can make use of the fact that (assuming that zMaximum is positive) ${\displaystyle \log _{10}{\texttt {zMaximum}}={\frac {\ln {\texttt {zMaximum}}}{\ln 10}}}$. You can use this value to find out the minimum number of digits required.
16. In EP there also exists a real power operator **. The difference is similar to that of the division operators: pow only accepts integer values as operands and yields an integer value, whereas ** always yields a real value. Your choice between the two, again, should be based on the required degree of precision.

# Strings

The data type string(…) is used to store a finite sequence of char values. It is a special case of an array, but unlike an array[…] of char the data type string(…) has some advantages facilitating its effective usage.

The data type string(…) as presented here is an Extended Pascal extension, as defined in the ISO standard 10206. Due to its high relevance in practice, this topic has been put into the Standard Pascal part of this Wikibook, right after the chapter on arrays.

Many compilers have a different conception of what constitutes a string. Consult their manual for their idiosyncratic differences. Rest assured, the GPC supports string(…) as explained here.

## Properties

### Capacity

#### Definition

The declaration of a string data type always entails a maximum capacity:

    program stringDemo(output);
    type
        address = string(60);
    var
        houseAndStreet: address;
    begin
        houseAndStreet := '742 Evergreen Trc.';
        writeLn('Send complaints to:');
        writeLn(houseAndStreet);
    end.

After the word string follows a positive integer number surrounded by parentheses. This is not a function call.[fn 1]

#### Implications

Variables of the data type address as defined above will only be able to store up to 60 independent char values. Of course it is possible to store fewer, or even 0, but once this limit is set it cannot be expanded.
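As a small illustration of these implications, the following sketch stores strings of different lengths in one variable. The names capacityImplicationsDemo, shortText and note are hypothetical, and the program assumes a compiler that supports the Extended Pascal string(…) data type, such as the GPC.

```pascal
program capacityImplicationsDemo(output);
type
    shortText = string(8);
var
    note: shortText;
begin
    { fewer characters than the capacity are fine }
    note := 'abc';
    writeLn('[', note, ']');

    { so is no character at all }
    note := '';
    writeLn('[', note, ']');

    { exactly eight characters still fit }
    note := '12345678';
    writeLn('[', note, ']');

    { note := '123456789'; would not fit: }
    { nine characters exceed the capacity of eight }
end.
```

The brackets are only printed to make the boundaries of each value visible; only the characters actually stored are written, not eight characters every time.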
#### Inquiry String variables “know” about their own maximum capacity: If you use writeLn(houseAndStreet.capacity), this will print 60. Every string variable automatically has a “field” called capacity. This field is accessed by writing the respective string variable’s name and the word capacity joined by a dot (.). This field is read-only: You cannot assign values to it. It can only appear in expressions. ### Length All string variables have a current length. This is the total number of legit char values every string variable currently contains. To query this number, the EP standard defines a new function called length: program lengthDemo(output); type domain = string(42); var alphabet: domain; begin alphabet := 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'; writeLn(length(alphabet)); end. The length function returns a non-negative integer value denoting the supplied string’s length. It also accepts char values.[fn 2] A char value has by definition a length of 1. It is guaranteed that the length of a string variable will always be less than or equal to its corresponding capacity. ### Compatibility You can copy entire string values using the := operator provided the variable on the LHS has the same or a greater capacity than the RHS string expression. This is different than a regular array’s behavior, which would require dimensions and size to match exactly. program stringAssignmentDemo; type zipcode = string(5); stateCode = string(2); var zip: zipcode; state: stateCode; begin zip := '12345'; state := 'QQ'; zip := state; // ✔ // zip.capacity > state.capacity // ↯ state := zip; ✘ end. As long as no clipping occurs, i. e. the omission of values because of a too short capacity, the assignment is fine. ### Index It is worth noting that otherwise strings are internally regarded as arrays.[fn 3] Like a character array you can access (and alter) every array element independently by specifying a valid index surrounded by brackets. 
However, there is a big difference with respect to the validity of an index. At any time, you are only allowed to specify indices that are within the range 1..length. This range may be empty, specifically if length is currently 0.

It is not possible to change the current length by manipulating individual string components:

    program stringAccessDemo;
    type
        bar = string(8);
    var
        foo: bar;
    begin
        foo := 'AA';   { ✔ length ≔ 2 }
        foo[2] := 'B'; { ✔ }
        foo[3] := 'C'; { ↯: 3 > length }
    end.

## Standard routines

In addition to the length function, EP also defines a few other standard functions operating on strings.

### Manipulation

The following functions return strings.

#### Substring

In order to obtain just a part of a string (or char) expression, the function subStr(stringOrCharacter, firstCharacter, count) returns a sub-string of stringOrCharacter having the non-negative length count, starting at the positive index firstCharacter. It is important that firstCharacter + count - 1 is a valid character index in stringOrCharacter, otherwise the function causes an error.[fn 4]

Let’s have a look at it in action:

Code:

    program substringDemo(output);
    begin
        writeLn(subStr('GCUACGGAGCUUCGGAGUUAG', 7, 3));
        { char index:   1  4  7  … }
    end.

Output:

    GAG

Pay particular attention to the firstCharacter index. Here we wanted to extract the third codon. However, firstCharacter is not simply 2 * 3 but 2 * 3 + 1. Indexing characters in a string variable starts at 1. Note, a sophisticated implementation for encoding codons would not make use of string, but define a custom enumeration data type.

For string variables, the subStr function is the same as specifying myString[firstCharacter..firstCharacter+count-1].[fn 5] Evidently, if the firstCharacter value is some complicated expression, the subStr function should be preferred to prevent any programming mistakes.

However, this range index syntax can be used not just as a value in expressions, but also to overwrite parts of a string.
Code:

    program substringOverwriteDemo(output);
    var
        m: string(35);
    begin
        m := 'supercalifragilisticexpialidocious ';
        m[21..35] := '-yadi-yada-yada';
        writeLn(m);
    end.

Output:

    supercalifragilistic-yadi-yada-yada

Note that the first assignment contains a trailing blank. As emphasized above, you cannot use this syntax to alter the length of a string.

Furthermore, the third parameter to subStr can be omitted: This will simply return the rest of the given string starting at the position indicated by the second parameter.[fn 6]

#### Remove trailing spaces

The trim(source) function returns a copy of source without any trailing space characters, i. e. ' '. In LTR scripts any blanks to the right are considered insignificant, yet in computing they take up (memory) space. It is advisable to prune strings before writing them, for example, to a disk or other long-term storage media, or transmission via networks. Concededly memory requirements were a more relevant issue prior to the 21st century.

### First occurrence of substring

The function index(source, pattern) finds the first occurrence of pattern in source and returns the starting index. All characters from pattern match the characters in source at the returned offset:

| character index in source | 1 | 2 | 3 | 4 | 5 | 6 | 7 | match |
|---|---|---|---|---|---|---|---|---|
| pattern (1 2 3) | X | Y | X | | | | | ✘ |
| pattern (1 2 3) | | X | Y | X | | | | ✘ |
| pattern (1 2 3) | | | X | Y | X | | | ✔ |
| source | Z | Y | X | Y | X | Y | X | |

Note, to obtain the second or any subsequent occurrence, you need to use a proper substring of the source.

Because the “empty string” is, mathematically speaking, present everywhere, index(characterOrString, '') always returns 1. Conversely, because any non-empty string cannot occur in an empty string, index('', nonEmptyStringOrCharacter) always returns 0, in the context of strings an otherwise invalid index. The value zero is returned if pattern does not occur in source. This will always be the case if pattern is longer than source.

### Operators

The EP standard introduced an additional operator for strings of any length, including single characters.
The + operator concatenates two strings or characters, or any combination thereof. Unlike the arithmetic +, this operator is non-commutative, that means the order of the operands matters.

| expression | result |
|---|---|
| 'Foo' + 'bar' | 'Foobar' |
| '' + '' | '' |
| '9' + chr(ord('0') + 9) + ' Luftballons' | '99 Luftballons' |

concatenation samples

Concatenation is useful if you intend to save the data somewhere. Supplying concatenated strings to routines such as write/writeLn, however, may possibly be disadvantageous: The concatenation, especially of long strings, first requires allocating enough memory to accommodate the entire resulting string. Then, all the operands are copied to their respective location. This takes time. Hence, in the case of write/writeLn it is advisable (for very long strings) to use their capability of accepting an arbitrary number of (comma-separated) parameters.

Procedural equivalent

Note, the common LOC

    stringVariable := 'xyz' + someStringOrCharacter + …;

is equivalent to

    writeStr(stringVariable, 'xyz', someStringOrCharacter, …);

The latter is particularly useful if you also want to pad the result or need some conversion. Writing foo:20 (minimum width of 20 characters possibly padded with spaces ' ' to the left) is only acceptable using write/writeLn/writeStr. WriteStr is an EP extension.

The GPC, the FPC and Delphi are also shipped with a function concat performing the very same task. Read the respective compiler’s documentation before using it, because there are some differences, or just stick to the standardized + operator.

### Sophisticated comparison

All functions presented in this subsection return a Boolean value.

#### Order

Since every character in a string has an ordinal value, we can think of a method to sort them. There are two flavors of comparing strings:

• One uses the relational operators already introduced, such as =, > or <=.
• The other one is to use dedicated functions like LT, or GT.
The difference lies in their treatment of strings that vary in length. While the former will bring both strings to the same length by padding them with space characters (' '), the latter simply clips them to the shortest length, but taking into account which one was longer (if necessary).

| function name | meaning | operator |
|---|---|---|
| EQ | equal | = |
| NE | not equal | <> |
| LT | less than | < |
| LE | less than or equal to | <= |
| GT | greater than | > |
| GE | greater than or equal to | >= |

string comparison functions and operators

All these functions and operators are binary, that means they expect and accept only exactly two parameters or operands respectively. They can produce different results if supplied with the same input, as you will see in the next two sub-subsections.

#### Equality

Let’s start with equality.

• Two strings (of any length) are considered equal by the EQ function if both operands are of the same length and their values, i. e. the character sequences that actually make up the strings, are the same.
• An =‑comparison, on the other hand, augments any “missing” characters in the shorter string by using the padding character space (' ').[fn 7]

Let’s see this in action:

Code:

    program equalDemo(output);
    const
        emptyString = '';
        blankString = ' ';
    begin
        writeLn(emptyString = blankString);
        writeLn(EQ(emptyString, blankString));
    end.

Output:

    True
    False

As you can see emptyString got padded to match the length of blankString, before the actual character-by-character =‑expression took place. To put this relationship in other words, Pascal terms you already know:

    (foo = bar) = EQ(trim(foo), trim(bar))

The actual implementation is usually different, because trim can be, especially for long strings, quite resource-consuming (time, as well as memory).

As a consequence, an =‑comparison is usually used if trailing spaces are insignificant, but are still there for technical reasons (e. g. because you are using an array[1..8] of char). Only EQ ensures both strings are lexicographically the same.
Note that the capacity of either string is irrelevant. The function NE, short for not equal, behaves accordingly.

#### Less than

A string is determined to be “less than” another one by sequentially reading both strings simultaneously from left to right and comparing corresponding characters. If all characters match, the strings are said to be equal to each other. However, if we encounter a differing character pair, processing is aborted and the relation of the current characters determines the overall string’s relation.

| | 1st character | 2nd character | 3rd character | 4th character |
|---|---|---|---|---|
| first operand | 'A' | 'B' | 'C' | 'D' |
| second operand | 'A' | 'B' | 'E' | 'A' |
| determined relation | = | = | < | ⨯ |

If both strings are of equal length, the LT function and the <‑operator behave the same. LT actually even builds on top of <. Things get interesting if the supplied strings differ in length.

1. The LT function first cuts both strings to the same (shorter) length. (substring)
2. Then a regular comparison is performed as demonstrated above. If the shortened, common-length versions turn out to be equal, the (originally) longer string is said to be greater than the other one.

The <‑comparison, on the other hand, compares all remaining “missing” characters to ' ', the space character. This can lead to differing results:

Code:

    program lessThanDemo(output);
    var
        hogwash, malarky: string(8);
    begin
        { ensure ' ' is not chr(0) or maxChar }
        if not (' ' in [chr(1)..pred(maxChar)]) then
        begin
            writeLn('Character set presumptions not met.');
            halt; { EP procedure immediately terminating the program }
        end;

        hogwash := '123';
        malarky := hogwash + chr(0);
        writeLn(hogwash < malarky, LT(hogwash, malarky));

        malarky := hogwash + '4';
        writeLn(hogwash < malarky, LT(hogwash, malarky));

        malarky := hogwash + maxChar;
        writeLn(hogwash < malarky, LT(hogwash, malarky));
    end.

Output:

    False True
    True True
    True True

When doing a primitive <‑comparison, the “missing” fourth character in hogwash is presumed to be ' '. The fourth character in malarky is compared against ' '.
The situation above has been provoked artificially for demonstration purposes, but this can still become an issue if you are frequently using characters that are “smaller” than the regular space character, like for instance if you are programming on an 1980s 8‑bit Atari computer using ATASCII.

The LE, GT, and GE functions act accordingly.

## Details on string literals

### Inclusion of delimiter

In Pascal string literals start with and are terminated by the same character. Usually this is a straight (typewriter’s) apostrophe ('). Troubles arise if you want to actually include that character in a string literal, because the character you want to include into your string is already understood as the terminating delimiter. Conventionally, two straight typewriter’s apostrophes back-to-back are regarded as an apostrophe image. In the produced computer program, they are replaced by a single apostrophe.

    program apostropheDemo(output);
    var
        c: char;
    begin
        for c := '0' to '9' do
        begin
            writeLn('ord(''', c, ''') = ', ord(c));
        end;
    end.

Each double-apostrophe is replaced by a single apostrophe. The string still needs delimiting apostrophes, so you might end up with three consecutive apostrophes like in the example above, or even four consecutive apostrophes ('''') if you want a char-value consisting of a single apostrophe.

### Non-permissible characters

A string is a linear sequence of characters, i. e. along a single dimension. As such the only illegal “character” in strings is the one marking line breaks (new lines). The string literal in the following piece of code is unacceptable, because it spans across multiple (source code) lines.

    welcomeMessage := 'Hello!
    All your base are belong to us.';

However, this only concerns string literals (string values written in your source code). You are nevertheless allowed to use the OS-specific code indicating EOLs, yet the only cross-platform (i. e. guaranteed to work regardless of the used OS) procedure is writeLn.
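As a minimal sketch of the portable route, the two-part message can be emitted with one writeLn call per output line, so that no string literal has to contain a line break. The program name multiLineMessage is made up for this illustration.

```pascal
program multiLineMessage(output);
begin
	{ each string literal stays on a single source code line }
	{ writeLn appends the line break appropriate for the environment }
	writeLn('Hello!');
	writeLn('All your base are belong to us.');
end.
```

This only uses Standard Pascal facilities, so it should behave the same on any conforming compiler.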
Although not standardized, many compilers provide a constant representing the environment’s character/string necessary to produce line breaks. In FPC it is called lineEnding. Delphi has sLineBreak, which is also understood by the FPC for compatibility reasons. The GPC’s standard module GPC supplies the constant lineBreak. You will first need to import this module before you can use that identifier.

## Remainder operator

The final Standard Pascal arithmetic operator you are introduced to, after learning to divide, is the remainder operator mod (short for modulo). Every integer division (div) may yield a remainder. This operator evaluates to this value.

| i | -3 | -2 | -1 | 0 | 1 | 2 | 3 |
|---|---|---|---|---|---|---|---|
| i mod 2 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| i mod 3 | 0 | 1 | 2 | 0 | 1 | 2 | 0 |

Similar to all other division operations, the mod operator does not accept a zero value as the second operand. Moreover, the second operand to mod must be positive. There are many definitions, among computer scientists and mathematicians, as regards the result if the divisor is negative. Pascal avoids any confusion by simply declaring negative divisors as illegal.

The mod operator is frequently used to ensure a certain value remains in a specific range starting at zero (for a divisor n the result lies in 0..n‑1). Furthermore, you will find modulo in number theory. For example, the definition of prime numbers says “not divisible by any other number”. This expression can be translated into Pascal like that:

| wording | mathematical notation | Pascal expression |
|---|---|---|
| $x$ is divisible by $d$ | $d \mid x$ | x mod d = 0 |

odd(x) is shorthand for x mod 2 <> 0.[fn 8]

## Tasks

How do you access a single character from an array[n..m] of string(c)?

Since a string(…) is basically a special case of an array (namely one consisting of char values), you can access a single character from it just like usual: v[i, p] where i is a valid index in the range n..m and p refers to the character index within 1..length(v[i]).
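The answer can be tried out with a small program. This is only a sketch under assumptions: the names stringArrayDemo, name and crew are invented for illustration, and it presumes an EP-compliant compiler. The longhand crew[2][3] is shown next to the comma notation from the answer, in case a compiler is picky about the abbreviated form.

```pascal
program stringArrayDemo(output);
type
	name = string(10);
var
	crew: array[1..3] of name;
begin
	crew[1] := 'Ada';
	crew[2] := 'Blaise';
	crew[3] := 'Niklaus';
	
	{ second string, third character: the third character of 'Blaise' is 'a' }
	writeLn(crew[2, 3]);
	{ longhand notation denoting the very same component }
	writeLn(crew[2][3]);
end.
```

Remember that the second index has to stay within 1..length(crew[i]), not within 1..crew[i].capacity.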
Write a Boolean function that returns true if, and only if a given string(…) contains non-blank characters (i. e. other characters than ' ').

The following is a quite neat implementation.

    program spaceTest(input, output);
    type
        info = string(20);

    {**
        \brief determines whether a string contains non-space characters
        \param s the string to inspect
        \return true if there are any characters other than ' '
    *}
    function containsNonBlanks(s: info): Boolean;
    begin
        containsNonBlanks := length(trim(s)) > 0;
    end;

    // … remaining code for testing purposes only …

Note that this function (correctly) returns false if supplied with an empty string (''). Alternatively you could have written:

    containsNonBlanks := '' <> s;

This requires some Pascal aficionado (someone like you) though, because in other programming languages this kind of expression would only check for empty strings.
Also, it requires a string(…) data type to work properly. Remember, in these exercises there is no “best” solution.

Write a program that reads a string(…) and transposes every letter in it by 13 positions with respect to the original character’s place in the English alphabet, and then outputs the modified version. This algorithm is known as “Caesar cipher”. For simplicity assume all input is lower-case.

This implementation makes use of multiple things you saw in this unit:

    program rotate13(input, output);
    const
        // we will only operate ("rotate") on these characters
        alphabet = 'abcdefghijklmnopqrstuvwxyz';
        offset = 13;
    type
        integerNonNegative = 0..maxInt;
        sentence = string(80);
    var
        secret: sentence;
        i, p: integerNonNegative;
    begin
        readLn(secret);

        for i := 1 to length(secret) do
        begin
            // is current character in alphabet?
            p := index(alphabet, secret[i]);
            // if so, rotate
            if p > 0 then
            begin
                // The + 1 in the end ensures that p
                // in the following expression alphabet[p]
                // is indeed always a valid index (i.e. not zero).
                p := (p - 1 + offset) mod length(alphabet) + 1;
                secret[i] := alphabet[p];
            end;
        end;

        writeLn(secret);
    end.

An implementation using a translation table (array[chr(0)..maxChar] of char) would have been acceptable, too, but care must be taken in properly populating it. Note, it is not guaranteed that expressions such as succ('A', 13) will yield the expected result. The range 'A'..'Z' is not necessarily contiguous, so you should not make any assumptions about it. If your solution makes use of that, you must document it (e. g. “This program only runs properly on computers using the ASCII character set.”).
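For comparison, the translation table variant mentioned above might look roughly like the following. This is a sketch, not a reference solution: the program name rotate13Table and the variable names table and c are assumptions made for this illustration. The table is first populated with the identity mapping, so characters outside of alphabet pass through unchanged, and only then are the letters overwritten.

```pascal
program rotate13Table(input, output);
const
	alphabet = 'abcdefghijklmnopqrstuvwxyz';
	offset = 13;
type
	sentence = string(80);
var
	{ maps every character to its replacement }
	table: array[chr(0)..maxChar] of char;
	c: char;
	i: integer;
	secret: sentence;
begin
	{ by default every character translates to itself }
	for c := chr(0) to maxChar do
	begin
		table[c] := c;
	end;
	
	{ only the characters listed in alphabet get rotated }
	for i := 1 to length(alphabet) do
	begin
		table[alphabet[i]] := alphabet[(i - 1 + offset) mod length(alphabet) + 1];
	end;
	
	readLn(secret);
	{ the translation itself is a plain table look-up }
	for i := 1 to length(secret) do
	begin
		secret[i] := table[secret[i]];
	end;
	writeLn(secret);
end.
```

Like the solution above, this makes no assumption about the letters being contiguous in the character set, because the rotation is computed on positions within alphabet.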
Write a Boolean function that determines whether a string is a palindrome, that means it can be read forward and backwards producing the same meaning/sound provided word gaps (spaces) are adjusted accordingly. For simplicity assume all characters are lower-case and there are no punctuation characters (other than whitespace).
The following is an acceptable implementation:

    program palindromes(input, output);

    type
        sentence = string(80);

    {
        \brief determines whether a lower-case sentence is a palindrome
        \param original the sentence to inspect
        \return true iff \param original can be read forward and backward
    }
    function isPalindrome(original: sentence): Boolean;
    var
        readIndex, writeIndex: integer;
        derivative: sentence;
        check: Boolean;
    begin
        check := true;

        // “sentences” that have a length of one, or even zero characters
        // are always palindromes
        if length(original) > 1 then
        begin
            // ensure derivative has the same length as original
            derivative := original;
            // the contents are irrelevant, alternatively [in EP] you could’ve used
            //writeStr(derivative, '':length(original));
            // which would’ve saved us the “fill the rest with blanks” step below
            writeIndex := 1;

            // strip blanks
            for readIndex := 1 to length(original) do
            begin
                // only copy significant characters
                if not (original[readIndex] in [' ']) then
                begin
                    derivative[writeIndex] := original[readIndex];
                    writeIndex := writeIndex + 1;
                end;
            end;

            // fill the rest with blanks
            for writeIndex := writeIndex to length(derivative) do
            begin
                derivative[writeIndex] := ' ';
            end;

            // remove trailing blanks and thus shorten length
            derivative := trim(derivative);

            for readIndex := 1 to length(derivative) div 2 do
            begin
                check := check and (derivative[readIndex] =
                    derivative[length(derivative) - readIndex + 1]);
            end;
        end;

        isPalindrome := check;
    end;

    var
        mystery: sentence;
    begin
        writeLn('Enter a sentence that is possibly a palindrome (no caps):');
        readLn(mystery);
        writeLn('The sentence you have entered is a palindrome: ', isPalindrome(mystery));
    end.

Ensure that you have understood that you can only set a string’s length by making a direct “complete” assignment (confer § Index). Notice how this implementation uses an altered, filtered copy of the original string. For demonstration purposes the example shows if not (original[readIndex] in [' ']) then.
In fact an explicit set list would have been more adequate, i. e. if original[readIndex] in ['a', 'b', 'c', …, 'z'] then. Do not worry if you simply wrote something to the effect of if original[readIndex] <> ' ' then, this is just as fine given the task’s requirements.

Without trying it out, what is the result of LT('', '')?

Now you are allowed to try it out.

Write a function that determines whether a year in the Gregorian calendar is a leap year. Every fourth year is a leap year, but every hundredth year is not, unless it is the fourth century in a row.

This task is a prime example for the mod operator you just saw:

    {
        \brief determines whether a year is a leap year in Gregorian calendar
        \param x the year to inspect
        \return true, if and only if \param x meets leap year conditions
    }
    function leapYear(x: integer): Boolean;
    begin
        leapYear := (x mod 4 = 0) and (x mod 100 <> 0) or (x mod 400 = 0);
    end;

Note there is an already pre-made function isLeapYear in Delphi’s and the FPC’s sysUtils unit or in GPC’s GPC module. Whenever possible reuse code already available.
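To convince yourself that the three conditions interact as intended, the function can be wrapped in a tiny test program. The following is merely a sketch: the program name leapYearDemo and the chosen sample years are assumptions of this illustration. Keep in mind that and binds more tightly than or, so the expression reads “(divisible by 4 and not by 100) or divisible by 400”.

```pascal
program leapYearDemo(output);

{ same function as in the solution above }
function leapYear(x: integer): Boolean;
begin
	leapYear := (x mod 4 = 0) and (x mod 100 <> 0) or (x mod 400 = 0);
end;

begin
	{ divisible by 100, but not by 400: not a leap year }
	writeLn(1900, ' ', leapYear(1900));
	{ divisible by 400: leap year }
	writeLn(2000, ' ', leapYear(2000));
	{ not divisible by 4: not a leap year }
	writeLn(2023, ' ', leapYear(2023));
	{ divisible by 4, but not by 100: leap year }
	writeLn(2024, ' ', leapYear(2024));
end.
```

The exact layout of the printed integer and Boolean values depends on the compiler’s default field widths, which is why no verbatim output is shown here.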
After writing a function returning the leap year property of a year, write a binary function returning the number of days in a given month and year.

This is a typical case for a case-statement. Recall that there must be exactly one assignment to the result variable:

    type
        { a valid day number in Gregorian calendar month }
        day = 1..31;
        { a valid month number in Gregorian calendar year }
        month = 1..12;

    {
        \brief determines the number of days in a month of a given Gregorian year
        \param m the month whose day number count is requested
        \param y the year (relevant for leap years)
        \return the number of days in a given month and year
    }
    function daysInMonth(m: month; y: integer): day;
    begin
        case m of
            1, 3, 5, 7, 8, 10, 12:
            begin
                daysInMonth := 31;
            end;
            4, 6, 9, 11:
            begin
                daysInMonth := 30;
            end;
            2:
            begin
                daysInMonth := 28 + ord(leapYear(y));
            end;
        end;
    end;

Note, Delphi’s and the FPC’s compatible dateUtils unit provide a function called daysInAMonth. You are strongly encouraged to reuse it instead of your own code.

More exercises can be found in:

Notes:

1. In fact this is a discrimination of, what EP calls “schema”.
Schemata will be explained in detail in the Extensions Part of this Wikibook.
2. This functionality is useful if you are handling constants you or someone might change at some point. Per definition the literal value ' ' is a char value, whereas '' (“null-string”) or '42' are string literals. In order to write generic code, length accepts all kinds of values that could denote a finite sequence of char values.
3. In fact the definition essentially is packed array[1..capacity] of char.
4. This means, in the case of empty strings, only the following function call could be legal subStr('', 1, 0). It goes without saying that such a function call is very useless.
5. The string variable may not be bindable when using this notation.
6. Omitting the third parameter in the case of empty strings or characters is not allowed. subStr('', 1) is illegal, because there is no “character 1” in an empty string. Also, subStr('Z', 1) is not allowed, because 'Z' is a char-expression and as such always has a length of 1, rendering any need for a “give me the rest of/subsequent characters of” function obsolete.
7. If you are a GPC user, you will need to ensure you are in a fully-EP-compliant mode for example by specifying ‑‑extended‑pascal on the command line. Otherwise, no padding occurs. The Standard (unextended) Pascal, as per ISO standard 7185, does not define any padding algorithm.
8. The actual implementation of odd may be different. On many processor architectures it is usually something to the effect of the x86 instruction and x, 1.

# Records

The key to successful programming is finding the "right" structure of data and program. (Niklaus Wirth[1])

After you have learned to use an array, this chapter introduces you to another data type structure concept called record. Like an array, the use of records primarily serves the purposes of allowing you to write clean, structured programs. It is otherwise optional.
## Concept

You briefly saw a record in the first chapter. While an array is a homogenous aggregation of data, that means all members have to have the same base data type, a record is potentially, but not necessarily an aggregation of data having various different data types.[2]

### Declaration

A record data type declaration looks pretty much like a collection of variable declarations:

    program recordDemo;
    type
        (* a standard line on a text console *)
        line = string(80);
        (* 1st grade through 12th grade *)
        grade = 1..12;

        (* encapsulate all administrative data in one structure *)
        student = record
                firstname: line;
                lastname: line;
                level: grade;
            end;

The declaration begins with the word record and ends with end. In between you declare the fields, also called members or member elements, of the entire record.

Here again the semicolon has the function of separating members. The keyword end will actually terminate a record declaration. Note how in the following correct example there is no semicolon after the last member’s declaration:

    program recordSemicolonDemo;
    type
        sphere = record
                radius: real;
                volume: real;
                surface: real
            end;

Despite that, it is a frequent practice to put a semicolon there anyway, even though it is not necessary. You would otherwise too often simply add a new member below the last line, forgetting to add the semicolon in the preceding line and thus provoking a syntax error.

All record members have to bear distinct names within the record declaration itself. For instance in the example above, declaring two member elements of the name level will be rejected.

There is no requirement on how many fields you have to declare. An “empty” record is also possible:[fn 1]

    type
        emptyRecord = record
            end;

### Many fields of the same data type

Similar to the declaration of variables you can define multiple fields of the same data type at once by separating identifiers with a comma.
The previous declaration of sphere could also be written as:

    type
        sphere = record
                radius, volume, surface: real;
            end;

Most Pascal veterans and style guides, however, discourage the use of this shorthand notation (both for variable as well as record declarations, but also in formal parameter lists). It is only reasonable if all declared identifiers absolutely always have the same data type; it is virtually guaranteed you will never want to change the data type of just one field in the comma-separated list. If in doubt, use the longhand. In programming, convenience plays a tangential role.

### Use

By declaring a record variable you immediately have the entire set of “sub”‑variables at your hand. Accessing them is done by specifying the record variable’s name, plus a dot (.), followed by the record field’s name:

    var
        posterStudent: student;
    begin
        posterStudent.firstname := 'Holden';
        posterStudent.lastname := 'Caulfield';
        posterStudent.level := 10;
    end.

You already saw the dot notation in the previous chapter on strings, where appending .capacity on a name of a string(…) variable refers to the respective variable’s character capacity. This is not a coincidence.

However, especially beginners occasionally confuse the data type name with the variable’s name. The following code highlights the difference. Remember that a data type declaration does not reserve any memory and is mainly informative for the compiler, whereas a variable declaration actually sets some chunk of memory aside.

    program dotNoGo(output);
    { This program does not compile. }
    type
        line = string(80);
        quizItem = record
                question: line;
                answer: line;
            end;
    var
        response: line;
        challenge: quizItem;
    begin
        writeLn(line.capacity);      { ↯ line is not a variable }
        writeLn(response.capacity);  { ✔ correct }

        writeLn(quizItem.question);  { ↯ quizItem refers to a data type }
        { Data type declarations (as per definition) do not reserve any memory }
        { thus you cannot “read/write” from/to a data type. }
        writeLn(challenge.question); { ✔ correct }
    end.

And, as it has always been, you first need to assign a value to a variable before you are allowed to read it. The source code above ignores that to focus on the main issue. The key point is, the dot (.) notation is only valid if there is memory.[fn 2]

### Advantages

But why and when do we want to use a record? At first glance and in the given examples so far it may seem like a troublesome way to declare and use multiple variables. Yet the fact that a record is handled as one unit entails one big advantage:

• You can copy entire record values via a simple assignment (:=).
• This means you can pass much data at once: A record can be a parameter of routines, and in EP functions can return them as well.[fn 3]

Evidently you want to group data together that always appear together. It does not make sense to group unrelated data, just because we can. Another quite useful advantage is presented below in the section on variant records.

## Routing override

As you saw earlier, referring to members of a record can get a little tedious, because we are repeating the variable name over and over again. Fortunately, Pascal allows us to abbreviate things a bit.

### With-clause

The with-clause allows us to eliminate repeating a common prefix, specifically the name of a record variable.[3]

    begin
        with posterStudent do
        begin
            firstname := 'Holden';
            lastname := 'Caulfield';
            level := 10;
        end;
    end.

All identifiers that identify values are first looked for in the record scope of posterStudent. If there is no match, all variable identifiers outside of the given record are considered too.

Of course it is still possible to denote a record member by its full name. E. g. in the source code above it would be perfectly legal to still write posterStudent.level within the with-clause. Concededly, this would defeat the purpose of the with-clause, but sometimes it may still be beneficial to emphasize the specific record variable just for documentation.
It is nevertheless important to understand that the FQI, the fully-qualified identifier, the one with a dot in it, does not lose its “validity”.

In principle, all components of structured values “containing dots” can be abbreviated with with. This is also true for the data type string you have learned about in the previous chapter.

program withDemo(input, output);
type
  { Deutsche Post „Maxi-Telegramm“ }
  telegram = string(480);
var
  post: telegram;
begin
  with post do
  begin
    writeLn('Enter your telegram. ',
      'Maximum length = ', capacity, ' characters.');
    readLn(post);
    { … }
  end;
end.

Here, within the with-clause, both capacity and post.capacity refer to post.capacity.

### Multiple levels

If multiple with-clauses ought to be nested, there is the short notation:

with snakeOil, sharpTools do
begin
  …
end;

which is equivalent to:

with snakeOil do
begin
  with sharpTools do
  begin
    …
  end;
end;

It is important to bear in mind that identifiers are searched in sharpTools first, and only if there is no match are the identifiers in snakeOil considered.

## Variant records

In Pascal a record is the only data type structure concept that allows you to, so to speak, alter its structure during run-time, while a program is running. This very practical property of record permits us to write versatile code covering many cases.

### Declaration

Let’s take a look at an example:

type
  centimeter = 10..199;

  { order of female, male has been chosen, so ord(sex) }
  { returns the [minimum] number of non-defective Y chromosomes }
  sex = (female, male);

  { measurements according to EN 13402 size designation of clothes [incomplete] }
  clothingSize = record
    shoulderWidth: centimeter;
    armLength: centimeter;
    bustGirth: centimeter;
    waistSize: centimeter;
    hipMeasurement: centimeter;
    case body: sex of
      female: (
        underbustMeasure: centimeter;
      );
      male: (
      );
  end;

The variant part of a record starts with the keyword case, which you already know from selections.
After that follows a record member declaration, the variant selector, but instead of a semicolon you put the keyword of after it. Below that follow all possible variants. Each variant is marked by a value out of the variant selector’s domain, here female and male. Separated by a colon (:) follows a variant denoter surrounded by parentheses. Here you can list additional record members that are only available if a certain variant is “active”. Note that all identifiers across all alternatives must be unique. The individual variants are separated by semicolons, and there can be at most one variant part, which has to appear at the end. Because you need to be able to list all possible variants, the variant selector has to be an ordinal data type.

### Use

Using variant records requires you to first select a variant. Variants are “activated” by assigning a value to the variant selector. Note, variants are not “created”; they all already exist at program startup. You merely need to make a choice.

boobarella.body := female;
boobarella.underbustMeasure := 69;

Only after assigning a value to the variant selector, and as long as this value remains unchanged, are you allowed to access any fields of the respective variant. It is illegal to reverse the previous two lines of code and attempt to access the underbustMeasure field while body is not defined yet and, more importantly, does not bear the value female.

It is certainly permissible to change the variant selector later in your program and then use a different variant, but all previously stored values in the variant part lose their validity and you cannot restore them. If you switch the variant back to a previous, original value, you will need to assign all values in that variant anew.

### Application

This concept opens up new horizons: You can design your programs more interactively in a neat fashion. You can now choose a variant based on run-time data (data that is read while the program is running).
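A minimal sketch of this idea follows. The names shape, shapeKind, figure and area are hypothetical and not part of this chapter’s examples. To keep the sketch self-contained the variant selector is assigned in the source code; in a real program the choice could just as well depend on a value read from input.

```pascal
program variantDemo(output);
type
  shapeKind = (circle, rectangle);
  { hypothetical record: which fields exist depends on kind }
  shape = record
    case kind: shapeKind of
      circle: (
        radius: real
      );
      rectangle: (
        width: real;
        height: real
      );
  end;
var
  figure: shape;

{ Inspects the variant selector before touching any variant field. }
function area(s: shape): real;
begin
  case s.kind of
    circle:
      area := 3.14159 * s.radius * s.radius;
    rectangle:
      area := s.width * s.height;
  end;
end;

begin
  { activate the rectangle variant, then fill in its fields }
  figure.kind := rectangle;
  figure.width := 3.0;
  figure.height := 2.0;
  writeLn(area(figure):1:2);

  { switching the variant: width and height are no longer valid }
  figure.kind := circle;
  figure.radius := 1.0;
  writeLn(area(figure):1:2);
end.
```

Note how area uses a case-statement on the selector, so it only ever reads fields of the variant that is currently active.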
Because at any time (after the first assignment of a value to the variant selector) only one variant is “active”, your program will crash if it attempts reading/writing values of an “inactive” variant. This is a desirable behavior, because that is the whole idea of having distinct variants. It guarantees your program’s overall integrity.

### Anonymous variants

Pascal also permits having anonymous variant selectors, that is, selectors not bearing any name. The implications are
• you cannot explicitly select (nor query) any variant, so
• in turn all variants are considered “active” at the same time.

“But wasn’t this the object of the exercise?” you might ask. Yes, indeed, since there is no named selector your program cannot keep track of which variant is supposed to work and which one is “defective”. You are responsible for determining which variant you can sensibly read/write at present.

Anonymous variants are/were frequently abused to implement “typecasts”. If you have an anonymous variant part, you can declare members bearing different data types, which in turn determine how the underlying data is interpreted. You can then exploit the fact that many (but not necessarily all) compilers put all variants in the same memory block.

Code:

program anonymousVariantsDemo(output);
type
  bitIndex = 0..(sizeOf(integer) * 8 - 1);
  exposedInteger = record
    case Boolean of
      false: (
        value: integer;
      );
      true: (
        bit: set of bitIndex;
      );
  end;
var
  i: exposedInteger;
begin
  i.bit := [4];
  writeLn(i.value);
end.

Output:

16

The value 16 is (and this should be considered “a coincidence”) $2^{4}$. We stress that none of the Pascal standards makes any statement regarding internal memory structure. A high-level programming language is not concerned with how data is stored; it does not even know the notion of “bits” or “voltage high”/“voltage low”.
Thus, if you are (intentionally) using any of this demonstrated behavior, you cannot say “I am programming in Pascal” anymore; you are programming specifically for compiler so-and-so. The memory layout of data structures varies among Pascal implementations. The example above, for instance, was designed for and works with the GPC and the FPC in their default configurations. Do not regard it as “Pascal”, but as a descendant of it. There is a good chance that using a different compiler will produce different results.

This concept exists in many other programming languages too. In the programming language C, for instance, it is called a union.

## Conditional loops

So far we have been exclusively using counting loops. This is great if you can predict in advance the number of iterations, how many times the loop’s body needs to be executed. Yet every so often it is not possible to formulate a proper expression determining the number of iterations in advance.

Conditional loops allow you to make the execution of the next iteration dependent on a Boolean expression. They come in two flavors:
• Head-controlled loop, and
• tail-controlled loop.

The difference is that the body of a tail-controlled loop is executed at least once in any case, whereas a head-controlled loop might never execute the loop body at all. In either case, a condition is evaluated over and over again and must hold for the loop to continue.

### Head-controlled loop

A head-controlled loop is frequently called a while-loop because of its syntax. The “control” condition appears above the loop body, i. e. at the head.

Code:

program characterCount(input, output);
type
  integerNonNegative = 0..maxInt;
var
  c: char;
  n: integerNonNegative;
begin
  n := 0;

  while not EOF do
  begin
    read(c);
    n := n + 1;
  end;

  writeLn('There are ', n:1, ' characters.');
end.

Output:

$ cat ./characterCount.pas | ./characterCount
There are 240 characters.
$ printf '' '' | ./characterCount
There are 0 characters.

The loop’s condition is a Boolean expression framed by the words while and do. The condition must evaluate to true for any (subsequent) iteration to occur. As you can see from the output, in the second case the number of iterations may even be zero: Evidently for empty input n := n + 1 was never executed.

EOF is shorthand for EOF(input). This standard function returns true if there is no further data available to read, commonly called end of file. It is illegal, and will horribly fail, to read from a file if the respective EOF function call returns true.

Unlike a counting loop, you are allowed to modify data the conditional loop’s condition depends on.

const
  (* instead of a hard-coded length 64 *)
  (* you can write sizeOf(integer) * 8 in Delphi, FPC, GPC *)
  wordWidth = 64;
type
  integerNonNegative = 0..maxInt;
  wordStringIndex = 1..wordWidth;
  wordString = array[wordStringIndex] of char;

function binaryString(n: integerNonNegative): wordString;
var
  (* temporary result *)
  binary: wordString;
  i: wordStringIndex;
begin
  (* initialize binary with blanks *)
  for i := 1 to wordWidth do
  begin
    binary[i] := ' ';
  end;
  (* the value of i is undefined after a for-loop, so set it explicitly *)
  i := wordWidth;
  (* if n _is_ zero, the loop's body won't be executed *)
  binary[i] := '0';

  (* reverse Horner's scheme *)
  while n >= 1 do
  begin
    binary[i] := chr(ord('0') + n mod 2);
    n := n div 2;
    i := i - 1;
  end;

  binaryString := binary;
end;

The n the loop’s condition depends on will be repeatedly divided by two. Because the division operator is an integer division (div), at some point the value 1 will be divided by two and the arithmetically correct result 0.5 is truncated (trunc) toward zero. Yet the value 0 does not satisfy the loop’s condition anymore, thus there will not be any subsequent iterations.

### Tail-controlled loop

In a tail-controlled loop the condition appears below the loop’s body, at the foot. The loop’s body is always run once before the condition is evaluated at all.
program repeatDemo(input, output);
var
  i: integer;
begin
  repeat
  begin
    write('Enter a positive number: ');
    readLn(i);
  end
  until i > 0;

  writeLn('Wow! ', i:1, ' is a quite positive number.');
end.

The loop’s body is encapsulated by the keywords repeat and until.[fn 4] After until follows a Boolean expression. In contrast to a while loop, the tail-controlled loop keeps going until the specified condition becomes true. A true condition marks the end. In the above example the user will be prompted again and again until they eventually comply and enter a positive number.

## Date and time

This section introduces you to features of Extended Pascal as defined in the ISO standard 10206. You will need an EP‑compliant compiler to use those features.

### Time stamp

In EP there is a standard data type called timeStamp. It is declared as follows:[fn 5]

type
  timeStamp = record
    dateValid: Boolean;
    timeValid: Boolean;
    year: integer;
    month: 1..12;
    day: 1..31;
    hour: 0..23;
    minute: 0..59;
    second: 0..59;
  end;

As you can see from the declaration, timeStamp also contains data fields for a calendar date, not just the time as indicated by a standard clock.

A processor (i. e. usually a compiler) may provide additional (thus non-standard) fields. The GPC for instance supplies, among other fields, a field called timeZone indicating the offset in seconds versus UTC (“world time”).

### Getting a time stamp

EP also defines a unary procedure that populates a timeStamp variable with values. GetTimeStamp assigns values to all members of a timeStamp record passed in the first (and only) parameter. These values represent the “current date” and “current time” as at the invocation of this procedure. However, in the 1980s not all (personal/home) computers had a built-in “real time” clock. Therefore, the ISO standard 10206, devised before the 21st century, stated that the word “current” was “implementation-defined”.
The dateValid and timeValid fields were specifically inserted to address the issue that some computers simply do not know the current date and/or time. When reading values from a timeStamp variable, it is still advisable to check their validity first after having getTimeStamp fill them out.

If getTimeStamp was unable to obtain a “valid” value, it will set
• day, month and year to a value representing January 1, 1 CE, but also dateValid to false.
• In the case of time, hour, minute and second all become 0, a value representing midnight. The timeValid field becomes false.

Both are independent of each other, so it may certainly be the case that just the time could be determined, but the date is invalid.

Note that the Gregorian calendar was introduced during the year 1582 CE, so the timeStamp data type is generally useless for any dates before 1583 CE.

### Printable dates and times

Having obtained a timeStamp, EP furthermore supplies two unary functions:
• date returns a human-readable string representation of day, month and year, and
• time returns a human-readable string representation of hour, minute and second.

Both functions will fail and terminate the program if dateValid or timeValid, respectively, indicates an invalid datum. Note, the exact format of the string representation is not defined by the ISO standard 10206.

Putting things together, consider the following program:

Code:

program dateAndTimeFun(output);
var
  ts: timeStamp;
begin
  getTimeStamp(ts);

  if ts.dateValid then
  begin
    writeLn('Today is ', date(ts), '.');
  end;

  if ts.timeValid then
  begin
    writeLn('Now it is ', time(ts), '.');
  end;
end.

Output:

Today is 11 Nov 2022.
Now it is 03:26:42.

The output may differ. Here, the GPC was used and the hardware had an RTC. Needless to say, you might see nothing at all if both dateValid and timeValid are false.

## Summary on loops

This is a good time to take inventory and reiterate all kinds of loops.
### Conditional loops

Conditional loops are the tools of choice if you cannot predict the total number of iterations.

Comparison of conditional loops in Pascal:
• Head-controlled loop: while condition do begin … end; The condition must evaluate to true for any (including subsequent) iterations to occur.
• Tail-controlled loop: repeat begin … end until condition; The condition must be false for any subsequent iteration to occur.

It is possible to formulate either loop as the other one, but usually one of them is more suitable. A tail-controlled loop is particularly suitable if you do not have any data yet to make a judgment, to evaluate a proper condition prior to the first iteration.

### Counting loops

Counting loops are good if you can predict the total number of iterations before entering the loop.

Comparison of counting loop directions in Pascal:
• Counting up loop: for controlVariable := initialValue to finalValue do begin … end; After each non-final iteration controlVariable becomes succ(controlVariable). controlVariable must be less than or equal to finalValue for another iteration to occur.
• Counting down loop: for controlVariable := initialValue downto finalValue do begin … end; After each non-final iteration controlVariable becomes pred(controlVariable). controlVariable must be greater than or equal to finalValue for another iteration to occur.

Both the initialValue and the finalValue expressions are evaluated exactly once.[4] This is very different from conditional loops.

Inside counting loops’ bodies you cannot modify the counting variable, only read it. This prevents any accidental manipulations and ensures the predicted total number of iterations will indeed occur.

It is not guaranteed that controlVariable is finalValue “after” the loop. If there were exactly zero iterations, no assignments to controlVariable were made.
Thus, generally presume controlVariable is invalid/uninitialized after a for-loop unless you are absolutely sure there was at least one iteration.

### Loops on aggregations

If you are using an EP-compliant compiler, you furthermore have the option to use a for … in loop on sets.

program forInDemo(output);
type
  characters = set of char;
var
  c: char;
  parties: characters;
begin
  parties := ['R', 'D'];
  for c in parties do
  begin
    write(c:2);
  end;
  writeLn;
end.

## Tasks

You have made it this far, and it is quite impressive how much you already know. Since this chapter’s concept of a record should not be too difficult to grasp, the following exercises mainly focus on training. A professional computer programmer spends most of their time thinking about what kind of implementation, using which tools (e. g. array “vs.” set), is the most useful/reasonable. You are encouraged to think first, before you even start typing anything. Nonetheless, sometimes (esp. due to your lack of experience) you need to just try things out, which is fine if it is intentional. Aimlessly stumbling upon a solution does not distinguish an actual programmer.

Can you have a record contain another record?

Just as it was possible for an array to contain another array, this is quite possible for a record too. Write a test program to see for yourself. The important thing to note is that the dot-notation can be expanded indefinitely (myRecordVariable.topRecordFieldName.nestedRecordFieldName.doubleNestedRecordFieldName). Evidently at some point it becomes too difficult to read, so use this wisely.
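One possible test program for the nested-record question might look like the following sketch. The record calendarDate, the reduced student record and the field name enrolled are hypothetical, and the date values are arbitrary sample data.

```pascal
program nestedRecordDemo(output);
type
  { hypothetical inner record }
  calendarDate = record
    year: integer;
    month: 1..12;
    day: 1..31;
  end;
  { hypothetical outer record containing the inner one }
  student = record
    level: integer;
    enrolled: calendarDate;
  end;
var
  posterStudent: student;
begin
  posterStudent.level := 10;
  { the dot notation is simply chained for nested fields }
  posterStudent.enrolled.year := 2001;
  posterStudent.enrolled.month := 9;
  posterStudent.enrolled.day := 3;

  { with-clauses work on nested records as well }
  with posterStudent.enrolled do
  begin
    writeLn(year:1, '-', month:1, '-', day:1);
  end;
end.
```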
Write a loop that never ends, that is, a loop that cannot possibly terminate. If your test program does not terminate, you most likely have achieved this task. On a standard Linux terminal you can then press Ctrl+C to forcefully kill the program.

There are two flavors of infinite loops:

while true do
begin
  …
end;

The condition needs to be negated in a repeat … until loop:

repeat
begin
  …
end
until false;

Infinite loops are very undesirable. While constant expressions like in these examples are easy to spot, tautologies, expressions that always evaluate to true, or expressions that can never be fulfilled (in the case of a repeat … until loop), are not. For instance, given that i is an integer, the loop while i <= maxInt do will run indefinitely, because i can never exceed maxInt[fn 6] and thus break the loop’s condition. Therefore be reminded to carefully formulate expressions for conditional loops and ensure the loop will eventually reach a terminating state. Otherwise it can be frustrating for the user of your program.
Rewrite the following loop as a while-loop:

repeat
begin
  imagineJumpingSheep;
  sheepCount := sheepCount + 1;
  waitTwoSeconds;
end
until asleep;

The important thing to realize is that the entire loop body is repeated before the while-loop even begins:

imagineJumpingSheep;
sheepCount := sheepCount + 1;
waitTwoSeconds;

while not asleep do
begin
  imagineJumpingSheep;
  sheepCount := sheepCount + 1;
  waitTwoSeconds;
end;

Do not forget to negate the condition when transforming a conditional loop into the other kind. Obviously the repeat … until-loop is more suitable in this case.

If you are using a Linux or FreeBSD OS and an EP‑compliant compiler: Write a program that takes the output of the command getent passwd as input and only prints the first field/column of every line. In a passwd(5) file, fields are separated by a colon (:). Your program will list all known user names.

You can run the following program with the command getent passwd | ./cut1 (the file name of your executable program may differ).

program cut1(input, output);
const
  separator = ':';
var
  line: string(80);
begin
  while not EOF(input) do
  begin
    { This reads the _complete_ line, but at most }
    { line.capacity characters are actually saved. }
    readLn(line);
    writeLn(line[1..index(line, separator)-1]);
  end;
end.

Remember that index will return the index of the colon character, which you do not want to print, thus you will need to subtract 1 from its result. This program will evidently fail if a line does not contain a colon.
Based on your previous solution, extend your program so it only prints user names whose UID is greater than or equal to 1000. The UID is stored in the third field.

A comment from the previous source code has been omitted.

program cut2(input, output);
const
  separator = ':';
  minimumID = 1000;
var
  line: string(80);
  nameFinalCharacter: integer;
  uid: integer;
begin
  while not EOF do
  begin
    readLn(line);
    nameFinalCharacter := index(line, separator) - 1;
    { username:encryptedpassword:usernumber:… }
    {         ↑ nameFinalCharacter + 1 }
    {          ↑ … + 2 is the index of the 1st password character }
    uid := index(subStr(line, nameFinalCharacter + 2), separator);
    { Note that the preceding index did not operate on line }
    { but an altered/different/independent “copy” of it. }
    { This means, we’ll need to offset the returned index once again. }
    readStr(subStr(line, nameFinalCharacter + 2 + uid), uid);
    { Read/readLn/readStr automatically terminate reading an integer }
    { number from the source if a non-digit character is encountered. }
    { (Preceding blanks/space characters are ignored and }
    { the _first_ character still may be a sign, that is + or -.) }
    if uid >= minimumID then
    begin
      writeLn(line[1..nameFinalCharacter]);
    end;
  end;
end.
Recall from the previous chapter that the third parameter in subStr can be omitted effectively meaning “give me the rest of a string.” Note that this programming task mimics (some of) the behavior of cut(1). Use programs/source code that has already been programmed for you whenever possible. Reinventing the wheel is not necessary. Nonetheless, this basic task is a good exercise. On a RHEL system you may rather want to set minimumID to 500.
Write a prime sieve. One routine does the calculations, another routine prints them. This exercise’s goal is to give you an opportunity to type, to write an adequate program. If necessary, you can peek at existing implementations, but still write it on your own, adding your own comments to the source code.

The following program meets all requirements. Note, an implementation using an array[1..limit] of Boolean would have been perfectly fine as well, although the set of natural implementation shown here is in principle preferred.

program eratosthenes(output);

type
{ in Delphi or FPC you will need to write 1..255 }
natural = 1..4095;
{$setLimit 4096}{ only in GPC }
naturals = set of natural;

const
{ high is a Borland Pascal (BP) extension. }
{ It is available in Delphi, FPC and GPC. }
limit = high(natural);

{ Note: It is important that primes is declared }
{ in front of sieve and list, so both of these }
{ routines can access the _same_ variable. }
var
primes: naturals;

{ This procedure sieves the primes set. }
{ The primes set needs to be fully populated }
{ _before_ calling this routine. }
procedure sieve;
var
n: natural;
i: integer;
multiples: naturals;
begin
{ 1 is by definition not a prime number }
primes := primes - [1];

{ find the next non-crossed number }
for n := 2 to limit do
begin
if n in primes then
begin
multiples := [];
{ We do _not_ want to remove 1 * n. }
i := 2 * n;
while i in [n..limit] do
begin
multiples := multiples + [i];
i := i + n;
end;

primes := primes - multiples;
end;
end;
end;

{ This procedure lists all numbers in primes }
{ and enumerates them. }
procedure list;
var
count, n: natural;
begin
count := 1;

for n := 2 to limit do
begin
if n in primes then
begin
writeLn(count:8, '.:', n:22);
count := count + 1;
end;
end;
end;

{ === MAIN program === }
begin
primes := [1..limit];
sieve;
list;
end.
Appreciate the fact that because you have separated the sieve task from the list task, both routine definitions and the main part of the program at the bottom remain quite short and are thus easier to understand.
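For comparison, the array[1..limit] of Boolean variant mentioned above could look like the following sketch. It is not the reference solution: the names isPrime and eratosthenesArray are made up for this illustration, limit is deliberately small so the output fits on one line, and only Standard Pascal features are used.

```pascal
program eratosthenesArray(output);
const
  { deliberately small, hypothetical limit }
  limit = 50;
type
  natural = 1..limit;
var
  { isPrime[n] is true as long as n has not been crossed out }
  isPrime: array[natural] of Boolean;
  k: natural;

{ Crosses out all proper multiples of every number still marked. }
procedure sieve;
var
  n: natural;
  i: integer;
begin
  { 1 is by definition not a prime number }
  isPrime[1] := false;

  for n := 2 to limit do
  begin
    if isPrime[n] then
    begin
      { We do _not_ want to cross out 1 * n. }
      i := 2 * n;
      while i <= limit do
      begin
        isPrime[i] := false;
        i := i + n;
      end;
    end;
  end;
end;

{ Prints all numbers that are still marked as prime on one line. }
procedure list;
var
  n: natural;
begin
  for n := 2 to limit do
  begin
    if isPrime[n] then
    begin
      write(n:3);
    end;
  end;
  writeLn;
end;

{ === MAIN program === }
begin
  { initially every number is a candidate }
  for k := 1 to limit do
  begin
    isPrime[k] := true;
  end;
  sieve;
  list;
end.
```

The array needs an explicit initialization loop where the set version used a single assignment, which is one reason the set implementation reads more compactly.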
Write a program that reads an arbitrary number of numerical values (not known in advance) from input and at the end prints on output the arithmetic mean.

program arithmeticMean(input, output);
type
  integerNonNegative = 0..maxInt;
var
  i, sum: real;
  count: integerNonNegative;
begin
  sum := 0.0;
  count := 0;

  while not eof(input) do
  begin
    readLn(i);
    sum := sum + i;
    count := count + 1;
  end;

  { count > 0: do not do division by zero. }
  if count > 0 then
  begin
    writeLn(sum / count);
  end;
end.
Note that using a data type excluding negative numbers (here we named it integerNonNegative) mitigates the issue that count may flip its sign, a condition known as overflow. Instead, the program will fail if count := count + 1 becomes too large and effectively falls out of the range 0..maxInt. There is, despite maxReal, no programmatic way to tell that sum became too large or too small, rendering it severely inaccurate, because any value of sum may be legitimate nevertheless.

This task is a fine exercise for those using an EP-compliant compiler: Write a time function that returns a string in the “American” time format 9:04 PM. This may look easy at first, but it can become quite a challenge. Have fun!

A smart person would try to reuse time itself.
However, the output of time itself is not standardized, so we will need to define everything ourselves:

type
  timePrint = string(8);

function timeAmerican(ts: timeStamp): timePrint;
const
  hourMinuteSeparator = ':';
  anteMeridiemAbbreviation = 'AM';
  postMeridiemAbbreviation = 'PM';
type
  noonRelation = (beforeNoon, afterNoon);
  letterPair = string(2);
var
  { contains 'AM' and 'PM' accessible via an index }
  m: array[noonRelation] of letterPair;
  { contains a leading zero accessible via a Boolean expression }
  z: array[Boolean] of letterPair;
  { holds temporary result }
  t: timePrint;
begin
  { fill t with spaces }
  writeStr(t, '':t.capacity);

This fallback value (in the case ts.timeValid is false) allows the programmer/“user” of this function to “blindly” print its return value. There will be a noticeable gap in the output. Another sensible “fallback” value would be an empty string.

  with ts do
  begin
    if timeValid then
    begin
      m[beforeNoon] := anteMeridiemAbbreviation;
      m[afterNoon] := postMeridiemAbbreviation;
      z[false] := '';
      z[true] := '0';

      writeStr(t,
        ((hour + 12 * ord(hour = 0) - 12 * ord(hour > 12)) mod 13):1,
        hourMinuteSeparator,
        z[minute < 10], minute:1,
        ' ', m[succ(beforeNoon, hour div 12)]);

This is the most complicated part of this problem. First of all, all number parameters to writeStr are explicitly suffixed with :1 as the minimum-width specification, because there are some compilers that would otherwise assume, for example, :20 as a default value. Since we know that timeStamp.hour is in the range 0..23, we can use the div and mod operations as demonstrated. However, we need to account for an hour value of 0, which is usually denoted as 12:00 AM (and not zero). A conditional “shift” by 12 using the shown Boolean expression and ord “fixes” this. Furthermore, here is a brief reminder that in EP the succ function accepts a second parameter.

    end;
  end;

  timeAmerican := t;
end;

Finally, we need to copy our temporary result to the function result variable.
Remember there must be exactly one assignment, although not all compilers enforce this rule.

Sources:

1. Wirth, Niklaus (1979). "The Module: a system structuring facility in high-level programming languages". Proceedings of the symposium on language design and programming methodology. Berlin, Heidelberg: Springer. Abstract. doi:10.1007/3-540-09745-7_1. ISBN 978-3-540-09745-7. Retrieved 2021-10-26.
2. Cooper, Doug. "Chapter 11. The record Type". Oh! Pascal! (3rd ed.). p. 374. ISBN 0-393-96077-3. "[…] records have two unique aspects: First, the stored values can have different types. This makes records potentially heterogeneous—composed of values of different kinds. Arrays, in contrast, hold values of just one type, so they’re said to be homogeneous. […]"
3. Wirth, Niklaus (July 1973). The Programming Language Pascal (Revised Report ed.). p. 30. "Within the component statement of the with statement, the components (fields) of the record variable specified by the with clause can be denoted by their field identifier only, i.e. without preceding them with the denotation of the entire record variable."
4. Jensen, Kathleen; Wirth, Niklaus. Pascal – user manual and report (4th revised ed.). p. 39. doi:10.1007/978-1-4612-4450-9. ISBN 978-0-387-97649-5. "The initial and final values are evaluated only once."

Notes:

1. This kind of record will not be able to store anything. In the next chapter you will learn a (and the only) instance it could be useful.
2. Indeed most compilers consider the dot as a dereferencing indicator and the field name denotes a static offset from a base memory address.
3. In Standard (“unextended”) Pascal, ISO standard 7185, a function can only return “simple data type” and “pointer data type” values.
4. Actually the shown begin … end is redundant since repeat … until constitute a frame in their own right. For pedagogical reasons we teach you to always use begin … end nevertheless wherever a sequence of statements usually appears. Otherwise you might change your repeat … until loop to a while … do loop, forgetting to surround the loop’s body statements with a proper begin … end frame.
5. The packed designation has been omitted for simplicity.
6. According to most compilers’ definition of maxInt. The ISO standards merely require that all arithmetic operations in the interval -maxInt..maxInt work correctly, but it is conceivable (although unlikely) that more values are supported.

# Pointers

The new data type presented in this chapter adds another layer of abstraction to your repertoire: Pointers are by far the most complicated data type. If you master them, you have got what it takes to tackle even the supreme discipline of assembly programming. So, let’s get started!

## Indirection

In Pascal there are two kinds of variables.

• So far we have been using static variables. They exist during the entire execution of a block, e. g. while a program is running or just during execution of a routine.
• There is another kind called dynamic variables.[fn 1] They do not necessarily “exist” during the entire block. That means, there is no static memory allocated, but the used memory space varies each time the program runs.

With static variables, the compiler[fn 2] already knows in advance which memory chunk will be used.[fn 3] Dynamic variables, however, are, hence their name, dynamic in that they will be occupying different, unpredictable memory segments.

Memory is referred to by addresses. An address is, in CS, simply a number, an integer value so to speak.[fn 4] When you want to refer to a certain memory block, you use its address.

The pointer data type is a value that stores an address.
This address can then be used to access the memory it is referring to. A pointer, however, is just that: It points, but makes no statement as to “whom”, that is to which variable, this block of memory “belongs.”

### Declaration

In Pascal, a pointer data type declaration starts with a ↑ (up arrow), or alternatively and more frequently the ^ (caret) character, followed by the name of a data type.

program pointerDemo(input, output);
type
	charReference = ^char;

A variable of this pointer data type can point to a single char value (and no other data type). In Pascal all pointer data types have to indicate the data type of the value the pointer is referring to. This is because a pointer alone is just an address: An address merely points to the start of a memory block. There is no statement with respect to this block’s size, its length. The domain restriction, the specification of the targeted value’s data type, tells the compiler “how large” a memory block will be, and in consequence how to properly read and write, how to access it.

A pointer data type is the only data type that can use data types not declared yet. Below you will see a usage scenario, but let’s continue with the program for now.

### Allocating memory

When you are declaring a variable in the var‑section, you are declaring a static variable. In the following code fragment c is a static variable, thus its memory location is already known.

var
	c: charReference;
begin
	{ artificially stall the program without breakpoints }
	readLn;

At this point, there is no memory space allotted to a char value yet. There is already space to store a pointer value, the address of a char value, but we do not have any space available to put a char value itself anywhere.

In Pascal you will first need to invoke the procedure new to get memory assigned to your program. New takes one pointer variable as an argument and will reserve enough memory space to hold one value of the pointer’s domain.
	new(c);

After this operation

• you occupy additional memory for, in this case, one char value the program previously did not “own”, and
• c, the pointer variable itself, will give us the address of this newly allocated memory.

As with all variables of any kind, the memory space we have acquired now is totally undefined (uninitialized).

### Dereferencing

To use the memory we just gained we will have to follow a pointer. This is done by appending ↑ (or usually ^) to the name of the pointer variable.

	c^ := 'X';
	writeLn(c^);

This action is called dereferencing. The pointer is a (kind of) reference to the underlying char value. This char value does not have a name, but you use the pointer to access it anyway.

On this dereferenced variable we can perform all operations permissible on the pointer’s domain data type. I. e. here we are allowed to assign a char value 'X' to it, and then use it, for instance, in a writeLn as demonstrated above.

Note that something like c := 'X' will not work, because in this case c simply refers to the pointer, the address storage.

• The expression c has the data type charReference.
• The expression c^ has the data type char.

In Pascal it is forbidden to directly assign addresses to pointers, other than by using new. For the special case of nil, see below.

### Releasing memory

After invoking new the respective memory is exclusively reserved to your program. This memory management occurs outside of your program. It is a typical task of the respective OS.

To reverse the operation of new, there is a dedicated procedure “unreserving” memory: Dispose.

	readLn;
	dispose(c);
	readLn;
end.

Dispose takes the name of a pointer variable, and releases memory previously allocated with new. After a dispose you may not follow, that is dereference, the pointer anymore. Nevertheless the pointer itself still stores the address where, in this case, the referenced char value was. Meanwhile, the “freed” memory may be used again for something or by someone else.
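Assembled into one piece, the fragments shown so far might look like the following sketch. The program name pointerLifecycle is made up for illustration, and the readLn calls that merely stall the program are left out, so input is not needed in the program header.

```pascal
program pointerLifecycle(output);

type
	charReference = ^char;

var
	{ c is a static variable that stores the address of a char value }
	c: charReference;

begin
	{ reserve memory for one char value; c now holds its address }
	new(c);

	{ follow the pointer: c^ denotes the nameless char value }
	c^ := 'X';
	writeLn(c^);

	{ give the memory back; from here on c^ may not be used anymore }
	dispose(c)
end.
```

Every new in this sketch is matched by exactly one dispose, and c^ is only used in between the two.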
### Lifetime

In Pascal, memory of dynamic variables remains reserved as long as it remains accessible. If a chunk of memory is rendered inaccessible by some operation, it is automatically released.

This can happen implicitly: In the program above the pointer variable c is “gone” upon program termination. Because this variable is/was the only pointer (left) pointing to our previously reserved char value, there is an automatic “invisible” dispose. In that respect, the explicit dispose on our part was not necessary.

However, unfortunately not all compilers comply with this specification as laid out in the Pascal ISO standards. For instance, Delphi, as well as the FPC (even in its {$mode ISO} compatibility mode, as of version 3.2.0) will not issue an automatic dispose. There, an explicit dispose is necessary.[fn 5] Rest assured, using the GPC it is not necessary though; the GPC fully complies with the ISO standard 7185 level 1.

Note that memory accessibility is transitive: This means that, for instance, a pointer pointing to a pointer pointing to the memory still satisfies the accessibility requirement.

## Indication

The additional housekeeping of allocating and releasing memory may seem like quite a hassle, so when does that make sense?

• All variables declared in a var-section need to indicate their size in advance. For some applications, however, you do not know how much data you will need to store and process. Pointers are a means to overcome this limitation. Further below we will explore how.
• Pointer values can be used to represent graphs (networks) of data, allowing you to put everything into relation with each other. This means you do not need to store the same datum multiple times. A pointer value is usually, with respect to its memory requirements, a comparatively small data type. Handling pointers trades lower memory space demand for increased complexity.

Furthermore, pointer values are frequently used to implement variable parameters of routines: Due to its smaller size passing a single pointer value can be faster than passing, that means copying, for instance an entire array. This kind of use of pointers is completely transparent. Pascal equips you with an adequate language construct; you will learn more about variable parameters in the chapter on scopes.

### Nil pointers

All pointers can be assigned a literal value nil. The nil pointer value represents the notion “not pointing anywhere in particular.”

Coincidentally, nil is the only pointer value that could be used for a pointer literal.

const
nowhere = nil;

There is no other pointer value that you could possibly specify anywhere in your source code. This also means you cannot explicitly compare a pointer against any specific pointer value except nil.

Note that nil is fundamentally different from an uninitialized variable. You are allowed to read the value of a pointer that has been assigned the value nil, but you are still forbidden to attempt reading the value of a variable that has not been assigned any value at all.

 Attempting to dereference a pointer that currently possesses the value nil constitutes a fatal error.
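A common way to stay clear of this error is to compare a pointer with nil before following it. The following is a minimal sketch; the names nilDemo, integerReference and p are chosen for illustration only.

```pascal
program nilDemo(output);

type
	integerReference = ^integer;

var
	p: integerReference;

begin
	{ p is defined now, although it does not point anywhere }
	p := nil;

	{ reading and comparing the pointer value itself is fine }
	if p = nil then
	begin
		writeLn('p does not point anywhere')
	end
	else
	begin
		{ only in this branch it would be safe to write p^ }
		writeLn(p^:1)
	end
end.
```

Had the assignment p := nil been left out, even the comparison p = nil would read an undefined value.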

### Permissible operators

In the introduction we used the analogy comparing pointers to integer values. However, this is really just that, an analogy. Unlike integer values, pointers are by no means “ordered”; they do not belong to the class of ordinal data types. There is no ord, succ, or pred defined for a pointer, ordering comparison operators like < or >= do not apply to pointers either, and any arithmetic operator is invalid in combination with a pointer value.

The only operators applicable to pointers are[fn 6]

• =, do two pointer values refer to the same address,
• <>, do two pointer values refer to different addresses, and
• :=, the assignment of a pointer value, either nil or the value of an already defined pointer variable of the same data type, to a pointer variable.

It may seem at first like quite a restriction, but it prevents you from doing potentially harmful, or even just stupid stuff.
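The three permissible operators can be tried out in a small sketch like the following. All identifiers (pointerComparison, integerReference, a, b) are made up for this illustration. How a Boolean value is spelled when written to output depends on the compiler, so the comments only state the truth value.

```pascal
program pointerComparison(output);

type
	integerReference = ^integer;

var
	a, b: integerReference;

begin
	new(a);
	a^ := 123;

	{ := copies the address, not the integer value }
	b := a;
	{ a and b refer to the same memory: the comparison is true }
	writeLn(a = b);

	{ a second allocation yields different memory }
	new(b);
	b^ := 123;
	{ same integer value, but different addresses: true again }
	writeLn(a <> b);
	{ comparing the dereferenced values is an integer comparison: true }
	writeLn(a^ = b^);

	dispose(b);
	dispose(a)
end.
```

Note the difference between a = b, which compares addresses, and a^ = b^, which compares the values the pointers refer to.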

### Chicken or egg

Pointers are the only data type that can be declared using a data type yet to be declared.[fn 7] This circumstance makes it possible to declare data types containing pointers, possibly to the data type being declared at hand or other yet to be declared data types. This is possible because a pointer to foo has the same memory requirements as a pointer to bar or any other data type. The domain restriction of a pointer is not (necessarily/explicitly) stored in the program.

In the following code fragment numberListItem is not yet declared, but you are still allowed to declare a new pointer data type with it anyway:

program listDemo(input, output);

type
	numberListItemReference = ^numberListItem;
	
	numberListItem = record
			value: real;
			nextItemLocation: numberListItemReference;
		end;

Yet you cannot reverse the order of the declarations of numberListItemReference and numberListItem; the compiler cannot magically conclude nextItemLocation is a pointer until it has actually seen/read the respective declaration.

### Putting things together

Now we can use this data structure to dynamically store a series of numbers. Pay attention to when the pointer is dereferenced in the following code:

var
	numberListStart: numberListItemReference;
begin
	new(numberListStart);
	readLn(numberListStart^.value);
	
	new(numberListStart^.nextItemLocation);
	readLn(numberListStart^.nextItemLocation^.value);
	
	dispose(numberListStart^.nextItemLocation);
	dispose(numberListStart);
end.

The entire program contains one static variable. Only the variable numberListStart was declared by you. During run-time, however, while the program is running you will have at one point two additional real values at your disposal.

Take notice of this example’s order of dispose statements: The supplied pointer variable must be valid, so a reverse order would not be possible in this specific case here.

Admittedly, this example could have been better implemented by simply declaring two real variables. The true power of pointers becomes apparent when you, unlike in the above code, use pointers as a means of abstraction. This chapter’s exercises will delve into that.

## Routines

Let’s first explore a special kind of pointer: Routine parameters, that is functional and procedural parameters, are parameters of routines that allow you to statically modify the routine’s behavior by virtually passing the address of another routine. Let’s see how this works.

### Declaration and use

In the formal parameter list of a routine you can declare a parameter that looks just like a routine signature:

program routineParameter(output);

procedure fancyPrint(function f: integer);
begin
	writeLn('❧ ', f:1, ' ☙')
end;

Inside the definition of fancyPrint you can use the parameter f as if it were a regular function declared and defined before and outside of fancyPrint. However, at this point it is not known what function will be used. The parameter f is in fact a pointer.[fn 8] We only know that this pointer’s “domain” is any function without parameters that returns an integer value, but this is all we need to know.

### One routine fits it all

To call this kind of routine you will need to specify an appropriate routine designator that matches the signature with regard to order, number and data types of parameters and, if applicable, the returned value’s data type.

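function getRandom: integer;
begin
	{ chosen by fair dice roll: guaranteed to be random }
	getRandom := 4
end;

{ the name getAnswer is assumed here; the function heading is missing in this copy }
function getAnswer: integer;
begin
	{ the answer to the ultimate question of life, the universe and everything }
	getAnswer := 42
end;

begin
	fancyPrint(getRandom);
	fancyPrint(getAnswer);
end.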

To supply a routine parameter value to a routine, simply name a compatible routine. Note that in this case you never specify any parameters, because you are not making a call here, but the called routine will do so “on behalf” of you. Specifying the routine’s name, and thus passing its address, is sufficient to achieve that.

 Standard routines such as writeStr (EP) or sin cannot be used that way,[fn 9] because they are an integral part of the language. There is no (singular) routine definition for them.
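Routine parameters may themselves take parameters. The following sketch assumes a compiler (or compiler mode) that accepts ISO 7185 routine parameters; the names applyDemo, applyTwice, timesTwo and increment are invented for this illustration.

```pascal
program applyDemo(output);

{ calls the supplied function two times in a row: f(f(start)) }
function applyTwice(function f(x: integer): integer; start: integer): integer;
begin
	applyTwice := f(f(start))
end;

function timesTwo(n: integer): integer;
begin
	timesTwo := 2 * n
end;

function increment(n: integer): integer;
begin
	increment := n + 1
end;

begin
	{ 2 * (2 * 5) = 20 }
	writeLn(applyTwice(timesTwo, 5):1);
	{ (5 + 1) + 1 = 7 }
	writeLn(applyTwice(increment, 5):1)
end.
```

The names of the formal parameters (x in the signature, n in the definitions) do not need to agree; only order, number and data types have to match.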

## Caveats

For a beginner, pointers are difficult to tame. Without experience, you will frequently observe (for you) “unexpected” behaviors. Some pitfalls are presented here.

### with-clause

Special care must be taken when using pointers in conjunction with a with-clause. The expressions listed at the top of a with-clause are evaluated once before executing any following statement. During the entire with-statement the expressions using the “short” notation will actually use an invisible transient value. This speeds up execution, because the same value is not evaluated over and over again, but there is also a caveat in it.

Surprisingly, the long notation using an FQI can become invalid, while the short notation at first seems to be still valid. The following program demonstrates the issue:

program withDemo(output);
type
	foo = record
			magnitude: integer;
		end;
	fooReference = ^foo;
var
	bar: fooReference;
begin
	new(bar);
	bar^.magnitude := 42;
	
	with bar^ do
	begin
		dispose(bar);
		bar := nil;
		{ Here, bar^.magnitude would fail horribly, }
		{ but you can still do the following: }
		writeLn(magnitude);
	end;
end.

When you compile and run this program,

1. you will notice that it prints anything but 42, but
2. it should be rather astonishing that it still prints anything at all.

The writeLn(magnitude) does actually use a “hidden (pointer) variable” and not bar. This variable’s value was evaluated one time at the top of the with-clause. The compiler does not (and cannot) complain that bar meanwhile became invalid. You are not making any assignments to the actually utilized hidden variable (i. e. it is still considered bearing a valid value), thus there is no reason for complaints.
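One way to avoid this pitfall is to release the memory only after the with-statement has ended. The following is a sketch of that ordering, reusing the data types from withDemo; the program name withSafe is made up.

```pascal
program withSafe(output);

type
	foo = record
			magnitude: integer;
		end;
	fooReference = ^foo;

var
	bar: fooReference;

begin
	new(bar);

	with bar^ do
	begin
		{ the short notation refers to valid memory all the time }
		magnitude := 42;
		writeLn(magnitude:1)
	end;

	{ the with-statement is over: nothing refers to bar^ anymore }
	dispose(bar);
	bar := nil
end.
```

Here the hidden variable of the with-clause never outlives the memory it refers to.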

### Limits

 This section primarily concerns users of Delphi and the FPC, as well as possibly some other compilers. Users of the GPC could skip this section, but understanding the theory is encouraged.

Memory is not an infinite resource. This has some grave implications.

Most OSs try their best to fulfill the processes’ requests. Using a non-ISO-compliant compiler, the following program is doomed to fail though:

program oomDemo;
var
	p: ^integer;
begin
	while true do
	begin
		new(p);
	end;
end.

Even though this program overwrites the previous pointer value, thus rendering the previously associated integer value inaccessible, the now inaccessible memory is still exclusively reserved to your program. Depending on OS internals and also the compiler used to compile your program, your computer will eventually freeze (become unresponsive to any input) or (a robust OS) will just kill your program (jargon for terminating it immediately without giving it any chance to fix the problem) and reclaim the once reserved but never released memory.

There is no means to check whether any subsequent new will exhaust the finite resource memory. On multi-tasking OSs it is possible that between the time you have queried the amount of free memory space and the time you actually request additional memory, another program running at the same time has acquired memory, so there is none, or not enough, left for you. This kind of situation is known as time-of-check to time-of-use. You simply need to ask for more memory in a make-or-break manner.

 This issue is rather of theoretical concern for the scope of this textbook. A standard desktop computer manufactured in the 21st century or later will not run out of memory for any programming exercise given here. This is not supposed to mean you can waste memory.
 Do not hoard memory: To mitigate potential OOM conditions, it is generally sensible to dispose memory as soon as you are certain it will not be used anymore.
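As an illustration of this advice, the following sketch pairs every new with a dispose before the pointer is overwritten, so at most one integer value is reserved at any time. The names noLeakDemo, rounds, i and sum, as well as the number of rounds, are arbitrary choices made for this example.

```pascal
program noLeakDemo(output);

const
	{ arbitrary number of rounds, chosen for illustration }
	rounds = 100;

var
	p: ^integer;
	i: integer;
	sum: integer;

begin
	sum := 0;

	for i := 1 to rounds do
	begin
		new(p);
		p^ := i;
		sum := sum + p^;
		{ release the memory before p is overwritten in the next round }
		dispose(p)
	end;

	{ 1 + 2 + … + 100 = 5050 }
	writeLn(sum:1)
end.
```

Unlike oomDemo, this program’s memory demand stays constant regardless of how large rounds is.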

Adapt the program listDemo so it accepts an unknown number of items. The program should print the number of total items first and then a list of items.
An acceptable solution could look like this:
function readNumber: numberListItemReference;
var
	result: numberListItemReference;
begin
	new(result);
	
	with result^ do
	begin
		readLn(value);
		nextItemLocation := nil;
	end;
	
	readNumber := result;
end;

{ === MAIN ============================================================= }
var
	numberListRoot: numberListItemReference;
	currentNumberListItem: numberListItemReference;
	nextNumberListItem: numberListItemReference;
	numberListLength: integer;
begin
	writeLn('Enter numbers and finish by abandoning input:');
	
	{ input - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	numberListRoot := readNumber;
	numberListLength := 1;
	currentNumberListItem := numberListRoot;
	
	while not EOF(input) do
	begin
		with currentNumberListItem^ do
		begin
			nextItemLocation := readNumber;
			currentNumberListItem := nextItemLocation;
		end;
		numberListLength := numberListLength + 1;
	end;
	
	{ output  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	writeLn('You’ve entered ', numberListLength:1, ' numbers as follows:');
	currentNumberListItem := numberListRoot;
	while currentNumberListItem <> nil do
	begin
		with currentNumberListItem^ do
		begin
			writeLn(value);
			currentNumberListItem := nextItemLocation;
		end;
	end;
	
	{ release memory  - - - - - - - - - - - - - - - - - - - - - - - - - - - }
	currentNumberListItem := numberListRoot;
	while currentNumberListItem <> nil do
	begin
		{ Save the successor first. Note that after dispose(…), writing
		  … := currentNumberListItem^.nextItemLocation
		  would be illegal! }
		nextNumberListItem := currentNumberListItem^.nextItemLocation;
		dispose(currentNumberListItem);
		currentNumberListItem := nextNumberListItem;
	end;
end.

Write a procedure that accepts a real function and graphs its function values similar to this:
                                        *
                                               *
                                                       *
                                                             *
                                                                   *
                                                                        *
                                                                            *
                                                                              *
                                                                               *
                                                                              *
                                                                            *
                                                                        *
                                                                    *
                                                              *
                                                       *
                                               *
                                        *
                                *
                         *
                  *
            *
       *
   *
 *
*
 *
   *
      *
           *
                 *
                        *
                               *
                                       *

To that end, complete the following program:

program graphPlots(output);

const
	lineWidth = 80;

procedure plot(
		function f(x: real): real;
		xMinimum: real; xMaximum: real; xDelta: real;
		yMinimum: real; yMaximum: real
	);
{
	this is the part you are supposed to implement
}

function wave(x: real): real;
begin
	wave := sin(x);
end;

begin
	plot(wave, 0.0, 6.283, 0.196, -1.0, 1.0);
end.
An acceptable implementation of plot could look like this:
procedure plot(
		function f(x: real): real;
		xMinimum: real; xMaximum: real; xDelta: real;
		yMinimum: real; yMaximum: real
	);
var
	x: real;
	y: real;
	column: 0..lineWidth;
begin
	x := xMinimum;
	
	while x < xMaximum do
	begin
		y := f(x);
		{ always reset column in lieu of doing that in an else branch }
		column := 0;
		{ is function value within window? }
		if (y >= yMinimum) and (y <= yMaximum) then
		begin
			{ move everything toward zero }
			y := y - yMinimum;
			{ scale [yMinimum, yMaximum] range to [0..79] range }
			y := y * (lineWidth - 1) / (yMaximum - yMinimum);
			{ convert to integer }
			column := round(y) + 1;
		end;

The following use of write/writeLn is actually an EP extension. In Standard Pascal as laid out in ISO standard 7185 all format specifiers need to be positive integer values. Extended Pascal also allows a zero value. While for printing integer values the width specifier still indicates a minimum width, for char and string values it means the exact width. Thus the following can print a blank line when column is zero, i. e. when the function value is outside of the window.

		writeLn('*':column);

It should be an easy feat for you to adapt the writeLn line should your compiler not support this EP extension.

		x := x + xDelta;
	end;
end;
Appreciate the fact that once you have implemented plot in such a generic way, i. e. accepting a functional parameter, you can reuse it for any other function you wish to.

Notes:

1. The Pascal ISO standards call this idea identified variables.
2. For the sake of simplicity we say this was the compiler’s task. Usually it is rather a task of a linker, the link editor, that determines and substitutes specific addresses.
3. Actually the compiler does not know which (physical) memory will be used, but another abstraction layer called virtual memory administered by the OS permits us to think that way.
4. This is an analogy for explanation purposes. The range of integer values does not necessarily correspond to permissible pointer values (i. e. addresses). For instance on x32 targets pointers have 32 significant bits, but an integer value occupies 64 bits.
5. Failing to release memory, a defect known as a memory leak, will probably go unnoticed. Your program will compile and run without a proper dispose. However, eventually the finite resource “memory” will be exhausted. If there is not sufficient memory available, any new will fail and terminate the program immediately.
6. Some manuals call the ↑/^ an “operator”. This language, however, is imprecise. The ↑ does not alter the state of your program, do an operation, but merely instructs the compiler to treat an identifier differently than it would without the arrow’s presence.
7. The declaration of the pointer data type and the referenced data type must occur in the same scope, in the same block, in other words in one and the same type-section.
8. This is an implementation detail that is not specified by the ISO standards, although in fact most compilers will implement this as a pointer.
9. Some compilers do not have this restriction, yet the ISO standards require an “activation”, which simply does not happen for standard routines.

# Files

Have you ever wondered how to process large amounts of data? Files are the solution in Pascal. You were already acquainted with some basics in the input and output chapter. Here we will elaborate on more details as far as the ISO standard 7185 “Pascal” defines them. The “Extended Pascal” ISO standard 10206 defines even more features, but these will be covered in the second part of this Wikibook.

## File data types

So far we have been only handling text files, i. e. files possessing the data type text, but there are more file types.

### Concept

Mathematically speaking, a file is a bounded finite sequence. That means,

• components are oriented along an axis (sequence),
• component values are chosen from one domain (bounded), and
• there is a certain number of components present (finite).

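To put this in fancy math symbols, a file is an element of

$M^{*}=\bigcup_{i=0}^{\infty} M^{i}$

where $M$ denotes the domain and $M^{i}$ the set of all sequences consisting of exactly $i$ components chosen from $M$.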

### Declaration

In Pascal we can declare file data types by specifying file of recordType, where recordType is the data type of the file’s components, its “records”. A permissible recordType can be any data type, except another file data type (including text) or a data type containing such. That means an array of file data types, or a record having a file as a component, is not permitted. Let’s see an example:

program fileDemo(output);

type
	integerFile = file of integer;

With a variable of the data type integerFile we can access a file containing only one kind of data, integer values (the domain restriction).

var
	temperatures: integerFile;
	i: integer;

Note, the variable temperatures is not a file by itself. This Pascal variable merely provides us with an abstract “handle”, something that permits us, the program, to get a hold of the actual file (as described in § Concept).

## Modes

All files have a current mode. Upon declaration of a file variable, this mode is, as usual, undefined. In Standard Pascal as defined by the ISO standard 7185 you can choose from either generation or inspection mode.

### Generation mode

In order to write to a file you need to call the standard built-in procedure named rewrite. Rewrite attempts to open a file for writing from the start.

begin
	rewrite(temperatures);

The file immediately becomes empty, hence its name rewrite. Extended Pascal also has the non-destructive procedure extend.

Only after a file has been successfully opened for writing do the write routines become legal. Attempting to write to a file that has not been opened for writing constitutes a fatal error.

	write(temperatures, 70);
	write(temperatures, 74);

All parameters to write after the destination (here temperatures) have to be of the destination file’s recordType. There must be at least one. Only if the destination is a text file, various built-in data types are permitted.

Note that the procedures writeLn and readLn can only be applied to text files. Other files do not “know” the notion of lines, therefore the …Ln procedures cannot be applied to them.

### Inspection mode

In order to read a file you need to call the standard built-in procedure named reset. Reset attempts to open a file for reading from the start.

	reset(temperatures);
	while not EOF(temperatures) do
	begin
		{ Without this read the loop would never reach EOF. }
		read(temperatures, i);
		writeLn(i);
	end;
end.

Note that after reset(temperatures) you cannot write anything to that file anymore. Modes are exclusive: Either you are writing or reading.[fn 1]
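Since the example above is spread over several paragraphs, here is a complete sketch that switches between both modes a few times. It assumes a processor that supports ISO 7185 files without further preparation (see § Support). The concrete values are arbitrary.

```pascal
program modeDemo(output);

var
	temperatures: file of integer;
	i: integer;
begin
	{ Generation mode: the file is emptied and may only be written. }
	rewrite(temperatures);
	write(temperatures, 70, 74, 68);
	
	{ Inspection mode: from now on the file may only be read. }
	reset(temperatures);
	while not EOF(temperatures) do
	begin
		read(temperatures, i);
		writeLn(i:1);
	end;
	
	{ Generation mode again: the three old values are discarded. }
	rewrite(temperatures);
	write(temperatures, 99);
	
	reset(temperatures);
	read(temperatures, i);
	writeLn(i:1);
end.
```

The program prints 70, 74, 68 and finally 99, each on a line of its own. The second rewrite shows that changing back to generation mode always starts with an empty file.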

## Application

The main and most apparent “advantage” of a file might be: Unlike an array we do not need to specify a size in advance, in our source code. The file can be as large as needed. Yet an array can be copied with a := assignment. Entire files cannot be copied this way.
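Because a := assignment is not available for files, copying has to be done component by component. The following is a minimal sketch of that idea. The procedure name copyFile and all variable names are our own choice, and the program assumes an ISO 7185 processor (see § Support).

```pascal
program copyDemo(output);

type
	integerFile = file of integer;

var
	original, duplicate: integerFile;
	i: integer;

{ Copies all components of source to destination, one at a time. }
{ File parameters always have to be var parameters.              }
procedure copyFile(var source, destination: integerFile);
var
	component: integer;
begin
	reset(source);
	rewrite(destination);
	while not EOF(source) do
	begin
		read(source, component);
		write(destination, component);
	end;
end;

begin
	rewrite(original);
	write(original, 70, 74, 68);
	
	{ duplicate := original; would be rejected by the compiler. }
	copyFile(original, duplicate);
	
	reset(duplicate);
	while not EOF(duplicate) do
	begin
		read(duplicate, i);
		writeLn(i:1);
	end;
end.
```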

The main “disadvantage” of a file might be: Access is only sequential. We have to start reading and writing a file from the start. If we want to have, say, the 94th record, we need to advance 93 times and also take account of the possibility that there might be fewer than 94 records available.[fn 2]

The words advantage and disadvantage were put between quotation marks, because a programming language cannot judge/rate what is “better” or “worse”. It is the programmer’s task to make the assessment. Files are especially suitable for I/O of unpredictable length, for instance user input.
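The sequential access described above can be sketched as follows. The procedure readNth is a hypothetical helper written for this illustration, not a standard routine. It assumes n is at least 1 and reports via found whether the file was long enough.

```pascal
program nthRecordDemo(output);

type
	integerFile = file of integer;

var
	temperatures: integerFile;
	value: integer;
	found: Boolean;

{ Seeks the n-th component of f (counting from 1, n >= 1 assumed). }
{ found becomes false if f has fewer than n components.            }
procedure readNth(var f: integerFile; n: integer;
	var value: integer; var found: Boolean);
var
	skipped, discarded: integer;
begin
	reset(f);
	skipped := 0;
	{ Advance n - 1 times, but stop early at the end of the file. }
	while (skipped < n - 1) and not EOF(f) do
	begin
		read(f, discarded);
		skipped := skipped + 1;
	end;
	
	found := not EOF(f);
	if found then
	begin
		read(f, value);
	end;
end;

begin
	rewrite(temperatures);
	write(temperatures, 70, 74, 68);
	
	readNth(temperatures, 2, value, found);
	if found then
	begin
		writeLn('record 2 is ', value:1);
	end;
	
	readNth(temperatures, 94, value, found);
	if not found then
	begin
		writeLn('there is no record 94');
	end;
end.
```

With the three values written here, the program reports that record 2 is 74 and that there is no record 94. Note that every call starts over with reset, so looking up many records this way gets slow for long files.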

## Primitive routines

So far we have been using only read/readLn and write/writeLn. These procedures are convenient and perfect for everyday use. However, Pascal also gives you the opportunity to have comparatively “low-level” access to files by means of get and put.

### Buffer

Every file variable is associated with a buffer. A buffer is a temporary storage space. Everything you read from and write to a file passes through this storage space before the actual read or write action is communicated to the OS.[fn 3] Buffered I/O is chosen for performance reasons.

In Pascal we can access one component of the buffer, the “current” one, by appending ↑ to the variable name, just as if it were a pointer. The data type of this dereferenced value is the recordType as in our declaration. So if we have

var
	foobar: file of Boolean;

the expression foobar↑ has the data type Boolean.

To put everything into relation to each other let’s take a look at a diagram. This diagram is about understanding and shows a very specific situation. Focus on the relationships:

The upper part is in the purview of the OS. The lower part is in the purview of the (our) program. The data of the file, here a sequence of 16 integer values in total, are exclusively managed by the OS. Any access of the data is done via the OS. Directly reading or writing is not possible. We ask the OS to copy the first 4 integer data values for us into our buffer. We do so, because copying 4 integers individually is slower than copying them all together in one go.[fn 4]

### Sliding window

The three different storage locations – the actual data file, the internal buffer, and the buffer variable – work together in providing us a “view” of the file. If we overlay everything that contains the same information, we get the following image:

Here, the second quartet of integers was loaded into the internal buffer (green background). The file buffer points to the second component of the internal buffer. This is represented by a bluish hue over the sixth component of the entire file. Everything else is shaded, meaning we can view and manipulate only the sixth component.

This sliding window can be advanced (in the rightwards direction, i. e. in the direction of EOF) with the routines get and put. Both advance the file buffer to point to the next item in the internal buffer. Once the internal buffer has been completely processed, the next batch of components is loaded or stored. Calling get is only legal while a file is in inspection mode; respectively, put is only legal while a file is in generation mode.

### Using the window

Get and put take one non-optional parameter, a file (or text) variable. Put takes the current contents of the buffer variable and ensures they are written to the actual file. Let’s see this in action. Consider the following program:

program getPutDemo(output);
type
	realFile = file of real;
var
	score: realFile;
begin

The following table shows in the right-hand column the state of score after each step: its contents and the position of the sliding window (the component immediately followed by 🠅).

| source code | state after successful operation |
|---|---|
| rewrite(score); | N/A 🠅 |
| score^ := 97.75; | 97.75 🠅 |
| put(score); | 97.75 N/A 🠅 |
| score^ := 98.38; | 97.75 98.38 🠅 |
| put(score); | 97.75 98.38 N/A 🠅 |
| score^ := 100.00; | 97.75 98.38 100.00 🠅 |
| { For demonstration purposes: no put(score) here. } | 97.75 98.38 100.00 🠅 |

Now let’s print the file score we just filled with some real values. For a change we use get. Like read/readLn, get is only allowed if not EOF:

	reset(score);
	while not EOF(score) do
	begin
		writeLn(score^);
		get(score);
	end;
end.

Note that this prints just two real values:

 9.775000000000000E+01
 9.838000000000000E+01

The third real value, although defined, was never written to the file, because no corresponding put(score) followed the assignment.

### Requirements

As mentioned above, get may only be called when the specified file is in inspection mode, whereas put may only be called when the file is in generation mode. More specifically, calling get(F) is only allowed when EOF(F) is false, and calling put(F) is only allowed when EOF(F) is true. In other words, reading past the EOF is forbidden, while writing has to occur at the EOF.

After successfully calling rewrite(F) (or the EP procedure extend(F)) the value of EOF(F) becomes true. Any subsequent put(F) does not alter this value. After calling reset(F) the value of EOF(F) depends on whether the given file is empty. Any subsequent get(F) may change this value from false to true (never in the reverse direction).
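The following sketch makes these rules observable by reporting the value of EOF after each step. The procedure report is a helper written for this illustration only, and the program assumes a processor supporting get, put and buffer variables (see § Support).

```pascal
program eofDemo(output);

type
	charFile = file of char;

var
	f: charFile;

{ Prints the current value of EOF(g) in words. }
procedure report(var g: charFile);
begin
	if EOF(g) then
	begin
		writeLn('EOF is true');
	end
	else
	begin
		writeLn('EOF is false');
	end;
end;

begin
	rewrite(f);
	report(f);   { true: in generation mode we are always at the end }
	f^ := 'A';
	put(f);
	report(f);   { still true: put does not alter EOF }
	
	reset(f);
	report(f);   { false: the file holds one component }
	get(f);
	report(f);   { true: the only component has been passed }
	
	rewrite(f);
	reset(f);
	report(f);   { true: the file is empty }
end.
```

A Boolean value could also be passed to writeLn directly, but the exact spelling of the output then depends on the compiler, which is why this sketch prints its own wording.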

As you know, it is forbidden to read a variable that was not previously defined (i. e. you have to assign a value beforehand). Because it involves reading the buffer value, writing a buffer is only allowed if it was previously defined. Consider the following faulty code snippet:

	temperatures^ := 88;
	put(temperatures); { ✔ Good. Will successfully write 88. }
	put(temperatures); { ↯ Bad. temperatures^ is not defined. }
	put(temperatures); { ↯ temperatures^ still not defined. }

Remember, get and put advance the sliding window. Only the first put(temperatures) reads the defined value temperatures↑. The next and following put(temperatures) would however read an undefined temperatures↑.

### Text buffer

The buffer value of a text has some special behavior. A text file is essentially a file of char. Everything presented in this chapter can be applied to a text file just as if it were a file of char. However, as repeatedly emphasized, a text file is structured into lines, each line consisting of a (possibly empty) sequence of char values.

When EOLn(input) becomes true, the buffer variable input↑ returns a space character (' '). Thus when using buffer variables the only way to distinguish between a space character as part of a line, and a space character terminating a line is to call the function EOLn.

Rationale: Various operating systems employ different methods of marking the end of a line. It has to be marked somehow, because this information cannot be magically deduced out of nowhere. However, there are multiple strategies out there. This is really inconvenient for the programmer who cannot take account of everything. Pascal has therefore chosen that, regardless of the specific EOL marker used, the buffer variable contains a simple space character at the end of a line. This is predictable, and predictable behavior is good.
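To see the distinction in practice, the following sketch counts “real” space characters and ends of lines separately. It is only an illustration under the assumption that your compiler supports get and buffer variables on input (see § Support). The variable names are our own.

```pascal
program spaceKinds(input, output);

var
	realSpaces, lineEnds: integer;
begin
	realSpaces := 0;
	lineEnds := 0;
	
	while not EOF(input) do
	begin
		{ EOLn has to be asked first: at the end of a line }
		{ input^ looks like an ordinary space character.   }
		if EOLn(input) then
		begin
			lineEnds := lineEnds + 1;
		end
		else
		begin
			if input^ = ' ' then
			begin
				realSpaces := realSpaces + 1;
			end;
		end;
		
		get(input);
	end;
	
	writeLn('spaces within lines: ', realSpaces:1);
	writeLn('ends of lines:       ', lineEnds:1);
end.
```

If the test input^ = ' ' were performed without consulting EOLn first, every end of line would wrongly be counted as a space.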

### Purpose

It is worth noting that all functionality of read/readLn and write/writeLn can at their heart be based on get and put respectively. Here are some basic relationships:

If f refers to a file of recordType variable and x is a recordType variable, read(f, x) is equivalent to

	x := f^;
	get(f);

Similarly, write(f, x) is equivalent to

	f^ := x;
	put(f);

For text variables the relationships are not as straightforward. The behavior depends on the various destination/source variables’ data types. Nonetheless, one simple relationship is, if f refers to a text variable, readLn(f) is equivalent to

	while not EOLn(f) do
	begin
		get(f);
	end;
	get(f);

The latter get(f) actually “consumes” the newline marker.
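One thing the buffer variable allows that read alone does not is looking at the next component without consuming it. As a hedged illustration, the hypothetical procedure skipSpaces below advances a text file past leading spaces of the current line and stops in front of the first other character. It assumes a compiler supporting get and buffer variables (see § Support).

```pascal
program peekDemo(input, output);

var
	c: char;

{ Advances f until f^ is a non-space character of the current line, }
{ or until the end of the line or the end of the file is reached.   }
procedure skipSpaces(var f: text);
var
	done: Boolean;
begin
	done := false;
	while not done do
	begin
		{ The tests are nested on purpose: EOLn(f) and f^ must }
		{ not be evaluated once EOF(f) has become true.        }
		if EOF(f) then
			done := true
		else if EOLn(f) then
			done := true
		else if f^ <> ' ' then
			done := true
		else
			get(f);
	end;
end;

begin
	skipSpaces(input);
	
	if EOF(input) then
		writeLn('no input')
	else if EOLn(input) then
		writeLn('blank line')
	else
	begin
		{ The character we peeked at is still there to be read. }
		read(input, c);
		writeLn('first non-space character: ', c);
	end;
end.
```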

## Support

Unfortunately, of the compilers presented in the opening chapter, Delphi and the FPC do not support all ISO 7185 functionality.

• Delphi and the FPC require files to be explicitly associated with file names before performing any operations. It is required to back any kind of file by a file in background memory (e. g. on disk). How this works will be explained in the second part of this book, since ISO standard 10206 “Extended Pascal” defines some means for that, too.
• The FPC provides the procedures get and put, and file variable buffers only in {$mode ISO} or {$mode extendedPascal}. Delphi does not support this at all.

Rest assured, everything works fine if you are using the GPC. The authors cannot make a statement regarding the Pascal‑P compiler since they have not tested it.
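For readers who use the FPC in its default mode, the explicit association mentioned in the list above might look like the following sketch. The procedures assign and close are not part of ISO 7185; they stem from the Turbo Pascal tradition the FPC follows. The file name temperatures.dat is an arbitrary example. The values are written from a variable, which is the form accepted for such files in that tradition.

```pascal
program assignDemo(output);

var
	temperatures: file of integer;
	i: integer;
begin
	{ Non-standard: bind the file variable to a file on disk. }
	{ The name is an arbitrary choice for this example.       }
	assign(temperatures, 'temperatures.dat');
	
	rewrite(temperatures);
	i := 70;
	write(temperatures, i);
	i := 74;
	write(temperatures, i);
	{ Non-standard: hand the file back to the OS. }
	close(temperatures);
	
	reset(temperatures);
	while not EOF(temperatures) do
	begin
		read(temperatures, i);
		writeLn(i);
	end;
	close(temperatures);
end.
```

Running the program leaves a file named temperatures.dat behind in the current directory.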

Can you write to a buffer variable, while the respective file is in inspection mode? In other words, is it legal for a buffer variable to appear on the LHS of an assignment when the file is in inspection mode?
The buffer variable is, hence its name, a variable. You may read from and write to it regardless of the current mode. However, the buffer is only created if the file variable is initialized. That means a mode has to be selected by invoking reset or rewrite first. Think of reset/rewrite as a special kind of new and the file variable as a pointer. You may only dereference the pointer (= append ↑) if it was previously defined.

Write a filter program that merges repeating space characters ' ' into a single space character. (A filter program means, process input and write to output with the specified rule applied on the given input.) Extra credit: Write a solution that does not declare any additional variables (i. e. there is no var-section).
An acceptable solution is:

program mergeRepeatingSpace(input, output);
const
	{ Choose any character, but ' ' (a single space). }
	nonSpaceCharacter = 'X';
begin
	output^ := nonSpaceCharacter;

	while not EOF do
	begin

Since input↑ contains a space character when we are at the EOL, the only correct way of emitting a new line is using writeLn. WriteLn does not use the buffer variable. In other words, output↑ may contain any value now.

		if EOLn then
		begin
			writeLn;

In this branch of the if statement, input↑ holds a space character. However, this instance of a space character should not trigger the repeating space character detection. Therefore we assign a non-space character to output↑ (now acting as a “previous character variable”).

			output^ := nonSpaceCharacter;
		end
		else
		begin
			if [output^, input^] <> [' '] then

In Extended Pascal using the string/char concatenation operator + you could write:

			if output^ + input^ <> '' then

Remember that the plain =‑comparison pads both operands to the same length using space characters.

			begin
				write(input^);
			end;

			output^ := input^;
			{ The buffer variable (output↑) now contains the previous character. }
		end;

		get(input);
	end;
end.

An easier implementation would probably employ a Boolean variable as a flag indicating whether the preceding character was a non-newline space character.

Write a program that reads from input and only writes the last input char value to output. On a standard Linux or FreeBSD system you can test your program with the command line echo -n '123H' | ./printLastCharacter. The ‑n option flag is important. Otherwise your program might just display a single space (' ') character. Alternatively, you may use printf '123H' | ./printLastCharacter. With either variant your program should write a line consisting of the single character H.
An acceptable solution could look like this:

program printLastCharacter(input, output);
begin
	{ We cannot output anything, unless there is at least one character. }
	if not EOF(input) then
	begin
		while not EOF(input) do
		begin
			{ After get(input), input↑ becomes undefined once
			  we reach EOF(input). Therefore copy it beforehand. }
			output^ := input^;
			get(input);
		end;
		put(output);
		writeLn(output);
	end;
end.

By specifying input in the program parameter list, the post-assertions of reset become true. That means there has been an implicit (= invisible) get(input) before our begin in the second line, and only after that does the value of EOF(input) become defined. If you happen to have a compiler supporting Extended Pascal’s halt procedure, you could eliminate one indentation level:

	{ We cannot output anything, unless there is at least one character. }
	if EOF(input) then
	begin
		halt;
	end;

	while not EOF(input) do

Generally speaking, programmers like to avoid indentation levels, because they can indicate complexity. On the other hand, it is absolutely legitimate if you find this style of coding “more complex”.

Notes:

1. Extended Pascal, as defined by ISO standard 10206, also permits an update mode, i. e. reading and writing at the same time, yet this is only possible for “direct-access files” (files that are indexed).
2. Extended Pascal, ISO standard 10206, knows “direct-access files”. Such a file type allows accessing the 94th record in an easy and fast manner, yet it cannot “grow” as needed.
3. This is an implementation detail and not a requirement imposed by the programming language. Already the mere presence of an OS is beyond Pascal’s horizon. Nonetheless, this description is a common scheme.
4. This is of course under the presumption that we actually need them. Unnecessarily copying data that will not be used later on is a waste of computing time.


Part Ⅱ

Extensions

# Units

In original Standard Pascal all functionality of a program, other than the standard routines Pascal already defines, had to be defined in one file, the program source code file. While in the context of teaching the sources remained rather short, entire applications quickly became cluttered despite various comments structuring the text.

Quite soon different attempts to modularize programs emerged. The most notable implementation that remains in use to this day is UCSD Pascal’s concept of units.

## UCSD Pascal units

A UCSD Pascal unit is like a program except that it cannot run on its own, but is supposed to be used by programs. A unit can define constants, types, variables and routines just like any program, but there is no executable portion that can be run independently. Using a unit means that this unit becomes a part of the program; it is like copying the entire source code from the unit to the program, but not quite the same.

Usually, units are stored in separate files, which cleans up the program’s source code file considerably. However, this is not a set requirement, since after an end. a module is considered complete and another module may follow.

 From now on, as another layer of abstraction, module refers to either a program or unit. (In the FPC a library is another type of module.)

### Defining units

A unit definition shows many similarities to a regular program, but with many additional features.

The first line of a unit looks like this:

unit myGreatUnit;

Unlike a program there is no parameter list. A unit is a self-contained unit of certain functionality, thus cannot be parameterized in any way.[fn 1]

This line also declares a new identifier, in this example myGreatUnit. MyGreatUnit becomes the first component of the so-called fully-qualified identifier. More on that later.

#### Parts

The unit concept provides means to encapsulate its definitions, so that the programmer using the unit does not need to know how certain functionality is implemented.

This is done by splitting the unit into two parts:

1. the interface part, and
2. the implementation part.

A programmer using another unit only needs to know how to use the unit: This is outlined in the interface part. The programmer who is programming the unit on the other hand will need to implement the unit’s functionality in the implementation part. Thus a bare minimum unit looks like this:

unit myGreatUnit;
interface
implementation
end.

The interface part has to come before the implementation part. Also note that units terminate with an end. just as a program does.

The interface part of a unit consists of a block, except that it cannot contain any statements. The interface is merely declaratory. All identifiers defined in the interface part will become “public”, i. e. a programmer using the unit will have access to them. All identifiers defined in the implementation part, on the other hand, are “private”: they are only available within the unit’s own implementation part. There is no way to circumvent this separation of exported and “private” code.

#### Example

unit randomness;

// public  - - - - - - - - - - - - - - - - - - - - - - - - -
interface

// a list of procedure/function signatures makes
// them usable from outside of the unit
function getRandomNumber(): integer;

// a definition (an implementation) of a routine
// must not be in the interface-part

// private - - - - - - - - - - - - - - - - - - - - - - - - -
implementation

function getRandomNumber(): integer;
begin
	// chosen by fair dice roll
	// guaranteed to be random
	getRandomNumber := 4;
end;

end.

### Using units

#### Import

Now, it is great that we have finally outsourced some code, but the point of all of this is to use the outsourced code. For this, UCSD Pascal defines the uses clause. A uses clause instructs the compiler to import another unit’s code and familiarize with all identifiers declared in the interface part of that unit. Thus, all identifiers from the unit’s interface part become available, as if they were part of the module importing them via the uses clause. Here is an example:

program chooseNextCandidate(input, output);
uses
	// imports a unit
	randomness;

begin
	writeLn('next candidate: no. ', getRandomNumber());
end.

Note, that the program chooseNextCandidate neither defines nor declares the function getRandomNumber, but nevertheless uses it. Since getRandomNumber’s signature is listed in the interface part of randomness, it is available for other modules using that module.

 Each program may have at most one uses clause. It has to appear right after the program header.

Uses clauses are allowed in any module. Of course it is possible to use other units inside a unit. Moreover, you are allowed to have two uses clauses in one unit, one in the interface part and one in the implementation part. The units listed in the interface part’s uses clause propagate, that means they also become available to the module that uses such a unit.[fn 2][fn 3]

#### Namespaces

Now, programming with units would be a hassle if all units that were ever programmed had to define mutually exclusive identifiers. But this is not the case. With the advent of modules, every module implicitly constitutes a namespace. A namespace is a self-contained scope: identifiers need to be unique only within it. You are quite welcome to define your own getRandomNumber and still use the randomness unit.

In order to distinguish between identifiers coming from various namespaces, identifiers can be qualified by prepending the namespace name to the identifier, separated by a dot. Thus, randomness.getRandomNumber unambiguously identifies the getRandomNumber function exported by the randomness unit. This notation is called a fully-qualified identifier, or FQI for short.

## Unit design

There are several considerations that should be accounted for:

• Whenever some code might be useful for other programs too, you may want to create a separate unit.
• One unit should provide all functionality necessary in order to be useful, however,
• a unit should not provide features that are unrelated to its main purpose.
• Your unit’s usability largely depends on a well-defined interface. Requiring knowledge of the specific implementation is usually an indicator of bad code.

## Special units

### Run-time system

Some compilers use units for providing certain functionality that serves the gray zone between a compiler’s actual task and a program (i. e. what you write). Most notably, Delphi, the FPC as well as the GPC provide a run-time system (RTS) that includes all standard routines defined as part of the language (e. g. writeLn and ord). In Delphi and the FPC this unit is called system, whereas the GPC comes with the GPC unit. These units are sometimes referred to as run-time library, RTL for short.

 Since these units provide the standard routines of Pascal, they have to be imported as the very first unit. However, due to humans’ proclivity to err, the compiler itself ensures the RTS is loaded first. Therefore, it is forbidden to write uses system;. The FPC will emit an error if you attempt to import the RTL manually.

Knowing what the RTS’s unit is called can be useful, since it implies that all identifiers of the RTL are part of one namespace. That means, in (for example) Delphi and the FPC one may refer to the standard function abs both by its short name and by the FQI system.abs. The latter may be required if you are shadowing the abs function in the current scope, but need to use Pascal’s own abs function.
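A minimal sketch of such a shadowing situation, written for the FPC, could look as follows. The user-defined abs deliberately returns a nonsensical value so that the two functions can be told apart in the output; it is purely an illustration.

```pascal
program shadowDemo(output);

{ Our own abs shadows the standard function within this program. }
{ The constant result merely makes the difference visible.       }
function abs(x: integer): integer;
begin
	abs := -1;
end;

begin
	{ calls the function defined above }
	writeLn(abs(-5):1);
	{ the FQI reaches the standard function of the RTS }
	writeLn(system.abs(-5):1);
end.
```

The first line of output is -1, the second is 5. The compiler may hint that the parameter x is not used.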

### Debugging

The FPC comes with a special unit heapTrc (heap trace). This unit provides a memory manager. It is used to find out whether the program fails to release any memory blocks it earlier reserved for itself. Allocating memory and not handing it back to the OS is called “memory leaking” and is a very bad circumstance. Due to the heapTrc unit’s intrusive behavior into Pascal’s memory management, it also needs to be loaded very soon after the system unit has been loaded. Hence, FPC forbids you to explicitly include the heapTrc unit in the uses clause, but provides the -gh compiler command-line switch that will ensure inclusion of that unit.

The heapTrc unit is only used at the development stage. It can print a memory report after the program’s final end..
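To try this out, you could compile a deliberately leaky program such as the following sketch with the -gh switch (for example fpc -gh leakDemo.pas, assuming the source is stored under that name). The exact wording of the report depends on the FPC version, so it is not reproduced here.

```pascal
program leakDemo(output);

var
	p: ^integer;
begin
	new(p);
	p^ := 42;
	writeLn(p^:1);
	{ The matching dispose(p) is deliberately missing. }
end.
```

After printing 42 the program terminates, and the memory report should mention one memory block that was never released. Adding dispose(p) before the final end. should make that entry disappear.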

The heapTrc unit is somewhat easy to use, but also limited in its features. We recommend using dedicated debugging and profiling tools such as valgrind(1), as knowing how to use such tools will serve you well if you ever switch programming languages. If you specify the -gv switch on fpc(1)’s invocation, the FPC will insert debugging information for usage with valgrind(1).

## Other modularization implementations

The Extended Pascal standard lays out a specification for modules. These provide advanced means of modularization. However, neither the FPC nor Delphi supports this; only the GPC does.

Write a unit that says goodbye on program termination, i. e. prints a message to the terminal.
You can utilize the finalization section as a hook to achieve that behavior:

unit friendly;
interface
implementation
finalization
begin
	writeLn('Goodbye!');
end;
end.

Notes:

1. The module concept described in the Extended Pascal standard does allow module parameterization.
2. As of version 3.0.4 the FPC does not support propagation of identifiers.
3. PXSC-style modules require the usage of use global someModuleName in order to enable propagation of identifiers.


# Object Oriented Programming


Object-oriented Pascal lets you define classes, i. e. data types that bundle data with the routines operating on that data. Well-chosen classes can save development time and make programs more flexible.

This is a sample program (tested with the FPC) that stores the number 1 in the private variable One, increases it by one and then prints it.

program types;  // this is a simple program
{$mode objFPC}  // the FPC knows the class keyword only in an object-oriented mode

type
	MyType = class
	private
		One: integer;
	public
		function Myget(): integer;
		procedure Myset(val: integer);
		procedure Increase();
	end;

function MyType.Myget(): integer;
begin
	Myget := One;
end;

procedure MyType.Myset(val: integer);
begin
	One := val;
end;

procedure MyType.Increase();
begin
	One := One + 1;
end;

var
	NumberClass: MyType;
begin
	NumberClass := MyType.Create;  // creating instance
	NumberClass.Myset(1);
	NumberClass.Increase();
	writeln('Result: ', NumberClass.Myget());
	NumberClass.Free;  // destroy instance
end.

This example is very basic and does not yet show what makes OOP worthwhile. Far more elaborate examples can be found in Delphi and Lazarus, both of which make heavy use of object-oriented programming.

# Miscellaneous Extensions

The last Pascal-related standards, ISO standard 7185 “[Standard] Pascal” and ISO standard 10206 “Extended Pascal”, were published in 1990. IT, however, has not stopped evolving since. Several compiler manufacturers have continued extending Pascal with miscellaneous extensions, some of which we present here.

## Inline assembly

Since TP version 1.0 there exists the possibility to include assembly language inside your Pascal source code. This is called inline assembly. While normal Pascal is surrounded by a begin … end frame, assembly language can be framed by asm … end. Here is an example that can be compiled with the FPC:

program asmDemo(input, output, stdErr);

{$ifNDef CPUx86_64}
	{$fatal only for x86_64}
{$endIf}

var
	foo: int64;
begin
	write('Enter an integer: ');
	readLn(foo);
	
	// This directive will tell FPC
	// a certain assembly language style is used
	// within the asm...end frame.
	{$asmMode intel}
	asm
		mov rax, [foo]        // rax ≔ foo
		// ensure foo is positive
		test rax, rax         // x ≟ 0
		jns @is_positive      // if ¬SF then goto is_positive
		neg rax               // rax ≔ −rax
	@is_positive:
		// NOTE: Here we assume the popcnt instruction
		//       was supported by the processor,
		//       but this is bad style.
		//       You ought to use the cpuid instruction
		//       (if available) in order to determine
		//       whether popcnt is available.
		popcnt rax, rax       // rax ≔ popCnt(rax)
		mov [foo], rax        // foo ≔ rax
	// An array of strings after the asm-block closing ‘end’
	// tells the compiler which registers have changed
	// (you do not want to mess with the compiler’s understanding
	// which registers mean what)
	end ['rax'];
	
	writeLn('Your number has a binary digital sum of ', foo, '.');
end.

Writing inline assembly code is useful if you have special knowledge about the data and the compiler generates inefficient code. You can try to optimize for speed or size in order to mitigate performance bottlenecks.

Delphi, the FPC, and the GPC all support asm frames, but each with a few subtle differences. We therefore refer you to the respective compiler’s manual, not forgetting that this book is about programming in Pascal.