Structured Query Language/Create Table

From Wikibooks, open books for an open world
Jump to navigation Jump to search



One of the basic steps during database development cycles is the fixing of decisions about the table structure. To do so, there is the CREATE TABLE statement with which developers define tables together with their columns and constraints.

Because a lot of features may be activated by the command, its syntax is a little bit complex. This page shows the most important parts. The syntax is not straight forward. At some points it is possible to use alternative formulations to express the same purpose, e.g. the Primary Key may be defined within the column definition as a column constraint, at the end of the command as a table constraint or as a separate stand-alone command 'ALTER TABLE ADD CONSTRAINT ...;' .

CREATE TABLE <tablename> (
  <column_name> <data_type> <default_value> <identity_specification> <column_constraint>,
  <column_name> <data_type> <default_value>                          <column_constraint>,
  ...,
  <table_constraint>,
  <table_constraint>,
  ...
);

General Description[edit | edit source]

After the introductory key words CREATE TABLE, the tablename is specified. Within a pair of parentheses, a list of column definitions follows. Each column is defined by its name, data type, an optional default value, and optional constraints for this individual column.

After the list of column definitions, developers can specify table constraints like Primary and Foreign Keys, Unique conditions, and general column conditions.

An first example was shown at the page Create a simple Table and a second one here:

CREATE TABLE test_table (
  -- define columns (name / type / default value / column constraint
  id             DECIMAL                           PRIMARY KEY,
  part_number    CHAR(10)          DEFAULT 'n/a'   NOT NULL,
  part_name      VARCHAR(500),
  state          DECIMAL           DEFAULT -1,
  -- define table constraints (eg: 'n/a' shall correlate with NULL)
  CONSTRAINT test_check CHECK ((part_number  = 'n/a' AND part_name IS     NULL) OR
                               (part_number != 'n/a' AND part_name IS NOT NULL))
);

The table consists of 4 columns. All of them have a data type and some a default value. The column id acts as the Primary Key. The table constraint test_check guarantees that part_name is mandatory if part_number is recorded.

Column Definition[edit | edit source]

Data Type[edit | edit source]

The standard defines a lot of predefined data types: character strings of fixed and variable size, character large objects (CLOB), binary strings of fixed and variable size, binary large objects (BLOB), numeric, boolean, datetime, interval, xml. Beyond, there are complex types like: ROW, REF(erence), ARRAY, MULTISET and user-definded types (UDT). The predefined data types are explained on the next page. To keep things simple, we use on this page only CHAR, VARCHAR, and DECIMAL.

Default Value[edit | edit source]

A column can have a default value. Its data type corresponds to the type of the column. It may be a constant value like the number -1 or the string 'n/a', or it is a system variable or a function call to determine dynamic values like the username or the actual timestamp.

The default clause affects those INSERT and MERGE commands, which do not specify the column. In our example database the person table has the column weight with the default value 0. If we omit this column in an INSERT command, the DBMS will store the value 0.

-- This INSERT command omits the 'weight' column. Therefore the value '0' (which is different from
-- the NULL value) is stored in the weight column.
INSERT INTO person (id, firstname, lastname,    date_of_birth,     place_of_birth, ssn)
VALUES             (11, 'Larry',   'Goldstein', date'1970-11-20', 'Dallas',        '078-05-1120');
COMMIT;

-- This SELECT retrieves the row ...
SELECT * 
FROM   person
WHERE  id = 11
AND    weight = 0;

-- ... but not this one:
SELECT * 
FROM   person
WHERE  id = 11
AND    weight IS NULL;

Identity Specification[edit | edit source]

The identity specification serves for the generation of a series of unique values that act as the Primary Key to the table's rows. The standard defines the syntax as: "GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY". Unfortunately, most DBMS vendors do not support this formulation. Instead, they offer different syntaxes and even different concepts to generate primary key values. Some use a combination of generators/sequences and triggers, others a special data type, or different keywords.

An overview about the wide spread of implementations is available in the wikibook SQL Dialects Reference: Auto-increment_column.

Column Constraint[edit | edit source]

The column constraint clause specifies conditions which all values must meet. There are different column constraint types:

NOT NULL
Primary Key
Unique
Foreign Key
Check values


The NOT NULL phrase defines, that it is not allowed to store the NULL value in the column.

-- The column col_1 is per definition not allowed to hold the NULL value
CREATE TABLE t1 (col_1 DECIMAL NOT NULL);

-- This INSERT command will fail
INSERT INTO t1(col_1) values(NULL);

-- The same applies to the following UPDATE command
INSERT INTO t1(col_1) values(5);
UPDATE t1 SET col_1 = NULL;


The PRIMARY KEY phrase defines that the column acts as the Primary Key of the table. This implies that the column is not allowed to store a NULL value and that the values of all rows are distinct from each other.

CREATE TABLE t2 (col_1 DECIMAL PRIMARY KEY);

-- This INSERT will fail because a primary key column is not allowed to store the NULL value.
INSERT INTO t2(col_1) VALUES(NULL);

-- This INSERT works
INSERT INTO t2(col_1) VALUES(5);

-- But the next INSERT will fail, because only one row with the value '5' is allowed.
INSERT INTO t2(col_1) VALUES(5);


The UNIQUE constraint has a similar meaning as the PRIMARY KEY phrase. But there are two slight differences.

First, the values of different rows of a UNIQUE column are not allowed to be equal, which is the same as with PK. But they are allowed to hold the NULL value, which is different from Primary Key. The existence of NULL values has an implication. As the term null = null never evaluates to true (it evaluates to unknown) there may exist multiple rows with the NULL value in a column which is defined to be UNIQUE.

Second, only one Primary Key definition per table is allowed. In contrast, there may be many UNIQUE constraints (on different columns).

CREATE TABLE t3 (col_1 DECIMAL UNIQUE);

-- works well
INSERT INTO t3(col_1) VALUES(5);
-- fails because there is another row with value 5
INSERT INTO t3(col_1) VALUES(5);

-- works well
INSERT INTO t3(col_1) VALUES(null);
-- works also
INSERT INTO t3(col_1) VALUES(null);

-- check the results
SELECT * FROM t3;


The FOREIGN KEY condition defines that the column can hold only those values, which are also stored in a different column of (the same or) another table. This different column has to be UNIQUE or a Primary Key, whereas the values of the foreign key column itself may hold identical values for multiple rows. The consequence is that one cannot create a row with a certain value in this column before there is a row with exactly this certain value in the referred table. In our example database, we have a contact table whose column person_id refers to the id of persons. It makes sense that one cannot store contact values before storing the appropriate person.

Foreign Keys are the technique to realize one-to-many (1:m) relationships.

-- A table with a column which refers to the 'id' column of table 'person'
CREATE TABLE t4 (col_1 DECIMAL REFERENCES person(id));

-- This INSERT works as in table 'person' of our example database there is a row with id = 3.
INSERT INTO t4(col_1) VALUES(3);

-- This statement will fail because in 'person' there is no row with id = 99.
INSERT INTO t4(col_1) VALUES(99);


Column checks inspect the values of the column to see whether they meet the defined criterion. Within such column checks, only the actual column is visible. If a condition covers two or more columns (e.g., col_1 > col_2) a table check must be used.

-- 'col_1' shall contain only values from 1 to 10.
-- A hint to MySQL users: MySQL before 8.0.16 accepts the syntax of column checks - but ignores the definitions silently. The same applies to MariaDB before 10.2.1.
CREATE TABLE t5 (col_1 DECIMAL CHECK (col_1 BETWEEN 1 AND 10));

-- This INSERT works:
INSERT INTO t5(col_1) VALUES(3);

-- This statement will fail:
INSERT INTO t5(col_1) VALUES(99);

Table Constraint[edit | edit source]

Table constraints define rules which are mandatory for the table as a whole. Their semantic and syntax overlaps partially with the previous shown column constraints.

Table constraints are defined after the definition of all columns. The syntax starts with the keyword CONSTRAINT, followed by an optional name. The following example includes the optional names t6_pk, t6_ik and t6_fk. It is a good practice to include names. In the case of an error exception, most DBMS will include this name as part of related error messages - if a name isn't defined, the DBMS may use its internal naming convention, which can be cryptic.


Primary Key, UNIQUE and Foreign Key[edit | edit source]

In the same manner as shown in the column constraints part Primary Key, UNIQUE and Foreign Key conditions can be expressed as table constraints. The syntax differs slightly from the column constraint syntax; the semantic is identical.

-- A table with a PK column, one UNIQUE column and a FK column.
CREATE TABLE t6 (
  col_1 DECIMAL,
  col_2 CHAR(10),
  col_3 DECIMAL,
  CONSTRAINT t6_pk PRIMARY KEY (col_1), -- 't6_pk' is the name of the constraint
  CONSTRAINT t6_uk UNIQUE      (col_2),
  CONSTRAINT t6_fk FOREIGN KEY (col_3) REFERENCES person(id)
);

NOT NULL and Simple Column Checks[edit | edit source]

Similar to column constraints part NOT NULL conditions and simple column checks can be expressed as table expressions.

CREATE TABLE t7 (
  col_1 DECIMAL,
  col_2 DECIMAL,
  CONSTRAINT t7_col_1_nn    CHECK (col_1 IS NOT NULL),
  CONSTRAINT t7_col_2_check CHECK (col_2 BETWEEN 1 and 10)
);

General Column Checks[edit | edit source]

If a condition affects more than one column, it must be expressed as a table constraint.

CREATE TABLE t8 (
  col_1 DECIMAL,
  col_2 DECIMAL,
  col_3 DECIMAL,
  col_4 DECIMAL,
  -- col_1 can hold only those values which are greater than col_2 
  CONSTRAINT t8_check_1 CHECK (col_1 > col_2),
  -- If col_3 is NULL, col_4 must be NULL also
  CONSTRAINT t8_check_2 CHECK ((col_3 IS     NULL AND col_4 IS     NULL) OR
                               (col_3 IS NOT NULL AND col_4 IS NOT NULL))
);

-- These two INSERTs work as they meet all conditions
INSERT INTO t8 VALUES(1, 0, null, null);
INSERT INTO t8 VALUES(2, 0, 5, 5);

-- Again: MySQL ignores check conditions silently

-- This INSERT fails because col_1 is not greater than col_2
INSERT INTO t8 VALUES(3, 6, null, null);

-- This INSERT fails because col_3 is not null and col_4 is null
INSERT INTO t8 VALUES(4, 0, 5, null);

Column Constraints vs. Table Constraints[edit | edit source]

As you have seen, some constraints may be defined as part of the column definition, which is called a column constraint, or as a separate table constraint. Table constraints have two advantages. First, they are a little bit more powerful.

Second, they do have their own name! This helps to understand system messages. Furthermore, it opens the possibility to manage constraints after the table exists and contains data. The ALTER TABLE statement can deactivate, activate, or delete constraints. To do so, you have to know their name.

Clean Up[edit | edit source]

DROP TABLE t1;
DROP TABLE t2;
DROP TABLE t3;
DROP TABLE t4;
DROP TABLE t5;
DROP TABLE t6;
DROP TABLE t7;
DROP TABLE t8;

Exercises[edit | edit source]

Create a table 'company' with columns 'id' (numeric, primary key), 'name' (strings of variable size up to 200), 'isin' (strings of length 12, not nullable, unique values).
Create a solution with column constraints only and another one with table constraints only.

Click to see solution
-- column constraints only
CREATE TABLE company_1 (
  id   DECIMAL PRIMARY KEY,
  name VARCHAR(200),
  isin CHAR(12) NOT NULL UNIQUE
);
-- table constraints only
CREATE TABLE company_2 (
  id   DECIMAL,
  name VARCHAR(200),
  isin CHAR(5),
  CONSTRAINT company_2_pk PRIMARY KEY (id),
  CONSTRAINT company_2_uk UNIQUE      (isin),
  CONSTRAINT company_2_check_isin CHECK (isin IS NOT NULL)
);

Create a table 'accessory' with columns 'id' (numeric, primary key), 'name' (strings of variable size up to 200, unique), 'hobby_id' (decimal, not nullable, foreign key to column 'id' of table 'hobby').
Create a solution with column constraints only and another one with table constraints only.

Click to see solution
-- column constraints only
CREATE TABLE accessory_1 (
  id       DECIMAL PRIMARY KEY,
  name     VARCHAR(200) UNIQUE,
  hobby_id DECIMAL NOT NULL REFERENCES hobby(id)
);
-- table constraints only
CREATE TABLE accessory_2 (
  id       DECIMAL,
  name     VARCHAR(200),
  hobby_id DECIMAL,
  CONSTRAINT accessory_2_pk PRIMARY KEY (id),
  CONSTRAINT accessory_2_uk UNIQUE      (name),
  CONSTRAINT accessory_2_check_1  CHECK (hobby_id IS NOT NULL),
  CONSTRAINT accessory_2_fk FOREIGN KEY (hobby_id) REFERENCES hobby(id)
);

-- Test some legal and illegal values
INSERT INTO accessory_1 VALUES (1, 'Fishing-rod', 2);
COMMIT;
-- ...