Structured Query Language/Create Table

From Wikibooks, open books for an open world
Jump to: navigation, search
TRUNCATE Structured Query Language
Create Table
Data Types



One of the basic steps during database development cycles is the fixing of decisions about the table structure. To do so there is the CREATE TABLE statement with which developers define tables together with their columns and constraints.

Because a lot of features may be activated by the command, its syntax is a little bit complex. This page shows the most important parts. The syntax is not straight forward. At some points it is possible to use alternative formulations to express the same purpose, e.g. the Primary Key may be defined within the column definition as a column constraint, at the end of the command as a table constraint or as a separate stand-alone command 'ALTER TABLE ADD CONSTRAINT ...;' .

CREATE TABLE <tablename> (
  <column_name> <data_type> <default_value> <identity_specification> <column_constraint>,
  <column_name> <data_type> <default_value>                          <column_constraint>,
  ...,
  <table_constraint>,
  <table_constraint>,
  ...
);

General Description[edit]

After the introductory key words CREATE TABLE the tablename is specified. Within a pair of parentheses a list of column definitions follows. Each column is defined by its name, data type, an optional default value and optional constraints for this individual column.

After the list of column definitions developers can specifiy table constraints like Primary and Foreign Keys, Unique conditions and general column conditions.

An first example was shown at the page Create a simple Table and a second one here:

CREATE TABLE test_table (
  -- define columns (name / type / default value / column constraint
  id             DECIMAL                           PRIMARY KEY,
  part_number    CHAR(10)          DEFAULT 'n/a'   NOT NULL,
  part_name      VARCHAR(500),
  state          DECIMAL           DEFAULT -1,
  -- define table constraints (eg: 'n/a' shall correlate with NULL)
  CONSTRAINT test_check CHECK ((part_number  = 'n/a' AND part_name IS     NULL) OR
                               (part_number != 'n/a' AND part_name IS NOT NULL))
);

The table consists of 4 columns. All of them have a data type and some a default value. The column id acts as the Primary Key. The table constraint test_check guarantees that part_name is mandatory if part_number is recorded.

Column Definition[edit]

Data Type[edit]

The standard defines a lot of predefined data types: character strings of fixed and variable size, character large objects (CLOB), binary strings of fixed and variable size, binary large objects (BLOB), numeric, boolean, datetime, interval, xml. Beyond there are complex types like: ROW, REF(erence), ARRAY, MULTISET and user-definded types (UDT). The predefined data types are explained on the next page. To keep things simple we use on this page only CHAR, VARCHAR and DECIMAL.

Default Value[edit]

A column can have a default value. Its data type corresponds to the type of the column. It may be a constant value like the number -1 or the string 'n/a', or it is a system variable or a function call to determine dynamic values like the username or the actual timestamp.

The default clause affects those INSERT and MERGE commands, which do not specify the column. In our example database the person table has the column weight with the default value 0. If we omit this column in an INSERT command, the DBMS will store the value 0.

-- This INSERT command omits the 'weight' column. Therefore the value '0' (which is different from
-- the NULL value) is stored in the weight column.
INSERT INTO person (id, firstname, lastname,    date_of_birth,     place_of_birth, ssn)
VALUES             (11, 'Larry',   'Goldstein', DATE'1970-11-20', 'Dallas',        '078-05-1120');
COMMIT;
 
-- This SELECT retrieves the row ...
SELECT * 
FROM   person
WHERE  id = 11
AND    weight = 0;
 
-- ... but not this one:
SELECT * 
FROM   person
WHERE  id = 11
AND    weight IS NULL;

Identity Specification[edit]

The identity specification serves for the generation of a series of unique values which acts as the Primary Key to the tables rows. The standard defines the syntax as: "GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY". Unfortunatelly most DBMS vendors do not support this formulation. Instead they offer different syntaxes and even different concepts to generate primary key values. Some use a combination of generators/sequences and triggers, others a special data type or different key words.

An overview about the wide spread of implementations is available in the wikibook SQL Dialects Reference: Auto-increment_column.

Column Constraint[edit]

The column constraint clause specifies conditions which all values must meet. There are different column constraint types:

NOT NULL
Primary Key
Unique
Foreign Key
Check values


The NOT NULL phrase defines, that it is not allowed to store the NULL value in the column.

-- The column col_1 is per definition not allowed to hold the NULL value
CREATE TABLE t1 (col_1 DECIMAL NOT NULL);
 
-- This INSERT command will fail
INSERT INTO t1(col_1) VALUES(NULL);
 
-- The same applies to the following UPDATE command
INSERT INTO t1(col_1) VALUES(5);
UPDATE t1 SET col_1 = NULL;


The PRIMARY KEY phrase defines that the column acts as the Primary Key of the table. This implies that the column is not allowed to store a NULL value and that the values of all rows are distinct from each other.

CREATE TABLE t2 (col_1 DECIMAL PRIMARY KEY);
 
-- This INSERT will fail because a primary key column is not allowed to store the NULL value.
INSERT INTO t2(col_1) VALUES(NULL);
 
-- This INSERT works
INSERT INTO t2(col_1) VALUES(5);
 
-- But the next INSERT will fail, because only one row with the value '5' is allowed.
INSERT INTO t2(col_1) VALUES(5);


The UNIQUE constraint has a similar meaning as the PRIMARY KEY phrase. But there are two slight differences.

First, the values of different rows of a UNIQUE column are not allowed to be equal, which is the same as with PK. But they are allowed to hold the NULL value, which is different from PK. The existence of NULL values has an implication. As the term null = null never evaluates to true (it evaluates to unknown) there may exist multiple rows with the NULL value in a column which is defined to be UNIQUE.

Second, only one PK definition per table is allowed. In contrast, there may be many UNIQUE constraints (on different columns).

CREATE TABLE t3 (col_1 DECIMAL UNIQUE);
 
-- works well
INSERT INTO t3(col_1) VALUES(5);
-- fails because there is another row with value 5
INSERT INTO t3(col_1) VALUES(5);
 
-- works well
INSERT INTO t3(col_1) VALUES(NULL);
-- works also
INSERT INTO t3(col_1) VALUES(NULL);
 
-- check the results
SELECT * FROM t3;


The FOREIGN KEY condition defines that the column can hold only those values, which are also stored in a different column of (the same or) another table. This different column has to be UNIQUE or a Primary Key, whereas the values of the foreign key column itself may hold identical values for multiple rows. The consequence is that one cannot create a row with a certain value in this column before there is a row with exactly this certain value in the referred table. In our example database we have a contact table whose column person_id refers to the id of persons. It makes sense that one cannot store contact values before storing the appropriate person.

Foreign Keys are the technique to realise 1:m relationships.

-- A table with a column which referes to the 'id' column of table 'person'
CREATE TABLE t4 (col_1 DECIMAL REFERENCES person(id));
 
-- This INSERT works as in table 'person' of our example database there is a row with id = 3.
INSERT INTO t4(col_1) VALUES(3);
 
-- This statement will fail as in 'person' there is no row with id = 99.
INSERT INTO t4(col_1) VALUES(99);


Column checks inspect the values of the column to see whether they meet the defined criterion. Within such column checks only the actual column is visible. If a condition covers two or more colums (eg.: col_1 > col_2) a table check must be used.

-- 'col_1' shall contain only values from 1 to 10.
-- A hint to MySQL users: MySQL accepts the syntax of column checks - but it ignores them silently.
CREATE TABLE t5 (col_1 DECIMAL CHECK (col_1 BETWEEN 1 AND 10));
 
-- This INSERT works:
INSERT INTO t5(col_1) VALUES(3);
 
-- This statement will fail:
INSERT INTO t5(col_1) VALUES(99);

Table Constraint[edit]

Table constraints defines rules which are mandatory for the table as a whole. Their sematic and syntax overlaps partially with the previous shown column constraints.

Table constraints are defined after the definition of all columns. The syntax starts with the key word CONSTRAINT and includes the possibility to denominate them with a meaningful name, t6_pk, t6_uk and t6_fk in the next example. In the case of any exception most DBMS shows this name as part of the error message - and if you havn't defined one it uses its internal naming conventions which may be very cryptic.

Primary Key, UNIQUE and Foreign Key[edit]

In the same manner as shown in the column constraints part Primary Key, UNIQUE and Foreign Key conditions can be expressed as table constraints. The syntax differs slightly from the column constraint syntax, the semantic is identical.

-- A table with a PK column, one UNIQUE column and a FK column.
CREATE TABLE t6 (
  col_1 DECIMAL,
  col_2 CHAR(10),
  col_3 DECIMAL,
  CONSTRAINT t6_pk PRIMARY KEY (col_1), -- 't6_pk' is the name of the constraint
  CONSTRAINT t6_uk UNIQUE      (col_2),
  CONSTRAINT t6_fk FOREIGN KEY (col_3) REFERENCES person(id)
);

NOT NULL and Simple Column Checks[edit]

In a similar way as shown in the column constraints part NOT NULL conditions and simple column checks can be expressed as table expressions.

CREATE TABLE t7 (
  col_1 DECIMAL,
  col_2 DECIMAL,
  CONSTRAINT t7_col_1_nn    CHECK (col_1 IS NOT NULL),
  CONSTRAINT t7_col_2_check CHECK (col_2 BETWEEN 1 AND 10)
);

General Column Checks[edit]

If a condition affects more than one column it must be expressed as a table constraint.

CREATE TABLE t8 (
  col_1 DECIMAL,
  col_2 DECIMAL,
  col_3 DECIMAL,
  col_4 DECIMAL,
  -- col_1 can hold only those values which are greater than col_2 
  CONSTRAINT t8_check_1 CHECK (col_1 > col_2),
  -- If col_3 is NULL, col_4 must be NULL also
  CONSTRAINT t8_check_2 CHECK ((col_3 IS     NULL AND col_4 IS     NULL) OR
                               (col_3 IS NOT NULL AND col_4 IS NOT NULL))
);
 
-- This two INSERTs work as they meet all conditions
INSERT INTO t8 VALUES(1, 0, NULL, NULL);
INSERT INTO t8 VALUES(2, 0, 5, 5);
 
-- Again: MySQL ignores check conditions silently
 
-- This INSERT fails because col_1 is not greater than col_2
INSERT INTO t8 VALUES(3, 6, NULL, NULL);
 
-- This INSERT fails because col_3 is not null and col_4 is null
INSERT INTO t8 VALUES(4, 0, 5, NULL);

Column Constraints vs. Table Constraints[edit]

As you have seen some constraints may be defined as part of the column definition, which is called a column constraint, or as a separate table constraint. Table constraints have two advantages. First, they are a little bit more powerful.

Second, they do have their own name! This helps to understand system messages. Furthermore it opens the possibility to manage constraints after the table exists and contains data. The ALTER TABLE statement can deactivate, activate or delete constraints. To do so, you have to know their name.

Clean Up[edit]

DROP TABLE t1;
DROP TABLE t2;
DROP TABLE t3;
DROP TABLE t4;
DROP TABLE t5;
DROP TABLE t6;
DROP TABLE t7;
DROP TABLE t8;

Exercises[edit]

Create a table 'company' with columns 'id' (numeric, primary key), 'name' (strings of variable size up to 200), 'isin' (strings of length 12, not nullable, unique values).
Create a solution with column constraints only and another one with table constraints only.

Click to see solution
-- column constraints only
CREATE TABLE company_1 (
  id   DECIMAL PRIMARY KEY,
  name VARCHAR(200),
  isin CHAR(12) NOT NULL UNIQUE
);
-- table constraints only
CREATE TABLE company_2 (
  id   DECIMAL,
  name VARCHAR(200),
  isin CHAR(5),
  CONSTRAINT company_2_pk PRIMARY KEY (id),
  CONSTRAINT company_2_uk UNIQUE      (isin),
  CONSTRAINT company_2_check_isin CHECK (isin IS NOT NULL)
);

Create a table 'accessory' with columns 'id' (numeric, primary key), 'name' (strings of variable size up to 200, unique), 'hobby_id' (decimal, not nullable, foreing key to column 'id' of table 'hobby').
Create a solution with column constraints only and another one with table constraints only.

Click to see solution
-- column constraints only
CREATE TABLE accessory_1 (
  id       DECIMAL PRIMARY KEY,
  name     VARCHAR(200) UNIQUE,
  hobby_id DECIMAL NOT NULL REFERENCES hobby(id)
);
-- table constraints only
CREATE TABLE accessory_2 (
  id       DECIMAL,
  name     VARCHAR(200),
  hobby_id DECIMAL,
  CONSTRAINT accessory_2_pk PRIMARY KEY (id),
  CONSTRAINT accessory_2_uk UNIQUE      (name),
  CONSTRAINT accessory_2_check_1  CHECK (hobby_id IS NOT NULL),
  CONSTRAINT accessory_2_fk FOREIGN KEY (hobby_id) REFERENCES hobby(id)
);
 
-- Test some legal and illegal values
INSERT INTO accessory_1 VALUES (1, 'Fishing-rod', 2);
COMMIT;
-- ...


TRUNCATE Structured Query Language
Create Table
Data Types