Structured Query Language/Example Database Data

From Wikibooks, open books for an open world
Jump to navigation Jump to search



rDBMS offers different ways to put data into their storage: from CSV files, Excel files, product-specific binary files, via several API's or special gateways to other databases respectively database systems and some more technics. So there is a wide range of - non standardized - possibilities to bring data into our system. Because we are speaking about SQL, we use the standardized INSERT command to do the job. It is available on all systems.

We use only a small amount of data because we want to keep things simple. Sometimes one needs a high number of rows to do performance tests. For this purpose, we show a special INSERT command at the end of this page, which exponentially inflates your table.

person[edit | edit source]

--
-- After we have done a lot of tests we may want to reset the data to its original version.
-- To do so, use the DELETE command. But be aware of Foreign Keys: you may be forced to delete
-- persons at the very end - with DELETE it's just the opposite sequence of tables in comparison to INSERTs.
-- Be careful and don't confuse DELETE with DROP !!
--
-- DELETE FROM person_hobby;
-- DELETE FROM hobby;
-- DELETE FROM contact;
-- DELETE FROM person;
-- COMMIT;

INSERT INTO person VALUES (1,  'Larry',  'Goldstein', DATE'1970-11-20', 'Dallas',        '078-05-1120', 95);
INSERT INTO person VALUES (2,  'Tom',    'Burton',    DATE'1977-01-22', 'Birmingham',    '078-05-1121', 75);
INSERT INTO person VALUES (3,  'Lisa',   'Hamilton',  DATE'1975-12-23', 'Richland',      '078-05-1122', 56);
INSERT INTO person VALUES (4,  'Kim',    'Goldstein', DATE'2011-06-01', 'Shanghai',      '078-05-1123', 11);
INSERT INTO person VALUES (5,  'James',  'de Winter', DATE'1975-12-23', 'San Francisco', '078-05-1124', 75);
INSERT INTO person VALUES (6,  'Elias',  'Baker',     DATE'1939-10-03', 'San Francisco', '078-05-1125', 55);
INSERT INTO person VALUES (7,  'Yorgos', 'Stefanos',  DATE'1975-12-23', 'Athens',        '078-05-1126', 64);
INSERT INTO person VALUES (8,  'John',   'de Winter', DATE'1977-01-22', 'San Francisco', '078-05-1127', 77);
INSERT INTO person VALUES (9,  'Richie', 'Rich',      DATE'1975-12-23', 'Richland',      '078-05-1128', 90);
INSERT INTO person VALUES (10, 'Victor', 'de Winter', DATE'1979-02-28', 'San Francisco', '078-05-1129', 78);
COMMIT;

Please note that the format of DATEs may depend on your local environment. Furthermore, SQLite uses a different syntax for the implicit conversion from string to DATE.

-- SQLite syntax
INSERT INTO person VALUES (1,  'Larry',  'Goldstein', DATE('1970-11-20'), 'Dallas',      '078-05-1120', 95);
...

contact[edit | edit source]

-- DELETE FROM contact;
-- COMMIT;

INSERT INTO contact VALUES (1,  1,  'fixed line', '555-0100');
INSERT INTO contact VALUES (2,  1,  'email',      'larry.goldstein@acme.xx');
INSERT INTO contact VALUES (3,  1,  'email',      'lg@my_company.xx');
INSERT INTO contact VALUES (4,  1,  'icq',        '12111');
INSERT INTO contact VALUES (5,  4,  'fixed line', '5550101');
INSERT INTO contact VALUES (6,  4,  'mobile',     '10123444444');
INSERT INTO contact VALUES (7,  5,  'email',      'james.dewinter@acme.xx');
INSERT INTO contact VALUES (8,  7,  'fixed line', '+30000000000000');
INSERT INTO contact VALUES (9,  7,  'mobile',     '+30695100000000');
COMMIT;

hobby[edit | edit source]

-- DELETE FROM hobby;
-- COMMIT;

INSERT INTO hobby VALUES (1,  'Painting',
                              'Applying paint, pigment, color or other medium to a surface.');
INSERT INTO hobby VALUES (2,  'Fishing',
                              'Catching fishes.');
INSERT INTO hobby VALUES (3,  'Underwater Diving',
                              'Going underwater with or without breathing apparatus (scuba diving / breath-holding).');
INSERT INTO hobby VALUES (4,  'Chess',
                              'Two players have 16 figures each. They move them on an eight-by-eight grid according to special rules.');
INSERT INTO hobby VALUES (5,  'Literature', 'Reading books.');
INSERT INTO hobby VALUES (6,  'Yoga',
                              'A physical, mental, and spiritual practices which originated in ancient India.');
INSERT INTO hobby VALUES (7,  'Stamp collecting',
                              'Collecting of post stamps and related objects.');
INSERT INTO hobby VALUES (8,  'Astronomy',
                              'Observing astronomical objects such as moons, planets, stars, nebulae, and galaxies.');
INSERT INTO hobby VALUES (9,  'Microscopy',
                              'Observing very small objects using a microscope.');
COMMIT;

person_hobby[edit | edit source]

-- DELETE FROM person_hobby;
-- COMMIT;

INSERT INTO person_hobby VALUES (1, 1, 1);
INSERT INTO person_hobby VALUES (2, 1, 4);
INSERT INTO person_hobby VALUES (3, 1, 5);
INSERT INTO person_hobby VALUES (4, 5, 2);
INSERT INTO person_hobby VALUES (5, 5, 3);
INSERT INTO person_hobby VALUES (6, 7, 8);
INSERT INTO person_hobby VALUES (7, 4, 4);
INSERT INTO person_hobby VALUES (8, 9, 8);
INSERT INTO person_hobby VALUES (9, 9, 9);
COMMIT;

Grow up[edit | edit source]

For realistic performance tests, we need a vast amount of data. The few number of rows in our example database does not meet this criteria. How can we generate test data and store it in a table? There are different possibilities: FOR loops in a procedure, (pseudo-) recursive calls, importing external data in a system-specific fashion, and some more.

Because we are dealing with SQL, we introduce an INSERT command, which is portable across all rDBMS. Although it has a simple syntax, it is very powerful. With every execution, it will double the number of rows. Suppose there is 1 row in a table. After the first execution, there will be a second row in the table. At first glance, this sounds boring. But after 10 executions there are more than a thousand rows, after 20 executions there are more than a million, and we suspect that only a few installations can execute it more than 30 times.

INSERT INTO person (id,                                firstname, lastname, weight)
SELECT              id + (select max(id) from person), firstname, lastname, weight
FROM        person;
COMMIT;

The command is an INSERT in combination with a (Sub-)SELECT. The SELECT retrieves all rows of the table because there is no WHERE clause. This is the reason for the doubling. The mandatory columns firstname and lastname keeps unchanged. We ignore optional columns. Only the primary key id is computed. The new value is the sum of the old value plus the highest available id when starting the command.

Some more remarks:

  • max(id) is determined only once per execution! This illustrates an essential aspect of rDBMS: At a conceptual level, the database has a particular state before execution of a command and a new state after its execution. Commands are atomic operations moving the database from one state to another - they run entirely or not a bit! Both, the SELECT and the inner SELECT with the max(id), act on the initial state. They never see the result or an intermediate result of the INSERT. Otherwise, the INSERT would never end.
  • If we wish to observe the process of growing, we can add a column to the table to store max(id) with each iteration.
  • The computation of the new id may be omitted if the DBMS supports AUTOINCREMENT columns.
  • For performance tests, it may be helpful to store some random data in one or more columns.