Structured Query Language/Relational DBMS (rDBMS)

From Wikibooks, open books for an open world
Jump to navigation Jump to search

The Theory

[edit | edit source]

A relational DBMS is an implementation of data stores according to the design rules of the relational model. This approach allows operations on the data according to the relational algebra like projections, selections, joins, set operations (union, difference, intersection, ...), and more. Together with Boolean algebra (and, or, not, exists, ...) and other mathematical concepts, relational algebra builds up a complete mathematical system with basic operations, complex operations, and transformation rules between the operations. Neither a DBA nor an application programmer needs to know the relational algebra. But it is helpful to know that your rDBMS is based on this mathematical foundation - and that it has the freedom to transform queries into several forms.

The Data Model

[edit | edit source]

The relational model designs data structures as relations (tables) with attributes (columns) and the relationship between those relations. The information about one entity of the real world is stored within one row of a table. However, the term one entity of the real world must be used with care. It may be that our intellect identifies a machine like a single airplane in this vein. Depending on the information requirements, it may be sufficient to put all of the information into one row of a table airplane. But in many cases, it is necessary to break the entity into its pieces and model the pieces as discrete entities, including the relationship to the whole thing. If, for example, information about every single seat within the airplane is needed, a second table seat and some way of joining seats to airplanes will be required.

This way of breaking up information about real entities into a complex data model depends highly on the information requirements of the business concept. Additionally, there are some formal requirements independent of any application: the resulting data model should conform to a so-called normal form. Normally these data models consist of a great number of tables and relationships between them. Such models will not predetermine their use by applications; they are strictly descriptive and will not restrict access to the data in any way.

Some more Basics

[edit | edit source]

Operations within databases must have the ability to act not only on single rows, but also on sets of rows. Relational algebra offers this possibility. Therefore languages based on relational algebra, e.g.: SQL, offer a powerful syntax to manipulate lots of data within one single command.

As operations within relational algebra may be replaced by different but logically equivalent operations, a language based on relational algebra should not predetermine how its syntax is mapped to operations (the execution plan). The language should describe what should be done and not how to do it. Note: This choice of operations does not concern the use or neglect of indices.

As described before the relational model tends to break up objects into sub-objects. In this and in other cases it is often necessary to collect associated information from a bunch of tables into one information unit. How is this possible without links between participating tables and rows? The answer is: All joining is done based on the values which are actually stored in the attributes. The rDBMS must make its own decisions about how to reach all concerned rows: whether to read all potentially affected rows and ignore those which are irrelevant (full table scan) or, to use some kind of index and read-only those which match the criteria. This value-based approach allows even the use of operators other than the equal-operator, e.g.:

SELECT * FROM gift JOIN box ON gift.extent < box.extent;

This command will join all "gift" records to all "box" records with a larger "extent" (whatever "extent" means).