- 1 SELECT
- 2 Joins
- 3 Subqueries
- 4 References
- 5 Resources
select syntax is as follows:
SELECT * FROM a_table_name WHERE condition GROUP BY grouped_field HAVING group_name condition ORDER BY ordered_field LIMIT limit_number, offset
List of fields
You must specify what data you're going to retrieve in the SELECT clause:
SELECT DATABASE() -- returns the current db's name SELECT CURRENT_USER() -- returns your username SELECT 1+1 -- returns 2
Any SQL expression is allowed here.
You can also retrieve all fields from a table:
SELECT * FROM `stats`
If you SELECT only the necessary fields, the query will be faster.
The table's name
If you are retrieving results from a table or a view, usually you specify the table's name in the FROM clause:
SELECT id FROM `stats` -- retrieve a field called id from a table called stats
SELECT MAX(id) FROM `stats` SELECT id*2 FROM `stats`
You can also use the `db_name`.`table_name` syntax:
SELECT id FROM `sitedb`.`stats`
But you can also specify the table's name in the SELECT clause:
SELECT `stats`.`id` -- retrieve a field called id from a table SELECT `sitedb`.`stats`.`id`
You can set a filter to decide what records must be retrieved.
For example, you can retrieve only the record which has an id of 42:
SELECT * FROM `stats` WHERE `id`=42
Or you can read more than one record:
SELECT * FROM `antiques` WHERE buyerid IS NOT NULL
You can group all records by one or more fields. The record which have the same value for that field will be grouped in one computed record. You can only select the grouped record and the result of some aggregate functions, which will be computed on all records of each group.
For example, the following will group all records in the table `users` by the field `city`. For each group of users living in the same city, the maximum age, the minimum age and the average age will be returned:
SELECT city, MAX(age), MIN(age), AVG(age) GROUP BY `city`
In the following example, the users are grouped by city and sex, so that we'll know the max, min and avg age of male/female users in each city:
SELECT city, sex, MAX(age), MIN(age), AVG(age) GROUP BY `city`, `sex`
The HAVING clause declares a filter for the records which are computed by the GROUP BY clause. It's different from the WHERE clause, that operates before the GROUP BY. Here's what happens:
- The records which match to the WHERE clause are retrieved
- Those records are used to compute new records as defined in the GROUP BY clause
- The new records that match to the HAVING conditions are returned
This means which WHERE decides what record are used to compose the new computed records.
HAVING decides what computed records are returned, so it can operate on the results of aggregate functions. HAVING is not optimized and can't use indexes.
Incorrect use of HAVING:
SELECT city, sex, MAX(age), MIN(age), AVG(age) GROUP BY `city` HAVING sex='m'
This probably gives a wrong results. MAX(age) and other aggregate calculations are made using all values, even if the record's sex value is 'f'. This is hardly the expected result.
Incorrect use of HAVING:
SELECT city, sex, MAX(age), MIN(age), AVG(age) GROUP BY `city`, `sex` HAVING sex='m'
This is correct and returns the expected results, but the execution of this query is not optimized. The WHERE clause can and should be used, because, so that MySQL doesn't computes records which are excluded later.
Correct use of HAVING:
SELECT city, sex, MAX(age), MIN(age), AVG(age) GROUP BY `city` HAVING MAX(age) > 80
It must group all records, because can't decide the max age of each city before the GROUP BY clause is execute. Later, it returns only the record with a MAX(age)>80.
You can set an arbitrary order for the records you retrieve. The order may be alphabetical or numeric.
SELECT * FROM `stats` ORDER BY `id`
By default, the order is ASCENDING. You can also specify that the order must be DESCENDING:
SELECT * FROM `stats` ORDER BY `id` ASC -- default SELECT * FROM `stats` ORDER BY `id` DESC -- inverted
NULLs values are considered as minor than any other value.
You can also specify the field position, in place of the field name:
SELECT `name`, `buyerid` FROM `antiques` ORDER BY 1 -- name SELECT `name`, `buyerid` FROM `antiques` ORDER BY 2 -- buyerid SELECT `name`, `buyerid` FROM `antiques` ORDER BY 1 DESC
SQL expressions are allowed:
SELECT `name` FROM `antiques` ORDER BY REVERSE(`name`)
You can retrieve records in a random order:
SELECT `name` FROM `antiques` ORDER BY RAND()
If a GROUP BY clause is specified, the results are ordered by the fields named in GROUP BY, unless an ORDER BY clause is present. You can even specify in the GROUP BY clause if the order must be ascending or descending:
SELECT city, sex, MAX(age) GROUP BY `city` ASC, `sex` DESC
If you have a GROUP BY but you don't want the records to be ordered, you can use ORDER BY NULL:
SELECT city, sex, MAX(age) GROUP BY `city`, `sex` ORDER BY NULL
You can specify the maximum of rows that you want to read:
SELECT * FROM `antiques` ORDER BY id LIMIT 10
This statement returns a maximum of 10 rows. If there are less than 10 rows, it returns the number of rows found. The limit clause is usually used with ORDER BY.
You can get a given number of random records:
SELECT * FROM `antiques` ORDER BY rand() LIMIT 1 -- one random record SELECT * FROM `antiques` ORDER BY rand() LIMIT 3
You can specify how many rows should be skipped before starting to return the records found. The first record is 0, not one:
SELECT * FROM `antiques` ORDER BY id LIMIT 10 SELECT * FROM `antiques` ORDER BY id LIMIT 0, 10 -- synonym
You can use the LIMIT clause to get the pagination of results:
SELECT * FROM `antiques` ORDER BY id LIMIT 0, 10 -- first page SELECT * FROM `antiques` ORDER BY id LIMIT 10, 10 -- second page SELECT * FROM `antiques` ORDER BY id LIMIT 20, 10 -- third page
Also, the following syntax is acceptable:
SELECT * FROM `antiques` ORDER BY id LIMIT 10 OFFSET 10
You can use the LIMIT clause to check the syntax of a query without waiting for it to return the results:
SELECT ... LIMIT 0
- SQL_CALC_FOUND_ROWS may speed up a query 
- LIMIT is particularly useful for SELECTs which use ORDER BY, DISTINCT and GROUP BY, because their calculations don't have to involve all the rows.
- If the query is resolved by the server copying internally the results into a temporary table, LIMIT helps MySQL to calculate how much memory is required by the table.
The DISTINCT keyword can be used to remove all duplicate rows from the resultset:
SELECT DISTINCT * FROM `stats` -- no duplicate rows SELECT DISTINCTROW * FROM `stats` -- synonym SELECT ALL * FROM `stats` -- duplicate rows returned (default)
You can use it to get the list of all values contained in one field:
SELECT DISTINCT `type` FROM `antiques` ORDER BY `type`
Or you can use it to get the existing combinations of some values:
SELECT DISTINCT `type`, `age` FROM `antiques` ORDER BY `type`
If one of the fields you are SELECTing is the PRIMARY KEY or has a UNIQUE index, DISTINCT is useless. Also, it's useless to use DISTINCT in conjunction with the GROUP BY clause.
IN and NOT IN
SELECT id FROM stats WHERE POSITION IN ('Manager', 'Staff') SELECT ownerid, 'is in both orders & antiques' FROM orders, antiques WHERE ownerid = buyerid UNION SELECT buyerid, 'is in antiques only' FROM antiques WHERE buyerid NOT IN (SELECT ownerid FROM orders)
EXISTS and ALL
(Compatible: Mysql 4+)
SELECT ownerfirstname, ownerlastname FROM owner WHERE EXISTS (SELECT * FROM antiques WHERE item = 'chair') SELECT buyerid, item FROM antiques WHERE price = ALL (SELECT price FROM antiques)
There are some hints you may want to give to the server to better optimize the SELECTs. If you give more than one hints, the order of the keywords is important:
SELECT [ALL | DISTINCT | DISTINCTROW ] [HIGH_PRIORITY] [STRAIGHT_JOIN] [SQL_SMALL_RESULT | SQL_BIG_RESULT] [SQL_BUFFER_RESULT] [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS] ...
Usually, DML commands (INSERT, DELETE, UPDATE) have higher priority than SELECTs. If you specify HIGH_PRIORITY though, the SELECT will have higher priority than DML statements.
STRAIGHT_JOIN Force MySQL to evaluate the tables of a JOIN in the same order they are named, from the leftmost.
SQL_SMALL_RESULT It's useful only while using DISTINCT or GROUP BY. Tells the optimizer that the query will return few rows.
SQL_BIG_RESULT It's useful only while using DISTINCT or GROUP BY. Tells the optimizer that the query will return a many rows.
SQL_BUFFER_RESULT Force MySQL to copy the result into a temporary table. This is useful to remove LOCKs as soon as possible.
SQL_CACHE Forces MySQL to copy the result into the query cache. Only works if the value of query_cache_type is DEMAND or 2.
SQL_NO_CACHE Tells MySQL not to cache the result. Useful if the query occurs very seldom or if the result often change.
SQL_CALC_FOUND_ROWS Useful if you are using the LIMIT clause. Tells the server to calculate how many rows would have been returned if there were no LIMIT. You can retrieve that number with another query:
SELECT SQL_CALC_FOUND_ROWS * FROM `stats` LIMIT 10 OFFSET 100; SELECT FOUND_ROWS();
UNION and UNION All
(Compatible: Mysql 4+)
Following query will return all the records from both tables.
SELECT * FROM english UNION ALL SELECT * FROM hindi
UNION is the same as UNION DISTINCT.
If you type only UNION, then it is considered that you are asking for distinct records. If you want all records, you have to use UNION ALL.
SELECT word FROM word_table WHERE id = 1 UNION SELECT word FROM word_table WHERE id = 2 (SELECT magazine FROM pages) UNION DISTINCT (SELECT magazine FROM pdflog) ORDER BY magazine (SELECT ID_ENTRY FROM TABLE WHERE ID_AGE = 1) UNION DISTINCT (SELECT ID_ENTRY FROM TABLE WHERE ID_AGE=2)
The Most important aspect of SQL is its relational features. You can query, compare and calculate two different tables having entirely different structure. Joins and subselects are the two methods to join tables. Both methods of joining tables should give the same results. The natural join is faster on most SQL platforms.
In the following example a student is trying to learn what the numbers are called in hindi.
CREATE TABLE english (Tag INT, Inenglish VARCHAR(255)); CREATE TABLE hindi (Tag INT, Inhindi VARCHAR(255)); INSERT INTO english (Tag, Inenglish) VALUES (1, 'One'); INSERT INTO english (Tag, Inenglish) VALUES (2, 'Two'); INSERT INTO english (Tag, Inenglish) VALUES (3, 'Three'); INSERT INTO hindi (Tag, Inhindi) VALUES (2, 'Do'); INSERT INTO hindi (Tag, Inhindi) VALUES (3, 'Teen'); INSERT INTO hindi (Tag, Inhindi) VALUES (4, 'Char');
|select * from english||select * from hindi|
Cartesian join (CROSS JOIN)
A Cartesian join is when you join every row of one table to every row of another table.
SELECT * FROM english, hindi
It is also called Cross Join and may be written in this way:
SELECT * FROM english CROSS JOIN hindi
SELECT hindi.Tag, english.Inenglish, hindi.Inhindi FROM english, hindi WHERE english.Tag = hindi.Tag -- equal SELECT hindi.Tag, english.Inenglish, hindi.Inhindi FROM english INNER JOIN hindi ON english.Tag = hindi.Tag
You can also write the same query as
SELECT hindi.Tag, english.Inenglish, hindi.Inhindi FROM english INNER JOIN hindi ON english.Tag = hindi.Tag
Natural Joins using "using" (Compatible: MySQL 4+; but changed in MySQL 5) The following statement using "USING" method will display the same results.
SELECT hindi.tag, hindi.Inhindi, english.Inenglish FROM hindi NATURAL JOIN english USING (Tag)
LEFT JOIN / LEFT OUTER JOIN
The syntax is as follows:
SELECT field1, field2 FROM table1 LEFT JOIN table2 ON field1=field2 SELECT e.Inenglish AS English, e.Tag, '--no row--' AS Hindi FROM english AS e LEFT JOIN hindi AS h ON e.Tag=h.Tag WHERE h.Inhindi IS NULL
English tag Hindi One 1 --no row-
Right Outer Join
SELECT '--no row--' AS English, h.tag, h.Inhindi AS Hindi FROM english AS e RIGHT JOIN hindi AS h ON e.Tag=h.Tag WHERE e.Inenglish IS NULL
English tag Hindi --no row-- 4 Char
- Make sure that you have the same name and same data type in both tables.
- The keywords LEFT and RIGHT are not absolute, they only operate within the context of the given statement: we can reverse the order of the tables and reverse the keywords, and the result would be the same.
- If the type of join is not specified as inner or outer then it will be executed as an INNER JOIN.
Full Outer Join
As for v5.1, MySQL does not provide FULL OUTER JOIN. You may emulate it this way:
(SELECT a.*, b* FROM tab1 a LEFT JOIN tab2 b ON a.id = b.id) UNION (SELECT a.*, b* FROM tab1 a RIGHT JOIN tab2 b ON a.id = b.id)
It is possible to join more than just two tables:
SELECT ... FROM a JOIN (b JOIN c ON b.id=c.id) ON a.id=b.id
Here is an example from Savane:
mysql> SELECT group_type.type_id, group_type.name, COUNT(people_job.job_id) AS COUNT FROM group_type JOIN (groups JOIN people_job ON groups.group_id = people_job.group_id) ON group_type.type_id = groups.TYPE GROUP BY type_id ORDER BY type_id +---------+--------------------------------------+-------+ | type_id | name | COUNT | +---------+--------------------------------------+-------+ | 1 | Official GNU software | 148 | | 2 | non-GNU software AND documentation | 268 | | 3 | www.gnu.org portion | 4 | | 6 | www.gnu.org translation team | 5 | +---------+--------------------------------------+-------+ 4 ROWS IN SET (0.02 sec)
(Compatible: MySQL 4.1 and later)
- SQL subqueries let you use the results of one query as part of another query.
- Subqueries are often natural ways of writing a statement.
- Let you break a query into pieces and assemble it.
- Allow some queries that otherwise can't be constructed. Without using a subquery, you have to do it in two steps.
- Subqueries always appear as part of the WHERE (or HAVING) clause.
- Only one field can be in the subquery SELECT. It means Subquery can only produce a single column of data as its result.
- ORDER BY is not allowed; it would not make sense.
- Usually refer to name of a main table column in the subquery.
- This defines the current row of the main table for which the subquery is being run. This is called an outer reference.
For e.g. If RepOffice= OfficeNbr from Offices table, list the offices where the sales quota for the office exceeds the sum of individual salespersons' quotas
SELECT City FROM Offices WHERE Target > ???
??? is the sum of the quotas of the salespeople, i.e.
SELECT SUM(Quota) FROM SalesReps WHERE RepOffice = OfficeNbr
We combine these to get
SELECT City FROM Offices WHERE Target > (SELECT SUM(Quota) FROM SalesReps WHERE RepOffice = OfficeNbr)
Display all customers with orders or credit limits > $50,000. Use the DISTINCT word to list the customer just once.
SELECT DISTINCT CustNbr FROM Customers, Orders WHERE CustNbr = Cust AND (CreditLimit>50000 OR Amt>50000);