SQL Subqueries: The Super Powerful SQL Feature
In this lesson you explore the Subqueries, the super powerful and flexible feature of the SQL
language. SQL Subqueries can be used in the SELECT
, INSERT
, UPDATE
, and DELETE
statements. This lesson focuses on using subqueries in the SELECT
statement.
The first point to address is where you put the subquery in the SELECT
statement. It can go in the WHERE
, FROM
, and HAVING
clauses, or even inside the SELECT
list of column. The most typical is the WHERE
clause.
SQL Subqueries in the WHERE Clause
From a syntax point of view, a SQL Subquery is a SELECT
statement enclosed in parenthesis, and like every SELECT
, it can return 0, 1, or multiple rows. The number of records returned by a subquery is very important. The operators you use to create a condition for a subquery varies depending on how many records the subquery returns. The subquery operator rules are shown in the table:
Number of records the subquery returns | Operator allowed |
---|---|
1 record | = , > , >=, <, <= , <> |
0, 1 or many records | IN, EXISTS |
Here is the example table. Suppose you work on a farm growing apples. The database contains a single table named apple with all the production history. The table has the following schema:
Year | Apple Variety | Number of Trees in Production | Tons Produced | Harvest day | Price per Ton | First summer storm |
---|---|---|---|---|---|---|
2020 | Red Delicious | 2000 | 102 | 06/23/2020 | 54.50 | 07/03/2020 |
2020 | Magic Green | 700 | 33 | 06/12/202062.6007/03/2020 | ||
2020 | Red Globus | 500 | 26 | 05/30/2020 | 71.50 | 07/03/2020 |
2019 | Red Delicious | 1800 | 87 | 07/15/2019 | 52.25 | 07/12/2019 |
2019 | Magic Green | 500 | 26 | 06/28/2019 | 59.40 | 07/12/2019 |
2019 | Red Globus | 500 | 27 | 05/28/2019 | 68.00 | 07/12/2019 |
2018 | Red Delicious | 1800 | 92 | 07/02/2018 | 56.75 | 06/03/2018 |
2018 | Red Globus | 500 | 24 | 05/30/2018 | 66. | 06/03/2018 |
2017 | Red Delicious | 1500 | 76.5 | 07/18/2017 | 51.45 | 07/30/2017 |
2016 | Red Delicious | 1500 | 72 | 06/26/2016 | 47.60 | 06/23/2016 |
Suppose you want to obtain the year and the tons_produced
for Red Delicious variety and the year that price of Red Delicious touches its maximum value in the history.
Here is the query and subquery in the WHERE clause:
SELECT year,
tons_produced
FROM apple
WHERE price_per_ton = ( SELECT MAX(price_per_ton)
FROM apple WHERE variety = ‘Red Delicious’
)
Using MAX()
constrains the subquery to return a single record, enabling you to use the = operator in the subquery comparison condition. In a case where a subquery returns 0 or multiple rows and an operator is used, the database returns an error.
Here is the result:
year | tons_produced |
---|---|
2018 | 92 |
Now you want to obtain the year
, tons produced
, number of trees
and the metric tons_per_tree
for those years when the tons_per_tree
was less than the average for Red Delicious. You can use a subquery to obtain the average of tons_per_tree
for Red Delicious, and then compare it with every Red Delicious record tons_per_tree
metric.
Here is the query and subquery in the WHERE clause:
SELECT year, number_of_trees, tons_produced,
tons_produced / number_of_trees AS tons_per_tree
FROM apple
WHERE (tons_produced / number_of_trees) < (
SELECT AVG(tons_produced / number_of_trees)
FROM apple
WHERE apple_variety = 'Red Delicious'
)
AND apple_variety = 'Red Delicious'
The results:
Year | number_of_trees | tons_produced | tons_per_tree |
---|---|---|---|
2019 | 1800 | 87 | 0.0483 |
2016 | 1500 | 72 | 0.0480 |
So far the subqueries have returned exactly one record. In the next example, the subquery returns several records. The goal is to obtain the same attributes as the previous Red Delicious query for the years where the average tons_per_dollar
is > 0.05.
Here is the query:
SELECT year, number_of_trees, tons_produced,
tons_produced / number_of_trees AS tons_per_tree
FROM apple
WHERE year IN (
SELECT year
FROM apple
WHERE apple_variety = 'Red Delicious'
GROUP BY year
HAVING AVG(tons_produced / number_of_trees) > 0.05
)
AND apple_variety = 'Red Delicious'
This subquery can return more than 1 record, so you use the IN
operator in the condition.
Here are the results:
Year | Number of Trees | Tons_produced | Tons_per_tree |
---|---|---|---|
2020 | 2000 | 102 | 0.051 |
2018 | 1800 | 92 | 0.051 |
2017 | 1500 | 76 | 0.051 |
Now you want to obtain the years where you did not produce the Red Globus variety. For this query, use the EXISTS
unary operator. A unary means the operator only needs one operand, which in this case, is the subquery. Here is the form for the EXISTS
condition:
WHERE EXISTS ( subquery )
The EXISTS
condition returns TRUE if the subquery returns any row or FALSE if the subquery returns 0 rows (empty set).
Here is the query to obtain the years where Red Globus was not produced:
SELECT DISTINCT year
FROM apple a1
WHERE NOT EXISTS (
SELECT year
FROM apple a2
WHERE a2.apple_variety = 'Red Globus'
AND a2.year = a1.year
)
Two items to note in this query:
- The
DISTINCT
clause is used to remove duplicate records from the result. - The condition a2.year = a1.year inside the subquery is comparing the year column of a record from the subquery against the year column of a record in the outer query. This is called a Correlated Subquery as it uses columns in the outer query in the
WHERE
clause of the subquery.
Here is the result:
Year |
---|
2016 |
2017 |
SQL Subqueries In the FROM Clause
This example uses a subquery in the FROM clause of the SELECT
to produce an temporary table called red_delicious_prod
.
Here is the query:
SELECT
red_delicious_prod.year,
red_delicious_prod.tons_produced,
red_delicious_prod.price_per_ton,
red_delicious_prod.tons_produced * red_delicious_prod.price_per_ton AS revenue
FROM (
SELECT year, tons_produced, price_per_ton
FROM apple
WHERE apple_variety = ‘Red Delicious’
) red_delicious_prod
The outer query references the temporary red_delicious_prod
table created on the fly during the query execution. When the query is complete, the red_delicious_prod
table is no longer available.
Closing Words
In this lesson you learned to use subqueries in the SELECT
statement. You used subqueries with different operators in the WHERE
clause and in the FROM
clause. There are other places in the SELECT
statement where you can put a subquery. For example you can put a subquery in the SELECT
list, or in the HAVING
clause. The next lesson covers details of the DISTINCT
clause. Keep going, learn SQL
, increase your skills!