SQL Correlated Subqueries Increase the Power of SQL
A SQL correlated subquery is a query which is executed one time for each record returned by the outer query. It is called correlated as it is a correlation between the number of times the subquery is executed with the number of records returned by the outer query (not the subquery).
Correlated subqueries are a different type of subquery. In some situations, you must use a correlated subquery to obtain the correct reply or result data set. This lesson covers how to identify the situations that require a correlated subquery. There are examples of correlated subqueries, and a discussion about overusing correlated subqueries.
Using SQL Correlated Subqueries
There are some data questions where the SQL to solve the question must include a correlated subquery. This is particularly true in queries where you are looking for what might be called negatives or negative data questions.
These examples use two tables:
payment_history table has a column called
payment_type that indicates if a payment is part of a regular salary, a bonus, or an award. Here is some sample data for the tables:
Here is an SQL query to answer a negative data question: List the employees who have never received a bonus.
SELECT last_name, first_name FROM employee e1 WHERE NOT EXISTS (SELECT ph.last_name FROM payment_history ph WHERE ph.employee_id = e1.employee_id AND ph.payment_type = ‘bonus’ )
Correlated vs Simple Subqueries
The main difference between correlated and simple subqueries is that correlated subqueries reference columns from the outer table. In the example, ph.employee_id = e1.employee_id is a reference to the outer subquery table. To identify a correlated query, just look for these kinds of references. If you find at least one, you have a correlated subquery!
The negative part of the data question is often solved in the correlated subquery by using a
NOT EXISTS operator in the WHERE clause.
EXISTS is an operator always followed by a subquery. If the subquery returns at least one record, then the
EXISTS evaluates to
TRUE. If the subquery returns empty, then the
EXISTS evaluates to
FALSE. Note we use
NOT EXISTS, which is the opposite to
What About Positive Data Questions?
Should you use a correlated subquery for a positive data question? No. You can if you want to, but a
JOIN condition or a relationship between two tables to answer data questions is a better approach for positive data questions.
Here is an SQL query to answer a positive data question: List the employees who have received a bonus.
SELECT first_name, last_name FROM employee e1, payment_history ph WHERE ph.employee_id = e1.employee_id AND ph.payment_type = ‘bonus’
The query uses an implicit
JOIN by putting both tables in the
FROM clause. The implicit
JOIN is equivalent to the query:
SELECT first_name, last_name FROM employee e1 INNER JOIN payment_history ph ON ph.employee_id = e1.employee_id WHERE ph.payment_type = ‘bonus’
The result is:
It is a best practice to solve data question with
JOIN instead of a correlated subquery whenever possible. You should only use correlated subqueries for negative data questions or other data questions where a correlated subquery is the only way to answer it.
Performance is the biggest reason to avoid correlated subqueries. They tend to be really slow. Avoid them when you can.
A Complex Example Using a Correlated Subquery
In this example, obtain the employee names who made a higher salary in March 2018 than their average monthly salary for all previous months.
SELECT first_name, last_name FROM employee e1, payment_history ph WHERE e1.employee_id = ph.employee_id AND amount_paid > = ( SELECT avg(amount_paid) FROM payment_histoty ph2 WHERE ph2.employee_id = e1.employee_id AND ph2.payment_date < ‘2018-03-01’ ) AND month(ph.payment_date) = 3 AND year(ph.payment_date) = 2018 AND ph.payment_type = ‘salary’
You can identify the reference to the outer subquery table marked in red, so you know this is a correlated subquery. The subquery executes once per record instead of just a single time like a simple query.
In this lesson you learned how to identify and use correlated subqueries, a special kind of subquery, which is used under specific circumstances. The next lesson covers Window Functions: very powerful functions that add new possibilities to the SQL language. Keep going, learn
SQL and increase your skills!
IN THIS PAGE