What is a subquery?
A subquery, also known as an inner query or inner select, is a query nested inside another SQL query. It can be embedded within SELECT, INSERT, UPDATE, or DELETE statements, or even within another subquery. Subqueries are used to return data that will be used by the main query as a condition or for calculation.
What is a Subquery?
In SQL, a subquery is a query (SELECT statement) that is embedded inside another SQL query. The inner query executes first, and its result is then used by the outer query. This allows for more complex data retrieval and manipulation by using the results of one query as input for another.
Subqueries can be used in various clauses of the main query, including WHERE, HAVING, FROM, and SELECT. They are particularly useful for performing operations that require a temporary result set or for filtering data based on values derived from another table or calculation.
Types of Subqueries
Subqueries can be categorized based on the number of rows and columns they return.
Scalar Subquery
A scalar subquery returns a single row and a single column (a single value). It can be used anywhere a single value is expected, such as in the SELECT clause, WHERE clause, or as part of an expression.
SELECT product_name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products);
Row Subquery
A row subquery returns a single row but multiple columns. It is often used in the WHERE or HAVING clause where multiple column values need to be compared against a single row.
SELECT employee_id, first_name, last_name
FROM employees
WHERE (department_id, salary) = (SELECT department_id, MAX(salary) FROM employees GROUP BY department_id HAVING department_id = 10);
Table Subquery
A table subquery returns multiple rows and multiple columns. It is typically used in the FROM clause as a derived table (inline view) or with operators like IN, EXISTS, or ALL/ANY in the WHERE clause.
SELECT c.customer_name, o.order_date
FROM customers c
JOIN (SELECT customer_id, MAX(order_date) AS order_date FROM orders GROUP BY customer_id) o
ON c.customer_id = o.customer_id;
Key Characteristics and Rules
- Subqueries must be enclosed in parentheses.
- An outer query can execute a subquery once for each row processed by the outer query (correlated subquery) or execute once and cache the result (non-correlated subquery).
- Subqueries can return single values, single rows, or multiple rows and columns.
- They can be used with comparison operators (e.g., =, <, >), set operators (e.g., IN, NOT IN, EXISTS), and quantifiers (e.g., ALL, ANY).
- The ORDER BY clause cannot be used directly in a subquery, except when TOP or ROWNUM is specified.
Advantages of Subqueries
- Improve readability and organization of complex queries.
- Allow for structured queries where the output of one query is used as input for another.
- Provide an alternative to complex joins for certain types of queries.
- Easier to maintain and understand compared to very complex single queries.
Disadvantages of Subqueries
- Can be less efficient than joins in some scenarios, especially with large datasets.
- Poor performance if not optimized, particularly for correlated subqueries.
- Debugging can be more challenging due to the nested nature of the queries.
- Lack of clarity in some complex nested structures if not carefully written.