Top 104 SQL Interview Questions & Answers (2026)

Q1.

What is the difference between WHERE and HAVING clause?

In SQL, both the WHERE and HAVING clauses are used to filter data, but they operate at different stages of query processing and on different types of data. Understanding their distinct roles is crucial for writing efficient and correct SQL queries.

The WHERE Clause

The WHERE clause is used to filter individual rows based on specified conditions *before* any grouping occurs. It operates on rows retrieved from the FROM clause and can filter data based on columns that are not aggregated. It cannot directly contain aggregate functions.

sql

SELECT product_name, price
FROM products
WHERE price > 50;

The HAVING Clause

The HAVING clause is used to filter *groups* of rows based on specified conditions *after* the GROUP BY clause has been applied. It typically operates on the results of aggregate functions (like SUM, COUNT, AVG, MAX, MIN) and filters the groups that meet the criteria. If no GROUP BY clause is present, HAVING acts on the entire result set as a single group.

sql

SELECT department, COUNT(employee_id) AS total_employees
FROM employees
GROUP BY department
HAVING COUNT(employee_id) > 5;
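
The last point above, HAVING without GROUP BY, can be sketched as follows. The query reuses the employees table from the example and treats the whole table as one group; the threshold of 100 is an arbitrary illustrative value.

```sql
-- No GROUP BY: the entire table forms a single group, so the result
-- has one row if the condition holds and no rows otherwise.
SELECT COUNT(employee_id) AS total_employees
FROM employees
HAVING COUNT(employee_id) > 100;
```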

Key Differences and Comparison

Feature	WHERE Clause	HAVING Clause
Purpose	Filters individual rows	Filters groups of rows
Execution Stage	Before GROUP BY	After GROUP BY
Applicability	Works on individual rows/columns	Works on aggregate functions/groups
Aggregate Functions	Cannot use aggregate functions directly	Can and often does use aggregate functions
Data Filtering	Filters data before aggregation	Filters data after aggregation
Columns Used	Non-aggregated columns	Grouping columns (from GROUP BY) or aggregate functions

Summary

In essence, use WHERE to filter individual records before they are grouped, and use HAVING to filter the results of groups after aggregation. Combining both clauses allows for precise control over data filtering at different stages of a SQL query's execution.
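
As a sketch of combining the two clauses, the query below filters rows first and groups second. It assumes the employees table from the earlier example also has a salary column, and the threshold values are illustrative only.

```sql
-- WHERE removes low-salary rows before grouping;
-- HAVING then keeps only departments with more than 5 remaining employees.
SELECT department, COUNT(employee_id) AS well_paid_employees
FROM employees
WHERE salary > 50000
GROUP BY department
HAVING COUNT(employee_id) > 5;
```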

Q2.

Explain different types of JOINs in SQL.

SQL JOINs are fundamental operations used to combine rows from two or more tables based on a related column between them. They are essential for retrieving meaningful data from relational databases, allowing you to create a comprehensive view of scattered information.

Introduction to SQL JOINs

In relational databases, data is often distributed across multiple tables to ensure normalization and reduce redundancy. A JOIN clause is used to combine rows from two or more tables, based on a common field between them. The type of JOIN determines which rows are kept from each table when a match is found or not found.

Types of SQL JOINs

INNER JOIN

The INNER JOIN keyword returns only the rows for which the join condition is satisfied in both tables. Rows in either table that have no match in the other table are discarded.

sql

SELECT orders.order_id, customers.customer_name
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id;

LEFT JOIN (or LEFT OUTER JOIN)

The LEFT JOIN keyword returns all rows from the left table (table1), and the matching rows from the right table (table2). If there is no match in the right table, NULL is used for columns from the right table. It's often used when you want to see all entries from one table, and any related entries from another.

sql

SELECT customers.customer_name, orders.order_id
FROM customers
LEFT JOIN orders ON customers.customer_id = orders.customer_id;

RIGHT JOIN (or RIGHT OUTER JOIN)

The RIGHT JOIN keyword returns all rows from the right table (table2), and the matching rows from the left table (table1). If there is no match in the left table, NULL is used for columns from the left table. This is essentially the mirror image of a LEFT JOIN.

sql

SELECT employees.employee_name, departments.department_name
FROM employees
RIGHT JOIN departments ON employees.department_id = departments.department_id;

FULL OUTER JOIN (or FULL JOIN)

The FULL OUTER JOIN keyword returns all rows from both tables, matched where possible. It combines the results of LEFT and RIGHT outer joins: rows in either table that have no match in the other are still included, with NULL values for the columns of the table that lacked a match. Not every database supports it; MySQL, for example, does not.

sql

SELECT employees.employee_name, departments.department_name
FROM employees
FULL OUTER JOIN departments ON employees.department_id = departments.department_id;

CROSS JOIN

A CROSS JOIN produces a Cartesian product of the tables involved in the join. This means it combines each row from the first table with every row from the second table. If table A has N rows and table B has M rows, a CROSS JOIN will result in N * M rows. It does not require a join condition.

sql

SELECT products.product_name, colors.color_name
FROM products
CROSS JOIN colors;

SELF JOIN

A SELF JOIN is a regular join, but the table is joined with itself. It is used to combine rows with other rows in the same table. This is particularly useful for querying hierarchical data or comparing rows within the same table, often requiring table aliases to differentiate between the two instances of the table.

sql

SELECT e.employee_name AS Employee, m.employee_name AS Manager
FROM employees e
INNER JOIN employees m ON e.manager_id = m.employee_id;
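
The query above drops any employee whose manager_id is NULL (for example, the person at the top of the hierarchy), because the comparison never matches. A possible variant, sketched here with the same assumed columns, uses a LEFT JOIN so that those employees still appear, with a NULL manager.

```sql
-- Every employee is listed; Manager is NULL when no manager row matches.
SELECT e.employee_name AS Employee, m.employee_name AS Manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;
```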

Q3.

What is the difference between INNER JOIN and LEFT JOIN?

SQL JOIN clauses combine rows from two or more tables based on a related column between them. This answer focuses on the fundamental differences between INNER JOIN and LEFT JOIN and on when to use each.

SQL JOINs Overview

JOINs are a core concept in relational databases, enabling you to retrieve data from multiple tables simultaneously. They establish a relationship between tables based on common columns, typically primary and foreign keys. For our examples, consider two tables: 'Customers' and 'Orders'.

CustomerID	Name
1	Alice
2	Bob
3	Charlie

OrderID	CustomerID	Amount
101	1	150.00
102	2	25.00
103	1	75.00
104	4	50.00
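
For readers who want to run the examples, the sample data above could be created roughly as follows. The column types are assumptions, since the tables above show only values, and multi-row VALUES lists are widely but not universally supported. No foreign key is declared on Orders.CustomerID because order 104 refers to customer 4, who does not exist in Customers.

```sql
CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    Name       VARCHAR(50)
);

CREATE TABLE Orders (
    OrderID    INT PRIMARY KEY,
    CustomerID INT,             -- deliberately no FOREIGN KEY: order 104 has no matching customer
    Amount     DECIMAL(10, 2)
);

INSERT INTO Customers (CustomerID, Name) VALUES
    (1, 'Alice'), (2, 'Bob'), (3, 'Charlie');

INSERT INTO Orders (OrderID, CustomerID, Amount) VALUES
    (101, 1, 150.00), (102, 2, 25.00), (103, 1, 75.00), (104, 4, 50.00);
```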

INNER JOIN

An INNER JOIN returns only the rows that have matching values in both tables. If a record in one table does not have a matching record in the other table, it is excluded from the result set. It is the most common type of JOIN, and in standard SQL the JOIN keyword on its own means INNER JOIN.

sql

SELECT C.CustomerID, C.Name, O.OrderID, O.Amount
FROM Customers C
INNER JOIN Orders O ON C.CustomerID = O.CustomerID;

Result of the INNER JOIN on 'Customers' and 'Orders':

CustomerID	Name	OrderID	Amount
1	Alice	101	150.00
1	Alice	103	75.00
2	Bob	102	25.00

LEFT JOIN (or LEFT OUTER JOIN)

A LEFT JOIN (also known as LEFT OUTER JOIN) returns all rows from the left table, and the matching rows from the right table. If there is no match for a row in the left table, the columns from the right table will contain NULLs in the result set. It preserves all records from the 'left' table (the first table mentioned in the FROM clause).

sql

SELECT C.CustomerID, C.Name, O.OrderID, O.Amount
FROM Customers C
LEFT JOIN Orders O ON C.CustomerID = O.CustomerID;

Result of the LEFT JOIN on 'Customers' and 'Orders':

CustomerID	Name	OrderID	Amount
1	Alice	101	150.00
1	Alice	103	75.00
2	Bob	102	25.00
3	Charlie	NULL	NULL

Key Differences Summarized

Matching Rows: INNER JOIN returns only rows where a match exists in both tables. LEFT JOIN returns all rows from the left table, and matched rows from the right table.
Unmatched Rows: INNER JOIN excludes unmatched rows from either table. LEFT JOIN includes all rows from the left table, padding columns from the right table with 'NULL's where no match exists.
Result Size: An INNER JOIN can return anywhere from zero rows to more rows than either table, depending on how many matches each row finds (in a one-to-many relationship, a row is repeated once per match). A LEFT JOIN always returns at least as many rows as the left table.

When to Use Which?

Use INNER JOIN when:

You only care about records that have a direct relationship in both tables.
You want to find all customers who have placed at least one order.
You need to combine information only where there's a common data point across both datasets.

Use LEFT JOIN when:

You want to retrieve all records from one table (the 'left' table) regardless of whether they have a match in the second table.
You want to find all customers, and if they have orders, include order details (otherwise, show 'NULL's for order details).
You need to identify records in the left table that *do not* have a match in the right table (often achieved by adding WHERE right_table.id IS NULL).
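
The last point can be sketched with the Customers and Orders sample tables from earlier in this answer. Filtering on a NULL right-side key after a LEFT JOIN keeps only the left rows that found no match.

```sql
-- Customers with no orders at all
SELECT C.CustomerID, C.Name
FROM Customers C
LEFT JOIN Orders O ON C.CustomerID = O.CustomerID
WHERE O.OrderID IS NULL;
```

With the sample data, this returns only Charlie (CustomerID 3), the one customer without an order.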

Conclusion

Choosing between INNER JOIN and LEFT JOIN depends entirely on the specific data you need to retrieve. INNER JOIN is for intersecting data, while LEFT JOIN is for preserving all data from one table while optionally adding related data from another. Understanding their distinct behaviors is crucial for writing accurate and efficient SQL queries.

Q4.

What is GROUP BY clause?

The SQL GROUP BY clause groups rows that have the same values into summary rows, typically so that aggregate functions can be applied to each group. It is an essential tool for data analysis and reporting, allowing you to perform calculations on subsets of data rather than the entire dataset.

Understanding the GROUP BY Clause

The primary function of the GROUP BY clause is to arrange identical data into groups. When combined with aggregate functions like COUNT(), SUM(), AVG(), MAX(), and MIN(), it allows you to compute a single summary value for each group, making it invaluable for generating summarized reports and statistics.

Syntax

sql

SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1
ORDER BY column1;

Key Concepts

Aggregate Functions: GROUP BY is almost always used in conjunction with aggregate functions to perform calculations (e.g., sum, average, count) on each group.
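Non-aggregated Columns: Any column that appears in the SELECT list and is not part of an aggregate function must be included in the GROUP BY clause. Some databases relax this rule for columns that are functionally dependent on the grouping columns, such as other columns of a table grouped by its primary key.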
Filtering Groups: Use the HAVING clause to filter groups based on aggregate conditions, unlike WHERE which filters individual rows *before* grouping occurs.

Example

Consider a table named Orders with columns CustomerID, OrderDate, and Amount. To find the total amount spent by each customer, you would use GROUP BY as follows:

sql

SELECT CustomerID, SUM(Amount) AS TotalAmountSpent
FROM Orders
GROUP BY CustomerID;

This query would return a result set where each row represents a unique CustomerID, and the TotalAmountSpent column shows the sum of all Amount values for orders placed by that specific customer.
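
Building on this query, here is a sketch of how the other clauses fit around GROUP BY: WHERE filters rows first, HAVING filters the resulting groups, and ORDER BY sorts the final output. The cutoff date and the threshold of 100 are illustrative values only, and date literal syntax varies somewhat between databases.

```sql
SELECT CustomerID,
       COUNT(*)    AS OrderCount,
       SUM(Amount) AS TotalAmountSpent
FROM Orders
WHERE OrderDate >= '2025-01-01'   -- row filter, applied before grouping
GROUP BY CustomerID
HAVING SUM(Amount) > 100          -- group filter, applied after aggregation
ORDER BY TotalAmountSpent DESC;
```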

Common Uses and Best Practices

Sales Analysis: Grouping sales data by product category, region, or time period to identify trends.
User Activity: Counting user actions (e.g., logins, purchases) per user or per day.
Reporting: Generating summary reports, such as monthly sales summaries or departmental expense reports.
Performance: For very large datasets, ensure you have appropriate indexes on columns used in the GROUP BY clause to optimize query performance.

Q5.

What is the difference between UNION and UNION ALL?

In SQL, both `UNION` and `UNION ALL` operators are used to combine the result sets of two or more `SELECT` statements into a single result set. While their primary goal is similar, they differ significantly in how they handle duplicate rows and, consequently, their performance characteristics. Understanding these distinctions is crucial for efficient query writing and data manipulation.

Understanding UNION

The UNION operator combines the result sets of two or more SELECT statements and eliminates duplicate rows from the final result. For UNION to work, each SELECT statement must have the same number of columns, and the corresponding columns must have compatible data types. The column names in the final result set are usually taken from the first SELECT statement.

sql

SELECT column1, column2 FROM table1
UNION
SELECT column1, column2 FROM table2;

Because UNION performs a distinct operation to remove duplicates, it often involves an implicit sorting or hashing process, which can make it slower and more resource-intensive, especially when dealing with large datasets. It guarantees a unique set of rows in its output.

Understanding UNION ALL

The UNION ALL operator combines the result sets of two or more SELECT statements, but unlike UNION, it retains all duplicate rows. This means if a row exists in both result sets, or multiple times within a single result set, it will appear as many times in the final output. Like UNION, the SELECT statements must have the same number of columns with compatible data types.

sql

SELECT column1, column2 FROM table1
UNION ALL
SELECT column1, column2 FROM table2;

Since UNION ALL does not perform any distinct operation or duplicate checking, it is generally much faster and less resource-intensive than UNION. It simply appends the results of subsequent SELECT statements to the first one, making it the preferred choice when you know there are no duplicates or when preserving duplicates is desired.
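
A small self-contained sketch of the difference, using literal values instead of tables. It should run as written on databases that allow SELECT without FROM (such as PostgreSQL, MySQL, SQL Server, and SQLite); older Oracle versions would need FROM dual.

```sql
-- UNION removes the duplicate 1: the result has 2 rows (1 and 2)
SELECT 1 AS n
UNION
SELECT 1
UNION
SELECT 2;

-- UNION ALL keeps it: the result has 3 rows (1, 1, and 2)
SELECT 1 AS n
UNION ALL
SELECT 1
UNION ALL
SELECT 2;
```

Neither operator guarantees row order; add an ORDER BY to the combined query if the order matters.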

Key Differences Summarized

Feature	UNION	UNION ALL
Duplicate Rows	Removes duplicates	Includes duplicates
Performance	Slower (due to duplicate removal)	Faster (no duplicate removal)
Sorting	Often involves implicit sorting/hashing	Does not involve implicit sorting/hashing
Resource Usage	Higher (more processing)	Lower (less processing)

When to Use Which?

Use UNION when:

You need to combine results from multiple queries and want only unique rows in the final output.
You are intentionally filtering out duplicate data across your combined sets.
The overhead of duplicate removal is acceptable for data integrity.

Use UNION ALL when:

You need to combine results from multiple queries and want to retain all rows, including duplicates.
Performance is a critical concern, and you are certain there are no duplicates, or duplicates are desired.
You are simply appending all data from different sources without needing to de-duplicate.

Q6.

What are aggregate functions in SQL?

SQL aggregate functions perform calculations on a set of rows and return a single summary value. They are commonly used with the GROUP BY clause to summarize data for each group, but can also be used to summarize an entire table, providing powerful analytical capabilities.

What Are Aggregate Functions?

Aggregate functions operate on a collection of input values and return a single value summarizing those inputs. Unlike scalar functions, which operate on a single row, aggregate functions process multiple rows to produce one result. They are essential for analytical queries and reporting, allowing users to gain insights into their data by summarizing it.

Common Aggregate Functions

The most frequently used aggregate functions in SQL include COUNT, SUM, AVG, MIN, and MAX. Each serves a distinct purpose in data summarization.

COUNT(): Returns the number of rows or non-NULL values in a specified column. COUNT(*) counts all rows, while COUNT(column_name) counts non-NULL values.
SUM(): Calculates the sum of all values in a numeric column. It ignores NULL values.
AVG(): Computes the average (arithmetic mean) of all values in a numeric column. It also ignores NULL values.
MIN(): Returns the minimum value in a column. This can apply to numeric, string, or date/time data types.
MAX(): Returns the maximum value in a column. Similar to MIN(), it works with various data types.
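
The NULL handling described above can be seen in a small sketch. The scores data set is hypothetical and is built inline with a common table expression, so no table needs to exist; it assumes a database that supports CTEs and SELECT without FROM.

```sql
WITH scores AS (
    SELECT 10 AS score
    UNION ALL SELECT 20
    UNION ALL SELECT NULL
)
SELECT COUNT(*)     AS all_rows,         -- 3: counts every row
       COUNT(score) AS non_null_scores,  -- 2: the NULL is skipped
       SUM(score)   AS total,            -- 30
       AVG(score)   AS mean,             -- 15: computed as 30 / 2, not 30 / 3
       MIN(score)   AS lowest,           -- 10
       MAX(score)   AS highest           -- 20
FROM scores;
```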

Using with GROUP BY Clause

Aggregate functions are often used in conjunction with the GROUP BY clause to divide the rows into groups and perform the aggregation for each group. This allows you to get summary statistics per category, such as the average salary per department or the total sales per product.

sql

SELECT department, AVG(salary) AS average_salary
FROM employees
GROUP BY department;

Using with HAVING Clause

The HAVING clause is used to filter groups based on the results of an aggregate function. It is applied after the GROUP BY clause, whereas the WHERE clause filters individual rows before grouping. This distinction is crucial for filtering on aggregated data.

sql

SELECT department, COUNT(employee_id) AS num_employees
FROM employees
GROUP BY department
HAVING COUNT(employee_id) > 5;

The DISTINCT Keyword

The DISTINCT keyword can be used inside some aggregate functions (like COUNT, SUM, AVG) to operate only on unique values within the specified column, ignoring duplicates. This is particularly useful when you need to count unique occurrences or sum unique values.

sql

SELECT COUNT(DISTINCT city) AS unique_cities
FROM customers;

Q7.

What is a subquery?

A subquery, also known as an inner query or inner select, is a query nested inside another SQL query. It can be embedded within SELECT, INSERT, UPDATE, or DELETE statements, or even within another subquery. Subqueries are used to return data that will be used by the main query as a condition or for calculation.

What is a Subquery?

In SQL, a subquery is a query (SELECT statement) that is embedded inside another SQL query. Conceptually, a non-correlated inner query executes first, and its result is then used by the outer query; a correlated subquery is instead evaluated against each row of the outer query. This allows for more complex data retrieval and manipulation by using the results of one query as input for another.

Subqueries can be used in various clauses of the main query, including WHERE, HAVING, FROM, and SELECT. They are particularly useful for performing operations that require a temporary result set or for filtering data based on values derived from another table or calculation.

Types of Subqueries

Subqueries can be categorized based on the number of rows and columns they return.

Scalar Subquery

A scalar subquery returns a single row and a single column (a single value). It can be used anywhere a single value is expected, such as in the SELECT clause, WHERE clause, or as part of an expression.

sql

SELECT product_name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products);

Row Subquery

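A row subquery returns a single row but multiple columns. It is often used in the WHERE or HAVING clause where multiple column values need to be compared against a single row. Row comparisons of this kind are supported by PostgreSQL, MySQL, and Oracle, but not by SQL Server.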

sql

SELECT employee_id, first_name, last_name
FROM employees
WHERE (department_id, salary) = (SELECT department_id, MAX(salary) FROM employees WHERE department_id = 10 GROUP BY department_id);

Table Subquery

A table subquery returns multiple rows and multiple columns. It is typically used in the FROM clause as a derived table (inline view) or with operators like IN, EXISTS, or ALL/ANY in the WHERE clause.

sql

SELECT c.customer_name, o.order_date
FROM customers c
JOIN (SELECT customer_id, MAX(order_date) AS order_date FROM orders GROUP BY customer_id) o
ON c.customer_id = o.customer_id;
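
The operator forms mentioned above can be sketched as well. The first query reuses the customers and orders tables; the second assumes the products table has a category column, which is hypothetical. Note that ALL and ANY are not available in every database (SQLite, for example, lacks them).

```sql
-- IN: customers who have placed at least one order
SELECT customer_name
FROM customers
WHERE customer_id IN (SELECT customer_id FROM orders);

-- ALL: products priced above every product in the 'Budget' category
SELECT product_name, price
FROM products
WHERE price > ALL (SELECT price FROM products WHERE category = 'Budget');
```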

Key Characteristics and Rules

Subqueries must be enclosed in parentheses.
A correlated subquery is conceptually evaluated once for each row processed by the outer query, whereas a non-correlated subquery can be evaluated once and its result reused.
Subqueries can return single values, single rows, or multiple rows and columns.
They can be used with comparison operators (e.g., =, <, >), set operators (e.g., IN, NOT IN, EXISTS), and quantifiers (e.g., ALL, ANY).
Some databases restrict ORDER BY inside a subquery; SQL Server, for example, allows it only together with TOP or OFFSET/FETCH.

Advantages of Subqueries

Improve readability and organization of complex queries.
Allow for structured queries where the output of one query is used as input for another.
Provide an alternative to complex joins for certain types of queries.
Easier to maintain and understand compared to very complex single queries.

Disadvantages of Subqueries

Can be less efficient than joins in some scenarios, especially with large datasets.
Poor performance if not optimized, particularly for correlated subqueries.
Debugging can be more challenging due to the nested nature of the queries.
Lack of clarity in some complex nested structures if not carefully written.

Q8.

What is correlated subquery?

A correlated subquery is a subquery that depends on the outer query for its values and executes once for each row processed by the outer query. Unlike a regular subquery, it cannot be executed independently.

Definition

In a correlated subquery, the inner query references one or more columns from the table in the outer query. Because of this dependency, the subquery is re-evaluated for every row returned by the outer query, making its execution intertwined with the outer query's processing.

How it Works

The outer query starts processing its rows.
For each row selected by the outer query, the correlated subquery is executed.
The subquery uses a value from the current row of the outer query in its WHERE clause or other conditions.
The result of the subquery is then used by the outer query to filter or select the current row.

Example

Suppose you want to find all employees whose salary is greater than the average salary of their respective department.

sql

SELECT E1.employee_name, E1.salary, E1.department_id
FROM Employees E1
WHERE E1.salary > (
    SELECT AVG(E2.salary)
    FROM Employees E2
    WHERE E2.department_id = E1.department_id
);

In this example, the subquery (SELECT AVG(E2.salary) FROM Employees E2 WHERE E2.department_id = E1.department_id) is correlated because it refers to E1.department_id from the outer query (aliased as E1). For each employee (E1) the outer query considers, the inner query calculates the average salary for *that specific employee's department*.

Characteristics

Row-by-Row Execution: Conceptually evaluated once for each row processed by the outer query, which can hurt performance on large datasets, although many optimizers rewrite such subqueries as joins internally.
Dependency: Explicitly references columns from the outer query.
Versatility: Useful for complex row-level comparisons that are difficult to express with simple joins.
Keywords: Often used with EXISTS, NOT EXISTS, comparison operators (=, >, <), or aggregate functions in the subquery's SELECT clause.

When to Use

To find records that have a specific relationship with other records within the same table (self-referencing logic).
When an aggregate function needs to be computed for each group defined by the outer query's current row.
For existence checks (e.g., finding customers who have placed at least one order).
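
The existence check in the last point might look like the sketch below. It assumes customers and orders tables that share a customer_id column.

```sql
-- Customers who have placed at least one order
SELECT c.customer_id, c.customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id   -- correlation: refers to the current outer row
);
```

EXISTS only tests whether the subquery returns any row, so the select list inside it (here the constant 1) is irrelevant.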

Alternatives

JOINs with Derived Tables/CTEs: Often, correlated subqueries can be rewritten using JOIN operations combined with Common Table Expressions (CTEs) or derived tables to pre-calculate values, which can be more efficient.
Non-Correlated Subqueries: For simpler cases where the inner query's result is independent of the outer query, a non-correlated subquery is more appropriate and generally faster.
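
As a sketch of the first alternative, the earlier query for employees earning more than their department's average can be rewritten with a derived table that computes each department's average once. The alias D and the column name avg_salary are illustrative choices.

```sql
SELECT E1.employee_name, E1.salary, E1.department_id
FROM Employees E1
JOIN (
    -- one row per department, computed a single time
    SELECT department_id, AVG(salary) AS avg_salary
    FROM Employees
    GROUP BY department_id
) D ON D.department_id = E1.department_id
WHERE E1.salary > D.avg_salary;
```

Whether this form is actually faster depends on the database's optimizer, so it is worth comparing execution plans for both versions.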

Q9.

What is the difference between DELETE and TRUNCATE?

In SQL, both the DELETE and TRUNCATE commands are used to remove data from tables. However, they operate very differently in terms of how they remove data, their performance, logging, and transactional behavior.

Overview

DELETE is a DML (Data Manipulation Language) command that removes rows one by one, allowing for conditional deletion and transaction logging. TRUNCATE is a DDL (Data Definition Language) command that deallocates data pages, making it faster for removing all rows from a table.

DELETE Statement

The DELETE statement is used to remove one or more rows from a table. It can include a WHERE clause to specify which rows to delete. If no WHERE clause is provided, all rows are deleted. DELETE operations are logged row by row, so they can be rolled back, and they fire ON DELETE triggers.

sql

DELETE FROM Employees WHERE DepartmentID = 10;
DELETE FROM Products;

DML (Data Manipulation Language) command.
Removes rows one by one.
Allows WHERE clause for conditional deletion.
Logged row by row, so it can be rolled back inside a transaction.
Fires ON DELETE triggers.
Does not reset AUTO_INCREMENT or IDENTITY counters in most databases, even when every row is deleted.
Slower for large tables compared to TRUNCATE.
Requires DELETE privilege.
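
Because DELETE is logged row by row, it can be undone inside an explicit transaction. A minimal sketch follows; the statement that opens a transaction varies by database (START TRANSACTION in standard SQL, MySQL, and PostgreSQL; BEGIN TRANSACTION in SQL Server).

```sql
START TRANSACTION;

DELETE FROM Employees WHERE DepartmentID = 10;

-- Inside the transaction the rows are already gone
SELECT COUNT(*) FROM Employees WHERE DepartmentID = 10;   -- returns 0

ROLLBACK;   -- the deleted rows are restored
```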

TRUNCATE Statement

The TRUNCATE statement is used to remove all rows from a table quickly and efficiently. It works by deallocating the data pages used by the table and logging only the deallocation of pages, rather than individual row deletions. This makes it much faster than DELETE for large tables, but it does not fire ON DELETE triggers, and in some databases (such as MySQL and Oracle) it cannot be rolled back.

sql

TRUNCATE TABLE Employees;

DDL (Data Definition Language) command.
Removes all rows by deallocating data pages.
Does not allow WHERE clause.
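Cannot be rolled back in MySQL or Oracle (implicit COMMIT); can be rolled back inside a transaction in SQL Server and PostgreSQL.
Does not fire ON DELETE triggers.
Resets AUTO_INCREMENT or IDENTITY columns in MySQL and SQL Server; PostgreSQL does so only with RESTART IDENTITY.
Faster for large tables.
Required privilege varies: DROP in MySQL, ALTER on the table in SQL Server, TRUNCATE in PostgreSQL.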

Key Differences

Feature	DELETE	TRUNCATE
Command Type	DML	DDL
Row-by-row deletion	Yes	No (deallocates pages)
`WHERE` clause	Yes	No
Rollback	Yes	Database-dependent (no in MySQL and Oracle; yes in SQL Server and PostgreSQL)
Triggers	Fires `ON DELETE` triggers	Does not fire `ON DELETE` triggers
Auto-Increment Reset	No (counter continues)	Usually yes (database-dependent)
Performance	Slower for large tables	Faster for large tables
Logging	Logs each row deletion	Logs page deallocation
Privilege	`DELETE` privilege	Database-dependent (for example, `DROP` in MySQL, `ALTER` in SQL Server)

Q10.

What is the difference between DROP and TRUNCATE?

The DROP and TRUNCATE commands are both used in SQL to remove data or objects, but they operate at different levels and have distinct implications. Understanding their differences is crucial for effective database management.

DROP Command

The DROP command is a Data Definition Language (DDL) statement used to remove an entire schema object from the database. This includes tables, indexes, views, stored procedures, functions, and more. When you DROP a table, its entire definition (structure), all data within it, and any associated objects like indexes, constraints, and triggers are permanently removed.

Removes the table definition and all data.
Frees up the space occupied by the table and its associated objects.
Cannot be rolled back in databases where DDL commits implicitly (such as MySQL and Oracle); can be rolled back inside a transaction in PostgreSQL and SQL Server.
Implicitly commits the current transaction in MySQL and Oracle.
Removes all related indexes, constraints, and triggers.

sql

DROP TABLE Customers;

TRUNCATE Command

The TRUNCATE command is also a DDL statement used to quickly remove all rows from a table. Unlike DELETE, TRUNCATE deallocates the data pages used by the table, making it very fast and efficient for large tables. However, it preserves the table's structure, including its columns, data types, and associated indexes and constraints. Identity columns (auto-incrementing) are typically reset to their seed value.

Removes all rows from a table, but keeps the table structure intact.
It's a DDL command, not DML, despite affecting data.
Faster than DELETE for large tables because it deallocates data pages.
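Cannot be rolled back in MySQL or Oracle; can be rolled back inside a transaction in SQL Server and PostgreSQL.
Resets identity columns to their starting value in MySQL and SQL Server; PostgreSQL requires RESTART IDENTITY.
Does not fire row-level DELETE triggers defined on the table.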

sql

TRUNCATE TABLE Products;

Key Differences

Feature	DROP	TRUNCATE
Purpose	Removes the entire table definition and all data	Removes all rows from a table; table structure remains
Type	DDL (Data Definition Language)	DDL (Data Definition Language)
Rollback	Database-dependent (not possible in MySQL or Oracle)	Database-dependent (not possible in MySQL or Oracle)
Speed	Fast; removes metadata and associated objects as well as data	Fast; deallocates data pages instead of deleting row by row
Space Reclamation	Frees up space for the table and all its associated objects	Frees up space for data, but not the table definition
Indexes/Constraints	Removes all associated indexes and constraints	Keeps all associated indexes and constraints
Triggers	Not applicable (the object is gone)	Does not fire any DML triggers
Identity Column	Removed (along with the table)	Typically reset to its seed value (database-dependent)
Logging	Minimal logging (metadata changes)	Minimal logging (deallocates pages; less than DELETE)

When to Use?

Use DROP when you want to permanently remove a table and its entire definition from the database. Use TRUNCATE when you need to quickly clear all data from a table while keeping its structure, indexes, and constraints intact, typically for reloading or resetting data.
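
The practical difference can be sketched as follows. The column names for Products are taken from earlier examples, and the inserted values are hypothetical; the INSERT assumes any other columns of Products are nullable or have defaults.

```sql
-- TRUNCATE: the table is emptied but still exists and can be reloaded
TRUNCATE TABLE Products;
INSERT INTO Products (product_name, price) VALUES ('Widget', 9.99);

-- DROP: the table definition itself is removed
DROP TABLE Customers;
-- Any later statement that references Customers fails
-- until the table is recreated with CREATE TABLE.
```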
