What are window functions in SQL?
SQL window functions perform calculations across a set of table rows that are related to the current row. Unlike aggregate functions, window functions do not group rows into a single output row; instead, they return a single value for each row in the result set, allowing you to see both individual rows and aggregated or ranked values simultaneously.
What Are Window Functions?
Window functions operate on a 'window' of rows defined by the OVER() clause. The window is the set of rows related to the current row, and the function computes one value for the current row from the rows in that window. Regular aggregate functions (such as SUM, AVG, COUNT) used with GROUP BY collapse each group into a single output row; the same functions used with OVER() leave every row in place and attach the computed value to it.
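The difference between collapsing and non-collapsing aggregation can be seen in a minimal sketch. It assumes an in-memory SQLite database driven from Python's built-in sqlite3 module (SQLite 3.25 or later is needed for window functions); the products table and its rows are hypothetical sample data, not part of the original question.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, category TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Laptop", "Electronics", 500),
        ("Phone", "Electronics", 300),
        ("Desk", "Furniture", 200),
    ],
)

# Aggregate with GROUP BY: rows collapse to one output row per category.
grouped = conn.execute(
    "SELECT category, SUM(sales) FROM products GROUP BY category ORDER BY category"
).fetchall()

# Same aggregate as a window function: every input row is kept,
# and each row carries the total computed over the whole result set.
windowed = conn.execute(
    "SELECT product_name, sales, SUM(sales) OVER () AS total_sales "
    "FROM products ORDER BY product_name"
).fetchall()

print(grouped)   # one row per category
print(windowed)  # one row per product, each with the overall total
```

The GROUP BY query returns two rows for three products, while the OVER () query returns all three products with the same total attached to each.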
Components of the OVER() Clause
The OVER() clause defines the window on which the function operates. It has three optional components:
PARTITION BY Clause
Divides the query result set into partitions (groups) to which the window function is applied independently. If omitted, the entire result set is treated as a single partition.
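As a rough illustration of partitioning, the sketch below computes a per-category total and a grand total side by side. It assumes SQLite (3.25 or later) through Python's sqlite3 module, and the products table and its rows are made up for the example.

```python
import sqlite3

# Hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, category TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Laptop", "Electronics", 500),
        ("Phone", "Electronics", 300),
        ("Desk", "Furniture", 200),
        ("Chair", "Furniture", 100),
    ],
)

rows = conn.execute(
    """
    SELECT product_name, category, sales,
           -- one partition per category: the sum restarts for each category
           SUM(sales) OVER (PARTITION BY category) AS category_total,
           -- no PARTITION BY: the whole result set is a single partition
           SUM(sales) OVER ()                      AS grand_total
    FROM products
    ORDER BY category, product_name
    """
).fetchall()

for row in rows:
    print(row)
```

Each row keeps its own sales figure while also showing the total for its category and the total across all categories.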
ORDER BY Clause
Specifies the logical order of rows within each partition. This is crucial for functions that depend on order, like RANK(), ROW_NUMBER(), LAG(), and LEAD().
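To show why ordering matters, here is a small sketch using LAG() and LEAD(), again assuming SQLite 3.25 or later via Python's sqlite3 module. The daily_sales table, its columns, and its values are hypothetical; the rows are inserted out of order on purpose so that only the ORDER BY inside OVER() determines which row counts as "previous" or "next".

```python
import sqlite3

# Hypothetical table of daily amounts, inserted deliberately out of order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [("2024-01-03", 120), ("2024-01-01", 100), ("2024-01-02", 150)],
)

rows = conn.execute(
    """
    SELECT day, amount,
           ROW_NUMBER() OVER (ORDER BY day) AS day_no,
           LAG(amount)  OVER (ORDER BY day) AS previous_amount,
           LEAD(amount) OVER (ORDER BY day) AS next_amount
    FROM daily_sales
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)
```

The first day has no previous row and the last day has no next row, so LAG() and LEAD() return NULL there (None in Python).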
Window Frame
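Further refines the set of rows within a partition that are considered for the current row's calculation. The frame is defined with ROWS or RANGE, usually combined with PRECEDING, FOLLOWING, and BETWEEN. For example, ROWS BETWEEN 2 PRECEDING AND CURRENT ROW limits the calculation to the current row and the two rows before it. If ORDER BY is present and no frame is given, the default frame in standard SQL is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; without ORDER BY, the frame is the whole partition.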
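A frame can be sketched with a three-row moving average and a centered sum. The example assumes SQLite 3.25 or later through Python's sqlite3 module, and the daily_sales table with its values is hypothetical.

```python
import sqlite3

# Hypothetical daily amounts chosen so the arithmetic is easy to follow.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [("2024-01-01", 10), ("2024-01-02", 20), ("2024-01-03", 30), ("2024-01-04", 40)],
)

rows = conn.execute(
    """
    SELECT day, amount,
           -- current row plus up to two rows before it
           AVG(amount) OVER (
               ORDER BY day ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg,
           -- one row before, the current row, and one row after
           SUM(amount) OVER (
               ORDER BY day ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
           ) AS centered_sum
    FROM daily_sales
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)
```

Near the edges of the partition the frame simply contains fewer rows, which is why the first moving average is computed from a single value.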
Common Window Functions
- Ranking Functions: ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(n)
- Value Functions: LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE()
- Aggregate Functions: SUM(), AVG(), COUNT(), MIN(), MAX() (when used with an OVER() clause)
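The ranking functions differ mainly in how they treat ties. The sketch below compares them on a hypothetical scores table, assuming SQLite 3.25 or later via Python's sqlite3 module. A tiebreaker column is added to the ORDER BY of ROW_NUMBER() and NTILE() because their output for tied rows is otherwise not deterministic.

```python
import sqlite3

# Hypothetical scores with a tie at the top.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ann", 90), ("Bob", 90), ("Cy", 80), ("Dee", 70)],
)

rows = conn.execute(
    """
    SELECT player, score,
           ROW_NUMBER() OVER (ORDER BY score DESC, player) AS row_num,     -- always unique
           RANK()       OVER (ORDER BY score DESC)         AS rnk,         -- ties share a rank, gaps follow
           DENSE_RANK() OVER (ORDER BY score DESC)         AS dense_rnk,   -- ties share a rank, no gaps
           NTILE(2)     OVER (ORDER BY score DESC, player) AS half         -- splits rows into 2 buckets
    FROM scores
    ORDER BY score DESC, player
    """
).fetchall()

for row in rows:
    print(row)
```

With Ann and Bob tied, RANK() yields 1, 1, 3, 4 while DENSE_RANK() yields 1, 1, 2, 3, and ROW_NUMBER() numbers the rows 1 through 4 regardless of the tie.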
Example: Ranking Products by Sales within Categories
Consider a table of products with their categories and sales figures. We want to rank products by sales within each category.
SELECT
    product_name,
    category,
    sales,
    RANK() OVER (PARTITION BY category ORDER BY sales DESC) AS rank_within_category
FROM
    products;
In this example, PARTITION BY category divides the products into a separate group for each category, and ORDER BY sales DESC orders the products in each group from highest to lowest sales. RANK() then numbers the products within each category, restarting at 1 for every category. Tied products receive the same rank, and the following rank is skipped: two products tied at rank 1 are followed by rank 3, not rank 2.
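To see the query's behavior on concrete rows, the following sketch runs it against an in-memory SQLite database (3.25 or later) from Python's sqlite3 module. The sample rows are invented for illustration, and an outer ORDER BY has been added only to make the printed output deterministic.

```python
import sqlite3

# Hypothetical rows, including a tie inside the Electronics category.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, category TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Laptop", "Electronics", 500),
        ("Phone", "Electronics", 500),
        ("Tablet", "Electronics", 300),
        ("Desk", "Furniture", 200),
        ("Chair", "Furniture", 100),
    ],
)

ranked = conn.execute(
    """
    SELECT
        product_name,
        category,
        sales,
        RANK() OVER (PARTITION BY category ORDER BY sales DESC) AS rank_within_category
    FROM
        products
    ORDER BY category, rank_within_category, product_name
    """
).fetchall()

for row in ranked:
    print(row)
```

Laptop and Phone share rank 1, Tablet gets rank 3, and the ranking starts again at 1 for the Furniture category.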
Benefits of Using Window Functions
- Simplifies complex analytical queries, often replacing self-joins or correlated subqueries.
- Enables calculations like running totals, moving averages, and multi-row comparisons.
- Improves readability and maintainability of SQL code.
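- Can offer performance advantages over equivalent self-joins or correlated subqueries, since the database can often compute the result in a single pass over the data instead of re-reading the table for each row.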
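As one illustration of replacing a correlated subquery, the sketch below computes a running total both ways and checks that the results match. It assumes SQLite 3.25 or later through Python's sqlite3 module; the daily_sales table and its values are hypothetical, and the subquery version relies on each day appearing only once.

```python
import sqlite3

# Hypothetical daily amounts, one row per day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [("2024-01-01", 100), ("2024-01-02", 150), ("2024-01-03", 120)],
)

# Correlated subquery: the inner SELECT is evaluated for every outer row.
with_subquery = conn.execute(
    """
    SELECT d.day, d.amount,
           (SELECT SUM(p.amount) FROM daily_sales p WHERE p.day <= d.day) AS running_total
    FROM daily_sales d
    ORDER BY d.day
    """
).fetchall()

# Window function: the same running total expressed declaratively.
with_window = conn.execute(
    """
    SELECT day, amount,
           SUM(amount) OVER (
               ORDER BY day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM daily_sales
    ORDER BY day
    """
).fetchall()

print(with_window)
print(with_subquery == with_window)
```

Both queries return the same rows here; the window version states the intent in one expression and avoids repeating the table reference.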