5th Aug 2021 9 minutes read How to Define a Window Frame in SQL Window Functions Ignacio L. Bisso sql learn sql window functions Table of Contents Using PARTITION BY to Define a Window Frame Ordering the Rows Within a Window Frame With ORDER BY Defining Window Frame Bounds With ROWS Defining Window Frame Bounds With RANGE Take Advantage of Window Functions! The window function is a super powerful resource of the SQL language. At the core of any window function, there is a set of records, called the window frame, defined using an OVER clause. Knowing what records are in the window frame, how they are ordered, and what their upper and lower bounds are, are critical in understanding how window functions work. In this article, we will analyze and explain with examples how you can define different types of window frames. Read on to take an important step in the growth of your SQL skills! Using PARTITION BY to Define a Window Frame SQL window functions perform calculations based on a set of records. For example, you might want to calculate the average salary of a specific group of employee records. This group of records is called the window frame, and its definition is central to understanding how window functions work and how we can take advantage of them. The window frame is a set of rows related to the current row where the window function is used for calculation. The window frame can be a different set of rows for the next row in the query result, since it depends on the current row being processed. Every row in the result set of the query has its own window frame. In the rest of this article, we will show example queries based on a database of a car dealership group. The group stores the sales information grouped by month in a table called monthly_car_sales. Below is the table with some sample data: monthly_car_sales yearmonthmakemodeltypequantityrevenue 202101FordF100PickUp402500000 202101FordMustangCar91010000 202101RenaultFuegoCar209000000 202102RenaultFuegoCar5023000000 202102FordF100PickUp201200000 202102FordMustangCar101050000 202103RenaultMeganeCar5020000000 202103RenaultKoleosCar151004000 202103FordMustangCar202080000 202104RenaultMeganeCar5020000000 202104RenaultKoleosCar151004000 202104FordMustangCar252520000 A simple way to create a window frame is by using an OVER clause with a PARTITION BY subclause. In the following SQL example, we generate a report of revenue by make of the car for the year 2021. SELECT make, SUM(revenue) OVER (PARTITION BY make) AS total_revenue FROM monthly_car_sales WHERE year = 2021 Below, the window frames generated by the previous query are shown in different colors (red for Ford and blue for Renault). All records with the same value in the make column (the rows as color-coded below) belong to the same window frame. Since we have only two different values in the make column, we have two window frames. yearmonthmakemodeltypequantityrevenue 202101FordF100PickUp402500000 202101FordMustangCar91010000 202101RenaultFuegoCar209000000 202102RenaultFuegoCar5023000000 202102FordF100PickUp201200000 202102FordMustangCar101050000 202103RenaultMeganeCar5020000000 202103RenaultKoleosCar151004000 202103FordMustangCar202080000 202104RenaultMeganeCar4015000000 202104RenaultKoleosCar201504000 202104FordMustangCar252520000 The result of the query is: maketotal_revenue Ford10360000 Renault69508000 I would like to suggest 2 articles where you can find a lot of introductory information about SQL window functions: “SQL Course of The Month - Window Functions” and “When Do I Use SQL Window Functions?”. Ordering the Rows Within a Window Frame With ORDER BY In addition to PARTITION BY, we can use an ORDER BY subclause to order the rows inside a window frame. Having the window frame ordered by some criteria allows us to use analytic window functions like LEAD(), LAG(), and FIRST_VALUE(), among others. For example, if we want to obtain the revenue difference between consecutive months, we can order the window frame by month. Then, given any current row, the LAG() window function can return any column of the previous month. Let’s see an example that obtains the revenue difference between each pair of consecutive months. SELECT make, model, month, revenue AS current_month_revenue, LAG(revenue) OVER ( ORDER BY month) AS previous_month_revenue, revenue - LAG (revenue) OVER (ORDER BY month) AS delta_revenue FROM monthly_car_sales WHERE year = 2021 AND model = 'Mustang' The result of the previous query is: makeModelMonthCurrent Month RevenuePrevious Month RevenueDelta Revenue FordMustang11010000NULLNULL FordMustang2105000010100004000 FordMustang320800001050000103000 FordMustang425200002080000440000 The first row in the result has NULL values in previous_month_revenue and delta_revenue columns. This is because there is no previous month for January. The column delta_revenue is calculated between the current month’s revenue and the previous month’s revenue, the latter of which is obtained with the LAG() window function. When we use ORDER BY in an OVER clause, there is a new element to consider: the bounds of the window frame. If we don’t specify any ORDER BY clause, the entire partition becomes the window frame. However, when we use an ORDER BY subclause, the current row becomes the upper bound of the window frame. In other words, the rows following the current row (based on the ORDER BY criteria) are not included in the window frame. In the next section, we will cover the concept of window frame bounds in detail. Defining Window Frame Bounds With ROWS A very interesting feature of the OVER clause is the ability to specify the upper and lower bounds of a window frame. These bounds can be specified by using one of the two subclauses in the OVER clause: ROWS or RANGE. In this section, we will explain how to use the ROWS subclause of the OVER clause. The window frame is a set of rows that are somehow related to the current row. Their bounds can be defined for each row in the query result with a ROWS subclause, which has the following syntax: ROWS BETWEEN lower_bound AND upper_bound As we already mentioned in the previous section, it is important to know what the default bounds of the window frame are. When we specify an ORDER BY subclause, the current row is the upper bound of the window frame by default. However, in some cases, we need to change this upper bound (or the lower bound) as we will see below. Let’s see an example where we need to specify the bounds of a window frame. Suppose we want a report with the total sales for the current month, the total sales for the previous month, and the maximum sales in any individual month throughout the year, all by make and model. The query to obtain such a report is: SELECT make, model, month, revenue AS current_month_revenue, LAG(revenue) OVER (PARTITION BY make, model ORDER BY month) AS prev_month, MAX(revenue) OVER (PARTITION BY make, model ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_year_revenue FROM monthly_car_sales WHERE year = 2021 makemodelmonthcurrent_month_revenueprev_monthmax_year_revenue FordF10012500000NULL2500000 FordF1002120000025000002500000 FordMustang11010000NULL2520000 FordMustang2105000010100002520000 FordMustang3208000010500002520000 FordMustang4252000020800002520000 RenaultFuego19000000NULL23000000 RenaultFuego223000000900000023000000 RenaultKoleos31004000NULL1504000 RenaultKoleos4150400010040001504000 RenaultMegane320000000NULL20000000 RenaultMegane4150000002000000020000000 Had we omitted the ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING subclause in the MAX() window function in the previous query, we would have obtained the maximum between the first month and the current month. This is wrong, since we want the maximum monthly revenue considering the entire year (including the months after the current month). So, we need to include all available months in the table. We do that by adding the subclause ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING to the OVER clause. The next image shows all the possible bounds we can specify to define the lower and the upper bounds of a window frame: The options for lower and upper bounds in the OVER clause are: UNBOUNDED PRECEDING N PRECEDING CURRENT ROW N FOLLOWING UNBOUNDED FOLLOWING We need to make sure the lower bound is less than the upper bound. Finally, I would like to suggest the article Why Learn SQL Window Functions in 2021? where you can learn how to use window functions. Defining Window Frame Bounds With RANGE In the previous section, we defined the bounds of a window frame in terms of ROWS. In this section, we will explain how to use the subclause RANGE to specify the bounds of a window frame in ranges of rows. The syntax of the RANGE subclause is the following: RANGE BETWEEN lower_bound AND upper_bound A range is a set of rows with the same value for the PARTITION BY criteria. For example, if we have a PARTITION BY month, we can see the difference in the next image when we use ROWS or RANGE to define a window frame: OVER ( PARTITION BY …... ORDER BY month ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING ) OVER ( PARTITION BY ….. ORDER BY month RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING ) If we want a revenue report by make for the current month and for each of the last three months, we can use the following query: SELECT make, model, month, revenue AS model_revenue_current_month, SUM(revenue) OVER ( PARTITION BY make ORDER BY month RANGE BETWEEN 0 PRECEDING AND CURRENT ROW ) AS make_current_month, SUM(revenue) OVER (PARTITION BY make ORDER BY month RANGE BETWEEN 1 PRECEDING AND CURRENT ROW ) AS make_last_2_months, SUM(revenue) OVER (PARTITION BY make ORDER BY month RANGE BETWEEN 2 PRECEDING AND CURRENT ROW ) AS make_last_3_months FROM monthly_car_sales WHERE year = 2021 ORDER BY 1,3,2 The previous query uses the RANGE subclause to specify a window frame with all the records of the current make for a range of N months. For example: SUM(revenue) OVER ( PARTITION BY make ORDER BY month RANGE BETWEEN 1 PRECEDING AND CURRENT ROW ) AS make_last_2_months The previous subclause RANGE BETWEEN 1 PRECEDING AND CURRENT ROW specifies a window frame that includes the preceding month and the current month. Then, the SUM() function will return the total revenue in the last two months. Similarly, we can use the following OVER clause to obtain the total revenue in the last three months. SUM(revenue) OVER ( PARTITION BY make ORDER BY month RANGE BETWEEN 2 PRECEDING AND CURRENT ROW ) AS make_last_3_months There are several abbreviations available to make the syntax easier for these bounding clauses: AbbreviationComplete Syntax UNBOUNDED PRECEDINGBETWEEN UNBOUNDED PRECEDING AND CURRENT ROW n PRECEDINGBETWEEN n PRECEDING AND CURRENT ROW CURRENT ROWBETWEEN CURRENT ROW AND CURRENT ROW n FOLLOWINGBETWEEN AND CURRENT ROW AND n FOLLOWING UNBOUNDED FOLLOWINGBETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING Last, but not least, here is my preferred cheat sheet on window functions with a lot of syntax details: SQL window functions cheat sheet. Take Advantage of Window Functions! A central idea in the window function is the window frame, that is, the group of records over which the window function works. In this article, we have explained that the window frame depends on the current row and is defined by the OVER clause. We have also shown several examples for defining which records to include in the window frame, ordering the rows within it, and defining its bounds. For those who want to go deeper, I suggest the Window Functions SQL online course, with plenty of examples using different window functions. Develop your skills and increase your assets!! Tags: sql learn sql window functions