26th Nov 2020 6 minutes read Aggregate Functions vs Window Functions: A Comparison Kateryna Koidan sql learn sql Window Functions Aggregate Functions Table of Contents What Are SQL Aggregate Functions? What Are SQL Window Functions? Similarities and Differences Between Window and Aggregate Functions Let’s Practice Window Functions! If you aren’t familiar with SQL’s window functions, you may wonder how they differ from aggregate functions. When should you use window functions? In this article, we’ll review window functions and aggregate functions, examine their similarities and differences, and see which one to choose depending on what you need to do. After you’ve tackled basic SQL, you’ll probably want to get into some of its more advanced functions. That’s great; these functions make reporting and analysis easier. Pretty soon, though, you’ll run into two mysterious groups of functions: window functions and aggregate functions. What do they do? How are they different? You’re about to find out. What Are SQL Aggregate Functions? Aggregate functions operate on a set of values to return a single scalar value. These are SQL aggregate functions: AVG() returns the average of the specified values. SUM() calculates the sum of all values in the set. MAX() and MIN() return the maximum and minimum value, respectively. COUNT() returns the total number of values in the set. By using the GROUP BY clause, you can calculate an aggregate value for several groups in one query. For example, let’s say we have transaction data from two cities, San Francisco and New York: iddatecityamount 12020-11-01San Francisco420.65 22020-11-01New York1129.85 32020-11-02San Francisco2213.25 42020-11-02New York499.00 52020-11-02New York980.30 62020-11-03San Francisco872.60 72020-11-03San Francisco3452.25 82020-11-03New York563.35 92020-11-04New York1843.10 102020-11-04San Francisco1705.00 You can use SQL aggregate functions to calculate the average daily transaction amount for each city. You’ll need to group the data by both date and city: SELECT date, city, AVG(amount) AS avg_transaction_amount_for_city FROM transactions GROUP BY date, city; Here is the result of this query: datecityavg_transaction_amount_for_city 2020-11-01New York1129.85 2020-11-02New York739.65 2020-11-03New York563.35 2020-11-04New York1843.1 2020-11-01San Francisco420.65 2020-11-02San Francisco2213.25 2020-11-03San Francisco2162.425 2020-11-04San Francisco1705 By using the aggregate function AVG() and GROUP BY, we get results that are grouped by date and city. We had two transactions in New York on November 2nd and two transactions in San Francisco on November 3rd, but the result set doesn’t include these individual transactions; aggregate functions collapse the individual rows and present the aggregate (here, average) value for all the rows in the group. What Are SQL Window Functions? In SQL, window functions operate on a set of rows called a window frame. They return a single value for each row from the underlying query. The window frame (or simply window) is defined using the OVER() clause. This clause also allows defining a window based on a specific column (similar to GROUP BY). To calculate the returned values, window functions may use aggregate functions, but they will use them with the OVER() clause. Let’s go back to our data for San Francisco and New York. Here’s the table again: iddatecityamount 12020-11-01San Francisco420.65 22020-11-01New York1129.85 32020-11-02San Francisco2213.25 42020-11-02New York499.00 52020-11-02New York980.30 62020-11-03San Francisco872.60 72020-11-03San Francisco3452.25 82020-11-03New York563.35 92020-11-04New York1843.10 102020-11-04San Francisco1705.00 We want to add another column to this table with the average daily transaction value for each city. The following SQL query uses a window function to get the result we need: SELECT id, date, city, amount, AVG(amount) OVER (PARTITION BY date, city) AS avg_daily_transaction_amount_for_city FROM transactions ORDER BY id; Here is the result: iddatecityamountavg_daily_transaction_amount_for_city 12020-11-01San Francisco420.65420.65 22020-11-01New York1129.851129.85 32020-11-02San Francisco2213.252213.25 42020-11-02New York499.00739.65 52020-11-02New York980.30739.65 62020-11-03San Francisco872.602162.425 72020-11-03San Francisco3452.252162.425 82020-11-03New York563.35563.35 92020-11-04New York1843.101843.1 102020-11-04San Francisco1705.001705 Note that the rows are not collapsed; we still have one row for each of our transactions. All of the calculated averages are presented in the avg_daily_transaction_amount_for_city column. You can learn more about window functions in this in-depth guide. It provides several examples, including simple and more advanced applications. Also, the LearnSQL.com team has prepared a great SQL Window Functions Cheat Sheet. Print it and stick it on your desk, especially if you are new to window functions. Similarities and Differences Between Window and Aggregate Functions Now that we have seen both types of functions, we can summarize the similarities and differences between them. Both window functions and aggregate functions: Operate on a set of values (rows). Can calculate aggregate amounts (e.g. AVG(), SUM(), MAX(), MIN(), or COUNT()) on the set. Can group or partition data on one or more columns. Aggregate functions with GROUP BY differ from window functions in that they: Use GROUP BY() to define a set of rows for aggregation. Group rows based on column values. Collapse rows based on the defined groups. Window functions differ from aggregate functions used with GROUP BY in that they: Use OVER() instead of GROUP BY() to define a set of rows. May use many functions other than aggregates (e.g. RANK(), LAG(), or LEAD()). Groups rows on the row’s rank, percentile, etc. as well as its column value. Do not collapse rows. May use a sliding window frame (which depends on the current row). Let’s demonstrate that last difference with one more example. In this exercise, we want to calculate the average sales for the preceding and current days for each date (i.e. a 2-day moving average). I suggest starting with a common table expression (CTE) to define the daily_sales table, where we have the total sales for each day. Then, we use a window function with a sliding window frame to calculate the average of the total sales for the current and preceding days. The query is as follows: WITH daily_sales AS ( SELECT date, SUM(amount) AS sales_per_day FROM transactions GROUP BY date) SELECT date, AVG(sales_per_day) OVER (ORDER BY date ROWS 1 PRECEDING) AS avg_2days_sales FROM daily_sales ORDER BY date; Here is the result set: dateavg_2days_sales 2020-11-011550.5 2020-11-022621.525 2020-11-034290.375 2020-11-044218.15 In the first row, the table shows the total sales for November 1st because there is no preceding row for this date. Then, in the second row, we have the average sales for November 1st and 2nd; in the third row, the table includes the average sales for November 2nd and 3rd, and so on. Window functions are great for calculating moving averages – something you can’t do using just aggregate functions and GROUP BY(). Let’s Practice Window Functions! The following table summarizes all the similarities and differences between SQL’s aggregate functions and window functions: Aggregate functions + GROUP BYWindow Functions Operates on a set of rows (values) Groups data on one or more columns Uses aggregate functions like AVG(), SUM(), COUNT(), MIN(), and MAX() –Uses other functions, including RANK(), LAG(), LEAD(), and NTILE() Uses GROUP BY to define a set of rowsUses OVER() to define a set of rows Collapses individual rows into one summary rowKeeps individual rows and adds a summary column Groups rows based on the same column valueGroups rows by column value and also by the row’s rank, percentile, etc. Operates on a fixed group of valuesCan operate on a fixed or a sliding window frame Even though SQL window functions are an advanced topic, you can practice them on your own. This special guide on practicing SQL window functions gives some useful tips on learning the syntax of window functions and writing the corresponding queries. LearnSQL has prepared a comprehensive course on Window Functions; in it, you can practice creating sophisticated window frames with 218 interactive exercises. You’ll learn how to leverage window functions to compute running totals and moving averages, build rankings, find the best and worst performers, and investigate trends across time. You can learn more about this course from our interview with LearnSQL.com Chief Content Officer Agnieszka Kozubek-Krycuń. Thanks for reading and happy learning! Tags: sql learn sql Window Functions Aggregate Functions