7th Jun 2022 6 minutes read How to Combine Two Aggregate Functions in SQL Tihomir Babic sql learn sql aggregate functions Table of Contents Sample Data The Naive Solution Real Solution 1: Subquery Real Solution 2: CTE Become a Master of Data Aggregation and Reporting Having trouble using two aggregate functions in one query? This article will show you how to do it the right way – actually, the two right ways. In data analysis and reporting, we often need to count the number of records or sum them up and then calculate the average of this count or sum. Translated to SQL logic, this is the aggregation of aggregated data, or multi-level aggregation. For aggregation purposes, there are the SQL aggregate functions. And for multi-level aggregation, you’d use (at least) two aggregate functions at the same time. If you’re interested in quality reporting, you’ll need much more than SQL’s aggregate functions. However, they're certainly the basis of good reporting. The best way to build on that basis is to take our interactive Creating Basic SQL Reports course. It offers a whole section dedicated solely to computing multi-level aggregations in SQL and summarizing data using aggregate functions. It also covers data classification using the CASE WHEN and GROUP BY, calculating multiple metrics in one report, and comparing groups within the same report. With 97 interactive exercises, this course will help you master reporting in SQL! To make it even easier, I’ll show you here how to combine two aggregate functions in SQL. If your knowledge of aggregate functions is rusty, this guide to SQL aggregate functions can help you follow this article. You can also practice on these aggregate functions examples. Sample Data We have a table called new_users. It collects data about the new users of an app in the South American market. The columns are: id – The ID for each individual record. date – The date when the users joined. number_of_new_users – The number of new users by date. city – The users’ city. country – The users’ location country. You see below there are two dates in total, with every date having a different number of users from two countries and two cities in each country. iddatenumber_of_new_userscitycountry 12022-05-1029CordobaArgentina 22022-05-1047Buenos AiresArgentina 32022-05-1022BogotáColombia 42022-05-1052MedellínColombia 52022-05-1137CordobaArgentina 62022-05-1119Buenos AiresArgentina 72022-05-1141BogotáColombia 82022-05-1187MedellínColombia How would you use the AVG() and SUM() functions to calculate the average daily number of new users by country? The Naive Solution The logic behind the naive approach is, actually, quite sound. It’s a shame that this naive solution is no solution at all. You’ll soon realize why. If you only think about the mathematical approach, it’s simple: you first have to sum the number of new users per country every day, then calculate the average of that sum. Transferred to SQL functions, it seems logical to write something like this: SELECT country, AVG(SUM(number_of_new_users)) AS average_new_daily_users FROM new_users GROUP BY country; Why not, right? There’s a very simple reason why not: SQL does not allow the nesting of aggregate functions. In other words, you can’t use an aggregate function within an aggregate function. Well, you can, but the query will return an error saying exactly what I just said: How do you fix this query to combine two aggregate functions in SQL? There are two options: using a subquery or using Common Table Expressions (CTEs). Real Solution 1: Subquery The first option is to combine two aggregate functions using a subquery. The subquery is a query within the main query. When creating reports, they are usually found in the SELECT, FROM, or WHERE clauses. In this example, I’ll put the subquery in the FROM clause. SELECT country, AVG(ds.sum_new_users) AS average_daily_new_users FROM (SELECT date, country, SUM(number_of_new_users) AS sum_new_users FROM new_users GROUP BY date, country) AS ds GROUP BY country; The principle when combining two aggregate functions is to use the subquery for calculating the ‘inner’ statistic. Then the result is used in the aggregate functions of the outer query. The above code selects the date and country and calculates the sum of the column number_of_new_users. This returns the total daily number of new users. I gave the subquery the alias ds, which is short for ‘daily sum’. Once I got the daily sum, I referenced this in the outer query by calculating the average of the column ds.sum_new_users – i.e. the average of the daily new users. I want this average to be by country; that’s why I grouped the data by the country column. countryaverage_daily_new_users Colombia101 Argentina66 Real Solution 2: CTE The other option for combining aggregate functions in SQL is using a CTE instead of a subquery. A CTE is a tidier and “closer to the mathematical logic” version of a subquery. It is an expression that allows you to create a temporary result, which you can reference in another SELECT statement. You can use the result of a CTE as you would any other table. The difference is that the CTE result exists only when a CTE is run together with the query using the CTE. A more detailed explanation of the CTEs can be found in the article What Is a CTE?. The query below will give you the same result as the subquery solution: WITH ds AS ( SELECT date, country, SUM(number_of_new_users) AS sum_new_users FROM new_users GROUP BY date, country) SELECT country, AVG(ds.sum_new_users) AS average_daily_new_users FROM ds GROUP BY country; Every CTE is introduced using the WITH keyword. The same is true for my CTE named ds. After the keyword AS comes the CTE definition. In this case, it’s the same SELECT statement as in the subquery from the previous example. Here comes the sum(): it will, again, return the sum of new users by country and date. The second SELECT statement references CTE and calculates the average of the sum returned by CTE. Unlike nesting the subquery, using the CTE allows you to use the aggregate functions in the logical order: first SUM(), then AVG(). The ‘inner’ function is used in the CTE, while the second SELECT statement is for the ‘outer’ aggregate function. The result will be the same as with the subquery; no need for me to show it again. However, it’s always good to learn more about the differences between a subquery and a CTE and then practice using CTEs on real-life examples. Become a Master of Data Aggregation and Reporting The problem of using two aggregate functions in SQL is ubiquitous. The more complex reporting becomes, the more you’ll use multi-level aggregations. You now know subqueries and CTEs are the two solutions for that. All other nuances of reporting can be found and practiced in our Creating Basic SQL Reports course. CTEs can make your reporting easier and more sophisticated, so taking the Recursive Queries course would be a wise idea. Tags: sql learn sql aggregate functions