11th Aug 2022 4 minutes read How to Use Aggregate Functions in WHERE Clause Tihomir Babic sql WHERE aggregate functions Table of Contents Example Data An Aggregate Function in the WHERE Clause? Use Aggregate Functions in the HAVING Clause Aggregate Functions and Filtering Go Hand in Hand, But Not in the WHERE Clause Filtering data according to the result of an aggregate function is a common data analysis task. So how do you use aggregates in the WHERE clause? We’ll dedicate this whole article to answering that question. Combining aggregate functions and filtering based on their results is often used in data analysis – e.g. showing branches with total sales above X, countries where the number of posts is lower than Y, students with an average score below Z, and so on. Newer SQL users often attempt to do such filtering in the WHERE clause. But as we’ll see in this article, that’s not a viable solution. So let’s see how you can get the results you need. First, though, we’ll take a look at our dataset. In this case, the example is based on data from the European Climate Assessment & Dataset. Example Data Our dataset consists of only one table: temperature_data. It stores information about the maximum recorded daily temperature in several cities. The columns are: id – The ID of the recorded temperature. city – The city where the temperature was recorded. date – The date when the temperature was recorded. temperature – The highest temperature recorded (in degrees Celsius). To be more graphical, here’s all the data from the table. idcitydatetemperature 1Szczecin2022-05-3118.60 2Szczecin2022-05-3017.20 3Szczecin2022-05-2916.40 4Szczecin2022-05-2814.70 5Szczecin2022-05-2717.00 6Rotterdam2022-05-3118.40 7Rotterdam2022-05-3014.50 8Rotterdam2022-05-2914.50 9Rotterdam2022-05-2815.80 10Rotterdam2022-05-2716.10 11Berlin2022-05-3121.00 12Berlin2022-05-3017.50 13Berlin2022-05-2915.20 14Berlin2022-05-2816.30 15Berlin2022-05-2718.50 An Aggregate Function in the WHERE Clause? What I want to do with this data is find the average highest temperature by city and show only those cities with an average above 16°C. New analysts might write the query using the AVG() aggregate function in the WHERE clause: SELECT city, AVG(temperature) AS average_max_temperature FROM temperature_data WHERE AVG(temperature) > 16 GROUP BY city; First, we’re finding the average highest temperature by city. Then we’re using this result in the WHERE clause; that seems about right. But this query throws an error! And this error very explicitly answers the question of how you use aggregate functions in the WHERE clause: you don’t! Aggregate functions are not allowed because the WHERE clause is used for filtering data before aggregation. So while WHERE isn’t for aggregation, it has other uses. To filter data based on an aggregate function result, you must use the HAVING clause. Use Aggregate Functions in the HAVING Clause It’s simple to fix the above query: you only need to replace WHERE with HAVING. The HAVING clause is used for filtering data, much like WHERE. The main difference between WHERE and HAVING is that HAVING filters data after aggregation. So when writing code, you have to make sure the HAVING clause comes after the GROUP BY. This has something to do with the order of operations in SQL. Let’s rewrite our earlier query, removing WHERE and adding HAVING (after GROUP BY, of course!): SELECT city, AVG(temperature) AS average_highest_daily_temperature FROM temperature_data GROUP BY city HAVING AVG(temperature) > 16; Once again, we’re selecting the city and finding the average temperature using the AVG() function. Now, though, data is grouped by city and filtered using the HAVING clause to show only averages above 16. cityaverage_highest_daily_temperature Szczecin16.78 Berlin17.70 From the original data, the cities of Szczecin and Berlin remain because their average highest daily temperature is above 16°C. Aggregate Functions and Filtering Go Hand in Hand, But Not in the WHERE Clause Knowing that you can’t use the aggregate functions in the WHERE clause is only half of the answer. Of course, you still need some way to filter using aggregate functions’ output. The HAVING clause is that way! Both clauses look quite similar, especially because the way you set up a condition is the same. However, the important thing here is knowing the aggregate functions can be used in HAVING, but not in WHERE. For this difference to come to you naturally, you’ll need to write plenty of code. And for that, you’ll need plenty of examples, which is exactly what our SQL Practice course is all about. Tags: sql WHERE aggregate functions