13th Jul 2021 6 minutes read What Is the SQL GROUPING SETS Clause, and How Do You Use it? Andrew Bone sql learn sql GROUP BY ORDER BY Table of Contents SQL GROUPING SETS Syntax SQL GROUPING SETS Example SQL ROLLUP Example SQL CUBE Example Group Your Data Effectively With SQL GROUP BY Extensions GROUPING SETS are groups, or sets, of columns by which rows can be grouped together. Instead of writing multiple queries and combining the results with a UNION, you can simply use GROUPING SETS. GROUPING SETS in SQL can be considered an extension of the GROUP BY clause. It allows you to define multiple grouping sets in the same query. Let’s look at its syntax and how it can be equivalent to a GROUP BY with multiple UNION ALL clauses. SQL GROUPING SETS Syntax The general syntax of the GROUPING SETS is as follows: SELECT aggregate_function(column_1) column_2, column_3, FROM table_name GROUP BY GROUPING SETS ( (column_2, column_3), (column_2), (column_3), () ); You can see how we are grouping by the different sets. This syntax is equivalent to the following lengthier query that uses GROUP BY with UNION ALL to combine the results: SELECT SUM(column_1), column_2, column_3 FROM table_name GROUP BY column_2, column_3 UNION ALL SELECT SUM(column_1), column_2, NULL FROM table_name GROUP BY column_2 UNION ALL SELECT SUM(column_1), NULL, column_3 FROM table_name GROUP BY column_3 UNION ALL SELECT SUM(column_1), NULL, NULL FROM table_name If you use the GROUP BY like this, you need multiple UNION ALL clauses to combine the data from different sources. UNION ALL also requires all result sets to have the same number of columns with compatible data types, so you need to adjust the queries by adding a NULL value where required. Even though the query will work as you expect, it has two main problems: It is lengthy and not very manageable. It can lead to a performance issue, because SQL has to scan the sales table each time. The GROUPING SETS clause addresses these problems. But how does it affect the output compared to a traditional GROUP BY clause? Time to look at an example! SQL GROUPING SETS Example We need some sample data. Let’s create a table called payments that contains all payments our company has received in January, February, and March for the past four years, 2018 to 2021. The exact store where the payment occurred is denoted by the store_id column. To create this table, execute the following query: CREATE TABLE payment (payment_amount decimal(8,2), payment_date date, store_id int); INSERT INTO payment VALUES (1200.99, '2018-01-18', 1), (189.23, '2018-02-15', 1), (33.43, '2018-03-03', 3), (7382.10, '2019-01-11', 2), (382.92, '2019-02-18', 1), (322.34, '2019-03-29', 2), (2929.14, '2020-01-03', 2), (499.02, '2020-02-19', 3), (994.11, '2020-03-14', 1), (394.93, '2021-01-22', 2), (3332.23, '2021-02-23', 3), (9499.49, '2021-03-10', 3), (3002.43, '2018-02-25', 2), (100.99, '2019-03-07', 1), (211.65, '2020-02-02', 1), (500.73, '2021-01-06', 3); You can view the data using this simple SELECT clause: SELECT * FROM payment ORDER BY payment_date; Executing this query yields the result: payment_amountpayment_datestore_id 1200.992018-01-181 189.232018-02-151 3002.432018-02-252 33.432018-03-033 7382.102019-01-112 382.922019-02-181 100.992019-03-071 322.342019-03-292 2929.142020-01-032 211.652020-02-021 499.022020-02-193 994.112020-03-141 500.732021-01-063 394.932021-01-222 3332.232021-02-233 9499.492021-03-103 You can see that there are multiple entries for some stores. Imagine we are preparing a report and we want to see one total for each store. The SUM() aggregate function can help us with this. We will also use the GROUP BY clause to group our results by year and store. SELECT SUM(payment_amount), YEAR(payment_date) AS 'Payment Year', store_id AS 'Store' FROM payment GROUP BY YEAR(payment_date), store_id ORDER BY YEAR(payment_date), store_id; SUM(payment_amount)Payment YearStore 1390.2220181 3002.4320182 33.4320183 483.9120191 7704.4420192 1205.7620201 2929.1420202 499.0220203 394.9320212 13332.4520213 The results are aggregated by each unique combination of year and store. However, we can’t see the total payments by year: the total payments for 2018, 2019, 2020, or 2021. We cannot see the totals by store either, which would be a useful metric to have. Using GROUPING SETS allows us to view these totals. SELECT SUM(payment_amount), YEAR(payment_date) AS 'Payment Year', store_id AS 'Store' FROM payment_new GROUP BY GROUPING SETS (YEAR(payment_date), store_id) ORDER BY YEAR(payment_date), store_id; SUM(payment_amount)Payment YearStore 3079.89NULL1 14030.94NULL2 13864.90NULL3 4426.082018NULL 8188.352019NULL 4633.922020NULL 13727.382021NULL Wow, our results changed drastically! Now, we see just the grand totals for each store along with the grand totals for each year. For the columns by which the rows are not grouped, you see the NULL values. Remember that you can include multiple GROUP BY clauses in your GROUPING SETS. Applying this to our query yields the following: SELECT SUM(payment_amount), YEAR(payment_date) AS 'Payment Year', store_id AS 'Store' FROM payment_new GROUP BY GROUPING SETS ( (YEAR(payment_date), store_id), (YEAR(payment_date)), (store_id) ) ORDER BY YEAR(payment_date), store_id; SUM(payment_amount)Payment YearStore 3079.89NULL1 14030.94NULL2 13864.90NULL3 4426.082018NULL 1390.2220181 3002.4320182 33.4320183 8188.352019NULL 483.9120191 7704.4420192 4633.922020NULL 1205.7620201 2929.1420202 499.0220203 13727.382021NULL 394.9320212 13332.4520213 Before ending this tutorial, we should mention two other SQL GROUP BY extensions that could prove useful for your particular project or scenario: ROLLUP and CUBE. These topics are covered in great detail in this Advanced SQL learning track from LearnSQL.com, which features window functions, GROUP BY extensions, and recursive queries. SQL ROLLUP Example Similar to GROUPING SETS, you can use the ROLLUP option in a single query to generate multiple grouping sets. ROLLUP assumes a hierarchy among the input columns. For example, if the input columns are: GROUP BY ROLLUP(column_1,column_2) the hierarchy for this is column_1 > column_2, and ROLLUP generates the following grouping sets: (column_1, column_2) (column_1) () ROLLUP generates all grouping sets that make sense in this hierarchy. It generates a subtotal row every time the value of column_1 changes; this is the hierarchy we have provided. For this reason, we often use ROLLUP to generate subtotals and grand totals in reporting. The ordering of your columns in ROLLUP is very important. Let's look at a query that uses ROLLUP: SELECT SUM(payment_amount), YEAR(payment_date) AS 'Payment Year', store_id AS 'Store' FROM payment GROUP BY ROLLUP (YEAR(payment_date), store_id) ORDER BY YEAR(payment_date), store_id SUM(payment_amount)Payment YearStore 30975.73NULLNULL 4426.082018NULL 1390.2220181 3002.4320182 33.4320183 8188.352019NULL 483.9120191 7704.4420192 4633.922020NULL 1205.7620201 2929.1420202 499.0220203 13727.382021NULL 394.9320212 13332.4520213 The grand total is shown at the top of the result: 30975.73NULLNULL The rest of the result is structured as follows. First, the yearly total is shown: 4426.082018NULL This is followed by the totals by store by year: 1390.2220181 3002.4320182 33.4320183 As you can see, ROLLUP generates a subtotal row every time the value of Payment Year changes, since this is the hierarchy we provided. This example shows how useful ROLLUP can be for reporting purposes. SQL CUBE Example Similar to ROLLUP, CUBE is an extension of the GROUP BY clause. It allows you to generate subtotals for all combinations of the grouping columns specified in the GROUP BY clause. The CUBE is like combining GROUPING SETS and ROLLUP. It shows the detailed output of both. SELECT SUM(payment_amount), YEAR(payment_date) AS 'Payment Year', store_id AS 'Store' FROM payment GROUP BY CUBE (YEAR(payment_date), store_id) ORDER BY YEAR(payment_date), store_id SUM(payment_amount)Payment YearStore 30975.73NULLNULL 3079.89NULL1 14030.94NULL2 13864.90NULL3 4426.082018NULL 1390.2220181 3002.4320182 33.4320183 8188.352019NULL 483.9120191 7704.4420192 4633.922020NULL 1205.7620201 2929.1420202 499.0220203 13727.382021NULL 394.9320212 13332.4520213 The main difference in this output from the ROLLUP example is that the grand total for each store is also shown here. 3079.89NULL1 14030.94NULL2 13864.90NULL3 Apart from these rows, all the rows in this result are the same as the result of the ROLLUP. This concludes our comparison of GROUPING SETS, ROLLUP, and CUBE! You can find more examples in this article about grouping, rolling and cubing data. Group Your Data Effectively With SQL GROUP BY Extensions Gaining mastery of the SQL GROUP BY extensions will take practice. Options like GROUPING SETS, ROLLUP, and CUBE allow you to manipulate the results of your queries in different ways. Knowing how to use these extensions effectively reduces the need for manually formatting your data before passing it on to relevant stakeholders. To expand your knowledge further in this area, consider this GROUP BY extensions course from LearnSQL.com that covers GROUPING SETS, ROLLUP, and CUBE. Tags: sql learn sql GROUP BY ORDER BY