7th Sep 2023 7 minutes read COUNT OVER PARTITION BY: An Explanation with 3 Examples Ignacio L. Bisso sql PARTITION BY Table of Contents Example #1: Introduction to Using COUNT OVER PARTITION BY Example #2: Calculate Totals Based on Different Criteria Note: COUNT(DISTINCT) Doesn’t Work with OVER(PARTITION BY) Example #3: Using COUNT() with OVER In Expressions Using OVER PARTITION BY with COUNT() and Other Functions In SQL, combining the COUNT() function with the OVER() clause and PARTITION BY opens the door to a totally different way to count rows. In this article, you’ll learn how to do several COUNTs in a single query and calculate expressions based on the result of the COUNT function. In this article, we will cover how you can use the COUNT() function combined with the clauses OVER() and PARTITION BY. To find out more about this topic, I recommend our interactive course Window Functions. It contains over 200 interactive exercises on using the OVER() clause with window functions. After finishing this course, you'll embrace this topic with ease and feel comfortable using window functions in SQL databases. Example #1: Introduction to Using COUNT OVER PARTITION BY Let’s suppose we have a table called order with a record for each sales order received in a pet shop. The table has columns like order_id, order_date, customer_id, salesperson_id, ship_address, ship_state and amount_paid. The following query shows the orders received by the company during the first half of 2023. Note we added an extra column called orders_this_customer that shows the total number of orders sent by each customer during this period. SELECT order_id, order_date, customer_id, amount_paid, COUNT(*) OVER (PARTITION BY customer_id) AS orders_this_customer FROM order WHERE order_date >= '2023-01-01' AND order_date <= '2023-06-30'; In SQL, we use the COUNT() function alone or combined with the GROUP BY clause to count rows in a result set or in a group of rows. OVER() and PARTITION BY applies the COUNT() function to a group or rows defined by PARTITION BY. In our example, the group is defined by the values in the column customer_id. The COUNT() function counts the number of orders with the same customer_id. The combination of COUNT() and OVER(PARTITION BY) is more powerful than using the COUNT() function alone because it allows us to get the number of rows for each specific value of a column. When using OVER() and PARTITION BY, we don’t need to use the GROUP BY clause to group records; this allows us to have result sets at row level. Each row in the result set will have information at order level, but the value in the orders_this_customer column will have the total number of orders placed by each customer. In other words, we are combining report-level data with customer-level data in the same row. And we can add other level data using different columns in the PARTITION BY clause. Below is a partial result for this query: order_idorder_datecustomer_idamount_paidorders_this_customer 1002023-06-01John Doe25.404 1012023-06-01Eva Fox34.101 1022023-06-01John Doe23.184 1032023-06-02Xi Pea45.953 Before finishing this section, I would like to suggest the article The SQL COUNT() Function Explained with 7 Examples, where you can find many example queries using the COUNT() function. For those readers who want to go deeper into the topic, I recommend the article How to Use the SQL PARTITION BY With OVER where you can find a clear explanation with examples of the OVER() and PARTITION BY clauses. Example #2: Calculate Totals Based on Different Criteria Before continuing, let’s show the complete table order. Each row of the table represents an order received by the pet shop. An order row has a customer_id, a salesperson_id, an order_date, a ship_state, and a ship_city, among other columns; the other columns are self-explanatory. Here is a partial view of the table: order_idorder_datecustomer_idsales_person_idproduct_familyShip_ addressShip_cityShip_ stateamount_paid 1002023-06-01John DoeJamesDOG23 Street 342DallasTX25.40 1012023-06-01Eva FoxMaryDOG9 Street 142MiamiFL34.10 1022023-06-01John DoeJamesCAT23 Street 342El PasoTX23.18 1032023-06-02Xi PeaJamesFISH65 Street 113TampaFL45.95 1042023-06-02John DoeJamesCAT23 Street 342DallasTX23.18 1052023-06-02Xi PeaJamesFISH15 Street 13TampaFL45.95 1062023-06-02Sin XuMaryDOG52 Street 441El PasoTX25.00 1072023-06-03Xi PeaRobDOG78 Street 563TampaFL15.55 1082023-06-04Sean PenRobCAT18 Street 262MiamiFL85.35 1092023-06-04John DoeMaryDOG52 Street 441UplandCA63.00 Let’s suppose that the pet shop, at the end of the month, wants to have a report with all the orders. For each row, they also want to show two calculated fields: the total number of orders sold on this day and the total number of orders sold by this salesperson. The query to obtain this report is: SELECT order_id, order_date, customer_id, salesperson_id, COUNT(1) OVER (PARTITION BY order_date) as orders_per_day, COUNT(1) OVER (PARTITION BY salesperson_id) as orders_per_salesperson FROM order WHERE order_date between '2023-06-01' AND '2023-06-30'; The result of the query is shown below: order_idorder_datecustomer_idsalesperson_idorders_per_dayorders_per_salesperson 1012023-06-01Eva FoxMary33 1002023-06-01John DoeJames35 1022023-06-01John DoeJames35 1032023-06-02Xi PeaJames45 1042023-06-02John DoeJames45 1052023-06-02Xi PeaJames45 1062023-06-02Sin XuMary43 1072023-06-03Xi PeaRob12 1082023-06-04Sean PenRob22 1092023-06-04John DoeMary23 In the above query, we used the expression COUNT(1), which works the same as COUNT(*). The PARTITION BY clause indicates the criteria for grouping the rows to be counted. To obtain the field orders_per_day, we use the clause OVER (PARTITION BY order_date); for the field orders_per_salesperson, we use the clause OVER (PARTITION BY salesperson_id). Here, note that the only difference is the field that we put after the PARTITION BY clause. This field defines the criteria for counting rows; that is, all rows with the same value in this field will be counted together. The effect of these two fields in the query result is that we add two fields of different levels of granularity to the row; the orders_per_salesperson field is a total grouped by salesperson, while the field orders_per_day is a total grouped by order date. Note: COUNT(DISTINCT) Doesn’t Work with OVER(PARTITION BY) There are some variations of the COUNT() function, such as COUNT(*), COUNT(1), or COUNT(DISTINCT). If you want to read more about them, see our article What is the Difference Between COUNT(*), COUNT(1), COUNT(column name), and COUNT(DISTINCT column name)? You’ll find a clear explanation of these different ways to use the COUNT() function. In some cases – e.g. if we need to count how many different people placed orders in a given day – we might think of using the COUNT(DISTINCT customer_id) OVER (PARTITION BY order_date) expression. However, it is important to clarify that COUNT(DISTINCT) OVER(PARTITION BY) is not supported by most popular databases (like PostgreSQL, SQL Server, and Snowflake, among others). Example #3: Using COUNT() with OVER In Expressions In the next example query, we will calculate some percentages using arithmetic expressions with two different COUNTs. Let’s suppose that the marketing department wants to specifically target the next marketing campaign to a precise group of customers. To do this, they need some metrics about the business results: the percentage of orders received from each state, the percentage of orders received for each family of products, and the combination of both metrics (e.g. the percentage of orders received from Texas for the DOG family of products). The query to obtain such a report is: SELECT DISTINCT ship_state, product_family, COUNT(1) OVER () AS total_orders, COUNT(1) OVER (PARTITION BY ship_state) state_orders, COUNT(1) OVER (PARTITION BY ship_state) / COUNT(1) OVER () AS state_percentage, COUNT(1) OVER (PARTITION BY product_family) AS family_orders, COUNT(1) OVER (PARTITION BY product_family) / COUNT(1) OVER () AS family_percentage FROM order ORDER BY ship_state, product_family In the above query, the field total_orders uses the expression COUNT(1) OVER () to calculate the total quantity of orders received by the company. Then in the field state_orders, the COUNT() expression is used to obtain the quantity of orders received from the current row’s state (e.g. if the current row’s ship_state value is TX (Texas), this will show all orders from Texas); this field is very similar to what we calculated in previous examples. The interesting part of this example query is in the expression of the state_percentage field, which uses two COUNTs to calculate a percentage. In other words, we are dividing the number of orders from the current state (the red expression) by the total number of orders (the blue expression). In the last two fields, we repeat the same approach for the field product_family. The result of the query is below: ship_stateproduct_familytotal_ordersstate_ordersstate_percentagefamily_ordersfamily_percentage CADOG1010.1050.50 FLCAT1050.5030.30 FLDOG1050.5050.50 FLFISH1050.5020.20 TXCAT1040.4030.30 TXDOG1040.4050.40 Using OVER PARTITION BY with COUNT() and Other Functions In this article, we covered the COUNT() function combined with the OVER and PARTITION BY clauses. However, in SQL there are many other functions that can be combined with PARTITION BY. These functions are called window functions, and you can find excellent learning material on them in our Window Functions course. It has over 200 interactive exercises and offers a complete window functions tutorial. If you already know window functions and are looking for more practice, I recommend our Window Functions Practice Set. This course has been designed to offer comprehensive window functions practice on real-world examples. Its 100 exercises on three different datasets resemble problems you’ll see in the real world. If you’re looking for more resources on window functions, check out our SQL Window Functions Cheat Sheet and Top 10 SQL Window Functions Interview Questions, where you can find some material to prepare yourself for a SQL job interview. Go for it, learn SQL, and invest in you! Tags: sql PARTITION BY