19th Oct 2021 9 minutes read What Is a Recursive CTE in SQL? Tihomir Babic sql learn sql CTE Table of Contents What are CTEs? Non-Recursive CTE Syntax Recursive CTE Syntax Example 1 – Finding Bosses and Hierarchical Level for All Employees Example 2 – Finding the Investment Amount by Investor Example 3 – Finding Routes Between Cities Continue Practicing Recursive CTEs The article that’ll show you practical examples of using recursive CTEs in SQL. If you’ve heard about SQL’s recursive CTEs but never used them, this article is for you. It’s also for you if you never get tired of recursive CTE examples. Before we dig into recursion, I’ll remind you what CTEs are and what their syntax is. Then I’ll do the same for recursive CTEs. After that, I’ll show you how recursive CTEs work in three examples. What are CTEs? The CTE (common table expression), also known as the WITH clause, is an SQL feature that returns a temporary data set that can be used by another query. As it’s a temporary result, it’s not stored anywhere, but it still can be referenced like you would reference any other table. There are two types of CTEs, non-recursive and recursive. Here’s a nice article that will show you what CTEs are and how they work. Non-Recursive CTE Syntax The general syntax of a non-recursive CTE looks like this: WITH cte_name AS (cte_query_definition) SELECT * FROM cte_name; The first part of the syntax is the CTE. It begins with the keyword WITH. Then you give your CTE a name. After you follow that by the AS keyword, you can define CTE in the parentheses. The second part of the syntax is a simple SELECT statement. It is written immediately after the recursive CTE, without any commas, semicolons, or similar marks. Like I said earlier, the CTE is used in another query just like any other table. This is exactly what the SELECT statement does. Here’s the article that can additionally help you with the CTE syntax and its rules. And if you need some more CTE examples, this article is for you. Recursive CTE Syntax A recursive CTE references itself. It returns the result subset, then it repeatedly (recursively) references itself, and stops when it returns all the results. The syntax for a recursive CTE is not too different from that of a non-recursive CTE: WITH RECURSIVE cte_name AS ( cte_query_definition (the anchor member) UNION ALL cte_query_definition (the recursive member) ) SELECT * FROM cte_name; Again, at the beginning of your CTE is the WITH clause. However, if you want your CTE to be recursive, then after WITH you write the RECURSIVE keyword. Then it’s business as usual: AS is followed by the parentheses with the CTE query definition. This first query definition is called the anchor member. To connect the anchor member with the recursive member, you need to use the UNION or UNION ALL command. The recursive member is, obviously, the recursive part of CTE that will reference the CTE itself. You’ll see how it works in an example very soon. Recursive CTEs are used primarily when you want to query hierarchical data or graphs. This could be a company’s organizational structure, a family tree, a restaurant menu, or various routes between cities. See these articles to understand how CTEs work with hierarchical structures and how to query graph data. Now that we understand how recursive CTEs work, let’s look at some examples. Example 1 – Finding Bosses and Hierarchical Level for All Employees For this problem, I’ll use data from the table employees, which has the following columns: id: The employee’s ID. first_name: The employee’s first name. last_name: The employee’s last name. boss_id: The employee boss’s ID. Here’s what the data looks like: idfirst_namelast_nameboss_id 1DomenicLeaver5 2ClevelandHewins1 3KakalinaAtherton8 4RoxannaFairlieNULL 5HermieComsty4 6PoohGoss8 7FaulknerChalliss5 8BobbeBlakeway4 9LaureneBurchill1 10AugustaGosdin8 It’s not too complicated. For example, Domenic Leaver’s boss is the employee with the ID of 5; that’s Hermie Comsty. The same principle works for all other employees except Roxanna Fairlie. She has no boss; there is a NULL value in the column boss_id. We can conclude that Roxanna is the president or owner of the company. Let’s now write the recursive CTE to list all employees and their direct bosses. WITH RECURSIVE company_hierarchy AS ( SELECT id, first_name, last_name, boss_id, 0 AS hierarchy_level FROM employees WHERE boss_id IS NULL UNION ALL SELECT e.id, e.first_name, e.last_name, e.boss_id, hierarchy_level + 1 FROM employees e, company_hierarchy ch WHERE e.boss_id = ch.id ) SELECT ch.first_name AS employee_first_name, ch.last_name AS employee_last_name, e.first_name AS boss_first_name, e.last_name AS boss_last_name, hierarchy_level FROM company_hierarchy ch LEFT JOIN employees e ON ch.boss_id = e.id ORDER BY ch.hierarchy_level, ch.boss_id; What does this query do? It’s a recursive query, so it starts with WITH RECURSIVE. The name of the CTE is company_hierarchy. After AS, the CTE definition is in the parentheses. The first SELECT statement selects all the employee table columns where the column boss_id is NULL. In short, it will select Roxanna Fairlie, because only she has a NULL value in that column. Even shorter: I’m starting the recursion from the top of the organizational structure. There’s also a column hierarchy_level with the value of 0. That means the owner/president’s level is 0 – they're on top of the hierarchy. I’ve used the UNION ALL to connect this SELECT statement with the second one, i.e. with the recursive member. In the recursive member, I’m selecting all the columns from the table employees and the CTE company_hierarchy where the column boss_id is equal to the column id. Notice the part hierarchy_level + 1. This means that with every recursion, the CTE will add 1 to the previous hierarchy level, and it will do that until it reaches the end of the hierarchy. Also note that I’m treating this CTE as any other table. To finish defining the CTE, simply close the parentheses. Finally, there’s a third SELECT statement, outside of the CTE. It selects the columns that will show employees, their bosses’ names, and the hierarchy level. Data is taken from the CTE and the table employees. I’ve joined those two with a LEFT JOIN, since I want all the data from the CTE – including Roxanna Fairlie, who has the NULL value in the column boss_id. The result will be shown in ascending order: first by the hierarchy level, then by the boss’s ID. Here’s how it looks: employee_first_nameemployee_last_nameboss_first_nameboss_last_namehierarchy_level RoxannaFairlieNULLNULL0 HermieComstyRoxannaFairlie1 BobbeBlakewayRoxannaFairlie1 DomenicLeaverHermieComsty2 FaulknerChallissHermieComsty2 AugustaGosdinBobbeBlakeway2 PoohGossBobbeBlakeway2 KakalinaAthertonBobbeBlakeway2 LaureneBurchillDomenicLeaver3 ClevelandHewinsDomenicLeaver3 Roxanna Fairlie is the ultimate boss; you already knew that. There are two employees at level 1. This means Bobbe Blakeway and Hermie Comsty are the direct subordinates of Roxanna Fairlie. At level 2, there are employees whose direct bosses are Bobbe Blakeway and Hermie Comsty. There’s also a third level in the hierarchy. Those are employees whose immediate boss is Domenic Leaver. Example 2 – Finding the Investment Amount by Investor In this example, I’ll use the table investment: id: The investment’s ID. investment_amount: The investment’s amount. The data in the table looks like this: idinvestment_amount 19,705,321.00 25,612,948.60 35,322,146.00 These are the amounts of the three possible investment options. They will be considered by the three investors, who will divide the total investment amount in equal shares. Your task is to calculate the amount per investor depending on their number, i.e. if one, two, three, or no investors invest in each investment. The query that solves this problem is: WITH RECURSIVE per_investor_amount AS ( SELECT 0 AS investors_number, 0.00 AS investment_amount, 0.00 AS individual_amount UNION SELECT investors_number + 1, i.investment_amount, i.investment_amount / (investors_number + 1) FROM investment i, per_investor_amount pia WHERE investors_number << 3 ) SELECT * FROM per_investor_amount ORDER BY investment_amount, investors_number; Once again, the CTE starts with WITH RECURSIVE, followed by its name and the query definition. This time, I’ll use the anchor member of the recursive query to create some data. The columns are investors_number, investment_amount, and individual_amount. This is the point I want the recursion to start from (the same way it was in the previous example, with hierarchy_level = 0). Then comes the UNION and the recursive member. This part of the query will increase the investors_number column by one with every recursion. It will do that for every investment_amount. The third column will calculate the amount of that investment per investor, depending on the number of investors participating. The recursion will be performed for up to three investors (i.e. until it reaches the condition WHERE investors_number < 3). After that comes the simple SELECT statement that will return all the columns from the CTE. And here’s the result: investors_numberinvestment_amountindividual_amount 00.000.00 15,322,146.005,322,146.00 25,322,146.002,661,073.00 35,322,146.001,774,048.67 15,612,948.605,612,948.60 25,612,948.602,806,474.30 35,612,948.601,870,982.87 19,705,321.009,705,321.00 29,705,321.004,852,660.50 39,705,321.003,235,107.00 It’s not difficult to analyze. If there are no investors, the investment amount is zero, and so is the individual amount. If the investment is 5,322,146.00 and there’s only one investor, then the amount per investor will be 5,322,146.00. If there are two investors in the same amount, then each of them will have to pay 2,661,073.00. If all three investors decide to invest, each will pay 1,774,048.67. The other two investment amounts follow the same pattern, as you can see in the table. Example 3 – Finding Routes Between Cities In the third example, I’ll be using the table cities_route, which contains data about Dutch cities: city_from: The departure city. city_to: The destination city. distance: The distance between two cities, in kilometers. city_fromcity_todistance GroningenHeerenveen61.4 GroningenHarlingen91.6 HarlingenWieringerwerf52.3 WieringerwerfHoorn26.5 HoornAmsterdam46.1 AmsterdamHaarlem30 HeerenveenLelystad74 LelystadAmsterdam57.2 Use this table to find all the possible routes from Groningen to Haarlem, showing the cities on the route and the total distance. Here’s the query to solve this problem: WITH RECURSIVE possible_route AS ( SELECT cr.city_to, cr.city_from || '->' ||cr.city_to AS route, cr.distance FROM cities_route cr WHERE cr.city_from = 'Groningen' UNION ALL SELECT cr.city_to, pr.route || '->' || cr.city_to AS route, CAST((pr.distance + cr.distance) AS DECIMAL(10, 2)) FROM possible_route pr INNER JOIN cities_route cr ON cr.city_from = pr.city_to ) SELECT pr.route, pr.distance FROM possible_route pr WHERE pr.city_to = 'Haarlem' ORDER BY pr.distance; Let’s see what this query does. The first SELECT statement in the CTE definition will select the columns from the table cities_route where the departure city is Groningen. Notice there’s also a new column called route, which I’ll use to concatenate the cities on the route. The UNION ALL connects this with the recursive member. This SELECT statement will select the arrival city, concatenate the cities on the route, and finally add the distances between these cities to the total of the route between Groningen and Haarlem. To achieve all that, I’ve joined the CTE with the table cities_route. Then comes the SELECT statement that pulls data from the CTE. It will select the route and the distance where the arrival city is Haarlem, with the data being ordered by distance in ascending order. The query result looks like this: routedistance Groningen->Heerenveen->Lelystad->Amsterdam->Haarlem222.6 Groningen->Harlingen->Wieringerwerf->Hoorn->Amsterdam->Haarlem246.5 It’s not difficult to understand this table. There are two routes from Groningen to Haarlem. They include different cities in between and they are 222.6 km and 246.5 km long, respectively. If you want to keep learning, check out how you can use a recursive CTE instead of a long SQL query. And after tackling that subject,have a little fun by drawing something using a recursive CTE. Continue Practicing Recursive CTEs These three examples have demonstrated the possibilities of recursive CTEs in SQL. Now it’s time to build on what you learned here. Probably the best option is having a go at our Recursive Queries course. It offers you plenty of examples, explanations, and practice opportunities. The course is part of the Advanced SQL course track, where you can learn about other advanced SQL topics like window functions, GROUP BY extensions, and recursive queries. Our platform offers many ways to practice these topics. Have fun! Tags: sql learn sql CTE