9th Mar 2021 8 minutes read Why the SQL WITH Clause Is Awesome Kateryna Koidan sql learn sql WITH CTE Table of Contents Why Is the WITH Clause So Useful? Improving Readability Breaking Computations into Parts Nesting Computations Processing Hierarchical Structures Time to Practice WITH Clauses! If you’re not using WITH clauses yet, it’s definitely time to start! SQL WITH clauses, or common table expressions, help improve the structure of SQL queries by making them more readable. That’s already a lot, but WITH clauses have many more benefits. Let’s see together! The WITH clause was introduced in SQL:1999 to define views that are only valid for the query they belong to. Also known as common table expressions (CTEs), WITH clauses allow us to improve the structure of an SQL statement without polluting the database namespace. Using WITH clauses, we can create named subqueries that are referenced in the main query. The basic syntax is: WITH subquery1_name AS (SELECT … subquery1...), subquery2_name AS (SELECT … subquery2...) SELECT … main query ... As you can see, we can have multiple CTEs in one query. Moreover, one common table expression can reference another CTE (i.e. nested CTEs) or even itself (recursive CTEs). This gives us a whole bunch of interesting options, as we’ll see later in this article. If you are completely new to CTEs and subqueries, I recommend this beginner's guide to CTEs. You can also get hands-on practice writing WITH clauses in LearnSQL.com’s interactive Recursive Query course. Now, let’s go through several examples to see how we can benefit from common table expressions in real-world scenarios. Why Is the WITH Clause So Useful? WITH clauses are great at improving the structure of complex SQL queries that contain multiple computations. Let’s see how they work in practice! In our examples, we are going to analyze the San Francisco Bay area real estate market – arguably the most expensive place to live! So, here we have a table with houses available for sale. houses idaddressareacityprice 128 Sofia St1,565San Francisco998,000 225 Jack St1,680San Francisco1,850,000 313 Rose St1,148Redwood City265,000 435 Mary Ave3,800San Francisco5,500,000 554 Sara St1,098Redwood City1,375,000 612 John St1,910San Francisco1,195,000 732 Nice Ave1,340Mountain View349,000 812 Flower Ave1,234Redwood City1,048,000 934 Park St1,505Mountain View2,400,000 1078 Paul Ave1,870Mountain View1,658,000 Now it’s time to demonstrate the benefits of the WITH clause. Improving Readability When using WITH syntax, you name parts of the query as you write them. Unlike subqueries, the name comes before the query itself. Thus, when reading the query, you already know from the name what to expect. Furthermore, a well-written WITH clause is self-documenting. Let’s go through an example to see how WITH syntax makes the query more readable. To facilitate the evaluation of houses available for sale, we want to compare the price of each house with the minimum, average, and maximum house prices in its city. We can get the necessary output by using three CTEs that calculate the (1) minimum, (2) average, and (3) maximum prices for all cities in the table. Then, we can simply join our original table with the outputs of these three queries to display the required data: WITH min_price_city AS (SELECT city, MIN(price) AS minimum FROM houses GROUP BY city), avg_price_city AS (SELECT city, AVG(price) AS average FROM houses GROUP BY city), max_price_city AS (SELECT city, MAX(price) AS maximum FROM houses GROUP BY city) SELECT h.id, h.address, h.city, h.price, min.minimum, avg.average, max.maximum FROM houses h JOIN min_price_city min ON h.city = min.city JOIN avg_price_city avg ON h.city = avg.city JOIN max_price_city max ON h.city = max.city; idaddresscitypriceminimumaveragemaximum 1078 Paul AveMountain View1,658,000349,0001,469,0002,400,000 934 Park StMountain View2,400,000349,0001,469,0002,400,000 732 Nice AveMountain View349,000349,0001,469,0002,400,000 612 John StSan Francisco1,195,000998,0002,385,7505,500,000 435 Mary AveSan Francisco5,500,000998,0002,385,7505,500,000 225 Jack StSan Francisco1,850,000998,0002,385,7505,500,000 128 Sofia StSan Francisco998,000998,0002,385,7505,500,000 812 Flower AveRedwood City1,048,000265,000896,0001,375,000 554 Sara StRedwood City1,375,000265,000896,0001,375,000 313 Rose StRedwood City265,000265,000896,0001,375,000 In the query above, you can easily distinguish the three common table expressions. From their names, we can also quickly understand that the first clause calculates minimum prices in each city, the second one calculates the average prices, and the third one gives us the maximum prices. We also see how the result set of each CTE is used in the main query – they are referenced by name. WITH syntax makes queries easier to write and to read. Breaking Computations into Parts Another great benefit of WITH syntax is that it helps break computations into parts. In each WITH clause, you prepare a temporary table with the results of some computations; in the main query, you simply merge the results of all the preliminary computations. For example, let’s say we want to calculate how many houses in each city are expensive (i.e. priced above $1.5M) and how many are relatively cheap (i.e. below $500K). For this purpose, we’ll have two WITH clauses that separately calculate the number of (1) expensive and (2) cheap houses in each city. For convenience, we’ll also have a third WITH clause to get the list of cities. In the main query, we’ll join the results of these three subqueries to display the requested information: WITH cities AS (SELECT city FROM houses GROUP BY city), expensive AS (SELECT city, COUNT(*) AS expensive_houses FROM houses WHERE price > 1500000 GROUP BY city), cheap AS (SELECT city, COUNT(*) AS cheap_houses FROM houses WHERE price < 500000 GROUP BY city) SELECT c.city, e.expensive_houses, ch.cheap_houses FROM cities c FULL JOIN expensive e ON c.city = e.city FULL JOIN cheap ch ON c.city = ch.city; Here is the result of the query. In this case, the NULL values reflect that there are no houses above $1.5M in Redwood City and no houses below $500K in San Francisco: cityexpensive_housescheap_houses Mountain View21 San Francisco2NULL Redwood CityNULL1 As you see, WITH clauses make creating complex reports much easier. For more examples of applying WITH to complex analytical tasks, check out this article explaining CTEs by example. Nesting Computations Considering that the WITH syntax allows nesting, you can have several ‘layers’ of computations. For example, you can use one WITH clause to compute one aggregate function (like AVG()), then use another CTE to aggregate the result set of the first WITH clause (i.e. calculate the minimum average). Let’s demonstrate this with our house example. Our task will be to compute the average house price for each city and then compare this price with the minimum and maximum average price across the cities of the San Francisco Bay area. WITH avg_per_city AS ( SELECT city, AVG(price) AS average_price FROM houses GROUP BY city), min_price_city AS ( SELECT MIN (average_price) AS min_avg_price_city FROM avg_per_city), max_price_city AS ( SELECT MAX (average_price) AS max_avg_price_city FROM avg_per_city) SELECT ac.city, ac.average_price, min.min_avg_price_city, max.max_avg_price_city FROM avg_per_city ac CROSS JOIN min_price_city min CROSS JOIN max_price_city max; As you see, the first WITH clause computes the average house price for each city. The other two WITH clauses reference the first one to compute the minimum and maximum average prices across different cities. In the main query, we join the results of these CTEs to get the required output: cityaverage_pricemin_avg_price_citymax_avg_price_city Mountain View14690008960002385750 San Francisco23857508960002385750 Redwood City8960008960002385750 WITH clauses help build complex computations in one query. Excited? Wait until you hear about the amazing opportunities that recursive CTEs bring into play! Processing Hierarchical Structures The WITH syntax allows recursion. A recursive query is a query that refers to itself. Recursive CTEs are awesome for processing graphs, trees, and other hierarchical structures. Let’s say we have a company with a hierarchy where each employee has exactly one superior and superiors may have multiple subordinates: We can store these hierarchical relations in a table. In the employees table below, each employee is represented by a record that includes the employee’s ID, first name, last name, and the ID of their immediate superior. employees idfirst_namelast_namesuperior_id 1MariyaStevensNULL 2BobWhite1 3SarahGrey1 4KateWilliams2 5TanyaSmith2 6MaratSimpson3 7JackSailor3 8SophiaBarrel3 9TomStar3 Our task is to show the level of each employee in the company’s organizational structure, i.e. level 1 for the big boss, level 2 for her direct subordinates, and so on. Recursive queries let us accomplish this task in one query: WITH RECURSIVE levels AS ( SELECT id, first_name, last_name, superior_id, 1 AS level FROM employees WHERE superior_id IS NULL UNION ALL SELECT employees.id, employees.first_name, employees.last_name, employees.superior_id, levels.level + 1 FROM employees, levels WHERE employees.superior_id = levels.id ) SELECT * FROM levels; idfirst_namelast_namesuperior_idlevel 1MariyaStevensNULL1 2BobWhite12 3SarahGrey12 4KateWilliams23 5TanyaSmith23 6MaratSimpson33 7JackSailor33 8SophiaBarrel33 9TomStar33 In this example, we have created a recursive query called levels that refers to itself in the query body. The syntax of recursive CTEs is quite complicated. For a detailed explanation, I recommend reading this article that explains how recursive queries assist in processing hierarchical structure. Time to Practice WITH Clauses! Now you know that WITH clauses are very handy for creating complex reports and processing hierarchical structures. However, just reading about them is not enough. This is a complicated topic, especially when it comes to recursion. Practice is key to success here! I recommend starting with LearnSQL.com’s interactive course on Recursive Queries. It includes 114 coding challenges that cover all types of common table expressions, including simple CTEs, nested CTEs, and recursive CTEs. The WITH syntax is usually not taught in a typical SQL course. So, this course is quite unique. If you want to upgrade your skills in building complex, multilevel reports in SQL, consider also taking the SQL Reporting track. It includes three interactive courses that cover simple SQL reports, revenue trend analysis, and customer behavior analysis in SQL. Thanks for reading, and happy learning! Tags: sql learn sql WITH CTE