30th Mar 2017 7 minutes read How to Organize SQL Queries When They Get Long Marek Pankowski tips cte Table of Contents How to Organize SQL Queries Tip 1: Indent Your Code Tip 2: Use the WITH Clause Tip 3: Explain Yourself with Comments Tip 4: Break Queries into Steps Tip 5: Stick with One Naming Convention The Mysterious Long SQL Query – Summary The first long SQL query you’ll have to deal with is likely to be hard for you to structure and understand. These five tips will teach you the best way to organize SQL queries, i.e. write and format them. As we all know, SQL queries are essential to database management. Without them, it would be extremely difficult to find and work with the information in a database. Query length depends on the type of information we need and the size of the database. When we read a simple piece of SQL code, we are usually able to understand it quite easily. But what if we’re dealing with a 1,500-line query? If we want to understand what we’ve done (or let others understand it), we have to know how to organize SQL queries. Fortunately, remembering just five useful tips will help you deal with every long SQL query you’ll ever write. . How to Organize SQL Queries Tip 1: Indent Your Code Indentation helps keep your long SQL query clean by identifying where each block of code begins. This makes program structure more understandable and enables developers to easily find a specific instruction. When you correctly indent your code, you can quickly see what you are selecting, which tables provide the data, and what restrictions apply to it (i.e. the WHERE and GROUP BY clauses). Let’s see a relatively long SQL query example: SELECT column1 ,column2 FROM table1 WHERE column3 IN ( SELECT TOP(1) column4 FROM table2 INNER JOIN table3 ON table2.column1 = table3.column1 ) Compare the code above to this sample: SELECT column1, column2 FROM table1 WHERE column3 IN ( SELECT TOP(1) column4 FROM table2 INNER JOIN table3 ON table2.column1 = table3.column1) Now, imagine that there are 100 similar instructions, too. Without indentation, such long SQL query example will be very hard to read and understand! There is no single accepted indentation style. Every company or programmer usually develops their own. Even so, your indentation style should make sense to others. To help you get started, I’ve provided links to style guides by ApexSQL and Simon Holywell. Indentation is the first thing you should do when you start to write code. If you inherit a chunk of un-indented code, there are a lot of sites that let you format your long SQL query for free. Tip 2: Use the WITH Clause The WITH clause allows you to name a subquery block and treat it like a normal table. In this case, the WITH clause is basically a replacement for a normal subquery. Look at the long SQL query example below. Its task is to show the number of hospitals in Los Angeles where patients whose last visit was more than 14 days ago received an average drug dosage over 1,000 units. WITH patient_data AS ( SELECT patient_id, patient_name, hospital, drug_dosage FROM hospital_registry WHERE (last_visit > now() - interval '14 days' OR last_visit IS NULL) AND city = "Los Angeles" ) WITH average_dosage AS ( SELECT hospital, AVG(drug_dosage) AS Average FROM patient_data GROUP BY hospital ) SELECT count(hospital) FROM average_dosage; WHERE AVG(drug_dosage) > 1000 I’ll explain all elements in this query. The first element is the WITH clause. Using it, we can specify two subqueries as patient_data and average_dosage and use them like normal tables. The patient_data subquery was used to create the average_dosage subquery. So then average_dosage becomes the SELECT statement’s final data source. This arrangement is much cleaner than writing everything in one query and putting all the subqueries in the WHERE instructions. Because it makes code easier to read, the WITH clause is arguably the best way to organize SQL queries. Tip 3: Explain Yourself with Comments Code tells how, but comments explain why. Comments are some of the most important helpful features in any code, from a programming project in Java to an SQL query. They allow programmers to express their thought processes and their desired outcomes. Always add comments to your code. You’ll be glad you did, and so will any other programmers who work on your SQL. Even if you are just taking a break, add comments – they will help you pick up where you left off. This is how a very simple SQL query looks without comments: SELECT name, student_group, FROM course_marks WHERE points < 300 OR points IS NULL; Do you really know exactly what the WHERE means? Without comments, it is hard to understand what the coder wanted to accomplish. If you wonder how to organize SQL queries best, make it a habit to leave comments in your code. Let’s see this same long SQL query, but with comments and explanations: /*Get the list of every student who failed or was absent on exam*/ SELECT name, student_group, FROM course_marks WHERE points < 300 OR points IS NULL; --student fails when gets 300 or less points. --When student was absent, instead of points there is NULL in column Better, isn’t it? These comments have explained the developer’s actions and helped us understand what the query does. Tip 4: Break Queries into Steps A long SQL query is really a collection of blocks of code, which are much easier to control and to check for mistakes. Building your query a step at a time is a best practice for SQL development. This will enable you to find logical errors faster and be more confident of your results. Look how much easier it is to understand this long SQL query example because it’s been broken into logical steps: WITH subject AS ( SELECT student_id, subject_id, school_id, sum(exam_score) AS total_score FROM subject_marks GROUP BY student_id, subject_id, school_id ), student AS ( SELECT student_id, school_id, avg(total_score) AS average_student FROM subject GROUP BY student_id, school_id ) SELECT school_id, avg(average_student) FROM student GROUP BY school_id; We want the result of this query to show us the average amount of points for every student from every school. The first stage (the WITH subject clause) finds the sum of points for every subject. The second stage (the WITH student clause) averages points for every student. By the time we reach the SELECT, all we need to do is find the average amount of points and group the results by school. By building a long SQL query this way, you actually write a query that is not very long nor complicated any more. It just depends on how you choose to break it up. Tip 5: Stick with One Naming Convention It is also very important to use a single naming convention for tables, columns, and queries. This makes the SQL code easier to write and read. When a legacy database has tables named PRODUCT, users, USERS_other, and UserSECOND_NEW, it looks very unprofessional and the coding gets chaotic. Follow the established rules for capitalization, underscores, spaces, etc. This also applies to query styles: stick with the same formatting rules and indentation patterns for the whole database. If you start a new line after a SELECT, do it every time, especially when writing a long SQL query. Naming conventions can be very different from database to database. As with indentation, there is no single industry-recognized style. I recommend that you find which conventions are the most widely used and model your own style on those. These articles in the Vertabelo and Launch by Lunch blogs are a good place to start. The best way to understand the difference between good and bad query styling is to see them for yourself. Below, we find an example of poor style: SELECT name, SECONDNAME, Date_TIME, address, SecondAddress FROM registry, other_USERSData WHERE registry.id = other_USERSData.id; This query, which uses one constant style, is much more readable: SELECT name, second_name, date_time, address, second_address FROM registry, other_users_data WHERE registry.id = other_users_data.id; The Mysterious Long SQL Query – Summary To sum up, these tips will help you write better queries, no matter their length, but it’s especially important to apply them whenever you write a long SQL query. The best thing you can do is to practice using good techniques. If you don’t know how to develop your skills, our SQL Basics course is a great solution. There are a lot of lessons that will teach you how to organize SQL queries. It will take time to learn how to write clean and smart SQL queries. You’ll need to understand table relationships, code and data planning, how to identify the parts of a query, and what results you can expect. Patience and hard work are the keys to success! Tags: tips cte