12th Mar 2021 20 minutes read SQL JOIN Topics That Require Practice Tihomir Babic sql learn sql JOIN sql practice Table of Contents What Is JOIN? Beginner Topics Simple Two-Table Joins Joining Multiple Tables Intermediate Topics When You Need a LEFT JOIN or FULL JOIN Multiple Joins With LEFT JOIN or FULL JOIN Advanced Topics Non-Equi Join Are you interested in learning how to join tables in SQL? Where do you start? Many JOIN topics require practice. In this article, we’ll explore these topics by writing codes to solve real-life situations. Hopefully, you will learn which JOIN topics require attention. To make it easier for you, the topics are divided into beginner, intermediate, and advanced. Joining two or more tables in SQL lifts data analysis to another level. However, this can be a challenging topic when you first start learning SQL. The difficulty arises from understanding what you want and choosing the right way to join tables to get the desired result. To know what you want requires experience. And to get experience, you need practice. By using different ways of joining tables in SQL and making mistakes, you’ll learn when to use each type of JOIN. And, trust me, you’ll need to use JOINs a lot if you want to do anything remotely meaningful in SQL. No matter where you are on the SQL learning curve, your knowledge of JOINs can be bolstered by the SQL JOINs course. Although this course can be used for online practice, there’s also a special course dedicated to SQL JOIN practice, featuring many interactive exercises. Since there are various levels of SQL JOIN knowledge, I’ve divided the JOIN topics into three levels: beginner, intermediate, and advanced. But before I give you some practice material, let me explain in short what JOIN is. What Is JOIN? JOIN is an SQL clause used to combine data from two or more tables. There are four main types of JOINs. In gaining proficiency in SQL JOINs, it’s essential to understand the differences between them, how to use them, and when to use them. So, here they are: JOIN or INNER JOIN LEFT JOIN or LEFT OUTER JOIN RIGHT JOIN or RIGHT OUTER JOIN FULL JOIN or FULL OUTER JOIN I’ll explain how these JOINs work by giving you practical examples. For those who like to get into theory, don’t worry! I’ll link to helpful articles that explain these JOINs in detail. Let’s start with some beginner topics to practice. Beginner Topics Simple Two-Table Joins The need to join two tables in SQL is almost unavoidable. This is due to database logic: data is usually stored in multiple tables. Different data are stored in different tables. To join data from multiple tables into a meaningful result, you’ll need to use JOIN. The starting point is, of course, to connect one table with another. For example, if you have a list of the customers in one table and a list of countries in another, you’ll be able to find out which country every customer is from if you use JOIN. Or maybe there’s a list of company employees. If you join this table with another that contains a list of company departments, you can see which employees work in each department. Let me show you how it works with the country, city, and customer tables. The country table has the following data: id – the country’s ID and the table’s primary key (PK) country_name – the country’s name The city table has the following data: id – the city’s ID and the table’s primary key (PK) city_name – the city’s name country_id – the country’s ID and the table’s foreign key (FK), which references the table country The customer table has the following data: id – the customer’s ID and the table’s primary key (PK) customer_name – the customer’s name country_id – the country’s ID and the table’s foreign key (FK), which references the table country city_id – the city’s ID and the table’s foreign key (FK), which references the table city Before writing the code, let’s go through the JOIN syntax. Here’s how the basic JOIN code should look: SELECT column_list FROM table1 JOIN table2 ON table1.column_name = table2.column_name The code consists of three parts. The first one is the SELECT part, where you specify all the columns you want to show as a query result. The second part starts with the FROM clause. In this part, specify the table you want to get the data from. Here, act like you’ll use only one table. After you’ve specified the first table, write the JOIN clause. After you’ve done that, simply specify the second table you want the data from. The third part of the code is where you specify the columns used to combine the data from both tables. Do this by using the keyword ON. What’s next? Simply specify the columns from the first table and second table and put the equal sign (=) between them. Let’s see how it works in an example. We want to join the tables country and customer to show the customer’s name and country. To get this simple list, you should write this code: SELECT customer_name, country_name FROM customer JOIN country ON customer.country_id = country.id; Applying the syntax logic, you could read the above code like this: Select the columns customer_name and country_name. Do that from the table customer, which is joined with the table country. Join these two tables on two columns they have in common: the column country_id from the table customer and the column id from the table country. Take a look at the result: customer_namecountry_name Good ClientPortugal IT SuperioreItaly Zola Inc.Portugal WeMakeFoodSpain Jelly BellySpain Baldrick&SonsSpain Special GuysItaly Blackadder ConsultancyPortugal MismanagedItaly MultisocksItaly This is precisely what we wanted! One note: Writing JOIN in the code is the same as writing INNER JOIN. Both work and both will return the same result. I can't emphasize enough how vital practicing two-table JOINs is, especially if you're a beginner. Like anything in SQL, you need to write lots of code to become proficient and write them like second nature. JOINs can look scary at first. But practicing SQL JOINs makes them less scary. Practicing two-table JOINs is also the basis for any other JOIN. When you get a grip on two-tables JOINs, it'll be easier for you to grasp any other JOIN topic. Now, let me give you some tips for using the ON condition. One useful habit is to always state both the table’s name and the column’s name when using ON. Why is that? First of all, it’s easier to read the code that way. You won’t have to wonder which column is in which table once you explicitly write it in the code. This is especially helpful if your code becomes more complex. There is another reason why it’s good to write code like this. If you don’t have a habit of explicitly specifying the tables after the ON condition, you could be tempted to write the code this way: SELECT customer_name, country_name FROM customer JOIN country ON country_id = id; Run this code, and you’ll see it returns an error. Why is that? Both the customer and country tables have a column named id. SQL doesn’t know which id column you want. You have to be more specific and write this to avoid the error: ...ON country_id = country.id... Although this will get you the table above, this syntax can be confusing. If you’re not careful, you might think country_id is the same as country.id. Also, it doesn’t look nice! One table is not specified explicitly, while the other one is. I don’t like such “imbalances” in my code. Maybe it’s my mild form of OCD speaking. Nevertheless, I recommend building a habit of explicitly specifying tables. That way, you avoid errors, increase the code readability, and don’t trigger your possible OCD. Let this become your habit: ...ON customer.country_id = country.id... One more tip is to start using aliases. That way, you won’t have to write the tables’ full names. This is beneficial when you have multiple joins and long table names. Here’s how to use aliases: SELECT customer_name, country_name FROM customer cu JOIN country co ON cu.country_id = co.id; In the above code, the table customer now has the alias cu. The alias co is assigned to the table country. After the ON condition, you don’t need to write the tables’ full names. Instead, simply use the aliases. You can read more about joining tables in this article explaining what the inner join is. If you want a how-to guide, here’s one explaining how to join tables in SQL. Now that you know the basic principles of joining tables in SQL, let’s build on that! Next, we will learn about joining three tables. Joining Multiple Tables Once you understand how to join three tables, you’ll be able to join any number of tables. Let’s build on the previous example! In addition to the customer’s name and country, you want to show the customer’s city, too. To get this, you’ll have to join all three tables. Here’s how: SELECT customer_name, country_name, city_name FROM customer cu JOIN country co ON cu.country_id = co.id JOIN city ci ON cu.city_id = ci.id; First, a list of columns will appear in the result. Then, the tables customer and country are joined in the same way as before. After you’ve joined these two tables, you simply write JOIN again and specify the next table, city. Now, let’s pause for a moment and think: what data do the tables customer and city have in common? It’s the ID of the city. This data can be found in the column city_id of the table customer. The same data can be found in the column id of the table city. These two columns have to be specified after yet another ON clause, precisely as in the code above. So, the logic is as follows: Join one table with another on the common columns. Then, join the third table on the common columns. This will form a chain of JOIN clauses and ON conditions. The above code reads like this: Select the columns customer_name, country name, and city_name. Do that from the table customer, which is joined with the table country. Join those two tables on two columns they have in common: the column country_id from the table customer and the column id from the table country. Then, join the table city via the column city_id from the table customer and the column id from the table city. The code will return this result: customer_namecountry_namecity_name Good ClientPortugalLisabon IT SuperioreItalyGenoa Zola Inc.PortugalPorto WeMakeFoodSpainBarcelona Jelly BellySpainBarcelona Baldrick&SonsSpainMadrid Special GuysItalyRome Blackadder ConsultancyPortugalLisabon MismanagedItalyGenoa MultisocksItalyGenoa From this point on, you’ll be able to connect as many tables as you want. For example, you can use this to get employees from one table, connect it to their departments, and calculate their average salary. Or, you can get a list of products, and analyze their sales per region and per quarter. As you can see, joining three or more tables unlocks the possibility of creating exciting analyses and reports. But what you need is practice, a lot of practice. This article is a good start, but you need to get into as many situations as possible where you can practice SQL JOINs. Before jumping to more intermediate topics, make sure you understand this concept well. Here’s some more in-depth explanation of multiple joins to help you. Intermediate Topics In this part, I’ll explain the SQL JOIN topics you should practice after you’ve practiced the basics. We’ll go over when you should use LEFT JOIN and when the FULL JOIN is more suitable. I’ll also show you how to make multiple joins using those two types of joins. When You Need a LEFT JOIN or FULL JOIN The first step towards recognizing whether to use a LEFT JOIN or FULL JOIN is knowing the difference between them. The LEFT JOIN clause returns all the rows from the left table and all the matching rows from the right table. When there’s no matching record in the right table, it will return a NULL value. Using FULL JOIN will return all the rows from the left table and also all the rows from the right table. Not only that, but it will also return all the matching rows from both tables. For instance, you could use LEFT JOIN to get the professors list, with the subjects assigned to them. That way, you will get all the professors with assigned subjects and those with no subject assigned. But what if you also want to see if there’s a subject that has no professor assigned to it yet? That’s when you would use FULL JOIN. Another example is to get all the employees, including those that are not allocated to any department yet. Here, you would use LEFT JOIN. But what if you also want to see if there's a department with no employees? Then you could use FULL JOIN. Let me show you how this works with a real example. Here are three tables I will use. The department table has the following data: id – the department’s ID and the table’s primary key (PK) department_name – the department’s name The project table has the following data: id – the project’s ID and the table’s primary key (PK) project_name – the project’s name The employee table has the following data: id – the employee’s ID and the table’s primary key (PK) first_name – the employee’s first name last_name – the employee’s last name department_id – the department’s ID and the table’s foreign key (FK), which references the table department project_id – the project’s ID and the table’s foreign key (FK), which references the table project If your task is to list all the employees and the departments they work in, you should use LEFT JOIN. Here’s how: SELECT first_name, last_name, department_name FROM employee e LEFT JOIN department d ON e.department_id = d.id; The code writing logic is the same as with INNER JOIN. First, there’s a list of columns you want to see in the result. Then, you join the table employee with the table department using the LEFT JOIN clause. The tables are joined on the columns department_id and id. This is what you get when you run the code: first_namelast_namedepartment_name FrankStiglitzOperations SusanCoppolaSales KonradJohnsonSales MariaChirinosAccounting FabioAlcantaraSales MiriamSimoneAccounting OlufelaDiaoNULL ShoshanaLafayetteOperations JoseSilvaNULL NadiaIgnatovaNULL When there are no matching rows, the query returns NULL values. This means that Olufela Diao, Jose Silva, and Nadia Ignatova are not allocated to any department. Now that you’ve used LEFT JOIN, maybe it’s time to read an in-depth article explaining how LEFT JOIN works. What if you want to see everything as above, but you also want to check if there’s a department that has no employees? Then, you’ll have to use FULL JOIN: SELECT first_name, last_name, department_name FROM employee e FULL JOIN department d ON e.department_id = d.id; Again, first, there’s a list of the columns. The tables employee and department are again joined, this time using FULL JOIN instead of LEFT JOIN. They are joined on the same columns as above: department_id and id. Here’s how the results look: first_namelast_namedepartment_name FrankStiglitzOperations SusanCoppolaSales KonradJohnsonSales MariaChirinosAccounting FabioAlcantaraSales MiriamSimoneAccounting OlufelaDiaoNULL ShoshanaLafayetteOperations JoseSilvaNULL NadiaIgnatovaNULL NULLNULLHuman Resources As you can see, this table differs from the previous one only in the last row. Now, the result includes Human Resources, with NULL values in the columns first_name and last_name. This means that the Human Resources department has no employees. You’ve probably realized that the code syntax is the same no matter which JOIN you choose to use. What changes significantly is the result, so it’s essential to understand what every JOIN does and when to use it. Again, this understanding mainly comes from practicing writing JOIN queries. If you work with SQL a lot, you should specifically practice recognizing when to use LEFT JOIN, FULL JOIN, and INNER JOIN. Let’s now practice multiple joins using LEFT JOIN and FULL JOIN. Multiple Joins With LEFT JOIN or FULL JOIN What If you want to find out if there are employees with no department and no project to work on? You should use multiple LEFT JOINs to combine the tree tables. But you also have to filter the data so that not all employees are shown. Here’s how: SELECT first_name, last_name, department_name, project_name FROM employee e LEFT JOIN department d ON e.department_id = d.id LEFT JOIN project p ON e.project_id = p.id WHERE department_name IS NULL AND project_name IS NULL; Once again, there’s a list of columns. Next, the tables employee and department are connected on the columns department_id and id using LEFT JOIN. However, I still need data about the projects, so I again use the LEFT JOIN, this time to connect the table project where the project_id column equals the id column. Since I want only those employees with no department and no projects, I need to use the WHERE clause. In this case, I want only the results in which the department_name is NULL and the project_name is NULL, too. Here’s what the result looks like: first_namelast_namedepartment_nameproject_name OlufelaDiaoNULLNULL It shows that only Olufela Diao has no department and no project. This seems like an excellent place to recommend an article about how to LEFT JOIN multiple tables. What about FULL JOIN? When could it be useful in joining multiple tables? Use it if you want to find out if there’s a project that has no employees or department allocated to it: SELECT first_name, last_name, department_name, project_name FROM employee e FULL JOIN department d ON e.department_id = d.id FULL JOIN project p ON e.project_id = p.id WHERE first_name IS NULL AND last_name IS NULL AND department_name IS NULL; As in every other SELECT statement, there’s a list of columns. The tables employee and department are joined using FULL JOIN. This is done via the columns department_id and id. Then comes the next FULL JOIN, connecting the table project on the columns project_id and id. Finally, there’s a WHERE clause filtering the data. Here’s the result: first_namelast_namedepartment_nameproject_name NULLNULLNULLPointlessAsHell It seems the only project that no employee and no department is working on is PointlessAsHell. Not surprising, considering its name. Advanced Topics In this part, I’ll be talking about advanced join topics I didn’t mention yet, self join and non-equi join, that require practice when you want to master SQL JOINs. Self Join A self join is not a different type of join. It’s a regular join, but instead of joining two or more tables, the table is joined with itself. As with any other JOIN or SQL concept, the easiest way to learn is to practice self joins a lot. You’re probably wondering in which situations self joins are useful. Self joins are very useful when querying hierarchical data, for example, when you want to find all the employees under a manager or all descendants of an ancestor. However, I’ve chosen a more straightforward example to get to know the self join. In this example, there’s only one table, subscriber. This table contains data about magazine subscribers. There are the following columns: wing columns: id – the subscriber’s ID and the table’s primary key (PK) first_name – the subscriber’s first name last_name – the subscriber’s last name date_of_birth – the subscriber’s date of birth city – the subscriber’s home city For self join practicing purposes, your task is to show every subscriber and all the other subscribers from the same city, if there are any. Self join comes in handy here: SELECT DISTINCT s1.first_name, s1.last_name, s1.city FROM subscriber s1 JOIN subscriber s2 ON s1.city = s2.city WHERE s1.id <> s2.id ORDER BY city; The important thing when self joining is that you have to use aliases. The reason is you’ll be joining the table with itself. Aliases allow you to treat the same table as a different one by giving it a different alias. That way, SQL will know from which table to get the data. Let me now analyze the query above. First, there’s a list of columns I want to see in the result. There’s also a DISTINCT clause because I want to see each subscriber only once. I don’t want it to be repeated according to how many people from the same city are there. After listing the columns, I’ve joined the table subscriber with itself. I’ve done this by simply using the JOIN clause. I’ve also assigned different aliases to the same table, so SQL now thinks I’ve joined two different tables. I’ve joined the table with itself on the column id: one from table s1, the other from table s2. I’ve added the WHERE clause to get the data in which the id from one table is different from this column from the second table. Why do I need this condition? Because I only want the data where there’s more than one subscriber in the same city. Finally, the data is ordered by the city: first_namelast_namecity Jaelvan GaalAmsterdam Janvan GelderAmsterdam SeanSheridanAmsterdam HaraldBohteBerlin MariaWagnerBerlin MichelleTartagliaParis MortenNicholsonParis Now that you know how to self join a table, you can further your knowledge by practicing with seven more examples of the self join. Non-Equi Join You probably think you first have to learn the equi joins first before you go to the non-equi join. Well, you already did that! Every single join you’ve seen so far is an equi join. What’s that? It’s merely a join that uses an equal sign (=) between the columns when you’re joining tables. Non-equi join is precisely the opposite. Instead of an equal sign (=), you use one of the comparison operators such as <, >, <=, >=, !=, <>, and BETWEEN. How does this work in practice? Non-equi joins are helpful if, for example, you need to find all the subscribers that are older than Jael van Gaal. She’s the subscriber with id = 5. Here’s the query that’ll get the job done: Non-equi joins are helpful if, for example, you need to find all the subscribers that are older than Jael van Gaal. She’s the subscriber with id = 5. Here’s the query that’ll get the job done: SELECT s1.first_name, s1.last_name, s1.date_of_birth FROM subscriber s1 JOIN subscriber s2 ON s1.date_of_birth < s2.date_of_birth WHERE s2.id = 5; The code selects the lists of the columns. Then, the table subscriber is joined with itself. The table is joined on the column date_of_birth, but it’s done when the date_of_birth from the first table is lower (meaning: older) than the date_of_birth from the second table. You need the data only for those older than Jael van Gaal. That’s why there is a WHERE clause filtering the data. If you haven’t noticed, this non-equi join is also a self join. It can be used as a self join or as any other regular join connecting two or more tables. The only difference is there’s a comparison operator, not the equal sign (=). I’ve almost forgotten to show you the result of the query: first_namelast_namedate_of_birth MortenNicholson1950-03-30 Janvan Gelder1955-12-08 FrancescaTomatito1930-04-06 There are other situations where non-equi joins can be useful. They are great for listing pair combinations (for example, all the games in the football league) or when you’re performing data quality checks (for example, finding duplicates where there shouldn’t be any). Learn more in this article that gives you practical examples of non-equi joins. Why are the practical examples important? Non-equi joins can give you a lot of exciting possibilities for data analysis. Even though the concept is relatively straightforward (if you understand equi joins), you’ll really begin to understand non-equi joins once you start practicing and using them often. Practice will provide you with experience, which, in return, makes you better at code writing and recognizing when to use the non-equi joins. Congratulations, this is the last example I’ll bother you with! Do You Want To Join the Masters of JOIN? If you do, you’ve already taken the first step. By reading this article, you’ve had the chance to get a grasp of what JOINs can do. You now realize that JOINs are necessary for using SQL, be it at a beginner, intermediate, or advanced level. In short, you can’t avoid using them! I’ve provided you with plenty of examples and additional materials for mastering joining tables. This should only be a starting point. Now that you understand the general concepts, it’s time to dive deeper and practice on your own. Here are some examples of how you could practice JOINs. You can do it on your own, but you can also practice online with the help of interactive courses that are specially designed for learning JOINs and practicing JOINs. Tags: sql learn sql JOIN sql practice