29th May 2017 4 minutes read Useful SQL Patterns: Matching Nulls by Masking Nulls Aldo Zelen NULL SQL patterns Table of Contents Why Match by Null? The Problem with SQL Data Matching by Null Matching by Null with Masking Nulls One More Thing About SQL Data Matching and Nulls As you start coding in SQL, you will use some statements and techniques over and over again. We call these “SQL patterns”. This series will look at the most common SQL patterns and consider how to use them. In database development, SQL developers often find themselves returning to the same SQL statements. Learning about these now, early in your SQL journey, will help you work more efficiently. Today, in the first post of this series, we will consider the match by null SQL pattern related to SQL data matching. Why Match by Null? Often, an SQL developer must join two tables based on the values in a certain column (it’s the so-called SQL data matching). For the join to work, these columns must have matching values. For example, suppose we have a poorly-designed database with two tables, movies and genres. Every movie in the movie has a genre (action, comedy, adventure), but some movies have not yet been classified by genre. These records have NULL values in the genres column. In the genres table, there is a matching ID for NULL. So, the movies table has the following columns: movieid, title, and genres. The genres table has two columns, id and genres. Here’s the data in the movies table: movieid title genres 1 Toy Story (1995) 1 2 Jumanji (1995) 1 3 Grumpier Old Men (1995) 2 4 Waiting to Exhale (1995) 2 5 Father of the Bride Part II (1995) 2 6 Heat (1995) 3 7 Sabrina (1995) 2 8 Tom and Huck (1995) 1 9 Sudden Death (1995) 3 10 GoldenEye (1995) And here’s what you’d see in the genre table: id genres 1 Adventure 2 Comedy 3 Action Unknown Let’s say that you need to get a list of all movie titles with their genres, including any movies with unknown genres. Obviously, it makes sense to join the two tables using the genres column. If you were to do a simple join without using popular SQL patterns… SELECT title, g.genres FROM movies m JOIN genres g ON (m.genres = g.id) … You would get results that leave out any movies with an unknown value in the genre row after you’d match the following columns: title genres Tom and Huck (1995) Adventure Jumanji (1995) Adventure Toy Story (1995) Adventure Sabrina (1995) Comedy Father of the Bride Part II (1995) Comedy Waiting to Exhale (1995) Comedy Grumpier Old Men (1995) Comedy Sudden Death (1995) Action Heat (1995) Action GoldenEye (1995) Hmm. No GoldenEye. What went wrong? The Problem with SQL Data Matching by Null Remember, for a join to work, the values in the joining columns must match. But as regular readers of the LearnSQL.com blog know, NULL does not equal NULL. NULL is not the same as NULL! Why not? Sometimes English speakers will use “null” to mean “zero”. But in SQL, NULL does not equal zero. It does not signify “no value”. It is an unknown value. Since we do not know what one NULL represents, we can never say that it is equal to another NULL. Yet, we still need to match records on the basis of NULL values. How can we do this using SQL data matching? Matching by Null with Masking Nulls To get around the fact that we can’t match NULLs, we use the COALESCE statement. This masks all NULL values with a default value : SELECT title, g.genres FROM movies m JOIN genres g ON (COALESCE(m.genres, 0.0) = COALESCE(G.ID, 0.0)) As a result, you match the following columns including the unknown value: title genres Tom and Huck (1995) Adventure Jumanji (1995) Adventure Toy Story (1995) Adventure Sabrina (1995) Comedy Father of the Bride Part II (1995) Comedy Waiting to Exhale (1995) Comedy Grumpier Old Men (1995) Comedy Sudden Death (1995) Action Heat (1995) Action GoldenEye (1995) Unknown The genres column is a NUMERIC type column, so we have to mask NULL values with a number – in this case, “0”. If the column was of another data type, we would mask it with a default value of that data type. For instance, if we were dealing with a CHAR column, we could use an X. To read more about data types, visit the LearnSQL.com Blog. One More Thing About SQL Data Matching and Nulls Matching nulls is usually a quick fix used to get around data modeling mistakes. It’s better to fix your model rather than to work with null masking statements in join conditions. Still, sometimes this cannot be avoided, as with MERGE statements in complex ETL processing. By the way, there was a mistake in the design of this example. It was done on purpose to demonstrate the match by null pattern. Can you spot it? Can you fix it? Tags: NULL SQL patterns