16th Jan 2024 16 minutes read SQL CASE WHEN Explained: 10 Easy Examples for Beginners Maria Durkin CASE WHEN learn sql Table of Contents Understanding CASE WHEN Syntax Basic Syntax: CASE WHEN THEN CASE WHEN THEN ELSE Multiple THENs in CASE WHEN Examples of Using CASE WHEN in Data Analysis Example 1: Categorizing Data Example 2: Handling NULL Values Example 3: Creating Aggregated Columns Example 4: Marketing Analysis Example 5: Customer Segmentation Example 6: Categorizing Products by Price Range Example 7: Analyzing Order Fulfillment Status Example 8: Segmenting Customers by Purchase Frequency Example 9: Assessing Employee Performance Ratings Example 10: Grouping Products by Release Year 8 Tips for Using CASE WHEN in Data Analysis Beyond the Basics with CASE WHEN The CASE WHEN statement lets us make decisions on our data, categorizing and manipulating records based on specified conditions. Find out how to use CASE WHEN in this article. Imagine you're deciding what to wear for the day. You take out your umbrella if it's raining; if not, you leave it at home. This decision-making procedure is essentially the same as a SQL CASE WHEN statement. In the realm of SQL, the CASE WHEN statement functions much like an if-then-else expression, allowing us to create custom classifications within a query. Other programming languages use similar logic – e.g. Python uses if, elif, and else, and JavaScript uses the switch statement. This concept's widespread use across computer languages emphasizes how crucial it is: it gives users the capacity to handle a variety of situations. In SQL, this construct is a vital tool for data analysis. In this article, you'll find real-world practical exercises using the CASE WHEN statement for data analysis. This statement enables analysts to craft customized logic for classification and decision-making in their queries. As a result, the query’s accuracy and the analysis’ depth is enhanced. If this sounds interesting, why not explore our Creating Basic SQL Reports course? Along with covering details of the CASE WHEN syntax, this course also teaches you how to use SQL aggregation functions such as COUNT() and SUM(). You'll quickly learn how to compute averages, compare business groupings, and arrange intricate queries. Now, let's dive into the intricacies of SQL CASE WHEN and demystify the statement through some straightforward examples! Understanding CASE WHEN Syntax To explore the complexities of the CASE WHEN statement, let's break down its syntax using a few examples. Basic Syntax: CASE WHEN THEN To begin, we will examine the simplest syntax of the SQL CASE WHEN statement. This construct is especially helpful for segmenting records according to a given criteria and generating a new column to show the outcomes. Here’s the syntax: SELECT column_name, CASE WHEN condition THEN result END AS new_column FROM your_table; Let's explain each part in detail: SELECT: Specifies the columns to be included in the result set. CASE: Evaluates the specified condition for each row in the dataset. This Initiates the conditional logic. WHEN condition THEN result: Defines the condition to be checked and the result to be assigned if the condition is met. This allows for the dynamic classification of data. END: Marks the end of the CASE It signifies that the evaluation of conditions and assignment of results are complete. AS new_column: Creates a new column named new_column in the result set. This column captures the outcomes of the CASE WHEN evaluation for each row. This approach is particularly valuable when you want to introduce a categorical dimension to your data based on specific conditions. In the query below, we use CASE WHEN to label cities with temperatures exceeding 30 degrees Celsius as High: SELECT city, temperature, CASE WHEN temperature > 30 THEN 'High' END AS temperature_category FROM weather_data; This simplifies temperature pattern analysis with clear threshold labels in the new column, contributing to a more expressive and informative dataset for further analysis. Note that in this statement, records that don’t meet the specified condition will have a NULL value in the new column. This leads us to our next statement. CASE WHEN THEN ELSE Without an ELSE clause, we run the risk of our new column containing NULL values. By incorporating an ELSE clause, this construct provides a fallback result when the condition is not satisfied. This is useful when you want to ensure that every record in the dataset receives a meaningful value in the new column. Here’s the syntax: SELECT column_name, CASE WHEN condition THEN result ELSE alternative_result END AS new_column FROM your_table; Let's break down the conditions: WHEN condition THEN result: Defines the primary condition to be checked and the corresponding result to be assigned if the condition is met. ELSE alternative_result: Specifies an alternative result to be assigned when the primary condition is not This ensures that the new column will not contain NULL values. Let’s return to the weather data example. By introducing an ELSE clause, temperatures below the specified threshold will now be labeled as Normal. This construction works well to ensure every record in the new column has a defined value – which improves data analysis accuracy. This is the new query: SELECT city, temperature, CASE WHEN temperature > 30 THEN 'High' ELSE 'Normal' END AS temperature_category FROM weather_data; As you can see, all temperatures under 30 degrees Celsius are classed as normal; everything over 30 is high. Multiple THENs in CASE WHEN What happens if we have more than one condition we want to apply to our data? The following example shows how to use the CASE WHEN statement's syntax with multiple conditions. This offers a method for classifying data according to different standards: SELECT column_name, CASE WHEN condition1 THEN result1 WHEN condition2 THEN result2 -- Additional WHEN clauses as needed END AS new_column FROM your_table; Let's break down the conditions: WHEN condition1 THEN result1: Defines the first condition to be checked and the corresponding result if this condition is met. WHEN condition2 THEN result2: Specifies a second condition and its associated result. Additional WHEN clauses can be added for further conditions. In a CASE statement with multiple WHEN clauses, the order is significant. The conditions are evaluated sequentially, and the first condition that is met determines the result. Once a condition is satisfied, the corresponding result is returned and the subsequent WHEN clauses are skipped. In this example, if condition1 is true for a particular row, result1 will be returned, and condition2 will not be evaluated for that row. In the context of our previous weather data example, the final query incorporates multiple clauses as well as an ELSE clause to ensure that any records not meeting the specified temperature conditions receive a non-NULL output. This structure ensures that every record in the temperature_category column will be assigned a meaningful label; this contributes to a more comprehensive and informative weather analysis. Here’s the query: SELECT city, temperature, CASE WHEN temperature > 30 THEN 'High' WHEN temperature <= 30 AND temperature > 20 THEN 'Moderate' ELSE 'Low' END AS temperature_category FROM weather_data; As you can see, we now have three weather categories: High for temperatures over 30 degrees, Moderate for temps between 20 and 30, and Low for all other temps – i.e. those under 20 degrees Celsius. Now let’s apply these concepts to some real-world data analysis problems. Examples of Using CASE WHEN in Data Analysis Example 1: Categorizing Data Understanding transaction data is important for evaluating customer purchasing behavior in the context of a retail business. Let’s imagine you have a sales transaction dataset. It could be difficult to quickly obtain insights into the distribution of transactions and to analyze raw transaction amounts. One way to handle this situation is to group transactions according to amount. Here’s a query that lets us group transactions into High, Medium, and Low categories: SELECT TransactionID, Amount, CASE WHEN Amount < 1000 THEN 'Low' WHEN Amount >= 1000 AND Amount < 5000 THEN 'Medium' WHEN Amount >= 5000 THEN 'High' ELSE 'Unknown' END AS TransactionCategory FROM SalesTransactions; Using the CASE expression, we can categorize transactions into meaningful groups, simplifying the analysis and allowing for a quick overview of transaction patterns. A transaction with an amount of 980 will be classified as Low, while a transaction with an amount of 5,200 will be categorized as High. This dynamic categorization simplifies the analysis and provides a quick overview of transaction patterns. Example 2: Handling NULL Values Tracking order dates is essential for an e-commerce platform; it helps us comprehend customer behavior and develop better order fulfillment schedules. Let's say you have a dataset in which the OrderDate column contains a large number of NULL values. When examining order-related data, missing values in the OrderDate column may cause misunderstandings or confusion. So let’s write a query that places orders into two groups: those with an order date (Order Placed) and those with No Order Date. SELECT CustomerID, OrderDate, CASE WHEN OrderDate IS NULL THEN 'No Order Date' ELSE 'Order Placed' END AS OrderStatus FROM Orders; Here the CASE expression distinguishes between records with and without order dates, providing clear labels and enhancing the accuracy of data analysis. A row with a NULL OrderDate will be labeled as No Order Date, indicating that no specific order date is recorded for that transaction. In contrast, a row with a specific OrderDate, such as 2023-10-12, will be labeled as Order Placed, indicating that an order has been placed and has a specific order date. Example 3: Creating Aggregated Columns To optimize production and inventory management, a manufacturing company may seek to determine the level of demand for its products. The different levels of demand for different products may not be immediately apparent from looking at the total quantity sold. Therefore, additional columns can be created in the result set by aggregating data based on specific conditions or criteria. Let’s analyze a sample query: SELECT ProductID, SUM(Quantity) AS TotalQuantity, CASE WHEN SUM(Quantity) > 100 THEN 'High Demand' ELSE 'Normal Demand' END AS DemandLevel FROM OrderDetails GROUP BY ProductID; Here, the CASE expression manages the dynamic categorization of demand levels, allowing the company to identify high-demand products and adjust production plans accordingly. A row with a TotalQuantity of 120 will be labeled as High Demand, signifying a robust demand for that specific product. Conversely, a row with TotalQuantity of 80 will be labeled as Normal Demand, indicating a standard demand level for that particular product. Example 4: Marketing Analysis Imagine you are an online retailer who wishes to plan marketing campaigns and inventory stocking by analyzing the seasonality of customer orders. It is challenging to spot peak seasons or quarterly trends when viewing orders without classifying data by time; this is why we analyze data based on date ranges and offer insights into temporal patterns. SELECT OrderDate, COUNT(*) AS OrderCount, CASE WHEN OrderDate BETWEEN '2023-01-01' AND '2023-03-31' THEN 'Q1' WHEN OrderDate BETWEEN '2023-04-01' AND '2023-06-30' THEN 'Q2' -- Add more quarters as needed END AS Quarter FROM Orders GROUP BY OrderDate; Using the CASE expression, we organize orders into quarters. A row with an OrderDate of 2023-02-15 will be labeled as Q1, signifying that the order falls within the first quarter of the year. An order placed on 2023-05-20 will be labeled as Q2, indicating its placement within the second quarter. This enables you to strategize marketing efforts and adjust inventory levels based on seasonal demand. Example 5: Customer Segmentation A service that charges a subscription seeks to customize its products and marketing tactics according to its users' purchasing habits. When preparing certain reports, an analyst will need to group customers based on various criteria; in the example below, we segment customers into groups by their spending habits: SELECT CustomerID, SUM(PurchaseAmount) AS TotalPurchase, CASE WHEN SUM(PurchaseAmount) < 1000 THEN 'Bronze' WHEN SUM(PurchaseAmount) < 5000 THEN 'Silver' WHEN SUM(PurchaseAmount) < 10000 THEN 'Gold' ELSE 'Platinum' END AS CustomerSegment FROM Purchases GROUP BY CustomerID; The CASE expression segments customers into categories, allowing the service provider to offer personalized promotions, discounts, or services based on the spending level of each customer segment. Example 6: Categorizing Products by Price Range Exercise: Understanding the distribution of product prices at an e-commerce site is essential for making informed pricing and marketing decisions. With so many products available, it can be difficult to obtain insights. You have been asked to simplify the data so your boss can understand the pricing environment and make better decisions. The output of this exercise should contain the ProductID, ProductName, Price, and PriceCategory for each product. The PriceCategory column should categorize the products as follows: Low Price: Assigned to products with prices under 50. Moderate Price: Assigned to products with prices between 50 and 100. High Price: Assigned to products with prices over 100. Solution: SELECT ProductID, ProductName, Price, CASE WHEN Price < 50 THEN 'Low Price' WHEN Price >= 50 AND Price < 100 THEN 'Moderate Price' WHEN Price >= 100 THEN 'High Price' END AS PriceCategory FROM Products; Output: ProductIDProductNamePricePriceCategory 1Ultimate Headphones30Low Price 2Smartwatch XL80Moderate Price 3High-Performance Laptop120High Price Solution explanation: This example effectively uses the CASE WHEN statement to categorize products by price range. The conditions are clear and the resulting labels are meaningful, allowing for easier analysis. Example 7: Analyzing Order Fulfillment Status Exercise: In an online store, ensuring orders are delivered on time is key to customer satisfaction. With a high volume of orders, gaining insights is tough. How can you improve data analysis to track and optimize order fulfillment? The expected output of this exercise should present a clear breakdown of each order, including the OrderID, OrderDate, ShippedDate, and FulfillmentStatus. The FulfillmentStatus column categorizes orders into three groups: Not Shipped: Indicates orders awaiting shipment with a NULL ShippedDate. Shipped Late: Designates orders where the ShippedDate exceeds three days from the OrderDate, signaling a delay. Shipped On Time: Applied to orders shipped within three days of the OrderDate, ensuring timely fulfillment. Solution: SELECT OrderID, OrderDate, ShippedDate, CASE WHEN ShippedDate IS NULL THEN 'Not Shipped' WHEN ShippedDate > DATEADD(day, 3, OrderDate) THEN 'Shipped Late' ELSE 'Shipped On Time' END AS FulfillmentStatus FROM Orders; Output: OrderIDOrderDateShippedDateFulfillmentStatus 12023-01-102023-01-12Shipped On Time 22023-02-15NULLNot Shipped 32023-03-202023-03-25Shipped Late Solution explanation: This example showcases the effective use of CASE WHEN to categorize orders by fulfillment status. The conditions are logically structured, providing clear insights into the fulfillment process. The query aligns with the previously discussed tips by addressing NULL values so that orders with no shipment date will not be recorded as NULL. Example 8: Segmenting Customers by Purchase Frequency Exercise: In online retail, understanding customer buying habits is crucial. But when you have a large customer base, gaining clear insights is a challenge. Take this into account for this exercise, as you’re asked to segment customers based on their purchase frequency. The expected output of this exercise should provide a segmented view of customers based on their purchase frequency. It should include the CustomerID, TotalOrders, and CustomerSegment. The CustomerSegment column categorizes customers into three groups: Infrequent Shopper: Applied to customers with a purchase frequency of one order. Regular Shopper: Applied to customers with a purchase frequency between two and five orders. Frequent Shopper: Applied to customers with a purchase frequency exceeding five orders. Solution: SELECT CustomerID, COUNT(OrderID) AS TotalOrders, CASE WHEN COUNT(OrderID) = 1 THEN 'Infrequent Shopper' WHEN COUNT(OrderID) >= 2 AND COUNT(OrderID) <= 5 THEN 'Regular Shopper' WHEN COUNT(OrderID) > 5 THEN 'Frequent Shopper' END AS CustomerSegment FROM Orders GROUP BY CustomerID; Output: CustomerIDTotalOrdersCustomerSegment 1013Regular Shopper 1021Infrequent Shopper 1038Frequent Shopper Solution explanation: In this example, we include aggregate functions with the CASE WHEN statement to categorize customers by order frequency. By doing so, we can categorize the customers based on the frequency of their spending on the website. You can find more examples of combining aggregate functions with the CASE WHEN statement in our article How to Use CASE WHEN With SUM(). Example 9: Assessing Employee Performance Ratings Exercise: For effective HR decisions, evaluating employee performance is essential. Yet, with so many employees, the analysis process is complex. How can you streamline data analysis to assess and categorize employees based on productivity scores? The expected output of this exercise should streamline the analysis of employee performance, providing a clear evaluation of each employee. It should return the EmployeeID, ProductivityScore, and PerformanceRating. The PerformanceRating column categorizes employees into three groups: Excellent: Applied to employees with a ProductivityScore of 90 or higher, indicating outstanding performance. Good: Applied to employees with a ProductivityScore between 70 and 89, reflecting good performance. Needs Improvement: Applied to employees with a ProductivityScore below 70, indicating areas where improvement is needed. Solution: SELECT EmployeeID, ProductivityScore, CASE WHEN ProductivityScore >= 90 THEN 'Excellent' WHEN ProductivityScore >= 70 AND ProductivityScore < 90 THEN 'Good' WHEN ProductivityScore < 70 THEN 'Needs Improvement' END AS PerformanceRating FROM Employees; Output: EmployeeIDProductivityScorePerformanceRating 20195Excellent 20275Good 20360Needs Improvement Solution explanation: By defining ranges for performance with the CASE WHEN statement, we can categorize employees based on productivity scores. Since the conditions are clear, the resulting performance ratings provide actionable insights. Example 10: Grouping Products by Release Year Exercise: Managing product inventory requires understanding product life cycles. In a large catalog, identifying patterns can be tricky. How can you simplify this to effectively group products based on their release years? The expected output of this exercise should provide a clear grouping of products based on their release years. It should include the ProductID, ProductName, ReleaseYear, and ReleaseCategory. The ReleaseCategory column categorizes products into three groups: New Release: Applied to products released in the year 2023, indicating the latest additions to the catalog. Recent Release: Applied to products released between 2018 and 2022, signifying recently introduced items. Old Release: Applied to products released before the year 2018, identifying older items in the catalog. Solution: SELECT ProductID, ProductName, EXTRACT(YEAR FROM ReleaseDate) AS ReleaseYear, CASE WHEN EXTRACT(YEAR FROM ReleaseDate) = 2023 THEN 'New Release' WHEN EXTRACT(YEAR FROM ReleaseDate) >= 2018 AND EXTRACT(YEAR FROM ReleaseDate) < 2023 THEN 'Recent Release' WHEN EXTRACT(YEAR FROM ReleaseDate) < 2018 THEN 'Old Release' END AS ReleaseCategory FROM Products; Output: ProductIDProductNameReleaseYearReleaseCategory 1Chic Denim Jacket2022New Release 2Lunar Maxi Dress2019Recent Release 3Flare Athletic Leggings2015Old Release Solution explanation: This final example showcases the versatility of CASE WHEN. Here, we’ve used it with DATETIME functions to categorize products based on their release years. 8 Tips for Using CASE WHEN in Data Analysis Here are 10 tips to help you make the most of the CASE WHEN statement: Understand Your Data: Before implementing CASE WHEN, make sure you clearly understand the data you're working with. Identify the specific conditions or criteria that will add value to your analysis. Start Simple: If you're new to using CASE WHEN, begin with simple conditions. You can gradually add complexity as needed. This ensures your code remains readable and easier to troubleshoot. Use Meaningful Labels: When categorizing data, choose labels that are clear and meaningful. This enhances the interpretability of your results and makes the analysis more accessible to others. Consider Data Types: Make sure the data types in your conditions correspond to the data you are examining. Mismatched data types can lead to unexpected behaviors and errors. Combine Conditions Logically: Use AND and OR operators to express compound conditions. Parentheses can help clarify the order of evaluation. Address NULL Values: Consider using IS NULL, IS NOT NULL, or ELSE conditions to explicitly handle any columns in your analysis that may have NULL Utilize Aggregated Columns: CASE WHEN is particularly useful when creating aggregated columns. For example, you can categorize groups based on aggregated values (e.g. totals or averages) to gain insights into patterns or trends. Test and Validate: Test your CASE WHEN statements on a smaller subset of your data before applying them to the entire dataset. This helps catch any unexpected issues and ensures the logic works as intended. Beyond the Basics with CASE WHEN The CASE WHEN statement in SQL is a pivotal tool. It provides a structured and flexible approach to conditional logic – one that mirrors everyday decision-making. Plus, its intuitive nature makes it an accessible yet powerful feature of SQL. If you need to create custom classifications, handle NULL values, and categorize data dynamically, the CASE WHEN statement is a true asset. Whether you are a beginner navigating SQL or an experienced analyst, mastering the CASE WHEN statement is a key step toward unlocking deeper layers of data analysis. Dive further into its complexities with our articles How to Use CASE in ORDER BY in SQL and How to Use CASE WHEN in GROUP BY. And for an interactive learning experience, check out our Creating Basic SQL Reports course. Happy learning! Tags: CASE WHEN learn sql