12th May 2020 10 minutes read SQL in Google Sheets? Yes, We Can! Adrian Więch sql learn sql Table of Contents Sampling Google Sheets Data Understanding the Query Function First Query in Google Sheets Working with Columns and Conditions Adding More Conditions and Ordering Rows Counting Grouped Rows Finding Averages SQL(ish) Queries in Google Sheets? Absolutely! Did you know that you can write SQL-like queries in Google spreadsheets? This feature is incredibly powerful, versatile, and can replace most of the functionalities offered by Excel’s pivot tables. Read our article to find out how you can easily use the QUERY function to efficiently work with your Google Sheets data. Relational databases share many features with spreadsheets. When we explain what databases are, we often say that they are a bit like spreadsheets except they give us more control over larger amounts of data. To communicate with databases, we typically use SQL, which is an effective way of retrieving data from a database. (If you’ve never heard of SQL, start by reading our short article The Complete Beginner’s Guide to SQL Fundamentals.) SQL is a universal query language. It’s been around since the 1970s and is still extremely popular. Knowing SQL boosts your chances of a stable career in IT, as this knowledge is sought after by many employers. If you’d like to discover the benefits of using SQL, check out 5 Great Benefits of Learning SQL. Now, let’s get back to spreadsheets. Spreadsheets are important because data isn’t always stored in a database. Many companies use CSV files or spreadsheets to manage their info. These can be opened in applications such as Google Sheets or Microsoft Excel, but these files don’t work with databases as such. Does that mean you can’t use SQL when you work with data in spreadsheets? Not at all! Google Sheets provides a QUERY option that allows you to write SQL-like instructions and retrieve data in a way that’s similar to SQL. This way, you can use the power of SQL even if you don’t have a database to work with! Needless to say, Google Sheets is available for free. The program can open a wide range of file formats, including CSV and Excel files. Since SQL is a well-designed, standardized data querying language, thinking in terms of SQL when you work in Google Sheets can help your analyses. For more on this, see the post How to Start Thinking in SQL by NoSide technical advisor Jeffrey Edison. In this article, we’ll show you a few basic examples of working with SQL in Google Sheets. Some SQL knowledge is recommended, but it’s not required to understand the concepts we’ll present. You should be able to follow the examples even if you’ve never seen an SQL statement before. However, if you’d like to learn SQL, take a look at LearnSQL.com’s SQL Fundamentals course, which assumes no prior experience and will turn you into an intermediate SQL user. Sampling Google Sheets Data In this article, we’ll work with a simple Google Sheets spreadsheet that contains some basic information about selected hotels in Indonesia. We’ve used imaginary hotels, so don’t try to find them. You can view the spreadsheet and all the queries we present in this article in our "Hotel Query" Google Sheet. There is a separate sheet for each query discussed in this article. The Hotels table contains the following columns: Id - The unique identifier for that hotel. Name - The hotel’s name. Stars - The number of stars awarded to the hotel: 3, 4 or 5. Rating - The average rating of the hotel by its guests, on a scale from 0 to 10. TwinRoomPrice - The base price for a twin room for one night. City - The city where the hotel is located: Bandung, Denpasar, or Surabaya. Even though the hotels are imaginary, the cities are real. Google Bandung, Denpasar, and Surabaya —they are really good Indonesian travel destinations! All right. Now that we know the data, let’s learn how to use SQL in Sheets. Understanding the Query Function To write SQL-like instructions in Google Sheets, we’ll only need a single function named QUERY. The format of the function isn’t particularly difficult: =QUERY(data, query, [headers]) As you can see, the QUERY function takes three parameters, of which only two are required. Let’s look at those parameters: data - The range of cells containing the data (in this case, the range is the entire Hotels table). query - The SQL-like query to be performed. headers - The number of rows that contain header information. These are usually at the top of the data, i.e. the column names. This argument is optional. If you don’t provide it, Google Sheets will try to figure it out. The application is quite good at this, so the parameter is typically not needed. We’ll omit it in this article. Don’t worry if these parameters look a bit vague to you—we’ll provide a few easy-to-follow examples in the next paragraphs. First Query in Google Sheets Let’s start with the easiest possible example. We’ll simply select all the data from our Hotels table. We’ll first pick a free cell to the right of the Hotels table in Google Sheets and start writing our query in the following way. (Don’t hit Enter yet.) =QUERY(A1:F23, The first parameter, data, is the cell range. In this case, we provided A1:F23, which corresponds to the range of the Hotels table, as shown in the picture below. Now, after the comma, we should provide our SQL-like query within quotes. The official name for the query language used in Google Sheets is Google Visualization API Query Language. To select all the data, we need the following code: =QUERY(A1:F23,"SELECT *") Note that you may need to replace the comma that separates the parameters with a semicolon or another character, based on your Google Sheets settings. In Google Sheets, SELECT * means “select everything”. Note how similar this instruction is to its SQL equivalent, SELECT * FROM Hotels. In Google Sheets, we omit the FROM clause because the data range is specified in the first argument. If you press Enter now, Google Sheets will turn the query into a result set, as shown in the picture below: The table you can see is essentially identical to the original table. This is the expected behavior. Our query was SELECT *, which means “show everything from the original data”. Great! We’ve just written our first query in Google Sheets! Working with Columns and Conditions Now let’s take a look at some more advanced examples. Suppose we now want to show only three columns: Name, Rating, and TwinRoomPrice. What’s more, we only want to see three-star hotels. In this case, we’ll have to refer to individual columns. The query will look like this: =QUERY(A1:F23, "SELECT B, D, E WHERE C=3") As you can see, instead of using column names from our table (such as Name, Stars, or Rating), we use the column letters provided by Google Sheets. The column Name is located in column B in the spreadsheet, so we used SELECT B to show it in the resulting table. By the same token, we used D to select the Rating column and E to select the TwinRoomPrice column. The columns are also separated with commas, but there is no comma after the last column we want to select. After selecting the columns, we add WHERE C=3. C refers to the Stars column, so the instruction basically means select three-star hotels only (i.e. where the C column equals 3). The screenshot below shows how the query refers to the individual columns: When we hit Enter, Google Sheets produces a neat table that contains three columns with all the info for three-star hotels: Name, Rating, and TwinRoomPrice. Note that the equivalent SQL query would be very similar: SELECT Name, Rating, TwinRoomPrice FROM Hotels WHERE Stars=3; Adding More Conditions and Ordering Rows In our next example, we want to find all hotels in Bandung which have a rating above 7.0. We also want to see the resulting hotels sorted from least to most expensive. To satisfy these requirements, we’ll have to add two new parts. We’ll need to join multiple conditions with the keyword AND, and we’ll need to sort rows with a new clause named ORDER BY. Here’s the query: =QUERY(A1:F23, "SELECT * WHERE D > 7.0 AND F='Bandung' ORDER BY E") Look at the WHERE clause of the query above. Column D contains the hotel ratings. In this case, we want the rating to be greater than 7.0. We also want to introduce another condition: the hotels must be located in Bandung. To introduce more than one condition, we use the AND keyword. Column F contains the hotels’ cities, so we wrote F='Bandung' to get hotels from Bandung. Note that text values need to be surrounded by single quotes (unlike numbers). If we forget the quotes, we’ll get an error. Finally, we added the following piece of code: ORDER BY E. The ORDER BY clause is used to sort the resulting rows. In this case, we want to sort the rows by TwinRoomPrice, which is in column E. By default, the rows will be sorted in ascending order. When we hit the Enter button, we can see the resulting table. Indeed, we can only see hotels from Bandung with a rating greater than 7.0. Notice that all rows are sorted by price. Counting Grouped Rows Finally, we’ll move on to two examples that you can use to produce results similar to Excel pivot tables. First, we want to count the number of hotels in each city. To do that, we’ll use the following query: =QUERY(A1:F23,"SELECT F, COUNT(A) GROUP BY F ") The query contains two new elements: COUNT(A) and GROUP BY F. Let’s start with the latter. GROUP BY F means that we want to divide all rows into groups based on the values in Column F (which is City). We perform the grouping (also known as “aggregation”) to be able to show each city (SELECT F) alongside the number of hotels in that city (COUNT(A)). COUNT(A) counts the number of rows in the given group. If no GROUP BY clause is provided, it will count all the rows. We put column A (Id) inside the parentheses so that all the unique hotels will be counted. We chose Id out of convention; you could use another column, such as B (the hotel name), but ID columns are usually preferred because they always contain unique values. We discuss aggregate functions (such as COUNT()) and the GROUP BY clause in our SQL Fundamentals course. Check it out if you want to understand how these SQL features work. When we hit Enter, we can see the resulting table with the number of hotels in each city: The equivalent SQL query would look like this: SELECT City, COUNT(Id) FROM Hotels GROUP BY City; Finding Averages In the last example, we’ll calculate the average rating and average price for hotels in each category (three-star, four-star, and five-star). We’ll need the following query: =QUERY(A1:F23, "SELECT C, AVG(D), AVG(E) GROUP BY C") We used GROUP BY C to group all hotels by the Stars column. We also introduced AVG(D) and AVG(E), which are used to calculate the average values in columns D (Rating) and E (TwinRoomPrice), respectively. When we press Enter, we should see this result table: The equivalent SQL instruction would look like this: SELECT Stars, AVG(Rating), AVG(TwinRoomPrice) FROM Hotels GROUP BY Stars; SQL(ish) Queries in Google Sheets? Absolutely! In this article, I explained how to use a query language similar to SQL in Google Sheets. You should now have a solid understanding of the QUERY function in Sheets. You can also see how similar Google’s Query Language is to standard SQL. If you enjoyed learning this and are curious about SQL, check LearnSQL.com’s SQL Fundamentals Course. There’s a free trial to help you get started! There is certainly more to Google’s query language than we were able to show in this introductory article. If you’re interested in using its other features, take a look at the official documentation for Google Visualization API Query Language. Tags: sql learn sql