29th Jan 2021 12 minutes read Data Types in SQL Ignacio L. Bisso sql learn sql Data Engineering online practice Table of Contents SQL Data Types Are Everywhere Text/String Data Types In SQL Numeric SQL Data Types Converting Data Types Date and Time SQL Data Types SQL’s Other Data Types Now You Know SQL Data Types ... Read this article and learn the ABC of SQL data types. Data types are used to define the type of data stored in a database. However, there are related concepts that a good SQL developer should know: data type conversion, what operations are possible between different data types, date arithmetic, etc. SQL also provides functions that manipulate values of different data types (i.e. functions for string data, date and time data, etc.). But before you understand how to work with these functions, build database tables, and do other important tasks, you need to understand SQL’s data types. So let’s take a few minutes and look at what’s available. SQL Data Types Are Everywhere Nowadays, data is everywhere. Smartphones, websites, and Internet of Things devices are generating tons of data every second. The things that data can do for you are incredible: power your business, improve your life, and help people in so many ways. SQL is the language used to manipulate data stored in databases; if you want to extract value from your data, you need SQL to do it. All the data we just mentioned can be organized into different groups, e.g. picture or video data, geographic data, text data, numeric data, and many other types. In this article, we will explain some of SQL’s many different data types. A SQL data type defines what kind of values can be stored in a given column. In a SQL database, every table column has a data type. Before starting with technical details, I’d like to suggest the course Data Types in SQL, where you can learn how to use data types in different SQL databases. Text/String Data Types In SQL Text data is any group of alphanumeric characters (which we call ‘strings’). Most SQL databases usually have more than one textual data type. The main difference is related to the way the strings are stored. There are fixed-length and variable-length text data types, as we can see below: Type of StorageData TypeHow it’s stored FIXED LENGTHCHAR(N)It occupies N bytes, no matter how long the text string is. N is the length it will store. VARIABLE LENGTHVARCHAR(N)It occupies only the length of the text string. N is the maximum length allowed. The difference between fixed- and variable-length data types is simple. A fixed-length CHAR(8) column will always require the same amount of storage space, regardless of the string length – e.g. “a” will take the same space as “abcdefgh”. A variable-length VARCHAR(8) column will use less space to store “a” and more to store “abcdefgh”. SQL has a lot of basic operators and functions that can manipulate text values. The most important operator is concatenation, which combines two or more strings into one. In some databases, this operator is represented by the symbol ||; other databases use a function called CONCAT(). The most frequently used text functions are: SUBSTRING(), TRIM(), LEFT(), RIGHT(), LOWER(), UPPER() and LENGTH(). All these functions return a text value except for LENGTH(), which returns the length (in characters) of the string as an integer. The table below shows the results of these functions: Function Action Sample invocation Returned value Returned data type substring Returns part of a string substring(‘Hello World’,3,6) ‘llo Wo’ text trim Removes blank spaces from the beginning and end of a string. trim(‘ Hello ‘) ‘Hello’ text left Returns a specified number of characters, starting from the beginning of the string. left(‘Hello World’, 4) ‘Hell’ text right Returns a specified number of characters, starting from the end of the string. right(‘Hello World’, 3) ‘rld’ text lower Transforms all text into lower case. lower(‘Hello World’) ‘hello world’ text upper Transforms all text into capital letters. upper(‘Hello World’) ‘HELLO WORLD’ text length Returns the number of characters in the string. length(‘Hello World’) 11 integer Some of these functions are not part of the SQL standard, so they won’t work in all databases. This is due to something called SQL dialects: each database has its own syntax extensions that may be outside the SQL standard. For example, in PostgreSQL the symbol for concatenation is ||, while in MS SQL Server it is +. PostgreSQL and MS SQL Server are two different SQL dialects and ‘speak’ a slightly different version of SQL. If you want to learn more about text data types like VARCHAR or CHAR, I suggest you read the article How to Create a Table in SQL. Numeric SQL Data Types SQL has several numeric data types. To explain them, we will have to divide numbers into two groups: integer numbers and non-integer numbers (i.e. numbers with a decimal point). Each group contains several SQL data types. For integer values, we have the SMALLINT, INTEGER, and BIGINT data types. The difference between them is the range of the values they can represent and the memory size consumed by each value. We’ll go into details later in this article. For numbers with decimals there are two main groups: floating point numbers and the NUMERIC (also known as DECIMAL) data type. Floating point data types perform intensive math calculations very quickly because they are stored in the same format used by the processor. Like integers, floating point numbers have different data types, including FLOAT and DOUBLE PRECISION. NUMERIC (or DECIMAL) data types are more oriented to represent numbers with a fixed number of decimals, like money values that will always have two digits after the decimal. Again, we’ll go into the details later. Converting Data Types In some SQL statements, it can be necessary to change the data type of a column or a constant. For example, if you need to compare an INTEGER value with a NUMERIC value, you need to convert them to the same data type before making the comparison. In some cases, the database detects this need and executes the conversion automatically. In other cases, the developer needs to explicitly convert one or both of the values. Data type conversion is done with the SQL standard CAST() function. Suppose we need to compare the column salary (which is NUMERIC) against an INTEGER value. Data type conversion is done with the SQL standard CAST() function. Suppose we need to compare the column salary (which is NUMERIC) against an INTEGER value. CREATE TABLE example_cast( employee_name VARCHAR(50), salary NUMERIC(7,2) ); INSERT INTO example_cast VALUES ( ‘John Doe’, 42453.00); INSERT INTO example_cast VALUES ( ‘Mary Smith’, 54451.00); SELECT employee_name FROM example_cast WHERE CAST(salary AS INTEGER) > 50000; In the above code, we converted the salary column from NUMERIC(7,2) to an INTEGER data type with CAST(). We can now go ahead with the comparison. The article Understanding Numerical Data Types offers details about how SQL stores and manages numeric values. I recommend you read it for more information. Date and Time SQL Data Types Date and time data types store dates, times, or both. In SQL, the most common date and time data types are DATE, TIMESTAMP, and INTERVAL. There are other data types as well, which we’ll deal with later in this article. The DATE data type is used to store dates, e.g. ‘Dec 31 2020’. It’s important to differentiate the way the database stores a date value and the format used to display that date value. The database usually stores dates as an integer representing the number of days that have elapsed since a specific date. However, dates can be displayed in man different formats: 2020-12-31, 12/31/2020, Dec 31 2020, and so on. The TIMESTAMP data type is used to represent one point in the timeline. It can have different levels of precision. For example, we can represent values to the minute (2020-12-31 08:12), second (2020-12-31 08:12:35) or microsecond (2020-12-31 08:12:35.234). And as with DATE, SQL provides functions to format the way we show the TIMESTAMP value. Finally, the INTERVAL data type is used to represent the duration of a period of time with different levels of precision. We can store INTERVAL values like ‘20 hours’ , ‘10 days’ or ‘123,400 seconds’, among many others. There is an interesting topic called date arithmetic, which allows us to construct arithmetic operations or expressions involving the elements of different date/time data types. Some of these are shown in the next table, which contains the most frequent types of expressions involving time and date data: Arithmetic operationResulting data type TIMESTAMP + INTERVALTIMESTAMP TIMESTAMP - TIMESTAMPINTERVAL INTERVAL + INTERVALINTERVAL INTERVAL * INTEGERINTERVAL TIMESTAMP - INTERVALTIMESTAMP CAST(DATE, TIMESTAMP) + INTERVALTIMESTAMP CAST(DATE, TIMESTAMP) - TIMESTAMPINTERVAL Let’s suppose we sell fruit and we have a table called sales where we store one record for each sale we make. The schema of the table is: CustomerProductDate of SaleTons SoldTon PriceDelivery DateStarting Freeze JohnApple2020-02-236054.562020-02-282020-01-08 10:00 MaryPeach2020-03-023561.502020-03-122020-01-15 09:30 JohnBanana2020-04-131558.502020-04-222020-01-07 08:23 MaryPeach2020-05-2110061.002020-05-282020-01-15 09:30 SteveApple2020-05-219054.562020-05-302020-01-08 10:00 Due to food regulations, we have to inform our customers of the date and time when the products entered our cold storage; this is why we store the date and time in the column starting_freeze. As an example, let’s suppose we need to inform our customers of how long their products have been in storage. We can write this query: SELECT customer, product, delivery_date, CAST(delivery_date AS TIMESTAMP) - starting_freeze AS freeze_time FROM sales The previous query is simple; the only point we need to explain is how we calculate the freeze_time value. The first thing we do is convert delivery_date from a DATE to a TIMESTAMP. (SQL does this by adding the time 00:00:00 to the date value.) Next, we calculate the difference between delivery_date and starting_freeze. As both are TIMESTAMP values, the result will be an INTERVAL value representing the elapsed time between starting_freeze and delivery_date. This is exactly what we are looking for, as you can see in the following image: CustomerProductDelivery DateFreeze Time JohnApple2020-02-2850 days 14:00:00 MaryPeach2020-03-1256 days 14:30:00 JohnBanana2020-04-22105 days 15:37:00 MaryPeach2020-05-28133 days 14:30:00 SteveApple2020-05-30142 days 14:00:00 Now, we also want to inform our customers about any product consumption deadlines. Let’s say that regulations state that bananas have to be consumed no later than 180 days after their freeze date and other fruits must be consumed within 240 days. We can calculate the consume_deadline date using the following query: SELECT customer, product , delivery_date, CAST(delivery_date, TIMESTAMP)-starting_freeze AS freeze_time_at_delivery, starting_freeze + case when product = ‘banana’ then INTERVAL ‘1 day’ * 180 when product <> ‘banana’ then INTERVAL ‘1 day’ * 240 end AS consume_deadline FROM sales Notice how we calculated the consume_deadline in the previous query by using a CASE statement. You can see how we obtained an INTERVAL of 180 days by using the expression INTERVAL ‘1 day‘ * 180 . The results of this query are shown below: Customer Product Delivery Date Freeze Time at Delivery Consume Deadline John Apple 2020-02-28 50 days 14:00:00 2020-07-06 10:00:00 Mary Peach 2020-03-12 56 days 14:30:00 2020-09-11 09:30:00 John Banana 2020-04-22 105 days 15:37:00 2020-09-03 08:23:00 Mary Peach 2020-05-28 133 days 14:30:00 2020-09-11 09:30:00 Steve Apple 2020-05-30 142 days 14:00:00 2020-07-06 10:00:00 For more information on date-related queries, see How to Group Data by Week in SQL Server and How to Get the First Day of the Week in SQL Server. SQL’s Other Data Types There are many other data types available in SQL. Some of them are very simple, like the BOOLEAN data type; it has just two possible values: TRUE or FALSE. Other data types are more complex, like the XML or JSON data types, which allows us to store any valid XML document or JSON document in a table column. SQL also allows us to manipulate XML and JSON values, extracting, adding, or even filtering by their components. The BLOB and CLOB data types can also store very large amounts of data. CLOB is used to store Character Large Objects, while BLOB is used for Binary Large Objects. The following table includes the most important SQL data types, their syntax, examples of the values they store: Data TypeExample definitionExample valueAdditional Info CHARConsume Deadline2020-02-2850 days 14:00:00 name CHAR(30)‘John’Stores any string text as a fixed length.56 days 14:30:00 VARCHARstreet VARCHAR(90)‘5th Avenue’Store string texts of variable length. INTEGERage INTEGER1420Represents an integer (i.e. no decimals). The minimum and maximum values depend on your database. SMALLINTfloor_num SMALLINT15Same as INTEGER, except that it might hold a smaller range of values, depending on the database. BIGINTpopulation BIGINT298832888Same as INTEGER, except that it might hold a larger range of values, depending on the database. NUMERICbonus NUMERIC(7,2)67400.90Holds a NUMERIC value with a fixed number of decimal digits. The size argument has two parts. Precision comes first and specifies the total number of digits (e.g. 7); scale specifies the number of digits that come after the decimal (e.g. 2). Both arguments should be separated by a comma. FLOATdistance FLOAT49.12Stores numbers with a variable number of decimal digits. Are very fast for mathematical computations. DOUBLE PRECISIONsurface DOUBLE PRECISION409.123Like FLOAT, but with greater precision and storage space. Your database defines the precision. DATEborn_day DATE1983-02-03Represents a date. TIMEarrival TIME11:23:45Represents a time of day (without time zone info). TIMESTAMPstarting TIMESTAMP1983-03-03 11:23:45Represents a combination of DATE and TIME values, separated by a space. INTERVALtravel_time INTERVAL1 hour 43 minutesIt stores the quantity of time between two datetime values – e.g. between 10:00 and 12:30 is an interval of 02:30 (2 hours and 30 minutes). BOOLEANis_shipped BOOLEANtrueStores either a TRUE or a FALSE. XMLxml_doc XML Stores valid XML documents. JSONjson_doc JSON‘{id:”123” }’Stores JSON valid documents. CLOBcontract_text CLOBtext of an entire contractStores Character Large Objects, i.e. large amounts of textual data. BLOBphoto_id BLOBN/AStores Binary Large Objects, i.e. very large objects like images, PDF files, etc. Now You Know SQL Data Types ... In this article, we covered data types from the SQL standard point of view. However, all database engines (like PostgreSQL, Oracle, and MS SQL Server) have their own SQL dialects, which means they’ll have slightly different syntax for their data types. For example, the CLOB data type becomes the TEXT data type in PostgreSQL. Even so, this overview should give you a good idea of what’s possible with SQL data types. For specific details, you can always consult your preferred database’s documentation. If you want to really get comfortable with data types, I suggest our course Creating Database Structure. It’ll show you how to create and manage tables, views, and indexes with SQL. It also explains a lot about working with data types. Tags: sql learn sql Data Engineering online practice