Whether you're storing text, numbers, dates, or more complex data like arrays and JSON, YugabyteDB provides robust options to define the structure of your database tables. Choosing the correct data type ensures that data is stored optimally, queries run efficiently, and constraints like precision, length, and integrity are properly maintained.
This guide describes the data types available in YugabyteDB and shows how to use them to handle different kinds of data efficiently.
Setup
The examples run on any YugabyteDB universe.
Set up a local cluster
If a local universe is currently running, first destroy it.
Start a local one-node universe with a replication factor (RF) of 1 by first creating a single node, as follows:
./bin/yugabyted start \
    --advertise_address=127.0.0.1 \
    --base_dir=${HOME}/var/node1 \
    --cloud_location=aws.us-east-2.us-east-2a
After starting the yugabyted processes on all the nodes, configure the data placement constraint of the universe, as follows:
./bin/yugabyted configure data_placement --base_dir=${HOME}/var/node1 --fault_tolerance=zone
This command can be executed on any node where you already started YugabyteDB.
To check the status of the running universe, run the following command:
./bin/yugabyted status --base_dir=${HOME}/var/node1
To set up a universe, refer to Set up a YugabyteDB Anywhere universe.
To set up a cluster, refer to Set up a YugabyteDB Aeon cluster.
Strings
The following character types are supported:
- varchar(n): variable-length string
- char(n): fixed-length, blank padded
- text, varchar: variable unlimited length
To test YugabyteDB support for character types, create a table that has columns with the following types specified:
CREATE TABLE char_types (
id serial PRIMARY KEY,
a CHAR (4),
b VARCHAR (16),
c TEXT
);
Insert the following row into the table:
INSERT INTO char_types (a, b, c) VALUES (
'foo', 'bar', 'Data for the text column'
);
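To see how the three character types differ in practice, the following sketch queries the row just inserted. It assumes the char_types table and row from the preceding statements, and that YSQL follows standard PostgreSQL behavior for blank padding and length checks; the column aliases are illustrative only.

```sql
-- char(4) pads 'foo' with one trailing blank. char_length() ignores the
-- padding, while octet_length() counts the stored bytes including it.
SELECT a,
       char_length(a)  AS a_chars,
       octet_length(a) AS a_bytes,
       b,
       char_length(b)  AS b_chars
FROM char_types;

-- An explicit cast to char(n) or varchar(n) silently truncates the value.
SELECT 'too long for four'::char(4) AS truncated;

-- Without a cast, a value that exceeds the declared length is rejected
-- with a "value too long for type character varying(16)" error.
INSERT INTO char_types (b)
VALUES ('this string is longer than sixteen characters');
```

The last statement is expected to fail; it is included only to show that the length limit is enforced on insert.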
JSON
YugabyteDB provides two types for storing JSON (JavaScript Object Notation) data: JSON and JSONB. Both allow you to store JSON-formatted data, but they have distinct characteristics.
The JSON type preserves white space, key order, and duplicate keys, and is slower to process as it needs to be re-parsed for each operation. The newer JSONB type stores data in a decomposed binary format, eliminates white space, reorders keys, and removes duplicate keys, and is faster to process but slightly slower to input due to conversion overhead.
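The difference between the two types is easiest to see by casting the same literal to each. The following is a minimal sketch, assuming standard PostgreSQL behavior; the literal and the column aliases are made up for illustration.

```sql
-- The same text cast to both types. JSON keeps the input as written,
-- including the duplicate "a" key. JSONB keeps only the last value for
-- "a" and stores the keys in its own order.
SELECT '{"b": 1, "a": 2, "a": 3}'::json  AS as_json,
       '{"b": 1, "a": 2, "a": 3}'::jsonb AS as_jsonb;

-- as_json:  {"b": 1, "a": 2, "a": 3}
-- as_jsonb: {"a": 3, "b": 1}
```

If an application depends on key order or on duplicate keys being kept, JSON is the type that preserves them; otherwise JSONB is usually the more convenient choice for querying.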
Numeric types
The following numeric types are supported:
- SMALLINT: a 2-byte signed integer that has a range from -32,768 to 32,767.
- INT: a 4-byte signed integer that has a range from -2,147,483,648 to 2,147,483,647.
- float(n): a floating-point number whose precision is at least n, up to a maximum of 8 bytes.
- real: a 4-byte floating-point number.
- numeric or numeric(p,s): an exact number with p total digits, s of which follow the decimal point.
The following example creates a table with integer-type columns and inserts rows into it:
CREATE TABLE albums (
album_id SERIAL PRIMARY KEY,
title VARCHAR (255) NOT NULL,
play_time SMALLINT NOT NULL,
library_record INT NOT NULL
);
INSERT INTO albums
VALUES (DEFAULT, 'Funhouse', 3600, 2146483645),
       (DEFAULT, 'Darkside of the Moon', 4200, 214648348);
Similarly, the following example shows how to create a table with floating-point typed columns and how to insert a row into that table:
CREATE TABLE floating_point_test (
floatn_test FLOAT8 NOT NULL,
real_test real NOT NULL,
numeric_test NUMERIC (3, 2)
);
INSERT INTO floating_point_test (floatn_test, real_test, numeric_test)
VALUES
(9223372036854775807, 2147483647, 5.36),
(9223372036854775800, 2147483640, 9.99);
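The practical difference between the floating-point types and numeric is that numeric arithmetic is exact. The following sketch, which assumes standard PostgreSQL semantics, compares the two and shows how a numeric(3, 2) column treats values that do not fit its precision and scale.

```sql
-- Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly,
-- so the float8 comparison is false while the numeric one is true.
SELECT 0.1::float8  + 0.2::float8  = 0.3::float8  AS float_equal,    -- f
       0.1::numeric + 0.2::numeric = 0.3::numeric AS numeric_equal;  -- t

-- Extra fractional digits are rounded to the declared scale.
SELECT 5.364::numeric(3, 2) AS rounded;  -- 5.36

-- 9.999 rounds to 10.00, which needs four digits of precision,
-- so this cast fails with a "numeric field overflow" error.
SELECT 9.999::numeric(3, 2);
```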
SERIAL pseudotype
In YugabyteDB, just like in PostgreSQL, a sequence is a special kind of database object that generates a sequence of integers. A sequence is often used as the primary key column in a table.
By assigning the SERIAL pseudotype to a column, the following occurs in the background:
- The database creates a sequence object and sets the next value generated by the sequence as the default value for the column.
- The database adds a NOT NULL constraint to that column because a sequence always generates an integer, which is a non-null value.
- The SERIAL column is assigned as the owner of the sequence. This results in the sequence object being deleted when the SERIAL column or table is dropped.
YSQL supports the following pseudotypes:
- SMALLSERIAL: 2 bytes (1 to 32,767)
- SERIAL: 4 bytes (1 to 2,147,483,647)
- BIGSERIAL: 8 bytes (1 to 9,223,372,036,854,775,807)
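The background steps described in this section can be written out by hand. The following sketch shows statements that are roughly equivalent to declaring a column as SERIAL, following the expansion documented for PostgreSQL. The table name tracks and the sequence name tracks_id_seq are hypothetical and used only for illustration.

```sql
-- Shorthand form:
--   CREATE TABLE tracks (id SERIAL, title text);

-- Approximately equivalent long form:
CREATE SEQUENCE tracks_id_seq;

CREATE TABLE tracks (
    -- The default comes from the sequence, and the column cannot be null.
    id    integer NOT NULL DEFAULT nextval('tracks_id_seq'),
    title text
);

-- Tie the sequence to the column so that dropping the column or the
-- table also drops the sequence.
ALTER SEQUENCE tracks_id_seq OWNED BY tracks.id;

-- Rows inserted without an id draw their value from the sequence.
INSERT INTO tracks (title) VALUES ('Down on the Street'), ('Loose');
```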
Date and time
Temporal data types allow us to store date and time data. The following date and time types are supported in PostgreSQL and YugabyteDB:
- DATE: stores dates only
- TIME: stores time-of-day values
- TIMESTAMP: stores both date and time values, without a time zone
- TIMESTAMPTZ: stores a time zone-aware timestamp
- INTERVAL: stores intervals of time
The following example creates a table with the temporal types:
CREATE TABLE temporal_types (
date_type DATE,
time_type TIME,
timestamp_type TIMESTAMP,
timestampz_type TIMESTAMPTZ,
interval_type INTERVAL
);
The following example inserts a row into the table:
INSERT INTO temporal_types (
date_type, time_type, timestamp_type, timestampz_type, interval_type)
VALUES
('2000-06-28', '06:23:00', '2016-06-22 19:10:25-07',
'2016-06-22 19:10:25-07', '1 year'),
('2010-06-28', '12:32:12','2016-06-22 19:10:25-07',
'2016-06-22 19:10:25-07', '10 years 3 months 5 days');
The following shows the inserted data:
yugabyte=# SELECT * FROM temporal_types;
date_type | time_type | timestamp_type | timestampz_type | interval_type
------------+-----------+---------------------+------------------------+------------------------
2010-06-28 | 12:32:12 | 2016-06-22 19:10:25 | 2016-06-22 19:10:25-07 | 10 years 3 mons 5 days
2000-06-28 | 06:23:00 | 2016-06-22 19:10:25 | 2016-06-22 19:10:25-07 | 1 year
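Note that the timestamp_type column dropped the -07 offset from the input, while the timestampz_type column kept track of it. The following sketch, assuming standard PostgreSQL behavior, makes the difference visible by switching the session time zone before casting the same literal to both types.

```sql
-- Display timestamptz values in UTC for this session.
SET TIME ZONE 'UTC';

-- TIMESTAMP ignores the offset in the input. TIMESTAMPTZ uses it to
-- identify the instant, then displays that instant in the session zone.
SELECT '2016-06-22 19:10:25-07'::timestamp   AS ts,
       '2016-06-22 19:10:25-07'::timestamptz AS tstz;

-- ts:   2016-06-22 19:10:25
-- tstz: 2016-06-23 02:10:25+00

-- Intervals can be added to dates; the result is a timestamp.
SELECT '2000-06-28'::date + '1 year'::interval AS one_year_later;
-- one_year_later: 2001-06-28 00:00:00
```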
Arrays
YSQL supports arrays to hold data of variable length. The type of the data stored in an array can be a built-in type, a user-defined type, or an enumerated type. The following examples are adapted from Arrays:
Create an array
CREATE TABLE rock_band (
name text,
members text[]
);
Insert rows
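The following shows how to insert a row into the table. Note that the array literal as a whole is enclosed in single quotes, and the elements inside it are double-quoted. Double quotes are required when an element contains characters such as commas or curly braces.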
INSERT INTO rock_band VALUES (
'Led Zeppelin', '{"Page", "Plant", "Jones", "Bonham"}'
);
The following is an alternate syntax using the array constructor. Note that the values must be single-quoted.
INSERT INTO rock_band VALUES (
'Pink Floyd', ARRAY['Barrett', 'Gilmour']
);
Access arrays
SELECT * FROM rock_band;
Expect the following output:
name | members
--------------+---------------------------
Pink Floyd | {Barrett,Gilmour}
Led Zeppelin | {Page,Plant,Jones,Bonham}
You can access array values using subscripts, as follows:
SELECT name FROM rock_band WHERE members[2] = 'Plant';
Expect the following output:
name
--------------
Led Zeppelin
You can also access array values using slices, as follows:
SELECT members[1:2] FROM rock_band;
Expect the following output:
members
-------------------
{Barrett,Gilmour}
{Page,Plant}
Update a single element
UPDATE rock_band SET members[2] = 'Waters' WHERE name = 'Pink Floyd';
select * from rock_band where name = 'Pink Floyd';
Expect the following output:
name | members
------------+------------------
Pink Floyd | {Barrett,Waters}
Update the entire array
UPDATE rock_band SET members = '{"Mason", "Wright", "Gilmour"}'
WHERE name = 'Pink Floyd';
select * from rock_band where name = 'Pink Floyd';
Expect the following output:
name | members
------------+------------------------
Pink Floyd | {Mason,Wright,Gilmour}
Search in arrays
Use the ANY keyword to search for a particular value in an array, as follows:
SELECT name FROM rock_band WHERE 'Mason' = ANY(members);
Expect the following output:
name
------------
Pink Floyd
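ANY matches a single value. To test for several values at once, or to work with the elements individually, PostgreSQL provides the containment operator and a set of array functions. The following sketch assumes these behave the same way in YSQL and that the rock_band table is in the state left by the preceding updates.

```sql
-- @> is true when the left array contains every element of the right one.
SELECT name FROM rock_band
WHERE members @> ARRAY['Mason', 'Wright'];

-- array_length() returns the number of elements in the given dimension.
SELECT name, array_length(members, 1) AS member_count
FROM rock_band;

-- unnest() expands an array into one row per element.
SELECT name, unnest(members) AS member
FROM rock_band
WHERE name = 'Led Zeppelin';
```

With the data from this section, the first query should return Pink Floyd, and the last query should return four rows, one for each member of Led Zeppelin.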
Enumerations
YugabyteDB supports the PostgreSQL ENUM type. The following examples are adapted from Enums:
Create ENUMs
CREATE TYPE e_contact_method AS ENUM (
'Email',
'Sms',
'Phone');
List ENUMs
To view the list of values across all ENUM types, execute the following:
SELECT t.typname, e.enumlabel
FROM pg_type t, pg_enum e
WHERE t.oid = e.enumtypid;
The output should be as follows:
typname | enumlabel
------------------+-----------
e_contact_method | Email
e_contact_method | Sms
e_contact_method | Phone
ENUM column
CREATE TABLE contact_method_info (
contact_name text,
contact_method e_contact_method,
value text
);
Insert an ENUM
The ENUM should have a valid value, as follows:
INSERT INTO contact_method_info VALUES ('Jeff', 'Email', 'jeff@mail.com');
Execute the following to verify:
select * from contact_method_info;
contact_name | contact_method | value
--------------+----------------+---------------
Jeff | Email | jeff@mail.com
Inserting an invalid ENUM value fails, as shown in the following example:
INSERT INTO contact_method_info VALUES ('Jeff', 'Fax', '4563456');
You should see the following error (which is compatible with that of PostgreSQL):
ERROR: 22P02: invalid input value for enum e_contact_method: "Fax"
LINE 1: INSERT INTO contact_method_info VALUES ('Jeff', 'Fax', '4563...
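ENUM values are ordered by their position in the CREATE TYPE statement, not alphabetically. The following sketch, assuming the standard PostgreSQL enum functions and comparison operators are available in YSQL, lists the values of the type and shows the ordering in use.

```sql
-- List the values of one ENUM type in declaration order.
SELECT enum_range(NULL::e_contact_method);
-- {Email,Sms,Phone}

-- Comparisons follow declaration order: 'Email' was declared before 'Phone'.
SELECT 'Email'::e_contact_method < 'Phone'::e_contact_method
       AS email_sorts_first;  -- t

-- ORDER BY on an ENUM column uses the same ordering.
SELECT contact_name, contact_method
FROM contact_method_info
ORDER BY contact_method;
```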
Composite types
A composite type (also known as a user-defined type) is a collection of data types similar to a "struct" in a programming language. The examples in this section are adapted from PostgreSQL Data Types:
Create a composite type
CREATE TYPE inventory_item AS (
name text,
supplier_id integer,
price numeric
);
Column with a composite type
CREATE TABLE on_hand (
item inventory_item,
count integer
);
Insert
To insert a row, use the ROW keyword, as follows:
INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);
Query
To select some subfields from the on_hand example table, execute the following:
SELECT (item).name FROM on_hand WHERE (item).price > 0.99;
You can also qualify the column with the table name, as follows:
SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 0.99;
Expect the following output:
name
------------
fuzzy dice
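Subfields can also be updated individually. The following sketch uses the PostgreSQL syntax for this and assumes it behaves the same way in YSQL; the new price is an arbitrary value chosen for illustration.

```sql
-- In the SET clause the subfield is written without parentheses,
-- whereas in the WHERE clause the column must be parenthesized.
UPDATE on_hand
SET item.price = 2.49
WHERE (item).name = 'fuzzy dice';

-- Expand all subfields of the composite column into separate columns.
SELECT (item).*, count FROM on_hand;
```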
Range types
Range data types represent a range of values of an element type. The element type is referred to as the subtype of the range. The subtype must have a total order so that it is well-defined whether an element value falls within, before, or after a range of values.
YSQL supports the following range types:
- tsrange, which corresponds to a range of timestamp without time zone.
- tstzrange, which corresponds to a range of timestamp with time zone.
- daterange, which corresponds to a range of date.
- int4range, which corresponds to a range of integer.
- int8range, which corresponds to a range of bigint.
- numrange, which corresponds to a range of numeric.
- User-defined types.
The following example shows how to provide a range of time for an employee's vacation:
CREATE TABLE employees (employee_no int, vacation tsrange);
INSERT INTO employees
VALUES (1227, '[2020-01-01 8:30, 2020-02-02 5:30)');
A non-empty range has a lower bound and an upper bound, with everything between these values included in the range. An inclusive bound, indicated by a square bracket, includes the boundary itself in the range, whereas an exclusive bound, indicated by a parenthesis, excludes the boundary from the range. That is, in the preceding example, the first timestamp is included in the range, and the second timestamp is excluded. If the lower bound is omitted, everything less than the upper bound is included in the range, and if the upper bound is omitted, then everything greater than the lower bound is included in the range. If you omit both bounds, all values of the element type are in the range.
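The bound rules can be checked with the containment operator @>, which tests whether a range contains a value. The following sketch assumes standard PostgreSQL range semantics and, for the last query, that the employees table holds only the row inserted above.

```sql
-- The two-argument constructor builds a [lower, upper) range.
SELECT int4range(1, 5)    @> 1 AS has_lower,            -- t
       int4range(1, 5)    @> 5 AS has_upper,            -- f
       '[1,5]'::int4range @> 5 AS has_upper_inclusive;  -- t

-- Omitting the lower bound makes the range unbounded below.
SELECT '[,5)'::int4range @> 0 AS unbounded_below;       -- t

-- Find employees who are on vacation at a given moment.
SELECT employee_no
FROM employees
WHERE vacation @> '2020-01-15 12:00'::timestamp;        -- 1227
```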
The following shows the input syntax for a range value, where empty represents a range that does not contain any values:
(lowerbound, upperbound)
(lowerbound, upperbound]
[lowerbound, upperbound)
[lowerbound, upperbound]
empty
lowerbound is either a string that is valid input for the subtype, or empty to indicate that there is no lower bound. The same logic applies to upperbound. You can enclose bound values in double quotes, which is required when the value includes parentheses, brackets, commas, double quotes, or backslashes. To define an empty-string value, use "" (not providing anything is interpreted as defining an infinite bound). Whitespace is allowed before and after the range value, but any whitespace between the parentheses or brackets is interpreted as part of the lower or upper bound value.
You can create a range value using a constructor function named identically to the range type. The constructor functions accept two or three arguments: the two-argument form constructs a range in standard form (lower bound inclusive, upper bound exclusive), and the three-argument form constructs a range with bounds specified by the third argument (one of the strings '()', '(]', '[)', or '[]'). The following example shows the constructor with a lower bound, upper bound, and text argument:
SELECT numrange(3.0, 10.0, '(]');
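For comparison, the following sketch shows both constructor forms side by side along with the values they produce, assuming standard PostgreSQL behavior. Note that discrete range types such as int4range are converted to the canonical [lower, upper) form, so the displayed bounds can differ from the ones you supplied.

```sql
-- Two-argument form: lower bound inclusive, upper bound exclusive.
SELECT numrange(3.0, 10.0);        -- [3.0,10.0)

-- Three-argument form: bounds taken from the third argument.
SELECT numrange(3.0, 10.0, '(]');  -- (3.0,10.0]

-- Discrete ranges are canonicalized: (3,10] over integers is [4,11).
SELECT int4range(3, 10, '(]');     -- [4,11)

-- lower() and upper() extract the bounds of a range.
SELECT lower(numrange(3.0, 10.0)) AS lo,
       upper(numrange(3.0, 10.0)) AS hi;
```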
In addition to using built-in range types, you can define your own custom ones. The following example shows how to define a range type of subtype text and use it in a SELECT statement:
CREATE TYPE textrange
AS RANGE (subtype = text);
SELECT '( " a " " a ", " z " " z " )'::textrange;
For more information on range types, see Range data types.