Data types

Whether you're storing text, numbers, dates, or more complex data like arrays and JSON, YugabyteDB provides robust options to define the structure of your database tables. Choosing the correct data type ensures that data is stored optimally, queries run efficiently, and constraints like precision, length, and integrity are properly maintained.

This guide explores the different data types to use to handle different kinds of data efficiently.

For a list of supported and unsupported data types, see Data types.

Setup

The examples run on any YugabyteDB universe.

Set up a local cluster

If a local universe is currently running, first destroy it.

Start a local one-node universe with an RF of 1 by first creating a single node, as follows:

./bin/yugabyted start \
                --advertise_address=127.0.0.1 \
                --base_dir=${HOME}/var/node1 \
                --cloud_location=aws.us-east-2.us-east-2a

After starting the yugabyted processes on all the nodes, configure the data placement constraint of the universe, as follows:

./bin/yugabyted configure data_placement --base_dir=${HOME}/var/node1 --fault_tolerance=zone

This command can be executed on any node where you already started YugabyteDB.

To check the status of a running multi-node universe, run the following command:

./bin/yugabyted status --base_dir=${HOME}/var/node1

Setup

To set up a universe, refer to Set up a YugabyteDB Anywhere universe.

Setup

To set up a cluster, refer to Set up a YugabyteDB Aeon cluster.

Strings

The following character types are supported:

varchar(n): variable-length string
char(n): fixed-length, blank padded
text, varchar: variable unlimited length

To test YugabyteDB support for character types, create a table that has columns with the following types specified:

CREATE TABLE char_types (
  id serial PRIMARY KEY,
  a CHAR (4),
  b VARCHAR (16),
  c TEXT
);

Insert the following rows into the table:

INSERT INTO char_types (a, b, c) VALUES (
  'foo', 'bar', 'Data for the text column'
);

JSON

YugabyteDB provides two types for storing JSON (JavaScript Object Notation) data: JSON and JSONB. Both allow you to store JSON-formatted data, but they have distinct characteristics.

The JSON type preserves white space, key order, and duplicate keys, and is slower to process as it needs to be re-parsed for each operation. The newer JSONB type stores data in a decomposed binary format, eliminates white space, reorders keys, and removes duplicate keys, and is faster to process but slightly slower to input due to conversion overhead.

To understand how to use JSON and JSONB, see JSON support.

Numeric types

The following numeric types are supported:

SMALLINT: a 2-byte signed integer that has a range from -32,768 to 32,767.
INT: a 4-byte integer that has a range from -2,147,483,648 to 2,147,483,647.
float(n): a floating-point number whose precision is at least, n, up to a maximum of 8 bytes
real: a 4-byte floating-point number
numeric or numeric(p,s): a real number with p digits with s number after the decimal point. The numeric(p,s) is the exact number

The following example creates a table with integer type columns and inserting rows into it:

CREATE TABLE albums (
  album_id SERIAL PRIMARY KEY,
  title VARCHAR (255) NOT NULL,
  play_time SMALLINT NOT NULL,
  library_record INT NOT NULL
);

INSERT INTO albums
values (default,'Funhouse', 3600,2146483645 ),
(default,'Darkside of the Moon', 4200, 214648348);

Similarly, the following example shows how to create a table with floating-point typed columns and how to insert a row into that table:

CREATE TABLE floating_point_test (
  floatn_test float8 not NULL,
  real_test real NOT NULL,
  numeric_test NUMERIC (3, 2)
);

INSERT INTO floating_point_test (floatn_test, real_test, numeric_test)
VALUES
  (9223372036854775807, 2147483647, 5.36),
  (9223372036854775800, 2147483640, 9.99);

SERIAL pseudotype

In YugabyteDB, just like in PostgreSQL, a sequence is a special kind of database object that generates a sequence of integers. A sequence is often used as the primary key column in a table.

By assigning the SERIAL pseudotype to a column, the following occurs in the background:

The database creates a sequence object and sets the next value generated by the sequence as the default value for the column.
The database adds a NOT NULL constraint to that column because a sequence always generates an integer, which is a non-null value.
The SERIAL column is assigned as the owner of the sequence. This results in the sequence object being deleted when the SERIAL column or table is dropped.

YSQL supports the following pseudotypes:

SMALLSERIAL: 2 bytes (1 to 32,767)
SERIAL: 4 bytes (1 to 2,147,483,647)
BIGSERIAL: 8 bytes (1 to 9,223,372,036,854,775,807)

Date and time

Temporal data types allow us to store date and time data. The following date and time types are supported in PostgreSQL and YugabyteDB:

DATE: stores the dates only
TIME: stores the time of day values
TIMESTAMP: stores both date and time values
TIMESTAMPTZ: is a timezone-aware timestamp data type
INTERVAL: stores intervals of time

The following example creates a table with the temporal types:

CREATE TABLE temporal_types (
  date_type DATE,
  time_type TIME,
  timestamp_type TIMESTAMP,
  timestampz_type TIMESTAMPTZ,
  interval_type INTERVAL
);

The following example inserts a row into the table:

INSERT INTO temporal_types (
  date_type, time_type, timestamp_type, timestampz_type, interval_type)
VALUES
  ('2000-06-28', '06:23:00', '2016-06-22 19:10:25-07',
   '2016-06-22 19:10:25-07', '1 year'),
  ('2010-06-28', '12:32:12','2016-06-22 19:10:25-07',
   '2016-06-22 19:10:25-07', '10 years 3 months 5 days');

The following shows the inserted data:

yugabyte=# SELECT * FROM temporal_types;

 date_type  | time_type |   timestamp_type    |    timestampz_type     |     interval_type
------------+-----------+---------------------+------------------------+------------------------
 2010-06-28 | 12:32:12  | 2016-06-22 19:10:25 | 2016-06-22 19:10:25-07 | 10 years 3 mons 5 days
 2000-06-28 | 06:23:00  | 2016-06-22 19:10:25 | 2016-06-22 19:10:25-07 | 1 year

Arrays

YSQL supports arrays to hold data of variable length. The type of the data stored in an array can be an inbuilt type, a user-defined type or an enumerated type. The following examples are adapted from Arrays:

Create an array

CREATE TABLE rock_band (
   name text,
   members text[]
);

Insert rows

The following shows how to insert a row into the table. Note that the array literals must be double-quoted.

INSERT INTO rock_band VALUES (
  'Led Zeppelin', '{"Page", "Plant", "Jones", "Bonham"}'
);

The following is an alternate syntax using the array constructor. Note that the values must be single-quoted.

INSERT INTO rock_band VALUES (
  'Pink Floyd', ARRAY['Barrett', 'Gilmour']
);

Access arrays

SELECT * FROM rock_band;

Expect the following output:

     name     |          members
--------------+---------------------------
 Pink Floyd   | {Barrett,Gilmour}
 Led Zeppelin | {Page,Plant,Jones,Bonham}

You can access array values using subscripts, as follows:

SELECT name FROM rock_band WHERE members[2] = 'Plant';

Expect the following output:

     name
--------------
 Led Zeppelin

You can also access array values using slices, as follows:

SELECT members[1:2] FROM rock_band;

Expect the following output:

      members
-------------------
 {Barrett,Gilmour}
 {Page,Plant}

Update a single element

UPDATE rock_band SET members[2] = 'Waters' WHERE name = 'Pink Floyd';
select * from rock_band where name = 'Pink Floyd';

Expect the following output:

    name    |     members
------------+------------------
 Pink Floyd | {Barrett,Waters}

Update the entire array

UPDATE rock_band SET members = '{"Mason", "Wright", "Gilmour"}'
       WHERE name = 'Pink Floyd';
select * from rock_band where name = 'Pink Floyd';

Expect the following output:

    name    |        members
------------+------------------------
 Pink Floyd | {Mason,Wright,Gilmour}

Search in arrays

Use the ANY keyword to search for a particular value in an array, as follows:

SELECT name FROM rock_band WHERE 'Mason' = ANY(members);

Expect the following output:

    name
------------
 Pink Floyd

Enumerations

YugabyteDB supports the ENUM type in PostgreSQL. The following examples are adapted from Enums:

Create ENUMs

CREATE TYPE e_contact_method AS ENUM (
  'Email',
  'Sms',
  'Phone');

List ENUMs

To view the list of values across all ENUM types, execute the following:

SELECT t.typname, e.enumlabel
  FROM pg_type t, pg_enum e
  WHERE t.oid = e.enumtypid;

The output should be as follows:

     typname      | enumlabel
------------------+-----------
 e_contact_method | Email
 e_contact_method | Sms
 e_contact_method | Phone

ENUM column

CREATE TABLE contact_method_info (
   contact_name text,
   contact_method e_contact_method,
   value text
);

Insert an ENUM

The ENUM should have a valid value, as follows:

INSERT INTO contact_method_info VALUES ('Jeff', 'Email', 'jeff@mail.com');

Execute the following to verify:

select * from contact_method_info;

 contact_name | contact_method |     value
--------------+----------------+---------------
 Jeff         | Email          | jeff@mail.com

Inserting an invalid ENUM value would fail, as shown in the following example:

INSERT INTO contact_method_info VALUES ('Jeff', 'Fax', '4563456');

You should see the following error (which is compatible with that of PostgreSQL):

ERROR:  22P02: invalid input value for enum e_contact_method: "Fax"
LINE 1: INSERT INTO contact_method_info VALUES ('Jeff', 'Fax', '4563...

Composite types

A composite type (also known as a user-defined type) is a collection of data types similar to a "struct" in a programming language. The examples in this section are adapted from PostgreSQL Data Types:

Create a composite type

CREATE TYPE inventory_item AS (
   name text,
   supplier_id integer,
   price numeric
);

Column with a composite type

CREATE TABLE on_hand (
   item inventory_item,
   count integer
);

Insert

To insert a row, use the ROW keyword, as follows:

INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);

Query

To select some subfields from the on_hand example table, execute the following:

SELECT (item).name FROM on_hand WHERE (item).price > 0.99;

You can also use the table names, as follows:

SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 0.99;

Expect the following output:

    name
------------
 fuzzy dice

Range types

Range data types represent a range of values of an element type. Range types are usually referred to as the subtype of the range. The subtype needs to follow a strict order because it must be well-defined regardless of the position of element values, which can be within, before, or after a value range.

YSQL supports the following range types:

tsrange, which corresponds to a range of timestamp without time zone.
tstzrange, which corresponds to a range of timestamp with time zone.
daterange, which corresponds to a range of date.
int4range, which corresponds to a range of integer.
int8range, which corresponds to a range of biginteger.
numrange, which corresponds to a range of numeric.
User-defined types.

The following example shows how to provide a range of time for an employee's vacation:

CREATE TABLE employees (employee_no int, vacation tsrange);

INSERT INTO employees
  VALUES (1227, '[2020-01-01 8:30, 2020-02-02 5:30)');

A non-empty range has a lower bound and an upper bound, with everything between these values included in the range. An inclusive bound indicated by square brackets includes the boundary itself in the range, whereas an exclusive bound indicated by braces excludes the boundary from the range. That is, in the preceding example, the first timestamp is included in the range, and the second timestamp is excluded. If the lower bound is omitted, everything less than the upper bound is included in the range, and if the upper bound is omitted, then everything greater than the lower bound is included in the range. If you omit both bounds, all values of the element type are in the range.

The following is a syntax of an input for a range value, where empty is a representation of a range that does not contain anything:

(lowerbound, upperbound)

(lowerbound, upperbound]

[lowerbound, upperbound)

[lowerbound, upperbound]

empty

lowerbound could be a string of a valid input for the subtype or empty if there is no lower bound. The same logic is applicable to upperbound. You can enclose bound values in double quotes, which is a requirement in cases when the value includes parentheses, brackets, commas, double quotes, or backslashes. To define an empty-string value, you use "" (not providing anything is interpreted as defining an infinite bound). You may use whitespaces before and after values, but not between the parentheses or brackets, as this is interpreted as part of the lower or upper bound value.

You can create a range type using a constructor function named identically to the range type. The constructor functions typically have two or three arguments, with the former constructing a range in standard form (lower bound inclusive, upper bound exclusive), and the latter constructing a range with bounds specified by the third argument (one of strings "()", "(]", "[)", or "[]"). The following example shows the constructor with a lower bound, upper bound, and text argument:

SELECT numrange(3.0, 10.0, '(]');

In addition to using built-in range types, you can define your own custom ones. The following example shows how to define a range type of subtype text and use it in a SELECT statement:

CREATE TYPE textrange
  AS RANGE (subtype = text);

SELECT '( " a " " a ", " z " " z " )'::textrange;

For more information on range types, see Range data types.