ysql_dump
Overview
ysql_dump is a utility for backing up a YugabyteDB database into a plain-text, SQL script file. ysql_dump makes consistent backups, even if the database is being used concurrently. ysql_dump does not block other users accessing the database (readers or writers).
ysql_dump only dumps a single database. To backup global objects that are common to all databases in a cluster, such as roles, use ysql_dumpall.
Dumps are output in plain-text, SQL script files. Script dumps are plain-text files containing the SQL statements required to reconstruct the database to the state it was in at the time it was saved. To restore from such a script, import it using the ysqlsh \i
meta-command. Script files can be used to reconstruct the database even on other machines and other architectures; with some modifications, even on other SQL database products.
While running ysql_dump, you should examine the output for any warnings (printed on standard error).
The ysql_dump utility is derived from the PostgreSQL pg_dump utility.
Installation
ysql_dump is installed with YugabyteDB and is located in the postgres/bin
directory of the YugabyteDB home directory.
Online help
Run ysql_dump --help
to display the online help.
Syntax
ysql_dump [ <connection-option>... ] [ <content-output-format-option> ... ] [ <dbname> ]
- connection-option: See Database connection options.
- content-output-format-option: See Content and output format options
- dbname: The name of the database.
Content and output format options
The following command line options control the content and format of the output.
dbname
Specifies the name of the database to be dumped. If this is not specified, the environment variable PGDATABASE
is used. If that is not set, the user name specified for the connection is used.
-a, --data-only
Dump only the data, not the schema (data definitions). Table data, large objects, and sequence values are dumped.
-b, --blobs
Include large objects in the dump. This is the default behavior except when -n|--schema
, -t|--table
, or -s|--schema-only
is specified. The -b|--blobs
option is therefore only useful to add large objects to dumps where a specific schema or table has been requested. Note that blobs are considered data and therefore will be included when -a|--data-only
is used, but not when -s|--schema-only
is used.
-B, --no-blobs
Exclude large objects in the dump.
When both -b|--blobs
and -B|--no-blobs
are given, the behavior is to output large objects, when data is being dumped, see -b|--blobs
option.
-c, --clean
Output statements to clean (drop) database objects prior to outputting the statements for creating them. (Unless --if-exists
is also specified, restore might generate some harmless error messages, if any objects were not present in the destination database.)
-C, --create
Begin the output with a statement to create the database itself and reconnect to the created database. (With a script of this form, it doesn't matter which database in the destination installation you connect to before running the script.) If -c|--clean
is also specified, the script drops and recreates the target database before reconnecting to it.
-E encoding, --encoding=encoding
Create the dump in the specified character set encoding. By default, the dump is created in the database encoding. (Another way to get the same result is to set the PGCLIENTENCODING
environment variable to the desired dump encoding.)
-f file, --file=file
Send output to the specified file. This parameter can be omitted for file-based output formats, in which case the standard output is used.
-m addresses, --masters=addresses
Comma-separated list of YB-Master hosts and ports.
-n schema, --schema=schema
Dump only schemas matching schema; this selects both the schema itself, and all its contained objects. When this option is not specified, all non-system schemas in the target database will be dumped. Multiple schemas can be selected by writing multiple -n|--schema
options. Also, the schema parameter is interpreted as a pattern according to the same rules used by the ysqlsh \d
commands, so multiple schemas can also be selected by writing wildcard characters in the pattern. When using wildcards, be careful to quote the pattern if needed to prevent the shell from expanding the wildcards.
Note
When-n|--schema
is specified, ysql_dump makes no attempt to dump any other database objects that the selected schemas might depend upon. Therefore, there is no guarantee that the results of a specific-schema dump can be successfully restored by themselves into a clean database.
Note
Non-schema objects, such as blobs, are not dumped when-n|--schema
is specified. You can add blobs back to the dump with the -b|--blobs
option.
-N schema, --exclude-schema=schema
Do not dump any schemas matching the schema pattern. The pattern is interpreted according to the same rules as for -n|--schema
option. -N|--exclude-schema
can be given more than once to exclude schemas matching any of several patterns.
When both -n|--schema
and -N|--exclude-schema
are given, the behavior is to dump just the schemas that match at least one -n|--schema
option but no -N|--exclude-schema
options. If -N|--exclude-schema
appears without -n|--schema
, then schemas matching -N|--exclude-schema
are excluded from what is otherwise a normal dump.
-o, --oids
Dump object identifiers (OIDs) as part of the data for every table. Use this option if your application references the OID columns in some way (for example, in a foreign key constraint). Otherwise, this option should not be used.
-O, --no-owner
Do not output statements to set ownership of objects to match the original database. By default, ysql_dump issues ALTER OWNER
or SET SESSION AUTHORIZATION
statements to set ownership of created database objects. These statements will fail when the script is run unless it is started by a superuser (or the same user that owns all of the objects in the script). To make a script that can be restored by any user, but will give that user ownership of all the objects, specify -O|--no-owner
.
-s, --schema-only
Dump only the object definitions (schema), not data.
This option is the inverse of -a|--data-only
.
(Do not confuse this with the -n|--schema
option, which uses the word "schema" in a different meaning.)
To exclude table data for only a subset of tables in the database, see --exclude-table-data
.
-S username, --superuser=username
Specify the superuser username to use when disabling triggers. This is relevant only if --disable-triggers
is used. (Usually, it's better to leave this out, and instead start the resulting script as superuser.)
-t table, --table=table
Dump only tables with names matching table. For this purpose, "table" includes views, materialized views, sequences, and foreign tables. Multiple tables can be selected by writing multiple -t|--table
options. Also, the table parameter is interpreted as a pattern according to the same rules used by ysqlsh \d
commands, so multiple tables can also be selected by writing wildcard characters in the pattern. When using wildcards, be careful to quote the pattern if needed to prevent the shell from expanding the wildcards.
The -n|--schema
and -N|--exclude-schema
options have no effect when -t|--table
is used, because tables selected by -t|--table
will be dumped regardless of those options, and non-table objects will not be dumped.
Note
When-t|--table
is specified, ysql_dump makes no attempt to dump any other database objects that the selected tables might depend upon. Therefore, there is no guarantee that the results of a specific-table dump can be successfully restored by themselves into a clean database.
-T table, --exclude-table=table
Do not dump any tables matching the table pattern. The pattern is interpreted according to the same rules as for -t
. -T|--exclude-table
can be given more than once to exclude tables matching any of several patterns.
When both -t|--table
and -T|--exclude-table
are given, the behavior is to dump just the tables that match at least one -t|--table
option but no -T|--exclude-table
options. If -T|--exclude-table
appears without -t|--table
, then tables matching -T|--exclude-table
are excluded from what is otherwise a normal dump.
-v, --verbose
Specifies verbose mode. This causes ysql_dump to output detailed object comments and start and stop times to the dump file, and progress messages to standard error.
-V, --version
Print the ysql_dump version and exit.
-x, --no-privileges, --no-acl
Prevent dumping of access privileges (GRANT
and REVOKE
statements).
-Z 0..9, --compress=0..9
Specify the compression level to use. Zero (0
) means no compression. For plain text output, setting a nonzero compression level causes the entire output file to be compressed, as though it had been fed through gzip
; but the default is not to compress.
--column-inserts, --attribute-inserts
Dump data as INSERT
statements with explicit column names (INSERT INTO table (column, ...) VALUES ...
). This makes restoration very slow; it is mainly helpful for making dumps that can be loaded into non-YugabyteDB databases. However, as this option generates a separate statement for each row, an error in reloading a row causes only that row to be lost rather than the entire table contents.
--disable-dollar-quoting
This option disables the use of dollar quoting for function bodies, and forces them to be quoted using SQL standard string syntax.
--disable-triggers
This option is relevant only when creating a data-only dump. It instructs ysql_dump to include statements to temporarily disable triggers on the target tables while the data is reloaded. Use this if you have referential integrity checks or other triggers on the tables that you do not want to invoke during data reload.
Presently, the statements emitted for --disable-triggers
must be done as superuser. So, you should also specify a superuser name with -S|--superuser
, or preferably be careful to start the resulting script as a superuser.
--enable-row-security
This option is relevant only when dumping the contents of a table which has row security. By default, ysql_dump sets row_security
to off
, to ensure that all data is dumped from the table. If the user does not have sufficient privileges to bypass row security, then an error is thrown. This parameter instructs ysql_dump to set row_security
to on
instead, allowing the user to dump the parts of the contents of the table that they have access to.
Note that if you use this option currently, you probably also want the dump be in INSERT
format, as the COPY FROM
during restore does not support row security.
--exclude-table-data=table
Do not dump data for any tables matching the table pattern. The pattern is interpreted according to the same rules as for -t|--table
. The --exclude-table-data
option can be given more than once to exclude tables matching any of several patterns. This option is helpful when you need the definition of a particular table even though you do not need the data in it.
To exclude data for all tables in the database, see -s|--schema-only
.
--if-exists
Use conditional statements (that is, add an IF EXISTS
clause) when cleaning database objects. This option is not valid unless -c|--clean
is also specified.
--inserts
Dump data as INSERT
statements (rather than COPY
statements). This will make restoration very slow; it is mainly helpful for making dumps that can be loaded into non-YugabyteDB databases. However, as this option generates a separate statement for each row, an error in reloading a row causes only that row to be lost rather than the entire table contents. Note that the restore might fail altogether if you have rearranged column order. The --column-inserts
option is safe against column order changes, though even slower.
--lock-wait-timeout=timeout
Do not wait forever to acquire shared table locks at the beginning of the dump. Instead fail if unable to lock a table in the specified timeout. The timeout may be specified in any of the formats accepted by SET statement_timeout
. (Allowed formats vary depending on the server version you are dumping from, but an integer number of milliseconds is accepted by all versions.)
--no-publications
Do not dump publications.
--no-security-labels
Do not dump security labels.
--no-subscriptions
Do not dump subscriptions.
--no-sync
By default, ysql_dump waits for all files to be written safely to disk. This option causes ysql_dump to return without waiting, which is faster, but means that a subsequent operating system crash can leave the dump corrupt. Generally, this option is helpful for testing but should not be used when dumping data from production installation.
--no-unlogged-table-data
Do not dump the contents of unlogged tables. This option has no effect on whether or not the table definitions (schema) are dumped; it only suppresses dumping the table data. Data in unlogged tables is always excluded when dumping from a standby server.
--quote-all-identifiers
Force quoting of all identifiers. This option is recommended when dumping a database from a server whose YugabyteDB major version is different from ysql_dump, or when the output is intended to be loaded into a server of a different major version. By default, ysql_dump quotes only identifiers that are reserved words in its own major version. This sometimes results in compatibility issues when dealing with servers of other versions that may have slightly different sets of reserved words. Using --quote-all-identifiers
prevents such issues, at the price of a harder-to-read dump script.
--section=sectionname
Only dump the named section. The section name can be pre-data, data, or post-data. This option can be specified more than once to select multiple sections. The default is to dump all sections.
The data section contains actual table data, large-object contents, and sequence values. Post-data items include definitions of indexes, triggers, rules, and constraints other than validated check constraints. Pre-data items include all other data definition items.
--no-serializable-deferrable
Use the --no-serializable-deferrable
flag to disable the default serializable-deferrable
transaction mode. The serializable-deferrable
mode ensures that the snapshot used is consistent with later database states by waiting for a point in the transaction stream at which no anomalies can be present, so that there is no risk of the dump failing or causing other transactions to roll back with a serialization_failure
.
If there are active read-write transactions, the maximum wait time until the start of the dump will be 50ms
(based on the default --max_clock_skew_usec
for YB-TServer and YB-Master servers.) If there are no active read-write transactions when ysql_dump is started, this option will not make any difference. Once running, performance with or without the option is the same.
--snapshot=snapshotname
Use the specified synchronized snapshot when making a dump of the database. This option is helpful when needing to synchronize the dump with a logical replication slot or with a concurrent session. In the case of a parallel dump, the snapshot name defined by this option is used rather than taking a new snapshot.
--strict-names
Require that each schema (-n|--schema
) and table (-t|--table
) qualifier match at least one schema or table in the database to be dumped. Note that if none of the schema or table qualifiers find matches, ysql_dump generates an error even without --strict-names
.
This option has no effect on -N|--exclude-schema
, -T|--exclude-table
, or --exclude-table-data
. An exclude pattern failing to match any objects is not considered an error.
--use-set-session-authorization
Output SQL-standard SET SESSION AUTHORIZATION
statements instead of ALTER OWNER
statements to determine object ownership. This makes the dump more standards-compatible, but depending on the history of the objects in the dump, might not restore properly. Also, a dump using SET SESSION AUTHORIZATION
statements will certainly require superuser privileges to restore correctly, whereas ALTER OWNER
statements requires lesser privileges.
-?, --help
Show help about ysql_dump command line arguments and then exit.
Database connection options
The following command line options control the database connection parameters.
-d dbname, --dbname=dbname
Specifies the name of the database to connect to. This is equivalent to specifying dbname
as the first non-option argument on the command line.
If this parameter contains an equal sign (=
) or starts with a valid URI prefix (yugabytedb://
), it is treated as a conninfo
string.
-h host, --host=host
Specifies the host name of the machine on which the server is running. If the value begins with a slash (/
), it is used as the directory for the Unix domain socket. Defaults to the compiled-in host of 127.0.0.1
else a Unix domain socket connection is attempted.
-p port, --port=port
Specifies the TCP port or local Unix domain socket file extension on which the server is listening for connections. Defaults to the compiled-in port of 5433
.
-U username, --username=username
The username to connect as.
-w, --no-password
Never issue a password prompt. If the server requires password authentication and a password is not available by other means such as a ~/.pgpass
file, the connection attempt will fail. This option can be useful in batch jobs and scripts where no user is present to enter a password.
-W, --password
Force ysql_dump to prompt for a password before connecting to a database.
This option is never essential, as ysql_dump automatically prompts for a password if the server demands password authentication. However, ysql_dump will waste a connection attempt finding out that the server wants a password. In some cases it is worth typing -W|--password
to avoid the extra connection attempt.
--role=rolename
Specifies a role name to be used to create the dump. This option causes ysql_dump to issue a SET ROLE <rolename>
statement after connecting to the database. It is useful when the authenticated user (specified by -U|--username
) lacks privileges needed by ysql_dump, but can switch to a role with the required rights. Some installations have a policy against logging in directly as a superuser, and use of this option allows dumps to be made without violating the policy.
Environment
The following PostgreSQL environment variables, referenced in some ysql_dump options, are used by YugabyteDB for PostgreSQL compatibility:
PGHOST
PGPORT
PGOPTIONS
PGUSER
PGDATABASE
PGCLIENTENCODING
This utility also uses the environment variables supported by libpq
.
Diagnostics
ysql_dump internally executes SELECT
statements. If you have problems running ysql_dump, make sure you are able to select information from the database using, for example, ysqlsh
. Also, any default connection settings and environment variables used by the libpq
front-end library will apply.
The database activity of ysql_dump is normally collected by the statistics collector. If this is undesirable, you can set parameter track_counts
to false
using PGOPTIONS
or the ALTER USER
statement.
Notes
If your YugabyteDB cluster has any local additions to the template1
database, be careful to restore the output of ysql_dump into a truly empty database; otherwise you are likely to get errors due to duplicate definitions of the added objects. To make an empty database without any local additions, copy from template0
not template1
, for example:
CREATE DATABASE foo WITH TEMPLATE template0;
When a data-only dump is chosen and the option --disable-triggers
is used, ysql_dump emits statements to disable triggers on user tables before inserting the data, and then statements to re-enable them after the data has been inserted. If the restore is stopped in the middle, the system catalogs might be left in the wrong state.
The dump file produced by ysql_dump does not contain the statistics used by the optimizer to make query planning decisions. Therefore, running ANALYZE
after restoring from a dump file can ensure optimal performance.
Because ysql_dump is used to transfer data to newer versions of YugabyteDB, the output of ysql_dump can be expected to load into YugabyteDB versions newer than the ysql_dump version. ysql_dump can also dump from YugabyteDB servers older than its own version. However, ysql_dump cannot dump from YugabyteDB servers newer than its own major version; it will refuse to even try, rather than risk making an invalid dump. Also, it is not guaranteed that the ysql_dump output can be loaded into a server of an older major version — not even if the dump was taken from a server of that version. Loading a dump file into an older server may require manual editing of the dump file to remove syntax not understood by the older server. Use of the --quote-all-identifiers
option is recommended in cross-version cases, as it can prevent problems arising from varying reserved-word lists in different YugabyteDB versions.
Examples
Dump a database into a SQL script file
$ ysql_dump mydb > mydb.sql
Dump a single table named mytable
$ ysql_dump -t mytable mydb -f mytable_mydb.sql
Dump schemas based on filters
The following command dumps all schemas whose names start with east
or west
and end in gsm
, excluding any schema whose names contain the word test
:
$ ysql_dump -n 'east*gsm' -n 'west*gsm' -N '*test*' mydb > myschemas_mydb.sql
Here's the same example, using regular expression notation to consolidate the options:
$ ysql_dump -n '(east|west)*gsm' -N '*test*' mydb > myschemas_mydb.sql
Dump all database objects based on a filter
The following command dumps all database objects except for tables whose names begin with ts_
:
$ ysql_dump -T 'ts_*' mydb > objects_mydb.sql