pg_cron extension TECH PREVIEW
The pg_cron extension provides a cron-based job scheduler that runs inside the database. It uses the same syntax as regular cron, and allows you to schedule YSQL commands directly from the database. You can also use '[1-59] seconds' to schedule a job based on an interval.
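For example, the following statements (the job names, the commands, and the heartbeat_log table are illustrative) schedule one job using standard cron syntax and one using a seconds interval:
-- Run ANALYZE every day at 3:00 AM using standard cron syntax
SELECT cron.schedule('nightly-analyze', '0 3 * * *', 'ANALYZE');
-- Insert into a hypothetical heartbeat_log table every 30 seconds
SELECT cron.schedule('heartbeat', '30 seconds', $$INSERT INTO heartbeat_log (ts) VALUES (now())$$);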
YugabyteDB supports all features of the pg_cron extension. Although YugabyteDB is a distributed database that operates on multiple nodes, pg_cron only runs on one of these nodes, called the pg_cron leader. Only the pg_cron leader schedules and runs the cron jobs. The queries executed by jobs do take advantage of all available resources in the cluster.
If the pg_cron leader node fails, another node is automatically elected as the new leader to ensure it is highly available. This process is transparent, and you can connect to any node in a cluster to schedule jobs.
Set up pg_cron
pg_cron in YugabyteDB is in Tech Preview. Before you can use the feature, you must enable it by setting the enable_pg_cron flag. To do this, add enable_pg_cron to the allowed_preview_flags_csv flag and set enable_pg_cron to true on all YB-Masters and YB-TServers.
The pg_cron extension is installed on only one database, which stores the extension data. The default cron database is yugabyte. You can change it by setting the ysql_cron_database_name flag on all YB-TServers. You can create the database after setting the flag.
For example, to create a single-node yugabyted cluster with pg_cron on database 'db1', use the following command:
./bin/yugabyted start --master_flags "allowed_preview_flags_csv={enable_pg_cron},enable_pg_cron=true" --tserver_flags "allowed_preview_flags_csv={enable_pg_cron},enable_pg_cron=true,ysql_cron_database_name=db1" --ui false
To change the database after the extension is created, you must first drop the extension and then change the flag value.
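For example, as superuser on the current cron database:
-- Remove the extension before changing the ysql_cron_database_name flag;
-- after changing the flag value, recreate the extension on the new cron database.
DROP EXTENSION pg_cron;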
Enable pg_cron
Create the extension as superuser on the cron database.
CREATE EXTENSION pg_cron;
You can grant access to other users to use the extension. For example:
GRANT USAGE ON SCHEMA cron TO elephant;
Use pg_cron
YugabyteDB supports all features and syntax of the pg_cron extension.
For example, the following command calls a stored procedure every five seconds:
SELECT cron.schedule('process-updates', '5 seconds', 'CALL process_updates()');
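The stored procedure itself is not part of pg_cron; a minimal sketch of what process_updates might look like (the updates table and its processed column are illustrative):
CREATE OR REPLACE PROCEDURE process_updates()
LANGUAGE plpgsql
AS $$
BEGIN
    -- Mark any unprocessed rows as processed (placeholder logic)
    UPDATE updates SET processed = true WHERE NOT processed;
END;
$$;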
If you need to run jobs in multiple databases, use cron.schedule_in_database().
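For example, the following schedules a nightly job that runs in a database named db2 (the job name, command, and database are illustrative):
SELECT cron.schedule_in_database('nightly-analyze-db2', '0 1 * * *', 'ANALYZE', 'db2');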
When running jobs, keep in mind the following:
- It may take up to 60 seconds for job changes to get picked up by the pg_cron leader.
- When a new pg_cron leader node is elected, no jobs are run for the first minute. Any jobs that were in flight on the failed node are not retried, as their outcome is not known.
For more information on how to schedule jobs, refer to the pg_cron documentation.
Best practices
The cron.job_run_details table is part of the pg_cron extension in PostgreSQL. This table logs information about each cron job run, including its start and end time, status, and any exit messages or errors that occurred during execution. The records in cron.job_run_details are not cleaned up automatically, so if you have jobs that run frequently, set up a periodic cleanup task for the table using pg_cron to ensure old data doesn't accumulate and affect database performance.
View job details
You can view the status of running and recently completed jobs in the cron.job_run_details table using the following command:
SELECT * FROM cron.job_run_details ORDER BY start_time DESC LIMIT 5;
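To focus on problems, you can also filter on the status column. For example, the following lists the most recent failed runs along with their error messages:
SELECT jobid, command, status, return_message, start_time
FROM cron.job_run_details
WHERE status = 'failed'
ORDER BY start_time DESC
LIMIT 5;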
Set up a periodic cleanup task
Create a periodic cleanup task for the cron.job_run_details table using pg_cron, similar to the following example:
-- Delete old cron.job_run_details records of the current user every day at noon
SELECT cron.schedule('delete-job-run-details', '0 12 * * *', $$DELETE FROM cron.job_run_details WHERE end_time < now() - interval '7 days'$$);
Examples
The following examples describe various ways pg_cron can be used to automate and improve database management tasks. The tool can help maintain database performance, consistency, and reliability through scheduled jobs.
Monitor and identify slow queries
Use pg_stat_statements to capture statistics about queries, and schedule regular reports with pg_cron to summarize slow queries.
INSERT INTO slow_queries SELECT * FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10;
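To turn this into a recurring report, wrap the statement in a scheduled job. For example (the hourly schedule and the slow_queries table are illustrative; the table must already exist with columns matching pg_stat_statements):
SELECT cron.schedule('capture-slow-queries', '0 * * * *',
$$INSERT INTO slow_queries SELECT * FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10$$);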
Validate performance
pg_cron can indirectly assist with query tuning in a few ways. For example, after identifying and implementing query optimizations or indexing improvements, you can monitor the impact over time using pg_cron to execute scripts that validate the effectiveness of your changes.
-- Use distinct dollar-quote tags so the DO block can be nested inside the scheduled command
SELECT cron.schedule('weekly_check_performance', '0 4 * * 0',
$job$
DO $check$
BEGIN
    -- Example of a weekly check (your_table and performance_degraded are placeholders)
    IF EXISTS (SELECT 1 FROM your_table WHERE performance_degraded) THEN
        RAISE NOTICE 'Performance issue detected';
    END IF;
END;
$check$;
$job$);
Automate data refresh for materialized views
You can keep materialized views up to date, ensuring that they reflect recent data changes.
The following example schedules a job to refresh your_materialized_view every hour.
SELECT cron.schedule('refresh_materialized_view', '0 * * * *', 'REFRESH MATERIALIZED VIEW your_materialized_view');
Cleanup old logs or temporary data
You can periodically clean up old logs or temporary data to free up space.
The following example schedules a job to delete logs older than 30 days every day at 2AM.
SELECT cron.schedule('cleanup_old_logs', '0 2 * * *', $job$
DO $cleanup$
BEGIN
    DELETE FROM logs WHERE log_date < NOW() - INTERVAL '30 days';
END;
$cleanup$;
$job$);