pgcopydb clone

The main pgcopydb operation is the clone operation, and for historical and user friendliness reasons two aliases are available that implement the same operation:

pgcopydb
  clone     Clone an entire database from source to target
  fork      Clone an entire database from source to target

pgcopydb clone

The command pgcopydb clone copies a database from the given source Postgres instance to the target Postgres instance.

pgcopydb clone: Clone an entire database from source to target
usage: pgcopydb clone  --source ... --target ... [ --table-jobs ... --index-jobs ... ]

  --source                      Postgres URI to the source database
  --target                      Postgres URI to the target database
  --dir                         Work directory to use
  --table-jobs                  Number of concurrent COPY jobs to run
  --index-jobs                  Number of concurrent CREATE INDEX jobs to run
  --restore-jobs                Number of concurrent jobs for pg_restore
  --large-objects-jobs          Number of concurrent Large Objects jobs to run
  --split-tables-larger-than    Same-table concurrency size threshold
  --split-max-parts             Maximum number of jobs for Same-table concurrency
  --estimate-table-sizes        Allow using estimates for relation sizes
  --drop-if-exists              On the target database, clean-up from a previous run first
  --roles                       Also copy roles found on source to target
  --no-role-passwords           Do not dump passwords for roles
  --no-owner                    Do not set ownership of objects to match the original database
  --no-acl                      Prevent restoration of access privileges (grant/revoke commands).
  --no-comments                 Do not output commands to restore comments
  --no-tablespaces              Do not output commands to select tablespaces
  --skip-large-objects          Skip copying large objects (blobs)
  --skip-extensions             Skip restoring extensions
  --skip-ext-comments           Skip restoring COMMENT ON EXTENSION
  --skip-collations             Skip restoring collations
  --skip-vacuum                 Skip running VACUUM ANALYZE
  --skip-analyze                Skip running vacuumdb --analyze-only
  --skip-db-properties          Skip copying ALTER DATABASE SET properties
  --skip-split-by-ctid          Skip spliting tables by ctid
  --requirements <filename>     List extensions requirements
  --filters <filename>          Use the filters defined in <filename>
  --fail-fast                   Abort early in case of error
  --restart                     Allow restarting when temp files exist already
  --resume                      Allow resuming operations after a failure
  --not-consistent              Allow taking a new snapshot on the source database
  --snapshot                    Use snapshot obtained with pg_export_snapshot
  --follow                      Implement logical decoding to replay changes
  --plugin                      Output plugin to use (pgoutput, test_decoding, wal2json)
  --wal2json-numeric-as-string  Print numeric data type as string when using wal2json output plugin
  --slot-name                   Use this Postgres replication slot name
  --create-slot                 Create the replication slot
  --origin                      Use this Postgres replication origin node name
  --endpos                      Stop replaying changes when reaching this LSN
  --use-copy-binary             Use the COPY BINARY format for COPY operations
  --all-databases               Clone all databases found on the source instance

pgcopydb fork

The command pgcopydb fork copies a database from the given source Postgres instance to the target Postgres instance. This command is an alias to the command pgcopydb clone seen above.

Description

The pgcopydb clone command implements both a base copy of a source database into a target database and also a full Logical Decoding client. The default plugin is pgoutput, built into PostgreSQL core since version 10. The wal2json and test_decoding plugins are also supported.

Base copy, or the clone operation

The pgcopydb clone command implements the following steps:

pgcopydb gets the list of ordinary and partitioned tables from a catalog query on the source database, and also the list of indexes, and the list of sequences with their current values.

When filtering is used, the list of objects OIDs that are meant to be filtered out is built during this step.

pgcopydb calls into pg_dump to produce the pre-data section and the post-data sections of the dump using Postgres custom format.

The pre-data section of the dump is restored on the target database using the pg_restore command, creating all the Postgres objects from the source database into the target database.

When filtering is used, the pg_restore --use-list feature is used to filter the list of objects to restore in this step.

This step uses as many as --restore-jobs jobs for pg_restore to share the workload and restore the objects in parallel.

Then as many as --table-jobs COPY sub-processes are started to share the workload and COPY the data from the source to the target database one table at a time, in a loop.

A Postgres connection and a SQL query to the Postgres catalog table pg_class is used to get the list of tables with data to copy around, and the reltuples statistic is used to start with the tables with the greatest number of rows first, as an attempt to minimize the copy time.

An auxiliary process loops through all the Large Objects found on the source database and copies its data parts over to the target database, much like pg_dump itself would.

This step is much like pg_dump | pg_restore for large objects data parts, except that there isn’t a good way to do just that with the tooling.

As many as --index-jobs CREATE INDEX sub-processes are started to share the workload and build indexes. In order to make sure to start the CREATE INDEX commands only after the COPY operation has completed, a queue mechanism is used. As soon as a table data COPY has completed, all the indexes for the table are queued for processing by the CREATE INDEX sub-processes.

The primary indexes are created as UNIQUE indexes at this stage.

Then the PRIMARY KEY constraints are created USING the just built indexes. This two-steps approach allows the primary key index itself to be created in parallel with other indexes on the same table, avoiding an EXCLUSIVE LOCK while creating the index.

As many as --table-jobs VACUUM ANALYZE sub-processes are started to share the workload. As soon as a table data COPY has completed, the table is queued for processing by the VACUUM ANALYZE sub-processes.

An auxilliary process loops over the sequences on the source database and for each of them runs a separate query on the source to fetch the last_value and the is_called metadata the same way that pg_dump does.

For each sequence, pgcopydb then calls pg_catalog.setval() on the target database with the information obtained on the source database.

The final stage consists now of running the pg_restore command for the post-data section script for the whole database, and that’s where the foreign key constraints and other elements are created.

The post-data script is filtered out using the pg_restore --use-list option so that indexes and primary key constraints already created in steps 6 and 7 are properly skipped now.

This step uses as many as --restore-jobs jobs for pg_restore to share the workload and restore the objects in parallel.

Postgres privileges, superuser, and dump and restore

Postgres has a notion of a superuser status that can be assigned to any role in the system, and the default role postgres has this status. From the Role Attributes documentation page we see that:

superuser status:

A database superuser bypasses all permission checks, except the right to log in. This is a dangerous privilege and should not be used carelessly; it is best to do most of your work as a role that is not a superuser. To create a new database superuser, use CREATE ROLE name SUPERUSER. You must do this as a role that is already a superuser.

Some Postgres objects can only be created by superusers, and some read and write operations are only allowed to superuser roles, such as the following non-exclusive list:

Reading the pg_authid role password (even when encrypted) is restricted to roles with the superuser status. Reading this catalog table is done when calling pg_dumpall --roles-only so that the dump file can then be used to restore roles including their passwords.

It is possible to implement a pgcopydb migration that skips the passwords entirely when using the option --no-role-passwords. In that case though authentication might fail until passwords have been setup again correctly.

Most of the available Postgres extensions, at least when being written in C, are then only allowed to be created by roles with superuser status.

When such an extension contains Extension Configuration Tables and has been created with a role having superuser status, then the same superuser status is needed again to pg_dump and pg_restore that extension and its current configuration.

When using pgcopydb it is possible to split your migration in privileged and non-privileged parts, like in the following examples:

 $ coproc ( pgcopydb snapshot )

 # first two commands would use a superuser role to connect
 $ pgcopydb copy roles --source ... --target ...
 $ pgcopydb copy extensions --source ... --target ...

 # now it's possible to use a non-superuser role to connect
 $ pgcopydb clone --skip-extensions --source ... --target ...

 $ kill -TERM ${COPROC_PID}
 $ wait ${COPROC_PID}

In such a script, the calls to pgcopydb copy roles and pgcopydb copy extensions would be done with connection strings that connects with a role having superuser status; and then the call to pgcopydb clone would be done with a non-privileged role, typically the role that owns the source and target databases.

Warning

That said, there is currently a limitation in pg_dump that impacts pgcopydb. When an extension with configuration table has been installed as superuser, even the main pgcopydb clone operation has to be done with superuser status.

That’s because pg_dump filtering (here, there --exclude-table option) does not apply to extension members, and pg_dump does not provide a mechanism to exclude extensions.

Extension-aware migration

pgcopydb detects certain extensions in the source catalog and automatically runs extension-specific steps at the right point in the pipeline. No configuration is required: the behaviour is triggered by the presence of the extension in the source database. If an extension is listed under [exclude-extension] in the pgcopydb configuration filter file it is never fetched into the catalog, so its hook is skipped.

TimescaleDB

When timescaledb is detected in the source database, pgcopydb calls two functions on the target database:

Before pg_restore --pre-data: timescaledb_pre_restore() puts TimescaleDB into restore mode so it does not interfere with schema restore.
After pg_restore --post-data: timescaledb_post_restore() re-enables normal TimescaleDB operation.

Both the source and the target databases must have the TimescaleDB extension installed.

Citus

When citus is detected in the source database, pgcopydb queries the citus_tables view on the source coordinator and, after pg_restore --pre-data (at which point tables exist on the target as empty plain PostgreSQL tables), calls create_distributed_table() and create_reference_table() on the target coordinator.

Co-location groups are preserved: within each group the alphabetically first table is created with shard_count := N, colocate_with := 'none' and the remaining tables reference it via colocate_with. Reference tables are always created after all distributed tables.

Data is then copied through the Citus coordinator, which routes rows to the correct shards transparently — shard tables are never visible to pgcopydb. Both the source and the target must be Citus coordinators (multi-worker clusters are fully supported).

Cloning all databases with `--all-databases`

When using the --all-databases option, pgcopydb clone enumerates all non-template user databases on the source Postgres instance and clones each of them to the target instance in a single invocation.

The source and target URIs must point to the instance level (e.g. postgres://host/postgres) rather than to a specific application database. pgcopydb derives per-database connection strings by substituting the database name component.

The operation is structured in three sequential phases:

Phase I — snapshot export and pre-data: Roles are copied once at the instance level. A snapshot holder subprocess exports one consistent REPEATABLE READ snapshot per source database and holds those transactions open. A pool of --table-jobs pre-data workers then processes the databases in parallel, running schema fetch, pg_dump --pre-data, and pg_restore --pre-data for each.

Phase II — global COPY, indexes, and VACUUM: All tables across all databases are sorted by size (largest first) and processed by a single shared pool of workers — --table-jobs COPY workers, --index-jobs index/constraint workers, and --table-jobs VACUUM workers. The pools are not allocated per-database; they serve the entire workload globally.

Phase III — post-data: Databases are finalized one at a time: pg_restore --post-data applies remaining objects (triggers, rules, comments, etc.) and sequences are reset to match the source.

A consolidated summary is printed at the end aggregating timings across all databases.

For details about the process tree and the cross-database COPY queue, see Cloning all databases: the --all-databases option.

Change Data Capture using Postgres Logical Decoding

When using the --follow option the steps from the pgcopydb follow command are also run concurrently to the main copy. The Change Data Capture is then automatically driven from a prefetch-only phase to the prefetch-and-catchup phase, which is enabled as soon as the base copy is done.

See the command pgcopydb stream sentinel set endpos to remote control the follow parts of the command even while the command is already running.

The command pgcopydb stream cleanup must be used to free resources created to support the change data capture process.

Important

Make sure to read the documentation for pgcopydb follow and the specifics about Logical Replication Restrictions as documented by Postgres.

Change Data Capture Example 1

A simple approach to applying changes after the initial base copy has been done follows:

 $ pgcopydb clone --follow &

 # later when the application is ready to make the switch
 $ pgcopydb stream sentinel set endpos --current

 # later when the migration is finished, clean-up both source and target
 $ pgcopydb stream cleanup

Change Data Capture Example 2

In some cases, it might be necessary to have more control over some of the steps taken here. Given pgcopydb flexibility, it’s possible to implement the following steps:

Grab a snapshot from the source database and hold an open Postgres connection for the duration of the base copy.

In case of crash or other problems with the main operations, it’s then possible to resume processing of the base copy and the applying of the changes with the same snapshot again.

This step is also implemented when using pgcopydb clone --follow. That said, if the command was interrupted (or crashed), then the snapshot would be lost.

Setup the logical decoding within the snapshot obtained in the previous step, and the replication tracking on the target database.

The following SQL objects are then created:

a replication slot on the source database,

a replication origin on the target database.

This step is also implemented when using pgcopydb clone --follow. There is no way to implement Change Data Capture with pgcopydb and skip creating those SQL objects.

Start the base copy of the source database, and prefetch logical decoding changes to ensure that we consume from the replication slot and allow the source database server to recycle its WAL files.

Remote control the apply process to stop consuming changes and applying them on the target database.

Re-sync the sequences to their now-current values.

Sequences are not handled by Postgres logical decoding, so extra care needs to be implemented manually here.

Important

The next version of pgcopydb will include that step in the pgcopydb clone --snapshot command automatically, after it stops consuming changes and before the process terminates.

Clean-up the specific resources created for supporting resumability of the whole process (replication slot on the source database, replication origin on the target database).

Stop holding a snaphot on the source database by stopping the pgcopydb snapshot process left running in the background.

If the command pgcopydb clone --follow fails it’s then possible to start it again. It will automatically discover what was done successfully and what needs to be done again because it failed or was interrupted (table copy, index creation, resuming replication slot consuming, resuming applying changes at the right LSN position, etc).

Here is an example implement the previous steps:

 $ pgcopydb snapshot &

 $ pgcopydb stream setup

 $ pgcopydb clone --follow &

 # later when the application is ready to make the switch
 $ pgcopydb stream sentinel set endpos --current

 # when the follow process has terminated, re-sync the sequences
 $ pgcopydb copy sequences

 # later when the migration is finished, clean-up both source and target
 $ pgcopydb stream cleanup

 # now stop holding the snapshot transaction (adjust PID to your environment)
 $ kill %1

Options

The following options are available to pgcopydb clone:

--source

Connection string to the source Postgres instance. See the Postgres documentation for connection strings for the details. In short both the quoted form "host=... dbname=..." and the URI form postgres://user@host:5432/dbname are supported.

--target

Connection string to the target Postgres instance.

--dir

During its normal operations pgcopydb creates a lot of temporary files to track sub-processes progress. Temporary files are created in the directory specified by this option, or defaults to ${TMPDIR}/pgcopydb when the environment variable is set, or otherwise to /tmp/pgcopydb.

--table-jobs

How many tables can be processed in parallel.

This limit only applies to the COPY operations, more sub-processes will be running at the same time that this limit while the CREATE INDEX operations are in progress, though then the processes are only waiting for the target Postgres instance to do all the work.

--index-jobs

How many indexes can be built in parallel, globally. A good option is to set this option to the count of CPU cores that are available on the Postgres target system, minus some cores that are going to be used for handling the COPY operations.

--restore-jobs

How many threads or processes can be used during pg_restore. A good option is to set this option to the count of CPU cores that are available on the Postgres target system.

If this value is not set, we reuse the --index-jobs value. If that value is not set either, we use the the default value for --index-jobs.

--large-object-jobs

How many worker processes to start to copy Large Objects concurrently.

--split-tables-larger-than

Allow Same-table Concurrency when processing the source database. This environment variable value is expected to be a byte size, and bytes units B, kB, MB, GB, TB, PB, and EB are known.

--estimate-table-sizes

Use estimates on table sizes to decide how to split tables when using Same-table Concurrency.

When this option is used, we run vacuumdb --analyze-only --jobs=<table-jobs> command on the source database that updates the statistics for the number of pages for each relation. Later, we use the number of pages, and the size for each page to estimate the actual size of the tables.

If you wish to run the ANALYZE command manually before running pgcopydb, you can use the --skip-analyze option. This way, you can decrease the time spent on the migration.

--drop-if-exists

When restoring the schema on the target Postgres instance, pgcopydb actually uses pg_restore. When this options is specified, then the following pg_restore options are also used: --clean --if-exists.

This option is useful when the same command is run several times in a row, either to fix a previous mistake or for instance when used in a continuous integration system.

This option causes DROP TABLE and DROP INDEX and other DROP commands to be used. Make sure you understand what you’re doing here!

--roles

The option --roles add a preliminary step that copies the roles found on the source instance to the target instance. As Postgres roles are global object, they do not exist only within the context of a specific database, so all the roles are copied over when using this option.

The pg_dumpall --roles-only is used to fetch the list of roles from the source database, and this command includes support for passwords. As a result, this operation requires the superuser privileges.

Environment

PGCOPYDB_SOURCE_PGURI

Connection string to the source Postgres instance. When --source is ommitted from the command line, then this environment variable is used.

PGCOPYDB_TARGET_PGURI

Connection string to the target Postgres instance. When --target is ommitted from the command line, then this environment variable is used.

PGCOPYDB_TABLE_JOBS

Number of concurrent jobs allowed to run COPY operations in parallel. When --table-jobs is ommitted from the command line, then this environment variable is used.

PGCOPYDB_INDEX_JOBS

Number of concurrent jobs allowed to run CREATE INDEX operations in parallel. When --index-jobs is ommitted from the command line, then this environment variable is used.

PGCOPYDB_RESTORE_JOBS

Number of concurrent jobs allowed to run pg_restore operations in parallel. When --restore-jobs is ommitted from the command line, then this environment variable is used.

PGCOPYDB_LARGE_OBJECTS_JOBS

Number of concurrent jobs allowed to copy Large Objects data in parallel. When --large-objects-jobs is ommitted from the command line, then this environment variable is used.

PGCOPYDB_SPLIT_TABLES_LARGER_THAN

Allow Same-table Concurrency when processing the source database. This environment variable value is expected to be a byte size, and bytes units B, kB, MB, GB, TB, PB, and EB are known.

When --split-tables-larger-than is ommitted from the command line, then this environment variable is used.

PGCOPYDB_SPLIT_MAX_PARTS

Limit the maximum number of parts when Same-table Concurrency is used. When --split-max-parts is ommitted from the command line, then this environment variable is used.

PGCOPYDB_ESTIMATE_TABLE_SIZES

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb estimates the size of tables to determine whether or not to split tables. This option is only useful when querying the relation sizes on source database is costly.

When --estimate-table-sizes is ommitted from the command line, then this environment variable is used.

When this option is used, we run vacuumdb --analyze-only --jobs=<table-jobs> command on the source database that updates the statistics for the number of pages for each relation. Later, we use the number of pages, and the size for each page to estimate the actual size of the tables.

If you wish to run the ANALYZE command manually before running pgcopydb, you can use the --skip-analyze option or PGCOPYDB_SKIP_ANALYZE environment variable. This way, you can decrease the time spent on the migration.

PGCOPYDB_OUTPUT_PLUGIN

Logical decoding output plugin to use. When --plugin is omitted from the command line, then this environment variable is used.

PGCOPYDB_WAL2JSON_NUMERIC_AS_STRING

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb uses the wal2json option --numeric-data-types-as-string when using the wal2json output plugin.

When --wal2json-numeric-as-string is ommitted from the command line then this environment variable is used.

PGCOPYDB_DROP_IF_EXISTS

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb uses the pg_restore options --clean --if-exists when creating the schema on the target Postgres instance.

When --drop-if-exists is ommitted from the command line then this environment variable is used.

PGCOPYDB_FAIL_FAST

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb sends the TERM signal to all the processes in its process group as soon as one process terminates with a non-zero return code.

When --fail-fast is ommitted from the command line then this environment variable is used.

PGCOPYDB_SKIP_VACUUM

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb skips the VACUUM ANALYZE jobs entirely, same as when using the --skip-vacuum option.

PGCOPYDB_SKIP_ANALYZE

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb skips the vacuumdb --analyze-only commands entirely, same as when using the --skip-analyze option.

PGCOPYDB_SKIP_DB_PROPERTIES

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb skips the ALTER DATABASET SET properties commands that copy the setting from the source to the target database, same as when using the --skip-db-properties option.

PGCOPYDB_SKIP_CTID_SPLIT

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb skips the CTID split operation during the clone process, same as when using the --skip-split-by-ctid option.

PGCOPYDB_USE_COPY_BINARY

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb uses the COPY WITH (FORMAT BINARY) instead of the COPY command, same as when using the --use-copy-binary option.

PGCOPYDB_SNAPSHOT

Postgres snapshot identifier to re-use, see also --snapshot.

TMPDIR

The pgcopydb command creates all its work files and directories in ${TMPDIR}/pgcopydb, and defaults to /tmp/pgcopydb.

PGCOPYDB_LOG_TIME_FORMAT

The logs time format defaults to %H:%M:%S when pgcopydb is used on an interactive terminal, and to %Y-%m-%d %H:%M:%S otherwise. This environment variable can be set to any format string other than the defaults.

See documentation for strftime(3) for details about the format string. See documentation for isatty(3) for details about detecting if pgcopydb is run in an interactive terminal.

PGCOPYDB_LOG_JSON

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb formats its logs using JSON.

{
  "timestamp": "2023-04-13 16:53:14",
  "pid": 87956,
  "error_level": 4,
  "error_severity": "INFO",
  "file_name": "main.c",
  "file_line_num": 165,
  "message": "Running pgcopydb version 0.11.19.g2290494.dirty from \"/Users/dim/dev/PostgreSQL/pgcopydb/src/bin/pgcopydb/pgcopydb\""
}

PGCOPYDB_LOG_FILENAME

When set to a filename (in a directory that must exists already) then pgcopydb writes its logs output to that filename in addition to the logs on the standard error output stream.

If the file already exists, its content is overwritten. In other words the previous content would be lost when running the same command twice.

PGCOPYDB_LOG_JSON_FILE

When true (or yes, or on, or 1, same input as a Postgres boolean) then pgcopydb formats its logs using JSON when writing to PGCOPYDB_LOG_FILENAME.

XDG_DATA_HOME

The standard XDG Base Directory Specification defines several environment variables that allow controling where programs should store their files.

XDG_DATA_HOME defines the base directory relative to which user-specific data files should be stored. If $XDG_DATA_HOME is either not set or empty, a default equal to $HOME/.local/share should be used.

When using Change Data Capture (through --follow option and Postgres logical decoding with wal2json) then pgcopydb pre-fetches changes in JSON files and transform them into SQL files to apply to the target database.

These files are stored at the following location, tried in this order:

when --dir is used, then pgcopydb uses the cdc subdirectory of the --dir location,

when XDG_DATA_HOME is set in the environment, then pgcopydb uses that location,

when neither of the previous settings have been used then pgcopydb defaults to using ${HOME}/.local/share.

Examples

$ export PGCOPYDB_SOURCE_PGURI=postgres://pagila:0wn3d@source/pagila
$ export PGCOPYDB_TARGET_PGURI=postgres://pagila:0wn3d@target/pagila
$ export PGCOPYDB_DROP_IF_EXISTS=on

$ pgcopydb clone --table-jobs 8 --index-jobs 12
08:13:13.961 42893 INFO   [SOURCE] Copying database from "postgres://pagila:0wn3d@source/pagila?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60"
08:13:13.961 42893 INFO   [TARGET] Copying database into "postgres://pagila:0wn3d@target/pagila?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60"
08:13:14.009 42893 INFO   Using work dir "/tmp/pgcopydb"
08:13:14.017 42893 INFO   Exported snapshot "00000003-000000EB-1" from the source database
08:13:14.019 42904 INFO   STEP 1: fetch source database tables, indexes, and sequences
08:13:14.339 42904 INFO   Fetched information for 5 tables (including 0 tables split in 0 partitions total), with an estimated total of 1000 thousands tuples and 128 MB on-disk
08:13:14.342 42904 INFO   Fetched information for 4 indexes (supporting 4 constraints)
08:13:14.343 42904 INFO   Fetching information for 1 sequences
08:13:14.353 42904 INFO   Fetched information for 1 extensions
08:13:14.436 42904 INFO   Found 1 indexes (supporting 1 constraints) in the target database
08:13:14.443 42904 INFO   STEP 2: dump the source database schema (pre/post data)
08:13:14.448 42904 INFO    /usr/bin/pg_dump -Fc --snapshot 00000003-000000EB-1 --section=pre-data --section=post-data --file /tmp/pgcopydb/schema/schema.dump 'postgres://pagila:0wn3d@source/pagila?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60'
08:13:14.513 42904 INFO   STEP 3: restore the pre-data section to the target database
08:13:14.524 42904 INFO    /usr/bin/pg_restore --dbname 'postgres://pagila:0wn3d@target/pagila?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60' --section pre-data --jobs 2 --use-list /tmp/pgcopydb/schema/pre-filtered.list /tmp/pgcopydb/schema/schema.dump
08:13:14.608 42919 INFO   STEP 4: starting 8 table-data COPY processes
08:13:14.678 42921 INFO   STEP 8: starting 8 VACUUM processes
08:13:14.678 42904 INFO   Skipping large objects: none found.
08:13:14.693 42920 INFO   STEP 6: starting 2 CREATE INDEX processes
08:13:14.693 42920 INFO   STEP 7: constraints are built by the CREATE INDEX processes
08:13:14.699 42904 INFO   STEP 9: reset sequences values
08:13:14.700 42959 INFO   Set sequences values on the target database
08:13:16.716 42904 INFO   STEP 10: restore the post-data section to the target database
08:13:16.726 42904 INFO    /usr/bin/pg_restore --dbname 'postgres://pagila:0wn3d@target/pagila?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60' --section post-data --jobs 2 --use-list /tmp/pgcopydb/schema/post-filtered.list /tmp/pgcopydb/schema/schema.dump
08:13:16.751 42904 INFO   All step are now done,  2s728 elapsed
08:13:16.752 42904 INFO   Printing summary for 5 tables and 4 indexes

  OID | Schema |             Name | Parts | copy duration | transmitted bytes | indexes | create index duration
------+--------+------------------+-------+---------------+-------------------+---------+----------------------
16398 | public | pgbench_accounts |     1 |         1s496 |             91 MB |       1 |                 302ms
16395 | public |  pgbench_tellers |     1 |          37ms |            1002 B |       1 |                  15ms
16401 | public | pgbench_branches |     1 |          45ms |              71 B |       1 |                  18ms
16386 | public |           table1 |     1 |          36ms |             984 B |       1 |                  21ms
16392 | public |  pgbench_history |     1 |          41ms |               0 B |       0 |                   0ms


                                               Step   Connection    Duration    Transfer   Concurrency
 --------------------------------------------------   ----------  ----------  ----------  ------------
   Catalog Queries (table ordering, filtering, etc)       source       119ms                         1
                                        Dump Schema       source        66ms                         1
                                     Prepare Schema       target        59ms                         1
      COPY, INDEX, CONSTRAINTS, VACUUM (wall clock)         both       2s125                        18
                                  COPY (cumulative)         both       1s655      128 MB             8
                          CREATE INDEX (cumulative)       target       343ms                         2
                           CONSTRAINTS (cumulative)       target        13ms                         2
                                VACUUM (cumulative)       target       144ms                         8
                                    Reset Sequences         both        15ms                         1
                         Large Objects (cumulative)       (null)         0ms                         0
                                    Finalize Schema         both        27ms                         2
 --------------------------------------------------   ----------  ----------  ----------  ------------
                          Total Wall Clock Duration         both       2s728                        24

pgcopydb clone

pgcopydb clone

pgcopydb fork

Description

Base copy, or the clone operation

Postgres privileges, superuser, and dump and restore

Extension-aware migration

TimescaleDB

Citus

Cloning all databases with --all-databases

Change Data Capture using Postgres Logical Decoding

Change Data Capture Example 1

Change Data Capture Example 2

Options

Environment

Examples

Cloning all databases with `--all-databases`