Load and Replicate

On this page

Perform an initial bulk load of the source data using MOLT Fetch, then use MOLT Replicator to replicate ongoing changes to the target.

Note:

For CockroachDB-to-CockroachDB migrations, contact your account team for guidance.

Note:

This feature is in preview and subject to change. To share feedback and/or issues, contact Support.

Before you begin

Create a CockroachDB Cloud or self-hosted cluster.
Install the MOLT (Migrate Off Legacy Technology) tools.
Review the Fetch and Replicator best practices.
Review Migration Strategy.

Prerequisites

Oracle Instant Client

Install Oracle Instant Client on the machine that will run molt and replicator. If using the MOLT Replicator binary (instead of Docker), the Oracle Instant Client libraries must be accessible at /usr/lib.

On macOS ARM machines, download the Oracle Instant Client. After installation, you should have a new directory at /Users/$USER/Downloads/instantclient_23_3 containing .dylib files. Set the LD_LIBRARY_PATH environment variable to this directory:
```
export LD_LIBRARY_PATH=/Users/$USER/Downloads/instantclient_23_3
```

On Linux machines, install the Oracle Instant Client dependencies and set the LD_LIBRARY_PATH to the client library path:

sudo apt-get install -yqq --no-install-recommends libaio1t64
sudo ln -s /usr/lib/x86_64-linux-gnu/libaio.so.1t64 /usr/lib/x86_64-linux-gnu/libaio.so.1
unzip -d /tmp /tmp/instantclient-basiclite-linux-amd64.zip
sudo mv /tmp/instantclient_21_13/* /usr/lib
export LD_LIBRARY_PATH=/usr/lib

Tip:

You can also download Oracle Instant Client directly from the Oracle site for Linux ARM64 or Linux x86-64.

Limitations

Fetch limitations

OID LOB types in PostgreSQL are not supported, although similar types like BYTEA are supported.

Migrations must be performed from a single Oracle schema. You must include --schema-filter so that MOLT Fetch only loads data from the specified schema. Refer to Schema and table filtering.
- Specifying --table-filter is also strongly recommended to ensure that only necessary tables are migrated from the Oracle schema.
Oracle advises against LONG RAW columns and recommends converting them to BLOB. LONG RAW can only store binary values up to 2GB, and only one LONG RAW column per table is supported.

Only tables with primary key types of INT, FLOAT, or UUID can be sharded with --export-concurrency.

Replicator limitations

Replication modes require connection to the primary instance (PostgreSQL primary, MySQL primary/master, or Oracle primary). MOLT cannot obtain replication checkpoints or transaction metadata from replicas.

MySQL replication is supported only with GTID-based configurations. Binlog-based features that do not use GTID are not supported.

Replication will not work for tables or column names exceeding 30 characters. This is a limitation of Oracle LogMiner.
The following data types are not supported for replication:
- User-defined types (UDTs)
- Nested tables
- VARRAY
- LONGBLOB/CLOB columns (over 4000 characters)
If your Oracle workload executes UPDATE statements that modify only LOB columns, these UPDATE statements are not supported by Oracle LogMiner and will not be replicated.
If you are using Oracle 11 and execute UPDATE statements on XMLTYPE or LOB columns, those changes are not supported by Oracle LogMiner and will be excluded from ongoing replication.
If you are migrating LOB columns from Oracle 12c, use AWS DMS Binary Reader instead of LogMiner. Oracle LogMiner does not support LOB replication in 12c.

Running DDL on the source or target while replication is in progress can cause replication failures.
TRUNCATE operations on the source are not captured. Only INSERT, UPDATE, UPSERT, and DELETE events are replicated.
Changes to virtual columns are not replicated automatically. To migrate these columns, you must define them explicitly with transformation rules.

Prepare the source database

Create migration user on source database

Create a dedicated migration user (for example, MIGRATION_USER) on the source database. This user is responsible for reading data from source tables during the migration. You will pass this username in the source connection string.

CREATE USER migration_user WITH PASSWORD 'password';

Grant the user privileges to connect, view schema objects, and select the tables you migrate.

GRANT CONNECT ON DATABASE source_database TO migration_user;
GRANT USAGE ON SCHEMA migration_schema TO migration_user;
GRANT SELECT ON ALL TABLES IN SCHEMA migration_schema TO migration_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema GRANT SELECT ON TABLES TO migration_user;

Grant the SUPERUSER role to the user (recommended for replication configuration):

ALTER USER migration_user WITH SUPERUSER;

Alternatively, grant the following permissions to create replication slots, access replication data, create publications, and add tables to publications:

ALTER USER migration_user WITH LOGIN REPLICATION;
GRANT CREATE ON DATABASE source_database TO migration_user;
ALTER TABLE migration_schema.table_name OWNER TO migration_user;

Run the ALTER TABLE command for each table to replicate.

CREATE USER 'migration_user'@'%' IDENTIFIED BY 'password';

Grant the user privileges to select only the tables you migrate:

GRANT SELECT ON source_database.* TO 'migration_user'@'%';
FLUSH PRIVILEGES;

For replication, grant additional privileges for binlog access:

GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'migration_user'@'%';
FLUSH PRIVILEGES;

CREATE USER MIGRATION_USER IDENTIFIED BY 'password';

Note:

When migrating from Oracle Multitenant (PDB/CDB), this should be a common user. Prefix the username with C## (e.g., C##MIGRATION_USER).

Grant the user privileges to connect, read metadata, and SELECT and FLASHBACK the tables you plan to migrate. The tables should all reside in a single schema (e.g., migration_schema). For details, refer to Schema and table filtering.

Oracle Multitenant (PDB/CDB) user privileges

Connect to the Oracle CDB as a DBA and grant the following:

-- Basic access
GRANT CONNECT TO C##MIGRATION_USER;
GRANT CREATE SESSION TO C##MIGRATION_USER;

-- General metadata access
GRANT EXECUTE_CATALOG_ROLE TO C##MIGRATION_USER;
GRANT SELECT_CATALOG_ROLE TO C##MIGRATION_USER;

-- Access to necessary V$ views
GRANT SELECT ON V_$LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOGFILE TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO C##MIGRATION_USER;
GRANT SELECT ON V_$ARCHIVED_LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$DATABASE TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOG_HISTORY TO C##MIGRATION_USER;

-- Direct grants to specific DBA views
GRANT SELECT ON ALL_USERS TO C##MIGRATION_USER;
GRANT SELECT ON DBA_USERS TO C##MIGRATION_USER;
GRANT SELECT ON DBA_OBJECTS TO C##MIGRATION_USER;
GRANT SELECT ON DBA_SYNONYMS TO C##MIGRATION_USER;
GRANT SELECT ON DBA_TABLES TO C##MIGRATION_USER;

Connect to the Oracle PDB (not the CDB) as a DBA and grant the following:

-- Allow C##MIGRATION_USER to connect to the PDB and see active transaction metadata
GRANT CONNECT TO C##MIGRATION_USER;
GRANT CREATE SESSION TO C##MIGRATION_USER;

-- General metadata access
GRANT SELECT_CATALOG_ROLE TO C##MIGRATION_USER;

-- Access to necessary V$ views
GRANT SELECT ON V_$SESSION TO C##MIGRATION_USER;
GRANT SELECT ON V_$TRANSACTION TO C##MIGRATION_USER;

-- Grant these two for every table to migrate in the migration_schema
GRANT SELECT, FLASHBACK ON migration_schema.tbl TO C##MIGRATION_USER;

Single-tenant Oracle user privileges

Connect to the Oracle database as a DBA and grant the following:

-- Basic access
GRANT CONNECT TO MIGRATION_USER;
GRANT CREATE SESSION TO MIGRATION_USER;

-- General metadata access
GRANT SELECT_CATALOG_ROLE TO MIGRATION_USER;
GRANT EXECUTE_CATALOG_ROLE TO MIGRATION_USER;

-- Access to necessary V$ views
GRANT SELECT ON V_$DATABASE TO MIGRATION_USER;
GRANT SELECT ON V_$SESSION TO MIGRATION_USER;
GRANT SELECT ON V_$TRANSACTION TO MIGRATION_USER;

-- Direct grants to specific DBA views
GRANT SELECT ON ALL_USERS TO MIGRATION_USER;
GRANT SELECT ON DBA_USERS TO MIGRATION_USER;
GRANT SELECT ON DBA_OBJECTS TO MIGRATION_USER;
GRANT SELECT ON DBA_SYNONYMS TO MIGRATION_USER;
GRANT SELECT ON DBA_TABLES TO MIGRATION_USER;

-- Grant these two for every table to migrate in the migration_schema
GRANT SELECT, FLASHBACK ON migration_schema.tbl TO MIGRATION_USER;

Configure source database for replication

Note:

Connect to the primary instance (PostgreSQL primary, MySQL primary/master, or Oracle primary), not a replica. Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.

Verify that you are connected to the primary server by running SELECT pg_is_in_recovery(); and getting a false result.

Enable logical replication by setting wal_level to logical in postgresql.conf or in the SQL shell. For example:

ALTER SYSTEM SET wal_level = 'logical';

Enable global transaction identifiers (GTID) and configure binary logging. Set binlog-row-metadata or binlog-row-image to full to provide complete metadata for replication.

Configure binlog retention to ensure GTIDs remain available throughout the migration:

MySQL 8.0.1+: Set binlog_expire_logs_seconds (default: 2592000 = 30 days) based on your migration timeline.
MySQL < 8.0: Set expire_logs_days, or manually manage retention by setting max_binlog_size and using PURGE BINARY LOGS BEFORE NOW() - INTERVAL 1 HOUR (adjusting the interval as needed). Force binlog rotation with FLUSH BINARY LOGS if needed.
Managed services: Refer to provider-specific configuration for Amazon RDS or Google Cloud SQL.

Version	Configuration
MySQL 5.6	`--gtid-mode=on` `--enforce-gtid-consistency=on` `--server-id={unique_id}` `--log-bin=mysql-binlog` `--binlog-format=row` `--binlog-row-image=full` `--log-slave-updates=ON`
MySQL 5.7	`--gtid-mode=on` `--enforce-gtid-consistency=on` `--binlog-row-image=full` `--server-id={unique_id}` `--log-bin=log-bin`
MySQL 8.0+	`--gtid-mode=on` `--enforce-gtid-consistency=on` `--binlog-row-metadata=full`
MariaDB	`--log-bin` `--server_id={unique_id}` `--log-basename=master1` `--binlog-format=row` `--binlog-row-metadata=full`

Enable ARCHIVELOG and FORCE LOGGING

Enable ARCHIVELOG mode for LogMiner to access archived redo logs:

-- Check current log mode
SELECT log_mode FROM v$database;

-- Enable ARCHIVELOG (requires database restart)
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Verify ARCHIVELOG is enabled
SELECT log_mode FROM v$database; -- Expected: ARCHIVELOG

Enable supplemental logging for primary keys:

-- Enable minimal supplemental logging for primary keys
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;

-- Verify supplemental logging status
SELECT supplemental_log_data_min, supplemental_log_data_pk FROM v$database;
-- Expected:
--   SUPPLEMENTAL_LOG_DATA_MIN: IMPLICIT (or YES)
--   SUPPLEMENTAL_LOG_DATA_PK: YES

Enable FORCE LOGGING to ensure all changes are logged:

ALTER DATABASE FORCE LOGGING;

-- Verify FORCE LOGGING is enabled
SELECT force_logging FROM v$database; -- Expected: YES

Create source sentinel table

Create a checkpoint table called _replicator_sentinel in the Oracle schema you will migrate:

CREATE TABLE migration_schema."_replicator_sentinel" (
  keycol NUMBER PRIMARY KEY,
  lastSCN NUMBER
);

Grant privileges to modify the checkpoint table. In Oracle Multitenant, grant the privileges on the PDB:

GRANT SELECT, INSERT, UPDATE ON migration_schema."_replicator_sentinel" TO C##MIGRATION_USER;

Grant LogMiner privileges

Grant LogMiner privileges. In Oracle Multitenant, grant the permissions on the CDB:

-- Access to necessary V$ views
GRANT SELECT ON V_$LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOGFILE TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO C##MIGRATION_USER;
GRANT SELECT ON V_$ARCHIVED_LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOG_HISTORY TO C##MIGRATION_USER;

-- SYS-prefixed views (for full dictionary access)
GRANT SELECT ON SYS.V_$LOGMNR_DICTIONARY TO C##MIGRATION_USER;
GRANT SELECT ON SYS.V_$LOGMNR_LOGS TO C##MIGRATION_USER;
GRANT SELECT ON SYS.V_$LOGMNR_PARAMETERS TO C##MIGRATION_USER;
GRANT SELECT ON SYS.V_$LOGMNR_SESSION TO C##MIGRATION_USER;

-- Access to LogMiner views and controls
GRANT LOGMINING TO C##MIGRATION_USER;
GRANT EXECUTE ON DBMS_LOGMNR TO C##MIGRATION_USER;

The user must:

Query redo logs from LogMiner.
Retrieve active transaction information to determine the starting point for ongoing replication.
Update the internal _replicator_sentinel table created on the Oracle source schema by the DBA.

Verify LogMiner privileges

Query the locations of redo files in the Oracle database:

SELECT
    l.GROUP#,
    lf.MEMBER,
    l.FIRST_CHANGE# AS START_SCN,
    l.NEXT_CHANGE# AS END_SCN
FROM
    V$LOG l
JOIN
    V$LOGFILE lf
ON
    l.GROUP# = lf.GROUP#;

   GROUP# MEMBER                                       START_SCN                END_SCN
_________ _________________________________________ ____________ ______________________
        3 /opt/oracle/oradata/ORCLCDB/redo03.log         1232896    9295429630892703743
        2 /opt/oracle/oradata/ORCLCDB/redo02.log         1155042                1232896
        1 /opt/oracle/oradata/ORCLCDB/redo01.log         1141934                1155042

3 rows selected.

Get the current snapshot System Change Number (SCN):

SELECT CURRENT_SCN FROM V$DATABASE;

CURRENT_SCN
-----------
2358840

1 row selected.

Add the redo log files to LogMiner, using the redo log file paths you queried:

EXEC DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME => '/opt/oracle/oradata/ORCLCDB/redo01.log', OPTIONS => DBMS_LOGMNR.NEW);
EXEC DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME => '/opt/oracle/oradata/ORCLCDB/redo02.log', OPTIONS => DBMS_LOGMNR.ADDFILE);
EXEC DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME => '/opt/oracle/oradata/ORCLCDB/redo03.log', OPTIONS => DBMS_LOGMNR.ADDFILE);

Start LogMiner, specifying the SCN you queried:

EXEC DBMS_LOGMNR.START_LOGMNR(
  STARTSCN => 2358840,
  OPTIONS  => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG
);

PL/SQL procedure successfully completed.

Tip:

If you receive ORA-01435: user does not exist, the Oracle user lacks sufficient LogMiner privileges. Refer to Grant LogMiner privileges.

Prepare the target database

Create the target schema

Convert the source schema into a CockroachDB-compatible schema. CockroachDB supports the PostgreSQL wire protocol and is largely compatible with PostgreSQL syntax.

The source and target schemas must match. Review Type mapping to understand which source types can be mapped to CockroachDB types.

For example, a source table defined as CREATE TABLE migration_schema.tbl (pk INT PRIMARY KEY); must have a corresponding schema and table in CockroachDB:
```
CREATE SCHEMA migration_schema;
CREATE TABLE migration_schema.tbl (pk INT PRIMARY KEY);
```
- MOLT Fetch can automatically create a matching CockroachDB schema using the drop-on-target-and-recreate option.
- If you create the target schema manually, review how MOLT Fetch handles type mismatches. You can use the MOLT Schema Conversion Tool to create a matching schema.
- By default, table and column names are case-insensitive in MOLT Fetch. If using the --case-sensitive flag, schema, table, and column names must match Oracle's default uppercase identifiers. Use quoted names on the target to preserve case. For example, the following CockroachDB SQL statement will error:
```
CREATE TABLE co.stores (... store_id ...);
```
  It should be written as:
```
CREATE TABLE "CO"."STORES" (... "STORE_ID" ...);
```
  When using --case-sensitive, quote all identifiers and match the case exactly (for example, use "CO"."STORES" and "STORE_ID").
Every table must have an explicit primary key. For more information, refer to Primary key best practices.

Warning:

Avoid using sequential keys. To learn more about the performance issues that can result from their use, refer to the guidance on indexing with sequential keys. If a sequential key is necessary in your CockroachDB table, you must create it manually, after using MOLT Fetch to load and replicate the data.
Review Transformations to understand how computed columns and partitioned tables can be mapped to the target, and how target tables can be renamed.
By default on CockroachDB, INT is an alias for INT8, which creates 64-bit signed integers. PostgreSQL and MySQL default to 32-bit integers. Depending on your source database or application requirements, you may need to change the integer size to 4. For more information, refer to Considerations for 64-bit signed integers.

Schema Conversion Tool

The MOLT Schema Conversion Tool (SCT) automates target schema creation. It requires a free CockroachDB Cloud account.

Upload a source .sql file to convert the syntax and identify unimplemented features and syntax incompatibilities in the schema.
Import the converted schema to a CockroachDB cluster:
- When migrating to CockroachDB Cloud, the Schema Conversion Tool automatically applies the converted schema to a new Cloud database.
- When migrating to a self-hosted CockroachDB cluster, export a converted schema file and pipe the data definition language (DDL) directly into cockroach sql.

Syntax that cannot automatically be converted will be displayed in the Summary Report. These may include the following:

String case sensitivity

Strings are case-insensitive in MySQL and case-sensitive in CockroachDB. You may need to edit your MySQL data to get the results you expect from CockroachDB. For example, you may have been doing string comparisons in MySQL that will need to be changed to work with CockroachDB.

For more information about the case sensitivity of strings in MySQL, refer to Case Sensitivity in String Searches from the MySQL documentation. For more information about CockroachDB strings, refer to STRING.

Identifier case sensitivity

Identifiers are case-sensitive in MySQL and case-insensitive in CockroachDB. When using the Schema Conversion Tool, you can either keep case sensitivity by enclosing identifiers in double quotes, or make identifiers case-insensitive by converting them to lowercase.

`AUTO_INCREMENT` attribute

The MySQL AUTO_INCREMENT attribute, which creates sequential column values, is not supported in CockroachDB. When using the Schema Conversion Tool, columns with AUTO_INCREMENT can be converted to use sequences, UUID values with gen_random_uuid(), or unique INT8 values using unique_rowid(). Cockroach Labs does not recommend using a sequence to define a primary key column. For more information, refer to Unique ID best practices.

Note:

Changing a column type during schema conversion will cause MOLT Verify to identify a type mismatch during data validation. This is expected behavior.

`ENUM` type

MySQL ENUM types are defined in table columns. On CockroachDB, ENUM is a standalone type. When using the Schema Conversion Tool, you can either deduplicate the ENUM definitions or create a separate type for each column.

`TINYINT` type

TINYINT data types are not supported in CockroachDB. The Schema Conversion Tool automatically converts TINYINT columns to INT2 (SMALLINT).

Geospatial types

MySQL geometry types are not converted to CockroachDB geospatial types by the Schema Conversion Tool. They should be manually converted to the corresponding types in CockroachDB.

`FIELD` function

The MYSQL FIELD function is not supported in CockroachDB. Instead, you can use the array_position function, which returns the index of the first occurrence of element in the array.

Example usage:

SELECT array_position(ARRAY[4,1,3,2],1);

  array_position
------------------
               2
(1 row)

While MySQL returns 0 when the element is not found, CockroachDB returns NULL. So if you are using the ORDER BY clause in a statement with the array_position function, the caveat is that sort is applied even when the element is not found. As a workaround, you can use the COALESCE operator.

SELECT * FROM table_a ORDER BY COALESCE(array_position(ARRAY[4,1,3,2],5),999);

Drop constraints and indexes

To optimize data load performance, drop all non-PRIMARY KEY constraints and indexes on the target CockroachDB database before migrating:

FOREIGN KEY
UNIQUE
Secondary indexes
CHECK
DEFAULT
NOT NULL (you do not need to drop this constraint when using drop-on-target-and-recreate for table handling)

Warning:

Do not drop PRIMARY KEY constraints.

You can recreate the constraints and indexes after loading the data.

Create the SQL user

Create a SQL user in the CockroachDB cluster that has the necessary privileges.

To create a user crdb_user in the default database (you will pass this username in the target connection string):

CREATE USER crdb_user WITH PASSWORD 'password';

Grant database-level privileges for schema creation within the target database:

GRANT ALL ON DATABASE defaultdb TO crdb_user;

Grant user privileges to create internal MOLT tables like _molt_fetch_exceptions in the public schema:

Note:

Ensure that you are connected to the target database.

GRANT CREATE ON SCHEMA public TO crdb_user;

If you manually created the target schema (i.e., drop-on-target-and-recreate will not be used), grant the following privileges on the schema:

GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA migration_schema TO crdb_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user;

Grant the same privileges for internal MOLT tables:

GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO crdb_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO crdb_user;

Depending on the MOLT Fetch data load mode you will use, grant the necessary privileges to run either IMPORT INTO or COPY FROM on the target tables:

`IMPORT INTO` privileges

Grant SELECT, INSERT, and DROP (required because the table is taken offline during the IMPORT INTO) privileges on all tables in the target schema:

GRANT SELECT, INSERT, DROP ON ALL TABLES IN SCHEMA migration_schema TO crdb_user;

If you plan to use cloud storage with implicit authentication for data load, grant the EXTERNALIOIMPLICITACCESS system-level privilege:

GRANT EXTERNALIOIMPLICITACCESS TO crdb_user;

`COPY FROM` privileges

Grant admin privileges to the user:

GRANT admin TO crdb_user;

Replication privileges

Grant permissions to create the staging schema for replication:

ALTER USER crdb_user CREATEDB;

Configure GC TTL

Before starting the initial data load, configure the garbage collection (GC) TTL on the source CockroachDB cluster to ensure that historical data remains available when replication begins. The GC TTL must be long enough to cover the full duration of the data load.

Increase the GC TTL before starting the data load. For example, to set the GC TTL to 24 hours:

ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = 86400;

Note:

The GC TTL duration must be higher than your expected time for the initial data load.

Once replication has started successfully (which automatically protects its own data range), you can restore the GC TTL to its original value. For example, to restore to 5 minutes:

ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = 300;

For details, refer to Protect Changefeed Data from Garbage Collection.

Configure Fetch

When you run molt fetch, you can configure the following options for data load:

Connection strings: Specify URL‑encoded source and target connections.
Intermediate file storage: Export data to cloud storage or a local file server.
Table handling mode: Determine how existing target tables are initialized before load.
Schema and table filtering: Specify schema and table names to migrate.
Data load mode: Choose between IMPORT INTO and COPY FROM.
Fetch metrics: Configure metrics collection during initial data load.

Connection strings

Define the connection strings for the source and target databases, and keep them secure.

Source connection string

The --source flag specifies the connection string for the source database:

Note:

The source connection must point to the primary instance (PostgreSQL primary, MySQL primary/master, or Oracle primary). Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.

--source 'postgres://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'

For example:

--source 'postgres://migration_user:password@localhost:5432/molt?sslmode=verify-full'

--source 'mysql://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full&sslcert={path_to_client_crt}&sslkey={path_to_client_key}&sslrootcert={path_to_ca_crt}'

For example:

--source 'mysql://migration_user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt'

--source 'oracle://{username}:{password}@{host}:{port}/{service_name}'

In Oracle Multitenant, --source specifies the connection string for the PDB. --source-cdb specifies the connection string for the CDB. The username specified in both --source and --source-cdb must be a common user with the privileges described in Create migration user on source database.

--source 'oracle://{username}:{password}@{host}:{port}/{PDB_service_name}'
--source-cdb 'oracle://{username}:{password}@{host}:{port}/{CDB_service_name}'

Escape the C## prefix in the Oracle Multitenant username. For example, write C##MIGRATION_USER as C%23%23:

--source 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLPDB1'
--source-cdb 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLCDB'

Target connection string

The --target flag specifies the connection string for the target CockroachDB database:

--target 'postgres://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'

For example:

--target 'postgres://crdb_user:password@localhost:26257/defaultdb?sslmode=verify-full'

For details, refer to Connect using a URL.

Secure connections

To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:
- Avoid plaintext connection strings.
- Provide your connection strings as environment variables. For example:
```
export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full"
export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"
```
  Afterward, reference the environment variables in MOLT commands:
```
--source $SOURCE
--target $TARGET
```
- If possible, use an external secrets manager to load the environment variables from stored secrets.
Use TLS-enabled connection strings to encrypt data in transit from MOLT to the database. When using TLS certificates, ensure certificate files are accessible to the MOLT binary on the same machine.

For example, a PostgreSQL connection string with TLS certificates:
```
postgresql://migration_user@db.example.com:5432/appdb?sslmode=verify-full&sslrootcert=/etc/molt/certs/ca.pem&sslcert=/etc/molt/certs/client.crt&sslkey=/etc/molt/certs/client.key
```
URL-encode connection strings for the source database and CockroachDB so special characters in passwords are handled correctly.
- Given a password a$52&, pass it to the molt escape-password command with single quotes:
```
molt escape-password --password 'a$52&'
```
  Use the encoded password in your connection string. For example:
```
postgres://migration_user:a%2452%26@localhost:5432/replicationload
```
Remove sslmode=disable from production connection strings.

Intermediate file storage

MOLT Fetch can write intermediate files to either a cloud storage bucket or a local file server:

Destination	MOLT Fetch flag(s)	Address and authentication
Cloud storage	`--bucket-path`	Specify a `s3://bucket/path`, `gs://bucket/path`, or `azure-blob://bucket/path` URL. AWS S3: Set `AWS_REGION`, `AWS_SECRET_ACCESS_KEY`, `AWS_ACCESS_KEY_ID` as environment variables or use `--use-implicit-auth`, `--assume-role`, and/or `--import-region`. Refer to Cloud storage authentication. GCS: Authenticate with Application Default Credentials or use `--use-implicit-auth`. Refer to Cloud storage authentication. Azure Blob Storage: Set `AZURE_ACCOUNT_NAME` and `AZURE_ACCOUNT_KEY` as environment variables or use `--use-implicit-auth`. Refer to Cloud storage authentication.
Local file server	`--local-path` `--local-path-listen-addr` `--local-path-crdb-access-addr`	Write to `--local-path` on a local file server at `--local-path-listen-addr`; if the target CockroachDB cluster cannot reach this address, specify a publicly accessible address with `--local-path-crdb-access-addr`. No additional authentication is required.

Tip:

Cloud storage is often preferred over a local file server, which may require significant disk space.

Cloud storage authentication

Ensure that access control is properly configured for Amazon S3, Google Cloud Storage, or Azure Blob Storage.

Amazon S3

Set the following environment variables in the terminal running molt fetch:
```
export AWS_REGION='us-east-1'
export AWS_SECRET_ACCESS_KEY='key'
export AWS_ACCESS_KEY_ID='id'
```
- To run molt fetch in a containerized environment (e.g., Docker), pass the required environment variables using -e. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using -v. For example:
```
docker run \
  -e AWS_ACCESS_KEY_ID='your-access-key' \
  -e AWS_SECRET_ACCESS_KEY='your-secret-key' \
  -v ~/.aws:/root/.aws \
  -it \
  cockroachdb/molt fetch \
  --bucket-path 's3://migration/data/cockroach' ...
```
Alternatively, set --use-implicit-auth to use implicit authentication. When using assume role authentication, specify the service account with --assume-role. For example:
```
--bucket-path 's3://migration/data/cockroach'
--assume-role 'arn:aws:iam::123456789012:role/MyMigrationRole'
--use-implicit-auth
```
Set --import-region to specify an AWS_REGION (e.g., --import-region 'ap-south-1').
Ensure the S3 bucket is created and accessible by authorized roles and users only.

Google Cloud Storage

Authenticate your local environment with Application Default Credentials:

Using gcloud:
```
gcloud init
gcloud auth application-default login
```
Using the environment variable:
```
export GOOGLE_APPLICATION_CREDENTIALS={path_to_cred_json}
```
- To run molt fetch in a containerized environment (e.g., Docker), pass the required environment variables using -e. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using -v. For example:
```
docker run \
  -e GOOGLE_APPLICATION_CREDENTIALS='/root/.config/gcloud/application_default_credentials.json' \
  -v ~/.config/gcloud:/root/.config/gcloud \
  -it \
  cockroachdb/molt fetch \
  --bucket-path 'gs://migration/data/cockroach' ...
```
Alternatively, set --use-implicit-auth to use implicit authentication. When using assume role authentication, specify the service account with --assume-role. For example:
```
--bucket-path 'gs://migration/data/cockroach
--use-implicit-auth
--assume-role 'user-test@cluster-ephemeral.iam.gserviceaccount.com'
```
Ensure the Google Cloud Storage bucket is created and accessible by authorized roles and users only.

Azure Blob Storage

Set the following environment variables in the terminal running molt fetch:
```
export AZURE_ACCOUNT_NAME='account'
export AZURE_ACCOUNT_KEY='key'
```
You can also speicfy client and tenant credentials as environment variables:
```
export AZURE_CLIENT_SECRET='secret'
export AZURE_TENANT_ID='id'
```
- To run molt fetch in a containerized environment (e.g., Docker), pass the required environment variables using -e. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using -v. For example:
```
docker run \
  -e AZURE_ACCOUNT_NAME='account' \
  -e AZURE_ACCOUNT_KEY='key' \
  -e AZURE_CLIENT_SECRET='secret' \
  -e AZURE_TENANT_ID='id' \
  -v ~/.azure:/root/.azure \
  -it \
  cockroachdb/molt fetch \
  --bucket-path 'azure-blob://migration/data/cockroach' ...
```
Alternatively, set --use-implicit-auth to use implicit authentication: For example:
```
--bucket-path 'azure-blob://migration/data/cockroach'
--use-implicit-auth
```
This mode supports Azure managed identities and workload identities.
Ensure the Azure Blob Storage container is created and accessible by authorized roles and users only.

Table handling mode

MOLT Fetch can initialize target tables on the CockroachDB database in one of three modes using --table-handling:

Mode	MOLT Fetch flag	Description
`none`	Default mode	Loads data into existing tables without altering schema or data. Exits early if schemas mismatch in some cases.
`truncate-if-exists`	`--table-handling truncate-if-exists`	Truncates target tables before loading data. Exits early if schemas mismatch in some cases.
`drop-on-target-and-recreate`	`--table-handling drop-on-target-and-recreate`	Drops and recreates target tables before loading data. Automatically creates missing tables with `PRIMARY KEY` and `NOT NULL` constraints.

Use none when you need to retain existing data and schema.
Use --table-handling truncate-if-exists to clear existing data while preserving schema definitions.
Use --table-handling drop-on-target-and-recreate for initial imports or when source and target schemas differ, letting MOLT Fetch generate compatible tables automatically.

Note:

When using the drop-on-target-and-recreate option, only PRIMARY KEY and NOT NULL constraints are preserved on the target tables. Other constraints, such as FOREIGN KEY references, UNIQUE, or DEFAULT value expressions, are not retained.

To guide schema creation with drop-on-target-and-recreate, you can explicitly map source types to CockroachDB types. Refer to Type mapping.

Schema and table filtering

MOLT Fetch can restrict which schemas (or users) and tables are migrated by using the following filter flags:

Filter type	Flag	Description
Schema filter	`--schema-filter`	POSIX regex matching schema names to include; all matching schemas and their tables are moved.
Table filter	`--table-filter`	POSIX regex matching table names to include across all selected schemas.
Table exclusion filter	`--table-exclusion-filter`	POSIX regex matching table names to exclude across all selected schemas.

Tip:

Use --schema-filter to migrate only the specified schemas, and refine which tables are moved using --table-filter or --table-exclusion-filter.

When migrating from Oracle, you must include --schema-filter to name an Oracle schema to migrate. This prevents Fetch from attempting to load tables owned by other users. For example:

--schema-filter 'migration_schema'

Table filter userscript

When loading a subset of tables using --table-filter, you must provide a TypeScript userscript to specify which tables to replicate.

For example, the following table_filter.ts userscript filters change events to the specified source tables:

import * as api from "replicator@v1";

// List the source tables (matching source names and casing) to include in replication
const allowedTables = ["EMPLOYEES", "PAYMENTS", "ORDERS"];

// Update this to your target CockroachDB database and schema name
api.configureSource("defaultdb.migration_schema", {
  dispatch: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Replicate only if the table matches one of the allowed tables
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    // Ignore all other tables
    return null;
  },
  deletesTo: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Optionally filter deletes the same way
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    return null;
  },
});

Pass the userscript to MOLT Replicator with the --userscript flag:

--userscript table_filter.ts

Table filter userscript

When loading a subset of tables using --table-filter, you must provide a TypeScript userscript to specify which tables to replicate.

For example, the following table_filter.ts userscript filters change events to the specified source tables:

import * as api from "replicator@v1";

// List the source tables (matching source names and casing) to include in replication
const allowedTables = ["EMPLOYEES", "PAYMENTS", "ORDERS"];

// Update this to your target CockroachDB database and schema name
api.configureSource("defaultdb.migration_schema", {
  dispatch: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Replicate only if the table matches one of the allowed tables
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    // Ignore all other tables
    return null;
  },
  deletesTo: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Optionally filter deletes the same way
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    return null;
  },
});

Pass the userscript to MOLT Replicator with the --userscript flag:

--userscript table_filter.ts

Data load mode

MOLT Fetch can use either IMPORT INTO or COPY FROM to load data into CockroachDB:

Statement	MOLT Fetch flag	Description
`IMPORT INTO`	Default mode	Fastest option, but takes target tables offline during load. Supports compression using the `--compression` flag to reduce storage used during export. Executes as a distributed background job in CockroachDB, so is more efficient for large, wide, or heavily partitioned tables.
`COPY FROM`	`--use-copy` or `--direct-copy`	Slower than `IMPORT INTO`, but keeps target tables online and queryable during load. Does not support compression. Runs on the MOLT host and streams data row-by-row, which can increase memory usage and limit concurrency for large tables (many rows) or wide tables (many columns or large values like `JSONB`).

Use IMPORT INTO (the default mode) for large datasets, wide rows, or partitioned tables.
Use --use-copy when tables must remain online during data load.
Use --direct-copy only when you cannot move data to a public cloud, or want to perform local testing without intermediate storage. In this case, no intermediate file storage is used.

Fetch metrics

By default, MOLT Fetch exports Prometheus metrics at http://127.0.0.1:3030/metrics. You can override the address with --metrics-listen-addr '{host}:{port}', where the endpoint will be http://{host}:{port}/metrics.

Cockroach Labs recommends monitoring the following metrics during data load:

Metric Name	Description
`molt_fetch_num_tables`	Number of tables that will be moved from the source.
`molt_fetch_num_task_errors`	Number of errors encountered by the fetch task.
`molt_fetch_overall_duration`	Duration (in seconds) of the fetch task.
`molt_fetch_rows_exported`	Number of rows that have been exported from a table. For example: `molt_fetch_rows_exported{table="public.users"}`
`molt_fetch_rows_imported`	Number of rows that have been imported from a table. For example: `molt_fetch_rows_imported{table="public.users"}`
`molt_fetch_table_export_duration_ms`	Duration (in milliseconds) of a table's export. For example: `molt_fetch_table_export_duration_ms{table="public.users"}`
`molt_fetch_table_import_duration_ms`	Duration (in milliseconds) of a table's import. For example: `molt_fetch_table_import_duration_ms{table="public.users"}`

You can also use the sample Grafana dashboard to view the preceding metrics.

Note:

Metrics from the replicator process are enabled by setting the --metricsAddr replication flag, and are served at http://{host}:{port}/_/varz.

To view Oracle-specific metrics from replicator, import this Grafana dashboard.

Start Fetch

Perform the initial load of the source data.

Issue the MOLT Fetch command to move the source data to CockroachDB. This example command passes the source and target connection strings as environment variables, writes intermediate files to S3 storage, and uses the truncate-if-exists table handling mode to truncate the target tables before loading data. It also limits the migration to a single schema and filters three specific tables to migrate. The data load mode defaults to IMPORT INTO.
You must include --pglogical-replication-slot-name and --pglogical-publication-and-slot-drop-and-recreate to automatically create the publication and replication slot during the data load.
```
molt fetch \
--source $SOURCE \
--target $TARGET \
--schema-filter 'migration_schema' \
--table-filter 'employees|payments|orders' \
--bucket-path 's3://migration/data/cockroach' \
--table-handling truncate-if-exists \
--pglogical-replication-slot-name molt_slot \
--pglogical-publication-and-slot-drop-and-recreate
```
```
molt fetch \
--source $SOURCE \
--target $TARGET \
--schema-filter 'migration_schema' \
--table-filter 'employees|payments|orders' \
--bucket-path 's3://migration/data/cockroach' \
--table-handling truncate-if-exists \
```
The command assumes an Oracle Multitenant (CDB/PDB) source. --source-cdb specifies the container database (CDB) connection string.
```
molt fetch \
--source $SOURCE \
--source-cdb $SOURCE_CDB \
--target $TARGET \
--schema-filter 'migration_schema' \
--table-filter 'employees|payments|orders' \
--bucket-path 's3://migration/data/cockroach' \
--table-handling truncate-if-exists \
```

Check the output to observe fetch progress.

The following message shows the appropriate values for the --backfillFromSCN and --scn replication flags to use when starting Replicator:

replication-only mode should include the following replicator flags: --backfillFromSCN 26685444 --scn 26685786

A starting fetch message indicates that the task has started:

{"level":"info","type":"summary","num_tables":3,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}

{"level":"info","type":"summary","num_tables":3,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-28","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}

{"level":"info","type":"summary","num_tables":3,"cdc_cursor":"26685786","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}

data extraction messages are written for each table that is exported to the location in --bucket-path:

{"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:11-05:00","message":"data extraction phase starting"}

{"level":"info","table":"migration_schema.employees","type":"summary","num_rows":200000,"export_duration_ms":1000,"export_duration":"000h 00m 01s","time":"2025-02-10T14:28:12-05:00","message":"data extraction from source complete"}

data import messages are written for each table that is loaded into CockroachDB:

{"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:12-05:00","message":"starting data import on target"}

{"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}

{"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}

{"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"2358840","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}

A fetch complete message is written when the fetch task succeeds:

{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"0/3F41E40","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}

{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}

This message includes a cdc_cursor value. You must set the --defaultGTIDSet replication flag to this value when starting Replicator:

--defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29

{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"2358840","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}

Verify the data load

Use MOLT Verify to confirm that the source and target data is consistent. This ensures that the data load was successful.

Run the MOLT Verify command, specifying the source and target connection strings and the tables to validate.
```
molt verify \
--source $SOURCE \ 
--target $TARGET \
--table-filter 'employees|payments|orders'
```
```
molt verify \
--source $SOURCE \ 
--target $TARGET \
--table-filter 'employees|payments|orders'
```
```
molt verify \
--source $SOURCE \ 
--target $TARGET \
--table-filter 'employees|payments|orders'
```
Note:

With Oracle Multitenant deployments, while --source-cdb is required for fetch, it is not necessary for verify.

Check the output to observe verify progress.

A verification in progress indicates that the task has started:

{"level":"info","time":"2025-02-10T15:35:04-05:00","message":"verification in progress"}

starting verify messages are written for each specified table:

{"level":"info","time":"2025-02-10T15:35:04-05:00","message":"starting verify on public.employees, shard 1/1"}

A finished row verification message containing a summary is written after each table is compared. For details on the summary fields, refer to the MOLT Verify page.

{"level":"info","type":"summary","table_schema":"public","table_name":"employees","num_truth_rows":200004,"num_success":200004,"num_conditional_success":0,"num_missing":0,"num_mismatch":0,"num_extraneous":0,"num_live_retry":0,"num_column_mismatch":0,"time":"2025-02-10T15:35:05-05:00","message":"finished row verification on public.employees (shard 1/1)"}

A verification complete message is written when the verify task succeeds:

{"level":"info","net_duration_ms":699.804875,"net_duration":"000h 00m 00s","time":"2025-02-10T15:35:05-05:00","message":"verification complete"}

Configure Replicator

When you run replicator, you can configure the following options for replication:

Replication connection strings: Specify URL-encoded source and target database connections.
Replication flags: Specify required and optional flags to configure replicator behavior.
Tuning parameters: Optimize replication performance and resource usage.
Replicator metrics: Monitor replication progress and performance.

Replication connection strings

MOLT Replicator uses --sourceConn and --targetConn to specify the source and target database connections.

--sourceConn specifies the connection string of the source database:

--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

--sourceConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'

--sourceConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'

For Oracle Multitenant databases, also specify --sourcePDBConn with the PDB connection string:

--sourcePDBConn 'oracle://{username}:{password}@{host}:{port}/{pdb_service_name}'

--targetConn specifies the target CockroachDB connection string:

--targetConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

Tip:

Follow best practices for securing connection strings. Refer to Secure connections.

Replication flags

The following MOLT Replicator flags are required for continuous replication. For details on all available flags, refer to the MOLT Replicator documentation.

Flag	Description
`--slotName`	Required. PostgreSQL replication slot name. Must match the slot name specified with `--pglogical-replication-slot-name` in the MOLT Fetch command.
`--targetSchema`	Required. Target schema name on CockroachDB where tables will be replicated.
`--stagingSchema`	Required. Staging schema name for replication metadata and checkpoints.
`--stagingCreateSchema`	Required. Automatically create the staging schema if it does not exist. Include this flag when starting replication for the first time.
`--metricsAddr`	Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`.

Flag	Description
`--targetSchema`	Required. Target schema name on CockroachDB where tables will be replicated.
`--defaultGTIDSet`	Required. Default GTID set for changefeed.
`--stagingSchema`	Required. Staging schema name for replication metadata and checkpoints.
`--stagingCreateSchema`	Required. Automatically create the staging schema if it does not exist. Include this flag when starting replication for the first time.
`--fetchMetadata`	Explicitly fetch column metadata for MySQL versions that do not support `binlog_row_metadata`. Requires `SELECT` permissions on the source database or `PROCESS` privileges.
`--metricsAddr`	Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`.
`--userscript`	Path to a userscript that enables table filtering from MySQL sources. Refer to Table filter userscript.

You can find the starting GTID in the cdc_cursor field of the fetch complete message after the initial data load completes.

Flag	Description
`--sourceSchema`	Required. Source schema name on Oracle where tables will be replicated from.
`--targetSchema`	Required. Target schema name on CockroachDB where tables will be replicated.
`--scn`	Required. Snapshot System Change Number (SCN) for the initial changefeed starting point.
`--backfillFromSCN`	Required. SCN of the earliest active transaction at the time of the snapshot. Ensures no transactions are skipped.
`--stagingSchema`	Required. Staging schema name for replication metadata and checkpoints.
`--stagingCreateSchema`	Required. Automatically create the staging schema if it does not exist. Include this flag when starting replication for the first time.
`--metricsAddr`	Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`.
`--userscript`	Path to a userscript that enables table filtering from Oracle sources. Refer to Table filter userscript.

You can find the SCN values in the message replication-only mode should include the following replicator flags after the initial data load completes.

Tuning parameters

Configure the following replicator flags to optimize replication throughput and resource usage. Test different combinations in a pre-production environment to find the optimal balance of stability and performance for your workload.

Note:

The following parameters apply to PostgreSQL, Oracle, and CockroachDB (failback) sources.

Flag	Description
`--parallelism`	Control the maximum number of concurrent target transactions. Higher values increase throughput but require more target connections. Start with a conservative value and increase based on target database capacity.
`--flushSize`	Balance throughput and latency. Controls how many mutations are batched into each query to the target. Increase for higher throughput at the cost of higher latency.
`--targetApplyQueueSize`	Control memory usage during operation. Increase to allow higher throughput at the expense of memory; decrease to apply backpressure and limit memory consumption.
`--targetMaxPoolSize`	Set larger than `--parallelism` by a safety factor to avoid exhausting target pool connections. Replicator enforces setting parallelism to 80% of this value.
`--collapseMutations`	Reduce the number of queries to the target by combining multiple mutations on the same primary key within each batch. Disable only if exact mutation order matters more than end state.
`--enableParallelApplies`	Improve apply throughput for independent tables and table groups that share foreign key dependencies. Increases memory and target connection usage, so ensure you increase `--targetMaxPoolSize` or reduce `--parallelism`.
`--flushPeriod`	Set to the maximum allowable time between flushes (for example, `10s` if data must be applied within 10 seconds). Works with `--flushSize` to control when buffered mutations are committed to the target.
`--quiescentPeriod`	Lower this value if constraint violations resolve quickly on your workload to make retries more frequent and reduce latency. Do not lower if constraint violations take time to resolve.
`--scanSize`	Applies to failback (`replicator start`) scenarios only. Balance memory usage and throughput. Increase to read more rows at once from the CockroachDB staging cluster for higher throughput, at the cost of memory pressure. Decrease to reduce memory pressure and increase stability.

Replicator metrics

MOLT Replicator can export Prometheus metrics by setting the --metricsAddr flag to a port (for example, --metricsAddr :30005). Metrics are not enabled by default. When enabled, metrics are available at the path /_/varz. For example: http://localhost:30005/_/varz.

Cockroach Labs recommends monitoring the following metrics during replication:

Metric Name	Description
`commit_to_stage_lag_seconds`	Time between when a mutation is written to the source database and when it is written to the staging database.
`source_commit_to_apply_lag_seconds`	End-to-end lag from when a mutation is written to the source database to when it is applied to the target CockroachDB.
`apply_conflicts_total`	Number of rows that experienced a compare-and-set (CAS) conflict.
`apply_deletes_total`	Number of rows deleted.
`apply_duration_seconds`	Length of time it took to successfully apply mutations.
`apply_errors_total`	Number of times an error was encountered while applying mutations.
`apply_resolves_total`	Number of rows that experienced a compare-and-set (CAS) conflict and which were resolved.
`apply_upserts_total`	Number of rows upserted.

You can use the Replicator Grafana dashboard to visualize the metrics.

For Oracle-specific metrics, import the Oracle Grafana dashboard.

To check MOLT Replicator health when metrics are enabled, run curl http://localhost:30005/_/healthz (replacing the port with your --metricsAddr value). This returns a status code of 200 if Replicator is running.

Start Replicator

With initial load complete, start replication of ongoing changes on the source to CockroachDB using MOLT Replicator.

Note:

MOLT Fetch captures a consistent point-in-time checkpoint at the start of the data load (shown as cdc_cursor in the fetch output). Starting replication from this checkpoint ensures that all changes made during and after the data load are replicated to CockroachDB, preventing data loss or duplication. The following steps use the checkpoint values from the fetch output to start replication at the correct position.

Run the replicator command, using the same slot name that you specified with --pglogical-replication-slot-name in the Fetch command. Use --stagingSchema to specify a unique name for the staging database, and include --stagingCreateSchema to have MOLT Replicator automatically create the staging database:
```
replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--targetSchema defaultdb.public \
--slotName molt_slot \
--stagingSchema _replicator \
--stagingCreateSchema \
--metricsAddr :30005 \
-v
```

Run the replicator command, specifying the GTID from the checkpoint recorded during data load. Use --stagingSchema to specify a unique name for the staging database, and include --stagingCreateSchema to have MOLT Replicator automatically create the staging database:
```
replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--targetSchema defaultdb.public \
--defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29 \
--stagingSchema _replicator \
--stagingCreateSchema \
--metricsAddr :30005 \
--userscript table_filter.ts \
-v
```
Tip:

For MySQL versions that do not support binlog_row_metadata, include --fetchMetadata to explicitly fetch column metadata. This requires additional permissions on the source MySQL database. Grant SELECT permissions with GRANT SELECT ON source_database.* TO 'migration_user'@'localhost';. If that is insufficient for your deployment, use GRANT PROCESS ON *.* TO 'migration_user'@'localhost';, though this is more permissive and allows seeing processes and server status.

Run the replicator command, specifying the backfill and starting SCN from the checkpoint recorded during data load. Use --stagingSchema to specify a unique name for the staging database, and include --stagingCreateSchema to have MOLT Replicator automatically create the staging database:
```
replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--targetConn $TARGET \
--sourceSchema migration_schema \
--targetSchema defaultdb.public \
--backfillFromSCN 26685444 \
--scn 26685786 \
--stagingSchema _replicator \
--stagingCreateSchema \
--metricsAddr :30005 \
--userscript table_filter.ts \
-v
```
Note:

When filtering out tables in a schema with a userscript, replication performance may decrease because filtered tables are still included in LogMiner queries and processed before being discarded.

Verify replication

Verify that Replicator is processing changes successfully. To do so, check the MOLT Replicator logs. Since you enabled debug logging with -v, you should see connection and row processing messages:

You should see periodic primary keepalive messages:

DEBUG  [Aug 25 14:38:10] primary keepalive received                    ReplyRequested=false ServerTime="2025-08-25 14:38:09.556773 -0500 CDT" ServerWALEnd=0/49913A58
DEBUG  [Aug 25 14:38:15] primary keepalive received                    ReplyRequested=false ServerTime="2025-08-25 14:38:14.556836 -0500 CDT" ServerWALEnd=0/49913E60

When rows are successfully replicated, you should see debug output like the following:

DEBUG  [Aug 25 14:40:02] upserted rows                                 conflicts=0 duration=7.855333ms proposed=1 target="\"molt\".\"public\".\"tbl1\"" upserted=1
DEBUG  [Aug 25 14:40:02] progressed to LSN: 0/49915DD0

You should see binlog syncer connection and row processing:

[2025/08/25 15:29:09] [info] binlogsyncer.go:463 begin to sync binlog from GTID set 77263736-7899-11f0-81a5-0242ac120002:1-38
[2025/08/25 15:29:09] [info] binlogsyncer.go:409 Connected to mysql 8.0.43 server
INFO   [Aug 25 15:29:09] connected to MySQL version 8.0.43

When rows are successfully replicated, you should see debug output like the following:

DEBUG  [Aug 25 15:29:38] upserted rows                                 conflicts=0 duration=1.801ms proposed=1 target="\"molt\".\"public\".\"tbl1\"" upserted=1
<!-- DEBUG  [Aug 25 15:29:38] progressed to consistent point: 77263736-7899-11f0-81a5-0242ac120002:1-39 -->

When transactions are read from the Oracle source, you should see registered transaction IDs (XIDs):

DEBUG  [Jul  3 15:55:12] registered xid 0f001f0040060000
DEBUG  [Jul  3 15:55:12] registered xid 0b001f00bb090000

When rows are successfully replicated, you should see debug output like the following:

DEBUG  [Jul  3 15:55:12] upserted rows                                 conflicts=0 duration=2.620009ms proposed=13 target="\"molt_movies\".\"USERS\".\"CUSTOMER_CONTACT\"" upserted=13
DEBUG  [Jul  3 15:55:12] upserted rows                                 conflicts=0 duration=2.212807ms proposed=16 target="\"molt_movies\".\"USERS\".\"CUSTOMER_DEVICE\"" upserted=16

These messages confirm successful replication. You can disable verbose logging after verifying the connection.

Stop replication and verify data

Stop application traffic to your source database. This begins downtime.
Wait for replication to drain, which means that all transactions that occurred on the source database have been fully processed and replicated to CockroachDB. There are two ways to determine that replication has fully drained:
- When replication is caught up, you will not see new upserted rows logs.
- If you set up the replication metrics endpoint with --metricsAddr in the preceding steps, metrics are available at:
```
http://{host}:{port}/_/varz
```
  Use the following Prometheus alert expression to observe when the combined rate of upserts and deletes is 0 for each schema:
```
sum by (schema) (rate(apply_upserts_total[$__rate_interval]) + rate(apply_deletes_total[$__rate_interval]))
```
Cancel replication to CockroachDB by entering ctrl-c to issue a SIGTERM signal. This returns an exit code 0.
Repeat Verify the data load to verify the updated data.

Modify the CockroachDB schema

Add any constraints or indexes that you previously removed from the CockroachDB schema to facilitate data load.

Note:

If you used the --table-handling drop-on-target-and-recreate option for data load, only PRIMARY KEY and NOT NULL constraints are preserved. You must manually recreate all other constraints and indexes.

For the appropriate SQL syntax, refer to ALTER TABLE ... ADD CONSTRAINT and CREATE INDEX. Review the best practices for creating secondary indexes on CockroachDB.

Cutover

Perform a cutover by resuming application traffic, now to CockroachDB.

Troubleshooting

Fetch issues

Fetch exits early due to mismatches

When run in none or truncate-if-exists mode, molt fetch exits early in the following cases, and will output a log with a corresponding mismatch_tag and failable_mismatch set to true:

A source table is missing a primary key.
A source primary key and target primary key have mismatching types.
Tip:
These restrictions (missing or mismatching primary keys) can be bypassed with --skip-pk-check.
A STRING primary key has a different collation on the source and target.
A source and target column have mismatching types that are not allowable mappings.
A target table is missing a column that is in the corresponding source table.
A source column is nullable, but the corresponding target column is not nullable (i.e., the constraint is more strict on the target).

molt fetch can continue in the following cases, and will output a log with a corresponding mismatch_tag and failable_mismatch set to false:

A target table has a column that is not in the corresponding source table.
A source column has a NOT NULL constraint, and the corresponding target column is nullable (i.e., the constraint is less strict on the target).
A DEFAULT, CHECK, FOREIGN KEY, or UNIQUE constraint is specified on a target column and not on the source column.

ORA-01950: no privileges on tablespace

If you receive ORA-01950: no privileges on tablespace 'USERS', it means the Oracle table owner (migration_schema in the preceding examples) does not have sufficient quota on the tablespace used to store its data. By default, this tablespace is USERS, but it can vary. To resolve this issue, grant a quota to the table owner. For example:

-- change UNLIMITED to a suitable limit for the table owner
ALTER USER migration_schema QUOTA UNLIMITED ON USERS;

No tables to drop and recreate on target

When expecting a bulk load but seeing no tables to drop and recreate on the target, ensure the migration user has SELECT and FLASHBACK privileges on each table to be migrated. For example:

GRANT SELECT, FLASHBACK ON migration_schema.employees TO C##MIGRATION_USER;
GRANT SELECT, FLASHBACK ON migration_schema.payments TO C##MIGRATION_USER;
GRANT SELECT, FLASHBACK ON migration_schema.orders TO C##MIGRATION_USER;

Table or view does not exist

If the Oracle migration user lacks privileges on certain tables, you may receive errors stating that the table or view does not exist. Either use --table-filter to limit the tables to be migrated, or grant the migration user SELECT privileges on all objects in the schema. Refer to Create migration user on source database.

Oracle sessions remain open after forcefully stopping `molt` or `replicator`

If you shut down molt or replicator unexpectedly (e.g., with kill -9 or a system crash), Oracle sessions opened by these tools may remain active.

Check your operating system for any running molt or replicator processes and terminate them manually.
After confirming that both processes have stopped, ask a DBA to check for active Oracle sessions using:
```
SELECT sid, serial#, username, status, osuser, machine, program
FROM v$session
WHERE username = 'C##MIGRATION_USER';
```
Wait until any remaining sessions display an INACTIVE status, then terminate them using:
```
ALTER SYSTEM KILL SESSION 'sid,serial#' IMMEDIATE;
```
Replace sid and serial# in the preceding statement with the values returned by the SELECT query.

Forward replication issues

Performance troubleshooting

If MOLT Replicator appears hung or performs poorly:

Enable trace logging with -vv to get more visibility into the replicator's state and behavior.

If MOLT Replicator is in an unknown, hung, or erroneous state, collect performance profiles to include with support tickets. Replace {host} and {metrics-port} with your Replicator host and the port specified by --metricsAddr:

curl '{host}:{metrics-port}/debug/pprof/trace?seconds=15' > trace.out
curl '{host}:{metrics-port}/debug/pprof/profile?seconds=15' > profile.out
curl '{host}:{metrics-port}/debug/pprof/goroutine?seconds=15' > gr.out
curl '{host}:{metrics-port}/debug/pprof/heap?seconds=15' > heap.out

Monitor lag metrics and adjust performance parameters as needed.

Unable to create publication or slot

This error occurs when logical replication is not supported.

Resolution: If you are connected to a replica, connect to the primary instance instead. Replicas cannot create or manage logical replication slots or publications.

Verify that the source database supports logical replication by checking the wal_level parameter on PostgreSQL:

SHOW wal_level;

If wal_level is not set to logical, update it and restart PostgreSQL:

ALTER SYSTEM SET wal_level = 'logical';

Replication slot already exists

ERROR: replication slot "molt_slot" already exists

Resolution: Either create a new slot with a different name, or drop the existing slot to start fresh:

SELECT pg_drop_replication_slot('molt_slot');

Warning:

Dropping a replication slot can be destructive and delete data that is not yet replicated. Only use this if you want to restart replication from the current position.

Publication does not exist

run CREATE PUBLICATION molt_fetch FOR ALL TABLES;

Resolution: Create the publication on the source database. Ensure you also create the replication slot:

CREATE PUBLICATION molt_publication FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');

Could not connect to PostgreSQL

could not connect to source database: failed to connect to `user=migration_user database=source_database`

Resolution: Verify the connection details including user, host, port, and database name. Ensure the database name in your --sourceConn connection string matches exactly where you created the publication and slot. Verify you're connecting to the same host and port where you ran the CREATE PUBLICATION and SELECT pg_create_logical_replication_slot() commands. Check if TLS certificates need to be included in the connection URI.

Wrong replication slot name

run SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput'); in source database

Resolution: Create the replication slot or verify the correct slot name:

SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');

Repeated binlog syncing restarts

If Replicator repeatedly restarts binlog syncing or starts replication from an unexpectedly old location, this indicates an invalid or purged GTID. When an invalid GTID is provided, the binlog syncer will fall back to the first valid GTID.

Resolution: Verify the GTID set is valid and not purged:

-- Check if GTID is in executed set
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_executed) AS in_executed;

-- Check if GTID is purged
SELECT GTID_SUBSET('your-gtid-set', @@GLOBAL.gtid_purged) AS in_purged;

Interpret the results as follows:

If in_executed returns 1 and in_purged returns 0, the GTID is valid for replication.
If in_purged returns 1, the GTID has been purged and you must find a newer consistent point.
If both return 0, the GTID doesn't exist in the records and is invalid.

If the GTID is purged or invalid, follow these steps:

Increase binlog retention by configuring binlog_expire_logs_seconds in MySQL:
```
-- Increase binlog retention (example: 7 days = 604800 seconds)
SET GLOBAL binlog_expire_logs_seconds = 604800;
```
Note:

For managed MySQL services (such as Amazon RDS, Google Cloud SQL, or Azure Database for MySQL), binlog retention is typically configured through the provider's console or CLI. Consult your provider's documentation for how to adjust binlog retention settings.

Get a current GTID set to restart replication:

-- For MySQL < 8.0:
SHOW MASTER STATUS;
-- For MySQL 8.0+:
SHOW BINARY LOG STATUS;

+---------------+----------+--------------+------------------+-------------------------------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                         |
+---------------+----------+--------------+------------------+-------------------------------------------+
| binlog.000005 |      197 |              |                  | 77263736-7899-11f0-81a5-0242ac120002:1-38 |
+---------------+----------+--------------+------------------+-------------------------------------------+

Use the Executed_Gtid_Set value for the --defaultGTIDSet flag.

Invalid GTID format

Invalid GTIDs can occur when GTIDs are purged due to insufficient binlog retention, when connecting to a replica instead of the primary host, or when passing a GTID that has valid format but doesn't exist in the binlog history.

Resolution: Use a valid GTID from SHOW MASTER STATUS (MySQL < 8.0) or SHOW BINARY LOG STATUS (MySQL 8.0+) and ensure you're connecting to the primary host. If GTIDs are being purged, increase binlog retention.

Table/column names exceed 30 characters

Oracle LogMiner excludes tables and columns with names longer than 30 characters from redo logs.

Resolution: Rename tables and columns to 30 characters or fewer before migration.

Unsupported data types

LogMiner and replication do not support:

Long BLOB/CLOBs (4000+ characters)
User-defined types (UDTs)
Nested tables
Varrays
GEOGRAPHY and GEOMETRY

Resolution: Convert unsupported data types or exclude affected tables from replication.

LOB column UPDATE statements

UPDATE statements that only modify LOB columns are not supported by Oracle LogMiner.

Resolution: Avoid LOB-only updates during replication, or use Binary Reader for Oracle 12c.

JSONB null handling

SQL NULL and JSON null values are not distinguishable in JSON payloads during replication.

Resolution: Avoid using nullable JSONB columns where the distinction between SQL NULL and JSON null is important.

Missing redo logs or unavailable SCN

If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable.

Resolution: Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment.

Replicator lag

If the replicator process is lagging significantly behind the current Oracle SCN, you may see log messages like: replicator is catching up to the current SCN at 5000 from 1000…. This indicates that replication is progressing but is still behind the most recent changes on the source database.

Schema drift errors

Indicates source and target schemas are mismatched:

WARNING: schema drift detected in "database"."table" at payload object offset 0: unexpected columns: column_name

Resolution: Align schemas or use userscripts to transform data.

Apply flow failures

Apply flow failures occur when the target database encounters error conditions such as unique constraint violations, target database being unavailable, or incorrect data (missing or extraneous columns) during apply operations:

WARNING: warning during tryCommit: ERROR: duplicate key value violates unique constraint
ERROR: maximum number of retries (10) exceeded

Resolution: Check target database constraints and connection stability. MOLT Replicator will log warnings for each retry attempt. If you see warnings but no final error, the apply succeeded after retrying. If all retry attempts are exhausted, Replicator will surface a final error and restart the apply loop to continue processing.

Load and Replicate

Before you begin

Prerequisites

Oracle Instant Client

Limitations

Fetch limitations

Replicator limitations

Prepare the source database

Create migration user on source database

Oracle Multitenant (PDB/CDB) user privileges

Single-tenant Oracle user privileges

Configure source database for replication

Enable ARCHIVELOG and FORCE LOGGING

Create source sentinel table

Grant LogMiner privileges

Verify LogMiner privileges

Prepare the target database

Create the target schema

Schema Conversion Tool

String case sensitivity

Identifier case sensitivity

AUTO_INCREMENT attribute

ENUM type

TINYINT type

Geospatial types

FIELD function

Drop constraints and indexes

Create the SQL user

IMPORT INTO privileges

COPY FROM privileges

Replication privileges

Configure GC TTL

Configure Fetch

Connection strings

Source connection string

Target connection string

Secure connections

Intermediate file storage

Cloud storage authentication

Amazon S3

Google Cloud Storage

Azure Blob Storage

Table handling mode

Schema and table filtering

Table filter userscript

Table filter userscript

Data load mode

Fetch metrics

Start Fetch

Verify the data load

Configure Replicator

Replication connection strings

Replication flags

Tuning parameters

Replicator metrics

Start Replicator

Verify replication

Stop replication and verify data

Modify the CockroachDB schema

Cutover

Troubleshooting

Fetch issues

Fetch exits early due to mismatches

ORA-01950: no privileges on tablespace

No tables to drop and recreate on target

Table or view does not exist

Oracle sessions remain open after forcefully stopping molt or replicator

Forward replication issues

Performance troubleshooting

Unable to create publication or slot

Replication slot already exists

Publication does not exist

Could not connect to PostgreSQL

Wrong replication slot name

Repeated binlog syncing restarts

Invalid GTID format

Table/column names exceed 30 characters

Unsupported data types

LOB column UPDATE statements

JSONB null handling

`AUTO_INCREMENT` attribute

`ENUM` type

`TINYINT` type

`FIELD` function

`IMPORT INTO` privileges

`COPY FROM` privileges

Oracle sessions remain open after forcefully stopping `molt` or `replicator`