MOLT Replicator continuously replicates changes from source databases to CockroachDB as part of a database migration. It supports live ongoing migrations to CockroachDB from a source database, and enables backfill from CockroachDB to your source database for failback scenarios to preserve a rollback option during a migration window.
MOLT Replicator consumes change data from CockroachDB changefeeds, PostgreSQL logical replication streams, MySQL GTID-based replication, and Oracle LogMiner. It applies changes to target databases while maintaining configurable consistency , and features an embedded TypeScript/JavaScript environment for configuration and live data transforms.
Supported databases
The following source databases are supported:
- CockroachDB
- PostgreSQL 11-16
- MySQL 5.7, 8.0 and later
- Oracle Database 19c (Enterprise Edition) and 21c (Express Edition)
The following target databases are supported:
- CockroachDB
- PostgreSQL
- MySQL
- Oracle
Installation
To install MOLT, download the binary that matches your architecture and source database:
Operating System | Architecture | PostgreSQL/MySQL | Oracle (Preview) |
---|---|---|---|
Windows | AMD 64-bit | Download | N/A |
ARM 64-bit | Download | N/A | |
Linux | AMD 64-bit | Download | Download |
ARM 64-bit | Download | N/A | |
Mac | AMD 64-bit | Download | N/A |
ARM 64-bit | Download | N/A |
The following binaries are included:
molt
replicator
Both molt
and replicator
must be in your current working directory.
To display the current version of each binary, run molt --version
and replicator --version
.
For previous binaries, refer to the MOLT version manifest. For release details, refer to the MOLT changelog.
Docker images
MOLT Replicator
Docker images for MOLT Replicator are also available as a standalone binary:
docker pull cockroachdb/replicator
To pull a specific version (for example, v1.1.1
):
docker pull cockroachdb/replicator:v1.1.1
For details on running in Docker, refer to Docker usage.
Setup
Complete the following items before using MOLT Replicator:
Follow the recommendations in Best practices.
Configure the source database for replication based on your source type. Refer to the migration workflow tutorials for PostgreSQL, MySQL, Oracle, and CockroachDB sources.
Ensure that the SQL user running MOLT Replicator has appropriate privileges on the source and target databases, as described in the migration workflow tutorials.
Docker usage
For details on pulling Docker images, refer to Docker images.
Performance
MOLT Fetch, Verify, and Replicator are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.
Local connection strings
When testing locally, specify the host as follows:
For macOS, use
host.docker.internal
. For example:--source 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable' --target "postgres://root@host.docker.internal:26257/molt?sslmode=disable"
For Linux and Windows, use
172.17.0.1
. For example:--source 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable' --target "postgres://root@172.17.0.1:26257/molt?sslmode=disable"
How it works
Failback from CockroachDB (start
): MOLT Replicator acts as an HTTP webhook sink for a CockroachDB changefeed. To avoid mixing streams of incoming data, only one changefeed should point to Replicator at a time. Replicator receives mutations from source cluster nodes, can optionally buffer them in a CockroachDB staging cluster, and then applies time-ordered transactional batches to the target database. Mutations are applied as UPSERT
or DELETE
statements while respecting foreign-key and table dependencies.
PostgreSQL source (pglogical
): MOLT Replicator uses PostgreSQL logical replication, which is based on publications and replication slots. You create a publication for the target tables, and a slot marks consistent replication points. MOLT Replicator consumes this logical feed directly and applies the data in sorted batches to the target.
MySQL source (mylogical
): MOLT Replicator relies on MySQL GTID-based replication to read change data from MySQL binlogs. It works with MySQL versions that support GTID-based replication and applies transactionally consistent feeds to the target. Binlog features that do not use GTIDs are not supported.
Oracle source (oraclelogminer
): MOLT Replicator uses Oracle LogMiner to capture change data from Oracle redo logs, supporting replication for a single database user/schema. Both multitenant (CDB) and non-CDB Oracle architectures are supported. Replicator periodically queries LogMiner-populated views and processes transactional data in ascending SCN windows for reliable throughput while maintaining consistency.
Consistency modes
MOLT Replicator offers three consistency modes for balancing throughput and transactional guarantees:
Consistent (default for CockroachDB sources): Preserves per-row order and source transaction atomicity. Concurrent transactions are controlled by
--parallelism
.BestEffort: Relaxes atomicity across tables that do not have foreign key constraints between them (maintains coherence within FK-connected groups). Enable with
--bestEffortOnly
or allow auto-entry via--bestEffortWindow
set to a positive duration (e.g.,1s
).Immediate (default for PostgreSQL, MySQL, and Oracle sources): Applies updates as they arrive to Replicator with no buffering or waiting for resolved timestamps. Provides highest throughput but requires no foreign keys on the target schema.
Commands
Command | Usage |
---|---|
start |
Start the replicator in CockroachDB-to-target (failback) mode. |
pglogical |
Start replication from a PostgreSQL source using logical replication. |
mylogical |
Start replication from a MySQL source using GTID-based replication. |
oraclelogminer |
Start replication from an Oracle source using LogMiner. |
Flags
Replication flags
Flag | Type | Description |
---|---|---|
--applyTimeout |
DURATION |
The maximum amount of time to wait for an update to be applied. Default: 30s |
--dlqTableName |
IDENT |
The name of a table in the target schema for storing dead-letter entries. Default: replicator_dlq |
--enableParallelApplies |
BOOL |
Enable parallel application of independent table groups during replication. By default, applies are synchronous. When enabled, this increases throughput at the cost of higher target pool usage and memory usage. Default: false |
--flushPeriod |
DURATION |
Flush queued mutations after this duration. Default: 1s |
--flushSize |
INT |
Ideal batch size to determine when to flush mutations. Default: 1000 |
--gracePeriod |
DURATION |
Allow background processes to exit. Default: 30s |
--logDestination |
STRING |
Write logs to a file. If not specified, write logs to stdout . |
--logFormat |
STRING |
Choose log output format: "fluent" , "text" .Default: "text" |
--maxRetries |
INT |
Maximum number of times to retry a failed mutation on the target (for example, due to contention or a temporary unique constraint violation) before treating it as a hard failure. Default: 10 |
--metricsAddr |
STRING |
A host:port on which to serve metrics and diagnostics. The metrics endpoint is http://{host}:{port}/_/varz . |
--parallelism |
INT |
The number of concurrent database transactions to use. Default: 16 |
--quiescentPeriod |
DURATION |
How often to retry deferred mutations. Default: 10s |
--retireOffset |
DURATION |
How long to delay removal of applied mutations. Default: 24h0m0s |
--retryInitialBackoff |
DURATION |
Initial delay before the first retry attempt when applying a mutation to the target database fails due to a retryable error, such as contention or a temporary unique constraint violation. Default: 25ms |
--retryMaxBackoff |
DURATION |
Maximum delay between retry attempts when applying mutations to the target database fails due to retryable errors. Default: 2s |
--retryMultiplier |
INT |
Multiplier that controls how quickly the backoff interval increases between successive retries of failed applies to the target database. Default: 2 |
--scanSize |
INT |
The number of rows to retrieve from the staging database used to store metadata for replication. Default: 10000 |
--schemaRefresh |
DURATION |
How often a watcher will refresh its schema. If this value is zero or negative, refresh behavior will be disabled. Default: 1m0s |
--sourceConn |
STRING |
The source database's connection string. When replicating from Oracle, this is the connection string of the Oracle container database (CDB). Refer to Oracle replication flags. |
--stageDisableCreateTableReaderIndex |
BOOL |
Disable the creation of partial covering indexes to improve read performance on staging tables. Set to true if creating indexes on existing tables would cause a significant operational impact.Default: false |
--stageMarkAppliedLimit |
INT |
Limit the number of mutations to be marked applied in a single statement. Default: 100000 |
--stageSanityCheckPeriod |
DURATION |
How often to validate staging table apply order (-1 to disable).Default: 10m0s |
--stageSanityCheckWindow |
DURATION |
How far back to look when validating staging table apply order. Default: 1h0m0s |
--stageUnappliedPeriod |
DURATION |
How often to report the number of unapplied mutations in staging tables (-1 to disable).Default: 1m0s |
--stagingConn |
STRING |
The staging database's connection string. |
--stagingCreateSchema |
Automatically create the staging schema if it does not exist. | |
--stagingIdleTime |
DURATION |
Maximum lifetime of an idle connection. Default: 1m0s |
--stagingJitterTime |
DURATION |
The time over which to jitter database pool disconnections. Default: 15s |
--stagingMaxLifetime |
DURATION |
The maximum lifetime of a database connection. Default: 5m0s |
--stagingMaxPoolSize |
INT |
The maximum number of staging database connections. Default: 128 |
--stagingSchema |
STRING |
Name of the CockroachDB schema that stores replication metadata. Required each time replicator is rerun after being interrupted, as the schema contains a checkpoint table that enables replication to resume from the correct transaction.Default: _replicator.public |
--targetApplyQueueSize |
INT |
Size of the apply queue that buffers mutations before they are written to the target database. Larger values can improve throughput, but increase memory usage. This flag applies only to CockroachDB and PostgreSQL (pglogical ) sources, and replaces the deprecated --copierChannel and --stageCopierChannelSize flags. |
--targetConn |
STRING |
The target database's connection string. |
--targetIdleTime |
DURATION |
Maximum lifetime of an idle connection. Default: 1m0s |
--targetJitterTime |
DURATION |
The time over which to jitter database pool disconnections. Default: 15s |
--targetMaxLifetime |
DURATION |
The maximum lifetime of a database connection. Default: 5m0s |
--targetMaxPoolSize |
INT |
The maximum number of target database connections. Default: 128 |
--targetSchema |
STRING |
The SQL database schema in the target cluster to update. |
--targetStatementCacheSize |
INT |
The maximum number of prepared statements to retain. Default: 128 |
--taskGracePeriod |
DURATION |
How long to allow for task cleanup when recovering from errors. Default: 1m0s |
--timestampLimit |
INT |
The maximum number of source timestamps to coalesce into a target transaction. Default: 1000 |
--userscript |
STRING |
The path to a TypeScript configuration script. For example, --userscript 'script.ts' . |
-v , --verbose |
COUNT |
Increase logging verbosity to debug ; repeat for trace . |
PostgreSQL replication flags
The following flags are used when replicating from a PostgreSQL source database.
Flag | Type | Description |
---|---|---|
--publicationName |
STRING |
The publication within the source database to replicate. |
--slotName |
STRING |
The replication slot in the source database. Default: "replicator" |
--standbyTimeout |
DURATION |
How often to report WAL progress to the source server. Default: 5s |
MySQL replication flags
The following flags are used when replicating from a MySQL source database.
Flag | Type | Description |
---|---|---|
--defaultGTIDSet |
STRING |
Default GTID set, in the format source_uuid:min(interval_start)-max(interval_end) . Required the first time replicator is run, as the GTID set provides a replication marker for streaming changes. |
--fetchMetadata |
Fetch column metadata explicitly, for older versions of MySQL that do not support binlog_row_metadata . |
|
--replicationProcessID |
UINT32 |
The replication process ID to report to the source database. Default: 10 |
Oracle replication flags
The following flags are used when replicating from an Oracle source database.
Flag | Type | Description |
---|---|---|
--sourceSchema |
STRING |
Required. Source schema name on Oracle where tables will be replicated from. |
--scn |
INT |
The snapshot System Change Number (SCN) queried by MOLT Fetch for the initial data load. |
--backfillFromSCN |
INT |
The SCN of the earliest active transaction at the time of the initial snapshot. Ensures no transactions are skipped when starting replication from Oracle. |
--sourcePDBConn |
STRING |
Connection string for the Oracle pluggable database (PDB). Only required when using an Oracle multitenant configuration. --sourceConn must be included. |
--schema-filter |
STRING |
Restricts replication to the specified Oracle PDB schema (user). Set to the PDB user that owns the tables you want to replicate. Without this flag, replication will be attempted on tables from other users. |
--oracle-application-users |
STRING |
List of Oracle usernames responsible for DML transactions in the PDB schema. Enables replication from the latest-possible starting point. Usernames are case-sensitive and must match the internal Oracle usernames (e.g., PDB_USER ). |
Failback replication flags
The following flags are used for failback from CockroachDB.
Flag | Type | Description |
---|---|---|
--assumeIdempotent |
Disable the extra staging table queries that debounce non-idempotent redelivery in changefeeds. | |
--bestEffortOnly |
Eventually-consistent mode; useful for high-throughput, skew-tolerant schemas with foreign keys. | |
--bestEffortWindow |
DURATION |
Use an eventually-consistent mode for initial backfill or when replication is behind; 0 to disable.Default: 1h0m0s |
--bindAddr |
STRING |
The network address to bind to. Default: ":26258" |
--disableAuthentication |
Disable authentication of incoming Replicator requests; not recommended for production. | |
--enableCheckpointStream |
Enable checkpoint streaming (use an internal changefeed from the staging table for real-time updates), rather than checkpoint polling (query the staging table for periodic updates), for failback replication. Default: false (use checkpoint polling) |
|
--discard |
Dangerous: Discard all incoming HTTP requests; useful for changefeed throughput testing. Not intended for production. | |
--discardDelay |
DURATION |
Adds additional delay in discard mode; useful for gauging the impact of changefeed round-trip time (RTT). |
--healthCheckTimeout |
DURATION |
The timeout for the health check endpoint. Default: 5s |
--httpResponseTimeout |
DURATION |
The maximum amount of time to allow an HTTP handler to execute. Default: 2m0s |
--immediate |
Bypass staging tables and write directly to target; recommended only for KV-style workloads with no foreign keys. | |
--limitLookahead |
INT |
Limit number of checkpoints to be considered when computing the resolving range; may cause replication to stall completely if older mutations cannot be applied. |
--ndjsonBufferSize |
INT |
The maximum amount of data to buffer while reading a single line of ndjson input; increase when source cluster has large blob values.Default: 65536 |
--tlsCertificate |
STRING |
A path to a PEM-encoded TLS certificate chain. |
--tlsPrivateKey |
STRING |
A path to a PEM-encoded TLS private key. |
--tlsSelfSigned |
If true, generate a self-signed TLS certificate valid for localhost . |
Usage
The following sections describe how to use MOLT Replicator with different source database types.
Source and target databases
Follow the recommendations in Connection strings.
--sourceConn
--sourceConn
specifies the connection string of the source database.
CockroachDB:
--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
PostgreSQL:
--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
MySQL:
--sourceConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'
Oracle:
--sourceConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'
In Oracle migrations, the --sourceConn
connection string specifies a CDB and the --sourcePDBConn
connection string specifies a PDB (in Oracle Multitenant databases). The {username}
corresponds to the owner of the tables you will migrate.
--targetConn
--targetConn
specifies the target database connection string:
CockroachDB and PostgreSQL:
--targetConn 'postgresql://{username}:{password}@{host}:{port}/{database}'
MySQL:
--targetConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'
Oracle:
--targetConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'
Data transformations
Use --userscript
to specify the path to a TypeScript/JavaScript file for data transformations and filtering. The userscript provides an embedded environment for configuration and live data transforms during replication.
When filtering out tables in a schema with a userscript, Oracle replication performance may decrease because filtered tables are still included in LogMiner queries and processed before being discarded.
Resume replication
MOLT Replicator can resume replication if it is interrupted. Specify the staging schema with the --stagingSchema
flag. MOLT Replicator uses the schema as a replication marker for streaming changes.
--stagingSchema {schema_name}
You must include the --stagingSchema
flag when resuming replication, as the schema provides checkpointing data that enables replication to resume from the correct point.
For detailed steps, refer to Resume replication.
Staging schema
The staging schema stores replication metadata, checkpoint information, and buffered mutations during replication. MOLT Replicator uses this schema to:
- Track replication progress: Store checkpoints that enable resuming from the correct point after interruptions
- Buffer mutations: Temporarily store data changes before applying them to the target in transaction order
- Maintain consistency: Ensure MVCC time-ordered transactional batches while respecting table dependencies
- Enable recovery: Provide restart capabilities with robust failure handling
For fresh replication runs, clean out the staging database memo table to avoid conflicts with previous checkpoints. Use --stagingCreateSchema
to automatically create the staging schema if it doesn't exist.
Best practices
Test and validate
To verify that your connections and configuration work properly, run MOLT Replicator in a staging environment before replicating any data in production. Use a test or development environment that closely resembles production.
Connection strings
URL-encode the connection strings for the source and target databases. This ensures that the MOLT tools can parse special characters in your password.
To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:
Avoid plaintext connection strings.
URL-encode connection strings for the source database and CockroachDB so special characters in passwords are handled correctly.
Given a password
a$52&
, pass it to themolt escape-password
command with single quotes:molt escape-password --password 'a$52&'
Use the encoded password in your
--source
connection string. For example:--source 'postgres://migration_user:a%2452%26@localhost:5432/replicationload'
Provide your connection strings as environment variables. For example:
export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full" export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"
Afterward, reference the environment variables as follows:
--source $SOURCE --target $TARGET
If possible, use an external secrets manager to load the environment variables from stored secrets.
Optimize performance
Configure the following replicator
flags to optimize replication throughput and resource usage. Test different combinations in a staging environment to find the optimal balance of stability and performance for your workload.
The following parameters apply to PostgreSQL, Oracle, and CockroachDB (failback) sources.
Flag | Description |
---|---|
--parallelism |
Control the maximum number of concurrent target transactions. Higher values increase throughput but require more target connections. Start with a conservative value and increase based on target database capacity. |
--flushSize |
Balance throughput and latency. Controls how many mutations are batched into each query to the target. Increase for higher throughput at the cost of higher latency. |
--targetApplyQueueSize |
Control memory usage during operation. Increase to allow higher throughput at the expense of memory; decrease to apply backpressure and limit memory consumption. |
--targetMaxPoolSize |
Set larger than --parallelism by a safety factor to avoid exhausting target pool connections. Cockroach Labs recommends setting parallelism to 80% of this value. |
--collapseMutations |
Reduce the number of queries to the target by combining multiple mutations on the same primary key within each batch. Disable only if exact mutation order matters more than end state. |
--enableParallelApplies |
Improve apply throughput for independent tables and table groups that share foreign key dependencies. Increases memory and target connection usage, so ensure you increase --targetMaxPoolSize or reduce --parallelism . |
--flushPeriod |
Set to the maximum allowable time between flushes (for example, 10s if data must be applied within 10 seconds). Works with --flushSize to control when buffered mutations are committed to the target. |
--quiescentPeriod |
Lower this value if constraint violations resolve quickly on your workload to make retries more frequent and reduce latency. Do not lower if constraint violations take time to resolve. |
--scanSize |
Applies to failback (replicator start ) scenarios only. Balance memory usage and throughput. Increase to read more rows at once for higher throughput, at the cost of memory pressure. Decrease to reduce memory pressure and increase stability. |
Monitoring and observability
Monitor lag metrics (
source_lag_seconds
,target_lag_seconds
) and apply/stage duration/error counters to track replication performance. Set up Prometheus and Grafana to poll the metrics and view them in a dashboard. Use the published Grafana dashboard to visualize these metrics.Pay close attention to warning and error level logging, as it indicates when Replicator is misbehaving. Enable trace logging with
-vv
for additional visibility when troubleshooting.
By default, MOLT Replicator exports Prometheus metrics at the address specified by --metricsAddr
(default :30005
) at the path /_/varz
. For example: http://localhost:30005/_/varz
.
Cockroach Labs recommends monitoring the following metrics during replication:
Metric Name | Description |
---|---|
source_lag_seconds |
CockroachDB sources only: Time between when an incoming resolved MVCC timestamp originated on the source CockroachDB cluster and when it was received by Replicator. |
target_lag_seconds |
CockroachDB sources only: End-to-end lag from when an incoming resolved MVCC timestamp originated on the source CockroachDB to when all data changes up to that timestamp were written to the target database. |
source_lag_seconds_histogram |
CockroachDB sources: Same as source_lag_seconds but stored as a histogram for analyzing distributions over time.Non-CockroachDB sources: Time between when a source transaction is committed and when its COMMIT transaction log arrives at Replicator. |
target_lag_seconds_histogram |
CockroachDB sources: Same as target_lag_seconds but stored as a histogram for analyzing distributions over time.Non-CockroachDB sources: End-to-end lag from when a source transaction is committed to when its changes are fully written to the target CockroachDB. |
replicator_applier_mutations_staged |
Number of mutations that have been staged for application to the target database. |
replicator_applier_mutations_applied |
Number of mutations that have been successfully applied to the target database. |
You can use the Replicator Grafana dashboard to visualize these metrics. For Oracle-specific metrics, import this Oracle Grafana dashboard.
To check MOLT Replicator health, run curl http://localhost:30005/_/healthz
. This returns a status code of 200
if Replicator is running.
Logging
By default, MOLT Replicator writes two streams of logs: operational logs to stdout
(including warning, info, trace, and some errors) and final errors to stderr
.
Redirect both streams properly to ensure all logs are captured for troubleshooting:
# Merge both streams to console
./replicator ... 2>&1
# Redirect both streams to a file
./replicator ... > output.log 2>&1
# Merge streams to console while saving to file
./replicator > >(tee replicator.log) 2>&1
# Use logDestination flag to write all logs to a file
./replicator --logDestination replicator.log ...
Security
Cockroach Labs strongly recommends the following:
Connection security and credentials
- Use TLS-enabled connection strings for
--sourceConn
,--targetConn
, and--stagingConn
parameters. Removesslmode=disable
from production connection strings. - For webhook endpoints (failback scenarios), configure server certificates using
--tlsCertificate
and--tlsPrivateKey
to specify the certificate and private key file paths. - Configure proper TLS certificates in CockroachDB changefeed webhook URLs instead of using
insecure_tls_skip_verify=true
. - If relevant, enable JWT authentication for additional security.
CockroachDB changefeeds
For failback scenarios, generate TLS certificates using self-signed certificates, certificate authorities like Let's Encrypt, or your organization's certificate management system.
Encode certificates for changefeed webhook URLs:
- Webhook URLs: Use both URL encoding and base64 encoding:
base64 -i ./client.crt | jq -R -r '@uri'
- Non-webhook contexts: Use base64 encoding only:
base64 -w 0 ca.cert
Client certificates in changefeed webhook URLs must correspond to server certificates in MOLT Replicator configuration to ensure proper TLS handshake.
Production considerations
- Avoid
--disableAuthentication
and--tlsSelfSigned
flags in production environments. These flags should only be used for testing or development purposes.
Examples
For detailed examples of using MOLT Replicator usage, refer to the migration workflow tutorials:
- Load and replicate: Load data with MOLT Fetch and set up ongoing replication with MOLT Replicator.
- Resume Replication: Resume replication to CockroachDB after an interruption.
- Migration failback: Replicate changes from CockroachDB back to the source database.