These instructions assume you have already installed MOLT and completed the prerequisites for your source dialect.
Prepare the CockroachDB cluster
For details on enabling CockroachDB changefeeds, refer to Create and Configure Changefeeds.
Enable rangefeeds on the CockroachDB cluster:
SET CLUSTER SETTING kv.rangefeed.enabled = true;
Use the following optional settings to increase changefeed throughput. Note that these can impact source cluster performance and stability, especially SQL foreground latency during writes. For details, refer to Advanced Changefeed Configuration.
To lower changefeed emission latency (at the cost of increased SQL foreground latency):
SET CLUSTER SETTING kv.rangefeed.closed_timestamp_refresh_interval = '250ms';
To lower the closed timestamp lag duration:
SET CLUSTER SETTING kv.closed_timestamp.target_duration = '1s';
To improve catch-up speed (at the cost of increased cluster CPU usage):
SET CLUSTER SETTING kv.rangefeed.concurrent_catchup_iterators = 64;
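To confirm that a setting took effect, you can read it back; for example (a quick check using standard CockroachDB syntax):

```sql
-- Confirm that rangefeeds are enabled on the cluster.
SHOW CLUSTER SETTING kv.rangefeed.enabled;
```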
Grant Oracle user permissions
You should have already created a migration user on the source database with the necessary privileges. Refer to Create migration user on source database.
Grant the Oracle user additional `INSERT` and `UPDATE` privileges on the tables to fail back:
GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.employees TO MIGRATION_USER;
GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.payments TO MIGRATION_USER;
GRANT SELECT, INSERT, UPDATE, FLASHBACK ON migration_schema.orders TO MIGRATION_USER;
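To confirm the grants are in place, you can query Oracle's data dictionary (a minimal sketch; assumes you have privileges to query `DBA_TAB_PRIVS` and that the grantee name matches the migration user you created):

```sql
-- List the table privileges held by the migration user.
SELECT table_name, privilege
FROM dba_tab_privs
WHERE grantee = 'MIGRATION_USER';
```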
Configure replication
When you run `replicator`, you can configure the following options for replication:
- Connection strings: Specify URL‑encoded source and target connections.
- TLS certificate and key: Configure secure TLS connections.
- Tuning parameters: Optimize failback performance and resource usage.
- Replicator metrics: Monitor failback replication performance.
Connection strings
For failback, MOLT Replicator uses `--targetConn` to specify the original source database and `--stagingConn` for the CockroachDB staging database.

`--targetConn` is the connection string of the database you migrated from. For example:
--targetConn 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full'
--targetConn 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt'
--targetConn 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLPDB1'
`--stagingConn` is the CockroachDB connection string for staging operations:
--stagingConn 'postgres://crdb_user@localhost:26257/defaultdb?sslmode=verify-full'
Secure connections
To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:
- Avoid plaintext connection strings.
- URL-encode connection strings for the source database and CockroachDB so special characters in passwords are handled correctly. For example, given the password `a$52&`, pass it to the `molt escape-password` command with single quotes:

  molt escape-password --password 'a$52&'

  Use the encoded password in your `--source` connection string. For example:

  --source 'postgres://migration_user:a%2452%26@localhost:5432/replicationload'

- Provide your connection strings as environment variables. For example:

  export SOURCE="postgres://migration_user:a%2452%26@localhost:5432/molt?sslmode=verify-full"
  export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"

  Afterward, reference the environment variables as follows:

  --source $SOURCE --target $TARGET

- If possible, use an external secrets manager to load the environment variables from stored secrets, as in the sketch following this list.
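For example, with HashiCorp Vault you might load the connection strings from a stored secret (a hedged sketch; the secret path and field names are hypothetical):

```shell
# Load connection strings from a secrets manager rather than hardcoding them.
# The secret path "secret/molt" and its field names are hypothetical.
export SOURCE="$(vault kv get -field=source_conn secret/molt)"
export TARGET="$(vault kv get -field=target_conn secret/molt)"
```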
TLS certificate and key
Always use secure TLS connections for failback replication to protect data in transit. Do not use insecure configurations in production: avoid the `--disableAuthentication` and `--tlsSelfSigned` Replicator flags and the `insecure_tls_skip_verify=true` query parameter in the changefeed webhook URI.
Generate self-signed TLS certificates or certificates from an external CA. Ensure the TLS server certificate and key are accessible on the MOLT Replicator host machine via a relative or absolute file path. When you start failback with Replicator, specify the paths with `--tlsCertificate` and `--tlsPrivateKey`. For example:
replicator start \
... \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key
The client certificates defined in the changefeed webhook URI must correspond to the server certificates specified in the `replicator` command. This ensures a proper TLS handshake between the changefeed and MOLT Replicator. To include client certificates in the changefeed webhook URL, use URL encoding and base64 encoding:
base64 -i ./client.crt | jq -R -r '@uri'
base64 -i ./client.key | jq -R -r '@uri'
base64 -i ./ca.crt | jq -R -r '@uri'
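For example, you can capture the encoded values in shell variables and substitute them into the webhook URI when you create the changefeed (a sketch building on the commands above; the variable names are illustrative):

```shell
# Base64-encode and URL-encode each certificate, storing the results for reuse.
CLIENT_CERT="$(base64 -i ./client.crt | jq -R -r '@uri')"
CLIENT_KEY="$(base64 -i ./client.key | jq -R -r '@uri')"
CA_CERT="$(base64 -i ./ca.crt | jq -R -r '@uri')"
```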
When you create the changefeed, pass the encoded certificates in the changefeed URL, where `client_cert`, `client_key`, and `ca_cert` are webhook sink parameters. For example:
CREATE CHANGEFEED FOR TABLE table1, table2
INTO 'webhook-https://host:port/database/schema?client_cert={base64_encoded_cert}&client_key={base64_encoded_key}&ca_cert={base64_encoded_ca}'
WITH ...;
For additional details on the webhook sink URI, refer to Webhook sink.
Replication flags
The following MOLT Replicator flags are required for continuous replication. For details on all available flags, refer to the MOLT Replicator documentation.
Flag | Description |
---|---|
`--stagingSchema` | Required. Staging schema name for the changefeed checkpoint table. |
`--bindAddr` | Required. Network address to bind the webhook sink for the changefeed. For example, `:30004`. |
`--tlsCertificate` | Path to the server TLS certificate for the webhook sink. Refer to TLS certificate and key. |
`--tlsPrivateKey` | Path to the server TLS private key for the webhook sink. Refer to TLS certificate and key. |
`--metricsAddr` | Enable Prometheus metrics at a specified `{host}:{port}`. Metrics are served at `http://{host}:{port}/_/varz`. |
Failback requires `--stagingSchema`, which specifies the staging database name used for replication checkpoints and metadata. The staging schema is first created with `--stagingCreateSchema`. For details on the staging schema, refer to Staging schema.

When configuring a secure changefeed for failback, you must include `--tlsCertificate` and `--tlsPrivateKey`, which specify the paths to the server certificate and private key for the webhook sink connection.
Tuning parameters
Configure the following `replicator` flags to optimize replication throughput and resource usage. Test different combinations in a staging environment to find the optimal balance of stability and performance for your workload.
The following parameters apply to PostgreSQL, Oracle, and CockroachDB (failback) sources.
Flag | Description |
---|---|
`--parallelism` | Control the maximum number of concurrent target transactions. Higher values increase throughput but require more target connections. Start with a conservative value and increase based on target database capacity. |
`--flushSize` | Balance throughput and latency. Controls how many mutations are batched into each query to the target. Increase for higher throughput at the cost of higher latency. |
`--targetApplyQueueSize` | Control memory usage during operation. Increase to allow higher throughput at the expense of memory; decrease to apply backpressure and limit memory consumption. |
`--targetMaxPoolSize` | Set larger than `--parallelism` by a safety factor to avoid exhausting target pool connections. Cockroach Labs recommends setting `--parallelism` to 80% of this value. |
`--collapseMutations` | Reduce the number of queries to the target by combining multiple mutations on the same primary key within each batch. Disable only if exact mutation order matters more than end state. |
`--enableParallelApplies` | Improve apply throughput for independent tables and table groups that share foreign key dependencies. Increases memory and target connection usage, so ensure you increase `--targetMaxPoolSize` or reduce `--parallelism`. |
`--flushPeriod` | Set to the maximum allowable time between flushes (for example, `10s` if data must be applied within 10 seconds). Works with `--flushSize` to control when buffered mutations are committed to the target. |
`--quiescentPeriod` | Lower this value if constraint violations resolve quickly on your workload to make retries more frequent and reduce latency. Do not lower if constraint violations take time to resolve. |
`--scanSize` | Applies to failback (`replicator start`) scenarios only. Balance memory usage and throughput. Increase to read more rows at once for higher throughput, at the cost of memory pressure. Decrease to reduce memory pressure and increase stability. |
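As an illustration, a tuned failback invocation might combine several of these flags (the values are illustrative only; validate them in a staging environment first). Following the guidance above, `--parallelism 16` is 80% of `--targetMaxPoolSize 20`:

```shell
# Illustrative tuning values; test against your own workload before production use.
replicator start \
... \
--parallelism 16 \
--targetMaxPoolSize 20 \
--flushSize 1000 \
--flushPeriod 10s
```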
Replicator metrics
By default, MOLT Replicator exports Prometheus metrics at the address specified by `--metricsAddr` (default `:30005`) at the path `/_/varz`. For example: `http://localhost:30005/_/varz`.
Cockroach Labs recommends monitoring the following metrics during replication:
Metric Name | Description |
---|---|
`source_lag_seconds` | Time between when an incoming resolved MVCC timestamp originated on the source CockroachDB cluster and when it was received by Replicator. |
`target_lag_seconds` | End-to-end lag from when an incoming resolved MVCC timestamp originated on the source CockroachDB cluster to when all data changes up to that timestamp were written to the target database. |
`source_lag_seconds_histogram` | Same as `source_lag_seconds`, but stored as a histogram for analyzing distributions over time. |
`target_lag_seconds_histogram` | Same as `target_lag_seconds`, but stored as a histogram for analyzing distributions over time. |
`replicator_applier_mutations_staged` | Number of mutations that have been staged for application to the target database. |
`replicator_applier_mutations_applied` | Number of mutations that have been successfully applied to the target database. |
You can use the Replicator Grafana dashboard to visualize these metrics.
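For example, you could alert on sustained end-to-end lag with a Prometheus expression over these metrics (a sketch; the 60-second threshold is an arbitrary illustration):

```
# Fire when failback end-to-end lag exceeds 60 seconds.
target_lag_seconds > 60
```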
To check MOLT Replicator health, run `curl http://localhost:30005/_/healthz`. This returns a status code of `200` if Replicator is running.
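For a scriptable check, curl can print just the status code (a small sketch using standard curl flags):

```shell
# Print only the HTTP status code; expect 200 when Replicator is healthy.
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:30005/_/healthz
```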
Stop forward replication
Wait for replication to drain, which means that all transactions that occurred on the source database have been fully processed and replicated to CockroachDB. There are two ways to determine that replication has fully drained:
- When replication is caught up, you will not see new `upserted rows` logs.
- If you set up the replication metrics endpoint with `--metricsAddr` in the preceding steps, metrics are available at `http://{host}:{port}/_/varz`. Use the following Prometheus alert expression to observe when the combined rate of upserts and deletes is `0` for each schema:

  sum by (schema) (rate(apply_upserts_total[$__rate_interval]) + rate(apply_deletes_total[$__rate_interval]))
Cancel replication to CockroachDB by entering `ctrl-c` to issue a `SIGTERM` signal. This returns an exit code of `0`.
Start Replicator
Run the MOLT Replicator `start` command to begin failback replication from CockroachDB to your source database. In this example, `--metricsAddr :30005` enables a Prometheus endpoint for monitoring replication metrics, and `--bindAddr :30004` sets up the webhook endpoint for the changefeed. `--stagingSchema` specifies the staging database name (`_replicator` in this example) used for replication checkpoints and metadata. This staging database was created during initial forward replication when you first ran MOLT Replicator with `--stagingCreateSchema`.

replicator start \
--targetConn $TARGET \
--stagingConn $STAGING \
--stagingSchema _replicator \
--metricsAddr :30005 \
--bindAddr :30004 \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key \
--verbose
Create the CockroachDB changefeed
Create a CockroachDB changefeed to send changes to MOLT Replicator.
After confirming that forward replication has fully drained, get the current logical timestamp from CockroachDB:
SELECT cluster_logical_timestamp();
cluster_logical_timestamp
----------------------------------
1759246920563173000.0000000000
Create the CockroachDB changefeed pointing to the MOLT Replicator webhook endpoint. Use `cursor` to specify the logical timestamp from the preceding step.

Note: Ensure that only one changefeed points to MOLT Replicator at a time to avoid mixing streams of incoming data.

Tip: For details on the webhook sink URI, refer to Webhook sink.
For a PostgreSQL target, use the `/database/schema` path format in the webhook URI:

CREATE CHANGEFEED FOR TABLE employees, payments, orders \
INTO 'webhook-https://replicator-host:30004/migration_schema/public?client_cert={base64_encoded_cert}&client_key={base64_encoded_key}&ca_cert={base64_encoded_ca}' \
WITH updated, resolved = '250ms', min_checkpoint_frequency = '250ms', initial_scan = 'no', cursor = '1759246920563173000.0000000000', webhook_sink_config = '{"Flush":{"Bytes":1048576,"Frequency":"1s"}}';

For a MySQL target, use the schema name only:

CREATE CHANGEFEED FOR TABLE employees, payments, orders \
INTO 'webhook-https://replicator-host:30004/migration_schema?client_cert={base64_encoded_cert}&client_key={base64_encoded_key}&ca_cert={base64_encoded_ca}' \
WITH updated, resolved = '250ms', min_checkpoint_frequency = '250ms', initial_scan = 'no', cursor = '1759246920563173000.0000000000', webhook_sink_config = '{"Flush":{"Bytes":1048576,"Frequency":"1s"}}';

For an Oracle target, use the uppercase schema name only:

CREATE CHANGEFEED FOR TABLE employees, payments, orders \
INTO 'webhook-https://replicator-host:30004/MIGRATION_SCHEMA?client_cert={base64_encoded_cert}&client_key={base64_encoded_key}&ca_cert={base64_encoded_ca}' \
WITH updated, resolved = '250ms', min_checkpoint_frequency = '250ms', initial_scan = 'no', cursor = '1759246920563173000.0000000000', webhook_sink_config = '{"Flush":{"Bytes":1048576,"Frequency":"1s"}}';
The output shows the job ID:
job_id
-----------------------
1101234051444375553
Monitor the changefeed status, specifying the job ID:
SHOW CHANGEFEED JOB 1101234051444375553;
job_id | ... | status | running_status | ...
----------------------+-----+---------+-------------------------------------------+----
1101234051444375553 | ... | running | running: resolved=1759246920563173000,0 | ...
To confirm the changefeed is active and replicating changes to the target database, check that `status` is `running` and `running_status` shows `running: resolved={timestamp}`.

Warning: `running: resolved` may be reported even if data isn't being sent properly. This typically indicates incorrect host/port configuration or network connectivity issues.

Verify that Replicator is reporting incoming HTTP requests from the changefeed by checking the MOLT Replicator logs. Since you enabled debug logging with `--verbose`, you should see periodic HTTP request successes:

DEBUG [Aug 25 11:52:47] httpRequest="&{0x14000b068c0 45 200 3 9.770958ms false false}"
DEBUG [Aug 25 11:52:48] httpRequest="&{0x14000d1a000 45 200 3 13.438125ms false false}"
These debug messages confirm successful changefeed connections to MOLT Replicator. You can disable verbose logging after verifying the connection.
Troubleshoot failback issues
Changefeed connection errors
If the changefeed shows connection errors in `SHOW CHANGEFEED JOB`:
Connection refused
transient error: Post "https://replicator-host:30004/molt/public": dial tcp [::1]:30004: connect: connection refused
This indicates that Replicator is down, the webhook URL is incorrect, or the port is misconfigured.
Resolution: Verify that MOLT Replicator is running on the port specified in the changefeed `INTO` configuration. Confirm the host and port are correct.
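To rule out basic network issues, you can confirm that the webhook port accepts TCP connections from a node that can reach the Replicator host (a sketch using standard tooling; substitute your own host and port):

```shell
# Verify that the Replicator webhook port is reachable.
nc -vz replicator-host 30004
```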
Unknown schema error
transient error: 400 Bad Request: unknown schema:
This indicates the webhook URL path is incorrectly formatted. Common causes include using the wrong path format for your target database type or incorrect database names.
Resolution: Check the webhook URL path mapping:
- PostgreSQL targets: Use the `/database/schema` format (for example, `/molt_db/public`).
- MySQL/Oracle targets: Use the `/SCHEMA` format (for example, `/MOLT_DB`). Use only the schema name (for example, `molt` instead of `molt/public`).
Verify that the target database and schema names match the webhook URL.
GC threshold error
batch timestamp * must be after replica GC threshold
This indicates that the changefeed cursor points to a timestamp that has already been garbage collected.
Resolution: Double-check the cursor to ensure it represents a valid range that has not been garbage collected, or extend the GC TTL on the source CockroachDB cluster:
ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = {gc_ttl_in_seconds};
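For example, to retain MVCC history for 24 hours (an illustrative value; choose a TTL that covers your expected failback window):

```sql
-- Extend the GC TTL on the source database to 24 hours.
ALTER DATABASE defaultdb CONFIGURE ZONE USING gc.ttlseconds = 86400;
```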
Duplicated data re-application
This occurs when resuming a changefeed from a cursor causes excessive data duplication.
Resolution: Clear the staging database to prevent duplication. This deletes all checkpoints and buffered data, so use with caution:
DROP DATABASE _replicator;
For more targeted cleanup, delete mutations from specific staging tables:
DELETE FROM _replicator.employees WHERE true;