Resume Replication


Use MOLT Replicator to resume replication to CockroachDB after an interruption, without reloading data.

Note:

These steps assume that you previously started replication. Refer to Load and Replicate.

Resume replication after interruption

Resume replication using MOLT Replicator by running the replicator binary directly with the same arguments used during initial replication setup. Replicator will automatically resume from the saved checkpoint in the existing staging schema.

  1. Run the MOLT Replicator command corresponding to your source database type, using the same --stagingSchema value from your initial replication setup.

    PostgreSQL: Use the replicator pglogical command. Be sure to specify the same --slotName value that you used during initial replication setup.

    replicator pglogical \
    --sourceConn $SOURCE \
    --targetConn $TARGET \
    --slotName molt_slot \
    --stagingSchema _replicator \
    --metricsAddr :30005
    
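    Before resuming, you can optionally confirm on the source that the replication slot still exists and is holding its position. This is a sketch only, assuming psql is installed and $SOURCE is the PostgreSQL connection string used above:

    ```shell
    # Verify the logical replication slot used by Replicator still exists.
    # $SOURCE is assumed to be the PostgreSQL connection string from setup.
    psql "$SOURCE" -c "SELECT slot_name, plugin, active, restart_lsn FROM pg_replication_slots WHERE slot_name = 'molt_slot';"
    ```

    If the slot is missing, resuming from the checkpoint is not possible and a fresh load may be required.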

    MySQL: Use the replicator mylogical command. Replicator will automatically use the saved GTID from the staging schema, or fall back to the specified --defaultGTIDSet if no saved state exists.

    replicator mylogical \
    --sourceConn $SOURCE \
    --targetConn $TARGET \
    --defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29 \
    --stagingSchema _replicator \
    --metricsAddr :30005 \
    --userscript table_filter.ts
    
    Tip:

    For MySQL versions that do not support binlog_row_metadata, include --fetchMetadata to explicitly fetch column metadata. This requires additional permissions on the source MySQL database. Grant SELECT permissions with GRANT SELECT ON source_database.* TO 'migration_user'@'localhost';. If that is insufficient for your deployment, use GRANT PROCESS ON *.* TO 'migration_user'@'localhost';, though this is more permissive and allows seeing processes and server status.
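    To determine whether --fetchMetadata is needed, you can check the binlog_row_metadata system variable on the source. A sketch, assuming the mysql client and placeholder host and user values (substitute your own):

    ```shell
    # Check whether the source MySQL server writes full row metadata to the binlog.
    # If the value is not FULL (or the variable is absent on older versions),
    # pass --fetchMetadata to Replicator as described above.
    mysql --host=<source-host> --user=migration_user --password \
      -e "SHOW VARIABLES LIKE 'binlog_row_metadata';"
    ```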

    Oracle: Use the replicator oraclelogminer command. Replicator will automatically find the correct restart SCN from the _oracle_checkpoint table in the staging schema. The restart point is determined by the non-committed row with the smallest startscn column value. Do not specify the --scn or --backfillFromSCN flags when resuming.

    replicator oraclelogminer \
    --sourceConn $SOURCE \
    --sourcePDBConn $SOURCE_PDB \
    --sourceSchema migration_schema \
    --targetConn $TARGET \
    --stagingSchema _replicator \
    --metricsAddr :30005 \
    --userscript table_filter.ts
    
    Note:

    When filtering out tables in a schema with a userscript, replication performance may decrease because filtered tables are still included in LogMiner queries and processed before being discarded.
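    To see the checkpoint that Replicator will resume from, you can query the _oracle_checkpoint table in the staging schema on the target. A sketch only, assuming the cockroach CLI and the $TARGET connection string used above; the exact column set may vary by Replicator version:

    ```shell
    # List the oldest checkpoint rows; the non-committed row with the smallest
    # startscn determines the restart point described above.
    cockroach sql --url "$TARGET" \
      -e "SELECT * FROM _replicator._oracle_checkpoint ORDER BY startscn ASC LIMIT 5;"
    ```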

    Replication resumes from the last checkpoint without performing a fresh load. Monitor the metrics endpoint at http://localhost:30005/_/varz to track replication progress.
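    A quick way to confirm the endpoint is live after restarting, assuming Replicator was started with --metricsAddr :30005 as shown above:

    ```shell
    # Spot-check the metrics endpoint exposed by --metricsAddr.
    curl -s http://localhost:30005/_/varz | head -n 20
    ```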

Tuning parameters

Configure the following replicator flags to optimize replication throughput and resource usage. Test different combinations in a staging environment to find the optimal balance of stability and performance for your workload.

Note:

The following parameters apply to PostgreSQL, Oracle, and CockroachDB (failback) sources.

--parallelism: Control the maximum number of concurrent target transactions. Higher values increase throughput but require more target connections. Start with a conservative value and increase based on target database capacity.
--flushSize: Balance throughput and latency. This flag controls how many mutations are batched into each query to the target. Increase it for higher throughput at the cost of higher latency.
--targetApplyQueueSize: Control memory usage during operation. Increase to allow higher throughput at the expense of memory; decrease to apply backpressure and limit memory consumption.
--targetMaxPoolSize: Set larger than --parallelism by a safety factor to avoid exhausting target pool connections. Cockroach Labs recommends setting --parallelism to 80% of this value.
--collapseMutations: Reduce the number of queries to the target by combining multiple mutations on the same primary key within each batch. Disable only if exact mutation order matters more than end state.
--enableParallelApplies: Improve apply throughput for independent tables and for table groups that share foreign key dependencies. This increases memory and target connection usage, so increase --targetMaxPoolSize or reduce --parallelism accordingly.
--flushPeriod: Set to the maximum allowable time between flushes (for example, 10s if data must be applied within 10 seconds). Works with --flushSize to control when buffered mutations are committed to the target.
--quiescentPeriod: Lower this value if constraint violations resolve quickly on your workload, to make retries more frequent and reduce latency. Do not lower it if constraint violations take time to resolve.
--scanSize: Applies only to failback (replicator start) scenarios. Balance memory usage and throughput: increase to read more rows at once for higher throughput at the cost of memory pressure; decrease to reduce memory pressure and improve stability.
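As an illustration of how these flags combine, a PostgreSQL resume command with tuning applied might look like the following. The values shown are hypothetical starting points for a staging environment, not recommendations; note that --parallelism (16) is set to 80% of --targetMaxPoolSize (20), per the guidance above:

```shell
# Hypothetical tuned resume command; adjust values for your workload.
replicator pglogical \
  --sourceConn $SOURCE \
  --targetConn $TARGET \
  --slotName molt_slot \
  --stagingSchema _replicator \
  --parallelism 16 \
  --targetMaxPoolSize 20 \
  --flushSize 1000 \
  --flushPeriod 10s \
  --metricsAddr :30005
```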
