Snapshot and restore for migration
Snapshots are one of the most reliable methods for migrating data between OpenSearch clusters. This approach is particularly useful when you need to move data from one environment to another, such as migrating from a proof-of-concept cluster to a production environment, or when performing major version upgrades that require a fresh cluster deployment.
When to use snapshot and restore for migration
Snapshot and restore is well suited to migration when:
- You are migrating between OpenSearch versions where in-place upgrades aren’t supported.
- You are moving to different infrastructure (for example, from on-premises to the cloud or between cloud providers).
- You are changing the cluster architecture (for example, node configurations or shard strategies).
- You can afford some downtime because zero-downtime operation isn’t a requirement.
- You are working with large data volumes for which other migration methods might be impractical.
- You need to migrate the complete cluster, including indexes, settings, and metadata.
Migration workflow overview
A typical snapshot-based migration follows this workflow:
- Prepare the source cluster: Ensure cluster health and configure a snapshot repository.
- Create a snapshot repository: Set up shared storage accessible by both clusters.
- Take comprehensive snapshots: Capture all necessary indexes and cluster state.
- Set up a target cluster: Deploy and configure the destination OpenSearch cluster.
- Register the repository on the target: Connect the target cluster to the snapshot repository.
- Restore snapshots: Selectively restore indexes and configurations.
- Validate and test: Verify data integrity and application functionality.
- Switch traffic: Update applications to use the new cluster.
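The workflow above can be sketched as an ordered list of REST calls. This is a minimal illustration, not a complete migration script: the repository name, snapshot name, and file system path are hypothetical, a shared file system repository is assumed, and sending the requests (with authentication and TLS) is left to whichever HTTP client you use.

```python
import json


def migration_requests(repo, snapshot, location):
    """Return the ordered REST calls for a basic snapshot-based migration.

    Each entry is (cluster, HTTP method, path, JSON body).
    """
    repo_settings = {"location": location}
    return [
        # Register the repository on the source cluster.
        ("source", "PUT", f"/_snapshot/{repo}",
         {"type": "fs", "settings": repo_settings}),
        # Take a snapshot of all indexes plus the cluster state.
        ("source", "PUT", f"/_snapshot/{repo}/{snapshot}",
         {"indices": "*", "include_global_state": True}),
        # Register the same repository on the target as read-only so the
        # target cannot modify snapshots written by the source.
        ("target", "PUT", f"/_snapshot/{repo}",
         {"type": "fs", "settings": {**repo_settings, "readonly": True}}),
        # Restore indexes only; cluster state is left to the target's own
        # configuration (an assumption of this sketch).
        ("target", "POST", f"/_snapshot/{repo}/{snapshot}/_restore",
         {"indices": "*", "include_global_state": False}),
    ]


# Hypothetical names, for illustration only.
for cluster, method, path, body in migration_requests(
    "migration-repo", "final-snapshot", "/mnt/snapshots"
):
    print(f"[{cluster}] {method} {path}")
    print(json.dumps(body, indent=2))
```

Registering the repository as read-only on the target is a design choice in this sketch: it keeps a single writer for the repository while both clusters are running.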
Key considerations for migration
Data consistency
Snapshots capture data as it existed when the snapshot was initiated, but they’re not instantaneous. For migration purposes, consider:
- Stopping writes to ensure data consistency during the final snapshot.
- Taking incremental snapshots to minimize the final downtime window.
- Planning for data that changes during the migration process.
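One way to stop writes before the final snapshot is to set a write block on the affected indexes. The sketch below only builds the settings request; the index pattern is hypothetical, and the helper name is an assumption of this example. Because the block is an index setting, plan to lift it (on the source, or on the restored indexes if it carries over) once the cutover is complete.

```python
import json


def write_block_request(index_pattern, blocked=True):
    """Build the request that sets or clears a write block on matching indexes."""
    # index.blocks.write rejects document writes while still allowing reads.
    return ("PUT", f"/{index_pattern}/_settings", {"index.blocks.write": blocked})


# Block writes before the final snapshot, then lift the block afterwards.
for request in (write_block_request("logs-*"), write_block_request("logs-*", False)):
    method, path, body = request
    print(method, path, json.dumps(body))
```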
Version compatibility
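- Snapshots are forward compatible by one major version: a snapshot can be restored on a cluster running the same or the next major version, but not an earlier version.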
- For larger version gaps, you may need to restore to an intermediate cluster, reindex, and take new snapshots.
- Always verify compatibility between source and target OpenSearch versions.
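The one-major-version rule can be expressed as a quick pre-flight check. This is a deliberately simplified sketch: it assumes plain numeric OpenSearch version strings such as 2.11.0, and it does not account for version-specific exceptions or for snapshots that originate from other distributions, so treat a passing result as a first filter rather than a guarantee.

```python
def parse_version(version):
    """Turn a version string such as '2.11.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def can_restore_directly(source_version, target_version):
    """Apply the simplified rule: same or next major version, never older."""
    source = parse_version(source_version)
    target = parse_version(target_version)
    if target < source:
        return False  # Snapshots cannot be restored to an earlier version.
    return target[0] - source[0] <= 1


for source, target in [("1.3.0", "2.11.0"), ("1.3.0", "3.0.0"), ("2.11.0", "2.9.0")]:
    print(source, "->", target, can_restore_directly(source, target))
```

When the check fails because the gap is too large, the intermediate-cluster approach described above applies.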
Storage requirements
- Because snapshots are incremental, taking them frequently doesn’t significantly increase storage usage.
- Plan storage capacity for the full dataset plus incremental changes.
- Consider network bandwidth for cloud-based repositories.
Snapshot repository options for migration
Choose the appropriate repository type based on your migration requirements and infrastructure setup.
Shared file systems
Best for migrations within the same infrastructure where both clusters can access shared storage.
Amazon S3
Ideal for cloud migrations or when migrating between different environments. Provides durability and accessibility across AWS Regions.
Azure Blob Storage
Suitable for Azure-based migrations or hybrid cloud scenarios.
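The three repository types differ mainly in the settings passed when registering the repository. The following sketch builds a registration body for each type; the bucket, container, and path values are hypothetical, and the per-type settings it insists on are a choice of this example rather than a complete list. The s3 and azure types also depend on the corresponding repository plugins being installed, and the fs type depends on the path being allowed in the cluster configuration.

```python
import json

# Settings this sketch insists on for each repository type (illustrative only).
EXPECTED_SETTINGS = {
    "fs": ["location"],
    "s3": ["bucket"],
    "azure": ["container"],
}


def repository_body(repo_type, **settings):
    """Build the body for PUT /_snapshot/<repository-name>."""
    if repo_type not in EXPECTED_SETTINGS:
        raise ValueError(f"unsupported repository type: {repo_type}")
    missing = [key for key in EXPECTED_SETTINGS[repo_type] if key not in settings]
    if missing:
        raise ValueError(f"missing settings for {repo_type}: {missing}")
    return {"type": repo_type, "settings": settings}


examples = [
    repository_body("fs", location="/mnt/snapshots"),
    repository_body("s3", bucket="my-migration-bucket", base_path="snapshots"),
    repository_body("azure", container="my-migration-container", base_path="snapshots"),
]
for body in examples:
    print(json.dumps(body, indent=2))
```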
Cross-cloud considerations
When migrating between different cloud providers, consider:
- Data transfer costs and time requirements.
- Network connectivity between the source cluster, storage, and the target cluster.
- Security and access controls across different environments.
Migration-specific restoration options
When restoring for migration purposes, you have several options for customizing the process.
Selective restoration
- Choose specific indexes rather than restoring everything.
- Exclude system indexes that might conflict with the target cluster configuration.
- Rename indexes to avoid conflicts or implement new naming conventions.
Index settings modification
- Update replica counts to match target cluster capacity.
- Modify shard allocation for different node configurations.
- Adjust refresh intervals and other performance settings.
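These options map onto fields of the restore request body. The sketch below assembles such a body; the index names, prefix, and helper name are hypothetical. Note that the number of primary shards is fixed in the snapshot and cannot be changed during restore, so changing a shard strategy requires a later reindex or a shrink/split operation.

```python
import json


def restore_body(indices, prefix=None, replicas=None, ignore_settings=()):
    """Build a body for POST /_snapshot/<repository>/<snapshot>/_restore."""
    body = {
        "indices": ",".join(indices),   # Restore only the listed indexes.
        "include_global_state": False,  # Leave the target's cluster state alone.
    }
    if prefix:
        # Capture the whole index name and prepend the prefix to it.
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = f"{prefix}$1"
    if replicas is not None:
        # Override the replica count stored in the snapshot.
        body["index_settings"] = {"index.number_of_replicas": replicas}
    if ignore_settings:
        # Drop these settings so the target's defaults apply instead.
        body["ignore_index_settings"] = ",".join(ignore_settings)
    return body


print(json.dumps(
    restore_body(["orders", "customers"], prefix="migrated-", replicas=1,
                 ignore_settings=["index.refresh_interval"]),
    indent=2,
))
```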
Remote snapshot restoration
For large datasets, consider using storage_type: remote_snapshot to:
- Reduce initial restore time by keeping data in the repository.
- Save local storage on the target cluster.
- Enable faster access to historical data.
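In the restore request, this option is a single extra field. The following is a minimal sketch with hypothetical index names. Indexes restored this way are backed by the repository and are typically read-only, and the target cluster is assumed to have nodes configured for searchable snapshots, so verify those prerequisites before relying on this option.

```python
import json


def remote_snapshot_restore_body(indices):
    """Build a restore body that keeps index data in the snapshot repository."""
    return {
        "indices": ",".join(indices),
        "storage_type": "remote_snapshot",  # Default restore copies data locally.
    }


print(json.dumps(remote_snapshot_restore_body(["logs-2023", "logs-2022"]), indent=2))
```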
Security considerations for migration
When migrating with snapshots:
- Exclude security indexes (.opendistro_security) from snapshots to avoid conflicts.
- Plan security configuration separately from data migration.
- Use appropriate access controls for snapshot repositories.
- Consider encryption for sensitive data in transit and at rest.
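Excluding the security index can be done in the indices expression by prefixing its name with a minus sign after a wildcard. The sketch below builds such an expression and places it in a request body; the helper name is hypothetical, and setting include_global_state to false is an assumption made here so that cluster-wide state is not carried over along with the data.

```python
import json

SECURITY_INDEXES = (".opendistro_security",)


def indices_without_security(include="*", security_indexes=SECURITY_INDEXES):
    """Return an indices expression that matches `include` minus security indexes."""
    return ",".join([include, *(f"-{name}" for name in security_indexes)])


# The same body shape works for creating a snapshot or for restoring one.
body = {
    "indices": indices_without_security(),
    "include_global_state": False,
}
print(json.dumps(body, indent=2))
```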
Monitoring and validation
During migration:
- Monitor snapshot progress using the Snapshot Status API.
- Validate data integrity after restoration.
- Test application functionality before switching traffic.
- Keep the source cluster available until migration is fully validated.
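A simple first integrity check is to compare per-index document counts between the two clusters. The sketch below assumes you have already fetched the output of GET /_cat/indices?format=json from each cluster, where each row carries an index name and a docs.count string; the sample rows are made up. Matching counts are necessary but not sufficient, so follow up with spot checks or application-level tests.

```python
def doc_counts(rows):
    """Map index name to document count from _cat/indices JSON rows."""
    # docs.count is reported as a string and may be missing for closed indexes.
    return {row["index"]: int(row.get("docs.count") or 0) for row in rows}


def doc_count_mismatches(source_rows, target_rows):
    """Return {index: (source_count, target_count)} for indexes that differ.

    target_count is None when the index is missing on the target.
    """
    source, target = doc_counts(source_rows), doc_counts(target_rows)
    return {
        name: (count, target.get(name))
        for name, count in source.items()
        if target.get(name) != count
    }


# Made-up sample data standing in for the two clusters' responses.
source_rows = [{"index": "orders", "docs.count": "100"},
               {"index": "users", "docs.count": "5"}]
target_rows = [{"index": "orders", "docs.count": "100"},
               {"index": "users", "docs.count": "4"}]
print(doc_count_mismatches(source_rows, target_rows))
```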
Next steps
- Snapshot creation and management: See Take and restore snapshots.
- API reference: See Snapshot APIs for complete API documentation.
- Automated snapshots: See Snapshot management for information about scheduling and automation.
- Alternative migration methods: Consider Migration Assistant for more complex migration scenarios.
Related migration approaches
- Migration Assistant: For zero-downtime migrations with live traffic capture.
- Rolling upgrades: For in-place version upgrades.
- Remote reindex: For selective data migration between clusters.