2022.06

Deep storage migration

This document describes how to migrate an evaluation Druid cluster that uses local deep storage to a more production-capable deep storage system such as S3 or HDFS.

Migration of deep storage involves the following steps at a high level:

  • Copying segments from local deep storage to the new deep storage
  • Exporting Druid's segments table from metadata
  • Rewriting the load specs in the exported segment data to reflect the new deep storage location
  • Reimporting the edited segments into metadata
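To make the load spec rewriting step concrete, the following Python sketch shows the kind of transformation involved when moving from local deep storage to S3. It is an illustration only, not the export tool's implementation: the helper name `rewrite_load_spec`, the bucket, the base key, and the assumption that a segment's path relative to the old storage root is reused unchanged as the S3 key suffix are all hypothetical. The `local` and `s3_zip` load spec shapes are assumed to match what your cluster stores, so compare them with a real row from your segments table.

```python
import json
import posixpath


def rewrite_load_spec(payload_json, old_root, bucket, base_key):
    """Return a segment payload whose local loadSpec now points at S3."""
    payload = json.loads(payload_json)
    load_spec = payload["loadSpec"]
    if load_spec.get("type") != "local":
        # Not a local segment: return the payload unchanged.
        return payload_json
    # Keep the segment's path relative to the old storage root (assumption).
    relative = posixpath.relpath(load_spec["path"], old_root)
    payload["loadSpec"] = {
        "type": "s3_zip",
        "bucket": bucket,
        "key": posixpath.join(base_key, relative),
    }
    return json.dumps(payload)


# Hypothetical segment payload, reduced to the fields used here.
example = json.dumps({
    "dataSource": "wikipedia",
    "loadSpec": {
        "type": "local",
        "path": "/var/druid/segments/wikipedia/2016-06-27/v1/0/index.zip",
    },
})
rewritten = json.loads(
    rewrite_load_spec(example, "/var/druid/segments", "my-bucket", "druid/segments")
)
print(rewritten["loadSpec"])
```

In practice the export-metadata tool performs this rewrite for you; the sketch is only meant to show what changes in each segment row.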

Shut down cluster services

To ensure a clean migration, shut down the non-coordinator services so that the metadata state does not change during the migration.

When migrating from Derby, the coordinator processes will still need to be up initially, as they host the Derby database.

Copy segments from old deep storage to new deep storage

Before migrating, you will need to copy your old segments to the new deep storage.

For the path structure to use in the new deep storage, see the deep storage migration options in the Export Metadata Tool documentation.
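As a rough aid for planning the copy, the Python sketch below walks a local segment directory and pairs each file with a destination path that preserves its relative layout. The function name `plan_segment_copy`, the directory names, and the `s3://my-bucket/druid/segments` prefix are hypothetical, and the sketch assumes the new deep storage should mirror the old relative layout; confirm the expected layout in the deep storage migration options before copying anything. It only builds a plan and does not upload files.

```python
import os
import tempfile


def plan_segment_copy(local_root, dest_prefix):
    """Pair every file under local_root with a destination path that keeps its relative layout."""
    plan = []
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in filenames:
            source = os.path.join(dirpath, name)
            # Normalise to forward slashes so the result works as an object key.
            relative = os.path.relpath(source, local_root).replace(os.sep, "/")
            plan.append((source, dest_prefix.rstrip("/") + "/" + relative))
    return sorted(plan)


# Build a throwaway directory with one fake segment file to demonstrate.
with tempfile.TemporaryDirectory() as root:
    segment_dir = os.path.join(root, "wikipedia", "interval", "version", "0")
    os.makedirs(segment_dir)
    open(os.path.join(segment_dir, "index.zip"), "wb").close()
    copy_plan = plan_segment_copy(root, "s3://my-bucket/druid/segments")

for source, destination in copy_plan:
    print(source, "->", destination)
```

The resulting pairs can be fed to whatever copy mechanism suits the target system, such as a cloud CLI or an HDFS client.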

Export segments with rewritten load specs

Druid provides an Export Metadata Tool for exporting metadata from Derby into CSV files which can then be reimported.

When you set the deep storage migration options, the export-metadata tool exports CSV files in which the segment load specs have been rewritten to load from your new deep storage location.

Run the export-metadata tool on your existing cluster, using the migration options appropriate for your new deep storage location, and save the CSV files it generates. After a successful export, you can shut down the coordinator.
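The Python sketch below assembles one possible export-metadata invocation for a migration to S3, as an argument list suitable for `subprocess.run`. Treat every detail as an assumption to verify against the Export Metadata Tool documentation for your version: the flag names (`--connectURI`, `-o`, `--s3bucket`, `--s3baseKey`), the classpath and extension settings, the Derby URI, and the output directory are written from memory of that documentation and are not confirmed by this page. The helper name `build_export_command` is hypothetical.

```python
import shlex


def build_export_command(connect_uri, output_dir, s3_bucket, s3_base_key):
    """Build an argument list for the export-metadata tool (flag names are assumptions)."""
    return [
        "java",
        "-classpath", "lib/*",
        "-Ddruid.extensions.directory=extensions",
        "-Ddruid.extensions.loadList=[]",
        "org.apache.druid.cli.Main", "tools", "export-metadata",
        "--connectURI", connect_uri,   # Derby database hosted by the coordinator
        "-o", output_dir,              # directory that receives the CSV files
        "--s3bucket", s3_bucket,       # deep storage migration options for S3
        "--s3baseKey", s3_base_key,
    ]


command = build_export_command(
    "jdbc:derby://localhost:1527/var/druid/metadata.db;",
    "/tmp/csv",
    "my-bucket",
    "druid/segments",
)
# Print a shell-quoted form for review before running it from the Druid root directory.
print(shlex.join(command))
```

Building the command as a list avoids shell quoting problems with the `lib/*` classpath and the trailing semicolon in the Derby URI.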

Import metadata

After generating the CSV exports with the modified segment data, you can reimport the contents of the Druid segments table from the generated CSVs.

See the import commands in the Export Metadata Tool documentation for examples. Only the druid_segments table needs to be imported.
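Before importing, it can be worth checking that every exported row actually points at the new deep storage. The Python sketch below counts load spec types in an exported segments CSV. It rests on several assumptions that you should check against your own export: that the file has no header row, that the payload is the last column, and that the payload is plain JSON rather than a hex-encoded blob. The function name `count_load_spec_types` and the sample rows are hypothetical.

```python
import csv
import io
import json


def count_load_spec_types(csv_text):
    """Count how many segment rows use each loadSpec type."""
    counts = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        payload = json.loads(row[-1])  # assumption: payload is the last column
        spec_type = payload["loadSpec"]["type"]
        counts[spec_type] = counts.get(spec_type, 0) + 1
    return counts


# Build two hypothetical rows with csv.writer so the JSON payload is quoted correctly.
buffer = io.StringIO()
writer = csv.writer(buffer)
for partition in (0, 1):
    payload = {"loadSpec": {"type": "s3_zip", "bucket": "my-bucket",
                            "key": "druid/segments/wikipedia/%d/index.zip" % partition}}
    writer.writerow(["segment-%d" % partition, "wikipedia", json.dumps(payload)])

type_counts = count_load_spec_types(buffer.getvalue())
print(type_counts)
```

If the result still contains `local` entries after an export that was meant to target S3 or HDFS, the migration options were probably not applied and the export should be repeated before importing.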

Restart cluster

After the segment table has been imported successfully, restart your cluster.

Last updated on 10/16/2020
Copyright © 2022 Imply Data, Inc