ZooKeeper

Apache Druid uses Apache ZooKeeper (ZK) to manage current cluster state.
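Each Druid process finds ZooKeeper through its common runtime properties. As a minimal, illustrative sketch (the hostnames below are placeholders; druid.zk.paths.base defaults to /druid), the relevant settings look like:

druid.zk.service.host=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
druid.zk.paths.base=/druid

By default, the ${druid.zk.paths.*} values referenced throughout this page are derived from this base path.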

Minimum ZooKeeper versions

Apache Druid supports ZooKeeper versions 3.5.x and above.

Note: Starting with Apache Druid 0.22.0, support for ZooKeeper 3.4.x has been removed.

ZooKeeper Operations

The following operations happen over ZK:

  1. Coordinator leader election
  2. Segment "publishing" protocol from Historical
  3. Segment load/drop protocol between Coordinator and Historical
  4. Overlord leader election
  5. Overlord and MiddleManager task management

Coordinator Leader Election

Druid uses the Curator LeaderLatch recipe to perform leader election at the path

${druid.zk.paths.coordinatorPath}/_COORDINATOR
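For illustration only, the following sketch shows how such a leader latch can be created with the Curator API. It is not Druid's internal code, and the connection string, latch path, and participant ID are placeholder values.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CoordinatorLatchSketch
{
  public static void main(String[] args) throws Exception
  {
    // Placeholder connection string; in Druid this comes from druid.zk.service.host.
    CuratorFramework client = CuratorFrameworkFactory.newClient(
        "zk.example.com:2181",
        new ExponentialBackoffRetry(1000, 3)
    );
    client.start();

    // Placeholder path standing in for ${druid.zk.paths.coordinatorPath}/_COORDINATOR.
    LeaderLatch latch = new LeaderLatch(client, "/druid/coordinator/_COORDINATOR", "coordinator-1:8081");
    latch.start();

    latch.await(); // blocks until this participant becomes the leader
    System.out.println("has leadership: " + latch.hasLeadership());

    latch.close();
    client.close();
  }
}

Every Coordinator participates in the latch; the participant holding it is the leader, and leadership transfers automatically to another participant if that process disappears.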

Segment "publishing" protocol from Historical and Realtime

The announcementsPath and servedSegmentsPath are used for this.

All Historical processes publish themselves on the announcementsPath; specifically, they create an ephemeral znode at

${druid.zk.paths.announcementsPath}/${druid.host}

This znode signifies that the process exists. Each Historical also subsequently creates a persistent znode at

${druid.zk.paths.servedSegmentsPath}/${druid.host}

As they load segments, they attach ephemeral znodes that look like

${druid.zk.paths.servedSegmentsPath}/${druid.host}/_segment_identifier_

Processes like the Coordinator and Broker can then watch these paths to see which processes are currently serving which segments.
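As a rough sketch of how this znode layout can be built with plain Curator calls (not Druid's own announcer classes; the host, segment identifier, and path variables are placeholders standing in for the resolved ${druid.zk.paths.*} properties):

import org.apache.curator.framework.CuratorFramework;
import org.apache.zookeeper.CreateMode;
import java.nio.charset.StandardCharsets;

public class SegmentAnnouncementSketch
{
  // Announce the process itself: an ephemeral "I exist" znode plus a persistent
  // parent under which its loaded segments will be listed.
  // (A real implementation would tolerate the persistent node already existing.)
  static void announceSelf(CuratorFramework client, String announcementsPath,
                           String servedSegmentsPath, String host) throws Exception
  {
    client.create()
          .creatingParentsIfNeeded()
          .withMode(CreateMode.EPHEMERAL)
          .forPath(announcementsPath + "/" + host, host.getBytes(StandardCharsets.UTF_8));

    client.create()
          .creatingParentsIfNeeded()
          .withMode(CreateMode.PERSISTENT)
          .forPath(servedSegmentsPath + "/" + host);
  }

  // Announce one loaded segment: an ephemeral child that vanishes with the process.
  static void announceSegment(CuratorFramework client, String servedSegmentsPath,
                              String host, String segmentId, byte[] payload) throws Exception
  {
    client.create()
          .withMode(CreateMode.EPHEMERAL)
          .forPath(servedSegmentsPath + "/" + host + "/" + segmentId, payload);
  }
}

Because the announcement and per-segment znodes are ephemeral, a Historical that crashes disappears from the served-segments view automatically, without any explicit cleanup.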

Segment load/drop protocol between Coordinator and Historical

The loadQueuePath is used for this.

When the Coordinator decides that a Historical process should load or drop a segment, it writes an ephemeral znode to

${druid.zk.paths.loadQueuePath}/_host_of_historical_process/_segment_identifier

This znode contains a payload that tells the Historical process what to do with the given segment. When the Historical process has finished the work, it deletes the znode to signal to the Coordinator that the request is complete.
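A minimal sketch of that handshake, again with plain Curator calls rather than Druid's own load-queue classes (the path variable, host, segment identifier, and payload are placeholders):

import org.apache.curator.framework.CuratorFramework;
import org.apache.zookeeper.CreateMode;

public class LoadQueueSketch
{
  // Coordinator side: ask a Historical to load or drop a segment by writing an
  // ephemeral znode whose payload describes the requested action.
  static void requestAction(CuratorFramework client, String loadQueuePath,
                            String historicalHost, String segmentId, byte[] payload) throws Exception
  {
    client.create()
          .creatingParentsIfNeeded()
          .withMode(CreateMode.EPHEMERAL)
          .forPath(loadQueuePath + "/" + historicalHost + "/" + segmentId, payload);
  }

  // Historical side: once the segment has been loaded or dropped, delete the znode
  // to signal completion back to the Coordinator.
  static void acknowledge(CuratorFramework client, String loadQueuePath,
                          String historicalHost, String segmentId) throws Exception
  {
    client.delete()
          .guaranteed()
          .forPath(loadQueuePath + "/" + historicalHost + "/" + segmentId);
  }
}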
