Kerberos

This Apache Druid extension enables authentication for Druid processes using Kerberos. It adds an Authenticator that protects Druid's HTTP endpoints using the Simple and Protected GSSAPI Negotiation Mechanism (SPNEGO). Make sure to include druid-kerberos in your list of loaded extensions.
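
For example, the extension can be added to the extension load list in common.runtime.properties (shown here with only this one extension; keep whatever other extensions your cluster already loads):

druid.extensions.loadList=["druid-kerberos"]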

Configuration

Creating an Authenticator

druid.auth.authenticatorChain=["MyKerberosAuthenticator"]

druid.auth.authenticator.MyKerberosAuthenticator.type=kerberos

To use the Kerberos authenticator, add an authenticator with type kerberos to the authenticatorChain. The example above uses the name "MyKerberosAuthenticator" for the Authenticator.

The named authenticator is configured through properties of the form:

druid.auth.authenticator.<authenticatorName>.<authenticatorProperty>

The configuration examples in the rest of this document will use "kerberos" as the name of the authenticator being configured.

Properties

|Property|Possible Values|Description|Default|Required|
|--------|---------------|-----------|-------|--------|
|druid.auth.authenticator.kerberos.serverPrincipal|HTTP/_HOST@EXAMPLE.COM|SPNEGO service principal used by druid processes|empty|Yes|
|druid.auth.authenticator.kerberos.serverKeytab|/etc/security/keytabs/spnego.service.keytab|SPNEGO service keytab used by druid processes|empty|Yes|
|druid.auth.authenticator.kerberos.authToLocal|RULE:[1:$1@$0](druid@EXAMPLE.COM)s/.*/druid DEFAULT|General rule for mapping principal names to local user names. It is used if there is no explicit mapping for the principal name being translated.|DEFAULT|No|
|druid.auth.authenticator.kerberos.cookieSignatureSecret|secretString|Secret used to sign authentication cookies. It is advisable to set this explicitly if you have multiple Druid nodes running on the same machine with different ports, as the cookie specification does not guarantee isolation by port.|<Random value>|No|
|druid.auth.authenticator.kerberos.authorizerName|Depends on available authorizers|Authorizer that requests should be directed to|empty|Yes|

As a note, the SPNEGO principal used by the Druid processes must start with HTTP (as specified by RFC 4559) and must be of the form "HTTP/_HOST@REALM". The special string _HOST will be replaced automatically with the value of the config druid.host.
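
Putting the pieces together, a minimal sketch of a complete authenticator configuration could look like the following, reusing the example values from the table above (the keytab path, realm, secret, and authorizer name are placeholders to adapt to your environment):

druid.auth.authenticatorChain=["kerberos"]
druid.auth.authenticator.kerberos.type=kerberos
druid.auth.authenticator.kerberos.serverPrincipal=HTTP/_HOST@EXAMPLE.COM
druid.auth.authenticator.kerberos.serverKeytab=/etc/security/keytabs/spnego.service.keytab
druid.auth.authenticator.kerberos.cookieSignatureSecret=secretString
druid.auth.authenticator.kerberos.authorizerName=MyBasicAuthorizer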

druid.auth.authenticator.kerberos.excludedPaths

In older releases, the Kerberos authenticator had an excludedPaths property that allowed the user to specify a list of paths where authentication checks should be skipped. This property has been removed from the Kerberos authenticator because the path exclusion functionality is now handled across all authenticators/authorizers by setting druid.auth.unsecuredPaths, as described in the main auth documentation.
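
For example, to skip authentication checks on Druid's health endpoint (the path here is illustrative; see the main auth documentation for details):

druid.auth.unsecuredPaths=["/status/health"]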

Auth to Local Syntax

druid.auth.authenticator.kerberos.authToLocal allows you to set general rules for mapping principal names to local user names. The syntax for mapping rules is RULE:[n:string](regexp)s/pattern/replacement/g. The integer n indicates how many components the target principal should have. If this matches, then a string will be formed from string, substituting the realm of the principal for $0 and the nth component of the principal for $n. For example, if the principal was druid/admin, then [2:$2$1suffix] would result in the string admindruidsuffix. If this string matches regexp, then the s//[g] substitution command is run over the string. The optional g causes the substitution to be global over the string, instead of replacing only the first match. If required, multiple rules can be joined by the newline character and specified as a single string.
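
As a worked example, the rule below (the example value from the properties table) applies to one-component principals: $1@$0 forms the string druid@EXAMPLE.COM, which matches the regexp (druid@EXAMPLE.COM), so the s/.*/druid substitution maps it to the local user druid; the trailing DEFAULT rule is the fallback for all other principals:

druid.auth.authenticator.kerberos.authToLocal=RULE:[1:$1@$0](druid@EXAMPLE.COM)s/.*/druid DEFAULT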

Increasing HTTP Header size for large SPNEGO negotiate header

In an Active Directory environment, the SPNEGO token in the Authorization header includes PAC (Privilege Attribute Certificate) information, which includes all security groups for the user. When the user belongs to many security groups, this can cause the header to grow beyond what Druid can handle by default. In such cases, the maximum request header size that Druid can handle can be increased by setting druid.server.http.maxRequestHeaderSize (default 8 KiB) and druid.router.http.maxRequestBufferSize (default 8 KiB).
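
For example, to raise both limits to 64 KiB (the value is in bytes; the right size is a judgment call that depends on how large your users' PACs get):

druid.server.http.maxRequestHeaderSize=65536
druid.router.http.maxRequestBufferSize=65536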

Configuring Kerberos Escalated Client

Druid internal processes communicate with each other using an escalated HTTP client. A Kerberos-enabled escalated HTTP client can be configured with the following properties:

|Property|Example Value|Description|Default|Required|
|--------|-------------|-----------|-------|--------|
|druid.escalator.type|kerberos|Type of escalator client used for internal process communication.|n/a|Yes|
|druid.escalator.internalClientPrincipal|druid@EXAMPLE.COM|Principal user name used for internal process communication.|n/a|Yes|
|druid.escalator.internalClientKeytab|/etc/security/keytabs/druid.keytab|Path to the keytab file used for internal process communication.|n/a|Yes|
|druid.escalator.authorizerName|MyBasicAuthorizer|Authorizer that requests should be directed to.|n/a|Yes|
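
Assembled from the example values in the table (the principal, keytab path, and authorizer name are placeholders for your environment):

druid.escalator.type=kerberos
druid.escalator.internalClientPrincipal=druid@EXAMPLE.COM
druid.escalator.internalClientKeytab=/etc/security/keytabs/druid.keytab
druid.escalator.authorizerName=MyBasicAuthorizer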

Accessing Druid HTTP endpoints when Kerberos security is enabled

  1. To access Druid HTTP endpoints via curl, first log in using the kinit command as follows:

    kinit -k -t <path_to_keytab_file> user@REALM.COM
    
  2. Once the login succeeds, verify it using the klist command.

  3. You can then access Druid HTTP endpoints using curl as follows:

    curl --negotiate -u:anyUser -b ~/cookies.txt -c ~/cookies.txt -X POST -H 'Content-Type: application/json' <HTTP_END_POINT>
    

    For example, to send a query from the file query.json to the Druid Broker, use this command:

    curl --negotiate -u:anyUser -b ~/cookies.txt -c ~/cookies.txt -X POST -H 'Content-Type: application/json' http://broker-host:port/druid/v2/?pretty -d @query.json
    

    Note: The above command authenticates the user on the first request using the SPNEGO negotiate mechanism and stores the authentication cookie in the cookies file. Subsequent requests use the cookie for authentication.

Accessing the Coordinator or Overlord console from a web browser

To access the Coordinator or Overlord console from a browser, you will need to configure the browser for SPNEGO authentication as follows:

  1. Safari - No configuration required.
  2. Firefox - Open Firefox and follow these steps:
    1. Go to about:config and search for network.negotiate-auth.trusted-uris.
    2. Double-click and add the following values: "http://druid-coordinator-hostname:ui-port" and "http://druid-overlord-hostname:port".
  3. Google Chrome - From the command line, run the following commands:
    1. google-chrome --auth-server-whitelist="druid-coordinator-hostname" --auth-negotiate-delegate-whitelist="druid-coordinator-hostname"
    2. google-chrome --auth-server-whitelist="druid-overlord-hostname" --auth-negotiate-delegate-whitelist="druid-overlord-hostname"
  4. Internet Explorer -
    1. Configure trusted websites to include "druid-coordinator-hostname" and "druid-overlord-hostname".
    2. Allow negotiation for the UI website.

Sending Queries programmatically

Many HTTP client libraries, such as Apache HttpComponents, already support SPNEGO authentication. You can use any of these libraries to communicate with the Druid cluster.
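
As a sketch of what this can look like (this is not a Druid-specific API; it assumes Apache HttpComponents HttpClient 4.x on the classpath, a valid krb5.conf, a ticket already obtained via kinit, and the JVM flag -Djavax.security.auth.useSubjectCredsOnly=false so JGSS can use the ticket cache; the broker host, port, and datasource name are placeholders):

import java.security.Principal;

import org.apache.http.auth.AuthSchemeProvider;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.Credentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.client.config.AuthSchemes;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.config.Lookup;
import org.apache.http.config.RegistryBuilder;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.auth.SPNegoSchemeFactory;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class SpnegoQueryExample
{
  public static void main(String[] args) throws Exception
  {
    // Empty credentials: tells HttpClient to defer to the JGSS ticket cache
    // (populated by kinit) rather than supplying a user name and password.
    Credentials useTicketCache = new Credentials()
    {
      @Override
      public Principal getUserPrincipal()
      {
        return null;
      }

      @Override
      public String getPassword()
      {
        return null;
      }
    };

    CredentialsProvider credsProvider = new BasicCredentialsProvider();
    credsProvider.setCredentials(AuthScope.ANY, useTicketCache);

    // Register the SPNEGO scheme so the client can answer "Negotiate" challenges.
    Lookup<AuthSchemeProvider> authSchemes = RegistryBuilder.<AuthSchemeProvider>create()
        .register(AuthSchemes.SPNEGO, new SPNegoSchemeFactory(true))
        .build();

    try (CloseableHttpClient client = HttpClients.custom()
        .setDefaultAuthSchemeRegistry(authSchemes)
        .setDefaultCredentialsProvider(credsProvider)
        .build()) {
      // Post a simple native query to the Broker; "wikipedia" is a placeholder datasource.
      HttpPost post = new HttpPost("http://broker-host:8082/druid/v2/?pretty");
      post.setEntity(new StringEntity(
          "{\"queryType\":\"timeBoundary\",\"dataSource\":\"wikipedia\"}",
          ContentType.APPLICATION_JSON
      ));
      try (CloseableHttpResponse response = client.execute(post)) {
        System.out.println(response.getStatusLine());
        System.out.println(EntityUtils.toString(response.getEntity()));
      }
    }
  }
}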
