Druid implements an extension system that allows for adding functionality at runtime. Extensions are commonly used to add support for deep storage (like HDFS and S3), metadata stores (like MySQL and PostgreSQL), new aggregators, new input formats, and so on.
Production clusters generally use at least two extensions: one for deep storage and one for a metadata store. Many clusters also use additional extensions.
You can add Druid extensions in the Imply Manager. Several extensions are enabled by default, such as druid-datasketches, druid-kafka-indexing-service, and druid-basic-security.
To view and enable bundled extensions, click the edit icon next to Druid extensions on the cluster setup page.
To load additional extensions, you need to make the custom extension file available to the Manager, either from a filesystem location or by URL. The specific instructions for making files available vary depending on how you are running the Imply Manager.
After making the extension file available, click Add custom extension and provide the extension name along with the URL or path to the extension file. To reference a file stored on the Manager itself, use the manager:/// addressing scheme.
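For illustration only, an extension hosted on a web server and one made available on the Manager might be referenced with values like the following (the file names and paths here are hypothetical; substitute your own):

```
https://example.com/extensions/my-custom-extension-1.0.tar.gz
manager:///my-custom-extension-1.0.tar.gz
```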
Imply bundles many commonly used extensions out of the box, including most core Druid extensions. For available extensions, see the list below.
Not all Druid core extensions are packaged with Imply or intended for use with it. For instance, the Apache Druid pac4j extension is not supported.
You can load bundled extensions by adding their names to the druid.extensions.loadList property in your common.runtime.properties file. For example, to load the postgresql-metadata-storage and druid-hdfs-storage extensions, use the following configuration:
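```
druid.extensions.loadList=["postgresql-metadata-storage", "druid-hdfs-storage"]
```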
You can also install community and third-party extensions that are not bundled with the Imply distribution. To do this, first download the extension and then install it into your dist/druid/extensions/ directory. You can download extensions directly from their distributors, or, if they are available from Maven, the included pull-deps tool can download them for you. To use pull-deps, specify the full Maven coordinate of the extension in the form groupId:artifactId:version.
If you are installing a Druid community-contributed extension, use a coordinate like org.apache.druid.extensions.contrib:druid-orc-extensions:0.18.0. The version you provide should match the community Druid version that your Imply distribution is based on. For example, to install the druid-orc-extensions extension, run:
```
java \
  -cp "dist/druid/lib/*" \
  -Ddruid.extensions.directory="dist/druid/extensions" \
  -Ddruid.extensions.hadoopDependenciesDir="dist/druid/hadoop-dependencies" \
  org.apache.druid.cli.Main tools pull-deps \
  --no-default-hadoop \
  -c "org.apache.druid.extensions.contrib:druid-orc-extensions:0.18.0"
```
After running pull-deps, add the extension name to druid.extensions.loadList in common.runtime.properties to instruct Druid to load the extension.
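For example, continuing the configuration shown earlier, the load list with the newly installed extension might look like this (the exact list depends on which extensions your cluster already loads):

```
druid.extensions.loadList=["postgresql-metadata-storage", "druid-hdfs-storage", "druid-orc-extensions"]
```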
Community extensions are contributed by Druid community members but are not necessarily maintained on an ongoing basis by Druid committers. The Druid documentation contains a list of community extensions.
Imply does not provide support for community extensions.
The following extensions are bundled with the Imply distribution. To load bundled extensions, see Loading bundled extensions. To install other extensions, see Loading community and third-party extensions.
Core extensions are maintained by Druid committers. Some are in experimental status and some are fully production-tested Druid components.
For additional documentation on these extensions, see the Druid documentation.
|Extension|Description|Docs|
|---|---|---|
|druid-avro-extensions|Support for data in Apache Avro data format.|link|
|druid-azure-extensions|Microsoft Azure deep storage.|link|
|druid-basic-security|Support for Basic HTTP authentication and role-based access control.|link|
|druid-bloom-filter|Support for providing Bloom filters in Druid queries.|link|
|druid-datasketches|Support for approximate counts and set operations with DataSketches.|link|
|druid-google-extensions|Google Cloud Storage deep storage.|link|
|druid-hdfs-storage|HDFS deep storage.|link|
|druid-histogram|Approximate histograms and quantiles aggregator. Deprecated; use the DataSketches quantiles aggregator from the druid-datasketches extension instead.|link|
|druid-kafka-extraction-namespace|Kafka-based namespaced lookup. Requires namespace lookup extension.|link|
|druid-kafka-indexing-service|Supervised exactly-once Kafka ingestion for the indexing service.|link|
|druid-kinesis-indexing-service|Supervised exactly-once Kinesis ingestion for the indexing service.|link|
|druid-kerberos|Kerberos authentication for Druid processes.|link|
|druid-lookups-cached-global|A module providing JVM-global eager caching for lookups. It provides JDBC and URI implementations for fetching lookup data.|link|
|druid-lookups-cached-single|Per-lookup caching module to support use cases where a lookup needs to be isolated from the global pool of lookups.|link|
|druid-orc-extensions|Support for data in Apache ORC data format.|link|
|druid-parquet-extensions|Support for data in Apache Parquet data format. Requires druid-avro-extensions to be loaded.|link|
|druid-protobuf-extensions|Support for data in Protobuf data format.|link|
|druid-ranger-security|Support for access control through Apache Ranger.|link|
|druid-s3-extensions|Interfacing with data in AWS S3, and using S3 as deep storage.|link|
|druid-ec2-extensions|Interfacing with AWS EC2 for autoscaling middle managers.|Documentation unavailable|
|druid-stats|Statistics-related module including variance and standard deviation.|link|
|mysql-metadata-storage|MySQL metadata store.|link|
|postgresql-metadata-storage|PostgreSQL metadata store.|link|
|simple-client-sslcontext|Simple SSLContext provider module to be used by Druid's internal HttpClient when talking to other Druid processes over HTTPS.|link|