2022.06

Joins

Apache Druid has two features for joining data:

  1. Join operators. These are available using a join datasource in native queries, or using the JOIN operator in Druid SQL. Refer to the join datasource documentation for information about how joins work in Druid.
  2. Query-time lookups, simple key-to-value mappings. These are preloaded on all servers that are involved in queries and can be accessed with or without an explicit join operator. Refer to the lookups documentation for more details.
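
To make the two features concrete, the following minimal sketch expresses the same question both ways in Druid SQL and wraps each statement in a JSON body of the form the Druid SQL API accepts. The `sales` datasource, its `store` and `revenue` columns, and the `store_to_country` lookup are hypothetical names chosen for illustration. The sketch assumes the lookup is already loaded and relies on Druid SQL exposing lookups as tables in the `lookup` schema with columns `k` and `v`.

```python
import json

# Hypothetical names: datasource "sales" (columns store, revenue) and a
# lookup called "store_to_country". Neither comes from a real deployment.

# 1. Explicit JOIN operator: the lookup is read as a table in the "lookup"
#    schema, with key column k and value column v.
JOIN_SQL = """
SELECT store_to_country.v AS country, SUM(sales.revenue) AS revenue
FROM sales
INNER JOIN lookup.store_to_country ON sales.store = store_to_country.k
GROUP BY 1
"""

# 2. Query-time lookup without an explicit join, using the LOOKUP function.
LOOKUP_SQL = """
SELECT LOOKUP(store, 'store_to_country') AS country, SUM(revenue) AS revenue
FROM sales
GROUP BY 1
"""


def sql_payload(sql: str) -> str:
    """Collapse whitespace and wrap the statement as {"query": ...} JSON."""
    return json.dumps({"query": " ".join(sql.split())})


if __name__ == "__main__":
    for statement in (JOIN_SQL, LOOKUP_SQL):
        print(sql_payload(statement))
```

Each printed string could be sent as the body of a POST request to the Druid SQL endpoint. The two statements are not strictly equivalent: the inner join drops rows whose `store` has no entry in the lookup, while the `LOOKUP` function returns null for them.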

For best performance, avoid joins at query time whenever possible. You can often do this by joining data before you load it into Druid. However, there are situations where joins or lookups are the best available solution despite the performance overhead, including:

  • The fact-to-dimension (star and snowflake schema) case: you need to change dimension values after initial ingestion and cannot reingest the data to do so. In this case, you can use lookups for your dimension tables.
  • Your workload requires joins or filters on subqueries.
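
As a rough illustration of joining data before it is loaded into Druid, the sketch below denormalizes fact rows against a small dimension mapping so that no join is needed at query time. The `denormalize` helper, the field names, and the sample rows are all hypothetical; a real pipeline would typically perform this step in whatever system prepares data for ingestion.

```python
from typing import Any


def denormalize(
    fact_rows: list[dict[str, Any]],
    dimension: dict[str, Any],
    key: str,
    target: str,
) -> list[dict[str, Any]]:
    """Return copies of fact_rows with `target` resolved from `dimension`.

    Rows whose key is missing from the dimension get None, which mirrors
    a left join rather than an inner join.
    """
    return [{**row, target: dimension.get(row[key])} for row in fact_rows]


# Hypothetical dimension table and fact rows.
stores = {"s1": "US", "s2": "DE"}
sales = [
    {"store": "s1", "revenue": 10},
    {"store": "s2", "revenue": 7},
    {"store": "s9", "revenue": 3},  # no matching dimension entry
]

rows = denormalize(sales, stores, key="store", target="country")

if __name__ == "__main__":
    for row in rows:
        print(row)
```

The trade-off matches the first case above: once `country` is written into the ingested rows, changing a store's country means reingesting the affected data, whereas a lookup can be updated without touching it.
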
Last updated on 6/16/2022
Copyright © 2022 Imply Data, Inc