Data cube dimensions

Data cube dimensions correspond to columns in your source table. In a data cube, you can display and filter data by dimension.

You can drag dimensions into the filter bar, show bar, and pinboard panels to interactively explore and analyze data.

In the following figure, the current view of the data cube filters by the Time dimension, named __time in the original table. It shows the sum of events by country and the total number of events:

[Image: dimension panel with at least one collapsed and one expanded group]

Number of Events is a data cube measure. In contrast to a dimension, a measure represents the output of an aggregation function applied to your data.
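
For example, a measure such as Number of Events is typically backed by an aggregation expression. The following is a sketch that assumes a hypothetical count column in the source table:

SUM(t."count")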

Click on a dimension to activate the dimension preview menu.

[Image: dimension actions]

Create a custom dimension

You can add new dimensions or edit the existing ones for a data cube. Click the edit icon at the top right to open the Edit data cube page. Click the Dimensions tab in the data cube edit view to display the dimensions in the data cube.

[Image: Cube Edit Dimensions]

Click the edit icon for a specific dimension to modify its configuration. If the dimension directly represents a column from the source table, make your changes in the Basic tab.

To create a dimension to represent an arbitrary transformation, use the Custom tab. There you can enter a Druid SQL expression as the dimension's formula.

For example, the following expression defines a dimension based on a lookup function for country names:

COALESCE(LOOKUP(t."Country", 'store_sales_country_to_iso31661'), t."Country")

This expression performs a lookup that retrieves the full country name based on a mapping between country abbreviations and country names in the store_sales_country_to_iso31661 table. The resulting data cube includes a dimension with the full country name. In effect, this data cube joins rows from the store_sales_country_to_iso31661 and store_sales tables.

Data cube dimensions and measures can include description text that appears in the Info pop-up window to help users.

[Image: Dimension info]

Dimension suggestions

You can use the Suggestions feature to rapidly add many new dimensions. Polaris scans the schema that underlies the dataset and automatically suggests a dimension for any column not already represented.

Dimension types

This section discusses some of the many types of dimensions that you can create.

Time dimensions

Druid uses a primary time column (called __time) for partitioning. However, it's often useful to have more than one time column.

Polaris can ingest these time columns as regular dimensions formatted in ISO string format (for example, 2017-04-20T16:20:00Z).
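
For example, assuming a hypothetical created_at column ingested as an ISO 8601 string, you could define a custom time dimension with Druid's TIME_PARSE function:

TIME_PARSE(t."created_at")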

[Image: Cube view line chart]

When creating or editing a dimension, select Time as the type to make a time dimension. You can also configure the preset granularities that are presented as defaults.

String dimensions

Most dimensions are categorical, in the sense that they are represented as strings. String dimensions appear in the left navigation panel with the "A" icon.

When creating or editing a dimension, select String as the type to make a string dimension.
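
Custom string dimensions can also derive categories from existing columns. As a sketch, assuming a hypothetical userAgent column, a CASE expression could group raw values into broader labels:

CASE
  WHEN t."userAgent" LIKE '%Chrome%' THEN 'Chrome'
  WHEN t."userAgent" LIKE '%Firefox%' THEN 'Firefox'
  ELSE 'Other'
END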

Multi-value dimensions

Select the Set/String type to create a multi-value dimension, which is an array of strings. For example, a multi-value dimension named Cities might contain the following data:

["Paris","London","New York","Sydney"]

Numeric dimensions

Numeric dimensions represent numeric values in the data source. You can apply aggregations to numeric dimensions to create new measures.

When creating or editing a dimension, select Number as the type to make a numeric dimension. You can also configure the preset bucketing granularities that are presented as defaults.
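
Instead of relying on the preset bucketing granularities, you can also bucket values yourself in a custom dimension. The following sketch assumes a hypothetical price column and groups values into buckets of 10:

FLOOR(t."price" / 10) * 10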

Geographic dimensions

Geographic units can be a useful way to segment data.

[Image: Cube view geo]

To use the Geo marks and Geo shade visualizations, create and display a single dimension configured as the Geo type. Select the appropriate Geo encoding—the underlying data must conform to one of the following specifications:

  • ISO-3166-1 Alpha 2
  • ISO-3166-1 Alpha 3
  • ISO-3166-2
  • UN M49
  • Geohash

Boolean dimensions

You can use boolean expressions to create dimension values. In the following example, a formula creates a dimension that is true for accounts in the United States or for specific accounts matched by name.

t.country = 'United States' OR t.accountName IN ('Toyota', 'Honda')

IP dimensions

You can store a dimension as a complex data type representing an IP address or an IP prefix.

When you ingest IP data, you can use the IP and IP prefix type dimensions in Polaris to search and filter in your visualizations based on IP addresses and IP prefixes.

When you create a data cube, Polaris auto-detects the complex IP columns and applies the appropriate type:

  • IP: An IP address in a 128-bit binary format.
  • IP prefix: An IP prefix in a 136-bit binary format, composed of a 128-bit address and an 8-bit prefix value.

See the Visualize data page for information on using IP dimensions to filter a visualization.

If you create a dimension with type IP or IP prefix on a column that doesn't contain complex IP type data, Polaris converts the dimension type to string. IP filters don't work on strings but you can apply string filters to this data.

Custom dimension examples

Here are some examples of common dimension patterns.

The following examples show expressions in Pivot SQL.

Lookups

You can create dimensions that perform a lookup at query time. To have a dimension that represents a key into another table, set up a Druid query-time lookup (QTL).

You can then use the LOOKUP() function, passing the lookup key and the lookup table name, as follows:

LOOKUP(t.lookupKey, 'my_awesome_lookup')

To handle values that are not found in the lookup table, you can either retain them as they are or replace them with a specific string.

For example, to keep values as they are, use the lookup key as the fallback value:

COALESCE(LOOKUP(t.lookupKey, 'my_awesome_lookup'), t.lookupKey)

To replace values not found in the lookup table with a string, such as the word missing, pass the string as the fallback value:

COALESCE(LOOKUP(t.lookupKey, 'my_awesome_lookup'), 'missing')

The equivalent in Pivot's expression syntax is:

$lookupKey.lookup('my_awesome_lookup').fallback('missing')

URL dimensions

A URL dimension represents URLs in the source data. You can add a URL transformation to the dimension configuration to add a Go to URL action button for the dimension.

To create a URL dimension, edit the dimension and click Add URL. Provide the URL pattern, as in this example:

[Image: Cube edit dimension modal URL]

Polaris substitutes the dimension value for the %s placeholder in the URL pattern. In this example, the value Nepal becomes https://en.wikipedia.org/wiki/Nepal, the Wikipedia page for Nepal.
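
Inferred from this example, the URL pattern would look like the following:

https://en.wikipedia.org/wiki/%s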

Extractions

Imagine you have a column called resourceName with the following values:

druid-0.18.2
druid-0.18.1
druid-0.17.0
index.html

You can create a dimension that extracts the version number in the column values:

REGEXP_EXTRACT(t.resourceName, '(\d+\.\d+\.\d+)')

This expression returns the following values:

0.18.2
0.18.1
0.17.0
null

Other examples

This section includes common expressions you can use to build custom dimensions from the dimensions that already exist in your data cube.

This example returns the character length for every row of the dimension. Applies to the String dimension type.

LENGTH(t."comment")

This example extracts the first sequence of digits from every row of the dimension. If there are no digits, it returns null. Use regular expressions to build extractions. Applies to the String dimension type.

REGEXP_EXTRACT(t."comment", '([0-9]+)', 0)

This example converts the entire string to uppercase. To convert to lowercase, use LOWER instead. Applies to the String dimension type.

UPPER(t."comment")

This example adds 'End of String' to the end of the dimension's values. Applies to the String dimension type.

CONCAT(t."comment", 'End of String')

This example returns the position of 'World' in the string for every row in the dimension. In Pivot SQL, POSITION starts indexing at 1 and returns 0 when no match is found; subtracting 1 converts the result to a zero-based index (-1 when there is no match). Applies to the String dimension type.

POSITION('World' IN t."dimension") - 1

This example returns y characters starting at position x for every row in the dimension. Applies to the String dimension type.

SUBSTR(t."dimension", x, y)

This example returns true if the value matches a value in the specified set; otherwise, it returns false. To return custom values instead, use a Pivot expression as follows:

$cityName.in(['London','Aachen','Abbotsford']).then('Match Found').fallback('No Match')

Applies to String and Numeric data types.

t."cityName" IN ('London','Aachen','Abbotsford')

This example returns true or false based on whether the given regular expression is matched. You can specify the returned values as follows:

$dimension.match('^Hel*').then('Matched').fallback('Not matched')

Applies to the String dimension type.

REGEXP_LIKE(t."comment", ('^Hel*'))

This example returns true or false based on whether the input string is contained in the value. You can specify the returned values as follows:

$dimension.contains('hello').then('Hello Friend').fallback('Bye Friend')

Applies to the String dimension type.

CONTAINS_STRING(t."comment", 'hello')

You can use math operations in custom expressions. Applies to the Numeric data type.

LENGTH(t."comment") + 2

This example returns an absolute value. Applies to the Numeric data type.

ABS(CHAR_LENGTH(t."channel")-100)

This example raises the operand to the power that you specify. Applies to the Numeric data type.

POWER(LENGTH(t."comment"), 5)

This example returns true if the value matches the input value and false if it does not. Only use single quotes with strings. Applies to String, Numeric, and Boolean data types.

t."comment" IS 'red'

This example performs an AND operation between the value of the dimension and the second value that you input. Additional Boolean expressions include 'or', 'greaterOrEqual', 'greaterThan', 'lessThanOrEqual', 'lessThan', and 'not'. Applies to the Boolean data type.

t."comment" AND (2>1)

This example creates a time bucket of the duration that you specify. Applies to the Time data type.

TIME_FLOOR(t."__time", 'PT5M')

This example rounds the timestamp down to the start of the specified period, here one day, in the given time zone. Applies to the Time data type.

TIME_FLOOR(t."__time", 'P1D', 'Asia/Shanghai')

This example shifts the timestamp by the period you specify, multiplied by the step value; here, it shifts the time back five minutes. To shift time forward, use a positive step value without a + sign. Applies to the Time data type.

TIME_SHIFT(t."__time", 'PT1M', -5)

This example returns part of a timestamp. Additional possibilities include: 'MINUTE_OF_HOUR', 'MINUTE_OF_DAY', 'MINUTE_OF_WEEK', 'MINUTE_OF_MONTH', 'MINUTE_OF_YEAR', 'HOUR_OF_DAY', 'HOUR_OF_WEEK', 'HOUR_OF_MONTH', 'HOUR_OF_YEAR', 'DAY_OF_WEEK', 'DAY_OF_MONTH', 'DAY_OF_YEAR', 'WEEK_OF_MONTH', 'WEEK_OF_YEAR', and 'MONTH_OF_YEAR'. Applies to the Time data type.

TIME_EXTRACT(t."__time", 'SECOND')

This example converts a number, interpreted as milliseconds since the epoch, to a timestamp.

MILLIS_TO_TIMESTAMP(t."dimension")