• Developer guide
  • API reference

›Analytics

Getting started

  • Introduction to Imply Polaris
  • Quickstart
  • Execute a POC
  • Create a dashboard
  • Navigate the console
  • Key concepts

Tables and data

  • Overview
  • Introduction to tables
  • Table schema
  • Create an ingestion job
  • Timestamp expressions
  • Data partitioning
  • Introduction to rollup
  • Approximation algorithms
  • Replace data

Ingestion sources

  • Ingestion sources overview
  • Supported data formats
  • Create a connection
  • Ingest from files
  • Ingest data from a table
  • Ingest from S3
  • Ingest from Kafka and MSK
  • Ingest from Kinesis
  • Ingest from Confluent Cloud
  • Kafka Connector for Imply Polaris
  • Push event data
  • Connect to Confluent Schema Registry

Analytics

  • Overview
  • Manage data cubes
  • Visualize data
  • Data cube dimensions
  • Data cube measures
  • Dashboards
  • Visualizations reference
  • Set up alerts
  • Set up reports
  • Embed visualizations
  • Query data

Monitoring

  • Overview

Management

  • Overview
  • Pause and resume a project

Billing

  • Overview
  • Polaris plans
  • Estimate project costs

Usage

  • Overview

Security

    Polaris access

    • Overview
    • Invite users to your organization
    • Manage users
    • Permissions reference
    • Manage user groups
    • Enable SSO
    • SSO settings reference
    • Map IdP groups

    Secure networking

    • Connect to AWS
    • Create AWS PrivateLink connection

Developer guide

  • Overview
  • Authentication

    • Overview
    • Authenticate with API keys
    • Authenticate with OAuth
  • Manage users and groups
  • Migrate deprecated resources
  • Create a table
  • Define a schema
  • Upload files
  • Create an ingestion job
  • Ingestion sources

    • Ingest from files
    • Ingest from a table
    • Get ARN for AWS access
    • Ingest from Amazon S3
    • Ingest from Kafka and MSK
    • Ingest from Amazon Kinesis
    • Ingest from Confluent Cloud
    • Push event data
    • Kafka Connector for Imply Polaris
    • Kafka Connector reference
  • Filter data to ingest
  • Ingest nested data
  • Ingest and query sketches
  • Specify data schema
  • Query data
  • Update a project
  • Link to BI tools
  • Connect over JDBC
  • Query parameters reference
  • API documentation

    • OpenAPI reference
    • Query API

Product info

  • Release notes
  • Known limitations
  • Druid extensions

Manage data cubes

A data cube is a multidimensional data model used to organize and visualize aggregated data. Data cubes contain data from one or more tables and provide an interface for users to explore a data set. This topic explains how to create and configure a data cube in Imply Polaris.

View data cubes

Click Data cubes in the left sidebar to view all data cubes.

You can click the star icon on the right side of a data cube in the list to identify it as a favorite. Polaris displays favorites at the top of the page for easy access.

Create a data cube

A data cube contains data from one or more tables. Before you create a data cube, verify that the data you want to present in a data cube is available in a table. See the Data overview to learn about creating tables and loading data.

Users with AdministerDataCubes or ManageDataCubes permissions can create data cubes in Polaris. See Permissions assigned to predefined groups for default permission assignments.

To create a data cube, follow these steps:

  1. Click Data cubes in the left sidebar.
  2. Click New data cube in the top left corner.
  3. Choose between a table or a SQL query as your data source:
    • If you choose a table, the table must already be created in Polaris.
    • For a SQL query, write a query that selects data used to populate the data cube. You can add queries against any data source. Use this if you want a subset of the table's data, or you otherwise want to use a SQL query to select and manipulate the data.
  4. Choose whether to Auto-fill dimensions and measures. When enabled (recommended), Polaris creates a dimension for each column in the table. It also makes some inferences about the data to create several measures. For instance, it creates a measure that represents the count of events returned by the query underlying the data cube. The following figure shows the dimensions created based on the Koalas to the Max data: Default measures

    For more information about dimension and measure detection, see Schema detection.

  5. Click Next: Create data cube.
  6. Complete the General properties for the data cube:
    • Name: Name of the data cube.
    • Description: An optional description.
    • Color theme: The color theme to apply to the data cube.
    • Default timezone: The data cube's timezone.
    • Minimum auto-refresh rate: The minimum rate at which data in the data cube will refresh.
  7. Click Save to save the data cube.

After you create a data cube, you can adjust any settings in the edit screen.

You can create additional data cubes by duplicating an existing data cube and editing its properties.

Edit a data cube

To edit a data cube, click the pencil icon in the data cube header.

Within the edit view, you can change the general properties of the data cube.

You can also edit and create dimensions and measures from their respective tabs, and manage access and access filters for the data cube.

Cube edit view

If the underlying schema of the table changes, you can update the data cube schema as well. To do so, access the Dimensions or Measures tab in the Edit data cube view. If there are changes to the underlying schema, the Suggestions button indicates the number of changes detected. Click the Suggestions button to review and accept the suggestions as needed.

For more information about schema detection, see Schema detection.

Manage access

By default, all users in your project can view a newly created data cube and all of its underlying data. You can use the Access and Access filters tabs to optionally control access to individual data cubes, and filter the data specified user groups can access in the data cube.

Data cube access

The data cube creator and users with the AdministerDatacubes permission assigned to their profile can edit the data cube. See Permissions reference for more information on permissions.

To control access to an individual data cube at a more granular level:

  1. Edit the data cube and click the Access tab. Data cube access
  2. Click Add people.
  3. Select one or more options from the drop-down lists for the following access levels:
    Can view: View the data cube. Users with this access level also need the AccessVisualization permission assigned to their profile.
    Can download: View and download the data cube. Users with this access level also need the AccessDownloadData permission assigned to their profile.
    Can edit: View, download, and edit the data cube. Users with this access level also need the ManageDataCubes or AdministerDataCubes permission assigned to their profile.

In cases where different permissions on the individual level and the group level apply to the same user, Polaris applies the most privileged permission. For example, if an individual user has Can view access to a data cube, but they are a member of a group with Can edit access, they have permission to edit the data cube.

Access filters

Access filters allow you to apply a SQL predicate to a data cube's underlying data. You assign the access filter to a user group, so that members of the group view the filtered data when they access the data cube.

You can create multiple access filters for the same data cube, to enable groups of users to have a unique, filtered view of the underlying data.

Access filters are transparent to group members—Polaris provides no indication that they are viewing filtered data.

To create an access filter:

  1. Edit the data cube and click the Access filters tab.
  2. Click Enable and require access filters.

    Once you enable access filters, only users in groups with applied access filters can query the data cube.

  3. Click New access filter and enter the following details:
    • Name: A name for the access filter.
    • Filter: A SQL query to select data for the filter. Preface column names with t.—for example, t.country.
    • User group: The user group to attach to the filter. You can only attach a user group to one access filter. Access filters
  4. Click Create access filter.
  5. Use the Allow combining multiple filters checkbox to determine the behaviour when a user is a member of two or more user groups with access filters applied:
    If checked, Polaris combines the filters with the OR operator.
    If unchecked, the user can't view any data in the data cube.
  6. Click Save.

Data cube access example

Imagine you are running an auction system with buyers and sellers. You have an auctions table that contains a record of every closed auction, and has columns for buyer_id and seller_id.

You want to offer sellers a way to view their own data, and prevent them from viewing any other data. At the same time, you want your internal users to be able to view all of the data.

Here's how you could set this up:

  1. Create the following roles:
    • InternalUser
      Assign this role to all internal users in your organization.
    • ExternalSeller
      Apply the AccessVisualization permission and assign this role to all sellers.
    • SellerA, SellerB, etc
      Assign these roles to the sellers in each seller organization (where the letter A, B, etc is the seller organization name).
  2. Create the following data cubes on the auctions data source:
    • Auctions_internal
      Create dimensions for buyer_id and seller_id to allow your internal users to query the data.
      Apply the Can edit access level to the InternalUser role.
      This data cube doesn't need an access filter.
    • Auctions_sellers
      Apply the Can view access level to the ExternalSeller role.
      Create an access filter for Seller A, apply a SQL query to select data with Seller A's seller_id and assign a user group that belongs to the SellerA role.
      Create additional access filters for the other sellers.
      This data cube doesn't need a dimension for seller_id because it is always filtered to a single seller_id value.

You could also set up a similar arrangement for buyers, by creating an ExternalBuyer role and an Auctions_buyers data cube with the appropriate buyer access filters.

Create a link to a data cube

You can create a URL that links directly to a data cube. See Query parameters reference for information on the query parameters Polaris supports in the URL.

Schema detection

The dimensions and measures of a data cube make up the schema for the data cube. When you create a data cube, Polaris can derive the schema from the base data source, which you can modify as needed.

How schema detection works

Imply looks at the dataset metadata and uses the returned list of columns, their types, and their aggregation (in case of rollup) to determine what dimensions and measures to suggest.

Imply generates dimensions and measures by applying the following rules to the discovered underlying column types:

  • Time columns get mapped to a dimension with automatic bucketing by default.
  • String columns get mapped to a dimension.
  • Numeric columns get mapped to a SUM measure or an otherwise appropriate measure if the column is marked as being aggregated as part of rollup.

Schema detection limitations

While schema detection enables you to set up a new data cube quickly, you may need to test it and tailor it to suit your needs. Try modifying or deleting the auto-generated dimensions or measures. You can always access them in the Suggestions tab if you decide to revert back.

In particular, schema detection cannot detect these common scenarios:

  • String columns that you might want to see as countDistinct - number of unique values.
  • The perfect granularities to apply to time and numeric dimensions.
  • Lookups that you might want to apply to certain dimensions.
  • Dimensions that correspond to a URL.
  • Measures that are interesting when filtered on something.
  • Measures that should be seen as a ratio, or some other post aggregation.

Set a default data cube view

When a user navigates to a data cube, the data cube displays data from the latest day and shows the first measure in the list of available measures by default.

You can modify this default view by adding filtering conditions or by setting a specific dimension to be shown with the default settings.

Create a default view:

  1. Create the view you want to save as your default view in the data cube.
  2. Open the options menu by clicking the toggle icon at the top right of the data cube view.
  3. Click Update data cube defaults.
  4. Click Set current view as default.
  5. Click Save for all users. Set as default

Download data

To download data cube data, click the download icon in the top navigation bar. The Download data page appears with the following fields:

  • Filename: Specify the name of the file to generate and download based on the data.
  • Max columns to download: Specify the maximum number of columns to download.
  • Max rows to download: Specify the maximum number of rows to download.
  • Format: Specify the file format for your download. Options are:
    • CSV (Comma-separated values)
    • TSV (Tab-separated values)
    • XSLX (Excel spreadsheet)
    • JSON (Newline-delimited JSON)

Once you click Download, the download status bar indicates the progress of your download.

Data cube options

Click the options icon in the top navigation bar to set the following data cube options:

  • Timezone: Set the data cube’s timezone.
  • Query cache: Enable or disable the query cache for all dashboards and data cubes. The cache setting persists until you reload the page. The cache optimizes query performance, but can cause results to be slightly out of date—especially for stream-ingested rolled-up datasources.
  • Refresh rate: Set the rate at which data in the data cube refreshes.
  • View raw data: View and download the raw data underlying the data cube.
  • View essence: View and copy the JSON structure of the data cube.
  • Monitor queries: View the underlying queries Polaris makes when you’re working with the data cube. Click Enable monitoring in the dialog to enable the feature.
  • Update data cube defaults: Update the initial settings for the data cube view when you first navigate to it.
  • Reset view: Reset the data cube view to the default view.

Click the ellipsis in the top navigation bar to access the following options:

  • Set up alert: Set up an alert for the data cube.
  • Schedule report: Set up a report for the data cube.
  • Embed this view: Create an embedding link to this data cube view.
  • Add to/Remove from favorites: Identify the data cube as a favorite or remove it from favorites. Polaris displays favorites at the top of the main data cubes page for easy access.
← OverviewVisualize data →
  • View data cubes
  • Create a data cube
  • Edit a data cube
  • Manage access
    • Data cube access
    • Access filters
    • Data cube access example
  • Create a link to a data cube
  • Schema detection
    • How schema detection works
    • Schema detection limitations
  • Set a default data cube view
  • Download data
  • Data cube options
Key links
Try ImplyApache Druid siteImply GitHub
Get help
Stack OverflowSupportContact us
Learn more
BlogApache Druid docs
Copyright © 2023 Imply Data, Inc