Manage data cubes

A data cube is a multidimensional data model used to organize and visualize aggregated data. Data cubes contain data from one or more tables and provide an interface for users to explore a data set. This topic explains how to create and configure a data cube in Imply Polaris.

View data cubes

Click Data cubes in the left sidebar to view all data cubes.

You can click the star icon on the right side of a data cube in the list to identify it as a favorite. Polaris displays favorites at the top of the page for easy access.

Create a data cube

A data cube contains data from one or more tables. Before you create a data cube, verify that the data you want to present in a data cube is available in a table. See the Data overview to learn about creating tables and loading data.

Users with AdministerDataCubes or ManageDataCubes permissions can create data cubes in Polaris. See Permissions assigned to predefined groups for default permission assignments.

To create a data cube, follow these steps:

  1. Click Data cubes > Create data cube.
  2. Select one of the following options:
    • From table: Select a Polaris table from the drop-down.
    • From SQL query: Write a query that selects data used to populate the data cube. You can add queries against any data source. Use this option if you want to populate the data cube with a subset of a table's data, or you otherwise want to use a SQL query to select and manipulate the data.
    • Import data cube: Import the JSON definition of a data cube. You can export a definition when you edit a data cube.
  3. Choose whether to Auto-fill dimensions and measures (recommended). When enabled, Polaris creates a dimension for each column in the table. It also makes some inferences about the data to create several measures. For instance, it creates a measure that represents the count of events returned by the query underlying the data cube.

    [Figure: Default measures. Dimensions created based on the Koalas to the Max data.]

    Note: Polaris does not auto-generate dimensions for array-type data in the schema and does not support arrays. For more information about dimension and measure detection, see Schema detection.

  4. Click Next: Create data cube.
  5. Complete the General properties for the data cube:
    • Name: Name of the data cube.
    • Description: An optional description.
    • Color theme: The color theme to apply to the data cube.
    • Default timezone: The data cube's timezone.
    • Minimum auto-refresh rate: The minimum rate at which data in the data cube will refresh.
    • Primary time dimension: The time dimension Polaris uses for all time-related calculations for this data cube, including comparisons, filters, alerts, and reports.
    • Query timeout override: An optional value, in milliseconds, that overrides the default 40-second query timeout. Setting this value higher than 660,000 (11 minutes) overrides the default value for the client timeout, which is 660,000 milliseconds.
  6. Click Dimensions to add or modify dimensions.
  7. Click Measures to add or modify measures.
  8. Click Access to control access to the data cube. See Data cube access for more information.
  9. Click Access filters to apply a SQL predicate to a data cube's underlying data. See Access filters for more information.
  10. Click Advanced to set the following advanced options:
    • **Subset filter formula**: Applies a mandatory, hidden SQL filter to all queries made through this data cube. For example, `t."channel" = '#fr.wikipedia'`.
    • **Enforce time filter**: Ensures that every query is filtered on the primary time dimension. You can enable this setting if time unbounded queries are likely to be slow due to the volume of data.
    • **Latest data strategy**: Determines how Polaris calculates the latest data time for the data cube. Options are:
        • **Query the latest timestamp from the data source**: The best option when loading historical data.
        • **Use the current time**: The best option when ingesting real-time data.
    • **Query caching**: Specifies query caching behavior. Allowing caching can greatly speed up exploration but can also cause results to be out of date, especially in real-time rolled up datasets.
    • **Dimension and measure formulae visibility**: Hides formulae that appear in the info dialog for dimensions and measures. Polaris displays formulae by default. Users with appropriate permissions can see formulae in edit mode, regardless of this setting.
    • **Minimum alert frequency** and **Minimum alert timeframe**: The minimum allowed frequency and timeframe for alerts created for this data cube. If set, these options become the minimum **Check every** and **Time frame** options for [alerts](/polaris/set-up-alerts). Users with the `CreateElevatedAlerts` permission are not subject to these restrictions when creating alerts.
    • **Custom comparisons**: Click **Add Compare** to create a custom comparison period. Enter a **Time length** and select a **Unit**, for example `3 days`. Custom comparison periods appear in the **Comparisons** drop-down when you're exploring a data cube.
  11. Click Save to save the data cube.
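
The timeout arithmetic described for the Query timeout override property can be sketched as follows. This is only an illustrative reading of the description above, not Polaris code: the function name is hypothetical, and the sketch assumes that an override above 660,000 milliseconds raises the client timeout to the same value.

```python
# Illustrative sketch only; not part of Polaris.
DEFAULT_QUERY_TIMEOUT_MS = 40_000    # default 40-second query timeout
DEFAULT_CLIENT_TIMEOUT_MS = 660_000  # default client timeout (11 minutes)


def effective_timeouts(override_ms=None):
    """Return (query_timeout_ms, client_timeout_ms) for an optional override.

    Assumption: an override above the default client timeout raises the
    client timeout to the same value.
    """
    if override_ms is None:
        return DEFAULT_QUERY_TIMEOUT_MS, DEFAULT_CLIENT_TIMEOUT_MS
    if override_ms <= 0:
        raise ValueError("override must be a positive number of milliseconds")
    client_ms = max(DEFAULT_CLIENT_TIMEOUT_MS, override_ms)
    return override_ms, client_ms
```

Under these assumptions, an override of 120,000 leaves the client timeout at 660,000, while an override of 900,000 raises both values to 900,000.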

After you create a data cube, you can adjust any settings in the edit screen.

You can create additional data cubes by duplicating an existing data cube and editing its properties.

Edit a data cube

To edit a data cube, click the ellipsis next to the data cube name on the Data cubes page, or in Favorites or Collections. When viewing a data cube, click the pencil icon in the data cube header.

Within the edit view, you can change the data cube properties described in Create a data cube.

Click Export to export the JSON definition of the data cube. You can import a data cube definition when you create a data cube.

If the underlying schema of the table changes, you can update the data cube schema as well. To do so, access the Dimensions or Measures tab in the Edit data cube view. If there are changes to the underlying schema, the Suggestions button indicates the number of changes detected. Click the Suggestions button to review and accept the suggestions as needed.

For more information about schema detection, see Schema detection.

Manage access

By default, all users in your project can view a newly created data cube and all of its underlying data. You can use the Access and Access filters tabs to control access to individual data cubes and to filter the data that specified user groups can access in the data cube.

Data cube access

The data cube creator and users with the AdministerDataCubes permission assigned to their profile can edit the data cube. See Permissions reference for more information on permissions.

To control access to an individual data cube at a more granular level:

  1. Edit the data cube and click the Access tab.
  2. Click Add people.
  3. Select one or more options from the drop-down lists for the following access levels:
    • Can view: View the data cube. Users with this access level also need the AccessVisualization permission assigned to their profile.
    • Can download: View and download the data cube. Users with this access level also need the AccessDownloadData permission assigned to their profile.
    • Can edit: View, download, and edit the data cube. Users with this access level also need the ManageDataCubes or AdministerDataCubes permission assigned to their profile.

In cases where different permissions on the individual level and the group level apply to the same user, Polaris applies the most privileged permission. For example, if an individual user has Can view access to a data cube, but they are a member of a group with Can edit access, they have permission to edit the data cube.
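
The "most privileged permission" rule can be pictured with a short sketch. The ranking table and function name below are hypothetical and only model the behavior described above; they are not a Polaris API. The ordering assumes that each access level includes the ones listed before it, as the access level descriptions suggest.

```python
# Illustrative sketch only; the names below are hypothetical, not a Polaris API.
ACCESS_RANK = {"Can view": 1, "Can download": 2, "Can edit": 3}


def effective_access(individual_level, group_levels):
    """Return the most privileged level among individual and group grants."""
    candidates = [
        level for level in [individual_level, *group_levels] if level is not None
    ]
    if not candidates:
        return None  # no grant at either level
    return max(candidates, key=ACCESS_RANK.__getitem__)
```

For the example above, `effective_access("Can view", ["Can edit"])` resolves to `"Can edit"`.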

Access filters

Access filters allow you to apply a SQL predicate to a data cube's underlying data. You assign the access filter to a user group, so that members of the group view the filtered data when they access the data cube.

You can create multiple access filters for the same data cube, to enable groups of users to have a unique, filtered view of the underlying data.

Access filters are transparent to group members: Polaris provides no indication that they are viewing filtered data.

To create an access filter:

  1. Edit the data cube and click the Access filters tab.
  2. Click Enable and require access filters.
    Note: Once you enable access filters, only users in groups with applied access filters can query the data cube.

  3. Click New access filter and enter the following details:
    • Name: A name for the access filter.
    • Filter: A SQL predicate that selects the data for the filter. Prefix column names with t. (for example, t.country).
    • User group: The user group to attach to the filter. You can only attach a user group to one access filter.
  4. Click Create access filter.
  5. Use the Allow combining multiple filters checkbox to determine the behavior when a user is a member of two or more user groups with access filters applied:
    • If checked, Polaris combines the filters with the OR operator.
    • If unchecked, the user can't view any data in the data cube.
  6. Click Save.
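
The way access filters resolve for a given user can be sketched as follows. This is a simplified model of the rules in the steps above, not Polaris's implementation; the function and parameter names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical, not a Polaris API.
def effective_filter(user_groups, filters_by_group, allow_combining):
    """Return the SQL predicate applied for a user, or None if no data is visible.

    filters_by_group maps a user group name to its access filter predicate.
    """
    matching = [filters_by_group[g] for g in user_groups if g in filters_by_group]
    if not matching:
        return None  # no group with an access filter: the user can't query
    if len(matching) == 1:
        return matching[0]
    if not allow_combining:
        return None  # several filters apply but combining is disabled
    # Combining enabled: join the predicates with the OR operator.
    return " OR ".join(f"({predicate})" for predicate in matching)
```

In this model, a user who belongs to two filtered groups sees the union of both filtered views when combining is allowed, and nothing otherwise.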

Access filter example

Let’s say you created a data cube from the Koalas to the Max data source described in the Quickstart.

You have a group of data analysts in France and a group in Germany. You only want the analysts to view data for their home country plus the United States.

The following steps illustrate how to set up two filtered views of the same Polaris data:

  1. Create a user group named Analysts France and a user group named Analysts Germany.
  2. Assign the analysts in France to the Analysts France group and the analysts in Germany to the Analysts Germany group.
  3. Create a data cube on the Koalas to the Max data source named KTTM France. In the Access filters section, create a new access filter with Filter: t.country IN ('France', 'United States') and User group: Analysts France.
  4. Create a data cube on the Koalas to the Max data source named KTTM Germany. In the Access filters section, create a new access filter with Filter: t.country IN ('Germany', 'United States') and User group: Analysts Germany.
  5. View the KTTM France data cube as an analyst in France and the KTTM Germany data cube as an analyst in Germany, and confirm that the data is filtered as expected.

[Figure: Filtered data cube for data analysts in France (access filter KTTM France)]

[Figure: Filtered data cube for data analysts in Germany (access filter KTTM Germany)]
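
To see what the two predicates from this example select, you can try them against a small stand-in table. The sketch below uses Python's built-in SQLite module purely for illustration: the table name, sample rows, and helper function are hypothetical, and SQLite is not the engine Polaris uses. The table is aliased as t so that the predicates can be used exactly as written in the steps above.

```python
import sqlite3

# Hypothetical sample rows standing in for the Koalas to the Max data.
rows = [
    ("France", 3),
    ("Germany", 5),
    ("United States", 7),
    ("Japan", 2),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kttm (country TEXT, events INTEGER)")
conn.executemany("INSERT INTO kttm VALUES (?, ?)", rows)


def countries_visible(access_filter):
    """Return the sorted countries left after applying a predicate to alias t."""
    # The predicate is trusted input in this sketch; never interpolate
    # untrusted text into SQL like this.
    query = f"SELECT DISTINCT t.country FROM kttm AS t WHERE {access_filter}"
    return sorted(country for (country,) in conn.execute(query))


france_view = countries_visible("t.country IN ('France', 'United States')")
germany_view = countries_visible("t.country IN ('Germany', 'United States')")
```

With these sample rows, `france_view` contains only France and United States, and `germany_view` contains only Germany and United States.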

You can create a URL that links directly to a data cube. See Query parameters reference for information on the query parameters Polaris supports in the URL.

Schema detection

The dimensions and measures of a data cube make up the schema for the data cube. When you create a data cube, Polaris can derive the schema from the base data source, which you can modify as needed.

How schema detection works

Imply looks at the dataset metadata and uses the returned list of columns, their types, and their aggregation (in case of rollup) to determine what dimensions and measures to suggest.

Imply generates dimensions and measures by applying the following rules to the discovered underlying column types:

  • Time columns get mapped to a dimension with automatic bucketing by default.
  • String columns get mapped to a dimension.
  • Numeric columns get mapped to a SUM measure or an otherwise appropriate measure if the column is marked as being aggregated as part of rollup.
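
The mapping rules above can be approximated with a short sketch. The `Column` type, its field names, and the `suggest` function are hypothetical and only model the rules as stated; the sketch assumes that a numeric column with a declared rollup aggregation gets a measure of that aggregation, and SUM otherwise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    name: str
    type: str  # "time", "string", or "numeric" in this sketch
    rollup_aggregation: Optional[str] = None  # for example "max" if rolled up


def suggest(column):
    """Return a (kind, detail) suggestion for one column, or None."""
    if column.type == "time":
        return ("dimension", "automatic bucketing")
    if column.type == "string":
        return ("dimension", None)
    if column.type == "numeric":
        # Use the rollup aggregation when the column declares one, else SUM.
        return ("measure", (column.rollup_aggregation or "sum").upper())
    return None  # other types (for example arrays) get no suggestion
```

For instance, `suggest(Column("revenue", "numeric"))` yields a SUM measure, while a numeric column rolled up with max yields a MAX measure.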

Schema detection limitations

While schema detection enables you to set up a new data cube quickly, you may need to test it and tailor it to suit your needs. Try modifying or deleting the auto-generated dimensions or measures. You can always access them in the Suggestions tab if you decide to revert.

Specifically, schema detection cannot detect the following:

  • Array data.
  • String columns that you might want to see as countDistinct (the number of unique values).
  • The perfect granularities to apply to time and numeric dimensions.
  • Lookups that you might want to apply to certain dimensions.
  • Dimensions that correspond to a URL.
  • Measures that are interesting when filtered on something.
  • Measures that should be seen as a ratio, or some other post aggregation.

Set a default data cube view

When a user navigates to a data cube, the data cube displays data from the latest day and shows the first measure in the list of available measures by default.

You can modify this default view by adding filtering conditions or by setting a specific dimension to be shown with the default settings.

Create a default view:

  1. Create the view you want to save as your default view in the data cube.
  2. Open the options menu by clicking the toggle icon at the top right of the data cube view.
  3. Click Update data cube defaults.
  4. Click Set current view as default.
  5. Click Save for all users.

Data cube options

Click the options icon in the top navigation bar to set the following data cube options:

  • Timezone: Set the data cube’s timezone.
  • Query cache: Enable or disable the query cache for all dashboards and data cubes. The cache setting persists until you reload the page. The cache optimizes query performance, but can cause results to be slightly out of date, especially for stream-ingested rolled-up data sources.
  • Query precision: Controls the precision of TopN and COUNT DISTINCT queries. Defaults to Approximate. Select Exact to improve accuracy in reported metrics. Note that this setting can affect query performance.
  • Refresh rate: Set the rate at which data in the data cube refreshes.
  • View raw data: View and download the raw data underlying the data cube.
  • View essence: View and copy the JSON structure of the data cube.
  • Monitor queries: View the underlying queries Polaris executes when you’re working with the data cube. Click Enable monitoring in the dialog to enable the feature. Once enabled, you can view the time and type of each query. Click a query to see the SQL, copy to clipboard, and open the SQL in query view.
  • Update data cube defaults: Update the initial settings for the data cube view when you first navigate to it.
  • Reset view: Reset the data cube view to the default view.

Click the ellipsis in the top navigation bar to access the following options:

  • Embed this view: Create an embedding link to this data cube view.
  • Add to/Remove from favorites: Mark the data cube as a favorite or remove it from your favorites.
  • Add to collection: Add the data cube to a collection.

Click the download icon in the top navigation bar to download data from the data cube. See Download data for more information.