Managing data cubes

This section explains how to create and configure your data cubes.

Creating a data cube

Create a new data cube from the home view by clicking the + button and selecting New data cube from the menu.

add cube modal

The Create new data cube modal allows you to define the source data for a data cube.

add cube modal

Checking the Auto fill dimensions and measures checkbox instructs Pivot to automatically detect your schema and create dimensions and measures accordingly.

Another way to create a new data cube is to duplicate an existing data cube and edit its properties.

After creating the data cube you will be taken to the edit screen to tweak any fine details.

Editing a data cube

Click the pencil icon in the data cube header to edit a data cube.

Cube edit view

Within the edit view you can change the title, description, and default timezone of the data cube. You can also edit and create dimensions and measures from their respective tabs.

Schema detection

Pivot will can automatically detect the columns in your data source and recommend some possible dimensions and measures.

Schema detection can be helpful when you are creating a new data cube in which case it will allow you to initialize the data cube with some dimensions and measures.

The other use for schema detection is when you evolve your schema and add columns over time. If you have just added a column and you want a corresponding dimension to be created for it, you can use schema detection to automatically construct that column for you. This functionality is accessible from the Suggestions tab of the New dimension and New measure modals.

Cube edit new dimension modal

How schema detection works

Pivot looks at the data source metadata and uses the returned list of columns, their types, and their aggregation (in case of rollup) to determine what dimensions and measures to suggest.

Pivot generates dimensions and measures by applying the following rules to the discovered underlying column types:

  • Time columns get mapped to a dimension with automatic bucketing by default.
  • String columns get mapped to a dimension.
  • Numeric columns get mapped to a SUM measure or an otherwise appropriate measure if the column is marked as being aggregated as part of rollup.
  • ApproximateHistogram columns get mapped as a 98th percentile measure.
  • HyperUnique and ThetaSketch columns get mapped to a countDistinct measure.

Limitations

While schema detection is an invaluable tool for quickly getting you up and running with a new data source, it can never do a perfect job. Once you create a data cube you should play with it and taylor it to be perfectly suited to your needs - don't hesitate to change and delete the auto-generated dimensions and measures - they will always be there in the Suggestions tab.

In particular, schema detection can not detect these common scenarios:

  • String columns that you might want to see as countDistinct - number of unique values.
  • The perfect granularities to apply to time and numeric dimensions.
  • Lookups that you might want to apply to certain dimensions.
  • Dimensions that actually correspond to a URL.
  • Measures that are interesting when filtered on something.
  • Measures that should be seen as a ratio - or some other post aggregation.
Pivot

Advanced

Api