Introduction to tables
This topic provides an overview of tables in Imply Polaris.
A Polaris table is a first-class object that holds interrelated data organized in rows and columns. Polaris uses tables to store and display data records.
You can have up to 1000 tables per organization, and each table can hold up to 400 columns.
Required user permissions
Members of the Organization Admin, Project Admin, or Data Manager groups, and users with the ManageTables permission, can view table data and modify table schema.
Members of the Data Analyst or Viewer groups, and users with the ViewTables permission, can view table data and schema.
For more information on permissions, visit Permissions reference.
Create a table
You can create a table using the Polaris UI or the Tables API. In the Polaris UI, select Tables from the left pane to open the Tables page, then click the Create table button in the top-right corner of the page.
For information on how to create a table using the Tables API, see Create a table by API.
The following screenshot shows the Tables page with the Create new table dialog displayed:
To create a table, all you need is its name, schema mode, and table type.
The table name must be unique. Additionally, the table name cannot:
- Be empty
- Have leading or trailing spaces
- Contain a whitespace character except space
- Start with the
- Contain the
- Be longer than 255 characters
- Contain control characters in the code ranges 0-31 and 127-159 (the ASCII control characters and the adjacent C1 control range). See the complete table of ASCII characters for more information.
- Contain emojis and Unicode block specials,
- Contain Unicode characters in the \uD800 - \uDFFF range, which are reserved exclusively for use with UTF-16
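As a rough illustration, the fully stated rules above could be checked client-side before submitting a name. This is a minimal sketch, not Polaris's own validation: the function name `table_name_problems` and the `existing_names` parameter are hypothetical, length is assumed to be counted in Unicode code points, and the rules whose details are not given above (the "Start with", "Contain the", and emoji/specials items) are deliberately left out.

```python
MAX_NAME_LENGTH = 255  # limit stated in the rules above


def table_name_problems(name, existing_names=frozenset()):
    """Return the list of violated rules for a proposed table name.

    Hypothetical helper: covers only the rules that are fully specified
    in the list above. An empty list means no violation was found.
    """
    problems = []
    if name == "":
        problems.append("empty")
    if name != name.strip(" "):
        problems.append("leading or trailing spaces")
    if any(ch.isspace() and ch != " " for ch in name):
        problems.append("whitespace other than space")
    if len(name) > MAX_NAME_LENGTH:  # assumption: length in code points
        problems.append("longer than 255 characters")
    if any(ord(ch) <= 31 or 127 <= ord(ch) <= 159 for ch in name):
        problems.append("control character")
    if any(0xD800 <= ord(ch) <= 0xDFFF for ch in name):
        problems.append("surrogate code point")
    if name in existing_names:
        problems.append("not unique")
    return problems
```

A tab character is reported twice because it is both a non-space whitespace character and a control character; the server remains the authority on which names are accepted.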
Delete a table
To delete a table, navigate to the Tables page. Click the ellipsis icon (Open menu) for the table you want to delete and select Drop table.
After you delete a table, data might still be available for a short time until the backend drops all the data.
Table schema and mode
A table schema is an ordered collection of columns that describe a table. You specify the column names and data types for a table in its schema.
A table schema may have declared columns, which are explicitly provided by the user, or undeclared columns, which are inferred by Polaris during ingestion. Both the declared columns and undeclared columns make up the queryable schema for a table.
The schema enforcement mode on a table controls how Polaris enforces the schema on the table. A strict table requires all columns to be declared in the table schema prior to ingestion. A flexible table allows, but does not require, declared columns in the table schema. Polaris creates tables in strict mode by default.
The schema mode you choose depends on your data governance strategies. If you want to enforce strict schema conformity on your data, use a strict table. If you want to allow for a changing or flexible schema, use a flexible table.
You can change the mode on a table. When changing the table's schema enforcement mode, the following behavior applies:
- When converting a strict table to a flexible table, Polaris retains all columns, and these columns stay declared.
- You can only change a flexible table to a strict table when the table is empty.
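The two conversion rules can be summarized as a small predicate. This is only a sketch of the rules as stated here. The function name, the mode strings, and the use of a row count to represent "empty" are assumptions for illustration, not part of any Polaris API.

```python
def can_change_mode(current_mode, target_mode, row_count):
    """Return True if the mode change is allowed under the rules above.

    Hypothetical helper: modes are the strings "strict" and "flexible",
    and a table is treated as empty when row_count is 0.
    """
    modes = {"strict", "flexible"}
    if current_mode not in modes or target_mode not in modes:
        raise ValueError("mode must be 'strict' or 'flexible'")
    if current_mode == target_mode:
        return True  # nothing to change
    if current_mode == "strict":
        return True  # strict -> flexible: columns are kept and stay declared
    return row_count == 0  # flexible -> strict: only when the table is empty
```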
Either schema enforcement mode can be used in batch or streaming ingestion jobs.
Strict tables require all columns in the table schema to be fully declared before data is ingested. The schema of a strict table consists only of declared columns. Polaris does not create new columns, even if it detects input fields that are not associated with any table column.
If you created a table before Polaris introduced schema enforcement modes, the table is strict.
Known limitation: For an aggregate table created with flexible schema mode, the UI displays the auto-discovered columns as dimensions even when they are measures. Declared measure columns still show up as measures.
For a flexible table, Polaris auto-discovers the table schema during ingestion. Polaris dynamically adds columns to the table based on the data it discovers during ingestion. The schema of a flexible table can have both declared and undeclared columns.
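To make the difference between the two modes concrete, the following toy model shows how one input record might be handled in each mode. It is a simplified sketch under stated assumptions, not how Polaris is implemented: the function `ingest_record`, the sets of column names, and the sample field names are all invented for illustration.

```python
def ingest_record(declared, discovered, record, mode):
    """Return the row stored for `record` in a toy table model.

    declared:   set of declared column names (hypothetical)
    discovered: set of undeclared column names; updated in flexible mode
    mode:       "strict" or "flexible"
    """
    row = {}
    for field, value in record.items():
        if field in declared or field in discovered:
            row[field] = value
        elif mode == "flexible":
            discovered.add(field)  # flexible: add a column for the new field
            row[field] = value
        # strict: fields without a matching column do not create columns
    return row


record = {"time": "2024-03-01T10:05:00", "country": "FR", "device": "mobile"}
declared = {"time", "country"}

strict_discovered = set()
strict_row = ingest_record(declared, strict_discovered, record, "strict")

flexible_discovered = set()
flexible_row = ingest_record(declared, flexible_discovered, record, "flexible")
```

In this model the strict table stores only `time` and `country`, while the flexible table also gains an undeclared `device` column.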
The advantage of declaring columns in the schema is that Polaris enforces a strict schema for those columns. You cannot change the name or data type of a declared column once it contains data. The data type of a declared column does not change as data is ingested.
On the other hand, the data type of an undeclared column may change as more data is ingested into it. The column's data type takes the most generic (least restrictive) type for the data stored in the column. For example, a job that ingests both long and string values into a column will assign the column a string data type.
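The "most generic type" rule can be pictured as taking the maximum over an ordering of types. The sketch below assumes an ordering of long, then double, then string. Only the long-plus-string case is confirmed by the text above, so the position of double and the helper name `most_generic` are assumptions for illustration.

```python
# Assumed ordering from most restrictive to least restrictive.
# Only "long" < "string" is stated in the text; "double" is an assumption.
GENERALITY = {"long": 0, "double": 1, "string": 2}


def most_generic(observed_types):
    """Return the least restrictive type among the observed type names."""
    return max(observed_types, key=lambda t: GENERALITY[t])


column_type = most_generic(["long", "string", "long"])
```

Here `column_type` is `"string"`, which matches the example of a job that ingests both long and string values into one column.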
You can declare a column in the table schema even after it has data ingested. Polaris enforces the data type in subsequent ingestion jobs. If there are column values incompatible with the newly declared type, they are treated as null values at query time.
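The query-time behavior can be pictured with a small example in which a column that already holds mixed values is later declared as long. The text does not define which values count as compatible, so this sketch assumes "can be parsed as an integer". The function name `read_as_long` is hypothetical.

```python
def read_as_long(stored_value):
    """Return the value as an int, or None if it cannot be read as one.

    Assumption: "compatible" means Python's int() accepts the value.
    Polaris's actual compatibility rules may differ.
    """
    try:
        return int(stored_value)
    except (TypeError, ValueError):
        return None  # incompatible values surface as null at query time


stored = [7, "12", "abc"]
queried = [read_as_long(v) for v in stored]
```

With these inputs, `queried` is `[7, 12, None]`: the stored data is unchanged, and only the value that does not fit the declared type reads as null.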
To ingest data into a declared column with a specified data type, you can also create a new table in which the column is declared, then ingest from the existing table. You can perform table-to-table ingestion in the UI or the API.
Types of tables
Polaris supports two types of tables: detail and aggregate. A table's type is determined by its rollup status.
There is no conversion between table types. You cannot switch from a detail table to an aggregate table or the other way around.
A table's type is independent of its mode. That is, a detail table may be either flexible or strict in its schema enforcement, and an aggregate table may also be either flexible or strict. When you ingest data into an aggregate table in flexible mode, you must specify both input fields and mappings in an ingestion job to define the aggregations for table measures.
A detail table is the default type. Detail tables have rollup disabled and store each ingested record as is, without any pre-aggregation. For example, an online store manager keeps track of every purchase that is made. When the manager ingests those purchase records into a detail table, the table shows one row for each purchase. Detail tables store columns as dimensions only.
An aggregate table has rollup enabled to group individual records according to the table's time granularity and dimensions. For example, an online store manager is only interested in the total sales per hour for a region. In this case, the manager does not need to see records for every sale, only a summary by hour. Aggregate tables store columns as either dimensions or measures.
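The online store example can be sketched in a few lines to show what rollup does to row counts. This is an illustrative model only: the sample purchases, field names, and the `rollup_by_hour` helper are invented, and it assumes hourly time granularity, `region` as the only dimension, and a sum of `amount` as the only measure.

```python
from collections import defaultdict
from datetime import datetime

# Invented sample data: one record per purchase.
purchases = [
    {"time": "2024-03-01T10:05:00", "region": "EU", "amount": 20.0},
    {"time": "2024-03-01T10:40:00", "region": "EU", "amount": 5.0},
    {"time": "2024-03-01T11:15:00", "region": "EU", "amount": 7.5},
    {"time": "2024-03-01T10:20:00", "region": "US", "amount": 12.0},
]


def rollup_by_hour(records):
    """Group records by (hour, region) and sum the amount measure."""
    totals = defaultdict(float)
    for rec in records:
        hour = datetime.fromisoformat(rec["time"]).replace(
            minute=0, second=0, microsecond=0
        )
        totals[(hour.isoformat(), rec["region"])] += rec["amount"]
    return dict(totals)


detail_rows = list(purchases)               # detail table: one row per purchase
aggregate_rows = rollup_by_hour(purchases)  # aggregate table: one row per hour and region
```

The detail view keeps all four purchases, while the aggregate view holds three rows because the two EU purchases in the 10:00 hour are combined into a single total.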
See the following topics for more information:
- Table schema for information on column types and how to define a table's schema.
- Introduction to data rollup for details about rollup on tables.
- Data partitioning for configuring partitioning on a table to improve query performance.
- Create a table by API for creating tables and schemas using the Polaris API.
- Create an ingestion job for ingesting data into tables.