Data overview

Like many database systems, Imply Polaris stores your data in tables. Behind the scenes, the tables are in the columnar format used by Apache Druid.

This topic provides a general overview of the data lifecycle for both the Polaris UI and the API. For more information on how to use the API, see Polaris API overview.

Supported input data formats

Before you build tables and load data, verify that your source data conforms to the supported data and file formats.

Create a schema

A schema defines the columns and other metadata about your tables. You can create a schema:

  • Manually with the UI. See Table schema.
  • Using schema detection when you upload a file. Follow the Quickstart.
  • Using the API. See Define table schemas by API.

Polaris offers the following schema optimizations:

  • Rollup, a form of pre-aggregation that reduces the size of stored data and improves query performance.
  • Partitioning, a way to organize your data that improves query performance for the columns you frequently use in filters.

Additionally, you can use data sketch columns for approximation.
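
For illustration, the sketch below expresses a simple table schema as a request body you might send when creating a table through the API. The field names (name, type, schema, dataType) and the example values are assumptions based on the table model described above, not the authoritative format; see Table schema and the Tables API reference for the exact contract.

    # A minimal sketch of a table schema as a Python dict, suitable for sending
    # as JSON in a create-table request. Field names and values are assumptions;
    # see "Define a schema" and the Tables API reference for the exact format.
    example_table = {
        "name": "koalas",                    # hypothetical table name
        "type": "detail",                    # "aggregate" would enable rollup
        "schema": [
            {"name": "__time", "dataType": "timestamp"},  # primary timestamp
            {"name": "city", "dataType": "string"},
            {"name": "session_length", "dataType": "long"},
        ],
    }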

Ingest data

You can ingest from a variety of sources into Polaris, such as uploaded files, Amazon S3 buckets, and Kafka topics. Polaris automatically scales the number of ingestion tasks within your project to maintain optimal performance. For a reference on the available sources for ingestion, see Ingestion sources overview.

To ingest data into Polaris, create the following components:

  1. A table with a defined schema to receive the ingested data.
  2. A connection to define the source of the data. Connections are not required for ingesting from uploaded files or existing tables.
  3. An ingestion job to bring in data from the connection.

The ingestion job loads data into a table from the specified source, whether that source is a connection or existing data in Polaris. In the ingestion job specification, you also define how the input data maps to the table schema and any transformations to apply. While your data loads, you can monitor the status of your ingestion job on the Jobs tab in the UI or using the Jobs API.

You can create tables, connections, and ingestion jobs using the UI or the API.
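
For a sense of what the API path looks like, the following sketch creates the three components in order: a table, a connection, and an ingestion job. The base URL, endpoint paths, request bodies, and authorization header shown here are assumptions for illustration only; see the Polaris API overview, Authenticate with API keys, and the OpenAPI reference for the exact endpoints and fields.

    import requests

    # Assumed values for illustration; replace with your organization and API key.
    # The exact auth scheme is described in "Authenticate with API keys".
    BASE = "https://example-org.api.imply.io/v1"      # hypothetical base URL
    HEADERS = {"Authorization": "Basic <API_KEY>",    # placeholder credential
               "Content-Type": "application/json"}

    # 1. Create a table to receive the ingested data (endpoint path is an assumption).
    requests.post(f"{BASE}/tables", headers=HEADERS, json={
        "name": "koalas",
        "type": "detail",
    }).raise_for_status()

    # 2. Create a connection that defines the source, for example an S3 bucket.
    #    Connections are not required for uploaded files or existing tables.
    requests.post(f"{BASE}/connections", headers=HEADERS, json={
        "name": "my-s3-connection",          # hypothetical connection name
        "type": "s3",
        "bucket": "example-bucket",
    }).raise_for_status()

    # 3. Submit an ingestion job that maps the source data onto the table schema.
    job = requests.post(f"{BASE}/jobs", headers=HEADERS, json={
        "type": "batch",
        "target": {"type": "table", "tableName": "koalas"},
        "source": {"type": "connection", "connectionName": "my-s3-connection"},
    }).json()

    # Monitor the job; the same Jobs API backs the Jobs tab in the UI.
    status = requests.get(f"{BASE}/jobs/{job['id']}", headers=HEADERS).json()
    print(status)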

You can stream event data from Confluent Cloud to Polaris using the UI or the API.

Work with existing tables and data

After you create a table, you can:

  • Add batch data in the UI or using the API.
  • Replace existing data for specific time intervals.
  • Drop data in the UI or use the Jobs API to create a delete_data job. You can drop all data or only the data within a specific time interval.
  • Delete the table in the UI or use the Jobs API to create a drop_table job.

To access table operations within the Tables list, click the ... menu button at the far left of the table.
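
As an API-side sketch of the drop and delete operations above, the following example submits delete_data and drop_table jobs through the Jobs API. Only the job type names come from this topic; the endpoint path and request body fields are assumptions, so check the Jobs API reference for the exact format.

    import requests

    BASE = "https://example-org.api.imply.io/v1"      # hypothetical base URL
    HEADERS = {"Authorization": "Basic <API_KEY>",    # placeholder credential
               "Content-Type": "application/json"}

    # Drop data for a specific time interval (the body shape is an assumption;
    # only the delete_data job type comes from the text above).
    requests.post(f"{BASE}/jobs", headers=HEADERS, json={
        "type": "delete_data",
        "tableName": "koalas",
        "interval": "2022-01-01T00:00:00Z/2022-02-01T00:00:00Z",
    }).raise_for_status()

    # Delete the table itself with a drop_table job.
    requests.post(f"{BASE}/jobs", headers=HEADERS, json={
        "type": "drop_table",
        "tableName": "koalas",
    }).raise_for_status()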

Known limitations

We regularly update Polaris with new features and fixes. If you run into an issue, check Known limitations.

Learn more

See the following topics for more information:

  • Quickstart for a tutorial on how to upload data using batch ingestion.
  • Ingest from files for strategies and concepts for batch ingestion.
  • Load event data for information and examples on how to load event stream data.