Supported data and file formats

This topic is a reference for data and file format support in Imply Polaris.

Supported source data formats

The following table describes the data formats that Polaris supports for batch and streaming ingestion.

| Format | Batch ingestion | Streaming ingestion |
|---|---|---|
| Newline-delimited JSON | Yes | Yes |
| Delimiter-separated values | Yes | Yes |
| Apache Parquet | Yes | No |
| Apache ORC | Yes | No |
| Apache Avro | Yes (Avro OCF) | Yes (not supported for push streaming) |
| Protocol Buffers (Protobuf) | No | Yes (not supported for push streaming) |
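
If you want to validate a format choice in a script before creating an ingestion job, the support matrix above can be transcribed into a small lookup. The following is a minimal sketch: the names `FormatSupport`, `SUPPORT`, and `formats_for` are illustrative and are not part of any Polaris API.

```python
from typing import List, NamedTuple


class FormatSupport(NamedTuple):
    batch: bool
    streaming: bool
    # True where the table notes "not supported for push streaming".
    push_streaming_excluded: bool


# Transcribed from the table above; keys are the format names as listed there.
SUPPORT = {
    "Newline-delimited JSON": FormatSupport(True, True, False),
    "Delimiter-separated values": FormatSupport(True, True, False),
    "Apache Parquet": FormatSupport(True, False, False),
    "Apache ORC": FormatSupport(True, False, False),
    "Apache Avro": FormatSupport(True, True, True),
    "Protocol Buffers (Protobuf)": FormatSupport(False, True, True),
}


def formats_for(mode: str) -> List[str]:
    """Return the formats the table lists as supported for 'batch' or 'streaming'."""
    if mode not in ("batch", "streaming"):
        raise ValueError("mode must be 'batch' or 'streaming'")
    return [name for name, support in SUPPORT.items() if getattr(support, mode)]
```

For example, `formats_for("batch")` returns every format except Protobuf, and `formats_for("streaming")` omits Parquet and ORC.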

Polaris supports nested data for all supported data formats.

For details on how to specify your input data schema for Avro and Protobuf formats, see Specify input schema by API.

Supported file compression formats

Polaris supports the following compression formats for uploaded files:

  • bz2: Bzip2
  • gz: Gzip
  • xz: XZ
  • sz: Snappy
  • zst: Zstandard

ZIP files and TAR files are not supported.
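
A client-side check can catch unsupported archives before you upload. The sketch below assumes that the file extension reflects the compression format, which is an assumption made for illustration and not a documented Polaris rule. The function name `compression_of` is hypothetical.

```python
from pathlib import Path
from typing import Optional

# Extensions and names taken from the list above.
SUPPORTED_COMPRESSION = {
    ".bz2": "Bzip2",
    ".gz": "Gzip",
    ".xz": "XZ",
    ".sz": "Snappy",
    ".zst": "Zstandard",
}

# ZIP and TAR are stated as unsupported.
UNSUPPORTED_ARCHIVES = {".zip", ".tar"}


def compression_of(filename: str) -> Optional[str]:
    """Return the compression name for a file, or None if no known
    compression extension is present. Raise ValueError for ZIP/TAR names."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    if any(s in UNSUPPORTED_ARCHIVES for s in suffixes):
        raise ValueError(f"{filename}: ZIP and TAR files are not supported")
    if suffixes and suffixes[-1] in SUPPORTED_COMPRESSION:
        return SUPPORTED_COMPRESSION[suffixes[-1]]
    return None
```

With this sketch, `compression_of("events.json.gz")` returns `"Gzip"`, while `compression_of("backup.tar.gz")` raises `ValueError` because the name contains a TAR archive extension.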

For push streaming ingestion, you can send gzip-compressed data by setting the HTTP header Content-Encoding: gzip.
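
As a rough illustration of the gzip header, the following sketch compresses newline-delimited JSON events and builds a POST request without sending it. The URL is a placeholder, not a real Polaris endpoint, and the `Content-Type` value is an assumption. Authentication is omitted; see the developer guide for the push endpoint and credentials that apply to your connection.

```python
import gzip
import json
import urllib.request

# Hypothetical placeholder; substitute the push endpoint for your own connection.
PUSH_URL = "https://example.invalid/push-endpoint"


def build_gzipped_push_request(events, url=PUSH_URL):
    """Build (but do not send) a POST request carrying gzipped
    newline-delimited JSON."""
    # One JSON object per line.
    body = "\n".join(json.dumps(event) for event in events).encode("utf-8")
    compressed = gzip.compress(body)
    return urllib.request.Request(
        url,
        data=compressed,
        method="POST",
        headers={
            # Tells the server that the request body is gzip-compressed.
            "Content-Encoding": "gzip",
            # Assumed content type for JSON events.
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it once the URL and authentication headers are filled in.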

File size limit

Polaris supports individual files up to 10 GB. This limit refers to the size of the file transmitted by the browser or HTTP client.

You may upload a file that's larger than 10 GB on disk if your browser or client compresses the file to less than 10 GB in transit.
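
To estimate whether a large file would fit under the limit once compressed, you can measure its gzipped size locally. This is a minimal sketch under two assumptions that the text above does not confirm: that 10 GB means 10 × 10^9 bytes, and that your client's compression behaves like default-level gzip. The names `LIMIT_BYTES`, `gzip_stream_size`, and `fits_upload_limit` are illustrative.

```python
import zlib

# Assumption: "10 GB" is interpreted as decimal gigabytes.
LIMIT_BYTES = 10 * 10**9


def gzip_stream_size(path, chunk_size=1 << 20):
    """Return the gzip-compressed size of a file in bytes,
    streaming it in chunks so the file is never fully held in memory."""
    # wbits=31 selects the gzip container format.
    compressor = zlib.compressobj(6, zlib.DEFLATED, 31)
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(compressor.compress(chunk))
    total += len(compressor.flush())
    return total


def fits_upload_limit(transmitted_bytes, limit=LIMIT_BYTES):
    """True if the transmitted size is within the limit (inclusive)."""
    return transmitted_bytes <= limit
```

For instance, `fits_upload_limit(gzip_stream_size("events.json"))` gives an estimate for a hypothetical local file named `events.json`; the actual transmitted size depends on how your browser or client compresses the upload.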

Learn more

For information on supported timestamp formats in Polaris, see Timestamp expressions.

Copyright © 2023 Imply Data, Inc