Ingestion source reference

This topic summarizes the features of each ingestion source available in Imply Polaris.

For information about usage and configuration options, see the topics for individual ingestion sources. For a more conceptual overview of ingestion, see Ingestion sources overview. To learn more about supported secure networking options for Amazon Web Services (AWS) and Microsoft Azure, see AWS private networking options and Azure private networking options, respectively.

Batch ingestion

Polaris supports the following batch ingestion sources.

Ingestion source: Amazon S3
Ingestion method: Read from S3
Semantics: Atomic
Supported data types: JSON, CSV / TSV, Parquet, ORC, Avro OCF
Private networking options: N/A
Security for data in transit: TLS by default. User controlled
Authentication: AWS IAM roles with short-lived auth keys and AWS access keys
Clouds: All
Cloud regions: All
Good for: High volume batch ingestion

Ingestion source: Azure Blob Storage
Ingestion method: Read from Azure Storage
Semantics: Atomic
Supported data types: JSON, CSV / TSV, Parquet, ORC, Avro OCF
Private networking options: N/A
Security for data in transit: TLS by default. User controlled
Authentication: Azure Shared Access Signatures (SAS) and Azure access keys
Clouds: All
Cloud regions: All
Good for: High volume batch ingestion

Ingestion source: File upload
Ingestion method: Publish files to Polaris file staging. File ingestion is Polaris internal
Semantics: Atomic
Supported data types: JSON, CSV / TSV, Parquet, ORC, Avro OCF
Private networking options: AWS PrivateLink, Azure Private Link
Security for data in transit: TLS
Authentication: Polaris credentials
Clouds: All
Cloud regions: All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link
Good for: Getting started quickly, small to medium use cases, trials and POCs

Ingestion source: Table-to-table
Ingestion method: Polaris internal
Semantics: Atomic
Supported data types: Polaris table
Private networking options: N/A
Security for data in transit: Polaris internal
Authentication: Polaris credentials
Clouds: N/A
Cloud regions: N/A
Good for: Re-indexing data
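
When choosing a batch source, it can help to treat the attributes above as data and filter on them. The following is a minimal Python sketch of that idea, not part of any Polaris SDK. The `BATCH_SOURCES` dictionary and the `sources_supporting` helper are hypothetical names, and the values are transcribed from the batch ingestion table in this topic.

```python
# Batch source attributes transcribed from the batch ingestion table in this topic.
# The structure and names here are illustrative, not a Polaris API.
FILE_FORMATS = {"JSON", "CSV / TSV", "Parquet", "ORC", "Avro OCF"}

BATCH_SOURCES = {
    "Amazon S3": {
        "data_types": FILE_FORMATS,
        "private_networking": [],
    },
    "Azure Blob Storage": {
        "data_types": FILE_FORMATS,
        "private_networking": [],
    },
    "File upload": {
        "data_types": FILE_FORMATS,
        "private_networking": ["AWS PrivateLink", "Azure Private Link"],
    },
    "Table-to-table": {
        "data_types": {"Polaris table"},
        "private_networking": [],
    },
}


def sources_supporting(data_type, require_private_networking=False):
    """Return the names of batch sources that accept the given data type.

    If require_private_networking is True, keep only sources that list
    at least one private networking option.
    """
    names = []
    for name, attrs in BATCH_SOURCES.items():
        if data_type not in attrs["data_types"]:
            continue
        if require_private_networking and not attrs["private_networking"]:
            continue
        names.append(name)
    return sorted(names)


print(sources_supporting("Parquet"))
print(sources_supporting("Parquet", require_private_networking=True))
```

The second call narrows the result to sources that list a private networking option, which in the batch table is only file upload.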

Streaming ingestion

Polaris supports the following streaming ingestion sources.

Ingestion source: Confluent Cloud
Ingestion method: Consume
Semantics: Exactly once
Supported data types: JSON, CSV / TSV, Avro, Protobuf
Private networking options: AWS PrivateLink, Azure Private Link
Security for data in transit: TLS by default, user controlled
Authentication: SASL/PLAIN authentication with long-lived API keys
Clouds: All
Cloud regions: All
Good for: High data volume and high throughput

Ingestion source: Amazon Kinesis
Ingestion method: Consume
Semantics: Exactly once
Supported data types: JSON, CSV / TSV, Avro, Protobuf
Private networking options: N/A
Security for data in transit: TLS by default, user controlled
Authentication: IAM role assumption with short-lived auth keys
Clouds: AWS
AWS regions: All
Good for: High data volume and high throughput

Ingestion source: Apache Kafka, AWS MSK, or Apache Kafka on Azure Event Hubs
Ingestion method: Consume
Semantics: Exactly once
Supported data types: JSON, CSV / TSV, Avro, Protobuf
Private networking options: AWS PrivateLink, Azure Private Link
Security for data in transit: TLS for SASL/SCRAM; TLS by default for MSK + IAM, user controlled
Authentication: SASL/PLAIN, SASL/SCRAM. MSK also supports IAM role assumption with short-lived auth keys
Clouds: All
Cloud regions: All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link
Good for: High data volume and high throughput

Ingestion source: Kafka Connector to Apache Kafka, AWS MSK, or Apache Kafka on Azure Event Hubs
Ingestion method: Publish
Semantics: At least once
Supported data types: JSON, CSV / TSV, Avro, Protobuf
Private networking options: AWS PrivateLink; Azure Private Link
Security for data in transit: TLS
Authentication: Polaris API key or OAuth token
Clouds: All
Cloud regions: All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link
Good for: Getting started quickly, small-medium use cases, trials and POCs

Ingestion source: Events API
Ingestion method: Publish
Semantics: At most once (no retries) or at least once (retries)
Supported data types: JSON, CSV / TSV
Private networking options: AWS PrivateLink, Azure Private Link
Security for data in transit: TLS
Authentication: Polaris API key or OAuth token
Clouds: All
Cloud regions: All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link
Good for: Use cases and configurations where you don’t want to own Kafka, for example IoT
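
The two delivery semantics listed for the Events API differ in whether the client resends an event when it cannot confirm delivery. The following is a minimal Python sketch of that general idea, not Polaris client code. `FlakyEndpoint`, `push_at_most_once`, and `push_at_least_once` are hypothetical names, and the endpoint is simulated in memory rather than calling the real Events API.

```python
class FlakyEndpoint:
    """In-memory stand-in for an event endpoint (hypothetical, not the Polaris API).

    Each entry in `plan` scripts the outcome of one call to send():
      "ok"       - the event is stored and the acknowledgment is returned
      "lost"     - the request is dropped before the event is stored
      "ack_lost" - the event is stored, but the acknowledgment never arrives
    """

    def __init__(self, plan):
        self.plan = list(plan)
        self.stored = []

    def send(self, event):
        outcome = self.plan.pop(0) if self.plan else "ok"
        if outcome == "lost":
            raise TimeoutError("request lost")
        self.stored.append(event)
        if outcome == "ack_lost":
            raise TimeoutError("acknowledgment lost")


def push_at_most_once(endpoint, event):
    """Send once and never retry: the event may be dropped, never duplicated."""
    try:
        endpoint.send(event)
        return True
    except TimeoutError:
        return False


def push_at_least_once(endpoint, event, max_attempts=3):
    """Retry until acknowledged: the event may be stored more than once."""
    for _ in range(max_attempts):
        try:
            endpoint.send(event)
            return True
        except TimeoutError:
            continue
    return False


# Without retries, a lost request means the event never arrives.
lossy = FlakyEndpoint(["lost"])
push_at_most_once(lossy, {"id": 1})
print(len(lossy.stored))  # 0

# With retries, a lost acknowledgment causes the same event to be stored twice.
dup = FlakyEndpoint(["ack_lost", "ok"])
push_at_least_once(dup, {"id": 1})
print(len(dup.stored))  # 2
```

In this sketch the trade-off is visible directly: skipping retries risks data loss, while retrying risks duplicates that a downstream step would need to tolerate.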

Schema metadata

Polaris supports the following schema metadata sources.

Metadata source: Confluent Schema Registry
Ingestion method: Consume metadata
Supported data types: Confluent Schema Registry
Private networking options: N/A
Security for data in transit: TLS by default, user controlled
Authentication: Confluent Schema Registry API key
Clouds: All
Cloud regions: All
Good for: Avro or Protobuf consume ingestion

Metadata source: Schema specification
Ingestion method: Publish metadata
Supported data types: Avro: JSON schema; Protobuf: descriptor file
Private networking options: AWS PrivateLink, Azure Private Link
Security for data in transit: TLS
Authentication: Polaris credentials
Clouds: All
Cloud regions: All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link
Good for: Avro or Protobuf consume ingestion
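
For Avro, the schema specification is a JSON document. The sketch below shows what such a document generally looks like and checks its shape using only the Python standard library. The record name `PageView`, its fields, and the `field_names` helper are made up for illustration, and how the schema is submitted to Polaris is not shown here.

```python
import json

# Hypothetical Avro record schema; the name and fields are illustrative only.
AVRO_SCHEMA_JSON = """
{
  "type": "record",
  "name": "PageView",
  "fields": [
    {"name": "timestamp", "type": "long"},
    {"name": "user_id", "type": "string"},
    {"name": "url", "type": ["null", "string"], "default": null}
  ]
}
"""


def field_names(schema_text):
    """Parse an Avro record schema and return its field names in order."""
    schema = json.loads(schema_text)
    if schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    return [field["name"] for field in schema["fields"]]


print(field_names(AVRO_SCHEMA_JSON))
```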