Ingestion source reference
This topic summarizes the features of the ingestion sources that Imply Polaris supports.
For information about usage and configuration options, see the topics for individual ingestion sources. For a more conceptual overview of ingestion, see Ingestion sources overview.
To learn more about supported secure networking options for Amazon Web Services (AWS) and Microsoft Azure, see AWS private networking options and Azure private networking options, respectively.
Imply recommends against using unauthenticated connections to Polaris. Unauthenticated connections are only available when you enable private networking to secure the connection. To enable them, contact Polaris customer support.
Batch ingestion
Polaris supports the following batch ingestion sources.
| Ingestion source | Property | Details |
|---|---|---|
| Amazon S3 | Ingestion method | Read from S3 |
| | Semantics | Atomic |
| | Supported data types | JSON, CSV / TSV, Parquet, ORC, Avro OCF |
| | Private networking options | N/A |
| | Security for data in transit | TLS by default. User controlled |
| | Authentication | AWS IAM roles with short-lived auth keys and AWS access keys |
| | Clouds | All |
| | Cloud regions | All |
| | Good for | High volume batch ingestion |
| Azure Blob Storage | Ingestion method | Read from Azure Storage |
| | Semantics | Atomic |
| | Supported data types | JSON, CSV / TSV, Parquet, ORC, Avro OCF |
| | Private networking options | N/A |
| | Security for data in transit | TLS by default. User controlled |
| | Authentication | Azure Shared Access Signatures (SAS) and Azure access keys |
| | Clouds | All |
| | Cloud regions | All |
| | Good for | High volume batch ingestion |
| File upload | Ingestion method | Publish files to Polaris file staging. File ingestion is Polaris internal |
| | Semantics | Atomic |
| | Supported data types | JSON, CSV / TSV, Parquet, ORC, Avro OCF |
| | Private networking options | AWS PrivateLink, Azure Private Link |
| | Security for data in transit | TLS |
| | Authentication | Polaris credentials |
| | Clouds | All |
| | Cloud regions | All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link |
| | Good for | Getting started quickly, small to medium use cases, trials and POCs |
| Table-to-table | Ingestion method | Polaris internal |
| | Semantics | Atomic |
| | Supported data types | Polaris table |
| | Private networking options | N/A |
| | Security for data in transit | Polaris internal |
| | Authentication | Polaris credentials |
| | Clouds | N/A |
| | Cloud regions | N/A |
| | Good for | Re-indexing data |
| Inline data | Ingestion method | Polaris internal |
| | Semantics | Atomic |
| | Supported data types | JSON, CSV / TSV |
| | Private networking options | N/A |
| | Security for data in transit | Polaris internal |
| | Authentication | Polaris credentials |
| | Clouds | N/A |
| | Cloud regions | N/A |
| | Good for | Getting started quickly, trials and POCs |
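
As a quick way to read the batch table, the following sketch encodes two of its rows (supported data types and private networking options) as a Python dictionary and filters on them. The dictionary is transcribed by hand from the table above and is not fetched from Polaris. The names `BATCH_SOURCES` and `batch_sources_for` are hypothetical and exist only for this illustration.

```python
# Hand-transcribed from the batch ingestion table; illustrative only.
BATCH_SOURCES = {
    "Amazon S3": {
        "formats": {"JSON", "CSV", "TSV", "Parquet", "ORC", "Avro OCF"},
        "private_networking": set(),
    },
    "Azure Blob Storage": {
        "formats": {"JSON", "CSV", "TSV", "Parquet", "ORC", "Avro OCF"},
        "private_networking": set(),
    },
    "File upload": {
        "formats": {"JSON", "CSV", "TSV", "Parquet", "ORC", "Avro OCF"},
        "private_networking": {"AWS PrivateLink", "Azure Private Link"},
    },
    "Table-to-table": {
        "formats": {"Polaris table"},
        "private_networking": set(),
    },
    "Inline data": {
        "formats": {"JSON", "CSV", "TSV"},
        "private_networking": set(),
    },
}


def batch_sources_for(fmt, require_private_networking=False):
    """Return the batch sources from the table that accept the given format.

    When require_private_networking is True, keep only sources that list
    at least one private networking option.
    """
    matches = []
    for name, props in BATCH_SOURCES.items():
        if fmt not in props["formats"]:
            continue
        if require_private_networking and not props["private_networking"]:
            continue
        matches.append(name)
    return matches


print(batch_sources_for("Parquet"))
print(batch_sources_for("JSON", require_private_networking=True))
```

The same pattern extends to the other rows, such as authentication or clouds, if you need to filter on them.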
Streaming ingestion
Polaris supports the following streaming ingestion sources.
| Ingestion source | Property | Details |
|---|---|---|
| Confluent Cloud | Ingestion method | Consume |
| | Semantics | Exactly once |
| | Supported data types | JSON, CSV / TSV, Avro, Protobuf |
| | Private networking options | AWS PrivateLink, Azure Private Link |
| | Security for data in transit | TLS by default, user controlled |
| | Authentication | SASL/PLAIN authentication with long-lived API keys |
| | Clouds | All |
| | Cloud regions | All |
| | Good for | High data volume and high throughput |
| Amazon Kinesis | Ingestion method | Consume |
| | Semantics | Exactly once |
| | Supported data types | JSON, CSV / TSV, Avro, Protobuf |
| | Private networking options | N/A |
| | Security for data in transit | TLS by default, user controlled |
| | Authentication | IAM role assumption with short-lived auth keys |
| | Clouds | AWS |
| | AWS regions | All |
| | Good for | High data volume and high throughput |
| Apache Kafka, AWS MSK, or Apache Kafka on Azure Event Hubs | Ingestion method | Consume |
| | Semantics | Exactly once |
| | Supported data types | JSON, CSV / TSV, Avro, Protobuf |
| | Private networking options | AWS PrivateLink, Azure Private Link |
| | Security for data in transit | TLS for SASL/SCRAM; TLS by default for MSK + IAM, user controlled |
| | Authentication | SASL/PLAIN, SASL/SCRAM. MSK also supports IAM role assumption with short-lived auth keys |
| | Clouds | All |
| | Cloud regions | All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link |
| | Good for | High data volume and high throughput |
| Kafka Connector to Apache Kafka, AWS MSK, or Apache Kafka on Azure Event Hubs | Ingestion method | Publish |
| | Semantics | At least once |
| | Supported data types | JSON, CSV / TSV, Avro, Protobuf |
| | Private networking options | AWS PrivateLink, Azure Private Link |
| | Security for data in transit | TLS |
| | Authentication | Polaris API key or OAuth token |
| | Clouds | All |
| | Cloud regions | All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link |
| | Good for | Getting started quickly, small-medium use cases, trials and POCs |
| Events API | Ingestion method | Publish |
| | Semantics | At most once (no retries) or at least once (retries) |
| | Supported data types | JSON, CSV / TSV |
| | Private networking options | AWS PrivateLink, Azure Private Link |
| | Security for data in transit | TLS |
| | Authentication | Polaris API key or OAuth token |
| | Clouds | All |
| | Cloud regions | All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link |
| | Good for | Use cases and configurations where you don’t want to own Kafka, for example IoT |
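
The Events API row lists two delivery semantics that depend on whether the client retries. The sketch below illustrates that difference in general terms. It does not call Polaris or reproduce the Events API: `publish`, `LostAckTransport`, and the `max_attempts` parameter are hypothetical names, and the transport is an in-memory stand-in that loses the first acknowledgement.

```python
from typing import Callable


def publish(send: Callable[[dict], None], event: dict, max_attempts: int = 1) -> bool:
    """Try to deliver one event and report whether it was acknowledged.

    max_attempts == 1: at most once. A failed send is never repeated,
    so the event cannot be duplicated, but it may be lost.
    max_attempts > 1: at least once. The send is repeated until it is
    acknowledged, so an event whose acknowledgement was lost can be
    delivered more than once.
    """
    for _ in range(max_attempts):
        try:
            send(event)
            return True
        except ConnectionError:
            continue
    return False


class LostAckTransport:
    """Hypothetical transport: stores every event, but the first
    acknowledgement is lost on the way back to the client."""

    def __init__(self):
        self.received = []
        self.calls = 0

    def __call__(self, event: dict) -> None:
        self.calls += 1
        self.received.append(event)
        if self.calls == 1:
            raise ConnectionError("acknowledgement lost")


# No retries: the client sees a failure and gives up; no duplicate is possible.
no_retry = LostAckTransport()
print(publish(no_retry, {"id": 1}), len(no_retry.received))

# Retries: the client eventually gets an acknowledgement, but the
# receiver now holds the same event twice.
with_retry = LostAckTransport()
print(publish(with_retry, {"id": 1}, max_attempts=3), len(with_retry.received))
```

With retries enabled, downstream consumers typically need to tolerate or deduplicate repeated events.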
Schema metadata
Polaris supports the following schema metadata sources.
| Metadata source | Property | Details |
|---|---|---|
| Confluent Schema Registry | Ingestion method | Consume metadata |
| | Supported data types | Confluent Schema Registry |
| | Private networking options | N/A |
| | Security for data in transit | TLS by default, user controlled |
| | Authentication | Confluent Schema Registry API key |
| | Clouds | All |
| | Cloud regions | All |
| | Good for | Avro or Protobuf consume ingestion |
| Schema specification | Ingestion method | Publish metadata |
| | Supported data types | Avro: JSON schema; Protobuf: descriptor file |
| | Private networking options | AWS PrivateLink, Azure Private Link |
| | Security for data in transit | TLS |
| | Authentication | Polaris credentials |
| | Clouds | All |
| | Cloud regions | All from public internet; Polaris-supported regions for AWS PrivateLink or Azure Private Link |
| | Good for | Avro or Protobuf consume ingestion |
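
For the schema specification source, the table notes that an Avro schema is supplied as JSON. As a minimal sketch of what such a schema looks like, the following Python snippet builds an Avro record schema and serializes it to JSON. The record name `PageView` and its fields are hypothetical examples, and the snippet does not show how Polaris accepts the schema; see the topic for that ingestion source for the actual steps.

```python
import json

# A hypothetical Avro record schema. The record and field names are
# examples only and do not come from Polaris.
avro_schema = {
    "type": "record",
    "name": "PageView",
    "fields": [
        {"name": "timestamp", "type": "long"},
        {"name": "user_id", "type": "string"},
        # Optional field: a union with "null" listed first, defaulting to null.
        {"name": "referrer", "type": ["null", "string"], "default": None},
    ],
}

# Avro schemas are written as JSON documents.
schema_json = json.dumps(avro_schema, indent=2)
print(schema_json)
```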