Ingest from Kafka and Amazon MSK

You can create a connection to Kafka, including Amazon Managed Streaming for Apache Kafka (MSK), to ingest data into Imply Polaris. MSK is a fully managed, cloud-native service for Apache Kafka.

Ingestion from Kafka or MSK into Polaris uses exactly-once semantics.

Polaris authenticates with your Kafka cluster using Simple Authentication and Security Layer (SASL) authentication. You can authenticate with a Kafka username and password or, for MSK clusters that use IAM access control, with IAM role assumption. Regardless of authentication method, data in transit between Polaris and Kafka is encrypted with TLS.
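For orientation only, the same SASL-over-TLS setup expressed in a standard Kafka client properties file looks like the sketch below. Polaris collects these values through its UI and API rather than a properties file, and the username and password shown are placeholders:

```properties
# Illustration: SASL authentication over TLS, as a Kafka client would configure it
security.protocol=SASL_SSL
# Or SCRAM-SHA-256 / PLAIN, depending on your cluster configuration
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="polaris-user" \
  password="<password>";
```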

Kafka or MSK connection information

A Polaris connection to Kafka or MSK requires the following:

  • Topic name. The name of the Kafka topic containing your event data.

  • Bootstrap servers. A list of one or more host and port pairs representing the addresses of brokers in the Kafka cluster. This list should be in the form host1:port1,host2:port2,... For details on where to find the bootstrap servers in Amazon MSK, see Getting the bootstrap brokers using the AWS Management Console.

  • Authentication credentials. One of the following:

    • Apache Kafka username and password for Polaris to use to make the connection. Polaris supports SASL PLAIN and SASL SCRAM. For SASL SCRAM connections, you must also provide the SCRAM mechanism: either SCRAM-SHA-256 or SCRAM-SHA-512.

    • For MSK only, IAM role assumption. This requires that your MSK cluster is set up to use IAM access control. For Polaris to access your MSK data, you need the following:

      • ARN of IAM role: The Amazon Resource Name (ARN) of the AWS role for Polaris to assume. For example, arn:aws:iam::123456789012:role/msk-access-role.

      • Trust policy attached to the IAM role: Authorizing access to your MSK data from Polaris requires a trust policy added to your IAM role to allow Polaris to assume the role. For more information, see Trust policy.

      • Permissions policy attached to the IAM role: To grant Polaris access to view and ingest data from your MSK clusters, attach a permissions (authorization) policy to the IAM role. The policy must list your MSK resources and include the following actions. For details on each action, see the AWS documentation on IAM access control for Amazon MSK.

        • kafka-cluster:Connect
        • kafka-cluster:DescribeTopic
        • kafka-cluster:DescribeGroup
        • kafka-cluster:AlterGroup
        • kafka-cluster:ReadData
  • Advanced options. You can optionally set the following:

    • Client rack: Kafka rack ID, if you want Kafka to connect to a broker matching a specific rack. For example, use1-az4.

    • Truststore certificates: One or more server certificates for Polaris to trust, based on your networking setup. You can copy and paste certificates, for example:

      -----BEGIN CERTIFICATE-----
      xxxx
      -----END CERTIFICATE-----
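The bootstrap servers value above must parse cleanly into host and port pairs. As a standalone illustration (not part of Polaris), a small validator for the host1:port1,host2:port2,... format might look like this; the broker hostnames in the example are placeholders:

```python
def parse_bootstrap_servers(servers: str) -> list[tuple[str, int]]:
    """Split a Kafka bootstrap servers string into (host, port) pairs.

    Raises ValueError if any entry is not in host:port form.
    """
    pairs = []
    for entry in servers.split(","):
        entry = entry.strip()
        # rpartition tolerates hostnames while isolating the trailing port
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs

# Example: a typical two-broker MSK bootstrap string (placeholder hostnames)
print(parse_bootstrap_servers(
    "b-1.mycluster.abc123.kafka.us-east-1.amazonaws.com:9096,"
    "b-2.mycluster.abc123.kafka.us-east-1.amazonaws.com:9096"
))
```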
      

Example MSK IAM policy

For MSK clusters that use IAM access control, the following example shows an IAM policy that can be attached to your IAM role. The policy grants the role the listed permissions for Polaris to view and obtain data from topics in your MSK cluster. In Resource, list the ARNs for your MSK cluster, topic, or group.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:Connect"
            ],
            "Resource": [
                "arn:aws:kafka:us-east-1:123456789012:cluster/MyTestCluster/abcd1234-0123-abcd-5678-1234abcd-1"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:DescribeTopic",
                "kafka-cluster:ReadData"
            ],
            "Resource": [
                "arn:aws:kafka:us-east-1:123456789012:topic/MyTestCluster/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:AlterGroup",
                "kafka-cluster:DescribeGroup"
            ],
            "Resource": [
                "arn:aws:kafka:us-east-1:123456789012:group/MyTestCluster/*"
            ]
        }
    ]
}
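As context for the trust policy requirement described earlier, a generic role trust policy that permits role assumption via sts:AssumeRole looks like the sketch below. The principal ARN here is a placeholder, not the actual Polaris principal; obtain the real value from the Polaris Trust policy documentation:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:root"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```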

Example MSK connection

The following screenshot shows an example connection created in the UI. For more information on creating connections in the UI, see Create a connection.


When your table, connection, and ingestion job are set up for Kafka or MSK, Polaris automatically ingests events from the topic defined in the connection as they arrive.

Learn more

To learn how to ingest data from Kafka or MSK using the Polaris API, see Ingest data from Kafka and MSK by API.

Copyright © 2023 Imply Data, Inc