Ingest from Amazon S3

You can create a connection to Amazon S3, an object storage service provided by Amazon Web Services, to ingest data into Imply Polaris.

Create a unique connection for each S3 bucket from which you want to ingest data.

S3 connection information

A Polaris connection to Amazon S3 takes the following information:

  • Information about the S3 bucket to ingest from.

    • Bucket name: The name of the S3 bucket that contains the data to ingest.
    • Prefix (optional): Limits the connection to the objects in the S3 bucket whose keys begin with this prefix. For example, logs/20221014T00:00:00.
    • AWS S3 endpoint: The endpoint of the S3 service, such as s3.us-east-1.amazonaws.com.
  • Authorization to access the S3 bucket. For more information, see Secure connections to AWS and the AWS documentation on Managing access to resources.

    • ARN of IAM role: The Amazon Resource Name (ARN) of the IAM role that Polaris assumes for access. For example, arn:aws:iam::123456789012:role/s3-access-role.

    • Trust policy attached to the IAM role: To authorize Polaris to access your S3 data, add a trust policy to your IAM role that allows Polaris to assume the role. For more information, see Trust policy.

    • Permissions policy attached to the IAM role: To grant Polaris access to view and ingest data from your S3 buckets, attach a permissions policy to the IAM role that lists your S3 resources and includes the following actions:

      • s3:GetObject to retrieve objects from the S3 bucket.
      • s3:ListBucket (optional) to list the objects in the S3 bucket. Ingestion from S3 works without this permission, but Imply strongly recommends including it because it makes viewing and selecting objects to ingest more straightforward. Note that s3:ListBucket is the name of the IAM permission, while ListObjectsV2 is the name of the S3 API call that the permission authorizes.
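The connection fields above can be checked before you create a connection. The following Python sketch is illustrative only: the S3ConnectionInfo class and its field names are hypothetical and are not part of the Polaris API. It assumes the standard IAM role ARN layout (an empty region field, so two colons follow "iam") and the usual S3 behavior in which a prefix matches every object key that starts with it.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# IAM role ARN layout: arn:<partition>:iam::<12-digit account ID>:role/<path and role name>
# The region field is empty for IAM, so exactly two colons follow "iam".
_ROLE_ARN = re.compile(r"^arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:role/[\w+=,.@\-/]+$")


@dataclass(frozen=True)
class S3ConnectionInfo:
    """Hypothetical container for the connection fields (not a Polaris type)."""
    bucket: str
    endpoint: str
    role_arn: str
    prefix: Optional[str] = None

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; an empty list means none found."""
        issues = []
        if not self.bucket:
            issues.append("bucket name is empty")
        if not self.endpoint:
            issues.append("S3 endpoint is empty")
        if not _ROLE_ARN.match(self.role_arn):
            issues.append("role ARN does not look like an IAM role ARN")
        return issues

    def covers(self, key: str) -> bool:
        """True if the object key falls under the optional prefix."""
        return self.prefix is None or key.startswith(self.prefix)


if __name__ == "__main__":
    conn = S3ConnectionInfo(
        bucket="example-bucket",  # placeholder bucket name
        endpoint="s3.us-east-1.amazonaws.com",
        role_arn="arn:aws:iam::123456789012:role/s3-access-role",
        prefix="logs/20221014T00:00:00",
    )
    print(conn.problems())
    print(conn.covers("logs/20221014T00:00:00/part-0.json"))
```

A check like this only catches malformed input. It does not confirm that the role exists or that its trust and permissions policies are in place.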

Example IAM policy

The following example shows an IAM permissions policy that you can attach to your IAM role. The policy grants the role the permissions that Polaris needs to list and retrieve data from your S3 bucket. Replace S3 ARN with the ARN of your S3 bucket, for example, arn:aws:s3:::bucket_name.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "S3 ARN"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": [
        "S3 ARN/*"
      ]
    }
  ]
}
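If you manage several buckets, you might generate this policy rather than edit it by hand. The following Python sketch builds the same two statements from a bucket name. The function build_s3_read_policy and its parameters are hypothetical names chosen for illustration. The sketch assumes the bucket ARN form arn:aws:s3:::bucket_name shown above.

```python
import json


def build_s3_read_policy(bucket: str, include_list_bucket: bool = True) -> dict:
    """Build a permissions policy like the example above for one bucket.

    s3:ListBucket applies to the bucket itself, while s3:GetObject applies
    to the objects inside it, hence the "/*" suffix on the second resource.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    statements = []
    if include_list_bucket:
        # Optional but recommended: lets Polaris list objects in the bucket.
        statements.append(
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": [bucket_arn]}
        )
    # Required: lets Polaris retrieve objects from the bucket.
    statements.append(
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": [f"{bucket_arn}/*"]}
    )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # "bucket_name" is a placeholder; substitute your own bucket.
    print(json.dumps(build_s3_read_policy("bucket_name"), indent=2))
```

Passing include_list_bucket=False yields a policy with only the s3:GetObject statement, which matches the minimum described above.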

The following screenshot shows an example connection created in the UI:

S3 connection UI

Learn more

To learn how to ingest data from Amazon S3 using the Polaris API, see Ingest data from Amazon S3 by API.

Copyright © 2023 Imply Data, Inc