Ingestion status reference

You can monitor an ingestion job in Imply Polaris by checking both its overall execution status and its health. Polaris reports job health based on how rows are processed during ingestion.

This topic lists the health statuses that can be associated with an ingestion job.

Job health reference

The following table lists the row ingestion statuses that can be associated with an ingestion job, along with the corresponding Polaris API field and Apache Druid field for each status.

Polaris UI label          Polaris API field              Druid field
Successful                numRowsProcessed               processed
Processed with warnings   numRowsProcessedWithWarnings   processedWithErrors
Filtered                  numRowsSkippedByFilter         thrownAway
Unparseable               numRowsSkippedByError          unparseable

The row ingestion statuses are described as follows:

  • Successful: Number of rows successfully ingested without parsing errors.
  • Processed with warnings: Number of rows ingested with one or more parsing errors. This typically occurs when Polaris can parse an input row but detects an invalid type, such as a string value ingested into a numeric column.
  • Filtered: Number of rows skipped during a streaming ingestion job. Polaris skips rows when you specify an ingestion target interval or a SQL expression that filters the source data. Skipped rows include late-arriving event data. This number does not include header rows skipped in CSV data or data that falls outside the system-wide time limits.

    The filtered row count only applies to streaming ingestion. Batch ingestion jobs do not track filtered rows.

  • Unparseable: Number of rows discarded because they could not be parsed.

See the Druid documentation on Task reference for details on the row processing fields exposed by Apache Druid.
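
To show how these counts might be consumed, here is a minimal Python sketch that derives a coarse health summary from a job's totals object. The thresholds and labels are illustrative assumptions for the example, not the exact rule Polaris applies when displaying job health.

# Illustrative only: the classification below is an assumption for this
# example, not the exact heuristic Polaris uses to display job health.
def summarize_health(totals: dict) -> str:
    unparseable = totals.get("numRowsSkippedByError", 0)
    warnings = totals.get("numRowsProcessedWithWarnings", 0)
    if unparseable > 0:
        return f"unhealthy: {unparseable} unparseable rows"
    if warnings > 0:
        return f"degraded: {warnings} rows processed with warnings"
    return "healthy: all rows processed cleanly"

print(summarize_health({"numRowsProcessed": 99999, "numRowsSkippedByError": 2}))
# -> unhealthy: 2 unparseable rows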

View job health

Use either the UI or API to check the details of your ingestion job.

In the UI, Polaris shows the job status on the Job details page:

[Screenshot: Job details page with the row ingestion statuses highlighted]

In the API, Polaris returns the row ingestion statuses in the totals field of the response to a GET request for the job's metrics. For example, GET /v2/jobs/db6a3110-d6c3-4a63-86b2-41d51a65ce11/metrics returns the following:

{
    "totals": {
        "numRowsProcessed": 99999,
        "numRowsProcessedWithWarning": 0,
        "numRowsSkippedByFilter": 0,
        "numRowsSkippedByError": 2
    }
}
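
As a minimal sketch, the following Python snippet retrieves these totals and prints them. The base URL and job ID are placeholders for your own organization's API endpoint and job, and the snippet assumes an API key stored in the POLARIS_API_KEY environment variable, sent with the Basic authorization scheme as described in Authenticate with API keys.

import os
import requests  # third-party HTTP client

# Placeholder values: substitute your organization's Polaris API endpoint
# and the ID of the job you want to inspect.
BASE_URL = "https://example-org.us-east-1.aws.api.imply.io"
JOB_ID = "db6a3110-d6c3-4a63-86b2-41d51a65ce11"

# Assumes an API key stored in the environment, sent with Basic authorization.
headers = {"Authorization": f"Basic {os.environ['POLARIS_API_KEY']}"}

response = requests.get(f"{BASE_URL}/v2/jobs/{JOB_ID}/metrics", headers=headers)
response.raise_for_status()

totals = response.json()["totals"]
for field, count in totals.items():
    print(f"{field}: {count}")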

For how to view job metrics and limit the number of unparseable rows allowed per job, see View metrics for a job.

Learn more

See the following topics for more information:

  • Create an ingestion job for creating and viewing ingestion jobs.
  • Metrics reference for more metrics you can track for streaming ingestion.