View and manage ingestion jobs by API

After you create an ingestion job in Imply Polaris, you can monitor its progress, view job-related metrics, and cancel its execution. This topic shows you how to view details and metrics for an ingestion job and how to cancel it.

Prerequisites

This topic assumes that you have an API key with the ManageIngestionJobs permission. In the examples below, the key value is stored in the environment variable POLARIS_API_KEY. See API key authentication to learn how to obtain an API key and assign permissions, and Permissions reference for details on permissions.
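If you want to follow along in a terminal, you can export the key as an environment variable first. A minimal setup sketch; the key value shown is a placeholder, not a real Polaris key:

```shell
# Store your Polaris API key in an environment variable so the
# examples below can reference it as ${POLARIS_API_KEY}.
# Replace the placeholder with your actual key value.
export POLARIS_API_KEY="placeholder_key_value"

# Fail fast if the variable is unset or empty.
echo "${POLARIS_API_KEY:?POLARIS_API_KEY is not set}"
```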

Monitor ingestion job progress

To monitor the progress of your ingestion job, issue a GET request to the Jobs v2 API with the job ID in the path. For example, /v2/jobs/efb35e3e-406e-4127-ad2e-280fede4f431.

Sample request

The following example shows how to monitor the progress of an example ingestion job with the job ID efb35e3e-406e-4127-ad2e-280fede4f431:

cURL
Python
curl --location --request GET 'https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v2/jobs/efb35e3e-406e-4127-ad2e-280fede4f431' \
--user ${POLARIS_API_KEY}:
import os
import requests

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v2/jobs/efb35e3e-406e-4127-ad2e-280fede4f431"

apikey = os.getenv("POLARIS_API_KEY")

headers = {
    'Authorization': f'Basic {apikey}'
}

response = requests.get(url, headers=headers)

print(response.text)

Sample response

The following example shows a successful response for a job progress request:

Click to view the response

{
    "type": "batch",
    "id": "efb35e3e-406e-4127-ad2e-280fede4f431",
    "target": {
        "type": "table",
        "tableName": "Koalas Subset"
    },
    "desiredExecutionStatus": "running",
    "createdBy": {
        "username": "service-account-docs-demo",
        "userId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    },
    "lastModifiedBy": {
        "username": "service-account-docs-demo",
        "userId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    },
    "executionStatus": "pending",
    "health": {
        "status": "ok"
    },
    "createdTimestamp": "2022-08-09T22:34:46.716017658Z",
    "lastUpdatedTimestamp": "2022-08-09T22:34:46.716017658Z",
    "source": {
        "type": "uploaded",
        "fileList": [
            "kttm-2019-08-19.json.gz",
            "kttm-2019-08-20.json.gz"
        ],
        "inputSchema": [
            {
                "dataType": "string",
                "name": "timestamp"
            },
            {
                "dataType": "string",
                "name": "city"
            },
            {
                "dataType": "string",
                "name": "session"
            },
            {
                "dataType": "long",
                "name": "session_length"
            }
        ],
        "formatSettings": {
            "format": "nd-json"
        }
    },
    "ingestionMode": "append",
    "mappings": [
        {
            "columnName": "__time",
            "expression": "TIME_PARSE(\"timestamp\")"
        },
        {
            "columnName": "city",
            "expression": "\"city\""
        },
        {
            "columnName": "session",
            "expression": "\"session\""
        },
        {
            "columnName": "max_session_length",
            "expression": "MAX(\"session_length\")"
        },
        {
            "columnName": "__count",
            "expression": "COUNT(*)"
        }
    ]
}

The executionStatus field shows the job's current execution status, and the health field describes the health of the job.
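You can poll this endpoint until the job reaches a terminal state. The sketch below is illustrative, not part of the Polaris API: the helper takes any callable that returns the job details as a dict, and the set of terminal status names is an assumption you should verify against your own job responses (the sample responses in this topic only show "pending" and "running").

```python
import time

# Assumed terminal execution statuses; verify against your job responses.
TERMINAL_STATUSES = {"completed", "failed", "canceled"}


def poll_until_terminal(fetch_job, interval_seconds=10, max_polls=60):
    """Poll fetch_job() until executionStatus reaches a terminal state.

    fetch_job: a callable returning the job details as a dict, for
    example a function that GETs /v2/jobs/JOB_ID and returns
    response.json().
    """
    for _ in range(max_polls):
        job = fetch_job()
        if job.get("executionStatus") in TERMINAL_STATUSES:
            return job
        time.sleep(interval_seconds)
    raise TimeoutError("Job did not reach a terminal state in time")
```

With requests, fetch_job could be `lambda: requests.get(url, headers=headers).json()`, using the same url and headers as the sample request above.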

View metrics for a job

Send a GET request to /v2/jobs/JOB_ID/metrics to view the metrics for a job. Replace JOB_ID with the ID of the job.

Sample request

The following example shows how to view metrics for an ingestion job with the job ID db6a3110-d6c3-4a63-86b2-41d51a65ce11:

cURL
Python
curl --location --request GET 'https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v2/jobs/db6a3110-d6c3-4a63-86b2-41d51a65ce11/metrics' \
--user ${POLARIS_API_KEY}:
import os
import requests

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v2/jobs/db6a3110-d6c3-4a63-86b2-41d51a65ce11/metrics"

apikey = os.getenv("POLARIS_API_KEY")

headers = {
    'Authorization': f'Basic {apikey}'
}

response = requests.get(url, headers=headers)

print(response.text)

Sample response

The following example shows a successful response containing metrics for an ingestion job:

{
    "totals": {
        "numRowsProcessed": 99999,
        "numRowsProcessedWithWarning": 0,
        "numRowsSkippedByFilter": 0,
        "numRowsSkippedByError": 2
    }
}

The preceding example shows an ingestion job with two rows skipped due to parsing errors. By default, Polaris continues to ingest data when it encounters rows it cannot parse. You can set a limit on parsing exceptions in maxParseExceptions when you create an ingestion job; if the number of parsing exceptions exceeds this limit, the job fails.

You can view more details on the raised exceptions when requesting job logs.
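Using the totals from the metrics response, you can compute a simple parse-error rate client-side. This is an illustrative helper, not a Polaris API feature; the field names match the sample response above, and it assumes numRowsProcessed excludes rows skipped by error.

```python
def parse_error_rate(totals):
    """Fraction of attempted rows that were skipped due to parsing errors.

    totals: the "totals" object from the metrics response, for example
    {"numRowsProcessed": 99999, "numRowsSkippedByError": 2, ...}
    """
    processed = totals.get("numRowsProcessed", 0)
    skipped = totals.get("numRowsSkippedByError", 0)
    attempted = processed + skipped
    if attempted == 0:
        return 0.0
    return skipped / attempted
```

For the sample response above, the rate is 2 / 100001, well under any typical threshold.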

Cancel an ingestion job

To cancel an ingestion job, issue a PUT request to the Jobs v2 API with the job ID in the path. In the request body, set the desiredExecutionStatus to canceled.

Sample request

The following example shows how to cancel an ingestion job with the job ID efb35e3e-406e-4127-ad2e-280fede4f431:

cURL
Python
curl --location --request PUT 'https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v2/jobs/efb35e3e-406e-4127-ad2e-280fede4f431' \
--user ${POLARIS_API_KEY}: \
--header 'Content-Type: application/json' \
--data-raw '{
"desiredExecutionStatus": "canceled"
}'
import os
import requests
import json

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v2/jobs/efb35e3e-406e-4127-ad2e-280fede4f431"

apikey = os.getenv("POLARIS_API_KEY")

payload = json.dumps({
    "desiredExecutionStatus": "canceled"
})
headers = {
    'Authorization': f'Basic {apikey}',
    'Content-Type': 'application/json'
}

response = requests.put(url, headers=headers, data=payload)

print(response.text)

Sample response

When you successfully cancel an ingestion job, the Jobs v2 API returns the 200 OK status code and the details of the canceled job.
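Continuing the Python example above, you can confirm the cancellation client-side. This is an illustrative sketch that only inspects fields shown in this topic: the HTTP status code and the desiredExecutionStatus field of the returned job details.

```python
def confirm_cancellation(status_code, job):
    """Return True if the cancel request was accepted (200 OK) and the
    returned job details show the desired execution status as canceled.

    status_code: HTTP status code of the PUT response.
    job: the response body parsed as a dict.
    """
    return status_code == 200 and job.get("desiredExecutionStatus") == "canceled"
```

After the PUT request above, call `confirm_cancellation(response.status_code, response.json())`.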

Learn more

See the following topics for more information:

  • Create an ingestion job
  • Monitor performance metrics
Copyright © 2023 Imply Data, Inc