Skip to main content

Ingestion status reference

Each ingestion job in Imply Polaris provides information on row count metrics. This includes details on the number of rows successfully processed, rows filtered out during ingestion, and any rows that couldn't be ingested due to parsing issues. or unable to be ingested since the row was unable to be parsed.

This topic describes the row count metrics for an ingestion job.

View row count metrics

View the row count metrics for an ingestion job in the job details page.

You can also access these same metrics using the Jobs API. For more information, see View metrics for a job.

Job details with spotlight on row count metrics

Row count metrics reference

This section describes the row count metrics listed for batch and streaming ingestion jobs.

To limit the number of unparseable rows, you can specify a threshold for parsing exceptions in an ingestion job. The job fails if the number of parsing exceptions exceeds the threshold. Set this limit in maxParseExceptions when creating an ingestion job.

Batch ingestion

A batch ingestion job reports the following row count metrics. Polaris deduces batch metrics from the MSQ task report. For more information on MSQ task reports, see the Apache Druid documentation for SQL-based ingestion API.

Parsed successfully: Number of rows processed successfully without parsing errors.
Jobs API response property: numRowsProcessed

Ingested: Number of rows stored in the table.
This value may be fewer than the number of parsed rows when rollup is applied.
Jobs API response property: numRowsPersisted

Unparseable: Number of rows discarded because they could not be parsed.
Jobs API response property: numRowsSkippedByError

Streaming ingestion

Polaris obtains row count metrics from Apache Druid task metrics. For details on Apache Druid task metrics, see Task reference. A streaming ingestion job reports the following row count metrics.

Parsed successfully: Number of rows processed successfully without parsing errors.
Jobs API response property: numRowsProcessed
Apache Druid task metric: processed

Parsed with warnings: Number of rows processed with one or more parsing errors.
This typically occurs when Polaris can parse the input row but detects an invalid type, such as a string value ingested into a numeric column.
Jobs API response property: numRowsProcessedWithWarnings
Apache Druid task metric: processedWithErrors

Unparseable: Number of rows discarded because they could not be parsed.
Jobs API response property: numRowsSkippedByError
Apache Druid task metric: unparseable

Filtered: Number of rows skipped from a streaming ingestion job.
Polaris skips rows for late arriving event data or when a filter expression is defined. This number does not include any header rows skipped in CSV data.
Jobs API response property: numRowsSkippedByFilter
Apache Druid task metric: thrownAway

Learn more

See the following topics for more information: