Filter data to ingest

You can use filters in Imply Polaris to ingest only the records that meet a specified condition. For example, you can use a filter to limit ingestion to a specific time range. An ingestion filter is a SQL WHERE clause that selects the records whose field values satisfy the condition.

Apply filter

To add a filter, select Filter in the Map source to table stage of ingestion. Enter your filter clause in the Filter pane. Ingestion filters refer to input field names, not table column names. Do not include the WHERE keyword in the clause.
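As a minimal sketch of these rules, the following shows what a filter entered in the Filter pane might look like. The input field name "country" and its value are hypothetical examples.

```sql
-- Entered in the Filter pane: the condition only, without the WHERE keyword.
-- "country" is a hypothetical input field name (the name used in the source
-- data, not the name of the table column it maps to).
"country" = 'New Zealand'
```

Entering WHERE "country" = 'New Zealand' instead would go against the instruction above, because the clause must omit the WHERE keyword.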

The following screenshot shows the Filter pane for an ingestion job:

[Screenshot: Ingest with filters]

Examples

This section shows examples of filters you can apply during ingestion. The timestamp examples assume an input field named "timestamp".

Timestamp filters:

  • "timestamp" IS NOT NULL: Ingest records that contain a timestamp.
  • TIME_IN_INTERVAL(TIME_PARSE("timestamp"), '2019-01-01/2020-01-01'): Ingest records that contain a timestamp between two given times, exclusive of the upper bound.
  • "timestamp" > TIMESTAMP '2019-01-01 00:00:00': Ingest records that contain a timestamp later than the specified time.

Note: Do not use CURRENT_TIMESTAMP or CURRENT_DATE in a filter. Polaris translates the function to a static timestamp when the job begins, and that timestamp does not update over time. For streaming ingestion jobs, you can instead set the earliest or latest rejection periods.
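To make the exclusive upper bound of the interval filter concrete, it can be read as a pair of explicit comparisons. This is a sketch that assumes "timestamp" holds ISO 8601 strings that TIME_PARSE can read and that times are interpreted in UTC. The second form is an assumed equivalent, written out for illustration.

```sql
-- Interval form from the list above: the start is included, the end is excluded.
TIME_IN_INTERVAL(TIME_PARSE("timestamp"), '2019-01-01/2020-01-01')

-- Assumed equivalent written as explicit comparisons. This is an alternative
-- filter, not something to combine with the expression above.
TIME_PARSE("timestamp") >= TIMESTAMP '2019-01-01 00:00:00'
  AND TIME_PARSE("timestamp") < TIMESTAMP '2020-01-01 00:00:00'
```

Under this reading, a record stamped 2019-12-31T23:59:59 is ingested and a record stamped 2020-01-01T00:00:00 is not. Both bounds are literals, so neither form relies on CURRENT_TIMESTAMP.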

Dimension filters:

  • "units_purchased" > 5: Ingest records where units_purchased is greater than 5.
  • "region" = 'North America': Ingest records corresponding to the North America region.
  • "product_color" <> 'white': Ingest records where product_color is not white.
  • "city" = 'Christchurch' AND "session_length" > 500: Ingest records that meet multiple conditions.
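Conditions can presumably also be grouped with parentheses and OR, as in standard SQL. The following sketch reuses field names from the list above and assumes that Polaris accepts these standard boolean operators in an ingestion filter. The 'Europe' value is a hypothetical addition.

```sql
-- Keep larger purchases from either of two regions.
-- The parentheses make the OR evaluate before the AND.
("region" = 'North America' OR "region" = 'Europe')
  AND "units_purchased" > 5
```

Without the parentheses, AND binds more tightly than OR in SQL, so the filter would keep every North America record regardless of units_purchased.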

Learn more

See the following topics for more information: