Skip to main content

Timestamp expressions

All data ingested into Imply Polaris must have an associated timestamp. Polaris uses the timestamp to partition and sort data, and to perform time-based data management operations, such as dropping time chunks. The primary timestamp column in Polaris is named __time. This topic shows how to map input fields to the __time table column and how to transform input timestamps in a mapping.

Prerequisites

Your input data should have the timestamp information in one of the following formats, or in a form that can be transformed into one of these formats:

  • ISO 8601
  • Milliseconds since Unix epoch (00:00:00 UTC on January 1, 1970)

Streaming ingestion requires the transformed timestamp to be within 30 days of ingestion time. For more information, see Timestamp requirements.

info

If the source data has null or unparseable timestamps, Polaris fails the ingestion job. Either assign a default timestamp using an input expression, or filter out null timestamps using an ingestion filter.

Map input timestamp to table column

Follow these steps to assign the primary timestamp when creating an ingestion job:

  1. In the source input schema, list the input fields that represent the timestamp. For example, a single input field that contains the timestamp or a combination of input fields that together contain the date and time. The UI may automatically detect your source input schema. In the API, list the input fields in source.inputSchema.

  2. Define the mapping used to optionally transform and parse the timestamp. In the mapping for __time, you can apply operations to transform the input fields. For the outermost function of the expression, apply one of the following functions to parse the timestamp:

    • TIME_PARSE() parses a string value containing times in ISO 8601 format, such as 2023-01-01T01:02:03.456.
    • MILLIS_TO_TIMESTAMP() parses a long value containing times in milliseconds since Unix epoch format, such as 1667337655.

    Regardless of the input field name, Polaris always stores the primary timestamp in the __time column of your table. When creating an ingestion job using the Polaris API, one of the objects in the mappings array must contain the column name __time.

ISO 8601 format

The following example shows the input expression for an input field named event_time containing ISO 8601 timestamps:

TIME_PARSE("event_time")

Milliseconds since epoch format

The following example shows the input expression for an input field named event_time, which contains timestamps in milliseconds since the Unix epoch:

MILLIS_TO_TIMESTAMP("event_time")

Input field named __time

Polaris accepts timestamps stored in a __time column in your source data when it meets the following requirements:

  • Formatted as milliseconds since Unix epoch
  • Defined as a long data type in the source input schema
  • Mapped using the identity mapping in the ingestion job

You cannot apply a transformation to an input field named __time. You cannot map another input field to the __time column when you have an input field named __time.

The input expression is only the input field name:

"__time"

The following API example shows the mapping for an input field named __time:

"mappings": [
{
"columnName": "__time",
"expression": "\"__time\""
},
...
]

Timestamp expressions

Use input expressions to set a default timestamp, modify timestamp formats, or apply time operations before ingesting your data. For supported timestamp functions, see Date and time functions.

This section introduces example input expressions you can use to transform timestamps in an ingestion job.

Fill in missing timestamps

Certain data records may not contain a timestamp. To set a default value for missing timestamps, use the COALESCE function in your input expression.

The following input expression applies a default time of 2022-07-25T02:47:05.000Z for missing values in the input field event_time:

COALESCE(TIME_PARSE("event_time"), TIMESTAMP '2022-07-25 02:47:05')

Set default timestamp

If you do not have an input field containing timestamps, you can set a static timestamp value using TIME_PARSE or MILLIS_TO_TIMESTAMP. This approach is suitable for batch ingestion rather than streaming ingestion jobs, which are subject to a late message rejection period.

The following example shows an input expression that sets a static timestamp in ISO 8601 format:

TIME_PARSE('2023-12-01T00:46:58.771Z')

The following example shows an input expression that sets a static timestamp using milliseconds since epoch format:

MILLIS_TO_TIMESTAMP(1699989996117)

Transform time since epoch

The following examples show expressions to transform an input field named event_time from various units of time since epoch:

  • Microseconds since epoch: MILLIS_TO_TIMESTAMP("event_time" / 1000)
  • Nanoseconds since epoch: MILLIS_TO_TIMESTAMP("event_time" / 1000000)
  • Seconds since epoch (POSIX time): MILLIS_TO_TIMESTAMP("event_time" * 1000)

MILLIS_TO_TIMESTAMP requires the input field parameter to be a long data type.

Combine date and time fields

Suppose that your source data has two columns: date, containing the date an event occurred, and time, containing the time it occurred:

{
"date": "2015-09-12",
"time": "02:47:05.474Z"
}

The following input expression concatenates the date and time columns to generate the timestamp:

TIME_PARSE(CONCAT("date", 'T', "time"))

Shift timestamp values

You can shift timestamp values in an ingestion job specification. For example, you may want to update timestamps for streaming ingestion in Polaris.

The following input expression adds 14 days to all timestamps in the input field event_time:

TIMESTAMPADD(DAY, 14, TIME_PARSE("event_time"))

Learn more