Timestamp expressions
All data ingested into Imply Polaris must have an associated timestamp.
Polaris uses the timestamp to partition and sort data, and to perform time-based data management operations, such as dropping time chunks.
The primary timestamp column in Polaris is named __time
.
This topic shows how to map input fields to the __time
table column and how to transform input timestamps in a mapping.
Prerequisites
Your input data should have the timestamp information in one of the following formats, or in a form that can be transformed into one of these formats:
- ISO 8601
- Milliseconds since Unix epoch (00:00:00 UTC on January 1, 1970)
Streaming ingestion requires the transformed timestamp to be within 30 days of ingestion time. For more information, see Timestamp requirements.
If the source data has null or unparseable timestamps, Polaris fails the ingestion job. Either assign a default timestamp using an input expression, or filter out null timestamps using an ingestion filter.
Map input timestamp to table column
Follow these steps to assign the primary timestamp when creating an ingestion job:
In the source input schema, list the input fields that represent the timestamp. For example, a single input field that contains the timestamp or a combination of input fields that together contain the date and time. The UI may automatically detect your source input schema. In the API, list the input fields in
source.inputSchema
.Define the mapping used to optionally transform and parse the timestamp. In the mapping for
__time
, you can apply operations to transform the input fields. For the outermost function of the expression, apply one of the following functions to parse the timestamp:TIME_PARSE()
parses a string value containing times in ISO 8601 format, such as2023-01-01T01:02:03.456
.MILLIS_TO_TIMESTAMP()
parses a long value containing times in milliseconds since Unix epoch format, such as1667337655
.
Regardless of the input field name, Polaris always stores the primary timestamp in the
__time
column of your table. When creating an ingestion job using the Polaris API, one of the objects in themappings
array must contain the column name__time
.
ISO 8601 format
The following example shows the input expression for an input field named event_time
containing ISO 8601 timestamps:
TIME_PARSE("event_time")
Milliseconds since epoch format
The following example shows the input expression for an input field named event_time
, which contains timestamps in milliseconds since the Unix epoch:
MILLIS_TO_TIMESTAMP("event_time")
Input field named __time
Polaris accepts timestamps stored in a __time
column in your source data when it meets the following requirements:
- Formatted as milliseconds since Unix epoch
- Defined as a long data type in the source input schema
- Mapped using the identity mapping in the ingestion job
You cannot apply a transformation to an input field named __time
.
You cannot map another input field to the __time
column when you have an input field named __time
.
The input expression is only the input field name:
"__time"
The following API example shows the mapping for an input field named __time
:
"mappings": [
{
"columnName": "__time",
"expression": "\"__time\""
},
...
]
Timestamp expressions
Use input expressions to set a default timestamp, modify timestamp formats, or apply time operations before ingesting your data. For supported timestamp functions, see Date and time functions.
This section introduces example input expressions you can use to transform timestamps in an ingestion job.
Fill in missing timestamps
Certain data records may not contain a timestamp. To set a default value for missing timestamps, use the COALESCE
function in your input expression.
The following input expression applies a default time of 2022-07-25T02:47:05.000Z
for missing values in the input field event_time
:
COALESCE(TIME_PARSE("event_time"), TIMESTAMP '2022-07-25 02:47:05')
Set default timestamp
If you do not have an input field containing timestamps, you can set a static timestamp value using TIME_PARSE
or MILLIS_TO_TIMESTAMP
.
This approach is suitable for batch ingestion rather than streaming ingestion jobs,
which are subject to a late message rejection period.
The following example shows an input expression that sets a static timestamp in ISO 8601 format:
TIME_PARSE('2023-12-01T00:46:58.771Z')
The following example shows an input expression that sets a static timestamp using milliseconds since epoch format:
MILLIS_TO_TIMESTAMP(1699989996117)
Transform time since epoch
The following examples show expressions to transform an input field named event_time
from various units of time since epoch:
- Microseconds since epoch:
MILLIS_TO_TIMESTAMP("event_time" / 1000)
- Nanoseconds since epoch:
MILLIS_TO_TIMESTAMP("event_time" / 1000000)
- Seconds since epoch (POSIX time):
MILLIS_TO_TIMESTAMP("event_time" * 1000)
MILLIS_TO_TIMESTAMP
requires the input field parameter to be a long
data type.
Combine date and time fields
Suppose that your source data has two columns: date
, containing the date an event occurred, and time
, containing the time it occurred:
{
"date": "2015-09-12",
"time": "02:47:05.474Z"
}
The following input expression concatenates the date and time columns to generate the timestamp:
TIME_PARSE(CONCAT("date", 'T', "time"))
Shift timestamp values
You can shift timestamp values in an ingestion job specification. For example, you may want to update timestamps for streaming ingestion in Polaris.
The following input expression adds 14 days to all timestamps in the input field event_time
:
TIMESTAMPADD(DAY, 14, TIME_PARSE("event_time"))
Learn more
- For details on input expressions, see Map and transform data with input expressions.
- For details on the timestamp column, see Timestamp.
- To filter out data during ingestion, see Ingest with filters.