Skip to main content

Ingest nested data

You can seamlessly ingest nested input data using a JSON column regardless of ingestion method. In Imply Polaris, you can store the JSON object and access properties within the JSON object.

Ingest nested data

In the Parse data stage of ingestion, Polaris assigns JSON as the data type for the input field with nested data. For the example data in nested_example_data.json, the following screenshot shows the shipTo and details input fields as a JSON type:

Parse data with nested fields

In the Map source to table stage, Polaris maps the input field to a table column with data type JSON. Click on any row of a JSON column to view and copy the nested data. The following screenshot shows the table preview with details of a nested data cell:

Map source to table with nested data dialog

The example removes the string column populated from the time input field and uses an input expression to parse the time: TIME_PARSE("time")

Transform nested data at ingestion

You can optionally transform nested data using input expressions in the Map source to table stage. SQL JSON functions lists the supported Druid SQL functions to operate on nested data.

info

For CSV input data, you can ingest nested data with escaped quotation marks, such as "{""id"":""123456""}". However, to access the nested data, you must call PARSE_JSON on the input field when mapping source to table. For example, PARSE_JSON("inputField").

For example, you might want to extract values from the nested data to populate a separate table column. The following example shows an input expression to extract the price information from the details input field:

JSON_VALUE("details", '$.price')

JSONPath syntax lists the supported syntax for extracting JSON values. You could then create a new table column called price and enter the input expression for the column. The following screenshot shows the table with the new column populated with values from the nested data:

Map source to table with input expression

Similarly, you can create a new JSON table column with nested data extracted from the input data. The following expression for the address column illustrates this:

JSON_QUERY("shipTo", '$.address')

Map source to table with input expression to create JSON column

Learn more

See the following topics for more information: