Ingest inline data by API

Imply Polaris supports batch ingestion from data you provide inline. You provide inline data directly in the ingestion job spec that you submit to the Jobs API.

With inline data, Polaris supports the following data formats:

  • Newline-delimited JSON (ND-JSON)
  • CSV
  • TSV
  • Semicolon-separated values (;)
  • Pipe-separated values (|)

For all other formats, upload the file first and then ingest from it. For details on how to ingest from files using the API, see Ingest data from files by API. For a list of all ingestion source options, see Ingestion sources overview.

This topic shows how to use the API to create an ingestion job with inline data in the request.

For reference on providing inline data using SQL-based ingestion, see the EXTERN function.

Prerequisites

To ingest inline data, you need an API key with the ManageIngestionJobs permission. In the examples below, the key value is stored in the variable named POLARIS_API_KEY. For information about how to obtain an API key and assign permissions, see API key authentication. For more information on permissions, see Permissions reference.

Ingest inline data

Submit a POST request to the /v1/projects/PROJECT_ID/jobs endpoint to start an ingestion job.

You don't need to create a table before starting an ingestion job. Set createTableIfNotExists to true in the ingestion job spec to instruct Polaris to automatically determine the table attributes from the job spec. For details, see Automatically created tables.

In the request payload, include the inline data in the source parameter. The source object takes the following fields:

  • type: Set to inline.

  • data: String containing the raw data.

    • ND-JSON example:

      "{\"timestamp\": 1722553997421, \"color\": \"red\", \"value\": \"#f00\"}\n{\"timestamp\": 1722554089087, \"color\": \"blue\",\"value\": \"#00f\"}"
    • CSV example:

      "0,1722553997421,values,formatted\n1,1722554089087,as,CSV"
  • formatSettings: The format of the data. Use {"format": "nd-json"} for ND-JSON or {"format": "csv"} for delimiter-separated values. Polaris supports comma (,), tab (\t), semicolon (;), and pipe (|) delimiters in inline data.

  • inputSchema: The schema of the input data as an array of objects, each containing a name and a dataType.
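Writing the data string by hand is error-prone once there are more than a few records. The following is a minimal Python sketch of one way to build the CSV form of the data field with the standard csv module. The function name rows_to_inline_csv and the variable names are hypothetical, and the sketch covers only the type, data, and formatSettings values described above. It does not show how a non-comma delimiter is declared to Polaris, and it assumes nothing about how Polaris treats quoted fields.

```python
import csv
import io


def rows_to_inline_csv(rows, delimiter=","):
    """Serialize rows into one delimiter-separated string for the `data` field.

    Hypothetical helper, not part of any Polaris client library.
    """
    buffer = io.StringIO()
    # Use "\n" between records, matching the CSV example in this topic.
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    # Drop the final newline so the string ends with the last record.
    return buffer.getvalue().rstrip("\n")


rows = [
    [0, 1722553997421, "values", "formatted"],
    [1, 1722554089087, "as", "CSV"],
]
csv_data = rows_to_inline_csv(rows)

# Partial source object: inputSchema is omitted here and still required.
csv_source_fields = {
    "type": "inline",
    "data": csv_data,
    "formatSettings": {"format": "csv"},
}
```

Serializing csv_source_fields with json.dumps escapes the newline as \n, which produces the same string form as the CSV example above.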

The following example shows a full source definition:

"source": {
  "type": "inline",
  "data": "{\"timestamp\": 1722553997421, \"color\": \"red\", \"value\": \"#f00\"}\n{\"timestamp\": 1722554089087, \"color\": \"blue\",\"value\": \"#00f\"}",
  "inputSchema": [
    {
      "dataType": "long",
      "name": "timestamp"
    },
    {
      "dataType": "string",
      "name": "color"
    },
    {
      "dataType": "string",
      "name": "value"
    }
  ],
  "formatSettings": {
    "format": "nd-json"
  }
},
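The escaped quotes in the data string are the hardest part of a source definition to write correctly by hand. As a rough illustration, the Python sketch below builds the same kind of source object from ordinary dictionaries and lets json.dumps handle the escaping. The function name build_inline_source and its parameters are hypothetical and are not part of any Polaris library.

```python
import json


def build_inline_source(records, schema):
    """Build an inline ND-JSON `source` object.

    records: list of dicts, one per row.
    schema:  list of (name, dataType) pairs describing the input fields.
    Hypothetical helper for illustration only.
    """
    # ND-JSON: one JSON document per line, joined with newlines.
    data = "\n".join(json.dumps(record) for record in records)
    return {
        "type": "inline",
        "data": data,
        "inputSchema": [
            {"dataType": data_type, "name": name} for name, data_type in schema
        ],
        "formatSettings": {"format": "nd-json"},
    }


records = [
    {"timestamp": 1722553997421, "color": "red", "value": "#f00"},
    {"timestamp": 1722554089087, "color": "blue", "value": "#00f"},
]
schema = [("timestamp", "long"), ("color", "string"), ("value", "string")]

source = build_inline_source(records, schema)
# Serializing the whole object escapes the inner quotes and newlines.
source_json = json.dumps({"source": source}, indent=2)
```

Because the escaping happens when the outer object is serialized, the records themselves stay readable in the script.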

For more information about the request payload for creating an ingestion job, see the Jobs v1 API documentation.

Sample request

The following example shows how to load inline data into a table named inline-colors.

curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
  "type": "batch",
  "target": {
    "type": "table",
    "tableName": "inline-colors"
  },
  "createTableIfNotExists": true,
  "source": {
    "type": "inline",
    "data": "{\"timestamp\": 1722553997421, \"color\": \"red\", \"value\": \"#f00\"}\n{\"timestamp\": 1722554089087, \"color\": \"blue\",\"value\": \"#00f\"}",
    "inputSchema": [
      {
        "dataType": "long",
        "name": "timestamp"
      },
      {
        "dataType": "string",
        "name": "color"
      },
      {
        "dataType": "string",
        "name": "value"
      }
    ],
    "formatSettings": {
      "format": "nd-json"
    }
  },
  "mappings": [
    {
      "columnName": "__time",
      "expression": "MILLIS_TO_TIMESTAMP(\"timestamp\")"
    },
    {
      "columnName": "color",
      "expression": "\"color\""
    },
    {
      "columnName": "value",
      "expression": "\"value\""
    }
  ]
}'
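For readers who script the API rather than call it from a shell, the following Python sketch prepares an equivalent request using only the standard library. It mirrors the curl example: the same endpoint path, headers, and job spec fields. The function names build_job_request and submit_job are hypothetical, and the base URL and project ID passed in are placeholders to replace with real values. The sketch builds the request without sending it, so it is safe to run as is.

```python
import json
import urllib.request

# Job spec with the same fields as the curl example.
job_spec = {
    "type": "batch",
    "target": {"type": "table", "tableName": "inline-colors"},
    "createTableIfNotExists": True,
    "source": {
        "type": "inline",
        "data": "\n".join([
            json.dumps({"timestamp": 1722553997421, "color": "red", "value": "#f00"}),
            json.dumps({"timestamp": 1722554089087, "color": "blue", "value": "#00f"}),
        ]),
        "inputSchema": [
            {"dataType": "long", "name": "timestamp"},
            {"dataType": "string", "name": "color"},
            {"dataType": "string", "name": "value"},
        ],
        "formatSettings": {"format": "nd-json"},
    },
    "mappings": [
        {"columnName": "__time", "expression": 'MILLIS_TO_TIMESTAMP("timestamp")'},
        {"columnName": "color", "expression": '"color"'},
        {"columnName": "value", "expression": '"value"'},
    ],
}


def build_job_request(base_url, project_id, api_key, spec):
    """Prepare (but do not send) the POST request. Hypothetical helper."""
    return urllib.request.Request(
        f"{base_url}/v1/projects/{project_id}/jobs",
        data=json.dumps(spec).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit_job(request):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example usage, with placeholders replaced by real values:
#   import os
#   request = build_job_request(
#       "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io",
#       "PROJECT_ID", os.environ["POLARIS_API_KEY"], job_spec)
#   job = submit_job(request)
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before any job is launched.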

Sample response

The following example shows a response to a successful ingestion job launch:

{
"source": {
"data": "{\"timestamp\": 1722553997421, \"color\": \"red\", \"value\": \"#f00\"}\n{\"timestamp\": 1722554089087, \"color\": \"blue\",\"value\": \"#00f\"}",
"inputSchema": [
{
"dataType": "long",
"name": "timestamp"
},
{
"dataType": "string",
"name": "color"
},
{
"dataType": "string",
"name": "value"
}
],
"formatSettings": {
"flattenSpec": null,
"format": "nd-json"
},
"type": "inline"
},
"context": {
"mode": "nonStrict",
"sqlQueryId": "0191104b-7e8e-72f5-9e2b-b2498a572a35",
"maxNumTasks": 75,
"faultTolerance": true,
"taskAssignment": "auto",
"maxParseExceptions": 2147483647,
"finalizeAggregations": true,
"durableShuffleStorage": true,
"catalogValidationEnabled": false,
"clusterStatisticsMergeMode": "SEQUENTIAL",
"groupByEnableMultiValueUnnesting": false
},
"filterExpression": null,
"ingestionMode": "append",
"mappings": [
{
"columnName": "__time",
"expression": "MILLIS_TO_TIMESTAMP(\"timestamp\")",
"isAggregation": null
},
{
"columnName": "color",
"expression": "\"color\"",
"isAggregation": null
},
{
"columnName": "value",
"expression": "\"value\"",
"isAggregation": null
}
],
"maxParseExceptions": 2147483647,
"query": "INSERT INTO \"inline-colors\"\nSELECT\n MILLIS_TO_TIMESTAMP(\"timestamp\") AS \"__time\",\n \"color\" AS \"color\",\n \"value\" AS \"value\"\nFROM TABLE(\n POLARIS_SOURCE(\n '{\"data\":\"{\\\"timestamp\\\": 1722553997421, \\\"color\\\": \\\"red\\\", \\\"value\\\": \\\"#f00\\\"}\\n{\\\"timestamp\\\": 1722554089087, \\\"color\\\": \\\"blue\\\",\\\"value\\\": \\\"#00f\\\"}\",\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"timestamp\"},{\"dataType\":\"string\",\"name\":\"color\"},{\"dataType\":\"string\",\"name\":\"value\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"inline\"}'\n )\n)\n\n\nPARTITIONED BY DAY",
"createdBy": {
"username": "api-key-pok_7udiv...xrujvd",
"userId": "b6340b70-3f30-4ccd-86a0-fe74ebfc7cbe"
},
"createdTimestamp": "2024-08-01T23:34:28.750852Z",
"desiredExecutionStatus": "running",
"executionStatus": "pending",
"health": {
"status": "ok"
},
"id": "0191104b-7e8e-72f5-9e2b-b2498a572a35",
"lastModifiedBy": {
"username": "api-key-pok_7udiv...xrujvd",
"userId": "b6340b70-3f30-4ccd-86a0-fe74ebfc7cbe"
},
"lastUpdatedTimestamp": "2024-08-01T23:34:28.750852Z",
"spec": {
"source": {
"data": "{\"timestamp\": 1722553997421, \"color\": \"red\", \"value\": \"#f00\"}\n{\"timestamp\": 1722554089087, \"color\": \"blue\",\"value\": \"#00f\"}",
"inputSchema": [
{
"dataType": "long",
"name": "timestamp"
},
{
"dataType": "string",
"name": "color"
},
{
"dataType": "string",
"name": "value"
}
],
"formatSettings": {
"flattenSpec": null,
"format": "nd-json"
},
"type": "inline"
},
"target": {
"tableName": "inline-colors",
"type": "table",
"intervals": null
},
"context": {
"mode": "nonStrict",
"sqlQueryId": "0191104b-7e8e-72f5-9e2b-b2498a572a35",
"maxNumTasks": 75,
"faultTolerance": true,
"taskAssignment": "auto",
"maxParseExceptions": 2147483647,
"finalizeAggregations": true,
"durableShuffleStorage": true,
"catalogValidationEnabled": false,
"clusterStatisticsMergeMode": "SEQUENTIAL",
"groupByEnableMultiValueUnnesting": false
},
"clusteringColumns": [],
"createTableIfNotExists": true,
"filterExpression": null,
"ingestionMode": "append",
"mappings": [
{
"columnName": "__time",
"expression": "MILLIS_TO_TIMESTAMP(\"timestamp\")",
"isAggregation": null
},
{
"columnName": "color",
"expression": "\"color\"",
"isAggregation": null
},
{
"columnName": "value",
"expression": "\"value\"",
"isAggregation": null
}
],
"maxParseExceptions": 2147483647,
"partitionedBy": null,
"replaceAll": false,
"type": "batch"
},
"target": {
"tableName": "inline-colors",
"type": "table",
"intervals": null
},
"type": "batch",
"completedTimestamp": null,
"startedTimestamp": null
}
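Most of the response echoes the submitted spec. A script usually needs only a few fields, such as the job ID and the execution status. The Python sketch below extracts those from a response body. The function name summarize_job is hypothetical, the field names are taken from the sample response above, and the trimmed response string is an abbreviated stand-in for a real response.

```python
import json


def summarize_job(response_text):
    """Pick out the fields most useful for tracking a newly created job.

    Hypothetical helper; field names follow the sample response.
    """
    job = json.loads(response_text)
    return {
        "id": job["id"],
        "executionStatus": job["executionStatus"],
        # `health` may be missing or null, so fall back to an empty dict.
        "healthStatus": (job.get("health") or {}).get("status"),
        "tableName": job["target"]["tableName"],
    }


# Abbreviated stand-in for the sample response shown above.
trimmed_response = json.dumps({
    "id": "0191104b-7e8e-72f5-9e2b-b2498a572a35",
    "executionStatus": "pending",
    "health": {"status": "ok"},
    "target": {"tableName": "inline-colors", "type": "table", "intervals": None},
    "type": "batch",
})

summary = summarize_job(trimmed_response)
```

In the sample response, executionStatus is pending immediately after launch, while desiredExecutionStatus is running.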

Learn more

See the following topics for more information: