Create lookups by API
Lookups are a beta feature available to select customers. Imply must enable the feature for you. Contact Polaris support to find out more.
You can use lookups in Imply Polaris to enrich your data with additional information from a lookup source. A lookup function returns mapped entries from the lookup source based on the contents of a column you specify. Both the key column and the value column must be string-typed columns.
Lookups in Polaris support one-to-many relationships: a single key can map to multiple values. For example, a vehicle ID can map to a vehicle's make, model, and color.
Lookup aliases allow you to retain your existing query syntax when you migrate lookups from Druid to Polaris. You must use the Lookups v1 API to manage lookup aliases—you can't do that in the Polaris UI.
This topic explains how to create, use, and manage lookup tables, lookups, and lookup aliases using the Polaris API. For information on lookup strategies and lookup source requirements, see Lookups.
Prerequisites
Before you can create and use lookups and lookup aliases, you must have an API key with the following permissions:
- ManageTables to create a lookup table, lookups, and lookup aliases.
- ManageIngestionJobs to ingest data into a lookup table.
- AccessQueries to run SQL queries that use lookups.
The examples in this topic show the API key value stored in the variable POLARIS_API_KEY.
For information about how to obtain an API key and assign permissions, see API key authentication. For more information on permissions, see Permissions reference.
Examples in this topic
The example requests in this topic reference two JSON files:
- car-trips.json represents all the trips taken by a fleet of cars.
- car-makes.json contains information on each car's make, model, and color.
Both files contain a column vid for vehicle ID. By creating a lookup table from car-makes, you can use vid to obtain car details that aren't present in car-trips.
If you want to execute the example requests in this topic, first upload these files to Polaris. See Upload files for instructions on how to do this using the Files v1 API.
Create a lookup table
Polaris only supports existing tables as a lookup source. A lookup table must have the following:
- Partitioning granularity all.
- A single segment.
- One or more string columns to use as key-value pairs.
- One or more key columns containing unique values.
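You can sanity-check the first two requirements programmatically before pointing a lookup at a table. This is a minimal sketch; the partitioningGranularity and segmentCount field names are assumptions inferred from the query parameters used later in this topic, so confirm them against the Tables API reference.

```python
def is_eligible_lookup_source(table: dict) -> bool:
    """Check a table description against the lookup-table requirements:
    ALL partitioning granularity and exactly one segment.
    Field names are assumptions; verify against the Tables API payload."""
    return (
        table.get("partitioningGranularity") == "all"
        and table.get("segmentCount") == 1
    )

print(is_eligible_lookup_source({"partitioningGranularity": "all", "segmentCount": 1}))  # True
print(is_eligible_lookup_source({"partitioningGranularity": "day", "segmentCount": 4}))  # False
```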
To create a lookup table, send a POST request to the /v1/projects/PROJECT_ID/jobs endpoint.
Sample request
The following example query creates a lookup table car-makes from the car-makes.json file. It casts vid and year to strings to make them eligible for use in lookups. The createTableIfNotExists property tells Polaris to create a new table if a table with the same name doesn't exist.
INSERT INTO "car-makes"
SELECT
CURRENT_TIMESTAMP AS "__time",
CAST("vid" AS VARCHAR) AS "vid",
"make" AS "make",
"model" AS "model",
CAST("year" AS VARCHAR) AS "year",
"color" AS "color"
FROM TABLE(
POLARIS_SOURCE(
'{
"type":"uploaded",
"fileList": ["car-makes.json"],
"formatSettings": {"format":"nd-json"},
"inputSchema":[
{"dataType":"long","name":"vid"},
{"dataType":"string","name":"make"},
{"dataType":"string","name":"model"},
{"dataType":"long","name":"year"},
{"dataType":"string","name":"color"}
]
}'))
PARTITIONED BY ALL
The full SQL-based ingestion request to the Jobs v1 API is as follows:
- cURL
- Python
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
"query": "INSERT INTO \"car-makes\" SELECT CURRENT_TIMESTAMP AS \"__time\", CAST(\"vid\" AS VARCHAR) AS \"vid\", make AS \"make\", model AS \"model\", CAST(\"year\" AS VARCHAR) AS \"year\", color AS \"color\" FROM TABLE(POLARIS_SOURCE('\''{\"fileList\":[\"car-makes.json\"],\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"vid\"},{\"dataType\":\"string\",\"name\":\"make\"},{\"dataType\":\"string\",\"name\":\"model\"},{\"dataType\":\"long\",\"name\":\"year\"},{\"dataType\":\"string\",\"name\":\"color\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}'\'')) PARTITIONED BY ALL",
"createTableIfNotExists": true
}'
import os
import requests
import json

apikey = os.environ["POLARIS_API_KEY"]
url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
payload = json.dumps({
"query": "INSERT INTO \"car-makes\" SELECT CURRENT_TIMESTAMP AS \"__time\", CAST(\"vid\" AS VARCHAR) AS \"vid\", make AS \"make\", model AS \"model\", CAST(\"year\" AS VARCHAR) AS \"year\", color AS \"color\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-makes.json\"],\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"vid\"},{\"dataType\":\"string\",\"name\":\"make\"},{\"dataType\":\"string\",\"name\":\"model\"},{\"dataType\":\"long\",\"name\":\"year\"},{\"dataType\":\"string\",\"name\":\"color\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY ALL",
"createTableIfNotExists": true
})
headers = {
'Authorization': f'Basic {apikey}',
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
"query": "INSERT INTO \"car-makes\" SELECT CURRENT_TIMESTAMP AS \"__time\", CAST(\"vid\" AS VARCHAR) AS \"vid\", make AS \"make\", model AS \"model\", CAST(\"year\" AS VARCHAR) AS \"year\", color AS \"color\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-makes.json\"],\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"vid\"},{\"dataType\":\"string\",\"name\":\"make\"},{\"dataType\":\"string\",\"name\":\"model\"},{\"dataType\":\"long\",\"name\":\"year\"},{\"dataType\":\"string\",\"name\":\"color\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY ALL",
"context": {
"mode": "nonStrict",
"sqlQueryId": "0190fefb-e0e4-7d97-9869-628a1e4367cc",
"maxNumTasks": 75,
"faultTolerance": true,
"maxNumSegments": null,
"taskAssignment": "auto",
"maxParseExceptions": 2147483647,
"finalizeAggregations": true,
"durableShuffleStorage": true,
"catalogValidationEnabled": false,
"clusterStatisticsMergeMode": "SEQUENTIAL",
"groupByEnableMultiValueUnnesting": false
},
"parameters": [],
"target": {
"tableName": "car-makes",
"type": "table",
"intervals": null
},
"createdBy": {
"username": "api-key-pok_9obmg...fomjbc",
"userId": "8741a40a-dfaa-4eea-abd4-f474b27709c4"
},
"createdTimestamp": "2024-07-29T14:53:58.372606Z",
"desiredExecutionStatus": "running",
"executionStatus": "pending",
"health": {
"status": "ok"
},
"id": "0190fefb-e0e4-7d97-9869-628a1e4367cc",
"lastModifiedBy": {
"username": "api-key-pok_9obmg...fomjbc",
"userId": "8741a40a-dfaa-4eea-abd4-f474b27709c4"
},
"lastUpdatedTimestamp": "2024-07-29T14:53:58.372606Z",
"spec": {
"query": "INSERT INTO \"car-makes\" SELECT CURRENT_TIMESTAMP AS \"__time\", CAST(\"vid\" AS VARCHAR) AS \"vid\", make AS \"make\", model AS \"model\", CAST(\"year\" AS VARCHAR) AS \"year\", color AS \"color\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-makes.json\"],\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"vid\"},{\"dataType\":\"string\",\"name\":\"make\"},{\"dataType\":\"string\",\"name\":\"model\"},{\"dataType\":\"long\",\"name\":\"year\"},{\"dataType\":\"string\",\"name\":\"color\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY ALL",
"context": {
"mode": "nonStrict",
"sqlQueryId": "0190fefb-e0e4-7d97-9869-628a1e4367cc",
"maxNumTasks": 75,
"faultTolerance": true,
"maxNumSegments": null,
"taskAssignment": "auto",
"maxParseExceptions": 2147483647,
"finalizeAggregations": true,
"durableShuffleStorage": true,
"catalogValidationEnabled": false,
"clusterStatisticsMergeMode": "SEQUENTIAL",
"groupByEnableMultiValueUnnesting": false
},
"createTableIfNotExists": true,
"parameters": [],
"type": "sql"
},
"type": "sql",
"completedTimestamp": null,
"startedTimestamp": null
}
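The response above reports an executionStatus of pending: ingestion runs asynchronously, so poll the job by its id until it reaches a terminal state before using the table as a lookup source. A minimal helper for the polling decision might look like this; only "pending" and "running" appear in this topic, so treating anything else as terminal is an assumption to verify against the Jobs v1 API reference.

```python
def job_is_done(job: dict) -> bool:
    """Return True when a Jobs v1 payload no longer reports an
    in-progress executionStatus. The in-progress set below is an
    assumption based on the statuses shown in this topic."""
    return job.get("executionStatus") not in {"pending", "running"}

print(job_is_done({"executionStatus": "pending"}))  # False
```

In practice you'd GET the job in a loop with a short sleep until job_is_done returns True.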
Identify eligible lookup tables
Let's say you've created 10 lookup tables, but you have 1000 tables in your project. To determine which tables are eligible lookup sources, get the list of tables that have all partitioning granularity and a single segment. To do so, send the following request using the Projects v1 API:
GET https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/tables?partitioningGranularity=all&segmentCount=1
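The same request can be built in Python. This sketch only constructs the filtered URL; sending it with the Authorization header follows the pattern used throughout this topic.

```python
from urllib.parse import urlencode

def eligible_lookup_sources_url(base_url: str, project_id: str) -> str:
    """Build the Projects v1 URL that filters for tables with ALL
    partitioning granularity and a single segment."""
    params = {"partitioningGranularity": "all", "segmentCount": 1}
    return f"{base_url}/v1/projects/{project_id}/tables?{urlencode(params)}"

url = eligible_lookup_sources_url(
    "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io", "PROJECT_ID"
)
print(url)
```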
Create a lookup
Once you have a lookup table, create a lookup to reference data from the lookup table.
To create a lookup, send a POST request to the /v1/projects/PROJECT_ID/lookups endpoint.
Sample request
The following example creates a lookup named makes-lookup that references the car-makes lookup table:
- cURL
- Python
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/lookups" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
"name": "makes-lookup",
"source": {
"tableName": "car-makes",
"type": "table"
}
}'
import os
import requests
import json

apikey = os.environ["POLARIS_API_KEY"]
url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/lookups"
payload = json.dumps({
"name": "makes-lookup",
"source": {
"tableName": "car-makes",
"type": "table"
}
})
headers = {
'Authorization': f'Basic {apikey}',
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
"name": "makes-lookup",
"source": {
"tableName": "car-makes",
"type": "table"
}
}
Use a lookup
You can use lookups in two ways:
- At query time, to extract additional information from a lookup table when querying a data table—the data table provides most of the information in the result set.
- At ingestion time, to reduce the query burden later. You ingest data from both the data source and the lookup table into a new table.
Query example
The following example query uses the makes-lookup lookup from the previous step to extract car make and model and display them alongside details from a table called car-trips.
To run this query, first create a Polaris table using the car-trips.json file. Remember to cast vid as a string-typed column using the input expression CAST("vid" AS VARCHAR).
Where vid isn't present in the car-makes table, N/A displays in the results.
SELECT
vid,
TIMESTAMPDIFF(MINUTE, TIME_PARSE(start_time), __time) AS "elapsed_minutes",
len_miles,
avg_mph,
LOOKUP(vid, 'makes-lookup[vid][make]', 'N/A') AS "make",
LOOKUP(vid, 'makes-lookup[vid][model]', 'N/A') AS "model"
FROM
"car-trips"
WHERE
len_miles > 35 AND avg_mph > 30
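Conceptually, a lookup behaves like a key-to-value map built from two string columns of the lookup table, with the third LOOKUP argument as the default for missing keys. In Python terms:

```python
# Hypothetical subset of the makes-lookup mapping (vid -> make);
# the values here are illustrative, not real data.
makes = {"48949": "BMW", "52356": "BMW"}

def lookup(key: str, mapping: dict, default: str = "N/A") -> str:
    """Mimic LOOKUP(key, 'lookup[key_col][value_col]', default)."""
    return mapping.get(key, default)

print(lookup("48949", makes))  # BMW
print(lookup("99999", makes))  # N/A (key absent from the lookup table)
```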
The full request to the Query v1 API is as follows:
- cURL
- Python
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/query/sql" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
"query": "SELECT vid, TIMESTAMPDIFF(MINUTE, TIME_PARSE(start_time), __time) AS \"elapsed_minutes\", len_miles, avg_mph, LOOKUP(vid, '\''makes-lookup[vid][make]'\'', '\''N/A'\'') AS \"make\", LOOKUP(vid, '\''makes-lookup[vid][model]'\'', '\''N/A'\'') AS \"model\" FROM \"car-trips\" WHERE len_miles > 35 AND avg_mph > 30"
}'
import os
import requests
import json

apikey = os.environ["POLARIS_API_KEY"]
url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/query/sql"
payload = json.dumps({
"query": "SELECT vid, TIMESTAMPDIFF(MINUTE, TIME_PARSE(start_time), __time) AS \"elapsed_minutes\", len_miles, avg_mph, LOOKUP(vid, 'makes-lookup[vid][make]', 'N/A') AS \"make\", LOOKUP(vid, 'makes-lookup[vid][model]', 'N/A') AS \"model\" FROM \"car-trips\" WHERE len_miles > 35 AND avg_mph > 30"
})
headers = {
'Authorization': f'Basic {apikey}',
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
The following example shows a successful response:
[
{
"vid": "48949",
"elapsed_minutes": 41,
"len_miles": 38.24,
"avg_mph": 55.64,
"make": "BMW",
"model": "M8"
},
{
"vid": "52356",
"elapsed_minutes": 58,
"len_miles": 35.42,
"avg_mph": 36.03,
"make": "BMW",
"model": "i5"
},
{
"vid": "88840",
"elapsed_minutes": 55,
"len_miles": 35.42,
"avg_mph": 38.11,
"make": "Honda",
"model": "Odyssey"
},
{
"vid": "47600",
"elapsed_minutes": 48,
"len_miles": 35.75,
"avg_mph": 44.39,
"make": "Toyota",
"model": "Prius"
}
]
Ingestion example
The example in this section shows how to use lookups during ingestion.
Ingest data into a table using a lookup
The following example query creates a table car-trips-makes which includes columns from the car-trips.json file. It uses the makes-lookup lookup to include make and model columns from the car-makes lookup table.
INSERT INTO "car-trips-makes"
SELECT
TIME_PARSE(end_time) AS "__time",
start_time AS "start_time",
vid AS "vid",
len_miles AS "len_miles",
avg_mph AS "avg_mph",
LOOKUP(vid, 'makes-lookup[vid][make]') AS "make",
LOOKUP(vid, 'makes-lookup[vid][model]') AS "model"
FROM TABLE(
POLARIS_SOURCE(
'{
"type":"uploaded",
"fileList": ["car-trips.json"],
"formatSettings": {"format":"nd-json"},
"inputSchema":[
{"dataType":"string","name":"end_time"},
{"dataType":"string","name":"start_time"},
{"dataType":"string","name":"vid"},
{"dataType":"double","name":"len_miles"},
{"dataType":"double","name":"avg_mph"}
]
}'))
PARTITIONED BY DAY
The full SQL-based ingestion request to the Jobs v1 API is as follows:
- cURL
- Python
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
"query": "INSERT INTO \"car-trips-makes\" SELECT TIME_PARSE(end_time) AS \"__time\", start_time AS \"start_time\", vid AS \"vid\", len_miles AS \"len_miles\", avg_mph AS \"avg_mph\", LOOKUP(vid, '\''makes-lookup[vid][make]'\'') AS \"make\", LOOKUP(vid, '\''makes-lookup[vid][model]'\'') AS \"model\" FROM TABLE(POLARIS_SOURCE('\''{\"fileList\":[\"car-trips.json\"],\"inputSchema\":[{\"dataType\":\"string\",\"name\":\"end_time\"},{\"dataType\":\"string\",\"name\":\"start_time\"},{\"dataType\":\"string\",\"name\":\"vid\"},{\"dataType\":\"double\",\"name\":\"len_miles\"},{\"dataType\":\"double\",\"name\":\"avg_mph\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}'\'')) PARTITIONED BY DAY",
"createTableIfNotExists": true
}'
import os
import requests
import json

apikey = os.environ["POLARIS_API_KEY"]
url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
payload = json.dumps({
"query": "INSERT INTO \"car-trips-makes\" SELECT TIME_PARSE(end_time) AS \"__time\", start_time AS \"start_time\", vid AS \"vid\", len_miles AS \"len_miles\", avg_mph AS \"avg_mph\", LOOKUP(vid, 'makes-lookup[vid][make]') AS \"make\", LOOKUP(vid, 'makes-lookup[vid][model]') AS \"model\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-trips.json\"],\"inputSchema\":[{\"dataType\":\"string\",\"name\":\"end_time\"},{\"dataType\":\"string\",\"name\":\"start_time\"},{\"dataType\":\"string\",\"name\":\"vid\"},{\"dataType\":\"double\",\"name\":\"len_miles\"},{\"dataType\":\"double\",\"name\":\"avg_mph\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY DAY",
"createTableIfNotExists": true
})
headers = {
'Authorization': f'Basic {apikey}',
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
The following example shows a successful response:
{
"query": "INSERT INTO \"car-trips-makes\" SELECT TIME_PARSE(end_time) AS \"__time\", start_time AS \"start_time\", vid AS \"vid\", len_miles AS \"len_miles\", avg_mph AS \"avg_mph\", LOOKUP(vid, 'makes-lookup[vid][make]') AS \"make\", LOOKUP(vid, 'makes-lookup[vid][model]') AS \"model\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-trips.json\"],\"inputSchema\":[{\"dataType\":\"string\",\"name\":\"end_time\"},{\"dataType\":\"string\",\"name\":\"start_time\"},{\"dataType\":\"string\",\"name\":\"vid\"},{\"dataType\":\"double\",\"name\":\"len_miles\"},{\"dataType\":\"double\",\"name\":\"avg_mph\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY DAY",
"context": {
"mode": "nonStrict",
"sqlQueryId": "0190ef63-08b1-792c-bff9-cdab2e444c9f",
"maxNumTasks": 75,
"faultTolerance": true,
"maxNumSegments": null,
"taskAssignment": "auto",
"maxParseExceptions": 2147483647,
"finalizeAggregations": true,
"durableShuffleStorage": true,
"catalogValidationEnabled": false,
"clusterStatisticsMergeMode": "SEQUENTIAL",
"groupByEnableMultiValueUnnesting": false
},
"parameters": [],
"target": {
"tableName": "car-trips-makes",
"type": "table",
"intervals": null
},
"createdBy": {
"username": "api-key-pok_9obmg...fomjbc",
"userId": "8741a40a-dfaa-4eea-abd4-f474b27709c4"
},
"createdTimestamp": "2024-07-26T14:12:43.313912Z",
"desiredExecutionStatus": "running",
"executionStatus": "pending",
"health": {
"status": "ok"
},
"id": "0190ef63-08b1-792c-bff9-cdab2e444c9f",
"lastModifiedBy": {
"username": "api-key-pok_9obmg...fomjbc",
"userId": "8741a40a-dfaa-4eea-abd4-f474b27709c4"
},
"lastUpdatedTimestamp": "2024-07-26T14:12:43.313912Z",
"spec": {
"query": "INSERT INTO \"car-trips-makes\" SELECT TIME_PARSE(end_time) AS \"__time\", start_time AS \"start_time\", vid AS \"vid\", len_miles AS \"len_miles\", avg_mph AS \"avg_mph\", LOOKUP(vid, 'makes-lookup[vid][make]') AS \"make\", LOOKUP(vid, 'makes-lookup[vid][model]') AS \"model\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-trips.json\"],\"inputSchema\":[{\"dataType\":\"string\",\"name\":\"end_time\"},{\"dataType\":\"string\",\"name\":\"start_time\"},{\"dataType\":\"string\",\"name\":\"vid\"},{\"dataType\":\"double\",\"name\":\"len_miles\"},{\"dataType\":\"double\",\"name\":\"avg_mph\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY DAY",
"context": {
"mode": "nonStrict",
"sqlQueryId": "0190ef63-08b1-792c-bff9-cdab2e444c9f",
"maxNumTasks": 75,
"faultTolerance": true,
"maxNumSegments": null,
"taskAssignment": "auto",
"maxParseExceptions": 2147483647,
"finalizeAggregations": true,
"durableShuffleStorage": true,
"catalogValidationEnabled": false,
"clusterStatisticsMergeMode": "SEQUENTIAL",
"groupByEnableMultiValueUnnesting": false
},
"createTableIfNotExists": true,
"parameters": [],
"type": "sql"
},
"type": "sql",
"completedTimestamp": null,
"startedTimestamp": null
}
Refresh a lookup table
To refresh a lookup table, replace the data in the table that the lookup references.
For example, suppose new and updated information on car makes and models exists in a new file called car-makes-new.json. Reingesting that file into the car-makes lookup table refreshes the data that the makes-lookup lookup returns.
Sample request
The following query replaces all data in the car-makes lookup table using REPLACE INTO ... OVERWRITE ALL syntax, referencing the new file and adhering to the ALL partitioning granularity requirement.
REPLACE INTO "car-makes"
OVERWRITE ALL
SELECT
CURRENT_TIMESTAMP AS "__time",
CAST("vid" AS VARCHAR) AS "vid",
"make" AS "make",
"model" AS "model",
CAST("year" AS VARCHAR) AS "year",
"color" AS "color"
FROM TABLE(
POLARIS_SOURCE(
'{
"type":"uploaded",
"fileList": ["car-makes-new.json"],
"formatSettings": {"format":"nd-json"},
"inputSchema":[
{"dataType":"string","name":"vid"},
{"dataType":"string","name":"make"},
{"dataType":"string","name":"model"},
{"dataType":"long","name":"year"},
{"dataType":"string","name":"color"}
]
}'))
PARTITIONED BY ALL
The full SQL-based ingestion request to the Jobs v1 API is as follows:
- cURL
- Python
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
"query": "REPLACE INTO \"car-makes\" OVERWRITE ALL SELECT CURRENT_TIMESTAMP AS \"__time\", CAST(\"vid\" AS VARCHAR) AS \"vid\", make AS \"make\", model AS \"model\", CAST(\"year\" AS VARCHAR) AS \"year\", color AS \"color\" FROM TABLE(POLARIS_SOURCE('\''{\"fileList\":[\"car-makes-new.json\"],\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"vid\"},{\"dataType\":\"string\",\"name\":\"make\"},{\"dataType\":\"string\",\"name\":\"model\"},{\"dataType\":\"long\",\"name\":\"year\"},{\"dataType\":\"string\",\"name\":\"color\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}'\'')) PARTITIONED BY ALL",
"createTableIfNotExists": true
}'
import os
import requests
import json

apikey = os.environ["POLARIS_API_KEY"]
url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
payload = json.dumps({
"query": "REPLACE INTO \"car-makes\" OVERWRITE ALL SELECT CURRENT_TIMESTAMP AS \"__time\", CAST(\"vid\" AS VARCHAR) AS \"vid\", make AS \"make\", model AS \"model\", CAST(\"year\" AS VARCHAR) AS \"year\", color AS \"color\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-makes-new.json\"],\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"vid\"},{\"dataType\":\"string\",\"name\":\"make\"},{\"dataType\":\"string\",\"name\":\"model\"},{\"dataType\":\"long\",\"name\":\"year\"},{\"dataType\":\"string\",\"name\":\"color\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY ALL",
"createTableIfNotExists": true
})
headers = {
'Authorization': f'Basic {apikey}',
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
"query": "REPLACE INTO \"car-makes\" OVERWRITE ALL SELECT CURRENT_TIMESTAMP AS \"__time\", CAST(\"vid\" AS VARCHAR) AS \"vid\", make AS \"make\", model AS \"model\", CAST(\"year\" AS VARCHAR) AS \"year\", color AS \"color\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-makes-new.json\"],\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"vid\"},{\"dataType\":\"string\",\"name\":\"make\"},{\"dataType\":\"string\",\"name\":\"model\"},{\"dataType\":\"long\",\"name\":\"year\"},{\"dataType\":\"string\",\"name\":\"color\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY ALL",
"context": {
"mode": "nonStrict",
"sqlQueryId": "0190ff21-b8d9-70a9-b005-475fb61c8cad",
"maxNumTasks": 75,
"faultTolerance": true,
"maxNumSegments": null,
"taskAssignment": "auto",
"maxParseExceptions": 2147483647,
"finalizeAggregations": true,
"durableShuffleStorage": true,
"catalogValidationEnabled": false,
"clusterStatisticsMergeMode": "SEQUENTIAL",
"groupByEnableMultiValueUnnesting": false
},
"parameters": [],
"target": {
"tableName": "car-makes",
"type": "table",
"intervals": null
},
"createdBy": {
"username": "api-key-pok_9obmg...fomjbc",
"userId": "8741a40a-dfaa-4eea-abd4-f474b27709c4"
},
"createdTimestamp": "2024-07-29T15:35:18.48943Z",
"desiredExecutionStatus": "running",
"executionStatus": "pending",
"health": {
"status": "ok"
},
"id": "0190ff21-b8d9-70a9-b005-475fb61c8cad",
"lastModifiedBy": {
"username": "api-key-pok_9obmg...fomjbc",
"userId": "8741a40a-dfaa-4eea-abd4-f474b27709c4"
},
"lastUpdatedTimestamp": "2024-07-29T15:35:18.48943Z",
"spec": {
"query": "REPLACE INTO \"car-makes\" OVERWRITE ALL SELECT CURRENT_TIMESTAMP AS \"__time\", CAST(\"vid\" AS VARCHAR) AS \"vid\", make AS \"make\", model AS \"model\", CAST(\"year\" AS VARCHAR) AS \"year\", color AS \"color\" FROM TABLE(POLARIS_SOURCE('{\"fileList\":[\"car-makes-new.json\"],\"inputSchema\":[{\"dataType\":\"long\",\"name\":\"vid\"},{\"dataType\":\"string\",\"name\":\"make\"},{\"dataType\":\"string\",\"name\":\"model\"},{\"dataType\":\"long\",\"name\":\"year\"},{\"dataType\":\"string\",\"name\":\"color\"}],\"formatSettings\":{\"format\":\"nd-json\"},\"type\":\"uploaded\"}')) PARTITIONED BY ALL",
"context": {
"mode": "nonStrict",
"sqlQueryId": "0190ff21-b8d9-70a9-b005-475fb61c8cad",
"maxNumTasks": 75,
"faultTolerance": true,
"maxNumSegments": null,
"taskAssignment": "auto",
"maxParseExceptions": 2147483647,
"finalizeAggregations": true,
"durableShuffleStorage": true,
"catalogValidationEnabled": false,
"clusterStatisticsMergeMode": "SEQUENTIAL",
"groupByEnableMultiValueUnnesting": false
},
"createTableIfNotExists": false,
"parameters": [],
"type": "sql"
},
"type": "sql",
"completedTimestamp": null,
"startedTimestamp": null
}
Create a lookup alias
When you migrate lookups from open source Apache Druid or Imply's distribution of Apache Druid to Polaris, you can retain your existing lookups without change. In Polaris, you need a lookup table and a lookup. You can then use lookup aliases for query compatibility. See Migrate lookups from Druid for more information on migrating lookups from Druid.
A lookup alias consists of a reference to a lookup name, a key column, and a value column.
To create a lookup alias, send a PUT request to the /v1/projects/PROJECT_ID/lookups/LOOKUP_NAME/aliases endpoint.
Sample request
The following example creates two lookup aliases, makes-alias and models-alias, for the makes-lookup lookup. The aliases target the make and model columns in the lookup table—in this case, the car-makes table as defined in makes-lookup.
- cURL
- Python
curl --location --request PUT "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/lookups/makes-lookup/aliases" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
"values": [
{
"key": "vid",
"name": "makes-alias",
"value": "make"
},
{
"key": "vid",
"name": "models-alias",
"value": "model"
}
]
}'
import os
import requests
import json

apikey = os.environ["POLARIS_API_KEY"]
url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/lookups/makes-lookup/aliases"
payload = json.dumps({
"values": [
{
"key": "vid",
"name": "makes-alias",
"value": "make"
},
{
"key": "vid",
"name": "models-alias",
"value": "model"
}
]
})
headers = {
'Authorization': f'Basic {apikey}',
'Content-Type': 'application/json'
}
response = requests.request("PUT", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
"values": [
{
"key": "vid",
"name": "makes-alias",
"value": "make"
},
{
"key": "vid",
"name": "models-alias",
"value": "model"
}
]
}
Update or delete a lookup alias
To update a lookup alias, send a PUT request to the /v1/projects/PROJECT_ID/lookups/LOOKUP_NAME/aliases endpoint.
You must provide the details of all aliases you want to retain for the lookup, even if you don't want to update them. Polaris recreates the lookup's aliases with the details you specify in the request. Polaris creates a new alias if the supplied alias name doesn't exist.
To delete lookup aliases, specify the aliases you want to keep in the request and omit the aliases you want to delete.
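Because the endpoint recreates the full alias set from the request body, a safe way to change a single alias is read-modify-write: start from the aliases you want to keep, apply the change, then PUT the complete list. The merge step is plain data manipulation, sketched here with the alias values from this topic.

```python
def upsert_alias(current: list[dict], alias: dict) -> list[dict]:
    """Return a new "values" list with alias added, replacing any
    existing alias of the same name. PUT the full list so the
    aliases you want to keep aren't deleted."""
    kept = [a for a in current if a["name"] != alias["name"]]
    return kept + [alias]

current = [
    {"key": "vid", "name": "makes-alias", "value": "make"},
    {"key": "vid", "name": "models-alias", "value": "model"},
]
payload = {"values": upsert_alias(current, {"key": "vid", "name": "color-alias", "value": "color"})}
print([a["name"] for a in payload["values"]])  # ['makes-alias', 'models-alias', 'color-alias']
```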
Sample request
The following example updates the aliases for the makes-lookup lookup. The request creates a new alias color-alias which targets the color column in the lookup table—in this case, the car-makes table as defined in makes-lookup. The request deletes any other pre-existing aliases for the makes-lookup lookup.
- cURL
- Python
curl --location --request PUT "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/lookups/makes-lookup/aliases" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
"values": [
{
"key": "vid",
"name": "color-alias",
"value": "color"
}
]
}'
import os
import requests
import json

apikey = os.environ["POLARIS_API_KEY"]
url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/lookups/makes-lookup/aliases"
payload = json.dumps({
"values": [
{
"key": "vid",
"name": "color-alias",
"value": "color"
}
]
})
headers = {
'Authorization': f'Basic {apikey}',
'Content-Type': 'application/json'
}
response = requests.request("PUT", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
"values": [
{
"key": "vid",
"name": "color-alias",
"value": "color"
}
]
}
Use a lookup alias
You can use a lookup alias in a query or ingestion job, just as you'd use a lookup. The difference lies in the syntax you use to call it.
Sample request
The following example query uses both the makes-lookup lookup and the color-alias alias to display car color from the car-makes table alongside other details from the car-trips table.
Note the syntax difference when using the lookup and the alias.
SELECT
vid,
TIMESTAMPDIFF(MINUTE, TIME_PARSE(start_time), __time) AS "elapsed_minutes",
len_miles,
avg_mph,
LOOKUP(vid, 'makes-lookup[vid][color]') AS "color",
LOOKUP(vid, 'color-alias') AS "color with alias"
FROM
"car-trips"
WHERE
len_miles > 35 AND avg_mph > 30
The full request using the Query v1 API is as follows:
- cURL
- Python
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/query/sql" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data-raw '{
"query": "SELECT vid, TIMESTAMPDIFF(MINUTE, TIME_PARSE(start_time), __time) AS \"elapsed_minutes\", len_miles, avg_mph, LOOKUP(vid, '\''makes-lookup[vid][color]'\'') AS \"color\", LOOKUP(vid, '\''color-alias'\'') AS \"color with alias\" FROM \"car-trips\" WHERE len_miles > 35 AND avg_mph > 30"
}'
import os
import requests
import json

apikey = os.environ["POLARIS_API_KEY"]
url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/query/sql"
payload = json.dumps({
"query": "SELECT vid, TIMESTAMPDIFF(MINUTE, TIME_PARSE(start_time), __time) AS \"elapsed_minutes\", len_miles, avg_mph, LOOKUP(vid, 'makes-lookup[vid][color]') AS \"color\", LOOKUP(vid, 'color-alias') AS \"color with alias\" FROM \"car-trips\" WHERE len_miles > 35 AND avg_mph > 30"
})
headers = {
'Authorization': f'Basic {apikey}',
'Content-Type': 'application/json'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
[
{
"vid": "48949",
"elapsed_minutes": 41,
"len_miles": 38.24,
"avg_mph": 55.64,
"color": "blue",
"color with alias": "blue"
},
{
"vid": "52356",
"elapsed_minutes": 58,
"len_miles": 35.42,
"avg_mph": 36.03,
"color": "blue",
"color with alias": "blue"
},
{
"vid": "88840",
"elapsed_minutes": 55,
"len_miles": 35.42,
"avg_mph": 38.11,
"color": "blue",
"color with alias": "blue"
},
{
"vid": "47600",
"elapsed_minutes": 48,
"len_miles": 35.75,
"avg_mph": 44.39,
"color": "green",
"color with alias": "green"
}
]
Learn more
See the following topics for more information:
- Lookups v1 API for the Lookups API reference.
- Lookups for creating and managing lookups in the Polaris UI.
- Migrate lookups from Apache Druid for the differences between Druid and Polaris lookups and the process of migrating lookups from Druid to Polaris.
- Query data by API for using the Query API.
- Create an ingestion job by API for using the Jobs v1 API to ingest data into Polaris.