Define table schemas by API
You can set and update table schemas using the Imply Polaris APIs. A table schema specifies the column names and column types for the table. Any updates to a table must maintain the same definition of columns and their types. To update the table schema, the following must be true:
- The table contains no data.
- There are no ingestion jobs pertaining to the table.
After you create a table and define its schema, you can upload data from files or from streaming events. For information on how to load data into a table using the Polaris APIs, see Ingest to table or Load event data.
This topic explains how to define the schema for an existing table in Polaris.
Prerequisites
This topic assumes you have the following:
- An empty table with no schema in Polaris.
- An OAuth access token with the ManageTables role. In the examples below, the token value is stored in the variable named IMPLY_TOKEN. See Authenticate API requests to obtain an access token and assign service account roles. Visit User roles reference for more information on roles and their permissions.
Define a table schema
To define a schema for a table, submit a PUT request to the Tables API. In your request, specify the table ID as a path parameter and pass a table request object as the payload.
An invalid table schema results in a 400 Bad Request status code with a JSON body describing the error.
Some factors that cause a table schema to be invalid include the following:
- Empty column names.
- Leading or trailing spaces in column names.
- A column name prefixed with double underscores.
- A rollup schema that references a nonexistent base column.
- More than 200 columns defined.
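Several of the factors listed above can be caught locally before the request is sent. The following is a minimal sketch, not part of the Polaris API: the function name find_schema_problems and the constant MAX_COLUMNS are hypothetical, and the check covers only the column-name and column-count rules from the list. It does not check rollup schemas, and Polaris remains the authority on whether a schema is valid.

```python
MAX_COLUMNS = 200  # column limit taken from the list above


def find_schema_problems(input_schema):
    """Return a list of human-readable problems; the list is empty if none are found."""
    problems = []
    if len(input_schema) > MAX_COLUMNS:
        problems.append(
            f"{len(input_schema)} columns defined; the limit is {MAX_COLUMNS}"
        )
    for index, column in enumerate(input_schema):
        name = column.get("name", "")
        if not name:
            problems.append(f"column {index}: empty name")
        elif name != name.strip():
            problems.append(f"column {index}: leading or trailing spaces in {name!r}")
        elif name.startswith("__"):
            problems.append(f"column {index}: {name!r} starts with double underscores")
    return problems


# Example input with two deliberately invalid column names.
schema = [
    {"type": "string", "name": "continent"},
    {"type": "string", "name": " country"},
    {"type": "string", "name": "__city"},
]
for problem in find_schema_problems(schema):
    print(problem)
```

Running a check like this first only shortens the feedback loop. A schema that passes it can still be rejected by the Tables API.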
The following snippet shows a table request object containing an input schema:
{
  "name": "Koalas Subset",
  "inputSchema": [
    {
      "type": "string",
      "name": "continent"
    },
    {
      "type": "string",
      "name": "country"
    },
    {
      "type": "string",
      "name": "city"
    }
  ]
}
Sample request
The following example shows how to set the schema for a table in Polaris:
curl --location --request PUT 'https://api.imply.io/v1/tables/8c18fbf7-1081-4004-81be-775ee418c061' \
  --header "Authorization: Bearer $IMPLY_TOKEN" \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "name": "Koalas Subset",
    "inputSchema": [
      {
        "type": "string",
        "name": "continent"
      },
      {
        "type": "string",
        "name": "country"
      },
      {
        "type": "string",
        "name": "city"
      }
    ]
  }'
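# Python equivalent of the cURL request above
import json
import os

import requests

# Assumes the access token is exported as the environment variable IMPLY_TOKEN.
IMPLY_TOKEN = os.environ["IMPLY_TOKEN"]

# The table ID is passed as a path parameter.
url = "https://api.imply.io/v1/tables/8c18fbf7-1081-4004-81be-775ee418c061"

# Table request object containing the input schema.
payload = json.dumps({
    "name": "Koalas Subset",
    "inputSchema": [
        {
            "type": "string",
            "name": "continent"
        },
        {
            "type": "string",
            "name": "country"
        },
        {
            "type": "string",
            "name": "city"
        }
    ]
})
headers = {
    'Authorization': 'Bearer {token}'.format(token=IMPLY_TOKEN),
    'Content-Type': 'application/json'
}

response = requests.request("PUT", url, headers=headers, data=payload)
print(response.text)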
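The Python sample prints the response body whatever the status code is. Because an invalid schema produces a 400 Bad Request with a JSON body, it can help to branch on the status code. The sketch below is one possible way to do that. The helper name summarize_response is hypothetical, and since the structure of the error body is not described here, the helper makes no assumptions about its fields and simply echoes the parsed JSON.

```python
import json


def summarize_response(status_code, body_text):
    """Return a one-line summary of a Tables API response."""
    if status_code == 400:
        try:
            # Re-serialize the JSON error body on one line with stable key order.
            detail = json.dumps(json.loads(body_text), sort_keys=True)
        except json.JSONDecodeError:
            detail = body_text  # fall back to the raw text if it is not JSON
        return f"Invalid table schema (400): {detail}"
    if 200 <= status_code < 300:
        return f"Schema accepted ({status_code})"
    return f"Unexpected status {status_code}: {body_text}"


# Placeholder inputs for illustration; these are not real Polaris responses.
print(summarize_response(200, "{}"))
print(summarize_response(400, "not json"))
```

With the requests sample, the call would be summarize_response(response.status_code, response.text).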
Sample response
The following example shows a successful response for setting the table's input schema:
{
  "name": "Koalas Subset",
  "id": "8c18fbf7-1081-4004-81be-775ee418c061",
  "version": 4,
  "totalDataSize": 0,
  "totalRows": 0,
  "lastUpdateDateTime": "2022-02-17T20:48:09.372677715Z",
  "createdByUsername": "service-account-docs-demo",
  "status": "no_data",
  "timePartitioning": "day",
  "createdBy": "d3c723aa-52f2-4ab0-b23f-7b5c4aaf3ded",
  "lastModifiedBy": "d3c723aa-52f2-4ab0-b23f-7b5c4aaf3ded",
  "lastModifiedByUsername": "service-account-docs-demo",
  "inputSchema": [
    {
      "type": "string",
      "name": "continent"
    },
    {
      "type": "string",
      "name": "country"
    },
    {
      "type": "string",
      "name": "city"
    }
  ],
  "pushEndpointUrl": "/v1/events/8c18fbf7-1081-4004-81be-775ee418c061"
}
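A successful response echoes the table's inputSchema, so a script can confirm that the stored columns match what it submitted. The following is a rough sketch under that assumption. The function name schema_matches is hypothetical, and the response body in the example is abbreviated from the sample response to the fields the check reads.

```python
import json


def schema_matches(requested_columns, response_body):
    """Return True if the response's inputSchema lists the requested columns in order."""

    def as_pairs(columns):
        return [(column.get("name"), column.get("type")) for column in columns]

    returned_columns = json.loads(response_body).get("inputSchema", [])
    return as_pairs(requested_columns) == as_pairs(returned_columns)


requested = [
    {"type": "string", "name": "continent"},
    {"type": "string", "name": "country"},
    {"type": "string", "name": "city"},
]
# Abbreviated copy of the sample response.
sample_response = json.dumps({
    "name": "Koalas Subset",
    "status": "no_data",
    "inputSchema": requested,
})
print(schema_matches(requested, sample_response))
```

Comparing name and type pairs, rather than whole objects, keeps the check tolerant of any extra per-column fields a response might carry.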
Learn more
See the following topics for more information:
- Create a schema for details on table schemas and how to create a schema in the UI.
- Ingest to table for loading batch data into Polaris tables using the API.
- Load event data for loading streaming data into Polaris tables using the API.