Delete data by API
You can use the Imply Polaris API to delete data in a table, whether you want to remove all data or a subset of data that falls within a given time interval.
In the Polaris API, the following job types delete data:
- delete_data: Deletes data within a table.
- drop_table: Deletes a table and all of its data.
For both job types, you can specify whether to soft delete or permanently delete data.
You can provide time intervals of data to delete in delete_data jobs.
This topic explains how to use the Polaris API to delete data.
For information on deleting data using the UI, see Delete data.
Project-less regional API resources have been deprecated and will be removed by the end of September 2024. See Migrate to project-scoped URL for more information.
Delete behavior
By default, requests to delete data using the Polaris API permanently delete the data. Permanently deleting data reduces your data usage, which can help you stay within your project size.
If you want the ability to restore the data within a 30-day grace period, include "softDelete": true in your requests. When you include "softDelete": true in drop_table and delete_data job requests, Polaris soft deletes the data instead of permanently deleting it, which means you can restore the data.
Note that when you replace data, Polaris soft deletes the current versions of the data prior to the replace and generates new segments for that time interval.
When deleting data for a time interval, you have the option to delete all segments in the time interval or a subset of segments using their version IDs.
For information about restoring data, see either Restore data by API or Restore or permanently delete data for the UI.
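The difference between the two behaviors comes down to a single property in the job request body. The following is a minimal sketch rather than part of the Polaris API: build_delete_job is a hypothetical helper name, and it only assembles a request body from the property names used in this topic (type, target, intervals, deleteAll, softDelete).

```python
import json


def build_delete_job(table_name, intervals=None, delete_all=False, soft_delete=False):
    """Assemble the body of a delete_data job request (hypothetical helper)."""
    target = {"type": "table", "tableName": table_name}
    if intervals:
        target["intervals"] = list(intervals)
    body = {"type": "delete_data", "target": target}
    if delete_all:
        body["deleteAll"] = True
    if soft_delete:
        # Only soft deletes carry this property; leaving it out
        # requests the default, permanent deletion.
        body["softDelete"] = True
    return body


permanent = build_delete_job("demo_table", intervals=["2022-07-01/2022-08-01"])
soft = build_delete_job(
    "demo_table", intervals=["2022-07-01/2022-08-01"], soft_delete=True
)

# json.dumps writes Python's True as the JSON literal true.
print(json.dumps(permanent))
print(json.dumps(soft))
```

The two printed bodies differ only in the softDelete property, which is the only change needed to switch between permanent and soft deletion.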
Time intervals for deletion
The time interval identifies the data that already exists in the table based on values in the __time column.
For delete_data jobs, you can provide any number of intervals in ISO 8601 format.
While you can provide any time intervals, if an interval doesn’t align with the granularity of existing segments, Polaris deletes the entirety of any segment that overlaps the interval. Polaris removes the entire segment even if the provided interval doesn’t fully encapsulate the segment.
For example, if you specify a one-hour time interval but your data is stored with day granularity, Polaris deletes the entire day of data.
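To estimate what a request will actually remove, you can widen the interval to the segment boundaries yourself before submitting it. The sketch below assumes day granularity with segments aligned to midnight UTC. The function name expand_to_day_segments is hypothetical, and the code only illustrates the rounding described above; it does not query Polaris for the table's real segment boundaries.

```python
from datetime import datetime, timedelta, timezone


def expand_to_day_segments(start, end):
    """Return the span of whole UTC days that overlap [start, end)."""
    # Round the start down to midnight.
    first = start.replace(hour=0, minute=0, second=0, microsecond=0)
    # Round the end up to the next midnight unless it already sits on one.
    last = end.replace(hour=0, minute=0, second=0, microsecond=0)
    if last < end:
        last += timedelta(days=1)
    return first, last


# A one-hour interval on 2022-07-01 (assumed example values).
start = datetime(2022, 7, 1, 9, 0, tzinfo=timezone.utc)
end = datetime(2022, 7, 1, 10, 0, tzinfo=timezone.utc)

first, last = expand_to_day_segments(start, end)
print(f"{first.isoformat()}/{last.isoformat()}")
# prints 2022-07-01T00:00:00+00:00/2022-07-02T00:00:00+00:00
```

Under these assumptions, the one-hour request covers the whole day of 2022-07-01, which matches the behavior described in the example above.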
Prerequisites
This topic assumes that you have the following:
- A table containing data.
- An API key with the ManageIngestionJobs and ManageTables permissions. In the examples below, the key value is stored in the variable named POLARIS_API_KEY. To obtain an API key and assign permissions, see API key authentication. For more information on permissions, visit Permissions reference.
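Because every example reads the key from POLARIS_API_KEY, it can help to fail early when the variable is missing instead of sending a request with an empty credential. The following sketch uses a hypothetical helper name, polaris_headers, and builds the same two headers that the Python examples in this topic use.

```python
import os


def polaris_headers(env_var="POLARIS_API_KEY"):
    """Build request headers from the API key stored in an environment variable."""
    api_key = os.getenv(env_var)
    if not api_key:
        # Stop before any request is sent with a missing credential.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Basic {api_key}",
        "Content-Type": "application/json",
    }
```

You could pass the result as the headers argument in any of the Python requests shown later in this topic.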
Soft delete data
Data deletion is a job in Polaris in which the job type is delete_data. Including "softDelete": true in the API call soft deletes the data, which is not the default behavior. You can recover soft deleted data if it's within the 30-day grace period and hasn't been permanently deleted.
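As a rough planning aid, you can work out when the grace period for a soft delete runs out. The sketch below assumes the 30 days are counted from the moment the data was soft deleted; this topic doesn't state the exact starting point, so treat the result as an estimate. The names GRACE_PERIOD, restore_deadline, and is_restorable are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the grace period starts at the time of the soft delete.
GRACE_PERIOD = timedelta(days=30)


def restore_deadline(soft_deleted_at):
    """Return the estimated end of the grace period."""
    return soft_deleted_at + GRACE_PERIOD


def is_restorable(soft_deleted_at, now):
    """Return True while the estimated grace period has not yet ended."""
    return now < restore_deadline(soft_deleted_at)


deleted_at = datetime(2023, 8, 10, 0, 14, 21, tzinfo=timezone.utc)
print(restore_deadline(deleted_at).isoformat())
# prints 2023-09-09T00:14:21+00:00
```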
Soft delete data by time interval
To remove a subset of data based on the time interval, provide the time interval in ISO 8601 format in target.intervals.
Note that the provided intervals may be extended based on the granularity of segments associated with the table. For more information, see Time intervals for deletion.
Sample request
The following example shows a delete_data job to soft delete data within a given time interval. See the Jobs v1 API documentation for more information.
cURL
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
  --header "Authorization: Basic $POLARIS_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "type": "delete_data",
    "softDelete": true,
    "target": {
      "type": "table",
      "tableName": "demo_table",
      "intervals": ["2022-07-01/2022-08-01"]
    }
  }'

Python
import os
import requests
import json

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
# Read the API key from the POLARIS_API_KEY environment variable.
apikey = os.getenv("POLARIS_API_KEY")

# Job request body; json.dumps serializes True as the JSON literal true.
payload = json.dumps({
    "type": "delete_data",
    "softDelete": True,
    "target": {
        "type": "table",
        "tableName": "demo_table",
        "intervals": [
            "2022-07-01/2022-08-01"
        ]
    }
})
headers = {
    'Authorization': f'Basic {apikey}',
    'Content-Type': 'application/json'
}

# Submit the job and print the raw response body.
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
  "deleteAll": false,
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdTimestamp": "2023-08-10T00:14:21.188501674Z",
  "desiredExecutionStatus": "running",
  "executionStatus": "pending",
  "health": {
    "status": "ok"
  },
  "id": "0189dccb-5804-7bf9-bbbb-6ac085b71b44",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdatedTimestamp": "2023-08-10T00:14:21.188501674Z",
  "spec": {
    "target": {
      "tableName": "demo_table",
      "type": "table",
      "intervals": [
        "2022-07-01/2022-08-01"
      ]
    },
    "deleteAll": false,
    "type": "delete_data",
    "desiredExecutionStatus": "running"
  },
  "target": {
    "tableName": "demo_table",
    "type": "table",
    "intervals": []
  },
  "type": "delete_data",
  "completedTimestamp": null,
  "startedTimestamp": null
}
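When you script deletions, you usually need only a few fields from this response, such as the job ID and its execution status. The sketch below is one way to pull them out. The helper name summarize_job is hypothetical, and the sample string is a trimmed copy of the response above. In that response the requested intervals appear under spec.target while the top-level target.intervals is empty, so the sketch reads them from spec.target.

```python
import json


def summarize_job(response_text):
    """Extract a few fields from a job response body (hypothetical helper)."""
    job = json.loads(response_text)
    return {
        "id": job["id"],
        "type": job["type"],
        "executionStatus": job["executionStatus"],
        # The requested intervals are echoed back under spec.target.
        "intervals": job["spec"]["target"].get("intervals", []),
    }


# Trimmed copy of the sample response shown above.
sample = """
{
  "executionStatus": "pending",
  "id": "0189dccb-5804-7bf9-bbbb-6ac085b71b44",
  "spec": {
    "target": {
      "tableName": "demo_table",
      "type": "table",
      "intervals": ["2022-07-01/2022-08-01"]
    }
  },
  "type": "delete_data"
}
"""

summary = summarize_job(sample)
print(summary["id"], summary["executionStatus"])
```

In a real script you would pass response.text from the request instead of the sample string.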
Soft delete all data
To remove all data within a table, submit a delete_data type job and set the deleteAll property to true in addition to "softDelete": true.
Sample request
The following example shows a delete_data job to soft delete all the data in a table.
See the Jobs v1 API documentation for more information.
cURL
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
  --header "Authorization: Basic $POLARIS_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "type": "delete_data",
    "softDelete": true,
    "target": {
      "type": "table",
      "tableName": "demo_table"
    },
    "deleteAll": true
  }'

Python
import os
import requests
import json

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
# Read the API key from the POLARIS_API_KEY environment variable.
apikey = os.getenv("POLARIS_API_KEY")

# Job request body; deleteAll targets every segment in the table.
payload = json.dumps({
    "type": "delete_data",
    "softDelete": True,
    "target": {
        "type": "table",
        "tableName": "demo_table"
    },
    "deleteAll": True
})
headers = {
    'Authorization': f'Basic {apikey}',
    'Content-Type': 'application/json'
}

# Submit the job and print the raw response body.
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
  "deleteAll": true,
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdTimestamp": "2023-08-10T00:23:40.518838888Z",
  "desiredExecutionStatus": "running",
  "executionStatus": "pending",
  "health": {
    "status": "ok"
  },
  "id": "0189dcd3-e0e6-7d8c-9f1d-0a338a4b443b",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdatedTimestamp": "2023-08-10T00:23:40.518838888Z",
  "spec": {
    "target": {
      "tableName": "demo_table",
      "type": "table",
      "intervals": []
    },
    "deleteAll": true,
    "type": "delete_data",
    "desiredExecutionStatus": "running"
  },
  "target": {
    "tableName": "demo_table",
    "type": "table",
    "intervals": []
  },
  "type": "delete_data",
  "completedTimestamp": null,
  "startedTimestamp": null
}
Drop a table and soft delete its data
To delete a table and soft delete all of its data, create a job with its type set to drop_table and include "softDelete": true.
Sample request
The following example shows a drop_table job to soft delete a table and its data.
See the Jobs v1 API documentation for more information.
cURL
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
  --header "Authorization: Basic $POLARIS_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "type": "drop_table",
    "softDelete": true,
    "target": {
      "type": "table",
      "tableName": "Koalas"
    }
  }'

Python
import os
import requests
import json

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
# Read the API key from the POLARIS_API_KEY environment variable.
apikey = os.getenv("POLARIS_API_KEY")

# Job request body; drop_table removes the table along with its data.
payload = json.dumps({
    "type": "drop_table",
    "softDelete": True,
    "target": {
        "type": "table",
        "tableName": "Koalas"
    }
})
headers = {
    'Authorization': f'Basic {apikey}',
    'Content-Type': 'application/json'
}

# Submit the job and print the raw response body.
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdTimestamp": "2023-08-10T00:25:22.094458188Z",
  "desiredExecutionStatus": "running",
  "executionStatus": "pending",
  "health": {
    "status": "ok"
  },
  "id": "0189dcd5-6dae-72c1-a04c-8caceba85a32",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdatedTimestamp": "2023-08-10T00:25:22.094458188Z",
  "spec": {
    "target": {
      "tableName": "Koalas",
      "type": "table",
      "intervals": []
    },
    "type": "drop_table",
    "desiredExecutionStatus": "running"
  },
  "target": {
    "tableName": "Koalas",
    "type": "table",
    "intervals": []
  },
  "type": "drop_table",
  "completedTimestamp": null,
  "startedTimestamp": null
}
Permanently delete data
The API calls for a soft delete and a permanent deletion are identical, except that a permanent deletion omits "softDelete": true. Permanent deletion of data is the default behavior when you delete data through the API.
Permanently delete data by time interval
To remove a subset of data based on the time interval, provide the time intervals in ISO 8601 format in target.intervals.
Note that the provided intervals may be extended based on the granularity of segments associated with the table. For more information, see Time intervals for deletion.
If you don't specify versions of segments to delete, Polaris deletes all versions of associated segments within the time interval.
Sample request
The following example shows a delete_data job to permanently delete data within a given time interval.
See the Jobs v1 API documentation for more information.
cURL
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
  --user ${POLARIS_API_KEY}: \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "delete_data",
    "target": {
      "type": "table",
      "tableName": "demo_table",
      "intervals": ["2022-07-01/2022-08-01"]
    }
  }'

Python
import os
import requests
import json

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
# Read the API key from the POLARIS_API_KEY environment variable.
apikey = os.getenv("POLARIS_API_KEY")

# Job request body; omitting softDelete makes the deletion permanent.
payload = json.dumps({
    "type": "delete_data",
    "target": {
        "type": "table",
        "tableName": "demo_table",
        "intervals": [
            "2022-07-01/2022-08-01"
        ]
    }
})
headers = {
    'Authorization': f'Basic {apikey}',
    'Content-Type': 'application/json'
}

# Submit the job and print the raw response body.
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
  "deleteAll": false,
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdTimestamp": "2023-08-10T00:14:21.188501674Z",
  "desiredExecutionStatus": "running",
  "executionStatus": "pending",
  "health": {
    "status": "ok"
  },
  "id": "0189dccb-5804-7bf9-bbbb-6ac085b71b44",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdatedTimestamp": "2023-08-10T00:14:21.188501674Z",
  "spec": {
    "target": {
      "tableName": "demo_table",
      "type": "table",
      "intervals": [
        "2022-07-01/2022-08-01"
      ]
    },
    "deleteAll": false,
    "type": "delete_data",
    "desiredExecutionStatus": "running"
  },
  "target": {
    "tableName": "demo_table",
    "type": "table",
    "intervals": []
  },
  "type": "delete_data",
  "completedTimestamp": null,
  "startedTimestamp": null
}
Permanently delete data by time interval and version
If you don't want to delete all segments for a given time interval, supply versions in the request body and list the specific segment versions you want to delete.
To get the version IDs of soft deleted segments for a table, send a GET request as follows:
GET /v1/projects/{projectId}/tables/{tableName}/unusedSegments
For example, the data in the interval 2022-07-01/2022-08-01 may have previous iterations of data in three soft deleted segments:
- 2024-01-01T22:01:31.100Z
- 2024-02-01T23:01:31.100Z
- 2024-03-01T00:01:31.100Z
You may want to delete only the segments with versions 2024-01-01T22:01:31.100Z and 2024-02-01T23:01:31.100Z, while keeping the most recent segment so that you can recover data if needed.
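If you already have the list of version IDs, choosing which ones to delete can be scripted. The sketch below keeps the newest version and returns the rest. It assumes the version IDs are UTC timestamps in the same ISO 8601 format as the ones above, so sorting them as strings also sorts them by time. The helper name versions_to_delete is hypothetical.

```python
def versions_to_delete(versions, keep_latest=1):
    """Return all but the newest `keep_latest` versions, oldest first."""
    # Timestamps in one fixed ISO 8601 UTC format sort chronologically as strings.
    ordered = sorted(versions)
    if keep_latest <= 0:
        return ordered
    return ordered[:-keep_latest]


versions = [
    "2024-03-01T00:01:31.100Z",
    "2024-01-01T22:01:31.100Z",
    "2024-02-01T23:01:31.100Z",
]
stale = versions_to_delete(versions)

# Request body for a delete_data job limited to the older versions.
payload = {
    "type": "delete_data",
    "target": {
        "type": "table",
        "tableName": "demo_table",
        "intervals": ["2022-07-01/2022-08-01"],
        "versions": stale,
    },
}
print(stale)
# prints ['2024-01-01T22:01:31.100Z', '2024-02-01T23:01:31.100Z']
```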
Sample request
The following example shows a delete_data job to permanently delete specific iterations of data within a given time interval.
See the Jobs v1 API documentation for more information.
cURL
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
  --user ${POLARIS_API_KEY}: \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "delete_data",
    "target": {
      "type": "table",
      "tableName": "demo_table",
      "intervals": ["2022-07-01/2022-08-01"],
      "versions": ["2024-01-01T22:01:31.100Z", "2024-02-01T23:01:31.100Z"]
    }
  }'

Python
import os
import requests
import json

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
# Read the API key from the POLARIS_API_KEY environment variable.
apikey = os.getenv("POLARIS_API_KEY")

# Job request body; versions limits the deletion to the listed segment versions.
payload = json.dumps({
    "type": "delete_data",
    "target": {
        "type": "table",
        "tableName": "demo_table",
        "intervals": [
            "2022-07-01/2022-08-01"
        ],
        "versions": [
            "2024-01-01T22:01:31.100Z",
            "2024-02-01T23:01:31.100Z"
        ]
    }
})
headers = {
    'Authorization': f'Basic {apikey}',
    'Content-Type': 'application/json'
}

# Submit the job and print the raw response body.
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
  "deleteAll": false,
  "versions": [],
  "createdBy": {
    "username": "api-key-pok_7udiv...xrujvd",
    "userId": "b6340b70-3f30-4ccd-86a0-fe74ebfc7cbe"
  },
  "createdTimestamp": "2024-04-30T18:38:57.918651Z",
  "desiredExecutionStatus": "running",
  "executionStatus": "pending",
  "health": {
    "status": "ok"
  },
  "id": "018f304d-857e-7570-aab0-ba953d81df64",
  "lastModifiedBy": {
    "username": "api-key-pok_7udiv...xrujvd",
    "userId": "b6340b70-3f30-4ccd-86a0-fe74ebfc7cbe"
  },
  "lastUpdatedTimestamp": "2024-04-30T18:38:57.918651Z",
  "spec": {
    "target": {
      "tableName": "demo_table",
      "type": "table",
      "intervals": [
        "2022-07-01/2022-08-01"
      ]
    },
    "deleteAll": false,
    "softDelete": false,
    "versions": [],
    "type": "delete_data",
    "desiredExecutionStatus": "running"
  },
  "target": {
    "tableName": "recover_data2",
    "type": "table",
    "intervals": []
  },
  "type": "delete_data",
  "completedTimestamp": null,
  "startedTimestamp": null
}
Permanently delete all data
To permanently remove all data within a table, submit a delete_data type job and set the deleteAll property to true.
Sample request
The following example shows a delete_data job to permanently delete all the data in a table.
See the Jobs v1 API documentation for more information.
cURL
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
  --user ${POLARIS_API_KEY}: \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "delete_data",
    "target": {
      "type": "table",
      "tableName": "demo_table"
    },
    "deleteAll": true
  }'

Python
import os
import requests
import json

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
# Read the API key from the POLARIS_API_KEY environment variable.
apikey = os.getenv("POLARIS_API_KEY")

# Job request body; omitting softDelete makes the deletion permanent.
payload = json.dumps({
    "type": "delete_data",
    "target": {
        "type": "table",
        "tableName": "demo_table"
    },
    "deleteAll": True
})
headers = {
    'Authorization': f'Basic {apikey}',
    'Content-Type': 'application/json'
}

# Submit the job and print the raw response body.
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
  "deleteAll": true,
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdTimestamp": "2023-08-10T00:23:40.518838888Z",
  "desiredExecutionStatus": "running",
  "executionStatus": "pending",
  "health": {
    "status": "ok"
  },
  "id": "0189dcd3-e0e6-7d8c-9f1d-0a338a4b443b",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdatedTimestamp": "2023-08-10T00:23:40.518838888Z",
  "spec": {
    "target": {
      "tableName": "demo_table",
      "type": "table",
      "intervals": []
    },
    "deleteAll": true,
    "type": "delete_data",
    "desiredExecutionStatus": "running"
  },
  "target": {
    "tableName": "demo_table",
    "type": "table",
    "intervals": []
  },
  "type": "delete_data",
  "completedTimestamp": null,
  "startedTimestamp": null
}
Drop a table and permanently delete its data
To permanently delete a table and all of its data, create a job with its type set to drop_table and omit softDelete.
Sample request
The following example shows a drop_table job to permanently delete a table and its data.
See the Jobs v1 API documentation for more information.
cURL
curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
  --user ${POLARIS_API_KEY}: \
  --header 'Content-Type: application/json' \
  --data '{
    "type": "drop_table",
    "target": {
      "type": "table",
      "tableName": "Koalas"
    }
  }'

Python
import os
import requests
import json

url = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs"
# Read the API key from the POLARIS_API_KEY environment variable.
apikey = os.getenv("POLARIS_API_KEY")

# Job request body; omitting softDelete makes the deletion permanent.
payload = json.dumps({
    "type": "drop_table",
    "target": {
        "type": "table",
        "tableName": "Koalas"
    }
})
headers = {
    'Authorization': f'Basic {apikey}',
    'Content-Type': 'application/json'
}

# Submit the job and print the raw response body.
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)
Sample response
The following example shows a successful response:
{
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdTimestamp": "2023-08-10T00:25:22.094458188Z",
  "desiredExecutionStatus": "running",
  "executionStatus": "pending",
  "health": {
    "status": "ok"
  },
  "id": "0189dcd5-6dae-72c1-a04c-8caceba85a32",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdatedTimestamp": "2023-08-10T00:25:22.094458188Z",
  "spec": {
    "target": {
      "tableName": "Koalas",
      "type": "table",
      "intervals": []
    },
    "type": "drop_table",
    "desiredExecutionStatus": "running"
  },
  "target": {
    "tableName": "Koalas",
    "type": "table",
    "intervals": []
  },
  "type": "drop_table",
  "completedTimestamp": null,
  "startedTimestamp": null
}
Learn more
See the following topics for more information:
- Recover data by API for recovering soft deleted data.
- Delete data for deleting data using the UI.
- Set a storage policy by API for automatically deleting data using a retention policy.