Restore data by API

You can use the Imply Polaris API to restore data that was soft deleted from a table, as long as the data is still within the 30-day grace period.

Imply reserves the right to delete data before the end of the 30-day grace period, if needed.

For information on recovering data using the UI, see Manage deleted data.

Project-less regional API resources have been deprecated and will be removed by the end of September 2024. See Migrate to project-scoped URL for more information.

Prerequisites

This topic assumes that you have the following:

  • A table that has data soft deleted within the past 30 days.
  • An API key with the ManageIngestionJobs and ManageTables permissions. In the examples below, the key value is stored in the variable named POLARIS_API_KEY. To obtain an API key and assign permissions, see API key authentication. For more information on permissions, visit Permissions reference.

Restore data

Data restoration is a job in Polaris in which the job type is restore_data. The job request takes a root-level property, interval, whose value is the ISO 8601 time interval of the data to recover.
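As a minimal sketch, you could check an interval string locally before you submit a job. The helper names parse_instant and check_interval are hypothetical and not part of the Polaris API. The sketch assumes the START/END form used in this topic; ISO 8601 also allows duration-based intervals, which this check rejects, and this topic doesn't state whether Polaris accepts them.

```python
from datetime import datetime, timezone


def parse_instant(text):
    """Parse one end of an interval; values without an offset are assumed to be UTC."""
    # Older Python versions do not accept a trailing "Z", so normalize it first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def check_interval(interval):
    """Return the interval unchanged if it is a well-formed START/END pair."""
    start_text, separator, end_text = interval.partition("/")
    if not separator:
        raise ValueError(f"expected START/END, got {interval!r}")
    if parse_instant(start_text) >= parse_instant(end_text):
        raise ValueError(f"start must be before end in {interval!r}")
    return interval


print(check_interval("2022-07-01/2022-08-01"))
```

A malformed or reversed interval raises ValueError, so a typo surfaces before the request is sent.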

While you can delete multiple intervals by providing an array of them, you can only restore a single interval at a time. For data that spans multiple time intervals, create a separate job for each one.
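Because each restore job takes a single interval, restoring several intervals means preparing one job request per interval. The following sketch builds those request bodies without sending them. The function name build_restore_jobs and the second interval are illustrative assumptions; the payload shape follows the sample request in this topic.

```python
import json


def build_restore_jobs(table_name, intervals):
    """Return one restore_data job body per interval, in the order given."""
    return [
        {
            "type": "restore_data",
            "target": {"type": "table", "tableName": table_name},
            "interval": interval,
        }
        for interval in intervals
    ]


jobs = build_restore_jobs(
    "demo_table",
    ["2022-07-01/2022-08-01", "2022-09-01/2022-10-01"],
)
for job in jobs:
    # Each body would be sent as its own POST request to the jobs endpoint.
    print(json.dumps(job))
```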

To view data that can be restored (soft deleted data), use the Tables API to send a GET request to the /unusedSegments endpoint. Polaris returns the soft deleted data as a list of segments that are not in use. For example, to view soft deleted data in a table called demo_table, make the following request:

GET /v1/projects/{projectId}/tables/demo_table/unusedSegments
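For illustration, the same request could be prepared in Python with the standard library. This is a sketch, not an official client: the function name unused_segments_request is hypothetical, the base URL follows the placeholder pattern used in the curl samples in this topic, and the authentication header mirrors curl's --user option with the API key as the username and an empty password. The request is built but not sent, and the response shape is not assumed.

```python
import base64
import urllib.request
from urllib.parse import quote


def unused_segments_request(base_url, project_id, table_name, api_key):
    """Build (but do not send) a GET request for a table's unused segments."""
    url = (
        f"{base_url}/v1/projects/{quote(project_id, safe='')}"
        f"/tables/{quote(table_name, safe='')}/unusedSegments"
    )
    # Basic auth: API key as the username, empty password.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        url, method="GET", headers={"Authorization": f"Basic {token}"}
    )


request = unused_segments_request(
    "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io",
    "PROJECT_ID",
    "demo_table",
    "example-key",  # placeholder; read the real key from POLARIS_API_KEY
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen.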

There's an optional versions field that specifies the iteration of data you want to restore. Provide a single version string; the versioned example later in this topic passes it as a one-element array. The same interval can hold multiple iterations of the same data, for example after you replace data.

Sample request

The following example shows a restore_data job that recovers data for the time interval 2022-07-01/2022-08-01. It recovers the most recent iteration of the data in that interval since there's no version specified. See the Jobs v1 API documentation for more information.

curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
--user ${POLARIS_API_KEY}: \
--header 'Content-Type: application/json' \
--data '{
  "type": "restore_data",
  "target": {
    "type": "table",
    "tableName": "demo_table"
  },
  "interval": "2022-07-01/2022-08-01"
}'
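A rough Python equivalent of the curl request, using only the standard library, might look like the following. The function name restore_job_request is hypothetical, and the sketch only builds the request so that you can inspect it before sending; the URL, headers, and body mirror the curl sample.

```python
import base64
import json
import urllib.request


def restore_job_request(base_url, project_id, api_key, table_name, interval):
    """Build (but do not send) a POST request that creates a restore_data job."""
    body = json.dumps(
        {
            "type": "restore_data",
            "target": {"type": "table", "tableName": table_name},
            "interval": interval,
        }
    ).encode("utf-8")
    # Basic auth: API key as the username, empty password, as in the curl sample.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/v1/projects/{project_id}/jobs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


restore_request = restore_job_request(
    "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io",
    "PROJECT_ID",
    "example-key",  # placeholder; read the real key from POLARIS_API_KEY
    "demo_table",
    "2022-07-01/2022-08-01",
)
print(restore_request.get_method(), restore_request.full_url)
```

Passing the request to urllib.request.urlopen submits the job.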

Sample response

The following example shows a successful response:

{
  "deleteAll": false,
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdTimestamp": "2023-08-10T00:14:21.188501674Z",
  "desiredExecutionStatus": "running",
  "executionStatus": "pending",
  "health": {
    "status": "ok"
  },
  "id": "0189dccb-5804-7bf9-bbbb-6ac085b71b44",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdatedTimestamp": "2023-08-10T00:14:21.188501674Z",
  "spec": {
    "target": {
      "tableName": "demo_table",
      "type": "table",
      "intervals": [
        "2022-07-01/2022-08-01"
      ]
    },
    "deleteAll": false,
    "type": "restore_data",
    "desiredExecutionStatus": "running"
  },
  "target": {
    "tableName": "demo_table",
    "type": "table",
    "intervals": []
  },
  "type": "restore_data",
  "completedTimestamp": null,
  "startedTimestamp": null
}
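When scripting against the API, you typically need only a few fields from this response, such as the job ID and its execution status. The sketch below pulls them out of a response body. The function name summarize_job is hypothetical, and the field names are taken from the sample response rather than from a full schema.

```python
import json


def summarize_job(response_text):
    """Extract the fields most useful for tracking a job from a response body."""
    job = json.loads(response_text)
    return {
        "id": job["id"],
        "type": job["type"],
        "executionStatus": job["executionStatus"],
        # health is read defensively in case a response omits it
        "health": job.get("health", {}).get("status"),
    }


# Trimmed copy of the sample response, used only to exercise the helper.
sample = """{
  "executionStatus": "pending",
  "health": {"status": "ok"},
  "id": "0189dccb-5804-7bf9-bbbb-6ac085b71b44",
  "type": "restore_data"
}"""
summary = summarize_job(sample)
print(summary["id"], summary["executionStatus"])
```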

Restore an older version of data that's in use

For general information about data partitioning and versions, see Segment generation.

If you want to restore a version of data that has a higher version in use, you need to first soft delete the newer overlapping data version that's in use.

Then, you can restore the earlier version as usual.

A common scenario that can require this process is when you want to restore segments that have been replaced. Polaris soft deletes older versions of segments when you replace data. Although you can restore the lower version without any other step, Polaris automatically soft deletes it again because a higher version is in use, which is why you need to soft delete the higher version segment first.

Consider the following example, in which you ingest data from 2022. You perform the ingestion on January 1, 2024, replace the data on February 1, 2024, and replace it a second time on March 1, 2024.

Your table has the following segments:

  • v0: (2024-01-01T22:01:31.100Z), which was soft deleted after you used a replace job to load v1
  • v1: (2024-02-01T23:01:31.100Z), which was soft deleted after you used a replace job to load v2
  • v2: (2024-03-01T00:01:31.100Z), which is the version that's in use

If you try to restore the v1 segment at this point, Polaris automatically soft deletes it again as soon as it's restored. This occurs because the v2 segment that's active is more recent. To avoid that, soft delete the v2 segment first. Then, restore the v1 segment.
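The ordering rule in this example can be sketched as a small check that lists which in-use versions you would need to soft delete before restoring a target version. The function names are hypothetical, and the sketch assumes that version strings are ISO 8601 timestamps that order chronologically, as in the example above; it doesn't call the Polaris API.

```python
from datetime import datetime


def parse_version(version):
    """Parse a version timestamp such as 2024-02-01T23:01:31.100Z."""
    # Older Python versions do not accept a trailing "Z", so normalize it first.
    if version.endswith("Z"):
        version = version[:-1] + "+00:00"
    return datetime.fromisoformat(version)


def versions_to_soft_delete(in_use_versions, target_version):
    """Return the in-use versions that are newer than the target, oldest first."""
    target = parse_version(target_version)
    newer = [v for v in in_use_versions if parse_version(v) > target]
    return sorted(newer, key=parse_version)


V1 = "2024-02-01T23:01:31.100Z"
V2 = "2024-03-01T00:01:31.100Z"

# v2 is in use and newer than v1, so it blocks the restore of v1.
print(versions_to_soft_delete([V2], V1))
```

An empty result means no newer version is in use, so the restore can proceed as usual.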

The following example restores v1, the 2024-02-01T23:01:31.100Z version. Remember to soft delete the existing data for that interval if it's more recent:

curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/jobs" \
--user ${POLARIS_API_KEY}: \
--header 'Content-Type: application/json' \
--data '{
  "type": "restore_data",
  "target": {
    "type": "table",
    "tableName": "demo_table"
  },
  "interval": "2022-01-01T00:00:00Z/2023-01-01T00:00:00Z",
  "versions": ["2024-02-01T23:01:31.100Z"]
}'

Learn more

See the following topics for more information: