Restore or permanently delete data
After you soft delete data in Imply Polaris, or after data expires under your retention period, you have a grace period of up to 30 days in which you can do one of the following:
- Do nothing and let Polaris permanently delete the data after the grace period.
- Proactively delete the data permanently before the grace period ends. Permanently deleting data reduces your data usage, helping you stay within your project size.
- Restore the data if it is within the grace period. Note that there is some lag between recovering data and it becoming queryable again.
Imply reserves the right to delete data before the end of the 30-day grace period if needed.
This topic shows how to access deleted data in order to restore or permanently delete it.
Prerequisites
To manage deleted data, you need the ManageTables and ManageIngestionJobs permissions. These permissions are assigned to the Organization Admin, Project Admin, and Data Manager groups by default.
For information on permissions, see Permissions reference.
Restorable data
To be eligible for recovery, data must have been soft deleted within the last 30 days, or must have exceeded its retention period within the last 30 days.
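The eligibility rule above comes down to simple date arithmetic. The following is a minimal sketch, not a Polaris API: the names `GRACE_PERIOD`, `grace_period_end`, and `is_restorable` are hypothetical, and treating the 30-day boundary as exclusive is an assumption. Because Imply may delete data before the grace period ends, treat the result as an upper bound rather than a guarantee.

```python
from datetime import datetime, timedelta, timezone

# Upper bound on the grace period described in this topic.
GRACE_PERIOD = timedelta(days=30)


def grace_period_end(removed_at: datetime) -> datetime:
    """Latest time at which data removed at `removed_at` could still be restorable."""
    return removed_at + GRACE_PERIOD


def is_restorable(removed_at: datetime, now: datetime) -> bool:
    """True if `now` falls inside the grace period that started at `removed_at`.

    `removed_at` is the soft-delete time or the time the data exceeded
    its retention period. The exclusive upper boundary is an assumption.
    """
    return removed_at <= now < grace_period_end(removed_at)


removed = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_restorable(removed, datetime(2024, 1, 15, tzinfo=timezone.utc)))  # True
print(is_restorable(removed, datetime(2024, 2, 15, tzinfo=timezone.utc)))  # False
```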
Segment overlap
You can't restore a segment whose interval overlaps with any interval already present in the table.
To restore the deleted interval, first delete the data in the table that overlaps that time interval.
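To reason about which table intervals would block a restore, you can model the overlap check locally. This is an illustrative sketch only: `Interval`, `overlaps`, and `blocking_intervals` are hypothetical names, and the sketch assumes intervals are half-open (start inclusive, end exclusive), so two intervals that merely touch at a boundary do not count as overlapping.

```python
from datetime import datetime
from typing import List, Tuple

# (start, end) with an exclusive end; the half-open convention is an assumption.
Interval = Tuple[datetime, datetime]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share any instant."""
    return a[0] < b[1] and b[0] < a[1]


def blocking_intervals(deleted: Interval, existing: List[Interval]) -> List[Interval]:
    """Intervals already in the table that overlap the deleted interval."""
    return [iv for iv in existing if overlaps(deleted, iv)]


deleted = (datetime(2024, 1, 1), datetime(2024, 1, 2))
existing = [
    (datetime(2023, 12, 31), datetime(2024, 1, 1)),   # adjacent, no overlap
    (datetime(2024, 1, 1, 12), datetime(2024, 1, 3)),  # overlaps the deleted day
]
print(len(blocking_intervals(deleted, existing)))  # 1
```

Any interval this check returns would need to be deleted from the table before the restore can proceed.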
Schema incompatibility
Schema incompatibility occurs when there are changes in the declared schema, leading to inconsistencies between the declared schema of the restored data and the declared schema on the table.
When the schema of the deleted data isn't compatible with the schema declared on the table, Polaris still recovers the data. However, incompatible values of the recovered data become NULL at query time.
You can restore the nulled values by undeclaring the column that contains the incompatible values. This removes the type constraint, because an undeclared column uses a data type that works for all data in the column. You can later declare the column again using a compatible data type.
Restore data
To recover data deleted from a table, do the following:
Navigate to the table you want to recover data for.
Select Manage > Manage deleted data.
On the Manage deleted data page, identify the segments for the data you want to recover. For more guidance on this step, see Identify data to recover.
Select the data segments you want to recover. Then, select Recover.
After you select Recover, a confirmation window appears.
Recovering by version: If you select multiple versions of a given interval to recover, Polaris only uses the latest one.
- Verify that the intervals and any selected versions are correct, and select Recover.
A notification that confirms the recovery job appears. You can monitor the status from the table's page.
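When you select multiple versions of the same interval, Polaris uses only the latest one. The sketch below models that rule locally so you can predict which selections take effect. It is not Polaris code: `SegmentSelection` and `latest_per_interval` are hypothetical names, and representing an interval as a string and a version as a `datetime` is an assumption made for illustration.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, NamedTuple


class SegmentSelection(NamedTuple):
    interval: str      # hypothetical label, e.g. "2024-01-01/2024-01-02"
    version: datetime  # UTC timestamp for when the segment was created


def latest_per_interval(selected: Iterable[SegmentSelection]) -> Dict[str, SegmentSelection]:
    """Keep only the selection with the newest version for each interval."""
    latest: Dict[str, SegmentSelection] = {}
    for seg in selected:
        current = latest.get(seg.interval)
        if current is None or seg.version > current.version:
            latest[seg.interval] = seg
    return latest


utc = timezone.utc
selected = [
    SegmentSelection("2024-01-01/2024-01-02", datetime(2024, 1, 5, tzinfo=utc)),
    SegmentSelection("2024-01-01/2024-01-02", datetime(2024, 1, 9, tzinfo=utc)),
    SegmentSelection("2024-01-02/2024-01-03", datetime(2024, 1, 5, tzinfo=utc)),
]
effective = latest_per_interval(selected)
print(effective["2024-01-01/2024-01-02"].version.day)  # 9
```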
Permanently delete data
To permanently delete data, follow the steps for restoring data, but select Permanently delete instead of Recover.
Identify data to recover
The underlying data in Polaris is organized into segments. Segments are files that store data over units of time. A segment's granularity is defined by the ingestion job's time partitioning, which defaults to the time partitioning set on the table. A given time interval may have zero, one, or more segments, depending on how much data has been ingested for that time interval.
To recover or permanently delete data, you need to identify the interval or intervals that contain the relevant data. Use the following characteristics to identify the deleted segments to recover or permanently delete:
- Time range of the data. Segments are designated by the start and end time of the __time column. For example, 2024-01-01, 00:00:00 to 2024-01-02, 00:00:00.
- Date the data was ingested. For segments with the same time range, use the segment creation time for the date the data was ingested. This can help you differentiate segments of a previously deleted table from segments of an existing table with the same name.
- Date the data was deleted. If you know when you deleted your data, such as from reviewing recent jobs, the deletion date can help you distinguish segments from within the same time range. The grace period to recover deleted data starts on this date.
- Version. The UTC timestamp for when the segment was created. Polaris assigns versions as follows:
- When you insert data into a new interval, the version corresponds to the time of the job.
- When you replace data in an existing interval, the version corresponds to the time of the job.
- When you insert data into an existing interval, Polaris may store the data in an existing segment or create a new segment. For existing segments, the version corresponds to the time that data was originally ingested into that segment. This time precedes the insert operation. For new segments, the version corresponds to the time of the job.
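The characteristics above can be combined to narrow down candidate segments. The following sketch assumes you have copied the segment details from the Manage deleted data page into local records. The `DeletedSegment` fields and the `find_candidates` function are hypothetical and do not correspond to a Polaris API.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional


@dataclass(frozen=True)
class DeletedSegment:
    start: datetime       # start of the __time range
    end: datetime         # end of the __time range
    created_at: datetime  # when the data was ingested
    deleted_at: datetime  # when the grace period started
    version: datetime     # UTC version timestamp


def find_candidates(
    segments: List[DeletedSegment],
    start: datetime,
    end: datetime,
    deleted_on: Optional[date] = None,
) -> List[DeletedSegment]:
    """Segments fully inside [start, end], optionally limited to one
    deletion date, sorted with the newest version first."""
    matches = [
        s for s in segments
        if s.start >= start and s.end <= end
        and (deleted_on is None or s.deleted_at.date() == deleted_on)
    ]
    return sorted(matches, key=lambda s: s.version, reverse=True)


day_start, day_end = datetime(2024, 1, 1), datetime(2024, 1, 2)
segments = [
    DeletedSegment(day_start, day_end, datetime(2024, 1, 3), datetime(2024, 2, 1), datetime(2024, 1, 3)),
    DeletedSegment(day_start, day_end, datetime(2024, 1, 8), datetime(2024, 2, 5), datetime(2024, 1, 8)),
    DeletedSegment(datetime(2024, 1, 2), datetime(2024, 1, 3), datetime(2024, 1, 3), datetime(2024, 2, 1), datetime(2024, 1, 3)),
]
print(len(find_candidates(segments, day_start, day_end)))  # 2
print(len(find_candidates(segments, day_start, day_end, deleted_on=date(2024, 2, 1))))  # 1
```

Sorting by version puts the most recently created segment first, which is the one Polaris uses if you select several versions of the same interval.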
Learn more
See the following topics for more information: