Set a storage policy by API

info

Project-less regional API resources have been deprecated and will be removed by the end of September 2024.

You must include the project ID in the URL for all regional API calls in projects created after September 29, 2023. For example: https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID

Projects created before September 29, 2023 can continue to use project-less URLs until the end of September 2024. We strongly recommend updating your regional API calls to include the project ID prior to September 2024. See the API migration guide for more information.

You can set a storage policy on a table to manage the lifecycle and query accessibility of data in the table. A retention policy is a storage policy that determines how long to retain data. A cache policy is a storage policy that determines how long to keep data cached. There are two types of policies you can use when configuring a retention or cache policy:

  • A period-based policy (period), which accepts an ISO 8601 duration.
  • An interval-based policy (intervals), which accepts one or more intervals in addition to an optional ISO 8601 duration.

This topic shows how to assign a storage policy to a table using the Polaris API. For general information on storage policies, see Data lifecycle management.
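
The two policy types can be assembled programmatically before you place them in a table definition. The following Python sketch is illustrative only: build_policy is a hypothetical helper, not part of any Polaris client, and it assumes the type, period, and intervals field names used in the examples in this topic.

```python
from typing import Optional, Sequence


def build_policy(period: Optional[str] = None,
                 intervals: Optional[Sequence[str]] = None) -> dict:
    """Return a period-based or interval-based policy object.

    Hypothetical helper: the field names mirror the JSON examples in this
    topic; the function itself is not a Polaris API.
    """
    if intervals:
        policy = {"type": "intervals", "intervals": list(intervals)}
        if period is not None:
            # An interval-based policy may also carry an optional duration.
            policy["period"] = period
        return policy
    if period is not None:
        return {"type": "period", "period": period}
    raise ValueError("provide a period, one or more intervals, or both")


print(build_policy(period="P3M"))
print(build_policy(intervals=["2023-01-01/2023-11-30"], period="P3M"))
```

You can use the returned dictionary as the value of storagePolicy.retain or storagePolicy.cached.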

Prerequisites

You must have an API key with the ManageTables permission. In the examples below, the key value is stored in the variable named POLARIS_API_KEY. See API key authentication to obtain an API key and assign permissions. Visit Permissions reference for more information on permissions.

Create a table with a storage policy

When you create a table, you can customize how long Polaris retains or caches data in the table. In the table definition, specify your custom retention or cache policy in the storagePolicy property.

Retention policy

A retain-type storage policy, or retention policy, automatically deletes data with timestamps older than the specified time period or outside the specified intervals. By default, Polaris retains all data until you delete it (such as with a delete_data job) or drop the table (such as with a drop_table job).

To create a table with a retention policy, include the storagePolicy.retain property in the request payload. The following example shows a retention policy that retains data for the past three months:

    "storagePolicy": {
      "retain": {
        "type": "period",
        "period": "P3M"
      }
    }

The following example shows a retention policy that retains data for part of 2024 (the interval 2024-01-01/2024-11-30):

    "storagePolicy": {
      "retain": {
        "type": "intervals",
        "intervals": ["2024-01-01/2024-11-30"]
      }
    }

When you use intervals, you must provide at least one interval. To specify multiple intervals, provide a comma-separated list, such as ["2023-01-01/2023-11-30", "2024-01-01/2024-11-30", ...].
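
Before sending a policy, you may want to check the interval list on the client side. The following Python sketch is one possible check, assuming each interval is a start/end pair of ISO 8601 dates separated by a slash, as in the examples above. parse_intervals is a hypothetical helper; Polaris performs its own validation, and its rules may differ.

```python
from datetime import datetime


def parse_intervals(intervals):
    """Parse "start/end" strings into (start, end) datetime pairs.

    Illustrative client-side check only, not the validation Polaris applies.
    """
    if not intervals:
        raise ValueError("an interval-based policy needs at least one interval")
    parsed = []
    for text in intervals:
        start_text, sep, end_text = text.partition("/")
        if not sep:
            raise ValueError(f"missing '/' in interval: {text!r}")
        start = datetime.fromisoformat(start_text)
        end = datetime.fromisoformat(end_text)
        if start >= end:
            raise ValueError(f"start must precede end: {text!r}")
        parsed.append((start, end))
    return parsed


print(parse_intervals(["2023-01-01/2023-11-30", "2024-01-01/2024-11-30"]))
```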

Optionally, you can combine a period with intervals. The following example shows a retention policy that retains data for part of 2023 (the interval 2023-01-01/2023-11-30) and data from the last 3 months:

    "storagePolicy": {
      "retain": {
        "type": "intervals",
        "intervals": ["2023-01-01/2023-11-30"],
        "period": "P3M"
      }
    }

Polaris retains any data that falls within either the intervals or the period.
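
The "either the intervals or the period" rule can be modeled as a simple union. The following Python sketch is a rough model, not Polaris's implementation. It assumes that intervals are start-inclusive and end-exclusive, and it approximates the ISO 8601 period with a timedelta (for example, 90 days for P3M). is_covered is a hypothetical name.

```python
from datetime import datetime, timedelta


def is_covered(ts, now, intervals=(), period=None):
    """Return True if ts falls in any interval OR within the trailing period.

    Assumptions: intervals are (start, end) datetime pairs treated as
    start-inclusive and end-exclusive; period is a timedelta approximation
    of the ISO 8601 duration.
    """
    in_interval = any(start <= ts < end for start, end in intervals)
    in_period = period is not None and ts >= now - period
    return in_interval or in_period


now = datetime(2024, 6, 1)
intervals = [(datetime(2023, 1, 1), datetime(2023, 11, 30))]
period = timedelta(days=90)  # rough stand-in for P3M

print(is_covered(datetime(2023, 5, 1), now, intervals, period))    # inside the interval
print(is_covered(datetime(2024, 5, 1), now, intervals, period))    # inside the period
print(is_covered(datetime(2023, 12, 15), now, intervals, period))  # in neither
```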

Cache policy

info

This is a beta feature available to select customers. Imply must enable the feature for you. Contact your Polaris support representative to find out more.

With a cached-type storage policy, or cache policy, Polaris caches data within the specified time period. Data outside this time period resides only in deep storage and must be queried asynchronously. The default behavior in Polaris is to keep all data cached.

caution

If the time period in your cache policy does not encompass any of the data in the table, no data is cached. You will not be able to query any data in the table if no data is cached. Ensure your cache policy covers at least a portion of data in the table.
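
One way to guard against an empty cache is to compare the policy against the time range of your data before applying it. The following Python sketch assumes you already know the earliest and latest timestamps in the table (for example, from a query on __time) and, as a simplification, approximates the ISO 8601 period with a timedelta. cache_overlaps_data is a hypothetical helper.

```python
from datetime import datetime, timedelta


def cache_overlaps_data(data_start, data_end, now, intervals=(), period=None):
    """Return True if the cache policy overlaps the table's time range at all.

    data_start and data_end are the earliest and latest row timestamps.
    The period is modeled as an open-ended window starting at now - period.
    """
    windows = list(intervals)
    if period is not None:
        windows.append((now - period, datetime.max))
    # Two ranges overlap when each one starts before the other ends.
    return any(start <= data_end and data_start < end for start, end in windows)


now = datetime(2024, 6, 1)
data_start, data_end = datetime(2022, 1, 1), datetime(2022, 12, 31)

# A one-month cache period misses data that is all from 2022.
print(cache_overlaps_data(data_start, data_end, now, period=timedelta(days=30)))

# An interval inside 2022 covers part of the data.
print(cache_overlaps_data(
    data_start, data_end, now,
    intervals=[(datetime(2022, 6, 1), datetime(2022, 7, 1))],
))
```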

To create a table with a cache policy, include the storagePolicy.cached property in the request payload. The following example shows a cache policy to cache data for the last month:

    "storagePolicy": {
      "cached": {
        "type": "period",
        "period": "P1M"
      }
    }

The following example shows a cache policy to cache data for the interval 2024-01-01/2024-11-30:

    "storagePolicy": {
      "cached": {
        "type": "intervals",
        "intervals": ["2024-01-01/2024-11-30"]
      }
    }

You can provide a combination of one or more intervals and an optional period for cache policies. The following example shows a cache policy to cache data from part of 2023 (the interval 2023-01-01/2023-11-30) and the last 3 months:

    "storagePolicy": {
      "cached": {
        "type": "intervals",
        "intervals": ["2023-01-01/2023-11-30"],
        "period": "P3M"
      }
    }

Polaris caches any data that falls within either the intervals or the period.

Retention and cache policy

You can set both retention and cache policies simultaneously when creating a table.

The following example shows a storage policy definition to retain data for the past three months and cache data for the last month:

    "storagePolicy": {
      "retain": {
        "type": "period",
        "period": "P3M"
      },
      "cached": {
        "type": "period",
        "period": "P1M"
      }
    }

info

Cache policies encompass retention behavior. Polaris retains all cached data, regardless of the time range of the retention policy.

Use the /query/sql/statements endpoint to submit an asynchronous query that accesses data outside the cache period but within the retention period. For example, with a P3M retention period and a P1M cache period, you must use an asynchronous query to access data older than one month but within the last three months. Queries that use the /query/sql endpoint access cached data only. For optimal performance, cache the data that you access regularly and query it using /query/sql. To learn more, see Query data in cache and deep storage.
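
The choice of endpoint can be expressed as a small decision function. The following Python sketch covers period-based policies only and approximates each ISO 8601 period with a timedelta. endpoint_for is a hypothetical helper; the endpoint paths are the ones named above.

```python
from datetime import datetime, timedelta

SYNC_ENDPOINT = "/query/sql"
ASYNC_ENDPOINT = "/query/sql/statements"


def endpoint_for(oldest_ts, now, cache_period, retain_period):
    """Pick an endpoint for a query whose oldest row has timestamp oldest_ts.

    Returns None when the data is older than both periods and so is no
    longer retained. The cache check comes first because cached data is
    retained regardless of the retention policy.
    """
    age = now - oldest_ts
    if age <= cache_period:
        return SYNC_ENDPOINT
    if age <= retain_period:
        return ASYNC_ENDPOINT
    return None


now = datetime(2024, 6, 1)
cache = timedelta(days=30)   # rough stand-in for P1M
retain = timedelta(days=90)  # rough stand-in for P3M

print(endpoint_for(datetime(2024, 5, 20), now, cache, retain))
print(endpoint_for(datetime(2024, 4, 1), now, cache, retain))
print(endpoint_for(datetime(2024, 1, 1), now, cache, retain))
```

The function prefers the synchronous endpoint whenever the data is cached, which reflects the performance guidance above rather than a hard requirement.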

Cache policies and retention policies don't need to overlap, so you can create policies to fit your storage and query performance requirements. For example, consider a retention policy that specifies the period P90D and a cache policy that specifies the earlier time interval 2022-01-01/2023-01-01. Since all cached data is retained regardless of your retention policy, the data for the interval is both cached and retained.

With both policies set, Polaris manages the data as follows:

  • Retain but do not cache data for the last 90 days. You can query this data from deep storage using the /query/sql/statements endpoint.
  • Retain and cache all data from the year 2022. You can query this data synchronously using the /query/sql endpoint as well as from deep storage using the /query/sql/statements endpoint.

The policy would look like this:

    "storagePolicy": {
      "retain": {
        "type": "period",
        "period": "P90D"
      },
      "cached": {
        "type": "intervals",
        "intervals": ["2022-01-01/2023-01-01"]
      }
    }

Sample request

Send a POST request to the /v1/projects/PROJECT_ID/tables endpoint to create a new table with a storage policy. For more information on creating tables, see Create a table by API.

curl --location --request POST "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/tables" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data '{
  "name": "Koalas Retention",
  "type": "detail",
  "storagePolicy": {
    "retain": {
      "period": "P3M",
      "type": "period"
    }
  }
}'
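
If you prefer to script the call, the same request can be assembled with the Python standard library. The following sketch mirrors the curl example: the URL placeholders and header values are copied from it, and create_table_request is a hypothetical helper name. The code builds the request without sending it.

```python
import json
import os
import urllib.request

# Placeholder values copied from the curl example; replace them before use.
BASE_URL = "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io"
PROJECT_ID = "PROJECT_ID"


def create_table_request(name, storage_policy, api_key):
    """Build (but do not send) the POST request that creates a detail table."""
    body = {"name": name, "type": "detail", "storagePolicy": storage_policy}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/projects/{PROJECT_ID}/tables",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Basic {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = create_table_request(
    "Koalas Retention",
    {"retain": {"type": "period", "period": "P3M"}},
    os.environ.get("POLARIS_API_KEY", ""),
)
print(request.get_method(), request.full_url)
# To send the request, pass it to urllib.request.urlopen(request).
```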

Sample response

The following example shows a successful response:

{
  "schema": [
    {
      "name": "__time",
      "dataType": "timestamp"
    }
  ],
  "name": "Koalas Retention",
  "type": "detail",
  "version": 0,
  "availability": "available",
  "clusteringColumns": [],
  "compactionConfig": null,
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdByUser": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdOnTimestamp": "2023-08-10T23:43:01.543495184Z",
  "createdTimestamp": "2023-08-10T23:43:01.543495184Z",
  "description": null,
  "id": "0189e1d5-05a6-7015-bb2c-de10182c7f03",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdateTimestamp": "2023-08-10T23:43:01.543499193Z",
  "modifiedByUser": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "modifiedOnTimestamp": "2023-08-10T23:43:01.543499193Z",
  "partitioningGranularity": "day",
  "queryableSchema": [],
  "storagePolicy": {
    "retain": {
      "period": "P3M",
      "type": "period"
    }
  },
  "schemaMode": "flexible",
  "segmentCompactedBytes": 0,
  "segmentTotalBytes": 0,
  "totalDataSizeBytes": 0,
  "totalRows": 0
}

Add or remove a storage policy

You can add a storage policy to, or remove one from, a table that already contains data. If you lengthen the retention period on a table, Polaris does not recover previously deleted data.

To restore the default behavior in Polaris, remove the custom storage policies from the table. By default, Polaris retains all data forever and caches all retained data.

The following storage policy example restores the default retention and cache behavior:

    "storagePolicy": {}

The following storage policy example restores the default cache behavior and keeps the three-month retention policy:

    "storagePolicy": {
      "cached": null,
      "retain": {
        "period": "P3M",
        "type": "period"
      }
    }

The net effect is for Polaris to retain and cache the past three months of data.

When sending a PUT request to update a table, keep in mind the following differences from creating a table:

  • Supply the table name as a path parameter.
  • Include version in the request body.
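
The two differences above can be captured in a pair of small helpers. The following Python sketch is illustrative: update_path and build_update_body are hypothetical names, the body copies only the fields shown in this topic's sample request, and URL-encoding the table name is an assumption for names that contain spaces or other reserved characters.

```python
import json
from urllib.parse import quote


def update_path(project_id, table_name):
    """Return the request path, with the table name as a path parameter."""
    return f"/v1/projects/{project_id}/tables/{quote(table_name, safe='')}"


def build_update_body(current_table, storage_policy):
    """Return a PUT body that swaps in a new storage policy.

    current_table is assumed to be the JSON object returned by an earlier
    create or get call.
    """
    return {
        "name": current_table["name"],
        "type": current_table["type"],
        "version": current_table["version"],  # required when updating
        "storagePolicy": storage_policy,
    }


table = {"name": "Koalas", "type": "aggregate", "version": 0}

# Keep a three-month retention policy and clear the cache policy.
body = build_update_body(
    table, {"cached": None, "retain": {"type": "period", "period": "P3M"}}
)
print(update_path("PROJECT_ID", table["name"]))
print(json.dumps(body, indent=2))  # None is serialized as JSON null
```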

Sample request

Send a PUT request to the /v1/projects/PROJECT_ID/tables/TABLE_NAME endpoint to update a table's storage policy. See the Tables v1 API documentation for more information.

curl --location --request PUT "https://ORGANIZATION_NAME.REGION.CLOUD_PROVIDER.api.imply.io/v1/projects/PROJECT_ID/tables/Koalas" \
--header "Authorization: Basic $POLARIS_API_KEY" \
--header "Content-Type: application/json" \
--data '{
  "name": "Koalas",
  "type": "aggregate",
  "version": 0,
  "storagePolicy": {
    "retain": {
      "period": "P1M",
      "type": "period"
    }
  }
}'

Sample response

The following example shows a successful response:

{
  "schema": [
    {
      "dataType": "timestamp",
      "type": "dimension",
      "name": "__time"
    }
  ],
  "timeResolution": "millisecond",
  "name": "Koalas",
  "type": "aggregate",
  "version": 1,
  "availability": "available",
  "clusteringColumns": [],
  "compactionConfig": null,
  "createdBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdByUser": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "createdOnTimestamp": "2023-08-10T23:27:51.549893Z",
  "createdTimestamp": "2023-08-10T23:27:51.549893Z",
  "description": null,
  "id": "0189e1c7-22fd-7ea3-addd-a1f06705afa0",
  "lastModifiedBy": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "lastUpdateTimestamp": "2023-08-10T23:50:57.554676548Z",
  "modifiedByUser": {
    "username": "api-key-pok_vipgj...bjjvyo",
    "userId": "a52cacf6-3ddc-48e5-8675-xxxxxxxxxxxx"
  },
  "modifiedOnTimestamp": "2023-08-10T23:50:57.554676548Z",
  "partitioningGranularity": "day",
  "queryableSchema": [],
  "storagePolicy": {
    "retain": {
      "period": "P1M",
      "type": "period"
    }
  },
  "schemaMode": "flexible",
  "segmentCompactedBytes": 0,
  "segmentTotalBytes": 0,
  "totalDataSizeBytes": 0,
  "totalRows": 0
}

Learn more

See the following topics for more information: