Imply Enterprise and Hybrid release notes

Imply releases include Imply Manager, Pivot, Clarity, and Imply's distribution of Apache Druid®. Imply delivers improvements more quickly than open source because Imply's distribution of Apache Druid uses the primary branch of Apache Druid. This means that it isn't an exact match to any specific open source release. Any open source version numbers mentioned in the Imply documentation don't pertain to Imply's distribution of Apache Druid.

The following release notes provide information on features, improvements, and bug fixes up to Imply STS release 2025.01. Read all release notes carefully, especially the Upgrade and downgrade notes, before upgrading. Additionally, review the deprecations page regularly to see if any features you use are impacted.

For information on the LTS release, see the LTS release notes.

If you are upgrading by more than one version, read the intermediate release notes too.

The following end-of-support dates apply in 2025:

  • On January 26, 2025, Imply 2023.01 LTS reaches EOL. This means that the 2023.01 LTS release line will no longer receive any patches, including security updates. Imply recommends that you upgrade to the latest LTS or STS release.
  • On January 31, 2025, Imply 2024.01 LTS ends general support status and will be eligible only for security support.

For more information, see Lifecycle Policy.

See Previous versions for information on older releases.

Imply evaluation

New to Imply? Get started with an Imply Hybrid (formerly Imply Cloud) Free Trial or start a self-hosted trial at Get started with Imply!

With Imply Hybrid, the Imply team manages your clusters in AWS, while you control the infrastructure and own the data. With self-hosted Imply, you can run Imply on *NIX systems in your own environment or cloud provider.

Imply Enterprise

If you run Imply Enterprise, see Imply product releases & downloads to access the Imply Enterprise distribution. When prompted, log on to Zendesk with your Imply customer credentials.

Changes in 2025.01

Druid highlights

SQL behavior

Starting in 2025.01 STS, you can no longer use the legacy, non-ANSI-SQL-compliant behavior for Booleans, nulls, and two-valued logic.

Make sure you update your queries to account for this behavior. For more information on how to update your queries, see the SQL compliant mode migration guide.

Support for the configs that enabled the legacy behavior has been removed. They no longer affect your query results. If these configs are set to the legacy behavior, Druid services fail to start.

Remove the following configs:

  • druid.generic.useDefaultValueForNull=true
  • druid.expressions.useStrictBooleans=false
  • druid.generic.useThreeValueLogicForNativeFilters=false

To continue getting the same results, you must update your queries; otherwise, your results will be incorrect after you upgrade.

Join hints in MSQ task engine queries

Druid now supports hints for SQL JOIN queries that use the MSQ task engine. This allows queries to specify the JOIN type to use at the per-join level. Join hints recursively affect subqueries.

SELECT /*+ sort_merge */ w1.cityName, w2.countryName
FROM
(
  SELECT /*+ broadcast */ w3.cityName AS cityName, w4.countryName AS countryName
  FROM wikipedia w3
  LEFT JOIN "wikipedia-set2" w4 ON w3.regionName = w4.regionName
) w1
JOIN "wikipedia-set1" w2 ON w1.cityName = w2.cityName
WHERE w1.cityName = 'New York';

(#17406) (id: 62998)

Front-coded dictionaries

You can specify that Druid uses the front-coded dictionaries feature during segment creation. Once Druid starts using segments with front-coded dictionaries, you can't downgrade to a version where Druid doesn't support front-coded dictionaries. For more information, see Migration guide: front-coded dictionaries.
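As a sketch, front coding is enabled per ingestion through the stringDictionaryEncoding option of the indexSpec in your tuningConfig; the bucketSize value below is illustrative, so check the current documentation for tuning guidance:

```json
"tuningConfig": {
  "indexSpec": {
    "stringDictionaryEncoding": {
      "type": "frontCoded",
      "bucketSize": 4
    }
  }
}
```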

Concurrent append and replace

Concurrent append and replace is now generally available.

Deprecation updates

  • CentOS support for Imply Enterprise: if you are using CentOS, migrate to a supported operating system: RHEL 7.x and 8.x or Ubuntu 18.04 and 20.04. Support is planned to end in April 2025.
  • ioConfig.inputSource.type.azure storage schema: update your ingestion specs to use the azureStorage storage schema, which provides more capabilities. Support is planned to end in 2026.01 STS.
  • ZooKeeper-based task discovery: it has not been the default method for task discovery for several releases. Support is planned to end in 2026.01 STS.

For features that have reached end of support in 2025.01 STS, see End of support.

For a more complete list of deprecations including upcoming ones, see Deprecations.

Segment management APIs

APIs for marking segments as used or unused have been moved from the Coordinator to the Overlord service:

  • Mark all segments of a datasource as unused: DELETE /druid/indexer/v1/datasources/{dataSourceName}

  • Mark all (non-overshadowed) segments of a datasource as used: POST /druid/indexer/v1/datasources/{dataSourceName}

  • Mark multiple (non-overshadowed) segments as used: POST /druid/indexer/v1/datasources/{dataSourceName}/markUsed

  • Mark multiple segments as unused: POST /druid/indexer/v1/datasources/{dataSourceName}/markUnused

  • Mark a single segment as used: POST /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}

  • Mark a single segment as unused: DELETE /druid/indexer/v1/datasources/{dataSourceName}/segments/{segmentId}

    (#17545) (id: 64884)

Improved metadata I/O

You can now reduce the metadata I/O during segment allocation by using the following Overlord runtime property: druid.indexer.tasklock.batchAllocationReduceMetadataIO. This property is set to true by default.

When set to true, the Overlord only fetches necessary segment payloads during segment allocation.
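For reference, the property lives in the Overlord's runtime configuration; since true is already the default, you only need to set it explicitly if you want to opt out:

```properties
# Overlord runtime.properties
# Set to false to restore the previous behavior of fetching all segment payloads
druid.indexer.tasklock.batchAllocationReduceMetadataIO=true
```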

(#17496) (id: 64772)

New metrics for GroupBy queries

When merging the groupBy results, the following metrics are now emitted by the GroupByStatsMonitor:

  • mergeBuffer/used: Number of merge buffers used.
  • mergeBuffer/acquisitionTimeNs: Total time required to acquire merge buffers.
  • mergeBuffer/acquisition: Number of queries that acquired a batch of merge buffers.
  • groupBy/spilledQueries: Number of queries that spilled to disk.
  • groupBy/spilledBytes: Number of bytes spilled to disk.
  • groupBy/mergeDictionarySize: Size of the merging dictionary.

(#17360) (id: 62147)

Auto-compaction using compaction supervisors (alpha)

You can run automatic compaction using compaction supervisors on the Overlord rather than as Coordinator duties. Compaction supervisors provide the following benefits over Coordinator duties:

  • Use the supervisor framework to get information about the auto-compaction, such as status or state
  • More easily suspend or resume compaction for a datasource
  • Use either the native compaction engine or the MSQ task engine
  • React more quickly, submitting tasks as soon as a compaction slot is available
  • Track compaction task status to avoid repeatedly re-compacting an interval

For more information, see Auto-compaction using compaction supervisors.

(#16291)

Projections (alpha)

Datasources now support projections as an alpha feature. Projections can improve query performance by pre-aggregating data. They are similar to materialized views but are part of a segment and are automatically used when a query fits the projection.

To use a projection, you must ingest a datasource using JSON-based ingestion. Include a projections block in your ingestion spec with the following fields: type, name, virtualColumns, groupingColumns, and aggregators. Note that you can have projections that only include aggregators and no grouping columns, such as when you want to create a projection for the sum of certain columns.
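For illustration, a projections block in a JSON-based ingestion spec might look like the following sketch; the projection name, columns, and aggregator are hypothetical, and this block sits alongside the dimensionsSpec in the spec:

```json
"projections": [
  {
    "type": "aggregate",
    "name": "daily_channel_totals",
    "virtualColumns": [],
    "groupingColumns": [
      { "type": "string", "name": "channel" }
    ],
    "aggregators": [
      { "type": "longSum", "name": "sum_added", "fieldName": "added" }
    ]
  }
]
```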

Then, use the following query context flags when running either a native query or SQL query:

  • useProjection: accepts a specific projection name and instructs the query engine that it must use that projection; the query fails if the projection doesn't match the query
  • forceProjections: accepts true or false and instructs the query engine that it must use a projection; the query fails if the engine can't find a matching projection
  • noProjections: accepts true or false and instructs the query engine to not use any projections
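For example, to pin a SQL query to a hypothetical projection named daily_channel_totals, you could pass the flag in the query context of a JSON request; the datasource and projection names here are placeholders:

```json
{
  "query": "SELECT channel, SUM(added) FROM wikipedia GROUP BY channel",
  "context": {
    "useProjection": "daily_channel_totals"
  }
}
```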

Note that auto-compaction does not preserve projections.

For more information, see the open source Druid issue for projections.

(#17214) (id: 64172) (#17484) (id: 64763)

Realtime query processing for multi-value strings

Realtime query processing no longer considers all strings as multi-value strings during expression processing, fixing a number of bugs and unexpected failures. This should also improve realtime query performance of expressions on string columns.

This change impacts topN queries for realtime segments where rows of data are implicitly null, such as from a property missing from a JSON object.

Before this change, these rows were handled as [] instead of null, leading to inconsistency between processing realtime segments and published segments. When processing realtime segments, the value was treated as [], which topN ignores. After publishing, the value became null, which topN does not ignore. As a result, the same query could return different results before and after the segment was persisted.

After this change, the topN engine now treats [] as null when processing realtime segments, which is consistent with published segments.

This change doesn't impact actual multi-value string columns, regardless of whether they're realtime.

(#17386) (id: 63771) (id: 64672)

Druid changes

  • Added support to the web console for the expectedLoadTimeMillis metric (#17359) (id: 64208)
  • Added support for aggregate only projections (#17484) (id: 64763)
  • Added support for UNION in decoupled planning (#17354) (id: 64402)
  • Added ingest/notices/queueSize, ingest/pause/time, and ingest/notices/time to statsd emitter (#17487) (id: 64679) (#17468) (id: 64601)
  • Added druid.expressions.allowVectorizeFallback and default to false (#17248) (id: 64173)
  • Added stageId and workerNumber to the MSQ task engine's processing thread names (#17324) (id: 64147)
  • Added support for a high-precision ST_GEOHASH function that takes the complex column geo, which contains longitude and latitude in that order, and returns a hash (id: 63437)
  • Added the config druid.server.http.showDetailedJsonMappingError, which is similar to druid.server.http.showDetailedJettyError, to configure the detail level for JSON mapping error messages (#16821) (id: 62645)
  • Changed real-time segment metrics so that they are emitted for each Sink instead of for each FireHydrant. This is a return to the emission behavior prior to improvements to real-time query performance made in 2024.02 (#17170) (id: 61871)
  • You no longer have to configure a temporary storage directory on the Middle Manager for durable storage or exports. If it isn't configured, Druid uses the task directory (#17015) (id: 60547)
  • Improved the column order for scan queries so that it aligns with the query's desired signature (#17463) (id: 64441)
  • Improved the Query view in the web console to support resizable side panels (#17387) (id: 64404)
  • Improved how the Overlord service determines the leader and hands off leadership (#17415) (id: 64312)
  • Improved Middle Manager-less ingestion so that the Kubernetes task runner exposes the getMaximumCapacity field (#17107) (id: 64168)
  • Improved the styling in the web console for the stage timing bar (#17295) (id: 64157)
  • Improved autoscaling for Supervisors so that scaling doesn't happen when partitions are less than minTaskCount (#17335) (id: 64145)
  • Improved how the Explore view in the web console handles defaults (#17252) (id: 64020)
  • Improved the MSQ task engine to account for situations where there are two simultaneous statistics collectors (#17216) (id: 63987)
  • Improved the lookups extension to support iterating over fetched data (#17212) (id: 63939)
  • Improved logging to include taskId in handoff notifier thread (#17185) (id: 63882)
  • Improved window functions that use the MSQ task engine so that the processor can send any number of rows and columns to the operator without having to partition by column (#17038) (id: 63249)
  • Fixed an issue with PostgreSQL metadata storage because of table name casing issues (#17351) (id: 64128)
  • Fixed an issue with Supervisor autoscaling which could cause it to get skipped when the Supervisor could be publishing or when minTriggerScaleActionFrequencyMillis hasn't elapsed (#17356) (id: 64226)
  • Fixed an issue in the web console where the progress indicator for table input gets stuck at 0 (#17334) (id: 64209)
  • Fixed an issue where batch segment allocation fails when there are replicas (#17262) (id: 64169)
  • Fixed an issue when grouping on a string array and sorting by it (#17183) (id: 64166)
  • Fixed an issue where duplicate compaction tasks might get launched (#17287) (id: 64154)
  • Fixed a race condition for failed queries with the MSQ task engine (#17313) (id: 64153)
  • Fixed several issues with the Explore view in the web console (#17234) (id: 64005) (#17240) (id: 64010) (#17225) (id: 63985)
  • Fixed an issue with querying realtime segment when using concurrent append and replace (#17157) (id: 63852)
  • Fixed an issue where Indexer tasks get stuck in a publishing state and must either get killed or hit the timeout (#17146) (id: 63800)
  • Removed unused Coordinator dynamic configs mergeSegmentsLimit and mergeBytesLimit (#17384) (id: 64267)

Imply Manager changes

  • Fixed a problem where updated Helm values were sometimes incorrectly displayed (id: 64648)

Pivot changes

  • The async download process now shows more information, including the number of rows processed (id: 60947)
  • The time series visualization now supports the TIMESERIES function (id: 63901)
  • In the records visualization you can now use the Nulls summary pill drop-down to turn off displaying the number of hidden null values (id: 64197)
  • You can now set a minimum auto-refresh rate when creating or editing a dashboard (id: 64032)
  • You can now preview the time range when adding a relative comparison to a visualization (id: 63944)
  • You can now specify the date and time to start evaluating alerts (id: 40669)
  • In the general options for a dashboard you can now set a default auto-refresh rate (id: 39798)
  • Fixed an issue with editing a report after removing a dimension used as a report filter (id: 63475)

Upgrade and downgrade notes

In addition to the upgrade and downgrade notes, review the deprecations page regularly to see if any features you use are impacted.

Minimum supported version for rolling upgrade

See "Supported upgrade paths" in the Lifecycle Policy documentation.

Default string array ingestion

Starting in 2024.10 STS, SQL-based ingestion with the MSQ task engine defaults to array typed columns instead of multi-value dimensions (MVDs). You must adjust your queries to either use array typed columns or explicitly specify your arrays as MVDs in your ingestion query. For more information, refer to the product feature update that Imply shared.
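One way to keep ingesting a string array as a multi-value dimension is to convert it explicitly in the ingestion query with ARRAY_TO_MV; treat the following as a sketch, since the table and column names are hypothetical and the EXTERN input source is elided:

```sql
-- Sketch: explicitly keep a string array as a multi-value dimension (MVD)
-- during SQL-based ingestion. Table and column names are hypothetical.
INSERT INTO "example_datasource"
SELECT
  __time,
  ARRAY_TO_MV("tags") AS "tags"  -- store "tags" as an MVD instead of an ARRAY
FROM TABLE(EXTERN(...))          -- your existing input source
PARTITIONED BY DAY
```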

Front-coded dictionaries

Once Druid starts using segments with front-coded dictionaries, you can't downgrade to a version where Druid doesn't support front-coded dictionaries. For more information, see Migration guide: front-coded dictionaries.

If you're already using this feature, you don't need to take any action.

Automatic compaction

Imply preserves your automatic compaction configurations upon upgrade.

Segment sorting

You can now configure Druid to sort segments by something other than time first.

This feature is in alpha and is not backwards compatible with versions earlier than 2024.09. If you enable it, you can't downgrade to a version earlier than 2024.09 STS.

For SQL-based ingestion, include the query context parameter forceSegmentSortByTime: false. For JSON-based batch and streaming ingestion, include forceSegmentSortByTime: false in the dimensionsSpec block.
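For JSON-based ingestion, a dimensionsSpec sketch might look like the following; the dimension names are hypothetical, and __time is listed explicitly at the position where it should fall in the sort order:

```json
"dimensionsSpec": {
  "forceSegmentSortByTime": false,
  "dimensions": [
    { "type": "string", "name": "channel" },
    { "type": "long", "name": "__time" }
  ]
}
```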

(#16849) (id: 63215)

Changed low-level APIs for extensions

This information is meant for users who write their own Druid extensions and doesn't impact anyone who only uses extensions supported by Imply.

As part of changes starting in 2024.09 to improve Druid, including the changes described in Segment sorting, some low-level APIs may no longer be compatible with your existing custom extensions. For more information about which interfaces are impacted, see the related Apache Druid pull requests.

Compression for complex metric columns

If you use the IndexSpec option complexMetricCompression to compress complex metric columns, you cannot downgrade to a version that doesn't support compressing those columns.
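For reference, the option is set in the indexSpec of your tuningConfig; a minimal sketch, with lz4 shown as an assumed codec choice:

```json
"indexSpec": {
  "complexMetricCompression": "lz4"
}
```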

This feature was introduced in 2024.09 STS.

(#16863) (id: 63277)

Changes to native equals filter

Beginning in 2024.01 STS, the native query equals filter on mixed type 'auto' columns that contain arrays must now be filtered as their presenting type. So if any rows are arrays (the segment metadata and information_schema reports the type as some array type), then the native queries must also filter as if they are some array type. This does not impact SQL, which already has this limitation due to how the type presents itself. This only impacts mixed type 'auto' columns, which contain both scalars and arrays.

Imply Hybrid MySQL upgrade

Imply Hybrid previously used MySQL 5.7 by default. New clusters use MySQL 8 by default. If you have an existing cluster, you need to upgrade the MySQL version, since Amazon RDS support for MySQL 5.7 ends on February 29, 2024. Although you can opt for extended support from Amazon, you can instead use Imply Hybrid Manager to upgrade your MySQL instance to MySQL 8.

The upgrade should have little to no impact on your queries but does require a reconnection to the database. The process can take an hour and services will reconnect to the database during the upgrade.

In preparation for the upgrade, you need to grant certain permissions to the Cloud Manager IAM role by applying the following policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "rds:CreateBlueGreenDeployment",
        "rds:PromoteReadReplica"
      ],
      "Resource": [
        "arn:aws:rds:*:*:pg:*",
        "arn:aws:rds:*:*:deployment:*",
        "arn:aws:rds:*:*:*:imply-*"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "rds:AddTagsToResource",
        "rds:CreateDBInstanceReadReplica",
        "rds:DeleteBlueGreenDeployment",
        "rds:DescribeBlueGreenDeployments",
        "rds:SwitchoverBlueGreenDeployment"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}

After you grant the permissions, click Apply changes for Amazon RDS MySQL Update on the Overview page of Imply Hybrid Manager.

Three-valued logic

caution

The legacy two-valued logic and the corresponding properties that support it will be removed in the January 2025 STS and January 2026 LTS. The SQL compatible three-valued logic will become the only option.

Update your queries and downstream apps prior to these releases.

SQL standard three-valued logic introduced in 2023.11 primarily affects filters using the logical NOT operation on columns with NULL values. This applies to both query and ingestion time filtering.

The following example illustrates the old and new behavior. Consider the filter x <> 'some value', which returns rows where x is not equal to 'some value'. Previously, Druid included all rows not matching x = 'some value', including null values. The new behavior follows the SQL standard and matches only rows that have a value and are not equal to 'some value'. Null values are excluded from the results.
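To reproduce the old behavior under the new semantics, add an explicit null check to the filter; a sketch with a hypothetical table and column name:

```sql
-- New behavior: rows where x IS NULL no longer match x <> 'some value'.
-- To include them again, filter for nulls explicitly:
SELECT * FROM "example" WHERE x <> 'some value' OR x IS NULL
```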

Three-valued logic is only enabled if you accept the following default values:

druid.generic.useDefaultValueForNull=false
druid.expressions.useStrictBooleans=true
druid.generic.useThreeValueLogicForNativeFilters=true

SQL compatibility

caution

The legacy behavior that is not compatible with standard ANSI SQL and the corresponding properties will be removed in the January 2025 STS and January 2026 LTS releases. The SQL-compatible behavior introduced in the 2023.09 STS will be the only behavior available.

Update your queries and any downstream apps prior to these releases.

Starting with 2023.09 STS, the default way Druid treats nulls and booleans has changed.

For nulls, Druid now differentiates between an empty string ('') and a record with no data as well as between an empty numerical record and 0.

You can revert to the previous behavior by setting druid.generic.useDefaultValueForNull to true. This property affects both storage and querying, and must be set on all Druid service types to be available at both ingestion time and query time. Reverting this setting to the old value restores the previous behavior without reingestion.

For booleans, Druid now strictly uses 1 (true) or 0 (false). Previously, true and false could be represented either as true and false as well as 1 and 0, respectively. In addition, Druid now returns a null value for Boolean comparisons like True && NULL.

druid.expressions.useStrictBooleans primarily affects querying; however, it also affects JSON columns and type-aware schema discovery for ingestion. You can set druid.expressions.useStrictBooleans to false to configure Druid to ingest Booleans in 'auto' and 'json' columns as VARCHAR (native STRING) columns that use the string values 'true' and 'false' instead of BIGINT (native LONG). You must set it on all Druid service types for it to take effect at both ingestion time and query time.

The following table illustrates some example scenarios and the impact of the changes:

| Query | 2023.08 STS and earlier | 2023.09 STS and later |
|---|---|---|
| Query empty string | Empty string ('') or null | Empty string ('') |
| Query null string | Null or empty | Null |
| COUNT(*) | All rows, including nulls | All rows, including nulls |
| COUNT(column) | All rows excluding empty strings | All rows including empty strings but excluding nulls |
| Expression 100 && 11 | 11 | 1 |
| Expression 100 \|\| 11 | 100 | 1 |
| Null FLOAT/DOUBLE column | 0.0 | Null |
| Null LONG column | 0 | Null |
| Null __time column | 0, meaning 1970-01-01 00:00:00 UTC | 1970-01-01 00:00:00 UTC |
| Null MVD column | '' | Null |
| ARRAY | Null | Null |
| COMPLEX | none | Null |
Update your queries

Before you upgrade, update your queries to account for the following changed behavior:

NULL filters

If your queries use NULL in the filter condition to match both nulls and empty strings, you should add an explicit filter clause for empty strings. For example, update s IS NULL to s IS NULL OR s = ''.

COUNT functions

COUNT(column) now counts empty strings. If you want to continue excluding empty strings from the count, replace COUNT(column) with COUNT(column) FILTER(WHERE column <> '').

GroupBy queries

GroupBy queries on columns containing null values can now have additional entries as nulls can co-exist with empty strings.

Avatica JDBC driver upgrade

info

The Avatica JDBC driver is not packaged with Druid. Its upgrade is separate from any upgrades to Imply.

If you notice intermittent query failures after upgrading your Avatica JDBC to version 1.21.0 or later, you may need to set the transparent_reconnection.
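For example, the property can be appended to the Avatica JDBC connection string; the Router endpoint below is a placeholder for your environment:

```
jdbc:avatica:remote:url=https://your-router:8888/druid/v2/sql/avatica/;transparent_reconnection=true
```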

Parameter execution changes for Kafka

When using the built-in FileConfigProvider for Kafka, interpolations are now intercepted by the JsonConfigurator instead of being passed down to the Kafka provider. This breaks existing deployments.

For more information, see KIP-297 and #13023.

Deprecation notices

For a more complete list of deprecations and their planned removal dates, see Deprecations.

CentOS support

If you are using CentOS, migrate to a supported operating system: RHEL 7.x and 8.x or Ubuntu 18.04 and 20.04. Removal of support for CentOS has been planned for April 2025.

Some segment loading configs deprecated

The following segment related configs are now deprecated and will be removed in future releases:

  • replicationThrottleLimit
  • useRoundRobinSegmentAssignment
  • maxNonPrimaryReplicantsToLoad
  • decommissioningMaxPercentOfMaxSegmentsToMove

Use smartSegmentLoading mode instead, which calculates values for these variables automatically.
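Smart segment loading is controlled through the Coordinator dynamic configuration; a sketch of the relevant field, which defaults to true and so only needs to be set if it was previously disabled:

```json
{
  "smartSegmentLoading": true
}
```

You can update the Coordinator dynamic configuration through the web console or the Coordinator dynamic configuration API.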

ioConfig.inputSource.type.azure storage schema

Update your ingestion specs to use the azureStorage storage schema, which provides more capabilities.

ZooKeeper-based task discovery

Use HTTP-based task discovery instead, which has been the default since 2022.

End of support

Two-valued logic

Druid's legacy two-valued logic for native filters and the properties for maintaining that behavior are deprecated and will be removed in the January 2025 STS and January 2026 LTS releases.

The ANSI-SQL compliant three-valued logic will be the only supported behavior after these releases. This SQL-compatible behavior became the default in the Imply 2023.11 STS and January 2024 LTS releases.

Update your queries and downstream apps and remove the corresponding configs.

For more information, see three-valued logic.

Properties for legacy Druid SQL behavior

Druid's legacy behavior for Booleans and NULLs and the corresponding properties are deprecated and will be removed in the January 2025 STS and January 2026 LTS releases.

The ANSI-SQL compliant treatment of Booleans and null values will be the only supported behavior after these releases. This SQL-compatible behavior became the default in the Imply 2023.11 STS and January 2024 LTS releases.

Update your queries and downstream apps and remove the corresponding configs.

For more information, see SQL compatibility.

druid.azure.endpointSuffix

The config has been removed. Update any references to use druid.azure.storageAccountEndpointSuffix instead.

SysMonitor support

Switch to OshiSysMonitor, as SysMonitor has been removed.

Asynchronous SQL download

The async downloads feature has been removed. This refers to an older version of async SQL download that has been replaced with a new version with the same name. For more information, see Download data.

ZooKeeper segment serving processes

ZooKeeper-based segment loading has been disabled in 2024.06 STS. In 2024.08 STS, segment serving processes such as Peons, Historicals, and Indexers no longer create the ZooKeeper loadQueuePath. The property druid.zk.paths.loadQueuePath is ignored if it is still in your configs.

If you are still using ZooKeeper-based segment loading and want to upgrade to a more recent release where only HTTP-based segment loading is supported, switch to HTTP-based segment loading before upgrading. For more information, see Segment management.

(#16816) (id: 62629)

Java 8

Java 8 for Druid is at end of support. We recommend you upgrade to Java 17.

JSON columns v3 and v4

JSON columns v3 and v4 are at end of support. Only JSON columns v5 are supported, and v5 has been the default for several releases. While Druid can still read the older versions, it can only create v5 columns now.

After upgrading to a version with support for a higher JSON version, you cannot downgrade to an earlier version. Imply's distribution of Apache Druid® has been on JSON v5 since the 2024.01 LTS and 2023.09 STS.

Segment loading rules

Smart segment loading automatically calculates the optimal values for settings you previously had to set manually. As a result, the following settings are ignored: maxSegmentsInNodeLoadingQueue, maxSegmentsToMove, replicantLifetime, and balancerComputeThreads. Additionally, the cachingCost balancer strategy is no longer supported.