SQL-based ingestion known issues
This page describes SQL-based batch ingestion using the druid-multi-stage-query extension, new in Druid 24.0. Refer to the ingestion methods table to determine which ingestion method is right for you.
Multi-stage query task runtime
Fault tolerance is partially implemented. Workers get relaunched when they are killed unexpectedly. The controller does not get relaunched if it is killed unexpectedly.
Worker task stage outputs are stored in the working directory given by druid.indexer.task.baseDir. Stages that generate a large amount of output data may exhaust all available disk space. In this case, the query fails with an UnknownError with a message including "No space left on device".
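As a rough way to watch for the disk exhaustion described above, the following sketch reports free space for a directory using Python's standard library. The function name `free_space_report`, the 20% threshold, and the use of the system temp directory as a stand-in path are all assumptions for illustration; substitute the directory configured in druid.indexer.task.baseDir on your worker hosts.

```python
import shutil
import tempfile


def free_space_report(base_dir, min_free_fraction=0.2):
    """Return disk usage for base_dir and flag when free space is below the threshold."""
    usage = shutil.disk_usage(base_dir)  # named tuple of total, used, free (bytes)
    # Guard against a zero-sized filesystem to avoid division by zero.
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return {
        "base_dir": base_dir,
        "free_bytes": usage.free,
        "free_fraction": free_fraction,
        "low_space": free_fraction < min_free_fraction,
    }


# Stand-in directory (assumption): replace with the druid.indexer.task.baseDir path.
report = free_space_report(tempfile.gettempdir())
print(report)
```

A check like this only indicates headroom at one moment; a stage that writes a large amount of output can still fill the disk while the query runs.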
SELECT from a Druid datasource does not include unpublished real-time data.
GROUPING SETS and UNION ALL are not implemented. Queries using these features return a QueryNotSupported error.
For COUNT DISTINCT queries, you may encounter a QueryNotSupported error that includes Must not have 'subtotalsSpec' as one of its causes. This is caused by the planner attempting to use GROUPING SETS, which are not implemented.
The numeric varieties of the LATEST aggregators do not work properly. Attempting to use the numeric varieties of these aggregators leads to an error like java.lang.ClassCastException: class java.lang.Double cannot be cast to class org.apache.druid.collections.SerializablePair. The string varieties, however, do work properly.
INSERT and REPLACE statements with column lists, like INSERT INTO tbl (a, b, c) SELECT ..., are not implemented.
INSERT ... SELECT and REPLACE ... SELECT insert columns from the SELECT statement based on column name. This differs from SQL standard behavior, where columns are inserted based on position.
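The difference between name-based and position-based insertion can be illustrated with a small sketch. This is a toy model, not Druid code: the helper names `insert_by_position` and `insert_by_name`, and the column names and values, are hypothetical.

```python
def insert_by_position(target_columns, row):
    """SQL-standard behavior: the i-th SELECT value lands in the i-th target column."""
    return dict(zip(target_columns, row))


def insert_by_name(select_names, row):
    """Name-based behavior: each value lands in the column named by the SELECT output."""
    return dict(zip(select_names, row))


target_columns = ["a", "b"]  # column order of the destination table
select_names = ["b", "a"]    # output column names of the SELECT, in SELECT order
row = (1, 2)                 # one row produced by the SELECT

print(insert_by_position(target_columns, row))  # {'a': 1, 'b': 2}
print(insert_by_name(select_names, row))        # {'b': 1, 'a': 2}
```

Because matching is by name, the aliases in the SELECT list determine where each value ends up, so aliasing each expression explicitly with AS keeps the result predictable.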
INSERT and REPLACE do not support all options available in ingestion specs, including the multiValueHandling dimension properties, and the
The schemaless dimensions feature is not available. All columns and their types must be specified explicitly using the signature parameter of the EXTERN function.
EXTERN with input sources that match large numbers of files may exhaust available memory on the controller task.
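Since every column and type has to be listed explicitly, it can help to generate the signature rather than write it by hand. The sketch below assumes the signature is a JSON array of objects with name and type fields; the helper `build_signature` and the example columns are hypothetical, so check the result against the EXTERN reference for your Druid version.

```python
import json


def build_signature(columns):
    """Serialize (name, type) pairs into a JSON array of {"name", "type"} objects."""
    return json.dumps([{"name": name, "type": col_type} for name, col_type in columns])


# Hypothetical columns for illustration only.
columns = [("timestamp", "string"), ("page", "string"), ("added", "long")]
signature = build_signature(columns)
print(signature)
```

The resulting string would be passed as the signature argument; any column left out of the list is not read, because schemaless discovery is unavailable.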
EXTERN refers to external files. Use