All Druid SQL functions
Apache Druid supports two query languages: Druid SQL and native queries. This document describes the SQL language.
This page provides a reference of Apache Druid® SQL functions in alphabetical order. For more details on a function, refer to the following:
- Aggregation functions
- Array functions
- JSON functions
- Multi-value string functions
- Scalar functions
- Window functions
Example data
The examples on this page use the following example datasources:
array-example
created with SQL-based ingestionflight-carriers
usingFlightCarrierOnTime (1 month)
included with Druidkttm
usingKoalasToTheMax one day
included with Druidmvd-example
using SQL-based ingestiontaxi-trips
usingNYC Taxi cabs (3 files)
included with Druid
To load a datasource included with Druid, access the web console and go to Load data > Batch - SQL > Example data. Select Connect data, and parse using the default settings. On the page to configure the schema, select the datasource label and enter the name of the datasource listed above.
Use the following query to create the array-example
datasource:
Datasource for arrays
REPLACE INTO "array-example" OVERWRITE ALL
WITH "ext" AS (
SELECT *
FROM TABLE(
EXTERN(
'{"type":"inline","data":"{\"timestamp\": \"2023-01-01T00:00:00\", \"label\": \"row1\", \"arrayString\": [\"a\", \"b\"], \"arrayLong\":[1, null,3], \"arrayDouble\":[1.1, 2.2, null]}\n{\"timestamp\": \"2023-01-01T00:00:00\", \"label\": \"row2\", \"arrayString\": [null, \"b\"], \"arrayLong\":null, \"arrayDouble\":[999, null, 5.5]}\n{\"timestamp\": \"2023-01-01T00:00:00\", \"label\": \"row3\", \"arrayString\": [], \"arrayLong\":[1, 2, 3], \"arrayDouble\":[null, 2.2, 1.1]} \n{\"timestamp\": \"2023-01-01T00:00:00\", \"label\": \"row4\", \"arrayString\": [\"a\", \"b\"], \"arrayLong\":[1, 2, 3], \"arrayDouble\":[]}\n{\"timestamp\": \"2023-01-01T00:00:00\", \"label\": \"row5\", \"arrayString\": null, \"arrayLong\":[], \"arrayDouble\":null}"}',
'{"type":"json"}'
)
) EXTEND (
"timestamp" VARCHAR,
"label" VARCHAR,
"arrayString" VARCHAR ARRAY,
"arrayLong" BIGINT ARRAY,
"arrayDouble" DOUBLE ARRAY
)
)
SELECT
TIME_PARSE("timestamp") AS "__time",
"label",
"arrayString",
"arrayLong",
"arrayDouble"
FROM "ext"
PARTITIONED BY DAY
Use the following query to create the mvd-example
datasource:
Datasource for multi-value string dimensions
REPLACE INTO "mvd-example" OVERWRITE ALL
WITH "ext" AS (
SELECT *
FROM TABLE(
EXTERN(
'{"type":"inline","data":"{\"timestamp\": \"2011-01-12T00:00:00.000Z\", \"label\": \"row1\", \"tags\": [\"t1\",\"t2\",\"t3\"]}\n{\"timestamp\": \"2011-01-13T00:00:00.000Z\", \"label\": \"row2\", \"tags\": [\"t3\",\"t4\",\"t5\"]}\n{\"timestamp\": \"2011-01-14T00:00:00.000Z\", \"label\": \"row3\", \"tags\": [\"t5\",\"t6\",\"t7\"]}\n{\"timestamp\": \"2011-01-14T00:00:00.000Z\", \"label\": \"row4\", \"tags\": []}"}',
'{"type":"json"}',
'[{"name":"timestamp", "type":"STRING"},{"name":"label", "type":"STRING"},{"name":"tags", "type":"ARRAY<STRING>"}]'
)
)
)
SELECT
TIME_PARSE("timestamp") AS "__time",
"label",
ARRAY_TO_MV("tags") AS "tags"
FROM "ext"
PARTITIONED BY DAY
ABS
Calculates the absolute value of a numeric expression.
- Syntax:
ABS(<NUMERIC>)
- Function type: Scalar, numeric
Example
The following example applies the ABS function to the ArrDelay
column from the flight-carriers
datasource.
SELECT
"ArrDelay" AS "arrival_delay",
ABS("ArrDelay") AS "absolute_arrival_delay"
FROM "flight-carriers"
WHERE "ArrDelay" < 0
LIMIT 1
Returns the following:
arrival_delay | absolute_arrival_delay |
---|---|
-27 | 27 |
ACOS
Calculates the arc cosine (arccosine) of a numeric expression.
- Syntax:
ACOS(expr)
- Function type: Scalar, numeric
Example
The following example calculates the arc cosine of 0
.
SELECT ACOS(0) AS "arc_cosine"
Returns the following:
arc_cosine |
---|
1.5707963267948966 |
ANY_VALUE
Returns any value of the specified expression.
- Syntax:
ANY_VALUE(expr, [maxBytesPerValue, [aggregateMultipleValues]])
- Function type: Aggregation
Example
The following example returns the state abbreviation, state name, and average flight time grouped by each state in flight-carriers
:
SELECT
"OriginState",
ANY_VALUE("OriginStateName") AS "OriginStateName",
AVG("ActualElapsedTime") AS "AverageFlightTime"
FROM "flight-carriers"
GROUP BY 1
LIMIT 3
Returns the following:
OriginState | OriginStateName | AverageFlightTime |
---|---|---|
AK | Alaska | 113.2777967841259 |
AL | Alabama | 92.28766697732215 |
AR | Arkansas | 95.0391382405745 |
APPROX_COUNT_DISTINCT
Counts distinct values of a regular column or a prebuilt sketch column using an approximate algorithm.
- Syntax:
APPROX_COUNT_DISTINCT(expr)
- Function type: Aggregation
Example
The following example counts the number of distinct airlines reported in flight-carriers
:
SELECT APPROX_COUNT_DISTINCT("Reporting_Airline") AS "num_airlines"
FROM "flight-carriers"
Returns the following:
num_airlines |
---|
20 |
APPROX_COUNT_DISTINCT_BUILTIN
Counts distinct values of a string, numeric, or hyperUnique
column using Druid's built-in cardinality
or hyperUnique
aggregators.
Consider using APPROX_COUNT_DISTINCT_DS_HLL
instead, which offers better accuracy in many cases.
- Syntax:
APPROX_COUNT_DISTINCT_BUILTIN(expr)
- Function type: Aggregation
Example
The following example counts the number of distinct airlines reported in flight-carriers
:
SELECT APPROX_COUNT_DISTINCT_BUILTIN("Reporting_Airline") AS "num_airlines"
FROM "flight-carriers"
Returns the following:
num_airlines |
---|
20 |
APPROX_COUNT_DISTINCT_DS_HLL
Returns the approximate number of distinct values in a HLL sketch column or a regular column. See DataSketches HLL Sketch module for a description of optional parameters.
- Syntax:
APPROX_COUNT_DISTINCT_DS_HLL(expr, [lgK, tgtHllType])
- Function type: Aggregation
Example
The following example returns the approximate number of distinct tail numbers in the flight-carriers
datasource.
SELECT APPROX_COUNT_DISTINCT_DS_HLL("Tail_Number") AS "estimate"
FROM "flight-carriers"
Returns the following:
estimate |
---|
4686 |
APPROX_COUNT_DISTINCT_DS_THETA
Returns the approximate number of distinct values in a Theta sketch column or a regular column. See DataSketches Theta Sketch module for a description of optional parameters.
- Syntax:
APPROX_COUNT_DISTINCT_DS_THETA(expr, [size])
- Function type: Aggregation
Example
The following example returns the approximate number of distinct tail numbers in the Tail_Number
column of the flight-carriers
datasource.
SELECT APPROX_COUNT_DISTINCT_DS_THETA("Tail_Number") AS "estimate"
FROM "flight-carriers"
Returns the following:
estimate |
---|
4667 |
APPROX_QUANTILE
Deprecated in favor of APPROX_QUANTILE_DS
.
- Syntax:
APPROX_QUANTILE(expr, probability, [k])
- Function type: Aggregation
APPROX_QUANTILE_DS
Computes approximate quantiles on a Quantiles sketch column or a regular numeric column. See DataSketches Quantiles Sketch module for a description of parameters.
- Syntax:
APPROX_QUANTILE_DS(expr, probability, [k])
- Function type: Aggregation
Example
The following example approximates the median of the Distance
column from the flight-carriers
datasource. The query may return a different approximation on each execution.
SELECT APPROX_QUANTILE_DS("Distance", 0.5, 128) AS "estimate_median"
FROM "flight-carriers"
Returns a result similar to the following:
estimate_median |
---|
569 |
APPROX_QUANTILE_FIXED_BUCKETS
Computes approximate quantiles on fixed buckets histogram column or a regular numeric column. See Fixed buckets histogram for a description of parameters.
- Syntax:
APPROX_QUANTILE_FIXED_BUCKETS(expr, probability, numBuckets, lowerLimit, upperLimit, [outlierHandlingMode])
- Function type: Aggregation
Example
The following example approximates the median of a histogram on the Distance
column from the flight-carriers
datasource. The histogram has 10 buckets, a lower limit of zero, an upper limit of 2500, and ignores outlier values.
SELECT APPROX_QUANTILE_FIXED_BUCKETS("Distance", 0.5, 10, 0, 2500, 'ignore') AS "estimate_median"
FROM "flight-carriers"
Returns the following:
estimate_median |
---|
571.6983032226562 |
ARRAY
Constructs a SQL ARRAY
literal from the provided expression arguments. All arguments must be of the same type.
- Syntax:
ARRAY[expr1, expr2, ...]
- Function type: Array
Example
The following example constructs arrays from the values of the agent_category
, browser
, and browser_version
columns in the kttm
datasource.
SELECT ARRAY["agent_category", "browser", "browser_version"] AS "user_agent_details"
FROM "kttm"
LIMIT 5
Returns the following:
user_agent_details |
---|
["Personal computer","Chrome","76.0.3809.100"] |
["Smartphone","Chrome Mobile","50.0.2661.89"] |
["Personal computer","Chrome","76.0.3809.100"] |
["Personal computer","Opera","62.0.3331.116"] |
["Smartphone","Mobile Safari","12.0"] |
ARRAY_AGG
Returns an array of all values of the specified expression. To include only unique values, specify DISTINCT
.
- Syntax:
ARRAY_AGG([DISTINCT] expr, [size])
- Function type: Aggregation
Example
The following example returns arrays of unique values from the OriginState
column in the flight-carriers
datasource, grouped by Reporting_Airline
.
SELECT "Reporting_Airline", ARRAY_AGG(DISTINCT "OriginState", 50000) AS "Origin"
FROM "flight-carriers"
GROUP BY "Reporting_Airline"
LIMIT 5
Returns the following:
Reporting_Airline | Origin |
---|---|
AA | ["AL","AR","AZ","CA","CO","CT","FL","GA","HI","IL","IN","KS","KY","LA","MA","MD","MI","MN","MO","NC","NE","NJ","NM","NV","NY","OH","OK","OR","PA","PR","RI","TN","TX","UT","VA","VI","WA"] |
AS | ["AK","AZ","CA","CO","FL","ID","IL","MA","NJ","NV","OR","TX","VA","WA"] |
B6 | ["AZ","CA","CO","FL","LA","MA","NJ","NV","NY","OR","PR","UT","VA","VT","WA"] |
CO | ["AK","AL","AZ","CA","CO","CT","FL","GA","HI","IL","IN","LA","MA","MD","MI","MN","MO","MS","NC","NE","NH","NJ","NM","NV","NY","OH","OK","OR","PA","PR","RI","SC","TN","TX","UT","VA","VI","WA"] |
DH | ["AL","CA","CT","FL","GA","IL","MA","ME","MI","NC","NH","NJ","NV","NY","OH","PA","RI","SC","TN","VA","VT","WA","WV"] |
ARRAY_APPEND
Appends the expression to the array. The source array type determines the resulting array type.
- Syntax:
ARRAY_APPEND(arr, expr)
- Function type: Array
Example
The following example appends c
to the values in the arrayString
column from the array-example
datasource.
SELECT ARRAY_APPEND("arrayString",'c') AS "array_appended"
FROM "array-example"
Returns the following:
array_appended |
---|
[a, b, c] |
[null,"b","c"] |
[c] |
[a, b, c] |
null |
ARRAY_CONCAT
Concatenates two arrays. The type of arr1
determines the resulting array type.
- Syntax:
ARRAY_CONCAT(arr1, arr2)
- Function type: Array
Example
The following example concatenates the arrays in the arrayLong
and arrayDouble
columns from the array-example
datasource.
SELECT ARRAY_CONCAT("arrayLong", "arrayDouble") AS "arrayConcatenated"
FROM "array-example"
Returns the following:
arrayConcatenated |
---|
[1,null,3,1.1,2.2,null] |
null |
[1,2,3,null,2.2,1.1] |
[1,2,3] |
null |
ARRAY_CONCAT_AGG
Concatenates array inputs into a single array. To include only unique values, specify DISTINCT
.
- Syntax:
ARRAY_CONCAT_AGG([DISTINCT] expr, [size])
- Function type: Aggregation
Example
The following example concatenates the array inputs from the arrayDouble
column of the array-example
datasource into a single array.
SELECT ARRAY_CONCAT_AGG( DISTINCT "arrayDouble") AS "array_concat_agg_distinct"
FROM "array-example"
Returns the following:
array_concat_agg_distinct |
---|
[null,1.1,2.2,5.5,999] |
ARRAY_CONTAINS
Checks if the array contains the specified expression.
Scalar
If the specified expression is a scalar value, returns true if the source array contains the value.
- Syntax:
ARRAY_CONTAINS(arr, expr)
- Function type: Array
Example
The following example returns true if the arraySring
column from the array-example
datasource contains 2
.
SELECT "arrayLong", ARRAY_CONTAINS("arrayLong", 2) AS "arrayContains"
FROM "array-example"
Returns the following:
arrayLong | arrayContains |
---|---|
[1,null,3] | false |
null | null |
[1,2,3] | true |
[1,2,3] | true |
[] | false |
Array
If the specified expression is an array, returns true if the source array contains all elements of the expression.
- Syntax:
ARRAY_CONTAINS(arr, expr)
- Function type: Array
Example
The following example returns true if the arrayLong
column from the array-example
datasource contains all elements of the provided expression.
SELECT "label", "arrayLong", ARRAY_CONTAINS("arrayLong", ARRAY[1,2,3]) AS "arrayContains"
FROM "array-example"
Returns the following:
label | arrayLong | arrayContains |
---|---|---|
row1 | [1,null,3] | false |
row2 | null | null |
row3 | [1,2,3] | true |
row4 | [1,2,3] | true |
row5 | [] | false |
ARRAY_LENGTH
Returns the length of the array.
- Syntax:
ARRAY_LENGTH(arr)
- Function type: Array
Example
The following example returns the length of array expressions in the arrayDouble
column from the array-example
datasource.
SELECT "arrayDouble" AS "array", ARRAY_LENGTH("arrayDouble") AS "arrayLength"
FROM "array-example"
Returns the following:
larray | arrayLength |
---|---|
row1 | 3 |
row2 | 3 |
row3 | 3 |
row4 | 0 |
row5 | null |
ARRAY_OFFSET
Returns the array element at the specified zero-based index. Returns null if the index is out of bounds.
- Syntax:
ARRAY_OFFSET(arr, long)
- Function type: Array
Example
The following example returns the element at the specified zero-based index from the arrays in the arrayLong
column of the array-example
datasource.
SELECT "arrayLong" as "array", ARRAY_OFFSET("arrayLong", 2) AS "elementAtIndex"
FROM "array-example"
Returns the following:
array | elementAtIndex |
---|---|
[1,null,3] | 3 |
null | null |
[1,2,3] | 3 |
[1,2,3] | 3 |
[] | null |
ARRAY_OFFSET_OF
Returns the zero-based index of the first occurrence of the expression in the array. Returns null if the value isn't present, or -1
if druid.generic.useDefaultValueForNull=true
(deprecated legacy mode).
- Syntax:
ARRAY_OFFSET_OF(arr, expr)
- Function type: Array
Example
The following example returns the zero-based index of the fist occurrence of 3
in the arrays in the arrayLong
column of the array-example
datasource.
SELECT "arrayLong" as "array", ARRAY_OFFSET_OF("arrayLong", 3) AS "offset"
FROM "array-example"
Returns the following:
array | offset |
---|---|
[1,null,3] | 2 |
null | null |
[1,2,3] | 2 |
[1,2,3] | 2 |
[] | null |
ARRAY_ORDINAL
Returns the array element at the specified one-based index. Returns null if the index is out of bounds.
- Syntax:
ARRAY_ORDINAL(arr, long)
- Function type: Array
Example
The following example returns the element at the specified one-based index from the arrays in the arrayLong
column of the array-example
datasource.
SELECT "arrayLong" as "array", ARRAY_ORDINAL("arrayLong", 2) AS "elementAtIndex"
FROM "array-example"
Returns the following:
array | elementAtIndex |
---|---|
[1,null,3] | null |
null | null |
[1,2,3] | 2 |
[1,2,3] | 2 |
[] | null |
ARRAY_ORDINAL_OF
Returns the one-based index of the first occurrence of the expression in the array. Returns null if the value isn't present, or -1
if druid.generic.useDefaultValueForNull=true
(deprecated legacy mode).
- Syntax:
ARRAY_ORDINAL_OF(arr, expr)
- Function type: Array
Example
The following example returns the one-based index of the fist occurrence of 3
in the arrays in the arrayLong
column of the array-example
datasource.
SELECT "arrayLong" as "array", ARRAY_ORDINAL_OF("arrayLong", 3) AS "ordinal"
FROM "array-example"
Returns the following:
array | ordinal |
---|---|
[1,null,3] | 3 |
null | null |
[1,2,3] | 3 |
[1,2,3] | 3 |
[] | null |
ARRAY_OVERLAP
Returns true if two arrays have any elements in common. Treats NULL
values as known elements.
- Syntax:
ARRAY_OVERLAP(arr1, arr2)
- Function type: Array
Example
The following example returns true if columns arrayString
and arrayDouble
from the array-example
datasource have common elements.
SELECT "arrayString", "arrayDouble", ARRAY_OVERLAP("arrayString", "arrayDouble") AS "overlap"
FROM "array-example"
Returns the following:
arrayString | arrayDouble | overlap |
---|---|---|
["a","b"] | [1.1,2.2,null] | false |
[null,"b"] | [999,null,5.5] | true |
[] | [null,2.2,1.1] | false |
["a","b"] | [] | false |
null | null | null |
SCALAR_IN_ARRAY
Checks if the scalar value is present in the array. Returns false if the value is non-null, or UNKNOWN
if the value is NULL
. Returns UNKNOWN
if the array is NULL
.
- Syntax:
SCALAR_IN_ARRAY(expr, arr)
- Function type: Array
Example
The following example returns true if the value 36
is present in the array generated from the elements in the DestStateFips
column from the flight-carriers
datasource.
SELECT "Reporting_Airline", ARRAY_AGG(DISTINCT "DestStateFips") AS "StateFipsArray", SCALAR_IN_ARRAY(36, ARRAY_AGG(DISTINCT "DestStateFips")) AS "ValueInArray"
FROM "flight-carriers"
GROUP BY "Reporting_Airline"
LIMIT 5
Returns the following:
Reporting_Airline | StateFipsArray | ValueInArray |
---|---|---|
AA | [1,4,5,6,8,9,12,13,15,17,18,20,21,22,24,25,26,27,29,31,32,34,35,36,37,39,40,41,42,44,47,48,49,51,53,72,78] | true |
AS | [2,4,6,8,12,16,17,25,32,34,41,48,51,53] | false |
B6 | [4,6,8,12,22,25,32,34,36,41,49,50,51,53,72] | true |
CO | [1,2,4,6,8,9,12,13,15,17,18,22,24,25,26,27,28,29,31,32,33,34,35,36,37,39,40,41,42,44,45,47,48,49,51,53,72,78] | true |
DH | [1,6,9,12,13,17,23,25,26,32,33,34,36,37,39,42,44,45,47,50,51,53,54] | true |
ARRAY_PREPEND
Prepends the expression to the array. The source array type determines the resulting array type.
- Syntax:
ARRAY_PREPEND(expr, arr)
- Function type: Array
Example
The following example prepends c
to the arrays in the arrayString
column from the array-example
datasource.
SELECT ARRAY_PREPEND('c', "arrayString") AS "arrayPrepended"
FROM "array-example"
Returns the following:
arrayPrepended |
---|
[c, a, b] |
["c",null,"b"] |
[c] |
[c,a,b] |
null |
ARRAY_SLICE
Returns a subset of the array from the zero-based index start
(inclusive) to end
(exclusive). Returns null if start
is less than 0, greater than the length of the array, or greater than end
.
- Syntax:
ARRAY_SLICE(arr, start, end)
- Function type: Array
Example
The following example constructs a new array from the elements of arrays in the arrayDouble
column from the array-example
datasource.
SELECT "arrayDouble", ARRAY_SLICE("arrayDouble", 0, 2) AS "arrayNew"
FROM "array-example"
Returns the following:
arrayDouble | arrayNew |
---|---|
[1.1,2.2,null] | [1.1,2.2] |
[999,null,5.5] | [999,null] |
[null,2.2,1.1] | [null,2.2] |
[] | [null,null] |
null | null |
ARRAY_TO_MV
Converts an array of any type into a multi-value string.
- Syntax:
ARRAY_TO_MV(arr)
- Function type: Array
Example
The following example converts the arrays in the arrayDouble
column from the array-example
datasource into multi-value strings.
SELECT ARRAY_TO_MV("arrayDouble") AS "multiValueString"
FROM "array-example"
Returns the following:
multiValueString |
---|
["1.1","2.2",null] |
["999.0",null,"5.5"] |
[null,"2.2","1.1"] |
[] |
null |
ARRAY_TO_STRING
Joins all elements of the array into a string using the specified delimiter.
- Syntax:
ARRAY_TO_STRING(arr, delimiter)
- Function type: Array
Example
The following example converts the arrays in the arrayDouble
column of the array-example
datasource into concatenated strings.
SELECT ARRAY_TO_STRING("arrayDouble", '') AS "notSeparated"
FROM "array-example"
Returns the following:
multiValueString |
---|
1.12.2null |
999.0null5.5 |
null2.21.1 |
|
null |
ASIN
Calculates the arc sine (arcsine) of a numeric expression.
- Syntax:
ASIN(expr)
- Function type: Scalar, numeric
Example
The following example calculates the arc sine of 1
.
SELECT ASIN(1) AS "arc_sine"
Returns the following:
arc_sine |
---|
1.5707963267948966 |
ATAN
Calculates the arc tangent (arctangent) of a numeric expression.
- Syntax:
ATAN(expr)
- Function type: Scalar, numeric
Example
The following example calculates the arc tangent of 1
.
SELECT ATAN(1) AS "arc_tangent"
Returns the following:
arc_tangent |
---|
0.7853981633974483 |
ATAN2
Calculates the arc tangent (arctangent) of a specified x and y coordinate.
- Syntax:
ATAN2(x, y)
- Function type: Scalar, numeric
Example
The following example calculates the arc tangent of the coordinate (1, -1)
SELECT ATAN2(1,-1) AS "arc_tangent_2"
Returns the following:
arc_tangent_2 |
---|
2.356194490192345 |
AVG
Calculates the average of a set of values.
- Syntax:
AVG(<NUMERIC>)
- Function type: Aggregation
Example
The following example calculates the average minutes of delay for a particular airlines in flight-carriers
:
SELECT AVG("DepDelayMinutes") AS avg_delay
FROM "flight-carriers"
WHERE "Reporting_Airline" = 'AA'
Returns the following:
avg_delay |
---|
8.936 |
BIT_AND
Performs a bitwise AND operation on all input values.
- Syntax:
BIT_AND(expr)
- Function type: Aggregation
Example
The following example returns the bitwise AND operation for all values in passenger-count
from taxi-trips
:
SELECT
BIT_AND("passenger_count") AS "bit_and"
FROM "taxi-trips"
Returns the following:
bit_and |
---|
0 |
BIT_OR
Performs a bitwise OR operation on all input values.
- Syntax:
BIT_OR(expr)
- Function type: Aggregation
Example
The following example returns the bitwise OR operation for all values in passenger-count
from taxi-trips
:
SELECT
BIT_OR("passenger_count") AS "bit_or"
FROM "taxi-trips"
Returns the following:
bit_or |
---|
15 |
BIT_XOR
Performs a bitwise XOR operation on all input values.
- Syntax:
BIT_XOR(expr)
- Function type: Aggregation
Example
The following example returns the bitwise XOR operation for all values in passenger-count
from taxi-trips
:
SELECT
BIT_OR("passenger_count") AS "bit_xor"
FROM "taxi-trips"
Returns the following:
bit_xor |
---|
6 |
BITWISE_AND
Returns the bitwise AND between two expressions: expr1 & expr2
.
- Syntax:
BITWISE_AND(expr1, expr2)
- Function type: Scalar, numeric
Example
The following example performs the bitwise AND operation 12 & 10
.
SELECT BITWISE_AND(12, 10) AS "bitwise_and"
Returns the following:
bitwise_and |
---|
8 |
BITWISE_COMPLEMENT
Returns the bitwise complement (bitwise not) for the expression: ~expr
.
- Syntax:
BITWISE_COMPLEMENT(expr)
- Function type: Scalar, numeric
Example
The following example performs the bitwise complement operation ~12
.
SELECT BITWISE_COMPLEMENT(12) AS "bitwise_complement"
Returns the following:
bitwise_complement |
---|
-13 |
BITWISE_CONVERT_DOUBLE_TO_LONG_BITS
Converts the bits of an IEEE 754 floating-point double value to long.
- Syntax:
BITWISE_CONVERT_DOUBLE_TO_LONG_BITS(expr)
- Function type: Scalar, numeric
Example
The following example returns the IEEE 754 floating-point double representation of 255
as a long.
SELECT BITWISE_CONVERT_DOUBLE_TO_LONG_BITS(255) AS "ieee_754_double_to_long"
Returns the following:
ieee_754_double_to_long |
---|
4643176031446892544 |
BITWISE_CONVERT_LONG_BITS_TO_DOUBLE
Converts a long to the IEEE 754 floating-point double specified by the bits stored in the long.
- Syntax:
BITWISE_CONVERT_LONG_BITS_TO_DOUBLE(expr)
- Function type: Scalar, numeric
Example
The following example returns the long representation of 4643176031446892544
as an IEEE 754 floating-point double.
SELECT BITWISE_CONVERT_LONG_BITS_TO_DOUBLE(4643176031446892544) AS "long_to_ieee_754_double"
Returns the following:
long_to_ieee_754_double |
---|
255 |
BITWISE_OR
Returns the bitwise OR between the two expressions: expr1 | expr2
.
- Syntax:
BITWISE_OR(expr1, expr2)
- Function type: Scalar, numeric
Example
The following example performs the bitwise OR operation 12 | 10
.
SELECT BITWISE_OR(12, 10) AS "bitwise_or"
Returns the following:
bitwise_or |
---|
14 |
BITWISE_SHIFT_LEFT
Returns the bitwise left shift by x positions of an expr: expr << x
.
- Syntax:
BITWISE_SHIFT_LEFT(expr, x)
- Function type: Scalar, numeric
Example
The following example performs the bitwise SHIFT operation 2 << 3
.
SELECT BITWISE_SHIFT_LEFT(2, 3) AS "bitwise_shift_left"
Returns the following:
bitwise_shift_left |
---|
16 |
BITWISE_SHIFT_RIGHT
Returns the bitwise right shift by x positions of an expr: expr >> x
.
- Syntax:
BITWISE_SHIFT_RIGHT(expr, x)
- Function type: Scalar, numeric
Example
The following example performs the bitwise SHIFT operation 16 >> 3
.
SELECT BITWISE_SHIFT_RIGHT(16, 3) AS "bitwise_shift_right"
Returns the following:
bitwise_shift_right |
---|
2 |
BITWISE_XOR
Returns the bitwise exclusive OR between the two expressions: expr1 ^ expr2
.
- Syntax:
BITWISE_XOR(expr1, expr2)
- Function type: Scalar, numeric
Example
The following example performs the bitwise XOR operation 12 ^ 10
.
SELECT BITWISE_XOR(12, 10) AS "bitwise_xor"
Returns the following:
bitwise_xor |
---|
6 |
BLOOM_FILTER
Computes a Bloom filter from values provided in an expression.
- Syntax:
BLOOM_FILTER(expr, numEntries)
numEntries
specifies the maximum number of distinct values before the false positive rate increases. - Function type: Aggregation
Example
The following example returns a Base64-encoded Bloom filter representing the set of devices, agent_category
, used in Albania:
SELECT "country",
BLOOM_FILTER(agent_category, 10) as albanian_bloom
FROM "kttm"
WHERE "country" = 'Albania'
GROUP BY "country"
Returns the following:
country | albanian_bloom |
---|---|
Albania | BAAAAAgAAACAAEAAAAAAAAAAAEIAAAAAAAAAAAAAAAAAAAAAAAIIAAAAAAAAAAAAAAAAAAIAAAAAAQAAAAAAAAAAAAAA |
BLOOM_FILTER_TEST
Returns true if an expression is contained in a Base64-encoded Bloom filter string.
- Syntax:
BLOOM_FILTER_TEST(expr, <STRING>)
- Function type: Scalar, other
Example
The following example returns true
when a device type, agent_category
, exists in the Bloom filter representing the set of devices used in Albania:
SELECT agent_category,
BLOOM_FILTER_TEST("agent_category", 'BAAAAAgAAACAAEAAAAAAAAAAAEIAAAAAAAAAAAAAAAAAAAAAAAIIAAAAAAAAAAAAAAAAAAIAAAAAAQAAAAAAAAAAAAAA') AS bloom_test
FROM "kttm"
GROUP BY 1
Returns the following:
agent_category | bloom_test |
---|---|
empty | false |
Game console | false |
Personal computer | true |
Smart TV | false |
Smartphone | true |
Tablet | false |
BTRIM
Trims characters from both the leading and trailing ends of an expression. Defaults chars
to a space if none is provided.
- Syntax:
BTRIM(expr[, chars])
- Function type: Scalar, string
Example
The following example trims the _
characters from both ends of the string expression.
SELECT
'___abc___' AS "original_string",
BTRIM('___abc___', '_') AS "trim_both_ends"
Returns the following:
original_string | trim_both_ends |
---|---|
___abc___ | abc |
CASE
Returns a result based on given conditions.
Simple CASE
Compares an expression to a set of values or expressions.
- Syntax:
CASE expr WHEN value1 THEN result1 \[ WHEN value2 THEN result2 ... \] \[ ELSE resultN \] END
- Function type: Scalar, other
Example
The following example returns a UI type based on the value of agent_category
from the kttm
datasource.
SELECT "agent_category" AS "device_type",
CASE "agent_category"
WHEN 'Personal computer' THEN 'Large UI'
WHEN 'Smartphone' THEN 'Mobile UI'
ELSE 'other'
END AS "UI_type"
FROM "kttm"
LIMIT 2
Returns the following:
device_type | UI_type |
---|---|
Personal computer | Large UI |
Smartphone | Mobile UI |
Searched CASE
Evaluates a set of Boolean expressions.
- Syntax:
CASE WHEN boolean_expr1 THEN result1 \[ WHEN boolean_expr2 THEN result2 ... \] \[ ELSE resultN \] END
- Function type: Scalar, other
Example
The following example returns the departure location corresponding to the value of the OriginStateName
column from the flight-carriers
datasource.
SELECT "OriginStateName" AS "flight_origin",
CASE
WHEN "OriginStateName" = 'Puerto Rico' THEN 'U.S. Territory'
WHEN "OriginStateName" = 'U.S. Virgin Islands' THEN 'U.S. Territory'
ELSE 'U.S. State'
END AS "state_status"
FROM "flight-carriers"
LIMIT 2
Returns the following:
flight_origin | departure_location |
---|---|
Puerto Rico | U.S. Territory |
Massachusetts | U.S. State |
CAST
Converts a value into the specified data type.
- Syntax:
CAST(value AS TYPE)
- Function type: Scalar, other
Example
The following example converts the values in the Distance
column from the flight-carriers
datasource from DOUBLE
to VARCHAR
.
SELECT "Distance" AS "original_column",
CAST("Distance" AS VARCHAR) "cast_to_string"
FROM "flight-carriers"
LIMIT 1
Returns the following:
original_column | cast_to_string |
---|---|
1571 | 1571.0 |
CEIL
Date and time
Rounds up a timestamp by a given time unit.
- Syntax:
CEIL(timestamp_expr TO unit>)
- Function type: Scalar, date and time
Example
The following example rounds up the __time
column from the taxi-trips
datasource to the nearest year.
SELECT
"__time" AS "original_time",
CEIL("__time" TO YEAR) AS "ceiling"
FROM "taxi-trips"
LIMIT 1
Returns the following:
original_time | ceiling |
---|---|
2013-08-01T08:14:37.000Z | 2014-01-01T00:00:00.000Z |
Numeric
Calculates the smallest integer value greater than or equal to the numeric expression.
- Syntax:
CEIL(<NUMERIC>)
- Function type: Scalar, numeric
Example
The following example applies the CEIL function to the fare_amount
column from the taxi-trips
datasource.
SELECT
"fare_amount" AS "fare_amount",
CEIL("fare_amount") AS "ceiling_fare_amount"
FROM "taxi-trips"
LIMIT 1
Returns the following:
fare_amount | ceiling_fare_amount |
---|---|
21.25 | 22 |
CHAR_LENGTH
Alias for LENGTH
.
- Syntax:
CHAR_LENGTH(expr)
- Function type: Scalar, string
CHARACTER_LENGTH
Alias for LENGTH
.
- Syntax:
CHARACTER_LENGTH(expr)
- Function type: Scalar, string
COALESCE
Returns the first non-null value.
- Syntax:
COALESCE(expr, expr, ...)
- Function type: Scalar, other
Example
The following example returns the first non-null value from the list of parameters.
SELECT COALESCE(null, null, 5, 'abc') AS "first_non_null"
Returns the following:
first_non_null |
---|
5 |
CONCAT
Concatenates a list of expressions.
- Syntax:
CONCAT(expr[, expr,...])
- Function type: Scalar, string
Example
The following example concatenates the OriginCityName
column from flight-carriers
, the string to
, and the DestCityName
column from flight-carriers
.
SELECT
"OriginCityName" AS "origin_city",
"DestCityName" AS "destination_city",
CONCAT("OriginCityName", ' to ', "DestCityName") AS "concatenate_flight_details"
FROM "flight-carriers"
LIMIT 1
Returns the following:
origin_city | destination_city | concatenate_flight_details |
---|---|---|
San Juan, PR | Washington, DC | San Juan, PR to Washington, DC |
CONTAINS_STRING
Returns true if str
is a substring of expr
, case-sensitive. Otherwise, returns false.
- Syntax:
CONTAINS_STRING(expr, str)
- Function type: Scalar, string
Example
The following example returns true if the OriginCityName
column from the flight-carriers
datasource contains the substring San
.
SELECT
"OriginCityName" AS "origin_city",
CONTAINS_STRING("OriginCityName", 'San') AS "contains_string"
FROM "flight-carriers"
LIMIT 2
Returns the following:
origin_city | contains_string |
---|---|
San Juan, PR | true |
Boston, MA | false |
COS
Calculates the trigonometric cosine of an angle expressed in radians.
- Syntax:
COS(expr)
- Function type: Scalar, numeric
Example
The following example calculates the cosine of angle PI/3
radians.
SELECT COS(PI / 3) AS "cosine"
Returns the following:
cosine |
---|
0.5000000000000001 |
COT
Calculates the trigonometric cotangent of an angle expressed in radians.
- Syntax:
COT(expr)
- Function type: Scalar, numeric
Example
The following example calculates the cotangent of angle PI/3
radians.
SELECT COT(PI / 3) AS "cotangent"
Returns the following:
cotangent |
---|
0.577350269189626 |
COUNT
Counts the number of rows.
- Syntax:
COUNT([DISTINCT] expr)
COUNT(*)
COUNT DISTINCT is an alias forAPPROX_COUNT_DISTINCT
. - Function type: Aggregation
Example
The following example counts the number of distinct flights per day after '2005-01-01 00:00:00'
in flight-carriers
:
SELECT
TIME_FLOOR(__time, 'P1D') AS "flight_day",
COUNT(*) AS "num_flights"
FROM "flight-carriers"
WHERE __time > '2005-01-01 00:00:00'
GROUP BY 1
LIMIT 3
Returns the following:
flight_day | num_flights |
---|---|
2005-11-01T00:00:00.000Z | 18961 |
2005-11-02T00:00:00.000Z | 19434 |
2005-11-03T00:00:00.000Z | 19745 |
CUME_DIST
Returns the cumulative distribution of the current row within the window calculated as number of window rows at the same rank or higher than current row
/ total window rows
. The return value ranges between 1/number of rows
and 1.
- Syntax:
CUME_DIST()
- Function type: Window
Example
The following example returns the cumulative distribution of number of flights by airline from two airports on a single day.
SELECT FLOOR("__time" TO DAY) AS "flight_day",
"Origin" AS "airport",
"Reporting_Airline" as "airline",
COUNT("Flight_Number_Reporting_Airline") as "num_flights",
CUME_DIST() OVER (PARTITION BY "Origin" ORDER BY COUNT("Flight_Number_Reporting_Airline") DESC) AS "cume_dist"
FROM "flight-carriers"
WHERE FLOOR("__time" TO DAY) = '2005-11-01'
AND "Origin" IN ('KOA', 'LIH')
GROUP BY 1, 2, 3
Returns the following:
flight_day | airport | airline | num_flights | cume_dist |
---|---|---|---|---|
2005-11-01T00:00:00.000Z | KOA | HA | 11 | 0.25 |
2005-11-01T00:00:00.000Z | KOA | UA | 4 | 0.5 |
2005-11-01T00:00:00.000Z | KOA | AA | 1 | 1 |
2005-11-01T00:00:00.000Z | KOA | NW | 1 | 1 |
2005-11-01T00:00:00.000Z | LIH | HA | 15 | 0.3333333333333333 |
2005-11-01T00:00:00.000Z | LIH | AA | 2 | 1 |
2005-11-01T00:00:00.000Z | LIH | UA | 2 | 1 |
CURRENT_DATE
Returns the current date in UTC time, unless you specify a different timezone in the query context.
- Syntax:
CURRENT_DATE
- Function type: Scalar, date and time
Example
The following example returns the current date.
SELECT CURRENT_DATE AS "current_date"
Returns the following:
current_date |
---|
2024-08-14T00:00:00.000Z |
CURRENT_TIMESTAMP
Returns the current timestamp in UTC time, unless you specify a different timezone in the query context.
- Syntax:
CURRENT_TIMESTAMP
- Function type: Scalar, date and time
Example
The following example returns the current timestamp.
SELECT CURRENT_TIMESTAMP AS "current_timestamp"
Returns the following:
current_timestamp |
---|
2024-08-14T21:30:13.793Z |
DATE_EXPAND
DATE_EXPAND(<STARTING_TIMESTAMP>, <ENDING_TIMESTAMP>, '<DURATION>)
Inserts evenly spaced timestamps that are inclusive of the starting and ending timestamps over a given duration. Use ISO 8601 format for the duration, such as PT1H
.
Note that you can use functions like TIMESTAMP_TO_MILLIS with your starting and ending timestamps. For example, DATE_EXPAND(TIMESTAMP_TO_MILLIS(__time), TIMESTAMP_TO_MILLIS(__time) + 20000, 'PT1S')
.
DATE_TRUNC
Rounds down a timestamp by a given time unit.
- Syntax:
DATE_TRUNC(unit, timestamp_expr)
- Function type: Scalar, date and time
Example
The following example truncates a timestamp from the __time
column from the taxi-trips
datasource to the most recent decade
.
SELECT
"__time" AS "original_timestamp",
DATE_TRUNC('decade', "__time") AS "truncate_timestamp"
FROM "taxi-trips"
LIMIT 1
Returns the following:
original_timestamp | truncate_time |
---|---|
2013-08-01T08:14:37.000Z | 2010-01-01T00:00:00.000Z |
DECODE_BASE64_COMPLEX
Decodes a Base64-encoded expression into a complex data type.
You can use the function to ingest data when a column contains an encoded data sketch such as Theta or HLL.
The function supports hyperUnique
and serializablePairLongString
data types by default.
To enable support for a complex data type, load the corresponding extension:
druid-bloom-filter
:bloom
druid-datasketches
:arrayOfDoublesSketch
,HLLSketch
,KllDoublesSketch
,KllFloatsSketch
,quantilesDoublesSketch
,thetaSketch
druid-histogram
:approximateHistogram
,fixedBucketsHistogram
druid-stats
:variance
druid-compressed-bigdecimal
:compressedBigDecimal
druid-momentsketch
:momentSketch
druid-tdigestsketch
:tDigestSketch
- Syntax:
DECODE_BASE64_COMPLEX(dataType, expr)
- Function type: Scalar, other
Example
The following example returns a Theta sketch complex type from a Base64-encoded string representation of the sketch:
SELECT DECODE_BASE64_COMPLEX('thetaSketch','AgMDAAAazJNBAAAAAACAP+k/tkWGkSoFYWMAG0y+3gVabvKcIUNrBv0jAkGsw7sK5szX1k0ScwtMfCQmFP/rDhFK6yU7PPkObZ/Ugw5fcBQZ+GaO+Nt6FP+Whz6TmxkWyRJ+gaQLFhcts1+c0Q/vF9FLFfaVlOkb3/XpXaZ3JhyZ2dG8Di2/HO10sMs9C0AdM4FdHuye6SB+GYinIhTOITOHzB5SAfIiph3de9qIGSM89V+s/TkdI/WZVzK9wF0npfi4ZrmgBSnVjphCtQA5K2fp0x59UCwvMopZarsSkzEo81OIxjznNNXLr1BbQBo1Ei3OxJOoNzVs0x9xzsm4NfgAZSvZQvI1c2TmPsZvlzpW7tmIlizOOsr6pGWoh0U99/tV8RFwhz0SJoWyU1Z2P0hZ5d7KRnZBjlWC+e/FLEKrWsu14rlFRXhsOuxRId9FboEuH9PqMUixI2lB8MhLS803hJDoZ7tMy7Egl+YNU04QM11stXX4Tu96NHHcGiZRuCyciGiTGVQflMLmNt6lW6zIwJy0baNdbwjMCTjtUF7oZOtugWLYYJE9sJU3HuVijc0J10l6SmPslbfY6Fw0Za9w/Zdhn/5nIuKc1WMrYWnAJQJKXY73bHYWq7gI6dRvYdC2fLJyv3F8qwQcOJgFc0GaGXw8KRF3w3IVCwxsMntWhdTkaJ88e++5NFyM1Hd/D79wg0b9vH8=') AS "theta_sketch"
You can perform Theta sketch operations on the resulting COMPLEX<thetaSketch>
value which resembles the input string.
DECODE_BASE64_UTF8
Decodes a Base64-encoded expression into a UTF-8 encoded string.
- Syntax:
DECODE_BASE64_UTF8(expr)
- Function type: Scalar, string
Example
The following example decodes the Base64-encoded representation of "Hello, World!":
SELECT
DECODE_BASE64_UTF8('SGVsbG8sIFdvcmxkIQ==') as decoded
Returns the following:
decoded |
---|
Hello, World! |
DEGREES
Converts an angle from radians to degrees.
- Syntax:
DEGREES(expr)
- Function type: Scalar, numeric
Example
The following example converts an angle of PI
radians to degrees
SELECT DEGREES(PI) AS "degrees"
Returns the following:
degrees |
---|
180 |
DENSE_RANK
Returns the rank for a row within a window without gaps. For example, if two rows tie for a rank of 1, the subsequent row is ranked 2.
- Syntax:
DENSE_RANK()
- Function type: Window
Example
The following example returns the dense rank by airline for flights from two airports on a single day.
SELECT FLOOR("__time" TO DAY) AS "flight_day",
"Origin" AS "airport",
"Reporting_Airline" as "airline",
COUNT("Flight_Number_Reporting_Airline") as "num_flights",
DENSE_RANK() OVER (PARTITION BY "Origin" ORDER BY COUNT("Flight_Number_Reporting_Airline") DESC) AS "dense_rank"
FROM "flight-carriers"
WHERE FLOOR("__time" TO DAY) = '2005-11-01'
AND "Origin" IN ('KOA', 'LIH')
GROUP BY 1, 2, 3
Returns the following:
flight_day | airport | airline | num_flights | dense_rank |
---|---|---|---|---|
2005-11-01T00:00:00.000Z | KOA | HA | 11 | 1 |
2005-11-01T00:00:00.000Z | KOA | UA | 4 | 2 |
2005-11-01T00:00:00.000Z | KOA | AA | 1 | 3 |
2005-11-01T00:00:00.000Z | KOA | NW | 1 | 3 |
2005-11-01T00:00:00.000Z | LIH | HA | 15 | 1 |
2005-11-01T00:00:00.000Z | LIH | AA | 2 | 2 |
2005-11-01T00:00:00.000Z | LIH | UA | 2 | 2 |
DIV
Returns the result of integer division of x
by y
.
The DIV
function is not implemented in Druid versions 30.0.0 or earlier. Consider using SAFE_DIVIDE
instead.
- Syntax:
DIV(x, y)
- Function type: Scalar, numeric
Example
The following calculates integer divisions of 78
by 10
.
SELECT DIV(78, 10) as "division"
Returns the following:
division |
---|
7 |
DS_CDF
Returns a string representing an approximation to the cumulative distribution function given a list of split points that define the edges of the bins from a Quantiles sketch.
- Syntax:
DS_CDF(expr, splitPoint0, splitPoint1, ...)
- Function type: Scalar, sketch
Example
The following example specifies three split points to return cumulative distribution function approximations on the Distance
column from the flight-carriers
datasource. The query may return a different approximation for each bin on each execution.
SELECT DS_CDF( DS_QUANTILES_SKETCH("Distance"), 750, 1500, 2250) AS "estimate_cdf"
FROM "flight-carriers"
Returns a result similar to the following:
estimate_cdf |
---|
[0.6332237016416492,0.8908411023460711,0.9612303007393957,1.0] |
DS_GET_QUANTILE
Returns the quantile estimate corresponding to the fraction from a Quantiles sketch.
- Syntax:
DS_GET_QUANTILE(expr, fraction)
- Function type: Scalar, sketch
Example
The following example approximates the median of the Distance
column from the flight-carriers
datasource. The query may return a different approximation with each execution.
SELECT DS_GET_QUANTILE( DS_QUANTILES_SKETCH("Distance"), 0.5) AS "estimate_median"
FROM "flight-carriers"
Returns a result similar to the following:
estimate_median |
---|
569 |
DS_GET_QUANTILES
Returns a string representing an array of quantile estimates corresponding to a list of fractions from a Quantiles sketch.
- Syntax:
DS_GET_QUANTILES(expr, fraction0, fraction1, ...)
- Function type: Scalar, sketch
Example
The following example approximates the 25th, 50th, and 75th percentiles of the Distance
column from the flight-carriers
datasource. The query may return a different approximation for each percentile on each execution.
SELECT DS_GET_QUANTILES( DS_QUANTILES_SKETCH("Distance"), 0.25, 0.5, 0.75) AS "estimate_fractions"
FROM "flight-carriers"
Returns a result similar to the following:
estimate_fractions |
---|
[316.0,571.0,951.0] |
DS_HISTOGRAM
Returns an approximation to the histogram from a Quantiles sketch. The split points define the histogram bins.
- Syntax:
DS_HISTOGRAM(expr, splitPoint0, splitPoint1, ...)
- Function type: Scalar, sketch
Example
The following example specifies three split points to approximate a histogram on the Distance
column from the flight-carriers
datasource. The query may return a different approximation for each bin on each execution.
SELECT DS_HISTOGRAM( DS_QUANTILES_SKETCH("Distance"), 750, 1500, 2250) AS "estimate_histogram"
FROM "flight-carriers"
Returns a result similar to the following:
estimate_histogram |
---|
[358496.0,153974.99999999997,39909.99999999999,13757.000000000005] |
DS_HLL
Creates a HLL sketch on a column containing HLL sketches or a regular column. See DataSketches HLL Sketch module for a description of optional parameters.
- Syntax:
DS_HLL(expr, [lgK, tgtHllType])
- Function type: Aggregation
Example
The following example creates a HLL sketch on the Tail_number
column of the flight-carriers
datasource grouping by OriginState
and DestState
.
SELECT
"OriginState" AS "origin_state",
"DestState" AS "destination_state",
DS_HLL("Tail_Number") AS "hll_tail_number"
FROM "flight-carriers"
GROUP BY 1,2
LIMIT 1
Returns the following:
origin_state | destination_state | hll_tail_number |
---|---|---|
AK | AK | "AwEHDAcIAAFBAAAAfY..." |
DS_UTF8_MURMUR3
DS_UTF8_MURMUR3(string|column|expression)
Function type: Scalar
Hashes the input UTF-8 encoded string, column, or expression and returns corresponding 128-bit hashes of type VARCHAR.
DS_UTF8_MURMUR3_64
DS_UTF8_MURMUR3_64(string|column|expression)
Function type: Scalar
Hashes the input UTF-8 encoded string, column, or expression and returns corresponding 64-bit hashes of type VARCHAR.