Enhanced IP support
IPv6 support is an alpha feature. For more information, see Preview features.
Enhanced IP support in the Imply distribution of Apache Druid includes support for IPv6 in addition to IPv4. With this feature, you can store a dimension as an IP address complex data type. Druid requires the IP address type to query on the IP address columns using the functions in this topic.
You need a valid Imply license to enable the enhanced IP feature in Druid. The license must specify the feature name enhanced-ip-support
, for example:
"features":["alerts","cross-tab","enhanced-ip-support"]
Contact your Imply representative to obtain a license file if you do not have one.
Setup and ingestion
Support for IPv6 is disabled by default.
To enable this feature, load the imply-utility-belt
extension in the common.runtime.properties
, for example:
druid.extensions.loadList=["imply-utility-belt"]
To configure a dimension as an IP address column, include a dimension object specifying the ipAddress
complex type and input column name in the dimensionsSpec
property of your ingestion spec.
IP address-typed columns support both IPv6 and IPv4 addresses.
For example, the following snippet of the ingestion spec instructs Druid to read the page
and language
columns as string-typed dimensions and casts the remote_address
column as an IP address dimension:
"dimensionsSpec" : {
"dimensions": [
"page",
"language",
{ "type": "ipAddress", "name": "remote_address" }
]
}
See Ingestion spec reference for more information on configuring dimensions during ingestion.
Querying on IP addresses
Apply the following functions in Druid SQL or native queries.
Function | Notes |
---|---|
IP_PARSE(expr) | Parses the string expr into an IP address complex type. Raises ParseException if the input string is not a valid IP address. |
IP_TRY_PARSE(expr) | Parses the string expr into an IP address complex type. Returns null if the input string is not a valid IP address. |
IP_STRINGIFY(expr, [compact]) | Converts expr from an IP address complex data type to a string representation. Returns a string representation of the IP address. compact defaults to true if not specified, providing a compact form of the IP address with zero compression applied. Specify false for compact to obtain the full, expanded IP address. |
IP_MATCH(expr1, expr2) | Checks whether expr2 contains or is equal to the address of expr1 . expr1 must be an IP address type. expr2 may be a string-typed IP address or a string CIDR range. |
IP_PREFIX(expr, prefixLength) | Given an IP address expr , creates a new IP address consisting of prefixLength number of bits of the original address. Use IP_STRINGIFY to convert the resulting prefix address into string format. |
Examples
The examples in this section use the following table that contains a mix of IPv4 addresses, IPv6 addresses, and null values:
time | ipv4 | ipv6 | ipvmix |
---|---|---|---|
2014102000 | 192.0.2.4 | 2001:db8:3:4:5:6:7:8 | 192.0.2.4 |
2014102000 | 192.0.2.5 | 2001:db8:33:44:55:66:77:88 | 2001:db8:33:44:55:66:77:88 |
2014102000 | 198.51.100.3 | 2001:db8:3:4:50:60:70:80 | 198.51.100.11 |
2014102000 | 192.0.2.7 | 2001:db8:6:5:4:3:2:1 | 2001:db8:6:5:5:6:7:8 |
2014102000 | 203.0.113.5 | 2001:db8:6:5:5:6:7:8 | 2001:db8:33:44:55:66:77:88 |
2014102001 | 192.0.2.4 | 2001:db8:3:4:5:6:7:8 | 192.0.2.4 |
2014102001 | 2001:db8:33:44:55:66:77:88 | ||
2014102001 | 198.51.100.2 | 2001:db8:3:4:50:60:70:80 | 203.0.113.12 |
2014102001 | 192.0.2.4 | ||
2014102001 | 203.0.113.4 | 2001:db8:6:5:5:6:7:8 | 192.0.2.5 |
Ingest data with IP address types
The following ingestion spec loads the example data and specifies the columns ipv4
, ipv6
, and ipvmix
as IP address types.
{
"type": "index_parallel",
"spec": {
"ioConfig": {
"type": "index_parallel",
"inputSource": {
"type": "inline",
"data": "2014102000\t192.0.2.4\t2001:db8:3:4:5:6:7:8\t192.0.2.4\n2014102000\t192.0.2.5\t2001:db8:33:44:55:66:77:88\t2001:db8:33:44:55:66:77:88\n2014102000\t198.51.100.3\t2001:db8:3:4:50:60:70:80\t198.51.100.11\n2014102000\t192.0.2.7\t2001:db8:6:5:4:3:2:1\t2001:db8:6:5:5:6:7:8\n2014102000\t203.0.113.5\t2001:db8:6:5:5:6:7:8\t2001:db8:33:44:55:66:77:88\n2014102001\t192.0.2.4\t2001:db8:3:4:5:6:7:8\t192.0.2.4\n2014102001\t\t2001:db8:33:44:55:66:77:88\n2014102001\t198.51.100.2\t2001:db8:3:4:50:60:70:80\t203.0.113.12\n2014102001\t192.0.2.4\n2014102001\t203.0.113.4\t2001:db8:6:5:5:6:7:8\t192.0.2.5"
},
"inputFormat": {
"type": "tsv",
"findColumnsFromHeader": false,
"columns": [
"time",
"ipv4",
"ipv6",
"ipvmix"
]
}
},
"tuningConfig": {
"type": "index_parallel",
"partitionsSpec": {
"type": "dynamic"
}
},
"dataSchema": {
"dataSource": "ip_addresses",
"timestampSpec": {
"column": "time",
"format": "yyyyMMddHH"
},
"transformSpec": {},
"dimensionsSpec": {
"dimensions": [
{
"type": "ipAddress",
"name": "ipv4"
},
{
"type": "ipAddress",
"name": "ipv6"
},
{
"type": "ipAddress",
"name": "ipvmix"
}
]
},
"granularitySpec": {
"queryGranularity": "none",
"rollup": false,
"segmentGranularity": "hour"
}
}
}
}
Select and view IP address columns
IP address columns store the addresses in binary format. Convert them to string values to display a human-readable format, whether in dotted decimal form for IPv4 or colon-separated hexadecimal notation for IPv6.
Call IP_STRINGIFY
with optional argument compact
, which controls the formatting of the string representation of IPv6 addresses.
compact
defaults to true to return compressed IPv6 addresses, applying zero compression and removing leading zeroes.
Set compact
to false to return fully exapnded IPv6 addresses.
IPv4 addresses are not affected by the compact
parameter.
Query:
SELECT ipv4,
ipv6,
ipvmix,
IP_STRINGIFY(ipv4) AS compactv4,
IP_STRINGIFY(ipv6) AS compactv6,
IP_STRINGIFY(ipvmix) AS compactvmix,
IP_STRINGIFY(ipv4, false) AS fullv4,
IP_STRINGIFY(ipv6, false) AS fullv6,
IP_STRINGIFY(ipvmix, false) AS fullvmix
FROM ip_addresses
Output:
ipv4 | ipv6 | ipvmix | compactv4 | compactv6 | compactvmix | fullv4 | fullv6 | fullvmix |
---|---|---|---|---|---|---|---|---|
AAAAAAAAAAAAAP//wAACBA== | IAENuAADAAQABQAGAAcACA== | AAAAAAAAAAAAAP//wAACBA== | 192.0.2.4 | 2001:db8:3:4:5:6:7:8 | 192.0.2.4 | 192.0.2.4 | 2001:0db8:0003:0004:0005:0006:0007:0008 | 192.0.2.4 |
AAAAAAAAAAAAAP//wAACBQ== | IAENuAAzAEQAVQBmAHcAiA== | IAENuAAzAEQAVQBmAHcAiA== | 192.0.2.5 | 2001:db8:33:44:55:66:77:88 | 2001:db8:33:44:55:66:77:88 | 192.0.2.5 | 2001:0db8:0033:0044:0055:0066:0077:0088 | 2001:0db8:0033:0044:0055:0066:0077:0088 |
AAAAAAAAAAAAAP//wAACBw== | IAENuAAGAAUABAADAAIAAQ== | IAENuAAGAAUABQAGAAcACA== | 192.0.2.7 | 2001:db8:6:5:4:3:2:1 | 2001:db8:6:5:5:6:7:8 | 192.0.2.7 | 2001:0db8:0006:0005:0004:0003:0002:0001 | 2001:0db8:0006:0005:0005:0006:0007:0008 |
AAAAAAAAAAAAAP//xjNkAw== | IAENuAADAAQAUABgAHAAgA== | AAAAAAAAAAAAAP//xjNkCw== | 198.51.100.3 | 2001:db8:3:4:50:60:70:80 | 198.51.100.11 | 198.51.100.3 | 2001:0db8:0003:0004:0050:0060:0070:0080 | 198.51.100.11 |
AAAAAAAAAAAAAP//ywBxBQ== | IAENuAAGAAUABQAGAAcACA== | IAENuAAzAEQAVQBmAHcAiA== | 203.0.113.5 | 2001:db8:6:5:5:6:7:8 | 2001:db8:33:44:55:66:77:88 | 203.0.113.5 | 2001:0db8:0006:0005:0005:0006:0007:0008 | 2001:0db8:0033:0044:0055:0066:0077:0088 |
null | IAENuAAzAEQAVQBmAHcAiA== | null | null | 2001:db8:33:44:55:66:77:88 | null | null | 2001:0db8:0033:0044:0055:0066:0077:0088 | null |
AAAAAAAAAAAAAP//wAACBA== | null | null | 192.0.2.4 | null | 192.0.2.4 | null | null | |
AAAAAAAAAAAAAP//wAACBA== | IAENuAADAAQABQAGAAcACA== | AAAAAAAAAAAAAP//wAACBA== | 192.0.2.4 | 2001:db8:3:4:5:6:7:8 | 192.0.2.4 | 192.0.2.4 | 2001:0db8:0003:0004:0005:0006:0007:0008 | 192.0.2.4 |
AAAAAAAAAAAAAP//xjNkAg== | IAENuAADAAQAUABgAHAAgA== | AAAAAAAAAAAAAP//ywBxDA== | 198.51.100.2 | 2001:db8:3:4:50:60:70:80 | 203.0.113.12 | 198.51.100.2 | 2001:0db8:0003:0004:0050:0060:0070:0080 | 203.0.113.12 |
AAAAAAAAAAAAAP//ywBxBA== | IAENuAAGAAUABQAGAAcACA== | AAAAAAAAAAAAAP//wAACBQ== | 203.0.113.4 | 2001:db8:6:5:5:6:7:8 | 192.0.2.5 | 203.0.113.4 | 2001:0db8:0006:0005:0005:0006:0007:0008 | 192.0.2.5 |
Count distinct values of an IP address column
Translate IP addresses into their string forms to perform operations such as grouping or distinct counting.
Query:
SELECT COUNT(DISTINCT IP_STRINGIFY(ipv6)) AS approx_count
FROM ip_addresses
Output:
approx_count |
---|
6 |
Group IP addresses by prefix
Truncate IP addresses by a specified number of bits in the prefix with IP_PREFIX
.
Use the truncated addresses for grouping all addresses that fall within a certain range.
Convert the results of IP_PREFIX
to string type before use in other operations.
Query:
SELECT IP_STRINGIFY(IP_PREFIX(ipv6, 64)) AS prefix,
COUNT(DISTINCT IP_STRINGIFY(ipv6)) AS address_count
FROM ip_addresses
GROUP BY 1
Output:
prefix | address_count |
---|---|
null | 1 |
2001:db8:33:44:: | 1 |
2001:db8:3:4:: | 2 |
2001:db8:6:5:: | 2 |
Determine whether an IP address is within a subnet
Match an IP address to another address or CIDR range with IP_MATCH
.
Match on IPv4
SELECT __time,
IP_STRINGIFY(ipv4) AS stringv4
FROM ip_addresses
WHERE IP_MATCH(ipv4, '192.0.2.0/16')
Output:
__time | stringv4 |
---|---|
2014-10-20T00:00:00.000Z | 192.0.2.4 |
2014-10-20T00:00:00.000Z | 192.0.2.5 |
2014-10-20T00:00:00.000Z | 192.0.2.7 |
2014-10-20T01:00:00.000Z | 192.0.2.4 |
2014-10-20T01:00:00.000Z | 192.0.2.4 |
Match on IPv6
SELECT __time,
IP_STRINGIFY(ipv6) AS stringv6
FROM ip_addresses
WHERE IP_MATCH(ipv6, '2001:db8:6::/48')
Output:
__time | stringv6 |
---|---|
2014-10-20T00:00:00.000Z | 2001:db8:6:5:4:3:2:1 |
2014-10-20T00:00:00.000Z | 2001:db8:6:5:5:6:7:8 |
2014-10-20T01:00:00.000Z | 2001:db8:6:5:5:6:7:8 |
Known issues
- The Druid web console does not display tooltips with documentation for the IP address functions.
- Druid displays the IP address complex type as Base64-encoded strings. Translate these strings to a human-readable format using the
IP_STRINGIFY
function described above. - Most Druid operations, including grouping and distinct count, require string-typed IP addresses. Call
IP_STRINGIFY
to convert an IP address data type to a string type. IP_PREFIX
produces an IP address, not a string. Do not useIP_PREFIX
directly forexpr2
inIP_MATCH
withoutIP_STRINGIFY
.IP_PREFIX
does not retain the number of prefix bits. For example,IP_PREFIX('1.2.3.4', 16)
produces 1.2.0.0 instead of 1.2.0.0/16. You can restore the prefix bits by string concatenation. For example, to determine whether a given IP address belongs to a given subnet:IP_MATCH(ipaddr, IP_STRINGIFY(IP_PREFIX(otherAddr, 16)) || '/16')
.- The IPv4 address functions described in SQL scalar functions and Expressions do not support IP address columns.
- Enhanced IP support has not been optimized for performance.