2022.04

2022.04

  • Imply
  • Ingest
  • Query
  • Visualize
  • Administer
  • Deploy

›Overview

Overview

  • Querying data
  • Datasources

    • Datasources
    • Joins
    • Lookups
    • Query execution

    Query configuration

    • Query caching
    • Using query caching
    • Query context
  • Enhanced IP support
  • Multi-value dimensions
  • Multitenancy
  • Troubleshooting

Druid SQL

  • Druid SQL overview
  • SQL query syntax
  • SQL data types
  • Druid SQL Functions

    • All functions
    • Operators
    • Scalar functions
    • Aggregation functions
    • Multi-value string functions
  • SQL metadata tables
  • SQL connection context
  • Async SQL download
  • SQL query translation
  • Druid SQL APIs

    • Druid SQL API
    • Async SQL download API
    • JDBC Driver API

Native queries

  • Native queries
  • Native query types

    • Timeseries
    • TopN
    • GroupBy
    • Scan
    • Search
    • TimeBoundary
    • SegmentMetadata
    • DatasourceMetadata

    Native query components

    • Filters
    • Granularities
    • Dimensions
    • Aggregations
    • Post-aggregations
    • Expressions
    • Having filters (groupBy)
    • Sorting and limiting (groupBy)
    • Sorting (topN)
    • String comparators
    • Virtual columns
    • Spatial filters

Enhanced IP support

IPv6 support is an alpha feature. For more information, see Preview features.

Enhanced IP support in the Imply distribution of Apache Druid includes support for IPv6 in addition to IPv4. With this feature, you can store a dimension as an IP address complex data type. Druid requires the IP address type to query on the IP address columns using the functions in this topic.

You need a valid Imply license to enable the enhanced IP feature in Druid. The license must specify the feature name enhanced-ip-support, for example:

"features":["alerts","cross-tab","enhanced-ip-support"]

Contact your Imply representative to obtain a license file if you do not have one.

Setup and ingestion

Support for IPv6 is disabled by default. To enable this feature, load the imply-utility-belt extension in the common.runtime.properties, for example:

druid.extensions.loadList=["imply-utility-belt"]

To configure a dimension as an IP address column, include a dimension object specifying the ipAddress complex type and input column name in the dimensionsSpec property of your ingestion spec. IP address-typed columns support both IPv6 and IPv4 addresses.

For example, the following snippet of the ingestion spec instructs Druid to read the page and language columns as string-typed dimensions and casts the remote_address column as an IP address dimension:

"dimensionsSpec" : {
  "dimensions": [
    "page",
    "language",
    { "type": "ipAddress", "name": "remote_address" }
  ]
}

See Ingestion spec reference for more information on configuring dimensions during ingestion.

Querying on IP addresses

Apply the following functions in Druid SQL or native queries.

FunctionNotes
IP_PARSE(expr)Parses the string expr into an IP address complex type. Raises ParseException if the input string is not a valid IP address.
IP_TRY_PARSE(expr)Parses the string expr into an IP address complex type. Returns null if the input string is not a valid IP address.
IP_STRINGIFY(expr, [compact])Converts expr from an IP address complex data type to a string representation. Returns a string representation of the IP address. compact defaults to true if not specified, providing a compact form of the IP address with zero compression applied. Specify false for compact to obtain the full, expanded IP address.
IP_MATCH(expr1, expr2)Checks whether expr2 contains or is equal to the address of expr1. expr1 must be an IP address type. expr2 may be a string-typed IP address or a string CIDR range.
IP_PREFIX(expr, prefixLength)Given an IP address expr, creates a new IP address consisting of prefixLength number of bits of the original address. Use IP_STRINGIFY to convert the resulting prefix address into string format.

Examples

The examples in this section use the following table that contains a mix of IPv4 addresses, IPv6 addresses, and null values:

timeipv4ipv6ipvmix
2014102000192.0.2.42001:db8:3:4:5:6:7:8192.0.2.4
2014102000192.0.2.52001:db8:33:44:55:66:77:882001:db8:33:44:55:66:77:88
2014102000198.51.100.32001:db8:3:4:50:60:70:80198.51.100.11
2014102000192.0.2.72001:db8:6:5:4:3:2:12001:db8:6:5:5:6:7:8
2014102000203.0.113.52001:db8:6:5:5:6:7:82001:db8:33:44:55:66:77:88
2014102001192.0.2.42001:db8:3:4:5:6:7:8192.0.2.4
20141020012001:db8:33:44:55:66:77:88
2014102001198.51.100.22001:db8:3:4:50:60:70:80203.0.113.12
2014102001192.0.2.4
2014102001203.0.113.42001:db8:6:5:5:6:7:8192.0.2.5

Ingest data with IP address types

The following ingestion spec loads the example data and specifies the columns ipv4, ipv6, and ipvmix as IP address types.

{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "inline",
        "data": "2014102000\t192.0.2.4\t2001:db8:3:4:5:6:7:8\t192.0.2.4\n2014102000\t192.0.2.5\t2001:db8:33:44:55:66:77:88\t2001:db8:33:44:55:66:77:88\n2014102000\t198.51.100.3\t2001:db8:3:4:50:60:70:80\t198.51.100.11\n2014102000\t192.0.2.7\t2001:db8:6:5:4:3:2:1\t2001:db8:6:5:5:6:7:8\n2014102000\t203.0.113.5\t2001:db8:6:5:5:6:7:8\t2001:db8:33:44:55:66:77:88\n2014102001\t192.0.2.4\t2001:db8:3:4:5:6:7:8\t192.0.2.4\n2014102001\t\t2001:db8:33:44:55:66:77:88\n2014102001\t198.51.100.2\t2001:db8:3:4:50:60:70:80\t203.0.113.12\n2014102001\t192.0.2.4\n2014102001\t203.0.113.4\t2001:db8:6:5:5:6:7:8\t192.0.2.5"
      },
      "inputFormat": {
        "type": "tsv",
        "findColumnsFromHeader": false,
        "columns": [
          "time",
          "ipv4",
          "ipv6",
          "ipvmix"
        ]
      }
    },
    "tuningConfig": {
      "type": "index_parallel",
      "partitionsSpec": {
        "type": "dynamic"
      }
    },
    "dataSchema": {
      "dataSource": "ip_addresses",
      "timestampSpec": {
        "column": "time",
        "format": "yyyyMMddHH"
      },
      "transformSpec": {},
      "dimensionsSpec": {
        "dimensions": [
          {
            "type": "ipAddress",
            "name": "ipv4"
          },
          {
            "type": "ipAddress",
            "name": "ipv6"
          },
          {
            "type": "ipAddress",
            "name": "ipvmix"
          }
        ]
      },
      "granularitySpec": {
        "queryGranularity": "none",
        "rollup": false,
        "segmentGranularity": "hour"
      }
    }
  }
}

Select and view IP address columns

IP address columns store the addresses in binary format. Convert them to string values to display a human-readable format, whether in dotted decimal form for IPv4 or colon-separated hexadecimal notation for IPv6.

Call IP_STRINGIFY with optional argument compact, which controls the formatting of the string representation of IPv6 addresses. compact defaults to true to return compressed IPv6 addresses, applying zero compression and removing leading zeroes. Set compact to false to return fully exapnded IPv6 addresses. IPv4 addresses are not affected by the compact parameter.

Query:

SELECT ipv4,
       ipv6,
       ipvmix,
       IP_STRINGIFY(ipv4)          AS compactv4,
       IP_STRINGIFY(ipv6)          AS compactv6,
       IP_STRINGIFY(ipvmix)        AS compactvmix,
       IP_STRINGIFY(ipv4, false)   AS fullv4,
       IP_STRINGIFY(ipv6, false)   AS fullv6,
       IP_STRINGIFY(ipvmix, false) AS fullvmix
FROM   ip_addresses 

Output:

ipv4ipv6ipvmixcompactv4compactv6compactvmixfullv4fullv6fullvmix
AAAAAAAAAAAAAP//wAACBA==IAENuAADAAQABQAGAAcACA==AAAAAAAAAAAAAP//wAACBA==192.0.2.42001:db8:3:4:5:6:7:8192.0.2.4192.0.2.42001:0db8:0003:0004:0005:0006:0007:0008192.0.2.4
AAAAAAAAAAAAAP//wAACBQ==IAENuAAzAEQAVQBmAHcAiA==IAENuAAzAEQAVQBmAHcAiA==192.0.2.52001:db8:33:44:55:66:77:882001:db8:33:44:55:66:77:88192.0.2.52001:0db8:0033:0044:0055:0066:0077:00882001:0db8:0033:0044:0055:0066:0077:0088
AAAAAAAAAAAAAP//wAACBw==IAENuAAGAAUABAADAAIAAQ==IAENuAAGAAUABQAGAAcACA==192.0.2.72001:db8:6:5:4:3:2:12001:db8:6:5:5:6:7:8192.0.2.72001:0db8:0006:0005:0004:0003:0002:00012001:0db8:0006:0005:0005:0006:0007:0008
AAAAAAAAAAAAAP//xjNkAw==IAENuAADAAQAUABgAHAAgA==AAAAAAAAAAAAAP//xjNkCw==198.51.100.32001:db8:3:4:50:60:70:80198.51.100.11198.51.100.32001:0db8:0003:0004:0050:0060:0070:0080198.51.100.11
AAAAAAAAAAAAAP//ywBxBQ==IAENuAAGAAUABQAGAAcACA==IAENuAAzAEQAVQBmAHcAiA==203.0.113.52001:db8:6:5:5:6:7:82001:db8:33:44:55:66:77:88203.0.113.52001:0db8:0006:0005:0005:0006:0007:00082001:0db8:0033:0044:0055:0066:0077:0088
nullIAENuAAzAEQAVQBmAHcAiA==nullnull2001:db8:33:44:55:66:77:88nullnull2001:0db8:0033:0044:0055:0066:0077:0088null
AAAAAAAAAAAAAP//wAACBA==nullnull192.0.2.4null192.0.2.4nullnull
AAAAAAAAAAAAAP//wAACBA==IAENuAADAAQABQAGAAcACA==AAAAAAAAAAAAAP//wAACBA==192.0.2.42001:db8:3:4:5:6:7:8192.0.2.4192.0.2.42001:0db8:0003:0004:0005:0006:0007:0008192.0.2.4
AAAAAAAAAAAAAP//xjNkAg==IAENuAADAAQAUABgAHAAgA==AAAAAAAAAAAAAP//ywBxDA==198.51.100.22001:db8:3:4:50:60:70:80203.0.113.12198.51.100.22001:0db8:0003:0004:0050:0060:0070:0080203.0.113.12
AAAAAAAAAAAAAP//ywBxBA==IAENuAAGAAUABQAGAAcACA==AAAAAAAAAAAAAP//wAACBQ==203.0.113.42001:db8:6:5:5:6:7:8192.0.2.5203.0.113.42001:0db8:0006:0005:0005:0006:0007:0008192.0.2.5

Count distinct values of an IP address column

Translate IP addresses into their string forms to perform operations such as grouping or distinct counting.

Query:

SELECT COUNT(DISTINCT IP_STRINGIFY(ipv6)) AS approx_count
FROM   ip_addresses 

Output:

approx_count
6

Group IP addresses by prefix

Truncate IP addresses by a specified number of bits in the prefix with IP_PREFIX. Use the truncated addresses for grouping all addresses that fall within a certain range. Convert the results of IP_PREFIX to string type before use in other operations.

Query:

SELECT IP_STRINGIFY(IP_PREFIX(ipv6, 64))  AS prefix,
       COUNT(DISTINCT IP_STRINGIFY(ipv6)) AS address_count
FROM   ip_addresses
GROUP  BY 1 

Output:

prefixaddress_count
null1
2001:db8:33:44::1
2001:db8:3:4::2
2001:db8:6:5::2

Determine whether an IP address is within a subnet

Match an IP address to another address or CIDR range with IP_MATCH.

Match on IPv4

SELECT __time,
       IP_STRINGIFY(ipv4) AS stringv4
FROM   ip_addresses
WHERE  IP_MATCH(ipv4, '192.0.2.0/16')

Output:

__timestringv4
2014-10-20T00:00:00.000Z192.0.2.4
2014-10-20T00:00:00.000Z192.0.2.5
2014-10-20T00:00:00.000Z192.0.2.7
2014-10-20T01:00:00.000Z192.0.2.4
2014-10-20T01:00:00.000Z192.0.2.4

Match on IPv6

SELECT __time,
       IP_STRINGIFY(ipv6) AS stringv6
FROM   ip_addresses
WHERE  IP_MATCH(ipv6, '2001:db8:6::/48')

Output:

__timestringv6
2014-10-20T00:00:00.000Z2001:db8:6:5:4:3:2:1
2014-10-20T00:00:00.000Z2001:db8:6:5:5:6:7:8
2014-10-20T01:00:00.000Z2001:db8:6:5:5:6:7:8

Known issues

  • The Druid web console does not display tooltips with documentation for the IP address functions.
  • Druid displays the IP address complex type as Base64-encoded strings. Translate these strings to a human-readable format using the IP_STRINGIFY function described above.
  • Most Druid operations, including grouping and distinct count, require string-typed IP addresses. Call IP_STRINGIFY to convert an IP address data type to a string type.
  • IP_PREFIX produces an IP address, not a string. Do not use IP_PREFIX directly for expr2 in IP_MATCH without IP_STRINGIFY.
  • IP_PREFIX does not retain the number of prefix bits. For example, IP_PREFIX('1.2.3.4', 16) produces 1.2.0.0 instead of 1.2.0.0/16. You can restore the prefix bits by string concatenation. For example, to determine whether a given IP address belongs to a given subnet: IP_MATCH(ipaddr, IP_STRINGIFY(IP_PREFIX(otherAddr, 16)) || '/16').
  • The IPv4 address functions described in SQL scalar functions and Expressions do not support IP address columns.
  • Enhanced IP support has not been optimized for performance.
Last updated on 1/20/2022
← Query contextMulti-value dimensions →
  • Setup and ingestion
  • Querying on IP addresses
  • Examples
    • Ingest data with IP address types
    • Select and view IP address columns
    • Count distinct values of an IP address column
    • Group IP addresses by prefix
    • Determine whether an IP address is within a subnet
  • Known issues
2022.04
Key links
Try ImplyApache Druid siteImply GitHub
Get help
Stack OverflowSupportContact us
Learn more
Apache Druid forumsBlog
Copyright © 2022 Imply Data, Inc