2020.11

Imply Cloud security

Security is of the utmost importance for any mission-critical application, and at Imply we know this is especially true for applications deployed in the cloud. This document details the security features offered by Imply Cloud.

Secure foundation

Imply Cloud is built on the secure foundation of Amazon Web Services (AWS). Your Imply cluster is private and dedicated to you. It is deployed completely within your own AWS account, using a dedicated virtual private cloud (VPC). This gives you full control over your machines and your data.

Imply also operates user-facing login and management software (“Imply Cloud Portal”), allowing you to manage and use your clusters through a web interface at https://implycloud.com/. The Imply Cloud Portal is provisioned on Imply servers and does not run in your AWS account.

Data that you load into your Imply cluster is never stored on Imply servers — it is only stored on EC2 instances and S3 buckets in your own AWS account. However, you may query your cluster through the Imply Cloud Portal, in which case query requests and responses are transferred through Imply's network. Imply's network merely transfers these query responses to you, and does not store them. This communication is encrypted using TLS; for more details, see data in transit below.

Imply Cloud requires two linkages between your AWS account and Imply:

  1. You must grant Imply permissions through IAM for management operations such as launching and terminating instances. This occurs during initial setup of your Imply Cloud account, as described in the Cloud signup instructions that appear after you sign up.
  2. The Imply Cloud VPC in your AWS account will be peered with an Imply-operated VPC containing the Imply Cloud Portal. This peering allows the Imply Cloud service to communicate with your cluster through private IP addresses, without requiring access over the internet. For increased security, the standard Imply Cloud configuration blocks ingress from the internet to your Imply Cloud VPC.

User authentication and authorization

Imply Cloud identifies each authorized user of your account by a unique login tied to their email address. Imply Cloud offers a role-based access control (RBAC) model that allows you to ensure that your users have exactly the permissions they need to do their work.

In particular, the RBAC model can be used to grant or restrict the following permissions for Imply Cloud authorized users:

  • Ability to manage clusters, datasets, and/or other users.
  • Ability to manage data cubes (building blocks of visualizations).
  • Ability to manage dashboards.
  • Ability to see visualizations (data cube view).
  • Ability to query the Imply cluster directly with SQL.
  • Ability to load new data into the Imply cluster.

In addition to authorized users, Imply Cloud lets you define API users that can call Druid APIs on your Imply cluster directly. API users can be used for building your own apps on top of the Druid API, automating workloads, and many other functions. The Druid API is protected by TLS encryption and HTTP Basic authentication. API users can be granted permissions tailored to the access they require, including:

  • Ability to read or write Druid configuration.
  • Ability to load new data into the cluster (all datasets, or specific datasets).
  • Ability to query the cluster (all datasets, or specific datasets).

Data in transit

Imply Cloud uses Transport Layer Security (TLS) for end-to-end encryption of data in transit. In particular, TLS is used to secure communications including the following:

  • All communications between your browser and the Imply Cloud Portal at https://implycloud.com/.
  • All communications between the Imply Cloud Portal and your private Imply cluster.
  • All internal traffic involving your data within your Imply cluster, including data ingestion, persistence, and query requests and responses.
  • All Druid API calls you make to your Imply cluster.

TLS 1.0 and 1.1 are deprecated for use with any Imply interface, including APIs and browser-based UIs such as Pivot or the Imply Manager. If you use a supported browser to access Imply user interfaces, you should not be affected by this change, since supported browsers use later protocols exclusively. However, if you have tools or other types of client software that access Imply APIs, you should verify that they use TLS 1.2 or later.
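For client tools written in Python, one way to rule out the deprecated protocols is to set a minimum protocol version on the SSL context. The following is a minimal sketch using only the standard library `ssl` module; the helper name `tls12_context` is illustrative and not part of any Imply API.

```python
import ssl


def tls12_context() -> ssl.SSLContext:
    """Return a client-side context that refuses TLS 1.0 and 1.1."""
    # create_default_context() keeps certificate and hostname checks enabled.
    ctx = ssl.create_default_context()
    # Reject any handshake that would negotiate a protocol older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

A context built this way can be passed to standard library clients such as `urllib.request.urlopen(..., context=ctx)`. Other client libraries expose their own settings for the minimum TLS version.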

Your Imply cluster also uses ZooKeeper for certain cluster coordination tasks. This traffic is not encrypted. However, Imply Cloud still safeguards this traffic using network segmentation through AWS EC2 Security Groups. Furthermore, this traffic is restricted to cluster coordination: your data is not at risk of exposure in these communications.

Data at rest

Data that you load into your Imply cluster is never stored on Imply servers. Your data is only stored on EC2 instances and S3 buckets in your own AWS account, and you have full control over it.

Imply Cloud supports encryption of your data in S3 at rest, through the use of Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3), if you configure this option for your S3 buckets linked to Imply Cloud. The choice of whether or not to use this option is wholly within your control. If you would like to enable this, you should do it before launching your Imply cluster.
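As a rough illustration, SSE-S3 default encryption on a bucket is expressed as a small configuration document naming the `AES256` algorithm. The sketch below only builds that document. The bucket name is a hypothetical placeholder, and the commented boto3 call assumes that boto3 is installed and AWS credentials are configured; it is not an Imply-provided procedure.

```python
import json

# Hypothetical bucket name, used for illustration only.
BUCKET = "example-imply-deep-storage"


def sse_s3_default_encryption() -> dict:
    """Build a default-encryption configuration that selects SSE-S3 (AES-256)."""
    return {
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
        ]
    }


config = sse_s3_default_encryption()
# With boto3 available, the configuration could be applied along these lines:
#   boto3.client("s3").put_bucket_encryption(
#       Bucket=BUCKET, ServerSideEncryptionConfiguration=config)
print(json.dumps(config, indent=2))
```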

Imply Cloud optionally supports encryption of data at rest on your EC2 instances. The mechanism varies based on instance type:

  • For EBS-backed instance types (e.g., c5, m5), Imply Cloud can optionally provision encrypted EBS volumes. This is controlled by the "instance encryption" setting in your Imply Cloud cluster configuration, and is enabled by default for clusters created after April 2, 2018. This setting can be changed for existing clusters, but they must be stopped first and then restarted.
  • Instance types that use NVMe local storage (e.g., i3, c5d, m5d) are transparently encrypted via hardware on the instance. This is always enabled, regardless of your Imply Cloud "instance encryption" setting. See here for details: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-instance-store.html.

Network setup

Imply Cloud uses 6 subnets across 3 availability zones (AZs) to provide resiliency in case of an AWS data center outage. Each AZ has one subnet for dynamic addresses and one for static addresses. These subnets are configured with a route to an internet gateway to provide egress access to, and optionally ingress access from, the internet.
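To make the layout concrete, the sketch below divides a VPC address range into one dynamic and one static subnet per availability zone. The CIDR block, zone names, and subnet sizes are assumptions chosen for illustration; the source does not state the ranges Imply Cloud actually allocates.

```python
import ipaddress

# Assumed values for illustration; real CIDR ranges and AZ names depend on your account.
VPC_CIDR = "10.1.0.0/16"
ZONES = ["us-east-1a", "us-east-1b", "us-east-1c"]


def plan_subnets(vpc_cidr: str, zones: list) -> list:
    """Assign one 'dynamic' and one 'static' subnet to each availability zone."""
    network = ipaddress.ip_network(vpc_cidr)
    # Three extra prefix bits yield 8 equal blocks; 6 of them are used for 3 zones.
    blocks = list(network.subnets(prefixlen_diff=3))
    plan = []
    for i, zone in enumerate(zones):
        plan.append({"zone": zone, "purpose": "dynamic", "cidr": str(blocks[2 * i])})
        plan.append({"zone": zone, "purpose": "static", "cidr": str(blocks[2 * i + 1])})
    return plan


for subnet in plan_subnets(VPC_CIDR, ZONES):
    print(subnet["zone"], subnet["purpose"], subnet["cidr"])
```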

Three security groups are set up for use by EC2 and ELB instances: one managed group containing security rules specified by Imply, and two initially empty unmanaged groups, one for EC2 instances and one for ELB instances, intended for custom rules that you provide. By default, there are no internet-facing rules configured and all ingress from the internet is blocked.

To facilitate communication with the Imply Cloud Manager, the VPC created in your AWS account is linked through a peering connection with the Cloud Manager VPC in Imply's account. This allows your Imply cluster to be addressable by the Manager using only private IP addresses without requiring any form of public accessibility.

Ports used in Imply Cloud

To allow communication between the Imply Manager and the Druid cluster VPCs, Imply configures the following ingress rules in the managed security group.

The following list gives each port, why it is open, and whether it can be disabled, for scenarios in which security policies require strict minimization of open ports.

  • 8081 – 8084 (HTTP; deprecated, will be removed soon): Used for detailed health checks on Master, Query, and Data Nodes.
  • 8281 (HTTPS): Coordinator Process on Master Nodes. Used for sending node states in the cluster. Coordinator access is required to perform cluster upgrades initiated from the Imply Manager.
  • 8282 (HTTPS): Broker Process on Query Nodes. Used for getting detailed health check information.
  • 8283 (HTTPS): Historical Process on Data Nodes. Used for getting detailed health check information.
  • 8290: Overlord Process on Master Nodes. Used during rolling updates to check pending index tasks. Overlord access is required to perform cluster upgrades initiated from the Imply Manager.
  • 8291: Middle Manager Process on Data Nodes. Used during rolling updates to check pending index tasks. Middle Manager access is required to perform cluster upgrades initiated from the Imply Manager.
  • 9088 (HTTPS): Router Process on Query Nodes. Used for information retrieval and web console access from within the Imply Manager, including retrieving the data displayed on the Cluster Overview page of the Imply Manager UI (e.g., remaining capacity and server information). The Manage Data link to the Druid Web Console also uses this port.
  • 9994: Used for sending server logs to the Imply control plane, which surfaces them in the Imply Manager UI for quick access. This supports the View server logs feature.
  • 9095 (HTTPS): Pivot Process on Query Nodes. Used for submitting Pivot queries and dashboard requests from the Imply Cloud Manager UI.
  • 22: When granted permission, the Imply support team may connect via SSH to cluster nodes to help with troubleshooting and maintenance.
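When auditing a security group against this section, a short script can flag ingress rules that open ports not documented here. The following sketch assumes the rules have already been exported as inclusive `(from_port, to_port)` pairs; the function names and that input format are illustrative, not an Imply or AWS API.

```python
# Ports taken from the section above; each entry is an inclusive (first, last) range.
DOCUMENTED_PORTS = [
    (8081, 8084), (8281, 8281), (8282, 8282), (8283, 8283),
    (8290, 8290), (8291, 8291), (9088, 9088), (9994, 9994),
    (9095, 9095), (22, 22),
]


def is_documented(port: int) -> bool:
    """Return True if the port falls inside one of the documented ranges."""
    return any(first <= port <= last for first, last in DOCUMENTED_PORTS)


def undocumented_ports(ingress_rules: list) -> list:
    """Return the sorted ports opened by the given rules that are not documented."""
    extra = set()
    for from_port, to_port in ingress_rules:
        for port in range(from_port, to_port + 1):
            if not is_documented(port):
                extra.add(port)
    return sorted(extra)
```

For example, `undocumented_ports([(8080, 8082), (443, 443)])` reports ports 443 and 8080, since 8081 and 8082 fall inside a documented range.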

Downloading the root TLS certificate

Each cluster node's TLS certificate is signed by a self-signed root certificate from Imply. In order to access your cluster nodes over TLS, you'll need to download the Imply root certificate and ensure that it is trusted by your client software.

The download link for this root certificate can be found in your cluster's API view, at the Root certificate field under the Security section:

[Image: Cluster API View]

Accessing a cluster node over TLS with curl

This example command uses curl to access the /status endpoint on the coordinator node of the cluster shown in the image above, over TLS:

curl -u admin:bWm4+3iDF+URWmDV7zXTpQ== --cacert f1503f07-f919-47f5-942f-e6a5f42b2d57.crt https://ip-10-1-0-5.ec2.internal:8281/status

f1503f07-f919-47f5-942f-e6a5f42b2d57.crt is the root certificate downloaded from this cluster's API view, specified with the --cacert flag.
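A roughly equivalent request can be made from Python with the standard library. This is a sketch under stated assumptions: the function names are illustrative, and the credentials and certificate file name in the usage comment are placeholders to replace with the values for your own cluster.

```python
import base64
import ssl
import urllib.request


def build_status_request(host: str, port: int, user: str, password: str) -> urllib.request.Request:
    """Build a GET request for /status carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"https://{host}:{port}/status",
        headers={"Authorization": f"Basic {token}"},
    )


def fetch_status(request: urllib.request.Request, ca_path: str) -> bytes:
    """Send the request, trusting only the root certificate stored at ca_path."""
    # Passing cafile loads just that certificate, mirroring curl's --cacert flag.
    context = ssl.create_default_context(cafile=ca_path)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.read()


# Example usage (placeholder values; requires network access to the cluster):
#   req = build_status_request("ip-10-1-0-5.ec2.internal", 8281, "admin", "PASSWORD")
#   print(fetch_status(req, "imply-root.crt").decode("utf-8"))
```

Setting the header up front mirrors how `curl -u` sends credentials with the first request, rather than waiting for an authentication challenge.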

For information on configuring access control for the Druid API endpoints, see Druid API users.

Other clients

For other clients such as web browsers, please refer to your client software's documentation for instructions on how to add a trusted root TLS certificate.

Further reading

For more information, see the documentation for the Druid basic authentication and authorization extension.

Copyright © 2020 Imply Data, Inc