2021.01 LTS

Install Imply Private on Linux

This document describes how to install the binary distribution of Imply on CentOS Linux. This is a basic, self-hosted deployment that leaves the provisioning and bootstrapping of the cluster nodes to you.

If you have existing Kubernetes expertise and infrastructure, we recommend deploying Imply Private on Kubernetes. Use the installation described here if you cannot use Kubernetes or Imply Cloud.

Overview

The Imply Private on Linux distribution consists of two archives:

  • Imply Manager distribution: imply-manager-2021.01.1
  • Imply Agent distribution: imply-agent-v1

Upon installation, Imply Agents become the master, query, and data nodes in your cluster. The Imply Manager archive installs the Imply Manager, which serves as the central configuration and administration interface for the Imply deployment.

Additionally, as depicted in the figure below, a complete Imply deployment includes:

  • An external database for metadata storage (MySQL or PostgreSQL)
  • A ZooKeeper ensemble for server coordination
  • A distributed storage system to serve as Druid's deep storage (e.g., S3, GCS, HDFS, NFS, etc.)

[Figure: New Cluster]

See Design for more about cluster servers and processes.

Distribution archive versioning

In the previous section, notice that the names of the Manager and agent distribution archives include version numbers 2021.01.1 and v1.

There are a few points to note regarding this versioning:

  • While the Manager archive version corresponds to the Imply software it is distributed with, the Manager distribution archive is not necessarily bound to that version. That is, you can typically upgrade the Imply cluster to a later version without updating the Manager version.
  • Occasionally, a change in a subsequent version of the Imply software may necessitate an update to the Imply Manager software, as installed here. You may wish to upgrade to access new features in the Manager as well. However, it is not typically required for a routine update of a cluster with the latest Imply software version.
  • Imply agents follow a simple versioning scheme, such as v1. Like the Manager, agent versions are not necessarily bound to an Imply software version; that is, you can update the cluster to a later version without updating the agent. However, it is possible, although relatively rare, that an update to a new Imply version will first require an update to the underlying Imply agent software.

Currently, there is only a single agent version, so no incompatibilities exist between Imply software versions and agent versions.

Supported Linux Distributions

  • CentOS 8

Software Dependencies

The Imply Manager and each Imply Agent must have the following software installed:

  • Java 8 (8u92 or higher); Zulu OpenJDK is recommended
  • Python 3 (3.6 or higher)
  • Perl 5 (5.26 or higher)

The Imply installer will check if these dependencies are installed and will fail if they are not found.
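
If you want to confirm these versions before running the installer, a quick preflight check along the following lines may help. This is a minimal sketch and not part of the Imply distribution: the version_ge and report helper names are made up for illustration, and the script assumes a sort command that supports the -V (version sort) option, as GNU sort on CentOS 8 does. Java is only printed for manual comparison against 8u92, because the format of the java -version banner varies by vendor.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper; not shipped with Imply.

# version_ge A B: succeeds when version A is greater than or equal to B.
# Relies on the -V (version sort) option of sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report NAME FOUND REQUIRED: print one status line per dependency.
report() {
  if [ -z "$2" ]; then
    echo "$1: not found (need $3 or higher)"
  elif version_ge "$2" "$3"; then
    echo "$1: $2 (ok, need $3 or higher)"
  else
    echo "$1: $2 (too old, need $3 or higher)"
  fi
}

# "python3 --version" prints e.g. "Python 3.6.8"; keep the second field.
py_ver="$(python3 --version 2>/dev/null | awk '{print $2}')"
# Perl's %vd format prints the interpreter version as e.g. "5.26.3".
perl_ver="$(perl -e 'printf "%vd", $^V' 2>/dev/null)"

report "Python 3" "$py_ver" "3.6"
report "Perl 5" "$perl_ver" "5.26"

# Java: show the raw version banner and compare it against 8u92 by eye.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "Java 8: not found (need 8u92 or higher)"
fi
```

The script only reports what it finds and never exits with an error, so the installer's own dependency check remains the authoritative one.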

Networking

The Imply Manager and each Imply Agent machine must be configured with a hostname that is unique within your network infrastructure. Each hostname must have a corresponding entry in the domain name system (DNS) so that Imply services can connect to each other using the hostname. The fully qualified domain name (FQDN), which is used in DNS, includes the hostname and the entire domain name. The hostname and the domain name labels are separated by periods, as follows: hostname.domain
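
The naming requirement can be illustrated with a short sketch. The function names split_fqdn and resolves, and the example name data01.imply.example.com, are hypothetical. Note also that getent hosts consults the system resolver as a whole (which may include /etc/hosts as well as DNS), so a successful lookup is a useful hint rather than proof of a DNS entry.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the function names are illustrative only.

# split_fqdn NAME: print the hostname label and the domain part of an FQDN.
split_fqdn() {
  case "$1" in
    *.*) echo "hostname=${1%%.*} domain=${1#*.}" ;;
    *)   echo "hostname=$1 domain=" ;;
  esac
}

# resolves NAME: succeed when the system resolver returns at least one
# address for NAME.
resolves() {
  getent hosts "$1" >/dev/null 2>&1
}

# Example with a made-up name; prints "hostname=data01 domain=imply.example.com".
split_fqdn "data01.imply.example.com"
```

On a real node, you might run resolves "$(hostname -f)" on each machine, and resolves with the Manager's hostname from each agent machine, before continuing with the installation.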

Installation steps

Before starting, install or verify the availability of the external dependencies mentioned above (a relational database, ZooKeeper, and deep storage). Note that the Manager and the cluster each rely on an external database to store metadata. You can use either the same database or different databases for the Manager and the cluster.

You will need sudo privileges on the target machines to run the installers.

Step 1: Install Imply Manager

On the machine that will serve Imply Manager, follow these steps:

  1. Download the imply-manager-2021.01.1.tar.gz archive to the target directory.

  2. Extract the distribution archive:

    $ tar -xvf imply-manager-2021.01.1.tar.gz
    
  3. Run the installer:

    $ sudo imply-manager-2021.01.1/script/install
    

    The installer checks the environment for software dependencies. It raises an error message if they are not found.

  4. Configure the Manager by editing /etc/opt/imply/manager.conf. Specify the connection settings for the Manager store database, ZooKeeper, the Imply metadata storage database, and deep storage, as appropriate for your environment. For example:

    IMPLY_MANAGER_LICENSE_KEY=<license (trial if not set)>
    
    IMPLY_MANAGER_STORE_TYPE=<[mysql, postgresql]>
    IMPLY_MANAGER_STORE_HOST=<db_host>
    IMPLY_MANAGER_STORE_PORT=3306
    IMPLY_MANAGER_STORE_USER=<user>
    IMPLY_MANAGER_STORE_PASSWORD=<password>
    IMPLY_MANAGER_STORE_DATABASE=imply_manager
    
    # These can be set here or through the Imply Manager when creating the cluster
    imply_defaults_zkType=external
    imply_defaults_zkHosts=<zk_host>:2181
    imply_defaults_zkBasePath=imply
    imply_defaults_metadataStorageType=<[mysql, postgresql]>
    imply_defaults_metadataStorageHost=<db_host>
    imply_defaults_metadataStoragePort=3306
    imply_defaults_metadataStorageUser=<user>
    imply_defaults_metadataStoragePassword=<password>
    imply_defaults_deepStorageType=<[azure, google, hdfs, local, s3]>
    imply_defaults_deepStoragePath=<e.g. s3://bucket-name/path>
    imply_defaults_deepStorageUser=<user>
    imply_defaults_deepStoragePassword=<password>
    

    See Enabling TLS, below, for how to configure TLS in the cluster.

  5. Start Imply Manager:

    $ sudo systemctl start imply-manager
    
  6. (Optional) Verify that Imply Manager is running.

    Ensure all services starting with "imply-" are preceded by a green dot, which indicates running status:

    $ systemctl list-dependencies --reverse imply-manager
    imply-manager.service
    ● ├─imply-grove-server.service
    ● ├─imply-manager-be.service
    ● ├─imply-manager-fe.service
    

The Imply Manager UI should now be accessible on port 9097.
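
Before starting the Manager, it can be useful to confirm that no setting in manager.conf still holds a placeholder copied from the example in step 4. The following is a minimal sketch of such a check. The unset_keys function is hypothetical and not part of the installer, and it assumes the plain KEY=value layout shown above, with no spaces around the equals sign.

```shell
#!/usr/bin/env bash
# Hypothetical pre-start check; not part of the Imply installer.

# unset_keys FILE KEY...: print each KEY that is missing from FILE, empty,
# or still holds a <placeholder> value copied from the documentation.
unset_keys() {
  file="$1"; shift
  for key in "$@"; do
    # Take the last uncommented KEY=value line, then drop "KEY=".
    value="$(grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-)"
    case "$value" in
      ""|"<"*) echo "$key" ;;
    esac
  done
}

# Self-contained demo on a temporary file.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
IMPLY_MANAGER_STORE_TYPE=mysql
IMPLY_MANAGER_STORE_HOST=<db_host>
IMPLY_MANAGER_STORE_PORT=3306
EOF
# Prints IMPLY_MANAGER_STORE_HOST and IMPLY_MANAGER_STORE_USER.
unset_keys "$demo" IMPLY_MANAGER_STORE_TYPE IMPLY_MANAGER_STORE_HOST \
  IMPLY_MANAGER_STORE_USER
rm -f "$demo"
```

Against a real installation, you would point the function at /etc/opt/imply/manager.conf (reading it may require sudo) and pass the keys that matter for your environment.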

Step 2: Log in and create the cluster

  1. From a browser, access Imply Manager at: http://<hostname>:9097.
  2. Create the administrator account and click Login to proceed.
  3. From the dashboard, click + New cluster.
  4. Review the cluster settings. The default values are appropriate for a small-scale deployment. If you did not provide the imply_defaults_* settings in the manager.conf file, you must set them at this time.
  5. In the Schema field, type a distinguishing name for the Imply Manager database schema, such as imply_manager.
  6. Click Create cluster to proceed.
  7. Click OK to confirm cluster creation. Note the Cluster ID indicated in the cluster status page. You will need this in the next step to configure the agents.

Step 3: Install agents

Install the agent on each node that will be part of this cluster. You will need at least one master, one query, and one data node.

The Imply agent must have read and write permissions to the /mnt/var and /mnt/tmp directories. If these directories do not exist, the installer will create them and set appropriate permissions. However, if they do exist and the imply user does not have read and write permissions to them, the install will fail.

  1. Download the imply-agent-v1.tar.gz archive to the target directory.

  2. Extract the distribution archive:

    $ tar -xvf imply-agent-v1.tar.gz
    
  3. Run the installer:

    $ sudo imply-agent-v1/script/install
    
  4. Configure the agent by editing /etc/opt/imply/agent.conf. Set the Manager hostname, the cluster ID you noted in the previous step, and the node type for this agent (master, query or data):

    IMPLY_MANAGER_HOST=<manager_host>
    IMPLY_MANAGER_AGENT_CLUSTER=<cluster_id>
    IMPLY_MANAGER_AGENT_NODE_TYPE=<[master, query, data]>
    

    The value of IMPLY_MANAGER_HOST must not include the protocol (http:// or https://) section of the URL.

    For clusters that use multiple data tiers, IMPLY_MANAGER_AGENT_NODE_TYPE can also be set to dataTier2 or dataTier3.

  5. Start the agent:

    $ sudo systemctl start imply-agent
    
  6. (Optional) Ensure Imply Agent is running

    Ensure all services starting with "imply-" are preceded by a green dot, which indicates running status:

    $ systemctl list-dependencies --reverse imply-agent
    imply-agent.service
    ● ├─imply-grove-agent.service
    ● ├─imply-runsvdir.service
    

Repeat this step for each server in the cluster, specifying the appropriate node type for each.

You can view status or stop the agent by running systemctl status imply-agent and sudo systemctl stop imply-agent, respectively.
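
When you configure many nodes, generating the three agent settings from a small helper can reduce copy-and-paste mistakes, such as leaving the protocol in the Manager host. The following is a minimal sketch under stated assumptions: the strip_scheme and agent_conf function names, the manager.example.com host, and the example-cluster-id value are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper for generating agent.conf settings; names are illustrative.

# strip_scheme URL: remove a leading http:// or https://, because
# IMPLY_MANAGER_HOST must not include the protocol.
strip_scheme() {
  host="${1#http://}"
  host="${host#https://}"
  echo "$host"
}

# agent_conf MANAGER_HOST CLUSTER_ID NODE_TYPE: print the three settings,
# or fail when NODE_TYPE is not one of the documented values.
agent_conf() {
  case "$3" in
    master|query|data|dataTier2|dataTier3) ;;
    *) echo "unsupported node type: $3" >&2; return 1 ;;
  esac
  echo "IMPLY_MANAGER_HOST=$(strip_scheme "$1")"
  echo "IMPLY_MANAGER_AGENT_CLUSTER=$2"
  echo "IMPLY_MANAGER_AGENT_NODE_TYPE=$3"
}

# Example with made-up values for the Manager host and cluster ID.
agent_conf "http://manager.example.com" "example-cluster-id" "data"
```

The output is meant to be reviewed and merged into /etc/opt/imply/agent.conf by hand, since the installed file may contain other settings that should be preserved.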

Step 4: Accessing Imply

Once the nodes in the cluster have been configured, they should appear in the Servers list. Ensure that all of the nodes have joined the cluster, and then locate the IP address or hostname of a query node. Access the Pivot UI from your browser at: http://<query_hostname>:9095.

A typical deployment would consist of multiple master, query, and data nodes for high availability. Instead of directly accessing the query node, you would route your requests through a load balancer for resiliency.

Reinstalling

You can reinstall the agent or manager by re-running the installer. The installer detects an existing installation and prompts you to remove the instance. As with installation, reinstalling Imply requires root access.

Existing configuration files are removed during the reinstall. To avoid losing your configuration, back up the manager.conf or agent.conf file to a directory outside /opt/imply/agent.

Imply Manager

$ cp /etc/opt/imply/manager.conf ~/manager.conf.bkp

Imply Agent

$ cp /etc/opt/imply/agent.conf ~/agent.conf.bkp
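
If you reinstall more than once, timestamped copies avoid overwriting an earlier backup. The following is a minimal sketch. The backup_conf function and its file naming are hypothetical, and reading the configuration files may require running it with sudo.

```shell
#!/usr/bin/env bash
# Hypothetical backup helper; the function name and file naming are illustrative.

# backup_conf FILE DEST_DIR: copy FILE into DEST_DIR with a timestamp suffix,
# preserving mode and timestamps. Prints the path of the copy.
backup_conf() {
  [ -f "$1" ] || { echo "skip: $1 not found" >&2; return 1; }
  target="$2/$(basename "$1").$(date +%Y%m%d-%H%M%S).bkp"
  cp -p "$1" "$target" && echo "$target"
}

# Back up whichever of the two files exists on this machine.
for conf in /etc/opt/imply/manager.conf /etc/opt/imply/agent.conf; do
  backup_conf "$conf" "$HOME" || true
done
```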

Enabling TLS

See the TLS Docs for information on how to generate certificates and general TLS information.

By default, Imply Manager looks for the certificate files in the following locations:

  • Signing certificate: /run/secrets/imply-ca.crt
  • Signing key: /run/secrets/imply-ca.key

The imply user must have read permissions to the CA key and certificate files.

To enable TLS, follow these steps:

  1. Copy the certificate and key to the above locations on the cluster nodes, or specify custom locations in the following configuration files.

  2. Add the following configurations:

    • For the manager, add the following lines to /etc/opt/imply/manager.conf:

      IMPLY_MANAGER_CA_KEY_PATH=/run/secrets/imply-ca.key
      IMPLY_MANAGER_CA_CERT_PATH=/run/secrets/imply-ca.crt
      
    • For the agent, add the following line to /etc/opt/imply/agent.conf:

      IMPLY_MANAGER_CA_CERT_PATH=/run/secrets/imply-ca.crt
      
  3. Restart the cluster.

If successfully enabled, TLS: Enabled appears in the logs at manager and agent startup.
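
The configuration and verification steps above can be scripted. The following is a minimal sketch, not part of the Imply distribution: the set_conf_key and tls_enabled function names are hypothetical, and the demo writes to a temporary file rather than to the real configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of the Imply distribution.

# set_conf_key FILE KEY VALUE: replace any existing KEY= line in FILE,
# or append one, so running it twice leaves a single entry.
set_conf_key() {
  tmp="$(mktemp)"
  grep -v "^$2=" "$1" > "$tmp" || true   # grep exits non-zero if nothing is left
  echo "$2=$3" >> "$tmp"
  cat "$tmp" > "$1"   # overwrite in place to keep the file's owner and mode
  rm -f "$tmp"
}

# tls_enabled LOGFILE: succeed when the startup message appears in LOGFILE.
tls_enabled() {
  grep -q "TLS: Enabled" "$1"
}

# Demo on a temporary file rather than the real /etc/opt/imply/manager.conf.
demo="$(mktemp)"
echo "IMPLY_MANAGER_STORE_PORT=3306" > "$demo"
set_conf_key "$demo" IMPLY_MANAGER_CA_KEY_PATH /run/secrets/imply-ca.key
set_conf_key "$demo" IMPLY_MANAGER_CA_CERT_PATH /run/secrets/imply-ca.crt
cat "$demo"
rm -f "$demo"
```

After restarting, running tls_enabled against the system log (/var/log/messages on CentOS, which usually requires sudo to read) is one way to confirm the startup message on each node.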

Authentication

See the Authentication Docs for more information on Manager authentication.

To enable authentication, supply an authentication token to the manager and the agents. The token can either be loaded from a file or set directly in the manager and agent configuration files.

Load from file

By default, Imply Manager looks for the authentication token in /run/secrets/imply-auth-token. To load the token from a custom file location, set IMPLY_MANAGER_AUTH_TOKEN_PATH in the manager and agent configuration files.

  • For the manager, add the following line to /etc/opt/imply/manager.conf:

    IMPLY_MANAGER_AUTH_TOKEN_PATH=/run/secrets/imply-auth-token
    
  • For the agent, add the following line to /etc/opt/imply/agent.conf:

    IMPLY_MANAGER_AUTH_TOKEN_PATH=/run/secrets/imply-auth-token
    

Set in configuration file

  • For the manager, add the following line to /etc/opt/imply/manager.conf:

    IMPLY_MANAGER_AUTH_TOKEN=<authentication token text>
    
  • For the agent, add the following line to /etc/opt/imply/agent.conf:

    IMPLY_MANAGER_AUTH_TOKEN=<authentication token text>
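
To see at a glance which of the two mechanisms a given configuration file uses, a small check like the following may help. It is a sketch only: the auth_source function and its output labels are hypothetical. This document does not say how the two settings interact if both are present, so the sketch simply reports that case instead of guessing a precedence.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the function name and output labels are illustrative.

# auth_source FILE: report which token setting(s) FILE contains.
#   inline  - IMPLY_MANAGER_AUTH_TOKEN is set in the file
#   file    - IMPLY_MANAGER_AUTH_TOKEN_PATH is set in the file
#   both    - both settings are present
#   default - neither is set, so only the default token path would apply
auth_source() {
  inline=no; path=no
  grep -q '^IMPLY_MANAGER_AUTH_TOKEN=' "$1" && inline=yes
  grep -q '^IMPLY_MANAGER_AUTH_TOKEN_PATH=' "$1" && path=yes
  case "$inline/$path" in
    yes/yes) echo "both" ;;
    yes/no)  echo "inline" ;;
    no/yes)  echo "file" ;;
    *)       echo "default" ;;
  esac
}

# Demo on a temporary file; prints "file".
demo="$(mktemp)"
echo "IMPLY_MANAGER_AUTH_TOKEN_PATH=/run/secrets/imply-auth-token" > "$demo"
auth_source "$demo"
rm -f "$demo"
```

Running it against manager.conf and each agent.conf makes it easy to spot a node that was configured differently from the rest.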
    

Logging

Imply Manager and agents write logs to the system log, /var/log/messages on CentOS. If you have trouble installing or starting, be sure to check the log for error messages.
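
A simple filter can narrow the system log down to likely problems. The following is a minimal sketch. The imply_errors function is hypothetical, and the match patterns are assumptions, because this document does not describe the exact format of Imply log lines. The demo log lines are made up.

```shell
#!/usr/bin/env bash
# Hypothetical log filter; the matching patterns are assumptions.

# imply_errors LOGFILE: print lines that mention "imply" together with
# "error" or "fail" (case-insensitive). Exit status is non-zero if none match.
imply_errors() {
  grep -i "imply" "$1" | grep -iE "error|fail"
}

# Demo on a temporary file with made-up log lines; prints only the second line.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
Jan 01 10:00:00 node1 imply-manager: started
Jan 01 10:00:05 node1 imply-agent: ERROR example failure message
Jan 01 10:00:09 node1 sshd: Failed password for invalid user
EOF
imply_errors "$demo"
rm -f "$demo"
```

On CentOS, you would pass /var/log/messages, which usually requires sudo to read. Because the Manager and agent run as systemd units, journalctl -u imply-manager or journalctl -u imply-agent may show the same messages, assuming the journal is enabled on your hosts.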

Copyright © 2021 Imply Data, Inc