Imply Manager (PREVIEW)

Welcome to the preview program for Imply Manager. The following documentation provides details on how to fetch, configure, and run the manager.

Fetching the image

Imply Manager is provided as a pair of Docker images available on Docker Hub. There is one image for the manager container which includes the coordination services and manager application, and another image for the agent containers which become the nodes of one or more Druid clusters. At this time, a deployment will always have a single manager and multiple agents. Support for running multiple managers for high availability will be offered in the near future.

The images are available in the imply/onprem-manager and imply/onprem-agent Docker Hub repositories. These are private repositories, and you will need to provide your Docker ID to Imply in order to gain access.

To pull these images from the command line, first log in with docker login and then pull the desired tag, for example:

docker pull imply/onprem-manager:2.9-PREVIEW1
docker pull imply/onprem-agent:2019-03-01-PREVIEW

Configuring the container

The recommended way to configure the image is to create a configuration file and pass it to the container using the --env-file argument of docker run. The following items are required:

IMPLY_MANAGER_HOST=<manager hostname>

IMPLY_MANAGER_STORE_TYPE=mysql
IMPLY_MANAGER_STORE_HOST=<mysql hostname>
IMPLY_MANAGER_STORE_PORT=<mysql port>
IMPLY_MANAGER_STORE_USER=<mysql user>
IMPLY_MANAGER_STORE_PASSWORD=<mysql password>
IMPLY_MANAGER_STORE_DATABASE=<mysql database name>

IMPLY_MANAGER_LICENSE_KEY=<license key>

It is important that IMPLY_MANAGER_HOST be a name that is resolvable and routable from the agent containers. The appropriate value depends on the Docker networking configuration you plan to use. For Linux-host deployments that do not use a container orchestration system (Kubernetes, Docker Swarm, etc.), a common setup is to run a single container per host and start it with --network host. In this case, IMPLY_MANAGER_HOST would be the externally addressable name or address of the host machine running the manager container. Note that --network host is not a supported option on OSX: Docker will accept the configuration, but because the container runs in a VM, it inherits the VM's network stack rather than the host's. See the Networking section for suggestions on how to run the container on OSX.
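
Because a missing required setting may only surface once the container starts, it can help to check the configuration file first. The following is a minimal sketch, not part of Imply Manager: check_env_file is a hypothetical helper, and it assumes the file uses plain KEY=value lines as in the list above.

```shell
# Hypothetical helper: report which required Imply Manager settings are
# missing from an env file. Prints each missing key and returns non-zero
# if any key is absent or has an empty value.
check_env_file() {
  env_file=$1
  missing=0
  for key in IMPLY_MANAGER_HOST IMPLY_MANAGER_STORE_TYPE \
             IMPLY_MANAGER_STORE_HOST IMPLY_MANAGER_STORE_PORT \
             IMPLY_MANAGER_STORE_USER IMPLY_MANAGER_STORE_PASSWORD \
             IMPLY_MANAGER_STORE_DATABASE IMPLY_MANAGER_LICENSE_KEY; do
    # A key counts as present only when followed by "=" and at least one character
    if ! grep -q "^${key}=." "$env_file"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Example (assumes a file named env.list in the current directory):
# check_env_file env.list && echo "env.list has all required settings"
```

If you pass IMPLY_MANAGER_HOST on the command line with -e instead of in the file, the check will report it as missing; that report can be ignored in that case.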

Druid clusters require an external ZooKeeper ensemble, a metadata store, and a deep storage location. Currently MySQL is the only metadata store supported by Imply Manager. S3, HDFS, and the local file system are the supported deep storage locations. Using the local file system will only work if you are running a single-node proof of concept, or if you are mounting a volume backed by a Docker volume driver that writes to resilient distributed storage.

You can provide default values for these external dependencies so that they don't need to be specified at cluster creation time. To do so, use the following parameters:

imply_defaults_zkType=external
imply_defaults_zkHosts=<zk-host-1:2181,zk-host-2:2181,zk-host-3:2181>
imply_defaults_zkBasePath=imply

imply_defaults_metadataStorageType=mysql
imply_defaults_metadataStorageHost=<mysql hostname>
imply_defaults_metadataStoragePort=<mysql port>
imply_defaults_metadataStorageUser=<mysql user>
imply_defaults_metadataStoragePassword=<mysql password>

imply_defaults_deepStorageType=<'s3', 'hdfs', or 'local'>
imply_defaults_deepStorageBaseLocation=s3://<bucket>/<prefix> # s3
# imply_defaults_deepStorageBaseLocation=hdfs://<namenode host>:8020/<prefix> # hdfs
# imply_defaults_deepStorageBaseLocation=</path/to/volume/mount> # local

# Not necessary if you are providing your credentials elsewhere, e.g. using an EC2 instance role
# imply_defaults_deepStorageS3AccessKey=<s3 access key>
# imply_defaults_deepStorageS3SecretKey=<s3 secret key>
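
The deep storage type and base location have to agree with each other: an s3:// URL for s3, an hdfs:// URL for hdfs, and an absolute path for local. As a rough illustration, the sketch below checks that pairing before the values go into the configuration file. check_deep_storage is a hypothetical helper and only looks at the shape of the location, not whether the bucket, namenode, or path actually exists.

```shell
# Hypothetical helper: check that a deep storage base location has the
# shape shown in the parameter list for its storage type.
check_deep_storage() {
  storage_type=$1
  location=$2
  case "$storage_type:$location" in
    s3:s3://?*)     return 0 ;;  # s3://<bucket>/<prefix>
    hdfs:hdfs://?*) return 0 ;;  # hdfs://<namenode host>:8020/<prefix>
    local:/*)       return 0 ;;  # absolute path to a volume mount
    *)
      echo "location '$location' does not match deep storage type '$storage_type'" >&2
      return 1 ;;
  esac
}

# Example:
# check_deep_storage s3 "s3://my-bucket/druid" && echo "looks consistent"
```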

Running the container

Networking

Your Docker run command will vary based on the networking configuration chosen. Note the following important ports:

Ports which should be externally exposed on the manager container: 9097 (the Imply Manager application).

Ports which should be externally exposed on agent containers that will run as Imply query nodes: 9095 and 8888 (the ports published for the agent container in the Docker Run examples).

For proper cluster coordination, the manager and agent containers should be able to talk to each other. In most default networking configurations, containers can talk to each other unrestricted. If you need to allow access on specific ports, the ones used for cluster coordination are:

Host Networking

If you are running on a Linux host without a container orchestration system, the recommended deployment strategy is host networking. In this configuration, the container uses the host's network stack, which provides the best performance because packets do not need to be routed to a separate container network stack. You would access the Imply Manager on port 9097 of the host system.

User-defined Bridge

On a non-Linux host, you are unable to use host networking since the containers do not run natively on the host but run in a virtual machine. The recommended alternative is to use a user-defined bridge network. To create a bridge network, run:

docker network create imply

You will then be able to use this network when running containers by specifying --network imply and will be able to address a container from other containers by its container name.
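
Running docker network create a second time for the same name fails, so a setup script may want to create the network only when it is missing. A minimal sketch, assuming the network name imply used above; ensure_network is a hypothetical helper built on docker network inspect, which exits non-zero when the network does not exist.

```shell
# Hypothetical helper: create a user-defined bridge network only if it
# does not already exist.
ensure_network() {
  network=$1
  if docker network inspect "$network" >/dev/null 2>&1; then
    echo "network '$network' already exists"
  else
    # With no --driver option, docker network create makes a bridge network
    docker network create "$network"
  fi
}

# Example:
# ensure_network imply
```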

Volumes

Your Docker run command will also vary depending on whether you plan to use Docker volumes or bind mounts. Note that by default, Docker runs containers on the host's primary storage device, which may not have sufficient space, particularly for Imply data nodes. You can either configure Docker to move these containers to a secondary device on the host or use volumes or bind mounts to mount another location inside the container. By default, Imply writes temporary files, logs, and caches to the /mnt directory, so this is a good path on which to mount a larger volume. See the Docker volume documentation for more information.
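
Before bind-mounting a host directory on /mnt, it may be worth checking how much space that location actually has. The sketch below uses df for this; free_gb and check_free_space are hypothetical helpers, and the 100 GB figure in the example is an arbitrary placeholder, not an Imply sizing recommendation.

```shell
# Print the free space, in whole gigabytes, on the filesystem holding a path.
free_gb() {
  # -P forces the POSIX one-line-per-filesystem layout; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Hypothetical threshold check: warn when a path has less free space than wanted.
check_free_space() {
  path=$1
  min_gb=$2
  avail=$(free_gb "$path")
  # Treat a failed df (no output) as an error rather than as "enough space"
  [ -n "$avail" ] || return 2
  if [ "$avail" -lt "$min_gb" ]; then
    echo "warning: only ${avail} GB free on $path, wanted at least ${min_gb} GB" >&2
    return 1
  fi
  return 0
}

# Example (threshold is a placeholder):
# check_free_space /mnt 100
```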

Docker Run

As an example, running the manager and agent using host networking, with a bind mount on /mnt, and using a configuration file named 'env.list' would look like this:

docker run --rm --network host --env-file env.list --mount type=bind,source=/mnt,target=/mnt imply/onprem-manager:2.9-PREVIEW1
docker run --rm --network host --env-file env.list --mount type=bind,source=/mnt,target=/mnt imply/onprem-agent:2019-03-01-PREVIEW

As another example, running the manager and agent using a user-defined network named imply with a configuration file named 'env.list' and the IMPLY_MANAGER_HOST property set to match a manager container name of 'imply-manager' would look like this:

docker run --rm --network imply --name imply-manager -p 9097:9097 -e "IMPLY_MANAGER_HOST=imply-manager" --env-file env.list imply/onprem-manager:2.9-PREVIEW1
docker run --rm --network imply -p 9095 -p 8888 -e "IMPLY_MANAGER_HOST=imply-manager" --env-file env.list imply/onprem-agent:2019-03-01-PREVIEW
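
In the agent command above, -p 9095 and -p 8888 name only the container port, so Docker assigns a free host port to each. A minimal sketch for looking up the assigned host port with docker port follows. The container name imply-agent is an assumption (the example above does not pass --name to the agent), and published_port is a hypothetical helper.

```shell
# Hypothetical helper: print the host port that Docker assigned to a
# container port published without an explicit host port.
published_port() {
  container=$1
  private_port=$2
  # docker port prints mappings such as "0.0.0.0:32768", one per line;
  # take the first mapping and keep what follows the last ":".
  docker port "$container" "$private_port" | head -n 1 | sed 's/.*://'
}

# Example (assumes the agent was started with --name imply-agent):
# published_port imply-agent 9095
```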

Once the manager container is running, you can access the Imply Manager application at http://<managerHost>:9097
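
The manager may take some time to start listening after docker run returns, so a script that starts agents or opens the application might poll the URL first. This is a sketch under assumptions: wait_for_manager is a hypothetical helper, and it assumes that an HTTP response with a status below 400 from the root URL means the application is up, which this documentation does not state.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
wait_for_manager() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (status 400 and above),
    # -sS hides the progress meter but still prints errors
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up waiting for $url after $attempts attempts" >&2
  return 1
}

# Example (replace localhost with your manager host):
# wait_for_manager http://localhost:9097 && echo "Imply Manager is reachable"
```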

Running Imply Manager in an Internet-restricted environment

Imply Manager makes outbound calls to S3 to fetch additional configurations, distribution bundles, and third-party extensions. If you are running Imply Manager in an environment that prohibits Internet access, you will have to manually fetch these files and make them available to the manager container. Please contact Imply for more information on this deployment scenario.
