Deploy on Google Kubernetes Engine

This document describes how to deploy a highly available, distributed Imply cluster on Google Cloud Platform, with Google Kubernetes Engine (GKE) acting as the underlying orchestration engine.

While this document takes you through the steps for setting up Imply with the Google Cloud Platform, it does not constitute a complete guide for GKE or Google Cloud. For more information, see the Google Kubernetes Engine documentation. Also, for general information on deploying Imply on Kubernetes, see Deploy with Kubernetes.

Google Cloud components and sample architecture

This guide takes you through the steps for creating a Google Cloud deployment depicted in the following diagram:

GCP sample setup

As shown, the Imply cluster components run in Kubernetes pods.

For external dependencies, we'll use MySQL (provided by Cloud SQL, Google Cloud's managed relational database service) for metadata storage and Google Cloud Storage for deep storage. All components will be in a single GCP project.

We recommend running all components in a dedicated VPC network. Thus, if you want to rely on an existing Cloud SQL instance, make sure you create the GKE cluster in the same VPC. Otherwise, be sure to create the new MySQL instance in the same VPC as the GKE cluster.

The machines in the VPC will need to be able to connect to the Internet. Before starting, follow the instructions in Example GKE Setup in the Google documentation to configure the VPC accordingly.


Step 1. Set up Google Kubernetes Engine (GKE)

  1. From the Google Cloud Platform home, create or select the project in which you want to deploy the GKE cluster.
  2. Click Kubernetes Engine from the left navigation pane. It may take a moment for the Kubernetes Engine API to be enabled. The Kubernetes configuration page appears:
     GCP Kubernetes setup
  3. In the Cluster basics page, enter a name for the Kubernetes cluster.
  4. Choose the desired location type for the cluster, zonal or regional. Regional (multiple zones) is recommended for production deployments requiring high availability, whereas zonal (single zone) is usually sufficient for experimental or testing environments.
  5. Click default-pool from the left pane, and select the Enable autoscaling option.
  6. Under Nodes, for the machine type, choose one of the high memory machines, such as n2-highmem-8. The default image type, Container-Optimized OS, is acceptable.
  7. For Node security, Metadata, and Automation settings, you can keep the defaults.
  8. For Networking, choose Private and select the network where the GKE cluster runs.
  9. Click Create to create the cluster.

After a moment, the new cluster should appear in the Kubernetes cluster list with a green check mark, indicating that it is running.

Note that you can check the status of the cluster by clicking Connect next to your new cluster, and then Open Workloads dashboard.
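
If you prefer the command line to the console, the settings above can be approximated with the gcloud CLI. The following is a minimal sketch under stated assumptions, not the documented Imply procedure: the network name imply-vpc, the autoscaling limits, and the control plane CIDR are hypothetical values to replace with your own. Nothing is created unless APPLY=1 is set.

```shell
# Hypothetical values; replace them with your own.
CLUSTER="imply-k8s-1"     # cluster name used in the examples in this guide
REGION="us-central1"      # regional cluster, as recommended for production
NETWORK="imply-vpc"       # assumed name of the dedicated VPC network

create_cluster() {
  # With a regional cluster, the min/max node counts apply per zone.
  # The control plane CIDR below is an example range, not a requirement.
  gcloud container clusters create "$CLUSTER" \
    --region "$REGION" \
    --machine-type n2-highmem-8 \
    --enable-autoscaling --min-nodes 1 --max-nodes 3 \
    --network "$NETWORK" \
    --enable-ip-alias \
    --enable-private-nodes \
    --master-ipv4-cidr 172.16.0.0/28
}

# Create the cluster only when explicitly requested.
if [ "${APPLY:-0}" = "1" ]; then
  create_cluster
  gcloud container clusters list --filter "name=$CLUSTER"
else
  echo "Dry run: set APPLY=1 to create cluster $CLUSTER in $REGION"
fi
```

Because the nodes are private, they still need outbound Internet access through the VPC configuration described earlier in this guide.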

Step 2. Set up MySQL metadata store

  1. From the console home, go to SQL and click Create an instance.
  2. Choose MySQL for the database type.
  3. Enter a password for the instance and enter an instance ID.
  4. For Region, choose the same region as you chose for the GKE cluster.
  5. Click show configuration options.
  6. Choose private IP and under Associated networking choose the VPC used for the Kubernetes cluster.
  7. Click Create.

When the instance finishes deploying, note the address shown in the Private IP address column. You will need to enter this in the YAML configuration file later.

GCP Kubernetes setup
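
The console steps above have a rough gcloud equivalent, sketched below. This is an illustration under assumptions rather than a prescribed setup: the instance ID imply-metadata, the network name imply-vpc, the MySQL version, and the database names imply_manager and imply_druid are all hypothetical, and the sketch assumes private services access is already configured on the VPC. The root password is read from the MYSQL_ROOT_PASSWORD environment variable, and nothing is created unless APPLY=1 is set.

```shell
# Hypothetical values; replace them with your own.
INSTANCE="imply-metadata"          # assumed Cloud SQL instance ID
REGION="us-central1"               # same region as the GKE cluster
PROJECT="possible-router-145522"   # example project ID used in this guide
NETWORK="imply-vpc"                # assumed VPC network name

create_metadata_store() {
  # Keep the password out of the script; fail early if it is missing.
  : "${MYSQL_ROOT_PASSWORD:?set MYSQL_ROOT_PASSWORD first}"

  # Private IP only: --no-assign-ip skips the public address.
  gcloud sql instances create "$INSTANCE" \
    --database-version MYSQL_8_0 \
    --region "$REGION" \
    --network "projects/$PROJECT/global/networks/$NETWORK" \
    --no-assign-ip \
    --root-password "$MYSQL_ROOT_PASSWORD"

  # One database each for Imply Manager and Druid, named without hyphens.
  for db in imply_manager imply_druid; do
    gcloud sql databases create "$db" --instance "$INSTANCE"
  done

  # The private address appears under ipAddresses in the output.
  gcloud sql instances describe "$INSTANCE"
}

# Create the instance only when explicitly requested.
if [ "${APPLY:-0}" = "1" ]; then
  create_metadata_store
else
  echo "Dry run: set APPLY=1 to create Cloud SQL instance $INSTANCE"
fi
```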

Step 3. Set up Google Cloud Storage

  1. Go to the Storage browser and click CREATE BUCKET.
  2. Be sure to select the same region for the new bucket as you chose for the Kubernetes cluster.

The following shows the remaining configuration settings, which can remain at their default values:

GCP Kubernetes setup
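
The bucket can also be created from the command line. The sketch below assumes the bucket name imply-k8s (matching the gs://imply-k8s path used in Step 5) and the us-central1 region; bucket names are globally unique, so you will likely need a different name. All other settings stay at their defaults, and nothing is created unless APPLY=1 is set.

```shell
# Hypothetical values; replace them with your own.
BUCKET="imply-k8s"       # bucket names are global, so pick a unique one
REGION="us-central1"     # same region as the GKE cluster

create_bucket() {
  gcloud storage buckets create "gs://$BUCKET" --location "$REGION"
}

# Create the bucket only when explicitly requested.
if [ "${APPLY:-0}" = "1" ]; then
  create_bucket
else
  echo "Dry run: set APPLY=1 to create gs://$BUCKET in $REGION"
fi
```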

Step 4. Connect to the GKE Cluster

  1. In the cluster page, click Connect next to the new cluster.

  2. Get credentials for the GKE cluster by copying the command shown and running it in the Google Cloud CLI (gcloud). The command should be similar to the following:
    gcloud container clusters get-credentials imply-k8s-1 --region us-central1 --project possible-router-145522

    Where imply-k8s-1 is the GKE cluster name and possible-router-145522 is the project ID.
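
To confirm that the credentials work, you can select the new kubeconfig context and list the cluster nodes. The sketch below relies on the naming pattern that get-credentials uses for GKE contexts, gke_<project>_<location>_<cluster>, and reuses the example names from the command above. It only prints the expected context name unless APPLY=1 is set.

```shell
# Example values from the get-credentials command in this guide.
PROJECT="possible-router-145522"
REGION="us-central1"
CLUSTER="imply-k8s-1"

# Context name that get-credentials writes to the kubeconfig file.
CONTEXT="gke_${PROJECT}_${REGION}_${CLUSTER}"
echo "Expected context: $CONTEXT"

verify_cluster_access() {
  kubectl config use-context "$CONTEXT"
  # A list of nodes in Ready state indicates the connection works.
  kubectl get nodes
}

# Touch the kubeconfig only when explicitly requested.
if [ "${APPLY:-0}" = "1" ]; then
  verify_cluster_access
fi
```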

Step 5. Install Imply Druid

You can now follow the instructions in the Kubernetes deployment guide to complete the installation using Helm:

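  1. Add the Imply repository to Helm by running the following, where {imply-helm-repo-url} is the URL of the Imply Helm chart repository:
    helm repo add imply {imply-helm-repo-url}
    helm repo update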

    See Deploy with Kubernetes for introductory information on using Helm with Imply.

  2. Create a values.yaml file, populating it with the default values from the latest Imply Helm chart:
    helm show values imply/imply > values.yaml
  3. In values.yaml, disable the bundled MinIO and MySQL deployments by setting them to false, since you're using Google Cloud Storage and the Cloud SQL MySQL instance instead.
  4. Add the license key as a Kubernetes secret, typically from a file:
    kubectl create secret generic imply-secrets --from-file=IMPLY_MANAGER_LICENSE_KEY
  5. Add metadata store configuration:
    1. Use the GCP MySQL IP, username, and password.
    2. Specify the name of the database you created.

      Note that the default database name in the Imply Helm chart contains a hyphen, which is not permitted in Cloud SQL database names. Be sure to replace it with the name of the database you created in Step 2 above, or change the default name.

    3. If you want to only allow SSL, add the certificate.
  6. Configure the same settings for the Druid metadata store configuration.
  7. Configure Google Cloud Storage as the deepStorage resource by setting the Path to gs://imply-k8s. (No username or password is required, since the storage lives in the same project.) See other Google deep storage settings for more information.
  8. Run:
    helm install {release-name} imply/imply -f values.yaml
    Where {release-name} is the deployment name you chose.
  9. To access the Imply Druid cluster and Imply Manager locally, make sure you've retrieved the cluster credentials with the Google Cloud CLI (see Step 4) and switched the Kubernetes context to the GKE cluster:
    kubectl config use-context {gke-cluster-context}
    Where {gke-cluster-context} is the context name created for the GKE cluster, as listed by kubectl config get-contexts.
  10. Use port forwarding to access Imply Manager, following the instructions presented when the Helm installation finishes.

For more general information on adapting the Helm chart for your deployment, see Deploy with Kubernetes.
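
The values.yaml changes described in the steps above can be kept in a small override file rather than edited in place. The sketch below is illustrative only: the key names (deployments, manager.metadataStore, druid.metadataStore, druid.deepStorage) and the deep storage type value are assumptions about the chart layout, so compare them with the values.yaml you generated before using them. The IP address, user, database names, and release name are hypothetical placeholders. The file is written immediately, but Helm runs only when APPLY=1 is set.

```shell
# Hypothetical values; replace them with your own.
RELEASE="imply"                  # stands in for {release-name}
MYSQL_HOST="10.0.0.3"            # private IP noted in Step 2 (example value)
MYSQL_USER="root"
MANAGER_DB="imply_manager"       # assumed database names, no hyphens
DRUID_DB="imply_druid"
BUCKET_PATH="gs://imply-k8s"

# Reject hyphenated database names before they reach the chart.
for db in "$MANAGER_DB" "$DRUID_DB"; do
  case "$db" in
    *-*) echo "database name '$db' contains a hyphen" >&2; exit 1 ;;
  esac
done

# Key names below are assumptions; check them against your values.yaml.
cat > gke-overrides.yaml <<EOF
deployments:
  mysql: false
  minio: false
manager:
  metadataStore:
    type: mysql
    host: ${MYSQL_HOST}
    port: 3306
    user: ${MYSQL_USER}
    password: "${MYSQL_PASSWORD:-change-me}"
    database: ${MANAGER_DB}
druid:
  metadataStore:
    type: mysql
    host: ${MYSQL_HOST}
    port: 3306
    user: ${MYSQL_USER}
    password: "${MYSQL_PASSWORD:-change-me}"
    database: ${DRUID_DB}
  deepStorage:
    type: google
    path: "${BUCKET_PATH}"
EOF

install_imply() {
  # Later -f files override earlier ones, so values.yaml stays untouched.
  helm install "$RELEASE" imply/imply -f values.yaml -f gke-overrides.yaml
}

# Install only when explicitly requested.
if [ "${APPLY:-0}" = "1" ]; then
  install_imply
else
  echo "Dry run: wrote gke-overrides.yaml; set APPLY=1 to install $RELEASE"
fi
```

Keeping the overrides in a separate file is a design choice, not a requirement: it leaves the chart defaults in values.yaml intact, which makes later chart upgrades easier to compare.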

Next steps

That's it! You now have a cluster running on GKE. If you are getting to know Imply, learn about how to load data in the quickstart.

For ongoing administration and maintenance, see Deploy with Kubernetes and Using the Imply Manager.




