
Kubernetes Scaling Reference

This topic describes the configuration options for scaling your Imply cluster on Kubernetes. For general information on how to deploy Imply on Kubernetes, see Install Imply Private on Kubernetes.

Configuring the node count

The default Helm configuration deploys one each of master, query, and data nodes. To increase the number of nodes running in the cluster, modify the value of replicaCount in values.yaml and then run helm upgrade.

For example, to increase the master replica count to three nodes, set the following in values.yaml:

master:
  replicaCount: 3
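
After you edit values.yaml, apply the change with helm upgrade. The following is a minimal sketch, assuming your release is named my-release (use the release name reported by helm list) and that you installed the chart from the imply/imply repository:

helm upgrade my-release imply/imply -f values.yaml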

Configuring available disk

To increase the disk available to data nodes you can:

  • Increase the size of the requested persistent volume.
  • Add volume requests and set the Druid segment cache to span across them. Additional volumes let you take advantage of multiple disks attached to a node.

Carefully consider the disk size you allocate to data nodes before going to production. Increasing the size of a volume is not supported by all storageClass implementations.

Increasing the size of volume claims for a StatefulSet is currently out of the scope of this documentation. Depending on the storageClass used, this may be difficult to change. See https://github.com/kubernetes/kubernetes/issues/68737 for more information.
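
To check whether your storageClass supports expansion before relying on it, one option is the following sketch, assuming you have kubectl access to the cluster (recent kubectl versions include an ALLOWVOLUMEEXPANSION column in the output):

# List storage classes and check the ALLOWVOLUMEEXPANSION column
kubectl get storageclass

# Or read the field directly for a specific class (replace <name>)
kubectl get storageclass <name> -o jsonpath='{.allowVolumeExpansion}'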

Increasing requested volume size

To make a larger initial request of volume size, modify the following section in values.yaml:

dataTier1:
  ...
  segmentCacheVolume:
    storageClassName:
    resources:
      requests:
        storage: 40Gi
  ...

Configure the corresponding Druid storage settings in the historicalRuntimeProperties key of the druid section in values.yaml. For example:

druid:
  historicalRuntimeProperties:
    - 'druid.segmentCache.locations=[{"path": "/mnt/var", "maxSize": 40000000000}]'
    - 'druid.server.maxSize=40000000000'

Keep in mind that this size is the amount of storage used for the segment cache only. Ensure about 25 GB of additional space is available for other internal processes, such as temporary storage for ingestion, heap dumps, and log files.
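
For example, to leave roughly 25 GB of headroom on top of a 40 GB segment cache, you might request a larger volume. This is a sketch only; adjust the size for your workload and storageClass:

dataTier1:
  segmentCacheVolume:
    resources:
      requests:
        storage: 65Gi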

See the Druid configuration documentation for more information.

Adding a volume claim

You can add volume claims to the extraVolumeClaimTemplates section in values.yaml. For example:

dataTier1:
  ...
  extraVolumeClaimTemplates:
    - metadata:
        name: var2
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi
  ...

After you add a volume claim, ensure that the data pods mount the new volume by updating the extraVolumeMounts section in values.yaml. For example:

dataTier1:
  ...
  extraVolumeMounts:
    - mountPath: "/mnt/var2"
      name: var2
  ...

The value of name in the extraVolumeMounts section must match the name in the extraVolumeClaimTemplates section. Do not use the reserved names var or tmp; they are used by the default segmentCacheVolume and tmpVolume. To configure Druid to use the new volume, update the historical runtime properties in values.yaml:

druid:
  historicalRuntimeProperties:
    - 'druid.segmentCache.locations=[{"path": "/mnt/var", "maxSize": 40000000000}, {"path": "/mnt/var2", "maxSize": 20000000000}]'
    - 'druid.server.maxSize=60000000000'

This adds /mnt/var2 as another available location to cache segments and sets druid.server.maxSize to the combined size of the two locations. See the Druid configuration documentation for more information.

Adding additional data tiers

Adding data tiers works much the same way as scaling. By default, dataTier1 comprises one data node. To add a second data tier, increase the replicaCount for dataTier2 to the desired count and then run helm upgrade on the cluster. All configuration options available for the default tier are also available for dataTier2 and dataTier3.
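
For example, the following minimal sketch enables a second tier with two data nodes (the replica count is illustrative):

dataTier2:
  replicaCount: 2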

To add more than three data tiers, use a separate YAML block called additionalDataTiers. Using these tiers requires agent version 9 or later. You can verify the agent version in the values file for the Helm chart under images.agent.tag.
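
For example, one way to inspect the default agent tag is the following sketch, assuming you have added the imply Helm repository and have grep available:

# Print the chart defaults and show the lines around the agent image settings
helm show values imply/imply | grep -A 3 'agent:'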

The additionalDataTiers block includes all the options that dataTier has, plus two additional ones, customMiddleManagerRuntimeProperties and customHistoricalRuntimeProperties, where you can set custom properties. These two configs are optional and can also be set in the Imply Manager UI.

Imply Manager assigns tier numbers sequentially based on the order of the list, so the first tier in additionalDataTiers is 4, the second is 5, and so on. If you modify the list, the number assigned to each tier changes. For example, if you remove the first of three entries, the previous tier 5 becomes tier 4 and the previous tier 6 becomes tier 5. This can affect where your segments are loaded, so verify that your segments are where you expect them to be any time you modify the number of additional data tiers.

The following example snippet includes entries for two additional data tiers with a subset of their available options. This deployment has five total tiers, three tiers from dataTierN and two tiers from additionalDataTiers.

additionalDataTiers example:

additionalDataTiers:
  - replicaCount: 2
    resources:
      requests:
        cpu: 400m
        memory: 1300M
      # limits:
      #   cpu:
      #   memory:
    persistence:
      enabled: false
    sysctlInitContainer:
      enabled: true
      sysctlVmMaxMapCount: 500000
      sysctlKernelThreadsMax: 999999
    podDisruptionBudget:
      maxUnavailable: 1
    customMiddleManagerRuntimeProperties: "druid.property=1,druid.property2=2" # Set custom Middle Manager properties here or in the Manager UI
    customHistoricalRuntimeProperties: "druid.property=1,druid.property2=2" # Set custom Historical properties here or in the Manager UI
    ... # Additional configs omitted for brevity
  - replicaCount: 2
    resources:
      requests:
        cpu: 400m
        memory: 1300M
      # limits:
      #   cpu:
      #   memory:
    persistence:
      enabled: false
    sysctlInitContainer:
      enabled: true
      sysctlVmMaxMapCount: 500000
      sysctlKernelThreadsMax: 999999
    podDisruptionBudget:
      maxUnavailable: 1
    customMiddleManagerRuntimeProperties: "druid.property=1,druid.property2=2" # Set custom Middle Manager properties here or in the Manager UI
    customHistoricalRuntimeProperties: "druid.property=1,druid.property2=2" # Set custom Historical properties here or in the Manager UI
    ... # Additional configs omitted for brevity

For more information about tiers, see Druid multitenancy documentation.

Adding clusters

Add a new cluster as follows:

  1. Run helm list to display a list of the currently deployed releases.

    Note the release name of the existing deployment.

  2. Create a values file for the new cluster as follows:

    helm show values imply/imply > cluster-two.yaml
  3. Edit the deployments section of cluster-two.yaml and disable everything except the agents key:

    ...
    deployments:
      manager: false
      agents: true

      zookeeper: false
      mysql: false
      minio: false
    ...

    The second cluster reuses those resources from the existing deployment.

  4. Set the managerHost value to point to the Manager service defined in the existing deployment. Also configure the name for the new cluster, for example:

    ...
    agents:
      managerHost: wanton-toucan-imply-manager-int
      clusterName: cluster-two
    ...

    In this example, wanton-toucan is the release name of the existing deployment that you found using helm list.

  5. Verify that your Kubernetes environment has enough capacity to accommodate the second Druid cluster, as defined by the cluster-two.yaml settings. For a quick capacity check, see the sketch at the end of this topic.

  6. Run the following command to deploy the second cluster:

    helm install -f cluster-two.yaml cluster-two imply/imply

Changes to the druid settings in the second cluster's values file do not override the defaults; use the Manager UI to modify those values. Only changes to the master, query, and dataTierX sections take effect.
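
To check available capacity for step 5, one option is to inspect node allocations, or current usage if metrics-server is installed. The following is a sketch, assuming kubectl access to the cluster:

# Show allocatable resources and current requests per node
kubectl describe nodes | grep -A 8 'Allocated resources'

# With metrics-server installed, show current CPU and memory usage per node
kubectl top nodes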