Imply is a real-time reporting solution that lets you rapidly ingest, visualize, slice and dice, drill down into, and aggregate critical business activity. Imply is particularly powerful for low-latency queries on high-volume, high-dimension, high-cardinality data. You can ingest data either as streams (in real time) or as static files.
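For example, batch ingestion of a static file is driven by a JSON ingestion spec submitted to the cluster. The sketch below is illustrative only: the datasource name, file path, and column names are hypothetical, and the exact fields vary by Druid version.

```json
{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "pageviews",
      "timestampSpec": {"column": "timestamp", "format": "iso"},
      "dimensionsSpec": {"dimensions": ["url", "user", "country"]},
      "granularitySpec": {"segmentGranularity": "day", "queryGranularity": "none"}
    },
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {"type": "local", "baseDir": "/data", "filter": "pageviews.json"},
      "inputFormat": {"type": "json"}
    }
  }
}
```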
Imply is offered as a cloud service or installable on-premise.
Imply deploys EC2 instances in a Virtual Private Cloud (VPC) running in your Amazon Web Services account. You own the cluster and the data.
For docs on Imply Cloud, please see here.
You can download Imply as packaged software and install it in any on-premise or cloud-based environment. If you deploy Imply on-premise, you must manage deployment, operations, and upgrades yourself. The management, easy data loading, and operations features of Imply Cloud are not available for on-premise installations.
For docs on installing Imply on-premise, please see here.
Druid is the open source analytics data store at the core of the platform. Druid enables arbitrary data exploration, low latency data ingestion, and fast aggregations at scale. Druid can scale to store trillions of events and ingest millions of events per second. Druid is best used to power user-facing data applications.
For more information about Druid, please visit http://druid.io.
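As an illustration, Druid accepts native queries expressed as JSON. The sketch below assumes a hypothetical "wikipedia" datasource with a "count" metric, and sums edits per hour over one day:

```json
{
  "queryType": "timeseries",
  "dataSource": "wikipedia",
  "granularity": "hour",
  "aggregations": [
    {"type": "longSum", "name": "edits", "fieldName": "count"}
  ],
  "intervals": ["2016-06-27/2016-06-28"]
}
```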
Imply Pivot is a web-based UI for visual data exploration. It features dimensional pivoting, slice-and-dice, and nested visualization, as well as contextual information and navigation. Use Pivot to perform OLAP operations with your data and immediately visualize your data once it is loaded in the platform.
For more information about Pivot, please visit the Pivot section.
Clarity is a dev ops and performance analytics tool that connects to your Imply Cluster. Explore anomalies, diagnose performance bottlenecks, and ensure your cluster is working optimally.
Query servers are the endpoints that users and client applications interact with. Query servers run a Druid Broker that routes queries to the appropriate Data servers. They also include an Imply Pivot server as a way to directly explore and visualize your data.
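A minimal sketch of how a client application might submit a Druid native query to the Broker on a Query server. The host name and datasource ("query-server.example.com", "wikipedia") are hypothetical; 8082 is the Broker's default port and /druid/v2 is its native query endpoint. The request is constructed but not sent, since sending requires a live cluster.

```python
import json
import urllib.request

# Hypothetical Query server host; 8082 is the Broker's default port.
BROKER_URL = "http://query-server.example.com:8082/druid/v2"

# A topN query: the 5 most-edited pages over one day, assuming a
# hypothetical "wikipedia" datasource with a "count" metric.
query = {
    "queryType": "topN",
    "dataSource": "wikipedia",
    "dimension": "page",
    "metric": "edits",
    "threshold": 5,
    "granularity": "all",
    "aggregations": [{"type": "longSum", "name": "edits", "fieldName": "count"}],
    "intervals": ["2016-06-27/2016-06-28"],
}

# Druid native queries are POSTed as JSON to the Broker.
request = urllib.request.Request(
    BROKER_URL,
    data=json.dumps(query).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With a running cluster, the Broker would return a JSON array of results:
# urllib.request.urlopen(request)  # not executed here; needs a live Broker
print(request.full_url)
print(request.get_method())
```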
Data servers store and ingest data. Data servers run Druid Historical Nodes for storage and processing of large amounts of immutable data, Druid MiddleManagers for ingestion and processing of data, and optionally Tranquility components to assist in streaming data ingestion.
For clusters with complex resource allocation needs, you can break apart the pre-packaged Data server and scale the components individually. This allows you to scale Druid Historical Nodes independently of Druid MiddleManagers, as well as eliminate the possibility of resource contention between historical workloads and real-time workloads.
The Master server coordinates data ingestion and storage in your Druid cluster. It is not involved in queries. It is responsible for starting new ingestion jobs and for handling failover of the Druid Historical Node and Druid MiddleManager processes running on your Data servers.
Master servers can be deployed standalone, or in a highly-available configuration with failover. For failover-based configurations, we recommend separating ZooKeeper and the metadata store into their own hardware. See the clustering documentation for more details.