Execute a Proof of Concept

Once you've signed up for the 30-day free trial of Imply Polaris and checked out the Quickstart, you might want to execute a Proof of Concept (POC) to evaluate whether Polaris meets your requirements and expectations.

This guide helps you make the most of Polaris and is organized around key test milestones so you can use the full 30 days of your trial effectively. During the trial, you can use Polaris continuously for the entire period without incurring any additional charges.

If you need more help to analyze your business case and requirements, contact Imply. You can also join Polaris support on Slack to get help and advice.

Define your POC requirements

A modern analytics application must provide all of the following:

  • Interactive analytics at any scale
  • High concurrency at best value
  • Insights on batch and/or streaming data

The following guidelines provide a blueprint for structuring a POC to validate these requirements.

We recommend that you take the following approach:

  1. Define your success criteria
  2. Define the scope of the data
  3. Ingest a small dataset using streaming or batch ingestion
  4. Query and analyze the data
  5. Test and monitor the data
  6. Use the Polaris API
  7. Iterate
  8. Evaluate POC success
  9. Consider next steps

Define your success criteria

Specify the desired outcomes of the trial. Some examples of success criteria are as follows:

  • Streaming ingestion:
    • Ingest JSON and Avro data with latency less than x seconds
    • Consume data directly from Confluent Cloud
    • Surface ingested data in the Polaris UI within x seconds
  • Querying and analytics:
    • Execute x% of queries in x seconds or less
    • Run queries over x records for a variety of time periods
    • Create alerts for slow-running queries and ingestion
    • Set up resource-based access control at a row and column level
  • Monitoring:
    • Monitor for slow-running queries and ingestion
  • API:
    • Use the API to create and update users and groups

Define the scope of the data

Determine the attributes and formats of the data you want to test in Polaris. Create synthetic data or select production data based on these requirements.

During your trial, you can store up to 200 GB of data (input size can be up to 1 TB of JSON, CSV, and TSV data) in Polaris.

Ingest data

Choose your ingestion method: batch, streaming, or both.

Streaming ingestion

Evaluate the throughput needed to support your analytics applications. Consider the number of messages per second or per minute (x) and the size of messages (y).
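
For example, an illustrative workload of 5,000 messages per second with an average message size of 1 KB works out to roughly 5 MB of raw event data per second, or about 430 GB per day before rollup. Working through this arithmetic with your own values for x and y helps you confirm that your test fits within the trial limits.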

Polaris supports real-time ingestion through the Push ingestion API, Apache Kafka (and Kafka-compatible APIs), and Amazon Kinesis. Use one of these methods to ingest real-time data and observe it appearing in Polaris tables almost immediately.

See the following pages to learn more about streaming ingestion options in Polaris:

  • Ingest events with the Kafka connector
  • Ingest data from Apache Kafka and Amazon MSK by API
  • Ingest data from Amazon Kinesis by API
  • Ingest data from Confluent Cloud by API
  • Push event data by API
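
For example, the following Python sketch pushes a small batch of events to a push-enabled connection. The endpoint URL, connection name, authorization header, and event fields are placeholders to adapt to your own setup; see Push event data by API and the authentication documentation for the exact request format.

    import json
    import time

    import requests  # third-party HTTP client

    # Placeholders: substitute your organization's Polaris endpoint, a push-enabled
    # connection name, and an API key with permission to push events.
    PUSH_URL = "https://ORGANIZATION.REGION.CLOUD.api.imply.io/v1/events/CONNECTION_NAME"
    HEADERS = {
        "Authorization": "Basic YOUR_API_KEY",  # confirm the scheme in the auth docs
        "Content-Type": "application/json",
    }

    def push_events(events):
        # Send events as newline-delimited JSON in a single request.
        payload = "\n".join(json.dumps(event) for event in events)
        response = requests.post(PUSH_URL, data=payload, headers=HEADERS, timeout=10)
        response.raise_for_status()

    # Illustrative events with an explicit millisecond timestamp field.
    now_ms = int(time.time() * 1000)
    push_events([
        {"timestamp": now_ms, "channel": "web", "clicks": 3},
        {"timestamp": now_ms, "channel": "mobile", "clicks": 1},
    ])

Wrapping a loop and a timer around push_events gives a rough measurement of messages per second against your streaming criterion.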

Criteria for POC

Ingest x number of messages per second or per minute with an average message size of y bytes.

Batch ingestion

If batch ingestion is key to your application, consider the following questions:

  • What is the format of the data? (x)
  • How much data do you need to ingest in a batch? (y)
  • How often will you need to ingest each batch of data? (z)

See the following pages to learn more about batch ingestion options in Polaris:

  • Ingest data from files by API
  • Ingest data from Amazon S3 by API
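
To exercise a batch criterion end to end, a script along the lines of the sketch below can upload a file and then poll the resulting ingestion job until it finishes. The base URL, endpoint paths, and job payload are illustrative rather than the authoritative API reference; see Ingest data from files by API for the exact request bodies.

    import time

    import requests

    # Placeholders: substitute your Polaris API base URL, project ID, and API key.
    BASE = "https://ORGANIZATION.REGION.CLOUD.api.imply.io/v1/projects/PROJECT_ID"
    HEADERS = {"Authorization": "Basic YOUR_API_KEY"}

    def upload_file(path):
        # Upload a local batch file so that an ingestion job can reference it.
        with open(path, "rb") as f:
            response = requests.post(f"{BASE}/files", headers=HEADERS, files={"file": f})
        response.raise_for_status()

    def start_batch_job(table_name, file_name):
        # Minimal illustrative job spec; the real payload schema is defined in the
        # ingestion job API reference and may differ from this sketch.
        job = {
            "type": "batch",
            "target": {"type": "table", "tableName": table_name},
            "source": {"type": "uploaded", "fileList": [file_name]},
        }
        response = requests.post(f"{BASE}/jobs", headers=HEADERS, json=job)
        response.raise_for_status()
        return response.json()["id"]

    def wait_for_job(job_id, poll_seconds=30):
        # Poll the job until it reaches a terminal state, then return that state.
        while True:
            response = requests.get(f"{BASE}/jobs/{job_id}", headers=HEADERS)
            response.raise_for_status()
            status = response.json().get("status")
            if status in ("COMPLETED", "FAILED", "CANCELED"):
                return status
            time.sleep(poll_seconds)

Timing how long the job takes for a y GB file, repeated every z hours, gives you direct evidence against the batch criterion.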

Criteria for POC

Ingest data with format x and size y GB every z hours. Validate that the data appears in your Polaris tables.

Query and analyze the data

Interactive drill-down analytics are a key feature of an analytics application. Polaris gives you the ability to query, explore, and analyze your data with visually rich, intuitive tools. See Analytics overview for more information.

Before you validate this requirement, decide what's most important for your users to see, what kind of interactive exploration you want to enable, and a suitable period of time for which to query and analyze data.

Criteria for POC

Create data cubes, dashboards, and other resources as required to analyze data for a defined period of time. Investigate and troubleshoot anomalies.

Test and monitor the data

Develop a good understanding of the most frequently used query patterns for your application. You can obtain these queries from the monitoring pages in the Polaris UI—see Monitoring for details.

You might want to build a workload that includes a representative mix of your query patterns in a load testing tool such as Apache JMeter or Locust. These tools are designed to load test functional behavior and measure performance.
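
As a minimal sketch of such a workload in Locust, the class below replays one representative SQL query per simulated user. The host, query endpoint, SQL statement, and authorization header are placeholders for your own environment.

    from locust import HttpUser, between, task

    class PolarisQueryUser(HttpUser):
        # Placeholder: substitute your organization's Polaris API base URL.
        host = "https://ORGANIZATION.REGION.CLOUD.api.imply.io"
        # Each simulated user waits 1 to 2 seconds between queries.
        wait_time = between(1, 2)

        @task
        def representative_query(self):
            # Illustrative query endpoint and SQL; replace them with a query taken
            # from your own monitoring pages.
            self.client.post(
                "/v1/projects/PROJECT_ID/query/sql",
                json={"query": 'SELECT channel, COUNT(*) FROM "my_table" GROUP BY 1'},
                headers={"Authorization": "Basic YOUR_API_KEY"},
            )

Running Locust with a fixed number of users and watching request rates and response times gives a first approximation of how your query mix behaves under concurrency.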

Define the number of queries (x) your application should complete in a specified time period, typically one or two seconds (y). Try to be realistic—it's common to overestimate the required query frequency, which impacts query patterns and data model design.

Your trial has limited resources. If you need to execute more than 10 queries per second, contact Imply.

Criteria for POC

Ensure that Polaris can support x representative queries per y seconds.

Use the Polaris API

APIs enable you to programmatically build an application around your database, and automate ingestion and other processes.

See the following pages to learn more about the Polaris API:

  • API overview
  • API authentication
  • Polaris API reference
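
As a simple smoke test of programmatic access, a sketch like the following authenticates with an API key and lists the tables in a project. The base URL, path, and response shape are illustrative; confirm them against the API authentication and OpenAPI reference pages.

    import requests

    # Placeholders: substitute your organization's Polaris API base URL, project ID,
    # and an API key with the appropriate permissions.
    BASE = "https://ORGANIZATION.REGION.CLOUD.api.imply.io"
    HEADERS = {"Authorization": "Basic YOUR_API_KEY"}

    # List the tables in a project to verify authentication and connectivity.
    response = requests.get(f"{BASE}/v1/projects/PROJECT_ID/tables", headers=HEADERS)
    response.raise_for_status()
    for table in response.json().get("values", []):
        print(table.get("name"))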

Criteria for POC

Confirm that the Polaris API supports the tasks you need for your application.

Guidance on iteration

When you execute the POC, plan an initial iteration followed by one or more subsequent iterations. Iterations should have the following characteristics:

First iteration:

  • Test Polaris functionality with a small amount of data.
  • Start with a sample dataset of 1 GB or less.

Second and subsequent iterations:

  • Ingest a significant portion of test data using your preferred ingestion methods—1-5% of your estimated production data volume is a good starting point.
  • Ingest data that covers the full timeframe of your production needs.
  • For batch ingestion:
    • Create incremental batch files.
    • Use cron, Apache Airflow, or another scheduling tool to upload the incremental files and schedule ingestion API requests (see the scheduling sketch after this list).
    • Replace existing data, if that's one of your requirements. See Replace data for more information.
  • For streaming ingestion:
    • Increase your throughput by pushing event data by API, or ingest events from Kafka or Kinesis.
  • Test and tune queries produced in the first iteration until they meet your success criteria. See the following topics for details:
    • Data rollup for information on aggregating raw data during ingestion.
    • Data partitioning for information on partitioning, sorting, and clustering.
    • Approximation algorithms for information on using sketches to trade accuracy for reduced storage and improved performance.
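
To illustrate the scheduling step for incremental batch files, here is a minimal Apache Airflow sketch that runs an hourly upload-and-ingest script. The DAG name, schedule, and script path are hypothetical, and cron or another scheduler works just as well.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Hypothetical script that uploads the latest incremental file and starts an
    # ingestion job through the Polaris API (for example, as sketched earlier).
    INGEST_COMMAND = "python /opt/polaris/upload_and_ingest.py"

    with DAG(
        dag_id="polaris_incremental_ingest",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@hourly",  # newer Airflow releases use the schedule argument
        catchup=False,
    ) as dag:
        BashOperator(task_id="upload_and_ingest", bash_command=INGEST_COMMAND)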

Evaluate POC success

Evaluate how well your completed POC has satisfied your success criteria.

Next steps

If you're satisfied with what Polaris can provide at the end of your free trial, you can continue to use your Polaris database with pay-as-you-go billing. You can continue to test larger datasets and workloads once you transfer to pay-as-you-go.

If you have questions or need more time to evaluate Polaris, contact Imply for help.

Learn more

See the following pages for further information:

  • Polaris documentation
  • Polaris billing plans