Tutorial: HTTP push

In this tutorial, you'll load your own streams over HTTP using Tranquility Server.

Imply additionally supports a wide variety of batch and streaming loading methods. See the Loading data page for more information about other options, including Kafka, Hadoop, HTTP, Storm, Samza, Spark Streaming, and your own JVM apps.

Prerequisites

You will need:

  • Java 8 or later
  • Node.js 4.5.x or later
  • Linux, Mac OS X, or other Unix-like OS (Windows is not supported)
  • At least 4GB of RAM

On Mac OS X, you can use Oracle's JDK 8 to install Java and Homebrew to install Node.js.

On Linux, your OS package manager should be able to install both Java and Node.js. If your Ubuntu-based OS does not have a recent enough version of Java, WebUpd8 offers packages for those OSes. If your Debian, Ubuntu, or Enterprise Linux OS does not have a recent enough version of Node.js, NodeSource offers packages for those OSes.

Start Imply

If you've already installed and started Imply using the quickstart, you can skip this step.

First, download Imply 2.3.4 from imply.io/get-started and unpack the release archive.

tar -xzf imply-2.3.4.tar.gz
cd imply-2.3.4

Next, you'll need to start up Imply, which includes Druid, Imply Pivot, and ZooKeeper. You can use the included supervise program to start everything with a single command:

bin/supervise -c conf/supervise/quickstart.conf

You should see a log message printed out for each service that starts up. You can view detailed logs for any service by looking in the var/sv/ directory using another terminal.

Later on, if you'd like to stop the services, press CTRL-C in the terminal where the supervise program is running. If you want a clean start after stopping the services, remove the var/ directory and then start up again.

Enable Tranquility Server

Imply includes Tranquility Server to support loading data over HTTP. To enable this in the Imply quickstart-based configuration:

  • In your conf/supervise/quickstart.conf, uncomment the tranquility-server line.
  • Stop your bin/supervise command (CTRL-C or bin/service --down) and then restart it by again running bin/supervise -c conf/supervise/quickstart.conf.
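
If you'd rather script the first step, a minimal sketch follows. It assumes the line is commented out with a leading "#" and contains the bin/tranquility server command that supervise launches; enable_tranquility_server is a hypothetical helper name, not something shipped with Imply. Check your quickstart.conf before relying on it.

```shell
# Hypothetical helper: uncomment the tranquility-server line in a supervise conf file.
# Assumption: the line starts with "#" and contains "bin/tranquility server".
enable_tranquility_server() {
  local conf tmp status
  conf="$1"
  tmp=$(mktemp) || return 1
  # Strip only the leading "#" from matching lines; other comments are left alone.
  # Writing through a temp file avoids the GNU/BSD differences of "sed -i".
  sed '/^#.*bin\/tranquility server/ s/^#//' "$conf" > "$tmp" && cat "$tmp" > "$conf"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage (from the Imply directory):
# enable_tranquility_server conf/supervise/quickstart.conf
```

After running it, restart supervise as described in the second step.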

As part of the output of supervise you should see something like:

Running command[tranquility-server], logging to[/home/imply/imply-2.3.4/var/sv/tranquility-server.log]: bin/tranquility server -configFile conf-quickstart/tranquility/server.json

You can check the log file in var/sv/tranquility-server.log to confirm that the server is starting up properly.

Send data

Let's send some data!

We've included a script that generates random sample metrics to load into the tutorial-tranquility-server datasource. To use it, run:

bin/generate-example-metrics | curl -XPOST -H'Content-Type: application/json' --data-binary @- http://localhost:8200/v1/post/tutorial-tranquility-server

This will print something like:

{"result":{"received":25,"sent":25}}

This indicates that the HTTP server received 25 events from you, and sent 25 to Druid. This command may generate a "connection refused" error if you run it too quickly after enabling Tranquility Server, which means the server has not yet started up. It should start up within a few seconds. The command may also take a few seconds to finish the first time you run it, during which time Druid resources are being allocated to the ingestion task. Subsequent POSTs will complete quickly once this is done.
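
If you script this step, a small retry loop can absorb the "connection refused" window while the server starts. The following is a sketch, not part of the Imply distribution: retry and post_sample_metrics are hypothetical helper names, and the attempt count and delay are arbitrary choices.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have failed.
retry() {
  local max_attempts delay attempt
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical wrapper around the tutorial's command. The pipeline's exit
# status is curl's, and --fail makes curl exit non-zero on HTTP errors too.
post_sample_metrics() {
  bin/generate-example-metrics |
    curl --silent --show-error --fail -XPOST -H'Content-Type: application/json' \
      --data-binary @- http://localhost:8200/v1/post/tutorial-tranquility-server
}

# Usage (from the Imply directory): up to 10 attempts, 2 seconds apart.
# retry 10 2 post_sample_metrics
```

Each retry regenerates the sample metrics, so the events carry fresh timestamps on every attempt.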

Once the data is sent to Druid, you can immediately query it.

Query data

You can use any of the supported query methods. To start off, try a SQL query:

$ bin/dsql
dsql> SELECT server, SUM("count") AS "events", COUNT(*) AS "rows" FROM "tutorial-tranquility-server" GROUP BY server;
┌──────────────────┬────────┬──────┐
│ server           │ events │ rows │
├──────────────────┼────────┼──────┤
│ www1.example.com │      9 │    6 │
│ www2.example.com │     11 │    6 │
│ www3.example.com │     11 │    5 │
│ www4.example.com │     11 │    7 │
│ www5.example.com │      8 │    4 │
└──────────────────┴────────┴──────┘
Retrieved 5 rows in 0.02s.

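Because Druid's rollup feature combines events that share the same truncated timestamp and dimension values into a single stored row, COUNT(*), which counts stored rows, may return a smaller number than SUM("count"), which counts the original events.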
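
A toy model can make the difference between rows and events concrete. The sketch below is not Druid's implementation: it assumes each input line is one event made of a truncated timestamp and a server name, and rollup_counts is a hypothetical helper that collapses identical lines the way rollup collapses events with identical dimension values.

```shell
# Toy rollup: events with the same (timestamp, server) pair collapse into one row.
# "rows" plays the role of COUNT(*); "events" plays the role of SUM("count").
rollup_counts() {
  awk '{ key = $1 " " $2; count[key]++; events++ }
       END {
         rows = 0
         for (k in count) rows++
         printf "rows=%d events=%d\n", rows, events
       }'
}

# Four events, two of which share a timestamp and server.
printf '%s\n' \
  '12:00 www1.example.com' \
  '12:00 www1.example.com' \
  '12:00 www2.example.com' \
  '12:01 www1.example.com' | rollup_counts
# prints: rows=3 events=4
```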

Next, try configuring a data cube in Pivot:

  1. Navigate to Pivot at http://localhost:9095/pivot.
  2. Click on the Plus icon in the top right of the header bar and select "New data cube".
  3. Select the source "druid: tutorial-tranquility-server" and ensure "Auto-fill dimensions and measures" is checked.
  4. Click "Next: configure data cube".
  5. Click "Create cube". You should see the confirmation message "Data cube created".
  6. View your new data cube by clicking the Home icon in the top-right and selecting the "Tutorial Tranquility Server" cube you just created.

Load your own data

So far, you've loaded data using an ingestion spec that we've included in the distribution. Each ingestion spec is designed to work with a particular dataset. You can load your own datasets by writing a custom ingestion spec.

To customize Tranquility Server ingestion, you can edit the conf-quickstart/tranquility/server.json configuration file. See the Tranquility documentation for more details about how to interpret and modify the configuration. After updating the configuration, you can restart Tranquility Server by running:

bin/service --restart tranquility-server

Note that when sending your own streaming data, you must ensure that the timestamp is recent enough (within windowPeriod of the current time). Older events will not be sent to Druid. See the Segment granularity and windowPeriod section of the Tranquility documentation for more details.
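
As a rough illustration of the timestamp requirement, the sketch below stamps each event with the current UTC time and checks the server's response. It rests on several assumptions: that your server.json reads an ISO 8601 "timestamp" field, that the other field names match your own schema, and that events dropped for being outside windowPeriod show up as a "sent" count lower than "received". make_event and all_sent are hypothetical helper names.

```shell
# Print one JSON event stamped with the current UTC time.
# Field names are illustrative assumptions; match them to your server.json.
make_event() {
  local server value now
  server="$1"
  value="$2"
  now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  printf '{"timestamp":"%s","server":"%s","metricType":"request/latency","value":%s}\n' \
    "$now" "$server" "$value"
}

# Succeed only if the "received" and "sent" counters in a response match.
all_sent() {
  local response received sent
  response="$1"
  received=$(printf '%s\n' "$response" | sed -n 's/.*"received":\([0-9][0-9]*\).*/\1/p')
  sent=$(printf '%s\n' "$response" | sed -n 's/.*"sent":\([0-9][0-9]*\).*/\1/p')
  [ -n "$received" ] && [ "$received" = "$sent" ]
}

# Usage ("your-datasource" is a placeholder for the dataSource in your config):
# response=$(make_event www1.example.com 42 |
#   curl -s -XPOST -H'Content-Type: application/json' \
#     --data-binary @- http://localhost:8200/v1/post/your-datasource)
# all_sent "$response" || echo "some events were not sent: $response" >&2
```

Generating the timestamp at send time keeps events inside the window; replaying historical data this way would not work, because the original timestamps would be too old.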
