Supported file formats

To query data with Apache Druid, you first ingest it from an input source. This topic is a reference for the supported input formats and sources.

Natively supported input formats by type

Druid supports the following input file types natively without loading an extension:

JSON

Comma-separated values (CSV)

  • Streaming ingestion: Kafka, Kinesis
  • Batch ingestion: Index parallel (index)
    See CSV data format.
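
As a rough illustration of how a CSV input format might be declared in an ingestion spec, the sketch below builds the `inputFormat` object as a Python dictionary and prints it as JSON. The helper name `csv_input_format` and the column names are hypothetical and only serve the example. The `type`, `columns`, and `listDelimiter` fields are taken from the CSV data format reference, which remains the authority on the full set of options.

```python
import json


def csv_input_format(columns, list_delimiter=None):
    """Build a CSV inputFormat object as a plain dict (hypothetical helper)."""
    spec = {"type": "csv", "columns": list(columns)}
    # listDelimiter is optional; include it only when the caller sets it.
    if list_delimiter is not None:
        spec["listDelimiter"] = list_delimiter
    return spec


# Column names below are made up for illustration.
csv_spec = csv_input_format(["timestamp", "page", "added"], list_delimiter="|")
print(json.dumps(csv_spec, indent=2))
```

The resulting object would sit under `ioConfig.inputFormat` in either a streaming supervisor spec or a batch task spec.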

Tab-separated values (TSV) and custom delimiters

  • Streaming ingestion: Kafka, Kinesis
  • Batch ingestion: Index parallel (index)
    See TSV data format.
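
Custom delimiters are handled through the same `tsv` input format by setting its `delimiter` field. The sketch below assumes a pipe-delimited file. The helper name `delimited_input_format` and the column names are hypothetical, and the TSV data format reference should be checked for the complete option list.

```python
import json


def delimited_input_format(columns, delimiter="\t"):
    """Build a TSV-style inputFormat dict with a configurable delimiter
    (hypothetical helper; the default is a tab character)."""
    return {"type": "tsv", "columns": list(columns), "delimiter": delimiter}


# Assumed example: a pipe-delimited file with three made-up columns.
pipe_spec = delimited_input_format(["timestamp", "page", "added"], delimiter="|")
print(json.dumps(pipe_spec, indent=2))
```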

Input formats supported with an extension by type

Druid supports the following input file types through an extension:

Avro OCF

Avro Stream

Parquet

ORC

Protobuf
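
Formats in this group become available only after the corresponding extension is added to `druid.extensions.loadList` in the common runtime properties. The sketch below shows one way to derive that property line from the formats you plan to ingest. The mapping uses the core extension names as I understand them (`druid-avro-extensions`, `druid-parquet-extensions`, `druid-orc-extensions`, `druid-protobuf-extensions`), so verify them against your Druid version. The helper name `load_list_property` is hypothetical.

```python
import json

# inputFormat type -> extension assumed to provide it.
EXTENSION_FOR_FORMAT = {
    "avro_ocf": "druid-avro-extensions",
    "avro_stream": "druid-avro-extensions",
    "parquet": "druid-parquet-extensions",
    "orc": "druid-orc-extensions",
    "protobuf": "druid-protobuf-extensions",
}


def load_list_property(format_types):
    """Return a druid.extensions.loadList line covering the given formats."""
    extensions = []
    for fmt in format_types:
        ext = EXTENSION_FOR_FORMAT[fmt]
        # Both Avro formats share one extension, so skip duplicates.
        if ext not in extensions:
            extensions.append(ext)
    return "druid.extensions.loadList=" + json.dumps(extensions)


property_line = load_list_property(["avro_ocf", "avro_stream", "parquet"])
print(property_line)
```

In a real deployment the load list usually contains other extensions as well, so the generated value would be merged with the existing entries instead of replacing them.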

Supported input formats by ingestion type

This section lists the file formats Druid supports for each ingestion type.

Streaming ingestion

Druid supports the following file formats for streaming ingestion:

Kafka

Kinesis

Azure blob

Batch ingestion

Druid supports the following file formats for batch ingestion:

Index parallel (index)

Hadoop batch ingestion

Learn more

See the following topics for more information: