
Event formats reference

When you use Imply Lumi in your observability workflows, you typically have an incoming stream of events for the logs you want to capture and analyze. In some cases, you might need to ingest a batch of events stored in a file. This can be useful when performing a backfill to load historical events or for a quick evaluation of your data in Lumi.

This topic lists the supported event and file compression formats you can use to batch ingest events.
The formats apply to file upload and S3 pull.

Supported formats

You can use the following event formats for batch ingestion:

  • CSV
  • JSON
  • Splunk CSV
  • Splunk HEC

The following sections describe the formats in more detail. See Format detection for the framework Lumi uses to detect the event format.

CSV

For events in generic CSV format, Lumi automatically parses each field as a user attribute. The CSV format requires:

  • A header row with at least two non-blank column names
  • One or more data rows

Consider the following example CSV:

host,timestamp,location
127.0.0.1,24/Mar/2025:16:25:29 -0500,"Chicago, IL"

Lumi generates the following user attributes, shown in alphabetical order:

host: 127.0.0.1
location: Chicago, IL
timestamp: 24/Mar/2025:16:25:29 -0500
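
As a rough illustration of the mapping above, the following Python sketch parses the same example with the standard csv module. The function name csv_to_attributes is hypothetical, and this is not Lumi's implementation; it only mirrors the two documented requirements.

```python
import csv
import io


def csv_to_attributes(text):
    """Return one dict of attributes per data row (illustrative only)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    # Documented requirement: at least two non-blank column names.
    if sum(1 for name in header if name and name.strip()) < 2:
        raise ValueError("header needs at least two non-blank column names")
    rows = [dict(row) for row in reader]
    # Documented requirement: one or more data rows.
    if not rows:
        raise ValueError("expected at least one data row")
    return rows


example = (
    "host,timestamp,location\n"
    '127.0.0.1,24/Mar/2025:16:25:29 -0500,"Chicago, IL"\n'
)
events = csv_to_attributes(example)
# Print the attributes in alphabetical order, as in the listing above.
for key in sorted(events[0]):
    print(f"{key}: {events[0][key]}")
```

The quoted value "Chicago, IL" stays a single attribute because the csv module honors the quotes around the embedded comma.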

JSON

For generic JSON, Lumi extracts each top-level field as a separate user attribute. You can format JSON events using any of the following structures:

  • Separate objects: each object is a separate event.

    For example:

    {"time": "2025-11-14T22:46:11Z", "log": "example log 1"}{"time": "2025-11-14T23:46:11Z", "log": "example log 2"}
  • Array of objects: each object in the array is an event.

    For example:

    [
    {"time": "2025-11-14T22:46:11Z", "log": "example log 1"},
    {"time": "2025-11-14T23:46:11Z", "log": "example log 2"}
    ]
  • Newline-delimited objects: each line is a separate event.

    For example:

    {"time": "2025-11-14T22:46:11Z", "log": "example log 1"}
    {"time": "2025-11-14T23:46:11Z", "log": "example log 2"}

Lumi doesn't support the following:

  • A single JSON object that contains all the events.
  • JSON files exported from Splunk.
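
To check a file against the three supported structures before uploading it, a small reader like the following can help. This is a minimal sketch in Python, not Lumi's parser; the function name iter_json_events is hypothetical.

```python
import json


def iter_json_events(text):
    """Yield one dict per event from any of the three supported layouts."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # Skip whitespace and newlines between top-level values.
        if text[pos].isspace():
            pos += 1
            continue
        value, pos = decoder.raw_decode(text, pos)
        # An array contributes one event per element; an object is one event.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if not isinstance(item, dict):
                raise ValueError("each event must be a JSON object")
            yield item


first = '{"time": "2025-11-14T22:46:11Z", "log": "example log 1"}'
second = '{"time": "2025-11-14T23:46:11Z", "log": "example log 2"}'

separate = first + second
array = "[\n" + first + ",\n" + second + "\n]"
ndjson = first + "\n" + second + "\n"

for layout in (separate, array, ndjson):
    print(len(list(iter_json_events(layout))))
```

Each of the three layouts yields the same two events. In this sketch, a single wrapper object that holds all the events would come out as one event, which is why that layout isn't useful for ingestion.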

Splunk CSV

The Splunk CSV format represents the format of a CSV file that Splunk exports. When Lumi reads this format, it looks for the fields _raw and _time, which contain the raw event message and timestamp respectively. A Splunk export includes these fields by default. See the Splunk documentation for more information.
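
Before uploading an export, you can confirm that it carries the two fields Lumi looks for. The following Python sketch assumes a made-up two-row export; the sample values, the host column, and the function name read_splunk_csv are illustrative only.

```python
import csv
import io


def read_splunk_csv(text):
    """Yield (timestamp, message) pairs from a Splunk-style CSV export."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    # The text above names _raw and _time as the fields Lumi looks for.
    missing = {"_raw", "_time"} - set(header)
    if missing:
        raise ValueError(f"missing Splunk fields: {sorted(missing)}")
    for row in reader:
        yield row["_time"], row["_raw"]


# Hypothetical sample; a real export typically has more columns.
sample = (
    '"_time","_raw","host"\n'
    '"2025-11-14T22:46:11Z","example log 1","web-01"\n'
    '"2025-11-14T23:46:11Z","example log 2","web-01"\n'
)
for timestamp, message in read_splunk_csv(sample):
    print(timestamp, message)
```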

Splunk HEC

The Splunk HEC format refers to the JSON format for an HTTP request to the Splunk HEC endpoint. At minimum, it requires the top-level field event that contains the raw event message.

Lumi automatically detects Splunk HEC when the object has the top-level field event and at least one of the Splunk event metadata fields: time, host, source, sourcetype, index, or fields.

To send an event with user attributes, include the attributes in a nested JSON object assigned to fields. If you include a custom attribute, such as status, as a top-level field, Lumi doesn't assign it as a user attribute. For more information, see Format events for HTTP event collector in the Splunk documentation.

The following example shows a Splunk HEC JSON event that includes attributes for key1 and key2:

{
  "event": "Demo log",
  "time": "2025-11-14T22:46:11Z",
  "index": "demo",
  "fields": {
    "key1": "value1",
    "key2": [
      "value2.0",
      "value2.1"
    ]
  }
}

In this case, Lumi generates the following user attributes:

index: demo
key1: value1
key2: [value2.0, value2.1]
sourcetype: httpevent

For additional examples, see Send events with Splunk HEC.
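
If you generate Splunk HEC events programmatically, a helper that keeps custom attributes under fields avoids the top-level pitfall described in this section. The following Python sketch is one way to do that; build_hec_event and looks_like_hec are hypothetical names, and the check only mirrors the detection rule stated above.

```python
import json

# Metadata fields named in this section.
HEC_METADATA = ("time", "host", "source", "sourcetype", "index", "fields")


def build_hec_event(message, attributes=None, **metadata):
    """Build a HEC-style dict; custom attributes go under "fields"."""
    # Reject anything that isn't a known metadata field, so a custom
    # attribute such as status can't end up at the top level.
    unknown = set(metadata) - set(HEC_METADATA[:-1])
    if unknown:
        raise ValueError(f"pass custom attributes via attributes: {sorted(unknown)}")
    event = {"event": message, **metadata}
    if attributes:
        event["fields"] = dict(attributes)
    return event


def looks_like_hec(obj):
    """Return True for "event" plus at least one metadata field."""
    return (
        isinstance(obj, dict)
        and "event" in obj
        and any(name in obj for name in HEC_METADATA)
    )


payload = build_hec_event(
    "Demo log",
    attributes={"key1": "value1", "key2": ["value2.0", "value2.1"]},
    time="2025-11-14T22:46:11Z",
    index="demo",
)
print(json.dumps(payload, indent=2))
print(looks_like_hec(payload))
```

The printed object matches the example event in this section.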

Compressed files

A compressed format saves storage space and streamlines data management tasks. You can ingest a compressed file that contains events in any of the supported event formats.

Lumi supports the following compression formats:

  • Brotli
  • BZIP2
  • DEFLATE
  • GZIP
  • LZMA
  • LZ4
  • Snappy
  • XZ
  • Z
  • ZSTD
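
As an example of preparing a compressed file, the following Python sketch writes newline-delimited JSON events with GZIP using only the standard library. The file name events.json.gz and the helper write_ndjson_gz are hypothetical; other formats in the list would need their own libraries or tools.

```python
import gzip
import json
import os
import tempfile


def write_ndjson_gz(events, path):
    """Write events as newline-delimited JSON, compressed with GZIP."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for event in events:
            handle.write(json.dumps(event) + "\n")


events = [
    {"time": "2025-11-14T22:46:11Z", "log": "example log 1"},
    {"time": "2025-11-14T23:46:11Z", "log": "example log 2"},
]
path = os.path.join(tempfile.mkdtemp(), "events.json.gz")
write_ndjson_gz(events, path)

# Read the file back to confirm the round trip before uploading.
with gzip.open(path, "rt", encoding="utf-8") as handle:
    restored = [json.loads(line) for line in handle]
print(restored == events)
```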

Automatic detection

Lumi automatically detects the event format using the following heuristics:

  1. Determine the format based on the base file extension. For a compressed file like .json.gz, the base file extension is .json.

    • CSV: indicated by the extension .csv
    • JSON: indicated by the extension .json or .ndjson
  2. If the file extension is inconclusive, evaluate file contents in the first 1024 bytes:

    • CSV: contains at least two rows, where the first row contains at least two non-blank columns such as time,event
    • JSON: starts with { or [
  3. If neither CSV nor JSON is detected, Lumi treats the event format as plain text and proceeds with line-based parsing.

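  4. Identify any Splunk-specific formats:

    • Splunk CSV: CSV that contains the fields _raw and _time
    • Splunk HEC: JSON where the object has the top-level field event and at least one Splunk event metadata field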

  5. If Lumi doesn't detect a Splunk format, parse events in generic CSV or JSON format, respectively.
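
The steps above can be approximated in Python as follows. This is a sketch under stated assumptions, not Lumi's detection code: the set of compression suffixes, the rule that a Splunk CSV header contains both _raw and _time, the JSON-before-CSV order in the content check, and all function names are assumptions made for illustration.

```python
import csv
import io
import json
import os

# Assumed compression suffixes; the text only shows .gz as an example.
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".xz", ".zst"}
HEC_METADATA = ("time", "host", "source", "sourcetype", "index", "fields")


def base_extension(filename):
    """Return the extension left after dropping one compression suffix."""
    stem, ext = os.path.splitext(filename.lower())
    if ext in COMPRESSION_SUFFIXES:
        stem, ext = os.path.splitext(stem)
    return ext


def sniff(head):
    """Guess csv, json, or text from the first 1024 bytes of content."""
    text = head[:1024].decode("utf-8", errors="replace")
    # Checked before CSV here because JSON lines also contain commas.
    if text.startswith(("{", "[")):
        return "json"
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if len(rows) >= 2 and sum(1 for cell in rows[0] if cell.strip()) >= 2:
        return "csv"
    return "text"


def first_object_is_hec(text):
    """Check the first decodable JSON value for the HEC field pattern."""
    try:
        value, _ = json.JSONDecoder().raw_decode(text.lstrip())
    except json.JSONDecodeError:
        return False  # for example, a value truncated mid-way
    if isinstance(value, list) and value:
        value = value[0]
    return (
        isinstance(value, dict)
        and "event" in value
        and any(name in value for name in HEC_METADATA)
    )


def detect_format(filename, head):
    """Apply the steps in order and return a format label."""
    ext = base_extension(filename)
    if ext == ".csv":
        kind = "csv"
    elif ext in (".json", ".ndjson"):
        kind = "json"
    else:
        kind = sniff(head)
    text = head.decode("utf-8", errors="replace")
    if kind == "csv":
        header = next(csv.reader(io.StringIO(text)), [])
        # Assumption: both fields must appear in the header row.
        return "splunk_csv" if {"_raw", "_time"} <= set(header) else "csv"
    if kind == "json":
        return "splunk_hec" if first_object_is_hec(text) else "json"
    return "text"


print(detect_format("events.json.gz", b'{"time": "2025-11-14T22:46:11Z", "log": "example log 1"}'))
print(detect_format("hec.json", b'{"event": "Demo log", "index": "demo"}'))
```

Here head stands for the leading bytes of the decompressed file. The first call follows the generic JSON pathway, because the object has no event field.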

The following diagram shows the decision tree of format detection for an example event. The arrows in bold show the pathway of the example event detected as generic JSON.

Format detection diagram

Manual specification

You can manually designate the format when you set up recurring ingestion with S3 pull. For example, you might have CSV-formatted data that you want to treat as individual lines of plain text. Or you might have a JSON event interpreted as Splunk HEC, which has specific requirements on how to supply custom fields.

To manually set your event format:

  1. Go to the Keys page in Lumi and find your IAM key.
  2. For the S3 pull integration, click the ellipsis and select Edit attributes.
  3. Under Format, click the drop-down and select your format.
  4. Click Save.

For more details on configuring S3 pull attributes, see Send events with S3 pull.
