Event formats reference
When you use Imply Lumi in your observability workflows, you typically have an incoming stream of events for the logs you want to capture and analyze. In some cases, you might need to ingest a batch of events stored in a file. This can be useful when performing a backfill to load historical events or for a quick evaluation of your data in Lumi.
This topic lists the supported event and file compression formats you can use to batch ingest events.
The formats apply to file upload and S3 pull.
Supported formats
You can use the following event formats for batch ingestion:
- CSV (generic)
- JSON (generic)
- Splunk® CSV
- Splunk® HEC
- Plain text (not supported in file upload)
The following sections describe the formats in more detail. See Automatic detection for how Lumi detects the event format and Manual specification for how to designate a format explicitly.
CSV
For events in generic CSV format, Lumi automatically parses each field as a user attribute.
Consider the following example CSV:
host,timestamp,location
127.0.0.1,24/Mar/2025:16:25:29 -0500,"Chicago, IL"
Lumi generates the following user attributes, shown in alphabetical order:
host: 127.0.0.1
location: Chicago, IL
timestamp: 24/Mar/2025:16:25:29 -0500
When you manually specify CSV instead of using format auto-detection, you can provide the following settings:
- Delimiter: Defaults to `,` for CSV. Specify any single character, or `\t` for TSV.
- Rows to skip: Defaults to 0. Set this value if you want to skip any preamble rows or override headers in the file.
- Custom headers: By default, Lumi infers headers from the file. You can provide your own header names, separated by commas (regardless of your delimiter). For example, `timestamp,message,host`.
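As an illustration of how these settings interact, here is a minimal sketch in Python. The parameter names `rows_to_skip` and `custom_headers` are illustrative, not Lumi configuration keys, and the logic approximates rather than reproduces Lumi's parser:

```python
import csv
import io

def parse_csv_events(text, delimiter=",", rows_to_skip=0, custom_headers=None):
    # Drop preamble rows first, mirroring the "Rows to skip" setting.
    body = "\n".join(text.splitlines()[rows_to_skip:])
    rows = list(csv.reader(io.StringIO(body), delimiter=delimiter))
    # Custom headers are always comma-separated and replace the inferred
    # header row; otherwise the first row is used as headers.
    headers = custom_headers.split(",") if custom_headers else rows.pop(0)
    return [dict(zip(headers, row)) for row in rows]

data = 'host,timestamp,location\n127.0.0.1,24/Mar/2025:16:25:29 -0500,"Chicago, IL"'
events = parse_csv_events(data)
```

Here `events[0]` contains the same three user attributes shown in the example above, with the quoted `Chicago, IL` value kept intact.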
JSON
For generic JSON, Lumi extracts each top-level field as a separate user attribute. You can format JSON events using any of the following structures:
- Separate objects: each object is a separate event. For example:

  {"time": "2025-11-14T22:46:11Z", "log": "example log 1"}{"time": "2025-11-14T23:46:11Z", "log": "example log 2"}

- Array of objects: each object in the array is an event. For example:

  [{"time": "2025-11-14T22:46:11Z", "log": "example log 1"},{"time": "2025-11-14T23:46:11Z", "log": "example log 2"}]

- Newline-delimited objects: each line is a separate event. For example:

  {"time": "2025-11-14T22:46:11Z", "log": "example log 1"}
  {"time": "2025-11-14T23:46:11Z", "log": "example log 2"}
Lumi doesn't support the following:
- A single JSON object that contains all the events.
- JSON files exported from Splunk.
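For instance, a list of event objects can be serialized into the newline-delimited layout with nothing but the standard library; a sketch:

```python
import json

events = [
    {"time": "2025-11-14T22:46:11Z", "log": "example log 1"},
    {"time": "2025-11-14T23:46:11Z", "log": "example log 2"},
]

# One object per line; avoid wrapping everything in a single JSON object,
# which Lumi doesn't accept.
ndjson = "\n".join(json.dumps(event) for event in events)
```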
Splunk CSV
The Splunk CSV format refers to the CSV file format that Splunk exports.
When Lumi reads this format, it looks for the fields `_raw` and `_time`, which contain the raw event message and timestamp, respectively.
A Splunk export includes these fields by default.
See the Splunk documentation for more information.
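As a sketch of how these two fields might be consumed, the following reads `_raw` and `_time` out of a minimal Splunk-style CSV; the sample data is illustrative:

```python
import csv
import io

# Illustrative sample of a Splunk CSV export; _raw holds the event message
# and _time holds the timestamp.
splunk_csv = '"_time","_raw","host"\n"2025-11-14T22:46:11Z","Demo log","web01"'

row = next(csv.DictReader(io.StringIO(splunk_csv)))
message, timestamp = row["_raw"], row["_time"]
```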
Splunk HEC
The Splunk HEC format refers to the JSON format for an HTTP request to the Splunk HEC endpoint.
At minimum, it requires the top-level field event that contains the raw event message.
Lumi automatically detects Splunk HEC when the object has the top-level field `event` and at least one of the Splunk event metadata fields: `time`, `host`, `source`, `sourcetype`, `index`, or `fields`.
To send an event with user attributes, include the attributes in a nested JSON object assigned to `fields`.
If you include a custom attribute (for example, `status`) as a top-level field, Lumi doesn't map it to a user attribute.
For more information, see Format events for HTTP event collector in the Splunk documentation.
The following example shows a Splunk HEC JSON event that includes attributes for key1 and key2:
{
"event": "Demo log",
"time": "2025-11-14T22:46:11Z",
"index": "demo",
"fields": {
"key1": "value1",
"key2": [
"value2.0",
"value2.1"
]
}
}
In this case, Lumi generates the following user attributes:
index: demo
key1: value1
key2: [value2.0, value2.1]
sourcetype: httpevent
For additional examples, see Send events with Splunk HEC.
Plain text
Plain text encompasses all other event formats. The raw log line becomes the event message, and the time at which Lumi received the event becomes the event timestamp. You can set up a pipeline to extract user attributes or manually assign the event timestamp.
Compressed files
Compressing files saves storage space and streamlines data management. You can ingest a compressed file in any of the supported event formats.
Lumi supports the following compression formats:
- Brotli
- BZIP2
- DEFLATE
- GZIP
- LZMA
- LZ4
- Snappy
- XZ
- Z
- ZSTD
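For example, a newline-delimited JSON batch can be gzipped before upload using Python's standard library; the compressed bytes are ready to write to a file such as `events.json.gz`:

```python
import gzip

# A small newline-delimited JSON batch, compressed with gzip. Keeping the
# .json.gz extension means the base .json extension still identifies the
# event format after the compression suffix is stripped.
ndjson = b'{"time": "2025-11-14T22:46:11Z", "log": "example log 1"}\n'
compressed = gzip.compress(ndjson)
restored = gzip.decompress(compressed)
```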
Automatic detection
Lumi automatically detects the event format using the following heuristics:

- Determine the format based on the base file extension. For a compressed file like `.json.gz`, the base file extension is `.json`.
  - CSV: indicated by the extension `.csv`
  - JSON: indicated by the extension `.json` or `.ndjson`
- If the file extension is inconclusive, evaluate the file contents in the first 1024 bytes:
  - CSV: contains at least two rows, where the first row contains at least two non-blank columns, such as `time,event`
  - JSON: starts with `{` or `[`
- If neither CSV nor JSON is detected, Lumi treats the event format as plain text and proceeds with line-based parsing.
- Identify any Splunk-specific formats:
  - Splunk CSV: CSV headers include `_raw` and `_time`
  - Splunk HEC: JSON includes `event` and at least one of the supported metadata fields, such as `source`
- If Lumi doesn't detect a Splunk format, parse events in generic CSV or JSON format, respectively.
The following diagram shows the decision tree of format detection for an example event. The arrows in bold show the pathway of the example event detected as generic JSON.
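The detection steps can also be sketched as a small function. This is an illustrative approximation, not Lumi's implementation; it omits the Splunk-specific refinement and abbreviates the list of compression suffixes:

```python
def detect_format(filename, head):
    """Approximate the format-detection heuristics (Splunk refinement omitted)."""
    # Strip a compression suffix to get the base extension (.json.gz -> .json).
    base = filename.lower()
    for suffix in (".gz", ".bz2", ".zst", ".xz", ".lz4", ".br"):
        if base.endswith(suffix):
            base = base[: -len(suffix)]
    if base.endswith(".csv"):
        return "csv"
    if base.endswith((".json", ".ndjson")):
        return "json"
    # Extension inconclusive: inspect the first 1024 bytes of content.
    sample = head[:1024]
    if sample.lstrip().startswith(("{", "[")):
        return "json"
    rows = [r for r in sample.splitlines() if r.strip()]
    if len(rows) >= 2 and len([c for c in rows[0].split(",") if c.strip()]) >= 2:
        return "csv"
    # Neither CSV nor JSON: fall back to line-based plain text parsing.
    return "plain text"
```

For example, `detect_format("events.json.gz", "")` resolves to `json` from the base extension alone, while an unrecognized extension falls through to content inspection.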
Manual specification
For S3 pull ingestion, you can specify an event format so Lumi parses events as that type rather than relying on format auto-detection. You can select any of the supported formats. For example, you might have comma-separated data that you want to treat as plain text instead of CSV. Or you might have JSON events misinterpreted as Splunk HEC, which has specific requirements on how to supply custom fields.
You can manually designate the format for S3 pull ingestion in the following ways:
- IAM key attribute: Applies to all incoming events from S3 pull configured with the IAM key. To set the format on an IAM key:
  1. Go to the Keys page in Lumi and find your IAM key.
  2. For the S3 pull integration, click the ellipsis and select Edit attributes.
  3. Under Format, click the drop-down and select your format.
  4. Click Save.
- Backfill job specification: Applies to incoming events from a single backfill job. Overrides any format assigned on the IAM key. When creating a job, set Format in the UI or `formatOptions` in your API request.
For recurring ingestion, create a separate IAM key for each format type. For backfill ingestion, create a job for each format type if you want to override the one on the key.
Learn more
See the following topics for more information:
- Send events to Lumi to learn about options for sending events to Lumi.
- Send events with S3 pull for details on the S3 pull integration.
- Quickstart to learn how to send events to Lumi through file upload.