Processors reference
Within pipelines, processors perform the tasks to transform events. Each processor performs a unique action, such as parsing the event message or removing incoming metadata. Lumi stores unprocessed metadata from incoming events as user attributes.
This topic describes the available processors in Lumi.
To learn how to create a pipeline and test processors, see Manage pipelines and processors.
Processor settings
Use the following guidelines when configuring processors.
Source and output attributes
The following guidelines describe source and output attributes in processors.
- You can use incoming event metadata as source attributes.
- An attribute created by one processor can be the source attribute for a later processor in the same or different pipeline.
- Select processors, such as the redaction processor, let you use the event message (log body) as the source attribute.
- You can write to the event message using the message mapper.
- You can't refer to system attributes as source or output attributes.
Note that system attributes are allowed in the pipeline conditions.
Override output attributes
Some processors have the option to Override value when output attribute exists. This applies to situations in which you specify an output attribute with the same name as a previously existing user attribute or event metadata. When you choose to override, the processor replaces the original attribute value with the newly processed result. If you don't override, no processing occurs.
For example, suppose you send an event with incoming metadata key: value1, and a processor computes the output attribute key: value2.
You can choose whether to preserve value1 or override it to value2.
The override applies even when the input value is an empty string or one or more whitespace characters. An exception is when the source attribute is missing or its value is null, in which case processing is skipped.
The grok and regex parsers override any existing user attributes by default. You can't disable override for these processors.
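The override rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Lumi's implementation; the function name `map_with_override` and the dictionary representation of event attributes are assumptions made for the example.

```python
def map_with_override(attributes, source, output, override):
    """Return a copy of attributes after mapping source to output."""
    value = attributes.get(source)
    if value is None:
        # Missing or null source attribute: processing is skipped.
        return dict(attributes)
    if output in attributes and not override:
        # Output already exists and override is off: keep the original value.
        return dict(attributes)
    # Empty strings and whitespace are still values, so they are written.
    return {**attributes, output: value}


event = {"key": "value1", "computed": "value2"}
print(map_with_override(event, "computed", "key", override=True))
print(map_with_override(event, "computed", "key", override=False))
```

With override enabled, `key` takes `value2`; with override disabled, `key` keeps `value1`.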
Remove mapped attributes
When you map an attribute, the processor doesn't remove the source. To remove it, use the attribute remover.
Removing unused attributes can lead to better query performance and more efficient storage. It also simplifies your search experience and reduces complexity for any data maintenance tasks.
Try out a processor
For any processor, use the Try it out feature to preview the expected output for your test case. For details, see Manage pipelines.
Arithmetic processor
Evaluates an arithmetic formula and outputs the result to an attribute.
You can reference existing attributes as variables in the formula.
The formula supports the basic operators for addition (+), subtraction (-), multiplication (*), and division (/).
Parentheses (()) control the order of operations.
In the arithmetic formula, surround operators with space characters.
For example, val1 - val2 is a valid subtraction formula.
Without the space characters, the processor interprets val1-val2 as a single attribute.
Configure the processor with the following settings:
- Override value when output attribute exists: If an attribute with the same name already exists, you can select whether to override its value or leave it unchanged.
- Round output decimal value: The number of decimal places allowed in the output value. For example, enter 2 to round a result of 0.888 to 0.89.
- Replace invalid input values with zero: When you select this toggle, Lumi replaces any nonexistent attributes with zero. If you don't select this option, Lumi skips processing and doesn't evaluate the formula.
Example
- Processor configuration
- Arithmetic formula:
(val1 + val2) / (val4 - val3)
- Output attribute:
computed
- Event input
- Event metadata:
val1: 5
val2: 8
val3: 11
val4: 14
- Event output
- User attribute:
computed: 4.333
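To make the formula rules concrete, the following Python sketch evaluates a formula in the way this section describes: operators count only when surrounded by spaces, parentheses control precedence, and missing attributes either stop processing or become zero. It is a rough model under stated assumptions, not Lumi's evaluator. The names `evaluate`, `tokenize`, and `SkipProcessing` are hypothetical, and support for numeric literals in the formula is assumed rather than documented.

```python
class SkipProcessing(Exception):
    """Raised when an attribute is missing and zero replacement is off."""


def tokenize(formula):
    # Parentheses are split off; everything else is split on whitespace only,
    # so "val1-val2" stays a single token and is read as one attribute name.
    return formula.replace("(", " ( ").replace(")", " ) ").split()


def evaluate(formula, attributes, decimals=None, invalid_as_zero=False):
    tokens = tokenize(formula)
    pos = 0

    def operand():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = expression()
            pos += 1  # consume the closing ")"
            return value
        if token in attributes:
            return float(attributes[token])
        try:
            return float(token)  # numeric literal (assumed to be allowed)
        except ValueError:
            if invalid_as_zero:
                return 0.0
            raise SkipProcessing(token) from None

    def term():
        nonlocal pos
        value = operand()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            operator = tokens[pos]
            pos += 1
            right = operand()
            value = value * right if operator == "*" else value / right
        return value

    def expression():
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            operator = tokens[pos]
            pos += 1
            right = term()
            value = value + right if operator == "+" else value - right
        return value

    try:
        result = expression()
    except SkipProcessing:
        return None  # processing skipped, no output attribute
    return result if decimals is None else round(result, decimals)


metadata = {"val1": 5, "val2": 8, "val3": 11, "val4": 14}
print(evaluate("(val1 + val2) / (val4 - val3)", metadata, decimals=3))
```

The sketch doesn't handle division by zero or malformed formulas; how Lumi treats those cases isn't covered here.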
Attribute mapper
Maps the value of a source attribute to an output user attribute.
The processor creates a new attribute when it doesn't exist. If an attribute with the same name already exists, you can choose whether to override its value or leave it unchanged.
Example
- Processor configuration
- Source attribute:
status
- Output attribute:
http_status
- Event input
- Event metadata:
status: 401
- Event output
- User attribute:
http_status: 401
Attribute remover
Removes one or more source attributes.
Use this processor to drop unneeded fields to reduce storage size and improve query performance. You can also use the attribute remover to drop personally identifiable information, whether to remove it completely or to remove the source metadata after redaction.
Example
- Processor configuration
- Attributes to remove:
userid
- Event input
- Event metadata:
userid: wilma
- Event output
- User attribute: none
Conditional mapper
Evaluates one or more conditions, and maps a source attribute or value to an output user attribute. When no conditions are satisfied, no mapping occurs.
Specify one or more conditions to evaluate to determine the mapping behavior. The processor evaluates conditions from highest to lowest priority and applies the mapping for the first condition that's satisfied.
The processor also requires a name of an output attribute to store the mapped value. You can choose whether to override the attribute if it already exists. Create a separate processor for each output user attribute.
A condition takes the following components:
- Search expression in Lumi query syntax
- Type of mapping to perform, whether a value mapper or attribute mapper
- Configuration based on the mapper type:
- For a value mapper, a static value
- For an attribute mapper, the name of the source attribute
You can perform similar conditional evaluation using a lookup mapper. The lookup mapper checks source attribute values, and when it finds a match, it maps the new output attribute values.
Example
Consider a static value replacement only for events that have a specific source type.
- Processor configuration
- Condition:
sourcetype=access_combined
- Mapper type: Value
- Value / Attribute:
redacted
- Output attribute:
user
- Event input
- Event metadata:
sourcetype: access_combined
user: wilma
- Event output
- User attributes:
sourcetype: access_combined
user: redacted
This configuration ensures that events store the user attribute user: redacted
when the event satisfies the pipeline condition
as well as the condition sourcetype=access_combined.
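The first-match behavior of the conditional mapper can be modeled as a loop over an ordered list of conditions. In this Python sketch, plain predicate functions stand in for Lumi search expressions, and the function name `conditional_map` and the tuple layout are assumptions made for illustration.

```python
def conditional_map(event, conditions, output, override=True):
    """Apply the mapping of the first satisfied condition.

    conditions is ordered from highest to lowest priority. Each entry is
    (predicate, mapper_type, setting), where mapper_type is "value" or
    "attribute" and setting is a static value or a source attribute name.
    """
    for predicate, mapper_type, setting in conditions:
        if not predicate(event):
            continue
        value = setting if mapper_type == "value" else event.get(setting)
        if value is None or (output in event and not override):
            return dict(event)  # nothing to write
        return {**event, output: value}
    return dict(event)  # no condition satisfied, no mapping


conditions = [
    (lambda e: e.get("sourcetype") == "access_combined", "value", "redacted"),
]
event = {"sourcetype": "access_combined", "user": "wilma"}
print(conditional_map(event, conditions, output="user"))
```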
Grok parser
Parses a source attribute into one or more output attributes using a grok expression. You can use the event message as the source attribute.
In the parser configuration, provide the source attribute to parse and a grok expression. A grok expression is made up of one or more grok patterns in the following format:
%{PATTERN_NAME:OUTPUT}
PATTERN_NAME identifies a preset pattern, and OUTPUT is the label you assign to the output value that Lumi stores as a user attribute.
The grok parser extracts structured data when it matches the specified expression,
similar to the regex parser.
Grok expressions tend to be more human-readable than regex because they use preset templates for common patterns, such as TIMESTAMP_ISO8601.
Unlike the regex parser, you don't supply the output attribute names in a separate field; you include them directly in the grok expression.
For a reference on the available patterns, see Grok patterns. Note that you can test your grok patterns using an online parser such as Grok Debugger before you add them to a processor.
Example
- Processor configuration
- Source attribute: Select the option to Extract from log body
- Grok expression:
%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:status}: %{GREEDYDATA:message}
- Event input
- Event message:
2025-08-05 15:45:00 INFO: Starting application...
- Event output
- User attributes:
time: 2025-08-05 15:45:00
status: INFO
message: Starting application...
For examples of how to map the extracted values to other event components, see the timestamp mapper and message mapper.
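One way to think about a grok expression is as shorthand for a regular expression with named capture groups. The Python sketch below expands each %{PATTERN_NAME:OUTPUT} into a named group and applies the result to the example message. The three pattern definitions are deliberately simplified stand-ins written for this example, not the real grok pattern definitions, and `grok_to_regex` and `grok_parse` are hypothetical helper names.

```python
import re

# Simplified stand-ins for the preset patterns (assumptions for this sketch).
PATTERNS = {
    "TIMESTAMP_ISO8601": r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}",
    "LOGLEVEL": r"[A-Za-z]+",
    "GREEDYDATA": r".*",
}


def grok_to_regex(expression):
    """Expand %{PATTERN_NAME:OUTPUT} into (?P<OUTPUT>...) groups."""
    def expand(match):
        name, label = match.group(1), match.group(2)
        return f"(?P<{label}>{PATTERNS[name]})"

    return re.sub(r"%\{(\w+):(\w+)\}", expand, expression)


def grok_parse(expression, source):
    match = re.search(grok_to_regex(expression), source)
    return match.groupdict() if match else {}


expression = "%{TIMESTAMP_ISO8601:time} %{LOGLEVEL:status}: %{GREEDYDATA:message}"
print(grok_parse(expression, "2025-08-05 15:45:00 INFO: Starting application..."))
```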
Example with Apache combined log format
This example parses a log in Apache combined log format as represented in the tutorial data.
- Processor configuration
- Source attribute: Select the option to Extract from log body
- Grok expression:
%{IP:clientip} %{DATA:ident} %{DATA:user} \[%{HTTPDATE:req_time}\] "%{WORD:method} %{DATA:uri} %{DATA:version}" %{NUMBER:status} %{NUMBER:bytes} "%{URI:referer}" "%{GREEDYDATA:useragent}"
- Event input
- Event message:
830:1e0e:525:e6a0:6479:cd69:c364:23c3 - - [24/Mar/2025:16:25:29 -0500] "POST /products/23394 HTTP/1.1" 200 1027 "https://techcrunch.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:110.0) Gecko/20100101 Firefox/110.0"
- Event output
- User attributes:
bytes: 1027
clientip: 830:1e0e:525:e6a0:6479:cd69:c364:23c3
ident: -
method: POST
version: HTTP/1.1
referer: https://techcrunch.com/
status: 200
req_time: 24/Mar/2025:16:25:29 -0500
uri: /products/23394
user: -
useragent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:110.0) Gecko/20100101 Firefox/110.0
Key-value parser
Parses key-value pairs from a source attribute into one or more output attributes.
Lumi creates a user attribute for each key-value pair.
A single source attribute can generate multiple output attributes when the input value contains multiple instances of the pattern.
For example, index=main, source=/var/log/messages, sourcetype=access_combined.
If you don't need to retain the original source value, you can use the attribute remover to remove it after key-value parsing. For additional details, see Remove mapped attributes.
The following table lists the supported key-value pattern types.
Each example produces the user attribute key: value.
| Key-value pattern | Description | Processor configuration | Example source value |
|---|---|---|---|
| Equality | Parses text joined by an equal sign, without space characters. | N/A | key=value |
| JSON | Parses text from JSON objects. | Flatten or preserve nesting | {"key": "value"} |
| Regex | Parses text adhering to a regex with two capture groups. Also see regex parser. | Regex; Combine values from duplicate keys | key_value |
| XML | Parses text within an XML document's root element. Each input requires a single root element. The key-value parser doesn't extract the root name. | Flatten or preserve nesting | <root><key>value</key></root> |
Output attribute names
The name of an extracted key-value pair is based on the key name with an additional prefix.
For example, consider the JSON input {"foo": "bar"}.
- When the key-value input is in the event message, you select the source attribute Extract from log body. In this case, the user attribute name is just the key name. The resulting output attribute is foo: bar.
- When the key-value input is in an existing attribute such as incoming event metadata, you select the source attribute Extract from user attribute. In this case, the user attribute name takes the pattern SOURCE_ATTRIBUTE.KEY. If the example JSON is in a source attribute called metadata, the resulting output attribute is metadata.foo: bar.
If the processed output has the same name as a previously existing attribute, you can choose to preserve its value or override it to the extracted value from the key-value pair.
Duplicate keys
For the regex pattern, you can enable Combine values from duplicate keys to capture multiple matches for the same key. When this option is disabled, the processor stores only the first match. For example:
- Input:
- Regex:
(\w*):(\w*)
- Source value:
key:val1 key:val2 key:val3
- Output:
- Without combine:
key:val1
- With combine:
key:val1|val2|val3
If an input for the equality pattern contains duplicate keys, such as a=b a=c a=d,
the parser only takes the last match: a=d.
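The difference between the two duplicate-key behaviors can be illustrated with Python's `re` module. This is a sketch of the documented outcomes only; the function names and the `(\w+)=(\w+)` pattern used for the equality case are assumptions for the example.

```python
import re


def parse_regex_pairs(source, pattern, combine=False):
    """Regex pattern: keep the first match per key, or join all matches."""
    result = {}
    for key, value in re.findall(pattern, source):
        if key not in result:
            result[key] = value           # first match for this key
        elif combine:
            result[key] += "|" + value    # append later matches
    return result


def parse_equality_pairs(source):
    """Equality pattern: later duplicates overwrite earlier ones."""
    return dict(re.findall(r"(\w+)=(\w+)", source))


source = "key:val1 key:val2 key:val3"
print(parse_regex_pairs(source, r"(\w*):(\w*)"))
print(parse_regex_pairs(source, r"(\w*):(\w*)", combine=True))
print(parse_equality_pairs("a=b a=c a=d"))
```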
Nested JSON
Use Flatten into a single level to control whether to preserve JSON nesting. If you don't select the flatten option, the processor preserves the original nesting of the source attribute value. For example:
- Input:
- Source value:
{"key1": {"key2": "value"}}
- Output:
- Nested:
"key1": {"key2": "value"}
- Flattened:
key1.key2: value
Nested XML
Use Flatten into a single level to control whether to preserve XML nesting. Without flattening, the processor preserves the original nesting of the source attribute value. For example:
- Input:
- Source value:
<guestbook><guestcount>123</guestcount><guest rsvp="true">Wilma Rudolph</guest><venue><reception>courtyard</reception></venue></guestbook>
- Output:
- Nested:
venue: {reception=courtyard}
guestcount: 123
guest: {rsvp=true, =Wilma Rudolph}
- Flattened:
venue.reception: courtyard
guestcount: 123
guest: Wilma Rudolph
guest.rsvp: true
Example
- Processor configuration
- Source attribute: Select the option to Extract from user attribute
json_attr
- Key-value pattern: JSON
- Flatten into a single level: true
- Subsequent processor
- Attribute remover to remove
json_attr
- Event input
- Event metadata:
json_attr: {"jsonkey1": "value1", "outerkey": {"innerkey": "value2"}}
- Event output
- User attributes:
json_attr.jsonkey1: value1
json_attr.outerkey.innerkey: value2
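Flattening nested JSON into dotted attribute names amounts to a short recursive walk. The following Python sketch reproduces the output names from this example, including the source attribute prefix. The helper name `flatten` is hypothetical, and the sketch leaves lists as-is because this topic doesn't describe how they are handled.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested objects into {dotted.name: leaf_value} pairs."""
    if isinstance(value, dict):
        flat = {}
        for key, child in value.items():
            name = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, name))
        return flat
    return {prefix: value}


source = '{"jsonkey1": "value1", "outerkey": {"innerkey": "value2"}}'
# Extract from user attribute: names are prefixed with the source attribute.
print(flatten(json.loads(source), prefix="json_attr"))
# Extract from log body: names are just the key path.
print(flatten(json.loads('{"key1": {"key2": "value"}}')))
```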
Lookup mapper
Looks up source attributes in a user-provided CSV lookup table, and creates one or more output attributes based on the table columns.
You can set the delimiter to another character, such as ;.
Be sure to match the delimiter to your lookup table.
For example, the delimiter comma , is different from comma with a space , .
Designate one or more source attributes as the lookup IDs in the table. The processor uses the ID columns to look up the matching row and creates output user attributes from the specified columns. The source attributes are also user attributes on the event.
The source and output attributes must match the names of the provided headers. You can provide the column headers as part of the lookup CSV or as comma-separated values in the Headers field. If your events already contain the output attributes, you can designate whether to override existing values.
Consider an example lookup table:
| product_id | category | description |
|---|---|---|
| 23394 | Furniture | Leather Sectional Sofa |
| 32729 | Electronics | Raspberry Pi 5 |
| 23002 | Books | Man's Search for Meaning |
| 23394 | Instruments | Analog Theremin |
| 78905 | Jewelry | Art Deco Diamond Bracelet |
If product_id is the source attribute, the processor can create user attributes for category and description when it identifies a row matching the product ID.
You can specify category, description, or both for the output attributes.
The processor doesn't create user attributes when it doesn't identify a match.
Example
This example adds the description user attribute for events
that store a specific product ID and category.
- Processor configuration
- Headers: Lookup CSV includes header line
- Lookup CSV:
product_id,category,description
23394,Furniture,Leather Sectional Sofa
32729,Electronics,Raspberry Pi 5
23002,Books,Man's Search for Meaning
23394,Instruments,Analog Theremin
28201,Jewelry,Art Deco Diamond Bracelet
- Delimiter:
,
- Source attributes:
product_id,category
- Output attribute:
description
- Event input
- Event metadata:
product_id: 23394
category: Instruments
- Event output
- User attributes:
product_id: 23394
category: Instruments
description: Analog Theremin
Note that if you only select product_id as the source attribute, the resulting user attribute would be description: Leather Sectional Sofa, since it's the first matched row for product ID 23394.
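The first-matched-row behavior can be reproduced with Python's standard `csv` module. This sketch is an illustration of the lookup logic described here, not Lumi's implementation; the function name `lookup` is hypothetical, and the override option is omitted for brevity.

```python
import csv
import io

LOOKUP_CSV = """product_id,category,description
23394,Furniture,Leather Sectional Sofa
32729,Electronics,Raspberry Pi 5
23002,Books,Man's Search for Meaning
23394,Instruments,Analog Theremin
28201,Jewelry,Art Deco Diamond Bracelet
"""


def lookup(event, lookup_csv, source_attributes, output_attributes, delimiter=","):
    """Copy output columns from the first row matching all source attributes."""
    rows = csv.DictReader(io.StringIO(lookup_csv), delimiter=delimiter)
    for row in rows:
        if all(row[name] == str(event.get(name)) for name in source_attributes):
            return {**event, **{name: row[name] for name in output_attributes}}
    return dict(event)  # no matching row, no new attributes


event = {"product_id": "23394", "category": "Instruments"}
print(lookup(event, LOOKUP_CSV, ["product_id", "category"], ["description"]))
print(lookup(event, LOOKUP_CSV, ["product_id"], ["description"]))
```

The second call matches on product_id alone, so it returns the description from the first row with that ID.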
Message mapper
Maps the value of a source attribute to the event message.
You have the option to overwrite the event message with an empty string when the source attribute is missing or empty.
Example
- Preceding processor
- Grok parser to extract
message: Starting application...
- Processor configuration
- Source attribute:
message
- Event input
- Event message:
2025-08-05 15:45:00 INFO: Starting application...
- Event output
- Event message:
Starting application...
Redaction processor
Redacts a source attribute using a regular expression. You can use the event message as the source attribute.
The processor overrides the source attribute to store the redacted content.
You can search on the redacted content such as user!=*redacted*.
Take note of the following for regex:
- Must have at least one capture group.
- Can match zero or more times in the source attribute. Redacts each match.
- Captures the entire value when the pattern is (.+). However, if you're redacting the entire value, consider using the value mapper or conditional mapper with the override option.
The redaction strategy determines how Lumi identifies and replaces the redacted values. The processor uses regex capture groups differently depending on the strategy:
- String: Replaces the entirety of every regex match. The purpose of the capture group is to optionally retain content from the match.
- Hash: Replaces each capture group with a hash for every regex match. The purpose of the capture group is to define the redacted content.
The following sections describe the redaction strategies in more detail.
String redaction
With the string strategy for redaction, the regex describes the entire pattern to redact.
Regex capture groups define content you want to retain.
To keep a capture group, backreference it in the replacement text using the syntax $N, where N is the one-based index of the group.
For example, $2 references the second capture group.
When the replacement text doesn't reference a capture group, the capture group is ignored.
You can rearrange the order of the groups, such as $1$2 or $2$1.
Consider the email address username@example.org that you want to redact to u***@example.org.
The following configuration generates the redacted output:
- Regex:
(\w)\w*(@\w+\.\w+)
This simplistic pattern only searches for alphanumeric and underscore characters. The regex contains a capture group for the first character and a capture group for the email domain.
- Replacement text:
$1***$2
The replacement text includes the asterisk redaction characters flanked by backreferences to the capture groups.
When there are multiple regex matches, the redaction and any backreference is applied for each match. For example, consider the following log that contains two phone numbers:
user="wilma", phone="800-555-0100", phone="800-555-0100", id="123-456-7890"
When you use the regex (phone=)"\d{3}-\d{3}-\d{4}" and replacement text $1"REMOVED", the processor keeps phone= and redacts the phone number for each match.
The redacted line becomes:
user="wilma", phone="REMOVED", phone="REMOVED", id="123-456-7890"
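To experiment with string redaction outside of Lumi, you can approximate it with Python's `re.sub`. Python writes backreferences as \g<N> rather than $N, so this sketch converts the replacement text first. The function name `redact_string` is hypothetical, and the sketch assumes the replacement text contains no backslashes.

```python
import re


def redact_string(source, pattern, replacement):
    """Replace every regex match, keeping groups referenced as $N."""
    # Convert $N backreferences to Python's \g<N> form.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return re.sub(pattern, template, source)


print(redact_string("username@example.org", r"(\w)\w*(@\w+\.\w+)", "$1***$2"))

log = 'user="wilma", phone="800-555-0100", phone="800-555-0100", id="123-456-7890"'
print(redact_string(log, r'(phone=)"\d{3}-\d{3}-\d{4}"', '$1"REMOVED"'))
```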
Hash redaction
With the hash strategy for redaction, the regex describes the pattern to search for. The processor replaces each capture group with its own cryptographic hash. When there are multiple regex matches, the processor replaces each capture group of each match.
Note that this approach differs from the string strategy, where the entire regex is replaced and not just the capture groups. You can't backreference capture groups in the hash strategy.
In the processor configuration, you specify a hash algorithm such as MD5 or SHA-256.
You also have the option to specify a salt, such as a random string, that gets combined with the source attribute before applying the hash function.
You can either prepend or append the salt to the source attribute.
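The hash strategy can be approximated in Python with `hashlib`, replacing only the capture groups of each match and leaving the rest of the match in place. This sketch makes several assumptions: it salts the captured text rather than the whole attribute value, encodes text as UTF-8, and expects capture groups that aren't nested. Because Lumi's exact salting and encoding details aren't specified in this topic, the sketch isn't expected to reproduce the digests Lumi generates. The function name `redact_hash` is hypothetical.

```python
import hashlib
import re


def redact_hash(source, pattern, algorithm="sha256", salt="", append=True):
    """Replace each capture group of each match with a salted hash."""
    def digest(value):
        salted = value + salt if append else salt + value
        return hashlib.new(algorithm, salted.encode("utf-8")).hexdigest()

    def replace(match):
        pieces, cursor = [], match.start()
        for index in range(1, match.re.groups + 1):
            start, end = match.span(index)
            if start == -1:
                continue  # group did not participate in the match
            pieces.append(match.string[cursor:start])  # text kept as-is
            pieces.append(digest(match.group(index)))  # hashed group
            cursor = end
        pieces.append(match.string[cursor:match.end()])
        return "".join(pieces)

    return re.sub(pattern, replace, source)


message = "connection=db;user=admin;password=secret123;host=local"
print(redact_hash(message, r"password=(.+);", "sha512", salt="jv4w7m"))
```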
Example of partial string redaction
This example redacts a Social Security number and retains the last four digits.
- Processor configuration
- Source attribute: Select the option to Extract from log body
- Regular expression:
(\d{3})-(\d{2})-(\d{4})
- Strategy: String
- Replacement text:
xxx-xx-$3
- Event input
- Event message:
2023-10-27 10:01:05 INFO UserID: 88421 - Username: jdoe - SSN: 999-00-1111 - Action: UpdateRecord
- Event output
- Event message:
2023-10-27 10:01:05 INFO UserID: 88421 - Username: jdoe - SSN: xxx-xx-1111 - Action: UpdateRecord
Example of string redaction with multiple matches
This example redacts multiple instances of medical diagnosis codes.
- Processor configuration
- Source attribute: Select the option to Extract from log body
- Regular expression:
([A-Z][0-9]\w.\w+)
- Strategy: String
- Replacement text:
REDACTED
- Event input
- Event message:
{"patient_id":"P550","diagnosis_codes":["I11.0","I50.9","J44.9"]}
- Event output
- Event message:
{"patient_id":"P550","diagnosis_codes":["REDACTED","REDACTED","REDACTED"]}
Example of hash redaction of a password
This example redacts the password from a log and replaces it with a hash.
- Processor configuration
- Source attribute: Select the option to Extract from log body
- Regular expression:
password=(.+);
- Strategy: Hash
- Algorithm: SHA-512
- Salt: True
- Value:
jv4w7m
- Position: Append
- Event input
- Event message:
connection=db;user=admin;password=secret123;host=local
- Event output
- Event message:
connection=db;user=admin;password=c8a776ac50189ec7ad12c9573865717cad8b37cba9af872059094fed920e70a642ac5d84e63a36da0c8fd6b94da9b0e6fdee07f12b42352afd1304155763d13b;host=local
Regex parser
Parses a source attribute into one or more output attributes using a regular expression. You can use the event message as the source attribute.
In the parser configuration, provide the source attribute to parse, a regular expression with one or more capture groups, and a comma-separated list of output attributes. To test regular expressions, you can try out the processor or use a free regex parser such as Regex101.
The number of capture groups, denoted by (), determines the number of output attributes.
If a capture group matches more than one result, the processor only takes the first match.
Consider the following examples that parse the input string hello world:
- The regex (\w+) matches on one or more word characters. Provide a single output attribute, such as greeting. The parser returns greeting: hello.
- If your regex is (\w+) (\w+), provide two output attributes, such as greeting, addressee. In this case, the parser returns greeting: hello and addressee: world.
If you have an existing user attribute with the same name as one of the specified output attributes, the parser overwrites the previously existing user attribute. This behavior applies even when the match is an empty string or whitespace character.
In some cases, your input string contains both the attribute name and value. You can extract the attribute names directly from the source rather than provide a list of output attributes. To parse with regex to extract both attribute names and values, see the key-value parser and its regex pattern.
Example
- Processor configuration
- Source attribute: Select the option to Extract from log body
- Regular expression:
status: \[(\w*)\]
- Output attributes:
status
- Event input
- Event message:
Deployment successful. System 1 status: [ok] System 2 status: [alert]
- Event output
- User attribute:
status: ok
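The pairing of capture groups with output attribute names, and the first-match rule, can be tried out with Python's `re` module. This is a sketch of the documented behavior rather than Lumi's parser, and `regex_parse` is a hypothetical helper name.

```python
import re


def regex_parse(source, pattern, output_attributes):
    """Pair each capture group of the first match with an output name."""
    match = re.search(pattern, source)  # only the first match is used
    if match is None:
        return {}
    return dict(zip(output_attributes, match.groups()))


message = "Deployment successful. System 1 status: [ok] System 2 status: [alert]"
print(regex_parse(message, r"status: \[(\w*)\]", ["status"]))
print(regex_parse("hello world", r"(\w+) (\w+)", ["greeting", "addressee"]))
```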
Example with Apache combined log format
This example parses a log in Apache combined log format as represented in the tutorial data.
- Processor configuration
- Source attribute: Select the option to Extract from log body
- Regular expression:
([^ ]*) ([^ ]*) ([^ ]*) \[([^\]]*)\] "(\S+)(?: +([^\"]*?)(?: +(\S+))?)?" ([^ ]*) ([^ ]*)(?: "([^\"]*)" "([^\"]*)")?
- Output attributes:
clientip, ident, user, req_time, method, uri, version, status, bytes, referer, useragent
- Event input
- Event message:
830:1e0e:525:e6a0:6479:cd69:c364:23c3 - - [24/Mar/2025:16:25:29 -0500] "POST /products/23394 HTTP/1.1" 200 1027 "https://techcrunch.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:110.0) Gecko/20100101 Firefox/110.0"
- Event output
- User attributes:
bytes: 1027
clientip: 830:1e0e:525:e6a0:6479:cd69:c364:23c3
ident: -
method: POST
version: HTTP/1.1
referer: https://techcrunch.com/
status: 200
req_time: 24/Mar/2025:16:25:29 -0500
uri: /products/23394
user: -
useragent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:110.0) Gecko/20100101 Firefox/110.0
Status mapper
Maps the value of a source attribute to the event status.
Lumi attempts to map status codes to human-readable values.
For example, the HTTP status code 500 maps to Error.
For more information, see the system attribute for status.
You can optionally include a fallback value that Lumi sets for the status when the source attribute doesn't exist or can't be interpreted.
Example
- Processor configuration
- Source attribute:
http_code
- Event input
- Event metadata:
http_code: 200
- Event output
- System attribute:
status: ok
Note that status here is a system attribute, not a user attribute.
If you want to remove the source attribute after the status mapping,
use the attribute remover.
Timestamp mapper
Maps the value of a source attribute to the event timestamp. Provide the name of the source attribute, timestamp format, and the time zone (optional).
Supported timestamp formats include ISO 8601, Common Log Format, and Unix epoch values.
To automatically detect the format, select Auto.
To define your own timestamp pattern, select Custom, and enter your pattern.
Custom formats use Java DateTimeFormatter syntax.
For details and examples, see Time formats.
When the timestamp is embedded in the event message, you can map it to the event timestamp as follows:
- Create a processor, such as the regex parser, to extract the timestamp as a new attribute.
- Use the newly extracted attribute in the timestamp mapper.
- Clean up the extracted attribute using the attribute remover.
For more details, see Manual timestamp mapping.
Example
- Preceding processor
- Grok parser to extract
time: 2025-08-05 15:45:00
- Processor configuration
- Source attribute:
time
- Time format:
Custom: yyyy-MM-dd HH:mm:ss
- Time zone ID: supply your time zone
- Event input
- Event message:
2025-08-05 15:45:00 INFO: Starting application...
- Event output
- Event timestamp:
Aug 05, 03:45:00.000 PM
Example with Apache combined log format
- Preceding processor
- Regex parser to extract
time: 24/Mar/2025:16:25:29 -0500
- Processor configuration
- Source attribute:
time
- Time format:
CLF (Common Log Format)
- Time zone ID: leave empty
- Event input
- Event message:
29.182.147.96 - - [24/Mar/2025:16:25:29 -0500] "POST /products/23394 ...
- Event output
- Event timestamp, viewed from PDT time:
Mar 24, 02:25:29.000 PM
In this example, the event message recorded the time as 4:25 PM CDT (denoted by the -0500 time zone specification).
The user observed the event from the America/Los_Angeles time zone (PDT).
As a result, the event displays the timestamp in Lumi as two hours prior.
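The two-hour shift can be verified with Python's `datetime` module. This sketch parses the Common Log Format value with its -0500 offset and converts it to a fixed UTC-7 offset, which stands in for PDT on that date. The `strptime` format string is Python's equivalent of the CLF layout, not the syntax you enter in Lumi, and month-name parsing assumes an English locale.

```python
from datetime import datetime, timedelta, timezone

# Python strptime equivalent of the Common Log Format timestamp layout.
CLF_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

event_time = datetime.strptime("24/Mar/2025:16:25:29 -0500", CLF_FORMAT)

# Fixed offset standing in for America/Los_Angeles during daylight time.
pdt = timezone(timedelta(hours=-7), "PDT")
viewed = event_time.astimezone(pdt)

print(viewed.strftime("%b %d, %I:%M:%S %p"))
```

Both values refer to the same instant; only the displayed wall-clock time differs.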
Value mapper
Maps a static value to an output user attribute.
For the static value, you can enter your own value or assign the Unix timestamp of event indexing. The Unix time represents seconds from Unix epoch—January 1, 1970, at 00:00:00 UTC. For more information about the event indexing timestamp, see Timestamp handling.
The processor creates a new attribute when it doesn't exist. If an attribute with the same name already exists, you can choose to override its value or leave it unchanged.
Example
- Processor configuration
- Static value:
example.com
- Event input
- Event metadata:
host: 23.192.228.84
- Event output
- User attribute:
host: example.com
Limitations
Lumi doesn't currently support extractions on time fields.
Learn more
See the following topics for more information:
- How to transform events with pipelines for a tutorial on pipelines.
- Transform events using pipelines for an overview of pipelines and processors.
- Manage pipelines and processors for how to create and manage pipelines and processors.