Backfill events with S3 pull
You can use the S3 pull integration to ingest historical data from an Amazon S3 bucket into Imply Lumi. The process, often called backfill ingestion, involves manually specifying which objects to ingest. You provide an S3 bucket with optional prefix and suffix filters to limit what gets ingested. Lumi ingests events from the objects matching the specified filters.
You can ingest S3 data through the Lumi UI or the API.
The following diagram shows how AWS services and Lumi interact to send events through backfill ingestion:
This topic provides details to configure backfill ingestion using the S3 pull integration. It assumes you have already completed all the steps in Send events with S3 pull.
Backfill job behavior
When you submit a backfill job, Lumi first validates permissions to access your S3 objects. Upon successful validation, Lumi discovers objects to ingest based on your provided filter. After the discovery phase, Lumi begins ingesting events from the objects.
Before creating backfill jobs, note the following constraints and behavior:
- Limit your job to a maximum of 1,000,000 objects. If you exceed this amount, Lumi doesn't proceed with ingestion. Refine your filter to reduce the size of your job. For details, see Reduce job size.
- It can take time for a job to begin, depending on the volume of discovered objects and any backlog of existing backfill requests.
- Avoid creating backfill jobs with the same specification. This can lead to duplicate events.
- Lumi assigns the user attribute `filename` and the system attribute `correlationId` to each event from a backfill job. Events from the same job have the same correlation ID. For more information, see S3 pull attributes.
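To stay under the per-job object limit, you can count the keys under your intended prefix before submitting a job. The sketch below is illustrative: the boto3 listing shown in the comments is the standard AWS SDK way to page through keys, and the bucket name and prefix are placeholders; only the counting logic is concrete.

```python
# Sketch: estimate backfill job size before submitting, since jobs are capped
# at 1,000,000 objects. With the AWS SDK (boto3) you would page through keys:
#
#   paginator = boto3.client("s3").get_paginator("list_objects_v2")
#   pages = paginator.paginate(Bucket="example-bucket", Prefix="logs/")
#   keys = (obj["Key"] for page in pages for obj in page.get("Contents", []))
#
MAX_OBJECTS_PER_JOB = 1_000_000  # limit stated above

def job_is_too_large(keys) -> bool:
    """Return True if the iterable of keys exceeds the per-job limit."""
    count = 0
    for _ in keys:
        count += 1
        if count > MAX_OBJECTS_PER_JOB:
            return True
    return False
```

If the count exceeds the limit, refine your filter as described in Reduce job size before submitting.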
Create a job in the UI
Create a backfill job using the Lumi UI:

1. From the Lumi navigation menu, click Integrations > S3 pull.
2. In Select the job type, click Backfill.
3. Select your IAM key.
4. In Create a new job, enter the following details:
   - Bucket name: Name of the S3 bucket containing the data.
   - Object filter: Glob pattern for object keys that defines which objects to include when ingesting data. The pattern must match the entire object key. See S3 object filters for examples. Ensure that the number of objects doesn't exceed the maximum limit. If you have more objects than can fit in one job, subdivide them into multiple jobs. For more information, see Reduce job size.
   - Region (optional): AWS region of your S3 bucket. By default, Lumi assumes the same region as the Lumi environment.
   - Modified after (optional): Start date in ISO 8601 format. Only include objects that were created or modified after this date.
   - Modified before (optional): End date in ISO 8601 format. Only include objects that were created or modified before this date.
5. Click Start job. Note that ingestion might not begin immediately. See Backfill job behavior.
6. In Preview incoming data, view the events coming into Lumi. Lumi automatically refreshes the preview pane to display the latest events.
7. Click Explore events to see more events associated with the IAM key. Adjust the time filter to choose the range of data displayed.
Create a job by API
You can use the S3 pull API to ingest event data from an S3 bucket on demand.
To start a backfill job, send a POST request to the /ingest endpoint.
Replace S3_PULL_BACKFILL_ENDPOINT with your Lumi endpoint URL.
You can find this URL in the Authentication and access pane of the S3 pull backfill integration in the UI.
```shell
curl --location 'S3_PULL_BACKFILL_ENDPOINT' \
--header 'Content-Type: application/json' \
--data '{
  "bucket": "BUCKET_NAME"
}'
```
Include the S3 bucket name in the request body. All other body parameters are optional.
Body parameters:

- `bucket` (string, required): S3 bucket name containing the data to backfill.
- `pattern` (string): Pattern that defines which objects to include when ingesting data. By default, Lumi ingests all objects in the S3 bucket. See S3 object filters for details.
- `region` (string): AWS region of your S3 bucket. For example, `us-east-1`. By default, Lumi assumes the same region as the Lumi environment.
- `modifiedAfter` (string): Start date in ISO 8601 format. Only include objects that were modified after this date and time. For example, `2025-01-01T00:00:00Z`.
- `modifiedBefore` (string): End date in ISO 8601 format. Only include objects that were modified before this date and time. For example, `2025-01-31T23:59:59Z`.
Sample request
The following example shows how to ingest data from the example_logs.bz2 file in the example-bucket S3 bucket:
```shell
curl --location 'https://60252ae2-c123-4d56-b78f-910112bef518@s3-pull.us1.api.lumi.imply.io/ingest' \
--header 'Content-Type: application/json' \
--data '{
  "bucket": "example-bucket",
  "pattern": "logs/example_logs.bz2",
  "region": "us-east-1",
  "modifiedAfter": "2025-01-01T00:00:00Z"
}'
```
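For reference, the same request can be sketched with Python's standard library. The endpoint URL and body values mirror the curl sample above; the request is built here but not sent.

```python
import json
import urllib.request

# Same request body as the curl sample above; only "bucket" is required.
body = {
    "bucket": "example-bucket",
    "pattern": "logs/example_logs.bz2",
    "region": "us-east-1",
    "modifiedAfter": "2025-01-01T00:00:00Z",
}

req = urllib.request.Request(
    "https://60252ae2-c123-4d56-b78f-910112bef518@s3-pull.us1.api.lumi.imply.io/ingest",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# To send it: response = urllib.request.urlopen(req)
```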
A successful request returns an HTTP 200 OK status code and the task ID (`taskId`) in the response body.
View job status
If you use the Lumi UI to submit a job, Lumi displays the status of the job. Click View and manage job for more details on the Jobs page. There, you can see the progress of discovery and processing for the job. You can also cancel the job and view past jobs. For more details, see View and manage jobs.

Check Lumi for events
You can preview the incoming data in the Lumi UI:
- From the Lumi navigation menu, click Integrations > S3 pull.
- Click Backfill and select your IAM key.
- In Preview incoming data, view the events coming into Lumi. Lumi automatically refreshes the preview pane to display the latest events. The preview pane only shows events with timestamps in the last 24 hours.
- Click Explore events to see more events associated with the IAM key. Adjust the time range selector to filter the data displayed.
Once events start flowing into Lumi, you can search them. See Search events with Lumi for details on how to search and Lumi query syntax for a list of supported operators.
If you sent events but don't see them in the preview pane, search for them in the explore view. Filter your search by the time range that spans your event timestamps. For information on troubleshooting ingestion, see Troubleshoot data ingestion.
S3 object filters
You can define filters for S3 object keys to control which objects to ingest.
In the Lumi UI, specify a pattern in the Object filter field.
In the API, include the pattern field in the request body.
The following table lists the glob patterns supported by Lumi:
| Pattern | Description | Example |
|---|---|---|
| `**` | Matches zero or more path segments | `logs/**`, `**/*.json` |
| `*` | Matches any characters except path separators | `logs/2025-10-*.json` |
| `?` | Matches exactly one character except path separators | `logs/demo-logs-?.json` |
| `[abc]` | Matches any character in the set | `logs/[aeu]*_logs.*` |
| `[a-z]` | Matches any character in the range | `logs/[a-z]*_logs.*` |
| `[!abc]` | Matches any character not in the set (negation) | `logs/[!aeu]*_logs.*` |
| `{a,b,c}` | Matches any of the alternatives (brace expansion) | `logs/*.{bz2,gz}` |
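To sanity-check a filter before running a job, you can translate these patterns into regular expressions and test them against sample keys locally. This is an illustrative sketch of the semantics in the table above, not Lumi's own matcher:

```python
import re

def glob_to_regex(pattern: str) -> str:
    """Translate the glob syntax from the table into a regular expression."""
    out, i = [], 0
    while i < len(pattern):
        c = pattern[i]
        if pattern.startswith("**", i):
            out.append(".*")           # ** crosses path separators
            i += 2
        elif c == "*":
            out.append("[^/]*")        # * stops at path separators
            i += 1
        elif c == "?":
            out.append("[^/]")         # exactly one non-separator character
            i += 1
        elif c == "[":
            j = pattern.index("]", i + 1)
            body = pattern[i + 1:j]
            if body.startswith("!"):
                body = "^" + body[1:]  # [!abc] negates the set
            out.append("[" + body + "]")
            i = j + 1
        elif c == "{":
            j = pattern.index("}", i + 1)
            alts = pattern[i + 1:j].split(",")
            out.append("(" + "|".join(map(re.escape, alts)) + ")")
            i = j + 1
        else:
            out.append(re.escape(c))
            i += 1
    return "".join(out)

def matches(pattern: str, key: str) -> bool:
    # The pattern must match the entire object key.
    return re.fullmatch(glob_to_regex(pattern), key) is not None
```

For example, `matches("logs/*.json", "logs/access/app.json")` is false because `*` does not cross the path separator, while `logs/**` would match that key.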
Reduce job size
To reduce the size of your backfill job, modify the prefix in your object filter. For example, suppose you have an S3 bucket with objects organized in the following structure:

```text
logs/
├── access/
│   ├── 2026/
│   └── 2025/
├── firewall/
└── system/
```
Instead of a single backfill job that uses the filter logs/**, you can initiate three backfill jobs with more specific prefixes:
- `logs/access/**`
- `logs/firewall/**`
- `logs/system/**`
You can also further subdivide the backfill job, such as to ingest access logs by specific years.
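If you drive backfill through the API described above, the narrower jobs can be generated programmatically. A brief sketch (the bucket name is a placeholder):

```python
# One request body per prefix, instead of a single job for logs/**.
prefixes = ["logs/access/**", "logs/firewall/**", "logs/system/**"]
jobs = [{"bucket": "example-bucket", "pattern": p} for p in prefixes]
# Each body would be POSTed to the /ingest endpoint as a separate backfill job.
```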
Note that you can't reduce the job size by applying the following changes:
- Adjusting the time range of objects modified before or after a certain timestamp.
- Changing the filter such that the prefix doesn't narrow the scope of discovery. For example, `logs/**/*.json` still requires evaluation of the same number of objects as `logs/**`.
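The reason is that only the literal prefix before the first wildcard can narrow object discovery. An illustrative helper (not Lumi's implementation) makes this concrete:

```python
def discovery_prefix(pattern: str) -> str:
    """Return the literal key prefix before the first wildcard character."""
    for i, ch in enumerate(pattern):
        if ch in "*?[{":
            return pattern[:i]
    return pattern

# logs/** and logs/**/*.json share the prefix "logs/", so discovery must
# evaluate the same set of keys for both filters.
```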
Learn more
See the following topics for more information:
- Recurring ingestion for recurring ingestion from an Amazon S3 bucket to Lumi.
- Transform events using pipelines for information on how to transform events in Lumi.
- Send events to Lumi for other options to send events.
- Cloud regions for Lumi regions and their AWS equivalents.