IngestionJobSpec
Specification for a batch ingestion job.
Example
{
  "jobType": "ingestion",
  "fileList": ["logicalFilename1", ...],
  "timestampMapping": {
    "inputField": "myTimestampColumn",
    "format": "iso",
    "missingValue": "2022-01-01"
  },
  "columnRenames": [
    {
      "inputField": "source",
      "outputColumn": "target"
    },
    ...
  ],
  "maxParsingErrors": 1000,
  "formatSettings": {
    "format": "nd-json"
  }
}
Properties
jobType
String required
Possible values: ingestion
Type of the job.
fileList
array[String] required
List of files to ingest. All files listed must have the same format (e.g., newline-delimited JSON) and, if specified, the same formatSettings. To ingest files with different formats or format settings into the same table, split them into multiple ingestion jobs.
isReplace
Boolean
Default: false
Controls behavior when the data to ingest falls within an interval for which the table already has data. If false (the default), appends new data and preserves any existing data for those intervals. If true, the new data replaces any data that exists for the common intervals.
intervals
array[String]
Specifies the intervals of data to ingest. If not specified, empty, or null (the default), Polaris discovers the intervals by inspecting the data to ingest. If specified, Polaris ignores data outside the given intervals. Required if isReplace is true.
Example: [2021-05-01/2021-05-02]
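The dependency between isReplace and intervals can be checked on the client before a job is submitted. The following is a minimal sketch in Python, not part of the Polaris API: the helper name validate_replace_intervals is hypothetical, and the interval is written as a quoted string on the assumption that array[String] means JSON strings.

```python
import json


def validate_replace_intervals(spec):
    """Raise ValueError if isReplace is true but intervals is missing, empty, or null.

    Hypothetical client-side helper; it only mirrors the rule stated above.
    """
    if spec.get("isReplace", False) and not spec.get("intervals"):
        raise ValueError("intervals is required when isReplace is true")


# Job spec built from the fields documented on this page.
spec = {
    "jobType": "ingestion",
    "fileList": ["logicalFilename1"],
    "isReplace": True,
    "intervals": ["2021-05-01/2021-05-02"],
    "timestampMapping": {
        "inputField": "myTimestampColumn",
        "format": "iso",
        "missingValue": "2022-01-01",
    },
}

validate_replace_intervals(spec)      # passes: intervals is present
payload = json.dumps(spec, indent=2)  # JSON body for the job request
```

Running the check before serializing the spec surfaces a missing intervals value locally instead of at submission time.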
timestampMapping
TimestampMapping required
Describes which input field should be used as the Druid timestamp column.
For rows that do not have the specified input timestamp field, define their default timestamp in missingValue.
columnRenames
Array
Any column renames that should be applied. Specify each column rename as a JSON object with the following fields:
inputField: Name of the field from the input data.
outputColumn: Desired new name for the input field.
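To illustrate what a columnRenames entry means for a single input row, here is a minimal sketch in Python. It is only an illustration of the mapping, not how Polaris implements renames; the function apply_column_renames and the sample row are hypothetical.

```python
def apply_column_renames(row, column_renames):
    """Return a copy of row with each inputField renamed to its outputColumn.

    Hypothetical helper for illustration; fields without a rename keep their name.
    """
    mapping = {r["inputField"]: r["outputColumn"] for r in column_renames}
    return {mapping.get(name, name): value for name, value in row.items()}


# Same rename as in the example spec: "source" becomes "target".
column_renames = [{"inputField": "source", "outputColumn": "target"}]

# Made-up input row for demonstration.
row = {"myTimestampColumn": "2022-01-01T00:00:00Z", "source": "web"}

renamed = apply_column_renames(row, column_renames)
# renamed == {"myTimestampColumn": "2022-01-01T00:00:00Z", "target": "web"}
```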
maxParsingErrors
Integer
Maximum number of parsing errors allowed to occur before the job fails.
maxSavedParsingErrors
Integer
Default: 5
Maximum number of parsing errors to save.
excludedColumns
array[String]
Columns that should not be ingested.
formatSettings
JSON object
Data format settings that apply to all files in the ingestion job. Define the appropriate settings based on the files to ingest:
Polaris automatically detects the file type based on the file extension. If you specify a value for format in formatSettings that does not match the automatically detected type, Polaris attempts to ingest based on the user-specified value.
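The precedence described above (a user-specified format wins over the detected one) can be sketched as follows. This is a hedged illustration in Python, not Polaris code: the function effective_format is hypothetical, and the detected value is passed in as a placeholder string because the detection logic itself is not documented here.

```python
def effective_format(detected_format, format_settings=None):
    """Return the format an ingestion would use under the rule stated above.

    Hypothetical helper: a format given in formatSettings takes precedence;
    otherwise the automatically detected format applies.
    """
    user_format = (format_settings or {}).get("format")
    return user_format if user_format else detected_format


# "detected-format" is a placeholder, not a real Polaris format name.
chosen = effective_format("detected-format", {"format": "nd-json"})
fallback = effective_format("detected-format")
# chosen == "nd-json", fallback == "detected-format"
```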