Ingest query returns SUCCESS before generated segments are loaded even when waitUntilSegmentsLoad is true #19003

@SperedArGoadeg

Affected Version

35.0.1

Description

When running an MSQ ingestion with the waitUntilSegmentsLoad option enabled in the query context, the task may return a SUCCESS status even though the generated segments are not yet loaded.

This behavior is intermittent; it does not occur on every ingestion.

When the issue occurs, any query against the target datasource fails with the following error:

{
  "error": "druidException",
  "errorCode": "invalidInput",
  "persona": "USER",
  "category": "INVALID_INPUT",
  "errorMessage": "Object 'my-datasource' not found (line [1], column [753])",
  "context": {
    "sourceType": "sql",
    "line": "1",
    "column": "753",
    "endLine": "1",
    "endColumn": "790"
  }
}

Example query (columns removed for confidentiality):

SELECT __time,other_column FROM "my-datasource" LIMIT 1
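When reproducing this, it may help to tell this specific symptom apart from other query failures, for example to decide whether a client should retry. The following is a minimal sketch, not part of Druid: the helper name `missing_datasource` is hypothetical, and it relies only on the `errorCode` and `errorMessage` fields visible in the error payload above.

```python
import re
from typing import Optional

# Matches the validation message quoted in the error payload above.
_NOT_FOUND = re.compile(r"Object '([^']+)' not found")


def missing_datasource(error_payload: dict) -> Optional[str]:
    """Return the datasource named in an "Object ... not found" error, else None."""
    # Hypothetical helper: only the two fields shown in the report are inspected.
    if error_payload.get("errorCode") != "invalidInput":
        return None
    match = _NOT_FOUND.search(error_payload.get("errorMessage", ""))
    return match.group(1) if match else None
```

A client could call this on the parsed JSON body of a failed query and treat a non-`None` result for the freshly ingested datasource as "not visible yet" rather than as a genuine typo in the table name.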

Anonymized ingestion query:

INSERT INTO "my-datasource"
WITH ext AS (
  SELECT *
  FROM TABLE(
    EXTERN(
      '{"type":"hdfs","paths":"hdfs://hdfs-namenodes:8020/data.parquet"}',
      '{"type":"parquet"}'
    )
  )
  EXTEND ( start_date BIGINT, other_column VARCHAR )
)
SELECT MILLIS_TO_TIMESTAMP(start_date) AS __time, other_column
FROM ext
PARTITIONED BY ALL

Query context:

{
  "__user": "druid_system",
  "finalize": true,
  "maxNumTasks": 6,
  "maxParseExceptions": 0,
  "queryId": "d7120b41-b2a4-4541-9723-655824523f29",
  "rowBasedFrameType": 19,
  "sqlInsertSegmentGranularity": "{\"type\":\"all\"}",
  "sqlQueryId": "d7120b41-b2a4-4541-9723-655824523f29",
  "startTime": "2026-02-10T11:20:29.592Z",
  "taskAssignment": "auto",
  "waitUntilSegmentsLoad": true,
  "windowFunctionOperatorTransformation": true
}

A related symptom can also occur: the task returns a SUCCESS status while some columns of the datasource are not yet available. In this case, the datasource exists and is queryable, but some columns are temporarily missing.
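Until the root cause is identified, a client-side check can cover both symptoms (datasource missing, or only some columns visible) by polling the column list after the task reports SUCCESS. This is a sketch of a possible workaround, not a fix. `wait_for_columns` and `fetch_columns` are hypothetical names. The caller supplies `fetch_columns`, which is assumed to run `COLUMNS_SQL` against the Broker's SQL endpoint with the datasource name as the query parameter and return the column names.

```python
import time
from typing import Callable, Iterable, Set

# Lookup against Druid's INFORMATION_SCHEMA; 'druid' is the schema holding datasources.
# Passing the datasource as a `?` parameter is an assumption about how the caller runs it.
COLUMNS_SQL = (
    "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS "
    "WHERE TABLE_SCHEMA = 'druid' AND TABLE_NAME = ?"
)


def wait_for_columns(
    fetch_columns: Callable[[str], Iterable[str]],
    datasource: str,
    expected: Set[str],
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Set[str]:
    """Poll until every expected column is visible; return the columns still missing."""
    deadline = time.monotonic() + timeout_s
    while True:
        # An empty result covers the case where the datasource is not visible at all.
        missing = expected - set(fetch_columns(datasource))
        if not missing or time.monotonic() >= deadline:
            return missing
        sleep(poll_s)
```

For the ingestion above, the expected set would be `{"__time", "other_column"}`. An empty return value means all columns are visible; a non-empty one means the timeout expired first. With two Brokers, as in this cluster, each Broker may need to be checked separately, since a check against one may not reflect what the other sees.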

Additional information:

  • Cluster setup:
    • 2 brokers
    • 1 coordinator
    • 2 historicals
  • Deployment:
    • MiddleManager-less mode
    • Kubernetes
