furniture field missing from DoclingDocument JSON serialized via docling-java #386

@mohammedfaisal

Summary

When converting a PDF to a DoclingDocument using docling-java, the resulting JSON is missing the mandatory furniture field defined in the DoclingDocument schema. This makes the document non-compliant with the schema and breaks downstream processing. For example, converting the document to Markdown fails because the furniture root node is absent.

Environment

Steps to Reproduce

  1. Use docling-java to convert a PDF via the ConvertDocumentRequest API with OutputFormat.JSON and includeImages(false).
  2. Retrieve the DoclingDocument from response.getDocument().getJsonContent().
  3. Serialize the DoclingDocument to JSON using Jackson ObjectMapper.
  4. Observe that the furniture field is absent from the output JSON.
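Step 4 can be checked mechanically instead of by eye. The following is a minimal sketch in Python (standard library only), not part of docling-java. The `EXPECTED_KEYS` tuple lists only the top-level fields named in this issue, not the full schema, and `sample` is a hypothetical stand-in for the serialized output.

```python
import json

# Top-level keys mentioned in this issue; this is NOT the full DoclingDocument schema.
EXPECTED_KEYS = ("schema_name", "version", "name", "origin", "furniture", "body")


def missing_top_level_keys(serialized: str) -> list[str]:
    """Return the expected top-level keys that are absent from the JSON text."""
    doc = json.loads(serialized)
    return [key for key in EXPECTED_KEYS if key not in doc]


# Hypothetical stand-in for the JSON produced in step 3 (nested objects elided).
sample = json.dumps({
    "schema_name": "DoclingDocument",
    "version": "1.9.0",
    "name": "4085-original",
    "origin": {},
    "body": {},
})

print(missing_top_level_keys(sample))  # prints ['furniture']
```

Running the same check against the JSON downloaded from the docling-serve web UI should return an empty list if the field is present there, as reported below.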

Expected Behavior

The serialized JSON should include the furniture field, as it is a mandatory part of the DoclingDocument schema. When converting the same PDF using the docling-serve web UI (backed by the same docling-serve instance), the furniture field is correctly present:

"furniture": {
  "self_ref": "#/furniture",
  "parent": null,
  "children": [],
  "content_layer": "furniture",
  "meta": null,
  "name": "_root_",
  "label": "unspecified"
}

Actual Behavior

The JSON produced by docling-java is missing the furniture field entirely. The document starts directly with body after origin:

{
  "schema_name": "DoclingDocument",
  "version": "1.9.0",
  "name": "4085-original",
  "origin": { ... },
  "body": { ... }
  // ← no "furniture" field
}
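Until the serialization is fixed, one possible stopgap is to patch the JSON before handing it to downstream tools. The sketch below is an assumption-laden workaround, not a fix for docling-java: it inserts the empty root node exactly as quoted from the docling-serve output above, and it assumes downstream consumers accept that default. The function name `ensure_furniture` is hypothetical.

```python
import copy
import json

# Empty furniture root node, copied from the docling-serve output quoted in this issue.
DEFAULT_FURNITURE = {
    "self_ref": "#/furniture",
    "parent": None,
    "children": [],
    "content_layer": "furniture",
    "meta": None,
    "name": "_root_",
    "label": "unspecified",
}


def ensure_furniture(serialized: str) -> str:
    """Return the JSON text with a default furniture node added if it is absent."""
    doc = json.loads(serialized)
    if "furniture" not in doc:
        # Deep copy so callers never share the mutable "children" list.
        doc["furniture"] = copy.deepcopy(DEFAULT_FURNITURE)
    return json.dumps(doc)


patched = json.loads(ensure_furniture('{"schema_name": "DoclingDocument", "body": {}}'))
print(patched["furniture"]["self_ref"])  # prints #/furniture
```

The new key is appended after the existing ones; JSON object members are unordered, so this should not matter to a schema-aware reader. An existing furniture node is left untouched.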

Please see the diff between the DoclingDocuments generated via the docling-serve web UI and via docling-java in the attached screenshot.

Image
