This pipeline generates line sub-patches for retrieval training, storing neighborhood metadata implicitly (Option A).
- Detects Tibetan textboxes per page using model m (YOLO).
- Detects lines inside each textbox using algorithm A (classical vertical profile segmentation).
- Normalizes each line to fixed height Ht, computes ink map J, horizontal profile p(x), and boundary minima.
- Builds dense and boundary-aligned windows per scale.
- Filters by ink ratio and samples candidates per line/scale.
- Saves patch PNGs and Parquet metadata.
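
The windowing and filtering steps above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: it assumes the ink map J is a binary grid (rows of 0/1), that p(x) is the column-wise ink sum, and that "boundary minima" are local minima of p(x). The function names and the `stride`, `min_ink`, and `max_ink` parameters are hypothetical, and a real implementation would likely operate on NumPy arrays instead of nested lists.

```python
def horizontal_profile(ink):
    """Column-wise ink sum p(x) for an ink map given as rows of 0/1 values."""
    return [sum(col) for col in zip(*ink)]


def boundary_minima(profile):
    """Indices of local minima of p(x), taken as candidate glyph boundaries."""
    return [
        x
        for x in range(1, len(profile) - 1)
        if profile[x] <= profile[x - 1] and profile[x] < profile[x + 1]
    ]


def dense_windows(width, scale_w, stride):
    """Fixed-stride windows (x0, x1) of width scale_w inside [0, width)."""
    if width < scale_w:
        return []
    return [(x0, x0 + scale_w) for x0 in range(0, width - scale_w + 1, stride)]


def boundary_windows(minima, width, scale_w):
    """Windows of width scale_w whose left edge sits on a boundary minimum."""
    return [(x0, x0 + scale_w) for x0 in minima if x0 + scale_w <= width]


def ink_ratio(ink, x0, x1):
    """Fraction of ink pixels inside the window [x0, x1)."""
    total = sum(sum(row[x0:x1]) for row in ink)
    return total / (len(ink) * (x1 - x0))


def candidate_windows(ink, scale_w, stride, min_ink, max_ink):
    """Union of dense and boundary-aligned windows, filtered by ink ratio."""
    width = len(ink[0])
    minima = boundary_minima(horizontal_profile(ink))
    windows = set(dense_windows(width, scale_w, stride))
    windows.update(boundary_windows(minima, width, scale_w))
    return sorted(w for w in windows if min_ink <= ink_ratio(ink, *w) <= max_ink)


# Toy 3 x 12 line image: two ink clusters separated by a gap, then blank space.
ink = [
    [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
]
print(candidate_windows(ink, scale_w=4, stride=3, min_ink=0.5, max_ink=1.0))
```

In this toy case the dense windows that straddle the gap fall below the ink threshold, while the window starting at the boundary minimum is kept, which is the motivation for mixing both window types.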

Via package entrypoint:

    python -m pechabridge.cli.gen_patches --config configs/patch_gen.yaml

Via project root CLI:

    python cli.py gen-patches --config configs/patch_gen.yaml

Override YAML keys from CLI, e.g.:

    python cli.py gen-patches \
      --config configs/patch_gen.yaml \
      --model models/your_layout_model.pt \
      --input-dir sbb_images \
      --output-dir datasets/pecha_line_patches \
      --no-samples 200 \
      --debug-dump 12

    out_dataset/
      patches/
        doc={doc_id}/page={page_id}/line={line_id}/scale={scale_w}/patch_{patch_id}.png
      meta/
        patches.parquet
      debug/
        ... overlays (if --debug-dump > 0)

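
Since the patch path encodes the grouping keys, a path can be built from a metadata row and parsed back without opening the Parquet file. The helpers below are a hypothetical sketch of that round trip; `patch_path` and `parse_patch_path` are illustrative names, and the sketch assumes IDs are written into the path exactly as given (no zero-padding or escaping).

```python
from pathlib import Path


def patch_path(root, doc_id, page_id, line_id, scale_w, patch_id):
    """Build the PNG path for one patch following the layout shown above."""
    return (
        Path(root)
        / "patches"
        / f"doc={doc_id}"
        / f"page={page_id}"
        / f"line={line_id}"
        / f"scale={scale_w}"
        / f"patch_{patch_id}.png"
    )


def parse_patch_path(path):
    """Recover the key fields (as strings) from a patch path."""
    path = Path(path)
    # The four directories directly above the file carry the key=value pairs.
    fields = dict(part.split("=", 1) for part in path.parts[-5:-1])
    fields["patch_id"] = path.stem.split("_", 1)[1]
    return fields


p = patch_path("out_dataset", "d1", 3, 7, 256, 12)
print(p.as_posix())
print(parse_patch_path(p))
```

Parsed values come back as strings, so callers that need integers for `page_id`, `line_id`, or `scale_w` would have to convert them.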
For each group (doc_id, page_id, line_id, scale_w):
- metadata rows are sorted by x0_px
- k is assigned as contiguous 0..n-1
Neighborhood can be derived on-the-fly from (doc_id, page_id, line_id, scale_w, k) during training.
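
As an illustration of how this ordering supports on-the-fly neighborhoods, the sketch below assigns k per group and looks up neighbors by key arithmetic. It assumes metadata rows are plain dicts with the column names used above; `assign_k`, `build_index`, `neighbors`, and the `radius` parameter are hypothetical names, and what counts as a neighbor in training (for example k ± 1) is an assumption here.

```python
from collections import defaultdict

GROUP_KEYS = ("doc_id", "page_id", "line_id", "scale_w")


def assign_k(rows):
    """Sort each group by x0_px and attach a contiguous index k = 0..n-1."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[name] for name in GROUP_KEYS)].append(row)
    out = []
    for group in groups.values():
        group.sort(key=lambda r: r["x0_px"])
        out.extend({**row, "k": k} for k, row in enumerate(group))
    return out


def build_index(rows):
    """Map (doc_id, page_id, line_id, scale_w, k) to its metadata row."""
    return {tuple(r[name] for name in GROUP_KEYS) + (r["k"],): r for r in rows}


def neighbors(row, index, radius=1):
    """Rows in the same group whose k lies within `radius` of this row's k."""
    key = tuple(row[name] for name in GROUP_KEYS)
    found = []
    for dk in range(-radius, radius + 1):
        if dk == 0:
            continue
        other = index.get(key + (row["k"] + dk,))
        if other is not None:
            found.append(other)
    return found


base = {"doc_id": "d", "page_id": 1, "line_id": 0}
rows = assign_k([
    {**base, "scale_w": 256, "x0_px": 40},
    {**base, "scale_w": 256, "x0_px": 0},
    {**base, "scale_w": 256, "x0_px": 20},
    {**base, "scale_w": 512, "x0_px": 0},
])
index = build_index(rows)
print([n["x0_px"] for n in neighbors(rows[1], index)])
```

Because k is contiguous within each group, a missing key simply means the patch sits at the start or end of its line, so no explicit neighbor lists need to be stored.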