TMBLD-YOLO26m: Tibetan Modern Book Layout Detection
A fine-tuned YOLO26m object-detection model for Tibetan modern book layout detection. The model detects four layout classes in page images of modern Tibetan books: header, Text area, footnote, and footer.
Model Description
This model was fine-tuned from the Ultralytics YOLO26m pretrained checkpoint on the BDRC/TDLA-Training-Dataset, a YOLO-format bounding-box dataset of Tibetan document pages sourced from the Buddhist Digital Resource Center (BDRC) digital library.
| Property | Value |
|---|---|
| Architecture | YOLO26m |
| Task | Object Detection |
| Image size | 640 × 640 |
| Number of classes | 4 |
| Training platform | Ultralytics HUB |
| Weights file | Tibetan_modern_book_Layout_detection.pt |
Classes
| ID | Class | Description |
|---|---|---|
| 0 | header | Page header region |
| 1 | Text area | Main body text region |
| 2 | footnote | Footnote region |
| 3 | footer | Page footer region |
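Downstream code usually needs to turn predicted class IDs back into names. The following is a minimal sketch of that lookup, using the IDs and names from the table above; `CLASS_NAMES` and `class_name` are illustrative names, not part of the model or of Ultralytics.

```python
# Class IDs and names copied from the table above.
CLASS_NAMES = {
    0: "header",
    1: "Text area",
    2: "footnote",
    3: "footer",
}


def class_name(cls_id: int) -> str:
    """Return the layout class name for a predicted class ID."""
    try:
        return CLASS_NAMES[cls_id]
    except KeyError:
        raise ValueError(f"unknown class ID: {cls_id}") from None


print(class_name(1))  # Text area
```

A loaded Ultralytics model also exposes a `names` attribute, which should carry the same mapping if the checkpoint was exported with these class labels.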
Performance
Evaluated on the validation split of the TDLA Training Dataset.
Training Loss (final epoch)
| Loss Component | Train | Val |
|---|---|---|
| Box loss | 0.515 | 0.643 |
| Classification loss | 0.218 | 0.276 |
| DFL loss | 0.003 | 0.004 |
Training Details
Dataset
- Dataset: BDRC/TDLA-Training-Dataset
- Train images: 2,692
- Val images: 103
- Test images: 313
- Total annotations: 14,705
- Train/Val split: Iterative multi-label stratification (seed 42, 80/20 ratio)
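The share of each split follows directly from the image counts listed above. The short calculation below is only a sketch; the annotations-per-image figure assumes that the annotation total covers all three splits, which the list does not state.

```python
# Image counts copied from the list above.
splits = {"train": 2692, "val": 103, "test": 313}
total_annotations = 14705

total_images = sum(splits.values())
for name, count in splits.items():
    print(f"{name}: {count} images ({count / total_images:.1%})")

# Assumption: the annotation total covers train, val, and test together.
print(f"annotations per image: {total_annotations / total_images:.2f}")
```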
Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 150 |
| Patience | 100 |
| Batch size | Auto (-1) |
| Image size | 640 |
| Optimizer | Auto (SGD) |
| Initial learning rate (lr0) | 0.01 |
| Final learning rate factor (lrf) | 0.01 |
| Momentum | 0.937 |
| Weight decay | 0.0005 |
| Warmup epochs | 3.0 |
| Warmup momentum | 0.8 |
| Warmup bias lr | 0.1 |
| AMP (mixed precision) | True |
| Pretrained | True |
| Deterministic | True |
| Seed | 0 |
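The table above maps onto keyword arguments of the Ultralytics `train` method. The sketch below shows one way such a run might be set up; it is not the original training script. The dataset config path passed as `data_yaml` is hypothetical, and the base checkpoint name `yolo26m.pt` is an assumption.

```python
# Values copied from the hyperparameter table above.
TRAIN_ARGS = {
    "epochs": 150,
    "patience": 100,
    "batch": -1,            # -1 lets Ultralytics pick the batch size
    "imgsz": 640,
    "optimizer": "auto",
    "lr0": 0.01,
    "lrf": 0.01,
    "momentum": 0.937,
    "weight_decay": 0.0005,
    "warmup_epochs": 3.0,
    "warmup_momentum": 0.8,
    "warmup_bias_lr": 0.1,
    "amp": True,
    "pretrained": True,
    "deterministic": True,
    "seed": 0,
}


def train(data_yaml: str, base_weights: str = "yolo26m.pt"):
    """Fine-tune a base checkpoint with the settings above.

    Both arguments are assumptions: `data_yaml` is a hypothetical path to a
    YOLO dataset config, and `base_weights` is a guess at the checkpoint name.
    """
    from ultralytics import YOLO  # lazy import; needs `pip install ultralytics`

    model = YOLO(base_weights)
    return model.train(data=data_yaml, **TRAIN_ARGS)


print(f"{len(TRAIN_ARGS)} training arguments defined")
```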
Loss Weights
| Component | Weight |
|---|---|
| Box | 7.5 |
| Classification | 0.5 |
| DFL | 1.5 |
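To illustrate what these weights do, the sketch below assumes each one acts as a multiplicative gain on its loss component before the components are summed. The component values in `example` are hypothetical and are not taken from this model's training run.

```python
# Loss gains copied from the table above.
LOSS_WEIGHTS = {"box": 7.5, "cls": 0.5, "dfl": 1.5}


def weighted_loss(components: dict[str, float]) -> float:
    """Sum the loss components after scaling each one by its gain."""
    return sum(LOSS_WEIGHTS[name] * value for name, value in components.items())


# Hypothetical unweighted component values, chosen only for illustration.
example = {"box": 0.1, "cls": 0.4, "dfl": 0.2}
print(f"{weighted_loss(example):.2f}")
```

With these weights, a given error in box regression contributes far more to the total than the same error in classification.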
Augmentation
| Augmentation | Value |
|---|---|
| HSV-Hue | 0.015 |
| HSV-Saturation | 0.7 |
| HSV-Value | 0.4 |
| Translation | 0.1 |
| Scale | 0.5 |
| Flip left-right | 0.5 |
| Mosaic | 1.0 |
| Erasing | 0.4 |
| Close mosaic (last N epochs) | 10 |
| Auto augment | RandAugment |
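The augmentation settings correspond to Ultralytics training arguments. As a sketch, they can be collected in a dictionary and passed to `train` alongside the other hyperparameters; `AUGMENT_ARGS` is an illustrative name.

```python
# Values copied from the augmentation table above,
# keyed by the matching Ultralytics argument names.
AUGMENT_ARGS = {
    "hsv_h": 0.015,
    "hsv_s": 0.7,
    "hsv_v": 0.4,
    "translate": 0.1,
    "scale": 0.5,
    "fliplr": 0.5,
    "mosaic": 1.0,
    "erasing": 0.4,
    "close_mosaic": 10,     # disable mosaic for the last 10 epochs
    "auto_augment": "randaugment",
}

for name, value in AUGMENT_ARGS.items():
    print(f"{name} = {value}")
```

These would typically be unpacked into the training call, for example `model.train(data=..., **AUGMENT_ARGS)`.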
Usage
Inference with Ultralytics
```python
from ultralytics import YOLO

model = YOLO("Tibetan_modern_book_Layout_detection.pt")
results = model.predict("page_image.jpg", imgsz=640)

# Each result holds the detections for one input image.
for result in results:
    boxes = result.boxes
    for box in boxes:
        cls_id = int(box.cls)
        conf = float(box.conf)
        xyxy = box.xyxy[0].tolist()
        print(f"Class: {cls_id}, Confidence: {conf:.3f}, Box: {xyxy}")
```
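Detections are not guaranteed to come back in page order. A possible post-processing step, sketched below, sorts them top to bottom by their upper edge. The `Detection` type and the sample boxes are hypothetical stand-ins for values read from `result.boxes`, not real model output.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    cls_id: int
    conf: float
    xyxy: tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def sort_top_to_bottom(detections: list[Detection]) -> list[Detection]:
    """Order detections by top edge, then by left edge."""
    return sorted(detections, key=lambda d: (d.xyxy[1], d.xyxy[0]))


# Hypothetical detections for one page, not real model output.
page = [
    Detection(3, 0.91, (40.0, 900.0, 600.0, 940.0)),  # footer
    Detection(1, 0.97, (40.0, 80.0, 600.0, 800.0)),   # Text area
    Detection(0, 0.88, (40.0, 20.0, 600.0, 60.0)),    # header
]
for det in sort_top_to_bottom(page):
    print(det.cls_id, det.xyxy)
```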
Batch Inference
```python
from ultralytics import YOLO

model = YOLO("Tibetan_modern_book_Layout_detection.pt")
results = model.predict("path/to/images/", imgsz=640, conf=0.25)
```
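For large folders, holding every result in memory at once may be undesirable. The sketch below uses the `stream=True` option of `predict`, which yields results one at a time. The helper names, the suffix list, and the idea of counting boxes per image are illustrative choices, not part of the model card's workflow.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the collection.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def list_images(folder: str) -> list[Path]:
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def count_boxes_per_image(folder: str, weights: str) -> dict[str, int]:
    """Run the detector over a folder and count detections per image."""
    from ultralytics import YOLO  # lazy import; needs `pip install ultralytics`

    model = YOLO(weights)
    sources = [str(p) for p in list_images(folder)]
    counts = {}
    # stream=True yields one result at a time instead of building a full list.
    for result in model.predict(sources, imgsz=640, conf=0.25, stream=True):
        counts[Path(result.path).name] = len(result.boxes)
    return counts
```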
Intended Use
This model is designed for automatic layout detection of modern Tibetan book pages. It can be used as a preprocessing step for:
- OCR pipelines on Tibetan documents
- Document digitization workflows
- Structured text extraction from scanned Tibetan texts
- Digital library cataloging and indexing
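When the model feeds an OCR pipeline, each detected region is typically cropped from the page before recognition. The sketch below converts a floating-point box into integer crop bounds with optional padding, clamped to the image. The function name, padding value, and sample numbers are assumptions for illustration.

```python
import math


def crop_box(xyxy, image_size, padding=0):
    """Convert a float (x1, y1, x2, y2) box to integer crop bounds.

    The box is grown by `padding` pixels on every side and clamped to the
    image, whose size is given as (width, height).
    """
    width, height = image_size
    x1, y1, x2, y2 = xyxy
    left = max(0, math.floor(x1) - padding)
    top = max(0, math.floor(y1) - padding)
    right = min(width, math.ceil(x2) + padding)
    bottom = min(height, math.ceil(y2) + padding)
    return left, top, right, bottom


# Hypothetical "Text area" box on a 640 x 960 page image.
print(crop_box((40.4, 79.6, 600.2, 800.9), image_size=(640, 960), padding=8))
# (32, 71, 609, 809)
```

The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` expects.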
Limitations
- Trained primarily on modern Tibetan book layouts; performance on historical manuscripts, woodblock prints, or non-standard layouts may vary.
- Optimized for 640×640 input resolution; very high-resolution pages may benefit from tiling or higher `imgsz` values.
- The footnote class has fewer training samples (456 annotations) compared to other classes, which may affect detection quality for that class.
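One way to tile a high-resolution page is to cover it with overlapping fixed-size windows and run the detector on each. The sketch below only computes the windows. The 640-pixel tile, the 64-pixel overlap, and the page size in the example are assumptions, not values recommended by the model authors.

```python
def tile_origins(length: int, tile: int, overlap: int) -> list[int]:
    """Start offsets of tiles covering [0, length) along one axis."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile sits flush with the far edge
    return origins


def tile_windows(width, height, tile=640, overlap=64):
    """Return (x1, y1, x2, y2) windows that together cover the whole page."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


# Hypothetical 1600 x 2400 pixel page scan.
windows = tile_windows(1600, 2400)
print(len(windows), windows[0], windows[-1])
```

Boxes predicted on a tile would then need to be shifted by that tile's origin and de-duplicated where tiles overlap, for example with non-maximum suppression.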
License
This model is released under CC0 1.0 Universal (Public Domain Dedication). You are free to copy, modify, and distribute the model, even for commercial purposes, without asking permission.
Acknowledgements
This dataset was developed by Dharmaduta from specifications provided by the Buddhist Digital Resource Center (BDRC) for the BDRC Etext Corpus, with funding from the Khyentse Foundation.
Citation
If you use this model, please cite it as follows:
```bibtex
@software{bdrc_tmbld_yolo26m_2026,
  title = {tmbld-YOLO26m: Tibetan Modern book layout detection Model},
  author = {Buddhist Digital Resource Center (BDRC)},
  year = {2026},
  url = {https://huggingface.co/BDRC/TDLA-YOLO26m},
  license = {CC0-1.0}
}
```
Evaluation results
Self-reported metrics on the TDLA Training Dataset:
- mAP@0.5: 0.982
- mAP@0.5:0.95: 0.799
- Precision: 0.966
- Recall: 0.970
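Precision and recall can be combined into a single F1 score, their harmonic mean. The calculation below is a sketch that assumes both reported figures were measured at the same confidence threshold, which the card does not state.

```python
# Self-reported precision and recall from the evaluation results.
precision = 0.966
recall = 0.970

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.968
```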