Commit History

Author | SHA1 | Message | Date
myhloli | 1ed61cb5d6 | refactor: update OCR span extraction logic and improve PDF page processing | 5 months ago
myhloli | 51393aa814 | refactor: update union_make import and adjust middle JSON structure for consistency | 5 months ago
myhloli | a9abb4e201 | refactor: enhance OCR processing and paragraph splitting in document analysis pipeline | 5 months ago
myhloli | 0f21495a06 | refactor: enhance block processing and sorting utilities for improved span management | 5 months ago
myhloli | ae7b0a6eba | refactor: implement block preprocessing utilities for improved bounding box management | 5 months ago
myhloli | 8f1f9abec5 | refactor: enhance bounding box utilities and add configuration reader for S3 integration | 5 months ago
myhloli | ea5cb65a1f | refactor: enhance document parsing by supporting multiple PDF files and improving method organization | 5 months ago