myhloli 309be741e8 refactor(txt_parse): improve text extraction accuracy with new algorithm 1 year ago
..
__init__.py d5dbed7325 directory restructuring 1 year ago
boxbase.py 1279f2cd0f feat(model): add support for DocLayout-YOLO model 1 year ago
calc_span_stats.py d5dbed7325 directory restructuring 1 year ago
clean_memory.py 4c9bf8abd5 refactor(memory management): remove unused clean_memory function 1 year ago
commons.py 1de37e4c65 add version_name to middle json 1 year ago
config_reader.py b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
convert_utils.py 709a65008a adjust intermediate-state dict structure 1 year ago
coordinate_transform.py 7b0db8a4b3 write scale-factor-fixed bbox into model_list 1 year ago
detect_language_from_model.py e492b3dce8 move language detection logic into the parse pipeline 1 year ago
draw_bbox.py b492c19c4c refactor: move some constants or enums defs to config folder 1 year ago
hash_utils.py 00f16239c6 implement parse_ocr_pdf api; image cropping uses flat paths on s3 and hierarchical paths locally; remove preset s3_image_save_path 1 year ago
json_compressor.py d5dbed7325 directory restructuring 1 year ago
language.py 57380cbed5 feat(language): add FT LANG cache directory setup 1 year ago
local_math.py 12bec17eed refactor(magic_pdf): replace math module with local_math 1 year ago
markdown_utils.py 59b0b0c3da escape special characters when making markdown 1 year ago
nlp_utils.py d5dbed7325 directory restructuring 1 year ago
path_utils.py 6c656af65f update:cleanup requirements.txt 1 year ago
pdf_check.py 8998380da5 update check invalid_chars algorithm to improve accuracy 1 year ago
pdf_image_tools.py 309be741e8 refactor(txt_parse): improve text extraction accuracy with new algorithm 1 year ago
safe_filename.py d5dbed7325 directory restructuring 1 year ago
textbase.py d5dbed7325 directory restructuring 1 year ago
version.py 149132d608 feat(pdf_parse): improve span filtering and add new block types 1 year ago
vis_utils.py d5dbed7325 directory restructuring 1 year ago