myhloli f2a3a49541 fix(pdf_extract_kit):change unimernet base -> small 1 year ago
..
dict2md 98313d4a25 Merge branch 'dev' into content-list-not-drop 1 year ago
filter df14c61f6f update: Enhance the capability to detect garbled document issues 1 year ago
integrations b72d4ebd94 Feat/support rag (#510) 1 year ago
layout 03469909bb Feat/support footnote in figure (#532) 1 year ago
libs 37fbe998ac feat(ocr_mkcontent): support drop reason in none_with_reason mode. Enable the `NONE_WITH_REASON` drop mode in `para_to_standard_format_v2` by updating the … 1 year ago
model f2a3a49541 fix(pdf_extract_kit):change unimernet base -> small 1 year ago
para 58a003177c fix: resolve inaccuracy of drawing layout box caused by paragraphs combination #384 (#574) 1 year ago
pipe 23b621e05a feat(UNIPipe): change default drop_mode to NONE_WITH_REASON 1 year ago
post_proc 1b9d65b3d3 1. Add a leading underscore to the keys of the Trace class 1 year ago
pre_proc 03469909bb Feat/support footnote in figure (#532) 1 year ago
resources f2a3a49541 fix(pdf_extract_kit):change unimernet base -> small 1 year ago
rw 40e0827e60 Feat/impl cli (#264) 1 year ago
spark c9af3457f5 delete useless files 1 year ago
tools a4c72e2e33 fix: solve conflicts 1 year ago
__init__.py d5dbed7325 Restructure directories 1 year ago
pdf_parse_by_ocr.py 959b8d82d8 renamed pipeline file name 1 year ago
pdf_parse_by_txt.py 959b8d82d8 renamed pipeline file name 1 year ago
pdf_parse_union_core.py 068fab7f81 fix(end_page_id):Fix the issue where end_page_id is corrected to len-1 when its input is 0. (#518) 1 year ago
user_api.py 6062862c96 feat(pipeline): pass language parameter for parsing and markdown conversion 1 year ago