Zhang Zelun 6 months ago
commit cdb43d0a58

+ 6 - 0
docs/module_usage/tutorials/vlm_modules/chart_parsing.en.md

@@ -16,16 +16,22 @@ Multimodal chart parsing is a cutting-edge technology in the OCR field, focusing
 <th>Model</th><th>Model Download Link</th>
 <th>Model parameter size(B)</th>
 <th>Model Storage Size (GB)</th>
+<th>Model Score</th>
 <th>Description</th>
 </tr>
 <tr>
 <td>PP-Chart2Table</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-Chart2Table_infer.tar">Inference Model</a></td>
 <td>0.58</td>
 <td>1.4</td>
+<td>75.98</td>
 <td>PP-Chart2Table is a self-developed multimodal model by the PaddlePaddle team, focusing on chart parsing, demonstrating outstanding performance in both Chinese and English chart parsing tasks. The team adopted a carefully designed data generation strategy, constructing a high-quality multimodal dataset of nearly 700,000 entries covering common chart types like pie charts, bar charts, stacked area charts, and various application scenarios. They also designed a two-stage training method, utilizing large model distillation to fully leverage massive unlabeled OOD data. In internal business tests in both Chinese and English scenarios, PP-Chart2Table not only achieved the SOTA level among models of the same parameter scale but also reached accuracy comparable to 7B parameter scale VLM models in critical scenarios.</td>
 </tr>
 </table>

+<b>Note: The above model score is the result of testing on an internal evaluation set of 1,801 samples, covering various chart types (bar charts, line charts, pie charts, etc.) across scenarios such as financial reports, laws and regulations, and contracts. There are currently no plans to make this dataset publicly available.</b>
+
+
+
 ## III. Quick Integration
 > ❗ Before quick integration, please install the PaddleX wheel package. For details, please refer to [PaddleX Local Installation Tutorial](../../../installation/installation.md)
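
For orientation while reviewing this change, here is a minimal usage sketch of the chart-parsing module documented above, assuming the PaddleX `create_model` API from the quick-integration tutorial; the demo image URL is taken from the new PP-Chart2Table.yaml config added later in this commit, and the dict-style input and result methods should be verified against that tutorial.

```python
from paddlex import create_model

# Minimal sketch (unofficial): convert a chart image into a table with PP-Chart2Table.
model = create_model("PP-Chart2Table")
output = model.predict(
    {"image": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png"},
    batch_size=1,
)
for res in output:
    res.print()                    # parsed table as text
    res.save_to_json("./output/")  # persist the structured result
```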
 
 

+ 5 - 0
docs/module_usage/tutorials/vlm_modules/chart_parsing.md

@@ -17,16 +17,21 @@ comments: true
 <th>模型</th><th>模型下载链接</th>
 <th>模型参数规模(B)</th>
 <th>模型存储大小(GB)</th>
+<th>模型分数</th>
 <th>介绍</th>
 </tr>
 <tr>
 <td>PP-Chart2Table</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-Chart2Table_infer.tar">推理模型</a></td>
 <td>0.58</td>
 <td>1.4</td>
+<td>75.98</td>
 <td>PP-Chart2Table是飞桨团队自研的一款专注于图表解析的多模态模型,在中英文图表解析任务中展现出卓越性能。团队采用精心设计的数据生成策略,构建了近70万条高质量的图表解析多模态数据集,全面覆盖饼图、柱状图、堆叠面积图等常见图表类型及各类应用场景。同时设计了二阶段训练方法,结合大模型蒸馏实现对海量无标注OOD数据的充分利用。在内部业务的中英文场景测试中,PP-Chart2Table不仅达到同参数量级模型中的SOTA水平,更在关键场景中实现了与7B参数量级VLM模型相当的精度。</td>
 </tr>
 </table>

+<b>注:以上模型分数为内部评估集模型测试结果,共1801条数据,包括了各个场景(财报、法律法规、合同等)下的各种图表类型(柱状图、折线图、饼图等)的测试样本,暂时未有计划公开。</b>
+
+
 
 
 ## 三、快速集成
 > ❗ 在快速集成前,请先安装 PaddleX 的 wheel 包,详细请参考 [PaddleX本地安装教程](../../../installation/installation.md)

+ 7 - 1
docs/module_usage/tutorials/vlm_modules/doc_vlm.en.md

@@ -10,24 +10,30 @@ The document visual-language model is a cutting-edge multimodal processing techn
 <tr>
 <th>Model</th><th>Download Link</th>
 <th>Storage Size (GB)</th>
+<th>Model Score</th>
 <th>Description</th>
 </tr>
 <tr>
 <td>PP-DocBee-2B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-2B_infer.tar">Inference Model</a></td>
 <td>4.2</td>
+<td>765</td>
 <td rowspan="2">PP-DocBee is a multimodal large model developed by the PaddlePaddle team, focused on document understanding with excellent performance on Chinese document understanding tasks. The model is fine-tuned and optimized using nearly 5 million multimodal datasets for document understanding, including general VQA, OCR, table, text-rich, math and complex reasoning, synthetic, and pure text data, with different training data ratios. On several authoritative English document understanding evaluation leaderboards in academia, PP-DocBee has generally achieved SOTA at the same parameter level. In internal business Chinese scenarios, PP-DocBee also exceeds current popular open-source and closed-source models.</td>
 <td rowspan="2">PP-DocBee is a multimodal large model developed by the PaddlePaddle team, focused on document understanding with excellent performance on Chinese document understanding tasks. The model is fine-tuned and optimized using nearly 5 million multimodal datasets for document understanding, including general VQA, OCR, table, text-rich, math and complex reasoning, synthetic, and pure text data, with different training data ratios. On several authoritative English document understanding evaluation leaderboards in academia, PP-DocBee has generally achieved SOTA at the same parameter level. In internal business Chinese scenarios, PP-DocBee also exceeds current popular open-source and closed-source models.</td>
 </tr>
 </tr>
 <tr>
 <tr>
 <td>PP-DocBee-7B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-7B_infer.tar">Inference Model</a></td>
 <td>PP-DocBee-7B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-7B_infer.tar">Inference Model</a></td>
 <td>15.8</td>
 <td>15.8</td>
+<td>-</td>
 </tr>
 <tr>
-<td>PP-DocBee2-3B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee2-3B_infer.tar">推理模型</a></td>
+<td>PP-DocBee2-3B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee2-3B_infer.tar">Inference Model</a></td>
 <td>7.6</td>
+<td>852</td>
 <td>PP-DocBee2 is a multimodal large model independently developed by the PaddlePaddle team, specifically tailored for document understanding. Building upon PP-DocBee, the team has further optimized the foundational model and introduced a new data optimization scheme to enhance data quality. With just a relatively small dataset of 470,000 samples generated using the team's proprietary data synthesis strategy, PP-DocBee2 demonstrates superior performance in Chinese document understanding tasks. In terms of internal business metrics for Chinese-language scenarios, PP-DocBee2 has achieved an approximately 11.4% improvement over PP-DocBee, outperforming both current popular open-source and closed-source models of a similar scale.</td>
 </tr>
 </table>

+<b>Note: The scores above come from an internal evaluation set of 1,196 entries, in which every image has a resolution (height, width) of (1680, 1204). The entries cover scenarios such as financial reports, laws and regulations, science and engineering papers, instruction manuals, humanities papers, contracts, and research reports. There are currently no plans to make this dataset publicly available.</b>
+
 ## 3. Quick Integration
 > ❗ Before quick integration, please install the PaddleX wheel package. For details, refer to [PaddleX Local Installation Guide](../../../installation/installation.md).
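
As a companion to the table above, a minimal sketch of invoking the PP-DocBee2-3B module, assuming the PaddleX `create_model` API from the quick-integration tutorial; the image URL and Chinese query are the demo values from the new PP-DocBee2-3B.yaml config added later in this commit.

```python
from paddlex import create_model

# Minimal sketch (unofficial): ask PP-DocBee2-3B a question about a document image.
model = create_model("PP-DocBee2-3B")
output = model.predict(
    {
        "image": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png",
        "query": "识别这份表格的内容",  # "Recognize the contents of this table"
    },
    batch_size=1,
)
for res in output:
    res.print()                    # model answer
    res.save_to_json("./output/")  # persist the structured result
```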
 
 

+ 6 - 0
docs/module_usage/tutorials/vlm_modules/doc_vlm.md

@@ -14,24 +14,30 @@ comments: true
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>模型存储大小(GB)</th>
+<th>模型总分</th>
 <th>介绍</th>
 </tr>
 <tr>
 <td>PP-DocBee-2B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-2B_infer.tar">推理模型</a></td>
 <td>4.2</td>
+<td>765</td>
 <td rowspan="2">PP-DocBee 是飞桨团队自研的一款专注于文档理解的多模态大模型,在中文文档理解任务上具有卓越表现。该模型通过近 500 万条文档理解类多模态数据集进行微调优化,各种数据集包括了通用VQA类、OCR类、图表类、text-rich文档类、数学和复杂推理类、合成数据类、纯文本数据等,并设置了不同训练数据配比。在学术界权威的几个英文文档理解评测榜单上,PP-DocBee基本都达到了同参数量级别模型的SOTA。在内部业务中文场景类的指标上,PP-DocBee也高于目前的热门开源和闭源模型。</td>
 <td rowspan="2">PP-DocBee 是飞桨团队自研的一款专注于文档理解的多模态大模型,在中文文档理解任务上具有卓越表现。该模型通过近 500 万条文档理解类多模态数据集进行微调优化,各种数据集包括了通用VQA类、OCR类、图表类、text-rich文档类、数学和复杂推理类、合成数据类、纯文本数据等,并设置了不同训练数据配比。在学术界权威的几个英文文档理解评测榜单上,PP-DocBee基本都达到了同参数量级别模型的SOTA。在内部业务中文场景类的指标上,PP-DocBee也高于目前的热门开源和闭源模型。</td>
 </tr>
 </tr>
 <tr>
 <tr>
 <td>PP-DocBee-7B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-7B_infer.tar">推理模型</a></td>
 <td>PP-DocBee-7B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-7B_infer.tar">推理模型</a></td>
 <td>15.8</td>
 <td>15.8</td>
+<td>-</td>
 </tr>
 <tr>
 <td>PP-DocBee2-3B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee2-3B_infer.tar">推理模型</a></td>
 <td>7.6</td>
+<td>852</td>
 <td>PP-DocBee2 是飞桨团队自研的一款专注于文档理解的多模态大模型,在PP-DocBee的基础上进一步优化了基础模型,并引入了新的数据优化方案,提高了数据质量,使用自研数据合成策略生成的少量的47万数据便使得PP-DocBee2在中文文档理解任务上表现更佳。在内部业务中文场景类的指标上,PP-DocBee2相较于PP-DocBee提升了约11.4%,同时也高于目前的同规模热门开源和闭源模型。</td>
 </tr>
 </table>

+<b>注:以上模型总分为内部评估集模型测试结果,内部评估集所有图像分辨率 (height, width) 为 (1680,1204),共1196条数据,包括了财报、法律法规、理工科论文、说明书、文科论文、合同、研报等场景,暂时未有计划公开。</b>
+
 
 
 ## 三、快速集成
 > ❗ 在快速集成前,请先安装 PaddleX 的 wheel 包,详细请参考 [PaddleX本地安装教程](../../../installation/installation.md)

+ 11 - 4
docs/pipeline_usage/tutorials/vlm_pipelines/doc_understanding.en.md

@@ -14,26 +14,33 @@ The Document Understanding Pipeline is an advanced document processing technolog
 
 
 <table>
 <tr>
-<th>Model</th><th>Model Download Link</th>
-<th>Model Storage Size (GB)</th>
+<th>Model</th><th>Download Link</th>
+<th>Storage Size (GB)</th>
+<th>Model Score</th>
 <th>Description</th>
 </tr>
 <tr>
 <td>PP-DocBee-2B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-2B_infer.tar">Inference Model</a></td>
 <td>4.2</td>
-<td rowspan="2">PP-DocBee is a self-developed multimodal large model by the PaddlePaddle team, focusing on document understanding with excellent performance on Chinese document understanding tasks. The model is fine-tuned with nearly 5 million multimodal datasets for document understanding, including general VQA, OCR, chart, text-rich documents, mathematics and complex reasoning, synthetic data, and pure text data, with different training data ratios. On several authoritative English document understanding evaluation benchmarks in academia, PP-DocBee has achieved SOTA for models of the same parameter scale. In internal business Chinese scenarios, PP-DocBee also outperforms current popular open and closed-source models.</td>
+<td>765</td>
+<td rowspan="2">PP-DocBee is a multimodal large model developed by the PaddlePaddle team, focused on document understanding with excellent performance on Chinese document understanding tasks. The model is fine-tuned and optimized using nearly 5 million multimodal datasets for document understanding, including general VQA, OCR, table, text-rich, math and complex reasoning, synthetic, and pure text data, with different training data ratios. On several authoritative English document understanding evaluation leaderboards in academia, PP-DocBee has generally achieved SOTA at the same parameter level. In internal business Chinese scenarios, PP-DocBee also exceeds current popular open-source and closed-source models.</td>
 </tr>
 <tr>
 <td>PP-DocBee-7B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-7B_infer.tar">Inference Model</a></td>
 <td>15.8</td>
+<td>-</td>
 </tr>
 <tr>
 <td>PP-DocBee2-3B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee2-3B_infer.tar">Inference Model</a></td>
 <td>7.6</td>
-<td>PP-DocBee2 is a multimodal large model developed by the PaddlePaddle team, specifically designed for document understanding. Building upon PP-DocBee, the team has further optimized the foundational model and introduced a new data optimization scheme to enhance data quality. With just a relatively small dataset of 470,000 samples generated using the team's proprietary data synthesis strategy, PP-DocBee2 demonstrates superior performance in Chinese document understanding tasks. In terms of internal business metrics for Chinese language scenarios, PP-DocBee2 has achieved an approximately 11.4% improvement over PP-DocBee, and it also outperforms current popular open-source and closed-source models of a similar scale.</td>
+<td>852</td>
+<td>PP-DocBee2 is a multimodal large model independently developed by the PaddlePaddle team, specifically tailored for document understanding. Building upon PP-DocBee, the team has further optimized the foundational model and introduced a new data optimization scheme to enhance data quality. With just a relatively small dataset of 470,000 samples generated using the team's proprietary data synthesis strategy, PP-DocBee2 demonstrates superior performance in Chinese document understanding tasks. In terms of internal business metrics for Chinese-language scenarios, PP-DocBee2 has achieved an approximately 11.4% improvement over PP-DocBee, outperforming both current popular open-source and closed-source models of a similar scale.</td>
 </tr>
 </table>

+<b>Note: The scores above come from an internal evaluation set of 1,196 entries, in which every image has a resolution (height, width) of (1680, 1204). The entries cover scenarios such as financial reports, laws and regulations, science and engineering papers, instruction manuals, humanities papers, contracts, and research reports. There are currently no plans to make this dataset publicly available.</b>
+
+
 ## 2. Quick Start

 ### 2.1 Local Experience
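
To illustrate what the local experience of this pipeline looks like, a minimal sketch assuming the pipeline is registered as `doc_understanding` and follows the PaddleX `create_pipeline` API; the demo image and query reuse the doc_vlm values from this commit, and the exact parameters should be checked against the pipeline tutorial.

```python
from paddlex import create_pipeline

# Minimal sketch (unofficial): run the document understanding pipeline once.
pipeline = create_pipeline(pipeline="doc_understanding")
output = pipeline.predict(
    {
        "image": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png",
        "query": "识别这份表格的内容",  # "Recognize the contents of this table"
    }
)
for res in output:
    res.print()
    res.save_to_json("./output/")
```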

+ 7 - 0
docs/pipeline_usage/tutorials/vlm_pipelines/doc_understanding.md

@@ -19,24 +19,31 @@ comments: true
 <tr>
 <th>模型</th><th>模型下载链接</th>
 <th>模型存储大小(GB)</th>
+<th>模型总分</th>
 <th>介绍</th>
 </tr>
 <tr>
 <td>PP-DocBee-2B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-2B_infer.tar">推理模型</a></td>
 <td>4.2</td>
+<td>765</td>
 <td rowspan="2">PP-DocBee 是飞桨团队自研的一款专注于文档理解的多模态大模型,在中文文档理解任务上具有卓越表现。该模型通过近 500 万条文档理解类多模态数据集进行微调优化,各种数据集包括了通用VQA类、OCR类、图表类、text-rich文档类、数学和复杂推理类、合成数据类、纯文本数据等,并设置了不同训练数据配比。在学术界权威的几个英文文档理解评测榜单上,PP-DocBee基本都达到了同参数量级别模型的SOTA。在内部业务中文场景类的指标上,PP-DocBee也高于目前的热门开源和闭源模型。</td>
 <td rowspan="2">PP-DocBee 是飞桨团队自研的一款专注于文档理解的多模态大模型,在中文文档理解任务上具有卓越表现。该模型通过近 500 万条文档理解类多模态数据集进行微调优化,各种数据集包括了通用VQA类、OCR类、图表类、text-rich文档类、数学和复杂推理类、合成数据类、纯文本数据等,并设置了不同训练数据配比。在学术界权威的几个英文文档理解评测榜单上,PP-DocBee基本都达到了同参数量级别模型的SOTA。在内部业务中文场景类的指标上,PP-DocBee也高于目前的热门开源和闭源模型。</td>
 </tr>
 </tr>
 <tr>
 <tr>
 <td>PP-DocBee-7B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-7B_infer.tar">推理模型</a></td>
 <td>PP-DocBee-7B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee-7B_infer.tar">推理模型</a></td>
 <td>15.8</td>
 <td>15.8</td>
+<td>-</td>
 </tr>
 <tr>
 <td>PP-DocBee2-3B</td><td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0.0/PP-DocBee2-3B_infer.tar">推理模型</a></td>
 <td>7.6</td>
+<td>852</td>
 <td>PP-DocBee2 是飞桨团队自研的一款专注于文档理解的多模态大模型,在PP-DocBee的基础上进一步优化了基础模型,并引入了新的数据优化方案,提高了数据质量,使用自研数据合成策略生成的少量的47万数据便使得PP-DocBee2在中文文档理解任务上表现更佳。在内部业务中文场景类的指标上,PP-DocBee2相较于PP-DocBee提升了约11.4%,同时也高于目前的同规模热门开源和闭源模型。</td>
 </tr>
 </table>

+<b>注:以上模型总分为内部评估集模型测试结果,内部评估集所有图像分辨率 (height, width) 为 (1680,1204),共1196条数据,包括了财报、法律法规、理工科论文、说明书、文科论文、合同、研报等场景,暂时未有计划公开。</b>
+
+
 ## 2. 快速开始

 ### 2.1 本地体验

+ 13 - 0
paddlex/configs/modules/chart_parsing/PP-Chart2Table.yaml

@@ -0,0 +1,13 @@
+Global:
+  model: PP-Chart2Table
+  mode: predict # only support predict
+  device: gpu:0
+  output: "output"
+
+Predict:
+  batch_size: 1
+  model_dir: "/path/to/PP-Chart2Table"
+  input:
+    image: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png"
+  kernel_option:
+    run_mode: paddle

+ 14 - 0
paddlex/configs/modules/doc_vlm/PP-DocBee2-3B.yaml

@@ -0,0 +1,14 @@
+Global:
+  model: PP-DocBee2-3B
+  mode: predict # only support predict
+  device: gpu:0
+  output: "output"
+
+Predict:
+  batch_size: 1
+  model_dir: "/path/to/PP-DocBee2-3B"
+  input:
+    image: "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png"
+    query: "识别这份表格的内容"
+  kernel_option:
+    run_mode: paddle
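
The two new YAML files above are module-level predict configs. As a hedged illustration (not an official PaddleX entry point), the sketch below shows how the Global.model and Predict.input fields of such a config map onto the `create_model` API used in the tutorials; it assumes the repository root as the working directory and PyYAML installed, and omits the placeholder Predict.model_dir so the published weights are fetched instead.

```python
import yaml  # PyYAML
from paddlex import create_model

# Load one of the new configs and drive the module API from its fields (sketch only).
with open("paddlex/configs/modules/doc_vlm/PP-DocBee2-3B.yaml") as f:
    cfg = yaml.safe_load(f)

model = create_model(cfg["Global"]["model"])     # "PP-DocBee2-3B"
output = model.predict(
    cfg["Predict"]["input"],                     # {"image": ..., "query": ...}
    batch_size=cfg["Predict"]["batch_size"],     # 1
)
for res in output:
    res.print()
```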