@@ -37,7 +37,21 @@ Table structure recognition is a crucial component in table recognition systems,
SLANet_plus is an enhanced version of SLANet, a table structure recognition model developed by Baidu PaddlePaddle's Vision Team. Compared to SLANet, SLANet_plus significantly improves its recognition capabilities for wireless and complex tables, while reducing the model's sensitivity to the accuracy of table localization. Even when there are offsets in table localization, it can still perform relatively accurate recognition.
</td>
</tr>
+<tr>
+<td>SLANeXt_wired</td>
+<td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/SLANeXt_wired_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/SLANeXt_wired_pretrained.pdparams">Trained Model</a></td>
+<td rowspan="2">69.65</td>
+<td rowspan="2"></td>
+<td rowspan="2"></td>
+<td rowspan="2">351M</td>
+<td rowspan="2">The SLANeXt series is a new generation of table structure recognition models developed by Baidu PaddlePaddle's Vision Team. Unlike SLANet and SLANet_plus, SLANeXt focuses on table structure recognition, with dedicated weights trained separately for wired and wireless tables. This significantly improves recognition across all table types, with especially large gains on wired tables.</td>
+</tr>
+<tr>
+<td>SLANeXt_wireless</td>
+<td><a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/SLANeXt_wireless_infer.tar">Inference Model</a>/<a href="https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/SLANeXt_wireless_pretrained.pdparams">Trained Model</a></td>
+</tr>
</table>
+
<b>Note: The above accuracy metrics are evaluated on a self-built English table recognition dataset by PaddleX. All GPU inference times are based on an NVIDIA Tesla T4 machine with FP32 precision. CPU inference speeds are based on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz with 8 threads and FP32 precision.</b>
@@ -57,13 +71,14 @@ for res in output:
<details><summary>👉 <b>After running, the result is: (Click to expand)</b></summary>
-```json
-{'res': {'input_path': 'table_recognition.jpg', 'bbox': [array([ 42, 2, 390, 2, 388, 27, 40, 26]), array([11, 35, 89, 35, 87, 63, 11, 63]), array([113, 34, 192, 34, 186, 64, 109, 64]), array([219, 33, 399, 33, 393, 62, 212, 62]), array([413, 33, 544, 33, 544, 64, 407, 64]), array([12, 67, 98, 68, 96, 93, 12, 93]), array([115, 66, 205, 66, 200, 91, 111, 91]), array([234, 65, 390, 65, 385, 92, 227, 92]), array([414, 66, 537, 67, 537, 95, 409, 95]), array([ 7, 97, 106, 97, 104, 128, 7, 128]), array([113, 96, 206, 95, 201, 127, 109, 127]), array([236, 96, 386, 96, 381, 128, 230, 128]), array([413, 96, 534, 95, 533, 127, 408, 127])], 'structure': ['<html>', '<body>', '<table>', '<tr>', '<td', '="" ',="" colspan="4">', '', '</td',></tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</table>', '</body>', '</html>'], 'structure_score': 0.99948007}}
+```
+{'res': {'input_path': 'table_recognition.jpg', 'page_index': null, 'bbox': [array([ 42, 2, 390, 2, 388, 27, 40, 26]), array([11, 35, 89, 35, 87, 63, 11, 63]), array([113, 34, 192, 34, 186, 64, 109, 64]), array([219, 33, 399, 33, 393, 62, 212, 62]), array([413, 33, 544, 33, 544, 64, 407, 64]), array([12, 67, 98, 68, 96, 93, 12, 93]), array([115, 66, 205, 66, 200, 91, 111, 91]), array([234, 65, 390, 65, 385, 92, 227, 92]), array([414, 66, 537, 67, 537, 95, 409, 95]), array([ 7, 97, 106, 97, 104, 128, 7, 128]), array([113, 96, 206, 95, 201, 127, 109, 127]), array([236, 96, 386, 96, 381, 128, 230, 128]), array([413, 96, 534, 95, 533, 127, 408, 127])], 'structure': ['<html>', '<body>', '<table>', '<tr>', '<td', ' colspan="4"', '>', '</td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</table>', '</body>', '</html>'], 'structure_score': 0.99948007}}
```
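As a minimal post-processing sketch for the result shown above (plain Python plus NumPy, no PaddleX dependency): the `structure` tokens are emitted in document order, so joining them rebuilds the predicted table as an HTML string, and each `bbox` quadrilateral can be reduced to an axis-aligned box by taking coordinate extremes. The token list and quadrilateral below are abridged from the example output.

```python
import numpy as np

# 1) Joining the `structure` tokens in order rebuilds the table skeleton.
#    (Abridged: only the first row and closing tags are kept for brevity.)
structure = [
    '<html>', '<body>', '<table>', '<tr>',
    '<td', ' colspan="4"', '>', '</td>', '</tr>',
    '</table>', '</body>', '</html>',
]
html = ''.join(structure)

# 2) Each `bbox` entry is a quadrilateral (x1, y1, x2, y2, x3, y3, x4, y4);
#    its axis-aligned bounding box is the min/max over x and y.
quad = np.array([42, 2, 390, 2, 388, 27, 40, 26])  # first cell from the output
xs, ys = quad[0::2], quad[1::2]
rect = [xs.min(), ys.min(), xs.max(), ys.max()]  # [40, 2, 390, 27]

print(html)
print(rect)
```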
Parameter meanings are as follows:
<ul>
<li><code>input_path</code>: The path of the input image to be predicted</li>
+<li><code>page_index</code>: If the input is a PDF file, this indicates the current page number of the PDF; otherwise, it is <code>null</code></li>
<li><code>bbox</code>: Predicted table cell information, a list composed of several predicted table cell coordinates. Note that the table cell predictions from the SLANeXt series models are invalid</li>
<li><code>structure</code>: Predicted table structure in HTML expressions, a list composed of several predicted HTML keywords in order</li>
<li><code>structure_score</code>: Confidence score of the predicted table structure</li>
@@ -116,7 +131,7 @@ Relevant methods, parameters, and explanations are as follows:
<tr>
<td><code>input</code></td>
<td>Data to be predicted, supporting multiple input types</td>
-<td><code>Python Var</code>/<code>str</code>/<code>dict</code>/<code>list</code></td>
+<td><code>Python Var</code>/<code>str</code>/<code>list</code></td>
<td>
<ul>
<li><b>Python Variable</b>, such as image data represented by <code>numpy.ndarray</code></li>
@@ -192,7 +207,7 @@ Relevant methods, parameters, and explanations are as follows:
</tr>
</table>
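The accepted input forms (a Python variable such as a <code>numpy.ndarray</code>, a <code>str</code> path, or a <code>list</code> of either) can be illustrated with a small PaddleX-independent sketch; <code>normalize_inputs</code> below is a hypothetical helper used only for illustration, not part of the PaddleX API:

```python
import numpy as np

# Hypothetical helper (not a PaddleX API): normalize the three documented
# input forms into a flat list, mirroring how a predictor might dispatch
# on input type before batching.
def normalize_inputs(data):
    if isinstance(data, (str, np.ndarray)):
        return [data]          # single image path/URL or in-memory array
    if isinstance(data, list):
        return list(data)      # already a batch of paths/arrays
    raise TypeError(f"Unsupported input type: {type(data).__name__}")

print(len(normalize_inputs("table_recognition.jpg")))  # single path -> 1
print(len(normalize_inputs(["a.jpg", "b.jpg"])))       # list -> 2
```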
-* In addition, it also supports obtaining a visualization image with results through attributes, as follows:
+* In addition, results can also be obtained through attributes, as follows:
<table>
<thead>
@@ -241,29 +256,19 @@ After executing the above command, PaddleX will validate the dataset and summari
"attributes": {
"train_samples": 2000,
"train_sample_paths": [
- "../dataset/table_rec_dataset_examples/images/border_right_7384_X9UFEPKVMLALY7DDB11A.jpg",
- "../dataset/table_rec_dataset_examples/images/border_top_13708_VE2DGBD4DCQU2ITLBTEA.jpg",
- "../dataset/table_rec_dataset_examples/images/border_top_6490_14Z6ZN6G52GG4XA0K4XU.jpg",
- "../dataset/table_rec_dataset_examples/images/border_top_14236_DG96EX0EDKIIDK8P6ENG.jpg",
- "../dataset/table_rec_dataset_examples/images/border_19648_SV8B7X34RTYRAT2T5CPI.jpg",
- "../dataset/table_rec_dataset_examples/images/border_bottom_7186_HODBC25HISMCSVKY0HJ9.jpg",
- "../dataset/table_rec_dataset_examples/images/head_border_bottom_5773_4K4H9OVK9X9YVHE4Y1BQ.jpg",
- "../dataset/table_rec_dataset_examples/images/border_7760_8C62CCH5T57QUGE0NTHZ.jpg",
- "../dataset/table_rec_dataset_examples/images/border_bottom_15707_B1YVOU3X4NHHB6TL269O.jpg",
- "../dataset/table_rec_dataset_examples/images/no_border_5223_HLG406UK35UD5EUYC2AV.jpg"
+ "check_dataset/demo_img/border_right_7384_X9UFEPKVMLALY7DDB11A.jpg",
+ "check_dataset/demo_img/border_top_13708_VE2DGBD4DCQU2ITLBTEA.jpg",
+ "check_dataset/demo_img/border_top_6490_14Z6ZN6G52GG4XA0K4XU.jpg",
+ "check_dataset/demo_img/border_top_14236_DG96EX0EDKIIDK8P6ENG.jpg",
+ "check_dataset/demo_img/border_19648_SV8B7X34RTYRAT2T5CPI.jpg"
],
"val_samples": 100,
"val_sample_paths": [
- "../dataset/table_rec_dataset_examples/images/border_2945_L7MSRHBZRW6Y347G39O6.jpg",
- "../dataset/table_rec_dataset_examples/images/head_border_bottom_4825_LH9WI6X104CP3VFXPSON.jpg",
- "../dataset/table_rec_dataset_examples/images/head_border_bottom_16837_79KHWU9WDM9ZQHNBGQAL.jpg",
- "../dataset/table_rec_dataset_examples/images/border_bottom_10107_9ENLLC29SQ6XI8WZY53E.jpg",
- "../dataset/table_rec_dataset_examples/images/border_top_16668_JIS0YFDZKTKETZIEKCKX.jpg",
- "../dataset/table_rec_dataset_examples/images/border_18653_J9SSKHLFTRJD4J8W17OW.jpg",
- "../dataset/table_rec_dataset_examples/images/border_bottom_8396_VJ3QJ3I0DP63P4JR77FE.jpg",
- "../dataset/table_rec_dataset_examples/images/border_9017_K2V7QBWSU2BA4R3AJSO7.jpg",
- "../dataset/table_rec_dataset_examples/images/border_top_19494_SDFMWP92NOB2OT7109FI.jpg",
- "../dataset/table_rec_dataset_examples/images/no_border_288_6LK683JUCMOQ38V5BV29.jpg"
+ "check_dataset/demo_img/border_18653_J9SSKHLFTRJD4J8W17OW.jpg",
+ "check_dataset/demo_img/border_bottom_8396_VJ3QJ3I0DP63P4JR77FE.jpg",
+ "check_dataset/demo_img/border_9017_K2V7QBWSU2BA4R3AJSO7.jpg",
+ "check_dataset/demo_img/border_top_19494_SDFMWP92NOB2OT7109FI.jpg",
+ "check_dataset/demo_img/no_border_288_6LK683JUCMOQ38V5BV29.jpg"
]
},
"analysis": {},