@@ -26,40 +26,44 @@ python main.py \

Once Benchmark is enabled, the benchmark metrics are printed automatically:

```
-+-------------------+-----------------+------+---------------+
-|       Stage       | Total Time (ms) | Nums | Avg Time (ms) |
-+-------------------+-----------------+------+---------------+
-|      ReadCmp      |   49.95107651   |  10  |   4.99510765  |
-|       Resize      |    8.48054886   |  10  |   0.84805489  |
-|     Normalize     |   23.08964729   |  10  |   2.30896473  |
-|     ToCHWImage    |    0.02717972   |  10  |   0.00271797  |
-| ImageDetPredictor |   75.94108582   |  10  |   7.59410858  |
-|   DetPostProcess  |    0.26535988   |  10  |   0.02653599  |
-+-------------------+-----------------+------+---------------+
++----------------+-----------------+------+---------------+
+|     Stage      | Total Time (ms) | Nums | Avg Time (ms) |
++----------------+-----------------+------+---------------+
+|    ReadCmp     |   185.48870087  |  10  |  18.54887009  |
+|     Resize     |   16.95227623   |  30  |   0.56507587  |
+|   Normalize    |   41.12100601   |  30  |   1.37070020  |
+|   ToCHWImage   |    0.05745888   |  30  |   0.00191530  |
+|    Copy2GPU    |   14.58549500   |  10  |   1.45854950  |
+|     Infer      |   100.14462471  |  10  |  10.01446247  |
+|    Copy2CPU    |    9.54508781   |  10  |   0.95450878  |
+| DetPostProcess |    0.56767464   |  30  |   0.01892249  |
++----------------+-----------------+------+---------------+
 +-------------+-----------------+------+---------------+
 |    Stage    | Total Time (ms) | Nums | Avg Time (ms) |
 +-------------+-----------------+------+---------------+
-|  PreProcess |   81.54845238   |  10  |   8.15484524  |
-|  Inference  |   75.94108582   |  10  |   7.59410858  |
-| PostProcess |    0.26535988   |  10  |   0.02653599  |
-|   End2End   |   161.07797623  |  10  |  16.10779762  |
-|    WarmUp   |  5496.41847610  |  5   | 1099.28369522 |
+|  PreProcess |   243.61944199  |  30  |   8.12064807  |
+|  Inference  |   124.27520752  |  30  |   4.14250692  |
+| PostProcess |    0.56767464   |  30  |   0.01892249  |
+|   End2End   |   379.70948219  |  30  |  12.65698274  |
+|    WarmUp   |  9465.68179131  |  5   | 1893.13635826 |
 +-------------+-----------------+------+---------------+
```
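
-In the Benchmark results, the total time (`Total Time`, in milliseconds), number of calls (`Nums`), and average time per call (`Avg Time`, in milliseconds) are reported for every component (`Component`) of the model, together with timing statistics grouped into warm-up (`WarmUp`), preprocessing (`PreProcess`), model inference (`Inference`), postprocessing (`PostProcess`), and end-to-end (`End2End`), including each stage's total time (`Total Time`, in milliseconds), number of samples (`Nums`), and average time per sample (`Avg Time`, in milliseconds). The metrics are also saved to the local file `./benchmark.csv`:
+In the Benchmark results, the total time (`Total Time`, in milliseconds), **number of calls** (`Nums`), and average time **per call** (`Avg Time`, in milliseconds) are reported for every component (`Component`) of the model, together with timing statistics grouped into warm-up (`WarmUp`), preprocessing (`PreProcess`), model inference (`Inference`), postprocessing (`PostProcess`), and end-to-end (`End2End`), including each stage's total time (`Total Time`, in milliseconds), **number of samples** (`Nums`), and average time **per sample** (`Avg Time`, in milliseconds). The metrics are also saved to the local file `./benchmark.csv`:
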
```csv
Stage,Total Time (ms),Nums,Avg Time (ms)
-ReadCmp,0.04995107650756836,10,0.004995107650756836
-Resize,0.008480548858642578,10,0.0008480548858642578
-Normalize,0.02308964729309082,10,0.002308964729309082
-ToCHWImage,2.7179718017578125e-05,10,2.7179718017578126e-06
-ImageDetPredictor,0.07594108581542969,10,0.007594108581542969
-DetPostProcess,0.00026535987854003906,10,2.6535987854003906e-05
-PreProcess,0.08154845237731934,10,0.008154845237731934
-Inference,0.07594108581542969,10,0.007594108581542969
-PostProcess,0.00026535987854003906,10,2.6535987854003906e-05
-End2End,0.16107797622680664,10,0.016107797622680664
-WarmUp,5.496418476104736,5,1.0992836952209473
+ReadCmp,0.18548870086669922,10,0.018548870086669923
+Resize,0.0169522762298584,30,0.0005650758743286133
+Normalize,0.04112100601196289,30,0.001370700200398763
+ToCHWImage,5.745887756347656e-05,30,1.915295918782552e-06
+Copy2GPU,0.014585494995117188,10,0.0014585494995117188
+Infer,0.10014462471008301,10,0.0100144624710083
+Copy2CPU,0.009545087814331055,10,0.0009545087814331055
+DetPostProcess,0.0005676746368408203,30,1.892248789469401e-05
+PreProcess,0.24361944198608398,30,0.0081206480662028
+Inference,0.12427520751953125,30,0.0041425069173177086
+PostProcess,0.0005676746368408203,30,1.892248789469401e-05
+End2End,0.37970948219299316,30,0.012656982739766438
+WarmUp,9.465681791305542,5,1.8931363582611085
```
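
The saved file can be post-processed with a few lines of Python. The sketch below is an illustration only and is not part of the benchmark tool: `load_benchmark`, `check_consistency`, and `STAGE_COMPONENTS` are hypothetical names. Two assumptions are made from the sample numbers alone. First, the CSV values appear to be 1000 times smaller than the printed table (for example `ReadCmp` is `0.1854...` in the CSV and `185.48870087` in the table), so they are treated as seconds and scaled to milliseconds. Second, the `PreProcess` and `Inference` totals are assumed to be the sums of the component rows listed in `STAGE_COMPONENTS`, which holds for this sample but is not stated by the source.

```python
import csv
import io

# Rows copied from the sample benchmark.csv above (a subset of the stages).
SAMPLE_CSV = """\
Stage,Total Time (ms),Nums,Avg Time (ms)
ReadCmp,0.18548870086669922,10,0.018548870086669923
Resize,0.0169522762298584,30,0.0005650758743286133
Normalize,0.04112100601196289,30,0.001370700200398763
ToCHWImage,5.745887756347656e-05,30,1.915295918782552e-06
Copy2GPU,0.014585494995117188,10,0.0014585494995117188
Infer,0.10014462471008301,10,0.0100144624710083
Copy2CPU,0.009545087814331055,10,0.0009545087814331055
PreProcess,0.24361944198608398,30,0.0081206480662028
Inference,0.12427520751953125,30,0.0041425069173177086
"""

# Assumption: grouping of components into stages, inferred from the sample data.
STAGE_COMPONENTS = {
    "PreProcess": ["ReadCmp", "Resize", "Normalize", "ToCHWImage"],
    "Inference": ["Copy2GPU", "Infer", "Copy2CPU"],
}


def load_benchmark(text, scale=1000.0):
    """Parse benchmark CSV text into {stage: (total_ms, nums, avg_ms)}.

    `scale` converts the stored values to milliseconds; the default of 1000
    assumes the file stores seconds, as the sample suggests.
    """
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        total = float(row["Total Time (ms)"]) * scale
        avg = float(row["Avg Time (ms)"]) * scale
        rows[row["Stage"]] = (total, int(row["Nums"]), avg)
    return rows


def check_consistency(rows, tol=1e-6):
    """Return a list of inconsistencies found in the rows (empty if none)."""
    problems = []
    # Avg Time should equal Total Time divided by Nums for every row.
    for stage, (total, nums, avg) in rows.items():
        if abs(total / nums - avg) > tol:
            problems.append(f"{stage}: avg != total / nums")
    # Each stage total should equal the sum of its assumed components.
    for stage, parts in STAGE_COMPONENTS.items():
        part_sum = sum(rows[p][0] for p in parts)
        if abs(part_sum - rows[stage][0]) > tol:
            problems.append(f"{stage}: total != sum of components")
    return problems


rows = load_benchmark(SAMPLE_CSV)
problems = check_consistency(rows)
```

To run the same checks on a real file, pass the contents of `./benchmark.csv` to `load_benchmark` instead of `SAMPLE_CSV`. If the file turns out to store milliseconds already, call it with `scale=1.0`.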