YOLOv5 runs very slowly

yolov5 run log:
Start run graph [10] times…
=============Run the 1 time: 389ms or 389111us
=============Run the 2 time: 388ms or 388554us
=============Run the 3 time: 389ms or 389342us
=============Run the 4 time: 388ms or 388465us
=============Run the 5 time: 388ms or 388381us
=============Run the 6 time: 393ms or 393072us
=============Run the 7 time: 389ms or 389178us
=============Run the 8 time: 389ms or 389352us
=============Run the 9 time: 388ms or 388639us
=============Run the 10 time: 388ms or 388491us
vxProcessGraph execution time:
Total 3893ms or 3893069us
Average 389.30ms or 389306.91us
I [vsi_nn_ConvertTensorToData:502]Create 1178100 data.

inceptionv1 runs very fast; its log follows:
Verify Graph: 242ms or 242508us
Start run graph [3] times…
Run the 1 time: 5ms or 5010us
Run the 2 time: 5ms or 5361us
Run the 3 time: 4ms or 4849us
vxProcessGraph execution time:
Total 15ms or 15289us
Average 5.00ms or 5096.33us
I [vsi_nn_ConvertTensorToData:502]Create 2002 data.
— Top5 —
2: 0.918457
928: 0.009003
1: 0.007011
869: 0.007011
116: 0.004253
I [vsi_nn_ConvertTensorToData:502]Create 2002 data.
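To compare the two logs above on equal terms, the per-run lines can be averaged and converted to runs per second. The following is a minimal sketch, not part of the original tooling: the helper name `summarize` and the regular expression are assumptions based only on the line format shown in the logs. The per-run values in these logs sum to slightly less than the reported totals, so the result can differ a little from the tool's own average. Roughly, 389 ms per run is about 2.6 graph executions per second, while 5 ms per run is about 200.

```python
import re

# Matches lines such as "Run the 1 time: 5ms or 5010us".
# Leading "=" characters, as in the yolov5 log, are simply skipped.
RUN_LINE = re.compile(r"Run the (\d+) time: \d+ms or (\d+)us")

def summarize(log_text):
    """Return (average_ms, runs_per_second) from the per-run microsecond timings."""
    times_us = [int(m.group(2)) for m in RUN_LINE.finditer(log_text)]
    if not times_us:
        raise ValueError("no 'Run the N time' lines found")
    avg_ms = sum(times_us) / len(times_us) / 1000.0
    return avg_ms, 1000.0 / avg_ms

inception_log = """
Run the 1 time: 5ms or 5010us
Run the 2 time: 5ms or 5361us
Run the 3 time: 4ms or 4849us
"""
avg_ms, fps = summarize(inception_log)
print(f"{avg_ms:.2f} ms per run, about {fps:.0f} runs per second")
```

This measures graph execution only; pre-processing and post-processing time are not included in these log lines.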

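I found the problem. After quantizing with many images, a run now takes 90 ms. Looking closer, the three pooling layers alone account for 54 ms, more than half of the total time; the SPP module performs very poorly.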

@shihaozhang

Hi, could you please share how you managed to get yolov5 working?
In the past I tried to convert yolov4 without any success.
Did you notice any improvement in results over yolov3 on the VIM3 NPU after quantization?
Thanks and regards
F

@shihaozhang Which yolov4 model did you use? I have tested yolov4, and some models can run on the VIM3.

@Frank
Hi, some time has passed, but I’m quite sure I tested yolov4_416.cfg.
I remember that the conversion was successful but the detection quality was poor. I’ve seen similar results in some other posts in the forum where there were many unmotivated bounding boxes all around the test images, in particular on the pictures top.
I’m not an expert in convolution layers, so I decided to wait for official support, considering that yolo-v3 was performing quite well.
Now I have noticed some issues with yolo-v3 in crowded scenes compared to the non-quantized darknet results, so I’m interested in improving yolo detections.
Regards
F

@fguerzoni I will release it this week or next week. As for “there were many unmotivated bounding boxes all around the test images, in particular on the pictures top”, fixing this requires modifying the yolov4 code.

@Frank
That’s really great news. Thank you.
I hope that we’ll be able to keep the 8/9 FPS we reached with yolo-v3 along with the great improvements from yolo-v4.

@fguerzoni In fact, the frame rate of the model depends on many factors, and these can be adjusted when you train your own model. The weights parameter and the class parameter in particular are the two that most affect the final frame rate.

@Frank
Thank you, I’m eager to try it as soon as you release it.
Regards
F

@Frank I am running the yolo model with SPP. Do you have any ideas for speeding up the SPP module, especially the max_pool layers?
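One idea that is sometimes used for this kind of SPP block, offered here only as a sketch: with stride 1 and "same" padding, applying a 5x5 max pool twice gives the same result as one 9x9 pool, and three times the same as one 13x13 pool. The kernel sizes 5, 9 and 13 are an assumption (they are not stated in this thread), and `max_pool_same` is a hypothetical pure-Python helper written for illustration. Whether the conversion tool or the NPU actually runs the cascaded form faster has not been confirmed here.

```python
import random

def max_pool_same(grid, k):
    """Stride-1 max pool with an odd k x k window.

    Windows are clipped at the borders, which is equivalent to
    padding the input with -inf before pooling.
    """
    h, w, r = len(grid), len(grid[0]), k // 2
    return [
        [
            max(
                grid[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            )
            for j in range(w)
        ]
        for i in range(h)
    ]

random.seed(0)
feature_map = [[random.random() for _ in range(13)] for _ in range(13)]

# Parallel form: three independent pools with growing windows (assumed sizes).
p5, p9, p13 = (max_pool_same(feature_map, k) for k in (5, 9, 13))

# Cascaded form: the same 5x5 pool applied one, two, and three times.
c5 = max_pool_same(feature_map, 5)
c9 = max_pool_same(c5, 5)
c13 = max_pool_same(c9, 5)

same = (p5 == c5) and (p9 == c9) and (p13 == c13)
print("cascaded 5x5 pools match the 5/9/13 pools:", same)
```

In the cascaded form each output value looks at 25 inputs per stage (75 in total) instead of 25 + 81 + 169 = 275 in the parallel form, which is why it is worth testing on the board.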

@shihaozhang I think there is not much difference between yolov4-spp and yolov4. The original 608x608 yolov4 does run on the board. This is a problem with the tool itself, and it is difficult for us to solve. I think you can try changing it to a 416x416 model. Or train on VOC data to reduce the number of classes, and see what effect that has first.
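As a rough back-of-the-envelope check of the two suggestions above, the sketch below compares input pixel counts and detection-head sizes. It assumes that convolution cost scales roughly with the number of input pixels, which may not hold exactly on the NPU, and that the starting model uses 80 classes (COCO), which is not stated in this thread. Both function names are hypothetical helpers; the `anchors * (classes + 5)` filter count follows the usual darknet YOLO cfg convention.

```python
def relative_pixels(new_size, old_size):
    """Ratio of input pixels between two square input resolutions."""
    return (new_size * new_size) / (old_size * old_size)

def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Output channels of each YOLO detection conv: anchors * (classes + 5)."""
    return anchors_per_scale * (num_classes + 5)

ratio = relative_pixels(416, 608)
print(f"416x416 has {ratio:.0%} of the pixels of 608x608")
print("head filters with 80 classes (assumed COCO):", yolo_head_filters(80))
print("head filters with 20 classes (VOC):", yolo_head_filters(20))
```

Under these assumptions the smaller input roughly halves the per-layer work, while reducing the class count only shrinks the final detection layers, so the input size is likely the larger lever.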

@Frank OK, thank you for your reply.

@shihaozhang If you try something new, you are welcome to discuss it here again.

@Frank OK. Thank you very much for your help; I will keep following your updates.