PPOCR detection model conversion error

Which system do you use? Android, Ubuntu, OOWOW or others?

Ubuntu

Which version of system do you use? Please provide the version of the system here:

Ubuntu 22.04

Please describe your issue below:

I am converting a PPOCR detection model.
I’m trying to convert it for use from C code, based on a sample that converts a model for use in Python.

When I ran detection with the model here, the results were correct.

I also downloaded the multilingual detection model v3 and confirmed that it works fine on Windows.

However, when I converted it from ONNX to ADLA the same way as the other models, the resulting feature map has a mean that converges to zero.
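For reference, this is roughly how the feature-map statistics can be checked on the ONNX side. A minimal sketch, assuming onnxruntime is installed; the model filename, the input name "x", and the 736×736 input size are taken from the conversion settings and may need adjusting:

```python
# Minimal sketch: inspect the detection feature-map statistics of the ONNX model.
# Assumes onnxruntime and numpy are installed; the model path, input name "x",
# and 736x736 input size are assumptions based on this thread.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("mul_ppocr_det.onnx")
dummy = np.random.rand(1, 3, 736, 736).astype(np.float32)
outs = sess.run(None, {"x": dummy})
out = outs[0]

# A healthy probability map should not collapse to zero everywhere.
print("mean:", out.mean(), "min:", out.min(), "max:", out.max())
```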

How can I fix this problem?

Post a console log of your issue below:



Hello @GHdevlog ,

Could you provide your model?

paddle to onnx

[screenshot of the paddle-to-onnx conversion]

onnx to adla

[screenshot of the onnx-to-adla conversion]

I’m using a VIM4 with the system below:

Linux Khadas 5.15.137 #1.7.3 SMP PREEMPT Fri Nov 29 09:53:55 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

Multilingual_PP-OCRv3_det_infer.zip (2.1 MB)

Hello @GHdevlog ,

I have received your model. The problem may be in our conversion tool. Our engineers are looking into the issue now.

@Louis-Cheng-Liu

Have you made any progress in resolving the issue so far?

Is there anything else I can check in the conversion or execution process?

Hello @GHdevlog ,

Sorry for the late reply. The problem is that the ADLA model loses too much precision. Add a parameter to quantize the model per channel.

--model-name mul_ppocr_det 
--model-type onnx 
--model ./mul_ppocr_det.onnx 
--inputs "x" 
--input-shapes "3,736,736" 
--dtypes "float32" 
--quantize-dtype int8 
--outdir onnx_output 
--channel-mean-value "123.675,116.28,103.53,57.375" 
--source-file ocr_det_dataset.txt 
--iterations 500 
--batch-size 1 
--kboard VIM4 
--inference-input-type "float32" 
--inference-output-type "float32" 
--disable-per-channel False 
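The --source-file list points the quantizer at calibration data. As a rough sketch of generating it, assuming the file takes one image path per line (please check the convert-tool documentation for the exact format) and using a made-up calib_images directory:

```python
# Minimal sketch: build a calibration-image list for quantization.
# ASSUMPTION: the tool expects one image path per line in the .txt file;
# the directory name "calib_images" is invented for this example.
from pathlib import Path

paths = sorted(Path("calib_images").glob("*.jpg"))
with open("ocr_det_dataset.txt", "w") as f:
    for p in paths:
        f.write(f"{p}\n")
print(f"wrote {len(paths)} calibration image paths")
```

Note also that --channel-mean-value holds three per-channel means plus a single shared scale, so inputs are normalized as (pixel - mean) / scale, which matches PP-OCR's ImageNet-style preprocessing.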

Thank you for your help.

I was wondering how the --disable-per-channel False option specifically affects the conversion.

Can you explain how this option changes the model so dramatically?

Hello @GHdevlog ,

Each layer has many filters. Per-layer quantization quantizes all the filters together, so they share scaling factors and zero points. Per-channel quantization quantizes each filter individually, so each filter gets its own scaling factor and zero point. That is why the per-channel quantized model has higher accuracy.
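As a generic illustration (plain NumPy, not our convert tool), quantizing the same weight tensor with one shared scale versus one scale per filter shows the accuracy difference, especially when filter magnitudes vary a lot:

```python
# Generic illustration of per-layer vs per-channel symmetric int8 quantization.
import numpy as np

rng = np.random.default_rng(0)
# Fake conv weights: 8 filters with very different magnitudes, which is
# exactly the situation where per-layer quantization loses precision.
w = rng.normal(size=(8, 3, 3, 3)) * rng.uniform(0.01, 2.0, size=(8, 1, 1, 1))

def quant_dequant(x, scale):
    # Round to int8 range, then map back to float with the same scale.
    q = np.clip(np.round(x / scale), -127, 127)
    return q * scale

# Per-layer: one scale shared by every filter.
s_layer = np.abs(w).max() / 127
err_layer = np.abs(w - quant_dequant(w, s_layer)).mean()

# Per-channel: one scale per filter (axis 0).
s_chan = np.abs(w).max(axis=(1, 2, 3), keepdims=True) / 127
err_chan = np.abs(w - quant_dequant(w, s_chan)).mean()

print(f"per-layer error:   {err_layer:.6f}")
print(f"per-channel error: {err_chan:.6f}")  # noticeably smaller
```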