KSNN align tool

Hello everyone, I’ve trained a yolov8n and converted it on Khadas (VIM3 Pro), and the results are a little bit confusing. I know that results can get worse after quantization, but maybe I’m using incorrect parameters?

./convert \
--model-name yolov8n \
--platform onnx \
--model ./yolov8n.onnx \
--mean-values '0 0 0 0.00392156' \
--quantized-dtype asymmetric_affine \
--source-files ./data/dataset/dataset0.txt \
--kboard VIM3 \
--print-level 0

Are there different options for quantized-dtype, and how do I find the mean-values? I can’t understand why you have 4 elements here; okay, the first three are for the RGB channels, but what is the last one?
Also, how large should the source-files dataset be? I used 9 photos there.

Thank you very much

Hello @Agent_kapo ,

quantized-dtype has two options. asymmetric_affine quantizes your model to uint8. dynamic_fixed_point quantizes it to int8 or int16; if you use the latter, you need to add the --qtype parameter to choose which one you want.

./convert \
--model-name yolov8n \
--platform onnx \
--model ./yolov8n.onnx \
--mean-values '0 0 0 0.00392156' \
--quantized-dtype dynamic_fixed_point \
--qtype int8 \
--source-files ./data/dataset/dataset0.txt \
--kboard VIM3 \
--print-level 0

mean-values specifies your model’s input normalization before inference. If mean-values is ‘m1 m2 m3 scale’, the following is applied (0.00392156 = 1 / 255):

channel_1 = (channel_1 - m1) * scale
channel_2 = (channel_2 - m2) * scale
channel_3 = (channel_3 - m3) * scale

Then the model runs inference on the normalized picture.
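In NumPy terms, this is roughly the following (a minimal sketch of the normalization, assuming an HWC uint8 RGB input; not the convert tool’s actual internals):

import numpy as np

# '--mean-values m1 m2 m3 scale' with m1 = m2 = m3 = 0 and scale = 1/255
means = np.array([0.0, 0.0, 0.0], dtype=np.float32)  # m1 m2 m3
scale = 0.00392156                                   # 1 / 255

def normalize(image):
    # Subtract the per-channel mean, then multiply by the scale.
    return (image.astype(np.float32) - means) * scale

img = np.zeros((640, 640, 3), dtype=np.uint8)  # dummy frame
print(normalize(img).shape)                    # (640, 640, 3)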

Finally, we suggest using about 200 quantization pictures that reflect the model’s actual use and running conditions.
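As far as I know, the file passed to --source-files is just a plain text list with one calibration image path per line. A hypothetical dataset0.txt (these paths are made up for illustration):

./data/dataset/img_0001.jpg
./data/dataset/img_0002.jpg
./data/dataset/img_0003.jpg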

Yeah, I already read about quantization, and I understand I can even push six thousand images by changing the --batch-size and --epochs parameters, correct?

Also, could it have a big influence that I had a small quantization set and used your parameters?

Thank you

Hello @Agent_kapo ,

Sorry, I forgot.

./convert \
--model-name yolov8n \
--platform onnx \
--model ./yolov8n.onnx \
--mean-values '0 0 0 0.00392156' \
--quantized-dtype dynamic_fixed_point \
--qtype int8 \
--source-files ./data/dataset/dataset0.txt \
--batch-size 1 \
--iterations 200 \
--kboard VIM3 \
--print-level 0

iterations × batch-size is the number of quantization images; for example, --batch-size 1 --iterations 200 feeds 200 images to the quantizer. Both parameters default to 1.

There is no parameter named epochs.

200 quantization images are enough; of course, the more the better. A small set can have a big influence, but not for every model. Depending on how the quantized model performs, you can decide whether to add more images.

In general, int16 works better than int8 and uint8, but the latter two infer faster than int16. Theoretically, there is not much precision difference between int8 and uint8. Again, choose based on how the quantized model performs.


Thank you very much!

One last thing: how should I pick the quantization method, and where can I see all the options?

--input-size-list '299,299,3'

Is this parameter important?

Hello @Agent_kapo ,

Sorry, there is no document covering all options of each parameter. In general, quantization has five options: int8, int16, uint8, hybrid, and non-quant. Currently KSNN only supports int8, int16, and uint8.

In ksnn/example there are examples of converting models from different platforms to .nb. Not every platform’s conversion needs --input-size-list; ONNX does not. You can refer to the README in each folder.

Thank you!
I found old documentation in the aml_npu repo and I have a few questions:
1) Is it a good idea to use pegasus instead of the convert tool to quantize weights?
2) I wanted to set the GPU parameter in the convert tool, and here is what I got:

[screenshot not preserved]
3) Why can’t I set the batch-size parameter to more than 1?
4) The int8 and int16 models are really bad: they take longer to run and predict nothing.

My dataset is a bunch of grayscale images with faces. After quantization the accuracy drops; how can I fix it? Can the number of calibration images influence it? Can the mean and scale parameters influence it? Also, as I understand it, the scale is practically always 1/255 and the mean values default to 0; can that have a bad influence?

Maybe you have any other suggestions?

Thank you

And also, do you know how to speed up the post-processing part? I already optimized it with numba, but 75 ms is still too much.
I’m thinking about multithreading, but I’m not sure.

Hello @Agent_kapo ,

About 1): I tried to use a pegasus-converted model in the KSNN demo; it does not work.
About 3): I tried to set batch-size to more than 1, and it works. But the batch size had better not be too large, or you may run out of memory.

About 2) and 4): I have given feedback to our engineers; they are bugs.

About mean and scale: as far as I know, in general they do not affect this. You can also use other mean-values, but then you must change the normalization in your training code and retrain the model.
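(A hypothetical example: if your training pipeline normalized inputs to [-1, 1] with (x - 127.5) / 127.5, the matching flag would be --mean-values '127.5 127.5 127.5 0.00784314', since 0.00784314 ≈ 1 / 127.5.)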

Multithreading is one way; you can search for how to use multithreading in Python. See the sketch below.
Another way is to improve the post-processing algorithm itself.
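For example, one common pattern is to overlap NPU inference with CPU post-processing, so frame N runs on the NPU while frame N-1 is decoded on the CPU. A minimal sketch (run_inference and postprocess are hypothetical stand-ins for your own KSNN call and numba decoder, not KSNN API names):

from concurrent.futures import ThreadPoolExecutor

def run_inference(frame):
    ...  # your KSNN inference call goes here

def postprocess(outputs):
    ...  # your numba-accelerated decode/NMS goes here

def process_video(frames):
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None
        for frame in frames:
            outputs = run_inference(frame)   # NPU-bound
            if pending is not None:
                yield pending.result()       # previous frame's boxes
            pending = pool.submit(postprocess, outputs)  # CPU-bound
        if pending is not None:
            yield pending.result()

This hides the post-processing time behind the next frame’s inference, at the cost of one frame of latency.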

Thank you!
Let me know when there is news about 2) and 4).
Also, how does the number of iterations influence the quantization results (besides speed)?

Hi. I have the same problem: int16 predicts nothing as well. Did you manage to make it run?

When I use output tensor float32 and model quantization type (qtype) int16, the output works but there are no bounding boxes.

Output tensor float32 with qtype uint8 works fine.

Hi,
actually I’m still waiting for an answer from the Khadas team.
If you have problems with float32, send me the convert command you are using. Also, are you using the convert tool or pegasus?

Also, just out of interest, how fast does your inference run?

I use the convert tool. I tested on 500 images. Below are the results:

[benchmark table not preserved]
Interesting. What is Pferd_onnx?
Your YOLO works pretty fast. I used numba to optimize mine; it takes 80 ms per frame on FHD video.

About int16: it just didn’t work. I had fewer predictions, or sometimes they were incorrect.

PHRD is the name of my company’s deep learning model, which is based on the YOLO architecture.