How to use quant_tool?

Agent_kapo · September 25, 2023, 6:42am

I’m using a VIM3 Pro 1.5. I’m working with yolov5s and everything is fine: it predicts what I want. But when I quantize the .tmfile weights, YOLO stops detecting anything. At first I thought it was because of the mean and std, but I recomputed them and nothing changed.

Quantize command:

Here I’m computing the mean and std:

The regular YOLO works, but when I quantize it, it turns into a pumpkin.
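For readers following along, a per-channel mean and std computation of the kind mentioned above might look roughly like the sketch below. It is not the poster's actual code: the function name `channel_stats` and the pixel layout (lists of `(r, g, b)` tuples in the 0..255 range) are assumptions made only for illustration.

```python
import math


def channel_stats(images):
    """Per-channel mean and std over all pixels of all images.

    `images` is an iterable of images, each a list of (r, g, b) tuples
    with values in 0..255 (an assumed layout, for illustration only).
    """
    count = 0
    sums = [0.0, 0.0, 0.0]
    sq_sums = [0.0, 0.0, 0.0]
    for pixels in images:
        for px in pixels:
            count += 1
            for c in range(3):
                sums[c] += px[c]
                sq_sums[c] += px[c] * px[c]
    if count == 0:
        raise ValueError("no pixels given")
    mean = [s / count for s in sums]
    # Population variance E[x^2] - E[x]^2, clamped at 0 against rounding error.
    std = [math.sqrt(max(sq / count - m * m, 0.0)) for sq, m in zip(sq_sums, mean)]
    return mean, std


# Two tiny hypothetical "images" standing in for a real calibration set.
demo_images = [[(0, 0, 0), (255, 255, 255)], [(10, 20, 30), (10, 20, 30)]]
mean, std = channel_stats(demo_images)
print("mean:", mean)
print("std:", std)
```

Whichever values are used, they need to match the preprocessing the float model was trained and run with. If the quantization tool expects a scale rather than a std, its help output should say which convention applies.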

Agent_kapo · September 25, 2023, 6:51am

Here you can see that the regular YOLO works well, but the timvx YOLO sucks:

Electr1 · September 25, 2023, 7:36am

@Agent_kapo quantized models don’t guarantee accuracy all the time. Sometimes the loss is significant and the model won’t work properly.

Agent_kapo · September 25, 2023, 7:42am

Okay, but that sounds strange.
Khadas provides a tool that doesn’t always work)
It’s like buying a car that doesn’t always start.

Electr1 · September 25, 2023, 8:52am

@Agent_kapo it’s not an issue with the conversion tool. The same accuracy drop can occur when using quantized weights with any NN model. It comes from converting floating-point values to fixed-point values. FP32 offers the highest precision but has higher computational requirements.

You can read about how we can get away with using INT8 fixed point for computation in most cases.

But when the quantized weights are just not accurate enough for the model, it will give garbage inference results.

In this case, prefer to run the model without quantization.
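The precision loss described above can be illustrated with a small sketch. It uses a generic 8-bit affine quantization scheme, which is an assumption and not necessarily what quant_tool does internally; the weight values and the function names are hypothetical. It shows how a single large value stretches the quantization range and increases the rounding error on all the small values.

```python
def quantize(values, num_bits=8):
    """Affine quantization of floats to unsigned integers in [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The represented range must include 0.0 so that zero maps to an integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(qmin - lo / scale)
    quantized = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


def max_roundtrip_error(values, check):
    """Largest |original - dequantized| over the first `check` values."""
    q, scale, zp = quantize(values)
    restored = dequantize(q, scale, zp)
    return max(abs(a - b) for a, b in zip(values[:check], restored[:check]))


weights = [-0.5, -0.1, 0.02, 0.3, 0.5]  # hypothetical, well-behaved weights
with_outlier = weights + [40.0]         # same weights plus one large value

err_clean = max_roundtrip_error(weights, len(weights))
err_outlier = max_roundtrip_error(with_outlier, len(weights))
print("max error, narrow range:", err_clean)
print("max error, with outlier:", err_outlier)
```

If a model's weights or activations have such wide ranges, the calibration data and preprocessing used during quantization matter a lot, which may explain why one dataset quantizes well and another does not.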

Agent_kapo · September 25, 2023, 10:45am

Okay, so is there a solution?
I need timvx because it’s fast. I tested it a few months ago on a chess dataset and everything worked fine.

Agent_kapo · September 25, 2023, 10:47am

Because for me it’s strange: the model works with the YOLO dataset, which has 88 classes and a lot of data, while my dataset has 19 classes and 5,000 photos. So why is it like this?

Electr1 · September 25, 2023, 10:48am

Do not quantize for best results. Any unsupported operations that cannot run on the NPU will run on the CPU, and it won’t compromise the performance/accuracy balance.

Is best.tmfile quantized? Can you explain which models you are comparing here?

Agent_kapo · September 25, 2023, 11:42am

If I don’t quantize, then what’s the point of buying a Khadas with an NPU?

Here is my GitHub:

I changed yolo.py (added a forward_export method, which I use when exporting .pt to ONNX).
Then I export it to a tmfile and quantize it.
Before, everything worked well: your YOLO with 88 classes, my custom chess dataset. But not this time.

Agent_kapo · September 25, 2023, 11:42am

And I also rewrote the .cpp file: set it to 19 classes and changed the class names.

Agent_kapo · September 25, 2023, 12:17pm

It’s also strange that quantized models don’t work, because regular YOLO quantizes easily. I did it on a Jetson with a few datasets and had no problems.

Electr1 · September 25, 2023, 12:29pm

@Louis-Cheng-Liu can maybe share some info about this.

Agent_kapo · September 25, 2023, 12:51pm

It would be perfect
Will be waiting

numbqq · September 26, 2023, 1:48am

Hello @Agent_kapo

Tengine is no longer maintained or tested. Please check the NPU-related docs here:

https://docs.khadas.com/products/sbc/vim3/npu/start

Agent_kapo · September 26, 2023, 2:46am

Thank you for the link. I saw KSNN a year ago, but at that time it was not that effective.
I’ll try it

But why is Tengine not maintained anymore?

Agent_kapo · September 26, 2023, 3:06am

Also, about the conversion:
Can I convert the weights on a machine other than the Khadas, and where is ./convert?