OpenCV DNN demo model quantization

Which Khadas SBC do you use?

VIM3 Pro

Which system do you use? Android, Ubuntu, OOWOW or others?

Ubuntu 20, release 1.4

Which version of system do you use? Khadas official images, self built images, or others?

official 1.4

Please describe your issue below:

How do I quantize the OpenCV DNN ONNX model? I tried to quantize YOLOv5 with the script below, but the performance on the NPU is worse than on the CPU for the 640 model:

0.6 FPS (NPU) vs 1.7 FPS (CPU)

How did you quantize and calibrate the DNN demo ONNX model?
Here is how I tried to do it with onnxruntime:

import sys

import numpy as np
from onnxruntime.quantization import (
    CalibrationDataReader,
    CalibrationMethod,
    QuantFormat,
    QuantType,
    quantize_static,
)

# Make the parent directory (assumed to be the YOLOv5 repo root) importable
sys.path.append("..")
from utils.datasets import LoadImages
from utils.general import check_dataset

print("add arguments to quantize: onnx_file_name path_imageset.yaml")


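# Model path, e.g. 'yolov5s'; a trailing ".onnx" is stripped because the suffix is appended again below
model_path = sys.argv[1]
if model_path.endswith('.onnx'):
    model_path = model_path[:-len('.onnx')]
print("model path=", model_path)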

def representative_dataset_gen(dataset, ncalib=100):
    # Returns a generator function that yields up to ncalib calibration images as float32 arrays with a batch axis
    def data_gen():
        for n, (path, img, im0s, vid_cap, string) in enumerate(dataset):
            if n >= ncalib:
                break
            # Add the batch axis and scale pixel values to [0, 1]
            batch = np.expand_dims(img, axis=0).astype(np.float32)
            batch /= 255
            yield [batch]
    return data_gen

class CalibrationDataGenYOLO(CalibrationDataReader):
    def __init__(self,
        calib_data_gen,
        input_name
    ):
        x_train = calib_data_gen
        self.calib_data = iter([{input_name: np.array(data[0])} for data in x_train()])

    def get_next(self):
        return next(self.calib_data, None)


dataset = LoadImages(check_dataset(sys.argv[2])['val'], img_size=[640, 640], auto=False)
#dataset = LoadImages(check_dataset('./data/coco128.yaml')['train'], img_size=[640, 640], auto=False)
data_generator = representative_dataset_gen(dataset)

data_reader = CalibrationDataGenYOLO(
    calib_data_gen=data_generator,
    input_name='images'
)


## Since someone mentioned that QDQ is bad with the NPU (https://github.com/opencv/opencv/issues/22803), I specified QuantFormat.QOperator. Is that correct?
# Quantize the exported model
quantize_static(
    f'{model_path}.onnx',
    f'{model_path}_ort_quant.u8s8.exclude.bigscale.onnx',
    calibration_data_reader=data_reader,
    quant_format=QuantFormat.QOperator,
    activation_type=QuantType.QUInt8,
    weight_type=QuantType.QInt8,
    nodes_to_exclude=['Mul_214', 'Mul_225', 'Mul_249', 'Mul_260', 'Mul_284', 'Mul_295', 'Concat_231', 'Concat_266', 'Concat_301', 'Concat_303'],
    per_channel=True,
    reduce_range=True,
    calibrate_method=CalibrationMethod.MinMax
)
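
For reference, the script above relies on MinMax calibration with unsigned 8-bit activations. The sketch below illustrates the general idea of turning an observed min/max range into a scale and zero point. It is a simplified illustration in plain Python, not onnxruntime's actual implementation; the function names and the sample values are made up for this example.

```python
def minmax_uint8_params(values, qmin=0, qmax=255):
    """Derive (scale, zero_point) from the observed min/max of calibration values."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    rmin = min(0.0, min(values))
    rmax = max(0.0, max(values))
    if rmax == rmin:
        return 1.0, qmin
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map float values to integers in [qmin, qmax]."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(qvalues, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in qvalues]


# Made-up activation values standing in for one tensor seen during calibration.
activations = [-1.0, 0.0, 0.5, 3.0]
scale, zero_point = minmax_uint8_params(activations)
quantized = quantize(activations, scale, zero_point)
restored = dequantize(quantized, scale, zero_point)
print(scale, zero_point, quantized, restored)
```

A single outlier in the calibration images widens the min/max range and makes the scale coarser for every other value, which is one reason the choice of calibration images matters.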

Are there better parameters I should test?

Please advise.

@Frank @Louis-Cheng-Liu
Please help reply.

Hello @osos55

This model has limitations that prevent it from running fully on the NPU with the OpenCV NPU backend.

You can try using the NPU SDK to convert the model and run it with the C++ code instead of the OpenCV NPU backend to get better performance.

https://docs.khadas.com/products/sbc/vim3/npu/start#c-api
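
When comparing the CPU, the OpenCV NPU backend, and the SDK path, it may help to time each one the same way. The helper below is a minimal sketch using only the Python standard library; `measure_fps` and `dummy_infer` are hypothetical names, and `dummy_infer` is a placeholder that would be replaced by a real single-frame inference call.

```python
import time


def measure_fps(infer, warmup=3, runs=20):
    """Return average frames per second, calling infer() once per frame."""
    # Warm-up calls are excluded from timing (first runs often include setup cost).
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    elapsed = time.perf_counter() - start
    return runs / elapsed


def dummy_infer():
    # Placeholder for one forward pass; sleeps 10 ms to simulate work.
    time.sleep(0.01)


fps = measure_fps(dummy_infer, warmup=1, runs=5)
print(f"{fps:.1f} FPS")
```

Timing only the forward pass, with pre- and post-processing measured separately, makes it easier to see whether the slowdown comes from the NPU itself or from layers that fall back to the CPU.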