INT16 model does not work on KSNN NPU inference

Hi all

I tried to run an INT16 model using the output_format=OUT_FORMAT_INT16 setting, but it didn’t work as expected.

Here’s the code snippet I used:

outputs = ssd.nn_inference(cv_img, platform='TFLITE', output_tensor=4, reorder='0 1 2', output_format=output_format.OUT_FORMAT_INT16)

input0_data = outputs[0]
input1_data = outputs[2]
input2_data = outputs[1]
input3_data = outputs[3]
print(outputs)

# Reshape each output head to (GRID, GRID, SPAN, LISTSIZE), then transpose to (SPAN, LISTSIZE, GRID, GRID)
input0_data = input0_data.reshape(GRID0, GRID0, SPAN, LISTSIZE).transpose(2, 3, 0, 1)
input1_data = input1_data.reshape(GRID1, GRID1, SPAN, LISTSIZE).transpose(2, 3, 0, 1)
input2_data = input2_data.reshape(GRID2, GRID2, SPAN, LISTSIZE).transpose(2, 3, 0, 1)
input3_data = input3_data.reshape(GRID3, GRID3, SPAN, LISTSIZE).transpose(2, 3, 0, 1)

#input_data = [input0_data, input1_data, input2_data, input3_data]
# Transpose back to (GRID, GRID, SPAN, LISTSIZE) before post-processing
input_data = list()
input_data.append(np.transpose(input0_data, (2, 3, 0, 1)))
input_data.append(np.transpose(input1_data, (2, 3, 0, 1)))
input_data.append(np.transpose(input2_data, (2, 3, 0, 1)))
input_data.append(np.transpose(input3_data, (2, 3, 0, 1)))
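
For context, here is a minimal sketch of the full flow around that call, modelled on the standard KSNN examples (the library/model paths and the test image below are placeholders, not my actual files):

import cv2 as cv
from ksnn.api import KSNN
from ksnn.types import output_format

LIBRARY = './libnn_phrd.so'        # placeholder: library generated by ./convert
MODEL = './phrd_tflite-int16.nb'   # placeholder: model generated by ./convert

ssd = KSNN('VIM3')
ssd.nn_init(library=LIBRARY, model=MODEL, level=0)

cv_img = list()
cv_img.append(cv.imread('./test.jpg', cv.IMREAD_COLOR))   # placeholder test image

# Request raw INT16 output tensors from the NPU
outputs = ssd.nn_inference(cv_img, platform='TFLITE', output_tensor=4,
                           reorder='0 1 2',
                           output_format=output_format.OUT_FORMAT_INT16)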

However, the console output showed incorrect data (all zeros), so of course there were no detections:

 |---+ KSNN Version: v1.3 +---| 
Start init neural network ...
Done.
[array([0, 0, 0, ..., 0, 0, 0], dtype=int16), array([0, 0, 0, ..., 0, 0, 0], dtype=int16), array([0, 0, 0, ..., 0, 0, 0], dtype=int16), array([0, 0, 0, ..., 0, 0, 0], dtype=int16)]

I had already converted the model to INT16 using the following command:

./convert \
--model-name phrd_tflite-int16 \
--platform tflite \
--model phrd.tflite \
--mean-values '0 0 0 0.003921569' \
--quantized-dtype dynamic_fixed_point \
--qtype int16 \
--source-files ./data/dataset/dataset0.txt \
--kboard VIM3 \
--print-level 1
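
For reference, the file passed to --source-files is, as far as I understand from the KSNN conversion docs, just a plain-text list of calibration images, one path per line (the file names below are only placeholders):

./data/dataset/img_000.jpg
./data/dataset/img_001.jpg
./data/dataset/img_002.jpg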

Can anyone please help me resolve this issue? Thank you.

@keikeigd What does your model architecture look like (e.g., from Netron)? Is the output in float16 or float32?

@Electr1 The model architecture is the same as the YOLO architecture. The output is in float32.

Sorry, I pasted the wrong code. It should be output_format=output_format.OUT_FORMAT_INT16.

I tried to use the KSNN API with output format INT16, and I had already quantized the model to INT16 with --quantized-dtype dynamic_fixed_point --qtype int16.

1. I did some testing and found that output_format only works when it is set to FP32:

outputs = ssd.nn_inference(cv_img, platform='TFLITE', output_tensor=4, reorder='0 1 2', output_format=output_format.OUT_FORMAT_FLOAT32)

If I change output_format to any other type (uint8, int8, or int16), the outputs are just lists of zeros (a quick sanity-check sketch follows the test cases below).

2. For the model quantization type, the default is --quantized-dtype asymmetric_affine, which is uint8. To quantize to int16 or int8, I use --quantized-dtype dynamic_fixed_point with --qtype int16 or --qtype int8.

Test cases (all use FP32 output tensors):

Case 1: model quantization type uint8: it shows outputs and bounding boxes.

Case 2: model quantization type int8: it shows outputs and bounding boxes, but with low accuracy.

Case 3: model quantization type int16: it shows outputs but no detections.
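
To make this easy to reproduce, here is a quick sanity-check sketch on top of the snippet above (it reuses ssd and cv_img from there); it just counts the non-zero values in each returned tensor for the float32 and int16 output formats:

import numpy as np
from ksnn.types import output_format

# Only OUT_FORMAT_FLOAT32 returns non-zero data in my tests; the
# uint8/int8 output formats behave like OUT_FORMAT_INT16 (all zeros).
formats = {
    'float32': output_format.OUT_FORMAT_FLOAT32,
    'int16': output_format.OUT_FORMAT_INT16,
}

for name, fmt in formats.items():
    outs = ssd.nn_inference(cv_img, platform='TFLITE', output_tensor=4,
                            reorder='0 1 2', output_format=fmt)
    nonzero = [int(np.count_nonzero(o)) for o in outs]
    print(name, 'non-zero values per tensor:', nonzero)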

I hope you can help me look into this problem.

@Agent_kapo Hi, you can take a look at the problem described here. Did you run into the same issue?

In my case, the inference time was practically the same, and there were fewer predictions, or they were incorrect.