VIM3 Segmentation fault with Tflite NPU

Pedro_Fernandes · December 15, 2021, 6:11pm

Hey all,

I have a model that was trained in tf.keras and then converted to tflite; it's essentially MobileNetV2 without the last layer. The conversion with the tflite example succeeds, but when I call inference through the KSNN API I get a segfault. Any ideas?

johndoe · December 15, 2021, 6:29pm

I'm facing the same error (Segmentation fault, core dumped) during onnx inference.

model - resnet50
platform - onnx
quantized-dtype - dynamic_fixed_point
qdtype - int16

It works fine with asymmetric affine and dynamic_fixed_point (int8) as the quantized-dtype.
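
For readers comparing the two quantization modes named above, here is a minimal NumPy sketch of the textbook definitions of asymmetric affine (uint8) and dynamic fixed point (int8/int16) quantization. The function names, `scale`, `zero_point`, and `frac_bits` are illustrative assumptions; the thread does not show how the conversion toolkit implements these internally.

```python
import numpy as np

def quantize_affine_uint8(x, scale, zero_point):
    """Asymmetric affine: q = round(x / scale) + zero_point, clipped to [0, 255]."""
    q = np.round(np.asarray(x, dtype=np.float64) / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def quantize_dfp(x, frac_bits, bits=16):
    """Dynamic fixed point: q = round(x * 2**frac_bits), clipped to the signed range."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = np.round(np.asarray(x, dtype=np.float64) * (2.0 ** frac_bits))
    return np.clip(q, lo, hi).astype(np.int16 if bits == 16 else np.int8)

def dequantize_dfp(q, frac_bits):
    """Map fixed-point integers back to floats."""
    return np.asarray(q, dtype=np.float64) / (2.0 ** frac_bits)

if __name__ == "__main__":
    values = [0.5, -1.0, 1000.0]
    print(quantize_dfp(values, frac_bits=8))            # int16, large values saturate
    print(quantize_affine_uint8([0.0, 1.0], 0.5, 10))   # uint8 with an offset
```

The point of the sketch is only that the two schemes carry different parameters (a scale and zero point versus a fractional length), so pre- and post-processing written for one does not automatically carry over to the other.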

Pedro_Fernandes · December 15, 2021, 6:33pm

What did you use for conversion?

Frank · December 16, 2021, 12:51am

@Pedro_Fernandes Please share your model and your conversion parameters. I will test it next week.

johndoe · December 16, 2021, 3:32am

I used the ksnn conversion script

./convert \
--model-name resnet18-v1 \
--platform onnx \
--model onnx_models/resnet18-v1.onnx \
--mean-values '103.94,116.78,123.68,58.82' \
--quantized-dtype dynamic_fixed_point --qtype int16 \
--kboard VIM3 --print-level 0

@Frank any reason for this behaviour? Does khadas not support int16 as of yet?

Frank · December 16, 2021, 6:09am

@johndoe I tested it. It can be converted and run, but the result is not correct (because my parameters have not been adjusted; they are still the uint8 parameters).

python3 resnet18.py --model ~/resnet18.nb --library ~/libnn_resnet18.so --picture data/goldfish_224x224.jpg --level 0
 |---+ KSNN Version: v1.2 +---| 
Start init neural network ...
Done.
Get input data ...
Done.
Start inference ...
Done. inference : 0.6531527042388916 s
----Resnet18----
-----TOP 5-----
[611]: 0.5673807859420776
[599]: 0.10485244542360306
[741]: 0.06157940998673439
[750]: 0.03895159438252449
[669]: 0.01403861679136753
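
For readers unfamiliar with the listing above, a TOP 5 report of this kind is usually produced by sorting the output scores and printing the five largest with their class indices. The following is a minimal sketch under that assumption; `top_k` is a hypothetical helper, not code taken from the KSNN example.

```python
import numpy as np

def top_k(scores, k=5):
    """Return (class_index, score) pairs for the k largest scores, highest first."""
    scores = np.asarray(scores, dtype=np.float32).reshape(-1)
    order = np.argsort(scores)[::-1][:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]

if __name__ == "__main__":
    demo = [0.1, 0.9, 0.3, 0.5, 0.2, 0.7]  # made-up scores for illustration
    for index, score in top_k(demo, k=3):
        print(f"[{index}]: {score}")
```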

johndoe · December 16, 2021, 6:11am

@Frank I see. So some issue with the convert script itself then?

Here’s my model source

Frank · December 16, 2021, 6:23am

@johndoe

(npu-test) yan@yan-wyb:~/yan/git/khadas/about-npu/amlogic/aml_npu_sdk/acuity-toolkit/python$ ./convert --model-name resnet18 --platform onnx --model ~/yan/tmp/resnet18-v1-7.onnx  --mean-values '103.94,116.78,123.68,58.82' --quantized-dtype dynamic_fixed_point --qtype int16 --kboard VIM3 --print-level 1

I verified your model and got the same outcome: it runs, but because the parameters are not adjusted, the result is incorrect.

johndoe · December 16, 2021, 6:25am

@Frank How can I fix this issue?

Frank · December 16, 2021, 6:27am

@johndoe I did not encounter a segfault, so I cannot reproduce your problem

johndoe · December 16, 2021, 6:30am

@Frank Have there been any reports of similar segmentation faults with int16? Can you recommend any basic principles for dealing with it?

In case I’m able to handle the segmentation fault, how can I get past the wrong inference results?
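
One generic first step for a segfault in a Python script that calls into a native library is the standard-library `faulthandler` module, which dumps the Python traceback when the process receives SIGSEGV. The sketch below is a general Python technique, not something confirmed in this thread to resolve the KSNN crash; the helper name and the log path are assumptions.

```python
import faulthandler

def enable_crash_traceback(log_path="crash_traceback.log"):
    """Write the Python traceback to log_path if the process crashes.

    The returned file object must stay open for as long as the handler
    is enabled, so keep a reference to it.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file

# Usage (at the very top of the inference script, before loading the model):
#   crash_log = enable_crash_traceback()
```

The same effect is available without editing the script by running `python3 -X faulthandler resnet18.py ...`, which prints the traceback to stderr. Either way, the output only shows which Python call was active at the time of the crash; it does not explain the fault inside the native library.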

Frank · December 16, 2021, 6:32am

@johndoe I didn’t encounter any errors. I just converted the model and then used the resnet18 code in ksnn to verify it, with no changes to the model or the code.

johndoe · December 16, 2021, 6:33am

@Frank Got you. Will try to figure out some workaround

Frank · December 16, 2021, 6:34am

@johndoe Did you make any changes? I haven’t made any changes to the test code, and the original code runs. I’m sorry I can’t reproduce it; otherwise I could help you solve this problem.

johndoe · December 16, 2021, 6:35am

I didn’t make any changes to the code either. I’m using the resnet18.py code in the examples/pytorch directory.

Frank · December 16, 2021, 6:38am

@johndoe Please try again. My test runs, so I don’t know whether we missed something that caused our test results to differ.

johndoe · December 16, 2021, 6:46am

@Frank I’m rerunning my script. Did you find any warning during conversion?

Also, could it have something to do with the onnx version? I’ve got 1.6.0 on my host machine, where the conversion script runs.
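
To compare environments across machines, the installed package versions can be printed with the standard-library `importlib.metadata` module. This is a minimal sketch; `installed_version` is a hypothetical helper, and the package names in the loop are only examples of what might be worth comparing.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

if __name__ == "__main__":
    # Example package names; adjust to whatever the conversion host uses.
    for name in ("onnx", "numpy"):
        print(name, installed_version(name))
```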

johndoe · December 16, 2021, 6:50am

My convert script

./convert \
--model-name resnet18-v1 \
--platform onnx \
--model onnx_models/resnet18-v1.onnx \
--mean-values '103.94,116.78,123.68,58.82' \
--quantized-dtype dynamic_fixed_point --qtype int16 \
--kboard VIM3 --print-level 1

My inference script

khadas@Khadas:~/onnx$ python3 resnet18.py --model ~/resnet18-v1/resnet18-v1.nb  --library ~/resnet18-v1/libnn_resnet18-v1.so
#productname=VIPNano-QI, pid=0x88
Create Neural Network: 34ms or 34008us
Segmentation fault (core dumped)
khadas@Khadas:~/onnx$

Pedro_Fernandes · December 16, 2021, 9:41am

@Frank

./convert --model-name mnv2_tflite --platform tflite --model.tflite --mean-values '128,128,128,128' --quantized-dtype asymmetric_affine --kboard VIM3 --print-level 1

This is my conversion script.

import numpy as np
import os
import argparse
import sys
from ksnn.api import KSNN
from ksnn.types import *
import cv2 as cv
import IPython
import time

model = "/home/khadas/mnv2_tflite/mnv2_tflite.nb"
library = "/home/khadas/mnv2_tflite/libnn_mnv2_tflite.so"
model_knn = KSNN("VIM3")
model_knn.nn_init(library=library, model=model, level=0)
print("here")
CSV_IMG = cv.cvtColor(np.ones((640,820,3), dtype=np.float32), cv.COLOR_RGB2GRAY)
cv2_im = []
cv2_im.append(CSV_IMG)
outputs = model_knn.nn_inference(cv2_im, platform='TFLITE')
print("here")
print(outputs, outputs[0].shape)

That is the script I run for inference. Please also find the tflite model attached:
https://drive.google.com/file/d/1T7jSm5bKKw57y2RSlBpyCQCDdtPnosCj/view?usp=sharing
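
The script above passes a single-channel float32 array of shape (640, 820) to the network. Whether that is related to the crash is not established in this thread, but checking the array against the input the model expects before calling inference is a cheap sanity test. The sketch below assumes a hypothetical expected shape of (224, 224, 3) and dtype uint8; the real values depend on the converted model.

```python
import numpy as np

def check_input(img, expected_shape, expected_dtype=np.uint8):
    """Return a list of mismatch descriptions; an empty list means the input matches."""
    problems = []
    if img.shape != tuple(expected_shape):
        problems.append(f"shape {img.shape} != expected {tuple(expected_shape)}")
    if img.dtype != np.dtype(expected_dtype):
        problems.append(f"dtype {img.dtype} != expected {np.dtype(expected_dtype)}")
    return problems

if __name__ == "__main__":
    # Same shape and dtype as the array built in the script above.
    img = np.ones((640, 820), dtype=np.float32)
    # (224, 224, 3) and uint8 are assumptions for illustration only.
    for problem in check_input(img, expected_shape=(224, 224, 3)):
        print("input mismatch:", problem)
```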

Frank · December 17, 2021, 12:45am

@Pedro_Fernandes I will test it next week.