Performance issues with resnet

I converted a pretrained resnet50 model using the following command (parameters taken from Netron):

python3 convert.py \
--platform tensorflow \
--model ~/models/research/slim/resnet_v1_50_graph_new.pb \
--input-size-list '224,224,3' \
--inputs input \
--outputs resnet_v1_50/predictions/Reshape_1 \
--mean-values '124,127,104,60' \
--quantized-dtype asymmetric_affine \
--kboard VIM3 --print-level 1
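For reference (this is my understanding of the KSNN docs, not the converter’s actual code): the four --mean-values entries are three per-channel means plus a single scale, so the implied preprocessing is roughly:

import numpy as np

mean = np.array([124.0, 127.0, 104.0])  # per-channel means from --mean-values
scale = 60.0                            # the fourth value: a shared scale factor

def preprocess(img):
    # img: HxWx3 uint8 image -> float32 tensor via (pixel - mean) / scale
    return (img.astype(np.float32) - mean) / scale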

I plugged the generated .so and .nb files in and ran an inference using the same script as inceptionv3.py:
python3 inceptionv3.py --model ./models/VIM3/resnet50_v1.nb --library ./libs/resnet50_v1_libnn.so --picture ./data/goldfish_224x224.jpg --level 0

But the accuracy is nowhere near the expected results.

Result from inception v3:
----- Show Top5 +-----
2: 0.93408
795: 0.00307
974: 0.00180
408: 0.00169
393: 0.00148

Result from resnet50:
----- Show Top5 +-----
644: 0.03381
783: 0.02422
418: 0.02138
845: 0.01886
111: 0.01665

@johndoe I remember the resnet model needs a softmax in your postprocess. You can follow the resnet18 model demo in the pytorch dir.
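A minimal sketch of that softmax step, assuming the demo returns the raw logits as a flat numpy array (variable names are placeholders):

import numpy as np

def softmax(logits):
    # subtract the max before exponentiating for numerical stability
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

# apply to the raw network output before ranking, e.g.:
# probs = softmax(output.flatten())  # then take the Top-5 over probs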

I was hoping to benchmark a whole range of networks on the VIM3 board using tensorflow. Sticking to a single framework would help keep the experiments uniform. Additionally, say I get resnet18 from pytorch; I’d still need resnet50, 101, 152… in the future.

Could you please help me with any steps (apart from the convert script) that I need to execute to get a model with proper accuracy?

@johndoe After the model is converted it is the same no matter which platform it came from, so you can refer to the post-processing of the pytorch demo.

@Frank
I copied the .nb and .so files for the converted resnet into pytorch’s folder and ran

python3 resnet18.py --model ./models/VIM3/resnet50_v2.nb --library ./libs/resnet50_v2_libnn.so --picture data/goldfish_299x299.jpg --level 0

The outputs were still not consistent. I also noticed another issue with this inference (299x299 input): the prediction results kept changing with every successive run, without any changes to the code or model. I’m attaching a screenshot of it.

Here’s the result for 224x224 image:
----Resnet18----
-----TOP 5-----
[829 905]: 0.0014902635011821985
[829 905]: 0.0014902635011821985
[491]: 0.0010400540195405483
[600]: 0.0010229613399133086
[557]: 0.0010146443964913487

Here’s what it should have been:
----Resnet18----
-----TOP 5-----
[1]: 0.991869330406189
[963]: 0.0015490282094106078
[923]: 0.0009275339543819427
[115]: 0.0006153833237476647
[112]: 0.0005012493929825723

@johndoe These are my convert parameters.

python3 convert.py \
--model-name resnet50 \
--platform onnx \
--model /home/yan/yan/Yan/models-zoo/onnx/resnet50/resnet50v2.onnx \
--mean-values '123.675,116.28,103.53,58.82' \
--quantized-dtype asymmetric_affine \
--kboard VIM3 --print-level 1
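For reference, those mean-values are the standard ImageNet statistics scaled to [0, 255]; a quick check:

imagenet_mean = [0.485, 0.456, 0.406]              # standard ImageNet channel means
print([round(m * 255, 3) for m in imagenet_mean])  # -> [123.675, 116.28, 103.53]
# the fourth value, 58.82, is a single scale term (~1/0.017),
# close to the average ImageNet std 0.226 * 255 ≈ 57.6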

Can you share your resnet model?

I’m not very familiar with the onnx framework. Is the performance issue platform-specific?

Here’s the link to my model
https://drive.google.com/drive/folders/1BI_egXhR_gEDpzXKr5C3fRn1EKZsYVJ_?usp=sharing

I used the same convert parameters (mean-values) as yours, but the accuracy stayed the same. Were you able to figure out the issue?

@johndoe Can you try my resnet50 model? Although it is onnx, I think it can help you determine where the problem is.

Yes, I tried your onnx model. It gives correct accuracy results when run through the inference script. I still couldn’t work out what causes the error in mine.

How did you get this model? I mean, what steps did you follow to finally get the frozen file (before conversion)?
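For comparison, here is roughly how I produced my frozen tf_slim graph (a sketch; the checkpoint path is a placeholder):

python3 export_inference_graph.py \
--model_name=resnet_v1_50 \
--output_file=/tmp/resnet_v1_50_inf_graph.pb

python3 -m tensorflow.python.tools.freeze_graph \
--input_graph=/tmp/resnet_v1_50_inf_graph.pb \
--input_checkpoint=~/checkpoints/resnet_v1_50.ckpt \
--input_binary=true \
--output_graph=resnet_v1_50_graph_new.pb \
--output_node_names=resnet_v1_50/predictions/Reshape_1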

@johndoe I got the model from github; you can compare the differences with your model through https://netron.app

I tried that too. The differences between your onnx model and my frozen resnet model aren’t that significant (other than the naming conventions followed by each framework). Would you like me to attach the netron outputs for both of them?

Could you please try converting the resnet_v2_50 model from tf_slim (assuming that’s where you got the other models too) into .nb & .so files using the ksnn converter? Let me know the accuracy results too in case you’re able to run an inference after that

@johndoe I will find time to test, but it may not be soon.

Sure. Thanks a lot!

Meanwhile, we can try to figure out the issue

Here are the netron outputs for both (tf_slim and onnx) frozen files

Onnx: [netron screenshot]

tf_slim: [netron screenshot]

@Frank While you’re working on the tf_slim resnet_v2_50 issue, could you please tell me the steps you followed to get the resnet50.onnx model?

@johndoe I got it from github, you can search directly on github

@Frank
I tried converting resnet50 from the onnx model zoo and it gave good results:

 |---+ KSNN Version: v1.0 +---|
Done. inference time:  0.02891254425048828
----Resnet50----
-----TOP 5-----
[1]: 0.9986613392829895
[0]: 0.0005297655006870627
[794]: 0.00040659261867403984
[29]: 6.378102261805907e-05
[391]: 5.587655687122606e-05

But when I use resnet18 from the onnx model zoo and try to run it, it gives somewhat unexpected values:

 |---+ KSNN Version: v1.0 +---|
Done. inference : 0.010450601577758789 s
----Resnet18----
-----TOP 5-----
[1]: 0.8268496990203857
[121]: 0.03389100730419159
[927]: 0.030109353363513947
[963]: 0.023764867335557938
[928]: 0.014804825186729431

Whereas pytorch’s pretrained resnet18 model shows better scores:

 |---+ KSNN Version: v1.0 +---|
Done. inference : 0.008417129516601562 s
----Resnet18----
-----TOP 5-----
[1]: 0.991869330406189
[963]: 0.0015490282094106078
[923]: 0.0009275339543819427
[115]: 0.0006153833237476647
[112]: 0.0005012493929825723

Is this behaviour explainable? And is there any way to know beforehand which framework’s pretrained models will work best after conversion?