Neural network doesn't work after upgrade

Which Khadas SBC do you use?

VIM3

Which system do you use? Android, Ubuntu, OOWOW or others?

Ubuntu

Which version of system do you use? Khadas official images, self built images, or others?

Khadas official image

Please describe your issue below:

After I performed a system upgrade on my Khadas board:

sudo apt update
sudo apt upgrade

my neural network no longer runs.

Post a console log of your issue below:

Start init neural network ...
E [vnn_ProcessGraph:153]CHECK STATUS(-1:A generic error code, used when no other describes the error.)
run neural network  error !!!

My current system version is:

Welcome to Fenix 1.1.1 Ubuntu 20.04.4 LTS Linux 4.9.241  

I tried to recreate the model, but that fails as well:

Start to Generate inputmeta ...
2022-08-08 18:24:15.892560: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/jupyter/aml_npu_sdk/acuity-toolkit/bin/acuitylib:/tmp/_MEIP2cua2
2022-08-08 18:24:15.892610: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
I Namespace(channel_mean_value='127.5', generate='inputmeta', input_meta_output='Model_inputmeta.yml', model='Model.json', separated_database=False, source_file='data/dataset/dataset0.txt', which='generate')
I Load model in Model.json
[17420] Failed to execute script pegasus
Traceback (most recent call last):
  File "pegasus.py", line 131, in <module>
  File "pegasus.py", line 123, in main
  File "acuitylib/app/console/commands.py", line 105, in execute
  File "acuitylib/app/console/inputmeta.py", line 112, in generate_inputmeta
  File "acuitylib/app/console/inputmeta.py", line 80, in generate_port
  File "acuitylib/app/console/inputmeta.py", line 80, in <listcomp>
IndexError: list index out of range

So how do I downgrade my system to the state it was in before the upgrade?
Or how can I recreate the model?

Thanks

@Gabriel_Lema Which demo are you running?

Not sure if "demo" is the right word here. I’m following the instructions you pointed out here to convert my mobilenet-ssd-v1 model.

I’m no longer able to run those same commands if I download the repo from scratch. I get the error mentioned above when trying to convert the model.

Thanks

If I increase the verbosity level in the nn_init function, I get:

Start init neural network ...
#productname=VIPNano-QI, pid=0x88
Create Neural Network: 13ms or 13711us
Start run graph [1] times...
vxoBinaryGraph_GenerateStatesBuffer[6848]: binary sramSize: 0x100000, context size: 0xfff00
vxoBinaryGraph_GenerateStatesBuffer[6852]: binary sram more than context sram
fail to initial memory in generate states buffer
fail in import kernel from file initializer
Failed to initialize Kernel "model_person_v1.0" of Node 0x5f2f840 (status = -1)vxProcessGraph[19051]: Process Graph fail!
Run graph the 0 time fail
E [vnn_ProcessGraph:153]CHECK STATUS(-1:A generic error code, used when no other describes the error.)
run neural network  error !!!
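
For reference, the two hex sizes in the vxoBinaryGraph_GenerateStatesBuffer lines can be compared directly. This is a minimal shell sketch that uses only the values printed in the log above; the variable names are my own and not part of any SDK.

```shell
# Sizes copied from the vxoBinaryGraph_GenerateStatesBuffer lines in the log.
binary_sram=$((0x100000))   # "binary sramSize" reported for the model binary
context_sram=$((0xfff00))   # "context size" reported for the runtime context

# How far the binary's request exceeds what the context reports.
excess=$((binary_sram - context_sram))

echo "binary:  ${binary_sram} bytes"
echo "context: ${context_sram} bytes"
echo "excess:  ${excess} bytes"
```

The binary asks for 1048576 bytes while the context reports 1048320, a difference of 256 bytes, which is the condition the "binary sram more than context sram" line complains about.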

@Gabriel_Lema All repositories must be at their latest versions. My suggestion is to clone the latest demo and test it on the latest system image to confirm that there is no problem with the environment.

This kind of error is usually caused by a version mismatch.

Thanks for your reply @Frank. If I follow the steps in the repo and run the command:

./convert --model-name mobilenet_ssd \
          --platform onnx --kboard VIM3 \
          --model model/mb1-ssd.onnx \
          --quantized-dtype asymmetric_affine \
          --mean-values '127.5' \
          --print-level 1 \
          --source-files data/dataset/dataset0.txt

I get the following:

Done.import model success !!!

Start to Generate inputmeta ...
2022-08-12 20:55:39.591489: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/glema/aml/acuity-toolkit/bin/acuitylib:/tmp/_MEIj0J5c2
2022-08-12 20:55:39.591534: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
I Namespace(channel_mean_value='127.5', generate='inputmeta', input_meta_output='Model_inputmeta.yml', model='Model.json', separated_database=False, source_file='data/dataset/dataset0.txt', which='generate')
I Load model in Model.json
[364389] Failed to execute script pegasus
Traceback (most recent call last):
  File "pegasus.py", line 131, in <module>
  File "pegasus.py", line 123, in main
  File "acuitylib/app/console/commands.py", line 105, in execute
  File "acuitylib/app/console/inputmeta.py", line 112, in generate_inputmeta
  File "acuitylib/app/console/inputmeta.py", line 80, in generate_port
  File "acuitylib/app/console/inputmeta.py", line 80, in <listcomp>
IndexError: list index out of range

What am I doing wrong?
Thanks!

@Gabriel_Lema

If your model is a single-channel model, there should be two parameters here.
I think it should be --mean-values '128 0.0078125'.
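
To illustrate what the two numbers in '128 0.0078125' would do, here is a rough shell sketch. It assumes the toolkit normalizes each input value as (pixel - mean) * scale; that formula is my assumption and is not confirmed anywhere in this thread. The normalize function is a hypothetical helper, not part of the SDK. The only certain fact is that 0.0078125 equals 1/128.

```shell
# Assumption: each input value is normalized as (pixel - mean) * scale.
# normalize is a hypothetical helper written for this illustration only.
normalize() {
    # $1 = pixel value, $2 = mean, $3 = scale
    awk -v p="$1" -v m="$2" -v s="$3" 'BEGIN { printf "%.7f\n", (p - m) * s }'
}

normalize 0   128 0.0078125   # darkest 8-bit pixel   -> -1.0000000
normalize 255 128 0.0078125   # brightest 8-bit pixel ->  0.9921875
```

Under that assumption, a mean of 128 with a scale of 1/128 maps 8-bit pixel values into roughly the range -1 to 1.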

The issue was how I specified the mean values in the command: I was using comma-separated values (as in my first post) instead of space-separated values.

So if I run:

./convert --model-name mobilenet_ssd \
          --platform onnx --kboard VIM3 \
          --model model/mb1-ssd.onnx \
          --quantized-dtype asymmetric_affine \
          --mean-values '127.5 127.5 127.5 127.5 0.0078125' \
          --print-level 1 \
          --source-files data/dataset/dataset0.txt

everything works now.
Thanks @Frank!
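
For anyone hitting the same IndexError, a small pre-flight check on the --mean-values string can catch the mistake before the conversion starts. This is only a sketch: check_mean_values is a hypothetical helper, and the rule it enforces (space-separated, at least one mean followed by a scale) is inferred from this thread rather than taken from the toolkit documentation.

```shell
# Hypothetical helper: sanity-check a --mean-values argument before calling ./convert.
# Assumed rule (inferred from this thread): space-separated values, with at least
# one mean followed by a scale.
check_mean_values() {
    value=$1
    case "$value" in
        *,*)
            echo "error: separate the values with spaces, not commas" >&2
            return 1
            ;;
    esac
    # Unquoted on purpose: split the string on whitespace into positional parameters.
    set -- $value
    if [ "$#" -lt 2 ]; then
        echo "error: expected at least one mean followed by a scale" >&2
        return 1
    fi
    echo "ok: $(( $# - 1 )) mean value(s) plus scale"
}

check_mean_values '128 0.0078125'
```

A lone '127.5', like the channel_mean_value shown in the failing logs, or a comma-separated string is rejected by this check.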