NPU Python API: ksnn v0.1 Test Version Release (en)

What is ksnn?


ksnn stands for Khadas Software Neural Network.

Why do this?

The original SDK only has a C/C++ interface. Since its release in 2019, users have been asking whether it will support a Python API. Python is the most widely used programming language in the AI field, so we decided to build a Python version of the API to meet that demand.


v0.1 is a test version. We hope that more users will participate and give us feedback. We will release the official version after improving the API.



  1. For some special reasons, the current Python API still relies on the nb files and case code converted by the original SDK.
  2. Apart from the core packages, the main Python dependencies are opencv and numpy.


  1. The current version only supports a single input; multiple inputs will be supported in the future.
  2. Hybrid quantization is not yet supported.
  3. As with the original SDK, only the tensors and layers supported by the SDK can be used.

How To Use



$ git clone

Use SDK to convert

The original model cannot call the NPU directly; it must first be converted with the SDK. Here, taking mobilenet as an example, we use the Python conversion tool to obtain the required nb file and library.

$ cd acuity-toolkit/python/
$ ./convert --model-name mobilenet_v1 \
--convert-platform tensorflow \
--tf-inputs input --tf-input-size-list '224,224,3' \
--tf-outputs MobilenetV1/Predictions/Softmax \
--tf-model-file ../demo/model/mobilenet_v1.pb \
--source-file-path ../demo/data/validation_tf.txt \
--channel-mean-value '128 128 128 128' \
--quantized-dtype asymmetric_affine-u8 \
--reorder-channel '0 1 2' \
--kboard VIM3 
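The `--channel-mean-value 'm1 m2 m3 scale'` flag describes the input normalization baked into the converted model. As I understand the acuity convention, each channel is transformed as `(pixel - mean) / scale`, so `'128 128 128 128'` maps 0–255 pixel values to roughly [-1, 1]. A minimal numpy sketch of that preprocessing (the convention is my reading of the tool, not confirmed by this post):

```python
import numpy as np

# Preprocessing implied by --channel-mean-value '128 128 128 128':
# per-channel (pixel - mean) / scale (acuity convention, to my understanding).
means = np.array([128.0, 128.0, 128.0])
scale = 128.0

# A placeholder 224x224 RGB image with values in 0..255.
img = np.random.randint(0, 256, size=(224, 224, 3)).astype(np.float32)

normalized = (img - means) / scale
print(normalized.min() >= -1.0, normalized.max() <= 1.0)
```

With mean 128 and scale 128 the result always stays within [-1, 1), which matches what u8 asymmetric-affine quantization expects from a symmetric input range.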

Get the converted data

$ ls outputs/mobilenet_v1/
mobilenet_v1.nb

Copy the data to the board.

ksnn package


$ wget

The test version of the wheel is placed on the khadas server.


Only Python 3 is supported here. You need to install python3-pip before installing the wheel:

$ sudo apt install python3-pip

After the installation succeeds, install the wheel:

$ pip3 install ksnn-0.1-py3-none-linux_aarch64.whl

This will also install opencv and numpy.


$ python3
Python 3.8.10 (default, Jun  2 2021, 10:49:15) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ksnn

If the import succeeds, the installation is complete.
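For orientation, here is a rough sketch of what driving the API might look like. The class and method names (`KSNN`, `nn_init`, `nn_inference`) and their parameters are assumptions taken from later public ksnn examples; the v0.1 test API may use different names, so treat this as illustration only. The import is guarded so the script degrades gracefully on a machine without ksnn:

```python
import numpy as np

try:
    # Assumed module path; the v0.1 test wheel may differ.
    from ksnn.api import KSNN
    HAVE_KSNN = True
except ImportError:
    HAVE_KSNN = False

if HAVE_KSNN:
    net = KSNN('VIM3')  # target board (assumed constructor argument)
    # Paths as produced by the convert step above (hypothetical library name).
    net.nn_init(library='libs/libnn_mobilenet_v1.so',
                model='outputs/mobilenet_v1/mobilenet_v1.nb')
    img = np.zeros((224, 224, 3), dtype=np.uint8)  # placeholder input
    outputs = net.nn_inference(img)
    print('inference ran')
else:
    print('ksnn not installed; nothing to run here')
```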

Test examples

Get code

$ git clone


Install dependencies

$ pip3 install matplotlib

Run the inceptionv3 test example:

$ cd ksnn-examples/tensorflow/   
khadas@Khadas:~/ksnn-example/tensorflow$ python3 --nb-file models/VIM3/inceptionv3.nb --input-picture data/1080p.bmp --so-lib libs/ 
 |---+ KSNN Version: v0.1 +---| 
Create Neural Network: 46ms or 46580us
set input time :  0.004050493240356445
Start run graph [1] times...
Run the 1 time: 20.00ms or 20854.00us
vxProcessGraph execution time:
Total   20.00ms or 20906.00us
Average 20.91ms or 20906.00us
get ouput time:  0.0007474422454833984
-----+ Show Top5 +-----
   904: 0.41821
   656: 0.11829
   446: 0.09064
   639: 0.05692
   825: 0.03821
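The Top5 listing above pairs class indices with their softmax scores. Computing such a Top-N from a flat output array takes one numpy call; a sketch with made-up scores (not the real inceptionv3 output):

```python
import numpy as np

# Toy "softmax output": 1000 class scores (placeholder, not real model output).
rng = np.random.default_rng(0)
scores = rng.random(1000)
scores /= scores.sum()

# Indices of the five largest scores, in descending order.
top5 = np.argsort(scores)[::-1][:5]
print('-----+ Show Top5 +-----')
for idx in top5:
    print(f'{idx:>6}: {scores[idx]:.5f}')
```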

To run the yolov3 test code, refer to the README.


API documentation location:


  1. Any feedback and suggestions from testing are welcome; you can reply in this post, or:
  2. Each platform will get more examples in the future.

  3. The entire ksnn package will eventually be open sourced, but it will stay closed during the testing phase.


@Frank - firstly thanks for this API - much appreciated!

I'm trying to run one of the demos but am getting the following errors with the command line:

python3 --nb-file ./models/VIM3/yolov3.nb --so-lib ./libs/ --video-device 0

|---+ KSNN Version: v0.1 +---|
Create Neural Network: 123ms or 123007us
[ WARN:0] global /tmp/pip-req-build-h8zflkjv/opencv/modules/videoio/src/cap_v4l.cpp (890) open VIDEOIO(V4L2:/dev/video0): can’t open camera by index
resize input pictures error !!!
set input time : 0.0004115104675292969
Set nn inputs error !!!

I suspect the resize and input errors are just because the video can't be opened, and I suspect the video can't be opened because it is a MIPI video source? Device /dev/video0 is the video source I've been using for the NPU C++ examples and it works fine. Could you point me in the right direction please?

Edit: the kernel log has this in it (but I don't understand it, as I'm using the Khadas OS08A10 camera and it works for the C++ examples):

"[ 4869.478983@5] isp_v4l2_stream_try_format@isp-v4l2-stream.c:1326 GENERIC(CRIT) :[Stream#0] format 0x00000000 is not supported, setting default format 0x34424752.
[ 4869.487763@2]
[ 4869.489444@2] fw_intf_stream_stop@fw-interface.c:331 GENERIC(CRIT) :Stream off 4"

Edit 2: rebooted just to see if it made any difference; the detect_demo_x11_mipi binary still works fine while the Python demo still throws the same errors.

@birty Hello, this is a bug in opencv-python. You can try these steps:

  1. $ pip3 uninstall opencv-python
  2. $ sudo apt install libopencv-dev python3-opencv

Perfect - thank you! Got the demo working easily. Now to integrate into my project

@Frank one more question: is it possible to turn off printing these messages? It's filling my screen, so I'm losing the info I do want to see in the noise. I couldn't see the option in the API documentation. Thanks

"set input time : 0.004715919494628906
Start run graph [1] times...
Run the 1 time: 81.00ms or 81521.00us
vxProcessGraph execution time:
Total 81.00ms or 81582.00us
Average 81.58ms or 81582.00us
get ouput time: 0.0045964717864990234"


@birty I will make printing this information optional in the next version.
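In the meantime, one possible workaround (an untested sketch, not part of the ksnn API) is to silence stdout at the file-descriptor level. Because the timing lines likely come from the underlying C library, `contextlib.redirect_stdout` would not catch them, but redirecting fd 1 itself does:

```python
import os
import sys
from contextlib import contextmanager

@contextmanager
def suppress_stdout():
    """Temporarily point fd 1 at /dev/null; also silences C-level printf."""
    sys.stdout.flush()
    saved_fd = os.dup(1)                       # remember real stdout
    devnull = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull, 1)                    # fd 1 now writes to /dev/null
        yield
    finally:
        sys.stdout.flush()
        os.dup2(saved_fd, 1)                   # restore real stdout
        os.close(saved_fd)
        os.close(devnull)

with suppress_stdout():
    print('this line is swallowed')
print('stdout is back')
```

Wrapping the inference call in `with suppress_stdout():` would hide the noise, at the cost of also hiding any output you do want during that call.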


@Frank Thank you, much appreciated

@birty If you find any bugs, unreasonable behavior, or have suggestions while testing, please let me know as soon as possible.


The code is much faster than the C++ version. Thanks for the Python edition.

I would like to add some suggestions.

  1. Drawing all the objects is unnecessary for most use cases. It would be nice if there were an option to choose objects based on the class number in the yolo labels.

  2. Supporting more NN models would definitely offer flexibility.

Since the code is open source, could we contribute back, for example by adding object tracking and similar features?
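The class-filtering idea in point 1 is easy to layer on top of whatever detection list the demo returns. A sketch assuming detections come back as `(class_id, score, box)` tuples (a made-up shape for illustration, not the actual ksnn demo format):

```python
# Hypothetical post-processing: keep only selected yolo/COCO class ids.
# The (class_id, score, box) layout is an assumption, not the demo's format.
detections = [
    (0,  0.91, (10, 10, 50, 80)),   # person
    (2,  0.80, (60, 40, 120, 90)),  # car
    (16, 0.55, (5, 70, 30, 95)),    # dog
]

wanted = {0, 16}  # e.g. only person and dog
filtered = [d for d in detections if d[0] in wanted]
for cls, score, box in filtered:
    print(cls, score, box)
```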


@Vignesh_Raja I will try to do it.

My plan includes making different demos for different platforms.

After we release the official version, we will consider open sourcing it directly, so that more people can participate.


I've been working on getting object tracking functional - just need some time to work on it! That will definitely be a very useful addition!


So does the current version work only for Inception and Yolo models?

Would be really great if a demo could be done for the MobileNet architecture, as it is very efficient for embedded edge devices.

@Akkisony It is just a demo. You can convert your own model.

@Frank for some models the conversion from .tflite seems to work, including the model export, but it fails without an error message when trying to generate the library.

There is a single warning for one tensor for which the variables are all zero.

Is the conversion script hiding library generation messages?

@jdrew You can set the print level: --print-level 1


You can refer to the official v1.0 version.

The print level flag is set to 1 and seems to work, but I am not getting any output files for some models.
Keras-Application Models converted to .tflite seem to work fine.
Using this command:

./convert --model-name $var --platform tflite --model $a --mean-values '127.5,127.5,127.5,127.5' --quantized-dtype asymmetric_affine --kboard VIM3 --print-level 1

Getting this as output in final few lines:

[TRAINER]Quantization complete.
[TRAINER]Quantization complete.
End quantization…
Dump net quantize tensor table to test_model.quantize
Save net to
W ----------------Warning(30)----------------
Done.Quantize success !!!
Start export model …
Done.Export model success !!!

Start generate library…

Afterwards the intermediate files are deleted and no output files are written. There is no error message indicating what fails in "generate library".
The warnings are due to some tensor outputs being zero.