@numbqq Now that we have a fix for the network crash issue, it would be really helpful to have a Debian build capability using Fenix.
Hello @davidharding @RichardPar
Please try this Debian image and give us feedback. Note that it still has some known bugs; if you find anything unexpected, please let us know!
Hi @Electr1 @numbqq
I have installed the Debian 5.15 server image, and on a clean install things appear to work fine.
I have seen no kernel panics, and the npu examples work as expected.
However, after installing various packages, I am again seeing a problem with the npu function.
I get the following error
root@Khadas:/media/nvme/khadas/vim4_npu_applications-master/face_recognition/build# sudo ./face_recognition -M …/data/model/retinaface_int8.adla -m …/data/model/facenet_int8.adla -p 1
adla usr space 1.2.0.5
E NN_SDK:[aml_adla_create_network_common:357]Error: create network fail.
amlnn_init is fail
What additional logs can I source to investigate this further?
To reproduce
(1) Flash the emmc with the image
(2) Install the face_recognition demo
(3) Attempt to run the demo, and it should work successfully
(4) apt install runc
(5) Attempt to run the demo, and it should work successfully
(6) apt install containerd
(7) Attempt to run the demo, and it should now fail with the mentioned error
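The steps above can be sketched as a small script that runs the demo after each package install and reports the first step that fails. This is only a sketch: the helper names `repro_plan` and `run_plan` are made up for illustration, and it assumes the demo terminates by itself and exits with a non-zero status when `amlnn_init` fails, which the output above does not confirm.

```python
import subprocess

# Demo command as used in this thread (run from the demo's build directory).
DEMO = ["sudo", "./face_recognition",
        "-M", "../data/model/retinaface_int8.adla",
        "-m", "../data/model/facenet_int8.adla",
        "-p", "1"]


def repro_plan(packages, demo_cmd):
    """Return the command sequence: demo, then (install, demo) per package."""
    plan = [list(demo_cmd)]
    for pkg in packages:
        plan.append(["apt", "install", "-y", pkg])
        plan.append(list(demo_cmd))
    return plan


def run_plan(plan, runner=subprocess.run):
    """Run each command in order; return the index of the first failure, or None."""
    for index, cmd in enumerate(plan):
        if runner(cmd).returncode != 0:
            return index
    return None


if __name__ == "__main__":
    plan = repro_plan(["runc", "containerd"], DEMO)
    failed = run_plan(plan)
    if failed is None:
        print("all steps passed")
    else:
        print("first failing step:", " ".join(plan[failed]))
```

The runner is passed in as a parameter so the sequence can be checked without touching the system's packages.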
@davidharding Debian images do not come with the same packages as the Ubuntu image, so some NPU-related packages may be missing. This is expected.
We will check this and provide you with the necessary packages.
Hello @davidharding
Can you check the new image? It works on my side. You can use OOWOW to install vim4-debian-11-server-linux-5.15-fenix-1.5.2-231102-develop-test-only online.
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$ cat /etc/fenix-release
# PLEASE DO NOT EDIT THIS FILE
BOARD=VIM4
VENDOR=Amlogic
VERSION=1.5.2
ARCH=arm64
INITRD_ARCH=arm64
IMAGE_VERSION=1.5.2-231102
################ GIT VERSION ################
UBOOT_GIT_VERSION=khadas-vims-u-boot-2019.01-v1.5.2-release-709-g1b24e6d
LINUX_GIT_VERSION=v5.15.78-6346-g33a25ce
FENIX_GIT_VERSION=v1.5.2-132-g203e2c6
#############################################
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$ sudo ./face_recognition -M ../data/model/retinaface_int8.adla -m ../data/model/facenet_int8.adla -p 1
adla usr space 1.2.0.5
adla usr space 1.2.0.5
[ 1134.260636][1 T416 ..] adlak_core clk requirement of 800000000 Hz,and real val is 799999988 Hz.
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$ sudo ./face_recognition -M ../data/model/retinaface_int8.adla -m ../data/model/facenet_int8.adla -p ../data/img/lin_2.jpg
adla usr space 1.2.0.5
adla usr space 1.2.0.5
lin_2.dat
1.000000
lin_1.dat
0.873803
lin_3.dat
0.795178
xu_1.dat
0.457403
xu_3.dat
0.377446
xu_2.dat
0.307754
class:face,label_num:0,prob:0.999055,left:30,top:55,right:128,bot:158
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$
I have tried with the latest image on OOWOW
# PLEASE DO NOT EDIT THIS FILE
BOARD=VIM4
VENDOR=Amlogic
VERSION=1.5.2
ARCH=arm64
INITRD_ARCH=arm64
IMAGE_VERSION=1.5.2-231102
################ GIT VERSION ################
UBOOT_GIT_VERSION=khadas-vims-u-boot-2019.01-v1.5.2-release-709-g1b24e6d
LINUX_GIT_VERSION=v5.15.78-6346-g33a25ce
FENIX_GIT_VERSION=v1.5.2-132-g203e2c6
#############################################
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye
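Since both sides are posting `/etc/fenix-release`, a quick way to confirm the two boards really run the same image is to parse the file into key/value pairs and compare them. A minimal sketch, with hypothetical helper names `parse_release` and `diff_releases`, assuming the `KEY=VALUE` layout shown above:

```python
def parse_release(text):
    """Parse KEY=VALUE lines, skipping comments, banners and blank lines."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value
    return info


def diff_releases(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    with open("/etc/fenix-release") as handle:
        local = parse_release(handle.read())
    for key, value in sorted(local.items()):
        print(f"{key}={value}")
```

An empty result from `diff_releases` would mean the image and git versions match, which points away from an image mismatch as the cause.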
It is definitely the installation of "containerd" that causes the problem.
Before that the example works fine.
I can't track this fault down further myself, as the fault appears to originate from within the libnnsdk.so
This is 100% reproducible
What do you mean by this? Does just installing the containerd package break the NPU? Can you provide the steps to reproduce?
Here are the steps on my side, it works.
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$ sudo apt install containerd
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
runc
Suggested packages:
containernetworking-plugins
Recommended packages:
criu
The following NEW packages will be installed:
containerd runc
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 16.8 MB of archives.
After this operation, 77.9 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://mirrors.tuna.tsinghua.edu.cn/debian bullseye/main arm64 runc arm64 1.0.0~rc93+ds1-5+deb11u2 [2,078 kB]
Get:2 http://mirrors.tuna.tsinghua.edu.cn/debian bullseye/main arm64 containerd arm64 1.4.13~ds1-1~deb11u4 [14.7 MB]
Fetched 16.8 MB in 1s (21.0 MB/s)
Selecting previously unselected package runc.
(Reading database ... 170434 files and directories currently installed.)
Preparing to unpack .../runc_1.0.0~rc93+ds1-5+deb11u2_arm64.deb ...
Unpacking runc (1.0.0~rc93+ds1-5+deb11u2) ...
Selecting previously unselected package containerd.
Preparing to unpack .../containerd_1.4.13~ds1-1~deb11u4_arm64.deb ...
Unpacking containerd (1.4.13~ds1-1~deb11u4) ...
Setting up runc (1.0.0~rc93+ds1-5+deb11u2) ...
Setting up containerd (1.4.13~ds1-1~deb11u4) ...
Created symlink /etc/systemd/system/multi-user.target.wants/containerd.service → /lib/systemd/system/containerd.service.
Processing triggers for man-db (2.9.4-2) ...
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$
khadas@Khadas:~/vim4_npu_applications/face_recognition/build$ sudo ./face_recognition -M ../data/model/retinaface_int8.adla -m ../data/model/facenet_int8.adla -p 1
adla usr space 1.2.0.5
adla usr space 1.2.0.5
[ 59.351966][1 T422 ..] adlak_core clk requirement of 800000000 Hz,and real val is 799999988 Hz.
If I do the same thing as you, I get different results
I have copied below the complete console output, from a first time boot of a fresh install of the debian 5.15 server image via OOWOW
This is still 100% reproducible for me
I get the same error from the steps …
Additionally, it's VERY SLOW to load "mc" (Midnight Commander)
Strange things happening…
I ran gdb with the face_detection and it started working - the failure went away! Something is wonky
MC is still slow though
An strace comparison between a working and a non-working setup
There appears to be a problem creating the "adlau_thread_0" thread
Working
clone(child_stack=0x7fb3993c70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[4225], tls=0x7fb3994a70, child_tidptr=0x7fb3994440) = 4225
sched_setscheduler(4225, SCHED_FIFO, [99]) = 0
Not Working
clone(child_stack=0x7fac275c70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[4156], tls=0x7fac276a70, child_tidptr=0x7fac276440) = 4156
sched_setscheduler(4156, SCHED_FIFO, [99]) = -1 EPERM (Operation not permitted)
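The only difference between the two traces is the result of `sched_setscheduler(..., SCHED_FIFO, [99])`, so it may help to test that call on its own, outside the NPU library. The sketch below (Linux only; `probe_fifo` is a hypothetical name) asks for SCHED_FIFO at priority 99 for the calling process and reports the errno if the kernel refuses. It restores the previous policy afterwards, and it does not explain why the request is refused.

```python
import errno
import os


def probe_fifo(priority=99):
    """Try to switch the calling process to SCHED_FIFO, then restore it.

    Returns (True, None) on success or (False, errno_name) on failure.
    """
    old_policy = os.sched_getscheduler(0)
    old_param = os.sched_getparam(0)
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    # Put the original policy back so the probe has no lasting effect.
    os.sched_setscheduler(0, old_policy, old_param)
    return True, None


if __name__ == "__main__":
    ok, err = probe_fifo()
    print("SCHED_FIFO allowed" if ok else f"SCHED_FIFO refused: {err}")
```

Running it as root before and after installing containerd, from the same shell that starts the demo, would show whether the EPERM follows the package install or something specific to the demo.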
I am running the LTP test (Linux Test Project)
Most of the scheduler tests fail
root@Khadas:/opt/ltp/results# cat LTP_RUN_ON-2023_11_09-22h_22m_48s.log |grep FAIL
bpf_prog06 FAIL 33
epoll_pwait03 FAIL 1
ioctl_loop01 FAIL 1
ioctl_loop02 FAIL 1
fanotify10 FAIL 36
openat04 FAIL 1
sched_rr_get_interval01 FAIL 1
sched_rr_get_interval02 FAIL 1
sched_rr_get_interval03 FAIL 1
sched_setparam02 FAIL 1
sched_setparam03 FAIL 2
sched_getscheduler01 FAIL 1
semctl09 FAIL 1
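To separate the scheduler failures from the rest of the LTP noise, the grep output above can be filtered further. A minimal sketch, assuming the three-column `name FAIL code` layout shown here; `failed_tests` and `scheduler_failures` are hypothetical helper names:

```python
def failed_tests(log_text):
    """Return {test_name: exit_code} for every 'name FAIL code' line."""
    failures = {}
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "FAIL":
            failures[fields[0]] = int(fields[2])
    return failures


def scheduler_failures(failures):
    """Return the sorted names of failed tests that start with 'sched_'."""
    return sorted(name for name in failures if name.startswith("sched_"))


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as handle:
        for name in scheduler_failures(failed_tests(handle.read())):
            print(name)
```

On the list above this would pick out the six `sched_*` entries, which are the ones most likely to be related to the `sched_setscheduler` EPERM.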
Hello @RichardPar @davidharding
Can you try upgrading the kernel and check whether this issue still exists?
$ wget https://dl.khadas.com/.test/vim4/5.15/linux-dtb-amlogic-5.15_1.5.2_arm64.deb
$ wget https://dl.khadas.com/.test/vim4/5.15/linux-image-amlogic-5.15_1.5.2_arm64.deb
$ sudo dpkg -i linux-dtb-amlogic-5.15_1.5.2_arm64.deb linux-image-amlogic-5.15_1.5.2_arm64.deb
$ sync
$ sudo reboot
After reboot, please check again.
Hi @numbqq ,
I'm seeing mixed results, but things have definitely improved.
The npu examples can now run successfully, but on repeated attempts I am getting device resets
Please see the logs below
Thanks
Dave
Can you try this new kernel?
$ wget https://dl.khadas.com/.test/vim4/5.15/1/linux-dtb-amlogic-5.15_1.5.2_arm64.deb
$ wget https://dl.khadas.com/.test/vim4/5.15/1/linux-image-amlogic-5.15_1.5.2_arm64.deb
$ sudo dpkg -i linux-dtb-amlogic-5.15_1.5.2_arm64.deb linux-image-amlogic-5.15_1.5.2_arm64.deb
$ sync
$ sudo reboot
The second set of patches appears to be much more stable.
I have run several thousand iterations of my test script, and I haven't seen a failure or a device reset.
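The test script itself was not posted. For anyone who wants to repeat this kind of soak test, a loop along the following lines would do. This is a sketch under assumptions: `soak` is a hypothetical name, and it assumes the command under test terminates by itself and signals failure through a non-zero exit status.

```python
import subprocess
import sys


def soak(cmd, iterations, timeout=60):
    """Run cmd repeatedly; return the iteration numbers that failed.

    A run counts as failed if it exits non-zero or exceeds the timeout.
    """
    failures = []
    for i in range(iterations):
        try:
            result = subprocess.run(cmd, capture_output=True, timeout=timeout)
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            ok = False
        if not ok:
            failures.append(i)
    return failures


if __name__ == "__main__":
    # Usage: python3 soak.py 1000 ./face_recognition -M ... -m ... -p ...
    count = int(sys.argv[1])
    failed = soak(sys.argv[2:], count)
    print(f"{len(failed)} of {count} runs failed")
```

A device reset also kills the script, so this only counts process-level failures; resets still have to be spotted from the serial console or the uptime.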