RKNN problem on Edge2 device

Which system do you use? Android, Ubuntu, OOWOW or others?

Linux Khadas 6.1.43 #1.6.8 SMP PREEMPT

Which version of system do you use? Please provide the version of the system here:

We’re having a problem on the Edge2 device. The NPU doesn’t respond properly to inference requests.

Please describe your issue below:

We’re running YOLOv8 models on this device. We can’t get the inference results.

Post a console log of your issue below:


This is what we always see in journalctl:
Jun 20 14:35:42 Khadas kernel: RKNPU: job: 000000008fd823ec, wait_count: 1, continue wait: 0, commit elapse time: 6099966us, wait time: 6099968us, timeout: 6000000us
Jun 20 14:35:42 Khadas kernel: RKNPU: job: 000000005eea273c, wait_count: 1, continue wait: 0, commit elapse time: 6100189us, wait time: 6100189us, timeout: 6000000us
Jun 20 14:35:42 Khadas kernel: RKNPU: failed to wait job, task counter: 0, flags: 0x1, ret = 0, elapsed time: 6100197us
Jun 20 14:35:42 Khadas kernel: RKNPU: failed to wait job, task counter: 0, flags: 0x1, ret = 0, elapsed time: 6099993us
Jun 20 14:35:43 Khadas kernel: RKNPU: job timeout, flags: 0x0:
Jun 20 14:35:43 Khadas kernel: RKNPU: job timeout, flags: 0x0:
Jun 20 14:35:43 Khadas kernel: RKNPU:         core 1 irq status: 0x100, raw status: 0x40000100, require mask: 0x300, task counter: 0x0, elapsed time: 6206854us
Jun 20 14:35:43 Khadas kernel: RKNPU:         core 2 irq status: 0x100, raw status: 0x40000100, require mask: 0x300, task counter: 0x0, elapsed time: 6206643us
Jun 20 14:35:43 Khadas kernel: RKNPU: soft reset
Jun 20 14:35:49 Khadas kernel: RKNPU: job: 000000005eea273c, wait_count: 1, continue wait: 0, commit elapse time: 6138521us, wait time: 6138522us, timeout: 6000000us
Jun 20 14:35:49 Khadas kernel: RKNPU: failed to wait job, task counter: 0, flags: 0x1, ret = 0, elapsed time: 6138542us
Jun 20 14:35:49 Khadas kernel: RKNPU: job: 000000004585f674, wait_count: 1, continue wait: 0, commit elapse time: 6104642us, wait time: 6104644us, timeout: 6000000us
Jun 20 14:35:49 Khadas kernel: RKNPU: failed to wait job, task counter: 0, flags: 0x1, ret = 0, elapsed time: 6104655us
Jun 20 14:35:49 Khadas kernel: RKNPU: job timeout, flags: 0x0:
Jun 20 14:35:49 Khadas kernel: RKNPU: job timeout, flags: 0x0:
Jun 20 14:35:49 Khadas kernel: RKNPU:         core 2 irq status: 0x100, raw status: 0x40000100, require mask: 0x300, task counter: 0x0, elapsed time: 6245190us
Jun 20 14:35:49 Khadas kernel: RKNPU:         core 1 irq status: 0x100, raw status: 0x40000100, require mask: 0x300, task counter: 0x0, elapsed time: 6211260us
Jun 20 14:35:49 Khadas kernel: RKNPU: soft reset

We also see this in journalctl:

Jun 20 14:16:22 Khadas kernel: ------------[ cut here ]------------
Jun 20 14:16:22 Khadas kernel: WARNING: CPU: 0 PID: 0 at drivers/iommu/rockchip-iommu.c:780 rk_iommu_irq+0x50/0xb0
Jun 20 14:16:22 Khadas kernel: Modules linked in: rfcomm xt_conntrack algif_hash algif_skcipher af_alg bnep nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_>
Jun 20 14:16:22 Khadas kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.43 #1.6.8
Jun 20 14:16:22 Khadas kernel: Hardware name: Khadas Edge2 (DT)
Jun 20 14:16:22 Khadas kernel: pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Jun 20 14:16:22 Khadas kernel: pc : rk_iommu_irq+0x50/0xb0
Jun 20 14:16:22 Khadas kernel: lr : rk_iommu_irq+0x24/0xb0
Jun 20 14:16:22 Khadas kernel: sp : ffffffc008003d10
Jun 20 14:16:22 Khadas kernel: x29: ffffffc008003d10 x28: ffffffc009af6000 x27: 0000000000000101
Jun 20 14:16:22 Khadas kernel: x26: ffffffc0080d98dc x25: ffffffc00a07ce60 x24: ffffffc00a07ce20
Jun 20 14:16:22 Khadas kernel: x23: 0000000000000000 x22: 0000000000000001 x21: ffffff81014d4400
Jun 20 14:16:22 Khadas kernel: x20: 0000000000000025 x19: ffffff81014d4080 x18: 0000000000000000
Jun 20 14:16:22 Khadas kernel: x17: ffffffc3f3f89000 x16: ffffffc008000000 x15: 0000000000000000
Jun 20 14:16:22 Khadas kernel: x14: ffffffc009b001c0 x13: ffffffc3f3f89000 x12: 000000003464591d
Jun 20 14:16:22 Khadas kernel: x11: ffffffc009e98d28 x10: 0000000000000025 x9 : ffffffc0080cbe40
Jun 20 14:16:22 Khadas kernel: x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffffff8100400248
Jun 20 14:16:22 Khadas kernel: x5 : ffffff81014d4400 x4 : 0000000000000000 x3 : 0000000000000000
Jun 20 14:16:22 Khadas kernel: x2 : ffffffc009b001c0 x1 : 0000000000000101 x0 : 0000000000000000
Jun 20 14:16:22 Khadas kernel: Call trace:
Jun 20 14:16:22 Khadas kernel:  rk_iommu_irq+0x50/0xb0
Jun 20 14:16:22 Khadas kernel:  __handle_irq_event_percpu+0xd4/0x1f8
Jun 20 14:16:22 Khadas kernel:  handle_irq_event_percpu+0x1c/0x4c
Jun 20 14:16:22 Khadas kernel:  handle_irq_event+0x4c/0x8c
Jun 20 14:16:22 Khadas kernel:  try_one_irq+0xbc/0xe0
Jun 20 14:16:22 Khadas kernel:  poll_spurious_irqs+0xf4/0x11c
Jun 20 14:16:22 Khadas kernel:  call_timer_fn+0x8c/0x130
Jun 20 14:16:22 Khadas kernel:  __run_timers+0xac/0x1fc
Jun 20 14:16:22 Khadas kernel:  run_timer_softirq+0x34/0x58
Jun 20 14:16:22 Khadas kernel:  __do_softirq+0x26c/0x320
Jun 20 14:16:22 Khadas kernel:  ____do_softirq+0x14/0x1c
Jun 20 14:16:22 Khadas kernel:  call_on_irq_stack+0x24/0x34
Jun 20 14:16:22 Khadas kernel:  do_softirq_own_stack+0x20/0x28
Jun 20 14:16:22 Khadas kernel:  __irq_exit_rcu+0x90/0xec
Jun 20 14:16:22 Khadas kernel:  irq_exit_rcu+0x14/0x1c
Jun 20 14:16:22 Khadas kernel:  el1_interrupt+0x90/0xd4
Jun 20 14:16:22 Khadas kernel:  el1h_64_irq_handler+0x14/0x1c
Jun 20 14:16:22 Khadas kernel:  el1h_64_irq+0x74/0x78
Jun 20 14:16:22 Khadas kernel:  arch_local_irq_enable+0x8/0x20
Jun 20 14:16:22 Khadas kernel:  cpuidle_enter+0x3c/0x50
Jun 20 14:16:22 Khadas kernel:  do_idle+0x22c/0x248
Jun 20 14:16:22 Khadas kernel:  cpu_startup_entry+0x28/0x2c
Jun 20 14:16:22 Khadas kernel:  kernel_init+0x0/0x12c
Jun 20 14:16:22 Khadas kernel:  arch_post_acpi_subsys_init+0x0/0x18
Jun 20 14:16:22 Khadas kernel:  start_kernel+0x688/0x6c4
Jun 20 14:16:22 Khadas kernel:  __primary_switched+0xb4/0xbc
Jun 20 14:16:22 Khadas kernel:

This is what we see on the application side:

E RKNN: [14:42:32.667] failed to submit!, op id: 1, op name: Conv:model_3/tf.math.add/Add;model_3/tf.nn.convolution_4/convolution;model_3/tf.nn.convolution/convolution;model_3/tf.math.add/Add/y, flags: 0x1, task start: 0, task number: 61, run task counter: 0, int status: 0, please try updating to the latest version of the toolkit2 and runtime from: https://console.zbox.filez.com/l/I00fc3 (PWD: rknn)
E RKNN: [14:42:32.774] failed to submit!, op id: 1, op name: Conv:model_3/tf.math.add/Add;model_3/tf.nn.convolution_4/convolution;model_3/tf.nn.convolution/convolution;model_3/tf.math.add/Add/y, flags: 0x1, task start: 0, task number: 61, run task counter: 0, int status: 0, please try updating to the latest version of the toolkit2 and runtime from: https://console.zbox.filez.com/l/I00fc3 (PWD: rknn)
E RKNN: [14:42:39.067] failed to submit!, op id: 1, op name: Conv:model_3/tf.math.add/Add;model_3/tf.nn.convolution_4/convolution;model_3/tf.nn.convolution/convolution;model_3/tf.math.add/Add/y, flags: 0x1, task start: 0, task number: 61, run task counter: 0, int status: 0, please try updating to the latest version of the toolkit2 and runtime from: https://console.zbox.filez.com/l/I00fc3 (PWD: rknn)
E RKNN: [14:42:39.174] failed to submit!, op id: 1, op name: Conv:model_3/tf.math.add/Add;model_3/tf.nn.convolution_4/convolution;model_3/tf.nn.convolution/convolution;model_3/tf.math.add/Add/y, flags: 0x1, task start: 0, task number: 61, run task counter: 0, int status: 0, please try updating to the latest version of the toolkit2 and runtime from: https://console.zbox.filez.com/l/I00fc3 (PWD: rknn)

@burak.cakmak The 6.1 kernel is still in the debug stage. If you want a stable environment, please use the 5.10 kernel image.

Also, please share more details of the code that triggered this issue; that will help us fix it in the 6.1 kernel.
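A minimal sketch of how the environment details requested here could be gathered in one go. The debugfs path for the RKNPU driver version is an assumption (it may be absent or require root on a given image), and the helper names are hypothetical:

```python
import platform
from pathlib import Path

# Assumed location of the RKNPU driver version string; it may not exist
# or may need root, in which case the report simply contains None.
RKNPU_VERSION_PATH = Path("/sys/kernel/debug/rknpu/version")


def read_optional(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect_report():
    """Collect the kernel details that `uname -a` shows, plus the driver version."""
    info = platform.uname()
    return {
        "kernel_release": info.release,
        "kernel_version": info.version,
        "machine": info.machine,
        "rknpu_driver": read_optional(RKNPU_VERSION_PATH),
    }


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

Pasting this output together with the model size and the inference script would give the maintainers most of what they asked for.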

Hello @burak.cakmak

Sorry, I can’t reproduce your issue.

I used our official image: kernel version 6.1.43, Ubuntu 24.04.

I followed the document “YOLOv8n OpenCV Edge2 Demo” [Khadas Docs].

It runs successfully. Could you share more details?


Hello @Jacobe @Electr1

RKNPU inference fails when using an Ubuntu Noble 1.6.8 GNOME image with RKNNRT versions 1.6.0 and 2.0.0b0 and rknn-toolkit-lite2 versions 1.6.0 and 2.0.0b0.

This is the uname -a output:

Linux Khadas 6.1.43 #1.6.8 SMP PREEMPT Wed Jun 5 08:56:49 CST 2024 aarch64 aarch64 aarch64 GNU/Linux

Tested using:

librknnrt.so version 1.6.0, rknn-toolkit-lite2 v1.6.0 with a model converted using rknn-toolkit2 version 1.6.0 and version 2.0.0b0

librknnrt.so version 2.0.0b0, rknn-toolkit-lite2 v1.6.0 with a model converted using rknn-toolkit2 version 1.6.0 and version 2.0.0b0

librknnrt.so version 2.0.0b0, rknn-toolkit-lite2 v2.0.0b0 with a model converted using rknn-toolkit2 version 1.6.0 and version 2.0.0b0

librknnrt.so version 2.0.0b16, rknn-toolkit-lite2 v2.0.0b0 with a model converted using rknn-toolkit2 version 1.6.0 and version 2.0.0b0

@burak.cakmak I will check. Can you share your model? What size is the YOLOv8 model you are using?

Edit:

FYI: the RKNPU driver has a 6-second processing limit per operator, so these timeouts mean that some step in your model is taking far too long to run.

Most likely you are using a much larger YOLOv8 variant, or you haven’t removed the post-processing step from the end of the model.

For optimal performance, we recommend using the YOLOv8n/s model sizes and removing the post-processing steps as described in the docs:

https://docs.khadas.com/products/sbc/edge2/npu/demos/yolov8n#convert
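The 6-second limit can be checked against the kernel log directly, since each "RKNPU: job:" line reports both the wait time and the timeout. The following is a rough sketch that scans journalctl text for jobs that overran; the regular expression is based only on the log lines quoted in this thread, and the function name is hypothetical:

```python
import re

# Matches the "RKNPU: job:" lines as they appear in the log quoted in this thread.
JOB_LINE = re.compile(
    r"RKNPU: job: (?P<job>[0-9a-f]+), .*?"
    r"wait time: (?P<wait>\d+)us, timeout: (?P<timeout>\d+)us"
)


def find_overruns(log_text):
    """Return (job id, wait time in us, timeout in us) for jobs that hit the limit."""
    overruns = []
    for line in log_text.splitlines():
        match = JOB_LINE.search(line)
        if match is None:
            continue  # not a job line, e.g. "soft reset"
        wait_us = int(match.group("wait"))
        timeout_us = int(match.group("timeout"))
        if wait_us >= timeout_us:
            overruns.append((match.group("job"), wait_us, timeout_us))
    return overruns


# Two lines copied from the journalctl output posted above.
SAMPLE = """\
Jun 20 14:35:42 Khadas kernel: RKNPU: job: 000000008fd823ec, wait_count: 1, continue wait: 0, commit elapse time: 6099966us, wait time: 6099968us, timeout: 6000000us
Jun 20 14:35:43 Khadas kernel: RKNPU: soft reset
"""

if __name__ == "__main__":
    for job, wait_us, timeout_us in find_overruns(SAMPLE):
        print(f"job {job}: waited {wait_us / 1e6:.3f}s, limit {timeout_us / 1e6:.1f}s")
```

In the posted log every job waits just over 6,000,000 us, which is consistent with the driver giving up at the limit rather than the model finishing slowly.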

Hello @burak.cakmak ,

The problem originates from the 6.1 kernel. We will fix it as soon as possible. For now, we suggest using the 5.10 kernel.