Hi everyone,
I am currently profiling the new VIM4 using YOLOv8s via the KSNN API.
While monitoring NPU utilization through the /sys/class/adla/adla0/device/debug/utilization node, I’ve encountered an inconsistent data pattern. Even during continuous inference, the utilization values fluctuate like this: 34% -> 0% -> 0% -> 0% -> 34% ...
It appears as though the NPU only reports activity in short bursts followed by several “0%” cycles, even though the workload should be steady.
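In case it helps anyone reproduce this, here is a minimal sketch of how the node can be polled and summarized. It assumes the node prints a single number (with or without a trailing "%"), which I have not confirmed against the driver source; the helper names are just for illustration. Polling faster than the 300ms period should show whether the value only changes once per period.

```python
import re
import time
from pathlib import Path

# Path taken from the post above; the output format is an assumption.
NODE = Path("/sys/class/adla/adla0/device/debug/utilization")


def parse_utilization(text):
    """Return the first number found in the node's output, or None."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


def sample(node=NODE, interval_s=0.05, count=100):
    """Read the node `count` times, sleeping `interval_s` between reads."""
    readings = []
    for _ in range(count):
        value = parse_utilization(node.read_text())
        if value is not None:
            readings.append(value)
        time.sleep(interval_s)
    return readings


def summarize(readings):
    """Mean, max, and the fraction of samples that were exactly zero."""
    if not readings:
        return {"mean": 0.0, "max": 0.0, "zero_fraction": 0.0}
    zeros = sum(1 for r in readings if r == 0)
    return {
        "mean": sum(readings) / len(readings),
        "max": max(readings),
        "zero_fraction": zeros / len(readings),
    }


# On the board, while inference is running:
# print(summarize(sample()))
```

For the pattern I described (34, 0, 0, 0), this gives a mean of 8.5% with 75% of the samples at zero.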
I have two main questions:
- Why are there so many 0% values? Is this a sampling/aliasing issue caused by the driver’s 300ms dpm_period, or is the NPU hardware actually entering an idle/sleep state between frames due to software stack overhead?
- How can I improve/saturate NPU utilization? My FPS for yolov8s is around 26, but the NPU seems to be idling frequently. Are there specific driver tweaks, batching methods, or multi-threading strategies for the new VIM4 to keep the NPU more consistently active?
For context, I have checked /proc/interrupts and the interrupt counts are increasing steadily, which confirms the hardware is processing tasks. However, the utilization reporting remains fragmented.
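This is roughly how the interrupt check can be scripted, as a sketch only. The IRQ name to match ("adla" here) is my assumption and may differ on your image, and the function names are hypothetical. The idea is to sum the per-CPU counters for the matching line and compare the per-second delta with the FPS.

```python
import time
from pathlib import Path


def irq_total(interrupts_text, name):
    """Sum the per-CPU counters of every /proc/interrupts line containing `name`."""
    total = 0
    for line in interrupts_text.splitlines():
        if name not in line or ":" not in line:
            continue
        _, _, rest = line.partition(":")
        for token in rest.split():
            if not token.isdigit():
                break  # counters end where the controller name begins
            total += int(token)
    return total


def irq_rate(name="adla", window_s=1.0, path="/proc/interrupts"):
    """Interrupts per second for `name`, measured over `window_s` seconds."""
    before = irq_total(Path(path).read_text(), name)
    time.sleep(window_s)
    after = irq_total(Path(path).read_text(), name)
    return (after - before) / window_s


# On the board, while inference is running:
# print(irq_rate("adla"))
```

If the rate stays steady while the utilization node alternates between 34% and 0%, that would point toward a reporting issue rather than the NPU actually idling.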
Any advice from the community or the Khadas team would be very helpful!