VIM 3 Pro: RAM management issues with periodic NPU performance drops (every 1k iterations)

Which Khadas SBC do you use?

VIM 3 Pro

Which system do you use? Android, Ubuntu, OOWOW or others?


Which version of system do you use? Khadas official images, self built images, or others?

Khadas official


Please describe your issue below:

#Experimental setup:
I tested a neural network for object detection. The minimal code (Python + OpenCV) reads a single image once, then loops over that same image to run inference/detection 5k times using KSNN.

There are no other substantial processes running during the experiment.
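For anyone wanting to reproduce the measurements, here is a minimal sketch of such a loop. It is not the original test code: `run_inference` is a hypothetical placeholder for whatever wraps the KSNN inference call, and `read_rss_kib` assumes a Linux system where `/proc/self/status` is available.

```python
import time


def read_rss_kib():
    """Return this process's resident set size in KiB, or None if unavailable.

    Reads VmRSS from /proc/self/status, so it only works on Linux.
    """
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def benchmark(run_inference, image, iterations=5000):
    """Call run_inference(image) repeatedly and record latency and RSS.

    run_inference is a hypothetical callable standing in for the real
    inference call; image is whatever input it expects.
    Returns a list of (iteration, latency_ms, rss_kib) tuples.
    """
    samples = []
    for i in range(iterations):
        start = time.perf_counter()
        run_inference(image)
        latency_ms = (time.perf_counter() - start) * 1000.0
        samples.append((i, latency_ms, read_rss_kib()))
    return samples
```

In the actual experiment, `image` would be the frame loaded once with `cv2.imread` before the loop, so any memory growth cannot come from repeated image reads.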

#The problems:
1) Every 1000 iterations of object detection (on the same image), there are sudden drops in NPU performance (latency peaks).

2) RAM usage grows substantially over time (around 50-70%), which looks like memory bloat or a leak, and only partially drops every 1000 iterations, even though the same test image is processed and no additional imreads or writes are involved.

The NPU latency peaks (1) and the partial RAM clearance (2) also appear to be in sync.
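One way to check this synchronization from logged data, rather than by eye, is sketched below. The thresholds (`factor`, `min_drop_kib`) are arbitrary assumptions, and the series at the end is synthetic, made up only to show the functions working. It is not measured data.

```python
from statistics import median


def find_spikes(latencies_ms, factor=3.0):
    """Return indices whose latency exceeds factor times the median latency."""
    threshold = factor * median(latencies_ms)
    return [i for i, value in enumerate(latencies_ms) if value > threshold]


def find_rss_drops(rss_kib, min_drop_kib=1024):
    """Return indices where RSS fell by at least min_drop_kib since the previous sample."""
    return [
        i for i in range(1, len(rss_kib))
        if rss_kib[i - 1] - rss_kib[i] >= min_drop_kib
    ]


def gaps(indices):
    """Return the iteration distance between consecutive indices."""
    return [b - a for a, b in zip(indices, indices[1:])]


# Synthetic example: flat 10 ms latency with a spike every 1000th call,
# and RSS that climbs 10 KiB per call but loses 5000 KiB at each spike.
latencies = [10.0] * 3000
rss = []
current = 100_000
for i in range(3000):
    current += 10
    if i in (999, 1999, 2999):
        latencies[i] = 80.0
        current -= 5000
    rss.append(current)

spikes = find_spikes(latencies)
drops = find_rss_drops(rss)
print(spikes)           # [999, 1999, 2999]
print(gaps(spikes))     # [1000, 1000]
print(spikes == drops)  # True
```

If the spike indices and the RSS drop indices match on real logs, that supports the observation that both effects come from the same periodic event.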

As KSNN is somewhat of a black box, any help regarding this will be much appreciated: What is going on? How can I fix it? :slight_smile:

Thank you.

Post a console log of your issue below:

@justbob KSNN clears its cached data after a certain number of runs, which is likely the cause.

@Frank , thank you for the prompt feedback.

While I fully understand the mechanism behind (i) the periodic release, (ii) the amount of RAM consumed is relatively high for NPU-based inference.

On the other hand, the amount of memory cleared seems insufficient: (iii) some memory is lost in the process.

(i) is reasonable, I agree. (ii) and (iii) are problematic.

Thank you for your kind help.

@justbob Thank you for your feedback. We will try to improve it in the future.

Thanks again @Frank for the feedback.

In particular, apart from the cyclical memory clearance, there seem to be occasional memory leaks in the inference calls (see the memory profiling below).
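To narrow down where such a leak lives, one option is to measure Python-level allocations separately with the standard `tracemalloc` module. This is only a sketch: `run_inference` is again a hypothetical placeholder for the real inference call. Note that `tracemalloc` only sees memory allocated through Python's allocator, so if process RSS keeps growing while this number stays flat, the growth is more likely inside native code than in the Python wrapper.

```python
import tracemalloc


def python_heap_growth(run_inference, image, iterations=1000):
    """Return net bytes of Python-level allocations retained across the loop.

    run_inference is a hypothetical callable standing in for the real
    inference call. Allocations made inside native libraries are not traced.
    """
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            run_inference(image)
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

A result close to zero combined with rising RSS would point away from the Python side. A large positive result would suggest objects being kept alive by the Python code itself.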

Do you guys plan to address it within a known timeline? Alternatively, could you share the KSNN code, so I can try to contribute and help with that?


@justbob There is no development plan for the time being; we will open-source it at the right time. We provide the Python API mainly as an interface for ordinary users to try out the NPU. If you have high efficiency requirements, we recommend using the C interface.

Thanks again @Frank for the follow-up.

Is there a C API or any documentation I could start from regarding the execution providers/delegates to interact with?



@Frank Thank you for your kind help.