VIM3 tflite with Delegate Segfault

Which Khadas SBC do you use?

VIM3

Which system do you use? Android, Ubuntu, OOWOW or others?

Debian 11 Server Fenix

Which version of system do you use? Khadas official images, self built images, or others?

Fenix

Please describe your issue below:

TFLite crashes with a segmentation fault on init when using the NPU through the VX delegate:

khadas@Khadas:~$ python3 test.py aa
Vx delegate: allowed_cache_mode set to 0.
Vx delegate: device num set to 0.
Vx delegate: allowed_builtin_code set to 0.
Vx delegate: error_during_init set to 0.
Vx delegate: error_during_prepare set to 0.
Vx delegate: error_during_invoke set to 0.
========== INPUT DETAILS ========
[{'name': 'input_1', 'index': 13, 'shape': array([ 1, 28, 28,  1], dtype=int32), 'shape_signature': array([ 1, 28, 28,  1], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.007843137718737125, 128), 'quantization_parameters': {'scales': array([0.00784314], dtype=float32), 'zero_points': array([128], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

====== OUTPUT DETAILS ==========
[{'name': 'dense/Softmax', 'index': 11, 'shape': array([ 1, 10], dtype=int32), 'shape_signature': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.00390625, 0), 'quantization_parameters': {'scales': array([0.00390625], dtype=float32), 'zero_points': array([0], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[     1] HAL user version: 6.4.6.345497
[     2] HAL kernel version: 6.4.8.415784
E [/Source/wksp/tflite-vx-delegate/build/_deps/tim-vx-src/src/tim/vx/tensor.cc:Init:341]Create tensor fail!
E [/Source/wksp/tflite-vx-delegate/build/_deps/tim-vx-src/src/tim/vx/tensor.cc:Init:341]Create tensor fail!
E [/Source/wksp/tflite-vx-delegate/build/_deps/tim-vx-src/src/tim/vx/tensor.cc:Init:341]Create tensor fail!
E [/Source/wksp/tflite-vx-delegate/build/_deps/tim-vx-src/src/tim/vx/tensor.cc:Init:341]Create tensor fail!
E [/Source/wksp/tflite-vx-delegate/build/_deps/tim-vx-src/src/tim/vx/tensor.cc:Init:341]Create tensor fail!
Segmentation fault
khadas@Khadas:~$

I have a simple Python app… that just loads the delegate…

import numpy as np
import tflite_runtime.interpreter as tflite

# Load TFLite model and allocate tensors.
# (If you are using the complete tensorflow package, load_delegate is available as tf.lite.experimental.load_delegate.)
delegate = tflite.load_delegate(library="libvx_delegate.so", options={"logging-severity": "debug"})
# Operations supported by the VX delegate are offloaded to the NPU.
interpreter = tflite.Interpreter(model_path="mock_model.tflite",
                                 experimental_delegates=[delegate])
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
print("========== INPUT DETAILS ========")
print(input_details)

print()
print("====== OUTPUT DETAILS ==========")
output_details = interpreter.get_output_details()
print(output_details)

# Test model on random input data.
input_shape = input_details[0]['shape']
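# random_sample() yields floats in [0, 1), which all truncate to 0 as uint8, so draw uint8 values directly.
input_data = np.random.randint(0, 256, size=tuple(input_shape), dtype=np.uint8)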
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()

# Print out result
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
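Since the crash is a segfault in native code, a Python try/except around the interpreter calls cannot catch it. A minimal sketch of one way to detect it from the outside is to run the script in a child process and inspect the return code. The helper name `run_and_classify` is hypothetical, and the demo child below simply sends SIGSEGV to itself as a stand-in for the real script.

```python
import signal
import subprocess
import sys


def run_and_classify(argv, timeout=120):
    """Run argv in a child process and describe how it ended."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if proc.returncode < 0:
        # On POSIX, a negative return code is the number of the terminating signal.
        return "killed by " + signal.Signals(-proc.returncode).name, proc
    if proc.returncode != 0:
        return "exited with status %d" % proc.returncode, proc
    return "ok", proc


# Stand-in child that segfaults on purpose (not the real test.py).
crashing_child = [sys.executable, "-c",
                  "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
status, _ = run_and_classify(crashing_child)
print(status)
```

To apply this to the script above, something like `run_and_classify([sys.executable, "test.py"])` would report whether the delegate run died from a signal, and `proc.stderr` would hold the "Create tensor fail!" lines for comparison against a CPU-only run.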

Galcore version 6.4.8.7.1.1.1

Is there a known model to try out? … another simple test app?

Regards,
Richard

I have managed to get it working by using an older GALCORE driver.

Kernel version
Linux Khadas 4.9.241 #2 SMP PREEMPT Fri Mar 24 17:24:31 GMT 2023 aarch64 GNU/Linux

Not working (this is the version bundled with the Debian 11 Fenix image):
Galcore version 6.4.8.7.1.1.1

Working: after I rmmod the bundled module and insmod the older .ko from the Verisilicon git repo for the A311D, the NPU works:
Galcore version 6.4.6.2
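The failing log reports HAL user version 6.4.6.x against HAL kernel version 6.4.8.x, and the driver that works is Galcore 6.4.6.2. One possible reading, which this thread does not confirm, is that the user-space libraries expect a kernel driver from the same 6.4.6 release. A small sketch of that comparison, assuming the version strings look like the ones pasted here (the helper names and the choice of comparing the first three components are my own assumptions):

```python
import re


def parse_version(text, label):
    """Return the dotted version that follows `label` as a tuple of ints, or None."""
    match = re.search(re.escape(label) + r":?\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def driver_matches_userspace(galcore_text, hal_text, depth=3):
    """Compare the first `depth` components of the galcore and user-space HAL versions."""
    galcore = parse_version(galcore_text, "Galcore version")
    user = parse_version(hal_text, "HAL user version")
    if galcore is None or user is None:
        return None  # one of the version strings was not found
    return galcore[:depth] == user[:depth]


# Strings copied from the logs in this post.
hal_line = "[     1] HAL user version: 6.4.6.345497"
for galcore_line in ("Galcore version 6.4.8.7.1.1.1", "Galcore version 6.4.6.2"):
    print(galcore_line, "->", driver_matches_userspace(galcore_line, hal_line))
```

With the values from this post, the bundled 6.4.8.7.1.1.1 module does not match the 6.4.6 user-space version and the older 6.4.6.2 module does.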

khadas@Khadas:~/MasterTest$ python3 test.py 
Vx delegate: allowed_cache_mode set to 0.
Vx delegate: device num set to 0.
Vx delegate: allowed_builtin_code set to 0.
Vx delegate: error_during_init set to 0.
Vx delegate: error_during_prepare set to 0.
Vx delegate: error_during_invoke set to 0.
<tflite_runtime.interpreter.Delegate object at 0x7fb243ffd0>
========== INPUT DETAILS ========
[{'name': 'serving_default_input_1:0', 'index': 0, 'shape': array([  1, 256, 256,   3], dtype=int32), 'shape_signature': array([  1, 256, 256,   3], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.003921568859368563, 0), 'quantization_parameters': {'scales': array([0.00392157], dtype=float32), 'zero_points': array([0], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

====== OUTPUT DETAILS ==========
[{'name': 'StatefulPartitionedCall:0', 'index': 408, 'shape': array([   1, 4032,    6], dtype=int32), 'shape_signature': array([   1, 4032,    6], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.003994452301412821, 0), 'quantization_parameters': {'scales': array([0.00399445], dtype=float32), 'zero_points': array([0], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
W [HandleLayoutInfer:281]Op 162: default layout inference pass.
W [HandleLayoutInfer:281]Op 162: default layout inference pass.
W [HandleLayoutInfer:281]Op 162: default layout inference pass.
W [HandleLayoutInfer:281]Op 162: default layout inference pass.
W [HandleLayoutInfer:281]Op 162: default layout inference pass.
W [HandleLayoutInfer:281]Op 162: default layout inference pass.
[[[  8   8  18  18   0 249]
  [  8   9  18  22   0 249]
  [  7   9  18  22   0 249]
  ...
  [227 229  48  44   0 249]
  [224 221  54  58   0 249]
  [224 224 129 112   0 249]]]
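The printed tensor holds raw uint8 values. To read them as real numbers they can be dequantized with the scale and zero point from the output details, using the usual TFLite relation real = scale * (q - zero_point). A minimal sketch in plain Python, with the constants copied from the log above and a hypothetical helper name:

```python
def dequantize(values, scale, zero_point):
    """Map quantized integers back to real values: real = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in values]


# Copied from the 'quantization' field of the output details above.
OUTPUT_SCALE = 0.003994452301412821
OUTPUT_ZERO_POINT = 0

first_row = [8, 8, 18, 18, 0, 249]  # first row of the printed output tensor
print(dequantize(first_row, OUTPUT_SCALE, OUTPUT_ZERO_POINT))
```

With NumPy the same step on the whole tensor would be `OUTPUT_SCALE * (output_data.astype(np.float32) - OUTPUT_ZERO_POINT)`.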