Error in operation-wise usage times

I was trying to generate operation-wise usage times by following the steps in Section 4.6 of “NN Tool FAQ (0.4)”, and am facing this error:

khadas@Khadas:~/aml_npu_app/detect_library/resnet18_normal_case_demo/bin_r_cv4$ ./detect_resnet18  ../resnet18.export.data resnet18.jpeg
#productname=VIPNano-QI, pid=0x88
Created VX Thread: 0x8cdc51b0
Create Neural Network: 72ms or 72817us
E [_decode_jpeg:275]CHECK PTR 275
E [_get_jpeg_data:606]CHECK PTR 606
E [_handle_multiple_inputs:745]CHECK PTR 745
E [vnn_PreProcessResnet18:892]CHECK STATUS(-1:A generic error code, used when no other describes the error.)
E [main:233]CHECK STATUS(-1:A generic error code, used when no other describes the error.)
Exit VX Thread: 0x8cdc51b0

@johndoe What is the size of your picture? The picture resolution must match the model's input resolution.

I’m not feeding in any picture as of yet. The model input size is 224x224

The first error is reported when reading the image data, so the problem is likely with the picture. The image size must match the model's input size.

Should I provide an image named resnet18.jpeg? The documentation doesn’t say anything about that.

Now I copied the goldfish224x224 image into the directory and ran:
./detect_resnet18 ../resnet18.export.data goldfish224x224.jpg

The error has changed to a segmentation fault:

khadas@Khadas:~/aml_npu_app/detect_library/resnet18_normal_case_demo/bin_r_cv4$ ./detect_resnet18  ../resnet18.export.data img.jpg 
#productname=VIPNano-QI, pid=0x88
Created VX Thread: 0x97e841b0
Create Neural Network: 41ms or 41010us
Verify...
[getVXCKernelInfo(60)] Failed to open library libNNVXCBinary.so.
Segmentation fault

@johndoe I will try to put together a workable solution. I haven’t tested it in detail yet, so please give me some time.

Sure thing :+1:
Thanks @Frank

@johndoe Follow these steps:

  1. Clone my repo
khadas@Khadas:~$ cd ~
khadas@Khadas:~$ git clone https://github.com/yan-wyb/Just_for_get_op_time.git
  2. Setup
khadas@Khadas:~$ export VIVANTE_SDK_DIR=/home/khadas/Just_for_get_op_time/data/vcmdtools
khadas@Khadas:~$ export LD_LIBRARY_PATH=/home/khadas/Just_for_get_op_time/data/drivers_64_exportdata
khadas@Khadas:~$ export VIV_VX_DEBUG_LEVEL=1
khadas@Khadas:~$ export CNN_PERF=1
khadas@Khadas:~$ export NN_LAYER_DUMP=1
  3. Get source code (Host PC)
$ git clone --recursive https://github.com/khadas/aml_npu_sdk.git
$ cd aml_npu_sdk/acuity-toolkit/demo
$ vim 2_export_case_code.sh

Remove these lines:

diff --git a/acuity-toolkit/demo/2_export_case_code.sh b/acuity-toolkit/demo/2_export_case_code.sh
index 6ba29e4..ba883b3 100755
--- a/acuity-toolkit/demo/2_export_case_code.sh
+++ b/acuity-toolkit/demo/2_export_case_code.sh
@@ -12,9 +12,6 @@ $export_ovxlib \
     --reorder-channel '0 1 2' \
     --channel-mean-value '128 128 128 128' \
     --export-dtype quantized \
-    --optimize VIPNANOQI_PID0XE8  \
-    --viv-sdk ${ACUITY_PATH}vcmdtools \
-    --pack-nbg-unify  \
 
 #Note:
 #       --optimize VIPNANOQI_PID0XB9  

Then compile:

$ bash 0_import_model.sh && bash 1_quantize_model.sh && bash 2_export_case_code.sh
$ mkdir op_test
$ cp *.c *.h *export.data op_test

Move this directory to your VIM3.

  4. Create compilation file (On VIM3)
khadas@Khadas:~$ cd op_test
khadas@Khadas:~$ wget https://raw.githubusercontent.com/khadas/aml_npu_app/master/detect_library/inception/makefile.linux.def
khadas@Khadas:~$ wget https://raw.githubusercontent.com/khadas/aml_npu_app/master/detect_library/inception/build_vx.sh
khadas@Khadas:~$ echo "TARGET_NAME = detect_mobilnet"  > makefile.target_name
khadas@Khadas:~$ vim makefile.linux

Here is the content of makefile.linux:

include ./makefile.linux.def
include ./makefile.target_name

INCLUDE += -I$(OPENCV_ROOT)/modules
INCLUDE += -I$(OPENCV_ROOT)/modules/highgui/include
INCLUDE += -I$(OPENCV_ROOT)/modules/core/include
INCLUDE += -I$(OPENCV_ROOT)/modules/imgproc/include
INCLUDE += -I$(OPENCV_ROOT)/modules/objdetect/include
INCLUDE += -I$(OPENCV_ROOT)/modules/imgcodecs/include
INCLUDE += -I$(OPENCV_ROOT)/modules/videoio/include 
INCLUDE += -I$(OPENCV4_ROOT)
INCLUDE += -I. 

INCLUDE += -I$(VIVANTE_SDK_INC)

CXXFLAGS += $(INCLUDE) -std=c++11 -Wall

################################################################################
# Supply necessary libraries.
#LIBS += $(OVXLIB_DIR)/lib/libjpeg.a

#LIBS +=  -lpthread -ldl
LIBS += -L$(OPENCV_ROOT)/lib -lz -lm

LIBS += -L$(VIVANTE_SDK_LIB) -lOpenVX -lOpenVXU -lGAL -lovxlib -lArchModelSw -lNNArchPerf

#LIBS +=-L$(LIB_DIR) -lstdc++
LIBS += -lvpcodec -lamcodec -lamadec -lamvdec -lamavutils -lrt -lpthread -lge2d -lion -ljpeg

#############################################################################
# Macros.
PROGRAM = 1
CUR_SOURCE = ${wildcard *.c}
#############################################################################
# Objects.
OBJECTS =  ${patsubst %.c, $(OBJ_DIR)/%.o, $(CUR_SOURCE)}
#OBJECTS += $(OBJ_DIR)/main.o
# installation directory
#INSTALL_DIR := ./

OBJ_DIR = bin_r_cv4
################################################################################

# Include the common makefile.

#include $(AQROOT)/common.target

LDFLAGS += -Wall -shared -Wl,-soname,$(TARGET_NAME) -Wl,-z,defs

TARGET_OUTPUT = $(OBJ_DIR)/$(TARGET_NAME)

all: $(TARGET_OUTPUT)

clean:
	@rm -rf $(OBJ_DIR)/* $(OBJ_DIR)

install: $(TARGET_OUTPUT)
	@mkdir -p $(INSTALL_DIR)
	@-cp $(TARGET_OUTPUT) $(INSTALL_DIR)

$(TARGET_OUTPUT): $(OBJECTS)
	@$(CXX) $(OBJECTS) -o $(TARGET_OUTPUT) $(LIBS)

$(OBJ_DIR)/%.o: %.c
	@echo "  COMPILE $(abspath $<)"
	@mkdir -p $(OBJ_DIR)
	@$(CC) -c $(CFLAGS) -o $@ $<

$(OBJ_DIR)/%.o: %.cpp
	@echo "  COMPILE $(abspath $<)"
	@mkdir -p $(OBJ_DIR)
	@$(CXX) -c $(CXXFLAGS) -o $@ $<


Then compile:

khadas@Khadas:~$ ./build_vx.sh
  5. Run
khadas@Khadas:~$ cd ~/op_test/bin_r_cv4
khadas@Khadas:~$ ./detect_mobilnet ../mobilenet_tf.export.data space_shuttle_224.jpg

You will see the per-operation run times you want.

PS: This will later be organized into the Khadas Docs, which should be more complete.


Working like a charm! :fire:

Thanks a ton @Frank

The accuracy values, however, aren’t quite consistent.

I used a resnet18 model (source: onnx model zoo) to classify the goldfish224x224 image and got these values:

 --- Top5 ---
978: 7.605184
107: 7.381502
  3: 7.269661
148: 7.157820
  6: 6.822298

[EDIT]
Accuracy values with the mobilenet model:

 --- Top5 ---
116: 0.354980
899: 0.130371
795: 0.106750
972: 0.029037
693: 0.019455

The data from the resnet model also needs to be processed by softmax to get the correct result.

khadas@Khadas:~/op_test/bin_r_cv4$ ./detect_mobilnet ../mobilenet_tf.export.data space_shuttle_224.jpg 
Create Neural Network: 47ms or 47469us
Verify...
Verify Graph: 26362ms or 26362602us
Start run graph [1] times...
Run the 1 time: 4.00ms or 4955.00us
vxProcessGraph execution time:
Total   4.00ms or 4984.00us
Average 4.98ms or 4984.00us
 --- Top5 ---
813: 0.999512
405: 0.000238
868: 0.000086
896: 0.000077
  0: 0.000000

On my side, mobilenet works fine. Did you make any changes?

I must’ve messed up the channel mean values.
In the case of the resnet model, how can I add a softmax layer for post-processing?

@johndoe You can follow my Python demo to do this

The code is in the form of an executable (detect_resnet). Won’t I have to make modifications in the C code?

@johndoe You need to modify the top-5 code: after sorting, apply softmax to the output values


Got it! Thanks a lot @Frank

Hi @Frank
I’m facing a segmentation fault for the same model I was working with some time back. Can you guess any reason behind it?

@johndoe You should add printed information to see whether the problem lies in pre-processing or post-processing. If it is pre-processing, the input data may not match the model's input shape. If it is post-processing, it may be due to the softmax you added.

I didn’t add a softmax function, as coding it in C would be a lot of hassle. Anyway, I just need layer-wise execution times, not accuracy, for now.