When trying the demo with my own images (which are 480x848), nothing was predicted.
As a sanity check, I resized the example image (1080p.bmp) to 480x848 and ran it through detect_demo_x11. Again, nothing was predicted (see below).
(npu) khadas@biped1 detect_demo_picture git:(master) ✗ ./detect_demo_x11 -m 4 -p 848p.bmp
W Detect_api:[det_set_log_level:19]Set log level=1
W Detect_api:[det_set_log_level:21]output_format not support Imperfect, default to DET_LOG_TERMINAL
W Detect_api:[det_set_log_level:26]Not exist VSI_NN_LOG_LEVEL, Setenv set_vsi_log_error_level
det_set_log_config Debug
det_set_model success!!
model.width:416
model.height:416
model.channel:3
Det_set_input START
Det_set_input END
Det_get_result START
Det_get_result END
resultData.detect_num=0
result type is 0
I am surprised because, looking through the source code (i.e. int ge2d_init(int width, int height) in aml_npu_app/detect_library/sample_demo_x11/main.cpp), it looks to me like the image is resized to the model's input shape (416x416) in any case.
Does anyone know why the input size of the image matters so much?
The source code is available, so you can study it yourself. It is actually very simple: you just use the OpenCV interface to scale the image. The demo uses 1920x1080 by default.
@arthurgassner I think your OpenCV resize is not being handled correctly; you should take a moment to understand how OpenCV scales images. I suspect your data is being seriously deformed here, which leads to biased recognition results.
Do you think it's my OpenCV install that is wrong, then?
I kept the image's aspect ratio (to avoid deforming the image, even slightly) and changed sample_demo_x11/main.cpp to
#define MAX_HEIGHT 480
#define MAX_WIDTH 853
Compiling the code and testing it on a 480x853 image, nothing is detected.
But if I manually rescale the 480x853 image to 1080x1920 (resulting in a slightly pixelated image), then the person is detected (after changing MAX_HEIGHT and MAX_WIDTH in main.cpp accordingly).