Realtime Text Recognition with VIM4 and IMX415 MIPI Camera

Which system do you use? Android, Ubuntu, OOWOW or others?

**I have used OOWOW to install Ubuntu 24.04 on my Khadas VIM4, with an IMX415 MIPI camera.**

Which version of system do you use? Please provide the version of the system here:

**Ubuntu 24.04**

Please describe your issue below:

**I’m new to computer vision and I have tried to build an application which uses the camera for realtime text recognition. I have followed the setup for connecting the IMX415 camera to the VIM4 via MIPI, and I use GStreamer to stream the video. That part works perfectly.

However, the difficult part is the text recognition itself. I have tried a few methods: the easyocr package, the pytesseract package, and the OpenCV EAST text detection method, all through OpenCV Python. The results are not as expected; it is so laggy that the app eventually crashes.

So I assume I wasn’t using either the Mali GPU or the Amlogic NPU. I have tried this setup

which teaches you how to set up the TIM-VX backend, which then allows me to use it in the OpenCV DNN module. This approach uses CRNN with an ONNX file. However, it still lags a lot.

So I did some research on the Khadas official site and found some NPU applications I could use. There, I only saw one Python demo, which uses the KSNN module. I downloaded it and tried to run it, but got a segmentation fault. I have done a lot of research online but found little support.

Now I’m going for the C++ approach (which I’m not so familiar with). There are quite a few C++ demos on the Khadas official site. I downloaded them from GitHub - khadas/vim4_npu_applications and tried running the YOLOv8 object detection, and it seems to work better than my previous Python approaches. However, there is no demo for realtime text recognition. There is the DenseNet CTC demo, but it is not realtime (I’m sorry, I’m not familiar with C++ :cry: ). Can anyone help me with this, or is there any ready-made demo I can use? I have been researching this for more than two weeks now and still can’t find a solution.

P.S. The Python scripts mentioned above run smoothly on my Windows machine with an Intel CPU and NVIDIA GPU, but not on the Khadas VIM4.**


I am also facing the same Python error and am now also referencing the C++ demo.

Any solution so far?

Hello @JietChoo ,

Could you provide the steps to reproduce the VIM4 KSNN problem? It is a test version and we are collecting bugs to fix.

And about text recognition: OCR usually has two steps. The first detects all text lines; the second crops each line from the image and recognizes the text in that line. DenseNet CTC performs the latter step, so the demo only recognizes a single cropped image.

Since you are not familiar with C++, I suggest you use KSNN after we fix the bug.
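The two-stage flow described here (detect lines, then recognize each crop) can be sketched in plain Python. The detector and recognizer below are placeholders, not real models; only the crop-then-recognize plumbing is shown.

```python
def crop_lines(image, boxes):
    """Crop each detected (x, y, w, h) box out of an image given as a list of rows."""
    crops = []
    for x, y, w, h in boxes:
        # Slice rows y..y+h, then columns x..x+w within each row.
        crops.append([row[x:x + w] for row in image[y:y + h]])
    return crops

def recognize_lines(image, boxes, recognize_line):
    """Stage 2: run a single-line recognizer (e.g. DenseNet CTC) on each crop."""
    return [recognize_line(crop) for crop in crop_lines(image, boxes)]
```

In a real pipeline, `boxes` would come from a text detector (such as EAST) and `recognize_line` would wrap the DenseNet CTC inference call.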

Thank you for your reply,

I followed the instructions in

First, I installed pip3 using

sudo apt install python3-pip

After that, I went into the ksnn folder and installed the ksnn_vim4-1.4-py3-none-any.whl file using pip3:

pip3 install ksnn_vim4-1.4-py3-none-any.whl 

However, after this my cv2 package was no longer able to use GStreamer. When I call print(cv2.getBuildInformation()), the GStreamer backend shows as disabled. The cv2 build information is as follows:

General configuration for OpenCV 4.10.0 =====================================
  Version control:               4.10.0-dirty

  Platform:
    Timestamp:                   2024-06-17T18:00:16Z
    Host:                        Linux 5.3.0-28-generic aarch64
    CMake:                       3.29.5
    CMake generator:             Unix Makefiles
    CMake build tool:            /bin/gmake
    Configuration:               Release

  CPU/HW features:
    Baseline:                    NEON FP16
    Dispatched code generation:  NEON_DOTPROD NEON_FP16 NEON_BF16
      requested:                 NEON_FP16 NEON_BF16 NEON_DOTPROD
      NEON_DOTPROD (1 files):    + NEON_DOTPROD
      NEON_FP16 (2 files):       + NEON_FP16
      NEON_BF16 (0 files):       + NEON_BF16

  C/C++:
    Built as dynamic libs?:      NO
    C++ standard:                11
    C++ Compiler:                /opt/rh/devtoolset-10/root/usr/bin/c++  (ver 10.2.1)
    C++ flags (Release):         -Wl,-strip-all   -fsigned-char -W -Wall -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
    C++ flags (Debug):           -Wl,-strip-all   -fsigned-char -W -Wall -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /opt/rh/devtoolset-10/root/usr/bin/cc
    C flags (Release):           -Wl,-strip-all   -fsigned-char -W -Wall -Wreturn-type -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
    C flags (Debug):             -Wl,-strip-all   -fsigned-char -W -Wall -Wreturn-type -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):      -L/ffmpeg_build/lib  -Wl,--gc-sections -Wl,--as-needed -Wl,--no-undefined  
    Linker flags (Debug):        -L/ffmpeg_build/lib  -Wl,--gc-sections -Wl,--as-needed -Wl,--no-undefined  
    ccache:                      YES
    Precompiled headers:         NO
    Extra dependencies:          /lib64/libopenblas.so Qt5::Core Qt5::Gui Qt5::Widgets Qt5::Test Qt5::Concurrent /usr/local/lib/libpng.so /lib64/libz.so dl m pthread rt
    3rdparty dependencies:       libprotobuf ade ittnotify libjpeg-turbo libwebp libtiff libopenjp2 IlmImf tegra_hal

  OpenCV modules:
    To be built:                 calib3d core dnn features2d flann gapi highgui imgcodecs imgproc ml objdetect photo python3 stitching video videoio
    Disabled:                    world
    Disabled by dependency:      -
    Unavailable:                 java python2 ts
    Applications:                -
    Documentation:               NO
    Non-free algorithms:         NO

  GUI:                           QT5
    QT:                          YES (ver 5.15.13 )
      QT OpenGL support:         NO
    GTK+:                        NO
    VTK support:                 NO

  Media I/O: 
    ZLib:                        /lib64/libz.so (ver 1.2.7)
    JPEG:                        build-libjpeg-turbo (ver 3.0.3-70)
      SIMD Support Request:      YES
      SIMD Support:              YES
    WEBP:                        build (ver encoder: 0x020f)
    PNG:                         /usr/local/lib/libpng.so (ver 1.6.43)
    TIFF:                        build (ver 42 - 4.6.0)
    JPEG 2000:                   build (ver 2.5.0)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      NO
    FFMPEG:                      YES
      avcodec:                   YES (59.37.100)
      avformat:                  YES (59.27.100)
      avutil:                    YES (57.28.100)
      swscale:                   YES (6.7.100)
      avresample:                NO
    GStreamer:                   NO
    v4l/v4l2:                    YES (linux/videodev2.h)

  Parallel framework:            pthreads

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Lapack:                      YES (/lib64/libopenblas.so)
    Eigen:                       NO
    Custom HAL:                  YES (carotene (ver 0.0.1, Auto detected))
    Protobuf:                    build (3.19.1)
    Flatbuffers:                 builtin/3rdparty (23.5.9)

  OpenCL:                        YES (no extra features)
    Include path:                /io/opencv/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python 3:
    Interpreter:                 /opt/python/cp39-cp39/bin/python3.9 (ver 3.9.19)
    Libraries:                   libpython3.9m.a (ver 3.9.19)
    Limited API:                 YES (ver 0x03060000)
    numpy:                       /home/ci/.local/lib/python3.9/site-packages/numpy/_core/include (ver 2.0.0)
    install path:                python/cv2/python-3

  Python (for build):            /opt/python/cp39-cp39/bin/python3.9

  Java:                          
    ant:                         NO
    Java:                        NO
    JNI:                         NO
    Java wrappers:               NO
    Java tests:                  NO

  Install to:                    /io/_skbuild/linux-aarch64-3.9/cmake-install
-----------------------------------------------------------------

Under the Video I/O section, GStreamer has the value NO. This leaves me unable to stream the video feed from the Khadas IMX415 MIPI camera via GStreamer. This only happens after pip3 install ksnn_vim4-1.4-py3-none-any.whl; before that, GStreamer was working fine.
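To script this check rather than reading the dump by eye, here is a minimal sketch that scans the build-information text for the GStreamer line (the parsing logic is my own, not an OpenCV API):

```python
def has_gstreamer(build_info: str) -> bool:
    """Return True if a cv2.getBuildInformation() dump reports GStreamer: YES."""
    for line in build_info.splitlines():
        stripped = line.strip()
        if stripped.startswith("GStreamer:"):
            return "YES" in stripped.split(":", 1)[1]
    return False

# Usage: print(has_gstreamer(cv2.getBuildInformation()))
```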

Hence, I installed the necessary GStreamer packages as follows:

sudo apt install libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev gstreamer1.0-plugins-{base,good,bad}

Then I rebuilt OpenCV with GStreamer support:

$ cd /opt
$ git clone https://github.com/opencv/opencv
$ cmake -B opencv-build \
      -D CMAKE_BUILD_TYPE=RELEASE \
      -D BUILD_opencv_gapi=OFF \
      -D CMAKE_INSTALL_PREFIX=opencv-install \
      -D WITH_GSTREAMER=ON \
      -D BUILD_opencv_python3=ON opencv
$ cmake --build opencv-build --target install -j 8

(This step takes a very long time to run.)

After that, I ran sudo nano /etc/environment and added the following line:

PYTHONPATH=/opt/opencv-build/python_loader:$PYTHONPATH

This enabled GStreamer in cv2.

After that, I modified yolov8n-cap.py, changing:

cap = cv.VideoCapture(int(cap_num))

to

pipeline = "v4l2src device=/dev/media0 io-mode=mmap ! video/x-raw,format=NV12,width=3840,height=2160,framerate=30/1 ! videoconvert ! appsink"	
cap = cv.VideoCapture(pipeline,cv.CAP_GSTREAMER)

And I also removed:

cap.set(3,1920)
cap.set(4,1080)

All these changes are inside the main function.
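For reference, the pipeline string used in this modification can be built from parameters. This is a hypothetical helper, with the 3840x2160 @ 30 fps defaults taken from the post:

```python
def build_pipeline(device="/dev/media0", width=3840, height=2160, fps=30):
    """Build the v4l2src GStreamer pipeline string for cv.VideoCapture."""
    return (
        f"v4l2src device={device} io-mode=mmap ! "
        f"video/x-raw,format=NV12,width={width},height={height},"
        f"framerate={fps}/1 ! videoconvert ! appsink"
    )

# Usage (requires cv2 built with GStreamer):
# cap = cv.VideoCapture(build_pipeline(), cv.CAP_GSTREAMER)
```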

Then I ran:

python3 yolov8n-cap.py --model ./models/VIM4/yolov8n_int8.adla --library ./libs/libnn_yolov8n.so --device 0

(Since the --device argument no longer affects anything, I put 0 for now.)

Then I got a segmentation fault, and I'm not sure what causes it.

The above are the steps I took.

Also, two more questions:

  1. Is there any ready-made KSNN realtime text recognition demo for me to try out?
  2. Why is it so laggy when I use easyocr, pytesseract, or the EAST text detection method (which I found online)? Is it because I wasn’t tapping into the GPU or the NPU? If so, is there any way I can utilize them?

Edit: I forgot to mention that I ran

pip3 install numpy==1.26.4 

because I got the following error:

ImportError: 
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

I also did

sudo apt-get install libatk-adaptor libgail-common

because I got the following error:

Gtk-Message: 11:34:56.472: Failed to load module "gail"

** (python3:4509): WARNING **: 11:34:56.518: (../atk-adaptor/bridge.c:1085):atk_bridge_adaptor_init: runtime check failed: (root)

After running the yolov8n-cap.py file, the segmentation fault output is as follows:

E NN_SDK:[adla_get_rgb_data:97]Error: input shape check fail.
E NN_SDK:[aml_adla_inputs_set_off:760]Error: get input fail.
E NN_SDK:[aml_adla_run_network_off:1163]Error: run network fail.
[2024-10-18 11:39:41]  DEBUG  [amlv4l2src camctrl.cc:914:Signalhandler]enter camctrl Signalhandler: 15
[2024-10-18 11:39:41]  DEBUG  [amlv4l2src camctrl.cc:917:Signalhandler]exit camctrl Signalhandler: 15
Segmentation fault

This is just a guess; I have no clue what is going on.

Typically, with Python version-mismatch errors, you need to establish a working directory and then source a .venv. Install all the packages in it and work on your project inside that sourced venv.

Also, I get everything up and running on a big box first. Your level of frustration will be much lower. It is so much easier to bring a project up there and, once it is working, port it over to your ARM board.

$ sudo apt install python3.10-venv
$ python3 -m venv .venv
$ source .venv/bin/activate

Now install everything in the .venv and work from it.

To exit venv

$ deactivate
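To confirm the interpreter is actually running inside the sourced venv before installing packages, here is a small check using only the standard library (an addition of mine, not from the thread):

```python
import sys

def in_venv() -> bool:
    """True when the running interpreter was created by venv/virtualenv."""
    # Inside a venv, sys.prefix points at the .venv directory while
    # base_prefix still points at the system Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

print(in_venv())
```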

Hello @JietChoo ,

Thank you for your feedback. We will find the problem and fix it.

About your questions.

  1. I am sorry that we do not have a text recognition demo for KSNN at the moment.
  2. There are many possible reasons for lag: first, inference running on the CPU; second, the model you use is too large; third, your video input is too large. At present, the VIM4 can only accelerate inference with the NPU. There are two ways to use the NPU: KSNN and C++. You need to convert your model first, then put it on the VIM4 for inference. You can refer to the YOLOv8n demos for how to convert your model:
    YOLOv8n OpenCV VIM4 Demo - 2 [Khadas Docs]
    YOLOv8n KSNN Demo - 2 [Khadas Docs]
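On the "video input is too large" point: a common mitigation is downscaling each frame before inference. Here is a minimal aspect-preserving sketch; the 640x640 target matches the YOLOv8n demos but is an assumption on my part:

```python
def fit_to_model(frame_w, frame_h, model_w=640, model_h=640):
    """Largest size that fits inside the model input while keeping aspect ratio."""
    scale = min(model_w / frame_w, model_h / frame_h)
    return int(frame_w * scale), int(frame_h * scale)

# A 4K (3840x2160) frame shrinks to 640x360 before being padded to 640x640.
```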

Will you have a realtime text recognition KSNN demo soon? I would really appreciate it if you came out with one.

Also, is it possible to use VIM3 KSNN demos on the VIM4? I haven’t tried it yet.

Hi Louis, I have tried using the C++ demo to do realtime text recognition. I’m using your DenseNet CTC demo with cv::VideoCapture, running postprocess_densenet_ctc on each frame, but I got a set input size too big, please check it! error.

My Code:

/****************************************************************************
*
*    Copyright (c) 2019  by amlogic Corp.  All rights reserved.
*
*    The material in this file is confidential and contains trade secrets
*    of amlogic Corporation. No part of this work may be disclosed,
*    reproduced, copied, transmitted, or used in any way for any purpose,
*    without the express written permission of amlogic Corporation.
*
***************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "nn_sdk.h"
#include "nn_util.h"
#include "postprocess_util.h"
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/types_c.h>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc/imgproc_c.h>
#include <getopt.h>
#include <sys/time.h>
#include <iostream>

//#define MODEL_WIDTH 280
//#define MODEL_HEIGHT 32

#define MODEL_WIDTH 640
#define MODEL_HEIGHT 640

#define DEFAULT_WIDTH 1280
#define DEFAULT_HEIGHT 720

int default_width = DEFAULT_WIDTH;
int default_height = DEFAULT_HEIGHT;

struct option longopts[] = {
	{ "model",          required_argument,  NULL,   'm' },
	{ "width",          required_argument,  NULL,   'w' },
	{ "height",         required_argument,  NULL,   'h' },
	{ "help",           no_argument,        NULL,   'H' },
	{ 0, 0, 0, 0 }
};

nn_input inData;

cv::Mat img;

aml_module_t modelType;

static int input_width,input_high,input_channel;

typedef enum _amlnn_detect_type_ {
	Accuracy_Detect_Yolo_V3 = 0
} amlnn_detect_type;

void* init_network_file(const char *mpath) {

	void *qcontext = NULL;

	aml_config config;
	memset(&config, 0, sizeof(aml_config));

	config.nbgType = NN_ADLA_FILE;
	config.path = mpath;
	config.modelType = ADLA_LOADABLE;
	config.typeSize = sizeof(aml_config);

	qcontext = aml_module_create(&config);
	if (qcontext == NULL) {
		printf("amlnn_init is fail\n");
		return NULL;
	}

	if (config.nbgType == NN_ADLA_MEMORY && config.pdata != NULL) {
		free((void*)config.pdata);
	}

	return qcontext;
}

//int set_input(void *qcontext, const cv::Mat jimg) {

//	int ret = 0;
//	int input_size = 0;
//	int hw = input_width*input_high;
//	unsigned char *rawdata = NULL;
//	cv::Mat temp_img(MODEL_WIDTH, MODEL_HEIGHT, CV_8UC1),normalized_img;
	//img = cv::imread(jmat, 0);
//	int width = jimg.cols;
//	int height = jimg.rows;
//	cv::resize(jimg, temp_img, cv::Size(MODEL_WIDTH, MODEL_HEIGHT));
//	temp_img.convertTo(normalized_img, CV_32FC1, 1.0 / 255.0);
//	rawdata = normalized_img.data;
//	inData.input_type = RGB24_RAW_DATA;
//	inData.input = rawdata;
//	inData.input_index = 0;
//	inData.size = input_width * input_high * input_channel * sizeof(float);
//	ret = aml_module_input_set(qcontext, &inData);

//	return ret;
//}

int run_network(void *qcontext) {

	//int ret = 0;
	//nn_output *outdata = NULL;
	//aml_output_config_t outconfig;
	//memset(&outconfig, 0, sizeof(aml_output_config_t));

	//outconfig.format = AML_OUTDATA_RAW;//AML_OUTDATA_RAW or AML_OUTDATA_FLOAT32
	//outconfig.typeSize = sizeof(aml_output_config_t);

	//outdata = (nn_output*)aml_module_output_get(qcontext, outconfig);
	//if (outdata == NULL) {
	//	printf("aml_module_output_get error\n");
	//	return -1;
	//}

	//char result[35] = {0};
	//int result_len = 0;

	//postprocess_densenet_ctc(outdata, result, &result_len);

	//printf("%d\n", result_len);
	//printf("%s\n", result);

	//return ret;
	int ret = 0;
	int frames = 0;
	nn_output *outdata = NULL;
	struct timeval time_start, time_end;
	float total_time = 0;

	aml_output_config_t outconfig;
	memset(&outconfig, 0, sizeof(aml_output_config_t));

	outconfig.format = AML_OUTDATA_FLOAT32;//AML_OUTDATA_RAW or AML_OUTDATA_FLOAT32
	outconfig.typeSize = sizeof(aml_output_config_t);
	outconfig.order = AML_OUTPUT_ORDER_NCHW;

	obj_detect_out_t yolov3_detect_out;

	cv::namedWindow("Image Window");
	
	int input_size = 0;
	int hw = input_width*input_high;
	unsigned char *rawdata = NULL;
	cv::Mat temp_img(MODEL_WIDTH, MODEL_HEIGHT, CV_8UC1);

	inData.input_type = RGB24_RAW_DATA;
	inData.input_index = 0;
	inData.size = input_width * input_high * input_channel;
	
	std::string pipeline = std::string("v4l2src device=/dev/media0 io-mode=mmap ! video/x-raw,format=NV12,width=") + std::to_string(default_width) + std::string(",height=") + std::to_string(default_height) + std::string(",framerate=30/1 ! videoconvert ! appsink");
	cv::VideoCapture cap(pipeline);

	gettimeofday(&time_start, 0);

	if (!cap.isOpened()) {
		std::cout << "capture device failed to open!" << std::endl;
		cap.release();
		exit(-1);
	}

	while(1) {
		gettimeofday(&time_start, 0);
		
		if (!cap.read(img)) {
			std::cout<<"Capture read error"<<std::endl;
			break;
		}
		
		printf("Input");
		cv::resize(img, temp_img, cv::Size(MODEL_WIDTH, MODEL_HEIGHT));
		cv::cvtColor(temp_img, temp_img, cv::COLOR_RGB2BGR);

		inData.input = temp_img.data;

//		gettimeofday(&time_start, 0);
		ret = aml_module_input_set(qcontext, &inData);

		if (ret != 0) {
			printf("set_input fail.\n");
			return -1;
		}

		outdata = (nn_output*)aml_module_output_get(qcontext, outconfig);
		if (outdata == NULL) {
			printf("aml_module_output_get error\n");
			return -1;
		}
		printf("Output");
		gettimeofday(&time_end, 0);  // record the end time so the FPS counter below is meaningful

		++frames;
		total_time += (float)((time_end.tv_sec - time_start.tv_sec) + (time_end.tv_usec - time_start.tv_usec) / 1000.0f / 1000.0f);
		if (total_time >= 1.0f) {
			int fps = (int)(frames / total_time);
			printf("Inference FPS: %i\n", fps);
			frames = 0;
			total_time = 0;
		}
		
		char result[35] = {0};
		int result_len = 0;
		printf("Start DenseNet CTC");
		postprocess_densenet_ctc(outdata, result, &result_len);
		printf("Result of DenseNet CTC");
		printf("%d\n", result_len);
		printf("%s\n", result);
		
		cv::imshow("Image Window",img);
		cv::waitKey(1);
	}
	
	return ret;
}

int destroy_network(void *qcontext) {

	int ret = aml_module_destroy(qcontext);
	return ret;
}

int main(int argc,char **argv)
{
	int c;
	int ret = 0;
	void *context = NULL;
	char *model_path = NULL;
	//char *input_data = NULL;
	input_width = MODEL_WIDTH;
	input_high = MODEL_HEIGHT;
	//input_channel = 1;
	input_channel = 3;

	while ((c = getopt_long(argc, argv, "m:w:h:H", longopts, NULL)) != -1) {
		switch (c) {
			case 'w':
				default_width = atoi(optarg);
				break;
				
			case 'h':
				default_height = atoi(optarg);
				break;
				
			case 'm':
				model_path = optarg;
				break;
				
			//case 'p':
			//	input_data = optarg;
			//	break;

			default:
				printf("%s[-m model path] [-w camera width] [-h camera height] [-H]\n", argv[0]);
				exit(1);
		}
	}

	context = init_network_file(model_path);
	if (context == NULL) {
		printf("init_network fail.\n");
		return -1;
	}

	//ret = set_input(context, input_data);

	//if (ret != 0) {

	//	printf("set_input fail.\n");
	//	return -1;
	//}

	ret = run_network(context);

	if (ret != 0) {
		printf("run_network fail.\n");
		return -1;
	}

	ret = destroy_network(context);

	if (ret != 0) {
		printf("destroy_network fail.\n");
		return -1;
	}

	return ret;
}

The Error I Got:

QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
WARNING: Some incorrect rendering might occur because the selected Vulkan device (Mali-G52) doesn't support base Zink requirements: feats.features.logicOp feats.features.fillModeNonSolid feats.features.shaderClipDistance 
[API:aml_v4l2src_connect:271]Enter, devname : /dev/media0
func_name: aml_src_get_cam_method
initialize func addr: 0x7f6c9f167c
finalize func addr: 0x7f6c9f1948
start func addr: 0x7f6c9f199c
stop func addr: 0x7f6c9f1a4c
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 1, type 0x20000, name isp-csiphy), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 4, type 0x20000, name isp-adapter), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 7, type 0x20000, name isp-test-pattern-gen), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 9, type 0x20000, name isp-core), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 20, type 0x20001, name imx415-0), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 24, type 0x10001, name isp-ddr-input), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 28, type 0x10001, name isp-param), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 32, type 0x10001, name isp-stats), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 36, type 0x10001, name isp-output0), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 40, type 0x10001, name isp-output1), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 44, type 0x10001, name isp-output2), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 48, type 0x10001, name isp-output3), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 52, type 0x10001, name isp-raw), ret 0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id -2147483596, type 0x0, name ), ret -1
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:239:carm_src_is_usb]carm_src_is_usb:error Invalid argument
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:79:cam_src_select_socket]select socket:/tmp/camctrl0.socket
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:103:cam_src_obtain_devname]fork ok, pid:7369
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:103:cam_src_obtain_devname]fork ok, pid:0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camsrc.c:107:cam_src_obtain_devname]execl /usr/bin/camctrl
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camctrl.cc:925:main][camctrl.cc:main:925]

[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camctrl.cc:889:parse_opt]media device name: /dev/media0
[2024-10-22 07:42:39]  DEBUG  [amlv4l2src camctrl.cc:898:parse_opt]Server socket: /tmp/camctrl0.socket
Opening media device /dev/media0
Enumerating entities
Found 13 entities
Enumerating pads and links
mediaStreamInit[35]: mediaStreamInit ++. 

mediaStreamInit[39]: media devnode: /dev/media0
mediaStreamInit[56]: ent 0, name isp-csiphy 
mediaStreamInit[56]: ent 1, name isp-adapter 
mediaStreamInit[56]: ent 2, name isp-test-pattern-gen 
mediaStreamInit[56]: ent 3, name isp-core 
mediaStreamInit[56]: ent 4, name imx415-0 
mediaStreamInit[56]: ent 5, name isp-ddr-input 
mediaStreamInit[56]: ent 6, name isp-param 
mediaStreamInit[56]: ent 7, name isp-stats 
mediaStreamInit[56]: ent 8, name isp-output0 
mediaStreamInit[56]: ent 9, name isp-output1 
mediaStreamInit[56]: ent 10, name isp-output2 
mediaStreamInit[56]: ent 11, name isp-output3 
mediaStreamInit[56]: ent 12, name isp-raw 
mediaStreamInit[96]: get  lens_ent fail
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video63 ok, fd 5 
 
mediaStreamInit[151]: mediaStreamInit open video0 fd 5 
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video64 ok, fd 6 
 
mediaStreamInit[155]: mediaStreamInit open video1 fd 6 
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video65 ok, fd 7 
 
mediaStreamInit[159]: mediaStreamInit open video2 fd 7 
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video66 ok, fd 8 
 
mediaStreamInit[163]: mediaStreamInit open video3 fd 8 
mediaStreamInit[172]: media stream init success
fetchPipeMaxResolution[27]: find matched sensor configs 3840x2160
media_set_wdrMode[420]: media_set_wdrMode ++ wdr_mode : 0 

media_set_wdrMode[444]: media_set_wdrMode success --

media_set_wdrMode[420]: media_set_wdrMode ++ wdr_mode : 4 

media_set_wdrMode[444]: media_set_wdrMode success --

[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:374:link_and_activate_subdev]link and activate subdev successfully
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:407:media_stream_config]config media stream successfully
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video62 ok, fd 13 
 
mediaLog[30]: VIDIOC_QUERYCAP: success 
 
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:172:check_capability]entity[isp-stats] -> video[/dev/video62], cap.driver:aml-camera, capabilities:0x85200001, device_caps:0x5200001
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video61 ok, fd 14 
 
mediaLog[30]: VIDIOC_QUERYCAP: success 
 
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:172:check_capability]entity[isp-param] -> video[/dev/video61], cap.driver:aml-camera, capabilities:0x85200001, device_caps:0x5200001
mediaLog[30]: set format ok, ret 0.
 
mediaLog[30]: set format ok, ret 0.
 
mediaLog[30]:  request buf ok
 
mediaLog[30]:  request buf ok
 
mediaLog[30]: query buffer success 
 
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:546:isp_alg_param_init]isp stats query buffer, length: 262144, offset: 0
mediaLog[30]: query buffer success 
 
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:546:isp_alg_param_init]isp stats query buffer, length: 262144, offset: 262144
mediaLog[30]: query buffer success 
 
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:546:isp_alg_param_init]isp stats query buffer, length: 262144, offset: 524288
mediaLog[30]: query buffer success 
 
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:546:isp_alg_param_init]isp stats query buffer, length: 262144, offset: 786432
mediaLog[30]: query buffer success 
 
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:568:isp_alg_param_init]isp param query buffer, length: 262144, offset: 0
alg2User func addr: 0x7fab7f8ed8
alg2Kernel func addr: 0x7fab7f8f08
algEnable func addr: 0x7fab7f8d70
algDisable func addr: 0x7fab7f8e90
algFwInterface func addr: 0x7fab7f9008
matchLensConfig[43]: LKK: fail to match lensConfig

cmos_get_ae_default_imx415[65]: cmos_get_ae_default

cmos_get_ae_default_imx415[116]: cmos_get_ae_default++++++

cmos_get_ae_default_imx415[65]: cmos_get_ae_default

cmos_get_ae_default_imx415[116]: cmos_get_ae_default++++++

aisp_enable[984]: tuning device not exist!

aisp_enable[987]: 3a commit b56e430e80b995bb88cecff66a3a6fc17abda2c7 

cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 16351232, 0, 0, 0

mediaLog[30]: streamon   success 
 
mediaLog[30]: streamon   success 
 
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:650:isp_alg_param_init]Finish initializing amlgorithm parameter ...
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:971:main]UNIX domain socket bound
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:977:main]Accepting connections ...
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camsrc.c:122:cam_src_obtain_devname]udp_sock_create
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src common/common.c:70:udp_sock_create][453321940][/tmp/camctrl0.socket] start connect
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camsrc.c:124:cam_src_obtain_devname]udp_sock_recv
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:985:main]connected_sockfd: 20
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:989:main]video_dev_name: /dev/video63
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camsrc.c:282:cam_src_initialize]obtain devname: /dev/video63
devname : /dev/video63
driver : aml-camera
device : Amlogic Camera Card
bus_info : platform:aml-cam
version : 331657
error tvin-port use -1 
[API:aml_v4l2src_streamon:373]Enter
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camsrc.c:298:cam_src_start]start ...
[API:aml_v4l2src_streamon:376]Exit
[2024-10-22 07:42:40]  DEBUG  [amlv4l2src camctrl.cc:860:process_socket_thread]receive streamon notification
cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 1836, 1836

cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 11046912, 11046912, 11046912, 11046912

cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 0, 0

cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 14585856, 14585856, 14585856, 14585856

[ WARN:0@1.792] global ./modules/videoio/src/cap_gstreamer.cpp (1405) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 13967360, 13967360, 13967360, 13967360

cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 13545472, 13545472, 13545472, 13545472

InputE NN_SDK:[aml_adla_inputs_set_off:812]Error: set input size too big, please check it! [1228800] : [35840]
OutputStart DenseNet CTCResult of DenseNet CTC0

cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 13185024, 13185024, 13185024, 13185024

InputE NN_SDK:[aml_adla_inputs_set_off:812]Error: set input size too big, please check it! [1228800] : [35840]
OutputStart DenseNet CTCResult of DenseNet CTC0

(the same "set input size too big" error and "OutputStart DenseNet CTC" lines repeat for every frame, interleaved with cmos_inttime_calc_table_imx415 log lines)

The video is displayed in a window, but no text recognition is happening.

How do I fix the "input size too big" error? Or is my approach wrong?

Hello @JietChoo ,

Are you using the demo model directly?

The demo model needs a single-channel image, and its input size is 280×32×1. So you need to convert RGB to grayscale, and you must not change the model input size.

[image]

Another question: before feeding the input to the model, you need to normalize it. The model's normalization is 1/255. You can refer to the code in the red boxes.

About this question, I am sorry, but the VIM3 demos cannot be used on the VIM4, because the two boards use different NPU structures.

Why can’t I change the model width and height? Which part determines what model width and height I can use? Is it possible to change it?

Also, I have amended my code as below:

/****************************************************************************
*
*    Copyright (c) 2019  by amlogic Corp.  All rights reserved.
*
*    The material in this file is confidential and contains trade secrets
*    of amlogic Corporation. No part of this work may be disclosed,
*    reproduced, copied, transmitted, or used in any way for any purpose,
*    without the express written permission of amlogic Corporation.
*
***************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "nn_sdk.h"
#include "nn_util.h"
#include "postprocess_util.h"
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/types_c.h>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc/imgproc_c.h>
#include <getopt.h>
#include <sys/time.h>

#define MODEL_WIDTH 280
#define MODEL_HEIGHT 32

//#define MODEL_WIDTH 640
//#define MODEL_HEIGHT 640

#define DEFAULT_WIDTH 1280
#define DEFAULT_HEIGHT 720

int default_width = DEFAULT_WIDTH;
int default_height = DEFAULT_HEIGHT;

struct option longopts[] = {
	{ "model",          required_argument,  NULL,   'm' },
	{ "width",          required_argument,  NULL,   'w' },
	{ "height",         required_argument,  NULL,   'h' },
	{ "help",           no_argument,        NULL,   'H' },
	{ 0, 0, 0, 0 }
};

nn_input inData;

cv::Mat img;

aml_module_t modelType;

static int input_width,input_high,input_channel;

typedef enum _amlnn_detect_type_ {
	Accuracy_Detect_Yolo_V3 = 0
} amlnn_detect_type;

void* init_network_file(const char *mpath) {

	void *qcontext = NULL;

	aml_config config;
	memset(&config, 0, sizeof(aml_config));

	config.nbgType = NN_ADLA_FILE;
	config.path = mpath;
	config.modelType = ADLA_LOADABLE;
	config.typeSize = sizeof(aml_config);

	qcontext = aml_module_create(&config);
	if (qcontext == NULL) {
		printf("amlnn_init is fail\n");
		return NULL;
	}

	if (config.nbgType == NN_ADLA_MEMORY && config.pdata != NULL) {
		free((void*)config.pdata);
	}

	return qcontext;
}

//int set_input(void *qcontext, const cv::Mat jimg) {

//	int ret = 0;
//	int input_size = 0;
//	int hw = input_width*input_high;
//	unsigned char *rawdata = NULL;
//	cv::Mat temp_img(MODEL_WIDTH, MODEL_HEIGHT, CV_8UC1),normalized_img;
	//img = cv::imread(jmat, 0);
//	int width = jimg.cols;
//	int height = jimg.rows;
//	cv::resize(jimg, temp_img, cv::Size(MODEL_WIDTH, MODEL_HEIGHT));
//	temp_img.convertTo(normalized_img, CV_32FC1, 1.0 / 255.0);
//	rawdata = normalized_img.data;
//	inData.input_type = RGB24_RAW_DATA;
//	inData.input = rawdata;
//	inData.input_index = 0;
//	inData.size = input_width * input_high * input_channel * sizeof(float);
//	ret = aml_module_input_set(qcontext, &inData);

//	return ret;
//}

int run_network(void *qcontext) {

	//int ret = 0;
	//nn_output *outdata = NULL;
	//aml_output_config_t outconfig;
	//memset(&outconfig, 0, sizeof(aml_output_config_t));

	//outconfig.format = AML_OUTDATA_RAW;//AML_OUTDATA_RAW or AML_OUTDATA_FLOAT32
	//outconfig.typeSize = sizeof(aml_output_config_t);

	//outdata = (nn_output*)aml_module_output_get(qcontext, outconfig);
	//if (outdata == NULL) {
	//	printf("aml_module_output_get error\n");
	//	return -1;
	//}

	//char result[35] = {0};
	//int result_len = 0;

	//postprocess_densenet_ctc(outdata, result, &result_len);

	//printf("%d\n", result_len);
	//printf("%s\n", result);

	//return ret;
	int ret = 0;
	int frames = 0;
	nn_output *outdata = NULL;
	struct timeval time_start, time_end;
	float total_time = 0;

	aml_output_config_t outconfig;
	memset(&outconfig, 0, sizeof(aml_output_config_t));

	outconfig.format = AML_OUTDATA_FLOAT32;//AML_OUTDATA_RAW or AML_OUTDATA_FLOAT32
	outconfig.typeSize = sizeof(aml_output_config_t);
	outconfig.order = AML_OUTPUT_ORDER_NCHW;

	obj_detect_out_t yolov3_detect_out;

	cv::namedWindow("Image Window");
	int input_size = 0;
	int hw = input_width*input_high;
	unsigned char *rawdata = NULL;

	//int input_size = 0;
	//int hw = input_width*input_high;
	//unsigned char *rawdata = NULL;
	//cv::Mat temp_img(MODEL_WIDTH, MODEL_HEIGHT, CV_8UC1);

	//inData.input_type = RGB24_RAW_DATA;
	//inData.input_index = 0;
	//inData.size = input_width * input_high * input_channel;
	
	std::string pipeline = std::string("v4l2src device=/dev/media0 io-mode=mmap ! video/x-raw,format=NV12,width=") + std::to_string(default_width) + std::string(",height=") + std::to_string(default_height) + std::string(",framerate=30/1 ! videoconvert ! appsink");
	cv::VideoCapture cap(pipeline);

	gettimeofday(&time_start, 0);

	if (!cap.isOpened()) {
		std::cout << "capture device failed to open!" << std::endl;
		cap.release();
		exit(-1);
	}

	while(1) {
		gettimeofday(&time_start, 0);
		
		if (!cap.read(img)) {
			std::cout<<"Capture read error"<<std::endl;
			break;
		}

		///set_input
		cv::Mat temp_img(MODEL_WIDTH, MODEL_HEIGHT, CV_8UC1),normalized_img;
		
		int width = img.cols;
		int height = img.rows;
		cv::resize(img, temp_img, cv::Size(MODEL_WIDTH, MODEL_HEIGHT));
		temp_img.convertTo(normalized_img, CV_32FC1, 1.0 / 255.0);
		rawdata = normalized_img.data;
		inData.input_type = RGB24_RAW_DATA;
		inData.input = rawdata;
		inData.input_index = 0;
		inData.size = input_width * input_high * input_channel * sizeof(float);
		//ret = aml_module_input_set(qcontext, &inData);
		
		//inData.input = temp_img.data;

//		gettimeofday(&time_start, 0);
		ret = aml_module_input_set(qcontext, &inData);

		if (ret != 0) {
			printf("set_input fail.\n");
			return -1;
		}

		outdata = (nn_output*)aml_module_output_get(qcontext, outconfig);
		if (outdata == NULL) {
			printf("aml_module_output_get error\n");
			return -1;
		}
		printf("Output: ");
		
		gettimeofday(&time_end, 0); // time_end was never set, so the FPS math below read uninitialized memory
		++frames;
		total_time += (float)((time_end.tv_sec - time_start.tv_sec) + (time_end.tv_usec - time_start.tv_usec) / 1000.0f / 1000.0f);
		if (total_time >= 1.0f) {
			int fps = (int)(frames / total_time);
			printf("Inference FPS: %i\n", fps);
			frames = 0;
			total_time = 0;
		}
		
		char result[35] = {0};
		int result_len = 0;
		printf("Start DenseNet CTC");
		postprocess_densenet_ctc(outdata, result, &result_len);
		printf("Result of DenseNet CTC");
		printf("%d\n", result_len);
		printf("%s\n", result);
		
		cv::imshow("Image Window",img);
		cv::waitKey(1);
	}
	
	return ret;
}

int destroy_network(void *qcontext) {

	int ret = aml_module_destroy(qcontext);
	return ret;
}

int main(int argc,char **argv)
{
	int c;
	int ret = 0;
	void *context = NULL;
	char *model_path = NULL;
	//char *input_data = NULL;
	input_width = MODEL_WIDTH;
	input_high = MODEL_HEIGHT;
	//input_channel = 1;
	input_channel = 1;

	while ((c = getopt_long(argc, argv, "m:w:h:H", longopts, NULL)) != -1) {
		switch (c) {
			case 'w':
				default_width = atoi(optarg);
				break;
				
			case 'h':
				default_height = atoi(optarg);
				break;
				
			case 'm':
				model_path = optarg;
				break;
				
			//case 'p':
			//	input_data = optarg;
			//	break;

			default:
				printf("%s[-m model path] [-w camera width] [-h camera height] [-H]\n", argv[0]);
				exit(1);
		}
	}

	context = init_network_file(model_path);
	if (context == NULL) {
		printf("init_network fail.\n");
		return -1;
	}

	//ret = set_input(context, input_data);

	//if (ret != 0) {

	//	printf("set_input fail.\n");
	//	return -1;
	//}

	ret = run_network(context);

	if (ret != 0) {
		printf("run_network fail.\n");
		return -1;
	}

	ret = destroy_network(context);

	if (ret != 0) {
		printf("destroy_network fail.\n");
		return -1;
	}

	return ret;
}

It’s still not working. When I run

sudo ./densenet_ctc -m ../data/densenet_ctc_int16.adla

I got the following:

adla usr space 1.2.0.5
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
WARNING: Some incorrect rendering might occur because the selected Vulkan device (Mali-G52) doesn't support base Zink requirements: feats.features.logicOp feats.features.fillModeNonSolid feats.features.shaderClipDistance 
[API:aml_v4l2src_connect:271]Enter, devname : /dev/media0
func_name: aml_src_get_cam_method
initialize func addr: 0x7f8d39167c
finalize func addr: 0x7f8d391948
start func addr: 0x7f8d39199c
stop func addr: 0x7f8d391a4c
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 1, type 0x20000, name isp-csiphy), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 4, type 0x20000, name isp-adapter), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 7, type 0x20000, name isp-test-pattern-gen), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 9, type 0x20000, name isp-core), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 20, type 0x20001, name imx415-0), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 24, type 0x10001, name isp-ddr-input), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 28, type 0x10001, name isp-param), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 32, type 0x10001, name isp-stats), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 36, type 0x10001, name isp-output0), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 40, type 0x10001, name isp-output1), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 44, type 0x10001, name isp-output2), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 48, type 0x10001, name isp-output3), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id 52, type 0x10001, name isp-raw), ret 0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:235:carm_src_is_usb]carm_src_is_usb:info(id -2147483596, type 0x0, name ), ret -1
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:239:carm_src_is_usb]carm_src_is_usb:error Invalid argument
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:79:cam_src_select_socket]select socket:/tmp/camctrl0.socket
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:103:cam_src_obtain_devname]fork ok, pid:8144
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:103:cam_src_obtain_devname]fork ok, pid:0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:107:cam_src_obtain_devname]execl /usr/bin/camctrl
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:925:main][camctrl.cc:main:925]

[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:889:parse_opt]media device name: /dev/media0
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:898:parse_opt]Server socket: /tmp/camctrl0.socket
Opening media device /dev/media0
Enumerating entities
Found 13 entities
Enumerating pads and links
mediaStreamInit[35]: mediaStreamInit ++. 

mediaStreamInit[39]: media devnode: /dev/media0
mediaStreamInit[56]: ent 0, name isp-csiphy 
mediaStreamInit[56]: ent 1, name isp-adapter 
mediaStreamInit[56]: ent 2, name isp-test-pattern-gen 
mediaStreamInit[56]: ent 3, name isp-core 
mediaStreamInit[56]: ent 4, name imx415-0 
mediaStreamInit[56]: ent 5, name isp-ddr-input 
mediaStreamInit[56]: ent 6, name isp-param 
mediaStreamInit[56]: ent 7, name isp-stats 
mediaStreamInit[56]: ent 8, name isp-output0 
mediaStreamInit[56]: ent 9, name isp-output1 
mediaStreamInit[56]: ent 10, name isp-output2 
mediaStreamInit[56]: ent 11, name isp-output3 
mediaStreamInit[56]: ent 12, name isp-raw 
mediaStreamInit[96]: get  lens_ent fail
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video63 ok, fd 5 
 
mediaStreamInit[151]: mediaStreamInit open video0 fd 5 
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video64 ok, fd 6 
 
mediaStreamInit[155]: mediaStreamInit open video1 fd 6 
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video65 ok, fd 7 
 
mediaStreamInit[159]: mediaStreamInit open video2 fd 7 
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video66 ok, fd 8 
 
mediaStreamInit[163]: mediaStreamInit open video3 fd 8 
mediaStreamInit[172]: media stream init success
fetchPipeMaxResolution[27]: find matched sensor configs 3840x2160
media_set_wdrMode[420]: media_set_wdrMode ++ wdr_mode : 0 

media_set_wdrMode[444]: media_set_wdrMode success --

media_set_wdrMode[420]: media_set_wdrMode ++ wdr_mode : 4 

media_set_wdrMode[444]: media_set_wdrMode success --

[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:374:link_and_activate_subdev]link and activate subdev successfully
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:407:media_stream_config]config media stream successfully
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video62 ok, fd 13 
 
mediaLog[30]: VIDIOC_QUERYCAP: success 
 
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:172:check_capability]entity[isp-stats] -> video[/dev/video62], cap.driver:aml-camera, capabilities:0x85200001, device_caps:0x5200001
mediaLog[30]: v4l2_video_open: open subdev device node /dev/video61 ok, fd 14 
 
mediaLog[30]: VIDIOC_QUERYCAP: success 
 
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:172:check_capability]entity[isp-param] -> video[/dev/video61], cap.driver:aml-camera, capabilities:0x85200001, device_caps:0x5200001
mediaLog[30]: set format ok, ret 0.
 
mediaLog[30]: set format ok, ret 0.
 
mediaLog[30]:  request buf ok
 
mediaLog[30]:  request buf ok
 
mediaLog[30]: query buffer success 
 
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:546:isp_alg_param_init]isp stats query buffer, length: 262144, offset: 0
mediaLog[30]: query buffer success 
 
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:546:isp_alg_param_init]isp stats query buffer, length: 262144, offset: 262144
mediaLog[30]: query buffer success 
 
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:546:isp_alg_param_init]isp stats query buffer, length: 262144, offset: 524288
mediaLog[30]: query buffer success 
 
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:546:isp_alg_param_init]isp stats query buffer, length: 262144, offset: 786432
mediaLog[30]: query buffer success 
 
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:568:isp_alg_param_init]isp param query buffer, length: 262144, offset: 0
alg2User func addr: 0x7f93c18ed8
alg2Kernel func addr: 0x7f93c18f08
algEnable func addr: 0x7f93c18d70
algDisable func addr: 0x7f93c18e90
algFwInterface func addr: 0x7f93c19008
matchLensConfig[43]: LKK: fail to match lensConfig

cmos_get_ae_default_imx415[65]: cmos_get_ae_default

cmos_get_ae_default_imx415[116]: cmos_get_ae_default++++++

cmos_get_ae_default_imx415[65]: cmos_get_ae_default

cmos_get_ae_default_imx415[116]: cmos_get_ae_default++++++

aisp_enable[984]: tuning device not exist!

aisp_enable[987]: 3a commit b56e430e80b995bb88cecff66a3a6fc17abda2c7 

cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 16351232, 0, 0, 0

mediaLog[30]: streamon   success 
 
mediaLog[30]: streamon   success 
 
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:650:isp_alg_param_init]Finish initializing amlgorithm parameter ...
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:971:main]UNIX domain socket bound
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:977:main]Accepting connections ...
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:122:cam_src_obtain_devname]udp_sock_create
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src common/common.c:70:udp_sock_create][841421972][/tmp/camctrl0.socket] start connect
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:124:cam_src_obtain_devname]udp_sock_recv
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:985:main]connected_sockfd: 20
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:989:main]video_dev_name: /dev/video63
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:282:cam_src_initialize]obtain devname: /dev/video63
devname : /dev/video63
driver : aml-camera
device : Amlogic Camera Card
bus_info : platform:aml-cam
version : 331657
error tvin-port use -1 
[API:aml_v4l2src_streamon:373]Enter
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camsrc.c:298:cam_src_start]start ...
[API:aml_v4l2src_streamon:376]Exit
[2024-10-24 08:11:54]  DEBUG  [amlv4l2src camctrl.cc:860:process_socket_thread]receive streamon notification
cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 1224, 1224

cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 16568320, 16568320, 16568320, 16568320

cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 1428, 1428

cmos_inttime_calc_table_imx415[150]: cmos_inttime_calc_table: 18411520, 18411520, 18411520, 18411520

cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 1836, 1836

[ WARN:0@1.901] global ./modules/videoio/src/cap_gstreamer.cpp (1405) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
double free or corruption (out)
[2024-10-24 08:11:55]  DEBUG  [amlv4l2src camctrl.cc:914:Signalhandler]enter camctrl Signalhandler: 15
[2024-10-24 08:11:55]  DEBUG  [amlv4l2src camctrl.cc:917:Signalhandler]exit camctrl Signalhandler: 15
Aborted

Approach
My approach is to run VideoCapture and infer DenseNet CTC on each frame. Is this approach correct?

Hello @JietChoo ,

A model’s input shape is decided when the model is built, and in most cases it cannot be changed.

From the log, it is a MIPI camera problem. I have asked my colleague; he will help you tomorrow.


Thank you Louis, I will wait for your colleague’s reply. I need to get this realtime text recognition working on the VIM4 as soon as possible :frowning:


If I comment out this line of code, the video feed works fine. However, it will have the issue below:

ImageE NN_SDK:[aml_adla_inputs_set_off:756]Error: get input fail.
E NN_SDK:[aml_adla_inputs_set_off:756]Error: get input fail.
Output: Inference FPS: 0
Start DenseNet CTC
Result of DenseNet CTC
0

(this block repeats identically for every frame)

Hello @JietChoo ,

Which VIM4 kernel do you use? I will try to reproduce your problem.

I’m using this
vim4-ubuntu-24.04-gnome-linux-5.15-fenix-1.7.1-240930.img.xz

Hi, I’m using the following code and the video frames are displaying:

/****************************************************************************
*
*    Copyright (c) 2019  by amlogic Corp.  All rights reserved.
*
*    The material in this file is confidential and contains trade secrets
*    of amlogic Corporation. No part of this work may be disclosed,
*    reproduced, copied, transmitted, or used in any way for any purpose,
*    without the express written permission of amlogic Corporation.
*
***************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "nn_sdk.h"
#include "nn_util.h"
#include "postprocess_util.h"
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/types_c.h>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc/imgproc_c.h>
#include <getopt.h>
#include <sys/time.h>

#define MODEL_WIDTH 280
#define MODEL_HEIGHT 32

//#define MODEL_WIDTH 640
//#define MODEL_HEIGHT 640

#define DEFAULT_WIDTH 1280
#define DEFAULT_HEIGHT 720


int default_width = DEFAULT_WIDTH;
int default_height = DEFAULT_HEIGHT;

struct option longopts[] = {
	{ "device",         required_argument,  NULL,   'd' },
	{ "model",          required_argument,  NULL,   'm' },
	{ "width",          required_argument,  NULL,   'w' },
	{ "height",         required_argument,  NULL,   'h' },
	{ "help",           no_argument,        NULL,   'H' },
	{ 0, 0, 0, 0 }
};

int device = 0;

nn_input inData;

cv::Mat img;

aml_module_t modelType;

static int input_width,input_high,input_channel;

static const char *coco_names[] = {
	"person","bicycle","car","motorbike","aeroplane","bus","train",
	"truck","boat","traffic light","fire hydrant","stop sign","parking meter",
	"bench","bird","cat","dog","horse","sheep","cow","elephant","bear","zebra",
	"giraffe","backpack","umbrella","handbag","tie","suitcase","frisbee","skis",
	"snowboard","sports ball","kite","baseball bat","baseball glove","skateboard",
	"surfboard","tennis racket","bottle","wine glass","cup","fork","knife","spoon",
	"bowl","banana","apple","sandwich","orange","broccoli","carrot","hot dog","pizza",
	"donut","cake","chair","sofa","pottedplant","bed","diningtable","toilet","tvmonitor",
	"laptop","mouse","remote","keyboard","cell phone","microwave","oven","toaster","sink",
	"refrigerator","book","clock","vase","scissors","teddy bear","hair drier","toothbrush"
};

typedef enum _amlnn_detect_type_ {
	Accuracy_Detect_Yolo_V3 = 0
} amlnn_detect_type;



void* init_network_file(const char *mpath) {

	void *qcontext = NULL;

	aml_config config;
	memset(&config, 0, sizeof(aml_config));

	config.nbgType = NN_ADLA_FILE;
	config.path = mpath;
	config.modelType = ADLA_LOADABLE;
	config.typeSize = sizeof(aml_config);

	qcontext = aml_module_create(&config);
	if (qcontext == NULL) {
		printf("amlnn_init is fail\n");
		return NULL;
	}

	if (config.nbgType == NN_ADLA_MEMORY && config.pdata != NULL) {
		free((void*)config.pdata);
	}

	return qcontext;
}

static cv::Scalar obj_id_to_color(int obj_id) {

	int const colors[6][3] = { { 1,0,1 }, { 0,0,1 }, { 0,1,1 }, { 0,1,0 }, { 1,1,0 }, { 1,0,0 } };
	int const offset = obj_id * 123457 % 6;
	int const color_scale = 150 + (obj_id * 123457) % 100;
	cv::Scalar color(colors[offset][0], colors[offset][1], colors[offset][2]);
	color *= color_scale;
	return color;
}

int run_network(void *qcontext) {

	int ret = 0, frames = 0;
	nn_output *outdata = NULL;
	struct timeval time_start, time_end;
	float total_time = 0;

	aml_output_config_t outconfig;
	memset(&outconfig, 0, sizeof(aml_output_config_t));

//	ret = set_input(context, input_data);

	outconfig.format = AML_OUTDATA_FLOAT32;//AML_OUTDATA_RAW or AML_OUTDATA_FLOAT32
	outconfig.typeSize = sizeof(aml_output_config_t);
	outconfig.order = AML_OUTPUT_ORDER_NCHW;

	obj_detect_out_t yolov3_detect_out;

	int classid = 0;
	float prob = 0;
	int left = 0, right = 0, top = 0, bot = 0;

	cv::Point pt1;
	cv::Point pt2;

	int baseline;

	cv::namedWindow("Image Window");


	//int input_size = 0;
	//int hw = input_width*input_high;
	//unsigned char *rawdata = NULL;
	//cv::Mat temp_img(MODEL_WIDTH, MODEL_HEIGHT, CV_8UC1);

	//inData.input_type = RGB24_RAW_DATA;
	//inData.input_index = 0;
	//inData.size = input_width * input_high * input_channel;
	
	///set_input from densenet_ctc
	int input_size = 0;
	int hw = input_width*input_high;
	unsigned char *rawdata = NULL;
	cv::Mat temp_img(MODEL_WIDTH, MODEL_HEIGHT, CV_8UC1),normalized_img;

	std::string pipeline = std::string("v4l2src device=/dev/media0 io-mode=mmap ! video/x-raw,format=NV12,width=") + std::to_string(default_width) + std::string(",height=") + std::to_string(default_height) + std::string(",framerate=30/1 ! videoconvert ! appsink");
	cv::VideoCapture cap(pipeline);
	
	//cv::VideoCapture cap(device);
	//cap.set(cv::CAP_PROP_FRAME_WIDTH, default_width);
	//cap.set(cv::CAP_PROP_FRAME_HEIGHT, default_height);

	gettimeofday(&time_start, 0);

	if (!cap.isOpened()) {
		std::cout << "capture device failed to open!" << std::endl;
		cap.release();
		exit(-1);
	}

	while(1) {

		gettimeofday(&time_start, 0);
		if (!cap.read(img)) {
			std::cout<<"Capture read error"<<std::endl;
			break;
		}

		//cv::resize(img, temp_img, cv::Size(MODEL_WIDTH, MODEL_HEIGHT));
		//cv::cvtColor(temp_img, temp_img, cv::COLOR_RGB2BGR);

		//inData.input = temp_img.data;
		
		///set_input from densenet_ctc
		int width = img.cols;
		int height = img.rows;
		cv::resize(img, temp_img, cv::Size(MODEL_WIDTH, MODEL_HEIGHT));
		temp_img.convertTo(normalized_img, CV_32FC1, 1.0 / 255.0);

		rawdata = normalized_img.data;
		inData.input_type = BINARY_RAW_DATA;
		inData.input = rawdata;
		inData.input_index = 0;
		inData.size = input_width * input_high * input_channel * sizeof(float);

//		gettimeofday(&time_start, 0);
		ret = aml_module_input_set(qcontext, &inData);

		if (ret != 0) {
			printf("set_input fail.\n");
			return -1;
		}

		outdata = (nn_output*)aml_module_output_get(qcontext, outconfig);
		if (outdata == NULL) {
			printf("aml_module_output_get error\n");
			return -1;
		}
		gettimeofday(&time_end, 0);
		++frames;
		total_time += (float)((time_end.tv_sec - time_start.tv_sec) + (time_end.tv_usec - time_start.tv_usec) / 1000.0f / 1000.0f);
		if (total_time >= 1.0f) {
			int fps = (int)(frames / total_time);
			printf("Inference FPS: %i\n", fps);
			frames = 0;
			total_time = 0;
		}
		
		char result[35] = {0};
		int result_len = 0;
	
		postprocess_densenet_ctc(outdata, result, &result_len);
	
		printf("Result\n");
		printf("%d\n", result_len);
		printf("%s\n", result);


		cv::imshow("Image Window",img);
		cv::waitKey(1);
	}

    return ret;
}

int destroy_network(void *qcontext) {

	int ret = aml_module_destroy(qcontext);
	return ret;
}

int main(int argc,char **argv)
{
	int c;
	int ret = 0;
	void *context = NULL;
	char *model_path = NULL;
	input_width = MODEL_WIDTH;
	input_high = MODEL_HEIGHT;
	input_channel = 1;

	while ((c = getopt_long(argc, argv, "d:m:w:h:H", longopts, NULL)) != -1) {
		switch (c) {
			case 'd':
				device = atoi(optarg);
				break;

			case 'w':
				default_width = atoi(optarg);
				break;

			case 'h':
				default_height = atoi(optarg);
				break;

			case 'm':
				model_path = optarg;
				break;

			default:
				printf("%s [-d device] [-m model path] [-w camera width] [-h camera height]  [-H]\n", argv[0]);
				exit(1);
		}
	}

	context = init_network_file(model_path);
	if (context == NULL) {
		printf("init_network fail.\n");
		return -1;
	}

	ret = run_network(context);

	if (ret != 0) {
		printf("run_network fail.\n");
		return -1;
	}

	ret = destroy_network(context);

	if (ret != 0) {
		printf("destroy_network fail.\n");
		return -1;
	}

	return ret;
}

I ran

sudo ./yolov8n_cap -m ../data/densenet_ctc_int16.adla

However, the results that the DenseNet CTC inference prints for each frame are gibberish:

@@
Result
2
@@
Result
2
@@
Result
2
@@
Result
2
@@
Result
2
@@
Result
1
@
Result
2
@@
Result
2
@@
cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 3264, 3264

Result
2
@@
cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 3060, 3060

cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 2856, 2856

Result
2
@@
cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 2652, 2652

Result
2
@@
Result
2
@@
Result
2
@@
Result
1
@
Result
1
@
Result
0

Result
2
$~
Result
2
$~
Result
2
$~
Result
2
@~
Result
2
@~
Result
3
@~~
Result
5
$~~~~
Result
5
$~~~~
Result
5
$~~~~
Inference FPS: 27
Result
3
$~~
Result
3
@~~
Result
4
@~~s
Result
1
@
Result
2
@s
Result
1
@
cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 3468, 3468

cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 4080, 4080

Result
1
@
cmos_again_calc_table_imx415[125]: cmos_again_calc_table: 4284, 4284

Result
1
@
Result
1
@
Result
1
@
Result
1
@
Result
1
@
Result
1
@
Result
1
@
Result
1
$
Result
1
$
Result
1
@
Result
1
@
Result


Hello Louis, any updates?

Hello @JietChoo ,

Deploying a model on the board requires some basic knowledge of deep learning. Text recognition needs two models, which makes it difficult for a beginner.

The first stage detects the text in the image. The result looks like the image below.

Then the detected text regions are cut out of the image, like below.

Finally, the characters in each cut-out image are recognized. DenseNet CTC is a character recognition model, and it needs input like the pictures above. That is why DenseNet CTC only has a still-image demo.

In your log, the model inference itself is successful; the result is shown in the red boxes. However, because the input violates the model's design, the output is wrong.
(image)

Since you are new to this, we will add a text recognition demo to VIM4 KSNN. It will only be a demo, so the accuracy will not be very high, and it will take about a month.

Thank you so much Louis. Yes, I'm very new to deep learning and still trying to figure things out.

Yes, I understand DenseNet is for text recognition, so I will need a model for text detection. I tried converting one of the text detection models from OpenCV Zoo to adla format, but I was unable to do it :frowning:

Is there any ready text detection model in adla format?