Khadas VIM3 custom one-class YOLOv3 inference issue

@Akkisony The NPU is integrated in the CPU, so there is no way to test the power consumption separately

@Frank Thanks again!
Can you please suggest what other parameters I can measure in order to compare the NPU with the Coral TPU?
Currently I have inference time and accuracy. I thought power consumption would also be an important parameter.
Can you suggest any other parameters to compare?

@Akkisony I am not familiar with this part of the hardware, and I do not have a good way to measure power consumption. Maybe you can measure the power consumption of an idle VIM3, then measure it again while running the NPU application to see how much the NPU adds. Then use the same method and software to measure the Coral.

@Frank Thanks for your input.

@Frank I just need a few clarifications.

  1. Which image format does YOLOv3 take as input: RGB or BGR?
  2. When we use the SDK to quantize the model, which quantization technique is applied?

Thanks for your time.

@Frank I am using gdb to track the error. I found that TIM-VX is the issue. Do you have any idea where I should look? I initially thought the program was not able to read the Tengine model, but I think it does read it correctly, since no error message about the model is printed. Please find the screenshot below. Any input would help me solve the issue.

I traced the program and found that the snippet below is being executed.

struct device* selected_device = find_device_via_name(dev_name);
if (NULL == selected_device)
{
    TLOG_ERR("Tengine: Device(%s) is not found(may not registered).\n", dev_name);
    return -1;
}

Can you share any input on what might be missing in this case?

You can find this in the aml_npu_sdk docs.

@Frank Thank you for the confirmation.

Can you please help me with Khadas VIM3 custom one-class YOLOv3 inference issue - #21 by Akkisony

Thanks in advance.

@Akkisony What program are you running? Is it the official demo?

@Frank Yes, it is the official demo, but I have altered the code a little to be compatible with our software.
I just want to know when a TIM-VX error occurs. Do I get one when the model is not loaded correctly, or are there other reasons?
Thank you!

@Akkisony I’m not sure why this problem occurs; I haven’t seen it before. What have you modified? Can you share it so I can take a look?

@Frank Can you please let me know how I can measure the accuracy loss of the model after converting it from darknet to Tengine format? There is always some loss in accuracy after quantization, and I need to know whether there is a way to measure it. @Frank
@alcohol
Thanks

@Akkisony Maybe you can find it in https://github.com/OAID/Tengine/tree/tengine-lite/doc

@Frank Hi, can you please briefly explain how you calculate the ‘top’ and ‘left’ coordinates of the bounding box?

if (cls >= 0)
{
    box b = dets[i].bbox;
    int left = (b.x - b.w / 2.) * frame.cols;
    int top = (b.y - b.h / 2.) * frame.rows;
    if (top < 30)
    {
        top = 30;
        left += 10;
    }

#if DEBUG_OPTION
    fprintf(stderr, "left = %d,top = %d\n", left, top);
#endif
}

Does ‘top’ mean that the bounding box’s top edge is ‘top’ pixels below the top of the image?
Does ‘left’ mean that the box’s left edge is ‘left’ pixels from the left edge of the image?

I need this small clarification. Thanks in advance.

@Akkisony

  1. You get a 1920x1080 picture from the camera.
  2. It is resized to 416x416 for NPU inference.
  3. Assuming an object is recognized, the center point, width, and height of the object area are obtained.
  4. Next, on the 416x416 picture, the top-left corner point and the w and h of this area are calculated from its center position, width, and height.
  5. Finally, using the initial resize ratio, the coordinates on the original picture are calculated; that is the information drawn on the picture through OpenCV.

The code above is step 4.


@Frank Thank you for the explanation with the diagram. :slight_smile:

I have trained a model to detect a single class using YOLOv3. The detection seems fine on the CPU; however, on the NPU I have issues with non-max suppression (I get 2601 detections on a single image). Can you share your experience? It would help me solve this problem.

Below are the parameters:
const int classes = 1;
const float thresh = 0.5;
const float hier_thresh = 0.5;
const float nms = 0.80;
const int numBBoxes = 5;
const int relative = 1;
const char *coco_names[1] = {"battery"};
float biases[18] = {10, 13, 16, 30, 33, 23, 30, 61, 62, 45, 59, 119, 116, 90, 156, 198, 373, 326};

I even increased the nms value to 0.80, yet I have the same issue. Please share any input that could help me solve it! Thanks in advance.

Please find the sample output.
Repeat 1 times, thread 1, avg time 85.24 ms, max_time 85.24 ms, min_time 85.24 ms

num_detections,2601
0: 100%
left = 245,top = 30
0: 100%
left = 253,top = 30
0: 100%
left = 269,top = 30
0: 100%
left = 385,top = 35
0: 100%
left = 102,top = 59
0: 100%
left = -23668,top = 51
0: 100%
left = 110,top = 51
0: 100%
left = -1333402262,top = 58
0: 100%

@Akkisony Many people have reported problems with single-class YOLO models, so I will train a single-class YOLO model this week to see where the problem lies.

@Frank Thank you. Please update me if you find a solution.

@Akkisony I successfully converted a hand-detection model. I will release it this week or next. You can follow the forum or the docs at that time.

@Frank Looking forward to the release. I hope single-class detection now works with YOLOv3.
Thank you