NPU Mobilenet SSD v2 demo and source code

Maybe the issue is because the model is already quantized? Should I export a non-quantized model instead?

You look to be on the right track; maybe this sample helps you:

~/tensorflow/models$ python3 research/object_detection/export_tflite_ssd_graph.py --pipeline_config_path=research/object_detection/test_data/pipeline.config --trained_checkpoint_prefix research/object_detection/test_data/model.ckpt-93313 --output_directory train/fortflite/ --add_postprocessing_op=true

Hi Omer, this is the issue: “TensorFlow binary was not compiled to use: AVX2 FMA”.
I had the same problem; initially it was due to the fact that my x86 Ubuntu machine was running on an old processor which didn’t support AVX2, and the TensorFlow binary compiled for Acuity requires it. I installed Ubuntu on a virtual machine on my desktop, which has a newer processor, and it worked.
What processor do you use?
See here for more details on AVX: python - Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2 - Stack Overflow


I have the same problem. I tried many conversions; every time it makes a folder, and when I compiled on the VIM3 all the files were made.
When I converted the Inception v3 model, it worked.
But when I converted MobileNet SSD v1 and MobileNet SSD v2, all the files were made and converted,
and this warning message appears.
And when I open the application, the results are wrong. I inspected all the code and found that my output levels (asymmetric quantization) have a problem.
My CPU is an i7-6700, Ubuntu 18.04 x64, GPU GTX 960M, and I use CUDA 10.0.

Larry, when I converted MobileNet SSD, in the script "./1_quantize_model"
I used the parameter -source-file ./data/validation_tf.txt,
and my text file contains: ./06AF5418._9_9_.JPG, 208
Could this be the problem? Everything works, but the output results are too small.

Thanks Larry, I did notice the message but ignored it since it’s only an info message, and online reading led me to believe it mostly has to do with performance. I also noticed your comment on the CPU model, but since I am running an Intel i7 I thought it was modern enough. Anyway, thanks so much for the input; I will try to convert on a different machine.
EDIT: I just verified that for me the problem was indeed not with the CPU, but rather with trying to convert a quantized model. I was able to convert a model that was not quantized during training.


I have a big problem :frowning: @larrylart
Everything works perfectly when I run your example with your “mobilenet_ssd_v2a.nb” network file,
but I get very wrong results when I run it with my own exported network. I used the model from this link:
http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz
I converted this model exactly as you described; I used
$convert_tf \
    --tf-pb ./model/ssd_mobilenet_v2.pb \
    --inputs 'normalized_input_image_tensor' \
    --input-size-list '300,300,3' \
    --outputs 'raw_outputs/box_encodings concat_1' \
    --net-output ${NAME}.json \
    --data-output ${NAME}.data

I also tried with both tflite_graph.pb and frozen_inference_graph.pb;
each time I exported with this script:
python3 research/object_detection/export_tflite_ssd_graph.py --pipeline_config_path=research/object_detection/ssd_mobilenet_v2/pipeline.config --trained_checkpoint_prefix research/object_detection/ssd_mobilenet_v2/model.ckpt --output_directory train/ --add_postprocessing_op=false
I have tried many options, but every time the application works perfectly with your “mobilenet_ssd_v2.nb”, and every time my converted network gives wrong results.
When I inspected the output results, they were too low.
I even changed the zero-point values in mobilenet_ssd_v2.c.
What do you think the problem might be? You were able to convert it wonderfully.
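
For reference, the zero point enters through the standard asymmetric dequantization formula, real = scale * (q - zero_point), so a wrong zero point or scale shrinks or shifts every output. A minimal sketch of that step, with placeholder values (the actual buffer and parameter names in mobilenet_ssd_v2.c may differ):

#include <cstddef>
#include <cstdint>
#include <vector>

// Asymmetric dequantization: real = scale * (q - zero_point).
// 'scale' and 'zeroPoint' must match what the converter generated for this
// output tensor; the values the demo uses live in mobilenet_ssd_v2.c.
std::vector<float> dequantize(const uint8_t* q, size_t count,
                              float scale, int zeroPoint)
{
    std::vector<float> real(count);
    for (size_t i = 0; i < count; i++)
        real[i] = scale * (static_cast<int>(q[i]) - zeroPoint);
    return real;
}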

You need to use tflite_graph.pb (the output of export_tflite_ssd_graph.py). Also make sure you copied the exported mobilenet_ssd_v2.c and mobilenet_ssd_v2.h and replaced the references in the example code and makefile, or just rename them to mobilenet_ssd_v2a.c/.h.
I think I had a similar issue at one point when I changed the name of the output and forgot to replace all the references in the sample app.

Hello Larry

I am planning to get a Khadas VIM3. Could you let me know which hardware you have that does the 72 IFPS: is it the VIM3L, VIM3 Basic, or VIM3 Pro? Does the RAM size play any part in NPU inference speed?

How do you rate this board vs. the Jetson Nano, in terms of NPU/GPU performance alone?

Do you recommend a heat sink and fan for the board?

Can you please upload a video demo of your code in action?

Is the NDA issue for NPU development resolved? Is the API available for everyone who buys a board?

Hi Larry, thanks for all of your help. I was able to convert the model (MobileNet SSD v1, depth_mul=0.25, 300x300) and run it. The results look strange: all bboxes are at the top of the image. Any idea why? I suspect it has to do with the anchors, but I am using the generated file as instructed (and I got the same anchors as yours, since it is the same dims, 300x300). Maybe this has to do with the depth multiplier?
this is what I get:
[image: obj_detect_note]
Thanks, Omer

EDIT:
The problem was with the code to get class_id and idx in aml_worker.cpp;
the number 91 is the number of classes in the labels file (which was not clear to me). Once I realized it, I got it to work, and I now get more reasonable bboxes; see the sketch below.
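
For anyone hitting the same thing, here is a minimal sketch of the indexing, assuming the flat score buffer is box-major with 91 classes per box (the names are illustrative, not the exact code in aml_worker.cpp):

#include <vector>

// NUM_CLASSES must match the number of entries in the labels file (91 for COCO).
const int NUM_CLASSES = 91;

// scores holds numBoxes * NUM_CLASSES values, one score per (box, class) pair;
// pick the best class for each box.
void bestClassPerBox(const float* scores, int numBoxes,
                     std::vector<int>& bestClass, std::vector<float>& bestScore)
{
    bestClass.assign(numBoxes, 0);
    bestScore.assign(numBoxes, -1.0f);
    for (int i = 0; i < numBoxes * NUM_CLASSES; i++)
    {
        int idx = i / NUM_CLASSES;       // which anchor/box this score belongs to
        int class_id = i % NUM_CLASSES;  // which class within that box
        if (scores[i] > bestScore[idx])
        {
            bestScore[idx] = scores[i];
            bestClass[idx] = class_id;
        }
    }
}

If you train with your own labels file, NUM_CLASSES has to change to match it.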


@ahmetkemal, maybe you have a similar issue to the one I encountered: if you change the labels file, you will need to update the code in aml_worker.cpp that gets idx and class_id (the 91). I hope this helps - Omer.

Where can I download the AML SDK?

@DragonViVi You can apply for it there.

It’s a problem with this toolkit:

@DragonViVi What were you compiling that caused this? Please post the complete output.

Hi larrylart:
About: “Note that for mobilenet SSD I had to use concat_1 instead of raw_outputs/class_predictions since I got an error for op RealDiv not supported and I implemented instead the last part in the code.”
Have you tried aml_npu_sdk_6.4.0.10? In that version, the op RealDiv is working.
./bin/convertensorflow --tf-pb ./process_result/stuff/tflite_graph-no-postprocessing.pb --inputs normalized_input_image_tensor --outputs 'raw_outputs/box_encodings raw_outputs/class_predictions' --net-output ./process_result/tflite_graph-no-postprocessing.json --data-output ./process_result/tflite_graph-no-postprocessing.data --input-size-list '300,300,3'
The command above works, so can the code float logi = (float) 1./(1. + exp(-m_pOutputBufferScore[i])); simply be changed to float logi = (float)m_pOutputBufferScore[i];?
One more question: in your code I didn’t see any preprocessing for the image other than resizing. The OpenCV image format is BGR; isn’t it necessary to convert it to RGB?
Also, how did you modify 1_quantize_model.sh? Specifically, which --channel-mean-value parameters did you use?
I am a new guy, hoping you can help!
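
For what it’s worth, the line you quote is just the logistic sigmoid, which maps raw class logits to [0, 1] scores; whether you can drop it depends on whether the graph output you export already applies it. A minimal sketch of both steps, assuming a float input tensor and a 300x300 network input (illustrative names, not the exact demo code):

#include <cmath>
#include <opencv2/opencv.hpp>

// Resize to the 300x300 network input and convert OpenCV's default BGR
// channel order to RGB.
cv::Mat preprocess(const cv::Mat& frame)
{
    cv::Mat resized, rgb;
    cv::resize(frame, resized, cv::Size(300, 300));
    cv::cvtColor(resized, rgb, cv::COLOR_BGR2RGB);
    return rgb;
}

// The score path needs the sigmoid exactly once: either inside the exported
// graph or here in the app code, not both.
float sigmoid(float logit)
{
    return 1.0f / (1.0f + std::exp(-logit));
}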


Hi, can you share the process file of your converted model? Thank you very much!
Or could you share your JSON file with me?

Hi shihaozhang:
Are you asking me? I guess you are Chinese; what is the problem you ran into?

I have solved the problem, thanks!

@Frank @moderators I would like to convert my MobileNet v2 SSD using the SDK and run the model on the VIM3’s NPU. Please let me know if there is a guideline for this. I see that there is some information on converting YOLOv3, but for MobileNet and other architectures there is no information at all. Can you please help me with converting other custom models?