NPU Mobilenet SSD v2 demo and source code

Thanks for sharing your plan.
My plan is to use a VIM3 and a touchscreen to emulate comma.ai's openpilot (open source on GitHub) without the car-control part, so it can do lane detection and departure warning.

1 Like

Good news, it's done on VIM3 :smiley: thanks again Larry

1 Like

Larry, thanks for sharing this repo. I am trying to convert MobileNet SSD (v1) and failing. This is what I run:

…/bin/convertensorflow --tf-pb /media/omer/DATA1/Data/10_classes_300X300/checkpoint/out/tflite_graph.pb --inputs normalized_input_image_tensor --input-size-list '300,300,3' --outputs 'raw_outputs/box_encodings concat_1' --net-output /media/omer/DATA1/Data/10_classes_300X300/checkpoint/out_aml/mobilenet_ssd.json --data-output /media/omer/DATA1/Data/10_classes_300X300/checkpoint/out_aml/

And this is what I get:

Fold/bias:out0', 'FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/mul_fold:out0', 'FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_13_pointwise/mul_fold:out0', 'FeatureExtractor/MobilenetV1/Conv2d_13_pointwise_2_Conv2d_5_3x3_s2_32/mul_fold:out0']
2019-12-03 17:57:30.881831: I tensorflow/core/platform/] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
I Convert To TFLite to Import quantized model.
Traceback (most recent call last):
File "", line 62, in
File "", line 58, in main
File "acuitylib/app/importer/", line 125, in run
IndexError: list index out of range
[18365] Failed to execute script convertensorflow

Any ideas?

Omer, how did you make the tflite_graph? Which script did you use?

I used the export script as explained in the repo readme. It actually creates a .pb file and not a .tflite file, but that seems to follow the steps as I understood them.

Maybe the issue is because the model is already quantized? Should I export a non-quantized model instead?

You look to be on the right way; maybe this sample helps you:

~/tensorflow/models$ python3 research/object_detection/ --pipeline_config_path=research/object_detection/test_data/pipeline.config --trained_checkpoint_prefix research/object_detection/test_data/model.ckpt-93313 --output_directory train/fortflite/ --add_postprocessing_op=true

Hi Omer, this is the issue: "TensorFlow binary was not compiled to use: AVX2 FMA".
I had the same problem initially; it was due to the fact that my x86 Ubuntu was running on an old processor which didn't support AVX2, and the Acuity-compiled TensorFlow binary requires it. I installed Ubuntu in a virtual machine on my desktop, which has a newer processor, and it worked.
What processor do you use?
See here for more details on AVX.
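As a quick sanity check before blaming the conversion itself, you can verify whether the host CPU reports AVX2. This is a minimal sketch assuming a Linux host where /proc/cpuinfo is readable:

```python
# Sketch: check whether this Linux CPU advertises AVX2 in its flags.
# Assumption: /proc/cpuinfo exists (Linux only); on other OSes this returns False.
def cpu_has_avx2(cpuinfo_path="/proc/cpuinfo"):
    try:
        with open(cpuinfo_path) as f:
            return "avx2" in f.read()
    except OSError:
        return False  # not Linux, or file unreadable

print("AVX2 supported" if cpu_has_avx2() else "AVX2 not supported")
```

If this prints "AVX2 not supported", the Acuity binaries will not run on that machine and you will need a newer processor or a VM on a newer host, as described above.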

1 Like

I have the same problem. I tried many conversions; every time it makes a folder, and when I compiled on VIM3 all the files were made.
When I converted the Inception v3 model, it worked.
But when I converted MobileNet SSD v1 and MobileNet SSD v2, all the files were made and converted, yet this warning message appears.
And when I open the application, the results are wrong. I inspected all the code and found that my output levels (asymmetric quantization) have a problem.
My CPU is an i7-6700, Ubuntu x64 18.04, GPU GTX 960M, and I use CUDA 10.0.

Larry, when I converted MobileNet SSD, in the script "./1_quantize_model"
I used the parameter --source-file ./data/validation_tf.txt
and my text file contains: ./06AF5418._9_9_.JPG, 208
Could this be the problem? Everything works, but the output results are too small.
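One thing worth checking: for detection models the quantization dataset file usually lists only image paths, one per line, while the "path, label" form is typically for classification models (an assumption worth verifying against the repo's sample validation file). A minimal Python sketch, with hypothetical paths, of building a plain image-path list:

```python
# Sketch (hypothetical paths): build a validation list of image paths for the
# quantization step, one path per line, with no label column.
import os

image_dir = "./data/images"  # hypothetical directory of calibration images
os.makedirs(image_dir, exist_ok=True)
open(os.path.join(image_dir, "sample.jpg"), "wb").close()  # placeholder file

with open("validation_tf.txt", "w") as out:
    for name in sorted(os.listdir(image_dir)):
        if name.lower().endswith((".jpg", ".jpeg", ".png")):
            out.write(os.path.join(image_dir, name) + "\n")
```

The calibration images should also be representative of what the model will see at inference time, otherwise the chosen quantization ranges can squash the outputs.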

Thanks Larry, I did notice the message but ignored it since it's only an info message, and online reading led me to believe it mostly affects performance. I also noticed your comment on the CPU model, but since I am running an Intel i7 I thought it was modern enough. Anyway, thanks so much for the input; I will try to convert on a different machine.
EDIT: I just verified that for me the problem was indeed not with the CPU, but with trying to convert a quantized model. I was able to convert a model that was not quantized during training.

1 Like

I have a big problem :frowning: @larrylart
Everything works perfectly when I run your example with your "mobilenet_ssd_v2a.nb" database,
but I get very wrong results when I run it with my own exported database. I used the model from this link.
I converted this model exactly as you described it; I used:
$convert_tf \
  --tf-pb ./model/ssd_mobilenet_v2.pb \
  --inputs 'normalized_input_image_tensor' \
  --input-size-list '300,300,3' \
  --outputs 'raw_outputs/box_encodings concat_1' \
  --net-output ${NAME}.json \
  --data-output ${NAME}.data

and I tried with both tflite_graph.pb and frozen_inference_graph.pb.
Every time I exported with this script:
python3 research/object_detection/ --pipeline_config_path=research/object_detection/ssd_mobilenet_v2/pipeline.config --trained_checkpoint_prefix research/object_detection/ssd_mobilenet_v2/model.ckpt --output_directory train/ --add_postprocessing_op=false
I have tried many options, but every time the application works perfectly with your "mobilenet_ssd_v2.nb", and every time I get wrong results with my converted database.
I inspected the output results; they were too low.
I even changed the zero-point values in mobilenet_ssd_v2.c.
What do you think the problem might be? You were able to convert it wonderfully.
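For context on those zero points: asymmetric (affine) quantization maps a uint8 value back to a real value as real = scale * (quantized - zero_point), so a wrong scale or zero point makes every output too small or shifted. A sketch with hypothetical scale/zero-point values (not taken from the model):

```python
# Sketch: asymmetric (affine) dequantization of a uint8 tensor value.
# real = scale * (quantized - zero_point); the values below are hypothetical.
def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)

# A raw uint8 output of 140 with scale 0.05 and zero point 128
# dequantizes to a small positive real value:
print(round(dequantize(140, 0.05, 128), 4))
```

If the exported .c file's scale/zero-point constants do not match the ones baked into the .nb at conversion time, the decoded scores and boxes will come out consistently wrong.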

You need to use tflite_graph.pb (the output of the export script). Also make sure you copied the exported mobilenet_ssd_v2.c and mobilenet_ssd_v2.h and replaced the references in the example code and makefile, or just rename them to mobilenet_ssd_v2a.c/.h.
I think I had a similar issue at one point when I changed the name of the output and forgot to replace all the references in the sample app.

Hello Larry

I am planning to get a Khadas VIM3. Could you let me know which hardware you have that does the 72 inference FPS: is it the VIM3L, VIM3 Basic, or VIM3 Pro? Does the RAM size play any part in NPU inference speed?

How do you rate this board vs. the Jetson Nano, in terms of NPU/GPU performance alone?

Do you recommend a heat sink and fan for the board?

Can you please upload a video demo of your code in action?

Is the NDA issue for NPU development resolved? Is the API available for everyone who buys a board?

Hi Larry, thanks for all of your help. I was able to convert the model (MobileNet SSD v1, depth multiplier 0.25, 300x300) and run it. The results look strange: all bboxes are at the top of the image. Any idea why? I suspect it has to do with the anchors, but I am using the generated anchors file as instructed (and I got the same anchors as yours, since it is the same dims, 300x300). Maybe this has to do with the depth multiplier?
this is what I get:
Thanks, Omer
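For reference, standard SSD post-processing decodes each raw_outputs/box_encodings row against its anchor using the usual (10, 10, 5, 5) scale factors; if the anchors or the decode step are off, boxes cluster in one region exactly like this. A Python sketch with made-up values (the sample's C++ code may differ in details):

```python
# Sketch: standard SSD box decoding with the common (10, 10, 5, 5) scale
# factors. Encodings and anchors below are made-up illustrative values.
import math

def decode_box(enc, anchor):
    ty, tx, th, tw = enc       # encoded center offsets and log sizes
    acy, acx, ah, aw = anchor  # anchor center (y, x) and height/width
    cy = ty / 10.0 * ah + acy
    cx = tx / 10.0 * aw + acx
    h = math.exp(th / 5.0) * ah
    w = math.exp(tw / 5.0) * aw
    # return (ymin, xmin, ymax, xmax) in normalized coordinates
    return cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2

# A zero encoding should decode to exactly its anchor box:
box = decode_box((0.0, 0.0, 0.0, 0.0), (0.5, 0.5, 0.2, 0.2))
print([round(v, 2) for v in box])
```

A quick way to test your anchors is exactly this zero-encoding case: each decoded box should land on its anchor, spread over the whole image, not piled at the top.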

The problem was with the code that gets class_id and idx in aml_worker.cpp:
the number 91 is the number of classes in the labels file (which was not clear to me). Once I realized it, I got it to work, and I now get more reasonable bboxes.
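For anyone hitting the same thing, the index math can be sketched like this (a hypothetical Python rendering of what the C++ code computes over the flattened [num_anchors x num_classes] score array, not the code itself):

```python
# Sketch: recover (anchor index, class id) from a flat index into a
# [num_anchors x num_classes] score array. 91 = 90 COCO classes + background;
# change this to match your own labels file.
NUM_CLASSES = 91

def split_index(flat_idx, num_classes=NUM_CLASSES):
    return flat_idx // num_classes, flat_idx % num_classes  # (anchor, class)

print(split_index(275))
```

If you retrain with a different labels file, this divisor must change too, otherwise every detection gets the wrong anchor and class, which is exactly the symptom described above.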

1 Like

@ahmetkemal, maybe you have a similar issue to what I encountered: if you change the labels file, you will need to update the code in aml_worker.cpp that gets idx and class_id (91). I hope this helps - Omer.

Where can i download the AML SDK?

@DragonViVi You can apply for it there.

It's a problem with this toolkit.

@DragonViVi What were you compiling that caused this? Please post the complete output.