Acuity-toolkit: error when converting a tflite model

I tried to convert a tflite model, but got these errors:

(vim3l) ➜  retinaface 
NAME=retinaface
ACUITY_PATH=../bin/

convert_caffe=${ACUITY_PATH}convertcaffe
convert_tf=${ACUITY_PATH}convertensorflow
convert_tflite=${ACUITY_PATH}convertflite
convert_darknet=${ACUITY_PATH}convertdarknet
convert_onnx=${ACUITY_PATH}convertonnx

   
$convert_tflite \
   --tflite-model /home/zqh/workspace/vim3l/aml_npu_sdk/acuity-toolkit/retinaface/retinaface.tflite \
   --net-output ${NAME}.json \
   --data-output ${NAME}.data
I Model: retinaface
I Version: 3
I Description: TOCO Converted.
I Subgraphs: 1
D Convert layer pad+conv_2d
D Convert layer depthwise_conv_2d
D Convert layer conv_2d
D Convert layer pad+depthwise_conv_2d
.
.
.
D Convert layer concatenation
D Convert layer concatenation
I Dump net to retinaface.json
2020-03-10 00:10:10.237926: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
D Process input_0 ...
D Acuity output shape(input): (1 640 640 3)
D Process model_1/conv1_pad_1 ...
D Acuity output shape(convolution): (1 319 319 8)
D Process model_1/conv1_relu_2 ...
D Acuity output shape(relun): (1 319 319 8)
D Process model_1/conv_dw_1_relu_3 ...
D Acuity output shape(convolution): (1 319 319 8)
D Process model_1/conv_dw_1_relu_4 ...
D Acuity output shape(relun): (1 319 319 8)
D Process model_1/conv_pw_1_relu_5 ...
D Acuity output shape(convolution): (1 319 319 16)
D Process model_1/conv_pw_1_relu_6 ...
D Acuity output shape(relun): (1 319 319 16)
D Process model_1/conv_pad_2_7 ...
D Acuity output shape(convolution): (1 159 159 16)
D Process model_1/conv_dw_2_relu_8 ...
D Acuity output shape(relun): (1 159 159 16)
D Process model_1/conv_pw_2_relu_9 ...
D Acuity output shape(convolution): (1 159 159 32)
D Process model_1/conv_pw_2_relu_10 ...
D Acuity output shape(relun): (1 159 159 32)
D Process model_1/conv_dw_3_relu_11 ...
D Acuity output shape(convolution): (1 159 159 32)
D Process model_1/conv_dw_3_relu_12 ...
D Acuity output shape(relun): (1 159 159 32)
D Process model_1/conv_pw_3_relu_13 ...
D Acuity output shape(convolution): (1 159 159 32)
D Process model_1/conv_pw_3_relu_14 ...
D Acuity output shape(relun): (1 159 159 32)
D Process model_1/conv_pad_4_15 ...
D Acuity output shape(convolution): (1 79 79 32)
D Process model_1/conv_dw_4_relu_16 ...
D Acuity output shape(relun): (1 79 79 32)
D Process model_1/conv_pw_4_relu_17 ...
D Acuity output shape(convolution): (1 79 79 64)
D Process model_1/conv_pw_4_relu_18 ...
D Acuity output shape(relun): (1 79 79 64)
D Process model_1/conv_dw_5_relu_19 ...
D Acuity output shape(convolution): (1 79 79 64)
D Process model_1/conv_dw_5_relu_20 ...
D Acuity output shape(relun): (1 79 79 64)
D Process model_1/conv_pw_5_relu_21 ...
D Acuity output shape(convolution): (1 79 79 64)
D Process model_1/conv_pw_5_relu_22 ...
D Acuity output shape(relun): (1 79 79 64)
D Process model_1/batch_normalization_23 ...
D Acuity output shape(convolution): (1 79 79 64)
D Process model_1/leaky_re_lu_24 ...
Traceback (most recent call last):
  File "convertflite.py", line 63, in <module>
  File "convertflite.py", line 41, in main
  File "acuitylib/acuitynetbuilder.py", line 279, in build
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 305, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 305, in <genexpr>
AttributeError: 'NoneType' object has no attribute 'to_string'
[19456] Failed to execute script convertflite

I trained with tf2.0, and there is no layer in the model that the converter does not support. The tflite model is here, can you check what's wrong for me?

@jujuede When you use the conversion script, your TensorFlow version must be tf1.10. The conversion script doesn't work with newer versions. I train with tf1.14, but I must use tf1.10 to convert. I'm not sure about TF2.0 models, though.

I know, the conversion script environment is tf1.10.

(vim3l) ➜  retinaface pip list                  
Package           Version  
----------------- ---------
absl-py           0.9.0    
astor             0.8.1    
certifi           2018.8.24
cycler            0.10.0   
decorator         4.4.2    
dill              0.2.8.2  
Django            2.2.11   
flatbuffers       1.10     
gast              0.3.3    
grpcio            1.27.2   
h5py              2.8.0    
image             1.5.5    
lmdb              0.93     
Markdown          3.2.1    
matplotlib        2.1.0    
networkx          1.11     
numpy             1.14.5   
onnx              1.4.1    
onnx-tf           1.2.1    
Pillow            5.3.0    
pip               20.0.2   
protobuf          3.6.1    
pyparsing         2.4.6    
python-dateutil   2.8.1    
pytz              2019.3   
PyYAML            5.3      
ruamel.yaml       0.15.81  
scipy             1.1.0    
setuptools        39.1.0   
six               1.14.0   
sqlparse          0.3.1    
tensorboard       1.10.0   
tensorflow        1.10.0   
termcolor         1.1.0    
typing            3.7.4.1  
typing-extensions 3.7.4.1  
Werkzeug          1.0.0    
wheel             0.31.1   

What I mean is that tflite models should all be supported and not fail to convert because of an older TensorFlow version, and my model has no unsupported ops. So I hope you can help me solve this problem.

@jujuede The error `AttributeError: 'NoneType' object has no attribute 'xxxxx'` is a common version error. Some functions in tf1.14 and tf2.0 are not available in tf1.10, so models trained and frozen by someone else may contain interfaces that tf1.10 does not include. I tried direct conversion of such a model and it reported an error; after I re-exported and froze my model with tf1.10, it converted. This is a very troublesome problem that greatly affects usability: it means some frozen models you download from the Internet cannot be converted. The tool is neither open source nor developed by us, so we cannot change the current situation, but we are already trying to feed this back and get it solved.


@jujuede Can you give me a link to download this model? I will try to solve it, although I think it's a version issue.

Thank you very much !

@jujuede Could you give me a link to the model you used?

Sorry for the late reply; the link to the model is at the end of the first post.

@jujuede OK, I will try to convert it.

Does the model conversion tool not support tflite models?

I generated my tflite model from tf1.10, but the conversion still fails. I also used tf1.10 to generate a tflite of the built-in MobileNetV1, and that conversion failed as well.

D Process global_average_pooling2d_109 ...
D Acuity output shape(pooling): (1 1 1 256)
D Process reshape_1_110 ...
D Acuity output shape(reshape): (1 1 1 256)
D Process conv_preds_111 ...
D Acuity output shape(convolution): (1 1 1 1000)
D Process variable_173 ...
D Acuity output shape(variable): (1)
D Process act_softmax_112 ...
Traceback (most recent call last):
  File "convertflite.py", line 63, in <module>
  File "convertflite.py", line 41, in main
  File "acuitylib/acuitynetbuilder.py", line 279, in build
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 300, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 305, in build_layer
  File "acuitylib/acuitynetbuilder.py", line 305, in <genexpr>
AttributeError: 'NoneType' object has no attribute 'to_string'
[4915] Failed to execute script convertflite

model link 1 : https://share.weiyun.com/5qRBCWR
model link 2 : https://share.weiyun.com/5LawLtn

@jujuede I tested it with the Google MobileNet tflite model and it converts successfully. Maybe you can try it?
http://download.tensorflow.org/models/mobilenet_v1_2018_08_02/mobilenet_v1_1.0_224.tgz

Maybe it's because my model was converted to tflite from tf.keras, while the 2018 MobileNet model was probably written in TF-Slim.

But the biggest problem is why I get the same error even though I generated the tflite with tf1.10. I can't find out which op node has the problem. :disappointed_relieved:

@jujuede I tested the model you linked to me and got the same error. I'm not sure what's happening with it, because the conversion tool is not open source. But I have hit this mistake before: when I used a TF tool built from TensorFlow source code to test an Inception model, it returned this error. In the end, the reason was that my TF version was 1.10 instead of 1.14; I had forgotten to switch to 1.14. The docs on the TF website list which functions work in 1.10, so I think this may have something to do with your conversion from tf.keras. Maybe you should train a tflite model directly. There are many problems with the conversion tool, and we are already giving feedback. Next, we will try to train some tflite models, or convert some for users to use, but this can't be realized in the near future; we have a lot of stacked tasks to deal with and can't handle this right now. Several users have given me feedback about tflite. It's on the plan, it just needs time. I'd like to know what you use this model for; if you are not in a hurry, you can wait for further information.


I successfully quantized the model after converting it to pb, but the result of running inference on the VIM3L was all zeros.

I would like to ask whether this is caused by an error during model quantization, or by something else?

the pb model and image data link: https://share.weiyun.com/5zVvzfZ

convert command:

NAME=retinaface
ACUITY_PATH=../bin/

convert_caffe=${ACUITY_PATH}convertcaffe
convert_tf=${ACUITY_PATH}convertensorflow
convert_tflite=${ACUITY_PATH}convertflite
convert_darknet=${ACUITY_PATH}convertdarknet
convert_onnx=${ACUITY_PATH}convertonnx


   
$convert_tf \
    --tf-pb data_2/model.pb \
    --inputs input_1 \
    --input-size-list '640,640,3' \
    --outputs 'concatenate_3/concat concatenate_4/concat concatenate_5/concat' \
    --net-output ${NAME}.json \
    --data-output ${NAME}.data 

tensorzone=${ACUITY_PATH}tensorzonex
#asymmetric_quantized-u8 dynamic_fixed_point-8 dynamic_fixed_point-16
$tensorzone \
    --action quantization \
    --source text \
    --source-file data_2/validation.txt \
    --channel-mean-value '127.5 127.5 127.5 127.5' \
    --model-input ${NAME}.json \
    --model-data ${NAME}.data \
    --quantized-dtype dynamic_fixed_point-8 \
    --quantized-rebuild


export_ovxlib=${ACUITY_PATH}ovxgenerator

$export_ovxlib \
    --model-input ${NAME}.json \
    --data-input ${NAME}.data \
    --reorder-channel '0 1 2' \
    --channel-mean-value '127.5 127.5 127.5 127.5' \
    --export-dtype quantized \
    --model-quantize ${NAME}.quantize \
    --optimize VIPNANOQI_PID0X99  \
    --viv-sdk ../bin/vcmdtools \
    --pack-nbg-unify 

rm *.h *.c .project .cproject *.vcxproj *.lib BUILD *.linux

mv nbg_unify nbg_unify_${NAME}

cd nbg_unify_${NAME}

mv network_binary.nb ${NAME}.nb
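A note on the `--channel-mean-value '127.5 127.5 127.5 127.5'` parameter above: as I understand the Acuity tools, the first three numbers are per-channel means and the fourth is a scale divisor, so each input pixel is normalized as `(x - mean) / scale`, mapping 0..255 to roughly -1..1. A minimal sketch of that preprocessing (the helper name is mine, not part of the SDK):

```c
/* Hypothetical helper mirroring --channel-mean-value '127.5 127.5 127.5 127.5':
 * out = (pixel - mean) / scale, which maps the 0..255 byte range to -1..1.   */
static float normalize_pixel(unsigned char pixel, float mean, float scale) {
    return ((float)pixel - mean) / scale;
}
```

If your model was trained with a different input normalization (e.g. 0..1, or per-channel ImageNet means), these four numbers must match it, otherwise the quantized network sees inputs it was never trained on.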

In the C code, I found that the Inception demo has stride == 2, but in my model stride == 1:

    /* Bytes per element of the output tensor (1 for int8, 2 for fp16). */
    stride = vsi_nn_TypeGetBytes(tensor->attr.dtype.vx_type);
    tensor_data = (uint8_t *)vsi_nn_ConvertTensorToData(graph, tensor);
    buffer = (float *)malloc(sizeof(float) * sz);
    printf("stride == %d\n", stride);
    for (i = 0; i < sz; i++) {
        if (i < 100) { printf("%d ", tensor_data[stride * i]); }
        /* Dequantize each element to float32 according to the tensor dtype. */
        status = vsi_nn_DtypeToFloat32(&tensor_data[stride * i], &buffer[i], &tensor->attr.dtype);
    }
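For context, `stride` here is just the byte width of one tensor element, so the difference from the Inception demo is expected: that demo's outputs are 16-bit, while this model's outputs are quantized int8, hence stride == 1. A self-contained sketch of the idea (the enum and function are illustrative stand-ins, not the real `vsi_nn_TypeGetBytes`):

```c
/* Illustrative stand-in for vsi_nn_TypeGetBytes: the "stride" printed by the
 * demo is simply the byte width of one element of the output tensor.        */
typedef enum { DT_INT8, DT_UINT8, DT_FLOAT16, DT_INT16, DT_FLOAT32 } dtype_t;

static unsigned bytes_per_element(dtype_t t) {
    switch (t) {
    case DT_INT8:
    case DT_UINT8:   return 1;  /* quantized i8/u8 models -> stride == 1 */
    case DT_FLOAT16:
    case DT_INT16:   return 2;  /* 16-bit outputs          -> stride == 2 */
    case DT_FLOAT32: return 4;
    }
    return 0;
}
```

So a stride of 1 on its own is not a bug; it only says the output tensor is one byte per element.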

output:

I [vsi_nn_PrintGraph:1309]***************** Tensors ******************
D [print_tensor:137]id[   0] shape[ 4, 16800, 1      ] fmt[i8 ] qnt[DFP fl= 12]
D [print_tensor:137]id[   1] shape[ 10, 16800, 1     ] fmt[i8 ] qnt[DFP fl= 12]
D [print_tensor:137]id[   2] shape[ 2, 16800, 1      ] fmt[i8 ] qnt[DFP fl= 12]
D [print_tensor:137]id[   3] shape[ 3, 640, 640, 1   ] fmt[i8 ] qnt[DFP fl=  7]
I [vsi_nn_PrintGraph:1318]***************** Nodes ******************
I [vsi_nn_PrintNode:156](             NBG)node[0] [in: 3 ], [out: 0, 1, 2 ] [204c1950]
I [vsi_nn_PrintGraph:1327]******************************************
I [vsi_nn_ConvertTensorToData:732]Create 1228800 data.
Verify...
Verify Graph: 2ms or 2730us
Start run graph [1] times...
Run the 1 time: 54ms or 54772us
vxProcessGraph execution time:
Total   54ms or 54823us
Average 54.82ms or 54823.00us
I [vsi_nn_ConvertTensorToData:732]Create 67200 data.
stride == 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  --- Top5 ---
 -1: 0.000000
 -1: 0.000000
 -1: 0.000000
 -1: 0.000000
 -1: 0.000000
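One observation about the dump above: the output tensors are int8 with dynamic fixed point, `qnt[DFP fl= 12]`. As I understand DFP, the float value is `raw * 2^(-fl)`, so a raw byte of 0 dequantizes to exactly 0.0, and the largest representable magnitude at fl = 12 is only 127/4096 ≈ 0.031; values outside that range would saturate rather than become zero, so all-zero raw data suggests the zeros arise earlier in the pipeline. A minimal sketch of DFP-8 dequantization (my own helper, not an SDK function):

```c
#include <stdint.h>

/* Dynamic-fixed-point dequantization as I understand it:
 * float_value = raw * 2^(-fl), where fl is the fractional length
 * shown in the tensor dump (e.g. "qnt[DFP fl= 12]").             */
static float dfp8_to_float(int8_t raw, int fl) {
    return (float)raw / (float)(1 << fl);
}
```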

@jujuede Which model did you use? Inception, or another one?

My custom model; it has 3 output nodes, but every output tensor is all zeros.

@jujuede Maybe you need to look at the source code, because this demo only fetches one output's data. If you need to get more, you have to modify the code.

Add this to your script 1_quantize_model.sh (you need to change the parameters):

 $tensorzone \
     --action inference \
     --source text \
     --source-file ./data/validation_edu.txt \
     --channel-mean-value '128 128 128 128' \
     --model-input ${NAME}.json \
     --model-data ${NAME}.data \
     --dtype quantized

Then the log from 1_quantize_model.sh will print the original output data at the end.

I added the commands to quantize_model.sh:

$tensorzone \
    --action quantization \
    --source text \
    --source-file data_2/validation.txt \
    --channel-mean-value '127.5 127.5 127.5 127.5' \
    --model-input ${NAME}.json \
    --model-data ${NAME}.data \
    --quantized-dtype dynamic_fixed_point-8 \
    --quantized-rebuild

$tensorzone \
    --action inference \
    --source text \
    --source-file data_2/validation.txt \
    --channel-mean-value '127.5 127.5 127.5 127.5' \
    --model-input ${NAME}.json \
    --model-data ${NAME}.data \
    --dtype quantized

Then I found the output tensor was all 0.
What could be the problem?

@jujuede Maybe you need to make sure the pictures in validation.txt are right. If the pictures are right, then check your parameters. Because the conversion tool is not open source, it is difficult for me to debug; you can only try modifying the parameters.

Can you share "retinaface.quantize", produced after running script 1_conversion?
What does it say on the @concatenate_3/concat,
@concatenate_4/concat,
and @concatenate_5/concat lines?