VIM3L vs VIM3 NPU nb model size and differences in inference

Sergei · October 27, 2021, 10:16am

Hi, I converted my model for the NPU on VIM3 and VIM3L using two different optimize PIDs in ovxgenerator, with all other settings the same. As far as I understand, the conversion quantizes the model's fp32 weights to int8, so the model file for VIM3 is about 4 times smaller than the original. But for VIM3L it produces an even smaller file.

You can see it here: https://github.com/khadas/aml_npu_demo_binaries/blob/master/inceptionv3/VIM3/inception_v3.nb (35.5 MB) and https://github.com/khadas/aml_npu_demo_binaries/blob/master/inceptionv3/VIM3L/inception_v3.nb (18.5 MB). Does this mean that the VIM3L runs on int4, or that some weights are pruned?
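
To put numbers on the question, here is a rough back-of-the-envelope sketch in Python. It assumes the VIM3 file is essentially all int8 weights (8 bits each) and ignores any headers, graph metadata, or compression inside the .nb file, so it only shows what the size ratio would imply, not what the converter actually does. The function names are my own and are not part of any Khadas tool.

```python
# Rough size arithmetic for the two .nb files mentioned above.
# Assumption: the VIM3 file is dominated by int8 weights (8 bits per weight).
# Real .nb files also hold graph/metadata, so treat the results as estimates.

VIM3_MB = 35.5   # size of the VIM3 inception_v3.nb from the post
VIM3L_MB = 18.5  # size of the VIM3L inception_v3.nb from the post


def size_ratio(larger_mb: float, smaller_mb: float) -> float:
    """How many times larger the first file is than the second."""
    return larger_mb / smaller_mb


def implied_bits_per_weight(baseline_mb: float, other_mb: float,
                            baseline_bits: float = 8.0) -> float:
    """Average bits per weight in the other file, if the baseline file
    stores every weight with baseline_bits bits and both files hold the
    same number of weights."""
    return baseline_bits * other_mb / baseline_mb


def estimated_fp32_mb(int8_mb: float) -> float:
    """Size of the same weights stored as fp32 (4 bytes instead of 1)."""
    return 4.0 * int8_mb


if __name__ == "__main__":
    print(f"VIM3 / VIM3L size ratio: {size_ratio(VIM3_MB, VIM3L_MB):.2f}")
    print(f"Implied bits per weight on VIM3L: "
          f"{implied_bits_per_weight(VIM3_MB, VIM3L_MB):.2f}")
    print(f"Estimated fp32 size: {estimated_fp32_mb(VIM3_MB):.1f} MB")
```

Under these assumptions the ratio comes out close to 2, which is what int4 storage would give, but pruning about half of the weights or compressing the weight data could produce a similar number, so the file size alone does not tell these cases apart.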