Error converting Keras and PyTorch models

Hello, I’m using a VIM3 Pro. I tried to convert my ResNet PyTorch weights and got this error:


I decided to convert it to ONNX (which was a big mistake, because the model stopped working).

Then I decided to use a Keras model with your Keras converter, and here is what I got:

So my question is: how can I convert my PyTorch and Keras weights without converting them to ONNX?

Hello @Agent_kapo

@Louis-Cheng-Liu will help you then.

Hello @Agent_kapo ,

Could you provide the two models that fail to convert?

Here is a link to Google Drive: Khadas - Google Drive

I also included my calibration set there.

Hello @Agent_kapo ,

Your model file contains only the weights, not the network structure.
You can open the model in Netron to check whether both the weights and the structure were saved.

A correctly saved model looks like this.
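
If Netron is not at hand, the same check can be approximated in Python. This is only a minimal sketch and not part of the KSNN tools: describe_checkpoint is a hypothetical helper, and it assumes that a weights-only file loads as a dict-like state_dict while a full model loads as an object with a forward method.

```python
from collections.abc import Mapping


def describe_checkpoint(obj):
    """Classify an object returned by torch.load (hypothetical helper)."""
    # A state_dict is a dict-like mapping of parameter names to tensors.
    if isinstance(obj, Mapping):
        return "weights only"
    # nn.Module objects (including traced modules) expose a callable forward().
    if callable(getattr(obj, "forward", None)):
        return "structure and weights"
    return "unknown"


# Typical use (requires PyTorch; "./best.pt" is the file name used in this thread):
#   import torch
#   print(describe_checkpoint(torch.load("./best.pt", map_location="cpu")))
```

Some training scripts wrap the model inside a dict together with optimizer state; such a file is reported as "weights only" here, so it is worth printing the dict keys as well.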

Hm
So how did you convert the pt weights to .nb and .so for Khadas in your KSNN example? I also have trouble converting pt → onnx → .nb + .so. Could you please provide the scripts you use to convert pt → onnx (ResNet example) and h5 → onnx?

Also, if your example converts .pt to .nb + .so, please tell me how you export the ResNet to .pt; maybe I’m doing something wrong.

Now I’m using a .pt file with both weights and structure, but I still keep getting the same error. I found this:


If I train my model with PyTorch, do I have to use PyTorch 1.2.0?
I’m currently using Python 3.9 and PyTorch 1.9.1.

Hello @Agent_kapo ,

For pt to ONNX, I use the PyTorch API torch.onnx.export. I have not tried converting h5 to ONNX myself, but I searched and found two ways: tf2onnx.convert.from_keras from tf2onnx, and keras2onnx.convert_keras from keras2onnx. You can also search GitHub for other scripts.

I am sorry, but I did not train the ResNet model myself, so I do not know the details of how it was trained.

Could you provide your pt model that contains both weights and structure? The PyTorch version does not have to be 1.2.0.

Here is my PyTorch model @Louis-Cheng-Liu
https://drive.google.com/drive/folders/1S5e60IRlo6__4raDcvoE0bexw8b5PDLB

Hello @Agent_kapo ,

First, you need to save the PyTorch model with torch.jit.save. A model saved with torch.save can only be used within PyTorch, whereas one saved with torch.jit.save can be used on other platforms.

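import torch

# Load the full model (structure + weights) that was saved with torch.save
model = torch.load("./best.pt").cpu()
model.eval()  # use inference behaviour for dropout/batch norm while tracing
# Trace with a dummy input of the shape the network expects
traced_model = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
torch.jit.save(traced_model, "best_1.pt")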
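
To confirm which kind of file you ended up with, the container layout can be inspected without PyTorch. The sketch below is a heuristic, not an official API. It assumes that TorchScript archives written by torch.jit.save are zip files containing a code/ folder or constants.pkl, that torch.save from PyTorch 1.6 onward writes a zip with data.pkl but no code/ folder, and that older torch.save files are not zip archives. guess_pt_format is a hypothetical name.

```python
import zipfile


def guess_pt_format(path):
    """Heuristically classify a .pt file by its container layout."""
    if not zipfile.is_zipfile(path):
        # Assumption: legacy torch.save output (or not a PyTorch file at all).
        return "not-zip"
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
    # Assumption: only TorchScript archives carry serialized code and constants.
    if any("/code/" in name or name.endswith("constants.pkl") for name in names):
        return "torchscript"
    if any(name.endswith("data.pkl") for name in names):
        return "torch-save"
    return "unknown-zip"


# Typical use with the file produced above:
#   print(guess_pt_format("best_1.pt"))
```

A result other than "torchscript" suggests the file was written by torch.save rather than torch.jit.save.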

Regarding the PyTorch version, I found a note in the official PyTorch documentation for torch.jit.save (PyTorch 2.2 documentation).

You need to use a PyTorch version lower than 1.6.
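
A small guard before exporting can catch a version mismatch early. This is only a sketch based on the requirement stated above (export with a PyTorch version lower than 1.6). torch_older_than is a hypothetical helper that only parses the version string.

```python
def torch_older_than(version, limit=(1, 6)):
    """Return True if a version string such as '1.5.1+cu101' is below limit."""
    core = version.split("+")[0]  # drop local build tags like '+cu101'
    major, minor = (int(part) for part in core.split(".")[:2])
    return (major, minor) < limit


# Typical use (requires PyTorch):
#   import torch
#   if not torch_older_than(torch.__version__):
#       raise RuntimeError("Export with PyTorch < 1.6, found " + torch.__version__)
```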

Alternatively, you can convert pt to ONNX and then convert ONNX to nb.

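import torch

# Load the full model (structure + weights) that was saved with torch.save
model = torch.load("./best.pt").cpu()
model.eval()  # export in inference mode
# Dummy input with the shape the network expects
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "best_1.onnx", verbose=True, input_names=["input"], output_names=["output"])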

Thank you @Louis-Cheng-Liu

I did everything as you said (I had never used torch.jit.trace before…)
But I still keep getting an error. It is a different one, but still…

Here is the full output:

titan@titan:~/Desktop/aml_npu_sdk/acuity-toolkit/python$ ./convert --model-name resnet-example --platform pytorch --model /home/titan/Desktop/Work/ElCub/resnet_runs/train/exp2/last_1.pt --mean-values '103.94 116.78 123.68 0.01700102' --quantized-dtype asymmetric_affine --source-files /home/titan/Desktop/Work/ElCub/Calibration.txt --kboard VIM3 --print-level 0


--+ KSNN Convert tools v1.3 +--


Start import model ...
2024-04-04 17:26:58.574346: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/titan/Desktop/aml_npu_sdk/acuity-toolkit/bin/acuitylib:/tmp/_MEIH3DLo2
2024-04-04 17:26:58.574365: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
I Namespace(config=None, import='pytorch', input_size_list=None, inputs=None, model='/home/titan/Desktop/Work/ElCub/resnet_runs/train/exp2/last_1.pt', output_data='Model.data', output_model='Model.json', outputs=None, size_with_batch=None, which='import')
I Start importing pytorch...
[6794] Failed to execute script pegasus
Traceback (most recent call last):
  File "pegasus.py", line 131, in <module>
  File "pegasus.py", line 112, in main
  File "acuitylib/app/importer/commands.py", line 294, in execute

Hello @Agent_kapo ,

Which PyTorch version are you using?

And could you provide your new model?

My bad, I’m using PyTorch 1.9. I know that it should be 1.5, but I thought the error message would be different for a different PyTorch version.