Error converting Keras and Pytorch models

Agent_kapo · March 29, 2024, 6:28am

Hello, I’m using a VIM3 Pro. I tried to convert my ResNet PyTorch weights and got an error:

I decided to convert it to ONNX (and that was a big mistake, because the model stopped working).

Then I decided to use a Keras model and your Keras converter, and here is what I got:

So my question is: how can I convert my PyTorch and Keras weights without converting them to ONNX?

numbqq · March 29, 2024, 9:30am

Hello @Agent_kapo

@Louis-Cheng-Liu will help you then.

Louis-Cheng-Liu · March 29, 2024, 9:45am

Hello @Agent_kapo ,

Could you provide the two models that fail to convert?

Agent_kapo · March 29, 2024, 10:02am

Here is a link to google drive: Khadas - Google Drive

Agent_kapo · March 29, 2024, 10:03am

I also put my calibration set there.

Louis-Cheng-Liu · April 1, 2024, 9:13am

Hello @Agent_kapo ,

Your model file only contains the weights, not the structure.
You can open your model in Netron to check whether it contains both the weights and the structure.

A correct model looks like this.
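As a rough alternative to opening the file in Netron, the same check can be sketched in Python for a PyTorch checkpoint. The helper names `classify_checkpoint` and `inspect_pt_file` are hypothetical and not part of PyTorch or the Khadas tools. The sketch also assumes that a dictionary means "weights only", which is not always true (some training scripts wrap a full model inside a dictionary).

```python
from collections.abc import Mapping


def classify_checkpoint(obj):
    """Guess what a loaded checkpoint object contains (heuristic only)."""
    # A state_dict is a plain mapping of parameter names to tensors.
    if isinstance(obj, Mapping):
        return "mapping (likely weights only)"
    # A full nn.Module or TorchScript module is callable and exposes state_dict().
    if callable(obj) and hasattr(obj, "state_dict"):
        return "weights and structure"
    return "unknown"


def inspect_pt_file(path):
    """Load a .pt file on the CPU and classify its contents."""
    import torch  # imported lazily so classify_checkpoint works without PyTorch

    return classify_checkpoint(torch.load(path, map_location="cpu"))
```

Calling `inspect_pt_file("./best.pt")` on a file written with `torch.save(model.state_dict(), ...)` should report a mapping, while a file written with `torch.save(model, ...)` should report weights and structure.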

Agent_kapo · April 1, 2024, 3:25pm

Hm
So how did you convert the pt weights to pb and os for Khadas in your ksnn example? I also have trouble converting pt → onnx → pb + os. Could you please provide the scripts you use to convert pt → onnx (the resnet example) and h5 → onnx?

Agent_kapo · April 1, 2024, 3:40pm

Also, if the example converts .pt to .nb + .os, please tell me how you save the resnet model as .pt. Maybe I’m doing something wrong there.

Agent_kapo · April 1, 2024, 8:05pm

Now I’m using a .pt file with weights and structure, but I still keep getting the same error. I found this:

If I train my model with PyTorch, do I have to use PyTorch 1.2.0?
Right now I’m using Python 3.9 and PyTorch 1.9.1.

Louis-Cheng-Liu · April 2, 2024, 1:35am

Hello @Agent_kapo ,

For pt to onnx, I use the PyTorch API torch.onnx.export. I have not tried converting h5 to onnx myself, but I searched and found two options: tf2onnx.convert.from_keras from tf2onnx, and keras2onnx.convert_keras from keras2onnx. You can also search GitHub for other scripts.

I am sorry, but I did not train the resnet model myself, so I do not know the details of its training.

Could you provide your .pt model that contains both weights and structure? The PyTorch version does not have to be 1.2.0.
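A minimal sketch of the tf2onnx route might look like the following. It assumes the .h5 file was written with `model.save()` (so it holds structure and weights) and that the installed TensorFlow and tf2onnx versions are compatible with each other. The function name `keras_h5_to_onnx` and the opset value are placeholders. Which opsets the Khadas converter accepts is not stated in this thread.

```python
def keras_h5_to_onnx(h5_path, onnx_path, opset=13):
    """Convert a Keras .h5 model (structure + weights) to an ONNX file."""
    # Imported here so the function can be defined without TensorFlow installed.
    import tensorflow as tf
    import tf2onnx

    # load_model needs a file written by model.save(), not model.save_weights().
    model = tf.keras.models.load_model(h5_path)

    # from_keras returns the ONNX ModelProto; output_path also writes it to disk.
    model_proto, _ = tf2onnx.convert.from_keras(
        model, opset=opset, output_path=onnx_path
    )
    return model_proto
```

For example, `keras_h5_to_onnx("model.h5", "model.onnx")` would write `model.onnx` next to the script, where both file names are illustrative.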

Agent_kapo · April 2, 2024, 6:18am

Here is my PyTorch model @Louis-Cheng-Liu
https://drive.google.com/drive/folders/1S5e60IRlo6__4raDcvoE0bexw8b5PDLB

Louis-Cheng-Liu · April 2, 2024, 9:47am

Hello @Agent_kapo ,

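First, you need to use torch.jit.save to save the PyTorch model. A model saved with torch.save can only be used within PyTorch, whereas a model saved with torch.jit.save can be used on other platforms.

import torch

# Load the full model (weights and structure) saved with torch.save.
model = torch.load("./best.pt").cpu()
model.eval()  # trace in inference mode so BatchNorm and Dropout are fixed
# Trace with a dummy input of the expected shape, then save as TorchScript.
traced_model = torch.jit.trace(model, torch.randn(1, 3, 224, 224))
torch.jit.save(traced_model, "best_1.pt")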

About the PyTorch version, I found a note in the official PyTorch docs:
torch.jit.save — PyTorch 2.2 documentation

You need to use a PyTorch version lower than 1.6.
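To catch a version mismatch before tracing, a small guard along these lines could be used. It only encodes the "lower than 1.6" limit from the reply above; the helper name `torch_version_supported` is hypothetical.

```python
def torch_version_supported(version, limit=(1, 6)):
    """Return True if a version string such as '1.5.1+cu101' is below `limit`."""
    # Drop a local build suffix like '+cu101', then compare (major, minor).
    major, minor = version.split("+")[0].split(".")[:2]
    return (int(major), int(minor)) < limit
```

In a conversion script this would typically be called as `torch_version_supported(torch.__version__)` right after `import torch`, stopping early if it returns False.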

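Alternatively, you can convert pt to onnx and then convert onnx to nb.

import torch

# Load the full model (weights and structure) on the CPU.
model = torch.load("./best.pt").cpu()
model.eval()  # export in inference mode
# Dummy input with the shape the network expects: batch, channels, height, width.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "best_1.onnx", verbose=True, input_names=["input"], output_names=["output"])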
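Since the first post mentions that the model stopped working after ONNX conversion, it may help to compare the exported file against the original model before converting it further. The sketch below assumes onnxruntime and NumPy are installed, that the model takes one input and returns a single tensor, and that the input shape matches the export above. The function name `max_onnx_difference` is hypothetical.

```python
def max_onnx_difference(pt_path, onnx_path, input_shape=(1, 3, 224, 224)):
    """Run one random input through both models; return the largest absolute gap."""
    # Imported lazily so the function can be defined without these packages.
    import numpy as np
    import onnxruntime as ort
    import torch

    # Reference output from the original PyTorch model in inference mode.
    model = torch.load(pt_path, map_location="cpu").eval()
    x = torch.randn(*input_shape)
    with torch.no_grad():
        expected = model(x).numpy()

    # Output of the exported ONNX model for the same input.
    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    actual = session.run(None, {input_name: x.numpy()})[0]

    return float(np.abs(expected - actual).max())
```

A result close to zero from `max_onnx_difference("./best.pt", "best_1.onnx")` suggests the export itself is fine, so any remaining accuracy loss would more likely come from a later step such as quantization.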

Agent_kapo · April 4, 2024, 10:32am

Thank you @Louis-Cheng-Liu

I did everything as you said (I had never used torch.jit.trace before…)
But I still keep getting an error. It’s a different one, but still…

Agent_kapo · April 4, 2024, 10:34am

Here is the full output:

titan@titan:~/Desktop/aml_npu_sdk/acuity-toolkit/python$ ./convert --model-name resnet-example --platform pytorch --model /home/titan/Desktop/Work/ElCub/resnet_runs/train/exp2/last_1.pt --mean-values '103.94 116.78 123.68 0.01700102' --quantized-dtype asymmetric_affine --source-files /home/titan/Desktop/Work/ElCub/Calibration.txt --kboard VIM3 --print-level 0


--+ KSNN Convert tools v1.3 +--


Start import model ...
2024-04-04 17:26:58.574346: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/titan/Desktop/aml_npu_sdk/acuity-toolkit/bin/acuitylib:/tmp/_MEIH3DLo2
2024-04-04 17:26:58.574365: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
I Namespace(config=None, import='pytorch', input_size_list=None, inputs=None, model='/home/titan/Desktop/Work/ElCub/resnet_runs/train/exp2/last_1.pt', output_data='Model.data', output_model='Model.json', outputs=None, size_with_batch=None, which='import')
I Start importing pytorch...
[6794] Failed to execute script pegasus
Traceback (most recent call last):
  File "pegasus.py", line 131, in <module>
  File "pegasus.py", line 112, in main
  File "acuitylib/app/importer/commands.py", line 294, in execute

Louis-Cheng-Liu · April 7, 2024, 2:41am

Hello @Agent_kapo ,

Which version of PyTorch are you using?

And could you provide your new model?

Agent_kapo · April 7, 2024, 5:09am

My bad, I’m using PyTorch 1.9. I know it should be 1.5, but I thought the error message would be different for a different PyTorch version.