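With a custom model, the NPU is barely used. If the model is converted to another format from a newer TensorFlow version, only a small part of it runs on the NPU.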

root@Khadas:~/nbg_unify_rbqe/bin_r_cv4# ls
1.jpeg 2.jpeg 3.jpeg 4.jpeg main.o repvgg repvgg.nb vnn_post_process.o vnn_pre_process.o vnn_repvgg.o
root@Khadas:~/nbg_unify_rbqe/bin_r_cv4# ./repvgg repvgg.nb 1.jpeg
#productname=VIPNano-QI, pid=0x88
Created VX Thread: 0x980e61c0
Create Neural Network: 274ms or 274772us
Verify...
generate command buffer, total device count=1, core count per-device: 1,
current device id=0, AXI SRAM base address=0xff000000
---------------------------Begin VerifyTiling -------------------------
AXI-SRAM = 1048576 Bytes VIP-SRAM = 522240 Bytes SWTILING_PHASE_FEATURES[1, 1, 0]
0 NBG [( 0 0 0 0, 0, 0x(nil)(0x(nil), 0x(nil)) -> 0 0 0 0, 0, 0x(nil)(0x(nil), 0x(nil))) k(0 0 0, 0) pad(0 0) pool(0 0, 0 0)]
id IN [ x y w h ] OUT [ x y w h ] (tx, ty, kpc) (ic, kc, kc/ks, ks/eks, kernel_type)
0 NBG DD 0x(nil) [ 0 0 0 0] -> DD 0x(nil) [ 0 0 0 0] ( 0, 0, 0) ( 0, 0, 0.000000%, 0.000000%, NONE)
PreLoadWeightBiases = 1048576 100.000000%
---------------------------End VerifyTiling -------------------------
Verify Graph: 33ms or 33664us
Start run graph [1] times...
layer_id: 0 layer name:network_binary_graph operation[0]:unkown operation type target:unkown operation target.
uid: 0
abs_op_id: 0
execution time: 1105465 us
[ 1] TOTAL_READ_BANDWIDTH (MByte): 1921.566071
[ 2] TOTAL_WRITE_BANDWIDTH (MByte): 917.348814
[ 3] AXI_READ_BANDWIDTH (MByte): 1323.427848
[ 4] AXI_WRITE_BANDWIDTH (MByte): 791.665362
[ 5] DDR_READ_BANDWIDTH (MByte): 598.138223
[ 6] DDR_WRITE_BANDWIDTH (MByte): 125.683452
[ 7] GPUTOTALCYCLES: 884337957
[ 8] GPUIDLECYCLES: 1186124
VPC_ELAPSETIME: 1107159
Run the 1 time: 1108.00ms or 1108173.00us
vxProcessGraph execution time:
Total 1108.00ms or 1108379.00us
Average 1108.38ms or 1108379.00us
--- Top5 ---
0: 0.000000
1: 0.000000
2: 0.000000
3: 0.000000
4: 0.000000
Exit VX Thread: 0x980e61c0
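
For anyone who wants to put numbers on the profile counters in the log above, here is a minimal Python sketch. It only pulls `GPUTOTALCYCLES`, `GPUIDLECYCLES` and `execution time` out of the pasted text and derives two ratios. The helper name `counter` is made up for this example, and the interpretation is an assumption: that both cycle counters are measured on the same clock over the same run. The log itself does not say so, and the counters do not show what the cycles were spent on.

```python
import re

# Counter lines copied verbatim from the log above.
LOG = """\
execution time: 1105465 us
[ 7] GPUTOTALCYCLES: 884337957
[ 8] GPUIDLECYCLES: 1186124
"""


def counter(text, name):
    """Return the integer that follows 'name:' in the log text (hypothetical helper)."""
    m = re.search(rf"{re.escape(name)}:\s*(\d+)", text)
    if m is None:
        raise KeyError(name)
    return int(m.group(1))


total_cycles = counter(LOG, "GPUTOTALCYCLES")
idle_cycles = counter(LOG, "GPUIDLECYCLES")
exec_time_us = counter(LOG, "execution time")

# Assumption: idle and total cycles share one clock, so their ratio is the idle share.
idle_fraction = idle_cycles / total_cycles

# Cycles per microsecond, i.e. the effective clock in MHz implied by these two lines.
implied_mhz = total_cycles / exec_time_us

print(f"idle share of cycles: {idle_fraction:.4%}")
print(f"implied clock: {implied_mhz:.1f} MHz")
```

Pasting the same counter lines from a run of a model that is known to work gives a baseline to compare against.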