Another example of a successful use case with PaddleOCR
Code
py tools/infer_rec.py -c configs/rec/PP-OCRv4/en_PP-OCRv4_rec.yml -o Global.pretrained_model=pretrain_models/en_PP-OCRv4_rec_train/best_accuracy Global.infer_img=doc/imgs_en/K.png
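The `-o` arguments in the command above replace individual entries of the YAML config given with `-c`; the config dump in the log shows the overridden `infer_img` and `pretrained_model` values under `Global`. A minimal sketch of how such dotted `Section.key=value` overrides can be merged into a nested config is shown here. The function name `apply_overrides` and the sample `config` dict are hypothetical and only illustrate the idea; this is not PaddleOCR's own implementation.

```python
def apply_overrides(config, overrides):
    """Apply 'Section.key=value' strings to a nested dict, in place.

    Hypothetical helper for illustration; values are kept as plain strings.
    """
    for item in overrides:
        # Split only on the first '=' so values may contain '=' themselves.
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Walk down the tree, creating missing sections on the way.
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


# Assumed stand-in for the parsed YAML file (only a few keys shown).
config = {"Global": {"use_gpu": True, "infer_img": None, "pretrained_model": None}}

apply_overrides(config, [
    "Global.pretrained_model=pretrain_models/en_PP-OCRv4_rec_train/best_accuracy",
    "Global.infer_img=doc/imgs_en/K.png",
])
print(config["Global"]["infer_img"])
```

Keys that are not overridden (here `use_gpu`) keep the value from the config file.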
Input image: K.png
Result
[2025/01/22 17:30:55] ppocr WARNING: Skipping import of the encryption module.
[2025/01/22 17:30:55] ppocr INFO: Architecture :
[2025/01/22 17:30:55] ppocr INFO: Backbone :
[2025/01/22 17:30:55] ppocr INFO: name : PPLCNetV3
[2025/01/22 17:30:55] ppocr INFO: scale : 0.95
[2025/01/22 17:30:55] ppocr INFO: Head :
[2025/01/22 17:30:55] ppocr INFO: head_list :
[2025/01/22 17:30:55] ppocr INFO: CTCHead :
[2025/01/22 17:30:55] ppocr INFO: Head :
[2025/01/22 17:30:55] ppocr INFO: fc_decay : 1e-05
[2025/01/22 17:30:55] ppocr INFO: Neck :
[2025/01/22 17:30:55] ppocr INFO: depth : 2
[2025/01/22 17:30:55] ppocr INFO: dims : 120
[2025/01/22 17:30:55] ppocr INFO: hidden_dims : 120
[2025/01/22 17:30:55] ppocr INFO: kernel_size : [1, 3]
[2025/01/22 17:30:55] ppocr INFO: name : svtr
[2025/01/22 17:30:55] ppocr INFO: use_guide : True
[2025/01/22 17:30:55] ppocr INFO: NRTRHead :
[2025/01/22 17:30:55] ppocr INFO: max_text_length : 25
[2025/01/22 17:30:55] ppocr INFO: nrtr_dim : 384
[2025/01/22 17:30:55] ppocr INFO: name : MultiHead
[2025/01/22 17:30:55] ppocr INFO: Transform : None
[2025/01/22 17:30:55] ppocr INFO: algorithm : SVTR_LCNet
[2025/01/22 17:30:55] ppocr INFO: model_type : rec
[2025/01/22 17:30:55] ppocr INFO: Eval :
[2025/01/22 17:30:55] ppocr INFO: dataset :
[2025/01/22 17:30:55] ppocr INFO: data_dir : ./train_data/ic15_data/
[2025/01/22 17:30:55] ppocr INFO: label_file_list : ['./train_data/ic15_data/rec_gt_test.txt']
[2025/01/22 17:30:55] ppocr INFO: name : SimpleDataSet
[2025/01/22 17:30:55] ppocr INFO: transforms :
[2025/01/22 17:30:55] ppocr INFO: DecodeImage :
[2025/01/22 17:30:55] ppocr INFO: channel_first : False
[2025/01/22 17:30:55] ppocr INFO: img_mode : BGR
[2025/01/22 17:30:55] ppocr INFO: MultiLabelEncode :
[2025/01/22 17:30:55] ppocr INFO: gtc_encode : NRTRLabelEncode
[2025/01/22 17:30:55] ppocr INFO: RecResizeImg :
[2025/01/22 17:30:55] ppocr INFO: image_shape : [3, 48, 320]
[2025/01/22 17:30:55] ppocr INFO: KeepKeys :
[2025/01/22 17:30:55] ppocr INFO: keep_keys : ['image', 'label_ctc', 'label_gtc', 'length', 'valid_ratio']
[2025/01/22 17:30:55] ppocr INFO: loader :
[2025/01/22 17:30:55] ppocr INFO: batch_size_per_card : 62
[2025/01/22 17:30:55] ppocr INFO: drop_last : False
[2025/01/22 17:30:55] ppocr INFO: num_workers : 4
[2025/01/22 17:30:55] ppocr INFO: shuffle : False
[2025/01/22 17:30:55] ppocr INFO: Global :
[2025/01/22 17:30:55] ppocr INFO: cal_metric_during_train : True
[2025/01/22 17:30:55] ppocr INFO: character_dict_path : ppocr/utils/en_dict.txt
[2025/01/22 17:30:55] ppocr INFO: checkpoints : None
[2025/01/22 17:30:55] ppocr INFO: debug : False
[2025/01/22 17:30:55] ppocr INFO: distributed : False
[2025/01/22 17:30:55] ppocr INFO: epoch_num : 50
[2025/01/22 17:30:55] ppocr INFO: eval_batch_step : [0, 2000]
[2025/01/22 17:30:55] ppocr INFO: infer_img : doc/imgs_en/K.png
[2025/01/22 17:30:55] ppocr INFO: infer_mode : False
[2025/01/22 17:30:55] ppocr INFO: log_smooth_window : 20
[2025/01/22 17:30:55] ppocr INFO: max_text_length : 25
[2025/01/22 17:30:55] ppocr INFO: pretrained_model : pretrain_models/en_PP-OCRv4_rec_train/best_accuracy
[2025/01/22 17:30:55] ppocr INFO: print_batch_step : 10
[2025/01/22 17:30:55] ppocr INFO: save_epoch_step : 10
[2025/01/22 17:30:55] ppocr INFO: save_inference_dir : None
[2025/01/22 17:30:55] ppocr INFO: save_model_dir : ./output/rec_ppocr_v4
[2025/01/22 17:30:55] ppocr INFO: save_res_path : ./output/rec/predicts_ppocrv3.txt
[2025/01/22 17:30:55] ppocr INFO: use_gpu : True
[2025/01/22 17:30:55] ppocr INFO: use_space_char : True
[2025/01/22 17:30:55] ppocr INFO: use_visualdl : False
[2025/01/22 17:30:55] ppocr INFO: Loss :
[2025/01/22 17:30:55] ppocr INFO: loss_config_list :
[2025/01/22 17:30:55] ppocr INFO: CTCLoss : None
[2025/01/22 17:30:55] ppocr INFO: NRTRLoss : None
[2025/01/22 17:30:55] ppocr INFO: name : MultiLoss
[2025/01/22 17:30:55] ppocr INFO: Metric :
[2025/01/22 17:30:55] ppocr INFO: ignore_space : False
[2025/01/22 17:30:55] ppocr INFO: main_indicator : acc
[2025/01/22 17:30:55] ppocr INFO: name : RecMetric
[2025/01/22 17:30:55] ppocr INFO: Optimizer :
[2025/01/22 17:30:55] ppocr INFO: beta1 : 0.9
[2025/01/22 17:30:55] ppocr INFO: beta2 : 0.999
[2025/01/22 17:30:55] ppocr INFO: lr :
[2025/01/22 17:30:55] ppocr INFO: learning_rate : 0.0005
[2025/01/22 17:30:55] ppocr INFO: name : Cosine
[2025/01/22 17:30:55] ppocr INFO: warmup_epoch : 5
[2025/01/22 17:30:55] ppocr INFO: name : Adam
[2025/01/22 17:30:55] ppocr INFO: regularizer :
[2025/01/22 17:30:55] ppocr INFO: factor : 3e-05
[2025/01/22 17:30:55] ppocr INFO: name : L2
[2025/01/22 17:30:55] ppocr INFO: PostProcess :
[2025/01/22 17:30:55] ppocr INFO: name : CTCLabelDecode
[2025/01/22 17:30:55] ppocr INFO: Train :
[2025/01/22 17:30:55] ppocr INFO: dataset :
[2025/01/22 17:30:55] ppocr INFO: data_dir : ./train_data/ic15_data/
[2025/01/22 17:30:55] ppocr INFO: ds_width : False
[2025/01/22 17:30:55] ppocr INFO: ext_op_transform_idx : 1
[2025/01/22 17:30:55] ppocr INFO: label_file_list : ['./train_data/ic15_data/rec_gt_train.txt']
[2025/01/22 17:30:55] ppocr INFO: name : MultiScaleDataSet
[2025/01/22 17:30:55] ppocr INFO: transforms :
[2025/01/22 17:30:55] ppocr INFO: DecodeImage :
[2025/01/22 17:30:55] ppocr INFO: channel_first : False
[2025/01/22 17:30:55] ppocr INFO: img_mode : BGR
[2025/01/22 17:30:55] ppocr INFO: RecConAug :
[2025/01/22 17:30:55] ppocr INFO: ext_data_num : 2
[2025/01/22 17:30:55] ppocr INFO: image_shape : [48, 320, 3]
[2025/01/22 17:30:55] ppocr INFO: max_text_length : 25
[2025/01/22 17:30:55] ppocr INFO: prob : 0.5
[2025/01/22 17:30:55] ppocr INFO: RecAug : None
[2025/01/22 17:30:55] ppocr INFO: MultiLabelEncode :
[2025/01/22 17:30:55] ppocr INFO: gtc_encode : NRTRLabelEncode
[2025/01/22 17:30:55] ppocr INFO: KeepKeys :
[2025/01/22 17:30:55] ppocr INFO: keep_keys : ['image', 'label_ctc', 'label_gtc', 'length', 'valid_ratio']
[2025/01/22 17:30:55] ppocr INFO: loader :
[2025/01/22 17:30:55] ppocr INFO: batch_size_per_card : 62
[2025/01/22 17:30:55] ppocr INFO: drop_last : True
[2025/01/22 17:30:55] ppocr INFO: num_workers : 8
[2025/01/22 17:30:55] ppocr INFO: shuffle : True
[2025/01/22 17:30:55] ppocr INFO: sampler :
[2025/01/22 17:30:55] ppocr INFO: divided_factor : [8, 16]
[2025/01/22 17:30:55] ppocr INFO: first_bs : 96
[2025/01/22 17:30:55] ppocr INFO: fix_bs : False
[2025/01/22 17:30:55] ppocr INFO: is_training : True
[2025/01/22 17:30:55] ppocr INFO: name : MultiScaleSampler
[2025/01/22 17:30:55] ppocr INFO: scales : [[320, 32], [320, 48], [320, 64]]
[2025/01/22 17:30:55] ppocr INFO: profiler_options : None
[2025/01/22 17:30:55] ppocr INFO: train with paddle 2.6.1 and device Place(gpu:0)
W0122 17:30:55.334977 53252 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.6, Runtime API Version: 11.7
W0122 17:30:55.342484 53252 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
[2025/01/22 17:31:02] ppocr INFO: load pretrain successful from pretrain_models/en_PP-OCRv4_rec_train/best_accuracy
[2025/01/22 17:31:02] ppocr INFO: infer_img: doc/imgs_en/K.png
[2025/01/22 17:31:05] ppocr INFO: result: K 0.9910803437232971
[2025/01/22 17:31:05] ppocr INFO: success!
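If the recognized text and its confidence need to be picked out of a log like the one above, the `result:` line can be parsed with a small script. The sketch below assumes the line format seen in this log (recognized text, whitespace, then a floating-point score at the end of the line); `parse_result` is a hypothetical helper, not part of PaddleOCR.

```python
import re

# Matches e.g. "... ppocr INFO: result: K 0.9910803437232971"
# Assumption: the score is the last whitespace-separated token on the line.
RESULT_RE = re.compile(
    r"result:\s*(?P<text>.*)\s+(?P<score>\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$"
)


def parse_result(line):
    """Return (text, score) for a result line, or None for any other line."""
    match = RESULT_RE.search(line)
    if match is None:
        return None
    return match.group("text"), float(match.group("score"))


line = "[2025/01/22 17:31:05] ppocr INFO: result: K 0.9910803437232971"
text, score = parse_result(line)
print(text, score)
```

Lines that do not contain a result, such as the final `success!` line, yield `None` and can be skipped.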