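Awesome OCR toolkit based on PaddlePaddle (8.6M ultra-lightweight pre-trained model; supports training and deployment across server, mobile, embedded and IoT devices)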

License: Apache License 2.0

Tags ocr, textdetection, textrecognition, paddleocr, crnn, east, star-net, rosetta, ocrlite, db, chineseocr, chinesetextdetection, chinesetextrecognition

Intended Audience
- Developers
Natural Language
- Chinese (Simplified)
Operating System
- OS Independent
Programming Language
Topic
- Utilities

Project description

paddleocr package

1 Quick start

1.1 Install the package

Install from PyPI

pip install "paddleocr>=2.0.1" # Recommend to use version 2.0.1+

Build your own whl package and install it

python3 setup.py bdist_wheel
pip3 install dist/paddleocr-x.x.x-py3-none-any.whl # x.x.x is the version of paddleocr
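After installing, it can be useful to confirm that the installed version satisfies the recommended minimum of 2.0.1. The following is a minimal sketch, not part of paddleocr: the helper names `parse_version`, `meets_minimum` and `installed_paddleocr_version` are hypothetical, and the comparison assumes purely numeric version strings such as `2.6.0.1`.

```python
from importlib import metadata


def parse_version(text):
    """Turn a purely numeric version such as '2.6.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum="2.0.1"):
    """Return True when `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


def installed_paddleocr_version():
    """Return the installed paddleocr version string, or None if it is missing."""
    try:
        return metadata.version("paddleocr")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_paddleocr_version()
    if found is None:
        print("paddleocr is not installed")
    else:
        print(found, "OK" if meets_minimum(found) else "older than 2.0.1")
```

Pre-release suffixes (for example `rc1`) are not handled by this sketch; a full version parser would be needed for those.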

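2 Usage

2.1 Use from code

The paddleocr whl package automatically downloads the ppocr lightweight model as its default model. To replace it with your own model, see section 3 (custom models).

Detection, angle classification and recognition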

from paddleocr import PaddleOCR,draw_ocr
# PaddleOCR supports Chinese, English, French, German, Korean and Japanese.
# Set the `lang` parameter to `ch`, `en`, `french`, `german`, `korean` or `japan`
# to select the corresponding language model.
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)


# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

The output is a list; each item contains the bounding box, the text and the recognition confidence

[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......
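The shape of each item can be illustrated without running the model. The sketch below copies two entries from the sample output and splits them the same way the drawing code does; `unpack`, `confident_text` and the 0.95 cutoff are hypothetical names and values chosen for illustration, not part of paddleocr.

```python
# Two entries copied from the sample output above.
sample_result = [
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]],
     ['ACKNOWLEDGEMENTS', 0.99283075]],
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]],
     ['We would like to thank all the designers and', 0.9357758]],
]


def unpack(result):
    """Split OCR entries into parallel lists of boxes, texts and scores."""
    boxes = [entry[0] for entry in result]
    texts = [entry[1][0] for entry in result]
    scores = [entry[1][1] for entry in result]
    return boxes, texts, scores


def confident_text(result, min_score=0.95):
    """Join the text of entries scoring at least `min_score` (hypothetical cutoff)."""
    return "\n".join(text for _, (text, score) in result if score >= min_score)


boxes, texts, scores = unpack(sample_result)
print(texts)
print(confident_text(sample_result))
```

Each box is a list of four [x, y] corner points, which is why `boxes` can be passed straight to `draw_ocr`.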

Visualization of results

Detection and recognition

from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR(lang='en') # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=False)
for line in result:
    print(line)

# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

The output is a list; each item contains the bounding box, the text and the recognition confidence

[[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
[[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
[[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
......

Visualization of results

Classification and recognition

from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang='en') # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=True)
for line in result:
    print(line)

The output is a list; each item contains the recognized text and the confidence

['PAIN', 0.990372]

Detection only

from paddleocr import PaddleOCR,draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path,rec=False)
for line in result:
    print(line)

# draw result
from PIL import Image

image = Image.open(img_path).convert('RGB')
im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

The output is a list; each item contains only the bounding box

[[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]]
[[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]
[[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]]
......
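Each detected box is a list of four [x, y] corner points, and the corners need not form an axis-aligned rectangle. As a minimal sketch (the helpers `bounding_rect` and `sort_top_to_bottom` are hypothetical, not paddleocr functions), the boxes can be reduced to enclosing rectangles and put into a rough reading order:

```python
def bounding_rect(quad):
    """Return (x_min, y_min, x_max, y_max) enclosing a four-point box."""
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return min(xs), min(ys), max(xs), max(ys)


def sort_top_to_bottom(quads):
    """Order boxes by their top edge, then by their left edge."""
    return sorted(quads, key=lambda q: (bounding_rect(q)[1], bounding_rect(q)[0]))


# Boxes copied from the sample output above.
detected = [
    [[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]],
    [[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]],
    [[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]],
]

for quad in sort_top_to_bottom(detected):
    print(bounding_rect(quad))
```

Sorting by exact top edge is a simplification; slightly skewed boxes on the same text line may need a tolerance when grouping.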

Visualization of results

Recognition only

from paddleocr import PaddleOCR
ocr = PaddleOCR(lang='en') # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, cls=False)
for line in result:
    print(line)

The output is a list; each item contains the recognized text and the confidence

['PAIN', 0.990372]

Classification only

from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True) # need to run only once to load model into memory
img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
result = ocr.ocr(img_path, det=False, rec=False, cls=True)
for line in result:
    print(line)

The output is a list; each item contains the classification result and the confidence

['0', 0.99999964]
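The classification label is one of the entries of `label_list` (`'0'` or `'180'`), so it can be used to decide whether a crop should be flipped before further processing. The sketch below is illustrative only: `needs_rotation`, `rotate_180` and the 0.9 confidence cutoff are assumptions made for this example, and a plain list of rows stands in for an image.

```python
def needs_rotation(cls_result, min_confidence=0.9):
    """Return True when the classifier says the crop is upside down.

    `cls_result` is a [label, confidence] pair as printed above.
    `min_confidence` is an assumed cutoff for this sketch.
    """
    label, confidence = cls_result
    return label == '180' and confidence >= min_confidence


def rotate_180(rows):
    """Rotate a 2-D grid (list of rows) by 180 degrees."""
    return [list(reversed(row)) for row in reversed(rows)]


grid = [[1, 2, 3],
        [4, 5, 6]]

print(needs_rotation(['0', 0.99999964]))
print(needs_rotation(['180', 0.97]))
print(rotate_180(grid))
```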

2.2 Use from the command line

Show help information

paddleocr -h

Detection, classification and recognition

paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --use_angle_cls true --lang en

The output is a list; each item contains the bounding box, the text and the recognition confidence

[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]
......
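The printed lines are Python literals, so they can be read back into data structures when the command-line output has been captured as text. This is a minimal sketch that assumes the output looks exactly like the lines shown above; `parse_cli_lines` is a hypothetical helper, and lines that are not literals (log messages, the `......` ellipsis) are skipped.

```python
import ast

# Lines copied from the sample output above.
cli_output = """\
[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
......
"""


def parse_cli_lines(text):
    """Parse each printed line into (box, text, score); skip anything else."""
    parsed = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("[["):
            continue  # not a result line, e.g. a log message or the ellipsis
        try:
            box, (recognized, score) = ast.literal_eval(line)
        except (ValueError, SyntaxError, TypeError):
            continue  # malformed line: ignore it rather than fail
        parsed.append((box, recognized, score))
    return parsed


for box, recognized, score in parse_cli_lines(cli_output):
    print(f"{score:.3f}  {recognized}")
```

`ast.literal_eval` is used instead of `eval` because it only accepts literals and does not execute code.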

Detection and recognition

paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --lang en

The output is a list; each item contains the bounding box, the text and the recognition confidence

[[[441.0, 174.0], [1166.0, 176.0], [1165.0, 222.0], [441.0, 221.0]], ('ACKNOWLEDGEMENTS', 0.9971134662628174)]
[[[403.0, 346.0], [1204.0, 348.0], [1204.0, 384.0], [402.0, 383.0]], ('We would like to thank all the designers and', 0.9761400818824768)]
[[[403.0, 396.0], [1204.0, 398.0], [1204.0, 434.0], [402.0, 433.0]], ('contributors who have been involved in the', 0.9791957139968872)]
......

Classification and recognition

paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --det false --lang en

The output is a list; each item contains the text and the recognition confidence

['PAIN', 0.9934559464454651]

Detection only

paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --rec false

The output is a list; each item contains only the bounding box

[[397.0, 802.0], [1092.0, 802.0], [1092.0, 841.0], [397.0, 841.0]]
[[397.0, 750.0], [1211.0, 750.0], [1211.0, 789.0], [397.0, 789.0]]
[[397.0, 702.0], [1209.0, 698.0], [1209.0, 734.0], [397.0, 738.0]]
......

Recognition only

paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --det false --lang en

The output is a list; each item contains the text and the recognition confidence

['PAIN', 0.9934559464454651]

Classification only

paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --det false --rec false

The output is a list; each item contains the classification result and the confidence

['0', 0.99999964]

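3 Use a custom model

When the built-in models do not meet your needs, use your own trained models. First, follow the first section of inference_en.md to convert your det and rec models to inference models, then use them as follows.

3.1 Use from code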

from paddleocr import PaddleOCR,draw_ocr
# The path of detection and recognition model must contain model and params files
ocr = PaddleOCR(det_model_dir='{your_det_model_dir}', rec_model_dir='{your_rec_model_dir}', rec_char_dict_path='{your_rec_char_dict_path}', cls_model_dir='{your_cls_model_dir}', use_angle_cls=True)
img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

# draw result
from PIL import Image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
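Because each model directory must contain both the model file and the params file, a quick check before constructing `PaddleOCR` can give a clearer error than a failed load. The following is a sketch only: `missing_model_files` is a hypothetical helper, and the default file names `inference.pdmodel` and `inference.pdiparams` are an assumption about how the inference model was exported; pass the names your export actually produced.

```python
from pathlib import Path
import tempfile


def missing_model_files(model_dir, required=("inference.pdmodel", "inference.pdiparams")):
    """Return the names in `required` that are absent from `model_dir`.

    The default file names are an assumption for this sketch; pass the names
    your exported inference model actually uses.
    """
    folder = Path(model_dir)
    return [name for name in required if not (folder / name).is_file()]


# Demonstration with a temporary folder standing in for {your_det_model_dir}.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "inference.pdmodel").write_bytes(b"")
    print(missing_model_files(tmp))
```

An empty list means every required file was found in the directory.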

3.2 Use from the command line

paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir} --rec_char_dict_path {your_rec_char_dict_path} --cls_model_dir {your_cls_model_dir} --use_angle_cls true

4 Use a web image or NumPy array as input

4.1 Web image

Use from code

from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory
img_path = 'http://n.sinaimg.cn/ent/transform/w630h933/20171222/o111-fypvuqf1838418.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

# show result
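from PIL import Image
from paddleocr import download_with_progressbar
# Image.open cannot read a URL, so download the image to a local file first
download_with_progressbar(img_path, 'tmp.jpg')
image = Image.open('tmp.jpg').convert('RGB')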
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

Use from the command line

paddleocr --image_dir http://n.sinaimg.cn/ent/transform/w630h933/20171222/o111-fypvuqf1838418.jpg --use_angle_cls=true

4.2 NumPy array

NumPy arrays are supported as input only when used from code

import cv2
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg'
img = cv2.imread(img_path)
# img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # uncomment if your own trained model supports grayscale images
result = ocr.ocr(img, cls=True)  # pass the numpy array, not the path
for line in result:
    print(line)

# show result
from PIL import Image

image = Image.open(img_path).convert('RGB')  # the image is a local file, so no download is needed
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

5 Parameter description

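Parameter	Description	Default value
use_gpu	Whether to use the GPU	TRUE
gpu_mem	GPU memory size used for initialization	8000M
image_dir	Image path or folder path used for prediction when run from the command line
det_algorithm	Type of detection algorithm selected	DB
det_model_dir	Text detection inference model folder. There are two ways to pass this parameter: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/det`; 2. the path of an inference model you converted yourself; the model and params files must be in that path	None
det_max_side_len	Maximum size of the long side of the image. When the long side exceeds this value, it is resized to this size and the short side is scaled proportionally	960
det_db_thresh	Binarization threshold of the DB output map	0.3
det_db_box_thresh	Threshold of the DB output boxes. Boxes scoring below this value are discarded	0.5
det_db_unclip_ratio	Expansion ratio of the DB output boxes	2
det_db_score_mode	Controls how the score of a detection box is computed. The options are 'fast' and 'slow'. If the text to be detected is curved, 'slow' is recommended	'fast'
det_east_score_thresh	Binarization threshold of the EAST output map	0.8
det_east_cover_thresh	Threshold of the EAST output boxes. Boxes scoring below this value are discarded	0.1
det_east_nms_thresh	NMS threshold of the EAST model output boxes	0.2
rec_algorithm	Type of recognition algorithm selected	CRNN
rec_model_dir	Text recognition inference model folder. There are two ways to pass this parameter: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/rec`; 2. the path of an inference model you converted yourself; the model and params files must be in that path	None
rec_image_shape	Image shape used by the recognition algorithm	"3,32,320"
rec_batch_num	Batch size of forward images during recognition	30
max_text_length	Maximum text length the recognition algorithm can recognize	25
rec_char_dict_path	Path of the character dictionary; must be changed to your own path when `rec_model_Name` uses option 2	./ppocr/utils/ppocr_keys_v1.txt
use_space_char	Whether to recognize spaces	TRUE
drop_score	Filters the output by score (from the recognition model); results below this score are not returned	0.5
use_angle_cls	Whether to load the classification model	FALSE
cls_model_dir	Classification inference model folder. There are two ways to pass this parameter: 1. None: the built-in model is downloaded automatically to `~/.paddleocr/cls`; 2. the path of an inference model you converted yourself; the model and params files must be in that path	None
cls_image_shape	Image shape used by the classification algorithm	"3,48,192"
label_list	Label list of the classification algorithm	['0','180']
cls_batch_num	Batch size of forward images during classification	30
enable_mkldnn	Whether to enable mkldnn	FALSE
use_zero_copy_run	Whether to run the forward pass via zero_copy_run	FALSE
lang	Supported languages; currently only Chinese (ch), English (en), French (french), German (german), Korean (korean) and Japanese (japan) are supported	ch
det	Enable detection when `ppocr.ocr` is executed	TRUE
rec	Enable recognition when `ppocr.ocr` is executed	TRUE
cls	Enable classification when `ppocr.ocr` is executed (in command-line mode, use use_angle_cls to control whether classification runs)	FALSE
show_log	Whether to print logs	FALSE
type	Perform OCR or table structuring; the value is one of ['ocr','structure']	ocr
ocr_version	OCR model version. Currently supported: PP-OCRv3 supports Chinese and English detection, recognition, multilingual recognition and direction classifier models; PP-OCRv2 supports the Chinese detection and recognition models; PP-OCR supports Chinese detection, recognition and direction classifier models and multilingual recognition models	PP-OCRv3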
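Two of the parameters above, `det_max_side_len` and `drop_score`, describe simple rules that can be written out directly. The sketch below follows the wording of the table only and is not the library's implementation: `limit_long_side` and `apply_drop_score` are hypothetical helpers, and any additional rounding or padding the real pipeline may apply is not modeled.

```python
def limit_long_side(width, height, max_side_len=960):
    """Scale (width, height) so the long side is at most `max_side_len`."""
    long_side = max(width, height)
    if long_side <= max_side_len:
        return width, height  # already small enough: leave unchanged
    scale = max_side_len / long_side
    return round(width * scale), round(height * scale)


def apply_drop_score(entries, drop_score=0.5):
    """Keep only (text, score) entries whose score is at least `drop_score`."""
    return [(text, score) for text, score in entries if score >= drop_score]


print(limit_long_side(1920, 1080))
print(limit_long_side(640, 480))
print(apply_drop_score([("PAIN", 0.99), ("noise", 0.31)]))
```

Raising `drop_score` trades recall for precision: fewer low-confidence strings are returned, at the cost of occasionally dropping correct text.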


Release history

2.6.0.1 (this version), September 7, 2022
2.6, August 24, 2022
2.5.0.3, May 10, 2022
2.5.0.2

Translated and maintained by Python 中文网.
