huridocs

huridocs/pdf-document-layout-analysis

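A Docker-based microservice for PDF document layout analysis. It supports OCR, segmentation and classification of page elements (text, tables, and so on), reading-order detection, and format conversion. It provides a REST API and a Gradio web interface, offers GPU acceleration and automatic translation, and is suited to intelligent content extraction and analysis.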
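
As a rough illustration of how a client might use the REST API described above, the sketch below posts a PDF to a running instance and groups the returned segments by type. Several details are assumptions rather than facts stated on this page: the service URL `http://localhost:5060`, the multipart field name `file`, and a response that is a JSON list of segments in reading order with `type` and `text` keys. The function names and sample data are hypothetical; check the project's README for the actual contract.

```python
from collections import defaultdict


def analyze_pdf(pdf_path, service_url="http://localhost:5060"):
    """POST a PDF to a running layout-analysis service and return the parsed JSON.

    Assumptions (not confirmed by this page): the service listens at
    service_url and accepts the PDF as a multipart form field named "file".
    """
    import requests  # third-party; imported here so the helper below runs without it

    with open(pdf_path, "rb") as pdf_file:
        response = requests.post(service_url, files={"file": pdf_file}, timeout=600)
    response.raise_for_status()
    return response.json()


def group_text_by_type(segments):
    """Group segment texts by their type, preserving the order of the input.

    Assumes each segment is a dict with "type" and "text" keys; missing keys
    fall back to "unknown" and an empty string.
    """
    grouped = defaultdict(list)
    for segment in segments:
        grouped[segment.get("type", "unknown")].append(segment.get("text", ""))
    return dict(grouped)


# Hypothetical sample data standing in for a real service response.
sample_segments = [
    {"type": "Title", "text": "Annual Report", "page_number": 1},
    {"type": "Text", "text": "First paragraph.", "page_number": 1},
    {"type": "Table", "text": "Year Total", "page_number": 1},
    {"type": "Text", "text": "Second paragraph.", "page_number": 2},
]
grouped = group_text_by_type(sample_segments)
print(grouped["Text"])
```

With a service actually running, `group_text_by_type(analyze_pdf("report.pdf"))` would apply the same grouping to real output, provided the response has the assumed shape.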

1,133 127 1,133 11
View on GitHub: huridocs/pdf-document-layout-analysis

Tech stack

Root directory: Python

Framework

FastAPI

Networking

Requests

Dependencies

Pillow PyMuPDF Pydantic Shapely cachetools gunicorn huggingface_hub hydra-core latex2mathml lightgbm ollama opencv-python pdf-annotate pdf2image pix2tex pypandoc python-multipart rapid-table rapidocr roman scipy setuptools torch torchvision transformers uvicorn

Screenshots

https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/ui.png
https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample1.png
https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample2.png
https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample3.png
https://raw.githubusercontent.com/huridocs/pdf-document-layout-analysis/main/images/vgtexample4.png

