AdemBoukhris457

AdemBoukhris457/Doctra

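Doctra is a Python document-parsing library that parses, extracts, and analyzes documents in formats such as PDF and DOCX. It supports table extraction, chart analysis, and OCR, and provides a web interface and a command-line tool.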

Tech stack

Framework

Tornado

Testing

pytest

Networking

Requests

Dependencies (261)

-e Jinja2 MarkupSafe NumPy Pandas PyMuPDF PyYAML Pydantic Pygments SQLAlchemy Send2Trash aiohappyeyeballs aiohttp aiosignal airportsdata aistudio_sdk annotated-types anthropic anyio argon2-cffi argon2-cffi-bindings arrow astor asttokens async-lru attrs babel backoff bce-python-sdk beautifulsoup4 bleach build cachetools certifi cffi chardet charset-normalizer click cloudpickle colorama colorlog comm contourpy coverage cssselect cssutils cycler dataclasses-json datasets debugpy decorator defusedxml dill diskcache distro docutils dottxt einops et_xmlfile executing fastjsonschema filelock fonttools fqdn frozenlist fsspec ftfy future genson google-auth google-genai gradio greenlet h11 httpcore httpx httpx-sse huggingface-hub id idna imagesize iniconfig ipykernel ipython ipython_pygments_lexers iso3166 isoduration jaraco.classes jaraco.context jaraco.functools jedi jiter joblib json5 jsonpatch jsonpath-ng jsonpointer jsonschema jsonschema-specifications jupyter-events jupyter-lsp jupyter_client jupyter_core jupyter_server jupyter_server_terminals jupyterlab jupyterlab_pygments jupyterlab_server keyring kiwisolver langchain langchain-community langchain-core langchain-openai langchain-text-splitters langsmith lark llama_cpp_python llvmlite lxml markdown-it-py marshmallow matplotlib matplotlib-inline mdurl mistune mkdocs mkdocs-material mkdocs-static-i18n mkdocstrings modelscope more-itertools mpmath multidict multiprocess mypy_extensions nbclient nbconvert nbformat nest-asyncio networkx nh3 notebook notebook_shim numba nvidia-cublas-cu11 nvidia-cuda-nvrtc-cu11 nvidia-cuda-runtime-cu11 nvidia-cudnn-cu11 nvidia-cufft-cu11 nvidia-curand-cu11 nvidia-cusolver-cu11 nvidia-cusparse-cu11 ollama openai opencv-contrib-python opencv-python opencv-python-headless openpyxl opt-einsum orjson outlines outlines_core packaging paddleocr paddlepaddle paddlepaddle-gpu paddlex pandas-stubs pandocfilters parso pdf2image pillow platformdirs pluggy ply premailer prettytable prometheus_client 
prompt_toolkit propcache protobuf psutil pure_eval py-cpuinfo pyarrow pyasn1 pyasn1_modules pyclipper pycparser pycryptodome pydantic-settings pydantic_core pymdown-extensions pyparsing pypdfium2 pyproject_hooks pytesseract pytest-cov python-dateutil python-docx python-dotenv python-json-logger pytz pywin32 pywin32-ctypes pywinpty pyzmq readme_renderer referencing regex requests-toolbelt rfc3339-validator rfc3986 rfc3986-validator rfc3987-syntax rich rpds-py rsa ruamel.yaml ruamel.yaml.clib safetensors scikit-image scikit-learn scipy setuptools shapely six sniffio soupsieve stack-data sympy tenacity terminado tesseract threadpoolctl tiktoken tinycss2 tokenizers torch torchvision tqdm traitlets transformers twine types-python-dateutil types-pytz typing-inspect typing-inspection typing_extensions tzdata ujson uri-template urllib3 wcwidth webcolors webencodings websocket-client xxhash yarl zstandard
