AdemBoukhris457/Doctra
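Doctra is a Python document-parsing library that parses, extracts, and analyzes documents in PDF, DOCX, and other formats. It supports table extraction, chart analysis, and OCR, and provides a web interface and a command-line tool.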
Tags
Tech stack
Framework
Tornado
Testing
pytest
Networking
Requests
Dependencies
Jinja2
MarkupSafe
NumPy
Pandas
PyMuPDF
PyYAML
Pydantic
Pygments
SQLAlchemy
Send2Trash
aiohappyeyeballs
aiohttp
aiosignal
airportsdata
aistudio_sdk
annotated-types
anthropic
anyio
argon2-cffi
argon2-cffi-bindings
arrow
astor
asttokens
async-lru
attrs
babel
backoff
bce-python-sdk
beautifulsoup4
bleach
build
cachetools
certifi
cffi
chardet
charset-normalizer
click
cloudpickle
colorama
colorlog
comm
contourpy
coverage
cssselect
cssutils
cycler
dataclasses-json
datasets
debugpy
decorator
defusedxml
dill
diskcache
distro
docutils
dottxt
einops
et_xmlfile
executing
fastjsonschema
filelock
fonttools
fqdn
frozenlist
fsspec
ftfy
future
genson
google-auth
google-genai
gradio
greenlet
h11
httpcore
httpx
httpx-sse
huggingface-hub
id
idna
imagesize
iniconfig
ipykernel
ipython
ipython_pygments_lexers
iso3166
isoduration
jaraco.classes
jaraco.context
jaraco.functools
jedi
jiter
joblib
json5
jsonpatch
jsonpath-ng
jsonpointer
jsonschema
jsonschema-specifications
jupyter-events
jupyter-lsp
jupyter_client
jupyter_core
jupyter_server
jupyter_server_terminals
jupyterlab
jupyterlab_pygments
jupyterlab_server
keyring
kiwisolver
langchain
langchain-community
langchain-core
langchain-openai
langchain-text-splitters
langsmith
lark
llama_cpp_python
llvmlite
lxml
markdown-it-py
marshmallow
matplotlib
matplotlib-inline
mdurl
mistune
mkdocs
mkdocs-material
mkdocs-static-i18n
mkdocstrings
modelscope
more-itertools
mpmath
multidict
multiprocess
mypy_extensions
nbclient
nbconvert
nbformat
nest-asyncio
networkx
nh3
notebook
notebook_shim
numba
nvidia-cublas-cu11
nvidia-cuda-nvrtc-cu11
nvidia-cuda-runtime-cu11
nvidia-cudnn-cu11
nvidia-cufft-cu11
nvidia-curand-cu11
nvidia-cusolver-cu11
nvidia-cusparse-cu11
ollama
openai
opencv-contrib-python
opencv-python
opencv-python-headless
openpyxl
opt-einsum
orjson
outlines
outlines_core
packaging
paddleocr
paddlepaddle
paddlepaddle-gpu
paddlex
pandas-stubs
pandocfilters
parso
pdf2image
pillow
platformdirs
pluggy
ply
premailer
prettytable
prometheus_client
prompt_toolkit
propcache
protobuf
psutil
pure_eval
py-cpuinfo
pyarrow
pyasn1
pyasn1_modules
pyclipper
pycparser
pycryptodome
pydantic-settings
pydantic_core
pymdown-extensions
pyparsing
pypdfium2
pyproject_hooks
pytesseract
pytest-cov
python-dateutil
python-docx
python-dotenv
python-json-logger
pytz
pywin32
pywin32-ctypes
pywinpty
pyzmq
readme_renderer
referencing
regex
requests-toolbelt
rfc3339-validator
rfc3986
rfc3986-validator
rfc3987-syntax
rich
rpds-py
rsa
ruamel.yaml
ruamel.yaml.clib
safetensors
scikit-image
scikit-learn
scipy
setuptools
shapely
six
sniffio
soupsieve
stack-data
sympy
tenacity
terminado
tesseract
threadpoolctl
tiktoken
tinycss2
tokenizers
torch
torchvision
tqdm
traitlets
transformers
twine
types-python-dateutil
types-pytz
typing-inspect
typing-inspection
typing_extensions
tzdata
ujson
uri-template
urllib3
wcwidth
webcolors
webencodings
websocket-client
xxhash
yarl
zstandard