microsoft/OmniParser
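OmniParser is a tool that parses user interface screenshots into structured, easy-to-understand elements, improving GPT-4V's ability to generate actions that are accurately grounded in the corresponding regions of the interface.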
Tech stack
Framework
Flask
Testing
pytest
Dependencies (32)
NumPy
PyAutoGUI
accelerate
anthropic
azure-identity
boto3
dashscope
dill
easyocr
einops
google-auth
gradio
groq
jsonschema
openai
opencv-python
opencv-python-headless
paddleocr
paddlepaddle
pre-commit
pyautogui
pytest-asyncio
ruff
screeninfo
streamlit
supervision
timm
torch
torchvision
transformers
uiautomation
ultralytics