microsoft/OmniParser

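OmniParser is a tool that parses user interface screenshots into structured, easy-to-understand elements, improving GPT-4V's ability to generate actions that are accurately grounded in the corresponding regions of the interface.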

23,603 2,011 23,603 221

Tech stack

Frameworks

Flask

Testing

pytest

Dependencies

NumPy PyAutoGUI accelerate anthropic azure-identity boto3 dashscope dill easyocr einops google-auth gradio groq jsonschema openai opencv-python opencv-python-headless paddleocr paddlepaddle pre-commit pyautogui pytest-asyncio ruff screeninfo streamlit supervision timm torch torchvision transformers uiautomation ultralytics
