microsoft/OmniParser

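OmniParser is a tool that parses user interface screenshots into structured, easy-to-understand elements, improving GPT-4V's ability to generate actions that correspond accurately to regions of the interface.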

23,603 2,011 23,603 221

Tech stack

Framework

Flask

Testing

pytest

Dependencies

NumPy PyAutoGUI accelerate anthropic azure-identity boto3 dashscope dill easyocr einops google-auth gradio groq jsonschema openai opencv-python opencv-python-headless paddleocr paddlepaddle pre-commit pyautogui pytest-asyncio ruff screeninfo streamlit supervision timm torch torchvision transformers uiautomation ultralytics
