awslabs/agent-evaluation


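Agent Evaluation is a generative AI-powered framework for testing virtual AI assistants. Its built-in LLM evaluator holds multi-turn conversations with the target agent under test and evaluates the quality of each response as the conversation proceeds. It can be integrated into CI/CD pipelines and includes built-in support for services such as Amazon Bedrock and Amazon Q Business.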
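
The evaluate-as-you-converse loop described above can be sketched in a few lines of Python. This is only a conceptual illustration, not agent-evaluation's actual API: `EvalResult`, `run_test`, `echo_target`, and `keyword_judge` are hypothetical names, and a simple deterministic function stands in for the LLM evaluator so the example runs without any cloud access.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical types for illustration only; not part of agent-evaluation.
Target = Callable[[str], str]       # user message -> agent reply
Judge = Callable[[str, str], bool]  # (user message, agent reply) -> acceptable?


@dataclass
class EvalResult:
    passed: bool
    conversation: List[Tuple[str, str]] = field(default_factory=list)


def run_test(target: Target, steps: List[str], judge: Judge) -> EvalResult:
    """Send each step to the target and judge every reply as it arrives."""
    conversation: List[Tuple[str, str]] = []
    for message in steps:
        reply = target(message)
        conversation.append((message, reply))
        # Stop at the first unacceptable reply, keeping the transcript so far.
        if not judge(message, reply):
            return EvalResult(passed=False, conversation=conversation)
    return EvalResult(passed=True, conversation=conversation)


def echo_target(message: str) -> str:
    """Stand-in for a real agent: repeats the user's message back."""
    return f"You said: {message}"


def keyword_judge(message: str, reply: str) -> bool:
    """Stand-in for an LLM evaluator: passes if the reply mentions the message."""
    return message in reply


if __name__ == "__main__":
    result = run_test(echo_target, ["hello", "how many open claims?"], keyword_judge)
    print(result.passed, len(result.conversation))
```

In a real setup the judge and the message generation would both be delegated to an LLM, and the target would call a deployed agent; the control flow, one turn at a time with a verdict after each reply, is the part this sketch is meant to show.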



Tech stack

Root directory python

Dependencies (7)

Pydantic boto3 click jinja2 jsonpath-ng pyyaml rich

samples/aws_step_functions_deployment/layers/agent-evaluation python

Dependencies (1)

agent-evaluation

samples/aws_step_functions_deployment/layers/aws-lambda-powertools python

Dependencies (1)

aws-lambda-powertools

samples/aws_step_functions_deployment python

Dependencies (3)

aws-cdk-lib constructs pathlib

samples/streamlit_app python

Frameworks

Tornado

Networking

Requests

Dependencies (81)

Faker GitPython Jinja2 Markdown MarkupSafe NumPy Pandas PyYAML Pydantic Pygments SQLAlchemy agent-evaluation altair annotated-types attrs beautifulsoup4 blinker boto3 botocore cachetools certifi charset-normalizer click contourpy cycler entrypoints favicon fonttools gitdb htbuilder idna jmespath jsonpath-ng jsonschema jsonschema-specifications kiwisolver lxml markdown-it-py markdownlit matplotlib mdurl more-itertools narwhals packaging pillow plotly ply prometheus_client protobuf pyarrow pydantic_core pydeck pymdown-extensions pyparsing python-dateutil pytz referencing rich rpds-py s3transfer six smmap soupsieve st-annotated-text st-theme streamlit streamlit-camera-input-live streamlit-card streamlit-embedcode streamlit-extras streamlit-faker streamlit-image-coordinates streamlit-keyup streamlit-toggle-switch streamlit-vertical-slider tenacity toml typing_extensions tzdata urllib3 validators
