benchflow-ai/skillsbench

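SkillsBench is the first benchmark dedicated to evaluating how well AI agents use skills. It tests how effectively agents execute complex workflows by giving them modular skill folders (containing instructions, scripts, and similar resources). It provides a Gym-style benchmarking framework, supports tasks that combine multiple skills, and aims to build a high-quality, broad-coverage standard for skill evaluation.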
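As a rough illustration of the Gym-style idea described above, the sketch below models a task as an environment with `reset` and `step`, where the agent is offered a set of skills and the episode succeeds once all required skills have been used. Every name here (`Skill`, `ToyTaskEnv`, `run_episode`, `first_unused`) and the reward rule are hypothetical and are not taken from the SkillsBench codebase.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical stand-in for a skill folder: instructions plus script names."""
    name: str
    instructions: str
    scripts: list[str] = field(default_factory=list)


class ToyTaskEnv:
    """Hypothetical Gym-style task: solved once every required skill has been used."""

    def __init__(self, skills: list[Skill], required: set[str], max_steps: int = 5):
        self.skills = {s.name: s for s in skills}
        self.required = set(required)
        self.max_steps = max_steps
        self.used: set[str] = set()
        self.steps = 0

    def reset(self) -> dict:
        # Start a fresh episode and return the first observation.
        self.used = set()
        self.steps = 0
        return self._observation()

    def step(self, skill_name: str) -> tuple[dict, float, bool]:
        # Unknown skill names are ignored but still consume a step.
        self.steps += 1
        if skill_name in self.skills:
            self.used.add(skill_name)
        solved = self.required <= self.used
        done = solved or self.steps >= self.max_steps
        return self._observation(), (1.0 if solved else 0.0), done

    def _observation(self) -> dict:
        return {"available": sorted(self.skills), "used": sorted(self.used)}


def first_unused(obs: dict) -> str:
    """Toy policy: pick the first available skill that has not been used yet."""
    unused = [n for n in obs["available"] if n not in obs["used"]]
    return unused[0] if unused else obs["available"][0]


def run_episode(env: ToyTaskEnv, policy) -> float:
    """Run one episode and return the final reward."""
    obs = env.reset()
    reward, done = 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
    return reward
```

A multi-skill task in this toy model is simply one whose `required` set contains more than one skill name; a real harness would score the agent's actual output rather than which skills it invoked.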

Tech stack

experiments/metrics-dashboard javascript

Framework

React ^18.3.1

Build tool

Vite ^6.0.6

CSS framework

Tailwind CSS ^3.4.17

Dependencies

@radix-ui/react-dialog ^1.1.15
@radix-ui/react-dropdown-menu ^2.1.16
@radix-ui/react-popover ^1.1.15
@radix-ui/react-select ^2.2.6
@radix-ui/react-slot ^1.2.4
@radix-ui/react-tabs ^1.1.13
@tanstack/react-table ^8.21.3
class-variance-authority ^0.7.1
clsx ^2.1.1
cors ^2.8.6
express ^5.2.1
lucide-react ^0.563.0
react-dom ^18.3.1
recharts ^3.7.0
tailwind-merge ^3.4.0

Dev dependencies

@types/cors ^2.8.19
@types/express ^5.0.6
@types/node ^22.10.0
@types/react ^18.3.18
@types/react-dom ^18.3.5
@vitejs/plugin-react ^4.3.4
autoprefixer ^10.4.20
postcss ^8.4.49
tsx ^4.21.0
typescript ^5.7.2

tasks/fix-visual-stability/environment/api-server javascript

Dependencies

cors ^2.8.5
express ^4.18.2

tasks/fix-visual-stability/environment/app javascript

Frameworks

Next.js 14.0.4
React 18.2.0

Testing

Playwright ^1.49.1

CSS framework

Tailwind CSS ^4.1.18

Dependencies

pngjs ^7.0.0
react-dom 18.2.0

Dev dependencies

@tailwindcss/postcss ^4.1.18
@types/node 20.10.0
@types/react 18.2.0
autoprefixer ^10.4.23
postcss ^8.5.6
typescript 5.3.3

tasks/flink-query/environment/workspace java

Dependencies

org.apache.flink:flink-clients
org.apache.flink:flink-connector-kafka
org.apache.flink:flink-connector-wikiedits
org.apache.flink:flink-examples-streaming
org.apache.flink:flink-java
org.apache.flink:flink-streaming-java
org.apache.flink:flink-walkthrough-common
org.apache.logging.log4j:log4j-api
org.apache.logging.log4j:log4j-core
org.apache.logging.log4j:log4j-slf4j-impl

tasks/react-performance-debugging/environment/api-simulator javascript

Dependencies

express ^4.18.2

Dev dependencies

@types/express ^4.17.21
@types/node ^20.10.0
tsx ^4.7.0
typescript ^5.3.0

tasks/simpo-code-reproduction/environment/SimPO python

Framework

Tornado 6.1=py310h5764c6d_3

Dependencies

_libgcc_mutex 0.1=main
_openmp_mutex 5.1=1_gnu
asttokens 2.4.1=pyhd8ed1ab_0
bzip2 1.0.8=h5eee18b_5
ca-certificates 2024.2.2=hbcca054_0
comm 0.2.2=pyhd8ed1ab_0
debugpy 1.6.7=py310h6a678d5_0
decorator 5.1.1=pyhd8ed1ab_0
entrypoints 0.4=pyhd8ed1ab_0
executing 2.0.1=pyhd8ed1ab_0
ipykernel 6.29.3=pyhd33586a_0
ipython 8.22.2=pyh707e725_0
jedi 0.19.1=pyhd8ed1ab_0
jupyter_client 7.3.4=pyhd8ed1ab_0
jupyter_core 5.7.2=py310hff52083_0
ld_impl_linux-64 2.38=h1181459_1
libffi 3.4.4=h6a678d5_0
libgcc-ng 11.2.0=h1234567_1
libgomp 11.2.0=h1234567_1
libsodium 1.0.18=h36c2ea0_1
libstdcxx-ng 11.2.0=h1234567_1
libuuid 1.41.5=h5eee18b_0
matplotlib-inline 0.1.7=pyhd8ed1ab_0
ncurses 6.4=h6a678d5_0
nest-asyncio 1.6.0=pyhd8ed1ab_0
openssl 3.0.13=h7f8727e_1
packaging 24.0=pyhd8ed1ab_0
parso 0.8.4=pyhd8ed1ab_0
pexpect 4.9.0=pyhd8ed1ab_0
pickleshare 0.7.5=py_1003
platformdirs 4.2.1=pyhd8ed1ab_0
ptyprocess 0.7.0=pyhd3deb0d_0
pure_eval 0.2.2=pyhd8ed1ab_0
python 3.10.14=h955ad1f_0
python_abi 3.10=2_cp310
pyzmq 25.1.2=py310h6a678d5_0
readline 8.2=h5eee18b_0
setuptools 68.2.2=py310h06a4308_0
six 1.16.0=pyh6c4a22f_0
sqlite 3.41.2=h5eee18b_0
stack_data 0.6.2=pyhd8ed1ab_0
tk 8.6.12=h1ccaba5_0
traitlets 5.14.3=pyhd8ed1ab_0
typing_extensions 4.11.0=pyha770c72_0
wcwidth 0.2.13=pyhd8ed1ab_0
wheel 0.41.2=py310h06a4308_0
xz 5.4.6=h5eee18b_0
zeromq 4.3.5=h6a678d5_0
zlib 1.2.13=h5eee18b_0

tasks/spring-boot-jakarta-migration/environment/workspace java

Framework

Spring Boot Web

Dependencies

com.h2database:h2
io.jsonwebtoken:jjwt
javax.xml.bind:jaxb-api
org.springframework.boot:spring-boot-starter-data-jpa
org.springframework.boot:spring-boot-starter-security
org.springframework.boot:spring-boot-starter-test
org.springframework.boot:spring-boot-starter-validation
org.springframework.security:spring-security-test

tasks_excluded/scheduling-email-assistant/environment/skills/gmail-skill javascript

Dependencies

googleapis ^144.0.0
mailcomposer ^4.0.1
minimist ^1.2.8
open ^10.1.0

tasks_excluded/scheduling-email-assistant/environment/skills/google-calendar-skill javascript

Dependencies

googleapis ^144.0.0
minimist ^1.2.8
open ^10.1.0
