benchflow-ai/skillsbench
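SkillsBench is the first benchmark dedicated to evaluating how well AI agents use skills. It tests how effectively agents execute complex workflows using modular skill folders (containing instructions, scripts, and so on). It provides a Gym-style benchmarking framework, supports tasks that combine multiple skills, and aims to build a high-quality, broad-coverage standard for skill evaluation.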
Tech stack
experiments/metrics-dashboard javascript
Framework
React
^18.3.1
Build tool
Vite
^6.0.6
CSS framework
Tailwind CSS
^3.4.17
All dependencies (25)
Dependencies
@radix-ui/react-dialog
^1.1.15
@radix-ui/react-dropdown-menu
^2.1.16
@radix-ui/react-popover
^1.1.15
@radix-ui/react-select
^2.2.6
@radix-ui/react-slot
^1.2.4
@radix-ui/react-tabs
^1.1.13
@tanstack/react-table
^8.21.3
class-variance-authority
^0.7.1
clsx
^2.1.1
cors
^2.8.6
express
^5.2.1
lucide-react
^0.563.0
react-dom
^18.3.1
recharts
^3.7.0
tailwind-merge
^3.4.0
Dev dependencies
@types/cors
^2.8.19
@types/express
^5.0.6
@types/node
^22.10.0
@types/react
^18.3.18
@types/react-dom
^18.3.5
@vitejs/plugin-react
^4.3.4
autoprefixer
^10.4.20
postcss
^8.4.49
tsx
^4.21.0
typescript
^5.7.2
tasks/fix-visual-stability/environment/api-server javascript
All dependencies (2)
Dependencies
cors
^2.8.5
express
^4.18.2
tasks/fix-visual-stability/environment/app javascript
Framework
Next.js
14.0.4
React
18.2.0
Testing
Playwright
^1.49.1
CSS framework
Tailwind CSS
^4.1.18
All dependencies (8)
Dependencies
pngjs
^7.0.0
react-dom
18.2.0
Dev dependencies
@tailwindcss/postcss
^4.1.18
@types/node
20.10.0
@types/react
18.2.0
autoprefixer
^10.4.23
postcss
^8.5.6
typescript
5.3.3
tasks/flink-query/environment/workspace java
All dependencies (10)
Dependencies
org.apache.flink:flink-clients
org.apache.flink:flink-connector-kafka
org.apache.flink:flink-connector-wikiedits
org.apache.flink:flink-examples-streaming
org.apache.flink:flink-java
org.apache.flink:flink-streaming-java
org.apache.flink:flink-walkthrough-common
org.apache.logging.log4j:log4j-api
org.apache.logging.log4j:log4j-core
org.apache.logging.log4j:log4j-slf4j-impl
tasks/react-performance-debugging/environment/api-simulator javascript
All dependencies (5)
Dependencies
express
^4.18.2
Dev dependencies
@types/express
^4.17.21
@types/node
^20.10.0
tsx
^4.7.0
typescript
^5.3.0
tasks/simpo-code-reproduction/environment/SimPO python
Framework
Tornado
6.1=py310h5764c6d_3
All dependencies (49)
Dependencies
_libgcc_mutex
0.1=main
_openmp_mutex
5.1=1_gnu
asttokens
2.4.1=pyhd8ed1ab_0
bzip2
1.0.8=h5eee18b_5
ca-certificates
2024.2.2=hbcca054_0
comm
0.2.2=pyhd8ed1ab_0
debugpy
1.6.7=py310h6a678d5_0
decorator
5.1.1=pyhd8ed1ab_0
entrypoints
0.4=pyhd8ed1ab_0
executing
2.0.1=pyhd8ed1ab_0
ipykernel
6.29.3=pyhd33586a_0
ipython
8.22.2=pyh707e725_0
jedi
0.19.1=pyhd8ed1ab_0
jupyter_client
7.3.4=pyhd8ed1ab_0
jupyter_core
5.7.2=py310hff52083_0
ld_impl_linux-64
2.38=h1181459_1
libffi
3.4.4=h6a678d5_0
libgcc-ng
11.2.0=h1234567_1
libgomp
11.2.0=h1234567_1
libsodium
1.0.18=h36c2ea0_1
libstdcxx-ng
11.2.0=h1234567_1
libuuid
1.41.5=h5eee18b_0
matplotlib-inline
0.1.7=pyhd8ed1ab_0
ncurses
6.4=h6a678d5_0
nest-asyncio
1.6.0=pyhd8ed1ab_0
openssl
3.0.13=h7f8727e_1
packaging
24.0=pyhd8ed1ab_0
parso
0.8.4=pyhd8ed1ab_0
pexpect
4.9.0=pyhd8ed1ab_0
pickleshare
0.7.5=py_1003
platformdirs
4.2.1=pyhd8ed1ab_0
ptyprocess
0.7.0=pyhd3deb0d_0
pure_eval
0.2.2=pyhd8ed1ab_0
python
3.10.14=h955ad1f_0
python_abi
3.10=2_cp310
pyzmq
25.1.2=py310h6a678d5_0
readline
8.2=h5eee18b_0
setuptools
68.2.2=py310h06a4308_0
six
1.16.0=pyh6c4a22f_0
sqlite
3.41.2=h5eee18b_0
stack_data
0.6.2=pyhd8ed1ab_0
tk
8.6.12=h1ccaba5_0
traitlets
5.14.3=pyhd8ed1ab_0
typing_extensions
4.11.0=pyha770c72_0
wcwidth
0.2.13=pyhd8ed1ab_0
wheel
0.41.2=py310h06a4308_0
xz
5.4.6=h5eee18b_0
zeromq
4.3.5=h6a678d5_0
zlib
1.2.13=h5eee18b_0
tasks/spring-boot-jakarta-migration/environment/workspace java
Framework
Spring Boot Web
All dependencies (8)
Dependencies
com.h2database:h2
io.jsonwebtoken:jjwt
javax.xml.bind:jaxb-api
org.springframework.boot:spring-boot-starter-data-jpa
org.springframework.boot:spring-boot-starter-security
org.springframework.boot:spring-boot-starter-test
org.springframework.boot:spring-boot-starter-validation
org.springframework.security:spring-security-test
tasks_excluded/scheduling-email-assistant/environment/skills/gmail-skill javascript
All dependencies (4)
Dependencies
googleapis
^144.0.0
mailcomposer
^4.0.1
minimist
^1.2.8
open
^10.1.0
tasks_excluded/scheduling-email-assistant/environment/skills/google-calendar-skill javascript
All dependencies (3)
Dependencies
googleapis
^144.0.0
minimist
^1.2.8
open
^10.1.0