Training AI Agents to Write and Self-Correct SQL with Reinforcement Learning

[@yugez](https://medium.com/@yugez?source=post_page---byline--571ed31281ad---------------------------------------)

This post demonstrates how to build and train a self-correcting SQL agent. It uses Agent Lightning and the verl framework for reinforcement learning (RL) training, and LangGraph to define the agent's cyclical reasoning workflow. The goal is to fine-tune a large language model (LLM) to accurately convert natural language questions into executable SQL queries.

The example is tested with verl v0.5.0, vLLM v0.10.0, and Agent Lightning v0.1.1.

SQL Agent Implementation

Agent Lightning is designed to integrate flexibly with various agent frameworks, including AutoGen, CrewAI, OpenAI Agents SDK, LangGraph, and more. It also works without an agent framework, so you can train an agent written from scratch in plain Python. See the example gallery for more details.

The core of the agent is a state machine built with LangGraph, which makes the workflow robust and transparent. As visualized below, the agent writes a query, executes it, and then enters a refinement loop in which it checks and rewrites the query until the query is deemed correct or a turn limit is reached.

SQL Agent Visualization
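
The write, execute, check, and rewrite loop can be sketched in plain Python to make the control flow concrete. This is a minimal illustration under stated assumptions, not the code in sql_agent.py: it does not use LangGraph, the `write`, `check`, and `rewrite` callables stand in for LLM calls, and the names `run_agent`, `execute_query`, and `MAX_TURNS` (along with the limit of 3) are hypothetical.

```python
import sqlite3
from typing import Callable, Optional, Tuple

MAX_TURNS = 3  # hypothetical turn limit; the source does not state the value


def execute_query(conn: sqlite3.Connection, query: str) -> str:
    """Run the query and return either the rows or the error text as a string."""
    try:
        rows = conn.execute(query).fetchall()
        return f"rows: {rows}"
    except sqlite3.Error as exc:
        return f"error: {exc}"


def run_agent(
    conn: sqlite3.Connection,
    question: str,
    write: Callable[[str], str],
    check: Callable[[str, str, str], Optional[str]],
    rewrite: Callable[[str, str, str], str],
    max_turns: int = MAX_TURNS,
) -> Tuple[str, str, int]:
    """Write a query, execute it, then check and rewrite until accepted or out of turns."""
    query = write(question)
    result = execute_query(conn, query)
    turns = 0
    while turns < max_turns:
        feedback = check(question, query, result)
        if feedback is None:  # the checker deems the query correct
            break
        query = rewrite(question, query, feedback)
        result = execute_query(conn, query)
        turns += 1
    return query, result, turns


# Stub "LLM" steps for the demo (assumptions, standing in for model calls).
def stub_write(question: str) -> str:
    return "SELECT COUNT(*) FROM user"  # deliberately wrong table name


def stub_check(question: str, query: str, result: str) -> Optional[str]:
    return result if result.startswith("error") else None


def stub_rewrite(question: str, query: str, feedback: str) -> str:
    return "SELECT COUNT(*) FROM users"


def demo() -> Tuple[str, str, int]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Lin")])
    return run_agent(conn, "How many users are there?", stub_write, stub_check, stub_rewrite)


if __name__ == "__main__":
    print(demo())
```

In this sketch the first query fails against the database, the checker passes the error text back as feedback, and a single rewrite produces a query that executes. A checker that never accepts would stop the loop once `max_turns` rewrites have been made.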

This workflow is implemented in the SQLAgent class within sql_agent.py. It consists of the following key steps:
