AI Agents: From the Lab to the Enterprise
1. 章毅
俞舟
2.
3. AI Agents Labs->World
Zhou (Jo) Yu
Columbia University & Arklex AI
4. Arklex.AI
Amazon prize
Founder, CEO: Dr. Zhou (Jo) Yu
● Columbia University CS Professor (CMU PhD)
● Open-source AI models (1M+ downloads)
● AI Consultant, Microsoft Research
● Forbes 30 Under 30
© 2025 Arklex.AI - All rights reserved
5. Arklex.AI
What are AI agents?
Perception: multimodal inputs, including text, image, audio, video, touch, etc.
Planning (Inner Monologue): Chain-of-Thought reasoning
Reflection: meta-reasoning at every step
Actions: function/tool calling, embodied actions.
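The four components can be sketched as one loop. This is a toy illustration only: the `Agent` class, its method names, and the keyword-matching "planner" are hypothetical stand-ins (a real agent would call an LLM for planning and reflection), not Arklex code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Toy agent: perceive -> plan -> reflect -> act. All names are hypothetical."""
    tools: Dict[str, Callable[[str], str]]
    memory: List[str] = field(default_factory=list)

    def perceive(self, observation: str) -> str:
        # Perception: a real agent would fuse text, image, audio, etc.
        return observation.strip().lower()

    def plan(self, percept: str) -> str:
        # Planning: stand-in for chain-of-thought; pick the first tool
        # whose name appears in the request.
        for name in self.tools:
            if name in percept:
                return name
        return "fallback"

    def reflect(self, percept: str, tool_name: str) -> str:
        # Reflection: check the plan at this step and note failures.
        if tool_name not in self.tools:
            self.memory.append(f"no tool matched '{percept}'")
            return "fallback"
        return tool_name

    def act(self, tool_name: str, percept: str) -> str:
        # Action: function/tool calling, with a default reply if no tool fits.
        tool = self.tools.get(tool_name, lambda _: "Sorry, I cannot help with that.")
        return tool(percept)

    def step(self, observation: str) -> str:
        percept = self.perceive(observation)
        choice = self.reflect(percept, self.plan(percept))
        return self.act(choice, percept)


agent = Agent(tools={"weather": lambda q: "It is sunny.", "time": lambda q: "It is noon."})
print(agent.step("What is the WEATHER today?"))
print(agent.step("Tell me a joke"))
```

Keeping reflection as its own stage makes it easy to log why a plan was rejected before any tool is called.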
6. The field will continue to push the frontier of capability
“Levels of AI” (Sam Altman on Levels of AI)
Level 1: Chatbots (2022 onwards)
Level 2: Reasoners (2024 onwards)
Level 3: Agents (2025 onwards)
Level 4: Innovators (202?)
Level 5: Organizations (20??)
Currently: tasks that take seconds to
hours (OpenAI Operator, deep
research models, Claude Code, Manus)
Eventually: hours to days of work
Tasks that are at the edge of human
performance, or are totally new
- New scientific discoveries (Alpha*)
- Prize-winning writing
- Unsolved mysteries
Huge demand for
specialists in other fields
7. Arklex.AI
Next-Gen AI Agents
Enterprises are exploring AI employees
AI Agent + Human Agent vs. Human + Software
8. Arklex.AI
Current Agent Ecosystem
AI Agent Vertical Applications
(Leverage the entire agentic infrastructure to further enhance agents' usability through UI/UX, verifiers, human intervention, human teaching, etc., to provide real ROI to enterprises and individuals)
Agent Orchestration Framework
(Leverage search, planning algorithms, tool use, scaffolds, and guardrails to improve agent tasks' success rate, efficiency, security, etc.)
Arklex
Agent Foundation Models
(Continued training and reinforcement learning to adapt foundation models for agent tasks)
Foundation Models (OpenAI, Anthropic, etc.)
9. Arklex.AI
LangGraph
AutoGen
Arklex
Hybrid Approach: Combining AI Workflows & AI Agents
10.
11. Arklex.AI
How does Arklex Work?
- TaskGraph continual learning
1. Task Graph Generation: language instruction + API → TaskGraph
2. Human Expert Review: the business team adjusts and verifies the TaskGraph with an interactive UI.
3. Control and Compliance: human hand-over for security and safety
4. Continuous Learning: agents automatically update the TaskGraph based on human-agent interactions (offline agent-human simulation + human teaching interactions)
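The four steps above can be pictured with a small data structure. This is a minimal sketch under stated assumptions: `TaskNode`, `TaskGraph`, and every method name are hypothetical and do not reflect the actual Arklex API; generation (step 1) is done by hand here instead of from a language instruction.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaskNode:
    name: str
    tool: Optional[str] = None   # API/tool the node calls, if any
    needs_human: bool = False    # step 3: hand over to a human for safety
    approved: bool = False       # step 2: set during human expert review


@dataclass
class TaskGraph:
    nodes: Dict[str, TaskNode] = field(default_factory=dict)
    edges: Dict[str, List[str]] = field(default_factory=dict)

    def add_step(self, node: TaskNode, after: Optional[str] = None) -> None:
        # Step 1 (generation) would call this for each drafted task.
        self.nodes[node.name] = node
        self.edges.setdefault(node.name, [])
        if after is not None:
            self.edges[after].append(node.name)

    def review(self, approved_names: List[str]) -> None:
        # Step 2: a business expert verifies nodes.
        for name in approved_names:
            self.nodes[name].approved = True

    def next_steps(self, current: str) -> List[str]:
        # Only reviewed nodes are executable.
        return [n for n in self.edges[current] if self.nodes[n].approved]

    def learn_from_human(self, after: str, new_step: str) -> None:
        # Step 4: add a step observed in a human-agent interaction.
        # It starts unapproved, so it goes back through review.
        if new_step not in self.nodes:
            self.add_step(TaskNode(new_step), after=after)


graph = TaskGraph()
graph.add_step(TaskNode("greet"))
graph.add_step(TaskNode("recommend_product", tool="product_search"), after="greet")
graph.add_step(TaskNode("issue_refund", needs_human=True), after="greet")
graph.review(["greet", "recommend_product", "issue_refund"])
graph.learn_from_human(after="recommend_product", new_step="offer_discount")
print(graph.next_steps("greet"))
print(graph.next_steps("recommend_product"))
```

In this sketch a learned step stays invisible to the agent until a human approves it, which is one way to combine continual learning with the review step.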
12. Arklex.AI
1) Task Graph Generation
Instruction:
Your name is Elle, and you are a Shopify sales agent of #Store. Make sure you are cheerful and responsive. Your responsibilities include: recommend products, answer product questions, give discounts, check delivery, ….
Available tools:
Shopify product database
Shopify inventory management
Shopify recommendation
etc.
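One way to picture the input to task graph generation is as a structured spec. The sketch below is an assumption for illustration: `AgentSpec`, `TOOL_HINTS`, and `draft_tasks` are hypothetical names, and the keyword-to-tool mapping is hand-written where a real system would presumably use an LLM.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AgentSpec:
    name: str
    role: str
    responsibilities: List[str]
    tools: List[str]


# Hypothetical keyword -> tool hints; duties without a hint get no tool.
TOOL_HINTS: Dict[str, str] = {
    "recommend": "Shopify recommendation",
    "question": "Shopify product database",
}


def draft_tasks(spec: AgentSpec) -> List[Dict[str, Optional[str]]]:
    """Draft one task per responsibility and attach a tool when a hint matches."""
    tasks: List[Dict[str, Optional[str]]] = []
    for duty in spec.responsibilities:
        tool = next(
            (t for key, t in TOOL_HINTS.items() if key in duty and t in spec.tools),
            None,
        )
        tasks.append({"task": duty, "tool": tool})
    return tasks


spec = AgentSpec(
    name="Elle",
    role="Shopify sales agent",
    responsibilities=[
        "recommend product",
        "answer product question",
        "give discount",
        "check delivery",
    ],
    tools=[
        "Shopify product database",
        "Shopify inventory management",
        "Shopify recommendation",
    ],
)
for task in draft_tasks(spec):
    print(task)
```

Tasks that come back with no tool are natural candidates for a human expert to resolve during review.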
13. Arklex.AI
2) Human Expert Review and Edit
14. Arklex.AI
Sample Customer Conversation
1. Product Discovery
[Screenshot: Agent and User chat turns, with the Agent's recommendation results]
(Intent: the user wants to see a list of products based on some keywords)
15. Arklex.AI
Sample Customer Conversation
2. Proactively Offer Promotion
(Intent: the user is exploring products)
Agent: We currently have a promotion on this item: get a second one for 50% off!
User: That’s great! Why don’t you add another yellow one to my cart?
Agent: Got it! I just added the yellow Cool boy baseball hat to your cart. You’re only $5 away from qualifying for free shipping. Want to add anything else?
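The arithmetic behind such a proactive offer is simple. In this sketch the hat price ($30) and the free-shipping threshold ($50) are assumptions chosen only so that the numbers reproduce the "$5 away" message; neither figure appears in the slides.

```python
def cart_total_cents(unit_price_cents: int, quantity: int) -> int:
    """Total under a 'second one 50% off' promotion: every second unit is half price."""
    full_price_units = (quantity + 1) // 2
    half_price_units = quantity // 2
    # Integer division truncates odd prices by half a cent; fine for a sketch.
    return full_price_units * unit_price_cents + half_price_units * (unit_price_cents // 2)


def free_shipping_gap_cents(total_cents: int, threshold_cents: int) -> int:
    """How much more the customer must spend to qualify for free shipping."""
    return max(0, threshold_cents - total_cents)


PRICE_CENTS = 3000       # assumed hat price: $30.00
THRESHOLD_CENTS = 5000   # assumed free-shipping threshold: $50.00

total = cart_total_cents(PRICE_CENTS, quantity=2)    # 3000 + 1500
gap = free_shipping_gap_cents(total, THRESHOLD_CENTS)
if gap > 0:
    print(f"You're only ${gap / 100:.0f} away from qualifying for free shipping.")
```

Working in integer cents avoids floating-point rounding in the totals.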
16. Arklex.AI
4) Update TaskGraph
17. Arklex Agents Get Smarter Over Time
18. Arklex VS. Other Frameworks
Arklex.AI
Framework | Open-Source | Mixed-Control | Action Graph | NLU | Task Composition | Human Intervention | Continual Learning
Traditional Dialog System
DialogFlow | ✘ | ✘ | rule-based | ✓ | ✓ | - | ✘
Amazon Lex | ✘ | ✘ | rule-based | ✓ | ✓ | - | ✘
RASA | ✓ | ✘ | rule-based | ✓ | ✓ | - | ✘
LLM-Based Agent Framework
LangChain | ✓ | ✘ | LangGraph | ✘ | ✘ | LangGraph | ✘
LlamaIndex | ✓ | ✘ | Workflows | ✘ | ✘ | Workflows | ✘
AG2 | ✓ | ✘ | ✘ | ✘ | ✘ | Always/Never | ✓
CrewAI | ✓ | ✘ | Flows | ✘ | ✘ | Always/Never | ✘
AgentForce | ✘ | ✓ | - | - | ✓ | - | -
OpenAI Swarm | ✓ | ✘ | ✘ | ✘ | ✘ | Swarm | ✘
Arklex | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
19. Arklex.AI
Tutorial: Build a Customer Service AI Agent with Arklex
20 min
20. Use Case 1: Car Loan Application
Challenge
The application form is long, so potential customers often can’t complete it in one try, which results in a low completion rate and a high rate of lead loss.
Solution
The Arklex AI framework enables customers to create a solution that provides 24/7 responses with clarification questions, multi-language support, and a highly personalized experience that scales
Result
40% increase in qualified leads
Company is positioned as a leader in customer
conversion and engagement
21. Use Case 2: E-commerce Shopping Agents for Product Education
Challenge
GPT-wrapper agents often provide wrong or inaccurate answers, which frustrates customers and costs customer interest and potential sales
Solution
The Arklex AI framework enables customers to create a solution that maintains brand voice, ensures a natural conversation flow, and guarantees highly accurate responses
Result
50% more accurate responses, no AI hallucination
30% higher customer satisfaction
50% fewer returns
50% increase in sales
22. ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
Xiao Yu¹, Baolin Peng²†, Vineeth Vajipey¹, Hao Cheng², Michel Galley², Jianfeng Gao²* & Zhou Yu¹*
†Project Lead; *Equal Advisory Contribution
¹ Columbia University, NY
² Microsoft Research, Redmond
23. Background: VLM on Computer Tasks
VQA Tasks
Q: What is he doing?
A: He is performing a skateboard trick…
Computer Tasks
Q: Can you help me clear my shopping cart?
A: click button [shopping cart] ….
24. Challenge: extremely difficult, as interacting with computers was not part of VLM (pre-)training
25.
1. Scale test-time compute to improve agent performance
2. Transfer search knowledge back to VLM via training
26. Introduce R-MCTS
R-MCTS = explore decision space and self-improve on-the-fly
Outline: Introduction | Scaling test-time compute | Transferring search knowledge | Conclusion
28. Introduce R-MCTS
R-MCTS = MCTS with (1) contrastive self-reflection
30. Introduce R-MCTS
R-MCTS = MCTS with (1) contrastive self-reflection and (2) a multi-agent-debate value function
[Figure: search-tree nodes labeled with Q, N, and V values (Q=0.07, Q=0.15, N=1, V=0.07, V=0.15, V=0.38); debate arguments "Good action, because…" and "Bad action, because…" go to a Judge.]
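The Q, N, and V labels in the R-MCTS figure match the usual MCTS node statistics: a backed-up action value, a visit count, and a value estimate. A minimal sketch of UCT selection and backup, assuming standard MCTS bookkeeping, follows. The `debate_value` function is a hypothetical stand-in that simply averages fixed critic scores; the paper's value function uses model-generated arguments and a judge, which this does not reproduce.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Node:
    action: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    N: int = 0        # visit count
    Q: float = 0.0    # running mean of backed-up values
    V: float = 0.0    # estimate from the value function


def debate_value(state: str, critics: Sequence[Callable[[str], float]]) -> float:
    # Stand-in for a multi-agent debate: each critic scores the state in
    # [0, 1] and the "judge" averages the scores (an assumption).
    scores = [critic(state) for critic in critics]
    return sum(scores) / len(scores)


def uct_select(node: Node, c: float = 1.0) -> Node:
    # Visit unvisited children first, then maximize Q plus an exploration bonus.
    unvisited = [child for child in node.children if child.N == 0]
    if unvisited:
        return unvisited[0]
    return max(
        node.children,
        key=lambda child: child.Q + c * math.sqrt(math.log(node.N) / child.N),
    )


def backup(node: Optional[Node], value: float) -> None:
    # Propagate the value to the root, updating N and the running mean Q.
    while node is not None:
        node.N += 1
        node.Q += (value - node.Q) / node.N
        node = node.parent


root = Node(action="<root>")
for action, critic_scores in [("click [cart]", (0.04, 0.10)), ("scroll down", (0.10, 0.20))]:
    child = Node(action=action, parent=root)
    root.children.append(child)
    critics = [lambda s, v=v: v for v in critic_scores]
    child.V = debate_value(action, critics)
    backup(child, child.V)

best = uct_select(root)
print(best.action, round(best.Q, 2), best.N)
```

With equal visit counts the exploration bonus is identical for both children, so selection falls back to the higher Q.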
31. Introduce R-MCTS
Within each task, R-MCTS performs a tree search to find the best trajectory
32. Introduce R-MCTS
Within each task, R-MCTS performs a tree search to find the best trajectory
After each task, R-MCTS performs contrastive self-reflection to improve its future execution
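A rough sketch of what post-task reflection could look like, under loud assumptions: the slides do not give the mechanism, so the "largest gap between predicted and realized value" rule, the templated reflection text, and the word-overlap retrieval below are all illustrative choices, and every name is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Step:
    action: str
    predicted_value: float   # what the agent expected before acting
    realized_value: float    # what the search or outcome showed afterwards


def most_surprising_step(trajectory: List[Step]) -> Step:
    # Contrast expectation with outcome; the largest gap is the lesson.
    return max(trajectory, key=lambda s: abs(s.predicted_value - s.realized_value))


def reflect(task: str, trajectory: List[Step]) -> str:
    step = most_surprising_step(trajectory)
    verdict = "overestimated" if step.predicted_value > step.realized_value else "underestimated"
    return f"On '{task}', I {verdict} the action '{step.action}'."


class ReflectionMemory:
    """Stores reflections and retrieves the one whose task shares the most words."""

    def __init__(self) -> None:
        self._items: List[Tuple[Set[str], str]] = []

    def add(self, task: str, reflection: str) -> None:
        self._items.append((set(task.lower().split()), reflection))

    def retrieve(self, task: str) -> str:
        words = set(task.lower().split())
        if not self._items:
            return ""
        best_words, best_text = max(self._items, key=lambda item: len(item[0] & words))
        return best_text if best_words & words else ""


memory = ReflectionMemory()
task = "clear my shopping cart"
trajectory = [
    Step("click [wishlist]", predicted_value=0.8, realized_value=0.1),
    Step("click [shopping cart]", predicted_value=0.5, realized_value=0.6),
]
memory.add(task, reflect(task, trajectory))
print(memory.retrieve("clear the shopping cart please"))
```

A retrieved reflection would be added to the agent's context on the next similar task, so no model weights change.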
33. R-MCTS Results
Benchmarks: VisualWebArena and OSWorld
- Realistic and reproducible
- Tasks span multiple domains
[Figures: VisualWebArena, OSWorld]
34. R-MCTS Results
R-MCTS outperforms other search algorithms (ToT, A*, or MCTS)
35. R-MCTS Results
R-MCTS achieves new SOTA on VisualWebArena, and is highly competitive on OSWorld!
[Figures: VisualWebArena Leaderboard, OSWorld Leaderboard]
36.
1. Scale test-time compute to improve agent performance
2. Transfer search knowledge back to VLM via training
37. Introduce Exploratory Learning
Exploratory Learning = explore, evaluate, and backtrack by training on tree traversals!
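To make "training on tree traversals" concrete, a search tree can be flattened into one sequence that contains the dead ends and the backtracks as well as the winning path. The sketch below is an assumption about the general idea only: the slides do not give the real data format, and `TreeNode`, the `on_best_path` flag, and the string templates are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TreeNode:
    action: str
    value: float
    children: List["TreeNode"] = field(default_factory=list)
    on_best_path: bool = False


def traversal_to_sequence(node: TreeNode) -> List[str]:
    """Depth-first walk that records each action, its evaluation, and a
    backtrack whenever the walk leaves a branch that is off the best path."""
    sequence = [f"act: {node.action}", f"eval: {node.value:.2f}"]
    # Visit off-path children first so the sequence ends on the best path.
    for child in sorted(node.children, key=lambda c: c.on_best_path):
        sequence.extend(traversal_to_sequence(child))
    if not node.on_best_path:
        sequence.append(f"backtrack from: {node.action}")
    return sequence


tree = TreeNode(
    action="open shop",
    value=0.5,
    on_best_path=True,
    children=[
        TreeNode("click [shopping cart]", value=0.9, on_best_path=True),
        TreeNode("click [wishlist]", value=0.1),
    ],
)
for line in traversal_to_sequence(tree):
    print(line)
```

A model fine-tuned on such sequences sees explore, evaluate, and backtrack steps as ordinary tokens to imitate.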
38. Exploratory Learning Results
After exploratory learning on R-MCTS trees, GPT-4o exhibits compute-scaling properties without being augmented with a search algorithm!
39. Exploratory Learning Results
After exploratory learning on R-MCTS trees, GPT-4o exhibits compute-scaling properties without being augmented with a search algorithm!
Eval and backtrack!
40. Summary
❖ Inference/training methods to improve agent performance
- R-MCTS improves agent performance at inference time
- Exploratory Learning improves agent performance at training time
41. Summary
❖ Inference/training methods to improve agent performance
- R-MCTS to improve agent performance at inference time
- Exploratory Learning to improve agent performance at training time
❖ Future work
- RL methods to reduce reliance on tree search
- Model predictive control (MPC) methods to reduce expensive environment interactions
42. Arklex: Agent-First Organization Framework
Join Arklex Open-source Community:
Github Documentation Discord
43. DAPLab on AI Agents
Advisory Board
Richard Zemel, ML, Columbia University, Director of NSF AI Institute
Eugene Wu (Data Systems) | Zhou Yu (NLP) | Kostis Kaffes (Computer Vision) | David Blei (ML, Causal Inference)
Lydia Chilton (Human-AI Interaction) | Adam Elmachtoub (Operations Research) | Junfeng Yang (Secure Systems) | Baishakhi Ray (Software Eng)
Daniel Hsu (ML Theory) | Shipra Agrawal (RL) | Carl Vondrick (Computer Vision) | Yunzhu Li (Robotics, Agents)
Michael Franklin, CS, University of Chicago, Founder of AMPLab, Co-Director of Chicago’s Data Science Institute
44.
45.