AI Agents: From the Lab to the Enterprise

1. 章毅 俞舟
2.
3. AI Agents: Labs -> World. Zhou (Jo) Yu, Columbia University & Arklex AI
4. Arklex.AI Founder, CEO: Dr. Zhou (Jo) Yu. Columbia University CS Professor (CMU PhD); open-source AI models (1M+ downloads); AI Consultant, Microsoft Research; Forbes 30 Under 30. Logo text: amazon prize. © 2025 Arklex.AI - All rights reserved
5. Arklex.AI What are AI agents?
- Perception: multimodal inputs, including text, image, audio, video, touch, etc.
- Planning (Inner Monologue): Chain-of-Thought reasoning
- Reflection: meta-reasoning at every step
- Actions: function/tool calling, embodied actions
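The four stages on this slide can be read as one loop. The sketch below is only an illustration of that reading: the class `ToyAgent`, its method names, and the keyword-matching "planner" are hypothetical stand-ins (a real agent would call an LLM for planning and reflection), and nothing here is taken from Arklex code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyAgent:
    """Hypothetical agent with the four stages named on the slide."""
    tools: Dict[str, Callable[[str], str]]
    notes: List[str] = field(default_factory=list)

    def perceive(self, raw_input: str) -> str:
        # Perception: a real agent would accept multimodal input; this one is text only.
        return raw_input.strip().lower()

    def plan(self, observation: str) -> str:
        # Planning: stand-in for chain-of-thought; pick the first tool named in the input.
        for name in self.tools:
            if name in observation:
                return name
        return "fallback"

    def reflect(self, tool_name: str) -> str:
        # Reflection: check the plan before acting and record the check.
        ok = tool_name in self.tools
        self.notes.append(f"{tool_name}: {'ok' if ok else 'no matching tool'}")
        return tool_name if ok else "fallback"

    def act(self, tool_name: str, observation: str) -> str:
        # Action: a function/tool call.
        if tool_name == "fallback":
            return "Sorry, I cannot help with that yet."
        return self.tools[tool_name](observation)

    def step(self, raw_input: str) -> str:
        obs = self.perceive(raw_input)
        choice = self.reflect(self.plan(obs))
        return self.act(choice, obs)


agent = ToyAgent(tools={"weather": lambda query: "It is sunny."})
print(agent.step("  What is the WEATHER today? "))
print(agent.step("Book a flight"))
```

Keeping reflection as a separate stage between planning and acting mirrors the slide's "meta-reasoning at every step": every planned action passes through a check before any tool is called.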
6. The field will continue to push the frontier of capability
"Levels of AI" (Sam Altman on Levels of AI):
- Level 1: Chatbots (2022 onwards)
- Level 2: Reasoners (2024 onwards)
- Level 3: Agents (2025 onwards)
- Level 4: Innovators (202?)
- Level 5: Organizations (20??)
Currently: tasks that take seconds to hours (OpenAI Operator, deep research models, Claude Code, Manus)
Eventually: hours to days of work
Tasks that are at the edge of human performance, or are totally new:
- New scientific discoveries (Alpha*)
- Prize-winning writing
- Unsolved mysteries
Huge demand for specialists in other fields
7. Arklex.AI Next-Gen AI Agents. Enterprises are exploring AI employees: AI Agent + Human Agent vs. Human + Software
8. Arklex.AI Current Agent Ecosystem (layers, top to bottom)
- AI Agent Vertical Applications (leverage the entire agentic infrastructure to further enhance agents' usability through UI/UX, verifiers, human intervention, human teaching, etc., to provide real ROI to enterprises and individuals)
- Agent Orchestration Framework (leverage search, planning algorithms, tool use, scaffolds, and guardrails to improve agent tasks' success rate, efficiency, security, etc.)
- Arklex Agent Foundation Models (continued training and reinforcement learning to adapt foundation models for agent tasks)
- Foundation Models (OpenAI, Anthropic, etc.)
9. Arklex.AI Hybrid Approach: Combining AI Workflows & AI Agents. Frameworks shown: LangGraph, AutoGen, Arklex
10.
11. Arklex.AI How does Arklex work? TaskGraph, continual learning
   1. Task Graph Generation: language instruction + API → TaskGraph
   2. Human Expert Review: the business team adjusts and verifies the TaskGraph with an interactive UI
   3. Control and Compliance: human hand-over for security and safety
   4. Continuous Learning: agents automatically update the TaskGraph based on human agent interactions (offline agent-human simulation + human teaching interactions)
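One way to picture the four steps is as operations on a small directed graph keyed by user intent. The sketch below is a guess at the shape of such a structure, not the Arklex implementation: `TaskGraph`, `HANDOVER`, and every method name are hypothetical, step 1 (generation from an instruction and APIs) is replaced by manual `add_edge` calls, and the rule that any edit requires a fresh human approval is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict

HANDOVER = "human_handover"  # hypothetical sentinel: route the conversation to a person


@dataclass
class TaskGraph:
    # node name -> {user intent -> next node name}
    edges: Dict[str, Dict[str, str]] = field(default_factory=dict)
    approved: bool = False  # set by a human reviewer (step 2)

    def add_edge(self, src: str, intent: str, dst: str) -> None:
        self.edges.setdefault(src, {})[intent] = dst
        self.approved = False  # assumption: any change needs a fresh review

    def approve(self) -> None:
        self.approved = True

    def next_node(self, current: str, intent: str) -> str:
        # Step 3: unreviewed graphs and unknown intents are handed to a human.
        if not self.approved:
            return HANDOVER
        return self.edges.get(current, {}).get(intent, HANDOVER)

    def learn_from_human(self, current: str, intent: str, dst: str) -> None:
        # Step 4: record how a human agent resolved an intent the graph did not cover.
        self.add_edge(current, intent, dst)


graph = TaskGraph()
graph.add_edge("start", "find_product", "recommend")  # stands in for step 1
graph.approve()                                       # step 2
print(graph.next_node("start", "find_product"))       # known intent
print(graph.next_node("start", "refund"))             # unknown intent
graph.learn_from_human("start", "refund", "process_refund")
graph.approve()
print(graph.next_node("start", "refund"))             # learned intent
```

Routing every unknown intent to the hand-over node keeps the agent inside the reviewed graph, which is one simple way to read "control and compliance".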
12. Arklex.AI 1) Task Graph Generation
Instruction: Your name is Elle, and you are a Shopify sales agent of #Store. Make sure you are cheerful and responsive. Your responsibilities include: recommending products, answering product questions, giving discounts, checking delivery, ….
Available tools: Shopify product database, Shopify inventory management, Shopify recommendation, etc.
13. Arklex.AI 2) Human Expert Review and Edit
14. Arklex.AI Sample Customer Conversation 1: Product Discovery
(Intent: the user wants to see a list of products based on some keywords)
[Screenshot: Agent, User, and Agent (recommendations) turns, showing the agent's recommendation results]
15. Arklex.AI Sample Customer Conversation 2: Proactively Offer Promotion
(Intent: the user is exploring products)
Agent: We currently have a promotion on this item. You can now get a second one for 50% off!
User: That's great! Why don't you add another yellow one to my cart?
Agent: Got it! I just added the yellow Cool boy baseball hat to your cart. You're only $5 away from qualifying for free shipping. Want to add anything else?
16. Arklex.AI 4) Update TaskGraph
17. Arklex Agents Get Smarter Over Time
18. Arklex.AI Arklex vs. Other Frameworks
Framework | Open-Source | Mixed-Control | Action Graph | NLU | Task Composition | Human Intervention | Continual Learning
Traditional Dialog Systems:
DialogFlow | ✘ | ✘ | rule-based | ✓ | ✓ | - | ✘
Amazon Lex | ✘ | ✘ | rule-based | ✓ | ✓ | - | ✘
RASA | ✓ | ✘ | rule-based | ✓ | ✓ | - | ✘
LLM-Based Agent Frameworks:
LangChain | ✓ | ✘ | LangGraph | ✘ | ✘ | LangGraph | ✘
LlamaIndex | ✓ | ✘ | Workflows | ✘ | ✘ | Workflows | ✘
AG2 | ✓ | ✘ | ✘ | ✘ | ✘ | Always/Never | ✓
CrewAI | ✓ | ✘ | Flows | ✘ | ✘ | Always/Never | ✘
AgentForce | ✘ | ✓ | - | - | ✓ | - | -
OpenAI Swarm | ✓ | ✘ | ✘ | ✘ | ✘ | Swarm | ✘
Arklex | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓
19. Arklex.AI Tutorial: Build a Customer Service AI Agent with Arklex (20 min)
20. Use Case 1: Car Loan Application
Challenge: The long application form is hard for potential customers to complete in one try, which results in a low completion rate and a high rate of lead loss.
Solution: The Arklex AI framework enables customers to build a solution that provides 24/7 responses with clarification questions, multi-language support, and a highly personalized experience that scales.
Result: 40% increase in qualified leads. The company is positioned as a leader in customer conversion and engagement.
21. Use Case 2: E-commerce Shopping Agents for Product Education
Challenge: GPT-wrapper agents often provide wrong or inaccurate answers, which frustrates customers and loses customer interest or potential sales.
Solution: The Arklex AI framework enables customers to build a solution that maintains brand voice, ensures a natural conversation flow, and guarantees high response accuracy.
Result: 50% more accurate responses, no AI hallucination; 30% higher customer satisfaction; 50% fewer returns; sales increased by 50%.
22. ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
Xiao Yu¹, Baolin Peng²†, Vineeth Vajipey¹, Hao Cheng², Michel Galley², Jianfeng Gao²* & Zhou Yu¹*
†Project Lead; *Equal Advisory Contribution
¹Columbia University, NY; ²Microsoft Research, Redmond
23. Background: VLM on Computer Tasks
VQA tasks: Q: "What is he doing?" A: "He is performing a skateboard trick…"
Computer tasks: "Can you help me clear my shopping cart?" → click button [shopping cart] ….
24. Challenge: extremely difficult, as interacting with computers was not part of VLM (pre-)training
25. (1) Scale test-time compute to improve agent performance. (2) Transfer search knowledge back to the VLM via training.
26. Introduce R-MCTS. R-MCTS = explore the decision space and self-improve on the fly. (Talk outline: Introduction, Scaling test-time compute, Transferring search knowledge, Conclusion)
28. Introduce R-MCTS. R-MCTS = (1) MCTS with contrastive self-reflection
30. Introduce R-MCTS. R-MCTS = (1) MCTS with contrastive self-reflection and (2) a multi-agent-debate value function
[Figure: search tree whose nodes carry Q, N, and V statistics (values shown: Q=0.07, Q=0.15, N=1, V=0.07, V=0.15, V=0.38), with debate arguments "Good action, because…" and "Bad action, because…" and a Judge]
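The Q, N, and V labels in the figure are the usual MCTS node statistics. As a reminder of how they interact, here is a minimal sketch of textbook UCT selection and value backup. It is not the ExACT implementation, and `debate_value` is a purely hypothetical stand-in for the multi-agent-debate value function: the slide does not say how the judge combines the two arguments, so the averaging rule below is an assumption.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    action: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    N: int = 0      # visit count
    Q: float = 0.0  # running mean of backed-up values

    def add_child(self, action: str) -> "Node":
        child = Node(action, parent=self)
        self.children.append(child)
        return child


def uct_select(node: Node, c: float = 1.0) -> Node:
    """Pick the child maximizing Q + c * sqrt(ln(N_parent) / N_child)."""
    def score(child: Node) -> float:
        if child.N == 0:
            return math.inf  # try unvisited children first
        return child.Q + c * math.sqrt(math.log(node.N) / child.N)
    return max(node.children, key=score)


def backup(leaf: Node, value: float) -> None:
    """Propagate a value estimate V from a leaf up to the root."""
    node: Optional[Node] = leaf
    while node is not None:
        node.N += 1
        node.Q += (value - node.Q) / node.N  # incremental mean
        node = node.parent


def debate_value(pro: Callable[[str], float], con: Callable[[str], float], action: str) -> float:
    """Hypothetical judge: average the 'good action' score with one minus the 'bad action' score."""
    return 0.5 * (pro(action) + (1.0 - con(action)))


root = Node("root")
cart = root.add_child("click [shopping cart]")
scroll = root.add_child("scroll down")
backup(cart, 0.15)
backup(scroll, 0.07)
print(uct_select(root).action, round(root.Q, 2), root.N)
```

With equal visit counts the exploration bonus is the same for both children, so selection falls back to the higher Q; as N grows for one child, its bonus shrinks and the other child gets revisited.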
31. Introduce R-MCTS. Within each task, R-MCTS performs a tree search to find the best trajectory.
32. Introduce R-MCTS. Within each task, R-MCTS performs a tree search to find the best trajectory. After each task, R-MCTS performs contrastive self-reflection to improve its future execution.
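The "after each task" step can be pictured as writing a lesson into a memory that later tasks read from. The sketch below is a simplified stand-in rather than the method from the paper: choosing the step whose predicted value differs most from the final outcome, the word-overlap retrieval, and the names `Step`, `most_surprising_step`, and `ReflectionMemory` are all assumptions for illustration (a real system would presumably have the VLM write the lesson and use a proper retriever).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    action: str
    predicted_value: float  # value estimate made during search


def most_surprising_step(trajectory: List[Step], outcome: float) -> Step:
    """Return the step whose predicted value differs most from the final outcome."""
    return max(trajectory, key=lambda s: abs(s.predicted_value - outcome))


class ReflectionMemory:
    """Hypothetical store of (task, lesson) pairs, retrieved by word overlap."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, str]] = []

    def add(self, task: str, lesson: str) -> None:
        self._items.append((task, lesson))

    def retrieve(self, task: str, k: int = 1) -> List[str]:
        words = set(task.lower().split())
        ranked = sorted(
            self._items,
            key=lambda item: len(words & set(item[0].lower().split())),
            reverse=True,
        )
        return [lesson for _, lesson in ranked[:k]]


trajectory = [Step("open wishlist", 0.9), Step("click remove", 0.4)]
outcome = 0.0  # the task failed
step = most_surprising_step(trajectory, outcome)
memory = ReflectionMemory()
memory.add("clear my shopping cart", f"'{step.action}' looked promising but the task failed")
print(memory.retrieve("clear the shopping cart"))
```

The contrast is between what the search expected and what actually happened; storing the lesson per task lets a later, similar task start with that hint instead of repeating the mistake.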
33. R-MCTS Results. Benchmarks: VisualWebArena and OSWorld
- Realistic and reproducible
- Tasks span multiple domains
[Figures: VisualWebArena, OSWorld]
34. R-MCTS Results. R-MCTS outperforms other search algorithms (ToT, A*, or MCTS).
35. R-MCTS Results. R-MCTS achieves new SOTA on VisualWebArena and is highly competitive on OSWorld!
[Figures: VisualWebArena Leaderboard, OSWorld Leaderboard]
36. (1) Scale test-time compute to improve agent performance. (2) Transfer search knowledge back to the VLM via training.
37. Introduce Exploratory Learning. Exploratory Learning = explore, evaluate, and backtrack by training on tree traversals!
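"Training on tree traversals" suggests flattening a search tree into a sequence that still contains the dead ends. The sketch below shows one possible flattening under stated assumptions: the slides do not give the actual training format, so `TreeNode`, `linearize`, the event strings, and the fixed value threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TreeNode:
    action: str
    value: float
    children: List["TreeNode"] = field(default_factory=list)


def linearize(node: TreeNode, threshold: float, events: List[str]) -> bool:
    """Append explore/evaluate/backtrack events depth-first.

    Returns True once a leaf with value >= threshold is reached.
    """
    events.append(f"explore: {node.action}")
    events.append(f"evaluate: {node.value:.2f}")
    if node.value < threshold:
        events.append(f"backtrack from: {node.action}")
        return False
    if not node.children:
        return True  # good leaf: the trajectory ends here
    for child in node.children:
        if linearize(child, threshold, events):
            return True
    events.append(f"backtrack from: {node.action}")
    return False


tree = TreeNode("open shop", 0.8, [
    TreeNode("click wishlist", 0.2),
    TreeNode("click cart", 0.9, [TreeNode("click clear", 0.95)]),
])
events: List[str] = []
found = linearize(tree, threshold=0.5, events=events)
print(found)
print("\n".join(events))
```

Unlike training on the final best trajectory alone, a sequence like this keeps the low-value branch and the decision to leave it, which is the behavior the slide summarizes as explore, evaluate, and backtrack.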
38. Exploratory Learning Results. After exploratory learning on R-MCTS trees, GPT-4o exhibits compute-scaling properties without being augmented with a search algorithm!
39. Exploratory Learning Results. After exploratory learning on R-MCTS trees, GPT-4o exhibits compute-scaling properties without being augmented with a search algorithm! Eval and backtrack!
40. Summary
❖ Inference/training methods to improve agent performance
- R-MCTS improves agent performance at inference time
- Exploratory Learning improves agent performance at training time
41. Summary
❖ Inference/training methods to improve agent performance
- R-MCTS to improve agent performance at inference time
- Exploratory Learning to improve agent performance at training time
❖ Future work
- RL methods to reduce reliance on tree search
- Model predictive control (MPC) methods to reduce expensive environment interactions
42. Arklex: Agent-First Organization Framework. Join the Arklex open-source community: GitHub, Documentation, Discord
43. DAPLab on AI Agents
Advisory Board:
- Richard Zemel, ML, Columbia University; Director of NSF AI Institute
- Michael Franklin, CS, University of Chicago; Founder of AMPLab; Co-Director of Chicago's Data Science Institute
Members:
- Eugene Wu: Data Systems
- Zhou Yu: NLP
- Kostis Kaffes: Computer Vision
- David Blei: ML, Causal Inference
- Lydia Chilton: Human-AI Interaction
- Adam Elmachtoub: Operations Research
- Junfeng Yang: Secure Systems
- Baishakhi Ray: Software Eng
- Daniel Hsu: ML Theory
- Shipra Agrawal: RL
- Carl Vondrick: Computer Vision
- Yunzhu Li: Robotics, Agents
44.
45.
