Beyond Answering Questions: Enterprise-Grade Agentic AI Reshapes Intelligent Productivity

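1. Beyond Answering Questions: Enterprise-Grade Agentic AI Reshapes Intelligent Productivity
Yang Yang (杨扬), Vice President of Solutions Engineering, Asia Pacific and Japan, Snowflake AI Data Cloud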

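3. An Easier to Use, Connected, Trusted Platform
● Background: Snowflake is a data and AI company founded in 2012; it listed on the New York Stock Exchange (NYSE) in 2020.
● Partnerships: 751 of the Forbes Global 2000 companies already use Snowflake, and it works with more than 12,000 companies worldwide.
● Results: Revenue for full fiscal year 2025 (ended January 31, 2025) reached USD 3.5 billion, up 30% year over year.
● Recognition: Ranked first on the 2025 Fortune Future 50 list.
Platform diagram labels:
● Connected ecosystem: external engines, integrations, partners, Snowflake Marketplace
● Fully managed unified platform: analytics, data engineering, applications and collaboration
● Security and governance: Snowflake Horizon Catalog, interoperable with the Iceberg REST catalog
● Elastic compute and engines: data lakehouse, data warehouse, data mesh
● Interoperable storage and flexible architecture: unstructured, semi-structured and structured data; Iceberg tables, hybrid tables, Snowflake tables
● Cross-cloud infrastructure: Amazon, Microsoft, Google Cloud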

4. The Five Pillars of Enterprise-Grade Agentic AI
Agentic Research
1. Agentic Orchestration & Tool Use: Compositional workflows and dynamic tool use
2. Structured Data Intelligence: Semantic modeling and accurate SQL generation
3. Unstructured Data Intelligence: Extracting grounded insights from diverse unstructured data sources
4. Observability & Trust: Transparent, traceable, and controllable decisions
5. System Optimizations: Fast and cost-effective agent execution
Build intelligent, composable, trustworthy, and efficient enterprise AI agents.

6. Pillar 1: Agentic Orchestration
Planning, Adapting and Composing Tools in Real Time
● Challenge: Enterprise tasks span multiple tools, data types and systems, and they require many operational steps.
● Our Solution: Agentic reasoning orchestration system
  ○ Planning: Selects tools, decomposes tasks, defines the strategy
  ○ Execution: Maintains shared context across steps
  ○ Adaptation: Updates the plan as new information emerges
● From research to production: Our orchestration system powers Snowflake Intelligence
  ○ Orchestration: Operationalized by Cortex Agents
  ○ Tools: Leverage Cortex Analyst, Cortex Search, and visualization components
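The planning, execution and adaptation steps on this slide can be pictured as a small loop. The following is only an illustrative sketch: the tool names (`search_docs`, `run_sql`), the `Step` and `Context` classes, and the toy adaptation rule are hypothetical stand-ins and not the Cortex Agents API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    tool: str       # name of the tool to call
    argument: str   # input passed to the tool


@dataclass
class Context:
    # Shared context carried across steps (the "Execution" bullet).
    observations: List[str] = field(default_factory=list)


def search_docs(query: str) -> str:
    # Hypothetical unstructured-retrieval tool.
    return f"doc-hit({query})"


def run_sql(question: str) -> str:
    # Hypothetical structured-data tool.
    return f"rows({question})"


TOOLS: Dict[str, Callable[[str], str]] = {
    "search_docs": search_docs,
    "run_sql": run_sql,
}


def plan(task: str) -> List[Step]:
    # Planning: decompose the task and pick a tool for each sub-task.
    return [Step("search_docs", task), Step("run_sql", task)]


def adapt(remaining: List[Step], observation: str) -> List[Step]:
    # Adaptation: toy rule that drops the SQL step once an observation
    # already contains the word "answer".
    if "answer" in observation:
        return [s for s in remaining if s.tool != "run_sql"]
    return remaining


def orchestrate(task: str) -> Context:
    ctx = Context()
    steps = plan(task)
    while steps:
        step = steps.pop(0)
        observation = TOOLS[step.tool](step.argument)
        ctx.observations.append(observation)   # execution keeps shared context
        steps = adapt(steps, observation)      # plan is revised after each step
    return ctx
```

Calling `orchestrate("q3 revenue")` runs both tools in order, while a task whose first observation contains "answer" skips the SQL step, which is the kind of mid-run plan change the Adaptation bullet describes.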

7. Pillar 1: Agentic Orchestration in Action
Key Capabilities
• Multi-hop reasoning with tool chaining
• Context-aware execution
• Modular and extensible framework
Composable, multi-tool, context-rich reasoning, tailored for enterprise workflows.

8. Research Paper - Modular and extensible

10. Pillar 2: Structured Data Intelligence
Building Agents That Reason and Act
● Recap: From Reasoning to Agentic Systems
  ○ Reasoning models give us a strong base for structured query generation
  ○ But real enterprise SQL tasks are often underspecified, schema-heavy, and multi-step
  ○ These challenges demand agents that can clarify, probe, and verify
● Enter ReFoRCE, our agentic system for real-world SQL
  ○ Schema compression
  ○ Self-refinement
  ○ Majority-vote consensus
  ○ Column exploration (when needed)
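To make the majority-vote consensus item concrete, here is a minimal sketch, assuming candidates are compared by their execution results. It uses Python's standard `sqlite3` module as a stand-in database; the function names `execute` and `majority_vote` are hypothetical, and this is not the ReFoRCE implementation.

```python
import sqlite3
from collections import Counter
from typing import List, Optional, Tuple


def execute(conn: sqlite3.Connection, sql: str) -> Optional[Tuple]:
    # Run one candidate; a failing query yields None so it cannot win the vote.
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort rows so that row order does not split otherwise identical results.
    return tuple(sorted(rows, key=repr))


def majority_vote(conn: sqlite3.Connection, candidates: List[str]) -> Optional[str]:
    results = [(sql, execute(conn, sql)) for sql in candidates]
    votes = Counter(result for _, result in results if result is not None)
    if not votes:
        return None  # every candidate failed to execute
    winner, _ = votes.most_common(1)[0]
    # Return the first candidate whose result matches the consensus result.
    return next(sql for sql, result in results if result == winner)
```

Voting on results rather than on SQL text means two differently written queries that return the same rows count as agreeing with each other.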

11. ReFoRCE in Action: Agentic Reasoning Over Complex SQL Tasks

12. ReFoRCE Achieves #2 Accuracy on Spider 2.0 Lite
Our agentic semantic models improved accuracy by more than 20% compared with agents without schema understanding.
Snapshot from https://spider2-sql.github.io/ on Sep 29, 2025

13. AT&T Case Study: Advancing Text2SQL

14. Ask AT&T
● 100K+ Users Onboarded
● 90 Fine-tuned SLMs (small language models)
● 410 Production Agentic Workers
● 450M+ Production API Calls
● 71 Production RAG+FT (retrieval-augmented generation plus fine-tuning) solutions
● 20% Coding Efficiency Gain

15. AT&T’s Multi-Pronged Approach
1. Data Profiling - Systematic querying for table properties
2. Schema Deduplication - Curating database schema
3. Schema Linking
4. Self-Consistency - Leveraging self-consistency techniques to improve reliability
5. Schema Search - with GraphRAG
6. Schema Refinement - LLMs help us build elements of the Relational KG
7. Query Log Analysis - Extract patterns from expert-written queries
8. SQL-to-Text Generation - LLM-powered question generation from SQL
9. Fine Tuning
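As a rough illustration of the first item, data profiling by systematically querying table properties, the sketch below collects per-column row, null and distinct counts. It assumes a SQLite database accessed through Python's standard `sqlite3` module; the function name `profile_table` and the chosen statistics are hypothetical and are not taken from the AT&T system.

```python
import sqlite3
from typing import Dict


def profile_table(conn: sqlite3.Connection, table: str) -> Dict[str, Dict[str, int]]:
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    profile: Dict[str, Dict[str, int]] = {}
    for col in columns:
        # COUNT(col) and COUNT(DISTINCT col) both ignore NULL values.
        total, non_null, distinct = conn.execute(
            f'SELECT COUNT(*), COUNT("{col}"), COUNT(DISTINCT "{col}") FROM "{table}"'
        ).fetchone()
        profile[col] = {
            "rows": total,
            "nulls": total - non_null,
            "distinct": distinct,
        }
    return profile
```

A profile like this can be attached to the schema shown to the model, so that it sees which columns are sparse or low-cardinality before writing SQL.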

16. Schema Deduplication

17. Schema Deduplication
Goal: Reducing the token count by deduping schema
• Time series databases
• Database information compression:
  ○ Full table name
  ○ Column name
  ○ Column type
  ○ Column description
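One plausible reading of schema deduplication for time series databases is that many date- or shard-suffixed tables share the same columns and can be described once. The sketch below shows that idea under this assumption; the function name `dedupe_schema`, the digit-masking rule and the example table names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

Column = Tuple[str, str]  # (column name, column type)


def dedupe_schema(tables: Dict[str, List[Column]]) -> List[Dict[str, object]]:
    groups: Dict[Tuple[str, Tuple[Column, ...]], List[str]] = defaultdict(list)
    for name, columns in tables.items():
        # Mask digit runs so date- or shard-suffixed tables share one key.
        pattern = re.sub(r"\d+", "*", name)
        # Tables are merged only when both the masked name and columns match.
        groups[(pattern, tuple(columns))].append(name)
    return [
        {"pattern": pattern, "columns": list(columns), "tables": sorted(names)}
        for (pattern, columns), names in groups.items()
    ]
```

With this grouping, `events_20240101` and `events_20240102` with identical columns collapse into a single `events_*` entry, so the column list is sent to the model once instead of once per table.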

18. Research Paper - ReFoRCE Text2SQL Agent

19. Pillar 3: Unstructured Data Intelligence
Ground agents in enterprise knowledge
● Verified DIversification with ConsolidaTion (VerDICT)
● Retriever: relevance feedback: Unlike DtV, which diversifies into all possible interpretations, our approach first checks which interpretations are supported by the retrieved passages.
● Generator: answerability feedback: Even if a document is relevant to the interpretation grounded in it, it may not answer the query. Retrieval alone is therefore insufficient as feedback, so we introduce generator feedback to ensure that an answer can be generated before retaining an interpretation.
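The two feedback signals described above can be sketched as a two-stage filter over candidate interpretations. This is an illustrative sketch only: `verified_interpretations`, the toy retriever and generator, and the tiny corpus are invented for the example and do not reproduce the VerDICT implementation.

```python
from typing import Callable, Dict, List, Optional


def verified_interpretations(
    interpretations: List[str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], Optional[str]],
) -> Dict[str, str]:
    verified: Dict[str, str] = {}
    for interp in interpretations:
        passages = retrieve(interp)           # retriever: relevance feedback
        if not passages:
            continue                          # unsupported interpretation: drop
        answer = generate(interp, passages)   # generator: answerability feedback
        if answer is not None:
            verified[interp] = answer         # retain only answerable readings
    return verified


# Toy stand-ins so the sketch runs without a real retriever or LLM.
CORPUS = {
    "apple fruit": "Apples are harvested in autumn.",
    "apple company": "The company page lists its product lines.",
}


def toy_retrieve(interp: str) -> List[str]:
    return [text for key, text in CORPUS.items() if key == interp]


def toy_generate(interp: str, passages: List[str]) -> Optional[str]:
    # Toy rule: only a passage mentioning "harvested" can answer the question.
    hits = [p for p in passages if "harvested" in p]
    return hits[0] if hits else None
```

In the toy data, "apple company" has a relevant passage but no answer can be generated from it, so it is dropped at the second stage; this is the case the answerability feedback is meant to catch.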

20. Pillar 3: Unstructured Data Intelligence
Ground agents in enterprise knowledge
● Verified Diversification improved groundedness by 1.8x over baseline.
● Efficiency alone isn’t enough; accuracy is critical. In our evaluations, 93% of VerDICT-generated interpretations led to correct and grounded answers, compared to just 56% with DtV on Llama 3.3 70B and GPT-4o.
● Even human-generated interpretations scored only 65%, showing that VerDICT is both accurate and reliable.

22. Pillar 4: Observability & Trust
Transparent, traceable, and controllable decision-making
In enterprise AI, intelligence isn’t enough; systems must be transparent, verifiable and cost-aware.
● Accuracy: Can AI provide accurate answers based on facts?
● Effectiveness: Will it be performant and cost effective?
● Trust: Compliance, ethics
● End-to-end evaluation: Evaluate the performance of agents and apps, using techniques such as LLM-as-a-judge. It can report metrics such as relevance, groundedness and harmfulness, giving customers the ability to quickly iterate and refine the agent for improved performance.
● Comparison: Compare evaluation runs side by side and assess the quality and accuracy of responses across different LLM configurations to identify the best configuration for production deployments.
● Comprehensive tracing: Logs every step of agent execution, across input prompts, tool use and final response generation, using OpenTelemetry traces. This allows easy debugging and refinement for accuracy, latency and cost.
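The comprehensive tracing item can be pictured as one recorded span per agent step. The sketch below is a standard-library stand-in written only to show the shape of such a trace; it does not use the OpenTelemetry SDK, and `span`, `TRACE` and `run_agent` are hypothetical names.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

# In-memory trace; a real deployment would export spans to a tracing backend.
TRACE: List[Dict[str, object]] = []


@contextmanager
def span(name: str, **attributes: object) -> Iterator[Dict[str, object]]:
    record: Dict[str, object] = {"name": name, **attributes}
    start = time.perf_counter()
    try:
        yield record  # the step can attach more attributes while it runs
    finally:
        record["duration_s"] = time.perf_counter() - start
        TRACE.append(record)


def run_agent(prompt: str) -> str:
    with span("input_prompt", prompt=prompt):
        pass
    with span("tool_use", tool="lookup") as s:
        result = prompt.upper()  # hypothetical tool call
        s["output_chars"] = len(result)
    with span("final_response") as s:
        s["response"] = result
    return result
```

Because every span carries a duration and its own attributes, the same records can be inspected for accuracy, latency and cost per step.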

23. Pillar 4: Observability & Trust
Transparent, traceable, and controllable decision-making

25. Pillar 5: System Optimizations
Arctic Inference: Responsive, Fast and Efficient, Finally All at Once
Inference systems need to be:
⚫ Responsive (prefill speed, i.e. time to first token)
⚫ Fast at generation (generation speed of output tokens)
⚫ Cost efficient (combined throughput)
Existing parallelism leads to tradeoffs
Tensor Parallel
⚫ Splits each token in each request across GPUs
⚫ Incurs coordination overhead
⚫ Good for latency, bad for throughput
Data Parallel
⚫ Splits work across requests
⚫ No communication overhead
⚫ Good for throughput, bad for latency
[Chart axes: First Response (Prefill Speed), Generation Speed]

26. Mitigating Tradeoff with Shift Parallelism
Can we combine tensor and data parallelism?
• No, because they have different KV data layouts
Match data layout with Arctic Sequence Parallel
• Split work within a request across tokens
• Less communication than tensor parallel
• KV data layout same as tensor parallel
Shift Parallelism: Tensor + Arctic Sequence Parallel
• Tensor Parallel for small batch
• Arctic Sequence Parallel for large batch
• No more latency vs throughput tradeoffs
[Chart: First Response (Prefill Speed) and Generation Speed, compared with Tensor Parallel and Data Parallel]
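The switching rule on this slide, tensor parallel for small batches and Arctic Sequence Parallel for large ones, can be written as a one-line policy. The sketch below is illustrative only: `choose_mode`, the `Mode` enum and the default threshold of 256 tokens are assumptions made for the example, not values from the talk or from the Arctic Inference code.

```python
from enum import Enum


class Mode(Enum):
    TENSOR_PARALLEL = "tensor"
    SEQUENCE_PARALLEL = "arctic_sequence"


def choose_mode(batch_tokens: int, threshold: int = 256) -> Mode:
    # threshold is a made-up illustrative value, not a figure from the talk.
    if batch_tokens < 0:
        raise ValueError("batch_tokens must be non-negative")
    # Small batches favor latency; large batches favor throughput.
    if batch_tokens <= threshold:
        return Mode.TENSOR_PARALLEL
    return Mode.SEQUENCE_PARALLEL
```

Per the slide, switching modes between batches is possible because both share the same KV data layout, which is what rules out pairing tensor parallelism with plain data parallelism.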

27. One of the Fastest and Most Efficient Inference Systems, and It’s Open Source
Arctic Inference’s breakthrough performance is achieved via novel Shift Parallelism + multiple SoTA optimizations
Lowest end-to-end latency and highest cost efficiency for generative AI among open-source systems:
• Up to 3.4x faster e2e response latency
• Up to 1.7x higher throughput
Up to 16x higher throughput for embedding models over vLLM.
Third Party Benchmarking* (June 5, 2025)
*GPU Cloud Providers Only
Powering select workloads in Snowflake Cortex AI
And now it’s open source: free for the community to build, extend, and use.
Arctic Inference makes responsive, fast and cost-efficient AI accessible to the AI community.
[Chart: Latency vs Throughput]

28. Snowflake Cortex AI is Easy, Connected, Trusted
SNOWFLAKE CORTEX AI
● Agentic business insight: Agent APIs (Cortex Agents, GA soon); Agent Apps (Snowflake Intelligence, PU)
● State-of-the-art retrieval: Unstructured Data Retrieval (Cortex Search); Structured Data Retrieval (Cortex Analyst)
● Scalable AI processing: Multimodal AI-powered SQL (Cortex AISQL, PU); Document Processing (Document AI, Parse, Embed)
● Models
● Governance: RBAC, Guardrails, Evaluations, Monitoring, AI Gateway
● Data types: STR, DOC, AUDIO, IMAGE

29. InfoQDemoFinal.mov

30. The Five Pillars of Enterprise-Grade Agentic AI (recap of slide 4)

32. THANKS
Large Language Model Is Redefining The Software