AI Vision Shape the Future
1. Vickie Zeng
2.
3. Agenda
4. 1 Scaling laws continue
2 Reasoning, Planning & Memory
3 Multimodal and Multi-model including cost/performance
4 Every Developer (Employee) is an AI Developer
5 From “use-case” to reshaping every business process
6 Multi-agent AI
5. Simple to Advanced
Repetitive Tasks: Take actions when asked, automate workflows, and replace repetitive tasks for users.
Constrained Processes: Constrained business processes, predefined rules and guidelines to optimize workflow and enhance decision-making.
Dynamic Plans & “Workers”: Operate independently, dynamically plan, orchestrate other agents, learn and escalate.
6. Natural language to Code-first
Agent builder: For End Users
Copilot Studio: For Makers
AI Foundry + Visual Studio / GitHub: For Developers
7.
8.
9.
10.
11. Microsoft Fabric
SharePoint
BYO-file storage
Bing Search
Azure AI Search
BYO-search index
(GPT-4o, GPT-4o mini)
Your own licensed data
Files (local or Azure Blob)
OBO Authorization Support
File Search
Code Interpreter
Llama 3.1-405B-Instruct
Mistral Large
Enhanced Observability
Azure Logic Apps
OpenAPI 3.0 Specified Tools
Azure Functions
Cohere-Command-R-Plus
12.
13. Supply Chain Optimization
• Tech Agent: Predicts demand, manages inventory.
• Procurement Agent: Automates purchasing decisions.
Product Compliance Assurance
• Legal Agent: Validates industry standards.
• Product Agent: Monitors compliance.
Product Quality Monitoring
• Product Agent: Identifies defects.
• Tech Agent: Suggests improvements.
Privacy-compliant Data Collection
• Legal Agent: Ensures privacy regulations.
• Marketing Agent: Collects customer data.
Software Compliance Management
• Tech Agent: Ensures licensing compliance.
• Legal Agent: Reviews software contracts.
Intellectual Property Compliance
• Marketing Agent: Ensures IP adherence.
• Legal Agent: Reviews content.
Tax-compliant Expense Reporting
• Expense Billing Agent: Ensures tax compliance.
• Legal Agent: Reviews expenses.
Vendor Contract Negotiation
• Legal Agent: Evaluates contracts.
• Procurement Agent: Negotiates terms.
Personalized Product Recommendations
• Product Agent: Analyzes customer behavior.
• Marketing Agent: Tailors recommendations for campaigns.
Expense Reporting and Invoice Reconciliation
• Expense Billing Agent: Verifies receipts.
• Invoice Reconciliation Agent: Matches invoices with expenses.
IT Support Automation
• HR Agent: Handles technical issues.
• Tech Agent: Resolves support requests.
Contractor Invoice Verification
• Procurement Agent: Manages contractor payments.
• Invoice Reconciliation Agent: Validates invoices.
Employee Development Recommendations
• HR Agent: Recommends training programs.
• Product Agent: Suggests career growth opportunities.
Vendor Evaluation and Cost Optimization
• Procurement Agent: Selects suppliers.
• Product Agent: Assesses quality.
Legal Compliance in Marketing Content
• Legal Agent: Reviews marketing materials.
• Marketing Agent: Ensures compliance.
Marketing Campaign Cost Analysis
• Invoice Reconciliation Agent: Analyzes costs.
• Marketing Agent: Evaluates ROI.
Employee Onboarding Automation
• HR Agent: Verifies documents, generates contracts.
• Legal Agent: Ensures compliance with employment laws.
14. Full Magentic-1 Team: Orchestrator, FileSurfer, WebSurfer, Coder, Executor
Team tops multiple agentic benchmarks
Agents being used in Webby, Copilot Studio Team’s PoC, Hugging Face’s Transformers Agents 2.0, …
[Chart: results grouped by expected tools (No Tools Needed; Browser; Code; Browser & Code; File Handling; File Handling & Code; File Handling & Browsing; Code, File Handling & Browsing) for Orchestrator, + WebSurfer, + Coder, + FileSurfer, and the Full Magentic-1 Team; axis from 0 to 0.4]
15. Webby Demo
16.
17. Agenda
18. o1 generally outperforms GPT-4o when:
• Reasoning over a large amount of “micro datasets”
• Keeping track of the state of multiple variables
• Developing a strategy/set of steps and performing math calculations
• Handling dynamic reasoning steps that change regularly
• Answering open-ended problem statements
• o1 offers reasoning capabilities out of the box, without extensive CoT prompting.
Signs of a strong o1/o3-mini use case include:
• Reasoning over many small, related data sources (e.g., customer profiles with past orders, preferences, and unstructured data like chat conversations).
• Determining steps to solve a problem and calculating multiple intermediate values to reach an insight or conclusion (e.g., evaluating sales performance by calculating cost of goods sold, revenue, and gross margin).
• Adapting the set of steps required to solve the problem based on the context of the question.
• Handling open-ended problems with many potentially valid solutions, where the quality of reasoning and output is crucial.
• Addressing close-ended questions where extremely high accuracy is required.
19.
20. Circuit design is a core asset. Higher development efficiency means shorter development cycles, allowing products to reach the market faster, gain a first-mover advantage, start generating revenue earlier, and capture market share.
Design stages: Component Sourcing (e.g., "RT9078?"), Component Creation (Symbol), Circuit Design (Symbol, Schematics, Netlist/Partlist)
Pain points for the designer:
• Finding suitable components from different vendors is a time-consuming task.
• Manually checking component compatibility and pin alignment is extremely tedious.
• Manual circuit design review is error-prone. Design mistakes can delay product launches by up to 3 months and increase development costs, often adding $30K or more per board.
PCB Copilot is an AI-powered solution that brings large language models (LLMs) into the printed circuit board (PCB) design process. By automating key tasks such as rule checking, component validation, and layout verification, the solution reduces manual effort, accelerates development, and minimizes costly design errors.
21. • Search electronic databases and standards (including text and images) to extract summaries and key indicators, and build vector databases.
• Use a multi-modal reasoning model to extract electronic database content for component symbol and electrical property verification and correction.
• Integrate internal and external data and relevant design guidelines for reasoning.
PCB Copilot integrates datasheets, netlists, and layout images to verify circuit designs.
It extracts specs from vendor datasheets, reasons over connections, and applies rule checks using layout and netlist context.
22. Example: How Engineers Perform Rule Checks for Output Voltage Accuracy
Step 1: Inspect the Board Design
Identify key resistor components (e.g., PR405, PR406, PR407) and connection points (FB pin, GND, etc.).
Step 2: Get voltage specifications from supplier’s datasheet
Retrieve reference voltage VREF and accuracy range from supplier datasheet. Use provided formula for feedback voltage calculation.
Step 3: Extract Resistor Values and Calculate Vout
Step 4: Check Accuracy and Suggest Fixes
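The four steps above can be sketched as a small calculation. This is only an illustration: the slides do not give the datasheet formula, so the sketch assumes the common feedback-divider form Vout = VREF × (1 + R_top / R_bottom), and every value below (VREF, tolerances, resistor values, target voltage) is hypothetical.

```python
def vout_range(vref, vref_tol, r_top, r_bottom, r_tol):
    """Return (nominal, lowest, highest) Vout for a feedback divider.

    Assumed formula (not from the slides): Vout = VREF * (1 + R_top / R_bottom).
    Tolerances are fractions, e.g. 0.01 for 1%.
    """
    def vout(v, rt, rb):
        return v * (1 + rt / rb)

    nominal = vout(vref, r_top, r_bottom)
    # Worst cases: Vout is lowest with low VREF, low R_top, high R_bottom.
    lowest = vout(vref * (1 - vref_tol), r_top * (1 - r_tol), r_bottom * (1 + r_tol))
    highest = vout(vref * (1 + vref_tol), r_top * (1 + r_tol), r_bottom * (1 - r_tol))
    return nominal, lowest, highest


def check_output_accuracy(target, allowed_error, vref, vref_tol, r_top, r_bottom, r_tol):
    """Step 4: pass only if the whole worst-case range stays inside the window."""
    nominal, lowest, highest = vout_range(vref, vref_tol, r_top, r_bottom, r_tol)
    ok = lowest >= target * (1 - allowed_error) and highest <= target * (1 + allowed_error)
    return ok, nominal, lowest, highest


# Hypothetical rule: 3.3 V rail must stay within +/-3%.
loose = check_output_accuracy(3.3, 0.03, vref=0.8, vref_tol=0.015,
                              r_top=31_600, r_bottom=10_000, r_tol=0.01)
tight = check_output_accuracy(3.3, 0.03, vref=0.8, vref_tol=0.015,
                              r_top=31_600, r_bottom=10_000, r_tol=0.001)
print("1% resistors pass:", loose[0])
print("0.1% resistors pass:", tight[0])
```

With these made-up numbers, the rule fails for 1% resistors and passes for 0.1% resistors, which is the kind of fix suggestion Step 4 refers to.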
23. Example: How Engineers Perform Rule Checks for Output Voltage Accuracy
Step 1: Inspect the Board Design
Identify key resistor components (e.g., PR405, PR406, PR407) and connection points (FB pin, GND, etc.).
AI: Multi-modal reasoning model inspects the board design.
Step 2: Get voltage specifications from supplier’s datasheet
Retrieve reference voltage VREF and accuracy range from supplier datasheet. Use provided formula for feedback voltage calculation.
AI: Leverages rule checks and circuit layout to retrieve relevant data via multi-modal RAG and AI Search.
Step 3: Extract Resistor Values and Calculate Vout
AI: Reasoning model calculates Vout based on the formula, circuit diagram, and datasheet values.
Step 4: Check Accuracy and Suggest Fixes
AI: Reasoning model checks accuracy and suggests fixes.
Each board has many rules to verify. Engineers must manually match layout, read datasheets, and run calculations. LLMs automate this at scale, saving time and reducing errors.
24.
25. Agenda
26. Plan an iterative path from basic to advanced GenAI leveraging your data
Prompt engineering
• Crafting specialized prompts and pipelines to guide model behavior
Retrieval augmented generation (RAG)
• Combining an LLM/SLM with your enterprise data
Fine-tuning
• Adapting a pre-trained GenAI model to specific datasets or domains
Pre-training
• Training a GenAI model from scratch
27. • Fine-tuning refers to customizing a pre-trained LLM with additional training on a specific task or new dataset for enhanced performance, new skills, or improved accuracy.
Azure OpenAI Service uses low-rank adaptation (LoRA) to fine-tune models. LoRA freezes the original weight matrices and learns a low-rank update to them, so only a small number of additional parameters are trained. This technique reduces the complexity of fine-tuning while maintaining performance, making training faster and more affordable.
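The low-rank idea can be illustrated with a few lines of plain Python. This is a toy sketch under stated assumptions, not how Azure OpenAI implements it: the shapes, the rank `r`, the scaling factor `alpha`, and the helper names are all chosen here for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha, r):
    """Return W + (alpha / r) * (B @ A); the frozen W itself is not modified."""
    delta = matmul(b, a)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Parameters in the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return r * (d_out + d_in)


d_out, d_in, r, alpha = 6, 4, 2, 4
rng = random.Random(0)
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen weights
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # trainable, small init
b = [[0.0] * r for _ in range(d_out)]                                # trainable, zero init

# With B initialised to zeros the update is zero, so training starts from W.
w_eff = lora_effective_weight(w, b, a, alpha, r)

full = 4096 * 4096
low_rank = trainable_params(4096, 4096, 8)
print(f"full matrix: {full} params, rank-8 update: {low_rank} params")
```

For a 4096 × 4096 layer, a rank-8 update trains 65,536 parameters instead of about 16.8 million, which is where the speed and cost savings come from.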
28. Shorter prompts with Improved accuracy
Lower Latency and Cost
Teaching new skills and Improving tool use
Domain & Language adaptation
29. A customer ran coding SFT (supervised fine-tuning) on Phi-3.5 Mini using private code.
Accuracy increased from about 30% to 100% after the LLMOps fine-tuning cycle.
30. After the 1st training run, the responses were completely irrelevant.
The validation loss stops improving after step 100 and then rises while the training loss keeps falling, indicating overfitting.
Example of training corpus (originally in Chinese)
Prompt: How should a .NET Framework project use the NEBULA SDK?
Completion: using Quanta.PaaS; using (var scope = new PaaSContextScope(User)) { PaasContext context = scope.Context; // .... }
Training Loss
Step   Training Loss   Validation Loss
25     1.879600        1.653950
50     0.983000        0.655794
75     0.411000        0.391818
100    0.237100        0.369338
125    0.143900        0.389201
150    0.101100        0.423663
175    0.085800        0.459196
200    0.075100        0.481036
225    0.069700        0.482507
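The overfitting pattern in this log can be detected mechanically. The sketch below copies the loss values from the table above; the function names and the `patience` value are assumptions made for illustration, not part of the original workflow.

```python
# (step, training_loss, validation_loss) rows copied from the first run's log.
LOG = [
    (25, 1.879600, 1.653950),
    (50, 0.983000, 0.655794),
    (75, 0.411000, 0.391818),
    (100, 0.237100, 0.369338),
    (125, 0.143900, 0.389201),
    (150, 0.101100, 0.423663),
    (175, 0.085800, 0.459196),
    (200, 0.075100, 0.481036),
    (225, 0.069700, 0.482507),
]


def best_checkpoint(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[2])


def overfitting_start(log, patience=2):
    """Return the step of the best validation loss once it has failed to
    improve for `patience` consecutive evaluations, or None if it never does."""
    best_val, best_step, stale = float("inf"), None, 0
    for step, _train, val in log:
        if val < best_val:
            best_val, best_step, stale = val, step, 0
        else:
            stale += 1
            if stale >= patience:
                return best_step
    return None


print("best checkpoint:", best_checkpoint(LOG))
print("validation loss stopped improving after step:", overfitting_start(LOG))
```

An early-stopping rule like this would have kept the step-100 checkpoint rather than the final one.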
31. Changes for this run:
• Set lora_dropout = 0.05.
• Expand each of the 30 questions into more than 100 variants, making a total of 3000 entries in the dataset.
• Change the training/validation split to 0.7/0.3.
Training Loss
Step   Training Loss   Validation Loss
25     1.922100        1.558381
50     1.179400        0.722776
75     0.459700        0.309079
100    0.254300        0.199262
125    0.169300        0.136576
150    0.121700        0.108535
175    0.105400        0.096791
200    0.095400        0.090608
225    0.088400        0.086344
250    0.084900        0.083622
275    0.080800        0.079818
300    0.077600        0.078820
325    0.075500        0.077857
350    0.075000        0.077025
375    0.073900        0.076670
The overfitting issue during training has been resolved, but the trained model is still giving incorrect answers.
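A 0.7/0.3 split of a 3000-entry dataset could look like the following minimal sketch. The record layout, the function name, and the seed are hypothetical; the slides do not show how the split was actually done.

```python
import random


def split_dataset(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and cut it into train/validation parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical dataset: 30 base questions, 100 variants each.
records = [{"question_id": q, "variant": v} for q in range(30) for v in range(100)]
train, val = split_dataset(records)
print(len(train), len(val))
```

One thing to watch with augmented data: if near-identical variants of the same question land in both parts, the validation loss can look healthy even when the model has not generalised, so a low validation loss alone does not guarantee correct answers.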
32. Change the training corpus to be written entirely in English.
Enhance the training corpus by adding Chain of Thought (CoT) steps.
The capability of the trained model has improved: two of the four test questions are answered correctly.
Prompt
How to use NEBULA SDK to obtain the English name of CAMP User for .NET Framework project?
Completion
**Step-by-Step Analysis:**
1. **Define Purpose**: The code aims to use NEBULA SDK by retrieving data based on the user's identity.
2. **Code Structure**: Imports necessary namespaces and defines an MVC controller with authorization.
3. **Data Handling**: Retrieves the English name associated with the authenticated user identity.
**Augmented Code Sample**:
using System;
using System.Web;
using System.Web.Mvc;
public class HomeController : Controller
{
    [Authorize]
    public IActionResult Index()
    {
        // Retrieve data using authenticated User Identity
        var CampEnglishName = User.Identity.GetCampEnglishName();
    }
}
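A training record with Chain of Thought steps, like the prompt/completion pair above, could be assembled as follows. This is a sketch only: the `prompt`/`completion` field names mirror the labels on the slide, while the helper name and the JSON Lines output are assumptions.

```python
import json


def build_cot_record(question, steps, code):
    """Build one prompt/completion record whose completion lists the
    reasoning steps first and the code sample second."""
    analysis = "\n".join(
        f"{i}. **{title}**: {text}" for i, (title, text) in enumerate(steps, start=1)
    )
    completion = (
        "**Step-by-Step Analysis:**\n"
        f"{analysis}\n"
        "**Augmented Code Sample**:\n"
        f"{code}"
    )
    return {"prompt": question, "completion": completion}


record = build_cot_record(
    question="How to use NEBULA SDK to obtain the English name of CAMP User "
             "for .NET Framework project?",
    steps=[
        ("Define Purpose", "The code aims to use NEBULA SDK by retrieving data "
                           "based on the user's identity."),
        ("Code Structure", "Imports necessary namespaces and defines an MVC "
                           "controller with authorization."),
    ],
    code="var CampEnglishName = User.Identity.GetCampEnglishName();",
)

# One JSON object per line is a common on-disk format for fine-tuning data.
print(json.dumps(record))
```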
Step   Training Loss   Validation Loss
25     2.462500        2.382962
50     2.143100        1.864092
75     1.416400        0.756255
100    0.362600        0.199714
125    0.167100        0.140800
150    0.123800        0.106009
175    0.091500        0.082653
200    0.074600        0.071972
225    0.066600        0.066617
250    0.061700        0.062211
275    0.058400        0.059020
300    0.055200        0.056109
325    0.052400        0.054087
350    0.050300        0.052159
375    0.048400        0.050647
400    0.047300        0.050012
425    0.046200        0.049220
450    0.045800        0.048783
475    0.045300        0.048403
500    0.044300        0.047924
525    0.044000        0.047543
550    0.043400        0.047541
575    0.042900        0.047038
600    0.042700        0.047309
625    0.042200        0.046538
650    0.041900        0.046381
675    0.041200        0.046308
700    0.041200        0.046133
725    0.040600        0.046564
750    0.040500        0.045710
775    0.040200        0.045767
800    0.040000        0.045462
825    0.039700        0.045850
850    0.039400        0.045565
875    0.039200        0.045479
900    0.039100        0.045543
925    0.038800        0.045464
950    0.038800        0.045389
33. Data augmentation: random insertion, random swapping.
Accuracy improved by more than 10% compared with the previous run.
Random insertion:
using System;
using System.Web;
using System.Web.Mvc;
public class HomeController : Controller
{
    [Authorize]
    public IActionResult Index()
    {
        // Retrieve data using authenticated User Identity
        var CampEnglishName = User.Identity.GetCampEnglishName();
        var randomVar = "RandomValue"; // Inserted a random variable
    }
}
Random swapping:
using System;
using System.Web;
using System.Web.Mvc;
public class HomeController : Controller
{
    [Authorize]
    public IActionResult Index()
    {
        // Swapped the comment and the variable declaration
        var CampEnglishName = User.Identity.GetCampEnglishName();
        // Retrieve data using authenticated User Identity
    }
}
Step   Training Loss   Validation Loss
25     1.185300        1.885971
50     0.637300        0.617216
75     0.121700        0.082259
100    0.031000        0.052487
125    0.021300        0.045465
150    0.019000        0.043421
175    0.017300        0.042817
200    0.015700        0.042288
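The two augmentations named on this slide can be sketched on a list of code lines. The helper names and the filler line are hypothetical, and this is not the customer's actual tooling. Note that swapping arbitrary statements can change what the code does, so a sketch like this is safest on comments and independent declarations, as in the slide's example.

```python
import random


def random_insertion(lines, filler, rng):
    """Return a copy of `lines` with `filler` inserted at a random position."""
    out = list(lines)
    out.insert(rng.randrange(len(out) + 1), filler)
    return out


def random_swap(lines, rng):
    """Return a copy of `lines` with two randomly chosen lines exchanged."""
    out = list(lines)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


# Method body taken from the sample on the slide.
body = [
    "// Retrieve data using authenticated User Identity",
    "var CampEnglishName = User.Identity.GetCampEnglishName();",
]
rng = random.Random(0)
inserted = random_insertion(body, 'var randomVar = "RandomValue";', rng)
swapped = random_swap(body, rng)
print("\n".join(inserted))
print("\n".join(swapped))
```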
34. • Add more diverse prompts to the training data and switch from LoRA fine-tuning to full fine-tuning.
• Set do_sample=False.
• Set temperature=0.0.
• Set learning_rate=5e-4.
• Accuracy improved significantly: after training, the model answers the questions accurately.
• The trained model gives the right answer, but the output does not terminate normally.
Step   Training Loss   Validation Loss
25     1.185300        1.885971
50     0.637300        0.617216
75     0.121700        0.082259
100    0.031000        0.052487
125    0.021300        0.045465
150    0.019000        0.043421
175    0.017300        0.042817
200    0.015700        0.042288
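The decoding settings above (do_sample=False, temperature) control how the next token is chosen from the model's scores. The toy sketch below illustrates the idea with plain Python; it does not call any real inference library, and the function names and logits are made up.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the peak."""
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits, do_sample=False, temperature=1.0, rng=None):
    """Greedy decoding when do_sample is False, otherwise sample by probability."""
    if not do_sample:
        # Greedy: always the highest-scoring token, so output is deterministic
        # and temperature plays no role in the choice.
        return max(range(len(logits)), key=logits.__getitem__)
    rng = rng or random.Random()
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [0.1, 2.0, -1.0]
greedy = [pick_token(logits) for _ in range(5)]
print("greedy picks:", greedy)
print("probabilities at T=1.0:", softmax(logits, 1.0))
print("probabilities at T=0.1:", softmax(logits, 0.1))
```

Deterministic decoding is a reasonable choice for code generation, where the same question should yield the same answer on every call.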
35. Next, the EOS setting needs to be modified. Check the Phi-3.5-mini EOS token on Hugging Face:
https://huggingface.co/microsoft/Phi-3.5-mini-instruct/blob/main/special_tokens_map.json
},
"eos_token": {
  "content": "<|endoftext|>",
  "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false
},
prompt = """<|system|>You are an expert in .NET/.NET Framework and
are familiar with the functions provided by NEBULA SDK. You know
how to use NEBULA SDK to develop application systems on the CAMP
platform. When providing the user with results, no explanation or thought
process is needed. Ensure the code format is correct, indentation is
standard, and it can compile successfully. Avoid repeating the same
code output.
<|user|>
How to use NEBULA SDK to get the CAMP User English name of a
user in a .NET Framework project? Give me sample code.<|endoftext|>
<|assistant|>""" + "<|endoftext|>"
generated_text = generate_text(prompt)
print(generated_text)
print("Tokenizer EOS token ID:", tokenizer.eos_token_id)
print("Model EOS token ID:", model.config.eos_token_id)
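The slides do not show exactly which EOS change fixed the runaway output. One common cause is training completions that never end with the EOS marker, so the model does not learn when to stop. The sketch below illustrates that idea with plain string handling; the helper names are hypothetical and the marker is the `eos_token` content shown above.

```python
EOS = "<|endoftext|>"


def ensure_eos(completion, eos=EOS):
    """Append the EOS marker to a training completion if it is missing."""
    stripped = completion.rstrip()
    return stripped if stripped.endswith(eos) else stripped + eos


def truncate_at_eos(generated, eos=EOS):
    """Keep only the text before the first EOS marker in generated output."""
    head, _sep, _tail = generated.partition(eos)
    return head


sample = "var name = User.Identity.GetCampEnglishName();\n"
prepared = ensure_eos(sample)
print(repr(prepared))

raw_output = "var name = User.Identity.GetCampEnglishName();" + EOS + "var name = ..."
print(repr(truncate_at_eos(raw_output)))
```

Truncating at the marker is only a post-processing safeguard; the training-side fix is what lets generation stop on its own.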
36. After retraining, all outputs are now correct.
prompt = """<|system|>
You are an expert in .NET/.NET Framework and are familiar with
the functions provided by NEBULA SDK.Please give the full and
right code directly without writing out the thinking process.
<|user|>
How to use NEBULA SDK to get the English name of a user in
a .NET Framework project? Give me sample code.
<|assistant|>"""
generated_text = generate_text(prompt)
print(generated_text)
You are an expert in .NET/.NET Framework and are familiar with the
functions provided by NEBULA SDK.Please give the full and right
code directly without writing out the thinking process.
How to use NEBULA SDK to get the English name of a user in a .NET
Framework project? Give me sample code.
**Step-by-Step Analysis:**
1. **Define Purpose**: The code aims to use NEBULA SDK by
retrieving data based on the user's identity.
2. **Code Structure**: Imports necessary namespaces and defines
an MVC controller with authorization;
public class HomeController : Controller{
    [Authorize]
    public IActionResult Index()
    {
        // Retrieve data using authenticated User Identity
        var CampUserEnglishName = User.Identity.GetCampEnglishName();
    }
}
37. Find the key issue
Data quality and diversity are very important
Quality Data Annotation Matters
Fine-Tuning Approach Should Match Objectives
Attention to Output Formatting
38.
39. • Scalable: Many real-world tasks require massive datasets for fine-tuning.
• Privacy and Safety: Real data has potential privacy and safety issues.
• Labeling: Manually labeling data is time-consuming.
• Pre-assessing: Allows you to simulate new scenarios.
• Retraining: Counteract drift or inject up-to-date knowledge.
Explicit DP (Differential Privacy)
Implicit DP
Source: https://www.microsoft.com/en-us/research/blog/the-crossroads-of-innovation-and-privacy-private-synthetic-data-for-generative-ai/
40. [Diagram: synthetic QnA data pipeline with Evolve-Instruct]
Unstructured/structured files (PDF, CSV, TXT, …) → Azure AI Document Intelligence → Image, Text, Table
For each page or chunk (mini batch): Image Summary, Text Summary, Table Summary → Azure OpenAI (GPT-4o) → QnA Dataset
Fine-tuning (1st phase): QnA Dataset → Train → Azure OpenAI (Fine-tuned GPT-4o) → Deploy
Data Augmentation: Evolve-Instruct with Azure OpenAI (GPT-4o) → Augmented Dataset
Fine-tuning (2nd phase): Augmented Dataset → Train → Azure OpenAI (Fine-tuned GPT-4o) → Deploy
41. 1. Creating a seed dataset
[Diagram: seed dataset and GLAN pipeline]
Unstructured/structured files (PDF, CSV, TXT, …) → Azure AI Document Intelligence → Image, Text, Table
For each page or chunk (mini batch): Image Summary, Text Summary, Table Summary → Azure OpenAI (GPT-4o) → QnA Dataset
GLAN: Break down → Sample → Generate with Azure OpenAI (GPT-4o) → Instruction Dataset
Fine-tuning: QnA Dataset + Instruction Dataset → Train → Azure OpenAI (Fine-tuned GPT-4o) → Deploy
42. • WizardLM’s main technique
• In-depth Evolving (blue) is used to evolve a simple instruction to a more complex one.
• In-breadth Evolving (red) is used to create new instruction for diversity.
• Elimination Evolving is used to filter failed evolutions and instructions.
Code: https://github.com/nlpxucan/WizardLM , https://github.com/Azure/synthetic-qa-generation/
Source: https://arxiv.org/pdf/2304.12244
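The three evolving operations can be sketched as a small loop. This is a simplified illustration, not the WizardLM or Azure/synthetic-qa-generation implementation: the prompt templates, the elimination rules, and the `llm` callable (any function that takes a prompt string and returns text) are assumptions, and `fake_llm` is a stand-in so the example runs without a model.

```python
import random

# Hypothetical prompt templates; the real prompts are considerably longer.
IN_DEPTH = "Rewrite the instruction to be more complex by adding one constraint:\n{instruction}"
IN_BREADTH = "Write a new instruction in the same domain as, but different from:\n{instruction}"


def is_failed(original, evolved):
    """Elimination: drop empty, unchanged, or refusal-like evolutions."""
    text = evolved.strip()
    return not text or text == original.strip() or text.lower().startswith("sorry")


def evolve(seeds, llm, rounds=2, rng=None):
    """Grow an instruction pool by evolving each generation `rounds` times."""
    rng = rng or random.Random(0)
    pool = list(seeds)
    current = list(seeds)
    for _ in range(rounds):
        next_generation = []
        for instruction in current:
            template = rng.choice([IN_DEPTH, IN_BREADTH])
            evolved = llm(template.format(instruction=instruction))
            if not is_failed(instruction, evolved):
                next_generation.append(evolved)
        pool.extend(next_generation)
        current = next_generation
    return pool


def fake_llm(prompt):
    """Stand-in for a real model call: echoes the instruction with a suffix."""
    return prompt.splitlines()[-1] + " (evolved)"


pool = evolve(["Explain LoRA."], fake_llm, rounds=2)
print(pool)
```

In practice `llm` would wrap a call to a hosted model, and each surviving instruction would also get a generated answer before being added to the training set.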
43.
44. Large Language Models Are Redefining Software