AI Vision Shape the Future

1. Vickie Zeng

3. Agenda

4. (1) Scaling laws continue (2) Reasoning, Planning & Memory (3) Multimodal and multi-model, including cost/performance (4) Every Developer (Employee) is an AI Developer (5) From “use-case” to reshaping every business process (6) Multi-agent AI

5. From simple to advanced:
• Repetitive Tasks: Take actions when asked, automate workflows, and replace repetitive tasks for users.
• Constrained Processes: Follow constrained business processes, predefined rules, and guidelines to optimize workflow and enhance decision-making.
• Dynamic Plans & “Workers”: Operate independently, dynamically plan, orchestrate other agents, learn, and escalate.

6. From natural language to code-first: Agent builder (for end users); Copilot Studio (for makers); AI Foundry + Visual Studio / GitHub (for developers)

11. Microsoft Fabric; SharePoint; BYO-file storage; Bing Search; Azure AI Search; BYO-search index; (GPT-4o, GPT-4o mini); Your own licensed data; Files (local or Azure Blob); OBO Authorization Support; File Search; Code Interpreter; Llama 3.1-405B-Instruct; Mistral Large; Enhanced Observability; Azure Logic Apps; OpenAPI 3.0 Specified Tools; Azure Functions; Cohere-Command-R-Plus

13. Multi-agent scenarios:
Supply Chain Optimization • Tech Agent: Predicts demand, manages inventory. • Procurement Agent: Automates purchasing decisions.
Product Compliance Assurance • Legal Agent: Validates industry standards. • Product Agent: Monitors compliance.
Product Quality Monitoring • Product Agent: Identifies defects. • Tech Agent: Suggests improvements.
Privacy-compliant Data Collection • Legal Agent: Ensures privacy regulations. • Marketing Agent: Collects customer data.
Intellectual Property Compliance • Marketing Agent: Ensures IP adherence. • Legal Agent: Reviews content.
Vendor Contract Negotiation • Legal Agent: Evaluates contracts. • Procurement Agent: Negotiates terms.
Software Compliance Management • Tech Agent: Ensures licensing compliance. • Legal Agent: Reviews software contracts.
Tax-compliant Expense Reporting • Expense Billing Agent: Ensures tax compliance. • Legal Agent: Reviews expenses.
Personalized Product Recommendations • Product Agent: Analyzes customer behavior. • Marketing Agent: Tailors recommendations for campaigns.
Expense Reporting and Invoice Reconciliation • Expense Billing Agent: Verifies receipts. • Invoice Reconciliation Agent: Matches invoices with expenses.
IT Support Automation • HR Agent: Handles technical issues. • Tech Agent: Resolves support requests.
Contractor Invoice Verification • Procurement Agent: Manages contractor payments. • Invoice Reconciliation Agent: Validates invoices.
Employee Development Recommendations • HR Agent: Recommends training programs. • Product Agent: Suggests career growth opportunities.
Vendor Evaluation and Cost Optimization • Procurement Agent: Selects suppliers. • Product Agent: Assesses quality.
Legal Compliance in Marketing Content • Legal Agent: Reviews marketing materials. • Marketing Agent: Ensures compliance.
Marketing Campaign Cost Analysis • Invoice Reconciliation Agent: Analyzes costs. • Marketing Agent: Evaluates ROI.
Employee Onboarding Automation • HR Agent: Verifies documents, generates contracts. • Legal Agent: Ensures compliance with employment laws.

14. Orchestrator, File Surfer, Web Surfer, Coder, Executor. Team tops multiple agentic benchmarks. Agents being used in Webby, Copilot Studio Team’s PoC, Hugging Face’s Transformers Agents 2.0, …
[Chart: y-axis from 0 to 0.4; series "Orchestrator + WebSurfer + Coder + FileSurfer" and "Full Magentic-1 Team"; grouped by expected tools: No Tools Needed; Browser; Code; File Handling; Browser & Code; File Handling & Code; File Handling & Browsing; Code, File Handling & Browsing]

15. Webby Demo

17. Agenda

18. o1 generally outperforms GPT-4o when:
• Reasoning over a large amount of “micro datasets”
• Keeping track of the state of multiple variables
• Developing a strategy/set of steps and performing math calculations
• Handling dynamic reasoning steps that change regularly
• Answering open-ended problem statements
• Adapting the set of steps required to solve the problem based on the context of the question
o1 offers reasoning capabilities out of the box, without extensive CoT prompting.
Signs of a strong o1/o3-mini use case include:
• Reasoning over many small, related data sources (e.g., customer profiles with past orders, preferences, and unstructured data like chat conversations).
• Determining steps to solve a problem and calculating multiple intermediate values to reach an insight or conclusion (e.g., evaluating sales performance by calculating cost of goods sold, revenue, and gross margin).
• Handling open-ended problems with many potentially valid solutions, where the quality of reasoning and output is crucial.
• Addressing closed-ended questions where extremely high accuracy is required.
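
The sales-performance example above involves a chain of intermediate values. The sketch below spells that chain out in Python so the steps a reasoning model has to track are explicit. The order records and field names are hypothetical and chosen only for illustration.

```python
# Hypothetical order data; the field names are assumptions for illustration.
orders = [
    {"units": 120, "unit_price": 25.0, "unit_cost": 14.0},
    {"units": 80, "unit_price": 40.0, "unit_cost": 26.0},
]


def sales_summary(rows):
    """Compute revenue, cost of goods sold, gross profit, and gross margin."""
    revenue = sum(r["units"] * r["unit_price"] for r in rows)
    cogs = sum(r["units"] * r["unit_cost"] for r in rows)
    gross_profit = revenue - cogs
    # Guard against division by zero when there is no revenue.
    gross_margin = gross_profit / revenue if revenue else 0.0
    return {
        "revenue": revenue,
        "cogs": cogs,
        "gross_profit": gross_profit,
        "gross_margin": gross_margin,
    }


summary = sales_summary(orders)
print(summary)
```

Each value depends on the previous ones, which is the kind of multi-step state tracking the slide describes.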

20. Circuit design is a core asset. Higher development efficiency means shorter development cycles, allowing products to reach the market faster, gain a first-mover advantage, start generating revenue earlier, and capture market share.
• Finding suitable components from different vendors is a time-consuming task.
• Manually checking component compatibility and pin alignment is extremely tedious.
• Manual circuit design review is error-prone. Design mistakes can delay product launches by up to 3 months and increase development costs, often adding $30K or more per board.
PCB Copilot is an AI-powered solution that brings large language models (LLMs) into the printed circuit board (PCB) design process. By automating key tasks such as rule checking, component validation, and layout verification, the solution reduces manual effort, accelerates development, and minimizes costly design errors.
[Diagram labels: Designer; Component Sourcing (RT9078 ?); Component Creation (Symbol); Circuit Design (Symbol, Schematics, Netlist/Partlist)]

21. • Search electronic databases and standards (including text and images) to extract summaries and key indicators, and build vector databases.
• Use a multi-modal reasoning model to extract electronic database content for component symbol and electrical property verification and correction.
• Integrate internal and external data and relevant design guidelines for reasoning.
PCB Copilot integrates datasheets, netlists, and layout images to verify circuit designs. It extracts specs from vendor datasheets, reasons over connections, and applies rule checks using layout and netlist context.

22. Example: How Engineers Perform Rule Checks for Output Voltage Accuracy
Step 1: Inspect the Board Design. Identify key resistor components (e.g., PR405, PR406, PR407) and connection points (FB pin, GND, etc.).
Step 2: Get voltage specifications from supplier’s datasheet. Retrieve reference voltage VREF and accuracy range from supplier datasheet.
Step 3: Extract Resistor Values and Calculate Vout. Use provided formula for feedback voltage calculation.
Step 4: Check Accuracy and Suggest Fixes.

23. Example: How Engineers Perform Rule Checks for Output Voltage Accuracy
Step 1: Inspect the Board Design. Identify key resistor components (e.g., PR405, PR406, PR407) and connection points (FB pin, GND, etc.). With AI: a multi-modal reasoning model inspects the board design.
Step 2: Get voltage specifications from supplier’s datasheet. Retrieve reference voltage VREF and accuracy range from supplier datasheet. With AI: leverages rule checks and circuit layout to retrieve relevant data via multi-modal RAG and AI Search.
Step 3: Extract Resistor Values and Calculate Vout. Use provided formula for feedback voltage calculation. With AI: the reasoning model calculates Vout based on the formula, circuit diagram, and datasheet values.
Step 4: Check Accuracy and Suggest Fixes. With AI: the reasoning model checks accuracy and suggests fixes.
Each board has many rules to verify. Engineers must manually match layout, read datasheets, and run calculations. LLMs automate this at scale, saving time and reducing errors.
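
The slides do not show the feedback formula itself. As a minimal sketch, the Python below assumes the common two-resistor feedback divider relation Vout = VREF × (1 + R_top / R_bottom) and a worst-case tolerance check. The formula choice, the resistor values, the tolerances, and the 3.3 V target are all hypothetical; the actual rule would come from the supplier datasheet, and the board in the example uses three resistors (PR405, PR406, PR407) whose roles are not given.

```python
def feedback_vout(vref, r_top, r_bottom):
    """Output voltage of a regulator with a two-resistor feedback divider (assumed formula)."""
    return vref * (1.0 + r_top / r_bottom)


def vout_range(vref, vref_tol, r_top, r_bottom, r_tol):
    """Worst-case Vout range given fractional tolerances on VREF and both resistors."""
    lowest = feedback_vout(vref * (1 - vref_tol), r_top * (1 - r_tol), r_bottom * (1 + r_tol))
    highest = feedback_vout(vref * (1 + vref_tol), r_top * (1 + r_tol), r_bottom * (1 - r_tol))
    return lowest, highest


def passes_rule(target, allowed_error, lowest, highest):
    """True when the whole worst-case range stays inside target +/- allowed_error."""
    return target * (1 - allowed_error) <= lowest and highest <= target * (1 + allowed_error)


# Hypothetical values: 0.8 V reference at +/-1.5%, 1% resistors, 3.3 V target at +/-3%.
nominal = feedback_vout(0.8, 31_600, 10_000)
lowest, highest = vout_range(0.8, 0.015, 31_600, 10_000, 0.01)
ok = passes_rule(3.3, 0.03, lowest, highest)
print(f"nominal={nominal:.3f} V, range=({lowest:.3f}, {highest:.3f}) V, pass={ok}")

# A possible "suggested fix": tighter-tolerance parts narrow the range.
tight_lowest, tight_highest = vout_range(0.8, 0.005, 31_600, 10_000, 0.001)
ok_tight = passes_rule(3.3, 0.03, tight_lowest, tight_highest)
```

With these made-up numbers the 1% parts fail the check at the high end while the tighter parts pass, which is the shape of result Step 4 would report.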

25. Agenda

26. Plan an iterative path from basic to advanced GenAI leveraging your data
• Prompt engineering: Crafting specialized prompts and pipelines to guide model behavior
• Retrieval augmented generation (RAG): Combining an LLM/SLM with your enterprise data
• Fine-tuning: Adapting a pre-trained GenAI model to specific datasets or domains
• Pre-training: Training a GenAI model from scratch

27. • Fine-tuning refers to customizing a pre-trained LLM with additional training on a specific task or new dataset for enhanced performance, new skills, or improved accuracy.
• Azure OpenAI Service uses low-rank adaptation (LoRA) to fine-tune models. LoRA keeps the original weight matrices frozen and represents the weight update as the product of two much smaller low-rank matrices, so only a small number of parameters are trained. This technique reduces the complexity of fine-tuning while maintaining performance, making training faster and more affordable.
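
To make the low-rank idea concrete, here is a minimal pure-Python sketch of a LoRA-style layer: the frozen weight W is left untouched and a scaled update B·A is added on top. The matrix sizes, rank, and scaling value are hypothetical, and this is not Azure OpenAI's implementation, only an illustration of the general technique.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(w, a, b, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x); W stays frozen, only A and B are trained."""
    base = matvec(w, x)
    update = matvec(b, matvec(a, x))
    scale = alpha / rank
    return [y + scale * u for y, u in zip(base, update)]


def param_counts(d_out, d_in, rank):
    """Trainable parameters for full fine-tuning vs. a LoRA update of one weight matrix."""
    return d_out * d_in, rank * (d_in + d_out)


# Tiny example: W is 2x2, A is 1x2, B is 2x1 (rank 1).
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 1.0]]
x = [2.0, 3.0]
y_start = lora_forward(w, a, [[0.0], [0.0]], x, alpha=2.0, rank=1)  # B = 0: no change yet
y_tuned = lora_forward(w, a, [[1.0], [0.0]], x, alpha=2.0, rank=1)

# Hypothetical layer size for illustration: a 3072 x 3072 matrix with rank 16.
full, lora = param_counts(3072, 3072, 16)
print(full, lora, f"{lora / full:.2%}")
```

With B initialized to zero the layer reproduces the frozen model exactly, and for the example dimensions the update trains roughly 1% of the parameters of the full matrix.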

28. Shorter prompts with Improved accuracy Lower Latency and Cost Teaching new skills and Improving tool use Domain & Language adaptation

29. A customer ran supervised fine-tuning (SFT) for coding on Phi-3.5 Mini using private code. Accuracy increased from ~30% to 100% after iterating through the LLMOps fine-tuning cycle.

30. After the 1st training run, the responses were completely irrelevant. Validation loss stops improving after step 100 and then rises while training loss keeps falling, which indicates overfitting.
Example of training corpus:
    .NET Framework 專案要如何使用 NEBULA SDK (How does a .NET Framework project use the NEBULA SDK?)
    using Quanta.PaaS;
    using (var scope = new PaaSContextScope(User))
    {
        PaasContext context = scope.Context; // ....
    }
Step  Training Loss  Validation Loss
25    1.879600       1.653950
50    0.983000       0.655794
75    0.411000       0.391818
100   0.237100       0.369338
125   0.143900       0.389201
150   0.101100       0.423663
175   0.085800       0.459196
200   0.075100       0.481036
225   0.069700       0.482507
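
The overfitting pattern in the loss table can be checked mechanically. The sketch below uses the values from the table and flags evaluation steps where validation loss rose while training loss fell. The helper names are hypothetical and not part of any training library.

```python
# (step, training_loss, validation_loss) taken from the table above.
history = [
    (25, 1.879600, 1.653950),
    (50, 0.983000, 0.655794),
    (75, 0.411000, 0.391818),
    (100, 0.237100, 0.369338),
    (125, 0.143900, 0.389201),
    (150, 0.101100, 0.423663),
    (175, 0.085800, 0.459196),
    (200, 0.075100, 0.481036),
    (225, 0.069700, 0.482507),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def overfit_steps(rows):
    """Steps where validation loss rose while training loss fell versus the previous eval."""
    flagged = []
    for prev, cur in zip(rows, rows[1:]):
        if cur[2] > prev[2] and cur[1] < prev[1]:
            flagged.append(cur[0])
    return flagged


best = best_checkpoint(history)
flagged = overfit_steps(history)
print("best step:", best[0], "validation loss:", best[2])
print("diverging steps:", flagged)
```

A check like this is a simple basis for early stopping or for picking which checkpoint to keep.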

31. • Set lora_dropout = 0.05.
• Expand each of the 30 questions into over 100 variants, for a total of about 3,000 entries in the dataset.
• Change the training/validation split to 0.7/0.3.
The overfitting issue during training has been resolved, but the trained model is still giving incorrect answers.
Step  Training Loss  Validation Loss
25    1.922100       1.558381
50    1.179400       0.722776
75    0.459700       0.309079
100   0.254300       0.199262
125   0.169300       0.136576
150   0.121700       0.108535
175   0.105400       0.096791
200   0.095400       0.090608
225   0.088400       0.086344
250   0.084900       0.083622
275   0.080800       0.079818
300   0.077600       0.078820
325   0.075500       0.077857
350   0.075000       0.077025
375   0.073900       0.076670
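
As a rough sketch of the dataset change described above, the Python below builds 30 × 100 placeholder entries and applies a seeded 0.7/0.3 split. The entry structure, the function names, and the seed are hypothetical; the real entries would be the paraphrased questions and their answers.

```python
import random


def build_dataset(n_questions=30, variants_per_question=100):
    """Create placeholder entries: one record per (question, paraphrase) pair."""
    return [
        {"question_id": q, "variant": v}
        for q in range(n_questions)
        for v in range(variants_per_question)
    ]


def split_dataset(entries, train_fraction=0.7, seed=42):
    """Shuffle deterministically and split into training and validation sets."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


dataset = build_dataset()
train_set, validation_set = split_dataset(dataset)
print(len(dataset), len(train_set), len(validation_set))
```

A plain random split lets paraphrases of the same question land on both sides, which can make validation loss look better than real-world accuracy; splitting by question_id is one alternative worth considering.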

32. • Change the training corpus to be written entirely in English.
• Enhance the training corpus by adding Chain of Thought (CoT) steps.
The capability of the trained model has improved: two of the four test questions are answered correctly.
Prompt:
    How to use NEBULA SDK to obtain the English name of CAMP User for .NET Framework project?
Completion:
    **Step-by-Step Analysis:**
    1. **Define Purpose**: The code aims to use NEBULA SDK by retrieving data based on the user's identity.
    2. **Code Structure**: Imports necessary namespaces and defines an MVC controller with authorization.
    3. **Data Handling**: Retrieves the English name associated with the authenticated user identity.
    **Augmented Code Sample**:
    using System;
    using System.Web;
    using System.Web.Mvc;
    public class HomeController : Controller
    {
        [Authorize]
        public IActionResult Index()
        {
            // Retrieve data using authenticated User Identity
            var CampEnglishName = User.Identity.GetCampEnglishName();
        }
    }
Step  Training Loss  Validation Loss
25    2.462500       2.382962
50    2.143100       1.864092
75    1.416400       0.756255
100   0.362600       0.199714
125   0.167100       0.140800
150   0.123800       0.106009
175   0.091500       0.082653
200   0.074600       0.071972
225   0.066600       0.066617
250   0.061700       0.062211
275   0.058400       0.059020
300   0.055200       0.056109
325   0.052400       0.054087
350   0.050300       0.052159
375   0.048400       0.050647
400   0.047300       0.050012
425   0.046200       0.049220
450   0.045800       0.048783
475   0.045300       0.048403
500   0.044300       0.047924
525   0.044000       0.047543
550   0.043400       0.047541
575   0.042900       0.047038
600   0.042700       0.047309
625   0.042200       0.046538
650   0.041900       0.046381
675   0.041200       0.046308
700   0.041200       0.046133
725   0.040600       0.046564
750   0.040500       0.045710
775   0.040200       0.045767
800   0.040000       0.045462
825   0.039700       0.045850
850   0.039400       0.045565
875   0.039200       0.045479
900   0.039100       0.045543
925   0.038800       0.045464
950   0.038800       0.045389

33. Augment the training samples with random insertion and random swapping. Accuracy improved by more than 10% compared with the previous run.
Random insertion:
    using System;
    using System.Web;
    using System.Web.Mvc;
    public class HomeController : Controller
    {
        [Authorize]
        public IActionResult Index()
        {
            // Retrieve data using authenticated User Identity
            var CampEnglishName = User.Identity.GetCampEnglishName();
            var randomVar = "RandomValue"; // Inserted a random variable
        }
    }
Random swapping:
    using System;
    using System.Web;
    using System.Web.Mvc;
    public class HomeController : Controller
    {
        [Authorize]
        public IActionResult Index()
        {
            // Swapped the comment and the variable declaration
            var CampEnglishName = User.Identity.GetCampEnglishName();
            // Retrieve data using authenticated User Identity
        }
    }
Step  Training Loss  Validation Loss
25    1.185300       1.885971
50    0.637300       0.617216
75    0.121700       0.082259
100   0.031000       0.052487
125   0.021300       0.045465
150   0.019000       0.043421
175   0.017300       0.042817
200   0.015700       0.042288
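
The two augmentations can be sketched as simple line-level operations on a code sample. The functions below are a hypothetical illustration in Python, not the customer's actual tooling: they insert a filler line at a random position and swap two adjacent lines.

```python
import random


def random_insertion(lines, filler, rng):
    """Return a copy of lines with a filler line inserted at a random position."""
    position = rng.randrange(len(lines) + 1)
    return lines[:position] + [filler] + lines[position:]


def random_swap(lines, rng):
    """Return a copy of lines with one random pair of adjacent lines swapped."""
    if len(lines) < 2:
        return list(lines)
    i = rng.randrange(len(lines) - 1)
    swapped = list(lines)
    swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
    return swapped


# Method body from the sample above, one entry per line.
body = [
    "// Retrieve data using authenticated User Identity",
    "var CampEnglishName = User.Identity.GetCampEnglishName();",
]
rng = random.Random(0)  # Seeded so the augmentation is reproducible.
inserted = random_insertion(body, 'var randomVar = "RandomValue"; // Inserted a random variable', rng)
swapped = random_swap(body, rng)
print("\n".join(inserted))
print("\n".join(swapped))
```

Swapping arbitrary code lines can change program behavior, so in practice the swap would likely be restricted to pairs that are safe to reorder, such as a comment and the statement next to it.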

34. • Add more diverse prompts to the training data and switch from LoRA fine-tuning to full fine-tuning.
• Set do_sample=False.
• Set temperature=0.0.
• Set learning_rate=5e-4.
• Accuracy improved significantly; after training, the model answers the questions accurately.
• The trained model gives the right answer, but the output does not terminate normally.
Step  Training Loss  Validation Loss
25    1.185300       1.885971
50    0.637300       0.617216
75    0.121700       0.082259
100   0.031000       0.052487
125   0.021300       0.045465
150   0.019000       0.043421
175   0.017300       0.042817
200   0.015700       0.042288
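
The decoding settings above aim for deterministic output. The following is a minimal sketch, independent of any particular library, of how greedy decoding (the effect of do_sample=False) differs from temperature sampling. The logits are made up, and pick_token is a hypothetical helper rather than a real API.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # Subtract the max for numerical stability.
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def pick_token(logits, do_sample, temperature=1.0, rng=None):
    """Greedy argmax when do_sample is False, otherwise sample from the softmax."""
    if not do_sample:
        return max(range(len(logits)), key=lambda i: logits[i])
    probabilities = softmax(logits, temperature)
    rng = rng or random.Random()
    return rng.choices(range(len(logits)), weights=probabilities, k=1)[0]


logits = [1.0, 3.0, 2.0]  # Hypothetical next-token scores.
greedy = pick_token(logits, do_sample=False)
sharp = softmax(logits, temperature=0.05)
print(greedy, [round(p, 4) for p in sharp])
```

Greedy decoding never consults the temperature, and sampling at a very low temperature converges to the same choice, so do_sample=False alone is typically enough for repeatable answers.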

35. Next, fix the EOS setting: check the Phi-3.5-mini EOS token on Hugging Face.
https://huggingface.co/microsoft/Phi-3.5-mini-instruct/blob/main/special_tokens_map.json
    },
    "eos_token": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false
    },
Test script:
    prompt = """<|system|>You are an expert in .NET/.NET Framework and are familiar with the functions provided by NEBULA SDK. You know how to use NEBULA SDK to develop application systems on the CAMP platform.When providing the user with results, no explanation or thought process is needed. Ensure the code format is correct, indentation is standard, and it can compile successfully. Avoid repeating the same code output.
    <|user|>
    How to use NEBULA SDK to get the CAMP User English name of a user in a .NET Framework project? Give me sample code.<|endoftext|>
    <|assistant|>""" + "<|endoftext|>"
    generated_text = generate_text(prompt)
    print(generated_text)
    print("Tokenizer EOS token ID:", tokenizer.eos_token_id)
    print("Model EOS token ID:", model.config.eos_token_id)
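
One way to read the EOS fix: every training completion should end with the tokenizer's EOS string so the model learns where to stop, and generated text can be cut at the first EOS as a safeguard. The sketch below shows both steps with plain strings. The helper names are hypothetical, and the EOS value is the one from the special_tokens_map.json excerpt above.

```python
EOS_TOKEN = "<|endoftext|>"  # Value shown in special_tokens_map.json above.


def with_eos(completion, eos=EOS_TOKEN):
    """Append the EOS token to a training completion unless it is already there."""
    return completion if completion.endswith(eos) else completion + eos


def truncate_at_eos(generated, eos=EOS_TOKEN):
    """Cut generated text at the first EOS token; return it unchanged if none is found."""
    index = generated.find(eos)
    return generated if index == -1 else generated[:index]


sample = "var CampEnglishName = User.Identity.GetCampEnglishName();"
training_target = with_eos(sample)
raw_output = sample + EOS_TOKEN + " repeated text that should be dropped"
clean_output = truncate_at_eos(raw_output)
print(training_target)
print(clean_output)
```

If the training targets never contain the EOS token, the model has no signal for when to stop, which matches the symptom of output that does not end normally.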

36. After retraining, all outputs are correct.
Test script:
    prompt = """<|system|>
    You are an expert in .NET/.NET Framework and are familiar with the functions provided by NEBULA SDK.Please give the full and right code directly without writing out the thinking process.
    <|user|>
    How to use NEBULA SDK to get the English name of a user in a .NET Framework project? Give me sample code.
    <|assistant|>"""
    generated_text = generate_text(prompt)
    print(generated_text)
Output:
    You are an expert in .NET/.NET Framework and are familiar with the functions provided by NEBULA SDK.Please give the full and right code directly without writing out the thinking process.
    How to use NEBULA SDK to get the English name of a user in a .NET Framework project? Give me sample code.
    **Step-by-Step Analysis:**
    1. **Define Purpose**: The code aims to use NEBULA SDK by retrieving data based on the user's identity.
    2. **Code Structure**: Imports necessary namespaces and defines an MVC controller with authorization;
    public class HomeController : Controller{
        [Authorize]
        public IActionResult Index()
        {
            // Retrieve data using authenticated User Identity
            var CampUserEnglishName = User.Identity.GetCampEnglishName();
        }
    }

37. Find the key issue
• Data quality and diversity are very important
• Quality Data Annotation Matters
• Fine-Tuning Approach Should Match Objectives
• Attention to Output Formatting

39. • Scalable: Many real-world tasks require massive datasets for fine-tuning.
• Privacy and Safety: Real data has potential privacy and safety issues.
• Labeling: Manually labeling data is time-consuming.
• Pre-assessing: Allows you to simulate new scenarios.
• Retraining: Counteract drift or inject up-to-date knowledge.
Explicit DP (Differential Privacy); Implicit DP
Source: https://www.microsoft.com/en-us/research/blog/the-crossroads-of-innovation-and-privacy-private-synthetic-data-for-generative-ai/

40. [Pipeline diagram] Unstructured/structured files (PDF, CSV, TXT, …); Azure AI Document Intelligence; for each page or chunk (mini batch): image, text, table; image summary, text summary, table summary via Azure OpenAI (GPT-4o); QnA dataset; fine-tuning (1st phase): train, deploy; Azure OpenAI (fine-tuned GPT-4o); data augmentation with Evol-Instruct via Azure OpenAI (GPT-4o); augmented dataset; fine-tuning (2nd phase): train, deploy; Azure OpenAI (fine-tuned GPT-4o)

41. [Pipeline diagram] 1. Creating a seed dataset: unstructured/structured files (PDF, CSV, TXT, …); Azure AI Document Intelligence; for each page or chunk (mini batch): image, text, table; image summary, text summary, table summary via Azure OpenAI (GPT-4o); QnA dataset. GLAN: break down, sample, and generate with Azure OpenAI (GPT-4o); instruction dataset. Fine-tuning: train on the QnA and instruction datasets, deploy; Azure OpenAI (fine-tuned GPT-4o)

42. WizardLM’s main technique
• In-depth Evolving (blue) is used to evolve a simple instruction into a more complex one.
• In-breadth Evolving (red) is used to create new instructions for diversity.
• Elimination Evolving is used to filter failed evolutions and instructions.
Code: https://github.com/nlpxucan/WizardLM , https://github.com/Azure/synthetic-qa-generation/
Source: https://arxiv.org/pdf/2304.12244
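
The three evolving steps can be sketched as a small loop. The Python below is a simplified illustration and not the WizardLM or Azure repository code: the prompt templates, the elimination heuristics, and the fake_llm stand-in are all assumptions made so the example runs without a model endpoint.

```python
# Hypothetical prompt templates; the real prompts are considerably longer.
IN_DEPTH = "Rewrite the instruction below into a more complex version.\n\n{instruction}"
IN_BREADTH = "Create a brand-new instruction in the same domain as the one below.\n\n{instruction}"


def is_failed(original, evolved):
    """Elimination step (assumed heuristics): drop empty, copied, or refused outputs."""
    text = evolved.strip()
    if not text:
        return True
    if text.lower() == original.strip().lower():
        return True
    return "sorry" in text.lower() and len(text.split()) < 80


def evolve(seed_instructions, llm, rounds=1):
    """Grow an instruction pool with in-depth and in-breadth evolving, filtering failures."""
    pool = list(seed_instructions)
    frontier = list(seed_instructions)
    for _ in range(rounds):
        next_frontier = []
        for instruction in frontier:
            for template in (IN_DEPTH, IN_BREADTH):
                evolved = llm(template.format(instruction=instruction))
                if not is_failed(instruction, evolved):
                    next_frontier.append(evolved.strip())
        pool.extend(next_frontier)
        frontier = next_frontier
    return pool


def fake_llm(prompt):
    """Stand-in for a model call: evolves in-depth prompts and refuses in-breadth ones."""
    instruction = prompt.rsplit("\n\n", 1)[-1]
    if "more complex" in prompt:
        return instruction + " Also handle the case where the user is not authenticated."
    return "Sorry, I cannot help with that."


seed = "How to use NEBULA SDK to get the CAMP User English name in a .NET Framework project?"
pool = evolve([seed], fake_llm, rounds=1)
print(len(pool))
print(pool[-1])
```

In a real pipeline the llm argument would wrap a call to the chosen model, and the elimination rules would follow the criteria described in the paper.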

44. Large Language Models Are Redefining Software