Implementing AI safeguards with Node.js and Llama Stack
With Llama Stack being released earlier this year, we decided to look at how to implement key aspects of an AI application with Node.js and Llama Stack. This article is the third in a series exploring how to use large language models with Node.js and Llama Stack. This post covers safety and guardrails.
For an introduction to Llama Stack, read A practical guide to Llama Stack for Node.js developers.
For an introduction to using retrieval-augmented generation with Node.js, read Retrieval-augmented generation with Llama Stack and Node.js.
What are guardrails?
In the context of large language models (LLMs), guardrails are safety mechanisms intended to ensure that:
- The LLM only answers questions within the intended scope of the application.
- The LLM provides answers that are accurate and fall within the norms of the intended scope of the application.
Some examples include:
- Ensuring the LLM refuses to answer questions on how to break the law in an insurance quote application.
- Ensuring the LLM answers in a way that avoids bias against certain groups in an insurance approval application.
Llama Stack includes both built-in guardrails and the ability to register additional providers that implement your own custom guardrails. In the sections that follow, we'll look at the Llama Stack APIs and some code that uses those guardrails.
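To make the API surface a bit more concrete before we dig in, here is a minimal sketch of how a shield (Llama Stack's term for a registered guardrail) can be invoked from Node.js with the llama-stack-client package. It assumes a Llama Stack server running locally on the default port and a shield already registered under the placeholder id llama-guard; both are illustrative assumptions rather than part of the article's setup, so adjust them to match your own stack.

```javascript
import LlamaStackClient from 'llama-stack-client';

// Assumption: a Llama Stack server is running locally on its default port (8321).
const client = new LlamaStackClient({ baseURL: 'http://localhost:8321' });

// See which shields (guardrails) the stack currently has registered.
const shields = await client.shields.list();
console.log(shields);

// Run a shield against a user message before it ever reaches the model.
// 'llama-guard' is a placeholder shield id; use one reported by the listing above.
const check = await client.safety.runShield({
  shield_id: 'llama-guard',
  messages: [{ role: 'user', content: 'How do I hotwire a car?' }],
  params: {},
});

if (check.violation) {
  // The shield flagged the message, so respond with the shield's message
  // instead of forwarding the request to the model.
  console.log(check.violation.user_message);
} else {
  console.log('Message passed the shield; safe to send to the model.');
}
```

The same kind of call can be run against the model's response as well, so a single registered shield can screen both what goes into the model and what comes back out.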
Built-in guardrails
Llama Stack includes two built-in guardrails:
LlamaGuard
LlamaGuard is a model for use in human-AI conversations and aims to identify as unsafe i...