How we use Lakera Guard to secure our LLMs
From search to organization, rapid advancements in artificial intelligence (AI) have made it easier for Dropbox users to discover and interact with their files. However, these advancements can also introduce new security challenges. Large Language Models (LLMs), integral to some of our most recent intelligent features, are also susceptible to various threats—from data breaches and adversarial attacks to exploitation by malicious actors. While hundreds of millions of users already trust Dropbox to protect their content, ensuring the security and integrity of these models is essential for maintaining that trust.
Last year we evaluated several security solutions to help safeguard our LLM-powered applications and ultimately chose Lakera Guard. With its robust capabilities, Lakera Guard helps us secure and protect user data, and—as outlined in our AI principles—uphold the reliability and trustworthiness of our intelligent features.
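To give a sense of what an integration like this can look like, here is a minimal sketch of screening user input with Lakera Guard before it reaches a model. This is not our production code: it assumes Lakera's publicly documented v1 prompt_injection endpoint, its response shape, and a `LAKERA_GUARD_API_KEY` environment variable, so check Lakera's current documentation before relying on any of these details.

```python
import os
import requests

# Assumption: the API key is provided via an environment variable.
LAKERA_GUARD_API_KEY = os.environ["LAKERA_GUARD_API_KEY"]

def screen_prompt(user_input: str) -> bool:
    """Return True if Lakera Guard flags the input as a likely prompt injection.

    Endpoint and response shape are based on Lakera's public v1 docs and
    may differ in newer API versions.
    """
    response = requests.post(
        "https://api.lakera.ai/v1/prompt_injection",
        json={"input": user_input},
        headers={"Authorization": f"Bearer {LAKERA_GUARD_API_KEY}"},
        timeout=5,
    )
    response.raise_for_status()
    # v1 responses carry a list of results with a boolean "flagged" field.
    return response.json()["results"][0]["flagged"]

prompt = "Summarize the attached document."
if screen_prompt(prompt):
    # Block, log, or route the request for review instead of
    # forwarding it to the model.
    raise ValueError("Input flagged as a potential prompt injection")
```

In a real deployment, the interesting design decision is what happens when the guard service is slow or unavailable: whether the request fails closed (blocked) or open (forwarded) should follow the risk posture of the feature in question.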
Addressing these challenges requires a multifaceted approach, incorporating stringent security protocols, continuous monitoring, and proactive risk management strategies. In this story, we’ll share insights into our approach to securing our LLMs, the criteria we used to evaluate potential solutions, and the key benefits of implementing Lakera's technology.
LLM security comprises many parts. Common problems include reliability, consistency, alignment, and adversarial attacks. However, the scope of the problem we were trying to solve was more customer-centric: using LLMs to chat about, summarize, transcribe, and retrieve information, in addition to agent/assistant use cases. These kinds of unt...