Using LLMs to Amplify Human Labeling and Improve Dash Search Relevance
When someone uses Dropbox Dash to search or ask a question, Dash follows a retrieval-augmented generation (RAG) pattern: it first uses enterprise search to retrieve company-specific context, then uses that context to ground the generated response. Rather than responding solely from general knowledge, Dash incorporates information that already exists within an organization.
When a user submits a query, Dash first interprets the underlying information need and determines how to retrieve relevant content. Search returns a set of candidate documents, and a large language model (LLM) analyzes the most relevant results to generate an answer. Because there are millions (and, in very large enterprises, billions) of documents in the enterprise search index, Dash can pass along only a small subset of the retrieved documents to the LLM. This makes the quality of search ranking—and the labeled relevance data used to train it—critical to the quality of the final answer.
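The constraint described above, that only a small subset of retrieved documents can reach the LLM, can be sketched as a top-k selection over scored candidates. This is a minimal illustration, not Dash's implementation: `ScoredDoc`, `select_context`, the document names, and the score values are all hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredDoc:
    doc_id: str
    score: float  # relevance score from the ranking model (hypothetical scale)


def select_context(candidates, k):
    """Return the k highest-scoring candidates, best first."""
    if k <= 0:
        return []
    # nlargest avoids sorting the full candidate list when k is small.
    return heapq.nlargest(k, candidates, key=lambda d: d.score)


# Hypothetical candidates returned by search for one query.
candidates = [
    ScoredDoc("roadmap.md", 0.91),
    ScoredDoc("lunch-menu.pdf", 0.07),
    ScoredDoc("q3-plan.docx", 0.64),
    ScoredDoc("old-notes.txt", 0.22),
]

context = select_context(candidates, k=2)
print([d.doc_id for d in context])  # ['roadmap.md', 'q3-plan.docx']
```

Because everything below the cutoff is invisible to the LLM, a ranking error near the top of the list directly changes what the answer can be grounded in.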
Search results in Dash are ordered by a relevance model that assigns a score to each document based on how well it matches the query. Like most modern ranking systems, this model is trained rather than hand-tuned. It learns from examples of queries paired with documents, annotated with human relevance judgments that define what high-quality search results look like. These judgments are labeled examples in which people evaluate how well a document answers a given query.
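To make the role of relevance judgments concrete, here is a small sketch of how graded labels for one query could be used to score a ranking. It uses NDCG, a standard ranking metric; the text does not say which metric or label scale Dash uses, so the 0 to 3 grade scale, the `Judgment` type, and the choice to treat unjudged documents as grade 0 are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    query: str
    doc_id: str
    grade: int  # assumed scale: 0 = irrelevant ... 3 = highly relevant


def dcg(grades):
    """Discounted cumulative gain for a list of grades in ranked order."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades))


def ndcg_at_k(ranked_doc_ids, grade_by_doc, k):
    """NDCG@k of a ranking against human grades for a single query."""
    # Assumption: documents without a judgment count as grade 0.
    ranked = [grade_by_doc.get(d, 0) for d in ranked_doc_ids[:k]]
    ideal = sorted(grade_by_doc.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments for one query.
judgments = [
    Judgment("q3 planning", "q3-plan.docx", 3),
    Judgment("q3 planning", "roadmap.md", 1),
    Judgment("q3 planning", "lunch-menu.pdf", 0),
]
grade_by_doc = {j.doc_id: j.grade for j in judgments}

good = ndcg_at_k(["q3-plan.docx", "roadmap.md", "lunch-menu.pdf"], grade_by_doc, k=3)
bad = ndcg_at_k(["lunch-menu.pdf", "roadmap.md", "q3-plan.docx"], grade_by_doc, k=3)
print(good > bad)  # True
```

The same labeled pairs that score a ranking this way can also serve as training targets for the relevance model, which is why the quality of the labels matters so much.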