Selecting a model for semantic search at Dropbox scale

Nautilus is our search engine for finding documents and other files in Dropbox. Introduced in 2018, Nautilus uses a conventional keyword-based approach that, while functional, has some inherent shortcomings. Because Nautilus has limited contextual understanding of what someone may be looking for, users must recall a file’s exact name or the specific keywords it contains. For instance, a search for “employment contract” may overlook relevant “job agreement” or “offer letter” documents, because Nautilus does not capture their contextual similarity. And for multilingual users, Nautilus expects queries and documents to be in the same language, which hinders retrieval when content is in a different language from the query.

To mitigate these limitations, we considered techniques such as stemming, spelling correction, and query expansion for improved flexibility. However, we wondered if we could elevate the Dropbox search experience further. Could it be possible to help users find their content without needing to know the exact search term?
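To make the limitation concrete, here is a minimal sketch of exact keyword matching next to a simple form of query expansion. It is not how Nautilus works internally; the `SYNONYMS` table and both function names are hypothetical, written only to illustrate the “employment contract” example above.

```python
def tokenize(text):
    """Lowercase the text and split it into a set of whitespace-separated terms."""
    return set(text.lower().split())


def keyword_match(query, document):
    """Return True only if every query term appears verbatim in the document."""
    return tokenize(query) <= tokenize(document)


# Hypothetical, hand-written synonym table used only for this illustration.
SYNONYMS = {
    "employment": {"job"},
    "contract": {"agreement"},
}


def expanded_match(query, document):
    """Return True if each query term, or one of its listed synonyms, is in the document."""
    doc_terms = tokenize(document)
    for term in tokenize(query):
        candidates = {term} | SYNONYMS.get(term, set())
        if not candidates & doc_terms:
            return False
    return True


print(keyword_match("employment contract", "job agreement for new hire"))
print(expanded_match("employment contract", "job agreement for new hire"))
print(expanded_match("employment contract", "offer letter"))
```

With this table, expansion recovers “job agreement” but still misses “offer letter”, because nobody added that pairing by hand. Gaps of this kind are what make a fixed synonym list hard to maintain.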

Enter semantic search. Rather than rely on exact keyword matches, semantic search aims to better understand the relationship between user queries and document content. This functionality ultimately enables Dropbox users to locate crucial information more quickly, so they can spend less time searching and more time focusing on the task at hand.
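One common way to relate queries and documents by meaning is to map both to vectors and rank documents by cosine similarity. The sketch below assumes that approach purely for illustration; this section does not say which method is used. The three-dimensional vectors in `TOY_EMBEDDINGS` are made-up numbers standing in for the output of a real embedding model, and `rank` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors; a real system would obtain these from an embedding model.
TOY_EMBEDDINGS = {
    "employment contract": [0.9, 0.1, 0.0],
    "job agreement": [0.8, 0.2, 0.1],
    "offer letter": [0.7, 0.3, 0.1],
    "vacation photos": [0.0, 0.1, 0.9],
}


def rank(query, titles):
    """Sort titles from most to least similar to the query."""
    q = TOY_EMBEDDINGS[query]
    return sorted(
        titles,
        key=lambda t: cosine_similarity(q, TOY_EMBEDDINGS[t]),
        reverse=True,
    )


print(rank("employment contract", ["vacation photos", "offer letter", "job agreement"]))
```

No query term appears in any of the three titles, yet the two related documents rank above the unrelated one because their vectors point in a similar direction.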

For multilingual users, semantic search also unlocks another capability: cross-lingual search. This advanced feature allows users to search in one langu...
