Modernizing Nextdoor Search Stack - Part 2

In the previous post of the Modernizing Nextdoor Search Stack series, we explained Query Understanding and the ML models that power our Query Understanding Engine. We also covered the nuances of search at Nextdoor and what it takes to understand customer intent. This time, we focus on the retrieval and ranking of search results.

Retrieval

Information retrieval can take many forms. Users can express their information needs as a text query, either by typing into a search bar or by selecting a query from autocomplete; in some cases the query may not even be explicit. Retrieval can involve ranking existing pieces of content, such as documents or short-text answers, or composing new responses that incorporate retrieved information. At Nextdoor, we perform retrieval using the features that we capture or infer in the Query Understanding stage.

The Query Understanding stage provides us with rich data about customer intent and context. Query Understanding metadata includes raw information we get from the user (device, location, time of day, day of week, and the query itself) along with derived signals such as expanded queries, an embedding of the query, the query intent, the predicted vertical, and the predicted topic, to name a few.
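To make the shape of this metadata concrete, here is a minimal sketch of how it could be modeled as a single record. The class name, field names, types, and example values are all assumptions made for illustration; the post does not describe the actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryUnderstandingMetadata:
    """Hypothetical container for Query Understanding output (names are assumed)."""

    # Raw signals captured from the user's request.
    raw_query: str
    device: str
    latitude: float
    longitude: float
    hour_of_day: int   # 0-23
    day_of_week: int   # 0 = Monday ... 6 = Sunday

    # Signals inferred by the Query Understanding stage.
    expanded_queries: List[str] = field(default_factory=list)
    query_embedding: List[float] = field(default_factory=list)
    intent: str = ""
    predicted_vertical: str = ""
    predicted_topic: str = ""


# Illustrative values only; none of these come from the original post.
example = QueryUnderstandingMetadata(
    raw_query="plumber",
    device="ios",
    latitude=37.77,
    longitude=-122.42,
    hour_of_day=18,
    day_of_week=2,
    expanded_queries=["plumbing", "pipe repair"],
    query_embedding=[0.12, -0.40, 0.88],
    intent="find_service",
    predicted_vertical="businesses",
    predicted_topic="home_services",
)
```

Keeping raw and inferred signals in one record means the retrieval step can read everything it needs from a single object.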

We combine all of this information into a single query that we use for recall. Our underlying retrieval engine is Elasticsearch. Considering the scale of Nextdoor and the amount of data that we produce each day, we need to ensure that the system we build meets our latency requirements. For that purpose, data for our verticals is split into multiple indi...
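As an illustration of combining Query Understanding signals into one recall query, the sketch below builds an Elasticsearch query DSL body as a plain Python dictionary. It is not Nextdoor's actual query: the function name, the document field names (`title`, `body`, `topic`, `location`), the boost, and the default radius are assumptions. Only standard query DSL clauses are used (`bool`, `multi_match`, `term`, `geo_distance`).

```python
from typing import Any, Dict, List


def build_search_body(
    raw_query: str,
    expanded_queries: List[str],
    predicted_topic: str,
    lat: float,
    lon: float,
    radius: str = "10km",   # assumed default search radius
    size: int = 20,         # assumed number of candidates to recall
) -> Dict[str, Any]:
    """Build a hypothetical Elasticsearch request body from query metadata."""
    # The original query must match; the title boost is an assumption.
    must = [{"multi_match": {"query": raw_query, "fields": ["title^2", "body"]}}]

    # Expanded queries are optional matches that only raise the score.
    should = [
        {"multi_match": {"query": q, "fields": ["title", "body"]}}
        for q in expanded_queries
    ]

    # Filters restrict the candidate set without affecting scoring.
    filters: List[Dict[str, Any]] = [
        {"geo_distance": {"distance": radius, "location": {"lat": lat, "lon": lon}}}
    ]
    if predicted_topic:
        filters.append({"term": {"topic": predicted_topic}})

    return {
        "size": size,
        "query": {"bool": {"must": must, "should": should, "filter": filters}},
    }


body = build_search_body(
    raw_query="plumber",
    expanded_queries=["plumbing", "pipe repair"],
    predicted_topic="home_services",
    lat=37.77,
    lon=-122.42,
)
```

Placing the location and topic constraints in `filter` rather than `must` keeps them out of relevance scoring and lets Elasticsearch cache them, which tends to help latency.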
