Training and Finetuning Reranker Models with Sentence Transformers v4
Published March 26, 2025
Sentence Transformers is a Python library for using and training embedding and reranker models for a wide range of applications, such as retrieval augmented generation, semantic search, semantic textual similarity, paraphrase mining, and more. Its v4.0 update introduces a new training approach for rerankers, also known as cross-encoder models, similar to what the v3.0 update introduced for embedding models. In this blogpost, I'll show you how to use it to finetune a reranker model that beats all existing options on exactly your data. This method can also train extremely strong new reranker models from scratch.
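To give a sense of what a reranker does in practice, here is a minimal usage sketch. The checkpoint is an existing pretrained cross-encoder chosen as an example, and the query and documents are made up for illustration:

```python
from sentence_transformers import CrossEncoder

# Load a pretrained reranker (a.k.a. cross-encoder) from the Hugging Face Hub.
# Any cross-encoder checkpoint works here; this MS MARCO model is just an example.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How many people live in Berlin?"
documents = [
    "Berlin had a population of 3,520,031 registered inhabitants in 2016.",
    "Berlin is well known for its museums.",
    "In 2014, the city state Berlin had 37,368 live births.",
]

# Score each (query, document) pair; higher scores mean more relevant.
scores = model.predict([(query, doc) for doc in documents])
print(scores)

# Alternatively, rank all documents for the query in one call.
for hit in model.rank(query, documents):
    print(f"{hit['score']:.3f}  {documents[hit['corpus_id']]}")
```

Unlike an embedding model, which encodes query and document separately, a cross-encoder reads each (query, document) pair jointly and outputs a single relevance score, which is what makes rerankers slower but typically more accurate.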
Finetuning reranker models involves several components: datasets, loss functions, training arguments, evaluators, and the trainer class itself. I'll have a look at each of these components, accompanied by practical examples of how they can be used for finetuning strong reranker models.
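As a preview of how these components fit together, here is a minimal training sketch using the v4 cross-encoder training API. The dataset name and its (question, answer, label) column layout are placeholders rather than the exact setup used later in this post, and the hyperparameters are illustrative:

```python
from datasets import load_dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

# 1. Model: initialize a cross-encoder from any encoder backbone.
model = CrossEncoder("answerdotai/ModernBERT-base", num_labels=1)

# 2. Dataset: assumed to have (question, answer, label) columns with 0/1 labels.
#    "my-org/my-pairwise-dataset" is a placeholder name.
dataset = load_dataset("my-org/my-pairwise-dataset", split="train")
split = dataset.train_test_split(test_size=1_000, seed=12)
train_dataset, eval_dataset = split["train"], split["test"]

# 3. Loss: binary cross-entropy over the model's single relevance logit.
loss = BinaryCrossEntropyLoss(model)

# 4. Training arguments: a small, illustrative configuration.
args = CrossEncoderTrainingArguments(
    output_dir="models/my-reranker",
    num_train_epochs=1,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    eval_strategy="steps",
    eval_steps=1_000,
)

# 5. Trainer: ties the model, data, loss, and arguments together.
#    An evaluator can also be passed via the `evaluator` argument.
trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
model.save_pretrained("models/my-reranker/final")
```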
Lastly, in the Evaluation section, I'll show you that the small tomaarsen/reranker-ModernBERT-base-gooaq-bce reranker model that I finetuned alongside this blogpost easily outperforms the 13 most commonly used public reranker models on my evaluation dataset. It even beats models that are 4x larger.
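For reference, that finished model can be loaded and used like any other reranker in the library. Here is a minimal inference sketch; the query and candidate answers are made up for illustration:

```python
from sentence_transformers import CrossEncoder

# Load the finetuned reranker from the Hugging Face Hub.
model = CrossEncoder("tomaarsen/reranker-ModernBERT-base-gooaq-bce")

query = "how to remove rust from cast iron"
candidates = [
    "Scrub the pan with steel wool, wash, dry, and re-season it with oil.",
    "Cast iron pans were first mass-produced in the 19th century.",
    "Stainless steel is an alloy of iron, chromium, and other metals.",
]

# Rank the candidates by relevance to the query.
for hit in model.rank(query, candidates):
    print(f"{hit['score']:.3f}  {candidates[hit['corpus_id']]}")
```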
Repeating the recipe with a bigger base model results in tomaarsen/reranker-ModernBERT-large-gooaq-bce, a reranker model that blows all existing general-purpose reranker models out of the water on my data.
If you're interested in finetuning embedding models instead, then consider reading through the earlier Training and Finetuning Embedding Models with Sentence Transformers v3 blogpost as well.