How to Fine-tune Florence-2 for Object Detection
Florence-2 is a lightweight vision-language model open-sourced by Microsoft under the MIT license. The model demonstrates strong zero-shot and fine-tuning capabilities across tasks such as captioning, object detection, grounding, and segmentation. You can learn more about the capabilities of the pre-trained Florence model from our blog post.
Figure 1. Illustration showing the level of spatial hierarchy and semantic granularity expressed by each task. Source: Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks.
Like other pre-trained foundational models, Florence-2 may lack domain-specific knowledge. For example, it may perform poorly with medical or satellite imagery. In such cases, fine-tuning with a custom dataset is necessary. This tutorial will show you how to fine-tune Florence-2 on object detection datasets to improve model performance for your specific use case. Let's dive in!
Figure 2. The result of Florence-2 inference on a validation subset of the custom dataset before fine-tuning.
Figure 3. The result of Florence-2 inference on a validation subset of the custom dataset after fine-tuning.
Getting Started
Before we fine-tune the Florence-2 model on a custom detection dataset, we need to properly configure our environment. This tutorial is accompanied by a notebook that you can open in a separate tab and follow along.
Open the notebook that accompanies this guide.
Before we discuss the data format, model training, and evaluation, make sure your environment is GPU-accelerated. If you are using our Google Colab, ensure y...
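One quick way to confirm that a GPU is visible before training is sketched below. This is a minimal check, not part of the original notebook: the helper name `gpu_available` is hypothetical, and the code assumes an NVIDIA GPU. It uses PyTorch's `torch.cuda.is_available()` when PyTorch is installed and otherwise falls back to the `nvidia-smi` command-line tool.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if an NVIDIA GPU appears to be usable (hypothetical helper)."""
    # Prefer PyTorch's own check when PyTorch can be imported.
    try:
        import torch
        return bool(torch.cuda.is_available())
    except Exception:
        # PyTorch is missing or failed to load; fall back to the driver CLI.
        pass

    # nvidia-smi ships with the NVIDIA driver; if it is absent, assume no GPU.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        # "nvidia-smi -L" prints one line per detected GPU.
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("GPU available:", gpu_available())
```

If this prints `GPU available: False`, check the runtime settings before continuing.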