Fine-Tuning Gemma 3 1B-IT for Financial Sentiment Analysis: A Step-by-Step Guide

Gemma 3 is Google’s latest family of lightweight, state-of-the-art open AI models. Designed for high performance and resource efficiency, the 1B Instruct (IT) version is optimized for instruction-following tasks, making it a powerful yet accessible tool for developers.

More details are in the official announcement: Gemma 3 Blog Post.

Gemma 3 utilizes a transformer architecture enhanced with techniques like RoPE embeddings and GeGLU activations. Key features include:

  • 128K-token context window: Handles long inputs in the 4B, 12B, and 27B models; the 1B model has a 32K-token window.
  • Multilingual support: Covers over 140 languages.
  • Multimodal capabilities: The larger models accept images and short videos alongside text; the 1B model is text-only.
  • Edge device optimization: Runs efficiently on consumer hardware.

Dataset Selection

Annotated datasets of financial and economic texts are relatively rare, and many are proprietary. To address this, researchers from the Aalto University School of Business introduced the FinancialPhraseBank dataset in 2014, which contains approximately 5,000 sentences.

This dataset provides human-annotated benchmarks, allowing for consistent evaluation of different modeling techniques. The annotations were performed by 16 individuals with a background in financial markets, who categorized the sentences as having a:

  • Positive impact on stock prices
  • Negative impact on stock prices
  • Neutral impact on stock prices

The impact was assessed from an investor’s perspective.
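
To make the three-way labeling concrete, here is a minimal Python sketch of how a sentence might be turned into an instruction-style prompt and how a model's free-form reply might be mapped back to one of the three labels. The prompt wording, the `build_prompt` and `parse_label` helpers, and the `"unknown"` fallback are assumptions made for illustration; they are not taken from the original guide or from the dataset's documentation.

```python
# Hypothetical helpers for illustration; names and prompt wording are assumptions.
LABELS = ("positive", "negative", "neutral")


def build_prompt(sentence: str) -> str:
    """Format one sentence as an instruction-style classification prompt."""
    return (
        "Analyze the sentiment of the financial news sentence from an "
        "investor's perspective. Answer with exactly one word: "
        "positive, negative, or neutral.\n"
        f"Sentence: {sentence.strip()}\n"
        "Sentiment:"
    )


def parse_label(generated: str) -> str:
    """Return the label that appears first in the model output, or 'unknown'."""
    text = generated.lower()
    # Collect (position, label) pairs for every label present in the text.
    hits = [(text.find(label), label) for label in LABELS if label in text]
    return min(hits)[1] if hits else "unknown"


if __name__ == "__main__":
    # Made-up example sentence, not drawn from the dataset.
    print(build_prompt("The company's quarterly profit rose compared with last year."))
    print(parse_label(" Positive."))
```

Picking the earliest matching label keeps the parser tolerant of extra words in the reply, while the `"unknown"` fallback makes unparseable outputs visible during evaluation instead of silently counting them as one of the classes.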

...