Building an On-Device RAG System Powered by Gemma Models

Google's EmbeddingGemma 300M model generates text embeddings on-device (e.g., on a phone or laptop), supporting semantic search, retrieval, classification, and clustering across 100+ languages. A previous blog post explains both the high-level Python workflow (via SentenceTransformers) and the lower-level Android implementation (using LiteRT, tokenisation, TFLite, etc.). It shows how developers can load the model asset, tokenize the input, run inference, and then compute cosine similarity on the resulting vectors, all without a server dependency.
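To make the last step concrete, here is a minimal sketch of cosine similarity over two embedding vectors. It assumes the embeddings are already available as `FloatArray` values of equal length; the function name `cosineSimilarity` is illustrative and is not taken from the previous post.

```kotlin
import kotlin.math.sqrt

/**
 * Cosine similarity between two embedding vectors of equal length.
 * Returns a value in [-1, 1], or 0 if either vector has zero length.
 * (Illustrative helper; the name and the zero-vector convention are assumptions.)
 */
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "Vectors must have the same dimension" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        val x = a[i].toDouble()
        val y = b[i].toDouble()
        dot += x * y      // accumulate the dot product
        normA += x * x    // accumulate squared magnitude of a
        normB += y * y    // accumulate squared magnitude of b
    }
    val denominator = sqrt(normA) * sqrt(normB)
    return if (denominator == 0.0) 0f else (dot / denominator).toFloat()
}
```

Vectors pointing in the same direction score close to 1, while orthogonal vectors score 0, so a higher score indicates a chunk that is semantically closer to the query.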

This post provides a step-by-step walkthrough for loading a PDF file, extracting and chunking its text, performing similarity matching, and using a Gemma 3 model to generate context-aware answers to user queries about the document.
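The chunking, matching, and answer-generation stages can be sketched roughly as follows. This is an assumption-laden outline rather than the post's actual implementation: the character-based chunk size and overlap, the helper names (`chunkText`, `topKIndices`, `buildPrompt`), and the prompt wording are all hypothetical, and the similarity scores are assumed to have been computed elsewhere (for example, with cosine similarity between the query embedding and each chunk embedding).

```kotlin
/**
 * Splits text into chunks of at most maxChars characters.
 * Consecutive chunks share `overlap` characters so that a sentence cut at a
 * boundary still appears whole in one of them. (Sizes are assumed defaults.)
 */
fun chunkText(text: String, maxChars: Int = 500, overlap: Int = 50): List<String> {
    require(maxChars > 0) { "maxChars must be positive" }
    require(overlap in 0 until maxChars) { "overlap must be in [0, maxChars)" }
    val chunks = mutableListOf<String>()
    var start = 0
    while (start < text.length) {
        val end = minOf(start + maxChars, text.length)
        val chunk = text.substring(start, end).trim()
        if (chunk.isNotEmpty()) chunks.add(chunk)
        if (end == text.length) break
        start = end - overlap // step back so the next chunk overlaps this one
    }
    return chunks
}

/** Returns the indices of the k highest scores, best first. */
fun topKIndices(scores: FloatArray, k: Int): List<Int> =
    scores.indices.sortedByDescending { scores[it] }.take(k)

/**
 * Assembles a plain-text prompt from the retrieved chunks and the question.
 * The wording is a hypothetical template, not an official Gemma prompt format.
 */
fun buildPrompt(question: String, contextChunks: List<String>): String =
    buildString {
        appendLine("Answer the question using only the context below.")
        appendLine()
        contextChunks.forEachIndexed { i, chunk -> appendLine("[${i + 1}] $chunk") }
        appendLine()
        append("Question: $question")
    }
```

In this outline, the extracted PDF text would go through `chunkText`, each chunk would be embedded and scored against the query, and the chunks selected by `topKIndices` would be passed to `buildPrompt` before calling the generative model.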

Step 1 - Extract the text from the PDF file:
One library that can be used directly on mobile is iText Core. It can extract text from a PDF file in the assets folder with the snippet below:

    context.assets.open(assetFileName).use { inputStream ->
        val pdfReader = PdfReader(inputStream)
        val pdfDocument = PdfDocument(pdfReader)
        val text = StringBuilder()
        val numberOfPages = pdfDocument.numberOfPages

        // Extract text from all pages (limit to first n pages to avoid overwhelming)
        val pagesToProcess = minOf(numberOfPages, 100)
        for (page in 1..pagesToProcess) {
            val pageText = PdfTextExtractor.getTextFromPage(pdfDocument.getPage(page))
            if (pageText.isNotBlank()) {
                text.append("## Page $page\n")
                text.append(pageText.trim())
                text.append("\n\n")
            }
        }
        pdfDocument.close()
        val result = text.toSt...
