Open Source and In-House Development: How Uber Optimizes LLM Training
Generative AI powered by LLMs (Large Language Models) has a wide range of applications at Uber, like Uber Eats recommendations and search, customer support chatbots, code development, and SQL query generation.
To support these applications, Uber leverages open-source models like Meta® Llama 2 and Mistral AI Mixtral®, and closed-source models from OpenAI, Google, and other third-party providers. As a leading company in mobility and delivery, Uber also has considerable domain-specific knowledge that can improve LLM performance for these applications. One way Uber incorporates this domain-specific knowledge is through RAG (Retrieval Augmented Generation).
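To make the RAG flow concrete, here is a minimal, illustrative sketch: it retrieves the most relevant snippets from a small domain knowledge base with a simple TF-IDF retriever and prepends them to the prompt before it is sent to a model. The document contents, the `build_prompt` helper, and the downstream LLM call are hypothetical placeholders, not Uber's production pipeline.

```python
# Minimal RAG sketch (illustrative only): retrieve domain snippets with a
# TF-IDF retriever and prepend them to the prompt.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical domain-specific knowledge base entries.
documents = [
    "Dish: Spicy Tuna Roll - tags: sushi, seafood, spicy",
    "Restaurant: Casa Blanca - cuisine: Mexican, popular items: tacos, burritos",
    "Dish: Margherita Pizza - tags: pizza, vegetarian, Italian",
]

vectorizer = TfidfVectorizer().fit(documents)
doc_vectors = vectorizer.transform(documents)

def build_prompt(query: str, top_k: int = 2) -> str:
    """Retrieve the top-k most similar snippets and build an augmented prompt."""
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    top_docs = [documents[i] for i in scores.argsort()[::-1][:top_k]]
    context = "\n".join(top_docs)
    return (
        "Use the context below to answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# The resulting prompt would then be sent to an open-source or third-party LLM.
print(build_prompt("Which vegetarian dishes are available?"))
```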
Uber also explores ways to adapt LLMs to Uber’s knowledge base through continuous pre-training and instruction fine-tuning. For example, for Uber Eats, we found that a model fine-tuned on Uber’s knowledge of items, dishes, and restaurants improved the accuracy of item tagging, search queries, and user preference understanding compared to off-the-shelf open-source models. Furthermore, these fine-tuned models can achieve performance similar to GPT-4 models while supporting much higher traffic at Uber’s scale.
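As a rough illustration of what instruction fine-tuning with off-the-shelf open-source tooling looks like, the sketch below formats (instruction, response) pairs into prompts and runs a standard causal-LM training loop with Hugging Face transformers. The base model name, the dataset file, the prompt template, and every hyperparameter are assumptions for illustration, not Uber's actual setup.

```python
# Illustrative instruction fine-tuning sketch with Hugging Face transformers.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical JSONL file with {"instruction": ..., "response": ...} records.
dataset = load_dataset("json", data_files="uber_eats_instructions.jsonl")["train"]

def tokenize(example):
    # Simple prompt template; real templates vary by model and task.
    text = (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Response:\n{example['response']}")
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama2-ubereats-sft",
        per_device_train_batch_size=4,
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```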
AI community support and open-source libraries like transformers, Microsoft DeepSpeed®, and PyTorch FSDP empower Uber to rapidly build infrastructure to efficiently train and evaluate LLMs. Emerging open-source initiatives like Meta® Llama 3 llama-recipes, Microsoft LoRA®, QLoRA™, and Hugging Face PEFT™ simplify the fine-tuning lifecycle for LLMs and reduce engineering effort. Tools like Ray® and vLLM™ maximize the throughput of large-scale model training and serving.
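For parameter-efficient fine-tuning, libraries such as Hugging Face PEFT let a base model be wrapped with LoRA adapters in a few lines. The sketch below is a minimal example under assumed settings (the base model, target modules, and ranks are illustrative defaults, not Uber's configuration); only the small adapter matrices are trained, which sharply reduces memory use and engineering effort.

```python
# Sketch of adding LoRA adapters to a causal LM with Hugging Face PEFT.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # assumed base model

lora_config = LoraConfig(
    r=16,                                  # adapter rank (illustrative)
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Loading the base model in 4-bit precision (for example via bitsandbytes) before attaching the adapters is, roughly, what a QLoRA-style setup does, trading a small amount of accuracy for a much smaller memory footprint.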