Flow generation through natural language: an agentic modeling approach

If you're building AI products on top of closed models, anyone with an API key can get similar capabilities. Lasting differentiation comes from proprietary data, the training recipe, the infrastructure, and the speed of iteration.

Shopify has something most companies don't: a product surface where millions of merchant interactions directly signal whether the model's output is any good. That feedback loop is the foundation, but only if you keep learning from it.

We fine-tuned a tool-calling agent to turn natural language into a Shopify Flow for Sidekick, our AI commerce assistant. It's 2.2x faster, 68% cheaper, and outperforms closed models.

Along the way, we found lessons no paper warned us about. Data preprocessing decisions, from representation design to formatting details, that compound to swing accuracy by double digits. Silent infrastructure failures that degrade your model with zero warnings and take days to trace. Benchmark parity that masks a 35% gap once real users show up.

This post covers the problems we faced, how we fixed them, and what to look for if you're doing the same.

Data pipeline > Flywheel

Building the training dataset

Shopify Flow is an automation platform where store owners build workflows from triggers, conditions, and actions. For store owners who aren't engineers, building the right workflow from a blank canvas is daunting. Sidekick generates it from plain English.
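
To make the trigger, condition, and action structure concrete, here is a minimal sketch of how such a workflow could be represented and evaluated. This is an illustration only, not Shopify Flow's actual schema or API: the class names (`Condition`, `Action`, `Workflow`), the trigger name `order_created`, the field `total_price`, and the action `add_order_tag` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Condition:
    """A single check against the event payload (hypothetical shape)."""
    field_name: str
    operator: str  # one of "eq", "gt", "lt" in this sketch
    value: Any

    def matches(self, event: dict[str, Any]) -> bool:
        actual = event.get(self.field_name)
        if actual is None:
            return False
        if self.operator == "eq":
            return actual == self.value
        if self.operator == "gt":
            return actual > self.value
        if self.operator == "lt":
            return actual < self.value
        raise ValueError(f"unsupported operator: {self.operator}")


@dataclass
class Action:
    """Something to do when the workflow fires (hypothetical shape)."""
    name: str
    params: dict[str, Any] = field(default_factory=dict)


@dataclass
class Workflow:
    """One trigger, zero or more conditions, one or more actions."""
    trigger: str
    conditions: list[Condition] = field(default_factory=list)
    actions: list[Action] = field(default_factory=list)

    def actions_for(self, trigger: str, event: dict[str, Any]) -> list[Action]:
        # The workflow only fires for its own trigger, and only when
        # every condition holds for the event payload.
        if trigger != self.trigger:
            return []
        if all(c.matches(event) for c in self.conditions):
            return list(self.actions)
        return []


# A request such as "tag orders over 500 as high value" might map to:
high_value_orders = Workflow(
    trigger="order_created",  # hypothetical trigger name
    conditions=[Condition("total_price", "gt", 500)],
    actions=[Action("add_order_tag", {"tag": "high-value"})],
)
```

Calling `high_value_orders.actions_for("order_created", {"total_price": 750})` returns the single tagging action, while an order with `total_price` of 100 returns an empty list. Generating a workflow from plain English then amounts to producing a valid structure of this kind from the merchant's request.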

Shopify Admin showing Flow

The cold start problem

Fine-tuning required training data, but since the feature hadn't been deployed yet, there were no production convers...
