Generative Modeling by Estimating Gradients of the Data Distribution
This blog post focuses on a promising new direction for generative modeling. We can learn score functions (gradients of log probability density functions) on a large number of noise-perturbed data distributions, and then generate samples with Langevin-type sampling. The resulting generative models, often called score-based generative models (or diffusion probabilistic models), have several important advantages over existing model families: GAN-level sample quality without adversarial training, flexible model architectures, exact log-likelihood computation, uniquely identifiable representation learning, and inverse problem solving without re-training models. In this blog post, we will walk through the intuition, basic concepts, and potential applications of score-based generative models in more detail.
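To make these terms concrete before diving in, here is a minimal mathematical sketch (standard notation, not a definition specific to this post): the score function of a distribution $p(\mathbf{x})$, and the Langevin-type update that uses it to draw samples, where $\epsilon > 0$ is a small step size and $K$ is the number of iterations.

$$
\mathbf{s}(\mathbf{x}) := \nabla_{\mathbf{x}} \log p(\mathbf{x}), \qquad
\mathbf{x}_{i+1} \leftarrow \mathbf{x}_i + \epsilon\, \nabla_{\mathbf{x}} \log p(\mathbf{x}_i) + \sqrt{2\epsilon}\, \mathbf{z}_i, \quad \mathbf{z}_i \sim \mathcal{N}(\mathbf{0}, I), \quad i = 0, 1, \ldots, K-1.
$$

When $\epsilon$ is small and $K$ is large, $\mathbf{x}_K$ is approximately a sample from $p(\mathbf{x})$; score-based models replace the unknown true score $\nabla_{\mathbf{x}} \log p(\mathbf{x})$ with a learned approximation.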
The score function, score-based models, and score matching
Naive score-based generative modeling and its pitfalls
Score-based generative modeling with multiple noise perturbations
Existing generative modeling techniques can largely be grouped into two categories based on how they represent probability distributions. (1) The first is likelihood-based models, which directly learn the distribution's probability density (or mass) function via (approximate) maximum likelihood. Typical likelihood-based models include autoregressive models, normalizing flow models, energy-based models (EBMs), and variational auto-encoders (VAEs). (2) The second is implicit generative models, where the probability distribution is implicitly represented by a model of its sampling process. The most prominent example is generative adversarial networks (GANs), where new samples from the data distribution are synthesized by transforming a random Gaussian vector...