Beyond Optimal Transport: Model-Aligned Coupling for Flow Matching

Sydney AI Centre, The University of Sydney    * Corresponding Author

MAC selects model-aligned couplings and yields significantly better sample quality with fewer integration steps than other methods.

Abstract

Flow Matching (FM) is an effective framework for training a model to learn a vector field that transports samples from a source distribution to a target distribution. Early FM methods train on random couplings, which often produce crossing paths and lead the model to learn non-straight trajectories that require many integration steps to generate high-quality samples. To address this, recent methods use Optimal Transport (OT) to construct couplings by minimizing geometric distance, which reduces path crossings. However, we observe that such geometry-based couplings do not necessarily align with the model's preferred trajectories. The vector field they induce can therefore be difficult to learn, which prevents the model from learning straight trajectories. Motivated by this, we propose Model-Aligned Coupling (MAC), an effective method that matches training couplings based not only on geometric distance but also on alignment with the model's preferred transport directions, as measured by its prediction error. To avoid a costly matching procedure, MAC selects the top-\( k \) fraction of couplings with the lowest error for training. Extensive experiments show that MAC significantly improves generation quality and efficiency in few-step settings compared to existing methods.

Method

The goal of Model-Aligned Coupling (MAC) is to construct couplings that are better aligned with the model's current ability to fit the data. Specifically, we aim to prioritize couplings \( (x_0, x_1) \) that have lower prediction error under the current vector field \( v_\theta \). To measure how well the model fits a given pair, we use the pairwise prediction error, defined as:

\[ \mathcal{L}_{\mathrm{pair}}(x_0, x_1) := \mathbb{E}_{t \sim \mathcal{U}[0,1]} \left[ \left\| v_\theta((1 - t)x_0 + t x_1, t) - (x_1 - x_0) \right\|^2 \right] \]
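The expectation over \( t \) in the pairwise error can be approximated by sampling. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `pairwise_error`, the number of time samples `n_t`, and the call signature `v_theta(x_t, t)` are all hypothetical choices made for illustration.

```python
import numpy as np

def pairwise_error(v_theta, x0, x1, n_t=8, rng=None):
    """Monte Carlo estimate of L_pair for every pair in a batch.

    v_theta : callable (x_t, t) -> predicted velocity; assumed to accept
              x_t of shape (B, D) and t of shape (B, 1) and return (B, D).
    x0, x1  : arrays of shape (B, D); row i of x0 is paired with row i of x1.
    n_t     : number of time samples per pair (hypothetical default).
    """
    rng = np.random.default_rng() if rng is None else rng
    target = x1 - x0                      # straight-line velocity x_1 - x_0
    err = np.zeros(x0.shape[0])
    for _ in range(n_t):
        t = rng.uniform(0.0, 1.0, size=(x0.shape[0], 1))
        x_t = (1.0 - t) * x0 + t * x1     # point on the linear interpolation path
        err += np.sum((v_theta(x_t, t) - target) ** 2, axis=1)
    return err / n_t                      # shape (B,), one error per pair
```

As a sanity check, a field that always predicts zero gives an error equal to the squared distance \( \|x_1 - x_0\|^2 \) for each pair, and a field that returns exactly \( x_1 - x_0 \) gives zero.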

Our objective is to find a coupling \( \tilde{\rho}(x_0, x_1) \in \mathcal{C}(p_0, p_1) \) that minimizes the expected prediction error:

\[ \tilde{\rho} = \arg\min_{\rho \in \mathcal{C}(p_0, p_1)} \, \mathbb{E}_{(x_0, x_1) \sim \rho} \left[ \mathcal{L}_{\mathrm{pair}}(x_0, x_1) \right] \]

where \( \mathcal{C}(p_0, p_1) \) denotes the set of admissible couplings, i.e. joint distributions whose marginals are \( p_0 \) and \( p_1 \).
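As the abstract notes, MAC avoids solving this matching problem directly and instead keeps the top-\( k \) fraction of couplings with the lowest error for training. A minimal sketch of that selection step follows, assuming per-pair errors have already been computed for a batch of candidate pairs. The function name `select_top_k`, the default `k`, and the rounding rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def select_top_k(errors, k=0.5):
    """Return indices of the lowest-error fraction k of candidate pairs.

    errors : array of shape (B,), one prediction error per (x0, x1) pair.
    k      : fraction of pairs to keep, in (0, 1] (hypothetical default).
    """
    errors = np.asarray(errors)
    # Round up and keep at least one pair (an assumed convention).
    n_keep = max(1, int(np.ceil(k * errors.shape[0])))
    # Stable sort so ties are broken by original order.
    return np.argsort(errors, kind="stable")[:n_keep]

# Usage: keep the half of the batch that the model currently fits best.
errors = np.array([3.0, 1.0, 2.0, 0.5])
keep = select_top_k(errors, k=0.5)   # indices of the two smallest errors
```

The selected indices would then be used to subset the batch before computing the usual flow matching loss. How the candidate pairs are formed before selection is left open here.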

Results

Quantitative Results

MAC Results

Qualitative Results on CelebA-HQ-256

MAC Results