StochasticOptimalTransport
- StochasticOptimalTransport.DiscreteMeasure
- StochasticOptimalTransport.ctransform
- StochasticOptimalTransport.wasserstein
StochasticOptimalTransport.DiscreteMeasure — Method

```julia
DiscreteMeasure(xs::AbstractVector, ps::AbstractVector)
```

Construct a discrete measure with support xs and corresponding weights ps.
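A minimal usage sketch (assuming the weights are normalized to sum to one, and that the constructor is reachable as written; otherwise qualify it with the module name):

```julia
using StochasticOptimalTransport

xs = [0.0, 1.0, 2.5]  # support points
ps = [0.2, 0.5, 0.3]  # weights (assumed to sum to one)
μ = DiscreteMeasure(xs, ps)
```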
StochasticOptimalTransport.ctransform — Method

```julia
ctransform(c, v, x, ν, ε)
```

Compute the c-transform
\[v^{c,ε}(x) = \begin{cases} - ε \log\bigg(\int \exp{\Big(\frac{v_y - c(x, y)}{ε}\Big)} \, ν(\mathrm{d}y)\bigg) & \text{if } ε > 0,\\ \min_{y} c(x, y) - v_y & \text{otherwise}. \end{cases}\]
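For intuition, here is a standalone sketch of this formula for a discrete measure ν with support ys and weights ps (an illustrative reimplementation under those assumed names, not the package's internal code):

```julia
# Sketch: c-transform of dual potentials `v` at a point `x`, for a discrete
# measure ν with atoms `ys` and weights `ps` (assumed names, not the
# package's API). `ε === nothing` selects the unregularized case.
function ctransform_sketch(c, v, x, ys, ps, ε)
    if ε === nothing
        # unregularized case: hard minimum over the support of ν
        return minimum(c(x, ys[i]) - v[i] for i in eachindex(ys))
    end
    # regularized case: -ε log ∫ exp((v_y - c(x, y)) / ε) ν(dy), computed
    # with a stabilized log-sum-exp over the atoms of ν
    z = [(v[i] - c(x, ys[i])) / ε + log(ps[i]) for i in eachindex(ys)]
    m = maximum(z)
    return -ε * (m + log(sum(exp, z .- m)))
end
```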
StochasticOptimalTransport.wasserstein — Method

```julia
wasserstein([rng, ]c, μ, ν[, ε; kwargs...])
```

Estimate the (entropic regularization of the) Wasserstein distance
\[W_{ε}(μ, ν) = \min_{π ∈ Π(μ,ν)} \int c(x, y) \,π(\mathrm{d}(x,y)) + ε \mathrm{KL}(π \,|\, μ ⊗ ν)\]
with respect to the cost function c using stochastic optimization.
If both measures μ and ν are DiscreteMeasures, then the Wasserstein distance is approximated with stochastic averaged gradient descent (SAG). The convergence rate of SAG is $O(1 / k)$, where $k$ is the number of iterations, and hence converges faster than stochastic gradient descent (SGD) at the price of increased memory consumption.
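For example, a discrete–discrete call might look as follows (a sketch following the documented signature; the cost function and data are made up):

```julia
using StochasticOptimalTransport
using Random

rng = MersenneTwister(42)
c(x, y) = abs(x - y)  # cost function on the real line

# two uniform discrete measures with 50 atoms each
μ = DiscreteMeasure(randn(rng, 50), fill(1 / 50, 50))
ν = DiscreteMeasure(randn(rng, 50), fill(1 / 50, 50))

# both measures are discrete, so SAG is used; ε = 0.1 adds entropic regularization
w = wasserstein(rng, c, μ, ν, 0.1; maxiters = 20_000)
```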
If only one of the measures μ and ν is a DiscreteMeasure, then the Wasserstein distance is approximated with stochastic gradient descent with averaging (SGA). In this case, it is required that samples of the non-discrete measure λ can be obtained with rand(rng, λ). The convergence rate of SGA is $O(1/√k)$, where $k$ is the number of iterations.
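In the semi-discrete case, the continuous measure only needs to support rand(rng, λ); for instance, a distribution from Distributions.jl qualifies (a sketch under that assumption):

```julia
using StochasticOptimalTransport
using Distributions  # Normal supports rand(rng, λ)
using Random

rng = MersenneTwister(1)
c(x, y) = (x - y)^2

μ = Normal(0, 1)  # continuous measure, sampled via rand(rng, μ)
ν = DiscreteMeasure(randn(rng, 20), fill(1 / 20, 20))

# one measure is discrete, so SGA is used; ε = nothing (omitted) gives the
# unregularized semi-discrete problem
w = wasserstein(rng, c, μ, ν; maxiters = 50_000)
```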
If ε is nothing (the default), the unregularized Wasserstein distance is approximated. Otherwise, the entropically regularized Wasserstein distance with ε > 0 is estimated.
The SAG algorithm uses a constant step size, whereas the SGA algorithm uses the step size schedule
\[ τᵢ = \frac{τ₁}{1 + \sqrt{(i - 1) / w}}\]
for the $i$th iteration, where $τ₁$ corresponds to the initial step size and $w$ indicates the number of iterations serving as a warm-up phase.
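As a one-line sketch, the schedule reads:

```julia
# SGA step size schedule: τᵢ = τ₁ / (1 + √((i - 1) / w))
stepsize(i, τ₁, w) = τ₁ / (1 + sqrt((i - 1) / w))

stepsize(1, 1.0, 1)    # 1.0: the initial step size τ₁
stepsize(101, 1.0, 1)  # ≈ 0.091: decays roughly like 1/√i past the warm-up
```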
Keyword arguments
- maxiters::Int = 10_000: maximum number of gradient steps
- stepsize = 1: constant step size of SAG or initial step size $τ₁$ of SGA
- warmup_phase = 1: warm-up phase $w$ of SGA
- atol = 0: absolute tolerance
- rtol = iszero(atol) ? typeof(float(atol))(1 // 10_000) : 0: relative tolerance
- montecarlo_samples = 10_000: number of Monte Carlo samples from μ for approximating an expectation with respect to μ in the semi-discrete optimal transport problem
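A call exercising several of these keywords might look like this (a sketch; μ, ν, and c are assumed to be constructed as in the examples above):

```julia
# Sketch: tightening the stopping criterion and the sampling budget
wasserstein(c, μ, ν, 0.05;
    maxiters = 100_000,           # allow more gradient steps
    stepsize = 0.5,               # constant step (SAG) or τ₁ (SGA)
    warmup_phase = 10,            # longer warm-up w for SGA
    atol = 1e-6,                  # absolute stopping tolerance
    montecarlo_samples = 50_000,  # only used in the semi-discrete case
)
```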
References
Genevay et al. (2016). Stochastic Optimization for Large-Scale Optimal Transport. Advances in Neural Information Processing Systems (NIPS 2016), 29:3440-3448.
Peyré, Gabriel, & Marco Cuturi (2019). Computational Optimal Transport. Foundations and Trends in Machine Learning, 11(5-6):355-607.