StochasticOptimalTransport

- `StochasticOptimalTransport.DiscreteMeasure`
- `StochasticOptimalTransport.ctransform`
- `StochasticOptimalTransport.wasserstein`
StochasticOptimalTransport.DiscreteMeasure — Method

`DiscreteMeasure(xs::AbstractVector, ps::AbstractVector)`

Construct a discrete measure with support `xs` and corresponding weights `ps`.
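For illustration, a minimal construction (the module qualification is an assumption in case `DiscreteMeasure` is not exported):

```julia
using StochasticOptimalTransport

# a discrete measure supported on three points, with weights summing to 1
xs = [0.0, 1.0, 2.5]
ps = [0.2, 0.5, 0.3]
μ = StochasticOptimalTransport.DiscreteMeasure(xs, ps)
```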
StochasticOptimalTransport.ctransform — Method

`ctransform(c, v, x, ν, ε)`
Compute the c-transform
\[v^{c,ε}(x) = \begin{cases} - ε \log\bigg(\int \exp{\Big(\frac{v_y - c(x, y)}{ε}\Big)} \, ν(\mathrm{d}y)\bigg) & \text{if } ε > 0,\\ \min_{y} c(x, y) - v_y & \text{otherwise}. \end{cases}\]
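A minimal sketch of this formula for a discrete `ν` (the helper name `ctransform_sketch`, and direct access to the support `ys` and weights `ps`, are assumptions for illustration; this is not the package's internal method):

```julia
# Hypothetical helper illustrating the c-transform above for a discrete ν.
# `ys` and `ps` are the support and weights of ν, `v` the dual potentials on `ys`.
function ctransform_sketch(c, v, x, ys, ps, ε)
    if ε === nothing || !(ε > 0)
        # unregularized case: hard minimum over the support of ν
        return minimum(c(x, y) - vy for (y, vy) in zip(ys, v))
    else
        # entropic case (ε > 0): soft minimum via a log-sum-exp against ν
        return -ε * log(sum(p * exp((vy - c(x, y)) / ε) for (y, vy, p) in zip(ys, v, ps)))
    end
end

# example call with an absolute-value cost
ctransform_sketch((x, y) -> abs(x - y), [0.1, 0.2], 0.5, [0.0, 1.0], [0.5, 0.5], 0.1)
```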
StochasticOptimalTransport.wasserstein — Method

`wasserstein([rng, ]c, μ, ν[, ε; kwargs...])`

Estimate the (entropic regularization of the) Wasserstein distance
\[W_{ε}(μ, ν) = \min_{π ∈ Π(μ,ν)} \int c(x, y) \,π(\mathrm{d}(x,y)) + ε \mathrm{KL}(π \,|\, μ ⊗ ν)\]
with respect to the cost function `c` using stochastic optimization.
If both measures `μ` and `ν` are `DiscreteMeasure`s, then the Wasserstein distance is approximated with stochastic averaged gradient descent (SAG). The convergence rate of SAG is $O(1 / k)$, where $k$ is the number of iterations, and hence it converges faster than stochastic gradient descent (SGD) at the price of increased memory consumption.
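A minimal usage sketch of this discrete-discrete case (the cost function and measures are illustrative, and the module qualification assumes the names are not exported):

```julia
using StochasticOptimalTransport
using Random

c(x, y) = abs2(x - y)  # illustrative quadratic cost

μ = StochasticOptimalTransport.DiscreteMeasure([0.0, 1.0], [0.5, 0.5])
ν = StochasticOptimalTransport.DiscreteMeasure([0.5, 1.5, 2.0], [0.3, 0.3, 0.4])

# both measures are discrete, so the SAG algorithm is used
w = StochasticOptimalTransport.wasserstein(Random.default_rng(), c, μ, ν)
```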
If only one of the measures `μ` and `ν` is a `DiscreteMeasure`, then the Wasserstein distance is approximated with stochastic gradient descent with averaging (SGA). In this case, it is required that samples of the non-discrete measure `λ` can be obtained with `rand(rng, λ)`. The convergence rate of SGA is $O(1/\sqrt{k})$, where $k$ is the number of iterations.
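For the semi-discrete case, any object supporting `rand(rng, λ)` can serve as the non-discrete measure; this sketch assumes, for illustration, a `Normal` from Distributions.jl, which satisfies that interface:

```julia
using StochasticOptimalTransport
using Distributions, Random

c(x, y) = abs2(x - y)

μ = Normal(0.0, 1.0)  # samples obtained via rand(rng, μ)
ν = StochasticOptimalTransport.DiscreteMeasure([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25])

# only one measure is discrete, so the SGA algorithm is used
w = StochasticOptimalTransport.wasserstein(Random.default_rng(), c, μ, ν)
```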
If `ε` is `nothing` (the default), then the unregularized Wasserstein distance is approximated. Otherwise, the entropic regularization with `ε > 0` is estimated.
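Continuing the sketch above, the two variants differ only in whether `ε` is passed:

```julia
# unregularized Wasserstein distance (ε = nothing by default)
w0 = StochasticOptimalTransport.wasserstein(c, μ, ν)

# entropic regularization with ε = 0.05 (illustrative value)
wε = StochasticOptimalTransport.wasserstein(c, μ, ν, 0.05)
```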
The SAG algorithm uses a constant step size, whereas the SGA algorithm uses the step size schedule
\[τ_i = \frac{τ_1}{1 + \sqrt{(i - 1) / w}}\]
for the $i$th iteration, where $τ_1$ corresponds to the initial step size and $w$ indicates the number of iterations serving as a warm-up phase.
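The schedule is straightforward to state in code (a sketch; `sga_stepsize` is a hypothetical name, not part of the package):

```julia
# SGA step size schedule: τ₁ is the initial step size, w the warm-up length
sga_stepsize(i, τ₁, w) = τ₁ / (1 + sqrt((i - 1) / w))

sga_stepsize(1, 1.0, 1)    # 1.0: the warm-up keeps early steps large
sga_stepsize(101, 1.0, 1)  # ≈ 0.091: steps decay like 1/√i afterwards
```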
Keyword arguments

- `maxiters::Int = 10_000`: maximum number of gradient steps
- `stepsize = 1`: constant step size of SAG, or initial step size $τ_1$ of SGA
- `warmup_phase = 1`: warm-up phase $w$ of SGA
- `atol = 0`: absolute tolerance
- `rtol = iszero(atol) ? typeof(float(atol))(1 // 10_000) : 0`: relative tolerance
- `montecarlo_samples = 10_000`: number of Monte Carlo samples from `μ` for approximating an expectation with respect to `μ` in the semi-discrete optimal transport problem
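For example, a call overriding several of these defaults might look as follows (all values are illustrative):

```julia
StochasticOptimalTransport.wasserstein(
    c, μ, ν, 0.1;
    maxiters = 50_000,
    stepsize = 0.5,
    atol = 1e-6,
    montecarlo_samples = 100_000,
)
```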
References
Genevay, A., Cuturi, M., Peyré, G., & Bach, F. (2016). Stochastic Optimization for Large-Scale Optimal Transport. Advances in Neural Information Processing Systems (NIPS 2016), 29:3440-3448.
Peyré, G., & Cuturi, M. (2019). Computational Optimal Transport. Foundations and Trends in Machine Learning, 11(5-6):355-607.