OptimalTransport.jl Documentation

Exact optimal transport (Kantorovich) problem

OptimalTransport.jl reexports the following functions for exact, i.e., unregularized, optimal transport problems from ExactOptimalTransport.jl.

ExactOptimalTransport.emd — Function

emd(μ, ν, C, optimizer)

Compute the optimal transport plan γ for the Monge-Kantorovich problem with source histogram μ, target histogram ν, and cost matrix C of size (length(μ), length(ν)) which solves

\[\inf_{γ ∈ Π(μ, ν)} \langle γ, C \rangle.\]

The corresponding linear programming problem is solved with the user-provided optimizer. Possible choices are Tulip.Optimizer() and Clp.Optimizer() in the Tulip and Clp packages, respectively.

ExactOptimalTransport.emd2 — Function

emd2(μ, ν, C, optimizer; plan=nothing)

Compute the optimal transport cost (a scalar) for the Monge-Kantorovich problem with source histogram μ, target histogram ν, and cost matrix C of size (length(μ), length(ν)) which is given by

\[\inf_{γ ∈ Π(μ, ν)} \langle γ, C \rangle.\]

The corresponding linear programming problem is solved with the user-provided optimizer. Possible choices are Tulip.Optimizer() and Clp.Optimizer() in the Tulip and Clp packages, respectively.

A pre-computed optimal transport plan may be provided.

ExactOptimalTransport.ot_plan — Function

ot_plan(c, μ, ν; kwargs...)

Compute the optimal transport plan for the Monge-Kantorovich problem with source and target marginals μ and ν and cost c.

The optimal transport plan solves

\[\inf_{\gamma \in \Pi(\mu, \nu)} \int c(x, y) \, \mathrm{d}\gamma(x, y)\]

where $\Pi(\mu, \nu)$ denotes the couplings of $\mu$ and $\nu$.

See also: ot_cost

ExactOptimalTransport.ot_plan — Method

ot_plan(c, μ::ContinuousUnivariateDistribution, ν::UnivariateDistribution)

Compute the optimal transport plan for the Monge-Kantorovich problem with univariate distributions μ and ν as source and target marginals and cost function c of the form $c(x, y) = h(|x - y|)$ where $h$ is a convex function.

In this setting, the optimal transport plan is the Monge map

\[T = F_\nu^{-1} \circ F_\mu\]

where $F_\mu$ is the cumulative distribution function of μ and $F_\nu^{-1}$ is the quantile function of ν.

See also: ot_cost, emd

ExactOptimalTransport.ot_plan — Method

ot_plan(c, μ::DiscreteNonParametric, ν::DiscreteNonParametric)

Compute the optimal transport cost for the Monge-Kantorovich problem with univariate discrete distributions μ and ν as source and target marginals and cost function c of the form $c(x, y) = h(|x - y|)$ where $h$ is a convex function.

In this setting, the optimal transport plan can be computed analytically. It is returned as a sparse matrix.

See also: ot_cost, emd

ExactOptimalTransport.ot_plan — Method

ot_plan(::SqEuclidean, μ::Normal, ν::Normal)

Compute the optimal transport plan for the Monge-Kantorovich problem with normal distributions μ and ν as source and target marginals and cost function $c(x, y) = \|x - y\|_2^2$.

See also: ot_cost, emd

ExactOptimalTransport.ot_plan — Method

ot_plan(::SqEuclidean, μ::MvNormal, ν::MvNormal)

Compute the optimal transport plan for the Monge-Kantorovich problem with multivariate normal distributions μ and ν as source and target marginals and cost function $c(x, y) = \|x - y\|_2^2$.

In this setting, for $\mu = \mathcal{N}(m_\mu, \Sigma_\mu)$ and $\nu = \mathcal{N}(m_\nu, \Sigma_\nu)$, the optimal transport plan is the Monge map

\[T \colon x \mapsto m_\nu + \Sigma_\mu^{-1/2} {\big(\Sigma_\mu^{1/2} \Sigma_\nu \Sigma_\mu^{1/2}\big)}^{1/2}\Sigma_\mu^{-1/2} (x - m_\mu).\]

See also: ot_cost, emd

ExactOptimalTransport.ot_cost — Function

ot_cost(c, μ, ν; kwargs...)

Compute the optimal transport cost for the Monge-Kantorovich problem with source and target marginals μ and ν and cost c.

The optimal transport cost is the scalar value

\[\inf_{\gamma \in \Pi(\mu, \nu)} \int c(x, y) \, \mathrm{d}\gamma(x, y)\]

where $\Pi(\mu, \nu)$ denotes the couplings of $\mu$ and $\nu$.

See also: ot_plan

ExactOptimalTransport.ot_cost — Method

ot_cost(
    c, μ::ContinuousUnivariateDistribution, ν::UnivariateDistribution; plan=nothing
)

Compute the optimal transport cost for the Monge-Kantorovich problem with univariate distributions μ and ν as source and target marginals and cost function c of the form $c(x, y) = h(|x - y|)$ where $h$ is a convex function.

In this setting, the optimal transport cost can be computed as

\[\int_0^1 c(F_\mu^{-1}(x), F_\nu^{-1}(x)) \mathrm{d}x\]

where $F_\mu^{-1}$ and $F_\nu^{-1}$ are the quantile functions of μ and ν, respectively.

A pre-computed optimal transport plan may be provided.

See also: ot_plan, emd2

ExactOptimalTransport.ot_cost — Method

ot_cost(
    c, μ::DiscreteNonParametric, ν::DiscreteNonParametric; plan=nothing
)

Compute the optimal transport cost for the Monge-Kantorovich problem with discrete univariate distributions μ and ν as source and target marginals and cost function c of the form $c(x, y) = h(|x - y|)$ where $h$ is a convex function.

In this setting, the optimal transport cost can be computed analytically.

A pre-computed optimal transport plan may be provided.

See also: ot_plan, emd2

ExactOptimalTransport.ot_cost — Method

ot_cost(::SqEuclidean, μ::Normal, ν::Normal)

Compute the squared 2-Wasserstein distance between univariate normal distributions μ and ν as source and target marginals.

See also: ot_plan, emd2

ExactOptimalTransport.ot_cost — Method

ot_cost(::SqEuclidean, μ::MvNormal, ν::MvNormal)

Compute the squared 2-Wasserstein distance between normal distributions μ and ν as source and target marginals.

In this setting, the optimal transport cost can be computed as

\[W_2^2(\mu, \nu) = \|m_\mu - m_\nu \|^2 + \mathcal{B}(\Sigma_\mu, \Sigma_\nu)^2,\]

where $\mu = \mathcal{N}(m_\mu, \Sigma_\mu)$, $\nu = \mathcal{N}(m_\nu, \Sigma_\nu)$, and $\mathcal{B}$ is the Bures metric.

See also: ot_plan, emd2

ExactOptimalTransport.wasserstein — Function

wasserstein(μ, ν; metric=Euclidean(), p=Val(1), kwargs...)

Compute the p-Wasserstein distance with respect to the metric between measures μ and ν.

Order p can be provided as a scalar of type Real or as a parameter of a value type Val(p). For certain combinations of metric and p, such as metric=Euclidean() and p=Val(2), the computations are more efficient if p is specified as a value type. The remaining keyword arguments are forwarded to ot_cost.

See also: squared2wasserstein, ot_cost

ExactOptimalTransport.squared2wasserstein — Function

squared2wasserstein(μ, ν; metric=Euclidean(), kwargs...)

Compute the squared 2-Wasserstein distance with respect to the metric between measures μ and ν.

The remaining keyword arguments are forwarded to ot_cost.

See also: wasserstein, ot_cost

ExactOptimalTransport.discretemeasure — Function

discretemeasure(
    support::AbstractVector,
    probs::AbstractVector{<:Real}=FillArrays.Fill(inv(length(support)), length(support)),
)

Construct a finite discrete probability measure with support and corresponding probabilities. If the probability vector argument is not passed, then equal probability is assigned to each entry in the support.

Examples

using KernelFunctions
# rows correspond to samples
μ = discretemeasure(RowVecs(rand(7,3)), normalize!(rand(10),1))

# columns correspond to samples, each with equal probability
ν = discretemeasure(ColVecs(rand(3,12)))

Note

If support is a 1D vector, the constructed measure will be sorted, e.g. for mu = discretemeasure([3, 1, 2],[0.5, 0.2, 0.3]), then mu.support will be [1, 2, 3] and mu.p will be [0.2, 0.3, 0.5]. Also, avoid passing 1D distributions as RowVecs(rand(3)) or [[1],[3],[4]], since this will be dispatched to the multivariate case instead of the univariate case for which the algorithm is more efficient.

Warning

This function and in particular its return values are not stable and might be changed in future releases.

Entropically regularised optimal transport

OptimalTransport.sinkhorn — Function

sinkhorn(
    μ, ν, C, ε, alg=SinkhornGibbs();
    atol=0, rtol=atol > 0 ? 0 : √eps, check_convergence=10, maxiter=1_000,
)

Compute the optimal transport plan for the entropically regularized optimal transport problem with source and target marginals μ and ν, cost matrix C of size (length(μ), length(ν)), and entropic regularization parameter ε.

The optimal transport plan γ is of the same size as C and solves

\[\inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma),\]

where $\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log \gamma_{i,j}$ is the entropic regularization term.

Every check_convergence steps it is assessed if the algorithm is converged by checking if the iterate of the transport plan G satisfies

isapprox(sum(G; dims=2), μ; atol=atol, rtol=rtol, norm=x -> norm(x, 1))

The default rtol depends on the types of μ, ν, and C. After maxiter iterations, the computation is stopped.

Batch computations for multiple histograms with a common cost matrix C can be performed by passing μ or ν as matrices whose columns correspond to histograms. It is required that the number of source and target marginals is equal or that a single source or single target marginal is provided (either as matrix or as vector). The optimal transport plans are returned as three-dimensional array where γ[:, :, i] is the optimal transport plan for the ith pair of source and target marginals.

See also: sinkhorn2

OptimalTransport.sinkhorn2 — Function

sinkhorn2(
    μ, ν, C, ε, alg=SinkhornGibbs(); regularization=false, plan=nothing, kwargs...
)

Solve the entropically regularized optimal transport problem with source and target marginals μ and ν, cost matrix C of size (length(μ), length(ν)), and entropic regularization parameter ε, and return the optimal cost.

A pre-computed optimal transport plan may be provided. The other keyword arguments supported here are the same as those in the sinkhorn function.

Note

As the sinkhorn2 function in the Python Optimal Transport package, this function returns the optimal transport cost without the regularization term. The cost with the regularization term can be computed by setting regularization=true.

See also: sinkhorn

OptimalTransport.sinkhorn_divergence — Function

sinkhorn_divergence(
    μ::AbstractVecOrMat,
    ν::AbstractVecOrMat,
    C,
    ε,
    alg::SinkhornDivergence=SinkhornDivergence(
        SinkhornGibbs(), SymmetricSinkhornGibbs(), SymmetricSinkhornGibbs()
    );
    kwargs...,
)

Compute the Sinkhorn Divergence between finite discrete measures μ and ν with respect to a common cost matrix C, entropic regularization parameter ε and algorithm alg.

In the default case where regularization = false, the Sinkhorn Divergence is that of ^[GPC18] and is computed as

\[\operatorname{S}_{ε}(μ,ν) := \operatorname{W}_{ε}(μ,ν) - \frac{1}{2}(\operatorname{W}_{ε}(μ,μ) + \operatorname{W}_{ε}(ν,ν)),\]

and $\operatorname{W}_{ε}$ is defined as

\[\operatorname{W}_{ε}(μ, ν) = \langle C, γ^\star \rangle,\]

where $γ^\star$ is the entropy-regularised transport plan between μ and ν. For regularization = true, the Sinkhorn Divergence is that of ^[FeydyP19] and is computed as above where $\operatorname{W}_{ε}$ is replaced by $\operatorname{OT}_{ε}$, the entropy-regularised optimal transport cost with regulariser penalty.

The default algorithm for computing the term $\operatorname{W}_{ε}(μ, ν)$ is the SinkhornGibbs algorithm. For the terms $\operatorname{W}_{ε}(μ, μ)$ and $\operatorname{W}_{ε}(ν, ν)$, the symmetric fixed point iteration of ^[FeydyP19] is used. Alternatively, a pre-computed optimal transport plan between μ and ν may be provided.

See also: sinkhorn2

sinkhorn_divergence(
    μ,
    ν,
    Cμν,
    Cμ,
    Cν,
    ε,
    alg::SinkhornDivergence=SinkhornDivergence(
        SinkhornGibbs(), SymmetricSinkhornGibbs(), SymmetricSinkhornGibbs()
    );
    kwargs...,
)

Compute the Sinkhorn Divergence between finite discrete measures μ and ν with respect to the precomputed cost matrices Cμν, Cμμ and Cνν, entropic regularization parameter ε and algorithm alg.

A pre-computed optimal transport plan between μ and ν may be provided.

See also: sinkhorn2, sinkhorn_divergence

OptimalTransport.sinkhorn_barycenter — Function

sinkhorn_barycenter(μ, C, ε, w, alg = SinkhornGibbs(); kwargs...)

Compute the Sinkhorn barycenter for a collection of N histograms contained in the columns of μ, for a cost matrix C of size (size(μ, 1), size(μ, 1)), relative weights w of size N, and entropic regularisation parameter ε. Returns the entropically regularised barycenter of the μ, i.e. the histogram ρ of length size(μ, 1) that solves

\[\min_{\rho \in \Sigma} \sum_{i = 1}^N w_i \operatorname{OT}_{\varepsilon}(\mu_i, \rho)\]

where $\operatorname{OT}_{ε}(\mu, \nu) = \inf_{\gamma \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma)$ is the entropic optimal transport loss with cost $C$ and regularisation $\epsilon$.

Currently the following variants of the Sinkhorn algorithm are supported:

OptimalTransport.SinkhornGibbs — Type

SinkhornGibbs()

Vanilla Sinkhorn algorithm.

OptimalTransport.SinkhornStabilized — Type

SinkhornStabilized(; absorb_tol::Real=1_000)

Construct a log-domain stabilized Sinkhorn algorithm with absorption tolerance absorb_tol for solving an entropically regularized optimal transport problem.

References

Schmitzer, B. (2019). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. SIAM Journal on Scientific Computing, 41(3), A1443–A1481.

OptimalTransport.SinkhornEpsilonScaling — Type

SinkhornEpsilonScaling(algorithm::Sinkhorn; factor=1//2, steps=5)

Construct an ε-scaling Sinkhorn algorithm for solving an entropically regularized optimal transport problem.

The function uses the specified Sinkhorn algorithm with steps ε-scaling steps with scaling factor factor. It sequentially solves the entropically regularized optimal transport with regularization parameters

\[\varepsilon_i := \varepsilon \lambda^{i-k}, \qquad (i = 1,\ldots,k),\]

where $\lambda$ is the scaling factor and $k$ the number of scaling steps.

The following methods are deprecated and will be removed:

OptimalTransport.sinkhorn_stabilized — Function

sinkhorn_stabilized(μ, ν, C, ε; absorb_tol=1_000, kwargs...)

This method is deprecated, please use

sinkhorn(
    μ, ν, C, ε, SinkhornStabilized(; absorb_tol=absorb_tol); kwargs...
)

instead.

See also: sinkhorn, SinkhornStabilized

OptimalTransport.sinkhorn_stabilized_epsscaling — Function

sinkhorn_stabilized_epsscaling(
    μ, ν, C, ε;
    scaling_factor=1//2, scaling_steps=5, absorb_tol=1_000, kwargs...
)

This method is deprecated, please use

sinkhorn(
    μ,
    ν,
    C,
    ε,
    SinkhornEpsilonScaling(
        SinkhornStabilized(; absorb_tol=absorb_tol);
        factor=scaling_factor,
        steps=scaling_steps,
    );
    kwargs...,
)

instead.

See also: sinkhorn, SinkhornEpsilonScaling

Unbalanced optimal transport

OptimalTransport.sinkhorn_unbalanced — Function

sinkhorn_unbalanced(μ, ν, C, λ1::Real, λ2::Real, ε; kwargs...)

Compute the optimal transport plan for the unbalanced entropically regularized optimal transport problem with source and target marginals μ and ν, cost matrix C of size (length(μ), length(ν)), entropic regularization parameter ε, and marginal relaxation terms λ1 and λ2.

The optimal transport plan γ is of the same size as C and solves

\[\inf_{\gamma} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma) + \lambda_1 \operatorname{KL}(\gamma 1 | \mu) + \lambda_2 \operatorname{KL}(\gamma^{\mathsf{T}} 1 | \nu),\]

where $\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log \gamma_{i,j}$ is the entropic regularization term and $\operatorname{KL}$ is the Kullback-Leibler divergence.

The keyword arguments supported here are the same as those in the sinkhorn_unbalanced for unbalanced optimal transport problems with general soft marginal constraints.

sinkhorn_unbalanced(
    μ, ν, C, proxdivF1!, proxdivF2!, ε;
    atol=0, rtol=atol > 0 ? 0 : √eps, check_convergence=10, maxiter=1_000,
)

Compute the optimal transport plan for the unbalanced entropically regularized optimal transport problem with source and target marginals μ and ν, cost matrix C of size (length(μ), length(ν)), entropic regularization parameter ε, and soft marginal constraints $F_1$ and $F_2$ with "proxdiv" functions proxdivF! and proxdivG!.

The optimal transport plan γ is of the same size as C and solves

\[\inf_{\gamma} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma) + F_1(\gamma 1, \mu) + F_2(\gamma^{\mathsf{T}} 1, \nu),\]

where $\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log \gamma_{i,j}$ is the entropic regularization term and $F_1(\cdot, \mu)$ and $F_2(\cdot, \nu)$ are soft marginal constraints for the source and target marginals.

The functions proxdivF1!(s, p, ε) and proxdivF2!(s, p, ε) evaluate the "proxdiv" functions of $F_1(\cdot, p)$ and $F_2(\cdot, p)$ at $s$ for the entropic regularization parameter $\varepsilon$. They have to be mutating and overwrite the first argument s with the result of their computations.

Mathematically, the "proxdiv" functions are defined as

\[\operatorname{proxdiv}_{F_i}(s, p, \varepsilon) = \operatorname{prox}^{\operatorname{KL}}_{F_i(\cdot, p)/\varepsilon}(s) \oslash s\]

where $\oslash$ denotes element-wise division and $\operatorname{prox}_{F_i(\cdot, p)/\varepsilon}^{\operatorname{KL}}$ is the proximal operator of $F_i(\cdot, p)/\varepsilon$ for the Kullback-Leibler ($\operatorname{KL}$) divergence. It is defined as

\[\operatorname{prox}_{F}^{\operatorname{KL}}(x) = \operatorname{argmin}_{y} F(y) + \operatorname{KL}(y|x)\]

and can be computed in closed-form for specific choices of $F$. For instance, if $F(\cdot, p) = \lambda \operatorname{KL}(\cdot | p)$ ($\lambda > 0$), then

\[\operatorname{prox}_{F(\cdot, p)/\varepsilon}^{\operatorname{KL}}(x) = x^{\frac{\varepsilon}{\varepsilon + \lambda}} p^{\frac{\lambda}{\varepsilon + \lambda}},\]

where all operators are acting pointwise.^[CPSV18]

Every check_convergence steps it is assessed if the algorithm is converged by checking if the iterates of the scaling factor in the current and previous iteration satisfy isapprox(vcat(a, b), vcat(aprev, bprev); atol=atol, rtol=rtol) where a and b are the current iterates and aprev and bprev the previous ones. The default rtol depends on the types of μ, ν, and C. After maxiter iterations, the computation is stopped.

See also: sinkhorn_unbalanced2

function sinkhorn_unbalanced(
        μ, C, proxdivF!, ε; atol = nothing, rtol = nothing, maxiter::Int = 1_000, check_convergence::Int=10
)

Specialised case of [`sinkhorn_unbalanced`](@ref) to the special symmetric case where both inputs `μ, ν` are identical and the cost `C` is symmetric.
This implementation takes advantage of additional structure in the symmetric case which allows for a fixed point iteration with much faster convergence,
similar to that described by [^FeydyP19] and also employed in [`sinkhorn_divergence`](@ref) for the balanced case.

[^FeydyP19]: Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, and Gabriel Peyré. Interpolating between optimal transport and mmd using sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2681–2690. PMLR, 2019.

OptimalTransport.sinkhorn_unbalanced2 — Function

sinkhorn_unbalanced2(μ, ν, C, λ1, λ2, ε; plan=nothing, kwargs...)
sinkhorn_unbalanced2(μ, ν, C, proxdivF1!, proxdivF2!, ε; plan=nothing, kwargs...)

Compute the optimal transport plan for the unbalanced entropically regularized optimal transport problem with source and target marginals μ and ν, cost matrix C of size (length(μ), length(ν)), entropic regularization parameter ε, and marginal relaxation terms λ1 and λ2 or soft marginal constraints with "proxdiv" functions proxdivF1! and proxdivF2!.

A pre-computed optimal transport plan may be provided. The other keyword arguments supported here are the same as those in the sinkhorn_unbalanced for unbalanced optimal transport problems with general soft marginal constraints.

See also: sinkhorn_unbalanced

OptimalTransport.sinkhorn_divergence_unbalanced — Function

sinkhorn_divergence_unbalanced(μ, ν, cμν, cμ, cν, λ, ε; kwargs...)

Compute the unbalanced Sinkhorn divergence between unnormalized inputs μ and ν with cost matrix cμν, cμ and cν between (μ,ν), (μ, μ) and (ν, ν) respectively, regularization level ε and marginal constraint parameter λ. Following ^[SFVTP19], the unbalanced Sinkhorn divergence is defined as

\[ \operatorname{S}_{\varepsilon, \lambda} (\mu, \nu) := \operatorname{OT}_{ε, λ}(μ,ν) - \frac{1}{2}(\operatorname{OT}_{ε, λ}(μ,μ) + \operatorname{OT}_{ε, λ}(ν,ν)) + \frac{ε}{2}(m(μ) + m(ν))^2,\]

where $\operatorname{OT}_{ε, λ}(\alpha, \beta)$ is defined to be

\[ \operatorname{OT}_{ε, λ}(\alpha, \beta) = \inf_{\gamma} \langle C, \gamma \rangle + \varepsilon \operatorname{KL}(\gamma | \alpha \otimes \beta) + \lambda ( \operatorname{KL}(\gamma_1 | \alpha) + \operatorname{KL}(\gamma_2 | \beta) ),\]

i.e. the output of calling sinkhorn_unbalanced2 with the default Kullback-Leibler marginal penalties.

Quadratically regularised optimal transport

OptimalTransport.quadreg — Function

quadreg(μ, ν, C, ε, alg::QuadraticOT; kwargs...)

Computes the optimal transport plan of histograms μ and ν with cost matrix C and quadratic regularization parameter ε.

The optimal transport plan γ is of the same size as C and solves

\[\inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma),\]

where $\Omega(\gamma) = \frac{1}{2} \sum_{i,j} \gamma_{i,j}^2$ is the quadratic regularization term.

Every check_convergence steps it is assessed if the algorithm is converged by checking if the iterate of the transport plan γ satisfies

    norm_diff < max(atol, rtol * max(norm(μ, Inf), norm(ν, Inf)))

where

\[ \text{normdiff} = \max\{ \| \gamma \mathbf{1} - \mu \|_\infty , \| \gamma^\top \mathbf{1} - \nu \|_\infty \} . \]

After maxiter iterations, the computation is stopped.

Note that unlike in the case of Sinkhorn's algorithm for the entropic regularisation, batch computation of optimal transport is not supported for the quadratic regularisation.

See also: sinkhorn, QuadraticOTNewton

Currently the following algorithms for solving quadratically regularised optimal transport are supported:

OptimalTransport.QuadraticOTNewton — Type

QuadraticOTNewton

Semi-smooth Newton method (Algorithm 2 of Lorenz et al. 2019 ^[LMM19]) for solving quadratically regularised optimal transport

See also: QuadraticOTNewton, quadreg

Dual

OptimalTransport.Dual.ot_entropic_semidual — Function

ot_entropic_semidual(μ, v, eps, K)

Computes the semidual (in the second argument) of the entropic optimal transport loss, with source marginal μ, regularization parameter ε, and Gibbs kernel K.

That is,

\[ \operatorname{OT}_{\varepsilon}(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma)\]

with $\Omega(\gamma) = \sum_{i,j} \gamma_{ij} \log \gamma_{ij}$, then the semidual in the second argument ν is ^[Z21]

\[\begin{aligned} \operatorname{OT}_{\varepsilon}^*(\mu, v) &= \sup_{\nu} \langle v, \nu \rangle - \operatorname{OT}_{\varepsilon}(\mu, \nu) \ \ &= -\varepsilon \left\langle \mu, \log\left( \dfrac{\mu}{K e^{v/\varepsilon}} \right) - 1\right\rangle. \end{aligned}\]

Notably, the semidual is computationally advantageous for solving variational problems since it is a smooth and unconstrained function of v since it admits a closed form gradient. See ^[CP16] for a detailed discussion of dual methods for variational problems in optimal transport.

OptimalTransport.Dual.ot_entropic_semidual_grad — Function

ot_entropic_semidual_grad(μ, v, eps, K)

Computes the gradient with respect to v of the semidual of the entropic optimal transport loss. That is,

\[\nabla_v \operatorname{OT}^*_{\varepsilon}(\mu, v) = K^\top \left( \dfrac{\mu}{K e^{v/\varepsilon}} \right) \odot e^{v/\varepsilon}.\]

See also: ot_entropic_semidual

OptimalTransport.Dual.getprimal_ot_entropic_semidual — Function

getprimal_ot_entropic_semidual(μ, v, eps, K)

Computes the the primal variable ν corresponding to the dual variable v at optimality. That is,

\[\nu^\star = e^{v^\star/\varepsilon} \odot K^\top \dfrac{\mu}{K e^{v^\star/\varepsilon}}. \]

See also: ot_entropic_semidual

OptimalTransport.Dual.ot_entropic_dual — Function

ot_entropic_dual(u, v, eps, K)

Computes the dual in both arguments of entropic optimal transport loss, where u and v are the dual variables associated with the source and target marginals respectively.

That is,

\[ \begin{aligned} \operatorname{OT}_{\varepsilon}^*(u, v) &= \sup_{\mu, \nu} \langle u, \mu \rangle + \langle v, \nu \rangle - \operatorname{OT}_\varepsilon(\mu, \nu) \ \ &= \varepsilon \log \langle e^{u/\varepsilon}, K e^{v/\varepsilon} \rangle. \end{aligned}\]

OptimalTransport.Dual.ot_entropic_dual_grad — Function

ot_entropic_dual_grad(u, v, eps, K)

Computes the gradient with respect to u and v of the dual of the entropic optimal transport loss. That is,

\[\begin{aligned} \nabla_u \operatorname{OT}^*_{\varepsilon}(u, v) &= \dfrac{e^{u/\varepsilon} \odot K e^{v/\varepsilon}}{\langle e^{u/\varepsilon}, K e^{v/\varepsilon} \rangle} \ \ \nabla_v \operatorname{OT}^*_{\varepsilon}(u, v) &= \dfrac{e^{v/\varepsilon} \odot K^\top e^{u/\varepsilon}}{\langle e^{v/\varepsilon}, K^\top e^{u/\varepsilon} \rangle}. \end{aligned}\]

See also: ot_entropic_dual

OptimalTransport.Dual.getprimal_ot_entropic_dual — Function

getprimal_ot_entropic_dual(u, v, eps, K)

Computes the the primal variable γ corresponding to the dual variable u, v at optimality. That is,

\[ \gamma = \operatorname{softmax}(\mathrm{diag}(e^{u/\varepsilon}) K \mathrm{diag}(e^{v/\varepsilon}))\]

See also: ot_entropic_dual

GPC18Aude Genevay, Gabriel Peyré, Marco Cuturi, Learning Generative Models with Sinkhorn Divergences, Proceedings of the Twenty-First International Conference on Artficial Intelligence and Statistics, (AISTATS) 21, 2018
FeydyP19Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, and Gabriel Peyré. Interpolating between optimal transport and mmd using sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2681–2690. PMLR, 2019.
CPSV18Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F.-X. (2018). Scaling algorithms for unbalanced optimal transport problems. Mathematics of Computation, 87(314), 2563–2609.
SFVTP19Séjourné, T., Feydy, J., Vialard, F.X., Trouvé, A. and Peyré, G., 2019. Sinkhorn divergences for unbalanced optimal transport. arXiv preprint arXiv:1910.12958.
LMM19Lorenz, Dirk A., Paul Manns, and Christian Meyer. Quadratically regularized optimal transport. Applied Mathematics & Optimization 83.3 (2021): 1919-1949.
CP16Cuturi, Marco, and Gabriel Peyré. "A smoothed dual approach for variational Wasserstein problems." SIAM Journal on Imaging Sciences 9.1 (2016): 320-343.
Z21Zhang, Stephen Y. “A Unified Framework for Non-Negative Matrix and Tensor Factorisations with a Smoothed Wasserstein Loss.” ArXiv: Machine Learning, 2021.