# OptimalTransport.jl Documentation

## Exact optimal transport (Kantorovich) problem

OptimalTransport.jl reexports the following functions for exact, i.e., unregularized, optimal transport problems from ExactOptimalTransport.jl.

`ExactOptimalTransport.emd`

— Function`emd(μ, ν, C, optimizer)`

Compute the optimal transport plan `γ`

for the Monge-Kantorovich problem with source histogram `μ`

, target histogram `ν`

, and cost matrix `C`

of size `(length(μ), length(ν))`

which solves

\[\inf_{γ ∈ Π(μ, ν)} \langle γ, C \rangle.\]

The corresponding linear programming problem is solved with the user-provided `optimizer`

. Possible choices are `Tulip.Optimizer()`

and `Clp.Optimizer()`

in the `Tulip`

and `Clp`

packages, respectively.

`ExactOptimalTransport.emd2`

— Function`emd2(μ, ν, C, optimizer; plan=nothing)`

Compute the optimal transport cost (a scalar) for the Monge-Kantorovich problem with source histogram `μ`

, target histogram `ν`

, and cost matrix `C`

of size `(length(μ), length(ν))`

which is given by

\[\inf_{γ ∈ Π(μ, ν)} \langle γ, C \rangle.\]

The corresponding linear programming problem is solved with the user-provided `optimizer`

. Possible choices are `Tulip.Optimizer()`

and `Clp.Optimizer()`

in the `Tulip`

and `Clp`

packages, respectively.

A pre-computed optimal transport `plan`

may be provided.

`ExactOptimalTransport.ot_plan`

— Function`ot_plan(c, μ, ν; kwargs...)`

Compute the optimal transport plan for the Monge-Kantorovich problem with source and target marginals `μ`

and `ν`

and cost `c`

.

The optimal transport plan solves

\[\inf_{\gamma \in \Pi(\mu, \nu)} \int c(x, y) \, \mathrm{d}\gamma(x, y)\]

where $\Pi(\mu, \nu)$ denotes the couplings of $\mu$ and $\nu$.

See also: `ot_cost`

`ExactOptimalTransport.ot_plan`

— Method`ot_plan(c, μ::ContinuousUnivariateDistribution, ν::UnivariateDistribution)`

Compute the optimal transport plan for the Monge-Kantorovich problem with univariate distributions `μ`

and `ν`

as source and target marginals and cost function `c`

of the form $c(x, y) = h(|x - y|)$ where $h$ is a convex function.

In this setting, the optimal transport plan is the Monge map

\[T = F_\nu^{-1} \circ F_\mu\]

where $F_\mu$ is the cumulative distribution function of `μ`

and $F_\nu^{-1}$ is the quantile function of `ν`

.

`ExactOptimalTransport.ot_plan`

— Method`ot_plan(c, μ::DiscreteNonParametric, ν::DiscreteNonParametric)`

Compute the optimal transport cost for the Monge-Kantorovich problem with univariate discrete distributions `μ`

and `ν`

as source and target marginals and cost function `c`

of the form $c(x, y) = h(|x - y|)$ where $h$ is a convex function.

In this setting, the optimal transport plan can be computed analytically. It is returned as a sparse matrix.

`ExactOptimalTransport.ot_plan`

— Method`ExactOptimalTransport.ot_plan`

— Method`ot_plan(::SqEuclidean, μ::MvNormal, ν::MvNormal)`

Compute the optimal transport plan for the Monge-Kantorovich problem with multivariate normal distributions `μ`

and `ν`

as source and target marginals and cost function $c(x, y) = \|x - y\|_2^2$.

In this setting, for $\mu = \mathcal{N}(m_\mu, \Sigma_\mu)$ and $\nu = \mathcal{N}(m_\nu, \Sigma_\nu)$, the optimal transport plan is the Monge map

\[T \colon x \mapsto m_\nu + \Sigma_\mu^{-1/2} {\big(\Sigma_\mu^{1/2} \Sigma_\nu \Sigma_\mu^{1/2}\big)}^{1/2}\Sigma_\mu^{-1/2} (x - m_\mu).\]

`ExactOptimalTransport.ot_cost`

— Function`ot_cost(c, μ, ν; kwargs...)`

Compute the optimal transport cost for the Monge-Kantorovich problem with source and target marginals `μ`

and `ν`

and cost `c`

.

The optimal transport cost is the scalar value

\[\inf_{\gamma \in \Pi(\mu, \nu)} \int c(x, y) \, \mathrm{d}\gamma(x, y)\]

where $\Pi(\mu, \nu)$ denotes the couplings of $\mu$ and $\nu$.

See also: `ot_plan`

`ExactOptimalTransport.ot_cost`

— Method```
ot_cost(
c, μ::ContinuousUnivariateDistribution, ν::UnivariateDistribution; plan=nothing
)
```

Compute the optimal transport cost for the Monge-Kantorovich problem with univariate distributions `μ`

and `ν`

as source and target marginals and cost function `c`

of the form $c(x, y) = h(|x - y|)$ where $h$ is a convex function.

In this setting, the optimal transport cost can be computed as

\[\int_0^1 c(F_\mu^{-1}(x), F_\nu^{-1}(x)) \mathrm{d}x\]

where $F_\mu^{-1}$ and $F_\nu^{-1}$ are the quantile functions of `μ`

and `ν`

, respectively.

A pre-computed optimal transport `plan`

may be provided.

`ExactOptimalTransport.ot_cost`

— Method```
ot_cost(
c, μ::DiscreteNonParametric, ν::DiscreteNonParametric; plan=nothing
)
```

Compute the optimal transport cost for the Monge-Kantorovich problem with discrete univariate distributions `μ`

and `ν`

as source and target marginals and cost function `c`

of the form $c(x, y) = h(|x - y|)$ where $h$ is a convex function.

In this setting, the optimal transport cost can be computed analytically.

A pre-computed optimal transport `plan`

may be provided.

`ExactOptimalTransport.ot_cost`

— Method`ExactOptimalTransport.ot_cost`

— Method`ot_cost(::SqEuclidean, μ::MvNormal, ν::MvNormal)`

Compute the squared 2-Wasserstein distance between normal distributions `μ`

and `ν`

as source and target marginals.

In this setting, the optimal transport cost can be computed as

\[W_2^2(\mu, \nu) = \|m_\mu - m_\nu \|^2 + \mathcal{B}(\Sigma_\mu, \Sigma_\nu)^2,\]

where $\mu = \mathcal{N}(m_\mu, \Sigma_\mu)$, $\nu = \mathcal{N}(m_\nu, \Sigma_\nu)$, and $\mathcal{B}$ is the Bures metric.

`ExactOptimalTransport.wasserstein`

— Function`wasserstein(μ, ν; metric=Euclidean(), p=Val(1), kwargs...)`

Compute the `p`

-Wasserstein distance with respect to the `metric`

between measures `μ`

and `ν`

.

Order `p`

can be provided as a scalar of type `Real`

or as a parameter of a value type `Val(p)`

. For certain combinations of `metric`

and `p`

, such as `metric=Euclidean()`

and `p=Val(2)`

, the computations are more efficient if `p`

is specified as a value type. The remaining keyword arguments are forwarded to `ot_cost`

.

See also: `squared2wasserstein`

, `ot_cost`

`ExactOptimalTransport.squared2wasserstein`

— Function`squared2wasserstein(μ, ν; metric=Euclidean(), kwargs...)`

Compute the squared 2-Wasserstein distance with respect to the `metric`

between measures `μ`

and `ν`

.

The remaining keyword arguments are forwarded to `ot_cost`

.

See also: `wasserstein`

, `ot_cost`

`ExactOptimalTransport.discretemeasure`

— Function```
discretemeasure(
support::AbstractVector,
probs::AbstractVector{<:Real}=FillArrays.Fill(inv(length(support)), length(support)),
)
```

Construct a finite discrete probability measure with `support`

and corresponding `probabilities`

. If the probability vector argument is not passed, then equal probability is assigned to each entry in the support.

**Examples**

```
using KernelFunctions
# rows correspond to samples
μ = discretemeasure(RowVecs(rand(7,3)), normalize!(rand(10),1))
# columns correspond to samples, each with equal probability
ν = discretemeasure(ColVecs(rand(3,12)))
```

If `support`

is a 1D vector, the constructed measure will be sorted, e.g. for `mu = discretemeasure([3, 1, 2],[0.5, 0.2, 0.3])`

, then `mu.support`

will be `[1, 2, 3]`

and `mu.p`

will be `[0.2, 0.3, 0.5]`

. Also, avoid passing 1D distributions as `RowVecs(rand(3))`

or `[[1],[3],[4]]`

, since this will be dispatched to the multivariate case instead of the univariate case for which the algorithm is more efficient.

This function and in particular its return values are not stable and might be changed in future releases.

## Entropically regularised optimal transport

`OptimalTransport.sinkhorn`

— Function```
sinkhorn(
μ, ν, C, ε, alg=SinkhornGibbs();
atol=0, rtol=atol > 0 ? 0 : √eps, check_convergence=10, maxiter=1_000,
)
```

Compute the optimal transport plan for the entropically regularized optimal transport problem with source and target marginals `μ`

and `ν`

, cost matrix `C`

of size `(length(μ), length(ν))`

, and entropic regularization parameter `ε`

.

The optimal transport plan `γ`

is of the same size as `C`

and solves

\[\inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma),\]

where $\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log \gamma_{i,j}$ is the entropic regularization term.

Every `check_convergence`

steps it is assessed if the algorithm is converged by checking if the iterate of the transport plan `G`

satisfies

`isapprox(sum(G; dims=2), μ; atol=atol, rtol=rtol, norm=x -> norm(x, 1))`

The default `rtol`

depends on the types of `μ`

, `ν`

, and `C`

. After `maxiter`

iterations, the computation is stopped.

Batch computations for multiple histograms with a common cost matrix `C`

can be performed by passing `μ`

or `ν`

as matrices whose columns correspond to histograms. It is required that the number of source and target marginals is equal or that a single source or single target marginal is provided (either as matrix or as vector). The optimal transport plans are returned as three-dimensional array where `γ[:, :, i]`

is the optimal transport plan for the `i`

th pair of source and target marginals.

See also: `sinkhorn2`

`OptimalTransport.sinkhorn2`

— Function```
sinkhorn2(
μ, ν, C, ε, alg=SinkhornGibbs(); regularization=false, plan=nothing, kwargs...
)
```

Solve the entropically regularized optimal transport problem with source and target marginals `μ`

and `ν`

, cost matrix `C`

of size `(length(μ), length(ν))`

, and entropic regularization parameter `ε`

, and return the optimal cost.

A pre-computed optimal transport `plan`

may be provided. The other keyword arguments supported here are the same as those in the `sinkhorn`

function.

As the `sinkhorn2`

function in the Python Optimal Transport package, this function returns the optimal transport cost without the regularization term. The cost with the regularization term can be computed by setting `regularization=true`

.

See also: `sinkhorn`

`OptimalTransport.sinkhorn_divergence`

— Function```
sinkhorn_divergence(
μ::AbstractVecOrMat,
ν::AbstractVecOrMat,
C,
ε,
alg::SinkhornDivergence=SinkhornDivergence(
SinkhornGibbs(), SymmetricSinkhornGibbs(), SymmetricSinkhornGibbs()
);
kwargs...,
)
```

Compute the Sinkhorn Divergence between finite discrete measures `μ`

and `ν`

with respect to a common cost matrix `C`

, entropic regularization parameter `ε`

and algorithm `alg`

.

In the default case where `regularization = false`

, the Sinkhorn Divergence is that of ^{[GPC18]} and is computed as

\[\operatorname{S}_{ε}(μ,ν) := \operatorname{W}_{ε}(μ,ν) - \frac{1}{2}(\operatorname{W}_{ε}(μ,μ) + \operatorname{W}_{ε}(ν,ν)),\]

and $\operatorname{W}_{ε}$ is defined as

\[\operatorname{W}_{ε}(μ, ν) = \langle C, γ^\star \rangle,\]

where $γ^\star$ is the entropy-regularised transport plan between `μ`

and `ν`

. For `regularization = true`

, the Sinkhorn Divergence is that of ^{[FeydyP19]} and is computed as above where $\operatorname{W}_{ε}$ is replaced by $\operatorname{OT}_{ε}$, the entropy-regularised optimal transport cost with regulariser penalty.

The default algorithm for computing the term $\operatorname{W}_{ε}(μ, ν)$ is the `SinkhornGibbs`

algorithm. For the terms $\operatorname{W}_{ε}(μ, μ)$ and $\operatorname{W}_{ε}(ν, ν)$, the symmetric fixed point iteration of ^{[FeydyP19]} is used. Alternatively, a pre-computed optimal transport `plan`

between `μ`

and `ν`

may be provided.

See also: `sinkhorn2`

```
sinkhorn_divergence(
μ,
ν,
Cμν,
Cμ,
Cν,
ε,
alg::SinkhornDivergence=SinkhornDivergence(
SinkhornGibbs(), SymmetricSinkhornGibbs(), SymmetricSinkhornGibbs()
);
kwargs...,
)
```

Compute the Sinkhorn Divergence between finite discrete measures `μ`

and `ν`

with respect to the precomputed cost matrices `Cμν`

, `Cμμ`

and `Cνν`

, entropic regularization parameter `ε`

and algorithm `alg`

.

A pre-computed optimal transport `plan`

between `μ`

and `ν`

may be provided.

See also: `sinkhorn2`

, `sinkhorn_divergence`

`OptimalTransport.sinkhorn_barycenter`

— Function`sinkhorn_barycenter(μ, C, ε, w, alg = SinkhornGibbs(); kwargs...)`

Compute the Sinkhorn barycenter for a collection of `N`

histograms contained in the columns of `μ`

, for a cost matrix `C`

of size `(size(μ, 1), size(μ, 1))`

, relative weights `w`

of size `N`

, and entropic regularisation parameter `ε`

. Returns the entropically regularised barycenter of the `μ`

, i.e. the histogram `ρ`

of length `size(μ, 1)`

that solves

\[\min_{\rho \in \Sigma} \sum_{i = 1}^N w_i \operatorname{OT}_{\varepsilon}(\mu_i, \rho)\]

where $\operatorname{OT}_{ε}(\mu, \nu) = \inf_{\gamma \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma)$ is the entropic optimal transport loss with cost $C$ and regularisation $\epsilon$.

Currently the following variants of the Sinkhorn algorithm are supported:

`OptimalTransport.SinkhornGibbs`

— Type`SinkhornGibbs()`

Vanilla Sinkhorn algorithm.

`OptimalTransport.SinkhornStabilized`

— Type`SinkhornStabilized(; absorb_tol::Real=1_000)`

Construct a log-domain stabilized Sinkhorn algorithm with absorption tolerance `absorb_tol`

for solving an entropically regularized optimal transport problem.

**References**

Schmitzer, B. (2019). Stabilized Sparse Scaling Algorithms for Entropy Regularized Transport Problems. SIAM Journal on Scientific Computing, 41(3), A1443–A1481.

`OptimalTransport.SinkhornEpsilonScaling`

— Type`SinkhornEpsilonScaling(algorithm::Sinkhorn; factor=1//2, steps=5)`

Construct an ε-scaling Sinkhorn algorithm for solving an entropically regularized optimal transport problem.

The function uses the specified Sinkhorn `algorithm`

with `steps`

ε-scaling steps with scaling factor `factor`

. It sequentially solves the entropically regularized optimal transport with regularization parameters

\[\varepsilon_i := \varepsilon \lambda^{i-k}, \qquad (i = 1,\ldots,k),\]

where $\lambda$ is the scaling factor and $k$ the number of scaling steps.

The following methods are deprecated and will be removed:

`OptimalTransport.sinkhorn_stabilized`

— Function`sinkhorn_stabilized(μ, ν, C, ε; absorb_tol=1_000, kwargs...)`

This method is deprecated, please use

```
sinkhorn(
μ, ν, C, ε, SinkhornStabilized(; absorb_tol=absorb_tol); kwargs...
)
```

instead.

See also: `sinkhorn`

, `SinkhornStabilized`

`OptimalTransport.sinkhorn_stabilized_epsscaling`

— Function```
sinkhorn_stabilized_epsscaling(
μ, ν, C, ε;
scaling_factor=1//2, scaling_steps=5, absorb_tol=1_000, kwargs...
)
```

This method is deprecated, please use

```
sinkhorn(
μ,
ν,
C,
ε,
SinkhornEpsilonScaling(
SinkhornStabilized(; absorb_tol=absorb_tol);
factor=scaling_factor,
steps=scaling_steps,
);
kwargs...,
)
```

instead.

See also: `sinkhorn`

, `SinkhornEpsilonScaling`

## Unbalanced optimal transport

`OptimalTransport.sinkhorn_unbalanced`

— Function`sinkhorn_unbalanced(μ, ν, C, λ1::Real, λ2::Real, ε; kwargs...)`

Compute the optimal transport plan for the unbalanced entropically regularized optimal transport problem with source and target marginals `μ`

and `ν`

, cost matrix `C`

of size `(length(μ), length(ν))`

, entropic regularization parameter `ε`

, and marginal relaxation terms `λ1`

and `λ2`

.

The optimal transport plan `γ`

is of the same size as `C`

and solves

\[\inf_{\gamma} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma) + \lambda_1 \operatorname{KL}(\gamma 1 | \mu) + \lambda_2 \operatorname{KL}(\gamma^{\mathsf{T}} 1 | \nu),\]

where $\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log \gamma_{i,j}$ is the entropic regularization term and $\operatorname{KL}$ is the Kullback-Leibler divergence.

The keyword arguments supported here are the same as those in the `sinkhorn_unbalanced`

for unbalanced optimal transport problems with general soft marginal constraints.

```
sinkhorn_unbalanced(
μ, ν, C, proxdivF1!, proxdivF2!, ε;
atol=0, rtol=atol > 0 ? 0 : √eps, check_convergence=10, maxiter=1_000,
)
```

Compute the optimal transport plan for the unbalanced entropically regularized optimal transport problem with source and target marginals `μ`

and `ν`

, cost matrix `C`

of size `(length(μ), length(ν))`

, entropic regularization parameter `ε`

, and soft marginal constraints $F_1$ and $F_2$ with "proxdiv" functions `proxdivF!`

and `proxdivG!`

.

The optimal transport plan `γ`

is of the same size as `C`

and solves

\[\inf_{\gamma} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma) + F_1(\gamma 1, \mu) + F_2(\gamma^{\mathsf{T}} 1, \nu),\]

where $\Omega(\gamma) = \sum_{i,j} \gamma_{i,j} \log \gamma_{i,j}$ is the entropic regularization term and $F_1(\cdot, \mu)$ and $F_2(\cdot, \nu)$ are soft marginal constraints for the source and target marginals.

The functions `proxdivF1!(s, p, ε)`

and `proxdivF2!(s, p, ε)`

evaluate the "proxdiv" functions of $F_1(\cdot, p)$ and $F_2(\cdot, p)$ at $s$ for the entropic regularization parameter $\varepsilon$. They have to be mutating and overwrite the first argument `s`

with the result of their computations.

Mathematically, the "proxdiv" functions are defined as

\[\operatorname{proxdiv}_{F_i}(s, p, \varepsilon) = \operatorname{prox}^{\operatorname{KL}}_{F_i(\cdot, p)/\varepsilon}(s) \oslash s\]

where $\oslash$ denotes element-wise division and $\operatorname{prox}_{F_i(\cdot, p)/\varepsilon}^{\operatorname{KL}}$ is the proximal operator of $F_i(\cdot, p)/\varepsilon$ for the Kullback-Leibler ($\operatorname{KL}$) divergence. It is defined as

\[\operatorname{prox}_{F}^{\operatorname{KL}}(x) = \operatorname{argmin}_{y} F(y) + \operatorname{KL}(y|x)\]

and can be computed in closed-form for specific choices of $F$. For instance, if $F(\cdot, p) = \lambda \operatorname{KL}(\cdot | p)$ ($\lambda > 0$), then

\[\operatorname{prox}_{F(\cdot, p)/\varepsilon}^{\operatorname{KL}}(x) = x^{\frac{\varepsilon}{\varepsilon + \lambda}} p^{\frac{\lambda}{\varepsilon + \lambda}},\]

where all operators are acting pointwise.^{[CPSV18]}

Every `check_convergence`

steps it is assessed if the algorithm is converged by checking if the iterates of the scaling factor in the current and previous iteration satisfy `isapprox(vcat(a, b), vcat(aprev, bprev); atol=atol, rtol=rtol)`

where `a`

and `b`

are the current iterates and `aprev`

and `bprev`

the previous ones. The default `rtol`

depends on the types of `μ`

, `ν`

, and `C`

. After `maxiter`

iterations, the computation is stopped.

See also: `sinkhorn_unbalanced2`

```
function sinkhorn_unbalanced(
μ, C, proxdivF!, ε; atol = nothing, rtol = nothing, maxiter::Int = 1_000, check_convergence::Int=10
)
Specialised case of [`sinkhorn_unbalanced`](@ref) to the special symmetric case where both inputs `μ, ν` are identical and the cost `C` is symmetric.
This implementation takes advantage of additional structure in the symmetric case which allows for a fixed point iteration with much faster convergence,
similar to that described by [^FeydyP19] and also employed in [`sinkhorn_divergence`](@ref) for the balanced case.
[^FeydyP19]: Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, and Gabriel Peyré. Interpolating between optimal transport and mmd using sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2681–2690. PMLR, 2019.
```

`OptimalTransport.sinkhorn_unbalanced2`

— Function```
sinkhorn_unbalanced2(μ, ν, C, λ1, λ2, ε; plan=nothing, kwargs...)
sinkhorn_unbalanced2(μ, ν, C, proxdivF1!, proxdivF2!, ε; plan=nothing, kwargs...)
```

Compute the optimal transport plan for the unbalanced entropically regularized optimal transport problem with source and target marginals `μ`

and `ν`

, cost matrix `C`

of size `(length(μ), length(ν))`

, entropic regularization parameter `ε`

, and marginal relaxation terms `λ1`

and `λ2`

or soft marginal constraints with "proxdiv" functions `proxdivF1!`

and `proxdivF2!`

.

A pre-computed optimal transport `plan`

may be provided. The other keyword arguments supported here are the same as those in the `sinkhorn_unbalanced`

for unbalanced optimal transport problems with general soft marginal constraints.

See also: `sinkhorn_unbalanced`

`OptimalTransport.sinkhorn_divergence_unbalanced`

— Function`sinkhorn_divergence_unbalanced(μ, ν, cμν, cμ, cν, λ, ε; kwargs...)`

Compute the unbalanced Sinkhorn divergence between unnormalized inputs `μ`

and `ν`

with cost matrix `cμν`

, `cμ`

and `cν`

between `(μ,ν)`

, `(μ, μ)`

and `(ν, ν)`

respectively, regularization level `ε`

and marginal constraint parameter `λ`

. Following ^{[SFVTP19]}, the unbalanced Sinkhorn divergence is defined as

\[ \operatorname{S}_{\varepsilon, \lambda} (\mu, \nu) := \operatorname{OT}_{ε, λ}(μ,ν) - \frac{1}{2}(\operatorname{OT}_{ε, λ}(μ,μ) + \operatorname{OT}_{ε, λ}(ν,ν)) + \frac{ε}{2}(m(μ) + m(ν))^2,\]

where $\operatorname{OT}_{ε, λ}(\alpha, \beta)$ is defined to be

\[ \operatorname{OT}_{ε, λ}(\alpha, \beta) = \inf_{\gamma} \langle C, \gamma \rangle + \varepsilon \operatorname{KL}(\gamma | \alpha \otimes \beta) + \lambda ( \operatorname{KL}(\gamma_1 | \alpha) + \operatorname{KL}(\gamma_2 | \beta) ),\]

i.e. the output of calling `sinkhorn_unbalanced2`

with the default Kullback-Leibler marginal penalties.

## Quadratically regularised optimal transport

`OptimalTransport.quadreg`

— Function`quadreg(μ, ν, C, ε, alg::QuadraticOT; kwargs...)`

Computes the optimal transport plan of histograms `μ`

and `ν`

with cost matrix `C`

and quadratic regularization parameter `ε`

.

The optimal transport plan `γ`

is of the same size as `C`

and solves

\[\inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma),\]

where $\Omega(\gamma) = \frac{1}{2} \sum_{i,j} \gamma_{i,j}^2$ is the quadratic regularization term.

Every `check_convergence`

steps it is assessed if the algorithm is converged by checking if the iterate of the transport plan `γ`

satisfies

` norm_diff < max(atol, rtol * max(norm(μ, Inf), norm(ν, Inf)))`

where

\[ \text{normdiff} = \max\{ \| \gamma \mathbf{1} - \mu \|_\infty , \| \gamma^\top \mathbf{1} - \nu \|_\infty \} . \]

After `maxiter`

iterations, the computation is stopped.

Note that unlike in the case of Sinkhorn's algorithm for the entropic regularisation, batch computation of optimal transport is not supported for the quadratic regularisation.

See also: `sinkhorn`

, `QuadraticOTNewton`

Currently the following algorithms for solving quadratically regularised optimal transport are supported:

`OptimalTransport.QuadraticOTNewton`

— Type`QuadraticOTNewton`

Semi-smooth Newton method (Algorithm 2 of Lorenz et al. 2019 ^{[LMM19]}) for solving quadratically regularised optimal transport

See also: `QuadraticOTNewton`

, `quadreg`

## Dual

`OptimalTransport.Dual.ot_entropic_semidual`

— Function`ot_entropic_semidual(μ, v, eps, K)`

Computes the semidual (in the second argument) of the entropic optimal transport loss, with source marginal `μ`

, regularization parameter `ε`

, and Gibbs kernel `K`

.

That is,

\[ \operatorname{OT}_{\varepsilon}(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma)\]

with $\Omega(\gamma) = \sum_{i,j} \gamma_{ij} \log \gamma_{ij}$, then the semidual in the second argument ν is ^{[Z21]}

\[\begin{aligned} \operatorname{OT}_{\varepsilon}^*(\mu, v) &= \sup_{\nu} \langle v, \nu \rangle - \operatorname{OT}_{\varepsilon}(\mu, \nu) \ \ &= -\varepsilon \left\langle \mu, \log\left( \dfrac{\mu}{K e^{v/\varepsilon}} \right) - 1\right\rangle. \end{aligned}\]

Notably, the semidual is computationally advantageous for solving variational problems since it is a smooth and unconstrained function of `v`

since it admits a closed form gradient. See ^{[CP16]} for a detailed discussion of dual methods for variational problems in optimal transport.

`OptimalTransport.Dual.ot_entropic_semidual_grad`

— Function`ot_entropic_semidual_grad(μ, v, eps, K)`

Computes the gradient with respect to `v`

of the semidual of the entropic optimal transport loss. That is,

\[\nabla_v \operatorname{OT}^*_{\varepsilon}(\mu, v) = K^\top \left( \dfrac{\mu}{K e^{v/\varepsilon}} \right) \odot e^{v/\varepsilon}.\]

See also: `ot_entropic_semidual`

`OptimalTransport.Dual.getprimal_ot_entropic_semidual`

— Function`getprimal_ot_entropic_semidual(μ, v, eps, K)`

Computes the the primal variable `ν`

corresponding to the dual variable `v`

at optimality. That is,

\[\nu^\star = e^{v^\star/\varepsilon} \odot K^\top \dfrac{\mu}{K e^{v^\star/\varepsilon}}. \]

See also: `ot_entropic_semidual`

`OptimalTransport.Dual.ot_entropic_dual`

— Function`ot_entropic_dual(u, v, eps, K)`

Computes the dual in both arguments of entropic optimal transport loss, where `u`

and `v`

are the dual variables associated with the source and target marginals respectively.

That is,

\[ \begin{aligned} \operatorname{OT}_{\varepsilon}^*(u, v) &= \sup_{\mu, \nu} \langle u, \mu \rangle + \langle v, \nu \rangle - \operatorname{OT}_\varepsilon(\mu, \nu) \ \ &= \varepsilon \log \langle e^{u/\varepsilon}, K e^{v/\varepsilon} \rangle. \end{aligned}\]

`OptimalTransport.Dual.ot_entropic_dual_grad`

— Function`ot_entropic_dual_grad(u, v, eps, K)`

Computes the gradient with respect to `u`

and `v`

of the dual of the entropic optimal transport loss. That is,

\[\begin{aligned} \nabla_u \operatorname{OT}^*_{\varepsilon}(u, v) &= \dfrac{e^{u/\varepsilon} \odot K e^{v/\varepsilon}}{\langle e^{u/\varepsilon}, K e^{v/\varepsilon} \rangle} \ \ \nabla_v \operatorname{OT}^*_{\varepsilon}(u, v) &= \dfrac{e^{v/\varepsilon} \odot K^\top e^{u/\varepsilon}}{\langle e^{v/\varepsilon}, K^\top e^{u/\varepsilon} \rangle}. \end{aligned}\]

See also: `ot_entropic_dual`

`OptimalTransport.Dual.getprimal_ot_entropic_dual`

— Function`getprimal_ot_entropic_dual(u, v, eps, K)`

Computes the the primal variable `γ`

corresponding to the dual variable `u, v`

at optimality. That is,

\[ \gamma = \operatorname{softmax}(\mathrm{diag}(e^{u/\varepsilon}) K \mathrm{diag}(e^{v/\varepsilon}))\]

See also: `ot_entropic_dual`

- GPC18Aude Genevay, Gabriel Peyré, Marco Cuturi, Learning Generative Models with Sinkhorn Divergences, Proceedings of the Twenty-First International Conference on Artficial Intelligence and Statistics, (AISTATS) 21, 2018
- FeydyP19Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, and Gabriel Peyré. Interpolating between optimal transport and mmd using sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2681–2690. PMLR, 2019.
- CPSV18Chizat, L., Peyré, G., Schmitzer, B., & Vialard, F.-X. (2018). Scaling algorithms for unbalanced optimal transport problems. Mathematics of Computation, 87(314), 2563–2609.
- SFVTP19Séjourné, T., Feydy, J., Vialard, F.X., Trouvé, A. and Peyré, G., 2019. Sinkhorn divergences for unbalanced optimal transport. arXiv preprint arXiv:1910.12958.
- LMM19Lorenz, Dirk A., Paul Manns, and Christian Meyer. Quadratically regularized optimal transport. Applied Mathematics & Optimization 83.3 (2021): 1919-1949.
- CP16Cuturi, Marco, and Gabriel Peyré. "A smoothed dual approach for variational Wasserstein problems." SIAM Journal on Imaging Sciences 9.1 (2016): 320-343.
- Z21Zhang, Stephen Y. “A Unified Framework for Non-Negative Matrix and Tensor Factorisations with a Smoothed Wasserstein Loss.” ArXiv: Machine Learning, 2021.