Basics

You are seeing the HTML output generated by Documenter.jl and Literate.jl from the Julia source file. The corresponding notebook can be viewed in nbviewer.

Packages

We load the following packages into our environment:

using OptimalTransport
using Distances
using Plots
using PythonOT: PythonOT
using Tulip

using LinearAlgebra
using Random

Random.seed!(1234)

const POT = PythonOT

PythonOT

Problem setup

First, let us initialise two random probability measures $\mu$ (source measure) and $\nu$ (target measure) in 1D:

M = 200
μ = fill(1 / M, M)
μsupport = rand(M)

N = 250
ν = fill(1 / N, N)
νsupport = rand(N);

Now we compute the quadratic cost matrix $C_{ij} = \| x_i - x_j \|_2^2$:

C = pairwise(SqEuclidean(), μsupport', νsupport'; dims=2);

Exact optimal transport

The earth mover's distance is defined as the optimal value of the Monge-Kantorovich problem

\[\inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle = \inf_{\gamma \in \Pi(\mu, \nu)} \sum_{i, j} \gamma_{ij} C_{ij},\]

where $\Pi(\mu, \nu)$ denotes the set of couplings of $\mu$ and $\nu$, i.e., the set of joint distributions whose marginals are $\mu$ and $\nu$. If $C$ is the quadratic cost matrix, the earth mover's distance is known as the square of the 2-Wasserstein distance.

The function emd returns the optimal transport plan $\gamma$

γ = emd(μ, ν, C, Tulip.Optimizer());

whilst using emd2 returns the optimal transport cost:

emd2(μ, ν, C, Tulip.Optimizer())

0.0014877814134910635

Entropically regularised optimal transport

We may add an entropy term to the Monge-Kantorovich problem to obtain the entropically regularised optimal transport problem

\[\inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma),\]

where $\Omega(\gamma) = \sum_{i, j} \gamma_{ij} \log(\gamma_{ij})$ is the negative entropy of the coupling $\gamma$ and $\varepsilon$ controls the strength of the regularisation.

This problem is strictly convex and admits a very efficient iterative scaling scheme for its solution known as the Sinkhorn algorithm.

We compute the optimal entropically regularised transport plan:

ε = 0.01
γ = sinkhorn(μ, ν, C, ε);

We can check that one obtains the same result with the Python Optimal Transport (POT) package:

γpot = POT.sinkhorn(μ, ν, C, ε)
norm(γ - γpot, Inf)

5.79577190277028e-12

We can compute the optimal cost (a scalar) of the entropically regularized optimal transport problem with sinkhorn2:

sinkhorn2(μ, ν, C, ε)

0.00599172263125863

Quadratically regularised optimal transport

Instead of the common entropically regularised optimal transport problem, we can solve the quadratically regularised optimal transport problem

\[\inf_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle + \varepsilon \frac{\| \gamma \|_F^2}{2}.\]

One property of the quadratically regularised optimal transport problem is that the resulting transport plan $\gamma$ is sparse. We take advantage of this and represent it as a sparse matrix.

quadreg(μ, ν, C, ε; maxiter=100);

Stabilized Sinkhorn algorithm

When $\varepsilon$ is very small, we can use a log-stabilised version of the Sinkhorn algorithm.

ε = 0.005
γ = sinkhorn_stabilized(μ, ν, C, ε; maxiter=5_000);

Again we can check that the same result is obtained with the POT package:

γ_pot = POT.sinkhorn(μ, ν, C, ε; method="sinkhorn_stabilized", numItermax=5_000)
norm(γ - γ_pot, Inf)

8.684136200941273e-12

Stabilized Sinkhorn algorithm with $\varepsilon$-scaling

In addition to log-stabilisation, we can use $\varepsilon$-scaling:

γ = sinkhorn_stabilized_epsscaling(μ, ν, C, ε; maxiter=5_000);

The POT package yields the same result:

γpot = POT.sinkhorn(μ, ν, C, ε; method="sinkhorn_epsilon_scaling", numItermax=5000)
norm(γ - γpot, Inf)

1.665707076935717e-11

Unbalanced optimal transport

Unbalanced optimal transport deals with general positive measures which do not necessarily have the same total mass. For unbalanced source and target marginals $\mu$ and $\nu$ and a cost matrix $C$, entropically regularised unbalanced optimal transport solves

\[\inf_{\gamma \geq 0} \langle \gamma, C \rangle + \varepsilon \Omega(\gamma) + \lambda_1 \mathrm{KL}(\gamma 1 | \mu) + \lambda_2 \mathrm{KL}(\gamma^{\mathsf{T}} 1 | \nu),\]

where $\varepsilon$ controls the strength of the entropic regularisation, and $\lambda_1$ and $\lambda_2$ control how strongly we enforce the marginal constraints.

We construct two random measures, now with different total masses:

M = 100
μ = fill(1 / M, M)
μsupport = rand(M)

N = 200
ν = fill(1 / M, N)
νsupport = rand(N);

We compute the quadratic cost matrix:

C = pairwise(SqEuclidean(), μsupport', νsupport'; dims=2);

Now we solve the corresponding unbalanced, entropy-regularised transport problem with $\varepsilon = 0.01$ and $\lambda_1 = \lambda_2 = 1$:

ε = 0.01
λ = 1
γ = sinkhorn_unbalanced(μ, ν, C, λ, λ, ε);

We check that the result agrees with POT:

γpot = POT.sinkhorn_unbalanced(μ, ν, C, ε, λ)
norm(γ - γpot, Inf)

1.2458106547243164e-9

Plots

Entropically regularised transport

Let us construct source and target measures again:

μsupport = νsupport = range(-2, 2; length=100)
C = pairwise(SqEuclidean(), μsupport', νsupport'; dims=2)
μ = normalize!(exp.(-μsupport .^ 2 ./ 0.5^2), 1)
ν = normalize!(νsupport .^ 2 .* exp.(-νsupport .^ 2 ./ 0.5^2), 1)

plot(μsupport, μ; label=raw"$\mu$", size=(600, 400))
plot!(νsupport, ν; label=raw"$\nu$")

Now we compute the entropically regularised transport plan:

γ = sinkhorn(μ, ν, C, 0.01)
heatmap(
    μsupport,
    νsupport,
    γ;
    title="Entropically regularised optimal transport",
    size=(600, 600),
)

Quadratically regularised transport

Notice how the "edges" of the transport plan are sharper if we use quadratic regularisation instead of entropic regularisation:

γquad = quadreg(μ, ν, C, 5; maxiter=100);
heatmap(
    μsupport,
    νsupport,
    γquad;
    title="Quadratically regularised optimal transport",
    size=(600, 600),
)

Sinkhorn barycenters

For a collection of discrete probability measures $\{\mu_i\}_{i=1}^N \subset \mathcal{P}$, cost matrices $\{C_i\}_{i=1}^N$, and positive weights $\{\lambda_i\}_{i=1}^N$ summing to $1$, the entropically regularised barycenter in $\mathcal{P}$ is the discrete probability measure $\mu$ that solves

\[\inf_{\mu \in \mathcal{P}} \sum_{i = 1}^N \lambda_i \operatorname{OT}_{\varepsilon}(\mu, \mu_i)\]

where $\operatorname{OT}_\varepsilon(\mu, \mu_i)$ denotes the entropically regularised optimal transport cost with marginals $\mu$ and $\mu_i$, cost matrix $C$, and entropic regularisation parameter $\varepsilon$.

We set up two measures and compute the weighted barycenters. We choose weights $\lambda_1 \in \{0.25, 0.5, 0.75\}$.

support = range(-1, 1; length=250)
mu1 = normalize!(exp.(-(support .+ 0.5) .^ 2 ./ 0.1^2), 1)
mu2 = normalize!(exp.(-(support .- 0.5) .^ 2 ./ 0.1^2), 1)

plt = plot(; size=(800, 400), legend=:outertopright)
plot!(plt, support, mu1; label=raw"$\mu_1$")
plot!(plt, support, mu2; label=raw"$\mu_2$")

mu = hcat(mu1, mu2)
C = pairwise(SqEuclidean(), support'; dims=2)
for λ1 in (0.25, 0.5, 0.75)
    λ2 = 1 - λ1
    a = sinkhorn_barycenter(mu, C, 0.01, [λ1, λ2], SinkhornGibbs())
    plot!(plt, support, a; label="\$\\mu \\quad (\\lambda_1 = $λ1)\$")
end
plt

This page was generated using Literate.jl.