Asymptotics for Random Quadratic Transportation Costs

Online seminar - On the interactions between Statistics and Geometry

Dario Trevisan (joint with M. Goldman and M. Huesmann, arXiv:2409.08612)

2024-11-08

Assignment Problem Overview

The assignment problem (or bipartite matching) seeks an optimal correspondence between two sets of objects.
Given $\{x_i\}_{i=1}^n$ and $\{y_i\}_{i=1}^n$ and a cost function $c$: \[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n c(x_i, y_{\sigma(i)}) \]
Studied in many fields: operations research, algorithm theory, combinatorics, graph theory, probability, statistics, and theoretical physics.

Euclidean Setting:
- Sets of points: $\{x_i\}_{i=1}^n$, $\{y_i\}_{i=1}^n$ in $\mathbb{R}^d$.
- For $p>0$, a cost is defined as: \[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |x_i - y_{\sigma(i)}|^p \]
Random Problem:
- $x_i = X_i$, $y_i= Y_i$ are random variables,
- Still realistic but also simpler than worst cases analysis.

Simulations

100 pairs of points on $[0,1]^2$,

Simulations

100 pairs of points on $[0,1]^2$, $c(x,y) = |x-y|^{2}$

Simulations

100 pairs of points on $[0,1]^3$

Simulations

100 pairs of points on $[0,1]^3$, $c(x,y) = |x-y|^{2}$

Upper and lower bounds

Heuristics: $n$ points in $[0,1]^d$

$\Rightarrow$ typical distances $\approx n^{-1/d}$:

\[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \approx n \cdot n^{-p/d}\]

TRUE for $d \ge 3$, but FALSE for $d =1, 2$.

Theorem (Ajtai-Komlos-Tusnady, Talagrand, Barthe-Bordenave, Ledoux)

For $p\ge 1$ and $n+n$ i.i.d. uniform random points on $[0,1]^d$, \[\mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] \approx R_{d,p}(n) := \begin{cases} n \cdot n^{-p/2} & \text{for $d=1$,} \\ n \cdot (\log n /n)^{p/2} & \text{for $d=2$,}\\ n \cdot n^{-p/d} & \text{for $d\ge 3$.} \end{cases}\]

Tight upper and lower bounds are also known in the concave case $p \in (0,1)$.

Limit results

Theorem (Barthe-Bordenave, Ambrosio-Stra-T., Goldman-T., )

For $p>0$ and $n+n$ i.i.d. uniform random points on $[0,1]^d$, the limit \[ \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] }{R_{d,p}(n)}\] is currently known to exist in the cases:

$d=1$, $p\neq 1/2$,
$d=2$, $p=2$,
$d\ge 3$, $p>0$.

Non-uniform laws

What happens if the random points $\left\{ X_i \right\}_{i=1}^n$, $\left\{ Y_i \right\}_{i=1}^n$ are i.i.d. but not uniform?
Heuristics: if the density at $x$ is $\lambda(x)$, a cube of length $r$ contains $\approx n \lambda(x) r^d$ points
$\Rightarrow$ typical distance is $\approx (n \lambda(x))^{-1/d}$:

\[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p\approx n^{1-p/d} \int \lambda^{1-p/d} \]

Theorem (BoutetDeMonvel-Martin, Barthe-Bordenave, Goldman-T. )

For $d \ge 3$ and $p\ge 1$, there are $0<\underline{\mathsf{\alpha}}(p,d) \le \overline{\mathsf{\alpha}}(p,d)<\infty$ such that for every (smooth, strictly positive) density $\lambda$ on $[0,1]^d$: \[ \underline{\mathsf{\alpha}}(p,d) \int \lambda^{1-p/d} \le \liminf_{n \to \infty}\frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] }{n^{1-p/d}}\] \[ \limsup_{n \to \infty} \frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] }{n^{1-p/d}} \le \overline{\mathsf{\alpha}}(p,d)\int \lambda^{1-p/d}\] where $(X_i)_{i=1}^n$, $(Y_i)_{i=1}^n$ are i.i.d. random variables with common distribution $\lambda$.

Limiting constants and boundary cost

\[ \overline{\mathsf{\alpha}}(p,d) := \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] }{n^{1-p/d}} \]

\[ \underline{\mathsf{\alpha}}(p,d) := \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n \mathsf{b}^p_{[0,1]^d}(X_i, Y_{\sigma(i)}) \right] }{n^{1-p/d}}, \]

where $\left\{ X_i \right\}_i$, $\left\{ Y_i \right\}_i$ are i.i.d. uniform and \[ \mathsf{b}^p_{\Omega}(x,y) = \min\left\{ |x-y|^p, \min_{z \in \partial \Omega} |x-z|^p + |z-y|^p \right\}.\]

Open problem: for (some) $d\ge 3$, $p \ge 1$, prove that

\[ \underline{\mathsf{\alpha}}(p,d) = \overline{\mathsf{\alpha}}(p,d). \]

Simulations

100 pairs of points on $[0,1]^2$,

Simulations

100 pairs of points on $[0,1]^2$, $c(x,y) = |x-y|^{2}$

Simulations

100 pairs of points on $[0,1]^2$, $c(x,y) = \mathsf{b}_{[0,1]^2}^{2}(x,y)$

Simulations

100 pairs of points on $[0,1]^3$

Simulations

100 pairs of points on $[0,1]^3$, $c(x,y) = |x-y|^{2}$

Simulations

100 pairs of points on $[0,1]^3$, $c(x,y) = \mathsf{b}_{[0,1]^3}^{2}(x,y)$

Connection to Optimal Transport

Birkhoff-Von Neumann Theorem $\Rightarrow$ linear programming reformulation as optimal transport problem: \[ \min_{\sigma \in \mathcal{S}_n} \sum_{i} |x_i-y_{\sigma(i)}|^p = \min_{\pi} \sum_{i,j} |x_i-y_j|^p \pi(x_i, y_j)\] where $\pi$ is any bistochastic matrix.
Relaxation: map (permutation) $\Rightarrow$ Kantorovich plan (transition probability).

The optimal transport problem formulation leads to the notion of Wasserstein cost of order $p$ between general measures: \[ \mathsf{W}^p \left( \mu, \lambda \right) = \min_{\pi} \int \int |x-y|^p \pi(dx, dy)\] where $\pi$ runs over all transport plans between $\mu$ and $\lambda$.

Example (quantization cost)

\[\mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{x_i}, \lambda \right)\]

Theorem (Dereich-Scheutzow-Schottstedt, Goldman-T. )

For $d \ge 3$ and $p\ge 1$, there arr $0<\underline{\mathsf{c}}(p,d) \le \overline{\mathsf{c}}(p,d)<\infty$ such that for every (smooth, strictly positive) density $\lambda$ on a bounded domain $\Omega$: \[ \underline{\mathsf{c}}(p,d) \int \lambda^{1-p/d} \le \liminf_{n \to \infty}\frac{ \mathbb{E}\left[ \mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \lambda \right) \right] }{n^{-p/d}}\]

\[ \limsup_{n \to \infty} \frac{ \mathbb{E}\left[ \mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \lambda \right) \right]}{n^{-p/d}} \le \overline{\mathsf{c}}(p,d)\int \lambda^{1-p/d}\] where $(X_i)_{i=1}^n$ are i.i.d. random variables with common distribution $\lambda$.

Limiting constants and boundary cost

\[ \overline{\mathsf{c}}(p,d) := \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \mathcal{L}^d_{[0,1]^d} \right) \right] }{n^{-p/d}} \]

\[ \underlin{\mathsf{c}}(p,d) := \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \mathsf{Wb}^p_{[0,1]^d} \left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \mathcal{L}^d_{[0,1]^d} \right) \right] }{n^{-p/d}} \]

where $\left\{ X_i \right\}_i$ and $\left\{ Y_i \right\}_i$ are i.i.d. uniform on $[0,1]^d$ and \[ \mathsf{Wb}^p_{\Omega} \left( \mu, \lambda \right) = \min_{\pi} \int \int \mathsf{b}^p_{\Omega} (x,y) \pi(dx, dy).\]

Main result

Theorem (Huesmann-Goldman-T.)

For every $d\ge 3$ it holds

\[\underline{\mathsf{c}}(2,d) = \overline{\mathsf{c}}(2,d).\]

Previously only known for
- $d=1$ and $p< 1/2$
- $p=d=2$.

Overview of the Proof Technique

Functional analysis tools,
- PDE ansatz for $\mathsf{W}$, $\mathsf{Wb}$
stability/regularity results for Optimal Transport,
- $L^\infty$ bounds for transport maps
- quadratic response for Kantorovich potentials
ideas from stochastic homogenization,
- Sub/super additivity arguments
- Intermediate-scale Poincaré inequality

Caracciolo-Parisi-Lucibello-Sicuro linearization

For measures $\mu$, $\lambda$ on $\Omega$, \[ \mathsf{W}^2 (\mu, \lambda) \approx \int |\nabla f|^2, \] where $f$ solves the Poisson equation with null Neumann boundary conditions \[ \Delta f = \mu- \lambda \quad \text{in $\Omega$.}\]

It yields the prediction, for $d=p=2$, \[ \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \mathcal{L}^d_{[0,1]^d} \right) \right] }{\log n/ n} = \frac{1}{4 \pi}.\]

Rigorously: Dacorogna-Moser interpolation, Benamou-Brenier
We still argue in the PDE heuristics (to simplify).

Linearization: boundary transport case

For measures $\mu$, $\lambda$ on $\Omega$, \[ \mathsf{Wb}^2_{\Omega} (\mu, \lambda) \approx \int |\nabla u|^2, \] where $u$ solves the Poisson equation with null Dirichlet boundary conditions. \[ \Delta u = \mu- \lambda \quad \text{in $\Omega$.}\]

Dual formulations

Neumann: \[ \int|\nabla f|^2 = \inf_{b} \left\{ \int |b|^2\, : \, \operatorname{div}b = \mu - \lambda,\, b \cdot \nu_{\Omega} = 0 \right\}.\]
Dirichlet: \[ \int|\nabla u|^2 = \inf_{b} \left\{ \int |b|^2\, : \, \operatorname{div}b = \mu - \lambda \right\}.\]

$\Rightarrow \int |\nabla u|^2 \le \int|\nabla f|^2$.

Construction of a competitor

Start from $b_0 := \nabla u$ and correct for the boundary flux.
Cut-off: $\eta(x) =1$ on the boundary and $\eta(x) = 0$ at distance $r>0$, \[ b_1 := b_0 - \eta b_0\]
We find $\operatorname{div}b_1 - (\mu - \lambda) = -\eta (\mu- \lambda) - \nabla \eta \cdot \nabla u$.
$\Rightarrow$ further corrections.

$\operatorname{div}b_1 - (\mu - \lambda) = -\eta (\mu- \lambda) - \nabla \eta \cdot \nabla u$.

Key identity (removes $\nabla u$)

\[ \nabla \eta \cdot \nabla u = \operatorname{div}( u \nabla \eta) - u \Delta \eta \]

Further steps: $b_2 := b_1 + u \nabla \eta$
Solve the PDE $\Delta g = \eta(\mu- \lambda) - u \Delta \eta$ with null Neumann conditions,

\[ b_3 := b_2 + \nabla g .\]

By the dual formulation, \[ \int |\nabla f|^2 \le \int |b_3|^2\]
Estimates for elliptic PDE’s and $\nabla \eta \approx r^{-1}$, $\Delta \eta \approx r^{-2}$ give

\[\begin{equation}\begin{split} & \int |\nabla f|^2 - \int |\nabla u|^2 \lesssim\varepsilon\int |\nabla u|^2\\ & \quad + \frac{1}{\varepsilon}\left[ \left\| \eta(\mu - \lambda) \right\|_{H^{-1,2}(\Omega)}^2 + \left( r^{-2}+ \operatorname{diam}(\Omega)^2 r^{-4} \right)\int|u|^2 \right] \end{split} \end{equation}\]

Application to Random Setting

Instead of $n$ i.i.d. points on $[0,1]^d$, rescale to $\Omega = (0,L)^d$ and take $N(L) \approx L^d \to \infty$ points
Precisely: consider a Poisson point process.

Main result (Poisson formulation)

\[\begin{equation} \begin{split} & \lim_{L \to \infty} \mathbb{E}\left[ \mathsf{W}^2\left( \frac 1 {N(L)} \sum_{i=1}^{N(L)} \delta_{X_i}, \frac 1 {L^d} \mathcal{L}^d_{[0,L]^d} \right) \right] \\ & \quad \quad = \lim_{L \to \infty} \mathbb{E}\left[ \mathsf{Wb}^2_{(0,L)^d} \left( \frac 1 {N(L)} \sum_{i=1}^{N(L)} \delta_{X_i}, \frac 1 {L^d} \mathcal{L}^d_{[0,L]^d} \right) \right]\end{split}\end{equation}\]

In the PDE ansatz: \[ \frac 1 {L^d} \mathbb{E}\left[ \int|\nabla u|^2 \right] \approx 1,\] (roughly) $|\nabla u| \approx 1$.
Aim: the error term is of lower order: \[ \left\| \eta(\mu - \lambda) \right\|_{H^{-1,2}(\Omega)}^2 + \left( r^{-2}+ \operatorname{diam}(\Omega)^2 r^{-4} \right)\int|u|^2\]
Since $r := \delta L$ for $\delta>0$, we have a balance between terms.

\[ \frac 1 {L^d} \left\| \eta(\mu - \lambda) \right\|_{H^{-1,2}( (0,L)^d )}^2 \approx \delta \] and \[ \frac 1 {L^d} \left( r^{-2}+ L^2r^{-4} \right)\int|u|^2 \approx \left( \delta^{-2} + \delta^{-4} \right) \cdot \frac{1}{L^{d+2}} \int|u|^2 \]

But $|\nabla u| \approx 1$, Poincaré inequality gives $|u| \lesssim L$.

We argue that \[ \mathbb{E}\left[ \int_{(0,L)^d}|u|^2 \right] \ll L^{d+2} \] (roughly) $|u| \ll L$.
Idea:
- $u$ is oscillating
- argue at multiple scales (as in stochastic homogenization).

Realization: intermediate scale $1 \ll L_0 \ll L$:

partition $(0,L)^d$ into $(L/L_0)^d$ cubes.
glue the solutions to PDE on sub-cubes $\Rightarrow$ competitor $\tilde u$:

\[\begin{equation} \frac 1 {L^d} \mathbb{E}\left[ \int_{(0,L)^d} |\nabla \tilde u|^2 \right] = \frac{1}{L_0^d} \mathbb{E}\left[ \int_{(0,L_0)^d} |\nabla u_0|^2 \right] \end{equation}\]

$\tilde u$ is almost optimizer:

\[\begin{equation} \frac 1 {L^d} \mathbb{E}\left[ \int_{(0,L)^d} |\nabla \tilde u|^2 \right] - \frac 1 {L^d}\mathbb{E}\left[ \int_{(0,L)^d} |\nabla u|^2 \right] \ll 1. \end{equation}\]

By stability of solutions, $\tilde u$ is close to $u$:

\[\begin{equation} \mathbb{E}\left[ \int_{(0,L)^d} |u- \tilde u|^2 \right] \ll L^{d+2}. \end{equation}\] (roughly) $|u - \tilde u| \ll L$.

By construction: $|\tilde u| \ll L_0 \ll L$
By triangle inequality:

\[ \mathbb{E}\left[ \int_{(0,L)^d}|u|^2 \right] \ll L^{d+2}. \]

Final comments

The PDE ansatz fails at microscopic scales $\Rightarrow$ Optimal Transport becomes technical
Stability of solutions to Optimal Transport is widely open!

Theorem (Mérigot-Delalande-Chazal, Mischler-T.)

For $p>1$, there exists $\theta(p)\in (0,1)$ and $C(p)< \infty$ such that, for any $\mu$, $\nu$, probability measures on $[0,1]^d$, it holds \[\begin{equation}\label{eq:stability-map} \int_{[0,1]^d} |T_{\mu}(x) - T_\nu(x)|^2 dx \le C \mathsf{W}_1(\mu, \nu)^{2\theta}, \end{equation}\] where $T_\mu$ denotes the optimal transport map for $\mathsf{W}_p(\mathcal{L}^d_{[0,1]^d}, \mu)$, and similarly $T_\nu$.