Asymptotics for Random Quadratic Transportation Costs

Online seminar - On the interactions between Statistics and Geometry

Dario Trevisan (joint with M. Goldman and M. Huesmann, arXiv:2409.08612)

2024-11-08

Assignment Problem Overview

  • The assignment problem (or bipartite matching) seeks an optimal correspondence between two sets of objects.

  • Given \(\{x_i\}_{i=1}^n\) and \(\{y_i\}_{i=1}^n\) and a cost function \(c\): \[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n c(x_i, y_{\sigma(i)}) \]

  • Studied in many fields: operations research, algorithm theory, combinatorics, graph theory, probability, statistics, and theoretical physics.

  • Euclidean Setting:
    • Sets of points: \(\{x_i\}_{i=1}^n\), \(\{y_i\}_{i=1}^n\) in \(\mathbb{R}^d\).
    • For \(p>0\), a cost is defined as: \[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |x_i - y_{\sigma(i)}|^p \]
  • Random Problem:
    • \(x_i = X_i\), \(y_i= Y_i\) are random variables,

    • Still realistic but also simpler than worst cases analysis.

Simulations

100 pairs of points on \([0,1]^2\),

Simulations

100 pairs of points on \([0,1]^2\), \(c(x,y) = |x-y|^{2}\)

Simulations

100 pairs of points on \([0,1]^3\)

Simulations

100 pairs of points on \([0,1]^3\), \(c(x,y) = |x-y|^{2}\)

Upper and lower bounds

  • Heuristics: \(n\) points in \([0,1]^d\)

\(\Rightarrow\) typical distances \(\approx n^{-1/d}\):

\[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \approx n \cdot n^{-p/d}\]

  • TRUE for \(d \ge 3\), but FALSE for \(d =1, 2\).

Theorem (Ajtai-Komlos-Tusnady, Talagrand, Barthe-Bordenave, Ledoux)

For \(p\ge 1\) and \(n+n\) i.i.d. uniform random points on \([0,1]^d\), \[\mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] \approx R_{d,p}(n) := \begin{cases} n \cdot n^{-p/2} & \text{for $d=1$,} \\ n \cdot (\log n /n)^{p/2} & \text{for $d=2$,}\\ n \cdot n^{-p/d} & \text{for $d\ge 3$.} \end{cases}\]

Tight upper and lower bounds are also known in the concave case \(p \in (0,1)\).

Limit results

Theorem (Barthe-Bordenave, Ambrosio-Stra-T., Goldman-T., )

For \(p>0\) and \(n+n\) i.i.d. uniform random points on \([0,1]^d\), the limit \[ \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] }{R_{d,p}(n)}\] is currently known to exist in the cases:

  • \(d=1\), \(p\neq 1/2\),

  • \(d=2\), \(p=2\),

  • \(d\ge 3\), \(p>0\).

Non-uniform laws

  • What happens if the random points \(\left\{ X_i \right\}_{i=1}^n\), \(\left\{ Y_i \right\}_{i=1}^n\) are i.i.d. but not uniform?

  • Heuristics: if the density at \(x\) is \(\lambda(x)\), a cube of length \(r\) contains \(\approx n \lambda(x) r^d\) points

  • \(\Rightarrow\) typical distance is \(\approx (n \lambda(x))^{-1/d}\):

\[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p\approx n^{1-p/d} \int \lambda^{1-p/d} \]

Theorem (BoutetDeMonvel-Martin, Barthe-Bordenave, Goldman-T. )

For \(d \ge 3\) and \(p\ge 1\), there are \(0<\underline{\mathsf{\alpha}}(p,d) \le \overline{\mathsf{\alpha}}(p,d)<\infty\) such that for every (smooth, strictly positive) density \(\lambda\) on \([0,1]^d\): \[ \underline{\mathsf{\alpha}}(p,d) \int \lambda^{1-p/d} \le \liminf_{n \to \infty}\frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] }{n^{1-p/d}}\] \[ \limsup_{n \to \infty} \frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] }{n^{1-p/d}} \le \overline{\mathsf{\alpha}}(p,d)\int \lambda^{1-p/d}\] where \((X_i)_{i=1}^n\), \((Y_i)_{i=1}^n\) are i.i.d. random variables with common distribution \(\lambda\).

Limiting constants and boundary cost

\[ \overline{\mathsf{\alpha}}(p,d) := \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n |X_i - Y_{\sigma(i)}|^p \right] }{n^{1-p/d}} \]

\[ \underline{\mathsf{\alpha}}(p,d) := \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \min_{\sigma \in \mathcal{S}_n} \sum_{i=1}^n \mathsf{b}^p_{[0,1]^d}(X_i, Y_{\sigma(i)}) \right] }{n^{1-p/d}}, \]

where \(\left\{ X_i \right\}_i\), \(\left\{ Y_i \right\}_i\) are i.i.d. uniform and \[ \mathsf{b}^p_{\Omega}(x,y) = \min\left\{ |x-y|^p, \min_{z \in \partial \Omega} |x-z|^p + |z-y|^p \right\}.\]

Open problem: for (some) \(d\ge 3\), \(p \ge 1\), prove that

\[ \underline{\mathsf{\alpha}}(p,d) = \overline{\mathsf{\alpha}}(p,d). \]

Simulations

100 pairs of points on \([0,1]^2\),

Simulations

100 pairs of points on \([0,1]^2\), \(c(x,y) = |x-y|^{2}\)

Simulations

100 pairs of points on \([0,1]^2\), \(c(x,y) = \mathsf{b}_{[0,1]^2}^{2}(x,y)\)

Simulations

100 pairs of points on \([0,1]^3\)

Simulations

100 pairs of points on \([0,1]^3\), \(c(x,y) = |x-y|^{2}\)

Simulations

100 pairs of points on \([0,1]^3\), \(c(x,y) = \mathsf{b}_{[0,1]^3}^{2}(x,y)\)

Connection to Optimal Transport

  • Birkhoff-Von Neumann Theorem \(\Rightarrow\) linear programming reformulation as optimal transport problem: \[ \min_{\sigma \in \mathcal{S}_n} \sum_{i} |x_i-y_{\sigma(i)}|^p = \min_{\pi} \sum_{i,j} |x_i-y_j|^p \pi(x_i, y_j)\] where \(\pi\) is any bistochastic matrix.

  • Relaxation: map (permutation) \(\Rightarrow\) Kantorovich plan (transition probability).

  • The optimal transport problem formulation leads to the notion of Wasserstein cost of order \(p\) between general measures: \[ \mathsf{W}^p \left( \mu, \lambda \right) = \min_{\pi} \int \int |x-y|^p \pi(dx, dy)\] where \(\pi\) runs over all transport plans between \(\mu\) and \(\lambda\).

Example (quantization cost)

\[\mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{x_i}, \lambda \right)\]

Theorem (Dereich-Scheutzow-Schottstedt, Goldman-T. )

For \(d \ge 3\) and \(p\ge 1\), there arr \(0<\underline{\mathsf{c}}(p,d) \le \overline{\mathsf{c}}(p,d)<\infty\) such that for every (smooth, strictly positive) density \(\lambda\) on a bounded domain \(\Omega\): \[ \underline{\mathsf{c}}(p,d) \int \lambda^{1-p/d} \le \liminf_{n \to \infty}\frac{ \mathbb{E}\left[ \mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \lambda \right) \right] }{n^{-p/d}}\]

\[ \limsup_{n \to \infty} \frac{ \mathbb{E}\left[ \mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \lambda \right) \right]}{n^{-p/d}} \le \overline{\mathsf{c}}(p,d)\int \lambda^{1-p/d}\] where \((X_i)_{i=1}^n\) are i.i.d. random variables with common distribution \(\lambda\).

Limiting constants and boundary cost

\[ \overline{\mathsf{c}}(p,d) := \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \mathcal{L}^d_{[0,1]^d} \right) \right] }{n^{-p/d}} \]

\[ \underlin{\mathsf{c}}(p,d) := \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \mathsf{Wb}^p_{[0,1]^d} \left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \mathcal{L}^d_{[0,1]^d} \right) \right] }{n^{-p/d}} \]

where \(\left\{ X_i \right\}_i\) and \(\left\{ Y_i \right\}_i\) are i.i.d. uniform on \([0,1]^d\) and \[ \mathsf{Wb}^p_{\Omega} \left( \mu, \lambda \right) = \min_{\pi} \int \int \mathsf{b}^p_{\Omega} (x,y) \pi(dx, dy).\]

Main result

Theorem (Huesmann-Goldman-T.)

For every \(d\ge 3\) it holds

\[\underline{\mathsf{c}}(2,d) = \overline{\mathsf{c}}(2,d).\]

  • Previously only known for

    • \(d=1\) and \(p< 1/2\)
    • \(p=d=2\).

Overview of the Proof Technique

  • Functional analysis tools,

    • PDE ansatz for \(\mathsf{W}\), \(\mathsf{Wb}\)
  • stability/regularity results for Optimal Transport,

    • \(L^\infty\) bounds for transport maps
    • quadratic response for Kantorovich potentials
  • ideas from stochastic homogenization,

    • Sub/super additivity arguments
    • Intermediate-scale Poincaré inequality

Caracciolo-Parisi-Lucibello-Sicuro linearization

For measures \(\mu\), \(\lambda\) on \(\Omega\), \[ \mathsf{W}^2 (\mu, \lambda) \approx \int |\nabla f|^2, \] where \(f\) solves the Poisson equation with null Neumann boundary conditions \[ \Delta f = \mu- \lambda \quad \text{in $\Omega$.}\]

  • It yields the prediction, for \(d=p=2\), \[ \lim_{n \to \infty} \frac{ \mathbb{E}\left[ \mathsf{W}^p\left( \frac 1 n \sum_{i=1}^n \delta_{X_i}, \mathcal{L}^d_{[0,1]^d} \right) \right] }{\log n/ n} = \frac{1}{4 \pi}.\]
  • Rigorously: Dacorogna-Moser interpolation, Benamou-Brenier

  • We still argue in the PDE heuristics (to simplify).

Linearization: boundary transport case

For measures \(\mu\), \(\lambda\) on \(\Omega\), \[ \mathsf{Wb}^2_{\Omega} (\mu, \lambda) \approx \int |\nabla u|^2, \] where \(u\) solves the Poisson equation with null Dirichlet boundary conditions. \[ \Delta u = \mu- \lambda \quad \text{in $\Omega$.}\]

Dual formulations

  • Neumann: \[ \int|\nabla f|^2 = \inf_{b} \left\{ \int |b|^2\, : \, \operatorname{div}b = \mu - \lambda,\, b \cdot \nu_{\Omega} = 0 \right\}.\]

  • Dirichlet: \[ \int|\nabla u|^2 = \inf_{b} \left\{ \int |b|^2\, : \, \operatorname{div}b = \mu - \lambda \right\}.\]

\(\Rightarrow \int |\nabla u|^2 \le \int|\nabla f|^2\).

Construction of a competitor

  • Start from \(b_0 := \nabla u\) and correct for the boundary flux.

  • Cut-off: \(\eta(x) =1\) on the boundary and \(\eta(x) = 0\) at distance \(r>0\), \[ b_1 := b_0 - \eta b_0\]

  • We find \(\operatorname{div}b_1 - (\mu - \lambda) = -\eta (\mu- \lambda) - \nabla \eta \cdot \nabla u\).

  • \(\Rightarrow\) further corrections.

\(\operatorname{div}b_1 - (\mu - \lambda) = -\eta (\mu- \lambda) - \nabla \eta \cdot \nabla u\).

Key identity (removes \(\nabla u\))

\[ \nabla \eta \cdot \nabla u = \operatorname{div}( u \nabla \eta) - u \Delta \eta \]

  • Further steps: \(b_2 := b_1 + u \nabla \eta\)

  • Solve the PDE \(\Delta g = \eta(\mu- \lambda) - u \Delta \eta\) with null Neumann conditions,

\[ b_3 := b_2 + \nabla g .\]

  • By the dual formulation, \[ \int |\nabla f|^2 \le \int |b_3|^2\]

  • Estimates for elliptic PDE’s and \(\nabla \eta \approx r^{-1}\), \(\Delta \eta \approx r^{-2}\) give

\[\begin{equation}\begin{split} & \int |\nabla f|^2 - \int |\nabla u|^2 \lesssim\varepsilon\int |\nabla u|^2\\ & \quad + \frac{1}{\varepsilon}\left[ \left\| \eta(\mu - \lambda) \right\|_{H^{-1,2}(\Omega)}^2 + \left( r^{-2}+ \operatorname{diam}(\Omega)^2 r^{-4} \right)\int|u|^2 \right] \end{split} \end{equation}\]

Application to Random Setting

  • Instead of \(n\) i.i.d. points on \([0,1]^d\), rescale to \(\Omega = (0,L)^d\) and take \(N(L) \approx L^d \to \infty\) points

  • Precisely: consider a Poisson point process.

Main result (Poisson formulation)

\[\begin{equation} \begin{split} & \lim_{L \to \infty} \mathbb{E}\left[ \mathsf{W}^2\left( \frac 1 {N(L)} \sum_{i=1}^{N(L)} \delta_{X_i}, \frac 1 {L^d} \mathcal{L}^d_{[0,L]^d} \right) \right] \\ & \quad \quad = \lim_{L \to \infty} \mathbb{E}\left[ \mathsf{Wb}^2_{(0,L)^d} \left( \frac 1 {N(L)} \sum_{i=1}^{N(L)} \delta_{X_i}, \frac 1 {L^d} \mathcal{L}^d_{[0,L]^d} \right) \right]\end{split}\end{equation}\]

  • In the PDE ansatz: \[ \frac 1 {L^d} \mathbb{E}\left[ \int|\nabla u|^2 \right] \approx 1,\] (roughly) \(|\nabla u| \approx 1\).

  • Aim: the error term is of lower order: \[ \left\| \eta(\mu - \lambda) \right\|_{H^{-1,2}(\Omega)}^2 + \left( r^{-2}+ \operatorname{diam}(\Omega)^2 r^{-4} \right)\int|u|^2\]

  • Since \(r := \delta L\) for \(\delta>0\), we have a balance between terms.

\[ \frac 1 {L^d} \left\| \eta(\mu - \lambda) \right\|_{H^{-1,2}( (0,L)^d )}^2 \approx \delta \] and \[ \frac 1 {L^d} \left( r^{-2}+ L^2r^{-4} \right)\int|u|^2 \approx \left( \delta^{-2} + \delta^{-4} \right) \cdot \frac{1}{L^{d+2}} \int|u|^2 \]

  • But \(|\nabla u| \approx 1\), Poincaré inequality gives \(|u| \lesssim L\).
  • We argue that \[ \mathbb{E}\left[ \int_{(0,L)^d}|u|^2 \right] \ll L^{d+2} \] (roughly) \(|u| \ll L\).

  • Idea:

    • \(u\) is oscillating
    • argue at multiple scales (as in stochastic homogenization).

Realization: intermediate scale \(1 \ll L_0 \ll L\):

  • partition \((0,L)^d\) into \((L/L_0)^d\) cubes.
  • glue the solutions to PDE on sub-cubes \(\Rightarrow\) competitor \(\tilde u\):

\[\begin{equation} \frac 1 {L^d} \mathbb{E}\left[ \int_{(0,L)^d} |\nabla \tilde u|^2 \right] = \frac{1}{L_0^d} \mathbb{E}\left[ \int_{(0,L_0)^d} |\nabla u_0|^2 \right] \end{equation}\]

  • \(\tilde u\) is almost optimizer:

\[\begin{equation} \frac 1 {L^d} \mathbb{E}\left[ \int_{(0,L)^d} |\nabla \tilde u|^2 \right] - \frac 1 {L^d}\mathbb{E}\left[ \int_{(0,L)^d} |\nabla u|^2 \right] \ll 1. \end{equation}\]

  • By stability of solutions, \(\tilde u\) is close to \(u\):

\[\begin{equation} \mathbb{E}\left[ \int_{(0,L)^d} |u- \tilde u|^2 \right] \ll L^{d+2}. \end{equation}\] (roughly) \(|u - \tilde u| \ll L\).

  • By construction: \(|\tilde u| \ll L_0 \ll L\)

  • By triangle inequality:

\[ \mathbb{E}\left[ \int_{(0,L)^d}|u|^2 \right] \ll L^{d+2}. \]

Final comments

  • The PDE ansatz fails at microscopic scales \(\Rightarrow\) Optimal Transport becomes technical

  • Stability of solutions to Optimal Transport is widely open!

Theorem (Mérigot-Delalande-Chazal, Mischler-T.)

For \(p>1\), there exists \(\theta(p)\in (0,1)\) and \(C(p)< \infty\) such that, for any \(\mu\), \(\nu\), probability measures on \([0,1]^d\), it holds \[\begin{equation}\label{eq:stability-map} \int_{[0,1]^d} |T_{\mu}(x) - T_\nu(x)|^2 dx \le C \mathsf{W}_1(\mu, \nu)^{2\theta}, \end{equation}\] where \(T_\mu\) denotes the optimal transport map for \(\mathsf{W}_p(\mathcal{L}^d_{[0,1]^d}, \mu)\), and similarly \(T_\nu\).