University of California San Diego
****************************
Defense Talk
Varun Khurana
University of California, San Diego
Learning with Measure-Valued Data
Abstract:
This talk discusses computationally feasible machine learning methods, based on optimal transport and neural network theory, applied to measure-valued data. We first analyze linearized optimal transport (LOT), which essentially embeds measure-valued data into an $L^2$ space, where out-of-the-box machine learning techniques are available. We analyze the situations in which LOT provides an isometric embedding with respect to the Wasserstein-2 distance, and we provide the bounds needed to achieve a pre-specified level of linear separation in the LOT embedding space. Second, we produce a computationally feasible algorithm to recover low-dimensional structures in measure-valued data by using the LOT embedding along with dimensionality reduction techniques. Using computational methods for solving optimal transport problems, such as the Sinkhorn algorithm or linear programming, we provide approximation guarantees in terms of the sampling rates. Third, we study structured approximations of measures in Wasserstein space by a scaled Voronoi partition of $\mathbb{R}^d$ generated from a full-rank lattice. We show that these structured approximations match the rates of optimal quantizers and empirical measure approximation in most instances. We then extend these results to noncompactly supported measures that decay fast enough. Finally, we study methods for comparing probability measures by analyzing a neural network two-sample test. In particular, we perform a time analysis of a related neural tangent kernel (NTK) two-sample test and extend the analysis to the neural network two-sample test in a small-time training regime. We also show the amount of time needed before the two-sample test detects a deviation $\epsilon > 0$ when the probability measures considered are different, compared with when they are the same.
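As a rough illustration of the LOT embedding idea described in the abstract, the following is a minimal sketch for the one-dimensional case only, and not the speaker's implementation. It assumes a uniform reference measure on [0, 1], for which the optimal transport map to a target measure is the target's quantile function. The function names `lot_embed_1d` and `lot_distance` and the grid size `n_quantiles` are hypothetical choices made for this example.

```python
import numpy as np

def lot_embed_1d(samples, n_quantiles=200):
    """Embed a 1D empirical measure as its quantile function on a uniform grid.

    Assumption: the reference measure is uniform on [0, 1], so the optimal
    transport map from the reference to the target is the target's quantile
    function. Scaling by 1/sqrt(n_quantiles) makes the Euclidean norm of the
    vector approximate the L^2([0, 1]) norm of that map.
    """
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    return np.quantile(samples, levels) / np.sqrt(n_quantiles)

def lot_distance(embedding_a, embedding_b):
    """Euclidean distance between two embeddings (approximates W2 in 1D)."""
    return float(np.linalg.norm(embedding_a - embedding_b))

# A translation by 3 should sit at Wasserstein-2 distance 3 from the original.
rng = np.random.default_rng(0)
base = rng.normal(size=2000)
shifted = base + 3.0

emb_base = lot_embed_1d(base)
emb_shifted = lot_embed_1d(shifted)
d = lot_distance(emb_base, emb_shifted)
```

Once each measure is a fixed-length vector, standard tools such as PCA or a linear classifier can be applied directly to the embeddings. In one dimension this embedding is isometric for the Wasserstein-2 distance; in higher dimensions the transport maps would have to be computed numerically, for example with the Sinkhorn algorithm or linear programming as the abstract mentions.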
Advisor: Alex Cloninger
May 29, 2024
12:30 PM
APM 6402
Research Areas
Statistics
****************************