University of California San Diego

****************************

Math 278C - Optimization and Data Science Seminar

Kaizheng Wang

Columbia University

Clustering via Uncoupled Regression

Abstract:

In this talk we consider a canonical clustering problem in which one receives unlabeled samples drawn from a balanced mixture of two elliptical distributions and seeks a classifier that estimates the labels. Many popular methods, including PCA and k-means, require the individual components of the mixture to be somewhat spherical and perform poorly when they are stretched. To overcome this issue, we propose a non-convex program that seeks an affine transform turning the data into a one-dimensional point cloud concentrated around -1 and 1, after which clustering becomes easy. Our theoretical contributions are twofold: (1) we show that the non-convex loss function exhibits desirable geometric properties when the sample size exceeds some constant multiple of the dimension, and (2) we leverage this to prove that an efficient first-order algorithm achieves near-optimal statistical precision without good initialization. We also propose a general methodology for clustering with flexible choices of feature transforms and loss objectives.
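
As a rough illustration of the idea in the abstract, the sketch below fits an affine map z = alpha + beta^T x by plain gradient descent so that the mapped points concentrate around -1 and 1, and then clusters by sign. The abstract does not specify the program, so everything concrete here is an assumption made for illustration: the quartic loss (z^2 - 1)^2 / 4, the balance penalty (mean z)^2 / 2, the per-coordinate standardization, the step size, the small fixed starting point, and the function names `make_stretched_mixture` and `fit_affine_map`. It should not be read as the speaker's actual algorithm.

```python
import numpy as np

def make_stretched_mixture(n=2000, seed=0):
    """Balanced two-component mixture that is stretched along the second axis."""
    rng = np.random.default_rng(seed)
    labels = rng.choice([-1, 1], size=n)
    # Component means sit at (-1, 0) and (+1, 0); the noise is small along
    # the separating axis and large along the orthogonal one.
    X = rng.normal(size=(n, 2)) * np.array([0.1, 5.0])
    X[:, 0] += labels
    return X, labels

def fit_affine_map(X, step=0.1, iters=500):
    """Gradient descent on an assumed loss:
    mean((z^2 - 1)^2) / 4 + (mean z)^2 / 2, with z = alpha + Xs @ beta."""
    # Standardizing each coordinate is itself an affine map, so the learned
    # transform is still affine in the original data (an assumed
    # preconditioning step that keeps the step size well behaved).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    n, d = Xs.shape
    alpha = 0.0
    beta = np.full(d, 0.1)  # small arbitrary start, not tuned to the data
    for _ in range(iters):
        z = alpha + Xs @ beta
        # n times the derivative of the loss with respect to each z_i
        g = (z ** 3 - z) + z.mean()
        alpha -= step * g.mean()
        beta -= step * (Xs.T @ g) / n
    return alpha + Xs @ beta

X, y = make_stretched_mixture()
z = fit_affine_map(X)
pred = np.sign(z)
# Cluster labels are only identified up to a global sign flip.
acc = max(np.mean(pred == y), np.mean(pred == -y))
print(f"clustering accuracy (up to label swap): {acc:.3f}")
```

The penalty term in this sketch discourages the trivial solution that maps every point to the same value (beta = 0, alpha = 1 or -1), which the quartic term alone would accept.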

Host: Jiawang Nie

March 3, 2021

2:00 PM

Meeting ID: 982 9781 6626 Password: 278cwn21

****************************