Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Graphs for Statistical Learning and Modeling

less than 1 minute read

Published:

I’ve passed my general exam with a presentation and report titled Graphs for Statistical Learning and Modeling. This incorporates both my research work on graph-assisted approaches to machine learning tasks such as clustering and regression, and the application of graph and network theory to epidemic modeling with contact networks.

portfolio

publications

On the translates of general dyadic systems on R

Published in Mathematische Annalen, 2020

Many techniques in harmonic analysis use the fact that a continuous object can be written as a sum (or an intersection) of dyadic counterparts, as long as those counterparts belong to an adjacent dyadic system. Here we generalize the notion of adjacent dyadic system and explore when it occurs, leading to some new and perhaps surprising classifications. In particular, we show that every dyadic grid is determined by two parameters, the shift and the location; moreover, two dyadic grids form an adjacent dyadic system if and only if their shifts and locations satisfy readily verifiable conditions.

Recommended citation: Anderson, T.C., Hu, B., Jiang, L. et al. On the translates of general dyadic systems on R. Math. Ann. 377, 911–933 (2020). https://doi.org/10.1007/s00208-019-01951-z https://link.springer.com/article/10.1007/s00208-019-01951-z

Skeleton Clustering: Dimension-Free Density-Aided Clustering

Published in Journal of the American Statistical Association, 2023

We introduce a density-based clustering method called skeleton clustering that can detect clusters in multivariate and even high-dimensional data with irregular shapes. To bypass the curse of dimensionality, we propose surrogate density measures that are less dependent on the dimension but have intuitive geometric interpretations. The clustering framework constructs a concise representation of the given data as an intermediate step and can be thought of as a combination of prototype methods, density-based clustering, and hierarchical clustering. We show by theoretical analysis and empirical studies that the skeleton clustering leads to reliable clusters in multivariate and high-dimensional scenarios.

Recommended citation: Zeyu Wei & Yen-Chi Chen (2023) Skeleton Clustering: Dimension-Free Density-Aided Clustering, Journal of the American Statistical Association, DOI: 10.1080/01621459.2023.2174122 https://www.tandfonline.com/doi/full/10.1080/01621459.2023.2174122

Skeleton Regression: A Graph-Based Approach to Estimation with Manifold Structure

Published in Preprint, 2023

We introduce a new regression framework designed to deal with large-scale, complex data that lies around a low-dimensional manifold. Our approach first constructs a graph representation, referred to as the skeleton, to capture the underlying geometric structure. We then define metrics on the skeleton graph and apply nonparametric regression techniques, along with feature transformations based on the graph, to estimate the regression function. In addition to the included nonparametric methods, we also discuss the limitations of some nonparametric regressors with respect to general metric spaces such as the skeleton graph. The proposed regression framework allows us to bypass the curse of dimensionality and has the additional advantages that it can handle the union of multiple manifolds and is robust to additive noise and noisy observations. We provide statistical guarantees for the proposed method and demonstrate its effectiveness through simulations and real data examples.

Recommended citation: https://arxiv.org/abs/2303.11786

talks

teaching

CSE 416: Introduction to Machine Learning

Undergraduate course, University of Washington, Department of Statistics, 2022

Provides a practical introduction to machine learning. Modules include regression, classification, clustering, retrieval, recommender systems, and deep learning, with a focus on an intuitive understanding grounded in real-world applications. Intelligent applications are designed and used to make predictions on large, complex datasets.

STAT 567: Statistical Analysis of Social Networks

Undergraduate course, University of Washington, Department of Statistics, 2022

Statistical and mathematical descriptions of social networks. Topics include graphical and matrix representations of social networks, sampling methods, statistical analysis of network data, and applications.