An Optimal Transport Approach for Selecting a Representative Subsample with Application in Efficient Kernel Density Estimation. Issue 1 (2nd January 2023)
- Record Type:
- Journal Article
- Title:
- An Optimal Transport Approach for Selecting a Representative Subsample with Application in Efficient Kernel Density Estimation. Issue 1 (2nd January 2023)
- Main Title:
- An Optimal Transport Approach for Selecting a Representative Subsample with Application in Efficient Kernel Density Estimation
- Authors:
- Zhang, Jingyi
Meng, Cheng
Yu, Jun
Zhang, Mengrui
Zhong, Wenxuan
Ma, Ping
- Abstract:
- Subsampling methods aim to select a subsample as a surrogate for the observed sample. Such methods have been used pervasively in large-scale data analytics, active learning, and privacy-preserving analysis in recent decades. In this article, instead of model-based methods, we study model-free subsampling methods, which aim to identify a subsample that is not confined by model assumptions. Existing model-free subsampling methods are usually built upon clustering techniques or kernel tricks. Most of these methods suffer from either a large computational burden or a theoretical weakness. In particular, the theoretical weakness is that the empirical distribution of the selected subsample may not necessarily converge to the population distribution. Such computational and theoretical limitations hinder the broad applicability of model-free subsampling methods in practice. We propose a novel model-free subsampling method by using optimal transport techniques. Moreover, we develop an efficient subsampling algorithm that is adaptive to the unknown probability density function. Theoretically, we show that the selected subsample can be used for efficient density estimation by deriving the convergence rate for the proposed subsample kernel density estimator. We also provide the optimal bandwidth for the proposed estimator. Numerical studies on synthetic and real-world datasets demonstrate the superior performance of the proposed method.
- Is Part Of:
- Journal of computational and graphical statistics. Volume 32:Issue 1(2023)
- Journal:
- Journal of computational and graphical statistics
- Issue:
- Volume 32:Issue 1(2023)
- Issue Display:
- Volume 32, Issue 1 (2023)
- Year:
- 2023
- Volume:
- 32
- Issue:
- 1
- Issue Sort Value:
- 2023-0032-0001-0000
- Page Start:
- 329
- Page End:
- 339
- Publication Date:
- 2023-01-02
- Subjects:
- Density estimation -- Inverse transform sampling -- Optimal transport -- Star discrepancy -- Subsampling
Mathematical statistics -- Data processing -- Periodicals
Mathematical statistics -- Graphic methods -- Periodicals
519.50285
- Journal URLs:
- http://pubs.amstat.org/loi/jcgs
http://www.catchword.com/titles/10857117.htm
http://www.tandf.co.uk/journals/titles/10618600.asp
http://www.tandfonline.com/
- DOI:
- 10.1080/10618600.2022.2084404
- Languages:
- English
- ISSNs:
- 1061-8600
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4963.451000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 26039.xml