Zixin Zhong


University of Alberta

Online machine learning, reinforcement learning, multi-armed bandits

Zixin Zhong was born in China in 1995. She is currently a postdoctoral fellow in the Department of Computing Science at the University of Alberta (UofA), where she is supervised by Prof. Csaba Szepesvári. Dr. Zhong received her PhD degree from the Department of Mathematics at the National University of Singapore (NUS) in October 2021. Dr. Zhong was privileged to be supervised by Prof. Vincent Y. F. Tan and Prof. Wang Chi Cheung during her PhD studies, and she worked with them as a research fellow between June 2021 and July 2022.

Dr. Zhong’s research interests are in reinforcement learning, online machine learning and, in particular, multi-armed bandits. Her work has been presented at top machine learning (ML) conferences, including ICML and AISTATS, and published in top ML journals such as the Journal of Machine Learning Research (JMLR). She also serves as a reviewer for several conferences and journals, including AISTATS, ICLR, ICML, NeurIPS, TIT, TSP, and TMLR. She was recognized as a top reviewer for NeurIPS 2022.

Almost Optimal Variance-Constrained Best Arm Identification

We design and analyze Variance-Aware-Lower and Upper Confidence Bound (VA-LUCB), a parameter-free algorithm for identifying the best arm in the fixed-confidence setup under a stringent constraint that the variance of the chosen arm is strictly smaller than a given threshold. An upper bound on VA-LUCB's sample complexity is shown to be characterized by a fundamental variance-aware hardness quantity H_VA. By proving an information-theoretic lower bound, we show that the sample complexity of VA-LUCB is optimal up to a factor logarithmic in H_VA. Extensive experiments corroborate the dependence of the sample complexity on the various terms in H_VA. A comparison of VA-LUCB's empirical performance with that of a close competitor, RiskAverse-UCB-BAI by David et al. [1], suggests that VA-LUCB has the lowest sample complexity for this class of risk-constrained best arm identification problems, especially for the riskiest instances.
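
To make the identification target concrete, here is a minimal Python sketch of what "best arm under a variance constraint" means when the true means and variances are known. It is not VA-LUCB itself, which has to learn these quantities from samples; the function name, argument names, and the numbers in the usage example are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def best_feasible_arm(
    means: Sequence[float],
    variances: Sequence[float],
    threshold: float,
) -> Optional[int]:
    """Return the index of the highest-mean arm whose variance is
    strictly below `threshold`, or None if no arm is feasible.

    Assumes the true means and variances are known; a bandit
    algorithm would have to estimate them from samples instead.
    """
    # An arm is feasible only if its variance is strictly below the threshold.
    feasible = [i for i, v in enumerate(variances) if v < threshold]
    if not feasible:
        return None
    # Among feasible arms, the target is the one with the largest mean.
    return max(feasible, key=lambda i: means[i])


# Hypothetical instance: arm 0 has the highest mean but is too risky,
# so the target is arm 1.
print(best_feasible_arm([0.9, 0.7, 0.5], [0.30, 0.10, 0.05], threshold=0.2))
```

In this example the arm with the highest mean violates the constraint, which is the kind of risky instance the abstract refers to: an algorithm has to rule out an attractive but infeasible arm before it can settle on the best feasible one.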