Internship

AbbVie Inc., Biostatistician, May-August 2020

Supervisors: Yunxia Sui, Tian Feng and Yiran (Bonnie) Hu

Worked on missing-data handling in clinical trials.

  • Designed and implemented simulations to systematically evaluate the performance of different multiple imputation (MI) methods (MCMC, monotone and FCS) in SAS PROC MI under the missing-at-random (MAR) assumption.

  • Evaluated the performance of tipping-point analysis and pattern-mixture models under the missing-not-at-random (MNAR) assumption.

Collaborative experience

Sep 2017-present: Provide statistical support to multiple principal investigators at different institutions.

Principal investigators: Steffi Oesterreich, Hung Jung Park and Liza Konnikova

  • Analyze various high-throughput multi-omics datasets, including metabolomics and RNA-Seq data, from preprocessing raw data with command-line tools to downstream analyses such as differential expression (DE) analysis, pathway enrichment analysis, dimension reduction, clustering analysis, and network analysis.

  • Contribute to multiple breast cancer projects, including meta-analytic outcome-guided clustering, analysis of clinicopathological characteristics and clinical outcomes using 20-year cancer registry data from UPMC, and the transcriptomic landscape with tumor immune infiltration.

  • Collaborate with investigators to prepare peer-reviewed publications and posters.

Conference Presentation

  • Li, Y, Zeng, X, Lin, C, Tseng, G. (2019) “Simultaneous Estimation of Number of Clusters and Feature Sparsity in Clustering High-Dimensional Data”, oral presentation at ENAR, Mar 25, 2019 (student award winner).


Selected Courses Taken

CMU ML10-725: Convex Optimization, Fall 2019

STAT2194: High-dimensional Statistics, Spring 2019

HUGEN2040: Molecular Basis for Inherited Disease, Fall 2019


Paper Review Experience

  • Interaction Screening by Kendall's Partial Correlation for Ultrahigh-dimensional Data with Survival Trait, Bioinformatics.
  • Bipartite Tight Spectral Clustering (BiTSC) Algorithm for Identifying Conserved Gene Co-clusters in Two Species, Bioinformatics.
  • A supervised weeding method for high dimensional variable clustering with application to job market analysis, Journal of Applied Statistics.
  • HDSI: High dimensional selection with interactions algorithm on feature selection and testing, PLOS ONE.