Available Positions

Graduate Research Assistants for Ph.D. Students (with or without dissertation mentorship)

  • Location: Houston (local), other U.S. institutions, or remote
  • Fields: Statistics, Computer Science, Engineering, Computational Biology, Quantitative Science

Research Interns for Ph.D. Students

  • Location: On-site or remote
  • Fields: Statistics, Computer Science, Engineering, Computational Biology, Quantitative Science

Postdoctoral Fellows

  • Candidates with backgrounds in Statistics, Computer Science, or Engineering and genomic research experience
  • Biologists with strong computational skills are also encouraged to apply
  • We are especially looking for candidates with experience developing new models, algorithms, and software

If you’re excited to contribute to cutting-edge research in computational biology and cancer, we would love to hear from you!

How to Apply

Please send the following materials to yzheng8 AT mdanderson.org:

  • CV/resume
  • Brief cover letter describing your relevant experience and motivations
  • Link to a GitHub repository or other materials demonstrating your programming skills
  • Related research manuscripts or writing samples (if applicable; required for postdoctoral candidates)

Hiring Projects (Updated August 2025)

1. [Epigenomics + Statistics/ML + Cancer Biology]

Our wet lab specializes in RNA Pol II profiling of formalin-fixed paraffin-embedded (FFPE) samples, a cost-effective and robust approach to generating critical data for cancer research and to motivating new association and prediction models for patient phenotypes. We have several computational projects associated with this new technology:

  • AI-based pathology annotation of histopathological images to identify tumor content and morphology
  • Statistical normalization models tailored to tumor tissues
  • Copy number variation calling from epigenomic tumor data
  • Association modeling between epigenomic markers and cancer phenotypes

2. [Epigenomics + Statistics + Immunology]

Single-cell epigenomic data are known for their extreme sparsity. Denoising and imputation models are needed to recover useful cell-level information and integrate it across epigenomic markers.

3. [3D genomics + Statistics]

This project investigates three-dimensional chromatin organization and long-range gene regulation through multimodal integrative modeling and accompanying software development, leveraging transcriptomic, epigenomic, and 3D genomic data.

4. [Proteomics + ML]

Cell surface protein measurements can provide deeper, standardized single-cell cell-type annotations and cell-state descriptions. The project integrates CITE-seq, flow cytometry, and spatial proteomics data across studies and platforms for joint disease analysis. LLM tools are used to accommodate the distinct characteristics of protein data on each platform. The project calls for machine learning and spatial image processing skills.