Zexin Cai (蔡泽鑫)

Johns Hopkins University

Zexin Cai is a postdoctoral research fellow at the Center for Language and Speech Processing (CLSP) at Johns Hopkins University, advised by Matthew and Nicholas. He received his PhD in Electrical and Computer Engineering from Duke University in 2023, supervised by Prof. Ming Li and Prof. Xin Li. His research interests include text-to-speech synthesis, voice conversion, and audio deepfake detection. Before joining Duke, he earned his Bachelor’s degree in Software Engineering from Sun Yat-sen University and worked as a research assistant at Duke Kunshan University. During his PhD studies, he completed an internship as an Applied Research Scientist at Microsoft. His work has appeared at ICASSP and Interspeech and in the journal Computer Speech & Language.

Selected Publications

  1. Privacy versus Emotion Preservation Trade-offs in Emotion-Preserving Speaker Anonymization
    Zexin Cai, Henry Li Xinyuan, Ashi Garg, and 5 more authors
    In IEEE Spoken Language Technology Workshop, 2024
  2. Integrating Frame-Level Boundary Detection and Deepfake Detection for Locating Manipulated Regions in Partially Spoofed Audio Forgery Attacks
    Computer Speech & Language, 2024
  3. Waveform Boundary Detection for Partially Spoofed Audio
    Zexin Cai, Weiqing Wang, and Ming Li
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
  4. Invertible Voice Conversion with Parallel Data
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
  5. Cross-Lingual Multi-Speaker Speech Synthesis with Limited Bilingual Training Data
    Zexin Cai, Yaogen Yang, and Ming Li
    Computer Speech & Language, 2023
  6. From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint
    Zexin Cai, Chuxiong Zhang, and Ming Li
    In Conference of the International Speech Communication Association (INTERSPEECH), 2020
  7. Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features
    Zexin Cai, Chuxiong Zhang, and Ming Li
    In Conference of the International Speech Communication Association (INTERSPEECH), 2019
  8. F0 Contour Estimation Using Phonetic Feature in Electrolaryngeal Speech Enhancement
    Zexin Cai, Zhicheng Xu, and Ming Li
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019