Publications

2024

  1. Integrating Frame-Level Boundary Detection and Deepfake Detection for Locating Manipulated Regions in Partially Spoofed Audio Forgery Attacks
    Computer Speech & Language, 2024
  2. Invertible Voice Conversion with Parallel Data
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

2023

  1. Waveform Boundary Detection for Partially Spoofed Audio
    Zexin Cai, Weiqing Wang, and Ming Li
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
  2. Cross-Lingual Multi-Speaker Speech Synthesis with Limited Bilingual Training Data
    Zexin Cai, Yaogen Yang, and Ming Li
    Computer Speech & Language, 2023
  3. Electrolaryngeal Speech Enhancement Based on A Two Stage Framework with Bottleneck Feature Refinement and Voice Conversion
    Yaogen Yang, Haozhe Zhang, Zexin Cai, and 6 more authors
    Biomedical Signal Processing and Control, 2023
  4. Identifying Source Speakers for Voice Conversion Based Spoofing Attacks on Speaker Verification Systems
    Danwei Cai, Zexin Cai, and Ming Li
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023

2022

  1. SIG-VC: A Speaker Information Guided Zero-Shot Voice Conversion System for Both Human Beings and Machines
    Haozhe Zhang, Zexin Cai, Xiaoyi Qin, and 1 more author
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022

2020

  1. From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint
    Zexin Cai, Chuxiong Zhang, and Ming Li
    In Conference of the International Speech Communication Association (INTERSPEECH), 2020

2019

  1. Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features
    Zexin Cai, Chuxiong Zhang, and Ming Li
    In Conference of the International Speech Communication Association (INTERSPEECH), 2019
  2. F0 Contour Estimation Using Phonetic Feature in Electrolaryngeal Speech Enhancement
    Zexin Cai, Zhicheng Xu, and Ming Li
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019

2018

  1. The DKU-JNU-EMA Electromagnetic Articulography Database on Mandarin and Chinese Dialects with Tandem Feature Based Acoustic-to-Articulatory Inversion
    Zexin Cai, Xiaoyi Qin, Danwei Cai, and 3 more authors
    In International Symposium on Chinese Spoken Language Processing (ISCSLP), 2018
  2. Insights into End-to-End Learning Scheme for Language Identification
    Weicheng Cai, Zexin Cai, Wenbo Liu, and 2 more authors
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018
  3. A Novel Learnable Dictionary Encoding Layer for End-to-End Language Identification
    Weicheng Cai, Zexin Cai, Xiang Zhang, and 2 more authors
    In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018
  4. Unsupervised Query by Example Spoken Term Detection Using Features Concatenated with Self-Organizing Map Distances
    Haiwei Wu, Ming Li, Zexin Cai, and 1 more author
    In International Symposium on Chinese Spoken Language Processing (ISCSLP), 2018
  5. End-to-end Language Identification Using NetFV and NetVLAD
    Jinkun Chen, Weicheng Cai, Danwei Cai, and 3 more authors
    In International Symposium on Chinese Spoken Language Processing (ISCSLP), 2018
  6. Deep Speaker Embeddings with Convolutional Neural Network on Supervector for Text-Independent Speaker Recognition
    Danwei Cai, Zexin Cai, and Ming Li
    In Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2018