
Mengyue WU 吴梦玥

Audio Analysis and Its Medical Implications


Hi hi! This is Mengyue 梦玥, interested in everything about human speech and perception/production. My current research mainly lies in Rich Audio Analysis, though I always reckon this is an invented term. It does not have a definition per se; I use it to refer to everything in audio processing excluding speech recognition. I'd like to stick to it: my very first student (co-supervised PhD, Heinrich) at SJTU has the funny username "richman", and the very first French sentence I learned is "Je suis riche" (pls blame Duolingo for boosting our ego)! Due to work requirements (probably existing only in my head), I'll present a very official biography below. Still, feel free to drop me a line if you share similar interests in or curiosity about audio/acoustics/language/psychology/psychiatry/neuroscience etc. Let's have a chat - it should be fun (not guaranteed on your side)!

Basic Information



Associate Professor

X-LANCE LAB

Department of Computer Science and Engineering

Shanghai Jiao Tong University

Address: 3-225 SEIEE Building, 800 Dongchuan Road, Shanghai 200240, China

Email: mengyuewu@sjtu.edu.cn

 

Teaching



AI3611 Deep Learning: A Practical Course on Perception and Cognition

Research



SJTU X-LANCE Lab (Cross Media Language Intelligence Lab, Shanghai Jiao Tong University), Rich Audio Research Group

Environment Sound:

  • Sound event and scene detection
  • Audio captioning, bridging the gap between audio analysis and natural language description
  • Audio-visual event detection

Human Speech: Medical Applications

  • Speech emotion analysis
  • Depression/Parkinson’s/Alzheimer’s disease detection
  • Acoustic-based disease diagnosis, e.g. coughing, voice, heart sounds…

Publications


Selected Journal Papers

  1. Zhi Chen, Yuncong Liu, Lu Chen, Su Zhu, Mengyue Wu and Kai Yu. OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue. Transactions of the Association for Computational Linguistics, 2022, zc825-chen-tacl2022.pdf
  2. Heinrich Dinkel, Shuai Wang, Xuenan Xu, Mengyue Wu and Kai Yu. Voice activity detection in the wild: A data-driven approach using teacher-student training. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 1542-1555, 2021, hedi7-dinkel-taslp2021-2.pdf
  3. Heinrich Dinkel, Mengyue Wu and Kai Yu. Towards Duration Robust Weakly Supervised Sound Event Detection. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 887-900, 2021, hedi7-dinkel-taslp2021.pdf

Selected Conference Papers

  1. Guangwei Li, Xuenan Xu, Mengyue Wu and Kai Yu. Category-Adapted Sound Event Enhancement with Weakly Labeled Data. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore, 2022, 851-855, category-adapted_sound_event_enhancement_with_weakly_labeled_data.pdf
  2. Guangwei Li, Xuenan Xu, Mengyue Wu and Kai Yu. Navigating Audio-Visual Event Detection Across Mismatched Modalities. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore, 2022, 1975-1979, navigating_audio-visual_event_detection_across_mismatched_modalities.pdf
  3. Siyu Lou, Xuenan Xu, Mengyue Wu and Kai Yu. Audio-Text Retrieval in Context. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore, 2022, 4793-4797, syl92-lou-icassp22.pdf
  4. Xuenan Xu, Mengyue Wu and Kai Yu. Diversity-controllable and Accurate Audio Captioning Based on Neural Condition. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore, 2022, 971-975, xnx98-xu-icassp22.pdf
  5. Wen Wu, Mengyue Wu and Kai Yu. Climate and Weather: Inspecting Depression Detection via Emotion Recognition. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore, 2022, 6262-6266, myw19-wu-icassp22-1.pdf
  6. Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu and Kenny Q. Zhu. Can Audio Captions Be Evaluated with Image Caption Metrics? IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Singapore, Singapore, 2022, 981-985, myw19-wu-icassp22-2.pdf
  7. Zhiling Zhang, Siyuan Chen, Mengyue Wu and Kenny Zhu. Symptom Identification for Interpretable Detection of Multiple Mental Disorders on Social Media. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), syc20-chen-emnlp22.pdf
  8. Zhiling Zhang, Siyuan Chen, Mengyue Wu and Kenny Zhu. Psychiatric Scale Guided Risky Post Screening for Early Detection of Depression. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), syc20-chen-ijcai22.pdf
  9. Binwei Yao, Chao Shi, Likai Zou, Lingfeng Dai, Mengyue Wu, Lu Chen, Zhen Wang and Kai Yu. D4: a Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022, 2438-2459, ybw00-yao-emnlp22.pdf
  10. Pingyue Zhang, Mengyue Wu, Heinrich Dinkel and Kai Yu. DEPA: Self-Supervised Audio Embedding for Depression Detection. In Proceedings of the 29th ACM International Conference on Multimedia (ACM-MM), Virtual Event, China, 2021, 135-143, myw19-wu-mm2021.pdf
  11. Zhiling Zhang, Zelin Zhou, Haifeng Tang, Guangwei Li, Mengyue Wu and Kenny Q. Zhu. Enriching Ontology with Temporal Commonsense for Low-Resource Audio Tagging. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM), Queensland, Australia, 2021, 3652-3656, myw19-wu-cikm2021.pdf
  12. Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu. A Lightweight Framework for Online Voice Activity Detection in the Wild. Proc. Interspeech 2021, 371-375, doi: 10.21437/Interspeech.2021-1977, xnx98-xu-is2021.pdf
  13. Zhi Chen, Lu Chen, Hanqi Li, Ruisheng Cao, Da Ma, Mengyue Wu and Kai Yu. Decoupled Dialogue Modeling and Semantic Parsing for Multi-Turn Text-to-SQL. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 3063–3074, August 1–6, 2021, 2021.findings-acl.270.pdf
  14. Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Zeyu Xie and Kai Yu. Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, Ontario, Canada, 2021, 905-909, xnx98-xu-icassp21-1.pdf
  15. Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu. Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, Ontario, Canada, 2021, 606-610, xnx98-xu-icassp21-2.pdf
  16. Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu. Audio Caption in a Car Setting with a Sentence-Level Loss. In The 12th International Symposium on Chinese Spoken Language Processing (ISCSLP), Hong Kong, China, 2021, 1-5, xnx98-xu-iscslp2021.pdf
  17. Yefei Chen, Heinrich Dinkel, Mengyue Wu and Kai Yu. Voice activity detection in the wild via weakly supervised sound event detection. In 21st Annual Conference of the International Speech Communication Association (Interspeech), Shanghai, China, 2020, 3665-3669, hedi7-dinkel-is2020.pdf
  18. Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu. A CRNN-GRU Based Reinforcement Learning Approach to Audio Captioning. In The 5th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), Tokyo, Japan, 2020, 225-229, xnx98-xu-dcase2020.pdf
  19. Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu and Weiyao Lin. Multiple Sound Sources Localization from Coarse to Fine. The European Conference on Computer Vision (ECCV), Glasgow, 2020, hedi7-dinkel-eccv2020.pdf
  20. Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu and Weiyao Lin. A Two-Stage Framework for Multiple Sound-Source Localization. CVPR Sight and Sound Workshop, 2020, hedi7_dinkel_cvprw2020.pdf
  21. Mengyue Wu, Heinrich Dinkel and Kai Yu. Audio Caption: Listen and Tell. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019, 830-834, myw19-wu-icassp2019.pdf