Adel Moumen

University of Cambridge
Department of Engineering
Cambridge, UK

CV
GitHub
Google Scholar
LinkedIn

am3303 [at] cam.ac.uk

ABOUT ME

I am a 24-year-old second-year PhD student at the University of Cambridge, supervised by Prof. Phil Woodland. I completed my Bachelor's and Master's degrees in computer science and AI with distinction, in a curriculum devoted to innovation and research, and earned a two-year entrepreneurship diploma in 2022. I also contribute professionally to the development of SpeechBrain, an all-in-one, open-source, PyTorch-based speech processing toolkit with more than 10,000 stars on GitHub, where I lead the core development efforts. In 2019, I began teaching myself deep learning and helped shape the largest French AI community.

RESEARCH INTERESTS

My research focuses on Speech Language Models (SLMs) that natively understand and generate speech, ideally bypassing intermediate text representations. The long-term vision is to develop fully speech-native agents capable of robust dialogue, reasoning, and paralinguistic expressivity—systems that could pass a 'Speech Turing Test' and approach the seamless, emotionally intelligent interaction depicted in 'Her.' Currently, I am developing a new architecture that operates directly over a high-level semantic workspace (similar to JEPA), rather than learning in the raw speech input domain.

OPEN SOURCE

I serve as a core maintainer of SpeechBrain, an open-source toolkit for speech processing research. My responsibilities include reviewing pull requests, addressing technical issues, and facilitating discussions within the research community. Currently, I focus on integrating Speech Language Model capabilities and maintaining the training infrastructure, with particular emphasis on distributed training workflows.

I also maintain a CUDA implementation of Li-GRU and SLi-GRU, hardware-efficient recurrent architectures that achieve strong RNN performance on ASR while offering significant speed improvements. This implementation provides researchers with an optimized CUDA/C++ backend with PyTorch bindings and torch.autograd support, enabling seamless integration into PyTorch codebases.

PUBLICATIONS

  • Cross-Lingual Interleaving for Spoken Language Models. ICASSP 2026 (accepted). [preprint]
  • Text-speech language models with improved cross-modal transfer by aligning abstraction levels. Preprint. [preprint]
  • Discrete Audio Tokens: More Than a Survey! TMLR 2025. [preprint]
  • Open-source conversational AI with SpeechBrain 1.0. JMLR 2024. [preprint]
  • Stabilising and accelerating light gated recurrent units for automatic speech recognition. ICASSP 2023. [preprint]

For a more complete list, see Google Scholar.