A paper accepted at Interspeech 2022
Our paper with Alan Zhou on modeling speech recognition and synthesis simultaneously has been accepted at Interspeech 2022.
The model learns lexical and sub-lexical (phonetic/n-gram) information without direct access to the training data.
One important finding: binary codes encode holistic (lexical) information, while individual bits encode featural (sublexical) information in an interpretable way (tested with a causal intervention technique).
Paper: arXiv

Modeling speech recognition and synthesis simultaneously.