Torbjørn Karl Svendsen
Background and activities
Torbjørn Svendsen (born 1955) is a Professor at the Department of Electronic Systems. Professor Svendsen holds an MScEE and a PhD, both from NTNU.
Fields of interest and present research activities
My research interest has from the outset in 1979 been speech signal processing. The first period focused on source coding, i.e. speech compression, which was also the subject of my doctoral thesis. Since the mid-1980s my research has been mainly on automatic speech recognition, but has also included areas such as spoken dialogue systems and speech synthesis. Speech analysis methods and lexical modelling, e.g. pronunciation modelling, have been two central areas. Because current approaches to speech recognition seem to be nearing a saturation point in terms of performance, a major activity in the last 5-year period has been to investigate new paradigms for speech recognition, aiming to integrate phonetic and linguistic knowledge in a statistical framework based on detection of (language-universal) phonetic features.
- NTNU (1979-1981 Research assistant, 1983-1984 doctoral fellowship, 1988-1995 Associate professor, 1995-present Professor), Director NTNU Digital (2015-2021)
- SINTEF (1981-1987, Research scientist)
- Research visits at AT&T Bell Labs, Murray Hill, NJ (1986-1987, 1990); Griffith University, Brisbane, Australia (1996-97); AT&T Labs, Florham Park, NJ (2000); Queensland University of Technology, Brisbane, Australia (2002-03); Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA (2013)
Peer review and professional evaluation work:
- Reviewer for international journals such as IEEE Transactions (Communications; Signal Processing; Audio, Speech and Language Processing; Multimedia); EURASIP Journal on Applied Signal Processing; Signal, Image and Video Processing; and Speech Communication, as well as for various conferences and workshops on speech and signal processing.
- Member of Speech Communication journal Editorial Board
- Reviewer for EU's Language Engineering program and the Information Society Research Programme of the Academy of Finland. Project reviews for the Norwegian, Australian, Swiss, Dutch, Belgian and South African Research Councils
- Opponent/member of examination boards for 26 doctoral theses
Membership in academic and professional committees
- Various appointments at the national level, e.g. in the Research Council of Norway, incl. grant committee member for the IKTPLUSS program, program board chair for the VERDIKT program, and in the Norwegian Language Council.
- Member of advisory board, Norwegian Language Bank (“Språkbanken”)
- Member of the technical committees of Eurospeech 2001 and Interspeech 2012, and of the organizing committee of Eurospeech 2001.
- Senior Member, IEEE Signal Processing Society Speech Technical Committee (1998-2001)
- Elected member, Norwegian Academy of Technological Sciences
- Vice president, International Speech Communication Association (ISCA)
Other professional merits
- Project manager, "Atomic Units for Language Universal Speech" (current); "Spoken dialog systems for telephony"; "Speech interfaces and reasoning systems"; "Norwegian corpus for language technology"; “Voice centric user interfaces for location based services”; “Tools for realistic speech synthesis in”; “Spoken Information Retrieval by Knowledge Utilization in Statistical Speech Processing”; “Rundkast – A transcribed broadcast news for applications in language technology” (past projects).
- Vice chair, COST action 278; WG chair COST actions 232 and 249; Advisory Scientific Board member, EU project ACORNS; Board member, Nordic Graduate School of Language Technology (former actions and activities)
- Previous NTNU appointments: Department Head, Department of Telecommunications; Vice Dean, Faculty of Electrical Engineering and Telecommunications; member of several NTNU committees
- 16 PhD students graduated (2 as co-supervisor). Currently supervising 2 PhD students.
- ~80 Master's degree students graduated
- ~85 papers in international journals and conferences
Scientific, academic and artistic work
A selection of recent journal publications, artistic productions, books, including book and report excerpts.
- (2021) Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP). vol. 30.
- (2021) Raw Speech-to-Articulatory Inversion by Temporal Filtering and Decimation. Interspeech.
- (2020) Transfer learning of articulatory information through phone information. Interspeech (USB).
- (2020) Sequence-to-sequence articulatory inversion through time convolution of sub-band frequency signals. Interspeech (USB).
- (2019) A Comparative Study of Deep Learning Techniques on Frame-Level Speech Data Classification. Circuits, Systems, and Signal Processing. vol. 38.
- (2019) A Phonetic-Level Analysis of Different Input Features for Articulatory Inversion. Interspeech (USB).
- (2014) An artificial neural network approach to automatic speech processing. Neurocomputing. vol. 140.
- (2013) Synthetic Speaker Models Using VTLN to Improve the Performance of Children in Mismatched Speaker Conditions for ASR. Interspeech (USB).
- (2012) Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data. IEEE Transactions on Audio, Speech, and Language Processing. vol. 20 (3).
- (2012) Universal attribute characterization of spoken languages for automatic spoken language recognition. Computer Speech and Language. vol. 27 (1).
- (2011) Pronunciation Variation Modeling of Non-Native Proper Names by Discriminative Tree Search. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.
- (2011) A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines. Interspeech.
- (2011) Frequency-Warped and Stabilized Time-Varying Cepstral Coefficients. Interspeech.
- (2011) iVector Approach to Phonotactic Language Recognition. Interspeech.
- (2010) A Minimum Classification Error Approach to Pronunciation Variation Modeling of Non-Native Proper Names. Interspeech.
- (2010) Exploiting Context-Dependency and Acoustic Resolution of Universal Speech Attribute Models in Spoken Language Recognition. Interspeech.
- (2010) Experimental Studies on Continuous Speech Recognition Using Neural Architectures with ‘Adaptive’ Hidden Activation Functions. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.
- (2010) Intra-Frame Variability As a Predictor of Frame Classifiability. Interspeech.
- (2009) Exploring Universal Attribute Characterization of Spoken Languages for Spoken Language Recognition. Interspeech.
- (2009) A Phonetic Feature Based Lattice Rescoring Approach to LVCSR. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing.