Torbjørn Karl Svendsen

Professor
Department of Electronic Systems

torbjorn.svendsen@ntnu.no
+4773591481 +4793080477 Elektro C, C335, Gløshaugen, O. S. Bragstads plass 2
About

Torbjørn Svendsen (born 1955) is a Professor Emeritus at the Department of Electronic Systems. He holds an MScEE and a PhD, both from NTNU. He is an ISCA Fellow and an IEEE Life Senior Member.

Fields of interest and present research activities

My research has centred on speech signal processing from the outset in 1979. The first period focused on source coding, i.e. speech compression, which was also the subject of my doctoral thesis. Since the mid-1980s my research has mainly concerned automatic speech recognition, but it has also included areas such as spoken dialogue systems and speech synthesis. Speech analysis methods and lexical modelling, e.g. pronunciation modelling, have been two central areas. Because current approaches to speech recognition seem to be nearing a saturation point in terms of performance, a major recent activity has been to investigate new paradigms for speech recognition, aiming to integrate phonetic and linguistic knowledge in a statistical framework based on the detection of (language-universal) phonetic features. Lately, the challenges of reliably recognizing children's speech and transcribing conversational, accented and dialectal speech have been central to my research.

Work experience

  • NTNU (1979-1981 Research assistant, 1983-1984 doctoral fellowship, 1988-1995 Associate professor, 1995-present Professor), Director NTNU Digital (2015-2021)
  • SINTEF (1981-1987, Research scientist)
  • Research visits at AT&T Bell Labs, Murray Hill, NJ (1986-1987, 1990); Griffith University, Brisbane, Australia (1996-97); AT&T Labs, Florham Park, NJ (2000); Queensland University of Technology, Brisbane, Australia (2002-03); Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA (2013); Delft University of Technology (2022); Kore University of Enna, Italy (2023)

Professional merits

Peer review and professional evaluation work:

  • Reviewer for international journals such as IEEE Transactions (Communications; Signal Processing; Audio, Speech and Language Processing; Multimedia); EURASIP Journal on Applied Signal Processing; Signal, Image and Video Processing; and Speech Communication, as well as for various conferences and workshops on speech and signal processing.
  • Member of Speech Communication journal Editorial Board
  • Reviewer for EU's Language Engineering program and the Information Society Research Programme of the Academy of Finland. Project reviews for the Norwegian, Australian, Swiss, Dutch, Belgian and South African Research Councils
  • Opponent/member of examination boards for 26 doctoral theses

Membership in academic and professional committees

  • Various appointments at the national level, e.g. in the Research Council of Norway, incl. grant committee member for the IKTPLUSS program, program board chair for the VERDIKT program, and in the Norwegian Language Council.
  • Member of advisory board, Norwegian Language Bank (“Språkbanken”)
  • Member of technical committees, Eurospeech 2001 and Interspeech 2012, and of the organizing committee of Eurospeech 2001.
  • Life Senior Member, IEEE
  • Member, Signal Processing Society Speech Technical Committee (1998-2001)
  • Elected member, Norwegian Academy of Technological Sciences
  • ISCA Fellow
  • Board of International Speech Communication Association (ISCA) (Member 2015-2017, Vice President 2017-2021, Board Secretary 2021-2023)

Other professional merits

  • Project manager: "Atomic Units for Language Universal Speech" (current); "Spoken dialog systems for telephony"; "Speech interfaces and reasoning systems"; "Norwegian corpus for language technology"; "Voice centric user interfaces for location based services"; "Tools for realistic speech synthesis in Norwegian"; "Spoken Information Retrieval by Knowledge Utilization in Statistical Speech Processing"; "Rundkast – A transcribed broadcast news for applications in language technology" (past projects).
  • Vice chair, COST action 278; WG chair COST actions 232 and 249; Advisory Scientific Board member, EU project ACORNS; Board member, Nordic Graduate School of Language Technology (former actions and activities)
  • Previous NTNU appointments: Department Head, Department of Telecommunications; Vice Dean, Faculty of Electrical Engineering and Telecommunications; member of several NTNU committees
  • 19 PhD students graduated (3 as co-supervisor). Currently supervising 5 PhD students.
  • ~100 Master degree students graduated
  • >100 papers in international journals and conferences

Competencies

  • Artificial intelligence
  • Biometry
  • Digital signal processing
  • Human-machine system
  • Language Technology
  • Language resources
  • Machine learning
  • Pattern Recognition
  • Signal processing
  • Speech recognition

Publications

2026

  • Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn; Salvi, Giampiero. (2026) Segmentation-Free Goodness of Pronunciation. IEEE Transactions on Audio, Speech and Language Processing
    Academic article
  • Parsons, Phoebe; Salvi, Giampiero; Svendsen, Torbjørn; Kvale, Knut. (2026) On Dialects and Speech Technology. Norges teknisk-naturvitenskapelige universitet
    Doctoral thesis

2025

  • Parsons, Phoebe Luree Turner; Bremnes, Heming Strømholt; Kvale, Knut; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Effects of Prosodic Information on Dialect Classification Using Whisper Features.
    Academic chapter
  • Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn. (2025) Improving Phone Recognition through Informed Initialization and Path-Aligned CTC Loss.
    Academic chapter
  • Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Child speech assessment through large language model speech synthesis: Preliminary results.
    Academic chapter
  • Dymbe, Simen; Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Using Cross-Attention for Conversational ASR over the Telephone.
    Academic chapter
  • Rugayan, Janine Lizbeth Cabrera; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2025) Optimizing ASR Models with Semantic Information.
    Academic chapter
  • Parsons, Phoebe Luree Turner; Solberg, Per Erik; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2025) Adding Metadata to Existing Parliamentary Speech Corpus.
    Academic chapter
  • Parsons, Phoebe Luree Turner; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2025) Match ‘em: Multi-Tiered Alignment for Error Analysis in ASR.
    Academic chapter

2024

  • Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2024) A Framework for Phoneme-Level Pronunciation Assessment Using CTC. Interspeech
    Academic article
  • Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2024) Towards Better Recognition of Spontaneous Children's Speech: Speaker-Clustering Fine-Tuning of Whisper. Machine Learning for Signal Processing
    Academic article
  • Quatra, Moreno La; Turco, Maria Francesca; Svendsen, Torbjørn Karl; Salvi, Giampiero; Orozco-Arroyave, Juan Rafael; Siniscalchi, Sabato Marco. (2024) Exploiting Foundation Models and Speech Enhancement for Parkinson’s Disease Detection from Speech in Real-World Operative Conditions. Interspeech
    Academic article
  • Kynych, Frantisek; Cerva, Petr; Zdansky, Jindrich; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2024) A lightweight approach to real-time speaker diarization: from audio toward audio-visual data streams. EURASIP Journal on Audio, Speech, and Music Processing
    Academic article
  • Olstad, Anne Marte Haug; Smolander, Anna; Strömbergsson, Sofia; Ylinen, Sari; Lehtonen, Minna; Kurimo, Mikko; Getman, Yaroslav; Grósz, Tamás; Cao, Xinwei; Svendsen, Torbjørn Karl. (2024) Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages. Proceedings of LREC
    Academic article

2023

  • Solberg, Per Erik; Ortiz Cabello, Pablo; Parsons, Phoebe; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) Improving Generalization of Norwegian ASR with Limited Linguistic Resources.
    Academic chapter
  • Parsons, Phoebe; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) A character-based analysis of impacts of dialects on end-to-end Norwegian ASR.
    Academic chapter
  • Getman, Yaroslav; Phan, Nhan; Al-Ghezi, Ragheb; Voskoboinik, Ekaterina; Singh, Mittul; Grosz, Tamas; Kurimo, Mikko; Salvi, Giampiero; Svendsen, Torbjørn Karl; Strombergsson, Sofia. (2023) Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children. IEEE Access
    Academic article
  • Rugayan, Janine Lizbeth Cabrera; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2023) Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation. Interspeech (USB)
    Academic article
  • Gelderblom, Femke Berre; Tronstad, Tron Vedul; Svendsen, Torbjørn Karl; Myrvoll, Tor Andre. (2023) On the Predictive Power of Objective Intelligibility Metrics for the Subjective Performance of Deep Complex Convolutional Recurrent Speech Enhancement Networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
    Academic article
  • Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2023) Using Modified Adult Speech as Data Augmentation for Child Speech Recognition. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
    Academic article
  • Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) An Analysis of Goodness of Pronunciation for Child Speech. Interspeech
    Academic article
  • Gelderblom, Femke Berre; Myrvoll, Tor Andre; Svendsen, Torbjørn Karl. (2023) Evaluating Performance Metrics for Deep Neural Network-based Speech Enhancement Systems. Norges teknisk-naturvitenskapelige universitet
    Doctoral thesis

2022

  • Kvale, Knut; Gulla, Jon Atle; Adde, Line; Solberg, Per Erik; Svendsen, Torbjørn Karl; Moshagen, Sjur Nørstebø; Wettre, Jonas Engestøl. (2022) Taleteknologi og kunstig intelligens. Teknologirådet
    Research report
  • Rugayan, Janine Lizbeth Cabrera; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2022) Semantically Meaningful Metrics for Norwegian ASR Systems. Interspeech (USB)
    Academic article
  • Getman, Yaroslav; Al-Ghezi, Ragheb; Voskoboinik, Ekaterina; Grósz, Tamás; Kurimo, Mikko; Salvi, Giampiero; Svendsen, Torbjørn Karl; Strömbergsson, Sofia. (2022) wav2vec2-based Speech Rating System for Children with Speech Sound Disorder. Interspeech (USB)
    Academic article

2021

  • Sabzi Shahrebabaki, Abdolreza; Salvi, Giampiero; Svendsen, Torbjørn Karl; Siniscalchi, Sabato Marco. (2021) Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
    Academic article
  • Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Sabato Marco; Svendsen, Torbjørn Karl. (2021) Raw Speech-to-Articulatory Inversion by Temporal Filtering and Decimation. Interspeech
    Academic article
  • Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Imran, Ali Shariq; Johnsen, Magne Hallstein; Siniscalchi, Sabato Marco; Svendsen, Torbjørn Karl. (2021) A Two-Stage Deep Modeling Approach to Articulatory Inversion.
    Academic chapter
  • Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Sabato Marco; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2021) A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion.
    Academic chapter

2020

  • Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Siniscalchi, Sabato Marco; Salvi, Giampiero; Svendsen, Torbjørn. (2020) Transfer learning of articulatory information through phone information. Interspeech (USB)
    Academic article
  • Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Marco; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2020) Sequence-to-sequence articulatory inversion through time convolution of sub-band frequency signals. Interspeech (USB)
    Academic article

2019

  • Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Imran, Ali Shariq; Siniscalchi, Sabato Marco; Svendsen, Torbjørn Karl. (2019) A Phonetic-Level Analysis of Different Input Features for Articulatory Inversion. Interspeech (USB)
    Academic article
  • Imran, Ali Shariq; Haflan, Vetle; Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Svendsen, Torbjørn Karl. (2019) Evaluating Acoustic Feature Maps in 2D-CNN for Speaker Identification.
    Academic chapter
  • Imran, Ali Shariq; Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Svendsen, Torbjørn Karl. (2019) A Study on the Performance Evaluation of Machine Learning Models for Phoneme Classification.
    Academic chapter
  • Imran, Ali Shariq; Kastrati, Zenun; Svendsen, Torbjørn Karl; Kurti, Arianit. (2019) Text-Independent Speaker ID for Automatic Video Lecture Classification Using Deep Learning.
    Academic chapter
  • Sabzi Shahrebabaki, Abdolreza; Imran, Ali Shariq; Olfati, Negar; Svendsen, Torbjørn Karl. (2019) A Comparative Study of Deep Learning Techniques on Frame-Level Speech Data Classification. Circuits, systems, and signal processing
    Academic article

2018

  • Sabzi Shahrebabaki, Abdolreza; Imran, Ali Shariq; Olfati, Negar; Svendsen, Torbjørn Karl. (2018) Acoustic Feature Comparison for Different Speaking Rates.
    Academic chapter

2015

  • Næss, Arild Brandrud; Svendsen, Torbjørn Karl; Livescu, Karen. (2015) Nearest Neighbor Frame Classification for Articulatory Speech Recognition. Norges teknisk-naturvitenskapelige universitet
    Doctoral thesis
  • Svendsen, Torbjørn Karl; Hamar, Jarle Bauck. (2015) Combining NdHMM and Phonetic Feature Detection for Speech Recognition.
    Academic chapter

2014

  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2014) An artificial neural network approach to automatic speech processing. Neurocomputing
    Academic article
  • Soufifar, Mehdi; Svendsen, Torbjørn; Burget, Lukas. (2014) Subspace Modeling of Discrete Features for Language Recognition. Norges teknisk-naturvitenskapelige universitet
    Doctoral thesis

2013

  • Hamar, Jarle Bauck; Doddipatla, Rama Sanand; Svendsen, Torbjørn; Sreenivas, Thippur. (2013) Non-Negative Durational HMM.
    Academic chapter
  • Doddipatla, Rama Sanand; Svendsen, Torbjørn. (2013) Synthetic Speaker Models Using VTLN to Improve the Performance of Children in Mismatched Speaker Conditions for ASR. Interspeech (USB)
    Academic article

2012

  • Siniscalchi, Sabato Marco; Reed, Jeremy; Svendsen, Torbjørn; Lee, Chin-Hui. (2012) Universal attribute characterization of spoken languages for automatic spoken language recognition. Computer Speech and Language
    Academic article
  • Siniscalchi, Sabato Marco; Lyu, DC; Svendsen, Torbjørn; Lee, CH. (2012) Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data. IEEE Transactions on Audio, Speech, and Language Processing
    Academic article

2011

  • Adde, Line; Svendsen, Torbjørn. (2011) Pronunciation Variation Modeling of Non-Native Proper Names by Discriminative Tree Search. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
    Academic article
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2011) A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines. Interspeech
    Academic article
  • Soufifar, Mehdi; Kockmann, Marcel; Burget, Lukas; Plchot, Oldrich; Glembek, Ondrej; Svendsen, Torbjørn. (2011) iVector Approach to Phonotactic Language Recognition. Interspeech
    Academic article
  • Skogstad, Trond; Svendsen, Torbjørn. (2011) Frequency-Warped and Stabilized Time-Varying Cepstral Coefficients. Interspeech
    Academic article

2010

  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Sorbello, Filippo; Lee, Chin-Hui. (2010) Experimental Studies on Continuous Speech Recognition Using Neural Architectures with ‘Adaptive’ Hidden Activation Functions. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
    Academic article
  • Adde, Line; Reveil, Bert; Martens, Jean-Pierre; Svendsen, Torbjørn. (2010) A Minimum Classification Error Approach to Pronunciation Variation Modeling of Non-Native Proper Names. Interspeech
    Academic article
  • Skogstad, Trond; Svendsen, Torbjørn. (2010) Intra-Frame Variability As a Predictor of Frame Classifiability. Interspeech
    Academic article
  • Siniscalchi, Sabato Marco; Reed, Jeremy; Svendsen, Torbjørn; Lee, Chin-Hui. (2010) Exploiting Context-Dependency and Acoustic Resolution of Universal Speech Attribute Models in Spoken Language Recognition. Interspeech
    Academic article
  • Adde, Line; Svendsen, Torbjørn. (2010) NameDat: A Database of English Proper Names Spoken by Native Norwegians.
    Academic chapter

2009

  • Mertens, Timo Pascal; Schneider, Daniel; Næss, Arild Brandrud; Svendsen, Torbjørn. (2009) Lexicon Adaptation for Subword Speech Recognition.
    Academic chapter
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2009) A Phonetic Feature Based Lattice Rescoring Approach to LVCSR. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
    Academic article
  • Siniscalchi, Sabato Marco; Reed, Jeremy; Svendsen, Torbjørn; Lee, Chin-Hui. (2009) Exploring Universal Attribute Characterization of Spoken Languages for Spoken Language Recognition. Interspeech
    Academic article

2008

  • Amdal, Ingunn; Strand, Ole Morten; Almberg, Jørn; Svendsen, Torbjørn. (2008) RUNDKAST: An Annotated Norwegian Broadcast News Speech Corpus.
    Academic chapter
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2008) A Penalized Logistic Regression Approach to Detection Based Phone Classification. Interspeech
    Academic article

2007

  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2007) Towards Bottom-Up Continuous Phone Recognition.
    Academic chapter

2006

  • Amdal, Ingunn; Svendsen, Torbjørn. (2006) FonDat1: A Speech Synthesis Corpus for Norwegian.
    Academic chapter
  • Amdal, Ingunn; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (2006) Log Likelihood Ratio Based Annotation Verification of a Norwegian Speech Synthesis Database.
    Academic chapter

2005

  • Bjørkan, Ingmund; Svendsen, Torbjørn; Farner, Snorre. (2005) Comparing Spectral Distance Measures for Join Cost Optimization in Concatenative Speech Synthesis. Eurospeech : Proceedings of the European Conference on Speech Communication and Technology
    Academic article
  • Skogstad, Trond; Svendsen, Torbjørn. (2005) Distributed ASR Using Speech Coder Data for Efficient Feature Vector Representation. Eurospeech : Proceedings of the European Conference on Speech Communication and Technology
    Academic article
  • Bjørkan, Ingmund; Svendsen, Torbjørn. (2005) Comparing Spectral Distance Measures for Join Cost Optimization. Eurospeech : Proceedings of the European Conference on Speech Communication and Technology
    Academic article
  • Amdal, Ingunn; Svendsen, Torbjørn. (2005) Unit Selection Synthesis Database Development Using Utterance Verification. Eurospeech : Proceedings of the European Conference on Speech Communication and Technology
    Academic article
  • Meen, Dyre; Svendsen, Torbjørn; Natvig, Jon-Emil. (2005) Improving Phone Label Alignment Accuracy by Utilizing Voicing Information.
    Academic chapter
  • Svendsen, Torbjørn; Amdal, Ingunn; Bjørkan, Ingmund; Meen, Dyre; Heggtveit, Per Olav; Natvig, Jon Emil. (2005) FONEMA - Tools for realistic speech synthesis in Norwegian.
    Academic chapter
  • Svendsen, Torbjørn; Egeberg, Andreas; Holter, Trym; Skogstad, Trond. (2005) VOCALS - Voice centric user interfaces for location based services.
    Academic chapter

2004

  • Nordgård, Torbjørn; Svendsen, Torbjørn; Harborg, Erik; Kvale, Knut. (2004) Language Technology Towards 2020.
    Academic chapter

2003

  • Svendsen, Torbjørn. (2003) Speech Technology: Past, Present and Future. Telektronikk
    Academic article

2002

  • Svendsen, Torbjørn. (2002) Roles for Speech And Language Technology in The Information Society.
    Academic chapter
  • Nordgård, Torbjørn; Svendsen, Torbjørn; Natvig, Jon Emil. (2002) Talsmann talesyntese som hjelpemiddel for dyslektikere. Telenor Communication AS
    Research report
  • Nordgård, Torbjørn; Svendsen, Torbjørn; Breivik, Torbjørg. (2002) Samling og tilgjengeleggjering av norske språkteknologiressursar. Norsk språkråd
    Research report

2001

  • Svendsen, Torbjørn. (2001) Nordisk forskningssamarbeid innen språkteknologi. Språknytt
    Popular science article

2000

  • Amdal, Ingunn; Holter, Trym; Svendsen, Torbjørn. (2000) Modellering av uttalevariasjon for automatisk talegjenkjenning. Nordlyd
    Academic article

1999

  • Svendsen, Torbjørn. (1999) Taleteknologi. Språk i Norden
    Academic article
  • Holter, Trym; Svendsen, Torbjørn. (1999) Maximum likelihood modelling of pronunciation variation. Speech Communication
    Academic article
  • Svendsen, Torbjørn; Johnsen, Magne Hallstein; Nordgård, Torbjørn; Hofland, Knut; Ore, Christian Emil. (1999) Nasjonalt korpus for språkteknologi - forprosjekt. Norges forskningsråd
    Research report

1998

  • Svendsen, Torbjørn. (1998) Blir norsk gresk for språkteknologien?. Språknytt
    Academic article

1995

  • Harborg, Erik; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1995) Talegjenkjenning II. SINTEF DELAB
    Research report
  • Harborg, Erik; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1995) Talegjenkjenning for teksting av direktesendte programmer - en studie. SINTEF DELAB
    Research report

1994

  • Svendsen, Torbjørn. (1994) Talebaserte brukergrensesnitt. NORSIGnalet : organ for NORSIG, Norsk forening for signalbehandling
    Popular science article

Journal publications

  • Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn; Salvi, Giampiero. (2026) Segmentation-Free Goodness of Pronunciation. IEEE Transactions on Audio, Speech and Language Processing
    Academic article
  • Svendsen, Torbjørn. (1999) Taleteknologi. Språk i Norden
    Academic article
  • Holter, Trym; Svendsen, Torbjørn. (1999) Maximum likelihood modelling of pronunciation variation. Speech Communication
    Academic article
  • Siniscalchi, Sabato Marco; Reed, Jeremy; Svendsen, Torbjørn; Lee, Chin-Hui. (2012) Universal attribute characterization of spoken languages for automatic spoken language recognition. Computer Speech and Language
    Academic article
  • Bjørkan, Ingmund; Svendsen, Torbjørn; Farner, Snorre. (2005) Comparing Spectral Distance Measures for Join Cost Optimization in Concatenative Speech Synthesis. Eurospeech : Proceedings of the European Conference on Speech Communication and Technology
    Academic article
  • Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Imran, Ali Shariq; Siniscalchi, Sabato Marco; Svendsen, Torbjørn Karl. (2019) A Phonetic-Level Analysis of Different Input Features for Articulatory Inversion. Interspeech (USB)
    Academic article
  • Skogstad, Trond; Svendsen, Torbjørn. (2005) Distributed ASR Using Speech Coder Data for Efficient Feature Vector Representation. Eurospeech : Proceedings of the European Conference on Speech Communication and Technology
    Academic article
  • Svendsen, Torbjørn. (2001) Nordisk forskningssamarbeid innen språkteknologi. Språknytt
    Popular science article
  • Sabzi Shahrebabaki, Abdolreza; Salvi, Giampiero; Svendsen, Torbjørn Karl; Siniscalchi, Sabato Marco. (2021) Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
    Academic article
  • Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Sabato Marco; Svendsen, Torbjørn Karl. (2021) Raw Speech-to-Articulatory Inversion by Temporal Filtering and Decimation. Interspeech
    Academic article
  • Bjørkan, Ingmund; Svendsen, Torbjørn. (2005) Comparing Spectral Distance Measures for Join Cost Optimization. Eurospeech : Proceedings of the European Conference on Speech Communication and Technology
    Academic article
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2014) An artificial neural network approach to automatic speech processing. Neurocomputing
    Academic article
  • Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2024) A Framework for Phoneme-Level Pronunciation Assessment Using CTC. Interspeech
    Academic article
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2009) A Phonetic Feature Based Lattice Rescoring Approach to LVCSR. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
    Academic article
  • Siniscalchi, Sabato Marco; Reed, Jeremy; Svendsen, Torbjørn; Lee, Chin-Hui. (2009) Exploring Universal Attribute Characterization of Spoken Languages for Spoken Language Recognition. Interspeech
    Academic article
  • Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2024) Towards Better Recognition of Spontaneous Children's Speech: Speaker-Clustering Fine-Tuning of Whisper. Machine Learning for Signal Processing
    Academic article
  • Quatra, Moreno La; Turco, Maria Francesca; Svendsen, Torbjørn Karl; Salvi, Giampiero; Orozco-Arroyave, Juan Rafael; Siniscalchi, Sabato Marco. (2024) Exploiting Foundation Models and Speech Enhancement for Parkinson’s Disease Detection from Speech in Real-World Operative Conditions. Interspeech
    Academic article
  • Adde, Line; Svendsen, Torbjørn. (2011) Pronunciation Variation Modeling of Non-Native Proper Names by Discriminative Tree Search. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
    Academic article
  • Getman, Yaroslav; Phan, Nhan; Al-Ghezi, Ragheb; Voskoboinik, Ekaterina; Singh, Mittul; Grosz, Tamas; Kurimo, Mikko; Salvi, Giampiero; Svendsen, Torbjørn Karl; Strombergsson, Sofia. (2023) Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children. IEEE Access
    Academic article
  • Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Siniscalchi, Sabato Marco; Salvi, Giampiero; Svendsen, Torbjørn. (2020) Transfer learning of articulatory information through phone information. Interspeech (USB)
    Academic article
  • Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Marco; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2020) Sequence-to-sequence articulatory inversion through time convolution of sub-band frequency signals. Interspeech (USB)
    Academic article
  • Rugayan, Janine Lizbeth Cabrera; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2022) Semantically Meaningful Metrics for Norwegian ASR Systems. Interspeech (USB)
    Academic article
  • Rugayan, Janine Lizbeth Cabrera; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2023) Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation. Interspeech (USB)
    Academic article
  • Siniscalchi, Sabato Marco; Lyu, DC; Svendsen, Torbjørn; Lee, CH. (2012) Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data. IEEE Transactions on Audio, Speech, and Language Processing
    Academic article
  • Svendsen, Torbjørn. (1994) Talebaserte brukergrensesnitt. NORSIGnalet : organ for NORSIG, Norsk forening for signalbehandling
    Popular science article
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Sorbello, Filippo; Lee, Chin-Hui. (2010) Experimental Studies on Continuous Speech Recognition Using Neural Architectures with ‘Adaptive’ Hidden Activation Functions. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
    Academic article
  • Adde, Line; Reveil, Bert; Martens, Jean-Pierre; Svendsen, Torbjørn. (2010) A Minimum Classification Error Approach to Pronunciation Variation Modeling of Non-Native Proper Names. Interspeech
    Academic article
  • Skogstad, Trond; Svendsen, Torbjørn. (2010) Intra-Frame Variability As a Predictor of Frame Classifiability. Interspeech
    Academic article
  • Siniscalchi, Sabato Marco; Reed, Jeremy; Svendsen, Torbjørn; Lee, Chin-Hui. (2010) Exploiting Context-Dependency and Acoustic Resolution of Universal Speech Attribute Models in Spoken Language Recognition. Interspeech
    Academic article
  • Sabzi Shahrebabaki, Abdolreza; Imran, Ali Shariq; Olfati, Negar; Svendsen, Torbjørn Karl. (2019) A Comparative Study of Deep Learning Techniques on Frame-Level Speech Data Classification. Circuits, systems, and signal processing
    Academic article
  • Amdal, Ingunn; Holter, Trym; Svendsen, Torbjørn. (2000) Modellering av uttalevariasjon for automatisk talegjenkjenning. Nordlyd
    Academic article
  • Gelderblom, Femke Berre; Tronstad, Tron Vedul; Svendsen, Torbjørn Karl; Myrvoll, Tor Andre. (2023) On the Predictive Power of Objective Intelligibility Metrics for the Subjective Performance of Deep Complex Convolutional Recurrent Speech Enhancement Networks. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
    Academic article
  • Getman, Yaroslav; Al-Ghezi, Ragheb; Voskoboinik, Ekaterina; Grósz, Tamás; Kurimo, Mikko; Salvi, Giampiero; Svendsen, Torbjørn Karl; Strömbergsson, Sofia. (2022) wav2vec2-based Speech Rating System for Children with Speech Sound Disorder. Interspeech (USB)
    Academic article
  • Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2023) Using Modified Adult Speech as Data Augmentation for Child Speech Recognition. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
    Academic article
  • Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) An Analysis of Goodness of Pronunciation for Child Speech. Interspeech
    Academic article
  • Svendsen, Torbjørn. (2003) Speech Technology: Past, Present and Future. Telektronikk
    Academic article
  • Amdal, Ingunn; Svendsen, Torbjørn. (2005) Unit Selection Synthesis Database Development Using Utterance Verification. Eurospeech : Proceedings of the European Conference on Speech Communication and Technology
    Academic article
  • Kynych, Frantisek; Cerva, Petr; Zdansky, Jindrich; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2024) A lightweight approach to real-time speaker diarization: from audio toward audio-visual data streams. EURASIP Journal on Audio, Speech, and Music Processing
    Academic article
  • Svendsen, Torbjørn. (1998) Blir norsk gresk for språkteknologien?. Språknytt
    Academic article
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2008) A Penalized Logistic Regression Approach to Detection Based Phone Classification. Interspeech
    Academic article
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2011) A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines. Interspeech
    Academic article
  • Soufifar, Mehdi; Kockmann, Marcel; Burget, Lukas; Plchot, Oldrich; Glembek, Ondrej; Svendsen, Torbjørn. (2011) iVector Approach to Phonotactic Language Recognition. Interspeech
    Academic article
  • Skogstad, Trond; Svendsen, Torbjørn. (2011) Frequency-Warped and Stabilized Time-Varying Cepstral Coefficients. Interspeech
    Academic article
  • Olstad, Anne Marte Haug; Smolander, Anna; Strömbergsson, Sofia; Ylinen, Sari; Lehtonen, Minna; Kurimo, Mikko; Getman, Yaroslav; Grósz, Tamás; Cao, Xinwei; Svendsen, Torbjørn Karl. (2024) Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages. Proceedings of LREC
    Academic article
  • Doddipatla, Rama Sanand; Svendsen, Torbjørn. (2013) Synthetic Speaker Models Using VTLN to Improve the Performance of Children in Mismatched Speaker Conditions for ASR. Interspeech (USB)
    Academic article

Part of book/report

  • Parsons, Phoebe Luree Turner; Bremnes, Heming Strømholt; Kvale, Knut; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Effects of Prosodic Information on Dialect Classification Using Whisper Features.
    Academic chapter
  • Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn. (2025) Improving Phone Recognition through Informed Initialization and Path-Aligned CTC Loss.
    Academic chapter
  • Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Child speech assessment through large language model speech synthesis: Preliminary results.
    Academic chapter
  • Dymbe, Simen; Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Using Cross-Attention for Conversational ASR over the Telephone.
    Academic chapter
  • Rugayan, Janine Lizbeth Cabrera; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2025) Optimizing ASR Models with Semantic Information.
    Academic chapter
  • Amdal, Ingunn; Svendsen, Torbjørn. (2006) FonDat1: A Speech Synthesis Corpus for Norwegian.
    Academic chapter
  • Hamar, Jarle Bauck; Doddipatla, Rama Sanand; Svendsen, Torbjørn; Sreenivas, Thippur. (2013) Non-Negative Durational HMM.
    Academic chapter
  • Amdal, Ingunn; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (2006) Log Likelihood Ratio Based Annotation Verification of a Norwegian Speech Synthesis Database.
    Academic chapter
  • Mertens, Timo Pascal; Schneider, Daniel; Næss, Arild Brandrud; Svendsen, Torbjørn. (2009) Lexicon Adaptation for Subword Speech Recognition.
    Academic chapter
  • Sabzi Shahrebabaki, Abdolreza; Imran, Ali Shariq; Olfati, Negar; Svendsen, Torbjørn Karl. (2018) Acoustic Feature Comparison for Different Speaking Rates.
    Academic chapter
  • Amdal, Ingunn; Strand, Ole Morten; Almberg, Jørn; Svendsen, Torbjørn. (2008) RUNDKAST: An Annotated Norwegian Broadcast News Speech Corpus.
    Academic chapter
  • Solberg, Per Erik; Ortiz Cabello, Pablo; Parsons, Phoebe; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) Improving Generalization of Norwegian ASR with Limited Linguistic Resources.
    Academic chapter
  • Svendsen, Torbjørn. (2002) Roles for Speech And Language Technology in The Information Society.
    Academic chapter
  • Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Imran, Ali Shariq; Johnsen, Magne Hallstein; Siniscalchi, Sabato Marco; Svendsen, Torbjørn Karl. (2021) A Two-Stage Deep Modeling Approach to Articulatory Inversion.
    Academic chapter
  • Imran, Ali Shariq; Haflan, Vetle; Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Svendsen, Torbjørn Karl. (2019) Evaluating Acoustic Feature Maps in 2D-CNN for Speaker Identification.
    Academic chapter
  • Parsons, Phoebe; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) A character-based analysis of impacts of dialects on end-to-end Norwegian ASR.
    Academic chapter
  • Imran, Ali Shariq; Sabzi Shahrebabaki, Abdolreza; Olfati, Negar; Svendsen, Torbjørn Karl. (2019) A Study on the Performance Evaluation of Machine Learning Models for Phoneme Classification.
    Academic chapter
  • Imran, Ali Shariq; Kastrati, Zenun; Svendsen, Torbjørn Karl; Kurti, Arianit. (2019) Text-Independent Speaker ID for Automatic Video Lecture Classification Using Deep Learning.
    Academic chapter
  • Nordgård, Torbjørn; Svendsen, Torbjørn; Harborg, Erik; Kvale, Knut. (2004) Language Technology Towards 2020.
    Academic chapter
  • Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2007) Towards Bottom-Up Continuous Phone Recognition.
    Academic chapter
  • Adde, Line; Svendsen, Torbjørn. (2010) NameDat: A Database of English Proper Names Spoken by Native Norwegians.
    Academic chapter
  • Parsons, Phoebe Luree Turner; Solberg, Per Erik; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2025) Adding Metadata to Existing Parliamentary Speech Corpus.
    Academic chapter
  • Parsons, Phoebe Luree Turner; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2025) Match ‘em: Multi-Tiered Alignment for Error Analysis in ASR.
    Academic chapter
  • Svendsen, Torbjørn Karl; Hamar, Jarle Bauck. (2015) Combining NdHMM and Phonetic Feature Detection for Speech Recognition.
    Academic chapter
  • Sabzi Shahrebabaki, Abdolreza; Siniscalchi, Sabato Marco; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2021) A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion.
    Academic chapter
  • Meen, Dyre; Svendsen, Torbjørn; Natvig, Jon-Emil. (2005) Improving Phone Label Alignment Accuracy by Utilizing Voicing Information.
    Academic chapter
  • Svendsen, Torbjørn; Amdal, Ingunn; Bjørkan, Ingmund; Meen, Dyre; Heggtveit, Per Olav; Natvig, Jon Emil. (2005) FONEMA - Tools for realistic speech synthesis in Norwegian.
    Academic chapter
  • Svendsen, Torbjørn; Egeberg, Andreas; Holter, Trym; Skogstad, Trond. (2005) VOCALS - Voice centric user interfaces for location based services.
    Academic chapter

Report

  • Kvale, Knut; Gulla, Jon Atle; Adde, Line; Solberg, Per Erik; Svendsen, Torbjørn Karl; Moshagen, Sjur Nørstebø; Wettre, Jonas Engestøl. (2022) Taleteknologi og kunstig intelligens. Teknologirådet
    Research report
  • Nordgård, Torbjørn; Svendsen, Torbjørn; Natvig, Jon Emil. (2002) Talsmann talesyntese som hjelpemiddel for dyslektikere. Telenor Communication AS
    Research report
  • Harborg, Erik; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1995) Talegjenkjenning II. SINTEF DELAB
    Research report
  • Nordgård, Torbjørn; Svendsen, Torbjørn; Breivik, Torbjørg. (2002) Samling og tilgjengeleggjering av norske språkteknologiressursar. Norsk språkråd
    Research report
  • Harborg, Erik; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1995) Talegjenkjenning for teksting av direktesendte programmer - en studie. SINTEF DELAB
    Research report
  • Svendsen, Torbjørn; Johnsen, Magne Hallstein; Nordgård, Torbjørn; Hofland, Knut; Ore, Christian Emil. (1999) Nasjonalt korpus for språkteknologi - forprosjekt. Norges forskningsråd
    Research report

Student thesis or dissertation

  • Parsons, Phoebe; Salvi, Giampiero; Svendsen, Torbjørn; Kvale, Knut. (2026) On Dialects and Speech Technology. Norges teknisk-naturvitenskapelige universitet
    Doctoral thesis
  • Næss, Arild Brandrud; Svendsen, Torbjørn Karl; Livescu, Karen. (2015) Nearest Neighbor Frame Classification for Articulatory Speech Recognition. Norges teknisk-naturvitenskapelige universitet
    Doctoral thesis
  • Soufifar, Mehdi; Svendsen, Torbjørn; Burget, Lukas. (2014) Subspace Modeling of Discrete Features for Language Recognition. Norges teknisk-naturvitenskapelige universitet
    Doctoral thesis
  • Gelderblom, Femke Berre; Myrvoll, Tor Andre; Svendsen, Torbjørn Karl. (2023) Evaluating Performance Metrics for Deep Neural Network-based Speech Enhancement Systems. Norges teknisk-naturvitenskapelige universitet
    Doctoral thesis

Outreach

2025

  • Conference lecture
    Parsons, Phoebe Luree Turner; Solberg, Per Erik; Kvale, Knut; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Adding Metadata to Existing Parliamentary Speech Corpus. Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
  • Conference lecture
    Parsons, Phoebe Luree Turner; Kvale, Knut; Svendsen, Torbjørn. (2025) Match ‘em: Multi-Tiered Alignment for Error Analysis in ASR. Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
  • Conference lecture
    Parsons, Phoebe Luree Turner; Bremnes, Heming Strømholt; Kvale, Knut; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Effects of Prosodic Information on Dialect Classification Using Whisper Features. Interspeech 2025
  • Conference lecture
    Rugayan, Janine Lizbeth Cabrera; Salvi, Giampiero; Svendsen, Torbjørn. (2025) Optimizing ASR Models with Semantic Information. Text, Speech and Dialogue
  • Conference lecture
    Dymbe, Simen; Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Using Cross-Attention for Conversational ASR over the Telephone. Text, Speech and Dialogue
  • Conference lecture
    Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn. (2025) Improving Phone Recognition through Informed Initialization and Path-Aligned CTC Loss. 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP)
  • Conference lecture
    Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Child speech assessment through large language model speech synthesis: Preliminary results. 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP)

2024

  • Conference lecture
    Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2024) Towards Better Recognition of Spontaneous Children's Speech: Speaker-Clustering Fine-Tuning of Whisper. Machine Learning for Signal Processing
  • Conference lecture
    Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2024) Framework for Phoneme-Level Pronunciation Assessment Using CTC. Interspeech
  • Conference lecture
    Parsons, Phoebe Luree Turner; Bremnes, Heming Strømholt; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2024) Norwegian dialect identification: is prosody enough?. Fonetik
  • Lecture
    Svendsen, Torbjørn Karl. (2024) Kunstig intelligens - hva, hvorfor, hvordan. Folkeakademiet
  • Lecture
    Svendsen, Torbjørn Karl. (2024) Hva er kunstig intelligens? Muligheter for KI i eiendomsbransjen. Internseminar
  • Lecture
    Svendsen, Torbjørn Karl. (2024) Machines may "think" - but can they master the spoken language?. Friday talk
  • Lecture
    Svendsen, Torbjørn Karl. (2024) What is spoken language technology?. From Toys to Tools to Terror(ist?) in a decade

2023

  • Conference lecture
    Parsons, Phoebe Luree Turner; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) A character-based analysis of impacts of dialects on end-to-end Norwegian ASR. 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
  • Conference lecture
    Rugayan, Janine Lizbeth Cabrera; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2023) Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation. Interspeech
  • Conference lecture
    Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2023) Using Modified Adult Speech as Data Augmentation for Child Speech Recognition. ICASSP
  • Conference lecture
    Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) An Analysis of Goodness of Pronunciation for Child Speech. Interspeech
  • Conference lecture
    Svendsen, Torbjørn Karl. (2023) Joint MAP of Direct and Indirect Adaptation. Symposium for Celebrating 40 Years of Bayesian Learning in Speech and Language Processing and Beyond
  • Conference lecture
    Svendsen, Torbjørn Karl. (2023) Combining direct and indirect adaptation for speech recognition. Seminar on speech technology
  • Conference lecture
    Svendsen, Torbjørn Karl. (2023) Speech Signal Processing. Speech DSP
  • Conference lecture
    Solberg, Per Erik; Ortiz Cabello, Pablo; Parsons, Phoebe Luree Turner; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) Improving Generalization of Norwegian ASR with Limited Linguistic Resources. 24th Nordic Conference on Computational Linguistics (NoDaLiDa)

2022

  • Conference lecture
    Getman, Yaroslav; Al-Ghezi, Ragheb; Voskoboinik, Ekaterina; Grósz, Tamás; Kurimo, Mikko; Salvi, Giampiero; Svendsen, Torbjørn Karl; Strömbergsson, Sofia. (2022) wav2vec2-based Speech Rating System for Children with Speech Sound Disorder. Interspeech
  • Conference lecture
    Rugayan, Janine Lizbeth Cabrera; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2022) Semantically Meaningful Metrics for Norwegian ASR Systems. Interspeech

2018

  • Lecture
    Øien, Geir Egil Dahle; Mengshoel, Ole Jakob; Ramampiaro, Heri; Svendsen, Torbjørn Karl. (2018) NTNUs strategiske satsing på kunstig intelligens (AI) – bakgrunn, aktiviteter og fremtidsvyer. Medlemsmøte, Det Kongelige Norske Vitenskapers Selskap

2012

  • Interview
    Svendsen, Torbjørn. (2012) Data med barnestemme.

2011

  • Conference lecture
    Rodriguez-Fuentes, Luis Javier; Penagarikano, Mikel; Varona, Amparo; Diez, Mireia; Bordel, German; Martinez, David; Villalba, Jesus; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. (2011) Multi-Site Heterogeneous System Fusions for the Albayzin 2010 Language Recognition Evaluation. Automatic Speech Recognition and Understanding
  • Feature article
    Kvale, Knut; Nordgård, Torbjørn; Svendsen, Torbjørn; Lyse, Gunn Inger; Gjesdal, Anje Müller. (2011) Datamaskinen må skjønne norsk.
  • Lecture
    Svendsen, Torbjørn. (2011) Hva er det med tale? Forskningsutfordringer og aktiviteter innen taleteknologi. På snakkis med teknologien
  • Conference lecture
    Svendsen, Torbjørn. (2011) Universal Speech Attribute Characterization for Automatic Speech Recognition and Spoken Language Recognition. CSAIL Seminar

2010

  • Conference lecture
    Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Sorbello, Filippo; Lee, Chin-Hui. (2010) Experimental Studies on Continuous Speech Recognition Using Neural Architectures with ‘Adaptive’ Hidden Activation Functions. ICASSP 2010
  • Conference lecture
    Saeidi, Rahim; Soufifar, Mehdi; Kinnunen, Tomi; Svendsen, Torbjørn; Fränti, Pasi. (2010) UEF-NTNU System Description for Albayzin 2010 Language Recognition Evaluation. FALA 2010
  • Conference lecture
    Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2010) A Survey on Recent Progress in the ASAT/SIRKUS Paradigm. ISCSLP 2010
  • Conference lecture
    Siniscalchi, Sabato Marco; Reed, Jeremy; Svendsen, Torbjørn; Lee, Chin-Hui. (2010) Exploiting Context-Dependency and Acoustic Resolution of Universal Speech Attribute Models in Spoken Language Recognition. Interspeech 2010
  • Conference lecture
    Skogstad, Trond; Svendsen, Torbjørn. (2010) Intra-Frame Variability As a Predictor of Frame Classifiability. Interspeech 2010
  • Conference lecture
    Sikveland, Rein Ove; Öttl, Anton; Amdal, Ingunn; Ernestus, Mirjam; Svendsen, Torbjørn; Edlund, Jens. (2010) Spontal-N: A Corpus of Interactional Spoken Norwegian. LREC
  • Conference lecture
    Adde, Line; Reveil, Bert; Martens, Jean-Pierre; Svendsen, Torbjørn. (2010) A Minimum Classification Error Approach to Pronunciation Variation Modeling of Non-Native Proper Names. Interspeech 2010
  • Conference lecture
    Meen, Dyre; Svendsen, Torbjørn. (2010) The NTNU Concatenative Speech Synthesizer. Blizzard Challenge Workshop
  • Conference lecture
    Adde, Line; Svendsen, Torbjørn. (2010) NameDat: A Database of English Proper Names Spoken by Native Norwegians. LREC
  • Conference lecture
    Adde, Line; Svendsen, Torbjørn. (2010) A Comparative Analysis of Discriminative and Non-Discriminative Pronunciation Priors in Pronunciation Variation Modeling. IEEE Workshop on Spoken Language Technology 2010

2009

  • Interview
    Svendsen, Torbjørn. (2009) Språkteknologien gjør fremskritt igjen.
  • Interview
    Svendsen, Torbjørn. (2009) VERDIKT på Forskningsdagene.
  • Conference lecture
    Siniscalchi, Sabato Marco; Reed, Jeremy; Svendsen, Torbjørn; Lee, Chin-Hui. (2009) Exploring Universal Attribute Characterization of Spoken Languages for Spoken Language Recognition. Interspeech
  • Conference lecture
    Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2009) A Phonetic Feature Based Lattice Rescoring Approach to LVCSR. IEEE International Conference on Acoustics, Speech and Signal Processing

2008

  • Interview
    Svendsen, Torbjørn. (2008) Norsk talesyntese.
  • Interview
    Svendsen, Torbjørn. (2008) Taleteknologi.
  • Conference lecture
    Amdal, Ingunn; Strand, Ole Morten; Almberg, Jørn; Svendsen, Torbjørn. (2008) RUNDKAST: An Annotated Norwegian Broadcast News Speech Corpus. LREC 2008
  • Conference lecture
    Amdal, Ingunn; Svendsen, Torbjørn; Johnsen, Magne Hallstein; Siniscalchi, Sabato Marco; Hamar, Jarle Bauck; Martinez, Del Hoyo Canterla A. (2008) SIRKUS - A new paradigm for speech recognition. VERDIKT Conference 2008
  • Conference lecture
    Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2008) Toward a Detector-Based Universal Phone Recognizer. International Conference on Acoustics, Speech and Signal Processing
  • Conference lecture
    Skogstad, Trond; Svendsen, Torbjørn. (2008) Time-Varying Cepstral Coefficients. ISCA ITRW on Speech Analysis and Processing for Knowledge Discovery
  • Conference lecture
    Siniscalchi, Sabato Marco; Birkenes, Øystein; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (2008) Joint Optimization of Event Detectors and Evidence Merger for Continuous Speech Recognition. ISCA ITRW on Speech Analysis and Processing for Knowledge Discovery
  • Conference lecture
    Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2008) A Penalized Logistic Regression Approach to Detection Based Phone Classification. Interspeech 2008
  • Interview
    Svendsen, Torbjørn. (2008) Norsk språkbank.

2007

  • Interview
    Svendsen, Torbjørn; Abelsen, Atle. (2007) IKE i hver puslebit.
  • Conference lecture
    Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Lee, Chin-Hui. (2007) Towards Bottom-Up Continuous Phone Recognition. 2007 IEEE Workshop on Automatic Speech Recognition and Understanding
  • Conference lecture
    Svendsen, Torbjørn. (2007) Articulatory Features and Segmental Information for Automatic Speech Recognition. ESF Exploratory Workshop on Models of Language Evolution, Acquisition and Processing

2006

  • Conference poster
    Amdal, Ingunn; Svendsen, Torbjørn. (2006) FonDat1: A Speech Synthesis Corpus for Norwegian. LREC 2006
  • Conference lecture
    Nordgård, Torbjørn; Svendsen, Torbjørn. (2006) Et norsk uttaleleksikon møter en spontan virkelighet. Oslomålet - et seminar med forskning fra NoTa-korpuset
  • Conference lecture
    Amdal, Ingunn; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (2006) Log Likelihood Ratio Based Annotation Verification of a Norwegian Speech Synthesis Database. NORSIG 2006
  • Conference lecture
    Svendsen, Torbjørn. (2006) Task and speaker adaptation. WISSAP'06

2005

  • Conference poster
    Skogstad, Trond; Svendsen, Torbjørn. (2005) Distributed ASR Using Speech Coder Data for Efficient Feature Vector Representation. Eurospeech 2005
  • Conference poster
    Meen, Dyre; Svendsen, Torbjørn; Natvig, Jon-Emil. (2005) Improving Phone Label Alignment Accuracy by Utilizing Voicing Information. SPECOM 2005
  • Conference lecture
    Svendsen, Torbjørn; Egeberg, Andreas; Holter, Trym. (2005) VOCALS - Voice centric user interfaces for location based services. NORSIG 05
  • Conference lecture
    Svendsen, Torbjørn; Amdal, Ingunn; Bjørkan, Ingmund; Meen, Dyre; Heggtveit, Per Olav; Natvig, Jon Emil. (2005) FONEMA - Tools for realistic speech synthesis in Norwegian. NORSIG 05
  • Conference poster
    Amdal, Ingunn; Svendsen, Torbjørn. (2005) Unit Selection Synthesis Database Development Using Utterance Verification. Interspeech 2005
  • Conference poster
    Bjørkan, Ingmund; Svendsen, Torbjørn; Farner, Snorre. (2005) Comparing Spectral Distance Measures for Join Cost Optimization in Concatenative Speech Synthesis. Interspeech 2005

2004

  • Conference lecture
    Svendsen, Torbjørn. (2004) Pronunciation Modeling for Speech Technology. 2004 International Conference on Signal Processing and Communications
  • Conference lecture
    Øien, Geir Egil; Holte, Nils; Andresen, Steinar; Svendsen, Torbjørn; Hammer, Mikael. (2004) Communication technology towards 2020. INFOSAM-2020 conference

2003

  • Conference poster
    Wong, Eddie; Martin, Terrence; Svendsen, Torbjørn; Sridharan, Sridha. (2003) Multilingual Phone Clustering for Recognition of Spontaneous Indonesian Speech Utilising Pronunciation Modelling Techniques. Eurospeech 2003
  • Conference poster
    Martin, Terrence; Svendsen, Torbjørn; Sridharan, Sridha. (2003) Cross-Lingual Pronunciation Modelling for Indonesian Speech Recognition. Eurospeech 2003
  • Lecture
    Svendsen, Torbjørn. (2003) Snakke dialekt med mobilen? Om dialektbruk i ny språkteknologi. [Missing data]
  • Lecture
    Svendsen, Torbjørn. (2003) FONEMA - Metodeutvikling for naturtro norsk talesyntese. KUNSTI-seminar 2003
  • Lecture
    Svendsen, Torbjørn. (2003) Speech Processing Activities at NTNU: An Overview. Nordic Speech Technology Seminar
  • Conference lecture
    Svendsen, Torbjørn. (2003) Pronunciation Modelling for Speech Technology. [Missing data]

2002

  • Conference lecture
    Amdal, Ingunn; Svendsen, Torbjørn. (2002) Evaluation of pronunciation variants in the ASR lexicon for different speaking styles. Third International Conference on Language Resources and Evaluation

2001

  • Conference lecture
    Johnsen, Magne Hallstein; Harborg, Erik; Svendsen, Torbjørn; Amble, Tore; Holter, Trym; Myrvoll, Tor Andre; Nordgård, Torbjørn. (2001) SPODIS - Spoken Dialog Systems for Telephony. NORSIG-2001, Norwegian Signal Processing Symposium
  • Conference poster
    Myrvoll, Tor Andre; Paliwal, Kuldip K.; Svendsen, Torbjørn. (2001) Fast Adaptation using Constrained Affine Transformations with Hierarchical Priors. Eurospeech 2001

2000

  • Lecture
    Svendsen, Torbjørn. (2000) Norsk språkbank, et nasjonalt korpus for språkteknologi. [Missing data]
  • Lecture
    Svendsen, Torbjørn. (2000) Taleteknologi - teknologi med potensiale for kvalitetsheving og effektivisering ved håndtering av informasjon i sykehus. [Missing data]
  • Lecture
    Svendsen, Torbjørn; Johnsen, Magne Hallstein. (2000) "Sesam sesam!" - Kan taleteknologi bli en døråpner for funksjonshemmede?. [Missing data]
  • Lecture
    Svendsen, Torbjørn. (2000) Ordets makt - om taleteknologi som hjelpemiddel for funksjonshemmede. [Missing data]
  • Conference lecture
    Johnsen, Magne Hallstein; Holter, Trym; Svendsen, Torbjørn; Harborg, Erik. (2000) Stochastic Modelling of Semantic Content for Use in a Spoken Dialogue System. 6th International Conference on Spoken Language Processing
  • Conference lecture
    Svendsen, Torbjørn. (2000) Pronunciation modeling for improved recognition of names. [Missing data]
  • Conference lecture
    Johnsen, Magne Hallstein; Svendsen, Torbjørn; Amble, Tore; Holter, Trym; Harborg, Erik. (2000) TABOR - A Norwegian Spoken Dialogue System for Bus Travel Information. 6th International Conference on Spoken Language Processing
  • Conference lecture
    Holter, Trym; Harborg, Erik; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (2000) ASR-Based Subtitling of Live TV-Programs for the Hearing Impaired. 6th International Conference on Spoken Language Processing
  • Feature article
    Foldvik, Arne Kjell; Nordgård, Torbjørn; Svendsen, Torbjørn; Thygesen, Ragnar. (2000) Dysleksi og språkteknologi.

1999

  • Conference lecture
    Amdal, Ingunn; Holter, Trym; Svendsen, Torbjørn. (1999) Maximum likelihood pronunciation modelling of Norwegian natural numbers for automatic speech recognition. NORSIG'99
  • Conference lecture
    Amdal, Ingunn; Holter, Trym; Svendsen, Torbjørn. (1999) Modellering av uttalevariasjon for automatisk talegjenkjenning. Møte om norsk språk (MONS 8)
  • Lecture
    Yang, Qian; Cremelie, Nick; Holter, Trym; Martens, Jean-Pierre; Svendsen, Torbjørn; Ringland, Simon. (1999) Lexicon building and word accuracy in continuous speech recognition. COST 249 meeting, Prague
  • Conference poster
    Harborg, Erik; Holter, Trym; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1999) Subtitling of live broadcast TV-programs for the hearing impaired. AAATE'99
  • Conference lecture
    Harborg, Erik; Holter, Trym; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1999) On-line captioning of TV-programs for the hearing impaired. EuroSpeech'99
  • Conference lecture
    Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1999) Menneske/maskin-kommunikasjon basert på tale. MONS-8 (8nde Møte Om Norsk Språk)
  • Conference lecture
    Harborg, Erik; Holter, Trym; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1999) Generation of closed captions for live TV-programs using speech recognition. Norsig'99

1998

  • Conference lecture
    Svendsen, Torbjørn. (1998) SPODIS - Spoken dialog systems for telephony services. Studiemøtet i elektronikk og data
  • Conference lecture
    Holter, Trym; Svendsen, Torbjørn. (1998) Maximum likelihood modelling of pronunciation variation. ESCA Tutorial and Research Workshop on Modeling Pronunciation Variation for ASR
  • Lecture
    Svendsen, Torbjørn. (1998) Speech processing activities at NTNU. [Missing data]
  • Lecture
    Svendsen, Torbjørn. (1998) Taleteknolog. Nordisk språkmøte
  • Lecture
    Svendsen, Torbjørn. (1998) Taleteknologi ved NTNU. Aalborg workshop in speech communication

1997

  • Lecture
    Svendsen, Torbjørn. (1997) Acoustic subwords - some applications in speech processing. [Missing data]
  • Lecture
    Svendsen, Torbjørn. (1997) Some topics from recent work in speech processing. [Missing data]
  • Lecture
    Svendsen, Torbjørn. (1997) Speech recognition based on acoustic subword units. [Missing data]
  • Lecture
    Holter, Trym; Svendsen, Torbjørn. (1997) Combined optimisation of baseforms and model parameters in speech recognition based on acoustic sub-word units. [Missing data]
  • Conference lecture
    Holter, Trym; Svendsen, Torbjørn. (1997) Incorporating linguistic knowledge and automatic baseform generation in acoustic subword unit based speech recognition. Eurospeech '97
  • Conference lecture
    Holter, Trym; Svendsen, Torbjørn. (1997) Combined optimisation of baseforms and model parameters in speech recognition based on acoustic subword units. IEEE Speech recognition Workshop
  • Conference lecture
    Holter, Trym; Svendsen, Torbjørn. (1997) A joint segmentation and labelling scheme for use in acoustic subword based speech recognition. Norwegian Signal Processing Symposium

1996

  • Conference lecture
    Pihl, Johnny; Johnsen, Magne Hallstein; Svendsen, Torbjørn. (1996) A VLSI implementation of pdf computations in HMM based speech recognition. TENCON-96

1995

  • Conference lecture
    Johnsen, Magne Hallstein; Svendsen, Torbjørn; Harborg, Erik. (1995) Experiments on cepstral mean subtraction and Rasta-filtering applied to SAMPA phoneme recognition. COST249

1994

  • Lecture
    Svendsen, Torbjørn. (1994) Acoustic segmentation of speech : applications in speech processing. [Missing data]
  • Lecture
    Svendsen, Torbjørn. (1994) Acoustic segmentation of speech : applications in speech processing. [Missing data]
  • Conference lecture
    Svendsen, Torbjørn. (1994) Segmental quantization of speech spectral information. IEEE International Conference on Acoustics, Speech and Signal Processing

1993

  • Conference lecture
    Svendsen, Torbjørn. (1993) Efficient quantization of speech spectral information. EUROSPEECH '93 (1993 : Berlin)

1989

  • Conference lecture
    Svendsen, Torbjørn Karl; Paliwal, Kuldip K.; Harborg, Erik; Husøy, Per Ove. (1989) An Improved Sub-Word Based Speech Recognizer. International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

1988

  • Conference lecture
    Svendsen, Torbjørn Karl; Paliwal, K.K.; Harborg, Erik; Husøy, P.O. (1988) Experiments with a Sub-Word Based Speech Recognizer. International Conference on Speech Science and Technology (ICSST)
