Giampiero Salvi
About
Giampiero Salvi (Senior Member IEEE) is a Full Professor at the Department of Electronic Systems. He is a member of the Signal Processing Group. He is also an Associate Professor with KTH Royal Institute of Technology in Sweden. Professor Salvi has an M.Sc. in Electronic Engineering from La Sapienza University in Rome, Italy, and a PhD in Computer Science from KTH Royal Institute of Technology in Sweden.
Research Interests
- Machine Learning
- Speech Technology
- Cognitive Systems
Current Projects
- SCRIBE: Machine transcription of Norwegian conversational speech [link]
- Teflon: Technology-enhanced foreign and second-language learning of Nordic languages [link]
- NordTrans: Technology for automatic speech transcription in selected Nordic languages [link]
- Center for Geophysical Forecasting [link]
Past Projects
- Gesture learning and language acquisition in humanoid robots, Fundação para a Ciência e a Tecnologia, IST, Lisbon, Portugal
- Biologically inspired statistical methods for flexible automatic speech understanding, Swedish Research Council, KTH, Stockholm, Sweden
- Interactive Grounded Language Understanding, CHIST-ERA (EU) and Swedish Research Council, KTH, Stockholm, Sweden. [link]
Publications
2026
-
Cao, Xinwei;
Fan, Zijian;
Svendsen, Torbjørn;
Salvi, Giampiero.
(2026)
Segmentation-Free Goodness of Pronunciation.
IEEE Transactions on Audio, Speech and Language Processing
Academic article
-
Parsons, Phoebe;
Salvi, Giampiero;
Svendsen, Torbjørn;
Kvale, Knut.
(2026)
On Dialects and Speech Technology.
Norges teknisk-naturvitenskapelige universitet
Doctoral thesis
2025
-
Stenwig, Eline;
Salvo Rossi, Pierluigi;
Salvi, Giampiero;
Skjaervold, Nils Kristian.
(2025)
The impact of feature combinations on machine learning models for in-hospital mortality prediction.
Scientific Reports
Academic article
-
Salvi, Giampiero.
(2025)
TeflonNorL2 NOCASA Challenge Dataset.
Nationalbiblioteket
Data set
-
Getman, Yaroslav;
Grósz, Tamás;
Kurimo, Mikko;
Salvi, Giampiero.
(2025)
[2504.20678] Non-native Children's Automatic Speech Assessment Challenge (NOCASA).
Academic chapter
-
Parsons, Phoebe Luree Turner;
Bremnes, Heming Strømholt;
Kvale, Knut;
Svendsen, Torbjørn;
Salvi, Giampiero.
(2025)
Effects of Prosodic Information on Dialect Classification Using Whisper Features.
Academic chapter
-
Fan, Zijian;
Cao, Xinwei;
Salvi, Giampiero;
Svendsen, Torbjørn.
(2025)
Improving Phone Recognition through Informed Initialization and Path-Aligned CTC Loss.
Academic chapter
-
Cao, Xinwei;
Fan, Zijian;
Svendsen, Torbjørn;
Salvi, Giampiero.
(2025)
Child speech assessment through large language model speech synthesis: Preliminary results.
Academic chapter
-
Dymbe, Simen;
Siniscalchi, Sabato Marco;
Svendsen, Torbjørn;
Salvi, Giampiero.
(2025)
Using Cross-Attention for Conversational ASR over the Telephone.
Academic chapter
-
Rugayan, Janine Lizbeth Cabrera;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2025)
Optimizing ASR Models with Semantic Information.
Academic chapter
-
Adiban, Mohammad;
Stefanov, Kalin;
Siniscalchi, Sabato Marco;
Salvi, Giampiero.
(2025)
S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction.
IEEE Transactions on Multimedia
Academic article
-
Parsons, Phoebe Luree Turner;
Solberg, Per Erik;
Kvale, Knut;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2025)
Adding Metadata to Existing Parliamentary Speech Corpus.
Academic chapter
-
Parsons, Phoebe Luree Turner;
Kvale, Knut;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2025)
Match ‘em: Multi-Tiered Alignment for Error Analysis in ASR.
Academic chapter
2024
-
Cao, Xinwei;
Fan, Zijian;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2024)
A Framework for Phoneme-Level Pronunciation Assessment Using CTC.
Interspeech
Academic article
-
Fan, Zijian;
Cao, Xinwei;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2024)
Towards Better Recognition of Spontaneous Children's Speech: Speaker-Clustering Fine-Tuning of Whisper.
Machine Learning for Signal Processing
Academic article
-
La Quatra, Moreno;
Turco, Maria Francesca;
Svendsen, Torbjørn Karl;
Salvi, Giampiero;
Orozco-Arroyave, Juan Rafael;
Siniscalchi, Sabato Marco.
(2024)
Exploiting Foundation Models and Speech Enhancement for Parkinson’s Disease Detection from Speech in Real-World Operative Conditions.
Interspeech
Academic article
-
Kynych, Frantisek;
Cerva, Petr;
Zdansky, Jindrich;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2024)
A lightweight approach to real-time speaker diarization: from audio toward audio-visual data streams.
EURASIP Journal on Audio, Speech, and Music Processing
Academic article
-
Olstad, Anne Marte Haug;
Smolander, Anna;
Strömbergsson, Sofia;
Ylinen, Sari;
Lehtonen, Minna;
Kurimo, Mikko;
Getman, Yaroslav;
Grósz, Tamás;
Cao, Xinwei;
Svendsen, Torbjørn Karl.
(2024)
Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages.
Proceedings of LREC
Academic article
2023
-
Stenwig, Eline;
Salvi, Giampiero;
Salvo Rossi, Pierluigi;
Skjaervold, Nils Kristian.
(2023)
Comparison of correctly and incorrectly classified patients for in-hospital mortality prediction in the intensive care unit.
BMC Medical Research Methodology
Academic article
-
Solberg, Per Erik;
Ortiz Cabello, Pablo;
Parsons, Phoebe;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2023)
Improving Generalization of Norwegian ASR with Limited Linguistic Resources.
Academic chapter
-
Parsons, Phoebe;
Kvale, Knut;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2023)
A character-based analysis of impacts of dialects on end-to-end Norwegian ASR.
Academic chapter
-
Getman, Yaroslav;
Phan, Nhan;
Al-Ghezi, Ragheb;
Voskoboinik, Ekaterina;
Singh, Mittul;
Grósz, Tamás;
Kurimo, Mikko;
Salvi, Giampiero;
Svendsen, Torbjørn Karl;
Strömbergsson, Sofia.
(2023)
Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children.
IEEE Access
Academic article
-
Adiban, Mohammad;
Siniscalchi, Sabato Marco;
Salvi, Giampiero.
(2023)
A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity.
Neurocomputing
Academic article
-
Rugayan, Janine Lizbeth Cabrera;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2023)
Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation.
Interspeech (USB)
Academic article
-
Fan, Zijian;
Cao, Xinwei;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2023)
Using Modified Adult Speech as Data Augmentation for Child Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
Academic article
-
Cao, Xinwei;
Fan, Zijian;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2023)
An Analysis of Goodness of Pronunciation for Child Speech.
Interspeech
Academic article
2022
-
Rugayan, Janine Lizbeth Cabrera;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2022)
Semantically Meaningful Metrics for Norwegian ASR Systems.
Interspeech (USB)
Academic article
-
Abdelnour, Jerome;
Rouat, Jean;
Salvi, Giampiero.
(2022)
NAAQA: A Neural Architecture for Acoustic Question Answering.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Academic article
-
Getman, Yaroslav;
Al-Ghezi, Ragheb;
Voskoboinik, Ekaterina;
Grósz, Tamás;
Kurimo, Mikko;
Salvi, Giampiero;
Svendsen, Torbjørn Karl;
Strömbergsson, Sofia.
(2022)
wav2vec2-based Speech Rating System for Children with Speech Sound Disorder.
Interspeech (USB)
Academic article
-
Stenwig, Eline;
Salvi, Giampiero;
Salvo Rossi, Pierluigi;
Skjaervold, Nils Kristian.
(2022)
Comparative analysis of explainable machine learning prediction models for hospital mortality.
BMC Medical Research Methodology
Academic article
2021
-
Sabzi Shahrebabaki, Abdolreza;
Salvi, Giampiero;
Svendsen, Torbjørn Karl;
Siniscalchi, Sabato Marco.
(2021)
Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
Academic article
-
Stefanov, Kalin;
Adiban, Mohammad;
Salvi, Giampiero.
(2021)
Spatial Bias in Vision-Based Voice Activity Detection.
International Conference on Pattern Recognition
Academic article
-
Adiban, Mohammad;
Safari, Arash;
Salvi, Giampiero.
(2021)
STEP-GAN: A One-Class Anomaly Detection Model with Applications to Power System Security.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
Academic article
-
Sabzi Shahrebabaki, Abdolreza;
Siniscalchi, Sabato Marco;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2021)
A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion.
Academic chapter
2020
-
Sabzi Shahrebabaki, Abdolreza;
Olfati, Negar;
Siniscalchi, Sabato Marco;
Salvi, Giampiero;
Svendsen, Torbjørn.
(2020)
Transfer learning of articulatory information through phone information.
Interspeech (USB)
Academic article
-
Sabzi Shahrebabaki, Abdolreza;
Siniscalchi, Marco;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2020)
Sequence-to-sequence articulatory inversion through time convolution of sub-band frequency signals.
Interspeech (USB)
Academic article
2019
-
Stefanov, Kalin;
Salvi, Giampiero;
Kontogiorgos, Dimosthenis;
Kjellström, Hedvig;
Beskow, Jonas.
(2019)
Modeling of Human Visual Attention in Multiparty Open-World Dialogues.
ACM Transactions on Human-Robot Interaction
Academic article
-
Selamtzis, Andreas;
Castellana, Antonella;
Salvi, Giampiero;
Carullo, Alessio.
(2019)
Effect of vowel context in cepstral and entropy analysis of pathological voices.
Biomedical Signal Processing and Control
Academic article
-
Saponaro, Giovanni;
Jamone, Lorenzo;
Bernardino, Alexandre;
Salvi, Giampiero.
(2019)
Beyond the Self: Using Grounded Affordances to Interpret and Describe Others' Actions.
IEEE Transactions on Cognitive and Developmental Systems
Academic article
-
Stefanov, Kalin;
Beskow, Jonas;
Salvi, Giampiero.
(2019)
Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition.
IEEE Transactions on Cognitive and Developmental Systems
Academic article
2015
-
Strömbergsson, Sofia;
Salvi, Giampiero;
House, David.
(2015)
Acoustic and perceptual evaluation of category goodness of /t/ and /k/ in typical and misarticulated children's speech.
Journal of the Acoustical Society of America
Academic article
2013
-
Koniaris, Christos;
Salvi, Giampiero;
Engwall, Olov.
(2013)
On mispronunciation analysis of individual foreign speakers using auditory periphery models.
Speech Communication
Academic article
-
Neiberg, Daniel;
Salvi, Giampiero;
Gustafson, Joakim.
(2013)
Semi-supervised methods for exploring the acoustics of simple productive feedback.
Speech Communication
Academic article
2012
-
Salvi, Giampiero;
Montesano, Luis;
Bernardino, Alexandre;
Santos-Victor, José.
(2012)
Language bootstrapping: Learning Word Meanings From Perception-Action Association.
IEEE Transactions on Cybernetics
Academic article
2009
-
Salvi, Giampiero;
Beskow, Jonas;
Al Moubayed, Samer;
Granström, Björn.
(2009)
SynFace-Speech-Driven Facial Animation for Virtual Speech-Reading Support.
EURASIP Journal on Audio, Speech, and Music Processing
Academic article
2006
-
Salvi, Giampiero.
(2006)
Dynamic behaviour of connectionist speech recognition with strong latency constraints.
Speech Communication
Academic article
-
Salvi, Giampiero.
(2006)
Segment boundary detection via class entropy measurements in connectionist phoneme recognition.
Speech Communication
Academic article
2004
-
Siciliano, Catherine;
Williams, Geoff;
Faulkner, Andrew J.;
Salvi, Giampiero.
(2004)
Intelligibility of an ASR-controlled synthetic talking face.
Journal of the Acoustical Society of America
Academic article
Journal publications
-
Cao, Xinwei;
Fan, Zijian;
Svendsen, Torbjørn;
Salvi, Giampiero.
(2026)
Segmentation-Free Goodness of Pronunciation.
IEEE Transactions on Audio, Speech and Language Processing
Academic article
-
Stenwig, Eline;
Salvo Rossi, Pierluigi;
Salvi, Giampiero;
Skjaervold, Nils Kristian.
(2025)
The impact of feature combinations on machine learning models for in-hospital mortality prediction.
Scientific Reports
Academic article
-
Sabzi Shahrebabaki, Abdolreza;
Salvi, Giampiero;
Svendsen, Torbjørn Karl;
Siniscalchi, Sabato Marco.
(2021)
Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)
Academic article
-
Stenwig, Eline;
Salvi, Giampiero;
Salvo Rossi, Pierluigi;
Skjaervold, Nils Kristian.
(2023)
Comparison of correctly and incorrectly classified patients for in-hospital mortality prediction in the intensive care unit.
BMC Medical Research Methodology
Academic article
-
Stefanov, Kalin;
Adiban, Mohammad;
Salvi, Giampiero.
(2021)
Spatial Bias in Vision-Based Voice Activity Detection.
International Conference on Pattern Recognition
Academic article
-
Adiban, Mohammad;
Stefanov, Kalin;
Siniscalchi, Sabato Marco;
Salvi, Giampiero.
(2025)
S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction.
IEEE Transactions on Multimedia
Academic article
-
Cao, Xinwei;
Fan, Zijian;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2024)
A Framework for Phoneme-Level Pronunciation Assessment Using CTC.
Interspeech
Academic article
-
Salvi, Giampiero;
Beskow, Jonas;
Al Moubayed, Samer;
Granström, Björn.
(2009)
SynFace-Speech-Driven Facial Animation for Virtual Speech-Reading Support.
EURASIP Journal on Audio, Speech, and Music Processing
Academic article
-
Stefanov, Kalin;
Salvi, Giampiero;
Kontogiorgos, Dimosthenis;
Kjellström, Hedvig;
Beskow, Jonas.
(2019)
Modeling of Human Visual Attention in Multiparty Open-World Dialogues.
ACM Transactions on Human-Robot Interaction
Academic article
-
Salvi, Giampiero;
Montesano, Luis;
Bernardino, Alexandre;
Santos-Victor, José.
(2012)
Language bootstrapping: Learning Word Meanings From Perception-Action Association.
IEEE Transactions on Cybernetics
Academic article
-
Selamtzis, Andreas;
Castellana, Antonella;
Salvi, Giampiero;
Carullo, Alessio.
(2019)
Effect of vowel context in cepstral and entropy analysis of pathological voices.
Biomedical Signal Processing and Control
Academic article
-
Saponaro, Giovanni;
Jamone, Lorenzo;
Bernardino, Alexandre;
Salvi, Giampiero.
(2019)
Beyond the Self: Using Grounded Affordances to Interpret and Describe Others' Actions.
IEEE Transactions on Cognitive and Developmental Systems
Academic article
-
Fan, Zijian;
Cao, Xinwei;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2024)
Towards Better Recognition of Spontaneous Children's Speech: Speaker-Clustering Fine-Tuning of Whisper.
Machine Learning for Signal Processing
Academic article
-
La Quatra, Moreno;
Turco, Maria Francesca;
Svendsen, Torbjørn Karl;
Salvi, Giampiero;
Orozco-Arroyave, Juan Rafael;
Siniscalchi, Sabato Marco.
(2024)
Exploiting Foundation Models and Speech Enhancement for Parkinson’s Disease Detection from Speech in Real-World Operative Conditions.
Interspeech
Academic article
-
Salvi, Giampiero.
(2006)
Dynamic behaviour of connectionist speech recognition with strong latency constraints.
Speech Communication
Academic article
-
Getman, Yaroslav;
Phan, Nhan;
Al-Ghezi, Ragheb;
Voskoboinik, Ekaterina;
Singh, Mittul;
Grósz, Tamás;
Kurimo, Mikko;
Salvi, Giampiero;
Svendsen, Torbjørn Karl;
Strömbergsson, Sofia.
(2023)
Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children.
IEEE Access
Academic article
-
Strömbergsson, Sofia;
Salvi, Giampiero;
House, David.
(2015)
Acoustic and perceptual evaluation of category goodness of /t/ and /k/ in typical and misarticulated children's speech.
Journal of the Acoustical Society of America
Academic article
-
Koniaris, Christos;
Salvi, Giampiero;
Engwall, Olov.
(2013)
On mispronunciation analysis of individual foreign speakers using auditory periphery models.
Speech Communication
Academic article
-
Salvi, Giampiero.
(2006)
Segment boundary detection via class entropy measurements in connectionist phoneme recognition.
Speech Communication
Academic article
-
Stefanov, Kalin;
Beskow, Jonas;
Salvi, Giampiero.
(2019)
Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition.
IEEE Transactions on Cognitive and Developmental Systems
Academic article
-
Neiberg, Daniel;
Salvi, Giampiero;
Gustafson, Joakim.
(2013)
Semi-supervised methods for exploring the acoustics of simple productive feedback.
Speech Communication
Academic article
-
Siciliano, Catherine;
Williams, Geoff;
Faulkner, Andrew J.;
Salvi, Giampiero.
(2004)
Intelligibility of an ASR-controlled synthetic talking face.
Journal of the Acoustical Society of America
Academic article
-
Sabzi Shahrebabaki, Abdolreza;
Olfati, Negar;
Siniscalchi, Sabato Marco;
Salvi, Giampiero;
Svendsen, Torbjørn.
(2020)
Transfer learning of articulatory information through phone information.
Interspeech (USB)
Academic article
-
Sabzi Shahrebabaki, Abdolreza;
Siniscalchi, Marco;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2020)
Sequence-to-sequence articulatory inversion through time convolution of sub-band frequency signals.
Interspeech (USB)
Academic article
-
Rugayan, Janine Lizbeth Cabrera;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2022)
Semantically Meaningful Metrics for Norwegian ASR Systems.
Interspeech (USB)
Academic article
-
Adiban, Mohammad;
Siniscalchi, Sabato Marco;
Salvi, Giampiero.
(2023)
A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity.
Neurocomputing
Academic article
-
Rugayan, Janine Lizbeth Cabrera;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2023)
Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation.
Interspeech (USB)
Academic article
-
Abdelnour, Jerome;
Rouat, Jean;
Salvi, Giampiero.
(2022)
NAAQA: A Neural Architecture for Acoustic Question Answering.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Academic article
-
Getman, Yaroslav;
Al-Ghezi, Ragheb;
Voskoboinik, Ekaterina;
Grósz, Tamás;
Kurimo, Mikko;
Salvi, Giampiero;
Svendsen, Torbjørn Karl;
Strömbergsson, Sofia.
(2022)
wav2vec2-based Speech Rating System for Children with Speech Sound Disorder.
Interspeech (USB)
Academic article
-
Fan, Zijian;
Cao, Xinwei;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2023)
Using Modified Adult Speech as Data Augmentation for Child Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
Academic article
-
Cao, Xinwei;
Fan, Zijian;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2023)
An Analysis of Goodness of Pronunciation for Child Speech.
Interspeech
Academic article
-
Adiban, Mohammad;
Safari, Arash;
Salvi, Giampiero.
(2021)
STEP-GAN: A One-Class Anomaly Detection Model with Applications to Power System Security.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
Academic article
-
Stenwig, Eline;
Salvi, Giampiero;
Salvo Rossi, Pierluigi;
Skjaervold, Nils Kristian.
(2022)
Comparative analysis of explainable machine learning prediction models for hospital mortality.
BMC Medical Research Methodology
Academic article
-
Kynych, Frantisek;
Cerva, Petr;
Zdansky, Jindrich;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2024)
A lightweight approach to real-time speaker diarization: from audio toward audio-visual data streams.
EURASIP Journal on Audio, Speech, and Music Processing
Academic article
-
Olstad, Anne Marte Haug;
Smolander, Anna;
Strömbergsson, Sofia;
Ylinen, Sari;
Lehtonen, Minna;
Kurimo, Mikko;
Getman, Yaroslav;
Grósz, Tamás;
Cao, Xinwei;
Svendsen, Torbjørn Karl.
(2024)
Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages.
Proceedings of LREC
Academic article
Part of book/report
-
Getman, Yaroslav;
Grósz, Tamás;
Kurimo, Mikko;
Salvi, Giampiero.
(2025)
[2504.20678] Non-native Children's Automatic Speech Assessment Challenge (NOCASA).
Academic chapter
-
Parsons, Phoebe Luree Turner;
Bremnes, Heming Strømholt;
Kvale, Knut;
Svendsen, Torbjørn;
Salvi, Giampiero.
(2025)
Effects of Prosodic Information on Dialect Classification Using Whisper Features.
Academic chapter
-
Fan, Zijian;
Cao, Xinwei;
Salvi, Giampiero;
Svendsen, Torbjørn.
(2025)
Improving Phone Recognition through Informed Initialization and Path-Aligned CTC Loss.
Academic chapter
-
Cao, Xinwei;
Fan, Zijian;
Svendsen, Torbjørn;
Salvi, Giampiero.
(2025)
Child speech assessment through large language model speech synthesis: Preliminary results.
Academic chapter
-
Dymbe, Simen;
Siniscalchi, Sabato Marco;
Svendsen, Torbjørn;
Salvi, Giampiero.
(2025)
Using Cross-Attention for Conversational ASR over the Telephone.
Academic chapter
-
Rugayan, Janine Lizbeth Cabrera;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2025)
Optimizing ASR Models with Semantic Information.
Academic chapter
-
Solberg, Per Erik;
Ortiz Cabello, Pablo;
Parsons, Phoebe;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2023)
Improving Generalization of Norwegian ASR with Limited Linguistic Resources.
Academic chapter
-
Parsons, Phoebe;
Kvale, Knut;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2023)
A character-based analysis of impacts of dialects on end-to-end Norwegian ASR.
Academic chapter
-
Parsons, Phoebe Luree Turner;
Solberg, Per Erik;
Kvale, Knut;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2025)
Adding Metadata to Existing Parliamentary Speech Corpus.
Academic chapter
-
Parsons, Phoebe Luree Turner;
Kvale, Knut;
Svendsen, Torbjørn Karl;
Salvi, Giampiero.
(2025)
Match ‘em: Multi-Tiered Alignment for Error Analysis in ASR.
Academic chapter
-
Sabzi Shahrebabaki, Abdolreza;
Siniscalchi, Sabato Marco;
Salvi, Giampiero;
Svendsen, Torbjørn Karl.
(2021)
A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion.
Academic chapter
Student thesis or dissertation
-
Parsons, Phoebe;
Salvi, Giampiero;
Svendsen, Torbjørn;
Kvale, Knut.
(2026)
On Dialects and Speech Technology.
Norges teknisk-naturvitenskapelige universitet
Doctoral thesis
Teaching
Courses
Outreach
2025
-
Conference lecture: Parsons, Phoebe Luree Turner; Solberg, Per Erik; Kvale, Knut; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Adding Metadata to Existing Parliamentary Speech Corpus. Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
-
Conference lecture: Parsons, Phoebe Luree Turner; Bremnes, Heming Strømholt; Kvale, Knut; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Effects of Prosodic Information on Dialect Classification Using Whisper Features. Interspeech 2025
-
Conference lecture: Rugayan, Janine Lizbeth Cabrera; Salvi, Giampiero; Svendsen, Torbjørn. (2025) Optimizing ASR Models with Semantic Information. Text, Speech and Dialogue
-
Conference lecture: Dymbe, Simen; Siniscalchi, Sabato Marco; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Using Cross-Attention for Conversational ASR over the Telephone. Text, Speech and Dialogue
-
Conference lecture: Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn. (2025) Improving Phone Recognition through Informed Initialization and Path-Aligned CTC Loss. 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP)
-
Conference lecture: Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn; Salvi, Giampiero. (2025) Child speech assessment through large language model speech synthesis: Preliminary results. 2025 IEEE 35th International Workshop on Machine Learning for Signal Processing (MLSP)
2024
-
Conference lecture: Olstad, Anne Marte Haug; Smolander, Anna; Strömbergsson, Sofia; Ylinen, Sari; Lehtonen, Minna; Kurimo, Mikko; Getman, Yaroslav; Grósz, Tamás; Cao, Xinwei; Svendsen, Torbjørn. (2024) Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages. LREC-COLING
-
Conference lecture: La Quatra, Moreno; Turco, Maria Francesca; Svendsen, Torbjørn; Salvi, Giampiero; Orozco-Arroyave, Juan Rafael; Siniscalchi, Sabato Marco. (2024) Exploiting Foundation Models and Speech Enhancement for Parkinson’s Disease Detection from Speech in Real-World Operative Conditions. Interspeech
-
Conference lecture: Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2024) Towards Better Recognition of Spontaneous Children's Speech: Speaker-Clustering Fine-Tuning of Whisper. Machine Learning for Signal Processing
-
Conference lecture: Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2024) A Framework for Phoneme-Level Pronunciation Assessment Using CTC. Interspeech
-
Conference lecture: Salvi, Giampiero. (2024) Speech Research at NTNU. Visit at Electrical Engineering, Sapienza University
-
Conference lecture: Parsons, Phoebe Luree Turner; Bremnes, Heming Strømholt; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2024) Norwegian dialect identification: is prosody enough? Fonetik
2023
-
Conference lecture: Salvi, Giampiero. (2023) Speech Research at NTNU. Visit at Computer Engineering, Sapienza University
-
Conference lecture: Parsons, Phoebe Luree Turner; Kvale, Knut; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) A character-based analysis of impacts of dialects on end-to-end Norwegian ASR. 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
-
Conference lecture: Rugayan, Janine Lizbeth Cabrera; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2023) Perceptual and Task-Oriented Assessment of a Semantic Metric for ASR Evaluation. Interspeech
-
Conference lecture: Fan, Zijian; Cao, Xinwei; Salvi, Giampiero; Svendsen, Torbjørn Karl. (2023) Using Modified Adult Speech as Data Augmentation for Child Speech Recognition. ICASSP
-
Conference lecture: Cao, Xinwei; Fan, Zijian; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) An Analysis of Goodness of Pronunciation for Child Speech. Interspeech
-
Conference lecture: Solberg, Per Erik; Ortiz Cabello, Pablo; Parsons, Phoebe Luree Turner; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2023) Improving Generalization of Norwegian ASR with Limited Linguistic Resources. 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
2022
-
Conference lecture: Getman, Yaroslav; Al-Ghezi, Ragheb; Voskoboinik, Ekaterina; Grósz, Tamás; Kurimo, Mikko; Salvi, Giampiero; Svendsen, Torbjørn Karl; Strömbergsson, Sofia. (2022) wav2vec2-based Speech Rating System for Children with Speech Sound Disorder. Interspeech
-
Conference lecture: Rugayan, Janine Lizbeth Cabrera; Svendsen, Torbjørn Karl; Salvi, Giampiero. (2022) Semantically Meaningful Metrics for Norwegian ASR Systems. Interspeech