+47 73594455
Sem Sælands vei 9, IT-bygget * 360

Background and activities

I work as a researcher in the area of Computational Linguistics and Natural Language Processing. I am interested in nearly all aspects of natural language and speech from a computational point of view. Some of the things I have worked on (roughly in reverse chronological order) are:
  • Text Mining from scientific literature (marine/climate/environmental science)
  • Information Retrieval (with Random Indexing)
  • Cross-lingual IR
  • Information Extraction
  • Machine Translation (without parallel corpora), in particular word translation disambiguation
  • Treebanks of monolingual parallel/comparable text
  • Semantic Textual Similarity
  • Paraphrasing
  • Text-to-Text Generation and Sentence Fusion
  • Sentence Compression
  • Multi-document Summarization
  • Recognizing Textual Entailment
  • Dependency Parsing
  • Prosody prediction, intonation in particular
  • Speech Synthesis, both Text-to-Speech and Concept-to-Speech conversion
  • Talking heads and Embodied Conversational Agents
  • Natural Language Generation
  • Corpus annotation and validation
  • Morphological analysis and POS tagging of Arabic
  • Machine Learning (memory-based learning in particular)

Scientific, academic and artistic work

Journal publications

Part of book/report

  • Barik, Biswanath; Marsi, Erwin. (2017) NTNU-2 at SemEval-2017 Task 10: Identifying Synonym and Hyponym Relations among Keyphrases in Scientific Documents. 11th International Workshop on Semantic Evaluations (SemEval-2017).
  • Barik, Biswanath; Marsi, Erwin; Öztürk, Pinar. (2017) Extracting Causal Relations among Complex Events in Natural Science Literature. The 22nd International Conference on Applications of Natural Language to Information Systems, NLDB 2017, held in Liège, Belgium, in June 2017.
  • Marsi, Erwin; Sikdar, Utpal; Marco, Cristina; Barik, Biswanath; Sætre, Rune. (2017) NTNU-1@ScienceIE at SemEval-2017 Task 10: Identifying and Labelling Keyphrases with Conditional Random Fields. 11th International Workshop on Semantic Evaluations (SemEval-2017).
  • Bøhler, Henrik; Asla, Petter Fagerlund; Marsi, Erwin; Sætre, Rune. (2016) IDI@NTNU at SemEval-2016 Task 6: Detecting Stance in Tweets Using Shallow Features and GloVe Vectors for Word Representation. The 10th International Workshop on Semantic Evaluation. Proceedings of the Workshop.
  • Marsi, Erwin; Öztürk, Pinar. (2015) Extraction and generalisation of variables from scientific publications. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015).
  • Marsi, Erwin; Öztürk, Pinar; Aamot, Elias; Sizov, Gleb Valerjevich; Ardelan, Murat Van. (2014) Towards Text Mining in Climate Science: Extraction of Quantitative Variables and their Relations. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14).
  • Moen, Hans; Marsi, Erwin; Ginter, Filip; Murtola, Laura-Maria; Salakoski, Tapio; Salanterä, Sanna. (2014) Care Episode Retrieval. Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi).
  • Bungum, Lars; Gambäck, Björn; Lynum, Andre; Marsi, Erwin. (2013) Improving Word Translation Disambiguation by Capturing Multiword Expressions with Dictionaries. Proceedings of the 9th Workshop on Multiword Expressions.
  • Marsi, Erwin; Krahmer, Emiel. (2013) Automatic Tree Matching for Analysing Semantic Similarity in Comparable Text. Essential Speech and Language Technology for Dutch.
  • Marsi, Erwin; Moen, Hans; Bungum, Lars; Sizov, Gleb Valerjevich; Gambäck, Björn; Lynum, Andre. (2013) NTNU-CORE: Combining strong features for semantic similarity. Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity.
  • Moen, Hans; Marsi, Erwin; Gambäck, Björn. (2013) Towards Dynamic Word Sense Discrimination with Random Indexing. Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality.
  • Lynum, Andre; Marsi, Erwin; Bungum, Lars; Gambäck, Björn. (2012) Disambiguating Word Translations with Target Language Models. Text, Speech and Dialogue: 15th International Conference, TSD 2012, Brno, Czech Republic, September 3-7, 2012, Proceedings.
  • Moen, Hans; Marsi, Erwin. (2012) Towards Cross-Lingual Information Retrieval using Random Indexing. Norsk informatikkonferanse NIK 2012; Universitetet i Nordland, 19-21 November 2012.