LAP - Language and Personalization
This work package is a new construct, formed by combining the previous work packages on personalization (PERS) and natural language processing (LANG). The work will initially follow these two directions somewhat separately, and they will later be combined more tightly.
The purpose of this work package is to develop personalization techniques and Scandinavian language processing capabilities for personalized content generation, and to:
- Develop truly explainable, fair and transparent personalization techniques
- Enable proactivity in customer relations
- Provide an individualized experience that provably respects privacy concerns
- Develop individualized content
- Develop large-scale Scandinavian language models
- Enable human-like content creation and conversations
Personalization and contextualization have been successfully employed in diverse applications over the past decade and are now seeing wider use, for instance in proactive interaction with customers and individualization of news stories. LAP will contribute to developing such systems while ensuring that their use is ethical and respects users’ requirements for privacy, fairness and accountability.
Building Scandinavian language models requires the compilation of large-scale reusable language resources, including general-purpose corpora from public sources (e.g., news and social media) as well as industry- and domain-specific text collections. We will address the scarcity of the latter by pre-training on the former and developing transfer learning methods. These large-scale language models will then be utilized in real-life scenarios by formulating a number of specific summarization, explanation, and conversational tasks based on our partners’ use cases. LAP will develop an appropriate evaluation methodology with user-oriented evaluation measures and objectives. It will thus help quantify how much domain-specific training material is needed to provide a language service of sufficiently high quality.
Short description: Based on a very large corpus consisting of newspaper articles from most Norwegian newspapers, NorwAI will create the largest Norwegian language model built so far, enabling new opportunities in areas such as chatbots and text summarization.
Time perspective: 2020-
Language experts to report on speech and text to the Storting
The Norwegian Board of Technology (Teknologirådet) advises lawmakers and the government. Starting with speech and later continuing with large language models, expert groups will explain this complex language technology step by step. NorwAI's director Jon Atle Gulla is part of the expert group.
- One particular area where Norwegian AI stands out is the genuine interest in fairness, transparency and explainability, which align with societal values in Norway. Therefore, I can see Norwegian AI research taking a world-leading role in these areas, says Krisztian Balog, professor at the University of Stavanger and Staff Research Scientist at Google.
Krisztian Balog heads NorwAI's work package for language technologies. He cooperates with NorwAI's research director, professor Kjetil Nørvaag. The two professors joined forces as general chairs of the successful ECIR conference in Stavanger during the week before Easter, giving an international audience insight into new research results in the broadly conceived area of Information Retrieval.
Tailoring news content: How Scandinavian media houses have tested recommender systems
Scandinavian newspapers were early adopters of online services 25 years ago. Gradually, some of them explored how recommender systems could enable individually tailored news streams. In a recent article in AI Magazine, NorwAI associates, headed by center director Jon Atle Gulla, explore how Scandinavian media organizations are coping with these new technological opportunities.
New Language Models in NorwAI
The NorwAI center is determined to provide new Norwegian language models that are significantly larger and better than what is available today and that can easily be employed in advanced Norwegian NLP applications for industrial use, says center director professor Jon Atle Gulla.